expression sequence analysis: Topics by Science.gov

Sample records for expression sequence analysis

mESAdb: microRNA Expression and Sequence Analysis Database

PubMed Central

Kaya, Koray D.; Karakülah, Gökhan; Yakıcıer, Cengiz M.; Acar, Aybar C.; Konu, Özlen

2011-01-01

microRNA expression and sequence analysis database (http://konulab.fen.bilkent.edu.tr/mirna/) (mESAdb) is a regularly updated database for the multivariate analysis of sequences and expression of microRNAs from multiple taxa. mESAdb is modular and has a user interface implemented in PHP and JavaScript and coupled with statistical analysis and visualization packages written for the R language. The database primarily comprises mature microRNA sequences and their target data, along with selected human, mouse and zebrafish expression data sets. mESAdb analysis modules allow (i) mining of microRNA expression data sets for subsets of microRNAs selected manually or by motif; (ii) pair-wise multivariate analysis of expression data sets within and between taxa; and (iii) association of microRNA subsets with annotation databases, HUGE Navigator, KEGG and GO. The use of existing and customized R packages facilitates future addition of data sets and analysis tools. Furthermore, the ability to upload and analyze user-specified data sets makes mESAdb an interactive and expandable analysis tool for microRNA sequence and expression data. PMID:21177657
mESAdb: microRNA expression and sequence analysis database.

PubMed

Kaya, Koray D; Karakülah, Gökhan; Yakicier, Cengiz M; Acar, Aybar C; Konu, Ozlen

2011-01-01

microRNA expression and sequence analysis database (http://konulab.fen.bilkent.edu.tr/mirna/) (mESAdb) is a regularly updated database for the multivariate analysis of sequences and expression of microRNAs from multiple taxa. mESAdb is modular and has a user interface implemented in PHP and JavaScript and coupled with statistical analysis and visualization packages written for the R language. The database primarily comprises mature microRNA sequences and their target data, along with selected human, mouse and zebrafish expression data sets. mESAdb analysis modules allow (i) mining of microRNA expression data sets for subsets of microRNAs selected manually or by motif; (ii) pair-wise multivariate analysis of expression data sets within and between taxa; and (iii) association of microRNA subsets with annotation databases, HUGE Navigator, KEGG and GO. The use of existing and customized R packages facilitates future addition of data sets and analysis tools. Furthermore, the ability to upload and analyze user-specified data sets makes mESAdb an interactive and expandable analysis tool for microRNA sequence and expression data.
Single-Cell RNA-Sequencing: Assessment of Differential Expression Analysis Methods.

PubMed

Dal Molin, Alessandra; Baruzzo, Giacomo; Di Camillo, Barbara

2017-01-01

The sequencing of the transcriptomes of single cells, or single-cell RNA-sequencing, has now become the dominant technology for the identification of novel cell types and for the study of stochastic gene expression. In recent years, various tools for analyzing single-cell RNA-sequencing data have been proposed, many of them with the purpose of performing differential expression analysis. In this work, we compare four different tools for single-cell RNA-sequencing differential expression, together with two popular methods originally developed for the analysis of bulk RNA-sequencing data but widely applied to single-cell data. We discuss results obtained on two real datasets and one synthetic dataset, along with considerations about the perspectives of single-cell differential expression analysis. In particular, we explore the methods' performance in four different scenarios, mimicking different unimodal or bimodal distributions of the data, as characteristic of single-cell transcriptomics. We observed marked differences between the selected methods in terms of precision and recall, the number of detected differentially expressed genes and the overall performance. Globally, the results obtained in our study suggest that it is difficult to identify a best-performing tool and that efforts are needed to improve the methodologies for single-cell RNA-sequencing data analysis and gain better accuracy of results.
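The precision and recall mentioned in this abstract can be illustrated with a minimal sketch. The gene identifiers and the function name `precision_recall` are hypothetical, and this is not the evaluation code used in the study; it only shows how the two quantities are computed when the truly differentially expressed genes are known, as in a synthetic dataset.

```python
def precision_recall(called, truth):
    """Return (precision, recall) for a set of called genes against a truth set."""
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    # Precision: fraction of called genes that are truly differentially expressed.
    precision = true_positives / len(called) if called else 0.0
    # Recall: fraction of truly differentially expressed genes that were called.
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical gene identifiers, for illustration only.
truth = {"geneA", "geneB", "geneC", "geneD"}
called = {"geneA", "geneB", "geneX"}
precision, recall = precision_recall(called, truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A tool that calls few genes may reach high precision at low recall, which is one reason a single best-performing method is hard to name.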
Efficient experimental design and analysis strategies for the detection of differential expression using RNA-Sequencing

PubMed Central

2012-01-01

Background: RNA sequencing (RNA-Seq) has emerged as a powerful approach for the detection of differential gene expression with both high-throughput and high resolution capabilities possible depending upon the experimental design chosen. Multiplex experimental designs are now readily available; these can be utilised to increase the numbers of samples or replicates profiled at the cost of decreased sequencing depth generated per sample. These strategies impact on the power of the approach to accurately identify differential expression. This study presents a detailed analysis of the power to detect differential expression in a range of scenarios including simulated null and differential expression distributions with varying numbers of biological or technical replicates, sequencing depths and analysis methods. Results: Differential and non-differential expression datasets were simulated using a combination of negative binomial and exponential distributions derived from real RNA-Seq data. These datasets were used to evaluate the performance of three commonly used differential expression analysis algorithms and to quantify the changes in power with respect to true and false positive rates when simulating variations in sequencing depth, biological replication and multiplex experimental design choices. Conclusions: This work quantitatively explores comparisons between contemporary analysis tools and experimental design choices for the detection of differential expression using RNA-Seq. We found that the DESeq algorithm performs more conservatively than edgeR and NBPSeq. With regard to testing of various experimental designs, this work strongly suggests that greater power is gained through the use of biological replicates relative to library (technical) replicates and sequencing depth. Strikingly, sequencing depth could be reduced as low as 15% without substantial impacts on false positive or true positive rates. PMID:22985019
Efficient experimental design and analysis strategies for the detection of differential expression using RNA-Sequencing.

PubMed

Robles, José A; Qureshi, Sumaira E; Stephen, Stuart J; Wilson, Susan R; Burden, Conrad J; Taylor, Jennifer M

2012-09-17

RNA sequencing (RNA-Seq) has emerged as a powerful approach for the detection of differential gene expression with both high-throughput and high resolution capabilities possible depending upon the experimental design chosen. Multiplex experimental designs are now readily available; these can be utilised to increase the numbers of samples or replicates profiled at the cost of decreased sequencing depth generated per sample. These strategies impact on the power of the approach to accurately identify differential expression. This study presents a detailed analysis of the power to detect differential expression in a range of scenarios including simulated null and differential expression distributions with varying numbers of biological or technical replicates, sequencing depths and analysis methods. Differential and non-differential expression datasets were simulated using a combination of negative binomial and exponential distributions derived from real RNA-Seq data. These datasets were used to evaluate the performance of three commonly used differential expression analysis algorithms and to quantify the changes in power with respect to true and false positive rates when simulating variations in sequencing depth, biological replication and multiplex experimental design choices. This work quantitatively explores comparisons between contemporary analysis tools and experimental design choices for the detection of differential expression using RNA-Seq. We found that the DESeq algorithm performs more conservatively than edgeR and NBPSeq. With regard to testing of various experimental designs, this work strongly suggests that greater power is gained through the use of biological replicates relative to library (technical) replicates and sequencing depth. Strikingly, sequencing depth could be reduced as low as 15% without substantial impacts on false positive or true positive rates.
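The abstract describes simulating counts from negative binomial distributions at varying replicate numbers and sequencing depths. A minimal sketch of that idea follows, using only the Python standard library. The mean, dispersion, replicate counts and depth fractions are hypothetical placeholders, not values from the study, and the exponential component of the authors' simulation is not reproduced here.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson variate with Knuth's multiplication method.

    Adequate for the modest rates used here; not suitable for very large lam.
    """
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def negative_binomial(mean, dispersion, rng):
    """Gamma-Poisson mixture with variance = mean + dispersion * mean**2."""
    rate = rng.gammavariate(1.0 / dispersion, mean * dispersion)
    return poisson(rate, rng)


def simulate_gene(mean, dispersion, n_replicates, depth_fraction, rng):
    """Counts for one gene across replicates at a fraction of full depth."""
    return [
        negative_binomial(mean * depth_fraction, dispersion, rng)
        for _ in range(n_replicates)
    ]


rng = random.Random(0)
# Hypothetical design choice: 3 replicates at full depth versus
# 6 replicates multiplexed at half depth each.
full_depth = simulate_gene(20.0, 0.2, n_replicates=3, depth_fraction=1.0, rng=rng)
half_depth = simulate_gene(20.0, 0.2, n_replicates=6, depth_fraction=0.5, rng=rng)
print(full_depth, half_depth)
```

Feeding such simulated counts to a differential expression method, with and without a true fold change, is one way to estimate true and false positive rates for each design.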
BioVLAB-mCpG-SNP-EXPRESS: A system for multi-level and multi-perspective analysis and exploration of DNA methylation, sequence variation (SNPs), and gene expression from multi-omics data.

PubMed

Chae, Heejoon; Lee, Sangseon; Seo, Seokjun; Jung, Daekyoung; Chang, Hyeonsook; Nephew, Kenneth P; Kim, Sun

2016-12-01

Measuring gene expression, DNA sequence variation, and DNA methylation status is routinely done using high throughput sequencing technologies. To analyze such multi-omics data and explore relationships, reliable bioinformatics systems are much needed. Existing systems are either for exploring curated data or for processing omics data in the form of a library such as R. Thus scientists have much difficulty in investigating relationships among gene expression, DNA sequence variation, and DNA methylation using multi-omics data. In this study, we report a system called BioVLAB-mCpG-SNP-EXPRESS for the integrated analysis of DNA methylation, sequence variation (SNPs), and gene expression for distinguishing cellular phenotypes at the pairwise and multiple phenotype levels. The system can be deployed on either the Amazon cloud or a publicly available high-performance computing node, and the data analysis and exploration of the analysis result can be conveniently done using a web-based interface. In order to alleviate analysis complexity, all processes are fully automated, and a graphical workflow system is integrated to represent real-time analysis progression. The BioVLAB-mCpG-SNP-EXPRESS system works in three stages. First, it processes and analyzes multi-omics data as input in the form of the raw data, i.e., FastQ files. Second, various integrated analyses such as methylation vs. gene expression and mutation vs. methylation are performed. Finally, the analysis result can be explored in a number of ways through a web interface for the multi-level, multi-perspective exploration. Multi-level interpretation can be done at the gene, gene set, pathway or network level, and multi-perspective exploration can be done from the gene expression, DNA methylation or sequence variation perspective, or from the perspective of their relationships. The utility of the system is demonstrated by performing analysis of a data set of 30 phenotypically distinct breast cancer cell lines. BioVLAB-mCpG-SNP-EXPRESS is available at http://biohealth.snu.ac.kr/software/biovlab_mcpg_snp_express/. Copyright © 2016 Elsevier Inc. All rights reserved.
An efficient annotation and gene-expression derivation tool for Illumina Solexa datasets.

PubMed

Hosseini, Parsa; Tremblay, Arianne; Matthews, Benjamin F; Alkharouf, Nadim W

2010-07-02

An Illumina flow cell with all eight lanes occupied produces well over a terabyte of images, with gigabytes of reads following sequence alignment. The ability to translate such reads into meaningful annotation is therefore of great concern and importance. One can very easily be flooded with such a great volume of textual, unannotated data, irrespective of read quality or size. CASAVA, an optional analysis tool for Illumina sequencing experiments, supports INDEL detection, SNP information, and allele calling. It is therefore of significant value not only to extract from such analysis a measure of gene expression in the form of tag-counts, but also to annotate such reads. We developed TASE (Tag counting and Analysis of Solexa Experiments), a rapid tag-counting and annotation software tool specifically designed for Illumina CASAVA sequencing datasets. Developed in Java and deployed using jTDS JDBC driver and a SQL Server backend, TASE provides an extremely fast means of calculating gene expression through tag-counts while annotating sequenced reads with the gene's presumed function, from any given CASAVA-build. Such a build is generated for both DNA and RNA sequencing. Analysis is broken into two distinct components: DNA sequence or read concatenation, followed by tag-counting and annotation. The end result produces output containing the homology-based functional annotation and respective gene expression measure signifying how many times sequenced reads were found within the genomic ranges of functional annotations. TASE is a powerful tool to facilitate the process of annotating a given Illumina Solexa sequencing dataset. Our results indicate that both homology-based annotation and tag-count analysis are achieved in very efficient times, enabling researchers to delve deep into a given CASAVA-build and maximize information extraction from a sequencing dataset. TASE is specially designed to translate sequence data in a CASAVA-build into functional annotations while producing corresponding gene expression measurements. Such analysis is executed in an ultrafast and highly efficient manner, whether the analysis is of a single-read or paired-end sequencing experiment. TASE is a user-friendly and freely available application, allowing rapid analysis and annotation of any given Illumina Solexa sequencing dataset with ease.
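The tag-counting idea described above, counting how many sequenced reads fall within the genomic range of each functional annotation, can be sketched as follows. TASE itself is written in Java with a SQL Server backend; this Python sketch is not its implementation, and the annotation records, read positions and the function name `count_tags` are hypothetical.

```python
from collections import Counter

# Hypothetical annotation records:
# (chromosome, start, end, gene, presumed function).
ANNOTATIONS = [
    ("chr1", 100, 500, "geneA", "kinase"),
    ("chr1", 800, 1200, "geneB", "transporter"),
    ("chr2", 50, 300, "geneC", "unknown"),
]


def count_tags(reads, annotations):
    """Count reads whose start position lies inside each annotated range.

    Range boundaries are treated as inclusive. A linear scan is used for
    clarity; a real tool would index the annotations.
    """
    counts = Counter()
    for chrom, position in reads:
        for a_chrom, start, end, gene, _function in annotations:
            if chrom == a_chrom and start <= position <= end:
                counts[gene] += 1
    return counts


# Hypothetical aligned reads as (chromosome, start position).
reads = [("chr1", 150), ("chr1", 450), ("chr1", 900), ("chr2", 400), ("chr2", 60)]
tag_counts = count_tags(reads, ANNOTATIONS)
for _chrom, _start, _end, gene, function in ANNOTATIONS:
    print(gene, function, tag_counts[gene])
```

The read at `("chr2", 400)` matches no annotation and is left uncounted, which mirrors reads that fall outside any annotated range.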
dictyExpress: a web-based platform for sequence data management and analytics in Dictyostelium and beyond.

PubMed

Stajdohar, Miha; Rosengarten, Rafael D; Kokosar, Janez; Jeran, Luka; Blenkus, Domen; Shaulsky, Gad; Zupan, Blaz

2017-06-02

Dictyostelium discoideum, a soil-dwelling social amoeba, is a model for the study of numerous biological processes. Research in the field has benefited mightily from the adoption of next-generation sequencing for genomics and transcriptomics. Dictyostelium biologists now face the widespread challenges of analyzing and exploring high dimensional data sets to generate hypotheses and discover novel insights. We present dictyExpress (2.0), a web application designed for exploratory analysis of gene expression data, as well as data from related experiments such as Chromatin Immunoprecipitation sequencing (ChIP-Seq). The application features visualization modules that include time course expression profiles, clustering, gene ontology enrichment analysis, differential expression analysis and comparison of experiments. All visualizations are interactive and interconnected, such that the selection of genes in one module propagates instantly to visualizations in other modules. dictyExpress currently stores the data from over 800 Dictyostelium experiments and is embedded within a general-purpose software framework for management of next-generation sequencing data. dictyExpress allows users to explore their data in a broader context by reciprocal linking with dictyBase, a repository of Dictyostelium genomic data. In addition, we introduce a companion application called GenBoard, an intuitive graphical user interface for data management and bioinformatics analysis. dictyExpress and GenBoard enable broad adoption of next generation sequencing based inquiries by the Dictyostelium research community. Labs without the means to undertake deep sequencing projects can mine the data available to the public. The entire information flow, from raw sequence data to hypothesis testing, can be accomplished in an efficient workspace. The software framework is generalizable and represents a useful approach for any research community. To encourage wider usage, the backend is open-source, available for extension and further development by bioinformaticians and data scientists.
Comparisons between Arabidopsis thaliana and Drosophila melanogaster in relation to Coding and Noncoding Sequence Length and Gene Expression

PubMed Central

Caldwell, Rachel; Lin, Yan-Xia; Zhang, Ren

2015-01-01

There is a continuing interest in the analysis of gene architecture and gene expression to determine the relationship that may exist. Advances in high-quality sequencing technologies and large-scale resource datasets have increased the understanding of relationships and cross-referencing of expression data to the large genome data. Although a negative correlation between expression level and gene (especially transcript) length has been generally accepted, there have been some conflicting results arising from the literature concerning the impacts of different regions of genes, and the underlying reason is not well understood. The research aims to apply quantile regression techniques for statistical analysis of coding and noncoding sequence length and gene expression data in the plant, Arabidopsis thaliana, and fruit fly, Drosophila melanogaster, to determine if a relationship exists and if there is any variation or similarities between these species. The quantile regression analysis found that the coding sequence length and gene expression correlations varied, and similarities emerged for the noncoding sequence length (5′ and 3′ UTRs) between animal and plant species. In conclusion, the information described in this study provides the basis for further exploration into gene regulation with regard to coding and noncoding sequence length. PMID:26114098
Single nucleotide polymorphisms from Theobroma cacao expressed sequence tags associated with witches' broom disease in cacao.

PubMed

Lima, L S; Gramacho, K P; Carels, N; Novais, R; Gaiotto, F A; Lopes, U V; Gesteira, A S; Zaidan, H A; Cascardo, J C M; Pires, J L; Micheli, F

2009-07-14

In order to increase the efficiency of cacao tree resistance to witches' broom disease, which is caused by Moniliophthora perniciosa (Tricholomataceae), we looked for molecular markers that could help in the selection of resistant cacao genotypes. Among the different markers useful for developing marker-assisted selection, single nucleotide polymorphisms (SNPs) constitute the most common type of sequence difference between alleles and can be easily detected by in silico analysis from expressed sequence tag libraries. We report the first detection and analysis of SNPs from cacao-M. perniciosa interaction expressed sequence tags, using bioinformatics. Selection based on analysis of these SNPs should be useful for developing cacao varieties resistant to this devastating disease.
An efficient annotation and gene-expression derivation tool for Illumina Solexa datasets

PubMed Central

2010-01-01

Background: An Illumina flow cell with all eight lanes occupied produces well over a terabyte of images, with gigabytes of reads following sequence alignment. The ability to translate such reads into meaningful annotation is therefore of great concern and importance. One can very easily be flooded with such a great volume of textual, unannotated data, irrespective of read quality or size. CASAVA, an optional analysis tool for Illumina sequencing experiments, supports INDEL detection, SNP information, and allele calling. It is therefore of significant value not only to extract from such analysis a measure of gene expression in the form of tag-counts, but also to annotate such reads. Findings: We developed TASE (Tag counting and Analysis of Solexa Experiments), a rapid tag-counting and annotation software tool specifically designed for Illumina CASAVA sequencing datasets. Developed in Java and deployed using jTDS JDBC driver and a SQL Server backend, TASE provides an extremely fast means of calculating gene expression through tag-counts while annotating sequenced reads with the gene's presumed function, from any given CASAVA-build. Such a build is generated for both DNA and RNA sequencing. Analysis is broken into two distinct components: DNA sequence or read concatenation, followed by tag-counting and annotation. The end result produces output containing the homology-based functional annotation and respective gene expression measure signifying how many times sequenced reads were found within the genomic ranges of functional annotations. Conclusions: TASE is a powerful tool to facilitate the process of annotating a given Illumina Solexa sequencing dataset. Our results indicate that both homology-based annotation and tag-count analysis are achieved in very efficient times, enabling researchers to delve deep into a given CASAVA-build and maximize information extraction from a sequencing dataset. TASE is specially designed to translate sequence data in a CASAVA-build into functional annotations while producing corresponding gene expression measurements. Such analysis is executed in an ultrafast and highly efficient manner, whether the analysis is of a single-read or paired-end sequencing experiment. TASE is a user-friendly and freely available application, allowing rapid analysis and annotation of any given Illumina Solexa sequencing dataset with ease. PMID:20598141
Composite transcriptome assembly of RNA-seq data in a sheep model for delayed bone healing.

PubMed

Jäger, Marten; Ott, Claus-Eric; Grünhagen, Johannes; Hecht, Jochen; Schell, Hanna; Mundlos, Stefan; Duda, Georg N; Robinson, Peter N; Lienau, Jasmin

2011-03-24

The sheep is an important model organism for many types of medically relevant research, but molecular genetic experiments in the sheep have been limited by the lack of knowledge about ovine gene sequences. Prior to our study, mRNA sequences for only 1,556 partial or complete ovine genes were publicly available. Therefore, we developed a composite de novo transcriptome assembly method for next-generation sequence data to combine known ovine mRNA and EST sequences, mRNA sequences from mouse and cow, and sequences assembled de novo from short read RNA-Seq data into a composite reference transcriptome, and identified transcripts from over 12 thousand previously undescribed ovine genes. Gene expression analysis based on these data revealed substantially different expression profiles in standard versus delayed bone healing in an ovine tibial osteotomy model. Hundreds of transcripts were differentially expressed between standard and delayed healing and between the time points of the standard and delayed healing groups. We used the sheep sequences to design quantitative RT-PCR assays with which we validated the differential expression of 26 genes that had been identified by RNA-seq analysis. A number of clusters of characteristic expression profiles could be identified, some of which showed striking differences between the standard and delayed healing groups. Gene Ontology (GO) analysis showed that the differentially expressed genes were enriched in terms including extracellular matrix, cartilage development, contractile fiber, and chemokine activity. Our results provide a first atlas of gene expression profiles and differentially expressed genes in standard and delayed bone healing in a large-animal model and provide a number of clues as to the shifts in gene expression that underlie delayed bone healing. In the course of our study, we identified transcripts of 13,987 ovine genes, including 12,431 genes for which no sequence information was previously available. 
This information will provide a basis for future molecular research involving the sheep as a model organism.
Composite transcriptome assembly of RNA-seq data in a sheep model for delayed bone healing

PubMed Central

2011-01-01

Background: The sheep is an important model organism for many types of medically relevant research, but molecular genetic experiments in the sheep have been limited by the lack of knowledge about ovine gene sequences. Results: Prior to our study, mRNA sequences for only 1,556 partial or complete ovine genes were publicly available. Therefore, we developed a composite de novo transcriptome assembly method for next-generation sequence data to combine known ovine mRNA and EST sequences, mRNA sequences from mouse and cow, and sequences assembled de novo from short read RNA-Seq data into a composite reference transcriptome, and identified transcripts from over 12 thousand previously undescribed ovine genes. Gene expression analysis based on these data revealed substantially different expression profiles in standard versus delayed bone healing in an ovine tibial osteotomy model. Hundreds of transcripts were differentially expressed between standard and delayed healing and between the time points of the standard and delayed healing groups. We used the sheep sequences to design quantitative RT-PCR assays with which we validated the differential expression of 26 genes that had been identified by RNA-seq analysis. A number of clusters of characteristic expression profiles could be identified, some of which showed striking differences between the standard and delayed healing groups. Gene Ontology (GO) analysis showed that the differentially expressed genes were enriched in terms including extracellular matrix, cartilage development, contractile fiber, and chemokine activity. Conclusions: Our results provide a first atlas of gene expression profiles and differentially expressed genes in standard and delayed bone healing in a large-animal model and provide a number of clues as to the shifts in gene expression that underlie delayed bone healing. In the course of our study, we identified transcripts of 13,987 ovine genes, including 12,431 genes for which no sequence information was previously available. This information will provide a basis for future molecular research involving the sheep as a model organism. PMID:21435219
Isoform-level gene expression patterns in single-cell RNA-sequencing data.

PubMed

Vu, Trung Nghia; Wills, Quin F; Kalari, Krishna R; Niu, Nifang; Wang, Liewei; Pawitan, Yudi; Rantalainen, Mattias

2018-02-27

RNA sequencing of single cells enables characterization of transcriptional heterogeneity in seemingly homogeneous cell populations. Single-cell sequencing has been applied in a wide range of research fields. However, few studies have focused on characterization of isoform-level expression patterns at the single-cell level. In this study we propose and apply a novel method, ISOform-Patterns (ISOP), based on mixture modeling, to characterize the expression patterns of isoform pairs from the same gene in single-cell isoform-level expression data. We define six principal patterns of isoform expression relationships and describe a method for differential-pattern analysis. We demonstrate ISOP through analysis of single-cell RNA-sequencing data from a breast cancer cell line, with replication in three independent datasets. We assigned the pattern types to each of 16,562 isoform-pairs from 4,929 genes. Among those, 26% of the discovered patterns were significant (p<0.05), while the remaining patterns are possibly effects of transcriptional bursting, drop-out and stochastic biological heterogeneity. Furthermore, 32% of genes discovered through differential-pattern analysis were not detected by differential-expression analysis. The effect of drop-out events, mean expression level, and properties of the expression distribution on the performance of ISOP were also investigated through simulated datasets. To conclude, ISOP provides a novel approach for characterization of isoform-level preference, commitment and heterogeneity in single-cell RNA-sequencing data. The ISOP method has been implemented as an R package and is available at https://github.com/nghiavtr/ISOP under a GPL-3 license. Contact: mattias.rantalainen@ki.se. Supplementary data are available at Bioinformatics online.
A diverse family of serine proteinase genes expressed in cotton boll weevil (Anthonomus grandis): implications for the design of pest-resistant transgenic cotton plants.

PubMed

Oliveira-Neto, Osmundo B; Batista, João A N; Rigden, Daniel J; Fragoso, Rodrigo R; Silva, Rodrigo O; Gomes, Eliane A; Franco, Octávio L; Dias, Simoni C; Cordeiro, Célia M T; Monnerat, Rose G; Grossi-De-Sá, Maria F

2004-09-01

Fourteen different cDNA fragments encoding serine proteinases were isolated by reverse transcription-PCR from cotton boll weevil (Anthonomus grandis) larvae. A large diversity between the sequences was observed, with a mean pairwise identity of 22% in the amino acid sequence. The cDNAs encompassed 11 trypsin-like sequences classifiable into three families and three chymotrypsin-like sequences belonging to a single family. Using a combination of 5' and 3' RACE, the full-length sequence was obtained for five of the cDNAs, named Agser2, Agser5, Agser6, Agser10 and Agser21. The encoded proteins included amino acid sequence motifs of serine proteinase active sites, conserved cysteine residues, and both zymogen activation and signal peptides. Southern blotting analysis suggested that one or two copies of these serine proteinase genes exist in the A. grandis genome. Northern blotting analysis of Agser2 and Agser5 showed that for both genes, expression is induced upon feeding and is concentrated in the gut of larvae and adult insects. Reverse northern analysis of the 14 cDNA fragments showed that only two trypsin-like and two chymotrypsin-like were expressed at detectable levels. Under the effect of the serine proteinase inhibitors soybean Kunitz trypsin inhibitor and black-eyed pea trypsin/chymotrypsin inhibitor, expression of one of the trypsin-like sequences was upregulated while expression of the two chymotrypsin-like sequences was downregulated. Copyright 2004 Elsevier Ltd.
Analysis of xylem formation in pine by cDNA sequencing

NASA Technical Reports Server (NTRS)

Allona, I.; Quinn, M.; Shoop, E.; Swope, K.; St Cyr, S.; Carlis, J.; Riedl, J.; Retzel, E.; Campbell, M. M.; Sederoff, R.;

1998-01-01

Secondary xylem (wood) formation is likely to involve some genes expressed rarely or not at all in herbaceous plants. Moreover, environmental and developmental stimuli influence secondary xylem differentiation, producing morphological and chemical changes in wood. To increase our understanding of xylem formation, and to provide material for comparative analysis of gymnosperm and angiosperm sequences, ESTs were obtained from immature xylem of loblolly pine (Pinus taeda L.). A total of 1,097 single-pass sequences were obtained from 5' ends of cDNAs made from gravistimulated tissue from bent trees. Cluster analysis detected 107 groups of similar sequences, ranging in size from 2 to 20 sequences. A total of 361 sequences fell into these groups, whereas 736 sequences were unique. About 55% of the pine EST sequences show similarity to previously described sequences in public databases. About 10% of the recognized genes encode factors involved in cell wall formation. Sequences similar to cell wall proteins, most known lignin biosynthetic enzymes, and several enzymes of carbohydrate metabolism were found. A number of putative regulatory proteins also are represented. Expression patterns of several of these genes were studied in various tissues and organs of pine. Sequencing novel genes expressed during xylem formation will provide a powerful means of identifying mechanisms controlling this important differentiation pathway.
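The counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract; the derived fraction and mean group size are simple consequences of those numbers, not figures reported by the authors.

```python
# Numbers taken from the abstract.
total_ests = 1097   # single-pass sequences obtained
clustered = 361     # sequences that fell into groups
unique = 736        # sequences with no similar partner
groups = 107        # groups of similar sequences

# Clustered and unique sequences should account for every EST.
assert clustered + unique == total_ests

clustered_fraction = clustered / total_ests
mean_group_size = clustered / groups
print(f"{clustered_fraction:.1%} of ESTs fall into groups; "
      f"mean group size {mean_group_size:.2f}")
```

The mean group size sits near the low end of the reported range of 2 to 20 sequences per group, which suggests that most groups are small.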

Advanced colorectal adenoma related gene expression signature may predict prognostic for colorectal cancer patients with adenoma-carcinoma sequence.

PubMed

Li, Bing; Shi, Xiao-Yu; Liao, Dai-Xiang; Cao, Bang-Rong; Luo, Cheng-Hua; Cheng, Shu-Jun

2015-01-01

There are still no absolute parameters predicting progression of adenoma into cancer. The present study aimed to characterize functional differences in the multistep carcinogenic process of the adenoma-carcinoma sequence. All samples were collected and mRNA expression profiling was performed using Agilent Microarray high-throughput gene-chip technology. Then, the characteristics of mRNA expression profiles of the adenoma-carcinoma sequence were described with bioinformatics software, and we analyzed the relationship between gene expression profiles of the adenoma-adenocarcinoma sequence and clinical prognosis of colorectal cancer. The mRNA expression profiles of the adenoma-carcinoma sequence were significantly different between the high-grade intraepithelial neoplasia group and the adenocarcinoma group. Gene ontology biological process enrichment analysis of the differentially expressed genes between the high-grade intraepithelial neoplasia group and the adenocarcinoma group showed that these genes were enriched in extracellular structure organization, skeletal system development, biological adhesion and itself regulated growth regulation, with P values after FDR correction of less than 0.05. In addition, IPR-related protein mainly focused on the insulin-like growth factor binding proteins. The variable trends of gene expression profiles for the adenoma-carcinoma sequence were mainly concentrated in high-grade intraepithelial neoplasia and adenocarcinoma. The differentially expressed genes are significantly correlated between the high-grade intraepithelial neoplasia group and the adenocarcinoma group. Bioinformatics analysis is an effective way to study the gene expression profiles in the adenoma-carcinoma sequence, and may provide an effective tool to extend colorectal cancer research strategy to colorectal adenoma or advanced adenoma.
A 20 bp cis-acting element is both necessary and sufficient to mediate elicitor response of a maize PRms gene.

PubMed

Raventós, D; Jensen, A B; Rask, M B; Casacuberta, J M; Mundy, J; San Segundo, B

1995-01-01

Transient gene expression assays in barley aleurone protoplasts were used to identify a cis-regulatory element involved in the elicitor-responsive expression of the maize PRms gene. Analysis of transcriptional fusions between PRms 5' upstream sequences and a chloramphenicol acetyltransferase reporter gene, as well as chimeric promoters containing PRms promoter fragments or repeated oligonucleotides fused to a minimal promoter, delineated a 20 bp sequence which functioned as an elicitor-response element (ERE). This sequence contains a motif (-246 AATTGACC) similar to sequences found in promoters of other pathogen-responsive genes. The analysis also indicated that an enhancing sequence(s) between -397 and -296 is required for full PRms activation by elicitors. The protein kinase inhibitor staurosporine was found to completely block the transcriptional activation induced by elicitors. These data indicate that protein phosphorylation is involved in the signal transduction pathway leading to PRms expression.
Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks

PubMed Central

Trapnell, Cole; Roberts, Adam; Goff, Loyal; Pertea, Geo; Kim, Daehwan; Kelley, David R; Pimentel, Harold; Salzberg, Steven L; Rinn, John L; Pachter, Lior

2012-01-01

Recent advances in high-throughput cDNA sequencing (RNA-seq) can reveal new genes and splice variants and quantify expression genome-wide in a single assay. The volume and complexity of data from RNA-seq experiments necessitate scalable, fast and mathematically principled analysis software. TopHat and Cufflinks are free, open-source software tools for gene discovery and comprehensive expression analysis of high-throughput mRNA sequencing (RNA-seq) data. Together, they allow biologists to identify new genes and new splice variants of known ones, as well as compare gene and transcript expression under two or more conditions. This protocol describes in detail how to use TopHat and Cufflinks to perform such analyses. It also covers several accessory tools and utilities that aid in managing data, including CummeRbund, a tool for visualizing RNA-seq analysis results. Although the procedure assumes basic informatics skills, these tools assume little to no background with RNA-seq analysis and are meant for novices and experts alike. The protocol begins with raw sequencing reads and produces a transcriptome assembly, lists of differentially expressed and regulated genes and transcripts, and publication-quality visualizations of analysis results. The protocol's execution time depends on the volume of transcriptome sequencing data and available computing resources but takes less than 1 d of computer time for typical experiments and ~1 h of hands-on time. PMID:22383036
Isolation and characterization of the promoter sequence of a cassava gene coding for Pt2L4, a glutamic acid-rich protein differentially expressed in storage roots.

PubMed

de Souza, C R; Aragão, F J; Moreira, E C O; Costa, C N M; Nascimento, S B; Carvalho, L J

2009-03-24

Cassava is one of the most important tropical food crops for more than 600 million people worldwide. Transgenic technologies can be useful for increasing its nutritional value and its resistance to viral diseases and insect pests. However, tissue-specific promoters that guarantee correct expression of transgenes would be necessary. We used inverse polymerase chain reaction to isolate a promoter sequence of the Mec1 gene coding for Pt2L4, a glutamic acid-rich protein differentially expressed in cassava storage roots. In silico analysis revealed putative cis-acting regulatory elements within this promoter sequence, including root-specific elements that may be required for its expression in vascular tissues. Transient expression experiments showed that the Mec1 promoter is functional, since this sequence was able to drive GUS expression in bean embryonic axes. Results from our computational analysis can serve as a guide for functional experiments to identify regions with tissue-specific Mec1 promoter activity. The DNA sequence that we identified is a new promoter that could be a candidate for genetic engineering of cassava roots.

Studies of a biochemical factory: tomato trichome deep expressed sequence tag sequencing and proteomics.

PubMed

Schilmiller, Anthony L; Miner, Dennis P; Larson, Matthew; McDowell, Eric; Gang, David R; Wilkerson, Curtis; Last, Robert L

2010-07-01

Shotgun proteomics analysis allows hundreds of proteins to be identified and quantified from a single sample at relatively low cost. Extensive DNA sequence information is a prerequisite for shotgun proteomics, and it is ideal to have sequence for the organism being studied rather than from related species or accessions. While this requirement has limited the set of organisms that are candidates for this approach, next generation sequencing technologies make it feasible to obtain deep DNA sequence coverage from any organism. As part of our studies of specialized (secondary) metabolism in tomato (Solanum lycopersicum) trichomes, 454 sequencing of cDNA was combined with shotgun proteomics analyses to obtain in-depth profiles of genes and proteins expressed in leaf and stem glandular trichomes of 3-week-old plants. The expressed sequence tag and proteomics data sets combined with metabolite analysis led to the discovery and characterization of a sesquiterpene synthase that produces beta-caryophyllene and alpha-humulene from E,E-farnesyl diphosphate in trichomes of leaf but not of stem. This analysis demonstrates the utility of combining high-throughput cDNA sequencing with proteomics experiments in a target tissue. These data can be used for dissection of other biochemical processes in these specialized epidermal cells.
Studies of a Biochemical Factory: Tomato Trichome Deep Expressed Sequence Tag Sequencing and Proteomics

PubMed Central

Schilmiller, Anthony L.; Miner, Dennis P.; Larson, Matthew; McDowell, Eric; Gang, David R.; Wilkerson, Curtis; Last, Robert L.

2010-01-01

Shotgun proteomics analysis allows hundreds of proteins to be identified and quantified from a single sample at relatively low cost. Extensive DNA sequence information is a prerequisite for shotgun proteomics, and it is ideal to have sequence for the organism being studied rather than from related species or accessions. While this requirement has limited the set of organisms that are candidates for this approach, next generation sequencing technologies make it feasible to obtain deep DNA sequence coverage from any organism. As part of our studies of specialized (secondary) metabolism in tomato (Solanum lycopersicum) trichomes, 454 sequencing of cDNA was combined with shotgun proteomics analyses to obtain in-depth profiles of genes and proteins expressed in leaf and stem glandular trichomes of 3-week-old plants. The expressed sequence tag and proteomics data sets combined with metabolite analysis led to the discovery and characterization of a sesquiterpene synthase that produces β-caryophyllene and α-humulene from E,E-farnesyl diphosphate in trichomes of leaf but not of stem. This analysis demonstrates the utility of combining high-throughput cDNA sequencing with proteomics experiments in a target tissue. These data can be used for dissection of other biochemical processes in these specialized epidermal cells. PMID:20431087
Differential expression analysis of Paralichthys olivaceus microRNAs in adult ovary and testis by deep sequencing.

PubMed

Gu, Yifeng; Zhang, Lei; Chen, Xiaowu

2014-08-01

MicroRNAs (miRNAs) play an important role in gonadal development and differentiation in fish. However, understanding of the mechanism of this process is hindered by our poor knowledge of miRNA expression patterns in fish gonads. In this study, miRNA libraries derived from adult gonads of Paralichthys olivaceus were generated by using next-generation sequencing (NGS) technology. Bioinformatics analysis was performed to distinguish mature miRNA sequences from two classes of small RNAs represented in the sequencing data. A total of 141 mature miRNAs were identified, of which 21 were found in P. olivaceus for the first time. Variation and bias in miRNA expression were inferred from the deep sequencing reads. Some miRNAs, such as pol-miR-143, pol-miR-26a and pol-let-7a, were highly expressed in both gonads, while others exhibited clear sex-biased expression in one gonad. Approximately 20.0% and 13.1% of the isolated miRNAs were preferentially expressed in the testis (FC<0.5) or ovary (FC>2), respectively. The identification and preliminary analysis of sex-biased miRNA expression in P. olivaceus gonads by NGS provide a basic catalog of miRNAs to facilitate future study and exploitation of sexual regulatory mechanisms in P. olivaceus. Copyright © 2014. Published by Elsevier Inc.
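The fold-change thresholds quoted in the abstract above (FC<0.5 for testis, FC>2 for ovary) can be illustrated with a minimal sketch. It assumes that FC is the ratio of normalized ovary reads to normalized testis reads, which the abstract implies but does not state; the function name and the pseudocount are hypothetical additions for illustration, not part of the study.

```python
def classify_sex_bias(ovary_reads, testis_reads, pseudocount=1.0):
    """Label a miRNA by its ovary/testis fold change (FC).

    Assumption: FC = ovary / testis on normalized read counts.
    The pseudocount (hypothetical) only avoids division by zero.
    """
    fc = (ovary_reads + pseudocount) / (testis_reads + pseudocount)
    if fc > 2:
        return "ovary-biased"
    if fc < 0.5:
        return "testis-biased"
    return "unbiased"


if __name__ == "__main__":
    # Made-up counts purely to exercise the three branches.
    for name, ovary, testis in [("miR-A", 99, 9), ("miR-B", 9, 99), ("miR-C", 10, 10)]:
        print(name, classify_sex_bias(ovary, testis))
```

The counts in the usage lines are invented; real input would be library-size-normalized read counts per miRNA.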
Complementary DNA cloning, sequence analysis, and tissue transcription profile of a novel U2AF2 gene from the Chinese Banna mini-pig inbred line.

PubMed

Wang, S Y; Huo, J L; Miao, Y W; Cheng, W M; Zeng, Y Z

2013-04-02

U2 small nuclear RNA auxiliary factor 2 (U2AF2) is an important gene for pre-messenger RNA splicing in higher eukaryotes. In this study, the Banna mini-pig inbred line (BMI) U2AF2 coding sequence (CDS) was cloned, sequenced, and characterized. The U2AF2 complete CDS was amplified using the reverse transcription-polymerase chain reaction (RT-PCR) technique based on the conserved sequence information of cattle and known highly homologous swine expressed sequence tags. This novel gene was deposited into the National Center for Biotechnology Information database (Accession No. JQ839267). Sequence analysis revealed that the BMI U2AF2 coding sequence consisted of 1416 bp and encoded 471 amino acids with a molecular weight of 53.12 kDa. The protein sequence has high sequence homology with U2AF65 of 6 species - Homo sapiens (100%), Equus caballus (100%), Canis lupus (100%), Macaca mulatta (99.8%), Bos taurus (74.4%), and Mus musculus (74.4%). The phylogenetic tree analysis revealed that BMI U2AF65 has a closer genetic relationship with B. taurus U2AF65 than with U2AF65 of E. caballus, C. lupus, M. mulatta, H. sapiens, and M. musculus. RT-PCR analysis showed that BMI U2AF2 was most highly expressed in the brain; moderately expressed in the spleen, lung, muscle, and skin; and weakly expressed in the liver, kidney, and ovary. Its expression was nearly silent in the spinal cord, nerve fiber, heart, stomach, pancreas, and intestine. Three microRNA target sites were predicted in the CDS of BMI U2AF2 messenger RNA. Our results establish a foundation for further insight into this swine gene.
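The relationship between the reported CDS length (1416 bp) and protein length (471 amino acids) follows from codon arithmetic, sketched below. The sketch assumes that the reported CDS length includes the terminal stop codon, which is consistent with the figures in the abstract; the function name is hypothetical.

```python
def protein_length_from_cds(cds_bp, includes_stop=True):
    """Number of amino acids encoded by a CDS of the given length in bp.

    Assumption: the CDS is a whole number of codons and, by default,
    its final codon is a stop codon that encodes no amino acid.
    """
    if cds_bp % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    codons = cds_bp // 3
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    # 1416 bp = 472 codons; minus one stop codon gives 471 residues.
    print(protein_length_from_cds(1416))
```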
A combination of LongSAGE with Solexa sequencing is well suited to explore the depth and the complexity of transcriptome

PubMed Central

Hanriot, Lucie; Keime, Céline; Gay, Nadine; Faure, Claudine; Dossat, Carole; Wincker, Patrick; Scoté-Blachon, Céline; Peyron, Christelle; Gandrillon, Olivier

2008-01-01

Background: "Open" transcriptome analysis methods allow the study of gene expression without a priori knowledge of the transcript sequences. As of now, SAGE (Serial Analysis of Gene Expression), LongSAGE and MPSS (Massively Parallel Signature Sequencing) are the most widely used methods for "open" transcriptome analysis. Both LongSAGE and MPSS rely on the isolation of 21 bp tag sequences from each transcript. In contrast to LongSAGE, the high-throughput sequencing method used in MPSS enables the rapid sequencing of very large libraries containing several million tags, allowing deep transcriptome analysis. However, a bias in the complexity of the transcriptome representation obtained by MPSS was recently uncovered. Results: In order to make a deep analysis of the mouse hypothalamus transcriptome while avoiding the limitation introduced by MPSS, we combined LongSAGE with the Solexa sequencing technology and obtained a library of more than 11 million tags. We then compared it to a LongSAGE library of mouse hypothalamus sequenced with the Sanger method. Conclusion: We found that Solexa sequencing technology combined with LongSAGE is perfectly suited for deep transcriptome analysis. In contrast to MPSS, it gives a complex representation of the transcriptome as reliable as a LongSAGE library sequenced by the Sanger method. PMID:18796152
Analysis and Functional Annotation of an Expressed Sequence Tag Collection for Tropical Crop Sugarcane

PubMed Central

Vettore, André L.; da Silva, Felipe R.; Kemper, Edson L.; Souza, Glaucia M.; da Silva, Aline M.; Ferro, Maria Inês T.; Henrique-Silva, Flavio; Giglioti, Éder A.; Lemos, Manoel V.F.; Coutinho, Luiz L.; Nobrega, Marina P.; Carrer, Helaine; França, Suzelei C.; Bacci, Maurício; Goldman, Maria Helena S.; Gomes, Suely L.; Nunes, Luiz R.; Camargo, Luis E.A.; Siqueira, Walter J.; Van Sluys, Marie-Anne; Thiemann, Otavio H.; Kuramae, Eiko E.; Santelli, Roberto V.; Marino, Celso L.; Targon, Maria L.P.N.; Ferro, Jesus A.; Silveira, Henrique C.S.; Marini, Danyelle C.; Lemos, Eliana G.M.; Monteiro-Vitorello, Claudia B.; Tambor, José H.M.; Carraro, Dirce M.; Roberto, Patrícia G.; Martins, Vanderlei G.; Goldman, Gustavo H.; de Oliveira, Regina C.; Truffi, Daniela; Colombo, Carlos A.; Rossi, Magdalena; de Araujo, Paula G.; Sculaccio, Susana A.; Angella, Aline; Lima, Marleide M.A.; de Rosa, Vicente E.; Siviero, Fábio; Coscrato, Virginia E.; Machado, Marcos A.; Grivet, Laurent; Di Mauro, Sonia M.Z.; Nobrega, Francisco G.; Menck, Carlos F.M.; Braga, Marilia D.V.; Telles, Guilherme P.; Cara, Frank A.A.; Pedrosa, Guilherme; Meidanis, João; Arruda, Paulo

2003-01-01

To contribute to our understanding of the genome complexity of sugarcane, we undertook a large-scale expressed sequence tag (EST) program. More than 260,000 cDNA clones were partially sequenced from 26 standard cDNA libraries generated from different sugarcane tissues. After the processing of the sequences, 237,954 high-quality ESTs were identified. These ESTs were assembled into 43,141 putative transcripts. Of the assembled sequences, 35.6% presented no matches with existing sequences in public databases. A global analysis of the whole SUCEST data set indicated that 14,409 assembled sequences (33% of the total) contained at least one cDNA clone with a full-length insert. Annotation of the 43,141 assembled sequences associated almost 50% of the putative identified sugarcane genes with protein metabolism, cellular communication/signal transduction, bioenergetics, and stress responses. Inspection of the translated assembled sequences for conserved protein domains revealed 40,821 amino acid sequences with 1415 Pfam domains. Reassembling the consensus sequences of the 43,141 transcripts revealed a 22% redundancy in the first assembly. This indicated that possibly 33,620 unique genes had been identified and that >90% of the sugarcane expressed genes were tagged. PMID:14613979
Distinct profiles of expressed sequence tags during intestinal regeneration in the sea cucumber Holothuria glaberrima

PubMed Central

Rojas-Cartagena, Carmencita; Ortíz-Pineda, Pablo; Ramírez-Gómez, Francisco; Suárez-Castillo, Edna C.; Matos-Cruz, Vanessa; Rodríguez, Carlos; Ortíz-Zuazaga, Humberto; García-Arrarás, José E.

2010-01-01

Repair and regeneration are key processes for tissue maintenance, and their disruption may lead to disease states. Little is known about the molecular mechanisms that underlie the repair and regeneration of the digestive tract. The sea cucumber Holothuria glaberrima represents an excellent model to dissect and characterize the molecular events during intestinal regeneration. To study the gene expression profile, cDNA libraries were constructed from normal, 3-day, and 7-day regenerating intestines of H. glaberrima. Clones were randomly sequenced and queried against the nonredundant protein database at the National Center for Biotechnology Information. RT-PCR analyses were made of several genes to determine their expression profile during intestinal regeneration. A total of 5,173 sequences from three cDNA libraries were obtained. About 46.2, 35.6, and 26.2% of the sequences for the normal, 3-day, and 7-day cDNA libraries, respectively, shared significant similarity with known sequences in the protein database of GenBank but share only 10% similarity among them. Analysis of the libraries in terms of functional processes, protein domains, and most common sequences suggests that a differential expression profile is taking place during the regeneration process. Further examination of the expressed sequence tag dataset revealed that 12 putative genes are differentially expressed at a significant level (R > 6). Experimental validation by RT-PCR analysis reveals that at least three genes (unknown C-4677-1, melanotransferrin, and centaurin) show differential expression during regeneration. These findings strongly suggest that the gene expression profile varies among regeneration stages and provide evidence for the existence of differential gene expression. PMID:17579180
Single cell sequencing reveals heterogeneity within ovarian cancer epithelium and cancer associated stromal cells.

PubMed

Winterhoff, Boris J; Maile, Makayla; Mitra, Amit Kumar; Sebe, Attila; Bazzaro, Martina; Geller, Melissa A; Abrahante, Juan E; Klein, Molly; Hellweg, Raffaele; Mullany, Sally A; Beckman, Kenneth; Daniel, Jerry; Starr, Timothy K

2017-03-01

The purpose of this study was to determine the level of heterogeneity in high grade serous ovarian cancer (HGSOC) by analyzing RNA expression in single epithelial and cancer associated stromal cells. In addition, we explored the possibility of identifying subgroups based on pathway activation and pre-defined signatures from cancer stem cells and chemo-resistant cells. A fresh HGSOC tumor specimen derived from ovary was enzymatically digested and depleted of immune infiltrating cells. RNA sequencing was performed on 92 single cells and 66 of these single cell datasets passed quality control checks. Sequences were analyzed using multiple bioinformatics tools, including clustering, principal components analysis, and geneset enrichment analysis to identify subgroups and activated pathways. Immunohistochemistry for ovarian cancer, stem cell and stromal markers was performed on adjacent tumor sections. Analysis of the gene expression patterns identified two major subsets of cells characterized by epithelial and stromal gene expression patterns. The epithelial group was characterized by proliferative genes including genes associated with oxidative phosphorylation and MYC activity, while the stromal group was characterized by increased expression of extracellular matrix (ECM) genes and genes associated with epithelial-to-mesenchymal transition (EMT). Neither group expressed a signature correlating with published chemo-resistant gene signatures, but many cells, predominantly in the stromal subgroup, expressed markers associated with cancer stem cells. Single cell sequencing provides a means of identifying subpopulations of cancer cells within a single patient. Single cell sequence analysis may prove to be critical for understanding the etiology, progression and drug resistance in ovarian cancer. Copyright © 2017 Elsevier Inc. All rights reserved.
Molecular cloning of a putative gene encoding isopentenyltransferase from pingyitiancha (Malus hupehensis) and characterization of its response to nitrate.

PubMed

Peng, Jing; Peng, Futian; Zhu, Chunfu; Wei, Shaochong

2008-06-01

A putative isopentenyltransferase (IPT) encoding gene was identified from a pingyitiancha (Malus hupehensis Rehd.) expressed sequence tag database, and the full-length gene was cloned by RACE. Based on expression profile and sequence alignment, the nucleotide sequence of the clone, named MhIPT3, was most similar to AtIPT3, an IPT gene in Arabidopsis. The full-length cDNA contained a 963-bp open reading frame encoding a protein of 321 amino acids with a molecular mass of 37.3 kDa. Sequence analysis of genomic DNA revealed the absence of introns in the frame. Quantitative real-time PCR analysis demonstrated that the gene was expressed in roots, stems and leaves. Application of nitrate to roots of nitrogen-deprived seedlings strongly induced expression of MhIPT3 and was accompanied by the accumulation of cytokinins, whereas MhIPT3 expression was little affected by ammonium application to roots of nitrogen-deprived seedlings. Application of nitrate to leaves also up-regulated the expression of MhIPT3 and corresponded closely with the accumulation of isopentenyladenine and isopentenyladenosine in leaves.
Canine Lat1: molecular structure, distribution and its expression in cancer samples.

PubMed

Ochiai, Hideharu; Morishita, Taiki; Onda, Ken; Sugiyama, Hiroki; Maruo, Takuya

2012-07-01

A full-length cDNA sequence of canine L-type amino acid transporter 1 (Lat1) was determined from a canine brain. The sequence was 1828 bp long and was predicted to encode a polypeptide of 485 amino acids. The deduced amino acid sequence of canine Lat1 showed 93.2% and 91.1% similarity to those of humans and rats, respectively. Northern blot analysis detected Lat1 expression in the cerebellum at 4 kb, and Western blot analysis showed a single band at 40 kDa. RT-PCR analysis revealed distinct expression of Lat1 in the pancreas and testis in addition to the cerebrum and cerebellum. Notably, Lat1 expression was observed in the tissues of thyroid cancer, melanoma and hemangiopericytoma. Although the number of cancer samples examined was limited, Lat1 may serve as a useful biomarker of cancer cells in the veterinary clinic.
Integrated systems analysis reveals a molecular network underlying autism spectrum disorders

PubMed Central

Li, Jingjing; Shi, Minyi; Ma, Zhihai; Zhao, Shuchun; Euskirchen, Ghia; Ziskin, Jennifer; Urban, Alexander; Hallmayer, Joachim; Snyder, Michael

2014-01-01

Autism is a complex disease whose etiology remains elusive. We integrated previously and newly generated data and developed a systems framework involving the interactome, gene expression and genome sequencing to identify a protein interaction module with members strongly enriched for autism candidate genes. Sequencing of 25 patients confirmed the involvement of this module in autism, which was subsequently validated using an independent cohort of over 500 patients. Expression of this module was dichotomized with a ubiquitously expressed subcomponent and another subcomponent preferentially expressed in the corpus callosum, which was significantly affected by our identified mutations in the network center. RNA-sequencing of the corpus callosum from patients with autism exhibited extensive gene mis-expression in this module, and our immunochemical analysis showed that the human corpus callosum is predominantly populated by oligodendrocyte cells. Analysis of functional genomic data further revealed a significant involvement of this module in the development of oligodendrocyte cells in mouse brain. Our analysis delineates a natural network involved in autism, helps uncover novel candidate genes for this disease and improves our understanding of its molecular pathology. PMID:25549968
Identification of differentially expressed genes through RNA sequencing in goats (Capra hircus) at different postnatal stages

PubMed Central

Li, Qian; Lin, Sen

2017-01-01

Intramuscular fat (IMF) content and fatty acid composition of longissimus dorsi muscle (LM) change with growth, which partially determines the flavor and nutritional value of goat (Capra hircus) meat. However, unlike cattle, little information is available on the transcriptome-wide changes during different postnatal stages in small ruminants, especially goats. In this study, the sequencing reads of goat LM tissues collected from kid, youth, and adult period were mapped to the goat genome. Results showed that out of total 24 689 Unigenes, 20 435 Unigenes were annotated. Based on expected number of fragments per kilobase of transcript sequence per million base pairs sequenced (FPKM), 111 annotated differentially expressed genes (DEGs) were identified among different postnatal stages, which were subsequently assigned to 16 possible expression patterns by series-cluster analysis. Functional classification by Gene Ontology (GO) analysis was used for selecting the genes showing highest expression related to lipid metabolism. Finally, we identified the node genes for lipid metabolism regulation using co-expression analysis. In conclusion, these data may uncover candidate genes having functional roles in regulation of goat muscle development and lipid metabolism during the various growth stages in goats. PMID:28800357
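The FPKM measure mentioned in the abstract above can be written as a one-line formula. The sketch below uses the commonly cited definition (fragments per kilobase of transcript per million mapped fragments); whether the study normalized by exactly this denominator is an assumption, and the function name and example numbers are hypothetical.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """FPKM = fragments * 1e9 / (transcript length in bp * total mapped fragments).

    The factor 1e9 combines the per-kilobase (1e3) and
    per-million-fragments (1e6) scalings.
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


if __name__ == "__main__":
    # Invented example: 500 fragments on a 2 kb transcript in a
    # library of 25 million mapped fragments.
    print(fpkm(500, 2000, 25_000_000))
```

Because the value is scaled by both transcript length and library size, it is intended for comparing expression levels across genes and samples; it is not a substitute for a statistical test of differential expression.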
Identification of differentially expressed genes through RNA sequencing in goats (Capra hircus) at different postnatal stages.

PubMed

Lin, Yaqiu; Zhu, Jiangjiang; Wang, Yong; Li, Qian; Lin, Sen

2017-01-01

Intramuscular fat (IMF) content and fatty acid composition of longissimus dorsi muscle (LM) change with growth, which partially determines the flavor and nutritional value of goat (Capra hircus) meat. However, unlike cattle, little information is available on the transcriptome-wide changes during different postnatal stages in small ruminants, especially goats. In this study, the sequencing reads of goat LM tissues collected from kid, youth, and adult period were mapped to the goat genome. Results showed that out of total 24 689 Unigenes, 20 435 Unigenes were annotated. Based on expected number of fragments per kilobase of transcript sequence per million base pairs sequenced (FPKM), 111 annotated differentially expressed genes (DEGs) were identified among different postnatal stages, which were subsequently assigned to 16 possible expression patterns by series-cluster analysis. Functional classification by Gene Ontology (GO) analysis was used for selecting the genes showing highest expression related to lipid metabolism. Finally, we identified the node genes for lipid metabolism regulation using co-expression analysis. In conclusion, these data may uncover candidate genes having functional roles in regulation of goat muscle development and lipid metabolism during the various growth stages in goats.
Characterizing the Grape Transcriptome. Analysis of Expressed Sequence Tags from Multiple Vitis Species and Development of a Compendium of Gene Expression during Berry Development

PubMed Central

Silva, Francisco Goes da; Iandolino, Alberto; Al-Kayal, Fadi; Bohlmann, Marlene C.; Cushman, Mary Ann; Lim, Hyunju; Ergul, Ali; Figueroa, Rubi; Kabuloglu, Elif K.; Osborne, Craig; Rowe, Joan; Tattersall, Elizabeth; Leslie, Anna; Xu, Jane; Baek, JongMin; Cramer, Grant R.; Cushman, John C.; Cook, Douglas R.

2005-01-01

We report the analysis and annotation of 146,075 expressed sequence tags from Vitis species. The majority of these sequences were derived from different cultivars of Vitis vinifera, comprising an estimated 25,746 unique contig and singleton sequences that survey transcription in various tissues and developmental stages and during biotic and abiotic stress. Putatively homologous proteins were identified for over 17,752 of the transcripts, with 1,962 transcripts further subdivided into one or more Gene Ontology categories. A simple structured vocabulary, with modules for plant genotype, plant development, and stress, was developed to describe the relationship between individual expressed sequence tags and cDNA libraries; the resulting vocabulary provides query terms to facilitate data mining within the context of a relational database. As a measure of the extent to which characterized metabolic pathways were encompassed by the data set, we searched for homologs of the enzymes leading from glycolysis, through the oxidative/nonoxidative pentose phosphate pathway, and into the general phenylpropanoid pathway. Homologs were identified for 65 of these 77 enzymes, with 86% of enzymatic steps represented by paralogous genes. Differentially expressed transcripts were identified by means of a stringent believability index cutoff of ≥98.4%. Correlation analysis and two-dimensional hierarchical clustering grouped these transcripts according to similarity of expression. In the broadest analysis, 665 differentially expressed transcripts were identified across 29 cDNA libraries, representing a range of developmental and stress conditions. The groupings revealed expected associations between plant developmental stages and tissue types, with the notable exception of abiotic stress treatments. 
A more focused analysis of flower and berry development identified 87 differentially expressed transcripts and provides the basis for a compendium that relates gene expression and annotation to previously characterized aspects of berry development and physiology. Comparison with published results for select genes, as well as correlation analysis between independent data sets, suggests that the inferred in silico patterns of expression are likely to be an accurate representation of transcript abundance for the conditions surveyed. Thus, the combined data set reveals the in silico expression patterns for hundreds of genes in V. vinifera, the majority of which have not been previously studied within this species. PMID:16219919
Short-term application of dexamethasone on stem cells derived from human gingiva reduces the expression of RUNX2 and β-catenin.

PubMed

Kim, Bo-Bae; Kim, Minji; Park, Yun-Hee; Ko, Youngkyung; Park, Jun-Beom

2017-06-01

Objective: Next-generation sequencing was performed to evaluate the effects of short-term application of dexamethasone on human gingiva-derived mesenchymal stem cells. Methods: Human gingiva-derived stem cells were treated with a final concentration of 10⁻⁷ M dexamethasone and the same concentration of vehicle control. This was followed by mRNA sequencing and data analysis, gene ontology and pathway analysis, quantitative real-time polymerase chain reaction of mRNA, and western blot analysis of RUNX2 and β-catenin. Results: In total, 26,364 mRNAs were differentially expressed. Comparison of the results of dexamethasone versus control at 2 hours revealed that 7 mRNAs were upregulated and 25 mRNAs were downregulated. The application of dexamethasone reduced the expression of RUNX2 and β-catenin in human gingiva-derived mesenchymal stem cells. Conclusion: The effects of dexamethasone on stem cells were evaluated with mRNA sequencing, and validation of the expression was performed with quantitative real-time polymerase chain reaction and western blot analysis. The results of this study can provide new insights into the role of mRNA sequencing in maxillofacial areas.
Streaming fragment assignment for real-time analysis of sequencing experiments

PubMed Central

Roberts, Adam; Pachter, Lior

2013-01-01

We present eXpress, a software package for highly efficient probabilistic assignment of ambiguously mapping sequenced fragments. eXpress uses a streaming algorithm with linear run time and constant memory use. It can determine abundances of sequenced molecules in real time, and can be applied to ChIP-seq, metagenomics and other large-scale sequencing data. We demonstrate its use on RNA-seq data, showing greater efficiency than other quantification methods. PMID:23160280
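To illustrate what probabilistic assignment of ambiguously mapping fragments means, here is a minimal sketch. It is not the streaming method the abstract describes: a plain batch expectation-maximization loop is swapped in for clarity, target lengths and alignment likelihoods are ignored, and all names are hypothetical.

```python
def em_abundances(fragments, n_targets, n_iter=100):
    """Estimate relative target abundances from multi-mapping fragments.

    fragments: list of non-empty lists; each inner list holds the indices
    of the targets one fragment is compatible with (an assumed input format).
    Returns a list of abundances that sums to 1.
    """
    theta = [1.0 / n_targets] * n_targets  # start from uniform abundances
    for _ in range(n_iter):
        counts = [0.0] * n_targets
        # E-step: split each fragment among its compatible targets
        # in proportion to the current abundance estimates.
        for compatible in fragments:
            z = sum(theta[t] for t in compatible)
            for t in compatible:
                counts[t] += theta[t] / z
        # M-step: renormalize the fractional counts into abundances.
        total = sum(counts)
        theta = [c / total for c in counts]
    return theta


if __name__ == "__main__":
    # Invented data: 3 fragments unique to target 0, 1 unique to target 1,
    # and 4 fragments that map to both.
    frags = [[0]] * 3 + [[1]] + [[0, 1]] * 4
    print(em_abundances(frags, n_targets=2))
```

In this toy input the uniquely mapping fragments favor target 0 three to one, so the ambiguous fragments end up apportioned in roughly the same ratio. A streaming variant would update the estimates as each fragment arrives instead of iterating over the whole data set.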
An integrated PCR colony hybridization approach to screen cDNA libraries for full-length coding sequences.

PubMed

Pollier, Jacob; González-Guzmán, Miguel; Ardiles-Diaz, Wilson; Geelen, Danny; Goossens, Alain

2011-01-01

cDNA-Amplified Fragment Length Polymorphism (cDNA-AFLP) is a commonly used technique for genome-wide expression analysis that does not require prior sequence knowledge. Typically, quantitative expression data and sequence information are obtained for a large number of differentially expressed gene tags. However, most of the gene tags do not correspond to full-length (FL) coding sequences, which are a prerequisite for subsequent functional analysis. A medium-throughput screening strategy, based on integration of polymerase chain reaction (PCR) and colony hybridization, was developed that allows parallel screening of a cDNA library for FL clones corresponding to incomplete cDNAs. The method was applied to screen for the FL open reading frames of a selection of 163 cDNA-AFLP tags from three different medicinal plants, leading to the identification of 109 (67%) FL clones. Furthermore, the protocol allows for the use of multiple probes in a single hybridization event, thus significantly increasing the throughput when screening for rare transcripts. The presented strategy offers an efficient method for the conversion of incomplete expressed sequence tags (ESTs), such as cDNA-AFLP tags, to FL-coding sequences.
ArrayExpress update--trends in database growth and links to data analysis tools.

PubMed

Rustici, Gabriella; Kolesnikov, Nikolay; Brandizi, Marco; Burdett, Tony; Dylag, Miroslaw; Emam, Ibrahim; Farne, Anna; Hastings, Emma; Ison, Jon; Keays, Maria; Kurbatova, Natalja; Malone, James; Mani, Roby; Mupo, Annalisa; Pedro Pereira, Rui; Pilicheva, Ekaterina; Rung, Johan; Sharma, Anjan; Tang, Y Amy; Ternent, Tobias; Tikhonov, Andrew; Welter, Danielle; Williams, Eleanor; Brazma, Alvis; Parkinson, Helen; Sarkans, Ugis

2013-01-01

The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is one of three international functional genomics public data repositories, alongside the Gene Expression Omnibus at NCBI and the DDBJ Omics Archive, supporting peer-reviewed publications. It accepts data generated by sequencing or array-based technologies and currently contains data from almost a million assays, from over 30 000 experiments. The proportion of sequencing-based submissions has grown significantly over the last 2 years and has reached, in 2012, 15% of all new data. All data are available from ArrayExpress in MAGE-TAB format, which allows robust linking to data analysis and visualization tools, including Bioconductor and GenomeSpace. Additionally, R objects, for microarray data, and binary alignment format files, for sequencing data, have been generated for a significant proportion of ArrayExpress data.
De novo assembled expressed gene catalog of a fast-growing Eucalyptus tree produced by Illumina mRNA-Seq

PubMed Central

2010-01-01

Background De novo assembly of transcript sequences produced by short-read DNA sequencing technologies offers a rapid approach to obtain expressed gene catalogs for non-model organisms. A draft genome sequence will be produced in 2010 for a Eucalyptus tree species (E. grandis) representing the most important hardwood fibre crop in the world. Genome annotation of this valuable woody plant and genetic dissection of its superior growth and productivity will be greatly facilitated by the availability of a comprehensive collection of expressed gene sequences from multiple tissues and organs. Results We present an extensive expressed gene catalog for a commercially grown E. grandis × E. urophylla hybrid clone constructed using only Illumina mRNA-Seq technology and de novo assembly. A total of 18,894 transcript-derived contigs, a large proportion of which represent full-length protein coding genes, were assembled and annotated. Analysis of assembly quality, length and diversity shows that this dataset represents the most comprehensive expressed gene catalog for any Eucalyptus tree. mRNA-Seq analysis furthermore allowed digital expression profiling of all of the assembled transcripts across diverse xylogenic and non-xylogenic tissues, which is invaluable for ascribing putative gene functions. Conclusions De novo assembly of Illumina mRNA-Seq reads is an efficient approach for transcriptome sequencing and profiling in Eucalyptus and other non-model organisms. The transcriptome resource (Eucspresso, http://eucspresso.bi.up.ac.za/) generated by this study will be of value for genomic analysis of woody biomass production in Eucalyptus and for comparative genomic analysis of growth and development in woody and herbaceous plants. PMID:21122097
[Isolation and function of genes regulating aphB expression in Vibrio cholerae].

PubMed

Chen, Haili; Zhu, Zhaoqin; Zhong, Zengtao; Zhu, Jun; Kan, Biao

2012-02-04

We identified genes that regulate the expression of aphB, the gene encoding a key virulence regulator in Vibrio cholerae O1 El Tor C6706(-). We constructed a transposon library in V. cholerae strain C6706 containing P(aphB)-luxCDABE and P(aphB)-lacZ transcriptional reporter plasmids. Using a chemiluminescence imager system, we rapidly detected aphB promoter expression levels at a large scale. We then identified the transposon insertion sites by arbitrary PCR and sequencing analysis. From approximately 40,000 transposon insertion mutants, we obtained two candidate mutants, T1 and T2, that displayed reduced aphB expression. Sequencing analysis showed that the transposon (Tn) was inserted in the vc1585 reading frame in the T1 mutant and at the end of the coding sequence of vc1602 in the T2 mutant. Using a genetic screen, we identified two potential genes that may be involved in regulating the expression of the key virulence regulator AphB. This study informs further investigation aimed at fully understanding V. cholerae virulence gene regulatory cascades.

RNA-Seq for Bacterial Gene Expression.

PubMed

Poulsen, Line Dahl; Vinther, Jeppe

2018-06-01

RNA sequencing (RNA-seq) has become the preferred method for global quantification of bacterial gene expression. With the continued improvements in sequencing technology and data analysis tools, the most labor-intensive and expensive part of an RNA-seq experiment is the preparation of sequencing libraries, which is also essential for the quality of the data obtained. Here, we present a straightforward and inexpensive basic protocol for preparation of strand-specific RNA-seq libraries from bacterial RNA as well as a computational pipeline for the data analysis of sequencing reads. The protocol is based on the Illumina platform and allows easy multiplexing of samples and the removal of sequencing reads that are PCR duplicates. © 2018 by John Wiley & Sons, Inc.
Molecular phenotype of zebrafish ovarian follicle by serial analysis of gene expression and proteomic profiling, and comparison with the transcriptomes of other animals

PubMed Central

Knoll-Gellida, Anja; André, Michèle; Gattegno, Tamar; Forgue, Jean; Admon, Arie; Babin, Patrick J

2006-01-01

Background The ability of an oocyte to develop into a viable embryo depends on the accumulation of specific maternal information and molecules, such as RNAs and proteins. A serial analysis of gene expression (SAGE) was carried out in parallel with proteomic analysis on fully-grown ovarian follicles from zebrafish (Danio rerio). The data obtained were compared with ovary/follicle/egg molecular phenotypes of other animals, published or available in public sequence databases. Results Sequencing of 27,486 SAGE tags identified 11,399 different ones, including 3,329 tags occurring more than once. Fifty-eight genes were expressed at over 0.15% of the total population and represented 17.34% of the mRNA population identified. The three most expressed transcripts were a rhamnose-binding lectin, beta-actin 2, and a transcribed locus similar to the H2B histone family. Comparison with the large-scale expressed sequence tags sequencing approach revealed highly expressed transcripts that were not previously known to be expressed at high levels in fish ovaries, like the short-sized polarized metallothionein 2 transcript. A higher sensitivity for the detection of transcripts with a characterized maternal genetic contribution was also demonstrated compared to large-scale sequencing of cDNA libraries. Ferritin heavy polypeptide 1, heat shock protein 90-beta, lactate dehydrogenase B4, beta-actin isoforms, tubulin beta 2, ATP synthase subunit 9, together with 40 S ribosomal protein S27a, were common highly-expressed transcripts of vertebrate ovary/unfertilized egg. Comparison of transcriptome and proteome data revealed that transcript levels provide little predictive value with respect to the extent of protein abundance. All the proteins identified by proteomic analysis of fully-grown zebrafish follicles had at least one transcript counterpart, with two exceptions: eosinophil chemotactic cytokine and nothepsin.
Conclusion This study provides a complete sequence data set of maternal mRNA stored in zebrafish germ cells at the end of oogenesis. This catalogue contains highly-expressed transcripts that are part of a vertebrate ovarian expressed gene signature. Comparison of transcriptome and proteome data identified downregulated transcripts or proteins potentially incorporated in the oocyte by endocytosis. The molecular phenotype described provides groundwork for future experimental approaches aimed at identifying functionally important stored maternal transcripts and proteins involved in oogenesis and early stages of embryo development. PMID:16526958
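The summary statistics quoted in the Results above (total tags, distinct tags, tags seen more than once, and each tag's share of the total population) all follow from a simple tag tally. The sketch below shows that bookkeeping on toy data; the function name, the returned keys and the example tags are illustrative assumptions and are not taken from the study.

```python
from collections import Counter


def summarize_tags(tags):
    """Tally SAGE-style tags and report basic library statistics.

    tags: iterable of tag sequences (strings), one entry per sequenced tag.
    """
    counts = Counter(tags)
    total = sum(counts.values())
    return {
        "total_tags": total,
        "distinct_tags": len(counts),
        "tags_seen_more_than_once": sum(1 for c in counts.values() if c > 1),
        # Share of the whole tag population contributed by each distinct tag.
        "percent_of_total": {tag: 100.0 * c / total for tag, c in counts.items()},
    }


if __name__ == "__main__":
    toy_tags = ["AAAC", "AAAC", "GGTT", "CCAT", "GGTT", "AAAC"]  # made-up tags
    print(summarize_tags(toy_tags))
```

A threshold such as "expressed at over 0.15% of the total population" would then be a filter on `percent_of_total`.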
PanGEA: identification of allele specific gene expression using the 454 technology.

PubMed

Kofler, Robert; Teixeira Torres, Tatiana; Lelley, Tamas; Schlötterer, Christian

2009-05-14

Next generation sequencing technologies hold great potential for many biological questions. While mainly used for genomic sequencing, they are also very promising for gene expression profiling. Sequencing of cDNA does not only provide an estimate of the absolute expression level, it can also be used for the identification of allele specific gene expression. We developed PanGEA, a tool which enables a fast and user-friendly analysis of allele specific gene expression using the 454 technology. PanGEA allows mapping of 454-ESTs to genes or whole genomes, displaying gene expression profiles, identification of SNPs and the quantification of allele specific gene expression. The intuitive GUI of PanGEA facilitates a flexible and interactive analysis of the data. PanGEA additionally implements a modification of the Smith-Waterman algorithm which deals with incorrect estimates of homopolymer length as occurring in the 454 technology. To our knowledge, PanGEA is the first tool which facilitates the identification of allele specific gene expression. PanGEA is distributed under the Mozilla Public License and available at: http://www.kofler.or.at/bioinformatics/PanGEA
PanGEA: Identification of allele specific gene expression using the 454 technology

PubMed Central

Kofler, Robert; Teixeira Torres, Tatiana; Lelley, Tamas; Schlötterer, Christian

2009-01-01

Background Next generation sequencing technologies hold great potential for many biological questions. While mainly used for genomic sequencing, they are also very promising for gene expression profiling. Sequencing of cDNA does not only provide an estimate of the absolute expression level, it can also be used for the identification of allele specific gene expression. Results We developed PanGEA, a tool which enables a fast and user-friendly analysis of allele specific gene expression using the 454 technology. PanGEA allows mapping of 454-ESTs to genes or whole genomes, displaying gene expression profiles, identification of SNPs and the quantification of allele specific gene expression. The intuitive GUI of PanGEA facilitates a flexible and interactive analysis of the data. PanGEA additionally implements a modification of the Smith-Waterman algorithm which deals with incorrect estimates of homopolymer length as occurring in the 454 technology. Conclusion To our knowledge, PanGEA is the first tool which facilitates the identification of allele specific gene expression. PanGEA is distributed under the Mozilla Public License and available at: PMID:19442283
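Quantifying allele specific expression at a heterozygous SNP comes down to comparing how many cDNA reads carry each allele. The sketch below is a generic illustration of that comparison, not PanGEA's method, which the abstract does not describe: the function names are hypothetical, and the choice of an exact two-sided binomial test against an expected 50:50 ratio is an assumption.

```python
from math import comb


def allele_ratio(ref_reads, alt_reads):
    """Fraction of reads at a SNP that carry the reference allele."""
    return ref_reads / (ref_reads + alt_reads)


def binomial_two_sided_p(ref_reads, alt_reads, p=0.5):
    """Exact two-sided binomial p-value for allelic balance.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one under the null proportion p.
    """
    n = ref_reads + alt_reads
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Small relative tolerance so floating-point ties are counted.
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-9)))


if __name__ == "__main__":
    ref, alt = 30, 10  # made-up read counts at one SNP
    print(allele_ratio(ref, alt), binomial_two_sided_p(ref, alt))
```

With 454 data, read counts per SNP can be low, which is one reason an exact test is used here rather than a normal approximation.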
Gene expression analysis of flax seed development

PubMed Central

2011-01-01

Background Flax, Linum usitatissimum L., is an important crop whose seed oil and stem fiber have multiple industrial applications. Flax seeds are also well-known for their nutritional attributes, viz., omega-3 fatty acids in the oil and lignans and mucilage from the seed coat. In spite of the importance of this crop, there are few molecular resources that can be utilized toward improving seed traits. Here, we describe flax embryo and seed development and generation of comprehensive genomic resources for the flax seed. Results We describe a large-scale generation and analysis of expressed sequences in various tissues. Collectively, the 13 libraries we have used provide a broad representation of genes active in developing embryos (globular, heart, torpedo, cotyledon and mature stages), seed coats (globular and torpedo stages) and endosperm (pooled globular to torpedo stages) and genes expressed in flowers, etiolated seedlings, leaves, and stem tissue. A total of 261,272 expressed sequence tags (ESTs) (GenBank accessions LIBEST_026995 to LIBEST_027011) were generated. These EST libraries included transcription factor genes that are typically expressed at low levels, indicating that the depth is adequate for in silico expression analysis. Assembly of the ESTs resulted in 30,640 unigenes and 82% of these could be identified on the basis of homology to known and hypothetical genes from other plants. When compared with fully sequenced plant genomes, the flax unigenes resembled poplar and castor bean more than grape, sorghum, rice or Arabidopsis. Nearly one-fifth of these (5,152) had no homologs in sequences reported for any organism, suggesting that this category represents genes that are likely unique to flax. Digital analyses revealed gene expression dynamics for the biosynthesis of a number of important seed constituents during seed development.
Conclusions We have developed a foundational database of expressed sequences and collection of plasmid clones that comprise even low-expressed genes such as those encoding transcription factors. This has allowed us to delineate the spatio-temporal aspects of gene expression underlying the biosynthesis of a number of important seed constituents in flax. Flax belongs to a taxonomic group of diverse plants and the large sequence database will allow for evolutionary studies as well. PMID:21529361
Identification, characterization and expression analysis of lineage-specific genes within sweet orange (Citrus sinensis).

PubMed

Xu, Yuantao; Wu, Guizhi; Hao, Baohai; Chen, Lingling; Deng, Xiuxin; Xu, Qiang

2015-11-23

With the availability of a rapidly increasing number of genome and transcriptome sequences, lineage-specific genes (LSGs) can be identified and characterized. Like other conserved functional genes, LSGs play important roles in biological evolution and functions. Two sets of citrus LSGs, 296 citrus-specific genes (CSGs) and 1039 orphan genes specific to sweet orange, were identified by comparative analysis between the sweet orange genome sequences and 41 genomes and 273 transcriptomes. With the two sets of genes, gene structure and gene expression pattern were investigated. On average, both the CSGs and orphan genes have fewer exons, shorter gene length and higher GC content when compared with evolutionarily conserved genes (ECs). Expression profiling indicated that most of the LSGs were expressed in various tissues of sweet orange and some of them exhibited distinct temporal and spatial expression patterns. In particular, the orphan genes were preferentially expressed in callus, which is an important pluripotent tissue of citrus. In addition, some of the CSGs and orphan genes were expressed in response to abiotic stress, indicating their potential functions during interaction with the environment. This study identified and characterized two sets of LSGs in citrus, dissected their sequence features and expression patterns, and provided valuable clues for future functional analysis of the LSGs in sweet orange.
TEcandidates: Prediction of genomic origin of expressed Transposable Elements using RNA-seq data.

PubMed

Valdebenito-Maturana, Braulio; Riadi, Gonzalo

2018-06-01

In recent years, Transposable Elements (TEs) have been related to gene regulation. However, estimating the origin of expression of TEs through RNA-seq is complicated by multimapping reads coming from their repetitive sequences. Current approaches that address multimapping reads are focused on expression quantification and not on finding the origin of expression. Addressing the genomic origin of expressed TEs could further aid in understanding the role that TEs might have in the cell. We have developed a new pipeline called TEcandidates, based on de novo transcriptome assembly to assess the instances of TEs being expressed, along with their location, to include in downstream DE analysis. TEcandidates takes as input the RNA-seq data, the genome sequence and the TE annotation file, and returns a list of coordinates of candidate TEs being expressed, the TEs that have been removed, and the genome sequence with removed TEs as masked. This masked genome is suited to include TEs in downstream expression analysis, as the ambiguity of reads coming from TEs is significantly reduced in the mapping step of the analysis. The script which runs the pipeline can be downloaded at http://www.mobilomics.org/tecandidates/downloads or http://github.com/TEcandidates/TEcandidates. Contact: griadi@utalca.cl. Supplementary data are available at Bioinformatics online.
From reads to genes to pathways: differential expression analysis of RNA-Seq experiments using Rsubread and the edgeR quasi-likelihood pipeline.

PubMed

Chen, Yunshun; Lun, Aaron T L; Smyth, Gordon K

2016-01-01

In recent years, RNA sequencing (RNA-seq) has become a very widely used technology for profiling gene expression. One of the most common aims of RNA-seq profiling is to identify genes or molecular pathways that are differentially expressed (DE) between two or more biological conditions. This article demonstrates a computational workflow for the detection of DE genes and pathways from RNA-seq data by providing a complete analysis of an RNA-seq experiment profiling epithelial cell subsets in the mouse mammary gland. The workflow uses R software packages from the open-source Bioconductor project and covers all steps of the analysis pipeline, including alignment of read sequences, data exploration, differential expression analysis, visualization and pathway analysis. Read alignment and count quantification is conducted using the Rsubread package and the statistical analyses are performed using the edgeR package. The differential expression analysis uses the quasi-likelihood functionality of edgeR.
Identification of microRNAs differentially expressed involved in male flower development.

PubMed

Wang, Zhengjia; Huang, Jianqin; Sun, Zhichao; Zheng, Bingsong

2015-03-01

Hickory (Carya cathayensis Sarg.) is one of the most economically important woody trees in eastern China, but its long flowering phase delays yield. Our understanding of the regulatory roles of microRNAs (miRNAs) in male flower development in hickory remains poor. Using high-throughput sequencing technology, we have pyrosequenced two small RNA libraries from two male flower differentiation stages in hickory. Analysis of the sequencing data identified 114 conserved miRNAs that belonged to 23 miRNA families, five novel miRNAs including their corresponding miRNA*s, and 22 plausible miRNA candidates. Differential expression analysis revealed 12 miRNA sequences that were upregulated in the later (reproductive) stage of male flower development. Quantitative real-time PCR showed expression trends similar to those obtained by deep sequencing. Novel miRNAs and plausible miRNA candidates were predicted using bioinformatic analysis methods. The miRNAs newly identified in this study have increased the number of known miRNAs in hickory, and the identification of differentially expressed miRNAs will provide new avenues for studies into miRNAs involved in the process of male flower development in hickory and other related trees.
EXP-PAC: providing comparative analysis and storage of next generation gene expression data.

PubMed

Church, Philip C; Goscinski, Andrzej; Lefèvre, Christophe

2012-07-01

Microarrays and, more recently, RNA sequencing have led to an increase in available gene expression data. How to manage and store this data is becoming a key issue. In response we have developed EXP-PAC, a web based software package for storage, management and analysis of gene expression and sequence data. Unique to this package is SQL based querying of gene expression data sets, distributed normalization of raw gene expression data and analysis of gene expression data across experiments and species. This package has been populated with lactation data in the international milk genomic consortium web portal (http://milkgenomics.org/). Source code is also available which can be hosted on a Windows, Linux or Mac APACHE server connected to a private or public network (http://mamsap.it.deakin.edu.au/~pcc/Release/EXP_PAC.html). Copyright © 2012 Elsevier Inc. All rights reserved.
RNA sequencing: current and prospective uses in metabolic research.

PubMed

Vikman, Petter; Fadista, Joao; Oskolkov, Nikolay

2014-10-01

Previous global RNA analysis was restricted to known transcripts in species with a defined transcriptome. Next generation sequencing has transformed transcriptomics by making it possible to analyse expressed genes with an exon level resolution from any tissue in any species without any a priori knowledge of which genes are being expressed, their splice patterns or their nucleotide sequence. In addition, RNA sequencing is a more sensitive technique compared with microarrays with a larger dynamic range, and it also allows for investigation of imprinting and allele-specific expression. This can be done for a cost that is able to compete with that of a microarray, making RNA sequencing a technique available to most researchers. Therefore RNA sequencing has recently become the state of the art with regard to large-scale RNA investigations and has to a large extent replaced microarrays. The only drawback is the large data amounts produced, which together with the complexity of the data can make a researcher spend far more time on analysis than performing the actual experiment. © 2014 Society for Endocrinology.
Transcriptome-wide analysis of WRKY transcription factors in wheat and their leaf rust responsive expression profiling.

PubMed

Satapathy, Lopamudra; Singh, Dharmendra; Ranjan, Prashant; Kumar, Dhananjay; Kumar, Manish; Prabhu, Kumble Vinod; Mukhopadhyay, Kunal

2014-12-01

WRKY, a plant-specific transcription factor family, has important roles in pathogen defense, abiotic cues and phytohormone signaling, yet little is known about their roles and molecular mechanism of function in response to rust diseases in wheat. We identified 100 TaWRKY sequences using wheat Expressed Sequence Tag database of which 22 WRKY sequences were novel. Identified proteins were characterized based on their zinc finger motifs and phylogenetic analysis clustered them into six clades consisting of class IIc and class III WRKY proteins. Functional annotation revealed major functions in metabolic and cellular processes in control plants, and in response to stimuli, signaling and defense in pathogen inoculated plants, their major molecular function being binding to DNA. Tag-based expression analysis of the identified genes revealed differential expression between mock and Puccinia triticina inoculated wheat near isogenic lines. Gene expression analysis was also performed with six rust-related microarray experiments at Gene Expression Omnibus database. TaWRKY10, 15, 17 and 56 were common in both tag-based and microarray-based differential expression analysis and could represent rust specific WRKY genes. The obtained results provide insight into the functional characterization of WRKY transcription factors responsive to leaf rust pathogenesis that can be used as candidate genes in molecular breeding programs to improve biotic stress tolerance in wheat.
A Generalized Least-Squares Estimate for the Origin of Sporophytic Self-Incompatibility

PubMed Central

Uyenoyama, M. K.

1995-01-01

Analysis of nucleotide sequences that regulate the expression of self-incompatibility in flowering plants affords a direct means of examining classical hypotheses for the origin and evolution of this major feature of mating systems. Departing from the classical view of monophyly of all forms of self-incompatibility, the current paradigm for the origin of self-incompatibility postulates multiple episodes of recruitment and modification of preexisting genes. In Brassica, the S locus, which regulates sporophytic self-incompatibility, shows homology to a multigene family present both in self-compatible congeners and in groups for which this form of self-incompatibility is atypical. A phylogenetic analysis of S-allele sequences together with homologous sequences that do not cosegregate with self-incompatibility permits dating the change of function that marked the origin of self-incompatibility. A generalized least-squares method is introduced that provides closed-form expressions for estimates and standard errors for function-specific divergence rates and times of divergence among sequences. This analysis suggests that the age of the sporophytic self-incompatibility system expressed in Brassica exceeds species divergence within the genus by four- to fivefold. The extraordinarily high levels of sequence diversity exhibited by S alleles appear to reflect their ancient derivation, with the alternative hypothesis of hypermutability rejected by the analysis. PMID:7713446
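For readers unfamiliar with the term, the textbook form of a generalized least-squares estimator is shown below as a reminder of what "closed-form expressions for estimates and standard errors" typically means. The symbols are generic and are not taken from the paper: y is assumed to be a vector of observations (here it might hold pairwise sequence divergences), X a design matrix, β the unknown parameters (such as rates or divergence times), and V the covariance matrix of the errors, which is what distinguishes the generalized from the ordinary least-squares case.

```latex
\begin{align}
  y &= X\beta + \varepsilon, \qquad
  \operatorname{E}[\varepsilon] = 0, \qquad
  \operatorname{Cov}(\varepsilon) = V \\
  \hat{\beta}_{\mathrm{GLS}} &= \left(X^{\top} V^{-1} X\right)^{-1} X^{\top} V^{-1} y \\
  \operatorname{Cov}\!\left(\hat{\beta}_{\mathrm{GLS}}\right) &= \left(X^{\top} V^{-1} X\right)^{-1}
\end{align}
```

Standard errors are the square roots of the diagonal entries of the last matrix. How the paper constructs X and V for non-independent pairwise comparisons among sequences is not stated in the abstract.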
Parallel gene analysis with allele-specific padlock probes and tag microarrays

PubMed Central

Banér, Johan; Isaksson, Anders; Waldenström, Erik; Jarvius, Jonas; Landegren, Ulf; Nilsson, Mats

2003-01-01

Parallel, highly specific analysis methods are required to take advantage of the extensive information about DNA sequence variation and of expressed sequences. We present a scalable laboratory technique suitable to analyze numerous target sequences in multiplexed assays. Sets of padlock probes were applied to analyze single nucleotide variation directly in total genomic DNA or cDNA for parallel genotyping or gene expression analysis. All reacted probes were then co-amplified and identified by hybridization to a standard tag oligonucleotide array. The technique was illustrated by analyzing normal and pathogenic variation within the Wilson disease-related ATP7B gene, both at the level of DNA and RNA, using allele-specific padlock probes. PMID:12930977
Genome-wide sequencing and quantification of circulating microRNAs for dogs with congestive heart failure secondary to myxomatous mitral valve degeneration.

PubMed

Jung, SeungWoo; Bohan, Amy

2018-02-01

OBJECTIVE To characterize expression profiles of circulating microRNAs via genome-wide sequencing for dogs with congestive heart failure (CHF) secondary to myxomatous mitral valve degeneration (MMVD). ANIMALS 9 healthy client-owned dogs and 8 age-matched client-owned dogs with CHF secondary to MMVD. PROCEDURES Blood samples were collected before administering cardiac medications for the management of CHF. Isolated microRNAs from plasma were classified into microRNA libraries and subjected to next-generation sequencing (NGS) for genome-wide sequencing analysis and quantification of circulating microRNAs. Quantitative reverse transcription PCR (qRT-PCR) assays were used to validate expression profiles of differentially expressed circulating microRNAs identified from NGS analysis of dogs with CHF. RESULTS 326 microRNAs were identified with NGS analysis. Hierarchical analysis revealed distinct expression patterns of circulating microRNAs between healthy dogs and dogs with CHF. Results of qRT-PCR assays confirmed upregulation of 4 microRNAs (miR-133, miR-1, miR-let-7e, and miR-125) and downregulation of 4 selected microRNAs (miR-30c, miR-128, miR-142, and miR-423). Results of qRT-PCR assays were highly correlated with NGS data and supported the specificity of circulating microRNA expression profiles in dogs with CHF secondary to MMVD. CONCLUSIONS AND CLINICAL RELEVANCE These results suggested that circulating microRNA expression patterns were unique and could serve as molecular biomarkers of CHF in dogs with MMVD.
Cell cycle, differentiation and tissue-independent expression of ribosomal protein L37.

PubMed

Su, S; Bird, R C

1995-09-15

A unique human cDNA (hG1.16) that encodes an mRNA of 450 nucleotides was isolated from a subtractive library derived from HeLa cells. The relative expression level of hG1.16 during different cell-cycle phases was determined by Northern-blot analysis of cells synchronized by double-thymidine block and serum deprivation/refeeding. hG1.16 was constitutively expressed during all phases of the cell cycle, including the quiescent phase when even most constitutively expressed genes experience some suppression of expression. The expression level of hG1.16 did not change during terminal differentiation of myoblasts to myotubes, during which cells become permanently post-mitotic. Examination of other tissues revealed that the relative expression level of hG1.16 was constitutive in all embryonic mouse tissues examined, including brain, eye, heart, kidney, liver, lung and skeletal muscle. This was unusual in that expression was not down-modulated during differentiation and did not vary appreciably between tissue types. Inter-species Northern-blot analysis revealed that hG1.16 was highly conserved among all vertebrates studied (from fish to humans but not in insects). DNA sequence analysis of hG1.16 revealed a high level of similarity to rat ribosomal protein L37, identifying hG1.16 as a new member of this multigene family. The deduced amino acid sequence of hG1.16 was identical to rat ribosomal protein L37 that contained 97 amino acids, many of which are highly positively charged (15 arginine and 14 lysine residues with a predicted M(r) of 11,065). hG1.16 protein has a single C2-C2 zinc-finger-like motif which is also present in rat ribosomal protein L37. Using primers designed from the sequence of hG1.16, unique bovine and rat cDNAs were also isolated by 5'-rapid-amplification of cDNA ends.
DNA sequences of bovine and rat G1.16 clones were 92.8% and 92.2% similar to human G1.16 while the deduced amino acid sequences derived from bovine and rat cDNAs each differed by a single amino acid from the sequence of hG1.16 and the published rat L37 sequence. Southern-blot analysis revealed that hG1.16 exists in multiple copies in human, rat and mouse genomes. These G1.16 clones encode unique human, rat and bovine members of the ribosomal protein L37 gene family, which are constitutively expressed even during transitions from quiescence to active cell proliferation or terminal differentiation, in all tissues and all vertebrates investigated.
An Ambystoma mexicanum EST sequencing project: analysis of 17,352 expressed sequence tags from embryonic and regenerating blastema cDNA libraries

PubMed Central

Habermann, Bianca; Bebin, Anne-Gaelle; Herklotz, Stephan; Volkmer, Michael; Eckelt, Kay; Pehlke, Kerstin; Epperlein, Hans Henning; Schackert, Hans Konrad; Wiebe, Glenis; Tanaka, Elly M

2004-01-01

Background The ambystomatid salamander, Ambystoma mexicanum (axolotl), is an important model organism in evolutionary and regeneration research but relatively little sequence information has so far been available. This is a major limitation for molecular studies on caudate development, regeneration and evolution. To address this lack of sequence information we have generated an expressed sequence tag (EST) database for A. mexicanum. Results Two cDNA libraries, one made from stage 18-22 embryos and the other from day-6 regenerating tail blastemas, generated 17,352 sequences. From the sequenced ESTs, 6,377 contigs were assembled that probably represent 25% of the expressed genes in this organism. Sequence comparison revealed significant homology to entries in the NCBI non-redundant database. Further examination of this gene set revealed the presence of genes involved in important cell and developmental processes, including cell proliferation, cell differentiation and cell-cell communication. On the basis of these data, we have performed phylogenetic analysis of key cell-cycle regulators. Interestingly, while cell-cycle proteins such as the cyclin B family display expected evolutionary relationships, the cyclin-dependent kinase inhibitor 1 gene family shows an unusual evolutionary behavior among the amphibians. Conclusions Our analysis reveals the importance of a comprehensive sequence set from a representative of the Caudata and illustrates that the EST sequence database is a rich source of molecular, developmental and regeneration studies. To aid in data mining, the ESTs have been organized into an easily searchable database that is freely available online. PMID:15345051
Study of cnidarian-algal symbiosis in the "omics" age.

PubMed

Meyer, Eli; Weis, Virginia M

2012-08-01

The symbiotic associations between cnidarians and dinoflagellate algae (Symbiodinium) support productive and diverse ecosystems in coral reefs. Many aspects of this association, including the mechanistic basis of host-symbiont recognition and metabolic interaction, remain poorly understood. The first completed genome sequence for a symbiotic anthozoan is now available (the coral Acropora digitifera), and extensive expressed sequence tag resources are available for a variety of other symbiotic corals and anemones. These resources make it possible to profile gene expression, protein abundance, and protein localization associated with the symbiotic state. Here we review the history of "omics" studies of cnidarian-algal symbiosis and the current availability of sequence resources for corals and anemones, identifying genes putatively involved in symbiosis across 10 anthozoan species. The public availability of candidate symbiosis-associated genes leaves the field of cnidarian-algal symbiosis poised for in-depth comparative studies of sequence diversity and gene expression and for targeted functional studies of genes associated with symbiosis. Reviewing the progress to date suggests directions for future investigations of cnidarian-algal symbiosis that include (i) sequencing of Symbiodinium, (ii) proteomic analysis of the symbiosome membrane complex, (iii) glycomic analysis of Symbiodinium cell surfaces, and (iv) expression profiling of the gastrodermal cells hosting Symbiodinium.
Expressed sequence tags from heat-shocked seagrass Zostera noltii (Hornemann) from its southern distribution range.

PubMed

Massa, Sónia I; Pearson, Gareth A; Aires, Tânia; Kube, Michael; Olsen, Jeanine L; Reinhardt, Richard; Serrão, Ester A; Arnaud-Haond, Sophie

2011-09-01

Predicted global climate change threatens the distributional ranges of species worldwide. We identified genes expressed in the intertidal seagrass Zostera noltii during recovery from a simulated low tide heat-shock exposure. Five Expressed Sequence Tag (EST) libraries were compared, corresponding to four recovery times following sub-lethal temperature stress, and a non-stressed control. We sequenced and analyzed 7009 sequence reads from 30min, 2h, 4h and 24h after the beginning of the heat-shock (AHS), and 1585 from the control library, for a total of 8594 sequence reads. Among 51 Tentative UniGenes (TUGs) exhibiting significantly different expression between libraries, 19 (37.3%) were identified as 'molecular chaperones' and were over-expressed following heat-shock, while 12 (23.5%) were 'photosynthesis TUGs' generally under-expressed in heat-shocked plants. A time course analysis of expression showed a rapid increase in expression of the molecular chaperone class, most of which were heat-shock proteins; which increased from 2 sequence reads in the control library to almost 230 in the 30min AHS library, followed by a slow decrease during further recovery. In contrast, 'photosynthesis TUGs' were under-expressed 30min AHS compared with the control library, and declined progressively with recovery time in the stress libraries, with a total of 29 sequence reads 24h AHS, compared with 125 in the control. A total of 4734 TUGs were screened for EST-Single Sequence Repeats (EST-SSRs) and 86 microsatellites were identified. Copyright © 2011 Elsevier B.V. All rights reserved.
[Cloning and characterization of genes differentially expressed in human dental pulp cells and gingival fibroblasts].

PubMed

Wang, Zhong-dong; Wu, Ji-nan; Zhou, Lin; Ling, Jun-qi; Guo, Xi-min; Xiao, Ming-zhen; Zhu, Feng; Pu, Qin; Chai, Yu-bo; Zhao, Zhong-liang

2007-02-01

The aim was to study the biological properties of human dental pulp cells (HDPC) by cloning and analyzing genes differentially expressed in HDPC in comparison with human gingival fibroblasts (HGF). HDPC and HGF were cultured and identified by immunocytochemistry. An HDPC and HGF subtractive cDNA library was established by PCR-based modified subtractive hybridization; genes differentially expressed by HDPC were cloned, sequenced and compared by BLAST to find homologous sequences in GenBank. Cloning and sequencing analysis yielded 12 differentially expressed genes, of which two were unknown. Among the 10 known genes, 4 were related to signal transduction, 2 were related to trans-membrane transportation (both cell membrane and nuclear membrane), and 2 were related to RNA splicing mechanisms. The biological properties of HDPC are determined by the differential expression of some genes, and the growth and differentiation of HDPC are associated with the dynamic protein synthesis and secretion activities of the cell.

Inferring the expression variability of human transposable element-derived exons by linear model analysis of deep RNA sequencing data.

PubMed

Zhang, Wensheng; Edwards, Andrea; Fan, Wei; Fang, Zhide; Deininger, Prescott; Zhang, Kun

2013-08-28

The exonization of transposable elements (TEs) has proven to be a significant mechanism for the creation of novel exons. Existing knowledge of the retention patterns of TE exons in mRNAs was mainly established by the analysis of Expressed Sequence Tag (EST) data and microarray data. This study seeks to validate and extend previous studies on the expression of TE exons by an integrative statistical analysis of high throughput RNA sequencing data. We collected 26 RNA-seq datasets spanning multiple tissues and cancer types. The exon-level digital expressions (indicating retention rates in mRNAs) were quantified by a double normalized measure, called the rescaled RPKM (Reads Per Kilobase of exon model per Million mapped reads). We analyzed the distribution profiles and the variability (across samples and between tissue/disease groups) of TE exon expressions, and compared them with those of other constitutive or cassette exons. We inferred the effects of four genomic factors, including the location, length, cognate TE family and TE nucleotide proportion (RTE, see Methods section) of a TE exon, on the exons' expression level and expression variability. We also investigated the biological implications of an assembly of highly-expressed TE exons. Our analysis confirmed prior studies from the following four aspects. First, with relatively high expression variability, most TE exons in mRNAs, especially those without exact counterparts in the UCSC RefSeq (Reference Sequence) gene tables, demonstrate low but still detectable expression levels in most tissue samples. Second, the TE exons in coding DNA sequences (CDSs) are less highly expressed than those in 3' (5') untranslated regions (UTRs). Third, the exons derived from chronologically ancient repeat elements, such as MIRs, tend to be highly expressed in comparison with those derived from younger TEs. Fourth, the previously observed negative relationship between the lengths of exons and the inclusion levels in transcripts is also true for exonized TEs. Furthermore, our study resulted in several novel findings. They include: (1) for the TE exons with non-zero expression and as shown in most of the studied biological samples, a high TE nucleotide proportion leads to their lower retention rates in mRNAs; (2) the considered genomic features (i.e. a continuous variable such as the exon length or a category indicator such as 3'UTR) influence the expression level and the expression variability (CV) of TE exons in an inverse manner; (3) not only the exons derived from Alu elements but also the exons from the TEs of other families were preferentially established in zinc finger (ZNF) genes.
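The abstract above quantifies exon expression with a rescaled form of RPKM. The rescaling step is not described in this record, so the following is only a minimal sketch of the standard, unrescaled RPKM that the acronym spells out. The function name and the example counts are hypothetical.

```python
def rpkm(read_count: int, exon_length_bp: int, total_mapped_reads: int) -> float:
    """Standard RPKM: Reads Per Kilobase of exon model per Million mapped reads.

    This is the plain definition only. The "rescaled RPKM" used in the
    study adds a second normalization that is not specified in the abstract.
    """
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and total mapped reads must be positive")
    # reads / (length in kb) / (mapped reads in millions)
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


# Hypothetical exon: 500 reads on a 1,000 bp exon, 10 million mapped reads.
example_value = rpkm(500, 1_000, 10_000_000)
print(example_value)
```

Dividing by both exon length and library size is what lets values be compared across exons of different lengths and across samples of different sequencing depth.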
Analysis of Epstein-Barr Virus Genomes and Expression Profiles in Gastric Adenocarcinoma.

PubMed

Borozan, Ivan; Zapatka, Marc; Frappier, Lori; Ferretti, Vincent

2018-01-15

Epstein-Barr virus (EBV) is a causative agent of a variety of lymphomas, nasopharyngeal carcinoma (NPC), and ∼9% of gastric carcinomas (GCs). An important question is whether particular EBV variants are more oncogenic than others, but conclusions are currently hampered by the lack of sequenced EBV genomes. Here, we contribute to this question by mining whole-genome sequences of 201 GCs to identify 13 EBV-positive GCs and by assembling 13 new EBV genome sequences, almost doubling the number of available GC-derived EBV genome sequences and providing the first non-Asian EBV genome sequences from GC. Whole-genome sequence comparisons of all EBV isolates sequenced to date (85 from tumors and 57 from healthy individuals) showed that most GC and NPC EBV isolates were closely related although American Caucasian GC samples were more distant, suggesting a geographical component. However, EBV GC isolates were found to contain some consistent changes in protein sequences regardless of geographical origin. In addition, transcriptome data available for eight of the EBV-positive GCs were analyzed to determine which EBV genes are expressed in GC. In addition to the expected latency proteins (EBNA1, LMP1, and LMP2A), specific subsets of lytic genes were consistently expressed that did not reflect a typical lytic or abortive lytic infection, suggesting a novel mechanism of EBV gene regulation in the context of GC. These results are consistent with a model in which a combination of specific latent and lytic EBV proteins promotes tumorigenesis. IMPORTANCE Epstein-Barr virus (EBV) is a widespread virus that causes cancer, including gastric carcinoma (GC), in a small subset of individuals. An important question is whether particular EBV variants are more cancer associated than others, but more EBV sequences are required to address this question. Here, we have generated 13 new EBV genome sequences from GC, almost doubling the number of EBV sequences from GC isolates and providing the first EBV sequences from non-Asian GC. We further identify sequence changes in some EBV proteins common to GC isolates. In addition, gene expression analysis of eight of the EBV-positive GCs showed consistent expression of both the expected latency proteins and a subset of lytic proteins that was not consistent with typical lytic or abortive lytic expression. These results suggest that novel mechanisms activate expression of some EBV lytic proteins and that their expression may contribute to oncogenesis. Copyright © 2018 American Society for Microbiology.
Expressed sequence tags from Atta laevigata and identification of candidate genes for the control of pest leaf-cutting ants.

PubMed

Rodovalho, Cynara M; Ferro, Milene; Fonseca, Fernando Pp; Antonio, Erik A; Guilherme, Ivan R; Henrique-Silva, Flávio; Bacci, Maurício

2011-06-17

Leafcutters are the most highly evolved of the Neotropical ants in the tribe Attini and are model systems for studying caste formation, labor division and symbiosis with microorganisms. Some species of leafcutters are agricultural pests controlled by chemicals which affect other animals and accumulate in the environment. Aiming to provide a genetic basis for the study of leafcutters and for the development of more specific and environmentally friendly methods for the control of pest leafcutters, we generated expressed sequence tag data from Atta laevigata, one of the pest ants with broad geographic distribution in South America. The analysis of the expressed sequence tags allowed us to characterize 2,006 unique sequences in Atta laevigata. Sixteen of these genes had a high number of transcripts and are likely positively selected for high level of gene expression, being responsible for three basic biological functions: energy conservation through redox reactions in mitochondria; cytoskeleton and muscle structuring; regulation of gene expression and metabolism. Based on the leafcutters' lifestyle and reports of genes involved in key processes of other social insects, we identified 146 sequences that are potential targets for controlling pest leafcutters. The targets are responsible for antixenobiosis, development and longevity, immunity, resistance to pathogens, pheromone function, cell signaling, behavior, polysaccharide metabolism and arginine kinase activity. The generation and analysis of expressed sequence tags from Atta laevigata have provided an important genetic basis for future studies on the biology of leaf-cutting ants and may contribute to the development of a more specific and environmentally friendly method for the control of agricultural pest leafcutters.
Expressed sequence tags from Atta laevigata and identification of candidate genes for the control of pest leaf-cutting ants

PubMed Central

2011-01-01

Background Leafcutters are the most highly evolved of the Neotropical ants in the tribe Attini and are model systems for studying caste formation, labor division and symbiosis with microorganisms. Some species of leafcutters are agricultural pests controlled by chemicals which affect other animals and accumulate in the environment. Aiming to provide a genetic basis for the study of leafcutters and for the development of more specific and environmentally friendly methods for the control of pest leafcutters, we generated expressed sequence tag data from Atta laevigata, one of the pest ants with broad geographic distribution in South America. Results The analysis of the expressed sequence tags allowed us to characterize 2,006 unique sequences in Atta laevigata. Sixteen of these genes had a high number of transcripts and are likely positively selected for high level of gene expression, being responsible for three basic biological functions: energy conservation through redox reactions in mitochondria; cytoskeleton and muscle structuring; regulation of gene expression and metabolism. Based on the leafcutters' lifestyle and reports of genes involved in key processes of other social insects, we identified 146 sequences that are potential targets for controlling pest leafcutters. The targets are responsible for antixenobiosis, development and longevity, immunity, resistance to pathogens, pheromone function, cell signaling, behavior, polysaccharide metabolism and arginine kinase activity. Conclusion The generation and analysis of expressed sequence tags from Atta laevigata have provided an important genetic basis for future studies on the biology of leaf-cutting ants and may contribute to the development of a more specific and environmentally friendly method for the control of agricultural pest leafcutters. PMID:21682882
ISRNA: an integrative online toolkit for short reads from high-throughput sequencing data.

PubMed

Luo, Guan-Zheng; Yang, Wei; Ma, Ying-Ke; Wang, Xiu-Jie

2014-02-01

Integrative Short Reads NAvigator (ISRNA) is an online toolkit for analyzing high-throughput small RNA sequencing data. Besides the high-speed genome mapping function, ISRNA provides statistics for genomic location, length distribution and nucleotide composition bias analysis of sequence reads. Number of reads mapped to known microRNAs and other classes of short non-coding RNAs, coverage of short reads on genes, expression abundance of sequence reads as well as some other analysis functions are also supported. The versatile search functions enable users to select sequence reads according to their sub-sequences, expression abundance, genomic location, relationship to genes, etc. A specialized genome browser is integrated to visualize the genomic distribution of short reads. ISRNA also supports management and comparison among multiple datasets. ISRNA is implemented in Java/C++/Perl/MySQL and can be freely accessed at http://omicslab.genetics.ac.cn/ISRNA/.
Analysis of expressed sequence tags from a single wheat cultivar facilitates interpretation of tandem mass spectrometry data and discrimination of gamma gliadin proteins that may play different functional roles in flour

USDA-ARS's Scientific Manuscript database

The complement of gamma gliadin genes expressed in the wheat cultivar Butte 86 was evaluated by analyzing publicly available expressed sequence tag (EST) data. Eleven contigs were assembled from 153 Butte 86 ESTs. Nine of the contigs encoded full-length proteins and four of the proteins contained an...
Bone morphogenetic protein-binding endothelial regulator of liver sinusoidal endothelial cells induces iron overload in a fatty liver mouse model.

PubMed

Hasebe, Takumu; Tanaka, Hiroki; Sawada, Koji; Nakajima, Shunsuke; Ohtake, Takaaki; Fujiya, Mikihiro; Kohgo, Yutaka

2017-03-01

Non-alcoholic fatty liver disease (NAFLD) is frequently accompanied by iron overload. However, because of the complex hepcidin-regulating molecules, the molecular mechanism underlying iron overload remains unknown. To identify the key molecule involved in NAFLD-associated iron dysregulation, we performed whole-RNA sequencing on the livers of obese mice. Male C57BL/6 mice were fed a regular or high-fat diet for 16 or 48 weeks. Internal iron was evaluated by plasma iron, ferritin or hepatic iron content. Whole-RNA sequencing was performed by transcriptome analysis using semiconductor high-throughput sequencer. Mouse liver tissues or isolated hepatocytes and sinusoidal endothelial cells were used to assess the expression of iron-regulating molecules. Mice fed a high-fat diet for 16 weeks showed excess iron accumulation. Longer exposure to a high-fat diet increased hepatic fibrosis and intrahepatic iron accumulation. A pathway analysis of the sequencing data showed that several inflammatory pathways, including bone morphogenetic protein (BMP)-SMAD signaling, were significantly affected. Sequencing analysis showed 2314 altered genes, including decreased mRNA expression of the hepcidin-coding gene Hamp. Hepcidin protein expression and SMAD phosphorylation, which induces Hamp, were found to be reduced. The expression of BMP-binding endothelial regulator (BMPER), which inhibits BMP-SMAD signaling by binding BMP extracellularly, was up-regulated in fatty livers. In addition, immunohistochemical and cell isolation analyses showed that BMPER was primarily expressed in the liver sinusoidal endothelial cells (LSECs) rather than hepatocytes. BMPER secretion by LSECs inhibits BMP-SMAD signaling in hepatocytes and further reduces hepcidin protein expression. These intrahepatic molecular interactions suggest a novel molecular basis of iron overload in NAFLD.
Characterization of Cer-1 cis-regulatory region during early Xenopus development.

PubMed

Silva, Ana Cristina; Filipe, Mário; Steinbeisser, Herbert; Belo, José António

2011-05-01

Cerberus-related molecules are well-known Wnt, Nodal, and BMP inhibitors that have been implicated in different processes including anterior–posterior patterning and left–right asymmetry. In both mouse and frog, two Cerberus-related genes have been isolated, mCer-1 and mCer-2, and Xcer and Xcoco, respectively. Until now, little has been known about the mechanisms involved in their transcriptional regulation. Here, we report a heterologous analysis of the mouse Cerberus-1 gene upstream regulatory regions, responsible for its expression in the visceral endodermal cells. Our analysis showed that the consensus sequences for TATA, CAAT, or GC boxes were absent but a TGTGG sequence was present at position -172 to -168 bp, relative to the ATG. Using a series of deletion constructs and transient expression in Xenopus embryos, we found that a fragment of 1.4 kb of Cer-1 promoter sequence could reproduce the endogenous expression pattern of Xenopus cerberus. A 0.7-kb mcer-1 upstream region was able to drive reporter expression to the involuting mesendodermal cells, while further deletions abolished reporter gene expression. Our results suggest that although no sequence similarity was found between mouse and Xenopus cerberus cis-regulatory regions, the signaling cascades regulating cerberus expression during gastrulation are conserved.
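The abstract above reports a TGTGG motif at -172 to -168 bp relative to the ATG. As an illustration of how such coordinates are derived, here is a minimal sketch that locates a motif upstream of a start codon. The sequence is a toy placeholder (not the mouse Cer-1 promoter), the function name is hypothetical, and the convention that the base immediately upstream of the A of ATG is -1 is an assumption.

```python
def upstream_motif_positions(seq: str, atg_index: int, motif: str) -> list[tuple[int, int]]:
    """Return (start, end) positions of motif hits lying entirely upstream of the ATG.

    Positions are relative to the start codon: the base just before the
    A of ATG is -1 (assumed convention; there is no position 0).
    """
    hits = []
    start = seq.find(motif)
    # Keep only hits that end before the start codon begins.
    while start != -1 and start + len(motif) <= atg_index:
        hits.append((start - atg_index, start + len(motif) - 1 - atg_index))
        start = seq.find(motif, start + 1)
    return hits


# Toy 200 bp upstream region with the motif placed 172 bases before the ATG.
toy_upstream = "C" * 28 + "TGTGG" + "C" * 167
toy_sequence = toy_upstream + "ATGGCC"
print(upstream_motif_positions(toy_sequence, len(toy_upstream), "TGTGG"))
```

With real data, the same function would be called with the genomic upstream sequence and the index of the annotated translation start.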
Sequences required for induction of neurotensin receptor gene expression during neuronal differentiation of N1E-115 neuroblastoma cells.

PubMed

Tavares, D; Tully, K; Dobner, P R

1999-10-15

The promoter region of the mouse high affinity neurotensin receptor (Ntr-1) gene was characterized, and sequences required for expression in neuroblastoma cell lines that express high affinity NT-binding sites were characterized. Me(2)SO-induced neuronal differentiation of N1E-115 neuroblastoma cells increased both the expression of the endogenous Ntr-1 gene and reporter genes driven by NTR-1 promoter sequences by 3-4-fold. Deletion analysis revealed that an 83-base pair promoter region containing the transcriptional start site is required for Me(2)SO activation. Detailed mutational analysis of this region revealed that a CACCC box and the central region of a large GC-rich palindrome are the crucial cis-regulatory elements required for Me(2)SO induction. The CACCC box is bound by at least one factor that is induced upon Me(2)SO treatment of N1E-115 cells. The Me(2)SO effect was found to be both selective and cell type-restricted. Basal expression in the neuroblastoma cell lines required a distinct set of sequences, including an Sp1-like sequence, and a sequence resembling an NGFI-A-binding site; however, a more distal 5' sequence was found to repress basal activity in N1E-115 cells. These results provide evidence that Ntr-1 gene regulation involves both positive and negative regulatory elements located in the 5'-flanking region and that Ntr-1 gene activation involves the coordinate activation or induction of several factors, including a CACCC box binding complex.
IDP-ASE: haplotyping and quantifying allele-specific expression at the gene and gene isoform level by hybrid sequencing

PubMed Central

Deonovic, Benjamin; Wang, Yunhao; Weirather, Jason; Wang, Xiu-Jie; Au, Kin Fai

2017-01-01

Abstract Allele-specific expression (ASE) is a fundamental problem in studying gene regulation and diploid transcriptome profiles, with two key challenges: (i) haplotyping and (ii) estimation of ASE at the gene isoform level. Existing ASE analysis methods are limited by a dependence on haplotyping from laborious experiments or extra genome/family trio data. In addition, there is a lack of methods for gene isoform level ASE analysis. We developed a tool, IDP-ASE, for full ASE analysis. By innovative integration of Third Generation Sequencing (TGS) long reads with Second Generation Sequencing (SGS) short reads, the accuracy of haplotyping and ASE quantification at the gene and gene isoform level was greatly improved, as demonstrated by the gold standard GM12878 data and semi-simulation data. In addition to methodology development, applications of IDP-ASE to human embryonic stem cells and breast cancer cells indicate that the imbalance of ASE and non-uniformity of gene isoform ASE is widespread, including tumorigenesis relevant genes and pluripotency markers. These results show that gene isoform expression and allele-specific expression cooperate to provide high diversity and complexity of gene regulation and expression, highlighting the importance of studying ASE at the gene isoform level. Our study provides a robust bioinformatics solution to understand ASE using RNA sequencing data only. PMID:27899656
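To make the notion of allelic imbalance concrete, the sketch below applies a simple exact two-sided binomial test to read counts at one heterozygous site. This is a generic textbook check under the assumption of equal expression of both alleles, and is not the IDP-ASE model, which the abstract does not detail. The function name and the read counts are hypothetical.

```python
from math import comb


def allelic_imbalance_pvalue(ref_reads: int, alt_reads: int) -> float:
    """Exact two-sided binomial p-value for a 50:50 allelic ratio.

    Generic illustration only; not the statistical model used by IDP-ASE.
    """
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("at least one read is required")
    k = min(ref_reads, alt_reads)
    # Probability of a count as small as the minor allele's under p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # The distribution is symmetric at p = 0.5, so double the tail (capped at 1).
    return min(1.0, 2 * tail)


# Hypothetical site: 10 reads support the reference allele, none the alternate.
print(allelic_imbalance_pvalue(10, 0))
```

A test of this kind presupposes that reads have already been assigned to haplotypes, which is the haplotyping challenge the abstract highlights.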
Concurrent Validity and Classification Accuracy of Curriculum-Based Measurement for Written Expression

ERIC Educational Resources Information Center

Furey, William M.; Marcotte, Amanda M.; Hintze, John M.; Shackett, Caroline M.

2016-01-01

The study presents a critical analysis of written expression curriculum-based measurement (WE-CBM) metrics derived from 3- and 10-min test lengths. Criterion validity and classification accuracy were examined for Total Words Written (TWW), Correct Writing Sequences (CWS), Percent Correct Writing Sequences (%CWS), and Correct Minus Incorrect…
Differentially expressed genes of Coptotermes formosanus (Isoptera: Rhinotermitidae) challenged by chemical insecticides.

PubMed

Zhang, Yi; Zhao, Yuanyuan; Qiu, Xuehong; Han, Richou

2013-08-01

Coptotermes formosanus Shiraki (Isoptera: Rhinotermitidae) termites are harmful social insects to wood constructions. The current control methods heavily depend on the chemical insecticides with increasing resistance. Analysis of the differentially expressed genes mediated by chemical insecticides will contribute to the understanding of the termite resistance to chemicals and to the establishment of alternative control measures. In the present article, a full-length cDNA library was constructed from the termites induced by a mixture of commonly used insecticides (0.01% sulfluramid and 0.01% triflumuron) for 24 h, by using the RNA ligase-mediated Rapid Amplification cDNA End method. Fifty-eight differentially expressed clones were obtained by polymerase chain reaction and confirmed by dot-blot hybridization. Forty-six known sequences were obtained, which clustered into 33 unique sequences grouped in 6 contigs and 27 singlets. Sixty-seven percent (22) of the sequences had counterpart genes from other organisms, whereas 33% (11) were undescribed. A Gene Ontology analysis classified 33 unique sequences into different functional categories. In general, most of the differential expression genes were involved in binding and catalytic activity.
Phylogenomic analysis of UDP glycosyltransferase 1 multigene family in Linum usitatissimum identified genes with varied expression patterns.

PubMed

Barvkar, Vitthal T; Pardeshi, Varsha C; Kale, Sandip M; Kadoo, Narendra Y; Gupta, Vidya S

2012-05-08

The glycosylation process, catalyzed by ubiquitous glycosyltransferase (GT) family enzymes, is a prevalent modification of plant secondary metabolites that regulates various functions such as hormone homeostasis, detoxification of xenobiotics and biosynthesis and storage of secondary metabolites. Flax (Linum usitatissimum L.) is a commercially grown oilseed crop, important because of its essential fatty acids and health promoting lignans. Identification and characterization of UDP glycosyltransferase (UGT) genes from flax could provide valuable basic information about this important gene family and help to explain the seed specific glycosylated metabolite accumulation and other processes in plants. Plant genome sequencing projects are useful to discover complexity within this gene family and also pave the way for the development of functional genomics approaches. Taking advantage of the newly assembled draft genome sequence of flax, we identified 137 UDP glycosyltransferase (UGT) genes from flax using a conserved signature motif. Phylogenetic analysis of these protein sequences clustered them into 14 major groups (A-N). Expression patterns of these genes were investigated using publicly available expressed sequence tag (EST), microarray data and reverse transcription quantitative real time PCR (RT-qPCR). Seventy-three per cent of these genes (100 out of 137) showed expression evidence in 15 tissues examined and indicated varied expression profiles. The RT-qPCR results of 10 selected genes were also coherent with the digital expression analysis. Interestingly, five duplicated UGT genes were identified, which showed differential expression in various tissues. Of the seven intron loss/gain positions detected, two intron positions were conserved among most of the UGTs, although a clear relationship about the evolution of these genes could not be established. Comparison of the flax UGTs with orthologs from four other sequenced dicot genomes indicated that seven UGTs were flax-diverged. Flax has a large number of UGT genes including a few flax-diverged ones. Phylogenetic analysis and expression profiles of these genes identified tissue and condition specific repertoire of UGT genes from this crop. This study would facilitate precise selection of candidate genes and their further characterization of substrate specificities and in planta functions.
Phylogenomic analysis of UDP glycosyltransferase 1 multigene family in Linum usitatissimum identified genes with varied expression patterns

PubMed Central

2012-01-01

Background The glycosylation process, catalyzed by ubiquitous glycosyltransferase (GT) family enzymes, is a prevalent modification of plant secondary metabolites that regulates various functions such as hormone homeostasis, detoxification of xenobiotics and biosynthesis and storage of secondary metabolites. Flax (Linum usitatissimum L.) is a commercially grown oilseed crop, important because of its essential fatty acids and health promoting lignans. Identification and characterization of UDP glycosyltransferase (UGT) genes from flax could provide valuable basic information about this important gene family and help to explain the seed specific glycosylated metabolite accumulation and other processes in plants. Plant genome sequencing projects are useful to discover complexity within this gene family and also pave the way for the development of functional genomics approaches. Results Taking advantage of the newly assembled draft genome sequence of flax, we identified 137 UDP glycosyltransferase (UGT) genes from flax using a conserved signature motif. Phylogenetic analysis of these protein sequences clustered them into 14 major groups (A-N). Expression patterns of these genes were investigated using publicly available expressed sequence tag (EST), microarray data and reverse transcription quantitative real time PCR (RT-qPCR). Seventy-three per cent of these genes (100 out of 137) showed expression evidence in 15 tissues examined and indicated varied expression profiles. The RT-qPCR results of 10 selected genes were also coherent with the digital expression analysis. Interestingly, five duplicated UGT genes were identified, which showed differential expression in various tissues. Of the seven intron loss/gain positions detected, two intron positions were conserved among most of the UGTs, although a clear relationship about the evolution of these genes could not be established. Comparison of the flax UGTs with orthologs from four other sequenced dicot genomes indicated that seven UGTs were flax-diverged. Conclusions Flax has a large number of UGT genes including a few flax-diverged ones. Phylogenetic analysis and expression profiles of these genes identified tissue and condition specific repertoire of UGT genes from this crop. This study would facilitate precise selection of candidate genes and their further characterization of substrate specificities and in planta functions. PMID:22568875
Cloning, analysis and functional annotation of expressed sequence tags from the Earthworm Eisenia fetida

PubMed Central

Pirooznia, Mehdi; Gong, Ping; Guan, Xin; Inouye, Laura S; Yang, Kuan; Perkins, Edward J; Deng, Youping

2007-01-01

Background Eisenia fetida, commonly known as red wiggler or compost worm, belongs to the Lumbricidae family of the Annelida phylum. Little is known about its genome sequence although it has been extensively used as a test organism in terrestrial ecotoxicology. In order to understand its gene expression response to environmental contaminants, we cloned 4032 cDNAs or expressed sequence tags (ESTs) from two E. fetida libraries enriched with genes responsive to ten ordnance related compounds using suppressive subtractive hybridization-PCR. Results A total of 3144 good quality ESTs (GenBank dbEST accession number EH669363–EH672369 and EL515444–EL515580) were obtained from the raw clone sequences after cleaning. Clustering analysis yielded 2231 unique sequences including 448 contigs (from 1361 ESTs) and 1783 singletons. Comparative genomic analysis showed that 743 or 33% of the unique sequences shared high similarity with existing genes in the GenBank nr database. Provisional function annotation assigned 830 Gene Ontology terms to 517 unique sequences based on their homology with the annotated genomes of four model organisms Drosophila melanogaster, Mus musculus, Saccharomyces cerevisiae, and Caenorhabditis elegans. Seven percent of the unique sequences were further mapped to 99 Kyoto Encyclopedia of Genes and Genomes pathways based on their matching Enzyme Commission numbers. All the information is stored and retrievable at a highly performed, web-based and user-friendly relational database called EST model database or ESTMD version 2. Conclusion The ESTMD containing the sequence and annotation information of 4032 E. fetida ESTs is publicly accessible at . PMID:18047730
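The counts reported in the abstract above can be cross-checked against one another with a few lines of arithmetic. The sketch below uses only numbers stated in the abstract; the variable and function names are hypothetical, and it assumes the two GenBank accession ranges are contiguous and inclusive.

```python
def accession_span(first: int, last: int) -> int:
    """Number of accessions in an inclusive, contiguous numeric range (assumed)."""
    return last - first + 1


contigs, singletons = 448, 1783
ests_in_contigs = 1361

# Unique sequences are contigs plus singletons.
unique_sequences = contigs + singletons

# Every good-quality EST is either assembled into a contig or left as a singleton.
good_quality_ests = ests_in_contigs + singletons

# EH669363-EH672369 and EL515444-EL515580, numeric parts only.
deposited_ests = accession_span(669363, 672369) + accession_span(515444, 515580)

# Share of unique sequences with high similarity to GenBank nr entries.
similar_percent = round(100 * 743 / unique_sequences)

print(unique_sequences, good_quality_ests, deposited_ests, similar_percent)
```

All three totals agree with the figures in the abstract (2231 unique sequences, 3144 good quality ESTs, and 33%).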
Molecular cloning and expression analysis of annexin A2 gene in sika deer antler tip.

PubMed

Xia, Yanling; Qu, Haomiao; Lu, Binshan; Zhang, Qiang; Li, Heping

2018-04-01

Molecular cloning and bioinformatics analysis of the annexin A2 (ANXA2) gene in sika deer antler tip were conducted, and the role of the ANXA2 gene in the growth and development of the antler was initially analyzed. Reverse transcriptase polymerase chain reaction (RT-PCR) was used to clone the cDNA sequence of the ANXA2 gene from the antler tip of sika deer (Cervus nippon hortulorum), and bioinformatics methods were applied to analyze the amino acid sequence of the Anxa2 protein. The mRNA expression levels of the ANXA2 gene in different growth stages were examined by real-time reverse transcriptase polymerase chain reaction (real-time RT-PCR). Nucleotide sequence analysis revealed an open reading frame of 1,020 bp encoding a protein of 339 amino acids with a calculated molecular weight of 38.6 kDa and an isoelectric point of 6.09. Homologous sequence alignment and phylogenetic analysis indicated that the Anxa2 mature protein of sika deer had the closest genetic distance to those of Cervus elaphus and Bos mutus. Real-time RT-PCR results showed that the gene had differential expression levels in different growth stages, and that the expression level of the ANXA2 gene was highest at metaphase (the rapid growing period). The ANXA2 gene may promote cell proliferation, and the findings suggest Anxa2 as an important candidate for regulating the growth and development of deer antler.
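The reported ORF length and protein length agree if the 1,020 bp figure includes the terminating stop codon. A minimal sketch of that arithmetic follows; it assumes a standard ORF ending in exactly one stop codon, and the function name is illustrative rather than taken from the study.

```python
def protein_length_from_orf(orf_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by an ORF of orf_bp nucleotides."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    # The stop codon is counted in the ORF length but encodes no amino acid.
    return codons - 1 if includes_stop else codons


# 1,020 bp = 340 codons, one of which is the stop codon.
print(protein_length_from_orf(1020))
```

Under the same assumption, the check applies to any ORF length quoted together with a protein length.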
Genome-wide identification of conserved intronic non-coding sequences using a Bayesian segmentation approach.

PubMed

Algama, Manjula; Tasker, Edward; Williams, Caitlin; Parslow, Adam C; Bryson-Richardson, Robert J; Keith, Jonathan M

2017-03-27

Computational identification of non-coding RNAs (ncRNAs) is a challenging problem. We describe a genome-wide analysis using Bayesian segmentation to identify intronic elements highly conserved between three evolutionarily distant vertebrate species: human, mouse and zebrafish. We investigate the extent to which these elements include ncRNAs (or conserved domains of ncRNAs) and regulatory sequences. We identified 655 deeply conserved intronic sequences in a genome-wide analysis. We also performed a pathway-focussed analysis on genes involved in muscle development, detecting 27 intronic elements, of which 22 were not detected in the genome-wide analysis. At least 87% of the genome-wide and 70% of the pathway-focussed elements have existing annotations indicative of conserved RNA secondary structure. The expression of 26 of the pathway-focused elements was examined using RT-PCR, providing confirmation that they include expressed ncRNAs. Consistent with previous studies, these elements are significantly over-represented in the introns of transcription factors. This study demonstrates a novel, highly effective, Bayesian approach to identifying conserved non-coding sequences. Our results complement previous findings that these sequences are enriched in transcription factors. However, in contrast to previous studies which suggest the majority of conserved sequences are regulatory factor binding sites, the majority of conserved sequences identified using our approach contain evidence of conserved RNA secondary structures, and our laboratory results suggest most are expressed. Functional roles at DNA and RNA levels are not mutually exclusive, and many of our elements possess evidence of both. Moreover, ncRNAs play roles in transcriptional and post-transcriptional regulation, and this may contribute to the over-representation of these elements in introns of transcription factors. 
We attribute the higher sensitivity of the pathway-focussed analysis compared to the genome-wide analysis to improved alignment quality, suggesting that enhanced genomic alignments may reveal many more conserved intronic sequences.
Enhancer Linking by Methylation/Expression Relationships (ELMER) | Informatics Technology for Cancer Research (ITCR)

Cancer.gov

R tool for analysis of DNA methylation and expression datasets. Integrative analysis allows reconstruction of in vivo transcription factor networks altered in cancer along with identification of the underlying gene regulatory sequences.
Genomic resources for songbird research and their use in characterizing gene expression during brain development

PubMed Central

Li, XiaoChing; Wang, Xiu-Jie; Tannenhauser, Jonathan; Podell, Sheila; Mukherjee, Piali; Hertel, Moritz; Biane, Jeremy; Masuda, Shoko; Nottebohm, Fernando; Gaasterland, Terry

2007-01-01

Vocal learning and neuronal replacement have been studied extensively in songbirds, but until recently, few molecular and genomic tools for songbird research existed. Here we describe new molecular/genomic resources developed in our laboratory. We made cDNA libraries from zebra finch (Taeniopygia guttata) brains at different developmental stages. A total of 11,000 cDNA clones from these libraries, representing 5,866 unique gene transcripts, were randomly picked and sequenced from the 3′ ends. A web-based database was established for clone tracking, sequence analysis, and functional annotations. Our cDNA libraries were not normalized. Sequencing ESTs without normalization produced many developmental stage-specific sequences, yielding insights into patterns of gene expression at different stages of brain development. In particular, the cDNA library made from brains at posthatching day 30–50, corresponding to the period of rapid song system development and song learning, has the most diverse and richest set of genes expressed. We also identified five microRNAs whose sequences are highly conserved between zebra finch and other species. We printed cDNA microarrays and profiled gene expression in the high vocal center of both adult male zebra finches and canaries (Serinus canaria). Genes differentially expressed in the high vocal center were identified from the microarray hybridization results. Selected genes were validated by in situ hybridization. Networks among the regulated genes were also identified. These resources provide songbird biologists with tools for genome annotation, comparative genomics, and microarray gene expression analysis. PMID:17426146
Differences in expression of retinal pigment epithelium mRNA between normal canines

PubMed Central

2004-01-01

A reference database of differences in mRNA expression in normal healthy canine retinal pigment epithelium (RPE) has been established. This database identifies non-informative differences in mRNA expression that can be used in screening canine RPE for mutations associated with clinical effects on vision. Complementary DNA (cDNA) pools were prepared from mRNA harvested from RPE, amplified by PCR, and used in a subtractive hybridization protocol (representational difference analysis) to identify differences in RPE mRNA expression between canines. The effect of relatedness of the test canines on the frequency of occurrence of differences was evaluated by using 2 unrelated canines for comparison with 2 female sibling canines of blue heeler/bull terrier lineage. Differentially expressed cDNA species were cloned, sequenced, and identified by comparison to public database entries. The most frequently observed differentially expressed sequence from the unrelated canine comparison was cDNA with 21 base pairs (bp) identical to the human epithelial membrane protein 1 gene (present in 8 of 20 clones). Different clones from the same-sex sibling RPE contained repetitions of several short sequence motifs including the human epithelial membrane protein 1 (4 of 25 clones). Other prevalent differences between sibling RPE included sequences similar to a chicken genetic marker sequence motif (5 of 25), and 6 clones with homology to porcine major histocompatibility loci. In addition to identifying several repetitively occurring, noninformative, differentially expressed RPE mRNA species, the findings confirm that fewer differences occurred between siblings, highlighting the importance of using closely related subjects in representational difference analysis studies. PMID:15352545

Alteration of gene expression in human hepatocellular carcinoma with integrated hepatitis B virus DNA.

PubMed

Tamori, Akihiro; Yamanishi, Yoshihiro; Kawashima, Shuichi; Kanehisa, Minoru; Enomoto, Masaru; Tanaka, Hiromu; Kubo, Shoji; Shiomi, Susumu; Nishiguchi, Shuhei

2005-08-15

Integration of hepatitis B virus (HBV) DNA into the human genome is one of the most important steps in HBV-related carcinogenesis. This study attempted to find the link between HBV DNA, the adjoining cellular sequence, and altered gene expression in hepatocellular carcinoma (HCC) with integrated HBV DNA. We examined 15 cases of HCC infected with HBV by cassette ligation-mediated PCR. The human DNA adjacent to the integrated HBV DNA was sequenced. Protein coding sequences were searched for in the human sequence. In five cases with HBV DNA integration, from which good quality RNA was extracted, gene expression was examined by cDNA microarray analysis. The human DNA sequence successive to integrated HBV DNA was determined in the 15 HCCs. Eight protein-coding regions were involved: ras-responsive element binding protein 1, calmodulin 1, mixed lineage leukemia 2 (MLL2), FLJ333655, LOC220272, LOC255345, LOC220220, and LOC168991. The MLL2 gene was expressed in three cases with HBV DNA integrated into exon 3 of MLL2 and in one case with HBV DNA integrated into intron 3 of MLL2. Gene expression analysis suggested that two HCCs with HBV integrated into MLL2 had similar patterns of gene expression compared with three HCCs with HBV integrated into other loci of human chromosomes. HBV DNA was integrated at random sites of human DNA, and the MLL2 gene was one of the targets for integration. Our results suggest that HBV DNA might modulate human genes near integration sites, followed by integration site-specific expression of such genes during hepatocarcinogenesis.
Characteristics of the Lotus japonicus gene repertoire deduced from large-scale expressed sequence tag (EST) analysis.

PubMed

Asamizu, Erika; Nakamura, Yasukazu; Sato, Shusei; Tabata, Satoshi

2004-02-01

To perform a comprehensive analysis of genes expressed in a model legume, Lotus japonicus, a total of 74472 3'-end expressed sequence tags (EST) were generated from cDNA libraries produced from six different organs. Clustering of sequences was performed with an identity criterion of 95% for 50 bases, and a total of 20457 non-redundant sequences, 8503 contigs and 11954 singletons were generated. EST sequence coverage was analyzed by using the annotated L. japonicus genomic sequence and 1093 of the 1889 predicted protein-encoding genes (57.9%) were hit by the EST sequence(s). Gene content was compared to several plant species. Among the 8503 contigs, 471 were identified as sequences conserved only in leguminous species and these included several disease resistance-related genes. This suggested that in legumes, these genes may have evolved specifically to resist pathogen attack. The rate of gene sequence divergence was assessed by comparing similarity level and functional category based on the Gene Ontology (GO) annotation of Arabidopsis genes. This revealed that genes encoding ribosomal proteins, as well as those related to translation, photosynthesis, and cellular structure were more abundantly represented in the highly conserved class, and that genes encoding transcription factors and receptor protein kinases were abundantly represented in the less conserved class. To make the sequence information and the cDNA clones available to the research community, a Web database with useful services was created at http://www.kazusa.or.jp/en/plant/lotus/EST/.
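The clustering criterion above (95% identity over 50 bases) can be illustrated with a small check on two overlapping sequences. This is only one possible reading of the criterion, not the authors' clustering software: it assumes the two sequences are already aligned without gaps and have equal length, and the function name and parameters are hypothetical.

```python
def meets_identity_criterion(a: str, b: str, window: int = 50,
                             min_identity: float = 0.95) -> bool:
    """Return True if any `window`-base stretch of two gap-free, pre-aligned
    sequences reaches at least `min_identity` fractional identity."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if len(a) < window:
        return False
    matches = [x == y for x, y in zip(a.upper(), b.upper())]
    needed = min_identity * window
    count = sum(matches[:window])  # matches in the first window
    if count >= needed:
        return True
    # Slide the window one base at a time, updating the match count.
    for i in range(window, len(matches)):
        count += matches[i] - matches[i - window]
        if count >= needed:
            return True
    return False
```

With the default settings a 50-base window needs at least 48 matching positions, since 47 of 50 is only 94%.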
TEtools facilitates big data expression analysis of transposable elements and reveals an antagonism between their activity and that of piRNA genes

PubMed Central

Lerat, Emmanuelle; Fablet, Marie; Modolo, Laurent; Lopez-Maestre, Hélène

2017-01-01

Over recent decades, substantial efforts have been made to understand the interactions between host genomes and transposable elements (TEs). The impact of TEs on the regulation of host genes is well known, with TEs acting as platforms of regulatory sequences. Nevertheless, due to their repetitive nature, it is considerably difficult to integrate TE analysis into genome-wide studies. Here, we developed a specific tool for the analysis of TE expression: TEtools. This tool takes into account the TE sequence diversity of the genome; it can be applied to unannotated or unassembled genomes and is freely available under the GPL3 (https://github.com/l-modolo/TEtools). TEtools performs the mapping of RNA-seq data obtained from classical mRNAs or small RNAs onto a list of TE sequences and performs differential expression analyses with statistical relevance. Using this tool, we analyzed TE expression from five Drosophila wild-type strains. Our data show for the first time that the activity of TEs is strictly linked to the activity of the genes implicated in the piwi-interacting RNA biogenesis and therefore fits an arms race scenario between TE sequences and host control genes. PMID:28204592
Transcriptome analysis by strand-specific sequencing of complementary DNA

PubMed Central

Parkhomchuk, Dmitri; Borodina, Tatiana; Amstislavskiy, Vyacheslav; Banaru, Maria; Hallen, Linda; Krobitsch, Sylvia; Lehrach, Hans; Soldatov, Alexey

2009-01-01

High-throughput complementary DNA sequencing (RNA-Seq) is a powerful tool for whole-transcriptome analysis, supplying information about a transcript's expression level and structure. However, it is difficult to determine the polarity of transcripts, and therefore identify which strand is transcribed. Here, we present a simple cDNA sequencing protocol that preserves information about a transcript's direction. Using Saccharomyces cerevisiae and mouse brain transcriptomes as models, we demonstrate that knowing the transcript's orientation allows more accurate determination of the structure and expression of genes. It also helps to identify new genes and enables studying promoter-associated and antisense transcription. The transcriptional landscapes we obtained are available online. PMID:19620212
Transcriptome analysis by strand-specific sequencing of complementary DNA.

PubMed

Parkhomchuk, Dmitri; Borodina, Tatiana; Amstislavskiy, Vyacheslav; Banaru, Maria; Hallen, Linda; Krobitsch, Sylvia; Lehrach, Hans; Soldatov, Alexey

2009-10-01

High-throughput complementary DNA sequencing (RNA-Seq) is a powerful tool for whole-transcriptome analysis, supplying information about a transcript's expression level and structure. However, it is difficult to determine the polarity of transcripts, and therefore identify which strand is transcribed. Here, we present a simple cDNA sequencing protocol that preserves information about a transcript's direction. Using Saccharomyces cerevisiae and mouse brain transcriptomes as models, we demonstrate that knowing the transcript's orientation allows more accurate determination of the structure and expression of genes. It also helps to identify new genes and enables studying promoter-associated and antisense transcription. The transcriptional landscapes we obtained are available online.
Sequence analysis reveals genomic factors affecting EST-SSR primer performance and polymorphism

USDA-ARS's Scientific Manuscript database

Search for simple sequence repeat (SSR) motifs and design of flanking primers in expressed sequence tag (EST) sequences can be easily done at a large scale using bioinformatics programs. However, failed amplification and/or detection, along with lack of polymorphism, is often seen among randomly sel...
A small test of a sequence-based typing method: definition of the B*1520 allele.

PubMed

Domena, J D; Little, A M; Arnett, K L; Adams, E J; Marsh, S G; Parham, P

1994-10-01

Santamaria et al. (Human Immunology 1993 37: 39-50) describe a method of sequence-based typing (SBT) for HLA-A, B and C alleles said to give "unambiguous typing of any sample, heterozygous or homozygous, without requiring additional typing information". From SBT analysis, which involves determination of partial sequences of mixed alleles, these investigators reported that cell lines KT17 (HLA-B35,62) and OLGA (HLA-B62) from the reference panel of the 10th International Histocompatibility Workshop express novel variants of HLA-B15 (B1501-MN6) and HLA-B35 (B3501-MN7) respectively. To study further the novel alleles, we cloned and sequenced full-length HLA-B cDNA clones isolated from the KT17 and OLGA cell lines. We find that KT17 expresses B*3501, as assigned by SBT, and B*1501, the common allele encoding the B62 antigen. We were unable to confirm that KT17 expresses the novel B1501-MN6 variant identified by SBT. For OLGA our analysis confirms the partial sequences obtained by SBT. Thus OLGA expresses B*1501 and a novel HLA-B allele. The complete sequence of the latter shows it is a hybrid having exons 1 and 2 in common with B*1501 and other B15 subtypes and exons 3-7 in common with B*3501 and related molecules including B*5301 and B*5801. The novel allele has been designated B*1520 because of its sequence similarity with the B15 group; furthermore, serological analysis shows that the B*1520 product does not express epitopes in common with either B35, B53 or B58. The B*1520 heavy chain has a similar isoelectric point to A*3101; B*1520 was undetected by previous applications of isoelectric focusing because B*1520 and A31 are both expressed by OLGA. In conclusion, HLA-B typing of two cell lines by cDNA cloning and sequencing gives concordant results with SBT for three of the four alleles. 
The cause of the discrepancy for the fourth allele is unknown; however, this finding indicates that the novel HLA-A, B and C sequences emerging from SBT studies need independent verification.
Analysis and expression of the alpha-expansin and beta-expansin gene families in maize

NASA Technical Reports Server (NTRS)

Wu, Y.; Meeley, R. B.; Cosgrove, D. J.

2001-01-01

Expansins comprise a multigene family of proteins in maize (Zea mays). We isolated and characterized 13 different maize expansin cDNAs, five of which are alpha-expansins and eight of which are beta-expansins. This paper presents an analysis of these 13 expansins, as well as an expression analysis by northern blotting with materials from young and mature maize plants. Some expansins were expressed in restricted regions, such as the beta-expansins ExpB1 (specifically expressed in maize pollen) and ExpB4 (expressed principally in young husks). Other expansins such as alpha-expansin Exp1 and beta-expansin ExpB2 were expressed in several organs. The expression of yet a third group was not detected in the selected organs and tissues. An analysis of expansin sequences from the maize expressed sequence tag collection is also presented. Our results indicate that expansin genes may have general, overlapping expression in some instances, whereas in other cases the expression may be highly specific and limited to a single organ or cell type. In contrast to the situation in Arabidopsis, beta-expansins in maize seem to be more numerous and more highly expressed than are alpha-expansins. The results support the concept that beta-expansins multiplied and evolved special functions in the grasses.
Comprehensive processing of high-throughput small RNA sequencing data including quality checking, normalization, and differential expression analysis using the UEA sRNA Workbench

PubMed Central

Beckers, Matthew; Mohorianu, Irina; Stocks, Matthew; Applegate, Christopher; Dalmay, Tamas; Moulton, Vincent

2017-01-01

Recently, high-throughput sequencing (HTS) has revealed compelling details about the small RNA (sRNA) population in eukaryotes. These 20 to 25 nt noncoding RNAs can influence gene expression by acting as guides for the sequence-specific regulatory mechanism known as RNA silencing. The increase in sequencing depth and number of samples per project enables a better understanding of the role sRNAs play by facilitating the study of expression patterns. However, the intricacy of the biological hypotheses coupled with a lack of appropriate tools often leads to inadequate mining of the available data and thus, an incomplete description of the biological mechanisms involved. To enable a comprehensive study of differential expression in sRNA data sets, we present a new interactive pipeline that guides researchers through the various stages of data preprocessing and analysis. This includes various tools, some of which we specifically developed for sRNA analysis, for quality checking and normalization of sRNA samples as well as tools for the detection of differentially expressed sRNAs and identification of the resulting expression patterns. The pipeline is available within the UEA sRNA Workbench, a user-friendly software package for the processing of sRNA data sets. We demonstrate the use of the pipeline on a H. sapiens data set; additional examples on a B. terrestris data set and on an A. thaliana data set are described in the Supplemental Information. A comparison with existing approaches is also included, which exemplifies some of the issues that need to be addressed for sRNA analysis and how the new pipeline may be used to do this. PMID:28289155
Generation and analysis of expressed sequence tags from the bone marrow of Chinese Sika deer.

PubMed

Yao, Baojin; Zhao, Yu; Zhang, Mei; Li, Juan

2012-03-01

Sika deer is one of the best-known and highly valued animals of China. Despite its economic, cultural, and biological importance, there has not been a large-scale sequencing project for Sika deer to date. With the ultimate goal of sequencing the complete genome of this organism, we first established a bone marrow cDNA library for Sika deer and generated a total of 2,025 reads. After processing the sequences, 2,017 high-quality expressed sequence tags (ESTs) were obtained. These ESTs were assembled into 1,157 unigenes, including 238 contigs and 919 singletons. Comparative analyses indicated that 888 (76.75%) of the unigenes had significant matches to sequences in the non-redundant protein database. In addition to highly expressed genes, such as stearoyl-CoA desaturase, cytochrome c oxidase, adipocyte-type fatty acid-binding protein, adiponectin and thymosin beta-4, we also obtained vascular endothelial growth factor-A and heparin-binding growth-associated molecule, both of which are of great importance for angiogenesis research. There were 244 (21.09%) unigenes with no significant match to any sequence in current protein or nucleotide databases, and these sequences may represent genes with unknown function in Sika deer. Open reading frame analysis of the sequences was performed using the getorf program. In addition, the sequences were functionally classified using the gene ontology hierarchy, clusters of orthologous groups of proteins and Kyoto encyclopedia of genes and genomes databases. Analysis of ESTs described in this paper provides an important resource for the transcriptome exploration of Sika deer, and will also facilitate further studies on functional genomics, gene discovery and genome annotation of Sika deer.
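The assembly statistics quoted above can be cross-checked with a few lines of arithmetic. The sketch below simply recomputes the unigene total and the two percentages from the counts given in the abstract; the helper name is illustrative and not part of any tool used in the study.

```python
def percent_of(part: int, total: int) -> float:
    """Return part/total as a percentage rounded to two decimal places."""
    return round(100 * part / total, 2)


contigs, singletons = 238, 919          # counts reported in the abstract
unigenes = contigs + singletons         # contigs plus singletons

print(unigenes)                         # total unigenes
print(percent_of(888, unigenes))        # share with protein database matches
print(percent_of(244, unigenes))        # share with no significant match
```

The recomputed values agree with the reported 1,157 unigenes, 76.75% and 21.09%.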
Genomic characterization and expression profiles upon bacterial infection of a novel cystatin B homologue from disk abalone (Haliotis discus discus).

PubMed

Premachandra, H K A; Wan, Qiang; Elvitigala, Don Anushka Sandaruwan; De Zoysa, Mahanama; Choi, Cheol Young; Whang, Ilson; Lee, Jehee

2012-12-01

Cystatins are a large family of cysteine proteinase inhibitors which are involved in diverse biological and pathological processes. In the present study, we identified a gene related to the cystatin superfamily, AbCyt B, from disk abalone Haliotis discus discus by expressed sequence tag (EST) analysis and BAC library screening. The complete cDNA sequence of AbCyt B comprises 1967 nucleotides with a 306 bp open reading frame (ORF) encoding 101 amino acids. The amino acid sequence consists of a single cystatin-like domain, which has a cysteine proteinase inhibitor signature, a conserved Gly in the N-terminal region, a QVVAG motif and a variant of the PW motif. No signal peptide, disulfide bonds or carbohydrate side chains were identified. Analysis of the deduced amino acid sequence revealed that AbCyt B shares up to 44.7% identity and 65.7% similarity with the cystatin B genes from other organisms. The genomic sequence of AbCyt B is approximately 8.4 Kb, consisting of three exons and two introns. Phylogenetic tree analysis showed that AbCyt B was closely related to the cystatin B from pacific oyster (Crassostrea gigas) under the family 1. Functional analysis of recombinant AbCyt B protein exhibited inhibitory activity against papain, with almost 84% inhibition at a concentration of 3.5 μmol/L. In tissue expression analysis, AbCyt B transcripts were expressed abundantly in the hemocyte, gill, mantle, and digestive tract, while weakly in muscle, testis, and hepatopancreas. After the immune challenge with Vibrio parahemolyticus, AbCyt B showed significant (P<0.05) up-regulation of relative mRNA expression in gill and hemocytes at 24 and 6 h post infection, respectively. These results collectively suggest that AbCyt B is a potent inhibitor of cysteine proteinases and is also potentially involved in immune responses against invading bacterial pathogens in abalone.
Functional Analysis of Maize Silk-Specific ZmbZIP25 Promoter.

PubMed

Li, Wanying; Yu, Dan; Yu, Jingjuan; Zhu, Dengyun; Zhao, Qian

2018-03-12

ZmbZIP25 (Zea mays bZIP (basic leucine zipper) transcription factor 25) is a function-unknown protein that belongs to the D group of the bZIP transcription factor family. RNA-seq data showed that the expression of ZmbZIP25 was tissue-specific in maize silks, and this specificity was confirmed by RT-PCR (reverse transcription-polymerase chain reaction). In situ RNA hybridization showed that ZmbZIP25 was expressed exclusively in the xylem of maize silks. A 5′ RACE (rapid amplification of cDNA ends) assay identified an adenine residue as the transcription start site of the ZmbZIP25 gene. To characterize this silk-specific promoter, we isolated and analyzed a 2450 bp (from −2083 to +367) and a 2600 bp sequence of ZmbZIP25 (from −2083 to +517, the transcription start site was denoted +1). Stable expression assays in Arabidopsis showed that the expression of the reporter gene GUS driven by the 2450 bp ZmbZIP25 5′-flanking fragment occurred exclusively in the papillae of Arabidopsis stigmas. Furthermore, transient expression assays in maize indicated that GUS and GFP expression driven by the 2450 bp ZmbZIP25 5′-flanking sequences occurred only in maize silks and not in other tissues. However, no GUS or GFP expression was driven by the 2600 bp ZmbZIP25 5′-flanking sequences in either stable or transient expression assays. A series of deletion analyses of the 2450 bp ZmbZIP25 5′-flanking sequence was performed in transgenic Arabidopsis plants, and probable elements prediction analysis revealed the possible presence of negative regulatory elements within the 161 bp region from −1117 to −957 that were responsible for the specificity of the ZmbZIP25 5′-flanking sequence.
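The fragment lengths quoted above follow from the coordinate ranges once the numbering convention is taken into account. The sketch below assumes the common convention in which the transcription start site is +1 and the base immediately upstream is −1, so that no position 0 exists; the function name is illustrative and not taken from the paper.

```python
def span_length(start: int, end: int) -> int:
    """Length in bp of an inclusive interval in TSS-relative coordinates.

    Assumes the transcription start site is +1 and the base immediately
    upstream is -1, so there is no position 0.
    """
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this convention")
    if start > end:
        raise ValueError("start must not exceed end")
    length = end - start + 1
    # An interval that crosses the TSS would otherwise count the
    # nonexistent position 0.
    if start < 0 < end:
        length -= 1
    return length


print(span_length(-2083, 367))    # shorter promoter fragment
print(span_length(-2083, 517))    # longer promoter fragment
print(span_length(-1117, -957))   # putative negative regulatory region
```

Under this convention the three intervals come out at 2450, 2600 and 161 bp, matching the sizes stated in the abstract.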
Functional Analysis of Maize Silk-Specific ZmbZIP25 Promoter

PubMed Central

Li, Wanying; Yu, Dan; Yu, Jingjuan; Zhu, Dengyun; Zhao, Qian

2018-01-01

ZmbZIP25 (Zea mays bZIP (basic leucine zipper) transcription factor 25) is a function-unknown protein that belongs to the D group of the bZIP transcription factor family. RNA-seq data showed that the expression of ZmbZIP25 was tissue-specific in maize silks, and this specificity was confirmed by RT-PCR (reverse transcription-polymerase chain reaction). In situ RNA hybridization showed that ZmbZIP25 was expressed exclusively in the xylem of maize silks. A 5′ RACE (rapid amplification of cDNA ends) assay identified an adenine residue as the transcription start site of the ZmbZIP25 gene. To characterize this silk-specific promoter, we isolated and analyzed a 2450 bp (from −2083 to +367) and a 2600 bp sequence of ZmbZIP25 (from −2083 to +517, the transcription start site was denoted +1). Stable expression assays in Arabidopsis showed that the expression of the reporter gene GUS driven by the 2450 bp ZmbZIP25 5′-flanking fragment occurred exclusively in the papillae of Arabidopsis stigmas. Furthermore, transient expression assays in maize indicated that GUS and GFP expression driven by the 2450 bp ZmbZIP25 5′-flanking sequences occurred only in maize silks and not in other tissues. However, no GUS or GFP expression was driven by the 2600 bp ZmbZIP25 5′-flanking sequences in either stable or transient expression assays. A series of deletion analyses of the 2450 bp ZmbZIP25 5′-flanking sequence was performed in transgenic Arabidopsis plants, and probable elements prediction analysis revealed the possible presence of negative regulatory elements within the 161 bp region from −1117 to −957 that were responsible for the specificity of the ZmbZIP25 5′-flanking sequence. PMID:29534529
Unraveling the oral cancer lncRNAome: Identification of novel lncRNAs associated with malignant progression and HPV infection.

PubMed

Nohata, Nijiro; Abba, Martin C; Gutkind, J Silvio

2016-08-01

The role of long non-coding RNA (lncRNA) expression in human head and neck squamous cell carcinoma (HNSCC) is still poorly understood. In this study, we aimed at establishing the onco-lncRNAome profiling of HNSCC and to identify lncRNAs correlating with prognosis and patient survival. The Atlas of Noncoding RNAs in Cancer (TANRIC) database was employed to retrieve the lncRNA expression information generated from The Cancer Genome Atlas (TCGA) HNSCC RNA-sequencing data. RNA-sequencing data from HNSCC cell lines were also considered for this study. Bioinformatics approaches, such as differential gene expression analysis, survival analysis, principal component analysis, and Co-LncRNA enrichment analysis were performed. Using TCGA HNSCC RNA-sequencing data from 426 HNSCC and 42 adjacent normal tissues, we found 728 lncRNA transcripts significantly and differentially expressed in HNSCC. Among the 728 lncRNAs, 55 lncRNAs were significantly associated with poor prognosis, such as overall survival and/or disease-free survival. Next, we found 140 lncRNA transcripts significantly and differentially expressed between Human Papilloma Virus (HPV) positive tumors and HPV negative tumors. Thirty lncRNA transcripts were differentially expressed between TP53 mutated and TP53 wild type tumors. Co-LncRNA analysis suggested that protein-coding genes that are co-expressed with these deregulated lncRNAs might be involved in cancer associated molecular events. With consideration of differential expression of lncRNAs in a HNSCC cell lines panel (n=22), we found several lncRNAs that may represent potential targets for diagnosis, therapy and prevention of HNSCC. LncRNAs profiling could provide novel insights into the potential mechanisms of HNSCC oncogenesis.
Unraveling the Oral Cancer lncRNAome: Identification of Novel lncRNAs Associated with Malignant Progression and HPV Infection

PubMed Central

Nohata, Nijiro; Abba, Martin C.; Gutkind, J. Silvio

2017-01-01

Objectives The role of long non-coding RNA (lncRNA) expression in human head and neck squamous cell carcinoma (HNSCC) is still poorly understood. In this study, we aimed at establishing the onco-lncRNAome profiling of HNSCC and to identify lncRNAs correlating with prognosis and patient survival. Materials and Methods The Atlas of Noncoding RNAs in Cancer (TANRIC) database was employed to retrieve the lncRNA expression information generated from The Cancer Genome Atlas (TCGA) HNSCC RNA-sequencing data. RNA-sequencing data from HNSCC cell lines were also considered for this study. Bioinformatics approaches, such as differential gene expression analysis, survival analysis, principal component analysis, and Co-LncRNA enrichment analysis were performed. Results Using TCGA HNSCC RNA-sequencing data from 426 HNSCC and 42 adjacent normal tissues, we found 728 lncRNA transcripts significantly and differentially expressed in HNSCC. Among the 728 lncRNAs, 55 lncRNAs were significantly associated with poor prognosis, such as overall survival and/or disease-free survival. Next, we found 140 lncRNA transcripts significantly and differentially expressed between Human Papilloma Virus (HPV) positive tumors and HPV negative tumors. Thirty lncRNA transcripts were differentially expressed between TP53 mutated and TP53 wild type tumors. Co-LncRNA analysis suggested that protein-coding genes that are co-expressed with these deregulated lncRNAs might be involved in cancer associated molecular events. With consideration of differential expression of lncRNAs in a HNSCC cell lines panel (n=22), we found several lncRNAs that may represent potential targets for diagnosis, therapy and prevention of HNSCC. Conclusion LncRNAs profiling could provide novel insights into the potential mechanisms of HNSCC oncogenesis. PMID:27424183
Transcriptome sequencing and de novo analysis of the copepod Calanus sinicus using 454 GS FLX.

PubMed

Ning, Juan; Wang, Minxiao; Li, Chaolun; Sun, Song

2013-01-01

Despite their species abundance and primary economic importance, genomic information about copepods is still limited. In particular, genomic resources are lacking for the copepod Calanus sinicus, which is a dominant species in the coastal waters of East Asia. In this study, we performed de novo transcriptome sequencing to produce a large number of expressed sequence tags for the copepod C. sinicus. Copepodid larvae and adults were used as the basic material for transcriptome sequencing. Using 454 pyrosequencing, a total of 1,470,799 reads were obtained, which were assembled into 56,809 high quality expressed sequence tags. Based on their sequence similarity to known proteins, about 14,000 different genes were identified, including members of all major conserved signaling pathways. Transcripts that were putatively involved in growth, lipid metabolism, molting, and diapause were also identified among these genes. Differentially expressed genes related to several processes were found in C. sinicus copepodid larvae and adults. We detected 284,154 single nucleotide polymorphisms (SNPs) that provide a resource for gene function studies. Our data provide the most comprehensive transcriptome resource available for C. sinicus. This resource allowed us to identify genes associated with primary physiological processes and SNPs in coding regions, which facilitated the quantitative analysis of differential gene expression. These data should provide a foundation for future genetic and genomic studies of this and related species.
Identification of human chromosome 22 transcribed sequences with ORF expressed sequence tags

PubMed Central

de Souza, Sandro J.; Camargo, Anamaria A.; Briones, Marcelo R. S.; Costa, Fernando F.; Nagai, Maria Aparecida; Verjovski-Almeida, Sergio; Zago, Marco A.; Andrade, Luis Eduardo C.; Carrer, Helaine; El-Dorry, Hamza F. A.; Espreafico, Enilza M.; Habr-Gama, Angelita; Giannella-Neto, Daniel; Goldman, Gustavo H.; Gruber, Arthur; Hackel, Christine; Kimura, Edna T.; Maciel, Rui M. B.; Marie, Suely K. N.; Martins, Elizabeth A. L.; Nóbrega, Marina P.; Paçó-Larson, Maria Luisa; Pardini, Maria Inês M. C.; Pereira, Gonçalo G.; Pesquero, João Bosco; Rodrigues, Vanderlei; Rogatto, Silvia R.; da Silva, Ismael D. C. G.; Sogayar, Mari C.; de Fátima Sonati, Maria; Tajara, Eloiza H.; Valentini, Sandro R.; Acencio, Marcio; Alberto, Fernando L.; Amaral, Maria Elisabete J.; Aneas, Ivy; Bengtson, Mário Henrique; Carraro, Dirce M.; Carvalho, Alex F.; Carvalho, Lúcia Helena; Cerutti, Janete M.; Corrêa, Maria Lucia C.; Costa, Maria Cristina R.; Curcio, Cyntia; Gushiken, Tsieko; Ho, Paulo L.; Kimura, Elza; Leite, Luciana C. C.; Maia, Gustavo; Majumder, Paromita; Marins, Mozart; Matsukuma, Adriana; Melo, Analy S. A.; Mestriner, Carlos Alberto; Miracca, Elisabete C.; Miranda, Daniela C.; Nascimento, Ana Lucia T. O.; Nóbrega, Francisco G.; Ojopi, Élida P. B.; Pandolfi, José Rodrigo C.; Pessoa, Luciana Gilbert; Rahal, Paula; Rainho, Claudia A.; da Ro's, Nancy; de Sá, Renata G.; Sales, Magaly M.; da Silva, Neusa P.; Silva, Tereza C.; da Silva, Wilson; Simão, Daniel F.; Sousa, Josane F.; Stecconi, Daniella; Tsukumo, Fernando; Valente, Valéria; Zalcberg, Heloisa; Brentani, Ricardo R.; Reis, Luis F. L.; Dias-Neto, Emmanuel; Simpson, Andrew J. G.

2000-01-01

Transcribed sequences in the human genome can be identified with confidence only by alignment with sequences derived from cDNAs synthesized from naturally occurring mRNAs. We constructed a set of 250,000 cDNAs that represent partial expressed gene sequences and that are biased toward the central coding regions of the resulting transcripts. They are termed ORF expressed sequence tags (ORESTES). The 250,000 ORESTES were assembled into 81,429 contigs. Of these, 1,181 (1.45%) were found to match sequences in chromosome 22 with at least one ORESTES contig for 162 (65.6%) of the 247 known genes, for 67 (44.6%) of the 150 related genes, and for 45 of the 148 (30.4%) EST-predicted genes on this chromosome. Using a set of stringent criteria to validate our sequences, we identified a further 219 previously unannotated transcribed sequences on chromosome 22. Of these, 171 were in fact also defined by EST or full length cDNA sequences available in GenBank but not utilized in the initial annotation of the first human chromosome sequence. Thus despite representing less than 15% of all expressed human sequences in the public databases at the time of the present analysis, ORESTES sequences defined 48 transcribed sequences on chromosome 22 not defined by other sequences. All of the transcribed sequences defined by ORESTES coincided with DNA regions predicted as encoding exons by genscan (http://genes.mit.edu/GENSCAN.html). PMID:11070084
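The coverage figures quoted in the abstract above can be re-derived from its own counts. The following is only a minimal sketch of that arithmetic; the dictionary keys and the `percent` helper are illustrative names, not anything from the original study. Note that 67/150 rounds to 44.67%, so the 44.6% in the abstract appears to be truncated rather than rounded.

```python
# Counts taken directly from the abstract; the labels are illustrative only.
matches = {
    "ORESTES contigs matching chromosome 22": (1181, 81429),
    "known genes": (162, 247),
    "related genes": (67, 150),
    "EST-predicted genes": (45, 148),
}


def percent(hits, total):
    """Return hits as a percentage of total."""
    return 100.0 * hits / total


# Percentage covered in each category, rounded to two decimals.
coverage = {label: round(percent(h, t), 2) for label, (h, t) in matches.items()}

# 219 previously unannotated transcribed sequences, of which 171 were also
# defined by other EST or cDNA sequences in GenBank.
defined_only_by_orestes = 219 - 171

for label, value in coverage.items():
    print(f"{label}: {value}%")
print("defined only by ORESTES:", defined_only_by_orestes)
```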
Genetic analysis of tumorigenesis: XXXII. Localization of constitutionally amplified KRAS sequences to Chinese hamster chromosomes X and Y by in situ hybridization.

PubMed

Stenman, G; Anisowicz, A; Sager, R

1988-11-01

The KRAS gene is constitutionally amplified in the Chinese hamster. We have mapped the amplified sequences by in situ hybridization to two major sites on the X and Y chromosomes, Xq4 and Yp2. No autosomal site was detected despite a search under relaxed hybridization conditions. KRAS DNA is amplified about 50-fold compared to a human cell line known to have a diploid number of KRAS sequences, whereas mRNA expression is 5- to 10-fold lower than in normal human cells. While mRNA expression levels do not necessarily parallel gene copy number, the low expression level strongly suggests that the amplified sequences are transcriptionally silent. It is suggested that the amplified sequences arose from the original KRAS gene on chromosome 8 and that the KRAS sequences on the Y chromosome arose by X-Y recombination.
Expressed Sequence Tag Analysis of the Human Pathogen Paracoccidioides brasiliensis Yeast Phase: Identification of Putative Homologues of Candida albicans Virulence and Pathogenicity Genes

PubMed Central

Goldman, Gustavo H.; dos Reis Marques, Everaldo; Custódio Duarte Ribeiro, Diógenes; Ângelo de Souza Bernardes, Luciano; Quiapin, Andréa Carla; Vitorelli, Patrícia Marostica; Savoldi, Marcela; Semighini, Camile P.; de Oliveira, Regina C.; Nunes, Luiz R.; Travassos, Luiz R.; Puccia, Rosana; Batista, Wagner L.; Ferreira, Leslie Ecker; Moreira, Júlio C.; Bogossian, Ana Paula; Tekaia, Fredj; Nobrega, Marina Pasetto; Nobrega, Francisco G.; Goldman, Maria Helena S.

2003-01-01

Paracoccidioides brasiliensis, a thermodimorphic fungus, is the causative agent of the prevalent systemic mycosis in Latin America, paracoccidioidomycosis. We present here a survey of expressed genes in the yeast pathogenic phase of P. brasiliensis. We obtained 13,490 expressed sequence tags from both 5′ and 3′ ends. Clustering analysis yielded the partial sequences of 4,692 expressed genes that were functionally classified by similarity to known genes. We have identified several Candida albicans virulence and pathogenicity homologues in P. brasiliensis. Furthermore, we have analyzed the expression of some of these genes during the dimorphic yeast-mycelium-yeast transition by real-time quantitative reverse transcription-PCR. Clustering analysis of the mycelium-yeast transition revealed three groups: (i) RBT, hydrophobin, and isocitrate lyase; (ii) malate dehydrogenase, contigs Pb1067 and Pb1145, GPI, and alternative oxidase; and (iii) ubiquitin, delta-9-desaturase, HSP70, HSP82, and HSP104. The first two groups displayed high mRNA expression in the mycelial phase, whereas the third group showed higher mRNA expression in the yeast phase. Our results suggest the possible conservation of pathogenicity and virulence mechanisms among fungi, expand considerably gene identification in P. brasiliensis, and provide a broader basis for further progress in understanding its biological peculiarities. PMID:12582121
Mobile Genome Express (MGE): A comprehensive automatic genetic analyses pipeline with a mobile device.

PubMed

Yoon, Jun-Hee; Kim, Thomas W; Mendez, Pedro; Jablons, David M; Kim, Il-Jin

2017-01-01

The development of next-generation sequencing (NGS) technology allows the sequencing of whole exomes or genomes. However, data analysis is still the biggest bottleneck for its wide implementation. Most laboratories still depend on manual procedures for data handling and analyses, which translates into delays and decreased efficiency in the delivery of NGS results to doctors and patients. Thus, there is high demand for an automatic and easy-to-use NGS data analysis system. We developed a comprehensive, automatic genetic analysis controller named Mobile Genome Express (MGE) that works on smartphones or other mobile devices. MGE can handle all the steps of genetic analysis, such as sample information submission, sequencing run quality check from the sequencer, secured data transfer and results review. We sequenced an Actrometrix control DNA containing multiple proven human mutations using a targeted sequencing panel, and the whole analysis was managed by MGE and its data reviewing program, called ELECTRO. All steps were processed automatically except for the final sequencing review procedure with ELECTRO to confirm mutations. The data analysis process was completed within several hours. We confirmed that the mutations we identified were consistent with our previous results obtained by using multi-step, manual pipelines.
Parallel human genome analysis: microarray-based expression monitoring of 1000 genes.

PubMed Central

Schena, M; Shalon, D; Heller, R; Chai, A; Brown, P O; Davis, R W

1996-01-01

Microarrays containing 1046 human cDNAs of unknown sequence were printed on glass with high-speed robotics. These 1.0-cm² DNA "chips" were used to quantitatively monitor differential expression of the cognate human genes using a highly sensitive two-color hybridization assay. Array elements that displayed differential expression patterns under given experimental conditions were characterized by sequencing. The identification of known and novel heat shock and phorbol ester-regulated genes in human T cells demonstrates the sensitivity of the assay. Parallel gene analysis with microarrays provides a rapid and efficient method for large-scale human gene discovery. PMID:8855227
Positive Selection Underlies Faster-Z Evolution of Gene Expression in Birds

PubMed Central

Dean, Rebecca; Harrison, Peter W.; Wright, Alison E.; Zimmer, Fabian; Mank, Judith E.

2015-01-01

The elevated rate of evolution for genes on sex chromosomes compared with autosomes (Fast-X or Fast-Z evolution) can result either from positive selection in the heterogametic sex or from nonadaptive consequences of reduced relative effective population size. Recent work in birds suggests that Fast-Z of coding sequence is primarily due to relaxed purifying selection resulting from reduced relative effective population size. However, gene sequence and gene expression are often subject to distinct evolutionary pressures; therefore, we tested for Fast-Z in gene expression using next-generation RNA-sequencing data from multiple avian species. Similar to studies of Fast-Z in coding sequence, we recover clear signatures of Fast-Z in gene expression; however, in contrast to coding sequence, our data indicate that Fast-Z in expression is due to positive selection acting primarily in females. In the soma, where gene expression is highly correlated between the sexes, we detected Fast-Z in both sexes, although at a higher rate in females, suggesting that many positively selected expression changes in females are also expressed in males. In the gonad, where intersexual correlations in expression are much lower, we detected Fast-Z for female gene expression, but crucially, not males. This suggests that a large amount of expression variation is sex-specific in its effects within the gonad. Taken together, our results indicate that Fast-Z evolution of gene expression is the product of positive selection acting on recessive beneficial alleles in the heterogametic sex. More broadly, our analysis suggests that the adaptive potential of Z chromosome gene expression may be much greater than that of gene sequence, results which have important implications for the role of sex chromosomes in speciation and sexual selection. PMID:26067773
Characterization of the Structural Gene Promoter of Aedes aegypti Densovirus

PubMed Central

Ward, Todd W.; Kimmick, Michael W.; Afanasiev, Boris N.; Carlson, Jonathan O.

2001-01-01

Aedes aegypti densonucleosis virus (AeDNV) has two promoters that have been shown to be active by reporter gene expression analysis (B. N. Afanasiev, Y. V. Koslov, J. O. Carlson, and B. J. Beaty, Exp. Parasitol. 79:322–339, 1994). Northern blot analysis of cells infected with AeDNV revealed two transcripts 1,200 and 3,500 nucleotides in length that are assumed to express the structural protein (VP) gene and nonstructural protein genes, respectively. Primer extension was used to map the transcriptional start site of the structural protein gene. Surprisingly, the structural protein gene transcript began at an initiator consensus sequence, CAGT, 60 nucleotides upstream from the map unit 61 TATAA sequence previously thought to define the promoter. Constructs with the β-galactosidase gene fused to the structural protein gene were used to determine elements necessary for promoter function. Deletion or mutation of the initiator sequence, CAGT, reduced protein expression by 93%, whereas mutation of the TATAA sequence at map unit 61 had little effect. An additional open reading frame was observed upstream of the structural protein gene that can express β-galactosidase at a low level (20% of that of VP fusions). Expression of the AeDNV structural protein gene was shown to be stimulated by the major nonstructural protein NS1 (Afanasiev et al., Exp. Parasitol., 1994). To determine the sequences required for transactivation, expression of structural protein gene–β-galactosidase gene fusion constructs differing in AeDNV genome content was measured with and without NS1. The presence of NS1 led to an 8- to 10-fold increase in expression when either genomic end was present, compared to a 2-fold increase with a construct lacking the genomic ends. An even higher (37-fold) increase in expression occurred with both genomic ends present; however, this was in part due to template replication as shown by Southern blot analysis. These data indicate the location and importance of various elements necessary for efficient protein expression and transactivation from the structural protein gene promoter of AeDNV. PMID:11152505
miRanalyzer: a microRNA detection and analysis tool for next-generation sequencing experiments.

PubMed

Hackenberg, Michael; Sturm, Martin; Langenberger, David; Falcón-Pérez, Juan Manuel; Aransay, Ana M

2009-07-01

Next-generation sequencing now allows the sequencing of small RNA molecules and the estimation of their expression levels. Consequently, there will be a high demand for bioinformatics tools to cope with the several gigabytes of sequence data generated in each single deep-sequencing experiment. Given this need, we developed miRanalyzer, a web server tool for the analysis of deep-sequencing experiments for small RNAs. The web server tool requires a simple input file containing a list of unique reads and their copy numbers (expression levels). Using these data, miRanalyzer (i) detects all known microRNA sequences annotated in miRBase, (ii) finds all perfect matches against other libraries of transcribed sequences and (iii) predicts new microRNAs. The prediction of new microRNAs is an especially important point as there are many species with very few known microRNAs. Therefore, we implemented a highly accurate machine learning algorithm for the prediction of new microRNAs that reaches AUC values of 97.9% and recall values of up to 75% on unseen data. The web tool summarizes all the described steps in a single output page, which provides a comprehensive overview of the analysis, adding links to more detailed output pages for each analysis module. miRanalyzer is available at http://web.bioinformatics.cicbiogune.es/microRNA/.
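The input described above, a list of unique reads with their copy numbers, can be produced by collapsing identical reads. The sketch below shows one way to do that; the function names, the tab-separated layout and the example sequences are assumptions for illustration only, since the abstract does not specify the exact file format that miRanalyzer expects.

```python
from collections import Counter


def collapse_reads(reads):
    """Collapse raw reads into (sequence, copy_number) pairs.

    Reads are upper-cased and stripped; empty lines are ignored.
    Pairs are returned in descending order of copy number.
    """
    counts = Counter(r.strip().upper() for r in reads if r.strip())
    return counts.most_common()


def to_tab_lines(collapsed):
    """Format pairs as 'sequence<TAB>count' lines (an assumed layout)."""
    return [f"{seq}\t{count}" for seq, count in collapsed]


# Hypothetical short reads, for illustration only.
raw_reads = ["ACGTACGT", "acgtacgt", "TTGACCAA", "ACGTACGT", ""]
collapsed = collapse_reads(raw_reads)
for line in to_tab_lines(collapsed):
    print(line)
```

The copy number of each unique read then serves as its expression level, in the sense used by the abstract.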
Differential effects of simple repeating DNA sequences on gene expression from the SV40 early promoter.

PubMed

Amirhaeri, S; Wohlrab, F; Wells, R D

1995-02-17

The influence of simple repeat sequences, cloned into different positions relative to the SV40 early promoter/enhancer, on the transient expression of the chloramphenicol acetyltransferase (CAT) gene was investigated. Insertion of (G)29.(C)29 in either orientation into the 5'-untranslated region of the CAT gene reduced expression in CV-1 cells 50-100 fold when compared with controls with random sequence inserts. Analysis of CAT-specific mRNA levels demonstrated that the effect was due to a reduction of CAT mRNA production rather than to posttranscriptional events. In contrast, insertion of the same insert in either orientation upstream of the promoter-enhancer or downstream of the gene stimulated gene expression 2-3-fold. These effects could be reversed by cotransfection of a competitor plasmid carrying (G)25.(C)25 sequences. The results suggest that a G.C-binding transcription factor modulates gene expression in this system and that promoter strength can be regulated by providing protein-binding sites in trans. Although constructs containing longer tracts of alternating (C-G), (T-G), or (A-T) sequences inhibited CAT expression when inserted in the 5'-untranslated region of the CAT gene, the amount of CAT mRNA was unaffected. Hence, these inhibitions must be due to posttranscriptional events, presumably at the level of translation. These effects of microsatellite sequences on gene expression are discussed with respect to recent data on related simple repeat sequences which cause several human genetic diseases.
Gene discovery in the hamster: a comparative genomics approach for gene annotation by sequencing of hamster testis cDNAs

PubMed Central

Oduru, Sreedhar; Campbell, Janee L; Karri, SriTulasi; Hendry, William J; Khan, Shafiq A; Williams, Simon C

2003-01-01

Background: Complete genome annotation will likely be achieved through a combination of computer-based analysis of available genome sequences and direct experimental characterization of expressed regions of individual genomes. We have utilized a comparative genomics approach involving the sequencing of randomly selected hamster testis cDNAs to begin to identify genes not previously annotated on the human, mouse, rat and Fugu (pufferfish) genomes. Results: 735 distinct sequences were analyzed for their relatedness to known sequences in public databases. Eight of these sequences were derived from previously unidentified genes and expression of these genes in testis was confirmed by Northern blotting. The genomic locations of each sequence were mapped in human, mouse, rat and pufferfish, where applicable, and the structure of their cognate genes was derived using computer-based predictions, genomic comparisons and analysis of uncharacterized cDNA sequences from human and macaque. Conclusion: The use of a comparative genomics approach resulted in the identification of eight cDNAs that correspond to previously uncharacterized genes in the human genome. The proteins encoded by these genes included a new member of the kinesin superfamily, a SET/MYND-domain protein, and six proteins for which no specific function could be predicted. Each gene was expressed primarily in testis, suggesting that they may play roles in the development and/or function of testicular cells. PMID:12783626
Next Generation Sequencing at the University of Chicago Genomics Core

DOE Office of Scientific and Technical Information (OSTI.GOV)

Faber, Pieter

2013-04-24

The University of Chicago Genomics Core provides University of Chicago investigators (and external clients) with access to state-of-the-art genomics capabilities: next generation sequencing, Sanger sequencing/genotyping and microarrays (gene expression, genotyping, and methylation). The current presentation will highlight our capabilities in the area of ultra-high throughput sequencing analysis.
Informatics for RNA Sequencing: A Web Resource for Analysis on the Cloud

PubMed Central

Griffith, Malachi; Walker, Jason R.; Spies, Nicholas C.; Ainscough, Benjamin J.; Griffith, Obi L.

2015-01-01

Massively parallel RNA sequencing (RNA-seq) has rapidly become the assay of choice for interrogating RNA transcript abundance and diversity. This article provides a detailed introduction to fundamental RNA-seq molecular biology and informatics concepts. We make available open-access RNA-seq tutorials that cover cloud computing, tool installation, relevant file formats, reference genomes, transcriptome annotations, quality-control strategies, expression, differential expression, and alternative splicing analysis methods. These tutorials and additional training resources are accompanied by complete analysis pipelines and test datasets made available without encumbrance at www.rnaseq.wiki. PMID:26248053
Interferon-gamma of the giant panda (Ailuropoda melanoleuca): complementary DNA cloning, expression, and phylogenetic analysis.

PubMed

Tao, Yaqiong; Zeng, Bo; Xu, Liu; Yue, Bisong; Yang, Dong; Zou, Fangdong

2010-01-01

Interferon-gamma (IFN-gamma) is the only member of the type II IFN family and is vital in the regulation of immune and inflammatory responses. Herein we report the cloning, expression, and sequence analysis of IFN-gamma from the giant panda (Ailuropoda melanoleuca). The open reading frame of this gene is 501 base pairs in length and encodes a polypeptide consisting of 166 amino acids. All N-linked glycosylation sites and cysteine residues conserved among carnivores were found in the predicted amino acid sequence of the giant panda. Recombinant giant panda IFN-gamma with a V5 epitope and polyhistidine tag was expressed in HEK293 host cells and confirmed by Western blotting. Phylogenetic analysis of mammalian IFN-gamma-coding sequences indicated that the giant panda IFN-gamma was closest to that of carnivores, then to ungulates and dolphin, and shared a distant relationship with mouse and human. These results represent a first step into the study of IFN-gamma in giant panda.
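The relationship between the 501 bp open reading frame and the 166-residue polypeptide follows from codon arithmetic, assuming the reported ORF length includes the stop codon (the abstract does not say so explicitly). A minimal sketch, with `encoded_residues` as a hypothetical helper name:

```python
def encoded_residues(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of the given length in bp.

    Assumes the ORF length is a whole number of codons. If the length
    includes the stop codon, that codon encodes no residue.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# 501 bp = 167 codons; 166 sense codons plus one stop codon.
print(encoded_residues(501))
```

Under that assumption the two figures in the abstract agree: 501 / 3 = 167 codons, one of which is the stop codon.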
Comparative Analysis of Expressed Genes from Cacao Meristems Infected by Moniliophthora perniciosa

PubMed Central

Gesteira, Abelmon S.; Micheli, Fabienne; Carels, Nicolas; Da Silva, Aline C.; Gramacho, Karina P.; Schuster, Ivan; Macêdo, Joci N.; Pereira, Gonçalo A. G.; Cascardo, Júlio C. M.

2007-01-01

Background and Aims: Witches' broom disease is caused by the hemibiotrophic basidiomycete Moniliophthora perniciosa, and is one of the most important diseases of cacao in the western hemisphere. Because very little is known about the global process of such disease development, expressed sequence tags (ESTs) were used to identify genes expressed during the Theobroma cacao–Moniliophthora perniciosa interaction. Methods: Two cDNA libraries corresponding to the resistant (RT) and susceptible (SP) cacao–M. perniciosa interactions were constructed from total RNA, using the DB SMART Creator cDNA library kit (Clontech). Clones were randomly selected, sequenced from the 5′ end and analysed using bioinformatics tools including in silico analysis of the differential gene expression. Key Results: A total of 6884 ESTs were generated from the RT and SP cDNA libraries. These ESTs were composed of 2585 singlets and 341 contigs for a total of 2926 non-redundant sequences. The redundancy of the libraries was low and their specificity high when compared with the few other cacao libraries already published. Sequence analysis allowed the assignment of a putative functional category for 54% of sequences, whereas approx. 22% of sequences corresponded to unknown function and approx. 24% of sequences did not show any significant similarity with other proteins present in the database. Despite the similar overall distribution of the sequences in functional categories between the two libraries, qualitative differences were observed. Genes involved during the defence response to pathogen infection or in programmed cell death were identified, such as pathogenesis related-proteins, trypsin inhibitor or oxalate oxidase, and some of them showed an in silico differential expression between the resistant and the susceptible interactions. Conclusions: As far as is known this is the first EST resource from the cacao–M. perniciosa interaction and it is believed that it will provide a significant contribution to the understanding of the molecular mechanisms of the resistance and susceptibility of cacao to M. perniciosa, to develop strategies to control witches' broom, and as a source of polymorphism for molecular marker development and marker-assisted selection. PMID:17557832
Regulation of iron assimilation: nucleotide sequence analysis of an iron-regulated promoter from a fluorescent pseudomonad.

PubMed

O'Sullivan, D J; O'Gara, F

1991-08-01

An iron-regulated promoter was cloned on a 2.1 kb BglII fragment from Pseudomonas sp. strain M114 and fused to the lacZ reporter gene. Iron-regulated lacZ expression from the resulting construct (pSP1) in strain M114 was mediated via the Fur-like repressor which also regulates siderophore production in this strain. A 390 bp StuI-PstI internal fragment contained the necessary information for iron-regulated promoter expression. This fragment was sequenced and the initiation point for transcription was determined by primer extension analysis. The region directly upstream of the transcription start point contained no significant homology to known promoter consensus sequences. However, the -16 to -25 bp region contained homology to four other iron-regulated pseudomonad promoters. Deletion of bases downstream from the transcriptional start did not affect the iron-regulated expression of the promoter. The -37 and -43 bp regions exhibited some homology to the 19 bp Escherichia coli Fur-binding consensus sequence. When expressed in E. coli (via a cloned trans-acting factor from strain M114) lacZ expression from pSP1 was found to be regulated by iron. A region of greater than 77 bases but less than 131 upstream from the transcriptional start was found to be necessary for promoter activity, further suggesting that a transcriptional activator may be required for expression.
Identification of differentially expressed genes in cucumber (Cucumis sativus L.) root under waterlogging stress by digital gene expression profile.

PubMed

Qi, Xiao-Hua; Xu, Xue-Wen; Lin, Xiao-Jian; Zhang, Wen-Jie; Chen, Xue-Hao

2012-03-01

High-throughput tag-sequencing (Tag-seq) analysis based on the Solexa Genome Analyzer platform was applied to analyze the gene expression profile of cucumber plants at 5 time points over a 24 h period of waterlogging treatment. Approximately 5.8 million total clean sequence tags per library were obtained with 143013 distinct clean tag sequences. Approximately 23.69%-29.61% of the distinct clean tags were mapped unambiguously to the unigene database, and 53.78%-60.66% of the distinct clean tags were mapped to the cucumber genome database. Analysis of the differentially expressed genes revealed that most of the genes were down-regulated in the waterlogging stages, and that the differentially expressed genes were mainly linked to carbon metabolism, photosynthesis, reactive oxygen species generation/scavenging, and hormone synthesis/signaling. Finally, quantitative real-time polymerase chain reaction using nine genes independently verified the tag-mapped results. The present study reveals the comprehensive mechanisms of waterlogging-responsive transcription in cucumber. Copyright © 2011 Elsevier Inc. All rights reserved.
Analysis of codon usage in beta-tubulin sequences of helminths.

PubMed

von Samson-Himmelstjerna, G; Harder, A; Failing, K; Pape, M; Schnieder, T

2003-07-01

Codon usage bias has been shown to be correlated with gene expression levels in many organisms, including the nematode Caenorhabditis elegans. Here, the codon usage (cu) characteristics for a set of currently available beta-tubulin coding sequences of helminths were assessed by calculating several indices, including the effective codon number (Nc), the intrinsic codon deviation index (ICDI), the P2 value and the mutational response index (MRI). The P2 value gives a measure of translational pressure, which has been shown to be correlated to high gene expression levels in some organisms, but it has not yet been analysed in that respect in helminths. For all but two of the C. elegans beta-tubulin coding sequences investigated, the P2 value was the only index that indicated the presence of codon usage bias. Therefore, we propose that in general the helminth beta-tubulin sequences investigated here are not expressed at high levels. Furthermore, we calculated the correlation coefficients for the cu patterns of the helminth beta-tubulin sequences compared with those of highly expressed genes in organisms such as Escherichia coli and C. elegans. It was found that beta-tubulin cu patterns for all sequences of members of the Strongylida were significantly correlated to those for highly expressed C. elegans genes. This approach provides a new measure for comparing the adaptation of cu of a particular coding sequence with that of highly expressed genes in possible expression systems. Finally, using the cu patterns of the sequences studied, a phylogenetic tree was constructed. The topology of this tree was very much in concordance with that of a phylogeny based on small subunit ribosomal DNA sequence alignments.
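The comparison of codon usage patterns by correlation coefficient, described above, can be illustrated with a small sketch. It tabulates relative codon frequencies over all 64 codons for two coding sequences and computes a Pearson correlation between them. The abstract does not state which correlation coefficient or which normalization the authors used, so both are assumptions here, and the two sequences are hypothetical placeholders rather than real beta-tubulin genes.

```python
from itertools import product
from math import sqrt

# All 64 codons in a fixed order, so frequency vectors are comparable.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codon_frequencies(cds):
    """Relative frequency of each codon in an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = dict.fromkeys(CODONS, 0)
    for i in range(0, usable, 3):
        codon = cds[i:i + 3]
        if codon in counts:  # skip codons with ambiguous bases such as N
            counts[codon] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no valid codons found")
    return [counts[c] / total for c in CODONS]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


# Hypothetical sequences, for illustration only.
query = "ATGGCTGCTAAAGCTGAA"
reference = "ATGGCTAAAGCTGCTGAAGAA"
r = pearson(codon_frequencies(query), codon_frequencies(reference))
print(f"codon usage correlation: {r:.3f}")
```

In practice the reference vector would be pooled from a set of highly expressed genes of the candidate expression host, as the abstract suggests for E. coli and C. elegans.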
Sequence analysis of PROTEOLYSIS 6 from Solanum lycopersicum

NASA Astrophysics Data System (ADS)

Roslan, Nur Farhana; Chew, Bee Lyn; Goh, Hoe-Han; Isa, Nurulhikma Md

2018-04-01

The N-end rule pathway is a protein degradation pathway that relates the half-life of a protein to the identity of its N-terminal residue. A destabilizing N-terminal residue is created by enzymatic reactions or chemical modifications. The destabilized substrate is recognized by the PROTEOLYSIS 6 (PRT6) protein, an E3 ligase, which results in degradation of the substrate by the proteasome. PRT6 has been studied in Arabidopsis thaliana and barley but has not yet been studied in fleshy fruit plants. Hence, this study was carried out in tomato, which is known as the model for fleshy fruit plants. BLASTX analysis identified Solyc09g010830 as the PRT6 gene in tomato, based on its sequence similarity with PRT6 in A. thaliana. In silico gene expression analysis shows that the PRT6 gene was highly expressed in tomato fruits at breaker +5. Co-expression analysis shows that PRT6 may be involved not only in abiotic stresses but also in biotic stresses. The objective is to analyze the sequence and characterize the PRT6 gene in tomato.
Genome-Scale Transcriptome Analysis in Response to Nitric Oxide in Birch Cells: Implications of the Triterpene Biosynthetic Pathway

PubMed Central

Zeng, Fansuo; Sun, Fengkun; Li, Leilei; Liu, Kun; Zhan, Yaguang

2014-01-01

Evidence supporting nitric oxide (NO) as a mediator of plant biochemistry continues to grow, but its functions at the molecular level remain poorly understood and, in some cases, controversial. To study the role of NO at the transcriptional level in Betula platyphylla cells, we conducted a genome-scale transcriptome analysis of these cells. The transcriptomes of untreated birch cells and those treated with sodium nitroprusside (SNP) were analyzed using Solexa sequencing. Data were collected by sequencing cDNA libraries of birch cells, which had a long period to adapt to the suspension culture conditions before SNP-treated cells and untreated cells were sampled. Among the 34,100 UniGenes detected, BLASTX search revealed that 20,631 genes showed significant (E-value ≤ 10⁻⁵) sequence similarity with proteins from the NR-database. Numerous expressed sequence tags (i.e., 1374) were identified as differentially expressed between the 12 h SNP-treated cells and control cell samples: 403 up-regulated and 971 down-regulated. From this, we specifically examined a core set of NO-related transcripts. The altered expression levels of several transcripts, as determined by transcriptome analysis, were confirmed by qRT-PCR. The results of transcriptome analysis, gene expression quantification, triterpenoid content and activities of defensive enzymes indicate that NO has a significant effect on many processes including triterpenoid production, carbohydrate metabolism and cell wall biosynthesis. PMID:25551661
Genome-wide identification, classification, and expression analysis of the arabinogalactan protein gene family in rice (Oryza sativa L.)

PubMed Central

Zhao, Jie

2010-01-01

Arabinogalactan proteins (AGPs) comprise a family of hydroxyproline-rich glycoproteins that are implicated in plant growth and development. In this study, 69 AGPs are identified from the rice genome, including 13 classical AGPs, 15 arabinogalactan (AG) peptides, three non-classical AGPs, three early nodulin-like AGPs (eNod-like AGPs), eight non-specific lipid transfer protein-like AGPs (nsLTP-like AGPs), and 27 fasciclin-like AGPs (FLAs). The results from expressed sequence tags, microarrays, and massively parallel signature sequencing tags are used to analyse the expression of AGP-encoding genes, which is confirmed by real-time PCR. The results reveal that several rice AGP-encoding genes are predominantly expressed in anthers and display differential expression patterns in response to abscisic acid, gibberellic acid, and abiotic stresses. Based on the results obtained from this analysis, an attempt has been made to link the protein structures and expression patterns of rice AGP-encoding genes to their functions. Taken together, the genome-wide identification and expression analysis of the rice AGP gene family might facilitate further functional studies of rice AGPs. PMID:20423940
Revealing impaired pathways in the an11 mutant by high-throughput characterization of Petunia axillaris and Petunia inflata transcriptomes.

PubMed

Zenoni, Sara; D'Agostino, Nunzio; Tornielli, Giovanni B; Quattrocchio, Francesca; Chiusano, Maria L; Koes, Ronald; Zethof, Jan; Guzzo, Flavia; Delledonne, Massimo; Frusciante, Luigi; Gerats, Tom; Pezzotti, Mario

2011-10-01

Petunia is an excellent model system, especially for genetic, physiological and molecular studies. Thus far, however, genome-wide expression analysis has been applied rarely because of the lack of sequence information. We applied next-generation sequencing to generate, through de novo read assembly, a large catalogue of transcripts for Petunia axillaris and Petunia inflata. On the basis of both transcriptomes, comprehensive microarray chips for gene expression analysis were established and used for the analysis of global- and organ-specific gene expression in Petunia axillaris and Petunia inflata and to explore the molecular basis of the seed coat defects in a Petunia hybrida mutant, anthocyanin 11 (an11), lacking a WD40-repeat (WDR) transcription regulator. Among the transcripts differentially expressed in an11 seeds compared with wild type, many expected targets of AN11 were found but also several interesting new candidates that might play a role in morphogenesis of the seed coat. Our results validate the combination of next-generation sequencing with microarray analysis strategies to identify the transcriptome of two petunia species without previous knowledge of their genome, and to develop comprehensive chips as useful tools for the analysis of gene expression in P. axillaris, P. inflata and P. hybrida. © 2011 The Authors. The Plant Journal © 2011 Blackwell Publishing Ltd.
Serial analysis of gene expression (SAGE) in normal human trabecular meshwork.

PubMed

Liu, Yutao; Munro, Drew; Layfield, David; Dellinger, Andrew; Walter, Jeffrey; Peterson, Katherine; Rickman, Catherine Bowes; Allingham, R Rand; Hauser, Michael A

2011-04-08

To identify the genes expressed in normal human trabecular meshwork tissue, a tissue critical to the pathogenesis of glaucoma. Total RNA was extracted from human trabecular meshwork (HTM) harvested from 3 different donors. Extracted RNA was used to synthesize individual SAGE (serial analysis of gene expression) libraries using the I-SAGE Long kit from Invitrogen. Libraries were analyzed using SAGE 2000 software to extract the 17 base pair sequence tags. The extracted sequence tags were mapped to the genome using SAGE Genie map. A total of 298,834 SAGE tags were identified from all HTM libraries (96,842, 88,126, and 113,866 tags, respectively). Collectively, there were 107,325 unique tags. There were 10,329 unique tags with a minimum of 2 counts from a single library. These tags were mapped to known unique Unigene clusters. Approximately 29% of the tags (orphan tags) did not map to a known Unigene cluster. Thirteen percent of the tags mapped to at least 2 Unigene clusters. Sequence tags from many glaucoma-related genes, including myocilin, optineurin, and WD repeat domain 36, were identified. This is the first time SAGE analysis has been used to characterize the gene expression profile in normal HTM. SAGE analysis provides an unbiased sampling of gene expression of the target tissue. These data will provide new and valuable information to improve understanding of the biology of human aqueous outflow.
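As a rough illustration of what a 17-base SAGE tag is, the sketch below derives the expected tag from a transcript sequence. It assumes the usual LongSAGE convention, in which the tag is the 17 bases immediately downstream of the 3'-most CATG (NlaIII) site; the function name and the example sequence are hypothetical, and this is not the SAGE 2000 or SAGE Genie procedure used in the study.

```python
def long_sage_tag(transcript, anchor="CATG", tag_len=17):
    """Return the tag_len bases that follow the 3'-most anchor site, or None.

    Assumption: the anchoring site is CATG and the tag excludes the site itself.
    Returns None when no site exists or fewer than tag_len bases follow it.
    """
    seq = transcript.upper()
    pos = seq.rfind(anchor)  # 3'-most occurrence of the anchoring site
    if pos == -1:
        return None
    start = pos + len(anchor)
    tag = seq[start:start + tag_len]
    return tag if len(tag) == tag_len else None


# Hypothetical transcript with two CATG sites; the tag follows the second one.
example_transcript = "AAACATGCCC" + "CATG" + "ACGTACGTACGTACGTA" + "GGG"
example_tag = long_sage_tag(example_transcript)
```

Tags predicted this way can be compared with observed tags; a tag with no matching transcript corresponds to what the abstract calls an orphan tag.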
Generation and analysis of expressed sequence tags from a cDNA library of the fruiting body of Ganoderma lucidum

PubMed Central

2010-01-01

Background Little genomic or transcriptomic information on Ganoderma lucidum (Lingzhi) is known. This study aims to discover the transcripts involved in secondary metabolite biosynthesis and developmental regulation of G. lucidum using an expressed sequence tag (EST) library. Methods A cDNA library was constructed from the G. lucidum fruiting body. Its high-quality ESTs were assembled into unique sequences comprising contigs and singletons. The unique sequences were annotated according to sequence similarities to genes or proteins available in public databases. The detection of simple sequence repeats (SSRs) was performed by online analysis. Results A total of 1,023 clones were randomly selected from the G. lucidum library and sequenced, yielding 879 high-quality ESTs. These ESTs showed similarities to a diverse range of genes. The sequences encoding squalene epoxidase (SE) and farnesyl-diphosphate synthase (FPS) were identified in this EST collection. Several candidate genes, such as hydrophobin, MOB2, profilin and PHO84, were detected for the first time in G. lucidum. Thirteen (13) potential SSR-motif microsatellite loci were also identified. Conclusion The present study demonstrates a successful application of EST analysis in the discovery of transcripts involved in the secondary metabolite biosynthesis and the developmental regulation of G. lucidum. PMID:20230644
Fluorescent in situ sequencing (FISSEQ) of RNA for gene expression profiling in intact cells and tissues

PubMed Central

Lee, Je Hyuk; Daugharthy, Evan R.; Scheiman, Jonathan; Kalhor, Reza; Ferrante, Thomas C.; Terry, Richard; Turczyk, Brian M.; Yang, Joyce L.; Lee, Ho Suk; Aach, John; Zhang, Kun; Church, George M.

2014-01-01

RNA sequencing measures the quantitative change in gene expression over the whole transcriptome, but it lacks spatial context. On the other hand, in situ hybridization provides the location of gene expression, but only for a small number of genes. Here we detail a protocol for genome-wide profiling of gene expression in situ in fixed cells and tissues, in which RNA is converted into cross-linked cDNA amplicons and sequenced manually on a confocal microscope. Unlike traditional RNA-seq our method enriches for context-specific transcripts over house-keeping and/or structural RNA, and it preserves the tissue architecture for RNA localization studies. Our protocol is written for researchers experienced in cell microscopy with minimal computing skills. Library construction and sequencing can be completed within 14 d, with image analysis requiring an additional 2 d. PMID:25675209

End Joining-Mediated Gene Expression in Mammalian Cells Using PCR-Amplified DNA Constructs that Contain Terminator in Front of Promoter.

PubMed

Nakamura, Mikiko; Suzuki, Ayako; Akada, Junko; Tomiyoshi, Keisuke; Hoshida, Hisashi; Akada, Rinji

2015-12-01

Mammalian gene expression constructs are generally prepared in a plasmid vector, in which a promoter and terminator are located upstream and downstream of a protein-coding sequence, respectively. In this study, we found that front terminator constructs (DNA constructs containing a terminator upstream of a promoter rather than downstream of a coding region) could sufficiently express proteins as a result of end joining of the introduced DNA fragment. By taking advantage of front terminator constructs, FLAG substitutions and deletions were generated using mutagenesis primers to identify amino acids specifically recognized by commercial FLAG antibodies. A minimal epitope sequence for polyclonal FLAG antibody recognition was also identified. In addition, we analyzed the sequence of a C-terminal Ser-Lys-Leu peroxisome localization signal and identified the key residues necessary for peroxisome targeting. Moreover, front terminator constructs of hepatitis B surface antigen were used for deletion analysis, leading to the identification of regions required for particle formation. Collectively, these results indicate that front terminator constructs allow for easy manipulation of C-terminal protein-coding sequences, and suggest that direct gene expression with PCR-amplified DNA is useful for high-throughput protein analysis in mammalian cells.
Characterizing differential gene expression in polyploid grasses lacking a reference transcriptome

USDA-ARS's Scientific Manuscript database

Basal transcriptome characterization and differential gene expression in response to varying conditions are often addressed through next generation sequencing (NGS) and data analysis techniques. While these strategies are commonly used, there are countless tools, pipelines, data analysis methods an...
Genes encoding calmodulin-binding proteins in the Arabidopsis genome

NASA Technical Reports Server (NTRS)

Reddy, Vaka S.; Ali, Gul S.; Reddy, Anireddy S N.

2002-01-01

Analysis of the recently completed Arabidopsis genome sequence indicates that approximately 31% of the predicted genes could not be assigned to functional categories, as they do not show any sequence similarity with proteins of known function from other organisms. Calmodulin (CaM), a ubiquitous and multifunctional Ca(2+) sensor, interacts with a wide variety of cellular proteins and modulates their activity/function in regulating diverse cellular processes. However, the primary amino acid sequence of the CaM-binding domain in different CaM-binding proteins (CBPs) is not conserved. One way to identify most of the CBPs in the Arabidopsis genome is by protein-protein interaction-based screening of expression libraries with CaM. Here, using a mixture of radiolabeled CaM isoforms from Arabidopsis, we screened several expression libraries prepared from flower meristem, seedlings, or tissues treated with hormones, an elicitor, or a pathogen. Sequence analysis of 77 positive clones that interact with CaM in a Ca(2+)-dependent manner revealed 20 CBPs, including 14 previously unknown CBPs. In addition, by searching the Arabidopsis genome sequence with the newly identified and known plant or animal CBPs, we identified a total of 27 CBPs. Among these, 16 CBPs are represented by families with 2-20 members in each family. Gene expression analysis revealed that CBPs and CBP paralogs are expressed differentially. Our data suggest that Arabidopsis has a large number of CBPs including several plant-specific ones. Although CaM is highly conserved between plants and animals, only a few CBPs are common to both plants and animals. Analysis of Arabidopsis CBPs revealed the presence of a variety of interesting domains. Our analyses identified several hypothetical proteins in the Arabidopsis genome as CaM targets, suggesting their involvement in Ca(2+)-mediated signaling networks.
Deep sequencing reveals cell-type-specific patterns of single-cell transcriptome variation.

PubMed

Dueck, Hannah; Khaladkar, Mugdha; Kim, Tae Kyung; Spaethling, Jennifer M; Francis, Chantal; Suresh, Sangita; Fisher, Stephen A; Seale, Patrick; Beck, Sheryl G; Bartfai, Tamas; Kuhn, Bernhard; Eberwine, James; Kim, Junhyong

2015-06-09

Differentiation of metazoan cells requires execution of different gene expression programs, but recent single-cell transcriptome profiling has revealed considerable variation within cells of seemingly identical phenotype. This brings into question the relationship between transcriptome states and cell phenotypes. Additionally, single-cell transcriptomics presents unique analysis challenges that need to be addressed to answer this question. We present high quality deep read-depth single-cell RNA sequencing for 91 cells from five mouse tissues and 18 cells from two rat tissues, along with 30 control samples of bulk RNA diluted to single-cell levels. We find that transcriptomes differ globally across tissues with regard to the number of genes expressed, the average expression patterns, and within-cell-type variation patterns. We develop methods to filter genes for reliable quantification and to calibrate biological variation. All cell types include genes with high variability in expression, in a tissue-specific manner. We also find evidence that single-cell variability of neuronal genes in mice is correlated with that in rats, consistent with the hypothesis that levels of variation may be conserved. Single-cell RNA-sequencing data provide a unique view of transcriptome function; however, careful analysis is required in order to use single-cell RNA-sequencing measurements for this purpose. Technical variation must be considered in single-cell RNA-sequencing studies of expression variation. For a subset of genes, biological variability within each cell type appears to be regulated in order to perform dynamic functions, rather than being solely molecular noise.
Content of intrinsic disorder influences the outcome of cell-free protein synthesis.

PubMed

Tokmakov, Alexander A; Kurotani, Atsushi; Ikeda, Mariko; Terazawa, Yumiko; Shirouzu, Mikako; Stefanov, Vasily; Sakurai, Tetsuya; Yokoyama, Shigeyuki

2015-09-11

Cell-free protein synthesis is used to produce proteins with various structural traits. Recent bioinformatics analyses indicate that more than half of eukaryotic proteins possess long intrinsically disordered regions. However, no systematic study concerning the connection between intrinsic disorder and expression success of cell-free protein synthesis has been presented until now. To address this issue, we examined correlations of the experimentally observed cell-free protein expression yields with the contents of intrinsic disorder bioinformatically predicted in the expressed sequences. This analysis revealed strong relationships between intrinsic disorder and protein amenability to heterologous cell-free expression. On the one hand, elevated disorder content was associated with the increased ratio of soluble expression. On the other hand, overall propensity for detectable protein expression decreased with disorder content. We further demonstrated that these tendencies are rooted in some distinct features of intrinsically disordered regions, such as low hydrophobicity, elevated surface accessibility and high abundance of sequence motifs for proteolytic degradation, including sites of ubiquitination and PEST sequences. Our findings suggest that identification of intrinsically disordered regions in the expressed amino acid sequences can be of practical use for predicting expression success and optimizing cell-free protein synthesis.
CORNAS: coverage-dependent RNA-Seq analysis of gene expression data without biological replicates.

PubMed

Low, Joel Z B; Khang, Tsung Fei; Tammi, Martti T

2017-12-28

In current statistical methods for calling differentially expressed genes in RNA-Seq experiments, the assumption is that an adjusted observed gene count represents an unknown true gene count. This adjustment usually consists of a normalization step to account for heterogeneous sample library sizes, and then the resulting normalized gene counts are used as input for parametric or non-parametric differential gene expression tests. A distribution of true gene counts, each with a different probability, can result in the same observed gene count. Importantly, sequencing coverage information is currently not explicitly incorporated into any of the statistical models used for RNA-Seq analysis. We developed a fast Bayesian method which uses the sequencing coverage information determined from the concentration of an RNA sample to estimate the posterior distribution of a true gene count. Our method has better or comparable performance compared to NOISeq and GFOLD, according to the results from simulations and experiments with real unreplicated data. We incorporated a previously unused sequencing coverage parameter into a procedure for differential gene expression analysis with RNA-Seq data. Our results suggest that our method can be used to overcome analytical bottlenecks in experiments with limited number of replicates and low sequencing coverage. The method is implemented in CORNAS (Coverage-dependent RNA-Seq), and is available at https://github.com/joel-lzb/CORNAS .
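The idea that many true gene counts can give rise to the same observed count can be made concrete with a toy calculation. The sketch below is not the CORNAS model: it assumes, purely for illustration, that each true transcript is sequenced independently with probability equal to the coverage (binomial sampling) and that the prior over true counts is flat. Under those assumptions the posterior is P(N | k) = C(N, k) p^(k+1) (1 - p)^(N - k) for N ≥ k. The function name and parameter values are hypothetical.

```python
from math import exp, lgamma, log


def posterior_true_count(observed, coverage, max_extra=500):
    """Posterior P(true count = N | observed count) for N = observed .. observed + max_extra.

    Toy model (an assumption, not the published method): binomial sampling with
    success probability `coverage` and a flat prior on the true count.
    """
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must lie strictly between 0 and 1")
    posterior = {}
    for extra in range(max_extra + 1):
        n = observed + extra
        # log of C(n, observed) * coverage^(observed + 1) * (1 - coverage)^extra
        log_p = (
            lgamma(n + 1) - lgamma(observed + 1) - lgamma(extra + 1)
            + (observed + 1) * log(coverage)
            + extra * log(1.0 - coverage)
        )
        posterior[n] = exp(log_p)
    return posterior


# Hypothetical example: 10 observed reads at 50% coverage.
example_posterior = posterior_true_count(observed=10, coverage=0.5)
posterior_mean = sum(n * p for n, p in example_posterior.items())
```

In this toy model, lower coverage spreads the posterior over a wider range of true counts, which is why two genes with the same observed count need not be equally well supported.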
Reduced expression of APC-1B but not APC-1A by the deletion of promoter 1B is responsible for familial adenomatous polyposis.

PubMed

Yamaguchi, Kiyoshi; Nagayama, Satoshi; Shimizu, Eigo; Komura, Mitsuhiro; Yamaguchi, Rui; Shibuya, Tetsuo; Arai, Masami; Hatakeyama, Seira; Ikenoue, Tsuneo; Ueno, Masashi; Miyano, Satoru; Imoto, Seiya; Furukawa, Yoichi

2016-05-24

Germline mutations in the tumor suppressor gene APC are associated with familial adenomatous polyposis (FAP). Here we applied whole-genome sequencing (WGS) to the DNA of a sporadic FAP patient in which we did not find any pathological APC mutations by direct sequencing. WGS identified a promoter deletion of approximately 10 kb encompassing promoter 1B and exon1B of APC. Additional allele-specific expression analysis by deep cDNA sequencing revealed that the deletion reduced the expression of the mutated APC allele to as low as 11.2% in the total APC transcripts, suggesting that the residual mutant transcripts were driven by other promoter(s). Furthermore, cap analysis of gene expression (CAGE) demonstrated that the deleted promoter 1B region is responsible for the great majority of APC transcription in many tissues except the brain. The deletion decreased the transcripts of APC-1B to 39-45% in the patient compared to the healthy controls, but it did not decrease those of APC-1A. Different deletions including promoter 1B have been reported in FAP patients. Taken together, our results strengthen the evidence that analysis of structural variations in promoter 1B should be considered for the FAP patients whose pathological mutations are not identified by conventional direct sequencing.
Characterization of the glutathione S-transferase gene family through ESTs and expression analyses within common and pigmented cultivars of Citrus sinensis (L.) Osbeck.

PubMed

Licciardello, Concetta; D'Agostino, Nunzio; Traini, Alessandra; Recupero, Giuseppe Reforgiato; Frusciante, Luigi; Chiusano, Maria Luisa

2014-02-03

Glutathione S-transferases (GSTs) represent a ubiquitous gene family encoding detoxification enzymes able to recognize reactive electrophilic xenobiotic molecules as well as compounds of endogenous origin. Anthocyanin pigments require GSTs for their transport into the vacuole since their cytoplasmic retention is toxic to the cell. Anthocyanin accumulation in Citrus sinensis (L.) Osbeck fruit flesh determines different phenotypes affecting the typical pigmentation of Sicilian blood oranges. In this paper we describe: i) the characterization of the GST gene family in C. sinensis through a systematic EST analysis; ii) the validation of the EST assembly by exploiting the genome sequences of C. sinensis and C. clementina and their genome annotations; iii) GST gene expression profiling in six tissues/organs and in two different sweet orange cultivars, Cadenera (common) and Moro (pigmented). We identified 61 GST transcripts, described the full- or partial-length nature of the sequences and assigned to each sequence the GST class membership exploiting a comparative approach and the classification scheme proposed for plant species. A total of 23 full-length sequences were defined. Fifty-four of the 61 transcripts were successfully aligned to the C. sinensis and C. clementina genomes. Tissue specific expression profiling demonstrated that the expression of some GST transcripts was 'tissue-affected' and cultivar specific. A comparative analysis of C. sinensis GSTs with those from other plant species was also considered. Data from the current analysis are accessible at http://biosrv.cab.unina.it/citrusGST/, with the aim to provide a reference resource for C. sinensis GSTs. This study aimed at the characterization of the GST gene family in C. sinensis. 
Based on expression patterns from two different cultivars and on sequence-comparative analyses, we also highlighted that two sequences, a Phi class GST and a Mapeg class GST, could be involved in the conjugation of anthocyanin pigments and in their transport into the vacuole, specifically in fruit flesh of the pigmented cultivar.
Isolation, sequence identification and tissue expression profiles of 3 novel porcine genes: ASPA, NAGA, and HEXA.

PubMed

Shu, Xianghua; Liu, Yonggang; Yang, Liangyu; Song, Chunlian; Hou, Jiafa

2008-01-01

The complete coding sequences of 3 porcine genes - ASPA, NAGA, and HEXA - were amplified by the reverse transcriptase polymerase chain reaction (RT-PCR) based on the conserved sequence information of the mouse or other mammals and referenced pig ESTs. These 3 novel porcine genes were then deposited in the NCBI database and assigned GeneIDs: 100142661, 100142664 and 100142667. The phylogenetic tree analysis revealed that the porcine ASPA, NAGA, and HEXA all have closer genetic relationships with the ASPA, NAGA, and HEXA of cattle. Tissue expression profile analysis was also carried out and results revealed that swine ASPA, NAGA, and HEXA genes were differentially expressed in various organs, including skeletal muscle, the heart, liver, fat, kidney, lung, and small and large intestines. Our experiment is the first one to establish the foundation for further research on these 3 swine genes.
Analysis of SSR information in EST resources of sugarcane

USDA-ARS's Scientific Manuscript database

Expressed sequence tags (ESTs) offer the opportunity to exploit single, low-copy, conserved sequence motifs for the development of simple sequence repeats (SSRs). The total of 262,113 ESTs of sugarcane (Saccharum officinarum) in the database of NCBI were downloaded and analyzed, which resulted in...
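A minimal sketch of how SSRs can be located in an EST sequence is given below. It uses a regular expression with a back-reference to find perfect tandem repeats. The minimum repeat numbers (6 for dinucleotides, 5 for trinucleotides, 4 for tetranucleotides) are assumptions chosen for illustration, not the criteria of the study, and the function name is hypothetical.

```python
import re


def find_ssrs(sequence, min_repeats=None):
    """Return (start, unit, repeat_count) for perfect tandem repeats in a DNA sequence.

    min_repeats maps repeat-unit length to the minimum number of tandem copies;
    the defaults are illustrative assumptions.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}
    sequence = sequence.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A unit of unit_len bases followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            unit = match.group(1)
            # Skip units that are themselves repeats of a shorter unit (AA, ATAT, ...).
            if any(
                unit == unit[:d] * (unit_len // d)
                for d in range(1, unit_len)
                if unit_len % d == 0
            ):
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return hits


# Hypothetical EST fragment containing (AT)7 and (CAG)5.
example_est = "GGC" + "AT" * 7 + "CCT" + "CAG" * 5 + "TT"
example_ssrs = find_ssrs(example_est)
```

Positions are 0-based offsets into the sequence; imperfect or compound repeats are not detected by this simple pattern.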
Cell-free translational screening of an expression sequence tag library of Clonorchis sinensis for novel antigen discovery.

PubMed

Kasi, Devi; Catherine, Christy; Lee, Seung-Won; Lee, Kyung-Ho; Kim, Yu Jung; Ro Lee, Myeong; Ju, Jung Won; Kim, Dong-Myung

2017-05-01

The rapidly evolving cloning and sequencing technologies have enabled understanding of genomic structure of parasite genomes, opening up new ways of combatting parasite-related diseases. To make the most of the exponentially accumulating genomic data, however, it is crucial to analyze the proteins encoded by these genomic sequences. In this study, we adopted an engineered cell-free protein synthesis system for large-scale expression screening of an expression sequence tag (EST) library of Clonorchis sinensis to identify potential antigens that can be used for diagnosis and treatment of clonorchiasis. To allow high-throughput expression and identification of individual genes comprising the library, a cell-free synthesis reaction was designed such that both the template DNA and the expressed proteins were co-immobilized on the same microbeads, leading to microbead-based linkage of the genotype and phenotype. This reaction configuration allowed streamlined expression, recovery, and analysis of proteins. This approach enabled us to identify 21 antigenic proteins. © 2017 American Institute of Chemical Engineers Biotechnol. Prog., 33:832-837, 2017.
Geoseq: a tool for dissecting deep-sequencing datasets.

PubMed

Gurtowski, James; Cancio, Anthony; Shah, Hardik; Levovitz, Chaya; George, Ajish; Homann, Robert; Sachidanandam, Ravi

2010-10-12

Datasets generated on deep-sequencing platforms have been deposited in various public repositories such as the Gene Expression Omnibus (GEO) and Sequence Read Archive (SRA) hosted by the NCBI, or the DNA Data Bank of Japan (DDBJ). Despite being rich data sources, they have not been used much due to the difficulty in locating and analyzing datasets of interest. Geoseq (http://geoseq.mssm.edu) provides a new method of analyzing short reads from deep-sequencing experiments. Instead of mapping the reads to reference genomes or sequences, Geoseq maps a reference sequence against the sequencing data. It is web-based and holds pre-computed data from public libraries. The analysis reduces the input sequence to tiles and measures the coverage of each tile in a sequence library through the use of suffix arrays. The user can upload custom target sequences or use gene/miRNA names for the search and get back results as plots and spreadsheet files. Geoseq organizes the public sequencing data using a controlled vocabulary, allowing identification of relevant libraries by organism, tissue and type of experiment. Analysis of small sets of sequences against deep-sequencing datasets, as well as identification of public datasets of interest, is simplified by Geoseq. We applied Geoseq to (a) identify differential isoform expression in mRNA-seq datasets, (b) identify miRNAs (microRNAs) in libraries and identify mature and star sequences in miRNAs, and (c) identify potentially mis-annotated miRNAs. The ease of using Geoseq for these analyses suggests its utility and uniqueness as an analysis tool.
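The tiling idea described above can be sketched in a few lines. This is an illustration only: Geoseq is reported to use suffix arrays over the read library, whereas the sketch below swaps in a plain substring scan, which gives the same counts on small inputs but would be far too slow for real libraries. The function name, tile length, step and example reads are all hypothetical.

```python
def tile_coverage(reference, reads, tile_len=20, step=10):
    """Cut the reference into tiles and count the reads that contain each tile exactly.

    Returns a list of (tile_start, read_count) pairs. A naive substring scan
    stands in for the suffix-array lookup described in the abstract.
    """
    reference = reference.upper()
    reads = [read.upper() for read in reads]
    coverage = []
    for start in range(0, len(reference) - tile_len + 1, step):
        tile = reference[start:start + tile_len]
        count = sum(1 for read in reads if tile in read)
        coverage.append((start, count))
    return coverage


# Hypothetical reference and reads (not real data).
example_reference = "ACGTACGGTCAATG"
example_reads = ["TTACGTAC", "ACGGTCAA", "GGGG", "ACGTACGG"]
example_coverage = tile_coverage(example_reference, example_reads, tile_len=4, step=4)
```

Plotting the counts against tile start position gives a coverage profile along the query sequence; uneven profiles across tiles of one gene are the kind of signal that could point to differential isoform usage.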
Too much data, but little inter-changeability: a lesson learned from mining public data on tissue specificity of gene expression.

PubMed

Li, Shuyu; Li, Yiqun Helen; Wei, Tao; Su, Eric Wen; Duffin, Kevin; Liao, Birong

2006-10-25

The tissue expression pattern of a gene often provides an important clue to its potential role in a biological process. A vast amount of gene expression data has been and is being accumulated in public repositories through different technology platforms. However, exploitation of these rich data sources remains limited, in part due to issues of technology standardization. Our objective is to test the data comparability between SAGE and microarray technologies by examining the expression pattern of genes under normal physiological states across a variety of tissues. There are 42-54% of genes showing significant correlations in tissue expression patterns between SAGE and GeneChip, with 30-40% of genes whose expression patterns are positively correlated and 10-15% of genes whose expression patterns are negatively correlated at a statistically significant level (p = 0.05). Our analysis suggests that the discrepancy in the expression patterns derived from the two technology platforms is not likely to result from the heterogeneity of tissues used in these technologies, or from other spurious correlations resulting from microarray probe design, abundance of genes, or gene function. The discrepancy can be partially explained by errors in the original assignment of SAGE tags to genes due to the evolution of sequence databases. In addition, sequence analysis has indicated that many SAGE tags and Affymetrix array probe sets are mapped to different splice variants or different sequence regions although they represent the same gene, which also contributes to the observed discrepancies between SAGE and array expression data. To our knowledge, this is the first report attempting to mine gene expression patterns across tissues using public data from different technology platforms. Unlike previous similar studies that only demonstrated the discrepancies between the two gene expression platforms, we carried out in-depth analysis to further investigate the cause of such discrepancies. 
Our study shows that the exploitation of rich public expression resources requires extensive knowledge about the technologies and experiments. Informatic methodologies for better interoperability among platforms remain a gap. One area that can be improved in practice is the accurate sequence mapping of SAGE tags and array probes to full-length genes.
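The per-gene comparison described in this record can be illustrated with a Pearson correlation of one gene's expression across the same tissues on two platforms. The sketch below is an assumption about how such a comparison might be set up, not the authors' procedure: it computes only the correlation coefficient and leaves out the significance test, and the function names and expression values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Returns None when either sequence has zero variance.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def cross_platform_correlations(sage_profiles, array_profiles):
    """Correlate tissue profiles gene by gene for genes present on both platforms."""
    shared = sorted(set(sage_profiles) & set(array_profiles))
    return {gene: pearson(sage_profiles[gene], array_profiles[gene]) for gene in shared}


# Hypothetical expression values across four tissues (same tissue order on both platforms).
example_sage = {"geneA": [5, 20, 1, 40], "geneB": [10, 2, 30, 4], "geneC": [3, 3, 3, 3]}
example_array = {"geneA": [50, 210, 12, 390], "geneB": [40, 300, 20, 250]}
example_r = cross_platform_correlations(example_sage, example_array)
```

In this made-up example geneA is positively correlated between platforms and geneB negatively correlated, mirroring the two categories reported in the abstract; genes measured on only one platform are skipped.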
Early Detection of NSCLC Using Stromal Markers in Peripheral Blood

DTIC Science & Technology

2016-09-01

circulating myeloid cells, flow cytometry, RNA-sequencing, expression profiling. 3. ACCOMPLISHMENTS: What were the major goals of the project...Subtask 2: Flow cytometry sorting of circulating myeloid cells. Subtask 3: RNA-Sequencing Subtask 4: RNA-seq data analysis Subtask 5: Feasible RT-PCR...accomplished the patient recruitment, flow cytometry sorting of circulating myeloid cells, RNA-sequencing of the samples. During the RNA-seq data analysis, we
Identification, characterization and functional analysis of regulatory region of nanos gene from half-smooth tongue sole (Cynoglossus semilaevis).

PubMed

Huang, Jinqiang; Li, Yongjuan; Shao, Changwei; Wang, Na; Chen, Songlin

2017-06-20

The nanos gene encodes an RNA-binding zinc finger protein, which is required in the development and maintenance of germ cells. However, there is very limited information about nanos in flatfish, which impedes its application in fish breeding. In this study, we report the molecular cloning, characterization and functional analysis of the 3'-untranslated region of the nanos gene (Csnanos) from half-smooth tongue sole (Cynoglossus semilaevis), which is an economically important flatfish in China. The 1233-bp cDNA sequence, 1709-bp genomic sequence and flanking sequences (2.8-kb 5'- and 1.6-kb 3'-flanking regions) of Csnanos were cloned and characterized. Sequence analysis revealed that CsNanos shares low homology with Nanos in other species, but the zinc finger domain of CsNanos is highly similar. Phylogenetic analysis indicated that CsNanos belongs to the Nanos2 subfamily. Csnanos expression was widely detected in various tissues, but the expression level was higher in testis and ovary. During early development and sex differentiation, Csnanos expression exhibited a clear sexually dimorphic pattern, suggesting different roles in the migration and differentiation of primordial germ cells (PGCs). Higher expression levels of Csnanos mRNA in normal females and males than in neomales indicated that the nanos gene may play key roles in maintaining gonad differentiation. Moreover, medaka PGCs were successfully labeled by the microinjection of synthesized mRNA consisting of green fluorescent protein and the 3'-untranslated region of Csnanos. These findings provide new insights into nanos gene expression and function, and lay the foundation for further study of PGC development and applications in tongue sole breeding.
Sequence and expression variation in SUPPRESSOR of OVEREXPRESSION of CONSTANS 1 (SOC1): homeolog evolution in Indian Brassicas.

PubMed

Sri, Tanu; Mayee, Pratiksha; Singh, Anandita

2015-09-01

Whole genome sequence analyses allow the unravelling of evolutionary consequences of the meso-triplication event in Brassicaceae (∼14-20 million years ago (MYA)), such as differential gene fractionation and diversification in homeologous sub-genomes. This study presents a simple gene-centric approach involving microsynteny and natural genetic variation analysis for understanding SUPPRESSOR of OVEREXPRESSION of CONSTANS 1 (SOC1) homeolog evolution in Brassica. Analysis of microsynteny in Brassica rapa homeologous regions containing SOC1 revealed differential gene fractionation correlating to the reported fractionation status of the sub-genomes of origin, viz. least fractionated (LF), moderately fractionated 1 (MF1) and most fractionated (MF2), respectively. Screening 18 cultivars of 6 Brassica species led to the identification of 8 genomic and 27 transcript variants of SOC1, including splice-forms. Co-occurrence of both interrupted and intronless SOC1 genes was detected in a few Brassica species. In silico analysis characterised Brassica SOC1 as a MADS intervening, K-box, C-terminal (MIKC(C)) transcription factor, with highly conserved MADS and I domains relative to the K-box and C-terminal domains. Phylogenetic analyses and multiple sequence alignments depicting a shared pattern of silent/non-silent mutations assigned Brassica SOC1 homologs into groups based on their shared diploid base genome. In addition, a sub-genome structure in uncharacterised Brassica genomes was inferred. Expression analysis of putative MF2 and LF (Brassica diploid base genome A (AA)) sub-genome-specific SOC1 homeologs of Brassica juncea revealed near-identical expression patterns. However, the MF2-specific homeolog exhibited significantly higher expression, implying regulatory diversification. In conclusion, evidence for polyploidy-induced sequence and regulatory evolution in Brassica SOC1 is presented, wherein differential homeolog expression is implicated in functional diversification.
Evolutionarily conserved ELOVL4 gene expression in the vertebrate retina.

PubMed

Lagali, Pamela S; Liu, Jiafan; Ambasudhan, Rajesh; Kakuk, Laura E; Bernstein, Steven L; Seigel, Gail M; Wong, Paul W; Ayyagari, Radha

2003-07-01

The gene elongation of very long chain fatty acids-4 (ELOVL4) has been shown to underlie phenotypically heterogeneous forms of autosomal dominant macular degeneration. In this study, the extent of evolutionary conservation and the existence and localization of retinal expression of this gene was investigated across a wide variety of species. Southern blot analysis of genomic DNA and bioinformatic analysis using the human ELOVL4 cDNA and protein sequences, respectively, were performed to identify species in which ELOVL4 orthologues and/or homologues are present. Retinal RNA and protein extracts derived from different species were assessed by Northern hybridization and immunoblot techniques to assess evolutionary conservation of gene expression. Immunohistochemical analysis of tissue sections prepared from various mammalian retinas was performed to determine the distribution of ELOVL4 and homologous proteins within specific retinal cell layers. The existence of ELOVL4 sequence orthologues and homologues was confirmed by both Southern blot analysis and in silico searches of protein sequence databases. Phylogenetic analysis places ELOVL4 among a large family of known and putative fatty acid elongase proteins. Northern blot analysis revealed the presence of multiple transcripts corresponding to ELOVL4 homologues expressed in the retina of several different mammalian species. Conserved proteins were also detected among retinal extracts of different mammals and were found to localize predominantly to the photoreceptor cell layer within retinal tissue preparations. The ELOVL4 gene is highly conserved throughout evolution and is expressed in the photoreceptor cells of the retina in a variety of different species, which suggests that it plays a critical role in retinal cell biology.
Transcriptome Sequencing of Gracilariopsis lemaneiformis to Analyze the Genes Related to Optically Active Phycoerythrin Synthesis.

PubMed

Huang, Xiaoyun; Zang, Xiaonan; Wu, Fei; Jin, Yuming; Wang, Haitao; Liu, Chang; Ding, Yating; He, Bangxiang; Xiao, Dongfang; Song, Xinwei; Liu, Zhu

2017-01-01

Gracilariopsis lemaneiformis (aka Gracilaria lemaneiformis) is a red macroalga rich in phycoerythrin, which can capture light efficiently and transfer it to photosystem II. However, little is known about the synthesis of optically active phycoerythrin in G. lemaneiformis at the molecular level. With the advent of high-throughput sequencing technology, analysis of genetic information for G. lemaneiformis by transcriptome sequencing is an effective means to get a deeper insight into the molecular mechanism of phycoerythrin synthesis. Illumina technology was employed to sequence the transcriptome of two strains of G. lemaneiformis: the wild type and a green-pigmented mutant. We obtained a total of 86915 assembled unigenes as a reference gene set, and 42884 unigenes were annotated in at least one public database. Taking the above transcriptome sequencing as a reference gene set, 4041 differentially expressed genes were screened to analyze and compare the gene expression profiles of the wild type and green mutant. By GO and KEGG pathway analysis, we concluded that three factors, including a reduction in the expression level of apo-phycoerythrin, an increase of chlorophyll light-harvesting complex synthesis, and reduction of phycoerythrobilin by competitive inhibition, caused the reduction of optically active phycoerythrin in the green-pigmented mutant.
Structure, inheritance, and expression of hybrid poplar (Populus trichocarpa x Populus deltoides) phenylalanine ammonia-lyase genes.

PubMed Central

Subramaniam, R; Reinold, S; Molitor, E K; Douglas, C J

1993-01-01

A heterologous probe encoding phenylalanine ammonia-lyase (PAL) was used to identify PAL clones in cDNA libraries made with RNA from young leaf tissue of two Populus deltoides x P. trichocarpa F1 hybrid clones. Sequence analysis of a 2.4-kb cDNA confirmed its identity as a full-length PAL clone. The predicted amino acid sequence is conserved in comparison with that of PAL genes from several other plants. Southern blot analysis of poplar genomic DNA from parental and hybrid individuals, restriction site polymorphism in PAL cDNA clones, and sequence heterogeneity in the 3' ends of several cDNA clones suggested that PAL is encoded by at least two genes that can be distinguished by HindIII restriction site polymorphisms. Clones containing each type of PAL gene were isolated from a poplar genomic library. Analysis of the segregation of PAL-specific HindIII restriction fragment-length polymorphisms demonstrated the existence of two independently segregating PAL loci, one of which was mapped to a linkage group of the poplar genetic map. Developmentally regulated PAL expression in poplar was analyzed using RNA blots. Highest expression was observed in young stems, apical buds, and young leaves. Expression was lower in older stems and undetectable in mature leaves. Cellular localization of PAL expression by in situ hybridization showed very high levels of expression in subepidermal cells of leaves early during leaf development. In stems and petioles, expression was associated with subepidermal cells and vascular tissues. PMID:8108506
microRNA expression profiling in fetal single ventricle malformation identified by deep sequencing.

PubMed

Yu, Zhang-Bin; Han, Shu-Ping; Bai, Yun-Fei; Zhu, Chun; Pan, Ya; Guo, Xi-Rong

2012-01-01

microRNAs (miRNAs) have emerged as key regulators in many biological processes, particularly cardiac growth and development, although the specific miRNA expression profile associated with this process remains to be elucidated. This study aimed to characterize the cellular microRNA profile involved in the development of congenital heart malformation, through the investigation of single ventricle (SV) defects. Comprehensive miRNA profiling in human fetal SV cardiac tissue was performed by deep sequencing. Differential expression of 48 miRNAs was revealed by sequencing by oligonucleotide ligation and detection (SOLiD) analysis. Of these, 38 were down-regulated and 10 were up-regulated in differentiated SV cardiac tissue, compared to control cardiac tissue. This was confirmed by real-time quantitative reverse transcription-polymerase chain reaction (qRT-PCR) analysis. Predicted target genes of the 48 differentially expressed miRNAs were analyzed by gene ontology and categorized according to cellular process, regulation of biological process and metabolic process. Pathway-Express analysis identified the WNT and mTOR signaling pathways as the most significant processes putatively affected by the differential expression of these miRNAs. The candidate genes involved in cardiac development were identified as potential targets for these differentially expressed microRNAs, and the collaborative network of microRNAs and cardiac development-related mRNAs was constructed. These data provide the basis for future investigation of the mechanism of the occurrence and development of fetal SV malformations.

Analysis of C. elegans VIG-1 expression.

PubMed

Shin, Kyoung-Hwa; Choi, Boram; Park, Yang-Seo; Cho, Nam Jeong

2008-12-31

Double-stranded RNA (dsRNA) induces gene silencing in a sequence-specific manner by a process known as RNA interference (RNAi). The RNA-induced silencing complex (RISC) is a multi-subunit ribonucleoprotein complex that plays a key role in RNAi. VIG (Vasa intronic gene) has been identified as a component of Drosophila RISC; however, the role VIG plays in regulating RNAi is poorly understood. Here, we examined the spatial and temporal expression patterns of VIG-1, the C. elegans ortholog of Drosophila VIG, using a vig-1::gfp fusion construct. This construct contains the 908-bp region immediately upstream of vig-1 gene translation initiation site. Analysis by confocal microscopy demonstrated GFP-VIG-1 expression in a number of tissues including the pharynx, body wall muscle, hypodermis, intestine, reproductive system, and nervous system at the larval and adult stages. Furthermore, western blot analysis showed that VIG-1 is present in each developmental stage examined. To investigate regulatory sequences for vig-1 gene expression, we generated constructs containing deletions in the upstream region. It was determined that the GFP expression pattern of a deletion construct (delta-908 to -597) was generally similar to that of the non-deletion construct. In contrast, removal of a larger segment (delta-908 to -191) resulted in the loss of GFP expression in most cell types. Collectively, these results indicate that the 406-bp upstream region (-596 to -191) contains essential regulatory sequences required for VIG-1 expression.
Digital Gene Expression Analysis Based on De Novo Transcriptome Assembly Reveals New Genes Associated with Floral Organ Differentiation of the Orchid Plant Cymbidium ensifolium

PubMed Central

Yang, Fengxi; Zhu, Genfa

2015-01-01

Cymbidium ensifolium belongs to the genus Cymbidium of the orchid family. Owing to its spectacular flower morphology, C. ensifolium has considerable ecological and cultural value. However, limited genetic data is available for this non-model plant, and the molecular mechanism underlying floral organ identity is still poorly understood. In this study, we characterize the floral transcriptome of C. ensifolium and present, for the first time, extensive sequence and transcript abundance data of individual floral organs. After sequencing, over 10 Gb clean sequence data were generated and assembled into 111,892 unigenes with an average length of 932.03 base pairs, including 1,227 clusters and 110,665 singletons. Assembled sequences were annotated with gene descriptions, gene ontology, clusters of orthologous group terms, the Kyoto Encyclopedia of Genes and Genomes, and the plant transcription factor database. From these annotations, 131 flowering-associated unigenes, 61 CONSTANS-LIKE (COL) unigenes and 90 floral homeotic genes were identified. In addition, four digital gene expression libraries were constructed for the sepal, petal, labellum and gynostemium, and 1,058 genes corresponding to individual floral organ development were identified. Among them, eight MADS-box genes were further investigated by full-length cDNA sequence analysis and expression validation, which revealed two APETALA1/AGL9-like MADS-box genes preferentially expressed in the sepal and petal, two AGAMOUS-like genes particularly restricted to the gynostemium, and four DEF-like genes distinctively expressed in different floral organs. The spatial expression of these genes varied distinctly in different floral mutant corresponding to different floral morphogenesis, which validated the specialized roles of them in floral patterning and further supported the effectiveness of our in silico analysis. 
This dataset generated in our study provides new insights into the molecular mechanisms underlying floral patterning of Cymbidium and supports a valuable resource for molecular breeding of the orchid plant. PMID:26580566
VaDiR: an integrated approach to Variant Detection in RNA.

PubMed

Neums, Lisa; Suenaga, Seiji; Beyerlein, Peter; Anders, Sara; Koestler, Devin; Mariani, Andrea; Chien, Jeremy

2018-02-01

Advances in next-generation DNA sequencing technologies are now enabling detailed characterization of sequence variations in cancer genomes. With whole-genome sequencing, variations in coding and non-coding sequences can be discovered, but its cost currently limits its general use in research. Whole-exome sequencing is used to characterize sequence variations in coding regions, but the cost associated with capture reagents and biases in capture rate limit its full use in research. Additional limitations include uncertainty in assigning the functional significance of the mutations when these mutations are observed in the non-coding region or in genes that are not expressed in cancer tissue. We investigated the feasibility of uncovering mutations from expressed genes using RNA sequencing datasets with a method called Variant Detection in RNA (VaDiR) that integrates 3 variant callers, namely: SNPiR, RVBoost, and MuTect2. The combination of all 3 methods, which we called Tier 1 variants, produced the highest precision with true positive mutations from RNA-seq that could be validated at the DNA level. We also found that the integration of Tier 1 variants with those called by MuTect2 and SNPiR produced the highest recall with acceptable precision. Finally, we observed a higher rate of mutation discovery in genes that are expressed at higher levels. Our method, VaDiR, provides a possibility of uncovering mutations from RNA sequencing datasets that could be useful in further functional analysis. In addition, our approach allows orthogonal validation of DNA-based mutation discovery by providing complementary sequence variation analysis from paired RNA/DNA sequencing datasets.
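The idea of combining several callers into a consensus tier can be illustrated with plain set operations. The sketch below is not the VaDiR implementation: the variant keys, the function names, and in particular the reading of the "highest recall" set as Tier 1 plus variants shared by MuTect2 and SNPiR are assumptions made for illustration only.

```python
# Illustrative sketch only; not the VaDiR code. A variant is represented by a
# hypothetical key: (chromosome, position, reference allele, alternate allele).

def tier1_variants(snpir, rvboost, mutect2):
    """Variants reported by all three callers (the consensus set)."""
    return snpir & rvboost & mutect2


def extended_variants(snpir, rvboost, mutect2):
    """Assumed reading of the higher-recall set: Tier 1 variants plus
    variants reported by both MuTect2 and SNPiR."""
    return tier1_variants(snpir, rvboost, mutect2) | (mutect2 & snpir)


# Made-up call sets for three callers.
snpir_calls = {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"), ("chr3", 300, "G", "A")}
rvboost_calls = {("chr1", 100, "A", "G"), ("chr4", 400, "T", "C")}
mutect2_calls = {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T")}

consensus = tier1_variants(snpir_calls, rvboost_calls, mutect2_calls)
extended = extended_variants(snpir_calls, rvboost_calls, mutect2_calls)
```

Requiring agreement among all callers shrinks the set and tends to favour precision, while relaxing the requirement admits more calls and tends to favour recall, which matches the trade-off the abstract describes.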
Pair-barcode high-throughput sequencing for large-scale multiplexed sample analysis

PubMed Central

2012-01-01

Background Multiplexing has become the major limitation of next-generation sequencing (NGS) in application to low complexity samples. Physical space segregation allows limited multiplexing, while the existing barcode approach only permits simultaneous analysis of up to several dozen samples. Results Here we introduce pair-barcode sequencing (PBS), an economic and flexible barcoding technique that permits parallel analysis of large-scale multiplexed samples. In two pilot runs using SOLiD sequencer (Applied Biosystems Inc.), 32 independent pair-barcoded miRNA libraries were simultaneously discovered by the combination of 4 unique forward barcodes and 8 unique reverse barcodes. Over 174,000,000 reads were generated and about 64% of them were assigned to both of the barcodes. After mapping all reads to pre-miRNAs in miRBase, different miRNA expression patterns were captured from the two clinical groups. The strong correlation using different barcode pairs and the high consistency of miRNA expression in two independent runs demonstrate that the PBS approach is valid. Conclusions By employing the PBS approach in NGS, large-scale multiplexed pooled samples could be practically analyzed in parallel so that high-throughput sequencing economically meets the requirements of samples with low sequencing throughput demand. PMID:22276739
Pair-barcode high-throughput sequencing for large-scale multiplexed sample analysis.

PubMed

Tu, Jing; Ge, Qinyu; Wang, Shengqin; Wang, Lei; Sun, Beili; Yang, Qi; Bai, Yunfei; Lu, Zuhong

2012-01-25

Multiplexing has become the major limitation of next-generation sequencing (NGS) in application to low complexity samples. Physical space segregation allows limited multiplexing, while the existing barcode approach only permits simultaneous analysis of up to several dozen samples. Here we introduce pair-barcode sequencing (PBS), an economic and flexible barcoding technique that permits parallel analysis of large-scale multiplexed samples. In two pilot runs using SOLiD sequencer (Applied Biosystems Inc.), 32 independent pair-barcoded miRNA libraries were simultaneously discovered by the combination of 4 unique forward barcodes and 8 unique reverse barcodes. Over 174,000,000 reads were generated and about 64% of them were assigned to both of the barcodes. After mapping all reads to pre-miRNAs in miRBase, different miRNA expression patterns were captured from the two clinical groups. The strong correlation using different barcode pairs and the high consistency of miRNA expression in two independent runs demonstrate that the PBS approach is valid. By employing the PBS approach in NGS, large-scale multiplexed pooled samples could be practically analyzed in parallel so that high-throughput sequencing economically meets the requirements of samples with low sequencing throughput demand.
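The combinatorial point of pair-barcoding, that 4 forward and 8 reverse barcodes together address 4 x 8 = 32 libraries, can be sketched as a lookup keyed on the barcode pair. This is an illustration under assumptions, not the authors' pipeline: the barcode labels (F1..F4, R1..R8), the library names, and both functions are hypothetical, and real reads would carry barcode sequences rather than labels.

```python
from itertools import product

# Hypothetical barcode labels standing in for real barcode sequences.
FORWARD = ["F1", "F2", "F3", "F4"]
REVERSE = ["R1", "R2", "R3", "R4", "R5", "R6", "R7", "R8"]


def build_pair_table(forward, reverse):
    """Map every (forward, reverse) barcode pair to a library name."""
    pairs = product(forward, reverse)
    return {pair: f"lib{index:02d}" for index, pair in enumerate(pairs, start=1)}


def assign_read(forward_tag, reverse_tag, table):
    """Return the library for a read, or None when either barcode is unrecognised."""
    return table.get((forward_tag, reverse_tag))


pair_table = build_pair_table(FORWARD, REVERSE)
library_count = len(pair_table)                     # 4 * 8 pairs
example_library = assign_read("F2", "R3", pair_table)
unassigned = assign_read("F9", "R1", pair_table)    # unknown forward barcode
```

A read is assigned only when both barcodes are recognised, which is consistent with the abstract's report that only a fraction of reads (about 64%) could be assigned to both barcodes.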
Expression profiling of snoRNAs in normal hematopoiesis and AML

PubMed Central

Warner, Wayne A.; Spencer, David H.; Trissal, Maria; White, Brian S.; Helton, Nichole; Ley, Timothy J.

2018-01-01

Small nucleolar RNAs (snoRNAs) are noncoding RNAs that contribute to ribosome biogenesis and RNA splicing by modifying ribosomal RNA and spliceosome RNAs, respectively. We optimized a next-generation sequencing approach and a custom analysis pipeline to identify and quantify expression of snoRNAs in acute myeloid leukemia (AML) and normal hematopoietic cell populations. We show that snoRNAs are expressed in a lineage- and development-specific fashion during hematopoiesis. The most striking examples involve snoRNAs located in 2 imprinted loci, which are highly expressed in hematopoietic progenitors and downregulated during myeloid differentiation. Although most snoRNAs are expressed at similar levels in AML cells compared with CD34+, a subset of snoRNAs showed consistent differential expression, with the great majority of these being decreased in the AML samples. Analysis of host gene expression, splicing patterns, and whole-genome sequence data for mutational events did not identify transcriptional patterns or genetic alterations that account for these expression differences. These data provide a comprehensive analysis of the snoRNA transcriptome in normal and leukemic cells and should be helpful in the design of studies to define the contribution of snoRNAs to normal and malignant hematopoiesis. PMID:29365324
Massive Collection of Full-Length Complementary DNA Clones and Microarray Analyses: Keys to Rice Transcriptome Analysis

NASA Astrophysics Data System (ADS)

Kikuchi, Shoshi

2009-02-01

Completion of the high-precision genome sequence analysis of rice led to the collection of about 35,000 full-length cDNA clones and the determination of their complete sequences. Mapping of these full-length cDNA sequences has given us information on (1) the number of genes expressed in the rice genome; (2) the start and end positions and exon-intron structures of rice genes; (3) alternative transcripts; (4) possible encoded proteins; (5) non-protein-coding (np) RNAs; (6) the density of gene localization on the chromosome; (7) setting the parameters of gene prediction programs; and (8) the construction of a microarray system that monitors global gene expression. Manual curation for rice gene annotation by using mapping information on full-length cDNA and EST assemblies has revealed about 32,000 expressed genes in the rice genome. Analysis of major gene families, such as those encoding membrane transport proteins (pumps, ion channels, and secondary transporters), along with the evolution from bacteria to higher animals and plants, reveals how gene numbers have increased through adaptation to circumstances. Family-based gene annotation also gives us a new way of comparing organisms. Massive amounts of data on gene expression under many kinds of physiological conditions are being accumulated in rice oligoarrays (22K and 44K) based on full-length cDNA sequences. Cluster analyses of genes that have the same promoter cis-elements, that have similar expression profiles, or that encode enzymes in the same metabolic pathways or signal transduction cascades give us clues to understanding the networks of gene expression in rice. As a tool for that purpose, we recently developed "RiCES", a tool for searching for cis-elements in the promoter regions of clustered genes.
An unusual plant triterpene synthase with predominant α-amyrin-producing activity identified by characterizing oxidosqualene cyclases from Malus × domestica.

PubMed

Brendolise, Cyril; Yauk, Yar-Khing; Eberhard, Ellen D; Wang, Mindy; Chagne, David; Andre, Christelle; Greenwood, David R; Beuning, Lesley L

2011-07-01

The pentacyclic triterpenes, in particular ursolic acid and oleanolic acid and their derivatives, exist abundantly in the plant kingdom, where they are well known for their anti-inflammatory, antitumour and antimicrobial properties. α-Amyrin and β-amyrin are the precursors of ursolic and oleanolic acids, respectively, formed by concerted cyclization of squalene epoxide by a complex synthase reaction. We identified three full-length expressed sequence tag sequences in cDNA libraries constructed from apple (Malus × domestica 'Royal Gala') that were likely to encode triterpene synthases. Two of these expressed sequence tag sequences were essentially identical (> 99% amino acid similarity; MdOSC1 and MdOSC3). MdOSC1 and MdOSC2 were expressed by transient expression in Nicotiana benthamiana leaves and by expression in the yeast Pichia methanolica. The resulting products were analysed by GC and GC-MS. MdOSC1 was shown to be a mixed amyrin synthase (a 5 : 1 ratio of α-amyrin to β-amyrin). MdOSC1 is the only triterpene synthase so far identified in which the level of α-amyrin produced is > 80% of the total product and is, therefore, primarily an α-amyrin synthase. No product was evident for MdOSC2 when expressed either transiently or in yeast, suggesting that this putative triterpene synthase is either encoded by a pseudogene or does not express well in these systems. Transcript expression analysis in Royal Gala indicated that the genes are mostly expressed in apple peel, and that the MdOSC2 expression level was much lower than that of MdOSC1 and MdOSC3 in all the tissues tested. Amyrin content analysis was undertaken by LC-MS, and demonstrated that levels and ratios differ between tissues, but that the true consequence of synthase activity is reflected in the ursolic/oleanolic acid content and in further triterpenoids derived from them. 
Phylogenetic analysis placed the three triterpene synthase sequences with other triterpene synthases that encoded either α-amyrin and/or β-amyrin synthase. MdOSC1 and MdOSC3 clustered with the multifunctional triterpene synthases, whereas MdOSC2 was most similar to the β-amyrin synthases. © 2011 The New Zealand Institute for Plant and Food Research Limited. Journal compilation © 2011 FEBS.
Regulation of the Human Endogenous Retrovirus K (HML-2) Transcriptome by the HIV-1 Tat Protein

PubMed Central

Gonzalez-Hernandez, Marta J.; Cavalcoli, James D.; Sartor, Maureen A.; Contreras-Galindo, Rafael; Meng, Fan; Dai, Manhong; Dube, Derek; Saha, Anjan K.; Gitlin, Scott D.; Omenn, Gilbert S.; Kaplan, Mark H.

2014-01-01

ABSTRACT Approximately 8% of the human genome is made up of endogenous retroviral sequences. As the HIV-1 Tat protein activates the overall expression of the human endogenous retrovirus type K (HERV-K) (HML-2), we used next-generation sequencing to determine which of the 91 currently annotated HERV-K (HML-2) proviruses are regulated by Tat. Transcriptome sequencing of total RNA isolated from Tat- and vehicle-treated peripheral blood lymphocytes from a healthy donor showed that Tat significantly activates expression of 26 unique HERV-K (HML-2) proviruses, silences 12, and does not significantly alter the expression of the remaining proviruses. Quantitative reverse transcription-PCR validation of the sequencing data was performed on Tat-treated PBLs of seven donors using provirus-specific primers and corroborated the results with a substantial degree of quantitative similarity. IMPORTANCE The expression of HERV-K (HML-2) is tightly regulated but becomes markedly increased following infection with HIV-1, in part due to the HIV-1 Tat protein. The findings reported here demonstrate the complexity of the genome-wide regulation of HERV-K (HML-2) expression by Tat. This work also demonstrates that although HERV-K (HML-2) proviruses in the human genome are highly similar in terms of DNA sequence, modulation of the expression of specific proviruses in a given biological situation can be ascertained using next-generation sequencing and bioinformatics analysis. PMID:24872592
Transcriptome analysis of Cymbidium sinense and its application to the identification of genes associated with floral development

PubMed Central

2013-01-01

Background Cymbidium sinense belongs to the Orchidaceae, which is one of the most abundant angiosperm families. C. sinense, a high-grade traditional potted flower, is most prevalent in China and some Southeast Asian countries. The control of flowering time is a major bottleneck in the industrialized development of C. sinense. Little is known about the mechanisms responsible for floral development in this orchid. Moreover, genome references for entire transcriptome sequences do not currently exist for C. sinense. Thus, transcriptome and expression profiling data for this species are needed as an important resource to identify genes and to better understand the biological mechanisms of floral development in C. sinense. Results In this study, de novo transcriptome assembly and gene expression analysis using Illumina sequencing technology were performed. Transcriptome analysis assembles gene-related information related to vegetative and reproductive growth of C. sinense. Illumina sequencing generated 54,248,006 high quality reads that were assembled into 83,580 unigenes with an average sequence length of 612 base pairs, including 13,315 clusters and 70,265 singletons. A total of 41,687 (49.88%) unique sequences were annotated, 23,092 of which were assigned to specific metabolic pathways by the Kyoto Encyclopedia of Genes and Genomes (KEGG). Gene Ontology (GO) analysis of the annotated unigenes revealed that the majority of sequenced genes were associated with metabolic and cellular processes, cell and cell parts, catalytic activity and binding. Furthermore, 120 flowering-associated unigenes, 73 MADS-box unigenes and 28 CONSTANS-LIKE (COL) unigenes were identified from our collection. In addition, three digital gene expression (DGE) libraries were constructed for the vegetative phase (VP), floral differentiation phase (FDP) and reproductive phase (RP). The specific expression of many genes in the three development phases was also identified. 
32 genes among three sub-libraries with high differential expression were selected as candidates connected with flower development. Conclusion RNA-seq and DGE profiling data provided comprehensive gene expression information at the transcriptional level that could facilitate our understanding of the molecular mechanisms of floral development at three development phases of C. sinense. This data could be used as an important resource for investigating the genetics of the flowering pathway and various biological mechanisms in this orchid. PMID:23617896
Transcriptome analysis of Cymbidium sinense and its application to the identification of genes associated with floral development.

PubMed

Zhang, Jianxia; Wu, Kunlin; Zeng, Songjun; Teixeira da Silva, Jaime A; Zhao, Xiaolan; Tian, Chang-En; Xia, Haoqiang; Duan, Jun

2013-04-24

Cymbidium sinense belongs to the Orchidaceae, which is one of the most abundant angiosperm families. C. sinense, a high-grade traditional potted flower, is most prevalent in China and some Southeast Asian countries. The control of flowering time is a major bottleneck in the industrialized development of C. sinense. Little is known about the mechanisms responsible for floral development in this orchid. Moreover, genome references for entire transcriptome sequences do not currently exist for C. sinense. Thus, transcriptome and expression profiling data for this species are needed as an important resource to identify genes and to better understand the biological mechanisms of floral development in C. sinense. In this study, de novo transcriptome assembly and gene expression analysis using Illumina sequencing technology were performed. Transcriptome analysis assembles gene-related information related to vegetative and reproductive growth of C. sinense. Illumina sequencing generated 54,248,006 high quality reads that were assembled into 83,580 unigenes with an average sequence length of 612 base pairs, including 13,315 clusters and 70,265 singletons. A total of 41,687 (49.88%) unique sequences were annotated, 23,092 of which were assigned to specific metabolic pathways by the Kyoto Encyclopedia of Genes and Genomes (KEGG). Gene Ontology (GO) analysis of the annotated unigenes revealed that the majority of sequenced genes were associated with metabolic and cellular processes, cell and cell parts, catalytic activity and binding. Furthermore, 120 flowering-associated unigenes, 73 MADS-box unigenes and 28 CONSTANS-LIKE (COL) unigenes were identified from our collection. In addition, three digital gene expression (DGE) libraries were constructed for the vegetative phase (VP), floral differentiation phase (FDP) and reproductive phase (RP). The specific expression of many genes in the three development phases was also identified. 
32 genes among three sub-libraries with high differential expression were selected as candidates connected with flower development. RNA-seq and DGE profiling data provided comprehensive gene expression information at the transcriptional level that could facilitate our understanding of the molecular mechanisms of floral development at three development phases of C. sinense. This data could be used as an important resource for investigating the genetics of the flowering pathway and various biological mechanisms in this orchid.
Sequence analysis of 497 mouse brain ESTs expressed in the substantia nigra

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stewart, G.J.; Savioz, A.; Davies, R.W.

1997-01-15

The use of subtracted, region-specific cDNA libraries combined with single-pass cDNA sequencing allows the discovery of novel genes and facilitates molecular description of the tissue or region involved. We report the sequence of 497 mouse expressed sequence tags (ESTs) from two subtracted libraries enriched for cDNAs expressed in the substantia nigra, a brain region with important roles in movement control and Parkinson disease. Of these, 238 ESTs give no database matches and therefore derive from novel genes. A further 115 ESTs show sequence similarity to ESTs from other organisms, which themselves do not yield any significant database matches to genes of known function. Fifty-six ESTs show sequence similarity to previously identified genes whose mouse homologues have not been reported. The total number of ESTs reported that are new for the mouse is 407, which, together with the 90 ESTs corresponding to known mouse genes or cDNAs, contributes to the molecular description of the substantia nigra. 21 refs., 4 tabs.
Characterization of shark complement factor I gene(s): genomic analysis of a novel shark-specific sequence.

PubMed

Shin, Dong-Ho; Webb, Barbara M; Nakao, Miki; Smith, Sylvia L

2009-07-01

Complement factor I is a crucial regulator of mammalian complement activity. Very little is known of complement regulators in non-mammalian species. We isolated and sequenced four highly similar complement factor I cDNAs from the liver of the nurse shark (Ginglymostoma cirratum), designated as GcIf-1, GcIf-2, GcIf-3 and GcIf-4 (previously referred to as nsFI-a, -b, -c and -d) which encode 689, 673, 673 and 657 amino acid residues, respectively. They share 95% (
Characterization of shark complement factor I gene(s): genomic analysis of a novel shark-specific sequence

PubMed Central

Shin, Dong-Ho; Webb, Barbara M.; Nakao, Miki; Smith, Sylvia L.

2009-01-01

Complement factor I is a crucial regulator of mammalian complement activity. Very little is known of complement regulators in non-mammalian species. We isolated and sequenced four highly similar complement factor I cDNAs from the liver of the nurse shark (Ginglymostoma cirratum), designated as GcIf-1, GcIf-2, GcIf-3 and GcIf-4 (previously referred to as nsFI-a, -b, -c and -d) which encode 689, 673, 673 and 657 amino acid residues, respectively. They share 95% (≤) amino acid identities with each other, 35.4 ~ 39.6% and 62.8 ~ 65.9% with factor I of mammals and banded houndshark (Triakis scyllium), respectively. The modular structure of the GcIf is similar to that of mammals with one notable exception, the presence of a novel shark-specific sequence between the leader peptide (LP) and the factor I membrane attack complex (FIMAC) domain. The cDNA sequences differ only in the size and composition of the shark-specific region (SSR). Sequence analysis of each SSR has identified within the region two novel short sequences (SS1 and SS2) and three repeat sequences (RS1, 2 and 3). Genomic analysis has revealed the existence of three introns between the leader peptide and the FIMAC domain, tentatively designated intron 1, intron 2, and intron 3 which span 4067, 2293 and 2082 bp, respectively. Southern blot analysis suggests the presence of a single gene copy for each cDNA type. Phylogenetic analysis suggests that complement factor I of cartilaginous fish diverged prior to the emergence of mammals. All four GcIf cDNA species are expressed in four different tissues and the liver is the main tissue in which expression level of all four is high. This suggests that the expression of GcIf isotypes is tissue-dependent. PMID:19423168
Identification of tissue-specific, abiotic stress-responsive gene expression patterns in wine grape (Vitis vinifera L.) based on curation and mining of large-scale EST data sets

PubMed Central

2011-01-01

Background Abiotic stresses, such as water deficit and soil salinity, result in changes in physiology, nutrient use, and vegetative growth in vines, and ultimately, yield and flavor in berries of wine grape, Vitis vinifera L. Large-scale expressed sequence tags (ESTs) were generated, curated, and analyzed to identify major genetic determinants responsible for stress-adaptive responses. Although roots serve as the first site of perception and/or injury for many types of abiotic stress, EST sequencing in root tissues of wine grape exposed to abiotic stresses has been extremely limited to date. To overcome this limitation, large-scale EST sequencing was conducted from root tissues exposed to multiple abiotic stresses. Results A total of 62,236 expressed sequence tags (ESTs) were generated from leaf, berry, and root tissues from vines subjected to abiotic stresses and compared with 32,286 ESTs sequenced from 20 public cDNA libraries. Curation to correct annotation errors, clustering and assembly of the berry and leaf ESTs with currently available V. vinifera full-length transcripts and ESTs yielded a total of 13,278 unique sequences, with 2302 singletons and 10,976 mapped to V. vinifera gene models. Of these, 739 transcripts were found to have significant differential expression in stressed leaves and berries including 250 genes not described previously as being abiotic stress responsive. In a second analysis of 16,452 ESTs from a normalized root cDNA library derived from roots exposed to multiple, short-term, abiotic stresses, 135 genes with root-enriched expression patterns were identified on the basis of their relative EST abundance in roots relative to other tissues. 
Conclusions The large-scale analysis of relative EST frequency counts among a diverse collection of 23 different cDNA libraries from leaf, berry, and root tissues of wine grape exposed to a variety of abiotic stress conditions revealed distinct, tissue-specific expression patterns, previously unrecognized stress-induced genes, and many novel genes with root-enriched mRNA expression for improving our understanding of root biology and manipulation of rootstock traits in wine grape. mRNA abundance estimates based on EST library-enriched expression patterns showed only modest correlations between microarray and quantitative, real-time reverse transcription-polymerase chain reaction (qRT-PCR) methods highlighting the need for deep-sequencing expression profiling methods. PMID:21592389
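The idea of ranking genes by relative EST abundance in roots versus other tissues can be illustrated with a minimal sketch. This is not the authors' pipeline: the per-10,000 normalization, the pseudocount, the function names and the pooled non-root library size used in the usage note are all assumptions made for illustration. Only the root library size of 16,452 ESTs comes from the abstract.

```python
def ests_per_10k(gene_count, library_size):
    """Normalize a raw EST count to ESTs per 10,000 sequenced in the library."""
    return 10_000 * gene_count / library_size


def root_enrichment(root_count, root_total, other_count, other_total, pseudocount=1):
    """Ratio of normalized EST abundance in roots to that in other tissues.

    The pseudocount (an assumption, not taken from the abstract) avoids
    division by zero for genes with no ESTs outside the root library.
    """
    root = ests_per_10k(root_count + pseudocount, root_total)
    other = ests_per_10k(other_count + pseudocount, other_total)
    return root / other
```

For example, `root_enrichment(9, 16452, 4, 82260)` compares 9 root ESTs in a 16,452-EST library with 4 ESTs in a hypothetical pooled set of 82,260 non-root ESTs and gives a ratio of about 10. A real analysis would also need a statistical test, since raw ratios on small counts are noisy.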
High-throughput amplification of mature microRNAs in uncharacterized animal models using polyadenylated RNA and stem-loop reverse transcription polymerase chain reaction.

PubMed

Biggar, Kyle K; Wu, Cheng-Wei; Storey, Kenneth B

2014-10-01

This study makes a significant advancement on a microRNA amplification technique previously used for expression analysis and sequencing in animal models without annotated mature microRNA sequences. As research progresses into the post-genomic era of microRNA prediction and analysis, the need for a rapid and cost-effective method for microRNA amplification is critical to facilitate wide-scale analysis of microRNA expression. To facilitate this requirement, we have reoptimized the design of amplification primers and introduced a polyadenylation step to allow amplification of all mature microRNAs from a single RNA sample. Importantly, this method retains the ability to sequence reverse transcription polymerase chain reaction (RT-PCR) products, validating microRNA-specific amplification. Copyright © 2014 Elsevier Inc. All rights reserved.
Gene amplification of 5-enol-pyruvylshikimate-3-phosphate synthase in glyphosate-resistant Kochia scoparia.

PubMed

Wiersma, Andrew T; Gaines, Todd A; Preston, Christopher; Hamilton, John P; Giacomini, Darci; Robin Buell, C; Leach, Jan E; Westra, Philip

2015-02-01

Field-evolved resistance to the herbicide glyphosate is due to amplification of one of two EPSPS alleles, increasing transcription and protein with no splice variants or effects on other pathway genes. The widely used herbicide glyphosate inhibits the shikimate pathway enzyme 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS). Globally, the intensive use of glyphosate for weed control has selected for glyphosate resistance in 31 weed species. Populations of suspected glyphosate-resistant Kochia scoparia were collected from fields located in the US central Great Plains. Glyphosate dose response verified glyphosate resistance in nine populations. The mechanism of resistance to glyphosate was investigated using targeted sequencing, quantitative PCR, immunoblotting, and whole transcriptome de novo sequencing to characterize the sequence and expression of EPSPS. Sequence analysis showed no mutation of the EPSPS Pro106 codon in glyphosate-resistant K. scoparia, whereas EPSPS genomic copy number and transcript abundance were elevated three- to ten-fold in resistant individuals relative to susceptible individuals. Glyphosate-resistant individuals with increased relative EPSPS copy numbers had consistently lower shikimate accumulation in leaf disks treated with 100 μM glyphosate and EPSPS protein levels were higher in glyphosate-resistant individuals with increased gene copy number compared to glyphosate-susceptible individuals. RNA sequence analysis revealed seven nucleotide positions with two different expressed alleles in glyphosate-susceptible reads. However, one nucleotide at the seven positions was predominant in glyphosate-resistant sequences, suggesting that only one of two EPSPS alleles was amplified in glyphosate-resistant individuals. No alternatively spliced EPSPS transcripts were detected. Expression of five other genes in the chorismate pathway was unaffected in glyphosate-resistant individuals with increased EPSPS expression. 
These results indicate increased EPSPS expression is a mechanism for glyphosate resistance in these K. scoparia populations.
Analysis of expressed sequence tags from Prunus mume flower and fruit and development of simple sequence repeat markers

PubMed Central

2010-01-01

Background Expressed sequence tags (ESTs) are a cost-effective tool in molecular biology and represent an abundant, valuable resource for genome annotation, gene expression analysis, and comparative genomics in plants. Results In this study, we constructed a cDNA library of Prunus mume flower and fruit, sequenced 10,123 clones of the library, and obtained 8,656 high-quality EST sequences. The ESTs were assembled into 4,473 unigenes, composed of 1,492 contigs and 2,981 singletons, which have been deposited in NCBI (accession IDs: GW868575 - GW873047); of these, 1,294 unique ESTs had known or putative functions. Furthermore, we found 1,233 putative simple sequence repeats (SSRs) in the P. mume unigene dataset. We randomly tested 42 pairs of PCR primers flanking potential SSRs, and 14 pairs were identified as true-to-type SSR loci that could amplify polymorphic bands from 20 individual plants of P. mume. We further used the 14 EST-SSR primer pairs to test transferability to peach and plum. Nearly 89% of the primer pairs produced target PCR bands in the two species. Marker polymorphism was high in plum (65%) and lower in peach (46%), and clustering analysis of the three species indicated that these SSR markers were useful for evaluating genetic relationships and diversity between and within Prunus species. Conclusions We have constructed the first cDNA library of P. mume flower and fruit, and our data provide a set of molecular biology resources for P. mume and other Prunus species. These resources will be useful for further studies such as genome annotation, new gene discovery, gene functional analysis, molecular breeding, evolution and comparative genomics between Prunus species. PMID:20626882
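Scanning unigenes for putative SSRs can be sketched with a regular expression. The abstract does not state the search criteria, so the thresholds here (motifs of 2 to 6 nt, at least four tandem copies, mononucleotide runs ignored) and the names `SSR_PATTERN` and `find_ssrs` are illustrative assumptions, not the authors' settings.

```python
import re

# Hypothetical criteria: a 2-6 nt motif followed by at least 3 further copies
# of itself (4 or more tandem repeats in total). The lazy quantifier makes the
# shortest repeating unit win, so "AGAGAGAG" is reported as (AG)4, not (AGAG)2.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{3,}")


def find_ssrs(sequence):
    """Return (start, motif, repeat_count) for each putative SSR in a sequence."""
    hits = []
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits
```

Calling `find_ssrs("TT" + "AG" * 5 + "CC")` reports one dinucleotide repeat starting at offset 2. Primer pairs would then be designed in the flanking sequence on either side of each hit.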
Characterization and improvement of RNA-Seq precision in quantitative transcript expression profiling.

PubMed

Łabaj, Paweł P; Leparc, Germán G; Linggi, Bryan E; Markillie, Lye Meng; Wiley, H Steven; Kreil, David P

2011-07-01

Measurement precision determines the power of any analysis to reliably identify significant signals, such as in screens for differential expression, independent of whether the experimental design incorporates replicates or not. With the compilation of large-scale RNA-Seq datasets with technical replicate samples, however, we can now, for the first time, perform a systematic analysis of the precision of expression level estimates from massively parallel sequencing technology. This then allows considerations for its improvement by computational or experimental means. We report on a comprehensive study of target identification and measurement precision, including their dependence on transcript expression levels, read depth and other parameters. In particular, an impressive recall of 84% of the estimated true transcript population could be achieved with 331 million 50 bp reads, with diminishing returns from longer read lengths and even smaller gains from increased sequencing depths. Most of the measurement power (75%) is spent on only 7% of the known transcriptome, however, making less strongly expressed transcripts harder to measure. Consequently, <30% of all transcripts could be quantified reliably with a relative error <20%. Based on established tools, we then introduce a new approach for mapping and analysing sequencing reads that yields substantially improved performance in gene expression profiling, increasing the number of transcripts that can reliably be quantified to over 40%. Extrapolations to higher sequencing depths highlight the need for efficient complementary steps. In the discussion we outline possible experimental and computational strategies for further improvements in quantification precision. Contact: rnaseq10@boku.ac.at
[Construction and functional identification of eukaryotic expression vector carrying Sprague-Dawley rat MSX-2 gene].

PubMed

Yang, Xian-Xian; Zhang, Mei; Yan, Zhao-Wen; Zhang, Ru-Hong; Mu, Xiong-Zheng

2008-01-01

The aim was to construct a highly effective eukaryotic expression plasmid, pcDNA3.1-MSX-2, encoding the Sprague-Dawley (SD) rat MSX-2 gene for further study of MSX-2 gene function. The full-length SD rat MSX-2 gene was amplified by PCR and inserted into the pMD18-T vector. The insert was isolated by restriction enzyme digestion with BamHI and XhoI, then ligated into the cloning site of the pcDNA3.1 expression plasmid. The positive recombinant was identified by PCR analysis, restriction endonuclease analysis and sequence analysis. Expression of RNA and protein was detected by RT-PCR and Western blot analysis in pcDNA3.1-MSX-2-transfected HEK293 cells. Sequence analysis and restriction endonuclease analysis of pcDNA3.1-MSX-2 demonstrated that the position and size of the MSX-2 cDNA insertion were consistent with the design. RT-PCR and Western blot analysis showed specific expression of MSX-2 mRNA and protein in the transfected HEK293 cells. A highly effective eukaryotic expression plasmid, pcDNA3.1-MSX-2, encoding the Sprague-Dawley rat MSX-2 gene, which is related to craniofacial development, was successfully constructed. It may serve as the basis for further study of MSX-2 gene function.

Arkas: Rapid reproducible RNAseq analysis

PubMed Central

Colombo, Anthony R.; J. Triche Jr, Timothy; Ramsingh, Giridharan

2017-01-01

The recently introduced Kallisto pseudoaligner has radically simplified the quantification of transcripts in RNA-sequencing experiments. We offer the cloud-scale RNAseq pipelines Arkas-Quantification and Arkas-Analysis, available within Illumina’s BaseSpace cloud application platform, which expedite Kallisto preparatory routines, reliably calculate differential expression, and perform gene-set enrichment of REACTOME pathways. Due to inherent inefficiencies of scale, Illumina's BaseSpace computing platform offers a massively parallel distributive environment improving data management services and data importing. Arkas-Quantification deploys Kallisto for parallel cloud computations and is conveniently integrated downstream from the BaseSpace Sequence Read Archive (SRA) import/conversion application titled SRA Import. Arkas-Analysis annotates the Kallisto results by extracting structured information directly from source FASTA files with per-contig metadata, and performs differential expression and gene-set enrichment analysis on both coding genes and transcripts. The Arkas cloud pipeline supports ENSEMBL transcriptomes and can be used downstream from SRA Import, facilitating raw sequencing import, SRA FASTQ conversion, RNA quantification and analysis steps. PMID:28868134
Cloning and expression of a nuclear encoded plastid specific 33 kDa ribonucleoprotein gene (33RNP) from pea that is light stimulated.

PubMed

Reddy, M K; Nair, S; Singh, B N; Mudgil, Y; Tewari, K K; Sopory, S K

2001-01-24

We report the cloning and sequencing of both cDNA and genomic DNA of a 33 kDa chloroplast ribonucleoprotein (33RNP) from pea. The analysis of the predicted amino acid sequence of the cDNA clone revealed that the encoded protein contains two RNA binding domains, including the conserved consensus ribonucleoprotein sequences CS-RNP1 and CS-RNP2, on the C-terminus half and the presence of a putative transit peptide sequence in the N-terminus region. The phylogenetic and multiple sequence alignment analysis of pea chloroplast RNP along with RNPs reported from the other plant sources revealed that the pea 33RNP is very closely related to Nicotiana sylvestris 31RNP and 28RNP and also to 31RNP and 28RNP of Arabidopsis and spinach, respectively. The pea 33RNP was expressed in Escherichia coli and purified to homogeneity. The in vitro import of precursor protein into chloroplasts confirmed that the N-terminus putative transit peptide is a bona fide transit peptide and 33RNP is localized in the chloroplast. The nucleic acid-binding properties of the recombinant protein, as revealed by South-Western analysis, showed that 33RNP has higher binding affinity for poly (U) and oligo dT than for ssDNA and dsDNA. The steady state transcript level was higher in leaves than in roots and the expression of this gene is light stimulated. Sequence analysis of the genomic clone revealed that the gene contains four exons and three introns. We have also isolated and analyzed the 5' flanking region of the pea 33RNP gene.
DSAP: deep-sequencing small RNA analysis pipeline.

PubMed

Huang, Po-Jung; Liu, Yi-Chung; Lee, Chi-Ching; Lin, Wei-Chen; Gan, Richie Ruei-Chi; Lyu, Ping-Chiang; Tang, Petrus

2010-07-01

DSAP is an automated multiple-task web service designed to provide a total solution to analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technology. DSAP uses a tab-delimited file as an input format, which holds the unique sequence reads (tags) and their corresponding number of copies generated by the Solexa sequencing platform. The input data will go through four analysis steps in DSAP: (i) cleanup: removal of adaptors and poly-A/T/C/G/N nucleotides; (ii) clustering: grouping of cleaned sequence tags into unique sequence clusters; (iii) non-coding RNA (ncRNA) matching: sequence homology mapping against a transcribed sequence library from the ncRNA database Rfam (http://rfam.sanger.ac.uk/); and (iv) known miRNA matching: detection of known miRNAs in miRBase (http://www.mirbase.org/) based on sequence homology. The expression levels corresponding to matched ncRNAs and miRNAs are summarized in multi-color clickable bar charts linked to external databases. DSAP is also capable of displaying miRNA expression levels from different jobs using a log(2)-scaled color matrix. Furthermore, a cross-species comparative function is also provided to show the distribution of identified miRNAs in different species as deposited in miRBase. DSAP is available at http://dsap.cgu.edu.tw.
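The first two DSAP steps (cleanup and clustering) operate on a tab-delimited list of unique tags and copy numbers, and can be sketched as below. This is only a minimal illustration under stated assumptions, not DSAP's implementation: the adaptor string is hypothetical, clustering is reduced to exact-match grouping of cleaned tags, and the function names are invented for the example.

```python
from collections import Counter

ADAPTOR = "TCGTATGCCG"  # hypothetical 3' adaptor prefix, for illustration only


def clean_tag(seq, adaptor=ADAPTOR):
    """Trim a tag at the first adaptor occurrence.

    Returns None when nothing informative remains, i.e. the tag is empty
    or consists of a single repeated nucleotide (poly-A/T/C/G/N).
    """
    seq = seq.upper()
    idx = seq.find(adaptor)
    if idx != -1:
        seq = seq[:idx]
    if not seq or len(set(seq)) == 1:
        return None
    return seq


def cluster_tags(lines):
    """Sum copy numbers of tags that become identical after cleanup.

    Each input line is expected to be "<sequence>\t<copy number>".
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        seq, copies = line.split("\t")
        cleaned = clean_tag(seq)
        if cleaned is not None:
            counts[cleaned] += int(copies)
    return counts
```

The resulting cluster counts would then be matched against Rfam and miRBase sequences in steps (iii) and (iv), which require the external databases and are not sketched here.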
Next-generation sequencing facilitates quantitative analysis of wild-type and Nrl−/− retinal transcriptomes

PubMed Central

Brooks, Matthew J.; Rajasimha, Harsha K.; Roger, Jerome E.

2011-01-01

Purpose Next-generation sequencing (NGS) has revolutionized systems-based analysis of cellular pathways. The goals of this study are to compare NGS-derived retinal transcriptome profiling (RNA-seq) to microarray and quantitative reverse transcription polymerase chain reaction (qRT–PCR) methods and to evaluate protocols for optimal high-throughput data analysis. Methods Retinal mRNA profiles of 21-day-old wild-type (WT) and neural retina leucine zipper knockout (Nrl−/−) mice were generated by deep sequencing, in triplicate, using Illumina GAIIx. The sequence reads that passed quality filters were analyzed at the transcript isoform level with two methods: Burrows–Wheeler Aligner (BWA) followed by ANOVA (ANOVA) and TopHat followed by Cufflinks. qRT–PCR validation was performed using TaqMan and SYBR Green assays. Results Using an optimized data analysis workflow, we mapped about 30 million sequence reads per sample to the mouse genome (build mm9) and identified 16,014 transcripts in the retinas of WT and Nrl−/− mice with BWA workflow and 34,115 transcripts with TopHat workflow. RNA-seq data confirmed stable expression of 25 known housekeeping genes, and 12 of these were validated with qRT–PCR. RNA-seq data had a linear relationship with qRT–PCR for more than four orders of magnitude and a goodness of fit (R2) of 0.8798. Approximately 10% of the transcripts showed differential expression between the WT and Nrl−/− retina, with a fold change ≥1.5 and p value <0.05. Altered expression of 25 genes was confirmed with qRT–PCR, demonstrating the high degree of sensitivity of the RNA-seq method. Hierarchical clustering of differentially expressed genes uncovered several as yet uncharacterized genes that may contribute to retinal function. Data analysis with BWA and TopHat workflows revealed a significant overlap yet provided complementary insights in transcriptome profiling. 
Conclusions Our study represents the first detailed analysis of retinal transcriptomes, with biologic replicates, generated by RNA-seq technology. The optimized data analysis workflows reported here should provide a framework for comparative investigations of expression profiles. Our results show that NGS offers a comprehensive and more accurate quantitative and qualitative evaluation of mRNA content within a cell or tissue. We conclude that RNA-seq based transcriptome characterization would expedite genetic network analyses and permit the dissection of complex biologic functions. PMID:22162623
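The differential-expression criterion quoted above (fold change ≥1.5 and p value <0.05) amounts to a simple two-part filter. The sketch below assumes that the fold change is taken in either direction (up or down in Nrl−/− relative to WT) and that transcripts with a non-positive mean in either group are excluded. Both choices, and the function name, are assumptions rather than details from the abstract.

```python
def is_differential(mean_wt, mean_ko, p_value, min_fold=1.5, alpha=0.05):
    """Apply a fold-change and p-value cutoff to one transcript.

    mean_wt and mean_ko are mean expression values for the two groups.
    The fold change is symmetric: 1.5-fold up and 1.5-fold down both pass.
    """
    if mean_wt <= 0 or mean_ko <= 0:
        return False  # assumption: undefined fold changes are not called
    fold = max(mean_wt, mean_ko) / min(mean_wt, mean_ko)
    return fold >= min_fold and p_value < alpha
```

For instance, `is_differential(10, 15, 0.01)` passes, while `is_differential(30, 10, 0.2)` fails on the p value despite a three-fold change. In a transcriptome-wide screen the p values would normally also be corrected for multiple testing, a step the abstract does not describe.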
Analysis of beta-carotene hydroxylase gene cDNA isolated from the American oil-palm (Elaeis oleifera) mesocarp tissue cDNA library

PubMed Central

Bhore, Subhash J; Kassim, Amelia; Loh, Chye Ying; Shah, Farida H

2010-01-01

It is well known that the nutritional quality of the American oil-palm (Elaeis oleifera) mesocarp oil is superior to that of African oil-palm (Elaeis guineensis Jacq. Tenera) mesocarp oil. Therefore, it is important to identify the genetic features underlying its superior value. This could be achieved through genome sequencing of the oil-palm. However, the genome sequence is not available in the public domain due to commercial secrecy. Hence, we constructed a cDNA library and generated expressed sequence tags (3,205) from the mesocarp tissue of the American oil-palm. We continued to annotate each of these cDNAs after submitting them to GenBank/DDBJ/EMBL. A rough analysis turned our attention to the cDNA encoding the beta-carotene hydroxylase (Chyb) enzyme. We then completed the full sequencing of both strands of the cDNA clone using M13 forward and reverse primers. The full nucleotide and protein sequences were further analyzed and annotated using various bioinformatics tools. The analysis showed the presence of a fatty acid hydroxylase superfamily domain in the protein sequence. Multiple sequence alignment of E. oleifera Chyb with selected Chyb amino acid sequences from other plant species and algae using ClustalW, together with phylogenetic analysis, suggests that Chyb from the monocotyledonous plant species Lilium hybrid, Crocus sativus and Zea mays are the most closely related evolutionarily to E. oleifera Chyb. This study reports the annotation of E. oleifera Chyb. Abbreviations: ESTs, expressed sequence tags; EoChyb, Elaeis oleifera beta-carotene hydroxylase; MC, main cluster. PMID:21364789
Transcriptome analysis and gene expression profiling of abortive and developing ovules during fruit development in hazelnut.

PubMed

Cheng, Yunqing; Liu, Jianfeng; Zhang, Huidi; Wang, Ju; Zhao, Yixin; Geng, Wanting

2015-01-01

A high ratio of blank fruit in hazelnut (Corylus heterophylla Fisch) is a very common phenomenon that causes serious yield losses in northeast China. The development of blank fruit in the Corylus genus is known to be associated with embryo abortion. However, little is known about the molecular mechanisms responsible for embryo abortion during the nut development stage. Genomic information for C. heterophylla Fisch is not available; therefore, data related to transcriptome and gene expression profiling of developing and abortive ovules are needed. In this study, de novo transcriptome sequencing and RNA-seq analysis were conducted using short-read sequencing technology (Illumina HiSeq 2000). The results of the transcriptome assembly analysis revealed genetic information that was associated with the fruit development stage. Two digital gene expression libraries were constructed, one for a full (normally developing) ovule and one for an empty (abortive) ovule. Transcriptome sequencing and assembly results revealed 55,353 unigenes, including 18,751 clusters and 36,602 singletons. These results were annotated using the public databases NR, NT, Swiss-Prot, KEGG, COG, and GO. Using digital gene expression profiling, gene expression differences in developing and abortive ovules were identified. A total of 1,637 and 715 unigenes were significantly upregulated and downregulated, respectively, in abortive ovules, compared with developing ovules. Quantitative real-time polymerase chain reaction analysis was used in order to verify the differential expression of some genes. The transcriptome and digital gene expression profiling data of normally developing and abortive ovules in hazelnut provide exhaustive information that will improve our understanding of the molecular mechanisms of abortive ovule formation in hazelnut.
New extension software modules to enhance searching and display of transcriptome data in Tripal databases

PubMed Central

Chen, Ming; Henry, Nathan; Almsaeed, Abdullah; Zhou, Xiao; Wegrzyn, Jill; Ficklin, Stephen

2017-01-01

Abstract Tripal is an open source software package for developing biological databases with a focus on genetic and genomic data. It consists of a set of core modules that deliver essential functions for loading and displaying data records and associated attributes including organisms, sequence features and genetic markers. Beyond the core modules, community members are encouraged to contribute extension modules to build on the Tripal core and to customize Tripal for individual community needs. To expand the utility of the Tripal software system, particularly for RNASeq data, we developed two new extension modules. Tripal Elasticsearch enables fast, scalable searching of the entire content of a Tripal site as well as the construction of customized advanced searches of specific data types. We demonstrate the use of this module for searching assembled transcripts by functional annotation. A second module, Tripal Analysis Expression, houses and displays records from gene expression assays such as RNA sequencing. This includes biological source materials (biomaterials), gene expression values and protocols used to generate the data. In the case of an RNASeq experiment, this would reflect the individual organisms and tissues used to produce sequencing libraries, the normalized gene expression values derived from the RNASeq data analysis and a description of the software or code used to generate the expression values. The module will load data from common flat file formats including standard NCBI Biosample XML. Data loading, display options and other configurations can be controlled by authorized users in the Drupal administrative backend. Both modules are open source, include usage documentation, and can be found in the Tripal organization’s GitHub repository. Database URL: Tripal Elasticsearch module: https://github.com/tripal/tripal_elasticsearch Tripal Analysis Expression module: https://github.com/tripal/tripal_analysis_expression PMID:29220446
RNA-ID, a Powerful Tool for Identifying and Characterizing Regulatory Sequences.

PubMed

Brule, C E; Dean, K M; Grayhack, E J

2016-01-01

The identification and analysis of sequences that regulate gene expression is critical because regulated gene expression underlies biology. RNA-ID is an efficient and sensitive method to discover and investigate regulatory sequences in the yeast Saccharomyces cerevisiae, using fluorescence-based assays to detect green fluorescent protein (GFP) relative to a red fluorescent protein (RFP) control in individual cells. Putative regulatory sequences can be inserted either in-frame or upstream of a superfolder GFP fusion protein whose expression, like that of RFP, is driven by the bidirectional GAL1,10 promoter. In this chapter, we describe the methodology to identify and study cis-regulatory sequences in the RNA-ID system, explaining features and variations of the RNA-ID reporter, as well as some applications of this system. We describe in detail the methods to analyze a single regulatory sequence, from construction of a single GFP variant to assay of variants by flow cytometry, as well as modifications required to screen libraries of different strains simultaneously. We also describe subsequent analyses of regulatory sequences. © 2016 Elsevier Inc. All rights reserved.
SigEMD: A powerful method for differential gene expression analysis in single-cell RNA sequencing data.

PubMed

Wang, Tianyu; Nabavi, Sheida

2018-04-24

Differential gene expression analysis is one of the significant efforts in single cell RNA sequencing (scRNAseq) analysis to discover the specific changes in expression levels of individual cell types. Since scRNAseq exhibits multimodality, large amounts of zero counts, and sparsity, it is different from the traditional bulk RNA sequencing (RNAseq) data. The new challenges of scRNAseq data promote the development of new methods for identifying differentially expressed (DE) genes. In this study, we proposed a new method, SigEMD, that combines a data imputation approach, a logistic regression model and a nonparametric method based on the Earth Mover's Distance, to precisely and efficiently identify DE genes in scRNAseq data. The regression model and data imputation are used to reduce the impact of large amounts of zero counts, and the nonparametric method is used to improve the sensitivity of detecting DE genes from multimodal scRNAseq data. By additionally employing gene interaction network information to adjust the final states of DE genes, we further reduce the false positives of calling DE genes. We used simulated datasets and real datasets to evaluate the detection accuracy of the proposed method and to compare its performance with those of other differential expression analysis methods. Results indicate that the proposed method has an overall powerful performance in terms of precision in detection, sensitivity, and specificity. Copyright © 2018 Elsevier Inc. All rights reserved.
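The Earth Mover's Distance at the core of the nonparametric step can be illustrated for the simplest case: two one-dimensional samples of equal size, where the distance reduces to the mean absolute difference between the sorted values. This sketch covers only that distance. SigEMD's imputation, logistic regression, significance testing and network adjustment are not shown, and the function name and equal-size restriction are assumptions made for the example.

```python
def emd_1d(sample_a, sample_b):
    """Earth Mover's Distance between two equal-size 1-D samples.

    For empirical distributions with the same number of points, the optimal
    transport plan pairs the i-th smallest value of one sample with the
    i-th smallest value of the other.
    """
    if not sample_a or len(sample_a) != len(sample_b):
        raise ValueError("samples must be non-empty and of equal size")
    pairs = zip(sorted(sample_a), sorted(sample_b))
    return sum(abs(a - b) for a, b in pairs) / len(sample_a)
```

With expression values `[0, 0, 1, 3]` for one cell group and `[0, 2, 2, 5]` for another, the distance is 1.25. Because it compares whole distributions rather than means, a measure of this kind can respond to multimodal differences that a mean-based test would miss.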
Positive Selection Underlies Faster-Z Evolution of Gene Expression in Birds.

PubMed

Dean, Rebecca; Harrison, Peter W; Wright, Alison E; Zimmer, Fabian; Mank, Judith E

2015-10-01

The elevated rate of evolution for genes on sex chromosomes compared with autosomes (Fast-X or Fast-Z evolution) can result either from positive selection in the heterogametic sex or from nonadaptive consequences of reduced relative effective population size. Recent work in birds suggests that Fast-Z of coding sequence is primarily due to relaxed purifying selection resulting from reduced relative effective population size. However, gene sequence and gene expression are often subject to distinct evolutionary pressures; therefore, we tested for Fast-Z in gene expression using next-generation RNA-sequencing data from multiple avian species. Similar to studies of Fast-Z in coding sequence, we recover clear signatures of Fast-Z in gene expression; however, in contrast to coding sequence, our data indicate that Fast-Z in expression is due to positive selection acting primarily in females. In the soma, where gene expression is highly correlated between the sexes, we detected Fast-Z in both sexes, although at a higher rate in females, suggesting that many positively selected expression changes in females are also expressed in males. In the gonad, where intersexual correlations in expression are much lower, we detected Fast-Z for female gene expression, but crucially, not males. This suggests that a large amount of expression variation is sex-specific in its effects within the gonad. Taken together, our results indicate that Fast-Z evolution of gene expression is the product of positive selection acting on recessive beneficial alleles in the heterogametic sex. More broadly, our analysis suggests that the adaptive potential of Z chromosome gene expression may be much greater than that of gene sequence, results which have important implications for the role of sex chromosomes in speciation and sexual selection. © The Author 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Issues with RNA-seq analysis in non-model organisms: A salmonid example.

PubMed

Sundaram, Arvind; Tengs, Torstein; Grimholt, Unni

2017-10-01

High throughput sequencing (HTS) is useful for many purposes as exemplified by the other topics included in this special issue. The purpose of this paper is to look into the unique challenges of using this technology in non-model organisms where resources such as genomes, functional genome annotations or genome complexity provide obstacles not met in model organisms. To describe these challenges, we narrow our scope to RNA sequencing used to study differential gene expression in response to pathogen challenge. As a demonstration species we chose Atlantic salmon, which has a sequenced genome with poor annotation and an added complexity due to many duplicated genes. We find that our RNA-seq analysis pipeline deciphers between duplicates despite high sequence identity. However, annotation issues provide problems in linking differentially expressed genes to pathways. Also, comparing results between approaches and species are complicated due to lack of standardized annotation. Copyright © 2017 Elsevier Ltd. All rights reserved.
Characterization and expression profiles of MaACS and MaACO genes from mulberry (Morus alba L.)

PubMed Central

Liu, Chang-ying; Lü, Rui-hua; Li, Jun; Zhao, Ai-chun; Wang, Xi-ling; Diane, Umuhoza; Wang, Xiao-hong; Wang, Chuan-hong; Yu, Ya-sheng; Han, Shu-mei; Lu, Cheng; Yu, Mao-de

2014-01-01

1-Aminocyclopropane-1-carboxylic acid synthase (ACS) and 1-aminocyclopropane-1-carboxylic acid oxidase (ACO) are encoded by multigene families and are involved in fruit ripening by catalyzing the production of ethylene throughout the development of fruit. However, there are no reports on ACS or ACO genes in mulberry, partly because of the limited molecular research background. In this study, we have obtained five ACS gene sequences and two ACO gene sequences from Morus Genome Database. Sequence alignment and phylogenetic analysis of MaACO1 and MaACO2 showed that their amino acids are conserved compared with ACO proteins from other species. MaACS1 and MaACS2 are type I, MaACS3 and MaACS4 are type II, and MaACS5 is type III, with different C-terminal sequences. Quantitative reverse transcriptase polymerase chain reaction (qRT-PCR) expression analysis showed that the transcripts of MaACS genes were strongly expressed in fruit, and more weakly in other tissues. The expression of MaACO1 and MaACO2 showed different patterns in various mulberry tissues. MaACS and MaACO genes demonstrated two patterns throughout the development of mulberry fruit, and both of them were strongly up-regulated by abscisic acid (ABA) and ethephon. PMID:25001221
Molecular cloning and expression of the calmodulin gene from guinea pig hearts.

PubMed

Feng, Rui; Liu, Yan; Sun, Xuefei; Wang, Yan; Hu, Huiyuan; Guo, Feng; Zhao, Jinsheng; Hao, Liying

2015-06-01

The aim of the present study was to isolate and characterize a complementary DNA (cDNA) clone encoding the calmodulin (CaM; GenBank accession no. FJ012165) gene from guinea pig hearts. The CaM gene was amplified from cDNA collected from guinea pig hearts and inserted into a pGEM®-T Easy vector. Subsequently, CaM nucleotide and protein sequence similarity analysis was conducted between guinea pigs and other species. In addition, reverse transcription-polymerase chain reaction (RT-PCR) was performed to investigate the CaM3 expression patterns in different guinea pig tissues. Sequence analysis revealed that the CaM gene isolated from the guinea pig heart had ∼90% sequence identity with the CaM3 genes in humans, mice and rats. Furthermore, the deduced peptide sequences of CaM3 in the guinea pig showed 100% homology to the CaM proteins from other species. In addition, the RT-PCR results indicated that CaM3 was widely and differentially expressed in guinea pigs. In conclusion, the current study provided valuable information with regard to the cloning and expression of CaM3 in guinea pig hearts. These findings may be helpful for understanding the function of CaM3 and its possible role in cardiovascular diseases.
Small RNA Analysis in Sindbis Virus Infected Human HEK293 Cells

PubMed Central

Dalmay, Tamas; Powell, Penny P.

2013-01-01

Introduction: In contrast to the defence mechanism of RNA interference (RNAi) in plants and invertebrates, its role in the innate response to virus infection of mammals is a matter of debate. Since RNAi has a well-established role in controlling infection of the alphavirus Sindbis virus (SINV) in insects, we have used this virus to investigate the role of RNAi in SINV infection of human cells. Results: SINV AR339 and TR339-GFP were adapted to grow in HEK293 cells. Deep sequencing of small RNAs (sRNAs) early in SINV infection (4 and 6 hpi) showed low abundance (0.8%) of viral sRNAs (vsRNAs), with no size, sequence or location specific patterns characteristic of Dicer products, nor did they possess any discernible pattern to ascribe to a specific RNAi biogenesis pathway. This was supported by multiple variants for each sequence, and lack of hot spots along the viral genome sequence. The abundance of the best defined vsRNAs was below the limit of Northern blot detection. The adaptation of the virus to HEK293 cells showed few sequence changes compared to the reference; however, a SNP in the E1 gene with a preference from G to C was found. Deep sequencing results showed little variation of expression of cellular microRNAs (miRNAs) at 4 and 6 hpi compared to uninfected cells. Twelve miRNAs exhibiting some minor differential expression by sequencing showed no difference in expression by Northern blot analysis. Conclusions: We show that, unlike SINV infection of invertebrates, generation of Dicer-dependent vsRNAs and change in expression of cellular miRNAs were not detected as part of the human response to SINV. PMID:24391886
Differential expression of copper-zinc superoxide dismutase gene of Polygonum sibiricum leaves, stems and underground stems, subjected to high-salt stress.

PubMed

Qu, Chun-Pu; Xu, Zhi-Ru; Liu, Guan-Jun; Liu, Chun; Li, Yang; Wei, Zhi-Gang; Liu, Gui-Feng

2010-01-01

In aerobic organisms, protection against oxidative damage involves the combined action of highly specialized antioxidant enzymes, such as copper-zinc superoxide dismutase. In this work, a cDNA clone which encodes a copper-zinc superoxide dismutase gene, named PS-CuZnSOD, has been identified from P. sibiricum Laxm. by the rapid amplification of cDNA ends method (RACE). Analysis of the nucleotide sequence reveals that the PS-CuZnSOD gene cDNA clone consists of 669 bp, containing 87 bp in the 5' untranslated region; 459 bp in the open reading frame (ORF) encoding 152 amino acids; and 123 bp in the 3' untranslated region. The GenBank accession number of the nucleotide sequence is GQ472846. Sequence analysis indicates that the protein, like most plant superoxide dismutases (SOD), includes two conserved ecCuZnSOD signatures, from amino acids 43 to 51 and from amino acids 137 to 148, and that it has a signal peptide extension at the N-terminus (1-16 aa). Expression analysis by real-time quantitative PCR reveals that the PS-CuZnSOD gene is expressed in leaves, stems and underground stems. PS-CuZnSOD gene expression can be induced by 3% NaHCO3. The differing PS-CuZnSOD mRNA levels indicate that the gene has distinct expression patterns in leaves, stems and underground stems under salinity-alkalinity stress.
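The lengths quoted in the abstract above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are invented here, and it assumes that the reported ORF length includes the terminating stop codon, which the abstract does not state explicitly.

```python
# Figures quoted in the abstract above; the variable names are illustrative only.
utr5_bp = 87      # 5' untranslated region
orf_bp = 459      # open reading frame
utr3_bp = 123     # 3' untranslated region
total_bp = 669    # reported length of the cDNA clone
encoded_aa = 152  # reported number of encoded amino acids


def orf_codons(orf_length_bp):
    """Return the number of codons in an ORF, whose length must be a multiple of 3."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return orf_length_bp // 3


codons = orf_codons(orf_bp)
# Assumption: the ORF length counts the stop codon, which encodes no residue.
residues = codons - 1
parts_sum = utr5_bp + orf_bp + utr3_bp

print(parts_sum == total_bp, codons, residues == encoded_aa)
```

Under that assumption the three segments add up to the reported clone length, and 153 codons correspond to 152 residues plus a stop codon.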
OP17 MICRORNA PROFILING USING SMALL RNA-SEQ IN PAEDIATRIC LOW GRADE GLIOMAS

PubMed Central

Jeyapalan, Jennie N.; Jones, Tania A.; Tatevossian, Ruth G.; Qaddoumi, Ibrahim; Ellison, David W.; Sheer, Denise

2014-01-01

INTRODUCTION: MicroRNAs regulate gene expression by targeting mRNAs for translational repression or degradation at the post-transcriptional level. In paediatric low-grade gliomas a few key genetic mutations have been identified, including BRAF fusions, FGFR1 duplications and MYB rearrangements. Our aim in the current study is to profile aberrant microRNA expression in paediatric low-grade gliomas and determine the role of epigenetic changes in the aetiology and behaviour of these tumours. METHOD: MicroRNA profiling of tumour samples (6 pilocytic, 2 diffuse, 2 pilomyxoid astrocytomas) and normal brain controls (4 adult normal brain samples and a primary glial progenitor cell-line) was performed using small RNA sequencing. Bioinformatic analysis included sequence alignment, analysis of the number of reads (CPM, counts per million) and differential expression. RESULTS: Sequence alignment identified 695 microRNAs, whose expression was compared in tumours v. normal brain. PCA and hierarchical clustering showed separate groups for tumours and normal brain. Computational analysis identified approximately 400 differentially expressed microRNAs in the tumours compared to matched location controls. Our findings will then be validated and integrated with extensive genetic and epigenetic information we have previously obtained for the full tumour cohort. CONCLUSION: We have identified microRNAs that are differentially expressed in paediatric low-grade gliomas. As microRNAs are known to target genes involved in the initiation and progression of cancer, they provide critical information on tumour pathogenesis and are an important class of biomarkers.
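The counts-per-million (CPM) normalization mentioned in the abstract above scales each microRNA's read count by the library size. A minimal sketch follows; the microRNA names and read counts are hypothetical and are not data from the study, and the authors' actual pipeline is not described in the abstract.

```python
def counts_per_million(counts):
    """Scale raw read counts so that each library sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has no reads")
    return {name: count * 1_000_000 / total for name, count in counts.items()}


# Hypothetical read counts for one small RNA library (not data from the study).
example_library = {"miR-a": 500, "miR-b": 1500, "miR-c": 8000}
cpm = counts_per_million(example_library)

for name, value in cpm.items():
    print(name, value)
```

Because CPM removes differences in sequencing depth, values from libraries of different sizes become comparable before differential expression is tested.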
Integration of Next Generation Sequencing and EPR Analysis to Uncover Molecular Mechanism Underlying Shell Color Variation in Scallops

PubMed Central

Sun, Xiujun; Liu, Zhihong; Zhou, Liqing; Wu, Biao; Dong, Yinghui; Yang, Aiguo

2016-01-01

The Yesso scallop Patinopecten yessoensis displays polymorphism in shell colors, which is of great interest for the scallop industry. To identify genes involved in the shell coloration, in the present study, we investigate the transcriptome differences by Illumina digital gene expression (DGE) analysis in two extreme color phenotypes, Red and White. Illumina sequencing yields a total of 62,715,364 clean sequence reads, and more than 85% of reads are mapped to our previously sequenced transcriptome. There are 25 significantly differentially expressed genes between Red and White scallops. EPR (electron paramagnetic resonance) analysis has identified EPR spectra of pheomelanin and eumelanin in the red shells, but not in the white shells. Compared to the Red scallops, the White scallops have relatively higher mRNA expression of tyrosinase genes, but lower expression of other melanogenesis-associated genes. Meanwhile, the relatively lower tyrosinase protein and decreased tyrosinase activity in White scallops are suggested to be associated with the lack of melanin in the white shells. Our findings highlight the functional roles of melanogenesis-associated genes in the melanization process of scallop shells, and shed new light on the transcriptional and post-transcriptional mechanisms in the regulation of tyrosinase activity during the process of melanin synthesis. The present results will assist our molecular understanding of melanin synthesis underlying shell color polymorphism in scallops, as well as other bivalves, and also help the color-based breeding in shellfish aquaculture. PMID:27563719
An integrated expression atlas of miRNAs and their promoters in human and mouse

PubMed Central

de Rie, Derek; Abugessaisa, Imad; Alam, Tanvir; Arner, Erik; Arner, Peter; Ashoor, Haitham; Åström, Gaby; Babina, Magda; Bertin, Nicolas; Burroughs, A. Maxwell; Carlisle, Ailsa J.; Daub, Carsten O.; Detmar, Michael; Deviatiiarov, Ruslan; Fort, Alexandre; Gebhard, Claudia; Goldowitz, Daniel; Guhl, Sven; Ha, Thomas J.; Harshbarger, Jayson; Hasegawa, Akira; Hashimoto, Kosuke; Herlyn, Meenhard; Heutink, Peter; Hitchens, Kelly J.; Hon, Chung Chau; Huang, Edward; Ishizu, Yuri; Kai, Chieko; Kasukawa, Takeya; Klinken, Peter; Lassmann, Timo; Lecellier, Charles-Henri; Lee, Weonju; Lizio, Marina; Makeev, Vsevolod; Mathelier, Anthony; Medvedeva, Yulia A.; Mejhert, Niklas; Mungall, Christopher J.; Noma, Shohei; Ohshima, Mitsuhiro; Okada-Hatakeyama, Mariko; Persson, Helena; Rizzu, Patrizia; Roudnicky, Filip; Sætrom, Pål; Sato, Hiroki; Severin, Jessica; Shin, Jay W.; Swoboda, Rolf K.; Tarui, Hiroshi; Toyoda, Hiroo; Vitting-Seerup, Kristoffer; Winteringham, Louise; Yamaguchi, Yoko; Yasuzawa, Kayoko; Yoneda, Misako; Yumoto, Noriko; Zabierowski, Susan; Zhang, Peter G.; Wells, Christine A.; Summers, Kim M.; Kawaji, Hideya; Sandelin, Albin; Rehli, Michael; Hayashizaki, Yoshihide; Carninci, Piero; Forrest, Alistair R. R.; de Hoon, Michiel J. L.

2018-01-01

MicroRNAs (miRNAs) are short non-coding RNAs with key roles in cellular regulation. As part of the fifth edition of the Functional Annotation of Mammalian Genome (FANTOM5) project, we created an integrated expression atlas of miRNAs and their promoters by deep-sequencing 492 short RNA (sRNA) libraries, with matching Cap Analysis Gene Expression (CAGE) data, from 396 human and 47 mouse RNA samples. Promoters were identified for 1,357 human and 804 mouse miRNAs and showed strong sequence conservation between species. We also found that primary and mature miRNA expression levels were correlated, allowing us to use the primary miRNA measurements as a proxy for mature miRNA levels in a total of 1,829 human and 1,029 mouse CAGE libraries. We thus provide a broad atlas of miRNA expression and promoters in primary mammalian cells, establishing a foundation for detailed analysis of miRNA expression patterns and transcriptional control regions. PMID:28829439
Generation and Analysis of a Large-Scale Expressed Sequence Tag Database from a Full-Length Enriched cDNA Library of Developing Leaves of Gossypium hirsutum L

PubMed Central

Pang, Chaoyou; Fan, Shuli; Song, Meizhen; Yu, Shuxun

2013-01-01

Background: Cotton (Gossypium hirsutum L.) is one of the world’s most economically important crops. However, its entire genome has not been sequenced, and limited resources are available in GenBank for understanding the molecular mechanisms underlying leaf development and senescence. Methodology/Principal Findings: In this study, 9,874 high-quality ESTs were generated from a normalized, full-length cDNA library derived from pooled RNA isolated throughout leaf development during the plant blooming stage. After clustering and assembly of these ESTs, 5,191 unique sequences, representing 1,652 contigs and 3,539 singletons, were obtained. The average unique sequence length was 682 bp. Annotation of these unique sequences revealed that 84.4% showed significant homology to sequences in the NCBI non-redundant protein database, and 57.3% had significant hits to known proteins in the Swiss-Prot database. Comparative analysis indicated that our library added 2,400 ESTs and 991 unique sequences to those known for cotton. The unigenes were functionally characterized by gene ontology annotation. We identified 1,339 and 200 unigenes as potential leaf senescence-related genes and transcription factors, respectively. Moreover, nine genes related to leaf senescence and eleven MYB transcription factors were randomly selected for quantitative real-time PCR (qRT-PCR), which revealed that these genes were regulated differentially during senescence. The qRT-PCR for three GhYLSs revealed that these genes are expressed preferentially in senescent leaves. Conclusions/Significance: These EST resources will provide valuable sequence information for gene expression profiling analyses and functional genomics studies to elucidate their roles, as well as for studying the mechanisms of leaf development and senescence in cotton and discovering candidate genes related to important agronomic traits of cotton. These data will also facilitate future whole-genome sequence assembly and annotation in G. hirsutum and comparative genomics among Gossypium species. PMID:24146870
Diversity, expression and mRNA targeting abilities of Argonaute-targeting miRNAs among selected vascular plants.

PubMed

Jagtap, Soham; Shivaprasad, Padubidri V

2014-12-02

MicroRNAs (miRNAs) are important regulators of plant development. Across plant lineages, Dicer-like 1 (DCL1) proteins process long ds-like structures to produce miRNA duplexes in a stepwise manner. These miRNAs are incorporated into Argonaute (AGO) proteins and influence expression of RNAs that have sequence complementarity with miRNAs. Expression levels of AGOs are greatly regulated by plants in order to minimize unwarranted perturbations using miRNAs to target mRNAs coding for AGOs. AGOs may also have high promoter specificity; sometimes expression of AGO can be limited to just a few cells in a plant. Viral pathogens utilize various means to counter antiviral roles of AGOs, including hijacking the host-encoded miRNAs to target AGOs. Two host-encoded miRNAs, namely miR168 and miR403, that target AGOs have been described in the model plant Arabidopsis, and such a mechanism is thought to be well conserved across plants because AGO sequences are well conserved. We show that the interaction between AGO mRNAs and miRNAs is species-specific due to the diversity in sequences of the two miRNAs that target AGOs, sequence diversity among corresponding target regions in AGO mRNAs and variable expression levels of these miRNAs among vascular plants. We used miRNA sequences from 68 plant species representing 31 plant families for this analysis. Sequences of miR168 and miR403 are not conserved among plant lineages, but surprisingly they differ drastically in their sequence diversity and expression levels even among closely related plants. Variation in miR168 expression among plants correlates well with secondary structures/length of loop sequences of their precursors. Our data indicate a complex AGO targeting interaction among plant lineages due to miRNA sequence diversity and sequences of miRNA targeting regions among AGO mRNAs, thus leading to the assumption that the perturbations by viruses that use host miRNAs to target antiviral AGOs can only be species-specific. We also show that rapid evolution and likely loss of expression of miR168 isoforms in tobacco is related to the insertion of MITE-like transposons between miRNA and miRNA* sequences, a possible mechanism showing how miRNAs are lost in a few plant lineages even though other close relatives have abundantly expressed miRNAs.

Sequence analysis of dolphin ferritin H and L subunits and possible iron-dependent translational control of dolphin ferritin gene

PubMed Central

Takaesu, Azusa; Watanabe, Kiyotaka; Takai, Shinji; Sasaki, Yukako; Orino, Koichi

2008-01-01

Background: The iron-storage protein ferritin plays a central role in iron metabolism. Ferritin has a dual function: to store iron and to segregate iron for protection against iron-catalyzed reactive oxygen species. Tissue ferritin is composed of two kinds of subunits (H: heavy chain or heart-type subunit; L: light chain or liver-type subunit). Ferritin gene expression is controlled at the translational level in an iron-dependent manner or at the transcriptional level in an iron-independent manner. However, sequencing analysis of marine mammalian ferritin subunits has not yet been performed fully. The purpose of this study is to reveal cDNA-derived amino acid sequences of cetacean ferritin H and L subunits, and to demonstrate the possibility of iron-dependent expression of these subunits, especially the H subunit. Methods: Sequence analyses of cetacean ferritin H and L subunits were performed by direct sequencing of polymerase chain reaction (PCR) fragments from cDNAs generated via reverse transcription-PCR of leukocyte total RNA prepared from blood samples of six different dolphin species (Pseudorca crassidens, Lagenorhynchus obliquidens, Grampus griseus, Globicephala macrorhynchus, Tursiops truncatus, and Delphinapterus leucas). The putative iron-responsive element sequence in the 5'-untranslated region of the six different dolphin species was revealed by direct sequencing of PCR fragments obtained using leukocyte genomic DNA. Results: Dolphin H and L subunits consist of 182 and 174 amino acids, respectively, and amino acid sequence identities of ferritin subunits among these dolphins are highly conserved (H: 99–100%, (99→98) ; L: 98–100%). The conserved 28 bp IRE sequence was located -144 bp upstream from the initiation codon in the six different dolphin species. Conclusion: These results indicate that six different dolphin species have conserved ferritin sequences, and suggest that these genes are iron-dependently expressed. PMID:18954429
Sustained expression of MCP-1 by low wall shear stress loading concomitant with turbulent flow on endothelial cells of intracranial aneurysm.

PubMed

Aoki, Tomohiro; Yamamoto, Kimiko; Fukuda, Miyuki; Shimogonya, Yuji; Fukuda, Shunichi; Narumiya, Shuh

2016-05-09

Enlargement of a pre-existing intracranial aneurysm is a well-established risk factor of rupture. Excessive low wall shear stress concomitant with turbulent flow in the dome of an aneurysm may contribute to progression and rupture. However, how stress conditions regulate enlargement of a pre-existing aneurysm remains to be elucidated. Wall shear stress was calculated with 3D-computational fluid dynamics simulation using three cases of unruptured intracranial aneurysm. The resulting value, 0.017 Pa at the dome, was much lower than that in the parent artery. We loaded wall shear stress corresponding to the value and also turbulent flow to the primary culture of endothelial cells. We then obtained gene expression profiles by RNA sequence analysis. RNA sequence analysis detected hundreds of differentially expressed genes among groups. Gene ontology and pathway analysis identified signaling related with cell division/proliferation as overrepresented in the low wall shear stress-loaded group, which was further augmented by the addition of turbulent flow. Moreover, expression of some chemoattractants for inflammatory cells, including MCP-1, was upregulated under low wall shear stress with concomitant turbulent flow. We further examined the temporal sequence of expressions of factors identified in an in vitro study using a rat model. No proliferative cells were detected, but MCP-1 expression was induced and sustained in the endothelial cell layer. Low wall shear stress concomitant with turbulent flow contributes to sustained expression of MCP-1 in endothelial cells and presumably plays a role in facilitating macrophage infiltration and exacerbating inflammation, which leads to enlargement or rupture.
An ethylene-responsive enhancer element is involved in the senescence-related expression of the carnation glutathione-S-transferase (GST1) gene.

PubMed

Itzhaki, H; Maxson, J M; Woodson, W R

1994-09-13

The increased production of ethylene during carnation petal senescence regulates the transcription of the GST1 gene encoding a subunit of glutathione-S-transferase. We have investigated the molecular basis for this ethylene-responsive transcription by examining the cis elements and trans-acting factors involved in the expression of the GST1 gene. Transient expression assays following delivery of GST1 5' flanking DNA fused to a beta-glucuronidase reporter gene were used to functionally define sequences responsible for ethylene-responsive expression. Deletion analysis of the 5' flanking sequences of GST1 identified a single positive regulatory element of 197 bp between -667 and -470 necessary for ethylene-responsive expression. The sequences within this ethylene-responsive region were further localized to 126 bp between -596 and -470. The ethylene-responsive element (ERE) within this region conferred ethylene-regulated expression upon a minimal cauliflower mosaic virus-35S TATA-box promoter in an orientation-independent manner. Gel electrophoresis mobility-shift assays and DNase I footprinting were used to identify proteins that bind to sequences within the ERE. Nuclear proteins from carnation petals were shown to specifically interact with the 126-bp ERE, and the presence and binding of these proteins were independent of ethylene or petal senescence. DNase I footprinting defined DNA sequences between -510 and -488 within the ERE specifically protected by bound protein. An 8-bp sequence (ATTTCAAA) within the protected region shares significant homology with promoter sequences required for ethylene responsiveness from the tomato fruit-ripening E4 gene.
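The element sizes reported in the abstract above follow directly from the promoter coordinates, and an orientation-independent element can be located by scanning both strands for the 8-bp motif. The sketch below illustrates both points; the helper names and the example sequence are hypothetical, and the sequence is not the carnation GST1 promoter.

```python
def region_length(start, end):
    """Length in bp spanned between two promoter coordinates (e.g. -667 and -470)."""
    return abs(end - start)


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    complement = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(complement[base] for base in reversed(seq))


def find_motif_both_strands(seq, motif):
    """Return (0-based position, strand) for each exact motif match on either strand."""
    hits = []
    rc_motif = reverse_complement(motif)
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc_motif:
            hits.append((i, "-"))
    return hits


# Coordinates quoted in the abstract.
print(region_length(-667, -470), region_length(-596, -470))

# Hypothetical sequence for illustration only (not the GST1 5' flanking region).
example_seq = "GGCATTTCAAAGGTTTGAAATCC"
print(find_motif_both_strands(example_seq, "ATTTCAAA"))
```

Scanning the reverse complement as well as the forward motif mirrors the observation that the element works in either orientation.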
An expressed sequence tag (EST) data mining strategy succeeding in the discovery of new G-protein coupled receptors.

PubMed

Wittenberger, T; Schaller, H C; Hellebrand, S

2001-03-30

We have developed a comprehensive expressed sequence tag database search method and used it for the identification of new members of the G-protein coupled receptor superfamily. Our approach proved to be especially useful for the detection of expressed sequence tag sequences that do not encode conserved parts of a protein, making it an ideal tool for the identification of members of divergent protein families or of protein parts without conserved domain structures in the expressed sequence tag database. At least 14 of the expressed sequence tags found with this strategy are promising candidates for new putative G-protein coupled receptors. Here, we describe the sequence and expression analysis of five new members of this receptor superfamily, namely GPR84, GPR86, GPR87, GPR90 and GPR91. We also studied the genomic structure and chromosomal localization of the respective genes applying in silico methods. A cluster of six closely related G-protein coupled receptors was found on the human chromosome 3q24-3q25. It consists of four orphan receptors (GPR86, GPR87, GPR91, and H963), the purinergic receptor P2Y1, and the uridine 5'-diphosphoglucose receptor KIAA0001. It seems likely that these receptors evolved from a common ancestor and therefore might have related ligands. In conclusion, we describe a data mining procedure that proved to be useful for the identification and first characterization of new genes and is well applicable for other gene families. Copyright 2001 Academic Press.
New approach for the study of mite reproduction: The first transcriptome analysis of a mite, Phytoseiulus persimilis (Acari: Phytoseiidae).

PubMed

Cabrera, Ana R; Donohue, Kevin V; Khalil, Sayed M S; Scholl, Elizabeth; Opperman, Charles; Sonenshine, Daniel E; Roe, R Michael

2011-01-01

Many species of mites and ticks are of agricultural and medical importance. Much can be learned from the study of transcriptomes of acarines which can generate DNA-sequence information of potential target genes for the control of acarine pests. High throughput transcriptome sequencing can also yield sequences of genes critical during physiological processes poorly understood in acarines, i.e., the regulation of female reproduction in mites. The predatory mite, Phytoseiulus persimilis, was selected to conduct a transcriptome analysis using 454 pyrosequencing. The objective of this project was to obtain DNA-sequence information of expressed genes from P. persimilis with special interest in sequences corresponding to vitellogenin (Vg) and the vitellogenin receptor (VgR). These genes are critical to the understanding of vitellogenesis, and they will facilitate the study of the regulation of mite female reproduction. A total of 12,556 contiguous sequences (contigs) were assembled with an average size of 935bp. From these sequences, the putative translated peptides of 11 contigs were similar in amino acid sequences to other arthropod Vgs, while 6 were similar to VgRs. We selected some of these sequences to conduct stage-specific expression studies to further determine their function. 2010 Elsevier Ltd. All rights reserved.
Genome-wide discovery of novel and conserved microRNAs in white shrimp (Litopenaeus vannamei).

PubMed

Xi, Qian-Yun; Xiong, Yuan-Yan; Wang, Yuan-Mei; Cheng, Xiao; Qi, Qi-En; Shu, Gang; Wang, Song-Bo; Wang, Li-Na; Gao, Ping; Zhu, Xiao-Tong; Jiang, Qing-Yan; Zhang, Yong-Liang; Liu, Li

2015-01-01

In recent years, a large number of conserved and species-specific microRNAs (miRNAs) have been identified in species that are economically important but lack a full genome sequence. In this study, Solexa deep sequencing and cross-species miRNA microarray were used to detect miRNAs in white shrimp. We identified 239 conserved miRNAs, 14 miRNA* sequences and 20 novel miRNAs by bioinformatics analysis from 7,561,406 high-quality reads representing 325,370 distinct sequences. All 20 novel miRNAs were specific to white shrimp, with no homologs in other species. Using the conserved miRNAs from the miRBase database as a query set to search for homologs from shrimp expressed sequence tags (ESTs), 32 conserved computationally predicted miRNAs were discovered in shrimp. In addition, using microarray analysis in shrimp fed with Panax ginseng polysaccharide complex, 151 conserved miRNAs were identified, of which 18 were significantly up-regulated and 49 were significantly down-regulated. qRT-PCR analysis was also performed for nine miRNAs in three shrimp tissues (muscle, gill and hepatopancreas), and showed that expression of these miRNAs is tissue-specific. Combining results of the three methods, we detected 20 novel and 394 conserved miRNAs. Verification with quantitative reverse transcription PCR (qRT-PCR) and Northern blot showed that the data were highly reliable. The study provides the first comprehensive miRNA profile of white shrimp, which includes useful information for future investigations into the function of miRNAs in regulation of shrimp development and immunology.
Selection and Validation of Reference Genes for Quantitative Real-Time PCR in Buckwheat (Fagopyrum esculentum) Based on Transcriptome Sequence Data

PubMed Central

Demidenko, Natalia V.; Logacheva, Maria D.; Penin, Aleksey A.

2011-01-01

Quantitative reverse transcription PCR (qRT-PCR) is one of the most precise and widely used methods of gene expression analysis. A necessary prerequisite of exact and reliable data is the accurate choice of reference genes. We studied the expression stability of potential reference genes in common buckwheat (Fagopyrum esculentum) in order to find the optimal reference for gene expression analysis in this economically important crop. The recently sequenced buckwheat floral transcriptome was used as the source of sequence information. Expression stability of eight candidate reference genes was assessed in different plant structures (leaves and inflorescences at two stages of development and fruits). These genes are the orthologs of Arabidopsis genes identified as stable in a genome-wide survey of gene expression stability, plus the traditionally used housekeeping gene GAPDH. Three software applications (geNorm, NormFinder and BestKeeper) were used to estimate expression stability and provided congruent results. The orthologs of AT4G33380 (expressed protein of unknown function, Expressed1), AT2G28390 (SAND family protein, SAND) and AT5G46630 (clathrin adapter complex subunit family protein, CACS) were identified as the most stable. We recommend using the combination of Expressed1, SAND and CACS for the normalization of gene expression data in studies on buckwheat using qRT-PCR. These genes are among the five most stably expressed in Arabidopsis, which emphasizes the utility of studies on model plants as a framework for other species. PMID:21589908
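To give a sense of how expression stability can be scored, the sketch below computes a pairwise-variation measure of the kind commonly associated with geNorm: for each candidate gene, the average standard deviation of its log2 expression ratios against every other candidate. This is a simplified illustration rather than the geNorm program itself, and the gene names and relative quantities are hypothetical, not data from the study.

```python
import math
import statistics


def stability_m(expression):
    """Return a stability score per gene; lower means more stable.

    expression maps each gene name to a list of relative quantities,
    with all lists in the same sample order.
    """
    genes = list(expression)
    m_values = {}
    for gene in genes:
        pairwise_sds = []
        for other in genes:
            if other == gene:
                continue
            # Log2 ratio of the two genes in every sample.
            log_ratios = [
                math.log2(a / b)
                for a, b in zip(expression[gene], expression[other])
            ]
            pairwise_sds.append(statistics.stdev(log_ratios))
        m_values[gene] = sum(pairwise_sds) / len(pairwise_sds)
    return m_values


# Hypothetical relative quantities across four samples (not data from the study).
example = {
    "refA": [1.0, 2.0, 4.0, 8.0],
    "refB": [2.0, 4.0, 8.0, 16.0],   # always twice refA, so the ratio is constant
    "refC": [1.0, 8.0, 2.0, 16.0],   # varies independently of refA and refB
}
m = stability_m(example)
print(sorted(m, key=m.get))
```

In this toy data set refA and refB keep a constant ratio and therefore receive lower scores than refC, which is the behaviour expected of a good pair of reference genes.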
Characterization of transcriptome dynamics during watermelon fruit development: sequencing, assembly, annotation and gene expression profiles

PubMed Central

2011-01-01

Background Cultivated watermelon [Citrullus lanatus (Thunb.) Matsum. & Nakai var. lanatus] is an important agriculture crop world-wide. The fruit of watermelon undergoes distinct stages of development with dramatic changes in its size, color, sweetness, texture and aroma. In order to better understand the genetic and molecular basis of these changes and significantly expand the watermelon transcript catalog, we have selected four critical stages of watermelon fruit development and used Roche/454 next-generation sequencing technology to generate a large expressed sequence tag (EST) dataset and a comprehensive transcriptome profile for watermelon fruit flesh tissues. Results We performed half Roche/454 GS-FLX run for each of the four watermelon fruit developmental stages (immature white, white-pink flesh, red flesh and over-ripe) and obtained 577,023 high quality ESTs with an average length of 302.8 bp. De novo assembly of these ESTs together with 11,786 watermelon ESTs collected from GenBank produced 75,068 unigenes with a total length of approximately 31.8 Mb. Overall 54.9% of the unigenes showed significant similarities to known sequences in GenBank non-redundant (nr) protein database and around two-thirds of them matched proteins of cucumber, the most closely-related species with a sequenced genome. The unigenes were further assigned with gene ontology (GO) terms and mapped to biochemical pathways. More than 5,000 SSRs were identified from the EST collection. Furthermore we carried out digital gene expression analysis of these ESTs and identified 3,023 genes that were differentially expressed during watermelon fruit development and ripening, which provided novel insights into watermelon fruit biology and a comprehensive resource of candidate genes for future functional analysis. We then generated profiles of several interesting metabolites that are important to fruit quality including pigmentation and sweetness. 
Integrative analysis of metabolite and digital gene expression profiles helped elucidate molecular mechanisms governing these important quality-related traits during watermelon fruit development. Conclusion We have generated a large collection of watermelon ESTs, which represents a significant expansion of the current transcript catalog of watermelon and a valuable resource for future studies on the genomics of watermelon and other closely-related species. Digital expression analysis of this EST collection allowed us to identify a large set of genes that were differentially expressed during watermelon fruit development and ripening, which provide a rich source of candidates for future functional analysis and represent a valuable increase in our knowledge base of watermelon fruit biology. PMID:21936920
Characterization of transcriptome dynamics during watermelon fruit development: sequencing, assembly, annotation and gene expression profiles.

PubMed

Guo, Shaogui; Liu, Jingan; Zheng, Yi; Huang, Mingyun; Zhang, Haiying; Gong, Guoyi; He, Hongju; Ren, Yi; Zhong, Silin; Fei, Zhangjun; Xu, Yong

2011-09-21

Cultivated watermelon [Citrullus lanatus (Thunb.) Matsum. & Nakai var. lanatus] is an important agricultural crop worldwide. The fruit of watermelon undergoes distinct stages of development with dramatic changes in its size, color, sweetness, texture and aroma. In order to better understand the genetic and molecular basis of these changes and significantly expand the watermelon transcript catalog, we have selected four critical stages of watermelon fruit development and used Roche/454 next-generation sequencing technology to generate a large expressed sequence tag (EST) dataset and a comprehensive transcriptome profile for watermelon fruit flesh tissues. We performed a half Roche/454 GS-FLX run for each of the four watermelon fruit developmental stages (immature white, white-pink flesh, red flesh and over-ripe) and obtained 577,023 high quality ESTs with an average length of 302.8 bp. De novo assembly of these ESTs together with 11,786 watermelon ESTs collected from GenBank produced 75,068 unigenes with a total length of approximately 31.8 Mb. Overall 54.9% of the unigenes showed significant similarities to known sequences in the GenBank non-redundant (nr) protein database and around two-thirds of them matched proteins of cucumber, the most closely-related species with a sequenced genome. The unigenes were further assigned gene ontology (GO) terms and mapped to biochemical pathways. More than 5,000 SSRs were identified from the EST collection. Furthermore we carried out digital gene expression analysis of these ESTs and identified 3,023 genes that were differentially expressed during watermelon fruit development and ripening, which provided novel insights into watermelon fruit biology and a comprehensive resource of candidate genes for future functional analysis. We then generated profiles of several interesting metabolites that are important to fruit quality including pigmentation and sweetness. 
Integrative analysis of metabolite and digital gene expression profiles helped elucidate the molecular mechanisms governing these important quality-related traits during watermelon fruit development. We have generated a large collection of watermelon ESTs, which represents a significant expansion of the current transcript catalog of watermelon and a valuable resource for future studies on the genomics of watermelon and other closely-related species. Digital expression analysis of this EST collection allowed us to identify a large set of genes that were differentially expressed during watermelon fruit development and ripening, which provide a rich source of candidates for future functional analysis and represent a valuable increase in our knowledge base of watermelon fruit biology.
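The abstract reports more than 5,000 SSRs (simple sequence repeats) in the EST collection but does not name the detection tool or its thresholds. As a minimal sketch only, the Python below scans a sequence for perfect di-, tri- and tetranucleotide repeats with a regular expression. The function name `find_ssrs`, the minimum repeat counts and the toy sequence are illustrative assumptions, not the authors' pipeline.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return (0-based start, motif, repeat count) for perfect SSRs in seq."""
    # Assumed thresholds: motif length -> minimum number of tandem copies.
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}
    seq = seq.upper()
    hits = []
    for unit_len, min_n in sorted(min_repeats.items()):
        # One captured motif followed by at least (min_n - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT").
            if any(motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return hits

# Toy sequence (hypothetical): an (AT)7 repeat and a (GAA)5 repeat.
toy = "GGC" + "AT" * 7 + "CCG" + "GAA" * 5 + "TT"
print(find_ssrs(toy))  # [(3, 'AT', 7), (20, 'GAA', 5)]
```

Real SSR surveys usually also handle compound and imperfect repeats, which this sketch ignores.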
Molecular and phylogenetic characterization of the homoeologous EPSP Synthase genes of allohexaploid wheat, Triticum aestivum (L.).

PubMed

Aramrak, Attawan; Kidwell, Kimberlee K; Steber, Camille M; Burke, Ian C

2015-10-23

5-Enolpyruvylshikimate-3-phosphate synthase (EPSPS) is the sixth and penultimate enzyme in the shikimate biosynthesis pathway, and is the target of the herbicide glyphosate. The EPSPS genes of allohexaploid wheat (Triticum aestivum, AABBDD) have not been well characterized. Herein, the three homoeologous copies of the allohexaploid wheat EPSPS gene were cloned and characterized. Genomic and coding DNA sequences of EPSPS from the three related genomes of allohexaploid wheat were isolated using PCR and inverse PCR approaches from soft white spring 'Louise'. Development of genome-specific primers allowed the mapping and expression analysis of TaEPSPS-7A1, TaEPSPS-7D1, and TaEPSPS-4A1 on chromosomes 7A, 7D, and 4A, respectively. Sequence alignments of cDNA sequences from wheat and wheat relatives served as a basis for phylogenetic analysis. The three genomic copies of wheat EPSPS differed by insertion/deletion and single nucleotide polymorphisms (SNPs), largely in intron sequences. RT-PCR analysis and cDNA cloning revealed that EPSPS is expressed from all three genomic copies. However, TaEPSPS-4A1 is expressed at much lower levels than TaEPSPS-7A1 and TaEPSPS-7D1 in wheat seedlings. Phylogenetic analysis of 1190-bp cDNA clones from wheat and wheat relatives revealed that: 1) TaEPSPS-7A1 is most similar to EPSPS from the tetraploid AB genome donor, T. turgidum (99.7 % identity); 2) TaEPSPS-7D1 most resembles EPSPS from the diploid D genome donor, Aegilops tauschii (100 % identity); and 3) TaEPSPS-4A1 resembles EPSPS from the diploid B genome relative, Ae. speltoides (97.7 % identity). Thus, EPSPS sequences in allohexaploid wheat are preserved from the two most recent ancestors. The wheat EPSPS genes are more closely related to Lolium multiflorum and Brachypodium distachyon than to Oryza sativa (rice). 
The three related EPSPS homoeologues of wheat exhibited conservation of the exon/intron structure and of coding region sequence, but contained significant sequence variation within intron regions. The genome-specific primers developed will enable future characterization of natural and induced variation in EPSPS sequence and expression. This can be useful in investigating new causes of glyphosate herbicide resistance.
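The identity values quoted above (99.7 %, 100 %, 97.7 %) compare aligned cDNA sequences. The abstract does not say how identity was computed, so the following is only a sketch of one common definition: matching positions divided by aligned positions, with gap columns excluded. The function name `percent_identity` and the toy sequences are assumptions for illustration.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where either sequence has a gap ('-') are ignored; this gap
    handling is an assumption, and other definitions give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)

# Toy example (hypothetical sequences): 9 of 10 positions match.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))  # 90.0
```

Under this definition, and assuming all 1190 positions align without gaps, 99.7 % identity would correspond to roughly three or four nucleotide differences.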
Digital transcriptome analysis of putative sex-determination genes in papaya (Carica papaya).

PubMed

Urasaki, Naoya; Tarora, Kazuhiko; Shudo, Ayano; Ueno, Hiroki; Tamaki, Moritoshi; Miyagi, Norimichi; Adaniya, Shinichi; Matsumura, Hideo

2012-01-01

Papaya (Carica papaya) is a trioecious plant species that has male, female and hermaphrodite flowers on different plants. The primitive sex chromosomes genetically determine the sex of the papaya. Although draft sequences of the papaya genome are already available, the genes for sex determination have not been identified, likely due to the complicated structure of its sex-chromosome sequences. To identify the candidate genes for sex determination, we conducted a transcriptome analysis of flower samples from male, female and hermaphrodite plants using high-throughput SuperSAGE for digital gene expression analysis. Among the short sequence tags obtained from the transcripts, 312 unique tags were specifically mapped to the primitive sex chromosome (X or Y(h)) sequences. An annotation analysis revealed that retroelements are the most abundant sequences observed in the genes corresponding to these tags. The majority of tags on the sex chromosomes were located on the X chromosome, and only 30 tags were commonly mapped to both the X and Y(h) chromosome, implying a loss of many genes on the Y(h) chromosome. Nevertheless, candidate Y(h) chromosome-specific female determination genes, including a MADS-box gene, were identified. Information on these sex chromosome-specific expressed genes will help elucidate sex determination in the papaya.
Digital Transcriptome Analysis of Putative Sex-Determination Genes in Papaya (Carica papaya)

PubMed Central

Urasaki, Naoya; Tarora, Kazuhiko; Shudo, Ayano; Ueno, Hiroki; Tamaki, Moritoshi; Miyagi, Norimichi; Adaniya, Shinichi; Matsumura, Hideo

2012-01-01

Papaya (Carica papaya) is a trioecious plant species that has male, female and hermaphrodite flowers on different plants. The primitive sex chromosomes genetically determine the sex of the papaya. Although draft sequences of the papaya genome are already available, the genes for sex determination have not been identified, likely due to the complicated structure of its sex-chromosome sequences. To identify the candidate genes for sex determination, we conducted a transcriptome analysis of flower samples from male, female and hermaphrodite plants using high-throughput SuperSAGE for digital gene expression analysis. Among the short sequence tags obtained from the transcripts, 312 unique tags were specifically mapped to the primitive sex chromosome (X or Yh) sequences. An annotation analysis revealed that retroelements are the most abundant sequences observed in the genes corresponding to these tags. The majority of tags on the sex chromosomes were located on the X chromosome, and only 30 tags were commonly mapped to both the X and Yh chromosome, implying a loss of many genes on the Yh chromosome. Nevertheless, candidate Yh chromosome-specific female determination genes, including a MADS-box gene, were identified. Information on these sex chromosome-specific expressed genes will help elucidate sex determination in the papaya. PMID:22815863
Phenotype classification of single cells using SRS microscopy, RNA sequencing, and microfluidics (Conference Presentation)

NASA Astrophysics Data System (ADS)

Streets, Aaron M.; Cao, Chen; Zhang, Xiannian; Huang, Yanyi

2016-03-01

Phenotype classification of single cells reveals biological variation that is masked in ensemble measurement. This heterogeneity is found in gene and protein expression as well as in cell morphology. Many techniques are available to probe phenotypic heterogeneity at the single cell level, for example quantitative imaging and single-cell RNA sequencing, but it is difficult to perform multiple assays on the same single cell. In order to directly track correlation between morphology and gene expression at the single cell level, we developed a microfluidic platform for quantitative coherent Raman imaging and immediate RNA sequencing (RNA-Seq) of single cells. With this device we actively sort and trap cells for analysis with stimulated Raman scattering microscopy (SRS). The cells are then processed in parallel pipelines for lysis, and preparation of cDNA for high-throughput transcriptome sequencing. SRS microscopy offers three-dimensional imaging with chemical specificity for quantitative analysis of protein and lipid distribution in single cells. Meanwhile, the microfluidic platform facilitates single-cell manipulation, minimizes contamination, and furthermore, provides improved RNA-Seq detection sensitivity and measurement precision, which is necessary for differentiating biological variability from technical noise. By combining coherent Raman microscopy with RNA sequencing, we can better understand the relationship between cellular morphology and gene expression at the single-cell level.
Preparing and Analyzing Expressed Sequence Tags (ESTs) Library for the Mammary Tissue of Local Turkish Kivircik Sheep

PubMed Central

Omeroglu Ulu, Zehra; Ulu, Salih; Un, Cemal; Ozdem Oztabak, Kemal; Altunatmaz, Kemal

2017-01-01

The Kivircik is an important local Turkish sheep breed, valued for its meat quality and milk productivity. The aim of this study was to analyze gene expression profiles of both prenatal and postnatal stages for the Kivircik sheep. Therefore, two different cDNA libraries, which were taken from the same Kivircik sheep mammary gland tissue at prenatal and postnatal stages, were constructed. A total of 3072 colonies randomly selected from the two libraries were sequenced to develop a sheep EST collection. We used the Phred/Phrap programs to analyze the raw ESTs, and readable EST sequences were assembled with the CAP3 software. Putative functions of all unique sequences and statistical analysis were determined by Geneious software. A total of 422 ESTs had over 80% similarity to known sequences of other organisms in NCBI and were classified by Gene Ontology (GO) category using the Panther database. By comparing gene expression profiles, we observed some putative genes that may be related to reproductive performance or play important roles in milk synthesis and secretion. A total of 2414 ESTs have been deposited in the NCBI GenBank database (GW996847–GW999260). EST data in this study have provided a new source of information for functional genome studies of sheep. PMID:28239610
Identification of Delta5-fatty acid desaturase from the cellular slime mold Dictyostelium discoideum.

PubMed

Saito, T; Ochiai, H

1999-10-01

cDNA fragments putatively encoding amino acid sequences characteristic of the fatty acid desaturase were obtained using expressed sequence tag (EST) information of the Dictyostelium cDNA project. Using this sequence, we have determined the cDNA sequence and genomic sequence of a desaturase. The cloned cDNA is 1489 nucleotides long and the deduced amino acid sequence comprised 464 amino acid residues containing an N-terminal cytochrome b5 domain. The whole sequence was 38.6% identical to the initially identified Delta5-desaturase of Mortierella alpina. We have confirmed its function as Delta5-desaturase by overexpression mutation in D. discoideum and also by gain-of-function mutation in the yeast Saccharomyces cerevisiae. Analysis of the lipids from transformed D. discoideum and yeast demonstrated the accumulation of Delta5-desaturated products. This is the first report concerning fatty acid desaturase in cellular slime molds.
The unique C- and N-terminal sequences of Metallothionein isoform 3 mediate growth inhibition and Vectorial active transport in MCF-7 cells.

PubMed

Voels, Brent; Wang, Liping; Sens, Donald A; Garrett, Scott H; Zhang, Ke; Somji, Seema

2017-05-25

The third isoform of the metallothionein (MT3) gene family has been shown to be overexpressed in most ductal breast cancers. A previous study has shown that the stable transfection of MCF-7 cells with the MT3 gene inhibits cell growth. The goal of the present study was to determine the role of the unique C-terminal and N-terminal sequences of MT3 on phenotypic properties and gene expression profiles of MCF-7 cells. MCF-7 cells were transfected with various metallothionein gene constructs which contain the insertion or the removal of the unique MT3 C- and N-terminal domains. Global gene expression analysis was performed on the MCF-7 cells containing the various constructs and the expression of the unique C- and N-terminal domains of MT3 was correlated to phenotypic properties of the cells. The results of the present study demonstrate that the C-terminal sequence of MT3, in the absence of the N-terminal sequence, induces dome formation in MCF-7 cells, which in cell cultures is the phenotypic manifestation of a cell's ability to perform vectorial active transport. Global gene expression analysis demonstrated that the increased expression of the GAGE gene family correlated with dome formation. Expression of the C-terminal domain induced GAGE gene expression, whereas the N-terminal domain inhibited GAGE gene expression, and the inhibitory effect of the N-terminal domain was dominant over the effect of the C-terminal domain of MT3. Transfection with the metallothionein 1E gene increased the expression of GAGE genes. In addition, both the C- and the N-terminal sequences of the MT3 gene had growth inhibitory properties, which correlated to an increased expression of the interferon alpha-inducible protein 6. Our study shows that the C-terminal domain of MT3 confers dome formation in MCF-7 cells and the presence of this domain induces expression of the GAGE family of genes. 
The differential effects of MT3 and metallothionein 1E on the expression of GAGE genes suggests unique roles of these genes in the development and progression of breast cancer. The finding that interferon alpha-inducible protein 6 expression is associated with the ability of MT3 to inhibit growth needs further investigation.
Insights into rubber biosynthesis from transcriptome analysis of Hevea brasiliensis latex.

PubMed

Chow, Keng-See; Wan, Kiew-Lian; Isa, Mohd Noor Mat; Bahari, Azlina; Tan, Siang-Hee; Harikrishna, K; Yeang, Hoong-Yeet

2007-01-01

Hevea brasiliensis is the most widely cultivated species for commercial production of natural rubber (cis-polyisoprene). In this study, 10,040 expressed sequence tags (ESTs) were generated from the latex of the rubber tree, which represents the cytoplasmic content of a single cell type, in order to analyse the latex transcription profile with emphasis on rubber biosynthesis-related genes. A total of 3,441 unique transcripts (UTs) were obtained after quality editing and assembly of EST sequences. Functional classification of UTs according to the Gene Ontology convention showed that 73.8% were related to genes of unknown function. Among highly expressed ESTs, a significant proportion encoded proteins related to rubber biosynthesis and stress or defence responses. Sequences encoding rubber particle membrane proteins (RPMPs) belonging to three protein families accounted for 12% of the ESTs. Characterization of these ESTs revealed nine RPMP variants (7.9-27 kDa) including the 14 kDa REF (rubber elongation factor) and 22 kDa SRPP (small rubber particle protein). The expression of multiple RPMP isoforms in latex was shown using antibodies against REF and SRPP. Both EST and quantitative reverse transcription-PCR (QRT-PCR) analyses demonstrated REF and SRPP to be the most abundant transcripts in latex. Besides rubber biosynthesis, comparative sequence analysis showed that the RPMPs are highly similar to sequences in the plant kingdom having stress-related functions. Implications of the RPMP function in cis-polyisoprene biosynthesis in the context of transcript abundance and differential gene expression are discussed.
Characterization of a novel ADAM protease expressed by Pneumocystis carinii.

PubMed

Kennedy, Cassie C; Kottom, Theodore J; Limper, Andrew H

2009-08-01

Pneumocystis species are opportunistic fungal pathogens that cause severe pneumonia in immunocompromised hosts. Recent evidence has suggested that unidentified proteases are involved in Pneumocystis life cycle regulation. Proteolytically active ADAM (named for "a disintegrin and metalloprotease") family molecules have been identified in some fungal organisms, such as Aspergillus fumigatus and Schizosaccharomyces pombe, and some have been shown to participate in life cycle regulation. Accordingly, we sought to characterize ADAM-like molecules in the fungal opportunistic pathogen, Pneumocystis carinii (PcADAM). After an in silico search of the P. carinii genomic sequencing project identified a 329-bp partial sequence with homology to known ADAM proteins, the full-length PcADAM sequence was obtained by PCR extension cloning, yielding a final coding sequence of 1,650 bp. Sequence analysis detected the presence of a typical ADAM catalytic active site (HEXXHXXGXXHD). Expression of PcADAM over the Pneumocystis life cycle was analyzed by Northern blot. Southern and contour-clamped homogeneous electric field (CHEF) blot analysis demonstrated its presence in the P. carinii genome. Expression of PcADAM was observed to be increased in Pneumocystis cysts compared to trophic forms. The full-length gene was subsequently cloned and heterologously expressed in Saccharomyces cerevisiae. Purified PcADAMp protein was proteolytically active in casein zymography, requiring divalent zinc. Furthermore, native PcADAMp extracted directly from freshly isolated Pneumocystis organisms also exhibited protease activity. This is the first report of protease activity attributable to a specific, characterized protein in the clinically important opportunistic fungal pathogen Pneumocystis.
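The consensus HEXXHXXGXXHD quoted above, where X stands for any residue, maps directly onto a regular expression. The sketch below shows one way such a motif scan could look. The names `ADAM_MOTIF` and `find_active_sites` and the toy peptide are hypothetical; the abstract does not describe the software actually used.

```python
import re

# HEXXHXXGXXHD with X = any amino acid; '.' matches any single character.
ADAM_MOTIF = re.compile(r"HE.{2}H.{2}G.{2}HD")

def find_active_sites(protein):
    """Return (1-based start, matched residues) for each motif occurrence."""
    return [(m.start() + 1, m.group(0))
            for m in ADAM_MOTIF.finditer(protein.upper())]

# Toy peptide (hypothetical, not the PcADAM sequence).
toy = "MKTA" + "HELGHNLGMSHD" + "QWERTY"
print(find_active_sites(toy))  # [(5, 'HELGHNLGMSHD')]
```

Note that `finditer` reports non-overlapping matches only, which is sufficient for a 12-residue motif in most proteins.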
Transcriptional Activation Signals Found in the Epstein-Barr Virus (EBV) Latency C Promoter Are Conserved in the Latency C Promoter Sequences from Baboon and Rhesus Monkey EBV-Like Lymphocryptoviruses (Cercopithicine Herpesviruses 12 and 15)

PubMed Central

Fuentes-Pananá, Ezequiel M.; Swaminathan, Sankar; Ling, Paul D.

1999-01-01

The Epstein-Barr virus (EBV) EBNA2 protein is a transcriptional activator that controls viral latent gene expression and is essential for EBV-driven B-cell immortalization. EBNA2 is expressed from the viral C promoter (Cp) and regulates its own expression by activating Cp through interaction with the cellular DNA binding protein CBF1. Through regulation of Cp and EBNA2 expression, EBV controls the pattern of latent protein expression and the type of latency established. To gain further insight into the important regulatory elements that modulate Cp usage, we isolated and sequenced the Cp regions corresponding to nucleotides 10251 to 11479 of the EBV genome (−1079 to +144 relative to the transcription initiation site) from the EBV-like lymphocryptoviruses found in baboons (herpesvirus papio; HVP) and Rhesus macaques (RhEBV). Sequence comparison of the approximately 1,230-bp Cp regions from these primate viruses revealed that EBV and HVP Cp sequences are 64% conserved, EBV and RhEBV Cp sequences are 66% conserved, and HVP and RhEBV Cp sequences are 65% conserved relative to each other. Approximately 50% of the residues are conserved among all three sequences, yet all three viruses have retained response elements for glucocorticoids, two positionally conserved CCAAT boxes, and positionally conserved TATA boxes. The putative EBNA2 100-bp enhancers within these promoters contain 54 conserved residues, and the binding sites for CBF1 and CBF2 are well conserved. Cp usage in the HVP- and RhEBV-transformed cell lines was detected by S1 nuclease protection analysis. Transient-transfection analysis showed that promoters of both HVP and RhEBV are responsive to EBNA2 and that they bind CBF1 and CBF2 in gel mobility shift assays. These results suggest that similar mechanisms for regulation of latent gene expression are conserved among the EBV-related lymphocryptoviruses found in nonhuman primates. PMID:9847397
Transcriptional activation signals found in the Epstein-Barr virus (EBV) latency C promoter are conserved in the latency C promoter sequences from baboon and Rhesus monkey EBV-like lymphocryptoviruses (cercopithicine herpesviruses 12 and 15).

PubMed

Fuentes-Pananá, E M; Swaminathan, S; Ling, P D

1999-01-01

The Epstein-Barr virus (EBV) EBNA2 protein is a transcriptional activator that controls viral latent gene expression and is essential for EBV-driven B-cell immortalization. EBNA2 is expressed from the viral C promoter (Cp) and regulates its own expression by activating Cp through interaction with the cellular DNA binding protein CBF1. Through regulation of Cp and EBNA2 expression, EBV controls the pattern of latent protein expression and the type of latency established. To gain further insight into the important regulatory elements that modulate Cp usage, we isolated and sequenced the Cp regions corresponding to nucleotides 10251 to 11479 of the EBV genome (-1079 to +144 relative to the transcription initiation site) from the EBV-like lymphocryptoviruses found in baboons (herpesvirus papio; HVP) and Rhesus macaques (RhEBV). Sequence comparison of the approximately 1,230-bp Cp regions from these primate viruses revealed that EBV and HVP Cp sequences are 64% conserved, EBV and RhEBV Cp sequences are 66% conserved, and HVP and RhEBV Cp sequences are 65% conserved relative to each other. Approximately 50% of the residues are conserved among all three sequences, yet all three viruses have retained response elements for glucocorticoids, two positionally conserved CCAAT boxes, and positionally conserved TATA boxes. The putative EBNA2 100-bp enhancers within these promoters contain 54 conserved residues, and the binding sites for CBF1 and CBF2 are well conserved. Cp usage in the HVP- and RhEBV-transformed cell lines was detected by S1 nuclease protection analysis. Transient-transfection analysis showed that promoters of both HVP and RhEBV are responsive to EBNA2 and that they bind CBF1 and CBF2 in gel mobility shift assays. These results suggest that similar mechanisms for regulation of latent gene expression are conserved among the EBV-related lymphocryptoviruses found in nonhuman primates.

A microarray analysis of the effects of moderate hypothermia and rewarming on gene expression by human hepatocytes (HepG2).

PubMed

Sonna, Larry A; Kuhlmeier, Matthew M; Khatri, Purvesh; Chen, Dechang; Lilly, Craig M

2010-09-01

The gene expression changes produced by moderate hypothermia are not fully known, but appear to differ in important ways from those produced by heat shock. We examined the gene expression changes produced by moderate hypothermia and tested the hypothesis that rewarming after hypothermia approximates a heat-shock response. Six sets of human HepG2 hepatocytes were subjected to moderate hypothermia (31 °C for 16 h), a conventional in vitro heat shock (43 °C for 30 min) or control conditions (37 °C), then harvested immediately or allowed to recover for 3 h at 37 °C. Expression analysis was performed with Affymetrix U133A gene chips, using analysis of variance-based techniques. Moderate hypothermia led to distinct time-dependent expression changes, as did heat shock. Hypothermia initially caused statistically significant, greater than or equal to twofold changes in expression (relative to controls) of 409 sequences (143 increased and 266 decreased), whereas heat shock affected 71 (35 increased and 36 decreased). After 3 h of recovery, 192 sequences (83 increased, 109 decreased) were affected by hypothermia and 231 (146 increased, 85 decreased) by heat shock. Expression of many heat shock proteins was decreased by hypothermia but significantly increased after rewarming. A comparison of sequences affected by thermal stress without regard to the magnitude of change revealed that the overlap between heat and cold stress was greater after 3 h of recovery than immediately following thermal stress. Thus, while some overlap occurs (particularly after rewarming), moderate hypothermia produces extensive, time-dependent gene expression changes in HepG2 cells that differ in important ways from those induced by heat shock.
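The selection rule described above (statistically significant and at least a twofold change relative to controls) can be sketched as a simple filter. The function name `classify_change`, the significance cutoff `alpha=0.05` and the example intensities are assumptions, since the abstract gives no threshold or raw values. The counts in `reported` are taken from the abstract and are only checked for internal consistency.

```python
import math

def classify_change(treated, control, p_value, alpha=0.05, min_fold=2.0):
    """Label one probe as 'increased', 'decreased' or 'unchanged'.

    alpha is an assumed cutoff; min_fold=2.0 mirrors the twofold criterion.
    """
    log2_fc = math.log2(treated / control)
    if p_value >= alpha or abs(log2_fc) < math.log2(min_fold):
        return "unchanged"
    return "increased" if log2_fc > 0 else "decreased"

# Counts reported in the abstract: (increased, decreased, total).
reported = {
    "hypothermia, immediate": (143, 266, 409),
    "heat shock, immediate": (35, 36, 71),
    "hypothermia, 3 h recovery": (83, 109, 192),
    "heat shock, 3 h recovery": (146, 85, 231),
}
for label, (up, down, total) in reported.items():
    assert up + down == total, label  # each pair sums to its stated total

print(classify_change(200.0, 100.0, 0.01))  # increased
print(classify_change(40.0, 100.0, 0.001))  # decreased
print(classify_change(150.0, 100.0, 0.001))  # unchanged
```

Working in log2 space keeps increases and decreases symmetric, so a twofold drop and a twofold rise are treated alike.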
Identification and expression analysis of cDNA encoding insulin-like growth factor 2 in horses

PubMed Central

KIKUCHI, Kohta; SASAKI, Keisuke; AKIZAWA, Hiroki; TSUKAHARA, Hayato; BAI, Hanako; TAKAHASHI, Masashi; NAMBO, Yasuo; HATA, Hiroshi; KAWAHARA, Manabu

2017-01-01

Insulin-like growth factor 2 (IGF2) is responsible for a broad range of physiological processes during fetal development and adulthood, but genomic analyses of IGF2 containing the 5ʹ- and 3ʹ-untranslated regions (UTRs) in equines have been limited. In this study, we characterized the IGF2 mRNA containing the UTRs, and determined its expression pattern in the fetal tissues of horses. The complete equine IGF2 mRNA sequence harboring another exon approximately 2.8 kb upstream from the canonical transcription start site was identified as a new transcript variant. As this upstream exon did not contain the start codon, the amino acid sequence was identical to the canonical variant. Analysis of the deduced amino acid sequence revealed that the protein possessed two major domains, IlGF and IGF2_C, and analysis of IGF2 sequence polymorphism in fetal tissues of Hokkaido native horse and Thoroughbreds revealed a single nucleotide polymorphism (T to C transition) at position 398 in Thoroughbreds, which caused an amino acid substitution at position 133 in the IGF2 sequence. Furthermore, the expression pattern of the IGF2 mRNA in the fetal tissues of horses was determined for the first time, and was found to be consistent with those of other species. Taken together, these results suggested that the transcriptional and translational products of the IGF2 gene have conserved functions in the fetal development of mammals, including horses. PMID:29151450
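The link between nucleotide position 398 and amino acid position 133 follows from codon arithmetic, assuming position 398 is counted from the first base of the start codon (1-based coding-sequence numbering, which the abstract does not state explicitly). A minimal sketch, with the hypothetical helper name `codon_of`:

```python
def codon_of(nt_pos):
    """Map a 1-based coding-sequence position to (codon number, base in codon)."""
    if nt_pos < 1:
        raise ValueError("position must be 1 or greater")
    codon_number = (nt_pos - 1) // 3 + 1
    base_in_codon = (nt_pos - 1) % 3 + 1
    return codon_number, base_in_codon

print(codon_of(398))  # (133, 2)
```

Under that numbering assumption, the T to C transition falls on the second base of codon 133, which is consistent with the reported substitution at amino acid position 133.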
Promoter mapping of the mouse Tcp-10bt gene in transgenic mice identifies essential male germ cell regulatory sequences.

PubMed

Ewulonu, U K; Snyder, L; Silver, L M; Schimenti, J C

1996-03-01

Transgenic mice were generated to localize essential promoter elements in the mouse testis-expressed Tcp-10 genes. These genes are expressed exclusively in male germ cells, and exhibit a diffuse range of transcriptional start sites, possibly due to the absence of a TATA box. A series of transgene constructs containing different amounts of 5' flanking DNA revealed that all sequences necessary for appropriate temporal and tissue-specific transcription of Tcp-10 reside between positions -1 to -973. All transgenic animals containing these sequences expressed a chimeric transgene at high levels, in a pattern that paralleled the endogenous genes. These experiments further defined a 227 bp fragment from -746 to -973 that was absolutely essential for expression. In a gel-shift assay, this 227-bp fragment bound nuclear protein from testis, but not other tissues, to yield two retarded bands. Sequence analysis of this fragment revealed a half-site for the AP-2 transcription factor recognition sequence. Gel shift assays using native or mutant oligonucleotides demonstrated that the putative AP-2 recognition sequence was essential for generating the retarded bands. Since the binding activity is testis-specific, but AP-2 expression is not exclusive to male germ cells, it is possible that transcription of Tcp-10 requires interaction between AP-2 and a germ cell-specific transcription factor.
Analysis and functional annotation of expressed sequence tags from the fall armyworm Spodoptera frugiperda

PubMed Central

Deng, Youping; Dong, Yinghua; Thodima, Venkata; Clem, Rollie J; Passarelli, A Lorena

2006-01-01

Background: Little is known about the genome sequences of lepidopteran insects, although this group of insects has been studied extensively in the fields of endocrinology, development, immunity, and pathogen-host interactions. In addition, cell lines derived from Spodoptera frugiperda and other lepidopteran insects are routinely used for baculovirus foreign gene expression. This study reports the results of an expressed sequence tag (EST) sequencing project in cells from the lepidopteran insect S. frugiperda, the fall armyworm. Results: We have constructed an EST database using two cDNA libraries from the S. frugiperda-derived cell line, SF-21. The database consists of 2,367 ESTs which were assembled into 244 contigs and 951 singlets for a total of 1,195 unique sequences. Conclusion: S. frugiperda is an agriculturally important pest insect and genomic information will be instrumental for establishing initial transcriptional profiling and gene function studies, and for obtaining information about genes manipulated during infections by insect pathogens such as baculoviruses. PMID:17052344
Characterization of the synthesis and expression of the GTA-kinase from transformed and normal rodent cells.

PubMed

Kerr, M; Fischer, J E; Purushotham, K R; Gao, D; Nakagawa, Y; Maeda, N; Ghanta, V; Hiramoto, R; Chegini, N; Humphreys-Beher, M G

1994-08-02

The murine transformed cell line YC-8 and beta-adrenergic receptor agonist (isoproterenol) treated rat and mouse parotid gland acinar cells ectopically express cell surface beta 1-4 galactosyltransferase during active proliferation. This activity is dependent upon the expression of the GTA-kinase (p58) in these cells. Using total RNA, cDNA clones for the protein coding region of the kinase were isolated by reverse transcriptase-PCR cloning. DNA sequence analysis failed to show sequence differences with the normal homolog from mouse cells although Southern blot analysis of YC-8, and a second cell line KI81, indicated changes in the restriction enzyme digestion profile relative to murine cell lines which do not express cell surface galactosyltransferase. The rat cDNA clone from isoproterenol-treated salivary glands showed a high degree of protein and nucleic acid sequence homology to the GTA-kinase from both murine and human sources. Northern blot analysis of YC-8 and a control cell line LSTRA revealed the synthesis of a major 3.0 kb mRNA from both cell lines plus the unique expression of a 4.5 kb mRNA in the YC-8 cells. Reverse transcriptase-PCR of LSTRA and YC-8 confirmed the increased steady state levels of the GTA-kinase mRNA in YC-8. In the mouse, induction of cell proliferation by isoproterenol resulted in a 50-fold increase in steady state mRNA levels for the kinase over the low level of expression in quiescent cells. Expression of the rat 3' untranslated region in rat parotid cells in vitro led to an increased rate of DNA synthesis, cell number, and ectopic expression of cell surface galactosyltransferase in the sense orientation. Antisense expression or vector alone did not alter growth characteristics of acinar cells. A polyclonal antibody monospecific to a murine amino terminal peptide sequence revealed a uniform distribution of GTA-kinase over the cytoplasm of acinar and duct cells of control mouse parotid glands. 
However, upon growth stimulation, kinase was detected primarily in a perinuclear and nuclear immunostaining pattern. Western blot analysis confirmed a translocation from a cytoplasmic localization in both LSTRA and quiescent salivary cells to a membrane-associated localization in YC-8 and proliferating salivary cells.
Molecular cloning and functional analysis of MRLC2 in Tianfu, Boer, and Chengdu Ma goats.

PubMed

Xu, H G; Xu, G Y; Wan, L; Ma, J

2013-03-15

To determine the molecular basis of heterosis in goats, fluorescence quantitative polymerase chain reaction (PCR) was performed to investigate myosin-regulatory light chain 2 (MRLC2) gene expression in the longissimus dorsi muscle tissues of the Tianfu goat and its parents, the Boer and Chengdu Ma goats. The goat MRLC2 gene was differentially expressed in the crossbreed, and the purebred mRNA were isolated and identified using fluorescence quantitative reverse transcription-PCR (RT-PCR). The complete coding sequence of MRLC2 was obtained using the cDNA method, and the full-length coding sequence consisted of 513 bp encoding 172 amino acids. The EF-hand superfamily domain of the MRLC2 protein is well conserved in caprine and other animals. The deduced amino acid sequence of MRLC2 shared significant identity with MRLC2 from other mammals. Phylogenetic tree analysis revealed that the MRLC2 protein was closely related to MRLC2 in other mammals. Several predicted miRNA target sites were found in the coding sequence of caprine MRLC2 mRNA. Analysis by RT-PCR showed that MRLC2 mRNA was present in the heart, stomach, liver, spleen, lung, small intestine, kidney, leg muscle, abdominal muscle, and longissimus dorsi muscles. In particular, the high expression of MRLC2 mRNA was detected in the longissimus dorsi, leg muscle, abdominal muscle, stomach, and heart, but low levels of expression were also observed in the liver, spleen, lung, small intestine, and kidney. The expression of the MRLC2 gene was upregulated in the longissimus dorsi muscle of Boer and Tianfu goats, and it was moderately upregulated in Chengdu Ma goats.
Transcriptome Analysis of the Differentially Expressed Genes in the Male and Female Shrub Willows (Salix suchowensis)

PubMed Central

Liu, Jingjing; Yin, Tongming; Ye, Ning; Chen, Yingnan; Yin, Tingting; Liu, Min; Hassani, Danial

2013-01-01

Background: The dioecious system is relatively rare in plants. Shrub willow is an annual flowering dioecious woody plant, and possesses many characteristics that make it a good model for tracking the missing pieces of sex determination evolution. To gain a global view of the genes differentially expressed in the male and female shrub willows and to develop a database for further studies, we performed large-scale transcriptome sequencing of flower buds collected separately from the two sexes. Results: In total, 1,201,931 high quality reads were obtained, with an average length of 389 bp and a total length of 467.96 Mb. The ESTs were assembled into 29,048 contigs and 132,709 singletons. These unigenes were further functionally annotated by comparing their sequences to different protein and functional domain databases and were assigned Gene Ontology (GO) terms. A biochemical pathway database containing 291 predicted pathways was also created based on the annotations of the unigenes. Digital expression analysis identified 806 differentially expressed genes between the male and female flower buds. Of these, 33 were located on the incipient sex chromosome of Salicaceae, 12 of which might, on empirical grounds, be involved in plant sex determination. These genes merit special attention in future studies. Conclusions: In this study, a large number of EST sequences were generated from the flower buds of a male and a female shrub willow. We also reported the differentially expressed genes between the two sex-type flowers. This work provides valuable information and sequence resources for uncovering the sex determining genes and for future functional genomics analysis of Salicaceae spp. PMID:23560075
SPARTA: Simple Program for Automated reference-based bacterial RNA-seq Transcriptome Analysis.

PubMed

Johnson, Benjamin K; Scholz, Matthew B; Teal, Tracy K; Abramovitch, Robert B

2016-02-04

Many tools exist in the analysis of bacterial RNA sequencing (RNA-seq) transcriptional profiling experiments to identify differentially expressed genes between experimental conditions. Generally, the workflow includes quality control of reads, mapping to a reference, counting transcript abundance, and statistical tests for differentially expressed genes. In spite of the numerous tools developed for each component of an RNA-seq analysis workflow, easy-to-use bacterially oriented workflow applications to combine multiple tools and automate the process are lacking. With many tools to choose from for each step, the task of identifying a specific tool, adapting the input/output options to the specific use-case, and integrating the tools into a coherent analysis pipeline is not a trivial endeavor, particularly for microbiologists with limited bioinformatics experience. To make bacterial RNA-seq data analysis more accessible, we developed a Simple Program for Automated reference-based bacterial RNA-seq Transcriptome Analysis (SPARTA). SPARTA is a reference-based bacterial RNA-seq analysis workflow application for single-end Illumina reads. SPARTA is turnkey software that simplifies the process of analyzing RNA-seq data sets, making bacterial RNA-seq analysis a routine process that can be undertaken on a personal computer or in the classroom. The easy-to-install, complete workflow processes whole transcriptome shotgun sequencing data files by trimming reads and removing adapters, mapping reads to a reference, counting gene features, calculating differential gene expression, and, importantly, checking for potential batch effects within the data set. SPARTA outputs quality analysis reports, gene feature counts and differential gene expression tables and scatterplots. SPARTA provides an easy-to-use bacterial RNA-seq transcriptional profiling workflow to identify differentially expressed genes between experimental conditions. 
This software will enable microbiologists with limited bioinformatics experience to analyze their data and integrate next generation sequencing (NGS) technologies into the classroom. The SPARTA software and tutorial are available at sparta.readthedocs.org.
Comprehensive evaluation of AmpliSeq transcriptome, a novel targeted whole transcriptome RNA sequencing methodology for global gene expression analysis.

PubMed

Li, Wenli; Turner, Amy; Aggarwal, Praful; Matter, Andrea; Storvick, Erin; Arnett, Donna K; Broeckel, Ulrich

2015-12-16

Whole transcriptome sequencing (RNA-seq) represents a powerful approach for whole transcriptome gene expression analysis. However, RNA-seq carries a few limitations, e.g., the requirement of a significant amount of input RNA and complications caused by non-specific mapping of short reads. The Ion AmpliSeq Transcriptome Human Gene Expression Kit (AmpliSeq) was recently introduced by Life Technologies as a whole-transcriptome, targeted gene quantification kit to overcome these limitations of RNA-seq. To assess the performance of this new methodology, we performed a comprehensive comparison of AmpliSeq with RNA-seq using two well-established next-generation sequencing platforms (Illumina HiSeq and Ion Torrent Proton). We analyzed standard reference RNA samples and RNA samples obtained from human induced pluripotent stem cell derived cardiomyocytes (hiPSC-CMs). Using published data from two standard RNA reference samples, we observed a strong concordance of log2 fold change for all genes when comparing AmpliSeq to Illumina HiSeq (Pearson's r = 0.92) and Ion Torrent Proton (Pearson's r = 0.92). We used ROC, Matthews correlation coefficient and RMSD to determine the overall performance characteristics. All three statistical methods demonstrate AmpliSeq as a highly accurate method for differential gene expression analysis. Additionally, for genes with high abundance, AmpliSeq outperforms the two RNA-seq methods. When analyzing four closely related hiPSC-CM lines, we show that both AmpliSeq and RNA-seq capture similar global gene expression patterns consistent with known sources of variations. Our study indicates that AmpliSeq excels in the limiting areas of RNA-seq for gene expression quantification analysis. Thus, AmpliSeq stands as a very sensitive and cost-effective approach for very large scale gene expression analysis and mRNA marker screening with high accuracy.
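The concordance statistics named in this abstract (Pearson's r on log2 fold changes, RMSD, and the Matthews correlation coefficient) can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names and the toy fold-change values below are hypothetical, and the study's thresholds for calling a gene differentially expressed are not stated in the abstract.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def rmsd(x, y):
    """Root-mean-square deviation between paired values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from a 2x2 confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical log2 fold changes for the same five genes on two platforms.
platform_a = [2.0, -1.0, 0.5, 3.0, -2.5]
platform_b = [1.8, -1.2, 0.4, 2.7, -2.0]

print(round(pearson_r(platform_a, platform_b), 3))
print(round(rmsd(platform_a, platform_b), 3))
```

Pearson's r measures whether the two platforms rank and scale fold changes alike, RMSD measures the absolute disagreement, and MCC summarizes agreement once each gene has been called differentially expressed or not.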
Characterization and Expression Analysis of Receptor for Activated C Kinase from Silk-producing Insect Antheraea pernyi.

PubMed

Zhu, Bao-Jian; Yu, Hao; Tian, Sen; Dai, Li-Shang; Sun, Yu; Liu, Chao-Liang

2016-01-01

The receptor for activated C kinase (RACK) is an important scaffold protein with regulatory functions in cells. However, its role in the immune response of Antheraea pernyi to pathogen challenge remains unclear. To investigate the biological functions of RACK in the wild silkworm A. pernyi, cloning was performed and the expression patterns of the RACK gene were analyzed. Sequence analysis revealed that the RACK gene was 1120 bp long and contained a 960-bp open reading frame. The deduced RACK protein sequence shares high identity with its homologs from other insects. SDS-PAGE and western blot analysis demonstrated successful expression of a 36-kDa recombinant RACK protein in Escherichia coli. The titer of a rabbit-raised antibody against recombinant RACK protein was about 1:20000, as determined by ELISA. Real-time PCR analysis showed that RACK expression was higher in fat bodies than in other examined A. pernyi tissues. The expression of RACK mRNA in fat bodies of fifth larvae of A. pernyi was clearly induced after nucleopolyhedrovirus, E. coli or Beauveria bassiana challenge. However, the expression patterns of RACK differed in response to these pathogens. Our data suggest that RACK may play a role in the innate immune responses of A. pernyi.
Cloning, Phylogenetic Analysis, and Distribution of Free Fatty Acid Receptor GPR120 Expression along the Gastrointestinal Tract of Housing versus Grazing Kid Goats.

PubMed

Ran, Tao; Li, Hengzhi; Liu, Yong; Zhou, Chuanshe; Tang, Shaoxun; Han, Xuefeng; Wang, Min; He, Zhixiong; Kang, Jinghe; Yan, Qiongxian; Tan, Zhiliang; Beauchemin, Karen A

2016-03-23

G-protein-coupled receptor 120 (GPR120) is reported as a long-chain fatty acid (LCFA) receptor that elicits free fatty acid (FFA) regulation on metabolism homeostasis. The study aimed to clone the gpr120 gene of goats (g-GPR120) and subsequently investigate phylogenetic analysis and tissue distribution throughout the digestive tracts of kid goats, as well as the effect of housing versus grazing (H vs G) feeding systems on GPR120 expression. Partial coding sequence (CDS) of g-GPR120 was cloned and submitted to NCBI (accession no. KU161270). Phylogenetic analysis revealed that g-GPR120 shared higher homology in both mRNA and amino acid sequences for ruminants than nonruminants. Immunochemistry, real-time PCR, and Western blot analysis showed that g-GPR120 was expressed throughout the digestive tracts of goats. The expression of g-GPR120 was affected by feeding system and age, with greater expression of g-GPR120 in the G group. It was concluded that the g-GPR120-mediated LCFA chemosensing mechanism is widely present in the tongue and gastrointestinal tract of goats and that its expression can be affected by feeding system and age.
Transcriptome Profiling of Chironomus kiinensis under Phenol Stress Using Solexa Sequencing Technology

PubMed Central

Cao, Chuanwang; Wang, Zhiying; Niu, Changying; Desneux, Nicolas; Gao, Xiwu

2013-01-01

Phenol is a major pollutant in aquatic ecosystems due to its chemical stability, water solubility and environmental mobility. To date, little is known about the molecular modifications of invertebrates under phenol stress. In the present study, we used Solexa sequencing technology to investigate the transcriptome and differentially expressed genes (DEGs) of midges (Chironomus kiinensis) in response to phenol stress. A total of 51,518,972 and 51,150,832 clean reads in the phenol-treated and control libraries, respectively, were obtained and assembled into 51,014 non-redundant (Nr) consensus sequences. A total of 6,032 unigenes were classified by Gene Ontology (GO), and 18,366 unigenes were categorized into 238 Kyoto Encyclopedia of Genes and Genomes (KEGG) categories. These genes included representatives from almost all functional categories. A total of 10,724 differentially expressed genes (P value <0.05) were detected in a comparative analysis of the expression profiles between phenol-treated and control C. kiinensis including 8,390 upregulated and 2,334 downregulated genes. The expression levels of 20 differentially expressed genes were confirmed by real-time RT-PCR, and the trends in gene expression that were observed matched the Solexa expression profiles, although the magnitude of the variations was different. Through pathway enrichment analysis, significantly enriched pathways were identified for the DEGs, including metabolic pathways, aryl hydrocarbon receptor (AhR), pancreatic secretion and neuroactive ligand-receptor interaction pathways, which may be associated with the phenol responses of C. kiinensis. Using Solexa sequencing technology, we identified several groups of key candidate genes as well as important biological pathways involved in the molecular modifications of chironomids under phenol stress. PMID:23527048
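The split of differentially expressed genes into upregulated and downregulated sets described in this abstract can be sketched as follows. This is an illustrative sketch only: the abstract gives the significance cutoff (P value < 0.05) but not the fold-change criterion or the statistical test, so the function, its input layout and the toy values are assumptions.

```python
def classify_degs(results, p_cutoff=0.05):
    """Split genes into upregulated and downregulated lists.

    results maps a gene id to (log2 fold change treated/control, p-value).
    Genes with p-value >= p_cutoff are treated as not differentially expressed.
    """
    up, down = [], []
    for gene, (log2_fc, p_value) in results.items():
        if p_value >= p_cutoff:
            continue  # not significant at the chosen cutoff
        if log2_fc > 0:
            up.append(gene)
        elif log2_fc < 0:
            down.append(gene)
    return sorted(up), sorted(down)

# Hypothetical toy values, not data from the study.
toy_results = {
    "gene1": (2.1, 0.001),
    "gene2": (-1.4, 0.03),
    "gene3": (0.2, 0.40),
    "gene4": (3.0, 0.049),
}
up, down = classify_degs(toy_results)
print(up, down)

# Consistency check on the counts reported in the abstract.
reported_up, reported_down = 8390, 2334
print(reported_up + reported_down)  # 10724, the reported total of DEGs
```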
Characterization of the glutathione S-transferase gene family through ESTs and expression analyses within common and pigmented cultivars of Citrus sinensis (L.) Osbeck

PubMed Central

2014-01-01

Background Glutathione S-transferases (GSTs) represent a ubiquitous gene family encoding detoxification enzymes able to recognize reactive electrophilic xenobiotic molecules as well as compounds of endogenous origin. Anthocyanin pigments require GSTs for their transport into the vacuole since their cytoplasmic retention is toxic to the cell. Anthocyanin accumulation in Citrus sinensis (L.) Osbeck fruit flesh determines different phenotypes affecting the typical pigmentation of Sicilian blood oranges. In this paper we describe: i) the characterization of the GST gene family in C. sinensis through a systematic EST analysis; ii) the validation of the EST assembly by exploiting the genome sequences of C. sinensis and C. clementina and their genome annotations; iii) GST gene expression profiling in six tissues/organs and in two different sweet orange cultivars, Cadenera (common) and Moro (pigmented). Results We identified 61 GST transcripts, described the full- or partial-length nature of the sequences and assigned to each sequence the GST class membership exploiting a comparative approach and the classification scheme proposed for plant species. A total of 23 full-length sequences were defined. Fifty-four of the 61 transcripts were successfully aligned to the C. sinensis and C. clementina genomes. Tissue specific expression profiling demonstrated that the expression of some GST transcripts was 'tissue-affected' and cultivar specific. A comparative analysis of C. sinensis GSTs with those from other plant species was also considered. Data from the current analysis are accessible at http://biosrv.cab.unina.it/citrusGST/, with the aim to provide a reference resource for C. sinensis GSTs. Conclusions This study aimed at the characterization of the GST gene family in C. sinensis. 
Based on expression patterns from two different cultivars and on sequence-comparative analyses, we also highlighted that two sequences, a Phi class GST and a Mapeg class GST, could be involved in the conjugation of anthocyanin pigments and in their transport into the vacuole, specifically in fruit flesh of the pigmented cultivar. PMID:24490620
Cloning of an ADP-ribosylation factor gene from banana (Musa acuminata) and its expression patterns in postharvest ripening fruit.

PubMed

Wang, Yuan; Wu, Jing; Xu, Bi-Yu; Liu, Ju-Hua; Zhang, Jian-Bin; Jia, Cai-Hong; Jin, Zhi-Qiang

2010-08-15

A full-length cDNA encoding an ADP-ribosylation factor (ARF) from banana (Musa acuminata) fruit was cloned and named MaArf. It contains an open reading frame encoding a 181-amino-acid polypeptide. Sequence analysis showed that MaArf shared high similarity with ARF of other plant species. The genomic sequence of MaArf was also obtained using polymerase chain reaction (PCR). Sequence analysis showed that MaArf was a split gene containing five exons and four introns in genomic DNA. Reverse-transcriptase PCR was used to analyze the spatial expression of MaArf. The results showed that MaArf was expressed in all the organs examined: root, rhizome, leaf, flower and fruit. Real-time quantitative PCR was used to explore expression patterns of MaArf in postharvest banana. There was differential expression of MaArf associated with ethylene biosynthesis. In naturally ripened banana, expression of MaArf was in accordance with ethylene biosynthesis. However, in 1-methylcyclopropene-treated banana, the expression of MaArf was inhibited and changed little. When treated with ethylene, MaArf expression in banana fruit significantly increased in accordance with ethylene biosynthesis; the peak of MaArf was 3 d after harvest, 11 d earlier than for naturally ripened banana fruits. These results suggest that MaArf is induced by ethylene in regulating postharvest banana ripening. Finally, subcellular localization assays showed the MaArf protein in the cytoplasm. Copyright 2010 Elsevier GmbH. All rights reserved.
Isolation, cDNA cloning and gene expression of an antibacterial protein from larvae of the coconut rhinoceros beetle, Oryctes rhinoceros.

PubMed

Yang, J; Yamamoto, M; Ishibashi, J; Taniai, K; Yamakawa, M

1998-08-01

An antibacterial protein, designated rhinocerosin, was purified to homogeneity from larvae of the coconut rhinoceros beetle, Oryctes rhinoceros immunized with Escherichia coli. Based on the amino acid sequence of the N-terminal region, a degenerate primer was synthesized and reverse-transcriptase PCR was performed to clone rhinocerosin cDNA. As a result, a 279-bp fragment was obtained. The complete nucleotide sequence was determined by sequencing the extended rhinocerosin cDNA clone by 5' rapid amplification of cDNA ends. The deduced amino acid sequence of the mature portion of rhinocerosin was composed of 72 amino acids without cysteine residues and was shown to be rich in glycine (11.1%) and proline (11.1%) residues. Comparison of the deduced amino acid sequence of rhinocerosin with those of other antibacterial proteins indicated that it has 77.8% and 44.6% identity with holotricin 2 and coleoptrecin, respectively. Rhinocerosin had strong antibacterial activity against E. coli, Streptococcus pyogenes, Staphylococcus aureus but not against Pseudomonas aeruginosa. Results of reverse-transcriptase PCR analysis of gene expression in different tissues indicated that the rhinocerosin gene is strongly expressed in the fat body and the Malpighian tubule, and weakly expressed in hemocytes and midgut. In addition, gene expression was inducible by bacteria in the fat body, the Malpighian tubule and hemocyte but constitutive expression was observed in the midgut.
A Systematic Analysis of the Structures of Heterologously Expressed Proteins and Those from Their Native Hosts in the RCSB PDB Archive.

PubMed

Zhou, Ren-Bin; Lu, Hui-Meng; Liu, Jie; Shi, Jian-Yu; Zhu, Jing; Lu, Qin-Qin; Yin, Da-Chuan

2016-01-01

Recombinant expression of proteins has become an indispensable tool in modern day research. The large yields of recombinantly expressed proteins accelerate the structural and functional characterization of proteins. Nevertheless, the literature reports that recombinant proteins show some differences in structure and function compared with the native ones. More than 100,000 structures (from both recombinant and native sources) are now publicly available in the Protein Data Bank (PDB) archive, which makes it possible to investigate whether any proteins in the RCSB PDB archive have identical sequences but differ in structure. In this paper, we present the results of a systematic comparative study of the 3D structures of identical naturally purified versus recombinantly expressed proteins. The structural data and sequence information of the proteins were mined from the RCSB PDB archive. The combinatorial extension (CE), FATCAT-flexible and TM-Align methods were employed to align the protein structures. The root-mean-square distance (RMSD), TM-score, P-value, Z-score, secondary structural elements and hydrogen bonds were used to assess the structure similarity. A thorough analysis of the PDB archive generated 517 pairs of native and recombinant proteins that have identical sequence. There were no pairs of proteins that had the same sequence and a significantly different structural fold, which supports the hypothesis that proteins expressed in a heterologous host usually fold correctly into their native forms.
A Systematic Analysis of the Structures of Heterologously Expressed Proteins and Those from Their Native Hosts in the RCSB PDB Archive

PubMed Central

Zhou, Ren-Bin; Lu, Hui-Meng; Liu, Jie; Shi, Jian-Yu; Zhu, Jing; Lu, Qin-Qin; Yin, Da-Chuan

2016-01-01

Recombinant expression of proteins has become an indispensable tool in modern day research. The large yields of recombinantly expressed proteins accelerate the structural and functional characterization of proteins. Nevertheless, the literature reports that recombinant proteins show some differences in structure and function compared with the native ones. More than 100,000 structures (from both recombinant and native sources) are now publicly available in the Protein Data Bank (PDB) archive, which makes it possible to investigate whether any proteins in the RCSB PDB archive have identical sequences but differ in structure. In this paper, we present the results of a systematic comparative study of the 3D structures of identical naturally purified versus recombinantly expressed proteins. The structural data and sequence information of the proteins were mined from the RCSB PDB archive. The combinatorial extension (CE), FATCAT-flexible and TM-Align methods were employed to align the protein structures. The root-mean-square distance (RMSD), TM-score, P-value, Z-score, secondary structural elements and hydrogen bonds were used to assess the structure similarity. A thorough analysis of the PDB archive generated 517 pairs of native and recombinant proteins that have identical sequence. There were no pairs of proteins that had the same sequence and a significantly different structural fold, which supports the hypothesis that proteins expressed in a heterologous host usually fold correctly into their native forms. PMID:27517583
Cloning, expression, and sequence analysis of the Bacillus methanolicus C1 methanol dehydrogenase gene.

PubMed Central

de Vries, G E; Arfman, N; Terpstra, P; Dijkhuizen, L

1992-01-01

The gene (mdh) coding for methanol dehydrogenase (MDH) of thermotolerant, methylotroph Bacillus methanolicus C1 has been cloned and sequenced. The deduced amino acid sequence of the mdh gene exhibited similarity to those of five other alcohol dehydrogenase (type III) enzymes, which are distinct from the long-chain zinc-containing (type I) or short-chain zinc-lacking (type II) enzymes. Highly efficient expression of the mdh gene in Escherichia coli was probably driven from its own promoter sequence. After purification of MDH from E. coli, the kinetic and biochemical properties of the enzyme were investigated. The physiological effect of MDH synthesis in E. coli and the role of conserved sequence patterns in type III alcohol dehydrogenases have been analyzed and are discussed. PMID:1644761
Molecular cloning and sequence analysis of two carbonic anhydrase in the swimming crab Portunus trituberculatus and its expression in response to salinity and pH stress.

PubMed

Pan, Luqing; Hu, Dongxu; Liu, Maoqi; Hu, Yanyan; Liu, Shengnan

2016-01-15

Carbonic anhydrase (CA) is involved in ion transport, acid-base balance and pH regulation by catalyzing the interconversion of CO2 and HCO3(-). In this study, full-length cDNA sequences of two CA isoforms were identified from Portunus trituberculatus. One was Portunus trituberculatus cytoplasmic carbonic anhydrase (PtCAc) and the other was Portunus trituberculatus glycosyl-phosphatidylinositol-linked carbonic anhydrase (PtCAg). The PtCAc sequence contained an ORF of 816 bp, encoding a protein of 30.18 kDa. The PtCAg sequence contained an ORF of 927 bp, encoding a protein of 34.09 kDa. The deduced amino acid sequences of the two CA isoforms were compared to other crustacean CA sequences. Both showed high conservation of the residues and domains essential to the function of the two enzymes. Expression of PtCAc and PtCAg was detected in gill, muscle, hepatopancreas, hemocytes and gonad. PtCAc and PtCAg gene expression was studied under salinity and pH challenge. The results showed that when salinity decreased (30 to 20 ppt), the mRNA expression of PtCAc increased significantly at 24 and 48 h, with the highest value at 24 h. The mRNA expression of PtCAg followed the same pattern as PtCAc. However, when salinity increased (30 to 35 ppt), only the mRNA expression of PtCAc increased significantly at 48 h. When pH changed, only the mRNA expression of PtCAc increased significantly at 12 h, which was under the low pH condition. The mRNA expression of PtCAg increased significantly at 12-48 h, and there was no significant difference in expression between the pH-challenged group and the control group at the other time points. The results provide a basis for understanding CA function and the underlying mechanism of the response to environmental changes in crustaceans. Copyright © 2015 Elsevier B.V. All rights reserved.
Quantification of differential gene expression by multiplexed targeted resequencing of cDNA

PubMed Central

Arts, Peer; van der Raadt, Jori; van Gestel, Sebastianus H.C.; Steehouwer, Marloes; Shendure, Jay; Hoischen, Alexander; Albers, Cornelis A.

2017-01-01

Whole-transcriptome or RNA sequencing (RNA-Seq) is a powerful and versatile tool for functional analysis of different types of RNA molecules, but sample reagent and sequencing cost can be prohibitive for hypothesis-driven studies where the aim is to quantify differential expression of a limited number of genes. Here we present an approach for quantification of differential mRNA expression by targeted resequencing of complementary DNA using single-molecule molecular inversion probes (cDNA-smMIPs) that enable highly multiplexed resequencing of cDNA target regions of ∼100 nucleotides and counting of individual molecules. We show that accurate estimates of differential expression can be obtained from molecule counts for hundreds of smMIPs per reaction and that smMIPs are also suitable for quantification of relative gene expression and allele-specific expression. Compared with low-coverage RNA-Seq and a hybridization-based targeted RNA-Seq method, cDNA-smMIPs are a cost-effective high-throughput tool for hypothesis-driven expression analysis in large numbers of genes (10 to 500) and samples (hundreds to thousands). PMID:28474677
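The "counting of individual molecules" mentioned in this abstract can be illustrated with a minimal sketch. It assumes that each read carries a probe identifier and a random molecular tag, so that reads sharing both are PCR copies of one captured molecule; the read layout, names and toy values are hypothetical, and error-tolerant tag matching is not attempted.

```python
from collections import defaultdict

def count_molecules(reads):
    """Count unique molecules per probe.

    reads is an iterable of (probe_id, molecular_tag) pairs. Reads with the
    same probe and tag are collapsed, so PCR duplicates are counted once.
    """
    tags_per_probe = defaultdict(set)
    for probe_id, molecular_tag in reads:
        tags_per_probe[probe_id].add(molecular_tag)
    return {probe: len(tags) for probe, tags in tags_per_probe.items()}

# Hypothetical reads: two are duplicates of the same captured molecule.
toy_reads = [
    ("geneA_probe1", "ACGT"),
    ("geneA_probe1", "ACGT"),  # PCR duplicate, not counted again
    ("geneA_probe1", "TTGA"),
    ("geneB_probe1", "CCAA"),
]
molecule_counts = count_molecules(toy_reads)
print(molecule_counts)
```

Differential expression would then be estimated from these molecule counts rather than from raw read counts, which limits the effect of amplification bias.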

Mapping analysis of scaffold/matrix attachment regions (s/MARs) from two different mammalian cell lines

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pilus, Nur Shazwani Mohd; Ahmad, Azrin; Yusof, Nurul Yuziana Mohd

Scaffold/matrix attachment regions (S/MARs) are potential elements that can be integrated into expression vectors to increase expression of recombinant protein. Many studies on S/MARs have been done, but none has revealed the distribution of S/MARs in a genome. In this study, we have isolated S/MAR sequences from HEK293 and Chinese hamster ovary cell lines (CHO DG44) using two different methods utilizing 2 M NaCl and lithium-3,5-diiodosalicylate (LIS). The isolated S/MARs were sequenced using a Next Generation Sequencing (NGS) platform. Based on reference mapping analysis against the human genome database, a total of 8,994,856 and 8,412,672 contigs of S/MAR sequences were retrieved from 2 M NaCl and LIS extraction of HEK293, respectively. On the other hand, reference mapping analysis of S/MARs derived from CHO DG44 against our own CHO DG44 database generated a total of 7,204,348 and 4,672,913 contigs from the 2 M NaCl and LIS extraction methods, respectively.
Comprehensive analysis of single molecule sequencing-derived complete genome and whole transcriptome of Hyposidra talaca nuclear polyhedrosis virus.

PubMed

Nguyen, Thong T; Suryamohan, Kushal; Kuriakose, Boney; Janakiraman, Vasantharajan; Reichelt, Mike; Chaudhuri, Subhra; Guillory, Joseph; Divakaran, Neethu; Rabins, P E; Goel, Ridhi; Deka, Bhabesh; Sarkar, Suman; Ekka, Preety; Tsai, Yu-Chih; Vargas, Derek; Santhosh, Sam; Mohan, Sangeetha; Chin, Chen-Shan; Korlach, Jonas; Thomas, George; Babu, Azariah; Seshagiri, Somasekar

2018-06-12

We sequenced the Hyposidra talaca NPV (HytaNPV) double stranded circular DNA genome using PacBio single molecule sequencing technology. We found that the HytaNPV genome is 139,089 bp long with a GC content of 39.6%. It encodes 141 open reading frames (ORFs) including the 37 baculovirus core genes, 25 genes conserved among lepidopteran baculoviruses, 72 genes known in baculovirus, and 7 genes unique to the HytaNPV genome. It is a group II alphabaculovirus that codes for the F protein and lacks the gp64 gene found in group I alphabaculoviruses. Using RNA-seq, we confirmed the expression of the ORFs identified in the HytaNPV genome. Phylogenetic analysis showed HytaNPV to be closest to BusuNPV, SujuNPV and EcobNPV that infect other tea pests, Buzura suppressaria, Sucra jujuba, and Ectropis obliqua, respectively. We identified repeat elements and a conserved non-coding baculovirus element in the genome. Analysis of the putative promoter sequences identified motifs consistent with the temporal expression of the genes observed in the RNA-seq data.
A large scale analysis of cDNA in Arabidopsis thaliana: generation of 12,028 non-redundant expressed sequence tags from normalized and size-selected cDNA libraries.

PubMed

Asamizu, E; Nakamura, Y; Sato, S; Tabata, S

2000-06-30

For comprehensive analysis of genes expressed in the model dicotyledonous plant, Arabidopsis thaliana, expressed sequence tags (ESTs) were accumulated. Normalized and size-selected cDNA libraries were constructed from aboveground organs, flower buds, roots, green siliques and liquid-cultured seedlings, respectively, and a total of 14,026 5'-end ESTs and 39,207 3'-end ESTs were obtained. The 3'-end ESTs could be clustered into 12,028 non-redundant groups. Similarity search of the non-redundant ESTs against the public non-redundant protein database indicated that 4816 groups show similarity to genes of known function, 1864 to hypothetical genes, and the remaining 5348 are novel sequences. Gene coverage by the non-redundant ESTs was analyzed using the annotated genomic sequences of approximately 10 Mb on chromosomes 3 and 5. A total of 923 regions were hit by at least one EST, among which only 499 regions were hit by the ESTs deposited in the public database. The result indicates that the EST source generated in this project complements the EST data in the public database and facilitates new gene discovery.
Molecular cloning, mRNA expression and tissue distribution analysis of Slc7a11 gene in alpaca (Lama paco) skins associated with different coat colors.

PubMed

Tian, Xue; Meng, Xiaolin; Wang, Liangyan; Song, Yunfei; Zhang, Danli; Ji, Yuankai; Li, Xuejun; Dong, Changsheng

2015-01-25

Slc7a11, encoding solute carrier family 7 member 11 (anionic amino acid transporter light chain, xCT), has been identified as a critical genetic regulator of pheomelanin synthesis in hair and melanocytes. To better understand the molecular characteristics of Slc7a11 and its expression patterns in the skin of white versus brown alpaca (Lama paco), we cloned the full-length coding sequence (CDS) of the alpaca Slc7a11 gene and analyzed the expression patterns using real-time PCR, Western blotting and immunohistochemistry. The full-length CDS of 1512 bp encodes a 503 amino acid polypeptide. Sequence analysis showed that alpaca xCT contains 12 transmembrane regions consistent with the highly conserved amino acid permease (AA_permease_2) domain, similar to other vertebrates. Sequence alignment and phylogenetic analysis revealed that alpaca xCT had the highest identity and shared the same branch with Camelus ferus. Real-time PCR and Western blotting suggested that xCT was expressed at significantly higher levels in brown alpaca skin, and that transcripts and protein showed the same expression pattern in white and brown alpaca skins. Additionally, immunohistochemical analysis further demonstrated that xCT staining was robustly increased in the matrix and root sheath of brown alpaca skin compared with that of white. These results suggest that Slc7a11 functions in alpaca coat color regulation and offer essential information for further exploration of the role of Slc7a11 in melanogenesis. Copyright © 2014 Elsevier B.V. All rights reserved.
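The relation between the reported CDS length and polypeptide length can be checked with simple codon arithmetic. The sketch below assumes the stated CDS length includes the stop codon; the function name is hypothetical.

```python
def protein_length_from_cds(cds_length_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by a CDS of the given length in bp."""
    if cds_length_bp % 3 != 0:
        raise ValueError("a complete CDS length must be a multiple of 3")
    codons = cds_length_bp // 3
    # The stop codon occupies one codon but encodes no amino acid.
    return codons - 1 if includes_stop else codons


print(protein_length_from_cds(1512))
```

Under that assumption, 1512 bp corresponds to 504 codons, one of which is the stop codon, which agrees with the 503 amino acids given in the abstract.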
Molecular cloning, gene expression analysis, and recombinant protein expression of novel silk proteins from larvae of a retreat-maker caddisfly, Stenopsyche marmorata.

PubMed

Bai, Xue; Sakaguchi, Mayo; Yamaguchi, Yuko; Ishihara, Shiori; Tsukada, Masuhiro; Hirabayashi, Kimio; Ohkawa, Kousaku; Nomura, Takaomi; Arai, Ryoichi

2015-08-28

Retreat-maker larvae of Stenopsyche marmorata, one of the major caddisfly species in Japan, produce silk threads and adhesives to build food capture nets and protective nests in water. Research on these underwater adhesive silk proteins potentially leads to the development of new functional biofiber materials. Recently, we identified four major S. marmorata silk proteins (Smsps), Smsp-1, Smsp-2, Smsp-3, and Smsp-4 from silk glands of S. marmorata larvae. In this study, we cloned full-length cDNAs of Smsp-2, Smsp-3, and Smsp-4 from the cDNA library of the S. marmorata silk glands to reveal the primary sequences of Smsps. Homology search results of the deduced amino acid sequences indicate that Smsp-2 and Smsp-4 are novel proteins. The Smsp-2 sequence [167 amino acids (aa)] has an array of GYD-rich repeat motifs and two (SX)4E motifs. The Smsp-4 sequence (132 aa) contains a number of GW-rich repeat motifs and three (SX)4E motifs. The Smsp-3 sequence (248 aa) exhibits high homology with fibroin light chain of other caddisflies. Gene expression analysis of Smsps by real-time PCR suggested that the gene expression of Smsp-1 and Smsp-3 was relatively stable throughout the year, whereas that of Smsp-2 and Smsp-4 varied seasonally. Furthermore, Smsps recombinant protein expression was successfully performed in Escherichia coli. The study provides new molecular insights into caddisfly aquatic silk and its potential for future applications. Copyright © 2015 Elsevier Inc. All rights reserved.
Construction of a cDNA library from female adult of Toxocara canis, and analysis of EST and immune-related genes expressions.

PubMed

Zhou, Rongqiong; Xia, Qingyou; Huang, Hancheng; Lai, Min; Wang, Zhenxin

2011-10-01

Toxocara canis is a widespread intestinal nematode parasite of dogs, which can also cause disease in humans. We employed an expressed sequence tag (EST) strategy to study gene expression related to the development, digestion and reproduction of T. canis. ESTs provide a rapid way to identify genes, particularly in organisms for which very little molecular information is available. In this study, a cDNA library was constructed from a female adult of T. canis, and 215 high-quality ESTs from the 5'-ends of the cDNA clones, representing 79 unigenes, were obtained. The titer of the primary cDNA library was 1.83×10^6 pfu/mL with a recombination rate of 99.33%. Most of the sequences ranged from 300 to 900 bp with an average length of 656 bp. Cluster analysis of these ESTs allowed identification of 79 unique sequences comprising 28 contigs and 51 singletons. BLASTX searches revealed that 18 unigenes (22.78% of the total) or 70 ESTs (32.56% of the total) were novel genes that had no significant matches to any protein sequences in the public databases. The remaining 61 unigenes (77.22% of the total) or 145 ESTs (67.44% of the total) closely matched known genes or sequences deposited in the public databases. These genes were classified into seven groups based on their known or putative biological functions. We also confirmed the gene expression patterns of several immune-related genes using RT-PCR. This work will provide a valuable resource for further investigations of stage-, sex- and tissue-specific gene transcription or expression. Copyright © 2011. Published by Elsevier Inc.
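The percentages quoted in this abstract follow directly from the reported counts. A short sketch of the arithmetic (the helper name is an assumption introduced for illustration):

```python
def percent(part: int, total: int) -> float:
    """Percentage of `part` in `total`, rounded to two decimals."""
    return round(100.0 * part / total, 2)


TOTAL_UNIGENES = 79   # 28 contigs + 51 singletons
TOTAL_ESTS = 215

print(percent(18, TOTAL_UNIGENES))   # novel unigenes
print(percent(70, TOTAL_ESTS))       # ESTs belonging to novel unigenes
print(percent(61, TOTAL_UNIGENES))   # unigenes matching known sequences
print(percent(145, TOTAL_ESTS))      # ESTs matching known sequences
```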
Differential gene expression in dentate granule cells in mesial temporal lobe epilepsy with and without hippocampal sclerosis.

PubMed

Griffin, Nicole G; Wang, Yu; Hulette, Christine M; Halvorsen, Matt; Cronin, Kenneth D; Walley, Nicole M; Haglund, Michael M; Radtke, Rodney A; Skene, J H Pate; Sinha, Saurabh R; Heinzen, Erin L

2016-03-01

Hippocampal sclerosis is the most common neuropathologic finding in cases of medically intractable mesial temporal lobe epilepsy. In this study, we analyzed the gene expression profiles of dentate granule cells of patients with mesial temporal lobe epilepsy with and without hippocampal sclerosis to show that next-generation sequencing methods can produce interpretable genomic data from RNA collected from small homogenous cell populations, and to shed light on the transcriptional changes associated with hippocampal sclerosis. RNA was extracted, and complementary DNA (cDNA) was prepared and amplified from dentate granule cells that had been harvested by laser capture microdissection from surgically resected hippocampi from patients with mesial temporal lobe epilepsy with and without hippocampal sclerosis. Sequencing libraries were sequenced, and the resulting sequencing reads were aligned to the reference genome. Differential expression analysis was used to ascertain expression differences between patients with and without hippocampal sclerosis. Greater than 90% of the RNA-Seq reads aligned to the reference. There was high concordance between transcriptional profiles obtained for duplicate samples. Principal component analysis revealed that the presence or absence of hippocampal sclerosis was the main determinant of the variance within the data. Among the genes up-regulated in the hippocampal sclerosis samples, there was significant enrichment for genes involved in oxidative phosphorylation. By analyzing the gene expression profiles of dentate granule cells from surgically resected hippocampal specimens from patients with mesial temporal lobe epilepsy with and without hippocampal sclerosis, we have demonstrated the utility of next-generation sequencing methods for producing biologically relevant results from small populations of homogeneous cells, and have provided insight on the transcriptional changes associated with this pathology. Wiley Periodicals, Inc. 
© 2016 International League Against Epilepsy.
A detailed gene expression study of the Miscanthus genus reveals changes in the transcriptome associated with the rejuvenation of spring rhizomes.

PubMed

Barling, Adam; Swaminathan, Kankshita; Mitros, Therese; James, Brandon T; Morris, Juliette; Ngamboma, Ornella; Hall, Megan C; Kirkpatrick, Jessica; Alabady, Magdy; Spence, Ashley K; Hudson, Matthew E; Rokhsar, Daniel S; Moose, Stephen P

2013-12-09

The Miscanthus genus of perennial C4 grasses contains promising biofuel crops for temperate climates. However, few genomic resources exist for Miscanthus, which limits understanding of its interesting biology and future genetic improvement. A comprehensive catalog of expressed sequences was generated from a variety of Miscanthus species and tissue types, with an emphasis on characterizing gene expression changes in spring compared to fall rhizomes. Illumina short read sequencing technology was used to produce transcriptome sequences from different tissues and organs during distinct developmental stages for multiple Miscanthus species, including Miscanthus sinensis, Miscanthus sacchariflorus, and their interspecific hybrid Miscanthus × giganteus. More than fifty billion base-pairs of Miscanthus transcript sequence were produced. Overall, 26,230 Sorghum gene models (i.e., ~96% of predicted Sorghum genes) had at least five Miscanthus reads mapped to them, suggesting that a large portion of the Miscanthus transcriptome is represented in this dataset. The Miscanthus × giganteus data was used to identify genes preferentially expressed in a single tissue, such as the spring rhizome, using Sorghum bicolor as a reference. Quantitative real-time PCR was used to verify examples of preferential expression predicted via RNA-Seq. Contiguous consensus transcript sequences were assembled for each species and annotated using InterProScan. Sequences from the assembled transcriptome were used to amplify genomic segments from a doubled haploid Miscanthus sinensis and from Miscanthus × giganteus to further disentangle the allelic and paralogous variations in genes. This large expressed sequence tag collection creates a valuable resource for the study of Miscanthus biology by providing detailed gene sequence information and tissue-preferred expression patterns.
We have successfully generated a database of transcriptome assemblies and demonstrated its use in the study of genes of interest. Analysis of gene expression profiles revealed biological pathways that exhibit altered regulation in spring compared to fall rhizomes, which is consistent with their different physiological functions. The expression profiles of the subterranean rhizome provide a better understanding of the biological activities of the underground stem structures that are essential for perenniality and the storage or remobilization of carbon and nutrient resources.
Identification of miRNAs during mouse postnatal ovarian development and superovulation.

PubMed

Khan, Hamid Ali; Zhao, Yi; Wang, Li; Li, Qian; Du, Yu-Ai; Dan, Yi; Huo, Li-Jun

2015-07-08

MicroRNAs are small noncoding RNAs that play critical roles in the regulation of gene expression in a wide array of tissues, including the ovary, through sequence complementarity at the post-transcriptional level. The multitude of genes involved in ovarian development and folliculogenesis could be tightly regulated at the transcription level by these miRNAs. Therefore, identification of tissue-specific miRNAs is considered a key step towards understanding the role of miRNAs in biological processes. To investigate the role of microRNAs during ovarian development and folliculogenesis, we sequenced eight different libraries using Illumina deep sequencing technology. Different developmental stages were selected to explore miRNA expression patterns at different stages of gonadal maturation with or without PMSG/hCG treatment for superovulation. From the massive sequencing reads, clean reads of 16-26 bp were selected for differential expression analysis and novel microRNA annotation. Expression analysis of all miRNAs at different developmental stages showed that some miRNAs were present ubiquitously while others were differentially expressed at different stages. Among the differentially expressed miRNAs, we reported 61 miRNAs with a fold change of more than 2 at different developmental stages among all libraries. Among the up-regulated miRNAs, mmu-mir-1298 had the highest fold change, 4.025, while mmu-mir-150 was down-regulated more than 3-fold. Furthermore, we found 2659 target genes for 20 differentially expressed microRNAs using seven different target prediction programs (DIANA-mT, miRanda, miRDB, miRWalk, RNAhybrid, PICTAR5, TargetScan). Analysis of the predicted targets showed certain ovary-specific genes targeted by single or multiple microRNAs. Furthermore, pathway annotation and Gene Ontology analysis showed involvement of these microRNAs in basic cellular processes.
These results suggest the presence of different miRNAs at different stages of ovarian development and superovulation. The potential roles of these microRNAs in the regulation of different pathways, biological functions and cellular components underlying ovarian development and superovulation were elucidated using bioinformatics tools. These results provide a framework for extended analysis of miRNAs and their roles during ovarian development and superovulation. Furthermore, this study provides a basis for characterizing individual miRNAs to discover their role in ovarian development and female fertility.
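The abstract selects miRNAs by a fold-change cutoff of more than 2. The following is a minimal sketch of such a filter, not the authors' pipeline: the miRNA names, normalized counts and the pseudocount are all hypothetical values chosen for illustration.

```python
import math


def log2_fold_change(reference: float, condition: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of condition over reference; the pseudocount avoids division by zero."""
    return math.log2((condition + pseudocount) / (reference + pseudocount))


def differentially_expressed(counts: dict, min_fold: float = 2.0) -> dict:
    """Keep entries whose absolute fold change exceeds min_fold in either direction."""
    threshold = math.log2(min_fold)
    selected = {}
    for name, (reference, condition) in counts.items():
        lfc = log2_fold_change(reference, condition)
        if abs(lfc) > threshold:
            selected[name] = lfc
    return selected


# Hypothetical normalized read counts: (reference stage, compared stage).
example_counts = {
    "miR-A": (100.0, 450.0),  # up-regulated
    "miR-B": (300.0, 80.0),   # down-regulated
    "miR-C": (200.0, 230.0),  # roughly unchanged
}

for name, lfc in differentially_expressed(example_counts).items():
    print(name, round(lfc, 2))
```

Working on the log2 scale treats up- and down-regulation symmetrically, so a single threshold covers both directions.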
Functional regression method for whole genome eQTL epistasis analysis with sequencing data.

PubMed

Xu, Kelin; Jin, Li; Xiong, Momiao

2017-05-18

Epistasis plays an essential role in understanding regulation mechanisms and is an essential component of the genetic architecture of gene expression. However, interaction analysis of gene expression remains fundamentally unexplored due to great computational challenges and limited data availability. Due to variation in splicing, transcription start sites, polyadenylation sites, post-transcriptional RNA editing across the entire gene, and transcription rates of the cells, RNA-seq measurements generate large expression variability and collectively create the observed position-level read count curves. A single number measuring gene expression, as widely used in microarray-based gene expression analysis, is highly unlikely to sufficiently account for large expression variation across the gene. Simultaneously analyzing epistatic architecture using RNA-seq and whole genome sequencing (WGS) data poses enormous challenges. We develop a nonlinear functional regression model (FRGM) with functional responses, where the position-level read counts within a gene are taken as a function of genomic position, and functional predictors, where genotype profiles are viewed as a function of genomic position, for epistasis analysis with RNA-seq data. Instead of testing the interaction of all possible pairs of SNPs, the FRGM takes a gene as the basic unit for epistasis analysis: it tests for the interaction of all possible pairs of genes and uses all accessible information to collectively test interaction between all possible pairs of SNPs within two genome regions. By large-scale simulations, we demonstrate that the proposed FRGM for epistasis analysis can achieve the correct type 1 error and has higher power to detect interactions between genes than existing methods. The proposed methods are applied to the RNA-seq and WGS data from the 1000 Genomes Project.
The numbers of pairs of significantly interacting genes after Bonferroni correction identified using FRGM, RPKM and DESeq were 16,2361, 260 and 51, respectively, from the 350 European samples. The proposed FRGM for epistasis analysis of RNA-seq data can capture isoform and position-level information and will have broad application. Both simulations and real data analysis highlight the potential of the FRGM as a good choice for epistasis analysis with sequencing data.
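Testing all possible pairs of genes implies a large number of tests, which is why a Bonferroni correction is applied. A minimal sketch of how the number of pairwise tests and the corrected significance threshold scale with the number of genes (the gene count and the nominal alpha are hypothetical; the abstract does not state them):

```python
import math


def n_gene_pairs(n_genes: int) -> int:
    """Number of unordered gene pairs, n * (n - 1) / 2."""
    return math.comb(n_genes, 2)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


n_genes = 1000                     # hypothetical number of genes tested
pairs = n_gene_pairs(n_genes)      # 499,500 pairwise tests
print(pairs, bonferroni_threshold(0.05, pairs))
```

Because the number of pairs grows quadratically, gene-level testing keeps the correction far milder than testing every pair of SNPs would.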
A highly sensitive and accurate gene expression analysis by sequencing ("bead-seq") for a single cell.

PubMed

Matsunaga, Hiroko; Goto, Mari; Arikawa, Koji; Shirai, Masataka; Tsunoda, Hiroyuki; Huang, Huan; Kambara, Hideki

2015-02-15

Analyses of gene expressions in single cells are important for understanding detailed biological phenomena. Here, a highly sensitive and accurate method by sequencing (called "bead-seq") to obtain a whole gene expression profile for a single cell is proposed. A key feature of the method is to use a complementary DNA (cDNA) library on magnetic beads, which enables adding washing steps to remove residual reagents in a sample preparation process. By adding the washing steps, the next steps can be carried out under the optimal conditions without losing cDNAs. Error sources were carefully evaluated to conclude that the first several steps were the key steps. It is demonstrated that bead-seq is superior to the conventional methods for single-cell gene expression analyses in terms of reproducibility, quantitative accuracy, and biases caused during sample preparation and sequencing processes. Copyright © 2014 Elsevier Inc. All rights reserved.
GobyWeb: Simplified Management and Analysis of Gene Expression and DNA Methylation Sequencing Data

PubMed Central

Dorff, Kevin C.; Chambwe, Nyasha; Zeno, Zachary; Simi, Manuele; Shaknovich, Rita; Campagne, Fabien

2013-01-01

We present GobyWeb, a web-based system that facilitates the management and analysis of high-throughput sequencing (HTS) projects. The software provides integrated support for a broad set of HTS analyses and offers a simple plugin extension mechanism. Analyses currently supported include quantification of gene expression for messenger and small RNA sequencing, estimation of DNA methylation (i.e., reduced bisulfite sequencing and whole genome methyl-seq), or the detection of pathogens in sequenced data. In contrast to previous analysis pipelines developed for analysis of HTS data, GobyWeb requires significantly less storage space, runs analyses efficiently on a parallel grid, scales gracefully to process tens or hundreds of multi-gigabyte samples, yet can be used effectively by researchers who are comfortable using a web browser. We conducted performance evaluations of the software and found it to either outperform or have similar performance to analysis programs developed for specialized analyses of HTS data. We found that most biologists who took a one-hour GobyWeb training session were readily able to analyze RNA-Seq data with state of the art analysis tools. GobyWeb can be obtained at http://gobyweb.campagnelab.org and is freely available for non-commercial use. GobyWeb plugins are distributed in source code and licensed under the open source LGPL3 license to facilitate code inspection, reuse and independent extensions http://github.com/CampagneLaboratory/gobyweb2-plugins. PMID:23936070
The Human EST Ontology Explorer: a tissue-oriented visualization system for ontologies distribution in human EST collections.

PubMed

Merelli, Ivan; Caprera, Andrea; Stella, Alessandra; Del Corvo, Marcello; Milanesi, Luciano; Lazzari, Barbara

2009-10-15

The NCBI dbEST currently contains more than eight million human Expressed Sequence Tags (ESTs). This wide collection represents an important source of information for gene expression studies, provided it can be inspected according to biologically relevant criteria. EST data can be browsed using different dedicated web resources, which allow investigation of library-specific gene expression levels and comparisons among libraries, highlighting significant differences in gene expression. Nonetheless, no tool is available to examine distributions of quantitative EST collections in Gene Ontology (GO) categories, nor to retrieve information concerning library-dependent EST involvement in metabolic pathways. In this work we present the Human EST Ontology Explorer (HEOE) http://www.itb.cnr.it/ptp/human_est_explorer, a web facility for comparison of expression levels among libraries from several healthy and diseased tissues. The HEOE provides library-dependent statistics on the distribution of sequences in the GO Directed Acyclic Graph (DAG) that can be browsed at each GO hierarchical level. The tool is based on large-scale BLAST annotation of EST sequences. Due to the huge number of input sequences, this BLAST analysis was performed with the aid of grid computing technology, which is particularly suitable for data-parallel tasks. Relying on the resulting annotation, library-specific distributions of ESTs in the GO graph were inferred. A pathway-based search interface was also implemented, for a quick evaluation of the representation of libraries in metabolic pathways. EST processing steps were integrated in a semi-automatic procedure that relies on Perl scripts and stores results in a MySQL database. A PHP-based web interface offers the possibility to simultaneously visualize, retrieve and compare data from the different libraries. Statistically significant differences in GO categories among user-selected libraries can also be computed.
The HEOE provides an alternative and complementary way to inspect EST expression levels with respect to approaches currently offered by other resources. Furthermore, BLAST computation on the whole human EST dataset was a suitable test of grid scalability in the context of large-scale bioinformatics analysis. The HEOE currently comprises sequence analysis from 70 non-normalized libraries, representing a comprehensive overview on healthy and unhealthy tissues. As the analysis procedure can be easily applied to other libraries, the number of represented tissues is intended to increase.
TRAPR: R Package for Statistical Analysis and Visualization of RNA-Seq Data.

PubMed

Lim, Jae Hyun; Lee, Soo Youn; Kim, Ju Han

2017-03-01

High-throughput transcriptome sequencing, also known as RNA sequencing (RNA-Seq), is a standard technology for measuring gene expression with unprecedented accuracy. Numerous Bioconductor packages have been developed for the statistical analysis of RNA-Seq data. However, these tools focus on specific aspects of the data analysis pipeline, and are difficult to integrate appropriately with one another due to their disparate data structures and processing methods. They also lack visualization methods to confirm the integrity of the data and the process. In this paper, we propose an R-based RNA-Seq analysis pipeline called TRAPR, an integrated tool that facilitates the statistical analysis and visualization of RNA-Seq expression data. TRAPR provides various functions for data management, the filtering of low-quality data, normalization, transformation, statistical analysis, data visualization, and result visualization that allow researchers to build customized analysis pipelines.
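The filtering, normalization and transformation steps listed above can be illustrated generically. The sketch below is written in Python and does not reproduce TRAPR's R interface; the function names, the minimum-count cutoff, the counts-per-million normalization and the example counts are all assumptions chosen for illustration.

```python
import math


def filter_low_counts(counts: dict, min_count: int = 10) -> dict:
    """Drop genes whose raw read count is below min_count."""
    return {gene: c for gene, c in counts.items() if c >= min_count}


def counts_per_million(counts: dict) -> dict:
    """Scale raw counts by library size so samples are comparable."""
    library_size = sum(counts.values())
    if library_size == 0:
        raise ValueError("library size is zero")
    return {gene: 1e6 * c / library_size for gene, c in counts.items()}


def log2_transform(values: dict, pseudocount: float = 1.0) -> dict:
    """Compress the dynamic range; the pseudocount keeps zeros finite."""
    return {gene: math.log2(v + pseudocount) for gene, v in values.items()}


# Hypothetical raw counts for one sample.
raw = {"geneA": 1500, "geneB": 3, "geneC": 500}

filtered = filter_low_counts(raw)          # geneB is removed
normalized = counts_per_million(filtered)  # scaled to the remaining library size
transformed = log2_transform(normalized)
print(sorted(transformed))
```

Each step takes and returns a plain mapping, so steps can be reordered or swapped, which mirrors the idea of a customizable pipeline.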
Digital gene expression analysis of the zebra finch genome

PubMed Central

2010-01-01

Background: In order to understand patterns of adaptation and molecular evolution it is important to quantify both variation in gene expression and nucleotide sequence divergence. Gene expression profiling in non-model organisms has recently been facilitated by the advent of massively parallel sequencing technology. Here we investigate tissue specific gene expression patterns in the zebra finch (Taeniopygia guttata) with special emphasis on the genes of the major histocompatibility complex (MHC). Results: Almost 2 million 454-sequencing reads from cDNA of six different tissues were assembled and analysed. A total of 11,793 zebra finch transcripts were represented in this EST data, indicating a transcriptome coverage of about 65%. There was a positive correlation between the tissue specificity of gene expression and non-synonymous to synonymous nucleotide substitution ratio of genes, suggesting that genes with a specialised function are evolving at a higher rate (or with less constraint) than genes with a more general function. In line with this, there was also a negative correlation between overall expression levels and expression specificity of contigs. We found evidence for expression of 10 different genes related to the MHC. MHC genes showed relatively tissue specific expression levels and were in general primarily expressed in spleen. Several MHC genes, including MHC class I also showed expression in brain. Furthermore, for all genes with highest levels of expression in spleen there was an overrepresentation of several gene ontology terms related to immune function. Conclusions: Our study highlights the usefulness of next-generation sequence data for quantifying gene expression in the genome as a whole as well as in specific candidate genes. Overall, the data show predicted patterns of gene expression profiles and molecular evolution in the zebra finch genome. Expression of MHC genes in particular, corresponds well with expression patterns in other vertebrates. PMID:20359325
Transcriptome Sequencing Revealed Significant Alteration of Cortical Promoter Usage and Splicing in Schizophrenia

PubMed Central

Wu, Jing Qin; Wang, Xi; Beveridge, Natalie J.; Tooney, Paul A.; Scott, Rodney J.; Carr, Vaughan J.; Cairns, Murray J.

2012-01-01

Background: While hybridization based analysis of the cortical transcriptome has provided important insight into the neuropathology of schizophrenia, it represents a restricted view of disease-associated gene activity based on predetermined probes. By contrast, sequencing technology can provide un-biased analysis of transcription at nucleotide resolution. Here we use this approach to investigate schizophrenia-associated cortical gene expression. Methodology/Principal Findings: The data was generated from 76 bp reads of RNA-Seq, aligned to the reference genome and assembled into transcripts for quantification of exons, splice variants and alternative promoters in postmortem superior temporal gyrus (STG/BA22) from 9 male subjects with schizophrenia and 9 matched non-psychiatric controls. Differentially expressed genes were then subjected to further sequence and functional group analysis. The output, amounting to more than 38 Gb of sequence, revealed significant alteration of gene expression including many previously shown to be associated with schizophrenia. Gene ontology enrichment analysis followed by functional map construction identified three functional clusters highly relevant to schizophrenia including neurotransmission related functions, synaptic vesicle trafficking, and neural development. Significantly, more than 2000 genes displayed schizophrenia-associated alternative promoter usage and more than 1000 genes showed differential splicing (FDR<0.05). Both types of transcriptional isoforms were exemplified by reads aligned to the neurodevelopmentally significant doublecortin-like kinase 1 (DCLK1) gene. Conclusions: This study provided the first deep and un-biased analysis of schizophrenia-associated transcriptional diversity within the STG, and revealed variants with important implications for the complex pathophysiology of schizophrenia. PMID:22558445
Molecular analysis of two phytohemagglutinin genes and their expression in Phaseolus vulgaris cv. Pinto, a lectin-deficient cultivar of the bean.

PubMed

Voelker, T A; Staswick, P; Chrispeels, M J

1986-12-01

Phytohemagglutinin (PHA), the seed lectin of the common bean, Phaseolus vulgaris, is encoded by two highly homologous, tandemly linked genes, dlec1 and dlec2, which are coordinately expressed at high levels in developing cotyledons. Their respective transcripts translate into closely related polypeptides, PHA-E and PHA-L, constituents of the tetrameric lectin which accumulates at high levels in developing seeds. In the bean cultivar Pinto UI111, PHA-E is not detectable, and PHA-L accumulates at very reduced levels. To investigate the cause of the Pinto phenotype, we cloned and sequenced the two PHA genes of Pinto, called Pdlec1 and Pdlec2, and determined the abundance of their respective mRNAs in developing cotyledons. Both genes are more than 90% homologous to the normal PHA genes found in other cultivars. Pdlec1 carries a 1-bp frameshift mutation close to the 5' end of its coding sequence. Only very truncated polypeptides could be made from its mRNA. The gene Pdlec2 encodes a polypeptide, which resembles PHA-L and its predicted amino acid sequence agrees with the available Pinto PHA amino acid sequence data. Analysis of the mRNA of developing cotyledons revealed that the Pdlec1 message is reduced 600-fold, and Pdlec2 mRNA is reduced 20-fold with respect to mRNA levels in normal cultivars. A comparison of the sequences which are upstream from the coding sequence shows that Pdlec2 has a 100-bp deletion compared to the other genes (dlec1, dlec2 and Pdlec1). This deletion which contains a large tandem repeat may be responsible for the low level of expression of Pdlec2. The very low expression of Pdlec1 is as yet unexplained.
PROSPECT improves cis-acting regulatory element prediction by integrating expression profile data with consensus pattern searches

PubMed Central

Fujibuchi, Wataru; Anderson, John S. J.; Landsman, David

2001-01-01

Consensus pattern and matrix-based searches designed to predict cis-acting transcriptional regulatory sequences have historically been subject to large numbers of false positives. We sought to decrease false positives by incorporating expression profile data into a consensus pattern-based search method. We have systematically analyzed the expression phenotypes of over 6000 yeast genes, across 121 expression profile experiments, and correlated them with the distribution of 14 known regulatory elements over sequences upstream of the genes. Our method is based on a metric we term probabilistic element assessment (PEA), which is a ranking of potential sites based on sequence similarity in the upstream regions of genes with similar expression phenotypes. For eight of the 14 known elements that we examined, our method had a much higher selectivity than a naïve consensus pattern search. Based on our analysis, we have developed a web-based tool called PROSPECT, which allows consensus pattern-based searching of gene clusters obtained from microarray data. PMID:11574681
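For context, the naive consensus pattern search that serves as the baseline in this abstract can be sketched as a scan of upstream sequences for a degenerate (IUPAC) consensus string. The code below is an illustration of that baseline only, not PROSPECT or its PEA metric; the consensus "CACGTK" and the upstream sequence are hypothetical, and only the forward strand is scanned.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Compile a consensus string into a regex; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[base] for base in consensus.upper())
    return re.compile("(?=(" + body + "))")


def find_sites(consensus: str, upstream: str) -> list:
    """Return 0-based start positions of all forward-strand matches."""
    pattern = consensus_to_regex(consensus)
    return [match.start() for match in pattern.finditer(upstream.upper())]


# Hypothetical upstream region containing two matches to the consensus.
upstream_region = "AACACGTGTTCACGTTAA"
print(find_sites("CACGTK", upstream_region))
```

A search of this kind reports every sequence match regardless of whether the gene responds in expression experiments, which is the source of the false positives the abstract describes.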
miRNAome expression profiles in the gonads of adult Melopsittacus undulatus

PubMed Central

Jiang, Lan; Wang, Qingqing; Yu, Jue; Gowda, Vinita; Johnson, Gabriel; Yang, Jianke

2018-01-01

The budgerigar (Melopsittacus undulatus) is one of the most widely studied parrot species, serving as an excellent animal model for behavior and neuroscience research. Until recently, it was unknown how sexual differences in the behavior, physiology, and development of organisms are regulated by differential gene expression. MicroRNAs (miRNAs) are endogenous short non-coding RNA molecules that can post-transcriptionally regulate gene expression and play a critical role in gonadal differentiation as well as early development of animals. However, very little is known about the role gonadal miRNAs play in the early development of birds. Research on the sex-biased expression of miRNAs in avian gonads is limited, and little is known about M. undulatus. In the current study, we sequenced two small non-coding RNA libraries made from the gonads of adult male and female budgerigars using Illumina paired-end sequencing technology. We obtained 254 known and 141 novel miRNAs, and randomly validated five miRNAs. Of these, three miRNAs were differentially expressed and 18 miRNAs were involved in sexual differentiation, as determined by functional analysis with GO annotation and KEGG pathway analysis. In conclusion, this work is the first report of sex-biased miRNA expression in the budgerigar, and provides additional sequences to the avian miRNAome database which will foster further functional genomic research. PMID:29666766
Establishing glucose- and ABA-regulated transcription networks in Arabidopsis by microarray analysis and promoter classification using a Relevance Vector Machine.

PubMed

Li, Yunhai; Lee, Kee Khoon; Walsh, Sean; Smith, Caroline; Hadingham, Sophie; Sorefan, Karim; Cawley, Gavin; Bevan, Michael W

2006-03-01

Establishing transcriptional regulatory networks by analysis of gene expression data and promoter sequences shows great promise. We developed a novel promoter classification method using a Relevance Vector Machine (RVM) and Bayesian statistical principles to identify discriminatory features in the promoter sequences of genes that can correctly classify transcriptional responses. The method was applied to microarray data obtained from Arabidopsis seedlings treated with glucose or abscisic acid (ABA). Of those genes showing >2.5-fold changes in expression level, approximately 70% were correctly predicted as being up- or down-regulated (under 10-fold cross-validation), based on the presence or absence of a small set of discriminative promoter motifs. Many of these motifs have known regulatory functions in sugar- and ABA-mediated gene expression. One promoter motif that was not known to be involved in glucose-responsive gene expression was identified as the strongest classifier of glucose-up-regulated gene expression. We show it confers glucose-responsive gene expression in conjunction with another promoter motif, thus validating the classification method. We were able to establish a detailed model of glucose and ABA transcriptional regulatory networks and their interactions, which will help us to understand the mechanisms linking metabolism with growth in Arabidopsis. This study shows that machine learning strategies coupled to Bayesian statistical methods hold significant promise for identifying functionally significant promoter sequences.

Cloning and expression of two 9-cis-epoxycarotenoid dioxygenase genes during fruit development and under stress conditions from Malus.

PubMed

Xia, Hui; Wu, Shan; Ma, Fengwang

2014-10-01

There is now biochemical and genetic evidence that oxidative cleavage of cis-epoxycarotenoids by 9-cis-epoxycarotenoid dioxygenase (NCED) is the critical step in the regulation of abscisic acid (ABA) synthesis in higher plants. To understand the expression characteristics of NCED during ABA biosynthesis in apple (Malus), the cDNA sequences of two NCED genes, named MpNCED1 and MpNCED2, were cloned from Malus prunifolia using RT-PCR. Both cDNA sequences contain a full-length open reading frame, encoding polypeptides of 607 and 614 amino acids, respectively. Sequence analysis showed that the two deduced apple NCED proteins are highly homologous to NCED proteins from other plant species. Real-time PCR analysis revealed that MpNCED2 was expressed continuously throughout apple fruit development in a "higher-low-highest" pattern, while the expression of MpNCED1 clearly declined to a steady low level in the mid-to-late period of fruit development. Expression of MpNCED2 increased strongly and rapidly under drought stress, high temperature and low temperature, whereas expression of MpNCED1 was detected in response to temperature stress but not under drought stress. These results indicate that MpNCED1 and MpNCED2 may play different roles in regulating ABA biosynthesis during fruit development and in response to various stresses.
Performing the unexplainable: Implicit task performance reveals individually reliable sequence learning without explicit knowledge

PubMed Central

Sanchez, Daniel J.; Gobel, Eric W.; Reber, Paul J.

2015-01-01

Memory-impaired patients express intact implicit perceptual–motor sequence learning, but it has been difficult to obtain a similarly clear dissociation in healthy participants. When explicit memory is intact, participants acquire some explicit knowledge and performance improvements from implicit learning may be subtle. Therefore, it is difficult to determine whether performance exceeds what could be expected on the basis of the concomitant explicit knowledge. Using a challenging new sequence-learning task, robust implicit learning was found in healthy participants with virtually no associated explicit knowledge. Participants trained on a repeating sequence that was selected randomly from a set of five. On a performance test of all five sequences, performance was best on the trained sequence, and two-thirds of the participants exhibited individually reliable improvement (by chi-square analysis). Participants could not reliably indicate which sequence had been trained by either recognition or recall. Only by expressing their knowledge via performance were participants able to indicate which sequence they had learned. PMID:21169570
Sequence Evolution and Expression Regulation of Stress-Responsive Genes in Natural Populations of Wild Tomato

PubMed Central

Fischer, Iris; Steige, Kim A.; Stephan, Wolfgang; Mboup, Mamadou

2013-01-01

The wild tomato species Solanum chilense and S. peruvianum are a valuable non-model system for studying plant adaptation since they grow in diverse environments facing many abiotic constraints. Here we investigate the sequence evolution of regulatory regions of drought and cold responsive genes and their expression regulation. The coding regions of these genes were previously shown to exhibit signatures of positive selection. Expression profiles and sequence evolution of regulatory regions of members of the Asr (ABA/water stress/ripening induced) gene family and the dehydrin gene pLC30-15 were analyzed in wild tomato populations from contrasting environments. For S. chilense, we found that Asr4 and pLC30-15 appear to respond much faster to drought conditions in accessions from very dry environments than accessions from more mesic locations. Sequence analysis suggests that the promoter of Asr2 and the downstream region of pLC30-15 are under positive selection in some local populations of S. chilense. By investigating gene expression differences at the population level we provide further support of our previous conclusions that Asr2, Asr4, and pLC30-15 are promising candidates for functional studies of adaptation. Our analysis also demonstrates the power of the candidate gene approach in evolutionary biology research and highlights the importance of wild Solanum species as a genetic resource for their cultivated relatives. PMID:24205149
Porcine transcriptome analysis based on 97 non-normalized cDNA libraries and assembly of 1,021,891 expressed sequence tags

PubMed Central

Gorodkin, Jan; Cirera, Susanna; Hedegaard, Jakob; Gilchrist, Michael J; Panitz, Frank; Jørgensen, Claus; Scheibye-Knudsen, Karsten; Arvin, Troels; Lumholdt, Steen; Sawera, Milena; Green, Trine; Nielsen, Bente J; Havgaard, Jakob H; Rosenkilde, Carina; Wang, Jun; Li, Heng; Li, Ruiqiang; Liu, Bin; Hu, Songnian; Dong, Wei; Li, Wei; Yu, Jun; Wang, Jian; Stærfeldt, Hans-Henrik; Wernersson, Rasmus; Madsen, Lone B; Thomsen, Bo; Hornshøj, Henrik; Bujie, Zhan; Wang, Xuegang; Wang, Xuefei; Bolund, Lars; Brunak, Søren; Yang, Huanming; Bendixen, Christian; Fredholm, Merete

2007-01-01

Background: Knowledge of the structure of gene expression is essential for mammalian transcriptomics research. We analyzed a collection of more than one million porcine expressed sequence tags (ESTs), of which two-thirds were generated in the Sino-Danish Pig Genome Project and one-third are from public databases. The Sino-Danish ESTs were generated from one normalized and 97 non-normalized cDNA libraries representing 35 different tissues and three developmental stages. Results: Using the Distiller package, the ESTs were assembled to roughly 48,000 contigs and 73,000 singletons, of which approximately 25% have a high confidence match to UniProt. Approximately 6,000 new porcine gene clusters were identified. Expression analysis based on the non-normalized libraries resulted in the following findings. The distribution of cluster sizes is scaling invariant. Brain and testes are among the tissues with the greatest number of different expressed genes, whereas tissues with more specialized function, such as developing liver, have fewer expressed genes. There are at least 65 high confidence housekeeping gene candidates and 876 cDNA library-specific gene candidates. We identified differential expression of genes between different tissues, in particular brain/spinal cord, and found patterns of correlation between genes that share expression in pairs of libraries. Finally, there was remarkable agreement in expression between specialized tissues according to Gene Ontology categories. Conclusion: This EST collection, the largest to date in pig, represents an essential resource for annotation, comparative genomics, assembly of the pig genome sequence, and further porcine transcription studies. PMID:17407547
Transcriptome sequencing and differential gene expression analysis in Viola yedoensis Makino (Fam. Violaceae) responsive to cadmium (Cd) pollution

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gao, Jian; Luo, Mao; Zhu, Ye

2015-03-27

Viola yedoensis Makino is an important traditional Chinese medicinal plant adapted to cadmium (Cd)-polluted regions. Illumina sequencing technology was used to sequence the transcriptome of V. yedoensis Makino. We sequenced Cd-treated (VIYCd) and untreated (VIYCK) samples of V. yedoensis, and obtained 100,410,834 and 83,587,676 high quality reads, respectively. After de novo assembly and quantitative assessment, 109,800 unigenes were finally generated with an average length of 661 bp. We then obtained functional annotations by aligning unigenes with public protein databases including NR, NT, SwissProt, KEGG and COG. In addition, 892 differentially expressed genes (DEGs) were identified between the two libraries of untreated (VIYCK) and Cd-treated (VIYCd) plants. Moreover, 15 randomly selected DEGs were further validated with qRT-PCR, and the results were highly consistent with the Solexa analysis. This study provides the first global analysis of the V. yedoensis transcriptome and will support further studies on gene expression, genomics, and functional genomics in Violaceae. - Highlights: • A de novo assembly generated 109,800 unigenes and 5,4479 of them were annotated. • 31,285 could be classified into 26 COG categories. • 263 biosynthesis pathways were predicted and classified into five categories. • 892 DEGs were detected and 15 of them were validated by qRT-PCR.
Comparative transcriptome analysis reveals differentially expressed genes associated with sex expression in garden asparagus (Asparagus officinalis).

PubMed

Li, Shu-Fen; Zhang, Guo-Jun; Zhang, Xue-Jin; Yuan, Jin-Hong; Deng, Chuan-Liang; Gao, Wu-Jun

2017-08-22

Garden asparagus (Asparagus officinalis) is a highly valuable vegetable crop of commercial and nutritional interest. It is also commonly used to investigate the mechanisms of sex determination and differentiation in plants. However, the sex expression mechanisms in asparagus remain poorly understood. De novo transcriptome sequencing via Illumina paired-end sequencing revealed more than 26 billion bases of high-quality sequence data from male and female asparagus flower buds. A total of 72,626 unigenes with an average length of 979 bp were assembled. In comparative transcriptome analysis, 4876 differentially expressed genes (DEGs) were identified in the possible sex-determining stage of female and male/supermale flower buds. Of these DEGs, 433, including 285 male/supermale-biased and 149 female-biased genes, were annotated as flower related. Of the male/supermale-biased flower-related genes, 102 were probably involved in anther development. In addition, 43 DEGs implicated in hormone response and biosynthesis putatively associated with sex expression and reproduction were discovered. Moreover, 128 transcription factor (TF)-related genes belonging to various families were found to be differentially expressed, and this finding implied the essential roles of TF in sex determination or differentiation in asparagus. Correlation analysis indicated that miRNA-DEG pairs were also implicated in asparagus sexual development. Our study identified a large number of DEGs involved in the sex expression and reproduction of asparagus, including known genes participating in plant reproduction, plant hormone signaling, TF encoding, and genes with unclear functions. We also found that miRNAs might be involved in the sex differentiation process. Our study could provide a valuable basis for further investigations on the regulatory networks of sex determination and differentiation in asparagus and facilitate further genetic and genomic studies on this dioecious species.
Porcine calbindin-D9k gene: expression in endometrium, myometrium, and placenta in the absence of a functional estrogen response element in intron A.

PubMed

Krisinger, J; Jeung, E B; Simmen, R C; Leung, P C

1995-01-01

The expression of Calbindin-D9k (CaBP-9k) in the pig uterus and placenta was measured by Northern blot analysis and reverse transcription polymerase chain reaction (PCR), respectively. Progesterone (P4) administration to ovariectomized pigs decreased CaBP-9k mRNA levels. Expression of endometrial CaBP-9k mRNA was high on pregnancy Days 10-12 and below the detection limit on Days 15 and 18. On Day 60, expression could be detected at low levels. In myometrium and placenta, CaBP-9k mRNA expression was not detectable by Northern analysis using total RNA. Reverse-transcribed RNA from both tissues demonstrated the presence of CaBP-9k transcripts by means of PCR. The partial CaBP-9k gene was amplified by PCR and cloned to determine the sequence of intron A. In contrast to the rat CaBP-9k gene, the pig gene does not contain a functional estrogen response element (ERE) within this region. A similar ERE-like sequence located at the identical location was examined by gel retardation analysis and failed to bind the estradiol receptor. A similar disruption of this ERE-like sequence has been described in the human CaBP-9k gene, which is not expressed at any level in placenta, myometrium, or endometrium. It is concluded that the pig CaBP-9k gene is regulated in these reproductive tissues in a manner distinct from that in rat and human tissues. The regulation is probably due to a regulatory region outside of intron A, which in the rat gene contains the key cis element for uterine expression of the CaBP-9k gene.
Phylogenetic and comparative gene expression analysis of barley (Hordeum vulgare) WRKY transcription factor family reveals putatively retained functions between monocots and dicots

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mangelsen, Elke; Kilian, Joachim; Berendzen, Kenneth W.

2008-02-01

WRKY proteins belong to the WRKY-GCM1 superfamily of zinc finger transcription factors that have been subject to a large plant-specific diversification. For the cereal crop barley (Hordeum vulgare), three different WRKY proteins have been characterized so far, as regulators in sucrose signaling, in pathogen defense, and in response to cold and drought, respectively. However, their phylogenetic relationship remained unresolved. In this study, we used the available sequence information to identify a minimum number of 45 barley WRKY transcription factor (HvWRKY) genes. According to their structural features the HvWRKY factors were classified into the previously defined polyphyletic WRKY subgroups 1 to 3. Furthermore, we could assign putative orthologs of the HvWRKY proteins in Arabidopsis and rice. While in most cases clades of orthologous proteins were formed within each group or subgroup, other clades were composed of paralogous proteins for the grasses and Arabidopsis only, which is indicative of specific gene radiation events. To gain insight into their putative functions, we examined expression profiles of WRKY genes from publicly available microarray data resources and found group specific expression patterns. While putative orthologs of the HvWRKY transcription factors have been inferred from phylogenetic sequence analysis, we performed a comparative expression analysis of WRKY genes in Arabidopsis and barley. Indeed, highly correlative expression profiles were found between some of the putative orthologs. HvWRKY genes have not only undergone radiation in monocot or dicot species, but exhibit evolutionary traits specific to grasses. HvWRKY proteins exhibited not only sequence similarities between orthologs with Arabidopsis, but also relatedness in their expression patterns. This correlative expression is indicative of a putative conserved function of related WRKY proteins in mono- and dicot species.
An integrative systems genetics approach reveals potential causal genes and pathways related to obesity.

PubMed

Kogelman, Lisette J A; Zhernakova, Daria V; Westra, Harm-Jan; Cirera, Susanna; Fredholm, Merete; Franke, Lude; Kadarmideen, Haja N

2015-10-20

Obesity is a multi-factorial health problem in which genetic factors play an important role. Limited results have been obtained in single-gene studies using either genomic or transcriptomic data. RNA sequencing technology has shown its potential in gaining accurate knowledge about the transcriptome, and may reveal novel genes affecting complex diseases. Integration of genomic and transcriptomic variation (expression quantitative trait loci [eQTL] mapping) has identified causal variants that affect complex diseases. We integrated transcriptomic data from adipose tissue and genomic data from a porcine model to investigate the mechanisms involved in obesity using a systems genetics approach. Using a selective gene expression profiling approach, we selected 36 animals based on a previously created genomic Obesity Index for RNA sequencing of subcutaneous adipose tissue. Differential expression analysis was performed using the Obesity Index as a continuous variable in a linear model. eQTL mapping was then performed to integrate 60 K porcine SNP chip data with the RNA sequencing data. Results were restricted based on genome-wide significant single nucleotide polymorphisms, detected differentially expressed genes, and previously detected co-expressed gene modules. Further data integration was performed by detecting co-expression patterns among eQTLs and integration with protein data. Differential expression analysis of RNA sequencing data revealed 458 differentially expressed genes. The eQTL mapping resulted in 987 cis-eQTLs and 73 trans-eQTLs (false discovery rate < 0.05), of which the cis-eQTLs were associated with metabolic pathways. We reduced the eQTL search space by focusing on differentially expressed and co-expressed genes and disease-associated single nucleotide polymorphisms to detect obesity-related genes and pathways. Building a co-expression network using eQTLs resulted in the detection of a module strongly associated with lipid pathways. 
Furthermore, we detected several obesity candidate genes, for example, ENPP1, CTSL, and ABHD12B. To our knowledge, this is the first study to perform an integrated genomics and transcriptomics (eQTL) study using, and modeling, genomic and subcutaneous adipose tissue RNA sequencing data on obesity in a porcine model. We detected several pathways and potential causal genes for obesity. Further validation and investigation may reveal their exact function and association with obesity.
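The abstract reports eQTLs at a false discovery rate below 0.05 but does not say which procedure was used. As an assumption for illustration only, the sketch below uses the widely used Benjamini-Hochberg adjustment; the function name and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four SNP-gene pairs.
pvals = [0.001, 0.2, 0.03, 0.5]
qvals = benjamini_hochberg(pvals)
significant = [i for i, q in enumerate(qvals) if q < 0.05]
print(qvals, significant)
```

With these toy inputs only the first pair passes the 0.05 threshold; the third pair has a raw p-value of 0.03 but an adjusted value of 0.06, which shows how the correction removes borderline associations.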
RNA-Seq Alignment to Individualized Genomes Improves Transcript Abundance Estimates in Multiparent Populations

PubMed Central

Munger, Steven C.; Raghupathy, Narayanan; Choi, Kwangbom; Simons, Allen K.; Gatti, Daniel M.; Hinerfeld, Douglas A.; Svenson, Karen L.; Keller, Mark P.; Attie, Alan D.; Hibbs, Matthew A.; Graber, Joel H.; Chesler, Elissa J.; Churchill, Gary A.

2014-01-01

Massively parallel RNA sequencing (RNA-seq) has yielded a wealth of new insights into transcriptional regulation. A first step in the analysis of RNA-seq data is the alignment of short sequence reads to a common reference genome or transcriptome. Genetic variants that distinguish individual genomes from the reference sequence can cause reads to be misaligned, resulting in biased estimates of transcript abundance. Fine-tuning of read alignment algorithms does not correct this problem. We have developed Seqnature software to construct individualized diploid genomes and transcriptomes for multiparent populations and have implemented a complete analysis pipeline that incorporates other existing software tools. We demonstrate in simulated and real data sets that alignment to individualized transcriptomes increases read mapping accuracy, improves estimation of transcript abundance, and enables the direct estimation of allele-specific expression. Moreover, when applied to expression QTL mapping we find that our individualized alignment strategy corrects false-positive linkage signals and unmasks hidden associations. We recommend the use of individualized diploid genomes over reference sequence alignment for all applications of high-throughput sequencing technology in genetically diverse populations. PMID:25236449
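The idea of building an individualized sequence from a reference plus known variants can be sketched as below. This is not Seqnature's implementation, which the abstract does not describe; the function names and data layout are hypothetical, and the sketch handles single-nucleotide substitutions only, whereas insertions and deletions would also require coordinate bookkeeping.

```python
def apply_snps(reference, snps):
    """Return a copy of `reference` with SNPs substituted.

    `snps` maps a 0-based position to a (ref_base, alt_base) pair.
    """
    seq = list(reference)
    for pos, (ref_base, alt_base) in snps.items():
        # Guard against variants called on a different reference build.
        if seq[pos] != ref_base:
            raise ValueError(f"reference mismatch at position {pos}")
        seq[pos] = alt_base
    return "".join(seq)


def diploid_sequences(reference, haplotype_a_snps, haplotype_b_snps):
    """Build one sequence per haplotype so reads can be aligned to both alleles."""
    return (apply_snps(reference, haplotype_a_snps),
            apply_snps(reference, haplotype_b_snps))


# Toy example (hypothetical): one haplotype carries a G>A change at position 2.
hap_a, hap_b = diploid_sequences("ACGTACGT", {2: ("G", "A")}, {})
print(hap_a, hap_b)
```

A read carrying the alternate base matches `hap_a` exactly but mismatches the reference, which is the kind of misalignment the abstract identifies as a source of biased abundance estimates.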
Identification of immunity-related genes in the larvae of Protaetia brevitarsis seulensis (Coleoptera: Cetoniidae) by a next-generation sequencing-based transcriptome analysis.

PubMed

Bang, Kyeongrin; Hwang, Sejung; Lee, Jiae; Cho, Saeyoull

2015-01-01

To identify immune-related genes in the larvae of white-spotted flower chafers, next-generation sequencing was conducted with an Illumina HiSeq2000, resulting in 100 million cDNA reads with sequence information from over 10 billion base pairs (bp) and >50× transcriptome coverage. A subset of 77,336 contigs was created, and ∼35,532 sequences matched entries against the NCBI nonredundant database (cutoff, e < 10^-5). Statistical analysis was performed on the 35,532 contigs. For profiling of the immune response, samples were analyzed by aligning 42 base sequence tags to the de novo reference assembly, comparing levels in immunized larvae to control levels of expression. Of the differentially expressed genes, 3,440 transcripts were upregulated and 3,590 transcripts were downregulated. Many of these genes were confirmed as immune-related genes such as pattern recognition proteins, immune-related signal transduction proteins, antimicrobial peptides, and cellular response proteins, by comparison to published data. © The Author 2015. Published by Oxford University Press on behalf of the Entomological Society of America.
Vascular endothelial cells express isoforms of protein kinase A inhibitor.

PubMed

Lum, Hazel; Hao, Zengping; Gayle, Dave; Kumar, Priyadarsini; Patterson, Carolyn E; Uhler, Michael D

2002-01-01

The expression and function of the endogenous inhibitor of cAMP-dependent protein kinase (PKI) in endothelial cells are unknown. In this study, overexpression of the rabbit muscle PKI gene in endothelial cells inhibited the cAMP-mediated increase and exacerbated the thrombin-induced decrease in endothelial barrier function. We investigated PKI expression in human pulmonary artery (HPAECs), foreskin microvessel (HMECs), and brain microvessel endothelial cells (HBMECs). RT-PCR using specific primers for human PKI alpha, human PKI gamma, and mouse PKI beta sequences detected PKI alpha and PKI gamma mRNA in all three cell types. Sequencing and BLAST analysis indicated that forward and reverse DNA strands for PKI alpha and PKI gamma had >96% identity with database sequences. RNase protection assays showed protection of the 542 nucleotides in HBMEC and HPAEC PKI alpha mRNA and 240 nucleotides in HBMEC, HPAEC, and HMEC PKI gamma mRNA. Western blot analysis indicated that PKI gamma protein was detected in all three cell types, whereas PKI alpha was found in HBMECs. In summary, endothelial cells from three different vascular beds express PKI alpha and PKI gamma, which may be physiologically important in endothelial barrier function.
BASiCS: Bayesian Analysis of Single-Cell Sequencing Data

PubMed Central

Vallejos, Catalina A.; Marioni, John C.; Richardson, Sylvia

2015-01-01

Single-cell mRNA sequencing can uncover novel cell-to-cell heterogeneity in gene expression levels in seemingly homogeneous populations of cells. However, these experiments are prone to high levels of unexplained technical noise, creating new challenges for identifying genes that show genuine heterogeneous expression within the population of cells under study. BASiCS (Bayesian Analysis of Single-Cell Sequencing data) is an integrated Bayesian hierarchical model where: (i) cell-specific normalisation constants are estimated as part of the model parameters, (ii) technical variability is quantified based on spike-in genes that are artificially introduced to each analysed cell’s lysate and (iii) the total variability of the expression counts is decomposed into technical and biological components. BASiCS also provides an intuitive detection criterion for highly (or lowly) variable genes within the population of cells under study. This is formalised by means of tail posterior probabilities associated to high (or low) biological cell-to-cell variance contributions, quantities that can be easily interpreted by users. We demonstrate our method using gene expression measurements from mouse Embryonic Stem Cells. Cross-validation and meaningful enrichment of gene ontology categories within genes classified as highly (or lowly) variable supports the efficacy of our approach. PMID:26107944
BASiCS: Bayesian Analysis of Single-Cell Sequencing Data.

PubMed

Vallejos, Catalina A; Marioni, John C; Richardson, Sylvia

2015-06-01

Single-cell mRNA sequencing can uncover novel cell-to-cell heterogeneity in gene expression levels in seemingly homogeneous populations of cells. However, these experiments are prone to high levels of unexplained technical noise, creating new challenges for identifying genes that show genuine heterogeneous expression within the population of cells under study. BASiCS (Bayesian Analysis of Single-Cell Sequencing data) is an integrated Bayesian hierarchical model where: (i) cell-specific normalisation constants are estimated as part of the model parameters, (ii) technical variability is quantified based on spike-in genes that are artificially introduced to each analysed cell's lysate and (iii) the total variability of the expression counts is decomposed into technical and biological components. BASiCS also provides an intuitive detection criterion for highly (or lowly) variable genes within the population of cells under study. This is formalised by means of tail posterior probabilities associated to high (or low) biological cell-to-cell variance contributions, quantities that can be easily interpreted by users. We demonstrate our method using gene expression measurements from mouse Embryonic Stem Cells. Cross-validation and meaningful enrichment of gene ontology categories within genes classified as highly (or lowly) variable supports the efficacy of our approach.
Molecular Cloning, Expression Profile and 5′ Regulatory Region Analysis of Two Chemosensory Protein Genes from the Diamondback Moth, Plutella xylostella

PubMed Central

Gong, Liang; Zhong, Guo-Hua; Hu, Mei-Ying; Luo, Qian; Ren, Zhen-Zhen

2010-01-01

Chemosensory proteins play an important role in transporting chemical compounds to their receptors on dendrite membranes. In this study, two full-length cDNAs coding for chemosensory proteins of Plutella xylostella (Lepidoptera: Plutellidae) were obtained by RACE-PCR. Pxyl-CSP3 and Pxyl-CSP4, with GenBank accession numbers ABM92663 and ABM92664, respectively, were cloned and sequenced. The gene sequences both consisted of three exons and two introns. RT-PCR analysis showed that Pxyl-CSP3 and Pxyl-CSP4 had different expression patterns in the examined developmental stages, but were expressed in all larval stages. Phylogenetic analysis indicated that lepidopteran insects consist of three branches, and Pxyl-CSP3 and Pxyl-CSP4 belong to different branches. The 5′ regulatory regions of Pxyl-CSP3 and Pxyl-CSP4 were isolated and analyzed, and were found to contain not only the core promoter sequences (TATA-box), but also several transcriptional elements (BR-C Z4, Hb, Dfd, CF2-II, etc.). This study provides clues for better understanding the various physiological functions of CSPs in P. xylostella and other insects. PMID:21073345
Rapid in silico cloning of genes using expressed sequence tags (ESTs).

PubMed

Gill, R W; Sanseau, P

2000-01-01

Expressed sequence tags (ESTs) are short single-pass DNA sequences obtained from either end of cDNA clones. These ESTs are derived from a vast number of cDNA libraries obtained from different species. Human ESTs make up the bulk of the data and have been widely used to identify new members of gene families, as markers on the human chromosomes, to discover polymorphism sites and to compare expression patterns in different tissues or pathological states. Information strategies have been devised to query EST databases. Since most of the analysis is performed with a computer, the term "in silico" strategy has been coined. In this chapter we will review the current status of EST databases, the pros and cons of EST-type data and describe possible strategies to retrieve meaningful information.
Identification of cDNAs encoding HSP70 and HSP90 in the abalone Haliotis tuberculata: Transcriptional induction in response to thermal stress in hemocyte primary culture.

PubMed

Farcy, Emilie; Serpentini, Antoine; Fiévet, Bruno; Lebel, Jean-Marc

2007-04-01

Heat-shock proteins are a multigene family of proteins whose expression is induced by a variety of stress factors. This work reports the cloning and sequencing of HSP70 and HSP90 cDNAs in the gastropod Haliotis tuberculata. The deduced amino acid sequences of both HSP70 and HSP90 from H. tuberculata shared a high degree of homology with their homologues in other species, including typical eukaryotic HSP70 and HSP90 signature sequences. We examined their transcriptional expression pattern in abalone hemocytes exposed to thermal stress. Real-time PCR analysis indicated that both HSP70 and HSP90 mRNAs were expressed in control animals and that their levels rapidly increased after heat shock.
Quantitative analysis of a deeply sequenced marine microbial metatranscriptome.

PubMed

Gifford, Scott M; Sharma, Shalabh; Rinta-Kanto, Johanna M; Moran, Mary Ann

2011-03-01

The potential of metatranscriptomic sequencing to provide insights into the environmental factors that regulate microbial activities depends on how fully the sequence libraries capture community expression (that is, sample-sequencing depth and coverage depth), and the sensitivity with which expression differences between communities can be detected (that is, statistical power for hypothesis testing). In this study, we use an internal standard approach to make absolute (per liter) estimates of transcript numbers, a significant advantage over proportional estimates that can be biased by expression changes in unrelated genes. Coastal waters of the southeastern United States contain 1 × 10^12 bacterioplankton mRNA molecules per liter of seawater (~200 mRNA molecules per bacterial cell). Even for the large bacterioplankton libraries obtained in this study (~500,000 possible protein-encoding sequences in each of two libraries after discarding rRNAs and small RNAs from >1 million 454 FLX pyrosequencing reads), sample-sequencing depth was only 0.00001%. Expression levels of 82 genes diagnostic for transformations in the marine nitrogen, phosphorus and sulfur cycles ranged from below detection (<1 × 10^6 transcripts per liter) for 36 genes (for example, phosphonate metabolism gene phnH, dissimilatory nitrate reductase subunit napA) to >2.7 × 10^9 transcripts per liter (ammonia transporter amt and ammonia monooxygenase subunit amoC). Half of the categories for which expression was detected, however, had too few copy numbers for robust statistical resolution, as would be required for comparative (experimental or time-series) expression studies. By representing whole community gene abundance and expression in absolute units (per volume or mass of environment), 'omics' data can be better leveraged to improve understanding of microbially mediated processes in the ocean.
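The arithmetic behind an internal standard approach can be sketched as follows, assuming a known number of standard molecules is added to the sample before processing and that standard and native transcripts are recovered with equal efficiency. The function names and all input numbers are hypothetical; the abstract does not give the study's actual spike-in quantities or volumes.

```python
def transcripts_per_liter(gene_reads, standard_reads,
                          standard_molecules_added, liters_filtered):
    """Scale a gene's read count to absolute transcripts per liter of seawater."""
    # Each recovered standard read stands for this many molecules in the sample.
    molecules_per_read = standard_molecules_added / standard_reads
    return gene_reads * molecules_per_read / liters_filtered


def mrna_per_cell(mrna_per_liter, cells_per_liter):
    """Average number of mRNA molecules per bacterial cell."""
    return mrna_per_liter / cells_per_liter


# Hypothetical inputs: 50 reads for a gene, 1,000 standard reads recovered
# from 2e9 standard molecules added, and 10 liters of seawater filtered.
print(transcripts_per_liter(50, 1000, 2.0e9, 10.0))

# The abstract's figures of 1e12 mRNA per liter and about 200 mRNA per cell
# together imply roughly 5e9 cells per liter.
print(mrna_per_cell(1e12, 5e9))
```

Because the scaling factor comes from the standard rather than from the library total, a change in one highly expressed gene does not shift the estimate for an unrelated gene, which is the advantage over proportional estimates that the abstract describes.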
Gene discovery in Eimeria tenella by immunoscreening cDNA expression libraries of sporozoites and schizonts with chicken intestinal antibodies.

PubMed

Réfega, Susana; Girard-Misguich, Fabienne; Bourdieu, Christiane; Péry, Pierre; Labbé, Marie

2003-04-02

Specific antibodies were produced ex vivo from intestinal culture of Eimeria tenella infected chickens. The specificity of these intestinal antibodies was tested against different parasite stages. These antibodies were used to immunoscreen first generation schizont and sporozoite cDNA libraries permitting the identification of new E. tenella antigens. We obtained a total of 119 cDNA clones which were subjected to sequence analysis. The sequences coding for the proteins inducing local immune responses were compared with nucleotide or protein databases and with expressed sequence tags (ESTs) databases. We identified new Eimeria genes coding for heat shock proteins, a ribosomal protein, a pyruvate kinase and a pyridoxine kinase. Specific features of other sequences are discussed.
Generation, annotation and analysis of ESTs from Trichoderma harzianum CECT 2413

PubMed Central

Vizcaíno, Juan Antonio; González, Francisco Javier; Suárez, M Belén; Redondo, José; Heinrich, Julian; Delgado-Jarana, Jesús; Hermosa, Rosa; Gutiérrez, Santiago; Monte, Enrique; Llobell, Antonio; Rey, Manuel

2006-01-01

Background The filamentous fungus Trichoderma harzianum is used as a biological control agent against several plant-pathogenic fungi. In order to study the genome of this fungus, a functional genomics project called "TrichoEST" was developed to give insights into genes involved in biological control activities using an approach based on the generation of expressed sequence tags (ESTs). Results Eight different cDNA libraries from T. harzianum strain CECT 2413 were constructed. Different growth conditions involving mainly different nutrient conditions and/or stresses were used. We here present the analysis of the 8,710 ESTs generated. A total of 3,478 unique sequences were identified, of which 81.4% had sequence similarity with GenBank entries, using the BLASTX algorithm. Using the Gene Ontology hierarchy, we annotated 51.1% of the unique sequences and compared their distribution among the gene libraries. Additionally, the InterProScan algorithm was used in order to further characterize the sequences. The identification of the putatively secreted proteins was also carried out. Later, based on the EST abundance, we examined the highly expressed genes, and a hydrophobin was identified as the gene expressed at the highest level. We compared our collection of ESTs with the previous collections obtained from Trichoderma species and we also compared our sequence set with different complete eukaryotic genomes from several animals, plants and fungi. Accordingly, the presence of similar sequences in different kingdoms was also studied. Conclusion This EST collection and its annotation provide a significant resource for basic and applied research on T. harzianum, a fungus with a high biotechnological interest. PMID:16872539

MytiBase: a knowledgebase of mussel (M. galloprovincialis) transcribed sequences

PubMed Central

Venier, Paola; De Pittà, Cristiano; Bernante, Filippo; Varotto, Laura; De Nardi, Barbara; Bovo, Giuseppe; Roch, Philippe; Novoa, Beatriz; Figueras, Antonio; Pallavicini, Alberto; Lanfranchi, Gerolamo

2009-01-01

Background Although bivalves are among the most studied marine organisms due to their ecological role, economic importance and use in pollution biomonitoring, very little information is available on the genome sequences of mussels. This study reports the functional analysis of large-scale Expressed Sequence Tag (EST) sequencing from different tissues of Mytilus galloprovincialis (the Mediterranean mussel) challenged with toxic pollutants, temperature and potentially pathogenic bacteria. Results We have constructed and sequenced seventeen cDNA libraries from different Mediterranean mussel tissues: gills, digestive gland, foot, anterior and posterior adductor muscle, mantle and haemocytes. A total of 24,939 clones were sequenced from these libraries, generating 18,788 high-quality ESTs which were assembled into 2,446 overlapping clusters and 4,666 singletons, resulting in a total of 7,112 non-redundant sequences. In particular, a high-quality normalized cDNA library (Nor01) was constructed, as determined by the high rate of gene discovery (65.6%). Bioinformatic screening of the non-redundant M. galloprovincialis sequences identified 159 microsatellite-containing ESTs. Clusters, consensuses, related similarities and gene ontology searches have been organized in a dedicated, searchable database. Conclusion We defined the first species-specific catalogue of M. galloprovincialis ESTs including 7,112 unique transcribed sequences. Putative microsatellite markers were identified. This annotated catalogue represents a valuable platform for expression studies, marker validation and genetic linkage analysis for investigations in the biology of Mediterranean mussels. PMID:19203376
Comparative analysis of the feline immunoglobulin repertoire.

PubMed

Steiniger, Sebastian C J; Glanville, Jacob; Harris, Douglas W; Wilson, Thomas L; Ippolito, Gregory C; Dunham, Steven A

2017-03-01

Next-Generation Sequencing combined with bioinformatics is a powerful tool for analyzing the large number of DNA sequences present in the expressed antibody repertoire, and these data sets can be used to advance a number of research areas including antibody discovery and engineering. The accurate measurement of the immune repertoire sequence composition, diversity and abundance is important for understanding the repertoire response in infections, vaccinations and cancer immunology and could also be useful for elucidating novel molecular targets. In this study, 4 individual domestic cats (Felis catus) were subjected to antibody repertoire sequencing, generating 1,079,863 VH sequences for IgG, 1,050,824 VH sequences for IgM, 569,518 sequences for VK and 450,195 for VL. Our analysis suggests that similar VDJ expression patterns exist across all cats. Similar to the canine repertoire, the feline repertoire is dominated by a single subgroup, namely VH3. The antibody paratope of felines showed similar amino acid variation when compared to human, mouse and canine counterparts. All animals show a similarly skewed VH CDR-H3 profile and, when compared to canine, human and mouse, distinct differences are observed. Our study represents the first attempt to characterize sequence diversity in the expressed feline antibody repertoire and demonstrates the utility of using NGS to elucidate entire antibody repertoires from individual animals. These data provide significant insight into feline immune system function. Copyright © 2017 International Alliance for Biological Standardization. Published by Elsevier Ltd. All rights reserved.
Digital gene expression analysis of gene expression differences within Brassica diploids and allopolyploids.

PubMed

Jiang, Jinjin; Wang, Yue; Zhu, Bao; Fang, Tingting; Fang, Yujie; Wang, Youping

2015-01-27

Brassica includes many successfully cultivated crop species of polyploid origin, either by ancestral genome triplication or by hybridization between two diploid progenitors, displaying complex repetitive sequences and transposons. U's triangle, which consists of three diploids and three amphidiploids, is optimal for the analysis of complicated genomes after polyploidization. Next-generation sequencing enables the transcriptome profiling of polyploids on a global scale. We examined the gene expression patterns of three diploids (Brassica rapa, B. nigra, and B. oleracea) and three amphidiploids (B. napus, B. juncea, and B. carinata) via digital gene expression analysis. In total, the libraries generated between 5.7 and 6.1 million raw reads, and the clean tags of each library were mapped to 18,547-21,995 genes of the B. rapa genome. The unambiguous tag-mapped genes in the libraries were compared. Moreover, the majority of differentially expressed genes (DEGs) were explored among diploids as well as between diploids and amphidiploids. Gene ontological analysis was performed to functionally categorize these DEGs into different classes. The Kyoto Encyclopedia of Genes and Genomes analysis was performed to assign these DEGs into approximately 120 pathways, among which the metabolic pathway, biosynthesis of secondary metabolites, and peroxisomal pathway were enriched. The non-additive genes in Brassica amphidiploids were analyzed, and the results indicated that orthologous genes in polyploids are frequently expressed in a non-additive pattern. Methyltransferase genes showed differential expression patterns in Brassica species. Our results provided an understanding of the transcriptome complexity of natural Brassica species. The gene expression changes in diploids and allopolyploids may help elucidate the morphological and physiological differences among Brassica species.
Digital gene expression profiling of flax (Linum usitatissimum L.) stem peel identifies genes enriched in fiber-bearing phloem tissue.

PubMed

Guo, Yuan; Qiu, Caisheng; Long, Songhua; Chen, Ping; Hao, Dongmei; Preisner, Marta; Wang, Hui; Wang, Yufu

2017-08-30

To better understand the molecular mechanisms and gene expression characteristics associated with development of bast fiber cells within flax stem phloem, the gene expression profiles of flax stem peels and leaves were screened using Illumina's Digital Gene Expression (DGE) analysis. Four DGE libraries (2 for stem peel and 2 for leaf), ranging from 6.7 to 9.2 million clean reads, were obtained, which produced 7.0 million and 6.8 million mapped reads for flax stem peel and leaf, respectively. By differential gene expression analysis, a total of 975 genes, of which 708 (73%) have protein-coding annotation, were identified as phloem-enriched genes putatively involved in the processes of polysaccharide and cell wall metabolism. Differentially expressed genes (DEGs) were validated using quantitative RT-PCR; the expression patterns of all nine genes determined by qRT-PCR agreed well with those obtained by sequencing analysis. Cluster and Gene Ontology (GO) analysis revealed that a large number of genes related to metabolic process, catalytic activity and binding category were expressed predominantly in the stem peels. The Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis of the phloem-enriched genes suggested approximately 111 biological pathways. The large number of genes and pathways produced from DGE sequencing will expand our understanding of the complex molecular and cellular events in flax bast fiber development and provide a foundation for future studies on fiber development in other bast fiber crops. Copyright © 2017 Elsevier B.V. All rights reserved.
Analysis of 10,000 ESTs from lymphocytes of the cynomolgus monkey to improve our understanding of its immune system

PubMed Central

Chen, Wei-Hua; Wang, Xue-Xia; Lin, Wei; He, Xiao-Wei; Wu, Zhen-Qiang; Lin, Ying; Hu, Song-Nian; Wang, Xiao-Ning

2006-01-01

Background The cynomolgus monkey (Macaca fascicularis) is one of the most widely used surrogate animal models for an increasing number of human diseases and vaccines, especially immune-system-related ones. Towards a better understanding of the gene expression background upon its immunogenetics, we constructed a cDNA library from Epstein-Barr virus (EBV)-transformed B lymphocytes of a cynomolgus monkey and sequenced 10,000 randomly picked clones. Results After processing, 8,312 high-quality expressed sequence tags (ESTs) were generated and assembled into 3,728 unigenes. Annotations of these uniquely expressed transcripts demonstrated that out of the 2,524 open reading frame (ORF) positive unigenes (mitochondrial and ribosomal sequences were not included), 98.8% shared significant similarities (E-value less than 1e-10) with the NCBI nucleotide (nt) database, while only 67.7% (E-value less than 1e-5) did so with the NCBI non-redundant protein (nr) database. Further analysis revealed that 90.0% of the unigenes that shared no similarities to the nr database could be assigned to human chromosomes, in which 75 did not match significantly to any cynomolgus monkey and human ESTs. The mapping regions to known human genes on the human genome were described in detail. The protein family and domain analysis revealed that the first, second and fourth of the most abundantly expressed protein families were all assigned to immunoglobulin and major histocompatibility complex (MHC)-related proteins. The expression profiles of these genes were compared with that of homologous genes in human blood, lymph nodes and a RAMOS cell line, which demonstrated expression changes after transformation with EBV. The degree of sequence similarity of the MHC class I and II genes to the human reference sequences was evaluated. The results indicated that class I molecules showed weak amino acid identities (<90%), while class II showed slightly higher ones. 
Conclusion These results indicated that the genes expressed in the cynomolgus monkey could be used to identify novel protein-coding genes and revise those incomplete or incorrect annotations in the human genome by comparative methods, since the old world monkeys and humans share high similarities at the molecular level, especially within coding regions. The identification of multiple genes involved in the immune response, their sequence variations to the human homologues, and their responses to EBV infection could provide useful information to improve our understanding of the cynomolgus monkey immune system. PMID:16618371
RNA Sequencing Reveals Differential Expression of Mitochondrial and Oxidation Reduction Genes in the Long-Lived Naked Mole-Rat When Compared to Mice

PubMed Central

Holmes, Andrew; Szafranski, Karol; Faulkes, Chris G.; Coen, Clive W.; Buffenstein, Rochelle; Platzer, Matthias; de Magalhães, João Pedro; Church, George M.

2011-01-01

The naked mole-rat (Heterocephalus glaber) is a long-lived, cancer resistant rodent and there is a great interest in identifying the adaptations responsible for these and other of its unique traits. We employed RNA sequencing to compare liver gene expression profiles between naked mole-rats and wild-derived mice. Our results indicate that genes associated with oxidoreduction and mitochondria were expressed at higher relative levels in naked mole-rats. The largest effect is nearly 300-fold higher expression of epithelial cell adhesion molecule (Epcam), a tumour-associated protein. Also of interest are the protease inhibitor, alpha2-macroglobulin (A2m), and the mitochondrial complex II subunit Sdhc, both ageing-related genes found strongly over-expressed in the naked mole-rat. These results hint at possible candidates for specifying species differences in ageing and cancer, and in particular suggest complex alterations in mitochondrial and oxidation reduction pathways in the naked mole-rat. Our differential gene expression analysis obviated the need for a reference naked mole-rat genome by employing a combination of Illumina/Solexa and 454 platforms for transcriptome sequencing and assembling transcriptome contigs of the non-sequenced species. Overall, our work provides new research foci and methods for studying the naked mole-rat's fascinating characteristics. PMID:22073188
Sequestration of cAMP response element-binding proteins by transcription factor decoys causes collateral elaboration of regenerating Aplysia motor neuron axons.

PubMed

Dash, P K; Tian, L M; Moore, A N

1998-07-07

Axonal injury increases intracellular Ca2+ and cAMP and has been shown to induce gene expression, which is thought to be a key event for regeneration. Increases in intracellular Ca2+ and/or cAMP can alter gene expression via activation of a family of transcription factors that bind to and modulate the expression of CRE (Ca2+/cAMP response element) sequence-containing genes. We have used Aplysia motor neurons to examine the role of CRE-binding proteins in axonal regeneration after injury. We report that axonal injury increases the binding of proteins to a CRE sequence-containing probe. In addition, Western blot analysis revealed that the level of ApCREB2, a CRE sequence-binding repressor, was enhanced as a result of axonal injury. The sequestration of CRE-binding proteins by microinjection of CRE sequence-containing plasmids enhanced axon collateral formation (both number and length) as compared with control plasmid injections. These findings show that Ca2+/cAMP-mediated gene expression via CRE-binding transcription factors participates in the regeneration of motor neuron axons.
Bioinformatic Analysis of the Human Recombinant Iduronate 2-Sulfate Sulfatase

PubMed Central

Morales-Álvarez, Edwin D.; Rivera-Hoyos, Claudia M.; Landázuri, Patricia; Poutou-Piñales, Raúl A.; Pedroza-Rodríguez, Aura M.

2016-01-01

Mucopolysaccharidosis type II is a human recessive disease linked to the X chromosome, caused by deficiency of the lysosomal enzyme Iduronate 2-Sulfate Sulfatase (IDS), which leads to accumulation of glycosaminoglycans in tissues and organs. The human enzyme has been expressed in Escherichia coli and Pichia pastoris in an attempt to develop more successful expression systems that allow the production of recombinant IDS for Enzyme Replacement Therapy (ERT). However, the preservation of the native signal peptide in the sequence has caused conflicts in processing and recognition in the past, which led to problems in expression and enzyme activity. With the main objective being the improvement of the expression system, we eliminated the native signal peptide of human recombinant IDS. The resulting sequence showed two modified codons; thus, our study aimed to analyze computationally the nucleotide sequence of the IDSnh without signal peptide in order to determine the 3D structure and other biochemical properties and to compare them with the native human IDS (IDSnh). Results showed that there are no significant differences between the two molecules in spite of the two-codon modifications detected in the recombinant DNA sequence. PMID:27335624
Analysis of temporal transcription expression profiles reveal links between protein function and developmental stages of Drosophila melanogaster.

PubMed

Wan, Cen; Lees, Jonathan G; Minneci, Federico; Orengo, Christine A; Jones, David T

2017-10-01

Accurate gene or protein function prediction is a key challenge in the post-genome era. Most current methods perform well on molecular function prediction, but struggle to provide useful annotations relating to biological process functions due to the limited power of sequence-based features in that functional domain. In this work, we systematically evaluate the predictive power of temporal transcription expression profiles for protein function prediction in Drosophila melanogaster. Our results show significantly better performance on predicting protein function when transcription expression profile-based features are integrated with sequence-derived features, compared with the sequence-derived features alone. We also observe that the combination of expression-based and sequence-based features leads to further improvement of accuracy on predicting all three domains of gene function. Based on the optimal feature combinations, we then propose a novel multi-classifier-based function prediction method for Drosophila melanogaster proteins, FFPred-fly+. Interpreting our machine learning models also allows us to identify some of the underlying links between biological processes and developmental stages of Drosophila melanogaster.
Genome-wide identification of wheat (Triticum aestivum) expansins and expansin expression analysis in cold-tolerant and cold-sensitive wheat cultivars

PubMed Central

Zhang, Jun-Feng; Xu, Yong-Qing; Dong, Jia-Min; Peng, Li-Na; Feng, Xu; Wang, Xu; Li, Fei; Miao, Yu; Yao, Shu-Kuan; Zhao, Qiao-Qin; Feng, Shan-Shan; Hu, Bao-Zhong

2018-01-01

Plant expansins are proteins involved in cell wall loosening, plant growth, and development, as well as in response to plant diseases and other stresses. In this study, we identified 128 expansin coding sequences from the wheat (Triticum aestivum) genome. These sequences belong to 45 homoeologous copies of TaEXPs, including 26 TaEXPAs, 15 TaEXPBs and four TaEXLAs. No TaEXLB was identified. Gene expression and sub-expression profiles revealed that most of the TaEXPs were expressed either only in root tissues or in multiple organs. Real-time qPCR analysis showed that many TaEXPs were differentially expressed in four different tissues of the two wheat cultivars—the cold-sensitive ‘Chinese Spring (CS)’ and the cold-tolerant ‘Dongnongdongmai 1 (D1)’ cultivars. Our results suggest that the differential expression of TaEXPs could be related to low-temperature tolerance or sensitivity of different wheat cultivars. Our study expands our knowledge on wheat expansins and sheds new light on the functions of expansins in plant development and stress response. PMID:29596529
Functionally Convergent B Cell Receptor Sequences in Transgenic Rats Expressing a Human B Cell Repertoire in Response to Tetanus Toxoid and Measles Antigens.

PubMed

Bürckert, Jean-Philippe; Dubois, Axel R S X; Faison, William J; Farinelle, Sophie; Charpentier, Emilie; Sinner, Regina; Wienecke-Baldacchino, Anke; Muller, Claude P

2017-01-01

The identification and tracking of antigen-specific immunoglobulin (Ig) sequences within total Ig repertoires is central to high-throughput sequencing (HTS) studies of infections or vaccinations. In this context, public Ig sequences shared by different individuals exposed to the same antigen could be valuable markers for tracing back infections, measuring vaccine immunogenicity, and perhaps ultimately allow the reconstruction of the immunological history of an individual. Here, we immunized groups of transgenic rats expressing human Ig against tetanus toxoid (TT), Modified Vaccinia virus Ankara (MVA), measles virus hemagglutinin and fusion proteins expressed on MVA, and the environmental carcinogen benzo[a]pyrene, coupled to TT. We showed that these antigens impose a selective pressure causing the Ig heavy chain (IgH) repertoires of the rats to converge toward the expression of antibodies with highly similar IgH CDR3 amino acid sequences. We present a computational approach, similar to differential gene expression analysis, that selects for clusters of CDR3s with 80% similarity, significantly overrepresented within the different groups of immunized rats. These IgH clusters represent antigen-induced IgH signatures exhibiting stereotypic amino acid patterns including previously described TT- and measles-specific IgH sequences. Our data suggest that with the presented methodology, transgenic Ig rats can be utilized as a model to identify antigen-induced, human IgH signatures to a variety of different antigens.
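The clustering step mentioned in the abstract above (grouping CDR3s at 80% similarity) can be sketched as follows. The abstract does not state the similarity metric or the clustering algorithm, so this sketch substitutes a simple stand-in: position-wise identity between equal-length amino acid sequences, combined with single-linkage grouping via union-find. The sequences in the usage example are invented for illustration.

```python
from itertools import combinations


def identity(a, b):
    """Fraction of identical positions; 0.0 for empty or unequal-length sequences."""
    if not a or len(a) != len(b):
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_cdr3(seqs, threshold=0.8):
    """Single-linkage clustering: two sequences are linked if identity >= threshold."""
    parent = list(range(len(seqs)))

    def find(i):
        # Follow parent pointers to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if identity(seqs[i], seqs[j]) >= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i, seq in enumerate(seqs):
        clusters.setdefault(find(i), []).append(seq)
    # Largest clusters first; members keep their input order.
    return sorted(clusters.values(), key=len, reverse=True)


if __name__ == "__main__":
    example = ["ARDYYGSSYW", "ARDYYGSGYW", "ARDFYGSGYW", "AKGGWELLDY"]
    for members in cluster_cdr3(example):
        print(members)
```

Testing whether a cluster is overrepresented in one immunization group, as the abstract describes, would be a separate statistical step applied to per-group cluster counts and is not shown here.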
Two estrogen response element sequences near the PCNA gene are not responsible for its estrogen-enhanced expression in MCF7 cells.

PubMed

Wang, Cheng; Yu, Jie; Kallen, Caleb B

2008-01-01

The proliferating cell nuclear antigen (PCNA) is an essential component of DNA replication, cell cycle regulation, and epigenetic inheritance. High expression of PCNA is associated with poor prognosis in patients with breast cancer. The 5'-region of the PCNA gene contains two computationally-detected estrogen response element (ERE) sequences, one of which is evolutionarily conserved. Both of these sequences are of undocumented cis-regulatory function. We recently demonstrated that estradiol (E2) enhances PCNA mRNA expression in MCF7 breast cancer cells. MCF7 cells proliferate in response to E2. Here, we demonstrate that E2 rapidly enhanced PCNA mRNA and protein expression in a process that requires ERalpha as well as de novo protein synthesis. One of the two upstream ERE sequences was specifically bound by ERalpha-containing protein complexes, in vitro, in gel shift analysis. Yet, each ERE sequence, when cloned as a single copy, or when engineered as two tandem copies of the ERE-containing sequence, was not capable of activating a luciferase reporter construct in response to E2. In MCF7 cells, neither ERE-containing genomic region demonstrated E2-dependent recruitment of ERalpha by sensitive ChIP-PCR assays. We conclude that E2 enhances PCNA gene expression by an indirect process and that computational detection of EREs, even when evolutionarily conserved and when near E2-responsive genes, requires biochemical validation.
Functional identification and regulatory analysis of Δ6-fatty acid desaturase from the oleaginous fungus Mucor sp. EIM-10.

PubMed

Jiang, Xianzhang; Liu, Hongjiao; Niu, Yongchao; Qi, Feng; Zhang, Mingliang; Huang, Jianzhong

2017-03-01

To enlarge the diversity of the desaturases associated with PUFA biosynthesis and to better understand the transcriptional regulation of desaturases, a Δ6-desaturase gene (Md6) from Mucor sp. and its 5'-upstream sequence was functionally identified in Saccharomyces cerevisiae. Expression of the Δ6-fatty acid desaturase (Md6) in S. cerevisiae showed that Md6 could convert linolenic acid to γ-linolenic acid. Computational analysis of the promoter of Md6 suggested it contains several eukaryotic fundamental transcription regulatory elements. In vivo functional analysis of the promoter showed the 5'-upstream sequence of Md6 could initiate expression of GFP and Md6 itself in S. cerevisiae. A series deletion analysis of the promoter suggested that sequence between -919 to -784 bp (relative to start site) named as eMd6 is the key factor for high activity of Δ6-desaturase. The activity of Δ6-desaturase was increased by 2.8-fold and 2.5-fold when the eMd6 sequence was placed upstream of -434 with forward or reverse orientations respectively. To our best knowledge, the native promoter of Md6 from Mucor is the strongest promoter for Δ6-desaturase reported so far and the sequence between -919 to -784 bp is an enhancer for Δ6-desaturase activity.
Characterization of microRNAs Expressed during Secondary Wall Biosynthesis in Acacia mangium

PubMed Central

Ong, Seong Siang; Wickneswari, Ratnam

2012-01-01

MicroRNAs (miRNAs) play critical regulatory roles by acting as sequence-specific guides during secondary wall formation in woody and non-woody species. Although thousands of plant miRNAs have been sequenced, there is no comprehensive view of the miRNA-mediated gene regulatory network to provide profound biological insights into the regulation of xylem development. Herein, we report the involvement of six highly conserved amg-miRNA families (amg-miR166, amg-miR172, amg-miR168, amg-miR159, amg-miR394, and amg-miR156) as the potential regulatory sequences of secondary cell wall biosynthesis. Among these highly conserved amg-miRNA families, only amg-miR166 exhibited strong differences in expression between phloem and xylem tissue. The functional characterization of amg-miR166 targets in various tissues revealed three groups of HD-ZIP III: ATHB8, ATHB15, and REVOLUTA, which play pivotal roles in xylem development. Although these three groups vary in their functions, psRNA target analysis indicated that miRNA target sequences of the nine different members of HD-ZIP III are always conserved. We found that precursor structures of amg-miR166 undergo exhaustive sequence variation even within members of the same family. Gene expression analysis showed that three key lignin pathway genes, C4H, CAD, and CCoAOMT, were upregulated in compression wood, where a cascade of miRNAs was downregulated. This study offers a comprehensive analysis of the involvement of highly conserved miRNAs implicated in the secondary wall formation of woody plants. PMID:23251324
Isolation and expression analysis of EcbZIP17 from different finger millet genotypes shows conserved nature of the gene.

PubMed

Chopperla, Ramakrishna; Singh, Sonam; Mohanty, Sasmita; Reddy, Nanja; Padaria, Jasdeep C; Solanke, Amolkumar U

2017-10-01

Basic leucine zipper (bZIP) transcription factors comprise one of the largest gene families in plants. They play a key role in almost every aspect of plant growth and development and also in biotic and abiotic stress tolerance. In this study, we report isolation and characterization of EcbZIP17, a group B bZIP transcription factor from a climate smart cereal, finger millet (Eleusine coracana L.). The genomic sequence of EcbZIP17 is 2662 bp long encompassing two exons and one intron with ORF of 1722 bp and peptide length of 573 aa. This gene is homologous to AtbZIP17 (Arabidopsis), ZmbZIP17 (maize) and OsbZIP60 (rice) which play a key role in endoplasmic reticulum (ER) stress pathway. In silico analysis confirmed the presence of basic leucine zipper (bZIP) and transmembrane (TM) domains in the EcbZIP17 protein. Allele mining of this gene in 16 different genotypes by Sanger sequencing revealed no variation in nucleotide sequence, including the 618 bp long intron. Expression analysis of EcbZIP17 under heat stress exhibited similar pattern of expression in all the genotypes across time intervals with highest upregulation after 4 h. The present study established the conserved nature of EcbZIP17 at nucleotide and expression level.
Cloning and characterization of the Cerasus humilis sucrose phosphate synthase gene (ChSPS1)

PubMed Central

Du, Junjie; Mu, Xiaopeng; Wang, Pengfei

2017-01-01

Sucrose is crucial to the growth and development of plants, and sucrose phosphate synthase (SPS) plays a key role in sucrose synthesis. To understand the genetic and molecular mechanisms of sucrose synthesis in Cerasus humilis, ChSPS1, a homologue of SPS, was cloned using RT-PCR. Sequence analysis showed that the open reading frame (ORF) sequence of ChSPS1 is 3174 bp in length, encoding a predicted protein of 1057 amino acids. The predicted protein showed a high degree of sequence identity with SPS homologues from other species. Real-time RT-PCR analysis showed that ChSPS1 mRNA was detected in all tissues and the transcription level was the highest in mature fruit. There is a significant positive correlation between expression of ChSPS1 and sucrose content. Prokaryotic expression of ChSPS1 indicated that ChSPS1 protein was expressed in E. coli and it had the SPS activity. Overexpression of ChSPS1 in tobacco led to upregulation of enzyme activity and increased sucrose contents in transgenic plants. Real-time RT-PCR analysis showed that the expression of ChSPS1 in transgenic tobacco was significantly higher than in wild type plants. These results suggested that ChSPS1 plays an important role in sucrose synthesis in Cerasus humilis. PMID:29036229
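The relationship between the ORF length and the predicted protein length reported above (3174 bp and 1057 amino acids) can be checked with a short calculation. This sketch assumes the reported ORF length includes the terminal stop codon, which is the usual convention but is not stated in the abstract; the function name is hypothetical.

```python
def protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of the given length in base pairs."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    # The stop codon occupies three bases but encodes no amino acid.
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    # 3174 bp / 3 = 1058 codons; minus the stop codon gives 1057 residues.
    print(protein_length(3174))
```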
Regulation of the Osem gene by abscisic acid and the transcriptional activator VP1: analysis of cis-acting promoter elements required for regulation by abscisic acid and VP1.

PubMed

Hattori, T; Terada, T; Hamasuna, S

1995-06-01

Osem, a rice gene homologous to the wheat Em gene, which encodes one of the late-embryogenesis abundant proteins, was isolated. The gene was characterized with respect to control of transcription by abscisic acid (ABA) and the transcriptional activator VP1, which is involved in ABA-regulated gene expression during late embryogenesis. A fusion gene (Osem-GUS) consisting of the Osem promoter and the bacterial beta-glucuronidase (GUS) gene was constructed and tested in a transient expression system, using protoplasts derived from a suspension-cultured line of rice cells, for activation by ABA and by co-transfection with an expression vector (35S-Osvp1) for the rice VP1 (OSVP1) cDNA. The expression of Osem-GUS was strongly (40- to 150-fold) activated by externally applied ABA and by over-expression of (OS)VP1. The Osem promoter has three ACGTG-containing sequences, motif A, motif B and motif A', which resemble the abscisic acid-responsive element (ABRE) that was previously identified in the wheat Em and the rice Rab16. There is also a CATGCATG sequence, which is known as the Sph box and has been shown to be essential for the regulation by VP1 of the maize anthocyanin regulatory gene C1. Focusing on these sequence elements, various mutant derivatives of the Osem promoter were assayed in the transient expression system. The analysis revealed that motif A functions not only as an ABRE but also as a sequence element required for the regulation by (OS)VP1.
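Locating the two sequence elements named in the abstract above (the ACGTG core and the CATGCATG Sph box) in a promoter can be sketched with a simple scan of both strands. The promoter string in the usage example is invented and is not the Osem promoter; the function names are hypothetical. Only the two motif strings come from the abstract.

```python
import re

MOTIFS = {"ACGTG core": "ACGTG", "Sph box": "CATGCATG"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motifs(promoter):
    """Return sorted (position, strand, motif name) hits, 0-based on the given strand."""
    seq = promoter.upper()
    hits = []
    for name, motif in MOTIFS.items():
        rc = reverse_complement(motif)
        patterns = [("+", motif)]
        if rc != motif:  # skip the minus-strand scan for palindromic motifs
            patterns.append(("-", rc))
        for strand, pattern in patterns:
            # A zero-width lookahead reports overlapping occurrences too.
            for match in re.finditer(f"(?={pattern})", seq):
                hits.append((match.start(), strand, name))
    return sorted(hits)


if __name__ == "__main__":
    toy_promoter = "TTACGTGGCAACATGCATGCCCACGTAA"  # invented sequence
    for hit in find_motifs(toy_promoter):
        print(hit)
```

A scan like this only finds candidate sites. As the abstract shows, which motifs actually function as an ABRE or respond to VP1 was established by assaying mutant promoters.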
Solexa-Sequencing Based Transcriptome Study of Plaice Skin Phenotype in Rex Rabbits (Oryctolagus cuniculus)

PubMed Central

Pan, Lei; Liu, Yan; Wei, Qiang; Xiao, Chenwen; Ji, Quanan; Bao, Guolian; Wu, Xinsheng

2015-01-01

Background: Fur is an important genetically determined characteristic of domestic rabbits; rabbit furs are of great economic value. We used the Solexa sequencing technology to assess gene expression in skin tissues from full-sib Rex rabbits of different phenotypes in order to explore the molecular mechanisms associated with fur determination. Methodology/Principal Findings: Transcriptome analysis included de novo assembly, gene function identification, and gene function classification and enrichment. We obtained 74,032,912 and 71,126,891 short reads of 100 nt, which were assembled into 377,618 unique sequences by Trinity strategy (N50 = 680 nt). Based on BLAST results with known proteins, 50,228 sequences were identified at a cut-off E-value ≥ 10(-5). Using Blast to Gene Ontology (GO), Clusters of Orthologous Groups (KOG) and Kyoto Encyclopedia of Genes and Genomes (KEGG), we obtained several genes with important protein functions. A total of 308 differentially expressed genes were obtained by transcriptome analysis of plaice and un-plaice phenotype animals; 209 additional differentially expressed genes were not found in any database. These genes included 49 that were only expressed in plaice skin rabbits. The novel genes may play important roles during skin growth and development. In addition, 99 known differentially expressed genes were assigned to PI3K-Akt signaling, focal adhesion, and ECM-receptor interaction, among others. Growth factors play a role in skin growth and development by regulating these signaling pathways. We confirmed the altered expression levels of seven target genes by qRT-PCR. We also chose a key gene for SNP analysis to identify differences between plaice and un-plaice phenotype rabbits. Conclusions/Significance: The rabbit transcriptome profiling data provide new insights into the molecular mechanisms underlying rabbit skin growth and development. PMID:25955442
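The assembly above is summarized by its N50 value. As a reminder of what that statistic measures, here is a minimal Python sketch using the common definition (the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembled bases). The function name and the toy contig lengths are hypothetical.

```python
def n50(lengths: list) -> int:
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half of all bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Toy contig lengths invented for illustration.
print(n50([100, 200, 300, 400, 500]))  # 400
```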
Cloning and Characterization of an Outer Membrane Protein of Vibrio vulnificus Required for Heme Utilization: Regulation of Expression and Determination of the Gene Sequence

PubMed Central

Litwin, Christine M.; Byrne, Burke L.

1998-01-01

Vibrio vulnificus is a halophilic, marine pathogen that has been associated with septicemia and serious wound infections in patients with iron overload and preexisting liver disease. For V. vulnificus, the ability to acquire iron from the host has been shown to correlate with virulence. V. vulnificus is able to use host iron sources such as hemoglobin and heme. We previously constructed a fur mutant of V. vulnificus which constitutively expresses at least two iron-regulated outer membrane proteins, of 72 and 77 kDa. The N-terminal amino acid sequence of the 77-kDa protein purified from the V. vulnificus fur mutant had 67% homology with the first 15 amino acids of the mature protein of the Vibrio cholerae heme receptor, HutA. In this report, we describe the cloning, DNA sequence, mutagenesis, and analysis of transcriptional regulation of the structural gene for HupA, the heme receptor of V. vulnificus. DNA sequencing of hupA demonstrated a single open reading frame of 712 amino acids that was 50% identical and 66% similar to the sequence of V. cholerae HutA and similar to those of other TonB-dependent outer membrane receptors. Primer extension analysis localized one promoter for the V. vulnificus hupA gene. Analysis of the promoter region of V. vulnificus hupA showed a sequence homologous to the consensus Fur box. Northern blot analysis showed that the transcript was strongly regulated by iron. An internal deletion in the V. vulnificus hupA gene, done by using marker exchange, resulted in the loss of expression of the 77-kDa protein and the loss of the ability to use hemin or hemoglobin as a source of iron. The hupA deletion mutant of V. vulnificus will be helpful in future studies of the role of heme iron in V. vulnificus pathogenesis. PMID:9632577
Identification of aberrantly expressed long non-coding RNAs in stomach adenocarcinoma.

PubMed

Gu, Jianbin; Li, Yong; Fan, Liqiao; Zhao, Qun; Tan, Bibo; Hua, Kelei; Wu, Guobin

2017-07-25

Stomach adenocarcinoma (STAD) is a common malignancy worldwide. This study aimed to identify the aberrantly expressed long non-coding RNAs (lncRNAs) in STAD. A total of 74 DElncRNAs and 449 DEmRNAs were identified in STAD compared with paired non-tumor tissues. The DElncRNA/DEmRNA co-expression network was constructed, which covered 519 nodes and 2993 edges. The qRT-PCR validation results of DElncRNAs were consistent with our bioinformatics analysis based on RNA-sequencing. The DEmRNAs co-expressed with DElncRNAs were significantly enriched in gastric acid secretion, complement and coagulation cascades, pancreatic secretion, cytokine-cytokine receptor interaction and Jak-STAT signaling pathway. The expression levels of the nine candidate DElncRNAs in the TCGA database were compatible with our RNA-sequencing. FEZF1-AS1, HOTAIR and LINC01234 had potential diagnostic value for STAD. The lncRNA and mRNA expression profiles of 3 STAD tissues and 3 matched adjacent non-tumor tissues were obtained through high-throughput RNA-sequencing. Differentially expressed lncRNAs/mRNAs (DElncRNAs/DEmRNAs) were identified in STAD. DElncRNA/DEmRNA co-expression network construction, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were conducted to predict the biological functions of DElncRNAs. Quantitative real-time polymerase chain reaction (qRT-PCR) was used to validate the expression levels of DEmRNAs and DElncRNAs. Moreover, the expression of DElncRNAs was validated through The Cancer Genome Atlas (TCGA) database. The diagnostic value of candidate DElncRNAs was assessed by receiver operating characteristic (ROC) analysis. Our work might provide useful information for exploring the tumorigenesis mechanism of STAD and pave the way for identification of diagnostic biomarkers in STAD.

Analysis of the Nicotiana tabacum Stigma/Style Transcriptome Reveals Gene Expression Differences between Wet and Dry Stigma Species

PubMed Central

Quiapim, Andréa C.; Brito, Michael S.; Bernardes, Luciano A.S.; daSilva, Idalete; Malavazi, Iran; DePaoli, Henrique C.; Molfetta-Machado, Jeanne B.; Giuliatti, Silvana; Goldman, Gustavo H.; Goldman, Maria Helena S.

2009-01-01

The success of plant reproduction depends on pollen-pistil interactions occurring at the stigma/style. These interactions vary depending on the stigma type: wet or dry. Tobacco (Nicotiana tabacum) represents a model of wet stigma, and its stigmas/styles express genes to accomplish the appropriate functions. For a large-scale study of gene expression during tobacco pistil development and preparation for pollination, we generated 11,216 high-quality expressed sequence tags (ESTs) from stigmas/styles and created the TOBEST database. These ESTs were assembled in 6,177 clusters, from which 52.1% are pistil transcripts/genes of unknown function. The 21 clusters with the highest number of ESTs (putative higher expression levels) correspond to genes associated with defense mechanisms or pollen-pistil interactions. The database analysis unraveled tobacco sequences homologous to the Arabidopsis (Arabidopsis thaliana) genes involved in specifying pistil identity or determining normal pistil morphology and function. Additionally, 782 independent clusters were examined by macroarray, revealing 46 stigma/style preferentially expressed genes. Real-time reverse transcription-polymerase chain reaction experiments validated the pistil-preferential expression for nine out of 10 genes tested. A search for these 46 genes in the Arabidopsis pistil data sets demonstrated that only 11 sequences, with putative equivalent molecular functions, are expressed in this dry stigma species. The reverse search for the Arabidopsis pistil genes in the TOBEST exposed a partial overlap between these dry and wet stigma transcriptomes. The TOBEST represents the most extensive survey of gene expression in the stigmas/styles of wet stigma plants, and our results indicate that wet and dry stigmas/styles express common as well as distinct genes in preparation for the pollination process. PMID:19052150
Differences in acid tolerance between Bifidobacterium breve BB8 and its acid-resistant derivative B. breve BB8dpH, revealed by RNA-sequencing and physiological analysis.

PubMed

Yang, Xu; Hang, Xiaomin; Tan, Jing; Yang, Hong

2015-06-01

Bifidobacteria are common inhabitants of the human gastrointestinal tract, and their application has increased dramatically in recent years due to their health-promoting effects. The ability of bifidobacteria to tolerate acidic environments is particularly important for their function as probiotics because they encounter such environments in food products and during passage through the gastrointestinal tract. In this study, we generated a derivative, Bifidobacterium breve BB8dpH, which displayed a stable, acid-resistant phenotype. To investigate the possible reasons for the higher acid tolerance of B. breve BB8dpH, as compared with its parental strain B. breve BB8, a combined transcriptome and physiological approach was used to characterize differences between the two strains. An analysis of the transcriptome by RNA-sequencing indicated that the expression of 121 genes was increased by more than 2-fold, while the expression of 146 genes was reduced more than 2-fold, in B. breve BB8dpH. Validation of the RNA-sequencing data using real-time quantitative PCR analysis demonstrated that the RNA-sequencing results were highly reliable. The comparison analysis, based on differentially expressed genes, suggested that the acid tolerance of B. breve BB8dpH was enhanced by regulating the expression of genes involved in carbohydrate transport and metabolism, energy production, synthesis of cell envelope components (peptidoglycan and exopolysaccharide), synthesis and transport of glutamate and glutamine, and histidine synthesis. Furthermore, an analysis of physiological data showed that B. breve BB8dpH displayed higher production of exopolysaccharide and lower H(+)-ATPase activity than B. breve BB8. The results presented here will improve our understanding of acid tolerance in bifidobacteria, and they will lead to the development of new strategies to enhance the acid tolerance of bifidobacterial strains. Copyright © 2015 Elsevier Ltd. All rights reserved.
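The 2-fold criterion used above to call genes up- or down-regulated can be illustrated with a short Python sketch. The function name, the pseudocount, and the expression values are hypothetical and are not taken from the study; a real RNA-sequencing analysis would normally combine such a threshold with a statistical test across replicates.

```python
def classify_fold_change(derivative: float, parent: float,
                         threshold: float = 2.0,
                         pseudocount: float = 1.0) -> str:
    """Label a gene by its expression ratio (derivative strain / parental strain).

    The pseudocount is an assumption added here to avoid division by zero.
    """
    ratio = (derivative + pseudocount) / (parent + pseudocount)
    if ratio > threshold:
        return "up"
    if ratio < 1.0 / threshold:
        return "down"
    return "unchanged"


# Toy normalized expression values invented for illustration.
print(classify_fold_change(30.0, 9.0))   # ratio 3.1  -> up
print(classify_fold_change(4.0, 19.0))   # ratio 0.25 -> down
print(classify_fold_change(10.0, 10.0))  # ratio 1.0  -> unchanged
```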
Improving RNA-Seq expression estimation by modeling isoform- and exon-specific read sequencing rate.

PubMed

Liu, Xuejun; Shi, Xinxin; Chen, Chunlin; Zhang, Li

2015-10-16

The high-throughput sequencing technology, RNA-Seq, has been widely used to quantify gene and isoform expression in the study of transcriptome in recent years. Accurate expression measurement from the millions or billions of short generated reads is obstructed by difficulties. One is ambiguous mapping of reads to reference transcriptome caused by alternative splicing. This increases the uncertainty in estimating isoform expression. The other is non-uniformity of read distribution along the reference transcriptome due to positional, sequencing, mappability and other undiscovered sources of biases. This violates the uniform assumption of read distribution for many expression calculation approaches, such as the direct RPKM calculation and Poisson-based models. Many methods have been proposed to address these difficulties. Some approaches employ latent variable models to discover the underlying pattern of read sequencing. However, most of these methods make bias correction based on surrounding sequence contents and share the bias models by all genes. They therefore cannot estimate gene- and isoform-specific biases as revealed by recent studies. We propose a latent variable model, NLDMseq, to estimate gene and isoform expression. Our method adopts latent variables to model the unknown isoforms, from which reads originate, and the underlying percentage of multiple spliced variants. The isoform- and exon-specific read sequencing biases are modeled to account for the non-uniformity of read distribution, and are identified by utilizing the replicate information of multiple lanes of a single library run. We employ simulation and real data to verify the performance of our method in terms of accuracy in the calculation of gene and isoform expression. Results show that NLDMseq obtains competitive gene and isoform expression compared to popular alternatives. 
Finally, the proposed method is applied to the detection of differential expression (DE) to show its usefulness in the downstream analysis. The proposed NLDMseq method provides an approach to accurately estimate gene and isoform expression from RNA-Seq data by modeling the isoform- and exon-specific read sequencing biases. It makes use of a latent variable model to discover the hidden pattern of read sequencing. We have shown that it works well in both simulations and real datasets, and has competitive performance compared to popular methods. The method has been implemented as a freely available software which can be found at https://github.com/PUGEA/NLDMseq.
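For context, the direct RPKM calculation that the abstract contrasts with latent variable models is the standard reads-per-kilobase-per-million-mapped-reads formula, which implicitly assumes reads are spread uniformly along each transcript. A minimal Python sketch follows; the function name and the toy counts are hypothetical, and this is not the NLDMseq model itself.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads Per Kilobase of transcript per Million mapped reads.

    RPKM = 1e9 * reads_on_gene / (gene_length_bp * total_mapped_reads)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and library size must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Toy numbers: 500 reads on a 2 kb gene in a library of 10 million mapped reads.
print(rpkm(500, 2000, 10_000_000))  # 25.0
```

Because the formula divides by length only, any positional or sequence-specific bias in read coverage passes directly into the estimate, which is the limitation the abstract describes.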
Single-Cell Sequencing of the Healthy and Diseased Heart Reveals Ckap4 as a New Modulator of Fibroblasts Activation.

PubMed

Gladka, Monika M; Molenaar, Bas; de Ruiter, Hesther; van der Elst, Stefan; Tsui, Hoyee; Versteeg, Danielle; Lacraz, Grègory P A; Huibers, Manon M H; van Oudenaarden, Alexander; van Rooij, Eva

2018-01-31

Background: Genome-wide transcriptome analysis has greatly advanced our understanding of the regulatory networks underlying basic cardiac biology and mechanisms driving disease. However, so far, the resolution of studying gene expression patterns in the adult heart has been limited to the level of extracts from whole tissues. The use of tissue homogenates inherently causes the loss of any information on cellular origin or cell type-specific changes in gene expression. Recent developments in RNA amplification strategies provide a unique opportunity to use small amounts of input RNA for genome-wide sequencing of single cells. Methods: Here, we present a method to obtain high quality RNA from digested cardiac tissue from adult mice for automated single-cell sequencing of both the healthy and diseased heart. Results: After optimization, we were able to perform single-cell sequencing on adult cardiac tissue under both homeostatic conditions and after ischemic injury. Clustering analysis based on differential gene expression unveiled known and novel markers of all main cardiac cell types. Based on differential gene expression we were also able to identify multiple subpopulations within a certain cell type. Furthermore, applying single-cell sequencing on both the healthy and the injured heart indicated the presence of disease-specific cell subpopulations. As such, we identified cytoskeleton associated protein 4 (Ckap4) as a novel marker for activated fibroblasts that positively correlates with known myofibroblast markers in both mouse and human cardiac tissue. Ckap4 inhibition in activated fibroblasts treated with TGFβ triggered a greater increase in the expression of genes related to activated fibroblasts compared to control, suggesting a role of Ckap4 in modulating fibroblast activation in the injured heart. 
Conclusions: Single-cell sequencing on both the healthy and diseased adult heart allows us to study transcriptomic differences between cardiac cells, as well as cell type-specific changes in gene expression during cardiac disease. This new approach provides a wealth of novel insights into molecular changes that underlie the cellular processes relevant for cardiac biology and pathophysiology. Applying this technology could lead to the discovery of new therapeutic targets relevant for heart disease.
The G-quadruplex augments translation in the 5' untranslated region of transforming growth factor β2.

PubMed

Agarwala, Prachi; Pandey, Satyaprakash; Mapa, Koyeli; Maiti, Souvik

2013-03-05

Transforming growth factor β2 (TGFβ2) is a versatile cytokine with a prominent role in cell migration, invasion, cellular development, and immunomodulation. TGFβ2 promotes the malignancy of tumors by inducing epithelial-mesenchymal transition, angiogenesis, and immunosuppression. As it is well-documented that nucleic acid secondary structure can regulate gene expression, we assessed whether any secondary motif regulates its expression at the post-transcriptional level. Bioinformatics analysis predicts an existence of a 23-nucleotide putative G-quadruplex sequence (PG4) in the 5' untranslated region (UTR) of TGFβ2 mRNA. The ability of this stretch of sequence to form a highly stable, intramolecular parallel quadruplex was demonstrated using ultraviolet and circular dichroism spectroscopy. Footprinting studies further validated its existence in the presence of a neighboring nucleotide sequence. Following structural characterization, we evaluated the biological relevance of this secondary motif using a dual luciferase assay. Although PG4 inhibits the expression of the reporter gene, its presence in the context of the entire 5' UTR sequence interestingly enhances gene expression. Mutation or removal of the G-quadruplex sequence from the 5' UTR of the gene diminished the level of expression of this gene at the translational level. Thus, here we highlight an activating role of the G-quadruplex in modulating gene expression of TGFβ2 at the translational level and its potential to be used as a target for the development of therapeutics against cancer.
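The abstract does not say which tool predicted the putative G-quadruplex. A widely used rule of thumb searches for four runs of at least three guanines separated by loops of one to seven nucleotides; the Python sketch below applies that rule as an assumption. The function name and the toy sequence are hypothetical, and the sequence is not the TGFβ2 5' UTR.

```python
import re

# Assumed folding rule: G{3+} loop{1-7} G{3+} loop{1-7} G{3+} loop{1-7} G{3+}
G4_PATTERN = re.compile(
    r"G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}"
)


def find_putative_g4(sequence: str) -> list:
    """Return (0-based start, matched sequence) for non-overlapping putative G4 motifs."""
    rna = sequence.upper().replace("T", "U")  # accept DNA or RNA input
    return [(m.start(), m.group()) for m in G4_PATTERN.finditer(rna)]


# Toy sequence invented for illustration.
print(find_putative_g4("AAGGGAGGGCUGGGAAGGGCC"))
```

A match from such a pattern is only a candidate; as the abstract notes, quadruplex formation was confirmed experimentally by spectroscopy and footprinting.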
Identification, characterization and expression analysis of pigeonpea miRNAs in response to Fusarium wilt.

PubMed

Hussain, Khalid; Mungikar, Kanak; Kulkarni, Abhijeet; Kamble, Avinash

2018-05-05

Upon confrontation with unfavourable conditions, plants invoke a very complex set of biochemical and physiological reactions and alter gene expression patterns to cope with them. MicroRNAs (miRNAs), a class of small non-coding RNA, contribute extensively to the regulation of gene expression through translation inhibition or degradation of their target mRNAs under such conditions. Therefore, identification of miRNAs and their targets is important for understanding the regulatory networks triggered during stress. Structure and sequence similarity based in silico prediction of miRNAs in the Cajanus cajan L. (pigeonpea) draft genome sequence has been carried out earlier. These annotations also appear in related GenBank genome sequence entries. However, there are no reports available on context-dependent miRNA expression and their targets in pigeonpea. Therefore, in the present study we addressed these questions computationally, using pigeonpea EST sequence information. We identified five novel pigeonpea miRNA precursors, their mature forms and targets. Interestingly, only one of these miRNAs (miR169i-3p) was identified earlier in the draft genome sequence. We then validated expression of these miRNAs experimentally. It was also observed that these miRNAs show differential expression patterns in response to Fusarium inoculation, indicating their biotic stress responsive nature. Overall, these results will help towards a better understanding of the regulatory network of defense during pigeonpea-pathogen interactions and the role of miRNAs in the process. Copyright © 2018 Elsevier B.V. All rights reserved.
Cloning, characterization, expression and comparative analysis of pig Golgi membrane sphingomyelin synthase 1.

PubMed

Guillén, Natalia; Navarro, María A; Surra, Joaquín C; Arnal, Carmen; Fernández-Juan, Marta; Cebrián-Pérez, Jose Alvaro; Osada, Jesús

2007-02-15

Pig sphingomyelin synthase 1 (SMS1) cDNA was cloned, characterized and compared to the human ortholog. The porcine protein consists of 413 amino acids and displays 97% sequence identity with the human protein. A phylogenetic tree of proteins reveals that porcine SMS1 is more closely related to bovine and rodent proteins than to human. The measured protein mass was higher than the theoretical prediction based on amino acid sequence, suggesting some kind of posttranslational modification. Quantitative representation of tissue distribution obtained by real-time RT-PCR showed that it was widely expressed, although levels varied considerably among organs. The cardiovascular system, especially the heart, showed the highest value of all the tissues studied. Regional differences of expression were observed in the central nervous system and intestinal tract. Analysis of the hepatic mRNA and protein expression of SMS1 following turpentine treatment revealed a progressive decrease in the former paralleled by a decrease in the protein concentration. These findings indicate that the variation in expression among tissues might reflect a different requirement of Golgi sphingomyelin for the specific function of each organ, and that the enzyme is regulated in response to turpentine-induced hepatic injury.
Estimating differential expression from multiple indicators

PubMed Central

Ilmjärv, Sten; Hundahl, Christian Ansgar; Reimets, Riin; Niitsoo, Margus; Kolde, Raivo; Vilo, Jaak; Vasar, Eero; Luuk, Hendrik

2014-01-01

Regardless of the advent of high-throughput sequencing, microarrays remain central in current biomedical research. Conventional microarray analysis pipelines apply data reduction before the estimation of differential expression, which is likely to render the estimates susceptible to noise from signal summarization and reduce statistical power. We present a probe-level framework, which capitalizes on the high number of concurrent measurements to provide more robust differential expression estimates. The framework naturally extends to various experimental designs and target categories (e.g. transcripts, genes, genomic regions) as well as small sample sizes. Benchmarking in relation to popular microarray and RNA-sequencing data-analysis pipelines indicated high and stable performance on the Microarray Quality Control dataset and in a cell-culture model of hypoxia. Experimental data exhibiting long-range epigenetic silencing of gene expression were used to demonstrate the efficacy of detecting differential expression of genomic regions, a level of analysis not embraced by conventional workflows. Finally, we designed and conducted an experiment to identify hypothermia-responsive genes in terms of monotonic time-response. As a novel insight, hypothermia-dependent up-regulation of multiple genes of two major antioxidant pathways was identified and verified by quantitative real-time PCR. PMID:24586062
Isolation, structural analysis, and expression characteristics of the maize (Zea mays L.) hexokinase gene family.

PubMed

Zhang, Zhongbao; Zhang, Jiewei; Chen, Yajuan; Li, Ruifen; Wang, Hongzhi; Ding, Liping; Wei, Jianhua

2014-09-01

Hexokinases (HXKs, EC 2.7.1.1) play important roles in metabolism, glucose (Glc) signaling, and phosphorylation of Glc and fructose and are ubiquitous in all organisms. Despite their physiological importance, the maize HXK (ZmHXK) genes have not been analyzed systematically. We isolated and characterized nine members of the ZmHXK gene family which were distributed on 3 of the 10 maize chromosomes. A multiple sequence alignment and motif analysis revealed that the maize ZmHXK proteins share three conserved domains. Phylogenetic analysis revealed that the ZmHXK family can be divided into four subfamilies. We identified putative cis-elements in the ZmHXK promoter sequences potentially involved in phytohormone and abiotic stress responses, sugar repression, light and circadian rhythm regulation, Ca(2+) responses, seed development and germination, and CO2-responsive transcriptional activation. To study the functions of maize HXK isoforms, we characterized the expression of the ZmHXK5 and ZmHXK6 genes, which are evolutionarily related to the OsHXK5 and OsHXK6 genes from rice. Analysis of tissue-specific expression patterns using quantitative real time-PCR showed that ZmHXK5 was highly expressed in tassels, while ZmHXK6 was expressed in both tassels and leaves. ZmHXK5 and ZmHXK6 expression levels were upregulated by phytohormones and by abiotic stress.
Development of a Green Fluorescent Protein-Based Laboratory Curriculum

ERIC Educational Resources Information Center

Larkin, Patrick D.; Hartberg, Yasha

2005-01-01

A laboratory curriculum has been designed for an undergraduate biochemistry course that focuses on the investigation of the green fluorescent protein (GFP). The sequence of procedures extends from analysis of the DNA sequence through PCR amplification, recombinant plasmid DNA synthesis, bacterial transformation, expression, isolation, and…
Generation and analysis of expression sequence tags from haustoria of the wheat stripe rust fungus Puccinia striiformis f. sp. tritici

PubMed Central

2009-01-01

Background: Stripe rust, caused by Puccinia striiformis f. sp. tritici (Pst), is one of the most destructive diseases of wheat (Triticum aestivum L.) worldwide. In spite of its agricultural importance, the genomics and genetics of the pathogen are poorly characterized. Pst transcripts from urediniospores and germinated urediniospores have been examined previously, but little is known about genes expressed during host infection. Some genes involved in virulence in other rust fungi have been found to be specifically expressed in haustoria. Therefore, the objective of this study was to generate a cDNA library to characterize genes expressed in haustoria of Pst. Results: A total of 5,126 EST sequences of high quality were generated from haustoria of Pst, from which 287 contigs and 847 singletons were derived. Approximately 10% and 26% of the 1,134 unique sequences were homologous to proteins with known functions and hypothetical proteins, respectively. The remaining 64% of the unique sequences had no significant similarities in GenBank. Fifteen genes were predicted to be proteins secreted from Pst haustoria. Analysis of ten genes, including six secreted protein genes, using quantitative RT-PCR revealed changes in transcript levels in different developmental and infection stages of the pathogen. Conclusions: The haustorial cDNA library was useful in identifying genes of the stripe rust fungus expressed during the infection process. From the library, we identified 15 genes encoding putative secreted proteins and six genes induced during the infection process. These genes are candidates for further studies to determine their functions in wheat-Pst interactions. PMID:20028560
Inferring gene expression from ribosomal promoter sequences, a crowdsourcing approach

PubMed Central

Meyer, Pablo; Siwo, Geoffrey; Zeevi, Danny; Sharon, Eilon; Norel, Raquel; Segal, Eran; Stolovitzky, Gustavo; Siwo, Geoffrey; Rider, Andrew K.; Tan, Asako; Pinapati, Richard S.; Emrich, Scott; Chawla, Nitesh; Ferdig, Michael T.; Tung, Yi-An; Chen, Yong-Syuan; Chen, Mei-Ju May; Chen, Chien-Yu; Knight, Jason M.; Sahraeian, Sayed Mohammad Ebrahim; Esfahani, Mohammad Shahrokh; Dreos, Rene; Bucher, Philipp; Maier, Ezekiel; Saeys, Yvan; Szczurek, Ewa; Myšičková, Alena; Vingron, Martin; Klein, Holger; Kiełbasa, Szymon M.; Knisley, Jeff; Bonnell, Jeff; Knisley, Debra; Kursa, Miron B.; Rudnicki, Witold R.; Bhattacharjee, Madhuchhanda; Sillanpää, Mikko J.; Yeung, James; Meysman, Pieter; Rodríguez, Aminael Sánchez; Engelen, Kristof; Marchal, Kathleen; Huang, Yezhou; Mordelet, Fantine; Hartemink, Alexander; Pinello, Luca; Yuan, Guo-Cheng

2013-01-01

The Gene Promoter Expression Prediction challenge consisted of predicting gene expression from promoter sequences in a previously unknown experimentally generated data set. The challenge was presented to the community in the framework of the sixth Dialogue for Reverse Engineering Assessments and Methods (DREAM6), a community effort to evaluate the status of systems biology modeling methodologies. Nucleotide-specific promoter activity was obtained by measuring fluorescence from promoter sequences fused upstream of a gene for yellow fluorescence protein and inserted in the same genomic site of yeast Saccharomyces cerevisiae. Twenty-one teams submitted results predicting the expression levels of 53 different promoters from yeast ribosomal protein genes. Analysis of participant predictions shows that accurate values for low-expressed and mutated promoters were difficult to obtain, although in the latter case, only when the mutation induced a large change in promoter activity compared to the wild-type sequence. As in previous DREAM challenges, we found that aggregation of participant predictions provided robust results, but did not fare better than the three best algorithms. Finally, this study not only provides a benchmark for the assessment of methods predicting activity of a specific set of promoters from their sequence, but it also shows that the top performing algorithm, which used machine-learning approaches, can be improved by the addition of biological features such as transcription factor binding sites. PMID:23950146
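The abstract reports that aggregating participant predictions gave robust results but does not describe the aggregation scheme. One simple possibility, shown here purely as an assumption, is to average each promoter's predicted activity across teams. The function name, team labels, and values in this Python sketch are hypothetical.

```python
from statistics import mean


def aggregate_predictions(team_predictions: dict) -> list:
    """Average per-promoter predictions across teams.

    team_predictions maps a team label to a list of predicted activities,
    with every list in the same promoter order.
    """
    if not team_predictions:
        raise ValueError("at least one team is required")
    lengths = {len(values) for values in team_predictions.values()}
    if len(lengths) != 1:
        raise ValueError("all teams must predict the same number of promoters")
    # zip(*...) groups the predictions for each promoter across all teams.
    return [mean(per_promoter) for per_promoter in zip(*team_predictions.values())]


# Toy predictions for two promoters from two made-up teams.
print(aggregate_predictions({"team_a": [1.0, 2.0], "team_b": [3.0, 4.0]}))  # [2.0, 3.0]
```

Rank-based or weighted schemes are equally plausible readings of "aggregation"; the mean is used here only because it is the simplest to state.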
Cloning and analysis of fetal ovary microRNAs in cattle.

PubMed

Tripurani, Swamy K; Xiao, Caide; Salem, Mohamed; Yao, Jianbo

2010-07-01

Ovarian folliculogenesis and early embryogenesis are complex processes, which require tightly regulated expression and interaction of a multitude of genes. Small endogenous RNA molecules, termed microRNAs (miRNAs), are involved in the regulation of gene expression during folliculogenesis and early embryonic development. To identify miRNAs in bovine oocytes/ovaries, a bovine fetal ovary miRNA library was constructed. Sequence analysis of random clones from the library identified 679 miRNA sequences, which represent 58 distinct bovine miRNAs. Of these distinct miRNAs, 42 are known bovine miRNAs present in the miRBase database and the remaining 16 miRNAs include 15 new bovine miRNAs that are homologous to miRNAs identified in other species, and one novel miRNA, which does not match any miRNAs in the database. The precursor sequences for 14 of the 15 new miRNAs as well as the novel miRNA were identified from the bovine genome database and their hairpin structures were predicted. Expression analysis of the 58 miRNAs in fetal ovaries in comparison to somatic tissue pools identified 8 miRNAs predominantly expressed in fetal ovaries. Further analysis of the eight miRNAs in germinal vesicle (GV) stage oocytes identified two miRNAs (bta-mir-424 and bta-mir-10b) that are highly abundant in GV oocytes. Both miRNAs show similar expression patterns during oocyte maturation and preimplantation development of bovine embryos, being abundant in GV and MII stage oocytes, as well as in early stage embryos (until 16-cell stage). The amount of the novel miRNA is relatively small in oocytes and early cleavage embryos but greater in blastocysts, suggesting a role of this miRNA in blastocyst cell differentiation. Copyright 2010 Elsevier B.V. All rights reserved.
Pharmacological characterization of a β-adrenergic-like octopamine receptor in Plutella xylostella.

PubMed

Huang, Qing-Ting; Ma, Hai-Hao; Deng, Xi-Le; Zhu, Hang; Liu, Jia; Zhou, Yong; Zhou, Xiao-Mao

2018-04-25

The β-adrenergic-like octopamine receptor (OA2B2) belongs to the class of G-protein coupled receptors. It regulates important physiological functions in insects and is thus potentially a good target for insecticides. In this study, the putative open reading frame sequence of the Pxoa2b2 gene in Plutella xylostella was cloned. Orthologous sequence alignment, phylogenetic tree analysis, and protein sequence analysis all showed that the cloned receptor belongs to the OA2B2 protein family. PxOA2B2 was transiently expressed in HEK-293 cells. It was found that PxOA2B2 could be activated by both octopamine and tyramine, resulting in increased intracellular cyclic AMP (cAMP) levels, whereas dopamine and serotonin were not effective in eliciting cAMP production. Further studies with a series of PxOA2B2 agonists and antagonists showed that all four tested agonists (naphazoline, clonidine, 2-phenylethylamine, and amitraz) could activate the PxOA2B2 receptor, and two of the tested antagonists (phentolamine and mianserin) had significant antagonistic effects. However, the antagonist yohimbine had no effect. Quantitative real-time polymerase chain reaction analysis showed that the Pxoa2b2 gene was expressed in all developmental stages of P. xylostella and that the highest expression occurred in male adults. Further analysis with fourth-instar P. xylostella larvae showed that the Pxoa2b2 gene was mainly expressed in Malpighian tubule, epidermal, and head tissues. This study provides both a pharmacological characterization and the gene expression patterns of OA2B2 in P. xylostella, facilitating further research on insecticides using PxOA2B2 as a target. © 2018 Wiley Periodicals, Inc.
openSputnik--a database to ESTablish comparative plant genomics using unsaturated sequence collections.

PubMed

Rudd, Stephen

2005-01-01

The public expressed sequence tag collections are continually being enriched with high-quality sequences that represent an ever-expanding range of taxonomically diverse plant species. While these sequence collections provide biased insight into the populations of expressed genes available within individual species and their associated tissues, the information is conceivably of wider relevance in a comparative context. When we consider the available expressed sequence tag (EST) collections of summer 2004, most of the major plant taxonomic clades are at least superficially represented. Investigation of the five million available plant ESTs provides a wealth of information that has applications in modelling the routes of plant genome evolution and the identification of lineage-specific genes and gene families. Over four million ESTs from over 50 distinct plant species have been collated within an EST analysis pipeline called openSputnik. The ESTs were resolved down into approximately one million unigene sequences. These have been annotated using orthology-based annotation transfer from reference plant genomes and using a variety of contemporary bioinformatics methods to assign peptide, structural and functional attributes. The openSputnik database is available at http://sputnik.btk.fi.
Whole Transcriptome Sequencing Enables Discovery and Analysis of Viruses in Archived Primary Central Nervous System Lymphomas

PubMed Central

DeBoever, Christopher; Reid, Erin G.; Smith, Erin N.; Wang, Xiaoyun; Dumaop, Wilmar; Harismendy, Olivier; Carson, Dennis; Richman, Douglas; Masliah, Eliezer; Frazer, Kelly A.

2013-01-01

Primary central nervous system lymphomas (PCNSL) have a dramatically increased prevalence among persons living with AIDS and are known to be associated with human Epstein Barr virus (EBV) infection. Previous work suggests that in some cases, co-infection with other viruses may be important for PCNSL pathogenesis. Viral transcription in tumor samples can be measured using next generation transcriptome sequencing. We demonstrate the ability of transcriptome sequencing to identify viruses, characterize viral expression, and identify viral variants by sequencing four archived AIDS-related PCNSL tissue samples and analyzing raw sequencing reads. EBV was detected in all four PCNSL samples and cytomegalovirus (CMV), JC polyomavirus (JCV), and HIV were also discovered, consistent with clinical diagnoses. CMV was found to express three long non-coding RNAs recently reported as expressed during active infection. Single nucleotide variants were observed in each of the viruses observed and three indels were found in CMV. No viruses were found in several control tumor types including 32 diffuse large B-cell lymphoma samples. This study demonstrates the ability of next generation transcriptome sequencing to accurately identify viruses, including DNA viruses, in solid human cancer tissue samples. PMID:24023918
Gene expression and splicing alterations analyzed by high throughput RNA sequencing of chronic lymphocytic leukemia specimens.

PubMed

Liao, Wei; Jordaan, Gwen; Nham, Phillipp; Phan, Ryan T; Pelegrini, Matteo; Sharma, Sanjai

2015-10-16

To determine differentially expressed and spliced RNA transcripts in chronic lymphocytic leukemia (CLL) specimens, a high throughput RNA-sequencing (HTS RNA-seq) analysis was performed. Ten CLL specimens and five normal peripheral blood CD19+ B cell specimens were analyzed by HTS RNA-seq. Libraries were prepared with the Illumina TruSeq RNA kit and sequenced on the Illumina HiSeq 2000 system. An average of 48.5 million reads for B cells and 50.6 million reads for CLL specimens were obtained, with 10396 and 10448 assembled transcripts for normal B cells and primary CLL specimens, respectively. With the Cuffdiff analysis, 2091 differentially expressed genes (DEGs) between B cells and CLL specimens were identified based on FPKM (fragments per kilobase of transcript per million reads), with a false discovery rate (FDR) q < 0.05 and a fold change >2. Expression of selected DEGs (n = 32) that were up regulated or down regulated in CLL in the RNA-seq data was also analyzed by qRT-PCR in a test cohort of CLL specimens. Although the fold expression of DEGs varied between RNA-seq and qRT-PCR, more than 90 % of the analyzed genes were validated by qRT-PCR analysis. Analysis of RNA-seq data for splicing alterations in CLL and B cells was performed by Multivariate Analysis of Transcript Splicing (MATS analysis). Skipped exon was the most frequent splicing alteration in CLL specimens, with 128 significant events (P-value <0.05, minimum inclusion level difference >0.1). The RNA-seq analysis of CLL specimens identifies novel DEGs and alternatively spliced genes that are potential prognostic markers and therapeutic targets. The high level of validation by qRT-PCR for a number of DEGs supports the accuracy of this analysis. Global comparison of the transcriptomes of B cells, IGVH non-mutated CLL (U-CLL) and mutated CLL (M-CLL) specimens with multidimensional scaling analysis was able to segregate CLL and B cell transcriptomes, but the M-CLL and U-CLL transcriptomes were indistinguishable. The ability to analyze HTS RNA-seq data for alternative splicing events and other genetic abnormalities specific to CLL is an added advantage of RNA-seq that is not feasible with other genome wide analyses.
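The FPKM normalization and the thresholds quoted in this record (FDR q < 0.05, fold change >2) can be illustrated with a minimal Python sketch. It is not the Cuffdiff implementation: the function names, the example counts and the handling of zero expression are assumptions made for illustration only.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # fragments / (length in kb) / (library size in millions)
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)


def fold_change(value_a, value_b):
    """Symmetric fold change (always >= 1) between two expression values."""
    low, high = sorted((value_a, value_b))
    if high == 0:
        return 1.0           # both zero: no change (illustrative convention)
    if low == 0:
        return float("inf")  # expressed in only one condition
    return high / low


def is_differentially_expressed(fpkm_a, fpkm_b, q_value,
                                min_fold=2.0, max_q=0.05):
    """Apply the two thresholds quoted in the abstract."""
    return q_value < max_q and fold_change(fpkm_a, fpkm_b) > min_fold


# Hypothetical gene: 500 fragments on a 2 kb transcript, 50 million fragments.
example = fpkm(500, 2000, 50_000_000)
print(example)
print(is_differentially_expressed(example, 1.0, q_value=0.01))
```

The q-value is taken as an input here because multiple-testing correction needs the full set of tests, which a per-gene sketch cannot reproduce.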
Insights into the diversity of NOD-like receptors: Identification and expression analysis of NLRC3, NLRC5 and NLRX1 in rainbow trout.

PubMed

Álvarez, Claudio A; Ramírez-Cepeda, Felipe; Santana, Paula; Torres, Elisa; Cortés, Jimena; Guzmán, Fanny; Schmitt, Paulina; Mercado, Luis

2017-07-01

Nucleotide-binding oligomerization domain (NOD)-like receptors (NLRs) are efficient soluble intracellular sensors that activate defense mechanisms against pathogens. In teleost fish, the involvement of NLRs in the immune response is not well understood. However, recent work has evidenced the expression of different NLRs in response to some pathogen associated molecular patterns (PAMPs). In the present work, the cDNA sequence encoding three new NOD-like receptors were identified in Oncorhynchus mykiss, namely OmNLRC3, OmNLRC5 and OmNLRX1. Results showed that their sequences coded for proteins of 1135, 836 and 1010 amino acids, respectively. The deduced protein sequences of all receptors showed characteristic domains of this receptor family, such as leucine rich repeats and NACHT domain. Phylogenetic analysis revealed a high degree of identity with other NOD-like receptors and they are clustered into different families. Transcript expression analysis indicated that OmNLRs are constitutively expressed in liver, spleen, intestine, gill, skin and brain. OmNLR expression was upregulated in kidney and gills from rainbow trout in response to LPS. In order to give new insights into the function of these new NLR members, an in vitro model of immune stimulation was established using the rainbow trout cell line RTgill-W1. Expression analysis revealed that RTgill-W1 overexpressed proinflammatory cytokines in response to LPS and poly I:C alongside with a differential overexpression of OmNLRC3, OmNLRC5 and OmNLRX1. The expression of OmNLRC5 was further verified at the protein level by immunofluorescence. Finally, the effect of the overexpressed cytokines on the OmNLR expression by RTgill-W1 cells was assessed, suggesting a regulatory mechanism on OmNLRC3 expression. Overall, results suggest that O. mykiss NOD-like receptors could play a key role in the defense mechanisms of teleost through PAMP recognition. 
Future studies will focus on the gills, which could be related to a key mucosal sensor system in one of the most environmentally exposed fish tissues. Copyright © 2017 Elsevier Ltd. All rights reserved.
Clonal Relatedness of Enterotoxigenic Escherichia coli (ETEC) Strains Expressing LT and CS17 Isolated from Children with Diarrhoea in La Paz, Bolivia

PubMed Central

Rodas, Claudia; Klena, John D.; Nicklasson, Matilda; Iniguez, Volga; Sjöling, Åsa

2011-01-01

Background Enterotoxigenic Escherichia coli (ETEC) is a major cause of traveller's and infantile diarrhoea in the developing world. ETEC produces two toxins, a heat-stable toxin (known as ST) and a heat-labile toxin (LT) and colonization factors that help the bacteria to attach to epithelial cells. Methodology/Principal Findings In this study, we characterized a subset of ETEC clinical isolates recovered from Bolivian children under 5 years of age using a combination of multilocus sequence typing (MLST) analysis, virulence typing, serotyping and antimicrobial resistance test patterns in order to determine the genetic background of ETEC strains circulating in Bolivia. We found that strains expressing the heat-labile (LT) enterotoxin and colonization factor CS17 were common and belonged to several MLST sequence types but mainly to sequence type-423 and sequence type-443 (Achtman scheme). To further study the LT/CS17 strains we analysed the nucleotide sequence of the CS17 operon and compared the structure to LT/CS17 ETEC isolates from Bangladesh. Sequence analysis confirmed that all sequence type-423 strains from Bolivia had a single nucleotide polymorphism; SNPbol in the CS17 operon that was also found in some other MLST sequence types from Bolivia but not in strains recovered from Bangladeshi children. The dominant ETEC clone in Bolivia (sequence type-423/SNPbol) was found to persist over multiple years and was associated with severe diarrhoea but these strains were variable with respect to antimicrobial resistance patterns. Conclusion/Significance The results showed that although the LT/CS17 phenotype is common among ETEC strains in Bolivia, multiple clones, as determined by unique MLST sequence types, populate this phenotype. Our data also appear to suggest that acquisition and loss of antimicrobial resistance in LT-expressing CS17 ETEC clones is more dynamic than acquisition or loss of virulence factors. PMID:22140423
Clonal relatedness of enterotoxigenic Escherichia coli (ETEC) strains expressing LT and CS17 isolated from children with diarrhoea in La Paz, Bolivia.

PubMed

Rodas, Claudia; Klena, John D; Nicklasson, Matilda; Iniguez, Volga; Sjöling, Asa

2011-01-01

Enterotoxigenic Escherichia coli (ETEC) is a major cause of traveller's and infantile diarrhoea in the developing world. ETEC produces two toxins, a heat-stable toxin (known as ST) and a heat-labile toxin (LT) and colonization factors that help the bacteria to attach to epithelial cells. In this study, we characterized a subset of ETEC clinical isolates recovered from Bolivian children under 5 years of age using a combination of multilocus sequence typing (MLST) analysis, virulence typing, serotyping and antimicrobial resistance test patterns in order to determine the genetic background of ETEC strains circulating in Bolivia. We found that strains expressing the heat-labile (LT) enterotoxin and colonization factor CS17 were common and belonged to several MLST sequence types but mainly to sequence type-423 and sequence type-443 (Achtman scheme). To further study the LT/CS17 strains we analysed the nucleotide sequence of the CS17 operon and compared the structure to LT/CS17 ETEC isolates from Bangladesh. Sequence analysis confirmed that all sequence type-423 strains from Bolivia had a single nucleotide polymorphism; SNP(bol) in the CS17 operon that was also found in some other MLST sequence types from Bolivia but not in strains recovered from Bangladeshi children. The dominant ETEC clone in Bolivia (sequence type-423/SNP(bol)) was found to persist over multiple years and was associated with severe diarrhoea but these strains were variable with respect to antimicrobial resistance patterns. The results showed that although the LT/CS17 phenotype is common among ETEC strains in Bolivia, multiple clones, as determined by unique MLST sequence types, populate this phenotype. Our data also appear to suggest that acquisition and loss of antimicrobial resistance in LT-expressing CS17 ETEC clones is more dynamic than acquisition or loss of virulence factors.

Cloning and expression of Bartonella henselae sucB gene encoding an immunogenic dihydrolipoamide succinyltransferase homologous protein.

PubMed

Kabeya, Hidenori; Maruyama, Soichi; Hirano, Kouji; Mikami, Takeshi

2003-01-01

Immunoscreening of a ZAP genomic library of Bartonella henselae strain Houston-1 expressed in Escherichia coli resulted in the isolation of a clone containing 3.5 kb BamHI genomic DNA fragment. This 3.5 kb DNA fragment was found to contain a sequence of a gene encoding a protein with significant homology to the dihydrolipoamide succinyltransferase of Brucella melitensis (sucB). Subsequent cloning and DNA sequence analysis revealed that the deduced amino acid sequence from the cloned gene showed 66.5% identity to SucB protein of B. melitensis, and 43.4 and 47.2% identities to those of Coxiella burnetii and E. coli, respectively. The gene was expressed as a His-Nus A-tagged fusion protein. The recombinant SucB protein (rSucB) was shown to be an immunoreactive protein of about 115 kDa by Western blot analysis with sera from B. henselae-immunized mice. Therefore the rSucB may be a candidate antigen for a specific serological diagnosis of B. henselae infection.
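Identity percentages such as those quoted in this record are computed from a pairwise alignment. The Python sketch below assumes two sequences that are already aligned, with gaps written as "-". It uses one common convention, which may differ from the authors' software: columns containing a gap are ignored. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a.upper(), aligned_b.upper()):
        if residue_a == gap or residue_b == gap:
            continue  # skip gapped columns (one of several possible conventions)
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments: 6 identical residues over 8 ungapped columns.
print(percent_identity("ACDE-FGHK", "ACDQWFGHR"))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so values from different tools are not always directly comparable.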
Characterization of defensin gene from abalone Haliotis discus hannai and its deduced protein

NASA Astrophysics Data System (ADS)

Hong, Xuguang; Sun, Xiuqin; Zheng, Minggang; Qu, Lingyun; Zan, Jindong; Zhang, Jinxing

2008-11-01

Defensin is one of the ancient, conserved host defence molecules that arose during biological evolution. As a regulator and effector molecule, it is very important in animals’ acquired immune system. This paper reports the defensin gene from the mixed liver and kidney cDNA library of abalone Haliotis discus hannai Ino. Sequence analysis shows that the full-length cDNA encodes a mature peptide of 42 amino acids (including six Cys), with a molecular weight of 4 323 Da and a pI of 8.02. Amino acid sequence homology analysis shows that the peptide is highly similar (70% in common) to insect defensins. Because the secondary structure of the mature peptide has typical insect-defensin structural characteristics, the polypeptide, named Haliotis discus defensin (hd-def), is a novel antimicrobial peptide that belongs to the insect defensin subfamily. The RT-PCR result shows that the gene is expressed only in the hepatopancreas following stimulation with Gram-negative and Gram-positive bacteria, which indicates inducible expression. Therefore, Haliotis discus defensin gene expression is related to the antibacterial response of Haliotis discus hannai Ino.
Analysis of expressed sequence tags generated from full-length enriched cDNA libraries of melon

PubMed Central

2011-01-01

Background Melon (Cucumis melo), an economically important vegetable crop, belongs to the Cucurbitaceae family which includes several other important crops such as watermelon, cucumber, and pumpkin. It has served as a model system for sex determination and vascular biology studies. However, genomic resources currently available for melon are limited. Result We constructed eleven full-length enriched and four standard cDNA libraries from fruits, flowers, leaves, roots, cotyledons, and calluses of four different melon genotypes, and generated 71,577 and 22,179 ESTs from full-length enriched and standard cDNA libraries, respectively. These ESTs, together with ~35,000 ESTs available in public domains, were assembled into 24,444 unigenes, which were extensively annotated by comparing their sequences to different protein and functional domain databases, assigning them Gene Ontology (GO) terms, and mapping them onto metabolic pathways. Comparative analysis of melon unigenes and other plant genomes revealed that 75% to 85% of melon unigenes had homologs in other dicot plants, while approximately 70% had homologs in monocot plants. The analysis also identified 6,972 gene families that were conserved across dicot and monocot plants, and 181, 1,192, and 220 gene families specific to fleshy fruit-bearing plants, the Cucurbitaceae family, and melon, respectively. Digital expression analysis identified a total of 175 tissue-specific genes, which provides a valuable gene sequence resource for future genomics and functional studies. Furthermore, we identified 4,068 simple sequence repeats (SSRs) and 3,073 single nucleotide polymorphisms (SNPs) in the melon EST collection. Finally, we obtained a total of 1,382 melon full-length transcripts through the analysis of full-length enriched cDNA clones that were sequenced from both ends. 
Analysis of these full-length transcripts indicated that sizes of melon 5' and 3' UTRs were similar to those of tomato, but longer than many other dicot plants. Codon usages of melon full-length transcripts were largely similar to those of Arabidopsis coding sequences. Conclusion The collection of melon ESTs generated from full-length enriched and standard cDNA libraries is expected to play significant roles in annotating the melon genome. The ESTs and associated analysis results will be useful resources for gene discovery, functional analysis, marker-assisted breeding of melon and closely related species, comparative genomic studies and for gaining insights into gene expression patterns. PMID:21599934
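The simple sequence repeat (SSR) screen mentioned in this record can be sketched with a regular expression. The abstract does not state the criteria that were used. The motif lengths (2 to 6 bp), the minimum of four tandem copies and the function name below are assumptions chosen for illustration.

```python
import re


def find_ssrs(sequence, min_repeats=4):
    """Return (start, motif, copies) for tandem repeats of 2-6 bp motifs.

    Homopolymer runs (e.g. AAAA...) are skipped. Thresholds are illustrative.
    """
    # A lazily matched 2-6 bp motif followed by at least (min_repeats - 1) copies.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # a run of one base is not counted as an SSR here
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


# Hypothetical EST fragment with an (AC)5 and an (AAG)4 repeat.
est = "GGTT" + "AC" * 5 + "GTCC" + "AAG" * 4 + "TT"
print(find_ssrs(est))
```

Dedicated SSR tools usually apply different minimum copy numbers per motif length, so counts from this sketch would not match a published total.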
Identification and substrate prediction of new Fragaria x ananassa aquaporins and expression in different tissues and during strawberry fruit development.

PubMed

Merlaen, Britt; De Keyser, Ellen; Van Labeke, Marie-Christine

2018-01-01

The newly identified aquaporin coding sequences presented here pave the way for further insights into the plant-water relations in the commercial strawberry ( Fragaria x ananassa ). Aquaporins are water channel proteins that allow water to cross (intra)cellular membranes. In Fragaria x ananassa , few of them have been identified hitherto, hampering the exploration of the water transport regulation at cellular level. Here, we present new aquaporin coding sequences belonging to different subclasses: plasma membrane intrinsic proteins subtype 1 and subtype 2 (PIP1 and PIP2) and tonoplast intrinsic proteins (TIP). The classification is based on phylogenetic analysis and is confirmed by the presence of conserved residues. Substrate-specific signature sequences (SSSSs) and specificity-determining positions (SDPs) predict the substrate specificity of each new aquaporin. Expression profiling in leaves, petioles and developing fruits reveals distinct patterns, even within the same (sub)class. Expression profiles range from leaf-specific expression over constitutive expression to fruit-specific expression. Both upregulation and downregulation during fruit ripening occur. Substrate specificity and expression profiles suggest that functional specialization exists among aquaporins belonging to a different but also to the same (sub)class.
Transcriptome assembly and digital gene expression atlas of the rainbow trout

USDA-ARS's Scientific Manuscript database

Background: Transcriptome analysis is a preferred method for gene discovery, marker development and gene expression profiling in non-model organisms. Previously, we sequenced a transcriptome reference using Sanger-based and 454-pyrosequencing, however, a transcriptome assembly is still incomplete an...
Characterization of a human X-linked gene from the DXS732E locus in the candidate region for the anhidrotic ectodermal dysplasia (EDA) gene (Xq13.1)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gault, J.; Zonana, J.; Zeltinger, J.

A conserved mouse genomic clone was used to identify a homologous human genomic clone (the DXS732E locus), which was subsequently employed to isolate cDNAs from a human fetal brain library. Nine unique overlapping cDNAs were isolated, and sequence analysis of 3.9 kb identified a putative 1 kb ORF. GRAIL analysis of the sequence supported the hypothesis that the putative ORF was coding sequence, and Prosite analysis of the putative ORF identified potential glycosylation and phosphorylation sites. The 5′ end of the gene maps within a CpG island, and comparison of cDNA sequences indicates the gene is alternatively spliced at its 3′ end. Northern analysis and RT-PCR indicate that two different sized messages appear to be expressed with the gene expressed in human fetal kidney, intestine, brain, and muscle. The gene is expressed in 77 day human skin, a time when hair follicle formation occurs. Anhidrotic ectodermal dysplasia (EDA) results in the abnormal morphogenesis of hair, teeth and eccrine sweat glands. A positional cloning strategy towards cloning the EDA gene had been used, and deletion and X-autosome translocation patients have been useful in further delimiting the EDA region. The present gene at the DXS732E locus is partially deleted in one EDA patient who does not have other apparent abnormalities. No rearrangements of the gene have been detected in two female X-autosome translocation EDA patients, nor in four additional male patients with submicroscopic molecular deletions.
Sequences of emotional distress expressed by clients and acknowledged by therapists: are they associated more with some therapists than others?

PubMed

Viney, L L

1994-11-01

When clients come to psychotherapy they are distressed, this distress usually being expressed in the form of anxiety, hostility, depression and helplessness. This study explored the sequences of emotional distress expressed by clients and acknowledged by therapists, and examined their associations with other factors. The transcripts of five therapists (two single sessions each) were content-analysed: they used personal construct, client centered, rational-emotive, Gestalt and transactional analysis therapy. Log-linear analyses of appropriate contingency table cell frequencies were conducted to test associations between identified sequences and the two variables of therapist and timing of completion of the sequence. Therapist-client sequences of Anxiety-Anxiety, Anxiety-Hostility and Helplessness-Hostility were found to be associated more with the personal construct and client centred therapists than with the rational-emotive therapist. Client-therapist sequences of Anxiety-Anxiety, Helplessness-Anxiety and Helplessness-Helplessness were more often found with the client centred therapist than the other therapists. For most of these sequences timing had an effect, yet timing rarely interacted with the therapist variable. The findings are discussed in terms of their relevance to the theoretical positions represented, the shortcomings of the research and the value of this methodology in studies linking therapy process with outcome.
Generation and Analysis of Expressed Sequence Tags from Olea europaea L.

PubMed Central

Ozdemir Ozgenturk, Nehir; Oruç, Fatma; Sezerman, Ugur; Kuçukural, Alper; Vural Korkut, Senay; Toksoz, Feriha; Un, Cemal

2010-01-01

Olive (Olea europaea L.) is an important source of edible oil that originated in the Near East region. In this study, two cDNA libraries were constructed from young olive leaves and immature olive fruits to generate ESTs, discover novel genes and investigate the function of unknown olive genes. A total of 3840 randomly selected colonies were sequenced for EST collection from both libraries. The 2228 readable sequences from olive leaf and 1506 from olive fruit were assembled into 205 and 69 contigs, respectively, whereas 2478 were singletons. Putative functions of all 2752 differentially expressed unique sequences were assigned by gene homology based on BLAST and annotated using BLAST2GO. While 1339 ESTs show no homology to the database, 2024 ESTs have homology (under 80%) with hypothetical proteins, putative proteins, expressed proteins, and unknown proteins in NCBI-GenBank. A total of 635 unique EST gene sequences were identified by over 80% homology to genes of known function in other species that were not previously described in the Olea family. Only 3.1% of the total ESTs showed similarity with the olive data existing in NCBI. The generated EST data and consensus sequences were submitted to NCBI as a valuable source for functional genome studies of olive. PMID:21197085
Genomic analysis of expressed sequence tags in American black bear Ursus americanus

PubMed Central

2010-01-01

Background Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Results Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. Conclusion We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes. PMID:20338065
Genomic analysis of expressed sequence tags in American black bear Ursus americanus.

PubMed

Zhao, Sen; Shao, Chunxuan; Goropashnaya, Anna V; Stewart, Nathan C; Xu, Yichi; Tøien, Øivind; Barnes, Brian M; Fedorov, Vadim B; Yan, Jun

2010-03-26

Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes.
De novo Transcriptome Assembly of Chinese Kale and Global Expression Analysis of Genes Involved in Glucosinolate Metabolism in Multiple Tissues

PubMed Central

Wu, Shuanghua; Lei, Jianjun; Chen, Guoju; Chen, Hancai; Cao, Bihao; Chen, Changming

2017-01-01

Chinese kale, a vegetable of the cruciferous family, is a popular crop in southern China and Southeast Asia due to its high glucosinolate content and nutritional qualities. However, there is little research on the molecular genetics and genes involved in glucosinolate metabolism and its regulation in Chinese kale. In this study, we sequenced and characterized the transcriptomes and expression profiles of genes expressed in 11 tissues of Chinese kale. A total of 216 million 150-bp clean reads were generated using RNA-sequencing technology. From the sequences, 98,180 unigenes were assembled for the whole plant, and 49,582~98,423 unigenes were assembled for each tissue. Blast analysis indicated that a total of 80,688 (82.18%) unigenes exhibited similarity to known proteins. The functional annotation and classification tools used in this study suggested that genes principally expressed in Chinese kale, were mostly involved in fundamental processes, such as cellular and molecular functions, the signal transduction, and biosynthesis of secondary metabolites. The expression levels of all unigenes were analyzed in various tissues of Chinese kale. A large number of candidate genes involved in glucosinolate metabolism and its regulation were identified, and the expression patterns of these genes were analyzed. We found that most of the genes involved in glucosinolate biosynthesis were highly expressed in the root, petiole, and in senescent leaves. The expression patterns of ten glucosinolate biosynthetic genes from RNA-seq were validated by quantitative RT-PCR in different tissues. These results provided an initial and global overview of Chinese kale gene functions and expression activities in different tissues. PMID:28228764
Characterization of X Chromosome Inactivation Using Integrated Analysis of Whole-Exome and mRNA Sequencing

PubMed Central

Szelinger, Szabolcs; Malenica, Ivana; Corneveaux, Jason J.; Siniard, Ashley L.; Kurdoglu, Ahmet A.; Ramsey, Keri M.; Schrauwen, Isabelle; Trent, Jeffrey M.; Narayanan, Vinodh; Huentelman, Matthew J.; Craig, David W.

2014-01-01

In females, X chromosome inactivation (XCI) is an epigenetic, gene dosage compensatory mechanism that acts by inactivating one copy of X in cells. Random XCI of one of the parental chromosomes results in an approximately equal proportion of cells expressing alleles from either the maternally or paternally inherited active X, and is defined by the XCI ratio. A skewed XCI ratio is suggestive of non-random inactivation, which can play an important role in X-linked genetic conditions. Current methods rely on indirect, semi-quantitative DNA methylation-based assays to estimate the XCI ratio. Here we report a direct approach to estimate the XCI ratio by integrated, family-trio based whole-exome and mRNA sequencing using phase-by-transmission of alleles coupled with allele-specific expression analysis. We applied this method to in silico data and to a clinical patient with mild cognitive impairment but no clear diagnosis or understanding of the molecular mechanism underlying the phenotype. Simulation showed that phased and unphased heterozygous allele expression can be used to estimate the XCI ratio. Segregation analysis of the patient's exome uncovered a de novo, interstitial, 1.7 Mb deletion on Xp22.31 that originated on the paternally inherited X and had previously been associated with a heterogeneous neurological phenotype. Phased, allelic expression data suggested an 83∶20 moderately skewed XCI that favored the expression of the maternally inherited, cytogenetically normal X and suggested that the deleterious effect of the de novo event on the paternal copy may be offset by skewed XCI that favors expression of the wild-type X. This study shows the utility of an integrated sequencing approach in XCI ratio estimation. PMID:25503791
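The idea of estimating an XCI ratio from phased allele-specific expression can be sketched in Python. This is a deliberate simplification and not the authors' pipeline. It assumes each heterozygous X-linked site has already been phased to the maternal or paternal chromosome, that read counts can be pooled across sites, and that genes escaping inactivation have been excluded. The function name and the read counts are hypothetical.

```python
def xci_ratio(phased_sites):
    """Estimate the XCI ratio from phased allele-specific mRNA read counts.

    phased_sites: iterable of (maternal_reads, paternal_reads) tuples, one per
    heterozygous X-linked site. Returns (maternal_fraction, paternal_fraction),
    interpreted as the share of cells in which each X is the active copy.
    """
    sites = list(phased_sites)
    maternal = sum(m for m, _ in sites)
    paternal = sum(p for _, p in sites)
    total = maternal + paternal
    if total == 0:
        raise ValueError("no allele-specific reads available")
    return maternal / total, paternal / total


# Hypothetical counts at three phased sites.
maternal_fraction, paternal_fraction = xci_ratio([(80, 20), (45, 5), (40, 10)])
print(f"{maternal_fraction:.1%} maternal : {paternal_fraction:.1%} paternal")
```

Pooling reads weights highly expressed genes more heavily. A per-site median of the allelic fractions is a possible alternative when expression levels differ widely between sites.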
The expression of the clock gene cycle has rhythmic pattern and is affected by photoperiod in the moth Sesamia nonagrioides.

PubMed

Kontogiannatos, Dimitrios; Gkouvitsas, Theodoros; Kourti, Anna

2017-06-01

To obtain clues to the link between the molecular mechanism of circadian and photoperiod clocks, we have cloned the circadian clock gene cycle (Sncyc) in the corn stalk borer, Sesamia nonagrioides, which undergoes facultative diapause controlled by photoperiod. Sequence analysis revealed a high degree of conservation among insects for this gene. SnCYC consists of 667 amino acids and structural analysis showed that it contains a BCTR domain at its C-terminus in addition to the common domains found in Drosophila CYC, i.e. bHLH, PAS-A, PAS-B domains. The results revealed that the sequence of Sncyc showed a similarity to that of its mammalian orthologue, Bmal1. We also investigated the expression patterns of Sncyc in the brain of larvae growing under long-day 16L: 8D (LD), constant darkness (DD) and short-day 10L: 14D (SD) conditions using qRT-PCR assays. Sncyc mRNA expression was rhythmic in LD, DD and SD cycles. It is also remarkable that the photoperiodic conditions affect the expression patterns and/or amplitudes of the circadian clock gene Sncyc. This gene is associated with diapause in S. nonagrioides, because under SD (diapause conditions) the photoperiodic signal altered mRNA accumulation. Sequence and expression analysis of cyc in S. nonagrioides shows interesting differences compared to Drosophila, where this gene does not oscillate or change in expression patterns in response to photoperiod, suggesting that this species is an interesting new model to study the molecular control of insect circadian and photoperiodic clocks. Copyright © 2017 Elsevier Inc. All rights reserved.
Transcriptomic sequencing reveals a set of unique genes activated by butyrate-induced histone modification

USDA-ARS's Scientific Manuscript database

Butyrate is a nutritional element with strong epigenetic regulatory activity as an inhibitor of histone deacetylases (HDACs). Based on the analysis of differentially expressed genes induced by butyrate in the bovine epithelial cell using deep RNA-sequencing technology (RNA-seq), a set of unique gen...
Identification of novel bacteriophage peptides using a combination of gene sequence LC-MS-MS analysis and BLASTP

USDA-ARS's Scientific Manuscript database

Introduction: In an effort to characterize novel bacteriophage with lytic activity against pathogenic E. coli associated with foodborne illness, gene sequencing and mass spectrometry have been used to identify expressed peptides which differentiate isolated bacteriophage from other known phage. Here,...
Computational analysis and functional expression of ancestral copepod luciferase.

PubMed

Takenaka, Yasuhiro; Noda-Ogura, Akiko; Imanishi, Tadashi; Yamaguchi, Atsushi; Gojobori, Takashi; Shigeri, Yasushi

2013-10-10

We recently reported the cDNA sequences of 11 copepod luciferases from the superfamily Augaptiloidea in the order Calanoida. They were classified into two groups, Metridinidae and Heterorhabdidae/Lucicutiidae families, by phylogenetic analyses. To elucidate the evolutionary processes, we have now further isolated 12 copepod luciferases from Augaptiloidea species (Metridia asymmetrica, Metridia curticauda, Pleuromamma scutullata, Pleuromamma xiphias, Lucicutia ovaliformis and Heterorhabdus tanneri). Codon-based synonymous/nonsynonymous tests of positive selection for 25 identified copepod luciferases suggested that positive Darwinian selection operated in the evolution of Heterorhabdidae luciferases, whereas two types of Metridinidae luciferases had diversified via neutral mechanism. By in silico analysis of the decoded amino acid sequences of 25 copepod luciferases, we inferred two protein sequences as ancestral copepod luciferases. They were expressed in HEK293 cells where they exhibited notable luciferase activity both in intracellular lysates and cultured media, indicating that the luciferase activity was established before evolutionary diversification of these copepod species. © 2013.
Functionally conserved cis-regulatory elements of COL18A1 identified through zebrafish transgenesis.

PubMed

Kague, Erika; Bessling, Seneca L; Lee, Josephine; Hu, Gui; Passos-Bueno, Maria Rita; Fisher, Shannon

2010-01-15

Type XVIII collagen is a component of basement membranes, and expressed prominently in the eye, blood vessels, liver, and the central nervous system. Homozygous mutations in COL18A1 lead to Knobloch Syndrome, characterized by ocular defects and occipital encephalocele. However, relatively little has been described on the role of type XVIII collagen in development, and nothing is known about the regulation of its tissue-specific expression pattern. We have used zebrafish transgenesis to identify and characterize cis-regulatory sequences controlling expression of the human gene. Candidate enhancers were selected from non-coding sequence associated with COL18A1 based on sequence conservation among mammals. Although these displayed no overt conservation with orthologous zebrafish sequences, four regions nonetheless acted as tissue-specific transcriptional enhancers in the zebrafish embryo, and together recapitulated the major aspects of col18a1 expression. Additional post-hoc computational analysis on positive enhancer sequences revealed alignments between mammalian and teleost sequences, which we hypothesize predict the corresponding zebrafish enhancers; for one of these, we demonstrate functional overlap with the orthologous human enhancer sequence. Our results provide important insight into the biological function and regulation of COL18A1, and point to additional sequences that may contribute to complex diseases involving COL18A1. More generally, we show that combining functional data with targeted analyses for phylogenetic conservation can reveal conserved cis-regulatory elements in the large number of cases where computational alignment alone falls short. Copyright 2009 Elsevier Inc. All rights reserved.
An improved strategy and a useful housekeeping gene for RNA analysis from formalin-fixed, paraffin-embedded tissues by PCR.

PubMed

Finke, J; Fritzen, R; Ternes, P; Lange, W; Dölken, G

1993-03-01

Specific amplification of nucleic acid sequences by PCR has been extensively used for the detection of gene rearrangements and gene expression. Although successful amplification of DNA sequences has been carried out with DNA prepared from formalin-fixed, paraffin-embedded (FFPE) tissues, there are only a few reports regarding RNA analysis in this kind of material. We describe a procedure for RNA extraction from different types of FFPE tissues, involving digestion with proteinase K followed by guanidinium-thiocyanate acid phenol extraction and DNase I digestion. These RNA preparations are suitable for PCR analysis of mRNA and even of intronless genes. Furthermore, the universally expressed porphobilinogen deaminase mRNA proved to be useful as a positive control because of the lack of pseudogenes.
Expressed sequence tag analysis of functional genes associated with adventitious rooting in Liriodendron hybrids.

PubMed

Zhong, Y D; Sun, X Y; Liu, E Y; Li, Y Q; Gao, Z; Yu, F X

2016-06-24

Liriodendron hybrids (Liriodendron chinense x L. tulipifera) are important landscaping and afforestation hardwood trees. To date, little genomic research on adventitious rooting has been reported in these hybrids, as well as in the genus Liriodendron. In the present study, we used adventitious roots to construct the first cDNA library for Liriodendron hybrids. A total of 5176 expressed sequence tags (ESTs) were generated and clustered into 2921 unigenes. Among these unigenes, 2547 had significant homology to the non-redundant protein database representing a wide variety of putative functions. Homologs of these genes regulated many aspects of adventitious rooting, including those for auxin signal transduction and root hair development. Results of quantitative real-time polymerase chain reaction showed that AUX1, IRE, and FB1 were highly expressed in adventitious roots and the expression of AUX1, ARF1, NAC1, RHD1, and IRE increased during the development of adventitious roots. Additionally, 181 simple sequence repeats were identified from 166 ESTs and more than 91.16% of these were dinucleotide and trinucleotide repeats. To the best of our knowledge, the present study reports the identification of the genes associated with adventitious rooting in the genus Liriodendron for the first time and provides a valuable resource for future genomic studies. Expression analysis of selected genes could allow us to identify regulatory genes that may be essential for adventitious rooting.
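Mining simple sequence repeats (SSRs) from ESTs, as described above, can be sketched with a regular-expression scan. The abstract does not name the tool or the thresholds used, so the minimum repeat numbers (six for dinucleotides, five for trinucleotides), the function name and the test sequence below are assumptions chosen only for illustration.

```python
import re
from typing import Dict, List, Optional, Tuple


def find_ssrs(seq: str,
              min_repeats: Optional[Dict[int, int]] = None
              ) -> List[Tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for perfect SSRs in seq.

    min_repeats maps motif length to the minimum number of tandem
    copies. The defaults are assumed thresholds, not the study's.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5}
    seq = seq.upper()
    hits = []
    for motif_len, min_copies in sorted(min_repeats.items()):
        # One captured motif followed by at least (min_copies - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


# Hypothetical EST fragment carrying an (AG)7 and a (CAT)5 repeat.
est = "GGC" + "AG" * 7 + "TTC" + "CAT" * 5 + "G"
for start, motif, copies in find_ssrs(est):
    print(start, motif, copies)
```

This sketch reports only perfect repeats. Compound or interrupted repeats would need extra handling.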
Cloning of a neonatal calcium atpase isoform (SERCA 1B) from extraocular muscle of adult blue marlin (Makaira nigricans).

PubMed

Londraville, R L; Cramer, T D; Franck, J P; Tullis, A; Block, B A

2000-10-01

Complete cDNAs for the fast-twitch Ca2+ -ATPase isoform (SERCA 1) were cloned and sequenced from blue marlin (Makaira nigricans) extraocular muscle (EOM). Complete cDNAs for SERCA 1 were also cloned from fast-twitch skeletal muscle of the same species. The two sequences are identical over the coding region except for the last five codons on the carboxyl end; EOM SERCA 1 cDNA codes for 996 amino acids and the fast-twitch cDNAs code for 991 aa. Phylogenetic analysis revealed that EOM SERCA 1 clusters with an isoform of Ca2+ -ATPase normally expressed in early development of mammals (SERCA 1B). This is the first report of SERCA 1B in an adult vertebrate. RNA hybridization assays indicate that 1B expression is limited to extraocular muscles. Because EOM gives rise to the thermogenic heater organ in marlin, we investigated whether SERCA 1B may play a role in heat generation, or if 1B expression is common in EOM among vertebrates. Chicken also expresses SERCA 1B in EOM, but rat expresses SERCA 1A; because SERCA 1B is not specific to heater tissue we conclude it is unlikely that it plays a specific role in intracellular heat production. Comparative sequence analysis does reveal, however, several sites that may be the source of functional differences between fish and mammalian SERCAs.

Generation and Analysis of the Expressed Sequence Tags from the Mycelium of Ganoderma lucidum

PubMed Central

Huang, Yen-Hua; Wu, Hung-Yi; Wu, Keh-Ming; Liu, Tze-Tze; Liou, Ruey-Fen; Tsai, Shih-Feng; Shiao, Ming-Shi; Ho, Low-Tone; Tzean, Shean-Shong; Yang, Ueng-Cheng

2013-01-01

Ganoderma lucidum (G. lucidum) is a medicinal mushroom renowned in East Asia for its potential biological effects. To enable a systematic exploration of the genes associated with the various phenotypes of the fungus, the genome consortium of G. lucidum has carried out an expressed sequence tag (EST) sequencing project. Using a Sanger sequencing based approach, 47,285 ESTs were obtained from in vitro cultures of G. lucidum mycelium of various durations. These ESTs were further clustered and merged into 7,774 non-redundant expressed loci. The features of these expressed contigs were explored in terms of over-representation, alternative splicing, and natural antisense transcripts. Our results provide an invaluable information resource for exploring the G. lucidum transcriptome and its regulation. Many cases of the genes over-represented in fast-growing dikaryotic mycelium are closely related to growth, such as cell wall and bioactive compound synthesis. In addition, the EST-genome alignments containing putative cassette exons and retained introns were manually curated and then used to make inferences about the predominating splice-site recognition mechanism of G. lucidum. Moreover, a number of putative antisense transcripts have been pinpointed, from which we noticed that two cases are likely to reveal hitherto undiscovered biological pathways. To allow users to access the data and the initial analysis of the results of this project, a dedicated web site has been created at http://csb2.ym.edu.tw/est/. PMID:23658685
Expressed sequence tag based identification and expression analysis of some cold inducible elements in seabuckthorn (Hippophae rhamnoides L.).

PubMed

Ghangal, Rajesh; Raghuvanshi, Saurabh; Sharma, Prakash C

2012-02-01

A cDNA library was constructed from the mature leaves of seabuckthorn (Hippophae rhamnoides). Expressed Sequence Tags (ESTs) were generated by single pass sequencing of 4500 cDNA clones. We submitted 3412 ESTs to dbEST of NCBI. Clustering of these ESTs yielded 1665 unigenes comprising 345 contigs and 1320 singletons. Out of 1665 unigenes, 1278 unigenes were annotated by similarity search while the remaining 387 unannotated unigenes were considered as organism specific. Gene Ontology (GO) analysis of the unigene dataset showed 691 unigenes related to biological processes, 727 to molecular functions and 588 to the cellular component category. On the basis of similarity search and GO annotation, 43 unigenes were found responsive to biotic and abiotic stresses. To validate this observation, 13 genes that are known to be associated with cold stress tolerance from previous studies in Arabidopsis and 3 novel transcripts were examined by Real time RT-PCR to understand the change in expression pattern under cold/freeze stress. In silico study of occurrence of microsatellites in these ESTs revealed the presence of 62 Simple Sequence Repeats (SSRs), some of which are being explored to assess genetic diversity among seabuckthorn collections. This is the first report of generation of transcriptome data providing information about genes involved in managing plant abiotic stress in seabuckthorn, a plant known for its enormous medicinal and ecological value. Copyright © 2011 Elsevier Masson SAS. All rights reserved.
Deep transcriptome sequencing provides new insights into the structural and functional organization of the wheat genome.

PubMed

Pingault, Lise; Choulet, Frédéric; Alberti, Adriana; Glover, Natasha; Wincker, Patrick; Feuillet, Catherine; Paux, Etienne

2015-02-10

Because of its size, allohexaploid nature, and high repeat content, the bread wheat genome is a good model to study the impact of the genome structure on gene organization, function, and regulation. However, because of the lack of a reference genome sequence, such studies have long been hampered and our knowledge of the wheat gene space is still limited. The access to the reference sequence of the wheat chromosome 3B provided us with an opportunity to study the wheat transcriptome and its relationships to genome and gene structure at a level that has never been reached before. By combining this sequence with RNA-seq data, we construct a fine transcriptome map of the chromosome 3B. More than 8,800 transcription sites are identified, that are distributed throughout the entire chromosome. Expression level, expression breadth, alternative splicing as well as several structural features of genes, including transcript length, number of exons, and cumulative intron length are investigated. Our analysis reveals a non-monotonic relationship between gene expression and structure and leads to the hypothesis that gene structure is determined by its function, whereas gene expression is subject to energetic cost. Moreover, we observe a recombination-based partitioning at the gene structure and function level. Our analysis provides new insights into the relationships between gene and genome structure and function. It reveals mechanisms conserved with other plant species as well as superimposed evolutionary forces that shaped the wheat gene space, likely participating in wheat adaptation.
GENE-Counter: A Computational Pipeline for the Analysis of RNA-Seq Data for Gene Expression Differences

PubMed Central

Di, Yanming; Schafer, Daniel W.; Wilhelm, Larry J.; Fox, Samuel E.; Sullivan, Christopher M.; Curzon, Aron D.; Carrington, James C.; Mockler, Todd C.; Chang, Jeff H.

2011-01-01

GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable for prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays as well as the observed and substantial overlap in results from each of the three statistics packages demonstrates that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts. PMID:21998647
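The reason negative binomial models suit replicate RNA-Seq counts can be shown with a small sketch of the mean-variance relationship. Under the common NB2 parameterization, Var = mu + phi * mu^2, so the dispersion phi measures how far replicate variance exceeds the Poisson expectation. The method-of-moments estimator, the function name and the counts below are illustrative assumptions. This is not the statistical test implemented in GENE-counter.

```python
from statistics import mean, variance
from typing import Sequence


def nb_dispersion(counts: Sequence[int]) -> float:
    """Method-of-moments dispersion phi under Var = mu + phi * mu**2.

    Returns 0.0 when the replicates are no more variable than a
    Poisson model would predict.
    """
    mu = mean(counts)
    var = variance(counts)  # sample variance, n - 1 denominator
    if mu == 0:
        return 0.0
    return max(0.0, (var - mu) / (mu * mu))


# Hypothetical read counts for one gene across three replicates.
replicates = [10, 20, 30]
print(round(nb_dispersion(replicates), 3))
```

With only a few replicates, a per-gene estimate like this is very noisy, which is one reason dedicated packages typically share information across genes.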
Identification and characterization of microRNAs in white and brown alpaca skin

PubMed Central

2012-01-01

Background MicroRNAs (miRNAs) are small, non-coding 21–25 nt RNA molecules that play an important role in regulating gene expression. Little is known about the expression profiles and functions of miRNAs in skin and their role in pigmentation. Alpacas have more than 22 natural coat colors, more than any other fiber producing species. To better understand the role of miRNAs in control of coat color we performed a comprehensive analysis of miRNA expression profiles in skin of white versus brown alpacas. Results Two small RNA libraries from white alpaca (WA) and brown alpaca (BA) skin were sequenced with the aid of Illumina sequencing technology. 272 and 267 conserved miRNAs were obtained from the WA and BA skin libraries, respectively. Of these conserved miRNAs, 35 and 13 were more abundant in WA and BA skin, respectively. The targets of these miRNAs were predicted and grouped based on Gene Ontology and KEGG pathway analysis. Many predicted target genes for these miRNAs are involved in the melanogenesis pathway controlling pigmentation. In addition to the conserved miRNAs, we also obtained 22 potentially novel miRNAs from the WA and BA skin libraries. Conclusion This study represents the first comprehensive survey of miRNAs expressed in skin of animals of different coat colors by deep sequencing analysis. We discovered a collection of miRNAs that are differentially expressed in WA and BA skin. The results suggest important potential functions of miRNAs in coat color regulation. PMID:23067000
Genome-Wide Identification of Regulatory Sequences Undergoing Accelerated Evolution in the Human Genome

PubMed Central

Dong, Xinran; Wang, Xiao; Zhang, Feng; Tian, Weidong

2016-01-01

Accelerated evolution of regulatory sequence can alter the expression pattern of target genes, and cause phenotypic changes. In this study, we used DNase I hypersensitive sites (DHSs) to annotate putative regulatory sequences in the human genome, and conducted a genome-wide analysis of the effects of accelerated evolution on regulatory sequences. Working under the assumption that local ancient repeat elements of DHSs are under neutral evolution, we discovered that ∼0.44% of DHSs are under accelerated evolution (ace-DHSs). We found that ace-DHSs tend to be more active than background DHSs, and are strongly associated with epigenetic marks of active transcription. The target genes of ace-DHSs are significantly enriched in neuron-related functions, and their expression levels are positively selected in the human brain. Thus, these lines of evidence strongly suggest that accelerated evolution of regulatory sequences plays an important role in the evolution of human-specific phenotypes. PMID:27401230
Comparative transcriptome analysis of microsclerotia development in Nomuraea rileyi.

PubMed

Song, Zhangyong; Yin, Youping; Jiang, Shasha; Liu, Juanjuan; Chen, Huan; Wang, Zhongkang

2013-06-19

Nomuraea rileyi is used as an environmentally friendly biopesticide. However, mass production and commercialization of this organism are limited due to its fastidious growth and sporulation requirements. We found that, when cultured in amended medium, N. rileyi could produce microsclerotia bodies, replacing conidiophores as the infectious agent. However, little is known about the genes involved in microsclerotia development. In the present study, the transcriptomes were analyzed using next-generation sequencing technology to find the genes involved in microsclerotia development. A total of 4.69 Gb of clean nucleotides comprising 32,061 sequences was obtained, and 20,919 sequences were annotated (about 65%). Among the annotated sequences, only 5928 were annotated with 34 gene ontology (GO) functional categories, and 12,778 sequences were mapped to 165 pathways by searching against the Kyoto Encyclopedia of Genes and Genomes pathway (KEGG) database. Furthermore, we assessed the transcriptomic differences between cultures grown in minimal and amended medium. In total, 4808 sequences were found to be differentially expressed; 719 differentially expressed unigenes were assigned to 25 GO classes and 1888 differentially expressed unigenes were assigned to 161 KEGG pathways, including 25 enrichment pathways. Subsequently, we examined the up-regulated or uniquely expressed genes following amended medium treatment, which were also expressed on the enrichment pathway, and found that most of them participated in mediating oxidative stress homeostasis. To elucidate the role of oxidative stress in microsclerotia development, we analyzed the diversification of unigenes using quantitative reverse transcription-PCR (RT-qPCR). Our findings suggest that oxidative stress occurs during microsclerotia development, along with a broad metabolic activity change. Our data provide the most comprehensive sequence resource available for the study of N. rileyi. We believe that the transcriptome datasets will serve as an important public information platform to accelerate studies on N. rileyi microsclerotia.
DMRT gene cluster analysis in the platypus: new insights into genomic organization and regulatory regions.

PubMed

El-Mogharbel, Nisrine; Wakefield, Matthew; Deakin, Janine E; Tsend-Ayush, Enkhjargal; Grützner, Frank; Alsop, Amber; Ezaz, Tariq; Marshall Graves, Jennifer A

2007-01-01

We isolated and characterized a cluster of platypus DMRT genes and compared their arrangement, location, and sequence across vertebrates. The DMRT gene cluster on human 9p24.3 harbors, in order, DMRT1, DMRT3, and DMRT2, which share a DM domain. DMRT1 is highly conserved and involved in sexual development in vertebrates, and deletions in this region cause sex reversal in humans. Sequence comparisons of DMRT genes between species have been valuable in identifying exons, control regions, and conserved nongenic regions (CNGs). The addition of platypus sequences is expected to be particularly valuable, since monotremes fill a gap in the vertebrate genome coverage. We therefore isolated and fully sequenced platypus BAC clones containing DMRT3 and DMRT2 as well as DMRT1 and then generated multispecies alignments and ran prediction programs followed by experimental verification to annotate this gene cluster. We found that the three genes have 58-66% identity to their human orthologues, lie in the same order as in other vertebrates, and colocate on 1 of the 10 platypus sex chromosomes, X5. We also predict that optimal annotation of the newly sequenced platypus genome will be challenging. The analysis of platypus sequence revealed differences in structure and sequence of the DMRT gene cluster. Multispecies comparison was particularly effective for detecting CNGs, revealing several novel potential regulatory regions within DMRT3 and DMRT2 as well as DMRT1. RT-PCR indicated that platypus DMRT1 and DMRT3 are expressed specifically in the adult testis (and not ovary), but DMRT2 has a wider expression profile, as it does for other mammals. The platypus DMRT1 expression pattern, and its location on an X chromosome, suggests an involvement in monotreme sexual development.
Molecular cloning and characterization of RGA1 encoding a G protein alpha subunit from rice (Oryza sativa L. IR-36).

PubMed

Seo, H S; Kim, H Y; Jeong, J Y; Lee, S Y; Cho, M J; Bahk, J D

1995-03-01

A cDNA clone, RGA1, was isolated by using a GPA1 cDNA clone of Arabidopsis thaliana G protein alpha subunit as a probe from a rice (Oryza sativa L. IR-36) seedling cDNA library from roots and leaves. Sequence analysis of genomic clone reveals that the RGA1 gene has 14 exons and 13 introns, and encodes a polypeptide of 380 amino acid residues with a calculated molecular weight of 44.5 kDa. The encoded protein exhibits a considerable degree of amino acid sequence similarity to all the other known G protein alpha subunits. A putative TATA sequence (ATATGA), a potential CAAT box sequence (AGCAATAC), and a cis-acting element, CCACGTGG (ABRE), known to be involved in ABA induction are found in the promoter region. The RGA1 protein contains all the consensus regions of G protein alpha subunits except the cysteine residue near the C-terminus for ADP-ribosylation by pertussis toxin. The RGA1 polypeptide expressed in Escherichia coli was, however, ADP-ribosylated by 10 microM [adenylate-32P] NAD and activated cholera toxin. Southern analysis indicates that there are no other genes similar to the RGA1 gene in the rice genome. Northern analysis reveals that the RGA1 mRNA is 1.85 kb long and expressed in vegetative tissues, including leaves and roots, and that its expression is regulated by light.
RNA-seq Data: Challenges in and Recommendations for Experimental Design and Analysis.

PubMed

Williams, Alexander G; Thomas, Sean; Wyman, Stacia K; Holloway, Alisha K

2014-10-01

RNA-seq is widely used to determine differential expression of genes or transcripts as well as identify novel transcripts, identify allele-specific expression, and precisely measure translation of transcripts. Thoughtful experimental design and choice of analysis tools are critical to ensure high-quality data and interpretable results. Important considerations for experimental design include number of replicates, whether to collect paired-end or single-end reads, sequence length, and sequencing depth. Common analysis steps in all RNA-seq experiments include quality control, read alignment, assigning reads to genes or transcripts, and estimating gene or transcript abundance. Our aims are two-fold: to make recommendations for common components of experimental design and assess tool capabilities for each of these steps. We also test tools designed to detect differential expression, since this is the most widespread application of RNA-seq. We hope that these analyses will help guide those who are new to RNA-seq and will generate discussion about remaining needs for tool improvement and development. Copyright © 2014 John Wiley & Sons, Inc.
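The last of the common analysis steps listed above, estimating gene or transcript abundance, can be illustrated with a transcripts-per-million (TPM) calculation. This is a minimal sketch assuming reads have already been assigned to genes. The gene names, counts and lengths are hypothetical, and TPM is only one of several normalizations in use, not necessarily one the authors recommend.

```python
from typing import Dict


def tpm(counts: Dict[str, int], lengths_bp: Dict[str, int]) -> Dict[str, float]:
    """Convert per-gene read counts to transcripts per million.

    Counts are first divided by gene length in kilobases, then the
    length-normalized rates are rescaled to sum to one million.
    """
    rates = {gene: counts[gene] / (lengths_bp[gene] / 1000.0) for gene in counts}
    scale = sum(rates.values())
    if scale == 0:
        raise ValueError("all counts are zero")
    return {gene: 1e6 * rate / scale for gene, rate in rates.items()}


# Hypothetical counts and gene lengths for a three-gene toy sample.
counts = {"geneA": 100, "geneB": 300, "geneC": 600}
lengths = {"geneA": 1000, "geneB": 3000, "geneC": 2000}
for gene, value in tpm(counts, lengths).items():
    print(gene, round(value))
```

Here geneA and geneB receive the same TPM although geneB has three times the reads, because geneB is three times as long.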
Cross species analysis of microarray expression data

PubMed Central

Lu, Yong; Huggins, Peter; Bar-Joseph, Ziv

2009-01-01

Motivation: Many biological systems operate in a similar manner across a large number of species or conditions. Cross-species analysis of sequence and interaction data is often applied to determine the function of new genes. In contrast to these static measurements, microarrays measure the dynamic, condition-specific response of complex biological systems. The recent exponential growth in microarray expression datasets allows researchers to combine expression experiments from multiple species to identify genes that are not only conserved in sequence but also operate in a similar way in the different species studied. Results: In this review we discuss the computational and technical challenges associated with these studies, the approaches that have been developed to address these challenges and the advantages of cross-species analysis of microarray data. We show how successful application of these methods leads to insights that cannot be obtained when analyzing data from a single species. We also highlight current open problems and discuss possible ways to address them. Contact: zivbj@cs.cmu.edu PMID:19357096
Principles of gene microarray data analysis.

PubMed

Mocellin, Simone; Rossi, Carlo Riccardo

2007-01-01

The development of several gene expression profiling methods, such as comparative genomic hybridization (CGH), differential display, serial analysis of gene expression (SAGE), and gene microarray, together with the sequencing of the human genome, has provided an opportunity to monitor and investigate the complex cascade of molecular events leading to tumor development and progression. The availability of such large amounts of information has shifted the attention of scientists towards a nonreductionist approach to biological phenomena. High throughput technologies can be used to follow changing patterns of gene expression over time. Among them, gene microarray has become prominent because it is easier to use, does not require large-scale DNA sequencing, and allows for the parallel quantification of thousands of genes from multiple samples. Gene microarray technology is rapidly spreading worldwide and has the potential to drastically change the therapeutic approach to patients affected with tumor. Therefore, it is of paramount importance for both researchers and clinicians to know the principles underlying the analysis of the huge amount of data generated with microarray technology.
Informatic selection of a neural crest-melanocyte cDNA set for microarray analysis

PubMed Central

Loftus, S. K.; Chen, Y.; Gooden, G.; Ryan, J. F.; Birznieks, G.; Hilliard, M.; Baxevanis, A. D.; Bittner, M.; Meltzer, P.; Trent, J.; Pavan, W.

1999-01-01

With cDNA microarrays, it is now possible to compare the expression of many genes simultaneously. To maximize the likelihood of finding genes whose expression is altered under the experimental conditions, it would be advantageous to be able to select clones for tissue-appropriate cDNA sets. We have taken advantage of the extensive sequence information in the dbEST expressed sequence tag (EST) database to identify a neural crest-derived melanocyte cDNA set for microarray analysis. Analysis of characterized genes with dbEST identified one library that contained ESTs representing 21 neural crest-expressed genes (library 198). The distribution of the ESTs corresponding to these genes was biased toward being derived from library 198. This is in contrast to the EST distribution profile for a set of control genes, characterized to be more ubiquitously expressed in multiple tissues (P < 1 × 10^−9). From library 198, a subset of 852 clustered ESTs were selected that have a library distribution profile similar to that of the 21 neural crest-expressed genes. Microarray analysis demonstrated the majority of the neural crest-selected 852 ESTs (Mel1 array) were differentially expressed in melanoma cell lines compared with a non-neural crest kidney epithelial cell line (P < 1 × 10^−8). This was not observed with an array of 1,238 ESTs that was selected without library origin bias (P = 0.204). This study presents an approach for selecting tissue-appropriate cDNAs that can be used to examine the expression profiles of developmental processes and diseases. PMID:10430933
Simian virus 40 major late promoter: an upstream DNA sequence required for efficient in vitro transcription.

PubMed Central

Brady, J; Radonovich, M; Thoren, M; Das, G; Salzman, N P

1984-01-01

We have previously identified an 11-base DNA sequence, 5'-G-G-T-A-C-C-T-A-A-C-C-3' (simian virus 40 [SV40] map position 294 to 304), which is important in the control of SV40 late RNA expression in vitro and in vivo (Brady et al., Cell 31:625-633, 1982). We report here the identification of another domain of the SV40 late promoter. A series of mutants with deletions extending from SV40 map position 0 to 300 was prepared by nuclease BAL 31 treatment. The cloned templates were then analyzed for efficiency and accuracy of late SV40 RNA expression in the Manley in vitro transcription system. Our studies showed that, in addition to the promoter domain near map position 300, there are essential DNA sequences between nucleotide positions 74 and 95 that are required for efficient expression of late SV40 RNA. Included in this SV40 DNA sequence were two of the six GGGCGG SV40 repeat sequences and an 11-nucleotide segment which showed strong homology with the upstream sequences required for the efficient in vitro and in vivo expression of the histone H2A gene. This upstream promoter sequence supported transcription with the same efficiency even when it was moved 72 nucleotides closer to the major late cap site. In vitro promoter competition analysis demonstrated that the upstream promoter sequence, independent of the 294 to 304 promoter element, is capable of binding polymerase-transcription factors required for SV40 late gene transcription. Finally, we show that DNA sequences which control the specificity of RNA initiation at nucleotide 325 lie downstream of map position 294. PMID:6321950
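Locating short promoter elements such as the GGGCGG repeat or the 11-base sequence quoted above is a simple string search once a sequence is in hand. The sketch below scans both strands of a DNA string for an exact motif. The function names are hypothetical, the test sequences are synthetic and are not the SV40 genome, and positions are 0-based string offsets rather than SV40 map positions.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif(seq: str, motif: str) -> List[Tuple[int, str]]:
    """Return (0-based start, strand) for every exact motif occurrence.

    The '-' strand hits are found by searching for the reverse
    complement of the motif in the given (forward) sequence.
    """
    seq = seq.upper()
    motif = motif.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


# Synthetic example: one GGGCGG on each strand.
toy = "TTGGGCGGAACCGCCCTT"
print(find_motif(toy, "GGGCGG"))
```

A palindromic motif would be reported once per strand at the same position, so callers that care may want to deduplicate by start coordinate.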
Transcriptomic analysis of grain amaranth (Amaranthus hypochondriacus) using 454 pyrosequencing: comparison with A. tuberculatus, expression profiling in stems and in response to biotic and abiotic stress

PubMed Central

2011-01-01

Background Amaranthus hypochondriacus, a grain amaranth, is a C4 plant noted for its ability to tolerate stressful conditions and produce highly nutritious seeds. These possess an optimal amino acid balance and constitute a rich source of health-promoting peptides. Although several recent studies, mostly involving subtractive hybridization strategies, have helped to increase the relatively low number of grain amaranth expressed sequence tags (ESTs), transcriptomic information on this species remains limited, particularly regarding tissue-specific and biotic stress-related genes. Thus, a large scale transcriptome analysis was performed to generate stem- and (a)biotic stress-responsive gene expression profiles in grain amaranth. Results A total of 2,700,168 raw reads were obtained from six 454 pyrosequencing runs, which were assembled into 21,207 high quality sequences (20,408 isotigs + 799 contigs). The average sequence length was 1,064 bp and 930 bp for isotigs and contigs, respectively. Only 5,113 singletons were recovered after quality control. Contigs/isotigs were further incorporated into 15,667 isogroups. All unique sequences were queried against the nr, TAIR, UniRef100, UniRef50 and Amaranthaceae EST databases for annotation. Functional GO annotation was performed with all contigs/isotigs that produced significant hits with the TAIR database. Only 8,260 sequences were found to be homologous when the transcriptomes of A. tuberculatus and A. hypochondriacus were compared, most of which were associated with basic house-keeping processes. Digital expression analysis identified 1,971 differentially expressed genes in response to at least one of four stress treatments tested. These included several multiple-stress-inducible genes that could represent potential candidates for use in the engineering of stress-resistant plants.
The transcriptomic data generated from pigmented stems shared similarity with findings reported in developing stems of Arabidopsis and black cottonwood (Populus trichocarpa). Conclusions This study represents the first large-scale transcriptomic analysis of A. hypochondriacus, considered to be a highly nutritious and stress-tolerant crop. Numerous genes were found to be induced in response to (a)biotic stress, many of which could further the understanding of the mechanisms that contribute to multiple stress-resistance in plants, a trait that has potential biotechnological applications in agriculture. PMID:21752295
A novel cytochrome P450 gene (CYP4G25) of the silkmoth Antheraea yamamai: cloning and expression pattern in pharate first instar larvae in relation to diapause.

PubMed

Yang, Ping; Tanaka, Hiromasa; Kuwano, Eiichi; Suzuki, Koichi

2008-03-01

A new cytochrome P450 gene, CYP4G25, was identified as a differentially expressed gene between the diapausing and post-diapausing pharate first instar larvae of the wild silkmoth Antheraea yamamai, using subtractive cDNA hybridization. The cDNA sequence of CYP4G25 has an open reading frame of 1674 nucleotides encoding 557 amino acid residues. Sequence analysis of the putative CYP4G25 protein disclosed the motif FXXGXRXCXG that is essential for heme binding in P450 cytochromes. Hybridization in situ demonstrated predominant expression of CYP4G25 in the integument of pharate first instar larvae. Northern blotting analysis showed an intensive signal after the initiation of diapause and no or weak expression throughout the periods of pre-diapause and post-diapause, including larval development. These results indicate that CYP4G25 is strongly associated with diapause in pharate first instar larvae.
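The two sequence-level statements in this abstract (a 1674-nucleotide open reading frame encoding 557 residues, and the heme-binding motif FXXGXRXCXG) can be checked with a few lines of code. The following is a minimal sketch, assuming that X denotes any residue and that the reported ORF length includes the stop codon; the function names and the toy peptide are hypothetical and are not taken from the paper.

```python
import re

# FXXGXRXCXG written as a regular expression; "." stands for the X positions.
HEME_MOTIF = re.compile(r"F..G.R.C.G")


def find_heme_motif(protein):
    """Return (0-based start, matched text) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in HEME_MOTIF.finditer(protein.upper())]


def residues_encoded(orf_nt):
    """Residues encoded by an ORF whose length includes the stop codon."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_nt // 3 - 1  # one codon is the stop codon


# Hypothetical toy peptide for illustration only; NOT the CYP4G25 sequence.
toy_peptide = "MKTLFSAGPRNCIGAAL"
print(find_heme_motif(toy_peptide))
print(residues_encoded(1674))
```

Under these assumptions the arithmetic agrees with the abstract: 1674 / 3 = 558 codons, one of which is the stop codon, leaving 557 residues.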
Molecular cloning and expression of the gene encoding the kinetoplast-associated type II DNA topoisomerase of Crithidia fasciculata.

PubMed

Pasion, S G; Hines, J C; Aebersold, R; Ray, D S

1992-01-01

A type II DNA topoisomerase, topoIImt, was shown previously to be associated with the kinetoplast DNA of the trypanosomatid Crithidia fasciculata. The gene encoding this kinetoplast-associated topoisomerase has been cloned by immunological screening of a Crithidia genomic expression library with monoclonal antibodies raised against the purified enzyme. The gene CfaTOP2 is a single copy gene and is expressed as a 4.8-kb polyadenylated transcript. The nucleotide sequence of CfaTOP2 has been determined and encodes a predicted polypeptide of 1239 amino acids with a molecular mass of 138,445. The identification of the cloned gene is supported by immunoblot analysis of the beta-galactosidase-CfaTOP2 fusion protein expressed in Escherichia coli and by analysis of tryptic peptide sequences derived from purified topoIImt. CfaTOP2 shares significant homology with nuclear type II DNA topoisomerases of other eukaryotes suggesting that in Crithidia both nuclear and mitochondrial forms of topoisomerase II are encoded by the same gene.
Ovule development: identification of stage-specific and tissue-specific cDNAs.

PubMed Central

Nadeau, J A; Zhang, X S; Li, J; O'Neill, S D

1996-01-01

A differential screening approach was used to identify seven ovule-specific cDNAs representing genes that are expressed in a stage-specific manner during ovule development. The Phalaenopsis orchid takes 80 days to complete the sequence of ovule developmental events, making it a good system to isolate stage-specific ovule genes. We constructed cDNA libraries from orchid ovule tissue during archesporial cell differentiation, megasporocyte formation, and the transition to meiosis, as well as during the final mitotic divisions of female gametophyte development. RNA gel blot hybridization analysis revealed that four clones were stage specific and expressed solely in ovule tissue, whereas one clone was specific to pollen tubes. Two other clones were not ovule specific. Sequence analysis and in situ hybridization revealed the identities and domain of expression of several of the cDNAs. O39 encodes a putative homeobox transcription factor that is expressed early in the differentiation of the ovule primordium; O40 encodes a cytochrome P450 monooxygenase (CYP78A2) that is pollen tube specific. O108 encodes a protein of unknown function that is expressed exclusively in the outer layer of the outer integument and in the female gametophyte of mature ovules. O126 encodes a glycine-rich protein that is expressed in mature ovules, and O141 encodes a cysteine proteinase that is expressed in the outer integument of ovules during seed formation. Sequences homologous to these ovule clones can now be isolated from other organisms, and this should facilitate their functional characterization. PMID:8742709
Transcriptome Analysis of Orbital Adipose Tissue in Active Thyroid Eye Disease Using Next Generation RNA Sequencing Technology

PubMed Central

Lee, Bradford W.; Kumar, Virender B.; Biswas, Pooja; Ko, Audrey C.; Alameddine, Ramzi M.; Granet, David B.; Ayyagari, Radha; Kikkawa, Don O.; Korn, Bobby S.

2018-01-01

Objective: This study utilized Next Generation Sequencing (NGS) to identify differentially expressed transcripts in orbital adipose tissue from patients with active Thyroid Eye Disease (TED) versus healthy controls. Method: This prospective, case-control study enrolled three patients with severe, active thyroid eye disease undergoing orbital decompression, and three healthy controls undergoing routine eyelid surgery with removal of orbital fat. RNA Sequencing (RNA-Seq) was performed on freshly obtained orbital adipose tissue from study patients to analyze the transcriptome. Bioinformatics analysis was performed to determine pathways and processes enriched for the differential expression profile. Quantitative Reverse Transcriptase-Polymerase Chain Reaction (qRT-PCR) was performed to validate the differential expression of selected genes identified by RNA-Seq. Results: RNA-Seq identified 328 differentially expressed genes associated with active thyroid eye disease, many of which were responsible for mediating inflammation, cytokine signaling, adipogenesis, IGF-1 signaling, and glycosaminoglycan binding. The IL-5 and chemokine signaling pathways were highly enriched, and very-low-density-lipoprotein receptor activity and statin medications were implicated as having a potential role in TED. Conclusion: This study is the first to use RNA-Seq technology to elucidate differential gene expression associated with active, severe TED. This study suggests a transcriptional basis for the role of statins in modulating differentially expressed genes that mediate the pathogenesis of thyroid eye disease. Furthermore, the identification of genes with altered levels of expression in active, severe TED may inform the molecular pathways central to this clinical phenotype and guide the development of novel therapeutic agents. PMID:29760827
Gene identification and analysis of transcripts differentially regulated in fracture healing by EST sequencing in the domestic sheep.

PubMed

Hecht, Jochen; Kuhl, Heiner; Haas, Stefan A; Bauer, Sebastian; Poustka, Albert J; Lienau, Jasmin; Schell, Hanna; Stiege, Asita C; Seitz, Volkhard; Reinhardt, Richard; Duda, Georg N; Mundlos, Stefan; Robinson, Peter N

2006-07-05

The sheep is an important model animal for testing novel fracture treatments and other medical applications. Despite these medical uses and the well known economic and cultural importance of the sheep, relatively little research has been performed into sheep genetics, and DNA sequences are available for only a small number of sheep genes. In this work we have sequenced over 47 thousand expressed sequence tags (ESTs) from libraries developed from healing bone in a sheep model of fracture healing. These ESTs were clustered with the previously available 10 thousand sheep ESTs to a total of 19087 contigs with an average length of 603 nucleotides. We used the newly identified sequences to develop RT-PCR assays for 78 sheep genes and measured differential expression during the course of fracture healing between days 7 and 42 postfracture. All genes showed significant shifts at one or more time points. 23 of the genes were differentially expressed between postfracture days 7 and 10, which could reflect an important role for these genes for the initiation of osteogenesis. The sequences we have identified in this work are a valuable resource for future studies on musculoskeletal healing and regeneration using sheep and represent an important head-start for genomic sequencing projects for Ovis aries, with partial or complete sequences being made available for over 5,800 previously unsequenced sheep genes.

Construction of a Lotus japonicus late nodulin expressed sequence tag library and identification of novel nodule-specific genes.

PubMed Central

Szczyglowski, K; Hamburger, D; Kapranov, P; de Bruijn, F J

1997-01-01

A range of novel expressed sequence tags (ESTs) associated with late developmental events during nodule organogenesis in the legume Lotus japonicus were identified using mRNA differential display; 110 differentially displayed polymerase chain reaction products were cloned and analyzed. Of 88 unique cDNAs obtained, 22 shared significant homology to DNA/protein sequences in the respective databases. This group comprises, among others, a nodule-specific homolog of protein phosphatase 2C, a peptide transporter protein, and a nodule-specific form of cytochrome P450. RNA gel-blot analysis of 16 differentially displayed ESTs confirmed their nodule-specific expression pattern. The kinetics of mRNA accumulation of the majority of the ESTs analyzed were found to resemble the expression pattern observed for the L. japonicus leghemoglobin gene. These results indicate that the newly isolated molecular markers correspond to genes induced during late developmental stages of L. japonicus nodule organogenesis and provide important, novel tools for the study of nodulation. PMID:9276951
QuASAR: quantitative allele-specific analysis of reads

PubMed Central

Harvey, Chris T.; Moyerbrailean, Gregory A.; Davis, Gordon O.; Wen, Xiaoquan; Luca, Francesca; Pique-Regi, Roger

2015-01-01

Motivation: Expression quantitative trait loci (eQTL) studies have discovered thousands of genetic variants that regulate gene expression, enabling a better understanding of the functional role of non-coding sequences. However, eQTL studies are costly, requiring large sample sizes and genome-wide genotyping of each sample. In contrast, analysis of allele-specific expression (ASE) is becoming a popular approach to detect the effect of genetic variation on gene expression, even within a single individual. This is typically achieved by counting the number of RNA-seq reads matching each allele at heterozygous sites and testing the null hypothesis of a 1:1 allelic ratio. In principle, when genotype information is not readily available, it could be inferred from the RNA-seq reads directly. However, there are currently no existing methods that jointly infer genotypes and conduct ASE inference, while considering uncertainty in the genotype calls. Results: We present QuASAR, quantitative allele-specific analysis of reads, a novel statistical learning method for jointly detecting heterozygous genotypes and inferring ASE. The proposed ASE inference step takes into consideration the uncertainty in the genotype calls, while including parameters that model base-call errors in sequencing and allelic over-dispersion. We validated our method with experimental data for which high-quality genotypes are available. Results for an additional dataset with multiple replicates at different sequencing depths demonstrate that QuASAR is a powerful tool for ASE analysis when genotypes are not available. Availability and implementation: http://github.com/piquelab/QuASAR. Contact: fluca@wayne.edu or rpique@wayne.edu Supplementary information: Supplementary Material is available at Bioinformatics online. PMID:25480375
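The baseline test described in the Motivation (counting reads for each allele at a heterozygous site and testing a 1:1 allelic ratio) can be sketched as an exact two-sided binomial test. This is only the simple baseline and not QuASAR's model, which additionally handles genotype uncertainty, base-call errors and over-dispersion; the function name is hypothetical, and the implementation is intended for modest read depths.

```python
from math import comb


def allelic_imbalance_p(ref_reads, alt_reads, p=0.5):
    """Exact two-sided binomial p-value for the null of a fixed allelic ratio.

    Sums the probability of every outcome that is no more likely than the
    observed one. Assumes a known heterozygous site and independent reads.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Small relative tolerance so ties are not lost to floating-point rounding.
    total = sum(x for x in pmf if x <= observed * (1 + 1e-9))
    return min(1.0, total)


print(allelic_imbalance_p(10, 10))  # balanced counts
print(allelic_imbalance_p(20, 0))   # all reads from one allele
```

A plain binomial test tends to be overconfident on real RNA-seq counts, which is one reason the abstract emphasizes modelling over-dispersion.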
Identification of Actinobacillus pleuropneumoniae Genes Preferentially Expressed During Infection Using In Vivo-Induced Antigen Technology (IVIAT).

PubMed

Zhang, Fei; Zhang, Yangyi; Wen, Xintian; Huang, Xiaobo; Wen, Yiping; Wu, Rui; Yan, Qigui; Huang, Yong; Ma, Xiaoping; Zhao, Qin; Cao, Sanjie

2015-10-01

Porcine pleuropneumonia is an infectious disease caused by Actinobacillus pleuropneumoniae. Identifying A. pleuropneumoniae genes that are specifically expressed in vivo is a useful way to reveal the mechanism of infection. IVIAT was used in this work to identify antigens expressed in vivo during A. pleuropneumoniae infection, using sera from individuals with chronic porcine pleuropneumonia. Sequencing of DNA inserts from positive clones showed 11 open reading frames with high homology to A. pleuropneumoniae genes. Based on sequence analysis, proteins encoded by these genes were involved in metabolism, replication, transcription regulation, and signal transduction. Moreover, three proteins of unknown function were also identified in this work. Expression analysis using quantitative real-time PCR showed that most of the genes tested were up-regulated in vivo relative to their expression levels in vitro. PCR amplification of the in vivo-induced (IVI) genes in different A. pleuropneumoniae strains showed that these genes could be detected in almost all of the strains. These results suggest that the identified IVI antigens may have important roles in A. pleuropneumoniae infection.
Molecular cloning and characterization of a gene regulating flowering time from Alfalfa (Medicago sativa L.).

PubMed

Zhang, Tiejun; Chao, Yuehui; Kang, Junmei; Ding, Wang; Yang, Qingchuan

2013-07-01

Genes that regulate flowering time play crucial roles in plant development and biomass formation. Based on the cDNA sequence of Medicago truncatula (accession no. AY690425), the LFY gene of alfalfa was cloned. Sequence similarity analysis revealed high homology with FLO/LFY family genes of other plants. When fused to the green fluorescent protein, MsLFY protein was localized in the nucleus of onion (Allium cepa L.) epidermal cells. RT-qPCR analysis of MsLFY expression patterns showed that expression of the MsLFY gene was low in roots, stems, leaves and pods and highest in floral buds. The expression of MsLFY was induced by GA3 and long photoperiod. A plant expression vector was constructed and transformed into Arabidopsis by Agrobacterium-mediated transformation. PCR amplification from genomic DNA of the transgenic Arabidopsis indicated that the MsLFY gene had integrated into the Arabidopsis genome. Overexpression of MsLFY specifically caused early flowering under long day conditions compared with non-transgenic plants. These results indicate that MsLFY plays a role in promoting flowering.
Rift Valley Fever Virus Structural and Nonstructural Proteins: Recombinant Protein Expression and Immunoreactivity Against Antisera from Sheep

PubMed Central

Faburay, Bonto; Wilson, William; McVey, D. Scott; Drolet, Barbara S.; Weingartl, Hana; Madden, Daniel; Young, Alan; Ma, Wenjun

2013-01-01

The Rift Valley fever virus (RVFV) encodes the structural proteins nucleoprotein (N), aminoterminal glycoprotein (Gn), carboxyterminal glycoprotein (Gc), and L protein, 78-kD, and the nonstructural proteins NSm and NSs. Using the baculovirus system, we expressed the full-length coding sequence of N, NSs, NSm, Gc, and the ectodomain of the coding sequence of the Gn glycoprotein derived from the virulent strain of RVFV ZH548. Western blot analysis using anti-His antibodies and monoclonal antibodies against Gn and N confirmed expression of the recombinant proteins, and in vitro biochemical analysis showed that the two glycoproteins, Gn and Gc, were expressed in glycosylated form. Immunoreactivity profiles of the recombinant proteins in western blot and in indirect enzyme-linked immunosorbent assay against a panel of antisera obtained from vaccinated or wild type (RVFV)-challenged sheep confirmed the results obtained with anti-His antibodies and demonstrated the suitability of the baculo-expressed antigens for diagnostic assays. In addition, these recombinant proteins could be valuable for the development of diagnostic methods that differentiate infected from vaccinated animals (DIVA). PMID:23962238
Characterization of Developmental- and Stress-Mediated Expression of Cinnamoyl-CoA Reductase in Kenaf (Hibiscus cannabinus L.)

PubMed Central

Lim, Hyoun-Sub; Park, Sang-Un; Bae, Hyeun-Jong; Natarajan, Savithiry

2014-01-01

Cinnamoyl-CoA reductase (CCR) is an important enzyme for lignin biosynthesis as it catalyzes the first specific committed step in monolignol biosynthesis. We have cloned a full length coding sequence of CCR from kenaf (Hibiscus cannabinus L.), which contains a 1,020-bp open reading frame (ORF), encoding 339 amino acids of 37.37 kDa, with an isoelectric point (pI) of 6.27 (JX524276, HcCCR2). A BLAST search showed that it has high homology with other plant CCR orthologs. Multiple alignment with other plant CCR sequences showed that it contains two highly conserved motifs: an NAD(P) binding domain (VTGAGGFIASWMVKLLLEKGY) at the N-terminus and a probable catalytic domain (NWYCYGK). According to phylogenetic analysis, it was closely related to CCR sequences of Gossypium hirsutum (ACQ59094) and Populus trichocarpa (CAC07424). HcCCR2 showed ubiquitous expression in various kenaf tissues, and the highest expression was detected in mature flowers. HcCCR2 was expressed differentially in response to various stresses, and the highest expression was observed after drought and NaCl treatments. PMID:24723816
MDC9, a widely expressed cellular disintegrin containing cytoplasmic SH3 ligand domains

PubMed Central

1996-01-01

Cellular disintegrins are a family of proteins that are related to snake venom integrin ligands and metalloproteases. We have cloned and sequenced the mouse and human homologue of a widely expressed cellular disintegrin, which we have termed MDC9 (for metalloprotease/disintegrin/cysteine-rich protein 9). The deduced mouse and human protein sequences are 82% identical. MDC9 contains several distinct protein domains: a signal sequence is followed by a prodomain and a domain with sequence similarity to snake venom metalloproteases, a disintegrin domain, a cysteine-rich region, an EGF repeat, a membrane anchor, and a cytoplasmic tail. The cytoplasmic tail of MDC9 has two proline-rich sequences which can bind the SH3 domain of Src, and may therefore function as SH3 ligand domains. Western blot analysis shows that MDC9 is an approximately 84-kD glycoprotein in all mouse tissues examined, and in NIH 3T3 fibroblast and C2C12 myoblast mouse cell lines. MDC9 can be both cell surface biotinylated and 125I-labeled in NIH 3T3 mouse fibroblasts, indicating that the protein is present on the plasma membrane. Expression of MDC9 in COS-7 cells yields an 84-kD protein, and immunofluorescence analysis of COS-7 cells expressing MDC9 shows a staining pattern that is consistent with a plasma membrane localization. The apparent molecular mass of 84 kD suggests that MDC9 contains a membrane-anchored metalloprotease and disintegrin domain. We propose that MDC9 might function as a membrane-anchored integrin ligand or metalloprotease, or that MDC9 may combine both activities in one protein. PMID:8647900
De novo sequencing and analysis of the cranberry fruit transcriptome to identify putative genes involved in flavonoid biosynthesis, transport and regulation.

PubMed

Sun, Haiyue; Liu, Yushan; Gai, Yuzhuo; Geng, Jinman; Chen, Li; Liu, Hongdi; Kang, Limin; Tian, Youwen; Li, Yadong

2015-09-02

Cranberries (Vaccinium macrocarpon Ait.), renowned for their excellent health benefits, are an important berry crop. Here, we performed transcriptome sequencing of one cranberry cultivar, from fruits at two different developmental stages, on the Illumina HiSeq 2000 platform. Our main goals were to identify putative genes for major metabolic pathways of bioactive compounds and compare the expression patterns between white fruit (W) and red fruit (R) in cranberry. In this study, two cDNA libraries of W and R were constructed. Approximately 119 million raw sequencing reads were generated and assembled de novo, yielding 57,331 high quality unigenes with an average length of 739 bp. Using BLASTx, 38,460 unigenes were identified as putative homologs of annotated sequences in public protein databases, including NCBI NR, NT, Swiss-Prot, KEGG, COG and GO. Of these, 21,898 unigenes mapped to 128 KEGG pathways, with the metabolic pathways, secondary metabolites, glycerophospholipid metabolism, ether lipid metabolism, starch and sucrose metabolism, purine metabolism, and pyrimidine metabolism being well represented. Among them, many candidate genes were involved in flavonoid biosynthesis, transport and regulation. Furthermore, digital gene expression (DEG) analysis identified 3,257 unigenes that were differentially expressed between the two fruit developmental stages. In addition, 14,473 simple sequence repeats (SSRs) were detected. Our results present comprehensive gene expression information about the cranberry fruit transcriptome that could facilitate our understanding of the molecular mechanisms of fruit development in cranberries. Although it will be necessary to validate the functions carried out by these genes, these results could be used to improve the quality of breeding programs for the cranberry and related species.
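The abstract reports 14,473 simple sequence repeats (SSRs) but does not say which tool or thresholds were used. As an illustration of what SSR detection involves, the following is a minimal regular-expression sketch; the minimum repeat counts in MIN_REPEATS are hypothetical defaults chosen for the example, and the helper names are not from the paper.

```python
import re

# Hypothetical thresholds: unit length -> minimum number of tandem copies.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def is_primitive(unit):
    """True if the unit is not itself a repeat of a shorter unit (e.g. not ATAT)."""
    n = len(unit)
    return not any(
        n % k == 0 and unit == unit[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (0-based start, repeat unit, copy number) tuples."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One captured unit followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            unit = m.group(1)
            if is_primitive(unit):
                hits.append((m.start(), unit, len(m.group()) // unit_len))
    return sorted(hits)


# Hypothetical toy sequence, not cranberry data.
toy_seq = "GGC" + "AT" * 7 + "CCG" + "AGC" * 5 + "TT"
print(find_ssrs(toy_seq))
```

Dedicated SSR tools also handle compound and imperfect repeats, which this sketch ignores.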
CoLIde: a bioinformatics tool for CO-expression-based small RNA Loci Identification using high-throughput sequencing data.

PubMed

Mohorianu, Irina; Stocks, Matthew Benedict; Wood, John; Dalmay, Tamas; Moulton, Vincent

2013-07-01

Small RNAs (sRNAs) are 20-25 nt non-coding RNAs that act as guides for the highly sequence-specific regulatory mechanism known as RNA silencing. Due to the recent increase in sequencing depth, a highly complex and diverse population of sRNAs in both plants and animals has been revealed. However, the exponential increase in sequencing data has also made the identification of individual sRNA transcripts corresponding to biological units (sRNA loci) more challenging when based exclusively on the genomic location of the constituent sRNAs, hindering existing approaches to identify sRNA loci. To infer the location of significant biological units, we propose an approach for sRNA loci detection called CoLIde (Co-expression based sRNA Loci Identification) that combines genomic location with the analysis of other information such as variation in expression levels (expression pattern) and size class distribution. For CoLIde, we define a locus as a union of regions sharing the same pattern and located in close proximity on the genome. Biological relevance, detected through the analysis of size class distribution, is also calculated for each locus. CoLIde can be applied on ordered (e.g., time-dependent) or un-ordered (e.g., organ, mutant) series of samples both with or without biological/technical replicates. The method reliably identifies known types of loci and shows improved performance on sequencing data from both plants (e.g., A. thaliana, S. lycopersicum) and animals (e.g., D. melanogaster) when compared with existing locus detection techniques. CoLIde is available for use within the UEA Small RNA Workbench which can be downloaded from: http://srna-workbench.cmp.uea.ac.uk.
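The locus definition given here (a union of regions that share the same expression pattern and lie in close proximity on the genome) can be illustrated with a simple merging pass. This is a rough sketch of the idea only, not the CoLIde implementation: the pattern encoding, the `max_gap` parameter and its default are assumptions, only adjacent regions are compared, and the size-class significance step is omitted.

```python
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int
    end: int
    pattern: str  # assumed encoding, e.g. "UD" = up then down across samples


def merge_into_loci(regions, max_gap=50):
    """Merge consecutive regions with the same pattern separated by <= max_gap nt."""
    loci = []
    for r in sorted(regions, key=lambda x: (x.chrom, x.start)):
        last = loci[-1] if loci else None
        if (
            last is not None
            and last.chrom == r.chrom
            and last.pattern == r.pattern
            and r.start - last.end <= max_gap
        ):
            # Extend the current locus to cover the new region.
            loci[-1] = last._replace(end=max(last.end, r.end))
        else:
            loci.append(r)
    return loci


# Hypothetical toy regions for illustration.
toy_regions = [
    Region("chr1", 100, 121, "UD"),
    Region("chr1", 130, 152, "UD"),
    Region("chr1", 160, 181, "DU"),
    Region("chr1", 900, 921, "UD"),
]
for locus in merge_into_loci(toy_regions):
    print(locus)
```

In this toy input the first two regions merge into one locus, while the third (different pattern) and the fourth (too distant, and separated by a region with another pattern) remain separate.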
Characterization and expression analysis of Toll-like receptor 3 cDNA from Atlantic salmon (Salmo salar).

PubMed

Vidal, R; González, R; Gil, F

2015-06-10

Innate pathway activation is fundamental for early anti-viral defense in fish, but currently there is insufficient understanding of how salmonid fish identify viral molecules and activate these pathways. The Toll-like receptor (TLR) is believed to play a crucial role in host defense against pathogenic microbes in the innate immune system. In the present study, the full-length cDNA of Salmo salar TLR3 (ssTLR3) was cloned. The ssTLR3 cDNA sequence was 6071 bp long, containing an open reading frame of 2754 bp and encoding 971 amino acids. The TLR group motifs, such as leucine-rich repeat (LRR) domains and Toll-interleukin-1 receptor (TIR) domains, were maintained in ssTLR3, with sixteen LRR domains and one TIR domain. In contrast to descriptions of TLR3 in rainbow trout and mouse, which are TATA-less, we found a putative TATA box in the proximal promoter region 29 bp upstream of the transcription start point of ssTLR3. Multiple-sequence alignment analysis of the ssTLR3 protein-coding sequence with other known TLR3 sequences showed the sequence to be conserved among all species analyzed, implying that the function of TLR3 has been sustained throughout evolution. The ssTLR3 mRNA expression patterns were measured using real-time PCR. The results revealed that TLR3 is widely expressed in various healthy tissues. Individuals challenged with infectious pancreatic necrosis virus and immunostimulated with polyinosinic:polycytidylic acid exhibited increased expression of TLR3 at the mRNA level, indicating that ssTLR3 may be involved in pathogen recognition in the early innate immune system.
Riboflavin accumulation and characterization of cDNAs encoding lumazine synthase and riboflavin synthase in bitter melon (Momordica charantia).

PubMed

Tuan, Pham Anh; Kim, Jae Kwang; Lee, Sanghyun; Chae, Soo Cheon; Park, Sang Un

2012-12-05

Riboflavin (vitamin B2) is the universal precursor of the coenzymes flavin mononucleotide and flavin adenine dinucleotide--cofactors that are essential for the activity of a wide variety of metabolic enzymes in animals, plants, and microbes. Using the RACE PCR approach, cDNAs encoding lumazine synthase (McLS) and riboflavin synthase (McRS), which catalyze the last two steps in the riboflavin biosynthetic pathway, were cloned from bitter melon (Momordica charantia), a popular vegetable crop in Asia. Amino acid sequence alignments indicated that McLS and McRS share high sequence identity with other orthologous genes and carry an N-terminal extension, which is reported to be a plastid-targeting sequence. Organ expression analysis using quantitative real-time RT PCR showed that McLS and McRS were constitutively expressed in M. charantia, with the strongest expression levels observed during the last stage of fruit ripening (stage 6). This correlated with the highest level of riboflavin content, which was detected during ripening stage 6 by HPLC analysis. McLS and McRS were highly expressed in the young leaves and flowers, whereas roots exhibited the highest accumulation of riboflavin. The cloning and characterization of McLS and McRS from M. charantia may aid the metabolic engineering of vitamin B2 in crops.
Identification of a cis-Regulatory Element Involved in Phytochrome Down-Regulated Expression of the Pea Small GTPase Gene pra2

PubMed Central

Inaba, Takehito; Nagano, Yukio; Sakakibara, Toshihiro; Sasaki, Yukiko

1999-01-01

The pra2 gene encodes a pea (Pisum sativum) small GTPase belonging to the YPT/rab family, and its expression is down-regulated by light, mediated by phytochrome. We have isolated and characterized a genomic clone of this gene and constructed a fusion DNA of its 5′-upstream region in front of the gene for firefly luciferase. Using this construct in a transient assay, we determined a pra2 cis-regulatory region sufficient to direct the light down-regulation of the luciferase reporter gene. Both 5′- and internal deletion analyses revealed that the 93-bp sequence between −734 and −642 from the transcriptional start site was important for phytochrome down-regulation. Gain-of-function analysis showed that this 93-bp region could confer light down-regulation when fused to the cauliflower mosaic virus 35S promoter. Furthermore, linker-scanning analysis showed that a 12-bp sequence within the 93-bp region mediated phytochrome down-regulation. Gel-retardation analysis showed the presence of a nuclear factor that was specifically bound to the 12-bp sequence in vitro. These results indicate that this element is a cis-regulatory element involved in phytochrome down-regulated expression. PMID:10364400
De Novo Transcriptome Sequencing of Olea europaea L. to Identify Genes Involved in the Development of the Pollen Tube.

PubMed

Iaria, Domenico; Chiappetta, Adriana; Muzzalupo, Innocenzo

2016-01-01

In olive (Olea europaea L.), the processes controlling self-incompatibility are still unclear, and their molecular basis is not fully characterized. To determine compatibility relationships, we used next-generation sequencing techniques and a de novo transcriptome assembly strategy. We show that pollen tubes from different olive plants, grown in vitro in a medium containing their own pistil and in pollen/pistil combinations from self-sterile and self-fertile cultivars, have distinct gene expression profiles, and that many of the sequences differentially expressed between the samples fall within gene families involved in the development of the pollen tube, such as lipase, carboxylesterase, pectinesterase, pectin methylesterase, and callose synthase. Moreover, different genes involved in signal transduction, transcription, and growth are overrepresented. The analysis also allowed us to identify members of the actin, actin depolymerization factor, and fibrin gene families, and of the Ca(2+)-binding gene family related to the development and polarization of the pollen apical tip. The whole transcriptomic analysis, through the identification of the set of differentially expressed transcripts and an extended functional annotation analysis, will lead to a better understanding of the mechanisms of pollen germination and pollen tube growth in the olive.
Functional analysis of Pacific oyster (Crassostrea gigas) β-thymosin: Focus on antimicrobial activity.

PubMed

Nam, Bo-Hye; Seo, Jung-Kil; Lee, Min Jeong; Kim, Young-Ok; Kim, Dong-Gyun; An, Cheul Min; Park, Nam Gyu

2015-07-01

An antimicrobial peptide, ∼5 kDa in size, was isolated and purified in its active form from the mantle of the Pacific oyster Crassostrea gigas by C18 reversed-phase high-performance liquid chromatography. Matrix-assisted laser desorption ionisation time-of-flight analysis revealed 4656.4 Da of the purified and unreduced peptide. A comparison of the N-terminal amino acid sequence of oyster antimicrobial peptide with deduced amino acid sequences in our local expressed sequence tag (EST) database of C. gigas (unpublished data) revealed that the oyster antimicrobial peptide sequence entirely matched the deduced amino acid sequence of an EST clone (HM-8_A04), which was highly homologous with the β-thymosin of other species. The cDNA possessed a 126-bp open reading frame that encoded a protein of 41 amino acids. To confirm the antimicrobial activity of C. gigas β-thymosin, we overexpressed a recombinant β-thymosin (rcgTβ) using a pET22 expression plasmid in an Escherichia coli system. The antimicrobial activity of rcgTβ was evaluated and demonstrated using a bacterial growth inhibition test in both liquid and solid cultures. Copyright © 2015 Elsevier Ltd. All rights reserved.
Comparing the normalization methods for the differential analysis of Illumina high-throughput RNA-Seq data.

PubMed

Li, Peipei; Piao, Yongjun; Shon, Ho Sun; Ryu, Keun Ho

2015-10-28

Recently, rapid improvements in technology and decreases in sequencing costs have made RNA-Seq a widely used technique to quantify gene expression levels. Various normalization approaches have been proposed, owing to the importance of normalization in the analysis of RNA-Seq data. A comparison of recently proposed normalization methods is required to generate suitable guidelines for the selection of the most appropriate approach for future experiments. In this paper, we compared eight non-abundance (RC, UQ, Med, TMM, DESeq, Q, RPKM, and ERPKM) and two abundance estimation normalization methods (RSEM and Sailfish). The experiments were based on real Illumina high-throughput RNA-Seq of 35- and 76-nucleotide sequences produced in the MAQC project and on simulated reads. Reads were mapped to the human genome obtained from the UCSC Genome Browser Database. For precise evaluation, we investigated Spearman correlation between the normalization results from RNA-Seq and MAQC qRT-PCR values for 996 genes. Based on this work, we showed that out of the eight non-abundance estimation normalization methods, RC, UQ, Med, TMM, DESeq, and Q gave similar normalization results for all data sets. For RNA-Seq of a 35-nucleotide sequence, RPKM showed the highest correlation results, but for RNA-Seq of a 76-nucleotide sequence, it showed the lowest correlation among the methods. ERPKM did not improve on the results of RPKM. Of the two abundance estimation normalization methods, for RNA-Seq of a 35-nucleotide sequence, Sailfish gave higher correlation than RSEM, which in turn was better than the results obtained without abundance estimation. However, for RNA-Seq of a 76-nucleotide sequence, the results achieved by RSEM were similar to those obtained without abundance estimation, and were much better than those obtained with Sailfish. Furthermore, we found that adding a poly-A tail increased alignment numbers, but did not improve normalization results.
Spearman correlation analysis revealed that RC, UQ, Med, TMM, DESeq, and Q did not noticeably improve gene expression normalization, regardless of read length. Other normalization methods were more efficient when alignment accuracy was low; Sailfish with RPKM gave the best normalization results. When alignment accuracy was high, RC was sufficient for gene expression calculation. We suggest ignoring the poly-A tail during differential gene expression analysis.
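The evaluation described in this record rests on two calculations: a length- and depth-scaled expression value (RPKM) and a Spearman rank correlation against qRT-PCR measurements. The following is a minimal Python sketch of both, not the authors' pipeline. The gene names, read counts, gene lengths, library size, and qRT-PCR values are invented for illustration, and the helper functions are hypothetical.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def average_ranks(values):
    """Rank values 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: (read count, gene length in bp) per gene.
genes = {
    "geneA": (500, 1000),
    "geneB": (1500, 6000),
    "geneC": (100, 500),
    "geneD": (2000, 2000),
}
total_mapped = 1_000_000      # assumed library size
qpcr = [5.0, 3.0, 2.0, 9.0]   # invented qRT-PCR values, same gene order

raw_counts = [count for count, _ in genes.values()]
rpkm_values = [rpkm(count, length, total_mapped) for count, length in genes.values()]

print("raw counts vs qRT-PCR:", spearman(raw_counts, qpcr))
print("RPKM vs qRT-PCR:      ", spearman(rpkm_values, qpcr))
```

In this invented example, scaling by gene length changes the rank order of the genes and therefore the correlation with the reference values. It illustrates the mechanics only and says nothing about which method performed best in the study.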
Identification of cis-elements and evaluation of upstream regulatory region of a rice anther-specific gene, OSIPP3, conferring pollen-specific expression in Oryza sativa (L.) ssp. indica.

PubMed

Manimaran, P; Raghurami Reddy, M; Bhaskar Rao, T; Mangrauthia, Satendra K; Sundaram, R M; Balachandran, S M

2015-12-01

Pollen-specific expression. Promoters comprise various cis-regulatory elements which control development and physiology of plants by regulating gene expression. To understand promoter specificity and identify functional cis-acting elements, progressive 5' deletion analysis of the promoter fragments is widely used. We have evaluated the activity of regulatory elements of 5' promoter deletion sequences of anther-specific gene OSIPP3, viz. OSIPP3-Δ1 (1504 bp), OSIPP3-Δ2 (968 bp), OSIPP3-Δ3 (388 bp) and OSIPP3-Δ4 (286 bp) through the expression of transgene GUS in rice. In silico analysis of 1504-bp sequence harboring different copy number of cis-acting regulatory elements such as POLLENLELAT52, GTGANTG10, enhancer element of LAT52 and LAT56 indicated that they were essential for high level of expression in pollen. Histochemical GUS analysis of the transgenic plants revealed that 1504- and 968-bp fragments directed GUS expression in roots and anthers, while the 388- and 286-bp fragments restricted the GUS expression to only pollen, of which 388 bp conferred strong GUS expression. Further, GUS staining analysis of different panicle development stages (P1-P6) confirmed that the GUS gene was preferentially expressed only at P6 stage (late pollen stage). The qRT-PCR analysis of GUS transcript revealed 23-fold higher expression of GUS transcript in OSIPP3-Δ1 followed by OSIPP3-Δ2 (eightfold) and OSIPP3-Δ3 (threefold) when compared to OSIPP3-Δ4. Based on our results, we proposed that among the two smaller fragments, the 388-bp upstream regulatory region could be considered as a promising candidate for pollen-specific expression of agronomically important transgenes in rice.
Molecular characterization of DnaJ 5 homologs in silkworm Bombyx mori and its expression during egg diapause.

PubMed

Sirigineedi, Sasibhushan; Vijayagowri, Esvaran; Murthy, Geetha N; Rao, Guruprasada; Ponnuvel, Kangayam M

2014-12-01

A comparison of the cDNA sequences (1 056 bp) of Bombyx mori DnaJ 5 homolog with B. mori genome revealed that unlike in other Hsps, it has an intron of 234 bp. The DnaJ 5 homolog contains 351 amino acids, of which 70 contain the conserved DnaJ domain at the N-terminal end. This homolog of B. mori has all desirable functional domains similar to other insects, and the 13 different DnaJ homologs identified in B. mori genome were distributed on different chromosomes. The expressed sequence tag database analysis of Hsp40 gene expression revealed higher expression in wing disc followed by diapause-induced eggs. Microarray analysis revealed higher expression of DnaJ 5 homolog at 18th h after oviposition in diapause-induced eggs. Further validation of DnaJ 5 expression through qPCR in diapause-induced and nondiapause eggs at different time intervals revealed higher expression in diapause eggs at 18 and 24 h after oviposition, which coincided with the expression of Hsp70 as the Hsp 40 is its co-chaperone. This study thus provides an outline of the genome organization of Hsp40 gene, and its role in egg diapause induction in B. mori. © 2013 Institute of Zoology, Chinese Academy of Sciences.
The presence of both negative and positive elements in the 5'-flanking sequence of the rat Na,K-ATPase alpha 3 subunit gene are required for brain expression in transgenic mice.

PubMed Central

Pathak, B G; Neumann, J C; Croyle, M L; Lingrel, J B

1994-01-01

The Na,K-ATPase is an integral plasma membrane protein consisting of alpha and beta subunits, each of which has discrete isoforms expressed in a tissue-specific manner. Of the three functional alpha isoform genes, the one encoding the alpha 3 isoform is the most tissue-restricted in its expression, being found primarily in the brain. To identify regions of the alpha 3 isoform gene that are involved in directing expression in the brain, a 1.6 kb 5'-flanking sequence was attached to a reporter gene, chloramphenicol acetyltransferase (CAT). The alpha 3-CAT chimeric gene construct was microinjected into fertilized mouse eggs, and transgenic mice were produced. Analysis of adult transgenic mice from different lines revealed that the transgene is expressed primarily in the brain. To further delineate regions that are needed for conferring expression in this tissue, systematic deletions of the 5'-flanking sequence of the alpha 3-CAT fusion constructs were made and analyzed, again using transgenic mice. The results from these analyses indicate that DNA sequences required for mediating brain-specific expression of the alpha 3 isoform gene are present within 210 bp upstream of the transcription initiation site. alpha 3-CAT promoter constructs containing scanning mutations in this region were also assayed in transgenic mice. These studies have identified both a functional neural-restrictive silencer element as well as a positively acting cis element. PMID:7984427
MAGIC database and interfaces: an integrated package for gene discovery and expression.

PubMed

Cordonnier-Pratt, Marie-Michèle; Liang, Chun; Wang, Haiming; Kolychev, Dmitri S; Sun, Feng; Freeman, Robert; Sullivan, Robert; Pratt, Lee H

2004-01-01

The rapidly increasing rate at which biological data is being produced requires a corresponding growth in relational databases and associated tools that can help laboratories contend with that data. With this need in mind, we describe here a Modular Approach to a Genomic, Integrated and Comprehensive (MAGIC) Database. This Oracle 9i database derives from an initial focus in our laboratory on gene discovery via production and analysis of expressed sequence tags (ESTs), and subsequently on gene expression as assessed by both EST clustering and microarrays. The MAGIC Gene Discovery portion of the database focuses on information derived from DNA sequences and on its biological relevance. In addition to MAGIC SEQ-LIMS, which is designed to support activities in the laboratory, it contains several additional subschemas. The latter include MAGIC Admin for database administration, MAGIC Sequence for sequence processing as well as sequence and clone attributes, MAGIC Cluster for the results of EST clustering, MAGIC Polymorphism in support of microsatellite and single-nucleotide-polymorphism discovery, and MAGIC Annotation for electronic annotation by BLAST and BLAT. The MAGIC Microarray portion is a MIAME-compliant database with two components at present. These are MAGIC Array-LIMS, which makes possible remote entry of all information into the database, and MAGIC Array Analysis, which provides data mining and visualization. Because all aspects of interaction with the MAGIC Database are via a web browser, it is ideally suited not only for individual research laboratories but also for core facilities that serve clients at any distance.
Biochemical and genetic analysis of the yeast proteome with a movable ORF collection

PubMed Central

Gelperin, Daniel M.; White, Michael A.; Wilkinson, Martha L.; Kon, Yoshiko; Kung, Li A.; Wise, Kevin J.; Lopez-Hoyo, Nelson; Jiang, Lixia; Piccirillo, Stacy; Yu, Haiyuan; Gerstein, Mark; Dumont, Mark E.; Phizicky, Eric M.; Snyder, Michael; Grayhack, Elizabeth J.

2005-01-01

Functional analysis of the proteome is an essential part of genomic research. To facilitate different proteomic approaches, a MORF (moveable ORF) library of 5854 yeast expression plasmids was constructed, each expressing a sequence-verified ORF as a C-terminal ORF fusion protein, under regulated control. Analysis of 5573 MORFs demonstrates that nearly all verified ORFs are expressed, suggests the authenticity of 48 ORFs characterized as dubious, and implicates specific processes including cytoskeletal organization and transcriptional control in growth inhibition caused by overexpression. Global analysis of glycosylated proteins identifies 109 new confirmed N-linked and 345 candidate glycoproteins, nearly doubling the known yeast glycome. PMID:16322557

eRNA: a graphic user interface-based tool optimized for large data analysis from high-throughput RNA sequencing

PubMed Central

2014-01-01

Background: RNA sequencing (RNA-seq) is emerging as a critical approach in biological research. However, its high-throughput advantage is significantly limited by the capacity of bioinformatics tools. The research community urgently needs user-friendly tools to efficiently analyze the complicated data generated by high throughput sequencers. Results: We developed a standalone tool with graphic user interface (GUI)-based analytic modules, known as eRNA. The capacity of performing parallel processing and sample management facilitates large data analyses by maximizing hardware usage and freeing users from tediously handling sequencing data. The module “miRNA identification” includes GUIs for raw data reading, adapter removal, sequence alignment, and read counting. The module “mRNA identification” includes GUIs for reference sequences, genome mapping, transcript assembling, and differential expression. The module “Target screening” provides expression profiling analyses and graphic visualization. The module “Self-testing” offers the directory setups, sample management, and a check for third-party package dependency. Integration of other GUIs including Bowtie, miRDeep2, and miRspring extends the program’s functionality. Conclusions: eRNA focuses on the common tools required for the mapping and quantification analysis of miRNA-seq and mRNA-seq data. The software package provides an additional choice for scientists who require a user-friendly computing environment and high-throughput capacity for large data analysis. eRNA is available for free download at https://sourceforge.net/projects/erna/?source=directory. PMID:24593312
eRNA: a graphic user interface-based tool optimized for large data analysis from high-throughput RNA sequencing.

PubMed

Yuan, Tiezheng; Huang, Xiaoyi; Dittmar, Rachel L; Du, Meijun; Kohli, Manish; Boardman, Lisa; Thibodeau, Stephen N; Wang, Liang

2014-03-05

RNA sequencing (RNA-seq) is emerging as a critical approach in biological research. However, its high-throughput advantage is significantly limited by the capacity of bioinformatics tools. The research community urgently needs user-friendly tools to efficiently analyze the complicated data generated by high throughput sequencers. We developed a standalone tool with graphic user interface (GUI)-based analytic modules, known as eRNA. The capacity of performing parallel processing and sample management facilitates large data analyses by maximizing hardware usage and freeing users from tediously handling sequencing data. The module "miRNA identification" includes GUIs for raw data reading, adapter removal, sequence alignment, and read counting. The module "mRNA identification" includes GUIs for reference sequences, genome mapping, transcript assembling, and differential expression. The module "Target screening" provides expression profiling analyses and graphic visualization. The module "Self-testing" offers the directory setups, sample management, and a check for third-party package dependency. Integration of other GUIs including Bowtie, miRDeep2, and miRspring extends the program's functionality. eRNA focuses on the common tools required for the mapping and quantification analysis of miRNA-seq and mRNA-seq data. The software package provides an additional choice for scientists who require a user-friendly computing environment and high-throughput capacity for large data analysis. eRNA is available for free download at https://sourceforge.net/projects/erna/?source=directory.
Bacterial identification and subtyping using DNA microarray and DNA sequencing.

PubMed

Al-Khaldi, Sufian F; Mossoba, Magdi M; Allard, Marc M; Lienau, E Kurt; Brown, Eric D

2012-01-01

The era of fast and accurate discovery of biological sequence motifs in prokaryotic and eukaryotic cells is here. The co-evolution of direct genome sequencing and DNA microarray strategies not only will identify, isotype, and serotype pathogenic bacteria, but also it will aid in the discovery of new gene functions by detecting gene expressions in different diseases and environmental conditions. Microarray bacterial identification has made great advances in working with pure and mixed bacterial samples. The technological advances have moved beyond bacterial gene expression to include bacterial identification and isotyping. Application of new tools such as mid-infrared chemical imaging improves detection of hybridization in DNA microarrays. The research in this field is promising and future work will reveal the potential of infrared technology in bacterial identification. On the other hand, DNA sequencing by using 454 pyrosequencing is so cost effective that the promise of $1,000 per bacterial genome sequence is becoming a reality. Pyrosequencing technology is a simple to use technique that can produce accurate and quantitative analysis of DNA sequences with a great speed. The deposition of massive amounts of bacterial genomic information in databanks is creating fingerprint phylogenetic analysis that will ultimately replace several technologies such as Pulsed Field Gel Electrophoresis. In this chapter, we will review (1) the use of DNA microarray using fluorescence and infrared imaging detection for identification of pathogenic bacteria, and (2) use of pyrosequencing in DNA cluster analysis to fingerprint bacterial phylogenetic trees.
Molecular cloning and sequence analysis of stearoyl-CoA desaturase in milkfish, Chanos chanos.

PubMed

Hsieh, S L; Liao, W L; Kuo, C M

2001-12-01

Stearoyl-CoA desaturase (EC 1.14.99.5) is a key enzyme in the biosynthesis of polyunsaturated fatty acids and the maintenance of the homeoviscous fluidity of biological membranes. The stearoyl-CoA desaturase cDNA in milkfish (Chanos chanos) was cloned by RT-PCR and RACE, and it was compared with the stearoyl-CoA desaturase in cold-tolerant teleosts, common carp and grass carp. Nucleotide sequence analysis revealed that the cDNA clone has a 972-bp open reading frame encoding 323 amino acid residues. Alignments of the deduced amino acid sequence showed that the milkfish stearoyl-CoA desaturase shares 79% and 75% identity with common carp and grass carp, and 63%-64% with other vertebrates such as sheep, hamsters, rats, mice, and humans. Like common carp and grass carp, the deduced amino acid sequence in milkfish well conserves three histidine cluster motifs (one HXXXXH and two HXXHH) that are essential for catalysis of stearoyl-CoA desaturase activity. However, RT-PCR analysis showed that stearoyl-CoA desaturase expression in milkfish is detected in the tissues of liver, muscle, kidney, brain, and gill, and more expression sites were found in milkfish than in common carp and grass carp. Phylogenic relationships among the deduced stearoyl-CoA desaturase amino acid sequence in milkfish and those in other vertebrates showed that the milkfish stearoyl-CoA desaturase amino acid sequence is phylogenetically closer to those of common carp and grass carp than to other higher vertebrates.
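The ORF length and protein length reported in this record agree once the stop codon is counted. A short Python check is sketched below; it assumes the reported ORF length includes the terminal stop codon (an assumption, since the abstract does not say so), and the function name is hypothetical.

```python
def residues_from_orf(orf_length_bp, includes_stop_codon=True):
    """Number of encoded amino acid residues for an ORF of the given length."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_length_bp // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop_codon else codons


print(residues_from_orf(972))  # 972 bp = 324 codons, of which 323 encode residues
```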
An optimized protocol for generation and analysis of Ion Proton sequencing reads for RNA-Seq.

PubMed

Yuan, Yongxian; Xu, Huaiqian; Leung, Ross Ka-Kit

2016-05-26

Previous studies compared running cost, time and other performance measures of popular sequencing platforms. However, a comprehensive assessment of library construction and analysis protocols for the Proton sequencing platform is still lacking. Unlike reads from Illumina sequencing platforms, Proton reads are heterogeneous in length and quality. When sequencing data from different platforms are combined, this can result in reads of varying length. Whether commonly used software handles such data satisfactorily is unknown. With universal human reference RNA as the starting material, RNaseIII and chemical fragmentation methods in library construction gave similar results in the number of genes and junctions discovered and in the accuracy of expression level estimates. In contrast, sequencing quality, read length and the choice of software affected mapping rate to a much larger extent. Unspliced aligner TMAP attained the highest mapping rate (97.27 % to genome, 86.46 % to transcriptome), though 47.83 % of mapped reads were clipped. Long reads could paradoxically reduce mapping in junctions. With a reference annotation as a guide, the mapping rate of TopHat2 significantly increased from 75.79 to 92.09 %, especially for long (>150 bp) reads. Sailfish, a k-mer based gene expression quantifier, attained results highly consistent with those of the TaqMan array and the highest sensitivity. We provide, for the first time, reference statistics of library preparation methods, gene detection and quantification, and junction discovery for RNA-Seq by the Ion Proton platform. Chemical fragmentation performed as well as the enzyme-based method. The optimal Ion Proton sequencing options and analysis software have been evaluated.
Mining, identification and function analysis of microRNAs and target genes in peanut (Arachis hypogaea L.).

PubMed

Zhang, Tingting; Hu, Shuhao; Yan, Caixia; Li, Chunjuan; Zhao, Xiaobo; Wan, Shubo; Shan, Shihua

2017-02-01

In the present investigation, a total of 60 conserved peanut (Arachis hypogaea L.) microRNA (miRNA) sequences, belonging to 16 families, were identified using bioinformatics methods. A total of 392 target gene sequences were identified for 58 of the miRNAs using Target-align software and BLASTx analyses. Gene Ontology (GO) functional analysis suggested that these target genes were involved in mediating peanut growth and development, signal transduction and stress resistance. Fifty-five miRNA sequences were verified using a poly(A) tailing test, with a success rate of up to 91.67%. Twenty peanut target gene sequences were randomly selected, and the 5' rapid amplification of cDNA ends (5'-RACE) method was used to validate the cleavage sites of these target genes. Of these, 14 (70%) peanut miRNA targets were verified by means of gel electrophoresis, cloning and sequencing. Furthermore, functional analysis and homologous sequence retrieval were conducted for target gene sequences, and 26 target genes were chosen for experimental study of stress resistance. Real-time fluorescence quantitative PCR (qRT-PCR) was applied to measure the expression level of resistance-associated miRNAs and their target genes in peanut exposed to Aspergillus flavus (A. flavus) infection and drought stress, respectively. As a result, five miRNA-target groups were found to fit the model of miRNAs negatively regulating the expression of their target genes. This study preliminarily determined the biological functions of some resistance-associated miRNAs and their target genes in peanut. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
Purification, developmental expression, and in silico characterization of α-amylase inhibitor from Echinochloa frumentacea.

PubMed

Panwar, Priyankar; Verma, A K; Dubey, Ashutosh

2018-05-01

Barnyard (Echinochloa frumentacea) and finger (Eleusine coracana) millet growing in the northwestern Himalaya were explored for the α-amylase inhibitor (α-AI). The mature seeds of barnyard millet variety PRJ1 had the maximum α-AI activity, which increased over the developmental stages. α-AI was purified up to 22.25-fold from barnyard millet variety PRJ1. Semi-quantitative PCR of different developmental stages of barnyard millet seeds showed increased levels of the transcript from 7 to 28 days. Sequence analysis revealed a 315-bp nucleotide sequence encoding a 104-amino-acid sequence with a molecular weight of 10.72 kDa. The predicted 3D structure of α-AI was 86.73% similar to a bifunctional inhibitor of ragi. In silico analysis of 71 α-AI protein sequences was carried out for biochemical features, homology search, multiple sequence alignment, phylogenetic tree construction, motif, and superfamily distribution of protein sequences. Analysis of multiple sequence alignment revealed the existence of conserved regions NPLP[S/G]CRWYVV[S/Q][Q/R]TCG[V/I] throughout sequences. Superfam analysis revealed that α-AI protein sequences were distributed among seven different superfamilies.
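The conserved region in this record is written with bracketed alternatives such as [S/G], which map directly onto regular-expression character classes. The Python sketch below converts the motif and scans a sequence for it. The test sequences are invented placeholders rather than real α-AI sequences, and the helper names are hypothetical.

```python
import re

# Conserved region as written in the abstract.
CONSERVED_MOTIF = "NPLP[S/G]CRWYVV[S/Q][Q/R]TCG[V/I]"


def motif_to_regex(motif):
    """Turn bracketed alternatives like [S/G] into regex classes like [SG]."""
    pattern = re.sub(
        r"\[([A-Z/]+)\]",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        motif,
    )
    return re.compile(pattern)


def find_motif(sequence, pattern):
    """Return (0-based start, matched text) for every motif occurrence."""
    return [(m.start(), m.group()) for m in pattern.finditer(sequence)]


pattern = motif_to_regex(CONSERVED_MOTIF)

# Invented placeholder sequences, used only to exercise the pattern.
with_motif = "MKT" + "NPLPSCRWYVVSQTCGV" + "AAA"
without_motif = "MKT" + "NPLPACRWYVVSQTCGV" + "AAA"

print(pattern.pattern)
print(find_motif(with_motif, pattern))
print(find_motif(without_motif, pattern))
```

Exact matching is the simplest reading of the motif. Tolerating mismatches or gaps would need a profile-based search instead, which is outside this sketch.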
Preparation of highly multiplexed small RNA sequencing libraries.

PubMed

Persson, Helena; Søkilde, Rolf; Pirona, Anna Chiara; Rovira, Carlos

2017-08-01

MicroRNAs (miRNAs) are ~22-nucleotide-long small non-coding RNAs that regulate the expression of protein-coding genes by base pairing to partially complementary target sites, preferentially located in the 3′ untranslated region (UTR) of target mRNAs. The expression and function of miRNAs have been extensively studied in human disease, as well as the possibility of using these molecules as biomarkers for prognostication and treatment guidance. To identify and validate miRNAs as biomarkers, their expression must be screened in large collections of patient samples. Here, we develop a scalable protocol for the rapid and economical preparation of a large number of small RNA sequencing libraries using dual indexing for multiplexing. Combined with the use of off-the-shelf reagents, more samples can be sequenced simultaneously on large-scale sequencing platforms at a considerably lower cost per sample. Sample preparation is simplified by pooling libraries prior to gel purification, which allows for the selection of a narrow size range while minimizing sample variation. A comparison with publicly available data from benchmarking of miRNA analysis platforms showed that this method captures absolute and differential expression as effectively as commercially available alternatives.
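Dual indexing tags each library with a pair of index sequences, so n indexes of one kind combined with m of the other can distinguish up to n × m libraries in one pool. The Python sketch below shows the general idea of building such a sample sheet and assigning reads by exact index match. It is not the protocol from this record; the index sequences, library names, and read tuples are invented for illustration.

```python
from collections import defaultdict
from itertools import product

# Hypothetical index sets; the sequences are placeholders, not the protocol's.
I7_INDEXES = {"A": "ATCACG", "B": "CGATGT"}
I5_INDEXES = {"1": "TTAGGC", "2": "TGACCA"}


def build_sample_sheet(i7_indexes, i5_indexes):
    """Map every (i7 sequence, i5 sequence) pair to a library name."""
    return {
        (seq7, seq5): f"lib_{name7}{name5}"
        for (name7, seq7), (name5, seq5) in product(i7_indexes.items(), i5_indexes.items())
    }


def demultiplex(reads, sample_sheet):
    """Group read inserts by library; unknown index pairs go to 'undetermined'."""
    bins = defaultdict(list)
    for i7_seq, i5_seq, insert in reads:
        library = sample_sheet.get((i7_seq, i5_seq), "undetermined")
        bins[library].append(insert)
    return dict(bins)


sheet = build_sample_sheet(I7_INDEXES, I5_INDEXES)

# Each read is (i7 index read, i5 index read, insert label); all values invented.
reads = [
    ("ATCACG", "TTAGGC", "insert_1"),
    ("CGATGT", "TGACCA", "insert_2"),
    ("ATCACG", "TTAGGC", "insert_3"),
    ("GGGGGG", "TTAGGC", "insert_4"),  # i7 not in the sheet
]
bins = demultiplex(reads, sheet)
print(len(sheet), "libraries from", len(I7_INDEXES), "x", len(I5_INDEXES), "indexes")
print(bins)
```

Production demultiplexers usually tolerate one or more index mismatches; exact matching is used here only to keep the sketch short.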
The organization and expression of the mdm2 gene.

PubMed

de Oca Luna, R M; Tabor, A D; Eberspaecher, H; Hulboy, D L; Worth, L L; Colman, M S; Finlay, C A; Lozano, G

1996-05-01

The mdm2 gene encodes a zinc finger protein that negatively regulates p53 function by binding and masking the p53 transcriptional activation domain. Two different promoters control expression of mdm2, one of which is also transactivated by p53. We cloned and characterized the mdm2 gene from a murine 129 library. It contained at least 12 exons and spanned approximately 25 kb of DNA. Sequencing of the mdm2 gene revealed three nucleotide differences that resulted in amino acid substitutions in the previously published mdm2 sequence. Sequencing of normal BalbC/J DNA and the original cosmid clone isolated from the 3T3DM cell line revealed that they are identical, suggesting that the published sequence is in error at these three positions. In addition, we analyzed the expression pattern of mdm2 and found ubiquitous low-level expression throughout embryo development and in adult tissues. Analysis of mRNA from numerous tissues for several mdm2 spliced variants that had been identified in the transformed 3T3DM cell line revealed that these variants could not be detected in the developing embryo or in adult tissues.
Radiosensitivity in HeLa cervical cancer cells overexpressing glutathione S-transferase π 1

PubMed Central

YANG, LIANG; LIU, REN; MA, HONG-BIN; YING, MING-ZHEN; WANG, YA-JIE

2015-01-01

The aims of the present study were to investigate the effect of overexpressed exogenous glutathione S-transferase π 1 (GSTP1) gene on the radiosensitivity of the HeLa human cervical cancer cell line and conduct a preliminary investigation into the underlying mechanisms of the effect. The full-length sequence of human GSTP1 was obtained by performing a polymerase chain reaction (PCR) using primers based on the GenBank sequence of GSTP1. Subsequently, the gene was cloned into a recombinant eukaryotic expression plasmid, and the resulting construct was confirmed by restriction analysis and DNA sequencing. A HeLa cell line that was stably expressing high levels of GSTP1 was obtained through stable transfection of the constructed plasmids using lipofectamine and screening for G418 resistance, as demonstrated by reverse transcription-PCR. Using the transfected HeLa cells, a colony formation assay was conducted to detect the influence of GSTP1 overexpression on the cell radiosensitivity. Furthermore, flow cytometry was used to investigate the effect of GSTP1 overexpression on cell cycle progression, with the protein expression levels of the cell cycle regulating factor cyclin B1 detected using western blot analysis. Colony formation and G2/M phase arrest in the GSTP1-expressing cells were significantly increased compared with the control group (P<0.01). In addition, the expression of cyclin B1 was significantly reduced in the GSTP1-expressing cells. These results demonstrated that increased expression of GSTP1 inhibits radiosensitivity in HeLa cells. The mechanism underlying this effect may be associated with the ability of the GSTP1 protein to reduce cyclin B1 expression, resulting in significant G2/M phase arrest. PMID:26622693
Radiosensitivity in HeLa cervical cancer cells overexpressing glutathione S-transferase π 1.

PubMed

Yang, Liang; Liu, Ren; Ma, Hong-Bin; Ying, Ming-Zhen; Wang, Ya-Jie

2015-09-01

The aims of the present study were to investigate the effect of overexpressed exogenous glutathione S-transferase π 1 (GSTP1) gene on the radiosensitivity of the HeLa human cervical cancer cell line and conduct a preliminary investigation into the underlying mechanisms of the effect. The full-length sequence of human GSTP1 was obtained by performing a polymerase chain reaction (PCR) using primers based on the GenBank sequence of GSTP1. Subsequently, the gene was cloned into a recombinant eukaryotic expression plasmid, and the resulting construct was confirmed by restriction analysis and DNA sequencing. A HeLa cell line that was stably expressing high levels of GSTP1 was obtained through stable transfection of the constructed plasmids using lipofectamine and screening for G418 resistance, as demonstrated by reverse transcription-PCR. Using the transfected HeLa cells, a colony formation assay was conducted to detect the influence of GSTP1 overexpression on the cell radiosensitivity. Furthermore, flow cytometry was used to investigate the effect of GSTP1 overexpression on cell cycle progression, with the protein expression levels of the cell cycle regulating factor cyclin B1 detected using western blot analysis. Colony formation and G2/M phase arrest in the GSTP1-expressing cells were significantly increased compared with the control group (P<0.01). In addition, the expression of cyclin B1 was significantly reduced in the GSTP1-expressing cells. These results demonstrated that increased expression of GSTP1 inhibits radiosensitivity in HeLa cells. The mechanism underlying this effect may be associated with the ability of the GSTP1 protein to reduce cyclin B1 expression, resulting in significant G2/M phase arrest.
Cloning, annotation and expression analysis of mycoparasitism-related genes in Trichoderma harzianum 88.

PubMed

Yao, Lin; Yang, Qian; Song, Jinzhu; Tan, Chong; Guo, Changhong; Wang, Li; Qu, Lianhai; Wang, Yun

2013-04-01

Trichoderma harzianum 88, a filamentous soil fungus, is an effective biocontrol agent against several plant pathogens. High-throughput sequencing was used here to study the mycoparasitism mechanisms of T. harzianum 88. Plate confrontation tests of T. harzianum 88 against plant pathogens were conducted, and a cDNA library was constructed from T. harzianum 88 mycelia in the presence of plant pathogen cell walls. Randomly selected transcripts from the cDNA library were compared with eukaryotic plant and fungal genomes. Of the 1,386 transcripts sequenced, the most abundant Gene Ontology (GO) classification group was "physiological process". Differential expression of 19 genes was confirmed by real-time RT-PCR at different mycoparasitism stages against plant pathogens. Gene expression analysis revealed the transcription of various genes involved in mycoparasitism of T. harzianum 88. Our study provides helpful insights into the mechanisms of T. harzianum 88-plant pathogen interactions.
Resources and Recommendations for Using Transcriptomics to Address Grand Challenges in Comparative Biology

PubMed Central

Mykles, Donald L.; Burnett, Karen G.; Durica, David S.; Joyce, Blake L.; McCarthy, Fiona M.; Schmidt, Carl J.; Stillman, Jonathon H.

2016-01-01

High-throughput RNA sequencing (RNA-seq) technology has become an important tool for studying physiological responses of organisms to changes in their environment. De novo assembly of RNA-seq data has allowed researchers to create a comprehensive catalog of genes expressed in a tissue and to quantify their expression without a complete genome sequence. The contributions from the “Tapping the Power of Crustacean Transcriptomics to Address Grand Challenges in Comparative Biology” symposium in this issue show the successes and limitations of using RNA-seq in the study of crustaceans. In conjunction with the symposium, the Animal Genome to Phenome Research Coordination Network collated comments from participants at the meeting regarding the challenges encountered when using transcriptomics in their research. Input came from novices and experts ranging from graduate students to principal investigators. Many were unaware of the bioinformatics analysis resources currently available on the CyVerse platform. Our analysis of community responses led to three recommendations for advancing the field: (1) integration of genomic and RNA-seq sequence assemblies for crustacean gene annotation and comparative expression; (2) development of methodologies for the functional analysis of genes; and (3) information and training exchange among laboratories for transmission of best practices. The field lacks the methods for manipulating tissue-specific gene expression. The decapod crustacean research community should consider the cherry shrimp, Neocaridina denticulata, as a decapod model for the application of transgenic tools for functional genomics. This would require a multi-investigator effort. PMID:27639274
RNA sequencing and pathway analysis identify tumor necrosis factor alpha driven small proline-rich protein dysregulation in chronic rhinosinusitis.

PubMed

Ramakrishnan, Vijay R; Gonzalez, Joseph R; Cooper, Sarah E; Barham, Henry P; Anderson, Catherine B; Larson, Eric D; Cool, Carlyne D; Diller, John D; Jones, Kenneth; Kinnamon, Sue C

2017-09-01

Chronic rhinosinusitis (CRS) is a heterogeneous inflammatory disorder in which many pathways contribute to end-organ disease. Small proline-rich proteins (SPRR) are polypeptides that have recently been shown to contribute to epithelial biomechanical properties relevant in T-helper type 2 inflammation. There is evidence that genetic polymorphism in SPRR genes may predict the development of asthma in children with atopy and, correlatively, that expression of SPRRs is increased under allergic conditions, which leads to epithelial barrier dysfunction in atopic disease. RNAs from uncinate tissue specimens from patients with CRS and control subjects were compared by RNA sequencing by using Ingenuity Pathway Analysis (n = 4 each), and quantitative polymerase chain reaction (PCR) (n = 15). A separate cohort of archived sinus tissue was examined by immunohistochemistry (n = 19). A statistically significant increase of SPRR expression in CRS sinus tissue was identified that was not a result of atopic presence. SPRR1 and SPRR2A expressions were markedly increased in patients with CRS (p < 0.01) on RNA sequencing, with confirmation by using real-time PCR. Immunohistochemistry of archived surgical samples demonstrated staining of SPRR proteins within squamous epithelium of both groups. Pathway analysis indicated tumor necrosis factor (TNF) alpha as a master regulator of the SPRR gene products. Expression of SPRR1 and of SPRR2A is increased in mucosal samples from patients with CRS and appeared as a downstream result of TNF alpha modulation, which possibly resulted in epithelial barrier dysfunction.
Regulation of the alpha-glucuronidase-encoding gene (aguA) from Aspergillus niger.

PubMed

de Vries, R P; van de Vondervoort, P J I; Hendriks, L; van de Belt, M; Visser, J

2002-09-01

The alpha-glucuronidase gene aguA from Aspergillus niger was cloned and characterised. Analysis of the promoter region of aguA revealed the presence of four putative binding sites for the major carbon catabolite repressor protein CREA and one putative binding site for the transcriptional activator XLNR. In addition, a sequence motif was detected which differed only in the last nucleotide from the XLNR consensus site. A construct in which part of the aguA coding region was deleted still resulted in production of a stable mRNA upon transformation of A. niger. The putative XLNR binding sites and two of the putative CREA binding sites were mutated individually in this construct and the effects on expression were examined in A. niger transformants. Northern analysis of the transformants revealed that the consensus XLNR site is not actually functional in the aguA promoter, whereas the sequence that diverges from the consensus at a single position is functional. This indicates that XLNR is also able to bind to the sequence GGCTAG, and the XLNR binding site consensus should therefore be changed to GGCTAR. Both CREA sites are functional, indicating that CREA has a strong influence on aguA expression. A detailed expression analysis of aguA in four genetic backgrounds revealed a second regulatory system involved in activation of aguA gene expression. This system responds to the presence of glucuronic and galacturonic acids, and is not dependent on XLNR.
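The revised consensus GGCTAR uses the IUPAC code R (A or G), so it covers both GGCTAA and the GGCTAG variant described in the abstract. The following is a minimal Python sketch of how such a degenerate consensus could be scanned for in a promoter sequence. The function names and the promoter fragment are invented for illustration and are not taken from the study; only the given strand is scanned.

```python
import re
from typing import List, Tuple

# IUPAC nucleotide codes needed for this consensus; R means purine (A or G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]"}

def consensus_to_pattern(consensus: str) -> str:
    """Translate an IUPAC consensus such as GGCTAR into a regex fragment."""
    return "".join(IUPAC[base] for base in consensus.upper())

def find_sites(sequence: str, consensus: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched text) for each hit on the given strand."""
    # A lookahead is used so that overlapping hits would not be skipped.
    lookahead = re.compile("(?=(" + consensus_to_pattern(consensus) + "))")
    return [(m.start(), m.group(1)) for m in lookahead.finditer(sequence.upper())]

# Hypothetical promoter fragment, invented for illustration only.
promoter = "TTGGCTAAACCGGCTAGTTGGCTATCC"
sites = find_sites(promoter, "GGCTAR")
print(sites)
```

In this made-up fragment the scan reports GGCTAA and GGCTAG but not GGCTAT, which is the behaviour the R code implies. A real analysis would also scan the reverse complement.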
High-throughput sequencing of small RNAs and analysis of differentially expressed microRNAs associated with pistil development in Japanese apricot

PubMed Central

2012-01-01

Background MicroRNAs (miRNAs) are a class of endogenous, small, non-coding RNAs that regulate gene expression by mediating gene silencing at transcriptional and post-transcriptional levels in higher plants. However, the diversity of miRNAs and their roles in floral development in Japanese apricot (Prunus mume Sieb. et Zucc) remain largely unexplored. Imperfect flowers with pistil abortion seriously decrease production yields. To understand the role of miRNAs in pistil development, pistil development-related miRNAs were identified by Solexa sequencing in Japanese apricot. Results Solexa sequencing was used to identify and quantitatively profile small RNAs from perfect and imperfect flower buds of Japanese apricot. A total of 22,561,972 and 24,952,690 reads were sequenced from two small RNA libraries constructed from perfect and imperfect flower buds, respectively. Sixty-one known miRNAs, belonging to 24 families, were identified. Comparative profiling revealed that seven known miRNAs exhibited significant differential expression between perfect and imperfect flower buds. A total of 61 potentially novel miRNAs/new members of known miRNA families were also identified by the presence of mature miRNAs and corresponding miRNA*s in the sRNA libraries. Comparative analysis showed that six potentially novel miRNAs were differentially expressed between perfect and imperfect flower buds. Target predictions of the 13 differentially expressed miRNAs resulted in 212 target genes. Gene ontology (GO) annotation revealed that high-ranking miRNA target genes are those implicated in the developmental process, the regulation of transcription and response to stress. Conclusions This study represents the first comparative identification of miRNAomes between perfect and imperfect Japanese apricot flowers. Seven known miRNAs and six potentially novel miRNAs associated with pistil development were identified, using high-throughput sequencing of small RNAs. The findings, both computational and experimental, provide valuable information for further functional characterisation of miRNAs associated with pistil development in plants. PMID:22863067
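Because the two small RNA libraries differ in depth, raw miRNA counts have to be scaled before they are compared. The abstract does not say which normalization was used, so the Python sketch below assumes a simple reads-per-million (RPM) scaling with a pseudocount. The library sizes are those reported above; the function names and the example counts are hypothetical.

```python
import math

# Total reads per library, as reported in the abstract.
LIBRARY_SIZES = {"perfect": 22_561_972, "imperfect": 24_952_690}

def reads_per_million(count: int, library_size: int) -> float:
    """Scale a raw read count to reads per million sequenced reads."""
    return count * 1_000_000 / library_size

def log2_fold_change(count_a: int, size_a: int,
                     count_b: int, size_b: int,
                     pseudocount: float = 1.0) -> float:
    """log2(RPM_b / RPM_a); the pseudocount (an assumption) avoids division by zero."""
    rpm_a = reads_per_million(count_a, size_a) + pseudocount
    rpm_b = reads_per_million(count_b, size_b) + pseudocount
    return math.log2(rpm_b / rpm_a)

# Hypothetical raw counts for a single miRNA in each library.
example = log2_fold_change(1200, LIBRARY_SIZES["perfect"],
                           300, LIBRARY_SIZES["imperfect"])
print(round(example, 2))
```

A negative value here would indicate lower normalized abundance in the imperfect-flower library. Calling a difference significant would require a statistical test on top of this scaling.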
Mining genes involved in insecticide resistance of Liposcelis bostrychophila Badonnel by transcriptome and expression profile analysis.

PubMed

Dou, Wei; Shen, Guang-Mao; Niu, Jin-Zhi; Ding, Tian-Bo; Wei, Dan-Dan; Wang, Jin-Jun

2013-01-01

Recent studies indicate that infestations of psocids pose a new risk for global food security. Among the psocids species, Liposcelis bostrychophila Badonnel has gained recognition in importance because of its parthenogenic reproduction, rapid adaptation, and increased worldwide distribution. To date, the molecular data available for L. bostrychophila is largely limited to genes identified through homology. Also, no transcriptome data relevant to psocids infection is available. In this study, we generated de novo assembly of L. bostrychophila transcriptome performed through the short read sequencing technology (Illumina). In a single run, we obtained more than 51 million sequencing reads that were assembled into 60,012 unigenes (mean size = 711 bp) by Trinity. The transcriptome sequences from different developmental stages of L. bostrychophila including egg, nymph and adult were annotated with non-redundant (Nr) protein database, gene ontology (GO), cluster of orthologous groups of proteins (COG), and KEGG orthology (KO). The analysis revealed three major enzyme families involved in insecticide metabolism as differentially expressed in the L. bostrychophila transcriptome. A total of 49 P450-, 31 GST- and 21 CES-specific genes representing the three enzyme families were identified. Besides, 16 transcripts were identified to contain target site sequences of resistance genes. Furthermore, we profiled gene expression patterns upon insecticide (malathion and deltamethrin) exposure using the tag-based digital gene expression (DGE) method. The L. bostrychophila transcriptome and DGE data provide gene expression data that would further our understanding of molecular mechanisms in psocids. In particular, the findings of this investigation will facilitate identification of genes involved in insecticide resistance and designing of new compounds for control of psocids.
Mining Genes Involved in Insecticide Resistance of Liposcelis bostrychophila Badonnel by Transcriptome and Expression Profile Analysis

PubMed Central

Dou, Wei; Shen, Guang-Mao; Niu, Jin-Zhi; Ding, Tian-Bo; Wei, Dan-Dan; Wang, Jin-Jun

2013-01-01

Background Recent studies indicate that infestations of psocids pose a new risk for global food security. Among the psocids species, Liposcelis bostrychophila Badonnel has gained recognition in importance because of its parthenogenic reproduction, rapid adaptation, and increased worldwide distribution. To date, the molecular data available for L. bostrychophila is largely limited to genes identified through homology. Also, no transcriptome data relevant to psocids infection is available. Methodology and Principal Findings In this study, we generated de novo assembly of L. bostrychophila transcriptome performed through the short read sequencing technology (Illumina). In a single run, we obtained more than 51 million sequencing reads that were assembled into 60,012 unigenes (mean size = 711 bp) by Trinity. The transcriptome sequences from different developmental stages of L. bostrychophila including egg, nymph and adult were annotated with non-redundant (Nr) protein database, gene ontology (GO), cluster of orthologous groups of proteins (COG), and KEGG orthology (KO). The analysis revealed three major enzyme families involved in insecticide metabolism as differentially expressed in the L. bostrychophila transcriptome. A total of 49 P450-, 31 GST- and 21 CES-specific genes representing the three enzyme families were identified. Besides, 16 transcripts were identified to contain target site sequences of resistance genes. Furthermore, we profiled gene expression patterns upon insecticide (malathion and deltamethrin) exposure using the tag-based digital gene expression (DGE) method. Conclusion The L. bostrychophila transcriptome and DGE data provide gene expression data that would further our understanding of molecular mechanisms in psocids. In particular, the findings of this investigation will facilitate identification of genes involved in insecticide resistance and designing of new compounds for control of psocids. PMID:24278202
Expressed sequence tag analysis of guinea pig (Cavia porcellus) eye tissues for NEIBank

PubMed Central

Simpanya, Mukoma F.; Wistow, Graeme; Gao, James; David, Larry L.; Giblin, Frank J.

2008-01-01

Purpose To characterize gene expression patterns in guinea pig ocular tissues and identify orthologs of human genes from NEIBank expressed sequence tags. Methods RNA was extracted from dissected eye tissues of 2.5-month-old guinea pigs to make three unamplified and unnormalized cDNA libraries in the pCMVSport-6 vector for the lens, retina, and eye minus lens and retina. Over 4,000 clones were sequenced from each library and were analyzed using GRIST for clustering and gene identification. Lens crystallin EST data were validated using two-dimensional electrophoresis (2-DE), matrix assisted laser desorption (MALDI), and electrospray ionization mass spectrometry (ESIMS). Results Combined data from the three libraries generated a total of 6,694 distinctive gene clusters, with each library having between 1,000 and 3,000 clusters. Approximately 60% of the total gene clusters were novel cDNA sequences and had significant homologies to other mammalian sequences in GenBank. Complete cDNA sequences were obtained for many guinea pig lens proteins, including αA/αAinsert-, γN-, and γS-crystallins, lengsin and GRIFIN. The ratio of αA- to αB-crystallin on 2-DE gels was 8:1 in the lens nucleus and 6.5:1 in the cortex. Analysis of ESTs, genome sequence, and proteins (by MALDI) did not reveal any evidence for the presence of γD-, γE-, and γF-crystallin in the guinea pig. Predicted masses of many guinea pig lens crystallins were confirmed by ESIMS analysis. For the retina, orthologs of human phototransduction genes were found, such as Rhodopsin, S-antigen (Sag, Arrestin), and Transducin. The guinea-pig ortholog of NRL, a key rod photoreceptor-specific transcription factor, was also represented in EST data. In the ‘rest-of-eye’ library, the most abundant transcripts included decorin and keratin 12, representative of the cornea. Conclusions Genomic analysis of guinea pig eye tissues provides sequence-verified clones for future studies. Guinea pig orthologs of many human eye specific genes were identified. Guinea pig gene structures were similar to their human and rodent gene counterparts. Surprisingly, no orthologs of γD-, γE-, and γF-crystallin were found in EST, proteomic, or the current guinea pig genome data. PMID:19104676
Informatic and genomic analysis of melanocyte cDNA libraries as a resource for the study of melanocyte development and function.

PubMed

Baxter, Laura L; Hsu, Benjamin J; Umayam, Lowell; Wolfsberg, Tyra G; Larson, Denise M; Frith, Martin C; Kawai, Jun; Hayashizaki, Yoshihide; Carninci, Piero; Pavan, William J

2007-06-01

As part of the RIKEN mouse encyclopedia project, two cDNA libraries were prepared from melanocyte-derived cell lines, using techniques of full-length clone selection and subtraction/normalization to enrich for rare transcripts. End sequencing showed that these libraries display over 83% complete coding sequence at the 5' end and 96-97% complete coding sequence at the 3' end. Evaluation of the libraries, derived from B16F10Y tumor cells and melan-c cells, revealed that they contain clones for a majority of the genes previously demonstrated to function in melanocyte biology. Analysis of genomic locations for transcripts revealed that the distribution of melanocyte genes is non-random throughout the genome. Three genomic regions identified that showed significant clustering of melanocyte-expressed genes contain one or more genes previously shown to regulate melanocyte development or function. A catalog of genes expressed in these libraries is presented, providing a valuable resource of cDNA clones and sequence information that can be used for identification of new genes important for melanocyte development, function, and disease.

Accounting for technical noise in differential expression analysis of single-cell RNA sequencing data.

PubMed

Jia, Cheng; Hu, Yu; Kelly, Derek; Kim, Junhyong; Li, Mingyao; Zhang, Nancy R

2017-11-02

Recent technological breakthroughs have made it possible to measure RNA expression at the single-cell level, thus paving the way for exploring expression heterogeneity among individual cells. Current single-cell RNA sequencing (scRNA-seq) protocols are complex and introduce technical biases that vary across cells, which can bias downstream analysis without proper adjustment. To account for cell-to-cell technical differences, we propose a statistical framework, TASC (Toolkit for Analysis of Single Cell RNA-seq), an empirical Bayes approach to reliably model the cell-specific dropout rates and amplification bias by use of external RNA spike-ins. TASC incorporates the technical parameters, which reflect cell-to-cell batch effects, into a hierarchical mixture model to estimate the biological variance of a gene and detect differentially expressed genes. More importantly, TASC is able to adjust for covariates to further eliminate confounding that may originate from cell size and cell cycle differences. In simulation and real scRNA-seq data, TASC achieves accurate Type I error control and displays competitive sensitivity and improved robustness to batch effects in differential expression analysis, compared to existing methods. TASC is programmed to be computationally efficient, taking advantage of multi-threaded parallelization. We believe that TASC will provide a robust platform for researchers to leverage the power of scRNA-seq. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Accounting for technical noise in differential expression analysis of single-cell RNA sequencing data

PubMed Central

Jia, Cheng; Hu, Yu; Kelly, Derek; Kim, Junhyong

2017-01-01

Abstract Recent technological breakthroughs have made it possible to measure RNA expression at the single-cell level, thus paving the way for exploring expression heterogeneity among individual cells. Current single-cell RNA sequencing (scRNA-seq) protocols are complex and introduce technical biases that vary across cells, which can bias downstream analysis without proper adjustment. To account for cell-to-cell technical differences, we propose a statistical framework, TASC (Toolkit for Analysis of Single Cell RNA-seq), an empirical Bayes approach to reliably model the cell-specific dropout rates and amplification bias by use of external RNA spike-ins. TASC incorporates the technical parameters, which reflect cell-to-cell batch effects, into a hierarchical mixture model to estimate the biological variance of a gene and detect differentially expressed genes. More importantly, TASC is able to adjust for covariates to further eliminate confounding that may originate from cell size and cell cycle differences. In simulation and real scRNA-seq data, TASC achieves accurate Type I error control and displays competitive sensitivity and improved robustness to batch effects in differential expression analysis, compared to existing methods. TASC is programmed to be computationally efficient, taking advantage of multi-threaded parallelization. We believe that TASC will provide a robust platform for researchers to leverage the power of scRNA-seq. PMID:29036714
Identification and expression analysis of an olfactory receptor gene family in green plant bug Apolygus lucorum (Meyer-Dür)

PubMed Central

An, Xing-Kui; Sun, Liang; Liu, Hang-Wei; Liu, Dan-Feng; Ding, Yu-Xiao; Li, Le-Mei; Zhang, Yong-Jun; Guo, Yu-Yuan

2016-01-01

Olfactory receptors are believed to play a central role in insect host-seeking, mating, and ovipositing. On the basis of male and female antennal transcriptomes of adult Apolygus lucorum, a total of 110 candidate A. lucorum odorant receptors (AlucOR) were identified in this study, including five previously annotated AlucORs. All the sequences were validated by cloning and sequencing. Tissue expression profile analysis by RT-PCR indicated that most AlucORs were highly expressed in the antennae. The qPCR measurements further revealed that 40 AlucORs were expressed at significantly higher levels in the antennae. One AlucOR was primarily expressed in the female antennae, while nine AlucORs exhibited male-biased expression patterns. Additionally, both the RPKM values and RT-qPCR analysis showed that AlucOR83 and AlucOR21 were much more abundant in male antennae than in female antennae, suggesting that they play different roles in chemoreception between the sexes. Phylogenetic analysis of ORs from several Hemipteran species demonstrated that most AlucORs had orthologous genes, and five AlucOR-specific clades were defined. In addition, a sub-clade of potential male-based sex pheromone receptors was also identified in the phylogenetic tree of AlucORs. Our results will facilitate the functional studies of AlucORs, and thereby provide a foundation for novel pest management approaches based on these genes. PMID:27892490
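RPKM (reads per kilobase of transcript per million mapped reads) is the abundance measure mentioned in the abstract. As a reminder of what the value represents, here is a minimal Python sketch of the standard formula. The function name and the example numbers are hypothetical and do not come from the study.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = reads * 1e9 / (total mapped reads * transcript length in bp)
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return read_count * 1e9 / (total_mapped_reads * gene_length_bp)

# Hypothetical values: 1,000 reads on a 2,000 bp transcript in a
# library of 10 million mapped reads.
value = rpkm(1_000, 2_000, 10_000_000)
print(value)
```

Dividing by both transcript length and library size is what makes values comparable between receptors of different lengths and between male and female libraries of different depths.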
Suppressive subtractive hybridization approach revealed differential expression of hypersensitive response and reactive oxygen species production genes in tea (Camellia sinensis (L.) O. Kuntze) leaves during Pestalotiopsis thea infection.

PubMed

Senthilkumar, Palanisamy; Thirugnanasambantham, Krishnaraj; Mandal, Abul Kalam Azad

2012-12-01

Tea (Camellia sinensis (L.) O. Kuntze) is an economically important plant cultivated for its leaves. Infection of Pestalotiopsis theae in leaves causes gray blight disease and enormous loss to the tea industry. We used suppressive subtractive hybridization (SSH) technique to unravel the differential gene expression pattern during gray blight disease development in tea. Complementary DNA from P. theae-infected and uninfected leaves of disease tolerant cultivar UPASI-10 was used as tester and driver populations respectively. Subtraction efficiency was confirmed by comparing abundance of β-actin gene. A total of 377 and 720 clones with insert size >250 bp from forward and reverse library respectively were sequenced and analyzed. Basic Local Alignment Search Tool analysis revealed 17 sequences in forward SSH library have high degree of similarity with disease and hypersensitive response related genes and 20 sequences with hypothetical proteins while in reverse SSH library, 23 sequences have high degree of similarity with disease and stress response-related genes and 15 sequences with hypothetical proteins. Functional analysis indicated unknown (61 and 59 %) or hypothetical functions (23 and 18 %) for most of the differentially regulated genes in forward and reverse SSH library, respectively, while others have important role in different cellular activities. Majority of the upregulated genes are related to hypersensitive response and reactive oxygen species production. Based on these expressed sequence tag data, putative role of differentially expressed genes were discussed in relation to disease. We also demonstrated the efficiency of SSH as a tool in enriching gray blight disease related up- and downregulated genes in tea. The present study revealed that many genes related to disease resistance were suppressed during P. theae infection and enhancing these genes by the application of inducers may impart better disease tolerance to the plants.
Molecular analysis of two phytohemagglutinin genes and their expression in Phaseolus vulgaris cv. Pinto, a lectin-deficient cultivar of the bean

PubMed Central

Voelker, Toni A.; Staswick, Paul; Chrispeels, Maarten J.

1986-01-01

Phytohemagglutinin (PHA), the seed lectin of the common bean, Phaseolus vulgaris, is encoded by two highly homologous, tandemly linked genes, dlec1 and dlec2, which are coordinately expressed at high levels in developing cotyledons. Their respective transcripts translate into closely related polypeptides, PHA-E and PHA-L, constituents of the tetrameric lectin which accumulates at high levels in developing seeds. In the bean cultivar Pinto UI111, PHA-E is not detectable, and PHA-L accumulates at very reduced levels. To investigate the cause of the Pinto phenotype, we cloned and sequenced the two PHA genes of Pinto, called Pdlec1 and Pdlec2, and determined the abundance of their respective mRNAs in developing cotyledons. Both genes are more than 90% homologous to the normal PHA genes found in other cultivars. Pdlec1 carries a 1-bp frameshift mutation close to the 5' end of its coding sequence. Only very truncated polypeptides could be made from its mRNA. The gene Pdlec2 encodes a polypeptide, which resembles PHA-L and its predicted amino acid sequence agrees with the available Pinto PHA amino acid sequence data. Analysis of the mRNA of developing cotyledons revealed that the Pdlec1 message is reduced 600-fold, and Pdlec2 mRNA is reduced 20-fold with respect to mRNA levels in normal cultivars. A comparison of the sequences which are upstream from the coding sequence shows that Pdlec2 has a 100-bp deletion compared to the other genes (dlec1, dlec2 and Pdlec1). This deletion which contains a large tandem repeat may be responsible for the low level of expression of Pdlec2. The very low expression of Pdlec1 is as yet unexplained. PMID:16453730
Expressed sequence tag analysis of human RPE/choroid for the NEIBank Project: over 6000 non-redundant transcripts, novel genes and splice variants.

PubMed

Wistow, Graeme; Bernstein, Steven L; Wyatt, M Keith; Fariss, Robert N; Behal, Amita; Touchman, Jeffrey W; Bouffard, Gerald; Smith, Don; Peterson, Katherine

2002-06-15

The retinal pigment epithelium (RPE) and choroid comprise a functional unit of the eye that is essential to normal retinal health and function. Here we describe expressed sequence tag (EST) analysis of human RPE/choroid as part of a project for ocular bioinformatics. A cDNA library (cs) was made from human RPE/choroid and sequenced. Data were analyzed and assembled using the program GRIST (GRouping and Identification of Sequence Tags). Complete sequencing, Northern and Western blots, RH mapping, peptide antibody synthesis and immunofluorescence (IF) have been used to examine expression patterns and genome location for selected transcripts and proteins. Ten thousand individual sequence reads yield over 6300 unique gene clusters of which almost half have no matches with named genes. One of the most abundant transcripts is from a gene (named "alpha") that maps to the BBS1 region of chromosome 11. A number of tissue preferred transcripts are common to both RPE/choroid and iris. These include oculoglycan/opticin, for which an alternative splice form is detected in RPE/choroid, and "oculospanin" (Ocsp), a novel tetraspanin that maps to chromosome 17q. Antiserum to Ocsp detects expression in RPE, iris, ciliary body, and retinal ganglion cells by IF. A newly identified gene for a zinc-finger protein (TIRC) maps to 19q13.4. Variant transcripts of several genes were also detected. Most notably, the predominant form of Bestrophin represented in cs contains a longer open reading frame as a result of splice junction skipping. The unamplified cs library gives a view of the transcriptional repertoire of the adult RPE/choroid. A large number of potentially novel genes and splice forms and candidates for genetic diseases are revealed. Clones from this collection are being included in a large, nonredundant set for cDNA microarray construction.
Protein Sequencing with Tandem Mass Spectrometry

NASA Astrophysics Data System (ADS)

Ziady, Assem G.; Kinter, Michael

The recent introduction of electrospray ionization techniques that are suitable for peptides and whole proteins has allowed for the design of mass spectrometric protocols that provide accurate sequence information for proteins. The advantages gained by these approaches over traditional Edman Degradation sequencing include faster analysis and femtomole, sometimes attomole, sensitivity. The ability to efficiently identify proteins has allowed investigators to conduct studies on their differential expression or modification in response to various treatments or disease states. In this chapter, we discuss the use of electrospray tandem mass spectrometry, a technique whereby protein-derived peptides are subjected to fragmentation in the gas phase, revealing sequence information for the protein. This powerful technique has been instrumental for the study of proteins and markers associated with various disorders, including heart disease, cancer, and cystic fibrosis. We use the study of protein expression in cystic fibrosis as an example.
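To illustrate how gas-phase fragmentation can reveal sequence, the Python sketch below computes the singly charged b- and y-ion m/z ladders for a peptide. It assumes the common b/y fragmentation model and standard monoisotopic residue masses; the chapter abstract does not specify the ion types used, and the function name and test peptide are hypothetical. Consecutive ions in a ladder differ by one residue mass, which is what allows the sequence to be read from a spectrum.

```python
from typing import Dict, List, Tuple

# Monoisotopic residue masses in daltons (standard values, 5 decimals).
RESIDUE_MASS: Dict[str, float] = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton
WATER = 18.010565   # mass of H2O

def fragment_ions(peptide: str) -> Tuple[List[float], List[float]]:
    """Return singly charged m/z lists ([b1..b(n-1)], [y1..y(n-1)])."""
    masses = [RESIDUE_MASS[aa] for aa in peptide.upper()]
    n = len(masses)
    # b ions hold the N-terminal residues; y ions hold the C-terminal
    # residues plus one water.
    b_ions = [sum(masses[:k]) + PROTON for k in range(1, n)]
    y_ions = [sum(masses[-k:]) + WATER + PROTON for k in range(1, n)]
    return b_ions, y_ions

# Hypothetical test peptide, chosen only for illustration.
b_ions, y_ions = fragment_ions("PEPTIDE")
for k, (b, y) in enumerate(zip(b_ions, y_ions), start=1):
    print(f"b{k} = {b:.4f}   y{k} = {y:.4f}")
```

Leucine and isoleucine share the same residue mass, so this kind of ladder cannot distinguish them without additional evidence.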
Temporal variations in the gene expression levels of cyanobacterial anti-oxidant enzymes through geological history: implications for biological evolution during the Great Oxidation Event

NASA Astrophysics Data System (ADS)

Harada, M.; Furukawa, R.; Yokobori, S. I.; Tajika, E.; Yamagishi, A.

2016-12-01

A significant rise in atmospheric O2 levels during the GOE (Great Oxidation Event), ca. 2.45-2.0 Ga, must have placed great stress on the biosphere, forcing life to adapt to oxic conditions. Cyanobacteria, oxygenic photosynthetic bacteria that had been responsible for the GOE, are at the same time one of the organisms that would have been greatly affected by the rise of O2 level in the surface environments. Knowledge of the evolution of cyanobacteria is not only important to elucidate the cause of the GOE, but also helps us to better understand the adaptive evolution of life in response to the GOE. Here we performed phylogenetic analysis of an anti-oxidant enzyme Fe-SOD (iron superoxide dismutase) of cyanobacteria, to assess the adaptive evolution of life under the GOE. The rise of O2 level must have increased the level of toxic reactive oxygen species in cyanobacterial cells, thus forcing them to change activities or the gene expression levels of Fe-SOD. In the present study, we focus on the change in the gene expression levels of the enzyme, which can be estimated from the promoter sequences of the gene. Promoters are DNA sequences found upstream of protein encoding regions, where RNA polymerase binds and initiates transcription. "Strong" promoters that efficiently interact with RNA polymerase induce high rates of transcription, leading to high levels of gene expression. Thus, from the temporal changes in the promoter sequences, we can estimate the variations in the gene expression levels over geological time. Promoter sequences of Fe-SOD at each ancestral node of cyanobacteria were predicted from phylogenetic analysis, and the ancestral promoter sequences were compared to the promoters of known highly expressed genes. The similarity was low at the time of the emergence of cyanobacteria; however, it increased at the branching nodes that diverged 2.4 billion years ago. This roughly coincided with the onset of the GOE, implying that the transition from low to high gene expression levels of Fe-SOD occurred in response to the GOE. We propose that this is the first direct evidence of the evolution of cyanobacteria related to the rise of O2, and that the methodologies of ancestral promoter analysis used in this study can be a novel tool to reveal the biological adaptation to such a significant geologic event.
Integration of Bioinformatics and Synthetic Promoters Leads to the Discovery of Novel Elicitor-Responsive cis-Regulatory Sequences in Arabidopsis

PubMed Central

Koschmann, Jeannette; Machens, Fabian; Becker, Marlies; Niemeyer, Julia; Schulze, Jutta; Bülow, Lorenz; Stahl, Dietmar J.; Hehl, Reinhard

2012-01-01

A combination of bioinformatic tools, high-throughput gene expression profiles, and the use of synthetic promoters is a powerful approach to discover and evaluate novel cis-sequences in response to specific stimuli. With Arabidopsis (Arabidopsis thaliana) microarray data annotated to the PathoPlant database, 732 different queries with a focus on fungal and oomycete pathogens were performed, leading to 510 up-regulated gene groups. Using the binding site estimation suite of tools, BEST, 407 conserved sequence motifs were identified in promoter regions of these coregulated gene sets. Motif similarities were determined with STAMP, classifying the 407 sequence motifs into 37 families. A comparative analysis of these 37 families with the AthaMap, PLACE, and AGRIS databases revealed similarities to known cis-elements but also led to the discovery of cis-sequences not yet implicated in pathogen response. Using a parsley (Petroselinum crispum) protoplast system and a modified reporter gene vector with an internal transformation control, 25 elicitor-responsive cis-sequences from 10 different motif families were identified. Many of the elicitor-responsive cis-sequences also drive reporter gene expression in an Agrobacterium tumefaciens infection assay in Nicotiana benthamiana. This work significantly increases the number of known elicitor-responsive cis-sequences and demonstrates the successful integration of a diverse set of bioinformatic resources combined with synthetic promoter analysis for data mining and functional screening in plant-pathogen interaction. PMID:22744985
Exome sequencing coupled with mRNA analysis identifies NDUFAF6 as a Leigh gene.

PubMed

Bianciardi, Laura; Imperatore, Valentina; Fernandez-Vizarra, Erika; Lopomo, Angela; Falabella, Micol; Furini, Simone; Galluzzi, Paolo; Grosso, Salvatore; Zeviani, Massimo; Renieri, Alessandra; Mari, Francesca; Frullanti, Elisa

2016-11-01

We report here the case of a young male who started to show verbal fluency disturbance, clumsiness and gait anomalies at the age of 3.5 years and presented bilateral striatal necrosis. Clinically, the diagnosis was compatible with Leigh syndrome but the underlying molecular defect remained elusive even after exome analysis using autosomal/X-linked recessive or de novo models. Dosage of respiratory chain activity on fibroblasts, but not in muscle, underlined a deficit in complex I. Re-analysis of heterozygous probably pathogenic variants, inherited from one healthy parent, identified the p.Ala178Pro in NDUFAF6, a complex I assembly factor. RNA analysis showed an almost mono-allelic expression of the mutated allele in blood and fibroblasts, and puromycin treatment of cultured fibroblasts did not rescue expression of the maternal allele, which does not support the involvement of a nonsense-mediated RNA decay mechanism. Complementation assay underlined a recovery of complex I activity after transduction of the wild-type gene. Since the second mutation was not detected and promoter methylation analysis was normal, we hypothesized a non-exonic event in the maternal allele affecting a regulatory element that, in conjunction with the paternal mutation, leads to the autosomal recessive disorder and the different allele expression in various tissues. This paper confirms NDUFAF6 as a genuine morbid gene and proposes the coupling of exome sequencing with mRNA analysis as a method useful for enhancing the exome sequencing detection rate when the simple application of classical inheritance models fails. Copyright © 2016 Elsevier Inc. All rights reserved.
Identification and expression analysis of BoMF25, a novel polygalacturonase gene involved in pollen development of Brassica oleracea.

PubMed

Lyu, Meiling; Liang, Ying; Yu, Youjian; Ma, Zhiming; Song, Limin; Yue, Xiaoyan; Cao, Jiashu

2015-06-01

BoMF25 acts on pollen wall. Polygalacturonase (PG) is a pectin-digesting enzyme involved in numerous plant developmental processes and is described to be of critical importance for pollen wall development. In the present study, a PG gene, BoMF25, was isolated from Brassica oleracea. BoMF25 is the homologous gene of At4g35670, a PG gene in Arabidopsis thaliana with a high expression level at the tricellular pollen stage. Collinear analysis revealed that the orthologous gene of BoMF25 in the Brassica campestris (syn. B. rapa) genome was probably lost because of genome deletion and reshuffling. Sequence analysis indicated that BoMF25 contained the four classical conserved domains (I, II, III, and IV) of PG proteins. Homology and phylogenetic analyses showed that BoMF25 was clustered in Clade F. The putative promoter sequence, containing classical cis-acting elements and pollen-specific motifs, could drive green fluorescent protein expression in onion epidermal cells. Quantitative RT-PCR analysis suggested that BoMF25 was mainly expressed in the anther at the late stage of pollen development. In situ hybridization analysis also indicated a strong and specific expression signal of BoMF25 in pollen grains at the mature pollen stage. In the subcellular localization experiment, the fluorescence signal was observed in the cell wall of onion epidermal cells, which suggested that BoMF25 may be a secreted protein localized in the pollen wall.
Determination of differential gene expression profiles in superficial and deeper zones of mature rat articular cartilage using RNA sequencing of laser microdissected tissue specimens.

PubMed

Mori, Yoshifumi; Chung, Ung-Il; Tanaka, Sakae; Saito, Taku

2014-01-01

Superficial zone (SFZ) cells, which are morphologically and functionally distinct from chondrocytes in deeper zones, play important roles in the maintenance of articular cartilage. Here, we established an easy and reliable method for performance of laser microdissection (LMD) on cryosections of mature rat articular cartilage using an adhesive membrane. We further examined gene expression profiles in the SFZ and the deeper zones of articular cartilage by performing RNA sequencing (RNA-seq). We validated sample collection methods, RNA amplification and the RNA-seq data using real-time RT-PCR. The combined data provide comprehensive information regarding genes specifically expressed in the SFZ or deeper zones, as well as a useful protocol for expression analysis of microsamples of hard tissues.
Genome-wide transcriptome and expression profile analysis of Phalaenopsis during explant browning.

PubMed

Xu, Chuanjun; Zeng, Biyu; Huang, Junmei; Huang, Wen; Liu, Yumei

2015-01-01

Explant browning presents a major problem for in vitro culture, and can lead to the death of the explant and failure of regeneration. Considerable work has examined the physiological mechanisms underlying Phalaenopsis leaf explant browning, but the molecular mechanisms of browning remain elusive. In this study, we used whole genome RNA sequencing to examine Phalaenopsis leaf explant browning at genome-wide level. We first used Illumina high-throughput technology to sequence the transcriptome of Phalaenopsis and then performed de novo transcriptome assembly. We assembled 79,434,350 clean reads into 31,708 isogenes and generated 26,565 annotated unigenes. We assigned Gene Ontology (GO) terms, Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations, and potential Pfam domains to each transcript. Using the transcriptome data as a reference, we next analyzed the differential gene expression of explants cultured for 0, 3, and 6 d, respectively. We then identified differentially expressed genes (DEGs) before and after Phalaenopsis explant browning. We also performed GO, KEGG functional enrichment and Pfam analysis of all DEGs. Finally, we selected 11 genes for quantitative real-time PCR (qPCR) analysis to confirm the expression profile analysis. Here, we report the first comprehensive analysis of transcriptome and expression profiles during Phalaenopsis explant browning. Our results suggest that Phalaenopsis explant browning may be due in part to gene expression changes that affect the secondary metabolism, such as: phenylpropanoid pathway and flavonoid biosynthesis. Genes involved in photosynthesis and ATPase activity have been found to be changed at transcription level; these changes may perturb energy metabolism and thus lead to the decay of plant cells and tissues. This study provides comprehensive gene expression data for Phalaenopsis browning. Our data constitute an important resource for further functional studies to prevent explant browning.
Genome-Wide Transcriptome and Expression Profile Analysis of Phalaenopsis during Explant Browning

PubMed Central

Xu, Chuanjun; Zeng, Biyu; Huang, Junmei; Huang, Wen; Liu, Yumei

2015-01-01

Background: Explant browning presents a major problem for in vitro culture, and can lead to the death of the explant and failure of regeneration. Considerable work has examined the physiological mechanisms underlying Phalaenopsis leaf explant browning, but the molecular mechanisms of browning remain elusive. In this study, we used whole genome RNA sequencing to examine Phalaenopsis leaf explant browning at genome-wide level. Methodology/Principal Findings: We first used Illumina high-throughput technology to sequence the transcriptome of Phalaenopsis and then performed de novo transcriptome assembly. We assembled 79,434,350 clean reads into 31,708 isogenes and generated 26,565 annotated unigenes. We assigned Gene Ontology (GO) terms, Kyoto Encyclopedia of Genes and Genomes (KEGG) annotations, and potential Pfam domains to each transcript. Using the transcriptome data as a reference, we next analyzed the differential gene expression of explants cultured for 0, 3, and 6 d, respectively. We then identified differentially expressed genes (DEGs) before and after Phalaenopsis explant browning. We also performed GO, KEGG functional enrichment and Pfam analysis of all DEGs. Finally, we selected 11 genes for quantitative real-time PCR (qPCR) analysis to confirm the expression profile analysis. Conclusions/Significance: Here, we report the first comprehensive analysis of transcriptome and expression profiles during Phalaenopsis explant browning. Our results suggest that Phalaenopsis explant browning may be due in part to gene expression changes that affect the secondary metabolism, such as: phenylpropanoid pathway and flavonoid biosynthesis. Genes involved in photosynthesis and ATPase activity have been found to be changed at transcription level; these changes may perturb energy metabolism and thus lead to the decay of plant cells and tissues. This study provides comprehensive gene expression data for Phalaenopsis browning. Our data constitute an important resource for further functional studies to prevent explant browning. PMID:25874455
Proliferating cell nuclear antigen (Pcna) as a direct downstream target gene of Hoxc8

DOE Office of Scientific and Technical Information (OSTI.GOV)

Min, Hyehyun; Lee, Ji-Yeon; Bok, Jinwoong

2010-02-19

Hoxc8 is a member of the Hox family of transcription factors that play crucial roles in spatiotemporal body patterning during embryogenesis. Hox proteins contain a conserved 61 amino acid homeodomain, which is responsible for recognition and binding of the proteins onto Hox-specific DNA binding motifs and regulates expression of their target genes. Previously, using proteome analysis, we identified Proliferating cell nuclear antigen (Pcna) as one of the putative target genes of Hoxc8. Here, we asked whether Hoxc8 regulates Pcna expression by directly binding to the regulatory sequence of Pcna. In mouse embryos at embryonic day 11.5, the expression pattern of Pcna was similar to that of Hoxc8 along the anteroposterior body axis. Moreover, Pcna transcript levels as well as cell proliferation rate were increased by overexpression of Hoxc8 in C3H10T1/2 mouse embryonic fibroblast cells. Characterization of 2.3 kb genomic sequence upstream of Pcna coding region revealed that the upstream sequence contains several Hox core binding sequences and one Hox-Pbx binding sequence. Direct binding of Hoxc8 proteins to the Pcna regulatory sequence was verified by chromatin immunoprecipitation assay. Taken together, our data suggest that Pcna is a direct downstream target of Hoxc8.
VIZARD: analysis of Affymetrix Arabidopsis GeneChip data

NASA Technical Reports Server (NTRS)

Moseyko, Nick; Feldman, Lewis J.

2002-01-01

SUMMARY: The Affymetrix GeneChip Arabidopsis genome array has proved to be a very powerful tool for the analysis of gene expression in Arabidopsis thaliana, the most commonly studied plant model organism. VIZARD is a Java program created at the University of California, Berkeley, to facilitate analysis of Arabidopsis GeneChip data. It includes several integrated tools for filtering, sorting, clustering and visualization of gene expression data as well as tools for the discovery of regulatory motifs in upstream sequences. VIZARD also includes annotation and upstream sequence databases for the majority of genes represented on the Affymetrix Arabidopsis GeneChip array. AVAILABILITY: VIZARD is available free of charge for educational, research, and not-for-profit purposes, and can be downloaded at http://www.anm.f2s.com/research/vizard/ CONTACT: moseyko@uclink4.berkeley.edu.
Gene expression profiling of the plant pathogenic basidiomycetous fungus Rhizoctonia solani AG 4 reveals putative virulence factors

USDA-ARS's Scientific Manuscript database

Rhizoctonia solani is a ubiquitous basidiomycetous soilborne fungal pathogen causing damping off of seedlings, aerial blights and postharvest diseases. To gain insight into the molecular mechanisms of pathogenesis a global approach based on analysis of expressed sequence tags (ESTs) was undertaken. ...
APPLICATION OF DNA MICROARRAYS TO REPRODUCTIVE TOXICOLOGY AND THE DEVELOPMENT OF A TESTIS ARRAY

EPA Science Inventory

With the advent of sequence information for entire mammalian genomes, it is now possible to analyze gene expression and gene polymorphisms on a genomic scale. The primary tool for analysis of gene expression is the DNA microarray. We have used commercially available cDNA micro...
BIOMONITORING THE TOXICOGENOMIC RESPONSE TO ENDOCRINE DISRUPTING CHEMICALS IN HUMANS, LABORATORY SPECIES AND WILDLIFE

EPA Science Inventory

With the advent of sequence information for entire eukaryotic genomes, it is now possible to analyze gene expression on a genomic scale. The primary tool for genomic analysis of gene expression is the gene microarray. We have used commercially available and custom cDNA microarray...
Clinical implications of genomic profiles in metastatic breast cancer with a focus on TP53 and PIK3CA, the most frequently mutated genes.

PubMed

Kim, Ji-Yeon; Lee, Eunjin; Park, Kyunghee; Park, Woong-Yang; Jung, Hae Hyun; Ahn, Jin Seok; Im, Young-Hyuck; Park, Yeon Hee

2017-04-25

Breast cancer (BC) has been genetically profiled through large-scale genome analyses. However, the role and clinical implications of genetic alterations in metastatic BC (MBC) have not been evaluated. Therefore, we conducted whole-exome sequencing (WES) and RNA-Seq of 37 MBC samples and targeted deep sequencing of another 29 MBCs. We evaluated somatic mutations from WES and targeted sequencing, and assessed gene expression and performed pathway analysis from RNA-Seq. In this analysis, PIK3CA was the most commonly mutated gene in estrogen receptor (ER)-positive BC, while in ER-negative BC, TP53 was the most commonly mutated gene (p = 0.018 and p < 0.001, respectively). TP53 stopgain/loss and frameshift mutations were related to low expression of TP53, whereas nonsynonymous mutations were related to high expression. The impact of TP53 mutation on clinical outcome varied with regard to ER status. In ER-positive BCs, wild type TP53 had a better prognosis than mutated TP53 (median overall survival (OS) (wild type vs. mutated): 88.5 ± 54.4 vs. 32.6 ± 10.7 (months), p = 0.002). In contrast, mutated TP53 had a protective effect in ER-negative BCs (median OS: 0.10 vs. 32.6 ± 8.2, p = 0.026). However, PIK3CA mutation did not affect patient survival. In gene expression analysis, CALM1, a potential regulator of AKT, was highly expressed in PIK3CA-mutated BCs. In conclusion, mutation of TP53 was associated with expression status and affected clinical outcome according to ER status in MBC. Although mutation of PIK3CA was not related to survival in this study, mutation of PIK3CA altered the expression of other genes and pathways including CALM1 and may be a potential predictive marker of PI3K inhibitor effectiveness.

A perchlorate sensitive iodide transporter in frogs

PubMed Central

Carr, Deborah L.; Carr, James A.; Willis, Ray E.; Pressley, Thomas A.

2008-01-01

Nucleotide sequence comparisons have identified a gene product in the genome database of African clawed frogs (Xenopus laevis) as a probable member of the solute carrier family of membrane transporters. To confirm its identity as a putative iodide transporter, we examined the function of this sequence after heterologous expression in mammalian cells. A green monkey kidney cell line transfected with the Xenopus nucleotide sequence had significantly greater 125I uptake than sham-transfected control cells. The uptake in carrier-transfected cells was significantly inhibited in the presence of perchlorate, a competitive inhibitor of mammalian Na+/iodide symporter. Tissue distributions of the sequence were also consistent with a role in iodide uptake. The mRNA encoding the carrier was found to be expressed in the thyroid gland, stomach, and kidney of tadpoles from X. laevis, as well as the bullfrog Rana catesbeiana. The ovaries of adult X. laevis also were found to express the carrier. Phylogenetic analysis suggested that the putative X. laevis iodide transporter is orthologous to vertebrate Na+-dependent iodide symporters. We conclude that the amphibian sequence encodes a protein that is indeed a functional Na+/iodide symporter in Xenopus laevis, as well as Rana catesbeiana. PMID:18275962
Annotation of differentially expressed genes in the somatic embryogenesis of Musa and their location in the banana genome.

PubMed

Maldonado-Borges, Josefina Ines; Ku-Cauich, José Roberto; Escobedo-Graciamedrano, Rosa Maria

2013-01-01

Analysis of cDNA-AFLP was used to study the genes expressed in zygotic and somatic embryogenesis of Musa acuminata Colla ssp. malaccensis, and a comparison was made between their differentially transcribed fragments and the publicly available sequenced genome of the double haploid (DH) Pahang of the malaccensis subspecies. A total of 253 transcript-derived fragments (TDFs) were detected with apparent size of 100-4000 bp using 5 pairs of AFLP primers, of which 21 were differentially expressed during the different stages of banana embryogenesis; 15 of the sequences matched DH-Pahang chromosomes, with 7 of them being homologous to gene sequences encoding either known or putative protein domains of higher plants. Four TDF sequences were located in all Musa chromosomes, while the rest were located in one or two chromosomes. Their putative individual function is briefly reviewed based on published information, and the potential roles of these genes in embryo development are discussed. Thus the availability of the Musa genome and the TDF sequence information presented here open new possibilities for in-depth molecular and biochemical study of zygotic and somatic embryogenesis of Musa.
Development of expressed sequence tag-simple sequence repeat markers for genetic characterization and population structure analysis of Praxelis clematidea (Asteraceae).

PubMed

Wang, Q Z; Huang, M; Downie, S R; Chen, Z X

2016-05-23

Invasive plants tend to spread aggressively in new habitats and an understanding of their genetic diversity and population structure is useful for their management. In this study, expressed sequence tag-simple sequence repeat (EST-SSR) markers were developed for the invasive plant species Praxelis clematidea (Asteraceae) from 5548 Stevia rebaudiana (Asteraceae) expressed sequence tags (ESTs). A total of 133 microsatellite-containing ESTs (2.4%) were identified, of which 56 (42.1%) were hexanucleotide repeat motifs and 50 (37.6%) were trinucleotide repeat motifs. Of the 24 primer pairs designed from these 133 ESTs, 7 (29.2%) resulted in significant polymorphisms. The number of alleles per locus ranged from 5 to 9. The relatively high genetic diversity (H = 0.2667, I = 0.4212, and P = 100%) of P. clematidea was related to high gene flow (Nm = 1.4996) among populations. The coefficient of population differentiation (GST = 0.2500) indicated that most genetic variation occurred within populations. A Mantel test suggested that there was significant correlation between genetic distance and geographical distribution (r = 0.3192, P = 0.012). These results further support the transferability of EST-SSR markers between closely related genera of the same family.
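The microsatellite screening step described in this record can be illustrated with a minimal sketch. The abstract does not say which tool or thresholds the authors used, so the function name `find_ssrs` and the minimum repeat counts below are hypothetical choices for illustration only; the sketch finds perfect tandem repeats of 2 to 6 nt motifs with a regular-expression backreference.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    # Hypothetical thresholds (motif length -> minimum number of tandem copies);
    # these are illustrative assumptions, not values taken from the abstract.
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4, 5: 4, 6: 3}
    seq = seq.upper()
    hits = []
    for motif_len, min_n in min_repeats.items():
        # ([ACGT]{k}) captures a motif; \1{n-1,} requires at least n-1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "ATGATG" is really the trinucleotide "ATG").
            if any(motif == motif[:d] * (motif_len // d)
                   for d in range(1, motif_len) if motif_len % d == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)
```

Counting the hits by motif length over a set of ESTs would give the kind of tri- versus hexanucleotide breakdown reported above, though the exact proportions depend on the thresholds chosen.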
Cloning and stage-specific expression of CK-M1 gene during metamorphosis of Japanese flounder, Paralichthys olivaceus

NASA Astrophysics Data System (ADS)

Chen, Yanjie; Zhang, Quanqi; Qi, Jie; Wang, Zhigang; Wang, Xubo; Sun, Yeying; Zhong, Qiwang; Li, Shuo; Li, Chunmei

2010-05-01

The symmetrical body of flatfish larvae changes dramatically into an asymmetrical form after metamorphosis. The molecular mechanisms responsible for this change are poorly understood. As an initial step to clarify these mechanisms, we used representational difference analysis of cDNA for the identification of genes active during metamorphosis in the Japanese flounder, Paralichthys olivaceus. One of the up-regulated genes was identified as creatine kinase muscle type 1 (CK-M1). Sequence analysis of CK-M1 revealed that it spanned 1,708 bp and encoded a protein of 382 amino acids. The overall amino acid sequence of CK-M1 was highly conserved with those of other organisms. CK-M1 was expressed in adult fish tissues, including skeletal muscle, intestine and gill. Whole mount in-situ hybridization showed that the enhanced expression of CK-M1 expanded from the head to the whole body of larvae as metamorphosis progressed. Quantitative analysis revealed stage-specific high expression of CK-M1 during metamorphosis. The expression level of CK-M1 increased initially and peaked at metamorphosis, decreased afterward, and finally returned to the pre-metamorphosis level. This stage-specific expression pattern strongly suggested that CK-M1 was related to metamorphosis in the Japanese flounder. Its specific role in metamorphosis requires further study.
Comparative transcriptome analysis of lufenuron-resistant and susceptible strains of Spodoptera frugiperda (Lepidoptera: Noctuidae).

PubMed

do Nascimento, Antonio Rogério Bezerra; Fresia, Pablo; Cônsoli, Fernando Luis; Omoto, Celso

2015-11-21

The evolution of insecticide resistance in Spodoptera frugiperda (Lepidoptera: Noctuidae) has resulted in large economic losses and disturbances to the environment and agroecosystems. Resistance to lufenuron, a chitin biosynthesis inhibitor insecticide, was recently documented in Brazilian populations of S. frugiperda. Thus, we utilized large-scale cDNA sequencing (RNA-Seq analysis) to compare the pattern of gene expression between lufenuron-resistant (LUF-R) and susceptible (LUF-S) S. frugiperda larvae in an attempt to identify the molecular basis behind the resistance mechanism(s) of S. frugiperda to this insecticide. A transcriptome was assembled using approximately 19.6 million 100 bp-long single-end reads, which generated 18,506 transcripts with an N50 of 996 bp. A search against the NCBI non-redundant database generated 51.1% (9,457) functionally annotated transcripts. A large portion of the alignments were homologous to insects, with the majority (45%) being similar to sequences of Bombyx mori (Lepidoptera: Bombycidae). Moreover, 10% of the alignments were similar to sequences of various species of Spodoptera (Lepidoptera: Noctuidae), with 3% of them being similar to sequences of S. frugiperda. A comparative analysis of the gene expression between LUF-R and LUF-S S. frugiperda larvae identified 940 differentially expressed transcripts (p ≤ 0.05, t-test; fold change ≥ 4). Six of them were associated with cuticle metabolism. Of those, four were overexpressed in LUF-R larvae. The machinery involved with the detoxification process was represented by 35 differentially expressed transcripts; 24 of them belonging to P450 monooxygenases, four to glutathione-S-transferases, six to carboxylases and one to sulfotransferases. RNA-Seq analysis was validated for a number of selected candidate transcripts by using quantitative real time PCR (qPCR). The gene expression profile of LUF-R larvae of S. frugiperda differs from that of LUF-S larvae. In general, gene expression is much higher in resistant larvae than in susceptible ones, particularly for genes involved in pathways for xenobiotic detoxification, mainly represented by P450 monooxygenase transcripts. Our data indicate that enzymes involved in the detoxification process, mostly the P450s, are one of the resistance mechanisms employed by LUF-R S. frugiperda larvae against lufenuron.
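The selection rule quoted in this record (p ≤ 0.05 and fold change ≥ 4) can be sketched as a simple filter. This is only an illustration under stated assumptions: the function name, the record layout, the symmetric treatment of up- and down-regulation, and the skipping of zero-expression transcripts are all hypothetical, and the p-values are assumed to have been computed beforehand (the abstract reports a t-test).

```python
def differentially_expressed(records, max_p=0.05, min_fold=4.0):
    """Filter transcripts by p-value and fold change.

    records: iterable of (transcript_id, mean_resistant, mean_susceptible, p_value).
    Returns a list of (transcript_id, direction, fold_change).
    """
    selected = []
    for tid, mean_r, mean_s, p_value in records:
        # Assumption: transcripts with zero (or negative) mean expression are
        # skipped, because the ratio is undefined without a pseudocount.
        if mean_r <= 0 or mean_s <= 0:
            continue
        # Assumption: fold change is treated symmetrically (up or down).
        fold = max(mean_r / mean_s, mean_s / mean_r)
        if p_value <= max_p and fold >= min_fold:
            direction = "up_in_resistant" if mean_r > mean_s else "down_in_resistant"
            selected.append((tid, direction, fold))
    return selected
```

For example, a transcript with mean expression 80 in resistant and 10 in susceptible larvae and p = 0.01 would pass with an 8-fold change, while one with a 3-fold change or p = 0.2 would not.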
Discovery of Transcription Factors Novel to Mouse Cerebellar Granule Cell Development Through Laser-Capture Microdissection.

PubMed

Zhang, Peter G Y; Yeung, Joanna; Gupta, Ishita; Ramirez, Miguel; Ha, Thomas; Swanson, Douglas J; Nagao-Sato, Sayaka; Itoh, Masayoshi; Kawaji, Hideya; Lassmann, Timo; Daub, Carsten O; Arner, Erik; de Hoon, Michiel; Carninci, Piero; Forrest, Alistair R R; Hayashizaki, Yoshihide; Goldowitz, Dan

2018-06-01

Laser-capture microdissection was used to isolate external germinal layer tissue from three developmental periods of mouse cerebellar development: embryonic days 13, 15, and 18. The cerebellar granule cell-enriched mRNA library was generated with next-generation sequencing using the Helicos technology. Our objective was to discover transcriptional regulators that could be important for the development of cerebellar granule cells, the most numerous neurons in the central nervous system. Through differential expression analysis, we have identified 82 differentially expressed transcription factors (TFs) from a total of 1311 differentially expressed genes. In addition, with TF-binding sequence analysis, we have identified 46 TF candidates that could be key regulators responsible for the variation in the granule cell transcriptome between developmental stages. Altogether, we identified 125 potential TFs (82 from differential expression analysis and 46 from motif analysis, with 3 shared between the two sets). From this gene set, 37 TFs are considered novel due to the lack of previous knowledge about their roles in cerebellar development. The results from transcriptome-wide analyses were validated with existing online databases, qRT-PCR, and in situ hybridization. This study provides an initial insight into the TFs of cerebellar granule cells that might be important for development and provides valuable information for further functional studies on these transcriptional regulators.
De novo transcriptome sequencing of axolotl blastema for identification of differentially expressed genes during limb regeneration

PubMed Central

2013-01-01

Background: Salamanders are unique among vertebrates in their ability to completely regenerate amputated limbs through the mediation of blastema cells located at the stump ends. This regeneration is nerve-dependent because blastema formation and regeneration do not occur after limb denervation. To obtain the genomic information of blastema tissues, de novo transcriptomes from both blastema tissues and denervated stump ends of Ambystoma mexicanum (axolotls) 14 days post-amputation were sequenced and compared using Solexa DNA sequencing. Results: The sequencing done for this study produced 40,688,892 reads that were assembled into 307,345 transcribed sequences. The N50 of transcribed sequence length was 562 bases. A similarity search with known proteins identified 39,200 different genes to be expressed during limb regeneration with a cut-off E-value exceeding 10^-5. We annotated assembled sequences by using gene descriptions, gene ontology, and clusters of orthologous group terms. Targeted searches using these annotations showed that the majority of the genes were in the categories of essential metabolic pathways, transcription factors and conserved signaling pathways, and novel candidate genes for regenerative processes. We discovered and confirmed numerous sequences of the candidate genes by using quantitative polymerase chain reaction and in situ hybridization. Conclusion: The results of this study demonstrate that de novo transcriptome sequencing allows gene expression analysis in a species lacking genome information and provides the most comprehensive mRNA sequence resources for axolotls. The characterization of the axolotl transcriptome can help elucidate the molecular mechanisms underlying blastema formation during limb regeneration. PMID:23815514
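The N50 statistic quoted in this record (and in the S. frugiperda record above) can be computed as sketched below. This uses the common definition, namely the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases; the abstracts do not state which assembler or script produced their values, so the function name `n50` is an illustrative assumption.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or transcript lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the cumulative sum reaches half of the
        # total assembled bases is the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")
```

Because N50 is weighted by sequence length, an assembly with many short fragments (as is typical for de novo transcriptomes) can have a low N50 even when some transcripts are long.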
A cDNA from a mouse pancreatic beta cell encoding a putative transcription factor of the insulin gene.

PubMed Central

Walker, M D; Park, C W; Rosen, A; Aronheim, A

1990-01-01

Cell specific expression of the insulin gene is achieved through transcriptional mechanisms operating on multiple DNA sequence elements located in the 5' flanking region of the gene. Of particular importance in the rat insulin I gene are two closely similar 9 bp sequences (IEB1 and IEB2): mutation of either of these leads to 5-10 fold reduction in transcriptional activity. We have screened an expression cDNA library derived from mouse pancreatic endocrine beta cells with a radioactive DNA probe containing multiple copies of the IEB1 sequence. A cDNA clone (A1) isolated by this procedure encodes a protein which shows efficient binding to the IEB1 probe, but much weaker binding to either an unrelated DNA probe or to a probe bearing a single base pair insertion within the recognition sequence. DNA sequence analysis indicates a protein belonging to the helix-loop-helix family of DNA-binding proteins. The ability of the protein encoded by clone A1 to recognize a number of wild type and mutant DNA sequences correlates closely with the ability of each sequence element to support transcription in vivo in the context of the insulin 5' flanking DNA. We conclude that the isolated cDNA may encode a transcription factor that participates in control of insulin gene expression. PMID:2181401
De novo transcriptome sequencing in Frankliniella occidentalis to identify genes involved in plant virus transmission and insecticide resistance.

PubMed

Zhang, Zhijun; Zhang, Pengjun; Li, Weidi; Zhang, Jinming; Huang, Fang; Yang, Jian; Bei, Yawei; Lu, Yaobin

2013-05-01

The western flower thrips (WFT), Frankliniella occidentalis, a world-wide invasive insect, causes agricultural damage by directly feeding and by indirectly vectoring Tospoviruses, such as Tomato spotted wilt virus (TSWV). We characterized the transcriptome of WFT and analyzed the global gene expression response of WFT to TSWV infection using the Illumina sequencing platform. We compiled 59,932 unigenes, and identified 36,339 unigenes by similarity analysis against public databases, most of which were annotated using gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis. Within these annotated transcripts, we collected 278 sequences related to insecticide resistance. GO and KEGG analysis of differentially expressed genes between TSWV-infected and non-infected WFT populations revealed that TSWV can regulate cellular processes and immune response, which might lead to low virus titers in thrips cells and no detrimental effects on F. occidentalis. This data set not only enriches genomic resources for WFT, but also benefits research into its molecular genetics and functional genomics. Copyright © 2013 Elsevier Inc. All rights reserved.
Molecular cloning of the Coch gene of guinea pig inner ear and its expression analysis in cultured fibrocytes of the spiral ligament.

PubMed

Li, Lishu; Ikezono, Tetsuo; Sekine, Kuwon; Shindo, Susumu; Matsumura, Tomohiro; Pawankar, Ruby; Ichimiya, Issei; Yagi, Toshiaki

2010-08-01

We have cloned guinea pig Coch cDNA and the sequence information will be useful for future molecular study combined with physiological experiments. Proper Coch gene expression appears to be dependent on the unique extracellular micro-environment of the inner ear in vivo. These results provide insight into Coch gene expression and its regulation. To characterize the guinea pig Coch gene, we performed molecular cloning and expression analysis in the inner ear and cultured fibrocytes of the spiral ligament. The Coch cDNA was isolated using RACE. Cochlin isoforms were studied by Western blot using three different types of mammalian inner ear. The cochlear fibrocytes were cultured and characterized by immunostaining. Coch gene expression in the fibrocytes was investigated and the influence of cytokine stimulation was evaluated. The full-length 1991 bp Coch cDNA that encodes a 553 amino acid protein was isolated. The sequence had significant homology with other mammals, and the sizes of the Cochlin isoforms were identical. In the cultured fibrocytes, Coch mRNA was expressed in a very small amount and the isoform production was different, compared with the results in vivo. Cytokine stimulation did not alter the level of mRNA expression or isoform formation.
An integrated systems genetics screen reveals the transcriptional structure of inherited predisposition to metastatic disease

PubMed Central

Faraji, Farhoud; Hu, Ying; Wu, Gang; Goldberger, Natalie E.; Walker, Renard C.; Zhang, Jinghui; Hunter, Kent W.

2014-01-01

Metastasis is the result of stochastic genomic and epigenetic events leading to gene expression profiles that drive tumor dissemination. Here we exploit the principle that metastatic propensity is modified by the genetic background to generate prognostic gene expression signatures that illuminate regulators of metastasis. We also identify multiple microRNAs whose germline variation is causally linked to tumor progression and metastasis. We employ network analysis of global gene expression profiles in tumors derived from a panel of recombinant inbred mice to identify a network of co-expressed genes centered on Cnot2 that predicts metastasis-free survival. Modulating Cnot2 expression changes tumor cell metastatic potential in vivo, supporting a functional role for Cnot2 in metastasis. Small RNA sequencing of the same tumor set revealed a negative correlation between expression of the Mir216/217 cluster and tumor progression. Expression quantitative trait locus analysis (eQTL) identified cis-eQTLs at the Mir216/217 locus, indicating that differences in expression may be inherited. Ectopic expression of Mir216/217 in tumor cells suppressed metastasis in vivo. Finally, small RNA sequencing and mRNA expression profiling data were integrated to reveal that miR-3470a/b target a high proportion of network transcripts. In vivo analysis of Mir3470a/b demonstrated that both promote metastasis. Moreover, Mir3470b is a likely regulator of the Cnot2 network as its overexpression down-regulated expression of network hub genes and enhanced metastasis in vivo, phenocopying Cnot2 knockdown. The resulting data from this strategy identify Cnot2 as a novel regulator of metastasis and demonstrate the power of our systems-level approach in identifying modifiers of metastasis. PMID:24322557
Transcriptome Sequencing and Characterization of Japanese Scallop Patinopecten yessoensis from Different Shell Color Lines

PubMed Central

Chang, Yaqing; Zhao, Wenming; Du, Zhenlin; Hao, Zhenlin

2015-01-01

Shell color is an important trait that is used in breeding the Japanese scallop Patinopecten yessoensis, the most economically important scallop species in China. We constructed four transcriptome libraries from different shell color lines of P. yessoensis: the left and right shell mantles of ordinary strains of P. yessoensis and the left shell mantles of the ‘Ivory’ and ‘Maple’ strains. These four libraries were paired-end sequenced using the Illumina HiSeq 2000 platform and contained 54,802,692 sequences, 40,798,962 sequences, 74,019,262 sequences, and 44,466,166 sequences, respectively. A total of 214,087,082 expressed sequence tags were assembled into 73,522 unigenes with an average size of 1,163 bp. When the data were compared against the public Nr and Swiss-Prot databases using BlastX, nearly 30.55% (22,458) of the unigenes were significantly matched to known unique proteins. Gene Ontology annotation and pathway mapping analysis using the Kyoto Encyclopedia of Genes and Genomes categorized unigenes according to their diverse biological functions and processes and identified candidate genes that were potentially involved in growth, pigmentation, metal transcription, and immunity. Expression profile analysis was performed on all four libraries and many differentially expressed genes were identified. In addition, 5,772 simple sequence repeats were obtained from the P. yessoensis transcriptomes, and 464,197, 395,646, and 310,649 single nucleotide polymorphisms were revealed in the ordinary strains, the ‘Ivory’ strain, and the ‘Maple’ strain, respectively. These results provide valuable information for future genomic studies on P. yessoensis and improve our understanding of the molecular mechanisms involved in the growth, immunity, shell coloring, and shell biomineralization of this species. 
These resources also may be used in a variety of applications, such as trait mapping, marker-assisted breeding, studies of population genetics and genomics, and work on functional genomics. PMID:25680107
De novo assembly and characterization of the Trichuris trichiura adult worm transcriptome using Ion Torrent sequencing.

PubMed

Santos, Leonardo N; Silva, Eduardo S; Santos, André S; De Sá, Pablo H; Ramos, Rommel T; Silva, Artur; Cooper, Philip J; Barreto, Maurício L; Loureiro, Sebastião; Pinheiro, Carina S; Alcantara-Neves, Neuza M; Pacheco, Luis G C

2016-07-01

Infection with helminthic parasites, including the soil-transmitted helminth Trichuris trichiura (human whipworm), has been shown to modulate host immune responses and, consequently, to have an impact on the development and manifestation of chronic human inflammatory diseases. De novo derivation of helminth proteomes from sequencing of transcriptomes will provide valuable data to aid identification of parasite proteins that could be evaluated as potential immunotherapeutic molecules in the near future. Herein, we characterized the transcriptome of the adult stage of the human whipworm T. trichiura, using next-generation sequencing technology and a de novo assembly strategy. Nearly 17.6 million high-quality clean reads were assembled into 6414 contiguous sequences, with an N50 of 1606 bp. In total, 5673 protein-encoding sequences were confidently identified in the T. trichiura adult worm transcriptome; of these, 1013 sequences represent potential newly discovered proteins for the species, most of which have orthologs already annotated in the related species T. suis. A number of transcripts representing probable novel non-coding transcripts for the species T. trichiura were also identified. Among the most abundant transcripts, we found sequences that code for proteins involved in lipid transport, such as vitellogenins, and several chitin-binding proteins. Through a cross-species expression analysis of gene orthologs shared by T. trichiura and the closely related parasites T. suis and T. muris, it was possible to find twenty-six protein-encoding genes that are consistently highly expressed in the adult stages of the three helminth species. Additionally, twenty transcripts could be identified that code for proteins previously detected by mass spectrometry analysis of protein fractions of the whipworm somatic extract that present immunomodulatory activities. Five of these transcripts were amongst the most highly expressed protein-encoding sequences in the T. trichiura adult worm. In addition, orthologs of proteins demonstrated to have potent immunomodulatory properties in related parasitic helminths were also predicted from the T. trichiura de novo assembled transcriptome. Copyright © 2016. Published by Elsevier B.V.
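The N50 quoted in the abstract above is a standard assembly summary statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled bases. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `n50` and the example lengths are illustrative assumptions.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("no contig lengths given")
    # Walk through contigs from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to at least
        # half of the assembly size defines the N50.
        if running >= half_total:
            return length


# Hypothetical contig lengths, for illustration only.
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
example_n50 = n50(example_lengths)
```

Because N50 depends only on the length distribution, it says nothing about whether the contigs are correct, so it is usually reported alongside contig counts and annotation rates, as in the abstract.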
Heterologous Array Analysis in Pinaceae: Hybridization of Pinus Taeda cDNA Arrays With cDNA From Needles and Embryogenic Cultures of P. Taeda, P. Sylvestris or Picea Abies

PubMed Central

van Zyl, Leonel; von Arnold, Sara; Bozhkov, Peter; Chen, Yongzhong; Egertsdotter, Ulrika; MacKay, John; Sederoff, Ronald R.; Shen, Jing; Zelena, Lyubov

2002-01-01

Hybridization of labelled cDNA from various cell types with high-density arrays of expressed sequence tags is a powerful technique for investigating gene expression. Few conifer cDNA libraries have been sequenced. Because of the high level of sequence conservation between Pinus and Picea we have investigated the use of arrays from one genus for studies of gene expression in the other. The partial cDNAs from 384 identifiable genes expressed in differentiating xylem of Pinus taeda were printed on nylon membranes in randomized replicates. These were hybridized with labelled cDNA from needles or embryogenic cultures of Pinus taeda, P. sylvestris and Picea abies, and with labelled cDNA from leaves of Nicotiana tabacum. The Spearman correlation of gene expression for pairs of conifer species was high for needles (r2 = 0.78 − 0.86), and somewhat lower for embryogenic cultures (r2 = 0.68 − 0.83). The correlation of gene expression for tobacco leaves and needles of each of the three conifer species was lower but sufficiently high (r2 = 0.52 − 0.63) to suggest that many partial gene sequences are conserved in angiosperms and gymnosperms. Heterologous probing was further used to identify tissue-specific gene expression over species boundaries. To evaluate the significance of differences in gene expression, conventional parametric tests were compared with permutation tests after four methods of normalization. Permutation tests after Z-normalization provide the highest degree of discrimination but may enhance the probability of type I errors. It is concluded that arrays of cDNA from loblolly pine are useful for studies of gene expression in other pines or spruces. PMID:18629264
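The abstract above compares parametric tests with permutation tests applied after normalization. As a rough illustration of the general idea only (the abstract does not give the authors' exact procedure), the sketch below Z-normalizes the signals of one array and then runs a two-sample permutation test on the difference in group means for a single gene. The function names, the number of permutations, and the use of the mean difference as the test statistic are all assumptions.

```python
import random
from statistics import mean, pstdev


def z_normalize(values):
    """Centre one array's signals on 0 and scale them to unit standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("cannot Z-normalize constant values")
    return [(v - mu) / sigma for v in values]


def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation p-value for the difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        # Reassign group labels at random and recompute the statistic.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_permutations + 1)
```

With few replicates the number of distinct label assignments is small, which limits the smallest attainable p-value; this is one reason results from permutation and parametric tests can differ on the same data.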
Single-cell RNA-sequencing reveals a distinct population of proglucagon-expressing cells specific to the mouse upper small intestine.

PubMed

Glass, Leslie L; Calero-Nieto, Fernando J; Jawaid, Wajid; Larraufie, Pierre; Kay, Richard G; Göttgens, Berthold; Reimann, Frank; Gribble, Fiona M

2017-10-01

To identify sub-populations of intestinal preproglucagon-expressing (PPG) cells producing Glucagon-like Peptide-1, and their associated expression profiles of sensory receptors, thereby enabling the discovery of therapeutic strategies that target these cell populations for the treatment of diabetes and obesity. We performed single cell RNA sequencing of PPG-cells purified by flow cytometry from the upper small intestine of 3 GLU-Venus mice. Cells from 2 mice were sequenced at low depth, and from the third mouse at high depth. High quality sequencing data from 234 PPG-cells were used to identify clusters by tSNE analysis. qPCR was performed to compare the longitudinal and crypt/villus locations of cluster-specific genes. Immunofluorescence and mass spectrometry were used to confirm protein expression. PPG-cells formed 3 major clusters: a group with typical characteristics of classical L-cells, including high expression of Gcg and Pyy (comprising 51% of all PPG-cells); a cell type overlapping with Gip-expressing K-cells (14%); and a unique cluster expressing Tph1 and Pzp that was predominantly located in proximal small intestine villi and co-produced 5-HT (35%). Expression of G-protein coupled receptors differed between clusters, suggesting the cell types are differentially regulated and would be differentially targetable. Our findings support the emerging concept that many enteroendocrine cell populations are highly overlapping, with individual cells producing a range of peptides previously assigned to distinct cell types. Different receptor expression profiles across the clusters highlight potential drug targets to increase gut hormone secretion for the treatment of diabetes and obesity. Copyright © 2017 The Authors. Published by Elsevier GmbH. All rights reserved.
[Blue-light induced expression of S-adenosyl-L-homocysteine hydrolase-like gene in Mucor amphibiorum RCS1].

PubMed

Gao, Ya; Wang, Shu; Fu, Mingjia; Zhong, Guolin

2013-09-04

To determine blue-light induced expression of the S-adenosyl-L-homocysteine hydrolase-like (sahhl) gene in the fungus Mucor amphibiorum RCS1. In the random process of PCR, a sequence of 555 bp was obtained from M. amphibiorum RCS1. The 555 bp sequence was labeled with digoxin to prepare the probe for northern hybridization. By northern hybridization, transcription of the sahhl gene was analyzed in M. amphibiorum RCS1 mycelia cultured from darkness to blue light to darkness. Real-time PCR was used in parallel to analyze sahhl gene expression. Comparison with sahh gene sequences from Homo sapiens, Mus musculus and several fungal species confirmed that the 555 bp sequence is highly homologous to them, providing preliminary confirmation that it is the sahhl gene of M. amphibiorum RCS1. After 24 h of dark pre-culture, large amounts of sahhl transcript were detected in the mycelia by northern hybridization and real-time PCR following 24 h of blue light. After 48 h of dark pre-culture, however, large amounts of sahhl transcript were not detected, even though the mycelia were induced by blue light. Blue light can induce the expression of the sahhl gene in vigorously growing M. amphibiorum RCS1 mycelia.
Mango (Mangifera indica L.) cv. Kent fruit mesocarp de novo transcriptome assembly identifies gene families important for ripening

PubMed Central

Dautt-Castro, Mitzuko; Ochoa-Leyva, Adrian; Contreras-Vergara, Carmen A.; Pacheco-Sanchez, Magda A.; Casas-Flores, Sergio; Sanchez-Flores, Alejandro; Kuhn, David N.; Islas-Osuna, Maria A.

2015-01-01

Fruit ripening is a physiological and biochemical process genetically programmed to regulate fruit quality parameters like firmness, flavor, odor and color, as well as production of ethylene in climacteric fruit. In this study, a transcriptomic analysis of mango (Mangifera indica L.) mesocarp cv. “Kent” was done to identify key genes associated with fruit ripening. Using the Illumina sequencing platform, 67,682,269 clean reads were obtained and a transcriptome of 4.8 Gb. A total of 33,142 coding sequences were predicted and after functional annotation, 25,154 protein sequences were assigned with a product according to Swiss-Prot database and 32,560 according to non-redundant database. Differential expression analysis identified 2,306 genes with significant differences in expression between mature-green and ripe mango [1,178 up-regulated and 1,128 down-regulated (FDR ≤ 0.05)]. The expression of 10 genes evaluated by both qRT-PCR and RNA-seq data was highly correlated (R = 0.97), validating the differential expression data from RNA-seq alone. Gene Ontology enrichment analysis, showed significantly represented terms associated to fruit ripening like “cell wall,” “carbohydrate catabolic process” and “starch and sucrose metabolic process” among others. Mango genes were assigned to 327 metabolic pathways according to Kyoto Encyclopedia of Genes and Genomes database, among them those involved in fruit ripening such as plant hormone signal transduction, starch and sucrose metabolism, galactose metabolism, terpenoid backbone, and carotenoid biosynthesis. This study provides a mango transcriptome that will be very helpful to identify genes for expression studies in early and late flowering mangos during fruit ripening. PMID:25741352
Mango (Mangifera indica L.) cv. Kent fruit mesocarp de novo transcriptome assembly identifies gene families important for ripening.

PubMed

Dautt-Castro, Mitzuko; Ochoa-Leyva, Adrian; Contreras-Vergara, Carmen A; Pacheco-Sanchez, Magda A; Casas-Flores, Sergio; Sanchez-Flores, Alejandro; Kuhn, David N; Islas-Osuna, Maria A

2015-01-01

Fruit ripening is a physiological and biochemical process genetically programmed to regulate fruit quality parameters like firmness, flavor, odor and color, as well as production of ethylene in climacteric fruit. In this study, a transcriptomic analysis of mango (Mangifera indica L.) mesocarp cv. "Kent" was done to identify key genes associated with fruit ripening. Using the Illumina sequencing platform, 67,682,269 clean reads were obtained and a transcriptome of 4.8 Gb. A total of 33,142 coding sequences were predicted and after functional annotation, 25,154 protein sequences were assigned with a product according to Swiss-Prot database and 32,560 according to non-redundant database. Differential expression analysis identified 2,306 genes with significant differences in expression between mature-green and ripe mango [1,178 up-regulated and 1,128 down-regulated (FDR ≤ 0.05)]. The expression of 10 genes evaluated by both qRT-PCR and RNA-seq data was highly correlated (R = 0.97), validating the differential expression data from RNA-seq alone. Gene Ontology enrichment analysis, showed significantly represented terms associated to fruit ripening like "cell wall," "carbohydrate catabolic process" and "starch and sucrose metabolic process" among others. Mango genes were assigned to 327 metabolic pathways according to Kyoto Encyclopedia of Genes and Genomes database, among them those involved in fruit ripening such as plant hormone signal transduction, starch and sucrose metabolism, galactose metabolism, terpenoid backbone, and carotenoid biosynthesis. This study provides a mango transcriptome that will be very helpful to identify genes for expression studies in early and late flowering mangos during fruit ripening.
Microarray analysis of gene expression profiles in ripening pineapple fruits.

PubMed

Koia, Jonni H; Moyle, Richard L; Botella, Jose R

2012-12-18

Pineapple (Ananas comosus) is a tropical fruit crop of significant commercial importance. Although the physiological changes that occur during pineapple fruit development have been well characterized, little is known about the molecular events that occur during the fruit ripening process. Understanding the molecular basis of pineapple fruit ripening will aid the development of new varieties via molecular breeding or genetic modification. In this study we developed a 9277 element pineapple microarray and used it to profile gene expression changes that occur during pineapple fruit ripening. Microarray analyses identified 271 unique cDNAs differentially expressed at least 1.5-fold between the mature green and mature yellow stages of pineapple fruit ripening. Among these 271 sequences, 184 share significant homology with genes encoding proteins of known function, 53 share homology with genes encoding proteins of unknown function and 34 share no significant homology with any database accession. Of the 237 pineapple sequences with homologs, 160 were up-regulated and 77 were down-regulated during pineapple fruit ripening. DAVID Functional Annotation Cluster (FAC) analysis of all 237 sequences with homologs revealed confident enrichment scores for redox activity, organic acid metabolism, metalloenzyme activity, glycolysis, vitamin C biosynthesis, antioxidant activity and cysteine peptidase activity, indicating the functional significance and importance of these processes and pathways during pineapple fruit development. Quantitative real-time PCR analysis validated the microarray expression results for nine out of ten genes tested. This is the first report of a microarray based gene expression study undertaken in pineapple. Our bioinformatic analyses of the transcript profiles have identified a number of genes, processes and pathways with putative involvement in the pineapple fruit ripening process. 
This study extends our knowledge of the molecular basis of pineapple fruit ripening and non-climacteric fruit ripening in general.
Microarray analysis of gene expression profiles in ripening pineapple fruits

PubMed Central

2012-01-01

Background: Pineapple (Ananas comosus) is a tropical fruit crop of significant commercial importance. Although the physiological changes that occur during pineapple fruit development have been well characterized, little is known about the molecular events that occur during the fruit ripening process. Understanding the molecular basis of pineapple fruit ripening will aid the development of new varieties via molecular breeding or genetic modification. In this study we developed a 9277 element pineapple microarray and used it to profile gene expression changes that occur during pineapple fruit ripening. Results: Microarray analyses identified 271 unique cDNAs differentially expressed at least 1.5-fold between the mature green and mature yellow stages of pineapple fruit ripening. Among these 271 sequences, 184 share significant homology with genes encoding proteins of known function, 53 share homology with genes encoding proteins of unknown function and 34 share no significant homology with any database accession. Of the 237 pineapple sequences with homologs, 160 were up-regulated and 77 were down-regulated during pineapple fruit ripening. DAVID Functional Annotation Cluster (FAC) analysis of all 237 sequences with homologs revealed confident enrichment scores for redox activity, organic acid metabolism, metalloenzyme activity, glycolysis, vitamin C biosynthesis, antioxidant activity and cysteine peptidase activity, indicating the functional significance and importance of these processes and pathways during pineapple fruit development. Quantitative real-time PCR analysis validated the microarray expression results for nine out of ten genes tested. Conclusions: This is the first report of a microarray based gene expression study undertaken in pineapple. Our bioinformatic analyses of the transcript profiles have identified a number of genes, processes and pathways with putative involvement in the pineapple fruit ripening process. This study extends our knowledge of the molecular basis of pineapple fruit ripening and non-climacteric fruit ripening in general. PMID:23245313

WHAM!: a web-based visualization suite for user-defined analysis of metagenomic shotgun sequencing data.

PubMed

Devlin, Joseph C; Battaglia, Thomas; Blaser, Martin J; Ruggles, Kelly V

2018-06-25

Exploration of large data sets, such as shotgun metagenomic sequence or expression data, by biomedical experts and medical professionals remains as a major bottleneck in the scientific discovery process. Although tools for this purpose exist for 16S ribosomal RNA sequencing analysis, there is a growing but still insufficient number of user-friendly interactive visualization workflows for easy data exploration and figure generation. The development of such platforms for this purpose is necessary to accelerate and streamline microbiome laboratory research. We developed the Workflow Hub for Automated Metagenomic Exploration (WHAM!) as a web-based interactive tool capable of user-directed data visualization and statistical analysis of annotated shotgun metagenomic and metatranscriptomic data sets. WHAM! includes exploratory and hypothesis-based gene and taxa search modules for visualizing differences in microbial taxa and gene family expression across experimental groups, and for creating publication quality figures without the need for command line interface or in-house bioinformatics. WHAM! is an interactive and customizable tool for downstream metagenomic and metatranscriptomic analysis providing a user-friendly interface allowing for easy data exploration by microbiome and ecological experts to facilitate discovery in multi-dimensional and large-scale data sets.
A De Novo-Assembly Based Data Analysis Pipeline for Plant Obligate Parasite Metatranscriptomic Studies.

PubMed

Guo, Li; Allen, Kelly S; Deiulio, Greg; Zhang, Yong; Madeiras, Angela M; Wick, Robert L; Ma, Li-Jun

2016-01-01

Current and emerging plant diseases caused by obligate parasitic microbes such as rusts, downy mildews, and powdery mildews threaten worldwide crop production and food safety. These obligate parasites are typically unculturable in the laboratory, posing technical challenges to characterize them at the genetic and genomic level. Here we have developed a data analysis pipeline integrating several bioinformatic software programs. This pipeline facilitates rapid gene discovery and expression analysis of a plant host and its obligate parasite simultaneously by next generation sequencing of mixed host and pathogen RNA (i.e., metatranscriptomics). We applied this pipeline to metatranscriptomic sequencing data of sweet basil (Ocimum basilicum) and its obligate downy mildew parasite Peronospora belbahrii, both lacking a sequenced genome. Even with a single data point, we were able to identify both candidate host defense genes and pathogen virulence genes that are highly expressed during infection. This demonstrates the power of this pipeline for identifying genes important in host-pathogen interactions without prior genomic information for either the plant host or the obligate biotrophic pathogen. The simplicity of this pipeline makes it accessible to researchers with limited computational skills and applicable to metatranscriptomic data analysis in a wide range of plant-obligate-parasite systems.
PNMA family: Protein interaction network and cell signalling pathways implicated in cancer and apoptosis.

PubMed

Pang, Siew Wai; Lahiri, Chandrajit; Poh, Chit Laa; Tan, Kuan Onn

2018-05-01

Paraneoplastic Ma Family (PNMA) comprises a growing number of family members which share relatively conserved protein sequences encoded by the human genome and is localized to several human chromosomes, including the X-chromosome. Based on sequence analysis, PNMA family members share sequence homology to the Gag protein of LTR retrotransposon, and several family members with aberrant protein expressions have been reported to be closely associated with the human Paraneoplastic Disorder (PND). In addition, gene mutations of specific members of PNMA family are known to be associated with human mental retardation or 3-M syndrome consisting of restrictive post-natal growth or dwarfism, and development of skeletal abnormalities. Other than sequence homology, the physiological function of many members in this family remains unclear. However, several members of this family have been characterized, including cell signalling events mediated by these proteins that are associated with apoptosis, and cancer in different cell types. Furthermore, while certain PNMA family members show restricted gene expression in the human brain and testis, other PNMA family members exhibit broader gene expression or preferential and selective protein interaction profiles, suggesting functional divergence within the family. Functional analysis of some members of this family have identified protein domains that are required for subcellular localization, protein-protein interactions, and cell signalling events which are the focus of this review paper. Copyright © 2018 Elsevier Inc. All rights reserved.
Transcriptional profiling reveals the expression of novel genes in response to various stimuli in the human dermatophyte Trichophyton rubrum

PubMed Central

2010-01-01

Background: Cutaneous mycoses are common human infections among healthy and immunocompromised hosts, and the anthropophilic fungus Trichophyton rubrum is the most prevalent microorganism isolated from such clinical cases worldwide. The aim of this study was to determine the transcriptional profile of T. rubrum exposed to various stimuli in order to obtain insights into the responses of this pathogen to different environmental challenges. Therefore, we generated an expressed sequence tag (EST) collection by constructing one cDNA library and nine suppression subtractive hybridization libraries. Results: The 1388 unigenes identified in this study were functionally classified based on the Munich Information Center for Protein Sequences (MIPS) categories. The identified proteins were involved in transcriptional regulation, cellular defense and stress, protein degradation, signaling, transport, and secretion, among other functions. Analysis of these unigenes revealed 575 T. rubrum sequences that had not been previously deposited in public databases. Conclusion: In this study, we identified novel T. rubrum genes that will be useful for ORF prediction in genome sequencing and facilitating functional genome analysis. Annotation of these expressed genes revealed metabolic adaptations of T. rubrum to carbon sources, ambient pH shifts, and various antifungal drugs used in medical practice. Furthermore, challenging T. rubrum with cytotoxic drugs and ambient pH shifts extended our understanding of the molecular events possibly involved in the infectious process and resistance to antifungal drugs. PMID:20144196
Identification, expression and tissue distribution of a renalase homologue from mouse.

PubMed

Wang, Jian; Qi, Shaoling; Cheng, Wei; Li, Li; Wang, Fu; Li, Ying-Zi; Zhang, Shu-Ping

2008-12-01

FAD (flavin adenine dinucleotide)-dependent monoamine oxidases play very important roles in many biological processes. A novel monoamine oxidase, named renalase, was recently identified in human kidney and is markedly reduced in patients with end-stage renal disease (ESRD). Here, we report the identification of a renalase homologue from mouse, termed mMAO-C (mouse monoamine oxidase-C) after monoamine oxidase-A and -B (MAO-A and -B). This gene is located on mouse chromosome 19C1 and its coding region spans 7 exons. The deduced amino acid sequence was predicted to contain a typical secretory signal peptide and a conserved amine oxidase domain. Phylogenetic analysis and multiple sequence alignment indicated that mMAO-C-like sequences exist in all examined species and share significant similarities. This gene has been submitted to the NCBI GenBank database (Accession number: DQ788834). With expression vectors generated from the cloned mMAO-C gene, exogenous protein was effectively expressed in both prokaryotic and eukaryotic cells. Recombinant mMAO-C protein was secreted from human cell lines, indicating that its signal peptide is functional. Moreover, tissue expression pattern analysis revealed that the mMAO-C gene is predominantly expressed in the mouse kidney and testicle, which implies that kidney and testicle are the main sources of renalase secretion. In short, this study provides insight into the physiological and biological functions of mMAO-C and its homologues in the endocrine system.
A long and abundant non-coding RNA in Lactobacillus salivarius.

PubMed

Cousin, Fabien J; Lynch, Denise B; Chuat, Victoria; Bourin, Maxence J B; Casey, Pat G; Dalmasso, Marion; Harris, Hugh M B; McCann, Angela; O'Toole, Paul W

2017-09-01

Lactobacillus salivarius, found in the intestinal microbiota of humans and animals, is studied as an example of the sub-dominant intestinal commensals that may impart benefits upon their host. Strains typically harbour at least one megaplasmid that encodes functions contributing to contingency metabolism and environmental adaptation. RNA sequencing (RNA-seq) transcriptomic analysis of L. salivarius strain UCC118 identified the presence of a novel unusually abundant long non-coding RNA (lncRNA) encoded by the megaplasmid, and which represented more than 75% of the total RNA-seq reads after depletion of rRNA species. The expression level of this 520 nt lncRNA in L. salivarius UCC118 exceeded that of the 16S rRNA, it accumulated during growth, was very stable over time and was also expressed during intestinal transit in a mouse. This lncRNA sequence is specific to the L. salivarius species; however, among 45 L. salivarius genomes analysed, not all (only 34) harboured the sequence for the lncRNA. This lncRNA was produced in 27 tested L. salivarius strains, but at strain-specific expression levels. High-level lncRNA expression correlated with high megaplasmid copy number. Transcriptome analysis of a deletion mutant lacking this lncRNA identified altered expression levels of genes in a number of pathways, but a definitive function of this new lncRNA was not identified. This lncRNA presents distinctive and unique properties, and suggests potential basic and applied scientific developments of this phenomenon.
Transcriptome assembly, profiling and differential gene expression analysis of the halophyte Suaeda fruticosa provides insights into salt tolerance.

PubMed

Diray-Arce, Joann; Clement, Mark; Gul, Bilquees; Khan, M Ajmal; Nielsen, Brent L

2015-05-06

Improvement of crop production is needed to feed the growing world population as the amount and quality of agricultural land decreases and soil salinity increases. This has stimulated research on salt tolerance in plants. Most crops tolerate only a limited amount of salt while surviving and producing biomass, whereas halophytes (salt-tolerant plants) can grow in saline water by means of specific biochemical mechanisms. However, little is known about the genes involved in salt tolerance. We have characterized the transcriptome of Suaeda fruticosa, a halophyte that has the ability to sequester salts in its leaves. Suaeda fruticosa is an annual shrub in the family Chenopodiaceae found in coastal and inland regions of Pakistan and on Mediterranean shores. This plant is an obligate halophyte that grows optimally at 200-400 mM NaCl and can grow at up to 1000 mM NaCl. High-throughput sequencing was used to gain an understanding of the genes involved in the salt tolerance mechanism. De novo assembly and analysis of the transcriptome allowed identification of differentially expressed and unique genes present in this non-conventional crop. Twelve sequencing libraries prepared from control (0 mM NaCl treated) and optimum (300 mM NaCl treated) plants were sequenced using Illumina HiSeq 2000 to investigate differential gene expression between shoots and roots of Suaeda fruticosa. The transcriptome was assembled de novo using Velvet and Oases k-45 and clustered using CD-HIT-EST. There are 54,526 unigenes; among these, 475 genes are downregulated and 44 are upregulated when samples from plants grown under optimal salt are compared with those grown without salt. BLAST analysis identified the differentially expressed genes, which were categorized by gene ontology terms and pathways. This work has identified potential genes involved in salt tolerance in Suaeda fruticosa, and has provided an outline of tools to use for de novo transcriptome analysis. The assemblies that were used provide coverage of a considerable proportion of the transcriptome, which allows analysis of differential gene expression and identification of genes that may be involved in salt tolerance. The transcriptome may serve as a reference sequence for study of other succulent halophytes.
Expressed sequence tag analysis of adult human optic nerve for NEIBank: Identification of cell type and tissue markers

PubMed Central

Bernstein, Steven L; Guo, Yan; Peterson, Katherine; Wistow, Graeme

2009-01-01

Background The optic nerve is a pure white matter central nervous system (CNS) tract with an isolated blood supply, and is widely used in physiological studies of white matter response to various insults. We examined the gene expression profile of human optic nerve (ON) and, through the NEIBank online resource, provide a resource of sequence-verified cDNA clones. An un-normalized cDNA library was constructed from pooled human ON tissues and was used in expressed sequence tag (EST) analysis. Location of an abundant oligodendrocyte marker was examined by immunofluorescence. Quantitative real time polymerase chain reaction (qRT-PCR) and Western analysis were used to compare levels of expression for key calcium channel protein genes and protein product in primate and rodent ON. Results Our analyses revealed a profile similar in many respects to other white matter related tissues, but significantly different from previously available ON cDNA libraries. The previous libraries were found to include specific markers for other eye tissues, suggesting contamination. Immune/inflammatory markers were abundant in the new ON library. The oligodendrocyte marker QKI was abundant at the EST level. Immunofluorescence revealed that this protein is a useful oligodendrocyte cell-type marker in rodent and primate ONs. L-type calcium channel EST abundance was found to be particularly low. A qRT-PCR-based comparative mammalian species analysis revealed that L-type calcium channel expression levels are significantly lower in primate than in rodent ON, which may help account for the class-specific difference in responsiveness to calcium channel blocking agents. Several known eye disease genes are abundantly expressed in ON. Many genes associated with normal axonal function, as well as mRNAs associated with axonal transport, inflammation and neuroprotection, are observed. Conclusion We conclude that the new cDNA library is a faithful representation of human ON and that the EST data provide an initial overview of gene expression patterns in this tissue. The data provide clues to tissue-specific and species-specific properties of human ON that will help in the design of therapeutic models. PMID:19778450
RNA‑sequencing analysis of aberrantly expressed long non‑coding RNAs and mRNAs in a mouse model of ventilator‑induced lung injury.

PubMed

Xu, Bo; Wang, Yizhou; Li, Xiujuan; Mao, Yanfei; Deng, Xiaoming

2018-05-17

Long non-coding RNAs (lncRNAs) are closely associated with the regulation of various biological processes and are involved in the pathogenesis of numerous diseases. However, to the best of our knowledge, the role of lncRNAs in ventilator‑induced lung injury (VILI) has yet to be evaluated. In the present study, high‑throughput sequencing was applied to investigate differentially expressed lncRNAs and mRNAs (fold change >2; false discovery rate <0.05). Bioinformatics analysis was employed to predict the functions of differentially expressed lncRNAs. A total of 104 lncRNAs (74 upregulated and 30 downregulated) and 809 mRNAs (521 upregulated and 288 downregulated) were differentially expressed in lung tissues from the VILI group. Gene ontology analysis demonstrated that the differentially expressed lncRNAs and mRNAs were mainly associated with biological functions, including apoptosis, angiogenesis, neutrophil chemotaxis and skeletal muscle cell differentiation. The top four enriched pathways were the tumor necrosis factor (TNF) signaling pathway, P53 signaling pathway, neuroactive ligand‑receptor interaction and the forkhead box O signaling pathway. Several lncRNAs were predicted to serve a vital role in VILI. Subsequently, three lncRNAs [mitogen‑activated protein kinase kinase 3, opposite strand (Map2k3os), dynamin 3, opposite strand and abhydrolase domain containing 11, opposite strand] and three mRNAs (growth arrest and DNA damage‑inducible α, claudin 4 and thromboxane A2 receptor) were measured by reverse transcription‑quantitative polymerase chain reaction, in order to confirm the veracity of RNA‑sequencing analysis. In addition, Map2k3os small interfering RNA transfection inhibited the expression of stretch‑induced cytokines [TNF‑α, interleukin (IL)‑1β and IL‑6] in MLE12 cells. In conclusion, the results of the present study provided a profile of differentially expressed lncRNAs in VILI. Several important lncRNAs may be involved in the pathological process of VILI, which may be useful to guide further investigation into the pathogenesis of this disease.
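The selection rule quoted in this abstract (fold change >2; false discovery rate <0.05) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the gene names and values are invented, the Benjamini-Hochberg procedure is assumed as the FDR method (the abstract does not name one), and "fold change >2" is read as a two-fold change in either direction.

```python
import math

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def differentially_expressed(records, min_fold=2.0, max_fdr=0.05):
    """records: list of (name, control_mean, treated_mean, p_value)."""
    fdrs = benjamini_hochberg([r[3] for r in records])
    hits = []
    for (name, control, treated, _), fdr in zip(records, fdrs):
        log2fc = math.log2(treated / control)
        # Keep genes changed more than min_fold in either direction.
        if abs(log2fc) > math.log2(min_fold) and fdr < max_fdr:
            hits.append((name, "up" if log2fc > 0 else "down"))
    return hits

# Hypothetical toy data, not taken from the study.
toy = [
    ("geneA", 10.0, 45.0, 0.0004),
    ("geneB", 20.0, 22.0, 0.0300),
    ("geneC", 50.0, 12.0, 0.0010),
    ("geneD", 8.0, 30.0, 0.2000),
]
print(differentially_expressed(toy))
```

In this toy set, geneB fails the fold-change cut and geneD fails the FDR cut, so only geneA and geneC are reported. Real pipelines would normally take both statistics from a dedicated differential-expression package rather than from raw means.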
Association of coral algal symbionts with a diverse viral community responsive to heat shock.

PubMed

Brüwer, Jan D; Agrawal, Shobhit; Liew, Yi Jin; Aranda, Manuel; Voolstra, Christian R

2017-08-17

Stony corals provide the structural foundation of coral reef ecosystems and are termed holobionts given they engage in symbioses, in particular with photosynthetic dinoflagellates of the genus Symbiodinium. Besides Symbiodinium, corals also engage with bacteria affecting metabolism, immunity, and resilience of the coral holobiont, but the role of associated viruses is largely unknown. In this regard, the increase of studies using RNA sequencing (RNA-Seq) to assess gene expression provides an opportunity to elucidate viral signatures encompassed within the data via careful delineation of sequence reads and their source of origin. Here, we re-analyzed an RNA-Seq dataset from a cultured coral symbiont (Symbiodinium microadriaticum, Clade A1) across four experimental treatments (control, cold shock, heat shock, dark shock) to characterize associated viral diversity, abundance, and gene expression. Our approach comprised the filtering and removal of host sequence reads, subsequent phylogenetic assignment of sequence reads of putative viral origin, and the assembly and analysis of differentially expressed viral genes. About 15.46% (123 million) of all sequence reads were non-host-related, of which <1% could be classified as archaea, bacteria, or virus. Of these, 18.78% were annotated as virus and comprised a diverse community consistent across experimental treatments. Further, non-host related sequence reads assembled into 56,064 contigs, including 4856 contigs of putative viral origin that featured 43 differentially expressed genes during heat shock. The differentially expressed genes included viral kinases, ubiquitin, and ankyrin repeat proteins (amongst others), which are suggested to help the virus proliferate and inhibit the algal host's antiviral response. Our results suggest that a diverse viral community is associated with coral algal endosymbionts of the genus Symbiodinium, which prompts further research on their ecological role in coral health and resilience.
Evolutionary characterization and transcript profiling of β-tubulin genes in flax (Linum usitatissimum L.) during plant development.

PubMed

Gavazzi, Floriana; Pigna, Gaia; Braglia, Luca; Gianì, Silvia; Breviario, Diego; Morello, Laura

2017-12-08

Microtubules, polymerized from alpha and beta-tubulin monomers, play a fundamental role in plant morphogenesis, determining the cell division plane, the direction of cell expansion and the deposition of cell wall material. During polarized pollen tube elongation, microtubules serve as tracks for vesicular transport and deposition of proteins/lipids at the tip membrane. Such functions are controlled by cortical microtubule arrays. The aim of this study was first to characterize the flax β-tubulin family by sequence and phylogenetic analysis and then to investigate differential expression of β-tubulin genes possibly related to fibre elongation and to flower development. We report the cloning and characterization of the complete flax β-tubulin gene family: exon-intron organization, duplicated gene comparison, phylogenetic analysis and expression pattern during stem and hypocotyl elongation and during flower development. Sequence analysis of the fourteen expressed β-tubulin genes revealed that the recent whole genome duplication of the flax genome was followed by massive retention of duplicated tubulin genes. Expression analysis showed that β-tubulin mRNA profiles gradually changed along with phloem fibre development in both the stem and hypocotyl. In flowers, changes in relative tubulin transcript levels took place at anthesis in anthers, but not in carpels. Phylogenetic analysis supports the origin of extant plant β-tubulin genes from four ancestral genes pre-dating angiosperm separation. Expression analysis suggests that particular tubulin subpopulations are more suitable to sustain different microtubule functions such as cell elongation, cell wall thickening or pollen tube growth. Tubulin genes possibly related to different microtubule functions were identified as candidates for more detailed studies.
Using deep RNA sequencing for the structural annotation of the Laccaria bicolor mycorrhizal transcriptome.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Larsen, P. E.; Trivedi, G.; Sreedasyam, A.

2010-07-06

Accurate structural annotation is important for prediction of function and required for in vitro approaches to characterize or validate the gene expression products. Despite significant efforts in the field, determination of the gene structure from genomic data alone is a challenging and inaccurate process. The ease of acquisition of transcriptomic sequence provides a direct route to identify expressed sequences and determine the correct gene structure. We developed methods to utilize RNA-seq data to correct errors in the structural annotation and extend the boundaries of current gene models using assembly approaches. The methods were validated with a transcriptomic data set derived from the fungus Laccaria bicolor, which develops a mycorrhizal symbiotic association with the roots of many tree species. Our analysis focused on the subset of 1501 gene models that are differentially expressed in the free living vs. mycorrhizal transcriptome and are expected to be important elements related to carbon metabolism, membrane permeability and transport, and intracellular signaling. Of the set of 1501 gene models, 1439 (96%) successfully generated modified gene models in which all error flags were successfully resolved and the sequences aligned to the genomic sequence. The remaining 4% (62 gene models) either had deviations from transcriptomic data that could not be spanned or generated sequence that did not align to genomic sequence. The outcome of this process is a set of high confidence gene models that can be reliably used for experimental characterization of protein function. 69% of expressed mycorrhizal JGI 'best' gene models deviated from the transcript sequence derived by this method. The transcriptomic sequence enabled correction of a majority of the structural inconsistencies and resulted in a set of validated models for 96% of the mycorrhizal genes. The method described here can be applied to improve gene structural annotation in other species, provided that there is a sequenced genome and a set of gene models.
Casein expression in cytotoxic T lymphocytes.

PubMed Central

Grusby, M J; Mitchell, S C; Nabavi, N; Glimcher, L H

1990-01-01

A cDNA that expresses an mRNA restricted to cytotoxic T lymphocytes (CTL) and mammary tissue has been isolated and characterized. The deduced amino acid sequence from this cDNA shows extensive homology with the previously reported amino acid sequence for rat alpha-casein. Indeed, the presence of a six-residue-repeated motif that is specific for rodent alpha-caseins strongly supports the identification of this cDNA as mouse alpha-casein. Northern (RNA) blot analysis of many hematopoietic cell types revealed that this gene is restricted to CTL, being expressed in four of six CTL lines examined. Furthermore, CTL that express this gene were also found to express other members of the casein gene family, such as beta- and kappa-casein. These results suggest that caseins may be important in CTL function, and their potential role in CTL-mediated lysis is discussed. PMID:2395885
A distributed system for fast alignment of next-generation sequencing data.

PubMed

Srimani, Jaydeep K; Wu, Po-Yen; Phan, John H; Wang, May D

2010-12-01

We developed a scalable distributed computing system using the Berkeley Open Interface for Network Computing (BOINC) to align next-generation sequencing (NGS) data quickly and accurately. NGS technology is emerging as a promising platform for gene expression analysis due to its high sensitivity compared to traditional genomic microarray technology. However, despite the benefits, NGS datasets can be prohibitively large, requiring significant computing resources to obtain sequence alignment results. Moreover, as the data and alignment algorithms become more prevalent, it will become necessary to examine the effect of the multitude of alignment parameters on various NGS systems. We validate the distributed software system by (1) computing simple timing results to show the speed-up gained by using multiple computers, (2) optimizing alignment parameters using simulated NGS data, and (3) computing NGS expression levels for a single biological sample using optimal parameters and comparing these expression levels to those of a microarray sample. Results indicate that the distributed alignment system achieves an approximately linear speed-up and correctly distributes sequence data to, and gathers alignment results from, multiple compute clients.
Use of the Fluidigm C1 platform for RNA sequencing of single mouse pancreatic islet cells.

PubMed

Xin, Yurong; Kim, Jinrang; Ni, Min; Wei, Yi; Okamoto, Haruka; Lee, Joseph; Adler, Christina; Cavino, Katie; Murphy, Andrew J; Yancopoulos, George D; Lin, Hsin Chieh; Gromada, Jesper

2016-03-22

This study provides an assessment of the Fluidigm C1 platform for RNA sequencing of single mouse pancreatic islet cells. The system combines microfluidic technology and nanoliter-scale reactions. We sequenced 622 cells, allowing identification of 341 islet cells with high-quality gene expression profiles. The cells clustered into populations of α-cells (5%), β-cells (92%), δ-cells (1%), and pancreatic polypeptide cells (2%). We identified cell-type-specific transcription factors and pathways primarily involved in nutrient sensing and oxidation and cell signaling. Unexpectedly, 281 cells had to be removed from the analysis due to low viability, low sequencing quality, or contamination resulting in the detection of more than one islet hormone. Collectively, we provide a resource for identification of high-quality gene expression datasets to help expand insights into genes and pathways characterizing islet cell types. We reveal limitations in the C1 Fluidigm cell capture process resulting in contaminated cells with altered gene expression patterns. This calls for caution when interpreting single-cell transcriptomics data using the C1 Fluidigm system.
5’-Terminal AUGs in Escherichia coli mRNAs with Shine-Dalgarno Sequences: Identification and Analysis of Their Roles in Non-Canonical Translation Initiation

PubMed Central

Beck, Heather J.; Fleming, Ian M. C.

2016-01-01

Analysis of the Escherichia coli transcriptome identified a unique subset of messenger RNAs (mRNAs) that contain a conventional untranslated leader and Shine-Dalgarno (SD) sequence upstream of the gene’s start codon while also containing an AUG triplet at the mRNA’s 5’-terminus (5’-uAUG). Fusion of the coding sequence specified by the 5’-terminal putative AUG start codon to a lacZ reporter gene, as well as primer extension inhibition assays, reveal that the majority of the 5’-terminal upstream open reading frames (5’-uORFs) tested support some level of lacZ translation, indicating that these mRNAs can function both as leaderless and canonical SD-leadered mRNAs. Although some of the uORFs were expressed at low levels, others were expressed at levels close to that of the respective downstream genes and as high as the naturally leaderless cI mRNA of bacteriophage λ. These 5’-terminal uORFs potentially encode peptides of varying lengths, but their functions, if any, are unknown. In an effort to determine whether expression from the 5’-terminal uORFs impacts expression of the immediately downstream cistron, we examined expression from the downstream coding sequence after mutations were introduced that inhibit efficient 5’-uORF translation. These mutations were found to affect expression from the downstream cistrons to varying degrees, suggesting that some 5’-uORFs may play roles in downstream regulation. Since the 5’-uAUGs found on these conventionally leadered mRNAs can function to bind ribosomes and initiate translation, canonical mRNAs containing 5’-uAUGs should be examined for their potential to function also as leaderless mRNAs. PMID:27467758
QuASAR: quantitative allele-specific analysis of reads.

PubMed

Harvey, Chris T; Moyerbrailean, Gregory A; Davis, Gordon O; Wen, Xiaoquan; Luca, Francesca; Pique-Regi, Roger

2015-04-15

Expression quantitative trait loci (eQTL) studies have discovered thousands of genetic variants that regulate gene expression, enabling a better understanding of the functional role of non-coding sequences. However, eQTL studies are costly, requiring large sample sizes and genome-wide genotyping of each sample. In contrast, analysis of allele-specific expression (ASE) is becoming a popular approach to detect the effect of genetic variation on gene expression, even within a single individual. This is typically achieved by counting the number of RNA-seq reads matching each allele at heterozygous sites and testing the null hypothesis of a 1:1 allelic ratio. In principle, when genotype information is not readily available, it could be inferred from the RNA-seq reads directly. However, there are currently no methods that jointly infer genotypes and conduct ASE inference while considering uncertainty in the genotype calls. We present QuASAR, quantitative allele-specific analysis of reads, a novel statistical learning method for jointly detecting heterozygous genotypes and inferring ASE. The proposed ASE inference step takes into consideration the uncertainty in the genotype calls, while including parameters that model base-call errors in sequencing and allelic over-dispersion. We validated our method with experimental data for which high-quality genotypes are available. Results for an additional dataset with multiple replicates at different sequencing depths demonstrate that QuASAR is a powerful tool for ASE analysis when genotypes are not available. Availability: http://github.com/piquelab/QuASAR. Contact: fluca@wayne.edu or rpique@wayne.edu. Supplementary material is available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
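The conventional ASE test that this abstract describes, counting reads per allele at a heterozygous site and testing a 1:1 allelic ratio, can be sketched as an exact two-sided binomial test. This is only the simple baseline, not QuASAR's own model, which additionally handles genotype uncertainty, base-call error and over-dispersion; the function name and the read counts below are hypothetical.

```python
from math import comb

def binomial_two_sided_p(ref_reads, alt_reads, p=0.5):
    """Exact two-sided binomial p-value for the null allelic ratio p."""
    n = ref_reads + alt_reads
    # Probability of every possible reference-read count under the null.
    probs = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = probs[ref_reads]
    # Sum all outcomes that are at least as unlikely as the observed one;
    # the small tolerance guards against floating-point ties.
    total = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, total)

# Hypothetical site: 15 reads carry the reference allele, 5 the alternate.
print(binomial_two_sided_p(15, 5))
# A perfectly balanced site gives no evidence against the 1:1 ratio.
print(binomial_two_sided_p(10, 10))
```

The sketch enumerates all outcomes, which is adequate for tens or hundreds of reads per site; at very high coverage a library routine would be the safer choice because the individual probabilities can underflow.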
Analysis of global gene expression in Brachypodium distachyon reveals extensive network plasticity in response to abiotic stress.

PubMed

Priest, Henry D; Fox, Samuel E; Rowley, Erik R; Murray, Jessica R; Michael, Todd P; Mockler, Todd C

2014-01-01

Brachypodium distachyon is a close relative of many important cereal crops. Abiotic stress tolerance has a significant impact on productivity of agriculturally important food and feedstock crops. Analysis of the transcriptome of Brachypodium after chilling, high-salinity, drought, and heat stresses revealed diverse differential expression of many transcripts. Weighted Gene Co-Expression Network Analysis revealed 22 distinct gene modules with specific profiles of expression under each stress. Promoter analysis implicated short DNA sequences directly upstream of module members in the regulation of 21 of 22 modules. Functional analysis of module members revealed enrichment in functional terms for 10 of 22 network modules. Analysis of condition-specific correlations between differentially expressed gene pairs revealed extensive plasticity in the expression relationships of gene pairs. Photosynthesis, cell cycle, and cell wall expression modules were down-regulated by all abiotic stresses. Modules that were up-regulated by each abiotic stress fell into diverse and unique gene ontology (GO) categories. This study provides genomics resources and improves our understanding of abiotic stress responses of Brachypodium.
Molecular cloning and expression of the CRISP family of proteins in the boar.

PubMed

Vadnais, Melissa L; Foster, Douglas N; Roberts, Kenneth P

2008-12-01

The family of mammalian cysteine-rich secretory proteins (CRISP) has been well characterized in the rat, mouse, and human. Here we report the molecular cloning and expression analysis of CRISP1, CRISP2, and CRISP3 in the boar. A partial sequence published in the National Center for Biotechnology Information (NCBI) database was used to derive the full-length sequences for CRISP1 and CRISP2 using rapid amplification of cDNA ends. RT-PCR confirmed the expression of these mRNAs in the boar reproductive tract, and real time RT-PCR showed CRISP1 to be highly expressed throughout the epididymis, with CRISP2 highly expressed in the testis. A search of the porcine genomic sequence in the NCBI database identified a BAC (CH242-199E6) encoding the CRISP1 gene. This BAC is derived from porcine Chromosome 7 and is syntenic with the regions of the mouse, rat, and human genomes encoding the CRISP gene family. This BAC was found to encode a third CRISP protein with a predicted amino acid sequence of high similarity to human CRISP3. Using RT-PCR we show that CRISP3 expression in the boar reproductive tract is confined to the prostate. Recombinant porcine (rp) CRISP2 protein was produced and purified. When incubated with capacitated boar sperm, rpCRISP2 induced an acrosome reaction, consistent with its demonstrated ability to alter the activity of calcium channels.
In silico Analysis of 2085 Clones from a Normalized Rat Vestibular Periphery 3′ cDNA Library

PubMed Central

Roche, Joseph P.; Cioffi, Joseph A.; Kwitek, Anne E.; Erbe, Christy B.; Popper, Paul

2005-01-01

The inserts from 2400 cDNA clones isolated from a normalized Rattus norvegicus vestibular periphery cDNA library were sequenced and characterized. The Wackym-Soares vestibular 3′ cDNA library was constructed from the saccular and utricular maculae, the ampullae of all three semicircular canals and Scarpa's ganglia containing the somata of the primary afferent neurons, microdissected from 104 male and female rats. The inserts from 2400 randomly selected clones were sequenced from the 5′ end. Each sequence was analyzed using the BLAST algorithm against the GenBank nonredundant, rat genome, mouse genome and human genome databases to search for high homology alignments. Of the initial 2400 clones, 315 (13%) were found to be of poor quality and did not yield useful information, and therefore were eliminated from the analysis. Of the remaining 2085 sequences, 918 (44%) were found to represent 758 unique genes having useful annotations that were identified in databases within the public domain or in the published literature; these sequences were designated as known characterized sequences. A further 1141 sequences (55%), which aligned with 1011 unique sequences, had no useful annotations and were designated as known but uncharacterized sequences. Of the remaining 26 sequences (1%), 24 aligned with rat genomic sequences, but none matched previously described rat expressed sequence tags or mRNAs. No significant alignment to the rat or human genomic sequences could be found for the remaining 2 sequences. Of the 2085 sequences analyzed, 86% were singletons. The known, characterized sequences were analyzed with the FatiGO online data-mining tool (http://fatigo.bioinfo.cnio.es/) to identify level 5 biological process gene ontology (GO) terms for each alignment and to group alignments with similar or identical GO terms. Numerous genes were identified that have not been previously shown to be expressed in the vestibular system. Further characterization of the novel cDNA sequences may lead to the identification of genes with vestibular-specific functions. Continued analysis of the rat vestibular periphery transcriptome should provide new insights into vestibular function and generate new hypotheses. Physiological studies are necessary to further elucidate the roles of the identified genes and novel sequences in vestibular function. PMID:16103642
Honey bee (Apis mellifera) transferrin-gene structure and the role of ecdysteroids in the developmental regulation of its expression.

PubMed

do Nascimento, Adriana Mendes; Cuvillier-Hot, Virginie; Barchuk, Angel Roberto; Simões, Zilá Luz Paulino; Hartfelder, Klaus

2004-05-01

Social life is prone to invasion by microorganisms, and binding of ferric ions by transferrin is an efficient strategy to restrict their access to iron. In this study, we isolated cDNA and genomic clones encoding an Apis mellifera transferrin (AmTRF) gene. It has an open reading frame (ORF) of 2136 bp spread over nine exons. The deduced protein sequence comprises 686 amino acid residues plus a 26-residue signal sequence, giving a predicted molecular mass of 76 kDa. Comparison of the deduced AmTRF amino acid sequence with known insect transferrins revealed significant similarity extending over the entire sequence. It clusters with monoferric transferrins, with which it shares putative iron-binding residues in the N-terminal lobe. In a functional analysis of AmTRF expression in honey bee development, we monitored its expression profile in the larval and pupal stages. The negative regulation of AmTRF by ecdysteroids deduced from the developmental expression profile was confirmed by experimental treatment of spinning-stage honey bee larvae with 20-hydroxyecdysone, and of fourth-instar larvae with juvenile hormone. A juvenile hormone application to spinning-stage larvae, in contrast, had only a minor effect on AmTRF transcript levels. This is the first study implicating ecdysteroids in the developmental regulation of transferrin expression in an insect species.
A dehydration-inducible gene in the truffle Tuber borchii identifies a novel group of dehydrins

PubMed Central

Abba', Simona; Ghignone, Stefano; Bonfante, Paola

2006-01-01

Background The expressed sequence tag M6G10 was originally isolated in a screen for differentially expressed transcripts during the reproductive stage of the white truffle Tuber borchii. mRNA levels for M6G10 increased dramatically during fruiting body maturation compared to the vegetative mycelial stage. Results Bioinformatics tools, phylogenetic analysis and expression studies were used to support the hypothesis that this sequence, named TbDHN1, is the first dehydrin (DHN)-like coding gene isolated in fungi. Homologs of this gene, all defined as "coding for hypothetical proteins" in public databases, were exclusively found in ascomycetous fungi and in plants. Although complete (or almost complete) fungal genomes and EST collections of some Basidiomycota and Glomeromycota are already available, DHN-like proteins appear to be represented only in Ascomycota. A new and previously uncharacterized conserved signature pattern was identified and proposed to the UniProt database as the main distinguishing feature of this new group of DHNs. Expression studies provide experimental evidence of a transcript induction of TbDHN1 during cellular dehydration. Conclusion Expression pattern and sequence similarities to known plant DHNs indicate that TbDHN1 is the first characterized DHN-like protein in fungi. The high similarity of TbDHN1 to homologous coding sequences implies the existence of a novel fungal/plant group of LEA Class II proteins characterized by a previously undescribed signature pattern. PMID:16512918
Transcriptomic analysis of Prunus domestica undergoing hypersensitive response to plum pox virus infection.

PubMed

Rodamilans, Bernardo; San León, David; Mühlberger, Louisa; Candresse, Thierry; Neumüller, Michael; Oliveros, Juan Carlos; García, Juan Antonio

2014-01-01

Plum pox virus (PPV) infects Prunus trees around the globe, posing serious fruit production problems and causing severe economic losses. One variety of Prunus domestica, named 'Jojo', develops a hypersensitive response to viral infection. Here we compared infected and non-infected samples using next-generation RNA sequencing to characterize the genetic complexity of the viral population in infected samples and to identify genes involved in development of the resistance response. Analysis of viral reads from the infected samples allowed reconstruction of a PPV-D consensus sequence. De novo reconstruction showed a second viral isolate of the PPV-Rec strain. RNA-seq analysis of PPV-infected 'Jojo' trees identified 2,234 and 786 unigenes that were significantly up- or downregulated, respectively (false discovery rate; FDR≤0.01). Expression of genes associated with defense was generally enhanced, while expression of those related to photosynthesis was repressed. Of the total of 3,020 differentially expressed unigenes, 154 were characterized as potential resistance genes, 10 of which were included in the NBS-LRR type. Given their possible role in plant defense, we selected 75 additional unigenes as candidates for further study. The combination of next-generation sequencing and a Prunus variety that develops a hypersensitive response to PPV infection provided an opportunity to study the factors involved in this plant defense mechanism. Transcriptomic analysis presented an overview of the changes that occur during PPV infection as a whole, and identified candidates suitable for further functional characterization.
Two cis elements collaborate to spatially repress transcription from a sea urchin promoter

NASA Technical Reports Server (NTRS)

Frudakis, T. N.; Wilt, F.

1995-01-01

The expression pattern of many territory-specific genes in metazoan embryos is maintained by an active process of negative spatial regulation. However, the mechanism of this strategy of gene regulation is not well understood in any system. Here we show that reporter constructs containing regulatory sequence for the SM30-alpha gene of Strongylocentrotus purpuratus are expressed in a pattern congruent with that of the endogenous SM30 gene(s), largely as a result of active transcriptional repression in cell lineages in which the gene is not normally expressed. Chloramphenicol acetyl transferase assays of deletion constructs from the 2600-bp upstream region showed that repressive elements were present in the region from -1628 to -300. In situ hybridization analysis showed that the spatial fidelity of expression was severely compromised when the region from -1628 to -300 was deleted. Two highly repetitive sequence motifs, (G/A/C)CCCCT and (T/C)(T/A/C)CTTTT(T/A/C), are present in the -1628 to -300 region. Representatives of these elements were analyzed by gel mobility shift experiments and were found to interact specifically with protein in crude nuclear extracts. When oligonucleotides containing either sequence element were co-injected with a correctly regulated reporter as potential competitors, the reporter was expressed in inappropriate cells. When composite oligonucleotides, containing both sequence elements, were fused to a misregulated reporter, the expression of the reporter in inappropriate cells was suppressed. Comparison of composite oligonucleotides with oligonucleotides containing single constituent elements shows that both sequence elements are required for effective spatial regulation. Thus, both individual elements are required, but only a composite element containing both elements is sufficient to function as a tissue-specific repressive element.
ExprAlign - the identification of ESTs in non-model species by alignment of cDNA microarray expression profiles

PubMed Central

2009-01-01

Background Sequence identification of ESTs from non-model species offers distinct challenges particularly when these species have duplicated genomes and when they are phylogenetically distant from sequenced model organisms. For the common carp, an environmental model of aquacultural interest, large numbers of ESTs remained unidentified using BLAST sequence alignment. We have used the expression profiles from large-scale microarray experiments to suggest gene identities. Results Expression profiles from ~700 cDNA microarrays describing responses of 7 major tissues to multiple environmental stressors were used to define a co-expression landscape. This was based on the Pearson correlation coefficient relating each gene with all other genes, from which a network description provided clusters of highly correlated genes as 'mountains'. We show that these contain genes with known identities and genes with unknown identities, and that the correlation constitutes evidence of identity in the latter. This procedure has suggested identities for 522 of 2701 unknown carp EST sequences. We also discriminate several common carp genes and gene isoforms that were not discriminated by BLAST sequence alignment alone. Precision in identification was substantially improved by use of data from multiple tissues and treatments. Conclusion The detailed analysis of co-expression landscapes is a sensitive technique for suggesting an identity for the large number of BLAST unidentified cDNAs generated in EST projects. It is capable of detecting even subtle changes in expression profiles, and thereby of distinguishing genes with a common BLAST identity into different identities. It benefits from the use of multiple treatments or contrasts, and from the large-scale microarray data. PMID:19939286
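The core idea, using Pearson correlation between expression profiles as evidence of identity, can be illustrated with a short Python sketch. This is a simplification under stated assumptions: it replaces the network and 'mountain' clustering described in the abstract with a nearest-known-neighbour lookup, and the function names, the 0.9 threshold, the gene labels and the toy profiles are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / (sx * sy)


def suggest_identities(profiles, known_ids, threshold=0.9):
    """Map each unidentified gene to (best known gene, r) when r >= threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    suggestions = {}
    for gene, profile in profiles.items():
        if gene in known_ids:
            continue
        best_gene, best_r = None, threshold
        for other in known_ids:
            r = pearson(profile, profiles[other])
            if r >= best_r:
                best_gene, best_r = other, r
        if best_gene is not None:
            suggestions[gene] = (best_gene, best_r)
    return suggestions


# Toy data: four hypothetical genes measured across five arrays.
profiles = {
    "known_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "known_B": [5.0, 3.0, 4.0, 1.0, 2.0],
    "est_1": [2.0, 4.0, 6.0, 8.0, 10.5],
    "est_2": [3.0, 3.0, 3.0, 3.0, 3.0],
}
known_ids = {"known_A", "known_B"}
print(suggest_identities(profiles, known_ids))
```

In this toy case only `est_1` receives a suggestion (it tracks `known_A`), while the flat profile `est_2` is left unassigned, which mirrors the abstract's point that precision depends on having informative variation across tissues and treatments.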
Deep sequencing-based transcriptome analysis of Plutella xylostella larvae parasitized by Diadegma semiclausum

PubMed Central

2011-01-01

Background Parasitoid insects manipulate their hosts' physiology by injecting various factors into their host upon parasitization. Transcriptomic methods provide a powerful approach to study insect host-parasitoid interactions at the molecular level. In order to investigate the effects of parasitization by an ichneumonid wasp (Diadegma semiclausum) on the host (Plutella xylostella), the larval transcriptome profile was analyzed using a short-read deep sequencing method (Illumina). Symbiotic polydnaviruses (PDVs) associated with ichneumonid parasitoids, known as ichnoviruses, play significant roles in host immune suppression and developmental regulation. In the current study, D. semiclausum ichnovirus (DsIV) genes expressed in P. xylostella were identified and their sequences compared with other reported PDVs. Five of these genes encode proteins of unknown identity that have not previously been reported. Results De novo assembly of cDNA sequence data generated 172,660 contigs between 100 and 10000 bp in length, with 35% of them > 200 bp in length. Parasitization had significant impacts on expression levels of 928 identified insect host transcripts. Gene ontology data illustrated that the majority of the differentially expressed genes are involved in binding, catalytic activity, and metabolic and cellular processes. In addition, the results show that transcription levels of antimicrobial peptides, such as gloverin, cecropin E and lysozyme, were up-regulated after parasitism. Expression of ichnovirus genes was detected in parasitized larvae with 19 unique sequences identified from five PDV gene families including vankyrin, viral innexin, repeat elements, a cysteine-rich motif, and polar residue rich protein. Vankyrin 1 and repeat element 1 genes showed the highest transcription levels among the DsIV genes. Conclusion This study provides detailed information on differential expression of P. xylostella larval genes following parasitization and on DsIV genes expressed in the host, and also improves our current understanding of this host-parasitoid interaction. PMID:21906285
Monoallelic Gene Expression in Mammals.

PubMed

Chess, Andrew

2016-11-23

Monoallelic expression not due to cis-regulatory sequence polymorphism poses an intriguing problem in epigenetics because it requires the unequal treatment of two segments of DNA that are present in the same nucleus and that can indeed have absolutely identical sequences. Here, I focus on a few recent developments in the field of monoallelic expression that are of particular interest and raise interesting questions for future work. One development is regarding analyses of imprinted genes, in which recent work suggests the possibility that intriguing networks of imprinted genes exist and are important for genetic and physiological studies. Another issue that has been raised in recent years by a number of publications is the question of how skewed allelic expression should be for it to be designated as monoallelic expression and, further, what methods are appropriate or inappropriate for analyzing genomic data to examine allele-specific expression. Perhaps the most exciting recent development in mammalian monoallelic expression is a clever and carefully executed analysis of genetic diversity of autosomal genes subject to random monoallelic expression (RMAE), which provides compelling evidence for distinct evolutionary forces acting on random monoallelically expressed genes.
Genome and Transcriptome Sequencing of the Ostreid herpesvirus 1 From Tomales Bay, California

NASA Astrophysics Data System (ADS)

Burge, C. A.; Langevin, S.; Closek, C. J.; Roberts, S. B.; Friedman, C. S.

2016-02-01

Mass mortalities of larval and seed bivalve molluscs attributed to the Ostreid herpesvirus 1 (OsHV-1) occur globally. OsHV-1 was fully sequenced and characterized as a member of the Family Malacoherpesviridae. Multiple strains of OsHV-1 exist and may vary in virulence, i.e. OsHV-1 µvar. For most global variants of OsHV-1, sequence data is limited to PCR-based sequencing of segments, including two recent genomes. In the United States, OsHV-1 is limited to detection in adjacent embayments in California, Tomales and Drakes bays. Limited DNA sequence data of OsHV-1 infecting oysters in Tomales Bay indicates the virus detected in Tomales Bay is similar but not identical to any one global variant of OsHV-1. In order to better understand both strain variation and virulence of OsHV-1 infecting oysters in Tomales Bay, we used genomic and transcriptomic sequencing. Meta-genomic sequencing (Illumina MiSeq) was conducted from infected oysters (n=4 per year) collected in 2003, 2007, and 2014, where full OsHV-1 genome sequences and low overall microbial diversity were achieved from highly infected oysters. Increased microbial diversity was detected in three of four samples sequenced from 2003, where qPCR based genome copy numbers of OsHV-1 were lower. Expression analysis (SOLiD RNA sequencing) of OsHV-1 genes expressed in oyster larvae at 24 hours post exposure revealed a nearly complete transcriptome, with several highly expressed genes, which are similar to recent transcriptomic analyses of other OsHV-1 variants. Taken together, our results indicate that genome and transcriptome sequencing may be powerful tools in understanding both strain variation and virulence of non-culturable marine viruses.
Expression profiling during arabidopsis/downy mildew interaction reveals a highly-expressed effector that attenuates responses to salicylic acid.

PubMed

Asai, Shuta; Rallapalli, Ghanasyam; Piquerez, Sophie J M; Caillaud, Marie-Cécile; Furzer, Oliver J; Ishaque, Naveed; Wirthmueller, Lennart; Fabro, Georgina; Shirasu, Ken; Jones, Jonathan D G

2014-10-01

Plants have evolved strong innate immunity mechanisms, but successful pathogens evade or suppress plant immunity via effectors delivered into the plant cell. Hyaloperonospora arabidopsidis (Hpa) causes downy mildew on Arabidopsis thaliana, and a genome sequence is available for isolate Emoy2. Here, we exploit the availability of genome sequences for Hpa and Arabidopsis to measure gene-expression changes in both Hpa and Arabidopsis simultaneously during infection. Using a high-throughput cDNA tag sequencing method, we reveal expression patterns of Hpa predicted effectors and Arabidopsis genes in compatible and incompatible interactions, and promoter elements associated with Hpa genes expressed during infection. By resequencing Hpa isolate Waco9, we found it evades Arabidopsis resistance gene RPP1 through deletion of the cognate recognized effector ATR1. Arabidopsis salicylic acid (SA)-responsive genes including PR1 were activated not only at early time points in the incompatible interaction but also at late time points in the compatible interaction. By histochemical analysis, we found that Hpa suppresses SA-inducible PR1 expression, specifically in the haustoriated cells into which host-translocated effectors are delivered, but not in non-haustoriated adjacent cells. Finally, we found a highly-expressed Hpa effector candidate that suppresses responsiveness to SA. As this approach can be easily applied to host-pathogen interactions for which both host and pathogen genome sequences are available, this work opens the door towards transcriptome studies in infection biology that should help unravel pathogen infection strategies and the mechanisms by which host defense responses are overcome.
Minimal doses of a sequence-optimized transgene mediate high-level and long-term EPO expression in vivo: challenging CpG-free gene design.

PubMed

Kosovac, D; Wild, J; Ludwig, C; Meissner, S; Bauer, A P; Wagner, R

2011-02-01

Advanced gene delivery techniques can be combined with rational gene design to further improve the efficiency of plasmid DNA (pDNA)-mediated transgene expression in vivo. Herein, we analyzed the influence of intragenic sequence modifications on transgene expression in vitro and in vivo using murine erythropoietin (mEPO) as a transgene model. A single electro-gene transfer of an RNA- and codon-optimized mEPOopt gene into skeletal muscle resulted in a 3- to 4-fold increase of mEPO production sustained for >1 year and triggered a significant increase in hematocrit and hemoglobin without causing adverse effects. mEPO expression and hematologic levels were significantly lower when using comparable amounts of the wild type (mEPOwt) gene and only marginal effects were induced by mEPOΔCpG lacking intragenic CpG dinucleotides, even at high pDNA amounts. Corresponding with these observations, in vitro analysis of transfected cells revealed a 2- to 3-fold increased (mEPOopt) and 50% decreased (mEPOΔCpG) erythropoietin expression compared with mEPOwt, respectively. RNA analyses demonstrated that the specific design of the transgene sequence influenced expression levels by modulating transcriptional activity and nuclear plus cytoplasmic RNA amounts rather than translation. In sum, whereas CpG depletion negatively interferes with efficient expression in postmitotic tissues, mEPOopt doses <0.5 μg were sufficient to trigger optimal long-term hematologic effects encouraging the use of sequence-optimized transgenes to further reduce effective pDNA amounts.
Whole genome co-expression analysis of soybean cytochrome P450 genes identifies nodulation-specific P450 monooxygenases

PubMed Central

2010-01-01

Background Cytochrome P450 monooxygenases (P450s) catalyze oxidation of various substrates using oxygen and NAD(P)H. Plant P450s are involved in the biosynthesis of primary and secondary metabolites performing diverse biological functions. The recent availability of the soybean genome sequence allows us to identify and analyze soybean putative P450s at a genome scale. Co-expression analysis using an available soybean microarray and Illumina sequencing data provides clues for functional annotation of these enzymes. This approach is based on the assumption that genes that have similar expression patterns across a set of conditions may have a functional relationship. Results We have identified a total number of 332 full-length P450 genes and 378 pseudogenes from the soybean genome. From the full-length sequences, 195 genes belong to A-type, which could be further divided into 20 families. The remaining 137 genes belong to non-A type P450s and are classified into 28 families. A total of 178 probe sets were found to correspond to P450 genes on the Affymetrix soybean array. Out of these probe sets, 108 represented single genes. Using the 28 publicly available microarray libraries that contain organ-specific information, some tissue-specific P450s were identified. Similarly, stress responsive soybean P450s were retrieved from 99 microarray soybean libraries. We also utilized Illumina transcriptome sequencing technology to analyze the expressions of all 332 soybean P450 genes. This dataset contains total RNAs isolated from nodules, roots, root tips, leaves, flowers, green pods, apical meristem, mock-inoculated and Bradyrhizobium japonicum-infected root hair cells. The tissue-specific expression patterns of these P450 genes were analyzed and the expression of a representative set of genes were confirmed by qRT-PCR. We performed the co-expression analysis on many of the 108 P450 genes on the Affymetrix arrays. 
First we confirmed that CYP93C5 (an isoflavone synthase gene) is co-expressed with several genes encoding isoflavonoid-related metabolic enzymes. We then focused on nodulation-induced P450s and found that CYP728H1 was co-expressed with the genes involved in phenylpropanoid metabolism. Similarly, CYP736A34 was highly co-expressed with lipoxygenase, lectin and CYP83D1, all of which are involved in root and nodule development. Conclusions The genome scale analysis of P450s in soybean reveals many unique features of these important enzymes in this crop although the functions of most of them are largely unknown. Gene co-expression analysis proves to be a useful tool to infer the function of uncharacterized genes. Our work presented here could provide important leads toward functional genomics studies of soybean P450s and their regulatory network through the integration of reverse genetics, biochemistry, and metabolic profiling tools. The identification of nodule-specific P450s and their further exploitation may help us to better understand the intriguing process of soybean and rhizobium interaction. PMID:21062474
Discovery of cashmere goat (Capra hircus) microRNAs in skin and hair follicles by Solexa sequencing.

PubMed

Yuan, Chao; Wang, Xiaolong; Geng, Rongqing; He, Xiaolin; Qu, Lei; Chen, Yulin

2013-07-28

MicroRNAs (miRNAs) are a large family of endogenous, non-coding RNAs, about 22 nucleotides long, which regulate gene expression through sequence-specific base pairing with target mRNAs. Extensive studies have shown that miRNA expression in the skin changes remarkably during distinct stages of the hair cycle in humans, mice, goats and sheep. In this study, the skin tissues were harvested from the three stages of hair follicle cycling (anagen, catagen and telogen) in a fibre-producing goat breed. In total, 63,109,004 raw reads were obtained by Solexa sequencing and 61,125,752 clean reads remained for the small RNA digitalisation analysis. This resulted in the identification of 399 conserved miRNAs; among these, 326 miRNAs were expressed in all three follicular cycling stages, whereas 3, 12 and 11 miRNAs were specifically expressed in anagen, catagen, and telogen, respectively. We also identified 172 potential novel miRNAs by Mireap, 36 miRNAs were expressed in all three cycling stages, whereas 23, 29 and 44 miRNAs were specifically expressed in anagen, catagen, and telogen, respectively. The expression level of five arbitrarily selected miRNAs was analyzed by quantitative PCR, and the results indicated that the expression patterns were consistent with the Solexa sequencing results. Gene Ontology and KEGG pathway analyses indicated that five major biological pathways (Metabolic pathways, Pathways in cancer, MAPK signalling pathway, Endocytosis and Focal adhesion) accounted for 23.08% of target genes among 278 biological functions, indicating that these pathways are likely to play significant roles during hair cycling. During all hair cycle stages of cashmere goats, a large number of conserved and novel miRNAs were identified through a high-throughput sequencing approach. This study enriches the Capra hircus miRNA databases and provides a comprehensive miRNA transcriptome profile in the skin of goats during the hair follicle cycle.
Protein sequence analysis, cloning, and expression of flammutoxin, a pore-forming cytolysin from Flammulina velutipes. Maturation of dimeric precursor to monomeric active form by carboxyl-terminal truncation.

PubMed

Tomita, Toshio; Mizumachi, Yoshihiro; Chong, Kang; Ogawa, Kanako; Konishi, Norihide; Sugawara-Tomita, Noriko; Dohmae, Naoshi; Hashimoto, Yohichi; Takio, Koji

2004-12-24

Flammutoxin (FTX), a 31-kDa pore-forming cytolysin from Flammulina velutipes, is specifically expressed during the fruiting body formation. We cloned and expressed the cDNA encoding a 272-residue protein with an identical N-terminal sequence with that of FTX but failed to obtain hemolytically active protein. This, together with the presence of multiple FTX family proteins in the mushroom, prompted us to determine the complete primary structure of FTX by protein sequence analysis. The N-terminal 72 and C-terminal 107 residues were sequenced by Edman degradation of the fragments generated from the alkylated FTX by enzymatic digestions with Achromobacter protease I or Staphylococcus aureus V8 protease and by chemical cleavages with CNBr, hydroxylamine, or 1% formic acid. The central part of FTX was sequenced with a surface-adhesive 7-kDa fragment, which was generated by a tryptic digestion of FTX and recovered by rinsing the wall of a test tube with 6 M guanidine HCl. The 7-kDa peptide was cleaved with 12 M HCl, thermolysin, or S. aureus V8 protease to produce smaller peptides for sequence analysis. As a result, FTX consisted of 251 residues, and protein and nucleotide sequences were in accord except for the lack of the initial Met and the C-terminal 20 residues in protein. Recombinant FTX (rFTX) with or without the C-terminal 20 residues (rFTX271 or rFTX251, respectively) was prepared to study the maturation process of FTX. Like natural FTX, rFTX251 existed as a monomer in solution and assembled into an SDS-stable, ring-shaped pore complex on human erythrocytes, causing hemolysis. In contrast, rFTX271, existing as a dimer in solution, bound to the cells but failed to form pore complex. The dimeric rFTX271 was converted to hemolytically active monomers upon the cleavage between Lys(251) and Met(252) by trypsin.
The UEA Small RNA Workbench: A Suite of Computational Tools for Small RNA Analysis.

PubMed

Mohorianu, Irina; Stocks, Matthew Benedict; Applegate, Christopher Steven; Folkes, Leighton; Moulton, Vincent

2017-01-01

RNA silencing (RNA interference, RNAi) is a complex, highly conserved mechanism mediated by short, typically 20-24 nt in length, noncoding RNAs known as small RNAs (sRNAs). They act as guides for the sequence-specific transcriptional and posttranscriptional regulation of target mRNAs and play a key role in the fine-tuning of biological processes such as growth, response to stresses, or defense mechanism. High-throughput sequencing (HTS) technologies are employed to capture the expression levels of sRNA populations. The processing of the resulting big data sets facilitated the computational analysis of the sRNA patterns of variation within biological samples such as time point experiments, tissue series or various treatments. Rapid technological advances enable larger experiments, often with biological replicates leading to a vast amount of raw data. As a result, in this fast-evolving field, the existing methods for sequence characterization and prediction of interaction (regulatory) networks periodically require adapting or in extreme cases, a complete redesign to cope with the data deluge. In addition, the presence of numerous tools focused only on particular steps of HTS analysis hinders the systematic parsing of the results and their interpretation. The UEA small RNA Workbench (v1-4), described in this chapter, provides a user-friendly, modular, interactive analysis in the form of a suite of computational tools designed to process and mine sRNA datasets for interesting characteristics that can be linked back to the observed phenotypes. First, we show how to preprocess the raw sequencing output and prepare it for downstream analysis. Then we review some quality checks that can be used as a first indication of sources of variability between samples. 
Next we show how the Workbench can provide a comparison of the effects of different normalization approaches on the distributions of expression, enhanced methods for the identification of differentially expressed transcripts and a summary of their corresponding patterns. Finally we describe individual analysis tools such as PAREsnip, for the analysis of PARE (degradome) data or CoLIde for the identification of sRNA loci based on their expression patterns and the visualization of the results using the software. We illustrate the features of the UEA sRNA Workbench on Arabidopsis thaliana and Homo sapiens datasets.
The transcriptome of Lutzomyia longipalpis (Diptera: Psychodidae) male reproductive organs.

PubMed

Azevedo, Renata V D M; Dias, Denise B S; Bretãs, Jorge A C; Mazzoni, Camila J; Souza, Nataly A; Albano, Rodolpho M; Wagner, Glauber; Davila, Alberto M R; Peixoto, Alexandre A

2012-01-01

It has been suggested that genes involved in the reproductive biology of insect disease vectors are potential targets for future alternative methods of control. Little is known about the molecular biology of reproduction in phlebotomine sand flies and there is no information available concerning genes that are expressed in male reproductive organs of Lutzomyia longipalpis, the main vector of American visceral leishmaniasis and a species complex. We generated 2678 high quality ESTs ("Expressed Sequence Tags") of L. longipalpis male reproductive organs that were grouped in 1391 non-redundant sequences (1136 singlets and 255 clusters). BLAST analysis revealed that only 57% of these sequences share similarity with a L. longipalpis female EST database. Although no more than 36% of the non-redundant sequences showed similarity to protein sequences deposited in databases, more than half of them presented the best-match hits with mosquito genes. Gene ontology analysis identified subsets of genes involved in biological processes such as protein biosynthesis and DNA replication, which are probably associated with spermatogenesis. A number of non-redundant sequences were also identified as putative male reproductive gland proteins (mRGPs), also known as male accessory gland protein genes (Acps). The transcriptome analysis of L. longipalpis male reproductive organs is one step further in the study of the molecular basis of the reproductive biology of this important species complex. It has allowed the identification of genes potentially involved in spermatogenesis as well as putative mRGPs sequences, which have been studied in many insect species because of their effects on female post-mating behavior and physiology and their potential role in sexual selection and speciation. These data open a number of new avenues for further research in the molecular and evolutionary reproductive biology of sand flies.
The Transcriptome of Lutzomyia longipalpis (Diptera: Psychodidae) Male Reproductive Organs

PubMed Central

Bretãs, Jorge A. C.; Mazzoni, Camila J.; Souza, Nataly A.; Albano, Rodolpho M.; Wagner, Glauber; Davila, Alberto M. R.; Peixoto, Alexandre A.

2012-01-01

Background It has been suggested that genes involved in the reproductive biology of insect disease vectors are potential targets for future alternative methods of control. Little is known about the molecular biology of reproduction in phlebotomine sand flies and there is no information available concerning genes that are expressed in male reproductive organs of Lutzomyia longipalpis, the main vector of American visceral leishmaniasis and a species complex. Methods/Principal Findings We generated 2678 high quality ESTs (“Expressed Sequence Tags”) of L. longipalpis male reproductive organs that were grouped in 1391 non-redundant sequences (1136 singlets and 255 clusters). BLAST analysis revealed that only 57% of these sequences share similarity with a L. longipalpis female EST database. Although no more than 36% of the non-redundant sequences showed similarity to protein sequences deposited in databases, more than half of them presented the best-match hits with mosquito genes. Gene ontology analysis identified subsets of genes involved in biological processes such as protein biosynthesis and DNA replication, which are probably associated with spermatogenesis. A number of non-redundant sequences were also identified as putative male reproductive gland proteins (mRGPs), also known as male accessory gland protein genes (Acps). Conclusions The transcriptome analysis of L. longipalpis male reproductive organs is one step further in the study of the molecular basis of the reproductive biology of this important species complex. It has allowed the identification of genes potentially involved in spermatogenesis as well as putative mRGPs sequences, which have been studied in many insect species because of their effects on female post-mating behavior and physiology and their potential role in sexual selection and speciation. These data open a number of new avenues for further research in the molecular and evolutionary reproductive biology of sand flies. PMID:22496818
Identification and Characterization of MicroRNAs in Ovary and Testis of Nile Tilapia (Oreochromis niloticus) by Using Solexa Sequencing Technology

PubMed Central

Zhou, Yi; Yu, Fan; Gao, Yun; Luo, Yongju; Tang, Zhanyang; Guo, Zhongbao; Guo, Enyan; Gan, Xi; Zhang, Ming; Zhang, Yaping

2014-01-01

MicroRNAs (miRNAs) are endogenous non-coding small RNAs which play important roles in the regulation of gene expression by cleaving or inhibiting the translation of target gene transcripts. Among them, some specific miRNAs show regulatory activities in gonad development via translational control. In order to further understand the role of miRNA-mediated posttranscriptional regulation in Nile tilapia (Oreochromis niloticus) ovary and testis, two small RNA libraries of Nile tilapia were sequenced by Solexa small RNA deep sequencing methods. A total of 9,731,431 and 8,880,497 raw reads, representing 5,407,800 and 4,396,281 unique sequences, were obtained from the sexually mature ovaries and testes, respectively. After comparing the small RNA sequences with the Rfam database, 1,432,210 reads in ovaries and 984,146 reads in testes were matched to the genome sequence of Nile tilapia. Bioinformatic analysis identified 764 mature miRNAs, 209 miRNA-5p and 202 miRNA-3p in the two libraries, of which 525 known miRNAs are expressed in both the ovary and testis of Nile tilapia. Compared with the testis, the miR-727, miR-129 and miR-29 families were highly expressed in the tilapia ovary. Additionally, the miR-132, miR-212, miR-33a and miR-135b families showed significantly higher expression in testis than in ovary. Furthermore, the expression patterns of the miRNAs were analyzed in different developmental stages of the gonad. The results showed that different expression patterns were observed during development of testis and ovary. In addition, the identification and characterization of differentially expressed miRNAs in the ovaries and testes of Nile tilapia provides important information on the role of miRNA in the regulation of ovarian and testicular development and function. These data will help facilitate studies on the regulation of miRNAs during teleost reproduction. PMID:24466258
RAP: RNA-Seq Analysis Pipeline, a new cloud-based NGS web application

PubMed Central

2015-01-01

Background The study of RNA has been dramatically improved by the introduction of Next Generation Sequencing platforms allowing massive and cheap sequencing of selected RNA fractions, also providing information on strand orientation (RNA-Seq). The complexity of transcriptomes and of their regulative pathways makes RNA-Seq one of the most complex fields of NGS applications, addressing several aspects of the expression process (e.g. identification and quantification of expressed genes and transcripts, alternative splicing and polyadenylation, fusion genes and trans-splicing, post-transcriptional events, etc.). Moreover, the huge volume of data generated by NGS platforms introduces unprecedented computational and technological challenges to efficiently analyze and store sequence data and results. Methods In order to provide researchers with an effective and friendly resource for analyzing RNA-Seq data, we present here RAP (RNA-Seq Analysis Pipeline), a cloud computing web application implementing a complete but modular analysis workflow. This pipeline integrates both state-of-the-art bioinformatics tools for RNA-Seq analysis and in-house developed scripts to offer to the user a comprehensive strategy for data analysis. RAP is able to perform quality checks (adopting FastQC and NGS QC Toolkit), identify and quantify expressed genes and transcripts (with Tophat, Cufflinks and HTSeq), detect alternative splicing events (using SpliceTrap) and chimeric transcripts (with ChimeraScan). This pipeline is also able to identify splicing junctions and constitutive or alternative polyadenylation sites (implementing custom analysis modules) and call for statistically significant differences in gene and transcript expression, splicing pattern and polyadenylation site usage (using Cuffdiff2 and DESeq). Results Through a user friendly web interface, the RAP workflow can be suitably customized by the user and it is automatically executed on our cloud computing environment. 
This strategy allows access to bioinformatics tools and computational resources without specific bioinformatics and IT skills. RAP provides a set of tabular and graphical results that can be helpful to browse, filter and export analyzed data, according to the user needs. PMID:26046471
Comparison and correlation of Simple Sequence Repeats distribution in genomes of Brucella species

PubMed Central

Kiran, Jangampalli Adi Pradeep; Chakravarthi, Veeraraghavulu Praveen; Kumar, Yellapu Nanda; Rekha, Somesula Swapna; Kruti, Srinivasan Shanthi; Bhaskar, Matcha

2011-01-01

Computational genomics is one of the important tools to understand the distribution of closely related genomes including simple sequence repeats (SSRs) in an organism, which gives valuable information regarding genetic variations. The central objective of the present study was to screen the SSRs distributed in coding and non-coding regions among different human Brucella species which are involved in a range of pathological disorders. Computational analysis of the SSRs in Brucella indicates few deviations from expected random models. Statistical analysis also reveals that tri-nucleotide SSRs are overrepresented and tetranucleotide SSRs underrepresented in Brucella genomes. From the data, it can be suggested that overrepresented tri-nucleotide SSRs in genomic and coding regions might be responsible for the generation of functional variation of proteins expressed, which in turn may lead to different pathogenicity, virulence determinants, stress response genes, transcription regulators and host adaptation proteins of Brucella genomes. Abbreviations SSRs - Simple Sequence Repeats, ORFs - Open Reading Frames. PMID:21738309
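Screening a sequence for perfect SSRs of a given unit length can be sketched with a backreference regex, as below. This is an illustrative Python example only, not the tool or thresholds used in the study: the function name `find_ssrs`, the minimum repeat count and the toy sequences are assumptions, and imperfect (interrupted) repeats are not handled.

```python
import re


def find_ssrs(seq, unit_len, min_repeats):
    """Return (start, unit, repeat_count) for perfect tandem repeats.

    unit_len    -- repeat unit size (e.g. 3 for tri-nucleotide SSRs)
    min_repeats -- minimum number of tandem copies (assumed >= 2)
    """
    seq = seq.upper()
    # Group 1 is the unit; \1{n,} requires at least n further copies right after it.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        # Skip units that are themselves repeats of a shorter motif,
        # e.g. "AAA" is a mononucleotide run, not a true tri-nucleotide SSR.
        if any(
            unit == unit[:d] * (unit_len // d)
            for d in range(1, unit_len)
            if unit_len % d == 0
        ):
            continue
        hits.append((m.start(), unit, len(m.group(0)) // unit_len))
    return hits


# Toy sequence (made up): four tandem copies of CAG flanked by TT.
print(find_ssrs("TTCAGCAGCAGCAGTT", unit_len=3, min_repeats=3))
```

Counting hits per unit length across coding and non-coding intervals, and comparing those counts with a shuffled-sequence expectation, would be one way to approach the over- and underrepresentation comparison the abstract describes.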
Sequencing and functional analysis of the nifENXorf1orf2 gene cluster of Herbaspirillum seropedicae.

PubMed

Klassen, G; Pedrosa, F O; Souza, E M; Yates, M G; Rigo, L U

1999-12-01

A 5.1-kb DNA fragment from the nifHDK region of H. seropedicae was isolated and sequenced. Sequence analysis showed the presence of nifENXorf1orf2 but nifTY were not present. No nif or consensus promoter was identified. Furthermore, orf1 expression occurred only under nitrogen-fixing conditions and no promoter activity was detected between nifK and nifE, suggesting that these genes are expressed from the upstream nifH promoter and are parts of a unique nif operon. Mutagenesis studies indicate that nifN was essential for nitrogenase activity whereas nifXorf1orf2 were not. High homology between the C-terminal region of the NifX and NifB proteins from H. seropedicae was observed. Since the NifX and NifY proteins are important for FeMo cofactor (FeMoco) synthesis, we propose that alternative proteins with similar activities exist in H. seropedicae.

Expression and functional analysis of the lysine decarboxylase and copper amine oxidase genes from the endophytic fungus Colletotrichum gloeosporioides ES026.

PubMed

Zhang, Xiangmei; Wang, Zhangqian; Jan, Saad; Yang, Qian; Wang, Mo

2017-06-05

Huperzine A (HupA) isolated from Huperzia serrata is an important compound used to treat Alzheimer's disease (AD). Recently, HupA was reported in various endophytic fungi, with Colletotrichum gloeosporioides ES026 previously isolated from H. serrata shown to produce HupA. In this study, we performed next-generation sequencing and de novo RNA sequencing of C. gloeosporioides ES026 to elucidate the molecular functions, biological processes, and biochemical pathways of these unique sequences. Gene ontology and Kyoto Encyclopedia of Genes and Genomes assignments allowed annotation of lysine decarboxylase (LDC) and copper amine oxidase (CAO) for their conversion of L-lysine to 5-aminopentanal during HupA biosynthesis. Additionally, we constructed a stable, high-yielding HupA-expression system resulting from the overexpression of CgLDC and CgCAO from the HupA-producing endophytic fungus C. gloeosporioides ES026 in Escherichia coli. Quantitative reverse transcription polymerase chain reaction analysis confirmed CgLDC and CgCAO expression, and quantitative determination of HupA levels was assessed by liquid chromatography high-resolution mass spectrometry, which revealed that elevated expression of CgLDC and CgCAO produced higher yields of HupA than those derived from C. gloeosporioides ES026. These results revealed CgLDC and CgCAO involvement in HupA biosynthesis and their key role in regulating HupA content in C. gloeosporioides ES026.
Uncovering microRNA-mediated response to SO2 stress in Arabidopsis thaliana by deep sequencing.

PubMed

Li, Lihong; Xue, Meizhao; Yi, Huilan

2016-10-05

Sulfur dioxide (SO2) is a major air pollutant and has significant impacts on plants. MicroRNAs (miRNAs) are a class of gene expression regulators that play important roles in response to environmental stresses. In this study, deep sequencing was used for genome-wide identification of miRNAs and their expression profiles in response to SO2 stress in Arabidopsis thaliana shoots. A total of 27 conserved miRNAs and 5 novel miRNAs were found to be differentially expressed under SO2 stress. qRT-PCR analysis showed mostly negative correlation between miRNA accumulation and target gene mRNA abundance, suggesting regulatory roles of these miRNAs during SO2 exposure. The target genes of SO2-responsive miRNAs encode transcription factors and proteins that regulate auxin signaling and stress response, and the miRNA-mediated suppression of these genes could improve plant resistance to SO2 stress. Promoter sequence analysis of genes encoding SO2-responsive miRNAs showed that stress-responsive and phytohormone-related cis-regulatory elements occurred frequently, providing additional evidence of the involvement of miRNAs in adaptation to SO2 stress. This study represents a comprehensive expression profiling of SO2-responsive miRNAs in Arabidopsis and broadens our perspective on the ubiquitous regulatory roles of miRNAs under stress conditions. Copyright © 2016 Elsevier B.V. All rights reserved.
Genome-Wide Identification of Regulatory Sequences Undergoing Accelerated Evolution in the Human Genome.

PubMed

Dong, Xinran; Wang, Xiao; Zhang, Feng; Tian, Weidong

2016-10-01

Accelerated evolution of regulatory sequences can alter the expression pattern of target genes and cause phenotypic changes. In this study, we used DNase I hypersensitive sites (DHSs) to annotate putative regulatory sequences in the human genome, and conducted a genome-wide analysis of the effects of accelerated evolution on regulatory sequences. Working under the assumption that local ancient repeat elements of DHSs are under neutral evolution, we discovered that ∼0.44% of DHSs are under accelerated evolution (ace-DHSs). We found that ace-DHSs tend to be more active than background DHSs, and are strongly associated with epigenetic marks of active transcription. The target genes of ace-DHSs are significantly enriched in neuron-related functions, and their expression levels are positively selected in the human brain. Thus, these lines of evidence strongly suggest that accelerated evolution of regulatory sequences plays an important role in the evolution of human-specific phenotypes. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Finding similar nucleotide sequences using network BLAST searches.

PubMed

Ladunga, Istvan

2009-06-01

The Basic Local Alignment Search Tool (BLAST) is a keystone of bioinformatics due to its performance and user-friendliness. Beginner and intermediate users will learn how to design and submit blastn and Megablast searches on the Web pages at the National Center for Biotechnology Information. We map nucleic acid sequences to genomes, find identical or similar mRNA, expressed sequence tag, and noncoding RNA sequences, and run Megablast searches, which are much faster than blastn. Understanding results is assisted by taxonomy reports, genomic views, and multiple alignments. We interpret expected frequency thresholds, biological significance, and statistical significance. Weak hits provide no evidence, but hints for further analyses. We find genes that may code for homologous proteins by translated BLAST. We reduce false positives by filtering out low-complexity regions. Parsed BLAST results can be integrated into analysis pipelines. Links in the output connect to Entrez, PUBMED, structural, sequence, interaction, and expression databases. This facilitates integration with a wide spectrum of biological knowledge.
Analysis of global gene expression profiles to identify differentially expressed genes critical for embryo development in Brassica rapa.

PubMed

Zhang, Yu; Peng, Lifang; Wu, Ya; Shen, Yanyue; Wu, Xiaoming; Wang, Jianbo

2014-11-01

Embryo development represents a crucial developmental period in the life cycle of flowering plants. To gain insights into the genetic programs that control embryo development in Brassica rapa L., RNA sequencing technology was used to perform transcriptome profiling analysis of B. rapa developing embryos. The results generated 42,906,229 sequence reads aligned with 32,941 genes. In total, 27,760, 28,871, 28,384, and 25,653 genes were identified from embryos at globular, heart, early cotyledon, and mature developmental stages, respectively, and analysis between stages revealed a subset of stage-specific genes. We next investigated 9,884 differentially expressed genes with more than fivefold changes in expression and false discovery rate ≤ 0.001 from three adjacent-stage comparisons; 1,514, 3,831, and 6,633 genes were detected between globular and heart stage embryo libraries, heart stage and early cotyledon stage, and early cotyledon and mature stage, respectively. Large numbers of genes related to cellular process, metabolism process, response to stimulus, and biological process were expressed during the early and middle stages of embryo development. Fatty acid biosynthesis, biosynthesis of secondary metabolites, and photosynthesis-related genes were expressed predominantly in embryos at the middle stage. Genes for lipid metabolism and storage proteins were highly expressed in the middle and late stages of embryo development. We also identified 911 transcription factor genes that show differential expression across embryo developmental stages. These results increase our understanding of the complex molecular and cellular events during embryo development in B. rapa and provide a foundation for future studies on other oilseed crops.
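The selection rule quoted in the abstract above (more than fivefold change and false discovery rate ≤ 0.001) can be written as a small filter. This is a sketch under stated assumptions: the tuple layout of `records`, the function names, and the handling of zero expression values are hypothetical and are not taken from the paper, which does not describe its implementation.

```python
def fold_change(expr_a, expr_b):
    """Symmetric fold change (always >= 1) between two positive expression values."""
    return max(expr_a, expr_b) / min(expr_a, expr_b)

def select_degs(records, min_fold=5.0, max_fdr=0.001):
    """Keep gene IDs whose fold change exceeds min_fold and whose FDR is at most max_fdr.

    records: iterable of (gene_id, expr_stage1, expr_stage2, fdr) tuples
    (hypothetical layout). Genes with a non-positive expression value are
    skipped in this sketch, because their fold change is undefined.
    """
    selected = []
    for gene_id, expr_a, expr_b, fdr in records:
        if expr_a <= 0 or expr_b <= 0:
            continue
        if fold_change(expr_a, expr_b) > min_fold and fdr <= max_fdr:
            selected.append(gene_id)
    return selected
```

Running the filter once per adjacent-stage comparison (globular vs. heart, heart vs. early cotyledon, early cotyledon vs. mature) would mirror the three comparisons reported in the abstract.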
Regulation of gene expression in the mammalian eye and its relevance to eye disease.

PubMed

Scheetz, Todd E; Kim, Kwang-Youn A; Swiderski, Ruth E; Philp, Alisdair R; Braun, Terry A; Knudtson, Kevin L; Dorrance, Anne M; DiBona, Gerald F; Huang, Jian; Casavant, Thomas L; Sheffield, Val C; Stone, Edwin M

2006-09-26

We used expression quantitative trait locus mapping in the laboratory rat (Rattus norvegicus) to gain a broad perspective of gene regulation in the mammalian eye and to identify genetic variation relevant to human eye disease. Of >31,000 gene probes represented on an Affymetrix expression microarray, 18,976 exhibited sufficient signal for reliable analysis and at least 2-fold variation in expression among 120 F(2) rats generated from an SR/JrHsd x SHRSP intercross. Genome-wide linkage analysis with 399 genetic markers revealed significant linkage with at least one marker for 1,300 probes (alpha = 0.001; estimated empirical false discovery rate = 2%). Both contiguous and noncontiguous loci were found to be important in regulating mammalian eye gene expression. We investigated one locus of each type in greater detail and identified putative transcription-altering variations in both cases. We found an inserted cREL binding sequence in the 5' flanking sequence of the Abca4 gene associated with an increased expression level of that gene, and we found a mutation of the gene encoding thyroid hormone receptor beta2 associated with a decreased expression level of the gene encoding short-wavelength sensitive opsin (Opn1sw). In addition to these positional studies, we performed a pairwise analysis of gene expression to identify genes that are regulated in a coordinated manner and used this approach to validate two previously undescribed genes involved in the human disease Bardet-Biedl syndrome. These data and analytical approaches can be used to facilitate the discovery of additional genes and regulatory elements involved in human eye disease.
Using Poisson mixed-effects model to quantify transcript-level gene expression in RNA-Seq.

PubMed

Hu, Ming; Zhu, Yu; Taylor, Jeremy M G; Liu, Jun S; Qin, Zhaohui S

2012-01-01

RNA sequencing (RNA-Seq) is a powerful new technology for mapping and quantifying transcriptomes using ultra high-throughput next-generation sequencing technologies. Using deep sequencing, gene expression levels of all transcripts including novel ones can be quantified digitally. Although extremely promising, the massive amounts of data generated by RNA-Seq, substantial biases and uncertainty in short read alignment pose challenges for data analysis. In particular, large base-specific variation and between-base dependence make simple approaches, such as those that use averaging to normalize RNA-Seq data and quantify gene expressions, ineffective. In this study, we propose a Poisson mixed-effects (POME) model to characterize base-level read coverage within each transcript. The underlying expression level is included as a key parameter in this model. Since the proposed model is capable of incorporating base-specific variation as well as between-base dependence that affect read coverage profile throughout the transcript, it can lead to improved quantification of the true underlying expression level. POME can be freely downloaded at http://www.stat.purdue.edu/~yuzhu/pome.html. Contact: yuzhu@purdue.edu; zhaohui.qin@emory.edu. Supplementary data are available at Bioinformatics online.
Discovery, characterization and expression of a novel zebrafish gene, znfr, important for notochord formation.

PubMed

Xu, Yan; Zou, Peng; Liu, Yao; Deng, Fengjiao

2010-06-01

Genes specifically expressed in the notochord may be crucial for proper notochord development. Using the digital differential display program offered by the National Center for Biotechnology Information, we identified a novel EST sequence from a zebrafish ovary library (No. XM_701450). The full-length cDNA of this transcript was cloned by performing 3' and 5'-RACE and was further confirmed by PCR and sequencing. The resulting 614 bp gene was found to encode a novel 94 amino acid protein that did not share significant homology with any other known protein. Characterization of the genomic sequence revealed that the gene spanned 4.9 kb and was composed of four exons and three introns. RT-PCR gene expression analysis revealed that our gene of interest was expressed in ovary, kidney, brain, mature oocytes and during the early stages of embryogenesis. During embryonic development, znfr mRNA was found to be expressed in the embryonic shield, chordamesoderm and the vacuolated notochord cells by in situ hybridization. Based on this information, we hypothesize that this novel gene is an important maternal factor required for zebrafish notochord formation during early embryonic development. We have thus named this gene znfr (zebrafish notochord formation related).
Nucleotide sequencing analysis of a LEU gene of Candida maltosa which complements leuB mutation of Escherichia coli and leu2 mutation of Saccharomyces cerevisiae.

PubMed

Takagi, M; Kobayashi, N; Sugimoto, M; Fujii, T; Watari, J; Yano, K

1987-01-01

The expression of a LEU gene from Candida maltosa (designated as C-LEU2) isolated previously (Kawamura et al. 1983) was shown to be regulated, when transferred into Saccharomyces cerevisiae, by leucine and threonine in the medium, as in the case of LEU2 gene of S. cerevisiae. The coding region together with the regulatory region was subcloned and the nucleotide sequence was determined. When the sequence of the coding region was compared with that of LEU2, the homology was 72% for base pairs and 76% for deduced amino acids. Comparison of the regulatory region of C-LEU2 with those of LEU1 and LEU2 suggested a few short consensus sequences which are involved in regulation of gene expression by leucine and threonine in the medium.
Droplet barcoding for single cell transcriptomics applied to embryonic stem cells

PubMed Central

Klein, Allon M; Mazutis, Linas; Akartuna, Ilke; Tallapragada, Naren; Veres, Adrian; Li, Victor; Peshkin, Leonid; Weitz, David A; Kirschner, Marc W

2015-01-01

Summary It has long been the dream of biologists to map gene expression at the single cell level. With such data one might track heterogeneous cell sub-populations, and infer regulatory relationships between genes and pathways. Recently, RNA sequencing has achieved single cell resolution. What is limiting is an effective way to routinely isolate and process large numbers of individual cells for quantitative in-depth sequencing. We have developed a high-throughput droplet-microfluidic approach for barcoding the RNA from thousands of individual cells for subsequent analysis by next-generation sequencing. The method shows a surprisingly low noise profile and is readily adaptable to other sequencing-based assays. We analyzed mouse embryonic stem cells, revealing in detail the population structure and the heterogeneous onset of differentiation after LIF withdrawal. The reproducibility of these high-throughput single cell data allowed us to deconstruct cell populations and infer gene expression relationships. PMID:26000487
Mapping RNA-seq Reads with STAR

PubMed Central

Dobin, Alexander; Gingeras, Thomas R.

2015-01-01

Mapping of large sets of high-throughput sequencing reads to a reference genome is one of the foundational steps in RNA-seq data analysis. The STAR software package performs this task with high levels of accuracy and speed. In addition to detecting annotated and novel splice junctions, STAR is capable of discovering more complex RNA sequence arrangements, such as chimeric and circular RNA. STAR can align spliced sequences of any length with moderate error rates providing scalability for emerging sequencing technologies. STAR generates output files that can be used for many downstream analyses such as transcript/gene expression quantification, differential gene expression, novel isoform reconstruction, signal visualization, and so forth. In this unit we describe computational protocols that produce various output files, use different RNA-seq datatypes, and utilize different mapping strategies. STAR is Open Source software that can be run on Unix, Linux or Mac OS X systems. PMID:26334920
Mapping RNA-seq Reads with STAR.

PubMed

Dobin, Alexander; Gingeras, Thomas R

2015-09-03

Mapping of large sets of high-throughput sequencing reads to a reference genome is one of the foundational steps in RNA-seq data analysis. The STAR software package performs this task with high levels of accuracy and speed. In addition to detecting annotated and novel splice junctions, STAR is capable of discovering more complex RNA sequence arrangements, such as chimeric and circular RNA. STAR can align spliced sequences of any length with moderate error rates, providing scalability for emerging sequencing technologies. STAR generates output files that can be used for many downstream analyses such as transcript/gene expression quantification, differential gene expression, novel isoform reconstruction, and signal visualization. In this unit, we describe computational protocols that produce various output files, use different RNA-seq datatypes, and utilize different mapping strategies. STAR is open source software that can be run on Unix, Linux, or Mac OS X systems. Copyright © 2015 John Wiley & Sons, Inc.
Genome-wide transcriptional analysis of two soybean genotypes under dehydration and rehydration conditions

PubMed Central

2013-01-01

Background Soybean is an important crop that provides valuable proteins and oils for human use. Because soybean growth and development is extremely sensitive to water deficit, quality and crop yields are severely impacted by drought stress. In the face of limited water resources, drought-responsive genes are therefore of interest. Identification and analysis of dehydration- and rehydration-inducible differentially expressed genes (DEGs) would not only aid elucidation of molecular mechanisms of stress response, but also enable improvement of crop stress tolerance via gene transfer. Using Digital Gene Expression Tag profiling (DGE), a new technique based on Illumina sequencing, we analyzed expression profiles between two soybean genotypes to identify drought-responsive genes. Results Two soybean genotypes—drought-tolerant Jindou21 and drought-sensitive Zhongdou33—were subjected to dehydration and rehydration conditions. For analysis of DEGs under dehydration conditions, 20 cDNA libraries were generated from roots and leaves at two different time points under well-watered and dehydration conditions. We also generated eight libraries for analysis under rehydration conditions. Sequencing of the 28 libraries produced 25,000–33,000 unambiguous tags, which were mapped to reference sequences for annotation of expressed genes. Many genes exhibited significant expression differences among the libraries. DEGs in the drought-tolerant genotype were identified by comparison of DEGs among treatments and genotypes. In Jindou21, 518 and 614 genes were differentially expressed under dehydration in leaves and roots, respectively, with 24 identified both in leaves and roots. The main functional categories enriched in these DEGs were metabolic process, response to stresses, plant hormone signal transduction, protein processing, and plant-pathogen interaction pathway; the associated genes primarily encoded transcription factors, protein kinases, and other regulatory proteins. 
The seven most significantly expressed (|log2 ratio| ≥ 8) genes (Glyma15g03920, Glyma05g02470, Glyma15g15010, Glyma05g09070, Glyma06g35630, Glyma08g12590, and Glyma11g16000) are more likely to determine drought stress tolerance. The expression patterns of eight randomly selected genes were confirmed by quantitative RT-PCR; the results of qRT-PCR analysis agreed with transcriptional profile data for 96 out of 128 (75%) data points. Conclusions Many soybean genes were differentially expressed between drought-tolerant and drought-sensitive genotypes. Based on GO functional annotation and pathway enrichment analysis, some of these genes encoded transcription factors, protein kinases, and other regulatory proteins. The seven most significant DEGs are candidates for improving soybean drought tolerance. These findings will be helpful for analysis and elucidation of molecular mechanisms of drought tolerance; they also provide a basis for cultivating new varieties of drought-tolerant soybean. PMID:24093224
Metatranscriptome sequence analysis reveals diel periodicity of microbial community gene expression in the ocean's interior

NASA Astrophysics Data System (ADS)

Vislova, A.; Aylward, F.; Sosa, O.; DeLong, E.

2016-02-01

Previous work has revealed diel periodicity of gene expression in key metabolic pathways in both autotrophic and heterotrophic microbes in the surface ocean. In this study, we investigated patterns of diel periodicity of gene expression in depth profiles (25, 75, 125 and 250 meters). We postulated that microbial diel transcriptional signals would be increasingly dampened with depth, and that the timing of peak expression of specific transcripts would be shifted in time between depths, in accordance with depth-dependent diel light variability. Bacterioplankton were sampled from four depths every four hours at station ALOHA (22° 45' N 158° W) over 2 days. RNA was extracted from cells preserved on filters, converted to cDNA, and sequenced on the Illumina platform. Surprisingly, harmonic regression analysis revealed an increasing proportion of genes with diel periodic expression patterns with increasing depth between 25 and 125 meters. At 250 meters, the proportion of genes exhibiting diel expression patterns decreased by an order of magnitude compared to the photic zone. Community composition, functional gene categories, and diel patterns of gene expression were significantly different between the photic zone and 250 meter samples. The signals driving diel periodic gene expression in microbes at 250 meters are under further investigation. These data are now beginning to provide a better understanding of the tempo and mode of microbial dynamics among specific taxa, throughout the ocean's interior.
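The harmonic regression mentioned in the abstract above can be sketched for a single transcript as a fit of y(t) = m + a·cos(ωt) + b·sin(ωt) with ω = 2π/24 h. The function name and return values are hypothetical, and the closed-form projection used here is an assumption that only matches the least-squares fit when samples are evenly spaced and span a whole number of periods (as with sampling every four hours over two days). The abstract does not say how periodicity was tested for significance, and no such test is included.

```python
import math

def fit_diel_harmonic(times_h, values, period_h=24.0):
    """Estimate mean level, amplitude and peak time of a single-harmonic fit.

    Assumes evenly spaced samples covering whole periods, so the cosine and
    sine regressors are orthogonal and the least-squares coefficients reduce
    to simple projections.
    """
    n = len(values)
    omega = 2.0 * math.pi / period_h
    mean = sum(values) / n
    a = 2.0 / n * sum(v * math.cos(omega * t) for t, v in zip(times_h, values))
    b = 2.0 / n * sum(v * math.sin(omega * t) for t, v in zip(times_h, values))
    amplitude = math.hypot(a, b)
    # Time of day (hours) at which the fitted curve reaches its maximum.
    peak_h = (math.atan2(b, a) / omega) % period_h
    return mean, amplitude, peak_h
```

Comparing `amplitude` across depths would correspond to the dampening hypothesis in the abstract, and comparing `peak_h` across depths to the hypothesized shift in timing of peak expression.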
Identification and validation of differentially expressed transcripts by RNA-sequencing of formalin-fixed, paraffin-embedded (FFPE) lung tissue from patients with Idiopathic Pulmonary Fibrosis.

PubMed

Vukmirovic, Milica; Herazo-Maya, Jose D; Blackmon, John; Skodric-Trifunovic, Vesna; Jovanovic, Dragana; Pavlovic, Sonja; Stojsic, Jelena; Zeljkovic, Vesna; Yan, Xiting; Homer, Robert; Stefanovic, Branko; Kaminski, Naftali

2017-01-12

Idiopathic Pulmonary Fibrosis (IPF) is a lethal lung disease of unknown etiology. A major limitation in transcriptomic profiling of lung tissue in IPF has been a dependence on snap-frozen fresh tissues (FF). In this project we sought to determine whether genome scale transcript profiling using RNA Sequencing (RNA-Seq) could be applied to archived Formalin-Fixed Paraffin-Embedded (FFPE) IPF tissues. We isolated total RNA from 7 IPF and 5 control FFPE lung tissues and performed 50 base pair paired-end sequencing on Illumina HiSeq 2000. TopHat2 was used to map sequencing reads to the human genome. On average ~62 million reads (53.4% of ~116 million reads) were mapped per sample. 4,131 genes were differentially expressed between IPF and controls (1,920 increased and 2,211 decreased; FDR < 0.05). We compared our results to differentially expressed genes calculated from a previously published dataset generated from FF tissues analyzed on Agilent microarrays (GSE47460). The overlap of differentially expressed genes was very high (760 increased and 1,413 decreased, FDR < 0.05). Only 92 differentially expressed genes changed in opposite directions. Pathway enrichment analysis performed using MetaCore confirmed numerous IPF relevant genes and pathways including extracellular remodeling, TGF-beta, and WNT. Gene network analysis of MMP7, a highly differentially expressed gene in both datasets, revealed the same canonical pathways and gene network candidates in RNA-Seq and microarray data. For validation by NanoString nCounter® we selected 35 genes that had a fold change of 2 in at least one dataset (10 discordant, 10 significantly differentially expressed in one dataset only and 15 concordant genes). High concordance of fold change and FDR was observed for each type of the samples (FF vs FFPE) with both microarrays (r = 0.92) and RNA-Seq (r = 0.90) and the number of discordant genes was reduced to four. 
Our results demonstrate that RNA sequencing of RNA obtained from archived FFPE lung tissues is feasible. The results obtained from FFPE tissue are highly comparable to FF tissues. The ability to perform RNA-Seq on archived FFPE IPF tissues should greatly enhance the availability of tissue biopsies for research in IPF.
Molecular Characterization of Bombyx mori Cytoplasmic Polyhedrosis Virus Genome Segment 4

PubMed Central

Ikeda, Keiko; Nagaoka, Sumiharu; Winkler, Stefan; Kotani, Kumiko; Yagi, Hiroaki; Nakanishi, Kae; Miyajima, Shigetoshi; Kobayashi, Jun; Mori, Hajime

2001-01-01

The complete nucleotide sequence of the genome segment 4 (S4) of Bombyx mori cytoplasmic polyhedrosis virus (BmCPV) was determined. The 3,259-nucleotide sequence contains a single long open reading frame which spans nucleotides 14 to 3187 and which is predicted to encode a protein with a molecular mass of about 130 kDa. Western blot analysis showed that S4 encodes BmCPV protein VP3, which is one of the outer components of the BmCPV virion. Sequence analysis of the deduced amino acid sequence of BmCPV VP3 revealed possible sequence homology with proteins from rice ragged stunt virus (RRSV) S2, Nilaparvata lugens reovirus S4, and Fiji disease fijivirus S4. This may suggest that plant reoviruses originated from insect viruses and that RRSV emerged more recently than other plant reoviruses. A chimeric protein consisting of BmCPV VP3 and green fluorescent protein (GFP) was constructed and expressed with BmCPV polyhedrin using a baculovirus expression vector. The VP3-GFP chimera was incorporated into BmCPV polyhedra and released under alkaline conditions. The results indicate that specific interactions occur between BmCPV polyhedrin and VP3 which might facilitate BmCPV virion occlusion into the polyhedra. PMID:11134312
DEApp: an interactive web interface for differential expression analysis of next generation sequence data.

PubMed

Li, Yan; Andrade, Jorge

2017-01-01

A growing trend in the biomedical community is the use of Next Generation Sequencing (NGS) technologies in genomics research. The complexity of downstream differential expression (DE) analysis is however still challenging, as it requires sufficient computer programming and command-line knowledge. Furthermore, researchers often need to evaluate and visualize interactively the effect of using differential statistical and error models, assess the impact of selecting different parameters and cutoffs, and finally explore the overlapping consensus of cross-validated results obtained with different methods. This represents a bottleneck that slows down or impedes the adoption of NGS technologies in many labs. We developed DEApp, an interactive and dynamic web application for differential expression analysis of count-based NGS data. This application enables model selection, parameter tuning, cross validation and visualization of results in a user-friendly interface. DEApp enables labs with no access to full-time bioinformaticians to exploit the advantages of NGS applications in biomedical research. This application is freely available at https://yanli.shinyapps.io/DEApp and https://gallery.shinyapps.io/DEApp.
Identification and Characterization of a Cis-Encoded Antisense RNA Associated with the Replication Process of Salmonella enterica Serovar Typhi

PubMed Central

Dadzie, Isaac; Xu, Shungao; Ni, Bin; Zhang, Xiaolei; Zhang, Haifang; Sheng, Xiumei; Xu, Huaxi; Huang, Xinxiang

2013-01-01

Antisense RNAs that originate from the complementary strand of protein coding genes are involved in the regulation of gene expression in all domains of life. In bacteria, some of these antisense RNAs are transcriptional noise while others play a vital role in adapting the cell to changing environmental conditions. By deep sequencing analysis of the transcriptome of Salmonella enterica serovar Typhi, a partial RNA sequence encoded in cis to the dnaA gene was revealed. Northern blot and RACE analysis confirmed the transcription of this antisense RNA, which was expressed mostly in the stationary phase of bacterial growth and also under iron limitation and osmotic stress. Pulse expression analysis showed that overexpression of the antisense RNA resulted in a significant increase in the mRNA levels of dnaA, which will ultimately enhance their translation. Our findings have revealed that the antisense RNA of dnaA is indeed transcribed not merely as a by-product of the cell's transcription machinery but plays a vital role as far as stability of dnaA mRNA is concerned. PMID:23637809
The heptanucleotide motif GAGACGC is a key component of a cis-acting promoter element that is critical for SnSAG1 expression in Sarcocystis neurona.

PubMed

Gaji, Rajshekhar Y; Howe, Daniel K

2009-07-01

The apicomplexan parasite Sarcocystis neurona undergoes a complex process of intracellular development, during which many genes are temporally regulated. The described study was undertaken to begin identifying the basic promoter elements that control gene expression in S. neurona. Sequence analysis of the 5'-flanking region of five S. neurona genes revealed a conserved heptanucleotide motif GAGACGC that is similar to the WGAGACG motif described upstream of multiple genes in Toxoplasma gondii. The promoter region for the major surface antigen gene SnSAG1, which contains three heptanucleotide motifs within 135 bases of the transcription start site, was dissected by functional analysis using a dual luciferase reporter assay. These analyses revealed that a minimal promoter fragment containing all three motifs was sufficient to drive reporter molecule expression, with the presence and orientation of the 5'-most heptanucleotide motif being absolutely critical for promoter function. Further studies should help to identify additional sequence elements important for promoter function and for controlling gene expression during intracellular development by this apicomplexan pathogen.
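The motif search described in the abstract above (locating GAGACGC in a 5'-flanking region and noting its orientation) can be sketched as a simple two-strand scan. The function names and the coordinate convention (the last base of the supplied upstream sequence is taken as position -1 relative to the transcription start site) are assumptions for illustration; the abstract does not describe the software the authors used.

```python
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq))

def find_motif(upstream, motif="GAGACGC"):
    """Return sorted (position, strand) hits for the motif in a 5'-flanking sequence.

    position is the coordinate of the leftmost base of the hit, counted so
    that the final base of `upstream` is -1 (hypothetical convention).
    strand is "+" for the motif as written and "-" for its reverse complement.
    """
    upstream = upstream.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", revcomp(motif))):
        start = upstream.find(pattern)
        while start != -1:
            hits.append((start - len(upstream), strand))
            start = upstream.find(pattern, start + 1)
    return sorted(hits)
```

Restricting `upstream` to the 135 bases before the transcription start site would correspond to the SnSAG1 promoter window mentioned in the abstract.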
Evidence for Phex haploinsufficiency in murine X-linked hypophosphatemia.

PubMed

Wang, L; Du, L; Ecarot, B

1999-04-01

Mutations in the PHEX gene (phosphate-regulating gene with homology to endopeptidases on the X-chromosome) are responsible for X-linked hypophosphatemia (HYP). We previously reported the full-length coding sequence of murine Phex cDNA and provided evidence of Phex expression in bone and tooth. Here, we report the cloning of the entire 3.5-kb 3'UTR of the Phex gene, yielding a total of 6248 bp for the Phex transcript. Southern blot and RT-PCR analyses revealed that the 3' end of the coding sequence and the 3'UTR of the Phex gene, spanning exons 16 to 22, are deleted in Hyp, the mouse model for HYP. Northern blot analysis of bone revealed lack of expression of stable Phex mRNA from the mutant allele and expression of Phex transcripts from the wild-type allele in Hyp heterozygous females. Expression of the Phex protein in heterozygotes was confirmed by Western analysis with antibodies raised against a COOH-terminal peptide of the mouse Phex protein. Taken together, these results indicate that the dominant pattern of Hyp inheritance in mice is due to Phex haploinsufficiency.

Cell-Free Expression and In Situ Immobilization of Parasite Proteins from Clonorchis sinensis for Rapid Identification of Antigenic Candidates

PubMed Central

Ju, Jung Won; Kim, Ho-Cheol; Shin, Hyun-Il; Kim, Yu Jung; Kim, Dong-Myung

2015-01-01

Progress towards genetic sequencing of human parasites has provided the groundwork for a post-genomic approach to develop novel antigens for the diagnosis and treatment of parasite infections. To fully utilize the genomic data, however, high-throughput methodologies are required for functional analysis of the proteins encoded in the genomic sequences. In this study, we investigated cell-free expression and in situ immobilization of parasite proteins as a novel platform for the discovery of antigenic proteins. PCR-amplified parasite DNA was immobilized on microbeads that were also functionalized to capture synthesized proteins. When the microbeads were incubated in a reaction mixture for cell-free synthesis, proteins expressed from the microbead-immobilized DNA were instantly immobilized on the same microbeads, providing a physical linkage between the genetic information and encoded proteins. This approach of in situ expression and isolation enables streamlined recovery and analysis of cell-free synthesized proteins and also allows facile identification of the genes coding antigenic proteins through direct PCR of the microbead-bound DNA. PMID:26599101
The visual pigments of the West Indian manatee (Trichechus manatus).

PubMed

Newman, Lucy A; Robinson, Phyllis R

2006-10-01

Manatees are unique among the fully aquatic marine mammals in that they are herbivorous creatures, with hunting strategies restricted to grazing on sea-grasses. Since the other groups of (carnivorous) marine mammals have been found to possess various visual system adaptations to their unique visual environments, it was of interest to investigate the visual capability of the manatee. Previous work, both behavioral (Griebel & Schmid, 1996), and ultrastructural (Cohen, Tucker, & Odell, 1982; unpublished work cited by Griebel & Peichl, 2003), has suggested that manatees have the dichromatic color vision typical of diurnal mammals. This study uses molecular techniques to investigate the cone visual pigments of the manatee. The aim was to clone and sequence cone opsins from the retina, and, if possible, express and reconstitute functional visual pigments to perform spectral analysis. Both LWS and SWS cone opsins were cloned and sequenced from manatee retinae, which, upon expression and spectral analysis, had λmax values of 555 and 410 nm, respectively. The expression of both the LWS and SWS cone opsin in the manatee retina is unique as both pinnipeds and cetaceans only express a cone LWS opsin.
Alu-derived cis-element regulates tumorigenesis-dependent gastric expression of GASDERMIN B (GSDMB).

PubMed

Komiyama, Hiromitsu; Aoki, Aya; Tanaka, Shigekazu; Maekawa, Hiroshi; Kato, Yoriko; Wada, Ryo; Maekawa, Takeo; Tamura, Masaru; Shiroishi, Toshihiko

2010-02-01

GASDERMIN B (GSDMB) belongs to the novel gene family GASDERMIN (GSDM). All GSDM family members are located in amplicons, genomic regions often amplified during cancer development. Given that GSDMB is highly expressed in cancerous cells and the locus resides in an amplicon, GSDMB may be involved in cancer development and/or progression. However, only limited information is available on GSDMB expression in tissues, normal and cancerous, from cancer patients. Furthermore, the molecular mechanisms that regulate GSDMB expression in gastric tissues are poorly understood. We investigated the spatiotemporal expression patterns of GSDMB in gastric cancer patients and the 5' regulatory sequences upstream of GSDMB. GSDMB was not expressed in the majority of normal gastric-tissue samples, and the expression level was very low in the few normal samples with GSDMB expression. Most pre-cancer samples showed moderate GSDMB expression, and most cancerous samples showed augmented GSDMB expression. Analysis of genome sequences revealed that an Alu element resides in the 5' region upstream of GSDMB. Reporter assays using intact, deleted, and mutated Alu elements clearly showed that this Alu element positively regulates GSDMB expression and that a putative IKZF binding motif in this element is crucial to upregulate GSDMB expression.
Global Gene Expression Patterns and Somatic Mutations in Sporadic Intracranial Aneurysms.

PubMed

Li, Zhili; Tan, Haibin; Shi, Yi; Huang, Guangfu; Wang, Zhenyu; Liu, Ling; Yin, Cheng; Wang, Qi

2017-04-01

High-throughput sequencing technologies can expand our understanding of the pathologic basis of intracranial aneurysms (IAs). Our study aimed to decipher the gene expression signature and genetic factors associated with IAs. We determined the gene expression levels of 3 cases of IAs by RNA sequencing. Bioinformatics analysis was conducted to identify the differentially expressed genes (DEGs) and uncover their biological function. In addition, whole genome sequencing was performed on an additional 6 cases of IAs to detect the potential somatic alterations in DEGs. Compared with the normal arterial tissue, 1709 genes were differentially expressed in IAs arterial tissue. The most significantly up-regulated gene and down-regulated gene, H19 and HIST1H3J, may be essential for tumorigenesis of IAs. The hub protein IKBKG in the protein-protein interaction network was probably involved in the inflammation process in aneurysms. Another 2 hub proteins, ACTB and MKI67IP, as well as up-regulated genes, might be abnormally activated in aneurysms and involved in the pathogenesis of IAs. Further whole genome sequencing and filtering yielded 4 candidate somatic single nucleotide variants including MUC3B, and BLM may be involved in the pathogenesis of IAs. Even so, our results do not support the hypothesis that somatic mutations occurred in the DEGs. Two-dimensional genomic data from transcriptome and whole genome sequencing indicated that no somatic mutations occurred in DEGs. In addition, 3 DEGs (IKBKG, ACTB, and MKI67IP) and 2 mutant genes (MUC3B and BLM) were essential in IAs. Copyright © 2017 Elsevier Inc. All rights reserved.
Construction and Evaluation of Normalized cDNA Libraries Enriched with Full-Length Sequences for Rapid Discovery of New Genes from Sisal (Agave sisalana Perr.) Different Developmental Stages

PubMed Central

Zhou, Wen-Zhao; Zhang, Yan-Mei; Lu, Jun-Ying; Li, Jun-Feng

2012-01-01

To provide a resource of sisal-specific expressed sequence data and facilitate this powerful approach in new gene research, the preparation of normalized cDNA libraries enriched with full-length sequences is necessary. Four libraries were produced with RNA pooled from Agave sisalana multiple tissues to increase efficiency of normalization and maximize the number of independent genes by SMART™ method and the duplex-specific nuclease (DSN). This procedure kept the proportion of full-length cDNAs in the subtracted/normalized libraries and dramatically enhanced the discovery of new genes. Sequencing of 3875 cDNA clones of libraries revealed 3320 unigenes with an average insert length about 1.2 kb, indicating that the non-redundancy of libraries was about 85.7%. These unigene functions were predicted by comparing their sequences to functional domain databases and extensively annotated with Gene Ontology (GO) terms. Comparative analysis of sisal unigenes and other plant genomes revealed that four putative MADS-box genes and knotted-like homeobox (knox) gene were obtained from a total of 1162 full-length transcripts. Furthermore, real-time PCR showed that the characteristics of their transcripts mainly depended on the tight expression regulation of a number of genes during the leaf and flower development. Analysis of individual library sequence data indicated that the pooled-tissue approach was highly effective in discovering new genes and preparing libraries for efficient deep sequencing. PMID:23202944
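
The non-redundancy figure quoted in this abstract follows directly from the two counts it reports. The short check below is only a worked calculation; the variable names are illustrative, and it assumes non-redundancy is defined as unigenes divided by sequenced clones, which is consistent with the reported 85.7%.

```python
# Counts as reported in the abstract.
clones_sequenced = 3875
unigenes = 3320

# Assumed definition: share of sequenced clones that are distinct unigenes.
non_redundancy = 100 * unigenes / clones_sequenced
redundant_clones = clones_sequenced - unigenes

print(f"{non_redundancy:.1f}%")  # 85.7%
print(redundant_clones)          # 555
```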
Analysis of Microbe-Associated Molecular Pattern-Responsive Synthetic Promoters with the Parsley Protoplast System.

PubMed

Kanofsky, Konstantin; Lehmeyer, Mona; Schulze, Jutta; Hehl, Reinhard

2016-01-01

Plants recognize pathogens by microbe-associated molecular patterns (MAMPs) and subsequently induce an immune response. The regulation of gene expression during the immune response depends largely on cis-sequences conserved in promoters of MAMP-responsive genes. These cis-sequences can be analyzed by constructing synthetic promoters linked to a reporter gene and by testing these constructs in transient expression systems. Here, the use of the parsley (Petroselinum crispum) protoplast system for analyzing MAMP-responsive synthetic promoters is described. The synthetic promoter consists of four copies of a potential MAMP-responsive cis-sequence cloned upstream of a minimal promoter and the uidA reporter gene. The reporter plasmid contains a second reporter gene, which is constitutively expressed and hence eliminates the requirement of a second plasmid used as a transformation control. The reporter plasmid is transformed into parsley protoplasts that are elicited by the MAMP Pep25. The MAMP responsiveness is validated by comparing the reporter gene activity from MAMP-treated and untreated cells and by normalizing reporter gene activity using the constitutively expressed reporter gene.
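
The normalization step described in this abstract can be sketched as follows. This is a minimal illustration, assuming each sample yields one reading for the uidA reporter and one for the constitutively expressed control reporter; the function names and the numbers are hypothetical and do not come from the paper.

```python
def normalized_activity(reporter_signal, control_signal):
    """Divide the MAMP-responsive reporter signal by the signal of the
    constitutively expressed reporter on the same plasmid."""
    if control_signal <= 0:
        raise ValueError("control reporter signal must be positive")
    return reporter_signal / control_signal


def fold_induction(treated, untreated):
    """Ratio of normalized activity in MAMP-treated vs. untreated cells.

    Each argument is a (reporter_signal, control_signal) pair.
    """
    return normalized_activity(*treated) / normalized_activity(*untreated)


# Hypothetical readings in arbitrary units.
treated = (1200.0, 300.0)    # Pep25-elicited protoplasts
untreated = (200.0, 400.0)   # mock-treated protoplasts
print(fold_induction(treated, untreated))  # 8.0
```

Dividing by the control reporter first means that differences in transformation efficiency between samples cancel out before the treated and untreated values are compared.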
Cloning and expression of a cDNA coding for catalase from zebrafish (Danio rerio).

PubMed

Ken, C F; Lin, C T; Wu, J L; Shaw, J F

2000-06-01

A full-length complementary DNA (cDNA) clone encoding a catalase was amplified by the rapid amplification of cDNA ends-polymerase chain reaction (RACE-PCR) technique from zebrafish (Danio rerio) mRNA. Nucleotide sequence analysis of this cDNA clone revealed that it comprised a complete open reading frame coding for 526 amino acid residues and that it had a molecular mass of 59 654 Da. The deduced amino acid sequence showed high similarity with the sequences of catalase from swine (86.9%), mouse (85.8%), rat (85%), human (83.7%), fruit fly (75.6%), nematode (71.1%), and yeast (58.6%). The amino acid residues for secondary structures are apparently conserved as they are present in other mammal species. Furthermore, the coding region of zebrafish catalase was introduced into an expression vector, pET-20b(+), and transformed into Escherichia coli expression host BL21(DE3)pLysS. A 60-kDa active catalase protein was expressed and detected by Coomassie blue staining as well as activity staining on polyacrylamide gel following electrophoresis.
Transcriptomic Analysis of Paeonia delavayi Wild Population Flowers to Identify Differentially Expressed Genes Involved in Purple-Red and Yellow Petal Pigmentation

PubMed Central

Wang, Yan; Li, Kui; Zheng, Baoqiang; Miao, Kun

2015-01-01

Tree peony (Paeonia suffruticosa Andrews) is a very famous traditional ornamental plant in China. P. delavayi is a species endemic to Southwest China that has aroused great interest from researchers as a precious genetic resource for flower color breeding. However, the current understanding of the molecular mechanisms of flower pigmentation in this plant is limited, hindering the genetic engineering of novel flower color in tree peonies. In this study, we conducted a large-scale transcriptome analysis based on Illumina HiSeq sequencing of cDNA libraries generated from yellow and purple-red P. delavayi petals. A total of 90,202 unigenes were obtained by de novo assembly, with an average length of 721 nt. Using Blastx, 44,811 unigenes (49.68%) were found to have significant similarity to accessions in the NR, NT, and Swiss-Prot databases. We also examined COG, GO and KEGG annotations to better understand the functions of these unigenes. Further analysis of the two digital transcriptomes revealed that 6,855 unigenes were differentially expressed between yellow and purple-red flower petals, with 3,430 up-regulated and 3,425 down-regulated. According to the RNA-Seq data and qRT-PCR analysis, we proposed that four up-regulated key structural genes, including F3H, DFR, ANS and 3GT, might play an important role in purple-red petal pigmentation, while high co-expression of THC2'GT, CHI and FNS II ensures the accumulation of pigments contributing to the yellow color. We also found 50 differentially expressed transcription factors that might be involved in flavonoid biosynthesis. This study is the first to report genetic information for P. delavayi. The large number of gene sequences produced by transcriptome sequencing and the candidate genes identified using pathway mapping and expression profiles will provide a valuable resource for future association studies aimed at better understanding the molecular mechanisms underlying flower pigmentation in tree peonies. PMID:26267644
Cloning, sequencing and expression in MEL cells of a cDNA encoding the mouse ribosomal protein S5.

PubMed

Vanegas, N; Castañeda, V; Santamaría, D; Hernández, P; Schvartzman, J B; Krimer, D B

1997-06-05

We describe the isolation and characterization of a cDNA encoding the mouse S5 ribosomal protein. It was isolated from a MEL (murine erythroleukemia) cell cDNA library by differential hybridization as a down regulated sequence during HMBA-induced differentiation. Northern series analysis showed that S5 mRNA expression is reduced 5-fold throughout the differentiation process. The mouse S5 mRNA is 760 bp long and encodes for a 204 amino acid protein with 94% homology with the human and rat S5.
High throughput protein production screening

DOEpatents

Beernink, Peter T [Walnut Creek, CA; Coleman, Matthew A [Oakland, CA; Segelke, Brent W [San Ramon, CA

2009-09-08

Methods, compositions, and kits for the cell-free production and analysis of proteins are provided. The invention allows for the production of proteins from prokaryotic sequences or eukaryotic sequences, including human cDNAs using PCR and IVT methods and detecting the proteins through fluorescence or immunoblot techniques. This invention can be used to identify optimized PCR and IVT conditions, codon usages and mutations. The methods are readily automated and can be used for high throughput analysis of protein expression levels, interactions, and functional states.
Sequence divergence in the 3'-untranslated region has an effect on the subfunctionalization of duplicate genes.

PubMed

Tong, Ying; Zheng, Kang; Zhao, Shufang; Xiao, Guanxiu; Luo, Chen

2012-11-01

Recent studies demonstrated that sequence divergence in both the transcriptional regulatory region and the coding region contributes to the subfunctionalization of duplicate genes. However, whether sequence divergence in the 3'-untranslated region (3'-UTR) has an impact on the subfunctionalization of duplicate genes remains unclear. Here, we identified two diverging duplicate vsx1 (visual system homeobox-1) loci in goldfish, named vsx1A1 and vsx1A2. Phylogenetic analysis suggests that vsx1A1 and vsx1A2 may have arisen from a duplication of vsx1 after the separation of goldfish and zebrafish. Sequence comparison revealed that divergence in both transcriptional and translational regulatory regions is higher than divergence in the introns. vsx1A2 is expressed during the blastula and gastrula stages and in the adult retina but is silenced from the segmentation stage to the hatching stage, whereas vsx1A1 is expressed from segmentation onward. Given that zebrafish vsx1 is expressed at all developmental stages and in the adult retina, it appears that goldfish vsx1A1 and vsx1A2 are in the process of sharing the functions of ancestral vsx1. The different but overlapping temporal expression patterns of vsx1A1 and vsx1A2 suggest that sequence divergence in the promoter region of duplicate vsx1 is not sufficient for partitioning the functions of ancestral vsx1. By comparing vsx1A1 and vsx1A2 3'-UTR-linked green fluorescent protein gene expression patterns, we demonstrated that the 3'-UTR of vsx1A1 retains, but the 3'-UTR of vsx1A2 has lost, the capability of mediating bipolar cell specific expression during retina development. These results indicate that sequence divergence in the 3'-UTRs has a clear effect on subfunctionalization of the duplicate genes. © 2012 WILEY PERIODICALS, INC.
COL1A1 transgene expression in stably transfected osteoblastic cells. Relative contributions of first intron, 3'-flanking sequences, and sequences derived from the body of the human COL1A1 minigene

NASA Technical Reports Server (NTRS)

Breault, D. T.; Lichtler, A. C.; Rowe, D. W.

1997-01-01

Collagen reporter gene constructs have been used to identify cell-specific sequences needed for transcriptional activation. The elements required for endogenous levels of COL1A1 expression, however, have not been elucidated. The human COL1A1 minigene is expressed at high levels and likely harbors sequence elements required for endogenous levels of activity. Using stably transfected osteoblastic Py1a cells, we studied a series of constructs (pOBColCAT) designed to characterize further the elements required for high level of expression. pOBColCAT, which contains the COL1A1 first intron, was expressed at 50-100-fold higher levels than ColCAT 3.6, which lacks the first intron. This difference is best explained by improved mRNA processing rather than a transcriptional effect. Furthermore, variation in activity observed with the intron deletion constructs is best explained by altered mRNA splicing. Two major regions of the human COL1A1 minigene, the 3'-flanking sequences and the minigene body, were introduced into pOBColCAT to assess both transcriptional enhancing activity and the effect on mRNA stability. Analysis of the minigene body, which includes the first five exons and introns fused with the terminal six introns and exons, revealed an orientation-independent 5-fold increase in CAT activity. In contrast the 3'-flanking sequences gave rise to a modest 61% increase in CAT activity. Neither region increased the mRNA half-life of the parent construct, suggesting that CAT-specific mRNA instability elements may serve as dominant negative regulators of stability. This study suggests that other sites within the body of the COL1A1 minigene are important for high expression, e.g. during periods of rapid extracellular matrix production.
Identification of a DNA sequence motif required for expression of iron-regulated genes in pseudomonads.

PubMed

Rombel, I T; McMorran, B J; Lamont, I L

1995-02-20

Many bacteria respond to a lack of iron in the environment by synthesizing siderophores, which act as iron-scavenging compounds. Fluorescent pseudomonads synthesize strain-specific but chemically related siderophores called pyoverdines or pseudobactins. We have investigated the mechanisms by which iron controls expression of genes involved in pyoverdine metabolism in Pseudomonas aeruginosa. Transcription of these genes is repressed by the presence of iron in the growth medium. Three promoters from these genes were cloned and the activities of the promoters were dependent on the amounts of iron in the growth media. Two of the promoters were sequenced and the transcriptional start sites were identified by S1 nuclease analysis. Sequences similar to the consensus binding site for the Fur repressor protein, which controls expression of iron-repressible genes in several gram-negative species, were not present in the promoters, suggesting that they are unlikely to have a high affinity for Fur. However, comparison of the promoter sequences with those of iron-regulated genes from other Pseudomonas species and also the iron-regulated exotoxin gene of P. aeruginosa allowed identification of a shared sequence element, with the consensus sequence (G/C)CTAAAT-CCC, which is likely to act as a binding site for a transcriptional activator protein. Mutations in this sequence greatly reduced the activities of the promoters characterized here as well as those of other iron-regulated promoters. The requirement for this motif in the promoters of iron-regulated genes of different Pseudomonas species indicates that similar mechanisms are likely to be involved in controlling expression of a range of iron-regulated genes in pseudomonads.
Transcriptome and Small RNA Deep Sequencing Reveals Deregulation of miRNA Biogenesis in Human Glioma

PubMed Central

Moore, Lynette M.; Kivinen, Virpi; Liu, Yuexin; Annala, Matti; Cogdell, David; Liu, Xiuping; Liu, Chang-Gong; Sawaya, Raymond; Yli-Harja, Olli; Shmulevich, Ilya; Fuller, Gregory N.; Zhang, Wei; Nykter, Matti

2013-01-01

Altered expression of oncogenic and tumor-suppressing microRNAs (miRNAs) is widely associated with tumorigenesis. However, the regulatory mechanisms underlying these alterations are poorly understood. We sought to shed light on the deregulation of miRNA biogenesis promoting the aberrant miRNA expression profiles identified in these tumors. Using sequencing technology to perform both whole-transcriptome and small RNA sequencing of glioma patient samples, we examined precursor and mature miRNAs to directly evaluate the miRNA maturation process, and interrogated expression profiles for genes involved in the major steps of miRNA biogenesis. We found that ratios of mature to precursor forms of a large number of miRNAs increased with the progression from normal brain to low-grade and then to high-grade gliomas. The expression levels of genes involved in each of the three major steps of miRNA biogenesis (nuclear processing, nucleo-cytoplasmic transport, and cytoplasmic processing) were systematically altered in glioma tissues. Survival analysis of an independent data set demonstrated that the alteration of genes involved in miRNA maturation correlates with survival in glioma patients. Direct quantification of miRNA maturation with deep sequencing demonstrated that deregulation of the miRNA biogenesis pathway is a hallmark for glioma genesis and progression. PMID:23007860
Transcript map of the Ovum mutant (Om) locus: isolation by exon trapping of new candidate genes for the DDK syndrome.

PubMed

Le Bras, Stéphanie; Cohen-Tannoudji, Michel; Guyot, Valérie; Vandormael-Pournin, Sandrine; Coumailleau, Franck; Babinet, Charles; Baldacci, Patricia

2002-08-21

The DDK syndrome is defined as the embryonic lethality of F1 mouse embryos from crosses between DDK females and males from other strains (named hereafter as non-DDK strains). Genetically controlled by the Ovum mutant (Om) locus, it is due to a deleterious interaction between a maternal factor present in DDK oocytes and the non-DDK paternal pronucleus. Therefore, the DDK syndrome constitutes a unique genetic tool to study the crucial interactions that take place between the parental genomes and the egg cytoplasm during mammalian development. In this paper, we present an extensive analysis performed by exon trapping on the Om region. Twenty-seven trapped sequences were from genes in the databases: beta-adaptin, CCT zeta2, DNA LigaseIII, Notchless, Rad51l3 and Scya1. Twenty-eight other sequences presented similarities with expressed sequence tags and genomic sequences whereas 57 did not. The pattern of expression of 37 of these markers was established. Importantly, five of them are expressed in DDK oocytes and are candidate genes for the maternal factor, and 20 are candidate genes for the paternal factor since they are expressed in testis. This data is an important step towards identifying the genes responsible for the DDK syndrome.
Oligo Design: a computer program for development of probes for oligonucleotide microarrays.

PubMed

Herold, Keith E; Rasooly, Avraham

2003-12-01

Oligonucleotide microarrays have demonstrated potential for the analysis of gene expression, genotyping, and mutational analysis. Our work focuses primarily on the detection and identification of bacteria based on known short sequences of DNA. Oligo Design, the software described here, automates several design aspects that enable the improved selection of oligonucleotides for use with microarrays for these applications. Two major features of the program are: (i) a tiling algorithm for the design of short overlapping temperature-matched oligonucleotides of variable length, which are useful for the analysis of single nucleotide polymorphisms and (ii) a set of tools for the analysis of multiple alignments of gene families and related short DNA sequences, which allow for the identification of conserved DNA sequences for PCR primer selection and variable DNA sequences for the selection of unique probes for identification. Note that the program does not address the full genome perspective but, instead, is focused on the genetic analysis of short segments of DNA. The program is Internet-enabled and includes a built-in browser and the automated ability to download sequences from GenBank by specifying the GI number. The program also includes several utilities, including audio recital of a DNA sequence (useful for verifying sequences against a written document), a random sequence generator that provides insight into the relationship between melting temperature and GC content, and a PCR calculator.
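
The idea of tiling a sequence with overlapping, temperature-matched oligonucleotides of variable length can be sketched as below. This is not the Oligo Design algorithm, whose details the abstract does not give. As a stand-in it uses the Wallace rule of thumb for short oligos, Tm = 2(A+T) + 4(G+C) in °C, and every parameter name, default value, and the example sequence are assumptions made for illustration.

```python
def wallace_tm(oligo):
    """Wallace rule estimate of melting temperature in degrees C.
    A rough approximation intended for short oligonucleotides."""
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


def tile_oligos(seq, target_tm=50, step=5, min_len=12, max_len=30):
    """Return (start, oligo, tm) tiles. Each tile starts `step` bases after
    the previous one and is extended one base at a time until its estimated
    Tm reaches `target_tm`, so GC-poor regions get longer oligos."""
    seq = seq.upper()
    tiles = []
    for start in range(0, len(seq) - min_len + 1, step):
        for length in range(min_len, max_len + 1):
            oligo = seq[start:start + length]
            if len(oligo) < length:
                break  # ran off the end of the sequence
            tm = wallace_tm(oligo)
            if tm >= target_tm:
                tiles.append((start, oligo, tm))
                break
    return tiles


# Hypothetical target sequence, invented for illustration only.
target = "ATGCGTACGTTAGCCGATAGCTAGGCTTAACGGATCCGATTACG"
for start, oligo, tm in tile_oligos(target):
    print(start, oligo, tm)
```

Because the Wallace estimate rises by 4 for each G or C and by 2 for each A or T, it also gives a simple view of the relationship between GC content and melting temperature that the abstract mentions.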
Sequencing, Analysis, and Annotation of Expressed Sequence Tags for Camelus dromedarius

PubMed Central

Al-Swailem, Abdulaziz M.; Shehata, Maher M.; Abu-Duhier, Faisel M.; Al-Yamani, Essam J.; Al-Busadah, Khalid A.; Al-Arawi, Mohammed S.; Al-Khider, Ali Y.; Al-Muhaimeed, Abdullah N.; Al-Qahtani, Fahad H.; Manee, Manee M.; Al-Shomrani, Badr M.; Al-Qhtani, Saad M.; Al-Harthi, Amer S.; Akdemir, Kadir C.; Otu, Hasan H.

2010-01-01

Despite its economical, cultural, and biological importance, there has not been a large scale sequencing project to date for Camelus dromedarius. With the goal of sequencing complete DNA of the organism, we first established and sequenced camel EST libraries, generating 70,272 reads. Following trimming, chimera check, repeat masking, cluster and assembly, we obtained 23,602 putative gene sequences, out of which over 4,500 potentially novel or fast evolving gene sequences do not carry any homology to other available genomes. Functional annotation of sequences with similarities in nucleotide and protein databases has been obtained using Gene Ontology classification. Comparison to available full length cDNA sequences and Open Reading Frame (ORF) analysis of camel sequences that exhibit homology to known genes show more than 80% of the contigs with an ORF>300 bp and ∼40% hits extending to the start codons of full length cDNAs suggesting successful characterization of camel genes. Similarity analyses are done separately for different organisms including human, mouse, bovine, and rat. Accompanying web portal, CAGBASE (http://camel.kacst.edu.sa/), hosts a relational database containing annotated EST sequences and analysis tools with possibility to add sequences from public domain. We anticipate our results to provide a home base for genomic studies of camel and other comparative studies enabling a starting point for whole genome sequencing of the organism. PMID:20502665
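
The ORF length criterion mentioned in this abstract (contigs with an ORF > 300 bp) can be illustrated with a minimal sketch. It assumes an ORF runs from an ATG to the first in-frame stop codon, counts the stop codon in the length, and scans only the three forward frames; the abstract states none of these details, and the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Length in bp (stop codon included) of the longest ATG-to-stop ORF
    in the three forward reading frames; 0 if no complete ORF exists."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon of a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None  # look for the next ORF in this frame
    return best


def passes_orf_filter(seq, min_bp=300):
    """True if the contig contains an ORF longer than `min_bp`."""
    return longest_orf(seq) > min_bp


# Hypothetical contigs, invented for illustration only.
short_contig = "CCATGAAATTTGGGTAACC"
long_contig = "ATG" + "GCT" * 100 + "TAA"
print(longest_orf(short_contig), passes_orf_filter(short_contig))  # 15 False
print(longest_orf(long_contig), passes_orf_filter(long_contig))    # 306 True
```

A fuller analysis would also scan the reverse complement, since EST contigs can be assembled in either orientation.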
Drought-induced gene expression in Atriplex canescens (salt bush): Transcriptional and post transcriptional response

DOE Office of Scientific and Technical Information (OSTI.GOV)

Cairney, J.; Hays, D.; Stockand, J.D.

1991-05-01

The rangeland shrub Atriplex canescens (saltbush) is extremely drought-tolerant and is capable of growing at water potentials below -40 bar. To discover the molecular basis of this tolerance, the authors have isolated a number of cDNA clones of drought-stress induced genes. Analysis of the nucleotide sequence and expression of these genes in different tissues and in response to different stresses reveals the diversity of the stress response. Members of a drought-induced, multi-gene family, have been sequenced. Although 95% homologous, non-conservative substitutions result in proteins of different tertiary structure. Additionally, the genes are expressed through a number of mature forms of mRNA which may arise by alternative RNA processing.
Transcriptome Analysis of Differentially Expressed Genes Provides Insight into Stolon Formation in Tulipa edulis

PubMed Central

Miao, Yuanyuan; Zhu, Zaibiao; Guo, Qiaosheng; Zhu, Yunhao; Yang, Xiaohua; Sun, Yuan

2016-01-01

Tulipa edulis (Miq.) Baker is an important medicinal plant with a variety of anti-cancer properties. The stolon is one of the main asexual reproductive organs of T. edulis and possesses a unique morphology. To explore the molecular mechanism of stolon formation, we performed an RNA-seq analysis of the transcriptomes of stolons at three developmental stages. In the present study, 15.49 Gb of raw data were generated and assembled into 74,006 unigenes, and a total of 2,811 simple sequence repeats were detected in T. edulis. Among the three libraries of stolons at different developmental stages, there were 5,119 differentially expressed genes (DEGs). A functional annotation analysis based on sequence similarity queries of the GO, COG, KEGG databases showed that these DEGs were mainly involved in many physiological and biochemical processes, such as material and energy metabolism, hormone signaling, cell growth, and transcription regulation. In addition, quantitative real-time PCR analysis revealed that the expression patterns of the DEGs were consistent with the transcriptome data, which further supported a role for the DEGs in stolon formation. This study provides novel resources for future genetic and molecular studies in T. edulis. PMID:27064558
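
Detection of simple sequence repeats (SSRs), as reported in this abstract, is often done with a pattern search over the assembled unigenes. The sketch below is a generic illustration only: the abstract does not name the tool or the thresholds used, so the unit sizes (2 to 6 bases), the minimum of four repeats, the function name, and the example sequence are all assumptions.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, unit, repeat_count) for each tandem repeat found.

    The lazy quantifier makes the regex try the shortest repeat unit first,
    so (AC)n is reported as unit 'AC' rather than 'ACAC'.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    ssrs = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAA
        ssrs.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return ssrs


# Hypothetical unigene fragment, invented for illustration only.
fragment = "TTG" + "AC" * 5 + "GGA" + "CTT" * 4 + "A"
print(find_ssrs(fragment))  # [(3, 'AC', 5), (16, 'CTT', 4)]
```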
Transcriptome Analysis of Differentially Expressed Genes Provides Insight into Stolon Formation in Tulipa edulis.

PubMed

Miao, Yuanyuan; Zhu, Zaibiao; Guo, Qiaosheng; Zhu, Yunhao; Yang, Xiaohua; Sun, Yuan

2016-01-01

Tulipa edulis (Miq.) Baker is an important medicinal plant with a variety of anti-cancer properties. The stolon is one of the main asexual reproductive organs of T. edulis and possesses a unique morphology. To explore the molecular mechanism of stolon formation, we performed an RNA-seq analysis of the transcriptomes of stolons at three developmental stages. In the present study, 15.49 Gb of raw data were generated and assembled into 74,006 unigenes, and a total of 2,811 simple sequence repeats were detected in T. edulis. Among the three libraries of stolons at different developmental stages, there were 5,119 differentially expressed genes (DEGs). A functional annotation analysis based on sequence similarity queries of the GO, COG, KEGG databases showed that these DEGs were mainly involved in many physiological and biochemical processes, such as material and energy metabolism, hormone signaling, cell growth, and transcription regulation. In addition, quantitative real-time PCR analysis revealed that the expression patterns of the DEGs were consistent with the transcriptome data, which further supported a role for the DEGs in stolon formation. This study provides novel resources for future genetic and molecular studies in T. edulis.
