Delimiting regulatory sequences of the Drosophila melanogaster Ddc gene.
Hirsh, J; Morgan, B A; Scholnick, S B
1986-01-01
We delimited sequences necessary for in vivo expression of the Drosophila melanogaster dopa decarboxylase gene Ddc. The expression of in vitro-altered genes was assayed following germ line integration via P-element vectors. Sequences between -209 and -24 were necessary for normally regulated expression, although genes lacking these sequences could be expressed at 10 to 50% of wild-type levels at specific developmental times. These genes showed components of normal developmental expression, which suggests that they retain some regulatory elements. All Ddc genes lacking the normal immediate 5'-flanking sequences were grossly deficient in larval central nervous system expression. Thus, this upstream region must contain at least one element necessary for this expression. A mutated Ddc gene without a normal TATA boxlike sequence used the normal RNA start points, indicating that this sequence is not required for start point specificity. PMID:3099170
Positive Selection Underlies Faster-Z Evolution of Gene Expression in Birds
Dean, Rebecca; Harrison, Peter W.; Wright, Alison E.; Zimmer, Fabian; Mank, Judith E.
2015-01-01
The elevated rate of evolution for genes on sex chromosomes compared with autosomes (Fast-X or Fast-Z evolution) can result either from positive selection in the heterogametic sex or from nonadaptive consequences of reduced relative effective population size. Recent work in birds suggests that Fast-Z of coding sequence is primarily due to relaxed purifying selection resulting from reduced relative effective population size. However, gene sequence and gene expression are often subject to distinct evolutionary pressures; therefore, we tested for Fast-Z in gene expression using next-generation RNA-sequencing data from multiple avian species. Similar to studies of Fast-Z in coding sequence, we recover clear signatures of Fast-Z in gene expression; however, in contrast to coding sequence, our data indicate that Fast-Z in expression is due to positive selection acting primarily in females. In the soma, where gene expression is highly correlated between the sexes, we detected Fast-Z in both sexes, although at a higher rate in females, suggesting that many positively selected expression changes in females are also expressed in males. In the gonad, where intersexual correlations in expression are much lower, we detected Fast-Z for female gene expression, but crucially, not males. This suggests that a large amount of expression variation is sex-specific in its effects within the gonad. Taken together, our results indicate that Fast-Z evolution of gene expression is the product of positive selection acting on recessive beneficial alleles in the heterogametic sex. More broadly, our analysis suggests that the adaptive potential of Z chromosome gene expression may be much greater than that of gene sequence, results which have important implications for the role of sex chromosomes in speciation and sexual selection. PMID:26067773
Sequences 5' to translation start regulate expression of petunia rbcS genes.
Dean, C; Favreau, M; Bedbrook, J; Dunsmuir, P
1989-01-01
The promoter sequences that contribute to quantitative differences in expression of the petunia genes (rbcS) encoding the small subunit of ribulose bisphosphate carboxylase have been characterized. The promoter regions of the two most abundantly expressed petunia rbcS genes, SSU301 and SSU611, show sequence similarity not present in other rbcS genes. We investigated the significance of these and other sequences by adding specific regions from the SSU301 promoter (the most strongly expressed gene) to equivalent regions in the SSU911 promoter (the least strongly expressed gene) and assaying the expression of the fusions in transgenic tobacco plants. In this way, we characterized an SSU301 promoter region (either from -285 to -178 or -291 to -204) which, when added to SSU911, in either orientation, increased SSU911 expression 25-fold. This increase was equivalent to that caused by addition of the entire SSU301 5'-flanking region. Replacement of SSU911 promoter sequences between -198 and the start codon with sequences from the equivalent region of SSU301 did not increase SSU911 expression significantly. The -291 to -204 SSU301 promoter fragment contributes significantly to quantitative differences in expression between the petunia rbcS genes. PMID:2535543
Composite transcriptome assembly of RNA-seq data in a sheep model for delayed bone healing.
Jäger, Marten; Ott, Claus-Eric; Grünhagen, Johannes; Hecht, Jochen; Schell, Hanna; Mundlos, Stefan; Duda, Georg N; Robinson, Peter N; Lienau, Jasmin
2011-03-24
The sheep is an important model organism for many types of medically relevant research, but molecular genetic experiments in the sheep have been limited by the lack of knowledge about ovine gene sequences. Prior to our study, mRNA sequences for only 1,556 partial or complete ovine genes were publicly available. Therefore, we developed a composite de novo transcriptome assembly method for next-generation sequence data to combine known ovine mRNA and EST sequences, mRNA sequences from mouse and cow, and sequences assembled de novo from short read RNA-Seq data into a composite reference transcriptome, and identified transcripts from over 12 thousand previously undescribed ovine genes. Gene expression analysis based on these data revealed substantially different expression profiles in standard versus delayed bone healing in an ovine tibial osteotomy model. Hundreds of transcripts were differentially expressed between standard and delayed healing and between the time points of the standard and delayed healing groups. We used the sheep sequences to design quantitative RT-PCR assays with which we validated the differential expression of 26 genes that had been identified by RNA-seq analysis. A number of clusters of characteristic expression profiles could be identified, some of which showed striking differences between the standard and delayed healing groups. Gene Ontology (GO) analysis showed that the differentially expressed genes were enriched in terms including extracellular matrix, cartilage development, contractile fiber, and chemokine activity. Our results provide a first atlas of gene expression profiles and differentially expressed genes in standard and delayed bone healing in a large-animal model and provide a number of clues as to the shifts in gene expression that underlie delayed bone healing. In the course of our study, we identified transcripts of 13,987 ovine genes, including 12,431 genes for which no sequence information was previously available. 
This information will provide a basis for future molecular research involving the sheep as a model organism. PMID:21435219
Li, Bing; Shi, Xiao-Yu; Liao, Dai-Xiang; Cao, Bang-Rong; Luo, Cheng-Hua; Cheng, Shu-Jun
2015-01-01
There are still no absolute parameters that predict the progression of adenoma to cancer. The present study aimed to characterize functional differences across the multistep carcinogenic process of the adenoma-carcinoma sequence. Samples were collected and mRNA expression profiling was performed using Agilent microarray high-throughput gene-chip technology. The mRNA expression profiles of the adenoma-carcinoma sequence were then described with bioinformatics software, and we analyzed the relationship between these gene expression profiles and the clinical prognosis of colorectal cancer. mRNA expression differed significantly between the high-grade intraepithelial neoplasia group and the adenocarcinoma group. Gene ontology biological-process enrichment analysis of the genes differentially expressed between these two groups showed enrichment in extracellular structure organization, skeletal system development, biological adhesion, and regulation of growth (P < 0.05 after FDR correction). In addition, IPR-related protein annotations were mainly concentrated in the insulin-like growth factor binding proteins. Changes in gene expression profiles along the adenoma-carcinoma sequence were mainly concentrated in the high-grade intraepithelial neoplasia and adenocarcinoma stages. The differentially expressed genes are significantly correlated between the high-grade intraepithelial neoplasia group and the adenocarcinoma group. Bioinformatics analysis is an effective way to study gene expression profiles in the adenoma-carcinoma sequence, and may provide an effective tool for extending colorectal cancer research strategies to colorectal adenoma or advanced adenoma.
Siebert, Stefan; Robinson, Mark D; Tintori, Sophia C; Goetz, Freya; Helm, Rebecca R; Smith, Stephen A; Shaner, Nathan; Haddock, Steven H D; Dunn, Casey W
2011-01-01
We investigated differential gene expression between functionally specialized feeding polyps and swimming medusae in the siphonophore Nanomia bijuga (Cnidaria) with a hybrid long-read/short-read sequencing strategy. We assembled a set of partial gene reference sequences from long-read data (Roche 454), and generated short-read sequences from replicated tissue samples that were mapped to the references to quantify expression. We collected and compared expression data with three short-read expression workflows that differ in sample preparation, sequencing technology, and mapping tools. These workflows were Illumina mRNA-Seq, which generates sequence reads from random locations along each transcript, and two tag-based approaches, SOLiD SAGE and Helicos DGE, which generate reads from particular tag sites. Differences in expression results across workflows were mostly due to the differential impact of missing data in the partial reference sequences. When all 454-derived gene reference sequences were considered, Illumina mRNA-Seq detected more than twice as many differentially expressed (DE) reference sequences as the tag-based workflows. This discrepancy was largely due to missing tag sites in the partial reference that led to false negatives in the tag-based workflows. When only the subset of reference sequences that unambiguously have tag sites was considered, we found broad congruence across workflows, and they all identified a similar set of DE sequences. Our results are promising in several regards for gene expression studies in non-model organisms. First, we demonstrate that a hybrid long-read/short-read sequencing strategy is an effective way to collect gene expression data when an annotated genome sequence is not available. Second, our replicated sampling indicates that expression profiles are highly consistent across field-collected animals in this case. 
Third, the impacts of partial reference sequences on the ability to detect DE can be mitigated through workflow choice and deeper reference sequencing. PMID:21829563
Digital gene expression analysis of the zebra finch genome
2010-01-01
Background: In order to understand patterns of adaptation and molecular evolution it is important to quantify both variation in gene expression and nucleotide sequence divergence. Gene expression profiling in non-model organisms has recently been facilitated by the advent of massively parallel sequencing technology. Here we investigate tissue specific gene expression patterns in the zebra finch (Taeniopygia guttata) with special emphasis on the genes of the major histocompatibility complex (MHC). Results: Almost 2 million 454-sequencing reads from cDNA of six different tissues were assembled and analysed. A total of 11,793 zebra finch transcripts were represented in this EST data, indicating a transcriptome coverage of about 65%. There was a positive correlation between the tissue specificity of gene expression and non-synonymous to synonymous nucleotide substitution ratio of genes, suggesting that genes with a specialised function are evolving at a higher rate (or with less constraint) than genes with a more general function. In line with this, there was also a negative correlation between overall expression levels and expression specificity of contigs. We found evidence for expression of 10 different genes related to the MHC. MHC genes showed relatively tissue specific expression levels and were in general primarily expressed in spleen. Several MHC genes, including MHC class I, also showed expression in brain. Furthermore, for all genes with highest levels of expression in spleen there was an overrepresentation of several gene ontology terms related to immune function. Conclusions: Our study highlights the usefulness of next-generation sequence data for quantifying gene expression in the genome as a whole as well as in specific candidate genes. Overall, the data show predicted patterns of gene expression profiles and molecular evolution in the zebra finch genome. Expression of MHC genes in particular corresponds well with expression patterns in other vertebrates. PMID:20359325
Lewis, Jo E.; Brameld, John M.; Hill, Phil; Barrett, Perry; Ebling, Francis J.P.; Jethwa, Preeti H.
2015-01-01
Introduction: The viral 2A sequence has become an attractive alternative to the traditional internal ribosomal entry site (IRES) for simultaneous over-expression of two genes and in combination with recombinant adeno-associated viruses (rAAV) has been used to manipulate gene expression in vitro. New method: To develop a rAAV construct in combination with the viral 2A sequence to allow long-term over-expression of the vgf gene and fluorescent marker gene for tracking of the transfected neurones in vivo. Results: Transient transfection of the AAV plasmid containing the vgf gene, viral 2A sequence and eGFP into SH-SY5Y cells resulted in eGFP fluorescence comparable to a commercially available reporter construct. This increase in fluorescent cells was accompanied by an increase in VGF mRNA expression. Infusion of the rAAV vector containing the vgf gene, viral 2A sequence and eGFP resulted in eGFP fluorescence in the hypothalamus of both mice and Siberian hamsters, 32 weeks post infusion. In situ hybridisation confirmed that the location of VGF mRNA expression in the hypothalamus corresponded to the eGFP pattern of fluorescence. Comparison with old method: The viral 2A sequence is much smaller than the traditional IRES and therefore allowed over-expression of the vgf gene with fluorescent tracking without compromising viral capacity. Conclusion: The use of the viral 2A sequence in the AAV plasmid allowed the simultaneous expression of both genes in vitro. When used in combination with rAAV it resulted in long-term over-expression of both genes at equivalent locations in the hypothalamus of both Siberian hamsters and mice, without any adverse effects. PMID:26300182
Methods and compositions for regulating gene expression in plant cells
NASA Technical Reports Server (NTRS)
Dai, Shunhong (Inventor); Beachy, Roger N. (Inventor); Luis, Maria Isabel Ordiz (Inventor)
2010-01-01
Novel chimeric plant promoter sequences are provided, together with plant gene expression cassettes comprising such sequences. In certain preferred embodiments, the chimeric plant promoters comprise the BoxII cis element and/or derivatives thereof. In addition, novel transcription factors are provided, together with nucleic acid sequences encoding such transcription factors and plant gene expression cassettes comprising such nucleic acid sequences. In certain preferred embodiments, the novel transcription factors comprise the acidic domain, or fragments thereof, of the RF2a transcription factor. Methods for using the chimeric plant promoter sequences and novel transcription factors in regulating the expression of at least one gene of interest are provided, together with transgenic plants comprising such chimeric plant promoter sequences and novel transcription factors.
Caldwell, Rachel; Lin, Yan-Xia; Zhang, Ren
2015-01-01
There is continuing interest in analyzing gene architecture and gene expression to determine what relationship may exist between them. Advances in high-quality sequencing technologies and large-scale resource datasets have improved the understanding of these relationships and the cross-referencing of expression data to large genome datasets. Although a negative correlation between expression level and gene (especially transcript) length has been generally accepted, there have been some conflicting results in the literature concerning the impacts of different regions of genes, and the underlying reason is not well understood. This research applies quantile regression techniques to the statistical analysis of coding and noncoding sequence length and gene expression data in the plant Arabidopsis thaliana and the fruit fly Drosophila melanogaster, to determine whether a relationship exists and whether there are variations or similarities between these species. The quantile regression analysis found that the coding sequence length and gene expression correlations varied, and similarities emerged for the noncoding sequence length (5′ and 3′ UTRs) between animal and plant species. In conclusion, the information described in this study provides the basis for further exploration into gene regulation with regard to coding and noncoding sequence length. PMID:26114098
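The quantile regression idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' analysis (the abstract does not name their software or model details); it only shows the pinball (check) loss that quantile regression minimizes, and a brute-force fit of a single line. The function names and the toy data are assumptions made for illustration. The fit relies on the standard result that, for a straight line, some optimal solution passes through two data points with distinct x values.

```python
def pinball_loss(residual, q):
    """Check loss for quantile q (0 < q < 1): q*r if r >= 0, else (q-1)*r."""
    return q * residual if residual >= 0 else (q - 1) * residual


def total_loss(xs, ys, intercept, slope, q):
    """Sum of pinball losses of the line y = intercept + slope*x."""
    return sum(pinball_loss(y - (intercept + slope * x), q)
               for x, y in zip(xs, ys))


def fit_quantile_line(xs, ys, q):
    """Brute-force quantile regression for one predictor.

    Tries every line through two data points with distinct x and keeps
    the one with the smallest total pinball loss. Cubic in the number of
    points, so suitable only for small illustrative data sets.
    """
    best = None
    n = len(xs)
    for i in range(n):
        for j in range(i + 1, n):
            if xs[i] == xs[j]:
                continue  # a vertical pair does not define a line
            slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
            intercept = ys[i] - slope * xs[i]
            loss = total_loss(xs, ys, intercept, slope, q)
            if best is None or loss < best[0]:
                best = (loss, intercept, slope)
    if best is None:
        raise ValueError("need at least two points with distinct x")
    return best[1], best[2]


# Hypothetical toy data: x could stand for sequence length, y for expression.
lengths = [1, 2, 3, 4, 5]
expression = [2, 4, 6, 8, 100]   # last point is an outlier
intercept, slope = fit_quantile_line(lengths, expression, q=0.5)
```

With q = 0.5 this is median regression, which is why the outlier in the toy data does not pull the fitted line away from the other four points. Fitting several values of q (for example 0.1, 0.5, 0.9) shows how the length-expression relationship differs across the expression distribution, which is the kind of question the abstract describes.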
Amirhaeri, S; Wohlrab, F; Wells, R D
1995-02-17
The influence of simple repeat sequences, cloned into different positions relative to the SV40 early promoter/enhancer, on the transient expression of the chloramphenicol acetyltransferase (CAT) gene was investigated. Insertion of (G)29.(C)29 in either orientation into the 5'-untranslated region of the CAT gene reduced expression in CV-1 cells 50-100 fold when compared with controls with random sequence inserts. Analysis of CAT-specific mRNA levels demonstrated that the effect was due to a reduction of CAT mRNA production rather than to posttranscriptional events. In contrast, insertion of the same insert in either orientation upstream of the promoter-enhancer or downstream of the gene stimulated gene expression 2-3-fold. These effects could be reversed by cotransfection of a competitor plasmid carrying (G)25.(C)25 sequences. The results suggest that a G.C-binding transcription factor modulates gene expression in this system and that promoter strength can be regulated by providing protein-binding sites in trans. Although constructs containing longer tracts of alternating (C-G), (T-G), or (A-T) sequences inhibited CAT expression when inserted in the 5'-untranslated region of the CAT gene, the amount of CAT mRNA was unaffected. Hence, these inhibitions must be due to posttranscriptional events, presumably at the level of translation. These effects of microsatellite sequences on gene expression are discussed with respect to recent data on related simple repeat sequences which cause several human genetic diseases.
Alu sequence involvement in transcriptional insulation of the keratin 18 gene in transgenic mice.
Thorey, I S; Ceceña, G; Reynolds, W; Oshima, R G
1993-01-01
The human keratin 18 (K18) gene is expressed in a variety of adult simple epithelial tissues, including liver, intestine, lung, and kidney, but is not normally found in skin, muscle, heart, spleen, or most of the brain. Transgenic animals derived from the cloned K18 gene express the transgene in appropriate tissues at levels directly proportional to the copy number and independently of the sites of integration. We have investigated in transgenic mice the dependence of K18 gene expression on the distal 5' and 3' flanking sequences and upon the RNA polymerase III promoter of an Alu repetitive DNA transcription unit immediately upstream of the K18 promoter. Integration site-independent expression of tandemly duplicated K18 transgenes requires the presence of either an 825-bp fragment of the 5' flanking sequence or the 3.5-kb 3' flanking sequence. Mutation of the RNA polymerase III promoter of the Alu element within the 825-bp fragment abolishes copy number-dependent expression in kidney but does not abolish integration site-independent expression when assayed in the absence of the 3' flanking sequence of the K18 gene. The characteristics of integration site-independent expression and copy number-dependent expression are separable. In addition, the formation of the chromatin state of the K18 gene, which likely restricts the tissue-specific expression of this gene, is not dependent upon the distal flanking sequences of the 10-kb K18 gene but rather may depend on internal regulatory regions of the gene. PMID:7692231
Blixt, Maria K E; Hallböök, Finn
2016-01-01
Combining techniques of episomal vector gene-specific Cre expression and genomic integration using the piggyBac transposon system enables studies of gene expression-specific cell lineage tracing in the chicken retina. In this work, we aimed to target the retinal horizontal cell progenitors. A 208 bp gene regulatory sequence from the chicken retinoid X receptor γ gene (RXRγ208) was used to drive Cre expression. RXRγ is expressed in progenitors and photoreceptors during development. The vector was combined with a piggyBac "donor" vector containing a floxed STOP sequence followed by enhanced green fluorescent protein (EGFP), as well as a piggyBac helper vector for efficient integration into the host cell genome. The vectors were introduced into the embryonic chicken retina with in ovo electroporation. Tissue electroporation allows targeting at specific developmental time points and in specific structures. Cells that drove Cre expression from the regulatory RXRγ208 sequence excised the floxed STOP sequence and expressed GFP. The approach generated a stable lineage with robust expression of GFP in retinal cells that have activated transcription from the RXRγ208 sequence. Furthermore, GFP was expressed in cells that express horizontal or photoreceptor markers when electroporation was performed between developmental stages 22 and 28. Electroporation of a stage 12 optic cup gave multiple cell types in accordance with RXRγ gene expression in the early retina. In this study, we describe an easy, cost-effective, and time-efficient method for testing regulatory sequences in general. More specifically, our results open up the possibility for further studies of the RXRγ-gene regulatory network governing the formation of photoreceptor and horizontal cells. In addition, the method presents approaches to target the expression of effector genes, such as regulators of cell fate or cell cycle progression, to these cells and their progenitors.
Transposon Variants and Their Effects on Gene Expression in Arabidopsis
Wang, Xi; Weigel, Detlef; Smith, Lisa M.
2013-01-01
Transposable elements (TEs) make up the majority of many plant genomes. Their transcription and transposition are controlled through siRNAs and epigenetic marks including DNA methylation. To dissect the interplay of siRNA–mediated regulation and TE evolution, and to examine how TE differences affect nearby gene expression, we investigated genome-wide differences in TEs, siRNAs, and gene expression among three Arabidopsis thaliana accessions. Both TE sequence polymorphisms and presence of linked TEs are positively correlated with intraspecific variation in gene expression. The expression of genes within 2 kb of conserved TEs is more stable than that of genes next to variant TEs harboring sequence polymorphisms. Polymorphism levels of TEs and closely linked adjacent genes are positively correlated as well. We also investigated the distribution of 24-nt-long siRNAs, which mediate TE repression. TEs targeted by uniquely mapping siRNAs are on average farther from coding genes, apparently because they more strongly suppress expression of adjacent genes. Furthermore, siRNAs, and especially uniquely mapping siRNAs, are enriched in TE regions missing in other accessions. Thus, targeting by uniquely mapping siRNAs appears to promote sequence deletions in TEs. Overall, our work indicates that siRNA–targeting of TEs may influence removal of sequences from the genome and hence evolution of gene expression in plants. PMID:23408902
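The "genes within 2 kb of conserved TEs" criterion in this abstract amounts to an interval-proximity test. The sketch below shows one way such a test could be written; it is an illustration only, not the authors' pipeline. The tuple layouts, the function names, and the gene and chromosome labels are hypothetical, and coordinates are assumed to be closed, 1-based intervals.

```python
def interval_gap(a_start, a_end, b_start, b_end):
    """Gap in bp between two closed intervals; 0 if they overlap or touch."""
    return max(0, b_start - a_end, a_start - b_end)


def genes_near_tes(genes, tes, window=2000):
    """Return names of genes lying within `window` bp of any TE.

    genes: iterable of (name, chrom, start, end)   # hypothetical layout
    tes:   iterable of (chrom, start, end)         # hypothetical layout
    """
    tes = list(tes)
    near = []
    for name, chrom, g_start, g_end in genes:
        for te_chrom, te_start, te_end in tes:
            same_chrom = chrom == te_chrom
            if same_chrom and interval_gap(g_start, g_end, te_start, te_end) <= window:
                near.append(name)
                break  # one nearby TE is enough to flag the gene
    return near


# Hypothetical example records.
genes = [
    ("geneA", "chr1", 1000, 2000),    # 1,500 bp upstream of the TE
    ("geneB", "chr1", 10000, 11000),  # 6,000 bp downstream of the TE
    ("geneC", "chr2", 1000, 2000),    # different chromosome
]
tes = [("chr1", 3500, 4000)]
flagged = genes_near_tes(genes, tes)
```

The nested loop is quadratic in the number of features. For genome-scale data a sorted sweep or an interval tree would be the usual choice, but the proximity rule itself stays the same.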
Multi-targeted priming for genome-wide gene expression assays.
Adomas, Aleksandra B; Lopez-Giraldez, Francesc; Clark, Travis A; Wang, Zheng; Townsend, Jeffrey P
2010-08-17
Complementary approaches to assaying global gene expression are needed to assess gene expression in regions that are poorly assayed by current methodologies. A key component of nearly all gene expression assays is the reverse transcription of transcribed sequences that has traditionally been performed by priming the poly-A tails on many of the transcribed genes in eukaryotes with oligo-dT, or by priming RNA indiscriminately with random hexamers. We designed an algorithm to find common sequence motifs that were present within most protein-coding genes of Saccharomyces cerevisiae and of Neurospora crassa, but that were not present within their ribosomal RNA or transfer RNA genes. We then experimentally tested whether degenerately priming these motifs with multi-targeted primers improved the accuracy and completeness of transcriptomic assays. We discovered two multi-targeted primers that would prime a preponderance of genes in the genomes of Saccharomyces cerevisiae and Neurospora crassa while avoiding priming ribosomal RNA or transfer RNA. Examining the response of Saccharomyces cerevisiae to nitrogen deficiency and profiling Neurospora crassa early sexual development, we demonstrated that using multi-targeted primers in reverse transcription led to superior performance of microarray profiling and next-generation RNA tag sequencing. Priming with multi-targeted primers in addition to oligo-dT resulted in higher sensitivity, a larger number of well-measured genes and greater power to detect differences in gene expression. Our results provide the most complete and detailed expression profiles of the yeast nitrogen starvation response and N. crassa early sexual development to date. Furthermore, our multi-targeting priming methodology for genome-wide gene expression assays provides selective targeting of multiple sequences and counter-selection against undesirable sequences, facilitating a more complete and precise assay of the transcribed sequences within the genome.
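The motif-selection idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm, which the abstract does not describe in detail. The helper name `candidate_motifs`, the k-mer length, the coverage threshold, and the toy sequences are all assumptions made for illustration. The sketch keeps k-mers that occur in at least a given fraction of coding sequences and never occur in the excluded (rRNA/tRNA-like) sequences.

```python
def kmers(seq, k):
    """Return the set of distinct k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_motifs(coding, excluded, k=4, min_fraction=0.75):
    """Hypothetical helper: k-mers present in at least min_fraction of the
    coding sequences and absent from every excluded sequence."""
    banned = set()
    for seq in excluded:
        banned |= kmers(seq, k)

    # Count, for each k-mer, how many coding sequences contain it.
    counts = {}
    for seq in coding:
        for km in kmers(seq, k):
            counts[km] = counts.get(km, 0) + 1

    needed = min_fraction * len(coding)
    return sorted(km for km, n in counts.items()
                  if n >= needed and km not in banned)


# Toy data, invented for illustration only.
coding_genes = ["ATGGCTGAA", "CCGCTGATT", "TTGCTGACC", "ATGAAACCC"]
rrna_trna = ["GGGCTGTTT"]

motifs = candidate_motifs(coding_genes, rrna_trna, k=4, min_fraction=0.75)
print(motifs)
```

In this toy set, one k-mer is shared by three of the four coding sequences but also appears in the excluded sequence, so it is rejected; a real primer design would additionally need to handle degeneracy and strand orientation.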
Castresana, C; Garcia-Luque, I; Alonso, E; Malik, V S; Cashmore, A R
1988-01-01
We have analyzed promoter regulatory elements from a photoregulated CAB gene (Cab-E) isolated from Nicotiana plumbaginifolia. These studies have been performed by introducing chimeric gene constructs into tobacco cells via Agrobacterium tumefaciens-mediated transformation. Expression studies on the regenerated transgenic plants have allowed us to characterize three positive and one negative cis-acting elements that influence photoregulated expression of the Cab-E gene. Within the upstream sequences we have identified two positive regulatory elements (PRE1 and PRE2) which confer maximum levels of photoregulated expression. These sequences contain multiple repeated elements related to the sequence-ACCGGCCCACTT-. We have also identified within the upstream region a negative regulatory element (NRE) extremely rich in AT sequences, which reduces the level of gene expression in the light. We have defined a light regulatory element (LRE) within the promoter region extending from -396 to -186 bp which confers photoregulated expression when fused to a constitutive nopaline synthase ('nos') promoter. Within this region there is a 132-bp element, extending from -368 to -234 bp, which on deletion from the Cab-E promoter reduces gene expression from high levels to undetectable levels. Finally, we have demonstrated for a full length Cab-E promoter conferring high levels of photoregulated expression, that sequences proximal to the Cab-E TATA box are not replaceable by corresponding sequences from a 'nos' promoter. This contrasts with the apparent equivalence of these Cab-E and 'nos' TATA box-proximal sequences in truncated promoters conferring low levels of photoregulated expression. Images PMID:2901343
Plant nitrogen regulatory P-PII genes
Coruzzi, Gloria M.; Lam, Hon-Ming; Hsieh, Ming-Hsiun
2001-01-01
The present invention generally relates to plant nitrogen regulatory PII gene (hereinafter P-PII gene), a gene involved in regulating plant nitrogen metabolism. The invention provides P-PII nucleotide sequences, expression constructs comprising said nucleotide sequences, and host cells and plants having said constructs and, optionally expressing the P-PII gene from said constructs. The invention also provides substantially pure P-PII proteins. The P-PII nucleotide sequences and constructs of the
Le Chevanton, L; Leblon, G
1989-04-15
We cloned the ura5 gene coding for the orotate phosphoribosyl transferase from the ascomycete Sordaria macrospora by heterologous probing of a Sordaria genomic DNA library with the corresponding Podospora anserina sequence. The Sordaria gene was expressed in an Escherichia coli pyrE mutant strain defective for the same enzyme, and expression was shown to be promoted by plasmid sequences. The nucleotide sequence of the 1246-bp DNA fragment encompassing the region of homology with the Podospora gene has been determined. This sequence contains an open reading frame of 699 nucleotides. The deduced amino acid sequence shows 72% similarity with the corresponding Podospora protein.
Barling, Adam; Swaminathan, Kankshita; Mitros, Therese; James, Brandon T; Morris, Juliette; Ngamboma, Ornella; Hall, Megan C; Kirkpatrick, Jessica; Alabady, Magdy; Spence, Ashley K; Hudson, Matthew E; Rokhsar, Daniel S; Moose, Stephen P
2013-12-09
The Miscanthus genus of perennial C4 grasses contains promising biofuel crops for temperate climates. However, few genomic resources exist for Miscanthus, which limits understanding of its interesting biology and future genetic improvement. A comprehensive catalog of expressed sequences was generated from a variety of Miscanthus species and tissue types, with an emphasis on characterizing gene expression changes in spring compared to fall rhizomes. Illumina short read sequencing technology was used to produce transcriptome sequences from different tissues and organs during distinct developmental stages for multiple Miscanthus species, including Miscanthus sinensis, Miscanthus sacchariflorus, and their interspecific hybrid Miscanthus × giganteus. More than fifty billion base-pairs of Miscanthus transcript sequence were produced. Overall, 26,230 Sorghum gene models (i.e., ~ 96% of predicted Sorghum genes) had at least five Miscanthus reads mapped to them, suggesting that a large portion of the Miscanthus transcriptome is represented in this dataset. The Miscanthus × giganteus data was used to identify genes preferentially expressed in a single tissue, such as the spring rhizome, using Sorghum bicolor as a reference. Quantitative real-time PCR was used to verify examples of preferential expression predicted via RNA-Seq. Contiguous consensus transcript sequences were assembled for each species and annotated using InterProScan. Sequences from the assembled transcriptome were used to amplify genomic segments from a doubled haploid Miscanthus sinensis and from Miscanthus × giganteus to further disentangle the allelic and paralogous variations in genes. This large expressed sequence tag collection creates a valuable resource for the study of Miscanthus biology by providing detailed gene sequence information and tissue preferred expression patterns.
We have successfully generated a database of transcriptome assemblies and demonstrated its use in the study of genes of interest. Analysis of gene expression profiles revealed biological pathways that exhibit altered regulation in spring compared to fall rhizomes, which are consistent with their different physiological functions. The expression profiles of the subterranean rhizome provide a better understanding of the biological activities of the underground stem structures that are essential for perenniality and the storage or remobilization of carbon and nutrient resources.
Asamizu, Erika; Nakamura, Yasukazu; Sato, Shusei; Tabata, Satoshi
2004-02-01
To perform a comprehensive analysis of genes expressed in a model legume, Lotus japonicus, a total of 74472 3'-end expressed sequence tags (EST) were generated from cDNA libraries produced from six different organs. Clustering of sequences was performed with an identity criterion of 95% for 50 bases, and a total of 20457 non-redundant sequences, 8503 contigs and 11954 singletons were generated. EST sequence coverage was analyzed by using the annotated L. japonicus genomic sequence and 1093 of the 1889 predicted protein-encoding genes (57.9%) were hit by the EST sequence(s). Gene content was compared to several plant species. Among the 8503 contigs, 471 were identified as sequences conserved only in leguminous species and these included several disease resistance-related genes. This suggested that in legumes, these genes may have evolved specifically to resist pathogen attack. The rate of gene sequence divergence was assessed by comparing similarity level and functional category based on the Gene Ontology (GO) annotation of Arabidopsis genes. This revealed that genes encoding ribosomal proteins, as well as those related to translation, photosynthesis, and cellular structure were more abundantly represented in the highly conserved class, and that genes encoding transcription factors and receptor protein kinases were abundantly represented in the less conserved class. To make the sequence information and the cDNA clones available to the research community, a Web database with useful services was created at http://www.kazusa.or.jp/en/plant/lotus/EST/.
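The clustering criterion quoted above (95% identity over 50 bases) can be sketched as a simple pairwise test. This is only an illustration: the abstract does not name the clustering software, and the ungapped sliding-window comparison, the function name `share_window`, and the test sequences are assumptions. A real EST clusterer would use gapped alignment.

```python
def share_window(a, b, window=50, min_identity=0.95):
    """Return True if some ungapped window of `window` bases in `a` matches
    a window in `b` with identity >= min_identity (simplified criterion)."""
    for i in range(len(a) - window + 1):
        for j in range(len(b) - window + 1):
            matches = sum(x == y for x, y in
                          zip(a[i:i + window], b[j:j + window]))
            if matches / window >= min_identity:
                return True
    return False


# Invented 50-base sequences: `two_mismatches` differs from `ref` at two
# positions (48/50 = 96% identity), `three_mismatches` at three (94%).
ref = "ACGT" * 12 + "AC"
two_mismatches = "T" + ref[1:10] + "T" + ref[11:]
three_mismatches = two_mismatches[:20] + "T" + two_mismatches[21:]

print(share_window(ref, two_mismatches))
print(share_window(ref, three_mismatches))
```

With a 50-base window, the 95% threshold tolerates at most two mismatches, which is why the two test sequences fall on opposite sides of the cutoff.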
PanGEA: identification of allele specific gene expression using the 454 technology.
Kofler, Robert; Teixeira Torres, Tatiana; Lelley, Tamas; Schlötterer, Christian
2009-05-14
Next generation sequencing technologies hold great potential for many biological questions. While mainly used for genomic sequencing, they are also very promising for gene expression profiling. Sequencing of cDNA does not only provide an estimate of the absolute expression level, it can also be used for the identification of allele specific gene expression. We developed PanGEA, a tool which enables a fast and user-friendly analysis of allele specific gene expression using the 454 technology. PanGEA allows mapping of 454-ESTs to genes or whole genomes, displaying gene expression profiles, identification of SNPs and the quantification of allele specific gene expression. The intuitive GUI of PanGEA facilitates a flexible and interactive analysis of the data. PanGEA additionally implements a modification of the Smith-Waterman algorithm which deals with incorrect estimates of homopolymer length as occurring in the 454 technology. To our knowledge, PanGEA is the first tool which facilitates the identification of allele specific gene expression. PanGEA is distributed under the Mozilla Public License and available at: http://www.kofler.or.at/bioinformatics/PanGEA
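For readers unfamiliar with the underlying method, the following is a minimal sketch of the standard Smith-Waterman local alignment score with a linear gap penalty. It is not PanGEA's modified version: the abstract does not specify how homopolymer-length errors are handled, so that part is not reproduced. The scoring parameters are arbitrary assumptions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between a and b (standard algorithm,
    linear gap penalty; scoring values are illustrative assumptions)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# A read with one extra G in a homopolymer run, compared with the reference.
print(smith_waterman_score("ACGGGT", "ACGGT"))
```

Under the standard scoring, an over-called homopolymer costs a full gap penalty, which is the kind of systematic error a homopolymer-aware modification would be expected to treat more leniently.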
PanGEA: Identification of allele specific gene expression using the 454 technology
Kofler, Robert; Teixeira Torres, Tatiana; Lelley, Tamas; Schlötterer, Christian
2009-01-01
Background: Next generation sequencing technologies hold great potential for many biological questions. While mainly used for genomic sequencing, they are also very promising for gene expression profiling. Sequencing of cDNA does not only provide an estimate of the absolute expression level, it can also be used for the identification of allele specific gene expression. Results: We developed PanGEA, a tool which enables a fast and user-friendly analysis of allele specific gene expression using the 454 technology. PanGEA allows mapping of 454-ESTs to genes or whole genomes, displaying gene expression profiles, identification of SNPs and the quantification of allele specific gene expression. The intuitive GUI of PanGEA facilitates a flexible and interactive analysis of the data. PanGEA additionally implements a modification of the Smith-Waterman algorithm which deals with incorrect estimates of homopolymer length as occurring in the 454 technology. Conclusion: To our knowledge, PanGEA is the first tool which facilitates the identification of allele specific gene expression. PanGEA is distributed under the Mozilla Public License and available at: PMID:19442283
Knowlton, K U; Baracchini, E; Ross, R S; Harris, A N; Henderson, S A; Evans, S M; Glembotski, C C; Chien, K R
1991-04-25
To study the mechanisms which mediate the transcriptional activation of cardiac genes during alpha adrenergic stimulation, the present study examined the regulated expression of three cardiac genes, a ventricular embryonic gene (atrial natriuretic factor, ANF), a constitutively expressed contractile protein gene (cardiac MLC-2), and a cardiac sodium channel gene. alpha 1-Adrenergic stimulation activates the expression and release of ANF from neonatal ventricular cells. As assessed by RNase protection analyses, treatment with alpha-adrenergic agonists increases the steady-state levels of ANF mRNA by greater than 15-fold. However, a rat cardiac sodium channel gene mRNA is not induced, indicating that alpha-adrenergic stimulation does not lead to an increase in the expression of all cardiac genes. Studies employing a series of rat ANF luciferase and rat MLC-2 luciferase fusion genes identify 315- and 92-base pair cis regulatory sequences within an embryonic gene (ANF) and a constitutively expressed contractile protein gene (MLC-2), respectively, which mediate alpha-adrenergic-inducible gene expression. Transfection of various ANF luciferase reporters into neonatal rat ventricular cells demonstrated that upstream sequences which mediate tissue-specific expression (-3003 to -638) can be segregated from those responsible for inducibility. The lack of inducibility of a cardiac Na+ channel gene, and the segregation of ANF gene sequences which mediate cardiac specific from those which mediate inducible expression, provides further insight into the relationship between muscle-specific and inducible expression during cardiac myocyte hypertrophy. Based on these results, a testable model is proposed for the induction of embryonic cardiac genes and constitutively expressed contractile protein genes and the noninducibility of a subset of cardiac genes during alpha-adrenergic stimulation of neonatal rat ventricular cells.
USDA-ARS's Scientific Manuscript database
Five developmental stages of Chrysoperla rufilabris were tested using nine primer pairs. Three sequences were highly expressed at all life stages and six were differentially expressed. These primer pairs may be used as standards to quantitate functional gene expression associated with physiological ...
Dean, C; Jones, J; Favreau, M; Dunsmuir, P; Bedbrook, J
1988-01-01
The petunia rbcS gene SSU301 was introduced into tobacco using Agrobacterium tumefaciens-mediated transformation. The time at which rbcS expression was maximal after transfer of the tobacco plants to the greenhouse was determined. The expression level of the SSU301 gene varied up to 9-fold between individual tobacco plants which had been standardized physiologically as much as possible. The presence of adjacent pUC plasmid sequences did not affect the expression of the SSU301 gene. In an attempt to reduce the between-transformant variability in expression, the SSU301 gene was introduced into tobacco surrounded by 10 kb of 5' and 13 kb of 3' DNA sequences which normally flank SSU301 in petunia. The longer flanking regions did not reduce the between-transformant variability of SSU301 gene expression. Images PMID:3174450
Xu, Yuantao; Wu, Guizhi; Hao, Baohai; Chen, Lingling; Deng, Xiuxin; Xu, Qiang
2015-11-23
With the availability of a rapidly increasing number of genome and transcriptome sequences, lineage-specific genes (LSGs) can be identified and characterized. Like other conserved functional genes, LSGs play important roles in biological evolution and functions. Two sets of citrus LSGs, 296 citrus-specific genes (CSGs) and 1039 orphan genes specific to sweet orange, were identified by comparative analysis between the sweet orange genome sequences and 41 genomes and 273 transcriptomes. With the two sets of genes, gene structure and gene expression pattern were investigated. On average, both the CSGs and orphan genes have fewer exons, shorter gene length and higher GC content when compared with evolutionarily conserved genes (ECs). Expression profiling indicated that most of the LSGs were expressed in various tissues of sweet orange and some of them exhibited distinct temporal and spatial expression patterns. In particular, the orphan genes were preferentially expressed in callus, which is an important pluripotent tissue of citrus. In addition, some of the CSGs and orphan genes were expressed in response to abiotic stress, indicating their potential functions during interaction with the environment. This study identified and characterized two sets of LSGs in citrus, dissected their sequence features and expression patterns, and provided valuable clues for future functional analysis of the LSGs in sweet orange.
Rojas-Cartagena, Carmencita; Ortíz-Pineda, Pablo; Ramírez-Gómez, Francisco; Suárez-Castillo, Edna C.; Matos-Cruz, Vanessa; Rodríguez, Carlos; Ortíz-Zuazaga, Humberto; García-Arrarás, José E.
2010-01-01
Repair and regeneration are key processes for tissue maintenance, and their disruption may lead to disease states. Little is known about the molecular mechanisms that underlie the repair and regeneration of the digestive tract. The sea cucumber Holothuria glaberrima represents an excellent model to dissect and characterize the molecular events during intestinal regeneration. To study the gene expression profile, cDNA libraries were constructed from normal, 3-day, and 7-day regenerating intestines of H. glaberrima. Clones were randomly sequenced and queried against the nonredundant protein database at the National Center for Biotechnology Information. RT-PCR analyses were made of several genes to determine their expression profile during intestinal regeneration. A total of 5,173 sequences from three cDNA libraries were obtained. About 46.2, 35.6, and 26.2% of the sequences for the normal, 3-day, and 7-day cDNA libraries, respectively, shared significant similarity with known sequences in the protein database of GenBank but present only 10% similarity among them. Analysis of the libraries in terms of functional processes, protein domains, and most common sequences suggests that a differential expression profile is taking place during the regeneration process. Further examination of the expressed sequence tag dataset revealed that 12 putative genes are differentially expressed at a significant level (R > 6). Experimental validation by RT-PCR analysis reveals that at least three genes (unknown C-4677-1, melanotransferrin, and centaurin) present a differential expression during regeneration. These findings strongly suggest that the gene expression profile varies among regeneration stages and provide evidence for the existence of differential gene expression. PMID:17579180
Raventós, D; Jensen, A B; Rask, M B; Casacuberta, J M; Mundy, J; San Segundo, B
1995-01-01
Transient gene expression assays in barley aleurone protoplasts were used to identify a cis-regulatory element involved in the elicitor-responsive expression of the maize PRms gene. Analysis of transcriptional fusions between PRms 5' upstream sequences and a chloramphenicol acetyltransferase reporter gene, as well as chimeric promoters containing PRms promoter fragments or repeated oligonucleotides fused to a minimal promoter, delineated a 20 bp sequence which functioned as an elicitor-response element (ERE). This sequence contains a motif (-246 AATTGACC) similar to sequences found in promoters of other pathogen-responsive genes. The analysis also indicated that an enhancing sequence(s) between -397 and -296 is required for full PRms activation by elicitors. The protein kinase inhibitor staurosporine was found to completely block the transcriptional activation induced by elicitors. These data indicate that protein phosphorylation is involved in the signal transduction pathway leading to PRms expression.
Majumder, P; Choudhury, A; Banerjee, M; Lahiri, A; Bhattacharyya, N P
2007-08-01
To investigate the mechanism of increased expression of caspase-1 caused by exogenous Hippi, observed earlier in HeLa and Neuro2A cells, in this work we identified a specific motif AAAGACATG (-101 to -93) at the caspase-1 gene upstream sequence where HIPPI could bind. Various mutations in this specific sequence compromised the interaction, showing the specificity of the interactions. In the luciferase reporter assay, when the reporter gene was driven by caspase-1 gene upstream sequences (-151 to -92) with the mutation G to T at position -98, luciferase activity was decreased significantly in green fluorescent protein-Hippi-expressing HeLa cells in comparison to that obtained with the wild-type caspase-1 gene 60 bp upstream sequence, indicating the biological significance of such binding. It was observed that the C-terminal 'pseudo' death effector domain of HIPPI interacted with the 60 bp (-151 to -92) upstream sequence of the caspase-1 gene containing the motif. We further observed that expression of caspase-8 and caspase-10 was increased in green fluorescent protein-Hippi-expressing HeLa cells. In addition, HIPPI interacted in vitro with putative promoter sequences of these genes, containing a similar motif. In summary, we identified a novel function of HIPPI; it binds to specific upstream sequences of the caspase-1, caspase-8 and caspase-10 genes and alters the expression of the genes. This result showed the motif-specific interaction of HIPPI with DNA, and indicates that it could act as a transcription regulator.
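The coordinates in this abstract can be cross-checked with a short sketch. It assumes the motif is written 5' to 3' in the same orientation as the upstream coordinates, so that its first base sits at -101; the function names are illustrative and not from the source.

```python
MOTIF = "AAAGACATG"
MOTIF_START, MOTIF_END = -101, -93  # upstream coordinates from the abstract


def span_length(start, end):
    """Number of bases in an inclusive coordinate range."""
    return abs(end - start) + 1


def base_at(position, motif=MOTIF, start=MOTIF_START):
    """Base of the motif at a given upstream coordinate (assumed orientation)."""
    return motif[position - start]


def mutate(position, new_base, motif=MOTIF, start=MOTIF_START):
    """Return the motif with a single base substituted at `position`."""
    i = position - start
    return motif[:i] + new_base + motif[i + 1:]


print(span_length(MOTIF_START, MOTIF_END))  # length of the motif span
print(span_length(-151, -92))               # length of the reporter fragment
print(base_at(-98))                         # wild-type base at the mutated site
print(mutate(-98, "T"))                     # motif after the G-to-T change
```

Under that assumption the numbers are internally consistent: the motif span covers nine bases, the -151 to -92 fragment is 60 bp, and position -98 falls on the motif's only G in its first four bases.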
Transcriptional insulation of the human keratin 18 gene in transgenic mice.
Neznanov, N; Thorey, I S; Ceceña, G; Oshima, R G
1993-01-01
Expression of the 10-kb human keratin 18 (K18) gene in transgenic mice results in efficient and appropriate tissue-specific expression in a variety of internal epithelial organs, including liver, lung, intestine, kidney, and the ependymal epithelium of brain, but not in spleen, heart, or skeletal muscle. Expression at the RNA level is directly proportional to the number of integrated K18 transgenes. These results indicate that the K18 gene is able to insulate itself both from the commonly observed cis-acting effects of the sites of integration and from the potential complications of duplicated copies of the gene arranged in head-to-tail fashion. To begin to identify the K18 gene sequences responsible for this property of transcriptional insulation, additional transgenic mouse lines containing deletions of either the 5' or 3' distal end of the K18 gene have been characterized. Deletion of 1.5 kb of the distal 5' flanking sequence has no effect upon either the tissue specificity or the copy number-dependent behavior of the transgene. In contrast, deletion of the 3.5-kb 3' flanking sequence of the gene results in the loss of the copy number-dependent behavior of the gene in liver and intestine. However, expression in kidney, lung, and brain remains efficient and copy number dependent in these transgenic mice. Furthermore, herpes simplex virus thymidine kinase gene expression is copy number dependent in transgenic mice when the gene is located between the distal 5'- and 3'-flanking sequences of the K18 gene. Each adult transgenic male expressed the thymidine kinase gene in testes and brain and proportionally to the number of integrated transgenes. We conclude that the characteristic of copy number-dependent expression of the K18 gene is tissue specific because the sequence requirements for transcriptional insulation in adult liver and intestine are different from those for lung and kidney. 
In addition, the behavior of the transgenic thymidine kinase gene in testes and brain suggests that the property of transcriptional insulation of the K18 gene may be conferred by the distal flanking sequences of the K18 gene and, additionally, may function for other genes. Images PMID:7681143
Identification of human chromosome 22 transcribed sequences with ORF expressed sequence tags
de Souza, Sandro J.; Camargo, Anamaria A.; Briones, Marcelo R. S.; Costa, Fernando F.; Nagai, Maria Aparecida; Verjovski-Almeida, Sergio; Zago, Marco A.; Andrade, Luis Eduardo C.; Carrer, Helaine; El-Dorry, Hamza F. A.; Espreafico, Enilza M.; Habr-Gama, Angelita; Giannella-Neto, Daniel; Goldman, Gustavo H.; Gruber, Arthur; Hackel, Christine; Kimura, Edna T.; Maciel, Rui M. B.; Marie, Suely K. N.; Martins, Elizabeth A. L.; Nóbrega, Marina P.; Paçó-Larson, Maria Luisa; Pardini, Maria Inês M. C.; Pereira, Gonçalo G.; Pesquero, João Bosco; Rodrigues, Vanderlei; Rogatto, Silvia R.; da Silva, Ismael D. C. G.; Sogayar, Mari C.; de Fátima Sonati, Maria; Tajara, Eloiza H.; Valentini, Sandro R.; Acencio, Marcio; Alberto, Fernando L.; Amaral, Maria Elisabete J.; Aneas, Ivy; Bengtson, Mário Henrique; Carraro, Dirce M.; Carvalho, Alex F.; Carvalho, Lúcia Helena; Cerutti, Janete M.; Corrêa, Maria Lucia C.; Costa, Maria Cristina R.; Curcio, Cyntia; Gushiken, Tsieko; Ho, Paulo L.; Kimura, Elza; Leite, Luciana C. C.; Maia, Gustavo; Majumder, Paromita; Marins, Mozart; Matsukuma, Adriana; Melo, Analy S. A.; Mestriner, Carlos Alberto; Miracca, Elisabete C.; Miranda, Daniela C.; Nascimento, Ana Lucia T. O.; Nóbrega, Francisco G.; Ojopi, Élida P. B.; Pandolfi, José Rodrigo C.; Pessoa, Luciana Gilbert; Rahal, Paula; Rainho, Claudia A.; da Ro's, Nancy; de Sá, Renata G.; Sales, Magaly M.; da Silva, Neusa P.; Silva, Tereza C.; da Silva, Wilson; Simão, Daniel F.; Sousa, Josane F.; Stecconi, Daniella; Tsukumo, Fernando; Valente, Valéria; Zalcberg, Heloisa; Brentani, Ricardo R.; Reis, Luis F. L.; Dias-Neto, Emmanuel; Simpson, Andrew J. G.
2000-01-01
Transcribed sequences in the human genome can be identified with confidence only by alignment with sequences derived from cDNAs synthesized from naturally occurring mRNAs. We constructed a set of 250,000 cDNAs that represent partial expressed gene sequences and that are biased toward the central coding regions of the resulting transcripts. They are termed ORF expressed sequence tags (ORESTES). The 250,000 ORESTES were assembled into 81,429 contigs. Of these, 1,181 (1.45%) were found to match sequences in chromosome 22 with at least one ORESTES contig for 162 (65.6%) of the 247 known genes, for 67 (44.6%) of the 150 related genes, and for 45 of the 148 (30.4%) EST-predicted genes on this chromosome. Using a set of stringent criteria to validate our sequences, we identified a further 219 previously unannotated transcribed sequences on chromosome 22. Of these, 171 were in fact also defined by EST or full length cDNA sequences available in GenBank but not utilized in the initial annotation of the first human chromosome sequence. Thus despite representing less than 15% of all expressed human sequences in the public databases at the time of the present analysis, ORESTES sequences defined 48 transcribed sequences on chromosome 22 not defined by other sequences. All of the transcribed sequences defined by ORESTES coincided with DNA regions predicted as encoding exons by genscan (http://genes.mit.edu/GENSCAN.html). PMID:11070084
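The coverage figures quoted above can be recomputed from the stated counts. The sketch below uses only numbers given in the abstract; the dictionary labels and the `percent` helper are illustrative.

```python
def percent(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


# (matched, total) pairs taken from the abstract.
counts = {
    "contigs matching chromosome 22": (1181, 81429),
    "known genes": (162, 247),
    "related genes": (67, 150),
    "EST-predicted genes": (45, 148),
}

for label, (part, whole) in counts.items():
    print(f"{label}: {percent(part, whole):.2f}%")

# Newly identified transcribed sequences not supported by other data.
orestes_only = 219 - 171
print(orestes_only)
```

The recomputed values agree with the abstract to the stated precision, except that 67 of 150 works out to roughly 44.7%, slightly above the 44.6% quoted; the difference of 219 and 171 gives the 48 ORESTES-only sequences.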
DOE Office of Scientific and Technical Information (OSTI.GOV)
Larsen, P. E.; Trivedi, G.; Sreedasyam, A.
2010-07-06
Accurate structural annotation is important for prediction of function and required for in vitro approaches to characterize or validate the gene expression products. Despite significant efforts in the field, determination of the gene structure from genomic data alone is a challenging and inaccurate process. The ease of acquisition of transcriptomic sequence provides a direct route to identify expressed sequences and determine the correct gene structure. We developed methods to utilize RNA-seq data to correct errors in the structural annotation and extend the boundaries of current gene models using assembly approaches. The methods were validated with a transcriptomic data set derived from the fungus Laccaria bicolor, which develops a mycorrhizal symbiotic association with the roots of many tree species. Our analysis focused on the subset of 1501 gene models that are differentially expressed in the free living vs. mycorrhizal transcriptome and are expected to be important elements related to carbon metabolism, membrane permeability and transport, and intracellular signaling. Of the set of 1501 gene models, 1439 (96%) successfully generated modified gene models in which all error flags were successfully resolved and the sequences aligned to the genomic sequence. The remaining 4% (62 gene models) either had deviations from transcriptomic data that could not be spanned or generated sequence that did not align to genomic sequence. The outcome of this process is a set of high confidence gene models that can be reliably used for experimental characterization of protein function. 69% of expressed mycorrhizal JGI 'best' gene models deviated from the transcript sequence derived by this method. The transcriptomic sequence enabled correction of a majority of the structural inconsistencies and resulted in a set of validated models for 96% of the mycorrhizal genes.
The method described here can be applied to improve gene structural annotation in other species, provided that there is a sequenced genome and a set of gene models.
Singhal, Dinesh K; Singhal, Raxita; Malik, Hruda N; Kumar, Surender; Kumar, Sudarshan; Mohanty, Ashok K; Kaushik, Jai K; Malakar, Dhruba
2014-01-01
Nanog is a homeodomain-containing protein which plays important roles in regulation of signaling pathways for maintenance and induction of pluripotency in stem cells. Because of its unique expression in stem cells it is also regarded as a pluripotency marker. In this study the goat Nanog (gNanog) gene has been amplified, cloned and characterized at sequence level with successful over-expression in the CHO-K1 cell line using a lentiviral based system. The gNanog ORF is 903 bp long and codes for a Nanog protein of 300 amino acids (aas). The complete nucleotide sequence shows some evolutionary mutation in goat in comparison to other species. The protein sequence of goat is highly similar to that of other species. Overall, gNanog nucleotide sequence and predicted protein sequence showed high similarity and minimum divergence with cattle (96 % identity/4 % divergence) and buffalo (94/5 %) while low similarity and high divergence with pig (84/15 %), human (81/23 %) and mouse (69/40 %) indicating evolutionary closeness of gNanog to cattle and buffalo. A gNanog lentiviral expression construct was prepared for over-expression of the Nanog gene in adult goat fibroblast cells. The lentiviral expression construct of Nanog enabled continuous protein expression for induction and maintenance of pluripotency. Western blotting revealed the expression of the Nanog gene at protein level, which supported that the lentiviral expression system is highly promising for Nanog protein expression in differentiated goat cells.
Voels, Brent; Wang, Liping; Sens, Donald A; Garrett, Scott H; Zhang, Ke; Somji, Seema
2017-05-25
The 3rd isoform of the metallothionein (MT3) gene family has been shown to be overexpressed in most ductal breast cancers. A previous study has shown that the stable transfection of MCF-7 cells with the MT3 gene inhibits cell growth. The goal of the present study was to determine the role of the unique C-terminal and N-terminal sequences of MT3 on phenotypic properties and gene expression profiles of MCF-7 cells. MCF-7 cells were transfected with various metallothionein gene constructs which contain the insertion or the removal of the unique MT3 C- and N-terminal domains. Global gene expression analysis was performed on the MCF-7 cells containing the various constructs and the expression of the unique C- and N-terminal domains of MT3 was correlated to phenotypic properties of the cells. The results of the present study demonstrate that the C-terminal sequence of MT3, in the absence of the N-terminal sequence, induces dome formation in MCF-7 cells, which in cell cultures is the phenotypic manifestation of a cell's ability to perform vectorial active transport. Global gene expression analysis demonstrated that the increased expression of the GAGE gene family correlated with dome formation. Expression of the C-terminal domain induced GAGE gene expression, whereas the N-terminal domain inhibited GAGE gene expression, and the inhibitory effect of the N-terminal domain was dominant over the C-terminal domain of MT3. Transfection with the metallothionein 1E gene increased the expression of GAGE genes. In addition, both the C- and the N-terminal sequences of the MT3 gene had growth inhibitory properties, which correlated to an increased expression of the interferon alpha-inducible protein 6. Our study shows that the C-terminal domain of MT3 confers dome formation in MCF-7 cells and the presence of this domain induces expression of the GAGE family of genes.
The differential effects of MT3 and metallothionein 1E on the expression of GAGE genes suggest unique roles of these genes in the development and progression of breast cancer. The finding that interferon alpha-inducible protein 6 expression is associated with the ability of MT3 to inhibit growth needs further investigation.
Li, XiaoChing; Wang, Xiu-Jie; Tannenhauser, Jonathan; Podell, Sheila; Mukherjee, Piali; Hertel, Moritz; Biane, Jeremy; Masuda, Shoko; Nottebohm, Fernando; Gaasterland, Terry
2007-01-01
Vocal learning and neuronal replacement have been studied extensively in songbirds, but until recently, few molecular and genomic tools for songbird research existed. Here we describe new molecular/genomic resources developed in our laboratory. We made cDNA libraries from zebra finch (Taeniopygia guttata) brains at different developmental stages. A total of 11,000 cDNA clones from these libraries, representing 5,866 unique gene transcripts, were randomly picked and sequenced from the 3′ ends. A web-based database was established for clone tracking, sequence analysis, and functional annotations. Our cDNA libraries were not normalized. Sequencing ESTs without normalization produced many developmental stage-specific sequences, yielding insights into patterns of gene expression at different stages of brain development. In particular, the cDNA library made from brains at posthatching day 30–50, corresponding to the period of rapid song system development and song learning, has the most diverse and richest set of genes expressed. We also identified five microRNAs whose sequences are highly conserved between zebra finch and other species. We printed cDNA microarrays and profiled gene expression in the high vocal center of both adult male zebra finches and canaries (Serinus canaria). Genes differentially expressed in the high vocal center were identified from the microarray hybridization results. Selected genes were validated by in situ hybridization. Networks among the regulated genes were also identified. These resources provide songbird biologists with tools for genome annotation, comparative genomics, and microarray gene expression analysis. PMID:17426146
Tamori, Akihiro; Yamanishi, Yoshihiro; Kawashima, Shuichi; Kanehisa, Minoru; Enomoto, Masaru; Tanaka, Hiromu; Kubo, Shoji; Shiomi, Susumu; Nishiguchi, Shuhei
2005-08-15
Integration of hepatitis B virus (HBV) DNA into the human genome is one of the most important steps in HBV-related carcinogenesis. This study attempted to find the link between HBV DNA, the adjoining cellular sequence, and altered gene expression in hepatocellular carcinoma (HCC) with integrated HBV DNA. We examined 15 cases of HCC infected with HBV by cassette ligation-mediated PCR. The human DNA adjacent to the integrated HBV DNA was sequenced. Protein coding sequences were searched for in the human sequence. In five cases with HBV DNA integration, from which good quality RNA was extracted, gene expression was examined by cDNA microarray analysis. The human DNA sequence successive to integrated HBV DNA was determined in the 15 HCCs. Eight protein-coding regions were involved: ras-responsive element binding protein 1, calmodulin 1, mixed lineage leukemia 2 (MLL2), FLJ333655, LOC220272, LOC255345, LOC220220, and LOC168991. The MLL2 gene was expressed in three cases with HBV DNA integrated into exon 3 of MLL2 and in one case with HBV DNA integrated into intron 3 of MLL2. Gene expression analysis suggested that two HCCs with HBV integrated into MLL2 had similar patterns of gene expression compared with three HCCs with HBV integrated into other loci of human chromosomes. HBV DNA was integrated at random sites of human DNA, and the MLL2 gene was one of the targets for integration. Our results suggest that HBV DNA might modulate human genes near integration sites, followed by integration site-specific expression of such genes during hepatocarcinogenesis.
NASA Astrophysics Data System (ADS)
Boone, R. D.; Rogers, S. L.
2004-12-01
We report on work to assess the functional gene sequences for soil microbiota that control nitrogen cycle pathways along the successional sequence (willow, alder, poplar, white spruce, black spruce) on the Tanana River floodplain, Interior Alaska. Microbial DNA and mRNA were extracted from soils (0-10 cm depth) for amoA (ammonia monooxygenase), nifH (nitrogenase reductase), napA (nitrate reductase), and nirS and nirK (nitrite reductase) genes. Gene presence was determined by amplification of a conserved sequence of each gene employing sequence-specific oligonucleotide primers and Polymerase Chain Reaction (PCR). Expression of the genes was measured via nested reverse transcriptase PCR amplification of the extracted mRNA. Amplified PCR products were visualized on agarose electrophoresis gels. All five successional stages show evidence for the presence and expression of microbial genes that regulate N fixation (free-living), nitrification, and nitrate reduction. We detected (1) nifH, napA, and nirK presence and amoA expression (mRNA production) for all five successional stages and (2) nirS and amoA presence and nifH, nirK, and napA expression for early successional stages (willow, alder, poplar). The results highlight that the existing body of process-level work has not sufficiently considered the microbial potential for a nitrate economy and free-living N fixation along the complete floodplain successional sequence.
Oishi, M; Gohma, H; Lejukole, H Y; Taniguchi, Y; Yamada, T; Suzuki, K; Shinkai, H; Uenishi, H; Yasue, H; Sasaki, Y
2004-05-01
Expressed sequence tags (ESTs), generated by characterizing clones isolated randomly from cDNA libraries, are used to study gene expression profiles in specific tissues and to provide useful information for characterizing tissue physiology. In this study, two directionally cloned cDNA libraries were constructed from 60-day-old bovine whole fetus and fetal placenta. We characterized 5357 and 1126 clones and identified 3464 and 795 unique sequences for the fetus and placenta cDNA libraries, respectively: 1851 and 504 showed homology to already identified genes, and 1613 and 291 showed no significant matches to any of the sequences in DNA databases. Further, we found 94 unique sequences overlapping in both the fetus and the placenta, leading to a catalog of 4165 genes expressed in 60-day-old fetus and placenta. The catalog is used to examine the expression profile of genes in 60-day-old bovine fetus and placenta.
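The catalog size reported above follows from inclusion-exclusion over the two libraries. A minimal sketch of that arithmetic, using only the counts given in the abstract (the function and variable names are hypothetical):

```python
def catalog_size(unique_a: int, unique_b: int, shared: int) -> int:
    """Size of the union of two sets of unique sequences (inclusion-exclusion)."""
    return unique_a + unique_b - shared


# Counts reported in the abstract, split by database match status.
fetus = {"homologous": 1851, "no_match": 1613}
placenta = {"homologous": 504, "no_match": 291}

fetus_unique = sum(fetus.values())        # unique sequences in the fetus library
placenta_unique = sum(placenta.values())  # unique sequences in the placenta library
total = catalog_size(fetus_unique, placenta_unique, shared=94)

print(fetus_unique, placenta_unique, total)  # 3464 795 4165
```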
Lee, Je Hyuk; Daugharthy, Evan R.; Scheiman, Jonathan; Kalhor, Reza; Ferrante, Thomas C.; Terry, Richard; Turczyk, Brian M.; Yang, Joyce L.; Lee, Ho Suk; Aach, John; Zhang, Kun; Church, George M.
2014-01-01
RNA sequencing measures the quantitative change in gene expression over the whole transcriptome, but it lacks spatial context. On the other hand, in situ hybridization provides the location of gene expression, but only for a small number of genes. Here we detail a protocol for genome-wide profiling of gene expression in situ in fixed cells and tissues, in which RNA is converted into cross-linked cDNA amplicons and sequenced manually on a confocal microscope. Unlike traditional RNA-seq our method enriches for context-specific transcripts over house-keeping and/or structural RNA, and it preserves the tissue architecture for RNA localization studies. Our protocol is written for researchers experienced in cell microscopy with minimal computing skills. Library construction and sequencing can be completed within 14 d, with image analysis requiring an additional 2 d. PMID:25675209
NASA Astrophysics Data System (ADS)
Tian, Caihong; Tek Tay, Wee; Feng, Hongqiang; Wang, Ying; Hu, Yongmin; Li, Guoping
2015-06-01
Adelphocoris suturalis is one of the most serious pest insects of Bt cotton in China; however, its molecular genetics, biochemistry and physiology are poorly understood. We used a high-throughput sequencing platform to perform de novo transcriptome assembly and gene expression analyses across different developmental stages (eggs, 2nd and 5th instar nymphs, female and male adults). We obtained 20 GB of clean data and revealed 88,614 unigenes, including 23,830 clusters and 64,784 singletons. These unigene sequences were annotated and classified by Gene Ontology, Clusters of Orthologous Groups, and Kyoto Encyclopedia of Genes and Genomes databases. A large number of differentially expressed genes were discovered through pairwise comparisons between these developmental stages. Gene expression profiles were dramatically different between life stage transitions, with some of the most differentially expressed genes being associated with sex difference, metabolism and development. Quantitative real-time PCR results confirmed the deep-sequencing findings based on relative expression levels of nine randomly selected genes. Furthermore, over 791,390 single nucleotide polymorphisms and 2,682 potential simple sequence repeats were identified. Our study provides comprehensive transcriptional gene expression information for A. suturalis that will form the basis for a better understanding of development pathways, hormone biosynthesis, sex differences and wing formation in mirid bugs.
ERIC Educational Resources Information Center
Pimenta, Aurea F.; Levitt, Pat
2005-01-01
The human and mouse genome projects elucidated the sequence and position map of innumerous genes expressed in the central nervous system (CNS), advancing our ability to manipulate these sequences and create models to investigate regulation of gene expression and function. In this article, we reviewed gene targeting methodologies with emphasis on…
Analysis of codon usage in beta-tubulin sequences of helminths.
von Samson-Himmelstjerna, G; Harder, A; Failing, K; Pape, M; Schnieder, T
2003-07-01
Codon usage bias has been shown to be correlated with gene expression levels in many organisms, including the nematode Caenorhabditis elegans. Here, the codon usage (cu) characteristics for a set of currently available beta-tubulin coding sequences of helminths were assessed by calculating several indices, including the effective codon number (Nc), the intrinsic codon deviation index (ICDI), the P2 value and the mutational response index (MRI). The P2 value gives a measure of translational pressure, which has been shown to be correlated to high gene expression levels in some organisms, but it has not yet been analysed in that respect in helminths. For all but two of the C. elegans beta-tubulin coding sequences investigated, the P2 value was the only index that indicated the presence of codon usage bias. Therefore, we propose that in general the helminth beta-tubulin sequences investigated here are not expressed at high levels. Furthermore, we calculated the correlation coefficients for the cu patterns of the helminth beta-tubulin sequences compared with those of highly expressed genes in organisms such as Escherichia coli and C. elegans. It was found that beta-tubulin cu patterns for all sequences of members of the Strongylida were significantly correlated to those for highly expressed C. elegans genes. This approach provides a new measure for comparing the adaptation of cu of a particular coding sequence with that of highly expressed genes in possible expression systems. Finally, using the cu patterns of the sequences studied, a phylogenetic tree was constructed. The topology of this tree was very much in concordance with that of a phylogeny based on small subunit ribosomal DNA sequence alignments.
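The comparison of codon usage patterns described above can be pictured as correlating two vectors of codon frequencies. The abstract does not say which correlation coefficient or frequency normalization was used, so the sketch below assumes plain relative frequencies over all 64 codons and a Pearson coefficient; the sequences are toy examples, not real beta-tubulin data.

```python
from collections import Counter
from itertools import product
from math import sqrt

# All 64 codons in a fixed order so that frequency vectors are comparable.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]


def codon_frequencies(cds: str) -> list[float]:
    """Relative frequency of each codon in an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts[c] for c in CODONS)
    return [counts[c] / total for c in CODONS]


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Toy coding sequences (hypothetical).
query = codon_frequencies("ATGGCTGCTAAAGAATAA")
reference = codon_frequencies("ATGGCTGCCAAAGAGTAA")
print(round(pearson(query, reference), 3))
```

In practice a reference vector would be pooled from many highly expressed genes of the candidate expression host rather than from a single sequence.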
Generation of 2A-linked multicistronic cassettes by recombinant PCR.
Szymczak-Workman, Andrea L; Vignali, Kate M; Vignali, Dario A A
2012-02-01
The need for reliable, multicistronic vectors for multigene delivery is at the forefront of biomedical technology. It is now possible to express multiple proteins from a single open reading frame (ORF) using 2A peptide-linked multicistronic vectors. These small sequences, when cloned between genes, allow for efficient, stoichiometric production of discrete protein products within a single vector through a novel "cleavage" event within the 2A peptide sequence. Expression of more than two genes using conventional approaches has several limitations, most notably imbalanced protein expression and large size. The use of 2A peptide sequences alleviates these concerns. They are small (18-22 amino acids) and have divergent amino-terminal sequences, which minimizes the chance for homologous recombination and allows for multiple, different 2A peptide sequences to be used within a single vector. Importantly, separation of genes placed between 2A peptide sequences is nearly 100%, which allows for stoichiometric and concordant expression of the genes, regardless of the order of placement within the vector. This protocol describes the use of recombinant polymerase chain reaction (PCR) to connect multiple 2A-linked protein sequences. The final construct is subcloned into an expression vector.
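The layout of a 2A-linked cassette and its expected products can be sketched in a few lines. The example below assumes the commonly cited P2A and T2A peptide sequences and the usual description of the "cleavage" as occurring between the final glycine and proline of the 2A peptide, so that the upstream product keeps most of the 2A peptide and the downstream product starts with proline. The sequences should be checked against the vector actually used; the protein sequences and function names are hypothetical placeholders.

```python
# Commonly cited 2A peptide sequences (assumption: verify against your construct).
P2A = "ATNFSLLKQAGDVEENPGP"
T2A = "EGRGSLLTCGDVEENPGP"


def build_polyprotein(proteins: list[str], linkers: list[str]) -> str:
    """Concatenate proteins with a 2A peptide between each adjacent pair."""
    if len(linkers) != len(proteins) - 1:
        raise ValueError("need exactly one linker between each pair of proteins")
    parts = []
    for i, protein in enumerate(proteins):
        parts.append(protein)
        if i < len(linkers):
            parts.append(linkers[i])
    return "".join(parts)


def predicted_products(proteins: list[str], linkers: list[str]) -> list[str]:
    """Split the polyprotein at each 2A site, between the final Gly and Pro."""
    products = []
    carry = ""  # residue inherited from the preceding 2A peptide
    for i, protein in enumerate(proteins):
        seq = carry + protein
        if i < len(linkers):
            seq += linkers[i][:-1]   # 2A peptide minus its last proline
            carry = linkers[i][-1]   # that proline starts the next product
        products.append(seq)
    return products


# Hypothetical three-gene cassette with placeholder protein sequences.
genes = ["MKT", "MSE", "MLV"]
for product_seq in predicted_products(genes, [P2A, T2A]):
    print(product_seq)
```

Using a different 2A peptide at each junction, as here, mirrors the abstract's point that divergent sequences reduce the chance of homologous recombination within the vector.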
Peng, Jing; Peng, Futian; Zhu, Chunfu; Wei, Shaochong
2008-06-01
A putative isopentenyltransferase (IPT) encoding gene was identified from a pingyitiancha (Malus hupehensis Rehd.) expressed sequence tag database, and the full-length gene was cloned by RACE. Based on expression profile and sequence alignment, the nucleotide sequence of the clone, named MhIPT3, was most similar to AtIPT3, an IPT gene in Arabidopsis. The full-length cDNA contained a 963-bp open reading frame encoding a protein of 321 amino acids with a molecular mass of 37.3 kDa. Sequence analysis of genomic DNA revealed the absence of introns in the frame. Quantitative real-time PCR analysis demonstrated that the gene was expressed in roots, stems and leaves. Application of nitrate to roots of nitrogen-deprived seedlings strongly induced expression of MhIPT3 and was accompanied by the accumulation of cytokinins, whereas MhIPT3 expression was little affected by ammonium application to roots of nitrogen-deprived seedlings. Application of nitrate to leaves also up-regulated the expression of MhIPT3 and corresponded closely with the accumulation of isopentenyladenine and isopentenyladenosine in leaves.
Deonovic, Benjamin; Wang, Yunhao; Weirather, Jason; Wang, Xiu-Jie; Au, Kin Fai
2017-01-01
Allele-specific expression (ASE) is a fundamental problem in studying gene regulation and diploid transcriptome profiles, with two key challenges: (i) haplotyping and (ii) estimation of ASE at the gene isoform level. Existing ASE analysis methods are limited by a dependence on haplotyping from laborious experiments or extra genome/family trio data. In addition, there is a lack of methods for gene isoform level ASE analysis. We developed a tool, IDP-ASE, for full ASE analysis. By innovative integration of Third Generation Sequencing (TGS) long reads with Second Generation Sequencing (SGS) short reads, the accuracy of haplotyping and ASE quantification at the gene and gene isoform level was greatly improved, as demonstrated by the gold standard GM12878 data and semi-simulation data. In addition to methodology development, applications of IDP-ASE to human embryonic stem cells and breast cancer cells indicate that the imbalance of ASE and non-uniformity of gene isoform ASE is widespread, including in tumorigenesis-relevant genes and pluripotency markers. These results show that gene isoform expression and allele-specific expression cooperate to provide high diversity and complexity of gene regulation and expression, highlighting the importance of studying ASE at the gene isoform level. Our study provides a robust bioinformatics solution to understand ASE using RNA sequencing data only. PMID:27899656
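For readers unfamiliar with what "allelic imbalance" means numerically, a generic illustration may help. The sketch below is not the IDP-ASE model, which the abstract does not describe in detail; it assumes the simplest textbook setting, in which reads have already been assigned to two haplotypes and the counts are tested against a 50:50 expectation with an exact two-sided binomial test. The function names and read counts are hypothetical.

```python
from math import comb


def allelic_ratio(hap1_reads: int, hap2_reads: int) -> float:
    """Fraction of phased reads that come from haplotype 1."""
    return hap1_reads / (hap1_reads + hap2_reads)


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value: sum of outcomes no more likely than k."""
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Small relative tolerance guards against floating-point ties.
    tail = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, tail)


# Hypothetical phased read counts for one gene isoform.
hap1, hap2 = 30, 10
print(allelic_ratio(hap1, hap2))
print(binomial_two_sided_p(hap1, hap1 + hap2))
```

A real analysis would also need to handle mapping bias, overdispersion and multiple testing across genes and isoforms, none of which this sketch attempts.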
Oduru, Sreedhar; Campbell, Janee L; Karri, SriTulasi; Hendry, William J; Khan, Shafiq A; Williams, Simon C
2003-01-01
Background: Complete genome annotation will likely be achieved through a combination of computer-based analysis of available genome sequences combined with direct experimental characterization of expressed regions of individual genomes. We have utilized a comparative genomics approach involving the sequencing of randomly selected hamster testis cDNAs to begin to identify genes not previously annotated on the human, mouse, rat and Fugu (pufferfish) genomes. Results: 735 distinct sequences were analyzed for their relatedness to known sequences in public databases. Eight of these sequences were derived from previously unidentified genes and expression of these genes in testis was confirmed by Northern blotting. The genomic locations of each sequence were mapped in human, mouse, rat and pufferfish, where applicable, and the structure of their cognate genes was derived using computer-based predictions, genomic comparisons and analysis of uncharacterized cDNA sequences from human and macaque. Conclusion: The use of a comparative genomics approach resulted in the identification of eight cDNAs that correspond to previously uncharacterized genes in the human genome. The proteins encoded by these genes included a new member of the kinesin superfamily, a SET/MYND-domain protein, and six proteins for which no specific function could be predicted. Each gene was expressed primarily in testis, suggesting that they may play roles in the development and/or function of testicular cells. PMID:12783626
Gene expression analysis of flax seed development
2011-01-01
Background: Flax, Linum usitatissimum L., is an important crop whose seed oil and stem fiber have multiple industrial applications. Flax seeds are also well-known for their nutritional attributes, viz., omega-3 fatty acids in the oil and lignans and mucilage from the seed coat. In spite of the importance of this crop, there are few molecular resources that can be utilized toward improving seed traits. Here, we describe flax embryo and seed development and generation of comprehensive genomic resources for the flax seed. Results: We describe a large-scale generation and analysis of expressed sequences in various tissues. Collectively, the 13 libraries we have used provide a broad representation of genes active in developing embryos (globular, heart, torpedo, cotyledon and mature stages), seed coats (globular and torpedo stages) and endosperm (pooled globular to torpedo stages) and genes expressed in flowers, etiolated seedlings, leaves, and stem tissue. A total of 261,272 expressed sequence tags (EST) (GenBank accessions LIBEST_026995 to LIBEST_027011) were generated. These EST libraries included transcription factor genes that are typically expressed at low levels, indicating that the depth is adequate for in silico expression analysis. Assembly of the ESTs resulted in 30,640 unigenes and 82% of these could be identified on the basis of homology to known and hypothetical genes from other plants. When compared with fully sequenced plant genomes, the flax unigenes resembled poplar and castor bean more than grape, sorghum, rice or Arabidopsis. Nearly one-fifth of these (5,152) had no homologs in sequences reported for any organism, suggesting that this category represents genes that are likely unique to flax. Digital analyses revealed gene expression dynamics for the biosynthesis of a number of important seed constituents during seed development. 
Conclusions: We have developed a foundational database of expressed sequences and collection of plasmid clones that comprise even low-expressed genes such as those encoding transcription factors. This has allowed us to delineate the spatio-temporal aspects of gene expression underlying the biosynthesis of a number of important seed constituents in flax. Flax belongs to a taxonomic group of diverse plants and the large sequence database will allow for evolutionary studies as well. PMID:21529361
Bedon, Frank; Grima-Pettenati, Jacqueline; Mackay, John
2007-01-01
Background: Several members of the R2R3-MYB family of transcription factors act as regulators of lignin and phenylpropanoid metabolism during wood formation in angiosperm and gymnosperm plants. The angiosperm Arabidopsis has over one hundred R2R3-MYB genes; however, only a few members of this family have been discovered in gymnosperms. Results: We isolated and characterised full-length cDNAs encoding R2R3-MYB genes from the gymnosperms white spruce, Picea glauca (13 sequences), and loblolly pine, Pinus taeda L. (five sequences). Sequence similarities and phylogenetic analyses placed the spruce and pine sequences in diverse subgroups of the large R2R3-MYB family, although several of the sequences clustered closely together. We searched the highly variable C-terminal region of diverse plant MYBs for conserved amino acid sequences and identified 20 motifs in the spruce MYBs, nine of which have not previously been reported and three of which are specific to conifers. The number and length of the introns in spruce MYB genes varied significantly, but their positions were well conserved relative to angiosperm MYB genes. Quantitative RT-PCR of MYB gene transcript abundance in root and stem tissues revealed diverse expression patterns; three MYB genes were preferentially expressed in secondary xylem, whereas others were preferentially expressed in phloem or were ubiquitous. The MYB genes expressed in xylem, and three others, were up-regulated in the compression wood of leaning trees within 76 hours of induction. Conclusion: Our survey of 18 conifer R2R3-MYB genes clearly showed a gene family structure similar to that of Arabidopsis. Three of the sequences are likely to play a role in lignin metabolism and/or wood formation in gymnosperm trees, including a close homolog of the loblolly pine PtMYB4, shown to regulate lignin biosynthesis in transgenic tobacco. PMID:17397551
Wang, Zhong-dong; Wu, Ji-nan; Zhou, Lin; Ling, Jun-qi; Guo, Xi-min; Xiao, Ming-zhen; Zhu, Feng; Pu, Qin; Chai, Yu-bo; Zhao, Zhong-liang
2007-02-01
To study the biological properties of human dental pulp cells (HDPC) by cloning and analysis of genes differentially expressed in HDPC in comparison with human gingival fibroblasts (HGF). HDPC and HGF were cultured and identified by immunocytochemistry. An HDPC and HGF subtractive cDNA library was established by PCR-based modified subtractive hybridization; genes differentially expressed by HDPC were cloned, sequenced and compared with sequences in GenBank by BLAST to find homologous sequences. Cloning and sequencing analysis indicated that 12 differentially expressed genes were obtained, of which two were unknown genes. Among the 10 known genes, 4 were related to signal transduction, 2 were related to trans-membrane transportation (both cell membrane and nuclear membrane), and 2 were related to RNA splicing mechanisms. The biological properties of HDPC are determined by the differential expression of some genes, and the growth and differentiation of HDPC are associated with the dynamic protein synthesis and secretion activities of the cell.
Llopart, Ana
2018-05-01
The hemizygosity of the X (Z) chromosome fully exposes the fitness effects of mutations on that chromosome and has evolutionary consequences on the relative rates of evolution of X and autosomes. Specifically, several population genetics models predict increased rates of evolution in X-linked loci relative to autosomal loci. This prediction of faster-X evolution has been evaluated and confirmed for both protein coding sequences and gene expression. In the case of faster-X evolution for gene expression divergence, it is often assumed that variation in 5' noncoding sequences is associated with variation in transcript abundance between species, but a formal, genomewide test of this hypothesis is still missing. Here, I use whole genome sequence data in Drosophila yakuba and D. santomea to evaluate this hypothesis and report positive correlations between sequence divergence at 5' noncoding sequences and gene expression divergence. I also examine polymorphism and divergence in 9,279 noncoding sequences located at the 5' end of annotated genes and detect multiple signals of positive selection. Notably, I use the traditional synonymous sites as a neutral reference to test for adaptive evolution, but I also use bases 8-30 of introns <65 bp, which have been proposed to be a better neutral choice. X-linked genes with a high degree of male-biased expression show the most extreme adaptive pattern at 5' noncoding regions, in agreement with faster-X evolution for gene expression divergence and a higher incidence of positively selected recessive mutations. © 2018 The Authors. Molecular Ecology Published by John Wiley & Sons Ltd.
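Tests that contrast polymorphism and divergence at a focal class of sites against a neutral reference are often framed in McDonald-Kreitman terms. The abstract does not name the test that was used, so the following is only an assumed illustration of that general framework: it computes the standard estimate of the proportion of adaptive substitutions, alpha = 1 - (D_neutral * P_focal) / (D_focal * P_neutral). All counts are made up for the example.

```python
def mk_alpha(d_focal: int, p_focal: int, d_neutral: int, p_neutral: int) -> float:
    """Estimated fraction of focal-site divergence attributable to adaptive fixation.

    d_* are counts of fixed differences between species and p_* are counts of
    polymorphic sites within species, for the focal class (e.g. 5' noncoding)
    and for the neutral reference class (e.g. synonymous or short-intron sites).
    """
    return 1.0 - (d_neutral * p_focal) / (d_focal * p_neutral)


# Hypothetical counts: an excess of focal divergence relative to polymorphism.
alpha = mk_alpha(d_focal=40, p_focal=20, d_neutral=100, p_neutral=100)
print(alpha)  # 0.5
```

Swapping the neutral reference, as the abstract describes for synonymous sites versus short-intron bases, only changes the `d_neutral` and `p_neutral` inputs.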
Li, Shuyu; Li, Yiqun Helen; Wei, Tao; Su, Eric Wen; Duffin, Kevin; Liao, Birong
2006-10-25
The tissue expression pattern of a gene often provides an important clue to its potential role in a biological process. A vast amount of gene expression data has been, and is being, accumulated in public repositories through different technology platforms. However, exploitation of these rich data sources remains limited, in part due to issues of technology standardization. Our objective is to test the data comparability between SAGE and microarray technologies by examining the expression pattern of genes under normal physiological states across a variety of tissues. There are 42-54% of genes showing significant correlations in tissue expression patterns between SAGE and GeneChip, with 30-40% of genes whose expression patterns are positively correlated and 10-15% of genes whose expression patterns are negatively correlated at a statistically significant level (p = 0.05). Our analysis suggests that the discrepancy in the expression patterns derived from the two technology platforms is not likely to result from the heterogeneity of tissues used in these technologies, or from other spurious correlations resulting from microarray probe design, abundance of genes, or gene function. The discrepancy can be partially explained by errors in the original assignment of SAGE tags to genes due to the evolution of sequence databases. In addition, sequence analysis has indicated that many SAGE tags and Affymetrix array probe sets are mapped to different splice variants or different sequence regions although they represent the same gene, which also contributes to the observed discrepancies between SAGE and array expression data. To our knowledge, this is the first report attempting to mine gene expression patterns across tissues using public data from different technology platforms. Unlike previous similar studies that only demonstrated the discrepancies between the two gene expression platforms, we carried out in-depth analysis to further investigate the cause of such discrepancies. 
Our study shows that exploiting rich public expression resources requires extensive knowledge of the technologies and the experiments. Informatics methodologies for better interoperability among platforms are still lacking. One area that can be improved in practice is the accurate sequence mapping of SAGE tags and array probes to full-length genes.
Chae, Heejoon; Lee, Sangseon; Seo, Seokjun; Jung, Daekyoung; Chang, Hyeonsook; Nephew, Kenneth P; Kim, Sun
2016-12-01
Measuring gene expression, DNA sequence variation, and DNA methylation status is routinely done using high throughput sequencing technologies. To analyze such multi-omics data and explore relationships, reliable bioinformatics systems are much needed. Existing systems are either for exploring curated data or for processing omics data in the form of a library such as R. Thus scientists have much difficulty in investigating relationships among gene expression, DNA sequence variation, and DNA methylation using multi-omics data. In this study, we report a system called BioVLAB-mCpG-SNP-EXPRESS for the integrated analysis of DNA methylation, sequence variation (SNPs), and gene expression for distinguishing cellular phenotypes at the pairwise and multiple phenotype levels. The system can be deployed on either the Amazon cloud or a publicly available high-performance computing node, and the data analysis and exploration of the analysis result can be conveniently done using a web-based interface. To reduce analysis complexity, all processes are fully automated, and a graphical workflow system is integrated to show analysis progress in real time. The BioVLAB-mCpG-SNP-EXPRESS system works in three stages. First, it processes and analyzes multi-omics data supplied as raw data, i.e., FastQ files. Second, various integrated analyses such as methylation vs. gene expression and mutation vs. methylation are performed. Finally, the analysis result can be explored in a number of ways through a web interface for multi-level, multi-perspective exploration. Multi-level interpretation can be done at the gene, gene set, pathway or network level, and multi-perspective exploration can be done from the perspective of gene expression, DNA methylation, sequence variation, or their relationships. The utility of the system is demonstrated by analyzing a data set of 30 phenotypically distinct breast cancer cell lines. 
BioVLAB-mCpG-SNP-EXPRESS is available at http://biohealth.snu.ac.kr/software/biovlab_mcpg_snp_express/. Copyright © 2016 Elsevier Inc. All rights reserved.
Yao, Yuan-Qing; Lee, Kai-Fai; Xu, Jia-Seng; Ho, Pak-Chung; Yeung, Shu-Biu
2007-09-01
To investigate the effect of embryotrophic factors (ETF) from human oviductal cells on gene expression in early mouse embryos and to discuss the role of the fallopian tube in early embryo development. ETF was isolated from the conditioned medium of a human oviductal cell line by sequential liquid chromatographic systems. Mouse embryos were treated with ETF in vitro. Using differential display RT-PCR, the gene expression of embryos treated with ETF was compared with that of embryos without ETF treatment. The differentially expressed genes were separated, re-amplified, cloned and sequenced. Gene expression profiles of embryos with ETF treatment were different from those of embryos without this treatment. Eight differentially expressed genes were cloned and sequenced. These genes functioned in RNA degradation, synthesis, splicing, protein trafficking, cellular differentiation and embryo development. Embryotrophic factors from human oviductal cells affect gene expression of early developmental embryos. The human oviductal cells play wide roles in early developmental stages of embryos.
Highly multiplexed subcellular RNA sequencing in situ
Lee, Je Hyuk; Daugharthy, Evan R.; Scheiman, Jonathan; Kalhor, Reza; Ferrante, Thomas C.; Yang, Joyce L.; Terry, Richard; Jeanty, Sauveur S. F.; Li, Chao; Amamoto, Ryoji; Peters, Derek T.; Turczyk, Brian M.; Marblestone, Adam H.; Inverso, Samuel A.; Bernard, Amy; Mali, Prashant; Rios, Xavier; Aach, John; Church, George M.
2014-01-01
Understanding the spatial organization of gene expression with single nucleotide resolution requires localizing the sequences of expressed RNA transcripts within a cell in situ. Here we describe fluorescent in situ RNA sequencing (FISSEQ), in which stably cross-linked cDNA amplicons are sequenced within a biological sample. Using 30-base reads from 8,742 genes in situ, we examined RNA expression and localization in human primary fibroblasts using a simulated wound healing assay. FISSEQ is compatible with tissue sections and whole mount embryos, and reduces the limitations of optical resolution and noisy signals on single molecule detection. Our platform enables massively parallel detection of genetic elements, including gene transcripts and molecular barcodes, and can be used to investigate cellular phenotype, gene regulation, and environment in situ. PMID:24578530
Demidenko, Natalia V.; Logacheva, Maria D.; Penin, Aleksey A.
2011-01-01
Quantitative reverse transcription PCR (qRT-PCR) is one of the most precise and widely used methods of gene expression analysis. A necessary prerequisite for exact and reliable data is the careful choice of reference genes. We studied the expression stability of potential reference genes in common buckwheat (Fagopyrum esculentum) in order to find the optimal reference for gene expression analysis in this economically important crop. The recently sequenced buckwheat floral transcriptome was used as the source of sequence information. Expression stability of eight candidate reference genes was assessed in different plant structures (leaves and inflorescences at two stages of development and fruits). These genes are the orthologs of Arabidopsis genes identified as stable in a genome-wide survey of gene expression stability and a traditionally used housekeeping gene, GAPDH. Three software applications (geNorm, NormFinder and BestKeeper) were used to estimate expression stability and provided congruent results. The orthologs of AT4G33380 (expressed protein of unknown function, Expressed1), AT2G28390 (SAND family protein, SAND) and AT5G46630 (clathrin adapter complex subunit family protein, CACS) were the most stable. We recommend using the combination of Expressed1, SAND and CACS for the normalization of gene expression data in studies on buckwheat using qRT-PCR. These genes are listed among the five most stably expressed in Arabidopsis, which emphasizes the utility of studies on model plants as a framework for other species. PMID:21589908
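The stability ranking described above can be illustrated with a small sketch in the spirit of the geNorm M value: for each candidate gene, the standard deviation of its log2 expression ratio against every other candidate is computed across samples, and these values are averaged. This is not the geNorm software itself; the function name, gene labels and expression numbers below are hypothetical and are not taken from the buckwheat study.

```python
import math
from statistics import mean, stdev


def stability_m_values(expr):
    """Return a geNorm-style stability measure M for each gene.

    expr maps a gene name to a list of relative expression quantities,
    one per sample, with all genes using the same sample order.
    A lower M means the gene varies less relative to the other candidates.
    """
    genes = list(expr)
    m_values = {}
    for gene_j in genes:
        pairwise_sds = []
        for gene_k in genes:
            if gene_k == gene_j:
                continue
            # log2 ratio of the two genes in each sample
            log_ratios = [
                math.log2(a / b) for a, b in zip(expr[gene_j], expr[gene_k])
            ]
            # pairwise variation: spread of that ratio across samples
            pairwise_sds.append(stdev(log_ratios))
        m_values[gene_j] = mean(pairwise_sds)
    return m_values


# Hypothetical data: A and B are perfectly co-regulated, C fluctuates.
example = {
    "A": [1.0, 2.0, 4.0],
    "B": [2.0, 4.0, 8.0],
    "C": [1.0, 4.0, 2.0],
}
m = stability_m_values(example)
ranking = sorted(m, key=m.get)  # most stable candidates first
```

Sorting the genes by ascending M gives a stability ranking; in this toy data the fluctuating gene C ranks last.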
The contribution of 700,000 ORF sequence tags to the definition of the human transcriptome
Camargo, Anamaria A.; Samaia, Helena P. B.; Dias-Neto, Emmanuel; Simão, Daniel F.; Migotto, Italo A.; Briones, Marcelo R. S.; Costa, Fernando F.; Aparecida Nagai, Maria; Verjovski-Almeida, Sergio; Zago, Marco A.; Andrade, Luis Eduardo C.; Carrer, Helaine; El-Dorry, Hamza F. A.; Espreafico, Enilza M.; Habr-Gama, Angelita; Giannella-Neto, Daniel; Goldman, Gustavo H.; Gruber, Arthur; Hackel, Christine; Kimura, Edna T.; Maciel, Rui M. B.; Marie, Suely K. N.; Martins, Elizabeth A. L.; Nóbrega, Marina P.; Paçó-Larson, Maria Luisa; Pardini, Maria Inês M. C.; Pereira, Gonçalo G.; Pesquero, João Bosco; Rodrigues, Vanderlei; Rogatto, Silvia R.; da Silva, Ismael D. C. G.; Sogayar, Mari C.; Sonati, Maria de Fátima; Tajara, Eloiza H.; Valentini, Sandro R.; Alberto, Fernando L.; Amaral, Maria Elisabete J.; Aneas, Ivy; Arnaldi, Liliane A. T.; de Assis, Angela M.; Bengtson, Mário Henrique; Bergamo, Nadia Aparecida; Bombonato, Vanessa; de Camargo, Maria E. R.; Canevari, Renata A.; Carraro, Dirce M.; Cerutti, Janete M.; Corrêa, Maria Lucia C.; Corrêa, Rosana F. R.; Costa, Maria Cristina R.; Curcio, Cyntia; Hokama, Paula O. M.; Ferreira, Ari J. S.; Furuzawa, Gilberto K.; Gushiken, Tsieko; Ho, Paulo L.; Kimura, Elza; Krieger, José E.; Leite, Luciana C. C.; Majumder, Paromita; Marins, Mozart; Marques, Everaldo R.; Melo, Analy S. A.; Melo, Monica; Mestriner, Carlos Alberto; Miracca, Elisabete C.; Miranda, Daniela C.; Nascimento, Ana Lucia T. O.; Nóbrega, Francisco G.; Ojopi, Élida P. B.; Pandolfi, José Rodrigo C.; Pessoa, Luciana G.; Prevedel, Aline C.; Rahal, Paula; Rainho, Claudia A.; Reis, Eduardo M. 
R.; Ribeiro, Marcelo L.; da Rós, Nancy; de Sá, Renata G.; Sales, Magaly M.; Sant'anna, Simone Cristina; dos Santos, Mariana L.; da Silva, Aline M.; da Silva, Neusa P.; Silva, Wilson A.; da Silveira, Rosana A.; Sousa, Josane F.; Stecconi, Daniella; Tsukumo, Fernando; Valente, Valéria; Soares, Fernando; Moreira, Eloisa S.; Nunes, Diana N.; Correa, Ricardo G.; Zalcberg, Heloisa; Carvalho, Alex F.; Reis, Luis F. L.; Brentani, Ricardo R.; Simpson, Andrew J. G.; de Souza, Sandro J.
2001-01-01
Open reading frame expressed sequences tags (ORESTES) differ from conventional ESTs by providing sequence data from the central protein coding portion of transcripts. We generated a total of 696,745 ORESTES sequences from 24 human tissues and used a subset of the data that correspond to a set of 15,095 full-length mRNAs as a means of assessing the efficiency of the strategy and its potential contribution to the definition of the human transcriptome. We estimate that ORESTES sampled over 80% of all highly and moderately expressed, and between 40% and 50% of rarely expressed, human genes. In our most thoroughly sequenced tissue, the breast, the 130,000 ORESTES generated are derived from transcripts from an estimated 70% of all genes expressed in that tissue, with an equally efficient representation of both highly and poorly expressed genes. In this respect, we find that the capacity of the ORESTES strategy both for gene discovery and shotgun transcript sequence generation significantly exceeds that of conventional ESTs. The distribution of ORESTES is such that many human transcripts are now represented by a scaffold of partial sequences distributed along the length of each gene product. The experimental joining of the scaffold components, by reverse transcription–PCR, represents a direct route to transcript finishing that may represent a useful alternative to full-length cDNA cloning. PMID:11593022
Shang, Feng; Ding, Bi-Yue; Xiong, Ying; Dou, Wei; Wei, Dong; Jiang, Hong-Bo; Wei, Dan-Dan; Wang, Jin-Jun
2016-01-01
Winged and wingless morphs in insects represent a trade-off between dispersal ability and reproduction. We studied key genes associated with apterous and alate morphs in Toxoptera citricida (Kirkaldy) using RNAseq, digital gene expression (DGE) profiling, and RNA interference. The de novo assembly of the transcriptome was obtained through Illumina short-read sequencing technology. A total of 44,199 unigenes were generated and 27,640 were annotated. The transcriptomic differences between alate and apterous adults indicated that 279 unigenes were highly expressed in alate adults, whereas 5,470 were expressed at low levels. Expression patterns of the top 10 highly expressed genes in alate adults agreed with wing bud development trends. Silencing of the lipid synthesis and degradation gene (3-ketoacyl-CoA thiolase, mitochondrial-like) and glycogen genes (Phosphoenolpyruvate carboxykinase [GTP]-like and Glycogen phosphorylase-like isoform 2) resulted in underdeveloped wings. This suggests that both lipid and glycogen metabolism provide energy for aphid wing development. The large number of sequences and expression data produced from the transcriptome and DGE sequencing, respectively, increases our understanding of wing development mechanisms. PMID:27577531
Barvkar, Vitthal T; Pardeshi, Varsha C; Kale, Sandip M; Kadoo, Narendra Y; Gupta, Vidya S
2012-05-08
The glycosylation process, catalyzed by ubiquitous glycosyltransferase (GT) family enzymes, is a prevalent modification of plant secondary metabolites that regulates various functions such as hormone homeostasis, detoxification of xenobiotics and biosynthesis and storage of secondary metabolites. Flax (Linum usitatissimum L.) is a commercially grown oilseed crop, important because of its essential fatty acids and health promoting lignans. Identification and characterization of UDP glycosyltransferase (UGT) genes from flax could provide valuable basic information about this important gene family and help to explain the seed specific glycosylated metabolite accumulation and other processes in plants. Plant genome sequencing projects are useful to discover complexity within this gene family and also pave the way for the development of functional genomics approaches. Taking advantage of the newly assembled draft genome sequence of flax, we identified 137 UDP glycosyltransferase (UGT) genes from flax using a conserved signature motif. Phylogenetic analysis of these protein sequences clustered them into 14 major groups (A-N). Expression patterns of these genes were investigated using publicly available expressed sequence tag (EST), microarray data and reverse transcription quantitative real time PCR (RT-qPCR). Seventy-three per cent of these genes (100 out of 137) showed expression evidence in 15 tissues examined and indicated varied expression profiles. The RT-qPCR results of 10 selected genes were also coherent with the digital expression analysis. Interestingly, five duplicated UGT genes were identified, which showed differential expression in various tissues. Of the seven intron loss/gain positions detected, two intron positions were conserved among most of the UGTs, although a clear evolutionary relationship among these genes could not be established. 
Comparison of the flax UGTs with orthologs from four other sequenced dicot genomes indicated that seven UGTs were flax diverged. Flax has a large number of UGT genes, including a few flax-diverged ones. Phylogenetic analysis and expression profiles of these genes identified a tissue- and condition-specific repertoire of UGT genes from this crop. This study would facilitate precise selection of candidate genes and further characterization of their substrate specificities and in planta functions. PMID:22568875
2011-01-01
Background Parasitoid insects manipulate their hosts' physiology by injecting various factors into their host upon parasitization. Transcriptomic approaches provide a powerful means to study insect host-parasitoid interactions at the molecular level. In order to investigate the effects of parasitization by an ichneumonid wasp (Diadegma semiclausum) on the host (Plutella xylostella), the larval transcriptome profile was analyzed using a short-read deep sequencing method (Illumina). Symbiotic polydnaviruses (PDVs) associated with ichneumonid parasitoids, known as ichnoviruses, play significant roles in host immune suppression and developmental regulation. In the current study, D. semiclausum ichnovirus (DsIV) genes expressed in P. xylostella were identified and their sequences compared with other reported PDVs. Five of these genes encode proteins of unknown identity that have not previously been reported. Results De novo assembly of cDNA sequence data generated 172,660 contigs between 100 and 10,000 bp in length, with 35% longer than 200 bp. Parasitization had significant impacts on expression levels of 928 identified insect host transcripts. Gene ontology data illustrated that the majority of the differentially expressed genes are involved in binding, catalytic activity, and metabolic and cellular processes. In addition, the results show that transcription levels of antimicrobial peptides, such as gloverin, cecropin E and lysozyme, were up-regulated after parasitism. Expression of ichnovirus genes was detected in parasitized larvae, with 19 unique sequences identified from five PDV gene families including vankyrin, viral innexin, repeat elements, a cysteine-rich motif, and polar residue rich protein. Vankyrin 1 and repeat element 1 genes showed the highest transcription levels among the DsIV genes. Conclusion This study provides detailed information on differential expression of P. xylostella larval genes following parasitization and on DsIV genes expressed in the host, and improves our current understanding of this host-parasitoid interaction. PMID:21906285
Characterization of the Structural Gene Promoter of Aedes aegypti Densovirus
Ward, Todd W.; Kimmick, Michael W.; Afanasiev, Boris N.; Carlson, Jonathan O.
2001-01-01
Aedes aegypti densonucleosis virus (AeDNV) has two promoters that have been shown to be active by reporter gene expression analysis (B. N. Afanasiev, Y. V. Koslov, J. O. Carlson, and B. J. Beaty, Exp. Parasitol. 79:322–339, 1994). Northern blot analysis of cells infected with AeDNV revealed two transcripts, 1,200 and 3,500 nucleotides in length, that are assumed to express the structural protein (VP) gene and nonstructural protein genes, respectively. Primer extension was used to map the transcriptional start site of the structural protein gene. Surprisingly, the structural protein gene transcript began at an initiator consensus sequence, CAGT, 60 nucleotides upstream from the map unit 61 TATAA sequence previously thought to define the promoter. Constructs with the β-galactosidase gene fused to the structural protein gene were used to determine elements necessary for promoter function. Deletion or mutation of the initiator sequence, CAGT, reduced protein expression by 93%, whereas mutation of the TATAA sequence at map unit 61 had little effect. An additional open reading frame was observed upstream of the structural protein gene that can express β-galactosidase at a low level (20% of that of VP fusions). Expression of the AeDNV structural protein gene was shown to be stimulated by the major nonstructural protein NS1 (Afanasiev et al., Exp. Parasitol., 1994). To determine the sequences required for transactivation, expression of structural protein gene–β-galactosidase gene fusion constructs differing in AeDNV genome content was measured with and without NS1. The presence of NS1 led to an 8- to 10-fold increase in expression when either genomic end was present, compared to a 2-fold increase with a construct lacking the genomic ends. An even higher (37-fold) increase in expression occurred with both genomic ends present; however, this was in part due to template replication as shown by Southern blot analysis. 
These data indicate the location and importance of various elements necessary for efficient protein expression and transactivation from the structural protein gene promoter of AeDNV. PMID:11152505
Nayidu, Naghabushana K.; Kagale, Sateesh; Taheri, Ali; Withana-Gamage, Thushan S.; Parkin, Isobel A. P.; Sharpe, Andrew G.; Gruber, Margaret Y.
2014-01-01
Coding sequences for major trichome regulatory genes, including the positive regulators GLABRA 1 (GL1), GLABRA 2 (GL2), ENHANCER OF GLABRA 3 (EGL3), and TRANSPARENT TESTA GLABRA 1 (TTG1) and the negative regulator TRIPTYCHON (TRY), were cloned from wild Brassica villosa, which is characterized by dense trichome coverage over most of the plant. Transcript (FPKM) levels from RNA sequencing indicated much higher expression of the GL2 and TTG1 regulatory genes in B. villosa leaves compared with expression levels of GL1 and EGL3 genes in either B. villosa or the reference genome species, glabrous B. oleracea; however, cotyledon TTG1 expression was high in both species. RNA sequencing and Q-PCR also revealed an unusual expression pattern for the negative regulators TRY and CPC, which were much more highly expressed in trichome-rich B. villosa leaves than in glabrous B. oleracea leaves and in glabrous cotyledons from both species. The B. villosa TRY expression pattern also contrasted with TRY expression patterns in two diploid Brassica species, and with the Arabidopsis model for expression of negative regulators of trichome development. Further unique sequence polymorphisms, protein characteristics, and gene evolution studies highlighted specific amino acids in GL1 and GL2 coding sequences that distinguished glabrous species from hairy species and several variants that were specific for each B. villosa gene. Positive selection was observed for GL1 between hairy and non-hairy plants, and as expected the origin of the four expressed positive trichome regulatory genes in B. villosa was predicted to be from B. oleracea. In particular, the unpredicted expression patterns for TRY and CPC in B. villosa suggest that additional characterization is needed to determine the function of the expanded families of trichome regulatory genes in more complex polyploid species within the Brassicaceae. PMID:24755905
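The FPKM unit used for the transcript levels above normalizes a fragment count by transcript length and by library size. A minimal sketch of the standard definition follows; the function name and the numbers in the example are illustrative assumptions, not values from the Brassica study.

```python
def fpkm(fragments_on_gene, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM = fragments * 1e9 / (gene length in bp * total mapped fragments)
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    return fragments_on_gene * 1e9 / (gene_length_bp * total_mapped_fragments)


# Hypothetical gene: 500 fragments on a 2,000 bp transcript
# in a library of 10 million mapped fragments.
example_value = fpkm(500, 2_000, 10_000_000)
```

Because the value is scaled by both length and depth, a long gene and a short gene with the same FPKM have different raw counts, which is why FPKM rather than raw counts is compared between genes.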
Senescence responsive transcriptional element
Campisi, Judith; Testori, Alessandro
1999-01-01
Recombinant polynucleotides have expression control sequences that have a senescence responsive element and a minimal promoter, and which are operatively linked to a heterologous nucleotide sequence. The molecules are useful for achieving high levels of expression of genes in senescent cells. Methods of inhibiting expression of genes in senescent cells also are provided.
Grace, Christy R.; Ferreira, Antonio M.; Waddell, M. Brett; Ridout, Granger; Naeve, Deanna; Leuze, Michael; LoCascio, Philip F.; Panetta, John C.; Wilkinson, Mark R.; Pui, Ching-Hon; Naeve, Clayton W.; Uberbacher, Edward C.; Bonten, Erik J.; Evans, William E.
2016-01-01
MicroRNAs are important regulators of gene expression, acting primarily by binding to sequence-specific locations on already transcribed messenger RNAs (mRNA) and typically down-regulating their stability or translation. Recent studies indicate that microRNAs may also play a role in up-regulating mRNA transcription levels, although a definitive mechanism has not been established. Double-helical DNA is capable of forming triple-helical structures through Hoogsteen and reverse Hoogsteen interactions in the major groove of the duplex, and we show physical evidence (i.e., NMR, FRET, SPR) that purine or pyrimidine-rich microRNAs of appropriate length and sequence form triple-helical structures with purine-rich sequences of duplex DNA, and identify microRNA sequences that favor triplex formation. We developed an algorithm (Trident) to search genome-wide for potential triplex-forming sites and show that several mammalian and non-mammalian genomes are enriched for strong microRNA triplex binding sites. We show that those genes containing sequences favoring microRNA triplex formation are markedly enriched (3.3-fold, p < 2.2 × 10⁻¹⁶) for genes whose expression is positively correlated with expression of microRNAs targeting triplex binding sequences. This work has thus revealed a new mechanism by which microRNAs could interact with gene promoter regions to modify gene transcription. PMID:26844769
van Zyl, Leonel; von Arnold, Sara; Bozhkov, Peter; Chen, Yongzhong; Egertsdotter, Ulrika; MacKay, John; Sederoff, Ronald R.; Shen, Jing; Zelena, Lyubov
2002-01-01
Hybridization of labelled cDNA from various cell types with high-density arrays of expressed sequence tags is a powerful technique for investigating gene expression. Few conifer cDNA libraries have been sequenced. Because of the high level of sequence conservation between Pinus and Picea we have investigated the use of arrays from one genus for studies of gene expression in the other. The partial cDNAs from 384 identifiable genes expressed in differentiating xylem of Pinus taeda were printed on nylon membranes in randomized replicates. These were hybridized with labelled cDNA from needles or embryogenic cultures of Pinus taeda, P. sylvestris and Picea abies, and with labelled cDNA from leaves of Nicotiana tabacum. The Spearman correlation of gene expression for pairs of conifer species was high for needles (r² = 0.78–0.86), and somewhat lower for embryogenic cultures (r² = 0.68–0.83). The correlation of gene expression for tobacco leaves and needles of each of the three conifer species was lower but sufficiently high (r² = 0.52–0.63) to suggest that many partial gene sequences are conserved in angiosperms and gymnosperms. Heterologous probing was further used to identify tissue-specific gene expression over species boundaries. To evaluate the significance of differences in gene expression, conventional parametric tests were compared with permutation tests after four methods of normalization. Permutation tests after Z-normalization provide the highest degree of discrimination but may enhance the probability of type I errors. It is concluded that arrays of cDNA from loblolly pine are useful for studies of gene expression in other pines or spruces. PMID:18629264
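The statistics named above (Spearman rank correlation, Z-normalization and permutation testing) can be sketched with the Python standard library. This is a generic illustration under stated assumptions, not the authors' analysis pipeline: the function names are hypothetical, Z-normalization here uses the sample standard deviation, and the permutation test compares group means with a two-sided statistic.

```python
import random
from statistics import mean, stdev


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def z_normalize(values):
    """Centre on the mean and scale by the sample standard deviation."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]


def permutation_p(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled signals
        if abs(mean(pooled[:n_a]) - mean(pooled[n_a:])) >= observed:
            hits += 1
    # add-one correction keeps the estimate away from exactly zero
    return (hits + 1) / (n_perm + 1)
```

A permutation test makes no normality assumption about the hybridization signals, which is one reason it is often compared against parametric tests on small replicate sets.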
Alkio, Merianne; Jonas, Uwe; Declercq, Myriam; Van Nocker, Steven; Knoche, Moritz
2014-01-01
The exocarp, or skin, of fleshy fruit is a specialized tissue that protects the fruit, attracts seed dispersing fruit eaters, and has great economic relevance for fruit quality. Development of the exocarp involves regulated activities of many genes. This research analyzed global gene expression in the exocarp of developing sweet cherry (Prunus avium L., ‘Regina’), a fruit crop species with few public genomic resources. A catalog of transcript models (contigs) representing expressed genes was constructed from de novo assembled short complementary DNA (cDNA) sequences generated from developing fruit between flowering and maturity at 14 time points. Expression levels in each sample were estimated for 34 695 contigs from numbers of reads mapping to each contig. Contigs were annotated functionally based on BLAST, gene ontology and InterProScan analyses. Coregulated genes were detected using partitional clustering of expression patterns. The results are discussed with emphasis on genes putatively involved in cuticle deposition, cell wall metabolism and sugar transport. The high temporal resolution of the expression patterns presented here reveals finely tuned developmental specialization of individual members of gene families. Moreover, the de novo assembled sweet cherry fruit transcriptome with 7760 full-length protein coding sequences and over 20 000 other annotated cDNA sequences, together with their developmental expression patterns, is expected to accelerate molecular research on this important tree fruit crop. PMID:26504533
Majeske, Audrey J; Oren, Matan; Sacchi, Sandro; Smith, L Courtney
2014-12-01
Immune systems in animals rely on fast and efficient responses to a wide variety of pathogens. The Sp185/333 gene family in the purple sea urchin, Strongylocentrotus purpuratus, consists of an estimated 50 (±10) members per genome that share a basic gene structure but show high sequence diversity, primarily due to the mosaic appearance of short blocks of sequence called elements. The genes show significantly elevated expression in three subpopulations of phagocytes responding to marine bacteria. The encoded Sp185/333 proteins are highly diverse and have central effector functions in the immune system. In this study we report the Sp185/333 gene expression in single sea urchin phagocytes. Challenging sea urchins with heat-killed marine bacteria resulted in a typical increase in coelomocyte concentration within 24 h, which included an increased proportion of phagocytes expressing Sp185/333 proteins. Phagocyte fractions enriched from coelomocytes were used in limiting dilutions to obtain samples of single cells that were evaluated for Sp185/333 gene expression by nested RT-PCR. Single phagocytes yielded identical or nearly identical Sp185/333 amplicon sequences, with matches to six known Sp185/333 element patterns, including both common and rare element patterns. This suggested that single phagocytes show restricted expression from the Sp185/333 gene family and implies a diverse, flexible, and efficient response to pathogens. This type of expression pattern from a family of immune response genes in single cells has not been identified previously in other invertebrates. Copyright © 2014 by The American Association of Immunologists, Inc.
Terrados, Gloria; Finkernagel, Florian; Stielow, Bastian; Sadic, Dennis; Neubert, Juliane; Herdt, Olga; Krause, Michael; Scharfe, Maren; Jarek, Michael; Suske, Guntram
2012-01-01
The transcription factor Sp2 is essential for early mouse development and for proliferation of mouse embryonic fibroblasts in culture. Yet its mechanisms of action and its target genes are largely unknown. In this study, we have combined RNA interference, in vitro DNA binding, chromatin immunoprecipitation sequencing and global gene-expression profiling to investigate the role of Sp2 in cellular functions, to define target sites and to identify genes regulated by Sp2. We show that Sp2 is important for cellular proliferation, that it binds to GC-boxes and that it occupies proximal promoters of genes essential for vital cellular processes including gene expression, replication, metabolism and signalling. Moreover, we identified key target genes and cellular pathways that are directly regulated by Sp2. Most significantly, Sp2 binds and activates numerous sequence-specific transcription factor and co-activator genes, and represses the whole battery of cholesterol synthesis genes. Our results establish Sp2 as a sequence-specific regulator of vitally important genes. PMID:22684502
Impact of sequencing depth and read length on single cell RNA sequencing data of T cells.
Rizzetto, Simone; Eltahla, Auda A; Lin, Peijie; Bull, Rowena; Lloyd, Andrew R; Ho, Joshua W K; Venturi, Vanessa; Luciani, Fabio
2017-10-06
Single cell RNA sequencing (scRNA-seq) provides great potential in measuring the gene expression profiles of heterogeneous cell populations. In immunology, scRNA-seq allowed the characterisation of transcript sequence diversity of functionally relevant T cell subsets, and the identification of the full length T cell receptor (TCRαβ), which defines the specificity against cognate antigens. Several factors, e.g. RNA library capture, cell quality, and sequencing output, affect the quality of scRNA-seq data. We studied the effects of read length and sequencing depth on the quality of gene expression profiles, cell type identification, and TCRαβ reconstruction, utilising 1,305 single cells from 8 publicly available scRNA-seq datasets, and simulation-based analyses. Gene expression was characterised by an increased number of unique genes identified with short read lengths (<50 bp), but these featured higher technical variability compared to profiles from longer reads. Successful TCRαβ reconstruction was achieved for 6 datasets (81%–100%) with at least 0.25 million paired-end (PE) reads of length >50 bp, while it failed for datasets with <30 bp reads. Sufficient read length and sequencing depth can control technical noise to enable accurate identification of TCRαβ and gene expression profiles from scRNA-seq data of T cells.
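The relationship between sequencing depth and the number of genes detected can be explored by random subsampling of reads. The sketch below is a generic illustration and not the simulation used in the study: the function name, the representation of a cell as a list of per-read gene labels, and the toy data are all assumptions.

```python
import random


def genes_detected(read_gene_ids, fraction, seed=0):
    """Count distinct genes seen after keeping a random fraction of reads.

    read_gene_ids holds one gene label per mapped read for a single cell;
    fraction is the share of reads retained (0.0 to 1.0).
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)
    n_keep = round(len(read_gene_ids) * fraction)
    kept = rng.sample(read_gene_ids, n_keep)  # sampling without replacement
    return len(set(kept))


# Hypothetical cell: one abundant, one moderate and one rare transcript.
toy_reads = ["geneA"] * 90 + ["geneB"] * 9 + ["geneC"]
detected_by_depth = {f: genes_detected(toy_reads, f) for f in (0.0, 0.1, 0.5, 1.0)}
```

Rarely expressed genes are the first to drop out as the retained fraction shrinks, which is the kind of depth-dependent loss that downsampling experiments are meant to quantify.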
Regulation of Bacteria-Induced Intercellular Adhesion Molecule-1 by CCAAT/Enhancer Binding Proteins
Manzel, Lori J.; Chin, Cecilia L.; Behlke, Mark A.; Look, Dwight C.
2009-01-01
Direct interaction between bacteria and epithelial cells may initiate or amplify the airway response through induction of epithelial defense gene expression by nuclear factor-κB (NF-κB). However, multiple signaling pathways modify NF-κB effects to modulate gene expression. In this study, the effects of CCAAT/enhancer binding protein (C/EBP) family members on induction of the leukocyte adhesion glycoprotein intercellular adhesion molecule-1 (ICAM-1) were examined in primary cultures of human tracheobronchial epithelial cells incubated with nontypeable Haemophilus influenzae. Increased ICAM-1 gene transcription in response to H. influenzae required gene sequences located at −200 to −135 in the 5′-flanking region that contain a C/EBP-binding sequence immediately upstream of the NF-κB enhancer site. Constitutive C/EBPβ was found to have an important role in epithelial cell ICAM-1 regulation, while the adjacent NF-κB sequence binds the RelA/p65 and NF-κB1/p50 members of the NF-κB family to induce ICAM-1 expression in response to H. influenzae. The expression of C/EBP proteins is not regulated by p38 mitogen-activated protein kinase activation, but p38 affects gene transcription by increasing the binding of TATA-binding protein to TATA-box–containing gene sequences. Epithelial cell ICAM-1 expression in response to H. influenzae was decreased by expressing dominant-negative protein or RNA interference against C/EBPβ, confirming its role in ICAM-1 regulation. Although airway epithelial cells express multiple constitutive and inducible C/EBP family members that bind C/EBP sequences, the results indicate that C/EBPβ plays a central role in modulation of NF-κB–dependent defense gene expression in human airway epithelial cells after exposure to H. influenzae. PMID:18703796
Gao, Ya; Wang, Shu; Fu, Mingjia; Zhong, Guolin
2013-09-04
We aimed to determine whether blue light induces expression of the S-adenosyl-L-homocysteine hydrolase-like (sahhl) gene in the fungus Mucor amphibiorum RCS1. A 555 bp sequence was obtained from M. amphibiorum RCS1 in the course of random PCR. The 555 bp sequence was labeled with digoxin to prepare the probe for northern hybridization. Transcription of the sahhl gene was analyzed by northern hybridization in M. amphibiorum RCS1 mycelia cultured through a darkness, blue light, darkness sequence. Real-time PCR was used in parallel to analyze sahhl gene expression. Comparison with sahh gene sequences from Homo sapiens, Mus musculus and some fungal species confirmed high homology of the 555 bp sequence. This provides preliminary support that the 555 bp sequence is the sahhl gene of M. amphibiorum RCS1. After 24 h of dark pre-culture, large amounts of sahhl transcript were detected in the mycelia by northern hybridization and real-time PCR following 24 h of blue light. After 48 h of dark pre-culture, however, large amounts of sahhl transcript were not detected, even though the M. amphibiorum RCS1 mycelia were induced by blue light. Blue light can induce expression of the sahhl gene in vigorously growing M. amphibiorum RCS1 mycelia.
[Isolation and function of genes regulating aphB expression in Vibrio cholerae].
Chen, Haili; Zhu, Zhaoqin; Zhong, Zengtao; Zhu, Jun; Kan, Biao
2012-02-04
We identified genes that regulate the expression of aphB, the gene encoding a key virulence regulator in Vibrio cholerae O1 El Tor C6706(-). We constructed a transposon library in a V. cholerae C6706 strain containing P(aphB)-luxCDABE and P(aphB)-lacZ transcriptional reporter plasmids. Using a chemiluminescence imager system, we rapidly detected aphB promoter expression levels at a large scale. We then identified the transposon insertion sites by arbitrary PCR and sequencing analysis. From approximately 40,000 transposon insertion mutants, we obtained two candidate mutants, T1 and T2, which displayed reduced aphB expression. Sequencing analysis showed that the transposon was inserted in the vc1585 reading frame in the T1 mutant and at the end of the coding sequence of vc1602 in the T2 mutant. By using a genetic screen, we identified two potential genes that may be involved in regulating the expression of the key virulence regulator AphB. This study provides a basis for further investigation of V. cholerae virulence gene regulatory cascades.
Improving RNA-Seq expression estimation by modeling isoform- and exon-specific read sequencing rate.
Liu, Xuejun; Shi, Xinxin; Chen, Chunlin; Zhang, Li
2015-10-16
The high-throughput sequencing technology, RNA-Seq, has been widely used to quantify gene and isoform expression in the study of transcriptome in recent years. Accurate expression measurement from the millions or billions of short generated reads is obstructed by two difficulties. One is ambiguous mapping of reads to the reference transcriptome caused by alternative splicing. This increases the uncertainty in estimating isoform expression. The other is non-uniformity of read distribution along the reference transcriptome due to positional, sequencing, mappability and other undiscovered sources of biases. This violates the uniform assumption of read distribution for many expression calculation approaches, such as the direct RPKM calculation and Poisson-based models. Many methods have been proposed to address these difficulties. Some approaches employ latent variable models to discover the underlying pattern of read sequencing. However, most of these methods make bias correction based on surrounding sequence contents and share the bias models across all genes. They therefore cannot estimate gene- and isoform-specific biases as revealed by recent studies. We propose a latent variable model, NLDMseq, to estimate gene and isoform expression. Our method adopts latent variables to model the unknown isoforms, from which reads originate, and the underlying percentage of multiple spliced variants. The isoform- and exon-specific read sequencing biases are modeled to account for the non-uniformity of read distribution, and are identified by utilizing the replicate information of multiple lanes of a single library run. We employ simulation and real data to verify the performance of our method in terms of accuracy in the calculation of gene and isoform expression. Results show that NLDMseq obtains competitive gene and isoform expression compared to popular alternatives.
Finally, the proposed method is applied to the detection of differential expression (DE) to show its usefulness in the downstream analysis. The proposed NLDMseq method provides an approach to accurately estimate gene and isoform expression from RNA-Seq data by modeling the isoform- and exon-specific read sequencing biases. It makes use of a latent variable model to discover the hidden pattern of read sequencing. We have shown that it works well in both simulations and real datasets, and has competitive performance compared to popular methods. The method has been implemented as a freely available software which can be found at https://github.com/PUGEA/NLDMseq.
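For orientation, the "direct RPKM calculation" that the abstract above contrasts with bias-aware models can be sketched as follows. This is only a minimal illustration of the standard RPKM formula (reads per kilobase of transcript per million mapped reads); it is not the NLDMseq model, and the function name and arguments are hypothetical. It assumes reads are spread uniformly along the transcript, which is exactly the assumption the abstract says is violated in practice.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    Hypothetical helper for illustration only; assumes uniform read
    coverage along the transcript and no isoform ambiguity.
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    # count / (length in kb) / (library size in millions)
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


if __name__ == "__main__":
    # 500 reads on a 2 kb transcript in a library of 10 million mapped reads
    print(rpkm(500, 2000, 10_000_000))
```

Because the formula divides only by length and library size, any position- or isoform-specific bias in where reads fall passes straight through into the estimate.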
ERIC Educational Resources Information Center
Almeida, Craig A.; Tardiff, Daniel F.; De Luca, Jane P.
2004-01-01
We have developed an introductory bioinformatics exercise for sophomore biology and biochemistry students that reinforces the understanding of the structure of a gene and the principles and events involved in its expression. In addition, the activity illustrates the severe effect mutations in a gene sequence can have on the protein product.…
Design and construction of 2A peptide-linked multicistronic vectors.
Szymczak-Workman, Andrea L; Vignali, Kate M; Vignali, Dario A A
2012-02-01
The need for reliable, multicistronic vectors for multigene delivery is at the forefront of biomedical technology. This article describes the design and construction of 2A peptide-linked multicistronic vectors, which can be used to express multiple proteins from a single open reading frame (ORF). The small 2A peptide sequences, when cloned between genes, allow for efficient, stoichiometric production of discrete protein products within a single vector through a novel "cleavage" event within the 2A peptide sequence. Expression of more than two genes using conventional approaches has several limitations, most notably imbalanced protein expression and large size. The use of 2A peptide sequences alleviates these concerns. They are small (18-22 amino acids) and have divergent amino-terminal sequences, which minimizes the chance for homologous recombination and allows for multiple, different 2A peptide sequences to be used within a single vector. Importantly, separation of genes placed between 2A peptide sequences is nearly 100%, which allows for stoichiometric and concordant expression of the genes, regardless of the order of placement within the vector.
VH gene expression and regulation in the mutant Alicia rabbit. Rescue of VHa2 allotype expression.
Chen, H T; Alexander, C B; Young-Cooper, G O; Mage, R G
1993-04-01
Rabbits of the Alicia strain, derived from rabbits expressing the VHa2 allotype, have a mutation in the H chain locus that has a cis effect upon the expression of VHa2 and VHa- genes. A small deletion at the most J-proximal (3') end of the VH locus leads to low expression of all the genes on the entire chromosome in heterozygous ali mutants and altered relative expression of VH genes in homozygotes. To study VH gene expression and regulation, we used the polymerase chain reaction to amplify the VH genes expressed in spleens of young and adult wild-type and mutant Alicia rabbits. The cDNA from reverse transcription of splenic mRNA was amplified and polymerase chain reaction libraries were constructed and screened with oligonucleotides from framework regions 1 and 3, as well as JH. Thirty-three VH-positive clones were sequenced and analyzed. We found that in mutant Alicia rabbits, products of the first functional VH gene (VH4a2), (or VH4a2-like genes) were expressed in 2- to 8-wk-olds. Expression of both the VHx and VHy types of VHa- genes was also elevated but the relative proportions of VHx and VHy, especially VHx, decreased whereas the relative levels of expression of VH4a2 or VH4a2-like genes increased with age. Our results suggest that the appearance of sequences resembling that of the VH1a2, which is deleted in the mutant ali rabbits, could be caused by alterations of the sequences of the rearranged VH4a2 genes by gene conversions and/or rearrangement of upstream VH1a2-like genes later in development.
Grove, J R; Deutsch, P J; Price, D J; Habener, J F; Avruch, J
1989-11-25
Plasmids that encode a bioactive amino-terminal fragment of the heat-stable inhibitor of the cAMP-dependent protein kinase, PKI(1-31), were employed to characterize the role of this protein kinase in the control of transcriptional activity mediated by three DNA regulatory elements in the JEG-3 human placental cell line. The 5'-flanking sequence of the human collagenase gene contains the heptameric sequence, 5'-TGAGTCA-3', previously identified as a "phorbol ester" response element. Reporter genes containing either the intact 1.2-kilobase 5'-flanking sequence from the human collagenase gene or just the 7-base pair (bp) response element, when coupled to an enhancerless promoter, each exhibit both cAMP and phorbol ester-stimulated expression in JEG-3 cells. Cotransfection of either construct with plasmids encoding PKI(1-31) inhibits cAMP-stimulated but not basal- or phorbol ester-stimulated expression. Pretreatment of cells with phorbol ester for 1 or 2 days abrogates completely the response to rechallenge with phorbol ester but does not alter the basal expression of either construct; cAMP-stimulated expression, while modestly inhibited, remains vigorous. The 5'-flanking sequence of the human chorionic gonadotropin-alpha subunit (HCG alpha) gene has two copies of the sequence, 5'-TGACGTCA-3', contained in directly adjacent identical 18-bp segments, previously identified as a cAMP-response element. Reporter genes containing either the intact 1.5 kilobase of 5'-flanking sequence from the HCG alpha gene, or just the 36-bp tandem repeat cAMP response element, when coupled to an enhancerless promoter, both exhibit a vigorous cAMP stimulation of expression but no response to phorbol ester in JEG-3 cells. Cotransfection with plasmids encoding PKI(1-31) inhibits both basal and cAMP-stimulated expression in a parallel fashion. The 5'-flanking sequence of the human enkephalin gene mediates cAMP-stimulated expression of reporter genes in both JEG-3 and CV-1 cells. 
Plasmids encoding PKI(1-31) inhibit the expression that is stimulated by the addition of cAMP analogs in both cell lines; basal expression, however, is inhibited by PKI(1-31) only in the JEG-3 cell line and not in the CV-1 cells. These observations indicate that, in JEG-3 cells, PKI(1-31) is a specific inhibitor of kinase A-mediated gene transcription, but it does not modify kinase C-directed transcription.(ABSTRACT TRUNCATED AT 400 WORDS)
Lin, Yaqiu; Zhu, Jiangjiang; Wang, Yong; Li, Qian; Lin, Sen
2017-01-01
Intramuscular fat (IMF) content and fatty acid composition of longissimus dorsi muscle (LM) change with growth, which partially determines the flavor and nutritional value of goat (Capra hircus) meat. However, unlike cattle, little information is available on the transcriptome-wide changes during different postnatal stages in small ruminants, especially goats. In this study, the sequencing reads of goat LM tissues collected from kid, youth, and adult period were mapped to the goat genome. Results showed that out of total 24 689 Unigenes, 20 435 Unigenes were annotated. Based on expected number of fragments per kilobase of transcript sequence per million base pairs sequenced (FPKM), 111 annotated differentially expressed genes (DEGs) were identified among different postnatal stages, which were subsequently assigned to 16 possible expression patterns by series-cluster analysis. Functional classification by Gene Ontology (GO) analysis was used for selecting the genes showing highest expression related to lipid metabolism. Finally, we identified the node genes for lipid metabolism regulation using co-expression analysis. In conclusion, these data may uncover candidate genes having functional roles in regulation of goat muscle development and lipid metabolism during the various growth stages in goats. PMID:28800357
NASA Astrophysics Data System (ADS)
Kikuchi, Shoshi
2009-02-01
Completion of the high-precision genome sequence analysis of rice led to the collection of about 35,000 full-length cDNA clones and the determination of their complete sequences. Mapping of these full-length cDNA sequences has given us information on (1) the number of genes expressed in the rice genome; (2) the start and end positions and exon-intron structures of rice genes; (3) alternative transcripts; (4) possible encoded proteins; (5) non-protein-coding (np) RNAs; (6) the density of gene localization on the chromosome; (7) setting the parameters of gene prediction programs; and (8) the construction of a microarray system that monitors global gene expression. Manual curation for rice gene annotation by using mapping information on full-length cDNA and EST assemblies has revealed about 32,000 expressed genes in the rice genome. Analysis of major gene families, such as those encoding membrane transport proteins (pumps, ion channels, and secondary transporters), along with the evolution from bacteria to higher animals and plants, reveals how gene numbers have increased through adaptation to circumstances. Family-based gene annotation also gives us a new way of comparing organisms. Massive amounts of data on gene expression under many kinds of physiological conditions are being accumulated in rice oligoarrays (22K and 44K) based on full-length cDNA sequences. Cluster analyses of genes that have the same promoter cis-elements, that have similar expression profiles, or that encode enzymes in the same metabolic pathways or signal transduction cascades give us clues to understanding the networks of gene expression in rice. As a tool for that purpose, we recently developed "RiCES", a tool for searching for cis-elements in the promoter regions of clustered genes.
Yang, Fengxi; Zhu, Genfa
2015-01-01
Cymbidium ensifolium belongs to the genus Cymbidium of the orchid family. Owing to its spectacular flower morphology, C. ensifolium has considerable ecological and cultural value. However, limited genetic data are available for this non-model plant, and the molecular mechanism underlying floral organ identity is still poorly understood. In this study, we characterize the floral transcriptome of C. ensifolium and present, for the first time, extensive sequence and transcript abundance data of individual floral organs. After sequencing, over 10 Gb clean sequence data were generated and assembled into 111,892 unigenes with an average length of 932.03 base pairs, including 1,227 clusters and 110,665 singletons. Assembled sequences were annotated with gene descriptions, gene ontology, clusters of orthologous group terms, the Kyoto Encyclopedia of Genes and Genomes, and the plant transcription factor database. From these annotations, 131 flowering-associated unigenes, 61 CONSTANS-LIKE (COL) unigenes and 90 floral homeotic genes were identified. In addition, four digital gene expression libraries were constructed for the sepal, petal, labellum and gynostemium, and 1,058 genes corresponding to individual floral organ development were identified. Among them, eight MADS-box genes were further investigated by full-length cDNA sequence analysis and expression validation, which revealed two APETALA1/AGL9-like MADS-box genes preferentially expressed in the sepal and petal, two AGAMOUS-like genes particularly restricted to the gynostemium, and four DEF-like genes distinctively expressed in different floral organs. The spatial expression of these genes varied distinctly among floral mutants with different floral morphogenesis, which validated their specialized roles in floral patterning and further supported the effectiveness of our in silico analysis.
This dataset generated in our study provides new insights into the molecular mechanisms underlying floral patterning of Cymbidium and supports a valuable resource for molecular breeding of the orchid plant. PMID:26580566
2009-01-01
Background Sequence identification of ESTs from non-model species offers distinct challenges particularly when these species have duplicated genomes and when they are phylogenetically distant from sequenced model organisms. For the common carp, an environmental model of aquacultural interest, large numbers of ESTs remained unidentified using BLAST sequence alignment. We have used the expression profiles from large-scale microarray experiments to suggest gene identities. Results Expression profiles from ~700 cDNA microarrays describing responses of 7 major tissues to multiple environmental stressors were used to define a co-expression landscape. This was based on the Pearson correlation coefficient relating each gene with all other genes, from which a network description provided clusters of highly correlated genes as 'mountains'. We show that these contain genes with known identities and genes with unknown identities, and that the correlation constitutes evidence of identity in the latter. This procedure has suggested identities for 522 of 2701 unknown carp EST sequences. We also discriminate several common carp genes and gene isoforms that were not discriminated by BLAST sequence alignment alone. Precision in identification was substantially improved by use of data from multiple tissues and treatments. Conclusion The detailed analysis of co-expression landscapes is a sensitive technique for suggesting an identity for the large number of BLAST unidentified cDNAs generated in EST projects. It is capable of detecting even subtle changes in expression profiles, and thereby of distinguishing genes with a common BLAST identity into different identities. It benefits from the use of multiple treatments or contrasts, and from the large-scale microarray data. PMID:19939286
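The correlation step described in the abstract above can be illustrated with a small sketch. It computes the Pearson correlation coefficient between expression profiles and lists the genes whose profiles track an unidentified EST above a threshold. The gene names, profile values, the 0.9 threshold and the helper names are all hypothetical, and this is not the authors' network or 'mountain' clustering procedure, only the pairwise correlation it is built on.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


def coexpressed_partners(profiles, gene, threshold=0.9):
    """Names of genes whose profile correlates with `gene` at or above threshold."""
    target = profiles[gene]
    return sorted(
        other
        for other, profile in profiles.items()
        if other != gene and pearson(target, profile) >= threshold
    )


if __name__ == "__main__":
    # Hypothetical expression values across four conditions
    profiles = {
        "unknown_est": [1.0, 2.0, 3.0, 4.0],
        "known_a": [2.0, 4.0, 6.0, 8.0],
        "known_b": [4.0, 3.0, 2.0, 1.0],
    }
    print(coexpressed_partners(profiles, "unknown_est"))
```

With more arrays per gene, as in the roughly 700 microarrays of the study, a high correlation is less likely to arise by chance, which is consistent with the abstract's remark that multiple tissues and treatments improved precision.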
O'Rourke, Jamie A; Fu, Fengli; Bucciarelli, Bruna; Yang, S Sam; Samac, Deborah A; Lamb, JoAnn F S; Monteros, Maria J; Graham, Michelle A; Gronwald, John W; Krom, Nick; Li, Jun; Dai, Xinbin; Zhao, Patrick X; Vance, Carroll P
2015-07-07
Alfalfa (Medicago sativa L.) is the primary forage legume crop species in the United States and plays essential economic and ecological roles in agricultural systems across the country. Modern alfalfa is the result of hybridization between tetraploid M. sativa ssp. sativa and M. sativa ssp. falcata. Due to its large and complex genome, there are few genomic resources available for alfalfa improvement. A de novo transcriptome assembly from two alfalfa subspecies, M. sativa ssp. sativa (B47) and M. sativa ssp. falcata (F56), was developed using Illumina RNA-seq technology. Transcripts from roots, nitrogen-fixing root nodules, leaves, flowers, elongating stem internodes, and post-elongation stem internodes were assembled into the Medicago sativa Gene Index 1.2 (MSGI 1.2) representing 112,626 unique transcript sequences. Nodule-specific transcripts and transcripts involved in cell wall biosynthesis were identified. Statistical analyses identified 20,447 transcripts differentially expressed between the two subspecies. Pair-wise comparisons of each tissue combination identified 58,932 sequences differentially expressed in B47 and 69,143 sequences differentially expressed in F56. Comparing transcript abundance in floral tissues of B47 and F56 identified expression differences in sequences involved in anthocyanin and carotenoid synthesis, which determine flower pigmentation. Single nucleotide polymorphisms (SNPs) unique to each M. sativa subspecies (110,241) were identified. The Medicago sativa Gene Index 1.2 increases the expressed sequence data available for alfalfa by ninefold and can be expanded as additional experiments are performed. The MSGI 1.2 transcriptome sequences, annotations, expression profiles, and SNPs were assembled into the Alfalfa Gene Index and Expression Database (AGED) at http://plantgrn.noble.org/AGED/ , a publicly available genomic resource for alfalfa improvement and legume research.
Mismer, D.; Rubin, G. M.
1989-01-01
We have analyzed the cis-acting regulatory sequences of the Rh1 (ninaE) gene in Drosophila melanogaster by P-element-mediated germline transformation of indicator genes transcribed from mutant ninaE promoter sequences. We have previously shown that a 200-bp region extending from -120 to +67 relative to the transcription start site is sufficient to obtain eye-specific expression from the ninaE promoter. In the present study, 22 different 4-13-bp sequences in the -120/+67 promoter region were altered by oligonucleotide-directed mutagenesis. Several of these sequences were found to be required for proper promoter function; two of these are conserved in the promoter of the homologous gene isolated from the related species Drosophila virilis. Alteration of a conserved 9-bp sequence results in aberrant, low level expression in the body. Alteration of a separate 11-bp sequence, found in the promoter regions of several photoreceptor-specific genes of Drosophila, results in an approximately 15-fold reduction in promoter efficiency but without apparent alteration of tissue-specificity. A protein factor capable of interacting with this 11-bp sequence has been detected by DNaseI footprinting in embryonic nuclear extracts. Finally, we have further characterized two separable enhancer sequences previously shown to be required for normal levels of expression from this promoter. PMID:2521839
Tavares, D; Tully, K; Dobner, P R
1999-10-15
The promoter region of the mouse high affinity neurotensin receptor (Ntr-1) gene was characterized, and sequences required for expression in neuroblastoma cell lines that express high affinity NT-binding sites were characterized. Me(2)SO-induced neuronal differentiation of N1E-115 neuroblastoma cells increased both the expression of the endogenous Ntr-1 gene and reporter genes driven by NTR-1 promoter sequences by 3-4-fold. Deletion analysis revealed that an 83-base pair promoter region containing the transcriptional start site is required for Me(2)SO activation. Detailed mutational analysis of this region revealed that a CACCC box and the central region of a large GC-rich palindrome are the crucial cis-regulatory elements required for Me(2)SO induction. The CACCC box is bound by at least one factor that is induced upon Me(2)SO treatment of N1E-115 cells. The Me(2)SO effect was found to be both selective and cell type-restricted. Basal expression in the neuroblastoma cell lines required a distinct set of sequences, including an Sp1-like sequence, and a sequence resembling an NGFI-A-binding site; however, a more distal 5' sequence was found to repress basal activity in N1E-115 cells. These results provide evidence that Ntr-1 gene regulation involves both positive and negative regulatory elements located in the 5'-flanking region and that Ntr-1 gene activation involves the coordinate activation or induction of several factors, including a CACCC box binding complex.
Gene expression profiling of flax (Linum usitatissimum L.) under edaphic stress.
Dmitriev, Alexey A; Kudryavtseva, Anna V; Krasnov, George S; Koroban, Nadezhda V; Speranskaya, Anna S; Krinitsina, Anastasia A; Belenikin, Maxim S; Snezhkina, Anastasiya V; Sadritdinova, Asiya F; Kishlyan, Natalya V; Rozhmina, Tatiana A; Yurkevich, Olga Yu; Muravenko, Olga V; Bolsheva, Nadezhda L; Melnikova, Nataliya V
2016-11-16
Cultivated flax (Linum usitatissimum L.) is widely used for production of textile, food, chemical and pharmaceutical products. However, various stresses decrease flax production. A search for genes involved in the stress response is necessary for breeding adaptive cultivars. An imbalanced concentration of nutrient elements in soil decreases flax yields and also results in heritable changes in some flax lines. The appearance of Linum Insertion Sequence 1 (LIS-1) is the most studied modification. However, LIS-1 function is still unclear. High-throughput sequencing of the transcriptome of flax plants grown under normal (N), phosphate deficient (P), and nutrient excess (NPK) conditions was carried out using the Illumina platform. The transcriptome was assembled, and a total of 34924, 33797, and 33698 unique transcripts were identified for the N, P, and NPK sequencing libraries, respectively. We did not detect any LIS-1 derived mRNA in our sequencing data. The analysis of high-throughput sequencing data allowed us to identify genes with potentially differential expression under imbalanced nutrition. For further investigation with qPCR, 15 genes were chosen and their expression levels were evaluated in an extended sample of 31 flax plants. Significant expression alterations were revealed for genes encoding WRKY and JAZ protein families under P and NPK conditions. Moreover, the alterations of WRKY family genes differed depending on LIS-1 presence in the flax plant genome. In addition, we revealed slight and LIS-1 independent mRNA level changes of the KRP2 and ING1 genes, which are adjacent to LIS-1, under nutrition stress. Differentially expressed genes were identified in flax plants grown under phosphate deficiency and excess nutrition on the basis of high-throughput sequencing and qPCR data. We showed that WRKY and JAZ gene families participate in the flax response to imbalanced nutrient content in soil.
In addition, we did not identify any mRNA that could be derived from LIS-1 in our transcriptome sequencing data. Expression of the LIS-1 flanking genes, ING1 and KRP2, was suggested not to be nutrient stress-induced. The results provide new insights into the edaphic stress response in flax and the role of LIS-1 in this process.
Hecht, Jochen; Kuhl, Heiner; Haas, Stefan A; Bauer, Sebastian; Poustka, Albert J; Lienau, Jasmin; Schell, Hanna; Stiege, Asita C; Seitz, Volkhard; Reinhardt, Richard; Duda, Georg N; Mundlos, Stefan; Robinson, Peter N
2006-07-05
The sheep is an important model animal for testing novel fracture treatments and other medical applications. Despite these medical uses and the well known economic and cultural importance of the sheep, relatively little research has been performed into sheep genetics, and DNA sequences are available for only a small number of sheep genes. In this work we have sequenced over 47 thousand expressed sequence tags (ESTs) from libraries developed from healing bone in a sheep model of fracture healing. These ESTs were clustered with the previously available 10 thousand sheep ESTs to a total of 19087 contigs with an average length of 603 nucleotides. We used the newly identified sequences to develop RT-PCR assays for 78 sheep genes and measured differential expression during the course of fracture healing between days 7 and 42 postfracture. All genes showed significant shifts at one or more time points. 23 of the genes were differentially expressed between postfracture days 7 and 10, which could reflect an important role for these genes for the initiation of osteogenesis. The sequences we have identified in this work are a valuable resource for future studies on musculoskeletal healing and regeneration using sheep and represent an important head-start for genomic sequencing projects for Ovis aries, with partial or complete sequences being made available for over 5,800 previously unsequenced sheep genes.
Vidal, Ramon Oliveira; Mondego, Jorge Maurício Costa; Pot, David; Ambrósio, Alinne Batista; Andrade, Alan Carvalho; Pereira, Luiz Filipe Protasio; Colombo, Carlos Augusto; Vieira, Luiz Gonzaga Esteves; Carazzolle, Marcelo Falsarella; Pereira, Gonçalo Amarante Guimarães
2010-01-01
Polyploidization constitutes a common mode of evolution in flowering plants. This event provides the raw material for the divergence of function in homeologous genes, leading to phenotypic novelty that can contribute to the success of polyploids in nature or their selection for use in agriculture. Mounting evidence underlined the existence of homeologous expression biases in polyploid genomes; however, strategies to analyze such transcriptome regulation remained scarce. Important factors regarding homeologous expression biases remain to be explored, such as whether this phenomenon influences specific genes, how paralogs are affected by genome doubling, and what is the importance of the variability of homeologous expression bias to genotype differences. This study reports the expressed sequence tag assembly of the allopolyploid Coffea arabica and one of its direct ancestors, Coffea canephora. The assembly was used for the discovery of single nucleotide polymorphisms through the identification of high-quality discrepancies in overlapped expressed sequence tags and for gene expression information indirectly estimated by the transcript redundancy. Sequence diversity profiles were evaluated within C. arabica (Ca) and C. canephora (Cc) and used to deduce the transcript contribution of the Coffea eugenioides (Ce) ancestor. The assignment of the C. arabica haplotypes to the C. canephora (CaCc) or C. eugenioides (CaCe) ancestral genomes allowed us to analyze gene expression contributions of each subgenome in C. arabica. In silico data were validated by the quantitative polymerase chain reaction and allele-specific combination TaqMAMA-based method. The presence of differential expression of C. arabica homeologous genes and its implications in coffee gene expression, ontology, and physiology are discussed. PMID:20864545
2013-01-01
Background Salamanders are unique among vertebrates in their ability to completely regenerate amputated limbs through the mediation of blastema cells located at the stump ends. This regeneration is nerve-dependent because blastema formation and regeneration do not occur after limb denervation. To obtain the genomic information of blastema tissues, de novo transcriptomes from both blastema tissues and denervated stump ends of Ambystoma mexicanum (axolotls) 14 days post-amputation were sequenced and compared using Solexa DNA sequencing. Results The sequencing done for this study produced 40,688,892 reads that were assembled into 307,345 transcribed sequences. The N50 of transcribed sequence length was 562 bases. A similarity search with known proteins identified 39,200 different genes to be expressed during limb regeneration with a cut-off E-value exceeding 10⁻⁵. We annotated assembled sequences by using gene descriptions, gene ontology, and clusters of orthologous group terms. Targeted searches using these annotations showed that the majority of the genes were in the categories of essential metabolic pathways, transcription factors and conserved signaling pathways, and novel candidate genes for regenerative processes. We discovered and confirmed numerous sequences of the candidate genes by using quantitative polymerase chain reaction and in situ hybridization. Conclusion The results of this study demonstrate that de novo transcriptome sequencing allows gene expression analysis in a species lacking genome information and provides the most comprehensive mRNA sequence resources for axolotls. The characterization of the axolotl transcriptome can help elucidate the molecular mechanisms underlying blastema formation during limb regeneration. PMID:23815514
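The N50 reported above is a standard assembly summary statistic: the length L such that sequences of length L or longer account for at least half of all assembled bases. A minimal sketch of the calculation follows; the function name `n50` and the example lengths are illustrative assumptions and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total number of bases.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)   # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Illustrative lengths only (not data from the axolotl assembly).
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(example_lengths))
```

Because the statistic is weighted by bases rather than by sequence count, a few long transcripts can raise the N50 even when most assembled sequences are short.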
Pan, Lei; Liu, Yan; Wei, Qiang; Xiao, Chenwen; Ji, Quanan; Bao, Guolian; Wu, Xinsheng
2015-01-01
Background Fur is an important genetically determined characteristic of domestic rabbits; rabbit furs are of great economic value. We used the Solexa sequencing technology to assess gene expression in skin tissues from full-sib Rex rabbits of different phenotypes in order to explore the molecular mechanisms associated with fur determination. Methodology/Principal Findings Transcriptome analysis included de novo assembly, gene function identification, and gene function classification and enrichment. We obtained 74,032,912 and 71,126,891 short reads of 100 nt, which were assembled into 377,618 unique sequences using the Trinity strategy (N50 = 680 nt). Based on BLAST results with known proteins, 50,228 sequences were identified at a cut-off E-value ≥ 10⁻⁵. Using Blast to Gene Ontology (GO), Clusters of Orthologous Groups (KOG) and Kyoto Encyclopedia of Genes and Genomes (KEGG), we obtained several genes with important protein functions. A total of 308 differentially expressed genes were obtained by transcriptome analysis of plaice and un-plaice phenotype animals; 209 additional differentially expressed genes were not found in any database. These genes included 49 that were only expressed in plaice skin rabbits. The novel genes may play important roles during skin growth and development. In addition, 99 known differentially expressed genes were assigned to PI3K-Akt signaling, focal adhesion, and ECM-receptor interaction, among others. Growth factors play a role in skin growth and development by regulating these signaling pathways. We confirmed the altered expression levels of seven target genes by qRT-PCR. We also chose a key gene for SNP analysis to find differences between plaice and un-plaice phenotype rabbits. Conclusions/Significance The rabbit transcriptome profiling data provide new insights in understanding the molecular mechanisms underlying rabbit skin growth and development. PMID:25955442
Global Gene Expression Patterns and Somatic Mutations in Sporadic Intracranial Aneurysms.
Li, Zhili; Tan, Haibin; Shi, Yi; Huang, Guangfu; Wang, Zhenyu; Liu, Ling; Yin, Cheng; Wang, Qi
2017-04-01
High-throughput sequencing technologies can expand our understanding of the pathologic basis of intracranial aneurysms (IAs). Our study aimed to decipher the gene expression signature and genetic factors associated with IAs. We determined the gene expression levels of 3 cases of IAs by RNA sequencing. Bioinformatics analysis was conducted to identify the differentially expressed genes (DEGs) and uncover their biological function. In addition, whole genome sequencing was performed on an additional 6 cases of IAs to detect potential somatic alterations in DEGs. Compared with normal arterial tissue, 1709 genes were differentially expressed in IA arterial tissue. The most significantly up-regulated and down-regulated genes, H19 and HIST1H3J, may be essential for tumorigenesis of IAs. The hub protein IKBKG in the protein-protein interaction network was probably involved in the inflammation process in aneurysms. Another 2 hub proteins, ACTB and MKI67IP, as well as up-regulated genes, might be abnormally activated in aneurysms and involved in the pathogenesis of IAs. Further whole genome sequencing and filtering yielded 4 candidate somatic single nucleotide variants, including variants in MUC3B and BLM, which may be involved in the pathogenesis of IAs. Even so, our results do not support the hypothesis that somatic mutations occurred in the DEGs. Two-dimensional genomic data from transcriptome and whole genome sequencing indicated that no somatic mutations occurred in DEGs. In addition, 3 DEGs (IKBKG, ACTB, and MKI67IP) and 2 mutant genes (MUC3B and BLM) were essential in IAs. Copyright © 2017 Elsevier Inc. All rights reserved.
Itzhaki, H; Maxson, J M; Woodson, W R
1994-09-13
The increased production of ethylene during carnation petal senescence regulates the transcription of the GST1 gene encoding a subunit of glutathione-S-transferase. We have investigated the molecular basis for this ethylene-responsive transcription by examining the cis elements and trans-acting factors involved in the expression of the GST1 gene. Transient expression assays following delivery of GST1 5' flanking DNA fused to a beta-glucuronidase reporter gene were used to functionally define sequences responsible for ethylene-responsive expression. Deletion analysis of the 5' flanking sequences of GST1 identified a single positive regulatory element of 197 bp between -667 and -470 necessary for ethylene-responsive expression. The sequences within this ethylene-responsive region were further localized to 126 bp between -596 and -470. The ethylene-responsive element (ERE) within this region conferred ethylene-regulated expression upon a minimal cauliflower mosaic virus-35S TATA-box promoter in an orientation-independent manner. Gel electrophoresis mobility-shift assays and DNase I footprinting were used to identify proteins that bind to sequences within the ERE. Nuclear proteins from carnation petals were shown to specifically interact with the 126-bp ERE and the presence and binding of these proteins were independent of ethylene or petal senescence. DNase I footprinting defined DNA sequences between -510 and -488 within the ERE specifically protected by bound protein. An 8-bp sequence (ATTTCAAA) within the protected region shares significant homology with promoter sequences required for ethylene responsiveness from the tomato fruit-ripening E4 gene.
Gonzalez, S M; Ferland, L H; Robert, B; Abdelhay, E
1998-06-01
Vertebrate Msx genes are related to one of the most divergent homeobox genes of Drosophila, the muscle segment homeobox (msh) gene, and are expressed in a well-defined pattern at sites of tissue interactions. This pattern of expression is conserved in vertebrates as diverse as quail, zebrafish, and mouse in a range of sites including neural crest, appendages, and craniofacial structures. In the present work, we performed structural and functional analyses in order to identify potential cis-acting elements that may be regulating Msx1 gene expression. To this end, a 4.9-kb segment of the 5'-flanking region was sequenced and analyzed for transcription-factor binding sites. Four regions showing a high concentration of these sites were identified. Transfection assays with fragments of regulatory sequences driving the expression of the bacterial lacZ reporter gene showed that a region of 4 kb upstream of the transcription start site contains positive and negative elements responsible for controlling gene expression. Interestingly, a fragment of 130 bp seems to contain the minimal elements necessary for gene expression, as its removal completely abolishes gene expression in cultured cells. These results are reinforced by comparison of this region with the human Msx1 gene promoter, which shows extensive conservation, including many consensus binding sites, suggesting a regulatory role for them.
Rombel, I T; McMorran, B J; Lamont, I L
1995-02-20
Many bacteria respond to a lack of iron in the environment by synthesizing siderophores, which act as iron-scavenging compounds. Fluorescent pseudomonads synthesize strain-specific but chemically related siderophores called pyoverdines or pseudobactins. We have investigated the mechanisms by which iron controls expression of genes involved in pyoverdine metabolism in Pseudomonas aeruginosa. Transcription of these genes is repressed by the presence of iron in the growth medium. Three promoters from these genes were cloned and the activities of the promoters were dependent on the amounts of iron in the growth media. Two of the promoters were sequenced and the transcriptional start sites were identified by S1 nuclease analysis. Sequences similar to the consensus binding site for the Fur repressor protein, which controls expression of iron-repressible genes in several gram-negative species, were not present in the promoters, suggesting that they are unlikely to have a high affinity for Fur. However, comparison of the promoter sequences with those of iron-regulated genes from other Pseudomonas species and also the iron-regulated exotoxin gene of P. aeruginosa allowed identification of a shared sequence element, with the consensus sequence (G/C)CTAAAT-CCC, which is likely to act as a binding site for a transcriptional activator protein. Mutations in this sequence greatly reduced the activities of the promoters characterized here as well as those of other iron-regulated promoters. The requirement for this motif in the promoters of iron-regulated genes of different Pseudomonas species indicates that similar mechanisms are likely to be involved in controlling expression of a range of iron-regulated genes in pseudomonads.
Qi, Xiao-Hua; Xu, Xue-Wen; Lin, Xiao-Jian; Zhang, Wen-Jie; Chen, Xue-Hao
2012-03-01
High-throughput tag-sequencing (Tag-seq) analysis based on the Solexa Genome Analyzer platform was applied to analyze the gene expression profiling of cucumber plant at 5 time points over a 24h period of waterlogging treatment. Approximately 5.8 million total clean sequence tags per library were obtained with 143013 distinct clean tag sequences. Approximately 23.69%-29.61% of the distinct clean tags were mapped unambiguously to the unigene database, and 53.78%-60.66% of the distinct clean tags were mapped to the cucumber genome database. Analysis of the differentially expressed genes revealed that most of the genes were down-regulated in the waterlogging stages, and the differentially expressed genes mainly linked to carbon metabolism, photosynthesis, reactive oxygen species generation/scavenging, and hormone synthesis/signaling. Finally, quantitative real-time polymerase chain reaction using nine genes independently verified the tag-mapped results. This present study reveals the comprehensive mechanisms of waterlogging-responsive transcription in cucumber. Copyright © 2011 Elsevier Inc. All rights reserved.
[Construction of the superantigen SEA transfected laryngocarcinoma cells].
Ji, Xiaobin; Jingli, J V; Liu, Qicai; Xie, Jinghua
2013-04-01
To construct a eukaryotic expression vector containing the superantigen staphylococcal enterotoxin A (SEA) gene, and to identify its expression in laryngeal squamous carcinoma cells. The full-length SEA gene fragment was obtained from the genome of Staphylococcus strain ATCC13565, a reference standard strain producing SEA. The coding sequence of SEA was artificially synthesized. Then, the SEA gene fragment was subcloned into the eukaryotic expression vector pIRES-EGFP. The recombinant plasmid pSEA-IRES-EGFP was constructed and transfected into laryngocarcinoma Hep-2 cells. Resistant clones were screened with G418. The expression of SEA in laryngocarcinoma cells was identified by ELISA and RT-PCR. The artificially synthesized SEA gene was subcloned into the eukaryotic expression vector pIRES-EGFP. Sequencing confirmed that the SEA sequence was fully identical to the coding sequence of the standard Staphylococcus strain ATCC13565 in GenBank. After the recombinant plasmid was transfected into laryngocarcinoma cells, resistant clones were obtained after screening for two weeks, and the clones were selected. The specific gene fragment was obtained by RT-PCR amplification. ELISA confirmed that the content of SEA protein in the cell culture supernatant reached approximately the pg level. The recombinant eukaryotic expression vector containing the superantigen SEA gene was successfully constructed and is capable of effective expression and continued secretion of SEA protein in laryngocarcinoma Hep-2 cells after transfection.
Kogelman, Lisette J A; Zhernakova, Daria V; Westra, Harm-Jan; Cirera, Susanna; Fredholm, Merete; Franke, Lude; Kadarmideen, Haja N
2015-10-20
Obesity is a multi-factorial health problem in which genetic factors play an important role. Limited results have been obtained in single-gene studies using either genomic or transcriptomic data. RNA sequencing technology has shown its potential in gaining accurate knowledge about the transcriptome, and may reveal novel genes affecting complex diseases. Integration of genomic and transcriptomic variation (expression quantitative trait loci [eQTL] mapping) has identified causal variants that affect complex diseases. We integrated transcriptomic data from adipose tissue and genomic data from a porcine model to investigate the mechanisms involved in obesity using a systems genetics approach. Using a selective gene expression profiling approach, we selected 36 animals based on a previously created genomic Obesity Index for RNA sequencing of subcutaneous adipose tissue. Differential expression analysis was performed using the Obesity Index as a continuous variable in a linear model. eQTL mapping was then performed to integrate 60 K porcine SNP chip data with the RNA sequencing data. Results were restricted based on genome-wide significant single nucleotide polymorphisms, detected differentially expressed genes, and previously detected co-expressed gene modules. Further data integration was performed by detecting co-expression patterns among eQTLs and integration with protein data. Differential expression analysis of RNA sequencing data revealed 458 differentially expressed genes. The eQTL mapping resulted in 987 cis-eQTLs and 73 trans-eQTLs (false discovery rate < 0.05), of which the cis-eQTLs were associated with metabolic pathways. We reduced the eQTL search space by focusing on differentially expressed and co-expressed genes and disease-associated single nucleotide polymorphisms to detect obesity-related genes and pathways. Building a co-expression network using eQTLs resulted in the detection of a module strongly associated with lipid pathways. 
Furthermore, we detected several obesity candidate genes, for example, ENPP1, CTSL, and ABHD12B. To our knowledge, this is the first study to perform an integrated genomics and transcriptomics (eQTL) study using, and modeling, genomic and subcutaneous adipose tissue RNA sequencing data on obesity in a porcine model. We detected several pathways and potential causal genes for obesity. Further validation and investigation may reveal their exact function and association with obesity.
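The eQTL results above are reported at a false discovery rate below 0.05. The abstract does not say which FDR procedure was used; as an assumption, the sketch below shows the widely used Benjamini-Hochberg adjustment. The function name `benjamini_hochberg` and the example p-values are illustrative only.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used here for illustration only; the study's actual
    FDR method is not stated in the abstract.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for four hypothetical SNP-gene pairs.
example = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(example)
significant = [i for i, q in enumerate(adjusted) if q < 0.05]
print(adjusted, significant)
```

A SNP-gene pair would be called an eQTL under this sketch when its adjusted value falls below the chosen threshold.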
Transcriptome Assembly, Gene Annotation and Tissue Gene Expression Atlas of the Rainbow Trout
Salem, Mohamed; Paneru, Bam; Al-Tobasei, Rafet; Abdouni, Fatima; Thorgaard, Gary H.; Rexroad, Caird E.; Yao, Jianbo
2015-01-01
Efforts to obtain a comprehensive genome sequence for rainbow trout are ongoing and will be complemented by transcriptome information that will enhance genome assembly and annotation. Previously, transcriptome reference sequences were reported using data from different sources. Although the previous work added a great wealth of sequences, a complete and well-annotated transcriptome is still needed. In addition, gene expression in different tissues was not completely addressed in the previous studies. In this study, non-normalized cDNA libraries were sequenced from 13 different tissues of a single doubled haploid rainbow trout from the same source used for the rainbow trout genome sequence. A total of ~1.167 billion paired-end reads were de novo assembled using the Trinity RNA-Seq assembler yielding 474,524 contigs > 500 base-pairs. Of them, 287,593 had homologies to the NCBI non-redundant protein database. The longest contig of each cluster was selected as a reference, yielding 44,990 representative contigs. A total of 4,146 contigs (9.2%), including 710 full-length sequences, did not match any mRNA sequences in the current rainbow trout genome reference. Mapping reads to the reference genome identified an additional 11,843 transcripts not annotated in the genome. A digital gene expression atlas revealed 7,678 housekeeping and 4,021 tissue-specific genes. Expression of about 16,000–32,000 genes (35–71% of the identified genes) accounted for basic and specialized functions of each tissue. White muscle and stomach had the least complex transcriptomes, with high percentages of their total mRNA contributed by a small number of genes. Brain, testis and intestine, in contrast, had complex transcriptomes, with large numbers of genes involved in their expression patterns. This study provides comprehensive de novo transcriptome information that is suitable for functional and comparative genomics studies in rainbow trout, including annotation of the genome. PMID:25793877
Agarwala, Prachi; Pandey, Satyaprakash; Mapa, Koyeli; Maiti, Souvik
2013-03-05
Transforming growth factor β2 (TGFβ2) is a versatile cytokine with a prominent role in cell migration, invasion, cellular development, and immunomodulation. TGFβ2 promotes the malignancy of tumors by inducing epithelial-mesenchymal transition, angiogenesis, and immunosuppression. As it is well-documented that nucleic acid secondary structure can regulate gene expression, we assessed whether any secondary motif regulates its expression at the post-transcriptional level. Bioinformatics analysis predicts an existence of a 23-nucleotide putative G-quadruplex sequence (PG4) in the 5' untranslated region (UTR) of TGFβ2 mRNA. The ability of this stretch of sequence to form a highly stable, intramolecular parallel quadruplex was demonstrated using ultraviolet and circular dichroism spectroscopy. Footprinting studies further validated its existence in the presence of a neighboring nucleotide sequence. Following structural characterization, we evaluated the biological relevance of this secondary motif using a dual luciferase assay. Although PG4 inhibits the expression of the reporter gene, its presence in the context of the entire 5' UTR sequence interestingly enhances gene expression. Mutation or removal of the G-quadruplex sequence from the 5' UTR of the gene diminished the level of expression of this gene at the translational level. Thus, here we highlight an activating role of the G-quadruplex in modulating gene expression of TGFβ2 at the translational level and its potential to be used as a target for the development of therapeutics against cancer.
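The abstract above does not name the tool used to predict the putative G-quadruplex. As an assumption, the sketch below uses the commonly cited canonical pattern of four runs of at least three guanines separated by loops of 1 to 7 nucleotides. The function name `find_putative_g4` and the test sequence are invented for illustration and are not the TGFβ2 PG4 sequence.

```python
import re

# Canonical putative-quadruplex pattern (an assumption, not the study's method):
# four G-runs of >= 3 guanines separated by loops of 1-7 nucleotides.
PG4_PATTERN = re.compile(
    r"G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}[ACGU]{1,7}G{3,}"
)


def find_putative_g4(sequence):
    """Return (start, matched_sequence) for non-overlapping putative G4 motifs.

    DNA input is accepted; T is converted to U so the scan runs on RNA.
    """
    rna = sequence.upper().replace("T", "U")
    return [(m.start(), m.group(0)) for m in PG4_PATTERN.finditer(rna)]


# Invented example sequence, used only to exercise the pattern.
print(find_putative_g4("AAGGGAGGGAGGGAGGGAA"))
```

A pattern match only flags a candidate; as the abstract describes, structure formation still has to be confirmed experimentally, for example by circular dichroism spectroscopy.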
Sacramento, C B; Moraes, J Z; Denapolis, P M A; Han, S W
2010-08-01
The main objective of the present study was to find suitable DNA-targeting sequences (DTS) for the construction of plasmid vectors to be used to treat ischemic diseases. The well-known Simian virus 40 nuclear DTS (SV40-DTS) and hypoxia-responsive element (HRE) sequences were used to construct plasmid vectors to express the human vascular endothelial growth factor gene (hVEGF). The rate of plasmid nuclear transport and consequent gene expression under normoxia (20% O2) and hypoxia (less than 5% O2) were determined. Plasmids containing the SV40-DTS or HRE sequences were constructed and used to transfect the A293T cell line (a human embryonic kidney cell line) in vitro and mouse skeletal muscle cells in vivo. Plasmid transport to the nucleus was monitored by real-time PCR, and the expression level of the hVEGF gene was measured by ELISA. The in vitro nuclear transport efficiency of the SV40-DTS plasmid was about 50% lower under hypoxia, while the HRE plasmid was about 50% higher under hypoxia. Quantitation of reporter gene expression in vitro and in vivo, under hypoxia and normoxia, confirmed that the SV40-DTS plasmid functioned better under normoxia, while the HRE plasmid was superior under hypoxia. These results indicate that the efficiency of gene expression by plasmids containing DNA binding sequences is affected by the concentration of oxygen in the medium.
Tetteh, Kevin K. A.; Loukas, Alex; Tripp, Cindy; Maizels, Rick M.
1999-01-01
Larvae of Toxocara canis, a nematode parasite of dogs, infect humans, causing visceral and ocular larva migrans. In noncanid hosts, larvae neither grow nor differentiate but endure in a state of arrested development. Reasoning that parasite protein production is orientated to immune evasion, we undertook a random sequencing project from a larval cDNA library to characterize the most highly expressed transcripts. In all, 266 clones were sequenced, most from both 3′ and 5′ ends, and similarity searches against GenBank protein and dbEST nucleotide databases were conducted. Cluster analyses showed that 128 distinct gene products had been found, all but 3 of which represented newly identified genes. Ninety-five genes were represented by a single clone, but seven transcripts were present at high frequencies, each composing >2% of all clones sequenced. These high-abundance transcripts include a mucin and a C-type lectin, which are both major excretory-secretory antigens released by parasites. Four highly expressed novel gene transcripts, termed ant (abundant novel transcript) genes, were found. Together, these four genes comprised 18% of all cDNA clones isolated, but no similar sequences occur in the Caenorhabditis elegans genome. While the coding regions of the four genes are dissimilar, their 3′ untranslated tracts have significant homology in nucleotide sequence. The discovery of these abundant, parasite-specific genes of newly identified lectins and mucins, as well as a range of conserved and novel proteins, provides defined candidates for future analysis of the molecular basis of immune evasion by T. canis. PMID:10456930
Winterhoff, Boris J; Maile, Makayla; Mitra, Amit Kumar; Sebe, Attila; Bazzaro, Martina; Geller, Melissa A; Abrahante, Juan E; Klein, Molly; Hellweg, Raffaele; Mullany, Sally A; Beckman, Kenneth; Daniel, Jerry; Starr, Timothy K
2017-03-01
The purpose of this study was to determine the level of heterogeneity in high grade serous ovarian cancer (HGSOC) by analyzing RNA expression in single epithelial and cancer associated stromal cells. In addition, we explored the possibility of identifying subgroups based on pathway activation and pre-defined signatures from cancer stem cells and chemo-resistant cells. A fresh HGSOC tumor specimen derived from ovary was enzymatically digested and depleted of immune infiltrating cells. RNA sequencing was performed on 92 single cells and 66 of these single cell datasets passed quality control checks. Sequences were analyzed using multiple bioinformatics tools, including clustering, principal components analysis, and gene set enrichment analysis to identify subgroups and activated pathways. Immunohistochemistry for ovarian cancer, stem cell and stromal markers was performed on adjacent tumor sections. Analysis of the gene expression patterns identified two major subsets of cells characterized by epithelial and stromal gene expression patterns. The epithelial group was characterized by proliferative genes including genes associated with oxidative phosphorylation and MYC activity, while the stromal group was characterized by increased expression of extracellular matrix (ECM) genes and genes associated with epithelial-to-mesenchymal transition (EMT). Neither group expressed a signature correlating with published chemo-resistant gene signatures, but many cells, predominantly in the stromal subgroup, expressed markers associated with cancer stem cells. Single cell sequencing provides a means of identifying subpopulations of cancer cells within a single patient. Single cell sequence analysis may prove to be critical for understanding the etiology, progression and drug resistance in ovarian cancer. Copyright © 2017 Elsevier Inc. All rights reserved.
Sequence analysis of 497 mouse brain ESTs expressed in the substantia nigra
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stewart, G.J.; Savioz, A.; Davies, R.W.
1997-01-15
The use of subtracted, region-specific cDNA libraries combined with single-pass cDNA sequencing allows the discovery of novel genes and facilitates molecular description of the tissue or region involved. We report the sequence of 497 mouse expressed sequence tags (ESTs) from two subtracted libraries enriched for cDNAs expressed in the substantia nigra, a brain region with important roles in movement control and Parkinson disease. Of these, 238 ESTs give no database matches and therefore derive from novel genes. A further 115 ESTs show sequence similarity to ESTs from other organisms, which themselves do not yield any significant database matches to genes of known function. Fifty-six ESTs show sequence similarity to previously identified genes whose mouse homologues have not been reported. The total number of ESTs reported that are new for the mouse is 407, which, together with the 90 ESTs corresponding to known mouse genes or cDNAs, contributes to the molecular description of the substantia nigra. 21 refs., 4 tabs.
Digital gene expression for non-model organisms
Hong, Lewis Z.; Li, Jun; Schmidt-Küntzel, Anne; Warren, Wesley C.; Barsh, Gregory S.
2011-01-01
Next-generation sequencing technologies offer new approaches for global measurements of gene expression but are mostly limited to organisms for which a high-quality assembled reference genome sequence is available. We present a method for gene expression profiling called EDGE, or EcoP15I-tagged Digital Gene Expression, based on ultra-high-throughput sequencing of 27-bp cDNA fragments that uniquely tag the corresponding gene, thereby allowing direct quantification of transcript abundance. We show that EDGE is capable of assaying for expression in >99% of genes in the genome and achieves saturation after 6–8 million reads. EDGE exhibits very little technical noise, reveals a large (10⁶) dynamic range of gene expression, and is particularly suited for quantification of transcript abundance in non-model organisms where a high-quality annotated genome is not available. In a direct comparison with RNA-seq, both methods provide similar assessments of relative transcript abundance, but EDGE does better at detecting gene expression differences for poorly expressed genes and does not exhibit transcript length bias. Applying EDGE to laboratory mice, we show that a loss-of-function mutation in the melanocortin 1 receptor (Mc1r), recognized as a Mendelian determinant of yellow hair color in many different mammals, also causes reduced expression of genes involved in the interferon response. To illustrate the application of EDGE to a non-model organism, we examine skin biopsy samples from a cheetah (Acinonyx jubatus) and identify genes likely to control differences in the color of spotted versus non-spotted regions. PMID:21844123
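The tag-based quantification described above comes down to counting how often each gene-specific 27-bp tag is sequenced. The sketch below illustrates that counting step under simplifying assumptions: reads are taken to start with the tag, and a precomputed tag-to-gene lookup is available. The names `count_tags`, `tags_per_million` and `tag_to_gene` are hypothetical and do not come from the EDGE software.

```python
from collections import Counter

TAG_LENGTH = 27  # tag size stated in the abstract


def count_tags(reads, tag_to_gene):
    """Count reads per gene using the first TAG_LENGTH bases of each read.

    Assumption: each read begins with the tag and tag_to_gene maps every
    known tag to exactly one gene. Unknown or short tags are ignored.
    """
    counts = Counter()
    for read in reads:
        tag = read[:TAG_LENGTH].upper()
        if len(tag) == TAG_LENGTH and tag in tag_to_gene:
            counts[tag_to_gene[tag]] += 1
    return counts


def tags_per_million(counts):
    """Scale raw counts to tags per million mapped tags."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gene: count * 1e6 / total for gene, count in counts.items()}


# Invented tags and gene labels, for illustration only.
tag_a = "A" * 27
tag_b = "ACGT" * 6 + "ACG"
lookup = {tag_a: "geneA", tag_b: "geneB"}
reads = [tag_a, tag_a + "GG", tag_b, "ACGT", "C" * 27]
counts = count_tags(reads, lookup)
print(counts, tags_per_million(counts))
```

Because each transcript contributes one tag regardless of its length, counts of this kind need no length normalization, which is consistent with the abstract's remark that EDGE does not exhibit transcript length bias.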
[Sequences and expression pattern of mce gene in Leptospira interrogans of different serogroups].
Zhang, Lei; Xue, Feng; Yan, Jie; Mao, Ya-fei; Li, Li-wei
2008-11-01
To determine the frequency of mce gene in Leptospira interrogans, and to investigate the gene transcription levels of L. interrogans before and after infecting cells. The segments of entire mce genes from 13 L.interrogans strains and 1 L.biflexa strain were amplified by PCR and then sequenced after T-A cloning. A prokaryotic expression system of mce gene was constructed; the expression and output of the target recombinant protein rMce were examined by SDS-PAGE and Western Blot assay. Rabbits were intradermally immunized with rMce to prepare the antiserum, the titer of antiserum was measured by immunodiffusion test. The transcription levels of mce gene in L.interrogans serogroup Icterohaemorrhagiae serovar lai strain 56601 before and after infecting J774A.1 cells were monitored by real-time fluorescence quantitative RT-PCR. mce gene was carried in all tested L.interrogans strains, but not in L.biflexa serogroup Semaranga serovar patoc strain Patoc I. The similarities of nucleotide and putative amino acid sequences of the cloned mce genes to the reported sequences (GenBank accession No: NP712236) were 99.02%-100% and 97.91%-100%, respectively. The constructed prokaryotic expression system of mce gene expressed rMce and the output of rMce was about 5% of the total bacterial proteins. The antiserum against whole cell of L.interrogans strain 56601 efficiently recognized rMce. After infecting J774A.1 cells, transcription levels of the mce gene in L.interrogans strain 56601 were remarkably up-regulated. The constructed prokaryotic expression system of mce gene and the prepared antiserum against rMce provide useful tools for further study of the gene function.
Quiapim, Andréa C.; Brito, Michael S.; Bernardes, Luciano A.S.; daSilva, Idalete; Malavazi, Iran; DePaoli, Henrique C.; Molfetta-Machado, Jeanne B.; Giuliatti, Silvana; Goldman, Gustavo H.; Goldman, Maria Helena S.
2009-01-01
The success of plant reproduction depends on pollen-pistil interactions occurring at the stigma/style. These interactions vary depending on the stigma type: wet or dry. Tobacco (Nicotiana tabacum) represents a model of wet stigma, and its stigmas/styles express genes to accomplish the appropriate functions. For a large-scale study of gene expression during tobacco pistil development and preparation for pollination, we generated 11,216 high-quality expressed sequence tags (ESTs) from stigmas/styles and created the TOBEST database. These ESTs were assembled in 6,177 clusters, from which 52.1% are pistil transcripts/genes of unknown function. The 21 clusters with the highest number of ESTs (putative higher expression levels) correspond to genes associated with defense mechanisms or pollen-pistil interactions. The database analysis unraveled tobacco sequences homologous to the Arabidopsis (Arabidopsis thaliana) genes involved in specifying pistil identity or determining normal pistil morphology and function. Additionally, 782 independent clusters were examined by macroarray, revealing 46 stigma/style preferentially expressed genes. Real-time reverse transcription-polymerase chain reaction experiments validated the pistil-preferential expression for nine out of 10 genes tested. A search for these 46 genes in the Arabidopsis pistil data sets demonstrated that only 11 sequences, with putative equivalent molecular functions, are expressed in this dry stigma species. The reverse search for the Arabidopsis pistil genes in the TOBEST exposed a partial overlap between these dry and wet stigma transcriptomes. The TOBEST represents the most extensive survey of gene expression in the stigmas/styles of wet stigma plants, and our results indicate that wet and dry stigmas/styles express common as well as distinct genes in preparation for the pollination process. PMID:19052150
NASA Astrophysics Data System (ADS)
Gong, Qianhong; Yu, Wengong; Dai, Jixun; Liu, Hongquan; Xu, Rifu; Guan, Huashi; Pan, Kehou
2007-01-01
Endogenous tubulin promoters have been widely used for expressing foreign genes in green algae, but the efficiency and feasibility of the endogenous tubulin promoter in the economically important Porphyra yezoensis (Rhodophyta) are unknown. In this study, the flanking sequences of the beta-tubulin gene from P. yezoensis were amplified and two transient expression vectors were constructed to determine their ability to promote transcription of the foreign gene gusA. The testing vector pATubGUS was constructed by inserting the 5'- and 3'-flanking regions (Tub5' and Tub3') upstream and downstream, respectively, of the β-glucuronidase (GUS) gene (gusA) in pA, a derivative of the pCAT®3-enhancer vector. The control construct, pAGUSTub3, contains only gusA and Tub3'. These constructs were electroporated into P. yezoensis protoplasts and the GUS activities were quantitatively analyzed by spectrometry. The results demonstrated that the gusA gene was efficiently expressed in P. yezoensis protoplasts under the regulation of the 5'-flanking sequence of the beta-tubulin gene. More interestingly, pATubGUS produced stronger GUS activity in P. yezoensis protoplasts than pBI221, in which the gusA gene is driven by the constitutive CaMV 35S promoter. The data suggest that the combination of P. yezoensis protoplasts and the endogenous beta-tubulin flanking sequences is a potential novel system for foreign gene expression.
Transcriptome sequencing and de novo analysis of the copepod Calanus sinicus using 454 GS FLX.
Ning, Juan; Wang, Minxiao; Li, Chaolun; Sun, Song
2013-01-01
Despite their species abundance and primary economic importance, genomic information about copepods is still limited. In particular, genomic resources are lacking for the copepod Calanus sinicus, which is a dominant species in the coastal waters of East Asia. In this study, we performed de novo transcriptome sequencing to produce a large number of expressed sequence tags for the copepod C. sinicus. Copepodid larvae and adults were used as the basic material for transcriptome sequencing. Using 454 pyrosequencing, a total of 1,470,799 reads were obtained, which were assembled into 56,809 high-quality expressed sequence tags. Based on their sequence similarity to known proteins, about 14,000 different genes were identified, including members of all major conserved signaling pathways. Transcripts that were putatively involved in growth, lipid metabolism, molting, and diapause were also identified among these genes. Differentially expressed genes related to several processes were found in C. sinicus copepodid larvae and adults. We detected 284,154 single nucleotide polymorphisms (SNPs) that provide a resource for gene function studies. Our data provide the most comprehensive transcriptome resource available for C. sinicus. This resource allowed us to identify genes associated with primary physiological processes and SNPs in coding regions, which facilitated the quantitative analysis of differential gene expression. These data should provide a foundation for future genetic and genomic studies of this and related species.
Garnier, Fabien; Janapatla, Rajendra Prasad; Charpentier, Emmanuelle; Masson, Geoffrey; Grélaud, Carole; Stach, Jean François; Denis, François; Ploy, Marie-Cécile
2007-07-01
A serotype 1 Streptococcus pneumoniae strain isolated by blood culture from a woman with pneumonia was found to harbor insertion sequence (IS) 1515 in the pneumolysin gene, abolishing pneumolysin expression. To our knowledge, this is the first report of an IS in the pneumolysin gene of S. pneumoniae.
DNA-Demethylase Regulated Genes Show Methylation-Independent Spatiotemporal Expression Patterns
Schumann, Ulrike; Lee, Joanne; Kazan, Kemal; Ayliffe, Michael; Wang, Ming-Bo
2017-01-01
Recent research has indicated that a subset of defense-related genes is downregulated in the Arabidopsis DNA demethylase triple mutant rdd (ros1 dml2 dml3), resulting in increased susceptibility to the fungal pathogen Fusarium oxysporum. In rdd plants these downregulated genes contain hypermethylated transposable element (TE) sequences in their promoters, suggesting that this methylation represses gene expression in the mutant and that these sequences are actively demethylated in wild-type plants to maintain gene expression. In this study, the tissue-specific and pathogen-inducible expression patterns of rdd-downregulated genes were investigated and the individual roles of the ROS1, DML2, and DML3 demethylases in these spatiotemporal regulation patterns were determined. Large differences in defense gene expression were observed between pathogen-infected and uninfected tissues and between root and shoot tissues in both WT and rdd plants; however, only subtle changes in promoter TE methylation patterns occurred. Therefore, while TE hypermethylation caused decreased gene expression in rdd plants, it did not dramatically affect spatiotemporal gene regulation, suggesting that this latter regulation is largely methylation independent. Analysis of ros1-3, dml2-1, and dml3-1 single gene mutant lines showed that promoter TE hypermethylation and defense-related gene repression were predominantly, but not exclusively, due to loss of ROS1 activity. These data demonstrate that DNA demethylation of TE sequences, largely by ROS1, promotes defense-related gene expression but does not control spatiotemporal expression in Arabidopsis. Summary: Ros1-mediated DNA demethylation of promoter transposable elements is essential for activation of defense-related gene expression in response to fungal infection in Arabidopsis thaliana. PMID:28894455
Study of cnidarian-algal symbiosis in the "omics" age.
Meyer, Eli; Weis, Virginia M
2012-08-01
The symbiotic associations between cnidarians and dinoflagellate algae (Symbiodinium) support productive and diverse ecosystems in coral reefs. Many aspects of this association, including the mechanistic basis of host-symbiont recognition and metabolic interaction, remain poorly understood. The first completed genome sequence for a symbiotic anthozoan is now available (the coral Acropora digitifera), and extensive expressed sequence tag resources are available for a variety of other symbiotic corals and anemones. These resources make it possible to profile gene expression, protein abundance, and protein localization associated with the symbiotic state. Here we review the history of "omics" studies of cnidarian-algal symbiosis and the current availability of sequence resources for corals and anemones, identifying genes putatively involved in symbiosis across 10 anthozoan species. The public availability of candidate symbiosis-associated genes leaves the field of cnidarian-algal symbiosis poised for in-depth comparative studies of sequence diversity and gene expression and for targeted functional studies of genes associated with symbiosis. Reviewing the progress to date suggests directions for future investigations of cnidarian-algal symbiosis that include (i) sequencing of Symbiodinium, (ii) proteomic analysis of the symbiosome membrane complex, (iii) glycomic analysis of Symbiodinium cell surfaces, and (iv) expression profiling of the gastrodermal cells hosting Symbiodinium.
Guo, Baozhu; Chen, Xiaoping; Dang, Phat; Scully, Brian T; Liang, Xuanqiang; Holbrook, C Corley; Yu, Jiujiang; Culbreath, Albert K
2008-01-01
Background Peanut (Arachis hypogaea L.) is an important crop economically and nutritionally, and is one of the host crops most susceptible to colonization by Aspergillus parasiticus and subsequent aflatoxin contamination. Knowledge from molecular genetic studies could help to devise strategies to alleviate this problem; however, few peanut DNA sequences are available in the public database. In order to understand the molecular basis of host resistance to aflatoxin contamination, a large-scale project was conducted to generate expressed sequence tags (ESTs) from developing seeds to identify resistance-related genes involved in defense response against Aspergillus infection and subsequent aflatoxin contamination. Results We constructed six different cDNA libraries derived from developing peanut seeds at three reproduction stages (R5, R6 and R7) from a susceptible and a resistant cultivated peanut genotype, 'Tifrunner' (susceptible to Aspergillus infection with higher aflatoxin contamination and resistant to TSWV) and 'GT-C20' (resistant to Aspergillus with reduced aflatoxin contamination and susceptible to TSWV). The developing peanut seed tissues were challenged by A. parasiticus and drought stress in the field. A total of 24,192 randomly selected cDNA clones from six libraries were sequenced. After removing vector sequences and quality trimming, 21,777 high-quality EST sequences were generated. Sequence clustering and assembling resulted in 8,689 unique EST sequences with 1,741 tentative consensus EST sequences (TCs) and 6,948 singleton ESTs. Functional classification was performed according to MIPS functional catalogue criteria. The unique EST sequences were divided into twenty-two categories. A similarity search against the non-redundant protein database available from NCBI indicated that 84.78% of total ESTs showed significant similarity to known proteins, of which 165 genes had been previously reported in peanuts.
There were differences in overall expression patterns in different libraries and genotypes. A number of sequences were expressed throughout all of the libraries, representing constitutively expressed sequences. In order to identify resistance-related genes with significantly differential expression, a statistical analysis to estimate the relative abundance (R) was used to compare the relative abundance of each gene transcript in each cDNA library. Thirty-six and forty-seven unique EST sequences with a threshold of R > 4 from libraries of 'GT-C20' and 'Tifrunner', respectively, were selected for examination of temporal gene expression patterns according to EST frequencies. Nine and eight resistance-related genes with significant up-regulation were obtained in 'GT-C20' and 'Tifrunner' libraries, respectively. Among them, three genes were common to both genotypes. Furthermore, a comparison of our EST sequences with other plant sequences in the TIGR Gene Indices libraries showed that the percentage of peanut ESTs matching Arabidopsis thaliana, maize (Zea mays), Medicago truncatula, rapeseed (Brassica napus), rice (Oryza sativa), soybean (Glycine max) and wheat (Triticum aestivum) ESTs ranged from 33.84% to 79.46% at a sequence identity ≥ 80%. These results revealed that peanut ESTs are more closely related to legume species than to cereal crops, and more homologous to dicot than to monocot plant species. Conclusion The developed ESTs can be used to discover novel sequences or genes, to identify resistance-related genes and to detect the differences among alleles or markers between these resistant and susceptible peanut genotypes. Additionally, this large collection of cultivated peanut EST sequences will make it possible to construct microarrays for gene expression studies and for further characterization of host resistance mechanisms. It will be a valuable genomic resource for the peanut community.
The 21,777 ESTs have been deposited to the NCBI GenBank database with accession numbers ES702769 to ES724546. PMID:18248674
Paugh, Steven W.; Coss, David R.; Bao, Ju; ...
2016-02-04
MicroRNAs are important regulators of gene expression, acting primarily by binding to sequence-specific locations on already transcribed messenger RNAs (mRNA). Recent studies indicate that microRNAs may also play a role in up-regulating mRNA transcription levels, although a definitive mechanism has not been established. Double-helical DNA is capable of forming triple-helical structures through Hoogsteen and reverse Hoogsteen interactions in the major groove of the duplex, and we show physical evidence that microRNAs form triple-helical structures with duplex DNA, and identify microRNA sequences that favor triplex formation. We developed an algorithm (Trident) to search genome-wide for potential triplex-forming sites and show that several mammalian and non-mammalian genomes are enriched for strong microRNA triplex binding sites. We show that those genes containing sequences favoring microRNA triplex formation are markedly enriched (3.3-fold, p < 2.2 × 10^-16) for genes whose expression is positively correlated with expression of microRNAs targeting triplex binding sequences. This work has thus revealed a new mechanism by which microRNAs can interact with gene promoter regions to modify gene transcription.
Tian, Xin-Jie; Long, Yan; Wang, Jiao; Zhang, Jing-Wen; Wang, Yan-Yan; Li, Wei-Min; Peng, Yu-Fa; Yuan, Qian-Hua; Pei, Xin-Wu
2015-01-01
The perennial O. rufipogon (common wild rice), which is considered to be the ancestor of Asian cultivated rice species, contains many useful genetic resources, including drought resistance genes. However, few studies have identified the drought resistance and tissue-specific genes in common wild rice. In this study, transcriptome sequencing libraries were constructed, including drought-treated roots (DR) and control leaves (CL) and roots (CR). Using Illumina sequencing technology, we generated 16.75 million bases of high-quality sequence data for common wild rice and conducted de novo assembly and annotation of genes without prior genome information. These reads were assembled into 119,332 unigenes with an average length of 715 bp. A total of 88,813 distinct sequences (74.42% of unigenes) significantly matched known genes in the NCBI NT database. Differentially expressed gene (DEG) analysis showed that 3617 genes were up-regulated and 4171 genes were down-regulated in the CR library compared with the CL library. Among the DEGs, 535 genes were expressed in roots but not in shoots. A similar comparison between the DR and CR libraries showed that 1393 genes were up-regulated and 315 genes were down-regulated in the DR library compared with the CR library. Finally, 37 genes that were specifically expressed in roots were screened after comparing the DEGs identified in the above-described analyses. This study provides a transcriptome sequence resource for common wild rice plants and establishes a digital gene expression profile of wild rice plants under drought conditions using the assembled transcriptome data as a reference. Several tissue-specific and drought-stress-related candidate genes were identified, representing a fully characterized transcriptome and providing a valuable resource for genetic and genomic studies in plants.
Xu, Yan; Zou, Peng; Liu, Yao; Deng, Fengjiao
2010-06-01
Genes specifically expressed in the notochord may be crucial for proper notochord development. Using the digital differential display program offered by the National Center for Biotechnology Information, we identified a novel EST sequence from a zebrafish ovary library (No. XM_701450). The full-length cDNA of this transcript was cloned by performing 3' and 5'-RACE and was further confirmed by PCR and sequencing. The resulting 614 bp gene was found to encode a novel 94 amino acid protein that did not share significant homology with any other known protein. Characterization of the genomic sequence revealed that the gene spanned 4.9 kb and was composed of four exons and three introns. RT-PCR gene expression analysis revealed that our gene of interest was expressed in ovary, kidney, brain, mature oocytes and during the early stages of embryogenesis. During embryonic development, znfr mRNA was found to be expressed in the embryonic shield, chordamesoderm and the vacuolated notochord cells by in situ hybridization. Based on this information, we hypothesize that this novel gene is an important maternal factor required for zebrafish notochord formation during early embryonic development. We have thus named this gene znfr (zebrafish notochord formation related).
On the presence and role of human gene-body DNA methylation
Jjingo, Daudi; Conley, Andrew B.; Yi, Soojin V.; Lunyak, Victoria V.; Jordan, I. King
2012-01-01
DNA methylation of promoter sequences is a repressive epigenetic mark that down-regulates gene expression. However, DNA methylation is more prevalent within gene-bodies than seen for promoters, and gene-body methylation has been observed to be positively correlated with gene expression levels. This paradox remains unexplained, and accordingly the role of DNA methylation in gene-bodies is poorly understood. We addressed the presence and role of human gene-body DNA methylation using a meta-analysis of human genome-wide methylation, expression and chromatin data sets. Methylation is associated with transcribed regions as genic sequences have higher levels of methylation than intergenic or promoter sequences. We also find that the relationship between gene-body DNA methylation and expression levels is non-monotonic and bell-shaped. Mid-level expressed genes have the highest levels of gene-body methylation, whereas the most lowly and highly expressed sets of genes both have low levels of methylation. While gene-body methylation can be seen to efficiently repress the initiation of intragenic transcription, the vast majority of methylated sites within genes are not associated with intragenic promoters. In fact, highly expressed genes initiate the most intragenic transcription, which is inconsistent with the previously held notion that gene-body methylation serves to repress spurious intragenic transcription to allow for efficient transcriptional elongation. These observations lead us to propose a model to explain the presence of human gene-body methylation. This model holds that the repression of intragenic transcription by gene-body methylation is largely epiphenomenal, and suggests that gene-body methylation levels are predominantly shaped via the accessibility of the DNA to methylating enzyme complexes. PMID:22577155
Cao, Chuanwang; Wang, Zhiying; Niu, Changying; Desneux, Nicolas; Gao, Xiwu
2013-01-01
Phenol is a major pollutant in aquatic ecosystems due to its chemical stability, water solubility and environmental mobility. To date, little is known about the molecular modifications of invertebrates under phenol stress. In the present study, we used Solexa sequencing technology to investigate the transcriptome and differentially expressed genes (DEGs) of midges (Chironomus kiinensis) in response to phenol stress. A total of 51,518,972 and 51,150,832 clean reads in the phenol-treated and control libraries, respectively, were obtained and assembled into 51,014 non-redundant (Nr) consensus sequences. A total of 6,032 unigenes were classified by Gene Ontology (GO), and 18,366 unigenes were categorized into 238 Kyoto Encyclopedia of Genes and Genomes (KEGG) categories. These genes included representatives from almost all functional categories. A total of 10,724 differentially expressed genes (P value <0.05) were detected in a comparative analysis of the expression profiles between phenol-treated and control C. kiinensis including 8,390 upregulated and 2,334 downregulated genes. The expression levels of 20 differentially expressed genes were confirmed by real-time RT-PCR, and the trends in gene expression that were observed matched the Solexa expression profiles, although the magnitude of the variations was different. Through pathway enrichment analysis, significantly enriched pathways were identified for the DEGs, including metabolic pathways, aryl hydrocarbon receptor (AhR), pancreatic secretion and neuroactive ligand-receptor interaction pathways, which may be associated with the phenol responses of C. kiinensis. Using Solexa sequencing technology, we identified several groups of key candidate genes as well as important biological pathways involved in the molecular modifications of chironomids under phenol stress. PMID:23527048
Saga, Yukika; Inamura, Tomoka; Shimada, Nao; Kawata, Takefumi
2016-05-01
STATa, a Dictyostelium homologue of metazoan signal transducer and activator of transcription, is important for the organizer function in the tip region of the migrating Dictyostelium slug. We previously showed that ecmF gene expression depends on STATa in prestalk A (pstA) cells, where STATa is activated. Deletion and site-directed mutagenesis analysis of the ecmF/lacZ fusion gene in wild-type and STATa null strains identified an imperfect inverted repeat sequence, ACAAATANTATTTGT, as a STATa-responsive element. An upstream sequence element was required for efficient expression in the rear region of pstA zone; an element downstream of the inverted repeat was necessary for sufficient prestalk expression during culmination. Band shift analyses using purified STATa protein detected no sequence-specific binding to those ecmF elements. The only verified upregulated target gene of STATa is cudA gene; CudA directly activates expL7 gene expression in prestalk cells. However, ecmF gene expression was almost unaffected in a cudA null mutant. Several previously reported putative STATa target genes were also expressed in cudA null mutant but were downregulated in STATa null mutant. Moreover, mybC, which encodes another transcription factor, belonged to this category, and ecmF expression was downregulated in a mybC null mutant. These findings demonstrate the existence of a genetic hierarchy for pstA-specific genes, which can be classified into two distinct STATa downstream pathways, CudA dependent and independent. The ecmF expression is indirectly upregulated by STATa in a CudA-independent activation manner but dependent on MybC, whose expression is positively regulated by STATa. © 2016 Japanese Society of Developmental Biologists.
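The STATa-responsive element reported above, ACAAATANTATTTGT, can be located in a promoter sequence with a simple pattern search. The following is a minimal Python sketch, not the authors' method: the `promoter` string is a hypothetical fragment invented for illustration, and the helper names are assumptions. It expands IUPAC codes (here only `N` matters) into a regular expression and also shows that the two half-sites flanking the central N are reverse complements of each other.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_pattern(motif):
    """Translate an IUPAC motif into a regular-expression string."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(sequence, motif):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # A lookahead lets finditer report overlapping occurrences.
    pattern = re.compile("(?=(" + motif_to_pattern(motif) + "))")
    return [m.start() for m in pattern.finditer(sequence.upper())]


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


MOTIF = "ACAAATANTATTTGT"  # element reported in the abstract
promoter = "GGCTACAAATAGTATTTGTCCA"  # hypothetical fragment, for illustration only

hits = find_motif(promoter, MOTIF)
left_half, right_half = MOTIF[:7], MOTIF[8:]
halves_are_inverted = reverse_complement(left_half) == right_half
```

Scanning both strands is unnecessary for this particular element because the half-sites are mutual reverse complements; for a non-palindromic motif one would also search with `reverse_complement(motif)`.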
Kanofsky, Konstantin; Lehmeyer, Mona; Schulze, Jutta; Hehl, Reinhard
2016-01-01
Plants recognize pathogens by microbe-associated molecular patterns (MAMPs) and subsequently induce an immune response. The regulation of gene expression during the immune response depends largely on cis-sequences conserved in promoters of MAMP-responsive genes. These cis-sequences can be analyzed by constructing synthetic promoters linked to a reporter gene and by testing these constructs in transient expression systems. Here, the use of the parsley (Petroselinum crispum) protoplast system for analyzing MAMP-responsive synthetic promoters is described. The synthetic promoter consists of four copies of a potential MAMP-responsive cis-sequence cloned upstream of a minimal promoter and the uidA reporter gene. The reporter plasmid contains a second reporter gene, which is constitutively expressed and hence eliminates the requirement of a second plasmid used as a transformation control. The reporter plasmid is transformed into parsley protoplasts that are elicited by the MAMP Pep25. The MAMP responsiveness is validated by comparing the reporter gene activity from MAMP-treated and untreated cells and by normalizing reporter gene activity using the constitutively expressed reporter gene.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Czarnecki, Olaf; Bryan, Anthony C.; Jawdy, Sara S.; ...
2016-02-17
Genetic engineering of plants that results in successful establishment of new biochemical or regulatory pathways requires stable introduction of one or more genes into the plant genome. It might also be necessary to down-regulate or turn off expression of endogenous genes in order to reduce activity of competing pathways. An established way to knock down gene expression in plants is expressing a hairpin-RNAi construct, eventually leading to degradation of a specifically targeted mRNA. Knockdown of multiple genes that do not share homologous sequences is still challenging and involves either sophisticated cloning strategies to create vectors with different serial expression constructs or multiple transformation events, which are often restricted by a lack of available transformation markers. Synthetic RNAi fragments carrying homologous sequences to six or seven non-family genes were assembled in yeast and introduced into pAGRIKOLA. Transformation of Arabidopsis thaliana and subsequent expression analysis of targeted genes proved efficient knockdown of all target genes. In conclusion, we present a simple and cost-effective method to create constructs to simultaneously knock down multiple non-family genes or genes that do not share sequence homology. The presented method can be applied in plant and animal synthetic biology as well as traditional plant and animal genetic engineering.
Di, Yanming; Schafer, Daniel W.; Wilhelm, Larry J.; Fox, Samuel E.; Sullivan, Christopher M.; Curzon, Aron D.; Carrington, James C.; Mockler, Todd C.; Chang, Jeff H.
2011-01-01
GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable for prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays as well as the observed and substantial overlap in results from each of the three statistics packages demonstrates that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts. PMID:21998647
CORNAS: coverage-dependent RNA-Seq analysis of gene expression data without biological replicates.
Low, Joel Z B; Khang, Tsung Fei; Tammi, Martti T
2017-12-28
In current statistical methods for calling differentially expressed genes in RNA-Seq experiments, the assumption is that an adjusted observed gene count represents an unknown true gene count. This adjustment usually consists of a normalization step to account for heterogeneous sample library sizes, and then the resulting normalized gene counts are used as input for parametric or non-parametric differential gene expression tests. However, a distribution of true gene counts, each with a different probability, can result in the same observed gene count. Importantly, sequencing coverage information is currently not explicitly incorporated into any of the statistical models used for RNA-Seq analysis. We developed a fast Bayesian method which uses the sequencing coverage information determined from the concentration of an RNA sample to estimate the posterior distribution of a true gene count. Our method has better or comparable performance compared to NOISeq and GFOLD, according to the results from simulations and experiments with real unreplicated data. We incorporated a previously unused sequencing coverage parameter into a procedure for differential gene expression analysis with RNA-Seq data. Our results suggest that our method can be used to overcome analytical bottlenecks in experiments with a limited number of replicates and low sequencing coverage. The method is implemented in CORNAS (Coverage-dependent RNA-Seq), and is available at https://github.com/joel-lzb/CORNAS.
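The idea that many true gene counts are compatible with one observed count can be illustrated with a toy model. The sketch below is an assumption made for illustration and is not taken from CORNAS: it supposes that each true transcript is sequenced independently with probability p (the coverage), so the observed count x is Binomial(n, p), and it places a flat prior on the true count n. Under those assumptions the posterior is P(n | x) = C(n, x) p^(x+1) (1-p)^(n-x) for n ≥ x, with mean (x+1)/p - 1.

```python
from math import comb


def posterior_true_count(x, p, n_max):
    """Posterior P(n | x) for n = x..n_max, assuming x ~ Binomial(n, p) and a flat prior on n.

    The returned probabilities sum to (almost) 1 when n_max is far in the tail.
    """
    if not 0 < p <= 1:
        raise ValueError("coverage p must be in (0, 1]")
    if x < 0 or n_max < x:
        raise ValueError("need 0 <= x <= n_max")
    return {
        n: comb(n, x) * p ** (x + 1) * (1 - p) ** (n - x)
        for n in range(x, n_max + 1)
    }


def posterior_mean(x, p):
    """Closed-form posterior mean of the true count under the same toy model."""
    return (x + 1) / p - 1


# Example: 5 reads observed at 50% coverage.
dist = posterior_true_count(x=5, p=0.5, n_max=200)
total_mass = sum(dist.values())
numeric_mean = sum(n * prob for n, prob in dist.items())
```

With lower coverage the posterior spreads over a wider range of true counts, which is the intuition behind making uncertainty about a single unreplicated count depend on coverage.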
Analysis of expressed sequence tags from a NaHCO(3)-treated alkali-tolerant plant, Chloris virgata.
Nishiuchi, Shunsaku; Fujihara, Kazumasa; Liu, Shenkui; Takano, Tetsuo
2010-04-01
Chloris virgata Swartz (C. virgata) is a gramineous wild plant that can survive in saline-alkali areas in northeast China. To examine the tolerance mechanisms of C. virgata, we constructed a cDNA library from whole plants of C. virgata that had been treated with 100 mM NaHCO(3) for 24 h and sequenced 3168 randomly selected clones. Most (2590) of the expressed sequence tags (ESTs) showed significant similarity to sequences in the NCBI database. Of these 2590 ESTs, 1893 were unique. Gene Ontology (GO) Slim annotations were obtained for 1081 ESTs by BLAST2GO, and 75 of them were annotated with the GO terms "response to stress", "response to abiotic stimulus", and "response to biotic stimulus", indicating that these genes are likely to function in the tolerance mechanism of C. virgata. In a separate experiment, 24 genes that are known from previous studies to be associated with abiotic stress tolerance were further examined by real-time RT-PCR to see how their expression was affected by NaHCO(3) stress. NaHCO(3) treatment up-regulated the expression of a pathogenesis-related gene (DC998527), a Win1 precursor gene (DC998617), a catalase gene (DC999385), ribosome inactivating protein 1 (DC999555), a Na(+)/H(+) antiporter gene (DC998043), and a two-component regulator gene (DC998236). Copyright 2010 Elsevier Masson SAS. All rights reserved.
DNA Methylation of Gene Expression in Acanthamoeba castellanii Encystation.
Moon, Eun-Kyung; Hong, Yeonchul; Lee, Hae-Ahm; Quan, Fu-Shi; Kong, Hyun-Hee
2017-04-01
The encystation-mediating cyst-specific cysteine proteinase (CSCP) of Acanthamoeba castellanii is highly expressed during encystation. However, the molecular mechanism involved in the regulation of CSCP gene expression remains unclear. In this study, we focused on epigenetic regulation of gene expression during encystation of Acanthamoeba. To evaluate methylation as a potential mechanism involved in the regulation of CSCP expression, we first investigated the correlation between the promoter methylation status of the CSCP gene and its expression. A 2,878 bp promoter sequence of the CSCP gene was amplified by PCR. Three CpG islands (islands 1-3) were detected in this sequence using bioinformatics tools. Methylation of the CpG islands in trophozoites and cysts was measured by bisulfite sequencing PCR. CSCP promoter methylation of CpG island 1 (1,633 bp) was found in 8.2% of trophozoites and 7.3% of cysts. Methylation of CpG island 2 (625 bp) was observed in 4.2% of trophozoites and 5.8% of cysts. Methylation of CpG island 3 (367 bp) was 3.6% in both trophozoites and cysts. These results suggest that a DNA methylation system is involved in CSCP gene expression in Acanthamoeba. In addition, the expression of encystation-mediating CSCP is correlated with promoter CpG island 1 hypomethylation.
Zhang, Yi; Zhao, Yuanyuan; Qiu, Xuehong; Han, Richou
2013-08-01
Coptotermes formosanus Shiraki (Isoptera: Rhinotermitidae) termites are social insects that damage wooden constructions. Current control methods depend heavily on chemical insecticides, to which resistance is increasing. Analysis of the differentially expressed genes mediated by chemical insecticides will contribute to the understanding of termite resistance to chemicals and to the establishment of alternative control measures. In the present article, a full-length cDNA library was constructed from termites induced by a mixture of commonly used insecticides (0.01% sulfluramid and 0.01% triflumuron) for 24 h, by using the RNA ligase-mediated Rapid Amplification of cDNA Ends method. Fifty-eight differentially expressed clones were obtained by polymerase chain reaction and confirmed by dot-blot hybridization. Forty-six known sequences were obtained, which clustered into 33 unique sequences grouped in 6 contigs and 27 singlets. Sixty-seven percent (22) of the sequences had counterpart genes from other organisms, whereas 33% (11) were undescribed. A Gene Ontology analysis classified the 33 unique sequences into different functional categories. In general, most of the differentially expressed genes were involved in binding and catalytic activity.
Stenman, G; Anisowicz, A; Sager, R
1988-11-01
The KRAS gene is constitutionally amplified in the Chinese hamster. We have mapped the amplified sequences by in situ hybridization to two major sites on the X and Y chromosomes, Xq4 and Yp2. No autosomal site was detected despite a search under relaxed hybridization conditions. KRAS DNA is amplified about 50-fold compared to a human cell line known to have a diploid number of KRAS sequences, whereas mRNA expression is 5- to 10-fold lower than in normal human cells. While mRNA expression levels do not necessarily parallel gene copy number, the low expression level strongly suggests that the amplified sequences are transcriptionally silent. It is suggested that the amplified sequences arose from the original KRAS gene on chromosome 8 and that the KRAS sequences on the Y chromosome arose by X-Y recombination.
Holmes, Andrew; Szafranski, Karol; Faulkes, Chris G.; Coen, Clive W.; Buffenstein, Rochelle; Platzer, Matthias; de Magalhães, João Pedro; Church, George M.
2011-01-01
The naked mole-rat (Heterocephalus glaber) is a long-lived, cancer resistant rodent and there is a great interest in identifying the adaptations responsible for these and other of its unique traits. We employed RNA sequencing to compare liver gene expression profiles between naked mole-rats and wild-derived mice. Our results indicate that genes associated with oxidoreduction and mitochondria were expressed at higher relative levels in naked mole-rats. The largest effect is nearly 300-fold higher expression of epithelial cell adhesion molecule (Epcam), a tumour-associated protein. Also of interest are the protease inhibitor, alpha2-macroglobulin (A2m), and the mitochondrial complex II subunit Sdhc, both ageing-related genes found strongly over-expressed in the naked mole-rat. These results hint at possible candidates for specifying species differences in ageing and cancer, and in particular suggest complex alterations in mitochondrial and oxidation reduction pathways in the naked mole-rat. Our differential gene expression analysis obviated the need for a reference naked mole-rat genome by employing a combination of Illumina/Solexa and 454 platforms for transcriptome sequencing and assembling transcriptome contigs of the non-sequenced species. Overall, our work provides new research foci and methods for studying the naked mole-rat's fascinating characteristics. PMID:22073188
Zhou, Rongqiong; Xia, Qingyou; Huang, Hancheng; Lai, Min; Wang, Zhenxin
2011-10-01
Toxocara canis is a widespread intestinal nematode parasite of dogs, which can also cause disease in humans. We employed an expressed sequence tag (EST) strategy in order to study gene expression related to development, digestion and reproduction of T. canis. ESTs provide a rapid way to identify genes, particularly in organisms for which we have very little molecular information. In this study, a cDNA library was constructed from a female adult of T. canis and 215 high-quality ESTs from the 5'-ends of the cDNA clones, representing 79 unigenes, were obtained. The titer of the primary cDNA library was 1.83×10(6) pfu/mL with a recombination rate of 99.33%. Most of the sequences ranged from 300 to 900 bp with an average length of 656 bp. Cluster analysis of these ESTs allowed identification of 79 unique sequences comprising 28 contigs and 51 singletons. BLASTX searches revealed that 18 unigenes (22.78% of the total) or 70 ESTs (32.56% of the total) were novel genes that had no significant matches to any protein sequences in the public databases. The remaining 61 unigenes (77.22% of the total) or 145 ESTs (67.44% of the total) were closely matched to known genes or sequences deposited in the public databases. These genes were classified into seven groups based on their known or putative biological functions. We also confirmed the gene expression patterns of several immune-related genes using RT-PCR examination. This work will provide a valuable resource for further investigations of stage-, sex- and tissue-specific gene transcription or expression. Copyright © 2011. Published by Elsevier Inc.
Analysis of xylem formation in pine by cDNA sequencing
NASA Technical Reports Server (NTRS)
Allona, I.; Quinn, M.; Shoop, E.; Swope, K.; St Cyr, S.; Carlis, J.; Riedl, J.; Retzel, E.; Campbell, M. M.; Sederoff, R.
1998-01-01
Secondary xylem (wood) formation is likely to involve some genes expressed rarely or not at all in herbaceous plants. Moreover, environmental and developmental stimuli influence secondary xylem differentiation, producing morphological and chemical changes in wood. To increase our understanding of xylem formation, and to provide material for comparative analysis of gymnosperm and angiosperm sequences, ESTs were obtained from immature xylem of loblolly pine (Pinus taeda L.). A total of 1,097 single-pass sequences were obtained from 5' ends of cDNAs made from gravistimulated tissue from bent trees. Cluster analysis detected 107 groups of similar sequences, ranging in size from 2 to 20 sequences. A total of 361 sequences fell into these groups, whereas 736 sequences were unique. About 55% of the pine EST sequences show similarity to previously described sequences in public databases. About 10% of the recognized genes encode factors involved in cell wall formation. Sequences similar to cell wall proteins, most known lignin biosynthetic enzymes, and several enzymes of carbohydrate metabolism were found. A number of putative regulatory proteins also are represented. Expression patterns of several of these genes were studied in various tissues and organs of pine. Sequencing novel genes expressed during xylem formation will provide a powerful means of identifying mechanisms controlling this important differentiation pathway.
Xu, Huayong; Yu, Hui; Tu, Kang; Shi, Qianqian; Wei, Chaochun; Li, Yuan-Yuan; Li, Yi-Xue
2013-01-01
We are witnessing rapid progress in the development of methodologies for building combinatorial gene regulatory networks involving both TFs (Transcription Factors) and miRNAs (microRNAs). There are a few tools available to do these jobs, but most of them are not easy to use and not accessible online. A web server is especially needed in order to allow users to upload experimental expression datasets and build combinatorial regulatory networks corresponding to their particular contexts. In this work, we compiled putative TF-gene, miRNA-gene and TF-miRNA regulatory relationships from forward-engineering pipelines and curated them as built-in data libraries. We streamlined the R code of our two separate forward-and-reverse engineering algorithms for combinatorial gene regulatory network construction and formalized them as two major functional modules. As a result, we released the cGRNB (combinatorial Gene Regulatory Networks Builder): a web server for constructing combinatorial gene regulatory networks through integrated engineering of seed-matching sequence information and gene expression datasets. The cGRNB provides two major network-building modules, one for MPGE (miRNA-perturbed gene expression) datasets and the other for parallel miRNA/mRNA expression datasets. A miRNA-centered two-layer combinatorial regulatory cascade is the output of the first module, and a comprehensive genome-wide network involving all three types of combinatorial regulations (TF-gene, TF-miRNA, and miRNA-gene) is the output of the second module. In this article we propose cGRNB, a web server for building combinatorial gene regulatory networks through integrated engineering of seed-matching sequence information and gene expression datasets. Since parallel miRNA/mRNA expression datasets are rapidly accumulating with the advance of next-generation sequencing techniques, cGRNB will be a very useful tool for researchers to build combinatorial gene regulatory networks based on expression datasets.
The cGRNB web-server is free and available online at http://www.scbit.org/cgrnb.
Bjørkeng, Eva Katrin; Tessema, Girum Tadesse; Lundblad, Eirik Wasmuth; Butaye, Patrick; Willems, Rob; Sollid, Johanna Ericsson; Sundsfjord, Arnfinn; Hegstad, Kristin
2010-01-01
The presence, distribution and expression of cassette chromosome recombinase (ccr) genes, which are homologous to the staphylococcal ccrAB genes and are designated ccrABEnt genes, were examined in enterococcal isolates (n=421) representing 13 different species. A total of 118 (28 %) isolates were positive for ccrABEnt genes by PCR, and a number of these were confirmed by Southern hybridization with a ccrAEnt probe (n=76) and partial DNA sequencing of ccrAEnt and ccrBEnt genes (n=38). ccrABEnt genes were present in Enterococcus faecium (58/216, 27 %), Enterococcus durans (31/38, 82 %), Enterococcus hirae (27/52, 50 %), Enterococcus casseliflavus (1/4, 25 %) and Enterococcus gallinarum (1/2, 50 %). In the eight other species tested, including Enterococcus faecalis (n=94), ccrABEnt genes were not found. Thirty-eight sequenced ccrABEnt genes from five different enterococcal species showed 94–100 % nucleotide sequence identity and linkage PCRs showed heterogeneity in the ccrABEnt flanking chromosomal genes. Expression analysis of ccrABEnt genes from the E. faecium DO strain showed constitutive expression as a bicistronic mRNA. The ccrABEnt mRNA levels were lower during log phase than stationary phase in relation to total mRNA. Multilocus sequence typing was performed on 39 isolates. ccrABEnt genes were detected in both hospital-related (10/29, 34 %) and non-hospital (4/10, 40 %) strains of E. faecium. Various sequence types were represented by both ccrABEnt positive and negative isolates, suggesting acquisition or loss of ccrABEnt in E. faecium. In summary, ccrABEnt genes, potentially involved in genome plasticity, are expressed in E. faecium and are widely distributed in the E. faecium and E. casseliflavus species groups. PMID:20817645
Erpen, L; Tavano, E C R; Harakava, R; Dutt, M; Grosser, J W; Piedade, S M S; Mendes, B M J; Mourão Filho, F A A
2018-05-23
Regulatory sequences from the citrus constitutive genes cyclophilin (CsCYP), glyceraldehyde-3-phosphate dehydrogenase C2 (CsGAPC2), and elongation factor 1-alpha (CsEF1) were isolated, fused to the uidA gene, and qualitatively and quantitatively evaluated in transgenic sweet orange plants. The 5' upstream region of a gene (the promoter) is the most important component for the initiation and regulation of gene transcription of both native genes and transgenes in plants. The isolation and characterization of gene regulatory sequences are essential to the development of intragenic or cisgenic genetic manipulation strategies, which imply the use of genetic material from the same species or from closely related species. We describe herein the isolation and evaluation of the promoter sequence from three constitutively expressed citrus genes: cyclophilin (CsCYP), glyceraldehyde-3-phosphate dehydrogenase C2 (CsGAPC2), and elongation factor 1-alpha (CsEF1). The functionality of the promoters was confirmed by a histochemical GUS assay in leaves, stems, and roots of stably transformed citrus plants expressing the promoter-uidA construct. Lower uidA mRNA levels were detected when the transgene was under the control of citrus promoters as compared to the expression under the control of the CaMV35S promoter. The association of the uidA gene with the citrus-derived promoters resulted in mRNA levels of up to 60-41.8% of the value obtained with the construct containing CaMV35S driving the uidA gene. Moreover, a lower inter-individual variability in transgene expression was observed amongst the different transgenic lines, where gene constructs containing citrus-derived promoters were used. In silico analysis of the citrus-derived promoter sequences revealed that their activity may be controlled by several putative cis-regulatory elements. These citrus promoters will expand the availability of regulatory sequences for driving gene expression in citrus gene-modification programs.
EXP-PAC: providing comparative analysis and storage of next generation gene expression data.
Church, Philip C; Goscinski, Andrzej; Lefèvre, Christophe
2012-07-01
Microarrays and, more recently, RNA sequencing have led to an increase in available gene expression data. How to manage and store these data is becoming a key issue. In response we have developed EXP-PAC, a web-based software package for storage, management and analysis of gene expression and sequence data. Unique to this package are SQL-based querying of gene expression data sets, distributed normalization of raw gene expression data and analysis of gene expression data across experiments and species. This package has been populated with lactation data in the international milk genomic consortium web portal (http://milkgenomics.org/). Source code is also available which can be hosted on a Windows, Linux or Mac APACHE server connected to a private or public network (http://mamsap.it.deakin.edu.au/~pcc/Release/EXP_PAC.html). Copyright © 2012 Elsevier Inc. All rights reserved.
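To make the idea of SQL-based querying across experiments and species concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, column names, gene symbols and values are all hypothetical and are not taken from EXP-PAC, whose actual schema is not described in the abstract.

```python
import sqlite3

# Hypothetical schema: one row per (experiment, species, gene) measurement.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE expression (experiment TEXT, species TEXT, gene TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO expression VALUES (?, ?, ?, ?)",
    [
        ("exp1", "cow", "CSN2", 120.0),
        ("exp2", "cow", "CSN2", 80.0),
        ("exp3", "mouse", "Csn2", 40.0),
        ("exp1", "cow", "LALBA", 60.0),
    ],
)

def mean_expression_by_species(connection, gene):
    """Average a gene's expression per species, pooling all experiments."""
    cursor = connection.execute(
        "SELECT species, AVG(value) FROM expression "
        "WHERE UPPER(gene) = UPPER(?) GROUP BY species ORDER BY species",
        (gene,),
    )
    return cursor.fetchall()

result = mean_expression_by_species(conn, "CSN2")
```

Keeping measurements in a single long table is one simple design that lets the same query compare a gene across experiments and species; a production system would likely need orthology mapping rather than the case-insensitive symbol match used here.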
Rodovalho, Cynara M; Ferro, Milene; Fonseca, Fernando Pp; Antonio, Erik A; Guilherme, Ivan R; Henrique-Silva, Flávio; Bacci, Maurício
2011-06-17
Background Leafcutters are the most highly evolved of the Neotropical ants in the tribe Attini and are model systems for studying caste formation, labor division and symbiosis with microorganisms. Some species of leafcutters are agricultural pests controlled by chemicals which affect other animals and accumulate in the environment. Aiming to provide a genetic basis for the study of leafcutters and for the development of more specific and environmentally friendly methods for the control of pest leafcutters, we generated expressed sequence tag data from Atta laevigata, one of the pest ants with broad geographic distribution in South America. Results The analysis of the expressed sequence tags allowed us to characterize 2,006 unique sequences in Atta laevigata. Sixteen of these genes had a high number of transcripts and are likely positively selected for a high level of gene expression, being responsible for three basic biological functions: energy conservation through redox reactions in mitochondria; cytoskeleton and muscle structuring; regulation of gene expression and metabolism. Based on the leafcutters' lifestyle and reports of genes involved in key processes of other social insects, we identified 146 sequences that are potential targets for controlling pest leafcutters. The targets are responsible for antixenobiosis, development and longevity, immunity, resistance to pathogens, pheromone function, cell signaling, behavior, polysaccharide metabolism and arginine kinase activity. Conclusion The generation and analysis of expressed sequence tags from Atta laevigata have provided an important genetic basis for future studies on the biology of leaf-cutting ants and may contribute to the development of a more specific and environmentally friendly method for the control of agricultural pest leafcutters. PMID:21682882
An efficient annotation and gene-expression derivation tool for Illumina Solexa datasets.
Hosseini, Parsa; Tremblay, Arianne; Matthews, Benjamin F; Alkharouf, Nadim W
2010-07-02
An Illumina flow cell with all eight lanes occupied produces well over a terabyte of images, with gigabytes of reads following sequence alignment. The ability to translate such reads into meaningful annotation is therefore of great concern and importance. One can easily be flooded with such a great volume of textual, unannotated data, irrespective of read quality or size. CASAVA, an optional analysis tool for Illumina sequencing experiments, provides INDEL detection, SNP information, and allele calling. Being able not only to extract a measure of gene expression from such analysis in the form of tag-counts, but also to annotate the reads, is therefore of significant value. We developed TASE (Tag counting and Analysis of Solexa Experiments), a rapid tag-counting and annotation software tool specifically designed for Illumina CASAVA sequencing datasets. Developed in Java and deployed using the jTDS JDBC driver and a SQL Server backend, TASE provides an extremely fast means of calculating gene expression through tag-counts while annotating sequenced reads with the gene's presumed function, from any given CASAVA-build. Such a build is generated for both DNA and RNA sequencing. Analysis is broken into two distinct components: DNA sequence or read concatenation, followed by tag-counting and annotation. The end result is output containing the homology-based functional annotation and the respective gene expression measure, signifying how many times sequenced reads were found within the genomic ranges of functional annotations. TASE is a powerful tool to facilitate the process of annotating a given Illumina Solexa sequencing dataset. Our results indicate that both homology-based annotation and tag-count analysis are achieved in very efficient times, allowing researchers to delve deep into a given CASAVA-build and maximize information extraction from a sequencing dataset.
TASE is specially designed to translate sequence data in a CASAVA-build into functional annotations while producing corresponding gene expression measurements. Achieving such analysis is executed in an ultrafast and highly efficient manner, whether the analysis be a single-read or paired-end sequencing experiment. TASE is a user-friendly and freely available application, allowing rapid analysis and annotation of any given Illumina Solexa sequencing dataset with ease.
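The tag-counting step described above, counting how many sequenced reads fall within the genomic range of each functional annotation, can be sketched as follows. This is an illustrative Python sketch rather than TASE's Java and SQL Server implementation; the function name, the tuple layouts and the example coordinates are hypothetical.

```python
from collections import Counter

def count_tags(read_positions, annotations):
    """Count reads whose position lies inside each annotated genomic range.

    read_positions: iterable of (chromosome, position)
    annotations:    iterable of (chromosome, start, end, name), inclusive ends
    """
    counts = Counter({name: 0 for _, _, _, name in annotations})
    for chrom, pos in read_positions:
        for a_chrom, start, end, name in annotations:
            if chrom == a_chrom and start <= pos <= end:
                counts[name] += 1
    return dict(counts)

annotations = [
    ("chr1", 100, 200, "kinase-like"),
    ("chr1", 500, 900, "transporter"),
    ("chr2", 100, 200, "hypothetical protein"),
]
reads = [("chr1", 150), ("chr1", 199), ("chr1", 600), ("chr2", 50), ("chr3", 150)]
tag_counts = count_tags(reads, annotations)
```

The nested loop is quadratic and only suitable for small examples; a database index or an interval tree would be the natural choice at the data volumes the abstract describes.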
Gene expression distribution deconvolution in single-cell RNA sequencing.
Wang, Jingshu; Huang, Mo; Torre, Eduardo; Dueck, Hannah; Shaffer, Sydney; Murray, John; Raj, Arjun; Li, Mingyao; Zhang, Nancy R
2018-06-26
Single-cell RNA sequencing (scRNA-seq) enables the quantification of each gene's expression distribution across cells, thus allowing the assessment of the dispersion, nonzero fraction, and other aspects of its distribution beyond the mean. These statistical characterizations of the gene expression distribution are critical for understanding expression variation and for selecting marker genes for population heterogeneity. However, scRNA-seq data are noisy, with each cell typically sequenced at low coverage, thus making it difficult to infer properties of the gene expression distribution from raw counts. Based on a reexamination of nine public datasets, we propose a simple technical noise model for scRNA-seq data with unique molecular identifiers (UMI). We develop deconvolution of single-cell expression distribution (DESCEND), a method that deconvolves the true cross-cell gene expression distribution from observed scRNA-seq counts, leading to improved estimates of properties of the distribution such as dispersion and nonzero fraction. DESCEND can adjust for cell-level covariates such as cell size, cell cycle, and batch effects. DESCEND's noise model and estimation accuracy are further evaluated through comparisons to RNA FISH data, through data splitting and simulations and through its effectiveness in removing known batch effects. We demonstrate how DESCEND can clarify and improve downstream analyses such as finding differentially expressed genes, identifying cell types, and selecting differentiation markers. Copyright © 2018 the Author(s). Published by PNAS.
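A simplified way to see why raw counts misstate the spread of the true expression distribution is a moment-based correction. The sketch below is not DESCEND, which deconvolves the whole distribution; it assumes, for illustration only, that each observed UMI count is Poisson with mean equal to a known capture efficiency times the cell's true expression. The function name and the example efficiency are hypothetical.

```python
from statistics import mean, pvariance

def deconvolved_moments(umi_counts, efficiency):
    """Estimate mean and variance of true expression from noisy UMI counts.

    Illustrative assumption: count | true ~ Poisson(efficiency * true), so
      E[count]   = efficiency * E[true]
      Var(count) = efficiency * E[true] + efficiency**2 * Var(true)
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    observed_mean = mean(umi_counts)
    observed_var = pvariance(umi_counts)
    true_mean = observed_mean / efficiency
    # Subtract the Poisson sampling variance; clamp at zero for noisy inputs.
    true_var = max(0.0, observed_var - observed_mean) / efficiency**2
    return true_mean, true_var

counts = [0, 1, 2, 3, 4, 8]
true_mean, true_var = deconvolved_moments(counts, efficiency=0.5)
```

The subtraction of the observed mean removes the variance that sampling alone would produce, so what remains is attributed to genuine cell-to-cell variation under the stated model.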
Suppression of prolactin gene expression in GH cells correlates with site-specific DNA methylation.
Zhang, Z X; Kumar, V; Rivera, R T; Pasion, S G; Chisholm, J; Biswas, D K
1989-10-01
Prolactin- (PRL) producing and nonproducing subclones of the GH line of (rat) pituitary tumor cells have been compared to elucidate the regulatory mechanisms of PRL gene expression. Particular emphasis was placed on delineating the molecular basis of the suppressed state of the PRL gene in the prolactin-nonproducing (PRL-) GH subclone (GH(1)2C1). We examined six methylatable cytosine residues (5, -CCGG- and 1, -GCGC-) within the 30-kb region of the PRL gene in these subclones. This analysis revealed that -CCGG- sequences of the transcribed region, and specifically, one in the fourth exon of the PRL gene, were heavily methylated in the PRL-, GH(1)2C1 cells. Furthermore, the inhibition of PRL gene expression in GH(1)2C1 was reversed by short-term treatment of the cells with a sublethal concentration of azacytidine (AzaC), an inhibitor of DNA methylation. The reversion of PRL gene expression by AzaC was correlated with the concurrent demethylation of the same -CCGG- sequences in the transcribed region of PRL gene. An inverse correlation between PRL gene expression and the level of methylation of the internal -C- residues in the specific -CCGG- sequence of the transcribed region of the PRL gene was demonstrated. The DNase I sensitivity of these regions of the PRL gene in PRL+, PRL-, and AzaC-treated cells was also consistent with an inverse relationship between methylation state, a higher order of structural modification, and gene expression. (ABSTRACT TRUNCATED AT 250 WORDS)
A Pipeline for High-Throughput Concentration Response Modeling of Gene Expression for Toxicogenomics
House, John S.; Grimm, Fabian A.; Jima, Dereje D.; Zhou, Yi-Hui; Rusyn, Ivan; Wright, Fred A.
2017-01-01
Cell-based assays are an attractive option to measure gene expression response to exposure, but the cost of whole-transcriptome RNA sequencing has been a barrier to the use of gene expression profiling for in vitro toxicity screening. In addition, standard RNA sequencing adds variability due to variable transcript length and amplification. Targeted probe-sequencing technologies such as TempO-Seq, with transcriptomic representation that can vary from hundreds of genes to the entire transcriptome, may reduce some components of variation. Analyses of high-throughput toxicogenomics data require renewed attention to read-calling algorithms and simplified dose–response modeling for datasets with relatively few samples. Using data from induced pluripotent stem cell-derived cardiomyocytes treated with chemicals at varying concentrations, we describe here and make available a pipeline for handling expression data generated by TempO-Seq to align reads, clean and normalize raw count data, identify differentially expressed genes, and calculate transcriptomic concentration–response points of departure. The methods are extensible to other forms of concentration–response gene-expression data, and we discuss the utility of the methods for assessing variation in susceptibility and the diseased cellular state. PMID:29163636
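One of the pipeline steps named above, normalizing raw count data, can be illustrated with a counts-per-million (CPM) sketch. CPM is used here as a common, simple normalization and is an assumption for illustration; the abstract does not say which normalization the published pipeline applies. The sample and probe names are hypothetical.

```python
def counts_per_million(raw_counts):
    """Scale each sample's raw counts so they sum to one million.

    raw_counts: {sample: {probe: count}}
    Returns the same nested structure with CPM values.
    """
    normalized = {}
    for sample, probes in raw_counts.items():
        library_size = sum(probes.values())
        if library_size == 0:
            raise ValueError(f"sample {sample!r} has no reads")
        normalized[sample] = {
            probe: count * 1_000_000 / library_size
            for probe, count in probes.items()
        }
    return normalized

raw = {
    "control": {"geneA": 10, "geneB": 30, "geneC": 60},
    "dose_10uM": {"geneA": 40, "geneB": 60, "geneC": 100},
}
cpm = counts_per_million(raw)
```

After this scaling, samples with different library sizes can be compared probe by probe, which is a precondition for fitting a response across concentrations.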
Quantitative analysis of a deeply sequenced marine microbial metatranscriptome.
Gifford, Scott M; Sharma, Shalabh; Rinta-Kanto, Johanna M; Moran, Mary Ann
2011-03-01
The potential of metatranscriptomic sequencing to provide insights into the environmental factors that regulate microbial activities depends on how fully the sequence libraries capture community expression (that is, sample-sequencing depth and coverage depth), and the sensitivity with which expression differences between communities can be detected (that is, statistical power for hypothesis testing). In this study, we use an internal standard approach to make absolute (per liter) estimates of transcript numbers, a significant advantage over proportional estimates that can be biased by expression changes in unrelated genes. Coastal waters of the southeastern United States contain 1 × 10(12) bacterioplankton mRNA molecules per liter of seawater (~200 mRNA molecules per bacterial cell). Even for the large bacterioplankton libraries obtained in this study (~500,000 possible protein-encoding sequences in each of two libraries after discarding rRNAs and small RNAs from >1 million 454 FLX pyrosequencing reads), sample-sequencing depth was only 0.00001%. Expression levels of 82 genes diagnostic for transformations in the marine nitrogen, phosphorus and sulfur cycles ranged from below detection (<1 × 10(6) transcripts per liter) for 36 genes (for example, phosphonate metabolism gene phnH, dissimilatory nitrate reductase subunit napA) to >2.7 × 10(9) transcripts per liter (ammonia transporter amt and ammonia monooxygenase subunit amoC). Half of the categories for which expression was detected, however, had too few copy numbers for robust statistical resolution, as would be required for comparative (experimental or time-series) expression studies. By representing whole community gene abundance and expression in absolute units (per volume or mass of environment), 'omics' data can be better leveraged to improve understanding of microbially mediated processes in the ocean.
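The internal standard approach can be sketched as simple arithmetic: the ratio of standard molecules added to standard reads recovered gives a molecules-per-read scaling factor, which converts a gene's read count into an absolute number of transcripts per liter. The function names and all numbers in the example are hypothetical and are not values from the study.

```python
def transcripts_per_liter(gene_reads, standard_added, standard_reads, liters):
    """Convert a gene's read count into absolute transcripts per liter.

    standard_added: molecules of internal standard spiked into the sample
    standard_reads: reads recovered from that standard in the library
    liters:         volume of seawater the sample represents
    """
    if standard_reads <= 0 or liters <= 0:
        raise ValueError("standard_reads and liters must be positive")
    molecules_per_read = standard_added / standard_reads
    return gene_reads * molecules_per_read / liters

def sequencing_depth_percent(library_sequences, transcripts_in_sample):
    """Share of the sample's transcripts captured in the library, in percent."""
    return 100.0 * library_sequences / transcripts_in_sample

# Hypothetical numbers: 1e9 standard molecules added, 1,000 recovered as reads.
estimate = transcripts_per_liter(
    gene_reads=50, standard_added=1e9, standard_reads=1000, liters=2.0
)
depth = sequencing_depth_percent(library_sequences=500, transcripts_in_sample=1e6)
```

Because the scaling factor comes from the standard rather than from the library total, a change in expression of unrelated genes does not shift the estimate for the gene of interest, which is the advantage over proportional estimates noted in the abstract.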
Pathak, B G; Neumann, J C; Croyle, M L; Lingrel, J B
1994-01-01
The Na,K-ATPase is an integral plasma membrane protein consisting of alpha and beta subunits, each of which has discrete isoforms expressed in a tissue-specific manner. Of the three functional alpha isoform genes, the one encoding the alpha 3 isoform is the most tissue-restricted in its expression, being found primarily in the brain. To identify regions of the alpha 3 isoform gene that are involved in directing expression in the brain, a 1.6 kb 5'-flanking sequence was attached to a reporter gene, chloramphenicol acetyltransferase (CAT). The alpha 3-CAT chimeric gene construct was microinjected into fertilized mouse eggs, and transgenic mice were produced. Analysis of adult transgenic mice from different lines revealed that the transgene is expressed primarily in the brain. To further delineate regions that are needed for conferring expression in this tissue, systematic deletions of the 5'-flanking sequence of the alpha 3-CAT fusion constructs were made and analyzed, again using transgenic mice. The results from these analyses indicate that DNA sequences required for mediating brain-specific expression of the alpha 3 isoform gene are present within 210 bp upstream of the transcription initiation site. alpha 3-CAT promoter constructs containing scanning mutations in this region were also assayed in transgenic mice. These studies have identified both a functional neural-restrictive silencer element as well as a positively acting cis element. Images PMID:7984427
Liu, Chang-Lun; Sung, Hung-Hung
2011-09-01
To assess the toxicity of nonylphenol towards aquatic crustaceans, Neocaridina denticulata were exposed short-term to sublethal concentrations (0.001-0.5 mg/L). Following treatment, differentially expressed genes were identified using suppression subtractive hybridization on samples prepared from whole specimens. There were 20 differentially expressed sequence tags that corresponded to known genes and could be divided into six functional classes: defence, translation, metabolism, ribosomal gene expression, respiration, and genes involved in the stress response. Using semi-quantitative RT-PCR, we found that 14 of the differentially expressed sequence tags significantly responded to nonylphenol, including six at a nominal concentration of 0.01 mg/L; among them, 12 genes were down-regulated. These results suggest that under non-lethal concentrations of nonylphenol, the polluted aquatic environment may still present a potential risk to N. denticulata.
Sedlackova, Lenka; Perkins, Keith D; Meyer, Julia; Strain, Anna K; Goldman, Oksana; Rice, Stephen A
2010-03-01
During productive herpes simplex virus type 1 (HSV-1) infection, a subset of viral delayed-early (DE) and late (L) genes require the immediate-early (IE) protein ICP27 for their expression. However, the cis-acting regulatory sequences in DE and L genes that mediate their specific induction by ICP27 are unknown. One viral L gene that is highly dependent on ICP27 is that encoding glycoprotein C (gC). We previously demonstrated that this gene is posttranscriptionally transactivated by ICP27 in a plasmid cotransfection assay. Based on our past results, we hypothesized that the gC gene possesses a cis-acting inhibitory sequence and that ICP27 overcomes the effects of this sequence to enable efficient gC expression. To test this model, we systematically deleted sequences from the body of the gC gene and tested the resulting constructs for expression. In so doing, we identified a 258-bp "silencing element" (SE) in the 5' portion of the gC coding region. When present, the SE inhibits gC mRNA accumulation from a transiently transfected gC gene, unless ICP27 is present. Moreover, the SE can be transferred to another HSV-1 gene, where it inhibits mRNA accumulation in the absence of ICP27 and confers high-level expression in the presence of ICP27. Thus, for the first time, an ICP27-responsive sequence has been identified in a physiologically relevant ICP27 target gene. To see if the SE functions during viral infection, we engineered HSV-1 recombinants that lack the SE, either in a wild-type (WT) or ICP27-null genetic background. In an ICP27-null background, deletion of the SE led to ICP27-independent expression of the gC gene, demonstrating that the SE functions during viral infection. Surprisingly, the ICP27-independent gC expression seen with the mutant occurred even in the absence of viral DNA synthesis, indicating that the SE helps to regulate the tight DNA replication-dependent expression of gC.
2010-01-01
Background De novo assembly of transcript sequences produced by short-read DNA sequencing technologies offers a rapid approach to obtain expressed gene catalogs for non-model organisms. A draft genome sequence will be produced in 2010 for a Eucalyptus tree species (E. grandis) representing the most important hardwood fibre crop in the world. Genome annotation of this valuable woody plant and genetic dissection of its superior growth and productivity will be greatly facilitated by the availability of a comprehensive collection of expressed gene sequences from multiple tissues and organs. Results We present an extensive expressed gene catalog for a commercially grown E. grandis × E. urophylla hybrid clone constructed using only Illumina mRNA-Seq technology and de novo assembly. A total of 18,894 transcript-derived contigs, a large proportion of which represent full-length protein coding genes, were assembled and annotated. Analysis of assembly quality, length and diversity shows that this dataset represents the most comprehensive expressed gene catalog for any Eucalyptus tree. mRNA-Seq analysis furthermore allowed digital expression profiling of all of the assembled transcripts across diverse xylogenic and non-xylogenic tissues, which is invaluable for ascribing putative gene functions. Conclusions De novo assembly of Illumina mRNA-Seq reads is an efficient approach for transcriptome sequencing and profiling in Eucalyptus and other non-model organisms. The transcriptome resource (Eucspresso, http://eucspresso.bi.up.ac.za/) generated by this study will be of value for genomic analysis of woody biomass production in Eucalyptus and for comparative genomic analysis of growth and development in woody and herbaceous plants. PMID:21122097
Gorodkin, Jan; Cirera, Susanna; Hedegaard, Jakob; Gilchrist, Michael J; Panitz, Frank; Jørgensen, Claus; Scheibye-Knudsen, Karsten; Arvin, Troels; Lumholdt, Steen; Sawera, Milena; Green, Trine; Nielsen, Bente J; Havgaard, Jakob H; Rosenkilde, Carina; Wang, Jun; Li, Heng; Li, Ruiqiang; Liu, Bin; Hu, Songnian; Dong, Wei; Li, Wei; Yu, Jun; Wang, Jian; Stærfeldt, Hans-Henrik; Wernersson, Rasmus; Madsen, Lone B; Thomsen, Bo; Hornshøj, Henrik; Bujie, Zhan; Wang, Xuegang; Wang, Xuefei; Bolund, Lars; Brunak, Søren; Yang, Huanming; Bendixen, Christian; Fredholm, Merete
2007-01-01
Background Knowledge of the structure of gene expression is essential for mammalian transcriptomics research. We analyzed a collection of more than one million porcine expressed sequence tags (ESTs), of which two-thirds were generated in the Sino-Danish Pig Genome Project and one-third are from public databases. The Sino-Danish ESTs were generated from one normalized and 97 non-normalized cDNA libraries representing 35 different tissues and three developmental stages. Results Using the Distiller package, the ESTs were assembled to roughly 48,000 contigs and 73,000 singletons, of which approximately 25% have a high confidence match to UniProt. Approximately 6,000 new porcine gene clusters were identified. Expression analysis based on the non-normalized libraries resulted in the following findings. The distribution of cluster sizes is scaling invariant. Brain and testes are among the tissues with the greatest number of different expressed genes, whereas tissues with more specialized function, such as developing liver, have fewer expressed genes. There are at least 65 high confidence housekeeping gene candidates and 876 cDNA library-specific gene candidates. We identified differential expression of genes between different tissues, in particular brain/spinal cord, and found patterns of correlation between genes that share expression in pairs of libraries. Finally, there was remarkable agreement in expression between specialized tissues according to Gene Ontology categories. Conclusion This EST collection, the largest to date in pig, represents an essential resource for annotation, comparative genomics, assembly of the pig genome sequence, and further porcine transcription studies. PMID:17407547
Deep sequencing reveals cell-type-specific patterns of single-cell transcriptome variation.
Dueck, Hannah; Khaladkar, Mugdha; Kim, Tae Kyung; Spaethling, Jennifer M; Francis, Chantal; Suresh, Sangita; Fisher, Stephen A; Seale, Patrick; Beck, Sheryl G; Bartfai, Tamas; Kuhn, Bernhard; Eberwine, James; Kim, Junhyong
2015-06-09
Differentiation of metazoan cells requires execution of different gene expression programs, but recent single-cell transcriptome profiling has revealed considerable variation within cells of seemingly identical phenotype. This brings into question the relationship between transcriptome states and cell phenotypes. Additionally, single-cell transcriptomics presents unique analysis challenges that need to be addressed to answer this question. We present high quality deep read-depth single-cell RNA sequencing for 91 cells from five mouse tissues and 18 cells from two rat tissues, along with 30 control samples of bulk RNA diluted to single-cell levels. We find that transcriptomes differ globally across tissues with regard to the number of genes expressed, the average expression patterns, and within-cell-type variation patterns. We develop methods to filter genes for reliable quantification and to calibrate biological variation. All cell types include genes with high variability in expression, in a tissue-specific manner. We also find evidence that single-cell variability of neuronal genes in mice is correlated with that in rats, consistent with the hypothesis that levels of variation may be conserved. Single-cell RNA-sequencing data provide a unique view of transcriptome function; however, careful analysis is required in order to use single-cell RNA-sequencing measurements for this purpose. Technical variation must be considered in single-cell RNA-sequencing studies of expression variation. For a subset of genes, biological variability within each cell type appears to be regulated in order to perform dynamic functions, rather than being solely molecular noise.
The tripartite leader sequence is required for ectopic expression of HAdV-B and HAdV-E E3 CR1 genes.
Bair, Camden R; Kotha Lakshmi Narayan, Poornima; Kajon, Adriana E
2017-05-01
The unique repertoire of genes that characterizes the early region 3 (E3) of the different species of human adenovirus (HAdV) likely contributes to their distinct pathogenic traits. The function of many E3 CR1 proteins remains unknown possibly due to unidentified intrinsic properties that make them difficult to express ectopically. This study shows that the species HAdV-B- and HAdV-E-specific E3 CR1 genes can be expressed from vectors carrying the HAdV tripartite leader (TPL) sequence but not from traditional mammalian expression vectors. Insertion of the TPL sequence upstream of the HAdV-B and HAdV-E E3 CR1 open reading frames was sufficient to rescue protein expression from pCI-neo constructs in transfected 293T cells. The detection of higher levels of HAdV-B and HAdV-E E3 CR1 transcripts suggests that the TPL sequence may enhance gene expression at both the transcriptional and translational levels. Our findings will facilitate the characterization of additional AdV E3 proteins. Copyright © 2017 Elsevier Inc. All rights reserved.
Transcriptome response to elevated CO2, water deficit, and thermal stress in peanut
USDA-ARS's Scientific Manuscript database
Previously, our laboratories have performed gene expression studies using EST sequencing and spotted microarrays to investigate tissue-specific gene expression and response to abiotic stress. While these studies have provided valuable insight into these processes, they are constrained by sequencer t...
Shu, Xianghua; Liu, Yonggang; Yang, Liangyu; Song, Chunlian; Hou, Jiafa
2008-01-01
The complete coding sequences of 3 porcine genes - ASPA, NAGA, and HEXA - were amplified by the reverse transcriptase polymerase chain reaction (RT-PCR) based on the conserved sequence information of the mouse or other mammals and referenced pig ESTs. These 3 novel porcine genes were then deposited in the NCBI database and assigned GeneIDs: 100142661, 100142664 and 100142667. The phylogenetic tree analysis revealed that the porcine ASPA, NAGA, and HEXA all have closer genetic relationships with the ASPA, NAGA, and HEXA of cattle. Tissue expression profile analysis was also carried out and results revealed that swine ASPA, NAGA, and HEXA genes were differentially expressed in various organs, including skeletal muscle, the heart, liver, fat, kidney, lung, and small and large intestines. Our experiment is the first one to establish the foundation for further research on these 3 swine genes.
2009-01-01
Background Stripe rust, caused by Puccinia striiformis f. sp. tritici (Pst), is one of the most destructive diseases of wheat (Triticum aestivum L.) worldwide. In spite of its agricultural importance, the genomics and genetics of the pathogen are poorly characterized. Pst transcripts from urediniospores and germinated urediniospores have been examined previously, but little is known about genes expressed during host infection. Some genes involved in virulence in other rust fungi have been found to be specifically expressed in haustoria. Therefore, the objective of this study was to generate a cDNA library to characterize genes expressed in haustoria of Pst. Results A total of 5,126 EST sequences of high quality were generated from haustoria of Pst, from which 287 contigs and 847 singletons were derived. Approximately 10% and 26% of the 1,134 unique sequences were homologous to proteins with known functions and hypothetical proteins, respectively. The remaining 64% of the unique sequences had no significant similarities in GenBank. Fifteen genes were predicted to be proteins secreted from Pst haustoria. Analysis of ten genes, including six secreted protein genes, using quantitative RT-PCR revealed changes in transcript levels in different developmental and infection stages of the pathogen. Conclusions The haustorial cDNA library was useful in identifying genes of the stripe rust fungus expressed during the infection process. From the library, we identified 15 genes encoding putative secreted proteins and six genes induced during the infection process. These genes are candidates for further studies to determine their functions in wheat-Pst interactions. PMID:20028560
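The counts reported in the abstract above can be cross-checked with a few lines of arithmetic. This is only a sketch: the variable names are illustrative, the redundancy figure is a derived quantity that the abstract does not state, and the per-category counts are approximate because the percentages are quoted as approximate.

```python
# Figures quoted in the abstract above
total_ests = 5126
contigs = 287
singletons = 847
reported_unique = 1134
shares_percent = {
    "known function": 10,
    "hypothetical protein": 26,
    "no significant similarity": 64,
}

# Unique sequences are the contigs plus the ESTs that did not assemble
unique_sequences = contigs + singletons

# Illustrative derived value (not in the abstract): fraction of ESTs
# that collapsed into a sequence already represented in the library
redundancy = 1 - unique_sequences / total_ests

# Approximate number of unique sequences in each annotation category
approx_counts = {
    label: round(unique_sequences * pct / 100)
    for label, pct in shares_percent.items()
}

print(unique_sequences == reported_unique)
print(sum(shares_percent.values()))
print(f"redundancy: {redundancy:.1%}")
print(approx_counts)
```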
Factors affecting expression of the recF gene of Escherichia coli K-12.
Sandler, S J; Clark, A J
1990-01-31
This report describes four factors which affect expression of the recF gene from strong upstream lambda promoters under temperature-sensitive cIAt2-encoded repressor control. The first factor was the long mRNA leader sequence consisting of the Escherichia coli dnaN gene and 95% of the dnaA gene and lambda bet, N (double amber) and 40% of the exo gene. When most of this DNA was deleted, RecF became detectable in maxicells. The second factor was the vector, pBEU28, a runaway replication plasmid. When we substituted pUC118 for pBEU28, RecF became detectable in whole cells by the Coomassie blue staining technique. The third factor was the efficiency of initiation of translation. We used site-directed mutagenesis to change the mRNA leader, ribosome-binding site and the 3 bp before and after the translational start codon. Monitoring the effect of these mutational changes by translational fusion to lacZ, we discovered that the efficiency of initiation of translation was increased 30-fold. Only an estimated two- or threefold increase in accumulated levels of RecF occurred, however. This led us to discover the fourth factor, namely sequences in the recF gene itself. These sequences reduce expression of the recF-lacZ fusion genes 100-fold. The sequences responsible for this decrease in expression occur in four regions in the N-terminal half of recF. Expression is reduced by some sequences at the transcriptional level and by others at the translational level.
Inferring gene expression from ribosomal promoter sequences, a crowdsourcing approach
Meyer, Pablo; Siwo, Geoffrey; Zeevi, Danny; Sharon, Eilon; Norel, Raquel; Segal, Eran; Stolovitzky, Gustavo; Siwo, Geoffrey; Rider, Andrew K.; Tan, Asako; Pinapati, Richard S.; Emrich, Scott; Chawla, Nitesh; Ferdig, Michael T.; Tung, Yi-An; Chen, Yong-Syuan; Chen, Mei-Ju May; Chen, Chien-Yu; Knight, Jason M.; Sahraeian, Sayed Mohammad Ebrahim; Esfahani, Mohammad Shahrokh; Dreos, Rene; Bucher, Philipp; Maier, Ezekiel; Saeys, Yvan; Szczurek, Ewa; Myšičková, Alena; Vingron, Martin; Klein, Holger; Kiełbasa, Szymon M.; Knisley, Jeff; Bonnell, Jeff; Knisley, Debra; Kursa, Miron B.; Rudnicki, Witold R.; Bhattacharjee, Madhuchhanda; Sillanpää, Mikko J.; Yeung, James; Meysman, Pieter; Rodríguez, Aminael Sánchez; Engelen, Kristof; Marchal, Kathleen; Huang, Yezhou; Mordelet, Fantine; Hartemink, Alexander; Pinello, Luca; Yuan, Guo-Cheng
2013-01-01
The Gene Promoter Expression Prediction challenge consisted of predicting gene expression from promoter sequences in a previously unknown experimentally generated data set. The challenge was presented to the community in the framework of the sixth Dialogue for Reverse Engineering Assessments and Methods (DREAM6), a community effort to evaluate the status of systems biology modeling methodologies. Nucleotide-specific promoter activity was obtained by measuring fluorescence from promoter sequences fused upstream of a gene for yellow fluorescence protein and inserted in the same genomic site of yeast Saccharomyces cerevisiae. Twenty-one teams submitted results predicting the expression levels of 53 different promoters from yeast ribosomal protein genes. Analysis of participant predictions shows that accurate values for low-expressed and mutated promoters were difficult to obtain, although in the latter case, only when the mutation induced a large change in promoter activity compared to the wild-type sequence. As in previous DREAM challenges, we found that aggregation of participant predictions provided robust results, but did not fare better than the three best algorithms. Finally, this study not only provides a benchmark for the assessment of methods predicting activity of a specific set of promoters from their sequence, but it also shows that the top performing algorithm, which used machine-learning approaches, can be improved by the addition of biological features such as transcription factor binding sites. PMID:23950146
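The aggregation of participant predictions mentioned in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the team names and numbers are invented toy values, the aggregate is a plain per-promoter mean, and Pearson correlation stands in for whatever scoring the challenge actually used.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def aggregate(predictions):
    """Average the teams' predictions promoter by promoter."""
    teams = list(predictions.values())
    return [sum(values) / len(values) for values in zip(*teams)]

# Hypothetical toy data: measured activity of four promoters and
# three invented teams' predictions for them
measured = [1.0, 2.0, 3.0, 4.0]
predictions = {
    "team_a": [1.2, 1.8, 3.3, 3.9],
    "team_b": [0.6, 2.4, 2.7, 4.5],
    "team_c": [1.5, 2.1, 2.4, 4.2],
}

consensus = aggregate(predictions)
scores = {team: pearson(values, measured) for team, values in predictions.items()}
consensus_score = pearson(consensus, measured)
```

Comparing `consensus_score` with the entries of `scores` mirrors the comparison the abstract describes between the aggregate and the individual algorithms.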
Specht, Alicia T; Li, Jun
2017-03-01
To construct gene co-expression networks based on single-cell RNA-Sequencing data, we present an algorithm called LEAP, which utilizes the estimated pseudotime of the cells to find gene co-expression that involves time delay. The R package LEAP is available on CRAN. Contact: jun.li@nd.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
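The idea of co-expression with a time delay, as described in the abstract above, can be sketched as a search for the lag that maximizes correlation between two genes once the cells are ordered by pseudotime. This is not the LEAP implementation or its API; the function names, the use of Pearson correlation, and the toy data are all assumptions made for illustration.

```python
import math

def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def max_lagged_correlation(x, y, max_lag):
    """Correlate x[t] with y[t + lag] for lag = 0..max_lag.

    Both lists must already be ordered by pseudotime.
    Returns (correlation, lag) for the lag with the largest |correlation|.
    """
    best_corr, best_lag = 0.0, 0
    for lag in range(max_lag + 1):
        r = pearson(x[: len(x) - lag], y[lag:])
        if abs(r) > abs(best_corr):
            best_corr, best_lag = r, lag
    return best_corr, best_lag

# Toy data: gene_b repeats gene_a three pseudotime steps later
gene_a = [math.sin(i / 5.0) for i in range(50)]
gene_b = gene_a[-3:] + gene_a[:-3]

corr, lag = max_lagged_correlation(gene_a, gene_b, max_lag=10)
print(lag, round(corr, 3))
```

Scanning only non-negative lags makes the relationship directional: a high score for (gene_a, gene_b) suggests gene_a leads gene_b, and the reverse pair has to be scored separately.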
EST Express: PHP/MySQL based automated annotation of ESTs from expression libraries
Smith, Robin P; Buchser, William J; Lemmon, Marcus B; Pardinas, Jose R; Bixby, John L; Lemmon, Vance P
2008-01-01
Background Several biological techniques result in the acquisition of functional sets of cDNAs that must be sequenced and analyzed. The emergence of redundant databases such as UniGene and centralized annotation engines such as Entrez Gene has allowed the development of software that can analyze a great number of sequences in a matter of seconds. Results We have developed "EST Express", a suite of analytical tools that identify and annotate ESTs originating from specific mRNA populations. The software consists of a user-friendly GUI powered by PHP and MySQL that allows for online collaboration between researchers and continuity with UniGene, Entrez Gene and RefSeq. Two key features of the software include a novel, simplified Entrez Gene parser and tools to manage cDNA library sequencing projects. We have tested the software on a large data set (2,016 samples) produced by subtractive hybridization. Conclusion EST Express is an open-source, cross-platform web server application that imports sequences from cDNA libraries, such as those generated through subtractive hybridization or yeast two-hybrid screens. It then provides several layers of annotation based on Entrez Gene and RefSeq to allow the user to highlight useful genes and manage cDNA library projects. PMID:18402700
EST Express: PHP/MySQL based automated annotation of ESTs from expression libraries.
Smith, Robin P; Buchser, William J; Lemmon, Marcus B; Pardinas, Jose R; Bixby, John L; Lemmon, Vance P
2008-04-10
Several biological techniques result in the acquisition of functional sets of cDNAs that must be sequenced and analyzed. The emergence of redundant databases such as UniGene and centralized annotation engines such as Entrez Gene has allowed the development of software that can analyze a great number of sequences in a matter of seconds. We have developed "EST Express", a suite of analytical tools that identify and annotate ESTs originating from specific mRNA populations. The software consists of a user-friendly GUI powered by PHP and MySQL that allows for online collaboration between researchers and continuity with UniGene, Entrez Gene and RefSeq. Two key features of the software include a novel, simplified Entrez Gene parser and tools to manage cDNA library sequencing projects. We have tested the software on a large data set (2,016 samples) produced by subtractive hybridization. EST Express is an open-source, cross-platform web server application that imports sequences from cDNA libraries, such as those generated through subtractive hybridization or yeast two-hybrid screens. It then provides several layers of annotation based on Entrez Gene and RefSeq to allow the user to highlight useful genes and manage cDNA library projects.
2012-01-01
Background Some organisms can survive extreme desiccation by entering into a state of suspended animation known as anhydrobiosis. Panagrolaimus superbus is a free-living anhydrobiotic nematode that can survive rapid environmental desiccation. The mechanisms that P. superbus uses to combat the potentially lethal effects of cellular dehydration may include the constitutive and inducible expression of protective molecules, along with behavioural and/or morphological adaptations that slow the rate of cellular water loss. In addition, inducible repair and revival programmes may also be required for successful rehydration and recovery from anhydrobiosis. Results To identify constitutively expressed candidate anhydrobiotic genes we obtained 9,216 ESTs from an unstressed mixed stage population of P. superbus. We derived 4,009 unigenes from these ESTs. These unigene annotations and sequences can be accessed at http://www.nematodes.org/nembase4/species_info.php?species=PSC. We manually annotated a set of 187 constitutively expressed candidate anhydrobiotic genes from P. superbus. Notable among those is a putative lineage expansion of the lea (late embryogenesis abundant) gene family. The most abundantly expressed sequence was a member of the nematode specific sxp/ral-2 family that is highly expressed in parasitic nematodes and secreted onto the surface of the nematodes' cuticles. There were 2,059 novel unigenes (51.7% of the total), 149 of which are predicted to encode intrinsically disordered proteins lacking a fixed tertiary structure. One unigene may encode an exo-β-1,3-glucanase (GHF5 family), most similar to a sequence from Phytophthora infestans. GHF5 enzymes have been reported from several species of plant parasitic nematodes, with horizontal gene transfer (HGT) from bacteria proposed to explain their evolutionary origin. This P. superbus sequence represents another possible HGT event within the Nematoda. 
The expression of five of the 19 putative stress response genes tested was upregulated in response to desiccation. These were the antioxidants glutathione peroxidase, dj-1 and 1-Cys peroxiredoxin, an shsp sequence and an lea gene. Conclusions P. superbus appears to utilise a strategy of combined constitutive and inducible gene expression in preparation for entry into anhydrobiosis. The apparent lineage expansion of lea genes, together with their constitutive and inducible expression, suggests that LEA3 proteins are important components of the anhydrobiotic protection repertoire of P. superbus. PMID:22281184
Molecular cloning of chitinase 33 (chit33) gene from Trichoderma atroviride
Matroudi, S.; Zamani, M.R.; Motallebi, M.
2008-01-01
In this study Trichoderma atroviride was selected as over producer of chitinase enzyme among 30 different isolates of Trichoderma sp. on the basis of chitinase specific activity. From this isolate the genomic and cDNA clones encoding chit33 have been isolated and sequenced. Comparison of genomic and cDNA sequences for defining gene structure indicates that this gene contains three short introns and also an open reading frame coding for a protein of 321 amino acids. The deduced amino acid sequence includes a 19 aa putative signal peptide. Homology between this sequence and other reported Trichoderma Chit33 proteins are discussed. The coding sequence of chit33 gene was cloned in pEt26b(+) expression vector and expressed in E. coli. PMID:24031242
de Vries, G E; Arfman, N; Terpstra, P; Dijkhuizen, L
1992-01-01
The gene (mdh) coding for methanol dehydrogenase (MDH) of thermotolerant, methylotroph Bacillus methanolicus C1 has been cloned and sequenced. The deduced amino acid sequence of the mdh gene exhibited similarity to those of five other alcohol dehydrogenase (type III) enzymes, which are distinct from the long-chain zinc-containing (type I) or short-chain zinc-lacking (type II) enzymes. Highly efficient expression of the mdh gene in Escherichia coli was probably driven from its own promoter sequence. After purification of MDH from E. coli, the kinetic and biochemical properties of the enzyme were investigated. The physiological effect of MDH synthesis in E. coli and the role of conserved sequence patterns in type III alcohol dehydrogenases have been analyzed and are discussed. PMID:1644761
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nitsche, E.M.; Moquin, A.; Adams, P.S.
1996-05-03
Male sexual differentiation is a process that involves androgen action via the androgen receptor. Defects in the androgen receptor, many resulting from point mutations in the androgen receptor gene, lead to varying degrees of impaired masculinization in chromosomally male individuals. To date no specific androgen regulated morphogens involved in this process have been identified and no marker genes are known that would help to predict further virilization in infants with partial androgen insensitivity. In the present study we first show data on androgen regulated gene expression investigated by differential display reverse transcription PCR (dd RT PCR) on total RNA from human neonatal genital skin fibroblasts cultured in the presence or absence of 100 nM testosterone. Using three different primer combinations, 54 cDNAs appeared to be regulated by androgens. Most of these sequences show the characteristics of expressed mRNAs but showed no homology to sequences in the database. However, 15 clones with significant homology to previously cloned sequences were identified. Seven cDNAs appear to be induced by androgen withdrawal. Of these, five are similar to ETS (expression tagged sequences) from unknown genes; the other two show significant homology to the cDNAs of ubiquitin and human guanylate binding protein 2 (GBP-2). In addition, we have identified 8 cDNA clones which show homologies to other sequences in the database and appear to be upregulated in the presence of testosterone. Three differentially expressed sequences show significant homology to the cDNAs of L-plastin and one to the cDNA of testican. This latter gene codes for a proteoglycan involved in cell social behavior and therefore of special interest in this context. The results of this study are of interest in further investigation of normal and disturbed androgen-dependent gene expression. 49 refs., 2 figs., 5 tabs.
Application of industrial scale genomics to discovery of therapeutic targets in heart failure.
Mehraban, F; Tomlinson, J E
2001-12-01
In recent years intense activity in both academic and industrial sectors has provided a wealth of information on the human genome with an associated impressive increase in the number of novel gene sequences deposited in sequence data repositories and patent applications. This genomic industrial revolution has transformed the way in which drug target discovery is now approached. In this article we discuss how various differential gene expression (DGE) technologies are being utilized for cardiovascular disease (CVD) drug target discovery. Other approaches such as sequencing cDNA from cardiovascular derived tissues and cells coupled with bioinformatic sequence analysis are used with the aim of identifying novel gene sequences that may be exploited towards target discovery. Additional leverage from gene sequence information is obtained through identification of polymorphisms that may confer disease susceptibility and/or affect drug responsiveness. Pharmacogenomic studies are described wherein gene expression-based techniques are used to evaluate drug response and/or efficacy. Industrial-scale genomics supports and addresses not only novel target gene discovery but also the burgeoning issues in pharmaceutical and clinical cardiovascular medicine relative to polymorphic gene responses.
Ferreira de Carvalho, J; Poulain, J; Da Silva, C; Wincker, P; Michon-Coudouel, S; Dheilly, A; Naquin, D; Boutte, J; Salmon, A; Ainouche, M
2013-01-01
Spartina species have a critical ecological role in salt marshes and represent an excellent system to investigate recurrent polyploid speciation. Using the 454 GS-FLX pyrosequencer, we assembled and annotated the first reference transcriptome (from roots and leaves) for two related hexaploid Spartina species that hybridize in Western Europe, the East American invasive Spartina alterniflora and the Euro-African S. maritima. The de novo read assembly generated 38 478 consensus sequences and 99% found an annotation using Poaceae databases, representing a total of 16 753 non-redundant genes. Spartina expressed sequence tags were mapped onto the Sorghum bicolor genome, where they were distributed among the subtelomeric arms of the 10 S. bicolor chromosomes, with high gene density correlation. Normalization of the complementary DNA library improved the number of annotated genes. Ecologically relevant genes were identified among GO biological function categories in salt and heavy metal stress response, C4 photosynthesis and in lignin and cellulose metabolism. Expression of some of these genes had been found to be altered by hybridization and genome duplication in a previous microarray-based study in Spartina. As these species are hexaploid, up to three duplicated homoeologs may be expected per locus. When analyzing sequence polymorphism at four different loci in S. maritima and S. alterniflora, we found up to four haplotypes per locus, suggesting the presence of two expressed homoeologous sequences with one or two allelic variants each. This reference transcriptome will allow analysis of specific Spartina genes of ecological or evolutionary interest, estimation of homoeologous gene expression variation using RNA-seq and further gene expression evolution analyses in natural populations. PMID:23149455
Parallel human genome analysis: microarray-based expression monitoring of 1000 genes.
Schena, M; Shalon, D; Heller, R; Chai, A; Brown, P O; Davis, R W
1996-01-01
Microarrays containing 1046 human cDNAs of unknown sequence were printed on glass with high-speed robotics. These 1.0-cm² DNA "chips" were used to quantitatively monitor differential expression of the cognate human genes using a highly sensitive two-color hybridization assay. Array elements that displayed differential expression patterns under given experimental conditions were characterized by sequencing. The identification of known and novel heat shock and phorbol ester-regulated genes in human T cells demonstrates the sensitivity of the assay. Parallel gene analysis with microarrays provides a rapid and efficient method for large-scale human gene discovery. PMID:8855227
Tissue- and Time-Specific Expression of Otherwise Identical tRNA Genes
Adir, Idan; Dahan, Orna; Broday, Limor; Pilpel, Yitzhak; Rechavi, Oded
2016-01-01
Codon usage bias affects protein translation because tRNAs that recognize synonymous codons differ in their abundance. Although the current dogma states that tRNA expression is exclusively regulated by intrinsic control elements (A- and B-box sequences), we revealed, using a reporter that monitors the levels of individual tRNA genes in Caenorhabditis elegans, that eight tryptophan tRNA genes, 100% identical in sequence, are expressed in different tissues and change their expression dynamically. Furthermore, the expression levels of the sup-7 tRNA gene at day 6 were found to predict the animal’s lifespan. We discovered that the expression of tRNAs that reside within introns of protein-coding genes is affected by the host gene’s promoter. Pairing between specific Pol II genes and the tRNAs that are contained in their introns is most likely adaptive, since a genome-wide analysis revealed that the presence of specific intronic tRNAs within specific orthologous genes is conserved across Caenorhabditis species. PMID:27560950
Analyses of Expressed Sequence Tags from Apple
Newcomb, Richard D.; Crowhurst, Ross N.; Gleave, Andrew P.; Rikkerink, Erik H.A.; Allan, Andrew C.; Beuning, Lesley L.; Bowen, Judith H.; Gera, Emma; Jamieson, Kim R.; Janssen, Bart J.; Laing, William A.; McArtney, Steve; Nain, Bhawana; Ross, Gavin S.; Snowden, Kimberley C.; Souleyre, Edwige J.F.; Walton, Eric F.; Yauk, Yar-Khing
2006-01-01
The domestic apple (Malus domestica; also known as Malus pumila Mill.) has become a model fruit crop in which to study commercial traits such as disease and pest resistance, grafting, and flavor and health compound biosynthesis. To speed the discovery of genes involved in these traits, develop markers to map genes, and breed new cultivars, we have produced a substantial expressed sequence tag collection from various tissues of apple, focusing on fruit tissues of the cultivar Royal Gala. Over 150,000 expressed sequence tags have been collected from 43 different cDNA libraries representing 34 different tissues and treatments. Clustering of these sequences results in a set of 42,938 nonredundant sequences comprising 17,460 tentative contigs and 25,478 singletons, together representing what we predict are approximately one-half the expressed genes from apple. Many potential molecular markers are abundant in the apple transcripts. Dinucleotide repeats are found in 4,018 nonredundant sequences, mainly in the 5′-untranslated region of the gene, with a bias toward one repeat type (containing AG, 88%) and against another (repeats containing CG, 0.1%). Trinucleotide repeats are most common in the predicted coding regions and do not show a similar degree of sequence bias in their representation. Bi-allelic single-nucleotide polymorphisms are highly abundant with one found, on average, every 706 bp of transcribed DNA. Predictions of the numbers of representatives from protein families indicate the presence of many genes involved in disease resistance and the biosynthesis of flavor and health-associated compounds. Comparisons of some of these gene families with Arabidopsis (Arabidopsis thaliana) suggest instances where there have been duplications in the lineages leading to apple of biosynthetic and regulatory genes that are expressed in fruit. This resource paves the way for a concerted functional genomics effort in this important temperate fruit crop. PMID:16531485
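The dinucleotide-repeat screen described in the abstract above can be approximated with a regular expression. This is a hedged sketch, not the authors' pipeline: the function name and the minimum of six repeat units are assumptions (the abstract gives no threshold), and the example sequence is invented.

```python
import re

def find_dinucleotide_repeats(seq, min_repeats=6):
    """Return (start, unit, n_units) for each tandem dinucleotide repeat.

    The pattern captures a 2-nt unit and then requires at least
    min_repeats - 1 further copies of the same unit directly after it.
    """
    pattern = re.compile(r"([ACGT]{2})\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if unit[0] == unit[1]:
            continue  # a run such as AAAAAA is a homopolymer, not a dinucleotide repeat
        hits.append((match.start(), unit, len(match.group(0)) // 2))
    return hits

# Invented example: eight AG units flanked by non-repetitive sequence
example = "TTGC" + "AG" * 8 + "CCTA"
print(find_dinucleotide_repeats(example))
```

Counting hits by unit across a transcript set, after merging equivalent units such as AG and GA, would give a repeat-type breakdown comparable to the AG and CG shares quoted in the abstract.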
Bacillus anthracis genome organization in light of whole transcriptome sequencing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Martin, Jeffrey; Zhu, Wenhan; Passalacqua, Karla D.
2010-03-22
Emerging knowledge of whole prokaryotic transcriptomes could validate a number of theoretical concepts introduced in the early days of genomics. What are the rules connecting gene expression levels with sequence determinants such as quantitative scores of promoters and terminators? Are translation efficiency measures, e.g. codon adaptation index and RBS score related to gene expression? We used the whole transcriptome shotgun sequencing of a bacterial pathogen Bacillus anthracis to assess correlation of gene expression level with promoter, terminator and RBS scores, codon adaptation index, as well as with a new measure of gene translational efficiency, average translation speed. We compared computational predictions of operon topologies with the transcript borders inferred from RNA-Seq reads. Transcriptome mapping may also improve existing gene annotation. Upon assessment of accuracy of current annotation of protein-coding genes in the B. anthracis genome we have shown that the transcriptome data indicate existence of more than a hundred genes missing in the annotation though predicted by an ab initio gene finder. Interestingly, we observed that many pseudogenes possess not only a sequence with detectable coding potential but also promoters that maintain transcriptional activity.
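One of the measures named in this abstract, the codon adaptation index (CAI), can be sketched as follows. This is the textbook definition (geometric mean of each codon's frequency relative to the most used synonymous codon in a reference gene set), not necessarily the exact scoring used in the study. The reference sequence, function names, and the choice to skip codons absent from the reference are assumptions for illustration. The study's "average translation speed" measure is not defined in the abstract and is not attempted here.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(seq):
    """Split a coding sequence into complete codons, dropping any trailing bases."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_genes):
    """w(codon) = count / count of the most used synonymous codon in the reference."""
    counts = Counter(c for g in reference_genes for c in codons(g) if c in CODON_TABLE)
    best = Counter()
    for codon, n in counts.items():
        aa = CODON_TABLE[codon]
        best[aa] = max(best[aa], n)
    return {codon: n / best[CODON_TABLE[codon]] for codon, n in counts.items()}


def cai(gene, w):
    """Geometric mean of w over the gene's codons.

    Met, Trp and stop codons are excluded (no synonymous choice).
    Assumption: codons never seen in the reference set are skipped.
    """
    logs = [math.log(w[c]) for c in codons(gene)
            if c in w and CODON_TABLE[c] not in "MW*"]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")


# Toy reference: Ala uses GCT three times and GCC once; Lys uses AAA twice and AAG once.
w = relative_adaptiveness(["GCTGCTGCTGCCAAAAAAAAG"])
cai_preferred = cai("GCTAAA", w)  # only the most used codons
cai_rare = cai("GCCAAG", w)       # only the less used codons
```

Per-gene CAI values computed this way could then be compared with RNA-Seq expression levels using a rank correlation, which is the kind of comparison the abstract describes.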
Pang, Chaoyou; Fan, Shuli; Song, Meizhen; Yu, Shuxun
2013-01-01
Background Cotton (Gossypium hirsutum L.) is one of the world’s most economically-important crops. However, its entire genome has not been sequenced, and limited resources are available in GenBank for understanding the molecular mechanisms underlying leaf development and senescence. Methodology/Principal Findings In this study, 9,874 high-quality ESTs were generated from a normalized, full-length cDNA library derived from pooled RNA isolated from throughout leaf development during the plant blooming stage. After clustering and assembly of these ESTs, 5,191 unique sequences, representing 1,652 contigs and 3,539 singletons, were obtained. The average unique sequence length was 682 bp. Annotation of these unique sequences revealed that 84.4% showed significant homology to sequences in the NCBI non-redundant protein database, and 57.3% had significant hits to known proteins in the Swiss-Prot database. Comparative analysis indicated that our library added 2,400 ESTs and 991 unique sequences to those known for cotton. The unigenes were functionally characterized by gene ontology annotation. We identified 1,339 and 200 unigenes as potential leaf senescence-related genes and transcription factors, respectively. Moreover, nine genes related to leaf senescence and eleven MYB transcription factors were randomly selected for quantitative real-time PCR (qRT-PCR), which revealed that these genes were regulated differentially during senescence. The qRT-PCR for three GhYLSs revealed that these genes are expressed preferentially in senescent leaves. Conclusions/Significance These EST resources will provide valuable sequence information for gene expression profiling analyses and functional genomics studies to elucidate their roles, as well as for studying the mechanisms of leaf development and senescence in cotton and discovering candidate genes related to important agronomic traits of cotton.
These data will also facilitate future whole-genome sequence assembly and annotation in G. hirsutum and comparative genomics among Gossypium species. PMID:24146870
Ewulonu, U K; Snyder, L; Silver, L M; Schimenti, J C
1996-03-01
Transgenic mice were generated to localize essential promoter elements in the mouse testis-expressed Tcp-10 genes. These genes are expressed exclusively in male germ cells, and exhibit a diffuse range of transcriptional start sites, possibly due to the absence of a TATA box. A series of transgene constructs containing different amounts of 5' flanking DNA revealed that all sequences necessary for appropriate temporal and tissue-specific transcription of Tcp-10 reside between positions -1 and -973. All transgenic animals containing these sequences expressed a chimeric transgene at high levels, in a pattern that paralleled the endogenous genes. These experiments further defined a 227-bp fragment from -746 to -973 that was absolutely essential for expression. In a gel-shift assay, this 227-bp fragment bound nuclear protein from testis, but not other tissues, to yield two retarded bands. Sequence analysis of this fragment revealed a half-site for the AP-2 transcription factor recognition sequence. Gel-shift assays using native or mutant oligonucleotides demonstrated that the putative AP-2 recognition sequence was essential for generating the retarded bands. Since the binding activity is testis-specific, but AP-2 expression is not exclusive to male germ cells, it is possible that transcription of Tcp-10 requires interaction between AP-2 and a germ cell-specific transcription factor.
Matsunaga, Hiroko; Goto, Mari; Arikawa, Koji; Shirai, Masataka; Tsunoda, Hiroyuki; Huang, Huan; Kambara, Hideki
2015-02-15
Analyses of gene expressions in single cells are important for understanding detailed biological phenomena. Here, a highly sensitive and accurate method by sequencing (called "bead-seq") to obtain a whole gene expression profile for a single cell is proposed. A key feature of the method is to use a complementary DNA (cDNA) library on magnetic beads, which enables adding washing steps to remove residual reagents in a sample preparation process. By adding the washing steps, the next steps can be carried out under the optimal conditions without losing cDNAs. Error sources were carefully evaluated to conclude that the first several steps were the key steps. It is demonstrated that bead-seq is superior to the conventional methods for single-cell gene expression analyses in terms of reproducibility, quantitative accuracy, and biases caused during sample preparation and sequencing processes. Copyright © 2014 Elsevier Inc. All rights reserved.
Jonas, V; Lin, C R; Kawashima, E; Semon, D; Swanson, L W; Mermod, J J; Evans, R M; Rosenfeld, M G
1985-01-01
Two mRNAs generated as a consequence of alternative RNA processing events in expression of the human calcitonin gene encode the protein precursors of either calcitonin or calcitonin gene-related peptide (CGRP). Both calcitonin and CGRP RNAs and their encoded peptide products are expressed in the human pituitary and in medullary thyroid tumors. On the basis of sequence comparison, it is suggested that both the calcitonin and CGRP exons arose from a common primordial sequence, suggesting that duplication and rearrangement events are responsible for the generation of this complex transcription unit. Images PMID:3872459
Huberman, Eliezer [Chicago, IL; Baccam, Mekhine J [Woodridge, IL
2007-02-27
The present invention relates to a nucleic acid sequence and its corresponding protein sequence useful as a dominant selectable marker in eukaryotes. More specifically, the invention relates to a nucleic acid encoding a bacterial IMPDH gene that has been engineered into eukaryotic expression vectors, thereby permitting bacterial IMPDH expression in mammalian cells. Bacterial IMPDH expression confers resistance to MPA, which can be used as a dominant selectable marker in eukaryotes including mammals. The invention also relates to expression vectors and cells that express the bacterial IMPDH gene as well as gene therapies and protein synthesis.
Hamaguchi-Hamada, Kayoko; Kurumata-Shigeto, Mami; Minobe, Sumiko; Fukuoka, Nozomi; Sato, Manami; Matsufuji, Miyuki; Koizumi, Osamu; Hamada, Shun
2016-01-01
The head region of Hydra, the hypostome, is a key body part for developmental control and the nervous system. We herein examined genes specifically expressed in the head region of Hydra oligactis using suppression subtractive hybridization (SSH) cloning. A total of 1414 subtracted clones were sequenced and found to be derived from at least 540 different genes by BLASTN analyses. Approximately 25% of the subtracted clones had sequences encoding thrombospondin type-1 repeat (TSR) domains, and were derived from 17 genes. We identified 11 TSR domain-containing genes among the top 36 genes that were the most frequently detected in our SSH library. Whole-mount in situ hybridization analyses confirmed that at least 13 out of 17 TSR domain-containing genes were expressed in the hypostome of Hydra oligactis. The prominent expression of TSR domain-containing genes suggests that these genes play significant roles in the hypostome of Hydra oligactis.
Sexual selection drives evolution and rapid turnover of male gene expression.
Harrison, Peter W; Wright, Alison E; Zimmer, Fabian; Dean, Rebecca; Montgomery, Stephen H; Pointer, Marie A; Mank, Judith E
2015-04-07
The profound and pervasive differences in gene expression observed between males and females, and the unique evolutionary properties of these genes in many species, have led to the widespread assumption that they are the product of sexual selection and sexual conflict. However, we still lack a clear understanding of the connection between sexual selection and transcriptional dimorphism, often termed sex-biased gene expression. Moreover, the relative contribution of sexual selection vs. drift in shaping broad patterns of expression, divergence, and polymorphism remains unknown. To assess the role of sexual selection in shaping these patterns, we assembled transcriptomes from an avian clade representing the full range of sexual dimorphism and sexual selection. We use these species to test the links between sexual selection and sex-biased gene expression evolution in a comparative framework. Through ancestral reconstruction of sex bias, we demonstrate a rapid turnover of sex bias across this clade driven by sexual selection and show it to be primarily the result of expression changes in males. We use phylogenetically controlled comparative methods to demonstrate that phenotypic measures of sexual selection predict the proportion of male-biased but not female-biased gene expression. Although male-biased genes show elevated rates of coding sequence evolution, consistent with previous reports in a range of taxa, there is no association between sexual selection and rates of coding sequence evolution, suggesting that expression changes may be more important than coding sequence in sexual selection. Taken together, our results highlight the power of sexual selection to act on gene expression differences and shape genome evolution.
Age-related regulation of genes: slow homeostatic changes and age-dimension technology
NASA Astrophysics Data System (ADS)
Kurachi, Kotoku; Zhang, Kezhong; Huo, Jeffrey; Ameri, Afshin; Kuwahara, Mitsuhiro; Fontaine, Jean-Marc; Yamamoto, Kei; Kurachi, Sumiko
2002-11-01
Through systematic studies of pro- and anti-blood coagulation factors, we have determined molecular mechanisms involving two genetic elements: the age-related stability element (ASE), GAGGAAG, and the age-related increase element (AIE), a unique stretch of dinucleotide repeats. ASE and AIE are essential for age-related patterns of stable and increased gene expression, respectively. Such age-related gene regulatory mechanisms are also critical for explaining homeostasis in various physiological reactions as well as slow homeostatic changes in them. The age-related increase in expression of the human factor IX (hFIX) gene requires the presence of both ASE and AIE, which apparently function additively. The anti-coagulant factor protein C (hPC) gene uses an ASE (CAGGAG) to produce age-related stable expression. Both ASE sequences (G/CAGAAG) share the consensus sequence of the transcription factor PEA-3 element. No other similar sequences, including another PEA-3 consensus sequence, GAGGATG, function in conferring age-related gene regulation. The age-regulatory mechanisms involving ASE and AIE apparently function universally with different genes and across different animal species. These findings have led us to develop a new field of research and applications, which we named “age-dimension technology (ADT)”. ADT has exciting potential for modifying age-related expression of genes as well as associated physiological processes, and developing novel, more effective prophylaxis or treatments for age-related diseases.
Molecular cloning and expression analysis of annexin A2 gene in sika deer antler tip.
Xia, Yanling; Qu, Haomiao; Lu, Binshan; Zhang, Qiang; Li, Heping
2018-04-01
Molecular cloning and bioinformatics analysis of the annexin A2 (ANXA2) gene in sika deer antler tip were conducted, and the role of the ANXA2 gene in the growth and development of the antler was given an initial analysis. Reverse transcriptase polymerase chain reaction (RT-PCR) was used to clone the cDNA sequence of the ANXA2 gene from the antler tip of sika deer (Cervus nippon hortulorum), and bioinformatics methods were applied to analyze the amino acid sequence of the Anxa2 protein. The mRNA expression levels of the ANXA2 gene in different growth stages were examined by real time reverse transcriptase polymerase chain reaction (real time RT-PCR). The nucleotide sequence analysis revealed an open reading frame of 1,020 bp encoding a 339-amino-acid protein with a calculated molecular weight of 38.6 kDa and an isoelectric point of 6.09. Homologous sequence alignment and phylogenetic analysis indicated that the Anxa2 mature protein of sika deer had the closest genetic distance with Cervus elaphus and Bos mutus. Real time RT-PCR results showed that the gene had differential expression levels in different growth stages, and the expression level of the ANXA2 gene was the highest at metaphase (rapid growing period). The ANXA2 gene may promote cell proliferation, and the findings suggest Anxa2 as an important candidate for regulating the growth and development of deer antler.
Pingault, Lise; Choulet, Frédéric; Alberti, Adriana; Glover, Natasha; Wincker, Patrick; Feuillet, Catherine; Paux, Etienne
2015-02-10
Because of its size, allohexaploid nature, and high repeat content, the bread wheat genome is a good model to study the impact of the genome structure on gene organization, function, and regulation. However, because of the lack of a reference genome sequence, such studies have long been hampered and our knowledge of the wheat gene space is still limited. The access to the reference sequence of the wheat chromosome 3B provided us with an opportunity to study the wheat transcriptome and its relationships to genome and gene structure at a level that has never been reached before. By combining this sequence with RNA-seq data, we construct a fine transcriptome map of the chromosome 3B. More than 8,800 transcription sites are identified, that are distributed throughout the entire chromosome. Expression level, expression breadth, alternative splicing as well as several structural features of genes, including transcript length, number of exons, and cumulative intron length are investigated. Our analysis reveals a non-monotonic relationship between gene expression and structure and leads to the hypothesis that gene structure is determined by its function, whereas gene expression is subject to energetic cost. Moreover, we observe a recombination-based partitioning at the gene structure and function level. Our analysis provides new insights into the relationships between gene and genome structure and function. It reveals mechanisms conserved with other plant species as well as superimposed evolutionary forces that shaped the wheat gene space, likely participating in wheat adaptation.
Yang, J; Yamamoto, M; Ishibashi, J; Taniai, K; Yamakawa, M
1998-08-01
An antibacterial protein, designated rhinocerosin, was purified to homogeneity from larvae of the coconut rhinoceros beetle, Oryctes rhinoceros, immunized with Escherichia coli. Based on the amino acid sequence of the N-terminal region, a degenerate primer was synthesized and reverse-transcriptase PCR was performed to clone rhinocerosin cDNA. As a result, a 279-bp fragment was obtained. The complete nucleotide sequence was determined by sequencing the extended rhinocerosin cDNA clone by 5' rapid amplification of cDNA ends. The deduced amino acid sequence of the mature portion of rhinocerosin was composed of 72 amino acids without cysteine residues and was shown to be rich in glycine (11.1%) and proline (11.1%) residues. Comparison of the deduced amino acid sequence of rhinocerosin with those of other antibacterial proteins indicated that it has 77.8% and 44.6% identity with holotricin 2 and coleoptrecin, respectively. Rhinocerosin had strong antibacterial activity against E. coli, Streptococcus pyogenes, and Staphylococcus aureus but not against Pseudomonas aeruginosa. Results of reverse-transcriptase PCR analysis of gene expression in different tissues indicated that the rhinocerosin gene is strongly expressed in the fat body and the Malpighian tubule, and weakly expressed in hemocytes and midgut. In addition, gene expression was inducible by bacteria in the fat body, the Malpighian tubule and hemocytes, but constitutive expression was observed in the midgut.
Senthilkumar, Palanisamy; Thirugnanasambantham, Krishnaraj; Mandal, Abul Kalam Azad
2012-12-01
Tea (Camellia sinensis (L.) O. Kuntze) is an economically important plant cultivated for its leaves. Infection of Pestalotiopsis theae in leaves causes gray blight disease and enormous loss to the tea industry. We used suppressive subtractive hybridization (SSH) technique to unravel the differential gene expression pattern during gray blight disease development in tea. Complementary DNA from P. theae-infected and uninfected leaves of disease tolerant cultivar UPASI-10 was used as tester and driver populations respectively. Subtraction efficiency was confirmed by comparing abundance of β-actin gene. A total of 377 and 720 clones with insert size >250 bp from forward and reverse library respectively were sequenced and analyzed. Basic Local Alignment Search Tool analysis revealed 17 sequences in forward SSH library have high degree of similarity with disease and hypersensitive response related genes and 20 sequences with hypothetical proteins while in reverse SSH library, 23 sequences have high degree of similarity with disease and stress response-related genes and 15 sequences with hypothetical proteins. Functional analysis indicated unknown (61 and 59 %) or hypothetical functions (23 and 18 %) for most of the differentially regulated genes in forward and reverse SSH library, respectively, while others have important role in different cellular activities. Majority of the upregulated genes are related to hypersensitive response and reactive oxygen species production. Based on these expressed sequence tag data, putative role of differentially expressed genes were discussed in relation to disease. We also demonstrated the efficiency of SSH as a tool in enriching gray blight disease related up- and downregulated genes in tea. The present study revealed that many genes related to disease resistance were suppressed during P. theae infection and enhancing these genes by the application of inducers may impart better disease tolerance to the plants.
Fujibuchi, Wataru; Anderson, John S. J.; Landsman, David
2001-01-01
Consensus pattern and matrix-based searches designed to predict cis-acting transcriptional regulatory sequences have historically been subject to large numbers of false positives. We sought to decrease false positives by incorporating expression profile data into a consensus pattern-based search method. We have systematically analyzed the expression phenotypes of over 6000 yeast genes, across 121 expression profile experiments, and correlated them with the distribution of 14 known regulatory elements over sequences upstream of the genes. Our method is based on a metric we term probabilistic element assessment (PEA), which is a ranking of potential sites based on sequence similarity in the upstream regions of genes with similar expression phenotypes. For eight of the 14 known elements that we examined, our method had a much higher selectivity than a naïve consensus pattern search. Based on our analysis, we have developed a web-based tool called PROSPECT, which allows consensus pattern-based searching of gene clusters obtained from microarray data. PMID:11574681
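The naive consensus pattern search that this abstract uses as its baseline can be sketched as below. This is only the baseline scan, not the PEA metric itself, whose formula the abstract does not give. The consensus string, gene names, and upstream sequences are hypothetical toy inputs, and the function names are illustrative. The consensus is written in IUPAC nucleotide codes and both strands are scanned.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus; the lookahead lets overlapping sites be reported."""
    body = "".join(IUPAC[b] for b in consensus.upper())
    return re.compile("(?=(" + body + "))")


def scan_upstream(upstream, consensus):
    """Return {gene: [(position, strand), ...]} with 0-based forward-strand positions."""
    pattern = consensus_to_regex(consensus)
    width = len(consensus)
    hits = {}
    for gene, seq in upstream.items():
        seq = seq.upper()
        rc = seq.translate(COMPLEMENT)[::-1]
        found = [(m.start(), "+") for m in pattern.finditer(seq)]
        # Convert reverse-strand match starts back to forward-strand coordinates.
        found += [(len(seq) - (m.start() + width), "-") for m in pattern.finditer(rc)]
        hits[gene] = sorted(found)
    return hits


# Hypothetical consensus and upstream regions, for illustration only.
toy_upstream = {
    "geneA": "AAATGACTCAGG",
    "geneB": "CCTGAATCATT",
    "geneC": "GGGGGGGGGG",
}
hits = scan_upstream(toy_upstream, "TGAYTCA")
```

A scan like this reports every match regardless of biological context, which is why it yields many false positives. The approach described in the abstract ranks such candidate sites by how similar they are among genes with similar expression phenotypes.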
The gene space in wheat: the complete γ-gliadin gene family from the wheat cultivar Chinese Spring.
Anderson, Olin D; Huo, Naxin; Gu, Yong Q
2013-06-01
The complete set of unique γ-gliadin genes is described for the wheat cultivar Chinese Spring using a combination of expressed sequence tag (EST) and Roche 454 DNA sequences. Assemblies of Chinese Spring ESTs yielded 11 different γ-gliadin gene sequences. Two of the sequences encode identical polypeptides and are assumed to be the result of a recent gene duplication. One gene has a 3' coding mutation that changes the reading frame in the final eight codons. A second assembly of Chinese Spring γ-gliadin sequences was generated using Roche 454 total genomic DNA sequences. The 454 assembly confirmed the same 11 active genes as the EST assembly plus two pseudogenes not represented by ESTs. These 13 γ-gliadin sequences represent the complete unique set of γ-gliadin genes for cv Chinese Spring, although not ruled out are additional genes that are exact duplications of these 13 genes. A comparison with the ESTs of two other hexaploid cultivars (Butte 86 and Recital) finds that the most active genes are present in all three cultivars, with exceptions likely due to too few ESTs for detection in Butte 86 and Recital. A comparison of the numbers of ESTs per gene indicates differential levels of expression within the γ-gliadin gene family. Genome assignments were made for 6 of the 13 Chinese Spring γ-gliadin genes, i.e., one assignment from a match to two γ-gliadin genes found within a tetraploid wheat A genome BAC and four genes that match four distinct γ-gliadin sequences assembled from Roche 454 sequences from Aegilops tauschii, the hexaploid wheat D-genome ancestor.
Trapnell, Cole; Roberts, Adam; Goff, Loyal; Pertea, Geo; Kim, Daehwan; Kelley, David R; Pimentel, Harold; Salzberg, Steven L; Rinn, John L; Pachter, Lior
2012-01-01
Recent advances in high-throughput cDNA sequencing (RNA-seq) can reveal new genes and splice variants and quantify expression genome-wide in a single assay. The volume and complexity of data from RNA-seq experiments necessitate scalable, fast and mathematically principled analysis software. TopHat and Cufflinks are free, open-source software tools for gene discovery and comprehensive expression analysis of high-throughput mRNA sequencing (RNA-seq) data. Together, they allow biologists to identify new genes and new splice variants of known ones, as well as compare gene and transcript expression under two or more conditions. This protocol describes in detail how to use TopHat and Cufflinks to perform such analyses. It also covers several accessory tools and utilities that aid in managing data, including CummeRbund, a tool for visualizing RNA-seq analysis results. Although the procedure assumes basic informatics skills, these tools assume little to no background with RNA-seq analysis and are meant for novices and experts alike. The protocol begins with raw sequencing reads and produces a transcriptome assembly, lists of differentially expressed and regulated genes and transcripts, and publication-quality visualizations of analysis results. The protocol's execution time depends on the volume of transcriptome sequencing data and available computing resources but takes less than 1 d of computer time for typical experiments and ~1 h of hands-on time. PMID:22383036
Xuxia, Wang; Jie, Chen; Bo, Wang; Lijun, Liu; Hui, Jiang; Diluo, Tang; Dingxiang, Peng
2012-01-01
For the purpose of screening putative anthracnose resistance-related genes of ramie (Boehmeria nivea L. Gaud), a cDNA library was constructed by suppression subtractive hybridization using anthracnose-resistant cultivar Huazhu no. 4. The cDNAs from Huazhu no. 4, which were infected with Colletotrichum gloeosporioides, were used as the tester and cDNAs from uninfected Huazhu no. 4 as the driver. Sequencing analysis and homology searching showed that these clones represented 132 single genes, which were assigned to functional categories, including 14 putative cellular functions, according to categories established for Arabidopsis. These 132 genes included 35 disease resistance and stress tolerance-related genes including putative heat-shock protein 90, metallothionein, PR-1.2 protein, catalase gene, WRKY family genes, and proteinase inhibitor-like protein. Partial disease-related genes were further analyzed by reverse transcription PCR and RNA gel blot. These expressed sequence tags are the first anthracnose resistance-related expressed sequence tags reported in ramie.
Vettore, André L.; da Silva, Felipe R.; Kemper, Edson L.; Souza, Glaucia M.; da Silva, Aline M.; Ferro, Maria Inês T.; Henrique-Silva, Flavio; Giglioti, Éder A.; Lemos, Manoel V.F.; Coutinho, Luiz L.; Nobrega, Marina P.; Carrer, Helaine; França, Suzelei C.; Bacci, Maurício; Goldman, Maria Helena S.; Gomes, Suely L.; Nunes, Luiz R.; Camargo, Luis E.A.; Siqueira, Walter J.; Van Sluys, Marie-Anne; Thiemann, Otavio H.; Kuramae, Eiko E.; Santelli, Roberto V.; Marino, Celso L.; Targon, Maria L.P.N.; Ferro, Jesus A.; Silveira, Henrique C.S.; Marini, Danyelle C.; Lemos, Eliana G.M.; Monteiro-Vitorello, Claudia B.; Tambor, José H.M.; Carraro, Dirce M.; Roberto, Patrícia G.; Martins, Vanderlei G.; Goldman, Gustavo H.; de Oliveira, Regina C.; Truffi, Daniela; Colombo, Carlos A.; Rossi, Magdalena; de Araujo, Paula G.; Sculaccio, Susana A.; Angella, Aline; Lima, Marleide M.A.; de Rosa, Vicente E.; Siviero, Fábio; Coscrato, Virginia E.; Machado, Marcos A.; Grivet, Laurent; Di Mauro, Sonia M.Z.; Nobrega, Francisco G.; Menck, Carlos F.M.; Braga, Marilia D.V.; Telles, Guilherme P.; Cara, Frank A.A.; Pedrosa, Guilherme; Meidanis, João; Arruda, Paulo
2003-01-01
To contribute to our understanding of the genome complexity of sugarcane, we undertook a large-scale expressed sequence tag (EST) program. More than 260,000 cDNA clones were partially sequenced from 26 standard cDNA libraries generated from different sugarcane tissues. After the processing of the sequences, 237,954 high-quality ESTs were identified. These ESTs were assembled into 43,141 putative transcripts. Of the assembled sequences, 35.6% presented no matches with existing sequences in public databases. A global analysis of the whole SUCEST data set indicated that 14,409 assembled sequences (33% of the total) contained at least one cDNA clone with a full-length insert. Annotation of the 43,141 assembled sequences associated almost 50% of the putative identified sugarcane genes with protein metabolism, cellular communication/signal transduction, bioenergetics, and stress responses. Inspection of the translated assembled sequences for conserved protein domains revealed 40,821 amino acid sequences with 1415 Pfam domains. Reassembling the consensus sequences of the 43,141 transcripts revealed a 22% redundancy in the first assembling. This indicated that possibly 33,620 unique genes had been identified and indicated that >90% of the sugarcane expressed genes were tagged. PMID:14613979
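The redundancy and full-length figures quoted in this abstract can be checked with a few lines of arithmetic. The variable names are illustrative; the counts are taken directly from the abstract.

```python
# Counts quoted in the abstract.
assembled_transcripts = 43_141    # putative transcripts after the first assembly
unique_after_reassembly = 33_620  # estimated unique genes after reassembly
with_full_length_clone = 14_409   # assemblies containing a full-length insert

# Fraction of first-pass assemblies that collapsed on reassembly.
redundancy = (assembled_transcripts - unique_after_reassembly) / assembled_transcripts

# Fraction of assemblies with at least one full-length cDNA clone.
full_length_fraction = with_full_length_clone / assembled_transcripts

print(f"redundancy: {redundancy:.1%}")
print(f"full-length fraction: {full_length_fraction:.1%}")
```

Both values round to the percentages reported (22% and 33%), so the estimate of 33,620 unique genes follows from removing the redundant assemblies rather than from multiplying by a rounded 78%.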
Hybrid Sequencing of Full-Length cDNA Transcripts of Stems and Leaves in Dendrobium officinale
He, Liu; Fu, Shuhua; Xu, Zhichao; Yan, Jun; Xu, Jiang; Zhou, Hong; Zhou, Jianguo; Chen, Xinlian; Li, Ying; Au, Kin Fai; Yao, Hui
2017-01-01
Dendrobium officinale is an extremely valuable orchid used in traditional Chinese medicine, so sought after that it has a higher market value than gold. Although the expression profiles of some genes involved in the polysaccharide synthesis have previously been investigated, little research has been carried out on their alternatively spliced isoforms in D. officinale. In addition, information regarding the translocation of sugars from leaves to stems in D. officinale also remains limited. We analyzed the polysaccharide content of D. officinale leaves and stems, and completed in-depth transcriptome sequencing of these two diverse tissue types using second-generation sequencing (SGS) and single-molecule real-time (SMRT) sequencing technology. The results of this study yielded a digital inventory of gene and mRNA isoform expressions. A comparative analysis of both transcriptomes uncovered a total of 1414 differentially expressed genes, including 844 that were up-regulated and 570 that were down-regulated in stems. Of these genes, one sugars will eventually be exported transporter (SWEET) and one sucrose transporter (SUT) are expressed to a greater extent in D. officinale stems than in leaves. Two glycosyltransferase (GT) and four cellulose synthase (Ces) genes undergo a distinct degree of alternative splicing. In the stems, the content of polysaccharides is twice as much as that in the leaves. The differentially expressed GT and transcription factor (TF) genes will be the focus of further study. The genes DoSWEET4 and DoSUT1 are significantly expressed in the stem, and are likely to be involved in sugar loading in the phloem. PMID:28981454
Comparative transcriptome analysis of microsclerotia development in Nomuraea rileyi.
Song, Zhangyong; Yin, Youping; Jiang, Shasha; Liu, Juanjuan; Chen, Huan; Wang, Zhongkang
2013-06-19
Nomuraea rileyi is used as an environmentally friendly biopesticide. However, mass production and commercialization of this organism are limited due to its fastidious growth and sporulation requirements. When cultured in amended medium, we found that N. rileyi could produce microsclerotia bodies, replacing conidiophores as the infectious agent. However, little is known about the genes involved in microsclerotia development. In the present study, the transcriptomes were analyzed using next-generation sequencing technology to find the genes involved in microsclerotia development. A total of 4.69 Gb of clean nucleotides comprising 32,061 sequences was obtained, and 20,919 sequences were annotated (about 65%). Among the annotated sequences, only 5928 were annotated with 34 gene ontology (GO) functional categories, and 12,778 sequences were mapped to 165 pathways by searching against the Kyoto Encyclopedia of Genes and Genomes pathway (KEGG) database. Furthermore, we assessed the transcriptomic differences between cultures grown in minimal and amended medium. In total, 4808 sequences were found to be differentially expressed; 719 differentially expressed unigenes were assigned to 25 GO classes and 1888 differentially expressed unigenes were assigned to 161 KEGG pathways, including 25 enrichment pathways. Subsequently, we examined the up-regulated or uniquely expressed genes following amended medium treatment, which were also expressed on the enrichment pathway, and found that most of them participated in mediating oxidative stress homeostasis. To elucidate the role of oxidative stress in microsclerotia development, we analyzed the diversification of unigenes using quantitative reverse transcription-PCR (RT-qPCR). Our findings suggest that oxidative stress occurs during microsclerotia development, along with a broad metabolic activity change. Our data provide the most comprehensive sequence resource available for the study of N. rileyi.
We believe that the transcriptome datasets will serve as an important public information platform to accelerate studies on N. rileyi microsclerotia.
Kosovac, D; Wild, J; Ludwig, C; Meissner, S; Bauer, A P; Wagner, R
2011-02-01
Advanced gene delivery techniques can be combined with rational gene design to further improve the efficiency of plasmid DNA (pDNA)-mediated transgene expression in vivo. Herein, we analyzed the influence of intragenic sequence modifications on transgene expression in vitro and in vivo using murine erythropoietin (mEPO) as a transgene model. A single electro-gene transfer of an RNA- and codon-optimized mEPOopt gene into skeletal muscle resulted in a 3- to 4-fold increase of mEPO production sustained for >1 year and triggered a significant increase in hematocrit and hemoglobin without causing adverse effects. mEPO expression and hematologic levels were significantly lower when using comparable amounts of the wild type (mEPOwt) gene and only marginal effects were induced by mEPOΔCpG lacking intragenic CpG dinucleotides, even at high pDNA amounts. Corresponding with these observations, in vitro analysis of transfected cells revealed a 2- to 3-fold increased (mEPOopt) and 50% decreased (mEPOΔCpG) erythropoietin expression compared with mEPOwt, respectively. RNA analyses demonstrated that the specific design of the transgene sequence influenced expression levels by modulating transcriptional activity and nuclear plus cytoplasmic RNA amounts rather than translation. In sum, whereas CpG depletion negatively interferes with efficient expression in postmitotic tissues, mEPOopt doses <0.5 μg were sufficient to trigger optimal long-term hematologic effects encouraging the use of sequence-optimized transgenes to further reduce effective pDNA amounts.
2013-01-01
Background Cymbidium sinense belongs to the Orchidaceae, which is one of the most abundant angiosperm families. C. sinense, a high-grade traditional potted flower, is most prevalent in China and some Southeast Asian countries. The control of flowering time is a major bottleneck in the industrialized development of C. sinense. Little is known about the mechanisms responsible for floral development in this orchid. Moreover, genome references for entire transcriptome sequences do not currently exist for C. sinense. Thus, transcriptome and expression profiling data for this species are needed as an important resource to identify genes and to better understand the biological mechanisms of floral development in C. sinense. Results In this study, de novo transcriptome assembly and gene expression analysis using Illumina sequencing technology were performed. Transcriptome analysis assembles gene-related information related to vegetative and reproductive growth of C. sinense. Illumina sequencing generated 54,248,006 high quality reads that were assembled into 83,580 unigenes with an average sequence length of 612 base pairs, including 13,315 clusters and 70,265 singletons. A total of 41,687 (49.88%) unique sequences were annotated, 23,092 of which were assigned to specific metabolic pathways by the Kyoto Encyclopedia of Genes and Genomes (KEGG). Gene Ontology (GO) analysis of the annotated unigenes revealed that the majority of sequenced genes were associated with metabolic and cellular processes, cell and cell parts, catalytic activity and binding. Furthermore, 120 flowering-associated unigenes, 73 MADS-box unigenes and 28 CONSTANS-LIKE (COL) unigenes were identified from our collection. In addition, three digital gene expression (DGE) libraries were constructed for the vegetative phase (VP), floral differentiation phase (FDP) and reproductive phase (RP). The specific expression of many genes in the three development phases was also identified. 
32 genes among three sub-libraries with high differential expression were selected as candidates connected with flower development. Conclusion RNA-seq and DGE profiling data provided comprehensive gene expression information at the transcriptional level that could facilitate our understanding of the molecular mechanisms of floral development at three development phases of C. sinense. This data could be used as an important resource for investigating the genetics of the flowering pathway and various biological mechanisms in this orchid. PMID:23617896
Zhang, Jianxia; Wu, Kunlin; Zeng, Songjun; Teixeira da Silva, Jaime A; Zhao, Xiaolan; Tian, Chang-En; Xia, Haoqiang; Duan, Jun
2013-04-24
Two cis elements collaborate to spatially repress transcription from a sea urchin promoter
NASA Technical Reports Server (NTRS)
Frudakis, T. N.; Wilt, F.
1995-01-01
The expression pattern of many territory-specific genes in metazoan embryos is maintained by an active process of negative spatial regulation. However, the mechanism of this strategy of gene regulation is not well understood in any system. Here we show that reporter constructs containing regulatory sequence for the SM30-alpha gene of Strongylocentrotus purpuratus are expressed in a pattern congruent with that of the endogenous SM30 gene(s), largely as a result of active transcriptional repression in cell lineages in which the gene is not normally expressed. Chloramphenicol acetyl transferase assays of deletion constructs from the 2600-bp upstream region showed that repressive elements were present in the region from -1628 to -300. In situ hybridization analysis showed that the spatial fidelity of expression was severely compromised when the region from -1628 to -300 was deleted. Two highly repetitive sequence motifs, (G/A/C)CCCCT and (T/C)(T/A/C)CTTTT(T/A/C), are present in the -1628 to -300 region. Representatives of these elements were analyzed by gel mobility shift experiments and were found to interact specifically with protein in crude nuclear extracts. When oligonucleotides containing either sequence element were co-injected with a correctly regulated reporter as potential competitors, the reporter was expressed in inappropriate cells. When composite oligonucleotides, containing both sequence elements, were fused to a misregulated reporter, the expression of the reporter in inappropriate cells was suppressed. Comparison of composite oligonucleotides with oligonucleotides containing single constituent elements shows that both sequence elements are required for effective spatial regulation. Thus, both individual elements are required, but only a composite element containing both elements is sufficient to function as a tissue-specific repressive element.
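The two degenerate motifs named in this abstract can be expressed as regular expressions. The following is a minimal sketch, not the authors' method: the function name `find_motifs` and the test sequence are hypothetical and chosen only for illustration, and only the forward strand is scanned.

```python
import re

# Degenerate motifs as written in the abstract, translated to character classes.
MOTIF_1 = re.compile(r"[GAC]CCCCT")            # (G/A/C)CCCCT
MOTIF_2 = re.compile(r"[TC][TAC]CTTTT[TAC]")   # (T/C)(T/A/C)CTTTT(T/A/C)


def find_motifs(seq):
    """Return sorted (start, motif_name, matched_text) tuples.

    Positions are 0-based and refer to the forward strand only.
    """
    seq = seq.upper()
    hits = []
    for name, pattern in (("motif1", MOTIF_1), ("motif2", MOTIF_2)):
        # Wrapping the pattern in a lookahead reports overlapping occurrences too.
        for match in re.finditer(f"(?=({pattern.pattern}))", seq):
            hits.append((match.start(), name, match.group(1)))
    return sorted(hits)


# Hypothetical sequence, made up for illustration; it is not from the SM30-alpha promoter.
example = "AAGCCCCTGGTTCTTTTAGG"
hits = find_motifs(example)
print(hits)
```

Counting such hits across the -1628 to -300 interval of a real sequence would be one way to check how repetitive the motifs are there, but the abstract gives no counts, so none are assumed here.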
Wu, Shuanghua; Lei, Jianjun; Chen, Guoju; Chen, Hancai; Cao, Bihao; Chen, Changming
2017-01-01
Chinese kale, a vegetable of the cruciferous family, is a popular crop in southern China and Southeast Asia due to its high glucosinolate content and nutritional qualities. However, there is little research on the molecular genetics and genes involved in glucosinolate metabolism and its regulation in Chinese kale. In this study, we sequenced and characterized the transcriptomes and expression profiles of genes expressed in 11 tissues of Chinese kale. A total of 216 million 150-bp clean reads were generated using RNA-sequencing technology. From the sequences, 98,180 unigenes were assembled for the whole plant, and 49,582~98,423 unigenes were assembled for each tissue. BLAST analysis indicated that a total of 80,688 (82.18%) unigenes exhibited similarity to known proteins. The functional annotation and classification tools used in this study suggested that genes principally expressed in Chinese kale were mostly involved in fundamental processes, such as cellular and molecular functions, signal transduction, and biosynthesis of secondary metabolites. The expression levels of all unigenes were analyzed in various tissues of Chinese kale. A large number of candidate genes involved in glucosinolate metabolism and its regulation were identified, and the expression patterns of these genes were analyzed. We found that most of the genes involved in glucosinolate biosynthesis were highly expressed in the root, petiole, and senescent leaves. The expression patterns of ten glucosinolate biosynthetic genes from RNA-seq were validated by quantitative RT-PCR in different tissues. These results provide an initial and global overview of Chinese kale gene functions and expression activities in different tissues. PMID:28228764
Cloning and sequencing of the alcohol dehydrogenase II gene from Zymomonas mobilis
Ingram, Lonnie O.; Conway, Tyrrell
1992-01-01
The alcohol dehydrogenase II gene from Zymomonas mobilis has been cloned and sequenced. This gene can be expressed at high levels in other organisms to produce acetaldehyde or to convert acetaldehyde to ethanol.
Characterization and expression of the calpastatin gene in Cyprinus carpio.
Chen, W X; Ma, Y
2015-07-03
Calpastatin, an important protein used to regulate meat quality traits in animals, is encoded by the CAST gene. The aim of the present study was to clone the cDNA sequence of the CAST gene and detect the expression of CAST in the tissues of Cyprinus carpio. The cDNA of the C. carpio CAST gene, amplified using rapid amplification of cDNA ends PCR, is 2834 bp in length (accession No. JX275386), contains a 2634-bp open reading frame, and encodes a protein with 877 amino acid residues. The amino acid sequence of the C. carpio CAST gene was 88, 80, and 59% identical to the sequences observed in grass carp, zebrafish, and other fish, respectively. The C. carpio CAST was observed to contain four conserved domains with 54 serine phosphorylation loci, 28 threonine phosphorylation loci, 1 tyrosine phosphorylation locus, and 6 specific protein kinase C phosphorylation loci. The CAST gene showed widespread expression in different tissues of C. carpio. Surprisingly, the relative expression of the CAST transcript in the muscle and heart tissues of C. carpio was significantly higher than in other tissues (P < 0.01).
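The reported lengths are mutually consistent if the 2634-bp open reading frame is assumed to include the stop codon. A short arithmetic sketch follows; the helper name `protein_length_from_orf` is hypothetical, and the stop-codon assumption is ours rather than stated in the abstract.

```python
def protein_length_from_orf(orf_bp, includes_stop=True):
    """Number of amino acid residues encoded by an ORF of the given length in bp."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon occupies one codon but adds no residue to the protein.
    return codons - 1 if includes_stop else codons


# Values taken from the abstract: 2834-bp cDNA, 2634-bp ORF.
residues = protein_length_from_orf(2634)   # 878 codons, one assumed to be the stop codon
non_orf_bp = 2834 - 2634                   # cDNA sequence lying outside the ORF, in bp
print(residues, non_orf_bp)
```

Under that assumption the result agrees with the 877 residues stated above, and the remaining cDNA sequence outside the ORF totals 200 bp.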
2011-01-01
Background Abiotic stresses, such as water deficit and soil salinity, result in changes in physiology, nutrient use, and vegetative growth in vines, and ultimately, yield and flavor in berries of wine grape, Vitis vinifera L. Large-scale expressed sequence tags (ESTs) were generated, curated, and analyzed to identify major genetic determinants responsible for stress-adaptive responses. Although roots serve as the first site of perception and/or injury for many types of abiotic stress, EST sequencing in root tissues of wine grape exposed to abiotic stresses has been extremely limited to date. To overcome this limitation, large-scale EST sequencing was conducted from root tissues exposed to multiple abiotic stresses. Results A total of 62,236 expressed sequence tags (ESTs) were generated from leaf, berry, and root tissues from vines subjected to abiotic stresses and compared with 32,286 ESTs sequenced from 20 public cDNA libraries. Curation to correct annotation errors, clustering and assembly of the berry and leaf ESTs with currently available V. vinifera full-length transcripts and ESTs yielded a total of 13,278 unique sequences, with 2302 singletons and 10,976 mapped to V. vinifera gene models. Of these, 739 transcripts were found to have significant differential expression in stressed leaves and berries including 250 genes not described previously as being abiotic stress responsive. In a second analysis of 16,452 ESTs from a normalized root cDNA library derived from roots exposed to multiple, short-term, abiotic stresses, 135 genes with root-enriched expression patterns were identified on the basis of their relative EST abundance in roots relative to other tissues. 
Conclusions The large-scale analysis of relative EST frequency counts among a diverse collection of 23 different cDNA libraries from leaf, berry, and root tissues of wine grape exposed to a variety of abiotic stress conditions revealed distinct, tissue-specific expression patterns, previously unrecognized stress-induced genes, and many novel genes with root-enriched mRNA expression for improving our understanding of root biology and manipulation of rootstock traits in wine grape. mRNA abundance estimates based on EST library-enriched expression patterns showed only modest correlations between microarray and quantitative, real-time reverse transcription-polymerase chain reaction (qRT-PCR) methods highlighting the need for deep-sequencing expression profiling methods. PMID:21592389
Hattori, T; Terada, T; Hamasuna, S
1995-06-01
Osem, a rice gene homologous to the wheat Em gene, which encodes one of the late-embryogenesis abundant proteins, was isolated. The gene was characterized with respect to control of transcription by abscisic acid (ABA) and the transcriptional activator VP1, which is involved in ABA-regulated gene expression during late embryogenesis. A fusion gene (Osem-GUS) consisting of the Osem promoter and the bacterial beta-glucuronidase (GUS) gene was constructed and tested in a transient expression system, using protoplasts derived from a suspension-cultured line of rice cells, for activation by ABA and by co-transfection with an expression vector (35S-Osvp1) for the rice VP1 (OSVP1) cDNA. The expression of Osem-GUS was strongly (40- to 150-fold) activated by externally applied ABA and by over-expression of (OS)VP1. The Osem promoter has three ACGTG-containing sequences, motif A, motif B and motif A', which resemble the abscisic acid-responsive element (ABRE) that was previously identified in the wheat Em and the rice Rab16. There is also a CATGCATG sequence, which is known as the Sph box and is shown to be essential for the regulation by VP1 of the maize anthocyanin regulatory gene C1. Focusing on these sequence elements, various mutant derivatives of the Osem promoter were assayed in the transient expression system. The analysis revealed that motif A functions not only as an ABRE but also as a sequence element required for the regulation by (OS)VP1.
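A search for the ACGTG core and the Sph box (CATGCATG) usually has to consider both DNA strands. The sketch below is an illustration under assumptions, not the authors' procedure: the function names and the promoter fragment are hypothetical. It also shows that CATGCATG is its own reverse complement, so one forward-strand scan covers both strands for that element.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_sites(seq, core):
    """Return sorted (start, strand) pairs for every occurrence of core.

    Start positions are 0-based, in forward-strand coordinates.
    A palindromic core is reported once per site, on the "+" strand.
    """
    seq = seq.upper()
    core = core.upper()
    rc = reverse_complement(core)
    patterns = [(core, "+")] if rc == core else [(core, "+"), (rc, "-")]
    sites = []
    for pattern, strand in patterns:
        start = seq.find(pattern)
        while start != -1:
            sites.append((start, strand))
            start = seq.find(pattern, start + 1)   # step by 1 to allow overlaps
    return sorted(sites)


# Hypothetical fragment, made up for illustration; it is not the Osem promoter.
fragment = "TTACGTGGCATGCATGCCACGTAA"
abre_like = find_sites(fragment, "ACGTG")
sph_box = find_sites(fragment, "CATGCATG")
print(abre_like, sph_box)
```

Reporting strand alongside position is a design choice for the sketch; the abstract does not say on which strand motifs A, B and A' lie.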
YOSHINO, TIMOTHY P.; DINGUIRARD, NATHALIE; DE MORAES MOURÃO, MARINA
2013-01-01
SUMMARY With rapid developments in DNA and protein sequencing technologies, combined with powerful bioinformatics tools, a continued acceleration of gene identification in parasitic helminths is predicted, potentially leading to discovery of new drug and vaccine targets, enhanced diagnostics and insights into the complex biology underlying host-parasite interactions. For the schistosome blood flukes, with the recent completion of genome sequencing and comprehensive transcriptomic datasets, massive amounts of gene sequence data have accumulated for which, in the vast majority of cases, little is known about actual functions within the intact organism. In this review we attempt to bring together traditional in vitro cultivation approaches and recent emergent technologies of molecular genomics, transcriptomics and genetic manipulation to illustrate the considerable progress made in our understanding of trematode gene expression and function during development of the intramolluscan larval stages. Using several prominent trematode families (Schistosomatidae, Fasciolidae, Echinostomatidae), we have focused on the current status of in vitro larval isolation/cultivation as a source of valuable raw material supporting gene discovery efforts in model digeneans that include whole genome sequencing, transcript and protein expression profiling during larval development, and progress made in the in vitro manipulation of genes and their expression in larval trematodes using transgenic and RNA interference (RNAi) approaches. PMID:19961646
Characterization and expression profiles of MaACS and MaACO genes from mulberry (Morus alba L.)*
Liu, Chang-ying; Lü, Rui-hua; Li, Jun; Zhao, Ai-chun; Wang, Xi-ling; Diane, Umuhoza; Wang, Xiao-hong; Wang, Chuan-hong; Yu, Ya-sheng; Han, Shu-mei; Lu, Cheng; Yu, Mao-de
2014-01-01
1-Aminocyclopropane-1-carboxylic acid synthase (ACS) and 1-aminocyclopropane-1-carboxylic acid oxidase (ACO) are encoded by multigene families and are involved in fruit ripening by catalyzing the production of ethylene throughout the development of fruit. However, there are no reports on ACS or ACO genes in mulberry, partly because of the limited molecular research background. In this study, we have obtained five ACS gene sequences and two ACO gene sequences from Morus Genome Database. Sequence alignment and phylogenetic analysis of MaACO1 and MaACO2 showed that their amino acids are conserved compared with ACO proteins from other species. MaACS1 and MaACS2 are type I, MaACS3 and MaACS4 are type II, and MaACS5 is type III, with different C-terminal sequences. Quantitative reverse transcriptase polymerase chain reaction (qRT-PCR) expression analysis showed that the transcripts of MaACS genes were strongly expressed in fruit, and more weakly in other tissues. The expression of MaACO1 and MaACO2 showed different patterns in various mulberry tissues. MaACS and MaACO genes demonstrated two patterns throughout the development of mulberry fruit, and both of them were strongly up-regulated by abscisic acid (ABA) and ethephon. PMID:25001221
Expressed gene sequence of the IFN-gamma-response chemokine CXCL9 of cattle, horses, and swine
USDA-ARS's Scientific Manuscript database
This report describes the cloning and characterization of expressed gene sequences of bovine, equine, and swine CXCL9 from RNA obtained from peripheral blood mononuclear cell (PBMC) or other tissues. The bovine coding region was 378 nucleotides in length, while the equine and swine coding regions w...
Dash, P K; Tian, L M; Moore, A N
1998-07-07
Axonal injury increases intracellular Ca2+ and cAMP and has been shown to induce gene expression, which is thought to be a key event for regeneration. Increases in intracellular Ca2+ and/or cAMP can alter gene expression via activation of a family of transcription factors that bind to and modulate the expression of CRE (Ca2+/cAMP response element) sequence-containing genes. We have used Aplysia motor neurons to examine the role of CRE-binding proteins in axonal regeneration after injury. We report that axonal injury increases the binding of proteins to a CRE sequence-containing probe. In addition, Western blot analysis revealed that the level of ApCREB2, a CRE sequence-binding repressor, was enhanced as a result of axonal injury. The sequestration of CRE-binding proteins by microinjection of CRE sequence-containing plasmids enhanced axon collateral formation (both number and length) as compared with control plasmid injections. These findings show that Ca2+/cAMP-mediated gene expression via CRE-binding transcription factors participates in the regeneration of motor neuron axons.
Rare Cell Detection by Single-Cell RNA Sequencing as Guided by Single-Molecule RNA FISH.
Torre, Eduardo; Dueck, Hannah; Shaffer, Sydney; Gospocic, Janko; Gupte, Rohit; Bonasio, Roberto; Kim, Junhyong; Murray, John; Raj, Arjun
2018-02-28
Although single-cell RNA sequencing can reliably detect large-scale transcriptional programs, it is unclear whether it accurately captures the behavior of individual genes, especially those that express only in rare cells. Here, we use single-molecule RNA fluorescence in situ hybridization as a gold standard to assess trade-offs in single-cell RNA-sequencing data for detecting rare cell expression variability. We quantified the gene expression distribution for 26 genes that range from ubiquitous to rarely expressed and found that the correspondence between estimates across platforms improved with both transcriptome coverage and increased number of cells analyzed. Further, by characterizing the trade-off between transcriptome coverage and number of cells analyzed, we show that when the number of genes required to answer a given biological question is small, then greater transcriptome coverage is more important than analyzing large numbers of cells. More generally, our report provides guidelines for selecting quality thresholds for single-cell RNA-sequencing experiments aimed at rare cell analyses.
Characterization of an In Vivo Z-DNA Detection Probe Based on a Cell Nucleus Accumulating Intrabody.
Gulis, Galina; Silva, Izabel Cristina Rodrigues; Sousa, Herdson Renney; Sousa, Isabel Garcia; Bezerra, Maryani Andressa Gomes; Quilici, Luana Salgado; Maranhao, Andrea Queiroz; Brigido, Marcelo Macedo
2016-09-01
Left-handed Z-DNA is a physiologically unstable DNA conformation, and its existence in vivo can be attributed to localized torsional distress. Despite evidence for the existence of Z-DNA in vivo, its precise role in the control of gene expression is not fully understood. Here, an in vivo probe based on an anti-Z-DNA intrabody is proposed for native Z-DNA detection. The probe was used for chromatin immunoprecipitation of potential Z-DNA-forming sequences in the human genome. One of the isolated putative Z-DNA-forming sequences was cloned upstream of a reporter gene expression cassette under control of the CMV promoter. The reporter gene encoded an antibody fragment fused to GFP. Transient co-transfection of this vector along with the Z-probe coding vector improved reporter gene expression. This improvement was demonstrated by measuring reporter gene mRNA and protein levels and the amount of fluorescence in co-transfected CHO-K1 cells. These results suggest that the presence of the anti-Z-DNA intrabody can interfere with a Z-DNA-containing reporter gene expression. Therefore, this in vivo probe for the detection of Z-DNA could be used for global correlation of Z-DNA-forming sequences and gene expression regulation.
Isoform-level gene expression patterns in single-cell RNA-sequencing data.
Vu, Trung Nghia; Wills, Quin F; Kalari, Krishna R; Niu, Nifang; Wang, Liewei; Pawitan, Yudi; Rantalainen, Mattias
2018-02-27
RNA sequencing of single cells enables characterization of transcriptional heterogeneity in seemingly homogeneous cell populations. Single-cell sequencing has been applied in a wide range of research fields. However, few studies have focused on characterization of isoform-level expression patterns at the single-cell level. In this study we propose and apply a novel method, ISOform-Patterns (ISOP), based on mixture modeling, to characterize the expression patterns of isoform pairs from the same gene in single-cell isoform-level expression data. We define six principal patterns of isoform expression relationships and describe a method for differential-pattern analysis. We demonstrate ISOP through analysis of single-cell RNA-sequencing data from a breast cancer cell line, with replication in three independent datasets. We assigned the pattern types to each of 16,562 isoform-pairs from 4,929 genes. Among those, 26% of the discovered patterns were significant (p<0.05), while the remaining patterns are possibly effects of transcriptional bursting, drop-out and stochastic biological heterogeneity. Furthermore, 32% of genes discovered through differential-pattern analysis were not detected by differential-expression analysis. The effect of drop-out events, mean expression level, and properties of the expression distribution on the performance of ISOP was also investigated through simulated datasets. To conclude, ISOP provides a novel approach for characterization of isoform-level preference, commitment and heterogeneity in single-cell RNA-sequencing data. The ISOP method has been implemented as an R package and is available at https://github.com/nghiavtr/ISOP under a GPL-3 license. Contact: mattias.rantalainen@ki.se. Supplementary data are available at Bioinformatics online.
O’Grady, Joseph F.; Hoelters, Laura S.; Swain, Martin T.
2016-01-01
Background Talitrus saltator is an amphipod crustacean that inhabits the supralittoral zone on sandy beaches in the Northeast Atlantic and Mediterranean. T. saltator exhibits endogenous locomotor activity rhythms and time-compensated sun and moon orientation, both of which necessitate at least one chronometric mechanism. Whilst their behaviour is well studied, currently there are no descriptions of the underlying molecular components of a biological clock in this animal, and very few in other crustacean species. Methods We harvested brain tissue from animals expressing robust circadian activity rhythms and used homology cloning and Illumina RNAseq approaches to sequence and identify the core circadian clock and clock-related genes in these samples. We assessed the temporal expression of these genes in time-course samples from rhythmic animals using RNAseq. Results We identified a comprehensive suite of circadian clock gene homologues in T. saltator including the ‘core’ clock genes period (Talper), cryptochrome 2 (Talcry2), timeless (Taltim), clock (Talclk), and bmal1 (Talbmal1). In addition we describe the sequence and putative structures of 23 clock-associated genes including two unusual, extended isoforms of pigment dispersing hormone (Talpdh). We examined time-course RNAseq expression data, derived from tissues harvested from behaviourally rhythmic animals, to reveal rhythmic expression of these genes with approximately circadian period in Talper and Talbmal1. Of the clock-related genes, casein kinase IIβ (TalckIIβ), ebony (Talebony), jetlag (Taljetlag), pigment dispersing hormone (Talpdh), protein phosphatase 1 (Talpp1), shaggy (Talshaggy), sirt1 (Talsirt1), sirt7 (Talsirt7) and supernumerary limbs (Talslimb) show temporal changes in expression. Discussion We report the sequences of principal genes that comprise the circadian clock of T. saltator and highlight the conserved structural and functional domains of their deduced cognate proteins.
Our sequencing data contribute to the growing inventory of described comparative clocks. Expression profiling of the identified clock genes illuminates tantalising targets for experimental manipulation to elucidate the molecular and cellular control of clock-driven phenotypes in this crustacean. PMID:27761341
Voelker, T A; Staswick, P; Chrispeels, M J
1986-12-01
Phytohemagglutinin (PHA), the seed lectin of the common bean, Phaseolus vulgaris, is encoded by two highly homologous, tandemly linked genes, dlec1 and dlec2, which are coordinately expressed at high levels in developing cotyledons. Their respective transcripts translate into closely related polypeptides, PHA-E and PHA-L, constituents of the tetrameric lectin which accumulates at high levels in developing seeds. In the bean cultivar Pinto UI111, PHA-E is not detectable, and PHA-L accumulates at very reduced levels. To investigate the cause of the Pinto phenotype, we cloned and sequenced the two PHA genes of Pinto, called Pdlec1 and Pdlec2, and determined the abundance of their respective mRNAs in developing cotyledons. Both genes are more than 90% homologous to the normal PHA genes found in other cultivars. Pdlec1 carries a 1-bp frameshift mutation close to the 5' end of its coding sequence. Only very truncated polypeptides could be made from its mRNA. The gene Pdlec2 encodes a polypeptide, which resembles PHA-L and its predicted amino acid sequence agrees with the available Pinto PHA amino acid sequence data. Analysis of the mRNA of developing cotyledons revealed that the Pdlec1 message is reduced 600-fold, and Pdlec2 mRNA is reduced 20-fold with respect to mRNA levels in normal cultivars. A comparison of the sequences which are upstream from the coding sequence shows that Pdlec2 has a 100-bp deletion compared to the other genes (dlec1, dlec2 and Pdlec1). This deletion which contains a large tandem repeat may be responsible for the low level of expression of Pdlec2. The very low expression of Pdlec1 is as yet unexplained.
Li, Yunhai; Lee, Kee Khoon; Walsh, Sean; Smith, Caroline; Hadingham, Sophie; Sorefan, Karim; Cawley, Gavin; Bevan, Michael W
2006-03-01
Establishing transcriptional regulatory networks by analysis of gene expression data and promoter sequences shows great promise. We developed a novel promoter classification method using a Relevance Vector Machine (RVM) and Bayesian statistical principles to identify discriminatory features in the promoter sequences of genes that can correctly classify transcriptional responses. The method was applied to microarray data obtained from Arabidopsis seedlings treated with glucose or abscisic acid (ABA). Of those genes showing >2.5-fold changes in expression level, approximately 70% were correctly predicted as being up- or down-regulated (under 10-fold cross-validation), based on the presence or absence of a small set of discriminative promoter motifs. Many of these motifs have known regulatory functions in sugar- and ABA-mediated gene expression. One promoter motif that was not known to be involved in glucose-responsive gene expression was identified as the strongest classifier of glucose-up-regulated gene expression. We show it confers glucose-responsive gene expression in conjunction with another promoter motif, thus validating the classification method. We were able to establish a detailed model of glucose and ABA transcriptional regulatory networks and their interactions, which will help us to understand the mechanisms linking metabolism with growth in Arabidopsis. This study shows that machine learning strategies coupled to Bayesian statistical methods hold significant promise for identifying functionally significant promoter sequences.
Zhao, Man; Gu, Yongzhe; He, Lingli; Chen, Qingshan; He, Chaoying
2015-05-15
The DA1 gene family is plant-specific and Arabidopsis DA1 regulates seed and organ size, but the functions in soybeans are unknown. The cultivated soybean (Glycine max) is believed to be domesticated from the annual wild soybean (Glycine soja). To evaluate whether DA1-like genes were involved in the evolution of soybeans, we compared variation at both sequence and expression levels of DA1-like genes from G. max (GmaDA1) and G. soja (GsoDA1). Sequence identities were extremely high between the orthologous pairs from the two soybeans, while the paralogous copies within a soybean species showed a relatively high divergence. Moreover, the expression variation of DA1-like paralogous genes in soybean was much greater than that of the orthologous gene pairs between the wild and cultivated soybeans during development and when challenged with abiotic stresses such as salinity. We further found that overexpressing GsoDA1 genes did not affect seed size. Nevertheless, overexpressing them reduced transgenic Arabidopsis seed germination sensitivity to salt stress. Moreover, most of these genes could improve salt tolerance of the transgenic Arabidopsis plants, corroborated by a detection of expression variation of several key genes in the salt-tolerance pathways. Our work suggested that expression diversification of DA1-like genes is functionally associated with adaptive radiation of soybeans, reinforcing that the plant-specific DA1 gene family might have contributed to the successful adaptation to complex environments and radiation of the plants.
Fischer, Iris; Steige, Kim A.; Stephan, Wolfgang; Mboup, Mamadou
2013-01-01
The wild tomato species Solanum chilense and S. peruvianum are a valuable non-model system for studying plant adaptation since they grow in diverse environments facing many abiotic constraints. Here we investigate the sequence evolution of regulatory regions of drought and cold responsive genes and their expression regulation. The coding regions of these genes were previously shown to exhibit signatures of positive selection. Expression profiles and sequence evolution of regulatory regions of members of the Asr (ABA/water stress/ripening induced) gene family and the dehydrin gene pLC30-15 were analyzed in wild tomato populations from contrasting environments. For S. chilense, we found that Asr4 and pLC30-15 appear to respond much faster to drought conditions in accessions from very dry environments than accessions from more mesic locations. Sequence analysis suggests that the promoter of Asr2 and the downstream region of pLC30-15 are under positive selection in some local populations of S. chilense. By investigating gene expression differences at the population level we provide further support of our previous conclusions that Asr2, Asr4, and pLC30-15 are promising candidates for functional studies of adaptation. Our analysis also demonstrates the power of the candidate gene approach in evolutionary biology research and highlights the importance of wild Solanum species as a genetic resource for their cultivated relatives. PMID:24205149
Harnessing Gene Conversion in Chicken B Cells to Create a Human Antibody Sequence Repertoire
Schusser, Benjamin; Yi, Henry; Collarini, Ellen J.; Izquierdo, Shelley Mettler; Harriman, William D.; Etches, Robert J.; Leighton, Philip A.
2013-01-01
Transgenic chickens expressing human sequence antibodies would be a powerful tool to access human targets and epitopes that have been intractable in mammalian hosts because of tolerance to conserved proteins. To foster the development of the chicken platform, it is beneficial to validate transgene constructs using a rapid, cell culture-based method prior to generating fully transgenic birds. We describe a method for the expression of human immunoglobulin variable regions in the chicken DT40 B cell line and the further diversification of these genes by gene conversion. Chicken VL and VH loci were knocked out in DT40 cells and replaced with human VK and VH genes. To achieve gene conversion of human genes in chicken B cells, synthetic human pseudogene arrays were inserted upstream of the functional human VK and VH regions. Proper expression of chimeric IgM comprised of human variable regions and chicken constant regions is shown. Most importantly, sequencing of DT40 genetic variants confirmed that the human pseudogene arrays contributed to the generation of diversity through gene conversion at both the Igl and Igh loci. These data show that engineered pseudogene arrays produce a diverse pool of human antibody sequences in chicken B cells, and suggest that these constructs will express a functional repertoire of chimeric antibodies in transgenic chickens. PMID:24278246
Zhao, J.; Chen, Y. H.; Kwan, H. S.
2000-01-01
The complete nucleotide sequence of putative glucoamylase gene gla1 from the basidiomycetous fungus Lentinula edodes strain L54 is reported. The coding region of the genomic glucoamylase sequence, which is preceded by eukaryotic promoter elements CAAT and TATA, spans 2,076 bp. The gla1 gene sequence codes for a putative polypeptide of 571 amino acids and is interrupted by seven introns. The open reading frame sequence of the gla1 gene shows strong homology with those of other fungal glucoamylase genes and encodes a protein with an N-terminal catalytic domain and a C-terminal starch-binding domain. The similarity between the Gla1 protein and other fungal glucoamylases is from 45 to 61%, with the region of highest conservation found in catalytic domains and starch-binding domains. We compared the kinetics of glucoamylase activity and levels of gene expression in L. edodes strain L54 grown on different carbon sources (glucose, starch, cellulose, and potato extract) and in various developmental stages (mycelium growth, primordium appearance, and fruiting body formation). Quantitative reverse transcription PCR utilizing pairs of primers specific for gla1 gene expression shows that expression of gla1 was induced by starch and increased during the process of fruiting body formation, which indicates that glucoamylases may play an important role in the morphogenesis of the basidiomycetous fungus. PMID:10831434
Huang, Xiaoyun; Zang, Xiaonan; Wu, Fei; Jin, Yuming; Wang, Haitao; Liu, Chang; Ding, Yating; He, Bangxiang; Xiao, Dongfang; Song, Xinwei; Liu, Zhu
2017-01-01
Gracilariopsis lemaneiformis (aka Gracilaria lemaneiformis) is a red macroalga rich in phycoerythrin, which can capture light efficiently and transfer it to photosystem II. However, little is known about the synthesis of optically active phycoerythrin in G. lemaneiformis at the molecular level. With the advent of high-throughput sequencing technology, analysis of genetic information for G. lemaneiformis by transcriptome sequencing is an effective means to get a deeper insight into the molecular mechanism of phycoerythrin synthesis. Illumina technology was employed to sequence the transcriptome of two strains of G. lemaneiformis: the wild type and a green-pigmented mutant. We obtained a total of 86915 assembled unigenes as a reference gene set, and 42884 unigenes were annotated in at least one public database. Taking the above transcriptome sequencing as a reference gene set, 4041 differentially expressed genes were screened to analyze and compare the gene expression profiles of the wild type and green mutant. By GO and KEGG pathway analysis, we concluded that three factors, including a reduction in the expression level of apo-phycoerythrin, an increase of chlorophyll light-harvesting complex synthesis, and reduction of phycoerythrobilin by competitive inhibition, caused the reduction of optically active phycoerythrin in the green-pigmented mutant.
An efficient annotation and gene-expression derivation tool for Illumina Solexa datasets
2010-01-01
Background An Illumina flow cell with all eight lanes occupied produces well over a terabyte of images, with gigabytes of reads following sequence alignment. The ability to translate such reads into meaningful annotation is therefore of great concern and importance. One can very easily be flooded with such a great volume of textual, unannotated data irrespective of read quality or size. CASAVA, an optional analysis tool for Illumina sequencing experiments, supports INDEL detection, SNP information, and allele calling. Being able not only to extract from such analysis a measure of gene expression in the form of tag-counts, but also to annotate such reads, is therefore of significant value. Findings We developed TASE (Tag counting and Analysis of Solexa Experiments), a rapid tag-counting and annotation software tool specifically designed for Illumina CASAVA sequencing datasets. Developed in Java and deployed using the jTDS JDBC driver and a SQL Server backend, TASE provides an extremely fast means of calculating gene expression through tag-counts while annotating sequenced reads with the gene's presumed function, from any given CASAVA-build. Such a build is generated for both DNA and RNA sequencing. Analysis is broken into two distinct components: DNA sequence or read concatenation, followed by tag-counting and annotation. The end result is output containing the homology-based functional annotation and the respective gene expression measure, signifying how many times sequenced reads were found within the genomic ranges of functional annotations. Conclusions TASE is a powerful tool to facilitate the process of annotating a given Illumina Solexa sequencing dataset. Our results indicate that both homology-based annotation and tag-count analysis are achieved in very efficient times, allowing researchers to delve deep into a given CASAVA-build and maximize information extraction from a sequencing dataset. 
TASE is specially designed to translate sequence data in a CASAVA-build into functional annotations while producing corresponding gene expression measurements. Achieving such analysis is executed in an ultrafast and highly efficient manner, whether the analysis be a single-read or paired-end sequencing experiment. TASE is a user-friendly and freely available application, allowing rapid analysis and annotation of any given Illumina Solexa sequencing dataset with ease. PMID:20598141
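The tag-counting idea described above (counting how many sequenced reads fall within the genomic ranges of functional annotations) can be illustrated with a minimal Python sketch. This is not TASE's implementation, which the abstract describes as Java with a SQL Server backend; the function name `count_tags`, the tuple layout, and the assumption of non-overlapping inclusive ranges on a single chromosome are all hypothetical choices made for illustration.

```python
from bisect import bisect_right
from collections import Counter


def count_tags(annotations, read_positions):
    """Count reads per annotated range (illustrative sketch, not TASE itself).

    annotations: list of (start, end, gene) tuples; ranges are assumed to be
        inclusive, non-overlapping, and on one chromosome.
    read_positions: iterable of read start coordinates.
    """
    ranges = sorted(annotations)
    starts = [r[0] for r in ranges]
    counts = Counter()
    for pos in read_positions:
        # Find the last range whose start is <= pos.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos <= ranges[i][1]:
            counts[ranges[i][2]] += 1
    return counts


annotations = [(100, 200, "geneA"), (300, 400, "geneB")]
reads = [50, 100, 150, 200, 250, 300, 401]
tag_counts = count_tags(annotations, reads)
```

Sorting the ranges once and using binary search keeps each lookup logarithmic in the number of annotations, which matters when reads number in the millions.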
Goldman, Gustavo H.; dos Reis Marques, Everaldo; Custódio Duarte Ribeiro, Diógenes; Ângelo de Souza Bernardes, Luciano; Quiapin, Andréa Carla; Vitorelli, Patrícia Marostica; Savoldi, Marcela; Semighini, Camile P.; de Oliveira, Regina C.; Nunes, Luiz R.; Travassos, Luiz R.; Puccia, Rosana; Batista, Wagner L.; Ferreira, Leslie Ecker; Moreira, Júlio C.; Bogossian, Ana Paula; Tekaia, Fredj; Nobrega, Marina Pasetto; Nobrega, Francisco G.; Goldman, Maria Helena S.
2003-01-01
Paracoccidioides brasiliensis, a thermodimorphic fungus, is the causative agent of the prevalent systemic mycosis in Latin America, paracoccidioidomycosis. We present here a survey of expressed genes in the yeast pathogenic phase of P. brasiliensis. We obtained 13,490 expressed sequence tags from both 5′ and 3′ ends. Clustering analysis yielded the partial sequences of 4,692 expressed genes that were functionally classified by similarity to known genes. We have identified several Candida albicans virulence and pathogenicity homologues in P. brasiliensis. Furthermore, we have analyzed the expression of some of these genes during the dimorphic yeast-mycelium-yeast transition by real-time quantitative reverse transcription-PCR. Clustering analysis of the mycelium-yeast transition revealed three groups: (i) RBT, hydrophobin, and isocitrate lyase; (ii) malate dehydrogenase, contigs Pb1067 and Pb1145, GPI, and alternative oxidase; and (iii) ubiquitin, delta-9-desaturase, HSP70, HSP82, and HSP104. The first two groups displayed high mRNA expression in the mycelial phase, whereas the third group showed higher mRNA expression in the yeast phase. Our results suggest the possible conservation of pathogenicity and virulence mechanisms among fungi, expand considerably gene identification in P. brasiliensis, and provide a broader basis for further progress in understanding its biological peculiarities. PMID:12582121
Costa, José Hélio; de Melo, Dirce Fernandes; Gouveia, Zélia; Cardoso, Hélia Guerra; Peixe, Augusto; Arnholdt-Schmitt, Birgit
2009-12-01
'Genomic design' refers to the structural organization of gene sequences. Recently, the role of intron sequences in gene regulation has become better understood. Further, introns possess high rates of polymorphism that are considered as the major source for speciation. In molecular breeding, the length of gene-specific introns is recognized as a tool to discriminate genotypes with diverse traits of agronomic interest. 'Economy selection' and 'time-economy selection' have been proposed as models for explaining why highly expressed genes typically contain small introns. However, in contrast to these theories, plant-specific selection reveals that highly expressed genes contain introns that are large. In the present research, 'wet' Aox gene identification from grapevine is advanced by a bioinformatics approach to study the species-specific organization of Aox gene structures in relation to available expressed sequence tag (EST) data. Two Aox1 and one Aox2 gene sequences have been identified in Vitis vinifera using grapevine cultivars from Portugal and Germany. Searching the complete genome sequence data of two grapevine cultivars confirmed that V. vinifera alternative oxidase (Aox) is encoded by a small multigene family composed of Aox1a, Aox1b and Aox2. An analysis of EST distribution revealed high expression of the VvAox2 gene. A relationship between the atypical long primary transcript of VvAox2 (in comparison to other plant Aox genes) and its expression level is suggested. V. vinifera Aox genes contain four exons interrupted by three introns except for Aox1a which contains an additional intron in the 3'-UTR. The lengths of primary Aox transcripts were estimated for each gene in two V. vinifera varieties: PN40024 and Pinot Noir. In both varieties, Aox1a and Aox1b contained small introns that corresponded to primary transcript lengths ranging from 1501 to 1810 bp. 
The Aox2 of PN40024 (12 329 bp) was longer than that from Pinot Noir (7279 bp) because of selection against a transposable-element insertion that is 5028 bp in size. An EST database basic local alignment search tool (BLAST) search of GenBank revealed the following EST percentages for each gene: Aox1a (26.2%), Aox1b (11.9%) and Aox2 (61.9%). Aox1a was expressed in fruits and roots, Aox1b expression was confined to flowers and Aox2 was ubiquitously expressed. These data for V. vinifera show that atypically long Aox intron lengths are related to high levels of gene expression. Furthermore, it is shown for the first time that two grapevine cultivars can be distinguished by Aox intron length polymorphism.
Vukmirovic, Milica; Herazo-Maya, Jose D; Blackmon, John; Skodric-Trifunovic, Vesna; Jovanovic, Dragana; Pavlovic, Sonja; Stojsic, Jelena; Zeljkovic, Vesna; Yan, Xiting; Homer, Robert; Stefanovic, Branko; Kaminski, Naftali
2017-01-12
Idiopathic Pulmonary Fibrosis (IPF) is a lethal lung disease of unknown etiology. A major limitation in transcriptomic profiling of lung tissue in IPF has been a dependence on snap-frozen fresh tissues (FF). In this project we sought to determine whether genome scale transcript profiling using RNA Sequencing (RNA-Seq) could be applied to archived Formalin-Fixed Paraffin-Embedded (FFPE) IPF tissues. We isolated total RNA from 7 IPF and 5 control FFPE lung tissues and performed 50 base pair paired-end sequencing on an Illumina HiSeq 2000. TopHat2 was used to map sequencing reads to the human genome. On average ~62 million reads (53.4% of ~116 million reads) were mapped per sample. 4,131 genes were differentially expressed between IPF and controls (1,920 increased and 2,211 decreased; FDR < 0.05). We compared our results to differentially expressed genes calculated from a previously published dataset generated from FF tissues analyzed on Agilent microarrays (GSE47460). The overlap of differentially expressed genes was very high (760 increased and 1,413 decreased, FDR < 0.05). Only 92 differentially expressed genes changed in opposite directions. Pathway enrichment analysis performed using MetaCore confirmed numerous IPF relevant genes and pathways including extracellular remodeling, TGF-beta, and WNT. Gene network analysis of MMP7, a highly differentially expressed gene in both datasets, revealed the same canonical pathways and gene network candidates in RNA-Seq and microarray data. For validation by NanoString nCounter® we selected 35 genes that had a fold change of 2 in at least one dataset (10 discordant, 10 significantly differentially expressed in one dataset only and 15 concordant genes). High concordance of fold change and FDR was observed for each type of the samples (FF vs FFPE) with both microarrays (r = 0.92) and RNA-Seq (r = 0.90) and the number of discordant genes was reduced to four. 
Our results demonstrate that RNA sequencing of RNA obtained from archived FFPE lung tissues is feasible. The results obtained from FFPE tissue are highly comparable to FF tissues. The ability to perform RNA-Seq on archived FFPE IPF tissues should greatly enhance the availability of tissue biopsies for research in IPF.
BASiCS: Bayesian Analysis of Single-Cell Sequencing Data
Vallejos, Catalina A.; Marioni, John C.; Richardson, Sylvia
2015-01-01
Single-cell mRNA sequencing can uncover novel cell-to-cell heterogeneity in gene expression levels in seemingly homogeneous populations of cells. However, these experiments are prone to high levels of unexplained technical noise, creating new challenges for identifying genes that show genuine heterogeneous expression within the population of cells under study. BASiCS (Bayesian Analysis of Single-Cell Sequencing data) is an integrated Bayesian hierarchical model where: (i) cell-specific normalisation constants are estimated as part of the model parameters, (ii) technical variability is quantified based on spike-in genes that are artificially introduced to each analysed cell’s lysate and (iii) the total variability of the expression counts is decomposed into technical and biological components. BASiCS also provides an intuitive detection criterion for highly (or lowly) variable genes within the population of cells under study. This is formalised by means of tail posterior probabilities associated to high (or low) biological cell-to-cell variance contributions, quantities that can be easily interpreted by users. We demonstrate our method using gene expression measurements from mouse Embryonic Stem Cells. Cross-validation and meaningful enrichment of gene ontology categories within genes classified as highly (or lowly) variable supports the efficacy of our approach. PMID:26107944
Jose, Jency; Jalali, S K; Shivalingaswamy, T M; Kumar, N K Krishna; Bhatnagar, R; Bandyopadhyay, A
2013-06-01
A PCR-based method for detection of viral DNA in nucleopolyhedroviruses (NPVs) of three lepidopterans, Spodoptera litura, Amsacta albistriga and Helicoverpa armigera, was developed by targeting the late expression factor-8 (lef-8) gene of the three NPVs with specific primers. Amplicons of 689, 699 and 665 bp were amplified, respectively, and the nucleotide sequences were submitted to GenBank and accession numbers were obtained. The sequences of the lef-8 gene of S. litura NPV and H. armigera NPV matched those of their respective references in the GenBank database, thereby confirming their identity; the sequence of A. albistriga NPV, however, was the first such sequence submitted to the GenBank database. Sequence similarity analysis between the three NPV lef-8 genes sequenced in the present study revealed no significant similarity between them; however, A. albistriga NPV and S. litura NPV were found to be closely related. CLUSTAL alignment of the sequences generated revealed general relatedness among the NPV lef-8 genes. The study confirmed that the lef-8 gene can be used for quick and correct discriminatory identification of insect viruses.
Gruel, Jérémy; LeBorgne, Michel; LeMeur, Nolwenn; Théret, Nathalie
2011-09-12
Regulation of gene expression plays a pivotal role in cellular functions. However, understanding the dynamics of transcription remains a challenging task. A host of computational approaches have been developed to identify regulatory motifs, mainly based on the recognition of DNA sequences for transcription factor binding sites. Recent integration of additional data from genomic analyses or phylogenetic footprinting has significantly improved these methods. Here, we propose a different approach based on the compilation of Simple Shared Motifs (SSM), groups of sequences defined by their length and similarity and present in conserved sequences of gene promoters. We developed an original algorithm to search and count SSM in pairs of genes. An exceptional number of SSM is considered as a common regulatory pattern. The SSM approach is applied to a sample set of genes and validated using functional gene-set enrichment analyses. We demonstrate that the SSM approach selects genes that are over-represented in specific biological categories (Ontology and Pathways) and are enriched in co-expressed genes. Finally we show that genes co-expressed in the same tissue or involved in the same biological pathway have increased SSM values. Using unbiased clustering of genes, Simple Shared Motifs analysis constitutes an original contribution to provide a clearer definition of expression networks.
2011-01-01
Background Regulation of gene expression plays a pivotal role in cellular functions. However, understanding the dynamics of transcription remains a challenging task. A host of computational approaches have been developed to identify regulatory motifs, mainly based on the recognition of DNA sequences for transcription factor binding sites. Recent integration of additional data from genomic analyses or phylogenetic footprinting has significantly improved these methods. Results Here, we propose a different approach based on the compilation of Simple Shared Motifs (SSM), groups of sequences defined by their length and similarity and present in conserved sequences of gene promoters. We developed an original algorithm to search and count SSM in pairs of genes. An exceptional number of SSM is considered as a common regulatory pattern. The SSM approach is applied to a sample set of genes and validated using functional gene-set enrichment analyses. We demonstrate that the SSM approach selects genes that are over-represented in specific biological categories (Ontology and Pathways) and are enriched in co-expressed genes. Finally we show that genes co-expressed in the same tissue or involved in the same biological pathway have increased SSM values. Conclusions Using unbiased clustering of genes, Simple Shared Motifs analysis constitutes an original contribution to provide a clearer definition of expression networks. PMID:21910886
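To make the notion of counting shared motifs in a pair of genes more concrete, the following Python sketch counts the distinct fixed-length words in one promoter sequence that have a near match in another. It is only a simplified illustration and not the authors' SSM algorithm: the abstract says SSM are defined by length and similarity, so the word length `k`, the Hamming-distance threshold `max_mismatch`, and the function names are assumptions introduced here.

```python
def kmers(seq, k):
    """Return the set of distinct length-k words in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hamming(a, b):
    """Number of mismatching positions between two equal-length words."""
    return sum(x != y for x, y in zip(a, b))


def shared_motif_count(seq_a, seq_b, k=6, max_mismatch=1):
    """Count distinct k-mers of seq_a with a near match in seq_b.

    Illustrative stand-in for an SSM-style score; length (k) and similarity
    (max_mismatch) are hypothetical parameters.
    """
    motifs_b = kmers(seq_b, k)
    return sum(
        1
        for m in kmers(seq_a, k)
        if any(hamming(m, n) <= max_mismatch for n in motifs_b)
    )


exact = shared_motif_count("ACGTACGT", "ACGTTCGT", k=4, max_mismatch=0)
fuzzy = shared_motif_count("ACGTACGT", "ACGTTCGT", k=4, max_mismatch=1)
```

In a real analysis, a pairwise count like this would still need to be compared against a background distribution before being called "exceptional"; that step is not shown here.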
Coordinate cytokine regulatory sequences
Frazer, Kelly A.; Rubin, Edward M.; Loots, Gabriela G.
2005-05-10
The present invention provides CNS sequences that regulate the cytokine gene expression, expression cassettes and vectors comprising or lacking the CNS sequences, host cells and non-human transgenic animals comprising the CNS sequences or lacking the CNS sequences. The present invention also provides methods for identifying compounds that modulate the functions of CNS sequences as well as methods for diagnosing defects in the CNS sequences of patients.
2014-01-01
Background Down-regulation or silencing of transgene expression can be a major hurdle to both molecular studies and biotechnology applications in many plant species. Sugarcane is particularly effective at silencing introduced transgenes, including reporter genes such as the firefly luciferase gene. Synthesizing transgene coding sequences optimized for usage in the host plant is one method of enhancing transgene expression and stability. Using specified design rules we have synthesised new coding sequences for both the firefly luciferase and Renilla luciferase reporter genes. We have tested these optimized versions for enhanced levels of luciferase activity and for increased steady state luciferase mRNA levels in sugarcane. Results The synthetic firefly luciferase (luc*) and Renilla luciferase (Renluc*) coding sequences have elevated G + C contents in line with sugarcane codon usage, but maintain 75% identity to the native firefly or Renilla luciferase nucleotide sequences and 100% identity to the protein coding sequences. Under the control of the maize pUbi promoter, the synthetic luc* and Renluc* genes yielded 60x and 15x higher luciferase activity respectively, over the native firefly and Renilla luciferase genes in transient assays on sugarcane suspension cell cultures. Using a novel transient assay in sugarcane suspension cells combining co-bombardment and qRT-PCR, we showed that synthetic luc* and Renluc* genes generate increased transcript levels compared to the native firefly and Renilla luciferase genes. In stable transgenic lines, the luc* transgene generated significantly higher levels of expression than the native firefly luciferase transgene. The fold difference in expression was highest in the youngest tissues. Conclusions We developed synthetic versions of both the firefly and Renilla luciferase reporter genes that resist transgene silencing in sugarcane. 
These transgenes will be particularly useful for evaluating the expression patterns conferred by existing and newly isolated promoters in sugarcane tissues. The strategies used to design the synthetic luciferase transgenes could be applied to other transgenes that are aggressively silenced in sugarcane. PMID:24708613
Chou, Ting-Chun; Moyle, Richard L
2014-04-08
Down-regulation or silencing of transgene expression can be a major hurdle to both molecular studies and biotechnology applications in many plant species. Sugarcane is particularly effective at silencing introduced transgenes, including reporter genes such as the firefly luciferase gene. Synthesizing transgene coding sequences optimized for usage in the host plant is one method of enhancing transgene expression and stability. Using specified design rules we have synthesised new coding sequences for both the firefly luciferase and Renilla luciferase reporter genes. We have tested these optimized versions for enhanced levels of luciferase activity and for increased steady state luciferase mRNA levels in sugarcane. The synthetic firefly luciferase (luc*) and Renilla luciferase (Renluc*) coding sequences have elevated G + C contents in line with sugarcane codon usage, but maintain 75% identity to the native firefly or Renilla luciferase nucleotide sequences and 100% identity to the protein coding sequences. Under the control of the maize pUbi promoter, the synthetic luc* and Renluc* genes yielded 60x and 15x higher luciferase activity respectively, over the native firefly and Renilla luciferase genes in transient assays on sugarcane suspension cell cultures. Using a novel transient assay in sugarcane suspension cells combining co-bombardment and qRT-PCR, we showed that synthetic luc* and Renluc* genes generate increased transcript levels compared to the native firefly and Renilla luciferase genes. In stable transgenic lines, the luc* transgene generated significantly higher levels of expression than the native firefly luciferase transgene. The fold difference in expression was highest in the youngest tissues. We developed synthetic versions of both the firefly and Renilla luciferase reporter genes that resist transgene silencing in sugarcane. 
These transgenes will be particularly useful for evaluating the expression patterns conferred by existing and newly isolated promoters in sugarcane tissues. The strategies used to design the synthetic luciferase transgenes could be applied to other transgenes that are aggressively silenced in sugarcane.
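The two sequence-level quantities quoted for the synthetic genes, G + C content and percent nucleotide identity to the native sequence, are simple to compute. The Python sketch below shows one way to do so for two already-aligned, equal-length coding sequences. The helper names and the toy 9-nt sequences are hypothetical and are not taken from the luc* or Renluc* constructs; the authors' design rules are not reproduced here.

```python
def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def nucleotide_identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a.upper(), b.upper())) / len(a)


# Toy example: both sequences encode Met-Ala-Lys, but the second uses
# synonymous codons that are richer in G and C.
native = "ATGGCTAAA"
synthetic = "ATGGCCAAG"

native_gc = gc_content(native)
synthetic_gc = gc_content(synthetic)
identity = nucleotide_identity(native, synthetic)
```

In this toy case the synonymous changes raise G + C content while leaving the encoded peptide unchanged, which mirrors the pattern reported above of reduced nucleotide identity alongside 100% protein identity.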
A wing expressed sequence tag resource for Bicyclus anynana butterflies, an evo-devo model
Beldade, Patrícia; Rudd, Stephen; Gruber, Jonathan D; Long, Anthony D
2006-01-01
Background Butterfly wing color patterns are a key model for integrating evolutionary developmental biology and the study of adaptive morphological evolution. Yet, despite the biological, economic and educational value of butterflies they are still relatively under-represented in terms of available genomic resources. Here, we describe an Expressed Sequence Tag (EST) project for Bicyclus anynana that has identified the largest available collection to date of expressed genes for any butterfly. Results By targeting cDNAs from developing wings at the stages when pattern is specified, we biased gene discovery towards genes potentially involved in pattern formation. Assembly of 9,903 ESTs from a subtracted library allowed us to identify 4,251 genes of which 2,461 were annotated based on BLAST analyses against relevant gene collections. Gene prediction software identified 2,202 peptides, of which 215 longer than 100 amino acids had no homology to any known proteins and, thus, potentially represent novel or highly diverged butterfly genes. We combined gene and Single Nucleotide Polymorphism (SNP) identification by constructing cDNA libraries from pools of outbred individuals, and by sequencing clones from the 3' end to maximize alignment depth. Alignments of multi-member contigs allowed us to identify over 14,000 putative SNPs, with 316 genes having at least one high confidence double-hit SNP. We furthermore identified 320 microsatellites in transcribed genes that can potentially be used as genetic markers. Conclusion Our project was designed to combine gene and sequence polymorphism discovery and has generated the largest gene collection available for any butterfly and many potential markers in expressed genes. These resources will be invaluable for exploring the potential of B. anynana in particular, and butterflies in general, as models in ecological, evolutionary, and developmental genetics. PMID:16737530
Furuya, Toshiki; Hirose, Satomi; Semba, Hisashi; Kino, Kuniki
2011-01-01
The mimABCD gene cluster encodes the binuclear iron monooxygenase that oxidizes propane and phenol in Mycobacterium smegmatis strain MC2 155 and Mycobacterium goodii strain 12523. Interestingly, expression of the mimABCD gene cluster is induced by acetone. In this study, we investigated the regulator gene responsible for this acetone-responsive expression. In the genome sequence of M. smegmatis strain MC2 155, the mimABCD gene cluster is preceded by a gene designated mimR, which is divergently transcribed. Sequence analysis revealed that MimR exhibits amino acid similarity with the NtrC family of transcriptional activators, including AcxR and AcoR, which are involved in acetone and acetoin metabolism, respectively. Unexpectedly, many homologs of the mimR gene were also found in the sequenced genomes of actinomycetes. A plasmid carrying a transcriptional fusion of the intergenic region between the mimR and mimA genes with a promoterless green fluorescent protein (GFP) gene was constructed and introduced into M. smegmatis strain MC2 155. Using a GFP reporter system, we confirmed by deletion and complementation analyses that the mimR gene product is the positive regulator of the mimABCD gene cluster expression that is responsive to acetone. M. goodii strain 12523 also utilized the same regulatory system as M. smegmatis strain MC2 155. Although transcriptional activators of the NtrC family generally control transcription using the σ54 factor, a gene encoding the σ54 factor was absent from the genome sequence of M. smegmatis strain MC2 155. These results suggest the presence of a novel regulatory system in actinomycetes, including mycobacteria. PMID:21856847
Walker, M D; Park, C W; Rosen, A; Aronheim, A
1990-01-01
Cell specific expression of the insulin gene is achieved through transcriptional mechanisms operating on multiple DNA sequence elements located in the 5' flanking region of the gene. Of particular importance in the rat insulin I gene are two closely similar 9 bp sequences (IEB1 and IEB2): mutation of either of these leads to 5-10 fold reduction in transcriptional activity. We have screened an expression cDNA library derived from mouse pancreatic endocrine beta cells with a radioactive DNA probe containing multiple copies of the IEB1 sequence. A cDNA clone (A1) isolated by this procedure encodes a protein which shows efficient binding to the IEB1 probe, but much weaker binding to either an unrelated DNA probe or to a probe bearing a single base pair insertion within the recognition sequence. DNA sequence analysis indicates a protein belonging to the helix-loop-helix family of DNA-binding proteins. The ability of the protein encoded by clone A1 to recognize a number of wild type and mutant DNA sequences correlates closely with the ability of each sequence element to support transcription in vivo in the context of the insulin 5' flanking DNA. We conclude that the isolated cDNA may encode a transcription factor that participates in control of insulin gene expression. Images PMID:2181401
Kim, Sunhwa; Matsuo, Ichiro; Ajisaka, Katsumi; Nakajima, Harushi; Kitamoto, Katsuhiko
2002-10-01
We isolated a beta-N-acetylglucosaminidase encoding gene and its cDNA from the filamentous fungus Aspergillus nidulans, and designated it nagA. The nagA gene contained no intron and encoded a polypeptide of 603 amino acids with a putative 19-amino acid signal sequence. The deduced amino acid sequence was very similar to the sequence of Candida albicans Hex1 and Trichoderma harzianum Nag1. Yeast cells containing the nagA cDNA under the control of the GAL1 promoter expressed beta-N-acetylglucosaminidase activity. The chromosomal nagA gene of A. nidulans was disrupted by replacement with the argB marker gene. The disruptant strains expressed low levels of beta-N-acetylglucosaminidase activity and showed poor growth on a medium containing chitobiose as a carbon source. Aspergillus oryzae strain carrying the nagA gene under the control of the improved glaA promoter produced large amounts of beta-N-acetylglucosaminidase in a wheat bran solid culture.
On construction of stochastic genetic networks based on gene expression sequences.
Ching, Wai-Ki; Ng, Michael M; Fung, Eric S; Akutsu, Tatsuya
2005-08-01
Reconstruction of genetic regulatory networks from time series data of gene expression patterns is an important research topic in bioinformatics. Probabilistic Boolean Networks (PBNs) have been proposed as an effective model for gene regulatory networks. PBNs are able to cope with uncertainty, incorporate rule-based dependencies between genes and discover the sensitivity of genes in their interactions with other genes. However, PBNs are unlikely to be used directly in practice because of the huge computational cost of obtaining predictors and their corresponding probabilities. In this paper, we propose a multivariate Markov model for approximating PBNs and describing the dynamics of a genetic network for gene expression sequences. The main contribution of the new model is to preserve the strength of PBNs and reduce the complexity of the networks. The number of parameters of our proposed model is O(n^2), where n is the number of genes involved. We also develop efficient methods for estimating the model parameters. Numerical examples on synthetic data sets and practical yeast data sequences are given to demonstrate the effectiveness of the proposed model.
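The O(n^2) parameter count can be made concrete with a small sketch. The abstract does not give the model equations, so the form used here is an assumption: a first-order multivariate Markov chain in which the next-state distribution of gene j is a weighted mixture, over all genes k, of a transition matrix P[j][k] applied to the current state of gene k. The function names, the binary state coding and the uniform mixing weights are all hypothetical; the paper estimates the weights rather than fixing them.

```python
import numpy as np

def transition_matrix(src, dst, m=2):
    """Column-stochastic m x m matrix: entry [a, b] estimates Pr(dst[t+1] = a | src[t] = b)."""
    counts = np.zeros((m, m))
    for b, a in zip(src[:-1], dst[1:]):
        counts[a, b] += 1
    col_sums = counts.sum(axis=0, keepdims=True)
    safe = np.where(col_sums == 0, 1, col_sums)
    # Columns with no observations fall back to a uniform distribution.
    return np.where(col_sums > 0, counts / safe, 1.0 / m)

def fit(sequences, m=2):
    """Estimate one transition matrix per ordered gene pair: n * n matrices in total."""
    n = len(sequences)
    P = [[transition_matrix(sequences[k], sequences[j], m) for k in range(n)]
         for j in range(n)]
    # Assumption: uniform mixing weights as a placeholder for estimated ones.
    lam = np.full((n, n), 1.0 / n)
    return P, lam

def predict_next(P, lam, current_states, m=2):
    """Return an (n, m) array; row j is the predicted state distribution of gene j."""
    n = len(current_states)
    onehot = np.eye(m)[current_states]
    return np.array([sum(lam[j, k] * (P[j][k] @ onehot[k]) for k in range(n))
                     for j in range(n)])

# Two toy binary expression sequences that alternate in antiphase.
sequences = [[0, 1, 0, 1, 0, 1], [1, 0, 1, 0, 1, 0]]
P, lam = fit(sequences)
dist = predict_next(P, lam, [0, 1])
```

With n genes the sketch stores n * n small matrices and n * n weights, which is where the quadratic growth comes from; a full PBN would instead need predictors over combinations of genes.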
Wang, Cheng; Yu, Jie; Kallen, Caleb B
2008-01-01
The proliferating cell nuclear antigen (PCNA) is an essential component of DNA replication, cell cycle regulation, and epigenetic inheritance. High expression of PCNA is associated with poor prognosis in patients with breast cancer. The 5'-region of the PCNA gene contains two computationally-detected estrogen response element (ERE) sequences, one of which is evolutionarily conserved. Both of these sequences are of undocumented cis-regulatory function. We recently demonstrated that estradiol (E2) enhances PCNA mRNA expression in MCF7 breast cancer cells. MCF7 cells proliferate in response to E2. Here, we demonstrate that E2 rapidly enhanced PCNA mRNA and protein expression in a process that requires ERalpha as well as de novo protein synthesis. One of the two upstream ERE sequences was specifically bound by ERalpha-containing protein complexes, in vitro, in gel shift analysis. Yet, each ERE sequence, when cloned as a single copy, or when engineered as two tandem copies of the ERE-containing sequence, was not capable of activating a luciferase reporter construct in response to E2. In MCF7 cells, neither ERE-containing genomic region demonstrated E2-dependent recruitment of ERalpha by sensitive ChIP-PCR assays. We conclude that E2 enhances PCNA gene expression by an indirect process and that computational detection of EREs, even when evolutionarily conserved and when near E2-responsive genes, requires biochemical validation.
Dou, Wei; Shen, Guang-Mao; Niu, Jin-Zhi; Ding, Tian-Bo; Wei, Dan-Dan; Wang, Jin-Jun
2013-01-01
Background: Recent studies indicate that infestations of psocids pose a new risk for global food security. Among the psocid species, Liposcelis bostrychophila Badonnel has gained recognition in importance because of its parthenogenic reproduction, rapid adaptation, and increased worldwide distribution. To date, the molecular data available for L. bostrychophila are largely limited to genes identified through homology. Also, no transcriptome data relevant to psocid infection are available. Methodology and Principal Findings: In this study, we generated a de novo assembly of the L. bostrychophila transcriptome using short-read sequencing technology (Illumina). In a single run, we obtained more than 51 million sequencing reads that were assembled into 60,012 unigenes (mean size = 711 bp) by Trinity. The transcriptome sequences from different developmental stages of L. bostrychophila, including egg, nymph and adult, were annotated with the non-redundant (Nr) protein database, gene ontology (GO), cluster of orthologous groups of proteins (COG), and KEGG orthology (KO). The analysis revealed three major enzyme families involved in insecticide metabolism as differentially expressed in the L. bostrychophila transcriptome. A total of 49 P450-, 31 GST- and 21 CES-specific genes representing the three enzyme families were identified. In addition, 16 transcripts were identified as containing target site sequences of resistance genes. Furthermore, we profiled gene expression patterns upon insecticide (malathion and deltamethrin) exposure using the tag-based digital gene expression (DGE) method. Conclusion: The L. bostrychophila transcriptome and DGE data provide gene expression data that would further our understanding of molecular mechanisms in psocids. In particular, the findings of this investigation will facilitate the identification of genes involved in insecticide resistance and the design of new compounds for control of psocids. PMID:24278202
Rizk, Francine; Laverdure, Sylvain; d'Alençon, Emmanuelle; Bossin, Hervé; Dupressoir, Thierry
2018-01-01
The Lepidopteran ambidensovirus 1 isolated from Junonia coenia (hereafter JcDV) is an invertebrate parvovirus considered as a viral transduction vector as well as a potential tool for the biological control of insect pests. Previous works showed that JcDV-based circular plasmids experimentally integrate into insect cell genomic DNA. In order to approach the natural conditions of infection and possible integration, we generated linear JcDV-gfp-based molecules which were transfected into non-permissive Spodoptera frugiperda (Sf9) cultured cells. Cells were monitored for the expression of green fluorescent protein (GFP) and DNA was analyzed for integration of transduced viral sequences. Non-structural protein modulation of the VP-gene cassette promoter activity was additionally assayed. We show that linear JcDV-derived molecules are capable of long-term genomic integration and sustained transgene expression in Sf9 cells. As expected, only the deletion of both inverted terminal repeats (ITR) or the polyadenylation signals of NS and VP genes dramatically impairs the global transduction/expression efficiency. However, all the integrated viral sequences we characterized appear "scrambled" whatever the viral content of the transfected vector. Despite a strong GFP expression, we were unable to recover any full sequence of the original constructs and found rearranged viral and non-viral sequences as well. Cellular flanking sequences were identified as non-coding ones. On the other hand, the kinetics of GFP expression over time led us to investigate the apparent down-regulation by non-structural proteins of the VP-gene cassette promoter. Altogether, our results show that JcDV-derived sequences included in linear DNA molecules are able to drive efficiently the integration and expression of a foreign gene into the genome of insect cells, whatever their composition, provided that at least one ITR is present. 
However, the transfected sequences were extensively rearranged with cellular DNA during or after random integration in the host cell genome. Lastly, the non-structural proteins seem to participate in the regulation of p9 promoter activity rather than in the integration of viral sequences.
Genomic analysis of expressed sequence tags in American black bear Ursus americanus
2010-01-01
Background: Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Results: Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. Conclusion: We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes. PMID:20338065
Genomic analysis of expressed sequence tags in American black bear Ursus americanus.
Zhao, Sen; Shao, Chunxuan; Goropashnaya, Anna V; Stewart, Nathan C; Xu, Yichi; Tøien, Øivind; Barnes, Brian M; Fedorov, Vadim B; Yan, Jun
2010-03-26
Species of the bear family (Ursidae) are important organisms for research in molecular evolution, comparative physiology and conservation biology, but relatively little genetic sequence information is available for this group. Here we report the development and analyses of the first large scale Expressed Sequence Tag (EST) resource for the American black bear (Ursus americanus). Comprehensive analyses of molecular functions, alternative splicing, and tissue-specific expression of 38,757 black bear EST sequences were conducted using the dog genome as a reference. We identified 18 genes, involved in functions such as lipid catabolism, cell cycle, and vesicle-mediated transport, that are showing rapid evolution in the bear lineage. Three genes, Phospholamban (PLN), cysteine glycine-rich protein 3 (CSRP3) and Troponin I type 3 (TNNI3), are related to heart contraction, and defects in these genes in humans lead to heart disease. Two genes, biphenyl hydrolase-like (BPHL) and CSRP3, contain positively selected sites in bear. Global analysis of evolution rates of hibernation-related genes in bear showed that they are largely conserved and slowly evolving genes, rather than novel and fast-evolving genes. We provide a genomic resource for an important mammalian organism and our study sheds new light on the possible functions and evolution of bear genes.
Gambling on a shortcut to genome sequencing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Roberts, L.
1991-06-21
Almost from the start of the Human Genome Project, a debate has been raging over whether to sequence the entire human genome, all 3 billion bases, or just the genes - a mere 2% or 3% of the genome, and by far the most interesting part. In England, Sydney Brenner convinced the Medical Research Council (MRC) to start with the expressed genes, or complementary DNAs. But the US stance has been that the entire sequence is essential if we are to understand the blueprint of man. Craig Venter of the National Institute of Neurological Disorders and Stroke says that focusing on the expressed genes may be even more useful than expected. His strategy involves randomly selecting clones from cDNA libraries which theoretically contain all the genes that are switched on at a particular time in a particular tissue. Then the researchers sequence just a short stretch of each clone, about 400 to 500 bases, to create an expressed sequence tag or EST. The sequences of these ESTs are then stored in a database. Using that information, other researchers can then recreate that EST by using polymerase chain reaction techniques.
Wang, Xu; Werren, John H.; Clark, Andrew G.
2015-01-01
There is extraordinary diversity in sexual dimorphism (SD) among animals, but little is known about its epigenetic basis. To study the epigenetic architecture of SD in a haplodiploid system, we performed RNA-seq and whole-genome bisulfite sequencing of adult females and males from two closely related parasitoid wasps, Nasonia vitripennis and Nasonia giraulti. More than 75% of expressed genes displayed significantly sex-biased expression. As a consequence, expression profiles are more similar between species within each sex than between sexes within each species. Furthermore, extremely male- and female-biased genes are enriched for totally different functional categories: male-biased genes for key enzymes in sex-pheromone synthesis and female-biased genes for genes involved in epigenetic regulation of gene expression. Remarkably, just 70 highly expressed, extremely male-biased genes account for 10% of all transcripts in adult males. Unlike expression profiles, DNA methylomes are highly similar between sexes within species, with no consistent sex differences in methylation found. Therefore, methylation changes cannot explain the extensive level of sex-biased gene expression observed. Female-biased genes have smaller sequence divergence between species, higher conservation to other hymenopterans, and a broader expression range across development. Overall, female-biased genes have been recruited from genes with more conserved and broadly expressing “house-keeping” functions, whereas male-biased genes are more recently evolved and are predominately testis specific. In summary, Nasonia accomplish a striking degree of sex-biased expression without sex chromosomes or epigenetic differences in methylation. We propose that methylation provides a general signal for constitutive gene expression, whereas other sex-specific signals cause sex-biased gene expression. PMID:26100871
Chen, Yadong; Xia, Yongtao; Shao, Changwei; Han, Lei; Chen, Xuejie; Yu, Mengjun; Sha, Zhenxia
2016-07-01
As the Russian sturgeon (Acipenser gueldenstaedtii) is an important food and is the main source of caviar, it is necessary to discover the genes associated with its sex differentiation. However, the complicated life and maturity cycles of the Russian sturgeon restrict the accurate identification of sex in early development. To generate a first look at specific sex-related genes, we sequenced the transcriptome of gonads in different development stages (1, 2, and 5 yr old stages) with next-generation RNA sequencing. We generated >60 million raw reads, and the filtered reads were assembled into 263,341 contigs, which produced 38,505 unigenes. Genes involved in signal transduction mechanisms were the most abundant, suggesting that development of sturgeon gonads is under control of signal transduction mechanisms. Differentially expressed gene analysis suggests that more genes for protein synthesis, cytochrome c oxidase subunits, and ribosomal proteins were expressed in female gonads than in male. Meanwhile, male gonads expressed more transposable element transposase, reverse transcriptase, and transposase-related genes than female. In total, 342, 782, and 7,845 genes were detected in intersex, male, and female transcriptomes, respectively. The female gonad expressed more genes than the male gonad, and more genes were involved in female gonadal development. Genes (sox9, foxl2) are differentially expressed in different sexes and may be important sex-related genes in Russian sturgeon. Sox9 genes are responsible for the development of male gonads and foxl2 for female gonads. Copyright © 2016 the American Physiological Society.
Chuang, Duen-yau; Chien, Yung-chei; Wu, Huang-Pin
2007-01-01
The purpose of this study was to clone the carocin S1 gene and express it in a non-carocin-producing strain of Erwinia carotovora. A mutant, TH22-10, which produced a high-molecular-weight bacteriocin but not a low-molecular-weight bacteriocin, was obtained by Tn5 insertional mutagenesis using H-rif-8-2 (a spontaneous rifampin-resistant mutant of Erwinia carotovora subsp. carotovora 89-H-4). Using thermal asymmetric interlaced PCR, the DNA sequence from the Tn5 insertion site and the DNA sequence of the contiguous 2,280-bp region were determined. Two complete open reading frames (ORF), designated ORF2 and ORF3, were identified within the sequence fragment. ORF2 and ORF3 were identified with the carocin S1 genes, caroS1K (ORF2) and caroS1I (ORF3), which, respectively, encode a killing protein (CaroS1K) and an immunity protein (CaroS1I). These genes were homologous to the pyocin S3 gene and the pyocin AP41 gene. Carocin S1 was expressed in E. carotovora subsp. carotovora Ea1068 and replicated in TH22-10 but could not be expressed in Escherichia coli (JM101) because a consensus sequence resembling an SOS box was absent. A putative sequence similar to the consensus sequence for the E. coli cyclic AMP receptor protein binding site (−312 bp) was found upstream of the start codon. Production of this bacteriocin was also induced by glucose and lactose. The homology search results indicated that the carocin S1 gene (between bp 1078 and bp 1704) was homologous to the pyocin S3 and pyocin AP41 genes in Pseudomonas aeruginosa. These genes encode proteins with nuclease activity (domain 4). This study found that carocin S1 also has nuclease activity. PMID:17071754
Cao, Yi-zhan; Hao, Chun-qiu; Feng, Zhi-hua; Zhou, Yong-xing; Li, Jin-ge; Jia, Zhan-sheng; Wang, Ping-zhong
2003-02-01
To construct three recombinant shuttle plasmids of an adenovirus expression vector that express different hepatitis C virus (HCV) structural genes (C, C+E1, C+E1+E2), in order to package adenovirus expression vectors that express the different HCV structural genes effectively. The different HCV structural genes, derived from the plasmid pBRTM/HCV1-3011 by polymerase chain reaction (PCR), were each inserted downstream of the cytomegalovirus (CMV) immediate early promoter element of the shuttle plasmid (pAd.CMV-Link.1) of the adenovirus expression vector, yielding three recombinant plasmids (pAd.HCV-C, pAd.HCV-CE1, pAd.HCV-S). The recombinant plasmids were identified by endonuclease digestion, PCR and sequencing. The HCV structural genes were expressed transiently in HepG2 cells transfected using Lipofectamine 2000, and expression was confirmed by immunofluorescence and Western blot. The insert DNAs of the three recombinant plasmids were confirmed to be the different HCV structural genes by endonuclease digestion, PCR and sequencing. The three recombinant plasmids expressed the HCV structural genes (C, C+E1, C+E1+E2) transiently in HepG2 cells, as confirmed by immunofluorescence and Western blot. The three recombinant shuttle plasmids of the adenovirus expression vector can express the HCV structural genes (C, C+E1, C+E1+E2) transiently. This should be useful for packaging adenovirus expression vectors that express HCV structural genes.
Zhang, Senhao; Shi, Yinghua; Cheng, Ningning; Du, Hongqi; Fan, Wenna; Wang, Chengzhang
2015-01-01
Alfalfa (Medicago sativa L.) is one of the most widely cultivated perennial forage legumes worldwide. Fall dormancy is an adaptive character related to the biomass production and winter survival in alfalfa. The physiological, biochemical and molecular mechanisms causing fall dormancy and the related genes have not been well studied. In this study, we sequenced two standard varieties of alfalfa (dormant and non-dormant) at two time points and generated approximately 160 million high quality paired-end sequence reads using sequencing by synthesis (SBS) technology. The de novo transcriptome assembly generated a set of 192,875 transcripts with an average length of 856 bp representing about 165.1 Mb of the alfalfa leaf transcriptome. After assembly, 111,062 (57.6%) transcripts were annotated against the NCBI non-redundant database. A total of 30,165 (15.6%) transcripts were mapped to 323 Kyoto Encyclopedia of Genes and Genomes pathways. We also identified 41,973 simple sequence repeats, which can be used to generate markers for alfalfa, and 1,541 transcription factors were identified across 1,350 transcripts. Gene expression comparisons between dormant and non-dormant alfalfa at different time points were performed, and we identified several differentially expressed genes potentially related to fall dormancy. The Gene Ontology and pathways information were also identified. We sequenced and assembled the leaf transcriptome of alfalfa related to fall dormancy, and also identified some genes of interest involved in the fall dormancy mechanism. Thus, our research focused on studying fall dormancy in alfalfa through transcriptome sequencing. The sequencing and gene expression data generated in this study may be used further to elucidate the complete mechanisms governing fall dormancy in alfalfa.
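Identifying simple sequence repeats in assembled transcripts can be sketched with a regular expression. The abstract does not state which tool or thresholds were used, so everything below is an assumption: the function name `find_ssrs` is hypothetical, and the repeat-unit lengths (2 to 6 nt) and minimum of 5 tandem copies are illustrative defaults only.

```python
import re

def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=5):
    """Return (start, unit, copies) for each tandem repeat found in seq.

    Assumed thresholds: units of min_unit..max_unit nt repeated at least
    min_repeats times. A run of a single base (e.g. AAAAAAAAAA) is reported
    with a two-base unit, which a real SSR tool would treat differently.
    """
    seq = seq.upper()
    # The lazy quantifier tries the shortest unit first; \1 requires exact copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    return [
        (m.start(), m.group(1), len(m.group(0)) // len(m.group(1)))
        for m in pattern.finditer(seq)
    ]

# A toy transcript with six tandem copies of AT starting at position 3.
hits = find_ssrs("GGG" + "AT" * 6 + "CCC")
```

Each hit gives a position and motif from which flanking primers could be designed, which is how repeats of this kind are typically turned into markers.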
Cheng, Ningning; Du, Hongqi; Fan, Wenna; Wang, Chengzhang
2015-01-01
Alfalfa (Medicago sativa L.) is one of the most widely cultivated perennial forage legumes worldwide. Fall dormancy is an adaptive character related to the biomass production and winter survival in alfalfa. The physiological, biochemical and molecular mechanisms causing fall dormancy and the related genes have not been well studied. In this study, we sequenced two standard varieties of alfalfa (dormant and non-dormant) at two time points and generated approximately 160 million high quality paired-end sequence reads using sequencing by synthesis (SBS) technology. The de novo transcriptome assembly generated a set of 192,875 transcripts with an average length of 856 bp representing about 165.1 Mb of the alfalfa leaf transcriptome. After assembly, 111,062 (57.6%) transcripts were annotated against the NCBI non-redundant database. A total of 30,165 (15.6%) transcripts were mapped to 323 Kyoto Encyclopedia of Genes and Genomes pathways. We also identified 41,973 simple sequence repeats, which can be used to generate markers for alfalfa, and 1,541 transcription factors were identified across 1,350 transcripts. Gene expression comparisons between dormant and non-dormant alfalfa at different time points were performed, and we identified several differentially expressed genes potentially related to fall dormancy. The Gene Ontology and pathways information were also identified. We sequenced and assembled the leaf transcriptome of alfalfa related to fall dormancy, and also identified some genes of interest involved in the fall dormancy mechanism. Thus, our research focused on studying fall dormancy in alfalfa through transcriptome sequencing. The sequencing and gene expression data generated in this study may be used further to elucidate the complete mechanisms governing fall dormancy in alfalfa. PMID:25799491
Martín, Juan F; Rodríguez-García, Antonio; Liras, Paloma
2017-05-01
Phosphate limitation is important for production of antibiotics and other secondary metabolites in Streptomyces. Phosphate control is mediated by the two-component system PhoR-PhoP. Following phosphate depletion, PhoP stimulates expression of genes involved in scavenging, transport and mobilization of phosphate, and represses the utilization of nitrogen sources. PhoP reduces expression of genes for aerobic respiration and activates nitrate respiration genes. PhoP activates genes for teichuronic acid formation and reduces expression of genes for phosphate-rich teichoic acid biosynthesis. In Streptomyces coelicolor, PhoP repressed several differentiation and pleiotropic regulatory genes, which affects development and indirectly antibiotic biosynthesis. A new bioinformatics analysis of the putative PhoP-binding sequences in Streptomyces avermitilis was made. Many sequences in the S. avermitilis genome showed high weight values and were classified according to the available genetic information. These genes encode phosphate scavenging proteins, phosphate transporters and nitrogen metabolism genes. Among the genes highlighted in the new studies were aveR, located in the avermectin gene cluster and encoding a LAL-type regulator, and afsS, which is regulated by PhoP and AfsR. The sequence logo for S. avermitilis PHO boxes is similar to that of S. coelicolor, with differences in the weight value for specific nucleotides in the sequence.
A cis-regulatory logic simulator.
Zeigler, Robert D; Gertz, Jason; Cohen, Barak A
2007-07-27
A major goal of computational studies of gene regulation is to accurately predict the expression of genes based on the cis-regulatory content of their promoters. The development of computational methods to decode the interactions among cis-regulatory elements has been slow, in part, because it is difficult to know, without extensive experimental validation, whether a particular method identifies the correct cis-regulatory interactions that underlie a given set of expression data. There is an urgent need for test expression data in which the interactions among cis-regulatory sites that produce the data are known. The ability to rapidly generate such data sets would facilitate the development and comparison of computational methods that predict gene expression patterns from promoter sequence. We developed a gene expression simulator which generates expression data using user-defined interactions between cis-regulatory sites. The simulator can incorporate additive, cooperative, competitive, and synergistic interactions between regulatory elements. Constraints on the spacing, distance, and orientation of regulatory elements and their interactions may also be defined and Gaussian noise can be added to the expression values. The simulator allows for a data transformation that simulates the sigmoid shape of expression levels from real promoters. We found good agreement between sets of simulated promoters and predicted regulatory modules from real expression data. We present several data sets that may be useful for testing new methodologies for predicting gene expression from promoter sequence. We developed a flexible gene expression simulator that rapidly generates large numbers of simulated promoters and their corresponding transcriptional output based on specified interactions between cis-regulatory sites. When appropriate rule sets are used, the data generated by our simulator faithfully reproduces experimentally derived data sets. 
We anticipate that using simulated gene expression data sets will facilitate the direct comparison of computational strategies to predict gene expression from promoter sequence. The source code is available online and as additional material. The test sets are available as additional material.
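The kind of rule-based simulation described in this abstract can be illustrated with a very small sketch. It is not the authors' simulator: the site names, weights and the single pairwise rule are hypothetical, and spacing, distance and orientation constraints are left out. It only shows how additive contributions, a pairwise interaction term, Gaussian noise and a sigmoid transform might be combined into one expression value.

```python
import math
import random

# Hypothetical per-site contributions; a negative weight stands in for a repressor site.
ADDITIVE = {"A": 1.0, "B": 0.5, "R": -1.5}
# Hypothetical interaction rule: an extra bonus when both sites occur in one promoter.
PAIRWISE = {frozenset({"A", "B"}): 2.0}

def raw_score(sites):
    """Sum additive contributions, then add a bonus for each satisfied pair rule."""
    present = set(sites)
    score = sum(ADDITIVE.get(site, 0.0) for site in present)
    for pair, bonus in PAIRWISE.items():
        if pair <= present:
            score += bonus
    return score

def expression(sites, noise_sd=0.0, rng=None):
    """Add Gaussian noise to the raw score and squash it to (0, 1) with a sigmoid."""
    rng = rng or random.Random()
    noisy = raw_score(sites) + rng.gauss(0.0, noise_sd)
    return 1.0 / (1.0 + math.exp(-noisy))

# Simulated promoters, listed as the regulatory sites they contain.
promoters = [["A"], ["A", "B"], ["R"], []]
levels = [expression(p) for p in promoters]
```

Because the rules that generated `levels` are known exactly, a prediction method run on such data can be scored on whether it recovers the A-B interaction, which is the use case the abstract describes.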
A chromatin insulator determines the nuclear localization of DNA.
Gerasimova, T I; Byrd, K; Corces, V G
2000-11-01
Chromatin insulators might regulate gene expression by controlling the subnuclear organization of DNA. We found that a DNA sequence normally located inside of the nucleus moved to the periphery when the gypsy insulator was placed within the sequence. The presence of the gypsy insulator also caused two sequences, normally found in different regions of the nucleus, to come together at a single location. Alterations in this subnuclear organization imposed by the gypsy insulator correlated with changes in gene expression that took place during the heat-shock response. These global changes in transcription were accompanied by dramatic alterations in the distribution of insulator proteins and DNA. The results suggest that the nuclear organization imposed by the gypsy insulator on the chromatin fiber is important for gene expression.
Repressor-mediated tissue-specific gene expression in plants
Meagher, Richard B [Athens, GA; Balish, Rebecca S [Oxford, OH; Tehryung, Kim [Athens, GA; McKinney, Elizabeth C [Athens, GA
2009-02-17
Plant tissue-specific gene expression by way of repressor-operator complexes has enabled outcomes including, without limitation, male sterility and engineered plants having root-specific gene expression of relevant proteins to clean environmental pollutants from soil and water. A mercury hyperaccumulation strategy requires that the mercuric ion reductase coding sequence be strongly expressed. The actin promoter vector, A2pot, engineered to contain bacterial lac operator sequences, directed strong expression in all plant vegetative organs and tissues. In contrast, the expression from the A2pot construct was restricted primarily to root tissues when a modified bacterial repressor (LacIn) was coexpressed from the light-regulated rubisco small subunit promoter in above-ground tissues. Also provided are analogous repressor-operator complexes for selective expression in other plant tissues, for example, to produce male sterile plants.
Monoallelic Gene Expression in Mammals.
Chess, Andrew
2016-11-23
Monoallelic expression not due to cis-regulatory sequence polymorphism poses an intriguing problem in epigenetics because it requires the unequal treatment of two segments of DNA that are present in the same nucleus and that can indeed have absolutely identical sequences. Here, I focus on a few recent developments in the field of monoallelic expression that are of particular interest and raise interesting questions for future work. One development is regarding analyses of imprinted genes, in which recent work suggests the possibility that intriguing networks of imprinted genes exist and are important for genetic and physiological studies. Another issue that has been raised in recent years by a number of publications is the question of how skewed allelic expression should be for it to be designated as monoallelic expression and, further, what methods are appropriate or inappropriate for analyzing genomic data to examine allele-specific expression. Perhaps the most exciting recent development in mammalian monoallelic expression is a clever and carefully executed analysis of genetic diversity of autosomal genes subject to random monoallelic expression (RMAE), which provides compelling evidence for distinct evolutionary forces acting on random monoallelically expressed genes.
2013-01-01
Background: The fertile and sterile plants were derived from the self-pollinated offspring of the F1 hybrid between the novel restorer line NR1 and the Nsa CMS line in Brassica napus. To elucidate gene expression and regulation caused by the A and C subgenomes of B. napus, as well as the alien chromosome and cytoplasm from Sinapis arvensis during the development of young floral buds, we performed genome-wide high-throughput transcriptomic sequencing of young floral buds of sterile and fertile plants. Results: In this study, equal amounts of total RNA taken from young floral buds of sterile and fertile plants were sequenced using the Illumina/Solexa platform. After low-quality data were filtered out, a total of 2,760,574 and 2,714,441 clean tags remained in the two libraries, from which 242,163 (Ste) and 253,507 (Fer) distinct tags were obtained. All distinct sequencing tags were annotated using all possible CATG+17-nt sequences of the genome and transcriptome of Brassica rapa and those of Brassica oleracea as the reference sequences, respectively. In total, 3231 genes of B. rapa and 3371 genes of B. oleracea were detected with significant differential expression levels. GO and pathway-based analyses were performed to determine and further to understand the biological functions of those differentially expressed genes (DEGs). In addition, there were 1089 specifically expressed unknown tags in Fer, which were mapped neither to B. oleracea nor to B. rapa, and these unique tags were presumed to arise mainly from the added alien chromosome of S. arvensis. Fifteen genes were randomly selected and their expression levels were confirmed by quantitative RT-PCR, and fourteen of them showed expression patterns consistent with the digital gene expression (DGE) data. Conclusions: A number of genes were differentially expressed between the young floral buds of sterile and fertile plants. 
Some of these genes may be candidates for future research on CMS in Nsa line, fertility restoration and improved agronomic traits in NR1 line. Further study of the unknown tags which were specifically expressed in Fer will help to explore desirable agronomic traits from wild species. PMID:23324545
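The tag annotation step, in which distinct tags are compared against all possible CATG+17-nt sequences of a reference, can be illustrated with a minimal sketch. It assumes the reference is available as a plain string and scans the forward strand only; the function name `catg_tags` is hypothetical and the study's actual pipeline is not described in the abstract.

```python
def catg_tags(sequence, tag_length=17):
    """Return every CATG + tag_length-nt tag found on the forward strand.

    Illustrative only: a real virtual tag library would usually also cover
    the reverse strand and record which gene each tag came from.
    """
    seq = sequence.upper()
    tags = []
    start = seq.find("CATG")
    while start != -1:
        end = start + 4 + tag_length
        # Keep the tag only if the full 17 nt downstream of CATG exist.
        if end <= len(seq):
            tags.append(seq[start:end])
        # Continue searching one base further so overlapping sites are found.
        start = seq.find("CATG", start + 1)
    return tags


# Hypothetical toy reference: one complete tag, one CATG too close to the end.
reference = "AACATG" + "A" * 17 + "CATGTT"
library = set(catg_tags(reference))
```

A sequenced tag would then be annotated by looking it up in `library`; tags absent from both Brassica references correspond to the "unknown tags" discussed in the abstract.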
Yan, Xiaohong; Dong, Caihua; Yu, Jingyin; Liu, Wanghui; Jiang, Chenghong; Liu, Jia; Hu, Qiong; Fang, Xiaoping; Wei, Wenhui
2013-01-16
Two rapidly evolving genes contribute to male fitness in Drosophila
Reinhardt, Josephine A; Jones, Corbin D
2013-01-01
Purifying selection often results in conservation of gene sequence and function. The most functionally conserved genes are also thought to be among the most biologically essential. These observations have led to the use of sequence conservation as a proxy for functional conservation. Here we describe two genes that are exceptions to this pattern. We show that lack of sequence conservation among orthologs of CG15460 and CG15323 – herein named jean-baptiste (jb) and karr respectively – does not necessarily predict lack of functional conservation. These two Drosophila melanogaster genes are among the most rapidly evolving protein-coding genes in this species, being nearly as diverged from their D. yakuba orthologs as random sequences are. jb and karr are both expressed at an elevated level in larval males and adult testes, but they are not accessory gland proteins and their loss does not affect male fertility. Instead, knockdown of these genes in D. melanogaster via RNA interference caused male-biased viability defects. These viability effects occur prior to the third instar for jb and during late pupation for karr. We show that putative orthologs to jb and karr are also expressed strongly in the testes of other Drosophila species and have similar gene structure across species despite low levels of sequence conservation. While standard molecular evolution tests could not reject neutrality, other data hint at a role for natural selection. Together these data provide a clear case where a lack of sequence conservation does not imply a lack of conservation of expression or function. PMID:24221639
Sharma, Mukul; Vedithi, Sundeep Chaitanya; Das, Madhusmita; Roy, Anindya; Ebenezer, Mannam
2017-01-01
Survival of Mycobacterium leprae, the causative bacterium of leprosy, in the human host is dependent to an extent on the ways in which its genome integrity is retained. DNA repair mechanisms protect bacterial DNA from damage induced by various stress factors. The current study is aimed at understanding the sequence and functional annotation of DNA repair genes in M. leprae. The genome of M. leprae was annotated using sequence alignment tools to identify DNA repair genes that have homologs in Mycobacterium tuberculosis and Escherichia coli. A set of 96 genes known to be involved in DNA repair mechanisms in E. coli and Mycobacteriaceae were chosen as a reference. Among these, 61 were identified in M. leprae based on sequence similarity and domain architecture. The 61 were classified into 36 characterized gene products (59%), 11 hypothetical proteins (18%), and 14 pseudogenes (23%). All these genes have homologs in M. tuberculosis and 49 (80.32%) in E. coli. A set of 12 genes which are absent in E. coli were present in M. leprae and in Mycobacteriaceae. These 61 genes were further investigated for their expression profiles in the whole transcriptome microarray data of M. leprae, which was obtained from the signal intensities of 60 bp probes tiling the entire genome with 10 bp overlaps. It was noted that transcripts corresponding to all the 61 genes were identified in the transcriptome data with varying expression levels ranging from 0.18 to 2.47 fold (normalized with 16S rRNA). The mRNA expression levels of a representative set of seven genes (four annotated and three hypothetical protein coding genes) were analyzed using quantitative Polymerase Chain Reaction (qPCR) assays with RNA extracted from skin biopsies of 10 newly diagnosed, untreated leprosy cases. It was noted that RNA expression levels were higher for genes involved in homologous recombination whereas the genes with a low level of expression are involved in the direct repair pathway. 
This study provided preliminary information on the potential DNA repair pathways that are extant in M. leprae and the associated genes.
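The probe layout mentioned above (60 bp probes tiling the genome with 10 bp overlaps) can be sketched as follows. This assumes that "10 bp overlaps" means consecutive probes share 10 bases, giving a 50 bp step, and uses 0-based coordinates on a linear sequence; the function name and the toy genome length are hypothetical and not taken from the study.

```python
def tiling_probe_starts(genome_length, probe_length=60, overlap=10):
    """Return 0-based start positions of probes tiling a linear sequence.

    Consecutive probes share `overlap` bases, so the step between starts
    is probe_length - overlap. Only probes that fit entirely are returned.
    """
    if overlap >= probe_length:
        raise ValueError("overlap must be smaller than probe_length")
    step = probe_length - overlap
    return list(range(0, genome_length - probe_length + 1, step))


# Toy example: a 160 bp sequence is covered by three 60 bp probes.
starts = tiling_probe_starts(160)
intervals = [(s, s + 60) for s in starts]
```

With this layout each probe begins 50 bp after the previous one, so a gene's expression value could be summarized from the probes whose intervals fall inside its coordinates.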
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yagi, J.M.; Madsen, E.L.
The diversity of Rieske dioxygenase genes and short-term temporal variability in the abundance of two selected dioxygenase gene sequences were examined in a naphthalene-rich, coal tar waste-contaminated subsurface study site. Using a previously published PCR-based approach (S. M. Ni Chadhain, R. S. Norman, K. V. Pesce, J. J. Kukor, and G. J. Zylstra, Appl. Environ. Microbiol. 72: 4078-4087, 2006) a broad suite of genes was detected, ranging from dioxygenase sequences associated with Rhodococcus and Sphingomonas to 32 previously uncharacterized Rieske gene sequence clone groups. The nag genes appeared frequently (20% of the total) in two groundwater monitoring wells characterized by low (~10^2 ppb; ~1 µM) ambient concentrations of naphthalene. A quantitative competitive PCR assay was used to show that abundances of nag genes (and archetypal nah genes) fluctuated substantially over a 9-month period. To contrast short-term variation with long-term community stability, in situ community gene expression (dioxygenase mRNA) and biodegradation potential (community metabolism of naphthalene in microcosms) were compared to measurements from 6 years earlier. cDNA sequences amplified from total RNA extracts revealed that nah- and nag-type genes were expressed in situ, corresponding well with structural gene abundances. Despite evidence for short-term (9-month) shifts in dioxygenase gene copy number, agreement in field gene expression (dioxygenase mRNA) and biodegradation potential was observed in comparisons to equivalent assays performed 6 years earlier. Thus, stability in community biodegradation characteristics at the hemidecadal time frame has been documented for these subsurface microbial communities.
Jiang, Bo; Zhang, Yong; She, Chang; Zhao, Jiaju; Zhou, Kailong; Zuo, Zhicheng; Zhou, Xiaozhong; Wang, Peiji; Dong, Qirong
2017-09-01
It is well known that moderate to high doses of ionizing radiation have a toxic effect on the organism. However, there are few experimental studies on the mechanisms by which low-dose radiation (LDR) affects nerve regeneration after peripheral nerve injury. We established a rat model of repaired peripheral nerve injury, and vascular endothelial growth factor A and growth-associated protein-43 were detected in the different treatment groups. We performed transcriptome sequencing to investigate the differentially expressed genes and gene functions between the control group and the 1 Gy group. Sequencing was done using high-throughput RNA-sequencing (RNA-seq) technologies. The results showed the 1 Gy group to be the most effective in promoting repair. RNA-sequencing identified 619 differentially expressed genes between control and treated groups. A Gene Ontology analysis of the differentially expressed genes revealed enrichment in the functional pathways. Among them, candidate genes associated with nerve repair were identified. Pathways involved in cell-substrate adhesion, vascular smooth muscle contraction and cell adhesion molecule signaling may be involved in recovery from peripheral nerve injury. Copyright © 2017. Published by Elsevier B.V.
Spliced synthetic genes as internal controls in RNA sequencing experiments.
Hardwick, Simon A; Chen, Wendy Y; Wong, Ted; Deveson, Ira W; Blackburn, James; Andersen, Stacey B; Nielsen, Lars K; Mattick, John S; Mercer, Tim R
2016-09-01
RNA sequencing (RNA-seq) can be used to assemble spliced isoforms, quantify expressed genes and provide a global profile of the transcriptome. However, the size and diversity of the transcriptome, the wide dynamic range in gene expression and inherent technical biases confound RNA-seq analysis. We have developed a set of spike-in RNA standards, termed 'sequins' (sequencing spike-ins), that represent full-length spliced mRNA isoforms. Sequins have an entirely artificial sequence with no homology to natural reference genomes, but they align to gene loci encoded on an artificial in silico chromosome. The combination of multiple sequins across a range of concentrations emulates alternative splicing and differential gene expression, and it provides scaling factors for normalization between samples. We demonstrate the use of sequins in RNA-seq experiments to measure sample-specific biases and determine the limits of reliable transcript assembly and quantification in accompanying human RNA samples. In addition, we have designed a complementary set of sequins that represent fusion genes arising from rearrangements of the in silico chromosome to aid in cancer diagnosis. RNA sequins provide a qualitative and quantitative reference with which to navigate the complexity of the human transcriptome.
2013-01-01
predicted amino acid sequences of the three encoded BmAChEs were no more closely related to one another than AChEs from different organisms and their...solely on nucleotide and amino acid sequence similarity; however, the cholinesterase gene family contains a number of related enzymes and structural...acetylcholinesterase of P. papatasi was cloned, sequenced, and expressed in the baculovirus system to generate a recombinant enzyme for biochemical
Generation and Analysis of Expressed Sequence Tags from Olea europaea L.
Ozdemir Ozgenturk, Nehir; Oruç, Fatma; Sezerman, Ugur; Kuçukural, Alper; Vural Korkut, Senay; Toksoz, Feriha; Un, Cemal
2010-01-01
Olive (Olea europaea L.) is an important source of edible oil that originated in the Near East region. In this study, two cDNA libraries were constructed from young olive leaves and immature olive fruits to generate ESTs, discover novel genes, and investigate the function of unknown genes of olive. A total of 3840 randomly selected colonies were sequenced for EST collection from both libraries. The 2228 readable sequences for olive leaf and 1506 readable sequences for olive fruit were assembled into 205 and 69 contigs, respectively, whereas 2478 were singletons. Putative functions of all 2752 differentially expressed unique sequences were designated by gene homology based on BLAST and annotated using BLAST2GO. While 1339 ESTs show no homology to the database, 2024 ESTs have homology (under 80%) with hypothetical proteins, putative proteins, expressed proteins, and unknown proteins in NCBI-GenBank. A further 635 unique EST sequences were identified by over 80% homology to genes of known function in other species that were not previously described in the Olea family. Only 3.1% of total ESTs showed similarity with the olive database existing in NCBI. The generated EST data and consensus sequences were submitted to NCBI as a valuable resource for functional genome studies of olive. PMID:21197085
Association of coral algal symbionts with a diverse viral community responsive to heat shock.
Brüwer, Jan D; Agrawal, Shobhit; Liew, Yi Jin; Aranda, Manuel; Voolstra, Christian R
2017-08-17
Stony corals provide the structural foundation of coral reef ecosystems and are termed holobionts given they engage in symbioses, in particular with photosynthetic dinoflagellates of the genus Symbiodinium. Besides Symbiodinium, corals also engage with bacteria affecting metabolism, immunity, and resilience of the coral holobiont, but the role of associated viruses is largely unknown. In this regard, the increase of studies using RNA sequencing (RNA-Seq) to assess gene expression provides an opportunity to elucidate viral signatures encompassed within the data via careful delineation of sequence reads and their source of origin. Here, we re-analyzed an RNA-Seq dataset from a cultured coral symbiont (Symbiodinium microadriaticum, Clade A1) across four experimental treatments (control, cold shock, heat shock, dark shock) to characterize associated viral diversity, abundance, and gene expression. Our approach comprised the filtering and removal of host sequence reads, subsequent phylogenetic assignment of sequence reads of putative viral origin, and the assembly and analysis of differentially expressed viral genes. About 15.46% (123 million) of all sequence reads were non-host-related, of which <1% could be classified as archaea, bacteria, or virus. Of these, 18.78% were annotated as virus and comprised a diverse community consistent across experimental treatments. Further, non-host related sequence reads assembled into 56,064 contigs, including 4856 contigs of putative viral origin that featured 43 differentially expressed genes during heat shock. The differentially expressed genes included viral kinases, ubiquitin, and ankyrin repeat proteins (amongst others), which are suggested to help the virus proliferate and inhibit the algal host's antiviral response. Our results suggest that a diverse viral community is associated with coral algal endosymbionts of the genus Symbiodinium, which prompts further research on their ecological role in coral health and resilience.
Kalra, Shikha; Puniya, Bhanwar Lal; Kulshreshtha, Deepika; Kumar, Sunil; Kaur, Jagdeep; Ramachandran, Srinivasan; Singh, Kashmir
2013-01-01
Chlorophytum borivilianum, an endangered medicinal plant species, is highly recognized for its aphrodisiac properties provided by saponins present in the plant. The transcriptome information of this species is limited and only a few hundred expressed sequence tags (ESTs) are available in the public databases. To gain molecular insight into this plant, high-throughput transcriptome sequencing of leaf RNA was carried out using Illumina's HiSeq 2000 sequencing platform. A total of 22,161,444 single-end reads were retrieved after quality filtering. Available (e.g., De-Bruijn/Eulerian graph) and in-house developed bioinformatics tools were used for assembly and annotation of the transcriptome. A total of 101,141 assembled transcripts were obtained, with a coverage size of 22.42 Mb and an average length of 221 bp. Guanine-cytosine (GC) content was found to be 44%. Bioinformatics analysis, using non-redundant proteins, gene ontology (GO), enzyme commission (EC) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, extracted all the known enzymes involved in saponin and flavonoid biosynthesis. A few genes of alkaloid biosynthesis, along with anticancer and plant defense genes, were also discovered. Additionally, several cytochrome P450 (CYP450) and glycosyltransferase unique sequences were also found. We identified simple sequence repeat motifs in transcripts with an abundance of di-nucleotide simple sequence repeat (SSR; 43.1%) markers. Large-scale expression profiling through Reads per Kilobase per Million mapped reads (RPKM) showed major genes involved in different metabolic pathways of the plant. Genes, expressed sequence tags (ESTs) and unique sequences from this study provide an important resource for the scientific community interested in the molecular genetics and functional genomics of C. borivilianum. PMID:24376689
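The RPKM measure named above normalizes a transcript's read count by both transcript length (in kilobases) and sequencing depth (in millions of mapped reads). The following is a minimal sketch of the standard definition; the function name and the example numbers are hypothetical and are not values from the study.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    RPKM = reads * 1e9 / (transcript length in bp * total mapped reads),
    which is the same as dividing the read count by length in kb and by
    library size in millions.
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical example: 100 reads on a 1,000 bp transcript in a library
# of 1,000,000 mapped reads.
example_value = rpkm(100, 1_000, 1_000_000)
```

Because the value scales inversely with length, a 221 bp transcript and a 2,210 bp transcript with the same read count would differ tenfold in RPKM.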
Investigation of mRNA quadruplex formation in Escherichia coli.
Wieland, Markus; Hartig, Jörg S
2009-01-01
The protocol presented here allows for the investigation of the formation of unusual nucleic acid structures in the 5'-untranslated region (UTR) of bacteria by correlating gene expression levels to the in vitro stability of the respective structure. In particular, we describe the introduction of G-quadruplex forming sequences close to the ribosome-binding site (RBS) on the mRNA of a reporter gene and the subsequent read-out of the expression levels. Insertion of a stable secondary structure results in the cloaking of RBS and eventually reduced gene expression levels. The structures and stability of the introduced sequences are further characterized by circular dichroism (CD) spectroscopy and thermal melting experiments. The extent of inhibition is then correlated to the stability of the respective quadruplex structure, allowing judgement of whether factors other than thermodynamic stability affect the formation of a given quadruplex sequence in vivo. Measuring gene expression levels takes 2 d including cloning; CD experiments take 5 hours per experiment.
Hamaguchi-Hamada, Kayoko; Kurumata-Shigeto, Mami; Minobe, Sumiko; Fukuoka, Nozomi; Sato, Manami; Matsufuji, Miyuki; Koizumi, Osamu; Hamada, Shun
2016-01-01
The head region of Hydra, the hypostome, is a key body part for developmental control and the nervous system. We herein examined genes specifically expressed in the head region of Hydra oligactis using suppression subtractive hybridization (SSH) cloning. A total of 1414 subtracted clones were sequenced and found to be derived from at least 540 different genes by BLASTN analyses. Approximately 25% of the subtracted clones had sequences encoding thrombospondin type-1 repeat (TSR) domains, and were derived from 17 genes. We identified 11 TSR domain-containing genes among the top 36 genes that were the most frequently detected in our SSH library. Whole-mount in situ hybridization analyses confirmed that at least 13 out of 17 TSR domain-containing genes were expressed in the hypostome of Hydra oligactis. The prominent expression of TSR domain-containing genes suggests that these genes play significant roles in the hypostome of Hydra oligactis. PMID:27043211
Reading, Benjamin J; Chapman, Robert W; Schaff, Jennifer E; Scholl, Elizabeth H; Opperman, Charles H; Sullivan, Craig V
2012-02-21
The striped bass and its relatives (genus Morone) are important fisheries and aquaculture species native to estuaries and rivers of the Atlantic coast and Gulf of Mexico in North America. To open avenues of gene expression research on reproduction and breeding of striped bass, we generated a collection of expressed sequence tags (ESTs) from a complementary DNA (cDNA) library representative of their ovarian transcriptome. Sequences of a total of 230,151 ESTs (51,259,448 bp) were acquired by Roche 454 pyrosequencing of cDNA pooled from ovarian tissues obtained at all stages of oocyte growth, at ovulation (eggs), and during preovulatory atresia. Quality filtering of ESTs allowed assembly of 11,208 high-quality contigs ≥ 100 bp, including 2,984 contigs 500 bp or longer (average length 895 bp). Blastx comparisons revealed 5,482 gene orthologues (E-value < 10^-3), of which 4,120 (36.7% of total contigs) were annotated with Gene Ontology terms (E-value < 10^-6). There were 5,726 remaining unknown unique sequences (51.1% of total contigs). All of the high-quality EST sequences are available in the National Center for Biotechnology Information (NCBI) Short Read Archive (GenBank: SRX007394). Informative contigs were considered to be abundant if they were assembled from groups of ESTs comprising ≥ 0.15% of the total short read sequences (≥ 345 reads/contig). Approximately 52.5% of these abundant contigs were predicted to have predominant ovary expression through digital differential display in silico comparisons to zebrafish (Danio rerio) UniGene orthologues. Over 1,300 Gene Ontology terms from Biological Process classes of Reproduction, Reproductive process, and Developmental process were assigned to this collection of annotated contigs. This first large reference sequence database available for the ecologically and economically important temperate basses (genus Morone) provides a foundation for gene expression studies in these species. 
The predicted predominance of ovary gene expression and assignment of directly relevant Gene Ontology classes suggests a powerful utility of this dataset for analysis of ovarian gene expression related to fundamental questions of oogenesis. Additionally, a high definition Agilent 60-mer oligo ovary 'UniClone' microarray with 8 × 15,000 probe format has been designed based on this striped bass transcriptome (eArray Group: Striper Group, Design ID: 029004).
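The abundance cutoff and the contig percentages quoted in this abstract follow directly from the reported counts. The short sketch below reproduces that arithmetic; the variable and function names are hypothetical, and only the figures themselves come from the abstract.

```python
TOTAL_ESTS = 230_151         # total ESTs reported
TOTAL_CONTIGS = 11_208       # high-quality contigs >= 100 bp
ABUNDANCE_FRACTION = 0.0015  # 0.15% of the total short read sequences


def abundance_threshold(total_reads, fraction):
    """Number of reads a contig needs to reach the given fraction of all reads."""
    return total_reads * fraction


def percent(part, whole):
    """Share of `part` in `whole`, expressed as a percentage."""
    return 100.0 * part / whole


threshold = abundance_threshold(TOTAL_ESTS, ABUNDANCE_FRACTION)
annotated_pct = percent(4_120, TOTAL_CONTIGS)  # contigs with GO annotation
unknown_pct = percent(5_726, TOTAL_CONTIGS)    # unknown unique sequences
```

The threshold works out to just over 345 reads, which matches the "≥ 345 reads/contig" figure, and the two shares come to roughly 36.7% and 51.1% of contigs.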
Tang, Bin; Liu, Xiao-Jun; Shi, Zuo-Kun; Shen, Qi-Da; Xu, Yan-Xia; Wang, Su; Zhang, Fan; Wang, Shi-Gui
2017-06-01
Harmonia axyridis is an important predatory lady beetle that is a natural enemy of agricultural and forestry pests. In this research, cold hardiness-induced genes and their expression changes in H. axyridis were screened by transcriptome analysis and quantitative real-time PCR under normal and low temperatures, using high-throughput transcriptome and digital gene-expression-tag technologies. We obtained a 10 Gb transcriptome and an 8 Mb gene expression tag pool using Illumina deep sequencing technology and RNA-Seq analysis (accession number SRX540102). Of the 46,980 non-redundant unigenes identified, 28,037 (59.7%) were matched to known genes in GenBank, 21,604 (46.0%) in Swiss-Prot, 19,482 (41.5%) in Kyoto Encyclopedia of Genes and Genomes and 13,193 (28.1%) in Gene Ontology databases. Seventy-five percent of the unigene sequences had top matches with gene sequences from Tribolium castaneum. Results indicated that 60 genes regulated the entire cold-acclimation response, and, of these, seven genes were always up-regulated and five genes always down-regulated. Further screening revealed that six cold-resistant genes, E3 ubiquitin-protein ligase, transketolase, trehalase, serine/arginine repetitive matrix protein 2, glycerol kinase and sugar transporter SWEET1-like, play key roles in the response. Expression of a number of the differentially expressed genes was confirmed with quantitative real-time PCR (HaCS_Trans). The paper attempted to identify cold-resistance response genes and to study the potential mechanism by which cold acclimation enhances the insect's cold endurance. Information on these cold-resistance response genes will improve the development of low-temperature storage technology for natural enemy insects for future use in biological control. Copyright © 2017 Elsevier Inc. All rights reserved.
Integrated systems analysis reveals a molecular network underlying autism spectrum disorders
Li, Jingjing; Shi, Minyi; Ma, Zhihai; Zhao, Shuchun; Euskirchen, Ghia; Ziskin, Jennifer; Urban, Alexander; Hallmayer, Joachim; Snyder, Michael
2014-01-01
Autism is a complex disease whose etiology remains elusive. We integrated previously and newly generated data and developed a systems framework involving the interactome, gene expression and genome sequencing to identify a protein interaction module with members strongly enriched for autism candidate genes. Sequencing of 25 patients confirmed the involvement of this module in autism, which was subsequently validated using an independent cohort of over 500 patients. Expression of this module was dichotomized with a ubiquitously expressed subcomponent and another subcomponent preferentially expressed in the corpus callosum, which was significantly affected by our identified mutations in the network center. RNA-sequencing of the corpus callosum from patients with autism exhibited extensive gene mis-expression in this module, and our immunochemical analysis showed that the human corpus callosum is predominantly populated by oligodendrocyte cells. Analysis of functional genomic data further revealed a significant involvement of this module in the development of oligodendrocyte cells in mouse brain. Our analysis delineates a natural network involved in autism, helps uncover novel candidate genes for this disease and improves our understanding of its molecular pathology. PMID:25549968
Transcriptome sequencing of rhizome tissue of Sinopodophyllum hexandrum at two temperatures.
Kumari, Anita; Singh, Heikham Russiachand; Jha, Ashwani; Swarnkar, Mohit Kumar; Shankar, Ravi; Kumar, Sanjay
2014-10-07
Sinopodophyllum hexandrum is an endangered medicinal herb that commonly occurs at elevations of 2,400-4,500 m and is sensitive to temperature. The medicinal property of the species is attributed to the presence of podophyllotoxin in the rhizome tissue. The present work analyzed the transcriptome of rhizome tissue of S. hexandrum exposed to 15°C and 25°C to understand the temperature-mediated molecular responses, including those associated with podophyllotoxin biosynthesis. Deep sequencing of the transcriptome with an average coverage of 88.34X yielded 60,089 assembled transcript sequences representing 20,387 unique genes having homology to known genes. Fragments per kilobase of exon per million fragments mapped (FPKM) based expression analysis revealed that genes related to growth and development were over-expressed at 15°C, whereas genes involved in stress response were over-expressed at 25°C. There was a decreasing trend of podophyllotoxin accumulation at 25°C; the data were well supported by the expression of corresponding genes of the pathway. FPKM data were validated by quantitative real-time polymerase chain reaction data using a total of thirty-four genes, and a positive correlation between the two platforms of gene expression was obtained. Also, detailed analyses yielded cytochrome P450s, methyltransferases and glycosyltransferases which could be potential candidates for hitherto unidentified genes of the podophyllotoxin biosynthesis pathway. The present work revealed the temperature-responsive transcriptome of S. hexandrum on the Illumina platform. Data suggested expression of genes for growth and development and podophyllotoxin biosynthesis at 15°C, and prevalence of those associated with stress response at 25°C.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cairney, J.; Hays, D.; Stockand, J.D.
1991-05-01
The rangeland shrub Atriplex canescens (saltbush) is extremely drought-tolerant and is capable of growing at water potentials below −40 bar. To discover the molecular basis of this tolerance, the authors have isolated a number of cDNA clones of drought-stress induced genes. Analysis of the nucleotide sequence and expression of these genes in different tissues and in response to different stresses reveals the diversity of the stress response. Members of a drought-induced, multi-gene family have been sequenced. Although 95% homologous, non-conservative substitutions result in proteins of different tertiary structure. Additionally, the genes are expressed through a number of mature forms of mRNA which may arise by alternative RNA processing.
Occurrence and expression of gene transfer agent genes in marine bacterioplankton.
Biers, Erin J; Wang, Kui; Pennington, Catherine; Belas, Robert; Chen, Feng; Moran, Mary Ann
2008-05-01
Genes with homology to the transduction-like gene transfer agent (GTA) were observed in genome sequences of three cultured members of the marine Roseobacter clade. A broader search for homologs for this host-controlled virus-like gene transfer system identified likely GTA systems in cultured Alphaproteobacteria, and particularly in marine bacterioplankton representatives. Expression of GTA genes and extracellular release of GTA particles (approximately 50 to 70 nm) was demonstrated experimentally for the Roseobacter clade member Silicibacter pomeroyi DSS-3, and intraspecific gene transfer was documented. GTA homologs are surprisingly infrequent in marine metagenomic sequence data, however, and the role of this lateral gene transfer mechanism in ocean bacterioplankton communities remains unclear.
Occurrence and Expression of Gene Transfer Agent Genes in Marine Bacterioplankton
Biers, Erin J.; Wang, Kui; Pennington, Catherine; Belas, Robert; Chen, Feng; Moran, Mary Ann
2008-01-01
Genes with homology to the transduction-like gene transfer agent (GTA) were observed in genome sequences of three cultured members of the marine Roseobacter clade. A broader search for homologs for this host-controlled virus-like gene transfer system identified likely GTA systems in cultured Alphaproteobacteria, and particularly in marine bacterioplankton representatives. Expression of GTA genes and extracellular release of GTA particles (∼50 to 70 nm) was demonstrated experimentally for the Roseobacter clade member Silicibacter pomeroyi DSS-3, and intraspecific gene transfer was documented. GTA homologs are surprisingly infrequent in marine metagenomic sequence data, however, and the role of this lateral gene transfer mechanism in ocean bacterioplankton communities remains unclear. PMID:18359833
Phage-mediated Delivery of Targeted sRNA Constructs to Knock Down Gene Expression in E. coli.
Bernheim, Aude G; Libis, Vincent K; Lindner, Ariel B; Wintermute, Edwin H
2016-03-20
RNA-mediated knockdowns are widely used to control gene expression. This versatile family of techniques makes use of short RNA (sRNA) that can be synthesized with any sequence and designed to complement any gene targeted for silencing. Because sRNA constructs can be introduced to many cell types directly or using a variety of vectors, gene expression can be repressed in living cells without laborious genetic modification. The most common RNA knockdown technology, RNA interference (RNAi), makes use of the endogenous RNA-induced silencing complex (RISC) to mediate sequence recognition and cleavage of the target mRNA. Applications of this technique are therefore limited to RISC-expressing organisms, primarily eukaryotes. Recently, a new generation of RNA biotechnologists have developed alternative mechanisms for controlling gene expression through RNA, and so made possible RNA-mediated gene knockdowns in bacteria. Here we describe a method for silencing gene expression in E. coli that functionally resembles RNAi. In this system a synthetic phagemid is designed to express sRNA, which may be designed to target any sequence. The expression construct is delivered to a population of E. coli cells with non-lytic M13 phage, after which it is able to stably replicate as a plasmid. Antisense recognition and silencing of the target mRNA is mediated by the Hfq protein, endogenous to E. coli. This protocol includes methods for designing the antisense sRNA, constructing the phagemid vector, packaging the phagemid into M13 bacteriophage, preparing a live cell population for infection, and performing the infection itself. The fluorescent protein mKate2 and the antibiotic resistance gene chloramphenicol acetyltransferase (CAT) are targeted to generate representative data and to quantify knockdown effectiveness.
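The antisense-design step mentioned in the abstract above rests on base-pair complementarity: the sRNA targeting region is the reverse complement of a stretch of the target mRNA. The following Python sketch illustrates only that core idea. The function name and the example fragment are hypothetical, and the sketch ignores the Hfq-binding scaffold and any length or position rules the full protocol may specify.

```python
# Illustrative sketch only; not the design tool used in the protocol.
# Maps each RNA base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(target_mrna: str) -> str:
    """Return the RNA reverse complement (5'->3') of a target mRNA fragment."""
    # Accept DNA-style input by converting T to U.
    seq = target_mrna.upper().replace("T", "U")
    if set(seq) - set("ACGU"):
        raise ValueError("sequence contains characters other than A, C, G, U/T")
    # Complement every base, then reverse to keep 5'->3' orientation.
    return seq.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    example_target = "AUGGCUAGCAAAGGA"  # made-up fragment, not a real gene
    print(antisense_rna(example_target))  # UCCUUUGCUAGCCAU
```

In practice the targeting region would be chosen within the real mRNA (for example mKate2 or CAT) according to the protocol's own guidelines, which are not reproduced here.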
Asai, Shuta; Rallapalli, Ghanasyam; Piquerez, Sophie J M; Caillaud, Marie-Cécile; Furzer, Oliver J; Ishaque, Naveed; Wirthmueller, Lennart; Fabro, Georgina; Shirasu, Ken; Jones, Jonathan D G
2014-10-01
Plants have evolved strong innate immunity mechanisms, but successful pathogens evade or suppress plant immunity via effectors delivered into the plant cell. Hyaloperonospora arabidopsidis (Hpa) causes downy mildew on Arabidopsis thaliana, and a genome sequence is available for isolate Emoy2. Here, we exploit the availability of genome sequences for Hpa and Arabidopsis to measure gene-expression changes in both Hpa and Arabidopsis simultaneously during infection. Using a high-throughput cDNA tag sequencing method, we reveal expression patterns of Hpa predicted effectors and Arabidopsis genes in compatible and incompatible interactions, and promoter elements associated with Hpa genes expressed during infection. By resequencing Hpa isolate Waco9, we found it evades Arabidopsis resistance gene RPP1 through deletion of the cognate recognized effector ATR1. Arabidopsis salicylic acid (SA)-responsive genes including PR1 were activated not only at early time points in the incompatible interaction but also at late time points in the compatible interaction. By histochemical analysis, we found that Hpa suppresses SA-inducible PR1 expression, specifically in the haustoriated cells into which host-translocated effectors are delivered, but not in non-haustoriated adjacent cells. Finally, we found a highly-expressed Hpa effector candidate that suppresses responsiveness to SA. As this approach can be easily applied to host-pathogen interactions for which both host and pathogen genome sequences are available, this work opens the door towards transcriptome studies in infection biology that should help unravel pathogen infection strategies and the mechanisms by which host defense responses are overcome.
Oliveira-Neto, Osmundo B; Batista, João A N; Rigden, Daniel J; Fragoso, Rodrigo R; Silva, Rodrigo O; Gomes, Eliane A; Franco, Octávio L; Dias, Simoni C; Cordeiro, Célia M T; Monnerat, Rose G; Grossi-De-Sá, Maria F
2004-09-01
Fourteen different cDNA fragments encoding serine proteinases were isolated by reverse transcription-PCR from cotton boll weevil (Anthonomus grandis) larvae. A large diversity between the sequences was observed, with a mean pairwise identity of 22% in the amino acid sequence. The cDNAs encompassed 11 trypsin-like sequences classifiable into three families and three chymotrypsin-like sequences belonging to a single family. Using a combination of 5' and 3' RACE, the full-length sequence was obtained for five of the cDNAs, named Agser2, Agser5, Agser6, Agser10 and Agser21. The encoded proteins included amino acid sequence motifs of serine proteinase active sites, conserved cysteine residues, and both zymogen activation and signal peptides. Southern blotting analysis suggested that one or two copies of these serine proteinase genes exist in the A. grandis genome. Northern blotting analysis of Agser2 and Agser5 showed that for both genes, expression is induced upon feeding and is concentrated in the gut of larvae and adult insects. Reverse northern analysis of the 14 cDNA fragments showed that only two trypsin-like and two chymotrypsin-like sequences were expressed at detectable levels. Under the effect of the serine proteinase inhibitors soybean Kunitz trypsin inhibitor and black-eyed pea trypsin/chymotrypsin inhibitor, expression of one of the trypsin-like sequences was upregulated while expression of the two chymotrypsin-like sequences was downregulated. Copyright 2004 Elsevier Ltd.
Structure, Expression, Chromosomal Location and Product of the Gene Encoding Adh2 in Petunia
Gregerson, R. G.; Cameron, L.; McLean, M.; Dennis, P.; Strommer, J.
1993-01-01
In most higher plants the genes encoding alcohol dehydrogenase comprise a small gene family, usually with two members. The Adh1 gene of Petunia has been cloned and analyzed, but a second identifiable gene was not recovered from any of three genomic libraries. We have therefore employed the polymerase chain reaction to obtain the major portion of a second Adh gene. From sequence, mapping and northern data we conclude this gene encodes ADH2, the major anaerobically inducible Adh gene of Petunia. The availability of both Adh1 and Adh2 from Petunia has permitted us to compare their structures and patterns of expression to those of the well-studied Adh genes of maize, of which one is highly expressed developmentally, while both are induced in response to hypoxia. Despite their evolutionary distance, evidenced by deduced amino acid sequence as well as taxonomic classification, the pairs of genes are regulated in strikingly similar ways in maize and Petunia. Our findings suggest a significant biological basis for the regulatory strategy employed by these distant species for differential expression of multiple Adh genes. PMID:8096485
Molecular cloning and expression of the calmodulin gene from guinea pig hearts.
Feng, Rui; Liu, Yan; Sun, Xuefei; Wang, Yan; Hu, Huiyuan; Guo, Feng; Zhao, Jinsheng; Hao, Liying
2015-06-01
The aim of the present study was to isolate and characterize a complementary DNA (cDNA) clone encoding the calmodulin (CaM; GenBank accession no. FJ012165) gene from guinea pig hearts. The CaM gene was amplified from cDNA collected from guinea pig hearts and inserted into a pGEM®-T Easy vector. Subsequently, CaM nucleotide and protein sequence similarity analysis was conducted between guinea pigs and other species. In addition, reverse transcription-polymerase chain reaction (RT-PCR) was performed to investigate the CaM 3 expression patterns in different guinea pig tissues. Sequence analysis revealed that the CaM gene isolated from the guinea pig heart had ∼90% sequence identity with the CaM 3 genes in humans, mice and rats. Furthermore, the deduced peptide sequences of CaM 3 in the guinea pig showed 100% homology to the CaM proteins from other species. In addition, the RT-PCR results indicated that CaM 3 was widely and differentially expressed in guinea pigs. In conclusion, the current study provided valuable information with regard to the cloning and expression of CaM 3 in guinea pig hearts. These findings may be helpful for understanding the function of CaM3 and the possible role of CaM3 in cardiovascular diseases.
Le Bras, Stéphanie; Cohen-Tannoudji, Michel; Guyot, Valérie; Vandormael-Pournin, Sandrine; Coumailleau, Franck; Babinet, Charles; Baldacci, Patricia
2002-08-21
The DDK syndrome is defined as the embryonic lethality of F1 mouse embryos from crosses between DDK females and males from other strains (named hereafter as non-DDK strains). Genetically controlled by the Ovum mutant (Om) locus, it is due to a deleterious interaction between a maternal factor present in DDK oocytes and the non-DDK paternal pronucleus. Therefore, the DDK syndrome constitutes a unique genetic tool to study the crucial interactions that take place between the parental genomes and the egg cytoplasm during mammalian development. In this paper, we present an extensive analysis performed by exon trapping on the Om region. Twenty-seven trapped sequences were from genes in the databases: beta-adaptin, CCT zeta2, DNA LigaseIII, Notchless, Rad51l3 and Scya1. Twenty-eight other sequences presented similarities with expressed sequence tags and genomic sequences whereas 57 did not. The pattern of expression of 37 of these markers was established. Importantly, five of them are expressed in DDK oocytes and are candidate genes for the maternal factor, and 20 are candidate genes for the paternal factor since they are expressed in testis. This data is an important step towards identifying the genes responsible for the DDK syndrome.
Makita, Yuko; Kawashima, Mika; Lau, Nyok Sean; Othman, Ahmad Sofiman; Matsui, Minami
2018-01-19
Natural rubber is an economically important material. Currently, the Pará rubber tree, Hevea brasiliensis, is the main commercial source. Little is known about rubber biosynthesis at the molecular level. Next-generation sequencing (NGS) technologies have yielded draft genomes of three rubber cultivars and a variety of RNA sequencing (RNA-seq) data. However, no current genome or transcriptome databases (DB) are organized by gene. A gene-oriented database is a valuable support for rubber research. Based on our original draft genome sequence of H. brasiliensis RRIM600, we constructed a rubber tree genome and transcriptome DB. Our DB provides genome information including gene functional annotations and multi-transcriptome data of RNA-seq, full-length cDNAs including PacBio Isoform sequencing (Iso-Seq), ESTs and genome-wide transcription start sites (TSSs) derived from CAGE technology. Using our original and publicly available RNA-seq data, we calculated co-expressed genes for identifying functionally related gene sets and/or genes regulated by the same transcription factor (TF). Users can access multi-transcriptome data through both a gene-oriented web page and a genome browser. For the gene searching system, we provide keyword search, sequence homology search and gene expression search; users can also select their expression threshold easily. The rubber genome and transcriptome DB provides rubber tree genome sequence and multi-transcriptomics data. This DB is useful for comprehensive understanding of the rubber transcriptome. It will assist both industrial and academic researchers working on rubber and economically important close relatives such as R. communis, M. esculenta and J. curcas. The Rubber Transcriptome DB release 2017.03 is accessible at http://matsui-lab.riken.jp/rubber/.
Isolation and characterization of a water stress-specific genomic gene, pwsi 18, from rice.
Joshee, N; Kisaka, H; Kitagawa, Y
1998-01-01
One of the water stress-specific cDNA clones of rice characterised previously, wsi18, was selected for further study. The wsi18 gene can be induced by water stress conditions such as mannitol, NaCl, and dryness, but not by ABA, cold, or heat. A genomic clone for wsi18, pwsi18, contained about 1.7 kbp of the 5' upstream sequence, two introns, and the full coding sequence. The 5'-upstream sequence of pwsi18 contained putative cis-acting elements, namely an ABA-responsive element (ABRE), three G-boxes, three E-boxes, a MEF-2 sequence, four direct and two inverted repeats, and four sequences similar to DRE, which is involved in the dehydration response of Arabidopsis genes. The gusA reporter gene under the control of the pwsi18 promoter showed transient expression in response to water stress. Deletion of the downstream DRE-like sequence between the distal G-boxes-2 and -3 resulted in rather low GUS expression.
Deng, Youping; Dong, Yinghua; Thodima, Venkata; Clem, Rollie J; Passarelli, A Lorena
2006-01-01
Background: Little is known about the genome sequences of lepidopteran insects, although this group of insects has been studied extensively in the fields of endocrinology, development, immunity, and pathogen-host interactions. In addition, cell lines derived from Spodoptera frugiperda and other lepidopteran insects are routinely used for baculovirus foreign gene expression. This study reports the results of an expressed sequence tag (EST) sequencing project in cells from the lepidopteran insect S. frugiperda, the fall armyworm. Results: We have constructed an EST database using two cDNA libraries from the S. frugiperda-derived cell line, SF-21. The database consists of 2,367 ESTs which were assembled into 244 contigs and 951 singlets for a total of 1,195 unique sequences. Conclusion: S. frugiperda is an agriculturally important pest insect and genomic information will be instrumental for establishing initial transcriptional profiling and gene function studies, and for obtaining information about genes manipulated during infections by insect pathogens such as baculoviruses. PMID:17052344
Kenney, S; Kamine, J; Markovitz, D; Fenrick, R; Pagano, J
1988-03-01
Acquired immunodeficiency syndrome patients are frequently coinfected with Epstein-Barr virus (EBV). In this report, we demonstrate that an EBV immediate-early gene product, BamHI MLF1, stimulates expression of the bacterial chloramphenicol acetyltransferase (CAT) gene linked to the human immunodeficiency virus (HIV) promoter. The HIV promoter sequences necessary for trans-activation by EBV do not include the tat-responsive sequences. In addition, in contrast to the other herpesvirus trans-activators previously studied, the EBV BamHI MLF1 gene product appears to function in part by a posttranscriptional mechanism, since it increases pHIV-CAT protein activity more than it increases HIV-CAT mRNA. This ability of an EBV gene product to activate HIV gene expression may have biologic consequences in persons coinfected with both viruses.
Liu, Jingjing; Yin, Tongming; Ye, Ning; Chen, Yingnan; Yin, Tingting; Liu, Min; Hassani, Danial
2013-01-01
Background: The dioecious system is relatively rare in plants. Shrub willow is an annual flowering dioecious woody plant, and possesses many characteristics that make it a good model for tracking the missing pieces of sex determination evolution. To gain a global view of the genes differentially expressed in male and female shrub willows and to develop a database for further studies, we performed large-scale transcriptome sequencing of flower buds collected separately from the two sexes. Results: In total, 1,201,931 high quality reads were obtained, with an average length of 389 bp and a total length of 467.96 Mb. The ESTs were assembled into 29,048 contigs and 132,709 singletons. These unigenes were further functionally annotated by comparing their sequences to different protein and functional domain databases and were assigned Gene Ontology (GO) terms. A biochemical pathway database containing 291 predicted pathways was also created based on the annotations of the unigenes. Digital expression analysis identified 806 differentially expressed genes between the male and female flower buds. Of these, 33 were located on the incipient sex chromosome of Salicaceae, among which 12 genes might be involved in plant sex determination. These genes deserve special attention in future studies. Conclusions: In this study, a large number of EST sequences were generated from the flower buds of a male and a female shrub willow. We also reported the differentially expressed genes between the flowers of the two sexes. This work provides valuable information and sequence resources for uncovering the sex determining genes and for future functional genomics analysis of Salicaceae spp. PMID:23560075
Andrews, T Daniel; Gojobori, Takashi
2004-01-01
The PilE protein is the major component of the Neisseria meningitidis pilus, which is encoded by the pilE/pilS locus that includes an expressed gene and eight homologous silent fragments. The silent gene fragments have been shown to recombine through gene conversion with the expressed gene and thereby provide a means by which novel antigenic variants of the PilE protein can be generated. We have analyzed the evolutionary rate of the pilE gene using the nucleotide sequence of two complete pilE/pilS loci. The very high rate of evolution displayed by the PilE protein appears driven by both recombination and positive selection. Within the semivariable region of the pilE and pilS genes, recombination appears to occur within multiple small sequence blocks that lie between conserved sequence elements. Within the hypervariable region, positive selection was identified from comparison of the silent and expressed genes. The unusual gene conversion mechanism that operates at the pilE/pilS locus is a strategy employed by N. meningitidis to enhance mutation of certain regions of the PilE protein. The silent copies of the gene effectively allow "parallelized" evolution of pilE, thus enabling the encoded protein to rapidly explore a large area of sequence space in an effort to find novel antigenic variants.
Hu, H M; Chuang, C K; Lee, M J; Tseng, T C; Tang, T K
2000-11-01
We previously reported two novel testis-specific serine/threonine kinases, Aie1 (mouse) and AIE2 (human), that share high amino acid identities with the kinase domains of fly aurora and yeast Ipl1. Here, we report the entire intron-exon organization of the Aie1 gene and analyze the expression patterns of Aie1 mRNA during testis development. The mouse Aie1 gene spans approximately 14 kb and contains seven exons. The sequences of the exon-intron boundaries of the Aie1 gene conform to the consensus sequences (GT/AG) of the splicing donor and acceptor sites of most eukaryotic genes. Comparative genomic sequencing revealed that the gene structure is highly conserved between mouse Aie1 and human AIE2. However, much less homology was found in the sequence outside the kinase-coding domains. The Aie1 locus was mapped to mouse chromosome 7A2-A3 by fluorescent in situ hybridization. Northern blot analysis indicates that Aie1 mRNA likely is expressed at a low level on day 14 and reaches its plateau on day 21 in the developing postnatal testis. RNA in situ hybridization indicated that the expression of the Aie1 transcript was restricted to meiotically active germ cells, with the highest levels detected in spermatocytes at the late pachytene stage. These findings suggest that Aie1 plays a role in spermatogenesis.
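The GT/AG consensus mentioned in the abstract above refers to the dinucleotides found at the 5' (donor) and 3' (acceptor) ends of most eukaryotic introns. As a hedged illustration, the Python sketch below checks whether an intron sequence follows that rule. The function name and the example sequences are hypothetical and are not taken from the Aie1 gene.

```python
# Illustrative sketch only; the intron sequences below are invented.
def follows_gt_ag_rule(intron: str) -> bool:
    """Return True if an intron (DNA, 5'->3') starts with GT and ends with AG."""
    intron = intron.upper()
    # Require at least four bases so the donor and acceptor sites do not overlap.
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


if __name__ == "__main__":
    candidate_introns = {
        "canonical": "GTAAGTCTTTTCAG",   # GT ... AG
        "non_canonical": "GCAAGTTTTCAG",  # GC donor, fails the simple rule
    }
    for name, seq in candidate_introns.items():
        print(name, follows_gt_ag_rule(seq))
```

A real analysis would extract intron coordinates from an alignment of cDNA to genomic sequence; this check only tests the terminal dinucleotides.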
Pirooznia, Mehdi; Gong, Ping; Guan, Xin; Inouye, Laura S; Yang, Kuan; Perkins, Edward J; Deng, Youping
2007-01-01
Background: Eisenia fetida, commonly known as red wiggler or compost worm, belongs to the Lumbricidae family of the Annelida phylum. Little is known about its genome sequence although it has been extensively used as a test organism in terrestrial ecotoxicology. In order to understand its gene expression response to environmental contaminants, we cloned 4032 cDNAs or expressed sequence tags (ESTs) from two E. fetida libraries enriched with genes responsive to ten ordnance related compounds using suppressive subtractive hybridization-PCR. Results: A total of 3144 good quality ESTs (GenBank dbEST accession number EH669363–EH672369 and EL515444–EL515580) were obtained from the raw clone sequences after cleaning. Clustering analysis yielded 2231 unique sequences including 448 contigs (from 1361 ESTs) and 1783 singletons. Comparative genomic analysis showed that 743 or 33% of the unique sequences shared high similarity with existing genes in the GenBank nr database. Provisional function annotation assigned 830 Gene Ontology terms to 517 unique sequences based on their homology with the annotated genomes of four model organisms Drosophila melanogaster, Mus musculus, Saccharomyces cerevisiae, and Caenorhabditis elegans. Seven percent of the unique sequences were further mapped to 99 Kyoto Encyclopedia of Genes and Genomes pathways based on their matching Enzyme Commission numbers. All the information is stored and retrievable in a high-performance, web-based and user-friendly relational database called EST model database or ESTMD version 2. Conclusion: The ESTMD containing the sequence and annotation information of 4032 E. fetida ESTs is publicly accessible at . PMID:18047730
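The clustering figures reported in the abstract above can be cross-checked with simple arithmetic: unique sequences are the sum of contigs and singletons, and the similarity rate is the matched fraction of that sum. The Python sketch below uses only numbers stated in the abstract. The function name is hypothetical.

```python
# Illustrative consistency check using only figures quoted in the abstract.
def clustering_summary(contigs: int, singletons: int, matched: int) -> tuple[int, int]:
    """Return (unique sequences, percent of unique sequences with a database match)."""
    unique = contigs + singletons
    percent_matched = round(100 * matched / unique)
    return unique, percent_matched


if __name__ == "__main__":
    unique, percent = clustering_summary(contigs=448, singletons=1783, matched=743)
    print(unique)   # 2231, as reported
    print(percent)  # 33, as reported
```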
Men, Lina; Yan, Shanchun; Liu, Guanjun
2013-08-13
Larix gmelinii is a dominant tree species in China's boreal forests and plays an important role in the coniferous ecosystem. It is also one of the most economically important tree species in the Chinese timber industry due to the excellent water resistance and anti-corrosion properties of its wood products. Unfortunately, in Northeast China, L. gmelinii often suffers from serious attacks by diseases and insects. The application of exogenous volatile semiochemicals may induce and enhance its resistance against insect or disease attacks; however, little is known regarding the genes and molecular mechanisms related to induced resistance. We performed de novo sequencing and assembly of the L. gmelinii transcriptome using a short read sequencing technology (Illumina). Chemical defenses of L. gmelinii seedlings were induced with jasmonic acid (JA) or methyl jasmonate (MeJA) for 6 hours. Transcriptomes were compared between seedlings induced by JA, MeJA and untreated controls using a tag-based digital gene expression profiling system. In a single run, 25,977,782 short reads were produced and 51,157 unigenes were obtained with a mean length of 517 nt. We sequenced 3 digital gene expression libraries and generated between 3.5 and 5.9 million raw tags, and obtained 52,040 reliable reference genes after removing redundancy. The expression of disease/insect-resistance genes (e.g., phenylalanine ammonia-lyase, coumarate 3-hydroxylase, lipoxygenase, allene oxide synthase and allene oxide cyclase) was up-regulated. The expression profiles of some abundant genes under different elicitor treatments were studied by using real-time qRT-PCR. The results showed that the expression levels of disease/insect-resistance genes in the seedling samples induced by JA and MeJA were higher than those in the control group. The seedlings induced with MeJA showed the strongest increases in disease/insect-resistance genes. Both JA- and MeJA-induced seedlings of L. gmelinii showed significantly increased expression of disease/insect-resistance genes. MeJA seemed to have a stronger induction effect than JA on expression of disease/insect-resistance related genes. This study provides sequence resources for L. gmelinii research and will help us to better understand the functions of disease/insect-resistance genes and the molecular mechanisms of secondary metabolism in L. gmelinii.
Mishra, Amrendra; Sriram, Harshini; Chandarana, Pinal; Tanavde, Vivek; Kumar, Rekha V; Gopinath, Ashok; Govindarajan, Raman; Ramaswamy, S; Sadasivam, Subhashini
2018-05-01
The goal of this study was to isolate cancer stem-like cells marked by high expression of CD44, a putative cancer stem cell marker, from primary oral squamous cell carcinomas and identify distinctive gene expression patterns in these cells. From 1 October 2013 to 4 September 2015, 76 stage III-IV primary oral squamous cell carcinomas of the gingivobuccal sulcus were resected. In all, 13 tumours were analysed by immunohistochemistry to visualise CD44-expressing cells. Expression of CD44 within The Cancer Genome Atlas-Head and Neck Squamous Cell Carcinoma RNA-sequencing data was also assessed. Seventy resected tumours were dissociated into single cells and stained with antibodies to CD44 as well as CD45 and CD31 (together referred to as Lineage/Lin). From 45 of these, CD44+ Lin− and CD44− Lin− subpopulations were successfully isolated using fluorescence-activated cell sorting, and good-quality RNA was obtained from 14 such sorted pairs. Libraries from five pairs were sequenced and the results analysed using bioinformatics tools. Reverse transcription quantitative polymerase chain reaction was performed to experimentally validate the differential expression of selected candidate genes identified from the transcriptome sequencing in the same 5 and an additional 9 tumours. CD44 was expressed on the surface of poorly differentiated tumour cells, and within The Cancer Genome Atlas-Head and Neck Squamous Cell Carcinoma samples, its messenger RNA levels were higher in tumours compared to normal. Transcriptomics revealed that 102 genes were upregulated and 85 genes were downregulated in CD44+ Lin− compared to CD44− Lin− cells in at least 3 of the 5 tumours sequenced. The upregulated genes included those involved in immune regulation, while the downregulated genes were enriched for genes involved in cell adhesion. Decreased expression of PCDH18, MGP, SPARCL1 and KRTDAP was confirmed by reverse transcription quantitative polymerase chain reaction. Lower expression of the cell-cell adhesion molecule PCDH18 correlated with poorer overall survival in The Cancer Genome Atlas-Head and Neck Squamous Cell Carcinoma data, highlighting it as a potential negative prognostic factor in this cancer.
Stajdohar, Miha; Rosengarten, Rafael D; Kokosar, Janez; Jeran, Luka; Blenkus, Domen; Shaulsky, Gad; Zupan, Blaz
2017-06-02
Dictyostelium discoideum, a soil-dwelling social amoeba, is a model for the study of numerous biological processes. Research in the field has benefited mightily from the adoption of next-generation sequencing for genomics and transcriptomics. Dictyostelium biologists now face the widespread challenges of analyzing and exploring high dimensional data sets to generate hypotheses and discover novel insights. We present dictyExpress (2.0), a web application designed for exploratory analysis of gene expression data, as well as data from related experiments such as Chromatin Immunoprecipitation sequencing (ChIP-Seq). The application features visualization modules that include time course expression profiles, clustering, gene ontology enrichment analysis, differential expression analysis and comparison of experiments. All visualizations are interactive and interconnected, such that the selection of genes in one module propagates instantly to visualizations in other modules. dictyExpress currently stores the data from over 800 Dictyostelium experiments and is embedded within a general-purpose software framework for management of next-generation sequencing data. dictyExpress allows users to explore their data in a broader context by reciprocal linking with dictyBase, a repository of Dictyostelium genomic data. In addition, we introduce a companion application called GenBoard, an intuitive graphic user interface for data management and bioinformatics analysis. dictyExpress and GenBoard enable broad adoption of next generation sequencing based inquiries by the Dictyostelium research community. Labs without the means to undertake deep sequencing projects can mine the data available to the public. The entire information flow, from raw sequence data to hypothesis testing, can be accomplished in an efficient workspace. The software framework is generalizable and represents a useful approach for any research community. To encourage wider use, the backend is open-source, available for extension and further development by bioinformaticians and data scientists.
de Vries, Ronald P.; van den Broeck, Hetty C.; Dekkers, Ester; Manzanares, Paloma; de Graaff, Leo H.; Visser, Jaap
1999-01-01
A gene encoding a third α-galactosidase (AglB) from Aspergillus niger has been cloned and sequenced. The gene consists of an open reading frame of 1,750 bp containing six introns. The gene encodes a protein of 443 amino acids which contains a eukaryotic signal sequence of 16 amino acids and seven putative N-glycosylation sites. The mature protein has a calculated molecular mass of 48,835 Da and a predicted pI of 4.6. An alignment of the AglB amino acid sequence with those of other α-galactosidases revealed that it belongs to a subfamily of α-galactosidases that also includes A. niger AglA. A. niger AglC belongs to a different subfamily that consists mainly of prokaryotic α-galactosidases. The expression of aglA, aglB, aglC, and lacA, the latter of which encodes an A. niger β-galactosidase, has been studied by using a number of monomeric, oligomeric, and polymeric compounds as growth substrates. Expression of aglA is only detected on galactose and galactose-containing oligomers and polymers. The aglB gene is expressed on all of the carbon sources tested, including glucose. Elevated expression was observed on xylan, which could be assigned to regulation via XlnR, the xylanolytic transcriptional activator. Expression of aglC was only observed on glucose, fructose, and combinations of glucose with xylose and galactose. High expression of lacA was detected on arabinose, xylose, xylan, and pectin. Similar to aglB, the expression on xylose and xylan can be assigned to regulation via XlnR. All four genes have distinct expression patterns which seem to mirror the natural substrates of the encoded proteins. PMID:10347026
Searching for resistance genes to Bursaphelenchus xylophilus using high throughput screening.
Santos, Carla S; Pinheiro, Miguel; Silva, Ana I; Egas, Conceição; Vasconcelos, Marta W
2012-11-07
Pine wilt disease (PWD), caused by the pinewood nematode (PWN; Bursaphelenchus xylophilus), damages and kills pine trees and is causing serious economic damage worldwide. Although the ecological mechanism of infestation is well described, the plant's molecular response to the pathogen is not well known. This is due mainly to the lack of genomic information and the complexity of the disease. High throughput sequencing is now an efficient approach for detecting the expression of genes in non-model organisms, thus providing valuable information in spite of the lack of the genome sequence. In an attempt to unravel genes potentially involved in the pine defense against the pathogen, we hereby report the high throughput comparative sequence analysis of infested and non-infested stems of Pinus pinaster (very susceptible to PWN) and Pinus pinea (less susceptible to PWN). Four cDNA libraries from infested and non-infested stems of P. pinaster and P. pinea were sequenced in a full 454 GS FLX run, producing a total of 2,083,698 reads. The putative amino acid sequences encoded by the assembled transcripts were annotated according to Gene Ontology, to assign Pinus contigs into Biological Processes, Cellular Components and Molecular Functions categories. Most of the annotated transcripts corresponded to Picea genes (25.4-39.7%), whereas a smaller percentage matched Pinus genes (1.8-12.8%), probably a consequence of more public genomic information being available for Picea than for Pinus. The comparative transcriptome analysis showed that when P. pinaster was infested with PWN, the genes malate dehydrogenase, ABA, water deficit stress related genes and PAR1 were highly expressed, while in PWN-infested P. pinea, the highly expressed genes were ricin B-related lectin, and genes belonging to the SNARE and high mobility group families. Quantitative PCR experiments confirmed the differential gene expression between the two pine species. 
Defense-related genes triggered by nematode infestation were detected in both P. pinaster and P. pinea transcriptomes utilizing 454 pyrosequencing technology. P. pinaster showed higher abundance of genes related to transcriptional regulation, terpenoid secondary metabolism (including some with nematicidal activity) and pathogen attack. P. pinea showed higher abundance of genes related to oxidative stress and higher levels of expression in general of stress responsive genes. This study provides essential information about the molecular defense mechanisms utilized by P. pinaster and P. pinea against PWN infestation and contributes to a better understanding of PWD.
Expression map of a complete set of gustatory receptor genes in chemosensory organs of Bombyx mori.
Guo, Huizhen; Cheng, Tingcai; Chen, Zhiwei; Jiang, Liang; Guo, Youbing; Liu, Jianqiu; Li, Shenglong; Taniai, Kiyoko; Asaoka, Kiyoshi; Kadono-Okuda, Keiko; Arunkumar, Kallare P; Wu, Jiaqi; Kishino, Hirohisa; Zhang, Huijie; Seth, Rakesh K; Gopinathan, Karumathil P; Montagné, Nicolas; Jacquin-Joly, Emmanuelle; Goldsmith, Marian R; Xia, Qingyou; Mita, Kazuei
2017-03-01
Most lepidopteran species are herbivores, and interaction with host plants affects their gene expression and behavior as well as their genome evolution. Gustatory receptors (Grs) are expected to mediate host plant selection, feeding, oviposition and courtship behavior. However, due to their high diversity, sequence divergence and extremely low level of expression it has been difficult to identify precisely a complete set of Grs in Lepidoptera. By manual annotation and BAC sequencing, we improved annotation of 43 gene sequences compared with previously reported Grs in the most studied lepidopteran model, the silkworm, Bombyx mori, and identified 7 new tandem copies of BmGr30 on chromosome 7, bringing the total number of BmGrs to 76. Among these, we mapped 68 genes to chromosomes in a newly constructed chromosome distribution map and 8 genes to scaffolds; we also found new evidence for large clusters of BmGrs, especially from the bitter receptor family. RNA-seq analysis of diverse BmGr expression patterns in chemosensory organs of larvae and adults enabled us to draw a precise organ specific map of BmGr expression. Interestingly, most of the clustered genes were expressed in the same tissues and more than half of the genes were expressed in larval maxillae, larval thoracic legs and adult legs. For example, BmGr63 showed high expression levels in all organs in both larval and adult stages. By contrast, some genes showed expression limited to specific developmental stages or organs and tissues. BmGr19 was highly expressed in larval chemosensory organs (especially antennae and thoracic legs), the single exon genes BmGr53 and BmGr67 were expressed exclusively in larval tissues, the BmGr27-BmGr31 gene cluster on chr7 displayed a high expression level limited to adult legs and the candidate CO2 receptor BmGr2 was highly expressed in adult antennae, where few other Grs were expressed. Transcriptional analysis of the Grs in B. mori provides a valuable new reference for finding genes involved in plant-insect interactions in Lepidoptera and establishing correlations between these genes and vital insect behaviors like host plant selection and courtship for mating. Copyright © 2017 The Author(s). Published by Elsevier Ltd. All rights reserved.
Hoang, Van L T; Innes, David J; Shaw, P Nicholas; Monteith, Gregory R; Gidley, Michael J; Dietzgen, Ralf G
2015-07-30
Mango fruits contain a broad spectrum of phenolic compounds which impart potential health benefits; their biosynthesis is catalysed by enzymes in the phenylpropanoid-flavonoid (PF) pathway. The aim of this study was to reveal the variability in genes involved in the PF pathway in three different mango varieties Mangifera indica L., a member of the family Anacardiaceae: Kensington Pride (KP), Irwin (IW) and Nam Doc Mai (NDM) and to determine associations with gene expression and mango flavonoid profiles. A close evolutionary relationship between mango genes and those from the woody species poplar of the Salicaceae family (Populus trichocarpa) and grape of the Vitaceae family (Vitis vinifera), was revealed through phylogenetic analysis of PF pathway genes. We discovered 145 SNPs in total within coding sequences with an average frequency of one SNP every 316 bp. Variety IW had the highest SNP frequency (one SNP every 258 bp) while KP and NDM had similar frequencies (one SNP every 369 bp and 360 bp, respectively). The position in the PF pathway appeared to influence the extent of genetic diversity of the encoded enzymes. The entry point enzymes phenylalanine lyase (PAL), cinnamate 4-mono-oxygenase (C4H) and chalcone synthase (CHS) had low levels of SNP diversity in their coding sequences, whereas anthocyanidin reductase (ANR) showed the highest SNP frequency followed by flavonoid 3'-hydroxylase (F3'H). Quantitative PCR revealed characteristic patterns of gene expression that differed between mango peel and flesh, and between varieties. The combination of mango expressed sequence tags and availability of well-established reference PF biosynthetic genes from other plant species allowed the identification of coding sequences of genes that may lead to the formation of important flavonoid compounds in mango fruits and facilitated characterisation of single nucleotide polymorphisms between varieties. 
We discovered an association between the extent of sequence variation and position in the pathway for up-stream genes. The high expression of PAL, C4H and CHS genes in mango peel compared to flesh is associated with a high total phenolic content in peels, which suggests that these genes have an influence on total flavonoid levels in mango fruit peel and flesh. In addition, the particularly high expression levels of ANR in KP and NDM peels compared to IW peel and the significant accumulation of its product epicatechin gallate (ECG) in those extracts reflect the rate-limiting role of ANR in ECG biosynthesis in mango.
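The SNP frequencies reported in this abstract can be restated as densities for easier comparison between varieties. The following is a minimal sketch, not part of the original study: the function name `snps_per_kb` and the dictionary layout are illustrative assumptions, and the implied total coding length is only approximate because the reported spacing of 316 bp is itself a rounded average.

```python
# Reported average spacing between SNPs (bp per SNP), taken from the abstract.
SPACING_BP = {"all varieties": 316, "IW": 258, "KP": 369, "NDM": 360}
TOTAL_SNPS = 145  # total SNPs reported within coding sequences


def snps_per_kb(bp_per_snp: float) -> float:
    """Convert 'one SNP every N bp' into SNPs per kilobase."""
    return 1000.0 / bp_per_snp


# Density per variety, rounded to two decimals for display.
density = {name: round(snps_per_kb(bp), 2) for name, bp in SPACING_BP.items()}

# Approximate coding sequence length implied by the overall figures
# (an inference from rounded numbers, not a value stated in the abstract).
implied_total_bp = TOTAL_SNPS * SPACING_BP["all varieties"]

for name, value in density.items():
    print(f"{name}: {value} SNPs/kb")
print(f"implied coding sequence surveyed: ~{implied_total_bp} bp")
```

Expressed this way, IW carries roughly 3.9 SNPs per kb against about 2.7 to 2.8 for KP and NDM, which matches the ranking given in the abstract.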
Linne, Hannah; Yasaei, Hemad; Marriott, Alison; Harvey, Amanda; Mokbel, Kefah; Newbold, Robert; Roberts, Terry
2017-01-01
Narrowing the search for the critical hTERT repressor sequence(s) has identified three regions on chromosome 3p (3p12-p21.1, 3p21.2 and 3p21.3-p22). However, the precise location and identity of the sequence(s) responsible for hTERT transcriptional repression remains elusive. In order to identify critical hTERT repressor sequences located within human chromosome 3p12-p22, we investigated hTERT transcriptional activity within 21NT microcell hybrid clones containing chromosome 3 fragments. Mapping of chromosome 3 structure in a single hTERT-repressed 21NT-#3fragment hybrid clone, revealed a 490kb region of deletion localised to 3p21.3 and encompassing the histone H3, lysine 36 (H3K36) trimethyltransferase enzyme SETD2; a putative tumour suppressor gene in breast cancer. Three additional genes, BAP1, PARP-3 and PBRM1, were also selected for further investigation based on their location within the 3p21.1-p21.3 region, together with their documented role in the epigenetic regulation of target gene expression or hTERT regulation. All four genes (SETD2, BAP1, PARP-3 and PBRM1) were found to be expressed at low levels in 21NT. Gene copy number variation (CNV) analysis of SETD2, BAP1, PARP-3 and PBRM1 within a panel of nine breast cancer cell lines demonstrated single copy number loss of all candidate genes within five (56%) cell lines (including 21NT cells). Stable, forced overexpression of BAP1, but not PARP2, SETD2 or PBRM1, within 21NT cells was associated with a significant reduction in hTERT expression levels relative to wild-type controls. We propose that at least two sequences exist on human chromosome 3p, that function to regulate hTERT transcription within human breast cancer cells. PMID:28977912
Linne, Hannah; Yasaei, Hemad; Marriott, Alison; Harvey, Amanda; Mokbel, Kefah; Newbold, Robert; Roberts, Terry
2017-09-22
Narrowing the search for the critical hTERT repressor sequence(s) has identified three regions on chromosome 3p (3p12-p21.1, 3p21.2 and 3p21.3-p22). However, the precise location and identity of the sequence(s) responsible for hTERT transcriptional repression remains elusive. In order to identify critical hTERT repressor sequences located within human chromosome 3p12-p22, we investigated hTERT transcriptional activity within 21NT microcell hybrid clones containing chromosome 3 fragments. Mapping of chromosome 3 structure in a single hTERT-repressed 21NT-#3fragment hybrid clone, revealed a 490kb region of deletion localised to 3p21.3 and encompassing the histone H3, lysine 36 (H3K36) trimethyltransferase enzyme SETD2; a putative tumour suppressor gene in breast cancer. Three additional genes, BAP1, PARP-3 and PBRM1, were also selected for further investigation based on their location within the 3p21.1-p21.3 region, together with their documented role in the epigenetic regulation of target gene expression or hTERT regulation. All four genes (SETD2, BAP1, PARP-3 and PBRM1) were found to be expressed at low levels in 21NT. Gene copy number variation (CNV) analysis of SETD2, BAP1, PARP-3 and PBRM1 within a panel of nine breast cancer cell lines demonstrated single copy number loss of all candidate genes within five (56%) cell lines (including 21NT cells). Stable, forced overexpression of BAP1, but not PARP2, SETD2 or PBRM1, within 21NT cells was associated with a significant reduction in hTERT expression levels relative to wild-type controls. We propose that at least two sequences exist on human chromosome 3p, that function to regulate hTERT transcription within human breast cancer cells.
Faster-X Evolution of Gene Expression in Drosophila
Meisel, Richard P.; Malone, John H.; Clark, Andrew G.
2012-01-01
DNA sequences on X chromosomes often have a faster rate of evolution when compared to similar loci on the autosomes, and well articulated models provide reasons why the X-linked mode of inheritance may be responsible for the faster evolution of X-linked genes. We analyzed microarray and RNA–seq data collected from females and males of six Drosophila species and found that the expression levels of X-linked genes also diverge faster than autosomal gene expression, similar to the “faster-X” effect often observed in DNA sequence evolution. Faster-X evolution of gene expression was recently described in mammals, but it was limited to the evolutionary lineages shortly following the creation of the therian X chromosome. In contrast, we detect a faster-X effect along both deep lineages and those on the tips of the Drosophila phylogeny. In Drosophila males, the dosage compensation complex (DCC) binds the X chromosome, creating a unique chromatin environment that promotes the hyper-expression of X-linked genes. We find that DCC binding, chromatin environment, and breadth of expression are all predictive of the rate of gene expression evolution. In addition, estimates of the intraspecific genetic polymorphism underlying gene expression variation suggest that X-linked expression levels are not under relaxed selective constraints. We therefore hypothesize that the faster-X evolution of gene expression is the result of the adaptive fixation of beneficial mutations at X-linked loci that change expression level in cis. This adaptive faster-X evolution of gene expression is limited to genes that are narrowly expressed in a single tissue, suggesting that relaxed pleiotropic constraints permit a faster response to selection. Finally, we present a conceptional framework to explain faster-X expression evolution, and we use this framework to examine differences in the faster-X effect between Drosophila and mammals. PMID:23071459
Guo, Yuan; Qiu, Caisheng; Long, Songhua; Chen, Ping; Hao, Dongmei; Preisner, Marta; Wang, Hui; Wang, Yufu
2017-08-30
To better understand the molecular mechanisms and gene expression characteristics associated with the development of bast fiber cells within flax stem phloem, the gene expression profiles of flax stem peels and leaves were screened using Illumina's Digital Gene Expression (DGE) analysis. Four DGE libraries (2 for stem peel and 2 for leaf), ranging from 6.7 to 9.2 million clean reads, were obtained, which produced 7.0 million and 6.8 million mapped reads for flax stem peel and leaf, respectively. By differential gene expression analysis, a total of 975 genes, of which 708 (73%) have protein-coding annotation, were identified as phloem-enriched genes putatively involved in the processes of polysaccharide and cell wall metabolism. Differentially expressed genes (DEGs) were validated using quantitative RT-PCR; the expression patterns of all nine genes determined by qRT-PCR agreed well with those obtained by sequencing analysis. Cluster and Gene Ontology (GO) analysis revealed that a large number of genes related to the metabolic process, catalytic activity and binding categories were expressed predominantly in the stem peels. The Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis of the phloem-enriched genes suggested approximately 111 biological pathways. The large number of genes and pathways produced from DGE sequencing will expand our understanding of the complex molecular and cellular events in flax bast fiber development and provide a foundation for future studies on fiber development in other bast fiber crops. Copyright © 2017 Elsevier B.V. All rights reserved.
Qu, Chun-Pu; Xu, Zhi-Ru; Liu, Guan-Jun; Liu, Chun; Li, Yang; Wei, Zhi-Gang; Liu, Gui-Feng
2010-01-01
In aerobic organisms, protection against oxidative damage involves the combined action of highly specialized antioxidant enzymes, such as copper-zinc superoxide dismutase. In this work, a cDNA clone which encodes a copper-zinc superoxide dismutase gene, named PS-CuZnSOD, has been identified from P. sibiricum Laxm. by the rapid amplification of cDNA ends method (RACE). Analysis of the nucleotide sequence reveals that the PS-CuZnSOD gene cDNA clone consists of 669 bp, containing 87 bp in the 5' untranslated region; 459 bp in the open reading frame (ORF) encoding 152 amino acids; and 123 bp in the 3' untranslated region. The GenBank accession number of the nucleotide sequence is GQ472846. Sequence analysis indicates that the protein, like most plant superoxide dismutases (SOD), includes two conserved ecCuZnSOD signatures, spanning amino acids 43 to 51 and amino acids 137 to 148, and that it has a signal peptide extension at the N-terminus (1-16 aa). Expression analysis by real-time quantitative PCR reveals that the PS-CuZnSOD gene is expressed in leaves, stems and underground stems. PS-CuZnSOD gene expression can be induced by 3% NaHCO3. The differing PS-CuZnSOD mRNA levels indicate that the gene has different expression patterns in leaves, stems and underground stems under salinity-alkalinity stress.
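The segment lengths quoted for the PS-CuZnSOD cDNA are internally consistent, as the short check below illustrates. This is a sketch under one stated assumption that the abstract does not spell out: that the 459 bp ORF includes a single terminal stop codon. The constant and function names are illustrative only.

```python
# Segment lengths (bp) and protein length as reported in the abstract.
FIVE_PRIME_UTR = 87
ORF = 459
THREE_PRIME_UTR = 123
REPORTED_TOTAL = 669
REPORTED_RESIDUES = 152


def orf_to_residues(orf_bp: int) -> int:
    """Number of encoded amino acids for an ORF of the given length.

    Assumes the ORF length includes one terminal stop codon
    (an assumption made for this sketch, not stated in the abstract).
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1


total_bp = FIVE_PRIME_UTR + ORF + THREE_PRIME_UTR
residues = orf_to_residues(ORF)

print(f"total cDNA length: {total_bp} bp (reported {REPORTED_TOTAL})")
print(f"encoded residues: {residues} (reported {REPORTED_RESIDUES})")
```

Under that assumption, the three segments sum to the reported clone length and the ORF length agrees with the reported protein length.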
Enhancer activity of Helitron in sericin-1 gene promoter from Bombyx mori.
Huang, Ke; Li, Chun-Feng; Wu, Jie; Wei, Jun-Hong; Zou, Yong; Han, Min-Jin; Zhou, Ze-Yang
2016-06-01
Sericin is a kind of water-soluble protein expressed specifically in the middle silk gland of Bombyx mori. When the sericin-1 gene promoter was cloned and a transgenic vector was constructed to express a foreign protein, a specific Helitron, Bmhel-8, was identified in the sericin-1 gene promoter sequence in some genotypes of Bombyx mori and Bombyx mandarina. Given that the Bmhel-8 Helitron transposon was present only in some genotypes, it could be the source of allelic variation in the sericin-1 promoter. The length of the sericin-1 promoter sequence is approximately 1063 or 643 bp. The larger size of the sequence or allele is ascribed to the presence of Bmhel-8. Silkworm genotypes can be homozygous for either the shorter or larger promoter sequence or heterozygous, containing both alleles. Bmhel-8 in the sericin-1 promoter exhibits enhancer activity, as demonstrated by a dual-luciferase reporter system in BmE cell lines. Furthermore, Bmhel-8 displays enhancer activity in a sericin-1 promoter-driven gene expression system but does not regulate the tissue-specific expression of sericin-1. © 2016 Institute of Zoology, Chinese Academy of Sciences.
The organization and expression of the mdm2 gene.
de Oca Luna, R M; Tabor, A D; Eberspaecher, H; Hulboy, D L; Worth, L L; Colman, M S; Finlay, C A; Lozano, G
1996-05-01
The mdm2 gene encodes a zinc finger protein that negatively regulates p53 function by binding and masking the p53 transcriptional activation domain. Two different promoters control expression of mdm2, one of which is also transactivated by p53. We cloned and characterized the mdm2 gene from a murine 129 library. It contained at least 12 exons and spanned approximately 25 kb of DNA. Sequencing of the mdm2 gene revealed three nucleotide differences that resulted in amino acid substitutions in the previously published mdm2 sequence. Sequencing of normal BalbC/J DNA and the original cosmid clone isolated from the 3T3DM cell line revealed that they are identical, suggesting that the published sequence is in error at these three positions. In addition, we analyzed the expression pattern of mdm2 and found ubiquitous low-level expression throughout embryo development and in adult tissues. Analysis of mRNA from numerous tissues for several mdm2 spliced variants that had been identified in the transformed 3T3DM cell line revealed that these variants could not be detected in the developing embryo or in adult tissues.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Depto, A.S.; Stenberg, R.M.
1989-03-01
To better understand the regulation of late gene expression in human cytomegalovirus (CMV)-infected cells, the authors examined expression of the gene that codes for the 65-kilodalton lower-matrix phosphoprotein (pp65). Analysis of RNA isolated at 72 h from cells infected with CMV Towne or ts66, a DNA-negative temperature-sensitive mutant, supported the fact that pp65 is expressed at low levels prior to viral DNA replication but maximally expressed after the initiation of viral DNA replication. To investigate promoter activation in a transient expression assay, the pp65 promoter was cloned into the indicator plasmid containing the gene for chloramphenicol acetyltransferase (CAT). Transfection of the promoter-CAT construct and subsequent superinfection with CMV resulted in activation of the promoter at early times after infection. Cotransfection with plasmids capable of expressing immediate-early (IE) proteins demonstrated that the promoter was activated by IE proteins and that both IE regions 1 and 2 were necessary. These studies suggest that interactions between IE proteins and this octamer sequence may be important for the regulation and expression of this CMV gene.
Fu, Lijuan; Shi, Zhimin; Luo, Guanzheng; Tu, Weihong; Wang, XiuJie; Fang, Zhide; Li, XiaoChing
2014-10-01
Mutations in the human FOXP2 gene cause speech and language impairments. The FOXP2 protein is a transcription factor that regulates the expression of many downstream genes, which may have important roles in nervous system development and function. An adequate amount of functional FOXP2 protein is thought to be critical for the proper development of the neural circuitry underlying speech and language. However, how FOXP2 gene expression is regulated is not clearly understood. The FOXP2 mRNA has an approximately 4-kb-long 3' untranslated region (3' UTR), twice as long as its protein coding region, indicating that FOXP2 can be regulated by microRNAs (miRNAs). We identified multiple miRNAs that regulate the expression of the human FOXP2 gene using sequence analysis and in vitro cell systems. Focusing on let-7a, miR-9, and miR-129-5p, three brain-enriched miRNAs, we show that these miRNAs regulate human FOXP2 expression in a dosage-dependent manner and target specific sequences in the FOXP2 3' UTR. We further show that these three miRNAs are expressed in the cerebellum of the human fetal brain, where FOXP2 is known to be expressed. Our results reveal novel regulatory functions of the human FOXP2 3' UTR sequence and regulatory interactions between multiple miRNAs and the human FOXP2 gene. The expression of let-7a, miR-9, and miR-129-5p in the human fetal cerebellum is consistent with their roles in regulating FOXP2 expression during early cerebellum development. These results suggest that various genetic and environmental factors may contribute to speech and language development and related neural developmental disorders via the miRNA-FOXP2 regulatory network.
Goettel, Wolfgang; Xia, Eric; Upchurch, Robert; Wang, Ming-Li; Chen, Pengyin; An, Yong-Qiang Charles
2014-04-23
Variation in seed oil composition and content among soybean varieties is largely attributed to differences in transcript sequences and/or transcript accumulation of oil production related genes in seeds. Discovery and analysis of sequence and expression variations in these genes will accelerate soybean oil quality improvement. In an effort to identify these variations, we sequenced the transcriptomes of soybean seeds from nine lines varying in oil composition and/or total oil content. Our results showed that 69,338 distinct transcripts from 32,885 annotated genes were expressed in seeds. A total of 8,037 transcript expression polymorphisms and 50,485 transcript sequence polymorphisms (48,792 SNPs and 1,693 small Indels) were identified among the lines. Effects of the transcript polymorphisms on their encoded protein sequences and functions were predicted. The studies also provided independent evidence that the lack of FAD2-1A gene activity and a non-synonymous SNP in the coding sequence of FAB2C caused elevated oleic acid and stearic acid levels in soybean lines M23 and FAM94-41, respectively. As a proof-of-concept, we developed an integrated RNA-seq and bioinformatics approach to identify and functionally annotate transcript polymorphisms, and demonstrated its high effectiveness for discovery of genetic and transcript variations that result in altered oil quality traits. The collection of transcript polymorphisms coupled with their predicted functional effects will be a valuable asset for further discovery of genes, gene variants, and functional markers to improve soybean oil quality.
Ulloa, Pilar E; Rincón, Gonzalo; Islas-Trejo, Alma; Araneda, Cristian; Iturra, Patricia; Neira, Roberto; Medrano, Juan F
2015-06-01
The objectives of this study were to measure gene expression in zebrafish and then identify SNP to be used as potential markers in a growth association study. We developed an approach where muscle samples collected from low- and high-growth fish were analyzed using RNA-Sequencing (RNA-seq), and SNP were chosen from the genes that were differentially expressed between the low and high groups. A population of 24 families was fed a plant protein-based diet from the larval to adult stages. From a total of 440 males, 5 % of the fish from both tails of the weight gain distribution were selected. Total RNA was extracted from individual muscle of 8 low-growth and 8 high-growth fish. Two pooled RNA-Seq libraries were prepared for each phenotype using 4 fish per library. Libraries were sequenced using the Illumina GAII Sequencer and analyzed using the CLCBio genomic workbench software. One hundred and twenty-four genes were differentially expressed between phenotypes (p value < 0.05 and FDR < 0.2). From these genes, 164 SNP were selected and genotyped in 240 fish samples. Marker-trait analysis revealed 5 SNP associated with growth in key genes (Nars, Lmod2b, Cuzd1, Acta1b, and Plac8l1). These genes are good candidates for further growth studies in fish and to consider for identification of potential SNPs associated with different growth rates in response to a plant protein-based diet.
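The sampling design and the significance thresholds described in this abstract can be illustrated with a small sketch. It is not the authors' pipeline (they report using the CLCBio genomic workbench); the gene records below are invented placeholders, and the helper names `tail_size` and `passes_thresholds` are assumptions made for illustration.

```python
def tail_size(population: int, fraction: float) -> int:
    """Number of individuals in one tail of the weight-gain distribution."""
    return round(population * fraction)


def passes_thresholds(p_value: float, fdr: float,
                      p_max: float = 0.05, fdr_max: float = 0.2) -> bool:
    """Apply the cutoffs quoted in the abstract (p value < 0.05 and FDR < 0.2)."""
    return p_value < p_max and fdr < fdr_max


# 5% of 440 males were selected from each tail of the distribution.
fish_per_tail = tail_size(440, 0.05)

# Two pooled libraries per phenotype, 4 fish per library.
fish_sequenced_per_phenotype = 2 * 4

# Hypothetical (name, p value, FDR) records, for illustration only.
hypothetical_genes = [
    ("gene_a", 0.001, 0.05),
    ("gene_b", 0.040, 0.35),
    ("gene_c", 0.200, 0.60),
]
selected = [name for name, p, q in hypothetical_genes if passes_thresholds(p, q)]

print(fish_per_tail, fish_sequenced_per_phenotype, selected)
```

Each tail therefore holds 22 fish, of which 8 per phenotype were sequenced, and a gene is retained only when both cutoffs are met.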
Jouffe, Vincent; Rowe, Suzanne; Liaubet, Laurence; Buitenhuis, Bart; Hornshøj, Henrik; SanCristobal, Magali; Mormède, Pierre; de Koning, D J
2009-07-16
Microarray studies can supplement QTL studies by suggesting potential candidate genes in the QTL regions, which by themselves are too large to provide a limited selection of candidate genes. Here we provide a case study where we explore ways to integrate QTL data and microarray data for the pig, which has only a partial genome sequence. We outline various procedures to localize differentially expressed genes on the pig genome and link this with information on published QTL. The starting point is a set of 237 differentially expressed cDNA clones in adrenal tissue from two pig breeds, before and after treatment with adrenocorticotropic hormone (ACTH). Different approaches to localize the differentially expressed (DE) genes to the pig genome showed different levels of success and a clear lack of concordance for some genes between the various approaches. For a focused analysis on 12 genes, overlapping QTL from the public domain were presented. Also, differentially expressed genes underlying QTL for ACTH response were described. Using the latest version of the draft sequence, the differentially expressed genes were mapped to the pig genome. This enabled co-location of DE genes and previously studied QTL regions, but the draft genome sequence is still incomplete and will contain many errors. A further step to explore links between DE genes and QTL at the pathway level was largely unsuccessful due to the lack of annotation of the pig genome. This could be improved by further comparative mapping analyses but this would be time consuming. This paper provides a case study for the integration of QTL data and microarray data for a species with limited genome sequence information and annotation. The results illustrate the challenges that must be addressed but also provide a roadmap for future work that is applicable to other non-model species.
2014-01-01
Background Deciphering of the information content of eukaryotic promoters has remained confined to universal landmarks and conserved sequence elements such as enhancers and transcription factor binding motifs, which are considered sufficient for gene activation and regulation. Gene-specific sequences, interspersed between the canonical transacting factor binding sites or adjoining them within a promoter, are generally taken to be devoid of any regulatory information and have therefore been largely ignored. An unanswered question therefore is, do gene-specific sequences within a eukaryotic promoter have a role in gene activation? Here, we present an exhaustive experimental analysis of a gene-specific sequence adjoining the heat shock element (HSE) in the proximal promoter of the small heat shock protein gene, αB-crystallin (cryab). These sequences are highly conserved between the rodents and the humans. Results Using human retinal pigment epithelial cells in culture as the host, we have identified a 10-bp gene-specific promoter sequence (GPS), which, unlike an enhancer, controls expression from the promoter of this gene, only when in appropriate position and orientation. Notably, the data suggests that GPS in comparison with the HSE works in a context-independent fashion. Additionally, when moved upstream, about a nucleosome length of DNA (−154 bp) from the transcription start site (TSS), the activity of the promoter is markedly inhibited, suggesting its involvement in local promoter access. Importantly, we demonstrate that deletion of the GPS results in complete loss of cryab promoter activity in transgenic mice. Conclusions These data suggest that gene-specific sequences such as the GPS, identified here, may have critical roles in regulating gene-specific activity from eukaryotic promoters. PMID:24589182
Different gene expressions between cattle and yak provide insights into high-altitude adaptation.
Wang, K; Yang, Y; Wang, L; Ma, T; Shang, H; Ding, L; Han, J; Qiu, Q
2016-02-01
DNA sequence variation has been widely reported as the genetic basis for adaptation, in both humans and other animals, to the hypoxic environment experienced at high altitudes. However, little is known about the patterns of gene expression underlying such hypoxic adaptations. In this study, we examined the differences in the transcriptomes of four organs (heart, kidney, liver and lung) between yak and cattle, a pair of closely related species distributed at high and low altitudes respectively. Of the four organs examined, heart shows the greatest differentiation between the two species in terms of gene expression profiles. Detailed analyses demonstrated that some genes associated with the oxygen supply system and the defense systems that respond to threats of hypoxia are differentially expressed. In addition, genes with significantly differentiated patterns of expression in all organs exhibited an unexpected uniformity of regulation along with an elevated frequency of nonsynonymous substitutions. This co-evolution of protein sequences and gene expression patterns is likely to be correlated with the optimization of the yak metabolic system to resist hypoxia. © 2015 Stichting International Foundation for Animal Genetics.
El-Mogharbel, Nisrine; Wakefield, Matthew; Deakin, Janine E; Tsend-Ayush, Enkhjargal; Grützner, Frank; Alsop, Amber; Ezaz, Tariq; Marshall Graves, Jennifer A
2007-01-01
We isolated and characterized a cluster of platypus DMRT genes and compared their arrangement, location, and sequence across vertebrates. The DMRT gene cluster on human 9p24.3 harbors, in order, DMRT1, DMRT3, and DMRT2, which share a DM domain. DMRT1 is highly conserved and involved in sexual development in vertebrates, and deletions in this region cause sex reversal in humans. Sequence comparisons of DMRT genes between species have been valuable in identifying exons, control regions, and conserved nongenic regions (CNGs). The addition of platypus sequences is expected to be particularly valuable, since monotremes fill a gap in the vertebrate genome coverage. We therefore isolated and fully sequenced platypus BAC clones containing DMRT3 and DMRT2 as well as DMRT1 and then generated multispecies alignments and ran prediction programs followed by experimental verification to annotate this gene cluster. We found that the three genes have 58-66% identity to their human orthologues, lie in the same order as in other vertebrates, and colocate on 1 of the 10 platypus sex chromosomes, X5. We also predict that optimal annotation of the newly sequenced platypus genome will be challenging. The analysis of platypus sequence revealed differences in structure and sequence of the DMRT gene cluster. Multispecies comparison was particularly effective for detecting CNGs, revealing several novel potential regulatory regions within DMRT3 and DMRT2 as well as DMRT1. RT-PCR indicated that platypus DMRT1 and DMRT3 are expressed specifically in the adult testis (and not ovary), but DMRT2 has a wider expression profile, as it does for other mammals. The platypus DMRT1 expression pattern, and its location on an X chromosome, suggests an involvement in monotreme sexual development.
Analysis of Epstein-Barr Virus Genomes and Expression Profiles in Gastric Adenocarcinoma.
Borozan, Ivan; Zapatka, Marc; Frappier, Lori; Ferretti, Vincent
2018-01-15
Epstein-Barr virus (EBV) is a causative agent of a variety of lymphomas, nasopharyngeal carcinoma (NPC), and ∼9% of gastric carcinomas (GCs). An important question is whether particular EBV variants are more oncogenic than others, but conclusions are currently hampered by the lack of sequenced EBV genomes. Here, we contribute to this question by mining whole-genome sequences of 201 GCs to identify 13 EBV-positive GCs and by assembling 13 new EBV genome sequences, almost doubling the number of available GC-derived EBV genome sequences and providing the first non-Asian EBV genome sequences from GC. Whole-genome sequence comparisons of all EBV isolates sequenced to date (85 from tumors and 57 from healthy individuals) showed that most GC and NPC EBV isolates were closely related although American Caucasian GC samples were more distant, suggesting a geographical component. However, EBV GC isolates were found to contain some consistent changes in protein sequences regardless of geographical origin. In addition, transcriptome data available for eight of the EBV-positive GCs were analyzed to determine which EBV genes are expressed in GC. In addition to the expected latency proteins (EBNA1, LMP1, and LMP2A), specific subsets of lytic genes were consistently expressed that did not reflect a typical lytic or abortive lytic infection, suggesting a novel mechanism of EBV gene regulation in the context of GC. These results are consistent with a model in which a combination of specific latent and lytic EBV proteins promotes tumorigenesis. IMPORTANCE Epstein-Barr virus (EBV) is a widespread virus that causes cancer, including gastric carcinoma (GC), in a small subset of individuals. An important question is whether particular EBV variants are more cancer associated than others, but more EBV sequences are required to address this question. 
Here, we have generated 13 new EBV genome sequences from GC, almost doubling the number of EBV sequences from GC isolates and providing the first EBV sequences from non-Asian GC. We further identify sequence changes in some EBV proteins common to GC isolates. In addition, gene expression analysis of eight of the EBV-positive GCs showed consistent expression of both the expected latency proteins and a subset of lytic proteins that was not consistent with typical lytic or abortive lytic expression. These results suggest that novel mechanisms activate expression of some EBV lytic proteins and that their expression may contribute to oncogenesis. Copyright © 2018 American Society for Microbiology.
Satapathy, Lopamudra; Singh, Dharmendra; Ranjan, Prashant; Kumar, Dhananjay; Kumar, Manish; Prabhu, Kumble Vinod; Mukhopadhyay, Kunal
2014-12-01
WRKY, a plant-specific transcription factor family, has important roles in pathogen defense, abiotic cues and phytohormone signaling, yet little is known about the roles and molecular mechanisms of WRKY proteins in the response to rust diseases in wheat. We identified 100 TaWRKY sequences using the wheat Expressed Sequence Tag database, of which 22 WRKY sequences were novel. Identified proteins were characterized based on their zinc finger motifs, and phylogenetic analysis clustered them into six clades consisting of class IIc and class III WRKY proteins. Functional annotation revealed major functions in metabolic and cellular processes in control plants, and in response to stimuli, signaling and defense in pathogen-inoculated plants, with DNA binding as the major molecular function. Tag-based expression analysis of the identified genes revealed differential expression between mock- and Puccinia triticina-inoculated wheat near-isogenic lines. Gene expression analysis was also performed with six rust-related microarray experiments from the Gene Expression Omnibus database. TaWRKY10, 15, 17 and 56 were common to both the tag-based and microarray-based differential expression analyses and may represent rust-specific WRKY genes. The results provide insight into the functional characterization of WRKY transcription factors responsive to leaf rust pathogenesis, which can be used as candidate genes in molecular breeding programs to improve biotic stress tolerance in wheat.
Generate Optimized Genetic Rhythm for Enzyme Expression in Non-native systems
DOE Office of Scientific and Technical Information (OSTI.GOV)
2016-11-03
Most amino acids are represented by more than one codon, resulting in redundancy in the genetic code. Silent codon substitutions that do not alter the amino acid sequence still have an effect on protein expression. We have developed an algorithm, GoGREEN, to enhance the expression of foreign proteins in a host organism. GoGREEN selects codons according to frequency patterns seen in the gene of interest using the codon usage table from the host organism. GoGREEN is also designed to accommodate gaps in the sequence. This software takes for input (1) the aligned protein sequences for genes the user wishes to express, (2) the codon usage table for the host organism, and (3) the DNA sequence for the target protein found in the host organism. The program will select codons based on codon usage patterns for the target DNA sequence. The program will also select codons for “gaps” found in the aligned protein sequences using the codon usage table from the host organism.
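The codon selection described in this record can be illustrated with a minimal sketch. It is not the GoGREEN implementation: the rank-matching rule, the function names, and the toy usage frequencies below are all assumptions made for illustration. At aligned positions, the sketch copies the usage rank of the host's own codon; at gap positions, it falls back to the most frequent host codon for the residue.

```python
# Hypothetical host codon usage table: amino acid -> {codon: relative frequency}.
# The numbers are made up for illustration and cover only four amino acids.
HOST_USAGE = {
    "M": {"ATG": 1.00},
    "K": {"AAA": 0.74, "AAG": 0.26},
    "L": {"CTG": 0.50, "TTA": 0.14, "TTG": 0.13,
          "CTT": 0.11, "CTC": 0.09, "CTA": 0.03},
    "E": {"GAA": 0.68, "GAG": 0.32},
}


def ranked(aa):
    """Return the host codons for one amino acid, most frequent first."""
    table = HOST_USAGE[aa]
    return sorted(table, key=lambda codon: (-table[codon], codon))


def choose_codons(target_aln, host_aln, host_dna):
    """Pick codons for the target protein (assumed rank-matching rule).

    target_aln / host_aln: equal-length aligned protein strings, '-' = gap.
    host_dna: coding DNA of the host protein (gaps removed, no stop codon).
    """
    host_codons = [host_dna[i:i + 3] for i in range(0, len(host_dna), 3)]
    out = []
    h = 0  # index of the next unused host codon
    for t_aa, h_aa in zip(target_aln, host_aln):
        host_codon = None
        if h_aa != "-":
            host_codon = host_codons[h]
            h += 1
        if t_aa == "-":
            continue  # nothing to encode in the target at this column
        options = ranked(t_aa)
        if host_codon is None:
            # Gap in the host sequence: use the most frequent host codon.
            out.append(options[0])
        else:
            # Copy the usage rank of the host codon, clamped to what exists.
            rank = ranked(h_aa).index(host_codon)
            out.append(options[min(rank, len(options) - 1)])
    return "".join(out)


designed = choose_codons("MKLE", "ML-E", "ATGTTAGAG")
print(designed)
```

Here the third target residue (L) faces a gap in the host alignment, so it receives the most frequent host codon for leucine in the toy table.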
Das, Amitabh; Chai, Jin Choul; Yang, Chul-su; Lee, Young Seek; Das, Nando Dulal; Jung, Kyoung Hwa; Chai, Young Gyu
2015-01-01
Persistent macrophage activation is associated with the expression of various pro-inflammatory genes, cytokines and chemokines, which may initiate or amplify inflammatory disorders. A novel synthetic BET inhibitor, JQ1, was proven to exert immunosuppressive activities in macrophages. However, a genome-wide search for JQ1 molecular targets has not been undertaken. The present study aimed at evaluating the anti-inflammatory function and underlying genes that are targeted by JQ1 in LPS-stimulated primary bone marrow-derived macrophages (BMDMs) using global transcriptomic RNA sequencing and quantitative real-time PCR. Among the annotated genes, transcriptional sequencing of BMDMs that were treated with JQ1 revealed a selective effect on LPS-induced gene expression in which the induction of cytokines/chemokines, interferon-stimulated genes, and prominent transcription factors (TFs) was suppressed. Additionally, we found that JQ1 reduced the expression of previously unidentified genes that are important in inflammation. Importantly, these inflammatory genes were not affected by JQ1 treatment alone. Furthermore, we confirmed that JQ1 reduced cytokines/chemokines in the supernatants of LPS-treated BMDMs. Moreover, the biological pathways and gene ontology of the differentially expressed genes were determined for JQ1-treated BMDMs. These unprecedented results suggest that the BET inhibitor JQ1 is a candidate for the prevention or therapeutic treatment of inflammatory disorders. PMID:26582142
Gene length as a biological timer to establish temporal transcriptional regulation
Kirkconnell, Killeen S.; Magnuson, Brian; Paulsen, Michelle T.; Lu, Brian; Bedi, Karan; Ljungman, Mats
2017-01-01
ABSTRACT Transcriptional timing is inherently influenced by gene length, thus providing a mechanism for temporal regulation of gene expression. While gene size has been shown to be important for the expression timing of specific genes during early development, whether it plays a role in the timing of other global gene expression programs has not been extensively explored. Here, we investigate the role of gene length during the early transcriptional response of human fibroblasts to serum stimulation. Using the nascent sequencing techniques Bru-seq and BruUV-seq, we identified immediate genome-wide transcriptional changes following serum stimulation that were linked to rapid activation of enhancer elements. We identified 873 significantly induced and 209 significantly repressed genes. Variations in gene size allowed for a large group of genes to be simultaneously activated but produce full-length RNAs at different times. The median length of the group of serum-induced genes was significantly larger than the median length of all expressed genes, housekeeping genes, and serum-repressed genes. These gene length relationships were also observed in corresponding mouse orthologs, suggesting that relative gene size is evolutionarily conserved. The sizes of transcription factor and microRNA genes immediately induced after serum stimulation varied dramatically, setting up a cascade mechanism for temporal expression arising from a single activation event. The retention and expansion of large intronic sequences during evolution have likely played important roles in fine-tuning the temporal expression of target genes in various cellular response programs. PMID:28055303
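The timing argument in this record reduces to simple arithmetic: if genes are activated together, the delay before a full-length RNA appears scales with gene length divided by elongation rate. The sketch below works through that relationship. The elongation rate of 2,000 bp per minute and the example gene lengths are assumed illustrative values, not figures from the abstract.

```python
def minutes_to_full_length(gene_length_bp, elongation_bp_per_min=2000.0):
    """Time for a polymerase to traverse a gene at a constant rate.

    The default rate is an assumed illustrative value; pausing, splicing
    and initiation delays are ignored.
    """
    if elongation_bp_per_min <= 0:
        raise ValueError("elongation rate must be positive")
    return gene_length_bp / elongation_bp_per_min


# Hypothetical genes activated at the same moment (lengths in bp).
genes = {"short_gene": 10_000, "medium_gene": 80_000, "long_gene": 200_000}

for name, length in sorted(genes.items(), key=lambda item: item[1]):
    print(f"{name}: {minutes_to_full_length(length):.0f} min to full-length RNA")
```

Under these assumptions a single activation event yields completed transcripts spread over more than an hour, which is the cascade effect the abstract describes.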
Generation and analysis of expressed sequence tags from the bone marrow of Chinese Sika deer.
Yao, Baojin; Zhao, Yu; Zhang, Mei; Li, Juan
2012-03-01
Sika deer is one of the best-known and highly valued animals of China. Despite its economic, cultural, and biological importance, there has not been a large-scale sequencing project for Sika deer to date. With the ultimate goal of sequencing the complete genome of this organism, we first established a bone marrow cDNA library for Sika deer and generated a total of 2,025 reads. After processing the sequences, 2,017 high-quality expressed sequence tags (ESTs) were obtained. These ESTs were assembled into 1,157 unigenes, including 238 contigs and 919 singletons. Comparative analyses indicated that 888 (76.75%) of the unigenes had significant matches to sequences in the non-redundant protein database. In addition to highly expressed genes, such as stearoyl-CoA desaturase, cytochrome c oxidase, adipocyte-type fatty acid-binding protein, adiponectin and thymosin beta-4, we also obtained vascular endothelial growth factor-A and heparin-binding growth-associated molecule, both of which are of great importance for angiogenesis research. There were 244 (21.09%) unigenes with no significant match to any sequence in current protein or nucleotide databases, and these sequences may represent genes with unknown function in Sika deer. Open reading frame analysis of the sequences was performed using the getorf program. In addition, the sequences were functionally classified using the gene ontology hierarchy, clusters of orthologous groups of proteins and Kyoto encyclopedia of genes and genomes databases. Analysis of ESTs described in this paper provides an important resource for the transcriptome exploration of Sika deer, and will also facilitate further studies on functional genomics, gene discovery and genome annotation of Sika deer.
Tong, Ying; Zheng, Kang; Zhao, Shufang; Xiao, Guanxiu; Luo, Chen
2012-11-01
Recent studies demonstrated that sequence divergence in both the transcriptional regulatory region and the coding region contributes to the subfunctionalization of duplicate genes. However, whether sequence divergence in the 3'-untranslated region (3'-UTR) has an impact on the subfunctionalization of duplicate genes remains unclear. Here, we identified two diverging duplicate vsx1 (visual system homeobox-1) loci in goldfish, named vsx1A1 and vsx1A2. Phylogenetic analysis suggests that vsx1A1 and vsx1A2 may have arisen from a duplication of vsx1 after the separation of goldfish and zebrafish. Sequence comparison revealed that divergence in both transcriptional and translational regulatory regions is higher than divergence in the introns. vsx1A2 is expressed during the blastula and gastrula stages and in the adult retina but is silent from the segmentation stage to the hatching stage, whereas vsx1A1 is expressed from segmentation onward. Given that zebrafish vsx1 is expressed at all developmental stages and in the adult retina, it appears that goldfish vsx1A1 and vsx1A2 are in the process of sharing the functions of the ancestral vsx1. The different but overlapping temporal expression patterns of vsx1A1 and vsx1A2 suggest that sequence divergence in the promoter region of duplicate vsx1 is not sufficient for partitioning the functions of the ancestral vsx1. By comparing the expression patterns of a green fluorescent protein gene linked to the vsx1A1 or vsx1A2 3'-UTR, we demonstrated that the 3'-UTR of vsx1A1 retains, but the 3'-UTR of vsx1A2 has lost, the capability of mediating bipolar cell-specific expression during retina development. These results indicate that sequence divergence in the 3'-UTRs has a clear effect on subfunctionalization of the duplicate genes. © 2012 WILEY PERIODICALS, INC.
Zhang, Fei; Zhang, Yangyi; Wen, Xintian; Huang, Xiaobo; Wen, Yiping; Wu, Rui; Yan, Qigui; Huang, Yong; Ma, Xiaoping; Zhao, Qin; Cao, Sanjie
2015-10-01
Porcine pleuropneumonia is an infectious disease caused by Actinobacillus pleuropneumoniae. The identification of A. pleuropneumoniae genes specifically expressed in vivo is a useful tool for revealing the mechanism of infection. In this work, IVIAT was used to identify antigens expressed in vivo during A. pleuropneumoniae infection, using sera from individuals with chronic porcine pleuropneumonia. Sequencing of DNA inserts from positive clones showed 11 open reading frames with high homology to A. pleuropneumoniae genes. Based on sequence analysis, proteins encoded by these genes were involved in metabolism, replication, transcription regulation, and signal transduction. Moreover, three proteins of unknown function were also identified in this work. Expression analysis using quantitative real-time PCR showed that most of the genes tested were up-regulated in vivo relative to their expression levels in vitro. PCR amplification of the IVI (in vivo-induced) genes in different A. pleuropneumoniae strains showed that these genes could be detected in almost all of the strains. These results demonstrate that the identified IVI antigens may have important roles in A. pleuropneumoniae infection.
Expressing genes do not forget their LINEs: transposable elements and gene expression
Kines, Kristine J.; Belancio, Victoria P.
2012-01-01
ABSTRACT Historically the accumulated mass of mammalian transposable elements (TEs), particularly those located within gene boundaries, was viewed as a genetic burden potentially detrimental to the genomic landscape. This notion has been strengthened by the discovery that transposable sequences can alter the architecture of the transcriptome, not only through insertion, but also long after the integration process is completed. Insertions previously considered harmless are now known to impact the expression of host genes via modification of the transcript quality or quantity, transcriptional interference, or by the control of pathways that affect the mRNA life-cycle. Conversely, several examples of the evolutionarily advantageous impact of TEs on the host gene structure that diversified the cellular transcriptome are reported. TE-induced changes in gene expression can be tissue- or disease-specific, raising the possibility that the impact of TE sequences may vary during development, among normal cell types, and between normal and disease-affected tissues. The understanding of the rules and abundance of TE-interference with gene expression is in its infancy, and its contribution to human disease and/or evolution remains largely unexplored. PMID:22201807
Zhou, Wen-Zhao; Zhang, Yan-Mei; Lu, Jun-Ying; Li, Jun-Feng
2012-01-01
To provide a resource of sisal-specific expressed sequence data and facilitate this powerful approach in new gene research, the preparation of normalized cDNA libraries enriched with full-length sequences is necessary. Four libraries were produced with RNA pooled from multiple Agave sisalana tissues, using the SMART™ method and duplex-specific nuclease (DSN), to increase the efficiency of normalization and maximize the number of independent genes. This procedure preserved the proportion of full-length cDNAs in the subtracted/normalized libraries and dramatically enhanced the discovery of new genes. Sequencing of 3875 cDNA clones from the libraries revealed 3320 unigenes with an average insert length of about 1.2 kb, indicating that the non-redundancy of the libraries was about 85.7%. The functions of these unigenes were predicted by comparing their sequences to functional domain databases, and the unigenes were extensively annotated with Gene Ontology (GO) terms. Comparative analysis of sisal unigenes and other plant genomes revealed that four putative MADS-box genes and a knotted-like homeobox (knox) gene were obtained from a total of 1162 full-length transcripts. Furthermore, real-time PCR showed that the characteristics of their transcripts mainly depended on the tight expression regulation of a number of genes during leaf and flower development. Analysis of individual library sequence data indicated that the pooled-tissue approach was highly effective in discovering new genes and preparing libraries for efficient deep sequencing. PMID:23202944
Isolation of genes from female sterile flowers in Medicago sativa.
Capomaccio, Stefano; Barone, Pierluigi; Reale, Lara; Veronesi, Fabio; Rosellini, Daniele
2009-06-01
A better knowledge of female sporogenesis and gametogenesis could have several practical applications, from commercial hybrid seed production to gene containment in GM crops. With the purpose of isolating genes involved in the megasporogenesis process, the cDNA-AFLP technique was employed to isolate transcript-derived fragments (TDF) differentially expressed between female-fertile and female-sterile full-sib alfalfa plants. This female sterility trait involves female-specific arrest of sporogenesis at early prophase associated with ectopic, massive callose deposition within the nucellus. Ninety-six TDFs were generated and BLAST analyses revealed similarities with genes involved in different Gene Ontology categories. Three TDFs were selected based on their putative functions: showing high similarity to a soybean flower-expressed beta 1,3-glucanase, to an Arabidopsis thaliana MAPKKK, and to an A. thaliana eukaryotic initiation translation factor eIF4G III, respectively. The full length mRNA sequences were obtained. RT-PCR and in situ hybridizations were performed to confirm differential expression during flower development. The genomic organization of the three genes was assessed through sequencing and Southern experiments. Sequence polymorphisms were found between sterile and fertile plants. Our approach based on differential display and bulked segregant analysis was successful in isolating genes that were differentially expressed between fertile and sterile alfalfa plants.
Båge, Tove; Lagervall, Maria; Jansson, Leif; Lundeberg, Joakim; Yucel-Lindberg, Tülay
2012-01-01
Periodontitis is a chronic inflammatory disease affecting the soft tissue and bone that surrounds the teeth. Despite extensive research, distinctive genes responsible for the disease have not been identified. The objective of this study was to elucidate transcriptome changes in periodontitis, by investigating gene expression profiles in gingival tissue obtained from periodontitis-affected and healthy gingiva from the same patient, using RNA-sequencing. Gingival biopsies were obtained from a disease-affected and a healthy site from each of 10 individuals diagnosed with periodontitis. Enrichment analysis performed among uniquely expressed genes for the periodontitis-affected and healthy tissues revealed several regulated pathways indicative of inflammation for the periodontitis-affected condition. Hierarchical clustering of the sequenced biopsies demonstrated clustering according to the degree of inflammation, as observed histologically in the biopsies, rather than clustering at the individual level. Among the top 50 upregulated genes in periodontitis-affected tissues, we investigated two genes which have not previously been demonstrated to be involved in periodontitis. These included interferon regulatory factor 4 and chemokine (C-C motif) ligand 18, which were also expressed at the protein level in gingival biopsies from patients with periodontitis. In conclusion, this study provides a first step towards a quantitative comprehensive insight into the transcriptome changes in periodontitis. We demonstrate for the first time site-specific local variation in gene expression profiles of periodontitis-affected and healthy tissues obtained from patients with periodontitis, using RNA-seq. Further, we have identified novel genes expressed in periodontitis tissues, which may constitute potential therapeutic targets for future treatment strategies of periodontitis. PMID:23029519
Digital Transcriptome Analysis of Putative Sex-Determination Genes in Papaya (Carica papaya)
Urasaki, Naoya; Tarora, Kazuhiko; Shudo, Ayano; Ueno, Hiroki; Tamaki, Moritoshi; Miyagi, Norimichi; Adaniya, Shinichi; Matsumura, Hideo
2012-01-01
Papaya (Carica papaya) is a trioecious plant species that has male, female and hermaphrodite flowers on different plants. The primitive sex chromosomes genetically determine the sex of the papaya. Although draft sequences of the papaya genome are already available, the genes for sex determination have not been identified, likely due to the complicated structure of its sex-chromosome sequences. To identify the candidate genes for sex determination, we conducted a transcriptome analysis of flower samples from male, female and hermaphrodite plants using high-throughput SuperSAGE for digital gene expression analysis. Among the short sequence tags obtained from the transcripts, 312 unique tags were specifically mapped to the primitive sex chromosome (X or Yh) sequences. An annotation analysis revealed that retroelements are the most abundant sequences observed in the genes corresponding to these tags. The majority of tags on the sex chromosomes were located on the X chromosome, and only 30 tags were commonly mapped to both the X and Yh chromosomes, implying a loss of many genes on the Yh chromosome. Nevertheless, candidate Yh chromosome-specific female determination genes, including a MADS-box gene, were identified. Information on these sex chromosome-specific expressed genes will help elucidate sex determination in the papaya. PMID:22815863
Zhang, Tingting; Hu, Shuhao; Yan, Caixia; Li, Chunjuan; Zhao, Xiaobo; Wan, Shubo; Shan, Shihua
2017-02-01
In the present investigation, a total of 60 conserved peanut (Arachis hypogaea L.) microRNA (miRNA) sequences, belonging to 16 families, were identified using bioinformatics methods. A total of 392 target gene sequences were identified for 58 miRNAs using Target-align software and BLASTx analyses. Gene Ontology (GO) functional analysis suggested that these target genes were involved in mediating peanut growth and development, signal transduction and stress resistance. Fifty-five miRNA sequences were verified with a poly(A) tailing test, a success rate of 91.67%. Twenty peanut target gene sequences were randomly selected, and the 5' rapid amplification of cDNA ends (5'-RACE) method was used to validate the cleavage sites of these target genes. Of these, 14 (70%) peanut miRNA targets were verified by means of gel electrophoresis, cloning and sequencing. Furthermore, functional analysis and homologous sequence retrieval were conducted for the target gene sequences, and 26 target genes were chosen for experimental study of stress resistance. Real-time fluorescence quantitative PCR (qRT-PCR) was applied to measure the expression levels of resistance-associated miRNAs and their target genes in peanut exposed to Aspergillus flavus (A. flavus) infection and drought stress, respectively. As a result, 5 miRNA-target pairs were found to be consistent with the model of miRNAs negatively regulating the expression of their target genes. This study preliminarily determined the biological functions of some resistance-associated miRNAs and their target genes in peanut. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
Silva, Francisco Goes da; Iandolino, Alberto; Al-Kayal, Fadi; Bohlmann, Marlene C.; Cushman, Mary Ann; Lim, Hyunju; Ergul, Ali; Figueroa, Rubi; Kabuloglu, Elif K.; Osborne, Craig; Rowe, Joan; Tattersall, Elizabeth; Leslie, Anna; Xu, Jane; Baek, JongMin; Cramer, Grant R.; Cushman, John C.; Cook, Douglas R.
2005-01-01
We report the analysis and annotation of 146,075 expressed sequence tags from Vitis species. The majority of these sequences were derived from different cultivars of Vitis vinifera, comprising an estimated 25,746 unique contig and singleton sequences that survey transcription in various tissues and developmental stages and during biotic and abiotic stress. Putatively homologous proteins were identified for over 17,752 of the transcripts, with 1,962 transcripts further subdivided into one or more Gene Ontology categories. A simple structured vocabulary, with modules for plant genotype, plant development, and stress, was developed to describe the relationship between individual expressed sequence tags and cDNA libraries; the resulting vocabulary provides query terms to facilitate data mining within the context of a relational database. As a measure of the extent to which characterized metabolic pathways were encompassed by the data set, we searched for homologs of the enzymes leading from glycolysis, through the oxidative/nonoxidative pentose phosphate pathway, and into the general phenylpropanoid pathway. Homologs were identified for 65 of these 77 enzymes, with 86% of enzymatic steps represented by paralogous genes. Differentially expressed transcripts were identified by means of a stringent believability index cutoff of ≥98.4%. Correlation analysis and two-dimensional hierarchical clustering grouped these transcripts according to similarity of expression. In the broadest analysis, 665 differentially expressed transcripts were identified across 29 cDNA libraries, representing a range of developmental and stress conditions. The groupings revealed expected associations between plant developmental stages and tissue types, with the notable exception of abiotic stress treatments. 
A more focused analysis of flower and berry development identified 87 differentially expressed transcripts and provides the basis for a compendium that relates gene expression and annotation to previously characterized aspects of berry development and physiology. Comparison with published results for select genes, as well as correlation analysis between independent data sets, suggests that the inferred in silico patterns of expression are likely to be an accurate representation of transcript abundance for the conditions surveyed. Thus, the combined data set reveals the in silico expression patterns for hundreds of genes in V. vinifera, the majority of which have not been previously studied within this species. PMID:16219919
2012-01-01
Background Water stress limits plant survival and production in many parts of the world. Identification of genes and alleles responding to water stress conditions is important in breeding plants better adapted to drought. Currently there are no studies examining transcriptome-wide gene and allelic expression patterns under water stress conditions. We used RNA sequencing (RNA-seq) to identify the candidate genes and alleles and to explore the evolutionary signatures of selection. Results We studied the effect of water stress on gene expression in Eucalyptus camaldulensis seedlings derived from three natural populations. We used reference-guided transcriptome mapping to study gene expression. Several genes showed differential expression between control and stress conditions. Gene ontology (GO) enrichment tests revealed up-regulation of 140 stress-related gene categories and down-regulation of 35 metabolic and cell wall organisation gene categories. More than 190,000 single nucleotide polymorphisms (SNPs) were detected and 2737 of these showed differential allelic expression. Allelic expression of 52% of these variants was correlated with differential gene expression. Signatures of selection were studied by estimating the ratio of nonsynonymous to synonymous substitution rates (Ka/Ks). The average Ka/Ks ratio among the 13,719 genes was 0.39, indicating that most of the genes are under purifying selection. Among the positively selected genes (Ka/Ks > 1.5), apoptosis and cell death categories were enriched. Of the 287 positively selected genes, 90 showed differential expression, and 27 SNPs from 17 positively selected genes showed differential allelic expression between treatments. Conclusions Correlation of allelic expression of several SNPs with total gene expression indicates that these variants may be the cis-acting variants or in linkage disequilibrium with such variants. 
Enrichment of apoptosis and cell death gene categories among the positively selected genes reveals the past selection pressures experienced by the populations used in this study. PMID:22853646
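The Ka/Ks screen described in this record can be sketched as a simple ratio and threshold rule. The 1.5 cutoff for positive selection is taken from the abstract; the cutoff of 1 for purifying selection is the conventional interpretation, and the function names and gene values are hypothetical.

```python
def ka_ks_ratio(ka, ks):
    """Ratio of nonsynonymous (Ka) to synonymous (Ks) substitution rates.

    Returns None when Ks is zero, because the ratio is then undefined.
    """
    if ks == 0:
        return None
    return ka / ks


def classify(ratio, positive_cutoff=1.5):
    """Label a gene by its Ka/Ks ratio (cutoff of 1.5 as in the abstract)."""
    if ratio is None:
        return "undefined"
    if ratio > positive_cutoff:
        return "positive selection"
    if ratio < 1.0:
        return "purifying selection"
    return "unclassified"


# Hypothetical (Ka, Ks) estimates for three genes.
estimates = {"geneA": (0.0039, 0.0100), "geneB": (0.0300, 0.0150), "geneC": (0.0020, 0.0)}

for gene, (ka, ks) in estimates.items():
    print(gene, classify(ka_ks_ratio(ka, ks)))
```

Genes with ratios between 1 and the cutoff are left unclassified here, since the abstract does not say how such cases were treated.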
The Intolerance of Regulatory Sequence to Genetic Variation Predicts Gene Dosage Sensitivity
Wang, Quanli; Halvorsen, Matt; Han, Yujun; Weir, William H.; Allen, Andrew S.; Goldstein, David B.
2015-01-01
Noncoding sequence contains pathogenic mutations. Yet, compared with mutations in protein-coding sequence, pathogenic regulatory mutations are notoriously difficult to recognize. Most fundamentally, we are not yet adept at recognizing the sequence stretches in the human genome that are most important in regulating the expression of genes. For this reason, it is difficult to apply to the regulatory regions the same kinds of analytical paradigms that are being successfully applied to identify mutations among protein-coding regions that influence risk. To determine whether dosage sensitive genes have distinct patterns among their noncoding sequence, we present two primary approaches that focus solely on a gene’s proximal noncoding regulatory sequence. The first approach is a regulatory sequence analogue of the recently introduced residual variation intolerance score (RVIS), termed noncoding RVIS, or ncRVIS. The ncRVIS compares observed and predicted levels of standing variation in the regulatory sequence of human genes. The second approach, termed ncGERP, reflects the phylogenetic conservation of a gene’s regulatory sequence using GERP++. We assess how well these two approaches correlate with four gene lists that use different ways to identify genes known or likely to cause disease through changes in expression: 1) genes that are known to cause disease through haploinsufficiency, 2) genes curated as dosage sensitive in ClinGen’s Genome Dosage Map, 3) genes judged likely to be under purifying selection for mutations that change expression levels because they are statistically depleted of loss-of-function variants in the general population, and 4) genes judged unlikely to cause disease based on the presence of copy number variants in the general population. We find that both noncoding scores are highly predictive of dosage sensitivity using any of these criteria. 
In a similar way to ncGERP, we assess two ensemble-based predictors of regional noncoding importance, ncCADD and ncGWAVA, and find both scores are significantly predictive of human dosage sensitive genes and appear to carry information beyond conservation, as assessed by ncGERP. These results highlight that the intolerance of noncoding sequence stretches in the human genome can provide a critical complementary tool to other genome annotation approaches to help identify the parts of the human genome increasingly likely to harbor mutations that influence risk of disease. PMID:26332131
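The idea of comparing observed with predicted variation can be illustrated with a residual from a least-squares fit. This is only a sketch of the general residual-score idea: the choice of predictor, the use of raw rather than studentized residuals, and the toy numbers are assumptions, not the published ncRVIS procedure.

```python
def regression_residuals(predictor, observed):
    """Residuals of observed values after a simple least-squares line fit."""
    n = len(predictor)
    mean_x = sum(predictor) / n
    mean_y = sum(observed) / n
    sxx = sum((x - mean_x) ** 2 for x in predictor)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(predictor, observed))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(predictor, observed)]


# Hypothetical per-gene values: a predictor of expected variation (for
# example, all variants seen in the regulatory region) and the observed
# count of common variants in that region.
all_variants = [1, 2, 3, 4]
common_variants = [2, 4, 6, 9]

scores = regression_residuals(all_variants, common_variants)
for index, score in enumerate(scores):
    label = "fewer than predicted" if score < 0 else "at or above predicted"
    print(f"gene{index}: residual {score:+.2f} ({label})")
```

A negative residual marks a gene whose regulatory sequence carries less common variation than the fit predicts, which is the direction interpreted as intolerance.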
Comparative transcriptome analysis of microsclerotia development in Nomuraea rileyi
2013-01-01
Background Nomuraea rileyi is used as an environmentally friendly biopesticide. However, mass production and commercialization of this organism are limited due to its fastidious growth and sporulation requirements. When N. rileyi was cultured in amended medium, we found that it could produce microsclerotia, replacing conidiophores as the infectious agent. However, little is known about the genes involved in microsclerotia development. In the present study, the transcriptomes were analyzed using next-generation sequencing technology to find the genes involved in microsclerotia development. Results A total of 4.69 Gb of clean nucleotides comprising 32,061 sequences was obtained, and 20,919 sequences were annotated (about 65%). Among the annotated sequences, only 5928 were annotated with 34 gene ontology (GO) functional categories, and 12,778 sequences were mapped to 165 pathways by searching against the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database. Furthermore, we assessed the transcriptomic differences between cultures grown in minimal and amended medium. In total, 4808 sequences were found to be differentially expressed; 719 differentially expressed unigenes were assigned to 25 GO classes and 1888 differentially expressed unigenes were assigned to 161 KEGG pathways, including 25 enriched pathways. Subsequently, we examined the genes that were up-regulated or uniquely expressed following amended medium treatment and that also belonged to the enriched pathways, and found that most of them participated in mediating oxidative stress homeostasis. To elucidate the role of oxidative stress in microsclerotia development, we analyzed the diversification of unigenes using quantitative reverse transcription-PCR (RT-qPCR). Conclusion Our findings suggest that oxidative stress occurs during microsclerotia development, along with a broad metabolic activity change. Our data provide the most comprehensive sequence resource available for the study of N. rileyi. 
We believe that the transcriptome datasets will serve as an important public information platform to accelerate studies on N. rileyi microsclerotia. PMID:23777366
Liu, Y; Chatterjee, A; Chatterjee, A K
1994-01-01
Our previous genetic analysis (J. W. Willis, J. K. Engwall, and A. K. Chatterjee, Phytopathology 77:1199-1205, 1987) had revealed a tight linkage between pel-3 (pel, pectate lyase gene) and peh-1 (peh, polygalacturonase gene) within the chromosome of Erwinia carotovora subsp. carotovora 71. Nucleotide sequencing, transcript assays, and expression of enzymatic activities in Escherichia coli have now confirmed that a 3,500-bp segment contains the open reading frames (ORFs) for Pel-3 and Peh-1. The 1,041-bp pel-3 ORF and the 1,206-bp peh-1 ORF are separated by a 579-bp sequence. The genes are transcribed divergently from their own promoters. In E. coli and E. carotovora subsp. carotovora 71, peh-1 is better expressed than pel-3. However, plant signals activate the expression of both the genes in E. carotovora subsp. carotovora. A consensus integration host factor (IHF)-binding sequence upstream of pel-3 appears physiologically significant, since pel-3 promoter activity is higher in an E. coli IHF+ strain than in an IHF- strain. While peh-1 has extensive homology with plant and bacterial peh genes, pel-3 appears not to have significant homology with the pel genes belonging to the pelBC, pelADE, or periplasmic pel families. Pel-3 also is unusual in that it is predicted to contain an ATP- and GTP-binding site motif A (P-loop) not found in the other Pels. Images PMID:8074530
Thyssen, Gregory N; Fang, David D; Turley, Rickie B; Florane, Christopher; Li, Ping; Naoumkina, Marina
2015-09-01
Mapping-by-sequencing and SNP marker analysis were used to fine map the Ligon-lintless-1 (Li1) short fiber mutation in tetraploid cotton to a 255-kb region that contains 16 annotated proteins. The Ligon-lintless-1 (Li1) mutant of cotton (Gossypium hirsutum L.) has been studied as a model for cotton fiber development since its identification in 1929; however, the causative mutation has not yet been identified. Here we report the fine genetic mapping of the mutation to a 255-kb region that contains only 16 annotated genes in the reference Gossypium raimondii genome. We took advantage of the incompletely dominant dwarf vegetative phenotype to identify 100 mutant (Li1/Li1) and 100 wild-type (li1/li1) homozygotes from a mapping population of 2567 F2 plants, which we bulked and deep sequenced. Since only homozygotes were sequenced, we were able to use a high stringency in SNP calling to rapidly narrow down the region harboring the Li1 locus, and designed subgenome-specific SNP markers to test the population. We characterized the expression of all sixteen genes in the region by RNA sequencing of elongating fibers and by RT-qPCR at seven time points spanning fiber development. One of the most highly expressed genes found in this interval in wild-type fiber cells is 40-fold under-expressed at the day of anthesis (DOA) in the mutant fiber cells. This gene encodes a major facilitator superfamily protein, part of the large family of proteins that includes auxin and sugar transporters. Interestingly, nearly all genes in this region were most highly expressed at DOA and showed a high degree of co-expression. Further characterization is required to determine if transport of hormones or carbohydrates is involved in both the dwarf and lintless phenotypes of Li1 plants.
Wang, Tianyu; Nabavi, Sheida
2018-04-24
Differential gene expression analysis is one of the significant efforts in single cell RNA sequencing (scRNAseq) analysis to discover the specific changes in expression levels of individual cell types. Since scRNAseq exhibits multimodality, large amounts of zero counts, and sparsity, it is different from the traditional bulk RNA sequencing (RNAseq) data. The new challenges of scRNAseq data promote the development of new methods for identifying differentially expressed (DE) genes. In this study, we proposed a new method, SigEMD, that combines a data imputation approach, a logistic regression model and a nonparametric method based on the Earth Mover's Distance, to precisely and efficiently identify DE genes in scRNAseq data. The regression model and data imputation are used to reduce the impact of large amounts of zero counts, and the nonparametric method is used to improve the sensitivity of detecting DE genes from multimodal scRNAseq data. By additionally employing gene interaction network information to adjust the final states of DE genes, we further reduce the false positives of calling DE genes. We used simulated datasets and real datasets to evaluate the detection accuracy of the proposed method and to compare its performance with those of other differential expression analysis methods. Results indicate that the proposed method has an overall powerful performance in terms of precision in detection, sensitivity, and specificity. Copyright © 2018 Elsevier Inc. All rights reserved.
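The distance at the core of the nonparametric step can be illustrated with a minimal sketch. It computes the one-dimensional Earth Mover's Distance between the expression values of one gene in two groups of cells, as the area between the two empirical cumulative distributions. This is not the SigEMD implementation: the imputation, logistic regression, and network adjustment steps are omitted, and the function name `emd_1d` and the toy values are hypothetical.

```python
from bisect import bisect_right


def emd_1d(a, b):
    """1D Earth Mover's Distance between two samples.

    Computed as the integral of |F_a(x) - F_b(x)| over x, where F_a and
    F_b are the empirical cumulative distribution functions.
    """
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(a), sorted(b)
    grid = sorted(set(a) | set(b))  # points where either CDF can jump
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        fa = bisect_right(a, left) / len(a)  # fraction of a <= left
        fb = bisect_right(b, left) / len(b)  # fraction of b <= left
        total += abs(fa - fb) * (right - left)
    return total


# Hypothetical expression values for one gene in two cell groups
# (zeros mimic dropout, the second group is bimodal).
group_1 = [0.0, 0.0, 1.0, 1.5, 2.0]
group_2 = [0.0, 3.0, 3.5, 0.0, 4.0]
distance = emd_1d(group_1, group_2)
```

A larger distance indicates that more expression "mass" must be moved to turn one distribution into the other; in practice a significance estimate (for example, by permuting group labels) would still be needed before calling a gene differentially expressed.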
Zhao, Jie
2010-01-01
Arabinogalactan proteins (AGPs) comprise a family of hydroxyproline-rich glycoproteins that are implicated in plant growth and development. In this study, 69 AGPs are identified from the rice genome, including 13 classical AGPs, 15 arabinogalactan (AG) peptides, three non-classical AGPs, three early nodulin-like AGPs (eNod-like AGPs), eight non-specific lipid transfer protein-like AGPs (nsLTP-like AGPs), and 27 fasciclin-like AGPs (FLAs). The results from expressed sequence tags, microarrays, and massively parallel signature sequencing tags are used to analyse the expression of AGP-encoding genes, which is confirmed by real-time PCR. The results reveal that several rice AGP-encoding genes are predominantly expressed in anthers and display differential expression patterns in response to abscisic acid, gibberellic acid, and abiotic stresses. Based on the results obtained from this analysis, an attempt has been made to link the protein structures and expression patterns of rice AGP-encoding genes to their functions. Taken together, the genome-wide identification and expression analysis of the rice AGP gene family might facilitate further functional studies of rice AGPs. PMID:20423940
Maldonado-Borges, Josefina Ines; Ku-Cauich, José Roberto; Escobedo-Graciamedrano, Rosa Maria
2013-01-01
cDNA-AFLP analysis was used to study the genes expressed in zygotic and somatic embryogenesis of Musa acuminata Colla ssp. malaccensis, and the resulting transcript-derived fragments (TDFs) were compared with the publicly available genome sequence of the doubled-haploid (DH) Pahang of the malaccensis subspecies. A total of 253 TDFs with apparent sizes of 100-4000 bp were detected using 5 pairs of AFLP primers, of which 21 were differentially expressed during the different stages of banana embryogenesis. Fifteen of the sequences matched DH-Pahang chromosomes, and 7 of them were homologous to gene sequences encoding either known or putative protein domains of higher plants. Four TDF sequences were located in all Musa chromosomes, while the rest were located in one or two chromosomes. Their putative individual functions are briefly reviewed based on published information, and the potential roles of these genes in embryo development are discussed. Thus, the availability of the Musa genome and the TDF sequence information presented here open new possibilities for in-depth molecular and biochemical study of zygotic and somatic embryogenesis in Musa.
Stable Zymomonas mobilis xylose and arabinose fermenting strains
Zhang, Min [Lakewood, CO]; Chou, Yat-Chen [Taipei, TW]
2008-04-08
The present invention briefly includes a transposon for stable insertion of foreign genes into a bacterial genome, comprising at least one operon having structural genes encoding enzymes selected from the group consisting of xylAxylB, araBAD and tal/tkt, and at least one promoter for expression of the structural genes in the bacterium, a pair of inverted insertion sequences, the operons contained inside the insertion sequences, and a transposase gene located outside of the insertion sequences. A plasmid shuttle vector for transformation of foreign genes into a bacterial genome, comprising at least one operon having structural genes encoding enzymes selected from the group consisting of xylAxylB, araBAD and tal/tkt, at least one promoter for expression of the structural genes in the bacterium, and at least two DNA fragments having homology with a gene in the bacterial genome to be transformed, is also provided. The transposon and shuttle vectors are useful in constructing significantly different Zymomonas mobilis strains, according to the present invention, which are useful in the conversion of the cellulose derived pentose sugars into fuels and chemicals, using traditional fermentation technology, because they are stable for expression in a non-selection medium.
Weigel, B J; Burgett, S G; Chen, V J; Skatrud, P L; Frolik, C A; Queener, S W; Ingolia, T D
1988-01-01
beta-Lactam antibiotics such as penicillins and cephalosporins are synthesized by a wide variety of microbes, including procaryotes and eucaryotes. Isopenicillin N synthetase catalyzes a key reaction in the biosynthetic pathway of penicillins and cephalosporins. The genes encoding this protein have previously been cloned from the filamentous fungi Cephalosporium acremonium and Penicillium chrysogenum and characterized. We have extended our analysis to the isopenicillin N synthetase genes from the fungus Aspergillus nidulans and the gram-positive procaryote Streptomyces lipmanii. The isopenicillin N synthetase genes from these organisms have been cloned and sequenced, and the proteins encoded by the open reading frames were expressed in Escherichia coli. Active isopenicillin N synthetase enzyme was recovered from extracts of E. coli cells prepared from cells containing each of the genes in expression vectors. The four isopenicillin N synthetase genes studied are closely related. Pairwise comparison of the DNA sequences showed between 62.5 and 75.7% identity; comparison of the predicted amino acid sequences showed between 53.9 and 80.6% identity. The close homology of the procaryotic and eucaryotic isopenicillin N synthetase genes suggests horizontal transfer of the genes during evolution. Images PMID:3045077
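The pairwise comparison described above can be sketched as follows, assuming the sequences have already been aligned to equal length with `-` as the gap character. The abstract does not state how identity was computed, so the denominator convention used here (all columns except gap-gap columns) is an assumption, and the function name and toy sequences are hypothetical.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where both sequences have a gap are skipped; a gap opposite a
    residue counts as a mismatch (an assumed convention).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, not real isopenicillin N synthetase data.
aligned = {
    "seq1": "ATGGCT-CAA",
    "seq2": "ATGGCTTCAA",
    "seq3": "ATGACT-CAG",
}
identities = {
    (a, b): percent_identity(aligned[a], aligned[b])
    for a, b in combinations(aligned, 2)
}
```

With four genes, as in the study, the same loop would yield six pairwise values, from which a minimum and maximum such as the reported ranges can be read off.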
Krisinger, J; Jeung, E B; Simmen, R C; Leung, P C
1995-01-01
The expression of Calbindin-D9k (CaBP-9k) in the pig uterus and placenta was measured by Northern blot analysis and reverse transcription polymerase chain reaction (PCR), respectively. Progesterone (P4) administration to ovariectomized pigs decreased CaBP-9k mRNA levels. Expression of endometrial CaBP-9k mRNA was high on pregnancy Days 10-12 and below the detection limit on Days 15 and 18. On Day 60, expression could be detected at low levels. In myometrium and placenta, CaBP-9k mRNA expression was not detectable by Northern analysis using total RNA. Reverse-transcribed RNA from both tissues demonstrated the presence of CaBP-9k transcripts by means of PCR. The partial CaBP-9k gene was amplified by PCR and cloned to determine the sequence of intron A. In contrast to the rat CaBP-9k gene, the pig gene does not contain a functional estrogen response element (ERE) within this region. A similar ERE-like sequence located at the identical location was examined by gel retardation analysis and failed to bind the estradiol receptor. A similar disruption of this ERE-like sequence has been described in the human CaBP-9k gene, which is not expressed at any level in placenta, myometrium, or endometrium. It is concluded that the pig CaBP-9k gene is regulated in these reproductive tissues in a manner distinct from that in rat and human tissues. The regulation is probably due to a regulatory region outside of intron A, which in the rat gene contains the key cis element for uterine expression of the CaBP-9k gene.
Higashi, Koichi; Tobe, Toru; Kanai, Akinori; Uyar, Ebru; Ishikawa, Shu; Suzuki, Yutaka; Ogasawara, Naotake; Kurokawa, Ken; Oshima, Taku
2016-01-01
Bacteria can acquire new traits through horizontal gene transfer. Inappropriate expression of transferred genes, however, can disrupt the physiology of the host bacteria. To reduce this risk, Escherichia coli expresses the nucleoid-associated protein, H-NS, which preferentially binds to horizontally transferred genes to control their expression. Once expression is optimized, the horizontally transferred genes may actually contribute to E. coli survival in new habitats. Therefore, we investigated whether and how H-NS contributes to this optimization process. A comparison of H-NS binding profiles on common chromosomal segments of three E. coli strains belonging to different phylogenetic groups indicated that the positions of H-NS-bound regions have been conserved in E. coli strains. The sequences of the H-NS-bound regions appear to have diverged more so than H-NS-unbound regions only when H-NS-bound regions are located upstream or in coding regions of genes. Because these regions generally contain regulatory elements for gene expression, sequence divergence in these regions may be associated with alteration of gene expression. Indeed, nucleotide substitutions in H-NS-bound regions of the ybdO promoter and coding regions have diversified the potential for H-NS-independent negative regulation among E. coli strains. The ybdO expression in these strains was still negatively regulated by H-NS, which reduced the effect of H-NS-independent regulation under normal growth conditions. Hence, we propose that, during E. coli evolution, the conservation of H-NS binding sites resulted in the diversification of the regulation of horizontally transferred genes, which may have facilitated E. coli adaptation to new ecological niches. PMID:26789284
Kozlov, Konstantin N.; Kulakovskiy, Ivan V.; Zubair, Asif; Marjoram, Paul; Lawrie, David S.; Nuzhdin, Sergey V.; Samsonova, Maria G.
2017-01-01
Annotating the genotype-phenotype relationship, and developing a proper quantitative description of the relationship, requires understanding the impact of natural genomic variation on gene expression. We apply a sequence-level model of gap gene expression in the early development of Drosophila to analyze single nucleotide polymorphisms (SNPs) in a panel of natural sequenced D. melanogaster lines. Using a thermodynamic modeling framework, we provide both analytical and computational descriptions of how single-nucleotide variants affect gene expression. The analysis reveals that the sequence variants increase (decrease) gene expression if located within binding sites of repressors (activators). We show that the sign of SNP influence (activation or repression) may change in time and space and elucidate the origin of this change in specific examples. The thermodynamic modeling approach predicts non-local and non-linear effects arising from SNPs, and combinations of SNPs, in individual fly genotypes. Simulation of individual fly genotypes using our model reveals that this non-linearity reduces to almost additive inputs from multiple SNPs. Further, we see signatures of the action of purifying selection in the gap gene regulatory regions. To infer the specific targets of purifying selection, we analyze the patterns of polymorphism in the data at two phenotypic levels: the strengths of binding and expression. We find that combinations of SNPs show evidence of being under selective pressure, while individual SNPs do not. The model predicts that SNPs appear to accumulate in the genotypes of the natural population in a way biased towards small increases in activating action on the expression pattern. Taken together, these results provide a systems-level view of how genetic variation translates to the level of gene regulatory networks via combinatorial SNP effects. PMID:28898266
Negre, Bárbara; Casillas, Sònia; Suzanne, Magali; Sánchez-Herrero, Ernesto; Akam, Michael; Nefedov, Michael; Barbadilla, Antonio; de Jong, Pieter; Ruiz, Alfredo
2005-01-01
Homeotic (Hox) genes are usually clustered and arranged in the same order as they are expressed along the anteroposterior body axis of metazoans. The mechanistic explanation for this colinearity has been elusive, and it may well be that a single and universal cause does not exist. The Hox-gene complex (HOM-C) has been rearranged differently in several Drosophila species, producing a striking diversity of Hox gene organizations. We investigated the genomic and functional consequences of the two HOM-C splits present in Drosophila buzzatii. Firstly, we sequenced two regions of the D. buzzatii genome, one containing the genes labial and abdominal A, and another one including proboscipedia, and compared their organization with that of D. melanogaster and D. pseudoobscura in order to map precisely the two splits. Then, a plethora of conserved noncoding sequences, which are putative enhancers, were identified around the three Hox genes closer to the splits. The position and order of these enhancers are conserved, with minor exceptions, between the three Drosophila species. Finally, we analyzed the expression patterns of the same three genes in embryos and imaginal discs of four Drosophila species with different Hox-gene organizations. The results show that their expression patterns are conserved despite the HOM-C splits. We conclude that, in Drosophila, Hox-gene clustering is not an absolute requirement for proper function. Rather, the organization of Hox genes is modular, and their clustering seems the result of phylogenetic inertia more than functional necessity. PMID:15867430
Wittenberger, T; Schaller, H C; Hellebrand, S
2001-03-30
We have developed a comprehensive expressed sequence tag database search method and used it for the identification of new members of the G-protein coupled receptor superfamily. Our approach proved to be especially useful for the detection of expressed sequence tag sequences that do not encode conserved parts of a protein, making it an ideal tool for the identification of members of divergent protein families or of protein parts without conserved domain structures in the expressed sequence tag database. At least 14 of the expressed sequence tags found with this strategy are promising candidates for new putative G-protein coupled receptors. Here, we describe the sequence and expression analysis of five new members of this receptor superfamily, namely GPR84, GPR86, GPR87, GPR90 and GPR91. We also studied the genomic structure and chromosomal localization of the respective genes applying in silico methods. A cluster of six closely related G-protein coupled receptors was found on the human chromosome 3q24-3q25. It consists of four orphan receptors (GPR86, GPR87, GPR91, and H963), the purinergic receptor P2Y1, and the uridine 5'-diphosphoglucose receptor KIAA0001. It seems likely that these receptors evolved from a common ancestor and therefore might have related ligands. In conclusion, we describe a data mining procedure that proved to be useful for the identification and first characterization of new genes and is well applicable for other gene families. Copyright 2001 Academic Press.
Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya
2015-01-01
Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930
Gupta, Sanjay; Pathak, Rashmi U; Kanungo, Madhu S
2006-08-01
One approach to the understanding of the molecular basis of aging in higher organisms may be to use genes whose timing and rate of expression during the life span run parallel with specific functions that can be monitored. The genes for egg proteins, such as vitellogenin (VTG), which is expressed in the liver, and ovalbumin, lysozyme etc. that are expressed in the oviduct of birds, meet these requirements. Egg laying function is dependent on the production of these proteins, which, in turn, depends on the expression of their genes. In this communication we present the age-related studies on the VTG II gene of the bird, Japanese quail. The gene is expressed only in the liver and its expression is considerably lower in old birds that do not lay eggs. Comparison of the promoter region of the gene carrying the two important cis-acting elements, estrogen responsive element (ERE) and progesterone responsive element (PRE), shows it to be 100% homologous to the corresponding region of the chicken VTG II gene. Methylation of DNA and conformation of chromatin of this region were studied, as they are known to be important for regulation of expression of genes. Our studies show that in the liver of adult female quails which lay eggs, a -CCGG- sequence located in this region is hypomethylated, and the chromatin encompassing this region of the gene is relaxed. In the old, the -CCGG- sequence is hypermethylated and the chromatin is compact. This is correlated with a decrease in the expression of the gene and decrease in egg production. Further, electrophoretic mobility shift assay (EMSA) shows that the levels/affinity of specific trans-acting factors that bind to ERE and PRE present in the region, are not different in adult and old birds. 
Hence the methylation status of the -CCGG- sequence that is located in-between the ERE and the PRE may be crucial for the conformation of chromatin and availability of these two important cis-acting elements for the binding of the trans-acting factors. This, in turn, may downregulate the expression of the gene in old birds.
Wang, S Y; Huo, J L; Miao, Y W; Cheng, W M; Zeng, Y Z
2013-04-02
U2 small nuclear RNA auxiliary factor 2 (U2AF2) is an important gene for pre-messenger RNA splicing in higher eukaryotes. In this study, the Banna mini-pig inbred line (BMI) U2AF2 coding sequence (CDS) was cloned, sequenced, and characterized. The U2AF2 complete CDS was amplified using the reverse transcription-polymerase chain reaction (RT-PCR) technique based on the conserved sequence information of cattle and known highly homologous swine expressed sequence tags. This novel gene was deposited into the National Center for Biotechnology Information database (Accession No. JQ839267). Sequence analysis revealed that the BMI U2AF2 coding sequence consisted of 1416 bp and encoded 471 amino acids with a molecular weight of 53.12 kDa. The protein sequence has high sequence homology with U2AF65 of 6 species - Homo sapiens (100%), Equus caballus (100%), Canis lupus (100%), Macaca mulatta (99.8%), Bos taurus (74.4%), and Mus musculus (74.4%). The phylogenetic tree analysis revealed that BMI U2AF65 has a closer genetic relationship with B. taurus U2AF65 than with U2AF65 of E. caballus, C. lupus, M. mulatta, H. sapiens, and M. musculus. RT-PCR analysis showed that BMI U2AF2 was most highly expressed in the brain; moderately expressed in the spleen, lung, muscle, and skin; and weakly expressed in the liver, kidney, and ovary. Its expression was nearly silent in the spinal cord, nerve fiber, heart, stomach, pancreas, and intestine. Three microRNA target sites were predicted in the CDS of BMI U2AF2 messenger RNA. Our results establish a foundation for further insight into this swine gene.
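The relationship between the reported coding sequence length and protein length can be checked with a short sketch. It assumes, as is conventional but not stated in the abstract, that the 1416-bp figure includes the stop codon; the function name is hypothetical.

```python
def protein_length_from_cds(cds_bp, includes_stop=True):
    """Number of amino acids encoded by a coding sequence of cds_bp bases.

    Assumes a complete open reading frame whose length is a multiple of 3.
    If the stop codon is counted in cds_bp, it is subtracted because it
    does not encode an amino acid.
    """
    if cds_bp <= 0 or cds_bp % 3 != 0:
        raise ValueError("CDS length must be a positive multiple of 3")
    codons = cds_bp // 3
    return codons - 1 if includes_stop else codons


# 1416 bp -> 472 codons -> 471 amino acids when the stop codon is included.
bmi_u2af2_residues = protein_length_from_cds(1416)
```

Under that assumption the result agrees with the 471 amino acids reported above.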
Evidence of function for conserved noncoding sequences in Arabidopsis thaliana.
Spangler, Jacob B; Subramaniam, Sabarinath; Freeling, Michael; Feltus, F Alex
2012-01-01
• Whole genome duplication events provide a lineage with a large reservoir of genes that can be molded by evolutionary forces into phenotypes that fit alternative environments. A well-studied whole genome duplication, the α-event, occurred in an ancestor of the model plant Arabidopsis thaliana. Retained segments of the α-event have been defined in recent years in the form of duplicate protein coding sequences (α-pairs) and associated conserved noncoding DNA sequences (CNSs). Our aim was to identify any association between CNSs and α-pair co-functionality at the gene expression level. • Here, we tested for correlation between CNS counts and α-pair co-expression and expression intensity across nine expression datasets: aerial tissue, flowers, leaves, roots, rosettes, seedlings, seeds, shoots and whole plants. • We provide evidence for a putative regulatory role of the CNSs. The association of CNSs with α-pair co-expression and expression intensity varied by gene function, subgene position and the presence of transcription factor binding motifs. A range of possible CNS regulatory mechanisms, including intron-mediated enhancement, messenger RNA fold stability and transcriptional regulation, are discussed. • This study provides a framework to understand how CNS motifs are involved in the maintenance of gene expression after a whole genome duplication event. © 2011 The Authors. New Phytologist © 2011 New Phytologist Trust.
Feng, Yong-qian; Zha, Xiao-jun; Zhai, Chao-yang
2007-07-01
To construct the eucaryotic recombinant plasmid pYES2/Lactoferricin B for expression in the yeast S. cerevisiae and to make a preliminary assessment of the antibacterial activity of the expressed protein. Using a self-template PCR method, the Lactoferricin B gene and several of its sequence mutants were amplified from pre-synthesized single-stranded fragments. The Lactoferricin B gene and its mutants were then cloned into the vector pYES2 to construct the recombinant expression plasmids pYES2/Lactoferricin B etc., which were extracted and used to transform S. cerevisiae. Protein expression was determined after induction with galactose. The expressed proteins were collected and purified on an ion-exchange column, and a bacterial inhibition test was used to assess their antibacterial activities. PCR amplification and DNA sequencing indicated that the target plasmids contained the Lactoferricin B gene and its mutants. The induced target proteins were confirmed by SDS-PAGE and mass spectrometry. The antibacterial activities of the mutant proteins were verified in preliminary tests. The recombinant plasmids pYES2/Lactoferricin B etc. were successfully constructed and induced to express in S. cerevisiae; the recombinant Lactoferricin B protein obtained provides a basis for further research on its biological function and antibacterial activity.
Asamizu, E; Nakamura, Y; Sato, S; Tabata, S
2000-06-30
For comprehensive analysis of genes expressed in the model dicotyledonous plant, Arabidopsis thaliana, expressed sequence tags (ESTs) were accumulated. Normalized and size-selected cDNA libraries were constructed from aboveground organs, flower buds, roots, green siliques and liquid-cultured seedlings, respectively, and a total of 14,026 5'-end ESTs and 39,207 3'-end ESTs were obtained. The 3'-end ESTs could be clustered into 12,028 non-redundant groups. Similarity search of the non-redundant ESTs against the public non-redundant protein database indicated that 4816 groups show similarity to genes of known function, 1864 to hypothetical genes, and the remaining 5348 are novel sequences. Gene coverage by the non-redundant ESTs was analyzed using the annotated genomic sequences of approximately 10 Mb on chromosomes 3 and 5. A total of 923 regions were hit by at least one EST, among which only 499 regions were hit by the ESTs deposited in the public database. The result indicates that the EST source generated in this project complements the EST data in the public database and facilitates new gene discovery.
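The similarity-search categories reported above can be checked for internal consistency with a few lines of arithmetic. The counts are taken from the abstract; the dictionary keys and variable names are illustrative only.

```python
# Non-redundant 3'-end EST groups by similarity-search outcome,
# using the counts reported in the abstract.
est_groups = {
    "known function": 4816,
    "hypothetical genes": 1864,
    "novel sequences": 5348,
}

total_groups = sum(est_groups.values())

# Share of each category as a percentage of all non-redundant groups.
shares = {
    name: 100.0 * count / total_groups
    for name, count in est_groups.items()
}

# Share of genomic regions hit by ESTs that were already represented
# in the public database (499 of 923 regions).
public_share = 100.0 * 499 / 923
```

The three categories sum to the 12,028 non-redundant groups stated in the abstract, and a little over half of the 923 EST-hit regions were already covered by public ESTs.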
Patterns of homoeologous gene expression shown by RNA sequencing in hexaploid bread wheat
2014-01-01
Background Bread wheat (Triticum aestivum) has a large, complex and hexaploid genome consisting of A, B and D homoeologous chromosome sets. Therefore each wheat gene potentially exists as a trio of A, B and D homoeoloci, each of which may contribute differentially to wheat phenotypes. We describe a novel approach combining wheat cytogenetic resources (chromosome substitution ‘nullisomic-tetrasomic’ lines) with next generation deep sequencing of gene transcripts (RNA-Seq), to directly and accurately identify homoeologue-specific single nucleotide variants and quantify the relative contribution of individual homoeoloci to gene expression. Results We discover, based on a sample comprising ~5-10% of the total wheat gene content, that at least 45% of wheat genes are expressed from all three distinct homoeoloci. Most of these genes show strikingly biased expression patterns in which expression is dominated by a single homoeolocus. The remaining ~55% of wheat genes are expressed from either one or two homoeoloci only, through a combination of extensive transcriptional silencing and homoeolocus loss. Conclusions We conclude that wheat is tending towards functional diploidy, through a variety of mechanisms causing single homoeoloci to become the predominant source of gene transcripts. This discovery has profound consequences for wheat breeding and our understanding of wheat evolution. PMID:24726045
Transposon integration enhances expression of stress response genes
Feng, Gang; Leem, Young-Eun; Levin, Henry L.
2013-01-01
Transposable elements possess specific patterns of integration. The biological impact of these integration profiles is not well understood. Tf1, a long-terminal repeat retrotransposon in Schizosaccharomyces pombe, integrates into promoters with a preference for the promoters of stress response genes. To determine the biological significance of Tf1 integration, we took advantage of saturated maps of insertion activity and studied how integration at hot spots affected the expression of the adjacent genes. Our study revealed that Tf1 integration did not reduce gene expression. Importantly, the insertions activated the expression of 6 of 32 genes tested. We found that Tf1 increased gene expression by inserting enhancer activity. Interestingly, the enhancer activity of Tf1 could be limited by Abp1, a host surveillance factor that sequesters transposon sequences into structures containing histone deacetylases. We found the Tf1 promoter was activated by heat treatment and, remarkably, only genes that themselves were induced by heat could be activated by Tf1 integration, suggesting a synergy of Tf1 enhancer sequence with the stress response elements of target promoters. We propose that the integration preference of Tf1 for the promoters of stress response genes and the ability of Tf1 to enhance the expression of these genes co-evolved to promote the survival of cells under stress. PMID:23193295
Transcriptome assembly and digital gene expression atlas of the rainbow trout
USDA-ARS's Scientific Manuscript database
Background: Transcriptome analysis is a preferred method for gene discovery, marker development and gene expression profiling in non-model organisms. Previously, we sequenced a transcriptome reference using Sanger-based and 454-pyrosequencing, however, a transcriptome assembly is still incomplete an...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gelb, Bruce D; Tartaglia, Marco; Pennacchio, Len
Diagnostic and therapeutic applications for Noonan Syndrome are described. The diagnostic and therapeutic applications are based on certain mutations in a RAS-specific guanine nucleotide exchange factor gene SOS1 or its expression product. The diagnostic and therapeutic applications are also based on certain mutations in a serine/threonine protein kinase gene RAF1 or its expression product thereof. Also described are nucleotide sequences, amino acid sequences, probes, and primers related to RAF1 or SOS1, and variants thereof, as well as host cells expressing such variants.
Kenney, S; Kamine, J; Markovitz, D; Fenrick, R; Pagano, J
1988-01-01
Acquired immunodeficiency syndrome patients are frequently coinfected with Epstein-Barr virus (EBV). In this report, we demonstrate that an EBV immediate-early gene product, BamHI MLF1, stimulates expression of the bacterial chloramphenicol acetyltransferase (CAT) gene linked to the human immunodeficiency virus (HIV) promoter. The HIV promoter sequences necessary for trans-activation by EBV do not include the tat-responsive sequences. In addition, in contrast to the other herpesvirus trans-activators previously studied, the EBV BamHI MLF1 gene product appears to function in part by a posttranscriptional mechanism, since it increases pHIV-CAT protein activity more than it increases HIV-CAT mRNA. This ability of an EBV gene product to activate HIV gene expression may have biologic consequences in persons coinfected with both viruses. PMID:2830625
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kenney, S.; Kamine, J.; Markovitz, D.
Acquired immunodeficiency syndrome patients are frequently coinfected with Epstein-Barr virus (EBV). In this report, the authors demonstrate that an EBV immediate-early gene product, BamHI MLF1, stimulates expression of the bacterial chloramphenicol acetyltransferase (CAT) gene linked to the human immunodeficiency virus (HIV) promoter. The HIV promoter sequences necessary for trans-activation by EBV do not include the tat-responsive sequences. In addition, in contrast to the other herpesvirus trans-activators previously studied, the EBV BamHI MLF1 gene product appears to function in part by a posttranscriptional mechanism, since it increases pHIV-CAT protein activity more than it increases HIV-CAT mRNA. This ability of an EBV gene product to activate HIV gene expression may have biologic consequences in persons coinfected with both viruses.
dbWFA: a web-based database for functional annotation of Triticum aestivum transcripts
Vincent, Jonathan; Dai, Zhanwu; Ravel, Catherine; Choulet, Frédéric; Mouzeyar, Said; Bouzidi, M. Fouad; Agier, Marie; Martre, Pierre
2013-01-01
The functional annotation of genes based on sequence homology with genes from model species genomes is time-consuming because it is necessary to mine several unrelated databases. The aim of the present work was to develop a functional annotation database for common wheat Triticum aestivum (L.). The database, named dbWFA, is based on the reference NCBI UniGene set, an expressed gene catalogue built by expressed sequence tag clustering, and on full-length coding sequences retrieved from the TriFLDB database. Information from good-quality heterogeneous sources, including annotations for model plant species Arabidopsis thaliana (L.) Heynh. and Oryza sativa L., was gathered and linked to T. aestivum sequences through BLAST-based homology searches. Even though the complexity of the transcriptome cannot yet be fully appreciated, we developed a tool to easily and promptly obtain information from multiple functional annotation systems (Gene Ontology, MapMan bin codes, MIPS Functional Categories, PlantCyc pathway reactions and TAIR gene families). The use of dbWFA is illustrated here with several query examples. We were able to assign a putative function to 45% of the UniGenes and 81% of the full-length coding sequences from TriFLDB. Moreover, comparison of the annotation of the whole T. aestivum UniGene set along with curated annotations of the two model species assessed the accuracy of the annotation provided by dbWFA. To further illustrate the use of dbWFA, genes specifically expressed during the early cell division or late storage polymer accumulation phases of T. aestivum grain development were identified using a clustering analysis and then annotated using dbWFA. The annotation of these two sets of genes was consistent with previous analyses of T. aestivum grain transcriptomes and proteomes. Database URL: urgi.versailles.inra.fr/dbWFA/ PMID:23660284
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chattopadhyay, Saket; Ely, Abdullah; Bloom, Kristie
2009-11-20
RNA interference (RNAi) may be harnessed to inhibit viral gene expression and this approach is being developed to counter chronic infection with hepatitis B virus (HBV). Compared to synthetic RNAi activators, DNA expression cassettes that generate silencing sequences have advantages of sustained efficacy and ease of propagation in plasmid DNA (pDNA). However, the large size of pDNAs and inclusion of sequences conferring antibiotic resistance and immunostimulation limit delivery efficiency and safety. To develop use of alternative DNA templates that may be applied for therapeutic gene silencing, we assessed the usefulness of PCR-generated linear expression cassettes that produce anti-HBV micro-RNA (miR) shuttles. We found that silencing of HBV markers of replication was efficient (>75%) in cell culture and in vivo. miR shuttles were processed to form anti-HBV guide strands and there was no evidence of induction of the interferon response. Modification of terminal sequences to include flanking human adenoviral type-5 inverted terminal repeats was easily achieved and did not compromise silencing efficacy. These linear DNA sequences should have utility in the development of gene silencing applications where modifications of terminal elements with elimination of potentially harmful and non-essential sequences are required.
Wang, Luwen; Jiang, Ning; Wang, Lin; Fang, Ou; Leach, Lindsey J.; Hu, Xiaohua; Luo, Zewei
2014-01-01
Paired sense and antisense (S/AS) genes located in cis represent a structural feature common to the genomes of both prokaryotes and eukaryotes, and produce partially complementary transcripts. We used published genome and transcriptome sequence data and found that over 20% of genes (645 pairs) in the budding yeast Saccharomyces cerevisiae genome are arranged in convergent pairs with overlapping 3′-UTRs. Using published microarray transcriptome data from the standard laboratory strain of S. cerevisiae, our analysis revealed that expression levels of convergent pairs are significantly negatively correlated across a broad range of environments. This implies an important role for convergent genes in the regulation of gene expression, which may compensate for the absence of RNA-dependent mechanisms such as micro RNAs in budding yeast. We selected four representative convergent gene pairs and used expression assays in wild type yeast and its genetically modified strains to explore the underlying patterns of gene expression. Results showed that convergent genes are reciprocally regulated in yeast populations and in single cells, whereby an increase in expression of one gene produces a decrease in the expression of the other, and vice-versa. Time course analysis of the cell cycle illustrated the functional significance of this relationship for the three pairs with relevant functional roles. Furthermore, a series of genetic modifications revealed that the 3′-UTR sequence plays an essential causal role in mediating transcriptional interference, which requires neither the sequence of the open reading frame nor the translation of fully functional proteins. 
More importantly, transcriptional interference persisted even when one of the convergent genes was expressed ectopically (in trans) and therefore does not depend on the cis arrangement of convergent genes; we conclude that the mechanism of transcriptional interference cannot be explained by the transcriptional collision model, which postulates a clash between simultaneous transcriptional processes occurring on opposite DNA strands. PMID:24465217
Chanhome, Lawan; Tan, Nget Hong
2017-01-01
Background The monocled cobra (Naja kaouthia) is a medically important venomous snake in Southeast Asia. Its venom has been shown to vary geographically in relation to venom composition and neurotoxic activity, indicating vast diversity of the toxin genes within the species. To investigate the polygenic trait of the venom and its locale-specific variation, we profiled and compared the venom gland transcriptomes of N. kaouthia from Malaysia (NK-M) and Thailand (NK-T) applying next-generation sequencing (NGS) technology. Methods The transcriptomes were sequenced on the Illumina HiSeq platform, assembled and followed by transcript clustering and annotations for gene expression and function. Pairwise or multiple sequence alignments were conducted on the toxin genes expressed. Substitution rates were studied for the major toxins co-expressed in NK-M and NK-T. Results and discussion The toxin transcripts showed high redundancy (41–82% of the total mRNA expression) and comprised 23 gene families expressed in NK-M and NK-T, respectively (22 gene families were co-expressed). Among the venom genes, three-finger toxins (3FTxs) predominated in the expression, with multiple sequences noted. Comparative analysis and selection study revealed that 3FTxs are genetically conserved between the geographical specimens whilst demonstrating distinct differential expression patterns, implying gene up-regulation for selected principal toxins, or alternatively, enhanced transcript degradation or lack of transcription of certain traits. One of the striking features that elucidates the inter-geographical venom variation is the up-regulation of α-neurotoxins (constitutes ∼80.0% of toxin’s fragments per kilobase of exon model per million mapped reads (FPKM)), particularly the long-chain α-elapitoxin-Nk2a (48.3%) in NK-T but only 1.7% was noted in NK-M. Instead, short neurotoxin isoforms were up-regulated in NK-M (46.4%). 
Another distinct transcriptional pattern observed is the exclusively and abundantly expressed cytotoxin CTX-3 in NK-T. The findings suggested correlation with the geographical variation in proteome and toxicity of the venom, and support the call for optimising antivenom production and use in the region. Besides, the current study uncovered full and partial sequences of numerous toxin genes from N. kaouthia which have not been reported hitherto; these include N. kaouthia-specific l-amino acid oxidase (LAAO), snake venom serine protease (SVSP), cystatin, acetylcholinesterase (AChE), hyaluronidase (HYA), waprin, phospholipase B (PLB), aminopeptidase (AP), neprilysin, etc. Taken together, the findings further enrich the snake toxin database and provide deeper insights into the genetic diversity of cobra venom toxins. PMID:28392982
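The abstract above reports toxin abundance as shares of FPKM (fragments per kilobase of exon model per million mapped reads). As a minimal sketch of how such a value and a percentage share can be computed, assuming the standard FPKM definition, the following uses hypothetical function names and made-up toy counts; none of the numbers come from the study.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon model per million mapped fragments."""
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("exon length and library size must be positive")
    kilobases = exon_length_bp / 1_000
    millions_mapped = total_mapped_fragments / 1_000_000
    return fragments / kilobases / millions_mapped


def percent_share(fpkm_by_transcript, name):
    """Percentage of the summed FPKM contributed by one transcript."""
    total = sum(fpkm_by_transcript.values())
    return 100.0 * fpkm_by_transcript[name] / total


if __name__ == "__main__":
    # Toy values only (hypothetical transcripts, not data from the abstract).
    toy = {
        "toxin_A": fpkm(500, 1_000, 10_000_000),
        "toxin_B": fpkm(300, 2_000, 10_000_000),
    }
    for name in toy:
        print(name, round(toy[name], 2), round(percent_share(toy, name), 1))
```

Normalizing by transcript length and library size is what makes shares comparable between two libraries of different depth, as in a two-locale comparison.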
Information Propagation in Developmental Enhancers
NASA Astrophysics Data System (ADS)
Jena, Siddhartha; Levine, Michael
Rather than encoding information about protein sequence, certain lengths of noncoding DNA, called enhancers, interact with protein machinery such as transcription factors to precisely regulate gene expression. Enhancers have been studied extensively in the fruit fly Drosophila melanogaster, where they regulate the expression of developmental genes that establish the blueprint of the adult fly. It has been suggested that enhancer sequences possess a specific but unknown syntax with regards to the placement and strength of transcription factor binding sites. Moreover, studies in divergent fly species have shown that compensatory evolution allows for maintenance of enhancer functionality despite considerable variation in primary DNA sequence. Here, the possible role of enhancers as signal processing modules is studied as a way of explaining these two findings. We first demonstrate how this framework can be used to explain the fine-tuned spatiotemporal dynamics of gene expression. We then explore the evolutionary pressure on enhancer sequences and the resulting emergence of enhancers that are linked by compensatory mutations. This study provides a possible mechanism for the function of multiple enhancers linked to a single gene.
Use of the Fluidigm C1 platform for RNA sequencing of single mouse pancreatic islet cells.
Xin, Yurong; Kim, Jinrang; Ni, Min; Wei, Yi; Okamoto, Haruka; Lee, Joseph; Adler, Christina; Cavino, Katie; Murphy, Andrew J; Yancopoulos, George D; Lin, Hsin Chieh; Gromada, Jesper
2016-03-22
This study provides an assessment of the Fluidigm C1 platform for RNA sequencing of single mouse pancreatic islet cells. The system combines microfluidic technology and nanoliter-scale reactions. We sequenced 622 cells, allowing identification of 341 islet cells with high-quality gene expression profiles. The cells clustered into populations of α-cells (5%), β-cells (92%), δ-cells (1%), and pancreatic polypeptide cells (2%). We identified cell-type-specific transcription factors and pathways primarily involved in nutrient sensing and oxidation and cell signaling. Unexpectedly, 281 cells had to be removed from the analysis due to low viability, low sequencing quality, or contamination resulting in the detection of more than one islet hormone. Collectively, we provide a resource for identification of high-quality gene expression datasets to help expand insights into genes and pathways characterizing islet cell types. We reveal limitations in the C1 Fluidigm cell capture process resulting in contaminated cells with altered gene expression patterns. This calls for caution when interpreting single-cell transcriptomics data using the C1 Fluidigm system.
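The cell counts quoted in the abstract above can be cross-checked with a few lines of arithmetic. This is only an illustrative sketch: the per-type counts it derives are approximations back-calculated from the rounded percentages, not figures reported by the authors, and the names are hypothetical.

```python
SEQUENCED = 622   # cells sequenced (from the abstract)
RETAINED = 341    # cells with high-quality profiles (from the abstract)
REMOVED = 281     # cells excluded (from the abstract)

# Rounded percentages as quoted in the abstract.
REPORTED_PERCENT = {
    "alpha": 5,
    "beta": 92,
    "delta": 1,
    "pancreatic polypeptide": 2,
}


def retention_rate(retained, sequenced):
    """Fraction of sequenced cells that passed quality filtering."""
    return retained / sequenced


def approximate_counts(retained, percent_by_type):
    """Approximate per-type counts implied by rounded percentages."""
    return {k: round(retained * p / 100) for k, p in percent_by_type.items()}


if __name__ == "__main__":
    print(RETAINED + REMOVED == SEQUENCED)
    print(round(100 * retention_rate(RETAINED, SEQUENCED), 1))
    print(approximate_counts(RETAINED, REPORTED_PERCENT))
```

The check makes the scale of the capture problem explicit: a little under half of the sequenced cells were discarded.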
Gladka, Monika M; Molenaar, Bas; de Ruiter, Hesther; van der Elst, Stefan; Tsui, Hoyee; Versteeg, Danielle; Lacraz, Grègory P A; Huibers, Manon M H; van Oudenaarden, Alexander; van Rooij, Eva
2018-01-31
Background: Genome-wide transcriptome analysis has greatly advanced our understanding of the regulatory networks underlying basic cardiac biology and mechanisms driving disease. However, so far, the resolution of studying gene expression patterns in the adult heart has been limited to the level of extracts from whole tissues. The use of tissue homogenates inherently causes the loss of any information on cellular origin or cell type-specific changes in gene expression. Recent developments in RNA amplification strategies provide a unique opportunity to use small amounts of input RNA for genome-wide sequencing of single cells. Methods: Here, we present a method to obtain high quality RNA from digested cardiac tissue from adult mice for automated single-cell sequencing of both the healthy and diseased heart. Results: After optimization, we were able to perform single-cell sequencing on adult cardiac tissue under both homeostatic conditions and after ischemic injury. Clustering analysis based on differential gene expression unveiled known and novel markers of all main cardiac cell types. Based on differential gene expression we were also able to identify multiple subpopulations within a certain cell type. Furthermore, applying single-cell sequencing on both the healthy and the injured heart indicated the presence of disease-specific cell subpopulations. As such, we identified cytoskeleton associated protein 4 (Ckap4) as a novel marker for activated fibroblasts that positively correlates with known myofibroblast markers in both mouse and human cardiac tissue. Ckap4 inhibition in activated fibroblasts treated with TGFβ triggered a greater increase in the expression of genes related to activated fibroblasts compared to control, suggesting a role of Ckap4 in modulating fibroblast activation in the injured heart. 
Conclusions: Single-cell sequencing on both the healthy and diseased adult heart allows us to study transcriptomic differences between cardiac cells, as well as cell type-specific changes in gene expression during cardiac disease. This new approach provides a wealth of novel insights into molecular changes that underlie the cellular processes relevant for cardiac biology and pathophysiology. Applying this technology could lead to the discovery of new therapeutic targets relevant for heart disease.
A Novel Phenanthrene Dioxygenase from Nocardioides sp. Strain KP7: Expression in Escherichia coli
Saito, Atsushi; Iwabuchi, Tokuro; Harayama, Shigeaki
2000-01-01
Nocardioides sp. strain KP7 grows on phenanthrene but not on naphthalene. This organism degrades phenanthrene via 1-hydroxy-2-naphthoate, o-phthalate, and protocatechuate. The genes responsible for the degradation of phenanthrene to o-phthalate (phd) were found by Southern hybridization to reside on the chromosome. A 10.6-kb DNA fragment containing eight phd genes was cloned and sequenced. The phdA, phdB, phdC, and phdD genes, which encode the α and β subunits of the oxygenase component, a ferredoxin, and a ferredoxin reductase, respectively, of phenanthrene dioxygenase were identified. The gene cluster, phdAB, was located 8.3 kb downstream of the previously characterized phdK gene, which encodes 2-carboxybenzaldehyde dehydrogenase. The phdCD gene cluster was located 2.9 kb downstream of the phdB gene. PhdA and PhdB exhibited moderate (less than 60%) sequence identity to the α and β subunits of other ring-hydroxylating dioxygenases. The PhdC sequence showed features of a [3Fe-4S] or [4Fe-4S] type of ferredoxin, not of the [2Fe-2S] type of ferredoxin that has been found in most of the reported ring-hydroxylating dioxygenases. PhdD also showed moderate (less than 40%) sequence identity to known reductases. The phdABCD genes were expressed poorly in Escherichia coli, even when placed under the control of strong promoters. The introduction of a Shine-Dalgarno sequence upstream of each initiation codon of the phdABCD genes improved their expression in E. coli. E. coli cells carrying phdBCD or phdACD exhibited no phenanthrene-degrading activity, and those carrying phdABD or phdABC exhibited phenanthrene-degrading activity which was significantly less than that in cells carrying the phdABCD genes. It was thus concluded that all of the phdABCD genes are necessary for the efficient expression of phenanthrene-degrading activity. 
The genetic organization of the phd genes, the phylogenetically diverged positions of these genes, and an unusual type of ferredoxin component suggest phenanthrene dioxygenase in Nocardioides sp. strain KP7 to be a new class of aromatic ring-hydroxylating dioxygenases. PMID:10735855
Mallona, Izaskun; Egea-Cortines, Marcos; Weiss, Julia
2011-08-01
The cactus Opuntia ficus-indica is a constitutive Crassulacean acid metabolism (CAM) species. Current knowledge of CAM metabolism suggests that the enzyme phosphoenolpyruvate carboxylase kinase (PPCK) is circadian regulated at the transcriptional level, whereas phosphoenolpyruvate carboxylase (PEPC), malate dehydrogenase (MDH), NADP-malic enzyme (NADP-ME), and pyruvate phosphate dikinase (PPDK) are posttranslationally controlled. As little transcriptomic data are available from obligate CAM plants, we created an expressed sequence tag database derived from different organs and developmental stages. Sequences were assembled, compared with sequences in the National Center for Biotechnology Information nonredundant database for identification of putative orthologs, and mapped using Kyoto Encyclopedia of Genes and Genomes Orthology and Gene Ontology. We identified genes involved in circadian regulation and CAM metabolism for transcriptomic analysis in plants grown in long days. We identified stable reference genes for quantitative polymerase chain reaction and found that OfiSAND, like its counterpart in Arabidopsis (Arabidopsis thaliana), and OfiTUB are generally appropriate standards for use in the quantification of gene expression in O. ficus-indica. Three kinds of expression profiles were found: transcripts of OfiPPCK oscillated with a 24-h periodicity; transcripts of the light-active OfiNADP-ME and OfiPPDK genes adapted to 12-h cycles, while transcript accumulation patterns of OfiPEPC and OfiMDH were arrhythmic. Expression of the circadian clock gene OfiTOC1, similar to Arabidopsis, oscillated with a 24-h periodicity, peaking at night. Expression of OfiCCA1 and OfiPRR9, unlike in Arabidopsis, adapted best to a 12-h rhythm, suggesting that circadian clock gene interactions differ from those of Arabidopsis. 
Our results indicate that the evolution of CAM metabolism could be the result of modified circadian regulation at both the transcriptional and posttranscriptional levels.
Jiang, Jinjin; Wang, Yue; Zhu, Bao; Fang, Tingting; Fang, Yujie; Wang, Youping
2015-01-27
Brassica includes many successfully cultivated crop species of polyploid origin, either by ancestral genome triplication or by hybridization between two diploid progenitors, displaying complex repetitive sequences and transposons. The U's triangle, which consists of three diploids and three amphidiploids, is optimal for the analysis of complicated genomes after polyploidization. Next-generation sequencing enables the transcriptome profiling of polyploids on a global scale. We examined the gene expression patterns of three diploids (Brassica rapa, B. nigra, and B. oleracea) and three amphidiploids (B. napus, B. juncea, and B. carinata) via digital gene expression analysis. In total, the libraries generated between 5.7 and 6.1 million raw reads, and the clean tags of each library were mapped to 18547-21995 genes of B. rapa genome. The unambiguous tag-mapped genes in the libraries were compared. Moreover, the majority of differentially expressed genes (DEGs) were explored among diploids as well as between diploids and amphidiploids. Gene ontological analysis was performed to functionally categorize these DEGs into different classes. The Kyoto Encyclopedia of Genes and Genomes analysis was performed to assign these DEGs into approximately 120 pathways, among which the metabolic pathway, biosynthesis of secondary metabolites, and peroxisomal pathway were enriched. The non-additive genes in Brassica amphidiploids were analyzed, and the results indicated that orthologous genes in polyploids are frequently expressed in a non-additive pattern. Methyltransferase genes showed differential expression pattern in Brassica species. Our results provided an understanding of the transcriptome complexity of natural Brassica species. The gene expression changes in diploids and allopolyploids may help elucidate the morphological and physiological differences among Brassica species.
USDA-ARS's Scientific Manuscript database
We set out to better understand the genetic basis behind growth variation in hybrid striped bass (HSB) by determining whether gene expression changes could be detected between the largest and smallest HSB in a population using a global gene expression approach by RNA sequencing of liver. Fingerling...
Ekblom, Robert; Farrell, Lindsay L; Lank, David B; Burke, Terry
2012-01-01
By next generation transcriptome sequencing, it is possible to obtain data on both nucleotide sequence variation and gene expression. We have used this approach (RNA-Seq) to investigate the genetic basis for differences in plumage coloration and mating strategies in a non-model bird species, the ruff (Philomachus pugnax). Ruff males show enormous variation in the coloration of ornamental feathers, used for individual recognition. This polymorphism is linked to reproductive strategies, with dark males (Independents) defending territories on leks against other Independents, whereas white morphs (Satellites) co-occupy Independent's courts without agonistic interactions. Previous work found a strong genetic component for mating strategy, but the genes involved were not identified. We present feather transcriptome data of more than 6,000 de-novo sequenced ruff genes (although with limited coverage for many of them). None of the identified genes showed significant expression divergence between males, but many genetic markers showed nucleotide differentiation between different color morphs and mating strategies. These include several feather keratin genes, splicing factors, and the Xg blood-group gene. Many of the genes with significant genetic structure between mating strategies have not yet been annotated and their functions remain to be elucidated. We also conducted in-depth investigations of 28 pre-identified coloration candidate genes. Two of these (EDNRB and TYR) were specifically expressed in black- and rust-colored males, respectively. We have demonstrated the utility of next generation transcriptome sequencing for identifying and genotyping large number of genetic markers in a non-model species without previous genomic resources, and highlight the potential of this approach for addressing the genetic basis of ecologically important variation. PMID:23145334
Balancing gene expression without library construction via a reusable sRNA pool.
Ghodasara, Amar; Voigt, Christopher A
2017-07-27
Balancing protein expression is critical when optimizing genetic systems. Typically, this requires library construction to vary the genetic parts controlling each gene, which can be expensive and time-consuming. Here, we develop sRNAs corresponding to 15-nt 'target' sequences that can be inserted upstream of a gene. The targeted gene can be repressed from 1.6- to 87-fold by controlling sRNA expression using promoters of different strength. A pool is built where six sRNAs are placed under the control of 16 promoters that span a ∼10^3-fold range of strengths, yielding ∼10^7 combinations. This pool can simultaneously optimize up to six genes in a system. This requires building only a single system-specific construct by placing a target sequence upstream of each gene and transforming it with the pre-built sRNA pool. The resulting library is screened and the top clone is sequenced to determine the promoter controlling each sRNA, from which the fold-repression of the genes can be inferred. The system is then rebuilt by rationally selecting parts that implement the optimal expression of each gene. We demonstrate the versatility of this approach by using the same pool to optimize a metabolic pathway (β-carotene) and genetic circuit (XNOR logic gate). © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
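The pool size quoted in the abstract above follows from simple combinatorics, assuming each of the six sRNAs is independently paired with one of the 16 promoters. The sketch below is illustrative only; the function and constant names are hypothetical and not taken from the paper.

```python
N_SRNAS = 6       # sRNAs in the pool (from the abstract)
N_PROMOTERS = 16  # promoter choices per sRNA (from the abstract)


def pool_size(n_srnas=N_SRNAS, n_promoters=N_PROMOTERS):
    """Number of distinct promoter assignments across all sRNAs.

    Assumes every sRNA independently receives one promoter.
    """
    return n_promoters ** n_srnas


if __name__ == "__main__":
    print(pool_size())  # 16777216, on the order of 10^7
```

Under that assumption the exact count is 16^6 = 16,777,216, consistent with the abstract's order-of-magnitude figure.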
Voelker, Toni A.; Staswick, Paul; Chrispeels, Maarten J.
1986-01-01
Phytohemagglutinin (PHA), the seed lectin of the common bean, Phaseolus vulgaris, is encoded by two highly homologous, tandemly linked genes, dlec1 and dlec2, which are coordinately expressed at high levels in developing cotyledons. Their respective transcripts translate into closely related polypeptides, PHA-E and PHA-L, constituents of the tetrameric lectin which accumulates at high levels in developing seeds. In the bean cultivar Pinto UI111, PHA-E is not detectable, and PHA-L accumulates at very reduced levels. To investigate the cause of the Pinto phenotype, we cloned and sequenced the two PHA genes of Pinto, called Pdlec1 and Pdlec2, and determined the abundance of their respective mRNAs in developing cotyledons. Both genes are more than 90% homologous to the normal PHA genes found in other cultivars. Pdlec1 carries a 1-bp frameshift mutation close to the 5' end of its coding sequence. Only very truncated polypeptides could be made from its mRNA. The gene Pdlec2 encodes a polypeptide, which resembles PHA-L and its predicted amino acid sequence agrees with the available Pinto PHA amino acid sequence data. Analysis of the mRNA of developing cotyledons revealed that the Pdlec1 message is reduced 600-fold, and Pdlec2 mRNA is reduced 20-fold with respect to mRNA levels in normal cultivars. A comparison of the sequences which are upstream from the coding sequence shows that Pdlec2 has a 100-bp deletion compared to the other genes (dlec1, dlec2 and Pdlec1). This deletion which contains a large tandem repeat may be responsible for the low level of expression of Pdlec2. The very low expression of Pdlec1 is as yet unexplained. Images PMID:16453730
Wistow, Graeme; Bernstein, Steven L; Wyatt, M Keith; Fariss, Robert N; Behal, Amita; Touchman, Jeffrey W; Bouffard, Gerald; Smith, Don; Peterson, Katherine
2002-06-15
The retinal pigment epithelium (RPE) and choroid comprise a functional unit of the eye that is essential to normal retinal health and function. Here we describe expressed sequence tag (EST) analysis of human RPE/choroid as part of a project for ocular bioinformatics. A cDNA library (cs) was made from human RPE/choroid and sequenced. Data were analyzed and assembled using the program GRIST (GRouping and Identification of Sequence Tags). Complete sequencing, Northern and Western blots, RH mapping, peptide antibody synthesis and immunofluorescence (IF) have been used to examine expression patterns and genome location for selected transcripts and proteins. Ten thousand individual sequence reads yield over 6300 unique gene clusters of which almost half have no matches with named genes. One of the most abundant transcripts is from a gene (named "alpha") that maps to the BBS1 region of chromosome 11. A number of tissue preferred transcripts are common to both RPE/choroid and iris. These include oculoglycan/opticin, for which an alternative splice form is detected in RPE/choroid, and "oculospanin" (Ocsp), a novel tetraspanin that maps to chromosome 17q. Antiserum to Ocsp detects expression in RPE, iris, ciliary body, and retinal ganglion cells by IF. A newly identified gene for a zinc-finger protein (TIRC) maps to 19q13.4. Variant transcripts of several genes were also detected. Most notably, the predominant form of Bestrophin represented in cs contains a longer open reading frame as a result of splice junction skipping. The unamplified cs library gives a view of the transcriptional repertoire of the adult RPE/choroid. A large number of potentially novel genes and splice forms and candidates for genetic diseases are revealed. Clones from this collection are being included in a large, nonredundant set for cDNA microarray construction.
Intrinsic and extrinsic approaches for detecting genes in a bacterial genome.
Borodovsky, M; Rudd, K E; Koonin, E V
1994-01-01
The unannotated regions of the Escherichia coli genome DNA sequence from the EcoSeq6 database, totaling 1,278 'intergenic' sequences of the combined length of 359,279 basepairs, were analyzed using computer-assisted methods with the aim of identifying putative unknown genes. The proposed strategy for finding new genes includes two key elements: i) prediction of expressed open reading frames (ORFs) using the GeneMark method based on Markov chain models for coding and non-coding regions of Escherichia coli DNA, and ii) search for protein sequence similarities using programs based on the BLAST algorithm and programs for motif identification. A total of 354 putative expressed ORFs were predicted by GeneMark. Using the BLASTX and TBLASTN programs, it was shown that 208 ORFs located in the unannotated regions of the E. coli chromosome are significantly similar to other protein sequences. Identification of 182 ORFs as probable genes was supported by GeneMark and BLAST, comprising 51.4% of the GeneMark 'hits' and 87.5% of the BLAST 'hits'. 73 putative new genes, comprising 20.6% of the GeneMark predictions, belong to ancient conserved protein families that include both eubacterial and eukaryotic members. This value is close to the overall proportion of highly conserved sequences among eubacterial proteins, indicating that the majority of the putative expressed ORFs that are predicted by GeneMark, but have no significant BLAST hits, nevertheless are likely to be real genes. The majority of the putative genes identified by BLAST search have been described since the release of the EcoSeq6 database, but about 70 genes have not been detected so far. Among these new identifications are genes encoding proteins with a variety of predicted functions including dehydrogenases, kinases, several other metabolic enzymes, ATPases, rRNA methyltransferases, membrane proteins, and different types of regulatory proteins. Images PMID:7984428
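The percentages reported in the abstract above follow directly from the ORF counts it gives. The sketch below recomputes them; the variable names and the `pct` helper are illustrative and not part of the original study.

```python
# Sketch: recompute the overlap percentages from the ORF counts in the abstract.
genemark_orfs = 354    # putative expressed ORFs predicted by GeneMark
blast_orfs = 208       # ORFs with significant similarity by BLASTX/TBLASTN
supported_by_both = 182
conserved_family = 73  # putative genes in ancient conserved protein families


def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


print(pct(supported_by_both, genemark_orfs))  # 51.4
print(pct(supported_by_both, blast_orfs))     # 87.5
print(pct(conserved_family, genemark_orfs))   # 20.6
```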
Baker, Richard H.; Narechania, Apurva; Johns, Philip M.; Wilkinson, Gerald S.
2012-01-01
Gene duplication provides an essential source of novel genetic material to facilitate rapid morphological evolution. Traits involved in reproduction and sexual dimorphism represent some of the fastest evolving traits in nature, and gene duplication is intricately involved in the origin and evolution of these traits. Here, we review genomic research on stalk-eyed flies (Diopsidae) that has been used to examine the extent of gene duplication and its role in the genetic architecture of sexual dimorphism. Stalk-eyed flies are remarkable because of the elongation of the head into long stalks, with the eyes and antenna laterally displaced at the ends of these stalks. Many species are strongly sexually dimorphic for eyespan, and these flies have become a model system for studying sexual selection. Using both expressed sequence tag and next-generation sequencing, we have established an extensive database of gene expression in the developing eye-antennal imaginal disc, the adult head and testes. Duplicated genes exhibit narrower expression patterns than non-duplicated genes, and the testes, in particular, provide an abundant source of gene duplication. Within somatic tissue, duplicated genes are more likely to be differentially expressed between the sexes, suggesting gene duplication may provide a mechanism for resolving sexual conflict. PMID:22777023
NASA Astrophysics Data System (ADS)
Vislova, A.; Aylward, F.; Sosa, O.; DeLong, E.
2016-02-01
Previous work has revealed diel periodicity of gene expression in key metabolic pathways in both autotrophic and heterotrophic microbes in the surface ocean. In this study, we investigated patterns of diel periodicity of gene expression in depth profiles (25, 75, 125 and 250 meters). We postulated that microbial diel transcriptional signals would be increasingly dampened with depth, and that the timing of peak expression of specific transcripts would be shifted in time between depths, in accordance with depth-dependent diel light variability. Bacterioplankton were sampled from four depths every four hours at station ALOHA (22° 45' N, 158° W) over 2 days. RNA was extracted from cells preserved on filters, converted to cDNA, and sequenced on the Illumina platform. Surprisingly, harmonic regression analysis revealed an increasing proportion of genes with diel periodic expression patterns with increasing depth between 25 and 125 meters. At 250 meters, the proportion of genes exhibiting diel expression patterns decreased by an order of magnitude compared to the photic zone. Community composition, functional gene categories, and diel patterns of gene expression were significantly different between the photic zone and 250 meter samples. The signals driving diel periodic gene expression in microbes at 250 meters are under further investigation. These data are now beginning to provide a better understanding of the tempo and mode of microbial dynamics among specific taxa, throughout the ocean's interior.
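The abstract above does not specify its harmonic regression model, so the following is only a generic sketch of a single-harmonic (24 h) fit. It assumes evenly spaced samples covering whole periods (every 4 h over 2 days, as in the sampling design), in which case the least-squares coefficients reduce to simple sums. The function name and the synthetic data are hypothetical, and no significance test for periodicity is included.

```python
import math

PERIOD_H = 24.0  # assumed diel period in hours


def fit_diel_harmonic(times_h, values):
    """Fit y = mean + b*cos(wt) + c*sin(wt) with w = 2*pi/24 h.

    Valid as an exact least-squares fit only for evenly spaced samples
    that span an integer number of periods (assumption of this sketch).
    Returns (mean, amplitude, peak_time_h).
    """
    n = len(values)
    omega = 2.0 * math.pi / PERIOD_H
    mean = sum(values) / n
    b = 2.0 / n * sum(v * math.cos(omega * t) for t, v in zip(times_h, values))
    c = 2.0 / n * sum(v * math.sin(omega * t) for t, v in zip(times_h, values))
    amplitude = math.hypot(b, c)
    peak_time_h = (math.atan2(c, b) / omega) % PERIOD_H
    return mean, amplitude, peak_time_h


# Synthetic transcript abundance peaking 8 h after the start of sampling (hypothetical data).
times = [4 * k for k in range(12)]  # every 4 h over 48 h
values = [10 + 3 * math.cos(2 * math.pi * (t - 8) / 24) for t in times]
mean, amplitude, peak_hour = fit_diel_harmonic(times, values)
print(round(mean, 3), round(amplitude, 3), round(peak_hour, 3))
```

In a fit of this kind, a dampened diel signal at depth would appear as a smaller amplitude, and a shift in timing between depths as a change in the estimated peak time.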
Haney, Robert A.; Clarke, Thomas H.; Gadgil, Rujuta; Fitzpatrick, Ryan; Hayashi, Cheryl Y.; Ayoub, Nadia A.; Garb, Jessica E.
2016-01-01
Gene duplication and positive selection can be important determinants of the evolution of venom, a protein-rich secretion used in prey capture and defense. In a typical model of venom evolution, gene duplicates switch to venom gland expression and change function under the action of positive selection, which together with further duplication produces large gene families encoding diverse toxins. Although these processes have been demonstrated for individual toxin families, high-throughput multitissue sequencing of closely related venomous species can provide insights into evolutionary dynamics at the scale of the entire venom gland transcriptome. By assembling and analyzing multitissue transcriptomes from the Western black widow spider and two closely related species with distinct venom toxicity phenotypes, we do not find that gene duplication and duplicate retention is greater in gene families with venom gland biased expression in comparison with broadly expressed families. Positive selection has acted on some venom toxin families, but does not appear to be in excess for families with venom gland biased expression. Moreover, we find 309 distinct gene families that have single transcripts with venom gland biased expression, suggesting that the switching of genes to venom gland expression in numerous unrelated gene families has been a dominant mode of evolution. We also find ample variation in protein sequences of venom gland–specific transcripts, lineage-specific family sizes, and ortholog expression among species. This variation might contribute to the variable venom toxicity of these species. PMID:26733576
Kale, Shiv D; Ayubi, Tariq; Chung, Dawoon; Tubau-Juni, Nuria; Leber, Andrew; Dang, Ha X; Karyala, Saikumar; Hontecillas, Raquel; Lawrence, Christopher B; Cramer, Robert A; Bassaganya-Riera, Josep
2017-12-06
Incidences of invasive pulmonary aspergillosis, an infection caused predominantly by Aspergillus fumigatus, have increased due to the growing number of immunocompromised individuals. While A. fumigatus is reliant upon deficiencies in the host to facilitate invasive disease, the distinct mechanisms that govern the host-pathogen interaction remain enigmatic, particularly in the context of distinct immune modulating therapies. To gain insights into these mechanisms, RNA-Seq technology was utilized to sequence RNA derived from lungs of 2 clinically relevant, but immunologically distinct murine models of IPA on days 2 and 3 post inoculation when infection is established and active disease present. Our findings identify notable differences in host gene expression between the chemotherapeutic and steroid models at the interface of immunity and metabolism. RT-qPCR verified model specific and nonspecific expression of 23 immune-associated genes. Deep sequencing facilitated identification of highly expressed fungal genes. We utilized sequence similarity and gene expression to categorize the A. fumigatus putative in vivo secretome. RT-qPCR suggests model specific gene expression for nine putative fungal secreted proteins. Our analysis identifies contrasting responses by the host and fungus from day 2 to 3 between the two models. These differences may help tailor the identification, development, and deployment of host- and/or fungal-targeted therapeutics.
Costa, Caroline B; Monteiro, Karina M; Teichmann, Aline; da Silva, Edileuza D; Lorenzatto, Karina R; Cancela, Martín; Paes, Jéssica A; Benitz, André de N D; Castillo, Estela; Margis, Rogério; Zaha, Arnaldo; Ferreira, Henrique B
2015-08-01
The histone chaperone SET/TAF-Iβ is implicated in processes of chromatin remodelling and gene expression regulation. It has been associated with the control of developmental processes, but little is known about its function in helminth parasites. In Mesocestoides corti, a partial cDNA sequence related to SET/TAF-Iβ was isolated in a screening for genes differentially expressed in larvae (tetrathyridia) and adult worms. Here, the full-length coding sequence of the M. corti SET/TAF-Iβ gene was analysed and the encoded protein (McSET/TAF) was compared with orthologous sequences, showing that McSET/TAF can be regarded as a SET/TAF-Iβ family member, with a typical nucleosome-assembly protein (NAP) domain and an acidic tail. The expression patterns of the McSET/TAF gene and protein were investigated during the strobilation process by RT-qPCR, using a set of five reference genes, and by immunoblot and immunofluorescence, using monospecific polyclonal antibodies. A gradual increase in McSET/TAF transcripts and McSET/TAF protein was observed upon development induction by trypsin, demonstrating McSET/TAF differential expression during strobilation. These results provided the first evidence for the involvement of a protein from the NAP family of epigenetic effectors in the regulation of cestode development.
Takagi, M; Kobayashi, N; Sugimoto, M; Fujii, T; Watari, J; Yano, K
1987-01-01
The expression of a LEU gene from Candida maltosa (designated as C-LEU2) isolated previously (Kawamura et al. 1983) was shown to be regulated, when transferred into Saccharomyces cerevisiae, by leucine and threonine in the medium, as in the case of LEU2 gene of S. cerevisiae. The coding region together with the regulatory region was subcloned and the nucleotide sequence was determined. When the sequence of the coding region was compared with that of LEU2, the homology was 72% for base pairs and 76% for deduced amino acids. Comparison of the regulatory region of C-LEU2 with those of LEU1 and LEU2 suggested a few short consensus sequences which are involved in regulation of gene expression by leucine and threonine in the medium.
Chen, Wei-Hua; Wang, Xue-Xia; Lin, Wei; He, Xiao-Wei; Wu, Zhen-Qiang; Lin, Ying; Hu, Song-Nian; Wang, Xiao-Ning
2006-01-01
Background The cynomolgus monkey (Macaca fascicularis) is one of the most widely used surrogate animal models for an increasing number of human diseases and vaccines, especially immune-system-related ones. Towards a better understanding of the gene expression background upon its immunogenetics, we constructed a cDNA library from Epstein-Barr virus (EBV)-transformed B lymphocytes of a cynomolgus monkey and sequenced 10,000 randomly picked clones. Results After processing, 8,312 high-quality expressed sequence tags (ESTs) were generated and assembled into 3,728 unigenes. Annotations of these uniquely expressed transcripts demonstrated that out of the 2,524 open reading frame (ORF) positive unigenes (mitochondrial and ribosomal sequences were not included), 98.8% shared significant similarities (E-value less than 1e-10) with the NCBI nucleotide (nt) database, while only 67.7% (E-value less than 1e-5) did so with the NCBI non-redundant protein (nr) database. Further analysis revealed that 90.0% of the unigenes that shared no similarities to the nr database could be assigned to human chromosomes, in which 75 did not match significantly to any cynomolgus monkey and human ESTs. The mapping regions to known human genes on the human genome were described in detail. The protein family and domain analysis revealed that the first, second and fourth of the most abundantly expressed protein families were all assigned to immunoglobulin and major histocompatibility complex (MHC)-related proteins. The expression profiles of these genes were compared with that of homologous genes in human blood, lymph nodes and a RAMOS cell line, which demonstrated expression changes after transformation with EBV. The degree of sequence similarity of the MHC class I and II genes to the human reference sequences was evaluated. The results indicated that class I molecules showed weak amino acid identities (<90%), while class II showed slightly higher ones. 
Conclusion These results indicated that the genes expressed in the cynomolgus monkey could be used to identify novel protein-coding genes and revise those incomplete or incorrect annotations in the human genome by comparative methods, since the old world monkeys and humans share high similarities at the molecular level, especially within coding regions. The identification of multiple genes involved in the immune response, their sequence variations to the human homologues, and their responses to EBV infection could provide useful information to improve our understanding of the cynomolgus monkey immune system. PMID:16618371
Seth, Meetu; Shirayama, Masaki; Gu, Weifeng; Ishidate, Takao; Conte, Darryl; Mello, Craig C.
2014-01-01
SUMMARY Organisms can develop adaptive sequence-specific immunity by re-expressing pathogen-specific small RNAs that guide gene silencing. For example, the C. elegans PIWI-Argonaute/piRNA pathway recruits RNA-dependent RNA polymerase RdRP to foreign sequences to amplify a trans-generational small RNA-induced epigenetic silencing signal (termed RNAe). Here we provide evidence that in addition to an adaptive memory of silenced sequences, C. elegans can also develop an opposing adaptive memory of expressed/self mRNAs. We refer to this mechanism, which can prevent or reverse RNAe as RNA-induced epigenetic gene activation (RNAa). We show that CSR-1, which engages RdRP-amplified small RNAs complementary to germline-expressed mRNAs, is required for RNAa. We show that a transgene with RNAa activity also exhibits accumulation of cognate CSR-1 small RNAs. Our findings suggest that C. elegans adaptively acquires and maintains a trans-generational CSR-1 memory that recognizes and protects self mRNAs, allowing piRNAs to recognize foreign sequences innately, without need for prior exposure. PMID:24360782
Zhang, Tiantao; Coates, Brad S.; Ge, Xing; Bai, Shuxiong; He, Kanglai; Wang, Zhenying
2015-01-01
The Asian corn borer (ACB), Ostrinia furnacalis (Guenée), is a destructive pest insect of cultivated corn crops, for which antennal-expressed receptors are important to detect olfactory cues for mate attraction and oviposition. Few olfactory related genes were reported in ACB, so we sequenced and characterized the transcriptome of male and female O. furnacalis antennae. Non-normalized male and female O. furnacalis antennal cDNA libraries were sequenced on the Illumina HiSeq 2000 and assembled into a reference transcriptome. Functional gene annotations identified putative olfactory-related genes; 56 odorant receptors (ORs), 23 odorant binding proteins (OBPs), and 10 CSPs. RNA-seq estimates of gene expression respectively showed up- and down-regulation of 79 and 30 genes in female compared to male antennae, which included up-regulation of 8 ORs and 1 PBP gene in male antennae as well as 3 ORs in female antennae. Quantitative real-time RT-PCR analyses validated strong male antennal-biased expression of OfurOR3, 4, 6, 7, 8, 11, 12, 13 and 14 transcripts, whereas OfurOR17 and 18 were specifically expressed in female antennae. The sex-biased gene expression described here provides important insight into gene functionalization, and provides candidate genes putatively involved in environmental perception, host plant attraction, and mate recognition. PMID:26062030
NASA Technical Reports Server (NTRS)
Miracle, A. L.; Anderson, M. K.; Litman, R. T.; Walsh, C. J.; Luer, C. A.; Rothenberg, E. V.; Litman, G. W.
2001-01-01
Cartilaginous fish express canonical B and T cell recognition genes, but their lymphoid organs and lymphocyte development have been poorly defined. Here, the expression of Ig, TCR, recombination-activating gene (Rag)-1 and terminal deoxynucleotidyl transferase (TdT) genes has been used to identify roles of various lymphoid tissues throughout development in the cartilaginous fish, Raja eglanteria (clearnose skate). In embryogenesis, Ig and TCR genes are sharply up-regulated at 8 weeks of development. At this stage TCR and TdT expression is limited to the thymus; later, TCR gene expression appears in peripheral sites in hatchlings and adults, suggesting that the thymus is a source of T cells as in mammals. B cell gene expression indicates more complex roles for the spleen and two special organs of cartilaginous fish: the Leydig and epigonal (gonad-associated) organs. In the adult, the Leydig organ is the site of the highest IgM and IgX expression. However, the spleen is the first site of IgM expression, while IgX is expressed first in gonad, liver, Leydig and even thymus. Distinctive spatiotemporal patterns of Ig light chain gene expression also are seen. A subset of Ig genes is pre-rearranged in the germline of the cartilaginous fish, making expression possible without rearrangement. To assess whether this allows differential developmental regulation, IgM and IgX heavy chain cDNA sequences from specific tissues and developmental stages have been compared with known germline-joined genomic sequences. Both non-productively rearranged genes and germline-joined genes are transcribed in the embryo and hatchling, but not in the adult.
Bang, Kyeongrin; Hwang, Sejung; Lee, Jiae; Cho, Saeyoull
2015-01-01
To identify immune-related genes in the larvae of white-spotted flower chafers, next-generation sequencing was conducted with an Illumina HiSeq2000, resulting in 100 million cDNA reads with sequence information from over 10 billion base pairs (bp) and >50× transcriptome coverage. A subset of 77,336 contigs was created, and ∼35,532 sequences matched entries against the NCBI nonredundant database (cutoff, e < 10^-5). Statistical analysis was performed on the 35,532 contigs. For profiling of the immune response, samples were analyzed by aligning 42 base sequence tags to the de novo reference assembly, comparing levels in immunized larvae to control levels of expression. Of the differentially expressed genes, 3,440 transcripts were upregulated and 3,590 transcripts were downregulated. Many of these genes were confirmed as immune-related genes such as pattern recognition proteins, immune-related signal transduction proteins, antimicrobial peptides, and cellular response proteins, by comparison to published data. © The Author 2015. Published by Oxford University Press on behalf of the Entomological Society of America.
Single-Cell RNA-Sequencing: Assessment of Differential Expression Analysis Methods.
Dal Molin, Alessandra; Baruzzo, Giacomo; Di Camillo, Barbara
2017-01-01
The sequencing of the transcriptomes of single cells, or single-cell RNA-sequencing, has now become the dominant technology for the identification of novel cell types and for the study of stochastic gene expression. In recent years, various tools for analyzing single-cell RNA-sequencing data have been proposed, many of them with the purpose of performing differential expression analysis. In this work, we compare four different tools for single-cell RNA-sequencing differential expression, together with two popular methods originally developed for the analysis of bulk RNA-sequencing data, but largely applied to single-cell data. We discuss results obtained on two real and one synthetic dataset, along with considerations about the perspectives of single-cell differential expression analysis. In particular, we explore the methods' performance in four different scenarios, mimicking different unimodal or bimodal distributions of the data, as characteristic of single-cell transcriptomics. We observed marked differences between the selected methods in terms of precision and recall, the number of detected differentially expressed genes and the overall performance. Globally, the results obtained in our study suggest that it is difficult to identify a best performing tool and that efforts are needed to improve the methodologies for single-cell RNA-sequencing data analysis and gain better accuracy of results.
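Precision and recall, the metrics named in the abstract above, can be illustrated on a synthetic dataset where the truly differentially expressed genes are known. The sketch below is a generic definition of the two metrics, not the evaluation code of the study; the gene identifiers and the `precision_recall` helper are hypothetical.

```python
# Sketch: precision and recall of a differential expression (DE) call set
# against a known ground truth, as available for a synthetic dataset.
def precision_recall(called, truth):
    """Return (precision, recall) for a set of called DE genes.

    precision = true positives / genes called DE
    recall    = true positives / genes that are truly DE
    Empty denominators are reported as 0.0 (a convention chosen for this sketch).
    """
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical gene identifiers for illustration only.
truly_de = {"g1", "g2", "g3", "g4"}
called_de = {"g1", "g2", "g5"}
precision, recall = precision_recall(called_de, truly_de)
print(precision, recall)
```

Here two of the three called genes are truly differentially expressed, and two of the four true genes are recovered, so a tool can score well on one metric while doing poorly on the other.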
Serial analysis of gene expression (SAGE) in normal human trabecular meshwork.
Liu, Yutao; Munro, Drew; Layfield, David; Dellinger, Andrew; Walter, Jeffrey; Peterson, Katherine; Rickman, Catherine Bowes; Allingham, R Rand; Hauser, Michael A
2011-04-08
To identify the genes expressed in normal human trabecular meshwork tissue, a tissue critical to the pathogenesis of glaucoma. Total RNA was extracted from human trabecular meshwork (HTM) harvested from 3 different donors. Extracted RNA was used to synthesize individual SAGE (serial analysis of gene expression) libraries using the I-SAGE Long kit from Invitrogen. Libraries were analyzed using SAGE 2000 software to extract the 17 base pair sequence tags. The extracted sequence tags were mapped to the genome using SAGE Genie map. A total of 298,834 SAGE tags were identified from all HTM libraries (96,842, 88,126, and 113,866 tags, respectively). Collectively, there were 107,325 unique tags. There were 10,329 unique tags with a minimum of 2 counts from a single library. These tags were mapped to known unique Unigene clusters. Approximately 29% of the tags (orphan tags) did not map to a known Unigene cluster. Thirteen percent of the tags mapped to at least 2 Unigene clusters. Sequence tags from many glaucoma-related genes, including myocilin, optineurin, and WD repeat domain 36, were identified. This is the first time SAGE analysis has been used to characterize the gene expression profile in normal HTM. SAGE analysis provides an unbiased sampling of gene expression of the target tissue. These data will provide new and valuable information to improve understanding of the biology of human aqueous outflow.
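As a consistency check on the tag counts in the abstract above, the per-library totals can be summed and compared with the reported overall figure. The variable names below are illustrative; the numbers are taken from the abstract.

```python
# Sketch: check that the per-library SAGE tag counts add up to the reported total.
library_tags = [96_842, 88_126, 113_866]  # tags from the three HTM libraries
reported_total = 298_834
unique_tags = 107_325

total_tags = sum(library_tags)
print(total_tags == reported_total)  # True

# Share of all sequenced tags that are distinct sequences, as a percentage.
unique_share = round(100.0 * unique_tags / total_tags, 1)
print(unique_share)
```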
Meesapyodsuk, Dauenpen; Balsevich, John; Reed, Darwin W.; Covello, Patrick S.
2007-01-01
Saponaria vaccaria (Caryophyllaceae), a soapwort, known in western Canada as cowcockle, contains bioactive oleanane-type saponins similar to those found in soapbark tree (Quillaja saponaria; Rosaceae). To improve our understanding of the biosynthesis of these saponins, a combined polymerase chain reaction and expressed sequence tag approach was taken to identify the genes involved. A cDNA encoding a β-amyrin synthase (SvBS) was isolated by reverse transcription-polymerase chain reaction and characterized by expression in yeast (Saccharomyces cerevisiae). The SvBS gene is predominantly expressed in leaves. A S. vaccaria developing seed expressed sequence tag collection was developed and used for the isolation of a full-length cDNA bearing sequence similarity to ester-forming glycosyltransferases. The gene product of the cDNA, classified as UGT74M1, was expressed in Escherichia coli, purified, and identified as a triterpene carboxylic acid glucosyltransferase. UGT74M1 is expressed in roots and leaves and appears to be involved in monodesmoside biosynthesis in S. vaccaria. PMID:17172290
White, Michael A.; Kitano, Jun; Peichel, Catherine L.
2015-01-01
Sex chromosomes are subject to unique evolutionary forces that cause suppression of recombination, leading to sequence degeneration and the formation of heteromorphic chromosome pairs (i.e., XY or ZW). Although progress has been made in characterizing the outcomes of these evolutionary processes on vertebrate sex chromosomes, it is still unclear how recombination suppression and sequence divergence typically occur and how gene dosage imbalances are resolved in the heterogametic sex. The threespine stickleback fish (Gasterosteus aculeatus) is a powerful model system to explore vertebrate sex chromosome evolution, as it possesses an XY sex chromosome pair at relatively early stages of differentiation. Using a combination of whole-genome and transcriptome sequencing, we characterized sequence evolution and gene expression across the sex chromosomes. We uncovered two distinct evolutionary strata that correspond with known structural rearrangements on the Y chromosome. In the oldest stratum, only a handful of genes remain, and these genes are under strong purifying selection. By comparing sex-linked gene expression with expression of autosomal orthologs in an outgroup, we show that dosage compensation has not evolved in threespine sticklebacks through upregulation of the X chromosome in males. Instead, in the oldest stratum, the genes that still possess a Y chromosome allele are enriched for genes predicted to be dosage sensitive in mammals and yeast. Our results suggest that dosage imbalances may have been avoided at haploinsufficient genes by retaining function of the Y chromosome allele through strong purifying selection. PMID:25818858
Global analysis of gene expression profiles in developing physic nut (Jatropha curcas L.) seeds.
Jiang, Huawu; Wu, Pingzhi; Zhang, Sheng; Song, Chi; Chen, Yaping; Li, Meiru; Jia, Yongxia; Fang, Xiaohua; Chen, Fan; Wu, Guojiang
2012-01-01
Physic nut (Jatropha curcas L.) is an oilseed plant species with high potential utility as a biofuel. Furthermore, following recent sequencing of its genome and the availability of expressed sequence tag (EST) libraries, it is a valuable model plant for studying carbon assimilation in endosperms of oilseed plants. There have been several transcriptomic analyses of developing physic nut seeds using ESTs, but they have provided limited information on the accumulation of stored resources in the seeds. We applied next-generation Illumina sequencing technology to analyze global gene expression profiles of developing physic nut seeds 14, 19, 25, 29, 35, 41, and 45 days after pollination (DAP). The acquired profiles reveal the key genes, and their expression timeframes, involved in major metabolic processes including: carbon flow, starch metabolism, and synthesis of storage lipids and proteins in the developing seeds. The main period of storage reserves synthesis in the seeds appears to be 29-41 DAP, and the fatty acid composition of the developing seeds is consistent with relative expression levels of different isoforms of acyl-ACP thioesterase and fatty acid desaturase genes. Several transcription factor genes whose expression coincides with storage reserve deposition correspond to those known to regulate the process in Arabidopsis. The results will facilitate searches for genes that influence de novo lipid synthesis, accumulation and their regulatory networks in developing physic nut seeds, and other oil seeds. Thus, they will be helpful in attempts to modify these plants for efficient biofuel production.
Lv, Daoyuan; Song, Ping; Chen, Yungui; Gong, Wuming; Mo, Saijun
2005-04-08
Using the digital differential display program of the National Center for Biotechnology Information, we identified a contig of expressed sequence tags (ESTs) (Accession No. BM316936), which came from zebrafish ovary and testis libraries. The full-length cDNA of this transcript was cloned and further confirmed by polymerase chain reaction and sequencing. The full-length cDNA of the novel gene is 807 bp and encodes a novel protein of 187 amino acids, which shares no significant homology with any other known proteins. Characterization of genomic sequences of the gene revealed that it spans 6 kb on linkage group 3 and is composed of five exons and four introns. RT-PCR analysis showed that it was expressed in mature oocytes and at the one-cell stage, and persisted until 24 h of development. RT-PCR also revealed that it is expressed in gonad and kidney, with the highest level of expression in the testis. The expression sites of the novel gene in adult gonad were further localized by in situ hybridization to oogonia and growing oocytes in ovary and to spermatogonia and spermatocytes, but not spermatids, in testis. Based on its abundance in testis and in the germline stem cells (spermatogonia and oogonia), we hypothesize that it may function as a testicular development and gametogenesis related gene that plays important roles in spermatogenesis, and named it Zsrg (zebrafish testis spermatogenesis related gene).
2013-01-01
Background Birds have a ZZ male: ZW female sex chromosome system and while the Z-linked DMRT1 gene is necessary for testis development, the exact mechanism of sex determination in birds remains unsolved. This is partly due to the poor annotation of the W chromosome, which is speculated to carry a female determinant. Few genes have been mapped to the W and little is known of their expression. Results We used RNA-seq to produce a comprehensive profile of gene expression in chicken blastoderms and embryonic gonads prior to sexual differentiation. We found robust sexually dimorphic gene expression in both tissues pre-dating gonadogenesis, including sex-linked and autosomal genes. This supports the hypothesis that sexual differentiation at the molecular level is at least partly cell autonomous in birds. Different sets of genes were sexually dimorphic in the two tissues, indicating that molecular sexual differentiation is tissue specific. Further analyses allowed the assembly of full-length transcripts for 26 W chromosome genes, providing a view of the W transcriptome in embryonic tissues. This is the first extensive analysis of W-linked genes and their expression profiles in early avian embryos. Conclusion Sexual differentiation at the molecular level is established in chicken early in embryogenesis, before gonadal sex differentiation. We find that the W chromosome is more transcriptionally active than previously thought, expand the number of known genes to 26 and present complete coding sequences for these W genes. This includes two novel W-linked sequences and three small RNAs reassigned to the W from the Un_Random chromosome. PMID:23531366
Ito, T M; Polido, P B; Rampim, M C; Kaschuk, G; Souza, S G H
2014-09-26
Sweet orange (Citrus sinensis) plays an important role in the economy of more than 140 countries, but it is grown in areas with intermittent stressful soil and climatic conditions. The stress tolerance could be addressed by manipulating the ethylene response factor (ERF) transcription factors because they orchestrate plant responses to environmental stress. We performed an in silico study on the ERFs in the expressed sequence tag database of C. sinensis to identify potential genes that regulate plant responses to stress. We identified 108 putative genes encoding protein sequences of the AP2/ERF superfamily distributed within 10 groups of amino acid sequences. Ninety-one genes were assembled from the ERF family containing only one AP2/ERF domain, 13 genes were assembled from the AP2 family containing two AP2/ERF domains, and four other genes were assembled from the RAV family containing one AP2/ERF domain and a B3 domain. Some conserved domains of the ERF family genes were disrupted into a few segments by introns. This irregular distribution of genes in the AP2/ERF superfamily in different plant species could be a result of genomic losses or duplication events in a common ancestor. The in silico gene expression revealed that 67% of AP2/ERF genes are expressed in tissues with usual plant development, and 14% were expressed in stressed tissues. Because the AP2/ERF superfamily is expressed in an orchestrated way, it is possible that the manipulation of only one gene may result in changes in the whole plant function, which could result in more tolerant crops.
Griffin, Nicole G; Wang, Yu; Hulette, Christine M; Halvorsen, Matt; Cronin, Kenneth D; Walley, Nicole M; Haglund, Michael M; Radtke, Rodney A; Skene, J H Pate; Sinha, Saurabh R; Heinzen, Erin L
2016-03-01
Hippocampal sclerosis is the most common neuropathologic finding in cases of medically intractable mesial temporal lobe epilepsy. In this study, we analyzed the gene expression profiles of dentate granule cells of patients with mesial temporal lobe epilepsy with and without hippocampal sclerosis to show that next-generation sequencing methods can produce interpretable genomic data from RNA collected from small homogenous cell populations, and to shed light on the transcriptional changes associated with hippocampal sclerosis. RNA was extracted, and complementary DNA (cDNA) was prepared and amplified from dentate granule cells that had been harvested by laser capture microdissection from surgically resected hippocampi from patients with mesial temporal lobe epilepsy with and without hippocampal sclerosis. Sequencing libraries were sequenced, and the resulting sequencing reads were aligned to the reference genome. Differential expression analysis was used to ascertain expression differences between patients with and without hippocampal sclerosis. Greater than 90% of the RNA-Seq reads aligned to the reference. There was high concordance between transcriptional profiles obtained for duplicate samples. Principal component analysis revealed that the presence or absence of hippocampal sclerosis was the main determinant of the variance within the data. Among the genes up-regulated in the hippocampal sclerosis samples, there was significant enrichment for genes involved in oxidative phosphorylation. By analyzing the gene expression profiles of dentate granule cells from surgically resected hippocampal specimens from patients with mesial temporal lobe epilepsy with and without hippocampal sclerosis, we have demonstrated the utility of next-generation sequencing methods for producing biologically relevant results from small populations of homogeneous cells, and have provided insight on the transcriptional changes associated with this pathology.
The Spatial and Temporal Transcriptomic Landscapes of Ginseng, Panax ginseng C. A. Meyer.
Wang, Kangyu; Jiang, Shicui; Sun, Chunyu; Lin, Yanping; Yin, Rui; Wang, Yi; Zhang, Meiping
2015-12-11
Ginseng, including Asian ginseng (Panax ginseng C. A. Meyer) and American ginseng (P. quinquefolius L.), is one of the most important medicinal herbs in Asia and North America, but significantly understudied. This study sequenced and characterized the transcriptomes and expression profiles of genes expressed in 14 tissues and four different aged roots of Asian ginseng. A total of 265.2 million 100-bp clean reads were generated using the high-throughput sequencing platform HiSeq 2000, representing >8.3x of the 3.2-Gb ginseng genome. From the sequences, 248,993 unigenes were assembled for whole plant, 61,912-113,456 unigenes for each tissue and 54,444-65,412 unigenes for different year-old roots. We comprehensively analyzed the unigene sets and gene expression profiles. We found that the number of genes allocated to each functional category is stable across tissues or developmental stages, while the expression profiles of different genes of a gene family or involved in ginsenoside biosynthesis dramatically diversified spatially and temporally. These results provide an overall insight into the spatial and temporal transcriptome dynamics and landscapes of Asian ginseng, and comprehensive resources for advanced research and breeding of ginseng and related species.
Sun, Xiujun; Liu, Zhihong; Zhou, Liqing; Wu, Biao; Dong, Yinghui; Yang, Aiguo
2016-01-01
The Yesso scallop Patinopecten yessoensis displays polymorphism in shell colors, which is of great interest for the scallop industry. To identify genes involved in the shell coloration, in the present study, we investigate the transcriptome differences by Illumina digital gene expression (DGE) analysis in two extreme color phenotypes, Red and White. Illumina sequencing yields a total of 62,715,364 clean sequence reads, and more than 85% of reads are mapped to our previously sequenced transcriptome. There are 25 significantly differentially expressed genes between Red and White scallops. EPR (electron paramagnetic resonance) analysis has identified EPR spectra of pheomelanin and eumelanin in the red shells, but not in the white shells. Compared to the Red scallops, the White scallops have relatively higher mRNA expression of tyrosinase genes, but lower expression of other melanogenesis-associated genes. Meanwhile, the relatively lower tyrosinase protein and decreased tyrosinase activity in White scallops are suggested to be associated with the lack of melanin in the white shells. Our findings highlight the functional roles of melanogenesis-associated genes in the melanization process of scallop shells, and shed new light on the transcriptional and post-transcriptional mechanisms in the regulation of tyrosinase activity during the process of melanin synthesis. The present results will assist our molecular understanding of melanin synthesis underlying shell color polymorphism in scallops, as well as other bivalves, and also aid color-based breeding in shellfish aquaculture. PMID:27563719
Habermann, Bianca; Bebin, Anne-Gaelle; Herklotz, Stephan; Volkmer, Michael; Eckelt, Kay; Pehlke, Kerstin; Epperlein, Hans Henning; Schackert, Hans Konrad; Wiebe, Glenis; Tanaka, Elly M
2004-01-01
Background The ambystomatid salamander, Ambystoma mexicanum (axolotl), is an important model organism in evolutionary and regeneration research but relatively little sequence information has so far been available. This is a major limitation for molecular studies on caudate development, regeneration and evolution. To address this lack of sequence information we have generated an expressed sequence tag (EST) database for A. mexicanum. Results Two cDNA libraries, one made from stage 18-22 embryos and the other from day-6 regenerating tail blastemas, generated 17,352 sequences. From the sequenced ESTs, 6,377 contigs were assembled that probably represent 25% of the expressed genes in this organism. Sequence comparison revealed significant homology to entries in the NCBI non-redundant database. Further examination of this gene set revealed the presence of genes involved in important cell and developmental processes, including cell proliferation, cell differentiation and cell-cell communication. On the basis of these data, we have performed phylogenetic analysis of key cell-cycle regulators. Interestingly, while cell-cycle proteins such as the cyclin B family display expected evolutionary relationships, the cyclin-dependent kinase inhibitor 1 gene family shows an unusual evolutionary behavior among the amphibians. Conclusions Our analysis reveals the importance of a comprehensive sequence set from a representative of the Caudata and illustrates that the EST sequence database is a rich source of molecular, developmental and regeneration studies. To aid in data mining, the ESTs have been organized into an easily searchable database that is freely available online. PMID:15345051
Pasion, S G; Hines, J C; Aebersold, R; Ray, D S
1992-01-01
A type II DNA topoisomerase, topoIImt, was shown previously to be associated with the kinetoplast DNA of the trypanosomatid Crithidia fasciculata. The gene encoding this kinetoplast-associated topoisomerase has been cloned by immunological screening of a Crithidia genomic expression library with monoclonal antibodies raised against the purified enzyme. The gene CfaTOP2 is a single copy gene and is expressed as a 4.8-kb polyadenylated transcript. The nucleotide sequence of CfaTOP2 has been determined and encodes a predicted polypeptide of 1239 amino acids with a molecular mass of 138,445. The identification of the cloned gene is supported by immunoblot analysis of the beta-galactosidase-CfaTOP2 fusion protein expressed in Escherichia coli and by analysis of tryptic peptide sequences derived from purified topoIImt. CfaTOP2 shares significant homology with nuclear type II DNA topoisomerases of other eukaryotes suggesting that in Crithidia both nuclear and mitochondrial forms of topoisomerase II are encoded by the same gene.
Gene encoding plant asparagine synthetase
Coruzzi, Gloria M.; Tsai, Fong-Ying
1993-10-26
The identification and cloning of the gene(s) for plant asparagine synthetase (AS), an important enzyme involved in the formation of asparagine, a major nitrogen transport compound of higher plants, is described. Expression vectors constructed with the AS coding sequence may be utilized to produce plant AS; to engineer herbicide resistant plants, salt/drought tolerant plants or pathogen resistant plants; as a dominant selectable marker; or to select for novel herbicides or compounds useful as agents that synchronize plant cells in culture. The promoter for plant AS, which directs high levels of gene expression and is induced in an organ specific manner and by darkness, is also described. The AS promoter may be used to direct the expression of heterologous coding sequences in appropriate hosts.
Bussemaker, Harmen J.; Li, Hao; Siggia, Eric D.
2000-01-01
The availability of complete genome sequences and mRNA expression data for all genes creates new opportunities and challenges for identifying DNA sequence motifs that control gene expression. An algorithm, “MobyDick,” is presented that decomposes a set of DNA sequences into the most probable dictionary of motifs or words. This method is applicable to any set of DNA sequences: for example, all upstream regions in a genome or all genes expressed under certain conditions. Identification of words is based on a probabilistic segmentation model in which the significance of longer words is deduced from the frequency of shorter ones of various lengths, eliminating the need for a separate set of reference data to define probabilities. We have built a dictionary with 1,200 words for the 6,000 upstream regulatory regions in the yeast genome; the 500 most significant words (some with as few as 10 copies in all of the upstream regions) match 114 of 443 experimentally determined sites (a significance level of 18 standard deviations). When analyzing all of the genes up-regulated during sporulation as a group, we find many motifs in addition to the few previously identified by analyzing the expression subclusters individually. Applying MobyDick to the genes derepressed when the general repressor Tup1 is deleted, we find known as well as putative binding sites for its regulatory partners. PMID:10944202
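The probabilistic segmentation idea in the abstract above can be illustrated with a small sketch. It assumes a fixed dictionary of words with given probabilities and sums the probability of a sequence over every way of cutting it into dictionary words, using a forward dynamic program. The function name, the toy dictionary and its probabilities are hypothetical; this is not the authors' implementation, and the steps that fit word probabilities and add new words to the dictionary are not shown.

```python
def sequence_likelihood(seq, word_probs):
    """Sum, over all segmentations of seq into dictionary words,
    of the product of the word probabilities (forward recursion)."""
    # z[i] holds the total probability of all segmentations of seq[:i]
    z = [0.0] * (len(seq) + 1)
    z[0] = 1.0  # the empty prefix has exactly one (empty) segmentation
    for end in range(1, len(seq) + 1):
        for word, p in word_probs.items():
            start = end - len(word)
            # a word contributes if it matches the sequence ending at `end`
            if start >= 0 and seq[start:end] == word:
                z[end] += p * z[start]
    return z[-1]


# Hypothetical toy dictionary: four single letters plus one longer word.
toy_dictionary = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.20, "TATA": 0.05}

# "TATA" can be read as T+A+T+A or as the single word TATA.
likelihood = sequence_likelihood("TATA", toy_dictionary)
```

In this toy case the single-word reading contributes 0.05 and the letter-by-letter reading contributes 0.2 x 0.25 x 0.2 x 0.25 = 0.0025, so the longer word dominates the likelihood even though its dictionary probability is small.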
Macroarray expression analysis of barley susceptibility and nonhost resistance to Blumeria graminis.
Eichmann, Ruth; Biemelt, Sophia; Schäfer, Patrick; Scholz, Uwe; Jansen, Carin; Felk, Angelika; Schäfer, Wilhelm; Langen, Gregor; Sonnewald, Uwe; Kogel, Karl-Heinz; Hückelhoven, Ralph
2006-04-01
Different formae speciales of the grass powdery mildew fungus Blumeria graminis undergo basic-compatible or basic-incompatible (nonhost) interactions with barley. Background resistance in compatible interactions and nonhost resistance require common genetic and mechanistic elements of plant defense. To build resources for differential screening for genes that potentially distinguish a compatible from an incompatible interaction at the level of differential gene expression of the plant, we constructed eight dedicated cDNA libraries, established 13,000 expressed sequence tag (EST) sequences and designed DNA macroarrays. Using macroarrays based on cDNAs derived from epidermal peels of plants pretreated with the chemical resistance activating compound acibenzolar-S-methyl, we compared the expression of barley gene transcripts in the early host interaction with B. graminis f.sp. hordei or the nonhost pathogen B. graminis f.sp. tritici, respectively. We identified 102 spots corresponding to 94 genes on the macroarray that gave significant B. graminis-responsive signals at 12 and/or 24 h after inoculation. In independent expression analyses, we confirmed the macroarray results for 11 selected genes. Although the majority of genes showed a similar expression profile in compatible versus incompatible interactions, about 30 of the 94 genes were expressed at slightly different levels in compatible versus incompatible interactions.
Nikolaidis, Nikolas; Nei, Masatoshi
2004-03-01
We have identified the Hsp70 gene superfamily of the nematode Caenorhabditis briggsae and investigated the evolution of these genes in comparison with Hsp70 genes from C. elegans, Drosophila, and yeast. The Hsp70 genes are classified into three monophyletic groups according to their subcellular localization, namely, cytoplasm (CYT), endoplasmic reticulum (ER), and mitochondria (MT). The Hsp110 genes can be classified into the polyphyletic CYT group and the monophyletic ER group. The different Hsp70 and Hsp110 groups appeared to evolve following the model of divergent evolution. This model can also explain the evolution of the ER and MT genes. On the other hand, the CYT genes are divided into heat-inducible and constitutively expressed genes. The constitutively expressed genes have evolved more or less following the birth-and-death process, and the rates of gene birth and gene death are different between the two nematode species. By contrast, some heat-inducible genes show an intraspecies phylogenetic clustering. This suggests that they are subject to sequence homogenization resulting from gene conversion-like events. In addition, the heat-inducible genes show high levels of sequence conservation in both intra-species and inter-species comparisons, and in most cases, amino acid sequence similarity is higher than nucleotide sequence similarity. This indicates that purifying selection also plays an important role in maintaining high sequence similarity among paralogous Hsp70 genes. Therefore, we suggest that the CYT heat-inducible genes have been subjected to a combination of purifying selection, birth-and-death process, and gene conversion-like events.
Plant nitrogen regulatory P-PII polypeptides
Coruzzi, Gloria M.; Lam, Hon-Ming; Hsieh, Ming-Hsiun
2004-11-23
The present invention generally relates to plant nitrogen regulatory PII gene (hereinafter P-PII gene), a gene involved in regulating plant nitrogen metabolism. The invention provides P-PII nucleotide sequences, expression constructs comprising said nucleotide sequences, and host cells and plants having said constructs and, optionally expressing the P-PII gene from said constructs. The invention also provides substantially pure P-PII proteins. The P-PII nucleotide sequences and constructs of the invention may be used to engineer organisms to overexpress wild-type or mutant P-PII regulatory protein. Engineered plants that overexpress or underexpress P-PII regulatory protein may have increased nitrogen assimilation capacity. Engineered organisms may be used to produce P-PII proteins which, in turn, can be used for a variety of purposes including in vitro screening of herbicides. P-PII nucleotide sequences have additional uses as probes for isolating additional genomic clones having the promoters of P-PII gene. P-PII promoters are light- and/or sucrose-inducible and may be advantageously used in genetic engineering of plants.
Su, Huei-Jiun; Hu, Jer-Ming
2012-01-01
Background and Aims The holoparasitic flowering plant Balanophora displays extreme floral reduction and was previously found to have enormous rate acceleration in the nuclear 18S rDNA region. So far, it remains unclear whether non-ribosomal, protein-coding genes of Balanophora also evolve in an accelerated fashion and whether the genes with high substitution rates retain their functionality. To tackle these issues, six different genes were sequenced from two Balanophora species and their rate variation and expression patterns were examined. Methods Sequences including nuclear PI, euAP3, TM6, LFY and RPB2 and mitochondrial matR were determined from two Balanophora spp. and compared with selected hemiparasitic species of Santalales and autotrophic core eudicots. Gene expression was detected for the six protein-coding genes and the expression patterns of the three B-class genes (PI, AP3 and TM6) were further examined across different organs of B. laxiflora using RT-PCR. Key Results Balanophora mitochondrial matR is highly accelerated in both nonsynonymous (dN) and synonymous (dS) substitution rates, whereas the rate variation of nuclear genes LFY, PI, euAP3, TM6 and RPB2 is less dramatic. Significant dS increases were detected in Balanophora PI, TM6 and RPB2, and dN acceleration in euAP3. All of the protein-coding genes are expressed in inflorescences, indicative of their functionality. PI is restrictively expressed in tepals, synandria and floral bracts, whereas AP3 and TM6 are widely expressed in both male and female inflorescences. Conclusions Despite the observation that rates of sequence evolution are generally higher in Balanophora than in hemiparasitic species of Santalales and autotrophic core eudicots, the five nuclear protein-coding genes are functional and are evolving at a much slower rate than 18S rDNA. The mechanism or mechanisms responsible for rapid sequence evolution and concomitant rate acceleration for 18S rDNA and matR are currently not well understood and require further study in Balanophora and other holoparasites. PMID:23041381
Regulation of gene expression in the mammalian eye and its relevance to eye disease.
Scheetz, Todd E; Kim, Kwang-Youn A; Swiderski, Ruth E; Philp, Alisdair R; Braun, Terry A; Knudtson, Kevin L; Dorrance, Anne M; DiBona, Gerald F; Huang, Jian; Casavant, Thomas L; Sheffield, Val C; Stone, Edwin M
2006-09-26
We used expression quantitative trait locus mapping in the laboratory rat (Rattus norvegicus) to gain a broad perspective of gene regulation in the mammalian eye and to identify genetic variation relevant to human eye disease. Of >31,000 gene probes represented on an Affymetrix expression microarray, 18,976 exhibited sufficient signal for reliable analysis and at least 2-fold variation in expression among 120 F(2) rats generated from an SR/JrHsd x SHRSP intercross. Genome-wide linkage analysis with 399 genetic markers revealed significant linkage with at least one marker for 1,300 probes (alpha = 0.001; estimated empirical false discovery rate = 2%). Both contiguous and noncontiguous loci were found to be important in regulating mammalian eye gene expression. We investigated one locus of each type in greater detail and identified putative transcription-altering variations in both cases. We found an inserted cREL binding sequence in the 5' flanking sequence of the Abca4 gene associated with an increased expression level of that gene, and we found a mutation of the gene encoding thyroid hormone receptor beta2 associated with a decreased expression level of the gene encoding short-wavelength sensitive opsin (Opn1sw). In addition to these positional studies, we performed a pairwise analysis of gene expression to identify genes that are regulated in a coordinated manner and used this approach to validate two previously undescribed genes involved in the human disease Bardet-Biedl syndrome. These data and analytical approaches can be used to facilitate the discovery of additional genes and regulatory elements involved in human eye disease.
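As a rough plausibility check on the figures quoted above, the sketch below compares the number of probes expected to pass by chance with the number reported as significant. It assumes, purely for illustration, that alpha = 0.001 applies once per probe across the genome scan; the abstract does not say how the empirical estimate was obtained, so this is not the authors' procedure and the variable names are hypothetical.

```python
# Figures taken from the abstract above.
probes_tested = 18_976      # probes with sufficient signal and variation
alpha = 0.001               # significance threshold
probes_significant = 1_300  # probes with significant linkage

# Assumption: alpha is a per-probe, genome-wide false-positive rate.
expected_false_positives = probes_tested * alpha

# Naive false discovery rate: chance hits as a share of all reported hits.
naive_fdr = expected_false_positives / probes_significant
```

Under that assumption about 19 probes would be expected to pass by chance, which is roughly 1.5% of the 1,300 significant probes and of the same order as the reported empirical estimate of 2%.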
Virus-Based RNA Silencing Agents and Virus-Derived Expression Vectors as Gene Therapy Vehicles.
Venkataraman, Srividhya; Ahmad, Tauqeer; AbouHaidar, Mounir G; Hefferon, Kathleen L
2017-01-01
In consideration of recent developments in understanding the genomics and proteomics of viruses, the use of viral DNA/RNA sequences, as well as their gene expression schemes, has found new inroads towards the prognosis and therapy of diseases. Correspondingly, the scope of patenting in this area has expanded significantly. The current review addresses patented inventions concerning the use of virus sequences as gene silencing machineries and inventions concerning the generation and application of viral sequences as expression vectors. Furthermore, this review also discusses the employment of these patents for clinical, agricultural and biotechnological applications. Considering these objectives, the Delphion Research Intellectual Property Network database was searched using keywords such as "gene silencing", "engineered viruses" and "expression vectors", and descriptions of recent patents on these topics are discussed. Despite several recent advances in the use of viruses as disease therapy vehicles and biotechnological vectors, these developments have yet to be proven effective in practice, in clinical and field trials.
Zhang, Tiejun; Chao, Yuehui; Kang, Junmei; Ding, Wang; Yang, Qingchuan
2013-07-01
Genes that regulate flowering time play crucial roles in plant development and biomass formation. Based on the cDNA sequence of Medicago truncatula (accession no. AY690425), the LFY gene of alfalfa was cloned. Sequence similarity analysis revealed high homology with FLO/LFY family genes of other plants. When fused to the green fluorescent protein, MsLFY protein was localized in the nucleus of onion (Allium cepa L.) epidermal cells. RT-qPCR analysis of MsLFY expression patterns showed that expression of the MsLFY gene was low in roots, stems, leaves and pods, and highest in floral buds. The expression of MsLFY was induced by GA3 and long photoperiod. A plant expression vector was constructed and transformed into Arabidopsis by Agrobacterium-mediated transformation. PCR amplification from genomic DNA of the transgenic Arabidopsis indicated that the MsLFY gene had integrated into the Arabidopsis genome. Overexpression of MsLFY specifically caused early flowering under long day conditions compared with non-transgenic plants. These results indicate that MsLFY plays a role in promoting flowering.
Tissue-specific expression of the gene for a putative plasma membrane H(+)-ATPase in a seagrass.
Fukuhara, T; Pak, J Y; Ohwaki, Y; Tsujimura, H; Nitta, T
1996-01-01
A cDNA clone corresponding to the gene (ZHA1) for a putative plasma membrane H(+)-ATPase of a seagrass (Zostera marina L.) was isolated and sequenced. Comparison of the amino acid sequence predicted from the nucleotide sequence of ZHA1 with those encoded by known genes for plasma membrane H(+)-ATPases from other plants indicated that ZHA1 is most similar to the gene (PMA4) for a plasma membrane H(+)-ATPase in tobacco (84.4%). Northern hybridization indicated that ZHA1 was strongly expressed in mature leaves, which are exposed to seawater and have the ability to tolerate salinity; ZHA1 was weakly expressed in immature leaves, which are protected from seawater by tightly enveloping sheaths and are sensitive to salinity. In mature leaves, in situ hybridization revealed that ZHA1 was expressed specifically in epidermal cells, the plasma membranes of which were highly invaginated and morphologically similar to those of typical transfer cells. Therefore, the differentiation of the transfer cell-like structures, accompanied by the high-level expression of ZHA1, in the epidermal cells of mature leaves in particular may be important for the excretion of salt by these cells. PMID:8587992
Rennick, Linda J; Duprex, W Paul; Rima, Bert K
2007-10-01
Transcription from morbillivirus genomes commences at a single promoter in the 3' non-coding terminus, with the six genes being transcribed sequentially. The 3' and 5' untranslated regions (UTRs) of the genes (mRNA sense), together with the intergenic trinucleotide spacer, comprise the non-coding sequences (NCS) of the virus and contain the conserved gene end and gene start signals, respectively. Bicistronic minigenomes containing transcription units (TUs) encoding autofluorescent reporter proteins separated by measles virus (MV) NCS were used to give a direct estimation of gene expression in single, living cells by assessing the relative amounts of each fluorescent protein in each cell. Initially, five minigenomes containing each of the MV NCS were generated. Assays were developed to determine the amount of each fluorescent protein in cells at both cell population and single-cell levels. This revealed significant variations in gene expression between cells expressing the same NCS-containing minigenome. The minigenome containing the M/F NCS produced significantly lower amounts of fluorescent protein from the second TU (TU2), compared with the other minigenomes. A minigenome with a truncated F 5' UTR had increased expression from TU2. This UTR is 524 nt longer than the other MV 5' UTRs. Insertions into the 5' UTR of the enhanced green fluorescent protein gene in the minigenome containing the N/P NCS showed that specific sequences, rather than just the additional length of F 5' UTR, govern this decreased expression from TU2.
Prestes, Priscilla R; Marques, Francine Z; Lopez-Campos, Guillermo; Lewandowski, Paul; Delbridge, Lea M D; Charchar, Fadi J; Harrap, Stephen B
2018-05-18
Hypertrophic cardiomyopathy thickens heart muscles reducing functionality and increasing risk of cardiac disease and morbidity. Genetic factors are involved, but their contribution is poorly understood. We used the hypertrophic heart rat (HHR), a unique normotensive polygenic model of cardiac hypertrophy and heart failure to investigate the role of genes associated with monogenic human cardiomyopathy. We selected 42 genes involved in monogenic human cardiomyopathies to study: 1) DNA variants, by sequencing the whole-genome of 13-week old HHR and age-matched normal heart rat (NHR), its genetic control strain; 2) mRNA expression, by targeted RNA-sequencing in left ventricles of HHR and NHR at five ages (2-days old, 4-, 13-, 33- and 50-weeks old) compared to human idiopathic dilated data; and 3) microRNA expression, with rat microRNA microarrays in left ventricles of 2-days old HHR and age-matched NHR. We also investigated experimentally validated microRNA-mRNA interactions. Whole-genome sequencing revealed unique variants mostly located in non-coding regions of HHR and NHR. We found 29 genes differentially expressed in at least one age. Genes encoding desmoglein 2 (Dsg2) and transthyretin (Ttr) were significantly differentially expressed at all ages in the HHR, but only Ttr was also differentially expressed in human idiopathic cardiomyopathy. Lastly, only two microRNAs differentially expressed in the HHR were present in our comparison of validated microRNA-mRNA interactions. These two microRNAs interact with five of the genes studied. Our study shows that genes involved in monogenic forms of human cardiomyopathies may also influence polygenic forms of the disease.
Proglucagons in vertebrates: Expression and processing of multiple genes in a bony fish.
Busby, Ellen R; Mommsen, Thomas P
2016-09-01
In contrast to mammals, where a single proglucagon (PG) gene encodes three peptides: glucagon, glucagon-like peptide 1 and glucagon-like peptide 2 (GLP-1; GLP-2), many non-mammalian vertebrates carry multiple PG genes. Here, we investigate proglucagon mRNA sequences, their tissue expression and processing in a diploid bony fish. Copper rockfish (Sebastes caurinus) express two independent genes coding for distinct proglucagon sequences (PG I, PG II), with PG II lacking the GLP-2 sequence. These genes are differentially transcribed in the endocrine pancreas, the brain, and the gastrointestinal tract. Alternative splicing identified in rockfish is only one part of this complex regulation of the PG transcripts: the system has the potential to produce two glucagons, four GLP-1s and a single GLP-2, or any combination of these peptides. Mass spectrometric analysis of partially purified PG-derived peptides in endocrine pancreas confirms translation of both PG transcripts and differential processing of the resulting peptides. The complex differential regulation of the two PG genes and their continued presence in this extant teleostean fish strongly suggests unique and, as yet largely unidentified, roles for the peptide products encoded in each gene. Copyright © 2016 Elsevier Inc. All rights reserved.
Su, Chang; Wang, Chao; He, Lin; Yang, Chuanping; Wang, Yucheng
2014-01-01
DNA methylation plays a critical role in the regulation of gene expression. Most studies of DNA methylation have been performed in herbaceous plants, and little is known about the methylation patterns in tree genomes. In the present study, we generated a map of methylated cytosines at single base pair resolution for Betula platyphylla (white birch) by bisulfite sequencing combined with transcriptomics to analyze DNA methylation and its effects on gene expression. We obtained a detailed view of the function of DNA methylation sequence composition and distribution in the genome of B. platyphylla. There are 34,460 genes in the whole genome of birch, and 31,297 genes are methylated. Conservatively, we estimated that 14.29% of genomic cytosines are methylcytosines in birch. Among the methylation sites, the CHH context accounts for 48.86%, and is the largest proportion. Combined transcriptome and methylation analysis showed that the genes with moderate methylation levels had higher expression levels than genes with high and low methylation. In addition, methylated genes are highly enriched for the GO subcategories of binding activities, catalytic activities, cellular processes, response to stimulus and cell death, suggesting that methylation mediates these pathways in birch trees. PMID:25514241
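The CHH figure quoted above refers to one of the three standard cytosine sequence contexts (CG, CHG and CHH, where H is A, C or T). As a minimal sketch of how a cytosine is assigned to a context, assuming forward-strand sequence only and a hypothetical helper name `cytosine_context` that does not come from the study:

```python
from typing import Optional


def cytosine_context(seq: str, i: int) -> Optional[str]:
    """Classify the cytosine at index i as 'CG', 'CHG' or 'CHH'.

    H means A, C or T. Returns None when the downstream bases are
    missing or ambiguous (for example, at the end of the sequence).
    Hypothetical helper for illustration; forward strand only.
    """
    seq = seq.upper()
    if seq[i] != "C":
        raise ValueError("position %d is not a cytosine" % i)
    downstream = seq[i + 1:i + 3]
    # A G immediately after the C defines the CG context.
    if len(downstream) >= 1 and downstream[0] == "G":
        return "CG"
    # Otherwise two unambiguous downstream bases are needed.
    if len(downstream) == 2 and downstream[0] in "ACT":
        if downstream[1] == "G":
            return "CHG"
        if downstream[1] in "ACT":
            return "CHH"
    return None
```

Counting the contexts returned for every methylated position would give per-context proportions of the kind reported in the abstract; the real analysis also has to handle the reverse strand, which this sketch omits.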
Liu, Jin; Shao, Luyao; Trang, Phong; Yang, Zhu; Reeves, Michael; Sun, Xu; Vu, Gia-Phong; Wang, Yu; Li, Hongjian; Zheng, Congyi; Lu, Sangwei; Liu, Fenyong
2016-06-09
An external guide sequence (EGS) is an RNA sequence that can interact with a target mRNA to form a tertiary structure like a pre-tRNA and recruit intracellular ribonuclease P (RNase P), a tRNA processing enzyme, to degrade the target mRNA. Previously, we used an in vitro selection procedure to engineer new EGSs that are more robust in inducing human RNase P to cleave their targeted mRNAs. In this study, we constructed EGSs from a variant to target the mRNA encoding herpes simplex virus 1 (HSV-1) major transcription regulator ICP4, which is essential for the expression of viral early and late genes and viral growth. The EGS variant induced human RNase P cleavage of the ICP4 mRNA sequence 60 times better than the EGS generated from a natural pre-tRNA. A decrease of about 97% and 75% in the level of ICP4 gene expression and an inhibition of about 7,000- and 500-fold in viral growth were observed in HSV-infected cells expressing the variant and the pre-tRNA-derived EGS, respectively. This study shows that engineered EGSs can inhibit HSV-1 gene expression and viral growth. Furthermore, these results demonstrate the potential for engineered EGS RNAs to be developed and used as anti-HSV therapeutics.
Liu, Jin; Shao, Luyao; Trang, Phong; Yang, Zhu; Reeves, Michael; Sun, Xu; Vu, Gia-Phong; Wang, Yu; Li, Hongjian; Zheng, Congyi; Lu, Sangwei; Liu, Fenyong
2016-01-01
An external guide sequence (EGS) is an RNA sequence that can interact with a target mRNA to form a tertiary structure like a pre-tRNA and recruit intracellular ribonuclease P (RNase P), a tRNA processing enzyme, to degrade the target mRNA. Previously, we used an in vitro selection procedure to engineer new EGSs that are more robust in inducing human RNase P to cleave their targeted mRNAs. In this study, we constructed EGSs from a variant to target the mRNA encoding herpes simplex virus 1 (HSV-1) major transcription regulator ICP4, which is essential for the expression of viral early and late genes and viral growth. The EGS variant induced human RNase P cleavage of the ICP4 mRNA sequence 60 times better than the EGS generated from a natural pre-tRNA. A decrease of about 97% and 75% in the level of ICP4 gene expression and an inhibition of about 7,000- and 500-fold in viral growth were observed in HSV-infected cells expressing the variant and the pre-tRNA-derived EGS, respectively. This study shows that engineered EGSs can inhibit HSV-1 gene expression and viral growth. Furthermore, these results demonstrate the potential for engineered EGS RNAs to be developed and used as anti-HSV therapeutics. PMID:27279482
OSG-GEM: Gene Expression Matrix Construction Using the Open Science Grid
Poehlman, William L.; Rynge, Mats; Branton, Chris; Balamurugan, D.; Feltus, Frank A.
2016-01-01
High-throughput DNA sequencing technology has revolutionized the study of gene expression while introducing significant computational challenges for biologists. These computational challenges include access to sufficient computer hardware and functional data processing workflows. Both these challenges are addressed with our scalable, open-source Pegasus workflow for processing high-throughput DNA sequence datasets into a gene expression matrix (GEM) using computational resources available to U.S.-based researchers on the Open Science Grid (OSG). We describe the usage of the workflow (OSG-GEM), discuss workflow design, inspect performance data, and assess accuracy in mapping paired-end sequencing reads to a reference genome. A target OSG-GEM user is proficient with the Linux command line and possesses basic bioinformatics experience. The user may run this workflow directly on the OSG or adapt it to novel computing environments. PMID:27499617
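To make the term concrete, a gene expression matrix (GEM) is a table with one row per gene and one column per sample. The following is a minimal sketch of assembling one from per-sample expression values; it is not the OSG-GEM Pegasus workflow itself, and the function name `build_gem` and the input layout are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def build_gem(
    sample_values: Dict[str, Dict[str, float]]
) -> Tuple[List[str], List[str], List[List[float]]]:
    """Assemble a gene-by-sample matrix from per-sample expression values.

    sample_values maps sample name -> {gene id -> expression value}.
    Genes missing from a sample are filled with 0.0 (an assumption;
    a real pipeline might use NaN instead).
    """
    samples = sorted(sample_values)
    # Union of all gene ids seen in any sample, in a stable order.
    genes = sorted({g for values in sample_values.values() for g in values})
    matrix = [
        [sample_values[s].get(g, 0.0) for s in samples]
        for g in genes
    ]
    return genes, samples, matrix
```

For example, `build_gem({"s1": {"geneA": 5.0}, "s2": {"geneA": 1.0, "geneB": 2.0}})` yields rows for `geneA` and `geneB` and columns for `s1` and `s2`.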
Laimins, L; Holmgren-König, M; Khoury, G
1986-01-01
The enhancer elements from either simian virus 40 or murine sarcoma virus activate the expression of a transfected rat insulin 1 (rI1) gene when placed within 2.0 kilobases or less of the rI1 gene cap site. Inclusion of 4.0 kilobases of upstream rI1 sequence, however, results in a substantial reduction in the enhancer-dependent insulin gene expression. These observations suggested that a negative transcriptional regulatory element was present between 2.0 and 4.0 kilobases of the rI1 sequence. To test this notion, we employed a heterologous enhancer-dependent transcription assay in which the simian virus 40 72-base-pair repeat is linked to a human beta-globin gene. Addition of the upstream rI1 element to this system decreased the level of enhancer-dependent beta-globin transcription by a factor of 5 to 15. This rI1 "silencer" element functions in a manner relatively independent of position and orientation and requires a cis-dependent relationship to the transcription unit on which it acts. Thus, the silencer sequence seems to have a number of the characteristics of enhancer elements, and we suggest that it may function by the converse of the enhancer mechanism. The rI1 silencer sequence was identified as a member of a long interspersed rat repetitive family. Thus, a potential role for certain repetitive sequences interspersed throughout the eukaryotic genome may be to regulate gene expression by retaining transcriptional activity within defined domains. PMID:3010279
2014-01-01
Background Variation in seed oil composition and content among soybean varieties is largely attributed to differences in transcript sequences and/or transcript accumulation of oil production related genes in seeds. Discovery and analysis of sequence and expression variations in these genes will accelerate soybean oil quality improvement. Results In an effort to identify these variations, we sequenced the transcriptomes of soybean seeds from nine lines varying in oil composition and/or total oil content. Our results showed that 69,338 distinct transcripts from 32,885 annotated genes were expressed in seeds. A total of 8,037 transcript expression polymorphisms and 50,485 transcript sequence polymorphisms (48,792 SNPs and 1,693 small Indels) were identified among the lines. Effects of the transcript polymorphisms on their encoded protein sequences and functions were predicted. The studies also provided independent evidence that the lack of FAD2-1A gene activity and a non-synonymous SNP in the coding sequence of FAB2C caused elevated oleic acid and stearic acid levels in soybean lines M23 and FAM94-41, respectively. Conclusions As a proof-of-concept, we developed an integrated RNA-seq and bioinformatics approach to identify and functionally annotate transcript polymorphisms, and demonstrated its high effectiveness for discovery of genetic and transcript variations that result in altered oil quality traits. The collection of transcript polymorphisms coupled with their predicted functional effects will be a valuable asset for further discovery of genes, gene variants, and functional markers to improve soybean oil quality. PMID:24755115
Characterization of Cer-1 cis-regulatory region during early Xenopus development.
Silva, Ana Cristina; Filipe, Mário; Steinbeisser, Herbert; Belo, José António
2011-05-01
Cerberus-related molecules are well-known Wnt, Nodal, and BMP inhibitors that have been implicated in different processes including anterior–posterior patterning and left–right asymmetry. In both mouse and frog, two Cerberus-related genes have been isolated, mCer-1 and mCer-2, and Xcer and Xcoco, respectively. Until now, little has been known about the mechanisms involved in their transcriptional regulation. Here, we report a heterologous analysis of the mouse Cerberus-1 gene upstream regulatory regions responsible for its expression in the visceral endodermal cells. Our analysis showed that the consensus sequences for TATA, CAAT, or GC boxes were absent but a TGTGG sequence was present at position -172 to -168 bp, relative to the ATG. Using a series of deletion constructs and transient expression in Xenopus embryos, we found that a fragment of 1.4 kb of Cer-1 promoter sequence could reproduce the endogenous expression pattern of Xenopus cerberus. A 0.7-kb mCer-1 upstream region was able to drive reporter expression to the involuting mesendodermal cells, while further deletions abolished reporter gene expression. Our results suggest that although no sequence similarity was found between mouse and Xenopus cerberus cis-regulatory regions, the signaling cascades regulating cerberus expression during gastrulation are conserved.
Li, Wenli; Turner, Amy; Aggarwal, Praful; Matter, Andrea; Storvick, Erin; Arnett, Donna K; Broeckel, Ulrich
2015-12-16
Whole transcriptome sequencing (RNA-seq) represents a powerful approach for whole transcriptome gene expression analysis. However, RNA-seq carries a few limitations, e.g., the requirement of a significant amount of input RNA and complications caused by non-specific mapping of short reads. The Ion AmpliSeq Transcriptome Human Gene Expression Kit (AmpliSeq) was recently introduced by Life Technologies as a whole-transcriptome, targeted gene quantification kit to overcome these limitations of RNA-seq. To assess the performance of this new methodology, we performed a comprehensive comparison of AmpliSeq with RNA-seq using two well-established next-generation sequencing platforms (Illumina HiSeq and Ion Torrent Proton). We analyzed standard reference RNA samples and RNA samples obtained from human induced pluripotent stem cell derived cardiomyocytes (hiPSC-CMs). Using published data from two standard RNA reference samples, we observed a strong concordance of log2 fold change for all genes when comparing AmpliSeq to Illumina HiSeq (Pearson's r = 0.92) and Ion Torrent Proton (Pearson's r = 0.92). We used ROC, the Matthews correlation coefficient and RMSD to determine the overall performance characteristics. All three statistical methods demonstrate that AmpliSeq is a highly accurate method for differential gene expression analysis. Additionally, for genes with high abundance, AmpliSeq outperforms the two RNA-seq methods. When analyzing four closely related hiPSC-CM lines, we show that both AmpliSeq and RNA-seq capture similar global gene expression patterns consistent with known sources of variation. Our study indicates that AmpliSeq excels in the limiting areas of RNA-seq for gene expression quantification analysis. Thus, AmpliSeq stands as a very sensitive and cost-effective approach for very large scale gene expression analysis and mRNA marker screening with high accuracy.
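The concordance statistics named in this abstract follow textbook definitions. The sketch below shows those definitions in plain Python; it is not the authors' analysis code, and the function names and the pseudocount of 1.0 are assumptions chosen for illustration.

```python
import math
from typing import Sequence


def log2_fold_change(a: float, b: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of two expression values; the pseudocount avoids log(0)."""
    return math.log2((a + pseudocount) / (b + pseudocount))


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def matthews_cc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
```

In a platform comparison, `pearson_r` would be applied to the per-gene log2 fold changes measured on each platform, and `matthews_cc` to counts of genes called differentially expressed or not against a reference call set.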
Ma, Yu-Hua; Ye, Gui-Sheng
2018-06-11
In this study, we screened differentially expressed genes in a multidrug-resistant isolate strain of Clostridium perfringens by RNA sequencing. We also separated and identified differentially expressed proteins (DEPs) in the isolate strain by two-dimensional electrophoresis (2-DE) and mass spectrometry (MS). The RNA sequencing results showed that, compared with the control strain, 1128 genes were differentially expressed in the isolate strain, including 227 up-regulated genes and 901 down-regulated genes. Bioinformatics analysis identified the following genes and gene categories that are potentially involved in multidrug resistance (MDR) in the isolate strain: drug transport, drug response, hydrolase activity, transmembrane transporter, transferase activity, amidase transmembrane transporter, efflux transmembrane transporter, bacterial chemotaxis, ABC transporter, and others. The 2-DE results showed that 70 proteins were differentially expressed in the isolate strain, 45 of which were up-regulated and 25 down-regulated. Twenty-seven DEPs were identified by MS, and these fell into the following protein categories: ribosome, antimicrobial peptide resistance, and ABC transporter, all of which may be involved in MDR in the isolate strain of C. perfringens. The results provide reference data for further investigations of the molecular mechanisms of drug resistance in C. perfringens.
New in-depth rainbow trout transcriptome reference and digital atlas of gene expression
USDA-ARS's Scientific Manuscript database
Sequencing the rainbow trout genome is underway and a transcriptome reference sequence is required to help in genome assembly and gene discovery. Previously, we reported a transcriptome reference sequence using a 19X coverage of 454-pyrosequencing data. Although this work added a great wealth of ann...
Cheng, Yunqing; Liu, Jianfeng; Zhang, Huidi; Wang, Ju; Zhao, Yixin; Geng, Wanting
2015-01-01
A high ratio of blank fruit in hazelnut (Corylus heterophylla Fisch) is a very common phenomenon that causes serious yield losses in northeast China. The development of blank fruit in the Corylus genus is known to be associated with embryo abortion. However, little is known about the molecular mechanisms responsible for embryo abortion during the nut development stage. Genomic information for C. heterophylla Fisch is not available; therefore, data related to transcriptome and gene expression profiling of developing and abortive ovules are needed. In this study, de novo transcriptome sequencing and RNA-seq analysis were conducted using short-read sequencing technology (Illumina HiSeq 2000). The results of the transcriptome assembly analysis revealed genetic information that was associated with the fruit development stage. Two digital gene expression libraries were constructed, one for a full (normally developing) ovule and one for an empty (abortive) ovule. Transcriptome sequencing and assembly results revealed 55,353 unigenes, including 18,751 clusters and 36,602 singletons. These results were annotated using the public databases NR, NT, Swiss-Prot, KEGG, COG, and GO. Using digital gene expression profiling, gene expression differences in developing and abortive ovules were identified. A total of 1,637 and 715 unigenes were significantly upregulated and downregulated, respectively, in abortive ovules, compared with developing ovules. Quantitative real-time polymerase chain reaction analysis was used in order to verify the differential expression of some genes. The transcriptome and digital gene expression profiling data of normally developing and abortive ovules in hazelnut provide exhaustive information that will improve our understanding of the molecular mechanisms of abortive ovule formation in hazelnut.
Li, Shu-Fen; Zhang, Guo-Jun; Zhang, Xue-Jin; Yuan, Jin-Hong; Deng, Chuan-Liang; Gao, Wu-Jun
2017-08-22
Garden asparagus (Asparagus officinalis) is a highly valuable vegetable crop of commercial and nutritional interest. It is also commonly used to investigate the mechanisms of sex determination and differentiation in plants. However, the sex expression mechanisms in asparagus remain poorly understood. De novo transcriptome sequencing via Illumina paired-end sequencing revealed more than 26 billion bases of high-quality sequence data from male and female asparagus flower buds. A total of 72,626 unigenes with an average length of 979 bp were assembled. In comparative transcriptome analysis, 4876 differentially expressed genes (DEGs) were identified in the possible sex-determining stage of female and male/supermale flower buds. Of these DEGs, 433, including 285 male/supermale-biased and 149 female-biased genes, were annotated as flower related. Of the male/supermale-biased flower-related genes, 102 were probably involved in anther development. In addition, 43 DEGs implicated in hormone response and biosynthesis putatively associated with sex expression and reproduction were discovered. Moreover, 128 transcription factor (TF)-related genes belonging to various families were found to be differentially expressed, and this finding implied the essential roles of TF in sex determination or differentiation in asparagus. Correlation analysis indicated that miRNA-DEG pairs were also implicated in asparagus sexual development. Our study identified a large number of DEGs involved in the sex expression and reproduction of asparagus, including known genes participating in plant reproduction, plant hormone signaling, TF encoding, and genes with unclear functions. We also found that miRNAs might be involved in the sex differentiation process. Our study could provide a valuable basis for further investigations on the regulatory networks of sex determination and differentiation in asparagus and facilitate further genetic and genomic studies on this dioecious species.
Cloning and Expression of the Benzoate Dioxygenase Genes from Rhodococcus sp. Strain 19070
Haddad, Sandra; Eby, D. Matthew; Neidle, Ellen L.
2001-01-01
The bopXYZ genes from the gram-positive bacterium Rhodococcus sp. strain 19070 encode a broad-substrate-specific benzoate dioxygenase. Expression of the BopXY terminal oxygenase enabled Escherichia coli to convert benzoate or anthranilate (2-aminobenzoate) to a nonaromatic cis-diol or catechol, respectively. This expression system also rapidly transformed m-toluate (3-methylbenzoate) to an unidentified product. In contrast, 2-chlorobenzoate was not a good substrate. The BopXYZ dioxygenase was homologous to the chromosomally encoded benzoate dioxygenase (BenABC) and the plasmid-encoded toluate dioxygenase (XylXYZ) of gram-negative acinetobacters and pseudomonads. Pulsed-field gel electrophoresis failed to identify any plasmid in Rhodococcus sp. strain 19070. Catechol 1,2- and 2,3-dioxygenase activity indicated that strain 19070 possesses both meta- and ortho-cleavage degradative pathways, which are associated in pseudomonads with the xyl and ben genes, respectively. Open reading frames downstream of bopXYZ, designated bopL and bopK, resembled genes encoding cis-diol dehydrogenases and benzoate transporters, respectively. The bop genes were in the same order as the chromosomal ben genes of P. putida PRS2000. The deduced sequences of BopXY were 50 to 60% identical to the corresponding proteins of benzoate and toluate dioxygenases. The reductase components of these latter dioxygenases, BenC and XylZ, are 201 residues shorter than the deduced BopZ sequence. As predicted from the sequence, expression of BopZ in E. coli yielded an approximately 60-kDa protein whose presence corresponded to increased cytochrome c reductase activity. While the N-terminal region of BopZ was approximately 50% identical in sequence to the entire BenC or XylZ reductases, the C terminus was unlike other known protein sequences. PMID:11375157
Sun, Haiyue; Liu, Yushan; Gai, Yuzhuo; Geng, Jinman; Chen, Li; Liu, Hongdi; Kang, Limin; Tian, Youwen; Li, Yadong
2015-09-02
Cranberries (Vaccinium macrocarpon Ait.), renowned for their excellent health benefits, are an important berry crop. Here, we performed transcriptome sequencing of one cranberry cultivar, from fruits at two different developmental stages, on the Illumina HiSeq 2000 platform. Our main goals were to identify putative genes for major metabolic pathways of bioactive compounds and compare the expression patterns between white fruit (W) and red fruit (R) in cranberry. In this study, two cDNA libraries of W and R were constructed. Approximately 119 million raw sequencing reads were generated and assembled de novo, yielding 57,331 high quality unigenes with an average length of 739 bp. Using BLASTx, 38,460 unigenes were identified as putative homologs of annotated sequences in public protein databases, including NCBI NR, NT, Swiss-Prot, KEGG, COG and GO. Of these, 21,898 unigenes mapped to 128 KEGG pathways, with the metabolic pathways, secondary metabolites, glycerophospholipid metabolism, ether lipid metabolism, starch and sucrose metabolism, purine metabolism, and pyrimidine metabolism being well represented. Among them, many candidate genes were involved in flavonoid biosynthesis, transport and regulation. Furthermore, digital gene expression (DEG) analysis identified 3,257 unigenes that were differentially expressed between the two fruit developmental stages. In addition, 14,473 simple sequence repeats (SSRs) were detected. Our results present comprehensive gene expression information about the cranberry fruit transcriptome that could facilitate our understanding of the molecular mechanisms of fruit development in cranberries. Although it will be necessary to validate the functions carried out by these genes, these results could be used to improve the quality of breeding programs for the cranberry and related species.
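Simple sequence repeats such as those counted above are short motifs repeated in tandem. As a rough sketch of how tandem repeats can be located in a unigene sequence, assuming a regular-expression approach with illustrative thresholds (units of 1 to 6 nt, at least 5 copies) that are not taken from the study, and a hypothetical function name `find_ssrs`:

```python
import re
from typing import List, Tuple


def find_ssrs(
    seq: str, min_repeats: int = 5, max_unit: int = 6
) -> List[Tuple[int, str, int]]:
    """Return (start, unit, copies) for each tandem repeat in seq.

    The lazy quantifier makes the regex try the shortest repeat unit
    first, so 'ACACAC...' is reported with unit 'AC', not 'ACAC'.
    Thresholds are illustrative assumptions, not the study's settings.
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits
```

Dedicated SSR tools usually apply a different minimum copy number to each unit length; a single threshold is used here only to keep the sketch short.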
Ceccarelli, A; Zhukovskaya, N; Kawata, T; Bozzaro, S; Williams, J
2000-12-01
The ecmB gene of Dictyostelium is expressed at culmination both in the prestalk cells that enter the stalk tube and in ancillary stalk cell structures such as the basal disc. Stalk tube-specific expression is regulated by sequence elements within the cap-site proximal part of the promoter, the stalk tube (ST) promoter region. Dd-STATa, a member of the STAT transcription factor family, binds to elements present in the ST promoter region and represses transcription prior to entry into the stalk tube. We have characterised an activatory DNA sequence element that lies distal to the repressor elements and that is both necessary and sufficient for expression within the stalk tube. We have mapped this activator to a 28 nucleotide region (the 28-mer) within which we have identified a GA-containing sequence element that is required for efficient gene transcription. The Dd-STATa protein binds to the 28-mer in an in vitro binding assay, and binding is dependent upon the GA-containing sequence. However, the ecmB gene is expressed in a Dd-STATa null mutant; therefore, Dd-STATa cannot be responsible for activating the 28-mer in vivo. Instead, we identified a distinct 28-mer binding activity in nuclear extracts from the Dd-STATa null mutant, this GA-binding activity being largely masked in wild type extracts by the high affinity binding of the Dd-STATa protein. We suggest that, in addition to the long range repression exerted by binding to the two known repressor sites, Dd-STATa inhibits transcription by direct competition with this putative activator for binding to the GA sequence.
Wiersma, Andrew T; Gaines, Todd A; Preston, Christopher; Hamilton, John P; Giacomini, Darci; Robin Buell, C; Leach, Jan E; Westra, Philip
2015-02-01
Field-evolved resistance to the herbicide glyphosate is due to amplification of one of two EPSPS alleles, increasing transcription and protein with no splice variants or effects on other pathway genes. The widely used herbicide glyphosate inhibits the shikimate pathway enzyme 5-enolpyruvylshikimate-3-phosphate synthase (EPSPS). Globally, the intensive use of glyphosate for weed control has selected for glyphosate resistance in 31 weed species. Populations of suspected glyphosate-resistant Kochia scoparia were collected from fields located in the US central Great Plains. Glyphosate dose response verified glyphosate resistance in nine populations. The mechanism of resistance to glyphosate was investigated using targeted sequencing, quantitative PCR, immunoblotting, and whole transcriptome de novo sequencing to characterize the sequence and expression of EPSPS. Sequence analysis showed no mutation of the EPSPS Pro106 codon in glyphosate-resistant K. scoparia, whereas EPSPS genomic copy number and transcript abundance were elevated three- to ten-fold in resistant individuals relative to susceptible individuals. Glyphosate-resistant individuals with increased relative EPSPS copy numbers had consistently lower shikimate accumulation in leaf disks treated with 100 μM glyphosate and EPSPS protein levels were higher in glyphosate-resistant individuals with increased gene copy number compared to glyphosate-susceptible individuals. RNA sequence analysis revealed seven nucleotide positions with two different expressed alleles in glyphosate-susceptible reads. However, one nucleotide at the seven positions was predominant in glyphosate-resistant sequences, suggesting that only one of two EPSPS alleles was amplified in glyphosate-resistant individuals. No alternatively spliced EPSPS transcripts were detected. Expression of five other genes in the chorismate pathway was unaffected in glyphosate-resistant individuals with increased EPSPS expression. 
These results indicate increased EPSPS expression is a mechanism for glyphosate resistance in these K. scoparia populations.
Epstein, L M; Forney, J D
1984-01-01
A screening procedure was devised for the isolation of X-ray-induced mutations affecting the expression of the A immobilization antigen (i-antigen) in Paramecium tetraurelia. Two of the mutations isolated by this procedure proved to be in modifier genes. The two genes are unlinked to each other and unlinked to the structural A i-antigen gene. These are the first modifier genes identified in a Paramecium sp. that affect surface antigen expression. Another mutation was found to be a deletion of sequences just downstream from the A i-antigen gene. In cells carrying this mutation, the A i-antigen gene lies in close proximity to the end of a macronuclear chromosome. The expression of the A i-antigen is not affected in these cells, demonstrating that downstream sequences are not important for the regulation and expression of the A i-antigen gene. A stable cell line was also recovered which shows non-Mendelian inheritance of a macronuclear deletion of the A i-antigen gene. This mutant does not contain the gene in its macronucleus, but contains a complete copy of the gene in its micronucleus. In the cytoplasm of wild-type animals, the micronuclear gene is included in the developing macronucleus; in the cytoplasm of the mutant, the incorporation of the A i-antigen gene into the macronucleus is inhibited. This is the first evidence that a mechanism is available in ciliates to control the expression of a gene by regulating its incorporation into developing macronuclei. PMID:6092921
Wang, Fengqing; Suo, Yanfei; Wei, He; Li, Mingjie; Xie, Caixia; Wang, Lina; Chen, Xinjian; Zhang, Zhongyi
2015-01-01
The v-myb avian myeloblastosis viral oncogene homolog (MYB) superfamily constitutes one of the most abundant groups of transcription factors (TFs) described in plants. To date, little is known about the MYB genes in Rehmannia glutinosa. Forty unique MYB genes with full-length cDNA sequences were isolated. These 40 genes were grouped into five categories, one R1R2R3-MYB, four TRFL MYBs, four SMH MYBs, 25 R2R3-MYBs, and six MYB-related members. The MYB DNA-binding domain (DBD) sequence composition was conserved among proteins of the same subgroup. As expected, most of the closely related members in the phylogenetic tree exhibited common motifs. Additionally, the gene structure and motifs of the R. glutinosa MYB genes were analyzed. MYB gene expression was analyzed in the leaf and the tuberous root under two abiotic stress conditions. Expression profiles showed that most R. glutinosa MYB genes were expressed in the leaf and the tuberous root, suggesting that MYB genes are involved in various physiological and developmental processes in R. glutinosa. Seven MYB genes were up-regulated in response to shading in at least one tissue. Two MYB genes showed increased expression and 13 MYB genes showed decreased expression in the tuberous root under continuous cropping. This investigation is the first comprehensive study of the MYB gene family in R. glutinosa. PMID:26147429
Generation, annotation and analysis of ESTs from Trichoderma harzianum CECT 2413
Vizcaíno, Juan Antonio; González, Francisco Javier; Suárez, M Belén; Redondo, José; Heinrich, Julian; Delgado-Jarana, Jesús; Hermosa, Rosa; Gutiérrez, Santiago; Monte, Enrique; Llobell, Antonio; Rey, Manuel
2006-01-01
Background The filamentous fungus Trichoderma harzianum is used as a biological control agent against several plant-pathogenic fungi. In order to study the genome of this fungus, a functional genomics project called "TrichoEST" was developed to give insights into genes involved in biological control activities using an approach based on the generation of expressed sequence tags (ESTs). Results Eight different cDNA libraries from T. harzianum strain CECT 2413 were constructed. Different growth conditions involving mainly different nutrient conditions and/or stresses were used. We here present the analysis of the 8,710 ESTs generated. A total of 3,478 unique sequences were identified, of which 81.4% had sequence similarity with GenBank entries, using the BLASTX algorithm. Using the Gene Ontology hierarchy, we annotated 51.1% of the unique sequences and compared their distribution among the gene libraries. Additionally, the InterProScan algorithm was used in order to further characterize the sequences. The identification of the putatively secreted proteins was also carried out. Later, based on the EST abundance, we examined the highly expressed genes, and a hydrophobin was identified as the gene expressed at the highest level. We compared our collection of ESTs with the previous collections obtained from Trichoderma species and we also compared our sequence set with different complete eukaryotic genomes from several animals, plants and fungi. Accordingly, the presence of similar sequences in different kingdoms was also studied. Conclusion This EST collection and its annotation provide a significant resource for basic and applied research on T. harzianum, a fungus of high biotechnological interest. PMID:16872539
Maluf, Mirian Perez; da Silva, Carla Cristina; de Oliveira, Michelle de Paula Abreu; Tavares, Aline Gomes; Silvarolla, Maria Bernadete; Guerreiro, Oliveiro
2009-10-01
In this work, we studied the biosynthesis of caffeine by examining the expression of genes involved in this biosynthetic pathway in coffee fruits containing normal or low levels of this substance. The amplification of gene-specific transcripts during fruit development revealed that low-caffeine fruits had a lower expression of the theobromine synthase and caffeine synthase genes and also contained an extra transcript of the caffeine synthase gene. This extra transcript contained only part of exon 1 and all of exon 3. The sequence of the mutant caffeine synthase gene revealed the substitution of isoleucine for valine in the enzyme active site that probably interfered with enzymatic activity. These findings indicate that the absence of caffeine in these mutants probably resulted from a combination of transcriptional regulation and the presence of mutations in the caffeine synthase amino acid sequence. PMID:21637458
Massa, Sónia I; Pearson, Gareth A; Aires, Tânia; Kube, Michael; Olsen, Jeanine L; Reinhardt, Richard; Serrão, Ester A; Arnaud-Haond, Sophie
2011-09-01
Predicted global climate change threatens the distributional ranges of species worldwide. We identified genes expressed in the intertidal seagrass Zostera noltii during recovery from a simulated low tide heat-shock exposure. Five Expressed Sequence Tag (EST) libraries were compared, corresponding to four recovery times following sub-lethal temperature stress, and a non-stressed control. We sequenced and analyzed 7009 sequence reads from 30min, 2h, 4h and 24h after the beginning of the heat-shock (AHS), and 1585 from the control library, for a total of 8594 sequence reads. Among 51 Tentative UniGenes (TUGs) exhibiting significantly different expression between libraries, 19 (37.3%) were identified as 'molecular chaperones' and were over-expressed following heat-shock, while 12 (23.5%) were 'photosynthesis TUGs' generally under-expressed in heat-shocked plants. A time course analysis of expression showed a rapid increase in expression of the molecular chaperone class (most of which were heat-shock proteins), which increased from 2 sequence reads in the control library to almost 230 in the 30min AHS library, followed by a slow decrease during further recovery. In contrast, 'photosynthesis TUGs' were under-expressed 30min AHS compared with the control library, and declined progressively with recovery time in the stress libraries, with a total of 29 sequence reads 24h AHS, compared with 125 in the control. A total of 4734 TUGs were screened for EST-Simple Sequence Repeats (EST-SSRs) and 86 microsatellites were identified. Copyright © 2011 Elsevier B.V. All rights reserved.
Lerat, Emmanuelle; Fablet, Marie; Modolo, Laurent; Lopez-Maestre, Hélène
2017-01-01
Over recent decades, substantial efforts have been made to understand the interactions between host genomes and transposable elements (TEs). The impact of TEs on the regulation of host genes is well known, with TEs acting as platforms of regulatory sequences. Nevertheless, due to their repetitive nature, it is considerably difficult to integrate TE analysis into genome-wide studies. Here, we developed a specific tool for the analysis of TE expression: TEtools. This tool takes into account the TE sequence diversity of the genome; it can be applied to unannotated or unassembled genomes and is freely available under the GPL3 (https://github.com/l-modolo/TEtools). TEtools performs the mapping of RNA-seq data obtained from classical mRNAs or small RNAs onto a list of TE sequences and performs differential expression analyses with statistical relevance. Using this tool, we analyzed TE expression from five Drosophila wild-type strains. Our data show for the first time that the activity of TEs is strictly linked to the activity of the genes implicated in the piwi-interacting RNA biogenesis and therefore fits an arms race scenario between TE sequences and host control genes. PMID:28204592
Parallel gene analysis with allele-specific padlock probes and tag microarrays
Banér, Johan; Isaksson, Anders; Waldenström, Erik; Jarvius, Jonas; Landegren, Ulf; Nilsson, Mats
2003-01-01
Parallel, highly specific analysis methods are required to take advantage of the extensive information about DNA sequence variation and of expressed sequences. We present a scalable laboratory technique suitable to analyze numerous target sequences in multiplexed assays. Sets of padlock probes were applied to analyze single nucleotide variation directly in total genomic DNA or cDNA for parallel genotyping or gene expression analysis. All reacted probes were then co-amplified and identified by hybridization to a standard tag oligonucleotide array. The technique was illustrated by analyzing normal and pathogenic variation within the Wilson disease-related ATP7B gene, both at the level of DNA and RNA, using allele-specific padlock probes. PMID:12930977
NASA Astrophysics Data System (ADS)
Lau, Yun-Fai; Kan, Yuet Wai
1983-09-01
We have developed a series of cosmids that can be used as vectors for genomic recombinant DNA library preparations, as expression vectors in mammalian cells for both transient and stable transformations, and as shuttle vectors between bacteria and mammalian cells. These cosmids were constructed by inserting one of the SV2-derived selectable gene markers (SV2-gpt, SV2-DHFR, and SV2-neo) in cosmid pJB8. High efficiency of genomic cloning was obtained with these cosmids and the size of the inserts was 30-42 kilobases. We isolated recombinant cosmids containing the human α-globin gene cluster from these genomic libraries. The simian virus 40 DNA in these selectable gene markers provides the origin of replication and enhancer sequences necessary for replication in permissive cells such as COS 7 cells and thereby allows transient expression of α-globin genes in these cells. These cosmids and their recombinants could also be stably transformed into mammalian cells by using the respective selection systems. Both of the adult α-globin genes were more actively expressed than the embryonic zeta-globin genes in these transformed cell lines. Because of the presence of the cohesive ends of the Charon 4A phage in the cosmids, the transforming DNA sequences could readily be rescued from these stably transformed cells into bacteria by in vitro packaging of total cellular DNA. Thus, these cosmid vectors are potentially useful for direct isolation of structural genes.
The Eucalyptus terpene synthase gene family.
Külheim, Carsten; Padovan, Amanda; Hefer, Charles; Krause, Sandra T; Köllner, Tobias G; Myburg, Alexander A; Degenhardt, Jörg; Foley, William J
2015-06-11
Terpenoids are abundant in the foliage of Eucalyptus, providing the characteristic smell as well as being valuable economically and influencing ecological interactions. Quantitative and qualitative inter- and intra- specific variation of terpenes is common in eucalypts. The genome sequences of Eucalyptus grandis and E. globulus were mined for terpene synthase genes (TPS) and compared to other plant species. We investigated the relative expression of TPS in seven plant tissues and functionally characterized five TPS genes from E. grandis. Compared to other sequenced plant genomes, Eucalyptus grandis has the largest number of putative functional TPS genes of any sequenced plant. We discovered 113 and 106 putative functional TPS genes in E. grandis and E. globulus, respectively. All but one TPS from E. grandis were expressed in at least one of seven plant tissues examined. Genomic clusters of up to 20 genes were identified. Many TPS are expressed in tissues other than leaves which invites a re-evaluation of the function of terpenes in Eucalyptus. Our data indicate that terpenes in Eucalyptus may play a wider role in biotic and abiotic interactions than previously thought. Tissue specific expression is common and the possibility of stress induction needs further investigation. Phylogenetic comparison of the two investigated Eucalyptus species gives insight about recent evolution of different clades within the TPS gene family. While the majority of TPS genes occur in orthologous pairs some clades show evidence of recent gene duplication, as well as loss of function.
Differential expression of a WRKY gene between wild and cultivated soybeans correlates to seed size.
Gu, Yongzhe; Li, Wei; Jiang, Hongwei; Wang, Yan; Gao, Huihui; Liu, Miao; Chen, Qingshan; Lai, Yongcai; He, Chaoying
2017-05-17
Soybean (Glycine max) probably originated from the wild soybean (Glycine soja). Glycine max has a significantly larger seed size, but the underlying genomic changes are largely unknown. Candidate regulatory genes were preliminarily proposed by co-localizing RNA sequencing data with the quantitative trait loci (QTLs) for seed size. The soybean gene locus SoyWRKY15a and its orthologous genes from G. max (GmWRKY15a) and G. soja (GsWRKY15a) were analyzed in detail. The coding sequences were nearly identical between the two orthologs, but GmWRKY15a was significantly more highly expressed than GsWRKY15a. Four haplotypes (H1-H4) were found and they varied in the size of a CT-core microsatellite locus in the 5'-untranslated region of this gene. H1 (with six CT-repeats) was the only allelic version found in G. max, while H3 (with five CT-repeats) was the dominant G. soja allele. Differential expression of this gene in soybean pods was correlated with CT-repeat variation, and manipulation of the CT copy number altered the reporter gene expression, suggesting a regulatory role for the simple sequence repeats. Seed weight of wild soybeans harboring H1 was significantly greater than that of soybeans having haplotypes H2, H3, or H4, and seed weight was correlated with gene expression, suggesting the influence of GsWRKY15a in controlling seed size. However, the seed size might be refractory to increased SoyWRKY15a expression in cultivated soybeans. The evolutionary significance of SoyWRKY15a variation in soybean seed domestication is discussed. © The Author 2017. Published by Oxford University Press on behalf of the Society for Experimental Biology.
Four out of eight genes in a mouse chromosome 7 congenic donor region are candidate obesity genes.
Sarahan, Kari A; Fisler, Janis S; Warden, Craig H
2011-09-22
We previously identified a region of mouse chromosome 7 that influences body fat mass in F2 littermates of congenic × background intercrosses. Current analyses revealed that alleles in the donor region of the subcongenic B6.C-D7Mit318 (318) promoted a twofold increase in adiposity in homozygous lines of 318 compared with background C57BL/6ByJ (B6By) mice. Parent-of-origin effects were discounted through cross-fostering studies and an F1 reciprocal cross. Mapping of the donor region revealed that it has a maximal size of 2.8 Mb (minimum 1.8 Mb) and contains a maximum of eight protein coding genes. Quantitative PCR in whole brain, liver, and gonadal white adipose tissue (GWAT) revealed differential expression between genotypes for three genes in females and two genes in males. Alpha-2,8-sialyltransferase 8B (St8sia2) showed reduced 318 mRNA levels in brain for females and males and in GWAT for females only. Both sexes of 318 mice had reduced Repulsive guidance molecule-a (Rgma) expression in GWAT. In brain, Family with sequence similarity 174 member b (Fam174b) had increased expression in 318 females, whereas Chromodomain helicase DNA binding protein 2 (Chd2-2) had reduced expression in 318 males. No donor region genes were differentially expressed in liver. Sequence analysis of coding exons for all genes in the 318 donor region revealed only one single nucleotide polymorphism that produced a nonsynonymous missense mutation, Gln7Pro, in Fam174b. Our findings highlight the difficulty of using expression and sequence to identify quantitative trait genes underlying obesity even in small genomic regions.
2011-01-01
Background Transcriptome sequencing data has become an integral component of modern genetics, genomics and evolutionary biology. However, despite advances in the technologies of DNA sequencing, such data are lacking for many groups of living organisms, in particular, many plant taxa. We present here the results of transcriptome sequencing for two closely related plant species. These species, Fagopyrum esculentum and F. tataricum, belong to the order Caryophyllales - a large group of flowering plants with uncertain evolutionary relationships. F. esculentum (common buckwheat) is also an important food crop. Despite these practical and evolutionary considerations, Fagopyrum species have not been the subject of large-scale sequencing projects. Results Normalized cDNA corresponding to genes expressed in flowers and inflorescences of F. esculentum and F. tataricum was sequenced using the 454 pyrosequencing technology. This resulted in 267 thousand (F. esculentum) and 229 thousand (F. tataricum) reads with an average length of 341-349 nucleotides. De novo assembly of the reads produced about 25 thousand contigs for each species, with 7.5-8.2× coverage. Comparative analysis of the two transcriptomes demonstrated their overall similarity but also revealed genes that are presumably differentially expressed. Among them are retrotransposon genes and genes involved in sugar biosynthesis and metabolism. Thirteen single-copy genes were used for phylogenetic analysis; the resulting trees are largely consistent with those inferred from multigenic plastid datasets. The sister relationship of the Caryophyllales and asterids has now gained high support from nuclear gene sequences. Conclusions 454 transcriptome sequencing and de novo assembly were performed for two congeneric flowering plant species, F. esculentum and F. tataricum. As a result, a large set of cDNA sequences that represent orthologs of known plant genes as well as potential new genes was generated. PMID:21232141
USDA-ARS's Scientific Manuscript database
Mounting evidence shows microRNAs (miRNAs) directly regulate gene expression post-transcriptionally through base-pairing with regions in the 3’-untranslated sequences of target gene mRNAs, which results in dysregulation of gene expression/translation and subsequently modulates cellular processes. We...
Comprehensive analysis of MHC class I genes from the U-, S-, and Z-lineages in Atlantic salmon.
Lukacs, Morten F; Harstad, Håvard; Bakke, Hege G; Beetz-Sargent, Marianne; McKinnel, Linda; Lubieniecki, Krzysztof P; Koop, Ben F; Grimholt, Unni
2010-03-05
We have previously sequenced more than 500 kb of the duplicated MHC class I regions in Atlantic salmon. In the IA region we identified the loci for the MHC class I gene Sasa-UBA in addition to a soluble MHC class I molecule, Sasa-ULA. A pseudolocus for Sasa-UCA was identified in the nonclassical IB region. Both regions contained genes for antigen presentation, as well as orthologues to other genes residing in the human MHC region. The genomic localisation of two MHC class I lineages (Z and S) has been resolved. Seven BACs were sequenced using a combination of standard Sanger and 454 sequencing. The new sequence data extended the IA region with 150 kb, identifying the location of one Z-lineage locus, ZAA. The IB region was extended with 350 kb including three new Z-lineage loci, ZBA, ZCA and ZDA, in addition to a UGA locus. An allelic version of the IB region contained a functional UDA locus in addition to the UCA pseudolocus. Additionally, a BAC harbouring two MHC class I genes (UHA) was placed on linkage group 14, while a BAC containing the S-lineage locus SAA (previously known as UAA) was placed on LG10. Gene expression studies showed a limited expression range for all class I genes, with the exception of UBA being dominantly expressed in gut, spleen and gills, and ZAA with high expression in blood. Here we describe the genomic organization of MHC class I loci from the U-, Z-, and S-lineages in Atlantic salmon. Nine of the described class I genes are located in the extension of the duplicated IA and IB regions, while three class I genes are found on two separate linkage groups. The gene organization of the two regions indicates that the IB region is evolving at a different pace than the IA region. Expression profiling, polymorphic content, peptide binding properties and phylogenetic relationship show that Atlantic salmon has only one MHC class Ia gene (UBA), in addition to a multitude of nonclassical MHC class I genes from the U-, S- and Z-lineages.
Zhang, Yu; Peng, Lifang; Wu, Ya; Shen, Yanyue; Wu, Xiaoming; Wang, Jianbo
2014-11-01
Embryo development represents a crucial developmental period in the life cycle of flowering plants. To gain insights into the genetic programs that control embryo development in Brassica rapa L., RNA sequencing technology was used to perform transcriptome profiling analysis of B. rapa developing embryos. The results generated 42,906,229 sequence reads aligned with 32,941 genes. In total, 27,760, 28,871, 28,384, and 25,653 genes were identified from embryos at globular, heart, early cotyledon, and mature developmental stages, respectively, and analysis between stages revealed a subset of stage-specific genes. We next investigated 9,884 differentially expressed genes with more than fivefold changes in expression and false discovery rate ≤ 0.001 from three adjacent-stage comparisons; 1,514, 3,831, and 6,633 genes were detected between globular and heart stage embryo libraries, heart stage and early cotyledon stage, and early cotyledon and mature stage, respectively. Large numbers of genes related to cellular process, metabolism process, response to stimulus, and biological process were expressed during the early and middle stages of embryo development. Fatty acid biosynthesis, biosynthesis of secondary metabolites, and photosynthesis-related genes were expressed predominantly in embryos at the middle stage. Genes for lipid metabolism and storage proteins were highly expressed in the middle and late stages of embryo development. We also identified 911 transcription factor genes that show differential expression across embryo developmental stages. These results increase our understanding of the complex molecular and cellular events during embryo development in B. rapa and provide a foundation for future studies on other oilseed crops.
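The selection rule stated in the abstract above (more than fivefold change in expression and false discovery rate ≤ 0.001) can be illustrated with a small filter. This is only a sketch under stated assumptions: the record layout, the function names, the expression floor used to avoid division by zero, and the toy values are all hypothetical and do not come from the paper, which does not describe its software in this abstract.

```python
from dataclasses import dataclass

@dataclass
class GeneTest:
    """One gene compared between two adjacent stages (hypothetical layout)."""
    gene: str
    expr_a: float   # normalized expression in the earlier stage
    expr_b: float   # normalized expression in the later stage
    fdr: float      # false discovery rate from an upstream test

def fold_change(a: float, b: float, floor: float = 0.01) -> float:
    """Symmetric fold change (always >= 1); `floor` is an assumed guard
    against division by zero for unexpressed genes."""
    a, b = max(a, floor), max(b, floor)
    return max(a, b) / min(a, b)

def differentially_expressed(tests, min_fold=5.0, max_fdr=0.001):
    """Keep genes with MORE than `min_fold` change and FDR <= `max_fdr`."""
    return [t.gene for t in tests
            if fold_change(t.expr_a, t.expr_b) > min_fold and t.fdr <= max_fdr]

# Toy data, invented for illustration only.
toy = [
    GeneTest("g1", 10.0, 60.0, 0.0001),  # 6-fold, significant: kept
    GeneTest("g2", 10.0, 40.0, 0.0001),  # 4-fold: dropped
    GeneTest("g3", 100.0, 5.0, 0.01),    # 20-fold, FDR too high: dropped
    GeneTest("g4", 0.0, 3.0, 0.0005),    # off-to-on, significant: kept
    GeneTest("g5", 50.0, 10.0, 0.001),   # exactly 5-fold: dropped
]
print(differentially_expressed(toy))
```

The strict inequality on fold change mirrors the wording "more than fivefold"; whether the original analysis treated the boundary this way is not stated.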
Optimal Scaling of Digital Transcriptomes
Glusman, Gustavo; Caballero, Juan; Robinson, Max; Kutlu, Burak; Hood, Leroy
2013-01-01
Deep sequencing of transcriptomes has become an indispensable tool for biology, enabling expression levels for thousands of genes to be compared across multiple samples. Since transcript counts scale with sequencing depth, counts from different samples must be normalized to a common scale prior to comparison. We analyzed fifteen existing and novel algorithms for normalizing transcript counts, and evaluated the effectiveness of the resulting normalizations. For this purpose we defined two novel and mutually independent metrics: (1) the number of “uniform” genes (genes whose normalized expression levels have a sufficiently low coefficient of variation), and (2) low Spearman correlation between normalized expression profiles of gene pairs. We also define four novel algorithms, one of which explicitly maximizes the number of uniform genes, and compared the performance of all fifteen algorithms. The two most commonly used methods (scaling to a fixed total value, or equalizing the expression of certain ‘housekeeping’ genes) yielded particularly poor results, surpassed even by normalization based on randomly selected gene sets. Conversely, seven of the algorithms approached what appears to be optimal normalization. Three of these algorithms rely on the identification of “ubiquitous” genes: genes expressed in all the samples studied, but never at very high or very low levels. We demonstrate that these include a “core” of genes expressed in many tissues in a mutually consistent pattern, which is suitable for use as an internal normalization guide. The new methods yield robustly normalized expression values, which is a prerequisite for the identification of differentially expressed and tissue-specific genes as potential biomarkers. PMID:24223126
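The first evaluation metric described above, the number of "uniform" genes, can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function names, the coefficient-of-variation threshold of 0.25, and the toy count matrix are assumptions made here, and scaling to a fixed total is used only because the abstract names it as a commonly used baseline.

```python
from statistics import mean, pstdev

def scale_to_total(counts, total=1_000_000):
    """Scale every sample (column) so that its counts sum to `total`."""
    n_samples = len(next(iter(counts.values())))
    sums = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {gene: [row[j] * total / sums[j] for j in range(n_samples)]
            for gene, row in counts.items()}

def count_uniform_genes(expr, cv_threshold=0.25):
    """Count genes whose coefficient of variation across samples is at or
    below `cv_threshold` (the threshold value is an assumption)."""
    n = 0
    for values in expr.values():
        m = mean(values)
        if m > 0 and pstdev(values) / m <= cv_threshold:
            n += 1
    return n

# Toy counts: sample 2 was sequenced about 2x and sample 3 about 4x as
# deeply as sample 1, so raw counts differ mainly because of depth.
counts = {
    "geneA": [100, 200, 400],
    "geneB": [50, 100, 200],
    "geneC": [840, 1500, 3395],
    "geneD": [10, 200, 5],      # genuinely variable gene
}
print(count_uniform_genes(counts))                  # before normalization
print(count_uniform_genes(scale_to_total(counts)))  # after normalization
```

In this toy case no gene looks uniform in the raw counts, while three of the four do after scaling; a normalization that yields more uniform genes would score better under this metric.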
Cox, Murray P; Dong, Ting; Shen, Genggeng; Dalvi, Yogesh; Scott, D Barry; Ganley, Austen R D
2014-03-01
Polyploidy, a state in which the chromosome complement has undergone an increase, is a major force in evolution. Understanding the consequences of polyploidy has received much attention, and allopolyploids, which result from the union of two different parental genomes, are of particular interest because they must overcome a suite of biological responses to this merger, known as "genome shock." A key question is what happens to gene expression of the two gene copies following allopolyploidization, but until recently the tools to answer this question on a genome-wide basis were lacking. Here we utilize high throughput transcriptome sequencing to produce the first genome-wide picture of gene expression response to allopolyploidy in fungi. A novel pipeline for assigning sequence reads to the gene copies was used to quantify their expression in a fungal allopolyploid. We find that the transcriptional response to allopolyploidy is predominantly conservative: both copies of most genes are retained; over half the genes inherit parental gene expression patterns; and parental differential expression is often lost in the allopolyploid. Strikingly, the patterns of gene expression change are highly concordant with the genome-wide expression results of a cotton allopolyploid. The very different nature of these two allopolyploids implies a conserved, eukaryote-wide transcriptional response to genome merger. We provide evidence that the transcriptional responses we observe are mostly driven by intrinsic differences between the regulatory systems in the parent species, and from this propose a mechanistic model in which the cross-kingdom conservation in transcriptional response reflects conservation of the mutational processes underlying eukaryotic gene regulatory evolution. 
This work provides a platform to develop a universal understanding of gene expression response to allopolyploidy and suggests that allopolyploids are an exceptional system to investigate gene regulatory changes that have evolved in the parental species prior to allopolyploidization.
Searching for resistance genes to Bursaphelenchus xylophilus using high throughput screening
2012-01-01
Background Pine wilt disease (PWD), caused by the pinewood nematode (PWN; Bursaphelenchus xylophilus), damages and kills pine trees and is causing serious economic damage worldwide. Although the ecological mechanism of infestation is well described, the plant’s molecular response to the pathogen is not well known. This is due mainly to the lack of genomic information and the complexity of the disease. High throughput sequencing is now an efficient approach for detecting the expression of genes in non-model organisms, thus providing valuable information in spite of the lack of the genome sequence. In an attempt to unravel genes potentially involved in the pine defense against the pathogen, we hereby report the high throughput comparative sequence analysis of infested and non-infested stems of Pinus pinaster (very susceptible to PWN) and Pinus pinea (less susceptible to PWN). Results Four cDNA libraries from infested and non-infested stems of P. pinaster and P. pinea were sequenced in a full 454 GS FLX run, producing a total of 2,083,698 reads. The putative amino acid sequences encoded by the assembled transcripts were annotated according to Gene Ontology, to assign Pinus contigs into Biological Processes, Cellular Components and Molecular Functions categories. Most of the annotated transcripts corresponded to Picea genes (25.4-39.7%), whereas a smaller percentage matched Pinus genes (1.8-12.8%), probably a consequence of more public genomic information being available for Picea than for Pinus. The comparative transcriptome analysis showed that when P. pinaster was infested with PWN, the genes malate dehydrogenase, ABA, water deficit stress related genes and PAR1 were highly expressed, while in PWN-infested P. pinea, the highly expressed genes were ricin B-related lectin, and genes belonging to the SNARE and high mobility group families. Quantitative PCR experiments confirmed the differential gene expression between the two pine species.
Conclusions Defense-related genes triggered by nematode infestation were detected in both P. pinaster and P. pinea transcriptomes utilizing 454 pyrosequencing technology. P. pinaster showed higher abundance of genes related to transcriptional regulation, terpenoid secondary metabolism (including some with nematicidal activity) and pathogen attack. P. pinea showed higher abundance of genes related to oxidative stress and higher levels of expression in general of stress responsive genes. This study provides essential information about the molecular defense mechanisms utilized by P. pinaster and P. pinea against PWN infestation and contributes to a better understanding of PWD. PMID:23134679
Sequence analysis and expression of the M1 and M2 matrix protein genes of hirame rhabdovirus (HIRRV)
Nishizawa, T.; Kurath, G.; Winton, J.R.
1997-01-01
We have cloned and sequenced a 2318 nucleotide region of the genomic RNA of hirame rhabdovirus (HIRRV), an important viral pathogen of Japanese flounder Paralichthys olivaceus. This region comprises approximately two-thirds of the 3' end of the nucleocapsid protein (N) gene and the complete matrix protein (M1 and M2) genes with the associated intergenic regions. The partial N gene sequence was 812 nucleotides in length with an open reading frame (ORF) that encoded the carboxyl-terminal 250 amino acids of the N protein. The M1 and M2 genes were 771 and 700 nucleotides in length, respectively, with ORFs encoding proteins of 227 and 193 amino acids. The M1 gene sequence contained an additional small ORF that could encode a highly basic, arginine-rich protein of 25 amino acids. Comparisons of the N, M1, and M2 gene sequences of HIRRV with the corresponding sequences of the fish rhabdoviruses, infectious hematopoietic necrosis virus (IHNV) or viral hemorrhagic septicemia virus (VHSV) indicated that HIRRV was more closely related to IHNV than to VHSV, but was clearly distinct from either. The putative consensus gene termination sequence for IHNV and VHSV, AGAYAG(A)(7), was present in the N-M1, M1-M2, and M2-G intergenic regions of HIRRV as were the putative transcription initiation sequences YGGCAC and AACA. An Escherichia coli expression system was used to produce recombinant proteins from the M1 and M2 genes of HIRRV. These were the same size as the authentic M1 and M2 proteins and reacted with anti-HIRRV rabbit serum in western blots. These reagents can be used for further study of the fish immune response and to test novel control methods.
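The consensus motifs quoted in the abstract above use IUPAC nucleotide codes (Y stands for C or T), and AGAYAG(A)(7) denotes AGAYAG followed by seven A residues. A hedged sketch of how such motifs could be located in a sequence follows; the helper names and the toy sequence are invented for illustration and are not taken from the paper or from the HIRRV genome.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def iupac_to_regex(motif: str) -> str:
    """Translate an IUPAC motif such as 'YGGCAC' into a regex string."""
    return "".join(IUPAC[base] for base in motif.upper())

def find_motif(seq: str, motif: str) -> list[int]:
    """Return 0-based start positions of all (possibly overlapping) matches."""
    pattern = re.compile("(?=" + iupac_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(seq.upper())]

TERMINATION = "AGAYAG" + "A" * 7   # AGAYAG(A)7 written out in full
INITIATION = "YGGCAC"

# Hypothetical toy sequence: two termination signals flanking one
# initiation signal. It is NOT real HIRRV sequence.
toy = "CCGG" + "AGATAGAAAAAAA" + "TTGGCACGT" + "AGACAGAAAAAAA" + "GG"

print(find_motif(toy, TERMINATION))  # [4, 26]
print(find_motif(toy, INITIATION))   # [18]
```

The lookahead wrapper `(?=...)` is used so that overlapping occurrences would also be reported, which matters for motifs that end in homopolymer runs.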
Vital, Marius; Chai, Benli; Østman, Bjørn; Cole, James; Konstantinidis, Konstantinos T; Tiedje, James M
2015-01-01
Escherichia coli spans a genetic continuum from enteric strains to several phylogenetically distinct, atypical lineages that are rare in humans, but more common in extra-intestinal environments. To investigate the link between gene regulation, phylogeny and diversification in this species, we analyzed global gene expression profiles of four strains representing distinct evolutionary lineages, including a well-studied laboratory strain, a typical commensal (enteric) strain and two environmental strains. RNA-Seq was employed to compare the whole transcriptomes of strains grown under batch, chemostat and starvation conditions. Highly differentially expressed genes showed a significantly lower nucleotide sequence identity compared with other genes, indicating that gene regulation and coding sequence conservation are directly connected. Overall, distances between the strains based on gene expression profiles were largely dependent on the culture condition and did not reflect phylogenetic relatedness. Expression differences of commonly shared genes (all four strains) and E. coli core genes were consistently smaller between strains characterized by more similar primary habitats. For instance, environmental strains exhibited increased expression of stress defense genes under carbon-limited growth and entered a more pronounced survival-like phenotype during starvation compared with other strains, which stayed more alert for substrate scavenging and catabolism during no-growth conditions. Since those environmental strains show similar genetic distance to each other and to the other two strains, these findings cannot be simply attributed to genetic relatedness but suggest physiological adaptations. Our study provides new insights into ecologically relevant gene-expression and underscores the role of (differential) gene regulation for the diversification of the model bacterial species. PMID:25343512
Baxter, Laura L; Hsu, Benjamin J; Umayam, Lowell; Wolfsberg, Tyra G; Larson, Denise M; Frith, Martin C; Kawai, Jun; Hayashizaki, Yoshihide; Carninci, Piero; Pavan, William J
2007-06-01
As part of the RIKEN mouse encyclopedia project, two cDNA libraries were prepared from melanocyte-derived cell lines, using techniques of full-length clone selection and subtraction/normalization to enrich for rare transcripts. End sequencing showed that these libraries display over 83% complete coding sequence at the 5' end and 96-97% complete coding sequence at the 3' end. Evaluation of the libraries, derived from B16F10Y tumor cells and melan-c cells, revealed that they contain clones for a majority of the genes previously demonstrated to function in melanocyte biology. Analysis of genomic locations for transcripts revealed that the distribution of melanocyte genes is non-random throughout the genome. Three genomic regions identified that showed significant clustering of melanocyte-expressed genes contain one or more genes previously shown to regulate melanocyte development or function. A catalog of genes expressed in these libraries is presented, providing a valuable resource of cDNA clones and sequence information that can be used for identification of new genes important for melanocyte development, function, and disease.
Nguyen, Thong T; Suryamohan, Kushal; Kuriakose, Boney; Janakiraman, Vasantharajan; Reichelt, Mike; Chaudhuri, Subhra; Guillory, Joseph; Divakaran, Neethu; Rabins, P E; Goel, Ridhi; Deka, Bhabesh; Sarkar, Suman; Ekka, Preety; Tsai, Yu-Chih; Vargas, Derek; Santhosh, Sam; Mohan, Sangeetha; Chin, Chen-Shan; Korlach, Jonas; Thomas, George; Babu, Azariah; Seshagiri, Somasekar
2018-06-12
We sequenced the Hyposidra talaca NPV (HytaNPV) double stranded circular DNA genome using PacBio single molecule sequencing technology. We found that the HytaNPV genome is 139,089 bp long with a GC content of 39.6%. It encodes 141 open reading frames (ORFs) including the 37 baculovirus core genes, 25 genes conserved among lepidopteran baculoviruses, 72 genes known in baculovirus, and 7 genes unique to the HytaNPV genome. It is a group II alphabaculovirus that codes for the F protein and lacks the gp64 gene found in group I alphabaculoviruses. Using RNA-seq, we confirmed the expression of the ORFs identified in the HytaNPV genome. Phylogenetic analysis showed HytaNPV to be closest to BusuNPV, SujuNPV and EcobNPV, which infect other tea pests, Buzura suppressaria, Sucra jujuba, and Ectropis obliqua, respectively. We identified repeat elements and a conserved non-coding baculovirus element in the genome. Analysis of the putative promoter sequences identified motifs consistent with the temporal expression of the genes observed in the RNA-seq data.
Rudd, Stephen
2005-01-01
The public expressed sequence tag collections are continually being enriched with high-quality sequences that represent an ever-expanding range of taxonomically diverse plant species. While these sequence collections provide biased insight into the populations of expressed genes available within individual species and their associated tissues, the information is conceivably of wider relevance in a comparative context. When we consider the available expressed sequence tag (EST) collections of summer 2004, most of the major plant taxonomic clades are at least superficially represented. Investigation of the five million available plant ESTs provides a wealth of information that has applications in modelling the routes of plant genome evolution and the identification of lineage-specific genes and gene families. Over four million ESTs from over 50 distinct plant species have been collated within an EST analysis pipeline called openSputnik. The ESTs were resolved down into approximately one million unigene sequences. These have been annotated using orthology-based annotation transfer from reference plant genomes and using a variety of contemporary bioinformatics methods to assign peptide, structural and functional attributes. The openSputnik database is available at http://sputnik.btk.fi.
Evolution of Synonymous Codon Usage in Neurospora tetrasperma and Neurospora discreta
Whittle, C. A.; Sun, Y.; Johannesson, H.
2011-01-01
Neurospora comprises a primary model system for the study of fungal genetics and biology. In spite of this, little is known about genome evolution in Neurospora. For example, the evolution of synonymous codon usage is largely unknown in this genus. In the present investigation, we conducted a comprehensive analysis of synonymous codon usage and its relationship to gene expression and gene length (GL) in Neurospora tetrasperma and Neurospora discreta. For our analysis, we examined codon usage among 2,079 genes per organism and assessed gene expression using large-scale expressed sequence tag (EST) data sets (279,323 and 453,559 ESTs for N. tetrasperma and N. discreta, respectively). Data on relative synonymous codon usage revealed 24 codons (and two putative codons) that are more frequently used in genes with high than with low expression and thus were defined as optimal codons. Although codon-usage bias was highly correlated with gene expression, it was independent of selectively neutral base composition (introns), thus demonstrating that translational selection drives synonymous codon usage in these genomes. We also report that GL (coding sequences [CDS]) was inversely associated with optimal codon usage at each gene expression level, with highly expressed short genes having the greatest frequency of optimal codons. Optimal codon frequency was moderately higher in N. tetrasperma than in N. discreta, which might be due to variation in selective pressures and/or mating systems. PMID:21402862
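The relative synonymous codon usage (RSCU) statistic mentioned in this abstract can be sketched as follows. This is a minimal illustration using the textbook definition (observed codon count divided by the mean count over that amino acid's synonymous codons) and the standard genetic code. The function name `rscu` and the toy sequence are assumptions for this example; the sketch does not reproduce the authors' pipeline or their criteria for calling optimal codons.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> dict:
    """Compute RSCU for each codon of every amino acid observed in `cds`.

    RSCU = count(codon) * n_synonymous / total count for that amino acid.
    Stop codons and single-codon amino acids (Met, Trp) are skipped.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop any trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families by encoded amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, family in families.items():
        if aa == "*" or len(family) < 2:
            continue
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


if __name__ == "__main__":
    # Toy coding sequence: four alanine codons (three GCT, one GCC).
    for codon, value in sorted(rscu("GCTGCTGCTGCC").items()):
        print(codon, value)
```

Values above 1 indicate codons used more often than expected under equal usage within a synonymous family; comparing such values between highly and lowly expressed gene sets is one common way to nominate candidate optimal codons.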
Human AZU-1 gene, variants thereof and expressed gene products
Chen, Huei-Mei; Bissell, Mina
2004-06-22
A human AZU-1 gene and mutants, variants and fragments thereof. Protein products encoded by the AZU-1 gene and homologs encoded by variants of the AZU-1 gene, acting as tumor suppressors or markers of malignancy progression and tumorigenicity reversion. Identification, isolation and characterization of AZU-1 and AZU-2 genes localized to a tumor suppressive locus at chromosome 10q26, highly expressed in nonmalignant and premalignant cells derived from a human breast tumor progression model. Recombinant full-length protein sequences encoded by the AZU-1 gene, and nucleotide sequences of the AZU-1 and AZU-2 genes and variants and fragments thereof. Monoclonal or polyclonal antibodies specific to AZU-1- or AZU-2-encoded protein and to AZU-1- or AZU-2-encoded protein homologs.
Tran, Tuan-Anh; Vo, Nam Tri; Nguyen, Hoang Duc; Pham, Bao The
2015-12-01
Recombinant proteins play an important role in many aspects of life and generate substantial income, notably in the industrial enzyme business. A gene is introduced into a vector and expressed in a host organism (for example, E. coli) to obtain a high yield of the target protein. However, genes transferred from other organisms are often incompatible with the host's expression system for various reasons, for example, codon usage bias, GC content, repetitive sequences, and secondary structure. One solution is to develop programs that design an optimized nucleotide sequence from a peptide sequence using properties of highly expressed genes (HEGs) of the host organism. Existing HEG data, determined by experimental and computer-based methods, are not satisfactory in quality or quantity. Therefore, a new HEG prediction method is needed. We propose a new method for predicting HEGs and criteria for evaluating gene optimization. Codon usage bias was weighted by amplifying the difference between HEGs and non-highly expressed genes (non-HEGs). The number of predicted HEGs is 5% of the genome. Compared with Puigbò's method, the proposed method performs twice as well in kernel ratio and kernel sensitivity. For transcription/translation factor proteins (TF), the proposed method gives low TF sensitivity, whereas Puigbò's method gives moderate TF sensitivity. In summary, the results indicate that the proposed method is a good option for predicting optimized genes for particular organisms, and we generated an HEG database for further research in gene design.
Influence of 5'-flanking sequence on 4.5SI RNA gene transcription by RNA polymerase III.
Gogolevskaya, Irina K; Stasenko, Danil V; Tatosyan, Karina A; Kramerov, Dmitri A
2018-05-01
Short nuclear 4.5SI RNA can be found in three related rodent families. Its function remains unknown. The genes of 4.5SI RNA contain an internal promoter of RNA polymerase III composed of the boxes A and B. Here, the effect of the sequence immediately upstream of the mouse 4.5SI RNA gene on its transcription was studied. The gene with deletions and substitutions in the 5'-flanking sequence was used to transfect HeLa cells and its transcriptional activity was evaluated from the cellular level of 4.5SI RNA. Single-nucleotide substitutions in the region adjacent to the transcription start site (positions -2 to -8) decreased the expression activity of the gene down to 40%-60% of the control. The substitution of the conserved pentanucleotide AGAAT (positions -14 to -18) could either decrease (43%-56%) or increase (134%) the gene expression. A TATA-like box (TACATGA) was found at positions -24 to -30 of the 4.5SI RNA gene. Its replacement with a polylinker fragment of the vector did not decrease the transcription level, while its replacement with a GC-rich sequence almost completely (down to 2%-5%) suppressed the transcription of the 4.5SI RNA gene. The effect of plasmid sequences bordering the gene on its transcription by RNA polymerase III is discussed.
Next generation sequencing and analysis of a conserved transcriptome of New Zealand's kiwi.
Subramanian, Sankar; Huynen, Leon; Millar, Craig D; Lambert, David M
2010-12-15
Kiwi is a highly distinctive, flightless and endangered ratite bird endemic to New Zealand. To understand the patterns of molecular evolution of the nuclear protein-coding genes in brown kiwi (Apteryx australis mantelli) and to determine the timescale of avian history we sequenced a transcriptome obtained from a kiwi embryo using next generation sequencing methods. We then assembled the conserved protein-coding regions using the chicken proteome as a scaffold. Using 1,543 conserved protein coding genes we estimated the neutral evolutionary divergence between the kiwi and chicken to be ~45%, which is approximately equal to the divergence computed for the human-mouse pair using the same set of genes. A large fraction of genes was found to be under high selective constraint, as most of the expressed genes appeared to be involved in developmental gene regulation. Our study suggests a significant relationship between gene expression levels and protein evolution. Using sequences from over 700 nuclear genes we estimated the divergence between the two basal avian groups, Palaeognathae and Neognathae to be 132 million years, which is consistent with previous studies using mitochondrial genes. The results of this investigation revealed patterns of mutation and purifying selection in conserved protein coding regions in birds. Furthermore this study suggests a relatively cost-effective way of obtaining a glimpse into the fundamental molecular evolutionary attributes of a genome, particularly when no closely related genomic sequence is available.
Licciardello, Concetta; D'Agostino, Nunzio; Traini, Alessandra; Recupero, Giuseppe Reforgiato; Frusciante, Luigi; Chiusano, Maria Luisa
2014-02-03
Glutathione S-transferases (GSTs) represent a ubiquitous gene family encoding detoxification enzymes able to recognize reactive electrophilic xenobiotic molecules as well as compounds of endogenous origin. Anthocyanin pigments require GSTs for their transport into the vacuole since their cytoplasmic retention is toxic to the cell. Anthocyanin accumulation in Citrus sinensis (L.) Osbeck fruit flesh determines different phenotypes affecting the typical pigmentation of Sicilian blood oranges. In this paper we describe: i) the characterization of the GST gene family in C. sinensis through a systematic EST analysis; ii) the validation of the EST assembly by exploiting the genome sequences of C. sinensis and C. clementina and their genome annotations; iii) GST gene expression profiling in six tissues/organs and in two different sweet orange cultivars, Cadenera (common) and Moro (pigmented). We identified 61 GST transcripts, described the full- or partial-length nature of the sequences and assigned to each sequence the GST class membership exploiting a comparative approach and the classification scheme proposed for plant species. A total of 23 full-length sequences were defined. Fifty-four of the 61 transcripts were successfully aligned to the C. sinensis and C. clementina genomes. Tissue specific expression profiling demonstrated that the expression of some GST transcripts was 'tissue-affected' and cultivar specific. A comparative analysis of C. sinensis GSTs with those from other plant species was also considered. Data from the current analysis are accessible at http://biosrv.cab.unina.it/citrusGST/, with the aim to provide a reference resource for C. sinensis GSTs. This study aimed at the characterization of the GST gene family in C. sinensis. 
Based on expression patterns from two different cultivars and on sequence-comparative analyses, we also highlighted that two sequences, a Phi class GST and a Mapeg class GST, could be involved in the conjugation of anthocyanin pigments and in their transport into the vacuole, specifically in fruit flesh of the pigmented cultivar.
2012-01-01
Background A detailed knowledge about spatial and temporal gene expression is important for understanding both the function of genes and their evolution. For the vast majority of species, transcriptomes are still largely uncharacterized and even in those where substantial information is available it is often in the form of partially sequenced transcriptomes. With the development of next generation sequencing, a single experiment can now simultaneously identify the transcribed part of a species genome and estimate levels of gene expression. Results mRNA from actively growing needles of Norway spruce (Picea abies) was sequenced using next generation sequencing technology. In total, close to 70 million fragments with a length of 76 bp were sequenced resulting in 5 Gbp of raw data. A de novo assembly of these reads, together with publicly available expressed sequence tag (EST) data from Norway spruce, was used to create a reference transcriptome. Of the 38,419 PUTs (putative unique transcripts) longer than 150 bp in this reference assembly, 83.5% show similarity to ESTs from other spruce species and of the remaining PUTs, 3,704 show similarity to protein sequences from other plant species, leaving 4,167 PUTs with limited similarity to currently available plant proteins. By predicting coding frames and comparing not only the Norway spruce PUTs, but also PUTs from the close relatives Picea glauca and Picea sitchensis to both Pinus taeda and Taxus mairei, we obtained estimates of synonymous and non-synonymous divergence among conifer species. In addition, we detected close to 15,000 SNPs of high quality and estimated gene expression differences between samples collected under dark and light conditions. Conclusions Our study yielded a large number of single nucleotide polymorphisms as well as estimates of gene expression on transcriptome scale. 
In agreement with a recent study, we find that the synonymous substitution rate per year (0.6 × 10⁻⁹ and 1.1 × 10⁻⁹) is an order of magnitude smaller than values reported for angiosperm herbs. However, if one takes generation time into account, most of this difference disappears. The estimates of the dN/dS ratio (non-synonymous over synonymous divergence) reported here are in general much lower than 1 and only a few genes showed a ratio larger than 1. PMID:23122049
Zou, Zhi; Yang, Lifu; Wang, Danhua; Huang, Qixing; Mo, Yeyong; Xie, Guishui
2016-01-01
WRKY proteins comprise one of the largest transcription factor families in plants and are key regulators of many plant processes. This study presents the characterization of 58 WRKY genes from the castor bean (Ricinus communis L., Euphorbiaceae) genome. Compared with the automatic genome annotation, one more WRKY-encoding locus was identified and 20 out of the 57 predicted gene models were manually corrected. All RcWRKY genes were shown to contain at least one intron in their coding sequences. According to the structural features of the present WRKY domains, the identified RcWRKY genes were assigned to three previously defined groups (I-III). Although castor bean underwent no recent whole-genome duplication event like physic nut (Jatropha curcas L., Euphorbiaceae), comparative genomics analysis indicated that one gene loss, one intron loss and one recent proximal duplication occurred in the RcWRKY gene family. The expression of all 58 RcWRKY genes was supported by ESTs and/or RNA sequencing reads derived from roots, leaves, flowers, seeds and endosperms. Further global expression profiles with RNA sequencing data revealed diverse expression patterns among various tissues. Results obtained from this study not only provide valuable information for future functional analysis and utilization of the castor bean WRKY genes, but also provide a useful reference to investigate the gene family expansion and evolution in Euphorbiaceous plants.
Zhang, Quan; Zhu, Feng; Liu, Long; Zheng, Chuan Wei; Wang, De He; Hou, Zhuo Cheng; Ning, Zhong Hua
2015-01-01
Eggshell damage leads to economic losses in the egg production industry and is a threat to human health. We examined 49-wk-old Rhode Island White hens (Gallus gallus) that laid eggs having shells with significantly different strengths and thicknesses. We used HiSeq 2000 (Illumina) sequencing to characterize the chicken transcriptome and whole genome to identify the key genes and genetic mutations associated with eggshell calcification. We identified a total of 14,234 genes expressed in the chicken uterus, representing 89% of all annotated chicken genes. A total of 889 differentially expressed genes (DEGs) were identified by comparing low eggshell strength (LES) and normal eggshell strength (NES) genomes. The DEGs are enriched in calcification-related processes, including calcium ion transport and calcium signaling pathways, as revealed by gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Some important matrix proteins, such as OC-116, LTF and SPP1, were also expressed differentially between the two groups. A total of 3,671,919 single-nucleotide polymorphisms (SNPs) and 508,035 Indels were detected in protein-coding genes by whole-genome re-sequencing, including 1775 non-synonymous variations and 19 frame-shift Indels in DEGs. SNPs and Indels found in this study could be further investigated for eggshell traits. This is the first report to integrate transcriptome and genome re-sequencing to target the genetic variations that decrease eggshell quality. These findings further advance our understanding of eggshell calcification in the chicken uterus.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aho, Hanne; Schwemmer, M.; Tessmann, D.
1996-03-01
The mitochondrial capsule selenoprotein (MCS) (HGMW-approved symbol MCSP) is one of three proteins that are important for the maintenance and stabilization of the crescent structure of the sperm mitochondria. We describe here the isolation of a cDNA, the exon-intron organization, the expression, and the chromosomal localization of the human MCS gene. Nucleotide sequence analysis of the human and mouse MCS cDNAs reveals that the 5'- and 3'-untranslated sequences are more conserved (71%) than the coding sequences (59%). The open reading frame encodes a 116-amino-acid protein and lacks the UGA codons, which have been reported to encode the selenocysteines in the N-terminal of the deduced mouse protein. The deduced human protein shows a low degree of amino acid sequence identity to the mouse protein (39%). The most striking homology lies in the dicysteine motifs. Northern and Southern zooblot analyses reveal that the MCS gene in human, baboon, and bovine is more conserved than its counterparts in mouse and rat. The single intron in the human MCS gene is approximately 6 kb and interrupts the 5'-untranslated region at a position equivalent to that in the mouse and rat genes. Northern blot and in situ hybridization experiments demonstrate that the expression of the human MCS gene is restricted to haploid spermatids. The human gene was assigned to q21 of chromosome 1. 30 refs., 9 figs.
Principles of gene microarray data analysis.
Mocellin, Simone; Rossi, Carlo Riccardo
2007-01-01
The development of several gene expression profiling methods, such as comparative genomic hybridization (CGH), differential display, serial analysis of gene expression (SAGE), and gene microarray, together with the sequencing of the human genome, has provided an opportunity to monitor and investigate the complex cascade of molecular events leading to tumor development and progression. The availability of such large amounts of information has shifted the attention of scientists towards a nonreductionist approach to biological phenomena. High throughput technologies can be used to follow changing patterns of gene expression over time. Among them, gene microarray has become prominent because it is easier to use, does not require large-scale DNA sequencing, and allows for the parallel quantification of thousands of genes from multiple samples. Gene microarray technology is rapidly spreading worldwide and has the potential to drastically change the therapeutic approach to patients affected with tumor. Therefore, it is of paramount importance for both researchers and clinicians to know the principles underlying the analysis of the huge amount of data generated with microarray technology.
Regulatory link between DNA methylation and active demethylation in Arabidopsis
Lei, Mingguang; Zhang, Huiming; Julian, Russell; Tang, Kai; Xie, Shaojun; Zhu, Jian-Kang
2015-01-01
De novo DNA methylation through the RNA-directed DNA methylation (RdDM) pathway and active DNA demethylation play important roles in controlling genome-wide DNA methylation patterns in plants. Little is known about how cells manage the balance between DNA methylation and active demethylation activities. Here, we report the identification of a unique RdDM target sequence, where DNA methylation is required for maintaining proper active DNA demethylation of the Arabidopsis genome. In a genetic screen for cellular antisilencing factors, we isolated several REPRESSOR OF SILENCING 1 (ros1) mutant alleles, as well as many RdDM mutants, which showed drastically reduced ROS1 gene expression and, consequently, transcriptional silencing of two reporter genes. A helitron transposon element (TE) in the ROS1 gene promoter negatively controls ROS1 expression, whereas DNA methylation of an RdDM target sequence between ROS1 5′ UTR and the promoter TE region antagonizes this helitron TE in regulating ROS1 expression. This RdDM target sequence is also targeted by ROS1, and defective DNA demethylation in loss-of-function ros1 mutant alleles causes DNA hypermethylation of this sequence and concomitantly causes increased ROS1 expression. Our results suggest that this sequence in the ROS1 promoter region serves as a DNA methylation monitoring sequence (MEMS) that senses DNA methylation and active DNA demethylation activities. Therefore, the ROS1 promoter functions like a thermostat (i.e., methylstat) to sense DNA methylation levels and regulates DNA methylation by controlling ROS1 expression. PMID:25733903
Kohno, K; Yasuzawa, K; Hirose, M; Kano, Y; Goshima, N; Tanaka, H; Imamoto, F
1994-06-01
The molecular mechanism of autoregulation of expression of the hupA gene in Escherichia coli was examined. The promoter of the gene contains a palindromic sequence with the potential to form a cruciform DNA structure in which the -35 sequence lies at the base of the stem and the -10 sequence forms a single-stranded loop. An artificial promoter lacking the palindrome, which was constructed by replacing a 10 nucleotide repeat for the predicted cruciform arm by a sequence in the opposite orientation, was not subject to HU-repression. DNA relaxation induced by deleting HU proteins and/or inhibiting DNA gyrase in cells results in increased expression from the hupA promoter. We propose that initiation of transcription of the hupA gene is negatively regulated by steric hindrance of the functional promoter domains for formation of the cruciform configuration, which is facilitated at least in part by negative supercoiling of the hupA promoter DNA region. The promoter region of the hupB gene also contains a palindromic sequence that can assume a cruciform configuration. Negative regulation of this gene by HU proteins may occur by a mechanism similar to that operating for the hupA gene.
Sequence and expression analyses of porcine ISG15 and ISG43 genes.
Huang, Jiangnan; Zhao, Shuhong; Zhu, Mengjin; Wu, Zhenfang; Yu, Mei
2009-08-01
The coding sequences of porcine interferon-stimulated gene 15 (ISG15) and the interferon-stimulated gene (ISG43) were cloned from swine spleen mRNA. The amino acid sequences deduced from the porcine ISG15 and ISG43 coding sequences shared 24-75% and 29-83% similarity with ISG15s and ISG43s from other vertebrates, respectively. Structural analyses revealed that porcine ISG15 comprises two ubiquitin homologue (UBQ) domains and a conserved C-terminal LRLRGG conjugating motif. Porcine ISG43 contains a ubiquitin-processing protease-like domain. Phylogenetic analyses showed that porcine ISG15 and ISG43 were most closely related to rat ISG15 and cattle ISG43, respectively. Using a quantitative real-time PCR assay, significantly increased expression levels of porcine ISG15 and ISG43 genes were detected in porcine kidney endothelial (PK15) cells treated with poly I:C. We also observed enhanced mRNA expression of three members of the dsRNA pattern-recognition receptors (PRR), TLR3, DDX58 and IFIH1, which have been reported to act as critical receptors in inducing the mRNA expression of ISG15 and ISG43 genes. However, we did not detect any induced mRNA expression of IFNalpha and IFNbeta, suggesting that transcriptional activation of ISG15 and ISG43 was mediated through an IFN-independent signaling pathway in the poly I:C treated PK15 cells. Association analyses in a Landrace pig population revealed that the ISG15 c.347T>C (BstUI) polymorphism and the ISG43 c.953T>G (BccI) polymorphism were significantly associated with hematological parameters and immune-related traits.
Cabrera, Ana R; Donohue, Kevin V; Khalil, Sayed M S; Scholl, Elizabeth; Opperman, Charles; Sonenshine, Daniel E; Roe, R Michael
2011-01-01
Many species of mites and ticks are of agricultural and medical importance. Much can be learned from the study of transcriptomes of acarines which can generate DNA-sequence information of potential target genes for the control of acarine pests. High throughput transcriptome sequencing can also yield sequences of genes critical during physiological processes poorly understood in acarines, i.e., the regulation of female reproduction in mites. The predatory mite, Phytoseiulus persimilis, was selected to conduct a transcriptome analysis using 454 pyrosequencing. The objective of this project was to obtain DNA-sequence information of expressed genes from P. persimilis with special interest in sequences corresponding to vitellogenin (Vg) and the vitellogenin receptor (VgR). These genes are critical to the understanding of vitellogenesis, and they will facilitate the study of the regulation of mite female reproduction. A total of 12,556 contiguous sequences (contigs) were assembled with an average size of 935 bp. From these sequences, the putative translated peptides of 11 contigs were similar in amino acid sequences to other arthropod Vgs, while 6 were similar to VgRs. We selected some of these sequences to conduct stage-specific expression studies to further determine their function.
Characterization of defensin gene from abalone Haliotis discus hannai and its deduced protein
NASA Astrophysics Data System (ADS)
Hong, Xuguang; Sun, Xiuqin; Zheng, Minggang; Qu, Lingyun; Zan, Jindong; Zhang, Jinxing
2008-11-01
Defensin is one of the ancient, conserved host defense molecules formed during biological evolution. As a regulator and effector molecule, it is very important in animals' acquired immune system. This paper reports the defensin gene from the mixed liver and kidney cDNA library of abalone Haliotis discus hannai Ino. Sequence analysis shows that the full-length cDNA encodes a mature peptide of 42 amino acids (including six Cys), with a molecular weight of 4,323 Da and a pI of 8.02. Amino acid sequence homology analysis shows that the peptide is highly similar (70% in common) to insect defensins. Because the mature peptide shows the typical structural character of insect defensins in its secondary structure, the polypeptide, named Haliotis discus defensin (hd-def), is a novel antimicrobial peptide belonging to the insect defensin subfamily. The RT-PCR result shows that the gene is expressed only in the hepatopancreas following stimulation by Gram-negative and Gram-positive bacteria, which indicates inducible expression. Therefore, Haliotis discus defensin gene expression appears to be related to the antibacterial response of Haliotis discus hannai Ino.
Tokuhiro, Keizo; Miyagawa, Yasushi; Yamada, Shuichi; Hirose, Mika; Ohta, Hiroshi; Nishimune, Yoshitake; Tanaka, Hiromitsu
2007-03-01
Haspin is a unique protein kinase expressed predominantly in haploid male germ cells. The genomic structure of haspin (Gsg2) has revealed it to be intronless, and the entire transcription unit is in an intron of the integrin alphaE (Itgae) gene. Transcription occurs from a bidirectional promoter that also generates an alternatively spliced integrin alphaE-derived mRNA (Aed). In mice, the testis-specific alternative splicing of Aed is expressed bidirectionally downstream from the Gsg2 transcription initiation site, and a segment consisting of 26 bp transcribes both genomic DNA strands between Gsg2 and the Aed transcription initiation sites. To investigate the mechanisms for this unique gene regulation, we cloned and characterized the Gsg2 promoter region. The 193-bp genomic fragment from the 5' end of the Gsg2 and Aed genes, fused with EGFP and DsRed genes, drove the expression of both proteins in haploid germ cells of transgenic mice. This promoter element contained only a GC-rich sequence, and not the previously reported DNA sequences known to bind various transcription factors--with the exception of E2F1, TCFAP2A1 (AP2), and SP1. Here, we show that the 193-bp DNA sequence is sufficient for the specific, bidirectional, and synchronous expression in germ cells in the testis. We also demonstrate the existence of germ cell nuclear factors specifically bound to the promoter sequence. This activity may be regulated by binding to the promoter sequence with germ cell-specific nuclear complex(es) without regulation via DNA methylation.
Transcriptome and Gene Expression Analysis of the Rice Leaf Folder, Cnaphalocrocis medinalis
Li, Shang-Wei; Yang, Hong; Liu, Yue-Feng; Liao, Qi-Rong; Du, Juan; Jin, Dao-Chao
2012-01-01
Background The rice leaf folder (RLF), Cnaphalocrocis medinalis (Guenee) (Lepidoptera: Pyralidae), is one of the most destructive pests affecting rice in Asia. Although several studies have been performed on the ecological and physiological aspects of this species, the molecular mechanisms underlying its developmental regulation, behavior, and insecticide resistance remain largely unknown. Presently, there is a lack of genomic information for RLF; therefore, studies aimed at profiling the RLF transcriptome expression would provide a better understanding of its biological function at the molecular level. Principal Findings De novo assembly of the RLF transcriptome was performed via the short read sequencing technology (Illumina). In a single run, we produced more than 23 million sequencing reads that were assembled into 44,941 unigenes (mean size = 474 bp) by Trinity. Through a similarity search, 25,281 (56.82%) unigenes matched known proteins in the NCBI Nr protein database. The transcriptome sequences were annotated with gene ontology (GO), cluster of orthologous groups of proteins (COG), and KEGG orthology (KO). Additionally, we profiled gene expression during RLF development using a tag-based digital gene expression (DGE) system. Five DGE libraries were constructed, and variations in gene expression were compared between collected samples: eggs vs. 3rd instar larvae, 3rd instar larvae vs. pupae, pupae vs. adults. The results demonstrated that thousands of genes were significantly differentially expressed during various developmental stages. A number of the differentially expressed genes were confirmed by quantitative real-time PCR (qRT-PCR). 
Conclusions The RLF transcriptome and DGE data provide a comprehensive and global gene expression profile that would further promote our understanding of the molecular mechanisms underlying various biological characteristics, including development, elevated fecundity, flight, sex differentiation, olfactory behavior, and insecticide resistance in RLF. Therefore, these findings could help elucidate the intrinsic factors involved in the RLF-mediated destruction of rice and offer sustainable insect pest management. PMID:23185238
Ogura, Atsushi; Ikeo, Kazuho; Gojobori, Takashi
2004-01-01
Although the camera eye of the octopus is very similar to that of humans, phylogenetic and embryological analyses have suggested that their camera eyes have been acquired independently. It has been known as a typical example of convergent evolution. To study the molecular basis of convergent evolution of camera eyes, we conducted a comparative analysis of gene expression in octopus and human camera eyes. We sequenced 16,432 ESTs of the octopus eye, leading to 1052 nonredundant genes that have matches in the protein database. Comparing these 1052 genes with 13,303 already-known ESTs of the human eye, 729 (69.3%) genes were commonly expressed between the human and octopus eyes. On the contrary, when we compared octopus eye ESTs with human connective tissue ESTs, the expression similarity was quite low. To trace the evolutionary changes that are potentially responsible for camera eye formation, we also compared octopus-eye ESTs with the completed genome sequences of other organisms. We found that 1019 out of the 1052 genes had already existed at the common ancestor of bilateria, and 875 genes were conserved between humans and octopuses. It suggests that a larger number of conserved genes and their similar gene expression may be responsible for the convergent evolution of the camera eye. PMID:15289475
APPLICATION OF DNA MICROARRAYS TO REPRODUCTIVE TOXICOLOGY AND THE DEVELOPMENT OF A TESTIS ARRAY
With the advent of sequence information for entire mammalian genomes, it is now possible to analyze gene expression and gene polymorphisms on a genomic scale. The primary tool for analysis of gene expression is the DNA microarray. We have used commercially available cDNA micro...
With the advent of sequence information for entire eukaryotic genomes, it is now possible to analyze gene expression on a genomic scale. The primary tool for genomic analysis of gene expression is the gene microarray. We have used commercially available and custom cDNA microarray...
Szczyglowski, K; Hamburger, D; Kapranov, P; de Bruijn, F J
1997-01-01
A range of novel expressed sequence tags (ESTs) associated with late developmental events during nodule organogenesis in the legume Lotus japonicus were identified using mRNA differential display; 110 differentially displayed polymerase chain reaction products were cloned and analyzed. Of 88 unique cDNAs obtained, 22 shared significant homology to DNA/protein sequences in the respective databases. This group comprises, among others, a nodule-specific homolog of protein phosphatase 2C, a peptide transporter protein, and a nodule-specific form of cytochrome P450. RNA gel-blot analysis of 16 differentially displayed ESTs confirmed their nodule-specific expression pattern. The kinetics of mRNA accumulation of the majority of the ESTs analyzed were found to resemble the expression pattern observed for the L. japonicus leghemoglobin gene. These results indicate that the newly isolated molecular markers correspond to genes induced during late developmental stages of L. japonicus nodule organogenesis and provide important, novel tools for the study of nodulation. PMID:9276951
Highlander, S K; Wickersham, E A; Garza, O; Weinstock, G M
1993-01-01
Multicopy and single-copy chromosomal fusions between the Pasteurella haemolytica leukotoxin regulatory region and the Escherichia coli beta-galactosidase gene have been constructed. These fusions were used as reporters to identify and isolate regulators of leukotoxin expression from a P. haemolytica cosmid library. A cosmid clone, which inhibited leukotoxin expression from multicopy and single-copy protein fusions, was isolated and found to contain the complete leukotoxin gene cluster plus additional upstream sequences. The locus responsible for inhibition of expression from leukotoxin-beta-galactosidase fusions was mapped within these upstream sequences, by transposon mutagenesis with Tn5, and its DNA sequence was determined. The inhibitory activity was found to be associated with a predicted 440-amino-acid reading frame (lapA) that lies within a four-gene arginine transport locus. LapA is predicted to be the nucleotide-binding component of this transport system and shares homology with the Clp family of proteases. Images PMID:8359916
Identification of an expressed gene in Dipylidium caninum.
Miranda, Rodrigo R C; Costa-Júnior, Livio M; Campos, Artur K; Santos, Hudson A; Rabelo, Elida M L
2004-10-01
Recombinant DNA studies have been focused on developing vaccines to different cestodes. But few studies involving Dipylidium caninum molecular biology and genes have been done. Only partial sequences of mitochondrial DNA and ribosomal RNA gene are available in databases. Any molecular work with this parasite, including epidemiology, study of drug-resistant strains, and vaccine development, is hampered by the lack of knowledge of its genome. Thus, the knowledge of specific genes of different developmental stages of D. caninum is crucial to locate potential targets to be used as candidates to develop a vaccine and/or new drugs against this parasite. Here we report, for the first time, the sequencing of a fragment of a D. caninum expressed gene.
Molecular and Functional Characterization of Broccoli EMBRYONIC FLOWER 2 Genes
Chen, Long-Fang O.; Lin, Chun-Hung; Lai, Ying-Mi; Huang, Jia-Yuan; Sung, Zinmay Renee
2012-01-01
Polycomb group (PcG) proteins regulate major developmental processes in Arabidopsis. EMBRYONIC FLOWER 2 (EMF2), the VEFS domain-containing PcG gene, regulates diverse genetic pathways and is required for vegetative development and plant survival. Despite widespread EMF2-like sequences in plants, little is known about their function other than in Arabidopsis and rice. To study the role of EMF2 in broccoli (Brassica oleracea var. italica cv. Elegance) development, we identified two broccoli EMF2 (BoEMF2) genes with sequence homology to and a similar gene expression pattern to that in Arabidopsis (AtEMF2). Reducing their expression in broccoli resulted in aberrant phenotypes and gene expression patterns. BoEMF2 regulates genes involved in diverse developmental and stress programs similar to AtEMF2 in Arabidopsis. However, BoEMF2 differs from AtEMF2 in the regulation of flower organ identity, cell proliferation and elongation, and death-related genes, which may explain the distinct phenotypes. The expression of BoEMF2.1 in the Arabidopsis emf2 mutant (Rescued emf2) partially rescued the mutant phenotype and restored the gene expression pattern to that of the wild type. Many EMF2-mediated molecular and developmental functions are conserved in broccoli and Arabidopsis. Furthermore, the restored gene expression pattern in Rescued emf2 provides insights into the molecular basis of PcG-mediated growth and development. PMID:22537758
Maldonado-Borges, Josefina Ines; Ku-Cauich, José Roberto; Escobedo-GraciaMedrano, Rosa Maria
2013-01-01
cDNA-AFLP analysis was used to study the genes expressed in zygotic and somatic embryogenesis of Musa acuminata Colla ssp. malaccensis, and the differentially expressed transcript-derived fragments (TDFs) were compared with the publicly available sequenced genome of the double haploid Pahang (DH-Pahang) of the malaccensis subspecies. A total of 253 TDFs were detected with apparent sizes of 100–4000 bp using 5 pairs of AFLP primers, of which 21 were differentially expressed during the different stages of banana embryogenesis; 15 of the sequences matched DH-Pahang chromosomes, with 7 of them being homologous to gene sequences encoding either known or putative protein domains of higher plants. Four TDF sequences were located in all Musa chromosomes, while the rest were located in one or two chromosomes. Their putative individual functions are briefly reviewed based on published information, and the potential roles of these genes in embryo development are discussed. Thus, the availability of the Musa genome and the TDF sequence information presented here open new possibilities for in-depth molecular and biochemical study of zygotic and somatic embryogenesis of Musa. PMID:24027442
Omeroglu Ulu, Zehra; Ulu, Salih; Un, Cemal; Ozdem Oztabak, Kemal; Altunatmaz, Kemal
2017-01-01
Kivircik sheep is an important local Turkish breed, valued for its meat quality and milk productivity. The aim of this study was to analyze gene expression profiles at both prenatal and postnatal stages in Kivircik sheep. To this end, two cDNA libraries were constructed from mammary gland tissue taken from the same Kivircik sheep at prenatal and postnatal stages. A total of 3072 colonies randomly selected from the two libraries were sequenced to develop a sheep EST collection. We used the Phred/Phrap programs to analyze the raw ESTs, and readable EST sequences were assembled with the CAP3 software. Putative functions of all unique sequences were determined, and statistical analysis was performed, with Geneious software. A total of 422 ESTs with over 80% similarity to known sequences of other organisms in NCBI were classified by Gene Ontology (GO) category using the Panther database. By comparing gene expression profiles, we observed some putative genes that may be related to reproductive performance or play important roles in milk synthesis and secretion. A total of 2414 ESTs have been deposited in the NCBI GenBank database (GW996847–GW999260). The EST data in this study provide a new source of information for functional genomic studies of sheep. PMID:28239610
2010-01-01
Background Some organisms can survive extreme desiccation by entering a state of suspended animation known as anhydrobiosis. The free-living mycophagous nematode Aphelenchus avenae can be induced to enter anhydrobiosis by pre-exposure to moderate reductions in relative humidity (RH) prior to extreme desiccation. This preconditioning phase is thought to allow modification of the transcriptome by activation of genes required for desiccation tolerance. Results To identify such genes, a panel of expressed sequence tags (ESTs) enriched for sequences upregulated in A. avenae during preconditioning was created. A subset of 30 genes with significant matches in databases, together with a number of apparently novel sequences, were chosen for further study. Several of the recognisable genes are associated with water stress, encoding, for example, two new hydrophilic proteins related to the late embryogenesis abundant (LEA) protein family. Expression studies confirmed EST panel members to be upregulated by evaporative water loss, and the majority of genes was also induced by osmotic stress and cold, but rather fewer by heat. We attempted to use RNA interference (RNAi) to demonstrate the importance of this gene set for anhydrobiosis, but found A. avenae to be recalcitrant with the techniques used. Instead, therefore, we developed a cross-species RNAi procedure using A. avenae sequences in another anhydrobiotic nematode, Panagrolaimus superbus, which is amenable to gene silencing. Of 20 A. avenae ESTs screened, a significant reduction in survival of desiccation in treated P. superbus populations was observed with two sequences, one of which was novel, while the other encoded a glutathione peroxidase. To confirm a role for glutathione peroxidases in anhydrobiosis, RNAi with cognate sequences from P. superbus was performed and was also shown to reduce desiccation tolerance in this species. 
Conclusions This study has identified and characterised the expression profiles of members of the anhydrobiotic gene set in A. avenae. It also demonstrates the potential of RNAi for the analysis of anhydrobiosis and provides the first genetic data to underline the importance of effective antioxidant systems in metazoan desiccation tolerance. PMID:20085654
Dong, Xiangrong; Wang, Yanping; Yang, Fengyuan; Zhao, Shanshan; Tian, Bing; Li, Tao
2017-01-01
Lycopene biosynthetic genes from Deinococcus radiodurans were co-expressed in Lactococcus lactis to produce lycopene and improve its tolerance to stress. Lycopene-related genes from D. radiodurans, DR1395 (crtE), DR0862 (crtB), and DR0861 (crtI), were fused in line with Shine-Dalgarno (SD) sequences and co-expressed in L. lactis. The recombinant strain produced 0.36 mg lycopene g⁻¹ dry cell wt after 48 h fermentation. The survival rate of the recombinant strain after UV irradiation was higher than that of the non-transformed strain. The L. lactis with co-expressed genes responsible for lycopene biosynthesis from D. radiodurans produced lycopene and exhibited increased resistance to UV stress, suggesting that the recombinant strain has important application potential in the food industry.
de Souza, C R; Aragão, F J; Moreira, E C O; Costa, C N M; Nascimento, S B; Carvalho, L J
2009-03-24
Cassava is one of the most important tropical food crops for more than 600 million people worldwide. Transgenic technologies can be useful for increasing its nutritional value and its resistance to viral diseases and insect pests. However, tissue-specific promoters that guarantee correct expression of transgenes would be necessary. We used inverse polymerase chain reaction to isolate a promoter sequence of the Mec1 gene coding for Pt2L4, a glutamic acid-rich protein differentially expressed in cassava storage roots. In silico analysis revealed putative cis-acting regulatory elements within this promoter sequence, including root-specific elements that may be required for its expression in vascular tissues. Transient expression experiments showed that the Mec1 promoter is functional, since this sequence was able to drive GUS expression in bean embryonic axes. Results from our computational analysis can serve as a guide for functional experiments to identify regions with tissue-specific Mec1 promoter activity. The DNA sequence that we identified is a new promoter that could be a candidate for genetic engineering of cassava roots.
Divergence of Gene Body DNA Methylation and Evolution of Plant Duplicate Genes
Wang, Jun; Marowsky, Nicholas C.; Fan, Chuanzhu
2014-01-01
It has been shown that gene body DNA methylation is associated with gene expression. However, whether and how deviation of gene body DNA methylation between duplicate genes can influence their divergence remains largely unexplored. Here, we aim to elucidate the potential role of gene body DNA methylation in the fate of duplicate genes. We identified paralogous gene pairs from Arabidopsis and rice (Oryza sativa ssp. japonica) genomes and reprocessed their single-base resolution methylome data. We show that methylation in paralogous genes nonlinearly correlates with several gene properties including exon number/gene length, expression level and mutation rate. Further, we demonstrated that divergence of methylation level and pattern in paralogs indeed positively correlate with their sequence and expression divergences. This result held even after controlling for other confounding factors known to influence the divergence of paralogs. We observed that methylation level divergence might be more relevant to the expression divergence of paralogs than methylation pattern divergence. Finally, we explored the mechanisms that might give rise to the divergence of gene body methylation in paralogs. We found that exonic methylation divergence more closely correlates with expression divergence than intronic methylation divergence. We show that genomic environments (e.g., flanked by transposable elements and repetitive sequences) of paralogs generated by various duplication mechanisms are associated with the methylation divergence of paralogs. Overall, our results suggest that the changes in gene body DNA methylation could provide another avenue for duplicate genes to develop differential expression patterns and undergo different evolutionary fates in plant genomes. PMID:25310342
Boyd, D A; Cvitkovitch, D G; Hamilton, I R
1994-01-01
We report the sequencing of a 2,242-bp region of the Streptococcus mutans NG5 genome containing the genes for ptsH and ptsI, which encode HPr and enzyme I (EI), respectively, of the phosphoenolpyruvate-dependent phosphotransferase transport system. The sequence was obtained from two cloned overlapping genomic fragments; one expresses HPr and a truncated EI, while the other expresses a full-length EI in Escherichia coli, as determined by Western immunoblotting. The ptsI gene appeared to be expressed from a region located in the ptsH gene. The S. mutans NG5 pts operon does not appear to be linked to other phosphotransferase transport system proteins as has been found in other bacteria. A positive fermentation pattern on MacConkey-glucose plates by an E. coli ptsI mutant harboring the S. mutans NG5 ptsI gene on a plasmid indicated that the S. mutans NG5 EI can complement a defect in the E. coli gene. This was confirmed by protein phosphorylation experiments with 32P-labeled phosphoenolpyruvate indicating phosphotransfer from the S. mutans NG5 EI to the E. coli HPr. Two forms of the cloned EI, both truncated to varying degrees in the C-terminal region, were inefficiently phosphorylated and unable to complement fully the ptsI defect in the E. coli mutant. The deduced amino acid sequence of HPr shows a high degree of homology, particularly around the active site, to the same protein from other gram-positive bacteria, notably, S. salivarius, and to a lesser extent with those of gram-negative bacteria. The deduced amino acid sequence of S. mutans NG5 EI also shares several regions of homology with other sequenced EIs, notably, with the region around the active site, a region that contains the only conserved cysteinyl residue among the various proteins and which may be involved in substrate binding. Images PMID:8132321
2012-01-01
Background DNA cytosine methylation is an epigenetic modification that has been implicated in many biological processes. However, large-scale epigenomic studies have been applied to very few plant species, and variability in methylation among specialized tissues and its relationship to gene expression is poorly understood. Results We surveyed DNA methylation from seven distinct tissue types (vegetative bud, male inflorescence [catkin], female catkin, leaf, root, xylem, phloem) in the reference tree species black cottonwood (Populus trichocarpa). Using 5-methyl-cytosine DNA immunoprecipitation followed by Illumina sequencing (MeDIP-seq), we mapped a total of 129,360,151 36- or 32-mer reads to the P. trichocarpa reference genome. We validated MeDIP-seq results by bisulfite sequencing, and compared methylation and gene expression using published microarray data. Qualitative DNA methylation differences among tissues were obvious on a chromosome scale. Methylated genes had lower expression than unmethylated genes, but genes with methylation in transcribed regions ("gene body methylation") had even lower expression than genes with promoter methylation. Promoter methylation was more frequent than gene body methylation in all tissues except male catkins. Male catkins differed in demethylation of particular transposable element categories, in level of gene body methylation, and in expression range of genes with methylated transcribed regions. Tissue-specific gene expression patterns were correlated with both gene body and promoter methylation. Conclusions We found striking differences among tissues in methylation, which were apparent at the chromosomal scale and when genes and transposable elements were examined. In contrast to other studies in plants, gene body methylation had a more repressive effect on transcription than promoter methylation. PMID:22251412
Quantification of differential gene expression by multiplexed targeted resequencing of cDNA
Arts, Peer; van der Raadt, Jori; van Gestel, Sebastianus H.C.; Steehouwer, Marloes; Shendure, Jay; Hoischen, Alexander; Albers, Cornelis A.
2017-01-01
Whole-transcriptome or RNA sequencing (RNA-Seq) is a powerful and versatile tool for functional analysis of different types of RNA molecules, but sample reagent and sequencing cost can be prohibitive for hypothesis-driven studies where the aim is to quantify differential expression of a limited number of genes. Here we present an approach for quantification of differential mRNA expression by targeted resequencing of complementary DNA using single-molecule molecular inversion probes (cDNA-smMIPs) that enable highly multiplexed resequencing of cDNA target regions of ∼100 nucleotides and counting of individual molecules. We show that accurate estimates of differential expression can be obtained from molecule counts for hundreds of smMIPs per reaction and that smMIPs are also suitable for quantification of relative gene expression and allele-specific expression. Compared with low-coverage RNA-Seq and a hybridization-based targeted RNA-Seq method, cDNA-smMIPs are a cost-effective high-throughput tool for hypothesis-driven expression analysis in large numbers of genes (10 to 500) and samples (hundreds to thousands). PMID:28474677
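The counting idea described in this abstract, where reads that carry the same molecular tag are collapsed into a single molecule before expression is compared, can be sketched in a few lines of Python. This is a toy illustration and not the authors' pipeline: the read tuples, gene names, tag sequences, and the pseudocount are all hypothetical, and no library-size normalization is applied.

```python
import math
from collections import defaultdict


def count_molecules(reads):
    """Collapse (gene, molecular_tag) reads so each tagged molecule counts once."""
    tags_per_gene = defaultdict(set)
    for gene, tag in reads:
        tags_per_gene[gene].add(tag)
    return {gene: len(tags) for gene, tags in tags_per_gene.items()}


def log2_fold_change(counts_a, counts_b, gene, pseudocount=1.0):
    """log2 ratio of molecule counts (b over a); the pseudocount avoids log(0)."""
    a = counts_a.get(gene, 0) + pseudocount
    b = counts_b.get(gene, 0) + pseudocount
    return math.log2(b / a)


# Hypothetical reads: repeated tags stand for amplification copies of one molecule.
control = [("geneX", "AAC"), ("geneX", "AAC"), ("geneX", "GTT"), ("geneY", "CCA")]
treated = [("geneX", "AAC"), ("geneX", "GTT"), ("geneX", "TGA"),
           ("geneX", "CAG"), ("geneX", "GGA"), ("geneY", "CCA")]

control_counts = count_molecules(control)
treated_counts = count_molecules(treated)
lfc = log2_fold_change(control_counts, treated_counts, "geneX")
print(control_counts, treated_counts, lfc)
```

Counting distinct tags rather than raw reads is what keeps amplification duplicates from inflating the estimate; a real analysis would also have to handle sequencing errors in the tags and differences in sequencing depth between samples.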
Gene expression patterns are correlated with genomic and genic structure in soybean
USDA-ARS's Scientific Manuscript database
Studies have indicated that exon and intron size, and intergenic distance are correlated with gene expression levels and expression breadth. Previous studies on these correlations in plants and animals have been conflicting. In this study next-generation sequence data of the soybean transcriptome wa...
van der Ley, P
1988-11-01
Gonococci express a family of related outer membrane proteins designated protein II (P.II). These surface proteins are subject to both phase variation and antigenic variation. The P.II gene repertoire of Neisseria gonorrhoeae strain JS3 was found to consist of at least ten genes, eight of which were cloned. Sequence analysis and DNA hybridization studies revealed that one particular P.II-encoding sequence is present in three distinct, but almost identical, copies in the JS3 genome. These genes encode the P.II protein that was previously identified as P.IIc. Comparison of their sequences shows that the multiple copies of this P.IIc-encoding gene might have been generated by both gene conversion and gene duplication.
Comparative antennal transcriptome of Apis cerana cerana from four developmental stages.
Zhao, Huiting; Peng, Zhu; Du, Yali; Xu, Kai; Guo, Lina; Yang, Shuang; Ma, Weihua; Jiang, Yusuo
2018-06-20
Apis cerana cerana, an important endemic honey bee species in China, possesses valuable characteristics such as a sensitive olfactory system, good foraging ability, and strong resistance to parasitic mites. Here, we performed transcriptome sequencing of the antenna, the major chemosensory organ of the bee, using an Illumina sequencer, to identify typical differentially expressed genes (DEGs) in adult worker bees of different ages, namely, T1 (1 day); T2 (10 days); T3 (15 days); and T4 (25 days). Surprisingly, the expression levels of DEGs changed significantly between the T1 period and the other three periods. All the DEGs were classified into 26 expression profiles by trend analysis. Selected trend clusters were analyzed, and valuable information on gene expression patterns was obtained. We found that the expression levels of genes encoding cuticle proteins declined after eclosion, while those of immunity-related genes increased. In addition, genes encoding venom proteins and major royal jelly proteins were enriched at the T2 stage; small heat shock proteins showed significantly higher expression at the T3 stage; and some metabolism-related genes were more highly expressed at the T4 stage. The DEGs identified in this study may serve as a valuable resource for the characterization of expression patterns of antennal genes in A. cerana cerana. Furthermore, this study provides insights into the relationship between labor division in social bees and gene function. Copyright © 2018. Published by Elsevier B.V.
Woods, J P; Heinecke, E L; Goldman, W E
1998-04-01
We developed an efficient electrotransformation system for the pathogenic fungus Histoplasma capsulatum and used it to examine the effects of features of the transforming DNA on transformation efficiency and fate of the transforming DNA and to demonstrate fungal expression of two recombinant Escherichia coli genes, hph and lacZ. Linearized DNA and plasmids containing Histoplasma telomeric sequences showed the greatest transformation efficiencies, while the plasmid vector had no significant effect, nor did the derivation of the selectable URA5 marker (native Histoplasma gene or a heterologous Podospora anserina gene). Electrotransformation resulted in more frequent multimerization, other modification, or possibly chromosomal integration of transforming telomeric plasmids when saturating amounts of DNA were used, but this effect was not observed with smaller amounts of transforming DNA. We developed another selection system using a hygromycin B resistance marker from plasmid pAN7-1, consisting of the E. coli hph gene flanked by Aspergillus nidulans promoter and terminator sequences. Much of the heterologous fungal sequences could be removed without compromising function in H. capsulatum, allowing construction of a substantially smaller effective marker fragment. Transformation efficiency increased when nonselective conditions were maintained for a time after electrotransformation before selection with the protein synthesis inhibitor hygromycin B was imposed. Finally, we constructed a readily detectable and quantifiable reporter gene by fusing Histoplasma URA5 with E. coli lacZ, resulting in expression of functional beta-galactosidase in H. capsulatum. Demonstration of expression of bacterial genes as effective selectable markers and reporters, together with a highly efficient electrotransformation system, provide valuable approaches for molecular genetic analysis and manipulation of H. capsulatum, which have proven useful for examination of targeted gene disruption, regulated gene expression, and potential virulence determinants in this fungus.
2010-01-01
Background Cytochrome P450 monooxygenases (P450s) catalyze oxidation of various substrates using oxygen and NAD(P)H. Plant P450s are involved in the biosynthesis of primary and secondary metabolites performing diverse biological functions. The recent availability of the soybean genome sequence allows us to identify and analyze soybean putative P450s at a genome scale. Co-expression analysis using an available soybean microarray and Illumina sequencing data provides clues for functional annotation of these enzymes. This approach is based on the assumption that genes that have similar expression patterns across a set of conditions may have a functional relationship. Results We have identified a total number of 332 full-length P450 genes and 378 pseudogenes from the soybean genome. From the full-length sequences, 195 genes belong to A-type, which could be further divided into 20 families. The remaining 137 genes belong to non-A type P450s and are classified into 28 families. A total of 178 probe sets were found to correspond to P450 genes on the Affymetrix soybean array. Out of these probe sets, 108 represented single genes. Using the 28 publicly available microarray libraries that contain organ-specific information, some tissue-specific P450s were identified. Similarly, stress responsive soybean P450s were retrieved from 99 microarray soybean libraries. We also utilized Illumina transcriptome sequencing technology to analyze the expressions of all 332 soybean P450 genes. This dataset contains total RNAs isolated from nodules, roots, root tips, leaves, flowers, green pods, apical meristem, mock-inoculated and Bradyrhizobium japonicum-infected root hair cells. The tissue-specific expression patterns of these P450 genes were analyzed and the expression of a representative set of genes were confirmed by qRT-PCR. We performed the co-expression analysis on many of the 108 P450 genes on the Affymetrix arrays. 
First we confirmed that CYP93C5 (an isoflavone synthase gene) is co-expressed with several genes encoding isoflavonoid-related metabolic enzymes. We then focused on nodulation-induced P450s and found that CYP728H1 was co-expressed with the genes involved in phenylpropanoid metabolism. Similarly, CYP736A34 was highly co-expressed with lipoxygenase, lectin and CYP83D1, all of which are involved in root and nodule development. Conclusions The genome scale analysis of P450s in soybean reveals many unique features of these important enzymes in this crop although the functions of most of them are largely unknown. Gene co-expression analysis proves to be a useful tool to infer the function of uncharacterized genes. Our work presented here could provide important leads toward functional genomics studies of soybean P450s and their regulatory network through the integration of reverse genetics, biochemistry, and metabolic profiling tools. The identification of nodule-specific P450s and their further exploitation may help us to better understand the intriguing process of soybean and rhizobium interaction. PMID:21062474
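The co-expression assumption stated in this abstract, that genes with similar expression patterns across conditions may be functionally related, is commonly operationalized with a correlation measure. The Python sketch below uses Pearson correlation as one plausible choice; the abstract does not say which measure or threshold the authors used, and the gene names, expression values, and the 0.9 cutoff are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_x * var_y)


def co_expressed(query, candidates, threshold=0.9):
    """Return names of candidate genes whose profile correlates with the query."""
    return [name for name, profile in candidates.items()
            if pearson(query, profile) >= threshold]


# Hypothetical expression values across five conditions (not data from the study).
query_profile = [1.0, 2.0, 3.0, 4.0, 5.0]
candidate_profiles = {
    "gene_a": [2.0, 4.0, 6.0, 8.0, 10.0],  # rises with the query
    "gene_b": [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as the query rises
    "gene_c": [3.0, 1.0, 4.0, 1.0, 5.0],   # weakly related
}

partners = co_expressed(query_profile, candidate_profiles)
print(partners)
```

A high correlation only suggests a possible functional link; as the abstract itself notes, co-expression provides clues for annotation rather than proof of function.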
The promises and pitfalls of RNA-interference-based therapeutics
Castanotto, Daniela; Rossi, John J.
2009-01-01
The discovery that gene expression can be controlled by the Watson–Crick base-pairing of small RNAs with messenger RNAs containing complementary sequence — a process known as RNA interference — has markedly advanced our understanding of eukaryotic gene regulation and function. The ability of short RNA sequences to modulate gene expression has provided a powerful tool with which to study gene function and is set to revolutionize the treatment of disease. Remarkably, despite being just one decade from its discovery, the phenomenon is already being used therapeutically in human clinical trials, and biotechnology companies that focus on RNA-interference-based therapeutics are already publicly traded. PMID:19158789
Distal regulatory regions restrict the expression of cis-linked genes to the tapetal cells.
Franco, Luciana O; de O Manes, Carmem Lara; Hamdi, Said; Sachetto-Martins, Gilberto; de Oliveira, Dulce E
2002-04-24
The oleosin glycine-rich protein genes Atgrp-6, Atgrp-7, and Atgrp-8 occur in clusters in the Arabidopsis genome and are expressed specifically in the tapetum cells. The cis-regulatory regions involved in the tissue-specific gene expression were investigated by fusing different segments of the gene cluster to the uidA reporter gene. Common distal regulatory regions were identified that coordinate expression of the sequential genes. At least two of these genes were regulated spatially by proximal and distal sequences. The cis-acting elements (122 bp upstream of the transcriptional start point) drive the uidA expression to floral tissues, whereas distal 5' upstream regions restrict the gene activity to tapetal cells.
Evolution of the bovine lysozyme gene family: changes in gene expression and reversion of function.
Irwin, D M
1995-09-01
Recruitment of lysozyme to a digestive function in ruminant artiodactyls is associated with amplification of the gene. At least four of the approximately ten genes are expressed in the stomach, and several are expressed in nonstomach tissues. Characterization of additional lysozymelike sequences in the bovine genome has identified most, if not all, of the members of this gene family. There are at least six stomachlike lysozyme genes, two of which are pseudogenes. The stomach lysozyme pseudogenes show a pattern of concerted evolution similar to that of the functional stomach genes. At least four nonstomach lysozyme genes exist. The nonstomach lysozyme genes are not monophyletic. A gene encoding a tracheal lysozyme was isolated, and the stomach lysozyme of advanced ruminants was found to be more closely related to the tracheal lysozyme than to the stomach lysozyme of the camel or other nonstomach lysozyme genes of ruminants. The tracheal lysozyme shares with stomach lysozymes of advanced ruminants the deletion of amino acid 103, and several other adaptive sequence characteristics of stomach lysozymes. I suggest here that tracheal lysozyme has reverted from a functional stomach lysozyme. Tracheal lysozyme then represents a second instance of a change in lysozyme gene expression and function within ruminants.
Chen, Min; Tan, Qiuping; Sun, Mingyue; Li, Dongmei; Fu, Xiling; Chen, Xiude; Xiao, Wei; Li, Ling; Gao, Dongsheng
2016-06-01
Bud dormancy in deciduous fruit trees is an important adaptive mechanism for their survival in cold climates. The WRKY genes participate in several developmental and physiological processes, including dormancy. However, the dormancy mechanisms of WRKY genes have not been studied in detail. We conducted a genome-wide analysis and identified 58 WRKY genes in peach. These putative genes were located on all eight chromosomes. In bioinformatics analyses, we compared the sequences of WRKY genes from peach, rice, and Arabidopsis. In a cluster analysis, the gene sequences formed three groups, of which group II was further divided into five subgroups. Gene structure was highly conserved within each group, especially in groups IId and III. Gene expression analyses by qRT-PCR showed that WRKY genes showed different expression patterns in peach buds during dormancy. The mean expression levels of six WRKY genes (Prupe.6G286000, Prupe.1G393000, Prupe.1G114800, Prupe.1G071400, Prupe.2G185100, and Prupe.2G307400) increased during endodormancy and decreased during ecodormancy, indicating that these six WRKY genes may play a role in dormancy in a perennial fruit tree. This information will be useful for selecting fruit trees with desirable dormancy characteristics or for manipulating dormancy in genetic engineering programs.
cDNA sequence and expression of a cold-responsive gene in Citrus unshiu.
Hara, M; Wakasugi, Y; Ikoma, Y; Yano, M; Ogawa, K; Kuboi, T
1999-02-01
A cDNA clone encoding a dehydrin-family protein (CuCOR19), whose sequence is similar to that of Poncirus COR19, was isolated from the epicarp of Citrus unshiu. The molecular mass of the predicted protein was 18,980 daltons. CuCOR19 was highly hydrophilic and contained three repeating elements including Lys-rich motifs. Expression of the gene in leaves was increased by cold stress.
Yang, Shihui; Vera, Jessica M; Grass, Jeff; Savvakis, Giannis; Moskvin, Oleg V; Yang, Yongfu; McIlwain, Sean J; Lyu, Yucai; Zinonos, Irene; Hebert, Alexander S; Coon, Joshua J; Bates, Donna M; Sato, Trey K; Brown, Steven D; Himmel, Michael E; Zhang, Min; Landick, Robert; Pappas, Katherine M; Zhang, Yaoping
2018-01-01
Zymomonas mobilis is a natural ethanologen being developed and deployed as an industrial biofuel producer. To date, eight Z. mobilis strains have been completely sequenced and found to contain 2-8 native plasmids. However, systematic verification of predicted Z. mobilis plasmid genes and their contribution to cell fitness has not hitherto been addressed. Moreover, the precise number and identities of plasmids in Z. mobilis model strain ZM4 have been unclear. The lack of functional information about plasmid genes in ZM4 impedes ongoing studies for this model biofuel-producing strain. In this study, we determined the complete chromosome and plasmid sequences of ZM4 and its engineered xylose-utilizing derivatives 2032 and 8b. Compared to previously published and revised ZM4 chromosome sequences, the ZM4 chromosome sequence reported here contains 65 nucleotide sequence variations as well as a 2400-bp insertion. Four plasmids were identified in all three strains, with 150 plasmid genes predicted in strains ZM4 and 2032, and 153 plasmid genes predicted in strain 8b due to the insertion of heterologous DNA for expanded substrate utilization. Plasmid genes were then annotated using Blast2GO, InterProScan, and systems biology data analyses, and most genes were found to have apparent orthologs in other organisms or identifiable conserved domains. To verify plasmid gene prediction, RNA-Seq was used to map transcripts and also compare relative gene expression under various growth conditions, including anaerobic and aerobic conditions, or growth in different concentrations of biomass hydrolysates. Overall, plasmid genes were more responsive to varying hydrolysate concentrations than to oxygen availability. Additionally, our results indicated that although all plasmids were present in low copy number (about 1-2 per cell), the copy number of some plasmids varied under specific growth conditions or due to heterologous gene insertion.
The complete genome of ZM4 and two xylose-utilizing derivatives is reported in this study, with an emphasis on identifying and characterizing plasmid genes. Plasmid gene annotation, validation, expression levels at growth conditions of interest, and contribution to host fitness are reported for the first time.
Bovine mammary gene expression profiling during the onset of lactation.
Gao, Yuanyuan; Lin, Xueyan; Shi, Kerong; Yan, Zhengui; Wang, Zhonghua
2013-01-01
Lactogenesis includes two stages. Stage I begins a few weeks before parturition. Stage II is initiated around the time of parturition and extends for several days afterwards. To better understand the molecular events underlying these changes, genome-wide gene expression profiling was conducted using digital gene expression (DGE) on bovine mammary tissue at three time points: approximately day 35 before parturition (-35 d), day 7 before parturition (-7 d) and day 3 after parturition (+3 d). Approximately 6.2 million, 5.8 million and 6.1 million 21-nt cDNA tags were sequenced in the three cDNA libraries (-35 d, -7 d and +3 d), respectively. After aligning to the reference sequences, the three cDNA libraries included 8,662, 8,363 and 8,359 genes, respectively. With a fold change cutoff criterion of ≥ 2 or ≤ -2 and a false discovery rate (FDR) of ≤ 0.001, a total of 812 genes were significantly differentially expressed at -7 d compared with -35 d (stage I). Gene ontology analysis showed that those significantly differentially expressed genes were mainly associated with cell cycle, lipid metabolism, immune response and biological adhesion. A total of 1,189 genes were significantly differentially expressed at +3 d compared with -7 d (stage II), and these genes were mainly associated with the immune response and cell cycle. Moreover, there were 1,672 genes significantly differentially expressed at +3 d compared with -35 d. Gene ontology analysis showed that the main differentially expressed genes were those associated with metabolic processes. The results suggest that the mammary gland begins to lactate not only by a gain of function but also by a broad suppression of function to effectively push most of the cell's resources towards lactation.
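The selection rule stated in the abstract (fold change ≥ 2 or ≤ -2, with FDR ≤ 0.001) can be illustrated with a minimal Python sketch. The gene names and values below are invented for illustration only and are not data from the study; the `GeneResult` type and the `is_differentially_expressed` helper are likewise hypothetical and are not the authors' pipeline.

```python
from typing import NamedTuple


class GeneResult(NamedTuple):
    gene: str           # hypothetical gene label
    fold_change: float  # signed fold change; negative means down-regulated
    fdr: float          # false discovery rate for this gene


def is_differentially_expressed(result, fc_cutoff=2.0, fdr_cutoff=0.001):
    """Apply both thresholds: |fold change| >= cutoff and FDR <= cutoff."""
    return abs(result.fold_change) >= fc_cutoff and result.fdr <= fdr_cutoff


# Invented example values, not taken from the paper.
results = [
    GeneResult("geneA", 3.5, 0.0002),   # passes both thresholds
    GeneResult("geneB", -2.0, 0.001),   # sits exactly on both cutoffs
    GeneResult("geneC", 1.8, 0.00001),  # fold change too small
    GeneResult("geneD", -4.1, 0.02),    # FDR too high
]

significant = [r.gene for r in results if is_differentially_expressed(r)]
print(significant)
```

Both cutoffs are treated as inclusive, matching the "≥" and "≤" wording of the abstract, so a gene sitting exactly on a threshold is retained.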
Regulation of gene expression in the mammalian eye and its relevance to eye disease
Scheetz, Todd E.; Kim, Kwang-Youn A.; Swiderski, Ruth E.; Philp, Alisdair R.; Braun, Terry A.; Knudtson, Kevin L.; Dorrance, Anne M.; DiBona, Gerald F.; Huang, Jian; Casavant, Thomas L.; Sheffield, Val C.; Stone, Edwin M.
2006-01-01
We used expression quantitative trait locus mapping in the laboratory rat (Rattus norvegicus) to gain a broad perspective of gene regulation in the mammalian eye and to identify genetic variation relevant to human eye disease. Of >31,000 gene probes represented on an Affymetrix expression microarray, 18,976 exhibited sufficient signal for reliable analysis and at least 2-fold variation in expression among 120 F2 rats generated from an SR/JrHsd × SHRSP intercross. Genome-wide linkage analysis with 399 genetic markers revealed significant linkage with at least one marker for 1,300 probes (α = 0.001; estimated empirical false discovery rate = 2%). Both contiguous and noncontiguous loci were found to be important in regulating mammalian eye gene expression. We investigated one locus of each type in greater detail and identified putative transcription-altering variations in both cases. We found an inserted cREL binding sequence in the 5′ flanking sequence of the Abca4 gene associated with an increased expression level of that gene, and we found a mutation of the gene encoding thyroid hormone receptor β2 associated with a decreased expression level of the gene encoding short-wavelength sensitive opsin (Opn1sw). In addition to these positional studies, we performed a pairwise analysis of gene expression to identify genes that are regulated in a coordinated manner and used this approach to validate two previously undescribed genes involved in the human disease Bardet–Biedl syndrome. These data and analytical approaches can be used to facilitate the discovery of additional genes and regulatory elements involved in human eye disease. PMID:16983098
Sirakova, T D; Markaryan, A; Kolattukudy, P E
1994-01-01
An extracellular elastinolytic metalloproteinase, purified from Aspergillus fumigatus isolated from an aspergillosis patient, and an internal peptide derived from it were subjected to N-terminal sequencing. Oligonucleotide primers based on these sequences were used to PCR amplify a segment of the metalloproteinase cDNA, which was used as a probe to isolate the cDNA and gene for this enzyme. The gene sequence matched exactly with the cDNA sequence except for the four introns that interrupted the open reading frame. According to the deduced amino acid sequence, the metalloproteinase has a signal sequence and 227 additional amino acids preceding the sequence for the mature protein of 389 amino acids with a calculated molecular mass of 42 kDa, which is close to the size of the purified mature fungal proteinase. This sequence contains segments that matched both the N terminus of the mature protein and the internal peptide. A. fumigatus metalloproteinase contains some of the conserved zinc-binding and active-site motifs characteristic of metalloproteinases but shows no overall homology with known metalloproteinases. The cDNA of the mature protein when introduced into Escherichia coli directed the expression of a protein with a size, N-terminal sequence, and immunological cross-reactivity identical to those of the native fungal enzyme. Although the enzyme in the inclusion bodies could not be renatured, expression at 30 degrees C yielded soluble enzyme that showed chromatographic behavior identical to that of the native fungal enzyme and catalyzed hydrolysis of elastin. The metalloproteinase gene described here was not found in Aspergillus flavus. PMID:7927676
Xu, Li; Ding, Zhi-Shan; Zhou, Yun-Kai; Tao, Xue-Fen
2009-06-01
The aim was to obtain the full-length cDNA sequence of the Secoisolariciresinol Dehydrogenase gene from Dysosma versipellis by RACE PCR and then to investigate the characteristics of the gene. The full-length cDNA sequence of the Secoisolariciresinol Dehydrogenase gene was obtained from Dysosma versipellis by 3'-RACE and 5'-RACE. This is the first report of the full cDNA sequence of Secoisolariciresinol Dehydrogenase in Dysosma versipellis. The acquired gene was 991 bp in full length, including a 5' untranslated region of 42 bp and a 3' untranslated region of 112 bp with a poly(A) tail. The open reading frame (ORF) encoded 278 amino acids, with a molecular weight of 29253.3 Daltons and an isoelectric point of 6.328. The GenBank accession number of the nucleotide sequence is EU573789. Semi-quantitative RT-PCR analysis revealed that the Secoisolariciresinol Dehydrogenase gene was highly expressed in the stem. Alignment of the amino acid sequence of Secoisolariciresinol Dehydrogenase indicated that there may be significant amino acid sequence differences among species. In conclusion, the full-length cDNA sequence of the Secoisolariciresinol Dehydrogenase gene from Dysosma versipellis was obtained.
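The reported lengths can be cross-checked with simple arithmetic. The short Python sketch below assumes that the ORF length includes a 3-bp stop codon and that the 112-bp 3' untranslated region already includes the poly(A) tail; neither assumption is stated in the abstract, and the variable names are illustrative only.

```python
# Figures reported in the abstract.
UTR5_BP = 42          # 5' untranslated region
UTR3_BP = 112         # 3' untranslated region (assumed to include poly(A))
AMINO_ACIDS = 278     # residues encoded by the ORF
PROTEIN_MASS_DA = 29253.3

# Assumption: the ORF is counted with its stop codon.
STOP_CODON_BP = 3

orf_bp = AMINO_ACIDS * 3 + STOP_CODON_BP
total_bp = UTR5_BP + orf_bp + UTR3_BP

# Average mass per residue, a rough plausibility check on the reported mass.
mean_residue_mass = PROTEIN_MASS_DA / AMINO_ACIDS

print(orf_bp, total_bp, round(mean_residue_mass, 1))
```

Under these assumptions the parts sum to the reported 991 bp, and the average residue mass of roughly 105 Da is in the range typical for proteins.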
Mori, Yoshifumi; Chung, Ung-Il; Tanaka, Sakae; Saito, Taku
2014-01-01
Superficial zone (SFZ) cells, which are morphologically and functionally distinct from chondrocytes in deeper zones, play important roles in the maintenance of articular cartilage. Here, we established an easy and reliable method for performance of laser microdissection (LMD) on cryosections of mature rat articular cartilage using an adhesive membrane. We further examined gene expression profiles in the SFZ and the deeper zones of articular cartilage by performing RNA sequencing (RNA-seq). We validated sample collection methods, RNA amplification and the RNA-seq data using real-time RT-PCR. The combined data provide comprehensive information regarding genes specifically expressed in the SFZ or deeper zones, as well as a useful protocol for expression analysis of microsamples of hard tissues.
Insights into rubber biosynthesis from transcriptome analysis of Hevea brasiliensis latex.
Chow, Keng-See; Wan, Kiew-Lian; Isa, Mohd Noor Mat; Bahari, Azlina; Tan, Siang-Hee; Harikrishna, K; Yeang, Hoong-Yeet
2007-01-01
Hevea brasiliensis is the most widely cultivated species for commercial production of natural rubber (cis-polyisoprene). In this study, 10,040 expressed sequence tags (ESTs) were generated from the latex of the rubber tree, which represents the cytoplasmic content of a single cell type, in order to analyse the latex transcription profile with emphasis on rubber biosynthesis-related genes. A total of 3,441 unique transcripts (UTs) were obtained after quality editing and assembly of EST sequences. Functional classification of UTs according to the Gene Ontology convention showed that 73.8% were related to genes of unknown function. Among highly expressed ESTs, a significant proportion encoded proteins related to rubber biosynthesis and stress or defence responses. Sequences encoding rubber particle membrane proteins (RPMPs) belonging to three protein families accounted for 12% of the ESTs. Characterization of these ESTs revealed nine RPMP variants (7.9-27 kDa) including the 14 kDa REF (rubber elongation factor) and 22 kDa SRPP (small rubber particle protein). The expression of multiple RPMP isoforms in latex was shown using antibodies against REF and SRPP. Both EST and quantitative reverse transcription-PCR (QRT-PCR) analyses demonstrated REF and SRPP to be the most abundant transcripts in latex. Besides rubber biosynthesis, comparative sequence analysis showed that the RPMPs are highly similar to sequences in the plant kingdom having stress-related functions. Implications of the RPMP function in cis-polyisoprene biosynthesis in the context of transcript abundance and differential gene expression are discussed.