active protein-coding genes: Topics by Science.gov

Sample records for active protein-coding genes

Activity-Dependent Human Brain Coding/Noncoding Gene Regulatory Networks

PubMed Central

Lipovich, Leonard; Dachet, Fabien; Cai, Juan; Bagla, Shruti; Balan, Karina; Jia, Hui; Loeb, Jeffrey A.

2012-01-01

While most gene transcription yields RNA transcripts that code for proteins, a sizable proportion of the genome generates RNA transcripts that do not code for proteins, but may have important regulatory functions. The brain-derived neurotrophic factor (BDNF) gene, a key regulator of neuronal activity, is overlapped by a primate-specific, antisense long noncoding RNA (lncRNA) called BDNFOS. We demonstrate reciprocal patterns of BDNF and BDNFOS transcription in highly active regions of human neocortex removed as a treatment for intractable seizures. A genome-wide analysis of activity-dependent coding and noncoding human transcription using a custom lncRNA microarray identified 1288 differentially expressed lncRNAs, of which 26 had expression profiles that matched activity-dependent coding genes and an additional 8 were adjacent to or overlapping with differentially expressed protein-coding genes. The functions of most of these protein-coding partner genes, such as ARC, include long-term potentiation, synaptic activity, and memory. The nuclear lncRNAs NEAT1, MALAT1, and RPPH1, composing an RNAse P-dependent lncRNA-maturation pathway, were also upregulated. As a means to replicate human neuronal activity, repeated depolarization of SY5Y cells resulted in sustained CREB activation and produced an inverse pattern of BDNF-BDNFOS co-expression that was not achieved with a single depolarization. RNAi-mediated knockdown of BDNFOS in human SY5Y cells increased BDNF expression, suggesting that BDNFOS directly downregulates BDNF. Temporal expression patterns of other lncRNA-messenger RNA pairs validated the effect of chronic neuronal activity on the transcriptome and implied various lncRNA regulatory mechanisms. lncRNAs, some of which are unique to primates, thus appear to have potentially important regulatory roles in activity-dependent human brain plasticity. PMID:22960213
Promoter analysis reveals globally differential regulation of human long non-coding RNA and protein-coding genes

DOE PAGES

Alam, Tanvir; Medvedeva, Yulia A.; Jia, Hui; ...

2014-10-02

Transcriptional regulation of protein-coding genes is increasingly well-understood on a global scale, yet no comparable information exists for long non-coding RNA (lncRNA) genes, which were recently recognized to be as numerous as protein-coding genes in mammalian genomes. We performed a genome-wide comparative analysis of the promoters of human lncRNA and protein-coding genes, finding global differences in specific genetic and epigenetic features relevant to transcriptional regulation. These two groups of genes are hence subject to separate transcriptional regulatory programs, including distinct transcription factor (TF) proteins that significantly favor lncRNA, rather than coding-gene, promoters. We report a specific signature of promoter-proximal transcriptional regulation of lncRNA genes, including several distinct transcription factor binding sites (TFBS). Experimental DNase I hypersensitive site profiles are consistent with active configurations of these lncRNA TFBS sets in diverse human cell types. TFBS ChIP-seq datasets confirm the binding events that we predicted using computational approaches for a subset of factors. For several TFs known to be directly regulated by lncRNAs, we find that their putative TFBSs are enriched at lncRNA promoters, suggesting that the TFs and the lncRNAs may participate in a bidirectional feedback loop regulatory network. Accordingly, cells may be able to modulate lncRNA expression levels independently of mRNA levels via distinct regulatory pathways. Our results also raise the possibility that, given the historical reliance on protein-coding gene catalogs to define the chromatin states of active promoters, a revision of these chromatin signature profiles to incorporate expressed lncRNA genes is warranted in the future.
The transcriptional activator ZNF143 is essential for normal development in zebrafish

PubMed Central

2012-01-01

Background: ZNF143 is a sequence-specific DNA-binding protein that stimulates transcription of both small RNA genes by RNA polymerase II or III, or protein-coding genes by RNA polymerase II, using separable activating domains. We describe phenotypic effects following knockdown of this protein in developing Danio rerio (zebrafish) embryos by injection of morpholino antisense oligonucleotides that target znf143 mRNA. Results: The loss of function phenotype is pleiotropic and includes a broad array of abnormalities including defects in heart, blood, ear and midbrain hindbrain boundary. Defects are rescued by coinjection of synthetic mRNA encoding full-length ZNF143 protein, but not by protein lacking the amino-terminal activation domains. Accordingly, expression of several marker genes is affected following knockdown, including GATA-binding protein 1 (gata1), cardiac myosin light chain 2 (cmlc2) and paired box gene 2a (pax2a). The zebrafish pax2a gene proximal promoter contains two binding sites for ZNF143, and reporter gene transcription driven by this promoter in transfected cells is activated by this protein. Conclusions: Normal development of zebrafish embryos requires ZNF143. Furthermore, the pax2a gene is probably one example of many protein-coding gene targets of ZNF143 during zebrafish development. PMID:22268977
The transcriptional activator ZNF143 is essential for normal development in zebrafish.

PubMed

Halbig, Kari M; Lekven, Arne C; Kunkel, Gary R

2012-01-23

ZNF143 is a sequence-specific DNA-binding protein that stimulates transcription of both small RNA genes by RNA polymerase II or III, or protein-coding genes by RNA polymerase II, using separable activating domains. We describe phenotypic effects following knockdown of this protein in developing Danio rerio (zebrafish) embryos by injection of morpholino antisense oligonucleotides that target znf143 mRNA. The loss of function phenotype is pleiotropic and includes a broad array of abnormalities including defects in heart, blood, ear and midbrain hindbrain boundary. Defects are rescued by coinjection of synthetic mRNA encoding full-length ZNF143 protein, but not by protein lacking the amino-terminal activation domains. Accordingly, expression of several marker genes is affected following knockdown, including GATA-binding protein 1 (gata1), cardiac myosin light chain 2 (cmlc2) and paired box gene 2a (pax2a). The zebrafish pax2a gene proximal promoter contains two binding sites for ZNF143, and reporter gene transcription driven by this promoter in transfected cells is activated by this protein. Normal development of zebrafish embryos requires ZNF143. Furthermore, the pax2a gene is probably one example of many protein-coding gene targets of ZNF143 during zebrafish development.
[Regulation of heat shock gene expression in response to stress].

PubMed

Garbuz, D G

2017-01-01

Heat shock (HS) genes, or stress genes, code for a number of proteins that collectively form the most ancient and universal stress defense system. The system determines the cell capability of adaptation to various adverse factors and performs a variety of auxiliary functions in normal physiological conditions. Common stress factors, such as higher temperatures, hypoxia, heavy metals, and others, suppress transcription and translation for the majority of genes, while HS genes are upregulated. Transcription of HS genes is controlled by transcription factors of the HS factor (HSF) family. Certain HSFs are activated on exposure to higher temperatures or other adverse factors to ensure stress-induced HS gene expression, while other HSFs are specifically activated at particular developmental stages. The regulation of the main mammalian stress-inducible factor HSF1 and Drosophila melanogaster HSF includes many components, such as a variety of early warning signals indicative of abnormal cell activity (e.g., increases in intracellular ceramide, cytosolic calcium ions, or partly denatured proteins); protein kinases, which phosphorylate HSFs at various Ser residues; acetyltransferases; and regulatory proteins, such as SUMO and HSBP1. Transcription factors other than HSFs are also involved in activating HS gene transcription; the set includes D. melanogaster GAF, mammalian Sp1 and NF-Y, and other factors. Transcription of several stress genes coding for molecular chaperones of the glucose-regulated protein (GRP) family is predominantly regulated by another stress-detecting system, which is known as the unfolded protein response (UPR) system and is activated in response to massive protein misfolding in the endoplasmic reticulum and mitochondrial matrix. A translational fine tuning of HS protein expression occurs via changing the phosphorylation status of several proteins involved in translation initiation. 
In addition, specific signal sequences in the 5'-UTRs of some HS protein mRNAs ensure their preferential translation in stress.
Methylation of miRNA genes and oncogenesis.

PubMed

Loginov, V I; Rykov, S V; Fridman, M V; Braga, E A

2015-02-01

Interaction between microRNA (miRNA) and messenger RNA of target genes at the posttranscriptional level provides fine-tuned dynamic regulation of cell signaling pathways. Each miRNA can be involved in regulating hundreds of protein-coding genes, and, conversely, a number of different miRNAs usually target a structural gene. Epigenetic gene inactivation associated with methylation of promoter CpG-islands is common to both protein-coding genes and miRNA genes. Here, data on functions of miRNAs in development of tumor-cell phenotype are reviewed. Genomic organization of promoter CpG-islands of the miRNA genes located in inter- and intragenic areas is discussed. The literature and our own results on frequency of CpG-island methylation in miRNA genes from tumors are summarized, and data regarding a link between such modification and changed activity of miRNA genes and, consequently, protein-coding target genes are presented. Moreover, the impact of miRNA gene methylation on key oncogenetic processes as well as affected signaling pathways is discussed.
Genes uniquely expressed in human growth plate chondrocytes uncover a distinct regulatory network.

PubMed

Li, Bing; Balasubramanian, Karthika; Krakow, Deborah; Cohn, Daniel H

2017-12-20

Chondrogenesis is the earliest stage of skeletal development and is a highly dynamic process, integrating the activities and functions of transcription factors, cell signaling molecules and extracellular matrix proteins. The molecular mechanisms underlying chondrogenesis have been extensively studied and multiple key regulators of this process have been identified. However, a genome-wide overview of the gene regulatory network in chondrogenesis has not been achieved. In this study, employing RNA sequencing, we identified 332 protein coding genes and 34 long non-coding RNA (lncRNA) genes that are highly selectively expressed in human fetal growth plate chondrocytes. Among the protein coding genes, 32 genes were associated with 62 distinct human skeletal disorders and 153 genes were associated with skeletal defects in knockout mice, confirming their essential roles in skeletal formation. These gene products formed a comprehensive physical interaction network and participated in multiple cellular processes regulating skeletal development. The data also revealed 34 transcription factors and 11,334 distal enhancers that were uniquely active in chondrocytes, functioning as transcriptional regulators for the cartilage-selective genes. Our findings revealed a complex gene regulatory network controlling skeletal development whereby transcription factors, enhancers and lncRNAs participate in chondrogenesis by transcriptional regulation of key genes. Additionally, the cartilage-selective genes represent candidate genes for unsolved human skeletal disorders.
Polymerization of non-complementary RNA: systematic symmetric nucleotide exchanges mainly involving uracil produce mitochondrial RNA transcripts coding for cryptic overlapping genes.

PubMed

Seligmann, Hervé

2013-03-01

Usual DNA→RNA transcription exchanges T→U. Assuming different systematic symmetric nucleotide exchanges during translation, some GenBank RNAs match exactly human mitochondrial sequences (exchange rules listed in decreasing transcript frequencies): C↔U, A↔U, A↔U+C↔G (two nucleotide pairs exchanged), G↔U, A↔G, C↔G, none for A↔C, A↔G+C↔U, and A↔C+G↔U. Most unusual transcripts involve exchanging uracil. Independent measures of rates of rare replicational enzymatic DNA nucleotide misinsertions predict frequencies of RNA transcripts systematically exchanging the corresponding misinserted nucleotides. Exchange transcripts self-hybridize less than other gene regions, self-hybridization increases with length, suggesting endoribonuclease-limited elongation. Blast detects stop codon depleted putative protein coding overlapping genes within exchange-transcribed mitochondrial genes. These align with existing GenBank proteins (mainly metazoan origins, prokaryotic and viral origins underrepresented). These GenBank proteins frequently interact with RNA/DNA, are membrane transporters, or are typical of mitochondrial metabolism. Nucleotide exchange transcript frequencies increase with overlapping gene densities and stop densities, indicating finely tuned counterbalancing regulation of expression of systematic symmetric nucleotide exchange-encrypted proteins. Such expression necessitates combined activities of suppressor tRNAs matching stops, and nucleotide exchange transcription. Two independent properties confirm predicted exchanged overlap coding genes: discrepancy of third codon nucleotide contents from replicational deamination gradients, and codon usage according to circular code predictions. Predictions from both properties converge, especially for frequent nucleotide exchange types. Nucleotide exchanging transcription apparently increases coding densities of protein coding genes without lengthening genomes, revealing unsuspected functional DNA coding potential. 
Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
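The symmetric exchange rules listed in this abstract (for example C↔U, or A↔U combined with C↔G) can be read as simple letter swaps on an RNA string. The following is a minimal illustrative sketch of that reading, not the author's pipeline: the function name `symmetric_exchange`, the two-letter pair encoding, and the toy sequence are all assumptions introduced here.

```python
def symmetric_exchange(rna: str, *pairs: str) -> str:
    """Swap the two nucleotides of each pair everywhere in `rna`.

    Each pair is a two-letter string, e.g. "CU" for the C<->U rule.
    Nucleotides not named in any pair are left unchanged.
    """
    mapping = {}
    for first, second in pairs:
        mapping[first] = second
        mapping[second] = first
    return rna.translate(str.maketrans(mapping))


# Toy sequence for illustration only, not real mitochondrial data.
toy = "AUGC"
print(symmetric_exchange(toy, "CU"))        # C<->U rule
print(symmetric_exchange(toy, "AU", "CG"))  # A<->U + C<->G rule
```

Because each rule swaps nucleotides in pairs, applying the same rule twice returns the original sequence, which is what distinguishes these symmetric exchanges from cyclic ones.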
The Drosophila genes CG14593 and CG30106 code for G-protein-coupled receptors specifically activated by the neuropeptides CCHamide-1 and CCHamide-2.

PubMed

Hansen, Karina K; Hauser, Frank; Williamson, Michael; Weber, Stine B; Grimmelikhuijzen, Cornelis J P

2011-01-07

Recently, a novel neuropeptide, CCHamide, was discovered in the silkworm Bombyx mori (L. Roller et al., Insect Biochem. Mol. Biol. 38 (2008) 1147-1157). We have now found that all insects with a sequenced genome have two genes, each coding for a different CCHamide, CCHamide-1 and -2. We have also cloned and deorphanized two Drosophila G-protein-coupled receptors (GPCRs) coded for by genes CG14593 and CG30106 that are selectively activated by Drosophila CCHamide-1 (EC50, 2×10^-9 M) and CCHamide-2 (EC50, 5×10^-9 M), respectively. Gene CG30106 (symbol synonym CG14484) has in a previous publication (E.C. Johnson et al., J. Biol. Chem. 278 (2003) 52172-52178) been wrongly assigned to code for an allatostatin-B receptor. This conclusion is based on our findings that the allatostatins-B do not activate the CG30106 receptor and on the recent findings from other research groups that the allatostatins-B activate an unrelated GPCR coded for by gene CG16752. Comparative genomics suggests that a duplication of the CCHamide neuropeptide signalling system occurred after the split of crustaceans and insects, about 410 million years ago, because only one CCHamide neuropeptide gene is found in the water flea Daphnia pulex (Crustacea) and the tick Ixodes scapularis (Chelicerata). Copyright © 2010 Elsevier Inc. All rights reserved.
The Evolution and Expression Pattern of Human Overlapping lncRNA and Protein-coding Gene Pairs.

PubMed

Ning, Qianqian; Li, Yixue; Wang, Zhen; Zhou, Songwen; Sun, Hong; Yu, Guangjun

2017-03-27

A long non-coding RNA overlapping with a protein-coding gene (lncRNA-coding pair) is a special type of overlapping gene pair. Protein-coding overlapping genes have been well studied and increasing attention has been paid to lncRNAs. By studying lncRNA-coding pairs in the human genome, we showed that lncRNA-coding pairs were more likely to be generated by overprinting, and that genes in lncRNA-coding pairs were retained with higher priority than non-overlapping genes. In addition, the preference for overlapping configurations preserved during evolution depended on the origin of the lncRNA-coding pairs. Further investigation showed that the promotion of splicing of embedded protein-coding partners by lncRNAs was a unilateral interaction, whereas the enhancement of gene expression by the existence of overlapping partners was bidirectional, and this effect decreased with increasing evolutionary age of the genes. Additionally, the expression of lncRNA-coding pairs showed an overall positive correlation, and the expression correlation was associated with their overlapping configurations, local genomic environment and the evolutionary age of the genes. Comparison of the expression correlation of lncRNA-coding pairs between normal and cancer samples found that lineage-specific pairs that include old protein-coding genes may play an important role in tumorigenesis. This work presents a systematic and comprehensive view of the evolution and expression pattern of human lncRNA-coding pairs.
The Yersinia pestis gcvB gene encodes two small regulatory RNA molecules

PubMed Central

McArthur, Sarah D; Pulvermacher, Sarah C; Stauffer, George V

2006-01-01

Background: In recent years it has become clear that small non-coding RNAs function as regulatory elements in bacterial virulence and bacterial stress responses. We tested for the presence of the small non-coding GcvB RNAs in Y. pestis as possible regulators of gene expression in this organism. Results: In this study, we report that the Yersinia pestis KIM6 gcvB gene encodes two small RNAs. Transcription of gcvB is activated by the GcvA protein and repressed by the GcvR protein. The gcvB-encoded RNAs are required for repression of the Y. pestis dppA gene, encoding the periplasmic-binding protein component of the dipeptide transport system, showing that the GcvB RNAs have regulatory activity. A deletion of the gcvB gene from the Y. pestis KIM6 chromosome results in a decrease in the generation time of the organism as well as a change in colony morphology. Conclusion: The results of this study indicate that the Y. pestis gcvB gene encodes two small non-coding regulatory RNAs that repress dppA expression. A gcvB deletion is pleiotropic, suggesting that the sRNAs are likely involved in controlling genes in addition to dppA. PMID:16768793
A global analysis of protein expression profiles in Sinorhizobium meliloti: discovery of new genes for nodule occupancy and stress adaptation.

PubMed

Djordjevic, Michael A; Chen, Han Cai; Natera, Siria; Van Noorden, Giel; Menzel, Christian; Taylor, Scott; Renard, Clotilde; Geiger, Otto; Weiller, Georg F

2003-06-01

A proteomic examination of Sinorhizobium meliloti strain 1021 was undertaken using a combination of 2-D gel electrophoresis, peptide mass fingerprinting, and bioinformatics. Our goal was to identify (i) putative symbiosis- or nutrient-stress-specific proteins, (ii) the biochemical pathways active under different conditions, (iii) potential new genes, and (iv) the extent of posttranslational modifications of S. meliloti proteins. In total, we identified the protein products of 810 genes (13.1% of the genome's coding capacity). The 810 genes generated 1,180 gene products, with chromosomal genes accounting for 78% of the gene products identified (18.8% of the chromosome's coding capacity). The activity of 53 metabolic pathways was inferred from bioinformatic analysis of proteins with assigned Enzyme Commission numbers. Of the remaining proteins that did not encode enzymes, ABC-type transporters composed 12.7% and regulatory proteins 3.4% of the total. Proteins with up to seven transmembrane domains were identified in membrane preparations. A total of 27 putative nodule-specific proteins and 35 nutrient-stress-specific proteins were identified and used as a basis to define genes and describe processes occurring in S. meliloti cells in nodules and under stress. Several nodule proteins from the plant host were present in the nodule bacteria preparations. We also identified seven potentially novel proteins not predicted from the DNA sequence. Post-translational modifications such as N-terminal processing could be inferred from the data. The posttranslational addition of UMP to the key regulator of nitrogen metabolism, PII, was demonstrated. This work demonstrates the utility of combining mass spectrometry with protein arraying or separation techniques to identify candidate genes involved in important biological processes and niche occupations that may be intransigent to other methods of gene expression profiling.
Systematic asymmetric nucleotide exchanges produce human mitochondrial RNAs cryptically encoding for overlapping protein coding genes.

PubMed

Seligmann, Hervé

2013-05-07

GenBank's EST database includes RNAs matching exactly human mitochondrial sequences assuming systematic asymmetric nucleotide exchange-transcription along exchange rules: A→G→C→U/T→A (12 ESTs), A→U/T→C→G→A (4 ESTs), C→G→U/T→C (3 ESTs), and A→C→G→U/T→A (1 EST), no RNAs correspond to other potential asymmetric exchange rules. Hypothetical polypeptides translated from nucleotide-exchanged human mitochondrial protein coding genes align with numerous GenBank proteins, predicted secondary structures resemble their putative GenBank homologue's. Two independent methods designed to detect overlapping genes (one based on nucleotide contents analyses in relation to replicative deamination gradients at third codon positions, and circular code analyses of codon contents based on frame redundancy), confirm nucleotide-exchange-encrypted overlapping genes. Methods converge on which genes are most probably active, and which not, and this for the various exchange rules. Mean EST lengths produced by different nucleotide exchanges are proportional to (a) extents that various bioinformatics analyses confirm the protein coding status of putative overlapping genes; (b) known kinetic chemistry parameters of the corresponding nucleotide substitutions by the human mitochondrial DNA polymerase gamma (nucleotide DNA misinsertion rates); (c) stop codon densities in predicted overlapping genes (stop codon readthrough and exchanging polymerization regulate gene expression by counterbalancing each other). Numerous rarely expressed proteins seem encoded within regular mitochondrial genes through asymmetric nucleotide exchange, avoiding lengthening genomes. Intersecting evidence between several independent approaches confirms the working hypothesis status of gene encryption by systematic nucleotide exchanges. Copyright © 2013 Elsevier Ltd. All rights reserved.
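The asymmetric exchange rules in this abstract are cyclic substitutions: under A→G→C→U/T→A, every A is read as G, every G as C, every C as U, and every U as A. The sketch below illustrates that reading on a toy RNA string. It is only an assumption-laden illustration, not the author's method: the name `cyclic_exchange`, the encoding of a rule as a string such as "AGCU", and the restriction to the RNA alphabet (U rather than U/T) are all introduced here.

```python
def cyclic_exchange(rna: str, cycle: str) -> str:
    """Replace each nucleotide in `cycle` by the next one, wrapping around.

    cycle="AGCU" encodes the rule A->G->C->U->A; cycle="CGU" encodes
    C->G->U->C and leaves A unchanged.
    """
    mapping = {
        base: cycle[(i + 1) % len(cycle)]
        for i, base in enumerate(cycle)
    }
    return rna.translate(str.maketrans(mapping))


# Toy sequence for illustration only, not real mitochondrial data.
toy = "AUGC"
print(cyclic_exchange(toy, "AGCU"))  # four-nucleotide cycle
print(cyclic_exchange(toy, "CGU"))   # three-nucleotide cycle, A untouched
```

Unlike a symmetric swap, a cyclic rule is not its own inverse: a four-nucleotide cycle has to be applied four times, and a three-nucleotide cycle three times, before the original sequence reappears.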
A subset of conserved mammalian long non-coding RNAs are fossils of ancestral protein-coding genes.

PubMed

Hezroni, Hadas; Ben-Tov Perry, Rotem; Meir, Zohar; Housman, Gali; Lubelsky, Yoav; Ulitsky, Igor

2017-08-30

Only a small portion of human long non-coding RNAs (lncRNAs) appear to be conserved outside of mammals, but the events underlying the birth of new lncRNAs in mammals remain largely unknown. One potential source is remnants of protein-coding genes that transitioned into lncRNAs. We systematically compare lncRNA and protein-coding loci across vertebrates, and estimate that up to 5% of conserved mammalian lncRNAs are derived from lost protein-coding genes. These lncRNAs have specific characteristics, such as broader expression domains, that set them apart from other lncRNAs. Fourteen lncRNAs have sequence similarity with the loci of the contemporary homologs of the lost protein-coding genes. We propose that selection acting on enhancer sequences is mostly responsible for retention of these regions. As an example of an RNA element from a protein-coding ancestor that was retained in the lncRNA, we describe in detail a short translated ORF in the JPX lncRNA that was derived from an upstream ORF in a protein-coding gene and retains some of its functionality. We estimate that ~ 55 annotated conserved human lncRNAs are derived from parts of ancestral protein-coding genes, and loss of coding potential is thus a non-negligible source of new lncRNAs. Some lncRNAs inherited regulatory elements influencing transcription and translation from their protein-coding ancestors and those elements can influence the expression breadth and functionality of these lncRNAs.
Foxo3 activity promoted by non-coding effects of circular RNA and Foxo3 pseudogene in the inhibition of tumor growth and angiogenesis.

PubMed

Yang, W; Du, W W; Li, X; Yee, A J; Yang, B B

2016-07-28

It has recently been shown that the upregulation of a pseudogene specific to a protein-coding gene could function as a sponge to bind multiple potential targeting microRNAs (miRNAs), resulting in increased gene expression. Similarly, it was recently demonstrated that circular RNAs can function as sponges for miRNAs, and could upregulate expression of mRNAs containing an identical sequence. Furthermore, some mRNAs are now known to not only translate protein, but also function to sponge miRNA binding, facilitating gene expression. Collectively, these appear to be effective mechanisms to ensure gene expression and protein activity. Here we show that expression of a member of the forkhead family of transcription factors, Foxo3, is regulated by the Foxo3 pseudogene (Foxo3P), and Foxo3 circular RNA, both of which bind to eight miRNAs. We found that the ectopic expression of the Foxo3P, Foxo3 circular RNA and Foxo3 mRNA could all suppress tumor growth and cancer cell proliferation and survival. Our results showed that at least three mechanisms are used to ensure protein translation of Foxo3, which reflects an essential role of Foxo3 and its corresponding non-coding RNAs.
Transcriptome interrogation of human myometrium identifies differentially expressed sense-antisense pairs of protein-coding and long non-coding RNA genes in spontaneous labor at term

PubMed Central

Romero, Roberto; Tarca, Adi; Chaemsaithong, Piya; Miranda, Jezid; Chaiworapongsa, Tinnakorn; Jia, Hui; Hassan, Sonia S.; Kalita, Cynthia A.; Cai, Juan; Yeo, Lami; Lipovich, Leonard

2014-01-01

Objective: The mechanisms responsible for normal and abnormal parturition are poorly understood. Myometrial activation leading to regular uterine contractions is a key component of labor. Dysfunctional labor (arrest of dilatation and/or descent) is a leading indication for cesarean delivery. Compelling evidence suggests that most of these disorders are functional in nature, and not the result of cephalopelvic disproportion. The methodology and the datasets afforded by the post-genomic era provide novel opportunities to understand and target gene functions in these disorders. In 2012, the ENCODE Consortium elucidated the extraordinary abundance and functional complexity of long non-coding RNA genes in the human genome. The purpose of the study was to identify differentially expressed long non-coding RNA genes in human myometrium in women in spontaneous labor at term. Materials and Methods: Myometrium was obtained from women undergoing cesarean deliveries who were not in labor (n=19) and women in spontaneous labor at term (n=20). RNA was extracted and profiled using an Illumina® microarray platform. The analysis of the protein coding genes from this study has been previously reported. Here, we have used computational approaches to bound the extent of long non-coding RNA representation on this platform, and to identify co-differentially expressed and correlated pairs of long non-coding RNA genes and protein-coding genes sharing the same genomic loci. Results: Upon considering more than 18,498 distinct lncRNA genes compiled nonredundantly from public experimental data sources, and interrogating 2,634 that matched Illumina microarray probes, we identified co-differential expression and correlation at two genomic loci that contain coding-lncRNA gene pairs: SOCS2-AK054607 and LMCD1-NR_024065 in women in spontaneous labor at term. This co-differential expression and correlation was validated by qRT-PCR, an independent experimental method. Intriguingly, one of the two lncRNA genes differentially expressed in term labor had a key genomic structure element, a splice site that lacked evolutionary conservation beyond primates. Conclusions: We provide for the first time evidence for coordinated differential expression and correlation of cis-encoded antisense lncRNAs and protein-coding genes with known, as well as novel roles in pregnancy in the myometrium of women in spontaneous labor at term. PMID:24168098
Identification and analysis of unitary loss of long-established protein-coding genes in Poaceae shows evidences for biased gene loss and putatively functional transcription of relics.

PubMed

Zhao, Yi; Tang, Liang; Li, Zhe; Jin, Jinpu; Luo, Jingchu; Gao, Ge

2015-04-18

Long-established protein-coding genes may lose their coding potential during evolution ("unitary gene loss"). Members of the Poaceae family are a major food source and represent an ideal model clade for plant evolution research. However, the global pattern of unitary gene loss in Poaceae genomes and the evolutionary fate of the lost genes have been little investigated and remain largely elusive. Using a locally developed pipeline, we identified 129 unitary gene loss events for long-established protein-coding genes from four representative species of Poaceae, i.e. brachypodium, rice, sorghum and maize. Functional annotation suggested that the genes lost in all or most Poaceae species are enriched for genes involved in development and response to endogenous stimulus. We also found that 44 mutated genomic loci of lost genes, which we refer to as relics, were still actively transcribed, and that 84% of them (37 of 44) showed significantly differential expression across tissues. More interestingly, we found that five expressed relics may function as competitive endogenous RNAs in the brachypodium, rice and sorghum genomes. Based on comparative genomics and transcriptome data, we compiled, for the first time, a comprehensive catalogue of unitary gene loss events in Poaceae species, characterized a statistically significant functional preference for these lost genes, and showed the potential of relics to function as competitive endogenous RNAs in Poaceae genomes.
Transcription of a protein-coding gene on B chromosomes of the Siberian roe deer (Capreolus pygargus)

PubMed Central

2013-01-01

Background Most eukaryotic species have stable karyotypes with a particular diploid number. B chromosomes are additional to standard karyotypes and may vary in size, number and morphology even between cells of the same individual. For many years it was generally believed that B chromosomes found in some plant, animal and fungal species lacked active genes. Recently, molecular cytogenetic studies showed the presence of additional copies of protein-coding genes on B chromosomes. However, the transcriptional activity of these genes remained elusive. We studied karyotypes of the Siberian roe deer (Capreolus pygargus) that possess up to 14 B chromosomes to investigate the presence and expression of genes on supernumerary chromosomes. Results Here, we describe a 2 Mbp region homologous to cattle chromosome 3 and containing TNNI3K (partial), FPGT, LRRIQ3 and a large gene-sparse segment on B chromosomes of the Siberian roe deer. The presence of the copy of the autosomal region was demonstrated by B-specific cDNA analysis, PCR-assisted mapping, cattle bacterial artificial chromosome (BAC) clone localization and quantitative polymerase chain reaction (qPCR). By comparative analysis of B-specific and non-B chromosomal sequences we discovered some B chromosome-specific mutations in protein-coding genes, which further enabled the detection of a FPGT-TNNI3K transcript expressed from duplicated genes located on B chromosomes in roe deer fibroblasts. Conclusions Discovery of a large autosomal segment in all B chromosomes of the Siberian roe deer further corroborates the view of an autosomal origin for these elements. Detection of a B-derived transcript in fibroblasts implies that the protein-coding sequences located on Bs are not fully inactivated. The origin, evolution and effect on host of B chromosomal genes seem to be similar to autosomal segmental duplications, which reinforces the view that supernumerary chromosomal elements might play an important role in genome evolution. PMID:23915065
Retrieval of Enterobacteriaceae drug targets using singular value decomposition.

PubMed

Silvério-Machado, Rita; Couto, Bráulio R G M; Dos Santos, Marcos A

2015-04-15

The identification of potential drug target proteins in bacteria is important in pharmaceutical research for the development of new antibiotics to combat bacterial agents that cause diseases. A new model has been created to predict potential antibiotic drug targets in the Enterobacteriaceae family. It combines the singular value decomposition (SVD) technique with biological filters composed of a set of protein properties associated with bacterial drug targets and similarity to protein-coding essential genes of Escherichia coli (strain K12). This model identified 99 potential drug target proteins in the studied family, which exhibit eight different functions and are protein-coding essential genes or similar to protein-coding essential genes of E. coli (strain K12), indicating that the disruption of the activities of these proteins is critical for cells. Proteins from bacteria with described drug resistance were found among the retrieved candidates. These candidates have no similarity to the human proteome, therefore exhibiting the advantage of causing no adverse effects or at least no known adverse effects on humans.
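The abstract does not describe how the SVD step is set up, so the following is only a minimal sketch of SVD-based retrieval in general, not the authors' model. It assumes each protein is a row of numeric features, projects rows and a query profile into a rank-k latent space, and ranks rows by cosine similarity. The function name, the toy matrix and the query vector are all hypothetical.

```python
import numpy as np

def rank_by_latent_similarity(features, query, k=2):
    """Rank rows of `features` by cosine similarity to `query` in a rank-k SVD space.

    Illustrative only: the feature encoding is an assumption, not taken from the paper.
    """
    features = np.asarray(features, dtype=float)
    query = np.asarray(query, dtype=float)
    # Thin SVD: features = U @ diag(S) @ Vt
    _, _, vt = np.linalg.svd(features, full_matrices=False)
    vk = vt[:k].T                      # top-k right singular vectors, shape (n_features, k)
    items_k = features @ vk            # row coordinates in the latent space
    query_k = query @ vk               # query coordinates in the same space
    norms = np.linalg.norm(items_k, axis=1) * np.linalg.norm(query_k)
    sims = (items_k @ query_k) / np.where(norms == 0, 1.0, norms)
    order = np.argsort(-sims)          # most similar first
    return order, sims

# Hypothetical toy data: four "proteins" described by four made-up features.
FEATURES = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 0.9, 0.0, 0.1],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.1, 1.0, 0.9],
]
QUERY = [1.0, 1.0, 0.0, 0.0]           # stand-in for a known essential-gene profile

order, sims = rank_by_latent_similarity(FEATURES, QUERY, k=2)
print(order, np.round(sims, 3))
```

In this toy case the first two rows rank highest because they share the query's feature pattern. Any biological filtering of the ranked candidates, as mentioned in the abstract, would be a separate step.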

Non-coding RNAs in lung cancer

PubMed Central

Ricciuti, Biagio; Mecca, Carmen; Crinò, Lucio; Baglivo, Sara; Cenci, Matteo; Metro, Giulio

2014-01-01

The discovery that protein-coding genes represent less than 2% of the human genome, and the evidence that more than 90% of it is actively transcribed, changed the classical view of the central dogma of molecular biology, which was always based on the assumption that RNA functions mainly as an intermediate bridge between DNA sequences and the protein synthesis machinery. Accumulating data indicate that non-coding RNAs are involved in different physiological processes, providing for the maintenance of cellular homeostasis. They are important regulators of gene expression, cellular differentiation, proliferation, migration, apoptosis, and stem cell maintenance. Alterations and disruptions of their expression or activity have increasingly been associated with pathological changes of cancer cells. This evidence, and the prospect of using these molecules as diagnostic markers and therapeutic targets, currently make non-coding RNAs among the most relevant molecules in cancer research. In this paper we will provide an overview of non-coding RNA function and disruption in lung cancer biology, also focusing on their potential as diagnostic, prognostic and predictive biomarkers. PMID:25593996
Purification and identification of a nuclease activity in embryo axes from French bean.

PubMed

Lambert, Rocío; Quiles, Francisco Antonio; Cabello-Díaz, Juan Miguel; Piedras, Pedro

2014-07-01

Plant nucleases are involved in nucleic acid degradation associated with programmed cell death processes as well as in DNA restriction, repair and recombination processes. However, knowledge about the function of plant nucleases is limited. A major nuclease activity was detected by in-gel assay with whole embryonic axes of common bean by using ssDNA or RNA as substrate, whereas this activity was minimal in cotyledons. The enzyme has been purified to electrophoretic homogeneity from embryonic axes. The main biochemical properties of the purified enzyme indicate that it belongs to the S1/P1 family of nucleases. This was corroborated when the protein, after SDS-electrophoresis, was excised from the gel and further analysis by MALDI TOF/TOF allowed identification of the gene (PVN1) that codes for it. The expression of the PVN1 gene was induced at the specific moment of radicle protrusion. The inclusion of inorganic phosphate in the imbibition medium reduced the level of expression of this gene and the nuclease activity, suggesting a relationship with the phosphorus status in French bean seedlings.
The artificial zinc finger coding gene 'Jazz' binds the utrophin promoter and activates transcription.

PubMed

Corbi, N; Libri, V; Fanciulli, M; Tinsley, J M; Davies, K E; Passananti, C

2000-06-01

Up-regulation of utrophin gene expression is recognized as a plausible therapeutic approach in the treatment of Duchenne muscular dystrophy (DMD). We have designed and engineered new zinc finger-based transcription factors capable of binding and activating transcription from the promoter of the dystrophin-related gene, utrophin. Using the recognition 'code' that proposes specific rules between zinc finger primary structure and potential DNA binding sites, we engineered a new gene named 'Jazz' that encodes a three-zinc-finger peptide. Jazz belongs to the Cys2-His2 zinc finger type and was engineered to target the nine base pair DNA sequence 5'-GCT-GCT-GCG-3', present in the promoter region of both the human and mouse utrophin gene. The entire zinc finger alpha-helix region, containing the amino acid positions that are crucial for DNA binding, was specifically chosen on the basis of the contacts most frequently represented in the available list of the 'code'. Here we demonstrate that Jazz protein binds specifically to the double-stranded DNA target, with a dissociation constant of about 32 nM. Band shift and super-shift experiments confirmed the high affinity and specificity of Jazz protein for its DNA target. Moreover, we show that chimeric proteins, named Gal4-Jazz and Sp1-Jazz, are able to drive the transcription of a test gene from the human utrophin promoter.
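The two quantitative details in this abstract, the nine base pair target and the dissociation constant of about 32 nM, can be illustrated with a short sketch. It scans a sequence for the target on both strands and estimates site occupancy with a simple one-site binding isotherm. The isotherm is an assumption made here for illustration (the abstract gives only the Kd), the example sequences are invented rather than real utrophin promoter sequence, and the function names are hypothetical.

```python
TARGET = "GCTGCTGCG"  # 5'-GCT-GCT-GCG-3' from the abstract, hyphens removed

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))

def find_sites(seq, target=TARGET):
    """Return sorted (0-based start, strand) pairs where the target occurs."""
    seq = seq.upper()
    hits = []
    for strand, motif in (("+", target), ("-", reverse_complement(target))):
        start = seq.find(motif)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(motif, start + 1)  # allow overlapping matches
    return sorted(hits)

def fraction_bound(free_protein_nM, kd_nM=32.0):
    """Occupancy of a single site under an assumed simple 1:1 binding model."""
    return free_protein_nM / (free_protein_nM + kd_nM)

# Invented test sequences, not real promoter sequence.
print(find_sites("AAGCTGCTGCGTT"))   # target on the given strand
print(find_sites("TTCGCAGCAGCAA"))   # target on the opposite strand
print(fraction_bound(32.0))          # half occupancy when free protein equals Kd
```

Under this simplified model, half of the sites are occupied when the free protein concentration equals the Kd, which is the usual reading of a dissociation constant.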
Biotin protein ligase from Corynebacterium glutamicum: role for growth and L-lysine production.

PubMed

Peters-Wendisch, P; Stansen, K C; Götker, S; Wendisch, V F

2012-03-01

Corynebacterium glutamicum is a biotin auxotrophic Gram-positive bacterium that is used for large-scale production of amino acids, especially of L-glutamate and L-lysine. It is known that biotin limitation triggers L-glutamate production and that L-lysine production can be increased by enhancing the activity of pyruvate carboxylase, one of two biotin-dependent proteins of C. glutamicum. The gene cg0814 (accession number YP_225000) has been annotated to code for putative biotin protein ligase BirA, but the protein has not yet been characterized. A discontinuous enzyme assay of biotin protein ligase activity was established using a 105aa peptide corresponding to the carboxyterminus of the biotin carboxylase/biotin carboxyl carrier protein subunit AccBC of the acetyl CoA carboxylase from C. glutamicum as acceptor substrate. Biotinylation of this biotin acceptor peptide was revealed with crude extracts of a strain overexpressing the birA gene and was shown to be ATP dependent. Thus, birA from C. glutamicum codes for a functional biotin protein ligase (EC 6.3.4.15). The gene birA from C. glutamicum was overexpressed and the transcriptome was compared with the control strain revealing no significant gene expression changes of the bio-genes. However, biotin protein ligase overproduction increased the level of the biotin-containing protein pyruvate carboxylase and entailed a significant growth advantage in glucose minimal medium. Moreover, birA overexpression resulted in a twofold higher L-lysine yield on glucose as compared with the control strain.
Biased exonization of transposed elements in duplicated genes: A lesson from the TIF-IA gene.

PubMed

Amit, Maayan; Sela, Noa; Keren, Hadas; Melamed, Ze'ev; Muler, Inna; Shomron, Noam; Izraeli, Shai; Ast, Gil

2007-11-29

Gene duplication and exonization of intronic transposed elements are two mechanisms that enhance genomic diversity. We examined whether there is less selection against exonization of transposed elements in duplicated genes than in single-copy genes. Genome-wide analysis of exonization of transposed elements revealed a higher rate of exonization within duplicated genes relative to single-copy genes. The gene for TIF-IA, an RNA polymerase I transcription initiation factor, underwent a humanoid-specific triplication; all three copies of the gene are active transcriptionally, although only one copy retains the ability to generate the TIF-IA protein. Prior to TIF-IA triplication, an Alu element was inserted into the first intron. In one of the non-protein-coding copies, this Alu is exonized. We identified a single point mutation leading to exonization in one of the gene duplicates. When this mutation was introduced into the TIF-IA coding copy, exonization was activated and the level of the protein-coding mRNA was reduced substantially. A very low level of exonization was detected in normal human cells. However, this exonization was abundant in most leukemia cell lines evaluated, although the genomic sequence is unchanged in these cancerous cells compared to normal cells. The definition of the Alu element within the TIF-IA gene as an exon is restricted to certain types of cancers; the element is not exonized in normal human cells. These results further our understanding of the delicate interplay between gene duplication and alternative splicing and of the molecular evolutionary mechanisms leading to genetic innovations. This implies the existence of purifying selection against exonization in single-copy genes, with duplicate genes free from such constraints.
Biased exonization of transposed elements in duplicated genes: A lesson from the TIF-IA gene

PubMed Central

Amit, Maayan; Sela, Noa; Keren, Hadas; Melamed, Ze'ev; Muler, Inna; Shomron, Noam; Izraeli, Shai; Ast, Gil

2007-01-01

Background Gene duplication and exonization of intronic transposed elements are two mechanisms that enhance genomic diversity. We examined whether there is less selection against exonization of transposed elements in duplicated genes than in single-copy genes. Results Genome-wide analysis of exonization of transposed elements revealed a higher rate of exonization within duplicated genes relative to single-copy genes. The gene for TIF-IA, an RNA polymerase I transcription initiation factor, underwent a humanoid-specific triplication; all three copies of the gene are active transcriptionally, although only one copy retains the ability to generate the TIF-IA protein. Prior to TIF-IA triplication, an Alu element was inserted into the first intron. In one of the non-protein-coding copies, this Alu is exonized. We identified a single point mutation leading to exonization in one of the gene duplicates. When this mutation was introduced into the TIF-IA coding copy, exonization was activated and the level of the protein-coding mRNA was reduced substantially. A very low level of exonization was detected in normal human cells. However, this exonization was abundant in most leukemia cell lines evaluated, although the genomic sequence is unchanged in these cancerous cells compared to normal cells. Conclusion The definition of the Alu element within the TIF-IA gene as an exon is restricted to certain types of cancers; the element is not exonized in normal human cells. These results further our understanding of the delicate interplay between gene duplication and alternative splicing and of the molecular evolutionary mechanisms leading to genetic innovations. This implies the existence of purifying selection against exonization in single-copy genes, with duplicate genes free from such constraints. PMID:18047649
Complete mitochondrial genome of Germain's Peacock-Pheasant Polyplectron germaini (Aves, Galliformes, Phasianidae).

PubMed

Omeire, Destiny; Abdin, Shaunte; Brooks, Daniel M; Miranda, Hector C

2015-04-01

The Germain's Peacock-Pheasant Polyplectron germaini (Aves, Galliformes, Phasianidae) is classified as Near Threatened on the IUCN Red List. The complete mitochondrial genome of P. germaini is 16,699 bp, consisting of 13 protein-coding genes, 2 rRNA, 22 tRNA genes and 1 control region. All of the 13 protein-coding genes have ATG as start codon. Eight of the 13 protein-coding genes have TAA as stop codon.
Nmf9 Encodes a Highly Conserved Protein Important to Neurological Function in Mice and Flies.

PubMed

Zhang, Shuxiao; Ross, Kevin D; Seidner, Glen A; Gorman, Michael R; Poon, Tiffany H; Wang, Xiaobo; Keithley, Elizabeth M; Lee, Patricia N; Martindale, Mark Q; Joiner, William J; Hamilton, Bruce A

2015-07-01

Many protein-coding genes identified by genome sequencing remain without functional annotation or biological context. Here we define a novel protein-coding gene, Nmf9, based on a forward genetic screen for neurological function. ENU-induced and genome-edited null mutations in mice produce deficits in vestibular function, fear learning and circadian behavior, which correlated with Nmf9 expression in inner ear, amygdala, and suprachiasmatic nuclei. Homologous genes from unicellular organisms and invertebrate animals predict interactions with small GTPases, but the corresponding domains are absent in mammalian Nmf9. Intriguingly, homozygotes for null mutations in the Drosophila homolog, CG45058, show profound locomotor defects and premature death, while heterozygotes show striking effects on sleep and activity phenotypes. These results link a novel gene orthology group to discrete neurological functions, and show conserved requirement across wide phylogenetic distance and domain level structural changes.
Identification of a G protein coupled receptor induced in activated T cells.

PubMed

Kaplan, M H; Smith, D I; Sundick, R S

1993-07-15

Many genes are induced after T cell activation to make a cell competent for proliferation and ultimately, function. Many of these genes encode surface receptors for growth factors that signal a cell to proliferate. We have cloned a novel gene (clone 6H1) that codes for a member of the G protein-coupled receptor superfamily. This gene was isolated from a chicken activated T cell cDNA library by low level hybridization to mammalian IL-2 cDNA probes. The 308 amino acid open reading frame has seven hydrophobic, presumably transmembrane domains and a consensus site for interaction with G proteins. Tissue distribution studies suggest that gene expression is restricted to activated T cells. The message appears by 1 h after activation and is maintained for at least 45 h. Transcription of 6H1 is induced by a number of T cell stimuli and is inhibited by cyclosporin A, but not by cycloheximide. This is the first description of a member of this superfamily expressed specifically in activated T cells. The gene product may provide a link between T cell growth factors and G protein activation.
AP1 Keeps Chromatin Poised for Action | Center for Cancer Research

Cancer.gov

The human genome harbors gene-encoding DNA, the blueprint for building proteins that regulate cellular function. Embedded across the genome, in non-coding regions, are DNA elements to which regulatory factors bind. The interaction of regulatory factors with DNA at these sites modifies gene expression to modulate cell activity. In cells, DNA exists in a complex with proteins
The high-level expression of human tissue plasminogen activator in the milk of transgenic mice with hybrid gene locus strategy.

PubMed

Zhou, Yanrong; Lin, Yanli; Wu, Xiaojie; Xiong, Fuyin; Lv, Yuemeng; Zheng, Tao; Huang, Peitang; Chen, Hongxing

2012-02-01

Transgene expression for the mammary gland bioreactor aimed at producing recombinant proteins requires optimized expression vector construction. Previously we presented a hybrid gene locus strategy, which was originally tested with human lactoferrin (hLF) as the target transgene and achieved an extremely high expression level of rhLF, 29.8 g/l, in mouse milk. Here, to demonstrate the broad application of this strategy, another 38.4-kb mWAP-htPA hybrid gene locus was constructed, in which the 3-kb genomic coding sequence in the 24-kb mouse whey acidic protein (mWAP) gene locus was substituted by the 17.4-kb genomic coding sequence of human tissue plasminogen activator (htPA), exactly from the start codon to the stop codon. Five corresponding transgenic mouse lines were generated, and the highest expression level of rhtPA in the milk reached 3.3 g/l. Our strategy will provide a universal way for the large-scale production of pharmaceutical proteins in the mammary gland of transgenic animals.
Nucleic acids encoding plant glutamine phenylpyruvate transaminase (GPT) and uses thereof

DOEpatents

Unkefer, Pat J.; Anderson, Penelope S.; Knight, Thomas J.

2016-03-29

Glutamine phenylpyruvate transaminase (GPT) proteins, nucleic acid molecules encoding GPT proteins, and uses thereof are disclosed. Provided herein are various GPT proteins and GPT gene coding sequences isolated from a number of plant species. As disclosed herein, GPT proteins share remarkable structural similarity within plant species, and are active in catalyzing the synthesis of 2-hydroxy-5-oxoproline (2-oxoglutaramate), a powerful signal metabolite which regulates the function of a large number of genes involved in the photosynthesis apparatus, carbon fixation and nitrogen metabolism.
Long Non-Coding RNAs Differentially Expressed between Normal versus Primary Breast Tumor Tissues Disclose Converse Changes to Breast Cancer-Related Protein-Coding Genes

PubMed Central

Reiche, Kristin; Kasack, Katharina; Schreiber, Stephan; Lüders, Torben; Due, Eldri U.; Naume, Bjørn; Riis, Margit; Kristensen, Vessela N.; Horn, Friedemann; Børresen-Dale, Anne-Lise; Hackermüller, Jörg; Baumbusch, Lars O.

2014-01-01

Breast cancer, the second leading cause of cancer death in women, is a highly heterogeneous disease, characterized by distinct genomic and transcriptomic profiles. Transcriptome analyses prevalently assessed protein-coding genes; however, the majority of the mammalian genome is expressed in numerous non-coding transcripts. Emerging evidence supports that many of these non-coding RNAs are specifically expressed during development, tumorigenesis, and metastasis. The focus of this study was to investigate the expression features and molecular characteristics of long non-coding RNAs (lncRNAs) in breast cancer. We investigated 26 breast tumor and 5 normal tissue samples utilizing a custom expression microarray enclosing probes for mRNAs as well as novel and previously identified lncRNAs. We identified more than 19,000 unique regions significantly differentially expressed between normal and breast tumor tissue; half of these regions were non-coding without any evidence for functional open reading frames or sequence similarity to known proteins. The identified non-coding regions were primarily located in introns (53%) or in the intergenic space (33%), frequently orientated in antisense direction to protein-coding genes (14%), and commonly distributed at promoter, transcription factor binding, or enhancer sites. Analyzing the most diverse mRNA breast cancer subtypes, Basal-like versus Luminal A and B, resulted in 3,025 significantly differentially expressed unique loci, including 682 (23%) for non-coding transcripts. A notable number of differentially expressed protein-coding genes displayed non-synonymous expression changes compared to their nearest differentially expressed lncRNA, including an antisense lncRNA strongly anticorrelated to the mRNA coding for histone deacetylase 3 (HDAC3), which was investigated in more detail. Previously identified chromatin-associated lncRNAs (CARs) were predominantly downregulated in breast tumor samples, including CARs located in the protein-coding genes for CALD1, FTX, and HNRNPH1. In conclusion, a number of differentially expressed lncRNAs have been identified with relation to cancer-related protein-coding genes. PMID:25264628
Biallelic insertion of a transcriptional terminator via the CRISPR/Cas9 system efficiently silences expression of protein-coding and non-coding RNA genes.

PubMed

Liu, Yangyang; Han, Xiao; Yuan, Junting; Geng, Tuoyu; Chen, Shihao; Hu, Xuming; Cui, Isabelle H; Cui, Hengmi

2017-04-07

The type II bacterial CRISPR/Cas9 system is a simple, convenient, and powerful tool for targeted gene editing. Here, we describe a CRISPR/Cas9-based approach for inserting a poly(A) transcriptional terminator into both alleles of a targeted gene to silence protein-coding and non-protein-coding genes, which often play key roles in gene regulation but are difficult to silence via insertion or deletion of short DNA fragments. The integration of 225 bp of bovine growth hormone poly(A) signals into either the first intron or the first exon or behind the promoter of target genes caused efficient termination of expression of PPP1R12C, NSUN2 (protein-coding genes), and MALAT1 (non-protein-coding gene). Both NeoR and PuroR were used as markers in the selection of clonal cell lines with biallelic integration of a poly(A) signal. Genotyping analysis indicated that the cell lines displayed the desired biallelic silencing after a brief selection period. These combined results indicate that this CRISPR/Cas9-based approach offers an easy, convenient, and efficient novel technique for gene silencing in cell lines, especially for those in which gene integration is difficult because of a low efficiency of homology-directed repair.
The first draft genome of the aquatic model plant Lemna minor opens the route for future stress physiology research and biotechnological applications.

PubMed

Van Hoeck, Arne; Horemans, Nele; Monsieurs, Pieter; Cao, Hieu Xuan; Vandenhove, Hildegarde; Blust, Ronny

2015-01-01

Freshwater duckweed, comprising the smallest, fastest growing and simplest macrophytes, has various applications in agriculture, phytoremediation and energy production. Lemna minor, the so-called common duckweed, is a model system of these aquatic plants for ecotoxicological bioassays, genetic transformation tools and industrial applications. Given the ecotoxic relevance and high potential for biomass production, whole-genome information of this cosmopolitan duckweed is needed. The 472 Mbp assembly of the L. minor genome (2n = 40; estimated 481 Mbp; 98.1 %) contains 22,382 protein-coding genes and 61.5 % repetitive sequences. The repeat content explains 94.5 % of the genome size difference in comparison with the greater duckweed, Spirodela polyrhiza (2n = 40; 158 Mbp; 19,623 protein-coding genes; and 15.79 % repetitive sequences). Comparison with proteins from other monocot plants, using OrthoMCL for protein ortholog identification, suggests 1356 duckweed-specific groups (3367 proteins, 15.0 % of total L. minor proteins) and 795 Lemna-specific groups (2897 proteins, 12.9 % of total L. minor proteins). Interestingly, proteins involved in biosynthetic processes in response to various stimuli and hydrolase activities are enriched in the Lemna proteome in comparison with the Spirodela proteome. The genome sequence and annotation of L. minor protein-coding genes provide new insights into biological understanding and biomass production applications of Lemna species.
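Several of the percentages quoted in this abstract follow directly from the counts given alongside them. The short sketch below recomputes three of them as a consistency check. It assumes that "total L. minor proteins" equals the 22,382 annotated protein-coding genes, which the abstract does not state explicitly; the helper name is hypothetical.

```python
def percent(part, whole, digits=1):
    """Return part/whole as a percentage rounded to `digits` decimals."""
    return round(100.0 * part / whole, digits)

ASSEMBLY_MBP = 472          # assembled size, from the abstract
ESTIMATED_MBP = 481         # estimated genome size, from the abstract
TOTAL_PROTEINS = 22382      # assumption: one protein per annotated gene

print(percent(ASSEMBLY_MBP, ESTIMATED_MBP))   # share of the estimated genome assembled
print(percent(3367, TOTAL_PROTEINS))          # proteins in duckweed-specific groups
print(percent(2897, TOTAL_PROTEINS))          # proteins in Lemna-specific groups
```

With these inputs the three values reproduce the 98.1 %, 15.0 % and 12.9 % reported in the abstract.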
De Novo Origin of Human Protein-Coding Genes

PubMed Central

Wu, Dong-Dong; Irwin, David M.; Zhang, Ya-Ping

2011-01-01

The de novo origin of a new protein-coding gene from non-coding DNA is considered to be a very rare occurrence in genomes. Here we identify 60 new protein-coding genes that originated de novo on the human lineage since divergence from the chimpanzee. The functionality of these genes is supported by both transcriptional and proteomic evidence. RNA–seq data indicate that these genes have their highest expression levels in the cerebral cortex and testes, which might suggest that these genes contribute to phenotypic traits that are unique to humans, such as improved cognitive ability. Our results are inconsistent with the traditional view that the de novo origin of new genes is very rare; thus, there should be greater appreciation of the importance of the de novo origination of genes. PMID:22102831
Molecular cloning of chitinase 33 (chit33) gene from Trichoderma atroviride

PubMed Central

Matroudi, S.; Zamani, M.R.; Motallebi, M.

2008-01-01

In this study, Trichoderma atroviride was selected as an overproducer of chitinase among 30 different isolates of Trichoderma sp. on the basis of chitinase specific activity. From this isolate the genomic and cDNA clones encoding chit33 have been isolated and sequenced. Comparison of genomic and cDNA sequences to define the gene structure indicates that this gene contains three short introns and an open reading frame coding for a protein of 321 amino acids. The deduced amino acid sequence includes a 19 aa putative signal peptide. Homology between this sequence and other reported Trichoderma Chit33 proteins is discussed. The coding sequence of the chit33 gene was cloned in the pET26b(+) expression vector and expressed in E. coli. PMID:24031242
Codon usage and expression level of human mitochondrial 13 protein coding genes across six continents.

PubMed

Chakraborty, Supriyo; Uddin, Arif; Mazumder, Tarikul Huda; Choudhury, Monisha Nath; Malakar, Arup Kumar; Paul, Prosenjit; Halder, Binata; Deka, Himangshu; Mazumder, Gulshana Akthar; Barbhuiya, Riazul Ahmed; Barbhuiya, Masuk Ahmed; Devi, Warepam Jesmi

2017-12-02

The study of codon usage coupled with phylogenetic analysis is an important tool to understand the genetic and evolutionary relationship of a gene. The 13 protein-coding genes of human mitochondria are involved in the electron transport chain for the generation of energy currency (ATP). However, no work has yet been reported on the codon usage of the mitochondrial protein-coding genes across six continents. To understand the patterns of codon usage in mitochondrial genes across six different continents, we used bioinformatic analyses to analyze the protein-coding genes. The codon usage bias was low, as revealed by the high ENC value. Correlation between codon usage and GC3 suggested that all the codons ending with G/C were positively correlated with GC3, whereas the opposite held for A/T-ending codons, with the exception of the ND4L and ND5 genes. The neutrality plot revealed that for the genes ATP6, COI, COIII, CYB, ND4 and ND4L, natural selection might have played a major role, while mutation pressure might have played a dominant role in the codon usage bias of the ATP8, COII, ND1, ND2, ND3, ND5 and ND6 genes. Phylogenetic analysis indicated that evolutionary relationships in each of the 13 protein-coding genes of human mitochondria were different across six continents and further suggested that geographical distance was an important factor for the origin and evolution of the 13 protein-coding genes of human mitochondria.
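GC3, the quantity correlated with codon usage above, is the G+C content at the third position of codons. The following is a minimal sketch of one simple way to compute it, counting every complete codon in a coding sequence. The abstract does not say which GC3 variant was used (some definitions exclude Met, Trp and stop codons), so this definition is an assumption, and the example sequence is invented.

```python
def third_positions(cds):
    """Return the third base of every complete codon in a coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3        # ignore a trailing partial codon
    return [cds[i + 2] for i in range(0, usable, 3)]

def gc3(cds):
    """Fraction of codons whose third base is G or C (simple definition)."""
    thirds = third_positions(cds)
    if not thirds:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in thirds) / len(thirds)

# Invented example: codons ATG, GCC, AAA, TAA have third bases G, C, A, A.
print(gc3("ATGGCCAAATAA"))
```

A per-gene GC3 value of this kind could then be correlated with the frequency of each codon across genes, which is the type of analysis the abstract describes.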
A Dual Origin of the Xist Gene from a Protein-Coding Gene and a Set of Transposable Elements

PubMed Central

Elisaphenko, Eugeny A.; Kolesnikov, Nikolay N.; Shevchenko, Alexander I.; Rogozin, Igor B.; Nesterova, Tatyana B.; Brockdorff, Neil; Zakian, Suren M.

2008-01-01

X-chromosome inactivation, which occurs in female eutherian mammals, is controlled by a complex X-linked locus termed the X-inactivation center (XIC). Previously it was proposed that genes of the XIC evolved, at least in part, as a result of pseudogenization of protein-coding genes. In this study we show that the key XIC gene Xist, which displays fragmentary homology to a protein-coding gene Lnx3, emerged de novo in early eutherians by integration of mobile elements which gave rise to simple tandem repeats. The Xist gene promoter region and four out of ten exons found in eutherians retain homology to exons of the Lnx3 gene. The remaining six Xist exons, including those with simple tandem repeats detectable in their structure, have similarity to different transposable elements. Integration of mobile elements into Xist accompanies the overall evolution of the gene and presumably continues in contemporary eutherian species. Additionally we showed that the combination of remnants of protein-coding sequences and mobile elements is not unique to the Xist gene and is found in other XIC genes producing non-coding nuclear RNA. PMID:18575625

Long non-coding RNAs and mRNAs profiling during spleen development in pig.

PubMed

Che, Tiandong; Li, Diyan; Jin, Long; Fu, Yuhua; Liu, Yingkai; Liu, Pengliang; Wang, Yixin; Tang, Qianzi; Ma, Jideng; Wang, Xun; Jiang, Anan; Li, Xuewei; Li, Mingzhou

2018-01-01

Genome-wide transcriptomic studies in humans and mice have become extensive and mature. However, a comprehensive and systematic understanding of protein-coding genes and long non-coding RNAs (lncRNAs) expressed during pig spleen development has not been achieved. LncRNAs are known to participate in regulatory networks for an array of biological processes. Here, we constructed 18 RNA libraries from developing fetal pig spleen (55 days before birth), postnatal pig spleens (0, 30, 180 days and 2 years after birth), and the samples from the 2-year-old Wild Boar. A total of 15,040 lncRNA transcripts were identified among these samples. We found that the temporal expression pattern of lncRNAs was more restricted than observed for protein-coding genes. Time-series analysis showed two large modules for protein-coding genes and lncRNAs. The up-regulated module was enriched for genes related to immune and inflammatory function, while the down-regulated module was enriched for cell proliferation processes such as cell division and DNA replication. Co-expression networks indicated the functional relatedness between protein-coding genes and lncRNAs, which were enriched for similar functions over the series of time points examined. We identified numerous differentially expressed protein-coding genes and lncRNAs in all five developmental stages. Notably, ceruloplasmin precursor (CP), a protein-coding gene participating in antioxidant and iron transport processes, was differentially expressed in all stages. This study provides the first catalog of the developing pig spleen, and contributes to a fuller understanding of the molecular mechanisms underpinning mammalian spleen development.
Intron-exon organization of the active human protein S gene PSα and its pseudogene PSβ: Duplication and silencing during primate evolution

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ploos van Amstel, H.; Reitsma, P.H.; van der Logt, C.P.

The human protein S locus on chromosome 3 consists of two protein S genes, PSα and PSβ. Here the authors report the cloning and characterization of both genes. Fifteen exons of the PSα gene were identified that together code for protein S mRNA as derived from the reported protein S cDNAs. Analysis by primer extension of liver protein S mRNA, however, reveals the presence of two mRNA forms that differ in the length of their 5′-noncoding region. Both transcripts contain a 5′-noncoding region longer than found in the protein S cDNAs. The two products may arise from alternative splicing of an additional intron in this region or from the usage of two start sites for transcription. The intron-exon organization of the PSα gene fully supports the hypothesis that the protein S gene is the product of an evolutionary assembling process in which gene modules coding for structural/functional protein units also found in other coagulation proteins have been put upstream of the ancestral gene of a steroid hormone binding protein. The PSβ gene is identified as a pseudogene. It contains a large variety of detrimental aberrations, viz., the absence of exon I, a splice site mutation, three stop codons, and a frame shift mutation. Overall, the exonic sequences of the two genes PSα and PSβ show 96.5% homology. Southern analysis of primate DNA showed that the duplication of the ancestral protein S gene occurred after the branching of the orangutan from the African apes. A nonsense mutation that is present in the pseudogene of man also could be identified in one of the two protein S genes of both chimpanzee and gorilla. This implies that silencing of one of the two protein S genes must have taken place before the divergence of the three African apes.
CHIR99021 promotes self-renewal of mouse embryonic stem cells by modulation of protein-encoding gene and long intergenic non-coding RNA expression

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wu, Yongyan; Key Laboratory of Animal Biotechnology, Ministry of Agriculture, Northwest A and F University, Yangling 712100, Shaanxi; Ai, Zhiying

2013-10-15

Embryonic stem cells (ESCs) can proliferate indefinitely in vitro and differentiate into cells of all three germ layers. These unique properties make them exceptionally valuable for drug discovery and regenerative medicine. However, the practical application of ESCs is limited because it is difficult to derive and culture ESCs. It has been demonstrated that CHIR99021 (CHIR) promotes self-renewal and enhances the derivation efficiency of mouse (m)ESCs. However, the downstream targets of CHIR are not fully understood. In this study, we identified CHIR-regulated genes in mESCs using microarray analysis. Our microarray data demonstrated that CHIR not only influenced the Wnt/β-catenin pathway by stabilizing β-catenin, but also modulated several other pluripotency-related signaling pathways such as TGF-β, Notch and MAPK signaling pathways. More detailed analysis demonstrated that CHIR inhibited Nodal signaling, while activating bone morphogenetic protein signaling in mESCs. In addition, we found that pluripotency-maintaining transcription factors were up-regulated by CHIR, while several developmental-related genes were down-regulated. Furthermore, we found that CHIR altered the expression of epigenetic regulatory genes and long intergenic non-coding RNAs. Quantitative real-time PCR results were consistent with microarray data, suggesting that CHIR alters the expression pattern of protein-encoding genes (especially transcription factors), epigenetic regulatory genes and non-coding RNAs to establish a relatively stable pluripotency-maintaining network. - Highlights: • Combined use of CHIR with LIF promotes self-renewal of J1 mESCs. • CHIR-regulated genes are involved in multiple pathways. • CHIR inhibits Nodal signaling and promotes Bmp4 expression to activate BMP signaling. • Expression of epigenetic regulatory genes and lincRNAs is altered by CHIR.
Robust expression of a bioactive mammalian protein in Chlamydomonas chloroplast

DOEpatents

Mayfield, Stephen P.

2010-03-16

Methods and compositions are disclosed to engineer chloroplast comprising heterologous mammalian genes via a direct replacement of chloroplast Photosystem II (PSII) reaction center protein coding regions to achieve expression of recombinant protein above 5% of total protein. When algae is used, algal expressed protein is produced predominantly as a soluble protein where the functional activity of the peptide is intact. As the host algae is edible, production of biologics in this organism for oral delivery of proteins/peptides, especially gut active proteins, without purification is disclosed.
Robust expression of a bioactive mammalian protein in Chlamydomonas chloroplast

DOEpatents

Mayfield, Stephen P

2015-01-13

Methods and compositions are disclosed to engineer chloroplast comprising heterologous mammalian genes via a direct replacement of chloroplast Photosystem II (PSII) reaction center protein coding regions to achieve expression of recombinant protein above 5% of total protein. When algae is used, algal expressed protein is produced predominantly as a soluble protein where the functional activity of the peptide is intact. As the host algae is edible, production of biologics in this organism for oral delivery of proteins/peptides, especially gut active proteins, without purification is disclosed.
Prediction of plant lncRNA by ensemble machine learning classifiers.

PubMed

Simopoulos, Caitlin M A; Weretilnyk, Elizabeth A; Golding, G Brian

2018-05-02

In plants, long non-protein coding RNAs are believed to have essential roles in development and stress responses. However, relative to advances on discerning biological roles for long non-protein coding RNAs in animal systems, this RNA class in plants is largely understudied. With comparatively few validated plant long non-coding RNAs, research on this potentially critical class of RNA is hindered by a lack of appropriate prediction tools and databases. Supervised learning models trained on data sets of mostly non-validated, non-coding transcripts have been previously used to identify this enigmatic RNA class with applications largely focused on animal systems. Our approach uses a training set comprised only of empirically validated long non-protein coding RNAs from plant, animal, and viral sources to predict and rank candidate long non-protein coding gene products for future functional validation. Individual stochastic gradient boosting and random forest classifiers trained on only empirically validated long non-protein coding RNAs were constructed. In order to use the strengths of multiple classifiers, we combined multiple models into a single stacking meta-learner. This ensemble approach benefits from the diversity of several learners to effectively identify putative plant long non-coding RNAs from transcript sequence features. When the predicted genes identified by the ensemble classifier were compared to those listed in GreeNC, an established plant long non-coding RNA database, overlap for predicted genes from Arabidopsis thaliana, Oryza sativa and Eutrema salsugineum ranged from 51 to 83% with the highest agreement in Eutrema salsugineum. Most of the highest ranking predictions from Arabidopsis thaliana were annotated as potential natural antisense genes, pseudogenes, transposable elements, or simply computationally predicted hypothetical protein. 
Due to the nature of this tool, the model can be updated as new long non-protein coding transcripts are identified and functionally verified. This ensemble classifier is an accurate tool that can be used to rank long non-protein coding RNA predictions for use in conjunction with gene expression studies. Selection of plant transcripts with a high potential for regulatory roles as long non-protein coding RNAs will advance research in the elucidation of long non-protein coding RNA function.
Nucleotide sequence of the L1 ribosomal protein gene of Xenopus laevis: remarkable sequence homology among introns.

PubMed Central

Loreni, F; Ruberti, I; Bozzoni, I; Pierandrei-Amaldi, P; Amaldi, F

1985-01-01

Ribosomal protein L1 is encoded by two genes in Xenopus laevis. The comparison of two cDNA sequences shows that the two L1 gene copies (L1a and L1b) have diverged in many silent sites and very few substitution sites; moreover a small duplication occurred at the very end of the coding region of the L1b gene which thus codes for a product five amino acids longer than that coded by L1a. Quantitatively the divergence between the two L1 genes confirms that a whole genome duplication took place in Xenopus laevis approximately 30 million years ago. A genomic fragment containing one of the two L1 gene copies (L1a), with its nine introns and flanking regions, has been completely sequenced. The 5' end of this gene has been mapped within a 20-pyrimidine stretch as already found for other vertebrate ribosomal protein genes. Four of the nine introns have a 60-nucleotide sequence with 80% homology; within this region some boxes, one of which is 16 nucleotides long, are 100% homologous among the four introns. This feature of L1a gene introns is interesting since we have previously shown that the activity of this gene is regulated at a post-transcriptional level and it involves the block of the normal splicing of some intron sequences. PMID:3841512
The complete mitochondrial genome of Hydra vulgaris (Hydroida: Hydridae).

PubMed

Pan, Hong-Chun; Fang, Hong-Yan; Li, Shi-Wei; Liu, Jun-Hong; Wang, Ying; Wang, An-Tai

2014-12-01

The complete mitochondrial genome of Hydra vulgaris (Hydroida: Hydridae) is composed of two linear DNA molecules. The mitochondrial DNA (mtDNA) molecule 1 is 8010 bp long and contains six protein-coding genes, large subunit rRNA, methionine and tryptophan tRNAs, two pseudogenes consisting respectively of a partial copy of COI, and terminal sequences at two ends of the linear mtDNA, while the mtDNA molecule 2 is 7576 bp long and contains seven protein-coding genes, small subunit rRNA, methionine tRNA, a pseudogene consisting of a partial copy of COI and terminal sequences at two ends of the linear mtDNA. COI gene begins with GTG as start codon, whereas other 12 protein-coding genes start with a typical ATG initiation codon. In addition, all protein-coding genes are terminated with TAA as stop codon.
Cloning and characterization of two novel DNases from Streptococcus pyogenes.

PubMed

Hasegawa, Tadao; Torii, Keizo; Hashikawa, Shinnosuke; Iinuma, Yoshitsugu; Ohta, Michio

2002-06-01

The proteins in the culture supernatant (exoproteins) from Streptococcus pyogenes serotype M1 were separated by two-dimensional gel electrophoresis, and their N-terminal amino acid sequences were determined. The amino acid sequences were compared to sequences in the S. pyogenes genome database. The coding sequence showed similarity to sequences of two genes, mf2-v (mf2 variant) and mf3, which had sequence similarity to genes encoding mitogenic factor (MF); MF has DNase activity. The recombinant genes were expressed in Escherichia coli and the proteins were synthesized. Mf2-v and Mf3 had DNase activity. The activity of Mf2-v was localized to the C-terminal half of the protein. The mf3 gene was shown to be present in most clinically isolated strains of S. pyogenes tested, and the mf2 gene was detected in 20% of the isolates. The products of the mf2 and mf3 genes in clinically isolated S. pyogenes strains were thus shown to be DNases.
Improving the genome annotation of the acarbose producer Actinoplanes sp. SE50/110 by sequencing enriched 5'-ends of primary transcripts.

PubMed

Schwientek, Patrick; Neshat, Armin; Kalinowski, Jörn; Klein, Andreas; Rückert, Christian; Schneiker-Bekel, Susanne; Wendler, Sergej; Stoye, Jens; Pühler, Alfred

2014-11-20

Actinoplanes sp. SE50/110 is the producer of the alpha-glucosidase inhibitor acarbose, which is an economically relevant and potent drug in the treatment of type-2 diabetes mellitus. In this study, we present the detection of transcription start sites on this genome by sequencing enriched 5'-ends of primary transcripts. Altogether, 1427 putative transcription start sites were initially identified. With help of the annotated genome sequence, 661 transcription start sites were found to belong to the leader region of protein-coding genes with the surprising result that roughly 20% of these genes rank among the class of leaderless transcripts. Next, conserved promoter motifs were identified for protein-coding genes with and without leader sequences. The mapped transcription start sites were finally used to improve the annotation of the Actinoplanes sp. SE50/110 genome sequence. Concerning protein-coding genes, 41 translation start sites were corrected and 9 novel protein-coding genes could be identified. In addition to this, 122 previously undetermined non-coding RNA (ncRNA) genes of Actinoplanes sp. SE50/110 were defined. Focusing on antisense transcription start sites located within coding genes or their leader sequences, it was discovered that 96 of those ncRNA genes belong to the class of antisense RNA (asRNA) genes. The remaining 26 ncRNA genes were found outside of known protein-coding genes. Four chosen examples of prominent ncRNA genes, namely the transfer messenger RNA gene ssrA, the ribonuclease P class A RNA gene rnpB, the cobalamin riboswitch RNA gene cobRS, and the selenocysteine-specific tRNA gene selC, are presented in more detail. This study demonstrates that sequencing of enriched 5'-ends of primary transcripts and the identification of transcription start sites are valuable tools for advanced genome annotation of Actinoplanes sp. SE50/110 and most probably also for other bacteria. Copyright © 2014 Elsevier B.V. All rights reserved.
Network perturbation by recurrent regulatory variants in cancer

PubMed Central

Cho, Ara; Lee, Insuk; Choi, Jung Kyoon

2017-01-01

Cancer driving genes have been identified as recurrently affected by variants that alter protein-coding sequences. However, a majority of cancer variants arise in noncoding regions, and some of them are thought to play a critical role through transcriptional perturbation. Here we identified putative transcriptional driver genes based on combinatorial variant recurrence in cis-regulatory regions. The identified genes showed high connectivity in the cancer type-specific transcription regulatory network, with high outdegree and many downstream genes, highlighting their causative role during tumorigenesis. In the protein interactome, the identified transcriptional drivers were not as highly connected as coding driver genes but appeared to form a network module centered on the coding drivers. The coding and regulatory variants associated via these interactions between the coding and transcriptional drivers showed exclusive and complementary occurrence patterns across tumor samples. Transcriptional cancer drivers may act through an extensive perturbation of the regulatory network and by altering protein network modules through interactions with coding driver genes. PMID:28333928
Haplotype combination of the bovine INSIG1 gene sequence variants and association with growth traits in Nanyang cattle.

PubMed

Sun, Jiajie; Gao, Yuan; Liu, Dong; Ma, Wei; Xue, Jing; Zhang, Chunlei; Lan, Xianyong; Lei, Chuzhao; Chen, Hong

2012-06-01

The insulin-induced gene 1 (INSIG1) gene encodes a protein that blocks proteolytic activation of sterol regulatory element binding proteins, which are transcription factors that activate genes that regulate cholesterol, fatty acid, and glucose metabolism. However, similar research for the bovine INSIG1 gene is lacking. Therefore, in this study, polymorphisms of the bovine INSIG1 gene were detected in 643 individuals from four cattle breeds by DNA pooling, forced PCR-RFLP, PCR-SSCP, and DNA sequencing methods. Only 10 novel SNPs were identified, which included four mutations in the coding region and the others in the introns. In Nanyang individuals, seven common haplotypes were identified based on four coding region SNPs. The haplotype GACT, with a frequency of 75.4%, was the most prevalent haplotype, and the SNPs formed two linkage disequilibrium blocks with strong multi-allelic D' (D' = 1). Additionally, association analysis between mutations of the bovine INSIG1 gene and growth traits in Nanyang cattle at 6, 12, 18, and 24 months old was performed, and the results indicated that the polymorphisms were not significantly associated with body mass.
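To make the linkage disequilibrium statistic in the abstract above concrete, here is a minimal sketch of the standard pairwise D' for two biallelic loci. It is a simplification: the abstract reports a multi-allelic D' from the authors' own analysis, which this code does not reproduce. The function name and the toy haplotypes are hypothetical.

```python
def pairwise_d_prime(haplotypes, a, b):
    """|D'| between two biallelic loci.

    haplotypes: list of 2-character strings (allele at locus 1, allele at locus 2).
    a, b: the reference alleles at locus 1 and locus 2.
    """
    n = len(haplotypes)
    p_a = sum(h[0] == a for h in haplotypes) / n
    p_b = sum(h[1] == b for h in haplotypes) / n
    p_ab = sum(h == a + b for h in haplotypes) / n

    d = p_ab - p_a * p_b  # raw disequilibrium coefficient
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return 0.0 if d_max == 0 else abs(d) / d_max


# Hypothetical toy data: only two of the four possible haplotypes occur,
# so the loci are in complete disequilibrium.
complete = ["GA"] * 6 + ["CT"] * 4
# All four haplotypes at equal frequency: no disequilibrium.
independent = ["GA", "GT", "CA", "CT"]

print(pairwise_d_prime(complete, "G", "A"))
print(pairwise_d_prime(independent, "G", "A"))
```

A D' of 1, as reported for the two blocks, means that at least one of the possible haplotype combinations is absent from the sample.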
A comprehensive catalog of human KRAB-associated zinc finger genes: Insights into the evolutionary history of a large family of transcriptional repressors

PubMed Central

Huntley, Stuart; Baggott, Daniel M.; Hamilton, Aaron T.; Tran-Gyamfi, Mary; Yang, Shan; Kim, Joomyeong; Gordon, Laurie; Branscomb, Elbert; Stubbs, Lisa

2006-01-01

Krüppel-type zinc finger (ZNF) motifs are prevalent components of transcription factor proteins in all eukaryotes. KRAB-ZNF proteins, in which a potent repressor domain is attached to a tandem array of DNA-binding zinc-finger motifs, are specific to tetrapod vertebrates and represent the largest class of ZNF proteins in mammals. To define the full repertoire of human KRAB-ZNF proteins, we searched the genome sequence for key motifs and then constructed and manually curated gene models incorporating those sequences. The resulting gene catalog contains 423 KRAB-ZNF protein-coding loci, yielding alternative transcripts that altogether predict at least 742 structurally distinct proteins. Active rounds of segmental duplication, involving single genes or larger regions and including both tandem and distributed duplication events, have driven the expansion of this mammalian gene family. Comparisons between the human genes and ZNF loci mined from the draft mouse, dog, and chimpanzee genomes not only identified 103 KRAB-ZNF genes that are conserved in mammals but also highlighted a substantial level of lineage-specific change; at least 136 KRAB-ZNF coding genes are primate specific, including many recent duplicates. KRAB-ZNF genes are widely expressed and clustered genes are typically not coregulated, indicating that paralogs have evolved to fill roles in many different biological processes. To facilitate further study, we have developed a Web-based public resource with access to gene models, sequences, and other data, including visualization tools to provide genomic context and interaction with other public data sets. PMID:16606702
Molecular cloning, recombinant expression, and antifungal functional characterization of the lipid transfer protein from Panax ginseng.

PubMed

Cai, Kexin; Wang, Jiawen; Wang, Min; Zhang, Hui; Wang, Siming; Zhao, Yu

2016-07-01

To establish an efficient expression system for a fusion protein GST-pgLTP (Lipid Transfer Protein) and to test its antifungal activity. The nucleotide sequence of LTP gene was obtained from Panax ginseng using RT-PCR. The ORF of the cDNA is 363 bp, coding for a protein of 120 amino acids with a calculated MW of 12.09 kDa. The pgLTP gene with a His6-tag at the C-terminus was cloned into the pGEX-6p1 vector to generate a GST-fusion pgLTP protein construct that was expressed in Escherichia coli Rosetta. Following purification by Ni-NTA, the fusion protein exhibited antifungal activity against five fungi found in ginseng. The fusion protein GST-pgLTP has activity against a broad spectrum of phytopathogenic fungi, and can potentially be adapted for production to combat fungal diseases that affect P. ginseng.
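The ORF length and protein length quoted in the abstract above are related by simple arithmetic. The sketch below checks it under the assumption that the 363 bp figure includes the stop codon; the function name is hypothetical.

```python
def protein_length_from_orf(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_bp nucleotides."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    n_codons = orf_bp // 3
    # The stop codon is part of the ORF but does not encode an amino acid.
    return n_codons - 1 if includes_stop else n_codons


print(protein_length_from_orf(363))  # 121 codons minus the stop codon -> 120
```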
Drosophila Araucan and Caupolican Integrate Intrinsic and Signalling Inputs for the Acquisition by Muscle Progenitors of the Lateral Transverse Fate

PubMed Central

Carrasco-Rando, Marta; Tutor, Antonio S.; Prieto-Sánchez, Silvia; González-Pérez, Esther; Barrios, Natalia; Letizia, Annalisa; Martín, Paloma; Campuzano, Sonsoles; Ruiz-Gómez, Mar

2011-01-01

A central issue of myogenesis is the acquisition of identity by individual muscles. In Drosophila, at the time muscle progenitors are singled out, they already express unique combinations of muscle identity genes. This muscle code results from the integration of positional and temporal signalling inputs. Here we identify, by means of loss-of-function and ectopic expression approaches, the Iroquois Complex homeobox genes araucan and caupolican as novel muscle identity genes that confer lateral transverse muscle identity. The acquisition of this fate requires that Araucan/Caupolican repress other muscle identity genes such as slouch and vestigial. In addition, we show that Caupolican-dependent slouch expression depends on the activation state of the Ras/Mitogen Activated Protein Kinase cascade. This provides a comprehensive insight into the way Iroquois genes integrate in muscle progenitors, signalling inputs that modulate gene expression and protein activity. PMID:21811416
Non-homeodomain regions of Hox proteins mediate activation versus repression of Six2 via a single enhancer site in vivo

PubMed Central

Yallowitz, Alisha R.; Gong, Ke-Qin; Swinehart, Ilea T.; Nelson, Lisa T.; Wellik, Deneen M.

2009-01-01

Hox genes control many developmental events along the AP axis, but few target genes have been identified. Whether target genes are activated or repressed, what enhancer elements are required for regulation, and how different domains of the Hox proteins contribute to regulatory specificity are poorly understood. Six2 is genetically downstream of both the Hox11 paralogous genes in the developing mammalian kidney and Hoxa2 in branchial arch and facial mesenchyme. Loss-of-function of Hox11 leads to loss of Six2 expression and loss-of-function of Hoxa2 leads to expanded Six2 expression. Herein we demonstrate that a single enhancer site upstream of the Six2 coding sequence is responsible for both activation by Hox11 proteins in the kidney and repression by Hoxa2 in the branchial arch and facial mesenchyme in vivo. DNA binding activity is required for both activation and repression, but differential activity is not controlled by differences in the homeodomains. Rather, protein domains N- and C-terminal to the homeodomain confer activation versus repression activity. These data support a model in which the DNA binding specificity of Hox proteins in vivo may be similar, consistent with accumulated in vitro data, and that unique functions result mainly from differential interactions mediated by non-homeodomain regions of Hox proteins. PMID:19716816
Identification of the Operon for the Sorbitol (Glucitol) Phosphoenolpyruvate:Sugar Phosphotransferase System in Streptococcus mutans

PubMed Central

Boyd, David A.; Thevenot, Tracy; Gumbmann, Markus; Honeyman, Allen L.; Hamilton, Ian R.

2000-01-01

Transposon mutagenesis and marker rescue were used to isolate and identify an 8.5-kb contiguous region containing six open reading frames constituting the operon for the sorbitol P-enolpyruvate phosphotransferase transport system (PTS) of Streptococcus mutans LT11. The first gene, srlD, codes for sorbitol-6-phosphate dehydrogenase, followed downstream by srlR, coding for a transcriptional regulator; srlM, coding for a putative activator; and the srlA, srlE, and srlB genes, coding for the EIIC, EIIBC, and EIIA components of the sorbitol PTS, respectively. Among all sorbitol PTS operons characterized to date, the srlD gene is found after the genes coding for the EII components; thus, the location of the gene in S. mutans is unique. The SrlR protein is similar to several transcriptional regulators found in Bacillus spp. that contain PTS regulator domains (J. Stülke, M. Arnaud, G. Rapoport, and I. Martin-Verstraete, Mol. Microbiol. 28:865–874, 1998), and its gene overlaps the srlM gene by 1 bp. The arrangement of these two regulatory genes is unique, having not been reported for other bacteria. PMID:10639465
Maize GO annotation—methods, evaluation, and review (maize-GAMER)

USDA-ARS's Scientific Manuscript database

We created a new high-coverage, robust, and reproducible functional annotation of maize protein-coding genes based on Gene Ontology (GO) term assignments. Whereas the existing Phytozome and Gramene maize GO annotation sets only cover 41% and 56% of maize protein-coding genes, respectively, this stu...
Rye B chromosomes encode a functional Argonaute-like protein with in vitro slicer activities similar to its A chromosome paralog.

PubMed

Ma, Wei; Gabriel, Tobias Sebastian; Martis, Mihaela Maria; Gursinsky, Torsten; Schubert, Veit; Vrána, Jan; Doležel, Jaroslav; Grundlach, Heidrun; Altschmied, Lothar; Scholz, Uwe; Himmelbach, Axel; Behrens, Sven-Erik; Banaei-Moghaddam, Ali Mohammad; Houben, Andreas

2017-01-01

B chromosomes (Bs) are supernumerary, dispensable parts of the nuclear genome, which appear in many different species of eukaryote. So far, Bs have been considered to be genetically inert elements without any functional genes. Our comparative transcriptome analysis and the detection of active RNA polymerase II (RNAPII) in the proximity of B chromatin demonstrate that the Bs of rye (Secale cereale) contribute to the transcriptome. In total, 1954 and 1218 B-derived transcripts with an open reading frame were expressed in generative and vegetative tissues, respectively. In addition to B-derived transposable element transcripts, a high percentage of short transcripts without detectable similarity to known proteins and gene fragments from A chromosomes (As) were found, suggesting an ongoing gene erosion process. In vitro analysis of the A- and B-encoded AGO4B protein variants demonstrated that both possess RNA slicer activity. These data demonstrate unambiguously the presence of a functional AGO4B gene on Bs and that these Bs carry both functional protein coding genes and pseudogene copies. Thus, B-encoded genes may provide an additional level of gene control and complexity in combination with their related A-located genes. Hence, physiological effects, associated with the presence of Bs, may partly be explained by the activity of B-located (pseudo)genes. © 2016 IPK Gatersleben. New Phytologist © 2016 New Phytologist Trust.
Forty-four novel protein-coding loci discovered using a proteomics informed by transcriptomics (PIT) approach in rat male germ cells.

PubMed

Chocu, Sophie; Evrard, Bertrand; Lavigne, Régis; Rolland, Antoine D; Aubry, Florence; Jégou, Bernard; Chalmel, Frédéric; Pineau, Charles

2014-11-01

Spermatogenesis is a complex process, dependent upon the successive activation and/or repression of thousands of gene products, and ends with the production of haploid male gametes. RNA sequencing of male germ cells in the rat identified thousands of novel testicular unannotated transcripts (TUTs). Although such RNAs are usually annotated as long noncoding RNAs (lncRNAs), it is possible that some of these TUTs code for protein. To test this possibility, we used a "proteomics informed by transcriptomics" (PIT) strategy combining RNA sequencing data with shotgun proteomics analyses of spermatocytes and spermatids in the rat. Among 3559 TUTs and 506 lncRNAs found in meiotic and postmeiotic germ cells, 44 encoded at least one peptide. We showed that these novel high-confidence protein-coding loci exhibit several genomic features intermediate between those of lncRNAs and mRNAs. We experimentally validated the testicular expression pattern of two of these novel protein-coding gene candidates, both highly conserved in mammals: one for a vesicle-associated membrane protein we named VAMP-9, and the other for an enolase domain-containing protein. This study confirms the potential of PIT approaches for the discovery of protein-coding transcripts initially thought to be untranslated or unknown transcripts. Our results contribute to the understanding of spermatogenesis by characterizing two novel proteins, implicated by their strong expression in germ cells. The mass spectrometry proteomics data have been deposited with the ProteomeXchange Consortium under the data set identifier PXD000872. © 2014 by the Society for the Study of Reproduction, Inc.

Complete mitochondrial genome sequence from an endangered Indian snake, Python molurus molurus (Serpentes, Pythonidae).

PubMed

Dubey, Bhawna; Meganathan, P R; Haque, Ikramul

2012-07-01

This paper reports the complete mitochondrial genome sequence of an endangered Indian snake, Python molurus molurus (Indian Rock Python). A typical snake mitochondrial (mt) genome of 17258 bp length comprising 37 genes, including the 13 protein coding genes, 22 tRNA genes, and 2 ribosomal RNA genes along with duplicate control regions, is described herein. The P. molurus molurus mt. genome is relatively similar to other snake mt. genomes with respect to gene arrangement, composition, tRNA structures and skews of AT/GC bases. The nucleotide composition of the genome shows that there are more A and C than T and G on the positive strand, as revealed by positive AT and CG skews. Comparison of individual protein coding genes with other snake genomes suggests that ATP8 and NADH3 genes have high divergence rates. Codon usage analysis reveals a preference of NNC codons over NNG codons in the mt. genome of P. molurus. Also, the synonymous and non-synonymous substitution rates (ka/ks) suggest that most of the protein coding genes are under purifying selection pressure. The phylogenetic analyses involving the concatenated 13 protein coding genes of P. molurus molurus conformed to the previously established snake phylogeny.
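The strand skews referred to in the abstract above can be sketched as follows. This is an illustrative calculation only. It assumes AT skew = (A - T)/(A + T) and, following the abstract's sign convention that a positive CG skew means more C than G, CG skew = (C - G)/(C + G). Many papers instead report GC skew as (G - C)/(G + C), which has the opposite sign. The function name and the toy sequence are hypothetical.

```python
def at_cg_skews(seq):
    """Return (AT skew, CG skew) for one strand of a nucleotide sequence."""
    seq = seq.upper()
    a, t, g, c = (seq.count(base) for base in "ATGC")
    at_skew = (a - t) / (a + t) if a + t else 0.0
    cg_skew = (c - g) / (c + g) if c + g else 0.0
    return at_skew, cg_skew


# Hypothetical toy sequence with more A than T and more C than G.
at_skew, cg_skew = at_cg_skews("AAATCCG")
print(at_skew, cg_skew)  # both values are positive for this sequence
```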
Cloning, sequence analysis, and expression in Escherichia coli of a gene coding for a. beta. -mannanase from the extremely thermophilic bacterium Caldocellum saccharolyticum

DOE Office of Scientific and Technical Information (OSTI.GOV)

Luethi, E.; Jasmat, N.B.; Grayling, R.A.

1991-03-01

A λ recombinant phage expressing β-mannanase activity in Escherichia coli has been isolated from a genomic library of the extremely thermophilic anaerobe Caldocellum saccharolyticum. The gene was cloned into pBR322 on a 5-kb BamHI fragment, and its location was obtained by deletion analysis. The sequence of a 2.1-kb fragment containing the mannanase gene has been determined. One open reading frame was found which could code for a protein of Mr 38,904. The mannanase gene (manA) was overexpressed in E. coli by cloning the gene downstream from the lacZ promoter of pUC18. The enzyme was most active at pH 6 and 80 °C and degraded locust bean gum, guar gum, Pinus radiata glucomannan, and konjak glucomannan. The noncoding region downstream from the mannanase gene showed strong homology to celB, a gene coding for a cellulase from the same organism, suggesting that the manA gene might have been inserted into its present position on the C. saccharolyticum genome by homologous recombination.

Identification of BSAP (Pax-5) target genes in early B-cell development by loss- and gain-of-function experiments.

PubMed Central

Nutt, S L; Morrison, A M; Dörfler, P; Rolink, A; Busslinger, M

1998-01-01

The Pax-5 gene codes for the transcription factor BSAP which is essential for the progression of adult B lymphopoiesis beyond an early progenitor (pre-BI) cell stage. Although several genes have been proposed to be regulated by BSAP, CD19 is to date the only target gene which has been genetically confirmed to depend on this transcription factor for its expression. We have now taken advantage of cultured pre-BI cells of wild-type and Pax-5 mutant bone marrow to screen a large panel of B lymphoid genes for additional BSAP target genes. Four differentially expressed genes were shown to be under the direct control of BSAP, as their expression was rapidly regulated in Pax-5-deficient pre-BI cells by a hormone-inducible BSAP-estrogen receptor fusion protein. The genes coding for the B-cell receptor component Ig-alpha (mb-1) and the transcription factors N-myc and LEF-1 are positively regulated by BSAP, while the gene coding for the cell surface protein PD-1 is efficiently repressed. Distinct regulatory mechanisms of BSAP were revealed by reconstituting Pax-5-deficient pre-BI cells with full-length BSAP or a truncated form containing only the paired domain. IL-7 signalling was able to efficiently induce the N-myc gene only in the presence of full-length BSAP, while complete restoration of CD19 synthesis was critically dependent on the BSAP protein concentration. In contrast, the expression of the mb-1 and LEF-1 genes was already reconstituted by the paired domain polypeptide lacking any transactivation function, suggesting that the DNA-binding domain of BSAP is sufficient to recruit other transcription factors to the regulatory regions of these two genes. In conclusion, these loss- and gain-of-function experiments demonstrate that BSAP regulates four newly identified target genes as a transcriptional activator, repressor or docking protein depending on the specific regulatory sequence context. PMID:9545244

A family of octopamine [corrected] receptors that specifically induce cyclic AMP production or Ca2+ release in Drosophila melanogaster.

PubMed

Balfanz, Sabine; Strünker, Timo; Frings, Stephan; Baumann, Arnd

2005-04-01

In invertebrates, the biogenic-amine octopamine is an important physiological regulator. It controls and modulates neuronal development, circadian rhythm, locomotion, 'fight or flight' responses, as well as learning and memory. Octopamine mediates its effects by activation of different GTP-binding protein (G protein)-coupled receptor types, which induce either cAMP production or Ca2+ release. Here we describe the functional characterization of two genes from Drosophila melanogaster that encode three octopamine receptors. The first gene (Dmoa1) codes for two polypeptides that are generated by alternative splicing. When heterologously expressed, both receptors cause oscillatory increases of the intracellular Ca2+ concentration in response to applying nanomolar concentrations of octopamine. The second gene (Dmoa2) codes for a receptor that specifically activates adenylate cyclase and causes a rise of intracellular cAMP with an EC50 of approximately 3 × 10^-8 M octopamine. Tyramine, the precursor of octopamine biosynthesis, activates all three receptors at ≥100-fold higher concentrations, whereas dopamine and serotonin are non-effective. Developmental expression of Dmoa genes was assessed by RT-PCR. Overlapping but not identical expression patterns were observed for the individual transcripts. The genes characterized in this report encode unique receptors that display signature properties of native octopamine receptors.

The complete mitochondrial genome of the endangered spotback skate, Atlantoraja castelnaui.

PubMed

Duckett, Drew J L; Naylor, Gavin J P

2016-05-01

Chondrichthyes are a highly threatened class of organisms, largely due to overfishing and other human activities. The present study describes the complete mitochondrial genome (16,750 bp) of the endangered spotback skate, Atlantoraja castelnaui. The mitogenome is arranged in a typical vertebrate fashion, containing 13 protein-coding genes, 22 tRNA genes, 2 rRNA genes and 1 control region.

New PAH gene promoter KLF1 and 3'-region C/EBPalpha motifs influence transcription in vitro.

PubMed

Klaassen, Kristel; Stankovic, Biljana; Kotur, Nikola; Djordjevic, Maja; Zukic, Branka; Nikcevic, Gordana; Ugrin, Milena; Spasovski, Vesna; Srzentic, Sanja; Pavlovic, Sonja; Stojiljkovic, Maja

2017-02-01

Phenylketonuria (PKU) is a metabolic disease caused by mutations in the phenylalanine hydroxylase (PAH) gene. Although the PAH genotype remains the main determinant of PKU phenotype severity, genotype-phenotype inconsistencies have been reported. In this study, we focused on unanalysed sequences in non-coding PAH gene regions to assess their possible influence on the PKU phenotype. We transiently transfected HepG2 cells with various chloramphenicol acetyl transferase (CAT) reporter constructs which included PAH gene non-coding regions. Selected non-coding regions were indicated by in silico prediction to contain transcription factor binding sites. Furthermore, electrophoretic mobility shift assay (EMSA) and supershift assays were performed to identify which transcription factors were engaged in the interaction. We found a novel KLF1 motif in the PAH promoter, which decreases CAT activity by 50% in comparison to basal transcription in vitro. The cytosine at the c.-170 promoter position creates an additional binding site for the protein complex involving the KLF1 transcription factor. Moreover, we assessed for the first time the role of a multivariant variable number tandem repeat (VNTR) region located in the 3'-region of the PAH gene. We found that the VNTR3, VNTR7 and VNTR8 constructs had approximately 60% of CAT activity. The regulation is mediated by the C/EBPalpha transcription factor, present in the protein complex binding to VNTR3. Our study highlighted two novel motifs in the PAH gene, a promoter KLF1 motif and a 3'-region C/EBPalpha motif, which decrease transcription in vitro and, thus, could be considered as PAH expression modifiers. New transcription motifs in non-coding regions will contribute to a better understanding of the PKU phenotype complexity and may become important for the optimisation of PKU treatment.

Improvement of heterologous protein production in Aspergillus oryzae by RNA interference with alpha-amylase genes.

PubMed

Nemoto, Takashi; Maruyama, Jun-ichi; Kitamoto, Katsuhiko

2009-11-01

Aspergillus oryzae RIB40 has three alpha-amylase genes (amyA, amyB, and amyC), and secretes alpha-amylase abundantly. However, large amounts of endogenous secretory proteins such as alpha-amylase can compete with heterologous protein in the secretory pathway and decrease its production yields. In this study, we examined the effects of suppression of alpha-amylase on heterologous protein production in A. oryzae, using bovine chymosin (CHY) as a reporter heterologous protein. The three alpha-amylase genes in A. oryzae have nearly identical DNA sequences from their promoters to the coding regions. Hence we performed silencing of alpha-amylase genes by RNA interference (RNAi) in the A. oryzae CHY producing strain. The silenced strains exhibited a reduction in alpha-amylase activity and an increase in CHY production in the culture medium. This result suggests that suppression of alpha-amylase is effective in heterologous protein production in A. oryzae.

Comparative analysis of human protein-coding and noncoding RNAs between brain and 10 mixed cell lines by RNA-Seq.

PubMed

Chen, Geng; Yin, Kangping; Shi, Leming; Fang, Yuanzhang; Qi, Ya; Li, Peng; Luo, Jian; He, Bing; Liu, Mingyao; Shi, Tieliu

2011-01-01

In their expression process, different genes can generate diverse functional products, including various protein-coding or noncoding RNAs. Here, we investigated the protein-coding capacities and the expression levels of the isoforms of known human genes, as well as the conservation and disease association of long noncoding RNAs (ncRNAs), using two transcriptome sequencing datasets from human brain tissues and 10 mixed cell lines. Comparative analysis revealed that about two-thirds of the genes expressed between brain and cell lines are the same, but less than one-third of their isoforms are identical. Besides those genes specifically expressed in brain and cell lines, about 66% of genes expressed in common encoded different isoforms. Moreover, most genes dominantly expressed one isoform and some genes only generated protein-coding (or noncoding) RNAs in one sample but not in another. We found that 282 human genes could encode both protein-coding and noncoding RNAs through alternative splicing in the two samples. We also identified more than 1,000 long ncRNAs, and most of those long ncRNAs contain conserved elements across either 46 vertebrates or 33 placental mammals or 10 primates. Further analysis showed that some long ncRNAs were differentially expressed in human breast cancer or lung cancer; several of those differentially expressed long ncRNAs were validated by RT-PCR. In addition, those validated differentially expressed long ncRNAs were found to be significantly correlated with certain breast cancer or lung cancer related genes, indicating the important biological relevance between long ncRNAs and human cancers. Our findings reveal that the differences of gene expression profile between samples mainly result from the expressed gene isoforms, and highlight the importance of studying genes at the isoform level for completely illustrating the intricate transcriptome.

Transcription Factor Binding Profiles Reveal Cyclic Expression of Human Protein-coding Genes and Non-coding RNAs

PubMed Central

Cheng, Chao; Ung, Matthew; Grant, Gavin D.; Whitfield, Michael L.

2013-01-01

Cell cycle is a complex and highly supervised process that must proceed with regulatory precision to achieve successful cellular division. Despite the wide application, microarray time course experiments have several limitations in identifying cell cycle genes. We thus propose a computational model to predict human cell cycle genes based on transcription factor (TF) binding and regulatory motif information in their promoters. We utilize ENCODE ChIP-seq data and motif information as predictors to discriminate cell cycle against non-cell cycle genes. Our results show that both the trans- TF features and the cis- motif features are predictive of cell cycle genes, and a combination of the two types of features can further improve prediction accuracy. We apply our model to a complete list of GENCODE promoters to predict novel cell cycle driving promoters for both protein-coding genes and non-coding RNAs such as lincRNAs. We find that a similar percentage of lincRNAs are cell cycle regulated as protein-coding genes, suggesting the importance of non-coding RNAs in cell cycle division. The model we propose here provides not only a practical tool for identifying novel cell cycle genes with high accuracy, but also new insights on cell cycle regulation by TFs and cis-regulatory elements. PMID:23874175

Discovery of rare protein-coding genes in model methylotroph Methylobacterium extorquens AM1.

PubMed

Kumar, Dhirendra; Mondal, Anupam Kumar; Yadav, Amit Kumar; Dash, Debasis

2014-12-01

Proteogenomics involves the use of MS to refine annotation of protein-coding genes and discover genes in a genome. We carried out comprehensive proteogenomic analysis of Methylobacterium extorquens AM1 (ME-AM1) from publicly available proteomics data with the aim of improving annotation for methylotrophs, organisms capable of surviving on reduced carbon compounds such as methanol. Besides identifying 2482 (50%) proteins, 29 new genes were discovered and 66 annotated gene models were revised in the ME-AM1 genome. One such novel gene is identified with 75 peptides, lacks a homolog in other methylobacteria but has glycosyl transferase and lipopolysaccharide biosynthesis protein domains, indicating its potential role in outer membrane synthesis. Many novel genes are present only in ME-AM1 among methylobacteria. Distant homologs of these genes in unrelated taxonomic classes and the low GC-content of a few genes suggest lateral gene transfer as a potential mode of their origin. Annotations of methylotrophy related genes were also improved by the discovery of a short gene in the methylotrophy gene island and by redefining a gene important for pyrroquinoline quinone synthesis, essential for methylotrophy. The combined use of proteogenomics and rigorous bioinformatics analysis greatly enhanced the annotation of protein-coding genes in the model methylotroph ME-AM1 genome. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

Regulated expression of the human cytomegalovirus pp65 gene: Octamer sequence in the promoter is required for activation by viral gene products

DOE Office of Scientific and Technical Information (OSTI.GOV)

Depto, A.S.; Stenberg, R.M.

1989-03-01

To better understand the regulation of late gene expression in human cytomegalovirus (CMV)-infected cells, the authors examined expression of the gene that codes for the 65-kilodalton lower-matrix phosphoprotein (pp65). Analysis of RNA isolated at 72 h from cells infected with CMV Towne or ts66, a DNA-negative temperature-sensitive mutant, supported the fact that pp65 is expressed at low levels prior to viral DNA replication but maximally expressed after the initiation of viral DNA replication. To investigate promoter activation in a transient expression assay, the pp65 promoter was cloned into the indicator plasmid containing the gene for chloramphenicol acetyltransferase (CAT). Transfection of the promoter-CAT construct and subsequent superinfection with CMV resulted in activation of the promoter at early times after infection. Cotransfection with plasmids capable of expressing immediate-early (IE) proteins demonstrated that the promoter was activated by IE proteins and that both IE regions 1 and 2 were necessary. These studies suggest that interactions between IE proteins and this octamer sequence may be important for the regulation and expression of this CMV gene.

Structural architecture of the human long non-coding RNA, steroid receptor RNA activator

PubMed Central

Novikova, Irina V.; Hennelly, Scott P.; Sanbonmatsu, Karissa Y.

2012-01-01

While functional roles of several long non-coding RNAs (lncRNAs) have been determined, the molecular mechanisms are not well understood. Here, we report the first experimentally derived secondary structure of a human lncRNA, the steroid receptor RNA activator (SRA), 0.87 kB in size. The SRA RNA is a non-coding RNA that coactivates several human sex hormone receptors and is strongly associated with breast cancer. Coding isoforms of SRA are also expressed to produce proteins, making the SRA gene a unique bifunctional system. Our experimental findings (SHAPE, in-line, DMS and RNase V1 probing) reveal that this lncRNA has a complex structural organization, consisting of four domains, with a variety of secondary structure elements. We examine the coevolution of the SRA gene at the RNA structure and protein structure levels using comparative sequence analysis across vertebrates. Rapid evolutionary stabilization of RNA structure, combined with frame-disrupting mutations in conserved regions, suggests that evolutionary pressure preserves the RNA structural core rather than its translational product. We perform similar experiments on alternatively spliced SRA isoforms to assess their structural features. PMID:22362738

Next generation sequencing and analysis of a conserved transcriptome of New Zealand's kiwi.

PubMed

Subramanian, Sankar; Huynen, Leon; Millar, Craig D; Lambert, David M

2010-12-15

Kiwi is a highly distinctive, flightless and endangered ratite bird endemic to New Zealand. To understand the patterns of molecular evolution of the nuclear protein-coding genes in brown kiwi (Apteryx australis mantelli) and to determine the timescale of avian history we sequenced a transcriptome obtained from a kiwi embryo using next generation sequencing methods. We then assembled the conserved protein-coding regions using the chicken proteome as a scaffold. Using 1,543 conserved protein coding genes we estimated the neutral evolutionary divergence between the kiwi and chicken to be ~45%, which is approximately equal to the divergence computed for the human-mouse pair using the same set of genes. A large fraction of genes was found to be under high selective constraint, as most of the expressed genes appeared to be involved in developmental gene regulation. Our study suggests a significant relationship between gene expression levels and protein evolution. Using sequences from over 700 nuclear genes we estimated the divergence between the two basal avian groups, Palaeognathae and Neognathae to be 132 million years, which is consistent with previous studies using mitochondrial genes. The results of this investigation revealed patterns of mutation and purifying selection in conserved protein coding regions in birds. Furthermore this study suggests a relatively cost-effective way of obtaining a glimpse into the fundamental molecular evolutionary attributes of a genome, particularly when no closely related genomic sequence is available.

Characterization of mitochondrial genome of sea cucumber Stichopus horrens: a novel gene arrangement in Holothuroidea.

PubMed

Fan, SiGang; Hu, ChaoQun; Wen, Jing; Zhang, LvPing

2011-05-01

The complete mitochondrial DNA sequence contains useful information for phylogenetic analyses of metazoa. In this study, the complete mitochondrial DNA sequence of sea cucumber Stichopus horrens (Holothuroidea: Stichopodidae: Stichopus) is presented. The complete sequence was determined using normal and long PCRs. The mitochondrial genome of Stichopus horrens is a circular molecule 16257 bp long, composed of 13 protein-coding genes, two ribosomal RNA genes and 22 transfer RNA genes. Most of these genes are coded on the heavy strand except for one protein-coding gene (nad6) and five tRNA genes (tRNA(Ser(UCN)), tRNA(Gln), tRNA(Ala), tRNA(Val), tRNA(Asp)) which are coded on the light strand. The composition of the heavy strand is 30.8% A, 23.7% C, 16.2% G, and 29.3% T bases (AT skew=0.025; GC skew=-0.188). A non-coding region of 675 bp was identified as a putative control region because of its location and AT richness. The intergenic spacers range from 1 to 50 bp in size, totaling 227 bp. A total of 25 overlapping nucleotides, ranging from 1 to 10 bp in size, exist among 11 genes. All 13 protein-coding genes are initiated with an ATG. The TAA codon is used as the stop codon in all the protein coding genes except nad3 and nad4 that use TAG as their termination codon. The most frequently used amino acids are Leu (16.29%), Ser (10.34%) and Phe (8.37%). All of the tRNA genes have the potential to fold into typical cloverleaf secondary structures. We also compared the order of the genes in the mitochondrial DNA from the five holothurians that are now available and found a novel gene arrangement in the mitochondrial DNA of Stichopus horrens.
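
The skew values quoted in this abstract can be checked from the reported base composition. The short Python sketch below is not part of the original record. It assumes the conventional definitions AT skew = (A - T)/(A + T) and GC skew = (G - C)/(G + C), which the abstract does not state explicitly. The function name `skew` and the `composition` dictionary are illustrative only.

```python
def skew(x: float, y: float) -> float:
    """Return the compositional skew (x - y) / (x + y)."""
    return (x - y) / (x + y)


# Heavy-strand base composition (percent) as reported in the abstract.
composition = {"A": 30.8, "C": 23.7, "G": 16.2, "T": 29.3}

# Assumed conventional definitions; not spelled out in the abstract.
at_skew = skew(composition["A"], composition["T"])
gc_skew = skew(composition["G"], composition["C"])

print(f"AT skew = {at_skew:.3f}")
print(f"GC skew = {gc_skew:.3f}")
```

Under these definitions the computed values round to the figures given in the abstract (0.025 and -0.188): a slight excess of A over T and a clear excess of C over G on the heavy strand.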

De Novo ORFs in Drosophila Are Important to Organismal Fitness and Evolved Rapidly from Previously Non-coding Sequences

PubMed Central

Reinhardt, Josephine A.; Wanjiru, Betty M.; Brant, Alicia T.; Saelao, Perot; Begun, David J.; Jones, Corbin D.

2013-01-01

How non-coding DNA gives rise to new protein-coding genes (de novo genes) is not well understood. Recent work has revealed the origins and functions of a few de novo genes, but common principles governing the evolution or biological roles of these genes are unknown. To better define these principles, we performed a parallel analysis of the evolution and function of six putatively protein-coding de novo genes described in Drosophila melanogaster. Reconstruction of the transcriptional history of de novo genes shows that two de novo genes emerged from novel long non-coding RNAs that arose at least 5 MY prior to evolution of an open reading frame. In contrast, four other de novo genes evolved a translated open reading frame and transcription within the same evolutionary interval suggesting that nascent open reading frames (proto-ORFs), while not required, can contribute to the emergence of a new de novo gene. However, none of the genes arose from proto-ORFs that existed long before expression evolved. Sequence and structural evolution of de novo genes was rapid compared to nearby genes and the structural complexity of de novo genes steadily increases over evolutionary time. Despite the fact that these genes are transcribed at a higher level in males than females, and are most strongly expressed in testes, RNAi experiments show that most of these genes are essential in both sexes during metamorphosis. This lethality suggests that protein coding de novo genes in Drosophila quickly become functionally important. PMID:24146629

Coding of Class I and II aminoacyl-tRNA synthetases

PubMed Central

Carter, Charles W.

2018-01-01

The aminoacyl-tRNA synthetases and their cognate transfer RNAs translate the universal genetic code. The twenty canonical amino acids are sufficiently diverse to create a selective advantage for dividing amino acid activation between two distinct, apparently unrelated superfamilies of synthetases, Class I amino acids being generally larger and less polar, Class II amino acids smaller and more polar. Biochemical, bioinformatic, and protein engineering experiments support the hypothesis that the two Classes descended from opposite strands of the same ancestral gene. Parallel experimental deconstructions of Class I and II synthetases reveal parallel losses in catalytic proficiency at two novel modular levels—protozymes and Urzymes—associated with the evolution of catalytic activity. Bi-directional coding supports an important unification of the proteome; affords a genetic relatedness metric—middle base-pairing frequencies in sense/antisense alignments—that probes more deeply into the evolutionary history of translation than do single multiple sequence alignments; and has facilitated the analysis of hitherto unknown coding relationships in tRNA sequences. Reconstruction of native synthetases by modular thermodynamic cycles facilitated by domain engineering emphasizes the subtlety associated with achieving high specificity, shedding new light on allosteric relationships in contemporary synthetases. Synthetase Urzyme structural biology suggests that they are catalytically active molten globules, broadening the potential manifold of polypeptide catalysts accessible to primitive genetic coding and motivating revisions of the origins of catalysis. Finally, bi-directional genetic coding of some of the oldest genes in the proteome places major limitations on the likelihood that any RNA World preceded the origins of coded proteins. PMID:28828732

Emerging Putative Associations between Non-Coding RNAs and Protein-Coding Genes in Neuropathic Pain: Added Value from Reusing Microarray Data.

PubMed

Raju, Hemalatha B; Tsinoremas, Nicholas F; Capobianco, Enrico

2016-01-01

Regeneration of injured nerves is likely occurring in the peripheral nervous system, but not in the central nervous system. Although protein-coding gene expression has been assessed during nerve regeneration, little is currently known about the role of non-coding RNAs (ncRNAs). This leaves open questions about the potential effects of ncRNAs at transcriptome level. Due to the limited availability of human neuropathic pain (NP) data, we have identified the most comprehensive time-course gene expression profile referred to sciatic nerve (SN) injury and studied in a rat model using two neuronal tissues, namely dorsal root ganglion (DRG) and SN. We have developed a methodology to identify differentially expressed bioentities starting from microarray probes and repurposing them to annotate ncRNAs, while analyzing the expression profiles of protein-coding genes. The approach is designed to reuse microarray data and perform first profiling and then meta-analysis through three main steps. First, we used contextual analysis to identify what we considered putative or potential protein-coding targets for selected ncRNAs. Relevance was therefore assigned to differential expression of neighbor protein-coding genes, with neighborhood defined by a fixed genomic distance from long or antisense ncRNA loci, and of parental genes associated with pseudogenes. Second, connectivity among putative targets was used to build networks, in turn useful to conduct inference at interactomic scale. Last, network paths were annotated to assess relevance to NP. We found significant differential expression in long-intergenic ncRNAs (32 lincRNAs in SN and 8 in DRG), antisense RNA (31 asRNA in SN and 12 in DRG), and pseudogenes (456 in SN and 56 in DRG). In particular, contextual analysis centered on pseudogenes revealed some targets with known association to neurodegeneration and/or neurogenesis processes. 
While modules of the olfactory receptors were clearly identified in protein-protein interaction networks, other connectivity paths were identified between proteins already investigated in studies on disorders such as Parkinson's disease, Down syndrome, Huntington's disease, and Alzheimer's disease. Our findings suggest the importance of reusing gene expression data by meta-analysis approaches.

Td4IN2: A drought-responsive durum wheat (Triticum durum Desf.) gene coding for a resistance like protein with serine/threonine protein kinase, nucleotide binding site and leucine rich domains.

PubMed

Rampino, Patrizia; De Pascali, Mariarosaria; De Caroli, Monica; Luvisi, Andrea; De Bellis, Luigi; Piro, Gabriella; Perrotta, Carla

2017-11-01

Wheat, the main food source for a third of the world's population, appears strongly under threat because of predicted increasing temperatures coupled to drought. The complex molecular response of plants to drought stress relies on the gene network controlling cell reactions to abiotic stress. In the natural environment, plants are subjected to the combination of abiotic and biotic stresses. The response of plants to biotic stress, to cope with pathogens, also involves the activation of a molecular network. Investigations on the combination of abiotic and biotic stresses indicate the existence of cross-talk between the two networks, and a degree of overlap can be hypothesized. In this work we describe the isolation and characterization of a drought-related durum wheat (Triticum durum Desf.) gene, identified in a previous study, coding for a protein combining features of NBS-LRR type resistance protein with a S/TPK domain, involved in drought stress response. This is one of the few examples reported where all three domains are present in a single protein and, to our knowledge, it is the first report on a gene specifically induced by drought stress and drought-related conditions, with this particular structure. Copyright © 2017 Elsevier Masson SAS. All rights reserved.

Unraveling patterns of site-to-site synonymous rates variation and associated gene properties of protein domains and families.

PubMed

Dimitrieva, Slavica; Anisimova, Maria

2014-01-01

In protein-coding genes, synonymous mutations are often thought not to affect fitness and therefore are not subject to natural selection. Yet increasingly, cases of non-neutral evolution at certain synonymous sites were reported over the last decade. To evaluate the extent and the nature of site-specific selection on synonymous codons, we computed the site-to-site synonymous rate variation (SRV) and identified gene properties that make SRV more likely in a large database of protein-coding gene families and protein domains. To our knowledge, this is the first study that explores the determinants and patterns of the SRV in real data. We show that the SRV is widespread in the evolution of protein-coding sequences, putting in doubt the validity of the synonymous rate as a standard neutral proxy. While protein domains rarely undergo adaptive evolution, the SRV appears to play an important role in optimizing the domain function at the level of DNA. In contrast, protein families are more likely to evolve by positive selection, but are less likely to exhibit SRV. Stronger SRV was detected in genes with stronger codon bias and tRNA reusage, those coding for proteins with a larger number of interactions or forming a larger number of structures, those located in intracellular components and those involved in typically conserved complex processes and functions. Genes with extreme SRV show higher expression levels in nearly all tissues. This indicates that codon bias in a gene, which often correlates with gene expression, may often be a site-specific phenomenon regulating the speed of translation along the sequence, consistent with the co-translational folding hypothesis. Strikingly, genes with SRV were strongly overrepresented for metabolic pathways and those associated with several genetic diseases, particularly cancers and diabetes.

Multiple copies of genes coding for electron transport proteins in the bacterium Nitrosomonas europaea.

PubMed

McTavish, H; LaQuier, F; Arciero, D; Logan, M; Mundfrom, G; Fuchs, J A; Hooper, A B

1993-04-01

The genome of Nitrosomonas europaea contains at least three copies each of the genes coding for hydroxylamine oxidoreductase (HAO) and cytochrome c554. A copy of an HAO gene is always located within 2.7 kb of a copy of a cytochrome c554 gene. Cytochrome P-460, a protein that shares very unusual spectral features with HAO, was found to be encoded by a gene separate from the HAO genes.

Morphometric Analysis of Recognized Genes for Autism Spectrum Disorders and Obesity in Relationship to the Distribution of Protein-Coding Genes on Human Chromosomes.

PubMed

McGuire, Austen B; Rafi, Syed K; Manzardo, Ann M; Butler, Merlin G

2016-05-05

Mammalian chromosomes are comprised of complex chromatin architecture with the specific assembly and configuration of each chromosome influencing gene expression and function in yet undefined ways by varying degrees of heterochromatinization that result in Giemsa (G) negative euchromatic (light) bands and G-positive heterochromatic (dark) bands. We carried out morphometric measurements of high-resolution chromosome ideograms for the first time to characterize the total euchromatic and heterochromatic chromosome band length, distribution and localization of 20,145 known protein-coding genes, 790 recognized autism spectrum disorder (ASD) genes and 365 obesity genes. The individual lengths of G-negative euchromatin and G-positive heterochromatin chromosome bands were measured in millimeters and recorded from scaled and stacked digital images of 850-band high-resolution ideograms supplied by the International Society of Chromosome Nomenclature (ISCN) 2013. Our overall measurements followed established banding patterns based on chromosome size. G-negative euchromatic band regions contained 60% of protein-coding genes while the remaining 40% were distributed across the four heterochromatic dark band sub-types. ASD genes were disproportionately overrepresented in the darker heterochromatic sub-bands, while the obesity gene distribution pattern did not significantly differ from protein-coding genes. Our study supports recent trends implicating genes located in heterochromatin regions playing a role in biological processes including neurodevelopment and function, specifically genes associated with ASD.
Genomic Structure of an Economically Important Cyanobacterium, Arthrospira (Spirulina) platensis NIES-39

PubMed Central

Fujisawa, Takatomo; Narikawa, Rei; Okamoto, Shinobu; Ehira, Shigeki; Yoshimura, Hidehisa; Suzuki, Iwane; Masuda, Tatsuru; Mochimaru, Mari; Takaichi, Shinichi; Awai, Koichiro; Sekine, Mitsuo; Horikawa, Hiroshi; Yashiro, Isao; Omata, Seiha; Takarada, Hiromi; Katano, Yoko; Kosugi, Hiroki; Tanikawa, Satoshi; Ohmori, Kazuko; Sato, Naoki; Ikeuchi, Masahiko; Fujita, Nobuyuki; Ohmori, Masayuki

2010-01-01

A filamentous non-N2-fixing cyanobacterium, Arthrospira (Spirulina) platensis, is an important organism for industrial applications and as a food supply. Almost the complete genome of A. platensis NIES-39 was determined in this study. The genome structure of A. platensis is estimated to be a single, circular chromosome of 6.8 Mb, based on optical mapping. Annotation of this 6.7 Mb sequence yielded 6630 protein-coding genes as well as two sets of rRNA genes and 40 tRNA genes. Of the protein-coding genes, 78% are similar to those of other organisms; the remaining 22% are currently unknown. A total of 612 kb of the genome comprises group II introns, insertion sequences and some repetitive elements. Group I introns are located in a protein-coding region. Abundant restriction-modification systems were determined. Unique features in the gene composition were noted, particularly in a large number of genes for adenylate cyclase and haemolysin-like Ca2+-binding proteins and in chemotaxis proteins. Filament-specific genes were highlighted by comparative genomic analysis. PMID:20203057
Computer analysis of protein functional sites projection on exon structure of genes in Metazoa.

PubMed

Medvedeva, Irina V; Demenkov, Pavel S; Ivanisenko, Vladimir A

2015-01-01

Study of the relationship between the structural and functional organization of proteins and their coding genes is necessary for an understanding of the evolution of molecular systems and can provide new knowledge for many applications for designing proteins with improved medical and biological properties. It is well known that the functional properties of proteins are determined by their functional sites. Functional sites are usually represented by a small number of amino acid residues that are distantly located from each other in the amino acid sequence. They are highly conserved within their functional group and vary significantly in structure between such groups. Given these facts, analysis of the general properties of the structural organization of functional sites at the protein level and at the level of the exon-intron structure of the coding gene remains a topical problem. One approach to this analysis is the projection of amino acid residue positions of the functional sites along with the exon boundaries to the gene structure. In this paper, we examined the discontinuity of the functional sites in the exon-intron structure of genes and the distribution of lengths and phases of the functional site encoding exons in vertebrate genes. We have shown that the DNA fragments coding the functional sites were in the same exons, or in close exons. The observed tendency of exons that code functional sites to cluster suggests that such clusters could be considered as a unit of protein evolution. We studied the characteristics of the structure of the exon boundaries that code, and do not code, functional sites in 11 Metazoa species. This is accompanied by a reduced frequency of intercodon gaps (phase 0) in exons encoding the amino acid residue functional site, which may be evidence of the existence of evolutionary limitations to the exon shuffling. These results characterize the features of the coding exon-intron structure that affect the functionality of the encoded protein and allow a better understanding of the emergence of biological diversity.
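
The notion of exon phase used in the abstract above can be illustrated with a minimal sketch. It assumes the common convention that the phase at a boundary is the number of bases of an incomplete codon carried across it, so phase 0 means the boundary falls between two codons. The function name `exon_phases` and the exon lengths are hypothetical and are not taken from the paper.

```python
def exon_phases(cds_exon_lengths):
    """Return (start_phase, end_phase) for each coding exon.

    The phase is the count of coding bases seen so far, modulo 3.
    Phase 0 means the exon boundary lies exactly between two codons.
    """
    phases = []
    offset = 0  # coding bases consumed before the current exon
    for length in cds_exon_lengths:
        start_phase = offset % 3
        offset += length
        end_phase = offset % 3
        phases.append((start_phase, end_phase))
    return phases


# Hypothetical coding-exon lengths in base pairs.
print(exon_phases([90, 100, 62]))  # [(0, 0), (0, 1), (1, 0)]
```

An exon whose start and end phases are both 0 can be inserted or removed without shifting the reading frame, which is why phase statistics are often discussed in connection with exon shuffling.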
RNAi mediates post-transcriptional repression of gene expression in fission yeast Schizosaccharomyces pombe

DOE Office of Scientific and Technical Information (OSTI.GOV)

Smialowska, Agata, E-mail: smialowskaa@gmail.com; School of Life Sciences, Södertörn Högskola, Huddinge 141-89; Djupedal, Ingela

Highlights: • Protein coding genes accumulate anti-sense sRNAs in fission yeast S. pombe. • RNAi represses protein-coding genes in S. pombe. • RNAi-mediated gene repression is post-transcriptional. - Abstract: RNA interference (RNAi) is a gene silencing mechanism conserved from fungi to mammals. Small interfering RNAs are products and mediators of the RNAi pathway and act as specificity factors in recruiting effector complexes. The Schizosaccharomyces pombe genome encodes one of each of the core RNAi proteins, Dicer, Argonaute and RNA-dependent RNA polymerase (dcr1, ago1, rdp1). Even though the function of RNAi in heterochromatin assembly in S. pombe is established, its role in controlling gene expression is elusive. Here, we report the identification of small RNAs mapped anti-sense to protein coding genes in fission yeast. We demonstrate that these genes are up-regulated at the protein level in RNAi mutants, while their mRNA levels are not significantly changed. We show that the repression by RNAi is not a result of heterochromatin formation. Thus, we conclude that RNAi is involved in post-transcriptional gene silencing in S. pombe.
[Convergent origin of repeats in genes coding for globular proteins. An analysis of the factors determining the presence of inverted and symmetrical repeats].

PubMed

Solov'ev, V V; Kel', A E; Kolchanov, N A

1989-01-01

The factors determining the presence of inverted and symmetrical repeats in genes coding for globular proteins have been analysed. An interesting property of the genetic code was revealed in the analysis of symmetrical repeats: pairs of symmetrical codons correspond to pairs of amino acids with mostly similar physicochemical parameters. This property may explain the presence of symmetrical repeats and palindromes only in genes coding for beta-structural proteins-polypeptides, where amino acids with similar physicochemical properties occupy symmetrical positions. A stochastic model of the evolution of polynucleotide sequences was used to analyse inverted repeats. The modelling demonstrated that a limitation on sequences (uneven frequencies of codon usage) is by itself enough for nonrandom inverted repeats to arise in genes.
Complete cDNA sequence and amino acid analysis of a bovine ribonuclease K6 gene.

PubMed

Pietrowski, D; Förster, M

2000-01-01

The complete cDNA sequence of a ribonuclease k6 gene of Bos taurus has been determined. It codes for a protein with 154 amino acids and contains the invariant cysteine, histidine and lysine residues as well as the characteristic motifs specific to ribonuclease active sites. The deduced protein sequence is 27 residues longer than other known ribonucleases k6 and shows amino acid exchanges which could reflect a strain specificity or polymorphism within the bovine genome. Based on sequence similarity we have termed the identified gene bovine ribonuclease k6 b (brk6b).
The Human Cell Surfaceome of Breast Tumors

PubMed Central

da Cunha, Júlia Pinheiro Chagas; Galante, Pedro Alexandre Favoretto; de Souza, Jorge Estefano Santana; Pieprzyk, Martin; Carraro, Dirce Maria; Old, Lloyd J.; Camargo, Anamaria Aranha; de Souza, Sandro José

2013-01-01

Introduction. Cell surface proteins are ideal targets for cancer therapy and diagnosis. We have identified a set of more than 3700 genes that code for transmembrane proteins believed to be at the human cell surface. Methods. We used a high-throughput qPCR system for the analysis of 573 cell surface protein-coding genes in 12 primary breast tumors, 8 breast cell lines, and 21 normal human tissues including breast. To better understand the role of these genes in breast tumors, we used a series of bioinformatics strategies to integrate different types of datasets, such as KEGG, protein-protein interaction databases, ONCOMINE, and data from the literature. Results. We found that at least 77 genes are overexpressed in breast primary tumors while at least 2 of them also have a restricted expression pattern in normal tissues. We found common signaling pathways that may be regulated in breast tumors through the overexpression of these cell surface protein-coding genes. Furthermore, a comparison was made between the genes found in this report and other genes associated with features clinically relevant for breast tumorigenesis. Conclusions. The expression profiling generated in this study, together with an integrative bioinformatics analysis, allowed us to identify putative targets for breast tumors. PMID:24195083
Using the NCBI Genome Databases to Compare the Genes for Human & Chimpanzee Beta Hemoglobin

ERIC Educational Resources Information Center

Offner, Susan

2010-01-01

The beta hemoglobin protein is identical in humans and chimpanzees. In this tutorial, students see that even though the proteins are identical, the genes that code for them are not. There are many more differences in the introns than in the exons, which indicates that coding regions of DNA are more highly conserved than non-coding regions.
Modeling T-cell activation using gene expression profiling and state-space models.

PubMed

Rangel, Claudia; Angus, John; Ghahramani, Zoubin; Lioumi, Maria; Sotheran, Elizabeth; Gaiba, Alessia; Wild, David L; Falciani, Francesco

2004-06-12

We have used state-space models to reverse engineer transcriptional networks from highly replicated gene expression profiling time series data obtained from a well-established model of T-cell activation. State space models are a class of dynamic Bayesian networks that assume that the observed measurements depend on some hidden state variables that evolve according to Markovian dynamics. These hidden variables can capture effects that cannot be measured in a gene expression profiling experiment, e.g. genes that have not been included in the microarray, levels of regulatory proteins, the effects of messenger RNA and protein degradation, etc. Bootstrap confidence intervals are developed for parameters representing 'gene-gene' interactions over time. Our models represent the dynamics of T-cell activation and provide a methodology for the development of rational and experimentally testable hypotheses. Supplementary data and Matlab computer source code will be made available on the web at the URL given below. http://public.kgi.edu/~wild/LDS/index.htm
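
For orientation, a generic linear-Gaussian state-space model can be written as below. This is a textbook form given under stated assumptions, not necessarily the exact parameterization used by the authors; the symbols A, C, Q, R, x_t and y_t are introduced here for illustration only.

```latex
% Hidden state x_t evolves with Markovian dynamics;
% observed expression levels y_t depend only on the current hidden state.
\begin{aligned}
x_{t+1} &= A\,x_t + w_t, & w_t &\sim \mathcal{N}(0, Q),\\
y_t     &= C\,x_t + v_t, & v_t &\sim \mathcal{N}(0, R).
\end{aligned}
```

In such a model, the effective influence of one observed gene on another over time is mediated through the hidden state, which is one way to read the 'gene-gene' interaction parameters mentioned in the abstract.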
TA-GC cloning: A new simple and versatile technique for the directional cloning of PCR products for recombinant protein expression.

PubMed

Niarchos, Athanasios; Siora, Anastasia; Konstantinou, Evangelia; Kalampoki, Vasiliki; Lagoumintzis, George; Poulas, Konstantinos

2017-01-01

Over the last few decades, recombinant protein expression has found more and more applications. The cloning of protein-coding genes into expression vectors is required to be directional for proper expression, and versatile in order to facilitate gene insertion in multiple different vectors for expression tests. In this study, the TA-GC cloning method is proposed, as a new, simple and efficient method for the directional cloning of protein-coding genes in expression vectors. The presented method features several advantages over existing methods, which tend to be relatively more labour intensive, inflexible or expensive. The proposed method relies on the complementarity between single A- and G-overhangs of the protein-coding gene, obtained after a short incubation with T4 DNA polymerase, and T and C overhangs of the novel vector pET-BccI, created after digestion with the restriction endonuclease BccI. The novel protein-expression vector pET-BccI also facilitates the screening of transformed colonies for recombinant transformants. Evaluation experiments of the proposed TA-GC cloning method showed that 81% of the transformed colonies contained recombinant pET-BccI plasmids, and 98% of the recombinant colonies expressed the desired protein. This demonstrates that TA-GC cloning could be a valuable method for cloning protein-coding genes in expression vectors. PMID:29091919
Translatomics combined with transcriptomics and proteomics reveals novel functional, recently evolved orphan genes in Escherichia coli O157:H7 (EHEC).

PubMed

Neuhaus, Klaus; Landstorfer, Richard; Fellner, Lea; Simon, Svenja; Schafferhans, Andrea; Goldberg, Tatyana; Marx, Harald; Ozoline, Olga N; Rost, Burkhard; Kuster, Bernhard; Keim, Daniel A; Scherer, Siegfried

2016-02-24

Genomes of E. coli, including that of the human pathogen Escherichia coli O157:H7 (EHEC) EDL933, still harbor undetected protein-coding genes which, apparently, have escaped annotation due to their small size and non-essential function. To find such genes, global gene expression of EHEC EDL933 was examined, using strand-specific RNAseq (transcriptome), ribosomal footprinting (translatome) and mass spectrometry (proteome). Using the above methods, 72 short, non-annotated protein-coding genes were detected. All of these showed signals in the ribosomal footprinting assay indicating mRNA translation. Seven were verified by mass spectrometry. Fifty-seven genes are annotated in other Enterobacteriaceae, mainly as hypothetical genes; the remaining 15 genes constitute novel discoveries. In addition, protein structure and function were predicted computationally and compared between EHEC-encoded proteins and 100-times randomly shuffled proteins. Based on this comparison, 61 of the 72 novel proteins exhibit predicted structural and functional features similar to those of annotated proteins. Many of the novel genes show differential transcription when grown under eleven diverse growth conditions, suggesting environmental regulation. Three genes were found to confer a phenotype in previous studies, e.g., decreased cattle colonization. These findings demonstrate that ribosomal footprinting can be used to detect novel protein-coding genes, contributing to the growing body of evidence that hypothetical genes are not annotation artifacts and opening an additional way to study their functionality. All 72 genes are taxonomically restricted and, therefore, appear to have evolved relatively recently de novo.
Transcriptome interrogation of human myometrium identifies differentially expressed sense-antisense pairs of protein-coding and long non-coding RNA genes in spontaneous labor at term.

PubMed

Romero, Roberto; Tarca, Adi L; Chaemsaithong, Piya; Miranda, Jezid; Chaiworapongsa, Tinnakorn; Jia, Hui; Hassan, Sonia S; Kalita, Cynthia A; Cai, Juan; Yeo, Lami; Lipovich, Leonard

2014-09-01

To identify differentially expressed long non-coding RNA (lncRNA) genes in human myometrium in women with spontaneous labor at term. Myometrium was obtained from women undergoing cesarean deliveries who were not in labor (n = 19) and women in spontaneous labor at term (n = 20). RNA was extracted and profiled using an Illumina® microarray platform. We have used computational approaches to bound the extent of long non-coding RNA representation on this platform, and to identify co-differentially expressed and correlated pairs of long non-coding RNA genes and protein-coding genes sharing the same genomic loci. We identified co-differential expression and correlation at two genomic loci that contain coding-lncRNA gene pairs: SOCS2-AK054607 and LMCD1-NR_024065 in women in spontaneous labor at term. This co-differential expression and correlation was validated by qRT-PCR, an experimental method completely independent of the microarray analysis. Intriguingly, one of the two lncRNA genes differentially expressed in term labor had a key genomic structure element, a splice site, that lacked evolutionary conservation beyond primates. We provide, for the first time, evidence for coordinated differential expression and correlation of cis-encoded antisense lncRNAs and protein-coding genes with known as well as novel roles in pregnancy in the myometrium of women in spontaneous labor at term.
Behind the curtain of non-coding RNAs; long non-coding RNAs regulating hepatocarcinogenesis

PubMed Central

El Khodiry, Aya; Afify, Menna; El Tayebi, Hend M

2018-01-01

Hepatocellular carcinoma (HCC) is one of the most common and aggressive cancers worldwide. HCC is the fifth most common malignancy in the world and the second leading cause of cancer death in Asia. Long non-coding RNAs (lncRNAs) are RNAs with a length greater than 200 nucleotides that do not encode proteins. lncRNAs can regulate gene expression and protein synthesis in several ways by interacting with DNA, RNA and proteins in a sequence-specific manner. They could regulate cellular and developmental processes through either gene inhibition or gene activation. Many studies have shown that dysregulation of lncRNAs is related to many human diseases such as cardiovascular diseases, genetic disorders, neurological diseases, immune-mediated disorders and cancers. However, the study of lncRNAs is challenging as they are poorly conserved between species, their expression levels aren’t as high as those of mRNAs, and they show great interpatient variation. The study of lncRNA expression in cancers has been a breakthrough as it unveils potential biomarkers and drug targets for cancer therapy and helps understand the mechanism of pathogenesis. This review discusses many long non-coding RNAs and their contribution in HCC, their role in development, metastasis, and prognosis of HCC and how to regulate and target these lncRNAs as a therapeutic tool in HCC treatment in the future. PMID:29434445
The complete mitochondrial genome and phylogenetic analysis of the giant panda (Ailuropoda melanoleuca).

PubMed

Peng, Rui; Zeng, Bo; Meng, Xiuxiang; Yue, Bisong; Zhang, Zhihe; Zou, Fangdong

2007-08-01

The complete mitochondrial genome sequence of the giant panda, Ailuropoda melanoleuca, was determined by the long and accurate polymerase chain reaction (LA-PCR) with conserved primers and primer walking sequence methods. The complete mitochondrial DNA is 16,805 nucleotides in length and contains two ribosomal RNA genes, 13 protein-coding genes, 22 transfer RNA genes and one control region. The total length of the 13 protein-coding genes is longer than that of the American black bear, brown bear and polar bear by 3 amino acids at the end of the ND5 gene. The codon usage also followed the typical vertebrate pattern except for an unusual ATT start codon, which initiates the NADH dehydrogenase subunit 5 (ND5) gene. The molecular phylogenetic analysis was performed on the sequences of 12 concatenated heavy-strand encoded protein-coding genes, and suggested that the giant panda is most closely related to bears.
[Neuromuscular system and aging: involutions and implications].

PubMed

Paillard, Thierry

2013-12-01

In aged humans, the number of muscle fibers and motor units decreases. The remaining motor units lose functionality (decreased discharge frequency, greater fluctuation of the discharge), particularly those that contain type II fibers. The renewal of intracellular proteins declines, which creates a negative balance between daily protein losses and the capacity to renew them. The activity of the protein kinase (Akt) that stimulates the synthesis of regulation proteins (mTOR, p70S6, IGFBP-5) declines whereas the factors of degradation of proteins (NF-kappa B) are activated. Besides, the process of activation and proliferation of satellite cells is affected and the production of anabolic hormones and local factors is decreased. After a strength training program, muscle hypertrophy is linked to protein synthesis at the level of myosin heavy chain (MHC) isoforms in older subjects. However, the transcription of the genes that code for MHC-I (slow form) increases and the transcription of the genes that code for MHC-II (fast form) decreases. Thus, the transition of the phenotype towards a slower form cannot be reversed by strength training at an advanced age. Moreover, strength training decreases the proportion of fibers in transition that contain hybrid forms of MHC. Hence, strength training can stabilize the muscular phenotype, i.e. the different isoforms of MHC. In addition, strength training counteracts the harmful effects mentioned above by generating muscular hypertrophy through a reactive increase in the production of anabolic hormones. A program of aerobic training can induce an increase in the synthesis of messenger RNAs coding for isoforms related to oxidative metabolism (MHC-I and to a lesser extent MHC-IIa), while the transcripts for MHC-IIx decrease.
The Complete Mitochondrial DNA Sequence of Scenedesmus obliquus Reflects an Intermediate Stage in the Evolution of the Green Algal Mitochondrial Genome

PubMed Central

Nedelcu, Aurora M.; Lee, Robert W.; Lemieux, Claude; Gray, Michael W.; Burger, Gertraud

2000-01-01

Two distinct mitochondrial genome types have been described among the green algal lineages investigated to date: a reduced–derived, Chlamydomonas-like type and an ancestral, Prototheca-like type. To determine if this unexpected dichotomy is real or is due to insufficient or biased sampling and to define trends in the evolution of the green algal mitochondrial genome, we sequenced and analyzed the mitochondrial DNA (mtDNA) of Scenedesmus obliquus. This genome is 42,919 bp in size and encodes 42 conserved genes (i.e., large and small subunit rRNA genes, 27 tRNA and 13 respiratory protein-coding genes), four additional free-standing open reading frames with no known homologs, and an intronic reading frame with endonuclease/maturase similarity. No 5S rRNA or ribosomal protein-coding genes have been identified in Scenedesmus mtDNA. The standard protein-coding genes feature a deviant genetic code characterized by the use of UAG (normally a stop codon) to specify leucine, and the unprecedented use of UCA (normally a serine codon) as a signal for termination of translation. The mitochondrial genome of Scenedesmus combines features of both green algal mitochondrial genome types: the presence of a more complex set of protein-coding and tRNA genes is shared with the ancestral type, whereas the lack of 5S rRNA and ribosomal protein-coding genes as well as the presence of fragmented and scrambled rRNA genes are shared with the reduced–derived type of mitochondrial genome organization. Furthermore, the gene content and the fragmentation pattern of the rRNA genes suggest that this genome represents an intermediate stage in the evolutionary process of mitochondrial genome streamlining in green algae. [The sequence data described in this paper have been submitted to the GenBank data library under accession no. AF204057.] PMID:10854413
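
The effect of the deviant genetic code described above can be sketched in a few lines. The sketch builds the standard codon table and then applies only the two reassignments named in the abstract (UAG read as leucine, UCA read as a stop). The names `STANDARD`, `SCENEDESMUS` and `translate`, and the example RNA string, are illustrative assumptions and do not come from the paper.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Only the two changes stated in the abstract are applied here.
SCENEDESMUS = dict(STANDARD, UAG="L", UCA="*")


def translate(rna, table):
    """Translate an RNA string codon by codon until the first stop ('*')."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = table[rna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


rna = "AUGUAGGCUUCAUUU"  # hypothetical sequence: AUG UAG GCU UCA UUU
print(translate(rna, STANDARD))     # M   (UAG terminates translation)
print(translate(rna, SCENEDESMUS))  # MLA (UAG is leucine, UCA terminates)
```

Translating such a genome with the standard table would truncate proteins at every UAG and read through the true UCA terminators, which is why the correct table has to be chosen before annotation.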
Tenebrio molitor antifreeze protein gene identification and regulation.

PubMed

Qin, Wensheng; Walker, Virginia K

2006-02-15

The yellow mealworm, Tenebrio molitor, is a freeze susceptible, stored product pest. Its winter survival is facilitated by the accumulation of antifreeze proteins (AFPs), encoded by a small gene family. We have now isolated 11 different AFP genomic clones from 3 genomic libraries. All the clones had a single coding sequence, with no evidence of intervening sequences. Three genomic clones were further characterized. All have putative TATA box sequences upstream of the coding regions and multiple potential poly(A) signal sequences downstream of the coding regions. A TmAFP regulatory region, B1037, conferred transcriptional activity when ligated to a luciferase reporter sequence and after transfection into an insect cell line. A 143 bp core promoter including a TATA box sequence was identified. Its promoter activity was increased 4.4 times by inserting an exotic 245 bp intron into the construct, similar to the enhancement of transgenic expression seen in several other systems. The addition of a duplication of the first 120 bp sequence from the 143 bp core promoter decreased promoter activity by half. Although putative hormonal response sequences were identified, none of the five hormones tested enhanced reporter activity. These studies on the mechanisms of AFP transcriptional control are important for the consideration of any transfer of freeze-resistance phenotypes to beneficial hosts.
Transimulation - protein biosynthesis web service.

PubMed

Siwiak, Marlena; Zielenkiewicz, Piotr

2013-01-01

Although translation is the key step during gene expression, it remains poorly characterized at the level of individual genes. For this reason, we developed Transimulation - a web service measuring translational activity of genes in three model organisms: Escherichia coli, Saccharomyces cerevisiae and Homo sapiens. The calculations are based on our previous computational model of translation and experimental data sets. Transimulation quantifies mean translation initiation and elongation time (expressed in SI units), and the number of proteins produced per transcript. It also approximates the number of ribosomes that typically occupy a transcript during translation, and simulates their propagation. The simulation of ribosomes' movement is interactive and allows modifying the coding sequence on the fly. It also enables uploading any coding sequence and simulating its translation in one of three model organisms. In such a case, ribosomes propagate according to mean codon elongation times of the host organism, which may prove useful for heterologous expression. Transimulation was used to examine evolutionary conservation of translational parameters of orthologous genes. Transimulation may be accessed at http://nexus.ibb.waw.pl/Transimulation (requires Java version 1.7 or higher). Its manual and source code, distributed under the GPL-2.0 license, is freely available at the website.
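
The idea of propagating ribosomes according to mean codon elongation times can be illustrated with a very small sketch. It simply sums a per-codon time over a coding sequence. The function `elongation_time`, the codon times and the default value are made-up placeholders, not values or code from Transimulation.

```python
def elongation_time(cds, codon_times, default):
    """Sum mean elongation times (in seconds) over the complete codons of cds.

    codon_times maps codons to a mean time; codons missing from the
    map fall back to `default`. Trailing bases that do not form a
    full codon are ignored.
    """
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    return sum(codon_times.get(codon, default) for codon in codons)


# Hypothetical host-specific mean codon times, in seconds.
host_times = {"AUG": 0.10, "GCU": 0.05}
total = elongation_time("AUGGCUGCUAAA", host_times, default=0.20)
print(round(total, 2))  # 0.4
```

Swapping in a different `codon_times` table for another host is the kind of comparison that would matter for heterologous expression.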
Emerging Putative Associations between Non-Coding RNAs and Protein-Coding Genes in Neuropathic Pain: Added Value from Reusing Microarray Data

PubMed Central

Raju, Hemalatha B.; Tsinoremas, Nicholas F.; Capobianco, Enrico

2016-01-01

Regeneration of injured nerves is likely to occur in the peripheral nervous system, but not in the central nervous system. Although protein-coding gene expression has been assessed during nerve regeneration, little is currently known about the role of non-coding RNAs (ncRNAs). This leaves open questions about the potential effects of ncRNAs at transcriptome level. Due to the limited availability of human neuropathic pain (NP) data, we have identified the most comprehensive time-course gene expression profile related to sciatic nerve (SN) injury and studied in a rat model using two neuronal tissues, namely dorsal root ganglion (DRG) and SN. We have developed a methodology to identify differentially expressed bioentities starting from microarray probes and repurposing them to annotate ncRNAs, while analyzing the expression profiles of protein-coding genes. The approach is designed to reuse microarray data and perform first profiling and then meta-analysis through three main steps. First, we used contextual analysis to identify what we considered putative or potential protein-coding targets for selected ncRNAs. Relevance was therefore assigned to differential expression of neighbor protein-coding genes, with neighborhood defined by a fixed genomic distance from long or antisense ncRNA loci, and of parental genes associated with pseudogenes. Second, connectivity among putative targets was used to build networks, in turn useful to conduct inference at interactomic scale. Last, network paths were annotated to assess relevance to NP. We found significant differential expression in long-intergenic ncRNAs (32 lincRNAs in SN and 8 in DRG), antisense RNA (31 asRNA in SN and 12 in DRG), and pseudogenes (456 in SN and 56 in DRG). In particular, contextual analysis centered on pseudogenes revealed some targets with known association to neurodegeneration and/or neurogenesis processes. While modules of the olfactory receptors were clearly identified in protein–protein interaction networks, other connectivity paths were identified between proteins already investigated in studies on disorders such as Parkinson's disease, Down syndrome, Huntington's disease, and Alzheimer's disease. Our findings suggest the importance of reusing gene expression data by meta-analysis approaches. PMID:27803687
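
The "neighborhood defined by a fixed genomic distance" step can be sketched as a simple interval query. The function `neighbors_within`, the coordinate layout (chromosome, start, end) and all gene names and positions below are hypothetical; the abstract does not state the distance the authors used.

```python
def neighbors_within(ncrna, genes, max_distance):
    """Return names of genes on the same chromosome as the ncRNA whose
    interval lies within max_distance bp of it (overlap counts as 0)."""
    chrom, start, end = ncrna
    hits = []
    for name, g_chrom, g_start, g_end in genes:
        if g_chrom != chrom:
            continue
        # Gap between the two intervals; 0 when they overlap.
        gap = max(g_start - end, start - g_end, 0)
        if gap <= max_distance:
            hits.append(name)
    return hits


# Hypothetical ncRNA locus and protein-coding gene coordinates.
lncrna = ("chr1", 1000, 2000)
coding_genes = [
    ("GeneA", "chr1", 2500, 3000),    # 500 bp downstream
    ("GeneB", "chr1", 50000, 51000),  # far away
    ("GeneC", "chr2", 1500, 1800),    # different chromosome
    ("GeneD", "chr1", 1500, 1700),    # overlapping
]
print(neighbors_within(lncrna, coding_genes, max_distance=1000))
# ['GeneA', 'GeneD']
```

The resulting neighbor lists would then be intersected with the differentially expressed protein-coding genes to nominate putative targets.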

A chromatin activity based chemoproteomic approach reveals a transcriptional repressome for gene-specific silencing

PubMed Central

Liu, Cui; Yu, Yanbao; Liu, Feng; Wei, Xin; Wrobel, John A.; Gunawardena, Harsha P.; Zhou, Li; Jin, Jian; Chen, Xian

2015-01-01

Immune cells develop endotoxin tolerance (ET) after prolonged stimulation. ET increases the level of a repression mark H3K9me2 in the transcriptional-silent chromatin specifically associated with pro-inflammatory genes. However, it is not clear what proteins are functionally involved in this process. Here we show that a novel chromatin activity based chemoproteomic (ChaC) approach can dissect the functional chromatin protein complexes that regulate ET-associated inflammation. Using UNC0638 that binds the enzymatically active H3K9-specific methyltransferase G9a/GLP, ChaC reveals that G9a is constitutively active at a G9a-dependent mega-dalton repressome in primary endotoxin-tolerant macrophages. G9a/GLP broadly impacts the ET-specific reprogramming of the histone code landscape, chromatin remodeling, and the activities of select transcription factors. We discover that the G9a-dependent epigenetic environment promotes the transcriptional repression activity of c-Myc for gene-specific co-regulation of chronic inflammation. ChaC may be also applicable to dissect other functional protein complexes in the context of phenotypic chromatin architectures. PMID:25502336
Deep transcriptome annotation enables the discovery and functional characterization of cryptic small proteins

PubMed Central

Delcourt, Vivian; Lucier, Jean-François; Gagnon, Jules; Beaudoin, Maxime C; Vanderperre, Benoît; Breton, Marc-André; Motard, Julie; Jacques, Jean-François; Brunelle, Mylène; Gagnon-Arsenault, Isabelle; Fournier, Isabelle; Ouangraoua, Aida; Hunting, Darel J; Cohen, Alan A; Landry, Christian R; Scott, Michelle S

2017-01-01

Recent functional, proteomic and ribosome profiling studies in eukaryotes have concurrently demonstrated the translation of alternative open-reading frames (altORFs) in addition to annotated protein coding sequences (CDSs). We show that a large number of small proteins could in fact be coded by these altORFs. The putative alternative proteins translated from altORFs have orthologs in many species and contain functional domains. Evolutionary analyses indicate that altORFs often show more extreme conservation patterns than their CDSs. Thousands of alternative proteins are detected in proteomic datasets by reanalysis using a database containing predicted alternative proteins. This is illustrated with specific examples, including altMiD51, a 70 amino acid mitochondrial fission-promoting protein encoded in MiD51/Mief1/SMCR7L, a gene encoding an annotated protein promoting mitochondrial fission. Our results suggest that many genes are multicoding genes and code for a large protein and one or several small proteins. PMID:29083303
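
As a rough illustration of what an altORF is, the sketch below scans the three forward reading frames of a transcript for ATG-to-stop open reading frames and reports those that differ from the annotated CDS. It is a deliberately simplified assumption-laden toy: the functions `find_orfs` and `alt_orfs`, the minimum length and the sequence are hypothetical and do not reproduce the authors' annotation pipeline.

```python
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return sorted (start, end) spans of ATG...stop ORFs in the three
    forward frames (0-based, end exclusive, stop codon included)."""
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


def alt_orfs(seq, annotated_cds, min_codons=3):
    """ORFs other than the annotated CDS span."""
    return [orf for orf in find_orfs(seq, min_codons) if orf != annotated_cds]


transcript = "ATGCATGAAACCCTGATTTAA"  # hypothetical sequence
print(find_orfs(transcript))          # [(0, 21), (4, 16)]
print(alt_orfs(transcript, (0, 21)))  # [(4, 16)]
```

Here the second ORF lies in a different frame and overlaps the annotated CDS, which is one of the configurations a multicoding gene can take.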
A second gene for acyl-(acyl-carrier-protein): glycerol-3-phosphate acyltransferase in squash, Cucurbita moschata cv. Shirogikuza(*), codes for an oleate-selective isozyme: molecular cloning and protein purification studies.

PubMed

Nishida, I; Sugiura, M; Enju, A; Nakamura, M

2000-12-01

A new isogene for acyl-(acyl-carrier-protein):glycerol-3-phosphate acyltransferase (GPAT; EC 2.3.1.15) in squash has been cloned and the gene product was identified as oleate-selective GPAT. Using PCR primers that could hybridise with exons for a previously cloned squash GPAT, we obtained two PCR products of different size: one coded for a previously cloned squash GPAT corresponding to non-selective isoforms AT2 and AT3, and the other for a new isozyme, probably the oleate-selective isoform AT1. Full-length amino acid sequences of respective isozymes were deduced from the nucleotide sequences of genomic genes and cDNAs, which were cloned by a series of PCR-based methods. Thus, we designated the new gene CmATS1;1 and the other one CmATS1;2. Genome blot analysis revealed that the squash genome contained the two isogenes at non-allelic loci. AT1-active fractions were partially purified, and three polypeptide bands were identified as being AT1 polypeptides, which exhibited relative molecular masses of 39.5-40.5 kDa, pI values of 6.75-7.15, and oleate selectivity over palmitate. Partial amino-terminal sequences obtained from two of these bands verified that the new isogene codes for AT1 polypeptides.
Maternal transcription of non-protein coding RNAs from the PWS-critical region rescues growth retardation in mice.

PubMed

Rozhdestvensky, Timofey S; Robeck, Thomas; Galiveti, Chenna R; Raabe, Carsten A; Seeger, Birte; Wolters, Anna; Gubar, Leonid V; Brosius, Jürgen; Skryabin, Boris V

2016-02-05

Prader-Willi syndrome (PWS) is a neurogenetic disorder caused by loss of paternally expressed genes on chromosome 15q11-q13. The PWS-critical region (PWScr) contains an array of non-protein coding IPW-A exons hosting intronic SNORD116 snoRNA genes. Deletion of PWScr is associated with PWS in humans and growth retardation in mice exhibiting ~15% postnatal lethality in C57BL/6 background. Here we analysed a knock-in mouse containing a 5'HPRT-LoxP-Neo(R) cassette (5'LoxP) inserted upstream of the PWScr. When the insertion was inherited maternally in a paternal PWScr-deletion mouse model (PWScr(p-/m5'LoxP)), we observed compensation of growth retardation and postnatal lethality. Genomic methylation pattern and expression of protein-coding genes remained unaltered at the PWS-locus of PWScr(p-/m5'LoxP) mice. Interestingly, ubiquitous Snord116 and IPW-A exon transcription from the originally silent maternal chromosome was detected. In situ hybridization indicated that PWScr(p-/m5'LoxP) mice expressed Snord116 in brain areas similar to wild type animals. Our results suggest that the lack of PWScr RNA expression in certain brain areas could be a primary cause of the growth retardation phenotype in mice. We propose that activation of disease-associated genes on imprinted regions could lead to general therapeutic strategies in associated diseases.
Expression of lysozymes from Erwinia amylovora phages and Erwinia genomes and inhibition by a bacterial protein.

PubMed

Müller, Ina; Gernold, Marina; Schneider, Bernd; Geider, Klaus

2012-01-01

Genes coding for lysozyme-inhibiting proteins (Ivy) were cloned from the chromosomes of the plant pathogens Erwinia amylovora and Erwinia pyrifoliae. The product interfered not only with activity of hen egg white lysozyme, but also with an enzyme from E. amylovora phage ΦEa1h. We have expressed lysozyme genes from the genomes of three Erwinia species in Escherichia coli. The lysozymes expressed from genes of the E. amylovora phages ΦEa104 and ΦEa116, Erwinia chromosomes and Arabidopsis thaliana were not affected by Ivy. The enzyme from bacteriophage ΦEa1h was fused at the N- or C-terminus to other peptides. Compared to the intact lysozyme, a His-tag reduced its lytic activity about 10-fold and larger fusion proteins abolished activity completely. Specific protease cleavage restored lysozyme activity of a GST-fusion. The bacteriophage-encoded lysozymes were more active than the enzymes from bacterial chromosomes. Viral lyz genes were inserted into a broad-host range vector, and transfer to E. amylovora inhibited cell growth. Inserted in the yeast Pichia pastoris, the ΦEa1h-lysozyme was secreted and also inhibited by Ivy. Here we describe expression of unrelated cloned 'silent' lyz genes from Erwinia chromosomes and a novel interference of bacterial Ivy proteins with a viral lysozyme. Copyright © 2012 S. Karger AG, Basel.
Experimental annotation of post-translational features and translated coding regions in the pathogen Salmonella Typhimurium

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ansong, Charles; Tolic, Nikola; Purvine, Samuel O.

Complete and accurate genome annotation is crucial for comprehensive and systematic studies of biological systems. For example systems biology-oriented genome scale modeling efforts greatly benefit from accurate annotation of protein-coding genes to develop proper functioning models. However, determining protein-coding genes for most new genomes is almost completely performed by inference, using computational predictions with significant documented error rates (> 15%). Furthermore, gene prediction programs provide no information on biologically important post-translational processing events critical for protein function. With the ability to directly measure peptides arising from expressed proteins, mass spectrometry-based proteomics approaches can be used to augment and verify coding regions of a genomic sequence and importantly detect post-translational processing events. In this study we utilized “shotgun” proteomics to guide accurate primary genome annotation of the bacterial pathogen Salmonella Typhimurium 14028 to facilitate a systems-level understanding of Salmonella biology. The data provides protein-level experimental confirmation for 44% of predicted protein-coding genes, suggests revisions to 48 genes assigned incorrect translational start sites, and uncovers 13 non-annotated genes missed by gene prediction programs. We also present a comprehensive analysis of post-translational processing events in Salmonella, revealing a wide range of complex chemical modifications (70 distinct modifications) and confirming more than 130 signal peptide and N-terminal methionine cleavage events in Salmonella. This study highlights several ways in which proteomics data applied during the primary stages of annotation can improve the quality of genome annotations, especially with regards to the annotation of mature protein products.
Complete mitochondrial genome of the agarophyte red alga Gelidium vagum (Gelidiales).

PubMed

Yang, Eun Chan; Kim, Kyeong Mi; Boo, Ga Hun; Lee, Jung-Hyun; Boo, Sung Min; Yoon, Hwan Su

2014-08-01

We describe the first complete mitochondrial genome of Gelidium vagum (Gelidiales) (24,901 bp, 30.4% GC content), an agar-producing red alga. The circular mitochondrial genome contains 43 genes, including 23 protein-coding, 18 tRNA and 2 rRNA genes. All the protein-coding genes have a typical ATG start codon. No introns were found. Two genes, secY and rps12, were overlapped by 41 bp.
Long non-coding RNA expression patterns in lung tissues of chronic cigarette smoke induced COPD mouse model.

PubMed

Zhang, Haiyun; Sun, Dejun; Li, Defu; Zheng, Zeguang; Xu, Jingyi; Liang, Xue; Zhang, Chenting; Wang, Sheng; Wang, Jian; Lu, Wenju

2018-05-15

Long non-coding RNAs (lncRNAs) have critical regulatory roles in protein-coding gene expression. Aberrant expression profiles of lncRNAs have been observed in various human diseases. In this study, we investigated transcriptome profiles in lung tissues of a chronic cigarette smoke (CS)-induced COPD mouse model. We found that 109 lncRNAs and 260 mRNAs were significantly differentially expressed in lungs of the chronic CS-induced COPD mouse model compared with control animals. GO and KEGG analyses indicated that the protein-coding genes associated with differentially expressed lncRNAs were mainly involved in the protein processing in endoplasmic reticulum pathway and the taurine and hypotaurine metabolism pathway. The combination of high-throughput data analysis and qRT-PCR validation in lungs of the chronic CS-induced COPD mouse model, 16HBE cells treated with CSE, and PBMCs from patients with COPD revealed that NR_102714 and its associated protein-coding gene UCHL1 might be involved in the development of COPD in both mouse and human. In conclusion, our study demonstrated that aberrant expression profiles of lncRNAs and mRNAs existed in lungs of the chronic CS-induced COPD mouse model. From the perspective of animal models, these results might provide further clues for investigating the biological functions of lncRNAs and their potential target protein-coding genes in the pathogenesis of COPD.
Successful Recovery of Nuclear Protein-Coding Genes from Small Insects in Museums Using Illumina Sequencing.

PubMed

Kanda, Kojun; Pflug, James M; Sproul, John S; Dasenko, Mark A; Maddison, David R

2015-01-01

In this paper we explore high-throughput Illumina sequencing of nuclear protein-coding, ribosomal, and mitochondrial genes in small, dried insects stored in natural history collections. We sequenced one tenebrionid beetle and 12 carabid beetles ranging in size from 3.7 to 9.7 mm in length that have been stored in various museums for 4 to 84 years. Although we chose a number of old, small specimens for which we expected low sequence recovery, we successfully recovered at least some low-copy nuclear protein-coding genes from all specimens. For example, in one 56-year-old beetle, 4.4 mm in length, our de novo assembly recovered about 63% of approximately 41,900 nucleotides in a target suite of 67 nuclear protein-coding gene fragments, and 70% using a reference-based assembly. Even in the least successfully sequenced carabid specimen, reference-based assembly yielded fragments that were at least 50% of the target length for 34 of 67 nuclear protein-coding gene fragments. Exploration of alternative references for reference-based assembly revealed few signs of bias created by the reference. For all specimens we recovered almost complete copies of ribosomal and mitochondrial genes. We verified the general accuracy of the sequences through comparisons with sequences obtained from PCR and Sanger sequencing, including of conspecific, fresh specimens, and through phylogenetic analysis that tested the placement of sequences in predicted regions. A few possible inaccuracies in the sequences were detected, but these rarely affected the phylogenetic placement of the samples. 
Although our sample sizes are low, an exploratory regression study suggests that the dominant factor in predicting success at recovering nuclear protein-coding genes is a high number of Illumina reads, with success at PCR of COI and killing by immersion in ethanol being secondary factors; in analyses of only high-read samples, the primary significant explanatory variable was body length, with small beetles being more successfully sequenced.
Successful Recovery of Nuclear Protein-Coding Genes from Small Insects in Museums Using Illumina Sequencing

PubMed Central

Dasenko, Mark A.

2015-01-01

In this paper we explore high-throughput Illumina sequencing of nuclear protein-coding, ribosomal, and mitochondrial genes in small, dried insects stored in natural history collections. We sequenced one tenebrionid beetle and 12 carabid beetles ranging in size from 3.7 to 9.7 mm in length that have been stored in various museums for 4 to 84 years. Although we chose a number of old, small specimens for which we expected low sequence recovery, we successfully recovered at least some low-copy nuclear protein-coding genes from all specimens. For example, in one 56-year-old beetle, 4.4 mm in length, our de novo assembly recovered about 63% of approximately 41,900 nucleotides in a target suite of 67 nuclear protein-coding gene fragments, and 70% using a reference-based assembly. Even in the least successfully sequenced carabid specimen, reference-based assembly yielded fragments that were at least 50% of the target length for 34 of 67 nuclear protein-coding gene fragments. Exploration of alternative references for reference-based assembly revealed few signs of bias created by the reference. For all specimens we recovered almost complete copies of ribosomal and mitochondrial genes. We verified the general accuracy of the sequences through comparisons with sequences obtained from PCR and Sanger sequencing, including of conspecific, fresh specimens, and through phylogenetic analysis that tested the placement of sequences in predicted regions. A few possible inaccuracies in the sequences were detected, but these rarely affected the phylogenetic placement of the samples. 
Although our sample sizes are low, an exploratory regression study suggests that the dominant factor in predicting success at recovering nuclear protein-coding genes is a high number of Illumina reads, with success at PCR of COI and killing by immersion in ethanol being secondary factors; in analyses of only high-read samples, the primary significant explanatory variable was body length, with small beetles being more successfully sequenced. PMID:26716693
Origin and evolution of the long non-coding genes in the X-inactivation center.

PubMed

Romito, Antonio; Rougeulle, Claire

2011-11-01

Random X chromosome inactivation (XCI), the eutherian mechanism of X-linked gene dosage compensation, is controlled by a cis-acting locus termed the X-inactivation center (Xic). One of the striking features that characterize the Xic landscape is the abundance of loci transcribing non-coding RNAs (ncRNAs), including Xist, the master regulator of the inactivation process. Recent comparative genomic analyses have depicted the evolutionary scenario behind the origin of the X-inactivation center, revealing that this locus evolved from a region harboring protein-coding genes. During mammalian radiation, this ancestral protein-coding region was disrupted in the marsupial group, whilst it provided in eutherian lineage the starting material for the non-translated RNAs of the X-inactivation center. The emergence of non-coding genes occurred by a dual mechanism involving loss of protein-coding function of the pre-existing genes and integration of different classes of mobile elements, some of which modeled the structure and sequence of the non-coding genes in a species-specific manner. The rising genes started to produce transcripts that acquired function in regulating the epigenetic status of the X chromosome, as shown for Xist, its antisense Tsix, Jpx, and recently suggested for Ftx. Thus, the appearance of the Xic, which occurred after the divergence between eutherians and marsupials, was the basis for the evolution of random X inactivation as a strategy to achieve dosage compensation. Copyright © 2011. Published by Elsevier Masson SAS.
Integrative structural annotation of de novo RNA-Seq provides an accurate reference gene set of the enormous genome of the onion (Allium cepa L.)

PubMed Central

Kim, Seungill; Kim, Myung-Shin; Kim, Yong-Min; Yeom, Seon-In; Cheong, Kyeongchae; Kim, Ki-Tae; Jeon, Jongbum; Kim, Sunggil; Kim, Do-Sun; Sohn, Seong-Han; Lee, Yong-Hwan; Choi, Doil

2015-01-01

The onion (Allium cepa L.) is one of the most widely cultivated and consumed vegetable crops in the world. Although a considerable amount of onion transcriptome data has been deposited into public databases, the sequences of the protein-coding genes are not accurate enough to be used, owing to non-coding sequences intermixed with the coding sequences. We generated a high-quality, annotated onion transcriptome from de novo sequence assembly and intensive structural annotation using the integrated structural gene annotation pipeline (ISGAP), which identified 54,165 protein-coding genes among 165,179 assembled transcripts totalling 203.0 Mb by eliminating the intron sequences. ISGAP performed reliable annotation, recognizing accurate gene structures based on reference proteins, and ab initio gene models of the assembled transcripts. Integrative functional annotation and gene-based SNP analysis revealed a whole biological repertoire of genes and transcriptomic variation in the onion. The method developed in this study provides a powerful tool for the construction of reference gene sets for organisms based solely on de novo transcriptome data. Furthermore, the reference genes and their variation described here for the onion represent essential tools for molecular breeding and gene cloning in Allium spp. PMID:25362073
Structure of Thermotoga maritima Stationary Phase Survival Protein SurE: A Novel Acid Phosphatase

PubMed Central

Zhang, R.-G.; Skarina, T.; Katz, J.E.; Beasley, S.; Khachatryan, A.; Vyas, S.; Arrowsmith, C.H.; Clarke, S.; Edwards, A.; Joachimiak, A.; Savchenko, A.

2009-01-01

Background: The rpoS, nlpD, pcm, and surE genes are among many whose expression is induced during the stationary phase of bacterial growth. rpoS codes for the stationary-phase RNA polymerase σ subunit, and nlpD codes for a lipoprotein. The pcm gene product repairs damaged proteins by converting the atypical isoaspartyl residues back to L-aspartyls. The physiological and biochemical functions of surE are unknown, but its importance in stress is supported by the duplication of the surE gene in E. coli subjected to high-temperature growth. The pcm and surE genes are highly conserved in bacteria, archaea, and plants. Results: The structure of SurE from Thermotoga maritima was determined at 2.0 Å. The SurE monomer is composed of two domains: a conserved N-terminal domain, a Rossmann fold, and a C-terminal oligomerization domain, a new fold. Monomers form a dimer that assembles into a tetramer. Biochemical analysis suggests that SurE is an acid phosphatase, with an optimum pH of 5.5–6.2. The active site was identified in the N-terminal domain through analysis of conserved residues. Structure-based site-directed point mutations abolished phosphatase activity. T. maritima SurE intra- and inter-subunit salt bridges were identified that may explain the SurE thermostability. Conclusions: The structure of SurE provided information about the protein’s fold, oligomeric state, and active site. The protein possessed magnesium-dependent acid phosphatase activity, but the physiologically relevant substrate(s) remains to be identified. The importance of three of the assigned active site residues in catalysis was confirmed by site-directed mutagenesis. PMID:11709173
Mutation Update of ARSA and PSAP Genes Causing Metachromatic Leukodystrophy.

PubMed

Cesani, Martina; Lorioli, Laura; Grossi, Serena; Amico, Giulia; Fumagalli, Francesca; Spiga, Ivana; Filocamo, Mirella; Biffi, Alessandra

2016-01-01

Metachromatic leukodystrophy is a neurodegenerative disorder characterized by progressive demyelination. The disease is caused by variants in the ARSA gene, which codes for the lysosomal enzyme arylsulfatase A, or, more rarely, in the PSAP gene, which codes for the activator protein saposin B. In this Mutation Update, an extensive review of all the ARSA- and PSAP-causative variants published in the literature to date, accounting for a total of 200 ARSA and 10 PSAP allele types, is presented. The detailed ARSA and PSAP variant lists are freely available on the Leiden Online Variation Database (LOVD) platform at http://www.LOVD.nl/ARSA and http://www.LOVD.nl/PSAP, respectively. © 2015 WILEY PERIODICALS, INC.
Chromosome preference of disease genes and vectorization for the prediction of non-coding disease genes.

PubMed

Peng, Hui; Lan, Chaowang; Liu, Yuansheng; Liu, Tao; Blumenstein, Michael; Li, Jinyan

2017-10-03

Disease-related protein-coding genes have been widely studied, but disease-related non-coding genes remain largely unknown. This work introduces a new vector to represent diseases, and applies the newly vectorized data for a positive-unlabeled learning algorithm to predict and rank disease-related long non-coding RNA (lncRNA) genes. This novel vector representation for diseases consists of two sub-vectors. The first is composed of 45 elements characterizing the information entropies of the disease gene distribution over 45 chromosome substructures. This idea is supported by our observation that some substructures (e.g., the chromosome 6 p-arm) are highly preferred by disease-related protein-coding genes, while some (e.g., the 21 p-arm) are not favored at all. The second sub-vector is 30-dimensional, characterizing the distribution of disease gene enriched KEGG pathways in comparison with our manually created pathway groups. The second sub-vector complements the first to differentiate between various diseases. Our prediction method outperforms the state-of-the-art methods on benchmark datasets for prioritizing disease-related lncRNA genes. The method also works well when only the sequence information of an lncRNA gene is known, or even when a given disease has no currently recognized long non-coding genes.
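As a rough illustration of the 45-element entropy sub-vector described above, the sketch below computes one entropy term per chromosome substructure. The abstract does not give the exact formula, so this assumes each element is the term -p·log2(p), where p is the fraction of a disease's genes located in that substructure. The function name, the four-arm toy list, and the gene placements are hypothetical.

```python
import math
from collections import Counter

def entropy_subvector(gene_substructures, substructures):
    """Return one entropy term per substructure.

    Assumption for this sketch: element i is -p_i * log2(p_i), where p_i is
    the share of the disease's genes found in substructure i; empty
    substructures contribute 0.
    """
    counts = Counter(gene_substructures)
    total = sum(counts[s] for s in substructures)
    vector = []
    for s in substructures:
        p = counts[s] / total if total else 0.0
        vector.append(-p * math.log2(p) if p > 0 else 0.0)
    return vector

# Hypothetical toy input: four substructures instead of 45, and the arm on
# which each of four disease genes lies.
arms = ["1p", "1q", "6p", "21p"]
disease_gene_arms = ["6p", "6p", "1q", "6p"]
vec = entropy_subvector(disease_gene_arms, arms)
total_entropy = sum(vec)
```

Summing the elements gives the Shannon entropy of the gene distribution, so a disease whose genes concentrate on one arm yields a vector with few non-zero terms and low total entropy.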
Chromosome preference of disease genes and vectorization for the prediction of non-coding disease genes

PubMed Central

Peng, Hui; Lan, Chaowang; Liu, Yuansheng; Liu, Tao; Blumenstein, Michael; Li, Jinyan

2017-01-01

Disease-related protein-coding genes have been widely studied, but disease-related non-coding genes remain largely unknown. This work introduces a new vector to represent diseases, and applies the newly vectorized data for a positive-unlabeled learning algorithm to predict and rank disease-related long non-coding RNA (lncRNA) genes. This novel vector representation for diseases consists of two sub-vectors. The first is composed of 45 elements characterizing the information entropies of the disease gene distribution over 45 chromosome substructures. This idea is supported by our observation that some substructures (e.g., the chromosome 6 p-arm) are highly preferred by disease-related protein-coding genes, while some (e.g., the 21 p-arm) are not favored at all. The second sub-vector is 30-dimensional, characterizing the distribution of disease gene enriched KEGG pathways in comparison with our manually created pathway groups. The second sub-vector complements the first to differentiate between various diseases. Our prediction method outperforms the state-of-the-art methods on benchmark datasets for prioritizing disease-related lncRNA genes. The method also works well when only the sequence information of an lncRNA gene is known, or even when a given disease has no currently recognized long non-coding genes. PMID:29108274
AMP-Activated Protein Kinase Interacts with the Peroxisome Proliferator-Activated Receptor Delta to Induce Genes Affecting Fatty Acid Oxidation in Human Macrophages.

PubMed

Kemmerer, Marina; Finkernagel, Florian; Cavalcante, Marcela Frota; Abdalla, Dulcineia Saes Parra; Müller, Rolf; Brüne, Bernhard; Namgaladze, Dmitry

2015-01-01

AMP-activated protein kinase (AMPK) maintains energy homeostasis by suppressing cellular ATP-consuming processes and activating catabolic, ATP-producing pathways such as fatty acid oxidation (FAO). The transcription factor peroxisome proliferator-activated receptor δ (PPARδ) also affects fatty acid metabolism, stimulating the expression of genes involved in FAO. To examine the interplay of AMPK and PPARδ in human macrophages we transduced primary human macrophages with lentiviral particles encoding the constitutively active AMPKα1 catalytic subunit, followed by microarray expression analysis after treatment with the PPARδ agonist GW501516. Microarray analysis showed that co-activation of AMPK and PPARδ increased expression of FAO genes, which were validated by quantitative PCR. Induction of these FAO-associated genes was also observed upon infecting macrophages with an adenovirus coding for the AMPKγ1 regulatory subunit carrying an activating R70Q mutation. The pharmacological AMPK activator A-769662 increased expression of several FAO genes in a PPARδ- and AMPK-dependent manner. Although GW501516 significantly increased FAO and reduced the triglyceride amount in very low density lipoprotein (VLDL)-loaded foam cells, AMPK activation failed to potentiate this effect, suggesting that increased expression of fatty acid catabolic genes alone may not be sufficient to prevent macrophage lipid overload.
AMP-Activated Protein Kinase Interacts with the Peroxisome Proliferator-Activated Receptor Delta to Induce Genes Affecting Fatty Acid Oxidation in Human Macrophages

PubMed Central

Kemmerer, Marina; Finkernagel, Florian; Cavalcante, Marcela Frota; Abdalla, Dulcineia Saes Parra; Müller, Rolf; Brüne, Bernhard; Namgaladze, Dmitry

2015-01-01

AMP-activated protein kinase (AMPK) maintains energy homeostasis by suppressing cellular ATP-consuming processes and activating catabolic, ATP-producing pathways such as fatty acid oxidation (FAO). The transcription factor peroxisome proliferator-activated receptor δ (PPARδ) also affects fatty acid metabolism, stimulating the expression of genes involved in FAO. To examine the interplay of AMPK and PPARδ in human macrophages we transduced primary human macrophages with lentiviral particles encoding the constitutively active AMPKα1 catalytic subunit, followed by microarray expression analysis after treatment with the PPARδ agonist GW501516. Microarray analysis showed that co-activation of AMPK and PPARδ increased expression of FAO genes, which were validated by quantitative PCR. Induction of these FAO-associated genes was also observed upon infecting macrophages with an adenovirus coding for the AMPKγ1 regulatory subunit carrying an activating R70Q mutation. The pharmacological AMPK activator A-769662 increased expression of several FAO genes in a PPARδ- and AMPK-dependent manner. Although GW501516 significantly increased FAO and reduced the triglyceride amount in very low density lipoprotein (VLDL)-loaded foam cells, AMPK activation failed to potentiate this effect, suggesting that increased expression of fatty acid catabolic genes alone may not be sufficient to prevent macrophage lipid overload. PMID:26098914
A T-DNA gene required for agropine biosynthesis by transformed plants is functionally and evolutionarily related to a Ti plasmid gene required for catabolism of agropine by Agrobacterium strains.

PubMed Central

Hong, S B; Hwang, I; Dessaux, Y; Guyon, P; Kim, K S; Farrand, S K

1997-01-01

The mechanisms that ensure that Ti plasmid T-DNA genes encoding proteins involved in the biosynthesis of opines in crown gall tumors are always matched by Ti plasmid genes conferring the ability to catabolize that set of opines on the inducing Agrobacterium strains are unknown. The pathway for the biosynthesis of the opine agropine is thought to require an enzyme, mannopine cyclase, coded for by the ags gene located in the T(R) region of octopine-type Ti plasmids. Extracts prepared from agropine-type tumors contained an activity that cyclized mannopine to agropine. Tumor cells containing a T region in which ags was mutated lacked this activity and did not contain agropine. Expression of ags from the lac promoter conferred mannopine-lactonizing activity on Escherichia coli. Agrobacterium tumefaciens strains harboring an octopine-type Ti plasmid exhibit a similar activity which is not coded for by ags. Analysis of the DNA sequence of the gene encoding this activity, called agcA, showed it to be about 60% identical to T-DNA ags genes. Relatedness decreased abruptly in the 5' and 3' untranslated regions of the genes. ags is preceded by a promoter that functions only in the plant. Expression analysis showed that agcA also is preceded by its own promoter, which is active in the bacterium. Translation of agcA yielded a protein of about 45 kDa, consistent with the size predicted from the DNA sequence. Antibodies raised against the agcA product cross-reacted with the anabolic enzyme. These results indicate that the agropine system arose by a duplication of a progenitor gene, one copy of which became associated with the T-DNA and the other copy of which remained associated with the bacterium. PMID:9244272
An essential role for the RNA-binding protein Smaug during the Drosophila maternal-to-zygotic transition.

PubMed

Benoit, Beatrice; He, Chun Hua; Zhang, Fan; Votruba, Sarah M; Tadros, Wael; Westwood, J Timothy; Smibert, Craig A; Lipshitz, Howard D; Theurkauf, William E

2009-03-01

Genetic control of embryogenesis switches from the maternal to the zygotic genome during the maternal-to-zygotic transition (MZT), when maternal mRNAs are destroyed, high-level zygotic transcription is initiated, the replication checkpoint is activated and the cell cycle slows. The midblastula transition (MBT) is the first morphological event that requires zygotic gene expression. The Drosophila MBT is marked by blastoderm cellularization and follows 13 cleavage-stage divisions. The RNA-binding protein Smaug is required for cleavage-independent maternal transcript destruction during the Drosophila MZT. Here, we show that smaug mutants also disrupt syncytial blastoderm stage cell-cycle delays, DNA replication checkpoint activation, cellularization, and high-level zygotic expression of protein coding and micro RNA genes. We also show that Smaug protein levels increase through the cleavage divisions and peak when the checkpoint is activated and zygotic transcription initiates, and that transgenic expression of Smaug in an anterior-to-posterior gradient produces a concomitant gradient in the timing of maternal transcript destruction, cleavage cell cycle delays, zygotic gene transcription, cellularization and gastrulation. Smaug accumulation thus coordinates progression through the MZT.

The Mitochondrial Cytochrome Oxidase Subunit I Gene Occurs on a Minichromosome with Extensive Heteroplasmy in Two Species of Chewing Lice, Geomydoecus aurei and Thomomydoecus minor

PubMed Central

Pietan, Lucas L.; Spradling, Theresa A.

2016-01-01

In animals, mitochondrial DNA (mtDNA) typically occurs as a single circular chromosome with 13 protein-coding genes and 22 tRNA genes. The various species of lice examined previously, however, have shown mitochondrial genome rearrangements with a range of chromosome sizes and numbers. Our research demonstrates that the mitochondrial genomes of two species of chewing lice found on pocket gophers, Geomydoecus aurei and Thomomydoecus minor, are fragmented with the 1,536 base-pair (bp) cytochrome-oxidase subunit I (cox1) gene occurring as the only protein-coding gene on a 1,916–1,964 bp minicircular chromosome in the two species, respectively. The cox1 gene of T. minor begins with an atypical start codon, while that of G. aurei does not. Components of the non-protein coding sequence of G. aurei and T. minor include a tRNA (isoleucine) gene, inverted repeat sequences consistent with origins of replication, and an additional non-coding region that is smaller than the non-coding sequence of other lice with such fragmented mitochondrial genomes. Sequences of cox1 minichromosome clones for each species reveal extensive length and sequence heteroplasmy in both coding and noncoding regions. The highly variable non-gene regions of G. aurei and T. minor have little sequence similarity with one another except for a 19-bp region of phylogenetically conserved sequence with unknown function. PMID:27589589
ProClaT, a new bioinformatics tool for in silico protein reclassification: case study of DraB, a protein coded from the draTGB operon in Azospirillum brasilense.

PubMed

Rubel, Elisa Terumi; Raittz, Roberto Tadeu; Coimbra, Nilson Antonio da Rocha; Gehlen, Michelly Alves Coutinho; Pedrosa, Fábio de Oliveira

2016-12-15

Azospirillum brasilense is a plant-growth-promoting, nitrogen-fixing bacterium that is used as a bio-fertilizer in agriculture. Since nitrogen fixation has a high energy demand, the reduction of N2 to NH4+ by nitrogenase occurs only under limiting conditions of NH4+ and O2. Moreover, the synthesis and activity of nitrogenase are highly regulated to prevent energy waste. In A. brasilense, nitrogenase activity is regulated by the products of draG and draT. The product of the draB gene, located downstream in the draTGB operon, may be involved in the regulation of nitrogenase activity by an as yet unknown mechanism. A deep in silico analysis of the product of draB was undertaken with the aim of suggesting its possible function and involvement with DraT and DraG in the regulation of nitrogenase activity in A. brasilense. In this work, we present a new artificial intelligence strategy for protein classification, named ProClaT. The features used by the pattern recognition model were derived from the primary structure of the DraB homologous proteins, calculated by a ProClaT internal algorithm. ProClaT was applied to this case study and the results revealed that the A. brasilense draB gene codes for a protein highly similar to the nitrogenase-associated NifO protein of Azotobacter vinelandii. This tool allowed the reclassification of DraB/NifO homologous proteins, including those annotated as hypothetical, conserved hypothetical, or putative arsenate reductase (ArsC), as NifO-like. An analysis of the co-occurrence of draB, draT, draG and other nif genes was performed, suggesting the involvement of draB (nifO) in nitrogen fixation, although without defining a specific function.
Intraarticular expression of biologically active interleukin 1-receptor-antagonist protein by ex vivo gene transfer.

PubMed Central

Bandara, G; Mueller, G M; Galea-Lauri, J; Tindal, M H; Georgescu, H I; Suchanek, M K; Hung, G L; Glorioso, J C; Robbins, P D; Evans, C H

1993-01-01

Gene therapy offers a radically different approach to the treatment of arthritis. Here we have demonstrated that two marker genes (lacZ and neo) and cDNA coding for a potentially therapeutic protein (human interleukin 1-receptor-antagonist protein; IRAP or IL-1ra) can be delivered, by ex vivo techniques, to the synovial lining of joints; intraarticular expression of IRAP inhibited intraarticular responses to interleukin 1. To achieve this, lapine synoviocytes were first transduced in culture by retroviral infection. The genetically modified synovial cells were then transplanted by intraarticular injection into the knee joints of rabbits, where they efficiently colonized the synovium. Assay of joint lavages confirmed the in vivo expression of biologically active human IRAP. With allografted cells, IRAP expression was lost by 12 days after transfer. In contrast, autografted synoviocytes continued to express IRAP for approximately 5 weeks. Knee joints expressing human IRAP were protected from the leukocytosis that otherwise follows the intraarticular injection of recombinant human interleukin 1 beta. Thus, we report the intraarticular expression and activity of a potentially therapeutic protein by gene-transfer technology; these experiments demonstrate the feasibility of treating arthritis and other joint disorders with gene therapy. PMID:8248169
Genome Sequence of the Mesophilic Thermotogales Bacterium Mesotoga prima MesG1.Ag.4.2 Reveals the Largest Thermotogales Genome To Date

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhaxybayeva, Olga; Swithers, Kristen S; Foght, Julia

2012-01-01

Here we describe the genome of Mesotoga prima MesG1.Ag.4.2, the first genome of a mesophilic Thermotogales bacterium. Mesotoga prima was isolated from a polychlorinated biphenyl (PCB)-dechlorinating enrichment culture from Baltimore Harbor sediments. Its 2.97 Mb genome is considerably larger than any previously sequenced Thermotogales genomes, which range between 1.86 and 2.30 Mb. This larger size is due to both higher numbers of protein-coding genes and larger intergenic regions. In particular, the M. prima genome contains more genes for proteins involved in regulatory functions, for instance those involved in regulation of transcription. Together with its closest relative, Kosmotoga olearia, it also encodes different types of proteins involved in environmental and cell-cell interactions as compared with other Thermotogales bacteria. Amino acid composition analysis of M. prima proteins implies that this lineage has inhabited low-temperature environments for a long time. A large fraction of the M. prima genome has been acquired by lateral gene transfer (LGT): a DarkHorse analysis suggests that 766 (32%) of predicted protein-coding genes have been involved in LGT after Mesotoga diverged from the other Thermotogales lineages. A notable example of a lineage-specific LGT event is a reductive dehalogenase gene, which encodes a key enzyme in dehalorespiration, indicating that M. prima may have a more active role in PCB dechlorination than was previously assumed.
Computer analysis of protein functional sites projection on exon structure of genes in Metazoa

PubMed Central

2015-01-01

Background: Study of the relationship between the structural and functional organization of proteins and their coding genes is necessary for an understanding of the evolution of molecular systems and can provide new knowledge for many applications in designing proteins with improved medical and biological properties. It is well known that the functional properties of proteins are determined by their functional sites. Functional sites are usually represented by a small number of amino acid residues that are distantly located from each other in the amino acid sequence. They are highly conserved within their functional group and vary significantly in structure between such groups. Given these facts, analysis of the general properties of the structural organization of functional sites, both at the protein level and at the level of the exon-intron structure of the coding gene, remains a pressing problem. Results: One approach to this analysis is the projection of the amino acid residue positions of the functional sites, along with the exon boundaries, onto the gene structure. In this paper, we examined the discontinuity of the functional sites in the exon-intron structure of genes and the distribution of lengths and phases of the functional-site-encoding exons in vertebrate genes. We have shown that the DNA fragments coding the functional sites were in the same exons, or in close exons. The observed tendency of exons that code functional sites to cluster suggests that such clusters could be considered a unit of protein evolution. We studied the characteristics of the structure of the exon boundaries that code, and do not code, functional sites in 11 Metazoa species. The clustering is accompanied by a reduced frequency of intercodon gaps (phase 0) in exons encoding the amino acid residues of functional sites, which may be evidence of evolutionary limitations on exon shuffling. Conclusions: These results characterize the features of the coding exon-intron structure that affect the functionality of the encoded protein and allow a better understanding of the emergence of biological diversity. PMID:26693737
MAP17 Is a Necessary Activator of Renal Na+/Glucose Cotransporter SGLT2

PubMed Central

Coady, Michael J.; El Tarazi, Abdulah; Santer, René; Bissonnette, Pierre; Sasseville, Louis J.; Calado, Joaquim; Lussier, Yoann; Dumayne, Christopher; Bichet, Daniel G.

2017-01-01

The renal proximal tubule reabsorbs 90% of the filtered glucose load through the Na+-coupled glucose transporter SGLT2, and specific inhibitors of SGLT2 are now available to patients with diabetes to increase urinary glucose excretion. Using expression cloning, we identified an accessory protein, 17 kDa membrane-associated protein (MAP17), that increased SGLT2 activity in RNA-injected Xenopus oocytes by two orders of magnitude. Significant stimulation of SGLT2 activity also occurred in opossum kidney cells cotransfected with SGLT2 and MAP17. Notably, transfection with MAP17 did not change the quantity of SGLT2 protein at the cell surface in either cell type. To confirm the physiologic relevance of the MAP17–SGLT2 interaction, we studied a cohort of 60 individuals with familial renal glucosuria. One patient without any identifiable mutation in the SGLT2 coding gene (SLC5A2) displayed homozygosity for a splicing mutation (c.176+1G>A) in the MAP17 coding gene (PDZK1IP1). In the proximal tubule and in other tissues, MAP17 is known to interact with PDZK1, a scaffolding protein linked to other transporters, including Na+/H+ exchanger 3, and to signaling pathways, such as the A-kinase anchor protein 2/protein kinase A pathway. Thus, these results provide the basis for a more thorough characterization of SGLT2 which would include the possible effects of its inhibition on colocalized renal transporters. PMID:27288013
The Ever-Evolving Concept of the Gene: The Use of RNA/Protein Experimental Techniques to Understand Genome Functions

PubMed Central

Cipriano, Andrea; Ballarino, Monica

2018-01-01

The completion of the human genome sequence together with advances in sequencing technologies have shifted the paradigm of the genome, as composed of discrete and hereditable coding entities, and have shown the abundance of functional noncoding DNA. This part of the genome, previously dismissed as “junk” DNA, increases proportionally with organismal complexity and contributes to gene regulation beyond the boundaries of known protein-coding genes. Different classes of functionally relevant nonprotein-coding RNAs are transcribed from noncoding DNA sequences. Among them are the long noncoding RNAs (lncRNAs), which are thought to participate in the basal regulation of protein-coding genes at both transcriptional and post-transcriptional levels. Although knowledge of this field is still limited, the ability of lncRNAs to localize in different cellular compartments, to fold into specific secondary structures and to interact with different molecules (RNA or proteins) endows them with multiple regulatory mechanisms. It is becoming evident that lncRNAs may play a crucial role in most biological processes such as the control of development, differentiation and cell growth. This review places the evolution of the concept of the gene in its historical context, from Darwin's hypothetical mechanism of heredity to the post-genomic era. We discuss how the original idea of protein-coding genes as unique determinants of phenotypic traits has been reconsidered in light of the existence of noncoding RNAs. We summarize the technological developments which have been made in the genome-wide identification and study of lncRNAs and emphasize the methodologies that have aided our understanding of the complexity of lncRNA-protein interactions in recent years. PMID:29560353
Evaluation of the efficacy of twelve mitochondrial protein-coding genes as barcodes for mollusk DNA barcoding.

PubMed

Yu, Hong; Kong, Lingfeng; Li, Qi

2016-01-01

In this study, we evaluated the efficacy of 12 mitochondrial protein-coding genes from 238 mitochondrial genomes of 140 molluscan species as potential DNA barcodes for mollusks. Three barcoding methods (distance, monophyly and character-based methods) were used in species identification. The species recovery rates based on genetic distances for the 12 genes ranged from 70.83 to 83.33%. There were no significant differences in intra- or interspecific variability among the 12 genes. The monophyly and character-based methods provided higher resolution than the distance-based method in species delimitation. The character-based method showed some advantages, especially in closely related taxa. The results suggested that, besides the standard COI barcode, the other 11 mitochondrial protein-coding genes could also potentially be used as molecular diagnostics for molluscan species discrimination. Our results also showed that combining mitochondrial genes did not enhance the efficacy of species identification and that a single mitochondrial gene would be fully competent.
Amplification of the groESL operon in Pseudomonas putida increases siderophore gene promoter activity.

PubMed

Venturi, V; Wolfs, K; Leong, J; Weisbeek, P J

1994-10-17

Pseudobactin 358 is the yellow-green fluorescent siderophore [microbial iron(III) transport agent] produced by Pseudomonas putida WCS358 under iron-limiting conditions. The genes encoding pseudobactin 358 biosynthesis are iron-regulated at the level of transcription. In this study, the molecular characterization is reported of a cosmid clone of WCS358 DNA that can stimulate, in an iron-dependent manner, the activity of a WCS358 siderophore gene promoter in the heterologous Pseudomonas strain A225. The functional region in the clone was identified by subcloning, transposon mutagenesis and DNA sequencing as the groESL operon of strain WCS358. This increase in promoter activity was not observed when the groESL genes of strain WCS358 were integrated via a transposon vector into the genome of Pseudomonas A225, indicating that multiple copies of the operon are necessary for the increase in siderophore gene promoter activity. Amplification of the Escherichia coli and WCS358 groESL genes also increased iron-regulated promoter activity in the parent strain WCS358. The groESL operon codes for the chaperone proteins GroES and GroEL, which are responsible for mediating the folding and assembly of many proteins.
Prokaryote-derived protein inhibitors of peptidases: a sketchy occurrence and mostly unknown function

PubMed Central

Kantyka, Tomasz; Rawlings, Neil D.; Potempa, Jan

2010-01-01

In metazoan organisms, protein inhibitors of peptidases are important factors essential for the regulation of proteolytic activity. In vertebrates, genes encoding peptidase inhibitors constitute up to 1% of genes, reflecting a need for tight and specific control of proteolysis, especially in extracellular body fluids. In stark contrast, unicellular organisms, both prokaryotic and eukaryotic, consistently contain only a few, if any, genes coding for putative peptidase inhibitors. This may seem perplexing in light of the fact that these organisms produce large numbers of proteases of different catalytic classes, with the corresponding genes constituting up to 6% of the total gene count and the average being about 3%. Apparently, however, a unicellular lifestyle is fully compatible with other mechanisms of regulation of proteolysis and does not require protein inhibitors to control intracellular and extracellular proteolytic activity. In prokaryotes, the occurrence of genes encoding different types of peptidase inhibitors is therefore infrequent and often scattered among phylogenetically distinct orders or even phyla of microbiota. Genes encoding proteins homologous to alpha-2-macroglobulin (family I39), serine carboxypeptidase Y inhibitor (family I51), alpha-1-peptidase inhibitor (family I4) and ecotin (family I11) are the most frequently represented in Bacteria. Although several of these gene products were shown to possess inhibitory activity, the biological function of microbial inhibitors, with the exception of ecotin and staphostatins, is unclear. In this review we present the distribution of protein inhibitors from different families among prokaryotes, describe their mode of action and hypothesize on their role in microbial physiology and interactions with hosts and the environment. PMID:20558234
Using a Euclid distance discriminant method to find protein coding genes in the yeast genome.

PubMed

Zhang, Chun-Ting; Wang, Ju; Zhang, Ren

2002-02-01

The Euclid distance discriminant method is used to find protein coding genes in the yeast genome, based on the single nucleotide frequencies at three codon positions in the ORFs. The method is extremely simple and may be extended to find genes in prokaryotic genomes or eukaryotic genomes with fewer introns. Six-fold cross-validation tests have demonstrated that the accuracy of the algorithm is better than 93%. Based on this, it is found that the total number of protein coding genes in the yeast genome is at most 5579, about 3.8-7.0% less than the currently widely accepted figure of 5800-6000. The base compositions at three codon positions are analyzed in detail using a graphic method. The result shows that the preferred codons adopted by yeast genes are of the RGW type, where R, G and W indicate the bases of purine, non-G and A/T, whereas the 'codons' in the intergenic sequences are of the form NNN, where N denotes any base. This fact constitutes the basis of the algorithm to distinguish between coding and non-coding ORFs in the yeast genome. The names of putative non-coding ORFs are listed here in detail.
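The idea in the abstract above (single-nucleotide frequencies at the three codon positions, compared by Euclidean distance) can be illustrated with a minimal nearest-centroid sketch. This is not the authors' implementation: the function names, the use of class centroids as reference points, and the toy sequences are all assumptions made for illustration, and real use would require training on curated coding and non-coding ORF sets.

```python
import math
from collections import Counter

BASES = "ACGT"


def codon_position_frequencies(orf):
    """Return a 12-dimensional vector: the frequency of A, C, G, T
    at codon positions 1, 2 and 3 of the ORF (trailing partial codon ignored)."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    if usable == 0:
        raise ValueError("ORF must contain at least one full codon")
    n_codons = usable // 3
    vector = []
    for position in range(3):
        counts = Counter(orf[position:usable:3])
        vector.extend(counts[base] / n_codons for base in BASES)
    return vector


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def euclidean_distance(u, v):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def classify_orf(orf, coding_centroid, noncoding_centroid):
    """Assign the ORF to whichever class centroid is nearer (assumed decision rule)."""
    x = codon_position_frequencies(orf)
    d_coding = euclidean_distance(x, coding_centroid)
    d_noncoding = euclidean_distance(x, noncoding_centroid)
    return "coding" if d_coding < d_noncoding else "non-coding"


if __name__ == "__main__":
    # Hypothetical toy training sequences, not real yeast ORFs.
    coding_examples = ["ATGGCTGATGCAGAAGCTTAA", "ATGGCAGAAGATGCTGCATGA"]
    noncoding_examples = ["TTTTCCCCTTTTCCCCTTTTC", "CCTTCCTTCCTTCCTTCCTTC"]
    c_cod = centroid([codon_position_frequencies(s) for s in coding_examples])
    c_non = centroid([codon_position_frequencies(s) for s in noncoding_examples])
    print(classify_orf("ATGGATGCTGAAGCAGCTTAG", c_cod, c_non))
```

A nearest-centroid rule is only one plausible reading of a "distance discriminant"; the published method may define its reference points or decision threshold differently.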
BRD4 assists elongation of both coding and enhancer RNAs guided by histone acetylation

PubMed Central

Kanno, Tomohiko; Kanno, Yuka; LeRoy, Gary; Campos, Eric; Sun, Hong-Wei; Brooks, Stephen R; Vahedi, Golnaz; Heightman, Tom D; Garcia, Benjamin A; Reinberg, Danny; Siebenlist, Ulrich; O’Shea, John J; Ozato, Keiko

2016-01-01

Small-molecule BET inhibitors interfere with the epigenetic interactions between acetylated histones and the bromodomains of the BET family proteins, including BRD4, and they potently inhibit growth of malignant cells by targeting cancer-promoting genes. BRD4 interacts with the pause-release factor P-TEFb, and has been proposed to release Pol II from promoter-proximal pausing. We show that BRD4 occupied widespread genomic regions in mouse cells, and directly stimulated elongation of both protein-coding transcripts and non-coding enhancer RNAs (eRNAs), dependent on the function of bromodomains. BRD4 interacted physically with elongating Pol II complexes, and assisted Pol II progression through hyper-acetylated nucleosomes by interacting with acetylated histones via bromodomains. On active enhancers, the BET inhibitor JQ1 antagonized BRD4-associated eRNA synthesis. Thus, BRD4 is involved in multiple steps of the transcription hierarchy, primarily by assisting transcript elongation both at enhancers and on gene bodies. PMID:25383670
A genetic screen to isolate type III effectors translocated into pepper cells during Xanthomonas infection

DOE Office of Scientific and Technical Information (OSTI.GOV)

Julie Anne Roden, Branids Belt, Jason Barzel Ross, Thomas Tachibana, Joe Vargas, Mary Beth Mudgett

2004-11-23

The bacterial pathogen Xanthomonas campestris pv. vesicatoria (Xcv) uses a type III secretion system (TTSS) to translocate effector proteins into host plant cells. The TTSS is required for Xcv colonization, yet the identity of many proteins translocated through this apparatus is not known. We used a genetic screen to functionally identify Xcv TTSS effectors. A transposon 5 (Tn5)-based transposon construct including the coding sequence for the Xcv AvrBs2 effector devoid of its TTSS signal was randomly inserted into the Xcv genome. Insertion of the avrBs2 reporter gene into Xcv genes coding for proteins containing a functional TTSS signal peptide resulted in the creation of chimeric TTSS effector::AvrBs2 fusion proteins. Xcv strains containing these fusions translocated the AvrBs2 reporter in a TTSS-dependent manner into resistant BS2 pepper cells during infection, activating the avrBs2-dependent hypersensitive response (HR). We isolated seven chimeric fusion proteins and designated the identified TTSS effectors as Xanthomonas outer proteins (Xops). Translocation of each Xop was confirmed by using the calmodulin-dependent adenylate cyclase reporter assay. Three xop genes are Xanthomonas spp.-specific, whereas homologs for the rest are found in other phytopathogenic bacteria. XopF1 and XopF2 define an effector gene family in Xcv. XopN contains a eukaryotic protein fold repeat and is required for full Xcv pathogenicity in pepper and tomato. The translocated effectors identified in this work expand our knowledge of the diversity of proteins that Xcv uses to manipulate its hosts.
Identification in Marinomonas mediterranea of a novel quinoprotein with glycine oxidase activity.

PubMed

Campillo-Brocal, Jonatan Cristian; Lucas-Elio, Patricia; Sanchez-Amat, Antonio

2013-08-01

A novel enzyme with lysine-epsilon oxidase activity was previously described in the marine bacterium Marinomonas mediterranea. This enzyme differs from other l-amino acid oxidases in not being a flavoprotein but containing a quinone cofactor. It is encoded by an operon with two genes, lodA and lodB. The first codes for the oxidase, while the second encodes a protein required for the expression of the former. Genome sequencing of M. mediterranea has revealed that it contains two additional operons encoding proteins with sequence similarity to LodA. In this study, it is shown that one such gene, Marme_1655, encodes a protein with glycine oxidase activity. This activity differs markedly, in terms of substrate range and sensitivity to inhibitors, from that of other previously described glycine oxidases, which are flavoproteins synthesized by Bacillus. The results presented in this study indicate that the products of the genes with different degrees of similarity to lodA detected in bacterial genomes could constitute a reservoir of different oxidases. © 2013 The Authors. Microbiology Open published by John Wiley & Sons Ltd.
Enhancement of protein production via the strong DIT1 terminator and two RNA-binding proteins in Saccharomyces cerevisiae

PubMed Central

Ito, Yoichiro; Kitagawa, Takao; Yamanishi, Mamoru; Katahira, Satoshi; Izawa, Shingo; Irie, Kenji; Furutani-Seiki, Makoto; Matsuyama, Takashi

2016-01-01

Post-transcriptional upregulation is an effective way to increase the expression of transgenes and thus maximize the yields of target chemicals from metabolically engineered organisms. Refractory elements in the 3′ untranslated region (UTR) that increase mRNA half-life might be available. In Saccharomyces cerevisiae, several terminator regions have shown activity in increasing the production of proteins by upstream coding genes; among these terminators the DIT1 terminator has the highest activity. Here, we found in Saccharomyces cerevisiae that two resident trans-acting RNA-binding proteins (Nab6p and Pap1p) enhance the activity of the DIT1 terminator through the cis element GUUCG/U within the 3′-UTR. These two RNA-binding proteins could upregulate a battery of cell-wall–related genes. Mutagenesis of the DIT1 terminator improved its activity by a maximum of 500% of that of the standard PGK1 terminator. Further understanding and improvement of this system will facilitate inexpensive and stable production of complicated organism-derived drugs worldwide. PMID:27845367
Profiling of Virulence Determinants in Cronobacter sakazakii Isolates from Different Plant and Environmental Commodities.

PubMed

Singh, Niharika; Raghav, Mamta; Narula, Shifa; Tandon, Simran; Goel, Gunjan

2017-05-01

Cronobacter sakazakii is an emerging pathogen causing meningitis, sepsis and necrotizing enterocolitis in neonates and immune-compromised adults. The present study describes the profiling of different virulence factors associated with C. sakazakii isolates derived from plant-based materials and environmental samples (soil, water, and vacuum dust). All the isolates exhibited β-hemolysis and chitinase activity, and were able to utilize inositol. Among the nine virulence-associated genes, the hly gene coding for hemolysin was detected in all the isolates, followed by ompA (outer membrane protein); the plasmid-borne genes cpa (Cronobacter plasminogen activator) and eitA (ferric ion transporter protein), however, were each detected in 60% of the isolates. Furthermore, the isolate C. sakazakii N81 showed cytotoxicity for Caco-2 cells. The presence of the virulence determinants investigated in this study indicates the pathogenic potential of C. sakazakii and their plausible connection with clinical manifestations.
Integrative structural annotation of de novo RNA-Seq provides an accurate reference gene set of the enormous genome of the onion (Allium cepa L.).

PubMed

Kim, Seungill; Kim, Myung-Shin; Kim, Yong-Min; Yeom, Seon-In; Cheong, Kyeongchae; Kim, Ki-Tae; Jeon, Jongbum; Kim, Sunggil; Kim, Do-Sun; Sohn, Seong-Han; Lee, Yong-Hwan; Choi, Doil

2015-02-01

The onion (Allium cepa L.) is one of the most widely cultivated and consumed vegetable crops in the world. Although a considerable amount of onion transcriptome data has been deposited into public databases, the sequences of the protein-coding genes are not accurate enough to be used, owing to non-coding sequences intermixed with the coding sequences. We generated a high-quality, annotated onion transcriptome from de novo sequence assembly and intensive structural annotation using the integrated structural gene annotation pipeline (ISGAP), which identified 54,165 protein-coding genes among 165,179 assembled transcripts totalling 203.0 Mb by eliminating the intron sequences. ISGAP performed reliable annotation, recognizing accurate gene structures based on reference proteins, and ab initio gene models of the assembled transcripts. Integrative functional annotation and gene-based SNP analysis revealed a whole biological repertoire of genes and transcriptomic variation in the onion. The method developed in this study provides a powerful tool for the construction of reference gene sets for organisms based solely on de novo transcriptome data. Furthermore, the reference genes and their variation described here for the onion represent essential tools for molecular breeding and gene cloning in Allium spp. © The Author 2014. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Genomic evidence for genes encoding leucine-rich repeat receptors linked to resistance against the eukaryotic extra- and intracellular Brassica napus pathogens Leptosphaeria maculans and Plasmodiophora brassicae.

PubMed

Stotz, Henrik U; Harvey, Pascoe J; Haddadi, Parham; Mashanova, Alla; Kukol, Andreas; Larkan, Nicholas J; Borhan, M Hossein; Fitt, Bruce D L

2018-01-01

Genes coding for nucleotide-binding leucine-rich repeat (LRR) receptors (NLRs) control resistance against intracellular (cell-penetrating) pathogens. However, evidence for a role of genes coding for proteins with LRR domains in resistance against extracellular (apoplastic) fungal pathogens is limited. Here, the distribution of genes coding for proteins with eLRR domains but lacking kinase domains was determined for the Brassica napus genome. Predictions of signal peptide and transmembrane regions divided these genes into 184 coding for receptor-like proteins (RLPs) and 121 coding for secreted proteins (SPs). Together with previously annotated NLRs, a total of 720 LRR genes were found. Leptosphaeria maculans-induced expression during a compatible interaction with cultivar Topas differed between RLP, SP and NLR gene families; NLR genes were induced relatively late, during the necrotrophic phase of pathogen colonization. Seven RLP, one SP and two NLR genes were found in Rlm1 and Rlm3/Rlm4/Rlm7/Rlm9 loci for resistance against L. maculans on chromosome A07 of B. napus. One NLR gene at the Rlm9 locus was positively selected, as was the RLP gene on chromosome A10 with LepR3 and Rlm2 alleles conferring resistance against L. maculans races with corresponding effectors AvrLm1 and AvrLm2, respectively. Known loci for resistance against L. maculans (extracellular hemi-biotrophic fungus), Sclerotinia sclerotiorum (necrotrophic fungus) and Plasmodiophora brassicae (intracellular, obligate biotrophic protist) were examined for the presence of RLPs, SPs and NLRs in these regions. Whereas loci for resistance against P. brassicae were enriched for NLRs, no such signature was observed for the other pathogens. These findings demonstrate the involvement of NLR genes in resistance against the intracellular pathogen P. brassicae and of a putative NLR gene in Rlm9-mediated resistance against the extracellular pathogen L. maculans.
Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana.

PubMed

Mayer, K; Schüller, C; Wambutt, R; Murphy, G; Volckaert, G; Pohl, T; Düsterhöft, A; Stiekema, W; Entian, K D; Terryn, N; Harris, B; Ansorge, W; Brandt, P; Grivell, L; Rieger, M; Weichselgartner, M; de Simone, V; Obermaier, B; Mache, R; Müller, M; Kreis, M; Delseny, M; Puigdomenech, P; Watson, M; Schmidtheini, T; Reichert, B; Portatelle, D; Perez-Alonso, M; Boutry, M; Bancroft, I; Vos, P; Hoheisel, J; Zimmermann, W; Wedler, H; Ridley, P; Langham, S A; McCullagh, B; Bilham, L; Robben, J; Van der Schueren, J; Grymonprez, B; Chuang, Y J; Vandenbussche, F; Braeken, M; Weltjens, I; Voet, M; Bastiaens, I; Aert, R; Defoor, E; Weitzenegger, T; Bothe, G; Ramsperger, U; Hilbert, H; Braun, M; Holzer, E; Brandt, A; Peters, S; van Staveren, M; Dirske, W; Mooijman, P; Klein Lankhorst, R; Rose, M; Hauf, J; Kötter, P; Berneiser, S; Hempel, S; Feldpausch, M; Lamberth, S; Van den Daele, H; De Keyser, A; Buysshaert, C; Gielen, J; Villarroel, R; De Clercq, R; Van Montagu, M; Rogers, J; Cronin, A; Quail, M; Bray-Allen, S; Clark, L; Doggett, J; Hall, S; Kay, M; Lennard, N; McLay, K; Mayes, R; Pettett, A; Rajandream, M A; Lyne, M; Benes, V; Rechmann, S; Borkova, D; Blöcker, H; Scharfe, M; Grimm, M; Löhnert, T H; Dose, S; de Haan, M; Maarse, A; Schäfer, M; Müller-Auer, S; Gabel, C; Fuchs, M; Fartmann, B; Granderath, K; Dauner, D; Herzl, A; Neumann, S; Argiriou, A; Vitale, D; Liguori, R; Piravandi, E; Massenet, O; Quigley, F; Clabauld, G; Mündlein, A; Felber, R; Schnabl, S; Hiller, R; Schmidt, W; Lecharny, A; Aubourg, S; Chefdor, F; Cooke, R; Berger, C; Montfort, A; Casacuberta, E; Gibbons, T; Weber, N; Vandenbol, M; Bargues, M; Terol, J; Torres, A; Perez-Perez, A; Purnelle, B; Bent, E; Johnson, S; Tacon, D; Jesse, T; Heijnen, L; Schwarz, S; Scholler, P; Heber, S; Francs, P; Bielke, C; Frishman, D; Haase, D; Lemcke, K; Mewes, H W; Stocker, S; Zaccaria, P; Bevan, M; Wilson, R K; de la Bastide, M; Habermann, K; Parnell, L; Dedhia, N; Gnoj, L; Schutz, K; Huang, E; Spiegel, L; Sehkon, M; Murray, J; Sheet, P; Cordes, M; Abu-Threideh, J; Stoneking, T; Kalicki, J; Graves, T; Harmon, G; Edwards, J; Latreille, P; Courtney, L; Cloud, J; Abbott, A; Scott, K; Johnson, D; Minx, P; Bentley, D; Fulton, B; Miller, N; Greco, T; Kemp, K; Kramer, J; Fulton, L; Mardis, E; Dante, M; Pepin, K; Hillier, L; Nelson, J; Spieth, J; Ryan, E; Andrews, S; Geisel, C; Layman, D; Du, H; Ali, J; Berghoff, A; Jones, K; Drone, K; Cotton, M; Joshu, C; Antonoiu, B; Zidanic, M; Strong, C; Sun, H; Lamar, B; Yordan, C; Ma, P; Zhong, J; Preston, R; Vil, D; Shekher, M; Matero, A; Shah, R; Swaby, I K; O'Shaughnessy, A; Rodriguez, M; Hoffmann, J; Till, S; Granat, S; Shohdy, N; Hasegawa, A; Hameed, A; Lodhi, M; Johnson, A; Chen, E; Marra, M; Martienssen, R; McCombie, W R

1999-12-16

The higher plant Arabidopsis thaliana (Arabidopsis) is an important model for identifying plant genes and determining their function. To assist biological investigations and to define chromosome structure, a coordinated effort to sequence the Arabidopsis genome was initiated in late 1996. Here we report one of the first milestones of this project, the sequence of chromosome 4. Analysis of 17.38 megabases of unique sequence, representing about 17% of the genome, reveals 3,744 protein coding genes, 81 transfer RNAs and numerous repeat elements. Heterochromatic regions surrounding the putative centromere, which has not yet been completely sequenced, are characterized by an increased frequency of a variety of repeats, new repeats, reduced recombination, lowered gene density and lowered gene expression. Roughly 60% of the predicted protein-coding genes have been functionally characterized on the basis of their homology to known genes. Many genes encode predicted proteins that are homologous to human and Caenorhabditis elegans proteins.

Complete Mitochondrial Genome of Echinostoma hortense (Digenea: Echinostomatidae).

PubMed

Liu, Ze-Xuan; Zhang, Yan; Liu, Yu-Ting; Chang, Qiao-Cheng; Su, Xin; Fu, Xue; Yue, Dong-Mei; Gao, Yuan; Wang, Chun-Ren

2016-04-01

Echinostoma hortense (Digenea: Echinostomatidae) is one of the intestinal flukes with medical importance in humans. However, the mitochondrial (mt) genome of this fluke was previously unknown. The present study determined the complete mt genome sequence of E. hortense and assessed its phylogenetic relationships with other digenean species for which complete mt genome sequences are available in GenBank, using concatenated amino acid sequences inferred from 12 protein-coding genes. The mt genome of E. hortense contained 12 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes, and 1 non-coding region. The length of the mt genome of E. hortense was 14,994 bp, which was somewhat smaller than those of other trematode species. Phylogenetic analyses based on concatenated nucleotide sequence datasets for all 12 protein-coding genes using the maximum parsimony (MP) method showed that E. hortense and Hypoderaeum conoideum clustered together, and they were closer to each other than to Fasciolidae and other echinostomatid trematodes. The availability of the complete mt genome sequence of E. hortense provides important genetic markers for diagnostics, population genetics, and evolutionary studies of digeneans.
Complete Mitochondrial Genome of Echinostoma hortense (Digenea: Echinostomatidae)

PubMed Central

Liu, Ze-Xuan; Zhang, Yan; Liu, Yu-Ting; Chang, Qiao-Cheng; Su, Xin; Fu, Xue; Yue, Dong-Mei; Gao, Yuan; Wang, Chun-Ren

2016-01-01

Echinostoma hortense (Digenea: Echinostomatidae) is one of the intestinal flukes with medical importance in humans. However, the mitochondrial (mt) genome of this fluke was previously unknown. The present study determined the complete mt genome sequence of E. hortense and assessed its phylogenetic relationships with other digenean species for which complete mt genome sequences are available in GenBank, using concatenated amino acid sequences inferred from 12 protein-coding genes. The mt genome of E. hortense contained 12 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes, and 1 non-coding region. The length of the mt genome of E. hortense was 14,994 bp, which was somewhat smaller than those of other trematode species. Phylogenetic analyses based on concatenated nucleotide sequence datasets for all 12 protein-coding genes using the maximum parsimony (MP) method showed that E. hortense and Hypoderaeum conoideum clustered together, and they were closer to each other than to Fasciolidae and other echinostomatid trematodes. The availability of the complete mt genome sequence of E. hortense provides important genetic markers for diagnostics, population genetics, and evolutionary studies of digeneans. PMID:27180575
Maternal transcription of non-protein coding RNAs from the PWS-critical region rescues growth retardation in mice

PubMed Central

Rozhdestvensky, Timofey S.; Robeck, Thomas; Galiveti, Chenna R.; Raabe, Carsten A.; Seeger, Birte; Wolters, Anna; Gubar, Leonid V.; Brosius, Jürgen; Skryabin, Boris V.

2016-01-01

Prader-Willi syndrome (PWS) is a neurogenetic disorder caused by loss of paternally expressed genes on chromosome 15q11-q13. The PWS-critical region (PWScr) contains an array of non-protein coding IPW-A exons hosting intronic SNORD116 snoRNA genes. Deletion of PWScr is associated with PWS in humans and growth retardation in mice exhibiting ~15% postnatal lethality in C57BL/6 background. Here we analysed a knock-in mouse containing a 5′HPRT-LoxP-NeoR cassette (5′LoxP) inserted upstream of the PWScr. When the insertion was inherited maternally in a paternal PWScr-deletion mouse model (PWScrp−/m5′LoxP), we observed compensation of growth retardation and postnatal lethality. Genomic methylation pattern and expression of protein-coding genes remained unaltered at the PWS-locus of PWScrp−/m5′LoxP mice. Interestingly, ubiquitous Snord116 and IPW-A exon transcription from the originally silent maternal chromosome was detected. In situ hybridization indicated that PWScrp−/m5′LoxP mice expressed Snord116 in brain areas similar to wild type animals. Our results suggest that the lack of PWScr RNA expression in certain brain areas could be a primary cause of the growth retardation phenotype in mice. We propose that activation of disease-associated genes on imprinted regions could lead to general therapeutic strategies in associated diseases. PMID:26848093
In silico search for functionally similar proteins involved in meiosis and recombination in evolutionarily distant organisms.

PubMed

Bogdanov, Yuri F; Dadashev, Sergei Y; Grishaeva, Tatiana M

2003-01-01

Evolutionarily distant organisms have not only orthologs, but also nonhomologous proteins that build functionally similar subcellular structures. For instance, this is true of the protein components of the synaptonemal complex (SC), a universal ultrastructure that ensures the successful pairing and recombination of homologous chromosomes during meiosis. We aimed to develop a method to search databases for genes that code for such nonhomologous but functionally analogous proteins. We took advantage of the ultrastructural parameters of the SC and the conformation of the SC proteins responsible for them. Proteins of the SC central space are known to be similar in secondary structure. Using published data, we found a highly significant correlation between the width of the SC central space and the length of the rod-shaped central domain of the mammalian and yeast intermediate proteins that form transversal filaments in the SC central space. Based on this, we proposed a method for searching genome databases of distant organisms for genes whose virtual proteins meet the above correlation requirement. Our recent finding that the Drosophila melanogaster CG17604 gene codes for a synaptonemal complex transversal filament protein received experimental support from another lab. With the same strategy, we showed that the Arabidopsis thaliana and Caenorhabditis elegans genomes contain unique genes coding for such proteins.
Biomimetic Artificial Epigenetic Code for Targeted Acetylation of Histones.

PubMed

Taniguchi, Junichi; Feng, Yihong; Pandian, Ganesh N; Hashiya, Fumitaka; Hidaka, Takuya; Hashiya, Kaori; Park, Soyoung; Bando, Toshikazu; Ito, Shinji; Sugiyama, Hiroshi

2018-06-13

While the central role of locus-specific acetylation of histone proteins in eukaryotic gene expression is well established, the availability of designer tools to regulate acetylation at particular nucleosome sites remains limited. Here, we develop a unique strategy to introduce acetylation by constructing a bifunctional molecule designated Bi-PIP. Bi-PIP has a P300/CBP-selective bromodomain inhibitor (Bi) as a P300/CBP recruiter and a pyrrole-imidazole polyamide (PIP) as a sequence-selective DNA binder. Biochemical assays verified that Bi-PIPs recruit P300 to the nucleosomes having their target DNA sequences and extensively accelerate acetylation. Bi-PIPs also activated transcription of genes that have corresponding cognate DNA sequences inside living cells. Our results demonstrate that Bi-PIPs could act as a synthetic programmable histone code of acetylation, which emulates the bromodomain-mediated natural propagation system of histone acetylation to activate gene expression in a sequence-selective manner.
Recognition of Protein-coding Genes Based on Z-curve Algorithms

PubMed Central

Guo, Feng-Biao; Lin, Yan; Chen, Ling-Ling

2014-01-01

Recognition of protein-coding genes, a classical bioinformatics problem, is an essential step in annotating newly sequenced genomes. The Z-curve algorithm, one of the most effective methods for this task, has been successfully applied in annotating or re-annotating many genomes, including those of bacteria, archaea and viruses. Two Z-curve based ab initio gene-finding programs have been developed: ZCURVE (for bacteria and archaea) and ZCURVE_V (for viruses and phages). ZCURVE_C (for 57 bacteria) and Zfisher (for any bacterium) are web servers for re-annotation of bacterial and archaeal genomes. The above four tools can be used for genome annotation or re-annotation, either independently or combined with other gene-finding programs. In addition to recognizing protein-coding genes and exons, Z-curve algorithms are also effective in recognizing promoters and translation start sites. Here, we summarize the applications of Z-curve algorithms in gene finding and genome annotation. PMID:24822027
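
The abstract above does not spell out the transform itself. As a minimal sketch, assuming the commonly cited definition of the Z-curve (cumulative base counts mapped to three components: purine vs. pyrimidine, amino vs. keto, and weak vs. strong hydrogen bonding), the coordinates for a DNA sequence could be computed as follows. The function name `z_curve` is hypothetical, and this is not the implementation used by ZCURVE or the related tools.

```python
from typing import List, Tuple


def z_curve(seq: str) -> List[Tuple[int, int, int]]:
    """Return the (x, y, z) Z-curve point after each base of `seq`.

    Illustrative sketch only: assumes an unambiguous DNA sequence (A, C, G, T)
    and the standard cumulative-count definition of the Z-curve.
    """
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    points = []
    for base in seq.upper():
        if base not in counts:
            raise ValueError(f"unsupported base: {base!r}")
        counts[base] += 1
        a, c, g, t = counts["A"], counts["C"], counts["G"], counts["T"]
        x = (a + g) - (c + t)  # purines minus pyrimidines
        y = (a + c) - (g + t)  # amino bases minus keto bases
        z = (a + t) - (g + c)  # weak (A/T) minus strong (G/C) hydrogen bonding
        points.append((x, y, z))
    return points


if __name__ == "__main__":
    for point in z_curve("ATGC"):
        print(point)
```

Gene finders built on this representation typically derive features from such coordinates (for example, per codon position) and feed them to a classifier; that step is beyond what the abstract describes and is not sketched here.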
The Glucuronic Acid Utilization Gene Cluster from Bacillus stearothermophilus T-6

PubMed Central

Shulami, Smadar; Gat, Orit; Sonenshein, Abraham L.; Shoham, Yuval

1999-01-01

A λ-EMBL3 genomic library of Bacillus stearothermophilus T-6 was screened for hemicellulolytic activities, and five independent clones exhibiting β-xylosidase activity were isolated. The clones overlap each other and together represent a 23.5-kb chromosomal segment. The segment contains a cluster of xylan utilization genes, which are organized in at least three transcriptional units. These include the gene for the extracellular xylanase, xylanase T-6; part of an operon coding for an intracellular xylanase and a β-xylosidase; and a putative 15.5-kb-long transcriptional unit, consisting of 12 genes involved in the utilization of α-d-glucuronic acid (GlcUA). The first four genes in the potential GlcUA operon (orf1, -2, -3, and -4) code for a putative sugar transport system with characteristic components of the binding-protein-dependent transport systems. The most likely natural substrate for this transport system is aldotetraouronic acid [2-O-α-(4-O-methyl-α-d-glucuronosyl)-xylotriose] (MeGlcUAXyl3). The following two genes code for an intracellular α-glucuronidase (aguA) and a β-xylosidase (xynB). Five more genes (kdgK, kdgA, uxaC, uxuA, and uxuB) encode proteins that are homologous to enzymes involved in galacturonate and glucuronate catabolism. The gene cluster also includes a potential regulatory gene, uxuR, the product of which resembles repressors of the GntR family. The apparent transcriptional start point of the cluster was determined by primer extension analysis and is located 349 bp from the initial ATG codon. The potential operator site is a perfect 12-bp inverted repeat located downstream from the promoter between nucleotides +170 and +181. Gel retardation assays indicated that UxuR binds specifically to this sequence and that this binding is efficiently prevented in vitro by MeGlcUAXyl3, the most likely molecular inducer. PMID:10368143
Differential protein-coding gene and long noncoding RNA expression in smoking-related lung squamous cell carcinoma.

PubMed

Li, Shicheng; Sun, Xiao; Miao, Shuncheng; Liu, Jia; Jiao, Wenjie

2017-11-01

Cigarette smoking is one of the greatest preventable risk factors for developing cancer, and most cases of lung squamous cell carcinoma (lung SCC) are associated with smoking. The pathogenic mechanism of tumor progression is unclear. This study aimed to identify biomarkers in smoking-related lung cancer, including protein-coding genes, long noncoding RNAs, and transcription factors. We selected and obtained messenger RNA microarray datasets and clinical data from the Gene Expression Omnibus database to identify gene expression altered by cigarette smoking. Integrated bioinformatic analysis was used to clarify biological functions of the identified genes, including Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway, the construction of a protein-protein interaction network, transcription factor, and statistical analyses. Subsequent quantitative real-time PCR was utilized to verify these bioinformatic analyses. Five hundred and ninety-eight differentially expressed genes and 21 long noncoding RNAs were identified in smoking-related lung SCC. GO and KEGG pathway analysis showed that the identified genes were enriched in cancer-related functions and pathways. The protein-protein interaction network revealed seven hub genes in lung SCC. Several transcription factors and their binding sites were predicted. The results of real-time quantitative PCR revealed that AURKA and BIRC5 were significantly upregulated and LINC00094 was downregulated in the tumor tissues of smoking patients. Further statistical analysis indicated that dysregulation of AURKA, BIRC5, and LINC00094 indicated poor prognosis in lung SCC. The protein-coding genes AURKA and BIRC5 and the long noncoding RNA LINC00094 could be biomarkers or therapeutic targets for smoking-related lung SCC. © 2017 The Authors. Thoracic Cancer published by China Lung Oncology Group and John Wiley & Sons Australia, Ltd.
Evidence for an ergot alkaloid gene cluster in Claviceps purpurea.

PubMed

Tudzynski, P; Hölter, K; Correia, T; Arntz, C; Grammel, N; Keller, U

1999-02-01

A gene (cpd1) coding for the dimethylallyltryptophan synthase (DMATS) that catalyzes the first specific step in the biosynthesis of ergot alkaloids, was cloned from a strain of Claviceps purpurea that produces alkaloids in axenic culture. The derived gene product (CPD1) shows only 70% similarity to the corresponding gene previously isolated from Claviceps strain ATCC 26245, which is likely to be an isolate of C. fusiformis. Therefore, the related cpd1 most probably represents the first C. purpurea gene coding for an enzymatic step of the alkaloid biosynthetic pathway to be cloned. Analysis of the 3'-flanking region of cpd1 revealed a second, closely linked ergot alkaloid biosynthetic gene named cpps1, which codes for a 356-kDa polypeptide showing significant similarity to fungal modular peptide synthetases. The protein contains three amino acid-activating modules, and in the second module a sequence is found which matches that of an internal peptide (17 amino acids in length) obtained from a tryptic digest of lysergyl peptide synthetase 1 (LPS1) of C. purpurea, thus confirming that cpps1 encodes LPS1. LPS1 activates the three amino acids of the peptide portion of ergot peptide alkaloids during D-lysergyl peptide assembly. Chromosome walking revealed the presence of additional genes upstream of cpd1 which are probably also involved in ergot alkaloid biosynthesis: cpox1 probably codes for an FAD-dependent oxidoreductase (which could represent the chanoclavine cyclase), and a second putative oxidoreductase gene, cpox2, is closely linked to it in inverse orientation. RT-PCR experiments confirm that all four genes are expressed under conditions of peptide alkaloid biosynthesis. These results strongly suggest that at least some genes of ergot alkaloid biosynthesis in C. purpurea are clustered, opening the way for a detailed molecular genetic analysis of the pathway.
Expression, purification and functional reconstitution of slack sodium-activated potassium channels.

PubMed

Yan, Yangyang; Yang, Youshan; Bian, Shumin; Sigworth, Fred J

2012-11-01

The slack (slo2.2) gene codes for a potassium-channel α-subunit of the 6TM voltage-gated channel family. Expression of slack results in Na(+)-activated potassium channel activity in various cell types. We describe the purification and reconstitution of Slack protein and show that the Slack α-subunit alone is sufficient for potassium channel activity activated by sodium ions as assayed in planar bilayer membranes and in membrane vesicles.
Co-LncRNA: investigating the lncRNA combinatorial effects in GO annotations and KEGG pathways based on human RNA-Seq data

PubMed Central

Zhao, Zheng; Bai, Jing; Wu, Aiwei; Wang, Yuan; Zhang, Jinwen; Wang, Zishan; Li, Yongsheng; Xu, Juan; Li, Xia

2015-01-01

Long non-coding RNAs (lncRNAs) are emerging as key regulators of diverse biological processes and diseases. However, the combinatorial effects of these molecules in a specific biological function are poorly understood. Identifying co-expressed protein-coding genes of lncRNAs would provide ample insight into lncRNA functions. To facilitate such an effort, we have developed Co-LncRNA, a web-based computational tool that allows users to identify GO annotations and KEGG pathways that may be affected by co-expressed protein-coding genes of a single or multiple lncRNAs. LncRNA co-expressed protein-coding genes were first identified in publicly available human RNA-Seq datasets, including 241 datasets across 6560 total individuals representing 28 tissue types/cell lines. Then, the lncRNA combinatorial effects in a given GO annotation or KEGG pathway are taken into account by the simultaneous analysis of multiple lncRNAs in user-selected individual or multiple datasets, which is realized by enrichment analysis. In addition, this software provides a graphical overview of pathways that are modulated by lncRNAs, as well as a specific tool to display the relevant networks between lncRNAs and their co-expressed protein-coding genes. Co-LncRNA also supports users in uploading their own lncRNA and protein-coding gene expression profiles to investigate the lncRNA combinatorial effects. It will be updated annually with more human RNA-Seq datasets. Taken together, Co-LncRNA provides a web-based application for investigating lncRNA combinatorial effects, which could shed light on their biological roles and could be a valuable resource for this community. Database URL: http://www.bio-bigdata.com/Co-LncRNA/ PMID:26363020
Proteomic Analysis of Mitotic RNA Polymerase II Reveals Novel Interactors and Association With Proteins Dysfunctional in Disease

PubMed Central

Möller, André; Xie, Sheila Q.; Hosp, Fabian; Lang, Benjamin; Phatnani, Hemali P.; James, Sonya; Ramirez, Francisco; Collin, Gayle B.; Naggert, Jürgen K.; Babu, M. Madan; Greenleaf, Arno L.; Selbach, Matthias; Pombo, Ana

2012-01-01

RNA polymerase II (RNAPII) transcribes protein-coding genes in eukaryotes and interacts with factors involved in chromatin remodeling, transcriptional activation, elongation, and RNA processing. Here, we present the isolation of native RNAPII complexes using mild extraction conditions and immunoaffinity purification. RNAPII complexes were extracted from mitotic cells, where they exist dissociated from chromatin. The proteomic content of native complexes in total and size-fractionated extracts was determined using highly sensitive LC-MS/MS. Protein associations with RNAPII were validated by high-resolution immunolocalization experiments in both mitotic cells and in interphase nuclei. Functional assays of transcriptional activity were performed after siRNA-mediated knockdown. We identify >400 RNAPII associated proteins in mitosis, among these previously uncharacterized proteins for which we show roles in transcriptional elongation. We also identify, as novel functional RNAPII interactors, two proteins involved in human disease, ALMS1 and TFG, emphasizing the importance of gene regulation for normal development and physiology. PMID:22199231
GENCODE: the reference human genome annotation for The ENCODE Project.

PubMed

Harrow, Jennifer; Frankish, Adam; Gonzalez, Jose M; Tapanari, Electra; Diekhans, Mark; Kokocinski, Felix; Aken, Bronwen L; Barrell, Daniel; Zadissa, Amonida; Searle, Stephen; Barnes, If; Bignell, Alexandra; Boychenko, Veronika; Hunt, Toby; Kay, Mike; Mukherjee, Gaurab; Rajan, Jeena; Despacio-Reyes, Gloria; Saunders, Gary; Steward, Charles; Harte, Rachel; Lin, Michael; Howald, Cédric; Tanzer, Andrea; Derrien, Thomas; Chrast, Jacqueline; Walters, Nathalie; Balasubramanian, Suganthi; Pei, Baikang; Tress, Michael; Rodriguez, Jose Manuel; Ezkurdia, Iakes; van Baren, Jeltje; Brent, Michael; Haussler, David; Kellis, Manolis; Valencia, Alfonso; Reymond, Alexandre; Gerstein, Mark; Guigó, Roderic; Hubbard, Tim J

2012-09-01

The GENCODE Consortium aims to identify all gene features in the human genome using a combination of computational analysis, manual annotation, and experimental validation. Since the first public release of this annotation data set, few new protein-coding loci have been added, yet the number of alternative splicing transcripts annotated has steadily increased. The GENCODE 7 release contains 20,687 protein-coding and 9640 long noncoding RNA loci and has 33,977 coding transcripts not represented in UCSC genes and RefSeq. It also has the most comprehensive annotation of long noncoding RNA (lncRNA) loci publicly available with the predominant transcript form consisting of two exons. We have examined the completeness of the transcript annotation and found that 35% of transcriptional start sites are supported by CAGE clusters and 62% of protein-coding genes have annotated polyA sites. Over one-third of GENCODE protein-coding genes are supported by peptide hits derived from mass spectrometry spectra submitted to Peptide Atlas. New models derived from the Illumina Body Map 2.0 RNA-seq data identify 3689 new loci not currently in GENCODE, of which 3127 consist of two exon models indicating that they are possibly unannotated long noncoding loci. GENCODE 7 is publicly available from gencodegenes.org and via the Ensembl and UCSC Genome Browsers.
An open reading frame in intron seven of the sea urchin DNA-methyltransferase gene codes for a functional AP1 endonuclease.

PubMed

Cioffi, Anna Valentina; Ferrara, Diana; Cubellis, Maria Vittoria; Aniello, Francesco; Corrado, Marcella; Liguori, Francesca; Amoroso, Alessandro; Fucci, Laura; Branno, Margherita

2002-08-01

Analysis of the genome structure of the Paracentrotus lividus (sea urchin) DNA methyltransferase (DNA MTase) gene showed the presence of an open reading frame, named METEX, in intron 7 of the gene. METEX expression is developmentally regulated, showing no correlation with DNA MTase expression. In fact, DNA MTase transcripts are present at high concentrations in the early developmental stages, while METEX is expressed at late stages of development. Two METEX cDNA clones (Met1 and Met2) that are different in the 3' end have been isolated in a cDNA library screening. The putative translated protein from Met2 cDNA clone showed similarity with Escherichia coli endonuclease III on the basis of sequence and predictive three-dimensional structure. The protein, overexpressed in E. coli and purified, had functional properties similar to the endonuclease specific for apurinic/apyrimidinic (AP) sites on the basis of the lyase activity. Therefore the open reading frame, present in intron 7 of the P. lividus DNA MTase gene, codes for a functional AP endonuclease designated SuAP1.
The GENCODE exome: sequencing the complete human exome

PubMed Central

Coffey, Alison J; Kokocinski, Felix; Calafato, Maria S; Scott, Carol E; Palta, Priit; Drury, Eleanor; Joyce, Christopher J; LeProust, Emily M; Harrow, Jen; Hunt, Sarah; Lehesjoki, Anna-Elina; Turner, Daniel J; Hubbard, Tim J; Palotie, Aarno

2011-01-01

Sequencing the coding regions, the exome, of the human genome is one of the major current strategies to identify low frequency and rare variants associated with human disease traits. So far, the most widely used commercial exome capture reagents have mainly targeted the consensus coding sequence (CCDS) database. We report the design of an extended set of targets for capturing the complete human exome, based on annotation from the GENCODE consortium. The extended set covers an additional 5594 genes and 10.3 Mb compared with the current CCDS-based sets. The additional regions include potential disease genes previously inaccessible to exome resequencing studies, such as 43 genes linked to ion channel activity and 70 genes linked to protein kinase activity. In total, the new GENCODE exome set developed here covers 47.9 Mb and performed well in sequence capture experiments. In the sample set used in this study, we identified over 5000 SNP variants more in the GENCODE exome target (24%) than in the CCDS-based exome sequencing. PMID:21364695
Noncoding RNA Shows Context-Dependent Function | Center for Cancer Research

Cancer.gov

In addition to well-studied protein coding sequences, it is known that the genomes of higher organisms produce numerous noncoding RNAs (ncRNAs). Important roles for some ncRNAs in cell function have been demonstrated, though usually on a case-by-case basis, leading some scientists to argue that the majority of ncRNA production is just “noise” that results from the imperfect transcription machinery. The fact that many ncRNAs overlap with coding genes has hampered studies of their activities. Thus, a general understanding of whether ncRNA production is functional or not is lacking. To address this issue, Daniel Larson, Ph.D., of CCR’s Laboratory of Receptor Biology and Gene Expression, and his colleagues developed a new approach using single-molecule imaging in living cells. The researchers specifically labeled coding and ncRNAs from the GAL locus in yeast, which regulates the galactose response. Glucose is the preferred source of carbon for yeast, but when it is scarce, genes within the GAL locus, including GAL10 and GAL1, are activated to allow the metabolism of galactose.
Mutant phenotypes for thousands of bacterial genes of unknown function

DOE PAGES

Price, Morgan N.; Wetmore, Kelly M.; Waters, R. Jordan; ...

2018-05-16

One-third of all protein-coding genes from bacterial genomes cannot be annotated with a function. Here, to investigate the functions of these genes, we present genome-wide mutant fitness data from 32 diverse bacteria across dozens of growth conditions. We identified mutant phenotypes for 11,779 protein-coding genes that had not been annotated with a specific function. Many genes could be associated with a specific condition because the gene affected fitness only in that condition, or with another gene in the same bacterium because they had similar mutant phenotypes. Of the poorly annotated genes, 2,316 had associations that have high confidence because they are conserved in other bacteria. By combining these conserved associations with comparative genomics, we identified putative DNA repair proteins; in addition, we propose specific functions for poorly annotated enzymes and transporters and for uncharacterized protein families. Lastly, our study demonstrates the scalability of microbial genetics and its utility for improving gene annotations.
Mutant phenotypes for thousands of bacterial genes of unknown function

DOE Office of Scientific and Technical Information (OSTI.GOV)

Price, Morgan N.; Wetmore, Kelly M.; Waters, R. Jordan

One-third of all protein-coding genes from bacterial genomes cannot be annotated with a function. Here, to investigate the functions of these genes, we present genome-wide mutant fitness data from 32 diverse bacteria across dozens of growth conditions. We identified mutant phenotypes for 11,779 protein-coding genes that had not been annotated with a specific function. Many genes could be associated with a specific condition because the gene affected fitness only in that condition, or with another gene in the same bacterium because they had similar mutant phenotypes. Of the poorly annotated genes, 2,316 had associations that have high confidence because they are conserved in other bacteria. By combining these conserved associations with comparative genomics, we identified putative DNA repair proteins; in addition, we propose specific functions for poorly annotated enzymes and transporters and for uncharacterized protein families. Lastly, our study demonstrates the scalability of microbial genetics and its utility for improving gene annotations.
Analysis of protein-coding genetic variation in 60,706 humans.

PubMed

Lek, Monkol; Karczewski, Konrad J; Minikel, Eric V; Samocha, Kaitlin E; Banks, Eric; Fennell, Timothy; O'Donnell-Luria, Anne H; Ware, James S; Hill, Andrew J; Cummings, Beryl B; Tukiainen, Taru; Birnbaum, Daniel P; Kosmicki, Jack A; Duncan, Laramie E; Estrada, Karol; Zhao, Fengmei; Zou, James; Pierce-Hoffman, Emma; Berghout, Joanne; Cooper, David N; Deflaux, Nicole; DePristo, Mark; Do, Ron; Flannick, Jason; Fromer, Menachem; Gauthier, Laura; Goldstein, Jackie; Gupta, Namrata; Howrigan, Daniel; Kiezun, Adam; Kurki, Mitja I; Moonshine, Ami Levy; Natarajan, Pradeep; Orozco, Lorena; Peloso, Gina M; Poplin, Ryan; Rivas, Manuel A; Ruano-Rubio, Valentin; Rose, Samuel A; Ruderfer, Douglas M; Shakir, Khalid; Stenson, Peter D; Stevens, Christine; Thomas, Brett P; Tiao, Grace; Tusie-Luna, Maria T; Weisburd, Ben; Won, Hong-Hee; Yu, Dongmei; Altshuler, David M; Ardissino, Diego; Boehnke, Michael; Danesh, John; Donnelly, Stacey; Elosua, Roberto; Florez, Jose C; Gabriel, Stacey B; Getz, Gad; Glatt, Stephen J; Hultman, Christina M; Kathiresan, Sekar; Laakso, Markku; McCarroll, Steven; McCarthy, Mark I; McGovern, Dermot; McPherson, Ruth; Neale, Benjamin M; Palotie, Aarno; Purcell, Shaun M; Saleheen, Danish; Scharf, Jeremiah M; Sklar, Pamela; Sullivan, Patrick F; Tuomilehto, Jaakko; Tsuang, Ming T; Watkins, Hugh C; Wilson, James G; Daly, Mark J; MacArthur, Daniel G

2016-08-18

Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.
Predicting Gene Structure Changes Resulting from Genetic Variants via Exon Definition Features.

PubMed

Majoros, William H; Holt, Carson; Campbell, Michael S; Ware, Doreen; Yandell, Mark; Reddy, Timothy E

2018-04-25

Genetic variation that disrupts gene function by altering gene splicing between individuals can substantially influence traits and disease. In those cases, accurately predicting the effects of genetic variation on splicing can be highly valuable for investigating the mechanisms underlying those traits and diseases. While methods have been developed to generate high quality computational predictions of gene structures in reference genomes, the same methods perform poorly when used to predict the potentially deleterious effects of genetic changes that alter gene splicing between individuals. Underlying that discrepancy in predictive ability are the common assumptions by reference gene finding algorithms that genes are conserved, well-formed, and produce functional proteins. We describe a probabilistic approach for predicting recent changes to gene structure that may or may not conserve function. The model is applicable to both coding and noncoding genes, and can be trained on existing gene annotations without requiring curated examples of aberrant splicing. We apply this model to the problem of predicting altered splicing patterns in the genomes of individual humans, and we demonstrate that performing gene-structure prediction without relying on conserved coding features is feasible. The model predicts an unexpected abundance of variants that create de novo splice sites, an observation supported by both simulations and empirical data from RNA-seq experiments. While these de novo splice variants are commonly misinterpreted by other tools as coding or noncoding variants of little or no effect, we find that in some cases they can have large effects on splicing activity and protein products, and we propose that they may commonly act as cryptic factors in disease. The software is available from geneprediction.org/SGRF. Contact: bmajoros@duke.edu. Supplementary information is available at Bioinformatics online.

Complete nucleotide sequence of the freshwater unicellular cyanobacterium Synechococcus elongatus PCC 6301 chromosome: gene content and organization.

PubMed

Sugita, Chieko; Ogata, Koretsugu; Shikata, Masamitsu; Jikuya, Hiroyuki; Takano, Jun; Furumichi, Miho; Kanehisa, Minoru; Omata, Tatsuo; Sugiura, Masahiro; Sugita, Mamoru

2007-01-01

The entire genome of the unicellular cyanobacterium Synechococcus elongatus PCC 6301 (formerly Anacystis nidulans Berkeley strain 6301) was sequenced. The genome consisted of a circular chromosome 2,696,255 bp long. A total of 2,525 potential protein-coding genes, two sets of rRNA genes, 45 tRNA genes representing 42 tRNA species, and several genes for small stable RNAs were assigned to the chromosome by similarity searches and computer predictions. The translated products of 56% of the potential protein-coding genes showed sequence similarities to experimentally identified and predicted proteins of known function, and the products of 35% of the genes showed sequence similarities to the translated products of hypothetical genes. The remaining 9% of genes lacked significant similarities to genes for predicted proteins in the public DNA databases. Some 139 genes coding for photosynthesis-related components were identified. Thirty-seven genes for two-component signal transduction systems were also identified. This is the smallest number of such genes identified in cyanobacteria, except for marine cyanobacteria, suggesting that only simple signal transduction systems are found in this strain. The gene arrangement and nucleotide sequence of Synechococcus elongatus PCC 6301 were nearly identical to those of a closely related strain Synechococcus elongatus PCC 7942, except for the presence of a 188.6 kb inversion. The sequences as well as the gene information shown in this paper are available in the Web database, CYORF (http://www.cyano.genome.jp/).
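The gene counts behind these percentages can be recovered with simple arithmetic. The Python sketch below uses only figures stated in the abstract (chromosome length, number of potential protein-coding genes, and the three similarity classes). The class labels and function name are illustrative, and the derived counts are approximations because the abstract reports rounded percentages only.

```python
# Figures taken from the abstract above.
GENOME_LENGTH_BP = 2_696_255
PROTEIN_CODING_GENES = 2_525

# Percent of potential protein-coding genes per similarity class
# (the dictionary keys are illustrative labels, not the authors' terms).
SIMILARITY_CLASSES = {
    "similar_to_known_function": 56,
    "similar_to_hypothetical": 35,
    "no_significant_similarity": 9,
}


def approximate_counts(total, percentages):
    """Convert reported percentages into approximate (rounded) gene counts."""
    return {name: round(total * pct / 100) for name, pct in percentages.items()}


counts = approximate_counts(PROTEIN_CODING_GENES, SIMILARITY_CLASSES)

# Average stretch of chromosome per protein-coding gene, in base pairs.
mean_spacing_bp = GENOME_LENGTH_BP / PROTEIN_CODING_GENES

if __name__ == "__main__":
    for name, n in counts.items():
        print(f"{name}: ~{n} genes")
    print(f"about one protein-coding gene per {mean_spacing_bp:.0f} bp")
```

The three classes sum to 100%, so the rounded counts add back up to the 2,525 genes reported.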
Induction of multixenobiotic defense mechanisms in resistant Daphnia magna clones as a general cellular response to stress.

PubMed

Jordão, Rita; Campos, Bruno; Lemos, Marco F L; Soares, Amadeu M V M; Tauler, Romà; Barata, Carlos

2016-06-01

Multixenobiotic resistance (MXR) mechanisms were recently identified in Daphnia magna. Previous results characterized the transcripts of genes encoding four putative ABCB1 and ABCC transporters, together with the transporters' efflux activities; these were chemically induced but showed low specificity against model transporter substrates and inhibitors, thus preventing us from distinguishing between the activities of different efflux transporter types. In this study we report on the specificity of induction of ABC transporters and of the stress protein hsp70 in clones selected to be genetically resistant to ABCB1 chemical substrates. Clones resistant to mitoxantrone, ivermectin and pentachlorophenol showed distinctive transcriptional responses of transporter protein-coding genes and of putative transporter dye activities. Expression of hsp70 proteins also varied across resistant clones. Clones resistant to mitoxantrone and pentachlorophenol showed high constitutive levels of hsp70. Transcriptional levels of the abcb1 transporter gene and putative dye transporter activity were also induced to a greater extent in the pentachlorophenol-resistant clone. The higher dye transporter activities observed in individuals from clones resistant to mitoxantrone and ivermectin were unrelated to transcriptional levels of the four abcc and abcb1 transporter genes studied. These findings suggest that Abcb1 induction in D. magna may be part of a general cellular stress response.
OeFAD8, OeLIP and OeOSM expression and activity in cold-acclimation of Olea europaea, a perennial dicot without winter-dormancy.

PubMed

D'Angeli, Simone; Matteucci, Maya; Fattorini, Laura; Gismondi, Angelo; Ludovici, Matteo; Canini, Antonella; Altamura, Maria Maddalena

2016-05-01

Cold-acclimation genes in woody dicots without winter-dormancy, e.g., olive-tree, need investigation. Positive relationships between OeFAD8, OeOSM, and OeLIP19 and olive-tree cold-acclimation exist, and couple with increased lipid unsaturation and cutinisation. Olive-tree is a woody species with no winter-dormancy and low frost-tolerance. However, cold-tolerant genotypes were empirically selected, highlighting that cold-acclimation might be acquired. Proteins needed for olive-tree cold-acclimation are unknown, even if roles for osmotin (OeOSM) as leaf cryoprotectant, and seed lipid-transfer protein for endosperm cutinisation under cold, were demonstrated. In other species, FAD8, coding a desaturase producing α-linolenic acid, is activated by temperature-lowering, concomitantly with bZIP-LIP19 genes. The research focused on finding OeLIP19 gene(s) in the olive-tree genome and analyzing their expression, and that of OeFAD8 and OeOSM, in drupes and leaves under different cold-conditions/developmental stages/genotypes, in comparison with changes in unsaturated lipids and cell wall cutinisation. Cold-induced cytosolic calcium transients always occurred in leaves/drupes of some genotypes, e.g., Moraiolo, but ceased in others, e.g., Canino, at specific drupe stages/cold-treatments, suggesting cold-acclimation acquisition only in the latter genotypes. Canino and Moraiolo were selected for further analyses. Cold-acclimation in Canino was confirmed by electrolyte leakage from leaf/drupe membranes that was highly reduced in comparison with Moraiolo. Strong increases in fruit-epicarp/leaf-epidermis cutinisation characterized cold-acclimated Canino, and positively coupled with OeOSM expression and immunolocalization of the coded protein. OeFAD8 expression increased with cold-acclimation, as did the production of α-linolenic acid and related compounds. An OeLIP19 gene was isolated. Its levels changed with a trend similar to OeFAD8. Altogether, the results support a positive relationship between OeFAD8, OeOSM and OeLIP19 expression in olive-tree cold-acclimation. The parallel changes in unsaturated lipids and cutinisation together suggest orchestrated roles of the coded proteins in the process.
A genome-wide identification and analysis of the DYW-deaminase genes in the pentatricopeptide repeat gene family in cotton (Gossypium spp.)

PubMed Central

Liu, Guoyuan; Li, Xue; Guo, Liping; Zhang, Xuexian; Qi, Tingxiang; Wang, Hailin; Tang, Huini; Qiao, Xiuqin; Zhang, Jinfa; Xing, Chaozhu; Wu, Jianyong

2017-01-01

The RNA editing occurring in plant organellar genomes mainly involves the change of cytidine to uridine. This process involves a deamination reaction, with cytidine deaminase as the catalyst. Pentatricopeptide repeat (PPR) proteins with a C-terminal DYW domain are reportedly associated with cytidine deamination, similar to members of the deaminase superfamily. PPR genes are involved in many cellular functions and biological processes including fertility restoration to cytoplasmic male sterility (CMS) in plants. In this study, we identified 227 and 211 DYW deaminase-coding PPR genes for the cultivated tetraploid cotton species G. hirsutum and G. barbadense (2n = 4x = 52), respectively, as well as 126 and 97 DYW deaminase-coding PPR genes in the ancestral diploid species G. raimondii and G. arboreum (2n = 26), respectively. The 227 G. hirsutum PPR genes were predicted to encode 52–2016 amino acids, 203 of which were mapped onto 26 chromosomes. Most DYW deaminase genes lacked introns, and their proteins were predicted to target the mitochondria or chloroplasts. Additionally, the DYW domain differed from the complete DYW deaminase domain, which contained part of the E domain and the entire E+ domain. The types and number of DYW tripeptides may have been influenced by evolutionary processes, with some tripeptides being lost. Furthermore, a gene ontology analysis revealed that DYW deaminase functions were mainly related to binding as well as hydrolase and transferase activities. The G. hirsutum DYW deaminase expression profiles varied among different cotton tissues and developmental stages, and no differentially expressed DYW deaminase-coding PPRs were directly associated with the male sterility and restoration in the CMS-D2 system. 
Our current study provides an important piece of information regarding the structural and evolutionary characteristics of Gossypium DYW-containing PPR genes coding for deaminases and will be useful for characterizing the DYW deaminase gene family in cotton biology and breeding. PMID:28339482
Regulatory BC1 RNA in Cognitive Control

ERIC Educational Resources Information Center

Iacoangeli, Anna; Dosunmu, Aderemi; Eom, Taesun; Stefanov, Dimitre G.; Tiedge, Henri

2017-01-01

Dendritic regulatory BC1 RNA is a non-protein-coding (npc) RNA that operates in the translational control of gene expression. The absence of BC1 RNA in BC1 knockout (KO) animals causes translational dysregulation that entails neuronal phenotypic alterations including prolonged epileptiform discharges, audiogenic seizure activity in vivo, and…
Novel promoters and coding first exons in DLG2 linked to developmental disorders and intellectual disability.

PubMed

Reggiani, Claudio; Coppens, Sandra; Sekhara, Tayeb; Dimov, Ivan; Pichon, Bruno; Lufin, Nicolas; Addor, Marie-Claude; Belligni, Elga Fabia; Digilio, Maria Cristina; Faletra, Flavio; Ferrero, Giovanni Battista; Gerard, Marion; Isidor, Bertrand; Joss, Shelagh; Niel-Bütschi, Florence; Perrone, Maria Dolores; Petit, Florence; Renieri, Alessandra; Romana, Serge; Topa, Alexandra; Vermeesch, Joris Robert; Lenaerts, Tom; Casimir, Georges; Abramowicz, Marc; Bontempi, Gianluca; Vilain, Catheline; Deconinck, Nicolas; Smits, Guillaume

2017-07-19

Tissue-specific integrative omics has the potential to reveal new genic elements important for developmental disorders. Two pediatric patients with global developmental delay and intellectual disability phenotype underwent array-CGH genetic testing, both showing a partial deletion of the DLG2 gene. From independent human and murine omics datasets, we combined copy number variations, histone modifications, developmental tissue-specific regulation, and protein data to explore the molecular mechanism at play. Integrating genomics, transcriptomics, and epigenomics data, we describe two novel DLG2 promoters and coding first exons expressed in human fetal brain. Their murine conservation and protein-level evidence allowed us to produce new DLG2 gene models for human and mouse. These new genic elements are deleted in 90% of 29 patients (public and in-house) showing partial deletion of the DLG2 gene. The patients' clinical characteristics expand the neurodevelopmental phenotypic spectrum linked to DLG2 gene disruption to cognitive and behavioral categories. While protein-coding genes are regarded as well known, our work shows that integration of multiple omics datasets can unveil novel coding elements. From a clinical perspective, our work demonstrates that two new DLG2 promoters and exons are crucial for the neurodevelopmental phenotypes associated with this gene. In addition, our work brings evidence for the lack of cross-annotation in human versus mouse reference genomes and nucleotide versus protein databases.
Integrative Annotation of 21,037 Human Genes Validated by Full-Length cDNA Clones

PubMed Central

Imanishi, Tadashi; Itoh, Takeshi; Suzuki, Yutaka; O'Donovan, Claire; Fukuchi, Satoshi; Koyanagi, Kanako O; Barrero, Roberto A; Tamura, Takuro; Yamaguchi-Kabata, Yumi; Tanino, Motohiko; Yura, Kei; Miyazaki, Satoru; Ikeo, Kazuho; Homma, Keiichi; Kasprzyk, Arek; Nishikawa, Tetsuo; Hirakawa, Mika; Thierry-Mieg, Jean; Thierry-Mieg, Danielle; Ashurst, Jennifer; Jia, Libin; Nakao, Mitsuteru; Thomas, Michael A; Mulder, Nicola; Karavidopoulou, Youla; Jin, Lihua; Kim, Sangsoo; Yasuda, Tomohiro; Lenhard, Boris; Eveno, Eric; Suzuki, Yoshiyuki; Yamasaki, Chisato; Takeda, Jun-ichi; Gough, Craig; Hilton, Phillip; Fujii, Yasuyuki; Sakai, Hiroaki; Tanaka, Susumu; Amid, Clara; Bellgard, Matthew; Bonaldo, Maria de Fatima; Bono, Hidemasa; Bromberg, Susan K; Brookes, Anthony J; Bruford, Elspeth; Carninci, Piero; Chelala, Claude; Couillault, Christine; de Souza, Sandro J.; Debily, Marie-Anne; Devignes, Marie-Dominique; Dubchak, Inna; Endo, Toshinori; Estreicher, Anne; Eyras, Eduardo; Fukami-Kobayashi, Kaoru; Gopinath, Gopal R.; Graudens, Esther; Hahn, Yoonsoo; Han, Michael; Han, Ze-Guang; Hanada, Kousuke; Hanaoka, Hideki; Harada, Erimi; Hashimoto, Katsuyuki; Hinz, Ursula; Hirai, Momoki; Hishiki, Teruyoshi; Hopkinson, Ian; Imbeaud, Sandrine; Inoko, Hidetoshi; Kanapin, Alexander; Kaneko, Yayoi; Kasukawa, Takeya; Kelso, Janet; Kersey, Paul; Kikuno, Reiko; Kimura, Kouichi; Korn, Bernhard; Kuryshev, Vladimir; Makalowska, Izabela; Makino, Takashi; Mano, Shuhei; Mariage-Samson, Regine; Mashima, Jun; Matsuda, Hideo; Mewes, Hans-Werner; Minoshima, Shinsei; Nagai, Keiichi; Nagasaki, Hideki; Nagata, Naoki; Nigam, Rajni; Ogasawara, Osamu; Ohara, Osamu; Ohtsubo, Masafumi; Okada, Norihiro; Okido, Toshihisa; Oota, Satoshi; Ota, Motonori; Ota, Toshio; Otsuki, Tetsuji; Piatier-Tonneau, Dominique; Poustka, Annemarie; Ren, Shuang-Xi; Saitou, Naruya; Sakai, Katsunaga; Sakamoto, Shigetaka; Sakate, Ryuichi; Schupp, Ingo; Servant, Florence; Sherry, Stephen; Shiba, Rie; Shimizu, Nobuyoshi; Shimoyama, Mary; Simpson, Andrew J; Soares, Bento; Steward, Charles; Suwa, Makiko; Suzuki, Mami; Takahashi, Aiko; Tamiya, Gen; Tanaka, Hiroshi; Taylor, Todd; Terwilliger, Joseph D; Unneberg, Per; Veeramachaneni, Vamsi; Watanabe, Shinya; Wilming, Laurens; Yasuda, Norikazu; Yoo, Hyang-Sook; Stodolsky, Marvin; Makalowski, Wojciech; Go, Mitiko; Nakai, Kenta; Takagi, Toshihisa; Kanehisa, Minoru; Sakaki, Yoshiyuki; Quackenbush, John; Okazaki, Yasushi; Hayashizaki, Yoshihide; Hide, Winston; Chakraborty, Ranajit; Nishikawa, Ken; Sugawara, Hideaki; Tateno, Yoshio; Chen, Zhu; Oishi, Michio; Tonellato, Peter; Apweiler, Rolf; Okubo, Kousaku; Wagner, Lukas; Wiemann, Stefan; Strausberg, Robert L; Isogai, Takao; Auffray, Charles; Nomura, Nobuo; Sugano, Sumio

2004-01-01

The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. 
In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology. PMID:15103394
Multiplexed pyrosequencing of nine sea anemone (Cnidaria: Anthozoa: Hexacorallia: Actiniaria) mitochondrial genomes.

PubMed

Foox, Jonathan; Brugler, Mercer; Siddall, Mark Edward; Rodríguez, Estefanía

2016-07-01

Six complete and three partial actiniarian mitochondrial genomes were amplified in two semi-circles using long-range PCR and pyrosequenced in a single run on a 454 GS Junior, doubling the number of complete mitogenomes available within the order. Typical metazoan mtDNA features included circularity, 13 protein-coding genes, 2 ribosomal RNA genes, and length ranging from 17,498 to 19,727 bp. Several typical anthozoan mitochondrial genome features were also observed including the presence of only two transfer RNA genes, elevated A + T richness ranging from 54.9 to 62.4%, large intergenic regions, and group 1 introns interrupting NADH dehydrogenase subunit 5 and cytochrome c oxidase subunit I, the latter of which possesses a homing endonuclease gene. Within the sea anemone Alicia sansibarensis, we report the first mitochondrial gene order rearrangement within the Actiniaria, as well as putative novel non-canonical protein-coding genes. Phylogenetic analyses of all 13 protein-coding and 2 ribosomal genes largely corroborated current hypotheses of sea anemone interrelatedness, with a few lower-level differences.
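The A + T richness quoted above is a simple base-composition statistic. As a minimal sketch of how such a percentage is typically computed, the Python function below counts A and T among the unambiguous bases of a sequence. The function name and the toy sequence are illustrative assumptions and are not taken from the study, which does not describe its tooling in the abstract.

```python
def at_content(sequence: str) -> float:
    """Return the percentage of A and T among unambiguous bases (A, C, G, T).

    Ambiguity codes such as N are ignored rather than counted as G/C.
    """
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * (seq.count("A") + seq.count("T")) / counted


if __name__ == "__main__":
    toy = "ATTAGCATTA"  # hypothetical 10-base example, not real mitogenome data
    print(f"{at_content(toy):.1f}% A+T")
```

Applied to a full mitogenome sequence, the same calculation would give a whole-genome figure comparable to the 54.9 to 62.4% range reported here.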
Complete mitochondrial genome of Bactrocera arecae (Insecta: Tephritidae) by next-generation sequencing and molecular phylogeny of Dacini tribe

PubMed Central

Yong, Hoi-Sen; Song, Sze-Looi; Lim, Phaik-Eem; Chan, Kok-Gan; Chow, Wan-Loo; Eamsobhana, Praphathip

2015-01-01

The whole mitochondrial genome of the pest fruit fly Bactrocera arecae was obtained from next-generation sequencing of genomic DNA. It had a total length of 15,900 bp, consisting of 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes and a non-coding region (A + T-rich control region). The control region (952 bp) was flanked by rrnS and trnI genes. The start codons included 6 ATG, 3 ATT and 1 each of ATA, ATC, GTG and TCG. Eight TAA, two TAG, one incomplete TA and two incomplete T stop codons were represented in the protein-coding genes. The cloverleaf structure for trnS1 lacked the D-loop, and that of trnN and trnF lacked the TΨC-loop. Molecular phylogeny based on 13 protein-coding genes was concordant with 37 mitochondrial genes, with B. arecae having closest genetic affinity to B. tryoni. The subgenus Bactrocera of Dacini tribe and the Dacinae subfamily (Dacini and Ceratitidini tribes) were monophyletic. The whole mitogenome of B. arecae will serve as a useful dataset for studying the genetics, systematics and phylogenetic relationships of the many species of Bactrocera genus in particular, and tephritid fruit flies in general. PMID:26472633
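The start- and stop-codon tallies in this abstract can be checked against the stated gene counts. The short Python sketch below encodes only numbers given in the abstract; the dictionary labels and the helper name are illustrative.

```python
# Gene counts stated in the abstract.
PROTEIN_CODING_GENES = 13
RRNA_GENES = 2
TRNA_GENES = 22

# Codon usage as reported: one start and one stop signal per protein-coding gene.
start_codons = {"ATG": 6, "ATT": 3, "ATA": 1, "ATC": 1, "GTG": 1, "TCG": 1}
stop_codons = {"TAA": 8, "TAG": 2, "TA (incomplete)": 1, "T (incomplete)": 2}


def tally_matches(tally, expected_total):
    """Return True when the per-codon counts add up to the expected gene total."""
    return sum(tally.values()) == expected_total


if __name__ == "__main__":
    print("start codons consistent:", tally_matches(start_codons, PROTEIN_CODING_GENES))
    print("stop codons consistent:", tally_matches(stop_codons, PROTEIN_CODING_GENES))
    print("total genes:", PROTEIN_CODING_GENES + RRNA_GENES + TRNA_GENES)
```

Both tallies account for all 13 protein-coding genes, and the three gene classes together give the 37 mitochondrial genes used in the phylogeny.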
Cis-regulatory somatic mutations and gene-expression alteration in B-cell lymphomas.

PubMed

Mathelier, Anthony; Lefebvre, Calvin; Zhang, Allen W; Arenillas, David J; Ding, Jiarui; Wasserman, Wyeth W; Shah, Sohrab P

2015-04-23

With the rapid increase of whole-genome sequencing of human cancers, an important opportunity to analyze and characterize somatic mutations lying within cis-regulatory regions has emerged. A focus on protein-coding regions to identify nonsense or missense mutations disruptive to protein structure and/or function has led to important insights; however, the impact on gene expression of mutations lying within cis-regulatory regions remains under-explored. We analyzed somatic mutations from 84 matched tumor-normal whole genomes from B-cell lymphomas with accompanying gene expression measurements to elucidate the extent to which these cancers are disrupted by cis-regulatory mutations. We characterize mutations overlapping a high quality set of well-annotated transcription factor binding sites (TFBSs), covering a similar portion of the genome as protein-coding exons. Our results indicate that cis-regulatory mutations overlapping predicted TFBSs are enriched in promoter regions of genes involved in apoptosis or growth/proliferation. By integrating gene expression data with mutation data, our computational approach culminates with identification of cis-regulatory mutations most likely to participate in dysregulation of the gene expression program. The impact can be measured along with protein-coding mutations to highlight key mutations disrupting gene expression and pathways in cancer. Our study yields specific genes with disrupted expression triggered by genomic mutations in either the coding or the regulatory space. It implies that mutated regulatory components of the genome contribute substantially to cancer pathways. Our analyses demonstrate that identifying genomically altered cis-regulatory elements coupled with analysis of gene expression data will augment biological interpretation of mutational landscapes of cancers.
Complete mitochondrial genome of Palawan peacock-pheasant Polyplectron napoleonis (Galliformes, Phasianidae).

PubMed

Quach, Tommy; Brooks, Daniel M; Miranda, Hector C

2016-01-01

The complete mitochondrial genome of the Palawan peacock-pheasant Polyplectron napoleonis is 16,710 bp long and contains 13 protein-coding genes (PCGs), 2 rRNA genes, 22 tRNA genes and a control region. All protein-coding genes use the standard ATG start codon, except for cox1, which has a GTG start codon. Seven of the 13 PCGs have TAA stop codons, two have AGG (cox1 and nd6), and three (nd2, cox2 and nd4) have an incomplete stop codon consisting of a single T.
Short Exogenous Peptides Regulate Expression of CLE, KNOX1, and GRF Family Genes in Nicotiana tabacum.

PubMed

Fedoreyeva, L I; Dilovarova, T A; Ashapkin, V V; Martirosyan, Yu Ts; Khavinson, V Kh; Kharchenko, P N; Vanyushin, B F

2017-04-01

Exogenous short biologically active peptides epitalon (Ala-Glu-Asp-Gly), bronchogen (Ala-Glu-Asp-Leu), and vilon (Lys-Glu) at concentrations of 10⁻⁷ to 10⁻⁹ M significantly influence growth, development, and differentiation of tobacco (Nicotiana tabacum) callus cultures. Epitalon and bronchogen, in particular, both increase growth of calluses and stimulate formation and growth of leaves in plant regenerants. Because the regulatory activity of the short peptides appears at low peptide concentrations, their action to some extent resembles that of phytohormones, and it seems to have a signaling character and an epigenetic nature. In tobacco cells, the investigated peptides modulate the expression of genes, including those responsible for tissue formation and cell differentiation. These peptides differently modulate expression of CLE family genes coding for known endogenous regulatory peptides, the KNOX1 genes (transcription factor genes) and GRF (growth regulatory factor) genes coding for respective DNA-binding proteins such as topoisomerases, nucleases, and others. Thus, at the level of transcription, plants have a system of short peptide regulation of formation of long-known peptide regulators of growth and development. The peptides studied here may be related to a new generation of plant growth regulators. They can be used in experimental botany, plant molecular biology, biotechnology, and practical agronomy.
Conserved Curvature of RNA Polymerase I Core Promoter Beyond rRNA Genes: The Case of the Tritryps

PubMed Central

Smircich, Pablo; Duhagon, María Ana; Garat, Beatriz

2015-01-01

In trypanosomatids, the RNA polymerase I (RNAPI)-dependent promoters controlling the ribosomal RNA (rRNA) genes have been well identified. Although the RNAPI transcription machinery recognizes the DNA conformation instead of the DNA sequence of promoters, no conformational study has been reported for these promoters. Here we present the in silico analysis of the intrinsic DNA curvature of the rRNA gene core promoters in Trypanosoma brucei, Trypanosoma cruzi, and Leishmania major. We found that, in spite of the absence of sequence conservation, these promoters hold conformational properties similar to other eukaryotic rRNA promoters. Our results also indicated that the intrinsic DNA curvature pattern is conserved within the Leishmania genus and also among strains of T. cruzi and T. brucei. Furthermore, we analyzed the impact of point mutations on the intrinsic curvature and their impact on the promoter activity. Furthermore, we found that the core promoters of protein-coding genes transcribed by RNAPI in T. brucei show the same conserved conformational characteristics. Overall, our results indicate that DNA intrinsic curvature of the rRNA gene core promoters is conserved in these ancient eukaryotes and such conserved curvature might be a requirement of RNAPI machinery for transcription of not only rRNA genes but also protein-coding genes. PMID:26718450
Genes Involved in Anaerobic Metabolism of Phenol in the Bacterium Thauera aromatica

PubMed Central

Breinig, Sabine; Schiltz, Emile; Fuchs, Georg

2000-01-01

Genes involved in the anaerobic metabolism of phenol in the denitrifying bacterium Thauera aromatica have been studied. The first two committed steps in this metabolism appear to be phosphorylation of phenol to phenylphosphate by an unknown phosphoryl donor (“phenylphosphate synthase”) and subsequent carboxylation of phenylphosphate to 4-hydroxybenzoate under release of phosphate (“phenylphosphate carboxylase”). Both enzyme activities are strictly phenol induced. Two-dimensional gel electrophoresis allowed identification of several phenol-induced proteins. Based on N-terminal and internal amino acid sequences of such proteins, degenerate oligonucleotides were designed to identify the corresponding genes. A chromosomal DNA segment of about 14 kbp was sequenced which contained 10 genes transcribed in the same direction. These are organized in two adjacent gene clusters and include the genes coding for five identified phenol-induced proteins. Comparison with sequences in the databases revealed the following similarities: the gene products of two open reading frames (ORFs) are each similar to either the central part and N-terminal part of phosphoenolpyruvate synthases. We propose that these ORFs are components of the phenylphosphate synthase system. Three ORFs showed similarity to the ubiD gene product, 3-octaprenyl-4-hydroxybenzoate carboxy lyase; UbiD catalyzes the decarboxylation of a 4-hydroxybenzoate analogue in ubiquinone biosynthesis. Another ORF was similar to the ubiX gene product, an isoenzyme of UbiD. We propose that (some of) these four proteins are involved in the carboxylation of phenylphosphate. A 700-bp PCR product derived from one of these ORFs cross-hybridized with DNA from different Thauera and Azoarcus strains, even from those which have not been reported to grow with phenol. One ORF showed similarity to the mutT gene product, and three ORFs showed no strong similarities to sequences in the databases. 
Upstream of the first gene cluster, an ORF which is transcribed in the opposite direction codes for a protein highly similar to the DmpR regulatory protein of Pseudomonas putida. DmpR controls transcription of the genes of aerobic phenol metabolism, suggesting a similar regulation of anaerobic phenol metabolism by the putative regulator. PMID:11004186
Curated genome annotation of Oryza sativa ssp. japonica and comparative genome analysis with Arabidopsis thaliana

PubMed Central

Itoh, Takeshi; Tanaka, Tsuyoshi; Barrero, Roberto A.; Yamasaki, Chisato; Fujii, Yasuyuki; Hilton, Phillip B.; Antonio, Baltazar A.; Aono, Hideo; Apweiler, Rolf; Bruskiewich, Richard; Bureau, Thomas; Burr, Frances; Costa de Oliveira, Antonio; Fuks, Galina; Habara, Takuya; Haberer, Georg; Han, Bin; Harada, Erimi; Hiraki, Aiko T.; Hirochika, Hirohiko; Hoen, Douglas; Hokari, Hiroki; Hosokawa, Satomi; Hsing, Yue; Ikawa, Hiroshi; Ikeo, Kazuho; Imanishi, Tadashi; Ito, Yukiyo; Jaiswal, Pankaj; Kanno, Masako; Kawahara, Yoshihiro; Kawamura, Toshiyuki; Kawashima, Hiroaki; Khurana, Jitendra P.; Kikuchi, Shoshi; Komatsu, Setsuko; Koyanagi, Kanako O.; Kubooka, Hiromi; Lieberherr, Damien; Lin, Yao-Cheng; Lonsdale, David; Matsumoto, Takashi; Matsuya, Akihiro; McCombie, W. Richard; Messing, Joachim; Miyao, Akio; Mulder, Nicola; Nagamura, Yoshiaki; Nam, Jongmin; Namiki, Nobukazu; Numa, Hisataka; Nurimoto, Shin; O’Donovan, Claire; Ohyanagi, Hajime; Okido, Toshihisa; OOta, Satoshi; Osato, Naoki; Palmer, Lance E.; Quetier, Francis; Raghuvanshi, Saurabh; Saichi, Naomi; Sakai, Hiroaki; Sakai, Yasumichi; Sakata, Katsumi; Sakurai, Tetsuya; Sato, Fumihiko; Sato, Yoshiharu; Schoof, Heiko; Seki, Motoaki; Shibata, Michie; Shimizu, Yuji; Shinozaki, Kazuo; Shinso, Yuji; Singh, Nagendra K.; Smith-White, Brian; Takeda, Jun-ichi; Tanino, Motohiko; Tatusova, Tatiana; Thongjuea, Supat; Todokoro, Fusano; Tsugane, Mika; Tyagi, Akhilesh K.; Vanavichit, Apichart; Wang, Aihui; Wing, Rod A.; Yamaguchi, Kaori; Yamamoto, Mayu; Yamamoto, Naoyuki; Yu, Yeisoo; Zhang, Hao; Zhao, Qiang; Higo, Kenichi; Burr, Benjamin; Gojobori, Takashi; Sasaki, Takuji

2007-01-01

We present here the annotation of the complete genome of rice Oryza sativa L. ssp. japonica cultivar Nipponbare. All functional annotations for proteins and non-protein-coding RNA (npRNA) candidates were manually curated. Functions were identified or inferred in 19,969 (70%) of the proteins, and 131 possible npRNAs (including 58 antisense transcripts) were found. Almost 5000 annotated protein-coding genes were found to be disrupted in insertional mutant lines, which will accelerate future experimental validation of the annotations. The rice loci were determined by using cDNA sequences obtained from rice and other representative cereals. Our conservative estimate based on these loci and an extrapolation suggested that the gene number of rice is ∼32,000, which is smaller than previous estimates. We conducted comparative analyses between rice and Arabidopsis thaliana and found that both genomes possessed several lineage-specific genes, which might account for the observed differences between these species, while they had similar sets of predicted functional domains among the protein sequences. A system to control translational efficiency seems to be conserved across large evolutionary distances. Moreover, the evolutionary process of protein-coding genes was examined. Our results suggest that natural selection may have played a role for duplicated genes in both species, so that duplication was suppressed or favored in a manner that depended on the function of a gene. PMID:17210932
An operon from Lactobacillus helveticus composed of a proline iminopeptidase gene (pepI) and two genes coding for putative members of the ABC transporter family of proteins.

PubMed

Varmanen, P; Rantanen, T; Palva, A

1996-12-01

A proline iminopeptidase gene (pepI) of an industrial Lactobacillus helveticus strain was cloned and found to be organized in an operon-like structure of three open reading frames (ORF1, ORF2 and ORF3). ORF1 was preceded by a typical prokaryotic promoter region, and a putative transcription terminator was found downstream of ORF3, identified as the pepI gene. Using primer-extension analyses, only one transcription start site, upstream of ORF1, was identifiable in the predicted operon. Although the size of mRNA could not be judged by Northern analysis either with ORF1-, ORF2- or pepI-specific probes, reverse transcription-PCR analyses further supported the operon structure of the three genes. ORF1, ORF2 and ORF3 had coding capacities for 50.7, 24.5 and 33.8 kDa proteins, respectively. The ORF3-encoded PepI protein showed 65% identity with the PepI proteins from Lactobacillus delbrueckii subsp. bulgaricus and Lactobacillus delbrueckii subsp. lactis. The ORF1-encoded protein had significant homology with several members of the ABC transporter family but, with two distinct putative ATP-binding sites, it would represent an unusual type among the bacterial ABC transporters. ORF2 encoded a putative integral membrane protein also characteristic of the ABC transporter family. The pepI gene was overexpressed in Escherichia coli. Purified PepI hydrolysed only di- and tripeptides with proline in the first position. Optimum PepI activity was observed at pH 7.5 and 40 °C. A gel filtration analysis indicated that PepI is a dimer of Mr 53,000. PepI was shown to be a metal-independent serine peptidase having thiol groups at or near the active site. Kinetic studies with proline-p-nitroanilide as substrate revealed Km and Vmax values of 0.8 mM and 350 mmol min⁻¹ mg⁻¹, respectively, and a very high turnover number of 135,000 s⁻¹.
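To illustrate what the reported Km and Vmax imply, the Python sketch below evaluates the standard Michaelis-Menten rate law, v = Vmax·[S] / (Km + [S]), using the two values from the abstract. It assumes simple single-substrate Michaelis-Menten behaviour, which the abstract does not state explicitly. The function name is illustrative, and rates are left in the abstract's own Vmax units.

```python
import math

KM_MM = 0.8    # Km for proline-p-nitroanilide, in mM, as reported
VMAX = 350.0   # Vmax, in the units reported in the abstract


def michaelis_menten_rate(substrate_mm: float, km: float = KM_MM, vmax: float = VMAX) -> float:
    """Reaction rate at a given substrate concentration (mM).

    Assumes simple Michaelis-Menten kinetics: v = Vmax * [S] / (Km + [S]).
    """
    if substrate_mm < 0:
        raise ValueError("substrate concentration must be non-negative")
    return vmax * substrate_mm / (km + substrate_mm)


if __name__ == "__main__":
    for s in (0.1, 0.8, 8.0):
        fraction = michaelis_menten_rate(s) / VMAX
        print(f"[S] = {s} mM -> {fraction:.0%} of Vmax")
```

By definition the rate is half of Vmax when the substrate concentration equals Km (0.8 mM here), and it approaches Vmax only at concentrations well above Km.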
Evidence for the recent origin of a bacterial protein-coding, overlapping orphan gene by evolutionary overprinting.

PubMed

Fellner, Lea; Simon, Svenja; Scherling, Christian; Witting, Michael; Schober, Steffen; Polte, Christine; Schmitt-Kopplin, Philippe; Keim, Daniel A; Scherer, Siegfried; Neuhaus, Klaus

2015-12-18

Gene duplication is believed to be the classical way to form novel genes, but overprinting may be an important alternative. Overprinting allows entirely novel proteins to evolve de novo, i.e., formerly non-coding open reading frames within functional genes become expressed. Only three cases have been described for Escherichia coli. Here, a fourth example is presented. RNA sequencing revealed an open reading frame weakly transcribed in cow dung, coding for 101 residues and embedded completely in the -2 reading frame of citC in enterohemorrhagic E. coli. This gene is designated novel overlapping gene, nog1. The promoter region fused to gfp exhibits specific activities and 5' rapid amplification of cDNA ends indicated the transcriptional start 40-bp upstream of the start codon. nog1 was strand-specifically arrested in translation by a nonsense mutation silent in citC. This Nog1-mutant showed a phenotype in competitive growth against wild type in the presence of MgCl2. Small differences in metabolite concentrations were also found. Bioinformatic analyses propose Nog1 to be inner membrane-bound and to possess at least one membrane-spanning domain. A phylogenetic analysis suggests that the orphan gene nog1 arose by overprinting after Escherichia/Shigella separated from the other γ-proteobacteria. Since nog1 is of recent origin, non-essential, short, weakly expressed and only marginally involved in E. coli's central metabolism, we propose that this gene is in an initial stage of evolution. While we present specific experimental evidence for the existence of a fourth overlapping gene in enterohemorrhagic E. coli, we believe that this may be an initial finding only and overlapping genes in bacteria may be more common than is currently assumed by microbiologists.
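To make the idea of an open reading frame embedded in another reading frame of a gene more concrete, the following sketch scans a DNA string for ATG-to-stop ORFs in all six reading frames. It is a toy illustration, not the pipeline used in the study: the function names, the minimum-length parameter and the frame numbering (relative to the start of the input sequence, not to an annotated gene such as citC) are assumptions, and only the standard stop codons are considered.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 3):
    """Yield (frame, start, end, orf) for ATG...stop ORFs in six frames.

    Frames +1..+3 read the given strand, -1..-3 its reverse complement.
    start and end are 0-based offsets on the strand being read, and
    min_codons counts codons before the stop codon.
    """
    seq = seq.upper()
    for sign, strand in ((1, seq), (-1, reverse_complement(seq))):
        for offset in range(3):
            start = None
            for i in range(offset, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        yield (sign * (offset + 1), start, i + 3,
                               strand[start:i + 3])
                    start = None  # look for the next ORF in this frame


if __name__ == "__main__":
    toy = "ATGAAACCCGGGTAA"  # hypothetical sequence, for illustration only
    for hit in find_orfs(toy):
        print(hit)
```

An overlapping gene candidate would show up as a hit on either strand whose coordinates fall inside an annotated gene read in a different frame.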
Core histone genes of Giardia intestinalis: genomic organization, promoter structure, and expression

PubMed Central

Yee, Janet; Tang, Anita; Lau, Wei-Ling; Ritter, Heather; Delport, Dewald; Page, Melissa; Adam, Rodney D; Müller, Miklós; Wu, Gang

2007-01-01

Background Giardia intestinalis is a protist found in freshwaters worldwide, and is the most common cause of parasitic diarrhea in humans. The phylogenetic position of this parasite is still much debated. Histones are small, highly conserved proteins that associate tightly with DNA to form chromatin within the nucleus. There are two classes of core histone genes in higher eukaryotes: DNA replication-independent histones and DNA replication-dependent ones. Results We identified two copies each of the core histone H2a, H2b and H3 genes, and three copies of the H4 gene, at separate locations on chromosomes 3, 4 and 5 within the genome of Giardia intestinalis, but no gene encoding a H1 linker histone could be recognized. The copies of each gene share extensive DNA sequence identities throughout their coding and 5' noncoding regions, which suggests these copies have arisen from relatively recent gene duplications or gene conversions. The transcription start sites are at triplet A sequences 1–27 nucleotides upstream of the translation start codon for each gene. We determined that a 50 bp region upstream from the start of the histone H4 coding region is the minimal promoter, and a highly conserved 15 bp sequence called the histone motif (him) is essential for its activity. The Giardia core histone genes are constitutively expressed at approximately equivalent levels and their mRNAs are polyadenylated. Competition gel-shift experiments suggest that a factor within the protein complex that binds him may also be a part of the protein complexes that bind other promoter elements described previously in Giardia. Conclusion In contrast to other eukaryotes, the Giardia genome has only a single class of core histone genes that encode replication-independent histones. Our inability to locate a gene encoding the linker histone H1 leads us to speculate that the H1 protein may not be required for the compaction of Giardia's small and gene-rich genome. PMID:17425802
The splicing of tiny introns of Paramecium is controlled by MAGO.

PubMed

Contreras, Julia; Begley, Victoria; Marsella, Laura; Villalobo, Eduardo

2018-07-15

The exon junction complex (EJC) is a key element of the splicing machinery. The EJC core is composed of eIF4A3, MAGO, Y14 and MLN51. Few accessory proteins, such as CWC22 or UPF3, bind transiently to the EJC. The EJC has been implicated in the control of the splicing of long introns. To ascertain whether the EJC controls the splicing of short introns, we used Paramecium tetraurelia as a model organism, since it has thousands of very tiny introns. To elucidate whether EJC affects intron splicing in P. tetraurelia, we searched for EJC protein-coding genes, and silenced those genes coding for eIF4A3, MAGO and CWC22. We found that P. tetraurelia likely assembles an active EJC with only three of the core proteins, since MLN51 is lacking. Silencing of eIF4A3 or CWC22 genes, but not that of MAGO, caused lethality. Silencing of the MAGO gene caused either an increase, decrease, or no change in intron retention levels of some intron-containing mRNAs used as reporters. We suggest that a fine-tuning expression of EJC genes is required for steady intron removal in P. tetraurelia. Taking into consideration our results and those published by others, we conclude that the EJC controls splicing independently of the intron size. Copyright © 2018 Elsevier B.V. All rights reserved.
Rate heterogeneity in six protein-coding genes from the holoparasite Balanophora (Balanophoraceae) and other taxa of Santalales

PubMed Central

Su, Huei-Jiun; Hu, Jer-Ming

2012-01-01

Background and Aims The holoparasitic flowering plant Balanophora displays extreme floral reduction and was previously found to have enormous rate acceleration in the nuclear 18S rDNA region. So far, it remains unclear whether non-ribosomal, protein-coding genes of Balanophora also evolve in an accelerated fashion and whether the genes with high substitution rates retain their functionality. To tackle these issues, six different genes were sequenced from two Balanophora species and their rate variation and expression patterns were examined. Methods Sequences including nuclear PI, euAP3, TM6, LFY and RPB2 and mitochondrial matR were determined from two Balanophora spp. and compared with selected hemiparasitic species of Santalales and autotrophic core eudicots. Gene expression was detected for the six protein-coding genes and the expression patterns of the three B-class genes (PI, AP3 and TM6) were further examined across different organs of B. laxiflora using RT-PCR. Key Results Balanophora mitochondrial matR is highly accelerated in both nonsynonymous (dN) and synonymous (dS) substitution rates, whereas the rate variation of nuclear genes LFY, PI, euAP3, TM6 and RPB2 is less dramatic. Significant dS increases were detected in Balanophora PI, TM6, RPB2 and dN accelerations in euAP3. All of the protein-coding genes are expressed in inflorescences, indicative of their functionality. PI is restrictively expressed in tepals, synandria and floral bracts, whereas AP3 and TM6 are widely expressed in both male and female inflorescences. Conclusions Despite the observation that rates of sequence evolution are generally higher in Balanophora than in hemiparasitic species of Santalales and autotrophic core eudicots, the five nuclear protein-coding genes are functional and are evolving at a much slower rate than 18S rDNA. The mechanism or mechanisms responsible for rapid sequence evolution and concomitant rate acceleration for 18S rDNA and matR are currently not well understood and require further study in Balanophora and other holoparasites. PMID:23041381
Transcriptomes of six mutants in the Sen1 pathway reveal combinatorial control of transcription termination across the Saccharomyces cerevisiae genome

PubMed Central

Carver, Melissa N.; Müller, Ulrika; Bekiranov, Stefan; Auble, David T.

2017-01-01

Transcriptome studies on eukaryotic cells have revealed an unexpected abundance and diversity of noncoding RNAs synthesized by RNA polymerase II (Pol II), some of which influence the expression of protein-coding genes. Yet, much less is known about biogenesis of Pol II non-coding RNA than mRNAs. In the budding yeast Saccharomyces cerevisiae, initiation of non-coding transcripts by Pol II appears to be similar to that of mRNAs, but a distinct pathway is utilized for termination of most non-coding RNAs: the Sen1-dependent or “NNS” pathway. Here, we examine the effect on the S. cerevisiae transcriptome of conditional mutations in the genes encoding six different essential proteins that influence Sen1-dependent termination: Sen1, Nrd1, Nab3, Ssu72, Rpb11, and Hrp1. We observe surprisingly diverse effects on transcript abundance for the different proteins that cannot be explained simply by differing severity of the mutations. Rather, we infer from our results that termination of Pol II transcription of non-coding RNA genes is subject to complex combinatorial control that likely involves proteins beyond those studied here. Furthermore, we identify new targets and functions of Sen1-dependent termination, including a role in repression of meiotic genes in vegetative cells. In combination with other recent whole-genome studies on termination of non-coding RNAs, our results provide promising directions for further investigation. PMID:28665995
Molecular cloning and functional characterization of an antifungal PR-5 protein from Ocimum basilicum.

PubMed

Rather, Irshad Ahmad; Awasthi, Praveen; Mahajan, Vidushi; Bedi, Yashbir S; Vishwakarma, Ram A; Gandhi, Sumit G

2015-03-01

Pathogenesis-related (PR) proteins are involved in biotic and abiotic stress responses of plants and are grouped into 17 families (PR-1 to PR-17). The PR-5 family includes proteins related to thaumatin and osmotin, with several members possessing antimicrobial properties. In this study, a PR-5 gene showing a high degree of homology with osmotin-like protein was isolated from sweet basil (Ocimum basilicum L.). A complete open reading frame consisting of 675 nucleotides, coding for a precursor protein, was obtained by PCR amplification. Based on sequence comparisons with tobacco osmotin and other osmotin-like proteins (OLPs), this protein was named ObOLP. The predicted mature protein is 225 amino acids in length and contains 16 cysteine residues that may potentially form eight disulfide bonds, a signature common to most PR-5 proteins. Among the various abiotic stress treatments tested, including high salt, mechanical wounding and exogenous phytohormone/elicitor treatments, methyl jasmonate (MeJA) and mechanical wounding significantly induced the expression of the ObOLP gene. The coding sequence of ObOLP was cloned and expressed in a bacterial host, resulting in a 25 kDa recombinant His-tagged protein displaying antifungal activity. The ObOLP protein sequence appears to contain an N-terminal signal peptide with signatures of the secretory pathway. Further, our experimental data show that ObOLP expression is regulated transcriptionally, and in silico analysis suggests that it may be post-transcriptionally and post-translationally regulated through microRNAs and post-translational protein modifications, respectively. This study appears to be the first report of isolation and characterization of an osmotin-like protein gene from O. basilicum. Copyright © 2014 Elsevier B.V. All rights reserved.
Characterization of the Lymantria dispar nucleopolyhedrovirus 25K FP gene

Treesearch

David S. Bischoff; James M. Slavicek

1996-01-01

The Lymantria dispar nucleopolyhedrovirus (LdMNPV) gene encoding the 25K FP protein has been cloned and sequenced. The 25K FP gene codes for a 217 amino acid protein with a predicted molecular mass of 24870 Da. Expression of the 25K FP protein in a rabbit reticulocyte system generated a 27 kDa protein, in close agreement with the...
Role of genomic architecture in the expression dynamics of long noncoding RNAs during differentiation of human neuroblastoma cells.

PubMed

Batagov, Arsen O; Yarmishyn, Aliaksandr A; Jenjaroenpun, Piroon; Tan, Jovina Z; Nishida, Yuichiro; Kurochkin, Igor V

2013-10-16

Mammalian genomes are extensively transcribed, producing thousands of long non-protein-coding RNAs (lncRNAs). The biological significance and function of the vast majority of lncRNAs remain unclear. Recent studies have implicated several lncRNAs as playing important roles in embryonic development and cancer progression. LncRNAs are characterized by different genomic architectures in relation to their associated protein-coding genes. Our study aimed to link lncRNA architecture with the dynamic patterns of their expression using a model of differentiating human neuroblastoma cells. LncRNA expression was studied in a 120-hour time course of differentiation of human neuroblastoma SH-SY5Y cells into neurons upon treatment with retinoic acid (RA), a compound used for the treatment of neuroblastoma. A custom microarray chip was utilized to interrogate expression levels of 9,267 lncRNAs in the course of differentiation. We categorized lncRNAs into 19 architecture classes according to their position relative to protein-coding genes. For each architecture class, the dynamics of lncRNA expression were studied in association with their protein-coding partners. This allowed us to demonstrate a positive correlation of lncRNAs with their associated protein-coding genes at bidirectional promoters and for sense-antisense transcript pairs. In contrast, lncRNAs located in the introns and downstream of the protein-coding genes were characterized by negative correlation modes. We further classified the lncRNAs by the temporal patterns of their expression dynamics. We found that intronic and bidirectional promoter architectures are associated with rapid RA-dependent induction or repression of the corresponding lncRNAs, followed by their constant expression. At the same time, lncRNAs expressed downstream of protein-coding genes are characterized by rapid induction, followed by transcriptional repression. Quantitative RT-PCR analysis confirmed the discovered functional modes for several selected lncRNAs associated with proteins involved in cancer and embryonic development. This is the first report detailing dynamic changes of multiple lncRNAs during RA-induced neuroblastoma differentiation. Integration of genomic and transcriptomic levels of information allowed us to demonstrate specific behavior of lncRNAs organized in different genomic architectures. This study also provides a list of lncRNAs with possible roles in neuroblastoma.
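The positive and negative correlation modes described in this abstract can be illustrated with a plain Pearson correlation between the time-course expression of an lncRNA and that of its protein-coding partner. The sketch below is an assumption about how such a comparison might be set up, not the authors' method: the abstract does not say which correlation measure was used, and the expression values are invented for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (norm_x * norm_y)


if __name__ == "__main__":
    # Hypothetical expression values at successive time points.
    lncrna = [1.0, 2.0, 3.0, 4.0]
    partner_up = [2.1, 3.9, 6.2, 8.0]    # rises with the lncRNA
    partner_down = [8.0, 6.1, 4.2, 1.9]  # falls as the lncRNA rises
    print(f"co-induced pair: r = {pearson(lncrna, partner_up):+.3f}")
    print(f"opposed pair:    r = {pearson(lncrna, partner_down):+.3f}")
```

A coefficient near +1 corresponds to the co-regulated behavior reported for bidirectional-promoter and sense-antisense pairs, and a coefficient near -1 to the opposed behavior reported for intronic and downstream lncRNAs.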
Identification of Methanococcus Jannaschii Proteins in 2-D Gel Electrophoresis Patterns by Mass Spectrometry

DOE R&D Accomplishments Database

Liang, X.

1998-06-10

The genome of Methanococcus jannaschii has been sequenced completely and has been found to contain approximately 1,770 predicted protein-coding regions. When these coding regions are expressed and how their expression is regulated, however, remain open questions. In this work, mass spectrometry was combined with two-dimensional gel electrophoresis to identify which proteins the genes produce under different growth conditions, and thus investigate the regulation of genes responsible for functions characteristic of this thermophilic representative of the methanogenic Archaea.
Heat Shock Protein Genes Undergo Dynamic Alteration in Their Three-Dimensional Structure and Genome Organization in Response to Thermal Stress

PubMed Central

Chowdhary, Surabhi; Kainth, Amoldeep S.

2017-01-01

Three-dimensional (3D) chromatin organization is important for proper gene regulation, yet how the genome is remodeled in response to stress is largely unknown. Here, we use a highly sensitive version of chromosome conformation capture in combination with fluorescence microscopy to investigate Heat Shock Protein (HSP) gene conformation and 3D nuclear organization in budding yeast. In response to acute thermal stress, HSP genes undergo intense intragenic folding interactions that go well beyond 5′-3′ gene looping previously described for RNA polymerase II genes. These interactions include looping between upstream activation sequence (UAS) and promoter elements, promoter and terminator regions, and regulatory and coding regions (gene “crumpling”). They are also dynamic, being prominent within 60 s, peaking within 2.5 min, and attenuating within 30 min, and correlate with HSP gene transcriptional activity. With similarly striking kinetics, activated HSP genes, both chromosomally linked and unlinked, coalesce into discrete intranuclear foci. Constitutively transcribed genes also loop and crumple yet fail to coalesce. Notably, a missense mutation in transcription factor TFIIB suppresses gene looping, yet neither crumpling nor HSP gene coalescence is affected. An inactivating promoter mutation, in contrast, obviates all three. Our results provide evidence for widespread, transcription-associated gene crumpling and demonstrate the de novo assembly and disassembly of HSP gene foci. PMID:28970326
Heat Shock Protein Genes Undergo Dynamic Alteration in Their Three-Dimensional Structure and Genome Organization in Response to Thermal Stress.

PubMed

Chowdhary, Surabhi; Kainth, Amoldeep S; Gross, David S

2017-12-15

Three-dimensional (3D) chromatin organization is important for proper gene regulation, yet how the genome is remodeled in response to stress is largely unknown. Here, we use a highly sensitive version of chromosome conformation capture in combination with fluorescence microscopy to investigate Heat Shock Protein (HSP) gene conformation and 3D nuclear organization in budding yeast. In response to acute thermal stress, HSP genes undergo intense intragenic folding interactions that go well beyond 5'-3' gene looping previously described for RNA polymerase II genes. These interactions include looping between upstream activation sequence (UAS) and promoter elements, promoter and terminator regions, and regulatory and coding regions (gene "crumpling"). They are also dynamic, being prominent within 60 s, peaking within 2.5 min, and attenuating within 30 min, and correlate with HSP gene transcriptional activity. With similarly striking kinetics, activated HSP genes, both chromosomally linked and unlinked, coalesce into discrete intranuclear foci. Constitutively transcribed genes also loop and crumple yet fail to coalesce. Notably, a missense mutation in transcription factor TFIIB suppresses gene looping, yet neither crumpling nor HSP gene coalescence is affected. An inactivating promoter mutation, in contrast, obviates all three. Our results provide evidence for widespread, transcription-associated gene crumpling and demonstrate the de novo assembly and disassembly of HSP gene foci. Copyright © 2017 American Society for Microbiology.
Junk DNA and the long non-coding RNA twist in cancer genetics

PubMed Central

Ling, Hui; Vincent, Kimberly; Pichler, Martin; Fodde, Riccardo; Berindan-Neagoe, Ioana; Slack, Frank J.; Calin, George A

2015-01-01

The central dogma of molecular biology states that the flow of genetic information moves from DNA to RNA to protein. However, in the last decade this dogma has been challenged by new findings on non-coding RNAs (ncRNAs) such as microRNAs (miRNAs). More recently, long non-coding RNAs (lncRNAs) have attracted much attention due to their large number and biological significance. Many lncRNAs have been identified as mapping to regulatory elements including gene promoters and enhancers, ultraconserved regions, and intergenic regions of protein-coding genes. Yet, the biological function and molecular mechanisms of lncRNA in human diseases in general and cancer in particular remain largely unknown. Data from the literature suggest that lncRNA, often via interaction with proteins, functions in specific genomic loci or uses its own transcription loci for regulatory activity. In this review, we summarize recent findings supporting the importance of DNA loci in lncRNA function, and the underlying molecular mechanisms via cis or trans regulation, and discuss their implications in cancer. In addition, we use the 8q24 genomic locus, a region containing interactive SNPs, DNA regulatory elements and lncRNAs, as an example to illustrate how single nucleotide polymorphisms (SNPs) located within lncRNAs may be functionally associated with the individual’s susceptibility to cancer. PMID:25619839
Mitochondrial and cytoplasmic isoleucyl-, glutamyl- and arginyl-tRNA synthetases of yeast are encoded by separate genes.

PubMed

Tzagoloff, A; Shtanko, A

1995-06-01

Three complementation groups of a pet mutant collection have been found to be composed of respiratory-deficient mutants with lesions in mitochondrial protein synthesis. Recombinant plasmids capable of restoring respiration were cloned by transformation of representatives of each complementation group with a yeast genomic library. The plasmids were used to characterize the complementing genes and to institute disruption of the chromosomal copies of each gene in respiratory-proficient yeast. The sequences of the cloned genes indicate that they code for isoleucyl-, arginyl- and glutamyl-tRNA synthetases. The properties of the mutants used to obtain the genes and of strains with the disrupted genes indicate that all three aminoacyl-tRNA synthetases function exclusively in mitochondrial protein synthesis. The ISM1 gene for mitochondrial isoleucyl-tRNA synthetase has been localized to chromosome XVI next to UME5. The MSR1 gene for the arginyl-tRNA synthetase was previously located on yeast chromosome VIII. The third gene, MSE1, for the mitochondrial glutamyl-tRNA synthetase has not been localized. The identification of three new genes coding for mitochondrial-specific aminoacyl-tRNA synthetases indicates that in Saccharomyces cerevisiae at least 11 members of this protein family are encoded by genes distinct from those coding for the homologous cytoplasmic enzymes.
Structural and functional studies of a family of Dictyostelium discoideum developmentally regulated, prestalk genes coding for small proteins.

PubMed

Vicente, Juan J; Galardi-Castilla, María; Escalante, Ricardo; Sastre, Leandro

2008-01-03

The social amoeba Dictyostelium discoideum executes a multicellular development program upon starvation. This morphogenetic process requires the differential regulation of a large number of genes and is coordinated by extracellular signals. The MADS-box transcription factor SrfA is required for several stages of development, including slug migration and spore terminal differentiation. Subtractive hybridization allowed the isolation of a gene, sigN (SrfA-induced gene N), that was dependent on the transcription factor SrfA for expression at the slug stage of development. Homology searches detected the existence of a large family of sigN-related genes in the Dictyostelium discoideum genome. The 13 most similar genes are grouped in two regions of chromosome 2 and have been named Group1 and Group2 sigN genes. The putative encoded proteins are 87-89 amino acids long. All these genes have a similar structure, composed of a first exon containing a 13-nucleotide-long open reading frame and a second exon comprising the remainder of the putative coding region. The expression of these genes is induced at 10 hours of development. Analyses of their promoter regions indicate that these genes are expressed in the prestalk region of developing structures. The addition of antibodies raised against SigN Group 2 proteins induced disintegration of multi-cellular structures at the mound stage of development. A large family of genes coding for small proteins has been identified in D. discoideum. Two groups of very similar genes from this family have been shown to be specifically expressed in prestalk cells during development. Functional studies using antibodies raised against Group 2 SigN proteins indicate that these genes could play a role during multicellular development.
Genetic and molecular characterization of a gene encoding a wide specificity purine permease of Aspergillus nidulans reveals a novel family of transporters conserved in prokaryotes and eukaryotes.

PubMed

Diallinas, G; Gorfinkiel, L; Arst, H N; Cecchetto, G; Scazzocchio, C

1995-04-14

In Aspergillus nidulans, loss-of-function mutations in the uapA and azgA genes, encoding the major uric acid-xanthine and hypoxanthine-adenine-guanine permeases, respectively, result in impaired utilization of these purines as sole nitrogen sources. The residual growth of the mutant strains is due to the activity of a broad specificity purine permease. We have identified uapC, the gene coding for this third permease through the isolation of both gain-of-function and loss-of-function mutations. Uptake studies with wild-type and mutant strains confirmed the genetic analysis and showed that the UapC protein contributes 30% and 8-10% to uric acid and hypoxanthine transport rates, respectively. The uapC gene was cloned, its expression studied, its sequence and transcript map established, and the sequence of its putative product analyzed. uapC message accumulation is: (i) weakly induced by 2-thiouric acid; (ii) repressed by ammonium; (iii) dependent on functional uaY and areA regulatory gene products (mediating uric acid induction and nitrogen metabolite repression, respectively); (iv) increased by uapC gain-of-function mutations which specifically, but partially, suppress a leucine to valine mutation in the zinc finger of the protein coded by the areA gene. The putative uapC gene product is a highly hydrophobic protein of 580 amino acids (M(r) = 61,251) including 12-14 putative transmembrane segments. The UapC protein is highly similar (58% identity) to the UapA permease and significantly similar (23-34% identity) to a number of bacterial transporters. Comparisons of the sequences and hydropathy profiles of members of this novel family of transporters yield insights into their structure, functionally important residues, and possible evolutionary relationships.
Chemical Approaches to Control Gene Expression

PubMed Central

Gottesfeld, Joel M.; Turner, James M.; Dervan, Peter B.

2000-01-01

A current goal in molecular medicine is the development of new strategies to interfere with gene expression in living cells in the hope that novel therapies for human disease will result from these efforts. This review focuses on small-molecule or chemical approaches to manipulate gene expression by modulating either transcription of messenger RNA-coding genes or protein translation. The molecules under study include natural products, designed ligands, and compounds identified through functional screens of combinatorial libraries. The cellular targets for these molecules include DNA, messenger RNA, and the protein components of the transcription, RNA processing, and translational machinery. Studies with model systems have shown promise in the inhibition of both cellular and viral gene transcription and mRNA utilization. Moreover, strategies for both repression and activation of gene transcription have been described. These studies offer promise for treatment of diseases of pathogenic (viral, bacterial, etc.) and cellular origin (cancer, genetic diseases, etc.). PMID:11097426
Systematic analysis of human kinase genes: a large number of genes and alternative splicing events result in functional and structural diversity

PubMed Central

Milanesi, Luciano; Petrillo, Mauro; Sepe, Leandra; Boccia, Angelo; D'Agostino, Nunzio; Passamano, Myriam; Di Nardo, Salvatore; Tasco, Gianluca; Casadio, Rita; Paolella, Giovanni

2005-01-01

Background Protein kinases are a well defined family of proteins, characterized by the presence of a common kinase catalytic domain and playing a significant role in many important cellular processes, such as proliferation, maintenance of cell shape, apoptosis. In many members of the family, additional non-kinase domains contribute further specialization, resulting in subcellular localization, protein binding and regulation of activity, among others. About 500 genes encode members of the kinase family in the human genome, and although many of them represent well known genes, a larger number of genes code for proteins of more recent identification, or for unknown proteins identified as kinase only after computational studies. Results A systematic in silico study performed on the human genome led to the identification of 5 genes, on chromosomes 1, 11, 13, 15 and 16 respectively, and 1 pseudogene on chromosome X; some of these genes are reported as kinases by NCBI but are absent in other databases, such as KinBase. Comparative analysis of 483 gene regions and subsequent computational analysis, aimed at identifying unannotated exons, indicates that a large number of kinases may code for alternately spliced forms or be incorrectly annotated. An InterProScan automated analysis was performed to study domain distribution and combination in the various families. At the same time, other structural features were also added to the annotation process, including the putative presence of transmembrane alpha helices, and the cysteine propensity to participate in a disulfide bridge. Conclusion The predicted human kinome was extended by identifying both additional genes and potential splice variants, resulting in a varied panorama where functionality may be searched at the gene and protein level. Structural analysis of kinase protein domains as defined in multiple sources together with transmembrane alpha helices and signal peptide prediction provides hints to function assignment. The results of the human kinome analysis are collected in the KinWeb database, available for browsing and searching over the internet, where all results from the comparative analysis and the gene structure annotation are made available, alongside the domain information. Kinases may be searched by domain combinations and the relative genes may be viewed in a graphic browser at various levels of magnification up to gene organization on the full chromosome set. PMID:16351747
The complete mitochondrial genome of Papilio glaucus and its phylogenetic implications.

PubMed

Shen, Jinhui; Cong, Qian; Grishin, Nick V

2015-09-01

Due to the intriguing morphology, lifecycle, and diversity of butterflies and moths, Lepidoptera are emerging as model organisms for the study of genetics, evolution and speciation. The progress of these studies relies on decoding Lepidoptera genomes, both nuclear and mitochondrial. Here we describe a protocol to obtain mitogenomes from Next Generation Sequencing reads performed for whole-genome sequencing and report the complete mitogenome of Papilio (Pterourus) glaucus. The circular mitogenome is 15,306 bp in length and rich in A and T. It contains 13 protein-coding genes (PCGs), 22 transfer-RNA-coding genes (tRNA), and 2 ribosomal-RNA-coding genes (rRNA), with a gene order typical for mitogenomes of Lepidoptera. We performed phylogenetic analyses based on PCG and RNA-coding genes or protein sequences using Bayesian Inference and Maximum Likelihood methods. The phylogenetic trees consistently show that among species with available mitogenomes Papilio glaucus is the closest to Papilio (Agehana) maraho from Asia.
Picornaviruses and nuclear functions: targeting a cellular compartment distinct from the replication site of a positive-strand RNA virus

PubMed Central

Flather, Dylan; Semler, Bert L.

2015-01-01

The compartmentalization of DNA replication and gene transcription in the nucleus and protein production in the cytoplasm is a defining feature of eukaryotic cells. The nucleus functions to maintain the integrity of the nuclear genome of the cell and to control gene expression based on intracellular and environmental signals received through the cytoplasm. The spatial separation of the major processes that lead to the expression of protein-coding genes establishes the necessity of a transport network to allow biomolecules to translocate between these two regions of the cell. The nucleocytoplasmic transport network is therefore essential for regulating normal cellular functioning. The Picornaviridae virus family is one of many viral families that disrupt the nucleocytoplasmic trafficking of cells to promote viral replication. Picornaviruses contain positive-sense, single-stranded RNA genomes and replicate in the cytoplasm of infected cells. As a result of the limited coding capacity of these viruses, cellular proteins are required by these intracellular parasites for both translation and genomic RNA replication. Being of messenger RNA polarity, a picornavirus genome can immediately be translated upon entering the cell cytoplasm. However, the replication of viral RNA requires the activity of RNA-binding proteins, many of which function in host gene expression, and are consequently localized to the nucleus. As a result, picornaviruses disrupt nucleocytoplasmic trafficking to exploit protein functions normally localized to a different cellular compartment from which they translate their genome to facilitate efficient replication. Furthermore, picornavirus proteins are also known to enter the nucleus of infected cells to limit host-cell transcription and down-regulate innate antiviral responses. The interactions of picornavirus proteins and host-cell nuclei are extensive, required for a productive infection, and are the focus of this review. PMID:26150805
The complete mitochondrial genome of Setaria digitata (Nematoda: Filarioidea): Mitochondrial gene content, arrangement and composition compared with other nematodes.

PubMed

Yatawara, Lalani; Wickramasinghe, Susiji; Rajapakse, R P V J; Agatsuma, Takeshi

2010-09-01

In the present study, we determined the complete mitochondrial (mt) genome sequence (13,839 bp) of the parasitic nematode Setaria digitata and compared its structure and organization with those of Onchocerca volvulus, Dirofilaria immitis and Brugia malayi. The mt genome of S. digitata is slightly larger than the mt genomes of other filarial nematodes. The S. digitata mt genome contains 36 genes (12 protein-coding genes, 22 transfer RNAs and 2 ribosomal RNAs) that are typically found in metazoans. This genome has a high A+T content (75.1%) and a low G+C content (24.9%). The mt gene order for S. digitata is the same as those for O. volvulus, D. immitis and B. malayi but is distinctly different from that of the other nematodes compared. The start codons inferred in the mt genome of S. digitata are TTT, ATT, TTG, ATG, GTT and ATA. Interestingly, the initiation codon TTT is unique to the S. digitata mt genome, and four protein-coding genes use it as a translation initiation codon. Five protein-coding genes use TAG as a stop codon, whereas three genes use TAA and four genes use T as a termination codon. Out of 64 possible codons, only 57 are used for mitochondrial protein-coding genes of S. digitata. T-rich codons such as TTT (18.9%), GTT (7.9%), TTG (7.8%), TAT (7%), ATT (5.7%), TCT (4.8%) and TTA (4.1%) are used more frequently. This pattern of codon usage reflects the strong bias for T in the mt genome of S. digitata. In conclusion, the present investigation provides new molecular data for future studies of the comparative mitochondrial genomics and systematics of parasitic nematodes of socio-economic importance.
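The A+T content and codon-usage percentages reported in the abstract above are simple frequency calculations. The following is a minimal sketch of how such figures could be computed, not the authors' pipeline; the function names and the toy sequence are invented for illustration and are not S. digitata data.

```python
from collections import Counter


def at_content(seq: str) -> float:
    """Return the percentage of A and T bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


def codon_usage(cds: str) -> dict[str, float]:
    """Return each codon's share (%) of all complete codons in a coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    return {codon: 100.0 * n / len(codons) for codon, n in counts.items()}


# Hypothetical toy coding sequence: TTT ATT TTG GTT TTT TAA
toy_cds = "TTTATTTTGGTTTTTTAA"
print(f"A+T content: {at_content(toy_cds):.1f}%")
for codon, pct in sorted(codon_usage(toy_cds).items(), key=lambda kv: -kv[1]):
    print(f"{codon}: {pct:.1f}%")
```

Applied to the concatenated protein-coding genes of a real mt genome, the same counting would yield a table comparable to the codon percentages quoted in the abstract.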
A novel TBP-TAF complex on RNA polymerase II-transcribed snRNA genes.

PubMed

Zaborowska, Justyna; Taylor, Alice; Roeder, Robert G; Murphy, Shona

2012-01-01

Initiation of transcription of most human genes transcribed by RNA polymerase II (RNAP II) requires the formation of a preinitiation complex comprising TFIIA, B, D, E, F, H and RNAP II. The general transcription factor TFIID is composed of the TATA-binding protein and up to 13 TBP-associated factors. During transcription of snRNA genes, RNAP II does not appear to make the transition to long-range productive elongation, as happens during transcription of protein-coding genes. In addition, recognition of the snRNA gene-type specific 3' box RNA processing element requires initiation from an snRNA gene promoter. These characteristics may, at least in part, be driven by factors recruited to the promoter. For example, differences in the complement of TAFs might result in differential recruitment of elongation and RNA processing factors. As precedent, it already has been shown that the promoters of some protein-coding genes do not recruit all the TAFs found in TFIID. Although TAF5 has been shown to be associated with RNAP II-transcribed snRNA genes, the full complement of TAFs associated with these genes has remained unclear. Here we show, using a ChIP and siRNA-mediated approach, that the TBP/TAF complex on snRNA genes differs from that found on protein-coding genes. Interestingly, the largest TAF, TAF1, and the core TAFs, TAF10 and TAF4, are not detected on snRNA genes. We propose that this snRNA gene-specific TAF subset plays a key role in gene type-specific control of expression.
Improvement of genome assembly completeness and identification of novel full-length protein-coding genes by RNA-seq in the giant panda genome.

PubMed

Chen, Meili; Hu, Yibo; Liu, Jingxing; Wu, Qi; Zhang, Chenglin; Yu, Jun; Xiao, Jingfa; Wei, Fuwen; Wu, Jiayan

2015-12-11

High-quality and complete gene models are the basis of whole genome analyses. The giant panda (Ailuropoda melanoleuca) genome was the first genome sequenced on the basis of solely short reads, but the genome annotation had lacked the support of transcriptomic evidence. In this study, we applied RNA-seq to globally improve the genome assembly completeness and to detect novel expressed transcripts in 12 tissues from giant pandas, by using a transcriptome reconstruction strategy that combined reference-based and de novo methods. Several aspects of genome assembly completeness in the transcribed regions were effectively improved by the de novo assembled transcripts, including genome scaffolding, the detection of small-size assembly errors, the extension of scaffold/contig boundaries, and gap closure. Through expression and homology validation, we detected three groups of novel full-length protein-coding genes. A total of 12.62% of the novel protein-coding genes were validated by proteomic data. GO annotation analysis showed that some of the novel protein-coding genes were involved in pigmentation, anatomical structure formation and reproduction, which might be related to the development and evolution of the black-white pelage, pseudo-thumb and delayed embryonic implantation of giant pandas. The updated genome annotation will help further giant panda studies from both structural and functional perspectives.
MHC class I-associated peptides derive from selective regions of the human genome.

PubMed

Pearson, Hillary; Daouda, Tariq; Granados, Diana Paola; Durette, Chantal; Bonneil, Eric; Courcelles, Mathieu; Rodenbrock, Anja; Laverdure, Jean-Philippe; Côté, Caroline; Mader, Sylvie; Lemieux, Sébastien; Thibault, Pierre; Perreault, Claude

2016-12-01

MHC class I-associated peptides (MAPs) define the immune self for CD8+ T lymphocytes and are key targets of cancer immunosurveillance. Here, the goals of our work were to determine whether the entire set of protein-coding genes could generate MAPs and whether specific features influence the ability of discrete genes to generate MAPs. Using proteogenomics, we have identified 25,270 MAPs isolated from the B lymphocytes of 18 individuals who collectively expressed 27 high-frequency HLA-A,B allotypes. The entire MAP repertoire presented by these 27 allotypes covered only 10% of the exomic sequences expressed in B lymphocytes. Indeed, 41% of expressed protein-coding genes generated no MAPs, while 59% of genes generated up to 64 MAPs, often derived from adjacent regions and presented by different allotypes. We next identified several features of transcripts and proteins associated with efficient MAP production. From these data, we built a logistic regression model that predicts with good accuracy whether a gene generates MAPs. Our results show preferential selection of MAPs from a limited repertoire of proteins with distinctive features. The notion that the MHC class I immunopeptidome presents only a small fraction of the protein-coding genome for monitoring by the immune system has profound implications in autoimmunity and cancer immunology.
MHC class I–associated peptides derive from selective regions of the human genome

PubMed Central

Pearson, Hillary; Granados, Diana Paola; Durette, Chantal; Bonneil, Eric; Courcelles, Mathieu; Rodenbrock, Anja; Laverdure, Jean-Philippe; Côté, Caroline; Thibault, Pierre

2016-01-01

MHC class I–associated peptides (MAPs) define the immune self for CD8+ T lymphocytes and are key targets of cancer immunosurveillance. Here, the goals of our work were to determine whether the entire set of protein-coding genes could generate MAPs and whether specific features influence the ability of discrete genes to generate MAPs. Using proteogenomics, we have identified 25,270 MAPs isolated from the B lymphocytes of 18 individuals who collectively expressed 27 high-frequency HLA-A,B allotypes. The entire MAP repertoire presented by these 27 allotypes covered only 10% of the exomic sequences expressed in B lymphocytes. Indeed, 41% of expressed protein-coding genes generated no MAPs, while 59% of genes generated up to 64 MAPs, often derived from adjacent regions and presented by different allotypes. We next identified several features of transcripts and proteins associated with efficient MAP production. From these data, we built a logistic regression model that predicts with good accuracy whether a gene generates MAPs. Our results show preferential selection of MAPs from a limited repertoire of proteins with distinctive features. The notion that the MHC class I immunopeptidome presents only a small fraction of the protein-coding genome for monitoring by the immune system has profound implications in autoimmunity and cancer immunology. PMID:27841757

Characterization of the complete mitochondrial genome of the hybrid Epinephelus moara♀ × Epinephelus lanceolatus♂, and phylogenetic analysis in subfamily epinephelinae

NASA Astrophysics Data System (ADS)

Gao, Fengtao; Wei, Min; Zhu, Ying; Guo, Hua; Chen, Songlin; Yang, Guanpin

2017-06-01

This study presents the complete mitochondrial genome of the hybrid Epinephelus moara♀ × Epinephelus lanceolatus♂. The genome is 16,886 bp in length, and contains 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes, a light-strand replication origin and a control region. Additionally, phylogenetic analysis based on the nucleotide sequences of 13 conserved protein-coding genes using the maximum likelihood method indicated that the mitochondrial genome is maternally inherited. This study presents genomic data for studying phylogenetic relationships and breeding of hybrid Epinephelinae.
The primary transcriptome of the marine diazotroph Trichodesmium erythraeum IMS101

NASA Astrophysics Data System (ADS)

Pfreundt, Ulrike; Kopf, Matthias; Belkin, Natalia; Berman-Frank, Ilana; Hess, Wolfgang R.

2014-08-01

Blooms of the dinitrogen-fixing marine cyanobacterium Trichodesmium considerably contribute to new nitrogen inputs into tropical oceans. Intriguingly, only 60% of the Trichodesmium erythraeum IMS101 genome sequence codes for protein, compared with ~85% in other sequenced cyanobacterial genomes. The extensive non-coding genome fraction suggests space for an unusually high number of unidentified, potentially regulatory non-protein-coding RNAs (ncRNAs). To identify the transcribed fraction of the genome, here we present a genome-wide map of transcriptional start sites (TSS) at single nucleotide resolution, revealing the activity of 6,080 promoters. We demonstrate that T. erythraeum has the highest number of actively splicing group II introns and the highest percentage of TSS yielding ncRNAs of any bacterium examined to date. We identified a highly transcribed retroelement that serves as template repeat for the targeted mutation of at least 12 different genes by mutagenic homing. Our findings explain the non-coding portion of the T. erythraeum genome by the transcription of an unusually high number of non-coding transcripts in addition to the known high incidence of transposable elements. We conclude that riboregulation and RNA maturation-dependent processes constitute a major part of the Trichodesmium regulatory apparatus.
Expression and characterization of truncated human heme oxygenase (hHO-1) and a fusion protein of hHO-1 with human cytochrome P450 reductase.

PubMed

Wilks, A; Black, S M; Miller, W L; Ortiz de Montellano, P R

1995-04-04

A human heme oxygenase (hHO-1) gene without the sequence coding for the last 23 amino acids has been expressed in Escherichia coli behind the pho A promoter. The truncated enzyme is obtained in high yields as a soluble, catalytically-active protein, making it available for the first time for detailed mechanistic studies. The purified, truncated hHO-1/heme complex is spectroscopically indistinguishable from that of the rat enzyme and converts heme to biliverdin when reconstituted with rat liver cytochrome P450 reductase. A self-sufficient heme oxygenase system has been obtained by fusing the truncated hHO-1 gene to the gene for human cytochrome P450 reductase without the sequence coding for the 20 amino acid membrane binding domain. Expression of the fusion protein in pCWori+ yields a protein that only requires NADPH for catalytic turnover. The failure of exogenous cytochrome P450 reductase to stimulate turnover and the insensitivity of the catalytic rate toward changes in ionic strength establish that electrons are transferred intramolecularly between the reductase and heme oxygenase domains of the fusion protein. The Vmax for the fusion protein is 2.5 times higher than that for the reconstituted system. Therefore, either the covalent tether does not interfere with normal docking and electron transfer between the flavin and heme domains or alternative but equally efficient electron transfer pathways are available that do not require specific docking.
[Long non-coding RNAs in the pathophysiology of atherosclerosis].

PubMed

Novak, Jan; Vašků, Julie Bienertová; Souček, Miroslav

2018-01-01

The human genome contains about 22,000 protein-coding genes that are transcribed into an even larger number of messenger RNAs (mRNAs). Interestingly, the results of the ENCODE project from 2012 show that, although up to 90% of our genome is actively transcribed, protein-coding mRNAs make up only 2-3% of the total transcribed RNA. The remaining RNA transcripts are not translated into proteins and are therefore referred to as "non-coding RNAs". Non-coding RNA was formerly considered "the dark matter of the genome", or "junk", whose genes had accumulated in our DNA during the course of evolution. Today we know that non-coding RNAs fulfil a variety of regulatory functions in the body: they intervene in epigenetic processes from chromatin remodelling to histone methylation, in the transcription process itself, and even in post-transcriptional processes. Long non-coding RNAs (lncRNAs) are one class of non-coding RNAs, defined by a length of more than 200 nucleotides (non-coding RNAs shorter than 200 nucleotides are called small non-coding RNAs). lncRNAs represent a large and widely varied group of molecules with diverse regulatory functions. They can be identified in virtually all cell types and tissues, and even in the extracellular space, including blood plasma. Their levels change during organogenesis, they are tissue-specific, and their levels also change with the development of various diseases, including atherosclerosis. This review article presents lncRNAs in general and then focuses on specific representatives in relation to atherosclerosis (i.e. we describe lncRNA involvement in the biology of endothelial cells, vascular smooth muscle cells and immune cells); we further describe the possible clinical potential of lncRNAs in the diagnostics or therapy of atherosclerosis and its clinical manifestations. Key words: atherosclerosis - lincRNA - lncRNA - MALAT - MIAT.
Light-Regulated Transcription of Genes Encoding Peridinin Chlorophyll a Proteins and the Major Intrinsic Light-Harvesting Complex Proteins in the Dinoflagellate Amphidinium carterae Hulburt (Dinophycae)

PubMed Central

ten Lohuis, Michael R.; Miller, David J.

1998-01-01

In the dinoflagellate Amphidinium carterae, photoadaptation involves changes in the transcription of genes encoding both of the major classes of light-harvesting proteins, the peridinin chlorophyll a proteins (PCPs) and the major a/c-containing intrinsic light-harvesting proteins (LHCs). PCP and LHC transcript levels were up to 86- and 6-fold higher, respectively, under low-light conditions than in cells grown at high illumination. These increases in transcript abundance were accompanied by decreases in the extent of methylation of CpG and CpNpG motifs within or near PCP- and LHC-coding regions. Cytosine methylation levels in A. carterae are therefore nonstatic and may vary with environmental conditions in a manner suggestive of involvement in the regulation of gene expression. However, chemically induced undermethylation was insufficient to activate transcription, because treatment with two methylation inhibitors had no effect on PCP mRNA or protein levels. Regulation of gene activity through changes in DNA methylation has traditionally been assumed to be restricted to higher eukaryotes (deuterostomes and green plants); however, the atypically large genomes of dinoflagellates may have generated the requirement for systems of this type in a relatively “primitive” organism. Dinoflagellates may therefore provide a unique perspective on the evolution of eukaryotic DNA-methylation systems. PMID:9576788
Dominant genetics using a yeast genomic library under the control of a strong inducible promoter.

PubMed

Ramer, S W; Elledge, S J; Davis, R W

1992-12-01

In Saccharomyces cerevisiae, numerous genes have been identified by selection from high-copy-number libraries based on "multicopy suppression" or other phenotypic consequences of overexpression. Although fruitful, this approach suffers from two major drawbacks. First, high copy number alone may not permit high-level expression of tightly regulated genes. Conversely, other genes expressed in proportion to dosage cannot be identified if their products are toxic at elevated levels. This work reports construction of a genomic DNA expression library for S. cerevisiae that circumvents both limitations by fusing randomly sheared genomic DNA to the strong, inducible yeast GAL1 promoter, which can be regulated by carbon source. The library obtained contains 5 × 10^7 independent recombinants, representing a breakpoint at every base in the yeast genome. This library was used to examine aberrant gene expression in S. cerevisiae. A screen for dominant activators of yeast mating response identified eight genes that activate the pathway in the absence of exogenous mating pheromone, including one previously unidentified gene. One activator was a truncated STE11 gene lacking approximately 1000 base pairs of amino-terminal coding sequence. In two different clones, the same GAL1 promoter-proximal ATG is in-frame with the coding sequence of STE11, suggesting that internal initiation of translation there results in production of a biologically active, truncated STE11 protein. Thus this library allows isolation based on dominant phenotypes of genes that might have been difficult or impossible to isolate from high-copy-number libraries.
Cloning of hydrogenase genes and fine structure analysis of an operon essential for H2 metabolism in Escherichia coli

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sankar, P.; Lee, J.H.; Shanmugam, K.T.

1985-04-01

Escherichia coli has two unlinked genes that code for hydrogenase synthesis and activity. The DNA fragments containing the two genes (hydA and hydB) were cloned into a plasmid vector, pBR322. The plasmids containing the hyd genes (pSE-290 and pSE-111 carrying the hydA and hydB genes, respectively) were used to genetically map a total of 51 mutant strains with defects in hydrogenase activity. A total of 37 mutants carried a mutation in the hydB gene, whereas the remaining 14 hyd mutants carried a mutation in the hydA gene. This complementation analysis also established the presence of two new genes, so far unidentified, one coding for formate dehydrogenase-2 (fdv) and another producing an electron transport protein (fhl) coupling formate dehydrogenase-2 to hydrogenase. Three of the four genes, hydB, fhl, and fdv, may constitute a single operon, and all three genes are carried by a 5.6-kilobase-pair chromosomal DNA insert in plasmid pSE-128. Plasmids carrying a part of this 5.6-kilobase-pair DNA (pSE-130) or fragments derived from this DNA in different orientations (pSE-126 and pSE-129) inhibited the production of active formate hydrogenlyase. This inhibition occurred even in a prototrophic E. coli, strain K-10, but only during an early induction period. These results, based on complementation analysis with cloned DNA fragments, show that both hydA and hydB genes are essential for the production of active hydrogenase. For the expression of active formate hydrogenlyase, two other gene products, fhl and fdv, are also needed. All four genes map between 58 and 59 min in the E. coli chromosome.
Evaluation of 10 genes encoding cardiac proteins in Doberman Pinschers with dilated cardiomyopathy.

PubMed

O'Sullivan, M Lynne; O'Grady, Michael R; Pyle, W Glen; Dawson, John F

2011-07-01

The objective of this study was to identify a causative mutation for dilated cardiomyopathy (DCM) in Doberman Pinschers by sequencing the coding regions of 10 cardiac genes known to be associated with familial DCM in humans. Samples came from 5 Doberman Pinschers with DCM and congestive heart failure and 5 control mixed-breed dogs that were euthanized or died. RNA was extracted from frozen ventricular myocardial samples from each dog, and first-strand cDNA was synthesized via reverse transcription, followed by PCR amplification with gene-specific primers. Ten cardiac genes were analyzed: cardiac actin, α-actinin, α-tropomyosin, β-myosin heavy chain, metavinculin, muscle LIM protein, myosin-binding protein C, tafazzin, titin-cap (telethonin), and troponin T. Sequences for DCM-affected and control dogs and the published canine genome were compared. None of the coding sequences yielded a common causative mutation among all Doberman Pinscher samples. However, 3 variants were identified in the α-actinin gene in the DCM-affected Doberman Pinschers. One of these variants, identified in 2 of the 5 Doberman Pinschers, resulted in an amino acid change in the rod-forming triple coiled-coil domain. Mutations in the coding regions of several genes associated with DCM in humans did not appear to consistently account for DCM in Doberman Pinschers. However, an α-actinin variant was detected in some Doberman Pinschers that may contribute to the development of DCM given its potential effect on the structure of this protein. Investigation of additional candidate gene coding and noncoding regions and further evaluation of the role of α-actinin in development of DCM in Doberman Pinschers are warranted.
Mitochondrial genome of Pteronotus personatus (Chiroptera: Mormoopidae): comparison with selected bats and phylogenetic considerations.

PubMed

López-Wilchis, Ricardo; Del Río-Portilla, Miguel Ángel; Guevara-Chumacero, Luis Manuel

2017-02-01

We described the complete mitochondrial genome (mitogenome) of Wagner's mustached bat, Pteronotus personatus, a species belonging to the family Mormoopidae, and compared it with other published mitogenomes of bats (Chiroptera). The mitogenome of P. personatus was 16,570 bp long and contained a typically conserved structure including 13 protein-coding genes, 22 transfer RNA genes, two ribosomal RNA genes, and one control region (D-loop). Most of the genes were encoded on the H-strand, except for eight tRNA genes and the ND6 gene. The order of protein-coding and rRNA genes was highly conserved in all mitogenomes. All protein-coding genes started with an ATG codon, except for ND2, ND3, and ND5, which initiated with ATA, and terminated with the typical stop codon TAA/TAG or the codon AGA. Phylogenetic trees constructed using Maximum Parsimony, Maximum Likelihood, and Bayesian inference methods showed an identical topology and indicated the monophyly of different families of bats (Mormoopidae, Phyllostomidae, Vespertilionidae, Rhinolophidae, and Pteropodidae) and the existence of two major clades corresponding to the suborders Yangochiroptera and Yinpterochiroptera. The mitogenome sequence provided here will be useful for further phylogenetic analyses and population genetic studies in mormoopid bats.
Cloning, characterization and sequence comparison of the gene coding for IMP dehydrogenase from Pyrococcus furiosus.

PubMed

Collart, F R; Osipiuk, J; Trent, J; Olsen, G J; Huberman, E

1996-10-03

We have cloned and characterized the gene encoding inosine monophosphate dehydrogenase (IMPDH) from Pyrococcus furiosus (Pf), a hyperthermophilic archaeon. Sequence analysis of the Pf gene indicated an open reading frame specifying a protein of 485 amino acids (aa) with a calculated M(r) of 52900. Canonical Archaea promoter elements, Box A and Box B, are located -49 and -17 nucleotides (nt), respectively, upstream of the putative start codon. The sequence of the putative active-site region conforms to the IMPDH signature motif and contains a putative active-site cysteine. Phylogenetic relationships derived by using all available IMPDH sequences are consistent with trees developed for other molecules; they do not precisely resolve the history of Pf IMPDH but indicate a close similarity to bacterial IMPDH proteins. The phylogenetic analysis indicates that a gene duplication occurred prior to the division between rodents and humans, accounting for the Type I and II isoforms identified in mice and humans.
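As a quick plausibility check on the figures in the abstract above (485 amino acids, calculated M(r) of 52900), the mean mass per residue and the implied open-reading-frame length can be worked out directly. This is only an illustrative sketch with hypothetical function names; the assumption that the ORF ends in a single three-nucleotide stop codon is ours, not stated in the abstract.

```python
def mean_residue_mass(molecular_mass: float, n_residues: int) -> float:
    """Average mass per amino-acid residue (same units as molecular_mass)."""
    return molecular_mass / n_residues


def orf_length_nt(n_residues: int, include_stop: bool = True) -> int:
    """Nucleotides needed to encode n_residues, optionally plus one stop codon."""
    return 3 * n_residues + (3 if include_stop else 0)


# Values quoted in the abstract: 485 aa, calculated M(r) 52900.
print(f"mean residue mass: {mean_residue_mass(52900, 485):.1f}")
print(f"ORF length incl. stop codon: {orf_length_nt(485)} nt")
```

A mean residue mass near 110 is typical for proteins, so the reported values are mutually consistent.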
Identification of nucleolus-associated chromatin domains reveals the role of the nucleolus in the 3D organisation of the A. thaliana genome

PubMed Central

Pontvianne, Frédéric; Carpentier, Marie-Christine; Durut, Nathalie; Pavlištová, Veronika; Jaške, Karin; Schořová, Šárka; Parrinello, Hugues; Rohmer, Marine; Pikaard, Craig S; Fojtová, Miloslava; Fajkus, Jiří; Saez-Vasquez, Julio

2017-01-01

The nucleolus is the site of ribosomal RNA (rRNA) gene transcription, rRNA processing and ribosome biogenesis. However, the nucleolus also plays additional roles in the cell. We isolated nucleoli by Fluorescence Activated Cell Sorting (FACS) and identified Nucleolus-Associated Chromatin Domains (NADs) by deep sequencing, comparing wild-type plants and null mutants for the nucleolar protein, NUCLEOLIN 1 (NUC1). NADs are primarily genomic regions with heterochromatic signatures and include transposable elements (TEs), sub-telomeric regions and mostly inactive protein-coding genes. However, NADs also include active ribosomal RNA genes, and the entire short arm of chromosome 4 adjacent to them. In nuc1 null mutants, which alter rRNA gene expression and overall nucleolar structure, NADs are altered, telomere association with the nucleolus is decreased and telomeres become shorter. Collectively, our studies reveal roles for NUC1 and the nucleolus in the spatial organization of chromosomes as well as telomere maintenance. PMID:27477271
Analysis of Antisense Expression by Whole Genome Tiling Microarrays and siRNAs Suggests Mis-Annotation of Arabidopsis Orphan Protein-Coding Genes

PubMed Central

Richardson, Casey R.; Luo, Qing-Jun; Gontcharova, Viktoria; Jiang, Ying-Wen; Samanta, Manoj; Youn, Eunseog; Rock, Christopher D.

2010-01-01

Background: MicroRNAs (miRNAs) and trans-acting small-interfering RNAs (tasi-RNAs) are small (20–22 nt long) RNAs (smRNAs) generated from hairpin secondary structures or antisense transcripts, respectively, that regulate gene expression by Watson-Crick pairing to a target mRNA and altering expression by mechanisms related to RNA interference. The high sequence homology of plant miRNAs to their targets has been the mainstay of miRNA prediction algorithms, which are limited in their predictive power for other kingdoms because miRNA complementarity is less conserved yet transitive processes (production of antisense smRNAs) are active in eukaryotes. We hypothesize that antisense transcription and associated smRNAs are biomarkers which can be computationally modeled for gene discovery. Principal Findings: We explored rice (Oryza sativa) sense and antisense gene expression in publicly available whole genome tiling array transcriptome data and sequenced smRNA libraries (as well as C. elegans) and found evidence of transitivity of MIRNA genes similar to that found in Arabidopsis. Statistical analysis of antisense transcript abundances, presence of antisense ESTs, and association with smRNAs suggests several hundred Arabidopsis ‘orphan’ hypothetical genes are non-coding RNAs. Consistent with this hypothesis, we found novel Arabidopsis homologues of some MIRNA genes on the antisense strand of previously annotated protein-coding genes. A Support Vector Machine (SVM) was applied using thermodynamic energy of binding plus novel expression features of sense/antisense transcription topology and siRNA abundances to build a prediction model of miRNA targets. The SVM when trained on targets could predict the “ancient” (deeply conserved) class of validated Arabidopsis MIRNA genes with an accuracy of 84%, and 76% for “new” rapidly-evolving MIRNA genes. Conclusions: Antisense and smRNA expression features and computational methods may identify novel MIRNA genes and other non-coding RNAs in plants and potentially other kingdoms, which can provide insight into antisense transcription, miRNA evolution, and post-transcriptional gene regulation. PMID:20520764
Non-coding RNAs—Novel targets in neurotoxicity

PubMed Central

Tal, Tamara L.; Tanguay, Robert L.

2012-01-01

Over the past ten years non-coding RNAs (ncRNAs) have emerged as pivotal players in fundamental physiological and cellular processes and have been increasingly implicated in cancer, immune disorders, and cardiovascular, neurodegenerative, and metabolic diseases. MicroRNAs (miRNAs) represent a class of ncRNA molecules that function as negative regulators of post-transcriptional gene expression. miRNAs are predicted to regulate 60% of all human protein-coding genes and as such, play key roles in cellular and developmental processes, human health, and disease. Relative to counterparts that lack binding sites for miRNAs, genes encoding proteins that are post-transcriptionally regulated by miRNAs are twice as likely to be sensitive to environmental chemical exposure. Not surprisingly, miRNAs have been recognized as targets or effectors of nervous system, developmental, hepatic, and carcinogenic toxicants, and have been identified as putative regulators of phase I xenobiotic-metabolizing enzymes. In this review, we give an overview of the types of ncRNAs and highlight their roles in neurodevelopment, neurological disease, activity-dependent signaling, and drug metabolism. We then delve into specific examples that illustrate their importance as mediators, effectors, or adaptive agents of neurotoxicants or neuroactive pharmaceutical compounds. Finally, we identify a number of outstanding questions regarding ncRNAs and neurotoxicity. PMID:22394481
Co-LncRNA: investigating the lncRNA combinatorial effects in GO annotations and KEGG pathways based on human RNA-Seq data.

PubMed

Zhao, Zheng; Bai, Jing; Wu, Aiwei; Wang, Yuan; Zhang, Jinwen; Wang, Zishan; Li, Yongsheng; Xu, Juan; Li, Xia

2015-01-01

Long non-coding RNAs (lncRNAs) are emerging as key regulators of diverse biological processes and diseases. However, the combinatorial effects of these molecules in a specific biological function are poorly understood. Identifying co-expressed protein-coding genes of lncRNAs would provide ample insight into lncRNA functions. To facilitate such an effort, we have developed Co-LncRNA, which is a web-based computational tool that allows users to identify GO annotations and KEGG pathways that may be affected by co-expressed protein-coding genes of a single or multiple lncRNAs. LncRNA co-expressed protein-coding genes were first identified in publicly available human RNA-Seq datasets, including 241 datasets across 6560 total individuals representing 28 tissue types/cell lines. Then, the lncRNA combinatorial effects in given GO annotations or KEGG pathways are taken into account by the simultaneous analysis of multiple lncRNAs in user-selected individual or multiple datasets, which is realized by enrichment analysis. In addition, this software provides a graphical overview of pathways that are modulated by lncRNAs, as well as a specific tool to display the relevant networks between lncRNAs and their co-expressed protein-coding genes. Co-LncRNA also supports users in uploading their own lncRNA and protein-coding gene expression profiles to investigate the lncRNA combinatorial effects. It will be continuously updated with more human RNA-Seq datasets on an annual basis. Taken together, Co-LncRNA provides a web-based application for investigating lncRNA combinatorial effects, which could shed light on their biological roles and could be a valuable resource for this community. Database URL: http://www.bio-bigdata.com/Co-LncRNA/.
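The abstract above says the combinatorial analysis is "realized by enrichment analysis" but does not name the statistical test. One common choice for GO/KEGG enrichment is the one-sided hypergeometric test; the sketch below assumes that test purely for illustration and should not be read as the method Co-LncRNA actually uses. The function name and the example numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe: int, in_term: int, selected: int, overlap: int) -> float:
    """One-sided hypergeometric p-value P(X >= overlap).

    universe: total number of background genes
    in_term:  background genes annotated to the GO term / KEGG pathway
    selected: genes in the query set (e.g. co-expressed protein-coding genes)
    overlap:  query genes that carry the annotation
    """
    max_k = min(in_term, selected)
    total = comb(universe, selected)
    tail = sum(
        comb(in_term, k) * comb(universe - in_term, selected - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total


# Hypothetical toy numbers: 10 background genes, 4 annotated, 3 selected, 2 overlapping.
print(hypergeom_enrichment_p(universe=10, in_term=4, selected=3, overlap=2))
```

In practice such p-values would be computed for every term and corrected for multiple testing; that step is omitted here.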
Decoding the disease-associated proteins encoded in the human chromosome 4.

PubMed

Chen, Lien-Chin; Liu, Mei-Ying; Hsiao, Yung-Chin; Choong, Wai-Kok; Wu, Hsin-Yi; Hsu, Wen-Lian; Liao, Pao-Chi; Sung, Ting-Yi; Tsai, Shih-Feng; Yu, Jau-Song; Chen, Yu-Ju

2013-01-04

Chromosome 4 is the fourth largest chromosome, containing approximately 191 megabases (~6.4% of the human genome) with 757 protein-coding genes. A number of marker genes for many diseases have been found in this chromosome, including genetic diseases (e.g., hepatocellular carcinoma) and biomedical research (cardiac system, aging, metabolic disorders, immune system, cancer and stem cell) related genes (e.g., oncogenes, growth factors). As a pilot study for the chromosome 4-centric human proteome project (Chr 4-HPP), we present here a systematic analysis of the disease association, protein isoforms, coding single nucleotide polymorphisms of these 757 protein-coding genes and their experimental evidence at the protein level. We also describe how the findings from the chromosome 4 project might be used to drive the biomarker discovery and validation study in disease-oriented projects, using the examples of secretomic and membrane proteomic approaches in cancer research. By integrating with cancer cell secretomes and several other existing databases in the public domain, we identified 141 chromosome 4-encoded proteins as cancer cell-secretable/shedable proteins. Additionally, we also identified 54 chromosome 4-encoded proteins that have been classified as cancer-associated proteins with successful selected or multiple reaction monitoring (SRM/MRM) assays developed. From literature annotation and topology analysis, 271 proteins were recognized as membrane proteins while 27.9% of the 757 proteins do not have any experimental evidence at the protein-level. In summary, the analysis revealed that the chromosome 4 is a rich resource for cancer-associated proteins for biomarker verification projects and for drug target discovery projects.
Functional Expression of Two Neuronal Nicotinic Acetylcholine Receptors from cDNA Clones Identifies a Gene Family

NASA Astrophysics Data System (ADS)

Boulter, Jim; Connolly, John; Deneris, Evan; Goldman, Dan; Heinemann, Steven; Patrick, Jim

1987-11-01

A family of genes coding for proteins homologous to the α subunit of the muscle nicotinic acetylcholine receptor has been identified in the rat genome. These genes are transcribed in the central and peripheral nervous systems in areas known to contain functional nicotinic receptors. In this paper, we demonstrate that three of these genes, which we call alpha3, alpha4, and beta2, encode proteins that form functional nicotinic acetylcholine receptors when expressed in Xenopus oocytes. Oocytes expressing either alpha3 or alpha4 protein in combination with the beta2 protein produced a strong response to acetylcholine. Oocytes expressing only the alpha4 protein gave a weak response to acetylcholine. These receptors are activated by acetylcholine and nicotine and are blocked by Bungarus toxin 3.1. They are not blocked by α-bungarotoxin, which blocks the muscle nicotinic acetylcholine receptor. Thus, the receptors formed by the alpha3, alpha4, and beta2 subunits are pharmacologically similar to the ganglionic-type neuronal nicotinic acetylcholine receptor. These results indicate that the alpha3, alpha4, and beta2 genes encode functional nicotinic acetylcholine receptor subunits that are expressed in the brain and peripheral nervous system.
The compositional transition of vertebrate genomes: an analysis of the secondary structure of the proteins encoded by human genes.

PubMed

D'Onofrio, Giuseppe; Ghosh, Tapash Chandra

2005-01-17

Fluctuations and increments of both C(3) and G(3) levels along the human coding sequences were investigated comparing two sets of Xenopus/human orthologous genes. The first set of genes shows minor differences of the GC(3) levels, the second shows considerable increments of the GC(3) levels in the human genes. In both data sets, the fluctuations of C(3) and G(3) levels along the coding sequences correlated with the secondary structures of the encoded proteins. The human genes that underwent the compositional transition showed a different increment of the C(3) and G(3) levels within and among the structural units of the proteins. The relative synonymous codon usage (RSCU) of several amino acids were also affected during the compositional transition, showing that there exists a correlation between RSCU and protein secondary structures in human genes. The importance of natural selection for the formation of isochore organization of the human genome has been discussed on the basis of these results.
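The quantities named in the abstract above can be illustrated with a short sketch. It computes GC(3), the G+C fraction at third codon positions, and RSCU, a codon's observed count divided by the mean count of the synonymous codons for the same amino acid. The codon table is a deliberately tiny, hypothetical subset of the standard genetic code, and the input sequence is invented; neither comes from the study.

```python
from collections import Counter

# Hypothetical toy subset of the standard genetic code (two amino acids only).
SYNONYMOUS = {
    "Phe": ["TTT", "TTC"],
    "Gly": ["GGT", "GGC", "GGA", "GGG"],
}

def gc3(cds):
    """Fraction of G or C at third codon positions of an in-frame sequence."""
    thirds = cds.upper()[2::3]
    return sum(base in "GC" for base in thirds) / len(thirds)

def rscu(cds):
    """RSCU per codon for the amino acids listed in SYNONYMOUS."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    values = {}
    for family in SYNONYMOUS.values():
        total = sum(counts[codon] for codon in family)
        if total == 0:
            continue  # amino acid absent: RSCU undefined
        expected = total / len(family)  # count under equal codon usage
        for codon in family:
            values[codon] = counts[codon] / expected
    return values

toy_cds = "TTTTTTTTCGGG"  # codons: TTT TTT TTC GGG
toy_rscu = rscu(toy_cds)
toy_gc3 = gc3(toy_cds)
```

An RSCU of 1 means a codon is used as often as expected under equal usage; values above 1 indicate a preferred codon.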
Image-guided genomic analysis of tissue response to laser-induced thermal stress

NASA Astrophysics Data System (ADS)

Mackanos, Mark A.; Helms, Mike; Kalish, Flora; Contag, Christopher H.

2011-05-01

The cytoprotective response to thermal injury is characterized by transcriptional activation of "heat shock proteins" (hsp) and proinflammatory proteins. Expression of these proteins may predict cellular survival. Microarray analyses were performed to identify spatially distinct gene expression patterns responding to thermal injury. Laser injury zones were identified by expression of a transgene reporter comprised of the 70 kD hsp gene and the firefly luciferase coding sequence. Zones included the laser spot, the surrounding region where hsp70-luc expression was increased, and a region adjacent to the surrounding region. A total of 145 genes were up-regulated in the laser irradiated region, while 69 were up-regulated in the adjacent region. At 7 hours the chemokine Cxcl3 was the highest expressed gene in the laser spot (24 fold) and adjacent region (32 fold). Chemokines were the most common up-regulated genes identified. Microarray gene expression was successfully validated using qRT-polymerase chain reaction for selected genes of interest. The early response genes are likely involved in cytoprotection and initiation of the healing response. Their regulatory elements will benefit the creation of next-generation reporter mice and the control of therapeutic protein expression. The identified genes serve as drug development targets that may prevent acute tissue damage and accelerate healing.
The Rodin-Ohno hypothesis that two enzyme superfamilies descended from one ancestral gene: an unlikely scenario for the origins of translation that will not be dismissed

PubMed Central

2014-01-01

Background Because amino acid activation is rate-limiting for uncatalyzed protein synthesis, it is a key puzzle in understanding the origin of the genetic code. Two unrelated classes (I and II) of contemporary aminoacyl-tRNA synthetases (aaRS) now translate the code. Observing that codons for the most highly conserved, Class I catalytic peptides, when read in the reverse direction, are very nearly anticodons for Class II defining catalytic peptides, Rodin and Ohno proposed that the two superfamilies descended from opposite strands of the same ancestral gene. This unusual hypothesis languished for a decade, perhaps because it appeared to be unfalsifiable. Results The proposed sense/antisense alignment makes important predictions. Fragments that align in antiparallel orientations, and contain the respective active sites, should catalyze the same two reactions catalyzed by contemporary synthetases. Recent experiments confirmed that prediction. Invariant cores from both classes, called Urzymes after Ur = primitive, authentic, plus enzyme and representing ~20% of the contemporary structures, can be expressed and exhibit high, proportionate rate accelerations for both amino-acid activation and tRNA acylation. A major fraction (60%) of the catalytic rate acceleration by contemporary synthetases resides in segments that align sense/antisense. Bioinformatic evidence for sense/antisense ancestry extends to codons specifying the invariant secondary and tertiary structures outside the active sites of the two synthetase classes. Peptides from a designed, 46-residue gene constrained by Rosetta to encode Class I and II ATP binding sites with fully complementary sequences both accelerate amino acid activation by ATP ~400 fold. Conclusions Biochemical and bioinformatic results substantially enhance the posterior probability that ancestors of the two synthetase classes arose from opposite strands of the same ancestral gene. 
The remarkable acceleration by short peptides of the rate-limiting step in uncatalyzed protein synthesis, together with the synergy of synthetase Urzymes and their cognate tRNAs, introduce a new paradigm for the origin of protein catalysts, emphasize the potential relevance of an operational RNA code embedded in the tRNA acceptor stems, and challenge the RNA-World hypothesis. Reviewers This article was reviewed by Dr. Paul Schimmel (nominated by Laura Landweber), Dr. Eugene Koonin and Professor David Ardell. PMID:24927791
Analysis of SOX10 mutations identified in Waardenburg-Hirschsprung patients: Differential effects on target gene regulation.

PubMed

Chan, Kwok Keung; Wong, Corinne Kung Yen; Lui, Vincent Chi Hang; Tam, Paul Kwong Hang; Sham, Mai Har

2003-10-15

SOX10 is a member of the SOX gene family related by homology to the high-mobility group (HMG) box region of the testis-determining gene SRY. Mutations of the transcription factor gene SOX10 lead to Waardenburg-Hirschsprung syndrome (Waardenburg-Shah syndrome, WS4) in humans. A number of SOX10 mutations have been identified in WS4 patients who suffer from different extents of intestinal aganglionosis, pigmentation, and hearing abnormalities. Some patients also exhibit signs of myelination deficiency in the central and peripheral nervous systems. Although the molecular bases for the wide range of symptoms displayed by the patients are still not clearly understood, a few target genes for SOX10 have been identified. We have analyzed the impact of six different SOX10 mutations on the activation of SOX10 target genes by yeast one-hybrid and mammalian cell transfection assays. To investigate the transactivation activities of the mutant proteins, three different SOX target binding sites were introduced into luciferase reporter gene constructs and examined in our series of transfection assays: consensus HMG domain protein binding sites; SOX10 binding sites identified in the RET promoter; and Sox10 binding sites identified in the P0 promoter. We found that the same mutation could have different transactivation activities when tested with different target binding sites and in different cell lines. The differential transactivation activities of the SOX10 mutants appeared to correlate with the intestinal and/or neurological symptoms presented in the patients. Among the six mutant SOX10 proteins tested, much reduced transactivation activities were observed when tested on the SOX10 binding sites from the RET promoter. Of the two similar mutations X467K and 1400del12, only the 1400del12 mutant protein exhibited an increase of transactivation through the P0 promoter. 
While the lack of normal SOX10 mediated activation of RET transcription may lead to intestinal aganglionosis, overexpression of genes coding for structural myelin proteins such as P0 due to mutant SOX10 may explain the dysmyelination phenotype observed in the patients with an additional neurological disorder. Copyright 2003 Wiley-Liss, Inc.

Gene expression profiling of porcine skeletal muscle in the early recovery phase following acute physical activity.

PubMed

Jensen, Jeanette H; Conley, Lene N; Hedegaard, Jakob; Nielsen, Mathilde; Young, Jette F; Oksbjerg, Niels; Hornshøj, Henrik; Bendixen, Christian; Thomsen, Bo

2012-07-01

Acute physical activity elicits changes in gene expression in skeletal muscles to promote metabolic changes and to repair exercise-induced muscle injuries. In the present time-course study, pigs were submitted to an acute bout of treadmill running until near exhaustion to determine the impact of unaccustomed exercise on global transcriptional profiles in porcine skeletal muscles. Using a combined microarray and candidate gene approach, we identified a suite of genes that are differentially expressed in muscles during postexercise recovery. Several members of the heat shock protein family and proteins associated with proteolytic events, such as the muscle-specific E3 ubiquitin ligase atrogin-1, were significantly upregulated, suggesting that protein breakdown, prevention of protein aggregation and stabilization of unfolded proteins are important processes for restoration of cellular homeostasis. We also detected an upregulation of genes that are associated with muscle cell proliferation and differentiation, including MUSTN1, ASB5 and CSRP3, possibly reflecting activation, differentiation and fusion of satellite cells to facilitate repair of muscle damage. In addition, exercise increased expression of the orphan nuclear hormone receptor NR4A3, which regulates metabolic functions associated with lipid, carbohydrate and energy homeostasis. Finally, we observed an unanticipated induction of the long non-coding RNA transcript NEAT1, which has been implicated in RNA processing and nuclear retention of adenosine-to-inosine edited mRNAs in the ribonucleoprotein bodies called paraspeckles. These findings expand the complexity of pathways affected by acute contractile activity of skeletal muscle, contributing to a better understanding of the molecular processes that occur in muscle tissue in the recovery phase.
APPRIS 2017: principal isoforms for multiple gene sets

PubMed Central

Rodriguez-Rivas, Juan; Di Domenico, Tomás; Vázquez, Jesús; Valencia, Alfonso

2018-01-01

The APPRIS database (http://appris-tools.org) uses protein structural and functional features and information from cross-species conservation to annotate splice isoforms in protein-coding genes. APPRIS selects a single protein isoform, the ‘principal’ isoform, as the reference for each gene based on these annotations. A single main splice isoform reflects the biological reality for most protein-coding genes, and APPRIS principal isoforms are the best predictors of these main protein isoforms. Here, we present the updates to the database, new developments that include the addition of three new species (chimpanzee, Drosophila melanogaster and Caenorhabditis elegans), the expansion of APPRIS to cover the RefSeq gene set and the UniProtKB proteome for six species and refinements in the core methods that make up the annotation pipeline. In addition, APPRIS now provides a measure of reliability for individual principal isoforms and updates with each release of the GENCODE/Ensembl and RefSeq reference sets. The individual GENCODE/Ensembl, RefSeq and UniProtKB reference gene sets for six organisms have been merged to produce common sets of splice variants. PMID:29069475
Deficient brain RNA polymerase and altered nucleolar structure persists until day 8 after perinatal asphyxia of the rat.

PubMed

Kastner, Philomena; Mosgoeller, Wilhelm; Fang-Kircher, Susanne; Kitzmueller, Erwin; Kirchner, Liselotte; Hoeger, Harald; Seither, Peter; Lubec, Gert; Lubec, Barbara

2003-01-01

RNA polymerases (POL) are integral constituents of the protein synthesis machinery, with POL I and POL III coding for ribosomal RNA and POL II coding for protein. POL I is located in the nucleolus and transcribes class I genes, those that code for large ribosomal RNA. It has been reported that the POL system is seriously affected in perinatal asphyxia (PA) immediately after birth. Because POL I is necessary for protein synthesis and brain protein synthesis was shown to be deranged after hypoxic-ischemic conditions, we aimed to study whether POL derangement persists in a simple, well-documented animal model of graded global PA at the activity, mRNA, protein, and morphologic level until 8 d after the asphyctic insult. Nuclear POL I activity was determined according to a radiochemical method; mRNA steady state and protein levels of RPA40, an essential subunit of POL I and III, were evaluated by blotting methods; and the POL I subunit polymerase activating factor-53 was evaluated using immunohistochemistry. Silver staining and transmission electron microscopy were used to examine the nucleolus. At the eighth day after PA, nuclear POL I decreased with the length of the asphyctic period, whereas mRNA and protein levels for RPA40 were unchanged. The subunit polymerase activating factor-53, however, was unambiguously reduced in several brain regions. Dramatic changes of nucleolar morphology were observed, the main finding being nucleolar disintegration at the electron microscopy level. We suggest that severe acidosis and/or deficient protein kinase C in the brain during the asphyctic period may be responsible for disintegration of the nucleolus as well as for decreased POL activity persisting until the eighth day after PA. The biologic effect may be that PA causes impaired RNA and protein synthesis, which has been already observed in hypoxic-ischemic states.
The Mediator complex and transcription regulation

PubMed Central

Poss, Zachary C.; Ebmeier, Christopher C.

2013-01-01

The Mediator complex is a multi-subunit assembly that appears to be required for regulating expression of most RNA polymerase II (pol II) transcripts, which include protein-coding and most non-coding RNA genes. Mediator and pol II function within the pre-initiation complex (PIC), which consists of Mediator, pol II, TFIIA, TFIIB, TFIID, TFIIE, TFIIF and TFIIH and is approximately 4.0 MDa in size. Mediator serves as a central scaffold within the PIC and helps regulate pol II activity in ways that remain poorly understood. Mediator is also generally targeted by sequence-specific, DNA-binding transcription factors (TFs) that work to control gene expression programs in response to developmental or environmental cues. At a basic level, Mediator functions by relaying signals from TFs directly to the pol II enzyme, thereby facilitating TF-dependent regulation of gene expression. Thus, Mediator is essential for converting biological inputs (communicated by TFs) to physiological responses (via changes in gene expression). In this review, we summarize an expansive body of research on the Mediator complex, with an emphasis on yeast and mammalian complexes. We focus on the basics that underlie Mediator function, such as its structure and subunit composition, and describe its broad regulatory influence on gene expression, ranging from chromatin architecture to transcription initiation and elongation, to mRNA processing. We also describe factors that influence Mediator structure and activity, including TFs, non-coding RNAs and the CDK8 module. PMID:24088064
GeneBuilder: interactive in silico prediction of gene structure.

PubMed

Milanesi, L; D'Angelo, D; Rogozin, I B

1999-01-01

Prediction of gene structure in newly sequenced DNA becomes very important in large genome sequencing projects. This problem is complicated due to the exon-intron structure of eukaryotic genes and because gene expression is regulated by many different short nucleotide domains. In order to be able to analyse the full gene structure in different organisms, it is necessary to combine information about potential functional signals (promoter region, splice sites, start and stop codons, 3' untranslated region) together with the statistical properties of coding sequences (coding potential), information about homologous proteins, ESTs and repeated elements. We have developed the GeneBuilder system which is based on prediction of functional signals and coding regions by different approaches in combination with similarity searches in proteins and EST databases. The potential gene structure models are obtained by using a dynamic programming method. The program permits the use of several parameters for gene structure prediction and refinement. During gene model construction, selecting different exon homology levels with a protein sequence selected from a list of homologous proteins can improve the accuracy of the gene structure prediction. In the case of low homology, GeneBuilder is still able to predict the gene structure. The GeneBuilder system has been tested by using the standard set (Burset and Guigo, Genomics, 34, 353-367, 1996) and the performances are: 0.89 sensitivity and 0.91 specificity at the nucleotide level. The total correlation coefficient is 0.88. The GeneBuilder system is implemented as a part of the WebGene at the URL: http://www.itba.mi.cnr.it/webgene and TRADAT (TRAnscription Database and Analysis Tools) launcher URL: http://www.itba.mi.cnr.it/tradat.
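The nucleotide-level figures quoted above follow the evaluation conventions of the cited Burset and Guigo benchmark. There, sensitivity is TP/(TP+FN) and "specificity" is TP/(TP+FP), the fraction of predicted coding nucleotides that are truly coding, rather than the TN-based definition used elsewhere. The sketch below is illustrative only: the label strings are invented, and encoding each position as "C" (coding) or "N" (non-coding) is an assumption of this example, not GeneBuilder's format.

```python
from math import sqrt

def nucleotide_accuracy(actual, predicted):
    """Return (sensitivity, specificity, correlation) per nucleotide.

    Both arguments are equal-length strings of 'C' (coding) and
    'N' (non-coding) labels, one character per nucleotide.
    """
    if len(actual) != len(predicted):
        raise ValueError("label strings must have the same length")
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if a == "C" and p == "C":
            tp += 1
        elif a == "N" and p == "C":
            fp += 1
        elif a == "N" and p == "N":
            tn += 1
        else:
            fn += 1
    nan = float("nan")
    sensitivity = tp / (tp + fn) if tp + fn else nan
    specificity = tp / (tp + fp) if tp + fp else nan  # gene-finding sense
    denom = sqrt((tp + fn) * (tn + fp) * (tp + fp) * (tn + fn))
    correlation = (tp * tn - fn * fp) / denom if denom else nan
    return sensitivity, specificity, correlation

# Hypothetical 8-nucleotide example with one missed and one spurious base.
sn, sp, cc = nucleotide_accuracy("CCCCNNNN", "CCCNCNNN")
```

The correlation coefficient summarizes both error types in one value, which is why it is usually reported alongside the two rates.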
Identification of a Conserved Non-Protein-Coding Genomic Element that Plays an Essential Role in Alphabaculovirus Pathogenesis

PubMed Central

Kikhno, Irina

2014-01-01

Highly homologous sequences 154–157 bp in length grouped under the name of “conserved non-protein-coding element” (CNE) were revealed in all of the sequenced genomes of baculoviruses belonging to the genus Alphabaculovirus. A CNE alignment led to the detection of a set of highly conserved nucleotide clusters that occupy strictly conserved positions in the CNE sequence. The significant length of the CNE and conservation of both its length and cluster architecture were identified as a combination of characteristics that make this CNE different from known viral non-coding functional sequences. The essential role of the CNE in the Alphabaculovirus life cycle was demonstrated through the use of a CNE-knockout Autographa californica multiple nucleopolyhedrovirus (AcMNPV) bacmid. It was shown that the essential function of the CNE was not mediated by the presumed expression activities of the protein- and non-protein-coding genes that overlap the AcMNPV CNE. On the basis of the presented data, the AcMNPV CNE was categorized as a complex-structured, polyfunctional genomic element involved in an essential DNA transaction that is associated with an undefined function of the baculovirus genome. PMID:24740153
Gene Trapping Using Gal4 in Zebrafish

PubMed Central

Balciuniene, Jorune; Balciunas, Darius

2013-01-01

Large clutch size and external development of optically transparent embryos make zebrafish an exceptional vertebrate model system for in vivo insertional mutagenesis using fluorescent reporters to tag expression of mutated genes. Several laboratories have constructed and tested enhancer- and gene-trap vectors in zebrafish, using fluorescent proteins, Gal4- and lexA-based transcriptional activators as reporters 1-7. These vectors had two potential drawbacks: suboptimal stringency (e.g. lack of ability to differentiate between enhancer- and gene-trap events) and low mutagenicity (e.g. integrations into genes rarely produced null alleles). Gene Breaking Transposons (GBTs) were developed to address these drawbacks 8-10. We have modified one of the first GBT vectors, GBT-R15, for use with Gal4-VP16 as the primary gene trap reporter and added UAS:eGFP as the secondary reporter for direct detection of gene trap events. Application of Gal4-VP16 as the primary gene trap reporter provides two main advantages. First, it increases sensitivity for genes expressed at low expression levels. Second, it enables researchers to use gene trap lines as Gal4 drivers to direct expression of other transgenes in very specific tissues. This is especially pertinent for genes with non-essential or redundant functions, where gene trap integration may not result in overt phenotypes. The disadvantage of using Gal4-VP16 as the primary gene trap reporter is that genes coding for proteins with N-terminal signal sequences are not amenable to trapping, as the resulting Gal4-VP16 fusion proteins are unlikely to be able to enter the nucleus and activate transcription. Importantly, the use of Gal4-VP16 does not pre-select for nuclear proteins: we recovered gene trap mutations in genes encoding proteins which function in the nucleus, the cytoplasm and the plasma membrane. PMID:24121167
The gene coding for the B cell surface protein CD19 is localized on human chromosome 16p11.

PubMed

Stapleton, P; Kozmik, Z; Weith, A; Busslinger, M

1995-02-01

The CD19 gene codes for one of the earliest markers of the human B cell lineage and is a target for the B lymphoid-specific transcription factor BSAP (Pax-5). The transmembrane protein CD19 has been implicated in controlling proliferation of mature B lymphocytes by modulating signal transduction through the antigen receptor. In this study, we have employed Southern blot and fluorescence in situ hybridization analyses to localize the CD19 gene to human chromosome 16p11.
A murC gene in Porphyromonas gingivalis 381.

PubMed

Ansai, T; Yamashita, Y; Awano, S; Shibata, Y; Wachi, M; Nagai, K; Takehara, T

1995-09-01

The gene encoding a 51 kDa polypeptide of Porphyromonas gingivalis 381 was isolated by immunoblotting using an antiserum raised against P. gingivalis alkaline phosphatase. DNA sequence analysis of a 2.5 kb DNA fragment containing a gene encoding the 51 kDa protein revealed one complete and two incomplete ORFs. Database searches using the FASTA program revealed significant homology between the P. gingivalis 51 kDa protein and the MurC protein of Escherichia coli, which functions in peptidoglycan synthesis. The cloned 51 kDa protein encoded a functional product that complemented an E. coli murC mutant. Moreover, the ORF just upstream of murC coded for a protein that was 31% homologous with the E. coli MurG protein. The ORF just downstream of murC coded for a protein that was 17% homologous with the Streptococcus pneumoniae penicillin-binding protein 2B (PBP2B), which functions in peptidoglycan synthesis and is responsible for antibiotic resistance. These results suggest that P. gingivalis contains a homologue of the E. coli peptidoglycan synthesis gene murC and indicate the possibility of a cluster of genes responsible for cell division and cell growth, as in the E. coli mra region.
The Crc global regulator inhibits the Pseudomonas putida pWW0 toluene/xylene assimilation pathway by repressing the translation of regulatory and structural genes.

PubMed

Moreno, Renata; Fonseca, Pilar; Rojo, Fernando

2010-08-06

In Pseudomonas putida, the expression of the pWW0 plasmid genes for the toluene/xylene assimilation pathway (the TOL pathway) is subject to complex regulation in response to environmental and physiological signals. This includes strong inhibition via catabolite repression, elicited by the carbon sources that the cells prefer to hydrocarbons. The Crc protein, a global regulator that controls carbon flow in pseudomonads, has an important role in this inhibition. Crc is a translational repressor that regulates the TOL genes, but how it does this has remained unknown. This study reports that Crc binds to sites located at the translation initiation regions of the mRNAs coding for XylR and XylS, two specific transcription activators of the TOL genes. Unexpectedly, eight additional Crc binding sites were found overlapping the translation initiation sites of genes coding for several enzymes of the pathway, all encoded within two polycistronic mRNAs. Evidence is provided supporting the idea that these sites are functional. This implies that Crc can differentially modulate the expression of particular genes within polycistronic mRNAs. It is proposed that Crc controls TOL genes in two ways. First, Crc inhibits the translation of the XylR and XylS regulators, thereby reducing the transcription of all TOL pathway genes. Second, Crc inhibits the translation of specific structural genes of the pathway, acting mainly on proteins involved in the first steps of toluene assimilation. This ensures a rapid inhibitory response that reduces the expression of the toluene/xylene degradation proteins when preferred carbon sources become available.
The Complete Mitogenome of the Wood-Feeding Cockroach Cryptocercus meridianus (Blattodea: Cryptocercidae) and Its Phylogenetic Relationship among Cockroach Families.

PubMed

Li, Weijun; Wang, Zongqing; Che, Yanli

2017-11-12

In this study, the complete mitochondrial genome of Cryptocercus meridianus was sequenced. The circular mitochondrial genome is 15,322 bp in size and contains 13 protein-coding genes, two ribosomal RNA genes (12S rRNA and 16S rRNA), 22 transfer RNA genes, and one D-loop region. We compare the mitogenome of C. meridianus with that of C. relictus and C. kyebangensis. The base composition of the whole genome was 45.20%, 9.74%, 16.06%, and 29.00% for A, G, C, and T, respectively; it shows a high AT content (74.2%), similar to the mitogenomes of C. relictus and C. kyebangensis. The protein-coding genes are initiated with typical mitochondrial start codons except for cox1 with TTG. The gene order of the C. meridianus mitogenome differs from the typical insect pattern for the translocation of tRNA-Ser AGN, while the mitogenomes of the other two Cryptocercus species, C. relictus and C. kyebangensis, are consistent with the typical insect pattern. There are two very long non-coding intergenic regions lying on both sides of the rearranged gene tRNA-Ser AGN. The phylogenetic relationships were constructed based on the nucleotide sequence of 13 protein-coding genes and two ribosomal RNA genes. The mitogenome of C. meridianus is the first representative of the order Blattodea that demonstrates rearrangement, and it will contribute to the further study of the phylogeny and evolution of the genus Cryptocercus and related taxa.
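The reported AT content follows directly from the base composition: 45.20% (A) + 29.00% (T) = 74.20%. A minimal sketch of how such figures are derived from a sequence, and a check of the reported numbers, is given below. The function names and the four-base toy sequence are hypothetical; the real mitogenome sequence is not reproduced here.

```python
from collections import Counter

def base_composition(seq):
    """Percentage of A, C, G and T among the unambiguous bases of seq."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    return {base: 100 * counts[base] / total for base in "ACGT"}

def at_content(composition):
    """AT content in percent, given a composition dictionary."""
    return composition["A"] + composition["T"]

# Percentages as reported in the abstract.
reported = {"A": 45.20, "G": 9.74, "C": 16.06, "T": 29.00}
reported_at = at_content(reported)

# Invented toy sequence, only to show the counting step.
toy = base_composition("AATG")
```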
The pig X and Y Chromosomes: structure, sequence, and evolution

PubMed Central

Skinner, Benjamin M.; Sargent, Carole A.; Churcher, Carol; Hunt, Toby; Herrero, Javier; Loveland, Jane E.; Dunn, Matt; Louzada, Sandra; Fu, Beiyuan; Chow, William; Gilbert, James; Austin-Guest, Siobhan; Beal, Kathryn; Carvalho-Silva, Denise; Cheng, William; Gordon, Daria; Grafham, Darren; Hardy, Matt; Harley, Jo; Hauser, Heidi; Howden, Philip; Howe, Kerstin; Lachani, Kim; Ellis, Peter J.I.; Kelly, Daniel; Kerry, Giselle; Kerwin, James; Ng, Bee Ling; Threadgold, Glen; Wileman, Thomas; Wood, Jonathan M.D.; Yang, Fengtang; Harrow, Jen; Affara, Nabeel A.; Tyler-Smith, Chris

2016-01-01

We have generated an improved assembly and gene annotation of the pig X Chromosome, and a first draft assembly of the pig Y Chromosome, by sequencing BAC and fosmid clones from Duroc animals and incorporating information from optical mapping and fiber-FISH. The X Chromosome carries 1033 annotated genes, 690 of which are protein coding. Gene order closely matches that found in primates (including humans) and carnivores (including cats and dogs), which is inferred to be ancestral. Nevertheless, several protein-coding genes present on the human X Chromosome were absent from the pig, and 38 pig-specific X-chromosomal genes were annotated, 22 of which were olfactory receptors. The pig Y-specific Chromosome sequence generated here comprises 30 megabases (Mb). A 15-Mb subset of this sequence was assembled, revealing two clusters of male-specific low copy number genes, separated by an ampliconic region including the HSFY gene family, which together make up most of the short arm. Both clusters contain palindromes with high sequence identity, presumably maintained by gene conversion. Many of the ancestral X-related genes previously reported in at least one mammalian Y Chromosome are represented either as active genes or partial sequences. This sequencing project has allowed us to identify genes—both single copy and amplified—on the pig Y Chromosome, to compare the pig X and Y Chromosomes for homologous sequences, and thereby to reveal mechanisms underlying pig X and Y Chromosome evolution. PMID:26560630
Relaxed evolution in the tyrosine aminotransferase gene tat in old world fruit bats (Chiroptera: Pteropodidae).

PubMed

Shen, Bin; Fang, Tao; Yang, Tianxiao; Jones, Gareth; Irwin, David M; Zhang, Shuyi

2014-01-01

Frugivorous and nectarivorous bats fuel their metabolism mostly by using carbohydrates and allocate the restricted amounts of ingested proteins mainly for anabolic protein syntheses rather than for catabolic energy production. Thus, it is possible that genes involved in protein (amino acid) catabolism may have undergone relaxed evolution in these fruit- and nectar-eating bats. The tyrosine aminotransferase (TAT, encoded by the Tat gene) is the rate-limiting enzyme in the tyrosine catabolic pathway. To test whether the Tat gene has undergone relaxed evolution in the fruit- and nectar-eating bats, we obtained the Tat coding region from 20 bat species including four Old World fruit bats (Pteropodidae) and two New World fruit bats (Phyllostomidae). Phylogenetic reconstructions revealed a gene tree in which all echolocating bats (including the New World fruit bats) formed a monophyletic group. The phylogenetic conflict appears to stem from accelerated TAT protein sequence evolution in the Old World fruit bats. Our molecular evolutionary analyses confirmed a change in the selection pressure acting on Tat, which was likely caused by a relaxation of the evolutionary constraints on the Tat gene in the Old World fruit bats. Hepatic TAT activity assays showed that TAT activities in species of the Old World fruit bats are significantly lower than those of insectivorous bats and omnivorous mice, which was not caused by a change in TAT protein levels in the liver. Our study provides unambiguous evidence that the Tat gene has undergone relaxed evolution in the Old World fruit bats in response to changes in their metabolism due to the evolution of their special diet.
Molecular cloning and sequence analysis of the gene coding for the 57kDa soluble antigen of the salmonid fish pathogen Renibacterium salmoninarum

USGS Publications Warehouse

Chien, Maw-Sheng; Gilbert, Teresa L.; Huang, Chienjin; Landolt, Marsha L.; O'Hara, Patrick J.; Winton, James R.

1992-01-01

The complete sequence coding for the 57-kDa major soluble antigen of the salmonid fish pathogen, Renibacterium salmoninarum, was determined. The gene contained an open reading frame of 1671 nucleotides coding for a protein of 557 amino acids with a calculated Mr value of 57190. The first 26 amino acids constituted a signal peptide. The deduced sequence for amino acid residues 27–61 was in agreement with the 35 N-terminal amino acid residues determined by microsequencing, suggesting the protein is synthesized as a 557-amino acid precursor and processed to produce a mature protein of Mr 54505. Two regions of the protein contained imperfect direct repeats. The first region contained two copies of an 81-residue repeat; the second contained five copies of an unrelated 25-residue repeat. Also, a perfect inverted repeat (including three in-frame UAA stop codons) was observed at the carboxyl-terminus of the gene.
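The length arithmetic in this abstract can be checked with a short sketch. The helper name `codons_in_orf` is hypothetical. The mature-protein length of 531 residues is derived here from the stated figures (557 minus the 26-residue signal peptide); the abstract does not state it.

```python
def codons_in_orf(nucleotides: int) -> int:
    """Return the number of codons in an ORF whose length is a multiple of 3."""
    if nucleotides % 3 != 0:
        raise ValueError("ORF length is not a whole number of codons")
    return nucleotides // 3

# Figures taken from the abstract.
orf_nt = 1671
signal_peptide_aa = 26

precursor_aa = codons_in_orf(orf_nt)           # 557 residues, as reported
mature_aa = precursor_aa - signal_peptide_aa   # derived, not stated in the abstract

# Residues 27..61 inclusive are the stretch confirmed by microsequencing.
sequenced_residues = range(27, 61 + 1)

print(precursor_aa, mature_aa, len(sequenced_residues))
```

The 1671-nucleotide figure divides exactly into 557 codons, which suggests the reported ORF length excludes the stop codon; that reading is an inference from the numbers, not something the abstract says.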
Characterization of the complete mitochondrial genome of Marshallagia marshalli and phylogenetic implications for the superfamily Trichostrongyloidea.

PubMed

Sun, Miao-Miao; Han, Liang; Zhang, Fu-Kai; Zhou, Dong-Hui; Wang, Shu-Qing; Ma, Jun; Zhu, Xing-Quan; Liu, Guo-Hua

2018-01-01

Marshallagia marshalli (Nematoda: Trichostrongylidae) infection can lead to serious parasitic gastroenteritis in sheep, goats, and wild ruminants, causing significant socioeconomic losses worldwide. To date, studies of the molecular biology of M. marshalli have been limited. Herein, we sequenced the complete mitochondrial (mt) genome of M. marshalli and examined its phylogenetic relationship with selected members of the superfamily Trichostrongyloidea using Bayesian inference (BI) based on concatenated mt amino acid sequence datasets. The complete mt genome sequence of M. marshalli is 13,891 bp, including 12 protein-coding genes, 22 transfer RNA genes, and 2 ribosomal RNA genes. All protein-coding genes are transcribed in the same direction. Phylogenetic analyses based on concatenated amino acid sequences of the 12 protein-coding genes supported the monophylies of the families Haemonchidae, Molineidae, and Dictyocaulidae with strong statistical support, but rejected the monophyly of the family Trichostrongylidae. The determination of the complete mt genome sequence of M. marshalli provides novel genetic markers for studying the systematics, population genetics, and molecular epidemiology of M. marshalli and its congeners.
Molecular Cloning, Characterization, and Differential Expression of a Glucoamylase Gene from the Basidiomycetous Fungus Lentinula edodes

PubMed Central

Zhao, J.; Chen, Y. H.; Kwan, H. S.

2000-01-01

The complete nucleotide sequence of putative glucoamylase gene gla1 from the basidiomycetous fungus Lentinula edodes strain L54 is reported. The coding region of the genomic glucoamylase sequence, which is preceded by eukaryotic promoter elements CAAT and TATA, spans 2,076 bp. The gla1 gene sequence codes for a putative polypeptide of 571 amino acids and is interrupted by seven introns. The open reading frame sequence of the gla1 gene shows strong homology with those of other fungal glucoamylase genes and encodes a protein with an N-terminal catalytic domain and a C-terminal starch-binding domain. The similarity between the Gla1 protein and other fungal glucoamylases is from 45 to 61%, with the region of highest conservation found in catalytic domains and starch-binding domains. We compared the kinetics of glucoamylase activity and levels of gene expression in L. edodes strain L54 grown on different carbon sources (glucose, starch, cellulose, and potato extract) and in various developmental stages (mycelium growth, primordium appearance, and fruiting body formation). Quantitative reverse transcription PCR utilizing pairs of primers specific for gla1 gene expression shows that expression of gla1 was induced by starch and increased during the process of fruiting body formation, which indicates that glucoamylases may play an important role in the morphogenesis of the basidiomycetous fungus. PMID:10831434
Genomics of Clostridium taeniosporum, an organism which forms endospores with ribbon-like appendages

PubMed Central

Cambridge, Joshua M.; Blinkova, Alexandra L.; Salvador Rocha, Erick I.; Bode Hernández, Addys; Moreno, Maday; Ginés-Candelaria, Edwin; Goetz, Benjamin M.; Hunicke-Smith, Scott; Satterwhite, Ed; Tucker, Haley O.

2018-01-01

Clostridium taeniosporum, a non-pathogenic anaerobe closely related to the C. botulinum Group II members, was isolated from Crimean lake silt about 60 years ago. Its endospores are surrounded by an encasement layer which forms a trunk at one spore pole to which about 12–14 large, ribbon-like appendages are attached. The genome consists of one 3,264,813 bp, circular chromosome (with 26.6% GC) and three plasmids. The chromosome contains 2,892 potential protein coding sequences: 2,124 have specific functions, 147 have general functions, 228 are conserved but without known function and 393 are hypothetical based on the fact that no statistically significant orthologs were found. The chromosome also contains 101 genes for stable RNAs, including 7 rRNA clusters. Over 84% of the protein coding sequences and 96% of the stable RNA coding regions are oriented in the same direction as replication. The three known appendage genes are located within a single cluster with five other genes, the protein products of which are closely related, in terms of sequence, to the known appendage proteins. The relatedness of the deduced protein products suggests that all or some of the closely related genes might code for minor appendage proteins or assembly factors. The appendage genes might be unique among the known clostridia; no statistically significant orthologs were found within other clostridial genomes for which sequence data are available. The C. taeniosporum chromosome contains two functional prophages, one Siphoviridae and one Myoviridae, and one defective prophage. Three plasmids of 5.9, 69.7 and 163.1 Kbp are present. These data are expected to contribute to future studies of developmental, structural and evolutionary biology and to potential industrial applications of this organism. PMID:29293521
Genomics of Clostridium taeniosporum, an organism which forms endospores with ribbon-like appendages.

PubMed

Cambridge, Joshua M; Blinkova, Alexandra L; Salvador Rocha, Erick I; Bode Hernández, Addys; Moreno, Maday; Ginés-Candelaria, Edwin; Goetz, Benjamin M; Hunicke-Smith, Scott; Satterwhite, Ed; Tucker, Haley O; Walker, James R

2018-01-01

Clostridium taeniosporum, a non-pathogenic anaerobe closely related to the C. botulinum Group II members, was isolated from Crimean lake silt about 60 years ago. Its endospores are surrounded by an encasement layer which forms a trunk at one spore pole to which about 12-14 large, ribbon-like appendages are attached. The genome consists of one 3,264,813 bp, circular chromosome (with 26.6% GC) and three plasmids. The chromosome contains 2,892 potential protein coding sequences: 2,124 have specific functions, 147 have general functions, 228 are conserved but without known function and 393 are hypothetical based on the fact that no statistically significant orthologs were found. The chromosome also contains 101 genes for stable RNAs, including 7 rRNA clusters. Over 84% of the protein coding sequences and 96% of the stable RNA coding regions are oriented in the same direction as replication. The three known appendage genes are located within a single cluster with five other genes, the protein products of which are closely related, in terms of sequence, to the known appendage proteins. The relatedness of the deduced protein products suggests that all or some of the closely related genes might code for minor appendage proteins or assembly factors. The appendage genes might be unique among the known clostridia; no statistically significant orthologs were found within other clostridial genomes for which sequence data are available. The C. taeniosporum chromosome contains two functional prophages, one Siphoviridae and one Myoviridae, and one defective prophage. Three plasmids of 5.9, 69.7 and 163.1 Kbp are present. These data are expected to contribute to future studies of developmental, structural and evolutionary biology and to potential industrial applications of this organism.
Differential regulation of transcription through distinct Suppressor of Hairless DNA binding site architectures during Notch signaling in proneural clusters.

PubMed

Cave, John W; Xia, Li; Caudy, Michael

2011-01-01

In Drosophila melanogaster, achaete (ac) and m8 are model basic helix-loop-helix activator (bHLH A) and repressor genes, respectively, that have the opposite cell expression pattern in proneural clusters during Notch signaling. Previous studies have shown that activation of m8 transcription in specific cells within proneural clusters by Notch signaling is programmed by a "combinatorial" and "architectural" DNA transcription code containing binding sites for the Su(H) and proneural bHLH A proteins. Here we show the novel result that the ac promoter contains a similar combinatorial code of Su(H) and bHLH A binding sites but contains a different Su(H) site architectural code that does not mediate activation during Notch signaling, thus programming a cell expression pattern opposite that of m8 in proneural clusters.
Identification Of Protein Vaccine Candidates Using Comprehensive Proteomic Analysis Strategies

DTIC Science & Technology

2007-12-01

urease (URE) gene codes for a urea amidohydrolase protein that catalyzes urea hydrolysis. The protein was first isolated from C. immitis and...the Cu, Zn, Superoxide Dismutase (SOD), the Spherule Outer Wall glycoprotein (SOWgp), the T-Cell Reactive Protein (TCRP), and Urease (URE). It is...et al. 1997. Isolation and characterization of the urease gene (URE) from the pathogenic fungus Coccidioides immitis. Gene 198: 387-391. 54. Li, K

Cloning and identification of bacteriophage T4 gene 2 product gp2 and action of gp2 on infecting DNA in vivo.

PubMed Central

Lipinska, B; Rao, A S; Bolten, B M; Balakrishnan, R; Goldberg, E B

1989-01-01

We sequenced bacteriophage T4 genes 2 and 3 and the putative C-terminal portion of gene 50. They were found to have appropriate open reading frames directed counterclockwise on the T4 map. Mutations in genes 2 and 64 were shown to be in the same open reading frame, which we now call gene 2. This gene codes for a protein of 27,068 daltons. The open reading frame corresponding to gene 3 codes for a protein of 20,634 daltons. Appropriate bands on polyacrylamide gels were identified at 30 and 20 kilodaltons, respectively. We found that the product of the cloned gene 2 can protect T4 DNA double-stranded ends from exonuclease V action. PMID:2644202
AP1 Keeps Chromatin Poised for Action | Center for Cancer Research

Cancer.gov

The human genome harbors gene-encoding DNA, the blueprint for building proteins that regulate cellular function. Embedded across the genome, in non-coding regions, are DNA elements to which regulatory factors bind. The interaction of regulatory factors with DNA at these sites modifies gene expression to modulate cell activity. In cells, DNA exists in a complex with proteins called chromatin that compacts the DNA in the nucleus, strongly restricting access to DNA sequences. As a result, regulatory factors only interact with a small subset of their potential binding elements in a given cell to regulate genes. How factors recognize and select sites in chromatin across the genome is not well understood -- but several discoveries in CCR’s Laboratory of Receptor Biology and Gene Expression (LRBGE) have shed light on the mechanisms that direct factors to DNA.
The Arabidopsis TOR Kinase Specifically Regulates the Expression of Nuclear Genes Coding for Plastidic Ribosomal Proteins and the Phosphorylation of the Cytosolic Ribosomal Protein S6

PubMed Central

Dobrenel, Thomas; Mancera-Martínez, Eder; Forzani, Céline; Azzopardi, Marianne; Davanture, Marlène; Moreau, Manon; Schepetilnikov, Mikhail; Chicher, Johana; Langella, Olivier; Zivy, Michel; Robaglia, Christophe; Ryabova, Lyubov A.; Hanson, Johannes; Meyer, Christian

2016-01-01

Protein translation is an energy consuming process that has to be fine-tuned at both the cell and organism levels to match the availability of resources. The target of rapamycin kinase (TOR) is a key regulator of a large range of biological processes in response to environmental cues. In this study, we have investigated the effects of TOR inactivation on the expression and regulation of Arabidopsis ribosomal proteins at different levels of analysis, namely from transcriptomic to phosphoproteomic. TOR inactivation resulted in a coordinated down-regulation of the transcription and translation of nuclear-encoded mRNAs coding for plastidic ribosomal proteins, which could explain the chlorotic phenotype of the TOR silenced plants. We have identified in the 5′ untranslated regions (UTRs) of this set of genes a conserved sequence related to the 5′ terminal oligopyrimidine motif, which is known to confer translational regulation by the TOR kinase in other eukaryotes. Furthermore, the phosphoproteomic analysis of the ribosomal fraction following TOR inactivation revealed a lower phosphorylation of the conserved Ser240 residue in the C-terminal region of the 40S ribosomal protein S6 (RPS6). These results were confirmed by Western blot analysis using an antibody that specifically recognizes phosphorylated Ser240 in RPS6. Finally, this antibody was used to follow TOR activity in plants. Our results thus uncover a multi-level regulation of plant ribosomal genes and proteins by the TOR kinase. PMID:27877176
Dynamic gene expression response to altered gravity in human T cells.

PubMed

Thiel, Cora S; Hauschild, Swantje; Huge, Andreas; Tauber, Svantje; Lauber, Beatrice A; Polzer, Jennifer; Paulsen, Katrin; Lier, Hartwin; Engelmann, Frank; Schmitz, Burkhard; Schütte, Andreas; Layer, Liliana E; Ullrich, Oliver

2017-07-12

We investigated the dynamics of immediate and initial gene expression response to different gravitational environments in human Jurkat T lymphocytic cells and compared expression profiles to identify potential gravity-regulated genes and adaptation processes. We used the Affymetrix GeneChip® Human Transcriptome Array 2.0 containing 44,699 protein coding genes and 22,829 non-protein coding genes and performed the experiments during a parabolic flight and a suborbital ballistic rocket mission to cross-validate gravity-regulated gene expression through independent research platforms and different sets of control experiments to exclude factors other than alteration of gravity. We found that gene expression in human T cells rapidly responded to altered gravity in the time frame of 20 s and 5 min. The initial response to microgravity involved mostly regulatory RNAs. We identified three gravity-regulated genes which could be cross-validated in both completely independent experiment missions: ATP6V1A/D, a vacuolar H+-ATPase (V-ATPase) responsible for acidification during bone resorption, IGHD3-3/IGHD3-10, diversity genes of the immunoglobulin heavy-chain locus participating in V(D)J recombination, and LINC00837, a long intergenic non-protein coding RNA. Due to the extensive and rapid alteration of gene expression associated with regulatory RNAs, we conclude that human cells are equipped with a robust and efficient adaptation potential when challenged with altered gravitational environments.
Genomic localization of the human gene encoding Dr1, a negative modulator of transcription of class II and class III genes.

PubMed

Purrello, M; Di Pietro, C; Rapisarda, A; Viola, A; Corsaro, C; Motta, S; Grzeschik, K H; Sichel, G

1996-01-01

Dr1 is a nuclear protein of 19 kDa that exists in the nucleoplasm as a homotetramer. By binding to TBP (the DNA-binding subunit of TFIID, and also a subunit of SL1 and TFIIIB), the protein blocks class II and class III preinitiation complex assembly, thus repressing the activity of the corresponding promoters. Since transcription of class I genes is unaffected by Dr1, it has been proposed that the protein may coordinate the expression of class I, class II and class III genes. By somatic cell genetics and fluorescence in situ hybridization, we have localized the gene (DR1), present in the genome of higher eukaryotes as a single copy, to human chromosome region 1p21→p13. The nucleotide sequence conservation of the coding segment of the gene, as determined by Noah's ark blot analysis, and its ubiquitous transcription suggest that Dr1 has an important biological role, which could be related to the negative control of cell proliferation.
Molecular characterization and expression analysis of AMPK α subunit isoform genes from Scophthalmus maximus responding to salinity stress.

PubMed

Zeng, Lin; Liu, Bin; Wu, Chang-Wen; Lei, Ji-Lin; Xu, Mei-Ying; Zhu, Ai-Yi; Zhang, Jian-She; Hong, Wan-Shu

2016-12-01

AMP-activated protein kinase (AMPK) is a highly conserved and multi-functional protein kinase that plays important roles in both intracellular energy balance and cellular stress response. In the present study, molecular characterization, tissue distribution and gene expression levels of the AMPK α1 and α2 genes from turbot (Scophthalmus maximus) under salinity stress are described. The complete coding regions of the AMPK α1 and α2 genes were isolated from turbot through degenerate primers in combination with RACE using muscle cDNA. The complete coding regions of AMPK α1 (1722 bp) and α2 (1674 bp) encoded peptides of 573 and 557 amino acids, respectively. Multiple alignments, structural analysis and phylogenetic tree construction indicated that S. maximus AMPK α1 and α2 shared a high amino acid identity with other species, especially fish. AMPK α1 and α2 genes could be detected in all tested tissues, indicating that they are constitutively expressed. Salinity challenges significantly altered the gene expression levels of AMPK α1 and α2 mRNA in a salinity- and time-dependent manner in S. maximus gill tissues, suggesting that AMPK α1 and α2 played important roles in mediating the salinity stress in S. maximus. The expression levels of AMPK α1 and α2 mRNA were positively correlated with gill Na+, K+-ATPase activities. These findings will aid our understanding of the molecular mechanism of juvenile turbot in response to environmental salinity changes.
The metazoan Mediator co-activator complex as an integrative hub for transcriptional regulation.

PubMed

Malik, Sohail; Roeder, Robert G

2010-11-01

The Mediator is an evolutionarily conserved, multiprotein complex that is a key regulator of protein-coding genes. In metazoan cells, multiple pathways that are responsible for homeostasis, cell growth and differentiation converge on the Mediator through transcriptional activators and repressors that target one or more of the almost 30 subunits of this complex. Besides interacting directly with RNA polymerase II, Mediator has multiple functions and can interact with and coordinate the action of numerous other co-activators and co-repressors, including those acting at the level of chromatin. These interactions ultimately allow the Mediator to deliver outputs that range from maximal activation of genes to modulation of basal transcription to long-term epigenetic silencing.
Generation of a variety of stable Influenza A reporter viruses by genetic engineering of the NS gene segment

PubMed Central

Reuther, Peter; Göpfert, Kristina; Dudek, Alexandra H.; Heiner, Monika; Herold, Susanne; Schwemmle, Martin

2015-01-01

Influenza A viruses (IAV) pose a constant threat to the human population and therefore a better understanding of their fundamental biology and identification of novel therapeutics is of utmost importance. Various reporter-encoding IAV were generated to achieve these goals; however, one recurring difficulty was genetic instability, especially of larger reporter genes. We employed the viral NS segment coding for the non-structural protein 1 (NS1) and nuclear export protein (NEP) for stable expression of diverse reporter proteins. This was achieved by converting the NS segment into a single open reading frame (ORF) coding for NS1, the respective reporter and NEP. To allow expression of individual proteins, the reporter genes were flanked by two porcine Teschovirus-1 2A peptide (PTV-1 2A)-coding sequences. The resulting viruses encoding luciferases, fluorescent proteins or a Cre recombinase are characterized by a high genetic stability in vitro and in mice and can be readily employed for antiviral compound screenings, visualization of infected cells or cells that survived acute infection. PMID:26068081
Variation in Its C-Terminal Amino Acids Determines Whether Endo-β-Mannanase Is Active or Inactive in Ripening Tomato Fruits of Different Cultivars

PubMed Central

Bourgault, Richard; Bewley, J. Derek

2002-01-01

Endo-β-mannanase cDNAs were cloned and characterized from ripening tomato (Lycopersicon esculentum Mill. cv Trust) fruit, which produces an active enzyme, and from the tomato cv Walter, which produces an inactive enzyme. There is a two-nucleotide deletion in the gene from tomato cv Walter, which results in a frame shift and the deletion of four amino acids at the C terminus of the full-length protein. Other cultivars that produce either active or inactive enzyme show the same absence or presence of the two-nucleotide deletion. The endo-β-mannanase enzyme protein was purified and characterized from ripe fruit to ensure that cDNA codes for the enzyme from fruit. Immunoblot analysis demonstrated that non-ripening mutants, which also fail to exhibit endo-β-mannanase activity, do so because they fail to express the protein. In a two-way genetic cross between tomato cvs Walter and Trust, all F1 progeny from both crosses produced fruit with active enzyme, suggesting that this form is dominant and homozygous in tomato cv Trust. Self-pollination of a plant from the heterozygous F1 generation yielded F2 plants that bear fruit with and without active enzyme at a ratio appropriate to Mendelian genetic segregation of alleles. Heterologous expression of the two endo-β-mannanase genes in Escherichia coli resulted in active enzyme being produced from cultures containing the tomato cv Trust gene and inactive enzyme being produced from those containing the tomato cv Walter gene. Site-directed mutagenesis was used to establish key elements in the C terminus of the endo-β-mannanase protein that are essential for full enzyme activity. PMID:12427992
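A minimal sketch of how a two-nucleotide deletion shifts the reading frame and changes the C terminus of the encoded protein. The sequences below are invented toy examples, not the tomato endo-β-mannanase sequence, and `translate` is a hypothetical helper that uses the standard genetic code. In this toy case the shifted frame runs straight into a stop codon, so the product is four residues shorter.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(cds: str) -> str:
    """Translate a DNA coding sequence from position 0 up to the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# Toy wild-type coding sequence (hypothetical): ATG GCT AAA GGT GCT AGC GAA CTG TAA
wild_type = "ATGGCTAAAGGTGCTAGCGAACTGTAA"

# Delete two nucleotides (0-based positions 12 and 13) to shift the frame.
mutant = wild_type[:12] + wild_type[14:]

wild_protein = translate(wild_type)
mutant_protein = translate(mutant)
print(wild_protein, mutant_protein)
```

Because the deletion is not a multiple of three, every codon downstream of it is read in a new frame. That is why a small change near the 3' end can alter or truncate only the last few residues.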
Novel coding, translation, and gene expression of a replicating covalently closed circular RNA of 220 nt.

PubMed

AbouHaidar, Mounir Georges; Venkataraman, Srividhya; Golshani, Ashkan; Liu, Bolin; Ahmad, Tauqeer

2014-10-07

The highly structured (64% GC) covalently closed circular (CCC) RNA (220 nt) of the virusoid associated with rice yellow mottle virus codes for a 16-kDa highly basic protein using novel modalities for coding, translation, and gene expression. This CCC RNA is the smallest among all known viroids and virusoids and the only one that codes proteins. Its sequence possesses an internal ribosome entry site and is directly translated through two (or three) completely overlapping ORFs (shifting to a new reading frame at the end of each round). The initiation and termination codons overlap in the sequence UGAUGA (the initiation codon AUG lies within the combined initiation-termination sequence). Termination codons can be ignored to obtain larger read-through proteins. This circular RNA with no noncoding sequences is a unique natural supercompact "nanogenome."
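The frame shift at the end of each round follows from the genome length: 220 is not a multiple of 3, so a ribosome that keeps reading around the circle starts each new lap one nucleotide out of step with the previous one. The sketch below illustrates only this arithmetic; `frame_after_rounds` is a hypothetical helper and nothing here models the actual virusoid sequence.

```python
GENOME_NT = 220  # length of the circular RNA, from the abstract

def frame_after_rounds(rounds: int, length: int = GENOME_NT) -> int:
    """Reading-frame offset (0, 1 or 2) at the start of a lap after `rounds` full laps."""
    return (rounds * length) % 3

# Frame offset at the start of laps 1 to 4.
frames = [frame_after_rounds(r) for r in range(4)]

# Three full laps cover a whole number of codons, so the original frame returns.
codons_in_three_laps = (3 * GENOME_NT) // 3

# In the combined signal UGAUGA, AUG starts at 0-based position 2,
# flanked by UGA stop codons starting at positions 0 and 3.
signal = "UGAUGA"
aug_index = signal.find("AUG")
uga_indices = [i for i in range(len(signal) - 2) if signal[i:i + 3] == "UGA"]

print(frames, codons_in_three_laps, aug_index, uga_indices)
```

The three laps are read in three different frames, which is consistent with the abstract's description of two (or three) completely overlapping ORFs.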
Transcriptional profiling of murine osteoblast differentiation based on RNA-seq expression analyses.

PubMed

Khayal, Layal Abo; Grünhagen, Johannes; Provazník, Ivo; Mundlos, Stefan; Kornak, Uwe; Robinson, Peter N; Ott, Claus-Eric

2018-04-11

Osteoblastic differentiation is a multistep process characterized by osteogenic induction of mesenchymal stem cells, which then differentiate into proliferative pre-osteoblasts that produce copious amounts of extracellular matrix, followed by stiffening of the extracellular matrix, and matrix mineralization by hydroxylapatite deposition. Although these processes have been well characterized biologically, a detailed transcriptional analysis of murine primary calvaria osteoblast differentiation based on RNA sequencing (RNA-seq) analyses has not previously been reported. Here, we used RNA-seq to obtain expression values of 29,148 genes at four time points as murine primary calvaria osteoblasts differentiate in vitro until onset of mineralization was clearly detectable by microscopic inspection. Expression of marker genes confirmed osteogenic differentiation. We explored differential expression of 1386 protein-coding genes using unsupervised clustering and GO analyses. 100 differentially expressed lncRNAs were investigated by co-expression with protein-coding genes that are localized within the same topologically associated domain. Additionally, we monitored expression of 237 genes that are silent or active at distinct time points and compared differential exon usage. Our data represent an in-depth profiling of murine primary calvaria osteoblast differentiation by RNA-seq and contribute to our understanding of genetic regulation of this key process in osteoblast biology. Copyright © 2018 Elsevier Inc. All rights reserved.
Functional genomics provides insights into the role of Propionibacterium freudenreichii ssp. shermanii JS in cheese ripening.

PubMed

Ojala, Teija; Laine, Pia K S; Ahlroos, Terhi; Tanskanen, Jarna; Pitkänen, Saara; Salusjärvi, Tuomas; Kankainen, Matti; Tynkkynen, Soile; Paulin, Lars; Auvinen, Petri

2017-01-16

Propionibacterium freudenreichii is a commercially important bacterium that is essential for the development of the characteristic eyes and flavor of Swiss-type cheeses. These bacteria grow actively and produce large quantities of flavor compounds during cheese ripening at warm temperatures but also appear to contribute to the aroma development during the subsequent cold storage of cheese. Here, we advance our understanding of the role of P. freudenreichii in cheese ripening by presenting the 2.68-Mbp annotated genome sequence of P. freudenreichii ssp. shermanii JS and determining its global transcriptional profiles during industrial cheese-making using transcriptome sequencing. The annotation of the genome identified a total of 2377 protein-coding genes and revealed the presence of enzymes and pathways for formation of several flavor compounds. Based on transcriptome profiling, the expression of 348 protein-coding genes was altered between the warm and cold room ripening of cheese. Several propionate, acetate, and diacetyl/acetoin production related genes had higher expression levels in the warm room, whereas a general slowing down of the metabolism and an activation of mobile genetic elements was seen in the cold room. A few ripening-related and amino acid catabolism involved genes were induced or remained active in cold room, indicating that strain JS contributes to the aroma development also during cold room ripening. In addition, we performed a comparative genomic analysis of strain JS and 29 other Propionibacterium strains of 10 different species, including an isolate of both P. freudenreichii subspecies freudenreichii and shermanii. Ortholog grouping of the predicted protein sequences revealed that close to 86% of the ortholog groups of strain JS, including a variety of ripening-related ortholog groups, were conserved across the P. freudenreichii isolates. Taken together, this study contributes to the understanding of the genomic basis of P. freudenreichii and sheds light on its activities during cheese ripening. Copyright © 2016 Elsevier B.V. All rights reserved.
Molecular characterization demonstrates that the Zea mays gene sugary2 codes for the starch synthase isoform SSIIa.

PubMed

Zhang, Xiaoli; Colleoni, Christophe; Ratushna, Vlada; Sirghie-Colleoni, Mirella; James, Martha G; Myers, Alan M

2004-04-01

Mutations in the maize gene sugary2 (su2) affect starch structure and its resultant physiochemical properties in useful ways, although the gene has not been characterized previously at the molecular level. This study tested the hypothesis that su2 codes for starch synthase IIa (SSIIa). Two independent mutations of the su2 locus, su2-2279 and su2-5178, were identified in a Mutator-active maize population. The nucleotide sequence of the genomic locus that codes for SSIIa was compared between wild type plants and those homozygous for either novel mutation. Plants bearing su2-2279 invariably contained a Mutator transposon in exon 3 of the SSIIa gene, and su2-5178 mutants always contained a small retrotransposon-like insertion in exon 10. Six allelic su2 (-) mutations conditioned loss or reduction in abundance of the SSIIa protein detected by immunoblot. These data indicate that su2 codes for SSIIa and that deficiency in this isoform is ultimately responsible for the altered physiochemical properties of su2 (-) mutant starches. A specific starch synthase isoform among several identified in soluble endosperm extracts was absent in su2-2279 or su2-5178 mutants, indicating that SSIIa is active in the soluble phase during kernel development. The immediate structural effect of the su2 (-) mutations was shown to be increased abundance of short glucan chains in amylopectin and a proportional decrease in intermediate length chains, similar to the effects of SSII deficiency in other species.
From Genomes to Protein Models and Back

NASA Astrophysics Data System (ADS)

Tramontano, Anna; Giorgetti, Alejandro; Orsini, Massimiliano; Raimondo, Domenico

2007-12-01

The alternative splicing mechanism allows genes to generate more than one product. When the splicing events occur within protein coding regions they can modify the biological function of the protein. Alternative splicing has been suggested as one way for explaining the discrepancy between the number of human genes and functional complexity. We analysed the putative structure of the alternatively spliced gene products annotated in the ENCODE pilot project and discovered that many of the potential alternative gene products will be unlikely to produce stable functional proteins.
Multiple Site-Directed and Saturation Mutagenesis by the Patch Cloning Method.

PubMed

Taniguchi, Naohiro; Murakami, Hiroshi

2017-01-01

Constructing protein-coding genes with desired mutations is a basic step for protein engineering. Herein, we describe a multiple site-directed and saturation mutagenesis method, termed MUPAC. This method has been used to introduce multiple site-directed mutations in the green fluorescent protein gene and in the Moloney murine leukemia virus reverse transcriptase gene. Moreover, this method was also successfully used to introduce randomized codons at five desired positions in the green fluorescent protein gene, and for simple DNA assembly for cloning.
The complete mitochondrial genome of a spiraling whitefly, Aleurodicus dispersus Russell (Hemiptera: Aleyrodidae).

PubMed

Ming-Xing, Lu; Zhi-Teng, Chen; Wei-Wei, Yu; Yu-Zhou, Du

2017-03-01

We report the complete mitochondrial genome (mitogenome) of a spiraling whitefly, Aleurodicus dispersus (Hemiptera: Aleyrodidae). The 16 170 bp long genome consists of 13 protein-coding genes, 20 transfer RNAs, 2 ribosomal RNAs, and a control region. The A. dispersus mitogenome also includes a cytb-like non-coding region and shows several variations relative to the typical insect mitogenome. A phylogenetic tree has been constructed using the 13 protein-coding genes of 12 related species from Hemiptera. Our results would contribute to further study of phylogeny in Aleyrodidae and Hemiptera.
Genes associated with pro-apoptotic and protective mechanisms are affected differently on exposure of neuronal cell cultures to arsenite. No indication for endoplasmic reticulum stress despite activation of grp78 and gadd153 expression.

PubMed

Mengesdorf, Thorsten; Althausen, Sonja; Paschen, Wulf

2002-08-15

The effect of arsenite exposure on cell viability, protein synthesis, energy metabolism and the expression of genes coding for cytoplasmic (hsp70) and endoplasmic reticulum (ER; gadd153, grp78, grp94) stress proteins was investigated in primary neuronal cell cultures. Furthermore, signs of ER stress were evaluated by investigating xbp1 mRNA processing. Arsenite levels of 30 and 100 microM induced severe cell injury. Protein synthesis was reduced to below 20% of control in cultures exposed to 30 and 100 microM arsenite for 1 h, and it remained markedly suppressed until 24 h of exposure. Arsenite induced a transient inhibition of energy metabolism after 1 h of exposure, but energy state recovered completely after 3 h. Arsenite exposure affected the expression and translation of genes coding for HSP70, GRP78, GRP94, and GADD153 to different extents. While hsp70 mRNA levels rose drastically, approximately 550-fold after 6 h exposure, HSP70 protein levels did not change over the first 6 h. On the other hand, gadd153 mRNA levels rose only approximately 14-fold after 6 h exposure, while GADD153 protein levels were markedly increased after 3 and 6 h exposure. HSP70 protein levels were markedly increased and GADD153 protein levels decreased to almost control levels in cultures left in arsenite solution for 24 h, i.e. when only a small fraction of cells had escaped arsenite toxicity. Arsenite exposure of neurons thus induced an imbalance between pro-apoptotic and survival-activating pathways. Despite the marked increase in gadd153 mRNA levels, we did not observe signs of xbp1 processing in arsenite exposed cultures, indicating that arsenite did not produce ER stress.
ChIPBase v2.0: decoding transcriptional regulatory networks of non-coding RNAs and protein-coding genes from ChIP-seq data.

PubMed

Zhou, Ke-Ren; Liu, Shun; Sun, Wen-Ju; Zheng, Ling-Ling; Zhou, Hui; Yang, Jian-Hua; Qu, Liang-Hu

2017-01-04

The abnormal transcriptional regulation of non-coding RNAs (ncRNAs) and protein-coding genes (PCGs) contributes to various biological processes and is linked with human diseases, but the underlying mechanisms remain elusive. In this study, we developed ChIPBase v2.0 (http://rna.sysu.edu.cn/chipbase/) to explore the transcriptional regulatory networks of ncRNAs and PCGs. ChIPBase v2.0 has been expanded with ∼10 200 curated ChIP-seq datasets, which represents about a 20-fold expansion compared with the previously released version. We identified thousands of binding motif matrices and their binding sites from ChIP-seq data of DNA-binding proteins and predicted millions of transcriptional regulatory relationships between transcription factors (TFs) and genes. We constructed the 'Regulator' module to predict hundreds of TFs and histone modifications that were involved in or affected transcription of ncRNAs and PCGs. Moreover, we built a web-based tool, Co-Expression, to explore the co-expression patterns between DNA-binding proteins and various types of genes by integrating the gene expression profiles of ∼10 000 tumor samples and ∼9100 normal tissues and cell lines. ChIPBase also provides a ChIP-Function tool and a genome browser to predict functions of diverse genes and visualize various ChIP-seq data. This study will greatly expand our understanding of the transcriptional regulation of ncRNAs and PCGs. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Origin of sphinx, a young chimeric RNA gene in Drosophila melanogaster

PubMed Central

Wang, Wen; Brunet, Frédéric G.; Nevo, Eviatar; Long, Manyuan

2002-01-01

Non-protein-coding RNA genes play an important role in various biological processes. How new RNA genes originated and whether this process is controlled by similar evolutionary mechanisms for the origin of protein-coding genes remains unclear. A young chimeric RNA gene that we term sphinx (spx) provides the first insight into the early stage of evolution of RNA genes. spx originated as an insertion of a retroposed sequence of the ATP synthase chain F gene at the cytological region 60DB since the divergence of Drosophila melanogaster from its sibling species 2–3 million years ago. This retrosequence, which is located at 102F on the fourth chromosome, recruited a nearby exon and intron, thereby evolving a chimeric gene structure. This molecular process suggests that the mechanism of exon shuffling, which can generate protein-coding genes, also plays a role in the origin of RNA genes. The subsequent evolutionary process of spx has been associated with a high nucleotide substitution rate, possibly driven by a continuous positive Darwinian selection for a novel function, as is shown in its sex- and development-specific alternative splicing. To test whether spx has adapted to different environments, we investigated its population genetic structure in the unique “Evolution Canyon” in Israel, revealing a similar haplotype structure in spx, and thus similar evolutionary forces operating on spx between environments. PMID:11904380
No involvement of the nerve growth factor gene locus in hypertension in spontaneously hypertensive rats.

PubMed

Nemoto, Kiyomitsu; Sekimoto, Masashi; Fukamachi, Katsumi; Kageyama, Haruaki; Degawa, Masakuni; Hamadai, Masanori; Hendley, Edith D; Macrae, I Mhairi; Clark, James S; Dominiczak, Anna F; Ueyama, Takashi

2005-02-01

Sympathetic hyper-innervation and increased levels of nerve growth factor (NGF), an essential neurotrophic factor for sympathetic neurons, have been observed in the vascular tissues of spontaneously hypertensive rats (SHRs). Such observations have suggested that the pathogenesis of hypertension might involve a qualitative or quantitative abnormality in the NGF protein, resulting from a significant mutation in the gene's promoter or coding region. In the present study, we analyzed the nucleotide sequences of the cis-element of the NGF gene in SHRs, stroke-prone SHRs (SHRSPs), and normotensive Wistar-Kyoto (WKY) rats. The present analyses revealed some differences in the 3-kb promoter region, coding exon, and 3' untranslated region (3'UTR) for the NGF gene among those strains. However, the observed differences did not lead to changes in promoter activity or to amino acid substitution; nor did they represent a link between the 3'UTR mutation of SHRSPs and elevated blood pressure in an F2 generation produced by crossbreeding SHRSPs with WKY rats. These results suggest that the NGF gene locus is not involved in hypertension in SHR/ SHRSP strains. The present study also revealed two differences between SHRs and WKY rats, as found in cultured vascular smooth muscle cells and in mRNA prepared from each strain. First, SHRs had higher expression levels of c-fos and c-jun genes, which encode the component of the AP-1 transcription factor that activates NGF gene transcription. Second, NGF mRNAs prepared from SHRs had a longer 3'UTR than those prepared from WKY rats. Although it remains to be determined whether these events play a role in the hypertension of SHR/SHRSP strains, the present results emphasize the importance of actively searching for aberrant trans-acting factor(s) leading to the enhanced expression of the NGF gene and NGF protein in SHR/SHRSP strains.

The rat alpha-tropomyosin gene generates a minimum of six different mRNAs coding for striated, smooth, and nonmuscle isoforms by alternative splicing.

PubMed Central

Wieczorek, D F; Smith, C W; Nadal-Ginard, B

1988-01-01

Tropomyosin (TM), a ubiquitous protein, is a component of the contractile apparatus of all cells. In nonmuscle cells, it is found in stress fibers, while in sarcomeric and nonsarcomeric muscle, it is a component of the thin filament. Several different TM isoforms specific for nonmuscle cells and different types of muscle cell have been described. As for other contractile proteins, it was assumed that smooth, striated, and nonmuscle isoforms were each encoded by different sets of genes. Through the use of S1 nuclease mapping, RNA blots, and 5' extension analyses, we showed that the rat alpha-TM gene, whose expression was until now considered to be restricted to muscle cells, generates many different tissue-specific isoforms. The promoter of the gene appears to be very similar to other housekeeping promoters in both its pattern of utilization, being active in most cell types, and its lack of any canonical sequence elements. The rat alpha-TM gene is split into at least 13 exons, 7 of which are alternatively spliced in a tissue-specific manner. This gene arrangement, which also includes two different 3' ends, generates a minimum of six different mRNAs each with the capacity to code for a different protein. These distinct TM isoforms are expressed specifically in nonmuscle and smooth and striated (cardiac and skeletal) muscle cells. The tissue-specific expression and developmental regulation of these isoforms is, therefore, produced by alternative mRNA processing. Moreover, structural and sequence comparisons among TM genes from different phyla suggest that alternative splicing is evolutionarily a very old event that played an important role in gene evolution and might have appeared concomitantly with or even before constitutive splicing. PMID:3352602
The HOX genes are expressed, in vivo, in human tooth germs: in vitro cAMP exposure of dental pulp cells results in parallel HOX network activation and neuronal differentiation.

PubMed

D'Antò, Vincenzo; Cantile, Monica; D'Armiento, Maria; Schiavo, Giulia; Spagnuolo, Gianrico; Terracciano, Luigi; Vecchione, Raffaela; Cillo, Clemente

2006-03-01

Homeobox-containing genes play a crucial role in odontogenesis. After the detection of Dlx and Msx genes in overlapping domains along maxillary and mandibular processes, a homeobox odontogenic code has been proposed to explain the interaction between different homeobox genes during dental lamina patterning. No role has so far been assigned to the Hox gene network in the homeobox odontogenic code due to studies on specific Hox genes and evolutionary considerations. Despite its involvement in early patterning during embryonal development, the HOX gene network, the most repeat-poor regions of the human genome, controls the phenotype identity of adult eukaryotic cells. Here, according to our results, the HOX gene network appears to be active in human tooth germs between 18 and 24 weeks of development. The immunohistochemical localization of specific HOX proteins mostly concerns the epithelial tooth germ compartment. Furthermore, only a few genes of the network are active in embryonal retromolar tissues, as well as in ectomesenchymal dental pulp cells (DPC) grown in vitro from adult human molar. Exposure of DPCs to cAMP induces the expression of from three to nine total HOX genes of the network in parallel with phenotype modifications with traits of neuronal differentiation. Our observations suggest that: (i) by combining its component genes, the HOX gene network determines the phenotype identity of epithelial and ectomesenchymal cells interacting in the generation of human tooth germ; (ii) cAMP treatment activates the HOX network and induces, in parallel, a neuronal-like phenotype in human primary ectomesenchymal dental pulp cells. 2005 Wiley-Liss, Inc.
Genome-wide transcriptional analysis of flagellar regeneration in Chlamydomonas reinhardtii identifies orthologs of ciliary disease genes

NASA Technical Reports Server (NTRS)

Stolc, Viktor; Samanta, Manoj Pratim; Tongprasit, Waraporn; Marshall, Wallace F.

2005-01-01

The important role that cilia and flagella play in human disease creates an urgent need to identify genes involved in ciliary assembly and function. The strong and specific induction of flagellar-coding genes during flagellar regeneration in Chlamydomonas reinhardtii suggests that transcriptional profiling of such cells would reveal new flagella-related genes. We have conducted a genome-wide analysis of RNA transcript levels during flagellar regeneration in Chlamydomonas by using DNA oligonucleotide microarrays, produced by a maskless photolithography method, with unique probe sequences for all exons of the 19,803 predicted genes. This analysis represents a previously uncharacterized whole-genome transcriptional activity profiling study in this important model organism. Analysis of strongly induced genes reveals a large set of known flagellar components and also identifies a number of important disease-related proteins as being involved with cilia and flagella, including the zebrafish polycystic kidney genes Qilin, Reptin, and Pontin, as well as the testis-expressed tubby-like protein TULP2.
Complete genome sequencing of the luminescent bacterium, Vibrio qinghaiensis sp. Q67 using PacBio technology

NASA Astrophysics Data System (ADS)

Gong, Liang; Wu, Yu; Jian, Qijie; Yin, Chunxiao; Li, Taotao; Gupta, Vijai Kumar; Duan, Xuewu; Jiang, Yueming

2018-01-01

Vibrio qinghaiensis sp.-Q67 (Vqin-Q67) is a freshwater luminescent bacterium that continuously emits blue-green light (485 nm). The bacterium has been widely used for detecting toxic contaminants. Here, we report the complete genome sequence of Vqin-Q67, obtained using third-generation PacBio sequencing technology. Continuous long reads were attained from three PacBio sequencing runs and reads >500 bp with a quality value of >0.75 were merged together into a single dataset. This resultant highly-contiguous de novo assembly has no genome gaps, and comprises two chromosomes with substantial genetic information, including protein-coding genes, non-coding RNA, transposon and gene islands. Our dataset can be useful as a comparative genome for evolution and speciation studies, as well as for the analysis of protein-coding gene families, the pathogenicity of different Vibrio species in fish, the evolution of non-coding RNA and transposon, and the regulation of gene expression in relation to the bioluminescence of Vqin-Q67.
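The read-selection step described in this abstract (keep reads >500 bp with a quality value >0.75, then merge the runs into one dataset) can be illustrated with a minimal Python sketch. The `Read` type, its field names, and the `merge_runs` helper are hypothetical and are not taken from the authors' pipeline; only the two thresholds come from the abstract.

```python
from typing import Iterable, List, NamedTuple


class Read(NamedTuple):
    """Hypothetical container for one sequencing read."""
    name: str
    sequence: str
    quality: float  # assumed per-read quality value between 0 and 1


MIN_LENGTH_BP = 500  # reads must be longer than this (threshold from the abstract)
MIN_QUALITY = 0.75   # reads must exceed this quality value (threshold from the abstract)


def passes_filter(read: Read) -> bool:
    """Return True if the read is both long enough and of sufficient quality."""
    return len(read.sequence) > MIN_LENGTH_BP and read.quality > MIN_QUALITY


def merge_runs(runs: Iterable[Iterable[Read]]) -> List[Read]:
    """Pool the reads of several runs into one list, keeping only reads that pass."""
    return [read for run in runs for read in run if passes_filter(read)]
```

For example, `merge_runs([run1, run2, run3])` would return a single list holding the retained reads of three runs in their original order.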
cncRNAs: Bi-functional RNAs with protein coding and non-coding functions

PubMed Central

Kumari, Pooja; Sampath, Karuna

2015-01-01

For many decades, the major function of mRNA was thought to be to provide protein-coding information embedded in the genome. The advent of high-throughput sequencing has led to the discovery of pervasive transcription of eukaryotic genomes and opened the world of RNA-mediated gene regulation. Many regulatory RNAs have been found to be incapable of protein coding and are hence termed non-coding RNAs (ncRNAs). However, studies in recent years have shown that several previously annotated non-coding RNAs have the potential to encode proteins, and conversely, some coding RNAs have regulatory functions independent of the protein they encode. Such bi-functional RNAs, with both protein coding and non-coding functions, which we term ‘cncRNAs’, have emerged as new players in cellular systems. Here, we describe the functions of some cncRNAs identified from bacteria to humans. Because the functions of many RNAs across genomes remain unclear, we propose that RNAs be classified as coding, non-coding or both only after careful analysis of their functions. PMID:26498036
Integration of mRNP formation and export.

PubMed

Björk, Petra; Wieslander, Lars

2017-08-01

Expression of protein-coding genes in eukaryotes relies on the coordinated action of many sophisticated molecular machineries. Transcription produces precursor mRNAs (pre-mRNAs) and the active gene provides an environment in which the pre-mRNAs are processed, folded, and assembled into RNA-protein (RNP) complexes. The dynamic pre-mRNPs incorporate the growing transcript, proteins, and the processing machineries, as well as the specific protein marks left after processing that are essential for export and the cytoplasmic fate of the mRNPs. After release from the gene, the mRNPs move by diffusion within the interchromatin compartment, making up pools of mRNPs. Here, splicing and polyadenylation can be completed and the mRNPs recruit the major export receptor NXF1. Export competent mRNPs interact with the nuclear pore complex, leading to export, concomitant with compositional and conformational changes of the mRNPs. We summarize the integrated nuclear processes involved in the formation and export of mRNPs.
Molecular cloning of a cDNA coding for GTP cyclohydrolase I from Dictyostelium discoideum.

PubMed Central

Witter, K; Cahill, D J; Werner, T; Ziegler, I; Rödl, W; Bacher, A; Gütlich, M

1996-01-01

The GTP cyclohydrolase I (GTP-CH) gene of the cellular slime mould Dictyostelium discoideum has been cloned and sequenced. The 855 bp cDNA of this gene contains the open reading frame (ORF) encoding 232 amino acids with a predicted molecular mass of approx. 26 kDa. Southern blot analysis indicated the presence of a single gene for GTP-CH in Dictyostelium. PCR amplification of the ORF from chromosomal DNA and sequencing showed the existence of a 101 bp intron in the GTP-CH gene of Dictyostelium discoideum. The amino acid sequence has 47% and 49% positional identity to those of the human and yeast enzymes respectively. Most of the sequence variation between species is located in the N-terminal part of the protein. The overall identity with the E. coli protein is markedly lower. The enzyme was expressed in E. coli and purified as a 68 kDa fusion protein with the maltose-binding protein of E. coli. GTP-CH of Dictyostelium is heat-stable and showed maximal activity at 60 degrees C. The Km value for GTP is 50 microM. PMID:8870645
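The sizes quoted in this abstract can be cross-checked with simple arithmetic. The sketch below assumes the common rule of thumb of about 110 Da per amino acid residue, which is an approximation and not a value from the paper; the function names are illustrative only.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average residue mass (assumption)


def estimate_mass_kda(n_residues: int) -> float:
    """Rough molecular mass of a protein in kDa from its residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


def orf_length_bp(n_residues: int, include_stop: bool = True) -> int:
    """Length of an open reading frame in bp: three bases per codon, plus a stop codon."""
    return 3 * n_residues + (3 if include_stop else 0)


mass = estimate_mass_kda(232)       # about 25.5 kDa, in line with the reported approx. 26 kDa
orf = orf_length_bp(232)            # 699 bp including the stop codon
non_coding = 855 - orf              # 156 bp of the 855 bp cDNA lie outside the ORF
fusion_partner_kda = 68 - 26        # about 42 kDa left for the fusion partner
```

Under these assumptions the 232-residue ORF accounts for 699 bp of the 855 bp cDNA, and the 68 kDa fusion protein leaves roughly 42 kDa for the maltose-binding protein moiety.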
Complex Interplay among DNA Modification, Noncoding RNA Expression and Protein-Coding RNA Expression in Salvia miltiorrhiza Chloroplast Genome

PubMed Central

Chen, Haimei; Zhang, Jianhui; Yuan, George; Liu, Chang

2014-01-01

Salvia miltiorrhiza is one of the most widely used medicinal plants. As a first step to develop a chloroplast-based genetic engineering method for the over-production of active components from S. miltiorrhiza, we have analyzed the genome, transcriptome, and base modifications of the S. miltiorrhiza chloroplast. Total genomic DNA and RNA were extracted from fresh leaves and then subjected to strand-specific RNA-Seq and Single-Molecule Real-Time (SMRT) sequencing analyses. Mapping the RNA-Seq reads to the genome assembly allowed us to determine the relative expression levels of 80 protein-coding genes. In addition, we identified 19 polycistronic transcription units and 136 putative antisense and intergenic noncoding RNA (ncRNA) genes. Comparison of the abundance of protein-coding transcripts (cRNA) with and without overlapping antisense ncRNAs (asRNA) suggests that the presence of asRNA is associated with increased cRNA abundance (p<0.05). Using the SMRT Portal software (v1.3.2), 2687 potential DNA modification sites and two potential DNA modification motifs were predicted. The two motifs include a TATA box–like motif (CPGDMM1, “TATANNNATNA”), and an unknown motif (CPGDMM2, “WNYANTGAW”). Specifically, 35 of the 97 CPGDMM1 motifs (36.1%) and 91 of the 369 CPGDMM2 motifs (24.7%) were found to be significantly modified (p<0.01). Analysis of genes downstream of the CPGDMM1 motif revealed a significantly increased abundance of ncRNA genes that are less than 400 bp away from the significantly modified CPGDMM1 motif (p<0.01). Taken together, the present study revealed a complex interplay among DNA modifications, ncRNA and cRNA expression in the chloroplast genome. PMID:24914614
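The two motifs in this abstract are written in IUPAC degenerate nucleotide notation (for example, N is any base, W is A or T, Y is C or T). As a minimal sketch of how such motifs can be located in a sequence, the Python code below expands a motif into a regular expression and reports match positions on the forward strand only. This is an illustration, not the SMRT Portal method used by the authors; the helper names are hypothetical. The `percent` helper simply reproduces the reported fractions.

```python
import re
from typing import List

# Standard IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif: str) -> "re.Pattern[str]":
    """Compile an IUPAC motif into a regex; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_motif(sequence: str, motif: str) -> List[int]:
    """Return 0-based start positions of the motif on the forward strand."""
    return [m.start() for m in motif_to_regex(motif).finditer(sequence.upper())]


def percent(part: int, total: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / total, 1)
```

With these helpers, `percent(35, 97)` and `percent(91, 369)` give 36.1 and 24.7, matching the percentages reported for CPGDMM1 and CPGDMM2. A scan of both strands would additionally need the reverse complement of the sequence.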
Complex interplay among DNA modification, noncoding RNA expression and protein-coding RNA expression in Salvia miltiorrhiza chloroplast genome.

PubMed

Chen, Haimei; Zhang, Jianhui; Yuan, George; Liu, Chang

2014-01-01

Salvia miltiorrhiza is one of the most widely used medicinal plants. As a first step to develop a chloroplast-based genetic engineering method for the over-production of active components from S. miltiorrhiza, we have analyzed the genome, transcriptome, and base modifications of the S. miltiorrhiza chloroplast. Total genomic DNA and RNA were extracted from fresh leaves and then subjected to strand-specific RNA-Seq and Single-Molecule Real-Time (SMRT) sequencing analyses. Mapping the RNA-Seq reads to the genome assembly allowed us to determine the relative expression levels of 80 protein-coding genes. In addition, we identified 19 polycistronic transcription units and 136 putative antisense and intergenic noncoding RNA (ncRNA) genes. Comparison of the abundance of protein-coding transcripts (cRNA) with and without overlapping antisense ncRNAs (asRNA) suggests that the presence of asRNA is associated with increased cRNA abundance (p<0.05). Using the SMRT Portal software (v1.3.2), 2687 potential DNA modification sites and two potential DNA modification motifs were predicted. The two motifs include a TATA box-like motif (CPGDMM1, "TATANNNATNA"), and an unknown motif (CPGDMM2, "WNYANTGAW"). Specifically, 35 of the 97 CPGDMM1 motifs (36.1%) and 91 of the 369 CPGDMM2 motifs (24.7%) were found to be significantly modified (p<0.01). Analysis of genes downstream of the CPGDMM1 motif revealed a significantly increased abundance of ncRNA genes that are less than 400 bp away from the significantly modified CPGDMM1 motif (p<0.01). Taken together, the present study revealed a complex interplay among DNA modifications, ncRNA and cRNA expression in the chloroplast genome.
A human haploid gene trap collection to study lncRNAs with unusual RNA biology.

PubMed

Kornienko, Aleksandra E; Vlatkovic, Irena; Neesen, Jürgen; Barlow, Denise P; Pauler, Florian M

2016-01-01

Many thousand long non-coding (lnc) RNAs are mapped in the human genome. Time-consuming studies using reverse genetic approaches by post-transcriptional knock-down or genetic modification of the locus demonstrated diverse biological functions for a few of these transcripts. The Human Gene Trap Mutant Collection in haploid KBM7 cells is a ready-to-use tool for studying protein-coding gene function. As lncRNAs show remarkable differences in RNA biology compared to protein-coding genes, it is unclear if this gene trap collection is useful for functional analysis of lncRNAs. Here we use the uncharacterized LOC100288798 lncRNA as a model to answer this question. Using public RNA-seq data we show that LOC100288798 is ubiquitously expressed, but inefficiently spliced. The minor spliced LOC100288798 isoforms are exported to the cytoplasm, whereas the major unspliced isoform is nuclear localized. This shows that LOC100288798 RNA biology differs markedly from typical mRNAs. De novo assembly from RNA-seq data suggests that LOC100288798 extends 289 kb beyond its annotated 3' end and overlaps the downstream SLC38A4 gene. Three cell lines with independent gene trap insertions in LOC100288798 were available from the KBM7 gene trap collection. RT-qPCR and RNA-seq confirmed successful lncRNA truncation and its extended length. Expression analysis from RNA-seq data shows significant deregulation of 41 protein-coding genes upon LOC100288798 truncation. Our data show that gene trap collections in human haploid cell lines are useful tools to study lncRNAs, and identify the previously uncharacterized LOC100288798 as a potential gene regulator.
Identification and characterization of a novel zebrafish (Danio rerio) pentraxin-carbonic anhydrase.

PubMed

Patrikainen, Maarit S; Tolvanen, Martti E E; Aspatwar, Ashok; Barker, Harlan R; Ortutay, Csaba; Jänis, Janne; Laitaoja, Mikko; Hytönen, Vesa P; Azizi, Latifeh; Manandhar, Prajwol; Jáger, Edit; Vullo, Daniela; Kukkurainen, Sampo; Hilvo, Mika; Supuran, Claudiu T; Parkkila, Seppo

2017-01-01

Carbonic anhydrases (CAs) are ubiquitous, essential enzymes which catalyze the conversion of carbon dioxide and water to bicarbonate and H+ ions. Vertebrate genomes generally contain gene loci for 15-21 different CA isoforms, three of which are enzymatically inactive. CA VI is the only secretory protein of the enzymatically active isoforms. We discovered that non-mammalian CA VI contains a C-terminal pentraxin (PTX) domain, a novel combination for both CAs and PTXs. We isolated and sequenced zebrafish (Danio rerio) CA VI cDNA, complete with the sequence coding for the PTX domain, and produced the recombinant CA VI-PTX protein. Enzymatic activity and kinetic parameters were measured with a stopped-flow instrument. Mass spectrometry, analytical gel filtration and dynamic light scattering were used for biophysical characterization. Sequence analyses and Bayesian phylogenetics were used in generating hypotheses of protein structure and CA VI gene evolution. A CA VI-PTX antiserum was produced, and the expression of CA VI protein was studied by immunohistochemistry. A knock-down zebrafish model was constructed, and larvae were observed up to five days post-fertilization (dpf). The expression of ca6 mRNA was quantitated by qRT-PCR in different developmental times in morphant and wild-type larvae and in different adult fish tissues. Finally, the swimming behavior of the morphant fish was compared to that of wild-type fish. The recombinant enzyme has a very high carbonate dehydratase activity. Sequencing confirms a 530-residue protein identical to one of the predicted proteins in the Ensembl database (ensembl.org). The protein is pentameric in solution, as studied by gel filtration and light scattering, presumably joined by the PTX domains. Mass spectrometry confirms the predicted signal peptide cleavage and disulfides, and N-glycosylation in two of the four observed glycosylation motifs. Molecular modeling of the pentamer is consistent with the modifications observed in mass spectrometry. Phylogenetics and sequence analyses provide a consistent hypothesis of the evolutionary history of domains associated with CA VI in mammals and non-mammals. Briefly, the evidence suggests that ancestral CA VI was a transmembrane protein, the exon coding for the cytoplasmic domain was replaced by one coding for PTX domain, and finally, in the therian lineage, the PTX-coding exon was lost. We knocked down CA VI expression in zebrafish embryos with antisense morpholino oligonucleotides, resulting in phenotype features of decreased buoyancy and swim bladder deflation in 4 dpf larvae. These findings provide novel insights into the evolution, structure, and function of this unique CA form.
Quantifying the mechanisms of domain gain in animal proteins.

PubMed

Buljan, Marija; Frankish, Adam; Bateman, Alex

2010-01-01

Protein domains are protein regions that are shared among different proteins and are frequently functionally and structurally independent from the rest of the protein. Novel domain combinations have a major role in evolutionary innovation. However, the relative contributions of the different molecular mechanisms that underlie domain gains in animals are still unknown. By using animal gene phylogenies we were able to identify a set of high confidence domain gain events and by looking at their coding DNA investigate the causative mechanisms. Here we show that the major mechanism for gains of new domains in metazoan proteins is likely to be gene fusion through joining of exons from adjacent genes, possibly mediated by non-allelic homologous recombination. Retroposition and insertion of exons into ancestral introns through intronic recombination are, in contrast to previous expectations, only minor contributors to domain gains and have accounted for less than 1% and 10% of high confidence domain gain events, respectively. Additionally, exonization of previously non-coding regions appears to be an important mechanism for addition of disordered segments to proteins. We observe that gene duplication has preceded domain gain in at least 80% of the gain events. The interplay of gene duplication and domain gain demonstrates an important mechanism for fast neofunctionalization of genes.
The first mitochondrial genome for the butterfly family Riodinidae (Abisara fylloides) and its systematic implications.

PubMed

Zhao, Fang; Huang, Dun-Yuan; Sun, Xiao-Yan; Shi, Qing-Hui; Hao, Jia-Sheng; Zhang, Lan-Lan; Yang, Qun

2013-10-01

The Riodinidae is one of the lepidopteran butterfly families. This study describes the complete mitochondrial genome of the butterfly species Abisara fylloides, the first mitochondrial genome of the Riodinidae family. The results show that the entire mitochondrial genome of A. fylloides is 15 301 bp in length, and contains 13 protein-coding genes, 2 ribosomal RNA genes, 22 transfer RNA genes and a 423 bp A+T-rich region. The gene content, orientation and order are identical to the majority of other lepidopteran insects. Phylogenetic reconstruction was conducted using the concatenated 13 protein-coding gene (PCG) sequences of 19 available butterfly species covering all the five butterfly families (Papilionidae, Nymphalidae, Pieridae, Lycaenidae and Riodinidae). Both maximum likelihood and Bayesian inference analyses highly supported the monophyly of Lycaenidae+Riodinidae, which stood as the sister of Nymphalidae. In addition, we propose that the riodinids be categorized into the family Lycaenidae as a subfamilial taxon.
Exploratory Investigation of Bacteroides fragilis Transcriptional Response during In vitro Exposure to Subinhibitory Concentration of Metronidazole.

PubMed

de Freitas, Michele C R; Resende, Juliana A; Ferreira-Machado, Alessandra B; Saji, Guadalupe D R Q; de Vasconcelos, Ana T R; da Silva, Vânia L; Nicolás, Marisa F; Diniz, Cláudio G

2016-01-01

Bacteroides fragilis, a member of the commensal gut microbiota, is an important pathogen associated with endogenous infections, and metronidazole remains a valuable antibiotic for the treatment of these infections, although bacterial resistance is widely reported. Considering the need for a better understanding of the global mechanisms by which B. fragilis survives metronidazole exposure, we performed an RNA-seq transcriptomic analysis with validation of gene expression results by qPCR. Bacterial strains were selected after in vitro subcultures with a subinhibitory concentration (SIC) of the drug. From wild-type B. fragilis ATCC 43859, four derivative strains were selected: first and fourth subcultures under metronidazole exposure and first and fourth subcultures after drug removal. According to global gene expression analysis, 2,146 protein coding genes were identified, of which a total of 1,618 (77%) were assigned to a Gene Ontology term (GO), indicating that most known cellular functions were represented. Among these 2,146 protein coding genes, 377 were shared among all strains, suggesting that they are critical for B. fragilis survival. In order to identify distinct expression patterns, we also performed a K-means clustering analysis set to 15 groups. This analysis allowed us to detect the major activated or repressed genes encoding enzymes that act in several metabolic pathways involved in metronidazole response, such as drug activation, defense mechanisms against superoxide ions, high expression level of multidrug efflux pumps, and DNA repair. The strains collected after metronidazole removal were functionally more similar to those cultured under drug pressure, reinforcing that drug exposure led to drastic, persistent changes in the B. fragilis gene expression patterns. These results may help to elucidate the B. fragilis response during metronidazole exposure, mainly at SIC, contributing information about bacterial survival strategies under stress conditions in their environment.
Synthetic versions of firefly luciferase and Renilla luciferase reporter genes that resist transgene silencing in sugarcane

PubMed Central

2014-01-01

Background Down-regulation or silencing of transgene expression can be a major hurdle to both molecular studies and biotechnology applications in many plant species. Sugarcane is particularly effective at silencing introduced transgenes, including reporter genes such as the firefly luciferase gene. Synthesizing transgene coding sequences optimized for usage in the host plant is one method of enhancing transgene expression and stability. Using specified design rules we have synthesised new coding sequences for both the firefly luciferase and Renilla luciferase reporter genes. We have tested these optimized versions for enhanced levels of luciferase activity and for increased steady state luciferase mRNA levels in sugarcane. Results The synthetic firefly luciferase (luc*) and Renilla luciferase (Renluc*) coding sequences have elevated G + C contents in line with sugarcane codon usage, but maintain 75% identity to the native firefly or Renilla luciferase nucleotide sequences and 100% identity to the protein coding sequences. Under the control of the maize pUbi promoter, the synthetic luc* and Renluc* genes yielded 60x and 15x higher luciferase activity respectively, over the native firefly and Renilla luciferase genes in transient assays on sugarcane suspension cell cultures. Using a novel transient assay in sugarcane suspension cells combining co-bombardment and qRT-PCR, we showed that synthetic luc* and Renluc* genes generate increased transcript levels compared to the native firefly and Renilla luciferase genes. In stable transgenic lines, the luc* transgene generated significantly higher levels of expression than the native firefly luciferase transgene. The fold difference in expression was highest in the youngest tissues. Conclusions We developed synthetic versions of both the firefly and Renilla luciferase reporter genes that resist transgene silencing in sugarcane. 
These transgenes will be particularly useful for evaluating the expression patterns conferred by existing and newly isolated promoters in sugarcane tissues. The strategies used to design the synthetic luciferase transgenes could be applied to other transgenes that are aggressively silenced in sugarcane. PMID:24708613
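The identity figures quoted in this abstract (75% at the nucleotide level, 100% at the protein level) can be illustrated with a toy recoding. The sketch below compares a short hypothetical native sequence with a GC-enriched synonymous version; the sequences, the partial codon table and the function names are assumptions made for illustration and are not the luc* or Renluc* designs.

```python
# Partial codon table covering only the codons used in this toy example.
CODON_TABLE = {
    "ATG": "M", "GAA": "E", "GAG": "E", "GAT": "D", "GAC": "D",
    "GCT": "A", "GCC": "A", "AAA": "K", "AAG": "K",
}

def translate(dna):
    """Translate a coding sequence codon by codon using CODON_TABLE."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

def percent_identity(a, b):
    """Percentage of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

def gc_content(dna):
    """Fraction of G and C bases in a sequence."""
    return sum(base in "GC" for base in dna) / len(dna)

native = "ATGGAAGATGCTAAA"      # hypothetical native coding sequence
synthetic = "ATGGAGGACGCCAAG"   # hypothetical GC-enriched synonymous recoding

print(translate(native), translate(synthetic))
print(percent_identity(native, synthetic))
print(gc_content(native), gc_content(synthetic))
```

In this toy case every substitution is at a third codon position, so the protein is unchanged while the nucleotide identity drops and the G + C fraction rises, which is the pattern the abstract describes.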
Synthetic versions of firefly luciferase and Renilla luciferase reporter genes that resist transgene silencing in sugarcane.

PubMed

Chou, Ting-Chun; Moyle, Richard L

2014-04-08

Down-regulation or silencing of transgene expression can be a major hurdle to both molecular studies and biotechnology applications in many plant species. Sugarcane is particularly effective at silencing introduced transgenes, including reporter genes such as the firefly luciferase gene. Synthesizing transgene coding sequences optimized for usage in the host plant is one method of enhancing transgene expression and stability. Using specified design rules we have synthesised new coding sequences for both the firefly luciferase and Renilla luciferase reporter genes. We have tested these optimized versions for enhanced levels of luciferase activity and for increased steady state luciferase mRNA levels in sugarcane. The synthetic firefly luciferase (luc*) and Renilla luciferase (Renluc*) coding sequences have elevated G + C contents in line with sugarcane codon usage, but maintain 75% identity to the native firefly or Renilla luciferase nucleotide sequences and 100% identity to the protein coding sequences. Under the control of the maize pUbi promoter, the synthetic luc* and Renluc* genes yielded 60x and 15x higher luciferase activity respectively, over the native firefly and Renilla luciferase genes in transient assays on sugarcane suspension cell cultures. Using a novel transient assay in sugarcane suspension cells combining co-bombardment and qRT-PCR, we showed that synthetic luc* and Renluc* genes generate increased transcript levels compared to the native firefly and Renilla luciferase genes. In stable transgenic lines, the luc* transgene generated significantly higher levels of expression than the native firefly luciferase transgene. The fold difference in expression was highest in the youngest tissues. We developed synthetic versions of both the firefly and Renilla luciferase reporter genes that resist transgene silencing in sugarcane. 
These transgenes will be particularly useful for evaluating the expression patterns conferred by existing and newly isolated promoters in sugarcane tissues. The strategies used to design the synthetic luciferase transgenes could be applied to other transgenes that are aggressively silenced in sugarcane.
Exogean: a framework for annotating protein-coding genes in eukaryotic genomic DNA

PubMed Central

Djebali, Sarah; Delaplace, Franck; Crollius, Hugues Roest

2006-01-01

Background Accurate and automatic gene identification in eukaryotic genomic DNA is more than ever of crucial importance to efficiently exploit the large volume of assembled genome sequences available to the community. Automatic methods have always been considered less reliable than human expertise. This is illustrated in the EGASP project, where reference annotations against which all automatic methods are measured are generated by human annotators and experimentally verified. We hypothesized that replicating the accuracy of human annotators in an automatic method could be achieved by formalizing the rules and decisions that they use, in a mathematical formalism. Results We have developed Exogean, a flexible framework based on directed acyclic colored multigraphs (DACMs) that can represent biological objects (for example, mRNA, ESTs, protein alignments, exons) and relationships between them. Graphs are analyzed to process the information according to rules that replicate those used by human annotators. Simple individual starting objects given as input to Exogean are thus combined and synthesized into complex objects such as protein coding transcripts. Conclusion We show here, in the context of the EGASP project, that Exogean is currently the method that best reproduces protein coding gene annotations from human experts, in terms of identifying at least one exact coding sequence per gene. We discuss current limitations of the method and several avenues for improvement. PMID:16925841
Alternative splicing and promoter use in TFII-I genes.

PubMed

Makeyev, Aleksandr V; Bayarsaihan, Dashzeveg

2009-03-15

TFII-I proteins are ubiquitously expressed transcription factors involved in both basal transcription and signal transduction activation or repression. TFII-I proteins are detected as early as the two-cell stage, exhibit distinct and dynamic expression patterns in developing embryos, and mark regional variation in the adult mouse brain. Analysis of atypical small and rare chromosomal deletions at 7q11.23 points to TFII-I genes (GTF2I and GTF2IRD1) as the prime candidates responsible for craniofacial and cognitive abnormalities in the Williams-Beuren syndrome. TFII-I genes are often subjected to alternative splicing, which generates isoforms that show different activities and play distinct biological roles. The coding regions of TFII-I genes are composed of more than 30 exons and are well conserved among vertebrates. However, their 5' untranslated regions are not as well conserved and are all poorly characterized. In the present work, we analyzed promoter regions of TFII-I genes and described their additional exons, as well as tested tissue specificity of both previously reported and novel alternatively spliced isoforms. Our comprehensive analysis leads to further elucidation of the functional heterogeneity of TFII-I proteins, provides hints for the search for regulatory pathways governing their expression, and opens up possibilities for examining the effect of different haplotypes on their promoter functions.
SOA genes encode proteins controlling lipase expression in response to triacylglycerol utilization in the yeast Yarrowia lipolytica.

PubMed

Desfougères, Thomas; Haddouche, Ramdane; Fudalej, Franck; Neuvéglise, Cécile; Nicaud, Jean-Marc

2010-02-01

The oleaginous yeast Yarrowia lipolytica efficiently metabolizes hydrophobic substrates such as alkanes, fatty acids or triacylglycerol. This yeast has been identified in oil-polluted water and in lipid-rich food. The enzymes involved in lipid breakdown, for use as a carbon source, are known, but the molecular mechanisms controlling the expression of the genes encoding these enzymes are still poorly understood. The study of mRNAs obtained from cells grown on oleic acid identified a new group of genes called SOA genes (specific for oleic acid). SOA1 and SOA2 are two small genes coding for proteins with no known homologs. Single- and double-disrupted strains were constructed. Wild-type and mutant strains were grown on dextrose, oleic acid and triacylglycerols. The double mutant presents a clear phenotype consisting of a growth defect on tributyrin and triolein, but not on dextrose or oleic acid media. Lipase activity was 50-fold lower in this mutant than in the wild-type strain. The impact of SOA deletion on the expression of the main extracellular lipase gene (LIP2) was monitored using a LIP2-beta-galactosidase promoter fusion protein. These data suggest that Soa proteins are components of a molecular mechanism controlling lipase gene expression in response to extracellular triacylglycerol.
Assessment of allelic diversity in intron-containing Mal d 1 genes and their association to apple allergenicity

PubMed Central

Gao, Zhongshan; Weg, Eric W van de; Matos, Catarina I; Arens, Paul; Bolhaar, Suzanne THP; Knulst, Andre C; Li, Yinghui; Hoffmann-Sommergruber, Karin; Gilissen, Luud JWJ

2008-01-01

Background Mal d 1 is a major apple allergen causing food allergic symptoms of the oral allergy syndrome (OAS) in birch-pollen sensitised patients. The Mal d 1 gene family is known to have at least 7 intron-containing and 11 intronless members that have been mapped in clusters on three linkage groups. In this study, the allelic diversity of the seven intron-containing Mal d 1 genes was assessed among a set of apple cultivars by sequencing or indirectly through pedigree genotyping. Protein variant constitutions were subsequently compared with Skin Prick Test (SPT) responses to study the association of deduced protein variants with allergenicity in a set of 14 cultivars. Results From the seven intron-containing Mal d 1 genes investigated, Mal d 1.01 and Mal d 1.02 were highly conserved, as nine out of ten cultivars coded for the same protein variant, while only one cultivar coded for a second variant. Mal d 1.04, Mal d 1.05 and Mal d 1.06 A, B and C were more variable, coding for three to six different protein variants. Comparison of Mal d 1 allelic composition between the high-allergenic cultivar Golden Delicious and the low-allergenic cultivars Santana and Priscilla, which are linked in pedigree, showed an association between the protein variants coded by the Mal d 1.04 and -1.06A genes (both located on linkage group 16) with allergenicity. This association was confirmed in 10 other cultivars. In addition, Mal d 1.06A allele dosage effects associated with the degree of allergenicity based on prick to prick testing. Conversely, no associations were observed for the protein variants coded by the Mal d 1.01 (on linkage group 13), -1.02, -1.06B, -1.06C genes (all on linkage group 16), nor by the Mal d 1.05 gene (on linkage group 6). Conclusion Protein variant compositions of Mal d 1.04 and -1.06A and, in case of Mal d 1.06A, allele doses are associated with the differences in allergenicity among fourteen apple cultivars. 
This information indicates the involvement of qualitative as well as quantitative factors in allergenicity and warrants further research in the relative importance of quantitative and qualitative aspects of Mal d 1 gene expression on allergenicity. Results from this study have implications for medical diagnostics, immunotherapy, clinical research and breeding schemes for new hypo-allergenic cultivars. PMID:19014530

How to calculate the non-synonymous to synonymous rate ratio of protein-coding genes under the Fisher-Wright mutation-selection framework.

PubMed

Dos Reis, Mario

2015-04-01

First principles of population genetics are used to obtain formulae relating the non-synonymous to synonymous substitution rate ratio to the selection coefficients acting at codon sites in protein-coding genes. Two theoretical cases are discussed and two examples from real data (a chloroplast gene and a virus polymerase) are given. The formulae give much insight into the dynamics of non-synonymous substitutions and may inform the development of methods to detect adaptive evolution. © 2015 The Author(s) Published by the Royal Society. All rights reserved.
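As a rough illustration of how a rate ratio can follow from selection coefficients, the sketch below uses the standard Fisher-Wright result that a mutation with scaled selection coefficient S (population size times selection coefficient, with a ploidy-dependent constant) fixes at a rate S / (1 - e^(-S)) relative to a neutral mutation, and then averages that quantity over non-synonymous mutations. The function names, the weighting scheme and the example values are assumptions for illustration; they are not the formulae derived in the paper.

```python
import math

def relative_fixation_rate(S):
    """Fixation rate of a mutation with scaled selection coefficient S,
    relative to a neutral mutation: S / (1 - exp(-S)), which tends to 1 as S -> 0."""
    if abs(S) < 1e-8:
        return 1.0 + S / 2.0          # first-order Taylor expansion near S = 0
    if S < -700.0:
        return 0.0                    # strongly deleterious: avoids overflow in expm1
    return S / -math.expm1(-S)        # -expm1(-S) equals 1 - exp(-S)

def omega(nonsyn_mutations):
    """Weighted mean relative fixation rate over non-synonymous mutations.

    nonsyn_mutations: sequence of (weight, S) pairs, where weight is a
    hypothetical relative rate at which that mutation arises.
    Synonymous mutations are assumed neutral, so their relative rate is 1
    and the weighted mean can be read as a dN/dS-like ratio.
    """
    pairs = list(nonsyn_mutations)
    total_weight = sum(w for w, _ in pairs)
    return sum(w * relative_fixation_rate(S) for w, S in pairs) / total_weight

# Hypothetical site: half of the non-synonymous mutations are mildly
# deleterious (S = -2), half are neutral (S = 0).
print(omega([(0.5, -2.0), (0.5, 0.0)]))
```

Under these assumptions, negative S values pull the ratio below 1 and positive values push it above 1, which is the qualitative behaviour that methods for detecting adaptive evolution rely on.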
Informational structure of genetic sequences and nature of gene splicing

NASA Astrophysics Data System (ADS)

Trifonov, E. N.

1991-10-01

Only about 1/20 of the DNA of higher organisms codes for proteins, by means of the classical triplet code. The rest of the DNA sequences are largely silent, with unclear functions, if any. The triplet code is not the only code (message) carried by the sequences. There are three levels of molecular communication, where the same sequence "talks" to various biomolecules, while having, respectively, three different appearances: DNA, RNA and protein. Since the molecular structures and, hence, sequence-specific preferences of these are substantially different, the original DNA sequence has to carry simultaneously three types of sequence patterns (codes, messages), thus being a composite structure in which one and the same letter (nucleotide) is frequently involved in several overlapping codes of different nature. This multiplicity and overlapping of the codes is a unique feature of Gnomic, the language of genetic sequences. The coexisting codes have to be degenerate in various degrees to allow an optimal and concerted performance of all the encoded functions. There is an obvious conflict between the best possible performance of a given function and the necessity to compromise the quality of a given sequence pattern in favor of other patterns. It appears that the major role of various changes in the sequences on their "ontogenetic" way from DNA to RNA to protein, like RNA editing and splicing, or protein post-translational modifications, is to resolve such conflicts. New data are presented strongly indicating that gene splicing is such a device to resolve the conflict between the code of DNA folding in chromatin and the triplet code for protein synthesis.
Complete mitochondrial genome sequence of the heart failure model of cardiomyopathic Syrian hamster (Mesocricetus auratus).

PubMed

Hu, Bo; Liu, Dong-Xing; Zhang, Yu-Qing; Song, Jian-Tao; Ji, Xian-Fei; Hou, Zhi-Qiang; Zhang, Zhen-Hai

2016-05-01

In this study we sequenced the complete mitochondrial genome of a heart failure model of cardiomyopathic Syrian hamster (Mesocricetus auratus) for the first time. The total length of the mitogenome was 16,267 bp. It harbored 13 protein-coding genes, 2 ribosomal RNA genes, 22 transfer RNA genes and 1 non-coding control region.
RsmV, a small non-coding regulatory RNA in Pseudomonas aeruginosa that sequesters RsmA and RsmF from target mRNAs.

PubMed

Janssen, Kayley H; Diaz, Manisha R; Gode, Cindy J; Wolfgang, Matthew C; Yahr, Timothy L

2018-06-04

The Gram-negative opportunistic pathogen Pseudomonas aeruginosa has distinct genetic programs that favor either acute or chronic virulence gene expression. Acute virulence is associated with twitching and swimming motility, expression of a type III secretion system (T3SS), and the absence of alginate, Psl, or Pel polysaccharide production. Traits associated with chronic infection include growth as a biofilm, reduced motility, and expression of a type VI secretion system (T6SS). The Rsm post-transcriptional regulatory system plays important roles in the inverse control of phenotypes associated with acute and chronic virulence. RsmA and RsmF are RNA-binding proteins that interact with target mRNAs to control gene expression at the post-transcriptional level. Previous work found that RsmA activity is controlled by at least three small, non-coding regulatory RNAs (RsmW, RsmY, and RsmZ). In this study, we took an in-silico approach to identify additional sRNAs that might function in the sequestration of RsmA and/or RsmF and identified RsmV, a 192 nt transcript with four predicted RsmA/RsmF consensus binding sites. RsmV is capable of sequestering RsmA and RsmF in vivo to activate translation of tssA1, a component of the T6SS, and to inhibit T3SS gene expression. Each of the predicted RsmA/RsmF consensus binding sites contributes to RsmV activity. Electrophoretic mobility shift assays show that RsmF binds RsmV with >10-fold higher affinity than RsmY and RsmZ. Gene expression studies revealed that the temporal expression pattern of RsmV differs from those of RsmW, RsmY, and RsmZ. These findings suggest that each sRNA may play a distinct role in controlling RsmA and RsmF activity. IMPORTANCE The CsrA/RsmA family of RNA-binding proteins plays important roles in post-transcriptional control of gene expression. The activity of CsrA/RsmA proteins is controlled by small non-coding RNAs that function as decoys to sequester CsrA/RsmA from target mRNAs. 
Pseudomonas aeruginosa has two CsrA family proteins (RsmA and RsmF) and at least four sequestering sRNAs (RsmV [identified in this study], RsmW, RsmY, RsmZ) that control RsmA/RsmF activity. RsmY and RsmZ are the primary sRNAs that sequester RsmA/RsmF, and RsmV and RsmW appear to play smaller roles. Differences in the temporal expression and absolute levels of the sRNAs and in their binding affinities for RsmA/RsmF may provide a mechanism of fine-tuning the output of the Rsm system in response to environmental cues. Copyright © 2018 American Society for Microbiology.
Novel coding, translation, and gene expression of a replicating covalently closed circular RNA of 220 nt

PubMed Central

AbouHaidar, Mounir Georges; Venkataraman, Srividhya; Golshani, Ashkan; Liu, Bolin; Ahmad, Tauqeer

2014-01-01

The highly structured (64% GC) covalently closed circular (CCC) RNA (220 nt) of the virusoid associated with rice yellow mottle virus codes for a 16-kDa highly basic protein using novel modalities for coding, translation, and gene expression. This CCC RNA is the smallest among all known viroids and virusoids and the only one that codes proteins. Its sequence possesses an internal ribosome entry site and is directly translated through two (or three) completely overlapping ORFs (shifting to a new reading frame at the end of each round). The initiation and termination codons overlap UGAUGA (underline highlights the initiation codon AUG within the combined initiation-termination sequence). Termination codons can be ignored to obtain larger read-through proteins. This circular RNA with no noncoding sequences is a unique natural supercompact “nanogenome.” PMID:25253891
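One way to see why a 220 nt circle can be read in more than one frame: 220 is not a multiple of 3, so a ribosome that keeps translating around the circle re-enters the sequence one nucleotide out of frame after each round, and returns to the original frame only after three rounds. The sketch below illustrates this arithmetic; the short placeholder sequence is hypothetical (it is not the virusoid sequence), and the helper names are assumptions.

```python
def frame_after_rounds(circle_length, rounds):
    """Reading-frame offset (0, 1 or 2) relative to the starting frame
    after translating around a circular template `rounds` times."""
    return (circle_length * rounds) % 3

def codons_around_circle(seq, start, n_codons):
    """Read n_codons consecutive codons from a circular sequence,
    wrapping from the last nucleotide back to the first."""
    length = len(seq)
    return [
        "".join(seq[(start + 3 * i + k) % length] for k in range(3))
        for i in range(n_codons)
    ]

# Frame offsets after 1, 2 and 3 rounds of a 220 nt circle.
print([frame_after_rounds(220, r) for r in (1, 2, 3)])

# Hypothetical 7 nt circle: the third codon wraps around the junction.
print(codons_around_circle("AUGGCUA", 0, 3))
```

Because the offset cycles through all three frames before repeating, a circle of this length can in principle be decoded as completely overlapping ORFs, consistent with the "two (or three)" overlapping ORFs reported in the abstract.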
PdSlt2 Penicillium digitatum mitogen-activated-protein kinase controls sporulation and virulence during citrus fruit infection.

PubMed

de Ramón-Carbonell, Marta; Sánchez-Torres, Paloma

2017-12-01

The Slt2 mitogen-activated protein (MAP) kinase homologue of Penicillium digitatum, the most relevant pathogen-producing citrus green mould decay during postharvest, was identified and explored. The P. digitatum Slt2-MAPK coding gene (PdSlt2) was functionally characterized by homologous gene elimination and transcriptomic evaluation. The absence of PdSlt2 gene resulted in significantly reduced virulence during citrus infection. The ΔPdSlt2 mutants were also defective in asexual reproduction, showing impairment of sporulation during citrus infection. Gene expression analysis revealed that PdSlt2 was highly induced during citrus fruit infection at early stages (1 dpi). Moreover, PdSlt2 deletion altered gene expression profiles. The relative gene expression (RGE) of fungicide resistance- and fungal virulence-related genes showed that PdSlt2 acts as negative regulator of several transporter encoding genes (ABC and MFS transporters) and a positive regulator of two sterol demethylases. This study indicates that PdSlt2 MAPK is functionally preserved in P. digitatum and highlights the relevant role of the PdSlt2 MAP kinase-mediated signalling pathway in regulating diverse genes crucial for infection and asexual reproduction. Copyright © 2017 British Mycological Society. Published by Elsevier Ltd. All rights reserved.
Synergism and Antagonism between Bacillus thuringiensis Vip3A and Cry1 Proteins in Heliothis virescens, Diatraea saccharalis and Spodoptera frugiperda

PubMed Central

Lemes, Ana Rita Nunes; Davolos, Camila Chiaradia; Legori, Paula Cristina Brunini Crialesi; Fernandes, Odair Aparecido; Ferré, Juan; Lemos, Manoel Victor Franco; Desiderio, Janete Apparecida

2014-01-01

Second generation Bt crops (insect resistant crops carrying Bacillus thuringiensis genes) combine more than one gene that codes for insecticidal proteins in the same plant to provide better control of agricultural pests. Some of the new combinations involve co-expression of cry and vip genes. Because Cry and Vip proteins have different midgut targets and possibly different mechanisms of toxicity, it is important to evaluate possible synergistic or antagonistic interactions between these two classes of toxins. Three members of the Cry1 class of proteins and three from the Vip3A class were tested against Heliothis virescens for possible interactions. At the level of LC50, Cry1Ac was the most active protein, whereas the rest of proteins tested were similarly active. However, at the level of LC90, Cry1Aa and Cry1Ca were the least active proteins, and Cry1Ac and Vip3A proteins were not significantly different. Under the experimental conditions used in this study, we found an antagonistic effect of Cry1Ca with the three Vip3A proteins. The interaction between Cry1Ca and Vip3Aa was also tested on two other species of Lepidoptera. Whereas antagonism was observed in Spodoptera frugiperda, synergism was found in Diatraea saccharalis. In all cases, the interaction between Vip3A and Cry1 proteins was more evident at the LC90 level than at the LC50 level. The fact that the same combination of proteins may result in a synergistic or an antagonistic interaction may be an indication that there are different types of interactions within the host, depending on the insect species tested. PMID:25275646
In Vitro Anti-Echinococcal and Metabolic Effects of Metformin Involve Activation of AMP-Activated Protein Kinase in Larval Stages of Echinococcus granulosus.

PubMed

Loos, Julia A; Cumino, Andrea C

2015-01-01

Metformin (Met) is a biguanide anti-hyperglycemic agent, which also exerts antiproliferative effects on cancer cells. This drug inhibits the complex I of the mitochondrial electron transport chain, inducing a fall in the cell energy charge and leading to 5'-AMP-activated protein kinase (AMPK) activation. AMPK is a highly conserved heterotrimeric complex that coordinates metabolic and growth pathways in order to maintain energy homeostasis and cell survival, mainly under nutritional stress conditions, in a Liver Kinase B1 (LKB1)-dependent manner. This work describes, for the first time, the in vitro anti-echinococcal effect of Met on Echinococcus granulosus larval stages, as well as the molecular characterization of AMPK (Eg-AMPK) in this parasite of clinical importance. The drug exerted a dose-dependent effect on the viability of both larval stages. Based on this, we proceeded with the identification of the genes encoding the different subunits of Eg-AMPK. We cloned one gene coding for the catalytic subunit (Eg-ampkɑ) and two genes coding for the regulatory subunits (Eg-ampkβ and Eg-ampkγ), all of them constitutively transcribed in E. granulosus protoscoleces and metacestodes. Their deduced amino acid sequences show all the conserved functional domains, including key amino acids involved in catalytic activity and protein-protein interactions. In protoscoleces, the drug induced the activation of AMPK (Eg-AMPKɑ-P176), possibly as a consequence of cellular energy charge depletion evidenced by assays with the fluorescent indicator JC-1. Met also led to carbohydrate starvation; it increased glucogenolysis and homolactic fermentation, and decreased transcription of intermediary metabolism genes. By in toto immunolocalization assays, we detected Eg-AMPKɑ-P176 expression both in the nucleus and the cytoplasm of cells, as well as in the larval tegument, the posterior bladder and the calcareous corpuscles of control and Met-treated protoscoleces. 
Interestingly, expression of Eg-AMPKɑ was observed in the developmental structures during the de-differentiation process from protoscoleces to microcysts. Therefore, the Eg-AMPK expression during the asexual development of E. granulosus, as well as the in vitro synergic therapeutic effects observed in presence of Met plus albendazole sulfoxide (ABZSO), suggest the importance of carrying out chemoprophylactic and clinical efficacy studies combining Met with conventional anti-echinococcal agents to test the potential use of this drug in hydatidosis therapy.
In Vitro Anti-Echinococcal and Metabolic Effects of Metformin Involve Activation of AMP-Activated Protein Kinase in Larval Stages of Echinococcus granulosus

PubMed Central

Loos, Julia A.; Cumino, Andrea C.

2015-01-01

Metformin (Met) is a biguanide anti-hyperglycemic agent, which also exerts antiproliferative effects on cancer cells. This drug inhibits the complex I of the mitochondrial electron transport chain, inducing a fall in the cell energy charge and leading to 5'-AMP-activated protein kinase (AMPK) activation. AMPK is a highly conserved heterotrimeric complex that coordinates metabolic and growth pathways in order to maintain energy homeostasis and cell survival, mainly under nutritional stress conditions, in a Liver Kinase B1 (LKB1)-dependent manner. This work describes, for the first time, the in vitro anti-echinococcal effect of Met on Echinococcus granulosus larval stages, as well as the molecular characterization of AMPK (Eg-AMPK) in this parasite of clinical importance. The drug exerted a dose-dependent effect on the viability of both larval stages. Based on this, we proceeded with the identification of the genes encoding the different subunits of Eg-AMPK. We cloned one gene coding for the catalytic subunit (Eg-ampkɑ) and two genes coding for the regulatory subunits (Eg-ampkβ and Eg-ampkγ), all of them constitutively transcribed in E. granulosus protoscoleces and metacestodes. Their deduced amino acid sequences show all the conserved functional domains, including key amino acids involved in catalytic activity and protein-protein interactions. In protoscoleces, the drug induced the activation of AMPK (Eg-AMPKɑ-P176), possibly as a consequence of cellular energy charge depletion evidenced by assays with the fluorescent indicator JC-1. Met also led to carbohydrate starvation; it increased glucogenolysis and homolactic fermentation, and decreased transcription of intermediary metabolism genes. By in toto immunolocalization assays, we detected Eg-AMPKɑ-P176 expression both in the nucleus and the cytoplasm of cells, as well as in the larval tegument, the posterior bladder and the calcareous corpuscles of control and Met-treated protoscoleces. 
Interestingly, expression of Eg-AMPKɑ was observed in the developmental structures during the de-differentiation process from protoscoleces to microcysts. Therefore, the Eg-AMPK expression during the asexual development of E. granulosus, as well as the in vitro synergic therapeutic effects observed in presence of Met plus albendazole sulfoxide (ABZSO), suggest the importance of carrying out chemoprophylactic and clinical efficacy studies combining Met with conventional anti-echinococcal agents to test the potential use of this drug in hydatidosis therapy. PMID:25965910
DOE Office of Scientific and Technical Information (OSTI.GOV)

Dai, Ziyu; Hooker, Brian S.; Anderson, Daniel B.

Optimization of Acidothermus cellulolyticus endoglucanase (E1) gene expression in transgenic potato (Solanum tuberosum L.) was examined in this study, where the E1 coding sequence was transcribed under control of a leaf specific promoter (tomato RbcS-3C) or the Mac promoter (a hybrid promoter of mannopine synthase promoter and cauliflower mosaic virus 35S promoter enhancer region). Average E1 activity in leaf extracts of potato transformants, in which E1 protein was targeted by a chloroplast signal peptide and an apoplast signal peptide were much higher than those by an E1 native signal peptide and a vacuole signal peptide. E1 protein accumulated up to 2.6% of total leaf soluble protein, where E1 gene was under control of the RbcS-3C promoter, alfalfa mosaic virus 5-untranslated leader, and RbcS-2A signal peptide. E1 protein production, based on average E1 activity and E1 protein accumulation in leaf extracts, is higher in potato than those measured previously in transgenic tobacco bearing the same transgene constructs. Comparisons of E1 activity, protein accumulation, and relative mRNA levels showed that E1 expression under control of tomato RbcS-3C promoter was specifically localized in leaf tissues, while E1 gene was expressed in both leaf and tuber tissues under control of Mac promoter. This suggests dual-crop applications in which potato vines serve as enzyme production 'bioreactors' while tubers are preserved for culinary applications.
Exon Shuffling and Origin of Scorpion Venom Biodiversity

PubMed Central

Wang, Xueli; Gao, Bin; Zhu, Shunyi

2016-01-01

Scorpion venom is a complex combinatorial library of peptides and proteins with multiple biological functions. A combination of transcriptomic and proteomic techniques has revealed its enormous molecular diversity, as identified by the presence of a large number of ion channel-targeted neurotoxins with different folds, membrane-active antimicrobial peptides, proteases, and protease inhibitors. Although the biodiversity of scorpion venom has long been known, how it arises remains unsolved. In this work, we analyzed the exon-intron structures of an array of scorpion venom protein-encoding genes and unexpectedly found that nearly all of these genes possess a phase-1 intron (one intron located between the first and second nucleotides of a codon) near the cleavage site of a signal sequence, even though their mature peptides differ remarkably. This observation matches a theory of exon shuffling in the origin of new genes and suggests that recruitment of different folds into scorpion venom might be achieved via shuffling between body protein-coding genes and ancestral venom gland-specific genes that presumably contributed tissue-specific regulatory elements and secretory signal sequences. PMID:28035955
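The phase-1 definition used in the abstract above (an intron sitting between the first and second nucleotides of a codon) can be illustrated with a short calculation. The sketch below is not taken from the paper; the function name and the 64-base example exon are hypothetical, and it assumes the only input is the number of coding bases that precede the intron.

```python
def intron_phase(cds_bases_before_intron: int) -> int:
    """Return the intron phase (0, 1 or 2).

    Phase 0: the intron lies between two codons.
    Phase 1: the intron lies after the first nucleotide of a codon.
    Phase 2: the intron lies after the second nucleotide of a codon.
    """
    if cds_bases_before_intron < 0:
        raise ValueError("number of coding bases cannot be negative")
    return cds_bases_before_intron % 3


# Hypothetical example: a first exon contributing 64 coding bases.
# 64 = 21 complete codons + 1 base, so the intron interrupts a codon
# right after its first nucleotide.
example_phase = intron_phase(64)
print(example_phase)
```

Exons flanked by introns of the same phase can be inserted or swapped without shifting the reading frame, which is why a shared intron phase is consistent with the exon-shuffling interpretation given in the abstract.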
Exon Shuffling and Origin of Scorpion Venom Biodiversity.

PubMed

Wang, Xueli; Gao, Bin; Zhu, Shunyi

2016-12-26

Scorpion venom is a complex combinatorial library of peptides and proteins with multiple biological functions. A combination of transcriptomic and proteomic techniques has revealed its enormous molecular diversity, as identified by the presence of a large number of ion channel-targeted neurotoxins with different folds, membrane-active antimicrobial peptides, proteases, and protease inhibitors. Although the biodiversity of scorpion venom has long been known, how it arises remains unsolved. In this work, we analyzed the exon-intron structures of an array of scorpion venom protein-encoding genes and unexpectedly found that nearly all of these genes possess a phase-1 intron (one intron located between the first and second nucleotides of a codon) near the cleavage site of a signal sequence, even though their mature peptides differ remarkably. This observation matches a theory of exon shuffling in the origin of new genes and suggests that recruitment of different folds into scorpion venom might be achieved via shuffling between body protein-coding genes and ancestral venom gland-specific genes that presumably contributed tissue-specific regulatory elements and secretory signal sequences.
Activation tagging in indica rice identifies ribosomal proteins as potential targets for manipulation of water-use efficiency and abiotic stress tolerance in plants.

PubMed

Moin, Mazahar; Bakshi, Achala; Saha, Anusree; Udaya Kumar, M; Reddy, Attipalli R; Rao, K V; Siddiq, E A; Kirti, P B

2016-11-01

We have generated 3900 enhancer-based activation-tagged plants, in addition to 1030 stable Dissociator-enhancer plants in a widely cultivated indica rice variety, BPT-5204. Of them, 3000 were screened for water-use efficiency (WUE) by analysing photosynthetic quantum efficiency and yield-related attributes under water-limiting conditions that identified 200 activation-tagged mutants, which were analysed for flanking sequences at the site of enhancer integration in the genome. We have further selected five plants with low Δ13C, high quantum efficiency and increased plant yield compared with wild type for a detailed investigation. Expression studies of 18 genes in these mutants revealed that in four plants one of the three to four tagged genes became activated, while two genes were concurrently up-regulated in the fifth plant. Two genes coding for proteins involved in 60S ribosomal assembly, RPL6 and RPL23A, were among those that became activated by enhancers. Quantitative expression analysis of these two genes also corroborated the results on activation tagging. The high up-regulation of RPL6 and RPL23A in various stress treatments and the presence of significant cis-regulatory elements in their promoter regions, along with the high up-regulation of several RPL genes in various stress treatments, indicate that they are potential targets for manipulating WUE/abiotic stress tolerance. © 2016 John Wiley & Sons Ltd.
SERPINA2 Is a Novel Gene with a Divergent Function from SERPINA1

PubMed Central

Martins, Manuella; Figueiredo, Joana; Silva, Diana Isabel; Castro, Patrícia; Morales-Hojas, Ramiro; Simões-Correia, Joana; Seixas, Susana

2013-01-01

Serine protease inhibitors (SERPINs) are a superfamily of highly conserved proteins that play a key role in controlling the activity of proteases in diverse biological processes. The SERPIN cluster located at the 14q32.1 region includes the gene coding for SERPINA1, and a highly homologous sequence, SERPINA2, which was originally thought to be a pseudogene. We have previously shown that SERPINA2 is expressed in different tissues, namely leukocytes and testes, suggesting that it is a functional SERPIN. To investigate the function of SERPINA2, we used HeLa cells stably transduced with the different variants of SERPINA2 and SERPINA1 (M1, S and Z) and leukocytes as the in vivo model. We identified SERPINA2 as a 52 kDa intracellular glycoprotein, which is localized at the endoplasmic reticulum (ER), independently of the variant analyzed. SERPINA2 is not significantly regulated by the proteasome, suggesting that ER localization is not due to misfolding. Specific features of SERPINA2 include the absence of insoluble aggregates and the insignificant response to cell stress, suggesting that it is a non-polymerogenic protein with activity divergent from that of SERPINA1. Using phylogenetic analysis, we propose an origin of SERPINA2 in the crown of primates, and we unveiled the overall conservation of SERPINA2 and A1. Nonetheless, a few SERPINA2 residues seem to have evolved faster, contributing to the emergence of a new advantageous function, possibly as a chymotrypsin-like SERPIN. Herein, we present evidence that SERPINA2 is an active gene, coding for an ER-resident protein, which may act as substrate or adjuvant of ER-chaperones. PMID:23826168
Structure and regulation of KGD1, the structural gene for yeast alpha-ketoglutarate dehydrogenase.

PubMed

Repetto, B; Tzagoloff, A

1989-06-01

Nuclear respiratory-defective mutants of Saccharomyces cerevisiae have been screened for lesions in the mitochondrial alpha-ketoglutarate dehydrogenase complex. Strains assigned to complementation group G70 were ascertained to be deficient in enzyme activity due to mutations in the KGD1 gene coding for the alpha-ketoglutarate dehydrogenase component of the complex. The KGD1 gene has been cloned by transformation of a representative kgd1 mutant, C225/U1, with a recombinant plasmid library of wild-type yeast nuclear DNA. Transformants containing the gene on a multicopy plasmid had three- to four-times-higher alpha-ketoglutarate dehydrogenase activity than did wild-type S. cerevisiae. Substitution of the chromosomal copy of KGD1 with a disrupted allele (kgd1::URA3) induced a deficiency in alpha-ketoglutarate dehydrogenase. The sequence of the cloned region of DNA which complements kgd1 mutants was found to have an open reading frame of 3,042 nucleotides capable of coding for a protein of Mw 114,470. The encoded protein had 38% identical residues with the reported sequence of alpha-ketoglutarate dehydrogenase from Escherichia coli. Two lines of evidence indicated that transcription of KGD1 is catabolite repressed. Higher steady-state levels of KGD1 mRNA were detected in wild-type yeast grown on the nonrepressible sugar galactose than in yeast grown on high glucose. Regulation of KGD1 was also studied by fusing different 5'-flanking regions of KGD1 to the lacZ gene of E. coli and measuring the expression of beta-galactosidase in yeast. Transformants harboring a fusion of 693 nucleotides of the 5'-flanking sequence expressed 10 times more beta-galactosidase activity when grown under derepressed conditions. The response to the carbon source was reduced dramatically when the same lacZ fusion was present in a hap2 or hap3 mutant. The promoter element(s) responsible for the regulated expression of KGD1 has been mapped to the -354 to -143 region. This region contained several putative activation sites with sequences matching the core element proposed to be essential for binding of the HAP2 and HAP3 regulatory proteins.
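The two figures reported for KGD1 in the record above (an open reading frame of 3,042 nucleotides and a protein of Mw 114,470) can be cross-checked with simple arithmetic. The sketch below is an illustration only: the function name is hypothetical, and it assumes the quoted ORF length includes the stop codon, which the abstract does not state.

```python
def orf_residue_count(orf_nt: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of orf_nt nucleotides.

    Assumption: if includes_stop is True, the last codon is a stop codon
    and does not contribute a residue.
    """
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


# Values quoted in the abstract.
ORF_NT = 3042
REPORTED_MW = 114470

residues = orf_residue_count(ORF_NT)        # 3042 / 3 = 1014 codons, minus the stop
mass_per_residue = REPORTED_MW / residues   # average mass per residue in daltons

print(residues, round(mass_per_residue, 1))
```

Under that assumption the ORF encodes 1,013 residues at roughly 113 Da each, which is in the usual range for an average amino acid residue, so the two reported numbers are mutually consistent.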
PAR-CLIP data indicate that Nrd1-Nab3-dependent transcription termination regulates expression of hundreds of protein coding genes in yeast

PubMed Central

2014-01-01

Background: Nrd1 and Nab3 are essential sequence-specific yeast RNA binding proteins that function as a heterodimer in the processing and degradation of diverse classes of RNAs. These proteins also regulate several mRNA coding genes; however, it remains unclear exactly what percentage of the mRNA component of the transcriptome these proteins control. To address this question, we used the pyCRAC software package developed in our laboratory to analyze CRAC and PAR-CLIP data for Nrd1-Nab3-RNA interactions. Results: We generated high-resolution maps of Nrd1-Nab3-RNA interactions, from which we have uncovered hundreds of new Nrd1-Nab3 mRNA targets, representing between 20 and 30% of protein-coding transcripts. Although Nrd1 and Nab3 showed a preference for binding near 5′ ends of relatively short transcripts, they bound transcripts throughout coding sequences and 3′ UTRs. Moreover, our data for Nrd1-Nab3 binding to 3′ UTRs were consistent with a role for these proteins in the termination of transcription. Our data also support a tight integration of Nrd1-Nab3 with the nutrient response pathway. Finally, we provide experimental evidence for some of our predictions, using northern blot and RT-PCR assays. Conclusions: Collectively, our data support the notion that Nrd1 and Nab3 function is tightly integrated with the nutrient response and indicate a role for these proteins in the regulation of many mRNA coding genes. Further, we provide evidence to support the hypothesis that Nrd1-Nab3 represents a failsafe termination mechanism in instances of readthrough transcription. PMID:24393166
Discovery of the First Germline-Restricted Gene by Subtractive Transcriptomic Analysis in the Zebra Finch, Taeniopygia guttata.

PubMed

Biederman, Michelle K; Nelson, Megan M; Asalone, Kathryn C; Pedersen, Alyssa L; Saldanha, Colin J; Bracht, John R

2018-05-21

Developmentally programmed genome rearrangements are rare in vertebrates, but have been reported in scattered lineages including the bandicoot, hagfish, lamprey, and zebra finch (Taeniopygia guttata) [1]. In the finch, a well-studied animal model for neuroendocrinology and vocal learning [2], one such programmed genome rearrangement involves a germline-restricted chromosome, or GRC, which is found in germlines of both sexes but eliminated from mature sperm [3, 4]. Transmitted only through the oocyte, it displays uniparental female-driven inheritance, and early in embryonic development is apparently eliminated from all somatic tissue in both sexes [3, 4]. The GRC comprises the longest finch chromosome at over 120 million base pairs [3], and previously the only known GRC-derived sequence was repetitive and non-coding [5]. Because the zebra finch genome project was sourced from male muscle (somatic) tissue [6], the remaining genomic sequence and protein-coding content of the GRC remain unknown. Here we report the first protein-coding gene from the GRC: a member of the α-soluble N-ethylmaleimide sensitive fusion protein (NSF) attachment protein (α-SNAP) family hitherto missing from zebra finch gene annotations. In addition to the GRC-encoded α-SNAP, we find an additional paralogous α-SNAP residing in the somatic genome (a somatolog), making the zebra finch the first example in which α-SNAP is not a single-copy gene. We show divergent, sex-biased expression for the paralogs and also that positive selection is detectable across the bird α-SNAP lineage, including the GRC-encoded α-SNAP. This study presents the identification and evolutionary characterization of the first protein-coding GRC gene in any organism. Copyright © 2018 Elsevier Ltd. All rights reserved.
RNAi screening of subtracted transcriptomes reveals tumor suppression by taurine-activated GABAA receptors involved in volume regulation

PubMed Central

van Nierop, Pim; Vormer, Tinke L.; Foijer, Floris; Verheij, Joanne; Lodder, Johannes C.; Andersen, Jesper B.; Mansvelder, Huibert D.; te Riele, Hein

2018-01-01

To identify coding and non-coding suppressor genes of anchorage-independent proliferation by efficient loss-of-function screening, we have developed a method for enzymatic production of low complexity shRNA libraries from subtracted transcriptomes. We produced and screened two LEGO (Low-complexity by Enrichment for Genes shut Off) shRNA libraries that were enriched for shRNA vectors targeting coding and non-coding polyadenylated transcripts that were reduced in transformed Mouse Embryonic Fibroblasts (MEFs). The LEGO shRNA libraries included ~25 shRNA vectors per transcript which limited off-target artifacts. Our method identified 79 coding and non-coding suppressor transcripts. We found that taurine-responsive GABAA receptor subunits, including GABRA5 and GABRB3, were induced during the arrest of non-transformed anchor-deprived MEFs and prevented anchorless proliferation. We show that taurine activates chloride currents through GABAA receptors on MEFs, causing seclusion of cell volume in large membrane protrusions. Volume seclusion from cells by taurine correlated with reduced proliferation and, conversely, suppression of this pathway allowed anchorage-independent proliferation. In human cholangiocarcinomas, we found that several proteins involved in taurine signaling via GABAA receptors were repressed. Low GABRA5 expression typified hyperproliferative tumors, and loss of taurine signaling correlated with reduced patient survival, suggesting this tumor suppressive mechanism operates in vivo. PMID:29787571
Deletion of the Sm1 encoding motif in the lsm gene results in distinct changes in the transcriptome and enhanced swarming activity of Haloferax cells.

PubMed

Maier, Lisa-Katharina; Benz, Juliane; Fischer, Susan; Alstetter, Martina; Jaschinski, Katharina; Hilker, Rolf; Becker, Anke; Allers, Thorsten; Soppa, Jörg; Marchfelder, Anita

2015-10-01

Members of the Sm protein family are important for the cellular RNA metabolism in all three domains of life. The family includes archaeal and eukaryotic Lsm proteins, eukaryotic Sm proteins and archaeal and bacterial Hfq proteins. While several studies concerning the bacterial and eukaryotic family members have been published, little is known about the archaeal Lsm proteins. Although structures for several archaeal Lsm proteins were solved more than ten years ago, we still do not know much about their biological function. However, one can confidently propose that the archaeal Lsm proteins will also be involved in RNA metabolism. Therefore, we investigated this protein in the halophilic archaeon Haloferax volcanii. The Haloferax genome encodes a single Lsm protein; the lsm gene overlaps and is co-transcribed with the gene for the ribosomal L37.eR protein. Here, we show that the reading frame of the lsm gene contains a promoter which regulates expression of the overlapping rpl37R gene. This rpl37R specific promoter ensures high expression of the rpl37R gene in exponential growth phase. To investigate the biological function of the Lsm protein we generated an lsm deletion mutant that had the coding sequence for the Sm1 motif removed but still contained the internal promoter for the downstream rpl37R gene. The transcriptome of this deletion mutant was compared to the wild type transcriptome, revealing that several genes are down-regulated and many genes are up-regulated in the deletion strain. Northern blot analyses confirmed down-regulation of two genes. In addition, the deletion strain showed a gain of function in swarming, in congruence with the up-regulation of transcripts encoding proteins required for motility. Copyright © 2015 The Authors. Published by Elsevier B.V. All rights reserved.
Isolation and sequencing of the gene encoding Sp23, a structural protein of spermatophore of the mealworm beetle, Tenebrio molitor.

PubMed

Feng, X; Happ, G M

1996-11-14

The cDNA for Sp23, a structural protein of the spermatophore of Tenebrio molitor, had been previously cloned and characterized (Paesen, G.C., Schwartz, M.B., Peferoen, M., Weyda, F. and Happ, G.M. (1992a) Amino acid sequence of Sp23, a structure protein of the spermatophore of the mealworm beetle, Tenebrio molitor. J. Biol. Chem. 257, 18852-18857). Using the labeled cDNA for Sp23 as a probe to screen a library of genomic DNA from Tenebrio molitor, we isolated a genomic clone for Sp23. A 5373-base pair (bp) restriction fragment containing the Sp23 gene was sequenced. The coding region is separated by a 55-bp intron which is located close to the translation start site. Three putative ecdysone response elements (EcRE) are identified in the 5' flanking region of the Sp23 gene. Comparison of the flanking regions of the Sp23 gene with those of the D-protein gene expressed in the accessory glands of Tenebrio reveals similar sequences present in the flanking regions of the two genes. The genomic organization of the coding region of the Sp23 gene shares similarities with that of the D-protein gene, three Drosophila accessory gland genes and two Drosophila 20-OH ecdysone-responsive genes.

The Escherichia coli supX locus is topA, the structural gene for DNA topoisomerase I.

PubMed Central

Margolin, P; Zumstein, L; Sternglanz, R; Wang, J C

1985-01-01

Mutations in the supX locus, which result in the absence of DNA topoisomerase I enzyme activity in both Salmonella typhimurium and Escherichia coli, are all selected as suppressors of the leu-500 promoter mutation in S. typhimurium. To determine whether the supX locus is the structural gene topA for the DNA topoisomerase I enzyme or is a positive-acting regulator/activator gene for a nearby topA structural gene, nonsense mutations were selected in the E. coli supX gene carried on an F' episome in S. typhimurium cells. The cysB-topA region of the episomes with nonsense-mutant supX alleles was then cloned onto plasmid pBR322 and transformed into E. coli cells lacking a chromosomal supX gene. Three such E. coli strains, each carrying cloned DNA from episomes with different nonsense-mutant supX alleles, all lacked DNA topoisomerase I activity but expressed antigenic determinants specific to the enzyme; control cells lacked both enzyme activity and antigenic determinants. Maxicell studies of plasmid-coded proteins demonstrated the absence of the DNA topoisomerase I protein (100 kDa) in the three strains but the appearance of a new smaller peptide in each (36, 47, and 64 kDa). These new peptides must represent fragments of the enzyme resulting from translation termination at the supX nonsense codons and confirm the interpretation that the supX gene is topA, the structural gene for DNA topoisomerase I. PMID:2991925
Xuhuai goat H-FABP gene clone, subcellular localization of expression products and the preparation of transgenic mice.

PubMed

Yin, Yan-hui; Li, Bi-chun; Wei, Guang-hui; Zhu, Cai-ye; Li, Wei; Zhang, Ya-ni; Du, Li-xin; Cao, Wen-guang

2012-05-01

The aim of this study was to clone the heart-type fatty acid binding protein (H-FABP) gene of Xuhuai goat, to explore it bioinformatically, and to analyze its subcellular localization using enhanced green fluorescent protein (EGFP). The results showed that the coding sequence (CDS) length of the Xuhuai goat H-FABP gene was 402 bp, encoding 133 amino acids (GenBank accession number AY466498.1). The H-FABP cDNA coding sequence was compared with the corresponding region of human, chicken, brown rat, cow, wild boar, donkey, and zebrafish. The similarities were 89%, 76%, 85%, 84%, 93%, 91%, 70%, respectively. For the corresponding amino acid sequences, the similarities were 90%, 79%, 88%, 97%, 95%, 94%, 72%, respectively. This study did not find a signal peptide region in the H-FABP protein, indicating that H-FABP might be a nonsecreted protein. H-FABP expression was detected in vitro by reverse transcription-polymerase chain reaction (RT-PCR), and the EGFP-H-FABP fusion protein was localized to the cytoplasm. The gene could also be transiently and permanently expressed in mice.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Tashkandy, Nisreen; Sabban, Sari; Fakieh, Mohammad

Flavobacterium suncheonense is a member of the family Flavobacteriaceae in the phylum Bacteroidetes. Strain GH29-5T (DSM 17707T) was isolated from greenhouse soil in Suncheon, South Korea. F. suncheonense GH29-5T is part of the Genomic Encyclopedia of Bacteria and Archaea project. The 2,880,663 bp long draft genome consists of 54 scaffolds with 2739 protein-coding genes and 82 RNA genes. The genome of strain GH29-5T has 117 genes encoding peptidases but a small number of genes encoding carbohydrate active enzymes (51 CAZymes). Metallo and serine peptidases were found most frequently. Among CAZymes, eight glycoside hydrolase families, nine glycosyl transferase families, two carbohydrate binding module families and four carbohydrate esterase families were identified. Surprisingly, polysaccharides utilization loci (PULs) were not found in strain GH29-5T. Based on the coherent physiological and genomic characteristics we suggest that F. suncheonense GH29-5T feeds rather on proteins than saccharides and lipids.
Mitochondrial genomes of the jungle crow Corvus macrorhynchos (Passeriformes: Corvidae) from shed feathers and a phylogenetic analysis of genus Corvus using mitochondrial protein-coding genes.

PubMed

Krzeminska, Urszula; Wilson, Robyn; Rahman, Sadequr; Song, Beng Kah; Seneviratne, Sampath; Gan, Han Ming; Austin, Christopher M

2016-07-01

The complete mitochondrial genomes of two jungle crows (Corvus macrorhynchos) were sequenced. DNA was extracted from tissue samples obtained from shed feathers collected in the field in Sri Lanka and sequenced using the Illumina MiSeq Personal Sequencer. Jungle crow mitogenomes have a structural organization typical of the genus Corvus and are 16,927 bp and 17,066 bp in length, both comprising 13 protein-coding genes, 22 transfer RNA genes, 2 ribosomal subunit genes, and a non-coding control region. In addition, we complement already available house crow (Corvus splendens) mitogenome resources by sequencing an individual from Singapore. A phylogenetic tree constructed from Corvidae family mitogenome sequences available on GenBank is presented. We confirm the monophyly of the genus Corvus and propose to use complete mitogenome resources for further intra- and interspecies genetic studies.
The expanding regulatory universe of p53 in gastrointestinal cancer.

PubMed

Fesler, Andrew; Zhang, Ning; Ju, Jingfang

2016-01-01

Tumor suppressor gene TP53 is one of the most frequently deleted or mutated genes in gastrointestinal cancers. As a transcription factor, p53 regulates a number of important protein coding genes to control cell cycle, cell death, DNA damage/repair, stemness, differentiation and other key cellular functions. In addition, p53 is also able to activate the expression of a number of small non-coding microRNAs (miRNAs) through direct binding to the promoter region of these miRNAs. Many miRNAs have been identified to be potential tumor suppressors by regulating key effector target mRNAs. Our understanding of the regulatory network of p53 has recently expanded to include long non-coding RNAs (lncRNAs). Like miRNAs, lncRNAs have been found to play important roles in cancer biology. With our increased understanding of the important functions of these non-coding RNAs and their relationship with p53, we are gaining exciting new insights into the biology and function of cells in response to various growth environment changes. In this review we summarize the current understanding of the ever expanding involvement of non-coding RNAs in the p53 regulatory network and its implications for our understanding of gastrointestinal cancer.
Relaxed Evolution in the Tyrosine Aminotransferase Gene Tat in Old World Fruit Bats (Chiroptera: Pteropodidae)

PubMed Central

Shen, Bin; Fang, Tao; Yang, Tianxiao; Jones, Gareth; Irwin, David M.; Zhang, Shuyi

2014-01-01

Frugivorous and nectarivorous bats fuel their metabolism mostly by using carbohydrates and allocate the restricted amounts of ingested proteins mainly for anabolic protein syntheses rather than for catabolic energy production. Thus, it is possible that genes involved in protein (amino acid) catabolism may have undergone relaxed evolution in these fruit- and nectar-eating bats. The tyrosine aminotransferase (TAT, encoded by the Tat gene) is the rate-limiting enzyme in the tyrosine catabolic pathway. To test whether the Tat gene has undergone relaxed evolution in the fruit- and nectar-eating bats, we obtained the Tat coding region from 20 bat species including four Old World fruit bats (Pteropodidae) and two New World fruit bats (Phyllostomidae). Phylogenetic reconstructions revealed a gene tree in which all echolocating bats (including the New World fruit bats) formed a monophyletic group. The phylogenetic conflict appears to stem from accelerated TAT protein sequence evolution in the Old World fruit bats. Our molecular evolutionary analyses confirmed a change in the selection pressure acting on Tat, which was likely caused by a relaxation of the evolutionary constraints on the Tat gene in the Old World fruit bats. Hepatic TAT activity assays showed that TAT activities in species of the Old World fruit bats are significantly lower than those of insectivorous bats and omnivorous mice, which was not caused by a change in TAT protein levels in the liver. Our study provides unambiguous evidence that the Tat gene has undergone relaxed evolution in the Old World fruit bats in response to changes in their metabolism due to the evolution of their special diet. PMID:24824435
[Preparation and activity validation of PP7 bacteriophage-like particles displaying PAP114-128 peptide].

PubMed

Sun, Yanli; Sun, Yanhua

2016-10-01

Objective: To obtain the PP7 bacteriophage-like particles carrying the peptide of prostatic acid phosphatase PAP114-128, and prove that they retain the original biological activity. Methods: First, the plasmid pETDuet-2PP7 was constructed as follows: the gene of PP7 coat protein dimer was amplified by gene mutation combined with overlapping PCR technology, and inserted into the vector pETDuet-1. Following that, the plasmid pETDuet-2PP7-PAP114-128 was constructed as follows: the PP7 coat protein gene carrying the coding gene of PAP114-128 peptide was amplified using PCR, and then inserted into the vector pETDuet-2PP7. Both pETDuet-2PP7 and pETDuet-2PP7-PAP114-128 were transformed into E. coli and expressed. The expression product was verified by SDS-PAGE, double immunodiffusion assay and ELISA. Results: The gene fragment of PP7 coat protein dimer was obtained by overlapping PCR using Ex Taq DNA polymerase, and the antigenicity of its expression product was the same as that of the coat protein of wild-type PP7 bacteriophage. Moreover, the PAP114-128 peptide epitope that was displayed on the surface of PP7 bacteriophage was identical with the corresponding epitope of natural human PAP, and it was able to induce high levels of antibodies. Conclusion: The gene of PP7 coat protein dimer with repeated sequences can be prepared by gene mutation combined with overlapping PCR. Based on this, PP7 bacteriophage-like particles carrying PAP peptide can be prepared, which not only solves the problem of the instability of the peptides, but also lays a foundation for the study on their delivery and function.
Complete mitochondrial genome of Cynopterus sphinx (Pteropodidae: Cynopterus).

PubMed

Li, Linmiao; Li, Min; Wu, Zhengjun; Chen, Jinping

2015-01-01

We have characterized the complete mitochondrial genome of Cynopterus sphinx (Pteropodidae: Cynopterus) and described its organization in this study. The total length of the C. sphinx complete mitochondrial genome was 16,895 bp with a base composition of 32.54% A, 14.05% G, 25.82% T and 27.59% C. The complete mitochondrial genome included 13 protein-coding genes, 22 tRNA genes, 2 rRNA genes (12S rRNA and 16S rRNA) and 1 control region (D-loop). The control region was 1435 bp long, with the sequence CATACG repeated 64 times. Three protein-coding genes (ND1, COI and ND4) ended with an incomplete stop codon, TA or T.
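The base composition figures in the record above follow from a straightforward count over the sequence. The sketch below is a minimal illustration, not the authors' pipeline: the function names are hypothetical, the percentages are the ones quoted in the abstract, and the toy input is simply the CATACG repeat unit written out 64 times rather than the real control region.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Percentage of A, C, G and T in a DNA sequence (other symbols ignored)."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def gc_content(composition: dict) -> float:
    """G + C percentage from a composition dictionary."""
    return composition["G"] + composition["C"]


# Percentages quoted in the abstract for the whole mitogenome.
reported = {"A": 32.54, "G": 14.05, "T": 25.82, "C": 27.59}
reported_total = sum(reported.values())   # should be 100 within rounding
reported_gc = gc_content(reported)        # G + C share of the genome

# Toy input: the CATACG unit repeated 64 times (384 bases).
repeat_array = "CATACG" * 64
repeat_composition = base_composition(repeat_array)

print(round(reported_total, 2), round(reported_gc, 2), len(repeat_array))
```

The reported percentages sum to 100%, giving a G + C content of 41.64%, and the 64 repeat units account for 384 bp of the 1435 bp control region.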
Identification of the UDP-glucose-4-epimerase required for galactofuranose biosynthesis and galactose metabolism in A. niger.

PubMed

Park, Joohae; Tefsen, Boris; Arentshorst, Mark; Lagendijk, Ellen; van den Hondel, Cees Amjj; van Die, Irma; Ram, Arthur Fj

2014-01-01

Galactofuranose (Galf)-containing glycoconjugates are important to secure the integrity of the cell wall of filamentous fungi. Mutations that prevent the biosynthesis of Galf-containing molecules compromise cell wall integrity. In response to cell wall weakening, the cell wall integrity (CWI)-pathway is activated to reinforce the strength of the cell wall. Activation of the CWI-pathway in Aspergillus niger is characterized by the specific induction of the agsA gene, which encodes a cell wall α-glucan synthase. In this study, we screened a collection of cell wall mutants with an induced expression of agsA for defects in Galf biosynthesis using an anti-Galf antibody (L10). From this collection of mutants, we previously identified mutants in the UDP-galactopyranose mutase encoding gene (ugmA). Here, we have identified six additional UDP-galactopyranose mutase (ugmA) mutants and one mutant (named mutant #41) in an additional complementation group that displayed strongly reduced Galf levels in the cell wall. By using a whole genome sequencing approach, 21 SNPs in coding regions were identified between mutant #41 and its parental strain which changed the amino acid sequence of the encoded proteins. One of these mutations was in gene An14g03820, which codes for a putative UDP-glucose-4-epimerase (UgeA). The A to G mutation in this gene causes an amino acid change of Asn to Asp at position 191 in the UgeA protein. Targeted deletion of ugeA resulted in an even more severe reduction of Galf in N-linked glucans, indicating that the UgeA protein in mutant #41 is partially active. The ugeA gene is also required for growth on galactose despite the presence of two UgeA homologs in the A. niger genome. By using a classical mutant screen and whole genome sequencing of a new Galf-deficient mutant, the UDP-glucose-4-epimerase gene (ugeA) has been identified. UgeA is required for the biosynthesis of Galf as well as for galactose metabolism in Aspergillus niger.
The mitochondrial genome of the multicolored Asian lady beetle Harmonia axyridis (Pallas) and a phylogenetic analysis of the Polyphaga (Insecta: Coleoptera).

PubMed

Niu, Fang-Fang; Zhu, Liang; Wang, Su; Wei, Shu-Jun

2016-07-01

Here, we report the mitochondrial genome sequence of the multicolored Asian lady beetle Harmonia axyridis (Pallas, 1773) (Coleoptera: Coccinellidae) (GenBank accession No. KR108208). This is the first species with a sequenced mitochondrial genome from the genus Harmonia. The current length of this mitochondrial genome, with a partial A + T-rich region, is 16,387 bp. All the typical genes were sequenced except trnI and trnQ. As in most other sequenced mitochondrial genomes of Coleoptera, there is no rearrangement in the sequenced region compared with the putative ancestral arrangement of insects. All protein-coding genes start with ATN codons. Five, five and three protein-coding genes stop with termination codon TAA, TA and T, respectively. Phylogenetic analysis using a Bayesian method based on the first and second codon positions of the protein-coding genes supported that the Scirtidae is a basal lineage of Polyphaga. Harmonia and Coccinella form a sister lineage. The monophyly of Staphyliniformia, Scarabaeiformia and Cucujiformia was supported. The Buprestidae was found to be a sister group to the Bostrichiformia.
Not so bad after all: retroviruses and long terminal repeat retrotransposons as a source of new genes in vertebrates.

PubMed

Naville, M; Warren, I A; Haftek-Terreau, Z; Chalopin, D; Brunet, F; Levin, P; Galiana, D; Volff, J-N

2016-04-01

Viruses and transposable elements, once considered as purely junk and selfish sequences, have repeatedly been used as a source of novel protein-coding genes during the evolution of most eukaryotic lineages, a phenomenon called 'molecular domestication'. This is exemplified perfectly in mammals and other vertebrates, where many genes derived from long terminal repeat (LTR) retroelements (retroviruses and LTR retrotransposons) have been identified through comparative genomics and functional analyses. In particular, genes derived from gag structural protein and envelope (env) genes, as well as from the integrase-coding and protease-coding sequences, have been identified in humans and other vertebrates. Retroelement-derived genes are involved in many important biological processes including placenta formation, cognitive functions in the brain and immunity against retroelements, as well as in cell proliferation, apoptosis and cancer. These observations support an important role of retroelement-derived genes in the evolution and diversification of the vertebrate lineage. Copyright © 2016 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved.
Specialized activities and expression differences for Clostridium thermocellum biofilm and planktonic cells

DOE PAGES

Dumitrache, Alexandru; Klingeman, Dawn M.; Natzke, Jace; ...

2017-02-27

Clostridium thermocellum forms biofilms adherent to lignocellulosic feedstock in a typical continuous cell-monolayer to efficiently break down and uptake cellulose hydrolysates. The sessile cells of biofilms may revert to non-adherent planktonic cells through generation of offspring cells or microenvironment constraints such as limited surface area. These interdependent cell populations co-exist and have different contributions to culture activity and growth. Here, we developed a novel bioreactor design to rapidly harvest sessile and planktonic cell populations for omics studies. In RNA-seq analyses, within 3299 protein coding genes, 59% (or 1958 genes) were differentially expressed with a minimum two-fold change between the two cell populations isolated simultaneously at high culture activity. Furthermore, sessile cells had definitive greater expression of genes involved in catabolism of carbohydrates by glycolysis and pyruvate fermentation, ATP generation by proton gradient, the anabolism of proteins and lipids and cellular functions critical for cell division; planktonic cells had notably higher gene expression for flagellar motility and chemotaxis, cellulosomal cellulases and anchoring scaffoldins, and a range of stress induced homeostasis mechanisms such as oxidative stress protection by antioxidants and flavoprotein co-factors, methionine repair, Fe-S cluster assembly and repair in redox proteins, cell growth control through tRNA thiolation, recovery of damaged DNA by nucleotide excision repair and removal of terminal proteins by proteases. Our knowledge of these cellular adaptations will aid the engineering of industrially relevant strains for consolidated bioprocessing of solid lignocellulosic biomass.
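As a quick arithmetic check of the figures quoted in this abstract, the following Python sketch recomputes the share of differentially expressed genes and shows one common way to apply a two-fold threshold. The function names and the toy expression values are illustrative assumptions and are not taken from the study.

```python
import math


def percent_of_total(n_changed: int, n_total: int) -> float:
    """Return n_changed as a percentage of n_total."""
    return 100.0 * n_changed / n_total


def meets_fold_change(expr_a: float, expr_b: float, min_fold: float = 2.0) -> bool:
    """True when two positive expression values differ by at least min_fold
    in either direction, i.e. |log2(a / b)| >= log2(min_fold)."""
    return abs(math.log2(expr_a / expr_b)) >= math.log2(min_fold)


# Figures quoted in the abstract: 1958 of 3299 protein-coding genes.
print(round(percent_of_total(1958, 3299), 1))  # 59.4, reported as 59%

# Hypothetical expression values for one gene in sessile vs planktonic cells.
print(meets_fold_change(10.0, 40.0))  # four-fold difference passes
print(meets_fold_change(10.0, 19.0))  # less than two-fold does not
```

The threshold is applied symmetrically on a log2 scale so that a gene higher in either population is treated the same way.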
Specialized activities and expression differences for Clostridium thermocellum biofilm and planktonic cells

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dumitrache, Alexandru; Klingeman, Dawn M.; Natzke, Jace

Clostridium thermocellum forms biofilms adherent to lignocellulosic feedstock in a typical continuous cell-monolayer to efficiently break down and uptake cellulose hydrolysates. The sessile cells of biofilms may revert to non-adherent planktonic cells through generation of offspring cells or microenvironment constraints such as limited surface area. These interdependent cell populations co-exist and have different contributions to culture activity and growth. Here, we developed a novel bioreactor design to rapidly harvest sessile and planktonic cell populations for omics studies. In RNA-seq analyses, within 3299 protein coding genes, 59% (or 1958 genes) were differentially expressed with a minimum two-fold change between the two cell populations isolated simultaneously at high culture activity. Furthermore, sessile cells had definitive greater expression of genes involved in catabolism of carbohydrates by glycolysis and pyruvate fermentation, ATP generation by proton gradient, the anabolism of proteins and lipids and cellular functions critical for cell division; planktonic cells had notably higher gene expression for flagellar motility and chemotaxis, cellulosomal cellulases and anchoring scaffoldins, and a range of stress induced homeostasis mechanisms such as oxidative stress protection by antioxidants and flavoprotein co-factors, methionine repair, Fe-S cluster assembly and repair in redox proteins, cell growth control through tRNA thiolation, recovery of damaged DNA by nucleotide excision repair and removal of terminal proteins by proteases. Our knowledge of these cellular adaptations will aid the engineering of industrially relevant strains for consolidated bioprocessing of solid lignocellulosic biomass.
Complete mitochondrial genome of Cuora trifasciata (Chinese three-striped box turtle), and a comparative analysis with other box turtles.

PubMed

Li, Wei; Zhang, Xin-Cheng; Zhao, Jian; Shi, Yan; Zhu, Xin-Ping

2015-01-25

Cuora trifasciata has become one of the most critically endangered species in the world. The complete mitochondrial genome of C. trifasciata (Chinese three-striped box turtle) was determined in this study. Its mitochondrial genome is a 16,575-bp-long circular molecule that consists of 37 genes that are typically found in other vertebrates. The basic characteristics of the C. trifasciata mitochondrial genome were also determined. Moreover, a comparison of C. trifasciata with Cuora cyclornata, Cuora pani and Cuora aurocapitata indicated that the four mitogenomes differed in length, codons, overlaps, 13 protein-coding genes (PCGs), ND3, rRNA genes, control region, and other aspects. Phylogenetic analysis with Bayesian inference and maximum likelihood based on 12 protein-coding genes of the genus Cuora indicated the phylogenetic position of C. trifasciata within Cuora. The phylogenetic analysis also showed that C. trifasciata from Vietnam and China formed separate monophyletic clades with different Cuora species. The results of nucleotide base compositions, protein-coding genes and phylogenetic analysis showed that C. trifasciata from these two countries may represent different Cuora species. Copyright © 2014 Elsevier B.V. All rights reserved.
Current and future implications of basic and translational research on amyloid-β peptide production and removal pathways

PubMed Central

Bohm, C.; Chen, F.; Sevalle, J.; Qamar, S.; Dodd, R.; Li, Y.; Schmitt-Ulms, G.; Fraser, P.E.; St George-Hyslop, P.H.

2015-01-01

Inherited variants in multiple different genes are associated with increased risk for Alzheimer's disease (AD). In many of these genes, the inherited variants alter some aspect of the production or clearance of the neurotoxic amyloid β-peptide (Aβ). Thus missense, splice site or duplication mutants in the presenilin 1 (PS1), presenilin 2 (PS2) or the amyloid precursor protein (APP) genes, which alter the levels or shift the balance of Aβ produced, are associated with rare, highly penetrant autosomal dominant forms of Familial Alzheimer's Disease (FAD). Similarly, the more prevalent late-onset forms of AD are associated with both coding and non-coding variants in genes such as SORL1, PICALM and ABCA7 that affect the production and clearance of Aβ. This review summarises some of the recent molecular and structural work on the role of these genes and the proteins coded by them in the biology of Aβ. We also briefly outline how the emerging knowledge about the pathways involved in Aβ generation and clearance can be potentially targeted therapeutically. This article is part of Special Issue entitled "Neuronal Protein". PMID:25748120
Impact of the excision of an ancient repeat insertion on Rickettsia conorii guanylate kinase activity.

PubMed

Abergel, Chantal; Blanc, Guillaume; Monchois, Vincent; Renesto, Patricia; Sigoillot, Cécile; Ogata, Hiroyuki; Raoult, Didier; Claverie, Jean-Michel

2006-11-01

The genomic sequencing of Rickettsia conorii revealed a new family of Rickettsia-specific palindromic elements (RPEs) capable of in-frame insertion in preexisting open reading frames (ORFs). Many of these altered ORFs correspond to proteins with well-characterized or essential functions in other microorganisms. Previous experiments indicated that RPE-containing genes are normally transcribed and that no excision of the repeat occurs at the mRNA level. Using mass spectrometry, we now confirmed the retention of the RPE-derived amino acid residues in 4 proteins successfully expressed in Escherichia coli, raising the general question of the consequences of this common insertion event on the fitness of Rickettsia enzymes. The predicted guanylate kinase activity of the R. conorii gmk gene product was measured both on the RPE-containing and RPE-excised recombinant proteins. We show that the 2 proteins are active but exhibit substantial differences in their affinity for adenosine triphosphate, guanosine monophosphate, and catalytic constants. The distribution of the RPEgmk insert among Rickettsia species indicates that the insertion event is ancient and occurred after the divergence of Rickettsia felis and R. conorii but before that of Rickettsia helvetica and R. conorii. We found no evidence that the gmk gene fixed adaptive changes to compensate the RPE peptide insertion. Furthermore, the analysis of the rates of divergence in 23 RPE-containing genes indicates that coding RPE repeats tend to evolve under weak selective constraint, at a rate similar to intergenic noncoding RPE sequences. Altogether, these results suggest that the insertion of RPE-encoded "selfish peptides," although respecting the original fold and activity of the host proteins, might be slightly detrimental to the enzyme efficiency within limits tolerable for slow-growing intracellular parasites such as Rickettsia.
Base composition and expression level of human genes.

PubMed

Arhondakis, Stilianos; Auletta, Fabio; Torelli, Giuseppe; D'Onofrio, Giuseppe

2004-01-21

It is well known that the gene distribution is non-uniform in the human genome, reaching the highest concentration in the GC-rich isochores. Also the amino acid frequencies, and the hydrophobicity, of the corresponding encoded proteins are affected by the high GC level of the genes localized in the GC-rich isochores. It was hypothesized that the gene expression level as well is higher in GC-rich compared to GC-poor isochores [Mol. Biol. Evol. 10 (1993) 186]. Several features of human genes and proteins, namely expression level, coding and non-coding lengths, and hydrophobicity were investigated in the present paper. The results support the hypothesis reported above, since all the parameters so far studied converge to the same conclusion, that the average expression level of the GC-rich genes is significantly higher than that of the GC-poor genes.
The complete mitochondrial genome of the invasive Africanized Honey Bee, Apis mellifera scutellata (Insecta: Hymenoptera: Apidae).

PubMed

Gibson, Joshua D; Hunt, Greg J

2016-01-01

The complete mitochondrial genome from an Africanized honey bee population (AHB, derived from Apis mellifera scutellata) was assembled and analyzed. The mitogenome is 16,411 bp long and contains the same gene repertoire and gene order as the European honey bee (13 protein coding genes, 22 tRNA genes and 2 rRNA genes). ND4 appears to use an alternate start codon and the long rRNA gene is 48 bp shorter in AHB due to a deletion in a terminal AT dinucleotide repeat. The dihydrouracil arm is missing from tRNA-Ser (AGN) and tRNA-Glu is missing the TV loop. The A + T content is comparable to the European honey bee (84.7%), which increases to 95% for the 3rd position in the protein coding genes.
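The gene tally and the A + T percentages mentioned above follow from simple counting. The Python sketch below shows the calculation on a short made-up coding sequence; the sequence, the function names, and the variable names are hypothetical and do not come from the AHB mitogenome.

```python
def at_content(seq: str) -> float:
    """Fraction of bases in seq that are A or T."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)


def third_position_at_content(cds: str) -> float:
    """A + T fraction restricted to the third position of each codon.
    Assumes cds starts in frame at a first codon position."""
    return at_content(cds.upper()[2::3])


# Gene repertoire stated in the abstract.
gene_counts = {"protein_coding": 13, "tRNA": 22, "rRNA": 2}
print(sum(gene_counts.values()))  # 37 genes in total

# Toy coding sequence (hypothetical): ATG GCT GCA TAA
toy_cds = "ATGGCTGCATAA"
print(round(at_content(toy_cds), 3))         # 7 of 12 bases are A or T
print(third_position_at_content(toy_cds))    # third positions are G, T, A, A
```

Applied to real protein-coding genes, the same slicing would give the kind of third-position figure the abstract reports.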
Small proteins in cyanobacteria provide a paradigm for the functional analysis of the bacterial micro-proteome.

PubMed

Baumgartner, Desiree; Kopf, Matthias; Klähn, Stephan; Steglich, Claudia; Hess, Wolfgang R

2016-11-28

Despite their versatile functions in multimeric protein complexes, in the modification of enzymatic activities, intercellular communication or regulatory processes, proteins shorter than 80 amino acids (μ-proteins) are a systematically underestimated class of gene products in bacteria. Photosynthetic cyanobacteria provide a paradigm for small protein functions due to extensive work on the photosynthetic apparatus that led to the functional characterization of 19 small proteins of less than 50 amino acids. In analogy, previously unstudied small ORFs with similar degrees of conservation might encode small proteins of high relevance also in other functional contexts. Here we used comparative transcriptomic information available for two model cyanobacteria, Synechocystis sp. PCC 6803 and Synechocystis sp. PCC 6714 for the prediction of small ORFs. We found 293 transcriptional units containing candidate small ORFs ≤80 codons in Synechocystis sp. PCC 6803, also including the known mRNAs encoding small proteins of the photosynthetic apparatus. From these transcriptional units, 146 are shared between the two strains, 42 are shared with the higher plant Arabidopsis thaliana and 25 with E. coli. To verify the existence of the respective μ-proteins in vivo, we selected five genes as examples to which a FLAG tag sequence was added and re-introduced them into Synechocystis sp. PCC 6803. These were the previously annotated gene ssr1169, two newly defined genes norf1 and norf4, as well as nsiR6 (nitrogen stress-induced RNA 6) and hliR1 (high light-inducible RNA 1), which originally were considered non-coding. Upon activation of expression via the Cu2+-responsive petE promoter or from the native promoters, all five proteins were detected in Western blot experiments. The distribution and conservation of these five genes as well as their regulation of expression and the physico-chemical properties of the encoded proteins underline the likely great bandwidth of small protein functions in bacteria and make them attractive candidates for functional studies.
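To make the idea of a candidate small ORF of at most 80 codons concrete, here is a minimal Python sketch that scans the forward strand of a sequence for ATG-initiated ORFs below a length cutoff. It is not the transcriptome-based prediction used in the study; the function name, the `min_codons` filter, and the toy sequence are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_small_orfs(seq: str, max_codons: int = 80, min_codons: int = 10):
    """Return (start, end, n_codons) for forward-strand ORFs that begin with
    ATG, end at the first in-frame stop codon, and have between min_codons
    and max_codons sense codons (stop codon not counted)."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] != "ATG":
                i += 3
                continue
            # Walk codon by codon until a stop codon or the end of the sequence.
            j = i
            while j + 3 <= len(seq) and seq[j:j + 3] not in STOP_CODONS:
                j += 3
            if j + 3 > len(seq):
                break  # no in-frame stop codon: not a complete ORF
            n_codons = (j - i) // 3
            if min_codons <= n_codons <= max_codons:
                orfs.append((i, j + 3, n_codons))
            i = j + 3  # resume scanning after the stop codon
    return orfs


# Toy sequence (hypothetical): ATG + 11 alanine codons + TAA, padded on both sides.
toy = "CC" + "ATG" + "GCT" * 11 + "TAA" + "GG"
print(find_small_orfs(toy))  # [(2, 41, 12)]
```

A real analysis would also scan the reverse strand and consider alternative start codons; both are omitted here to keep the sketch short.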
Capturing the Biofuel Wellhead and Powerhouse: The Chloroplast and Mitochondrial Genomes of the Leguminous Feedstock Tree Pongamia pinnata

PubMed Central

Kazakoff, Stephen H.; Imelfort, Michael; Edwards, David; Koehorst, Jasper; Biswas, Bandana; Batley, Jacqueline; Scott, Paul T.; Gresshoff, Peter M.

2012-01-01

Pongamia pinnata (syn. Millettia pinnata) is a novel, fast-growing arboreal legume that bears prolific quantities of oil-rich seeds suitable for the production of biodiesel and aviation biofuel. Here, we have used Illumina® ‘Second Generation DNA Sequencing (2GS)’ and a new short-read de novo assembler, SaSSY, to assemble and annotate the Pongamia chloroplast (152,968 bp; cpDNA) and mitochondrial (425,718 bp; mtDNA) genomes. We also show that SaSSY can be used to accurately assemble 2GS data, by re-assembling the Lotus japonicus cpDNA and in the process assemble its mtDNA (380,861 bp). The Pongamia cpDNA contains 77 unique protein-coding genes and is almost 60% gene-dense. It contains a 50 kb inversion common to other legumes, as well as a novel 6.5 kb inversion that is responsible for the non-disruptive, re-orientation of five protein-coding genes. Additionally, two copies of an inverted repeat firmly place the species outside the subclade of the Fabaceae lacking the inverted repeat. The Pongamia and L. japonicus mtDNA contain just 33 and 31 unique protein-coding genes, respectively, and like other angiosperm mtDNA, have expanded intergenic and multiple repeat regions. Through comparative analysis with Vigna radiata we measured the average synonymous and non-synonymous divergence of all three legume mitochondrial (1.59% and 2.40%, respectively) and chloroplast (8.37% and 8.99%, respectively) protein-coding genes. Finally, we explored the relatedness of Pongamia within the Fabaceae and showed the utility of the organellar genome sequences by mapping transcriptomic data to identify up- and down-regulated stress-responsive gene candidates and confirm in silico predicted RNA editing sites. PMID:23272141

Capturing the biofuel wellhead and powerhouse: the chloroplast and mitochondrial genomes of the leguminous feedstock tree Pongamia pinnata.

PubMed

Kazakoff, Stephen H; Imelfort, Michael; Edwards, David; Koehorst, Jasper; Biswas, Bandana; Batley, Jacqueline; Scott, Paul T; Gresshoff, Peter M

2012-01-01

Pongamia pinnata (syn. Millettia pinnata) is a novel, fast-growing arboreal legume that bears prolific quantities of oil-rich seeds suitable for the production of biodiesel and aviation biofuel. Here, we have used Illumina® 'Second Generation DNA Sequencing (2GS)' and a new short-read de novo assembler, SaSSY, to assemble and annotate the Pongamia chloroplast (152,968 bp; cpDNA) and mitochondrial (425,718 bp; mtDNA) genomes. We also show that SaSSY can be used to accurately assemble 2GS data, by re-assembling the Lotus japonicus cpDNA and in the process assemble its mtDNA (380,861 bp). The Pongamia cpDNA contains 77 unique protein-coding genes and is almost 60% gene-dense. It contains a 50 kb inversion common to other legumes, as well as a novel 6.5 kb inversion that is responsible for the non-disruptive, re-orientation of five protein-coding genes. Additionally, two copies of an inverted repeat firmly place the species outside the subclade of the Fabaceae lacking the inverted repeat. The Pongamia and L. japonicus mtDNA contain just 33 and 31 unique protein-coding genes, respectively, and like other angiosperm mtDNA, have expanded intergenic and multiple repeat regions. Through comparative analysis with Vigna radiata we measured the average synonymous and non-synonymous divergence of all three legume mitochondrial (1.59% and 2.40%, respectively) and chloroplast (8.37% and 8.99%, respectively) protein-coding genes. Finally, we explored the relatedness of Pongamia within the Fabaceae and showed the utility of the organellar genome sequences by mapping transcriptomic data to identify up- and down-regulated stress-responsive gene candidates and confirm in silico predicted RNA editing sites.
Search for protein partners of mitochondrial single-stranded DNA-binding protein Rim1p using a yeast two-hybrid system.

PubMed

Kucejová, B; Foury, F

2003-01-01

RIM1 is a nuclear gene of the yeast Saccharomyces cerevisiae coding for a protein with single-stranded DNA-binding activity that is essential for mitochondrial genome maintenance. No protein partners of Rim1p have been described so far in yeast. To better understand the role of this protein in mitochondrial DNA replication and recombination, a search for protein interactors by the yeast two-hybrid system was performed. This approach led to the identification of several candidates, including a putative transcription factor, Azf1p, and Mph1p, a protein with an RNA helicase domain which is known to influence the mutation rate of nuclear and mitochondrial genomes.
An exploration of the sequence of a 2.9-Mb region of the genome of Drosophila melanogaster: the Adh region.

PubMed Central

Ashburner, M; Misra, S; Roote, J; Lewis, S E; Blazej, R; Davis, T; Doyle, C; Galle, R; George, R; Harris, N; Hartzell, G; Harvey, D; Hong, L; Houston, K; Hoskins, R; Johnson, G; Martin, C; Moshrefi, A; Palazzolo, M; Reese, M G; Spradling, A; Tsang, G; Wan, K; Whitelaw, K; Celniker, S

1999-01-01

A contiguous sequence of nearly 3 Mb from the genome of Drosophila melanogaster has been sequenced from a series of overlapping P1 and BAC clones. This region covers 69 chromosome polytene bands on chromosome arm 2L, including the genetically well-characterized "Adh region." A computational analysis of the sequence predicts 218 protein-coding genes, 11 tRNAs, and 17 transposable element sequences. At least 38 of the protein-coding genes are arranged in clusters of from 2 to 6 closely related genes, suggesting extensive tandem duplication. The gene density is one protein-coding gene every 13 kb; the transposable element density is one element every 171 kb. Of 73 genes in this region identified by genetic analysis, 49 have been located on the sequence; P-element insertions have been mapped to 43 genes. Ninety-five (44%) of the known and predicted genes match a Drosophila EST, and 144 (66%) have clear similarities to proteins in other organisms. Genes known to have mutant phenotypes are more likely to be represented in cDNA libraries, and far more likely to have products similar to proteins of other organisms, than are genes with no known mutant phenotype. Over 650 chromosome aberration breakpoints map to this chromosome region, and their nonrandom distribution on the genetic map reflects variation in gene spacing on the DNA. This is the first large-scale analysis of the genome of D. melanogaster at the sequence level. In addition to the direct results obtained, this analysis has allowed us to develop and test methods that will be needed to interpret the complete sequence of the genome of this species. "Before beginning a Hunt, it is wise to ask someone what you are looking for before you begin looking for it." Milne 1926. PMID:10471707
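The density and percentage figures in this abstract can be reproduced with a few divisions. The Python sketch below assumes a region length of 2.9 Mb (2,900 kb), the value given in the record's title; the variable names are illustrative.

```python
REGION_KB = 2900          # 2.9 Mb, as stated in the title
PROTEIN_CODING_GENES = 218
TRANSPOSABLE_ELEMENTS = 17
EST_MATCHES = 95
PROTEIN_SIMILARITIES = 144

# Average spacing: kilobases of sequence per feature.
kb_per_gene = REGION_KB / PROTEIN_CODING_GENES
kb_per_te = REGION_KB / TRANSPOSABLE_ELEMENTS

# Shares of the 218 known and predicted genes.
pct_est = 100 * EST_MATCHES / PROTEIN_CODING_GENES
pct_similar = 100 * PROTEIN_SIMILARITIES / PROTEIN_CODING_GENES

print(round(kb_per_gene))   # 13  -> one gene every 13 kb
print(round(kb_per_te))     # 171 -> one element every 171 kb
print(round(pct_est))       # 44
print(round(pct_similar))   # 66
```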
Profilin is associated with transcriptionally active genes

PubMed Central

Söderberg, Emilia; Hessle, Viktoria; von Euler, Anne; Visa, Neus

2012-01-01

We have raised antibodies against the profilin of Chironomus tentans to study the location of profilin relative to chromatin and to active genes in salivary gland polytene chromosomes. We show that a fraction of profilin is located in the nucleus, where profilin is highly concentrated in the nucleoplasm and at the nuclear periphery. Moreover, profilin is associated with multiple bands in the polytene chromosomes. By staining salivary glands with propidium iodide, we show that profilin does not co-localize with dense chromatin. Profilin associates instead with protein-coding genes that are transcriptionally active, as revealed by co-localization with hnRNP and snRNP proteins. We have performed experiments of transcription inhibition with actinomycin D and we show that the association of profilin with the chromosomes requires ongoing transcription. However, the interaction of profilin with the gene loci does not depend on RNA. Our results are compatible with profilin regulating actin polymerization in the cell nucleus. However, the association of actin with the polytene chromosomes of C. tentans is sensitive to RNase, whereas the association of profilin is not, and we propose therefore that the chromosomal location of profilin is independent of actin. PMID:22572953
Preparation and characterization of human interleukin-5 expressed in recombinant Escherichia coli.

PubMed Central

Proudfoot, A E; Fattah, D; Kawashima, E H; Bernard, A; Wingfield, P T

1990-01-01

The gene coding for human interleukin-5 was synthesized and expressed in Escherichia coli under control of a heat-inducible promoter. High-level expression, 10-15% of total cellular protein, was achieved in E. coli. The protein was produced in an insoluble state. A simple extraction, renaturation and purification scheme is described. The recombinant protein was found to be a homodimer, similar to the natural murine-derived protein. Despite the lack of glycosylation, high specific activities were obtained in three 'in vitro' biological assays. Physical characterization of the protein showed it to be mostly alpha-helical, supporting the hypothesis that a conformational similarity exists among certain cytokines. PMID:2205201
Different small, acid-soluble proteins of the alpha/beta type have interchangeable roles in the heat and UV radiation resistance of Bacillus subtilis spores.

PubMed Central

Mason, J M; Setlow, P

1987-01-01

Spores of Bacillus subtilis strains which carry deletion mutations in one gene (sspA) or two genes (sspA and sspB) which code for major alpha/beta-type small, acid-soluble spore proteins (SASP) are known to be much more sensitive to heat and UV radiation than wild-type spores. This heat- and UV-sensitive phenotype was cured completely or in part by introduction into these mutant strains of one or more copies of the sspA or sspB genes themselves; multiple copies of the B. subtilis sspD gene, which codes for a minor alpha/beta-type SASP; or multiple copies of the SASP-C gene, which codes for a major alpha/beta-type SASP of Bacillus megaterium. These findings suggest that alpha/beta-type SASP play interchangeable roles in the heat and UV radiation resistance of bacterial spores. PMID:3112127
A Catalogue of Putative cis-Regulatory Interactions Between Long Non-coding RNAs and Proximal Coding Genes Based on Correlative Analysis Across Diverse Human Tumors.

PubMed

Basu, Swaraj; Larsson, Erik

2018-05-31

Antisense transcripts and other long non-coding RNAs are pervasive in mammalian cells, and some of these molecules have been proposed to regulate proximal protein-coding genes in cis. For example, non-coding transcription can contribute to inactivation of tumor suppressor genes in cancer, and antisense transcripts have been implicated in the epigenetic inactivation of imprinted genes. However, our knowledge is still limited and more such regulatory interactions likely await discovery. Here, we make use of available gene expression data from a large compendium of human tumors to generate hypotheses regarding non-coding-to-coding cis-regulatory relationships with emphasis on negative associations, as these are less likely to arise for reasons other than cis-regulation. We document a large number of possible regulatory interactions, including 193 coding/non-coding pairs that show expression patterns compatible with negative cis-regulation. Importantly, by this approach we capture several known cases, and many of the involved coding genes have known roles in cancer. Our study provides a large catalog of putative non-coding/coding cis-regulatory pairs that may serve as a basis for further experimental validation and characterization. Copyright © 2018 Basu and Larsson.
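A correlative screen of this kind rests on measuring how the expression of a non-coding transcript and its neighbouring coding gene co-vary across samples. The Python sketch below computes a Pearson correlation for one hypothetical pair and flags it when the association is strongly negative. The expression values, the function names, and the -0.5 cutoff are illustrative assumptions, not the statistics or thresholds used in the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def is_negative_candidate(r: float, cutoff: float = -0.5) -> bool:
    """Flag a pair whose correlation is at or below a (hypothetical) cutoff."""
    return r <= cutoff


# Toy expression values across four tumor samples (hypothetical).
lncrna_expr = [1.0, 2.0, 3.0, 4.0]
coding_expr = [8.0, 6.0, 4.0, 2.0]

r = pearson(lncrna_expr, coding_expr)
print(round(r, 3))               # -1.0 for this perfectly inverse toy pair
print(is_negative_candidate(r))  # True
```

A negative correlation alone does not establish regulation in cis, which is why the abstract frames such pairs as hypotheses for experimental follow-up.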
Wheat CBF gene family: identification of polymorphisms in the CBF coding sequence.

PubMed

Mohseni, Sara; Che, Hua; Djillali, Zakia; Dumont, Estelle; Nankeu, Joseph; Danyluk, Jean

2012-12-01

Expression of cold-regulated genes needed for protection against freezing stress is mediated, in part, by the CBF transcription factor family. Previous studies with temperate cereals suggested that the CBF gene family in wheat was large, and that CBF genes were at the base of an important low temperature tolerance trait. Therefore, the goal of our study was to identify the CBF repertoire in the freezing-tolerant hexaploid wheat cultivar Norstar, and then to examine if the coding region of CBF genes in two spring cultivars contain polymorphisms that could affect the protein sequence and structure. Our analyses reveal that hexaploid wheat contains a complex CBF family consisting of at least 65 CBF genes of which 60 are known to be expressed in the cultivar Norstar. They represent 27 paralogous genes with 1-3 homeologous copies for the A, B, and D genomes. The cultivar Norstar contains two pseudogenes and at least 24 additional proteins having sequences and (or) structures that deviate from the consensus in the conserved AP2 DNA-binding and (or) C-terminal activation-domains. This suggests that in cultivars such as Norstar, low temperature tolerance may be increased through breeding of additional optimal alleles. The examination of the CBF repertoire present in the two spring cultivars, Chinese Spring and Manitou, reveals that they have additional polymorphisms affecting conserved positions in these domains. Understanding the effects of these polymorphisms will provide additional information for the selection of optimum CBF alleles in Triticeae breeding programs.
Mucin acts as a nutrient source and a signal for the differential expression of genes coding for cellular processes and virulence factors in Acinetobacter baumannii

PubMed Central

Ohneck, Emily J.; Arivett, Brock A.; Fiester, Steven E.; Wood, Cecily R.; Metz, Maeva L.; Simeone, Gabriella M.

2018-01-01

The capacity of Acinetobacter baumannii to persist and cause infections depends on its interaction with abiotic and biotic surfaces, including those found on medical devices and host mucosal surfaces. However, the extracellular stimuli affecting these interactions are poorly understood. Based on our previous observations, we hypothesized that mucin, a glycoprotein secreted by lung epithelial cells, particularly during respiratory infections, significantly alters A. baumannii’s physiology and its interaction with the surrounding environment. Biofilm, virulence and growth assays showed that mucin enhances the interaction of A. baumannii ATCC 19606T with abiotic and biotic surfaces and its cytolytic activity against epithelial cells while serving as a nutrient source. The global effect of mucin on the physiology and virulence of this pathogen is supported by RNA-Seq data showing that its presence in a low nutrient medium results in the differential transcription of 427 predicted protein-coding genes. The reduced expression of ion acquisition genes and the increased transcription of genes coding for energy production together with the detection of mucin degradation indicate that this host glycoprotein is a nutrient source. The increased expression of genes coding for adherence and biofilm biogenesis on abiotic and biotic surfaces, the degradation of phenylacetic acid and the production of an active type VI secretion system further supports the role mucin plays in virulence. Taken together, our observations indicate that A. baumannii recognizes mucin as an environmental signal, which triggers a response cascade that allows this pathogen to acquire critical nutrients and promotes host-pathogen interactions that play a role in the pathogenesis of bacterial infections. PMID:29309434
New genes from non-coding sequence: the role of de novo protein-coding genes in eukaryotic evolutionary innovation.

PubMed

McLysaght, Aoife; Guerzoni, Daniele

2015-09-26

The origin of novel protein-coding genes de novo was once considered so improbable as to be impossible. In less than a decade, and especially in the last five years, this view has been overturned by extensive evidence from diverse eukaryotic lineages. There is now evidence that this mechanism has contributed a significant number of genes to genomes of organisms as diverse as Saccharomyces, Drosophila, Plasmodium, Arabidopsis and human. From simple beginnings, these genes have in some instances acquired complex structure, regulated expression and important functional roles. New genes are often thought of as dispensable late additions; however, some recent de novo genes in human can play a role in disease. Rather than an extremely rare occurrence, it is now evident that there is a relatively constant trickle of proto-genes released into the testing ground of natural selection. It is currently unknown whether de novo genes arise primarily through an 'RNA-first' or 'ORF-first' pathway. Either way, evolutionary tinkering with this pool of genetic potential may have been a significant player in the origins of lineage-specific traits and adaptations. © 2015 The Authors.
Identification of Nucleolus-Associated Chromatin Domains Reveals a Role for the Nucleolus in 3D Organization of the A. thaliana Genome.

PubMed

Pontvianne, Frédéric; Carpentier, Marie-Christine; Durut, Nathalie; Pavlištová, Veronika; Jaške, Karin; Schořová, Šárka; Parrinello, Hugues; Rohmer, Marine; Pikaard, Craig S; Fojtová, Miloslava; Fajkus, Jiří; Sáez-Vásquez, Julio

2016-08-09

The nucleolus is the site of rRNA gene transcription, rRNA processing, and ribosome biogenesis. However, the nucleolus also plays additional roles in the cell. We isolated nucleoli using fluorescence-activated cell sorting (FACS) and identified nucleolus-associated chromatin domains (NADs) by deep sequencing, comparing wild-type plants and null mutants for the nucleolar protein NUCLEOLIN 1 (NUC1). NADs are primarily genomic regions with heterochromatic signatures and include transposable elements (TEs), sub-telomeric regions, and mostly inactive protein-coding genes. However, NADs also include active rRNA genes and the entire short arm of chromosome 4 adjacent to them. In nuc1 null mutants, which alter rRNA gene expression and overall nucleolar structure, NADs are altered, telomere association with the nucleolus is decreased, and telomeres become shorter. Collectively, our studies reveal roles for NUC1 and the nucleolus in the spatial organization of chromosomes as well as telomere maintenance. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Dissecting the chromatin interactome of microRNA genes.

PubMed

Chen, Dijun; Fu, Liang-Yu; Zhang, Zhao; Li, Guoliang; Zhang, Hang; Jiang, Li; Harrison, Andrew P; Shanahan, Hugh P; Klukas, Christian; Zhang, Hong-Yu; Ruan, Yijun; Chen, Ling-Ling; Chen, Ming

2014-03-01

Our knowledge of the role of higher-order chromatin structures in transcription of microRNA genes (MIRs) is evolving rapidly. Here we investigate the effect of 3D architecture of chromatin on the transcriptional regulation of MIRs. We demonstrate that MIRs have transcriptional features that are similar to protein-coding genes. RNA polymerase II-associated ChIA-PET data reveal that many groups of MIRs and protein-coding genes are organized into functionally compartmentalized chromatin communities and undergo coordinated expression when their genomic loci are spatially colocated. We observe that MIRs display widespread communication in those transcriptionally active communities. Moreover, miRNA-target interactions are significantly enriched among communities with functional homogeneity while depleted from the same community from which they originated, suggesting MIRs coordinating function-related pathways at posttranscriptional level. Further investigation demonstrates the existence of spatial MIR-MIR chromatin interacting networks. We show that groups of spatially coordinated MIRs are frequently from the same family and involved in the same disease category. The spatial interaction network possesses both common and cell-specific subnetwork modules that result from the spatial organization of chromatin within different cell types. Together, our study unveils an entirely unexplored layer of MIR regulation throughout the human genome that links the spatial coordination of MIRs to their co-expression and function.
De novo assembly and characterization of the Trichuris trichiura adult worm transcriptome using Ion Torrent sequencing.

PubMed

Santos, Leonardo N; Silva, Eduardo S; Santos, André S; De Sá, Pablo H; Ramos, Rommel T; Silva, Artur; Cooper, Philip J; Barreto, Maurício L; Loureiro, Sebastião; Pinheiro, Carina S; Alcantara-Neves, Neuza M; Pacheco, Luis G C

2016-07-01

Infection with helminthic parasites, including the soil-transmitted helminth Trichuris trichiura (human whipworm), has been shown to modulate host immune responses and, consequently, to have an impact on the development and manifestation of chronic human inflammatory diseases. De novo derivation of helminth proteomes from sequencing of transcriptomes will provide valuable data to aid identification of parasite proteins that could be evaluated as potential immunotherapeutic molecules in the near future. Herein, we characterized the transcriptome of the adult stage of the human whipworm T. trichiura, using next-generation sequencing technology and a de novo assembly strategy. Nearly 17.6 million high-quality clean reads were assembled into 6414 contiguous sequences, with an N50 of 1606 bp. In total, 5673 protein-encoding sequences were confidently identified in the T. trichiura adult worm transcriptome; of these, 1013 sequences represent potential newly discovered proteins for the species, most of which have orthologs already annotated in the related species T. suis. A number of transcripts representing probable novel non-coding transcripts for the species T. trichiura were also identified. Among the most abundant transcripts, we found sequences that code for proteins involved in lipid transport, such as vitellogenins, and several chitin-binding proteins. Through a cross-species expression analysis of gene orthologs shared by T. trichiura and the closely related parasites T. suis and T. muris, it was possible to find twenty-six protein-encoding genes that are consistently highly expressed in the adult stages of the three helminth species. Additionally, twenty transcripts could be identified that code for proteins previously detected by mass spectrometry analysis of protein fractions of the whipworm somatic extract that present immunomodulatory activities. Five of these transcripts were amongst the most highly expressed protein-encoding sequences in the T. trichiura adult worm. Besides, orthologs of proteins demonstrated to have potent immunomodulatory properties in related parasitic helminths were also predicted from the T. trichiura de novo assembled transcriptome. Copyright © 2016. Published by Elsevier B.V.
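The N50 quoted for the assembly above is a standard contiguity statistic: the contig length at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length. A minimal sketch of that calculation follows. The function name `n50` and the toy contig lengths are illustrative assumptions, not data or code from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)   # longest contig first
    half_total = sum(ordered) / 2             # half of the assembly size
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            # First contig at which the cumulative length covers half the assembly.
            return length


# Hypothetical toy assembly, not the T. trichiura contigs.
toy_contigs = [2000, 1606, 1200, 800, 400]
print(n50(toy_contigs))
```

A higher N50 generally indicates a more contiguous assembly, although it says nothing about correctness of the contigs themselves.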
A gene family for acidic ribosomal proteins in Schizosaccharomyces pombe: two essential and two nonessential genes.

PubMed Central

Beltrame, M; Bianchi, M E

1990-01-01

We have cloned the genes for small acidic ribosomal proteins (A-proteins) of the fission yeast Schizosaccharomyces pombe. S. pombe contains four transcribed genes for small A-proteins per haploid genome, as is the case for Saccharomyces cerevisiae. In contrast, multicellular eucaryotes contain two transcribed genes per haploid genome. The four proteins of S. pombe, besides sharing a high overall similarity, form two couples of nearly identical sequences. Their corresponding genes have a very conserved structure and are transcribed to a similar level. Surprisingly, of each couple of genes coding for nearly identical proteins, one is essential for cell growth, whereas the other is not. We suggest that the unequal importance of the four small A-proteins for cell survival is related to their physical organization in 60S ribosomal subunits. PMID:2325655
The analysis of the complete mitochondrial genome of Lecanicillium muscarium (synonym Verticillium lecanii) suggests a minimum common gene organization in mtDNAs of Sordariomycetes: phylogenetic implications.

PubMed

Kouvelis, Vassili N; Ghikas, Dimitri V; Typas, Milton A

2004-10-01

The mitochondrial genome (mtDNA) of the entomopathogenic fungus Lecanicillium muscarium (synonym Verticillium lecanii), with a total size of 24,499 bp, has been analyzed. So far, it is the smallest known mitochondrial genome among Pezizomycotina, with an extremely compact gene organization and only one group-I intron in its large ribosomal RNA (rnl) gene. It contains the 14 typical genes coding for proteins related to oxidative phosphorylation, the two rRNA genes, one intronic ORF coding for a possible ribosomal protein (rps), and a set of 25 tRNA genes which recognize codons for all amino acids, except alanine and cysteine. All genes are transcribed from the same DNA strand. Gene order comparison with all available complete fungal mtDNAs (representatives of all four phyla are included) revealed some characteristic common features, such as uninterrupted gene pairs, overlapping genes, and extremely variable intergenic regions, that can all be exploited for the study of fungal mitochondrial genomes. Moreover, a minimum common mtDNA gene order could be detected, in two units, for all known Sordariomycetes, namely nad1-nad4-atp8-atp6 and rns-cox3-rnl, which can be extended in Hypocreales to nad4L-nad5-cob-cox1-nad1-nad4-atp8-atp6 and rns-cox3-rnl nad2-nad3, respectively. Phylogenetic analysis of all fungal mtDNA essential protein-coding genes as one unit clearly demonstrated the superiority of small genome (mtDNA) over single gene comparisons.
Molecular cloning, structural analysis, and expression in Escherichia coli of a chitinase gene from Enterobacter agglomerans.

PubMed Central

Chernin, L S; De la Fuente, L; Sobolev, V; Haran, S; Vorgias, C E; Oppenheim, A B; Chet, I

1997-01-01

The gene chiA, which codes for endochitinase, was cloned from a soilborne Enterobacter agglomerans. Its complete sequence was determined, and the deduced amino acid sequence of the enzyme designated Chia_Entag yielded an open reading frame coding for 562 amino acids of a 61-kDa precursor protein with a putative leader peptide at its N terminus. The nucleotide and polypeptide sequences of Chia_Entag showed 86.8 and 87.7% identity with the corresponding gene and enzyme, Chia_Serma, of Serratia marcescens, respectively. Homology modeling of Chia_Entag's three-dimensional structure demonstrated that most amino acid substitutions are at solvent-accessible sites. Escherichia coli JM109 carrying the E. agglomerans chiA gene produced and secreted Chia_Entag. The antifungal activity of the secreted endochitinase was demonstrated in vitro by inhibition of Fusarium oxysporum spore germination. The transformed strain inhibited Rhizoctonia solani growth on plates and the root rot disease caused by this fungus in cotton seedlings under greenhouse conditions. PMID:9055404
Identification and characterization of an early gene in the Lymantria dispar multinucleocapsid nuclear polyhedrosis virus

Treesearch

David S. Bischoff; James M. Slavicek

1995-01-01

The Lymantria dispar multinucleocapsid nuclear polyhedrosis virus (LdMNPV) gene encoding G22 was cloned and sequenced. The G22 gene codes for a 191 amino acid protein with a predicted Mr of 22000. Expression of G22 in a rabbit reticulocyte system generated a protein with an M...
Analysis of informational redundancy in the protein-assembling machinery

NASA Astrophysics Data System (ADS)

Berkovich, Simon

2004-03-01

Entropy analysis of the DNA structure does not reveal a significant departure from randomness, indicating lack of informational redundancy. This signifies the absence of a hidden meaning in the genome text and supports the 'barcode' interpretation of DNA given in [1]. Lack of informational redundancy is a characteristic property of an identification label rather than of a message of instructions. Yet randomness of DNA has to induce non-random structures of the proteins. Protein synthesis is a two-step process: transcription into RNA with gene splicing and formation of a structure of amino acids. Entropy estimations, performed by A. Djebbari, show typical values of redundancy of the biomolecules along these pathways: DNA gene 4%; proteins 15-40%. In gene expression, the RNA copy carries the same information as the original DNA template. Randomness is essentially eliminated only at the step of the protein creation by a degenerate code. According to [1], the significance of the substitution of U for T with a subsequent gene splicing is that these transformations result in a different pattern of RNA oscillations, so the vital DNA communications are protected against extraneous noise coming from the protein making activities. 1. S. Berkovich, "On the 'barcode' functionality of DNA, or the Phenomenon of Life in the Physical Universe", Dorrance Publishing Co., Pittsburgh, 2003
Evolution of the alternative AQP2 gene: Acquisition of a novel protein-coding sequence in dolphins.

PubMed

Kishida, Takushi; Suzuki, Miwa; Takayama, Asuka

2018-01-01

Taxon-specific de novo protein-coding sequences are thought to be important for taxon-specific environmental adaptation. A recent study revealed that bottlenose dolphins acquired a novel isoform of aquaporin 2 generated by alternative splicing (alternative AQP2), which helps dolphins to live in hyperosmotic seawater. The AQP2 gene consists of four exons, but the alternative AQP2 gene lacks the fourth exon and instead has a longer third exon that includes the original third exon and a part of the original third intron. Here, we show that the latter half of the third exon of the alternative AQP2 arose from a non-protein-coding sequence. Intact ORF of this de novo sequence is shared not by all cetaceans, but only by delphinoids. However, this sequence is conservative in all modern cetaceans, implying that this de novo sequence potentially plays important roles for marine adaptation in cetaceans. Copyright © 2017 Elsevier Inc. All rights reserved.
Efficient analysis of mouse genome sequences reveal many nonsense variants

PubMed Central

Steeland, Sophie; Timmermans, Steven; Van Ryckeghem, Sara; Hulpiau, Paco; Saeys, Yvan; Van Montagu, Marc; Vandenbroucke, Roosmarijn E.; Libert, Claude

2016-01-01

Genetic polymorphisms in coding genes play an important role when using mouse inbred strains as research models. They have been shown to influence research results, explain phenotypical differences between inbred strains, and increase the amount of interesting gene variants present in the many available inbred lines. SPRET/Ei is an inbred strain derived from Mus spretus that has ∼1% sequence difference with the C57BL/6J reference genome. We obtained a listing of all SNPs and insertions/deletions (indels) present in SPRET/Ei from the Mouse Genomes Project (Wellcome Trust Sanger Institute) and processed these data to obtain an overview of all transcripts having nonsynonymous coding sequence variants. We identified 8,883 unique variants affecting 10,096 different transcripts from 6,328 protein-coding genes, which is about 28% of all coding genes. Because only a subset of these variants results in drastic changes in proteins, we focused on variations that are nonsense mutations that ultimately resulted in a gain of a stop codon. These genes were identified by in silico changing the C57BL/6J coding sequences to the SPRET/Ei sequences, converting them to amino acid (AA) sequences, and comparing the AA sequences. All variants and transcripts affected were also stored in a database, which can be browsed using a SPRET/Ei M. spretus variants web tool (www.spretus.org), including a manual. We validated the tool by demonstrating the loss of function of three proteins predicted to be severely truncated, namely Fas, IRAK2, and IFNγR1. PMID:27147605
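The in silico step described above (substitute the strain's variants into the reference coding sequence, translate both, and compare the amino acid sequences) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: it handles single-nucleotide substitutions only (no indels), uses the standard genetic code, and the names `translate`, `apply_snps`, `gains_stop` and the toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def apply_snps(cds, snps):
    """Return cds with 0-based (position, new_base) substitutions applied."""
    seq = list(cds)
    for pos, base in snps:
        seq[pos] = base
    return "".join(seq)


def gains_stop(reference_cds, variant_cds):
    """True if the variant protein is shorter than the reference protein.

    With substitutions only, a shorter product implies a premature stop codon.
    """
    return len(translate(variant_cds)) < len(translate(reference_cds))


# Hypothetical toy example: TGG (Trp) becomes TGA (stop) at position 5.
reference = "ATGTGGAAATAA"
variant = apply_snps(reference, [(5, "A")])
print(translate(reference), translate(variant), gains_stop(reference, variant))
```

Comparing translated products rather than individual codons keeps the check simple when several variants fall in the same transcript.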

Penicillin-binding protein 4 of Escherichia coli: molecular cloning of the dacB gene, controlled overexpression, and alterations in murein composition.

PubMed

Korat, B; Mottl, H; Keck, W

1991-03-01

The penicillin-binding protein 4 (PBP4), from Escherichia coli, a DD-carboxypeptidase/DD-endopeptidase, was purified in an enzymatically active form to homogeneity by affinity chromatography on 6-aminopenicillanic acid/Sepharose and heparin/Sepharose. Polyclonal antibodies raised against the pure protein were used to identify and isolate PBP4 overproducing clones from an E. coli expression library, which was established on the basis of a temperature-inducible runaway replication plasmid. Three positive clones were isolated, one of which carried the intact structural gene dacB that codes for PBP4, on a 1.9kb SmaI-EcoRI fragment, whereas the other two carried truncated forms of this gene. The direction of transcription was determined. The PBP4 overproducing strain, when grown in rich medium, tolerated 160-fold overexpression. After disrupting cells by sonication, the majority (80%) of the overproduced PBP4 was detected in the 100,000 X g supernatant. Southern blotting analysis using the cloned dacB gene as a probe revealed that, in contrast to that described by Takeda et al. (1981), the plasmid pLC18-38 of the Clarke-Carbon collection does not code for PBP4. The overall composition of murein, synthesized in vitro or in vivo by the PBP4 overproducing strain, as determined by high-performance liquid chromatography analysis, suggests that PBP4 is not involved in transpeptidation but exclusively catalyses a DD-carboxypeptidase and DD-endopeptidase reaction.
A single Ala139-to-Glu substitution in the Renibacterium salmoninarum virulence-associated protein p57 results in antigenic variation and is associated with enhanced p57 binding to Chinook salmon leukocytes

USGS Publications Warehouse

Wiens, Gregory D.; Pascho, Ron; Winton, James R.

2002-01-01

The gram-positive bacterium Renibacterium salmoninarum produces relatively large amounts of a 57-kDa protein (p57) implicated in the pathogenesis of salmonid bacterial kidney disease. Antigenic variation in p57 was identified by using monoclonal antibody 4C11, which exhibited severely decreased binding to R. salmoninarum strain 684 p57 and bound robustly to the p57 proteins of seven other R. salmoninarum strains. This difference in binding was not due to alterations in p57 synthesis, secretion, or bacterial cell association. The molecular basis of the 4C11 epitope loss was determined by amplifying and sequencing the two identical genes encoding p57, msa1 and msa2. The 5′ and coding sequences of the 684 msa1 and msa2 genes were identical to those of the ATCC 33209 msa1 and msa2 genes except for a single C-to-A nucleotide mutation. This mutation was identified in both the msa1 and msa2 genes of strain 684 and resulted in an Ala139-to-Glu substitution in the amino-terminal region of p57. We examined whether this mutation in p57 altered salmonid leukocyte and rabbit erythrocyte binding activities. R. salmoninarum strain 684 extracellular protein exhibited a twofold increase in agglutinating activity for chinook salmon leukocytes and rabbit erythrocytes compared to the activity of the ATCC 33209 extracellular protein. A specific and quantitative p57 binding assay confirmed the increased binding activity of 684 p57. Monoclonal antibody 4C11 blocked the agglutinating activity of the ATCC 33209 extracellular protein but not the agglutinating activity of the 684 extracellular protein. These results indicate that the Ala139-to-Glu substitution altered immune recognition and was associated with enhanced biological activity of R. salmoninarum 684 p57.
Tetrahymena thermophila acidic ribosomal protein L37 contains an archaebacterial type of C-terminus.

PubMed

Hansen, T S; Andreasen, P H; Dreisig, H; Højrup, P; Nielsen, H; Engberg, J; Kristiansen, K

1991-09-15

We have cloned and characterized a Tetrahymena thermophila macronuclear gene (L37) encoding the acidic ribosomal protein (A-protein) L37. The gene contains a single intron located in the 3'-part of the coding region. Two major and three minor transcription start points (tsp) were mapped 39 to 63 nucleotides upstream from the translational start codon. The uppermost tsp mapped to the first T in a putative T. thermophila RNA polymerase II initiator element, TATAA. The coding region of L37 predicts a protein of 109 amino acid (aa) residues. A substantial part of the deduced aa sequence was verified by protein sequencing. The T. thermophila L37 clearly belongs to the P1-type family of eukaryotic A-proteins, but the C-terminal region has the hallmarks of archaebacterial A-proteins.
Massively Convergent Evolution for Ribosomal Protein Gene Content in Plastid and Mitochondrial Genomes

PubMed Central

Maier, Uwe-G; Zauner, Stefan; Woehle, Christian; Bolte, Kathrin; Hempel, Franziska; Allen, John F.; Martin, William F.

2013-01-01

Plastid and mitochondrial genomes have undergone parallel evolution to encode the same functional set of genes. These encode conserved protein components of the electron transport chain in their respective bioenergetic membranes and genes for the ribosomes that express them. This highly convergent aspect of organelle genome evolution is partly explained by the redox regulation hypothesis, which predicts a separate plastid or mitochondrial location for genes encoding bioenergetic membrane proteins of either photosynthesis or respiration. Here we show that convergence in organelle genome evolution is far stronger than previously recognized, because the same set of genes for ribosomal proteins is independently retained by both plastid and mitochondrial genomes. A hitherto unrecognized selective pressure retains genes for the same ribosomal proteins in both organelles. On the Escherichia coli ribosome assembly map, the retained proteins are implicated in 30S and 50S ribosomal subunit assembly and initial rRNA binding. We suggest that ribosomal assembly imposes functional constraints that govern the retention of ribosomal protein coding genes in organelles. These constraints are subordinate to redox regulation for electron transport chain components, which anchor the ribosome to the organelle genome in the first place. As organelle genomes undergo reduction, the rRNAs also become smaller. Below size thresholds of approximately 1,300 nucleotides (16S rRNA) and 2,100 nucleotides (26S rRNA), all ribosomal protein coding genes are lost from organelles, while electron transport chain components remain organelle encoded as long as the organelles use redox chemistry to generate a proton motive force. PMID:24259312
Genes from the medicinal leech (Hirudo medicinalis) coding for unusual enzymes that specifically cleave endo-epsilon (gamma-Glu)-Lys isopeptide bonds and help to dissolve blood clots.

PubMed

Zavalova, L; Lukyanov, S; Baskova, I; Snezhkov, E; Akopov, S; Berezhnoy, S; Bogdanova, E; Barsova, E; Sverdlov, E D

1996-11-27

We previously detected in salivary gland secretions of the medicinal leech (Hirudo medicinalis) a novel enzymatic activity, endo-epsilon(gamma-Glu)-Lys isopeptidase, which cleaves isopeptide bonds formed by transglutaminase (Factor XIIIa) between glutamine gamma-carboxamide and the epsilon-amino group of lysine. Such isopeptide bonds, either within or between protein polypeptide chains, are formed in many biological processes. However, before we started our work, no enzymes were known to be capable of specifically splitting isopeptide bonds in proteins. The isopeptidase activity we detected was specific for isopeptide bonds. The enzyme was termed destabilase. Here we report the first purification of destabilase, part of its amino acid sequence, isolation and sequencing of two related cDNAs derived from the gene family that encodes destabilase proteins, and the detection of isopeptidase activity encoded by one of these cDNAs cloned in a baculovirus expression vector. The deduced mature protein products of these cDNAs contain 115 and 116 amino acid residues, including 14 highly conserved Cys residues, and are formed from precursors containing specific leader peptides. No homologous sequences were found in public databases.
Functional identification of glutamate cysteine ligase and glutathione synthetase in the marine yeast Rhodosporidium diobovatum.

PubMed

Kong, Min; Wang, Fengjuan; Tian, Liuying; Tang, Hui; Zhang, Liping

2017-12-15

Glutathione (GSH) fulfills a variety of metabolic functions, participates in oxidative stress response, and defends against toxic actions of heavy metals and xenobiotics. In this study, GSH was detected in Rhodosporidium diobovatum by high-performance liquid chromatography (HPLC). Then, two novel enzymes from R. diobovatum were characterized that convert glutamate, cysteine, and glycine into GSH. Based on reverse transcription PCR, we obtained the glutathione synthetase gene (GSH2), 1866 bp, coding for a 56.6-kDa protein, and the glutamate cysteine ligase gene (GSH1), 2469 bp, coding for a 90.5-kDa protein. The role of GSH1 and GSH2 in the biosynthesis of GSH in the marine yeast R. diobovatum was determined by deletions using the CRISPR-Cas9 nuclease system and enzymatic activity. These results also showed that GSH1 and GSH2 were involved in the production of GSH and are thus potentially useful for engineering GSH pathways. Alternatively, pET-GSH constructed using in vitro recombination could be used to detect the function of genes related to GSH biosynthesis. Finally, the fermentation parameters determined in the present study provide a reference for industrial GSH production in R. diobovatum.
Functional identification of glutamate cysteine ligase and glutathione synthetase in the marine yeast Rhodosporidium diobovatum

NASA Astrophysics Data System (ADS)

Kong, Min; Wang, Fengjuan; Tian, Liuying; Tang, Hui; Zhang, Liping

2018-02-01

Glutathione (GSH) fulfills a variety of metabolic functions, participates in oxidative stress response, and defends against toxic actions of heavy metals and xenobiotics. In this study, GSH was detected in Rhodosporidium diobovatum by high-performance liquid chromatography (HPLC). Then, two novel enzymes from R. diobovatum were characterized that convert glutamate, cysteine, and glycine into GSH. Based on reverse transcription PCR, we obtained the glutathione synthetase gene (GSH2), 1866 bp, coding for a 56.6-kDa protein, and the glutamate cysteine ligase gene (GSH1), 2469 bp, coding for a 90.5-kDa protein. The role of GSH1 and GSH2 in the biosynthesis of GSH in the marine yeast R. diobovatum was determined by deletions using the CRISPR-Cas9 nuclease system and enzymatic activity. These results also showed that GSH1 and GSH2 were involved in the production of GSH and are thus potentially useful for engineering GSH pathways. Alternatively, pET-GSH constructed using in vitro recombination could be used to detect the function of genes related to GSH biosynthesis. Finally, the fermentation parameters determined in the present study provide a reference for industrial GSH production in R. diobovatum.
Expression profiles of long non-coding RNAs located in autoimmune disease-associated regions reveal immune cell-type specificity.

PubMed

Hrdlickova, Barbara; Kumar, Vinod; Kanduri, Kartiek; Zhernakova, Daria V; Tripathi, Subhash; Karjalainen, Juha; Lund, Riikka J; Li, Yang; Ullah, Ubaid; Modderman, Rutger; Abdulahad, Wayel; Lähdesmäki, Harri; Franke, Lude; Lahesmaa, Riitta; Wijmenga, Cisca; Withoff, Sebo

2014-01-01

Although genome-wide association studies (GWAS) have identified hundreds of variants associated with a risk for autoimmune and immune-related disorders (AID), our understanding of the disease mechanisms is still limited. In particular, more than 90% of the risk variants lie in non-coding regions, and almost 10% of these map to long non-coding RNA transcripts (lncRNAs). lncRNAs are known to show more cell-type specificity than protein-coding genes. We aimed to characterize lncRNAs and protein-coding genes located in loci associated with nine AIDs which have been well-defined by Immunochip analysis and by transcriptome analysis across seven populations of peripheral blood leukocytes (granulocytes, monocytes, natural killer (NK) cells, B cells, memory T cells, naive CD4(+) and naive CD8(+) T cells) and four populations of cord blood-derived T-helper cells (precursor, primary, and polarized (Th1, Th2) T-helper cells). We show that lncRNAs mapping to loci shared between AID are significantly enriched in immune cell types compared to lncRNAs from the whole genome (α <0.005). We were not able to prioritize single cell types relevant for specific diseases, but we observed five different cell types enriched (α <0.005) in five AID (NK cells for inflammatory bowel disease, juvenile idiopathic arthritis, primary biliary cirrhosis, and psoriasis; memory T and CD8(+) T cells in juvenile idiopathic arthritis, primary biliary cirrhosis, psoriasis, and rheumatoid arthritis; Th0 and Th2 cells for inflammatory bowel disease, juvenile idiopathic arthritis, primary biliary cirrhosis, psoriasis, and rheumatoid arthritis). Furthermore, we show that co-expression analyses of lncRNAs and protein-coding genes can predict the signaling pathways in which these AID-associated lncRNAs are involved. 
The observed enrichment of lncRNA transcripts in AID loci implies lncRNAs play an important role in AID etiology and suggests that lncRNA genes should be studied in more detail to interpret GWAS findings correctly. The co-expression results strongly support a model in which the lncRNA and protein-coding genes function together in the same pathways.
A-Mating-Type Gene Expression Can Drive Clamp Formation in the Bipolar Mushroom Pholiota microspora (Pholiota nameko)

PubMed Central

Yi, Ruirong; Mukaiyama, Hiroyuki; Tachikawa, Takashi; Shimomura, Norihiro; Aimi, Tadanori

2010-01-01

In the bipolar basidiomycete Pholiota microspora, a pair of homeodomain protein genes located at the A-mating-type locus regulates mating compatibility. In the present study, we used a DNA-mediated transformation system in P. microspora to investigate the homeodomain proteins that control the clamp formation. When a single homeodomain protein gene (A3-hox1 or A3-hox2) from the A3 monokaryon strain was transformed into the A4 monokaryon strain, the transformants produced many pseudoclamps but very few clamps. When two homeodomain protein genes (A3-hox1 and A3-hox2) were transformed either separately or together into the A4 monokaryon, the ratio of clamps to the clamplike cells in the transformants was significantly increased to ca. 50%. We therefore concluded that the gene dosage of homeodomain protein genes is important for clamp formation. When the sip promoter was connected to the coding region of A3-hox1 and A3-hox2 and the fused fragments were introduced into NGW19-6 (A4), the transformants achieved more than 85% clamp formation and exhibited two nuclei per cell, similar to the dikaryon (NGW12-163 × NGW19-6). The results of real-time reverse transcription-PCR confirmed that sip promoter activity is greater than that of the native promoter of homeodomain protein genes in P. microspora. Thus, we concluded that nearly 100% clamp formation requires high expression levels of homeodomain protein genes and that altered expression of the A-mating-type genes alone is sufficient to drive true clamp formation. PMID:20453073
The RB-related gene Rb2/p130 in neuroblastoma differentiation and in B-myb promoter down-regulation.

PubMed

Raschellà, G; Tanno, B; Bonetto, F; Negroni, A; Claudio, P P; Baldi, A; Amendola, R; Calabretta, B; Giordano, A; Paggi, M G

1998-05-01

The retinoblastoma family of nuclear factors is composed of RB, the prototype of the tumour suppressor genes, and of the closely related genes p107 and Rb2/p130. The three genes code for proteins, namely pRb, p107 and pRb2/p130, that share similar structures and functions. These proteins are expressed, often simultaneously, in many cell types and are involved in the regulation of proliferation and differentiation. We determined the expression and the phosphorylation of the RB family gene products during the DMSO-induced differentiation of the N1E-115 murine neuroblastoma cells. In this system, pRb2/p130 was strongly up-regulated during mid-late differentiation stages, while, on the contrary, pRb and p107 were markedly decreased at late stages. Differentiating N1E-115 cells also showed a progressive decrease in levels of B-myb, a proliferation-related protein whose constitutive expression inhibits neuronal differentiation. Transfection of each of the RB family genes in these cells was able, to different degrees, to induce neuronal differentiation, to inhibit [3H]thymidine incorporation and to down-regulate the activity of the B-myb promoter.
Genome-Wide Discovery of Long Non-Coding RNAs in Rainbow Trout.

PubMed

Al-Tobasei, Rafet; Paneru, Bam; Salem, Mohamed

2016-01-01

The ENCODE project revealed that ~70% of the human genome is transcribed. While only 1-2% of the RNAs encode proteins, the rest are non-coding RNAs. Long non-coding RNAs (lncRNAs) form a diverse class of non-coding RNAs that are longer than 200 nt. Emerging evidence indicates that lncRNAs play critical roles in various cellular processes including regulation of gene expression. LncRNAs show low levels of gene expression and sequence conservation, which make their computational identification in genomes difficult. In this study, more than two billion Illumina sequence reads were mapped to the genome reference using the TopHat and Cufflinks software. Transcripts shorter than 200 nt, with ORFs longer than 83-100 amino acids, or with significant homology to the NCBI nr protein database were removed. In addition, a computational pipeline was used to filter the remaining transcripts based on a protein-coding-score test. Depending on the filtering stringency conditions, between 31,195 and 54,503 lncRNAs were identified, with only 421 matching known lncRNAs in other species. A digital gene expression atlas revealed 2,935 tissue-specific and 3,269 ubiquitously-expressed lncRNAs. This study annotates lncRNAs in the rainbow trout genome and provides a valuable resource for functional genomics research in salmonids.
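The first two filters described in this abstract (minimum transcript length and maximum ORF length) can be sketched as below. This is a rough illustration under assumptions, not the study's pipeline: it scans only the three forward reading frames for ATG-to-stop ORFs, it omits the homology search and the protein-coding-score test, and the function names, the 100-amino-acid default, and the toy sequences are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}


def longest_orf_aa(seq):
    """Length (in amino acids) of the longest ATG-to-stop ORF in the three forward frames."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                              # open a candidate ORF
            elif start is not None and codon in STOPS:
                best = max(best, (i - start) // 3)     # codons before the stop
                start = None
    return best


def is_lncrna_candidate(seq, min_len=200, max_orf_aa=100):
    """Keep transcripts that are long enough and lack a long ORF."""
    return len(seq) >= min_len and longest_orf_aa(seq) <= max_orf_aa


# Hypothetical toy transcripts.
too_short = "ATG" + "AAA" * 10 + "TAA"        # 36 nt
coding_like = "ATG" + "GCT" * 150 + "TAA"     # 151-codon ORF
noncoding_like = "ACGT" * 60                  # 240 nt, no ATG in any frame

for name, seq in [("too_short", too_short),
                  ("coding_like", coding_like),
                  ("noncoding_like", noncoding_like)]:
    print(name, is_lncrna_candidate(seq))
```

Varying `max_orf_aa` between 83 and 100 mirrors the range of stringency mentioned in the abstract; a real pipeline would also scan the reverse strand and apply the remaining filters.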
Distribution in microbial genomes of genes similar to lodA and goxA which encode a novel family of quinoproteins with amino acid oxidase activity.

PubMed

Campillo-Brocal, Jonatan C; Chacón-Verdú, María Dolores; Lucas-Elío, Patricia; Sánchez-Amat, Antonio

2015-03-24

L-Amino acid oxidases (LAOs) have been generally described as flavoproteins that oxidize amino acids releasing the corresponding ketoacid, ammonium and hydrogen peroxide. The generation of hydrogen peroxide gives these enzymes antimicrobial characteristics. They are involved in processes such as biofilm development and microbial competition. LAOs are of great biotechnological interest in different applications such as the design of biosensors, biotransformations and biomedicine. The marine bacterium Marinomonas mediterranea synthesizes LodA, the first known LAO that contains a quinone cofactor. LodA is encoded in an operon that contains a second gene coding for LodB, a protein required for the post-translational modification generating the cofactor. Recently, GoxA, a quinoprotein with sequence similarity to LodA but with a different enzymatic activity (glycine oxidase instead of lysine-ε-oxidase) has been described. The aim of this work has been to study the distribution of genes similar to lodA and/or goxA in sequenced microbial genomes and to gain insight into the evolution of this novel family of proteins through phylogenetic analysis. Genes encoding LodA-like proteins have been detected in several bacterial classes. However, they are absent in Archaea and detected only in a small group of fungi of the class Agaricomycetes. The vast majority of the genes detected are in a genome region with a nearby lodB-like gene, suggesting a specific interaction between both partner proteins. Sequence alignment of the LodA-like proteins allowed the detection of several conserved residues. All of them showed a Cys and a Trp that aligned with the residues that form part of the cysteine tryptophylquinone (CTQ) cofactor in LodA. Phylogenetic analysis revealed that LodA-like proteins can be clustered in different groups. Interestingly, LodA and GoxA are in different groups, indicating that those groups are related to the enzymatic activity of the proteins detected.
Genome mining has revealed for the first time the broad distribution of LodA-like proteins containing a CTQ cofactor in many different microbial groups. This study provides a platform to explore the potentially novel enzymatic activities of the proteins detected, the mechanisms of post-translational modifications involved in their synthesis, as well as their biological relevance.
Hox proteins activate the IGFBP-1 promoter and suppress the function of hPR in human endometrial cells.

PubMed

Gao, Jiaguo; Mazella, James; Tseng, Linda

2002-11-01

Previous studies have shown that progestin activates the transcription of IGFBP-1 (insulin-like growth factor binding protein-1). Four regions in the IGFBP-1 promoter have been identified to enhance the transcription. Two of the regions, located at -73 to -65 bp and -319 to -311 bp, formed identical DNA-protein complexes with the nuclear extracts of endometrial stromal/decidual cells. To identify the binding protein(s) in endometrial cells that interact with these two regions, we have used the TGTCAATTA repeats (-319 to -311 bp of the IGFBP-1 promoter) to screen the human decidual cDNA library by yeast one-hybrid system. We found that HoxA10, HoxA11, HoxB2, HoxB4, and HoxD11 interacted with the TGTCAATTA repeats in yeast cells. Among these Hox genes, the full-length coding regions of HoxA10, HoxA11, and HoxB4 were used for functional analysis in three types of endometrial cells: undifferentiated endometrial stromal cells, decidual cells (differentiated stromal cells) and an endometrial adenocarcinoma cell line (HEC-1B). All these endometrial cells produce IGFBP-1. Transient transfection assay showed that HoxA10 expression vector increased the promoter activity (the IGFBP-1 proximal promoter containing TGC/TCAATTA and two functional PRE sites) in endometrial stromal cells and in HEC-1B cells, but not in decidual cells. HoxB4 enhanced the promoter activity only in decidual cells, while HoxA11 had no apparent effect in any of the three types of cells. To evaluate whether Hox proteins would interact with progesterone receptor (hPR), cells were transfected with the promoter construct, Hox and hPR expression vectors. hPR alone activated the IGFBP-1 promoter activity, but expression of Hox gene suppressed the activation. Hox proteins also suppressed the hPR enhanced promoter activities of MMTV (containing consensus-PRE sites) and glycodelin (GdA, containing Sp1 site which mediates the hPR function).
These data showed that Hox genes selectively activate the transcription of the IGFBP-1 and GdA genes in different types of endometrial cells. Hox genes, however, suppress the hPR enhanced activities. In addition, we found that HoxB4 expression was induced by estrogen and progestin. Other investigators have shown that HoxA10 and HoxA11 were stimulated by progestin. These findings show that Hox proteins are molecular mediators of the steroid hormones during endometrial cell development.
Bacillus anthracis genome organization in light of whole transcriptome sequencing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Martin, Jeffrey; Zhu, Wenhan; Passalacqua, Karla D.

2010-03-22

Emerging knowledge of whole prokaryotic transcriptomes could validate a number of theoretical concepts introduced in the early days of genomics. What are the rules connecting gene expression levels with sequence determinants such as quantitative scores of promoters and terminators? Are translation efficiency measures, e.g. codon adaptation index and RBS score, related to gene expression? We used the whole transcriptome shotgun sequencing of a bacterial pathogen Bacillus anthracis to assess correlation of gene expression level with promoter, terminator and RBS scores, codon adaptation index, as well as with a new measure of gene translational efficiency, average translation speed. We compared computational predictions of operon topologies with the transcript borders inferred from RNA-Seq reads. Transcriptome mapping may also improve existing gene annotation. Upon assessment of accuracy of current annotation of protein-coding genes in the B. anthracis genome we have shown that the transcriptome data indicate existence of more than a hundred genes missing in the annotation though predicted by an ab initio gene finder. Interestingly, we observed that many pseudogenes possess not only a sequence with detectable coding potential but also promoters that maintain transcriptional activity.
Identification of functional elements and regulatory circuits by Drosophila modENCODE

DOE Office of Scientific and Technical Information (OSTI.GOV)

Roy, Sushmita; Ernst, Jason; Kharchenko, Peter V.

2010-12-22

To gain insight into how genomic information is translated into cellular and developmental programs, the Drosophila model organism Encyclopedia of DNA Elements (modENCODE) project is comprehensively mapping transcripts, histone modifications, chromosomal proteins, transcription factors, replication proteins and intermediates, and nucleosome properties across a developmental time course and in multiple cell lines. We have generated more than 700 data sets and discovered protein-coding, noncoding, RNA regulatory, replication, and chromatin elements, more than tripling the annotated portion of the Drosophila genome. Correlated activity patterns of these elements reveal a functional regulatory network, which predicts putative new functions for genes, reveals stage- and tissue-specific regulators, and enables gene-expression prediction. Our results provide a foundation for directed experimental and computational studies in Drosophila and related species and also a model for systematic data integration toward comprehensive genomic and functional annotation. Several years after the complete genetic sequencing of many species, it is still unclear how to translate genomic information into a functional map of cellular and developmental programs. The Encyclopedia of DNA Elements (ENCODE) (1) and model organism ENCODE (modENCODE) (2) projects use diverse genomic assays to comprehensively annotate the Homo sapiens (human), Drosophila melanogaster (fruit fly), and Caenorhabditis elegans (worm) genomes, through systematic generation and computational integration of functional genomic data sets. Previous genomic studies in flies have made seminal contributions to our understanding of basic biological mechanisms and genome functions, facilitated by genetic, experimental, computational, and manual annotation of the euchromatic and heterochromatic genome (3), small genome size, short life cycle, and a deep knowledge of development, gene function, and chromosome biology.
The functions of ~40% of the protein and nonprotein-coding genes [FlyBase 5.12 (4)] have been determined from cDNA collections (5, 6), manual curation of gene models (7), gene mutations and comprehensive genome-wide RNA interference screens (8-10), and comparative genomic analyses (11, 12). The Drosophila modENCODE project has generated more than 700 data sets that profile transcripts, histone modifications and physical nucleosome properties, general and specific transcription factors (TFs), and replication programs in cell lines, isolated tissues, and whole organisms across several developmental stages (Fig. 1). Here, we computationally integrate these data sets and report (i) improved and additional genome annotations, including full-length protein-coding genes and peptides as short as 21 amino acids; (ii) noncoding transcripts, including 132 candidate structural RNAs and 1608 nonstructural transcripts; (iii) additional Argonaute (Ago)-associated small RNA genes and pathways, including new microRNAs (miRNAs) encoded within protein-coding exons and endogenous small interfering RNAs (siRNAs) from 3' untranslated regions; (iv) chromatin 'states' defined by combinatorial patterns of 18 chromatin marks that are associated with distinct functions and properties; (v) regions of high TF occupancy and replication activity with likely epigenetic regulation; (vi) mixed TF and miRNA regulatory networks with hierarchical structure and enriched feed-forward loops; (vii) coexpression- and co-regulation-based functional annotations for nearly 3000 genes; (viii) stage- and tissue-specific regulators; and (ix) predictive models of gene expression levels and regulator function.
Long non-coding RNAs involved in autophagy regulation

PubMed Central

Yang, Lixian; Wang, Hanying; Shen, Qi; Feng, Lifeng; Jin, Hongchuan

2017-01-01

Autophagy degrades non-functioning or damaged proteins and organelles to maintain cellular homeostasis in a physiological or pathological context. Autophagy can be protective or detrimental, depending on its activation status and other conditions. Therefore, autophagy has a crucial role in a myriad of pathophysiological processes. From the perspective of autophagy-related (ATG) genes, the molecular dissection of autophagy process and the regulation of its level have been largely unraveled. However, the discovery of long non-coding RNAs (lncRNAs) provides a new paradigm of gene regulation in almost all important biological processes, including autophagy. In this review, we highlight recent advances in autophagy-associated lncRNAs and their specific autophagic targets, as well as their relevance to human diseases such as cancer, cardiovascular disease, diabetes and cerebral ischemic stroke. PMID:28981093
The Bacillus thuringiensis cyt Genes for Hemolytic Endotoxins Constitute a Gene Family

PubMed Central

Guerchicoff, Alejandra; Delécluse, Armelle; Rubinstein, Clara P.

2001-01-01

In the same way that cry genes, coding for larvicidal delta endotoxins, constitute a large and diverse gene family, the cyt genes for hemolytic toxins seem to compose another set of highly related genes in Bacillus thuringiensis. Although the occurrence of Cyt hemolytic factors in B. thuringiensis has been typically associated with mosquitocidal strains, we have recently shown that cyt genes are also present in strains with different pathotypes; this is the case for the morrisoni subspecies, which includes strains biologically active against dipteran, lepidopteran, and coleopteran larvae. In addition, while one Cyt type of protein has been described in all of the mosquitocidal strains studied so far, the present study confirms that at least two Cyt toxins coexist in the more toxic antidipteran strains, such as B. thuringiensis subsp. israelensis and subsp. morrisoni PG14, and that this could also be the case for many others. In fact, PCR screening and Western blot analysis of 50 B. thuringiensis strains revealed that cyt2-related genes are present in all strains with known antidipteran activity, as well as in some others with different or unknown host ranges. Partial DNA sequences for several of these genes were determined, and protein sequence alignments revealed a high degree of conservation of the structural domains. These findings point to an important biological role for Cyt toxins in the final in vivo toxic activity of many B. thuringiensis strains. PMID:11229896
The complete mitochondrial genome of the Giant Manta ray, Manta birostris.

PubMed

Hinojosa-Alvarez, Silvia; Díaz-Jaimes, Pindaro; Marcet-Houben, Marina; Gabaldón, Toni

2015-01-01

The complete mitochondrial genome of the giant manta ray (Manta birostris) consists of 18,075 bp with rich A + T and low G content. Gene organization and length are similar to those of other species of ray. It comprises 13 protein-coding genes, 2 rRNA genes, 23 tRNA genes and 1 non-coding sequence, and the control region. We identified an AT tandem repeat region, similar to that reported in Mobula japanica.
Expression and function of AtMBD4L, the single gene encoding the nuclear DNA glycosylase MBD4L in Arabidopsis.

PubMed

Nota, Florencia; Cambiagno, Damián A; Ribone, Pamela; Alvarez, María E

2015-06-01

DNA glycosylases recognize and excise damaged or incorrect bases from DNA, initiating the base excision repair (BER) pathway. Methyl-binding domain protein 4 (MBD4) is a member of the HhH-GPD DNA glycosylase superfamily, which has been well studied in mammals but not in plants. Our knowledge of the plant enzyme is limited to the activity of the Arabidopsis recombinant protein MBD4L in vitro. To start evaluating MBD4L in its biological context, we here characterized the structure, expression and effects of its gene, AtMBD4L. Phylogenetic analysis indicated that AtMBD4L belongs to one of the seven families of HhH-GPD DNA glycosylase genes existing in plants, and is unique in its family. Two AtMBD4L transcripts coding for active enzymes were detected in leaves and flowers. Transgenic plants expressing the AtMBD4L:GUS gene confined GUS activity to perivascular leaf tissues (usually adjacent to hydathodes), flowers (anthers at particular stages of development), and the apex of immature siliques. MBD4L-GFP fusion proteins showed nuclear localization in planta. Interestingly, overexpression of the full length MBD4L, but not a truncated enzyme lacking the DNA glycosylase domain, induced the BER gene LIG1 and enhanced tolerance to oxidative stress. These results suggest that endogenous MBD4L acts on particular tissues, is capable of activating BER, and may contribute to repair of DNA damage caused by oxidative stress. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Origins of Genes: "Big Bang" or Continuous Creation?

NASA Astrophysics Data System (ADS)

Keese, Paul K.; Gibbs, Adrian

1992-10-01

Many protein families are common to all cellular organisms, indicating that many genes have ancient origins. Genetic variation is mostly attributed to processes such as mutation, duplication, and rearrangement of ancient modules. Thus it is widely assumed that much of present-day genetic diversity can be traced by common ancestry to a molecular "big bang." A rarely considered alternative is that proteins may arise continuously de novo. One mechanism of generating different coding sequences is by "overprinting," in which an existing nucleotide sequence is translated de novo in a different reading frame or from noncoding open reading frames. The clearest evidence for overprinting is provided when the original gene function is retained, as in overlapping genes. Analysis of their phylogenies indicates which are the original genes and which are their informationally novel partners. We report here the phylogenetic relationships of overlapping coding sequences from steroid-related receptor genes and from tymovirus, luteovirus, and lentivirus genomes. For each pair of overlapping coding sequences, one is confined to a single lineage, whereas the other is more widespread. This suggests that the phylogenetically restricted coding sequence arose only in the progenitor of that lineage by translating an out-of-frame sequence to yield the new polypeptide. The production of novel exons by alternative splicing in thyroid receptor and lentivirus genes suggests that introns can be a valuable evolutionary source for overprinting. New genes and their products may drive major evolutionary changes.
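The idea of overprinting, reading one nucleotide sequence in more than one frame, can be made concrete with a short Python sketch. The sequence below is invented for illustration and is not taken from any of the genes analysed in the paper; the codon table is the standard genetic code, and the helper name `translate` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str, frame: int = 0) -> str:
    """Translate a DNA sequence in the given forward frame (0, 1 or 2).
    Stop codons are shown as '*'; trailing partial codons are ignored."""
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# Toy sequence (an assumption, made up for this example).
toy = "ATGGCATGCAAATAG"

for frame in range(3):
    print(frame, translate(toy, frame))
```

In this toy case the same 15 nucleotides give unrelated peptides in each frame, and frame 2 contains its own internal ATG, which is the kind of out-of-frame start that an overprinted gene would use.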

Biodegradation of DDT by Stenotrophomonas sp. DDT-1: Characterization and genome functional analysis

NASA Astrophysics Data System (ADS)

Pan, Xiong; Lin, Dunli; Zheng, Yuan; Zhang, Qian; Yin, Yuanming; Cai, Lin; Fang, Hua; Yu, Yunlong

2016-02-01

A novel bacterium capable of utilizing 1,1,1-trichloro-2,2-bis(p-chlorophenyl)ethane (DDT) as the sole carbon and energy source was isolated from a contaminated soil and identified as Stenotrophomonas sp. DDT-1 based on morphological characteristics, BIOLOG GN2 microplate profile, and 16S rDNA phylogeny. Genome sequencing and functional annotation of the isolate DDT-1 showed a 4,514,569 bp genome size, 66.92% GC content, 4,033 protein-coding genes, and 76 RNA genes including 8 rRNA genes. In total, 2,807 protein-coding genes were assigned to Clusters of Orthologous Groups (COGs), and 1,601 protein-coding genes were mapped to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The degradation half-lives of DDT increased with substrate concentration from 0.1 to 10.0 mg/l, whereas they decreased with temperature from 15 °C to 35 °C. Neutral conditions were the most favorable for DDT biodegradation. Based on genome annotation of DDT degradation genes and the metabolites detected by GC-MS, a mineralization pathway was proposed for DDT biodegradation in which it was sequentially converted into DDE/DDD, DDMU, DDOH, and DDA via dechlorination, hydroxylation, and carboxylation, and ultimately mineralized to carbon dioxide. The results indicate that the isolate DDT-1 is a promising bacterial resource for the removal or detoxification of DDT residues in the environment.
Biodegradation of DDT by Stenotrophomonas sp. DDT-1: Characterization and genome functional analysis.

PubMed

Pan, Xiong; Lin, Dunli; Zheng, Yuan; Zhang, Qian; Yin, Yuanming; Cai, Lin; Fang, Hua; Yu, Yunlong

2016-02-18

A novel bacterium capable of utilizing 1,1,1-trichloro-2,2-bis(p-chlorophenyl)ethane (DDT) as the sole carbon and energy source was isolated from a contaminated soil and identified as Stenotrophomonas sp. DDT-1 based on morphological characteristics, BIOLOG GN2 microplate profile, and 16S rDNA phylogeny. Genome sequencing and functional annotation of the isolate DDT-1 showed a 4,514,569 bp genome size, 66.92% GC content, 4,033 protein-coding genes, and 76 RNA genes including 8 rRNA genes. In total, 2,807 protein-coding genes were assigned to Clusters of Orthologous Groups (COGs), and 1,601 protein-coding genes were mapped to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The degradation half-lives of DDT increased with substrate concentration from 0.1 to 10.0 mg/l, whereas they decreased with temperature from 15 °C to 35 °C. Neutral conditions were the most favorable for DDT biodegradation. Based on genome annotation of DDT degradation genes and the metabolites detected by GC-MS, a mineralization pathway was proposed for DDT biodegradation in which it was sequentially converted into DDE/DDD, DDMU, DDOH, and DDA via dechlorination, hydroxylation, and carboxylation, and ultimately mineralized to carbon dioxide. The results indicate that the isolate DDT-1 is a promising bacterial resource for the removal or detoxification of DDT residues in the environment.
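The annotation figures quoted for isolate DDT-1 can be turned into a few summary ratios. The Python sketch below only re-derives proportions from the numbers in the abstract; the variable names are hypothetical and the gene density per megabase is a derived quantity that the abstract itself does not report.

```python
# Figures quoted in the abstract for Stenotrophomonas sp. DDT-1.
GENOME_SIZE_BP = 4_514_569
PROTEIN_CODING_GENES = 4_033
COG_ASSIGNED = 2_807
KEGG_MAPPED = 1_601


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of protein-coding genes with a COG assignment or a KEGG pathway mapping.
cog_pct = percent(COG_ASSIGNED, PROTEIN_CODING_GENES)
kegg_pct = percent(KEGG_MAPPED, PROTEIN_CODING_GENES)

# Derived quantity (not stated in the abstract): protein-coding genes per Mb.
genes_per_mb = PROTEIN_CODING_GENES / (GENOME_SIZE_BP / 1_000_000)

print(f"COG-assigned:  {cog_pct:.1f}% of protein-coding genes")
print(f"KEGG-mapped:   {kegg_pct:.1f}% of protein-coding genes")
print(f"Gene density:  {genes_per_mb:.1f} protein-coding genes per Mb")
```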
Glial cell line-derived neurotrophic factor protects against high-fat diet-induced hepatic steatosis by suppressing hepatic PPAR-γ expression.

PubMed

Mwangi, Simon Musyoka; Peng, Sophia; Nezami, Behtash Ghazi; Thorn, Natalie; Farris, Alton B; Jain, Sanjay; Laroui, Hamed; Merlin, Didier; Anania, Frank; Srinivasan, Shanthi

2016-01-15

Glial cell line-derived neurotrophic factor (GDNF) protects against high-fat diet (HFD)-induced hepatic steatosis in mice; however, the mechanisms involved are not known. In this study we investigated the effects of GDNF overexpression and nanoparticle delivery of GDNF in mice on hepatic steatosis and fibrosis and the expression of genes involved in the regulation of hepatic lipid uptake and de novo lipogenesis. Transgenic overexpression of GDNF in liver and other metabolically active tissues was protective against HFD-induced hepatic steatosis. Mice overexpressing GDNF had significantly reduced P62/sequestosome 1 protein levels suggestive of accelerated autophagic clearance. They also had significantly reduced peroxisome proliferator-activated receptor-γ (PPAR-γ) and CD36 gene expression and protein levels, and lower expression of mRNA coding for enzymes involved in de novo lipogenesis. GDNF-loaded nanoparticles were protective against short-term HFD-induced hepatic steatosis and attenuated liver fibrosis in mice with long-standing HFD-induced hepatic steatosis. They also suppressed the liver expression of steatosis-associated genes. In vitro, GDNF suppressed triglyceride accumulation in Hep G2 cells through enhanced p38 mitogen-activated protein kinase-dependent signaling and inhibition of PPAR-γ gene promoter activity. These results show that GDNF acts directly in the liver to protect against HFD-induced cellular stress and that GDNF may have a role in the treatment of nonalcoholic fatty liver disease.
CodingQuarry: highly accurate hidden Markov model gene prediction in fungal genomes using RNA-seq transcripts.

PubMed

Testa, Alison C; Hane, James K; Ellwood, Simon R; Oliver, Richard P

2015-03-11

The impact of gene annotation quality on functional and comparative genomics makes gene prediction an important process, particularly in non-model species, including many fungi. Sets of homologous protein sequences are rarely complete with respect to the fungal species of interest and are often small or unreliable, especially when closely related species have not been sequenced or annotated in detail. In these cases, protein homology-based evidence fails to correctly annotate many genes, or significantly improve ab initio predictions. Generalised hidden Markov models (GHMM) have proven to be invaluable tools in gene annotation and, recently, RNA-seq has emerged as a cost-effective means to significantly improve the quality of automated gene annotation. As these methods do not require sets of homologous proteins, improving gene prediction from these resources is of benefit to fungal researchers. While many pipelines now incorporate RNA-seq data in training GHMMs, there has been relatively little investigation into additionally combining RNA-seq data at the point of prediction, and room for improvement in this area motivates this study. CodingQuarry is a highly accurate, self-training GHMM fungal gene predictor designed to work with assembled, aligned RNA-seq transcripts. RNA-seq data informs annotations both during gene-model training and in prediction. Our approach capitalises on the high quality of fungal transcript assemblies by incorporating predictions made directly from transcript sequences. Correct predictions are made despite transcript assembly problems, including those caused by overlap between the transcripts of adjacent gene loci. Stringent benchmarking against high-confidence annotation subsets showed CodingQuarry predicted 91.3% of Schizosaccharomyces pombe genes and 90.4% of Saccharomyces cerevisiae genes perfectly. These results are 4-5% better than those of AUGUSTUS, the next best performing RNA-seq driven gene predictor tested. 
Comparisons against whole genome Sc. pombe and S. cerevisiae annotations further substantiate a 4-5% improvement in the number of correctly predicted genes. We demonstrate the success of a novel method of incorporating RNA-seq data into GHMM fungal gene prediction. This shows that a high quality annotation can be achieved without relying on protein homology or a training set of genes. CodingQuarry is freely available ( https://sourceforge.net/projects/codingquarry/ ), and suitable for incorporation into genome annotation pipelines.
Regulation of cellulase expression, sporulation, and morphogenesis by velvet family proteins in Trichoderma reesei.

PubMed

Liu, Kuimei; Dong, Yanmei; Wang, Fangzhong; Jiang, Baojie; Wang, Mingyu; Fang, Xu

2016-01-01

Homologs of the velvet protein family are encoded by the ve1, vel2, and vel3 genes in Trichoderma reesei. To test their regulatory functions, the velvet protein-coding genes were disrupted, generating Δve1, Δvel2, and Δvel3 strains. The phenotypic features of these strains were examined to identify their functions in morphogenesis, sporulation, and cellulase expression. The three velvet-deficient strains produced more hyphal branches, indicating that velvet family proteins participate in morphogenesis in T. reesei. Deletion of ve1 and vel3 did not affect biomass accumulation, while deletion of vel2 led to significantly hampered growth when cellulose was used as the sole carbon source in the medium. The deletion of either ve1 or vel2 led to a sharp decrease in sporulation as well as a global downregulation of cellulase-coding genes. In contrast, although the expression of cellulase-coding genes in the Δvel3 strain was downregulated in the dark, their expression under light conditions was unaffected. Sporulation was hampered in the Δvel3 strain. These results suggest that Ve1 and Vel2 play major roles, whereas Vel3 plays a minor role, in sporulation, morphogenesis, and cellulase expression.
Rare and Coding Region Genetic Variants Associated With Risk of Ischemic Stroke: The NHLBI Exome Sequence Project.

PubMed

Auer, Paul L; Nalls, Mike; Meschia, James F; Worrall, Bradford B; Longstreth, W T; Seshadri, Sudha; Kooperberg, Charles; Burger, Kathleen M; Carlson, Christopher S; Carty, Cara L; Chen, Wei-Min; Cupples, L Adrienne; DeStefano, Anita L; Fornage, Myriam; Hardy, John; Hsu, Li; Jackson, Rebecca D; Jarvik, Gail P; Kim, Daniel S; Lakshminarayan, Kamakshi; Lange, Leslie A; Manichaikul, Ani; Quinlan, Aaron R; Singleton, Andrew B; Thornton, Timothy A; Nickerson, Deborah A; Peters, Ulrike; Rich, Stephen S

2015-07-01

Stroke is the second leading cause of death and the third leading cause of years of life lost. Genetic factors contribute to stroke prevalence, and candidate gene and genome-wide association studies (GWAS) have identified variants associated with ischemic stroke risk. These variants often have small effects without obvious biological significance. Exome sequencing may discover predicted protein-altering variants with a potentially large effect on ischemic stroke risk. To investigate the contribution of rare and common genetic variants to ischemic stroke risk by targeting the protein-coding regions of the human genome. The National Heart, Lung, and Blood Institute (NHLBI) Exome Sequencing Project (ESP) analyzed approximately 6000 participants from numerous cohorts of European and African ancestry. For discovery, 365 cases of ischemic stroke (small-vessel and large-vessel subtypes) and 809 European ancestry controls were sequenced; for replication, 47 affected sibpairs concordant for stroke subtype and an African American case-control series were sequenced, with 1672 cases and 4509 European ancestry controls genotyped. The ESP's exome sequencing and genotyping started on January 1, 2010, and continued through June 30, 2012. Analyses were conducted on the full data set between July 12, 2012, and July 13, 2013. Discovery of new variants or genes contributing to ischemic stroke risk and subtype (primary analysis) and determination of support for protein-coding variants contributing to risk in previously published candidate genes (secondary analysis). 
We identified 2 novel genes associated with an increased risk of ischemic stroke: a protein-coding variant in PDE4DIP (rs1778155; odds ratio, 2.15; P = 2.63 × 10^-8) with an intracellular signal transduction mechanism and in ACOT4 (rs35724886; odds ratio, 2.04; P = 1.24 × 10^-7) with a fatty acid metabolism mechanism; confirmation of PDE4DIP was observed in affected sibpair families with large-vessel stroke subtype and in African Americans. Replication of protein-coding variants in candidate genes was observed for 2 previously reported GWAS associations: ZFHX3 (cardioembolic stroke) and ABCA1 (large-vessel stroke). Exome sequencing discovered 2 novel genes and mechanisms, PDE4DIP and ACOT4, associated with increased risk for ischemic stroke. In addition, ZFHX3 and ABCA1 were discovered to have protein-coding variants associated with ischemic stroke. These results suggest that genetic variation in novel pathways contributes to ischemic stroke risk and serves as a target for prediction, prevention, and therapy.
Decoding sORF translation - from small proteins to gene regulation.

PubMed

Cabrera-Quio, Luis Enrique; Herberg, Sarah; Pauli, Andrea

2016-11-01

Translation is best known as the fundamental mechanism by which the ribosome converts a sequence of nucleotides into a string of amino acids. Extensive research over many years has elucidated the key principles of translation, and the majority of translated regions were thought to be known. The recent discovery of widespread translation outside of annotated protein-coding open reading frames (ORFs) therefore came as a surprise, raising the intriguing possibility that these newly discovered translated regions might have unrecognized protein-coding or gene-regulatory functions. Here, we highlight recent findings that provide evidence that some of these newly discovered translated short ORFs (sORFs) encode functional, previously missed small proteins, while others have regulatory roles. Based on known examples we will also speculate about putative additional roles and the potentially much wider impact that these translated regions might have on cellular homeostasis and gene regulation.
Deep developmental transcriptome sequencing uncovers numerous new genes and enhances gene annotation in the sponge Amphimedon queenslandica.

PubMed

Fernandez-Valverde, Selene L; Calcino, Andrew D; Degnan, Bernard M

2015-05-15

The demosponge Amphimedon queenslandica is amongst the few early-branching metazoans with an assembled and annotated draft genome, making it an important species in the study of the origin and early evolution of animals. Current gene models in this species are largely based on in silico predictions and low coverage expressed sequence tag (EST) evidence. Amphimedon queenslandica protein-coding gene models are improved using deep RNA-Seq data from four developmental stages and CEL-Seq data from 82 developmental samples. Over 86% of previously predicted genes are retained in the new gene models, although 24% have additional exons; there is also a marked increase in the total number of annotated 3' and 5' untranslated regions (UTRs). Importantly, these new developmental transcriptome data reveal numerous previously unannotated protein-coding genes in the Amphimedon genome, increasing the total gene number by 25%, from 30,060 to 40,122. In general, Amphimedon genes have introns that are markedly smaller than those in other animals and most of the alternatively spliced genes in Amphimedon undergo intron-retention; exon-skipping is the least common mode of alternative splicing. Finally, in addition to canonical polyadenylation signal sequences, Amphimedon genes are enriched in a number of unique AT-rich motifs in their 3' UTRs. The inclusion of developmental transcriptome data has substantially improved the structure and composition of protein-coding gene models in Amphimedon queenslandica, providing a more accurate and comprehensive set of genes for functional and comparative studies. These improvements reveal that the Amphimedon genome is comprised of a remarkably high number of tightly packed genes. These genes have small introns and there is pervasive intron retention amongst alternatively spliced transcripts. These aspects of the sponge genome are more similar to unicellular opisthokont genomes than to other animal genomes.
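The gene counts quoted above can be checked with simple arithmetic. The Python sketch below uses only the two totals given in the abstract; the variable names are hypothetical. It shows that the newly added genes amount to roughly a third of the previous count and roughly a quarter of the new total, so the quoted 25% appears to be expressed relative to the new total.

```python
# Gene totals quoted in the abstract.
OLD_GENE_COUNT = 30_060
NEW_GENE_COUNT = 40_122

added_genes = NEW_GENE_COUNT - OLD_GENE_COUNT

# Increase expressed relative to the previous annotation.
increase_vs_old_pct = 100.0 * added_genes / OLD_GENE_COUNT

# Newly added genes as a share of the updated annotation.
share_of_new_pct = 100.0 * added_genes / NEW_GENE_COUNT

print(f"Added genes: {added_genes}")
print(f"Relative to old total: {increase_vs_old_pct:.1f}%")
print(f"Share of new total:    {share_of_new_pct:.1f}%")
```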
The mitochondrial genomes of the acoelomorph worms Paratomella rubra, Isodiametra pulchra and Archaphanostoma ylvae.

PubMed

Robertson, Helen E; Lapraz, François; Egger, Bernhard; Telford, Maximilian J; Schiffer, Philipp H

2017-05-12

Acoels are small, ubiquitous - but understudied - marine worms with a very simple body plan. Their internal phylogeny is still not fully resolved, and the position of their proposed phylum Xenacoelomorpha remains debated. Here we describe mitochondrial genome sequences from the acoels Paratomella rubra and Isodiametra pulchra, and the complete mitochondrial genome of the acoel Archaphanostoma ylvae. The P. rubra and A. ylvae sequences are typical for metazoans in size and gene content. The larger I. pulchra mitochondrial genome contains both ribosomal genes, 21 tRNAs, but only 11 protein-coding genes. We find evidence suggesting a duplicated sequence in the I. pulchra mitochondrial genome. The P. rubra, I. pulchra and A. ylvae mitochondria have a unique genome organisation in comparison to other metazoan mitochondrial genomes. We found a large degree of protein-coding gene and tRNA overlap with little non-coding sequence in the compact P. rubra genome. Conversely, the A. ylvae and I. pulchra genomes have many long non-coding sequences between genes, likely driving genome size expansion in the latter. Phylogenetic trees inferred from mitochondrial genes retrieve Xenacoelomorpha as an early branching taxon in the deuterostomes. Sequence divergence analysis between P. rubra sampled in England and Spain indicates cryptic diversity.
CRISP-3, a protein with homology to plant defense proteins, is expressed in mouse B cells under the control of Oct2.

PubMed

Pfisterer, P; König, H; Hess, J; Lipowsky, G; Haendler, B; Schleuning, W D; Wirth, T

1996-11-01

The Oct2 transcription factor is expressed throughout the B-lymphoid lineage and plays an essential role during the terminal phase of B-cell differentiation. Several genes specifically expressed in B lymphocytes have been identified that contain a functional octamer motif in their regulatory elements. However, expression of only a single gene, the murine CD36 gene, has been shown to date to be dependent on Oct2. Here, we present the identification and characterization of a further gene, coding for cysteine-rich secreted protein 3 (CRISP-3), whose expression in B cells is regulated by Oct2. We show that CRISP-3 is expressed in the B-lymphoid lineage specifically at the pre-B-cell stage. By using different experimental strategies, including nuclear run-on experiments, we demonstrate that this gene is transcriptionally activated by Oct2. Furthermore, analysis of CRISP-3 expression in primary B cells derived from either wild-type or Oct2-deficient mice demonstrates the dependence on Oct2. Two variant octamer motifs were identified in the upstream promoter region of the crisp-3 gene, and Oct2 interacts with both of them in vitro. Cotransfection experiments with expression vectors for Oct1 and Oct2 together with a reporter driven by the crisp-3 promoter showed that transcriptional activation of this promoter can only be achieved with Oct2. The C-terminal transactivation domain of Oct2 is required for this activation. Finally, introducing specific mutations in the two variant octamer motifs revealed that both of them are important for full transcriptional activation by Oct2.
Analysis and recognition of 5′ UTR intron splice sites in human pre-mRNA

PubMed Central

Eden, E.; Brunak, S.

2004-01-01

Prediction of splice sites in non-coding regions of genes is one of the most challenging aspects of gene structure recognition. We perform a rigorous analysis of such splice sites embedded in human 5′ untranslated regions (UTRs), and investigate correlations between this class of splice sites and other features found in the adjacent exons and introns. By restricting the training of neural network algorithms to ‘pure’ UTRs (not extending partially into protein coding regions), we for the first time investigate the predictive power of the splicing signal proper, in contrast to conventional splice site prediction, which typically relies on the change in sequence at the transition from protein coding to non-coding. By doing so, the algorithms were able to pick up subtler splicing signals that were otherwise masked by ‘coding’ noise, thus enhancing significantly the prediction of 5′ UTR splice sites. For example, the non-coding splice site predicting networks pick up compositional and positional bias in the 3′ ends of non-coding exons and 5′ non-coding intron ends, where cytosine and guanine are over-represented. This compositional bias at the true UTR donor sites is also visible in the synaptic weights of the neural networks trained to identify UTR donor sites. Conventional splice site prediction methods perform poorly in UTRs because the reading frame pattern is absent. The NetUTR method presented here performs 2–3-fold better compared with NetGene2 and GenScan in 5′ UTRs. We also tested the 5′ UTR trained method on protein coding regions, and discovered, surprisingly, that it works quite well (although it cannot compete with NetGene2). This indicates that the local splicing pattern in UTRs and coding regions is largely the same. The NetUTR method is made publicly available at www.cbs.dtu.dk/services/NetUTR. PMID:14960723
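The compositional bias mentioned in this abstract (cytosine and guanine over-represented at the 3′ ends of non-coding exons) can be illustrated with a very small calculation. The sketch below is not NetUTR and does not reproduce its neural networks; the function name, the window size and the example sequences are all hypothetical and chosen only for illustration.

```python
def cg_fraction_3prime(seq: str, window: int = 10) -> float:
    """Return the fraction of C and G in the last `window` bases of `seq`.

    Hypothetical helper: `seq` is an exon sequence written 5'->3',
    so the slice taken here is its 3' end.
    """
    tail = seq[-window:].upper()
    if not tail:
        raise ValueError("sequence must not be empty")
    return (tail.count("C") + tail.count("G")) / len(tail)


# Made-up exon sequences, for illustration only.
example_exons = ["ATTTAGACCGGCGGCAG", "TTATAATGCATTTAAAG"]
for exon in example_exons:
    print(exon, round(cg_fraction_3prime(exon, window=8), 2))
```

Comparing such fractions between true donor-site exons and background sequence is one simple way to look for the kind of bias described; the published method learns it from data instead.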
Biodegradation of the Organic Disulfide 4,4′-Dithiodibutyric Acid by Rhodococcus spp.

PubMed Central

Khairy, Heba; Wübbeler, Jan Hendrik

2015-01-01

Four Rhodococcus spp. exhibited the ability to use 4,4′-dithiodibutyric acid (DTDB) as a sole carbon source for growth. The most important step for the production of a novel polythioester (PTE) using DTDB as a precursor substrate is the initial cleavage of DTDB. Thus, identification of the enzyme responsible for this step was mandatory. Because Rhodococcus erythropolis strain MI2 serves as a model organism for elucidation of the biodegradation of DTDB, it was used to identify the genes encoding the enzymes involved in DTDB utilization. To identify these genes, transposon mutagenesis of R. erythropolis MI2 was carried out using transposon pTNR-TA. Among 3,261 mutants screened, 8 showed no growth with DTDB as the sole carbon source. In five mutants, the insertion locus was mapped either within a gene coding for a polysaccharide deacetyltransferase, a putative ATPase, or an acetyl coenzyme A transferase, 1 bp upstream of a gene coding for a putative methylase, or 176 bp downstream of a gene coding for a putative kinase. In another mutant, the insertion was localized between genes encoding a putative transcriptional regulator of the TetR family (noxR) and an NADH:flavin oxidoreductase (nox). Moreover, in two other mutants, the insertion loci were mapped within a gene encoding a hypothetical protein in the vicinity of noxR and nox. The interruption mutant generated, R. erythropolis MI2 noxΩtsr, was unable to grow with DTDB as the sole carbon source. Subsequently, nox was overexpressed and purified, and its activity with DTDB was measured. The specific enzyme activity of Nox amounted to 1.2 ± 0.15 U/mg. Therefore, we propose that Nox is responsible for the initial cleavage of DTDB into 2 molecules of 4-mercaptobutyric acid (4MB). PMID:26407888
Functional characterization of the Dsc E3 ligase complex in the citrus postharvest pathogen Penicillium digitatum.

PubMed

Ruan, Ruoxin; Chung, Kuang-Ren; Li, Hongye

2017-12-01

Sterol regulatory element binding proteins (SREBPs) are required for sterol homeostasis in eukaryotes. Activation of SREBPs is regulated by the Dsc E3 ligase complex in Schizosaccharomyces pombe and Aspergillus spp. Previous studies indicated that an SREBP-coding gene, PdsreA, is required for fungicide resistance and ergosterol biosynthesis in the citrus postharvest pathogen Penicillium digitatum. In this study, five genes encoding the Dsc E3 ligase complex, designated PddscA, PddscB, PddscC, PddscD, and PddscE, were shown to be required for fungicide resistance, ergosterol biosynthesis and CoCl2 tolerance in P. digitatum. Each of the dsc genes was inactivated by targeted gene disruption and the resulting phenotypes were analyzed and compared. Genetic analysis reveals that, of the five Dsc complex components, PddscB is the core subunit gene in P. digitatum. Although the resultant dsc mutants were able to infect citrus fruit and induce maceration lesions like the wild type, the mutants rarely produced aerial mycelia on affected citrus fruit peels. P. digitatum Dsc proteins regulated not only the expression of genes involved in ergosterol biosynthesis but also that of PdsreA. Yeast two-hybrid assays revealed a direct interaction between the PdSreA protein and the Dsc proteins. Ectopic expression of the PdSreA N-terminus restored fungicide resistance in the dsc mutants. Our results provide important evidence to understand the mechanisms underlying SREBP activation and regulation of ergosterol biosynthesis in plant pathogenic fungi. Copyright © 2017 Elsevier GmbH. All rights reserved.
Avian sarcoma virus 17 carries the jun oncogene.

PubMed Central

Maki, Y; Bos, T J; Davis, C; Starbuck, M; Vogt, P K

1987-01-01

Biologically active molecular clones of avian sarcoma virus 17 (ASV 17) contain a replication-defective proviral genome of 3.5 kilobases (kb). The genome retains partial gag and env sequences, which flank a cell-derived putative oncogene of 0.93 kb, termed jun. The jun gene lacks preserved coding domains of tyrosine-specific protein kinases. It also shows no significant nucleic acid homology with other known oncogenes. The probable transformation-specific protein in ASV 17-transformed cells is a 55-kDa gag-jun fusion product. PMID:3033666
Inhibition of Escherichia coli viability by external guide sequences complementary to two essential genes

PubMed Central

McKinney, Jeffrey; Guerrier-Takada, Cecilia; Wesolowski, Donna; Altman, Sidney

2001-01-01

Narrow spectrum antimicrobial activity has been designed to reduce the expression of two essential genes, one coding for the protein subunit of RNase P (C5 protein) and one for gyrase (gyrase A). In both cases, external guide sequences (EGS) have been designed to complex with either mRNA. Using the EGS technology, the level of microbial viability is reduced to less than 10% of the wild-type strain. The EGSs are additive when used together and depend on the number of nucleotides paired when attacking gyrase A mRNA. In the case of gyrase A, three nucleotides unpaired out of a 15-mer EGS still favor complete inhibition by the EGS but five unpaired nucleotides do not. PMID:11381134
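The dependence on the number of paired nucleotides described above can be pictured by counting, position by position, how many bases of a guide fail to form a Watson-Crick pair with the target. The sketch below is a simplification under stated assumptions: the 15-nucleotide target is a made-up sequence (not the gyrase A mRNA), G-U wobble pairs are treated as unpaired, and the function names are hypothetical.

```python
# Watson-Crick partners for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the fully complementary strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def count_unpaired(guide: str, target: str) -> int:
    """Count guide positions that do not Watson-Crick pair with the target.

    Both strands are written 5'->3', so the guide is aligned
    against the reversed target (antiparallel pairing).
    """
    if len(guide) != len(target):
        raise ValueError("guide and target must have the same length")
    return sum(
        1 for g, t in zip(guide, reversed(target)) if COMPLEMENT.get(g) != t
    )


# Hypothetical 15-nt target site and a perfectly matched guide.
target = "AUGAGCGACCUUGCG"
perfect_guide = reverse_complement(target)

# Introduce three mismatches by copying the target base itself,
# which can never pair with itself under Watson-Crick rules.
mismatched = list(perfect_guide)
for i in (2, 7, 12):
    mismatched[i] = target[::-1][i]
mismatched_guide = "".join(mismatched)

print(count_unpaired(perfect_guide, target))
print(count_unpaired(mismatched_guide, target))
```

The counts alone say nothing about inhibition; the thresholds reported in the abstract (three versus five unpaired nucleotides in a 15-mer) come from the experiments, not from a calculation like this one.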
A novel familial mutation in the PCSK1 gene that alters the oxyanion hole residue of proprotein convertase 1/3 and impairs its enzymatic activity.

PubMed

Wilschanski, Michael; Abbasi, Montaser; Blanco, Elias; Lindberg, Iris; Yourshaw, Michael; Zangen, David; Berger, Itai; Shteyer, Eyal; Pappo, Orit; Bar-Oz, Benjamin; Martín, Martin G; Elpeleg, Orly

2014-01-01

Four siblings presented with congenital diarrhea and various endocrinopathies. Exome sequencing and homozygosity mapping identified five regions, comprising 337 protein-coding genes that were shared by three affected siblings. Exome sequencing identified a novel homozygous N309K mutation in the proprotein convertase subtilisin/kexin type 1 (PCSK1) gene, encoding the neuroendocrine convertase 1 precursor (PC1/3) which was recently reported as a cause of Congenital Diarrhea Disorder (CDD). The PCSK1 mutation affected the oxyanion hole transition state-stabilizing amino acid within the active site, which is critical for appropriate proprotein maturation and enzyme activity. Unexpectedly, the N309K mutant protein exhibited normal, though slowed, prodomain removal and was secreted from both HEK293 and Neuro2A cells. However, the secreted enzyme showed no catalytic activity, and was not processed into the 66 kDa form. We conclude that the N309K enzyme is able to cleave its own propeptide but is catalytically inert against in trans substrates, and that this variant accounts for the enteric and systemic endocrinopathies seen in this large consanguineous kindred.
A Novel Familial Mutation in the PCSK1 Gene That Alters the Oxyanion Hole Residue of Proprotein Convertase 1/3 and Impairs Its Enzymatic Activity

PubMed Central

Wilschanski, Michael; Abbasi, Montaser; Blanco, Elias; Lindberg, Iris; Yourshaw, Michael; Zangen, David; Berger, Itai; Shteyer, Eyal; Pappo, Orit; Bar-Oz, Benjamin; Martín, Martin G.; Elpeleg, Orly

2014-01-01

Four siblings presented with congenital diarrhea and various endocrinopathies. Exome sequencing and homozygosity mapping identified five regions, comprising 337 protein-coding genes that were shared by three affected siblings. Exome sequencing identified a novel homozygous N309K mutation in the proprotein convertase subtilisin/kexin type 1 (PCSK1) gene, encoding the neuroendocrine convertase 1 precursor (PC1/3) which was recently reported as a cause of Congenital Diarrhea Disorder (CDD). The PCSK1 mutation affected the oxyanion hole transition state-stabilizing amino acid within the active site, which is critical for appropriate proprotein maturation and enzyme activity. Unexpectedly, the N309K mutant protein exhibited normal, though slowed, prodomain removal and was secreted from both HEK293 and Neuro2A cells. However, the secreted enzyme showed no catalytic activity, and was not processed into the 66 kDa form. We conclude that the N309K enzyme is able to cleave its own propeptide but is catalytically inert against in trans substrates, and that this variant accounts for the enteric and systemic endocrinopathies seen in this large consanguineous kindred. PMID:25272002
Development of a bioluminescence resonance energy transfer (BRET) for monitoring estrogen receptor alpha activation

NASA Astrophysics Data System (ADS)

Michelini, Elisa; Mirasoli, Mara; Karp, Matti; Virta, Marko; Roda, Aldo

2004-06-01

Estrogen receptor (ER) is a ligand-activated transcription factor that dimerizes after activation and binds specific DNA sequences (estrogen response elements), thus activating target gene transcription. Since ER homo- and hetero-dimerization (giving α-α and α-β isoforms) is a fundamental step for receptor activation, we developed an assay for detecting compounds that induce human ERα homo-dimerization based on bioluminescence resonance energy transfer (BRET). BRET is a non-radiative energy transfer, occurring between a luminescent donor and a fluorescent acceptor, that strictly depends on the closeness between the two proteins and can therefore be used for studying protein-protein interactions. We cloned the ERα coding sequence in frame with either a variant of the green fluorescent protein (enhanced yellow fluorescent protein, EYFP) or Renilla luciferase (RLuc). Upon ERα homo-dimerization, the BRET process takes place in the presence of the RLuc substrate coelenterazine, resulting in EYFP emission at its characteristic wavelength. The ERα-RLuc and ERα-EYFP fusion proteins were cloned, then the occurrence of BRET in the presence of ERα activators was assayed both in vivo, within cells, and in vitro, with purified fusion proteins.
Recurrent and functional regulatory mutations in breast cancer.

PubMed

Rheinbay, Esther; Parasuraman, Prasanna; Grimsby, Jonna; Tiao, Grace; Engreitz, Jesse M; Kim, Jaegil; Lawrence, Michael S; Taylor-Weiner, Amaro; Rodriguez-Cuevas, Sergio; Rosenberg, Mara; Hess, Julian; Stewart, Chip; Maruvka, Yosef E; Stojanov, Petar; Cortes, Maria L; Seepo, Sara; Cibulskis, Carrie; Tracy, Adam; Pugh, Trevor J; Lee, Jesse; Zheng, Zongli; Ellisen, Leif W; Iafrate, A John; Boehm, Jesse S; Gabriel, Stacey B; Meyerson, Matthew; Golub, Todd R; Baselga, Jose; Hidalgo-Miranda, Alfredo; Shioda, Toshi; Bernards, Andre; Lander, Eric S; Getz, Gad

2017-07-06

Genomic analysis of tumours has led to the identification of hundreds of cancer genes on the basis of the presence of mutations in protein-coding regions. By contrast, much less is known about cancer-causing mutations in non-coding regions. Here we perform deep sequencing in 360 primary breast cancers and develop computational methods to identify significantly mutated promoters. Clear signals are found in the promoters of three genes. FOXA1, a known driver of hormone-receptor positive breast cancer, harbours a mutational hotspot in its promoter leading to overexpression through increased E2F binding. RMRP and NEAT1, two non-coding RNA genes, carry mutations that affect protein binding to their promoters and alter expression levels. Our study shows that promoter regions harbour recurrent mutations in cancer with functional consequences and that the mutations occur at similar frequencies as in coding regions. Power analyses indicate that more such regions remain to be discovered through deep sequencing of adequately sized cohorts of patients.
Exome chip meta-analysis identifies novel loci and East Asian-specific coding variants that contribute to lipid levels and coronary artery disease.

PubMed

Lu, Xiangfeng; Peloso, Gina M; Liu, Dajiang J; Wu, Ying; Zhang, He; Zhou, Wei; Li, Jun; Tang, Clara Sze-Man; Dorajoo, Rajkumar; Li, Huaixing; Long, Jirong; Guo, Xiuqing; Xu, Ming; Spracklen, Cassandra N; Chen, Yang; Liu, Xuezhen; Zhang, Yan; Khor, Chiea Chuen; Liu, Jianjun; Sun, Liang; Wang, Laiyuan; Gao, Yu-Tang; Hu, Yao; Yu, Kuai; Wang, Yiqin; Cheung, Chloe Yu Yan; Wang, Feijie; Huang, Jianfeng; Fan, Qiao; Cai, Qiuyin; Chen, Shufeng; Shi, Jinxiu; Yang, Xueli; Zhao, Wanting; Sheu, Wayne H-H; Cherny, Stacey Shawn; He, Meian; Feranil, Alan B; Adair, Linda S; Gordon-Larsen, Penny; Du, Shufa; Varma, Rohit; Chen, Yii-Der Ida; Shu, Xiao-Ou; Lam, Karen Siu Ling; Wong, Tien Yin; Ganesh, Santhi K; Mo, Zengnan; Hveem, Kristian; Fritsche, Lars G; Nielsen, Jonas Bille; Tse, Hung-Fat; Huo, Yong; Cheng, Ching-Yu; Chen, Y Eugene; Zheng, Wei; Tai, E Shyong; Gao, Wei; Lin, Xu; Huang, Wei; Abecasis, Goncalo; Kathiresan, Sekar; Mohlke, Karen L; Wu, Tangchun; Sham, Pak Chung; Gu, Dongfeng; Willer, Cristen J

2017-12-01

Most genome-wide association studies have been of European individuals, even though most genetic variation in humans is seen only in non-European samples. To search for novel loci associated with blood lipid levels and clarify the mechanism of action at previously identified lipid loci, we used an exome array to examine protein-coding genetic variants in 47,532 East Asian individuals. We identified 255 variants at 41 loci that reached chip-wide significance, including 3 novel loci and 14 East Asian-specific coding variant associations. After a meta-analysis including >300,000 European samples, we identified an additional nine novel loci. Sixteen genes were identified by protein-altering variants in both East Asians and Europeans, and thus are likely to be functional genes. Our data demonstrate that most of the low-frequency or rare coding variants associated with lipids are population specific, and that examining genomic data across diverse ancestries may facilitate the identification of functional genes at associated loci.

Exome chip meta-analysis identifies novel loci and East Asian-specific coding variants contributing to lipid levels and coronary artery disease

PubMed Central

Lu, Xiangfeng; Peloso, Gina M; Liu, Dajiang J.; Wu, Ying; Zhang, He; Zhou, Wei; Li, Jun; Tang, Clara Sze-man; Dorajoo, Rajkumar; Li, Huaixing; Long, Jirong; Guo, Xiuqing; Xu, Ming; Spracklen, Cassandra N.; Chen, Yang; Liu, Xuezhen; Zhang, Yan; Khor, Chiea Chuen; Liu, Jianjun; Sun, Liang; Wang, Laiyuan; Gao, Yu-Tang; Hu, Yao; Yu, Kuai; Wang, Yiqin; Cheung, Chloe Yu Yan; Wang, Feijie; Huang, Jianfeng; Fan, Qiao; Cai, Qiuyin; Chen, Shufeng; Shi, Jinxiu; Yang, Xueli; Zhao, Wanting; Sheu, Wayne H.-H.; Cherny, Stacey Shawn; He, Meian; Feranil, Alan B.; Adair, Linda S.; Gordon-Larsen, Penny; Du, Shufa; Varma, Rohit; da Chen, Yii-Der I; Shu, XiaoOu; Lam, Karen Siu Ling; Wong, Tien Yin; Ganesh, Santhi K.; Mo, Zengnan; Hveem, Kristian; Fritsche, Lars; Nielsen, Jonas Bille; Tse, Hung-fat; Huo, Yong; Cheng, Ching-Yu; Chen, Y. Eugene; Zheng, Wei; Tai, E Shyong; Gao, Wei; Lin, Xu; Huang, Wei; Abecasis, Goncalo; Consortium, GLGC; Kathiresan, Sekar; Mohlke, Karen L.; Wu, Tangchun; Sham, Pak Chung; Gu, Dongfeng; Willer, Cristen J

2017-01-01

Most genome-wide association studies have been conducted in European individuals, even though most genetic variation in humans is seen only in non-European samples. To search for novel loci associated with blood lipid levels and clarify the mechanism of action at previously identified lipid loci, we examined protein-coding genetic variants in 47,532 East Asian individuals using an exome array. We identified 255 variants at 41 loci reaching chip-wide significance, including 3 novel loci and 14 East Asian-specific coding variant associations. After meta-analysis with > 300,000 European samples, we identified an additional 9 novel loci. The same 16 genes were identified by the protein-altering variants in both East Asians and Europeans, likely pointing to the functional genes. Our data demonstrate that most of the low-frequency or rare coding variants associated with lipids are population-specific, and that examining genomic data across diverse ancestries may facilitate the identification of functional genes at associated loci. PMID:29083407
Homoeologous cloning of omega-secalin gene family in a wheat 1BL/1RS translocation.

PubMed

Chai, Jian Fang; Liu, Xu; Jia, Ji Zeng

2005-08-01

Wheat 1BL/1RS translocation lines are widely planted in China and in most wheat-producing areas of the world because of their disease resistance and high yield. However, 1BL/1RS translocations have poor bread-making quality, caused in part by a family of small monomeric proteins, omega-secalins, which are encoded by genes on 1RS. Based on the published sequence of a rye omega-secalin gene, we designed a pair of primers to cover the whole mature protein coding sequence. A major band could be amplified from 1BL/1RS translocations but not from euploid wheat. Using this primer set and high-fidelity Pfu polymerase, we conducted PCR amplification on genomic DNA and cDNA purified from the 1BL/1RS translocation line Lankao 906. Sequencing analysis indicated that this gene family contains several members of 1150 bp, 1076 bp, 1075 bp, 1052 bp and 1004 bp, including two pseudogenes and three active genes. The gene transcripts were differentially expressed in developing seeds.
Structural characteristics of ScBx genes controlling the biosynthesis of hydroxamic acids in rye (Secale cereale L.).

PubMed

Bakera, Beata; Makowska, Bogna; Groszyk, Jolanta; Niziołek, Michał; Orczyk, Wacław; Bolibok-Brągoszewska, Hanna; Hromada-Judycka, Aneta; Rakoczy-Trojanowska, Monika

2015-08-01

Benzoxazinoids (BX) are major secondary metabolites of gramineous plants that play an important role in disease resistance and allelopathy. They also have many other unique properties including anti-bacterial and anti-fungal activity, and the ability to reduce alpha-amylase activity. The biosynthesis and modification of BX are controlled by the genes Bx1 to Bx10, GT and glu, and the majority of these Bx genes have been mapped in maize, wheat and rye. However, the genetic basis of BX biosynthesis remains largely uncharacterized apart from some data from maize and wheat. The aim of this study was to isolate, sequence and characterize five genes (ScBx1, ScBx2, ScBx3, ScBx4 and ScBx5) encoding enzymes involved in the synthesis of DIBOA, an important defense compound of rye. Using a modified 3D procedure of BAC library screening, seven BAC clones containing all of the ScBx genes were isolated and sequenced. Bioinformatic analyses of the resulting contigs were used to examine the structure and other features of these genes, including their promoters, introns and 3'UTRs. Comparative analysis showed that the ScBx genes are similar to those of other Poaceae species, especially to the TaBx genes. The polymorphisms present both in the coding sequences and non-coding regions of ScBx in relation to other Bx genes are predicted to have an impact on the expression, structure and properties of the encoded proteins.
Alternative splicing and promoter use in TFII-I genes

PubMed Central

Makeyev, Aleksandr V.; Bayarsaihan, Dashzeveg

2008-01-01

TFII-I proteins are ubiquitously expressed transcription factors involved in both basal transcription and signal-induced transcriptional activation or repression. TFII-I proteins are detected as early as the two-cell stage, exhibit distinct and dynamic expression patterns in developing embryos, and mark regional variation in the adult mouse brain. Analysis of atypical small and rare chromosomal deletions at 7q11.23 points to TFII-I genes (GTF2I and GTF2IRD1) as the prime candidates responsible for craniofacial and cognitive abnormalities in the Williams-Beuren syndrome. TFII-I genes are often subjected to alternative splicing, which generates isoforms that show different activities and play distinct biological roles. The coding regions of TFII-I genes are composed of more than 30 exons and are well conserved among vertebrates. However, their 5′ untranslated regions are not as well conserved and are all poorly characterized. In the present work, we analyzed promoter regions of TFII-I genes and described their additional exons, as well as tested tissue specificity of both previously reported and novel alternatively spliced isoforms. Our comprehensive analysis leads to further elucidation of the functional heterogeneity of TFII-I proteins, provides hints for the search for regulatory pathways governing their expression, and opens up possibilities for examining the effect of different haplotypes on their promoter functions. PMID:19111598
Molecular cloning, expression and characterization of 100K gene of fowl adenovirus-4 for prevention and control of hydropericardium syndrome.

PubMed

Shah, M S; Ashraf, A; Khan, M I; Rahman, M; Habib, M; Qureshi, J A

2016-01-01

Fowl adenovirus-4 is an infectious agent causing hydropericardium syndrome in chickens. Adenoviruses are non-enveloped virions with linear, double-stranded DNA. The viral genome codes for a few structural and non-structural proteins. 100K is an important non-structural viral protein. The open reading frame for the coding sequence of the 100K protein was cloned with an oligohistidine tag and expressed in Escherichia coli as a fusion protein. The nucleotide sequence of the gene revealed that the 100K gene of FAdV-4 has high homology (98%) with the respective gene of FAdV-10. Recombinant 100K protein was expressed in E. coli and purified by nickel affinity chromatography. Immunization of chickens with recombinant 100K protein elicited significant serum antibody titers. However, a challenge protection test revealed that 100K protein conferred little protection (40%) to the immunized chickens against pathogenic viral challenge. It was concluded that the 100K gene is 2397 bp long and that the recombinant 100K protein has a molecular weight of 95 kDa. It was also found that the recombinant protein has little capacity to affect the immune response because, in spite of having an important role in intracellular transport and folding of viral capsid proteins during viral replication, it is not exposed on the surface of the virus at any stage. Copyright © 2015 The International Alliance for Biological Standardization. All rights reserved.
Identification and characterization of smallest pore-forming protein in the cell wall of pathogenic Corynebacterium urealyticum DSM 7109.

PubMed

Abdali, Narges; Younas, Farhan; Mafakheri, Samaneh; Pothula, Karunakar R; Kleinekathöfer, Ulrich; Tauch, Andreas; Benz, Roland

2018-05-09

Corynebacterium urealyticum, a pathogenic, multidrug resistant member of the mycolata, is known as a causative agent of urinary tract infections although it is a bacterium of the skin flora. This pathogenic bacterium shares with the mycolata the property of having an unusual cell envelope composition and architecture, typical for the genus Corynebacterium. The cell wall of members of the mycolata contains channel-forming proteins for the uptake of solutes. In this study, we provide novel information on the identification and characterization of a pore-forming protein in the cell wall of C. urealyticum DSM 7109. Detergent extracts of whole C. urealyticum cultures formed slightly cation-selective pores in lipid bilayer membranes, with a single-channel conductance of 1.75 nS in 1 M KCl. Experiments with different salts and non-electrolytes suggested that the cell wall pore of C. urealyticum is wide and water-filled and has a diameter of about 1.8 nm. Molecular modelling and dynamics simulations were performed to obtain a model of the pore. To find the gene coding for the cell wall pore of C. urealyticum, we searched the known genome of C. urealyticum for a chromosomal localization similar to that of the known porH and porA genes of other Corynebacterium strains. Three genes are located between the genes coding for GroEL2 and polyphosphate kinase (PKK2). Two of the genes (cur_1714 and cur_1715) were expressed in different constructs in C. glutamicum ΔporAΔporH and in porin-deficient BL21 DE3 Omp8 E. coli strains. The results suggested that the gene cur_1714 alone codes for the cell wall channel. The cell wall porin of C. urealyticum, termed PorACur, was purified to homogeneity using different biochemical methods and had an apparent molecular mass of about 4 kDa on tricine-containing sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE). Biophysical characterization of the purified protein (PorACur) indeed suggested that cur_1714 is the gene coding for the pore-forming protein in C. urealyticum, because in lipid bilayer experiments the protein formed the same pores as the detergent extract of whole cells. The study is the first report of a cell wall channel in the pathogenic C. urealyticum.
A multitasking Argonaute: exploring the many facets of C. elegans CSR-1.

PubMed

Wedeles, Christopher J; Wu, Monica Z; Claycomb, Julie M

2013-12-01

While initial studies of small RNA-mediated gene regulatory pathways focused on the cytoplasmic functions of such pathways, identifying roles for Argonaute/small RNA pathways in modulating chromatin and organizing the genome has become a topic of intense research in recent years. Nuclear regulatory mechanisms for Argonaute/small RNA pathways appear to be widespread, in organisms ranging from plants to fission yeast, Caenorhabditis elegans to humans. As the effectors of small RNA-mediated gene regulatory pathways, Argonaute proteins guide the chromatin-directed activities of these pathways. Of particular interest is the C. elegans Argonaute, chromosome segregation and RNAi deficient (CSR-1), which has been implicated in such diverse functions as organizing the holocentromeres of worm chromosomes, modulating germline chromatin, protecting the genome from foreign nucleic acid, regulating histone levels, executing RNAi, and inhibiting translation in conjunction with Pumilio proteins. CSR-1 interacts with small RNAs known as 22G-RNAs, which have complementarity to 25 % of the protein coding genes. This peculiar Argonaute is the only essential C. elegans Argonaute out of 24 family members in total. Here, we summarize the current understanding of CSR-1 functions in the worm, with emphasis on the chromatin-directed activities of this ever-intriguing Argonaute.
Functional genomic profiling of Aspergillus fumigatus biofilm reveals enhanced production of the mycotoxin gliotoxin.

PubMed

Bruns, Sandra; Seidler, Marc; Albrecht, Daniela; Salvenmoser, Stefanie; Remme, Nicole; Hertweck, Christian; Brakhage, Axel A; Kniemeyer, Olaf; Müller, Frank-Michael C

2010-09-01

The opportunistic pathogenic mold Aspergillus fumigatus is an increasing cause of morbidity and mortality in immunocompromised and in part immunocompetent patients. A. fumigatus can grow in multicellular communities by the formation of a hyphal network encased in an extracellular matrix. Here, we describe the proteome and transcriptome of planktonic- and biofilm-grown A. fumigatus mycelium after 24 and 48 h. A biofilm- and time-dependent regulation of many proteins and genes of the primary metabolism indicates a developmental stage of the young biofilm at 24 h, which demands energy. At a matured biofilm phase, metabolic activity seems to be reduced. However, genes, which code for hydrophobins, and proteins involved in the biosynthesis of secondary metabolites were significantly upregulated. In particular, proteins of the gliotoxin secondary metabolite gene cluster were induced in biofilm cultures. This was confirmed by real-time PCR and by detection of this immunologically active mycotoxin in culture supernatants using HPLC analysis. The enhanced production of gliotoxin by in vitro formed biofilms reported here may also play a significant role under in vivo conditions. It may confer A. fumigatus protection from the host immune system and also enable its survival and persistence in chronic lung infections such as aspergilloma.
Pseudoscorpion mitochondria show rearranged genes and genome-wide reductions of RNA gene sizes and inferred structures, yet typical nucleotide composition bias

PubMed Central

2012-01-01

Background Pseudoscorpions are chelicerates and have historically been viewed as being most closely related to solifuges, harvestmen, and scorpions. No mitochondrial genomes of pseudoscorpions have been published, but the mitochondrial genomes of some lineages of Chelicerata possess unusual features, including short rRNA genes and tRNA genes that lack sequence to encode arms of the canonical cloverleaf-shaped tRNA. Additionally, some chelicerates possess an atypical guanine-thymine nucleotide bias on the major coding strand of their mitochondrial genomes. Results We sequenced the mitochondrial genomes of two divergent taxa from the chelicerate order Pseudoscorpiones. We find that these genomes possess unusually short tRNA genes that do not encode cloverleaf-shaped tRNA structures. Indeed, in one genome, all 22 tRNA genes lack sequence to encode canonical cloverleaf structures. We also find that the large ribosomal RNA genes are substantially shorter than those of most arthropods. We inferred secondary structures of the LSU rRNAs from both pseudoscorpions, and find that they have lost multiple helices. Based on comparisons with the crystal structure of the bacterial ribosome, two of these helices were likely contact points with tRNA T-arms or D-arms as they pass through the ribosome during protein synthesis. The mitochondrial gene arrangements of both pseudoscorpions differ from the ancestral chelicerate gene arrangement. One genome is rearranged with respect to the location of protein-coding genes, the small rRNA gene, and at least 8 tRNA genes. The other genome contains 6 tRNA genes in novel locations. Most chelicerates with rearranged mitochondrial genes show a genome-wide reversal of the CA nucleotide bias typical for arthropods on their major coding strand, and instead possess a GT bias. Yet despite their extensive rearrangement, these pseudoscorpion mitochondrial genomes possess a CA bias on the major coding strand. 
Phylogenetic analyses of all 13 mitochondrial protein-coding gene sequences consistently yield trees that place pseudoscorpions as sister to acariform mites. Conclusion The well-supported phylogenetic placement of pseudoscorpions as sister to Acariformes differs from some previous analyses based on morphology. However, these two lineages share multiple molecular evolutionary traits, including substantial mitochondrial genome rearrangements, extensive nucleotide substitution, and loss of helices in their inferred tRNA and rRNA structures. PMID:22409411
Gene discovery in Eimeria tenella by immunoscreening cDNA expression libraries of sporozoites and schizonts with chicken intestinal antibodies.

PubMed

Réfega, Susana; Girard-Misguich, Fabienne; Bourdieu, Christiane; Péry, Pierre; Labbé, Marie

2003-04-02

Specific antibodies were produced ex vivo from intestinal culture of Eimeria tenella infected chickens. The specificity of these intestinal antibodies was tested against different parasite stages. These antibodies were used to immunoscreen first generation schizont and sporozoite cDNA libraries permitting the identification of new E. tenella antigens. We obtained a total of 119 cDNA clones which were subjected to sequence analysis. The sequences coding for the proteins inducing local immune responses were compared with nucleotide or protein databases and with expressed sequence tags (ESTs) databases. We identified new Eimeria genes coding for heat shock proteins, a ribosomal protein, a pyruvate kinase and a pyridoxine kinase. Specific features of other sequences are discussed.
Identification and expression of the tig gene coding for trigger factor from psychrophilic bacteria with no information of genome sequence available.

PubMed

Lee, Kyunghee; Choi, Hyojung; Im, Hana

2009-08-01

Trigger factor (TF) plays a key role as a molecular chaperone with a peptidyl-prolyl cis-trans isomerase (PPIase) activity by which cells promote folding of newly synthesized proteins coming out of ribosomes. Since psychrophilic bacteria grow at a quite low temperature, between 4 and 15 degrees C, TF from such bacteria was investigated and compared with that of mesophilic bacteria E. coli in order to offer an explanation of cold-adaptation at a molecular level. Using a combination of gradient PCRs with homologous primers and LA PCR in vitro cloning technology, the tig gene was fully identified from Psychromonas arctica, whose genome sequence is not yet available. The resulting amino acid sequence of the TF was compared with other homologous TFs using sequence alignments to search for common domains. In addition, we have developed a protein expression system, by which TF proteins from P. arctica (PaTF) were produced by IPTG induction upon cloning the tig gene on expression vectors, such as pAED4. We have further examined the role of expressed psychrophilic PaTF on survival against cold treatment at 4 degrees C. Finally, we have attempted the in vitro biochemical characterization of TF proteins with His-tags expressed in a pET system, such as the PPIase activity of PaTF protein. Our results demonstrate that the expressed PaTF proteins helped cells survive against cold environments in vivo and the purified PaTF in vitro display the functional PPIase activity in a concentration dependent manner.
The CanOE strategy: integrating genomic and metabolic contexts across multiple prokaryote genomes to find candidate genes for orphan enzymes.

PubMed

Smith, Adam Alexander Thil; Belda, Eugeni; Viari, Alain; Medigue, Claudine; Vallenet, David

2012-05-01

Of all biochemically characterized metabolic reactions formalized by the IUBMB, over one out of four have yet to be associated with a nucleic or protein sequence, i.e. are sequence-orphan enzymatic activities. Few bioinformatics annotation tools are able to propose candidate genes for such activities by exploiting context-dependent rather than sequence-dependent data, and none are readily accessible and propose result integration across multiple genomes. Here, we present CanOE (Candidate genes for Orphan Enzymes), a four-step bioinformatics strategy that proposes ranked candidate genes for sequence-orphan enzymatic activities (or orphan enzymes for short). The first step locates "genomic metabolons", i.e. groups of co-localized genes coding proteins catalyzing reactions linked by shared metabolites, in one genome at a time. These metabolons can be particularly helpful for aiding bioanalysts to visualize relevant metabolic data. In the second step, they are used to generate candidate associations between un-annotated genes and gene-less reactions. The third step integrates these gene-reaction associations over several genomes using gene families, and summarizes the strength of family-reaction associations by several scores. In the final step, these scores are used to rank members of gene families which are proposed for metabolic reactions. These associations are of particular interest when the metabolic reaction is a sequence-orphan enzymatic activity. Our strategy found over 60,000 genomic metabolons in more than 1,000 prokaryote organisms from the MicroScope platform, generating candidate genes for many metabolic reactions, of which more than 70 distinct orphan reactions. A computational validation of the approach is discussed. Finally, we present a case study on the anaerobic allantoin degradation pathway in Escherichia coli K-12.
The nearly complete mitochondrial genome of a stonefly species, Styloperla sp. (Plecoptera: Styloperlidae).

PubMed

Chen, Zhi-Teng; Wu, Hai-Yan; Du, Yu-Zhou

2016-07-01

We report the nearly complete mitochondrial genome of a stonefly species, Styloperla sp. (Plecoptera: Styloperlidae), which is a circular molecule of 15,416 bp in length and consists of 13 protein-coding genes, 2 ribosomal RNAs, 20 transfer RNAs and a partial control region (645 bp). Using the 13 protein-coding genes of 8 stoneflies and 3 other related species, we constructed a phylogenetic tree to verify the accuracy of the newly determined mitogenome sequences. Our results provide basic data for further study of phylogeny in Plecoptera.
The single IGF-1 partial deficiency is responsible for mitochondrial dysfunction and is restored by IGF-1 replacement therapy.

PubMed

Olleros Santos-Ruiz, M; Sádaba, M C; Martín-Estal, I; Muñoz, U; Sebal Neira, C; Castilla-Cortázar, I

2017-08-01

We previously described in cirrhosis and aging, both conditions of IGF-1 deficiency, a clear hepatic mitochondrial dysfunction with increased oxidative damage. In both conditions, the hepatic mitochondrial function was improved with low doses of IGF-1. The aim of this work was to explore whether partial IGF-1 deficiency alone, without any exogenous insult, is responsible for hepatic mitochondrial dysfunction. Heterozygous (igf1 +/-) mice were divided into two groups: untreated mice and mice treated with low doses of IGF-1. A WT group was used as controls. Parameters of hepatic mitochondrial function were determined by flow cytometry, antioxidant enzyme activities were determined by spectrophotometry, and electron transport chain enzyme levels were determined by immunohistochemistry and immunofluorescence analyses. Liver expression of genes coding for proteins involved in mitochondrial protection and apoptosis was studied by microarray analysis and RT-qPCR. Hz mice showed a significant reduction in hepatic mitochondrial membrane potential (MMP) and ATPase activity, and an increase in intramitochondrial free radical production and proton leak rates, compared to controls. These parameters were normalized by IGF-1 replacement therapy. No significant differences were found between groups in oxygen consumption and antioxidant enzyme activities, except for catalase, whose activity was increased in both Hz groups. Relevant genes coding for proteins involved in mitochondrial protection and survival were altered in the Hz group and reverted to normal in the Hz+IGF-1 group. Partial IGF-1 deficiency is per se associated with hepatic mitochondrial dysfunction that is sensitive to IGF-1 replacement therapy. Results in this work prove that IGF-1 is involved in hepatic mitochondrial protection, because it is able to reduce free radical production, oxidative damage and apoptosis. 
All these IGF-1 actions are mediated by the modulation of the expression of genes encoding cytoprotective and antiapoptotic proteins. Copyright © 2017. Published by Elsevier Ltd.
Assessment of the Antimicrobial Activity and the Entomocidal Potential of Bacillus thuringiensis Isolates from Algeria

PubMed Central

Djenane, Zahia; Nateche, Farida; Amziane, Meriam; Gomis-Cebolla, Joaquín; El-Aichar, Fairouz; Khorf, Hassiba; Ferré, Juan

2017-01-01

This work represents the first initiative to analyze the distribution of B. thuringiensis in Algeria and to evaluate the biological potential of the isolates. A total of 157 isolates were recovered, with at least one isolate in 94.4% of the samples. The highest Bt index was found in samples from rhizospheric soil (0.48) and from the Mediterranean area (0.44). Most isolates showed antifungal activity (98.5%), in contrast to the few that had antibacterial activity (29.9%). A high genetic diversity was made evident by the finding of many different crystal shapes and various combinations of shapes within a single isolate (in 58.4% of the isolates). Also, over 50% of the isolates harbored cry1, cry2, or cry9 genes, and 69.3% contained a vip3 gene. A good correlation between the presence of chitinase genes and antifungal activity was observed. More than half of the isolates with a broad spectrum of antifungal activity harbored both endochitinase and exochitinase genes. Interestingly, 15 isolates contained the two chitinase genes and all of the above cry family genes, with some of them harboring a vip3 gene as well. The combination of this large number of genes coding for entomopathogenic proteins suggests a putative wide range of entomotoxic activity. PMID:28406460
Proteomic Analysis and Identification of the Structural and Regulatory Proteins of the Rhodobacter capsulatus Gene Transfer Agent

PubMed Central

Chen, Frank; Spano, Anthony; Goodman, Benjamin E.; Blasier, Kiev R.; Sabat, Agnes; Jeffery, Erin; Norris, Andrew; Shabanowitz, Jeffrey; Hunt, Donald F.; Lebedev, Nikolai

2010-01-01

The gene transfer agent of Rhodobacter capsulatus (GTA) is a unique phage-like particle that exchanges genetic information between members of this same species of bacterium. Besides being an excellent tool for genetic mapping, the GTA has a number of advantages for biotechnological and nanoengineering purposes. To facilitate the GTA purification and identify the proteins involved in GTA expression, assembly and regulation, in the present work we construct and transform into R. capsulatus Y262 a gene coding for a C-terminally His-tagged capsid protein. The constructed protein was expressed in the cells, assembled into chimeric GTA particles inside the cells and excreted from the cells into surrounding medium. Transmission electron micrographs of phosphotungstate-stained, NiNTA-purified chimeric GTA confirm that its structure is similar to normal GTA particles, with many particles composed both of a head and a tail. The mass spectrometric proteomic analysis of polypeptides present in the GTA recovered outside the cells shows that GTA is composed of at least 9 proteins represented in the GTA gene cluster including proteins coded for by Orf’s 3, 5, 6–9, 11, 13, and 15. PMID:19105630
Proteomic analysis and identification of the structural and regulatory proteins of the Rhodobacter capsulatus gene transfer agent.

PubMed

Chen, Frank; Spano, Anthony; Goodman, Benjamin E; Blasier, Kiev R; Sabat, Agnes; Jeffery, Erin; Norris, Andrew; Shabanowitz, Jeffrey; Hunt, Donald F; Lebedev, Nikolai

2009-02-01

The gene transfer agent of Rhodobacter capsulatus (GTA) is a unique phage-like particle that exchanges genetic information between members of this same species of bacterium. Besides being an excellent tool for genetic mapping, the GTA has a number of advantages for biotechnological and nanoengineering purposes. To facilitate the GTA purification and identify the proteins involved in GTA expression, assembly and regulation, in the present work we construct and transform into R. capsulatus Y262 a gene coding for a C-terminally His-tagged capsid protein. The constructed protein was expressed in the cells, assembled into chimeric GTA particles inside the cells and excreted from the cells into surrounding medium. Transmission electron micrographs of phosphotungstate-stained, NiNTA-purified chimeric GTA confirm that its structure is similar to normal GTA particles, with many particles composed both of a head and a tail. The mass spectrometric proteomic analysis of polypeptides present in the GTA recovered outside the cells shows that GTA is composed of at least 9 proteins represented in the GTA gene cluster including proteins coded for by Orf's 3, 5, 6-9, 11, 13, and 15.
Ab initio reconstruction of transcriptomes of pluripotent and lineage committed cells reveals gene structures of thousands of lincRNAs

PubMed Central

Guttman, Mitchell; Garber, Manuel; Levin, Joshua Z.; Donaghey, Julie; Robinson, James; Adiconis, Xian; Fan, Lin; Koziol, Magdalena J.; Gnirke, Andreas; Nusbaum, Chad; Rinn, John L.; Lander, Eric S.; Regev, Aviv

2010-01-01

RNA-Seq provides an unbiased way to study a transcriptome, including both coding and non-coding genes. To date, most RNA-Seq studies have critically depended on existing annotations, and thus focused on expression levels and variation in known transcripts. Here, we present Scripture, a method to reconstruct the transcriptome of a mammalian cell using only RNA-Seq reads and the genome sequence. We apply it to mouse embryonic stem cells, neuronal precursor cells, and lung fibroblasts to accurately reconstruct the full-length gene structures for the vast majority of known expressed genes. We identify substantial variation in protein-coding genes, including thousands of novel 5′-start sites, 3′-ends, and internal coding exons. We then determine the gene structures of over a thousand lincRNA and antisense loci. Our results open the way to direct experimental manipulation of thousands of non-coding RNAs, and demonstrate the power of ab initio reconstruction to render a comprehensive picture of mammalian transcriptomes. PMID:20436462
A compendium of transcription factor and transcriptionally active protein coding gene families in cowpea (Vigna unguiculata L.).

PubMed

Misra, Vikram A; Wang, Yu; Timko, Michael P

2017-11-22

Cowpea (Vigna unguiculata (L.) Walp.) is the most important food and forage legume in the semi-arid tropics of sub-Saharan Africa where approximately 80% of worldwide production takes place primarily on low-input, subsistence farm sites. Among the major goals of cowpea breeding and improvement programs are the rapid manipulation of agronomic traits for seed size and quality and improved resistance to abiotic and biotic stresses to enhance productivity. Knowing the suite of transcription factors (TFs) and transcriptionally active proteins (TAPs) that control various critical plant cellular processes would contribute tremendously to these improvement aims. We used a computational approach that employed three different predictive pipelines to data mine the cowpea genome and identified over 4400 genes representing 136 different TF and TAP families. We compare the information content of cowpea to two evolutionarily close species, common bean (Phaseolus vulgaris) and soybean (Glycine max), to gauge the relative informational content. Our data indicate that, correcting for genome size, cowpea has fewer TF and TAP genes than common bean (4408/5291) and soybean (4408/11,065). Members of the GROWTH-REGULATING FACTOR (GRF) and Auxin/indole-3-acetic acid (Aux/IAA) gene families appear to be over-represented in the genome relative to common bean and soybean, whereas members of the MADS (Minichromosome maintenance deficient 1 (MCM1), AGAMOUS, DEFICIENS, and serum response factor (SRF)) and C2C2-YABBY appear to be under-represented. Analysis of the APETALA2-Ethylene Responsive Element Binding Protein (AP2-EREBP), NAC (NAM (no apical meristem), ATAF1, 2 (Arabidopsis transcription activation factor), CUC (cup-shaped cotyledon)), and WRKY families, known to be important in defense signaling, revealed changes and phylogenetic rearrangements relative to common bean and soybean that suggest these groups may have evolved different functions. 
The availability of detailed information on the coding capacity of the cowpea genome and in particular the various TF and TAP gene families will facilitate future comparative analysis and development of strategies for controlling growth, differentiation, and abiotic and biotic stress resistances of cowpea.
NetDecoder: a network biology platform that decodes context-specific biological networks and gene activities.

PubMed

da Rocha, Edroaldo Lummertz; Ung, Choong Yong; McGehee, Cordelia D; Correia, Cristina; Li, Hu

2016-06-02

The sequential chain of interactions altering the binary state of a biomolecule represents the 'information flow' within a cellular network that determines phenotypic properties. Given the lack of computational tools to dissect context-dependent networks and gene activities, we developed NetDecoder, a network biology platform that models context-dependent information flows using pairwise phenotypic comparative analyses of protein-protein interactions. Using breast cancer, dyslipidemia and Alzheimer's disease as case studies, we demonstrate that NetDecoder dissects subnetworks to identify key players significantly impacting cell behaviour specific to a given disease context. We further show that genes residing in disease-specific subnetworks are enriched in disease-related signalling pathways and information flow profiles, which drive the resulting disease phenotypes. We also devise a novel scoring scheme to quantify key genes: network routers, which influence many genes; key targets, which are influenced by many genes; and high impact genes, which experience a significant change in regulation. We show the robustness of our results against parameter changes. Our network biology platform includes freely available source code (http://www.NetDecoder.org) for researchers to explore genome-wide context-dependent information flow profiles and key genes, given a set of genes of particular interest and transcriptome data. More importantly, NetDecoder will enable researchers to uncover context-dependent drug targets. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

Enrichment of Circular Code Motifs in the Genes of the Yeast Saccharomyces cerevisiae.

PubMed

Michel, Christian J; Ngoune, Viviane Nguefack; Poch, Olivier; Ripp, Raymond; Thompson, Julie D

2017-12-03

A set X of 20 trinucleotides has been found to have the highest average occurrence in the reading frame, compared to the two shifted frames, of genes of bacteria, archaea, eukaryotes, plasmids and viruses. This set X has an interesting mathematical property, since X is a maximal C3 self-complementary trinucleotide circular code. Furthermore, any motif obtained from this circular code X has the capacity to retrieve, maintain and synchronize the original (reading) frame. Since 1996, the theory of circular codes in genes has mainly been developed by analysing the properties of the 20 trinucleotides of X, using combinatorics and statistical approaches. For the first time, we test this theory by analysing the X motifs, i.e., motifs from the circular code X, in the complete genome of the yeast Saccharomyces cerevisiae. Several properties of X motifs are identified by basic statistics (at the frequency level), and evaluated by comparison to R motifs, i.e., random motifs generated from 30 different random codes R. We first show that the frequency of X motifs is significantly greater than that of R motifs in the genome of S. cerevisiae. We then verify that no significant difference is observed between the frequencies of X and R motifs in the non-coding regions of S. cerevisiae, but that the occurrence number of X motifs is significantly higher than R motifs in the genes (protein-coding regions). This property is true for all cardinalities of X motifs (from 4 to 20) and for all 16 chromosomes. We further investigate the distribution of X motifs in the three frames of S. cerevisiae genes and show that they occur more frequently in the reading frame, regardless of their cardinality or their length. Finally, the ratio of X genes, i.e., genes with at least one X motif, to non-X genes, in the set of verified genes is significantly different to that observed in the set of putative or dubious genes with no experimental evidence. 
These results, taken together, represent the first evidence for a significant enrichment of X motifs in the genes of an extant organism. They raise two hypotheses: the X motifs may be evolutionary relics of the primitive codes used for translation, or they may continue to play a functional role in the complex processes of genome decoding and protein synthesis.
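The frame comparison described in this abstract can be illustrated with a minimal sketch. It is a simplification, not the authors' method: it counts single trinucleotides per frame rather than the multi-trinucleotide X motifs analysed in the paper, and the four-trinucleotide set `TOY_CODE` and the example sequence are hypothetical placeholders, since the abstract does not list the 20 trinucleotides of X.

```python
def count_code_trinucleotides(seq, code):
    """Count trinucleotides belonging to `code` in each of the three frames of `seq`.

    Returns a list [frame0, frame1, frame2]; frame 0 starts at the first base.
    """
    seq = seq.upper()
    counts = []
    for frame in range(3):
        n = 0
        # Step through the sequence three bases at a time from this frame's offset.
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in code:
                n += 1
        counts.append(n)
    return counts


# Hypothetical toy code (NOT the set X from the paper), used only for illustration.
TOY_CODE = {"GAA", "TTC", "GAT", "ATC"}

frame_counts = count_code_trinucleotides("GAAGATTTC", TOY_CODE)
print(frame_counts)  # frame 0 reads GAA, GAT, TTC; the shifted frames contain no toy-code trinucleotides
```

A real analysis would replace `TOY_CODE` with the published set X and compare the counts against those obtained with random codes, as the abstract describes.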
Intrinsic and extrinsic approaches for detecting genes in a bacterial genome.

PubMed Central

Borodovsky, M; Rudd, K E; Koonin, E V

1994-01-01

The unannotated regions of the Escherichia coli genome DNA sequence from the EcoSeq6 database, totaling 1,278 'intergenic' sequences of the combined length of 359,279 basepairs, were analyzed using computer-assisted methods with the aim of identifying putative unknown genes. The proposed strategy for finding new genes includes two key elements: i) prediction of expressed open reading frames (ORFs) using the GeneMark method based on Markov chain models for coding and non-coding regions of Escherichia coli DNA, and ii) search for protein sequence similarities using programs based on the BLAST algorithm and programs for motif identification. A total of 354 putative expressed ORFs were predicted by GeneMark. Using the BLASTX and TBLASTN programs, it was shown that 208 ORFs located in the unannotated regions of the E. coli chromosome are significantly similar to other protein sequences. Identification of 182 ORFs as probable genes was supported by GeneMark and BLAST, comprising 51.4% of the GeneMark 'hits' and 87.5% of the BLAST 'hits'. 73 putative new genes, comprising 20.6% of the GeneMark predictions, belong to ancient conserved protein families that include both eubacterial and eukaryotic members. This value is close to the overall proportion of highly conserved sequences among eubacterial proteins, indicating that the majority of the putative expressed ORFs that are predicted by GeneMark, but have no significant BLAST hits, nevertheless are likely to be real genes. The majority of the putative genes identified by BLAST search have been described since the release of the EcoSeq6 database, but about 70 genes have not been detected so far. Among these new identifications are genes encoding proteins with a variety of predicted functions including dehydrogenases, kinases, several other metabolic enzymes, ATPases, rRNA methyltransferases, membrane proteins, and different types of regulatory proteins. PMID:7984428
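The percentages quoted in this abstract follow directly from the ORF counts it reports. The short check below recomputes them; the variable names and the `percent` helper are introduced here for illustration only.

```python
# ORF counts as stated in the abstract.
genemark_orfs = 354   # putative expressed ORFs predicted by GeneMark
blast_orfs = 208      # ORFs with significant BLAST similarity
both = 182            # ORFs supported by both GeneMark and BLAST
conserved = 73        # putative new genes in ancient conserved families


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


share_of_genemark = percent(both, genemark_orfs)       # overlap as a share of GeneMark hits
share_of_blast = percent(both, blast_orfs)             # overlap as a share of BLAST hits
share_conserved = percent(conserved, genemark_orfs)    # conserved-family genes among GeneMark predictions

print(share_of_genemark, share_of_blast, share_conserved)
```

The three values reproduce the 51.4%, 87.5% and 20.6% given in the text.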
Structure of the beta-galactosidase gene from Thermus sp. strain T2: expression in Escherichia coli and purification in a single step of an active fusion protein.

PubMed

Vian, A; Carrascosa, A V; García, J L; Cortés, E

1998-06-01

The nucleotide sequence of both the bgaA gene, coding for a thermostable beta-galactosidase of Thermus sp. strain T2, and its flanking regions was determined. The deduced amino acid sequence of the enzyme predicts a polypeptide of 645 amino acids (Mr, 73,595). Comparative analysis of the open reading frames located in the flanking regions of the bgaA gene revealed that they might encode proteins involved in the transport and hydrolysis of sugars. The observed homology between the deduced amino acid sequences of BgaA and the beta-galactosidase of Bacillus stearothermophilus allows us to classify the new enzyme within family 42 of glycosyl hydrolases. BgaA was overexpressed in its active form in Escherichia coli, but more interestingly, an active chimeric beta-galactosidase was constructed by fusing the BgaA protein to the choline-binding domain of the major pneumococcal autolysin. This chimera illustrates a novel approach for producing an active and thermostable hybrid enzyme that can be purified in a single step by affinity chromatography on DEAE-cellulose, retaining the catalytic properties of the native enzyme. The chimeric enzyme showed a specific activity of 191,000 U/mg at 70 degrees C and a Km value of 1.6 mM with o-nitrophenyl-beta-D-galactopyranoside as a substrate, and it retained 50% of its initial activity after 1 h of incubation at 70 degrees C.
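To give the reported Km a concrete reading, the sketch below evaluates the standard Michaelis-Menten relation v/Vmax = [S] / (Km + [S]). It assumes the chimeric enzyme follows simple Michaelis-Menten kinetics, which the abstract does not state explicitly; the function name and the example substrate concentrations are illustrative.

```python
def fraction_of_vmax(substrate_mM, km_mM=1.6):
    """Fraction of maximal velocity under simple Michaelis-Menten kinetics.

    km_mM defaults to the 1.6 mM reported for o-nitrophenyl-beta-D-galactopyranoside.
    """
    return substrate_mM / (km_mM + substrate_mM)


at_km = fraction_of_vmax(1.6)        # substrate concentration equal to Km
at_ten_km = fraction_of_vmax(16.0)   # substrate at ten times Km

print(at_km, round(at_ten_km, 3))
```

Under this assumption the enzyme runs at half its maximal rate when the substrate is at 1.6 mM and at roughly 91% of it at 16 mM.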
The mitochondrial proteins AtHscB and AtIsu1 involved in Fe-S cluster assembly interact with the Hsp70-type chaperon AtHscA2 and modulate its catalytic activity.

PubMed

Leaden, Laura; Busi, Maria V; Gomez-Casati, Diego F

2014-11-01

Arabidopsis plants contain two genes coding for mitochondrial Hsp70-type chaperon-like proteins, AtHscA1 (At4g37910) and AtHscA2 (At5g09590). Both genes are homologs of the Ssq1 gene involved in Fe-S cluster assembly in yeast. Protein-protein interaction studies showed that AtHscA2 interacts with AtIsu1 and AtHscB, two Arabidopsis homologs of the Isu1 protein and the Jac1 yeast co-chaperone. Moreover, this interaction could modulate the activity of AtHscA2. In the presence of a 1:5:5 molar ratio of AtHscA2:AtIsu1:AtHscB we observed an increase in the V(max) and a decrease in the S(0.5) for ATP of AtHscA2. Furthermore, an increase of about 28-fold in the catalytic efficiency of AtHscA2 was also observed. Results suggest that AtHscA2 in cooperation with AtIsu1 and AtHscB play an important role in the regulation of the Fe-S assembly pathway in plant mitochondria. Copyright © 2014 Elsevier B.V. and Mitochondria Research Society. All rights reserved.
Decoding the genome beyond sequencing: the new phase of genomic research.

PubMed

Heng, Henry H Q; Liu, Guo; Stevens, Joshua B; Bremer, Steven W; Ye, Karen J; Abdallah, Batoul Y; Horne, Steven D; Ye, Christine J

2011-10-01

While our understanding of gene-based biology has greatly improved, it is clear that the function of the genome and most diseases cannot be fully explained by genes and other regulatory elements. Genes and the genome represent distinct levels of genetic organization with their own coding systems; Genes code parts like protein and RNA, but the genome codes the structure of genetic networks, which are defined by the whole set of genes, chromosomes and their topological interactions within a cell. Accordingly, the genetic code of DNA offers limited understanding of genome functions. In this perspective, we introduce the genome theory which calls for the departure of gene-centric genomic research. To make this transition for the next phase of genomic research, it is essential to acknowledge the importance of new genome-based biological concepts and to establish new technology platforms to decode the genome beyond sequencing. Copyright © 2011 Elsevier Inc. All rights reserved.
Complete mitochondrial genome of an Asian lion (Panthera leo goojratensis).

PubMed

Li, Yu-Fei; Wang, Qiang; Zhao, Jian-ning

2016-01-01

The entire mitochondrial genome of this Asian lion (Panthera leo goojratensis) was 17,183 bp in length. Its gene composition and arrangement conformed to those of other lions, with the typical structure of 22 tRNAs, 2 rRNAs, 13 protein-coding genes and a non-coding region. The characteristics of the mitochondrial genome were analyzed in detail.
A Plain English Map of the Human Glycolysis Enzymes.

ERIC Educational Resources Information Center

Offner, Susan

1999-01-01

Presents a plain English map of the gene coding for the glycolysis enzymes in humans to be used as a teaching tool. The map can be used to illustrate that every reaction in a cell requires an enzyme, and that every enzyme is a protein coded for by a gene somewhere on the chromosomes. (WRM)
Proteogenomics produces comprehensive and highly accurate protein-coding gene annotation in a complete genome assembly of Malassezia sympodialis

PubMed Central

Tellgren-Roth, Christian; Baudo, Charles D.; Kennell, John C.; Sun, Sheng; Billmyre, R. Blake; Schröder, Markus S.; Andersson, Anna; Holm, Tina; Sigurgeirsson, Benjamin; Wu, Guangxi; Sankaranarayanan, Sundar Ram; Siddharthan, Rahul; Sanyal, Kaustuv; Lundeberg, Joakim; Nystedt, Björn; Boekhout, Teun; Dawson, Thomas L.; Heitman, Joseph

2017-01-01

Complete and accurate genome assembly and annotation is a crucial foundation for comparative and functional genomics. Despite this, few complete eukaryotic genomes are available, and genome annotation remains a major challenge. Here, we present a complete genome assembly of the skin commensal yeast Malassezia sympodialis and demonstrate how proteogenomics can substantially improve gene annotation. Through long-read DNA sequencing, we obtained a gap-free genome assembly for M. sympodialis (ATCC 42132), comprising eight nuclear and one mitochondrial chromosome. We also sequenced and assembled four M. sympodialis clinical isolates, and showed their value for understanding Malassezia reproduction by confirming four alternative allele combinations at the two mating-type loci. Importantly, we demonstrated how proteomics data could be readily integrated with transcriptomics data in standard annotation tools. This increased the number of annotated protein-coding genes by 14% (from 3612 to 4113), compared to using transcriptomics evidence alone. Manual curation further increased the number of protein-coding genes by 9% (to 4493). All of these genes have RNA-seq evidence and 87% were confirmed by proteomics. The M. sympodialis genome assembly and annotation presented here is at a quality yet achieved only for a few eukaryotic organisms, and constitutes an important reference for future host-microbe interaction studies. PMID:28100699
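The two stated increases in annotated protein-coding genes can be checked from the gene counts in the abstract. The helper `percent_increase` and the variable names below are introduced here for illustration.

```python
# Gene counts as stated in the abstract.
transcriptomics_only = 3612   # annotation using transcriptomics evidence alone
with_proteomics = 4113        # after integrating proteomics data
after_curation = 4493         # after manual curation


def percent_increase(before, after):
    """Relative increase from `before` to `after`, in percent, to one decimal place."""
    return round(100.0 * (after - before) / before, 1)


gain_from_proteomics = percent_increase(transcriptomics_only, with_proteomics)
gain_from_curation = percent_increase(with_proteomics, after_curation)

print(gain_from_proteomics, gain_from_curation)
```

The results, 13.9% and 9.2%, agree with the rounded 14% and 9% given in the text, with the second increase measured relative to the 4113 genes obtained after adding proteomics.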
Immunoreactivity of polyclonal antibodies generated against the carboxy terminus of the predicted amino acid sequence of the Huntington disease gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Alkatib, G.; Graham, R.; Pelmear-Telenius, A.

1994-09-01

A cDNA fragment spanning the 3′-end of the Huntington disease gene (from 8052 to 9252) was cloned into a prokaryotic expression vector containing the E. coli lac promoter and a portion of the coding sequence for β-galactosidase. The truncated β-galactosidase gene was cleaved with BamHI and fused in frame to the BamHI fragment of the Huntington disease gene 3′-end. Expression analysis of proteins made in E. coli revealed that 20-30% of the total cellular protein was represented by the β-galactosidase-huntingtin fusion protein. The identity of the Huntington disease protein amino acid sequences was confirmed by protein sequence analysis. Affinity chromatography was used to purify large quantities of the fusion protein from bacterial cell lysates. Affinity-purified proteins were used to immunize New Zealand white rabbits for antibody production. The generated polyclonal antibodies were used to immunoprecipitate the Huntington disease gene product expressed in a neuroblastoma cell line. In this cell line the antibodies precipitated two protein bands with apparent gel migrations of 200 and 150 kd, which together correspond to the calculated molecular weight of the Huntington disease gene product (350 kd). Immunoblotting experiments revealed the presence of a large precursor protein in the range of 350-750 kd which is in agreement with the predicted molecular weight of the protein without post-translational modifications. These results indicate that the huntingtin protein is cleaved into two subunits in this neuroblastoma cell line and imply that cleavage of a large precursor protein may contribute to its biological activity. Experiments are ongoing to determine the precursor-product relationship, to examine the synthesis of the huntingtin protein in freshly isolated rat brains, and to determine the cellular and subcellular distribution of the gene product.
Complete genome sequence of Fer-de-Lance Virus reveals a novel gene in reptilian Paramyxoviruses

USGS Publications Warehouse

Kurath, G.; Batts, W.N.; Ahne, W.; Winton, J.R.

2004-01-01

The complete RNA genome sequence of the archetype reptilian paramyxovirus, Fer-de-Lance virus (FDLV), has been determined. The genome is 15,378 nucleotides in length and consists of seven nonoverlapping genes in the order 3′-N-U-P-M-F-HN-L-5′, coding for the nucleocapsid, unknown, phospho-, matrix, fusion, hemagglutinin-neuraminidase, and large polymerase proteins, respectively. The gene junctions contain highly conserved transcription start and stop signal sequences and tri-nucleotide intergenic regions similar to those of other Paramyxoviridae. The FDLV P gene expression strategy is like that of rubulaviruses, which express the accessory V protein from the primary transcript and edit a portion of the mRNA to encode P and I proteins. There is also an overlapping open reading frame potentially encoding a small basic protein in the P gene. The gene designated U (unknown) encodes a deduced protein of 19.4 kDa that has no counterpart in other paramyxoviruses and has no similarity with sequences in the National Center for Biotechnology Information database. Active transcription of the U gene in infected cells was demonstrated by Northern blot analysis, and bicistronic N-U mRNA was also evident. The genomes of two other snake paramyxovirus genotypes were also found to have U genes, with 11 to 16% nucleotide divergence from the FDLV U gene. Pairwise comparisons of amino acid identities and phylogenetic analyses of all deduced FDLV protein sequences with homologous sequences from other Paramyxoviridae indicate that FDLV represents a new genus within the subfamily Paramyxovirinae. We suggest the name Ferlavirus for the new genus, with FDLV as the type species.
The mitochondrial genome of Polistes jokahamae and a phylogenetic analysis of the Vespoidea (Insecta: Hymenoptera).

PubMed

Song, Sheng-Nan; Chen, Peng-Yan; Wei, Shu-Jun; Chen, Xue-Xin

2016-07-01

The mitochondrial genome of Polistes jokahamae (Radoszkowski, 1887) (Hymenoptera: Vespidae) (GenBank accession no. KR052468) was sequenced. The current length of this mitochondrial genome, with a partial A + T-rich region, is 16,616 bp. All the typical mitochondrial genes were sequenced except for three tRNAs (trnI, trnQ, and trnY) located between the A + T-rich region and nad2. At least three rearrangement events occurred in the sequenced region compared with the putative ancestral arrangement of insects, corresponding to the shuffling of trnK and trnD, translocation or remote inversion of trnY, and translocation of trnL1. All protein-coding genes start with ATN codons. Eleven protein-coding genes stop with the termination codon TAA, one with the incomplete codon TA, and one with T. Phylogenetic analysis using the Bayesian method based on all codon positions of the 13 protein-coding genes supports the monophyly of Vespidae and Formicidae. Within the Formicidae, the Myrmicinae and Formicinae form a sister lineage and then sister to the Dolichoderinae, while within the Vespidae, the Eumeninae is sister to the lineage of Vespinae + Polistinae.
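The start and stop codon tallies in this abstract (ATN starts; TAA, TA, or T stops) follow a simple rule that can be sketched in code. The example below is a minimal illustration and not the authors' annotation pipeline. It assumes that a coding sequence whose length is not a multiple of three ends in an incomplete stop codon (T or TA), and that TAA and TAG are the complete stop codons of the invertebrate mitochondrial code. The function names and test sequences are made up.

```python
def starts_with_atn(cds: str) -> bool:
    """True if the first codon is ATA, ATC, ATG, or ATT."""
    codon = cds[:3].upper()
    return len(codon) == 3 and codon.startswith("AT") and codon[2] in "ACGT"


def stop_type(cds: str) -> str:
    """Classify the 3' end as a complete stop codon or an incomplete 'TA' / 'T'.

    Returns "none" when no recognizable stop is present.
    """
    cds = cds.upper()
    remainder = len(cds) % 3
    if remainder == 0:
        last = cds[-3:]
        return last if last in {"TAA", "TAG"} else "none"
    # A trailing partial codon: one or two leftover nucleotides.
    tail = cds[-remainder:]
    return tail if tail in {"T", "TA"} else "none"


# Made-up toy sequences, one per stop type mentioned in the abstract.
for toy in ("ATGAAATAA", "ATTAAATA", "ATAAAAT"):
    print(toy, starts_with_atn(toy), stop_type(toy))
```

Incomplete stop codons of this kind are generally thought to be completed to TAA by polyadenylation of the transcript, which is why they are counted as termination signals.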
PANTHER version 11: expanded annotation data from Gene Ontology and Reactome pathways, and data analysis tool enhancements.

PubMed

Mi, Huaiyu; Huang, Xiaosong; Muruganujan, Anushya; Tang, Haiming; Mills, Caitlin; Kang, Diane; Thomas, Paul D

2017-01-04

The PANTHER database (Protein ANalysis THrough Evolutionary Relationships, http://pantherdb.org) contains comprehensive information on the evolution and function of protein-coding genes from 104 completely sequenced genomes. PANTHER software tools allow users to classify new protein sequences, and to analyze gene lists obtained from large-scale genomics experiments. In the past year, major improvements include a large expansion of classification information available in PANTHER, as well as significant enhancements to the analysis tools. Protein subfamily functional classifications have more than doubled due to progress of the Gene Ontology Phylogenetic Annotation Project. For human genes (as well as a few other organisms), PANTHER now also supports enrichment analysis using pathway classifications from the Reactome resource. The gene list enrichment tools include a new 'hierarchical view' of results, enabling users to leverage the structure of the classifications/ontologies; the tools also allow users to upload genetic variant data directly, rather than requiring prior conversion to a gene list. The updated coding single-nucleotide polymorphisms (SNP) scoring tool uses an improved algorithm. The hidden Markov model (HMM) search tools now use HMMER3, dramatically reducing search times and improving accuracy of E-value statistics. Finally, the PANTHER Tree-Attribute Viewer has been implemented in JavaScript, with new views for exploring protein sequence evolution. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
ROOT HAIR DEFECTIVE SIX-LIKE Class I Genes Promote Root Hair Development in the Grass Brachypodium distachyon

PubMed Central

Kim, Chul Min

2016-01-01

Genes encoding ROOT HAIR DEFECTIVE SIX-LIKE (RSL) class I basic helix loop helix proteins are expressed in future root hair cells of the Arabidopsis thaliana root meristem where they positively regulate root hair cell development. Here we show that there are three RSL class I protein coding genes in the Brachypodium distachyon genome, BdRSL1, BdRSL2 and BdRSL3, and each is expressed in developing root hair cells after the asymmetric cell division that forms root hair cells and hairless epidermal cells. Expression of BdRSL class I genes is sufficient for root hair cell development: ectopic overexpression of any of the three RSL class I genes induces the development of root hairs in every cell of the root epidermis. Expression of BdRSL class I genes in root hairless Arabidopsis thaliana root hair defective 6 (Atrhd6) Atrsl1 double mutants, devoid of RSL class I function, restores root hair development, indicating that the function of these proteins has been conserved. However, neither AtRSL nor BdRSL class I genes is sufficient for root hair development in A. thaliana. These data demonstrate that the spatial pattern of class I RSL activity can account for the pattern of root hair cell differentiation in B. distachyon. However, the spatial pattern of class I RSL activity cannot account for the spatial pattern of root hair cells in A. thaliana. Taken together, these data indicate that the functions of RSL class I proteins have been conserved among most angiosperms (monocots and eudicots) despite the dramatically different patterns of root hair cell development. PMID:27494519
Functional characterization of the MKC1 gene of Candida albicans, which encodes a mitogen-activated protein kinase homolog related to cell integrity.

PubMed Central

Navarro-García, F; Sánchez, M; Pla, J; Nombela, C

1995-01-01

Mitogen-activated protein (MAP) kinases represent a group of serine/threonine protein kinases playing a central role in signal transduction processes in eukaryotic cells. Using a strategy based on the complementation of the thermosensitive autolytic phenotype of slt2 null mutants, we have isolated a Candida albicans homolog of Saccharomyces cerevisiae MAP kinase gene SLT2 (MPK1), which is involved in the recently outlined PKC1-controlled signalling pathway. The isolated gene, named MKC1 (MAP kinase from C. albicans), coded for a putative protein, Mkc1p, of 58,320 Da that displayed all the characteristic domains of MAP kinases and was 55% identical to S. cerevisiae Slt2p (Mpk1p). The MKC1 gene was deleted in a diploid Candida strain, and heterozygous and homozygous strains, in both Ura+ and Ura- backgrounds, were obtained to facilitate the analysis of the function of the gene. Deletion of the two alleles of the MKC1 gene gave rise to viable cells that grew at 28 and 37 degrees C but, nevertheless, displayed a variety of phenotypic traits under more stringent conditions. These included a low growth yield and a loss of viability in cultures grown at 42 degrees C, a high sensitivity to thermal shocks at 55 degrees C, an enhanced susceptibility to caffeine that was osmotically remediable, and the formation of a weak cell wall with a very low resistance to complex lytic enzyme preparations. The analysis of the functions downstream of the MKC1 gene should contribute to understanding of the connection of growth and morphogenesis in pathogenic fungi. PMID:7891715
Chromatinized Protein Kinase C-θ: Can It Escape the Clutches of NF-κB?

PubMed Central

Sutcliffe, Elissa L.; Li, Jasmine; Zafar, Anjum; Hardy, Kristine; Ghildyal, Reena; McCuaig, Robert; Norris, Nicole C.; Lim, Pek Siew; Milburn, Peter J.; Casarotto, Marco G.; Denyer, Gareth; Rao, Sudha

2012-01-01

We recently provided the first description of a nuclear mechanism used by Protein Kinase C-theta (PKC-θ) to mediate T cell gene expression. In this mode, PKC-θ tethers to chromatin to form an active nuclear complex by interacting with proteins including RNA polymerase II, the histone kinase MSK-1, the demethylase LSD1, and the adaptor molecule 14-3-3ζ at regulatory regions of inducible immune response genes. Moreover, our genome-wide analysis identified many novel PKC-θ target genes and microRNAs implicated in T cell development, differentiation, apoptosis, and proliferation. We have expanded our ChIP-on-chip analysis and have now identified a transcription factor motif containing NF-κB binding sites that may facilitate recruitment of PKC-θ to chromatin at coding genes. Furthermore, NF-κB association with chromatin appears to be a prerequisite for the assembly of the PKC-θ active complex. In contrast, a distinct NF-κB-containing module appears to operate at PKC-θ targeted microRNA genes, and here NF-κB negatively regulates microRNA gene transcription. Our efforts are also focusing on distinguishing between the nuclear and cytoplasmic functions of PKCs to ascertain how these kinases may synergize their roles as both cytoplasmic signaling proteins and their functions on the chromatin template, together enabling rapid induction of eukaryotic genes. We have identified an alternative sequence within PKC-θ that appears to be important for nuclear translocation of this kinase. Understanding the molecular mechanisms used by signal transduction kinases to elicit specific and distinct transcriptional programs in T cells will enable scientists to refine current therapeutic strategies for autoimmune diseases and cancer. PMID:22969762
Intragenome Diversity of Gene Families Encoding Toxin-like Proteins in Venomous Animals.

PubMed

Rodríguez de la Vega, Ricardo C; Giraud, Tatiana

2016-11-01

The evolution of venoms is the story of how toxins arise and of the processes that generate and maintain their diversity. For animal venoms these processes include recruitment for expression in the venom gland, neofunctionalization, paralogous expansions, and functional divergence. The systematic study of these processes requires the reliable identification of the venom components involved in antagonistic interactions. High-throughput sequencing has the potential of uncovering the entire set of toxins in a given organism, yet the existence of non-venom toxin paralogs and the misleading effects of a partial census of the molecular diversity of toxins make it necessary to collect complementary evidence to distinguish true toxins from their non-venom paralogs. Here, we analyzed the whole genomes of two scorpions, one spider and one snake, aiming at the identification of the full repertoires of genes encoding toxin-like proteins. We classified the entire set of protein-coding genes into paralogous groups and monotypic genes, identified genes encoding toxin-like proteins based on known toxin families, and quantified their expression in both venom-glands and pooled tissues. Our results confirm that genes encoding toxin-like proteins are part of multigene families, and that these families arise by recruitment events from non-toxin genes followed by limited expansions of the toxin-like protein coding genes. We also show that failing to account for sequence similarity with non-toxin proteins has a considerable misleading effect that can be greatly reduced by comparative transcriptomics. Our study overall contributes to the understanding of the evolutionary dynamics of proteins involved in antagonistic interactions. © The Author 2016. Published by Oxford University Press on behalf of the Society for Integrative and Comparative Biology. All rights reserved. For permissions please email: journals.permissions@oup.com.
A purified truncated form of yeast Gal4 expressed in Escherichia coli and used to functionalize poly(lactic acid) nanoparticle surface is transcriptionally active in cellulo.

PubMed

Legaz, Sophie; Exposito, Jean-Yves; Borel, Agnès; Candusso, Marie-Pierre; Megy, Simon; Montserret, Roland; Lahaye, Vincent; Terzian, Christophe; Verrier, Bernard

2015-09-01

The Gal4/UAS system is a powerful tool for the analysis of numerous biological processes. Gal4 is a large yeast transcription factor that activates genes including UAS sequences in their promoter. Here, we have synthesized a minimal form of the Gal4 DNA sequence coding for the binding and dimerization regions, but also part of the transcriptional activation domain. This truncated Gal4 protein was expressed as inclusion bodies in Escherichia coli. A structured and active form of this recombinant protein was purified and used to cover poly(lactic acid) (PLA) nanoparticles. In cellulo, these Gal4-vehicles were able to activate the expression of a Green Fluorescent Protein (GFP) gene under the control of UAS sequences, demonstrating that the decorated Gal4 variant can be delivered into cells where it still retains its transcription factor capacities. Thus, we have produced in E. coli and purified a short active form of Gal4 that retains its functions at the surface of PLA-nanoparticles in a cellular assay. These decorated Gal4-nanoparticles will be useful to decipher their tissue distribution and their potential after ingestion or injection in UAS-GFP recombinant animal models. Copyright © 2015 Elsevier Inc. All rights reserved.
End Joining-Mediated Gene Expression in Mammalian Cells Using PCR-Amplified DNA Constructs that Contain Terminator in Front of Promoter.

PubMed

Nakamura, Mikiko; Suzuki, Ayako; Akada, Junko; Tomiyoshi, Keisuke; Hoshida, Hisashi; Akada, Rinji

2015-12-01

Mammalian gene expression constructs are generally prepared in a plasmid vector, in which a promoter and terminator are located upstream and downstream of a protein-coding sequence, respectively. In this study, we found that front terminator constructs (DNA constructs containing a terminator upstream of a promoter rather than downstream of a coding region) could sufficiently express proteins as a result of end joining of the introduced DNA fragment. By taking advantage of front terminator constructs, FLAG substitutions and deletions were generated using mutagenesis primers to identify amino acids specifically recognized by commercial FLAG antibodies. A minimal epitope sequence for polyclonal FLAG antibody recognition was also identified. In addition, we analyzed the sequence of a C-terminal Ser-Lys-Leu peroxisome localization signal, and identified the key residues necessary for peroxisome targeting. Moreover, front terminator constructs of hepatitis B surface antigen were used for deletion analysis, leading to the identification of regions required for particle formation. Collectively, these results indicate that front terminator constructs allow for easy manipulations of C-terminal protein-coding sequences, and suggest that direct gene expression with PCR-amplified DNA is useful for high-throughput protein analysis in mammalian cells.
Accumulation of multiple mutations in linezolid-resistant Staphylococcus epidermidis causing bloodstream infections; in silico analysis of L3 amino acid substitutions that might confer high-level linezolid resistance.

PubMed

Ikonomidis, Alexandros; Grapsa, Anastasia; Pavlioglou, Charikleia; Demiri, Antonia; Batarli, Alexandra; Panopoulou, Maria

2016-12-01

Fifty-six Staphylococcus epidermidis clinical isolates, showing high-level linezolid resistance and causing bacteremia in critically ill patients, were studied. All isolates belonged to the ST22 clone and carried the T2504A and C2534T mutations in the gene coding for 23S rRNA as well as the C189A, G208A, C209T and G384C missense mutations in L3 protein which resulted in Asp159Tyr, Gly152Asp and Leu94Val substitutions. Other silent mutations were also detected in genes coding for ribosomal proteins L3 and L22. In silico analysis of missense mutations showed that although L3 protein retained the sequence of secondary motifs, the tertiary structure was influenced. The observed alteration in L3 protein folding provides an indication of the putative role of L3-coding gene mutations in high-level linezolid resistance. Furthermore, linezolid pressure in health care settings where linezolid consumption is high might lead to the selection of resistant mutants possessing L3 mutations that might confer high-level linezolid resistance.
Complete mitochondrial genome of the invasive brown alga Sargassum muticum (Sargassaceae, Phaeophyceae).

PubMed

Liu, Feng; Pang, Shaojun

2016-01-01

Sargassum muticum (Yendo) Fensholt is an invasive canopy-forming brown alga, expanding its presence from Northeast Asia to North America and Europe. The complete mitochondrial genome of S. muticum is characterized as a circular molecule of 34,720 bp. The overall AT content of S. muticum mitogenome is 63.41%. This mitogenome contains 65 genes typically found in brown algae, including 3 ribosomal RNA genes, 25 transfer RNA genes, 35 protein-coding genes, and 2 conserved open reading frames (ORFs). The gene order of mitogenome for S. muticum is identical to that for Sargassum horneri, Fucus vesiculosus and Desmarestia viridis. Phylogenetic analyses based on 35 protein-coding genes reveal that S. muticum has a close evolutionary relationship with S. horneri and a distant relationship with Dictyota dichotoma, supporting current taxonomic systems. The present investigation provides new molecular data for studies of S. muticum population diversity as well as comparative genomics in the Phaeophyceae.
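The gene tally and AT content reported in this abstract are both simple counts. The sketch below shows how such figures are typically computed; it is not the authors' code, the helper name is hypothetical, and the example sequence is a made-up toy string, not S. muticum data.

```python
def at_content(seq: str) -> float:
    """Percentage of A and T among unambiguous bases (A, C, G, T) in `seq`."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * sum(b in "AT" for b in bases) / len(bases)


# Gene categories as listed in the abstract; they should add up to 65.
gene_counts = {
    "ribosomal RNA": 3,
    "transfer RNA": 25,
    "protein-coding": 35,
    "conserved ORF": 2,
}
assert sum(gene_counts.values()) == 65

# Toy sequence for illustration only.
print(f"{at_content('AATTGC'):.2f}%")
```

Applied to the full 34,720 bp mitogenome sequence, the same function would be expected to reproduce the reported 63.41%, assuming ambiguous bases are excluded in the same way.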

Identification of Circular RNAs from the Parental Genes Involved in Multiple Aspects of Cellular Metabolism in Barley

PubMed Central

Darbani, Behrooz; Noeparvar, Shahin; Borg, Søren

2016-01-01

RNA circularization made by head-to-tail back-splicing events is involved in the regulation of gene expression from transcriptional to post-translational levels. By exploiting RNA-Seq data and down-stream analysis, we shed light on the importance of circular RNAs in plants. The results introduce circular RNAs as novel interactors in the regulation of gene expression in plants and imply the comprehensiveness of this regulatory pathway by identifying circular RNAs for a diverse set of genes. These genes are involved in several aspects of cellular metabolism such as hormonal signaling, intracellular protein sorting, carbohydrate metabolism and cell-wall biogenesis, respiration, amino acid biosynthesis, transcription and translation, and protein ubiquitination. Additionally, these parental loci of circular RNAs, from both nuclear and mitochondrial genomes, encode different transcript classes including protein coding transcripts, microRNA, rRNA, and long non-coding/microprotein coding RNAs. The results shed light on the mitochondrial exonic circular RNAs and imply the importance of circular RNAs for regulation of mitochondrial genes. Importantly, we introduce circular RNAs in barley and elucidate their cellular-level alterations across tissues and in response to the micronutrients iron and zinc. In further support of circular RNAs' functional roles in plants, we report several cases where fluctuations of circRNAs do not correlate with the levels of their parental-loci encoded linear transcripts. PMID:27375638
Probing the Boundaries of Orthology: The Unanticipated Rapid Evolution of Drosophila centrosomin

PubMed Central

Eisman, Robert C.; Kaufman, Thomas C.

2013-01-01

The rapid evolution of essential developmental genes and their protein products is both intriguing and problematic. The rapid evolution of gene products with simple protein folds and a lack of well-characterized functional domains typically result in a low discovery rate of orthologous genes. Additionally, in the absence of orthologs it is difficult to study the processes and mechanisms underlying rapid evolution. In this study, we have investigated the rapid evolution of centrosomin (cnn), an essential gene encoding centrosomal protein isoforms required during syncytial development in Drosophila melanogaster. Until recently the rapid divergence of cnn made identification of orthologs difficult and questionable because Cnn violates many of the assumptions underlying models for protein evolution. To overcome these limitations, we have identified a group of insect orthologs and present conserved features likely to be required for the functions attributed to cnn in D. melanogaster. We also show that the rapid divergence of Cnn isoforms is apparently due to frequent coding sequence indels and an accelerated rate of intronic additions and eliminations. These changes appear to be buffered by multi-exon and multi-reading frame maximum potential ORFs, simple protein folds, and the splicing machinery. These buffering features also occur in other genes in Drosophila and may help prevent potentially deleterious mutations due to indels in genes with large coding exons and exon-dense regions separated by small introns. This work promises to be useful for future investigations of cnn and potentially other rapidly evolving genes and proteins. PMID:23749319
Cohort-specific imputation of gene expression improves prediction of warfarin dose for African Americans.

PubMed

Gottlieb, Assaf; Daneshjou, Roxana; DeGorter, Marianne; Bourgeois, Stephane; Svensson, Peter J; Wadelius, Mia; Deloukas, Panos; Montgomery, Stephen B; Altman, Russ B

2017-11-24

Genome-wide association studies are useful for discovering genotype-phenotype associations but are limited because they require large cohorts to identify a signal, which can be population-specific. Mapping genetic variation to genes improves power and allows the effects of both protein-coding variation as well as variation in expression to be combined into "gene level" effects. Previous work has shown that warfarin dose can be predicted using information from genetic variation that affects protein-coding regions. Here, we introduce a method that improves dose prediction by integrating tissue-specific gene expression. In particular, we use drug pathways and expression quantitative trait loci knowledge to impute gene expression, on the assumption that differential expression of key pathway genes may impact dose requirement. We focus on 116 genes from the pharmacokinetic and pharmacodynamic pathways of warfarin within training and validation sets comprising both European and African-descent individuals. We build gene-tissue signatures associated with warfarin dose in a cohort-specific manner and identify a signature of 11 gene-tissue pairs that significantly augments the International Warfarin Pharmacogenetics Consortium dosage-prediction algorithm in both populations. Our results demonstrate that imputed expression can improve dose prediction and bridge population-specific compositions. MATLAB code is available at https://github.com/assafgo/warfarin-cohort.
Bacillus cereus-type polyhydroxyalkanoate biosynthetic gene cluster contains R-specific enoyl-CoA hydratase gene.

PubMed

Kihara, Takahiro; Hiroe, Ayaka; Ishii-Hyakutake, Manami; Mizuno, Kouhei; Tsuge, Takeharu

2017-08-01

Bacillus cereus and Bacillus megaterium both accumulate polyhydroxyalkanoate (PHA) but their PHA biosynthetic gene (pha) clusters that code for proteins involved in PHA biosynthesis are different. Namely, a gene encoding MaoC-like protein exists in the B. cereus-type pha cluster but not in the B. megaterium-type pha cluster. MaoC-like protein has an R-specific enoyl-CoA hydratase (R-hydratase) activity and is referred to as PhaJ when involved in PHA metabolism. In this study, the pha cluster of B. cereus YB-4 was characterized in terms of PhaJ's function. In an in vitro assay, PhaJ from B. cereus YB-4 (PhaJ YB4) exhibited hydration activity toward crotonyl-CoA. In an in vivo assay using Escherichia coli as a host for PHA accumulation, the recombinant strain expressing PhaJ YB4 and PHA synthase led to increased PHA accumulation, suggesting that PhaJ YB4 functioned as a monomer supplier. The monomer composition of the accumulated PHA reflected the substrate specificity of PhaJ YB4, which appeared to prefer short chain-length substrates. The pha cluster from B. cereus YB-4 functioned to accumulate PHA in E. coli; however, it did not function when the phaJ YB4 gene was deleted. The B. cereus-type pha cluster represents a new example of a pha cluster that contains the gene encoding PhaJ.
Chromosomal localization of the human V3 pituitary vasopressin receptor gene (AVPR3) to 1q32

DOE Office of Scientific and Technical Information (OSTI.GOV)

Rousseau-Merck, M.F.; Derre, J.; Berger, R.

1995-11-20

Vasopressin exerts its physiological effects on liver metabolism, fluid osmolarity, and corticotrophic response to stress through a set of at least three receptors, V1a, V2, and V3 (also called V1b), respectively. These receptors constitute a distinct group of the superfamily of G-protein-coupled cell surface receptors. When bound to vasopressin, they couple to G proteins activating phospholipase C for the V1a and V3 types and adenylate cyclase for the V2. The vasopressin receptor subfamily also includes the receptor for oxytocin, a structurally related hormone that signals through the activation of phospholipase C. The chromosomal position of the V2 receptor gene has been assigned to Xq28-qter by PCR-based screening of somatic cell hybrids, whereas the oxytocin receptor gene has been mapped to chromosome 3q26.2 by fluorescence in situ hybridization (FISH). The chromosomal location of the V1a gene is currently unknown. We recently cloned the cDNA and the gene coding for the human pituitary-specific V3 receptor (HGMW-approved symbol AVPR3). We report here the chromosomal localization of this gene by two distinct in situ hybridization techniques using radioactive and fluorescent probes. 11 refs., 1 fig.
Comparative architecture of silks, fibrous proteins and their encoding genes in insects and spiders.

PubMed

Craig, Catherine L; Riekel, Christian

2002-12-01

The known silk fibroins and fibrous glues are thought to be encoded by members of the same gene family. All silk fibroins sequenced to date contain regions of long-range order (crystalline regions) and/or short-range order (non-crystalline regions). All of the sequenced fibroin silks (Flag or silk from flagelliform gland in spiders; Fhc or heavy chain fibroin silks produced by Lepidoptera larvae) are made up of hierarchically organized, repetitive arrays of amino acids. Fhc fibroin genes are characterized by a similar molecular genetic architecture of two exons and one intron, but the organization and size of these units differs. The Flag, Ser (sericin gene) and BR (Balbiani ring genes; both fibrous proteins) genes are made up of multiple exons and introns. Sequences coding for crystalline and non-crystalline protein domains are integrated in the repetitive regions of Fhc and MA exons, but not in the protein glues Ser1 and BR-1. Genetic 'hot-spots' promote recombination errors in Fhc, MA, and Flag. Codon bias, structural constraint, point mutations, and shortened coding arrays may be alternative means of stabilizing precursor mRNA transcripts. Differential regulation of gene expression and selective splicing of the mRNA transcript may allow rapid adaptation of silk functional properties to different physical environments.
Creating reference gene annotation for the mouse C57BL6/J genome assembly.

PubMed

Mudge, Jonathan M; Harrow, Jennifer

2015-10-01

Annotation of the reference genome of the C57BL6/J mouse has been an ongoing project ever since the draft genome was first published. Initially, the principal focus was on the identification of all protein-coding genes, although today the importance of describing long non-coding RNAs, small RNAs, and pseudogenes is recognized. Here, we describe the progress of the GENCODE mouse annotation project, which combines manual annotation from the HAVANA group with Ensembl computational annotation, alongside experimental and in silico validation pipelines from other members of the consortium. We discuss the more recent incorporation of next-generation sequencing datasets into this workflow, including the usage of mass-spectrometry data to potentially identify novel protein-coding genes. Finally, we will outline how the C57BL6/J genebuild can be used to gain insights into the variant sites that distinguish different mouse strains and species.
Production and purification of recombinant human glucagon overexpressed as intein fusion protein in Escherichia coli.

PubMed

Esipov, Roman S; Stepanenko, Vasily N; Gurevich, Alexandr I; Chupova, Larisa A; Miroshnikov, Anatoly I

2006-01-01

Chemico-enzymatic synthesis and cloning in Escherichia coli of an artificial gene coding for human glucagon was performed. A recombinant plasmid containing a hybrid of the glucagon gene and the Ssp dnaB intein from Synechocystis sp. was designed. Expression of the obtained hybrid gene in E. coli, properties of the formed hybrid protein, and conditions of its autocatalytic cleavage leading to glucagon formation were studied.
Mechanisms of radiation-induced gene responses

DOE Office of Scientific and Technical Information (OSTI.GOV)

Woloschak, G.E.; Paunesku, T.

1996-10-01

In the process of identifying genes differentially expressed in cells exposed to ultraviolet radiation, we have identified a transcript having a 26-bp region that is highly conserved in a variety of species including Bacillus circulans, yeast, pumpkin, Drosophila, mouse, and man. When found in the 5' region (flanking region or UTR) of a gene, the sequence is predominantly in the +/+ orientation with respect to the coding DNA strand, while in the coding region and the 3' region (UTR), the sequence is most frequently in the +/- orientation with respect to the coding DNA strand. In two genes, the element is split into two parts; however, in most cases, it is found only once but with a minimum of 11 consecutive nucleotides precisely depicting the original sequence. The element is found in a large number of different genes with diverse functions (from human ras p21 to B. circulans chitinase). Gel shift assays demonstrated the presence of a protein in HeLa cell extracts that binds to the sense and antisense single-stranded consensus oligomers, as well as to the double-stranded oligonucleotide. When the double-stranded oligomer was used, the size shift demonstrated an additional protein-oligomer complex larger than the one bound to either sense or antisense single-stranded consensus oligomers alone. It is speculated either that this element binds to protein(s) important in maintaining DNA in a single-stranded orientation for transcription or, alternatively, that this element is important in the transcription-coupled DNA repair process.
Aberrant expression of NKL homeobox gene HLX in Hodgkin lymphoma.

PubMed

Nagel, Stefan; Pommerenke, Claudia; Meyer, Corinna; Kaufmann, Maren; MacLeod, Roderick A F; Drexler, Hans G

2018-03-06

NKL homeobox genes are basic regulators of cell and tissue differentiation, many acting as oncogenes in T-cell leukemia. Recently, we described a hematopoietic NKL-code comprising six particular NKL homeobox genes expressed in hematopoietic stem cells and lymphoid progenitors, unmasking their physiological roles in the development of these cell types. Hodgkin lymphoma (HL) is a B-cell malignancy showing aberrant activity of several developmental genes resulting in disturbed B-cell differentiation. To examine potential concordances in abnormal lymphoid differentiation of T- and B-cell malignancies, we analyzed the expression of the hematopoietic NKL-code associated genes in HL, comprising HHEX, HLX, MSX1, NKX2-3, NKX3-1 and NKX6-3. Our approach revealed aberrant HLX activity in 8% of classical HL patients and additionally in HL cell line L-540. Accordingly, to identify upstream regulators and downstream target genes of HLX, we used L-540 cells as a model and performed chromosome and genome analyses, comparative expression profiling and functional assays via knockdown and overexpression experiments therein. These investigations excluded chromosomal rearrangements of the HLX locus at 1q41 and demonstrated that STAT3 operated directly as transcriptional activator of the HLX gene. Moreover, subcellular analyses showed highly enriched STAT3 protein in the nucleus of L-540 cells which underwent cytoplasmic translocation by repressing deacetylation. Finally, HLX inhibited transcription of B-cell differentiation factors MSX1, BCL11A and SPIB and of pro-apoptotic factor BCL2L11/BIM, thereby suppressing Etoposide-induced cell death. Collectively, we propose that aberrantly expressed NKL homeobox gene HLX is part of a pathological gene network in HL, driving deregulated B-cell differentiation and survival.
Tau mRNA 3'UTR-to-CDS ratio is increased in Alzheimer disease.

PubMed

García-Escudero, Vega; Gargini, Ricardo; Martín-Maestro, Patricia; García, Esther; García-Escudero, Ramón; Avila, Jesús

2017-08-10

Neurons frequently show an imbalance in expression of the 3' untranslated region (3'UTR) relative to the coding DNA sequence (CDS) region of mature messenger RNAs (mRNA). The ratio varies among different cells or parts of the brain. The Map2 protein levels per cell depend on the 3'UTR-to-CDS ratio rather than the total mRNA amount, which suggests powerful regulation of protein expression by 3'UTR sequences. Here we found that MAPT (the microtubule-associated protein tau gene) 3'UTR levels are particularly high with respect to other genes; indeed, the 3'UTR-to-CDS ratio of MAPT is balanced in healthy brain in mouse and human. The tau protein accumulates in Alzheimer diseased brain. We nonetheless observed that the levels of RNA encoding MAPT/tau were diminished in these patients' brains. To explain this apparently contradictory result, we studied MAPT mRNA stoichiometry in coding and non-coding regions, and found that the 3'UTR-to-CDS ratio was higher in the hippocampus of Alzheimer disease patients, with higher tau protein but lower total mRNA levels. Our data indicate that changes in the 3'UTR-to-CDS ratio have a regulatory role in the disease. Future research should thus consider not only mRNA levels, but also the ratios between coding and non-coding regions. Copyright © 2017 Elsevier B.V. All rights reserved.
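As a rough illustration of the kind of ratio this abstract describes, the sketch below computes a length-normalized 3'UTR-to-CDS ratio from read counts. The function name, the read counts, and the region lengths are all hypothetical; the abstract does not state how the authors quantified the two regions, so this is one plausible way to express the idea, not their method.

```python
def utr_to_cds_ratio(utr_reads, utr_len, cds_reads, cds_len):
    """Ratio of per-base read density in the 3'UTR to that in the CDS.

    All inputs are hypothetical quantities chosen for illustration.
    """
    if utr_len <= 0 or cds_len <= 0:
        raise ValueError("region lengths must be positive")
    if cds_reads == 0:
        raise ValueError("CDS read count is zero; ratio is undefined")
    utr_density = utr_reads / utr_len   # reads per base in the 3'UTR
    cds_density = cds_reads / cds_len   # reads per base in the CDS
    return utr_density / cds_density


# Toy numbers only (not data from the study): the second sample has fewer
# total reads but a higher 3'UTR-to-CDS ratio.
control = utr_to_cds_ratio(utr_reads=400, utr_len=4000, cds_reads=100, cds_len=1000)
patient = utr_to_cds_ratio(utr_reads=300, utr_len=4000, cds_reads=50, cds_len=1000)
```

The toy values show how total transcript signal and the ratio between regions can move in opposite directions, which is the distinction the abstract asks future studies to consider.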
Origins of genes: "big bang" or continuous creation?

PubMed Central

Keese, P K; Gibbs, A

1992-01-01

Many protein families are common to all cellular organisms, indicating that many genes have ancient origins. Genetic variation is mostly attributed to processes such as mutation, duplication, and rearrangement of ancient modules. Thus it is widely assumed that much of present-day genetic diversity can be traced by common ancestry to a molecular "big bang." A rarely considered alternative is that proteins may arise continuously de novo. One mechanism of generating different coding sequences is by "overprinting," in which an existing nucleotide sequence is translated de novo in a different reading frame or from noncoding open reading frames. The clearest evidence for overprinting is provided when the original gene function is retained, as in overlapping genes. Analysis of their phylogenies indicates which are the original genes and which are their informationally novel partners. We report here the phylogenetic relationships of overlapping coding sequences from steroid-related receptor genes and from tymovirus, luteovirus, and lentivirus genomes. For each pair of overlapping coding sequences, one is confined to a single lineage, whereas the other is more widespread. This suggests that the phylogenetically restricted coding sequence arose only in the progenitor of that lineage by translating an out-of-frame sequence to yield the new polypeptide. The production of novel exons by alternative splicing in thyroid receptor and lentivirus genes suggests that introns can be a valuable evolutionary source for overprinting. New genes and their products may drive major evolutionary changes. PMID:1329098
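The notion of "overprinting" rests on the fact that one nucleotide sequence reads as different polypeptides in different reading frames. A minimal sketch of that idea, using the standard genetic code and a made-up 10-nt sequence (the sequence and function name are illustrative assumptions, not taken from the paper):

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, frame=0):
    """Translate a DNA sequence in forward frame 0, 1, or 2 ('*' marks a stop).

    Trailing bases that do not fill a codon are ignored; characters other
    than A, C, G, T raise KeyError.
    """
    seq = seq.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


toy = "ATGGCATTGA"  # hypothetical sequence for illustration
frames = {f: translate(toy, f) for f in range(3)}
```

Here the same ten nucleotides give three unrelated peptides, one per forward frame; the sketch ignores the reverse strand and says nothing about which frame, if any, is actually expressed.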
Complete mitochondrial genome of Yangtze River wild common carp (Cyprinus carpio haematopterus) and Russian scattered scale mirror carp (Cyprinus carpio carpio).

PubMed

Hu, Guang Fu; Liu, Xiang Jiang; Zou, Gui Wei; Li, Zhong; Liang, Hong-Wei; Hu, Shao-Na

2016-01-01

We sequenced the complete mitogenomes of Yangtze River wild common carp (Cyprinus carpio haematopterus) and Russian scattered scale mirror carp (Cyprinus carpio carpio). Comparison revealed that the mitogenomes of these two common carp strains were remarkably similar in genome length, gene order and content, and AT content. There were only 55 bp variations in 16,581 nucleotides: 1 bp was located in rRNAs, 2 bp in tRNAs, 9 bp in the control region and 43 bp in protein-coding genes. Furthermore, the forty-three variable nucleotides in the protein-coding genes of the two strains led to four variable amino acids, which were located in the ND2, ATPase 6, ND5 and ND6 genes, respectively.
Next-Generation Sequencing of Protein-Coding and Long Non-protein-Coding RNAs in Two Types of Exosomes Derived from Human Whole Saliva.

PubMed

Ogawa, Yuko; Tsujimoto, Masafumi; Yanoshita, Ryohei

2016-01-01

Exosomes are small extracellular vesicles containing microRNAs and mRNAs that are produced by various types of cells. We previously used ultrafiltration and size-exclusion chromatography to isolate two types of human salivary exosomes (exosomes I, II) that are different in size and proteomes. We showed that salivary exosomes contain large repertoires of small RNAs. However, precise information regarding long RNAs in salivary exosomes has not been fully determined. In this study, we investigated the compositions of protein-coding RNAs (pcRNAs) and long non-protein-coding RNAs (lncRNAs) of exosome I, exosome II and whole saliva (WS) by next-generation sequencing technology. Although 11% of all RNAs were commonly detected among the three samples, the compositions of reads mapping to known RNAs were similar. The most abundant pcRNA is ribosomal RNA protein, and pcRNAs of some salivary proteins such as S100 calcium-binding protein A8 (protein S100-A8) were present in salivary exosomes. Interestingly, lncRNAs of pseudogenes (presumably, processed pseudogenes) were abundant in exosome I, exosome II and WS. Translationally controlled tumor protein gene, which plays an important role in cell proliferation, cell death and immune responses, was highly expressed as pcRNA and pseudogenes in salivary exosomes. Our results show that salivary exosomes contain various types of RNAs such as pseudogenes and small RNAs, and may mediate intercellular communication by transferring these RNAs to target cells as gene expression regulators.
Complete genome sequence of Granulicella mallensis type strain MP5ACTX8T, an acidobacterium from tundra soil

PubMed Central

Rawat, Suman R.; Männistö, Minna K.; Starovoytov, Valentin; Goodwin, Lynne; Nolan, Matt; Hauser, Loren J.; Land, Miriam; Davenport, Karen Walston; Woyke, Tanja; Häggblom, Max M.

2013-01-01

Granulicella mallensis MP5ACTX8T is a novel species of the genus Granulicella in subdivision 1 of Acidobacteria. G. mallensis is of ecological interest, being a member of the dominant soil bacterial community active at low temperatures and nutrient-limiting conditions in Arctic alpine tundra. G. mallensis is a cold-adapted acidophile and a versatile heterotroph that hydrolyzes a suite of sugars and complex polysaccharides. Genome analysis revealed metabolic versatility with genes involved in metabolism and transport of carbohydrates. These include gene modules encoding the carbohydrate-active enzyme (CAZyme) family involved in breakdown, utilization and biosynthesis of diverse structural and storage polysaccharides including plant-based carbon polymers. The genome of Granulicella mallensis MP5ACTX8T consists of a single replicon of 6,237,577 base pairs (bp) with 4,907 protein-coding genes and 53 RNA genes. PMID:24501646
Complete mitochondrial genome of the Yellow-spotted skate Okamejei hollandi (Rajiformes: Rajidae).

PubMed

Li, Weidong; Chen, Xiao; Liu, Wenai; Sun, Renjie; Zhou, Haolang

2016-07-01

The complete mitochondrial genome of the Yellow-spotted skate Okamejei hollandi was determined in this study. It is 16,974 bp in length and contains 13 protein-coding genes, two rRNA genes, 22 tRNA genes, and one putative control region. The overall base composition is 30.5% A, 27.8% C, 14.0% G, and 27.8% T. There are 28 bp short intergenic spaces located in 12 gene junctions and 31 bp overlaps located in nine gene junctions in the whole mitogenome. Two start codons (ATG and GTG) and two stop codons (TAG and TAA/T) were used in the protein-coding genes. The lengths of 22 tRNA genes range from 68 (tRNA-Ser2) to 75 (tRNA-Leu1) bp. The origin of L-strand replication (OL) sequence (37 bp) was identified between the tRNA-Asn and tRNA-Cys genes. The control region is 1311 bp in length with high A + T and poor G content.
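The base composition reported above can be turned into the summary statistics commonly quoted for mitogenomes. The sketch below derives A+T content, G+C content, and the conventional skew measures, AT skew = (A - T)/(A + T) and GC skew = (G - C)/(G + C), from the four percentages in the abstract. The function name is hypothetical, and the skew values are derived here rather than reported by the authors.

```python
def composition_summary(a, c, g, t):
    """Summarize base composition given percentages of A, C, G and T."""
    return {
        "total": a + c + g + t,            # may differ from 100 due to rounding
        "at_content": a + t,
        "gc_content": g + c,
        "at_skew": (a - t) / (a + t),
        "gc_skew": (g - c) / (g + c),
    }


# Percentages as given in the abstract for Okamejei hollandi.
summary = composition_summary(a=30.5, c=27.8, g=14.0, t=27.8)
```

With these inputs the four percentages sum to slightly more than 100 because each was rounded to one decimal place; the negative GC skew reflects the excess of C over G on the reported strand.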
Biodegradation of DDT by Stenotrophomonas sp. DDT-1: Characterization and genome functional analysis

PubMed Central

Pan, Xiong; Lin, Dunli; Zheng, Yuan; Zhang, Qian; Yin, Yuanming; Cai, Lin; Fang, Hua; Yu, Yunlong

2016-01-01

A novel bacterium capable of utilizing 1,1,1-trichloro-2,2-bis(p-chlorophenyl)ethane (DDT) as the sole carbon and energy source was isolated from a contaminated soil and identified as Stenotrophomonas sp. DDT-1 based on morphological characteristics, BIOLOG GN2 microplate profile, and 16S rDNA phylogeny. Genome sequencing and functional annotation of the isolate DDT-1 showed a 4,514,569 bp genome size, 66.92% GC content, 4,033 protein-coding genes, and 76 RNA genes including 8 rRNA genes. In total, 2,807 protein-coding genes were assigned to Clusters of Orthologous Groups (COGs), and 1,601 protein-coding genes were mapped to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The degradation half-lives of DDT increased with substrate concentration from 0.1 to 10.0 mg/l, whereas they decreased with temperature from 15 °C to 35 °C. Neutral conditions were the most favorable for DDT biodegradation. Based on genome annotation of DDT degradation genes and the metabolites detected by GC-MS, a mineralization pathway was proposed for DDT biodegradation in which it was sequentially converted into DDE/DDD, DDMU, DDOH, and DDA via dechlorination, hydroxylation, and carboxylation, and ultimately mineralized to carbon dioxide. The results indicate that the isolate DDT-1 is a promising bacterial resource for the removal or detoxification of DDT residues in the environment. PMID:26888254
Expression and regulation of long noncoding RNAs during the osteogenic differentiation of periodontal ligament stem cells in the inflammatory microenvironment.

PubMed

Zhang, Qingbin; Chen, Li; Cui, Shiman; Li, Yan; Zhao, Qi; Cao, Wei; Lai, Shixiang; Yin, Sanjun; Zuo, Zhixiang; Ren, Jian

2017-10-25

Although long noncoding RNAs (lncRNAs) have been emerging as critical regulators in various tissues and biological processes, little is known about their expression and regulation during the osteogenic differentiation of periodontal ligament stem cells (PDLSCs) in the inflammatory microenvironment. In this study, we have identified 63 lncRNAs that are not annotated in previous databases. These novel lncRNAs were not randomly located in the genome but preferentially located near protein-coding genes related to particular functions and diseases, such as stem cell maintenance and differentiation, developmental disorders and inflammatory diseases. Moreover, we have identified 650 differentially expressed lncRNAs among different subsets of PDLSCs. Pathway enrichment analysis for neighboring protein-coding genes of these differentially expressed lncRNAs revealed stem cell differentiation related functions. Many of these differentially expressed lncRNAs function as competing endogenous RNAs that regulate protein-coding transcripts by competing for shared miRNAs.
Identification of a cis-regulatory region of a gene in Arabidopsis thaliana whose induction by dehydration is mediated by abscisic acid and requires protein synthesis.

PubMed

Iwasaki, T; Yamaguchi-Shinozaki, K; Shinozaki, K

1995-05-20

In Arabidopsis thaliana, the induction of a dehydration-responsive gene, rd22, is mediated by abscisic acid (ABA) but the gene does not include any sequence corresponding to the consensus ABA-responsive element (ABRE), RYACGTGGYR, in its promoter region. The cis-regulatory region of the rd22 promoter was identified by monitoring the expression of beta-glucuronidase (GUS) activity in leaves of transgenic tobacco plants transformed with chimeric gene fusions constructed between 5'-deleted promoters of rd22 and the coding region of the GUS reporter gene. A 67-bp nucleotide fragment corresponding to positions -207 to -141 of the rd22 promoter conferred responsiveness to dehydration and ABA on a non-responsive promoter. The 67-bp fragment contains the sequences of the recognition sites for some transcription factors, such as MYC, MYB, and GT-1. The fact that accumulation of rd22 mRNA requires protein synthesis raises the possibility that the expression of rd22 might be regulated by one of these trans-acting protein factors whose de novo synthesis is induced by dehydration or ABA. Although the structure of the RD22 protein is very similar to that of a non-storage seed protein, USP, of Vicia faba, the expression of the GUS gene driven by the rd22 promoter in non-stressed transgenic Arabidopsis plants was found mainly in flowers and bolted stems rather than in seeds.
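The ABRE consensus quoted above, RYACGTGGYR, uses IUPAC ambiguity codes (R = A or G, Y = C or T). One way to check a promoter sequence for such a motif is to expand the codes into a regular expression, as sketched below. The function names and both test sequences are hypothetical; the real rd22 promoter sequence is not given in the abstract, and the sketch scans the forward strand only.

```python
import re

# IUPAC nucleotide ambiguity codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(motif):
    """Compile an IUPAC motif; the lookahead lets overlapping hits be found."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=" + body + ")")


def find_motif(motif, seq):
    """Return 0-based start positions of the motif on the forward strand."""
    return [m.start() for m in iupac_to_regex(motif).finditer(seq.upper())]


ABRE = "RYACGTGGYR"
with_abre = "TTTGCACGTGGCATTT"      # made-up sequence containing one match
without_abre = "TTTGCACGTTGCATTT"   # made-up sequence with the core disrupted

hits = find_motif(ABRE, with_abre)
no_hits = find_motif(ABRE, without_abre)
```

A complete search would also scan the reverse complement, which this sketch leaves out for brevity.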
Orexin gene therapy restores the timing and maintenance of wakefulness in narcoleptic mice.

PubMed

Kantor, Sandor; Mochizuki, Takatoshi; Lops, Stefan N; Ko, Brian; Clain, Elizabeth; Clark, Erika; Yamamoto, Mihoko; Scammell, Thomas E

2013-08-01

Narcolepsy is caused by selective loss of the orexin/hypocretin-producing neurons of the hypothalamus. For patients with narcolepsy, chronic sleepiness is often the most disabling symptom, but current therapies rarely normalize alertness and do not address the underlying orexin deficiency. We hypothesized that the sleepiness of narcolepsy would substantially improve if orexin signaling were restored in specific brain regions at appropriate times of day. We used gene therapy to restore orexin signaling in a mouse model of narcolepsy. In these Atx mice, expression of a toxic protein (ataxin-3) selectively kills the orexin neurons. To induce ectopic expression of the orexin neuropeptides, we microinjected an adeno-associated viral vector coding for prepro-orexin plus a red fluorescence protein (AAV-orexin) into the mediobasal hypothalamus of Atx and wild-type mice. Control mice received an AAV coding only for red fluorescence protein. Two weeks later, we recorded sleep/wake behavior, locomotor activity, and body temperature and examined the patterns of orexin expression. Atx mice rescued with AAV-orexin produced long bouts of wakefulness and had a normal diurnal pattern of arousal, with the longest bouts of wake and the highest amounts of locomotor activity in the first hours of the night. In addition, AAV-orexin improved the timing of rapid eye movement sleep and the consolidation of nonrapid eye movement sleep in Atx mice. These substantial improvements in sleepiness and other symptoms of narcolepsy demonstrate the effectiveness of orexin gene therapy in a mouse model of narcolepsy. Additional work is needed to optimize this approach, but in time, AAV-orexin could become a useful therapeutic option for patients with narcolepsy.

First complete mitochondrial genome of the South American annual fish Austrolebias charrua (Cyprinodontiformes: Rivulidae): peculiar features among cyprinodontiforms mitogenomes.

PubMed

Gutiérrez, Verónica; Rego, Natalia; Naya, Hugo; García, Graciela

2015-10-28

Among teleosts, the South American genus Austrolebias (Cyprinodontiformes: Rivulidae) includes 42 taxa of annual fishes divided into five different species groups. It is a monophyletic genus, but morphological and molecular data do not resolve the relationship among intrageneric clades, and high rates of substitution have been previously described in some mitochondrial genes. In this work, the complete mitogenome of a species of the genus was determined for the first time. We determined its structure, gene order and peculiar evolutionary features, which will allow us to evaluate the performance of mitochondrial genes in the phylogenetic resolution at different taxonomic levels. Regarding gene content and order, the circular mitogenome of A. charrua (17,271 bp) presents the typical pattern of vertebrate mitogenomes. It contains the full complement of 13 protein-coding genes, 22 tRNAs, 2 rRNAs and one non-coding control region. Notably, the tRNA-Cys was only 57 bp in length and lacks the D-loop arm. In three full sibling individuals, a heteroplasmic condition was detected due to a total of 12 variable sites in seven protein-coding genes. Among cyprinodontiforms, the mitogenome of A. charrua exhibits the lowest G+C content (37%) and GC skew, as well as the highest strand asymmetry with a net difference of T over A at 1st and 3rd codon positions. Considering the 12 coding genes of the H strand, correspondence analyses of nucleotide composition and codon usage show that A and T at 1st and 3rd codon positions have the highest weight in the first axis, and segregate annual species from the other cyprinodontiforms analyzed. Given the annual life-style, their mitogenomes could be under different selective pressures. All 13 protein-coding genes are under strong purifying selection and we did not find any significant evidence of nucleotide sites showing episodic selection (dN > dS) in annual lineages.
When fast-evolving third codon positions were removed from alignments, the "supergene" tree recovers our reference species phylogeny, as do the Cytb, ND4L and ND6 genes. Therefore, third codon positions seem to be saturated in the aforementioned coding regions in intergeneric Cyprinodontiformes comparisons. The complete mitogenome obtained in the present work offers relevant data for further comparative studies on the molecular phylogeny and systematics of this taxonomically controversial endemic genus of annual fishes.
Transcription and DNA Damage: Holding Hands or Crossing Swords?

PubMed

D'Alessandro, Giuseppina; d'Adda di Fagagna, Fabrizio

2017-10-27

Transcription has classically been considered a potential threat to genome integrity. Collision between transcription and DNA replication machinery, and retention of DNA:RNA hybrids, may result in genome instability. On the other hand, it has been proposed that active genes repair faster and preferentially via homologous recombination. Moreover, while canonical transcription is inhibited in the proximity of DNA double-strand breaks, a growing body of evidence supports active non-canonical transcription at DNA damage sites. Small non-coding RNAs accumulate at DNA double-strand break sites in mammals and other organisms, and are involved in DNA damage signaling and repair. Furthermore, RNA binding proteins are recruited to DNA damage sites and participate in the DNA damage response. Here, we discuss the impact of transcription on genome stability, the role of RNA binding proteins at DNA damage sites, and the function of small non-coding RNAs generated upon damage in the signaling and repair of DNA lesions. Copyright © 2016 Elsevier Ltd. All rights reserved.
Development-related expression patterns of protein-coding and miRNA genes involved in porcine muscle growth.

PubMed

Wang, F J; Jin, L; Guo, Y Q; Liu, R; He, M N; Li, M Z; Li, X W

2014-11-27

Muscle growth and development are associated with remarkable changes in protein-coding and microRNA (miRNA) gene expression. To determine the expression patterns of genes and miRNAs related to muscle growth and development, we measured the expression levels of 25 protein-coding and 16 miRNA genes in skeletal and cardiac muscles throughout 5 developmental stages by quantitative reverse transcription-polymerase chain reaction. The Short Time-Series Expression Miner (STEM) software clustering results showed that growth-related genes were downregulated at all developmental stages in both the psoas major and longissimus dorsi muscles, indicating their involvement in early developmental stages. Furthermore, genes related to muscle atrophy, such as forkhead box 1 and muscle ring finger, showed upregulated expression with increasing age, suggesting a decrease in protein synthesis during the later stages of skeletal muscle development. We found that development of the cardiac muscle was a complex process in which growth-related genes were highly expressed during embryonic development, but they did not show uniform postnatal expression patterns. Moreover, the expression level of miR-499, which enhances the expression of the β-myosin heavy chain, was significantly different in the psoas major and longissimus dorsi muscles, suggesting the involvement of miR-499 in the determination of skeletal muscle fiber types. We also performed correlation analyses of messenger RNA and miRNA expression. We found negative relationships between miR-486 and forkhead box 1, and between miR-133a and serum response factor, at all developmental stages, suggesting that forkhead box 1 and serum response factor are potential targets of miR-486 and miR-133a, respectively.
Draft Genome Sequence of the Deinococcus-Thermus Bacterium Meiothermus ruber Strain A

DOE PAGES

Thiel, Vera; Tomsho, Lynn P.; Burhans, Richard; ...

2015-03-26

The draft genome sequence of the Deinococcus-Thermus group bacterium Meiothermus ruber strain A, isolated from a cyanobacterial enrichment culture obtained from Octopus Spring (Yellowstone National Park, WY), comprises 2,968,099 bp in 170 contigs. It is predicted to contain 2,895 protein-coding genes, 44 tRNA-coding genes, and 2 rRNA operons.
Inducible Knockout of the Cyclin-Dependent Kinase 5 Activator p35 Alters Hippocampal Spatial Coding and Neuronal Excitability

PubMed Central

Kamiki, Eriko; Boehringer, Roman; Polygalov, Denis; Ohshima, Toshio; McHugh, Thomas J.

2018-01-01

p35 is an activating co-factor of Cyclin-dependent kinase 5 (Cdk5), a protein whose dysfunction has been implicated in a wide range of neurological disorders including cognitive impairment and disease. Inducible deletion of the p35 gene in adult mice results in profound deficits in hippocampal-dependent spatial learning and synaptic physiology; however, the impact of the loss of p35 function on hippocampal in vivo physiology and spatial coding remains unknown. Here, we recorded CA1 pyramidal cell activity in freely behaving p35 cKO and control mice and found that place cells in the mutant mice have elevated firing rates and impaired spatial coding, accompanied by changes in the temporal organization of spiking both during exploration and rest. These data shed light on the role of p35 in maintaining cellular and network excitability and provide a physiological correlate of the spatial learning deficits in these mice. PMID:29867369
Integrative analyses of RNA editing, alternative splicing, and expression of young genes in human brain transcriptome by deep RNA sequencing.

PubMed

Wu, Dong-Dong; Ye, Ling-Qun; Li, Yan; Sun, Yan-Bo; Shao, Yi; Chen, Chunyan; Zhu, Zhu; Zhong, Li; Wang, Lu; Irwin, David M; Zhang, Yong E; Zhang, Ya-Ping

2015-08-01

Next-generation RNA sequencing has been successfully used for identification of transcript assembly, evaluation of gene expression levels, and detection of post-transcriptional modifications. Despite these large-scale studies, additional comprehensive RNA-seq data from different subregions of the human brain are required to fully evaluate the evolutionary patterns experienced by the human brain transcriptome. Here, we provide a total of 6.5 billion RNA-seq reads from different subregions of the human brain. A significant correlation was observed between the levels of alternative splicing and RNA editing, which might be explained by a competition between the molecular machineries responsible for the splicing and editing of RNA. Young human protein-coding genes demonstrate biased expression to the neocortical and non-neocortical regions during evolution on the lineage leading to humans. We also found that a significantly greater number of young human protein-coding genes are expressed in the putamen, a tissue that was also observed to have the highest level of RNA-editing activity. The putamen, which previously received little attention, plays an important role in cognitive ability, and our data suggest a potential contribution of the putamen to human evolution. © The Author (2015). Published by Oxford University Press on behalf of Journal of Molecular Cell Biology, IBCB, SIBS, CAS. All rights reserved.
Growth of Trametes versicolor on phenol.

PubMed

Yemendzhiev, H; Gerginova, M; Krastanov, A; Stoilova, I; Alexieva, Z

2008-11-01

Trametes versicolor 1 was shown to grow on phenol as its sole carbon and energy source. Culture growth and degradation ability were found to depend on the pH of the culture medium. The optimal pH value of a liquid Czapek salt medium was 6.5. The investigated strain completely utilized 0.5 g/l phenol in 6 days. The dynamics of the phenol degradation process were investigated. The process was characterized by a specific growth rate μmax = 0.33 h(-1), metabolic coefficient k = 4.4, yield coefficient Yx/s = 0.23 and rate of degradation Q = 0.506 h(-1). The intracellular activities of phenol hydroxylase (0.333 U/mg protein) and cis,cis-muconate lactonizing enzyme (0.41 U/mg protein) were demonstrated for the first time in this fungus. In an attempt to estimate the occurrence of gene sequences in T. versicolor 1 related to the phenol degradation pathway, a dot blot analysis with total DNA isolated from this strain was performed. Two synthetic oligonucleotides were used as hybridizing probes. One of the probes was homologous to the 5' end of the phyA gene coding for phenol hydroxylase in Trichosporon cutaneum ATCC 46490. The other probe was created on the basis of the gene coding for cis,cis-muconate lactonizing enzyme in T. cutaneum ATCC 58094. The results of these investigations showed that T. versicolor 1 may carry genes similar to the phenol degradation genes of Trichosporon cutaneum.
Recombinant lactoferrin (Lf) of Vechur cow, the critical breed of Bos indicus and the Lf gene variants.

PubMed

Anisha, Shashidharan; Bhasker, Salini; Mohankumar, Chinnamma

2012-03-01

Vechur cow, categorized as a critically maintained breed by the FAO, is a unique breed of Bos indicus due to its extremely small size, less fodder intake, adaptability, easy domestication and traditional medicinal property of the milk. Lactoferrin (Lf) is an iron-binding glycoprotein that is found predominantly in the milk of mammals. The full coding region of Lf gene of Vechur cow was cloned, sequenced and expressed in a prokaryotic system. Antibacterial activity of the recombinant Lf showed suppression of bacterial growth. To the best of our knowledge this is the first time that the full coding region of Lf gene of B. indicus Vechur breed is sequenced, successfully expressed in a prokaryotic system and characterized. Comparative analysis of Lf gene sequence of five Vechur cows with B. taurus revealed 15 SNPs in the exon region associated with 11 amino acid substitutions. The amino acid arginine was noticed as a pronounced substitution and the tertiary structure analysis of the BLfV protein confirmed the positions of arginine in the β sheet region, random coil and helix region 1. Based on the recent reports on the nutritional therapies of arginine supplementation for wound healing and for cardiovascular diseases, the higher level of arginine in the lactoferrin protein of Vechur cow milk provides enormous scope for further therapeutic studies. Copyright © 2011 Elsevier B.V. All rights reserved.
The complete mitochondrial genome of the mudsnail Cipangopaludina cathayensis (Gastropoda: Viviparidae).

PubMed

Yang, Huirong; Zhang, Jia-En; Luo, Hao; Luo, Mingzhu; Guo, Jing; Deng, Zhixin; Zhao, Benliang

2016-05-01

We present the complete mitochondrial genome of Cipangopaludina cathayensis in this study. The mitochondrial genome is 17,157 bp in length, containing 13 protein-coding genes, 2 rRNA genes, and 22 tRNA genes. All of them are encoded on the heavy strand except 7 tRNA genes on the light strand. The overall nucleotide composition of the light strand is 44.51% A, 26.74% T, 20.48% C and 8.28% G. All the protein-coding genes start with the ATG initiation codon except ATP6 with ATA and ND4 with TTG, and the 2 types of termination codons are TAA (ATP6, ND2, COX1, COX2, ATP8, ND1, ND6, Cytb, COX3, ND4) and TAG (ND4L, ND5, ND3). There are 29 intergenic spacers and 5 gene overlaps. Tandem repeat sequences are observed in the COX2, tRNA(Asp), ATP6, tRNA(Cys), S-rRNA, ND1, Cytb, ND4 and COX3 genes. Gene arrangement and distribution are different from those of typical vertebrates. The absence of a D-loop is consistent with the Gastropoda, but at least one lengthy non-coding region is an essential regulatory element for the initiation of transcription and replication.
Amino- and carboxyl-terminal amino acid sequences of proteins coded by gag gene of murine leukemia virus

PubMed Central

Oroszlan, Stephen; Henderson, Louis E.; Stephenson, John R.; Copeland, Terry D.; Long, Cedric W.; Ihle, James N.; Gilden, Raymond V.

1978-01-01

The amino- and carboxyl-terminal amino acid sequences of proteins (p10, p12, p15, and p30) coded by the gag gene of Rauscher and AKR murine leukemia viruses were determined. Among these proteins, p15 from both viruses appears to have a blocked amino end. Proline was found to be the common NH2 terminus of both p30s and both p12s, and alanine of both p10s. The amino-terminal sequences of p30s are identical, as are those of p10s, while the p12 sequences are clearly distinctive but also show substantial homology. The carboxyl-terminal amino acids of both viral p30s and p12s are leucine and phenylalanine, respectively. Rauscher leukemia virus p15 has tyrosine as the carboxyl terminus while AKR virus p15 has phenylalanine in this position. The compositional and sequence data provide definite chemical criteria for the identification of analogous gag gene products and for the comparison of viral proteins isolated in different laboratories. On the basis of amino acid sequences and the previously proposed H-p15-p12-p30-p10-COOH peptide sequence in the precursor polyprotein, a model for cleavage sites involved in the post-translational processing of the precursor coded for by the gag gene is proposed. PMID:206897
FunGene: the functional gene pipeline and repository.

PubMed

Fish, Jordan A; Chai, Benli; Wang, Qiong; Sun, Yanni; Brown, C Titus; Tiedje, James M; Cole, James R

2013-01-01

Ribosomal RNA genes have become the standard molecular markers for microbial community analysis for good reasons, including universal occurrence in cellular organisms, availability of large databases, and ease of rRNA gene region amplification and analysis. As markers, however, rRNA genes have some significant limitations. The rRNA genes are often present in multiple copies, unlike most protein-coding genes. The slow rate of change in rRNA genes means that multiple species sometimes share identical 16S rRNA gene sequences, while many more species share identical sequences in the short 16S rRNA regions commonly analyzed. In addition, the genes involved in many important processes are not distributed in a phylogenetically coherent manner, potentially due to gene loss or horizontal gene transfer. While rRNA genes remain the most commonly used markers, key genes in ecologically important pathways, e.g., those involved in carbon and nitrogen cycling, can provide important insights into community composition and function not obtainable through rRNA analysis. However, working with ecofunctional gene data requires some tools beyond those required for rRNA analysis. To address this, our Functional Gene Pipeline and Repository (FunGene; http://fungene.cme.msu.edu/) offers databases of many common ecofunctional genes and proteins, as well as integrated tools that allow researchers to browse these collections and choose subsets for further analysis, build phylogenetic trees, test primers and probes for coverage, and download aligned sequences. Additional FunGene tools are specialized to process coding gene amplicon data. For example, FrameBot produces frameshift-corrected protein and DNA sequences from raw reads while finding the most closely related protein reference sequence. These tools can help provide better insight into microbial communities by directly studying key genes involved in important ecological processes.
Exploratory Investigation of Bacteroides fragilis Transcriptional Response during In vitro Exposure to Subinhibitory Concentration of Metronidazole

PubMed Central

de Freitas, Michele C. R.; Resende, Juliana A.; Ferreira-Machado, Alessandra B.; Saji, Guadalupe D. R. Q.; de Vasconcelos, Ana T. R.; da Silva, Vânia L.; Nicolás, Marisa F.; Diniz, Cláudio G.

2016-01-01

Bacteroides fragilis, a member of the commensal gut microbiota, is an important pathogen associated with endogenous infections, and metronidazole remains a valuable antibiotic for the treatment of these infections, although bacterial resistance is widely reported. Considering the need for a better understanding of the global mechanisms by which B. fragilis survives metronidazole exposure, we performed an RNA-seq transcriptomic analysis with validation of gene expression results by qPCR. Bacterial strains were selected after in vitro subcultures with a subinhibitory concentration (SIC) of the drug. From a wild type B. fragilis ATCC 43859 four derivative strains were selected: first and fourth subcultures under metronidazole exposure and first and fourth subcultures after drug removal. According to global gene expression analysis, 2,146 protein coding genes were identified, of which a total of 1,618 (77%) were assigned to a Gene Ontology term (GO), indicating that most known cellular functions were taken. Among these 2,146 protein coding genes, 377 were shared among all strains, suggesting that they are critical for B. fragilis survival. In order to identify distinct expression patterns, we also performed a K-means clustering analysis set to 15 groups. This analysis allowed us to detect the major activated or repressed genes encoding enzymes which act in several metabolic pathways involved in metronidazole response such as drug activation, defense mechanisms against superoxide ions, high expression level of multidrug efflux pumps, and DNA repair. The strains collected after metronidazole removal were functionally more similar to those cultured under drug pressure, reinforcing that drug exposure leads to drastic persistent changes in the B. fragilis gene expression patterns. These results may help to elucidate the B. fragilis response during metronidazole exposure, mainly at SIC, contributing information about bacterial survival strategies under stress conditions in their environment. PMID:27703449
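The K-means step mentioned above can be sketched in a few lines of plain Python. This is a minimal illustration of the general algorithm (Lloyd's iteration), not the authors' pipeline: the function `kmeans`, the two-dimensional toy expression profiles, the choice of three clusters and the fixed starting centroids are all assumptions made so the example is small and deterministic.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    distances = [squared_distance(point, c) for c in centroids]
    return distances.index(min(distances))


def kmeans(points, centroids, iterations=10):
    """Run a fixed number of Lloyd iterations; return labels and centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its members;
        # an empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(dim) / len(members) for dim in zip(*members)) if members else old
            for members, old in zip(clusters, centroids)
        ]
    labels = [nearest(p, centroids) for p in points]
    return labels, centroids


# Hypothetical log fold-change profiles for six genes in two conditions.
profiles = [(0.1, 0.2), (0.0, 0.3), (2.0, 2.1), (2.2, 1.9), (-2.0, -1.8), (-2.1, -2.2)]
labels, centers = kmeans(profiles, centroids=[profiles[0], profiles[2], profiles[4]])
```

In this toy case the genes fall into an unchanged group, an induced group and a repressed group. A real analysis with 15 groups would normally use an established library and several random restarts rather than hand-picked starting centroids.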
Transcriptomics Profiling of Alzheimer’s Disease Reveal Neurovascular Defects, Altered Amyloid-β Homeostasis, and Deregulated Expression of Long Noncoding RNAs

PubMed Central

Magistri, Marco; Velmeshev, Dmitry; Makhmutova, Madina; Faghihi, Mohammad Ali

2015-01-01

The underlying genetic variations of late-onset Alzheimer’s disease (LOAD) cases remain largely unknown. A combination of genetic variations with variable penetrance and lifetime epigenetic factors may converge on transcriptomic alterations that drive the LOAD pathological process. Transcriptome profiling using deep sequencing technology offers insight into common altered pathways regardless of underpinning genetic or epigenetic factors and thus represents an ideal tool to investigate molecular mechanisms related to the pathophysiology of LOAD. We performed directional RNA sequencing on high quality RNA samples extracted from hippocampi of LOAD and age-matched controls. We further validated our data using qRT-PCR on a larger set of postmortem brain tissues, confirming downregulation of the gene encoding substance P (TAC1) and upregulation of the gene encoding the plasminogen activator inhibitor-1 (SERPINE1). Pathway analysis indicates dysregulation in neural communication, cerebral vasculature, and amyloid-β clearance. Besides protein coding genes, we identified several annotated and non-annotated long noncoding RNAs that are differentially expressed in LOAD brain tissues; three of them are activity-dependent regulated and one is induced by Aβ1-42 exposure of human neural cells. Our data provide a comprehensive list of transcriptomic alterations in LOAD hippocampi and warrant a holistic approach including both coding and non-coding RNAs in functional studies aimed at understanding the pathophysiology of LOAD. PMID:26402107
Complete mitochondrial genome of Taharana fasciana (Insecta, Hemiptera: Cicadellidae) and comparison with other Cicadellidae insects.

PubMed

Wang, Jiajia; Li, Hu; Dai, Renhuai

2017-12-01

Here, we describe the first complete mitochondrial genome (mitogenome) sequence of the leafhopper Taharana fasciana (Coelidiinae). The mitogenome sequence contains 15,161 bp with an A + T content of 77.9%. It includes 13 protein-coding genes, two ribosomal RNA genes, 22 transfer RNA genes, and one non-coding (A + T-rich) region; in addition, a repeat region is also present (GenBank accession no. KY886913). These genes/regions are in the same order as in the inferred insect ancestral mitogenome. All protein-coding genes have ATN as the start codon, and TAA or single T as the stop codons, except the gene ND3, which ends with TAG. Furthermore, we predicted the secondary structures of the rRNAs in T. fasciana. Six domains (domain III is absent in arthropods) and 41 helices were predicted for 16S rRNA, and 12S rRNA comprised three structural domains and 24 helices. Phylogenetic tree analysis confirmed that T. fasciana and other members of the Cicadellidae are clustered into a clade, and it identified the relationships among the subfamilies Deltocephalinae, Coelidiinae, Idiocerinae, Cicadellinae, and Typhlocybinae.
Expression of the Long Intergenic Non-Protein Coding RNA 665 (LINC00665) Gene and the Cell Cycle in Hepatocellular Carcinoma Using The Cancer Genome Atlas, the Gene Expression Omnibus, and Quantitative Real-Time Polymerase Chain Reaction.

PubMed

Wen, Dong-Yue; Lin, Peng; Pang, Yu-Yan; Chen, Gang; He, Yun; Dang, Yi-Wu; Yang, Hong

2018-05-05

BACKGROUND Long non-coding RNAs (lncRNAs) have a role in physiological and pathological processes, including cancer. The aim of this study was to investigate the expression of the long intergenic non-protein coding RNA 665 (LINC00665) gene and the cell cycle in hepatocellular carcinoma (HCC) using database analysis including The Cancer Genome Atlas (TCGA), the Gene Expression Omnibus (GEO), and quantitative real-time polymerase chain reaction (qPCR). MATERIAL AND METHODS Expression levels of LINC00665 were compared between human tissue samples of HCC and adjacent normal liver, clinicopathological correlations were made using TCGA and the GEO, and qPCR was performed to validate the findings. Other public databases were searched for other genes associated with LINC00665 expression, including The Atlas of Noncoding RNAs in Cancer (TANRIC), the Multi Experiment Matrix (MEM), Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) and protein-protein interaction (PPI) networks. RESULTS Overexpression of LINC00665 in patients with HCC was significantly associated with gender, tumor grade, stage, and tumor cell type. Overexpression of LINC00665 in patients with HCC was significantly associated with overall survival (OS) (HR=1.477; 95% CI: 1.046-2.086). Bioinformatics analysis identified 469 related genes and further analysis supported a hypothesis that LINC00665 regulates pathways in the cell cycle to facilitate the development and progression of HCC through ten identified core genes: CDK1, BUB1B, BUB1, PLK1, CCNB2, CCNB1, CDC20, ESPL1, MAD2L1, and CCNA2. CONCLUSIONS Overexpression of the lncRNA LINC00665 may be involved in the regulation of cell cycle pathways in HCC through ten identified hub genes.
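A quick consistency check on the survival figures, reading them as a hazard ratio of 1.477 with a 95% confidence interval of 1.046 to 2.086: if the interval is a Wald-type interval that is symmetric on the log scale (an assumption, since the abstract does not state how it was computed), the point estimate should equal the geometric mean of the two bounds. The helper names below are illustrative only.

```python
import math


def point_estimate_from_ci(lower, upper):
    """Geometric mean of the bounds: the centre of a log-symmetric interval."""
    return math.sqrt(lower * upper)


def log_standard_error(lower, upper, z=1.959964):
    """Standard error of log(HR) implied by a 95% log-symmetric interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


hr_implied = point_estimate_from_ci(1.046, 2.086)
se_log_hr = log_standard_error(1.046, 2.086)
z_score = math.log(hr_implied) / se_log_hr
```

The implied estimate rounds to 1.477, which agrees with the reported hazard ratio, and the corresponding z-score exceeds 1.96, in line with the interval excluding 1.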
Combining Shigella Tn-seq data with gold-standard E. coli gene deletion data suggests rare transitions between essential and non-essential gene functionality.

PubMed

Freed, Nikki E; Bumann, Dirk; Silander, Olin K

2016-09-06

Gene essentiality - whether or not a gene is necessary for cell growth - is a fundamental component of gene function. It is not well established how quickly gene essentiality can change, as few studies have compared empirical measures of essentiality between closely related organisms. Here we present the results of a Tn-seq experiment designed to detect essential protein coding genes in the bacterial pathogen Shigella flexneri 2a 2457T on a genome-wide scale. Superficial analysis of this data suggested that 481 protein-coding genes in this Shigella strain are critical for robust cellular growth on rich media. Comparison of this set of genes with a gold-standard data set of essential genes in the closely related Escherichia coli K12 BW25113 revealed that an excessive number of genes appeared essential in Shigella but non-essential in E. coli. Importantly, and in converse to this comparison, we found no genes that were essential in E. coli and non-essential in Shigella, implying that many genes were artefactually inferred as essential in Shigella. Controlling for such artefacts resulted in a much smaller set of discrepant genes. Among these, we identified three sets of functionally related genes, two of which have previously been implicated as critical for Shigella growth, but which are dispensable for E. coli growth. The data presented here highlight the small number of protein coding genes for which we have strong evidence that their essentiality status differs between the closely related bacterial taxa E. coli and Shigella. A set of genes involved in acetate utilization provides a canonical example. These results leave open the possibility of developing strain-specific antibiotic treatments targeting such differentially essential genes, but suggest that such opportunities may be rare in closely related bacteria.
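The comparison described above amounts to set operations over genes that have orthologs in both organisms. The sketch below shows that logic in Python; the function `compare_essentiality` and every gene label in the toy input are hypothetical placeholders, not data from the study.

```python
def compare_essentiality(essential_a, essential_b, shared_genes):
    """Classify shared genes by essentiality status in organisms A and B."""
    shared = set(shared_genes)
    a = set(essential_a) & shared
    b = set(essential_b) & shared
    return {
        "essential_in_both": a & b,
        "essential_only_in_a": a - b,
        "essential_only_in_b": b - a,
        "non_essential_in_both": shared - (a | b),
    }


# Hypothetical ortholog labels, for illustration only.
shared = ["g1", "g2", "g3", "g4", "g5"]
tnseq_calls = ["g1", "g2", "g3"]    # e.g. inferred essential from Tn-seq
deletion_calls = ["g1", "g2"]       # e.g. essential in a deletion collection

result = compare_essentiality(tnseq_calls, deletion_calls, shared)
```

An asymmetric result such as the one here (genes essential only in A, none essential only in B) is the pattern the abstract interprets as a sign of artefactual essentiality calls that need to be controlled for.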
MitoNuc: a database of nuclear genes coding for mitochondrial proteins. Update 2002.

PubMed

Attimonelli, Marcella; Catalano, Domenico; Gissi, Carmela; Grillo, Giorgio; Licciulli, Flavio; Liuni, Sabino; Santamaria, Monica; Pesole, Graziano; Saccone, Cecilia

2002-01-01

Mitochondria, besides their central role in energy metabolism, have recently been found to be involved in a number of basic processes of cell life and to contribute to the pathogenesis of many degenerative diseases. All functions of mitochondria depend on the interaction of nuclear and organelle genomes. Mitochondrial genomes have been extensively sequenced and analysed and data have been collected in several specialised databases. In order to collect information on nuclear coded mitochondrial proteins we developed MitoNuc, a database containing detailed information on sequenced nuclear genes coding for mitochondrial proteins in Metazoa. The MitoNuc database can be retrieved through SRS and is available via the web site http://bighost.area.ba.cnr.it/mitochondriome where other mitochondrial databases developed by our group, the complete list of the sequenced mitochondrial genomes, links to other mitochondrial sites and related information, are available. The MitoAln database, related to MitoNuc in the previous release, reporting the multiple alignments of the relevant homologous protein coding regions, is no longer supported in the present release. In order to keep the links among entries in MitoNuc from homologous proteins, a new field in the database has been defined: the cluster identifier, an alpha numeric code used to identify each cluster of homologous proteins. A comment field derived from the corresponding SWISS-PROT entry has been introduced; this reports clinical data related to dysfunction of the protein. The logic scheme of MitoNuc database has been implemented in the ORACLE DBMS. This will allow the end-users to retrieve data through a friendly interface that will be soon implemented.
Genome sequence of Plasmopara viticola and insight into the pathogenic mechanism

PubMed Central

Yin, Ling; An, Yunhe; Qu, Junjie; Li, Xinlong; Zhang, Yali; Dry, Ian; Wu, Huijuan; Lu, Jiang

2017-01-01

Plasmopara viticola causes downy mildew disease of grapevine, which is one of the most devastating diseases of viticulture worldwide. Here we report a 101.3 Mb whole genome sequence of P. viticola isolate ‘JL-7-2’ obtained by a combination of Illumina and PacBio sequencing technologies. The P. viticola genome contains 17,014 putative protein-coding genes and has ~26% repetitive sequences. A total of 1,301 putative secreted proteins, including 100 putative RXLR effectors and 90 CRN effectors, were identified in this genome. In the secretome, 261 potential pathogenicity genes and 95 carbohydrate-active enzymes were predicted. Transcriptional analysis revealed that most of the RXLR effectors, pathogenicity genes and carbohydrate-active enzymes were significantly up-regulated during infection. Comparative genomic analysis revealed that P. viticola evolved independently from the Arabidopsis downy mildew pathogen Hyaloperonospora arabidopsidis. The availability of the P. viticola genome provides a valuable resource not only for comparative genomic analysis and evolutionary studies among oomycetes, but also for enhancing our knowledge of the mechanism of interactions between this biotrophic pathogen and its host. PMID:28417959
The dhnA gene of Escherichia coli encodes a class I fructose bisphosphate aldolase.

PubMed Central

Thomson, G J; Howlett, G J; Ashcroft, A E; Berry, A

1998-01-01

The gene encoding the Escherichia coli Class I fructose-1,6-bisphosphate aldolase (FBP aldolase) has been cloned and the protein overproduced in high amounts. This gene sequence has previously been identified as encoding an E. coli dehydrin in the GenBank database [gene dhnA; entry code U73760; Close and Choi (1996) Submission to GenBank]. However, the purified protein overproduced from the dhnA gene shares all its properties with those known for the E. coli Class I FBP aldolase. The protein is an 8-10-mer with a native molecular mass of approx. 340 kDa, each subunit consisting of 349 amino acids. The Class I enzyme shows low sequence identity with other known FBP aldolases, both Class I and Class II (in the order of 20%), which may be reflected by some novel properties of this FBP aldolase. The active-site peptide has been isolated and the Schiff-base-forming lysine residue (Lys236) has been identified by a combination of site-directed mutagenesis, kinetics and electrospray-ionization MS. A second lysine residue (Lys238) has been implicated in substrate binding. The cloning of this gene and the high levels of overexpression obtained will facilitate future structure-function studies. PMID:9531482
[Novel bidirectional promoter from human genome].

PubMed

Orekhova, A S; Sverdlova, P S; Spirin, P V; Leonova, O G; Popenko, V I; Prasolov, V S; Rubtsov, P M

2011-01-01

In human and other mammalian genomes a number of closely linked gene pairs transcribed in opposite directions are found. According to bioinformatic analysis up to 10% of human genes are arranged in this way. In the present work a fragment of the human genome was cloned that separates genes localized at 2p13.1 and oriented "head-to-head", coding for hypothetical proteins with unknown functions: CCDC (Coiled Coil Domain Containing) 142 and TTC (TetraTricopeptide repeat Containing) 31. The intergenic CCDC142-TTC31 region overlaps with a CpG island and contains a number of potential binding sites for transcription factors. This fragment functions as a bidirectional promoter in a luciferase reporter gene expression system upon transfection of human embryonic kidney (HEK293) cells. Vectors containing the genes of two fluorescent proteins, green (EGFP) and red (DsRed2), in opposite orientations separated by the fragment of the CCDC142-TTC31 intergenic region were constructed. In HEK293 cells transfected with these vectors simultaneous expression of the two fluorescent proteins is observed. Truncated versions of the intergenic region were obtained and their promoter activity measured. The minimal promoter fragment contains the elements Inr, BRE and DPE characteristic of TATA-less promoters. Thus, a novel bidirectional promoter was cloned from the human genome that can be used for simultaneous constitutive expression of two genes in human cells.

Perspectives on the mechanism of transcriptional regulation by long non-coding RNAs.

PubMed

Roberts, Thomas C; Morris, Kevin V; Weinberg, Marc S

2014-01-01

Long non-coding RNAs (lncRNAs) are increasingly being recognized as epigenetic regulators of gene transcription. The diversity and complexity of lncRNA genes means that they exert their regulatory effects by a variety of mechanisms. Although there is still much to be learned about the mechanism of lncRNA function, general principles are starting to emerge. In particular, the application of high throughput (deep) sequencing methodologies has greatly advanced our understanding of lncRNA gene function. lncRNAs function as adaptors that link specific chromatin loci with chromatin-remodeling complexes and transcription factors. lncRNAs can act in cis or trans to guide epigenetic-modifier complexes to distinct genomic sites, or act as scaffolds which recruit multiple proteins simultaneously, thereby coordinating their activities. In this review we discuss the genomic organization of lncRNAs, the importance of RNA secondary structure to lncRNA functionality, the multitude of ways in which they interact with the genome, and what evolutionary conservation tells us about their function.
Gln3p and Nil1p regulation of invertase activity and SUC2 expression in Saccharomyces cerevisiae.

PubMed

Oliveira, Edna Maria Morais; Mansure, José João; Bon, Elba Pinto da Silva

2005-04-01

In Saccharomyces cerevisiae, sensing and signalling pathways regulate gene expression in response to quality of carbon and nitrogen sources. One such system, the target of rapamycin (Tor) proteins, senses nutrients and uses the GATA activators Gln3p and Nil1p to regulate translation in response to low-quality carbon and nitrogen. The signal transduction, triggered in response to nitrogen nutrition that is sensed by the Tor proteins, operates via a regulatory pathway involving the cytoplasmic factor Ure2p. When carbon and nitrogen are abundant, the phosphorylated Ure2p anchors the also phosphorylated Gln3p and Nil1p in the cytoplasm. Upon a shift from high- to low-quality nitrogen or treatment with rapamycin all three proteins are dephosphorylated, causing Gln3p and Nil1p to enter the nucleus and promote transcription. The genes that code for yeast periplasmic enzymes with nutritional roles would be obvious targets for regulation by the sensing and signalling pathways that respond to quality of carbon and nitrogen sources. Indeed, previous results from our laboratory had shown that the GATA factors Gln3p, Nil1p, Dal80p, Nil2p and also the protein Ure2 regulate the expression of asparaginase II, coded by ASP3. We also had observed that the activity levels of the also periplasmic invertase, coded by SUC2, were 6-fold lower in ure2 mutant cells in comparison to wild-type cells collected at stationary phase. These results suggested similarities between the signalling pathways regulating the expression of ASP3 and SUC2. In the present work we showed that invertase levels displayed by the single nil1 and gln3 and by the double gln3nil1 mutant cells, cultivated in a sucrose-ammonium medium and collected at the exponential phase, were 6-, 10- and 60-fold higher, respectively, in comparison to their wild-type counterparts. RT-PCR data of SUC2 expression in the double-mutant cells indicated a 10-fold increase in the mRNA(SUC2) levels.
Bioinsecticidal activity of Talisia esculenta reserve protein on growth and serine digestive enzymes during larval development of Anticarsia gemmatalis.

PubMed

Macedo, Maria Lígia R; Freire, Maria das Graças M; Kubo, Carlos Eduardo G; Parra, José Roberto P

2011-01-01

Plants synthesize a variety of molecules to defend themselves against attack by insects. Talisin is a reserve protein from Talisia esculenta seeds, the first to be characterized from the family Sapindaceae. In this study, the insecticidal activity of Talisin was tested by incorporating the reserve protein into an artificial diet fed to the velvetbean caterpillar Anticarsia gemmatalis, the major pest of soybean crops in Brazil. At 1.5% (w/w) of the dietary protein, Talisin affected larval growth, pupal weight, development and mortality, adult fertility and longevity, and produced malformations in pupae and adult insects. Talisin inhibited the trypsin-like activity of larval midgut homogenates. The trypsin activity in Talisin-fed larvae was sensitive to Talisin, indicating that no novel Talisin-resistant protease was induced in Talisin-fed larvae. Affinity chromatography showed that Talisin bound to midgut proteinases of the insect A. gemmatalis, but was resistant to enzymatic digestion by these larval proteinases. The transformation of genes coding for this reserve protein could be useful for developing insect resistant crops. Copyright © 2010 Elsevier Inc. All rights reserved.
Transport genes of Chromobacterium violaceum: an overview.

PubMed

Grangeiro, Thalles Barbosa; Jorge, Daniel Macedo de Melo; Bezerra, Walderly Melgaço; Vasconcelos, Ana Tereza Ribeiro; Simpson, Andrew John George

2004-03-31

The complete genome sequence of the free-living bacterium Chromobacterium violaceum has been determined by a consortium of laboratories in Brazil. Almost 500 open reading frames (ORFs) coding for transport-related membrane proteins were identified in C. violaceum, which represents 11% of all genes found. The main class of transporter proteins is the primary active transporters (212 ORFs), followed by electrochemical potential-driven transporters (154 ORFs) and channels/pores (62 ORFs). Other classes (61 ORFs) include group translocators, transport electron carriers, accessory factors, and incompletely characterized systems. Therefore, all major categories of transport-related membrane proteins currently recognized in the Transport Protein Database (http://tcdb.ucsd.edu/tcdb) are present in C. violaceum. The complex apparatus of transporters of C. violaceum is certainly an important factor that makes this bacterium a dominant microorganism in a variety of ecosystems in tropical and subtropical regions. From a biotechnological point of view, the most important finding is the transporters of heavy metals, which could lead to the exploitation of C. violaceum for bioremediation.
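The class counts quoted above can be tallied directly to see how they relate to the "almost 500" figure. The short Python sketch below uses only the numbers given in the abstract; the dictionary layout and variable names are illustrative choices.

```python
# ORF counts per transporter class, as quoted in the abstract.
transporter_orfs = {
    "primary active transporters": 212,
    "electrochemical potential-driven transporters": 154,
    "channels/pores": 62,
    "other classes": 61,
}

total_orfs = sum(transporter_orfs.values())

# Share of each class among all transport-related ORFs, in percent.
shares = {
    name: round(100.0 * count / total_orfs, 1)
    for name, count in transporter_orfs.items()
}

largest_class = max(transporter_orfs, key=transporter_orfs.get)
```

The four classes sum to 489 ORFs, matching "almost 500", and primary active transporters account for roughly 43% of them.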
N-terminal deletions in Rous sarcoma virus p60src: effects on tyrosine kinase and biological activities and on recombination in tissue culture with the cellular src gene.

PubMed Central

Cross, F R; Garber, E A; Hanafusa, H

1985-01-01

We have constructed deletions within the region of cloned Rous sarcoma virus DNA coding for the N-terminal 30 kilodaltons of p60src. Infectious virus was recovered after transfection. Deletions of amino acids 15 to 149, 15 to 169, or 149 to 169 attenuated but did not abolish transforming activity, as assayed by focus formation and anchorage-independent growth. These deletions also had only slight effects on the tyrosine kinase activity of the mutant src protein. Deletion of amino acids 169 to 264 or 15 to 264 completely abolished transforming activity, and src kinase activity was reduced at least 10-fold. However, these mutant viruses generated low levels of transforming virus by recombination with the cellular src gene. The results suggest that as well as previously identified functional domains for p60src myristylation and membrane binding (amino acids 1 to 14) and tyrosine kinase activity (amino acids 250 to 526), additional N-terminal sequences (particularly amino acids 82 to 169) can influence the transforming activity of the src protein. PMID:2426576
Tumor hypoxia induces nuclear paraspeckle formation through HIF-2α dependent transcriptional activation of NEAT1 leading to cancer cell survival

PubMed Central

Choudhry, H; Albukhari, A; Morotti, M; Haider, S; Moralli, D; Smythies, J; Schödel, J; Green, C M; Camps, C; Buffa, F; Ratcliffe, P; Ragoussis, J; Harris, A L; Mole, D R

2015-01-01

Activation of cellular transcriptional responses, mediated by hypoxia-inducible factor (HIF), is common in many types of cancer, and generally confers a poor prognosis. Known to induce many hundreds of protein-coding genes, HIF has also recently been shown to be a key regulator of the non-coding transcriptional response. Here, we show that NEAT1 long non-coding RNA (lncRNA) is a direct transcriptional target of HIF in many breast cancer cell lines and in solid tumors. Unlike previously described lncRNAs, NEAT1 is regulated principally by HIF-2 rather than by HIF-1. NEAT1 is a nuclear lncRNA that is an essential structural component of paraspeckles and the hypoxic induction of NEAT1 induces paraspeckle formation in a manner that is dependent upon both NEAT1 and on HIF-2. Paraspeckles are multifunction nuclear structures that sequester transcriptionally active proteins as well as RNA transcripts that have been subjected to adenosine-to-inosine (A-to-I) editing. We show that the nuclear retention of one such transcript, F11R (also known as junctional adhesion molecule 1, JAM1), in hypoxia is dependent upon the hypoxic increase in NEAT1, thereby conferring a novel mechanism of HIF-dependent gene regulation. Induction of NEAT1 in hypoxia also leads to accelerated cellular proliferation, improved clonogenic survival and reduced apoptosis, all of which are hallmarks of increased tumorigenesis. Furthermore, in patients with breast cancer, high tumor NEAT1 expression correlates with poor survival. Taken together, these results indicate a new role for HIF transcriptional pathways in the regulation of nuclear structure and that this contributes to the pro-tumorigenic hypoxia-phenotype in breast cancer. PMID:25417700
Genetics of PCOS: A systematic bioinformatics approach to unveil the proteins responsible for PCOS.

PubMed

Panda, Pritam Kumar; Rane, Riya; Ravichandran, Rahul; Singh, Shrinkhla; Panchal, Hetalkumar

2016-06-01

Polycystic ovary syndrome (PCOS) is a hormonal imbalance in women, which causes problems during the menstrual cycle and in pregnancy that sometimes result in fatality. Though the genetics of PCOS is not fully understood, early diagnosis and treatment can prevent long-term effects. In this study, we have examined the proteins involved in PCOS and their structural aspects using computational tools. The proteins involved are modeled using Modeller 9v14 and ab initio programs. All 43 proteins responsible for PCOS were subjected to phylogenetic analysis to identify the relatedness of the proteins. Further, microarray data from PCOS datasets downloaded from GEO were analyzed to find significant protein-coding genes responsible for PCOS, in addition to the previously reported protein-coding genes. Various statistical analyses were done using R programming to get an insight into the structural aspects of PCOS that can be used as drug targets to treat PCOS and other related reproductive diseases.
GWIPS‐viz as a tool for exploring ribosome profiling evidence supporting the synthesis of alternative proteoforms

PubMed Central

Michel, Audrey M.; Ahern, Anna M.; Donohue, Claire A.

2015-01-01

The boundaries of protein coding sequences are more difficult to define at the 5′ end than at the 3′ end due to potential multiple translation initiation sites (TISs). Even in the presence of phylogenetic data, the use of sequence information only may not be sufficient for the accurate identification of TISs. Traditional proteomics approaches may also fail because the N‐termini of newly synthesized proteins are often processed. Thus ribosome profiling (ribo‐seq), producing a snapshot of the ribosome distribution across the entire transcriptome, is an attractive experimental technique for the purpose of TIS location exploration. The GWIPS‐viz (Genome Wide Information on Protein Synthesis visualized) browser (http://gwips.ucc.ie) provides free access to the genomic alignments of ribo‐seq data and corresponding mRNA‐seq data along with relevant annotation tracks. In this brief, we illustrate how GWIPS‐viz can be used to explore the ribosome occupancy at the 5′ ends of protein coding genes to assess the activity of AUG and non‐AUG TISs responsible for the synthesis of proteoforms with alternative or heterogeneous N‐termini. The presence of ribo‐seq tracks for various organisms allows for cross‐species comparison of orthologous genes and the availability of datasets from multiple laboratories permits the assessment of the technical reproducibility of the ribosome densities. PMID:25736862
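To make the distinction between AUG and non-AUG initiation sites concrete, the sketch below scans an mRNA fragment for AUG codons and for near-cognate codons, taken here to mean codons that differ from AUG at exactly one position. This is only a sequence-level illustration with a made-up fragment and hypothetical function names; it does not use GWIPS-viz or any ribosome profiling data, which is what would actually indicate whether a candidate site is used.

```python
def is_near_cognate(codon):
    """True if the codon differs from AUG at exactly one position."""
    return len(codon) == 3 and sum(a != b for a, b in zip(codon, "AUG")) == 1


def candidate_starts(mrna):
    """List (position, codon, kind) for AUG and near-cognate codons.

    Positions are 0-based; DNA input is converted to RNA (T -> U).
    All three reading frames are scanned.
    """
    mrna = mrna.upper().replace("T", "U")
    hits = []
    for i in range(len(mrna) - 2):
        codon = mrna[i:i + 3]
        if codon == "AUG":
            hits.append((i, codon, "AUG"))
        elif is_near_cognate(codon):
            hits.append((i, codon, "near-cognate"))
    return hits


# Made-up 5' fragment, for illustration only.
hits = candidate_starts("GGCUGAAAUGCC")
```

For the toy fragment this reports a CUG at position 2 and an AUG at position 7. In practice such candidates are far too numerous to be meaningful on their own, which is why the abstract turns to ribosome occupancy as evidence.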
Permanent draft genome of Thermithiobacillus tepidarius DSM 3134 T, a moderately thermophilic, obligately chemolithoautotrophic member of the Acidithiobacillia

DOE PAGES

Boden, Rich; Hutt, Lee P.; Huntemann, Marcel; ...

2016-09-26

Thermithiobacillus tepidarius DSM 3134 T was originally isolated (1983) from the waters of a sulfidic spring entering the Roman Baths (Temple of Sulis-Minerva) at Bath, United Kingdom and is an obligate chemolithoautotroph growing at the expense of reduced sulfur species. This strain has a genome size of 2,958,498 bp. Here we report the genome sequence, annotation and characteristics. The genome comprises 2,902 protein coding and 66 RNA coding genes. Genes responsible for the transaldolase variant of the Calvin-Benson-Bassham cycle were identified along with a biosynthetic horseshoe in lieu of Krebs' cycle sensu stricto. Terminal oxidases were identified, viz. cytochrome c oxidase (cbb3, EC 1.9.3.1) and ubiquinol oxidase (bd, EC 1.10.3.10). Metalloresistance genes involved in pathways of arsenic and cadmium resistance were found. Evidence of horizontal gene transfer accounting for 5.9% of the protein-coding genes was found, including transfer from Thiobacillus spp. and Methylococcus capsulatus Bath, isolated from the same spring. A sox gene cluster was found, similar in structure to those from other Acidithiobacillia - by comparison with Thiobacillus thioparus and Paracoccus denitrificans, an additional gene between soxA and soxB was found, annotated as a DUF302-family protein of unknown function. As the Kelly-Friedrich pathway of thiosulfate oxidation (encoded by sox) is not used in Thermithiobacillus spp., the role of the operon (if any) in this species remains unknown. We speculate that DUF302 and sox genes may have a role in periplasmic trithionate oxidation.
Genomic analysis of organismal complexity in the multicellular green alga Volvox carteri

DOE Office of Scientific and Technical Information (OSTI.GOV)

Prochnik, Simon E.; Umen, James; Nedelcu, Aurora

2010-07-01

Analysis of the Volvox carteri genome reveals that this green alga's increased organismal complexity and multicellularity are associated with modifications in protein families shared with its unicellular ancestor, and not with large-scale innovations in protein coding capacity. The multicellular green alga Volvox carteri and its morphologically diverse close relatives (the volvocine algae) are uniquely suited for investigating the evolution of multicellularity and development. We sequenced the 138 Mb genome of V. carteri and compared its ~14,500 predicted proteins to those of its unicellular relative, Chlamydomonas reinhardtii. Despite fundamental differences in organismal complexity and life history, the two species have similar protein-coding potentials, and few species-specific protein-coding gene predictions. Interestingly, volvocine algal-specific proteins are enriched in Volvox, including those associated with an expanded and highly compartmentalized extracellular matrix. Our analysis shows that increases in organismal complexity can be associated with modifications of lineage-specific proteins rather than large-scale invention of protein-coding capacity.
The complete mitochondrial genome of Octopus conispadiceus (Sasaki, 1917) (Cephalopoda: Octopodidae).

PubMed

Ma, Yuanyuan; Zheng, Xiaodong; Cheng, Rubin; Li, Qi

2016-01-01

In this paper, we determined the complete mitochondrial genome of Octopus conispadiceus (Cephalopoda: Octopodidae). The whole mitogenome of O. conispadiceus is 16,027 base pairs (bp) in length with a base composition of 41.4% A, 34.8% T, 16.1% C, 7.7% G and contains 13 protein-coding genes, 2 ribosomal RNA genes, 22 transfer RNA genes, and a major non-coding region (MNR). The gene arrangement of O. conispadiceus shows remarkable similarity to those of O. vulgaris, Amphioctopus fangsiao, Cistopus chinensis and C. taiwanicus.
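The base composition reported above can be condensed into an AT content and strand skews. The following is a minimal sketch, assuming the commonly used definitions AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C); the abstract itself reports only the four percentages, so the derived values and the function name `composition_summary` are illustrative rather than results from the paper.

```python
def composition_summary(a, t, c, g):
    """Return (AT content, AT skew, GC skew) from base percentages.

    The skew formulas are the conventional ones; they are an assumption
    here, since the abstract does not state how (or whether) skew was computed.
    """
    at_content = a + t
    at_skew = (a - t) / (a + t)
    gc_skew = (g - c) / (g + c)
    return at_content, at_skew, gc_skew


# Percentages quoted in the abstract for O. conispadiceus.
at_content, at_skew, gc_skew = composition_summary(a=41.4, t=34.8, c=16.1, g=7.7)
print(f"AT content: {at_content:.1f}%")
print(f"AT skew: {at_skew:.3f}, GC skew: {gc_skew:.3f}")
```

A positive AT skew together with a negative GC skew would indicate that the reported strand carries more A than T and more C than G.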
The complete mitochondrial genome of Conus tulipa (Neogastropoda: Conidae).

PubMed

Chen, Po-Wei; Hsiao, Sheng-Tai; Huang, Chih-Wei; Chen, Kao-Sung; Tseng, Chen-Te; Wu, Wen-Lung; Hwang, Deng-Fwu

2016-07-01

The complete mitogenome of the cone snail Conus tulipa (Linnaeus, 1758) has been sequenced by a next-generation sequencing method. The assembled mitogenome is 16,599 bp in length, including 13 protein-coding genes, 22 transfer RNA genes and 2 ribosomal RNA genes. The overall base composition of C. tulipa is 28.7% A, 15.2% C, 18.4% G and 37.7% T. It shows 81.1% identity to the cone snail C. consors, 78.5% to C. borgesi and 77.5% to C. textile. Using the 13 protein-coding genes and 2 ribosomal RNA genes of C. tulipa in this study, together with those of 18 other closely related species, we constructed the species phylogenetic tree to verify the accuracy and utility of the newly determined mitogenome sequence. The complete mitogenome of C. tulipa provides essential DNA molecular data for further phylogeographic and evolutionary analysis of cone snail phylogeny.
The complete mitochondrial genome of Chrysopa pallens (Insecta, Neuroptera, Chrysopidae).

PubMed

He, Kun; Chen, Zhe; Yu, Dan-Na; Zhang, Jia-Yong

2012-10-01

The complete mitochondrial genome of Chrysopa pallens (Neuroptera, Chrysopidae) was sequenced. It consists of 13 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA (rRNA) genes, and a control region (AT-rich region). The total length of the C. pallens mitogenome is 16,723 bp with 79.5% AT content, and the length of the control region is 1905 bp with 89.1% AT content. The non-coding regions of C. pallens include the control region between the 12S rRNA and trnI genes, and a 75-bp space region between the trnI and trnQ genes.
Structure and expression of canary myc family genes.

PubMed Central

Collum, R G; Clayton, D F; Alt, F W

1991-01-01

We found that the canary N-myc gene is highly related to mammalian N-myc genes in both the protein-coding region and the long 3' untranslated region. Examined coding regions of the canary c-myc gene were also highly related to their mammalian counterparts, but in contrast to N-myc, the canary and mammalian c-myc genes were quite divergent in their 3' untranslated regions. We readily detected N-myc and c-myc expression in the adult canary brain and found N-myc expression both at sites of proliferating neuronal precursors and in mature neurons. PMID:1996121
Insights into plant biomass conversion from the genome of the anaerobic thermophilic bacterium Caldicellulosiruptor bescii DSM 6725

PubMed Central

Dam, Phuongan; Kataeva, Irina; Yang, Sung-Jae; Zhou, Fengfeng; Yin, Yanbin; Chou, Wenchi; Poole, Farris L.; Westpheling, Janet; Hettich, Robert; Giannone, Richard; Lewis, Derrick L.; Kelly, Robert; Gilbert, Harry J.; Henrissat, Bernard; Xu, Ying; Adams, Michael W. W.

2011-01-01

Caldicellulosiruptor bescii DSM 6725 utilizes various polysaccharides and grows efficiently on untreated high-lignin grasses and hardwood at an optimum temperature of ∼80°C. It is a promising anaerobic bacterium for studying high-temperature biomass conversion. Its genome contains 2666 protein-coding sequences organized into 1209 operons. Expression of 2196 genes (83%) was confirmed experimentally. At least 322 genes appear to have been obtained by lateral gene transfer (LGT). Putative functions were assigned to 364 conserved/hypothetical protein (C/HP) genes. The genome contains 171 and 88 genes related to carbohydrate transport and utilization, respectively. Growth on cellulose led to the up-regulation of 32 carbohydrate-active (CAZy), 61 sugar transport, 25 transcription factor and 234 C/HP genes. Some C/HPs were overproduced on cellulose or xylan, suggesting their involvement in polysaccharide conversion. A unique feature of the genome is enrichment with genes encoding multi-modular, multi-functional CAZy proteins organized into one large cluster, the products of which are proposed to act synergistically on different components of plant cell walls and to aid the ability of C. bescii to convert plant biomass. The high duplication of CAZy domains coupled with the ability to acquire foreign genes by LGT may have allowed the bacterium to rapidly adapt to changing plant biomass-rich environments. PMID:21227922
Functional analysis of the ComK protein of Bacillus coagulans.

PubMed

Kovács, Ákos T; Eckhardt, Tom H; van Hartskamp, Mariska; van Kranenburg, Richard; Kuipers, Oscar P

2013-01-01

The genes for DNA uptake and recombination in Bacilli are commonly regulated by the transcriptional factor ComK. We have identified a ComK homologue in Bacillus coagulans, an industrially relevant organism that is recalcitrant to transformation. Introduction of the B. coagulans comK gene under its own promoter region into a Bacillus subtilis comK strain results in low transcriptional induction of the late competence gene comGA, but without bistable expression. The promoter regions of the B. coagulans comK and comGA genes are recognized in B. subtilis, and expression from these promoters is activated by B. subtilis ComK. Purified ComK protein of B. coagulans showed DNA-binding ability in gel retardation assays with B. subtilis- and B. coagulans-derived probes. These experiments suggest that the function of B. coagulans ComK is similar to that of ComK of B. subtilis. When its own comK is overexpressed in B. coagulans, comGA gene expression increases 40-fold, while the expression of another late competence gene, comC, is not elevated and no reproducible DNA uptake could be observed under these conditions. Our results demonstrate that B. coagulans ComK can recognize several B. subtilis comK-responsive elements, and vice versa, but indicate that the activation of the transcription of complete sets of genes coding for a putative DNA uptake apparatus in B. coagulans might differ from that of B. subtilis.
Expression-Linked Patterns of Codon Usage, Amino Acid Frequency, and Protein Length in the Basally Branching Arthropod Parasteatoda tepidariorum

PubMed Central

Whittle, Carrie A.; Extavour, Cassandra G.

2016-01-01

Spiders belong to the Chelicerata, the most basally branching arthropod subphylum. The common house spider, Parasteatoda tepidariorum, is an emerging model and provides a valuable system to address key questions in molecular evolution in an arthropod system that is distinct from traditionally studied insects. Here, we provide evidence suggesting that codon usage, amino acid frequency, and protein lengths are each influenced by expression-mediated selection in P. tepidariorum. First, highly expressed genes exhibited preferential usage of T3 codons in this spider, suggestive of selection. Second, genes with elevated transcription favored amino acids with low or intermediate size/complexity (S/C) scores (glycine and alanine) and disfavored those with large S/C scores (such as cysteine), consistent with the minimization of biosynthesis costs of abundant proteins. Third, we observed a negative correlation between expression level and coding sequence length. Together, we conclude that protein-coding genes exhibit signals of expression-related selection in this emerging, noninsect, arthropod model. PMID:27017527
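The preference for T3 codons (codons ending in T) described above rests on tallying the nucleotide at the third position of each codon. The sketch below shows one way such a tally might be computed; the function name `third_position_counts` and the toy coding sequence are invented for illustration and are not taken from the study, which does not describe its implementation in the abstract.

```python
from collections import Counter


def third_position_counts(cds):
    """Count the nucleotide found at the third position of each complete codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    return Counter(cds[i + 2] for i in range(0, usable, 3))


# Hypothetical six-codon sequence: ATG GCT GGT GCA TTT TAA
toy_cds = "ATGGCTGGTGCATTTTAA"
counts = third_position_counts(toy_cds)
t3_fraction = counts["T"] / sum(counts.values())
print(dict(counts), t3_fraction)
```

Comparing such fractions between highly and lowly expressed gene sets is the kind of contrast the abstract refers to; in practice the stop codon would usually be excluded and counts would be taken per amino acid family.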
Biosynthesis of riboflavin: an unusual riboflavin synthase of Methanobacterium thermoautotrophicum.

PubMed Central

Eberhardt, S; Korn, S; Lottspeich, F; Bacher, A

1997-01-01

Riboflavin synthase was purified by a factor of about 1,500 from cell extract of Methanobacterium thermoautotrophicum. The enzyme had a specific activity of about 2,700 nmol mg(-1) h(-1) at 65 degrees C, which is relatively low compared to those of riboflavin synthases of eubacteria and yeast. Amino acid sequences obtained after proteolytic cleavage had no similarity with known riboflavin synthases. The gene coding for riboflavin synthase (designated ribC) was subsequently cloned by marker rescue with a ribC mutant of Escherichia coli. The ribC gene of M. thermoautotrophicum specifies a protein of 153 amino acid residues. The predicted amino acid sequence agrees with the information gleaned from Edman degradation of the isolated protein and shows 67% identity with the sequence predicted for the unannotated reading frame MJ1184 of Methanococcus jannaschii. The ribC gene is adjacent to a cluster of four genes with similarity to the genes cbiMNQO of Salmonella typhimurium, which form part of the cob operon (this operon contains most of the genes involved in the biosynthesis of vitamin B12). The amino acid sequence predicted by the ribC gene of M. thermoautotrophicum shows no similarity whatsoever to the sequences of riboflavin synthases of eubacteria and yeast. Most notably, the M. thermoautotrophicum protein does not show the internal sequence homology characteristic of eubacterial and yeast riboflavin synthases. The protein of M. thermoautotrophicum can be expressed efficiently in a recombinant E. coli strain. The specific activity of the purified, recombinant protein is 1,900 nmol mg(-1) h(-1) at 65 degrees C. In contrast to riboflavin synthases from eubacteria and fungi, the methanobacterial enzyme has an absolute requirement for magnesium ions. The 5' phosphate of 6,7-dimethyl-8-ribityllumazine does not act as a substrate. The findings suggest that riboflavin synthase has evolved independently in eubacteria and methanobacteria. PMID:9139911
Ribosome profiling reveals changes in translational status of soybean transcripts during immature cotyledon development

PubMed Central

Shamimuzzaman, Md.

2018-01-01

To understand translational capacity on a genome-wide scale across three developmental stages of immature soybean seed cotyledons, ribosome profiling was performed in combination with RNA sequencing and cluster analysis. Transcripts representing 216 unique genes demonstrated a higher level of translational activity in at least one stage by exhibiting higher translational efficiencies (TEs) in which there were relatively more ribosome footprint sequence reads mapping to the transcript than were present in the control total RNA sample. The majority of these transcripts were more translationally active at the early stage of seed development and included 12 unique serine or cysteine proteases and 16 2S albumin and low molecular weight cysteine-rich proteins that may serve as substrates for turnover and mobilization early in seed development. It would appear that the serine proteases and 2S albumins play a vital role in the early stages. In contrast, our investigation of profiles of 19 genes encoding high abundance seed storage proteins, such as glycinins, beta-conglycinins, lectin, and Kunitz trypsin inhibitors, showed that they all had similar patterns in which the TE values started at low levels and increased approximately 2- to 6-fold during development. The highest levels of these seed protein transcripts were found at the mid-developmental stage, whereas the highest ribosome footprint levels of only up to 1.6 TE were found at the late developmental stage. These experimental findings suggest that the major seed storage protein coding genes are primarily regulated at the transcriptional level during normal soybean cotyledon development. Finally, our analyses also identified a total of 370 unique gene models that showed very low TE values including over 48 genes encoding ribosomal family proteins and 95 gene models that are related to energy and photosynthetic functions, many of which have homology to the chloroplast genome. Additionally, we showed that genes of the chloroplast were relatively translationally inactive during seed development. PMID:29570733
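Translational efficiency, as characterized above, compares ribosome footprint reads on a transcript with the reads for the same transcript in the total RNA sample. A minimal sketch follows, assuming the widely used definition of TE as the ratio of length- and depth-normalized (RPKM) footprint abundance to RPKM mRNA abundance; the abstract does not give the exact normalization, and all read counts below are made-up illustrative numbers.

```python
def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (length_bp * total_mapped_reads)


def translational_efficiency(ribo_reads, rna_reads, length_bp, ribo_total, rna_total):
    """TE = normalized footprint abundance / normalized mRNA abundance (assumed definition)."""
    rna = rpkm(rna_reads, length_bp, rna_total)
    if rna == 0:
        raise ValueError("TE is undefined for a transcript with no RNA-seq reads")
    return rpkm(ribo_reads, length_bp, ribo_total) / rna


# Hypothetical transcript: 1,500 bp, 300 footprint reads out of 10 M, 500 RNA reads out of 20 M.
te = translational_efficiency(
    ribo_reads=300, rna_reads=500, length_bp=1500,
    ribo_total=10_000_000, rna_total=20_000_000,
)
print(f"TE = {te:.2f}")
```

Under this definition a TE above 1 means the transcript attracts proportionally more ribosome footprints than its mRNA abundance alone would predict.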

Enzymatic improvement of mitochondrial thiol oxidase Erv1 for oxidized glutathione fermentation by Saccharomyces cerevisiae.

PubMed

Kobayashi, Jyumpei; Sasaki, Daisuke; Hara, Kiyotaka Y; Hasunuma, Tomohisa; Kondo, Akihiko

2017-03-15

Oxidized glutathione (GSSG) is the preferred form for industrial mass production of glutathione due to its high stability compared with reduced glutathione (GSH). In our previous study, over-expression of the mitochondrial thiol oxidase ERV1 gene was the most effective for high GSSG production in Saccharomyces cerevisiae cells among three different thiol oxidase genes. We improved Erv1 enzyme activity for oxidation of GSH and revealed that the S32 and N34 residues are critical for the oxidation. Five engineered Erv1 variant proteins containing S32 and/or N34 replacements exhibited 1.7- to 2.4-fold higher in vitro GSH oxidation activity than that of parental Erv1, whereas the oxidation activities of these variants for γ-glutamylcysteine were comparable. According to three-dimensional structures of Erv1 and protein stability assays, the S32 and N34 residues interact with nearby residues through hydrogen bonding and greatly contribute to protein stability. These results suggest that increased flexibility caused by amino acid replacements around the active center decreases inhibitory effects on GSH oxidation. Over-expression of mutant genes coding for these Erv1 variants also increased GSSG and consequently total glutathione production in S. cerevisiae cells. Over-expression of the ERV1 S32A gene was the most effective for GSSG production in S. cerevisiae cells among the parent and other mutant genes, and it increased GSSG production about 1.5-fold compared to that of the parental ERV1 gene. This is the first study demonstrating the pivotal effects of the S32 and N34 residues on the high GSH oxidation activity of Erv1. Furthermore, the in vivo validity of Erv1 variants containing these S32 and N34 replacements was also demonstrated. This study indicates the potential of Erv1 for high GSSG production.
[Identification of proteins interacting with the circadian clock protein PER1 in tumors using bacterial two-hybrid system technique].

PubMed

Zhang, Yu; Yao, Youlin; Jiang, Siyuan; Lu, Yilu; Liu, Yunqiang; Tao, Dachang; Zhang, Sizhong; Ma, Yongxin

2015-04-01

To identify protein-protein interaction partners of PER1 (period circadian protein homolog 1), a key component of the molecular oscillation system of the circadian rhythm, in tumors using the bacterial two-hybrid system technique. A human cervical carcinoma HeLa cell library was adopted. Recombinant bait plasmid pBT-PER1 and the pTRG cDNA plasmid library were cotransformed into the two-hybrid system reporter strain cultured in a special selective medium. Target clones were screened. After isolating the positive clones, the target clones were sequenced and analyzed. Fourteen protein-coding genes were identified, 4 of which were found to contain whole coding regions of genes, including optic atrophy 3 protein (OPA3), associated with mitochondrial dynamics, and Homo sapiens cutA divalent cation tolerance homolog of E. coli (CUTA), associated with copper metabolism. Others included proteins related to cellular events, proteins involved in biochemical reactions, and signal transduction-related proteins. Identification of potential interacting proteins of PER1 in tumors may provide new insights into the functions of the circadian clock protein PER1 during tumorigenesis.
Characterization of a novel MIIA domain-containing protein (MdcE) in Bradyrhizobium spp.

PubMed

Durán, David; Imperial, Juan; Palacios, José; Ruiz-Argüeso, Tomás; Göttfert, Michael; Zehner, Susanne; Rey, Luis

2018-03-01

Several genes coding for proteins with metal ion-inducible autocleavage (MIIA) domains were identified in type III secretion system tts gene clusters from draft genomes of recently isolated Bradyrhizobium spp. MIIA domains were first described in the effectors NopE1 and NopE2 of Bradyrhizobium diazoefficiens USDA 110. All identified genes are preceded by tts box promoter motifs. The identified proteins contain one or two MIIA domains. A phylogenetic analysis of 35 MIIA domain sequences from 16 Bradyrhizobium strains revealed four groups. The protein from Bradyrhizobium sp. LmjC strain contains a single MIIA domain and was designated MdcE (MdcELmjC). It was expressed as a fusion to maltose-binding protein (MalE) in Escherichia coli and subsequently purified by affinity chromatography. Recombinant MalE-MdcELmjC-Strep protein exhibited autocleavage in the presence of Ca2+, Cu2+, Cd2+ and Mn2+, but not in the presence of Mg2+, Ni2+ or Co2+. Site-directed mutagenesis at the predicted cleavage site abolished autocleavage activity of MdcELmjC. An LmjC mdcE- mutant was impaired in the ability to nodulate Lupinus angustifolius and Macroptilium atropurpureum.
Orexin Gene Therapy Restores the Timing and Maintenance of Wakefulness in Narcoleptic Mice

PubMed Central

Kantor, Sandor; Mochizuki, Takatoshi; Lops, Stefan N.; Ko, Brian; Clain, Elizabeth; Clark, Erika; Yamamoto, Mihoko; Scammell, Thomas E.

2013-01-01

Study Objectives: Narcolepsy is caused by selective loss of the orexin/hypocretin-producing neurons of the hypothalamus. For patients with narcolepsy, chronic sleepiness is often the most disabling symptom, but current therapies rarely normalize alertness and do not address the underlying orexin deficiency. We hypothesized that the sleepiness of narcolepsy would substantially improve if orexin signaling were restored in specific brain regions at appropriate times of day. Design: We used gene therapy to restore orexin signaling in a mouse model of narcolepsy. In these Atx mice, expression of a toxic protein (ataxin-3) selectively kills the orexin neurons. Interventions: To induce ectopic expression of the orexin neuropeptides, we microinjected an adeno-associated viral vector coding for prepro-orexin plus a red fluorescence protein (AAV-orexin) into the mediobasal hypothalamus of Atx and wild-type mice. Control mice received an AAV coding only for red fluorescence protein. Two weeks later, we recorded sleep/wake behavior, locomotor activity, and body temperature and examined the patterns of orexin expression. Measurements and Results: Atx mice rescued with AAV-orexin produced long bouts of wakefulness and had a normal diurnal pattern of arousal, with the longest bouts of wake and the highest amounts of locomotor activity in the first hours of the night. In addition, AAV-orexin improved the timing of rapid eye movement sleep and the consolidation of nonrapid eye movement sleep in Atx mice. Conclusions: These substantial improvements in sleepiness and other symptoms of narcolepsy demonstrate the effectiveness of orexin gene therapy in a mouse model of narcolepsy. Additional work is needed to optimize this approach, but in time, AAV-orexin could become a useful therapeutic option for patients with narcolepsy. Citation: Kantor S; Mochizuki T; Lops SN; Ko B; Clain E; Clark E; Yamamoto M; Scammell TE. Orexin gene therapy restores the timing and maintenance of wakefulness in narcoleptic mice. SLEEP 2013;36(8):1129–1138. PMID:23904672
Gene cloning and prokaryotic expression of recombinant outer membrane protein from Vibrio parahaemolyticus

NASA Astrophysics Data System (ADS)

Yuan, Ye; Wang, Xiuli; Guo, Sheping; Qiu, Xuemei

2011-06-01

Gram-negative Vibrio parahaemolyticus is a common pathogen in humans and marine animals. The outer membrane proteins of bacteria play an important role in infection and pathogenicity to the host. Thus, the outer membrane proteins are an ideal target for vaccines. We amplified a complete outer membrane protein gene (ompW) from V. parahaemolyticus ATCC 17802. We then cloned the gene and expressed it in Escherichia coli BL21 (DE3) cells. The gene coded for a protein of 42.78 kDa. We purified the protein using Ni-NTA affinity chromatography and detected it by anti-His antibody Western blotting. Our results provide a basis for future application of the OmpW protein as a vaccine candidate against infection by V. parahaemolyticus. In addition, the purified OmpW protein can be used for further functional and structural studies.
A Molecular Portrait of De Novo Genes in Yeasts.

PubMed

Vakirlis, Nikolaos; Hebert, Alex S; Opulente, Dana A; Achaz, Guillaume; Hittinger, Chris Todd; Fischer, Gilles; Coon, Joshua J; Lafontaine, Ingrid

2018-03-01

New genes, with novel protein functions, can evolve "from scratch" out of intergenic sequences. These de novo genes can integrate the cell's genetic network and drive important phenotypic innovations. Therefore, identifying de novo genes and understanding how the transition from noncoding to coding occurs are key problems in evolutionary biology. However, identifying de novo genes is a difficult task, hampered by the presence of remote homologs, fast evolving sequences and erroneously annotated protein coding genes. To overcome these limitations, we developed a procedure that handles the usual pitfalls in de novo gene identification and predicted the emergence of 703 de novo gene candidates in 15 yeast species from 2 genera whose phylogeny spans at least 100 million years of evolution. We validated 85 candidates by proteomic data, providing new translation evidence for 25 of them through mass spectrometry experiments. We also unambiguously identified the mutations that enabled the transition from noncoding to coding for 30 Saccharomyces de novo genes. We established that de novo gene origination is a widespread phenomenon in yeasts, only a few being ultimately maintained by selection. We also found that de novo genes preferentially emerge next to divergent promoters in GC-rich intergenic regions where the probability of finding a fortuitous and transcribed ORF is the highest. Finally, we found a more than 3-fold enrichment of de novo genes at recombination hot spots, which are GC-rich and nucleosome-free regions, suggesting that meiotic recombination contributes to de novo gene emergence in yeasts.
Evolution at protein ends: major contribution of alternative transcription initiation and termination to the transcriptome and proteome diversity in mammals

PubMed Central

Shabalina, Svetlana A.; Ogurtsov, Aleksey Y.; Spiridonov, Nikolay A.; Koonin, Eugene V.

2014-01-01

Alternative splicing (AS), alternative transcription initiation (ATI) and alternative transcription termination (ATT) create the extraordinary complexity of transcriptomes and make key contributions to the structural and functional diversity of mammalian proteomes. Analysis of mammalian genomic and transcriptomic data shows that contrary to the traditional view, the joint contribution of ATI and ATT to the transcriptome and proteome diversity is quantitatively greater than the contribution of AS. Although the mean numbers of protein-coding constitutive and alternative nucleotides in gene loci are nearly identical, their distribution along the transcripts is highly non-uniform. On average, coding exons in the variable 5′ and 3′ transcript ends that are created by ATI and ATT contain approximately four times more alternative nucleotides than core protein-coding regions that diversify exclusively via AS. Short upstream exons that encompass alternative 5′-untranslated regions and N-termini of proteins evolve under strong nucleotide-level selection whereas in 3′-terminal exons that encode protein C-termini, protein-level selection is significantly stronger. The groups of genes that are subject to ATI and ATT show major differences in biological roles, expression and selection patterns. PMID:24792168
Terminator Operon Reporter: combining a transcription termination switch with reporter technology for improved gene synthesis and synthetic biology applications.

PubMed

Zampini, Massimiliano; Mur, Luis A J; Rees Stevens, Pauline; Pachebat, Justin A; Newbold, C James; Hayes, Finbarr; Kingston-Smith, Alison

2016-05-25

Synthetic biology is characterized by the development of novel and powerful DNA fabrication methods and by the application of engineering principles to biology. The current study describes Terminator Operon Reporter (TOR), a new gene assembly technology based on the conditional activation of a reporter gene in response to sequence errors occurring at the assembly stage of the synthetic element. These errors are monitored by a transcription terminator that is placed between the synthetic gene and reporter gene. Switching of this terminator between active and inactive states dictates the transcription status of the downstream reporter gene to provide a rapid and facile readout of the accuracy of synthetic assembly. Designed specifically and uniquely for the synthesis of protein coding genes in bacteria, TOR allows the rapid and cost-effective fabrication of synthetic constructs by employing oligonucleotides at the most basic purification level (desalted) and without the need for costly and time-consuming post-synthesis correction methods. Thus, TOR streamlines gene assembly approaches, which are central to the future development of synthetic biology.
Mechanisms generating long range correlation in nucleotide composition of the Borrelia burgdorferi genome

NASA Astrophysics Data System (ADS)

Mackiewicz, P.; Gierlik, A.; Kowalczuk, M.; Szczepanik, D.; Dudek, M. R.; Cebrat, S.

1999-12-01

We have analysed protein coding and intergenic sequences in the Borrelia burgdorferi (the Lyme disease bacterium) genome using different kinds of DNA walks. Genes occupying the leading strand of DNA have significantly different nucleotide composition from genes occupying the lagging strand. Nucleotide compositional bias of the two DNA strands reflects the amino acid composition of proteins. 96% of genes coding for ribosomal proteins lie on the leading DNA strand, which suggests that the positions of these as well as other genes are non-random. In the B. burgdorferi genome, the asymmetry in intergenic DNA sequences is lower than the asymmetry in the third positions in codons. All these characters of the B. burgdorferi genome suggest that both replication-associated mutational pressure and recombination mechanisms have established the specific structure of the genome and now any recombination leading to inversion of a gene with respect to the direction of replication is forbidden. This property of the genome allows us to assume that it is in a steady state, which enables us to fix some parameters for simulations of DNA evolution.
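A DNA walk turns a sequence into a trajectory by taking a step for each nucleotide, so that compositional asymmetry shows up as a sustained drift. The abstract does not specify which walks were used, so the sketch below assumes one simple variant (a step of +1 for G, −1 for C, and no step for A or T); the function name `dna_walk` and the toy sequence are illustrative only.

```python
def dna_walk(seq, up="G", down="C"):
    """Cumulative walk: +1 for each `up` base, -1 for each `down` base, 0 otherwise."""
    position = 0
    path = []
    for base in seq.upper():
        if base == up:
            position += 1
        elif base == down:
            position -= 1
        path.append(position)
    return path


toy_sequence = "GGCAGCCTG"  # hypothetical fragment, not from the B. burgdorferi genome
walk = dna_walk(toy_sequence)
print(walk)
```

Applied along a whole chromosome, a walk of this kind typically changes direction where the leading and lagging strands swap, which is how strand asymmetry of the sort described above becomes visible.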
Proteogenomics produces comprehensive and highly accurate protein-coding gene annotation in a complete genome assembly of Malassezia sympodialis.

PubMed

Zhu, Yafeng; Engström, Pär G; Tellgren-Roth, Christian; Baudo, Charles D; Kennell, John C; Sun, Sheng; Billmyre, R Blake; Schröder, Markus S; Andersson, Anna; Holm, Tina; Sigurgeirsson, Benjamin; Wu, Guangxi; Sankaranarayanan, Sundar Ram; Siddharthan, Rahul; Sanyal, Kaustuv; Lundeberg, Joakim; Nystedt, Björn; Boekhout, Teun; Dawson, Thomas L; Heitman, Joseph; Scheynius, Annika; Lehtiö, Janne

2017-03-17

Complete and accurate genome assembly and annotation is a crucial foundation for comparative and functional genomics. Despite this, few complete eukaryotic genomes are available, and genome annotation remains a major challenge. Here, we present a complete genome assembly of the skin commensal yeast Malassezia sympodialis and demonstrate how proteogenomics can substantially improve gene annotation. Through long-read DNA sequencing, we obtained a gap-free genome assembly for M. sympodialis (ATCC 42132), comprising eight nuclear and one mitochondrial chromosome. We also sequenced and assembled four M. sympodialis clinical isolates, and showed their value for understanding Malassezia reproduction by confirming four alternative allele combinations at the two mating-type loci. Importantly, we demonstrated how proteomics data could be readily integrated with transcriptomics data in standard annotation tools. This increased the number of annotated protein-coding genes by 14% (from 3612 to 4113), compared to using transcriptomics evidence alone. Manual curation further increased the number of protein-coding genes by 9% (to 4493). All of these genes have RNA-seq evidence and 87% were confirmed by proteomics. The M. sympodialis genome assembly and annotation presented here is at a quality yet achieved only for a few eukaryotic organisms, and constitutes an important reference for future host-microbe interaction studies. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
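The percentages quoted in the abstract above can be checked directly from the gene counts it reports. The short sketch below does that arithmetic; the helper name `percent_increase` is an illustrative assumption.

```python
def percent_increase(before, after):
    """Relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100.0

# Gene counts as reported in the abstract.
transcriptomics_only = 3612
with_proteomics = 4113
after_curation = 4493

step1 = percent_increase(transcriptomics_only, with_proteomics)
step2 = percent_increase(with_proteomics, after_curation)
print(f"proteomics integration: {step1:.1f}%")
print(f"manual curation: {step2:.1f}%")
```

Both values round to the reported 14% and 9%, with the second step measured against the 4113 genes obtained after proteomics integration.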
Measles virus minigenomes encoding two autofluorescent proteins reveal cell-to-cell variation in reporter expression dependent on viral sequences between the transcription units.

PubMed

Rennick, Linda J; Duprex, W Paul; Rima, Bert K

2007-10-01

Transcription from morbillivirus genomes commences at a single promoter in the 3' non-coding terminus, with the six genes being transcribed sequentially. The 3' and 5' untranslated regions (UTRs) of the genes (mRNA sense), together with the intergenic trinucleotide spacer, comprise the non-coding sequences (NCS) of the virus and contain the conserved gene end and gene start signals, respectively. Bicistronic minigenomes containing transcription units (TUs) encoding autofluorescent reporter proteins separated by measles virus (MV) NCS were used to give a direct estimation of gene expression in single, living cells by assessing the relative amounts of each fluorescent protein in each cell. Initially, five minigenomes containing each of the MV NCS were generated. Assays were developed to determine the amount of each fluorescent protein in cells at both cell population and single-cell levels. This revealed significant variations in gene expression between cells expressing the same NCS-containing minigenome. The minigenome containing the M/F NCS produced significantly lower amounts of fluorescent protein from the second TU (TU2), compared with the other minigenomes. A minigenome with a truncated F 5' UTR had increased expression from TU2. This UTR is 524 nt longer than the other MV 5' UTRs. Insertions into the 5' UTR of the enhanced green fluorescent protein gene in the minigenome containing the N/P NCS showed that specific sequences, rather than just the additional length of F 5' UTR, govern this decreased expression from TU2.
Directed Shotgun Proteomics Guided by Saturated RNA-seq Identifies a Complete Expressed Prokaryotic Proteome

DOE Office of Scientific and Technical Information (OSTI.GOV)

Omasits, U.; Quebatte, Maxime; Stekhoven, Daniel J.

2013-11-01

Prokaryotes, due to their moderate complexity, are particularly amenable to the comprehensive identification of the protein repertoire expressed under different conditions. We applied a generic strategy to identify a complete expressed prokaryotic proteome, which is based on the analysis of RNA and proteins extracted from matched samples. Saturated transcriptome profiling by RNA-seq provided an endpoint estimate of the protein-coding genes expressed under two conditions which mimic the interaction of Bartonella henselae with its mammalian host. Directed shotgun proteomics experiments were carried out on four subcellular fractions. By specifically targeting proteins which are short, basic, low abundant, and membrane localized, we could eliminate their initial underrepresentation compared to the estimated endpoint. A total of 1250 proteins were identified with an estimated false discovery rate below 1%. This represents 85% of all distinct annotated proteins and ~90% of the expressed protein-coding genes. Genes that were detected at the transcript but not protein level, were found to be highly enriched in several genomic islands. Furthermore, genes that lacked an ortholog and a functional annotation were not detected at the protein level; these may represent examples of overprediction in genome annotations. A dramatic membrane proteome reorganization was observed, including differential regulation of autotransporters, adhesins, and hemin binding proteins. Particularly noteworthy was the complete membrane proteome coverage, which included expression of all members of the VirB/D4 type IV secretion system, a key virulence factor.
Directed shotgun proteomics guided by saturated RNA-seq identifies a complete expressed prokaryotic proteome

PubMed Central

Omasits, Ulrich; Quebatte, Maxime; Stekhoven, Daniel J.; Fortes, Claudia; Roschitzki, Bernd; Robinson, Mark D.; Dehio, Christoph; Ahrens, Christian H.

2013-01-01

Prokaryotes, due to their moderate complexity, are particularly amenable to the comprehensive identification of the protein repertoire expressed under different conditions. We applied a generic strategy to identify a complete expressed prokaryotic proteome, which is based on the analysis of RNA and proteins extracted from matched samples. Saturated transcriptome profiling by RNA-seq provided an endpoint estimate of the protein-coding genes expressed under two conditions which mimic the interaction of Bartonella henselae with its mammalian host. Directed shotgun proteomics experiments were carried out on four subcellular fractions. By specifically targeting proteins which are short, basic, low abundant, and membrane localized, we could eliminate their initial underrepresentation compared to the estimated endpoint. A total of 1250 proteins were identified with an estimated false discovery rate below 1%. This represents 85% of all distinct annotated proteins and ∼90% of the expressed protein-coding genes. Genes that were detected at the transcript but not protein level, were found to be highly enriched in several genomic islands. Furthermore, genes that lacked an ortholog and a functional annotation were not detected at the protein level; these may represent examples of overprediction in genome annotations. A dramatic membrane proteome reorganization was observed, including differential regulation of autotransporters, adhesins, and hemin binding proteins. Particularly noteworthy was the complete membrane proteome coverage, which included expression of all members of the VirB/D4 type IV secretion system, a key virulence factor. PMID:23878158
SGP-1: Prediction and Validation of Homologous Genes Based on Sequence Alignments

PubMed Central

Wiehe, Thomas; Gebauer-Jung, Steffi; Mitchell-Olds, Thomas; Guigó, Roderic

2001-01-01

Conventional methods of gene prediction rely on the recognition of DNA-sequence signals, the coding potential or the comparison of a genomic sequence with a cDNA, EST, or protein database. Reasons for limited accuracy in many circumstances are species-specific training and the incompleteness of reference databases. Lately, comparative genome analysis has attracted increasing attention. Several analysis tools that are based on human/mouse comparisons are already available. Here, we present a program for the prediction of protein-coding genes, termed SGP-1 (Syntenic Gene Prediction), which is based on the similarity of homologous genomic sequences. In contrast to most existing tools, the accuracy of SGP-1 depends little on species-specific properties such as codon usage or the nucleotide distribution. SGP-1 may therefore be applied to nonstandard model organisms in vertebrates as well as in plants, without the need for extensive parameter training. In addition to predicting genes in large-scale genomic sequences, the program may be useful to validate gene structure annotations from databases. To this end, SGP-1 output also contains comparisons between predicted and annotated gene structures in HTML format. The program can be accessed via a Web server at http://soft.ice.mpg.de/sgp-1. The source code, written in ANSI C, is available on request from the authors. PMID:11544202
Actinomyces spp. gene expression in root caries lesions

PubMed Central

Dame-Teixeira, Naile; Parolo, Clarissa Cavalcanti Fatturi; Maltz, Marisa; Tugnait, Aradhna; Devine, Deirdre; Do, Thuy

2016-01-01

Background: The studies of the distribution of Actinomyces spp. on carious and non-carious root surfaces have not been able to confirm the association of these bacteria with root caries, although they were extensively implicated as a prime suspect in root caries. Objective: The aim of this study was to observe the gene expression of Actinomyces spp. in the microbiota of root surfaces with and without caries. Design: The oral biofilms from exposed sound root surface (SRS; n=10) and active root caries (RC; n=30) samples were collected. The total bacterial RNA was extracted, and the mRNA was isolated. Samples with low RNA concentration were pooled, yielding a final sample size of SRS=10 and RC=9. Complementary DNA (cDNA) libraries were prepared and sequenced on an Illumina® HiSeq 2500 system. Sequence reads were mapped to eight Actinomyces genomes. Count data were normalized using DESeq2 to analyse differential gene expression applying the Benjamini-Hochberg correction (false discovery rate [FDR]<0.001). Results: Actinomyces spp. had similar numbers of reads (Mann-Whitney U-test; p>0.05), except for Actinomyces OT178 (p=0.001) and Actinomyces gerencseriae (p=0.004), which had higher read counts in the SRS. Genes that code for stress proteins (clp, dnaK, and groEL), enzymes of glycolysis pathways (including enolase and phosphoenolpyruvate carboxykinase), adhesion (Type-2 fimbrial and collagen-binding protein), and cell growth (EF-Tu) were highly – but not differentially (p>0.001) – expressed in both groups. Genes with the most significant upregulation in RC were those coding for hypothetical proteins and uracil DNA glycosylase (p=2.61E-17). The gene with the most significant upregulation in SRS was a peptide ABC transporter substrate-binding protein (log2FC=−6.00, FDR=2.37E-05). Conclusion: There were similar levels of Actinomyces gene expression in both sound and carious root biofilms. These bacteria can be commensal in root surface sites but may be cariogenic due to survival mechanisms that allow them to exist in acid environments and to metabolize sugars, saving energy. PMID:27640531
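The Benjamini-Hochberg correction named in the abstract above controls the false discovery rate across many simultaneous tests. The study applied it through DESeq2, an R package; the plain-Python sketch below is not that implementation, only an illustration of the standard adjustment. The function name and the example p-values are assumptions for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest), then a
    running minimum from the largest rank downward keeps the adjusted
    values monotone.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up p-values, not data from the study.
raw = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(raw)
alpha = 0.001  # threshold quoted in the abstract
significant = [q < alpha for q in fdr]
print(fdr, significant)
```

A gene is then called differentially expressed when its adjusted value falls below the chosen threshold, which is why highly expressed genes with adjusted values above 0.001 are reported as not differentially expressed.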
In silico screening of the chicken genome for overlaps between genomic regions: microRNA genes, coding and non-coding transcriptional units, QTL, and genetic variations.

PubMed

Zorc, Minja; Kunej, Tanja

2016-05-01

MicroRNAs (miRNAs) are a class of non-coding RNAs involved in posttranscriptional regulation of target genes. Regulation requires complementarity between target mRNA and the mature miRNA seed region, responsible for their recognition and binding. It has been estimated that each miRNA targets approximately 200 genes, and genetic variability of miRNA genes has been reported to affect phenotypic variability and disease susceptibility in humans, livestock species, and model organisms. Polymorphisms in miRNA genes could therefore represent biomarkers for phenotypic traits in livestock animals. In our previous study, we collected polymorphisms within miRNA genes in chicken. In the present study, we identified miRNA-related genomic overlaps to prioritize genomic regions of interest for further functional studies and biomarker discovery. Overlapping genomic regions in chicken were analyzed using the following bioinformatics tools and databases: miRNA SNiPer, Ensembl, miRBase, NCBI Blast, and QTLdb. Out of 740 known pre-miRNA genes, 263 (35.5%) contain polymorphisms; among them, 35 contain more than three polymorphisms. The most polymorphic miRNA genes in chicken are gga-miR-6662, containing 23 single nucleotide polymorphisms (SNPs) within the pre-miRNA region, including five consecutive SNPs, and gga-miR-6688, containing ten polymorphisms including three consecutive polymorphisms. Several miRNA-related genomic hotspots have been revealed in chicken genome; polymorphic miRNA genes are located within protein-coding and/or non-coding transcription units and quantitative trait loci (QTL) associated with production traits. The present study includes the first description of an exonic miRNA in a chicken genome, an overlap between the miRNA gene and the exon of the protein-coding gene (gga-miR-6578/HADHB), and the first report of a missense polymorphism located within a mature miRNA seed region. Identified miRNA-related genomic hotspots in chicken can serve researchers as a starting point for further functional studies and association studies with poultry production and health traits and the basis for systematic screening of exonic miRNAs and missense/miRNA seed polymorphisms in other genomes.
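Screening for overlaps between genomic regions, as described in the abstract above, reduces to an interval-intersection test on shared chromosomes. The sketch below shows that test under assumed conventions (0-based, half-open coordinates); the `Region` type, the region names, and every coordinate are hypothetical and do not come from the study or from the tools it lists.

```python
from typing import NamedTuple

class Region(NamedTuple):
    name: str
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive

def overlaps(a, b):
    """True when two regions share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end

def find_overlaps(queries, targets):
    """Return (query name, target name) pairs for every overlapping pair."""
    return [(q.name, t.name) for q in queries for t in targets if overlaps(q, t)]

# Hypothetical regions for illustration only.
mirnas = [Region("mirna_A", "chr1", 100, 180), Region("mirna_B", "chr2", 50, 120)]
exons = [Region("exon_1", "chr1", 150, 400), Region("exon_2", "chr2", 120, 300)]

print(find_overlaps(mirnas, exons))
```

With half-open coordinates, regions that merely touch (one ends where the other starts) are not counted as overlapping. The pairwise loop is quadratic, so a genome-scale screen would normally use sorted intervals or an interval tree instead.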
Expression of drought tolerance genes in tropical upland rice cultivars (Oryza sativa).

PubMed

Silveira, R D D; Abreu, F R M; Mamidi, S; McClean, P E; Vianello, R P; Lanna, A C; Carneiro, N P; Brondani, C

2015-07-27

Gene expression related to drought response in the leaf tissues of two Brazilian upland cultivars, the drought-tolerant Douradão and the drought-sensitive Primavera, was analyzed. RNA-seq identified 27,618 transcripts in the Douradão cultivar, with 24,090 (87.2%) homologous to the rice database, and 27,221 transcripts in the Primavera cultivar, with 23,663 (86.9%) homologous to the rice database. Gene-expression analysis between control and water-deficient treatments revealed 493 and 1154 differentially expressed genes in Douradão and Primavera cultivars, respectively. Genes exclusively expressed under drought were identified for Douradão, including two genes of particular interest coding for the protein peroxidase precursor, which is involved in three distinct metabolic pathways. Comparisons between the two drought-exposed cultivars revealed 2314 genes were differentially expressed (978 upregulated, 1336 downregulated in Douradão). Six genes distributed across 4 different transcription factor families (bHLH, MYB, NAC, and WRKY) were identified, all of which were upregulated in Douradão compared to Primavera during drought. Most of the genes identified in Douradão activate metabolic pathways responsible for production of secondary metabolites and genes coding for enzymatically active signaling receptors. Quantitative PCR validation showed that most gene expression was in agreement with computational prediction of these transcripts. The transcripts identified here will define molecular markers for identification of Cis-acting elements to search for allelic variants of these genes through analysis of polymorphic SNPs in GenBank accessions of upland rice, aiming to develop cultivars with the best combination of these alleles, resulting in materials with high yield potential in the event of drought during the reproductive phase.
The complete mitochondrial genome of Gryllotalpa unispina Saussure, 1874 (Orthoptera: Gryllotalpoidea: Gryllotalpidae).

PubMed

Zhang, Yulong; Shao, Dandan; Cai, Miao; Yin, Hong; Zhang, Daochuan

2016-01-01

The complete mitochondrial genome of Gryllotalpa unispina was 15,513 bp in length and contained 70.9% AT. All G. unispina protein-coding sequences except for nad2 started with a typical ATN codon. The usual termination codons (TAA) and incomplete stop codons (T) were found in the 13 protein-coding genes. All tRNA genes were folded into the typical cloverleaf secondary structure, except trnS(AGN), which lacked the dihydrouridine arm. The sizes of the large and small ribosomal RNA genes were 1245 and 725 bp, respectively. The A + T-rich region was 917 bp in length with 76.8% A + T content. The orientation and gene order of the G. unispina mitogenome were identical to those of G. orientalis and G. pluvialis; there was no "DK rearrangement", a phenomenon that has been widely reported in Caelifera.
Analysis of 6,515 exomes reveals the recent origin of most human protein-coding variants.

PubMed

Fu, Wenqing; O'Connor, Timothy D; Jun, Goo; Kang, Hyun Min; Abecasis, Goncalo; Leal, Suzanne M; Gabriel, Stacey; Rieder, Mark J; Altshuler, David; Shendure, Jay; Nickerson, Deborah A; Bamshad, Michael J; Akey, Joshua M

2013-01-10

Establishing the age of each mutation segregating in contemporary human populations is important to fully understand our evolutionary history and will help to facilitate the development of new approaches for disease-gene discovery. Large-scale surveys of human genetic variation have reported signatures of recent explosive population growth, notable for an excess of rare genetic variants, suggesting that many mutations arose recently. To more quantitatively assess the distribution of mutation ages, we resequenced 15,336 genes in 6,515 individuals of European American and African American ancestry and inferred the age of 1,146,401 autosomal single nucleotide variants (SNVs). We estimate that approximately 73% of all protein-coding SNVs and approximately 86% of SNVs predicted to be deleterious arose in the past 5,000-10,000 years. The average age of deleterious SNVs varied significantly across molecular pathways, and disease genes contained a significantly higher proportion of recently arisen deleterious SNVs than other genes. Furthermore, European Americans had an excess of deleterious variants in essential and Mendelian disease genes compared to African Americans, consistent with weaker purifying selection due to the Out-of-Africa dispersal. Our results better delimit the historical details of human protein-coding variation, show the profound effect of recent human history on the burden of deleterious SNVs segregating in contemporary populations, and provide important practical information that can be used to prioritize variants in disease-gene discovery.
The complete nucleotide sequence of the domestic dog (Canis familiaris) mitochondrial genome.

PubMed

Kim, K S; Lee, S E; Jeong, H W; Ha, J H

1998-10-01

The complete nucleotide sequence of the mitochondrial genome of the domestic dog, Canis familiaris, was determined. The length of the sequence was 16,728 bp; however, the length was not absolute due to the variation (heteroplasmy) caused by differing numbers of the repetitive motif, 5'-GTACACGT(A/G)C-3', in the control region. The genome organization, gene contents, and codon usage conformed to those of other mammalian mitochondrial genomes. Although its features were unknown, the "CTAGA" duplication event which followed the translational stop codon of the COII gene was not observed in other mammalian mitochondrial genomes. In order to determine the possible differences between mtDNAs in carnivores, two rRNA and 13 protein-coding genes from the cat, dog, and seal were compared. The combined molecular differences, in two rRNA genes as well as in the inferred amino acid sequences of the mitochondrial 13 protein-coding genes, suggested that there is a closer relationship between the dog and the seal than there is between either of these species and the cat. Based on the molecular differences of the mtDNA, the evolutionary divergence between the cat, the dog, and the seal was dated to approximately 50 +/- 4 million years ago. The degree of difference between carnivore mtDNAs varied according to the individual protein-coding gene applied, showing that the evolutionary relationships of distantly related species should be presented in an extended study based on ample sequence data like complete mtDNA molecules. Copyright 1998 Academic Press.
