Yerrapragada, Shaila; Shukla, Animesh; Hallsworth-Pepin, Kymberlie; Choi, Kwangmin; Wollam, Aye; Clifton, Sandra; Qin, Xiang; Muzny, Donna; Raghuraman, Sriram; Ashki, Haleh; Uzman, Akif; Highlander, Sarah K.; Fryszczyn, Bartlomiej G.; Fox, George E.; Tirumalai, Madhan R.; Liu, Yamei; Kim, Sun
2015-01-01
Tolypothrix sp. PCC 7601 is a freshwater filamentous cyanobacterium with complex responses to environmental conditions. Here, we present its 9.96-Mbp draft genome sequence, containing 10,065 putative protein-coding sequences, including 305 predicted two-component system proteins and 27 putative phytochrome-class photoreceptors, the most such proteins in any sequenced genome. PMID:25953173
Yerrapragada, Shaila; Shukla, Animesh; Hallsworth-Pepin, Kymberlie; Choi, Kwangmin; Wollam, Aye; Clifton, Sandra; Qin, Xiang; Muzny, Donna; Raghuraman, Sriram; Ashki, Haleh; Uzman, Akif; Highlander, Sarah K; Fryszczyn, Bartlomiej G; Fox, George E; Tirumalai, Madhan R; Liu, Yamei; Kim, Sun; Kehoe, David M; Weinstock, George M
2015-05-07
Tolypothrix sp. PCC 7601 is a freshwater filamentous cyanobacterium with complex responses to environmental conditions. Here, we present its 9.96-Mbp draft genome sequence, containing 10,065 putative protein-coding sequences, including 305 predicted two-component system proteins and 27 putative phytochrome-class photoreceptors, the most such proteins in any sequenced genome. Copyright © 2015 Yerrapragada et al.
Comparative analyses of putative toxin gene homologs from an Old World viper, Daboia russelii
Krishnan, Neeraja M.
2017-01-01
Availability of snake genome sequences has opened up exciting areas of research on comparative genomics and gene diversity. One of the challenges in studying snake genomes is the acquisition of biological material from live animals, especially from the venomous ones, making the process cumbersome and time-consuming. Here, we report comparative sequence analyses of putative toxin gene homologs from Russell’s viper (Daboia russelii) using whole-genome sequencing data obtained from shed skin. When compared with the major venom proteins in Russell’s viper studied previously, we found 45–100% sequence similarity between the venom proteins and their putative homologs in the skin. Additionally, comparative analyses of 20 putative toxin gene family homologs provided evidence of unique sequence motifs in nerve growth factor (NGF), platelet derived growth factor (PDGF), Kunitz/Bovine pancreatic trypsin inhibitor (Kunitz BPTI), cysteine-rich secretory proteins, antigen 5, and pathogenesis-related 1 proteins (CAP) and cysteine-rich secretory protein (CRISP). In those derived proteins, we identified V11 and T35 in the NGF domain; F23 and A29 in the PDGF domain; N69, K2 and A5 in the CAP domain; and Q17 in the CRISP domain to be responsible for differences in the largest pockets across the protein domain structures in crotalines, viperines and elapids from the in silico structure-based analysis. Similarly, residues F10, Y11 and E20 appear to play an important role in the protein structures across the Kunitz protein domain of viperids and elapids. Our study highlights the usefulness of shed skin in obtaining good quality high-molecular weight DNA for comparative genomic studies, and provides evidence towards the unique features and evolution of putative venom gene homologs in vipers. PMID:29230357
Niskanen, Einari A; Hytönen, Vesa P; Grapputo, Alessandro; Nordlund, Henri R; Kulomaa, Markku S; Laitinen, Olli H
2005-01-01
Background A chicken egg contains several biotin-binding proteins (BBPs), whose complete DNA and amino acid sequences are not known. In order to identify and characterise these genes and proteins we studied chicken cDNAs and genes available in the NCBI database and chicken genome database using the reported N-terminal amino acid sequences of chicken egg-yolk BBPs as search strings. Results Two separate hits showing significant homology for these N-terminal sequences were discovered. For one of these hits, the chromosomal location in the immediate proximity of the avidin gene family was found. Both of these hits encode proteins having high sequence similarity with avidin suggesting that chicken BBPs are paralogous to avidin family. In particular, almost all residues corresponding to biotin binding in avidin are conserved in these putative BBP proteins. One of the found DNA sequences, however, seems to encode a carboxy-terminal extension not present in avidin. Conclusion We describe here the predicted properties of the putative BBP genes and proteins. Our present observations link BBP genes together with avidin gene family and shed more light on the genetic arrangement and variability of this family. In addition, comparative modelling revealed the potential structural elements important for the functional and structural properties of the putative BBP proteins. PMID:15777476
Bain, Peter A; Papanicolaou, Alexie; Kumar, Anupama
2015-01-01
Murray-Darling rainbowfish (Melanotaenia fluviatilis [Castelnau, 1878]; Atheriniformes: Melanotaeniidae) is a small-bodied teleost currently under development in Australasia as a test species for aquatic toxicological studies. To date, efforts towards the development of molecular biomarkers of contaminant exposure have been hindered by the lack of available sequence data. To address this, we sequenced messenger RNA from brain, liver and gonads of mature male and female fish and generated a high-quality draft transcriptome using a de novo assembly approach. 149,742 clusters of putative transcripts were obtained, encompassing 43,841 non-redundant protein-coding regions. Deduced amino acid sequences were annotated by functional inference based on similarity with sequences from manually curated protein sequence databases. The draft assembly contained protein-coding regions homologous to 95.7% of the complete cohort of predicted proteins from the taxonomically related species, Oryzias latipes (Japanese medaka). The mean length of rainbowfish protein-coding sequences relative to their medaka homologues was 92.1%, indicating that despite the limited number of tissues sampled a large proportion of the total expected number of protein-coding genes was captured in the study. Because of our interest in the effects of environmental contaminants on endocrine pathways, we manually curated subsets of coding regions for putative nuclear receptors and steroidogenic enzymes in the rainbowfish transcriptome, revealing 61 candidate nuclear receptors encompassing all known subfamilies, and 41 putative steroidogenic enzymes representing all major steroidogenic enzymes occurring in teleosts. The transcriptome presented here will be a valuable resource for researchers interested in biomarker development, protein structure and function, and contaminant-response genomics in Murray-Darling rainbowfish.
Are plant formins integral membrane proteins?
Cvrcková, F
2000-01-01
The formin family of proteins has been implicated in signaling pathways of cellular morphogenesis in both animals and fungi; in the latter case, at least, they participate in communication between the actin cytoskeleton and the cell surface. Nevertheless, they appear to be cytoplasmic or nuclear proteins, and it is not clear whether they communicate with the plasma membrane, and if so, how. Because nothing is known about formin function in plants, I performed a systematic search for putative Arabidopsis thaliana formin homologs. I found eight putative formin-coding genes in the publicly available part of the Arabidopsis genome sequence and analyzed their predicted protein sequences. Surprisingly, some of them lack parts of the conserved formin-homology 2 (FH2) domain and the majority of them seem to have signal sequences and putative transmembrane segments that are not found in yeast or animal formins. Plant formins define a distinct subfamily. The presence in most Arabidopsis formins of sequence motifs typical of transmembrane proteins suggests a mechanism of membrane attachment that may be specific to plant formins, and indicates an unexpected evolutionary flexibility of the conserved formin domain.
Intrinsic and extrinsic approaches for detecting genes in a bacterial genome.
Borodovsky, M; Rudd, K E; Koonin, E V
1994-01-01
The unannotated regions of the Escherichia coli genome DNA sequence from the EcoSeq6 database, totaling 1,278 'intergenic' sequences of the combined length of 359,279 basepairs, were analyzed using computer-assisted methods with the aim of identifying putative unknown genes. The proposed strategy for finding new genes includes two key elements: i) prediction of expressed open reading frames (ORFs) using the GeneMark method based on Markov chain models for coding and non-coding regions of Escherichia coli DNA, and ii) search for protein sequence similarities using programs based on the BLAST algorithm and programs for motif identification. A total of 354 putative expressed ORFs were predicted by GeneMark. Using the BLASTX and TBLASTN programs, it was shown that 208 ORFs located in the unannotated regions of the E. coli chromosome are significantly similar to other protein sequences. Identification of 182 ORFs as probable genes was supported by GeneMark and BLAST, comprising 51.4% of the GeneMark 'hits' and 87.5% of the BLAST 'hits'. 73 putative new genes, comprising 20.6% of the GeneMark predictions, belong to ancient conserved protein families that include both eubacterial and eukaryotic members. This value is close to the overall proportion of highly conserved sequences among eubacterial proteins, indicating that the majority of the putative expressed ORFs that are predicted by GeneMark, but have no significant BLAST hits, nevertheless are likely to be real genes. The majority of the putative genes identified by BLAST search have been described since the release of the EcoSeq6 database, but about 70 genes have not been detected so far. Among these new identifications are genes encoding proteins with a variety of predicted functions including dehydrogenases, kinases, several other metabolic enzymes, ATPases, rRNA methyltransferases, membrane proteins, and different types of regulatory proteins. PMID:7984428
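The overlap percentages quoted in this abstract follow directly from the reported counts. The short check below is only an arithmetic sketch: the numbers are taken from the abstract, while the variable names and the `percent` helper are illustrative and not part of the original study.

```python
# Counts as reported in the abstract; the names are illustrative only.
genemark_orfs = 354      # putative expressed ORFs predicted by GeneMark
blast_orfs = 208         # ORFs with significant BLASTX/TBLASTN similarity
supported_by_both = 182  # ORFs supported by both GeneMark and BLAST
conserved_family = 73    # putative new genes in ancient conserved families


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


share_of_genemark = percent(supported_by_both, genemark_orfs)
share_of_blast = percent(supported_by_both, blast_orfs)
conserved_share = percent(conserved_family, genemark_orfs)

print(share_of_genemark, share_of_blast, conserved_share)
```

The three values reproduce the 51.4%, 87.5% and 20.6% figures given in the text.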
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mangelsen, Elke; Kilian, Joachim; Berendzen, Kenneth W.
2008-02-01
WRKY proteins belong to the WRKY-GCM1 superfamily of zinc finger transcription factors that have been subject to a large plant-specific diversification. For the cereal crop barley (Hordeum vulgare), three different WRKY proteins have been characterized so far, as regulators in sucrose signaling, in pathogen defense, and in response to cold and drought, respectively. However, their phylogenetic relationship remained unresolved. In this study, we used the available sequence information to identify a minimum number of 45 barley WRKY transcription factor (HvWRKY) genes. According to their structural features the HvWRKY factors were classified into the previously defined polyphyletic WRKY subgroups 1 to 3. Furthermore, we could assign putative orthologs of the HvWRKY proteins in Arabidopsis and rice. While in most cases clades of orthologous proteins were formed within each group or subgroup, other clades were composed of paralogous proteins for the grasses and Arabidopsis only, which is indicative of specific gene radiation events. To gain insight into their putative functions, we examined expression profiles of WRKY genes from publicly available microarray data resources and found group specific expression patterns. While putative orthologs of the HvWRKY transcription factors have been inferred from phylogenetic sequence analysis, we performed a comparative expression analysis of WRKY genes in Arabidopsis and barley. Indeed, highly correlative expression profiles were found between some of the putative orthologs. HvWRKY genes have not only undergone radiation in monocot or dicot species, but exhibit evolutionary traits specific to grasses. HvWRKY proteins exhibited not only sequence similarities between orthologs with Arabidopsis, but also relatedness in their expression patterns. This correlative expression is indicative for a putative conserved function of related WRKY proteins in mono- and dicot species.
Ciok, Anna; Adamczuk, Marcin; Bartosik, Dariusz; Dziewit, Lukasz
2016-11-28
Pseudomonas strains isolated from the heavily contaminated Lubin copper mine and Zelazny Most post-flotation waste reservoir in Poland were screened for the presence of integrons. This analysis revealed that two strains carried homologous DNA regions composed of a gene encoding a DNA_BRE_C domain-containing tyrosine recombinase (with no significant sequence similarity to other integrases of integrons) plus a three-component array of putative integron gene cassettes. The predicted gene cassettes encode three putative polypeptides with homology to (i) transmembrane proteins, (ii) GCN5 family acetyltransferases, and (iii) hypothetical proteins of unknown function (homologous proteins are encoded by the gene cassettes of several class 1 integrons). Comparative sequence analyses identified three structural variants of these novel integron-like elements within the sequenced bacterial genomes. Analysis of their distribution revealed that they are found exclusively in strains of the genus Pseudomonas.
Xin, Min; Zhang, Peipei; Liu, Wenwen; Ren, Yingdang; Cao, Mengji; Wang, Xifeng
2017-10-01
The complete nucleotide sequence of a novel positive single-stranded (+ss) RNA virus, tentatively named watermelon virus A (WVA), was determined using a combination of three methods: RNA sequencing, small RNA sequencing, and Sanger sequencing. The full genome of WVA comprises 8,372 nucleotides (nt), excluding the poly(A) tail, and contains four open reading frames (ORFs). The largest ORF, ORF1, encodes a putative replication-associated polyprotein (RP) with three conserved domains. ORF2 and ORF4 encode a movement protein (MP) and coat protein (CP), respectively. The putative product encoded by ORF3, with an estimated molecular mass of 25 kDa, has no significant similarity with other proteins. Identity and phylogenetic analysis indicate that WVA is a new virus, closely related to members of the family Betaflexiviridae. However, the final taxonomic allocation of WVA within the family is yet to be determined.
Xuxia, Wang; Jie, Chen; Bo, Wang; Lijun, Liu; Hui, Jiang; Diluo, Tang; Dingxiang, Peng
2012-01-01
For the purpose of screening putative anthracnose resistance-related genes of ramie (Boehmeria nivea L. Gaud), a cDNA library was constructed by suppression subtractive hybridization using anthracnose-resistant cultivar Huazhu no. 4. The cDNAs from Huazhu no. 4, which were infected with Colletotrichum gloeosporioides, were used as the tester and cDNAs from uninfected Huazhu no. 4 as the driver. Sequencing analysis and homology searching showed that these clones represented 132 single genes, which were assigned to functional categories, including 14 putative cellular functions, according to categories established for Arabidopsis. These 132 genes included 35 disease resistance and stress tolerance-related genes including putative heat-shock protein 90, metallothionein, PR-1.2 protein, catalase gene, WRKY family genes, and proteinase inhibitor-like protein. Partial disease-related genes were further analyzed by reverse transcription PCR and RNA gel blot. These expressed sequence tags are the first anthracnose resistance-related expressed sequence tags reported in ramie.
Putative Porin of Bradyrhizobium sp. (Lupinus) Bacteroids Induced by Glyphosate
de María, Nuria; Guevara, Ángeles; Serra, M. Teresa; García-Luque, Isabel; González-Sama, Alfonso; de Lacoba, Mario García; de Felipe, M. Rosario; Fernández-Pascual, Mercedes
2007-01-01
Application of glyphosate (N-[phosphonomethyl] glycine) to Bradyrhizobium sp. (Lupinus)-nodulated lupin plants caused modifications in the protein pattern of bacteroids. The most significant change was the presence of a 44-kDa polypeptide in bacteroids from plants treated with the higher doses of glyphosate employed (5 and 10 mM). The polypeptide has been characterized by the amino acid sequencing of its N terminus and the isolation and nucleic acid sequencing of its encoding gene. It is putatively encoded by a single gene, and the protein has been identified as a putative porin. Protein modeling revealed the existence of several domains sharing similarity to different porins, such as a transmembrane beta-barrel. The protein has been designated BLpp, for Bradyrhizobium sp. (Lupinus) putative porin, and would be the first porin described in Bradyrhizobium sp. (Lupinus). In addition, a putative conserved domain of porins has been identified which consists of 87 amino acids, located in the BLpp sequence 30 amino acids downstream of the N-terminal region. In bacteroids, mRNA of the BLpp gene shows a basal constitutive expression that increases under glyphosate treatment, and the expression of the gene is seemingly regulated at the transcriptional level. By contrast, in free-living bacteria glyphosate treatment leads to an inhibition of BLpp mRNA accumulation, indicating a different effect of glyphosate on BLpp gene expression in bacteroids and free-living bacteria. The possible role of BLpp in a metabolite interchange between Bradyrhizobium and lupin is discussed. PMID:17557843
Diversity of the P2 protein among nontypeable Haemophilus influenzae isolates.
Bell, J; Grass, S; Jeanteur, D; Munson, R S
1994-01-01
The genes for outer membrane protein P2 of four nontypeable Haemophilus influenzae strains were cloned and sequenced. The derived amino acid sequences were compared with the outer membrane protein P2 sequence from H. influenzae type b MinnA and the sequences of P2 from three additional nontypeable H. influenzae strains. The sequences were 76 to 94% identical. The sequences had regions with considerable variability separated by regions which were highly conserved. The variable regions mapped to putative surface-exposed loops of the protein. PMID:8188390
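Percent identity figures such as the 76 to 94% range reported here are computed from pairwise alignments. The sketch below shows one common convention, assuming the two sequences are already aligned to equal length and that columns containing a gap in either sequence are excluded from the denominator. The abstract does not state which convention the authors used, and the two toy sequences are invented for illustration, not real P2 sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical pre-aligned toy sequences: 11 ungapped columns, 9 matches.
toy_a = "MKKTLAAL-IVG"
toy_b = "MKKSLAALAIVA"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identical")
```

Other conventions (for example, dividing by the full alignment length or by the shorter sequence length) give different values for gapped alignments, so reported identities are only comparable when the same denominator is used.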
Melendrez, Melanie C.; Lange, Rachel K.; Cohan, Frederick M.; Ward, David M.
2011-01-01
Previous research has shown that sequences of 16S rRNA genes and 16S-23S rRNA internal transcribed spacer regions may not have enough genetic resolution to define all ecologically distinct Synechococcus populations (ecotypes) inhabiting alkaline, siliceous hot spring microbial mats. To achieve higher molecular resolution, we studied sequence variation in three protein-encoding loci sampled by PCR from 60°C and 65°C sites in the Mushroom Spring mat (Yellowstone National Park, WY). Sequences were analyzed using the ecotype simulation (ES) and AdaptML algorithms to identify putative ecotypes. Between 4 and 14 times more putative ecotypes were predicted from variation in protein-encoding locus sequences than from variation in 16S rRNA and 16S-23S rRNA internal transcribed spacer sequences. The number of putative ecotypes predicted depended on the number of sequences sampled and the molecular resolution of the locus. Chao estimates of diversity indicated that few rare ecotypes were missed. Many ecotypes hypothesized by sequence analyses were different in their habitat specificities, suggesting different adaptations to temperature or other parameters that vary along the flow channel. PMID:21169433
2012-01-01
Background Natrialba magadii is an aerobic chemoorganotrophic member of the Euryarchaeota and is a dual extremophile requiring alkaline conditions and hypersalinity for optimal growth. The genome sequence of Nab. magadii type strain ATCC 43099 was deciphered to obtain a comprehensive insight into the genetic content of this haloarchaeon and to understand the basis of some of the cellular functions necessary for its survival. Results The genome of Nab. magadii consists of four replicons with a total sequence of 4,443,643 bp and encodes 4,212 putative proteins, some of which contain peptide repeats of various lengths. Comparative genome analyses facilitated the identification of genes encoding putative proteins involved in adaptation to hypersalinity, stress response, glycosylation, and polysaccharide biosynthesis. A proton-driven ATP synthase and a variety of putative cytochromes and other proteins supporting aerobic respiration and electron transfer were encoded by one or more of the Nab. magadii replicons. The genome encodes a number of putative proteases/peptidases as well as protein secretion functions. Genes encoding putative transcriptional regulators, basal transcription factors, signal perception/transduction proteins, and chemotaxis/phototaxis proteins were abundant in the genome. Pathways for the biosynthesis of thiamine, riboflavin, heme, cobalamin, coenzyme F420 and other essential co-factors were deduced by in-depth sequence analyses. However, approximately 36% of Nab. magadii protein-coding genes could not be assigned a function based on BLAST analysis and have been annotated as encoding hypothetical or conserved hypothetical proteins. Furthermore, despite extensive comparative genomic analyses, genes necessary for survival in alkaline conditions could not be identified in Nab. magadii. Conclusions Based on genomic analyses, Nab. magadii is predicted to be metabolically versatile and it could use different carbon and energy sources to sustain growth. Nab. magadii has the genetic potential to adapt to its milieu by intracellular accumulation of inorganic cations and/or neutral organic compounds. The identification of Nab. magadii genes involved in coenzyme biosynthesis is a necessary step toward further reconstruction of the metabolic pathways in halophilic archaea and other extremophiles. The knowledge gained from the genome sequence of this haloalkaliphilic archaeon is highly valuable in advancing the applications of extremophiles and their enzymes. PMID:22559199
Chowdhury, Shomeek; Zhang, Jian; Kurgan, Lukasz
2018-05-28
Deciphering a complete landscape of protein-RNA interactions in the human proteome remains an elusive challenge. We computationally elucidate RNA binding proteins (RBPs) using an approach that complements previous efforts. We employ two modern complementary sequence-based methods that provide accurate predictions from the structured and the intrinsically disordered sequences, even in the absence of sequence similarity to the known RBPs. We generate and analyze putative RNA binding residues on the whole proteome scale. Using a conservative setting that ensures a low (5%) false positive rate, we identify 1511 putative RBPs that include 281 known RBPs and 166 RBPs that were previously predicted. We empirically demonstrate that these overlaps are statistically significant. We also validate the putative RBPs based on two major hallmarks of their RNA binding residues: high levels of evolutionary conservation and enrichment in charged amino acids. Moreover, we show that the novel RBPs are significantly under-annotated functionally, which coincides with the fact that they were not yet found to interact with RNAs. We provide two examples of our novel putative RBPs for which there is recent evidence of their interactions with RNAs. The dataset of novel putative RBPs and RNA binding residues for future hypothesis generation is provided in the Supporting Information. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Upadhyay, Atul Kumar; Sowdhamini, Ramanathan
2016-01-01
3D-domain swapping is one of the mechanisms of protein oligomerization and the proteins exhibiting this phenomenon have many biological functions. These proteins, which undergo domain swapping, have attracted much attention owing to their involvement in human diseases, such as conformational diseases, amyloidosis, serpinopathies, proteinopathies etc. Early identification of proteins in the whole human genome that retain a tendency to domain swap will enable many aspects of disease control management. Predictive models were developed by using machine learning approaches with an average accuracy of 78% (85.6% sensitivity, 87.5% specificity and an MCC value of 0.72) to predict putative domain swapping in protein sequences. These models were applied to many complete genomes with special emphasis on the human genome. Nearly 44% of the protein sequences in the human genome were predicted positive for domain swapping. Enrichment analysis was performed on the positively predicted sequences from the human genome for their domain distribution, disease association and functional importance based on Gene Ontology (GO). Enrichment analysis was also performed to infer a better understanding of the functional importance of these sequences. Finally, we developed hinge region prediction, in the given putative domain swapped sequence, by using important physicochemical properties of amino acids.
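The performance measures quoted in this abstract (accuracy, sensitivity, specificity and the Matthews correlation coefficient, MCC) are all derived from a binary confusion matrix. The sketch below uses the standard definitions. The confusion-matrix counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the study's data, and the function name is illustrative.

```python
import math


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts, not taken from the paper.
metrics = classification_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Unlike accuracy, the MCC uses all four cells of the confusion matrix, which is why it is often reported alongside sensitivity and specificity when the positive and negative classes differ in size.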
Collart, F R; Osipiuk, J; Trent, J; Olsen, G J; Huberman, E
1996-10-03
We have cloned and characterized the gene encoding inosine monophosphate dehydrogenase (IMPDH) from Pyrococcus furiosus (Pf), a hyperthermophilic archaeon. Sequence analysis of the Pf gene indicated an open reading frame specifying a protein of 485 amino acids (aa) with a calculated M(r) of 52900. Canonical Archaea promoter elements, Box A and Box B, are located -49 and -17 nucleotides (nt), respectively, upstream of the putative start codon. The sequence of the putative active-site region conforms to the IMPDH signature motif and contains a putative active-site cysteine. Phylogenetic relationships derived by using all available IMPDH sequences are consistent with trees developed for other molecules; they do not precisely resolve the history of Pf IMPDH but indicate a close similarity to bacterial IMPDH proteins. The phylogenetic analysis indicates that a gene duplication occurred prior to the division between rodents and humans, accounting for the Type I and II isoforms identified in mice and humans.
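As a quick plausibility check on the figures in this abstract, the reported length and calculated M(r) imply a mean residue mass close to the roughly 110 Da usually assumed for proteins. The sketch below is arithmetic only. The ORF length assumes that the 485 residues include the initiator methionine and that a single stop codon follows; the abstract does not state either point.

```python
# Values reported in the abstract.
protein_length_aa = 485
calculated_mr = 52900

# Mean mass per residue implied by the two reported values.
mean_residue_mass = calculated_mr / protein_length_aa

# Assumed ORF length: one codon per residue plus one stop codon.
orf_length_nt = (protein_length_aa + 1) * 3

print(f"mean residue mass: {mean_residue_mass:.1f} Da")
print(f"assumed ORF length: {orf_length_nt} nt")
```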
Nucleotide sequence of the gene encoding the nitrogenase iron protein of Thiobacillus ferrooxidans
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pretorius, I.M.; Rawlings, D.E.; O'Neill, E.G.
1987-01-01
The DNA sequence was determined for the cloned Thiobacillus ferrooxidans nifH and part of the nifD genes. The DNA chains were radiolabeled with (α-³²P)dCTP (3000 Ci/mmol) or (α-³⁵S)dCTP (400 Ci/mmol). A putative T. ferrooxidans nifH promoter was identified whose sequences showed perfect consensus with those of the Klebsiella pneumoniae nif promoter. Two putative consensus upstream activator sequences were also identified. The amino acid sequence was deduced from the DNA sequence. In a comparison of nifH DNA sequences from T. ferrooxidans and eight other nitrogen-fixing microbes, a Rhizobium sp. isolated from Parasponia andersonii showed the greatest homology (74%) and Clostridium pasteurianum (nifH1) showed the least homology (54%). In the comparison of the amino acid sequences of the Fe proteins, the Rhizobium sp. and Rhizobium japonicum showed the greatest homology (both 86%) and C. pasteurianum (nifH1 gene product) demonstrated the least homology (56%) to the T. ferrooxidans Fe protein.
Palma, Leopoldo; Muñoz, Delia; Berry, Colin; Murillo, Jesús; Caballero, Primitivo
2014-01-01
In this work, we report the genome sequencing of two Bacillus thuringiensis strains using Illumina next-generation sequencing technology (NGS). Strain Hu4-2, toxic to many lepidopteran pest species and to some mosquitoes, encoded genes for two insecticidal crystal (Cry) proteins, cry1Ia and cry9Ea, and a vegetative insecticidal protein (Vip) gene, vip3Ca2. Strain Leapi01 contained genes coding for seven Cry proteins (cry1Aa, cry1Ca, cry1Da, cry2Ab, cry9Ea and two cry1Ia gene variants) and a vip3 gene (vip3Aa10). A putative novel insecticidal protein gene 1143 bp long was found in both strains, whose sequences exhibited 100% nucleotide identity. The predicted protein showed 57 and 100% pairwise identity to protein sequence 72 from a patented Bt strain (US8318900) and to a putative 41.9-kDa insecticidal toxin from Bacillus cereus, respectively. The 41.9-kDa protein, containing a C-terminal 6× HisTag fusion, was expressed in Escherichia coli and tested for the first time against four lepidopteran species (Mamestra brassicae, Ostrinia nubilalis, Spodoptera frugiperda and S. littoralis) and the green-peach aphid Myzus persicae at doses as high as 4.8 µg/cm2 and 1.5 mg/mL, respectively. At these protein concentrations, the recombinant 41.9-kDa protein caused no mortality or symptoms of impaired growth against any of the insects tested, suggesting that these species are outside the protein’s target range or that the protein may not, in fact, be toxic. While the use of the polymerase chain reaction has allowed a significant increase in the number of Bt insecticidal genes characterized to date, novel NGS technologies promise a much faster, cheaper and efficient screening of Bt pesticidal proteins. PMID:24784323
Scolari, Francesca; Gomulski, Ludvik M.; Ribeiro, José M. C.; Siciliano, Paolo; Meraldi, Alice; Falchetto, Marco; Bonomi, Angelica; Manni, Mosè; Gabrieli, Paolo; Malovini, Alberto; Bellazzi, Riccardo; Aksoy, Serap; Gasperi, Giuliano; Malacrida, Anna R.
2012-01-01
Background Insect seminal fluid is a complex mixture of proteins, carbohydrates and lipids, produced in the male reproductive tract. This seminal fluid is transferred together with the spermatozoa during mating and induces post-mating changes in the female. Molecular characterization of seminal fluid proteins in the Mediterranean fruit fly, Ceratitis capitata, is limited, although studies suggest that some of these proteins are biologically active. Methodology/Principal Findings We report on the functional annotation of 5914 high quality expressed sequence tags (ESTs) from the testes and male accessory glands, to identify transcripts encoding putative secreted peptides that might elicit post-mating responses in females. The ESTs were assembled into 3344 contigs, of which over 33% produced no hits against the nr database, and thus may represent novel or rapidly evolving sequences. Extraction of the coding sequences resulted in a total of 3371 putative peptides. The annotated dataset is available as a hyperlinked spreadsheet. Four hundred peptides were identified with putative secretory activity, including odorant binding proteins, protease inhibitor domain-containing peptides, antigen 5 proteins, mucins, and immunity-related sequences. Quantitative RT-PCR-based analyses of a subset of putative secretory protein-encoding transcripts from accessory glands indicated changes in their abundance after one or more copulations when compared to virgin males of the same age. These changes in abundance, particularly evident after the third mating, may be related to the requirement to replenish proteins to be transferred to the female. Conclusions/Significance We have developed the first large-scale dataset for novel studies on functions and processes associated with the reproductive biology of Ceratitis capitata. The identified genes may help study genome evolution, in light of the high adaptive potential of the medfly. 
In addition, studies of male recovery dynamics in terms of accessory gland gene expression profiles and correlated remating inhibition mechanisms may permit the improvement of pest management approaches. PMID:23071645
Donkey Orchid Symptomless Virus: A Viral ‘Platypus’ from Australian Terrestrial Orchids
Wylie, Stephen J.; Li, Hua; Jones, Michael G. K.
2013-01-01
Complete and partial genome sequences of two isolates of an unusual new plant virus, designated Donkey orchid symptomless virus (DOSV), were identified using a high-throughput sequencing approach. The virus was identified from asymptomatic plants of the Australian terrestrial orchid Diuris longifolia (common donkey orchid) growing in a remnant forest patch near Perth, Western Australia. DOSV was identified from two D. longifolia plants of 264 tested, and from at least one plant of 129 Caladenia latifolia (pink fairy orchid) plants tested. Phylogenetic analysis of the genome revealed open reading frames (ORFs) encoding seven putative proteins of apparently disparate origins. A 69-kDa protein (ORF1) that overlapped the replicase shared low identity with movement proteins (MPs) of plant tymoviruses (Tymoviridae). A 157-kDa replicase (ORF2) and 22-kDa coat protein (CP) (ORF4) shared 32% and 40% amino acid identity, respectively, with homologous proteins encoded by members of the plant virus family Alphaflexiviridae. A 44-kDa protein (ORF3) shared low identity with myosin and an autophagy protein from Squirrelpox virus. A 27-kDa protein (ORF5) shared no identity with described proteins. A 14-kDa protein (ORF6) shared limited sequence identity (26%) over a limited region of the envelope glycoprotein precursor of mammal-infecting Crimea-Congo hemorrhagic fever virus (Bunyaviridae). The putative 25-kDa MP (ORF7) shared limited (27%) identity with 3A-like MPs of members of the plant-infecting Tombusviridae and Virgaviridae. Transmissibility was shown when DOSV systemically infected Nicotiana benthamiana plants. Structure and organization of the domains within the putative replicase of DOSV suggest a common evolutionary origin with ‘potexvirus-like’ replicases of viruses within the Alphaflexiviridae and Tymoviridae, and the CP appears to be ancestral to CPs of allexiviruses (Alphaflexiviridae). The MP shares an evolutionary history with MPs of dianthoviruses, but the other putative proteins are distant from plant viruses. DOSV is not readily classified in current lower order virus taxa. PMID:24223974
Genome sequence of Plasmopara viticola and insight into the pathogenic mechanism
Yin, Ling; An, Yunhe; Qu, Junjie; Li, Xinlong; Zhang, Yali; Dry, Ian; Wu, Huijuan; Lu, Jiang
2017-01-01
Plasmopara viticola causes downy mildew disease of grapevine, which is one of the most devastating diseases of viticulture worldwide. Here we report a 101.3 Mb whole genome sequence of P. viticola isolate ‘JL-7-2’ obtained by a combination of Illumina and PacBio sequencing technologies. The P. viticola genome contains 17,014 putative protein-coding genes and has ~26% repetitive sequences. A total of 1,301 putative secreted proteins, including 100 putative RXLR effectors and 90 CRN effectors, were identified in this genome. In the secretome, 261 potential pathogenicity genes and 95 carbohydrate-active enzymes were predicted. Transcriptional analysis revealed that most of the RXLR effectors, pathogenicity genes and carbohydrate-active enzymes were significantly up-regulated during infection. Comparative genomic analysis revealed that P. viticola evolved independently from the Arabidopsis downy mildew pathogen Hyaloperonospora arabidopsidis. The availability of the P. viticola genome not only provides a valuable resource for comparative genomic analysis and evolutionary studies among oomycetes, but also enhances our knowledge of the mechanism of interactions between this biotrophic pathogen and its host. PMID:28417959
Allen, S P; Polazzi, J O; Gierse, J K; Easton, A M
1992-01-01
In Escherichia coli high-level production of some heterologous proteins (specifically, human prorenin, renin, and bovine insulin-like growth factor 2) resulted in the induction of two new E. coli heat shock proteins, both of which have molecular masses of 16 kDa and are tightly associated with inclusion bodies formed during heterologous protein production. We named these inclusion body-associated proteins IbpA and IbpB. The coding sequences for IbpA and IbpB were identified and isolated from the Kohara E. coli gene bank. The genes for these proteins (ibpA and ibpB) are located at 82.5 min on the chromosome. Nucleotide sequencing of the two genes revealed that they are transcribed in the same direction and are separated by 110 bp. Putative Shine-Dalgarno sequences are located upstream from the initiation codons of both genes. A putative heat shock promoter is located upstream from ibpA, and a putative transcription terminator is located downstream from ibpB. A temperature upshift experiment in which we used a wild-type E. coli strain and an isogenic rpoH mutant strain indicated that a sigma 32-containing RNA polymerase is involved in the regulation of expression of these genes. There is 57.5% identity between the genes at the nucleotide level and 52.2% identity at the amino acid level. A search of the protein data bases showed that both of these 16-kDa proteins exhibit low levels of homology to low-molecular-weight heat shock proteins from eukaryotic species. PMID:1356969
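Identity figures such as the 57.5% (nucleotide) and 52.2% (amino acid) values quoted above are typically obtained by counting matching columns in a pairwise alignment. The following is a minimal sketch of that calculation, assuming two already-aligned, equal-length strings with `-` for gaps. The function name and the toy sequences are hypothetical (they are not ibpA/ibpB), and ignoring gap columns is an assumption, since the abstract does not state which convention was used.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identical positions over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: gap columns are left out of the denominator
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment (invented sequences, for illustration only)
toy_a = "ATGCGT-ACA"
toy_b = "ATGAGTTACC"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.1f}% identity over ungapped columns")
```

Other conventions (for example, dividing by the full alignment length) give lower values for the same alignment, so reported identities are only comparable when the denominator is the same.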
Generation and Analysis of Expressed Sequence Tags from Olea europaea L.
Ozdemir Ozgenturk, Nehir; Oruç, Fatma; Sezerman, Ugur; Kuçukural, Alper; Vural Korkut, Senay; Toksoz, Feriha; Un, Cemal
2010-01-01
Olive (Olea europaea L.), an important source of edible oil, originated in the Near East. In this study, two cDNA libraries were constructed from young olive leaves and immature olive fruits to generate ESTs, with the aim of discovering novel genes and investigating the function of unknown olive genes. A total of 3840 randomly selected colonies from both libraries were sequenced for EST collection. The 2228 readable sequences from olive leaf and 1506 from olive fruit were assembled into 205 and 69 contigs, respectively, whereas 2478 were singletons. Putative functions of all 2752 differentially expressed unique sequences were assigned by gene homology based on BLAST and annotated using BLAST2GO. While 1339 ESTs showed no homology to database entries, 2024 ESTs had homology (under 80%) with hypothetical proteins, putative proteins, expressed proteins, and unknown proteins in NCBI GenBank. A further 635 unique EST sequences were identified by over 80% homology to genes of known function in other species that had not previously been described in the Olea family. Only 3.1% of the total ESTs showed similarity with the olive data available in NCBI. The EST data and consensus sequences generated here were submitted to NCBI as a valuable resource for functional genomic studies of olive. PMID:21197085
Identification and application of self-binding zipper-like sequences in SARS-CoV spike protein.
Zhang, Si Min; Liao, Ying; Neo, Tuan Ling; Lu, Yanning; Liu, Ding Xiang; Vahlne, Anders; Tam, James P
2018-05-22
Self-binding peptides containing zipper-like sequences, such as the Leu/Ile zipper sequence within the coiled coil regions of proteins and the cross-β spine steric zippers within the amyloid-like fibrils, could bind to the protein-of-origin through homophilic sequence-specific zipper motifs. These self-binding sequences represent opportunities for the development of biochemical tools and/or therapeutics. Here, we report on the identification of a putative self-binding β-zipper-forming peptide within the severe acute respiratory syndrome-associated coronavirus spike (S) protein and its application in viral detection. Peptide array scanning of overlapping peptides covering the entire length of S protein identified 34 putative self-binding peptides of six clusters, five of which contained octapeptide core consensus sequences. The Cluster I consensus octapeptide sequence GINITNFR was predicted by the Eisenberg's 3D profile method to have high amyloid-like fibrillation potential through steric β-zipper formation. Peptide C6 containing the Cluster I consensus sequence was shown to oligomerize and form amyloid-like fibrils. Taking advantage of this, C6 was further applied to detect the S protein expression in vitro by fluorescence staining. Meanwhile, the coiled-coil-forming Leu/Ile heptad repeat sequences within the S protein were under-represented during peptide array scanning, in agreement with that long peptide lengths were required to attain high helix-mediated interaction avidity. The data suggest that short β-zipper-like self-binding peptides within the S protein could be identified through combining the peptide scanning and predictive methods, and could be exploited as biochemical detection reagents for viral infection. Copyright © 2018. Published by Elsevier Ltd.
Flot, Jean-François; Tillier, Simon
2007-10-15
The complete mitochondrial genomes of two individuals attributed to different morphospecies of the scleractinian coral genus Pocillopora have been sequenced. Both genomes, respectively 17,415 and 17,422 nt long, share the presence of a previously undescribed ORF encoding a putative protein made up of 302 amino acids and of unknown function. Surprisingly, this ORF turns out to be the second most variable region of the mitochondrial genome (1% nucleotide sequence difference between the two individuals) after the putative control region (1.5% sequence difference). Except for the presence of this ORF and for the location of the putative control region, the mitochondrial genome of Pocillopora is organized in a fashion similar to the other scleractinian coral genomes published to date. For the first time in a cnidarian, a putative second origin of replication is described based on its secondary structure similar to the stem-loop structure of O(L), the origin of L-strand replication in vertebrates.
Miguel, Célia; Simões, Marta; Oliveira, Maria Margarida; Rocheta, Margarida
2008-11-01
Retroviruses differ from retrotransposons due to their infective capacity, which depends critically on the encoded envelope. Some plant retroelements contain domains reminiscent of the env of animal retroviruses but the number of such elements described to date is restricted to angiosperms. We show here the first evidence of the presence of putative env-like gene sequences in a gymnosperm species, Pinus pinaster (maritime pine). Using a degenerate primer approach for conserved domains of the RNaseH gene, three clones from putative envelope-like retrotransposons (PpRT2, PpRT3, and PpRT4) were identified. The env-like sequences of P. pinaster clones are predicted to encode proteins with transmembrane domains. These sequences showed identity scores of up to 30% with env-like sequences belonging to different organisms. A phylogenetic analysis based on protein alignment of deduced amino acid sequences revealed that these clones clustered with env-containing plant retrotransposons, as well as with retrotransposons from invertebrate organisms. The differences found among the sequences of maritime pine clones isolated here suggest the existence of different putative classes of env-like retroelements. The identification for the first time of env-like genes in a gymnosperm species may support the ancestrality of retroviruses among plants, shedding light on their role in plant evolution.
Puthoff, D P; Neelam, A; Ehrenfried, M L; Scheffler, B E; Ballard, L; Song, Q; Campbell, K B; Cooper, B; Tucker, M L
2008-10-01
Hyphae, 2 to 8 days postinoculation (dpi), and haustoria, 5 dpi, were isolated from Uromyces appendiculatus infected bean leaves (Phaseolus vulgaris cv. Pinto 111) and a separate cDNA library prepared for each fungal preparation. Approximately 10,000 hyphae and 2,700 haustoria clones were sequenced from both the 5' and 3' ends. Assembly of all of the fungal sequences yielded 3,359 contigs and 927 singletons. The U. appendiculatus sequences were compared with sequence data for other rust fungi, Phakopsora pachyrhizi, Uromyces fabae, and Puccinia graminis. The U. appendiculatus haustoria library included a large number of genes with unknown cellular function; however, summation of sequences of known cellular function suggested that haustoria at 5 dpi had fewer transcripts linked to protein synthesis in favor of energy metabolism and nutrient uptake. In addition, open reading frames in the U. appendiculatus data set with an N-terminal signal peptide were identified and compared with other proteins putatively secreted from rust fungi. In this regard, a small family of putatively secreted RTP1-like proteins was identified in U. appendiculatus and P. graminis.
Wei, Dan-Dan; Chen, Er-Hu; Ding, Tian-Bo; Chen, Shi-Chun; Dou, Wei; Wang, Jin-Jun
2013-01-01
Background: As a major stored-product pest insect, Liposcelis entomophila has developed high levels of resistance to various insecticides in grain storage systems. However, the molecular mechanisms underlying resistance and environmental stress have not been characterized. To date, there is a lack of genomic information for this species. Therefore, studies aimed at profiling the L. entomophila transcriptome would provide a better understanding of the biological functions at the molecular level. Methodology/Principal Findings: We applied Illumina sequencing technology to sequence the transcriptome of L. entomophila. A total of 54,406,328 clean reads were obtained and de novo assembled into 54,220 unigenes, with an average length of 571 bp. Through a similarity search, 33,404 (61.61%) unigenes were matched to known proteins in the NCBI non-redundant (Nr) protein database. These unigenes were further functionally annotated with gene ontology (GO), cluster of orthologous groups of proteins (COG), and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. A large number of genes potentially involved in insecticide resistance were manually curated, including 68 putative cytochrome P450 genes, 37 putative glutathione S-transferase (GST) genes, 19 putative carboxyl/cholinesterase (CCE) genes, and 126 other transcripts containing target site sequences or encoding detoxification genes representing eight types of resistance enzymes. Furthermore, to gain insight into the molecular basis of the L. entomophila response to thermal stress, 25 heat shock protein (Hsp) genes were identified. In addition, 1,100 SSRs and 57,757 SNPs were detected and 231 pairs of SSR primers were designed for investigating genetic diversity in the future. Conclusions/Significance: We developed a comprehensive transcriptomic database for L. entomophila. These sequences and putative molecular markers would further promote our understanding of the molecular mechanisms underlying insecticide resistance or environmental stress, and will facilitate studies on population genetics for psocids, as well as providing useful information for functional genomic research in the future. PMID:24244605
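The abstract reports SSR (simple sequence repeat) detection but does not name the tool or thresholds used. As a rough illustration of what such a scan involves, here is a minimal regular-expression sketch. The motif length range (2 to 6 nt), the minimum of four tandem copies, the function name `find_ssrs`, and the toy sequence are all assumptions made for this example and are not taken from the study.

```python
import re

# Assumed thresholds (not from the abstract): motif of 2-6 nt, at least 4 tandem copies.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{3,}")


def find_ssrs(seq):
    """Return (start, motif, copies) for each simple sequence repeat found."""
    hits = []
    for m in SSR_PATTERN.finditer(seq.upper()):
        motif = m.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs that match as e.g. 'AA' repeats
        copies = len(m.group(0)) // len(motif)
        hits.append((m.start(), motif, copies))
    return hits


# Invented toy sequence: an (AC)5 repeat and an (ATG)4 repeat with short flanks
toy = "TT" + "AC" * 5 + "CC" + "ATG" * 4 + "CC"
ssrs = find_ssrs(toy)
for start, motif, copies in ssrs:
    print(f"position {start}: ({motif}){copies}")
```

Dedicated SSR finders apply different minimum copy numbers per motif length and handle compound repeats, so counts from this sketch would not be expected to reproduce the 1,100 SSRs reported.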
Zhu, Yu-Cheng; Specht, Charles A; Dittmer, Neal T; Muthukrishnan, Subbaratnam; Kanost, Michael R; Kramer, Karl J
2002-11-01
Glycosyltransferases are enzymes that synthesize oligosaccharides, polysaccharides and glycoconjugates. One type of glycosyltransferase is chitin synthase, a very important enzyme in biology, which is utilized by insects, fungi, and other invertebrates to produce chitin, a polysaccharide of beta-1,4-linked N-acetylglucosamine. Chitin is an important component of the insect's exoskeletal cuticle and gut lining. To identify and characterize a chitin synthase gene of the tobacco hornworm, Manduca sexta, degenerate primers were designed from two highly conserved regions in fungal and nematode chitin synthase protein sequences and then used to amplify a similar region from Manduca cDNA. A full-length cDNA of 5152 nucleotides was assembled for the putative Manduca chitin synthase gene, MsCHS1, and sequencing of genomic DNA verified the contiguity of the sequence. The MsCHS1 cDNA has an ORF of 4692 nucleotides that encodes a transmembrane protein of 1564 amino acid residues with a mass of approximately 179 kDa (GenBank no. AY062175). It is most similar, over its entire length of protein sequence, to putative chitin synthases from other insects and nematodes, with 68% identity to enzymes from both the blow fly, Lucilia cuprina, and the fruit fly, Drosophila melanogaster. The similarity with fungal chitin synthases is restricted to the putative catalytic domain, and the MsCHS1 protein has, at equivalent positions, several amino acids that are essential for activity as revealed by mutagenesis of the fungal enzymes. A 5.3-kb transcript of MsCHS1 was identified by northern blot hybridization of RNA from larval epidermis, suggesting that the enzyme functions to make chitin deposited in the cuticle. Further examination by RT-PCR showed that MsCHS1 expression is regulated in the epidermis, with the amount of transcript increasing during phases of cuticle deposition.
Sequence analysis and expression of the M1 and M2 matrix protein genes of hirame rhabdovirus (HIRRV)
Nishizawa, T.; Kurath, G.; Winton, J.R.
1997-01-01
We have cloned and sequenced a 2318 nucleotide region of the genomic RNA of hirame rhabdovirus (HIRRV), an important viral pathogen of Japanese flounder Paralichthys olivaceus. This region comprises approximately two-thirds of the 3' end of the nucleocapsid protein (N) gene and the complete matrix protein (M1 and M2) genes with the associated intergenic regions. The partial N gene sequence was 812 nucleotides in length with an open reading frame (ORF) that encoded the carboxyl-terminal 250 amino acids of the N protein. The M1 and M2 genes were 771 and 700 nucleotides in length, respectively, with ORFs encoding proteins of 227 and 193 amino acids. The M1 gene sequence contained an additional small ORF that could encode a highly basic, arginine-rich protein of 25 amino acids. Comparisons of the N, M1, and M2 gene sequences of HIRRV with the corresponding sequences of the fish rhabdoviruses, infectious hematopoietic necrosis virus (IHNV) or viral hemorrhagic septicemia virus (VHSV) indicated that HIRRV was more closely related to IHNV than to VHSV, but was clearly distinct from either. The putative consensus gene termination sequence for IHNV and VHSV, AGAYAG(A)(7), was present in the N-M1, M1-M2, and M2-G intergenic regions of HIRRV as were the putative transcription initiation sequences YGGCAC and AACA. An Escherichia coli expression system was used to produce recombinant proteins from the M1 and M2 genes of HIRRV. These were the same size as the authentic M1 and M2 proteins and reacted with anti-HIRRV rabbit serum in western blots. These reagents can be used for further study of the fish immune response and to test novel control methods.
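The consensus termination sequence quoted above, AGAYAG(A)(7), uses the IUPAC code Y for a pyrimidine (C or T) followed by a run of seven A residues. A minimal sketch of how such a degenerate motif can be located in a sequence is shown below. The helper names and the toy sequence are hypothetical, and the sketch assumes the sequence is supplied in the same sense as the motif is written (with U read as T); it does not use any HIRRV data.

```python
import re

# Subset of IUPAC nucleotide codes; Y = pyrimidine (C or T), R = purine (A or G)
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "Y": "[CT]", "R": "[AG]", "N": "[ACGT]"}


def iupac_to_regex(motif):
    """Translate an IUPAC-coded motif into a regular expression string."""
    return "".join(IUPAC[base] for base in motif)


# AGAYAG followed by exactly seven A residues, as written in the abstract
TERMINATION = re.compile(iupac_to_regex("AGAYAG") + "A{7}")


def find_termination_sites(seq):
    """Return 0-based start positions of non-overlapping motif matches."""
    seq = seq.upper().replace("U", "T")
    return [m.start() for m in TERMINATION.finditer(seq)]


# Invented toy sequence containing both the T and the C variant of the motif
toy = "GGCC" + "AGATAG" + "A" * 7 + "CGGCAC" + "TT" + "AGACAG" + "A" * 7 + "TT"
sites = find_termination_sites(toy)
print(sites)
```

The same `iupac_to_regex` helper could be pointed at the putative initiation sequence YGGCAC, which also contains a Y position.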
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ris-Stalpers, C.; Verleun-Mooijman, M.C.T.; Blaeij, T.J.P. de
1994-04-01
The analysis of the androgen receptor (AR) gene, mRNA, and protein in a subject with X-linked Reifenstein syndrome (partial androgen insensitivity) is reported. The presence of two mature AR transcripts in genital skin fibroblasts of the patient is established, and, by reverse transcriptase-PCR and RNase transcription analysis, the wild-type transcript and a transcript in which exon 3 sequences are absent without disruption of the translational reading frame are identified. Sequencing and hybridization analysis show a deletion of >6 kb in intron 2 of the human AR gene, starting 18 bp upstream of exon 3. The deletion includes the putative branch-point sequence (BPS) but not the acceptor splice site on the intron 2/exon 3 boundary. The deletion of the putative intron 2 BPS results in 90% inhibition of wild-type splicing. The mutant transcript encodes an AR protein lacking the second zinc finger of the DNA-binding domain. Western/immunoblotting analysis is used to show that the mutant AR protein is expressed in genital skin fibroblasts of the patient. The residual 10% wild-type transcript can be the result of the use of a cryptic BPS located 63 bp upstream of the intron 2/exon 3 boundary of the mutant AR gene. The mutated AR protein has no transcription-activating potential and does not influence the transactivating properties of the wild-type AR, as tested in cotransfection studies. It is concluded that the partial androgen-insensitivity syndrome of this patient is the consequence of the limited amount of wild-type AR protein expressed in androgen target cells, resulting from the deletion of the intron 2 putative BPS. 42 refs., 6 figs., 1 tab.
Sela, Noa; Lachman, Oded; Reingold, Victoria; Dombrovsky, Aviv
2013-10-01
A novel virus was detected in watermelon plants (Citrullus lanatus Thunb.) infected with Melon necrotic spot virus (MNSV) using SOLiD next-generation sequence analysis. In addition to the expected MNSV genome, two double-stranded RNA (dsRNA) segments of 1,312 and 1,118 bp were also identified and sequenced from the purified virus preparations. These two dsRNA segments encode two putative partitivirus-related proteins, an RNA-dependent RNA polymerase (RdRP) and a capsid protein, which were sequenced. Genomic-sequence analysis and analysis of phylogenetic relationships indicate that these two dsRNAs together make up the genome of a novel Partitivirus. This virus was found to be closely related to the Pepper cryptic virus 1 and Raphanus sativus cryptic virus. It is suggested that this novel virus putatively named Citrullus lanatus cryptic virus be considered as a new member of the family Partitiviridae.
Ringwald, M; Schuh, R; Vestweber, D; Eistetter, H; Lottspeich, F; Engel, J; Dölz, R; Jähnig, F; Epplen, J; Mayer, S
1987-01-01
We have determined the amino acid sequence of the Ca2+-dependent cell adhesion molecule uvomorulin as it appears on the cell surface. The extracellular part of the molecule exhibits three internally repeated domains of 112 residues which are most likely generated by gene duplication. Each of the repeated domains contains two highly conserved units which could represent putative Ca2+-binding sites. Secondary structure predictions suggest that the putative Ca2+-binding units are located in external loops at the surface of the protein. The protein sequence exhibits a single membrane-spanning region and a cytoplasmic domain. Sequence comparison reveals extensive homology to the chicken L-CAM. Both uvomorulin and L-CAM are identical in 65% of their entire amino acid sequence suggesting a common origin for both CAMs. PMID:3501370
Whole-Genome Survey of the Putative ATP-Binding Cassette Transporter Family Genes in Vitis vinifera
Çakır, Birsen; Kılıçkaya, Ozan
2013-01-01
The ATP-binding cassette (ABC) protein superfamily constitutes one of the largest protein families known in plants. In this report, we performed a complete inventory of ABC protein genes in Vitis vinifera, the whole genome of which has been sequenced. By comparison with ABC protein members of Arabidopsis thaliana, we identified 135 putative ABC proteins with 1 or 2 NBDs in V. vinifera. Of these, 120 encode intrinsic membrane proteins, and 15 encode proteins missing TMDs. V. vinifera ABC proteins can be divided into 13 subfamilies with 79 “full-size,” 41 “half-size,” and 15 “soluble” putative ABC proteins. The main feature of the Vitis ABC superfamily is the presence of 2 large subfamilies, ABCG (pleiotropic drug resistance and white-brown complex homolog) and ABCC (multidrug resistance-associated protein). We identified orthologs of V. vinifera putative ABC transporters in different species. This work represents the first complete inventory of ABC transporters in V. vinifera. The identification of Vitis ABC transporters and their comparative analysis with the Arabidopsis counterparts revealed a strong conservation between the 2 species. This inventory could help elucidate the biological and physiological functions of these transporters in V. vinifera. PMID:24244377
Vettore, André L.; da Silva, Felipe R.; Kemper, Edson L.; Souza, Glaucia M.; da Silva, Aline M.; Ferro, Maria Inês T.; Henrique-Silva, Flavio; Giglioti, Éder A.; Lemos, Manoel V.F.; Coutinho, Luiz L.; Nobrega, Marina P.; Carrer, Helaine; França, Suzelei C.; Bacci, Maurício; Goldman, Maria Helena S.; Gomes, Suely L.; Nunes, Luiz R.; Camargo, Luis E.A.; Siqueira, Walter J.; Van Sluys, Marie-Anne; Thiemann, Otavio H.; Kuramae, Eiko E.; Santelli, Roberto V.; Marino, Celso L.; Targon, Maria L.P.N.; Ferro, Jesus A.; Silveira, Henrique C.S.; Marini, Danyelle C.; Lemos, Eliana G.M.; Monteiro-Vitorello, Claudia B.; Tambor, José H.M.; Carraro, Dirce M.; Roberto, Patrícia G.; Martins, Vanderlei G.; Goldman, Gustavo H.; de Oliveira, Regina C.; Truffi, Daniela; Colombo, Carlos A.; Rossi, Magdalena; de Araujo, Paula G.; Sculaccio, Susana A.; Angella, Aline; Lima, Marleide M.A.; de Rosa, Vicente E.; Siviero, Fábio; Coscrato, Virginia E.; Machado, Marcos A.; Grivet, Laurent; Di Mauro, Sonia M.Z.; Nobrega, Francisco G.; Menck, Carlos F.M.; Braga, Marilia D.V.; Telles, Guilherme P.; Cara, Frank A.A.; Pedrosa, Guilherme; Meidanis, João; Arruda, Paulo
2003-01-01
To contribute to our understanding of the genome complexity of sugarcane, we undertook a large-scale expressed sequence tag (EST) program. More than 260,000 cDNA clones were partially sequenced from 26 standard cDNA libraries generated from different sugarcane tissues. After the processing of the sequences, 237,954 high-quality ESTs were identified. These ESTs were assembled into 43,141 putative transcripts. Of the assembled sequences, 35.6% presented no matches with existing sequences in public databases. A global analysis of the whole SUCEST data set indicated that 14,409 assembled sequences (33% of the total) contained at least one cDNA clone with a full-length insert. Annotation of the 43,141 assembled sequences associated almost 50% of the putative identified sugarcane genes with protein metabolism, cellular communication/signal transduction, bioenergetics, and stress responses. Inspection of the translated assembled sequences for conserved protein domains revealed 40,821 amino acid sequences with 1415 Pfam domains. Reassembling the consensus sequences of the 43,141 transcripts revealed a 22% redundancy in the first assembling. This indicated that possibly 33,620 unique genes had been identified and indicated that >90% of the sugarcane expressed genes were tagged. PMID:14613979
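The percentages and the unique-gene estimate in this abstract can be related to one another with a few lines of arithmetic. The sketch below uses only numbers quoted in the abstract; the variable names are hypothetical, and the interpretation (unique genes estimated as assembled transcripts reduced by the redundancy fraction) is an assumption about how the figures connect, not a statement of the authors' procedure.

```python
assembled = 43_141        # putative transcripts after the first assembly
full_length = 14_409      # assembled sequences with at least one full-length clone
redundancy = 0.22         # redundancy found on reassembly (rounded in the abstract)
reported_unique = 33_620  # unique genes stated in the abstract

# Share of assembled sequences with a full-length insert (abstract: 33%)
full_length_share = 100 * full_length / assembled

# Unique genes implied by a redundancy of exactly 22%
estimated_unique = assembled * (1 - redundancy)

# Redundancy implied by the reported count of unique genes
implied_redundancy = 100 * (1 - reported_unique / assembled)

print(f"full-length share: {full_length_share:.1f}%")
print(f"unique genes at 22% redundancy: {estimated_unique:.0f}")
print(f"redundancy implied by 33,620 genes: {implied_redundancy:.2f}%")
```

Under this reading, the reported 33,620 corresponds to a redundancy slightly above 22%, which is consistent with the abstract quoting a rounded percentage.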
Zimmermann, K; Herget, T; Salbaum, J M; Schubert, W; Hilbich, C; Cramer, M; Masters, C L; Multhaup, G; Kang, J; Lemaire, H G
1988-01-01
Cloning and sequence analysis revealed the putative amyloid A4 precursor (pre-A4) of Alzheimer's disease to have characteristics of a membrane-spanning glycoprotein. In addition to brain, pre-A4 mRNA was found in adult human muscle and other tissues. We demonstrate by in situ hybridization that pre-A4 mRNA is present in adult human muscle, in cultured human myoblasts and myotubes. Immunofluorescence with antipeptide antibodies shows the putative pre-A4 protein to be expressed in adult human muscle and associated with some but not all nuclear envelopes. Despite high levels of a single 3.5-kb pre-A4 mRNA species in cultured myoblasts and myotubes, the presence of putative pre-A4 protein could not be detected by immunofluorescence. This suggests that putative pre-A4 protein is stabilized and therefore functioning in the innervated muscle tissue but not in developing, i.e. non-innervated cultured muscle cells. The selective localization of the protein on distinct nuclear envelopes could reflect an interaction with motor endplates. PMID:2896589
Transcriptomics of the Bed Bug (Cimex lectularius)
Rajarapu, Swapna P.; Jones, Susan C.; Mittapalli, Omprakash
2011-01-01
Background: Bed bugs (Cimex lectularius) are blood-feeding insects poised to become one of the major pests in households throughout the United States. Resistance of C. lectularius to insecticides/pesticides is one factor thought to be involved in its sudden resurgence. Despite its high-impact status, scant knowledge exists at the genomic level for C. lectularius. Hence, we subjected the C. lectularius transcriptome to 454 pyrosequencing in order to identify potential genes involved in pesticide resistance. Methodology and Principal Findings: Using 454 pyrosequencing, we obtained a total of 216,419 reads with 79,596,412 bp, which were assembled into 35,646 expressed sequence tags (3902 contigs and 31744 singletons). Nearly 85.9% of the C. lectularius sequences showed similarity to insect sequences, but 44.8% of the deduced proteins of C. lectularius did not show similarity with sequences in the GenBank non-redundant database. KEGG analysis revealed putative members of several detoxification pathways involved in pesticide resistance. Lamprin domains, Protein Kinase domains, Protein Tyrosine Kinase domains and cytochrome P450 domains were among the top Pfam domains predicted for the C. lectularius sequences. An initial assessment of putative defense genes, including a cytochrome P450 and a glutathione-S-transferase (GST), revealed high transcript levels for the cytochrome P450 (CYP9) in pesticide-exposed versus pesticide-susceptible C. lectularius populations. A significant number of single nucleotide polymorphisms (296) and microsatellite loci (370) were predicted in the C. lectularius sequences. Furthermore, 59 putative sequences of Wolbachia were retrieved from the database. Conclusions: To our knowledge this is the first study to elucidate the genetic makeup of C. lectularius. This pyrosequencing effort provides clues to the identification of potential detoxification genes involved in pesticide resistance of C. lectularius and lays the foundation for future functional genomics studies. PMID:21283830
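The sequencing totals in this abstract allow two quick consistency checks: whether contigs plus singletons add up to the stated number of expressed sequence tags, and what mean read length the read and base counts imply. The sketch below uses only figures from the abstract; the variable names are hypothetical and the mean read length is a derived value, not one the authors report.

```python
reads = 216_419
bases = 79_596_412
contigs = 3_902
singletons = 31_744
reported_ests = 35_646

# Contigs and singletons should account for every assembled EST
assembled_total = contigs + singletons

# Mean read length implied by the totals (derived here, not stated in the abstract)
mean_read_length = bases / reads

print(f"contigs + singletons = {assembled_total} (reported: {reported_ests})")
print(f"implied mean read length: {mean_read_length:.1f} bp")
```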
Elrobh, Mohamed S.; Alanazi, Mohammad S.; Khan, Wajahatullah; Abduljaleel, Zainularifeen; Al-Amri, Abdullah; Bazzi, Mohammad D.
2011-01-01
Heat shock proteins are ubiquitous, induced under a number of environmental and metabolic stresses, with highly conserved DNA sequences among mammalian species. Camelus dromedarius (the Arabian camel), domesticated under semi-desert environments, is well adapted to tolerate and survive severe drought and high temperatures for extended periods. This is the first report of molecular cloning and characterization of a full-length cDNA encoding a putative stress-induced heat shock HSPA6 protein (also called HSP70B′) from the Arabian camel. A full-length cDNA (2417 bp) was obtained by rapid amplification of cDNA ends (RACE) and cloned in the pET-b expression vector. Sequence analysis of the HSPA6 gene showed a 1932-bp open reading frame encoding 643 amino acids. The complete cDNA sequence of the Arabian camel HSPA6 gene was submitted to NCBI GenBank (accession number HQ214118.1). BLAST analysis indicated that the C. dromedarius HSPA6 nucleotide sequence shared high similarity (77–91%) with heat shock gene sequences of other mammals. The deduced 643-amino-acid sequence (accession number ADO12067.1) showed that the predicted protein has an estimated molecular weight of 70.5 kDa with a predicted isoelectric point (pI) of 6.0. Comparative analyses of the camel HSPA6 protein sequence with other mammalian heat shock proteins (HSPs) showed high identity (80–94%). The predicted camel HSPA6 protein structure, obtained using Protein 3D structural analysis, showed high similarity with human and mouse HSPs. Taken together, this study indicates that the cDNA sequence of the HSPA6 gene and its amino acid sequence and protein structure from the Arabian camel are highly conserved and have similarities with other mammalian species. PMID:21845074
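The relationship between the 1932-bp open reading frame, the 643 residues, and the estimated 70.5 kDa mass can be checked with simple arithmetic. The sketch below assumes the ORF length includes the stop codon and uses the common rule of thumb of roughly 110 Da per amino acid residue. Both are assumptions for illustration; the abstract does not say how its molecular weight was computed, and the function name is hypothetical.

```python
AVG_RESIDUE_DA = 110.0  # rule-of-thumb average residue mass; an approximation


def residues_from_orf(orf_nt, includes_stop=True):
    """Number of amino acids encoded by an ORF of the given nucleotide length."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nt // 3
    # The stop codon occupies one codon but adds no residue
    return codons - 1 if includes_stop else codons


residues = residues_from_orf(1932)               # 644 codons, one of them a stop
approx_kda = residues * AVG_RESIDUE_DA / 1000.0  # coarse mass estimate

print(f"{residues} residues, roughly {approx_kda:.1f} kDa")
```

The coarse estimate lands close to the reported 70.5 kDa; an exact value depends on the actual amino acid composition, which this sketch does not use.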
Partial DNA sequencing of Douglas-fir cDNAs used in RFLP mapping
K.D. Jermstad; D.L. Bassoni; C.S. Kinlaw; D.B. Neale
1998-01-01
DNA sequences from 87 Douglas-fir (Pseudotsuga menziesii [Mirb.] Franco) cDNA RFLP probes were determined. Sequences were submitted to the GenBank dbEST database and searched for similarity against nucleotide and protein databases using the BLASTn and BLASTx programs. Twenty-one sequences (24%) were assigned putative functions; 18 of which...
An insight into the sialotranscriptome of the seed-feeding bug, Oncopeltus fasciatus.
Francischetti, Ivo M B; Lopes, Angela H; Dias, Felipe A; Pham, Van M; Ribeiro, José M C
2007-09-01
The salivary transcriptome of the seed-feeding hemipteran, Oncopeltus fasciatus (milkweed bug), is described following assembly of 1025 expressed sequence tags (ESTs) into 305 clusters of related sequences. Inspection of these sequences reveals abundance of low complexity, putative secreted products rich in the amino acids (aa) glycine, serine or threonine, which might function as silk or mucins and assist food canal lubrication and sealing of the feeding site around the mouthparts. Several protease inhibitors were found, including abundant expression of cystatin transcripts that may inhibit cysteine proteases common in seeds that might injure the insect or induce plant apoptosis. Serine proteases and lipases are described that might assist digestion and liquefaction of seed proteins and oils. Finally, several novel putative proteins are described with no known function that might affect plant physiology or act as antimicrobials.
Reizer, J.; Hoischen, C.; Reizer, A.; Pham, T. N.; Saier, M. H.
1993-01-01
We have previously reported the overexpression, purification, and biochemical properties of the Bacillus subtilis Enzyme I of the phosphoenolpyruvate: sugar phosphotransferase system (PTS) (Reizer, J., et al., 1992, J. Biol. Chem. 267, 9158-9169). We now report the sequencing of the ptsI gene of B. subtilis encoding Enzyme I (570 amino acids and 63,076 Da). Putative transcriptional regulatory signals are identified, and the pts operon is shown to be subject to carbon source-dependent regulation. Multiple alignments of the B. subtilis Enzyme I with (1) six other sequenced Enzymes I of the PTS from various bacterial species, (2) phosphoenolpyruvate synthase of Escherichia coli, and (3) bacterial and plant pyruvate: phosphate dikinases (PPDKs) revealed regions of sequence similarity as well as divergence. Statistical analyses revealed that these three types of proteins comprise a homologous family, and the phylogenetic tree of the 11 sequenced protein members of this family was constructed. This tree was compared with that of the 12 sequenced HPr proteins or protein domains. Antibodies raised against the B. subtilis and E. coli Enzymes I exhibited immunological cross-reactivity with each other as well as with PPDK of Bacteroides symbiosus, providing support for the evolutionary relationships of these proteins suggested from the sequence comparisons. Putative flexible linkers tethering the N-terminal and the C-terminal domains of protein members of the Enzyme I family were identified, and their potential significance with regard to Enzyme I function is discussed. The codon choice pattern of the B. subtilis and E. coli ptsI and ptsH genes was found to exhibit a bias toward optimal codons in these organisms. (ABSTRACT TRUNCATED AT 250 WORDS) PMID:7686067
Evolutionary profiles from the QR factorization of multiple sequence alignments
Sethi, Anurag; O'Donoghue, Patrick; Luthey-Schulten, Zaida
2005-01-01
We present an algorithm to generate complete evolutionary profiles that represent the topology of the molecular phylogenetic tree of the homologous group. The method, based on the multidimensional QR factorization of numerically encoded multiple sequence alignments, removes redundancy from the alignments and orders the protein sequences by increasing linear dependence, resulting in the identification of a minimal basis set of sequences that spans the evolutionary space of the homologous group of proteins. We observe a general trend that these smaller, more evolutionarily balanced profiles have comparable and, in many cases, better performance in database searches than conventional profiles containing hundreds of sequences, constructed in an iterative and computationally intensive procedure. For more diverse families or superfamilies, with sequence identity <30%, structural alignments, based purely on the geometry of the protein structures, provide better alignments than pure sequence-based methods. Merging the structure and sequence information allows the construction of accurate profiles for distantly related groups. These structure-based profiles outperformed other sequence-based methods for finding distant homologs and were used to identify a putative class II cysteinyl-tRNA synthetase (CysRS) in several archaea that eluded previous annotation studies. Phylogenetic analysis showed the putative class II CysRSs to be a monophyletic group and homology modeling revealed a constellation of active site residues similar to that in the known class I CysRS. PMID:15741270
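The ordering step described above can be illustrated with a minimal sketch. It assumes a one-hot encoding of each aligned sequence and a greedy, pivoted Gram-Schmidt pass, which selects vectors in the same order as a column-pivoted QR factorization; the published method uses a multidimensional QR factorization and its own numerical encoding, neither of which is reproduced here. The function names, the toy alignment, and the tolerance are all hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def encode(aligned_seq):
    """One-hot encode an aligned sequence (assumed encoding, not the paper's).

    Gaps and unknown symbols become all-zero blocks.
    """
    vec = []
    for ch in aligned_seq:
        vec.extend(1.0 if ch == aa else 0.0 for aa in AMINO_ACIDS)
    return vec


def pivoted_order(vectors, tol=1e-9):
    """Return (index, residual_norm) pairs in column-pivoted QR order.

    At each step the sequence with the largest component outside the span
    of the already selected sequences is chosen, so later entries are
    increasingly linearly dependent on earlier ones.
    """
    residuals = [list(v) for v in vectors]
    remaining = list(range(len(vectors)))
    order = []
    while remaining:
        norms = {i: math.sqrt(sum(x * x for x in residuals[i])) for i in remaining}
        pivot = max(remaining, key=lambda i: norms[i])
        order.append((pivot, norms[pivot]))
        remaining.remove(pivot)
        if norms[pivot] <= tol:
            continue  # nothing left to project out
        q = [x / norms[pivot] for x in residuals[pivot]]
        for i in remaining:
            dot = sum(a * b for a, b in zip(residuals[i], q))
            residuals[i] = [a - dot * b for a, b in zip(residuals[i], q)]
    return order


# Toy alignment (hypothetical): rows 0 and 1 are identical, row 2 differs
# from them at one position, row 3 shares no residues with the others.
msa = ["ACDE", "ACDE", "ACDF", "GHIK"]
order = pivoted_order([encode(s) for s in msa])
basis = [idx for idx, norm in order if norm > 1e-9]
```

In this toy case the redundant copy (row 1) is ordered last with a zero residual, so a cut-off on the residual norm leaves rows 0, 3 and 2 as a non-redundant set. Where to place such a cut-off for real alignments is a design choice the sketch does not address.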
Jiménez, Diego Javier; Dini-Andreote, Francisco; Ottoni, Júlia Ronzella; de Oliveira, Valéria Maia; van Elsas, Jan Dirk; Andreote, Fernando Dini
2015-01-01
The occurrence of genes encoding biotechnologically relevant α/β-hydrolases in mangrove soil microbial communities was assessed using data obtained by whole-metagenome sequencing of four mangrove areas, denoted BrMgv01 to BrMgv04, in São Paulo, Brazil. The sequences (215 Mb in total) were filtered based on local amino acid alignments against the Lipase Engineering Database. In total, 5923 unassembled sequences were affiliated with 30 different α/β-hydrolase fold superfamilies. The most abundant predicted proteins encompassed cytosolic hydrolases (abH08; ∼ 23%), microsomal hydrolases (abH09; ∼ 12%) and Moraxella lipase-like proteins (abH04 and abH01; < 5%). Detailed analysis of the genes predicted to encode proteins of the abH08 superfamily revealed a high proportion related to epoxide hydrolases and haloalkane dehalogenases in polluted mangroves BrMgv01-02-03. This suggested selection and putative involvement in local degradation/detoxification of the pollutants. Seven sequences that were annotated as genes for putative epoxide hydrolases and five for putative haloalkane dehalogenases were found in a fosmid library generated from BrMgv02 DNA. The latter enzymes were predicted to belong to Actinobacteria, Deinococcus-Thermus, Planctomycetes and Proteobacteria. Our integrated approach thus identified 12 genes (complete and/or partial) that may encode hitherto undescribed enzymes. The low amino acid identity (< 60%) with already-described genes opens perspectives for both production in an expression host and genetic screening of metagenomes. PMID:25171437
Lévesque, Céline; Duplessis, Martin; Labonté, Jessica; Labrie, Steve; Fremaux, Christophe; Tremblay, Denise; Moineau, Sylvain
2005-01-01
The Streptococcus thermophilus virulent pac-type phage 2972 was isolated from a yogurt made in France in 1999. It is a representative of several phages that have emerged with the industrial use of the exopolysaccharide-producing S. thermophilus strain RD534. The genome of phage 2972 has 34,704 bp with an overall G+C content of 40.15%, making it the shortest S. thermophilus phage genome analyzed so far. Forty-four open reading frames (ORFs) encoding putative proteins of 40 or more amino acids were identified, and bioinformatic analyses led to the assignment of putative functions to 23 ORFs. Comparative genomic analysis of phage 2972 with the six other sequenced S. thermophilus phage genomes confirmed that the replication module is conserved and that cos- and pac-type phages have distinct structural and packaging genes. Two group I introns were identified in the genome of 2972. They interrupted the genes coding for the putative endolysin and the terminase large subunit. Phage mRNA splicing was demonstrated for both introns, and the secondary structures were predicted. Eight structural proteins were also identified by N-terminal sequencing and/or matrix-assisted laser desorption ionization—time-of-flight mass spectrometry. Detailed analysis of the putative minor tail proteins ORF19 and ORF21 as well as the putative receptor-binding protein ORF20 showed the following interesting features: (i) ORF19 is a hybrid protein, because it displays significant identity with both pac- and cos-type phages; (ii) ORF20 is unique; and (iii) a protein similar to ORF21 of 2972 was also found in the structure of the cos-type phage DT1, indicating that this structural protein is present in both S. thermophilus phage groups. The implications of these findings for phage classification are discussed. PMID:16000821
Wise, C A; Chiang, L C; Paznekas, W A; Sharma, M; Musy, M M; Ashley, J A; Lovett, M; Jabs, E W
1997-04-01
Treacher Collins Syndrome (TCS) is the most common of the human mandibulofacial dysostosis disorders. Recently, a partial TCOF1 cDNA was identified and shown to contain mutations in TCS families. Here we present the entire exon/intron genomic structure and the complete coding sequence of TCOF1. TCOF1 encodes a low complexity protein of 1,411 amino acids, whose predicted protein structure reveals repeated motifs that mirror the organization of its exons. These motifs are shared with nucleolar trafficking proteins in other species and are predicted to be highly phosphorylated by casein kinase. Consistent with this, the full-length TCOF1 protein sequence also contains putative nuclear and nucleolar localization signals. Throughout the open reading frame, we detected an additional eight mutations in TCS families and several polymorphisms. We postulate that TCS results from defects in a nucleolar trafficking protein that is critically required during human craniofacial development.
Wise, Carol A.; Chiang, Lydia C.; Paznekas, William A.; Sharma, Mridula; Musy, Maurice M.; Ashley, Jennifer A.; Lovett, Michael; Jabs, Ethylin W.
1997-01-01
Treacher Collins Syndrome (TCS) is the most common of the human mandibulofacial dysostosis disorders. Recently, a partial TCOF1 cDNA was identified and shown to contain mutations in TCS families. Here we present the entire exon/intron genomic structure and the complete coding sequence of TCOF1. TCOF1 encodes a low complexity protein of 1,411 amino acids, whose predicted protein structure reveals repeated motifs that mirror the organization of its exons. These motifs are shared with nucleolar trafficking proteins in other species and are predicted to be highly phosphorylated by casein kinase. Consistent with this, the full-length TCOF1 protein sequence also contains putative nuclear and nucleolar localization signals. Throughout the open reading frame, we detected an additional eight mutations in TCS families and several polymorphisms. We postulate that TCS results from defects in a nucleolar trafficking protein that is critically required during human craniofacial development. PMID:9096354
Ferriol, I; Silva Junior, D M; Nigg, J C; Zamora-Macorra, E J; Falk, B W
2016-11-01
Torradoviruses, family Secoviridae, are emergent bipartite RNA plant viruses. RNA1 is ca. 7kb and has one open reading frame (ORF) encoding for the protease, helicase and RNA-dependent RNA polymerase (RdRp). RNA2 is ca. 5kb and has two ORFs. RNA2-ORF1 encodes for a putative protein with unknown function(s). RNA2-ORF2 encodes for a putative movement protein and three capsid proteins. Little is known about the replication and polyprotein processing strategies of torradoviruses. Here, the cleavage sites in the RNA2-ORF2-encoded polyproteins of two torradoviruses, Tomato marchitez virus isolate M (ToMarV-M) and tomato chocolate spot virus, were determined by N-terminal sequencing, revealing that the amino acid (aa) at the -1 position of the cleavage sites is a glutamine. Multiple aa sequence comparison confirmed that this glutamine is conserved among other torradoviruses. Finally, site-directed mutagenesis of conserved aas in the ToMarV-M RdRp and protease prevented substantial accumulation of viral coat proteins or RNAs. Copyright © 2016 Elsevier Inc. All rights reserved.
de Kloet, E; de Kloet, S R
2004-12-01
A study was made of the phylogenetic relationships between fifteen complete nucleotide sequences as well as 43 nucleotide sequences of the putative coat protein gene of different strains belonging to the virus species Beak and feather disease virus obtained from 39 individuals of 16 psittacine species. The species included, among others, cockatoos (Cacatuini), African grey parrots (Psittacus erithacus) and peach-faced lovebirds (Agapornis roseicollis), which were infected at different geographical locations, within and outside Australia, the native origin of the virus. The derived amino acid sequences of the putative coat protein were highly diverse, with differences between some strains amounting to 50 of the 250 amino acids. Phylogenetic analysis demonstrated that the putative coat gene sequences form six clusters which show a varying degree of psittacine species specificity. Most, but not all, strains infecting African grey parrots formed a single cluster, as did the strains infecting the cockatoos. Strains infecting the lovebirds clustered with those infecting such Australasian species as Eclectus roratus, Psittacula kramerii and Psephotus haematogaster. Although individual birds included in this study were, where studied, often infected by closely related strains, infection by highly diverged strains was also detected. The possible relationship between BFD viral strains and clinical disease signs is discussed.
Alvares, Keith; Dixit, Saryu N; Lux, Elizabeth; Veis, Arthur
2009-09-18
Studies of mineralization of embryonic spicules and of the sea urchin genome have identified several putative mineralization-related proteins. These predicted proteins have not been isolated or confirmed in mature mineralized tissues. Mature Lytechinus variegatus teeth were demineralized with 0.6 N HCl after prior removal of non-mineralized constituents with 4.0 M guanidinium HCl. The HCl-extracted proteins were fractionated on ceramic hydroxyapatite and separated into bound and unbound pools. Gel electrophoresis compared the protein distributions. The differentially present bands were purified and digested with trypsin, and the tryptic peptides were separated by high pressure liquid chromatography. NH2-terminal sequences were determined by Edman degradation and compared with the genomic sequence bank data. Two of the putative mineralization-related proteins were found. Their complete amino acid sequences were cloned from our L. variegatus cDNA library. Apatite-binding UTMP16 was found to be present in two isoforms; both isoforms had a signal sequence, a Ser-Asp-rich extracellular matrix domain, and a transmembrane and cytosolic insertion sequence. UTMP19, although rich in Glu and Thr did not bind to apatite. It had neither signal peptide nor transmembrane domain but did have typical nuclear localization and nuclear exit signal sequences. Both proteins were phosphorylated and good substrates for phosphatase. Immunolocalization studies with anti-UTMP16 show it to concentrate at the syncytial membranes in contact with the mineral. On the basis of our TOF-SIMS analyses of magnesium ion and Asp mapping of the mineral phase composition, we speculate that UTMP16 may be important in establishing the high magnesium columns that fuse the calcite plates together to enhance the mechanical strength of the mineralized tooth.
Jiménez, Diego Javier; Dini-Andreote, Francisco; Ottoni, Júlia Ronzella; de Oliveira, Valéria Maia; van Elsas, Jan Dirk; Andreote, Fernando Dini
2015-05-01
The occurrence of genes encoding biotechnologically relevant α/β-hydrolases in mangrove soil microbial communities was assessed using data obtained by whole-metagenome sequencing of four mangrove areas, denoted BrMgv01 to BrMgv04, in São Paulo, Brazil. The sequences (215 Mb in total) were filtered based on local amino acid alignments against the Lipase Engineering Database. In total, 5923 unassembled sequences were affiliated with 30 different α/β-hydrolase fold superfamilies. The most abundant predicted proteins encompassed cytosolic hydrolases (abH08; ∼ 23%), microsomal hydrolases (abH09; ∼ 12%) and Moraxella lipase-like proteins (abH04 and abH01; < 5%). Detailed analysis of the genes predicted to encode proteins of the abH08 superfamily revealed a high proportion related to epoxide hydrolases and haloalkane dehalogenases in polluted mangroves BrMgv01-02-03. This suggested selection and putative involvement in local degradation/detoxification of the pollutants. Seven sequences that were annotated as genes for putative epoxide hydrolases and five for putative haloalkane dehalogenases were found in a fosmid library generated from BrMgv02 DNA. The latter enzymes were predicted to belong to Actinobacteria, Deinococcus-Thermus, Planctomycetes and Proteobacteria. Our integrated approach thus identified 12 genes (complete and/or partial) that may encode hitherto undescribed enzymes. The low amino acid identity (< 60%) with already-described genes opens perspectives for both production in an expression host and genetic screening of metagenomes. © 2014 The Authors. Microbial Biotechnology published by John Wiley & Sons Ltd and Society for Applied Microbiology.
Sequence variability of Campylobacter temperate bacteriophages
Clark, Clifford G; Ng, Lai-King
2008-01-01
Background Prophages integrated within the chromosomes of Campylobacter jejuni isolates have been demonstrated very recently. Prior work with Campylobacter temperate bacteriophages, as well as evidence from prophages in other enteric bacteria, suggests these prophages might have a role in the biology and virulence of the organism. However, very little is known about the genetic variability of Campylobacter prophages which, if present, could lead to differential phenotypes in isolates carrying the phages versus those that do not. As a first step in the characterization of C. jejuni prophages, we investigated the distribution of prophage DNA within a C. jejuni population and assessed the DNA and protein sequence variability within a subset of the putative prophages found. Results Southern blotting of C. jejuni DNA using probes from genes within the three putative prophages of the C. jejuni sequenced strain RM 1221 demonstrated the presence of at least one prophage gene in a large proportion (27/35) of isolates tested. Of these, 15 were positive for 5 or more of the 7 Campylobacter Mu-like phage 1 (CMLP 1, also designated Campylobacter jejuni integrated element 1, or CJIE 1) genes tested. Twelve of these putative prophages were chosen for further analysis. DNA sequencing of a 9,000 to 11,000 nucleotide region of each prophage demonstrated a close homology with CMLP 1 in both gene order and nucleotide sequence. Structural and sequence variability, including short insertions, deletions, and allele replacements, were found within the prophage genomes, some of which would alter the protein products of the ORFs involved. No insertions of novel genes were detected within the sequenced regions. The 12 prophages and RM 1221 had a % G+C very similar to C. jejuni sequenced strains, as well as promoter regions characteristic of C. jejuni.
None of the putative prophages were successfully induced and propagated, so it is not known if they were functional or if they represented remnant prophage DNA in the bacterial chromosomes. Conclusion These putative prophages form a family of phages with conserved sequences, and appear to be adapted to Campylobacter. There was evidence for recombination among groups of prophages, suggesting that the prophages had a mosaic structure. In many of these properties, the Mu-like CMLP 1 homologs characterized in this study resemble temperate bacteriophages of enteric bacteria that are responsible for contributions to virulence and host adaptation. PMID:18366706
Ares, Miguel A; Rios-Sarabia, Nora; De la Cruz, Miguel A; Rivera-Gutiérrez, Sandra; García-Morales, Lázaro; León-Solís, Lizbel; Espitia, Clara; Pacheco, Sabino; Cerna-Cortés, Jorge F; Helguera-Repetto, Cecilia A; García, María Jesús; González-Y-Merchand, Jorge A
2017-07-01
This work examined the expression of the septum site determining gene (ssd) of Mycobacterium tuberculosis CDC1551 and its ∆sigD mutant under different growing conditions. The results showed an up-regulation of ssd during stationary phase and starvation conditions, but not during in vitro dormancy, suggesting a putative role for SigD in the control of ssd expression mainly under lack-of-nutrients environments. Furthermore, we elucidated a putative link between ssd expression and cell elongation of bacilli at stationary phase. In addition, a -35 sigD consensus sequence was found for the ssd promoter region, reinforcing the putative regulation of ssd by SigD, and in turn, supporting this protein role during the adaptation of M. tuberculosis to some stressful environments.
Guo, Deyin; Spetz, Carl; Saarma, Mart; Valkonen, Jari P T
2003-05-01
Potyviral helper-component proteinase (HCpro) is a multifunctional protein exerting its cellular functions in interaction with putative host proteins. In this study, cellular protein partners of the HCpro encoded by Potato virus A (PVA) (genus Potyvirus) were screened in a potato leaf cDNA library using a yeast two-hybrid system. Two cellular proteins were obtained that interact specifically with PVA HCpro in yeast and in the two in vitro binding assays used. Both proteins are encoded by single-copy genes in the potato genome. Analysis of the deduced amino acid sequences revealed that one (HIP1) of the two HCpro interactors is a novel RING finger protein. The sequence of the other protein (HIP2) showed no resemblance to protein sequences of known biological function available from databanks.
Konami, Y; Yamamoto, K; Osawa, T; Irimura, T
1995-04-01
The complete amino acid sequence of a lactose-binding Cytisus sessilifolius anti-H(O) lectin II (CSA-II) was determined using a protein sequencer. After digestion of CSA-II with endoproteinase Lys-C or Asp-N, the resulting peptides were purified by reversed-phase high performance liquid chromatography (HPLC) and then subjected to sequence analysis. Comparison of the complete amino acid sequence of CSA-II with the sequences of other leguminous seed lectins revealed regions of extensive homology. The amino acid sequence of a putative carbohydrate-binding domain of CSA-II was found to be similar to those of several anti-H(O) leguminous lectins, especially to that of the L-fucose-binding Ulex europaeus lectin I (UEA-I).
ComplexContact: a web server for inter-protein contact prediction using deep learning.
Zeng, Hong; Wang, Sheng; Zhou, Tianming; Zhao, Feifeng; Li, Xiufeng; Wu, Qing; Xu, Jinbo
2018-05-22
ComplexContact (http://raptorx2.uchicago.edu/ComplexContact/) is a web server for sequence-based interfacial residue-residue contact prediction of a putative protein complex. Interfacial residue-residue contacts are critical for understanding how proteins form complexes and interact at the residue level. When receiving a pair of protein sequences, ComplexContact first searches for their sequence homologs and builds two paired multiple sequence alignments (MSAs); it then applies co-evolution analysis and a CASP-winning deep learning (DL) method to predict interfacial contacts from the paired MSAs and visualizes the prediction as an image. The DL method was originally developed for intra-protein contact prediction and performed the best in CASP12. Our large-scale experimental test further shows that ComplexContact greatly outperforms pure co-evolution methods for inter-protein contact prediction, regardless of the species.
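To make the idea of co-evolution analysis on paired MSAs concrete, here is a minimal sketch that scores every inter-protein column pair by mutual information. This is a deliberately simple stand-in chosen for illustration: it is not the co-evolution method or the deep learning model used by ComplexContact, and the function names and toy alignments are hypothetical. The only property taken from the abstract is that rows of the two alignments are paired.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_a[a] * count_b[b]))
    return mi


def interprotein_mi(msa_a, msa_b):
    """Score every (column of A, column of B) pair.

    Row k of msa_a and row k of msa_b are assumed to be paired, for
    example taken from the same species.
    """
    if len(msa_a) != len(msa_b):
        raise ValueError("paired MSAs must have the same number of rows")
    cols_a = list(zip(*msa_a))
    cols_b = list(zip(*msa_b))
    return [[mutual_information(ca, cb) for cb in cols_b] for ca in cols_a]


# Toy paired alignments (hypothetical): column 0 of A covaries perfectly
# with column 0 of B; every other column pair is independent.
msa_a = ["AK", "AR", "GK", "GR"]
msa_b = ["DL", "DL", "EL", "EL"]
scores = interprotein_mi(msa_a, msa_b)
```

High-scoring column pairs are candidates for interfacial contacts. Raw mutual information is known to be confounded by phylogeny and indirect couplings, which is one reason more elaborate statistical and learned models are used in practice.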
Stone, David M; Kerr, Rose C; Hughes, Margaret; Radford, Alan D; Darby, Alistair C
2013-11-01
The complete coding sequences were determined for four putative vesiculoviruses isolated from fish. Sequence alignment and phylogenetic analysis based on the predicted amino acid sequences of the five main proteins assigned tench rhabdovirus and grass carp rhabdovirus together with spring viraemia of carp and pike fry rhabdovirus to a lineage that was distinct from the mammalian vesiculoviruses. Perch rhabdovirus, eel virus European X, lake trout rhabdovirus 903/87 and sea trout virus were placed in a second lineage that was also distinct from the recognised genera in the family Rhabdoviridae. Establishment of two new rhabdovirus genera, "Perhabdovirus" and "Sprivivirus", is discussed.
Gucciardo, Sébastian; Wisniewski, Jean-Pierre; Brewin, Nicholas J; Bornemann, Stephen
2007-01-01
The cDNAs encoding three germin-like proteins (PsGER1, PsGER2a, and PsGER2b) were isolated from Pisum sativum. The coding sequence of PsGER1 transiently expressed in tobacco leaves gave a protein with superoxide dismutase activity but no detectable oxalate oxidase activity according to in-gel activity stains. The transient expression of wheat germin gf-2.8 oxalate oxidase showed oxalate oxidase but no superoxide dismutase activity under the same conditions. The superoxide dismutase activity of PsGER1 was resistant to high temperature, denaturation by detergent, and high concentrations of hydrogen peroxide. In salt-stressed pea roots, a heat-resistant superoxide dismutase activity was observed with an electrophoretic mobility similar to that of the PsGER1 protein, but this activity was below the detection limit in non-stressed or H2O2-stressed pea roots. Oxalate oxidase activity was not detected in either pea roots or nodules. Following in situ hybridization in developing pea nodules, PsGER1 transcript was detected in expanding cells just proximal to the meristematic zone and also in the epidermis, but to a lesser extent. PsGER1 is the first known germin-like protein with superoxide dismutase activity to be associated with nodules. It shared protein sequence identity with the N-terminal sequence of a putative plant receptor for rhicadhesin, a bacterial attachment protein. However, its primary location in nodules suggests functional roles other than as a rhicadhesin receptor required for the first stage of bacterial attachment to root hairs.
Experimental Evidence and In Silico Identification of Tryptophan Decarboxylase in Citrus Genus.
De Masi, Luigi; Castaldo, Domenico; Pignone, Domenico; Servillo, Luigi; Facchiano, Angelo
2017-02-11
Plant tryptophan decarboxylase (TDC) converts tryptophan into tryptamine, the precursor of indolealkylamine alkaloids. The recent finding of tryptamine metabolites in Citrus plants suggests the existence of TDC activity in this genus. Here, we report for the first time that, in Citrus x limon seedlings, deuterium-labeled tryptophan is decarboxylated into tryptamine, from which deuterated N,N,N-trimethyltryptamine is subsequently formed. These results provide evidence of TDC activity and of the subsequent methylation of the tryptamine produced by tryptophan decarboxylation. In addition, with the aim of identifying the genetic basis for the presence of TDC, we carried out a sequence similarity search for TDC in the Citrus genomes, using as a probe the TDC sequence reported for the plant Catharanthus roseus. We analyzed the genomes of both Citrus clementina and Citrus sinensis, available in public databases, and identified putative protein sequences of aromatic L-amino acid decarboxylase. Similarly, 42 aromatic L-amino acid decarboxylase sequences from 23 plant species were extracted from public databases. Potential sequence signatures for functional TDC were then identified. With this research, we propose for the first time a putative protein sequence for TDC in the genus Citrus.
The predicted secretome and transmembranome of the poultry red mite Dermanyssus gallinae.
Schicht, Sabine; Qi, Weihong; Poveda, Lucy; Strube, Christina
2013-09-11
The worldwide distributed hematophagous poultry red mite Dermanyssus gallinae (De Geer, 1778) is one of the most important pests of poultry. Even though 35 acaricide compounds are available, control of D. gallinae remains difficult due to acaricide resistances as well as food safety regulations. The current study was carried out to identify putative excretory/secretory (pES) proteins of D. gallinae since these proteins play an important role in the host-parasite interaction and therefore represent potential targets for the development of novel intervention strategies. Additionally, putative transmembrane proteins (pTM) of D. gallinae were analyzed as representatives of this protein group also serve as promising targets for new control strategies. D. gallinae pES and pTM protein prediction was based on putative protein sequences of whole transcriptome data which was parsed to different bioinformatical servers (SignalP, SecretomeP, TMHMM and TargetP). Subsequently, pES and pTM protein sequences were functionally annotated by different computational tools. Computational analysis of the D. gallinae proteins identified 3,091 pES (5.6%) and 7,361 pTM proteins (13.4%). A significant proportion of pES proteins are considered to be involved in blood feeding and digestion such as salivary proteins, proteases, lipases and carbohydrases. The cysteine proteases cathepsin D and L as well as legumain, enzymes that cleave hemoglobin during blood digestion of the near related ticks, represented 6 of the top-30 BLASTP matches of the poultry red mite's secretome. Identified pTM proteins may be involved in many important biological processes including cell signaling, transport of membrane-impermeable molecules and cell recognition. Ninjurin-like proteins, whose functions in mites are still unknown, represent the most frequently occurring pTM. The current study is the first providing a mite's secretome as well as transmembranome and provides valuable insights into D. gallinae pES and pTM proteins operating in different metabolic pathways. Identifying a variety of molecules putatively involved in blood feeding may significantly contribute to the development of new therapeutic targets or vaccines against this poultry pest.
The predicted secretome and transmembranome of the poultry red mite Dermanyssus gallinae
2013-01-01
Background The worldwide distributed hematophagous poultry red mite Dermanyssus gallinae (De Geer, 1778) is one of the most important pests of poultry. Even though 35 acaricide compounds are available, control of D. gallinae remains difficult due to acaricide resistances as well as food safety regulations. The current study was carried out to identify putative excretory/secretory (pES) proteins of D. gallinae since these proteins play an important role in the host-parasite interaction and therefore represent potential targets for the development of novel intervention strategies. Additionally, putative transmembrane proteins (pTM) of D. gallinae were analyzed as representatives of this protein group also serve as promising targets for new control strategies. Methods D. gallinae pES and pTM protein prediction was based on putative protein sequences of whole transcriptome data which was parsed to different bioinformatical servers (SignalP, SecretomeP, TMHMM and TargetP). Subsequently, pES and pTM protein sequences were functionally annotated by different computational tools. Results Computational analysis of the D. gallinae proteins identified 3,091 pES (5.6%) and 7,361 pTM proteins (13.4%). A significant proportion of pES proteins are considered to be involved in blood feeding and digestion such as salivary proteins, proteases, lipases and carbohydrases. The cysteine proteases cathepsin D and L as well as legumain, enzymes that cleave hemoglobin during blood digestion of the near related ticks, represented 6 of the top-30 BLASTP matches of the poultry red mite’s secretome. Identified pTM proteins may be involved in many important biological processes including cell signaling, transport of membrane-impermeable molecules and cell recognition. Ninjurin-like proteins, whose functions in mites are still unknown, represent the most frequently occurring pTM. 
Conclusion The current study is the first providing a mite’s secretome as well as transmembranome and provides valuable insights into D. gallinae pES and pTM proteins operating in different metabolic pathways. Identifying a variety of molecules putatively involved in blood feeding may significantly contribute to the development of new therapeutic targets or vaccines against this poultry pest. PMID:24020355
Proteomics reveals novel components of the Anopheles gambiae eggshell
Amenya, Dolphine A.; Chou, Wayne; Li, Jianyong; Yan, Guiyun; Gershon, Paul D.; James, Anthony A.; Marinotti, Osvaldo
2010-01-01
While genome and transcriptome sequencing has revealed a large number and diversity of Anopheles gambiae predicted proteins, identifying their functions and biosynthetic pathways remains challenging. Mass spectrometry-based proteomics, in conjunction with mosquito genome and transcriptome databases, was used to identify 44 proteins as putative components of the eggshell. Among the identified molecules are two vitelline membrane proteins and a group of seven putative chorion proteins. Enzymes with peroxidase, laccase and phenoloxidase activities, likely involved in cross-linking reactions that stabilize the eggshell structure, were also identified. Seven odorant binding proteins were found in association with the mosquito eggshell, although their role has yet to be demonstrated. This analysis fills a considerable gap in knowledge about the proteins that build the eggshell of anopheline mosquitoes. PMID:20433845
Tenebrio molitor antifreeze protein gene identification and regulation.
Qin, Wensheng; Walker, Virginia K
2006-02-15
The yellow mealworm, Tenebrio molitor, is a freeze susceptible, stored product pest. Its winter survival is facilitated by the accumulation of antifreeze proteins (AFPs), encoded by a small gene family. We have now isolated 11 different AFP genomic clones from 3 genomic libraries. All the clones had a single coding sequence, with no evidence of intervening sequences. Three genomic clones were further characterized. All have putative TATA box sequences upstream of the coding regions and multiple potential poly(A) signal sequences downstream of the coding regions. A TmAFP regulatory region, B1037, conferred transcriptional activity when ligated to a luciferase reporter sequence and after transfection into an insect cell line. A 143 bp core promoter including a TATA box sequence was identified. Its promoter activity was increased 4.4 times by inserting an exotic 245 bp intron into the construct, similar to the enhancement of transgenic expression seen in several other systems. The addition of a duplication of the first 120 bp sequence from the 143 bp core promoter decreased promoter activity by half. Although putative hormonal response sequences were identified, none of the five hormones tested enhanced reporter activity. These studies on the mechanisms of AFP transcriptional control are important for the consideration of any transfer of freeze-resistance phenotypes to beneficial hosts.
2013-01-01
Background Fungal pathogens cause devastating losses in economically important cereal crops by utilising pathogen proteins to infect host plants. Secreted pathogen proteins are referred to as effectors and have thus far been identified by selecting small, cysteine-rich peptides from the secretome despite increasing evidence that not all effectors share these attributes. Results We take advantage of the availability of sequenced fungal genomes and present an unbiased method for finding putative pathogen proteins and secreted effectors in a query genome via comparative hidden Markov model analyses followed by unsupervised protein clustering. Our method returns experimentally validated fungal effectors in Stagonospora nodorum and Fusarium oxysporum as well as the N-terminal Y/F/WxC-motif from the barley powdery mildew pathogen. Application to the cereal pathogen Fusarium graminearum reveals a secreted phosphorylcholine phosphatase that is characteristic of hemibiotrophic and necrotrophic cereal pathogens and shares an ancient selection process with bacterial plant pathogens. Three F. graminearum protein clusters are found with an enriched secretion signal. One of these putative effector clusters contains proteins that share a [SG]-P-C-[KR]-P sequence motif in the N-terminal and show features not commonly associated with fungal effectors. This motif is conserved in secreted pathogenic Fusarium proteins and a prime candidate for functional testing. Conclusions Our pipeline has successfully uncovered conservation patterns, putative effectors and motifs of fungal pathogens that would have been overlooked by existing approaches that identify effectors as small, secreted, cysteine-rich peptides. It can be applied to any pathogenic proteome data, such as microbial pathogen data of plants and other organisms. PMID:24252298
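The [SG]-P-C-[KR]-P motif reported above can be searched for with an ordinary regular expression. The following is a minimal sketch, not the authors' pipeline: the function name, the 60-residue N-terminal window, and the toy sequences are all assumptions made for illustration.

```python
import re

# PROSITE-style pattern [SG]-P-C-[KR]-P from the abstract, written as a regex.
MOTIF = re.compile(r"[SG]PC[KR]P")


def find_n_terminal_motif(sequence, window=60):
    """Return 0-based start positions of the motif in the first `window` residues.

    `window` is an illustrative assumption, not a value taken from the study.
    A motif that straddles the window boundary is not reported.
    """
    n_terminal = sequence[:window].upper()
    return [match.start() for match in MOTIF.finditer(n_terminal)]


# Hypothetical toy sequences, invented purely for illustration.
examples = {
    "toy_hit": "MKFTAALLSPCKPQWERTY",   # contains S-P-C-K-P
    "toy_miss": "MKFTAALLAPCKPQWERTY",  # A-P-C-K-P does not satisfy [SG]
}

for name, seq in examples.items():
    print(name, find_n_terminal_motif(seq))
```

A scan like this only flags candidate sequences; it says nothing about secretion or function, which the abstract leaves to experimental testing.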
Chimeras taking shape: Potential functions of proteins encoded by chimeric RNA transcripts
Frenkel-Morgenstern, Milana; Lacroix, Vincent; Ezkurdia, Iakes; Levin, Yishai; Gabashvili, Alexandra; Prilusky, Jaime; del Pozo, Angela; Tress, Michael; Johnson, Rory; Guigo, Roderic; Valencia, Alfonso
2012-01-01
Chimeric RNAs comprise exons from two or more different genes and have the potential to encode novel proteins that alter cellular phenotypes. To date, numerous putative chimeric transcripts have been identified among the ESTs isolated from several organisms and using high throughput RNA sequencing. The few corresponding protein products that have been characterized mostly result from chromosomal translocations and are associated with cancer. Here, we systematically establish that some of the putative chimeric transcripts are genuinely expressed in human cells. Using high throughput RNA sequencing, mass spectrometry experimental data, and functional annotation, we studied 7424 putative human chimeric RNAs. We confirmed the expression of 175 chimeric RNAs in 16 human tissues, with an abundance varying from 0.06 to 17 RPKM (Reads Per Kilobase per Million mapped reads). We show that these chimeric RNAs are significantly more tissue-specific than non-chimeric transcripts. Moreover, we present evidence that chimeras tend to incorporate highly expressed genes. Despite the low expression level of most chimeric RNAs, we show that 12 novel chimeras are translated into proteins detectable in multiple shotgun mass spectrometry experiments. Furthermore, we confirm the expression of three novel chimeric proteins using targeted mass spectrometry. Finally, based on our functional annotation of exon organization and preserved domains, we discuss the potential features of chimeric proteins with illustrative examples and suggest that chimeras significantly exploit signal peptides and transmembrane domains, which can alter the cellular localization of cognate proteins. Taken together, these findings establish that some chimeric RNAs are translated into potentially functional proteins in humans. PMID:22588898
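The RPKM unit used above (Reads Per Kilobase per Million mapped reads) is a simple normalization of a read count by transcript length and sequencing depth. A minimal sketch of the arithmetic follows; the function name and the example numbers are assumptions for illustration and are not taken from the study.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads Per Kilobase of transcript per Million mapped reads.

    RPKM = reads / (length in kb * mapped reads in millions)
         = reads * 1e9 / (length in bp * mapped reads)
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical example: 34 reads on a 2,000 bp transcript in a library
# of 10 million mapped reads.
value = rpkm(34, 2000, 10_000_000)
print(round(value, 2))
```

Because the measure divides by transcript length, a long and a short transcript with the same raw read count receive different RPKM values.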
Wustman, Brandon A; Morse, Daniel E; Evans, John Spencer
2004-08-05
The AP7 and AP24 proteins represent a class of mineral-interaction polypeptides that are found in the aragonite-containing nacre layer of mollusk shell (H. rufescens). These proteins have been shown to preferentially interfere with calcium carbonate mineral growth in vitro. It is believed that both proteins play an important role in aragonite polymorph selection in the mollusk shell. Previously, we demonstrated the 1-30 amino acid (AA) N-terminal sequences of AP7 and AP24 represent mineral interaction/modification domains in both proteins, as evidenced by their ability to frustrate calcium carbonate crystal growth at step edge regions. In this present report, using free N-terminal, C(alpha)-amide "capped" synthetic polypeptides representing the 1-30 AA regions of AP7 (AP7-1 polypeptide) and AP24 (AP24-1 polypeptide) and NMR spectroscopy, we confirm that both N-terminal sequences possess putative Ca (II) interaction polyanionic sequence regions (2 x -DD- in AP7-1, -DDDED- in AP24-1) that are random coil-like in structure. However, with regard to the remaining sequences regions, each polypeptide features unique structural differences. AP7-1 possesses an extended beta-strand or polyproline type II-like structure within the A11-M10, S12-V13, and S28-I27 sequence regions, with the remaining sequence regions adopting a random-coil-like structure, a trait common to other polyelectrolyte mineral-associated polypeptide sequences. Conversely, AP24-1 possesses random coil-like structure within A1-S9 and Q14-N16 sequence regions, and evidence for turn-like, bend, or loop conformation within the G10-N13, Q17-N24, and M29-F30 sequence regions, similar to the structures identified within the putative elastomeric proteins Lustrin A and sea urchin spicule matrix proteins. The similarities and differences in AP7 and AP24 N-terminal domain structure are discussed with regard to joint AP7-AP24 protein modification of calcium carbonate growth. Copyright 2004 Wiley Periodicals, Inc.
Dziewit, Lukasz; Oscik, Karolina; Bartosik, Dariusz
2014-01-01
ABSTRACT ΦLM21 is a temperate phage isolated from Sinorhizobium sp. strain LM21 (Alphaproteobacteria). Genomic analysis and electron microscopy suggested that ΦLM21 is a member of the family Siphoviridae. The phage has an isometric head and a long noncontractile tail. The genome of ΦLM21 has 50,827 bp of linear double-stranded DNA encoding 72 putative proteins, including proteins responsible for the assembly of the phage particles, DNA packaging, transcription, replication, and lysis. Virion proteins were characterized using mass spectrometry, leading to the identification of the major capsid and tail components, tape measure, and a putative portal protein. We have confirmed the activity of two gene products, a lytic enzyme (a putative chitinase) and a DNA methyltransferase, sharing sequence specificity with the cell cycle-regulating methyltransferase (CcrM) of the bacterial host. Interestingly, the genome of Sinorhizobium phage ΦLM21 shows very limited similarity to other known phage genome sequences and is thus considered unique. IMPORTANCE Prophages are known to play an important role in the genomic diversification of bacteria via horizontal gene transfer. The influence of prophages on pathogenic bacteria is very well documented. However, our knowledge of the overall impact of prophages on the survival of their lysogenic, nonpathogenic bacterial hosts is still limited. In particular, information on prophages of the agronomically important Sinorhizobium species is scarce. In this study, we describe the isolation and molecular characterization of a novel temperate bacteriophage, ΦLM21, of Sinorhizobium sp. LM21. Since we have not found any similar sequences, we propose that this bacteriophage is a novel species. We conducted a functional analysis of selected proteins. We have demonstrated that the phage DNA methyltransferase has the same sequence specificity as the cell cycle-regulating methyltransferase CcrM of its host. 
We point out that this phenomenon of mimicking the host regulatory mechanisms by viruses is quite common in bacteriophages. PMID:25187538
Qin, Jin-Hong; Zhang, Qing; Zhang, Zhi-Ming; Zhong, Yi; Yang, Yang; Hu, Bao-Yu; Zhao, Guo-Ping; Guo, Xiao-Kui
2008-06-01
DNA microarray analysis was used to compare the differential gene expression profiles between Leptospira interrogans serovar Lai type strain 56601 and its corresponding attenuated strain IPAV. A 22-kb genomic island covering a cluster of 34 genes (i.e., genes LA0186 to LA0219) was actively expressed in both strains but concomitantly upregulated in strain 56601 in contrast to that of IPAV. Reverse transcription-PCR assays proved that the gene cluster comprised five transcripts. Gene annotation of this cluster revealed characteristics of a putative prophage-like remnant with at least 8 of 34 sequences encoding prophage-like proteins, of which the LA0195 protein is probably a prophage CI-like regulator. The transcription initiation activities of putative promoter-regulatory sequences of transcripts I, II, and III, all proximal to the LA0195 gene, were further analyzed in the Escherichia coli promoter probe vector pKK232-8 by assaying the reporter chloramphenicol acetyltransferase (CAT) activities. The strong promoter activities of both transcripts I and II indicated by the E. coli CAT assay were well correlated with the in vitro sequence-specific binding of the recombinant LA0195 protein to the corresponding promoter probes detected by the electrophoretic mobility shift assay. On the other hand, the promoter activity of transcript III was very low in E. coli and failed to show active binding to the LA0195 protein in vitro. These results suggested that the LA0195 protein is likely involved in the transcription of transcripts I and II. However, the identical complete DNA sequences of this prophage remnant from these two strains strongly suggest that possible regulatory factors or signal transduction systems residing outside of this region within the genome may be responsible for the differential expression profiling in these two strains.
Genome-wide analysis of putative peroxiredoxin in unicellular and filamentous cyanobacteria.
Cui, Hongli; Wang, Yipeng; Wang, Yinchu; Qin, Song
2012-11-16
Cyanobacteria are photoautotrophic prokaryotes with wide variations in genome sizes and ecological habitats. Peroxiredoxin (PRX) is an important protein that plays essential roles in protecting cells against reactive oxygen species (ROS). PRXs have been identified from mammals, fungi and higher plants. However, knowledge of cyanobacterial PRXs remains limited. With the availability of 37 sequenced cyanobacterial genomes, we performed a comprehensive comparative analysis of PRXs and explored their diversity, distribution, domain structure and evolution. Overall 244 putative prx genes were identified, which were abundant in filamentous diazotrophic cyanobacteria, Acaryochloris marina MBIC 11017, and unicellular cyanobacteria inhabiting freshwater and hot-springs, while poor in all Prochlorococcus and marine Synechococcus strains. Among these putative genes, 25 open reading frames (ORFs) encoding hypothetical proteins were identified as prx gene family members and the others were already annotated as prx genes. All 244 putative PRXs were classified into five major subfamilies (1-Cys, 2-Cys, BCP, PRX5_like, and PRX-like) according to their domain structures. The catalytic motifs of the cyanobacterial PRXs were similar to those of eukaryotic PRXs and highly conserved in all but the PRX-like subfamily. The classical CXXC motif of thioredoxin was detected in protein sequences from the PRX-like subfamily. A phylogenetic tree constructed from catalytic domains coincided well with the domain structures of PRXs and the phylogenies based on 16S rRNA. The distribution of genes encoding PRXs in different unicellular and filamentous cyanobacteria, especially those of sub-families like PRX-like or 1-Cys PRX, correlates with the genome size, eco-physiology, and physiological properties of the organisms. Cyanobacterial and eukaryotic PRXs share similar conserved motifs, indicating that cyanobacteria adopt similar catalytic mechanisms as eukaryotes.
All cyanobacterial PRX proteins share highly similar structures, implying that these genes may originate from a common ancestor. In this study, a general framework of the sequence-structure-function connections of the PRXs was revealed, which may facilitate functional investigations of PRXs in various organisms.
Genome-wide analysis of putative peroxiredoxin in unicellular and filamentous cyanobacteria
2012-01-01
Background Cyanobacteria are photoautotrophic prokaryotes with wide variations in genome sizes and ecological habitats. Peroxiredoxin (PRX) is an important protein that plays essential roles in protecting cells against reactive oxygen species (ROS). PRXs have been identified from mammals, fungi and higher plants. However, knowledge of cyanobacterial PRXs remains limited. With the availability of 37 sequenced cyanobacterial genomes, we performed a comprehensive comparative analysis of PRXs and explored their diversity, distribution, domain structure and evolution. Results Overall 244 putative prx genes were identified, which were abundant in filamentous diazotrophic cyanobacteria, Acaryochloris marina MBIC 11017, and unicellular cyanobacteria inhabiting freshwater and hot-springs, while poor in all Prochlorococcus and marine Synechococcus strains. Among these putative genes, 25 open reading frames (ORFs) encoding hypothetical proteins were identified as prx gene family members and the others were already annotated as prx genes. All 244 putative PRXs were classified into five major subfamilies (1-Cys, 2-Cys, BCP, PRX5_like, and PRX-like) according to their domain structures. The catalytic motifs of the cyanobacterial PRXs were similar to those of eukaryotic PRXs and highly conserved in all but the PRX-like subfamily. The classical CXXC motif of thioredoxin was detected in protein sequences from the PRX-like subfamily. A phylogenetic tree constructed from catalytic domains coincided well with the domain structures of PRXs and the phylogenies based on 16S rRNA. Conclusions The distribution of genes encoding PRXs in different unicellular and filamentous cyanobacteria, especially those of sub-families like PRX-like or 1-Cys PRX, correlates with the genome size, eco-physiology, and physiological properties of the organisms. Cyanobacterial and eukaryotic PRXs share similar conserved motifs, indicating that cyanobacteria adopt similar catalytic mechanisms as eukaryotes.
All cyanobacterial PRX proteins share highly similar structures, implying that these genes may originate from a common ancestor. In this study, a general framework of the sequence-structure-function connections of the PRXs was revealed, which may facilitate functional investigations of PRXs in various organisms. PMID:23157370
Albornos, Lucía; Martín, Ignacio; Iglesias, Rebeca; Jiménez, Teresa; Labrador, Emilia; Dopico, Berta
2012-11-07
Many proteins with tandem repeats in their sequence have been described and classified according to the length of the repeats: I) Repeats of short oligopeptides (from 2 to 20 amino acids), including structural cell wall proteins and arabinogalactan proteins. II) Repeats that range in length from 20 to 40 residues, including proteins with a well-established three-dimensional structure often involved in mediating protein-protein interactions. III) Longer repeats in the order of 100 amino acids that constitute structurally and functionally independent units. Here we analyse ShooT specific (ST) proteins, a family of proteins with tandem repeats of unknown function that were first found in Leguminosae, and their possible similarities to other proteins with tandem repeats. ST protein sequences were only found in dicotyledonous plants, limited to several plant families, mainly the Fabaceae and the Asteraceae. ST mRNAs accumulate mainly in the roots and under biotic interactions. Most ST proteins have one or several Domain(s) of Unknown Function 2775 (DUF2775). All deduced ST proteins have a signal peptide, indicating that these proteins enter the secretory pathway, and the mature proteins have tandem repeat oligopeptides that share a hexapeptide (E/D)FEPRP followed by 4 partially conserved amino acids, which could determine a putative N-glycosylation signal, and a fully conserved tyrosine. In a phylogenetic tree, the sequences cluster according to taxonomic group. A possible involvement in symbiosis and abiotic stress as well as in plant cell elongation is suggested, although different STs could play different roles in plant development. We describe a new family of proteins called ST whose presence is limited to the plant kingdom, specifically to a few families of dicotyledonous plants.
They present 20 to 40 amino acid tandem repeat sequences with different characteristics (signal peptide, DUF2775 domain, conservative repeat regions) from the described group of 20 to 40 amino acid tandem repeat proteins and also from known cell wall proteins with repeat sequences. Several putative roles in plant physiology can be inferred from the characteristics found.
2012-01-01
Background Many proteins with tandem repeats in their sequence have been described and classified according to the length of the repeats: I) Repeats of short oligopeptides (from 2 to 20 amino acids), including structural cell wall proteins and arabinogalactan proteins. II) Repeats that range in length from 20 to 40 residues, including proteins with a well-established three-dimensional structure often involved in mediating protein-protein interactions. III) Longer repeats in the order of 100 amino acids that constitute structurally and functionally independent units. Here we analyse ShooT specific (ST) proteins, a family of proteins with tandem repeats of unknown function that were first found in Leguminosae, and their possible similarities to other proteins with tandem repeats. Results ST protein sequences were only found in dicotyledonous plants, limited to several plant families, mainly the Fabaceae and the Asteraceae. ST mRNAs accumulate mainly in the roots and under biotic interactions. Most ST proteins have one or several Domain(s) of Unknown Function 2775 (DUF2775). All deduced ST proteins have a signal peptide, indicating that these proteins enter the secretory pathway, and the mature proteins have tandem repeat oligopeptides that share a hexapeptide (E/D)FEPRP followed by 4 partially conserved amino acids, which could determine a putative N-glycosylation signal, and a fully conserved tyrosine. In a phylogenetic tree, the sequences cluster according to taxonomic group. A possible involvement in symbiosis and abiotic stress as well as in plant cell elongation is suggested, although different STs could play different roles in plant development. Conclusions We describe a new family of proteins called ST whose presence is limited to the plant kingdom, specifically to a few families of dicotyledonous plants.
They present 20 to 40 amino acid tandem repeat sequences with different characteristics (signal peptide, DUF2775 domain, conservative repeat regions) from the described group of 20 to 40 amino acid tandem repeat proteins and also from known cell wall proteins with repeat sequences. Several putative roles in plant physiology can be inferred from the characteristics found. PMID:23134664
Marcilla, Antonio; Garg, Gagan; Bernal, Dolores; Ranganathan, Shoba; Forment, Javier; Ortiz, Javier; Muñoz-Antolí, Carla; Dominguez, M. Victoria; Pedrola, Laia; Martinez-Blanch, Juan; Sotillo, Javier; Trelis, Maria; Toledo, Rafael; Esteban, J. Guillermo
2012-01-01
Background Strongyloidiasis is one of the most neglected diseases distributed worldwide with endemic areas in developed countries, where chronic infections are life threatening. Despite its impact, very little is known about the molecular biology of the parasite involved and its interplay with its hosts. Next generation sequencing technologies now provide unique opportunities to rapidly address these questions. Principal Findings Here we present the first transcriptome of the third larval stage of S. stercoralis using 454 sequencing coupled with semi-automated bioinformatic analyses. 253,266 raw sequence reads were assembled into 11,250 contiguous sequences, most of which were novel. 8037 putative proteins were characterized based on homology, gene ontology and/or biochemical pathways. Comparison of the transcriptome of S. stercoralis with those of other nematodes, including S. ratti, revealed similarities in transcription of molecules inferred to have key roles in parasite-host interactions. Enzymatic proteins, like kinases and proteases, were abundant. 1213 putative excretory/secretory proteins were compiled using a new pipeline which included non-classical secretory proteins. Potential drug targets were also identified. Conclusions Overall, the present dataset should provide a solid foundation for future fundamental genomic, proteomic and metabolomic explorations of S. stercoralis, as well as a basis for applied outcomes, such as the development of novel methods of intervention against this neglected parasite. PMID:22389732
You, Min Kyoung; Kim, Jin Hwa; Lee, Yeo Jin; Jeong, Ye Sol; Ha, Sun-Hwa
2016-12-22
Plastoglobules (PGs) are thylakoid membrane microdomains within plastids that are known as specialized locations of carotenogenesis. Three rice phytoene synthase proteins (OsPSYs) involved in carotenoid biosynthesis have been identified. Here, the N-terminal 80-amino-acid portion of OsPSY2 (PTp) was demonstrated to be a chloroplast-targeting peptide by displaying cytosolic localization of OsPSY2(ΔPTp):mCherry in rice protoplast, in contrast to chloroplast localization of OsPSY2:mCherry in a punctate pattern. The peptide sequence of PTp was predicted to harbor two transmembrane domains eligible for a putative PG-targeting signal. To assess and enhance the PG-targeting ability of PTp, the original PTp DNA sequence (PTp) was modified to a synthetic DNA sequence (stPTp), which had 84.4% similarity to the original sequence. The motivation of this modification was to reduce the GC ratio from 75% to 65% and to disentangle the hairpin loop structures of PTp. These two DNA sequences were fused to the sequence of the synthetic green fluorescent protein (sGFP) and drove GFP expression with different efficiencies. In particular, the RNA and protein levels of stPTp-sGFP were slightly improved to 1.4-fold and 1.3-fold more than those of sGFP, respectively. The green fluorescent signals of their mature proteins were all observed as speckle-like patterns with slightly blurred stromal signals in chloroplasts. These discrete green speckles of PTp-sGFP and stPTp-sGFP corresponded exactly to the red fluorescent signal displayed by OsPSY2:mCherry in both etiolated and greening protoplasts and it is presumed to correspond to distinct PGs. In conclusion, we identified PTp as a transit peptide sequence facilitating preferential translocation of foreign proteins to PGs, and developed an improved PTp sequence, stPTp, which is expected to be very useful for applications in plant biotechnologies requiring precise micro-compartmental localization in plastids.
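The GC ratio mentioned above (reduced from 75% to 65% in the synthetic sequence) is the fraction of G and C bases in a DNA sequence. A minimal sketch of that calculation follows, assuming plain A/C/G/T strings; the function name and the toy sequences are hypothetical and are not the PTp or stPTp sequences.

```python
def gc_fraction(dna):
    """Return the fraction of G and C among the unambiguous bases (A, C, G, T).

    Ambiguity codes such as N are ignored; this convention is an assumption.
    """
    bases = [base for base in dna.upper() if base in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc_count = sum(1 for base in bases if base in "GC")
    return gc_count / len(bases)


# Hypothetical toy sequences, invented for illustration only.
print(gc_fraction("GCGCGCAT"))  # 6 of 8 bases are G or C
print(gc_fraction("GCATGCAT"))  # 4 of 8 bases are G or C
```

Lowering GC content through synonymous codon changes, as the abstract describes, would be checked by comparing this value before and after redesign.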
Luis, Luis; Serrano, María Luisa; Hidalgo, Mariana; Mendoza-León, Alexis
2013-01-01
Differential susceptibility to microtubule agents has been demonstrated between mammalian cells and kinetoplastid organisms such as Leishmania spp. and Trypanosoma spp. The aims of this study were to identify and characterize the architecture of the putative colchicine binding site of Leishmania spp. and investigate the molecular basis of colchicine resistance. We cloned and sequenced the β-tubulin gene of Leishmania (Viannia) guyanensis and established the theoretical 3D model of the protein, using the crystallographic structure of the bovine protein as template. We identified mutations in the Leishmania β-tubulin gene sequences in regions related to the putative colchicine-binding pocket, which generate amino acid substitutions and changes in the topology of this region, blocking the access of colchicine. The same mutations were found in the β-tubulin sequence of kinetoplastid organisms such as Trypanosoma cruzi, T. brucei, and T. evansi. Using molecular modelling approaches, we demonstrated that conformational changes include an elongation and torsion of an α-helix structure and displacement to the inside of the pocket of one β-sheet that hinders access of colchicine. We propose that kinetoplastid organisms show resistance to colchicine due to amino acid substitutions that generate structural changes in the putative colchicine-binding domain, which prevent colchicine access. PMID:24083244
Identification of Streptococcus mitis321A vaccine antigens based on reverse vaccinology
Zhang, Qiao; Lin, Kexiong; Wang, Changzheng; Xu, Zhi; Yang, Li; Ma, Qianli
2018-01-01
Streptococcus mitis (S. mitis) may transform into highly pathogenic bacteria. The aim of the present study was to identify potential antigen targets for designing an effective vaccine against the pathogenic S. mitis321A. The genome of S. mitis321A was sequenced using an Illumina HiSeq 2000 instrument. Subsequently, Glimmer 3.02 and Tandem Repeat Finder (TRF) 4.04 were used to predict genes and tandem repeats, respectively, with DNA sequence function analysis using the Basic Local Alignment Search Tool (BLAST) in the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Cluster of Orthologous Groups of proteins (COG) databases. Putative gene antigen candidates were screened with BLAST ahead of phylogenetic tree analysis. The DNA sequence assembly size was 2,110,680 bp with 40.12% GC, 6 scaffolds and 9 contigs. Consequently, 1,944 genes were predicted, and 119 TRF, 56 microsatellite DNA, 10 minisatellite DNA and 154 transposons were acquired. The predicted genes were associated with various pathways and functions concerning membrane transport and energy metabolism. Multiple putative genes encoding surface proteins, secreted proteins and virulence factors, as well as essential genes were determined. The majority of essential genes belonged to a phylogenetic lineage, while 321AGL000129 and 321AGL000299 were on the same branch. The current study provided useful information regarding the biological function of the S. mitis321A genome and recommends putative antigen candidates for developing a potent vaccine against S. mitis. PMID:29620181
García Guerreiro, M P; Fontdevila, A
2007-01-01
A new transposable element, Isis, is identified as a LTR retrotransposon in Drosophila buzzatii. DNA sequence analysis shows that Isis contains three long ORFs similar to gag, pol and env genes of retroviruses. The ORF1 exhibits sequence homology to matrix, capsid and nucleocapsid gag proteins and ORF2 encodes a putative protease (PR), a reverse transcriptase (RT), an RNase H (RH) and an integrase (IN) region. The analysis of a putative env product, encoded by the env ORF3, shows a degenerated protein containing several stop codons. The molecular study of the putative proteins coded by this new element shows striking similarities to both Ulysses and Osvaldo elements, two LTR retrotransposons, present in D. virilis and D. buzzatii, respectively. Comparisons of the predicted Isis RT to several known retrotransposons show strong phylogenetic relationships to gypsy-like elements, particularly to Ulysses retrotransposon. Studies of Isis chromosomal distribution show a strong hybridization signal in centromeric and pericentromeric regions, and a scattered distribution along all chromosomal arms. The existence of insertional polymorphisms between different strains and high molecular weight bands by Southern blot suggests the existence of full-sized copies that have been active recently. The presence of euchromatic insertion sites coincident between Isis and Osvaldo could indicate preferential insertion sites of Osvaldo element into Isis sequence or vice versa. Moreover, the presence of Isis in different species of the buzzatii complex indicates the ancient origin of this element.
Seppala, Susanna; Solomon, Kevin V.; Gilmore, Sean P.; ...
2016-12-20
Here, engineered cell factories that convert biomass into value-added compounds are emerging as a timely alternative to petroleum-based industries. Although often overlooked, integral membrane proteins such as solute transporters are pivotal for engineering efficient microbial chassis. Anaerobic gut fungi, adapted to degrade raw plant biomass in the intestines of herbivores, are a potential source of valuable transporters for biotechnology, yet very little is known about the membrane constituents of these non-conventional organisms. Here, we mined the transcriptome of three recently isolated strains of anaerobic fungi to identify membrane proteins responsible for sensing and transporting biomass hydrolysates within a competitive and rather extreme environment. Using sequence analyses and homology, we identified membrane protein-coding sequences from assembled transcriptomes from three strains of anaerobic gut fungi: Neocallimastix californiae, Anaeromyces robustus, and Piromyces finnis. We identified nearly 2000 transporter components: about half of these are involved in the general secretory pathway and intracellular sorting of proteins; the rest are predicted to be small-solute transporters. Unexpectedly, we found a number of putative sugar binding proteins that are associated with prokaryotic uptake systems; and approximately 100 class C G-protein coupled receptors (GPCRs) with non-canonical putative sugar binding domains. In conclusion, we report the first comprehensive characterization of the membrane protein machinery of biotechnologically relevant anaerobic gut fungi. Apart from identifying conserved machinery for protein sorting and secretion, we identify a large number of putative solute transporters that are of interest for biotechnological applications. 
Notably, our data suggests that the fungi display a plethora of carbohydrate binding domains at their surface, perhaps as a means to sense and sequester some of the sugars that their biomass degrading, extracellular enzymes produce.
Seppälä, Susanna; Solomon, Kevin V; Gilmore, Sean P; Henske, John K; O'Malley, Michelle A
2016-12-20
Engineered cell factories that convert biomass into value-added compounds are emerging as a timely alternative to petroleum-based industries. Although often overlooked, integral membrane proteins such as solute transporters are pivotal for engineering efficient microbial chassis. Anaerobic gut fungi, adapted to degrade raw plant biomass in the intestines of herbivores, are a potential source of valuable transporters for biotechnology, yet very little is known about the membrane constituents of these non-conventional organisms. Here, we mined the transcriptome of three recently isolated strains of anaerobic fungi to identify membrane proteins responsible for sensing and transporting biomass hydrolysates within a competitive and rather extreme environment. Using sequence analyses and homology, we identified membrane protein-coding sequences from assembled transcriptomes from three strains of anaerobic gut fungi: Neocallimastix californiae, Anaeromyces robustus, and Piromyces finnis. We identified nearly 2000 transporter components: about half of these are involved in the general secretory pathway and intracellular sorting of proteins; the rest are predicted to be small-solute transporters. Unexpectedly, we found a number of putative sugar binding proteins that are associated with prokaryotic uptake systems; and approximately 100 class C G-protein coupled receptors (GPCRs) with non-canonical putative sugar binding domains. We report the first comprehensive characterization of the membrane protein machinery of biotechnologically relevant anaerobic gut fungi. Apart from identifying conserved machinery for protein sorting and secretion, we identify a large number of putative solute transporters that are of interest for biotechnological applications. 
Notably, our data suggests that the fungi display a plethora of carbohydrate binding domains at their surface, perhaps as a means to sense and sequester some of the sugars that their biomass degrading, extracellular enzymes produce.
Winterhoff, Nora; Goethe, Ralph; Gruening, Petra; Rohde, Manfred; Kalisz, Henryk; Smith, Hilde E.; Valentin-Weigand, Peter
2002-01-01
The present study was performed to identify stress-induced putative virulence proteins of Streptococcus suis. For this, protein expression patterns of streptococci grown at 32, 37, and 42°C were compared by one- and two-dimensional gel electrophoresis. Temperature shifts from 32 and 37 to 42°C induced expression of two cell wall-associated proteins with apparent molecular masses of approximately 47 and 53 kDa. Amino-terminal sequence analysis of the two proteins indicated homologies of the 47-kDa protein with an ornithine carbamoyltransferase (OCT) from Streptococcus pyogenes and of the 53-kDa protein with the streptococcal acid glycoprotein (SAGP) from S. pyogenes, an arginine deiminase (AD) recently proposed as a putative virulence factor. Cloning and sequencing the genes encoding the putative OCT and AD of S. suis, octS and adiS, respectively, revealed that they had 81.2 (octS) and 80.2% (adiS) identity with the respective genes of S. pyogenes. Both genes belong to the AD system, also found in other bacteria. Southern hybridization analysis demonstrated the presence of the adiS gene in all 42 serotype 2 and 9 S. suis strains tested. In 9 of these 42 strains, selected randomly, we confirmed expression of the AdiS protein, homologous to SAGP, by immunoblot analysis using a specific antiserum against the SAGP of S. pyogenes. In all strains AD activity was detected. Furthermore, by immunoelectron microscopy using the anti-S. pyogenes SAGP antiserum we were able to demonstrate that the AdiS protein is expressed on the streptococcal surface in association with the capsular polysaccharides but is not coexpressed with them. PMID:12446626
Li, You-Hai; Han, Wen-Jin; Gui, Xi-Wu; Wei, Tao; Tang, Shuang-Yan; Jin, Jian-Ming
2016-08-02
Tentoxin, a cyclic tetrapeptide produced by several Alternaria species, inhibits the F₁-ATPase activity of chloroplasts, resulting in chlorosis in sensitive plants. In this study, we report two clustered genes, encoding a putative nonribosomal peptide synthetase (NRPS) TES and a cytochrome P450 protein TES1, that are required for tentoxin biosynthesis in Alternaria alternata strain ZJ33, which was isolated from blighted leaves of Eupatorium adenophorum. Using a pair of primers designed according to the consensus sequences of the adenylation domain of NRPSs, two fragments containing putative adenylation domains were amplified from A. alternata ZJ33, and subsequent PCR analyses demonstrated that these fragments belonged to the same NRPS coding sequence. With no introns, TES consists of a single 15,486 base pair open reading frame encoding a predicted 5161 amino acid protein. Meanwhile, the TES1 gene is predicted to contain five introns and encode a 506 amino acid protein. The TES protein is predicted to comprise four peptide synthetase modules with two additional N-methylation domains, and the number and arrangement of the modules in TES were consistent with the number and arrangement of the amino acid residues of tentoxin. Notably, both TES and TES1 null mutants generated via homologous recombination failed to produce tentoxin. This study provides the first evidence concerning the biosynthesis of tentoxin in A. alternata.
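The ORF and protein lengths reported for TES can be cross-checked with simple codon arithmetic. The sketch below is only an illustration: the function name is hypothetical, and it assumes the reported open reading frame length includes the terminal stop codon, which the abstract does not state explicitly.

```python
def protein_length_from_orf(orf_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by an intron-free ORF.

    Assumption (not stated in the abstract): the ORF length counts the
    stop codon, so one codon is subtracted when includes_stop is True.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# 15,486 bp / 3 = 5,162 codons; minus the stop codon gives 5,161 residues.
tes_residues = protein_length_from_orf(15486)
print(tes_residues)
```

Under that assumption the two figures in the abstract agree with each other.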
Schaeffer, E; Sninsky, J J
1984-01-01
Proteins that are related evolutionarily may have diverged at the level of primary amino acid sequence while maintaining similar secondary structures. Computer analysis has been used to compare the open reading frames of the hepatitis B virus to those of the woodchuck hepatitis virus at the level of amino acid sequence, and to predict the relative hydrophilic character and the secondary structure of putative polypeptides. Similarity is seen at the levels of relative hydrophilicity and secondary structure, in the absence of sequence homology. These data reinforce the proposal that these open reading frames encode viral proteins. Computer analysis of this type can be more generally used to establish structural similarities between proteins that do not share obvious sequence homology as well as to assess whether an open reading frame is fortuitous or codes for a protein. PMID:6585835
Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana.
Mayer, K; Schüller, C; Wambutt, R; Murphy, G; Volckaert, G; Pohl, T; Düsterhöft, A; Stiekema, W; Entian, K D; Terryn, N; Harris, B; Ansorge, W; Brandt, P; Grivell, L; Rieger, M; Weichselgartner, M; de Simone, V; Obermaier, B; Mache, R; Müller, M; Kreis, M; Delseny, M; Puigdomenech, P; Watson, M; Schmidtheini, T; Reichert, B; Portatelle, D; Perez-Alonso, M; Boutry, M; Bancroft, I; Vos, P; Hoheisel, J; Zimmermann, W; Wedler, H; Ridley, P; Langham, S A; McCullagh, B; Bilham, L; Robben, J; Van der Schueren, J; Grymonprez, B; Chuang, Y J; Vandenbussche, F; Braeken, M; Weltjens, I; Voet, M; Bastiaens, I; Aert, R; Defoor, E; Weitzenegger, T; Bothe, G; Ramsperger, U; Hilbert, H; Braun, M; Holzer, E; Brandt, A; Peters, S; van Staveren, M; Dirske, W; Mooijman, P; Klein Lankhorst, R; Rose, M; Hauf, J; Kötter, P; Berneiser, S; Hempel, S; Feldpausch, M; Lamberth, S; Van den Daele, H; De Keyser, A; Buysshaert, C; Gielen, J; Villarroel, R; De Clercq, R; Van Montagu, M; Rogers, J; Cronin, A; Quail, M; Bray-Allen, S; Clark, L; Doggett, J; Hall, S; Kay, M; Lennard, N; McLay, K; Mayes, R; Pettett, A; Rajandream, M A; Lyne, M; Benes, V; Rechmann, S; Borkova, D; Blöcker, H; Scharfe, M; Grimm, M; Löhnert, T H; Dose, S; de Haan, M; Maarse, A; Schäfer, M; Müller-Auer, S; Gabel, C; Fuchs, M; Fartmann, B; Granderath, K; Dauner, D; Herzl, A; Neumann, S; Argiriou, A; Vitale, D; Liguori, R; Piravandi, E; Massenet, O; Quigley, F; Clabauld, G; Mündlein, A; Felber, R; Schnabl, S; Hiller, R; Schmidt, W; Lecharny, A; Aubourg, S; Chefdor, F; Cooke, R; Berger, C; Montfort, A; Casacuberta, E; Gibbons, T; Weber, N; Vandenbol, M; Bargues, M; Terol, J; Torres, A; Perez-Perez, A; Purnelle, B; Bent, E; Johnson, S; Tacon, D; Jesse, T; Heijnen, L; Schwarz, S; Scholler, P; Heber, S; Francs, P; Bielke, C; Frishman, D; Haase, D; Lemcke, K; Mewes, H W; Stocker, S; Zaccaria, P; Bevan, M; Wilson, R K; de la Bastide, M; Habermann, K; Parnell, L; Dedhia, N; Gnoj, L; Schutz, K; Huang, E; Spiegel, L; Sehkon, M; 
Murray, J; Sheet, P; Cordes, M; Abu-Threideh, J; Stoneking, T; Kalicki, J; Graves, T; Harmon, G; Edwards, J; Latreille, P; Courtney, L; Cloud, J; Abbott, A; Scott, K; Johnson, D; Minx, P; Bentley, D; Fulton, B; Miller, N; Greco, T; Kemp, K; Kramer, J; Fulton, L; Mardis, E; Dante, M; Pepin, K; Hillier, L; Nelson, J; Spieth, J; Ryan, E; Andrews, S; Geisel, C; Layman, D; Du, H; Ali, J; Berghoff, A; Jones, K; Drone, K; Cotton, M; Joshu, C; Antonoiu, B; Zidanic, M; Strong, C; Sun, H; Lamar, B; Yordan, C; Ma, P; Zhong, J; Preston, R; Vil, D; Shekher, M; Matero, A; Shah, R; Swaby, I K; O'Shaughnessy, A; Rodriguez, M; Hoffmann, J; Till, S; Granat, S; Shohdy, N; Hasegawa, A; Hameed, A; Lodhi, M; Johnson, A; Chen, E; Marra, M; Martienssen, R; McCombie, W R
1999-12-16
The higher plant Arabidopsis thaliana (Arabidopsis) is an important model for identifying plant genes and determining their function. To assist biological investigations and to define chromosome structure, a coordinated effort to sequence the Arabidopsis genome was initiated in late 1996. Here we report one of the first milestones of this project, the sequence of chromosome 4. Analysis of 17.38 megabases of unique sequence, representing about 17% of the genome, reveals 3,744 protein coding genes, 81 transfer RNAs and numerous repeat elements. Heterochromatic regions surrounding the putative centromere, which has not yet been completely sequenced, are characterized by an increased frequency of a variety of repeats, new repeats, reduced recombination, lowered gene density and lowered gene expression. Roughly 60% of the predicted protein-coding genes have been functionally characterized on the basis of their homology to known genes. Many genes encode predicted proteins that are homologous to human and Caenorhabditis elegans proteins.
Ranford, Julia C; Bryce, James H; Morris, Peter C
2002-01-01
A barley (Hordeum vulgare L.) cDNA, PM19, encoding a putative plasma membrane protein was isolated through differential screening of a dormant wild oat embryo library. PM19 is expressed in barley embryos from mid-embryogenesis up to maturity. PM19 mRNA levels decline upon germination, whereas dormant embryos retained high levels of message for up to 72 h of imbibition. PM19 mRNA levels also remained high or were reinduced in non-dormant embryos by treatments that prevented germination (250 mM NaCl, 10% sorbitol, or 50 µM ABA). The PM19 protein sequence is highly conserved in monocotyledonous and dicotyledonous plants.
Mapping a nucleolar targeting sequence of an RNA binding nucleolar protein, Nop25
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fujiwara, Takashi; Suzuki, Shunji; Kanno, Motoko
2006-06-10
Nop25 is a putative RNA binding nucleolar protein associated with rRNA transcription. The present study was undertaken to determine the mechanism of Nop25 localization in the nucleolus. Deletion experiments of Nop25 amino acid sequence showed Nop25 to contain a nuclear targeting sequence in the N-terminal and a nucleolar targeting sequence in the C-terminal. By expressing derivative peptides from the C-terminal as GFP-fusion proteins in the cells, a lysine and arginine residue-enriched peptide (KRKHPRRAQDSTKKPPSATRTSKTQRRRR) allowed a GFP-fusion protein to be transported and fully retained in the nucleolus. When the peptide was fused with cMyc epitope and expressed in the cells, a cMyc epitope was then detected in the nucleolus. Nop25 did not localize in the nucleolus by deletion of the peptide from Nop25. Furthermore, deletion of a subdomain (KRKHPRRAQ) in the peptide or amino acid substitution of lysine and arginine residues in the subdomain resulted in the loss of Nop25 nucleolar localization. These results suggest that the lysine and arginine residue-enriched peptide is the most prominent nucleolar targeting sequence of Nop25 and that the long stretch of basic residues might play an important role in the nucleolar localization of Nop25. Although Nop25 contained putative SUMOylation, phosphorylation and glycosylation sites, the amino acid substitution in these sites had no effect on the nucleolar localization, thus suggesting that these post-translational modifications did not contribute to the localization of Nop25 in the nucleolus. The treatment of the cells, which expressed a GFP-fusion protein with a nucleolar targeting sequence of Nop25, with RNase A resulted in a complete dislocation of the protein from the nucleolus. These data suggested that the nucleolar targeting sequence might therefore play an important role in the binding of Nop25 to RNA molecules and that the RNA binding of Nop25 might be essential for the nucleolar localization of Nop25.
Complete nucleotide sequence and annotation of the temperate corynephage ϕ16 genome.
Lobanova, Juliya S; Gak, Evgueni R; Andreeva, Irina G; Rybak, Konstantin V; Krylov, Alexander A; Mashko, Sergey V
2017-08-01
The complete genome of ϕ16, a temperate corynephage from Corynebacterium glutamicum ATCC 21792, was sequenced and annotated (GenBank: KY250482). The electron microscopy study of ϕ16 virion confirmed that it belongs to the family Siphoviridae. The ϕ16 genome consists of a linear double-stranded DNA molecule of 58,200 bp (G+C = 52.2%) with protruding cohesive 3'-ends of 14 nt. Four major structural proteins were separated by SDS-PAGE and identified by peptide mass fingerprinting technique. Using bioinformatics analysis, 101 putative ORFs and 5 tRNA genes were predicted. Only 27 putative gene products could be assigned to known biological functions. The ϕ16 genome was divided into functional modules. Seven putative promoters and eight putative unidirectional intrinsic terminators were predicted. One site of putative «-1» programmed ribosomal frameshifting was proposed in the phage tail assembly genome region. C. glutamicum genetic tools could be broadened by exploiting the known integrase gene (gp33) and the newly identified excisionase gene (gp47), participating in site-specific recombination between ϕ16-attP/attB.
Transcription activation mediated by a cyclic AMP receptor protein from Thermus thermophilus HB8.
Shinkai, Akeo; Kira, Satoshi; Nakagawa, Noriko; Kashihara, Aiko; Kuramitsu, Seiki; Yokoyama, Shigeyuki
2007-05-01
The extremely thermophilic bacterium Thermus thermophilus HB8, which belongs to the phylum Deinococcus-Thermus, has an open reading frame encoding a protein belonging to the cyclic AMP (cAMP) receptor protein (CRP) family present in many bacteria. The protein named T. thermophilus CRP is highly homologous to the CRP family proteins from the phyla Firmicutes, Actinobacteria, and Cyanobacteria, and it forms a homodimer and interacts with cAMP. CRP mRNA and intracellular cAMP were detected in this strain, which did not drastically fluctuate during cultivation in a rich medium. The expression of several genes was altered upon disruption of the T. thermophilus CRP gene. We found six CRP-cAMP-dependent promoters in in vitro transcription assays involving DNA fragments containing the upstream regions of the genes exhibiting decreased expression in the CRP disruptant, indicating that the CRP is a transcriptional activator. The consensus T. thermophilus CRP-binding site predicted upon nucleotide sequence alignment is 5'-(C/T)NNG(G/T)(G/T)C(A/C)N(A/T)NNTCACAN(G/C)(G/C)-3'. This sequence is unique compared with the known consensus binding sequences of CRP family proteins. A putative -10 hexamer sequence resides at 18 to 19 bp downstream of the predicted T. thermophilus CRP-binding site. The CRP-regulated genes found in this study comprise clustered regularly interspaced short palindromic repeat (CRISPR)-associated (cas) ones, and the genes of a putative transcriptional regulator, a protein containing the exonuclease III-like domain of DNA polymerase, a GCN5-related acetyltransferase homolog, and T. thermophilus-specific proteins of unknown function. These results suggest a role for cAMP signal transduction in T. thermophilus and imply the T. thermophilus CRP is a cAMP-responsive regulator.
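The consensus binding site in this abstract is written with bracketed alternatives and N for any base, a notation that maps directly onto a regular expression. The following is a minimal sketch of that conversion; the function names and the test sequence are hypothetical, and only the forward strand is scanned.

```python
import re

# Consensus CRP-binding site as printed in the abstract (5' to 3').
CONSENSUS = "(C/T)NNG(G/T)(G/T)C(A/C)N(A/T)NNTCACAN(G/C)(G/C)"


def consensus_to_regex(consensus: str) -> str:
    """Turn '(X/Y)' alternatives into '[XY]' and 'N' into '[ACGT]'."""
    pattern = re.sub(r"\(([ACGT])/([ACGT])\)", r"[\1\2]", consensus)
    return pattern.replace("N", "[ACGT]")


def find_sites(sequence: str, pattern: str) -> list[int]:
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # The lookahead keeps the scan position advancing one base at a time.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", sequence.upper())]


CRP_PATTERN = consensus_to_regex(CONSENSUS)

# Hypothetical sequence: one 20-bp site that satisfies the consensus,
# flanked by four bases on each side.
example = "AAAA" + "CAAGGGCAAAAATCACAAGG" + "TTTT"
print(find_sites(example, CRP_PATTERN))
```

A real promoter scan would also search the reverse complement and would usually tolerate mismatches, for example with a position weight matrix rather than a strict pattern.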
Vasala, A; Dupont, L; Baumann, M; Ritzenthaler, P; Alatossava, T
1993-01-01
Virulent phage LL-H and temperate phage mv4 are two related bacteriophages of Lactobacillus delbrueckii. The gene clusters encoding structural proteins of these two phages have been sequenced and further analyzed. Six open reading frames (ORF-1 to ORF-6) were detected. Protein sequencing and Western immunoblotting experiments confirmed that ORF-3 (g34) encoded the main capsid protein Gp34. The presence of a putative late promoter in front of the phage LL-H g34 gene was suggested by primer extension experiments. Comparative sequence analysis between phage LL-H and phage mv4 revealed striking similarities in the structure and organization of this gene cluster, suggesting that the genes encoding phage structural proteins belong to a highly conservative module. PMID:8497043
The transcriptome of Lutzomyia longipalpis (Diptera: Psychodidae) male reproductive organs.
Azevedo, Renata V D M; Dias, Denise B S; Bretãs, Jorge A C; Mazzoni, Camila J; Souza, Nataly A; Albano, Rodolpho M; Wagner, Glauber; Davila, Alberto M R; Peixoto, Alexandre A
2012-01-01
It has been suggested that genes involved in the reproductive biology of insect disease vectors are potential targets for future alternative methods of control. Little is known about the molecular biology of reproduction in phlebotomine sand flies and there is no information available concerning genes that are expressed in male reproductive organs of Lutzomyia longipalpis, the main vector of American visceral leishmaniasis and a species complex. We generated 2678 high quality ESTs ("Expressed Sequence Tags") of L. longipalpis male reproductive organs that were grouped in 1391 non-redundant sequences (1136 singlets and 255 clusters). BLAST analysis revealed that only 57% of these sequences share similarity with a L. longipalpis female EST database. Although no more than 36% of the non-redundant sequences showed similarity to protein sequences deposited in databases, more than half of them presented the best-match hits with mosquito genes. Gene ontology analysis identified subsets of genes involved in biological processes such as protein biosynthesis and DNA replication, which are probably associated with spermatogenesis. A number of non-redundant sequences were also identified as putative male reproductive gland proteins (mRGPs), also known as male accessory gland protein genes (Acps). The transcriptome analysis of L. longipalpis male reproductive organs is one step further in the study of the molecular basis of the reproductive biology of this important species complex. It has allowed the identification of genes potentially involved in spermatogenesis as well as putative mRGPs sequences, which have been studied in many insect species because of their effects on female post-mating behavior and physiology and their potential role in sexual selection and speciation. These data open a number of new avenues for further research in the molecular and evolutionary reproductive biology of sand flies.
The Transcriptome of Lutzomyia longipalpis (Diptera: Psychodidae) Male Reproductive Organs
Bretãs, Jorge A. C.; Mazzoni, Camila J.; Souza, Nataly A.; Albano, Rodolpho M.; Wagner, Glauber; Davila, Alberto M. R.; Peixoto, Alexandre A.
2012-01-01
Background: It has been suggested that genes involved in the reproductive biology of insect disease vectors are potential targets for future alternative methods of control. Little is known about the molecular biology of reproduction in phlebotomine sand flies and there is no information available concerning genes that are expressed in male reproductive organs of Lutzomyia longipalpis, the main vector of American visceral leishmaniasis and a species complex. Methods/Principal Findings: We generated 2678 high quality ESTs (“Expressed Sequence Tags”) of L. longipalpis male reproductive organs that were grouped in 1391 non-redundant sequences (1136 singlets and 255 clusters). BLAST analysis revealed that only 57% of these sequences share similarity with a L. longipalpis female EST database. Although no more than 36% of the non-redundant sequences showed similarity to protein sequences deposited in databases, more than half of them presented the best-match hits with mosquito genes. Gene ontology analysis identified subsets of genes involved in biological processes such as protein biosynthesis and DNA replication, which are probably associated with spermatogenesis. A number of non-redundant sequences were also identified as putative male reproductive gland proteins (mRGPs), also known as male accessory gland protein genes (Acps). Conclusions: The transcriptome analysis of L. longipalpis male reproductive organs is one step further in the study of the molecular basis of the reproductive biology of this important species complex. It has allowed the identification of genes potentially involved in spermatogenesis as well as putative mRGPs sequences, which have been studied in many insect species because of their effects on female post-mating behavior and physiology and their potential role in sexual selection and speciation. These data open a number of new avenues for further research in the molecular and evolutionary reproductive biology of sand flies. PMID:22496818
Liu, Zhong-Yuan; Wang, Yun; Lü, Guo-Dong; Wang, Xian-Lei; Zhang, Fu-Chun; Ma, Ji
2006-12-01
The partial cDNA sequence coding for the antifreeze proteins of Tenebrio molitor was obtained by RT-PCR. Sequence analysis revealed nine putative cDNAs with a high degree of homology to Tenebrio molitor antifreeze proteins. The recombinant plasmid pGEX-4T-1-tmafp-XJ430 was introduced into E. coli BL21, and expression of a GST fusion protein was induced with IPTG. SDS-PAGE of the fusion protein demonstrated that the antifreeze protein migrated at a size of 38 kDa. The immunization was performed by intra-muscular injection of pCDNA3-tmafp-XJ430, and then antiserum was detected by ELISA. The titer of the antibody was 1:2,000. Western blotting analysis showed the antiserum was specific against the antifreeze protein. This finding could lead to further investigation of the properties and function of antifreeze proteins.
ERIC Educational Resources Information Center
Mertz, Pamela; Streu, Craig
2015-01-01
This article describes a synergistic two-semester writing sequence for biochemistry courses. In the first semester, students select a putative protein and are tasked with researching their protein largely through bioinformatics resources. In the second semester, students develop original ideas and present them in the form of a research grant…
Sitthithaworn, W; Kojima, N; Viroonchatapan, E; Suh, D Y; Iwanami, N; Hayashi, T; Noji, M; Saito, K; Niwa, Y; Sankawa, U
2001-02-01
cDNAs encoding geranylgeranyl diphosphate synthase (GGPPS) of two diterpene-producing plants, Scoparia dulcis and Croton sublyratus, have been isolated using the homology-based polymerase chain reaction (PCR) method. Both clones contained highly conserved aspartate-rich motifs (DDXX(XX)D) and their N-terminal residues exhibited the characteristics of chloroplast targeting sequence. When expressed in Escherichia coli, both the full-length and truncated proteins in which the putative targeting sequence was deleted catalyzed the condensation of farnesyl diphosphate and isopentenyl diphosphate to produce geranylgeranyl diphosphate (GGPP). The structural factors determining the product length in plant GGPPSs were investigated by constructing S. dulcis GGPPS mutants on the basis of sequence comparison with the first aspartate-rich motif (FARM) of plant farnesyl diphosphate synthase. The result indicated that in plant GGPPSs small amino acids, Met and Ser, at the fourth and fifth positions before FARM and Pro and Cys insertion in FARM play essential roles in determination of product length. Further, when a chimeric gene comprised of the putative transit peptide of the S. dulcis GGPPS gene and a green fluorescent protein was introduced into Arabidopsis leaves by particle gun bombardment, the chimeric protein was localized in chloroplasts, indicating that the cloned S. dulcis GGPPS is a chloroplast protein.
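The aspartate-rich motif notation DDXX(XX)D used above, with X standing for any residue and the bracketed pair being optional, can be expressed as a short pattern. This is a sketch only: the helper name and both peptide strings are made up for illustration and are not taken from the GGPPS sequences in the study.

```python
import re

# DDXX(XX)D: two Asp, two arbitrary residues, an optional further two, then Asp.
ASP_RICH_MOTIF = re.compile(r"DD[A-Z]{2}(?:[A-Z]{2})?D")


def find_asp_rich_motifs(protein: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in ASP_RICH_MOTIF.finditer(protein.upper())]


# Hypothetical peptides showing the short (DDXXD) and long (DDXXXXD) forms.
print(find_asp_rich_motifs("MKLDDAADLL"))
print(find_asp_rich_motifs("MKLDDAAGGDLL"))
```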
DOE Office of Scientific and Technical Information (OSTI.GOV)
Seppala, Susanna; Solomon, Kevin V.; Gilmore, Sean P.
Here, engineered cell factories that convert biomass into value-added compounds are emerging as a timely alternative to petroleum-based industries. Although often overlooked, integral membrane proteins such as solute transporters are pivotal for engineering efficient microbial chassis. Anaerobic gut fungi, adapted to degrade raw plant biomass in the intestines of herbivores, are a potential source of valuable transporters for biotechnology, yet very little is known about the membrane constituents of these non-conventional organisms. Here, we mined the transcriptome of three recently isolated strains of anaerobic fungi to identify membrane proteins responsible for sensing and transporting biomass hydrolysates within a competitive and rather extreme environment. Using sequence analyses and homology, we identified membrane protein-coding sequences from assembled transcriptomes from three strains of anaerobic gut fungi: Neocallimastix californiae, Anaeromyces robustus, and Piromyces finnis. We identified nearly 2000 transporter components: about half of these are involved in the general secretory pathway and intracellular sorting of proteins; the rest are predicted to be small-solute transporters. Unexpectedly, we found a number of putative sugar binding proteins that are associated with prokaryotic uptake systems; and approximately 100 class C G-protein coupled receptors (GPCRs) with non-canonical putative sugar binding domains. In conclusion, we report the first comprehensive characterization of the membrane protein machinery of biotechnologically relevant anaerobic gut fungi. Apart from identifying conserved machinery for protein sorting and secretion, we identify a large number of putative solute transporters that are of interest for biotechnological applications. Notably, our data suggests that the fungi display a plethora of carbohydrate binding domains at their surface, perhaps as a means to sense and sequester some of the sugars that their biomass degrading, extracellular enzymes produce.
Charles, Jermilia; Firth, Andrew E; Loroño-Pino, Maria A; Garcia-Rejon, Julian E; Farfan-Ale, Jose A; Lipkin, W Ian; Blitvich, Bradley J; Briese, Thomas
2016-04-01
Sequences corresponding to a putative, novel rhabdovirus [designated Merida virus (MERDV)] were initially detected in a pool of Culex quinquefasciatus collected in the Yucatan Peninsula of Mexico. The entire genome was sequenced, revealing 11 798 nt and five major ORFs, which encode the nucleoprotein (N), phosphoprotein (P), matrix protein (M), glycoprotein (G) and RNA-dependent RNA polymerase (L). The deduced amino acid sequences of the N, G and L proteins have no more than 24, 38 and 43 % identity, respectively, to the corresponding sequences of all other known rhabdoviruses, whereas those of the P and M proteins have no significant identity with any sequences in GenBank and their identity is only suggested based on their genome position. Using specific reverse transcription-PCR assays established from the genome sequence, 27 571 C. quinquefasciatus which had been sorted in 728 pools were screened to assess the prevalence of MERDV in nature and 25 pools were found positive. The minimal infection rate (calculated as the number of positive mosquito pools per 1000 mosquitoes tested) was 0.9, and similar for both females and males. Screening another 140 pools of 5484 mosquitoes belonging to four other genera identified positive pools of Ochlerotatus spp. mosquitoes, indicating that the host range is not restricted to C. quinquefasciatus. Attempts to isolate MERDV in C6/36 and Vero cells were unsuccessful. In summary, we provide evidence that a previously undescribed rhabdovirus occurs in mosquitoes in Mexico.
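The minimal infection rate quoted in this abstract follows from the pool counts it reports. A minimal sketch of the calculation, using a hypothetical function name and the definition given in the text (positive pools per 1000 mosquitoes tested):

```python
def minimal_infection_rate(positive_pools: int, mosquitoes_tested: int,
                           per: int = 1000) -> float:
    """Positive pools per `per` mosquitoes tested.

    Assumes, as the minimal infection rate does by definition, that each
    positive pool contains at least one (and is counted as one) infected mosquito.
    """
    if mosquitoes_tested <= 0:
        raise ValueError("mosquitoes_tested must be positive")
    return positive_pools / mosquitoes_tested * per


# 25 positive pools among 27,571 C. quinquefasciatus screened.
mir = minimal_infection_rate(25, 27571)
print(round(mir, 1))  # 0.9
```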
Incorrectly predicted genes in rice?
Cruveiller, Stéphane; Jabbari, Kamel; Clay, Oliver; Bernardi, Giorgio
2004-05-26
Between one third and one half of the proposed rice genes appear to have no homologs in other species, including Arabidopsis. Compositional considerations, and a comparison of curated rice sequences with ex novo predictions, suggest that many or most of the putative genes without homologs may be false positive predictions, i.e., sequences that are never translated into functional proteins in vivo.
Negrete-Abascal, Erasmo; Montes-Garcia, Fernando; Vaca-Pacheco, Sergio; Leyto-Gil, Abraham M.; Fragoso-Garcia, Edgar; Carvente-Garcia, Roberto; Perez-Agueros, Sandra; Castelan-Sanchez, Hugo G.; Garcia-Molina, Alejandra; Villamar, Tomas E.; Sánchez-Alonso, Patricia
2018-01-01
The draft genome sequence of Actinobacillus seminis strain ATCC 15768 is reported here. The genome comprises 22 contigs corresponding to 2.36 Mb with 40.7% G+C content and contains several genes related to virulence, including a putative RTX protein. PMID:29326222
Oliveira, Letícia de C.; Silveira, Aline M. M.; Monteiro, Andréa de S.; dos Santos, Vera L.; Nicoli, Jacques R.; Azevedo, Vasco A. de C.; Soares, Siomar de C.; Dias-Souza, Marcus V.; Nardi, Regina M. D.
2017-01-01
A bacteriocinogenic Lactobacillus rhamnosus L156.4 strain isolated from the feces of NIH mice was identified by 16S rRNA gene sequencing and MALDI-TOF mass spectrometry. The entire genome was sequenced using Illumina, annotated using the PGAAP and RAST servers, and deposited. Conserved genes associated with bacteriocin synthesis were predicted using BAGEL3, leading to the identification of an open reading frame (ORF) that shows homology with the L. rhamnosus GG (ATCC 53103) prebacteriocin gene. The encoded protein contains a conserved protein motif associated with a structural gene of the Enterocin A superfamily. We found ORFs related to the prebacteriocin, immunity protein, ABC transporter proteins, and regulatory genes with 100% identity to those of L. rhamnosus HN001. In this study, we provide evidence of a putative bacteriocin produced by L. rhamnosus L156.4 that was further confirmed by in vitro assays. The antibacterial activity of the substances produced by this strain was evaluated using the deferred agar-spot and spot-on-the-lawn assays, and a wide antimicrobial activity spectrum against human and foodborne pathogens was observed. The physicochemical characterization of the putative bacteriocin indicated that it was sensitive to proteolytic enzymes, heat stable, and maintained its antibacterial activity at pH values ranging from 3 to 9. The activity against Lactobacillus fermentum, which was used as an indicator strain, was detected during bacterial logarithmic growth phase, and a positive correlation was confirmed between bacterial growth and production of the putative bacteriocin. After a partial purification from cell-free supernatant by salt precipitation, the putative bacteriocin migrated as a diffuse band of approximately 1.0–3.0 kDa by SDS-PAGE. Additional studies are being conducted to explore its use in the food industry for controlling bacterial growth and for probiotic applications. PMID:28579977
Colombo, Lívia Tavares; de Oliveira, Marcelo Nagem Valério; Carneiro, Deisy Guimarães; de Souza, Robson Assis; Alvim, Mariana Caroline Tocantins; Dos Santos, Josenilda Carlos; da Silva, Cynthia Canêdo; Vidigal, Pedro Marcus Pereira; da Silveira, Wendel Batista; Passos, Flávia Maria Lopes
2016-09-01
Environments where lignocellulosic biomass is naturally decomposed are sources for discovery of new hydrolytic enzymes that can reduce the high cost of enzymatic cocktails for second-generation ethanol production. Metagenomic analysis was applied to discover genes coding carbohydrate-depleting enzymes from a microbial laboratory subculture using a mix of sugarcane bagasse and cow manure in the thermophilic composting phase. From a fosmid library, 182 clones had the ability to hydrolyse carbohydrate. Sequencing of 30 fosmids resulted in 12 contigs encoding 34 putative carbohydrate-active enzymes belonging to 17 glycosyl hydrolase (GH) families. One third of the putative proteins belong to the GH3 family, which includes β-glucosidase enzymes known to be important in the cellulose-deconstruction process but present with low activity in commercial enzyme preparations. Phylogenetic analysis of the amino acid sequences of seven selected proteins, including three β-glucosidases, showed low relatedness with protein sequences deposited in databases. These findings highlight microbial consortia obtained from a mixture of decomposing biomass residues, such as sugar cane bagasse and cow manure, as a rich resource of novel enzymes potentially useful in biotechnology for saccharification of lignocellulosic substrate.
Diallinas, G; Gorfinkiel, L; Arst, H N; Cecchetto, G; Scazzocchio, C
1995-04-14
In Aspergillus nidulans, loss-of-function mutations in the uapA and azgA genes, encoding the major uric acid-xanthine and hypoxanthine-adenine-guanine permeases, respectively, result in impaired utilization of these purines as sole nitrogen sources. The residual growth of the mutant strains is due to the activity of a broad specificity purine permease. We have identified uapC, the gene coding for this third permease through the isolation of both gain-of-function and loss-of-function mutations. Uptake studies with wild-type and mutant strains confirmed the genetic analysis and showed that the UapC protein contributes 30% and 8-10% to uric acid and hypoxanthine transport rates, respectively. The uapC gene was cloned, its expression studied, its sequence and transcript map established, and the sequence of its putative product analyzed. uapC message accumulation is: (i) weakly induced by 2-thiouric acid; (ii) repressed by ammonium; (iii) dependent on functional uaY and areA regulatory gene products (mediating uric acid induction and nitrogen metabolite repression, respectively); (iv) increased by uapC gain-of-function mutations which specifically, but partially, suppress a leucine to valine mutation in the zinc finger of the protein coded by the areA gene. The putative uapC gene product is a highly hydrophobic protein of 580 amino acids (M(r) = 61,251) including 12-14 putative transmembrane segments. The UapC protein is highly similar (58% identity) to the UapA permease and significantly similar (23-34% identity) to a number of bacterial transporters. Comparisons of the sequences and hydropathy profiles of members of this novel family of transporters yield insights into their structure, functionally important residues, and possible evolutionary relationships.
Putative Monofunctional Type I Polyketide Synthase Units: A Dinoflagellate-Specific Feature?
Eichholz, Karsten; Beszteri, Bánk; John, Uwe
2012-01-01
Marine dinoflagellates (alveolata) are microalgae of which some cause harmful algal blooms and produce a broad variety of most likely polyketide synthesis derived phycotoxins. Recently, novel polyketide synthase (PKS) transcripts have been described from the Florida red tide dinoflagellate Karenia brevis (gymnodiniales) which are evolutionarily related to Type I PKS but were apparently expressed as monofunctional proteins, a feature typical of Type II PKS. Here, we investigated expression units of PKS I-like sequences in Alexandrium ostenfeldii (gonyaulacales) and Heterocapsa triquetra (peridiniales) at the transcript and protein level. The five full length transcripts we obtained were all characterized by polyadenylation, a 3′ UTR and the dinoflagellate specific spliced leader sequence at the 5′ end. Each of the five transcripts encoded a single ketoacylsynthase (KS) domain showing high similarity to K. brevis KS sequences. The monofunctional structure was also confirmed using dinoflagellate specific KS antibodies in Western Blots. In a maximum likelihood phylogenetic analysis of KS domains from diverse PKSs, dinoflagellate KSs formed a clade placed well within the protist Type I PKS clade between apicomplexa, haptophytes and chlorophytes. These findings indicate that the atypical PKS I structure, i.e., expression as putative monofunctional units, might be a dinoflagellate specific feature. In addition, the sequenced transcripts harbored a previously unknown, apparently dinoflagellate specific conserved N-terminal domain. We discuss the implications of this novel region with regard to the putative monofunctional organization of Type I PKS in dinoflagellates. PMID:23139807
Terminal region sequence variations in variola virus DNA.
Massung, R F; Loparev, V N; Knight, J C; Totmenin, A V; Chizhikov, V E; Parsons, J M; Safronov, P F; Gutorov, V V; Shchelkunov, S N; Esposito, J J
1996-07-15
Genome DNA terminal region sequences were determined for a Brazilian alastrim variola minor virus strain Garcia-1966 that was associated with a 0.8% case-fatality rate and African smallpox strains Congo-1970 and Somalia-1977 associated with variola major (9.6%) and minor (0.4%) mortality rates, respectively. A base sequence identity of ≥98.8% was determined after aligning 30 kb of the left- or right-end region sequences with cognate sequences previously determined for Asian variola major strains India-1967 (31% death rate) and Bangladesh-1975 (18.5% death rate). The deduced amino acid sequences of putative proteins of ≥65 amino acids also showed relatively high identity, although the Asian and African viruses were clearly more related to each other than to alastrim virus. Alastrim virus contained only 10 of 70 proteins that were 100% identical to homologs in Asian strains, and 7 alastrim-specific proteins were noted.
Alam, Syed Imteyaz; Dwivedi, Pratistha
2016-10-01
The whole genome sequencing and annotation of Clostridium perfringens strains revealed several genes coding for proteins of unknown function with no significant similarities to genes in other organisms. Our previous studies clearly demonstrated that hypothetical proteins CPF_2500, CPF_1441, CPF_0876, CPF_0093, CPF_2002, CPF_2314, CPF_1179, CPF_1132, CPF_2853, CPF_0552, CPF_2032, CPF_0438, CPF_1440, CPF_2918, CPF_0656, and CPF_2364 are genuine proteins of C. perfringens expressed in high abundance. This study explored the putative role of these hypothetical proteins using bioinformatic tools and evaluated their potential as putative candidates for prophylaxis. Apart from a group of eight hypothetical proteins (HPs), a putative function was predicted for the rest of the hypothetical proteins using one or more of the algorithms used. The phylogenetic analysis did not suggest evidence of a horizontal gene transfer event except for HP CPF_0876. HP CPF_2918 is an abundant extracellular protein, unique to C. perfringens species with maximum strain coverage, and did not show any significant match in the database. CPF_2918 was cloned, recombinant protein was purified to near homogeneity, and probing with mouse anti-CPF_2918 serum revealed surface localization of the protein in C. perfringens ATCC13124 cultures. The purified recombinant CPF_2918 protein induced antibody production, a mixed Th1 and Th2 kind of response, and provided partial protection to immunized mice in direct C. perfringens challenge. Copyright © 2016 Elsevier B.V. All rights reserved.
Transcriptome analysis of resistant soybean roots infected by Meloidogyne javanica
de Sá, Maria Eugênia Lisei; Conceição Lopes, Marcus José; de Araújo Campos, Magnólia; Paiva, Luciano Vilela; dos Santos, Regina Maria Amorim; Beneventi, Magda Aparecida; Firmino, Alexandre Augusto Pereira; de Sá, Maria Fátima Grossi
2012-01-01
Soybean is an important crop for Brazilian agribusiness. However, many factors can limit its production, especially root-knot nematode infection. Studies on the mechanisms employed by the resistant soybean genotypes to prevent infection by these nematodes are of great interest for breeders. For these reasons, the aim of this work is to characterize the transcriptome of soybean line PI 595099-Meloidogyne javanica interaction through expression analysis. Two cDNA libraries were obtained using a pool of RNA from PI 595099 uninfected and M. javanica (J2) infected roots, collected at 6, 12, 24, 48, 96, 144 and 192 h after inoculation. Around 800 ESTs (Expressed Sequence Tags) were sequenced and clustered into 195 clusters. In silico subtraction analysis identified eleven differentially expressed genes encoding putative proteins sharing amino acid sequence similarities by using BlastX: metallothionein, SLAH4 (SLAC1 Homologue 4), SLAH1 (SLAC1 Homologue 1), zinc-finger proteins, AN1-type proteins, auxin-repressed proteins, thioredoxin and nuclear transport factor 2 (NTF-2). Other genes were also found exclusively in nematode stressed soybean roots, such as NAC domain-containing proteins, MADS-box proteins, SOC1 (suppressor of overexpression of constans 1) proteins, thioredoxin-like protein, 4-Coumarate-CoA ligase and the transcription factor (TF) MYBZ2. Among the genes identified in non-stressed roots only were Ser/Thr protein kinases, wound-induced basic protein, ethylene-responsive family protein, metallothionein-like protein, cysteine proteinase inhibitor (cystatin) and putative Kunitz trypsin protease inhibitor. An understanding of the roles of these differentially expressed genes will provide insights into the resistance mechanisms and candidate genes involved in soybean-M. javanica interaction and contribute to more effective control of this pathogen. PMID:22802712
Piao, Hailan; Froula, Jeff; Du, Changbin; Kim, Tae-Wan; Hawley, Erik R; Bauer, Stefan; Wang, Zhong; Ivanova, Nathalia; Clark, Douglas S; Klenk, Hans-Peter; Hess, Matthias
2014-08-01
Although recent nucleotide sequencing technologies have significantly enhanced our understanding of microbial genomes, the function of ∼35% of genes identified in a genome currently remains unknown. To improve the understanding of microbial genomes and consequently of microbial processes it will be crucial to assign a function to this "genomic dark matter." Due to the urgent need for additional carbohydrate-active enzymes for improved production of transportation fuels from lignocellulosic biomass, we screened the genomes of more than 5,500 microorganisms for hypothetical proteins that are located in the proximity of already known cellulases. We identified, synthesized and expressed a total of 17 putative cellulase genes with insufficient sequence similarity to currently known cellulases to be identified as such using traditional sequence annotation techniques that rely on significant sequence similarity. The recombinant proteins of the newly identified putative cellulases were subjected to enzymatic activity assays to verify their hydrolytic activity towards cellulose and lignocellulosic biomass. Eleven (65%) of the tested enzymes had significant activity towards at least one of the substrates. This high success rate highlights that a gene context-based approach can be used to assign function to genes that are otherwise categorized as "genomic dark matter" and to identify biomass-degrading enzymes that have little sequence similarity to already known cellulases. The ability to assign function to genes that have no related sequence representatives with functional annotation will be important to enhance our understanding of microbial processes and to identify microbial proteins for a wide range of applications. © 2014 Wiley Periodicals, Inc.
Probing Protein Sequences as Sources for Encrypted Antimicrobial Peptides
Brand, Guilherme D.; Magalhães, Mariana T. Q.; Tinoco, Maria L. P.; Aragão, Francisco J. L.; Nicoli, Jacques; Kelly, Sharon M.; Cooper, Alan; Bloch, Carlos
2012-01-01
Starting from the premise that a wealth of potentially biologically active peptides may lurk within proteins, we describe here a methodology to identify putative antimicrobial peptides encrypted in protein sequences. Candidate peptides were identified using a new screening procedure based on physicochemical criteria to reveal matching peptides within protein databases. Fifteen such peptides, along with a range of natural antimicrobial peptides, were examined using DSC and CD to characterize their interaction with phospholipid membranes. Principal component analysis of DSC data shows that the investigated peptides group according to their effects on the main phase transition of phospholipid vesicles, and that these effects correlate both to antimicrobial activity and to the changes in peptide secondary structure. Consequently, we have been able to identify novel antimicrobial peptides from larger proteins not hitherto associated with such activity, mimicking endogenous and/or exogenous microorganism enzymatic processing of parent proteins to smaller bioactive molecules. A biotechnological application for this methodology is explored. Soybean (Glycine max) plants, transformed to include a putative antimicrobial protein fragment encoded in its own genome were tested for tolerance against Phakopsora pachyrhizi, the causative agent of the Asian soybean rust. This procedure may represent an inventive alternative to the transgenic technology, since the genetic material to be used belongs to the host organism and not to exogenous sources. PMID:23029273
The Rabies Virus L Protein Catalyzes mRNA Capping with GDP Polyribonucleotidyltransferase Activity.
Ogino, Minako; Ito, Naoto; Sugiyama, Makoto; Ogino, Tomoaki
2016-05-21
The large (L) protein of rabies virus (RABV) plays multiple enzymatic roles in viral RNA synthesis and processing. However, none of its putative enzymatic activities have been directly demonstrated in vitro. In this study, we expressed and purified a recombinant form of the RABV L protein and verified its guanosine 5'-triphosphatase and GDP polyribonucleotidyltransferase (PRNTase) activities, which are essential for viral mRNA cap formation by the unconventional mechanism. The RABV L protein capped 5'-triphosphorylated but not 5'-diphosphorylated RABV mRNA-start sequences, 5'-AACA(C/U), with GDP to generate the 5'-terminal cap structure G(5')ppp(5')A. The 5'-AAC sequence in the substrate RNAs was found to be strictly essential for RNA capping with the RABV L protein. Furthermore, site-directed mutagenesis showed that some conserved amino acid residues (G1112, T1170, W1201, H1241, R1242, F1285, and Q1286) in the PRNTase motifs A to E of the RABV L protein are required for cap formation. These findings suggest that the putative PRNTase domain in the RABV L protein catalyzes the rhabdovirus-specific capping reaction involving covalent catalysis of the pRNA transfer to GDP, thus offering this domain as a target for developing anti-viral agents.
Boulila, Moncef
2010-06-01
To enhance the knowledge of recombination as an evolutionary process, 267 accessions retrieved from GenBank were investigated, all belonging to five economically important viruses infecting fruit crops (Plum pox, Apple chlorotic leaf spot, Apple mosaic, Prune dwarf, and Prunus necrotic ringspot viruses). Putative recombinational events were detected in the coat protein (CP)-encoding gene using RECCO and RDP version 3.31beta algorithms. Based on RECCO results, all five viruses were shown to contain potential recombination signals in the CP gene. Reconstructed trees with modified topologies were proposed. Furthermore, RECCO performed better than the RDP package in detecting recombination events and exhibiting their evolution rate along the sequences of the five viruses. RDP, however, provided the possible major and minor parents of the recombinants. Thus, the two methods should be considered complementary.
2010-01-01
Background Bathymodiolus azoricus is a deep-sea hydrothermal vent mussel found in association with large faunal communities living in chemosynthetic environments at the bottom of the sea floor near the Azores Islands. Investigation of the exceptional physiological reactions that vent mussels have adopted in their habitat, including responses to environmental microbes, remains a difficult challenge for deep-sea biologists. In an attempt to reveal genes potentially involved in the deep-sea mussel innate immunity we carried out a high-throughput sequence analysis of freshly collected B. azoricus transcriptome using gill tissue as the primary source of immune transcripts given its strategic role in filtering the surrounding waterborne potentially infectious microorganisms. Additionally, a substantial EST data set was produced, from which a comprehensive collection of genes coding for putative proteins was organized in a dedicated database, "DeepSeaVent", the first deep-sea vent animal transcriptome database based on the 454 pyrosequencing technology. Results A normalized cDNA library from gill tissue was sequenced in a full 454 GS-FLX run, producing 778,996 sequencing reads. Assembly of the high quality reads resulted in 75,407 contigs of which 3,071 were singletons. A total of 39,425 transcripts were conceptually translated into amino acid sequences of which 22,023 matched known proteins in the NCBI non-redundant protein database, 15,839 revealed conserved protein domains through InterPro functional classification and 9,584 were assigned with Gene Ontology terms. Queries conducted within the database enabled the identification of genes putatively involved in immune and inflammatory reactions which had not been previously evidenced in the vent mussel. Their physical counterpart was confirmed by semi-quantitative Reverse-Transcription-Polymerase Chain Reactions (RT-PCR) and their RNA transcription level by quantitative PCR (qPCR) experiments.
Conclusions We have established the first tissue transcriptional analysis of a deep-sea hydrothermal vent animal and generated a searchable catalog of genes that provides a direct method of identifying and retrieving vast numbers of novel coding sequences which can be applied in gene expression profiling experiments from a non-conventional model organism. This provides the most comprehensive sequence resource for identifying novel genes currently available for a deep-sea vent organism, in particular, genes putatively involved in immune and inflammatory reactions in vent mussels. The characterization of the B. azoricus transcriptome will facilitate research into biological processes underlying physiological adaptations to hydrothermal vent environments and will provide a basis for expanding our understanding of genes putatively involved in adaptations processes during post-capture long term acclimatization experiments, at "sea-level" conditions, using B. azoricus as a model organism. PMID:20937131
Characterization of Plasmids in a Human Clinical Strain of Lactococcus garvieae
Blanco, M. Mar; López-Campos, Guillermo H.; Cutuli, M. Teresa; Fernández-Garayzábal, José F.
2012-01-01
The present work describes the molecular characterization of five circular plasmids found in the human clinical strain Lactococcus garvieae 21881. The plasmids were designated pGL1-pGL5, with molecular sizes of 4,536 bp, 4,572 bp, 12,948 bp, 14,006 bp and 68,798 bp, respectively. Based on detailed sequence analysis, some of these plasmids appear to be mosaics composed of DNA obtained by modular exchange between different species of lactic acid bacteria. Based on sequence data and the derived presence of certain genes and proteins, the plasmid pGL2 appears to replicate via a rolling-circle mechanism, while the other four plasmids appear to belong to the group of lactococcal theta-type replicons. The plasmids pGL1, pGL2 and pGL5 encode putative proteins related with bacteriocin synthesis and bacteriocin secretion and immunity. The plasmid pGL5 harbors genes (txn, orf5 and orf25) encoding proteins that could be considered putative virulence factors. The gene txn encodes a protein with an enzymatic domain corresponding to the family actin-ADP-ribosyltransferases toxins, which are known to play a key role in pathogenesis of a variety of bacterial pathogens. The genes orf5 and orf25 encode two putative surface proteins containing the cell wall-sorting motif LPXTG, with mucin-binding and collagen-binding protein domains, respectively. These proteins could be involved in the adherence of L. garvieae to mucus from the intestine, facilitating further interaction with intestinal epithelial cells and to collagenous tissues such as the collagen-rich heart valves. To our knowledge, this is the first report on the characterization of plasmids in a human clinical strain of this pathogen. PMID:22768237
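The LPXTG cell wall-sorting motif mentioned in the abstract above (Leu-Pro-any residue-Thr-Gly) is simple enough to scan for with a regular expression. The following is a minimal sketch, not the authors' method; the function name `find_lpxtg` and the toy sequence are hypothetical and do not correspond to any real L. garvieae protein.

```python
import re

# LPXTG sorting motif: Leu-Pro-<any residue>-Thr-Gly.
LPXTG = re.compile(r"LP.TG")


def find_lpxtg(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched pentapeptide) for each hit."""
    return [(m.start() + 1, m.group()) for m in LPXTG.finditer(protein.upper())]


# Hypothetical toy sequence used only to exercise the function.
toy = "MKKTSAVLPETGEESNLLPKTGA"
print(find_lpxtg(toy))
```

A real annotation pipeline would normally also check for the hydrophobic stretch and charged tail that follow the motif; this sketch only locates the pentapeptide itself.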
Identification of a putative triacylglycerol lipase from papaya latex by functional proteomics.
Dhouib, R; Laroche-Traineau, J; Shaha, R; Lapaillerie, D; Solier, E; Rualès, J; Pina, M; Villeneuve, P; Carrière, F; Bonneu, M; Arondel, V
2011-01-01
Latex from Caricaceae has been known since 1925 to contain strong lipase activity. However, attempts to purify and identify the enzyme were not successful, mainly because of the lack of solubility of the enzyme. Here, we describe the characterization of lipase activity of the latex of Vasconcellea heilbornii and the identification of a putative homologous lipase from Carica papaya. Triacylglycerol lipase activity was enriched 74-fold from crude latex of Vasconcellea heilbornii to a specific activity (SA) of 57 μmol·min⁻¹·mg⁻¹ on long-chain triacylglycerol (olive oil). The extract was also active on trioctanoin (SA = 655 μmol·min⁻¹·mg⁻¹), tributyrin (SA = 1107 μmol·min⁻¹·mg⁻¹) and phosphatidylcholine (SA = 923 μmol·min⁻¹·mg⁻¹). The optimum pH ranged from 8.0 to 9.0. The protein content of the insoluble fraction of latex was analyzed by electrophoresis followed by mass spectrometry, and 28 different proteins were identified. The protein fraction was incubated with the lipase inhibitor [¹⁴C]tetrahydrolipstatin, and a 45 kDa protein radiolabeled by the inhibitor was identified as being a putative lipase. A C. papaya cDNA encoding a 55 kDa protein was further cloned, and its deduced sequence had 83.7% similarity with peptides from the 45 kDa protein, with a coverage of 25.6%. The protein encoded by this cDNA had 35% sequence identity and 51% similarity to castor bean acid lipase, suggesting that it is the lipase responsible for the important lipolytic activities detected in papaya latex. © 2010 The Authors Journal compilation © 2010 FEBS.
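The enrichment figures in the abstract above can be related by simple arithmetic. The sketch below assumes that "enriched 74-fold" means the ratio of final to crude specific activity on olive oil; the implied crude value is therefore an inference, not a number reported in the abstract. The function names are hypothetical.

```python
# Specific activities reported in the abstract, in umol/min/mg.
sa_olive_oil = 57.0
fold_enrichment = 74.0
other_substrates = {
    "trioctanoin": 655.0,
    "tributyrin": 1107.0,
    "phosphatidylcholine": 923.0,
}


def implied_crude_sa(enriched_sa: float, fold: float) -> float:
    """Crude specific activity implied by fold = enriched / crude (assumption)."""
    return enriched_sa / fold


def relative_activity(sa: float, reference: float) -> float:
    """Activity on a substrate relative to the reference substrate."""
    return sa / reference


print(round(implied_crude_sa(sa_olive_oil, fold_enrichment), 2))
for name, sa in other_substrates.items():
    print(name, round(relative_activity(sa, sa_olive_oil), 1))
```

Under that assumption the crude latex would have had a specific activity of roughly 0.77 μmol·min⁻¹·mg⁻¹ on olive oil, and the extract is between about 11 and 19 times more active on the other three substrates than on olive oil.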
Martin, Rowena E; Henry, Roselani I; Abbey, Janice L; Clements, John D; Kirk, Kiaran
2005-01-01
Background The uptake of nutrients, expulsion of metabolic wastes and maintenance of ion homeostasis by the intraerythrocytic malaria parasite is mediated by membrane transport proteins. Proteins of this type are also implicated in the phenomenon of antimalarial drug resistance. However, the initial annotation of the genome of the human malaria parasite Plasmodium falciparum identified only a limited number of transporters, and no channels. In this study we have used a combination of bioinformatic approaches to identify and attribute putative functions to transporters and channels encoded by the malaria parasite, as well as comparing expression patterns for a subset of these. Results A computer program that searches a genome database on the basis of the hydropathy plots of the corresponding proteins was used to identify more than 100 transport proteins encoded by P. falciparum. These include all the transporters previously annotated as such, as well as a similar number of candidate transport proteins that had escaped detection. Detailed sequence analysis enabled the assignment of putative substrate specificities and/or transport mechanisms to all those putative transport proteins previously without. The newly-identified transport proteins include candidate transporters for a range of organic and inorganic nutrients (including sugars, amino acids, nucleosides and vitamins), and several putative ion channels. The stage-dependent expression of RNAs for 34 candidate transport proteins of particular interest are compared. Conclusion The malaria parasite possesses substantially more membrane transport proteins than was originally thought, and the analyses presented here provide a range of novel insights into the physiology of this important human pathogen. PMID:15774027
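The abstract above describes a program that searches a genome database on the basis of hydropathy plots but does not say how the plots were computed. As an illustration only, the sketch below uses the Kyte-Doolittle scale with a 19-residue sliding window and a mean-hydropathy cutoff of 1.6, a commonly cited rule of thumb for membrane-spanning segments. The scale, window, cutoff, function names and toy sequence are all assumptions, not details of the authors' program.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 19) -> list[float]:
    """Mean hydropathy of every full-length window along the sequence."""
    values = [KD[aa] for aa in seq.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def count_hydrophobic_segments(seq: str, window: int = 19,
                               threshold: float = 1.6) -> int:
    """Count runs of consecutive windows whose mean exceeds the threshold."""
    segments = 0
    inside = False
    for score in hydropathy_profile(seq, window):
        above = score > threshold
        if above and not inside:
            segments += 1  # a new hydrophobic run starts here
        inside = above
    return segments


# Hypothetical toy sequence: two hydrophobic stretches separated by a
# charged linker. It is not a real P. falciparum protein.
toy = "L" * 20 + "DEKR" * 8 + "I" * 20
print(count_hydrophobic_segments(toy))
```

A candidate-transporter screen along these lines would flag proteins with several such segments; the cutoff on segment count is left out here because the abstract gives no value for it.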
Mapping protein-protein interactions with phage-displayed combinatorial peptide libraries.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kay, B. K.; Castagnoli, L.; Biosciences Division
This unit describes the process and analysis of affinity selecting bacteriophage M13 from libraries displaying combinatorial peptides fused to either a minor or major capsid protein. Direct affinity selection uses target protein bound to a microtiter plate followed by purification of selected phage by ELISA. Alternatively, there is a bead-based affinity selection method. These methods allow one to readily isolate peptide ligands that bind to a protein target of interest and use the consensus sequence to search proteomic databases for putative interacting proteins.
Dostálová, Anna; Votýpka, Jan; Favreau, Amanda J; Barbian, Kent D; Volf, Petr; Valenzuela, Jesus G; Jochim, Ryan C
2011-05-10
Parasite-vector interactions are fundamental in the transmission of vector-borne diseases such as leishmaniasis. Leishmania development in the vector sand fly is confined to the digestive tract, where sand fly midgut molecules interact with the parasites. In this work we sequenced and analyzed two midgut-specific cDNA libraries from sugar fed and blood fed female Phlebotomus perniciosus and compared the transcript expression profiles. A total of 4111 high quality sequences were obtained from the two libraries and assembled into 370 contigs and 1085 singletons. Molecules with putative roles in blood meal digestion, peritrophic matrix formation, immunity and response to oxidative stress were identified, including proteins that were not previously reported in sand flies. These molecules were evaluated relative to other published sand fly transcripts. Comparative analysis of the two libraries revealed transcripts differentially expressed in response to blood feeding. Molecules up regulated by blood feeding include a putative peritrophin (PperPer1), two chymotrypsin-like proteins (PperChym1 and PperChym2), a putative trypsin (PperTryp3) and four putative microvillar proteins (PperMVP1, 2, 4 and 5). Additionally, several transcripts were more abundant in the sugar fed midgut, such as two putative trypsins (PperTryp1 and PperTryp2), a chymotrypsin (PperChym3) and a microvillar protein (PperMVP3). We performed a detailed temporal expression profile analysis of the putative trypsin transcripts using qPCR and confirmed the expression of blood-induced and blood-repressed trypsins. Trypsin expression was measured in Leishmania infantum-infected and uninfected sand flies, which identified the L. infantum-induced down regulation of PperTryp3 at 24 hours post-blood meal. This midgut tissue-specific transcriptome provides insight into the molecules expressed in the midgut of P. perniciosus, an important vector of visceral leishmaniasis in the Old World. 
Through the comparative analysis of the libraries we identified molecules differentially expressed during blood meal digestion. Additionally, this study provides a detailed comparison to transcripts of other sand flies. Moreover, our analysis of putative trypsins demonstrated that L. infantum infection can reduce the transcript abundance of trypsin PperTryp3 in the midgut of P. perniciosus.
Huang, Lin; Li, Guiyang; Mo, Zhaolan; Xiao, Peng; Li, Jie; Huang, Jie
2015-01-01
Background Japanese flounder (Paralichthys olivaceus) is an economically important marine fish in Asia that has suffered from disease outbreaks caused by various pathogens, which calls for more genome-wide information on immune-relevant genes. However, genomic and transcriptomic data for Japanese flounder remain scarce, which limits studies on the immune system of this species. In this study, we characterized the Japanese flounder spleen transcriptome using an Illumina paired-end sequencing platform to identify putative genes involved in immunity. Methodology/Principal Findings A cDNA library from the spleen of P. olivaceus was constructed and randomly sequenced using an Illumina technique. The removal of low quality reads generated 12,196,968 trimmed reads, which assembled into 96,627 unigenes. A total of 21,391 unigenes (22.14%) were annotated in the NCBI Nr database, and only 1.1% of the BLASTx top-hits matched P. olivaceus protein sequences. Approximately 12,503 (58.45%) unigenes were categorized into three Gene Ontology groups, 19,547 (91.38%) were classified into 26 Cluster of Orthologous Groups, and 10,649 (49.78%) were assigned to six Kyoto Encyclopedia of Genes and Genomes pathways. Furthermore, 40,928 putative simple sequence repeats and 47,362 putative single nucleotide polymorphisms were identified. Importantly, we identified 1,563 putative immune-associated unigenes that mapped to 15 immune signaling pathways. Conclusions/Significance The P. olivaceus transcriptome data provide a rich source to discover and identify new genes, and the immune-relevant sequences identified here will facilitate our understanding of the mechanisms involved in the immune response. Furthermore, the plentiful potential SSRs and SNPs found in this study are important resources with respect to future development of a linkage map or marker assisted breeding programs for the flounder. PMID:25723398
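The percentages in the abstract above use two different denominators, which is easy to miss on a first read. The short check below, a sketch with a hypothetical helper `pct`, reproduces the reported figures when the annotation rate is taken over all 96,627 unigenes and the GO, COG and KEGG rates are taken over the 21,391 annotated unigenes.

```python
# Counts reported in the abstract.
total_unigenes = 96_627
annotated = 21_391
categories = {"GO": 12_503, "COG": 19_547, "KEGG": 10_649}


def pct(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


# Annotation rate is relative to all assembled unigenes.
print("annotated", pct(annotated, total_unigenes))

# Category rates are relative to the annotated unigenes only.
for name, count in categories.items():
    print(name, pct(count, annotated))
```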
Darris, Maxwell
2017-01-01
ABSTRACT Most of the 24 known Chitinophaga species were originally isolated from soils. We report the draft genome sequence of a putatively novel Chitinophaga sp. from a biofilm in an air conditioner condensate pipe. The genome comprises 7,661,303 bp in one scaffold, 5,694 predicted protein-coding sequences, and a G+C content of 47.6%. PMID:29051259
Mauricio-Castillo, J A; Torres-Herrera, S I; Cárdenas-Conejo, Y; Pastor-Palacios, G; Méndez-Lozano, J; Argüello-Astorga, G R
2014-09-01
A novel begomovirus isolated from a Sida rhombifolia plant collected in Sinaloa, Mexico, was characterized. The genomic components of sida mosaic Sinaloa virus (SiMSinV) shared highest sequence identity with DNA-A and DNA-B components of chino del tomate virus (CdTV), suggesting a vertical evolutionary relationship between these viruses. However, recombination analysis indicated that a short segment of SiMSinV DNA-A encompassing the plus-strand replication origin and the 5´-proximal 43 codons of the Rep gene was derived from tomato mottle Taino virus (ToMoTV). Accordingly, the putative cis- and trans-acting replication specificity determinants of SiMSinV were identical to those of ToMoTV but differed from those of CdTV. Modeling of the SiMSinV and CdTV Rep proteins revealed significant differences in the region comprising the small β1/β5 sheet element, where five putative DNA-binding specificity determinants (SPDs) of Rep (i.e., amino acid residues 5, 8, 10, 69 and 71) were previously identified. Computer-assisted searches of public databases led to identification of 33 begomoviruses from three continents encoding proteins with SPDs identical to those of the Rep encoded by SiMSinV. Sequence analysis of the replication origins demonstrated that all 33 begomoviruses harbor potential Rep-binding sites identical to those of SiMSinV. These data support the hypothesis that the Rep β1/β5 sheet region determines specificity of this protein for DNA replication origin sequences.
Vandesteene, Lies; Ramon, Matthew; Le Roy, Katrien; Van Dijck, Patrick; Rolland, Filip
2010-03-01
Higher plants typically do not produce trehalose in large amounts, but their genome sequences reveal large families of putative trehalose metabolism enzymes. An important regulatory role in plant growth and development is also emerging for the metabolic intermediate trehalose-6-P (T6P). Here, we present an update on Arabidopsis trehalose metabolism and a resource for further detailed analyses. In addition, we provide evidence that Arabidopsis encodes a single trehalose-6-P synthase (TPS) next to a family of catalytically inactive TPS-like proteins that might fulfill specific regulatory functions in actively growing tissues.
Khani, Afsaneh; Popp, Nicole; Kreikemeyer, Bernd; Patenge, Nadja
2018-01-01
Regulatory RNAs play important roles in the control of bacterial gene expression. In this study, we investigated gene expression regulation by a putative glycine riboswitch located in the 5'-untranslated region of a sodium:alanine symporter family (SAF) protein gene in the group A Streptococcus pyogenes serotype M49 strain 591. Glycine-dependent gene expression mediated by riboswitch activity was studied using a luciferase reporter gene system. Maximal reporter gene expression was observed in the absence of glycine and in the presence of low glycine concentrations. Differences in glycine-dependent gene expression were not based on differential promoter activity. Expression of the SAF protein gene and the downstream putative cation efflux protein gene was investigated in wild-type bacteria by RT-qPCR transcript analyses. During growth in the presence of glycine (≥1 mM), expression of the genes was downregulated. Northern blot analyses revealed premature transcription termination in the presence of high glycine concentrations. Growth in the presence of 0.1 mM glycine led to the production of a full-length transcript. Furthermore, stability of the SAF protein gene transcript was drastically reduced in the presence of glycine. We conclude that the putative glycine riboswitch in S. pyogenes serotype M49 strain 591 represses expression of the SAF protein gene and the downstream putative cation efflux protein gene in the presence of high glycine concentrations. Sequence and secondary structure comparisons indicated that the streptococcal riboswitch belongs to the class of tandem aptamer glycine riboswitches.
Negrete-Abascal, Erasmo; Montes-Garcia, Fernando; Vaca-Pacheco, Sergio; Leyto-Gil, Abraham M; Fragoso-Garcia, Edgar; Carvente-Garcia, Roberto; Perez-Agueros, Sandra; Castelan-Sanchez, Hugo G; Garcia-Molina, Alejandra; Villamar, Tomas E; Sánchez-Alonso, Patricia; Vazquez-Cruz, Candelario
2018-01-11
The draft genome sequence of Actinobacillus seminis strain ATCC 15768 is reported here. The genome comprises 22 contigs corresponding to 2.36 Mb with 40.7% G+C content and contains several genes related to virulence, including a putative RTX protein. Copyright © 2018 Negrete-Abascal et al.
Majumder, P; Choudhury, A; Banerjee, M; Lahiri, A; Bhattacharyya, N P
2007-08-01
To investigate the mechanism of increased expression of caspase-1 caused by exogenous Hippi, observed earlier in HeLa and Neuro2A cells, in this work we identified a specific motif, AAAGACATG (-101 to -93), in the caspase-1 gene upstream sequence where HIPPI could bind. Various mutations in this specific sequence compromised the interaction, showing the specificity of the interaction. In the luciferase reporter assay, when the reporter gene was driven by caspase-1 gene upstream sequences (-151 to -92) with the mutation G to T at position -98, luciferase activity was decreased significantly in green fluorescent protein-Hippi-expressing HeLa cells in comparison to that obtained with the wild-type caspase-1 gene 60 bp upstream sequence, indicating the biological significance of such binding. It was observed that the C-terminal 'pseudo' death effector domain of HIPPI interacted with the 60 bp (-151 to -92) upstream sequence of the caspase-1 gene containing the motif. We further observed that expression of caspase-8 and caspase-10 was increased in green fluorescent protein-Hippi-expressing HeLa cells. In addition, HIPPI interacted in vitro with putative promoter sequences of these genes, containing a similar motif. In summary, we identified a novel function of HIPPI; it binds to specific upstream sequences of the caspase-1, caspase-8 and caspase-10 genes and alters the expression of the genes. This result showed the motif-specific interaction of HIPPI with DNA, and indicates that it could act as a transcription regulator.
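The coordinates quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the helper names are hypothetical, and it assumes that the quoted ranges are inclusive and that the motif is written 5' to 3' starting at position -101.

```python
# Motif and coordinates are taken from the abstract; helper names are hypothetical.
MOTIF = "AAAGACATG"
MOTIF_START, MOTIF_END = -101, -93        # assumed inclusive upstream coordinates
FRAGMENT_START, FRAGMENT_END = -151, -92  # the "60 bp" reporter fragment


def base_at(position: int) -> str:
    """Return the motif base at an upstream coordinate such as -98."""
    if not MOTIF_START <= position <= MOTIF_END:
        raise ValueError(f"position {position} lies outside the motif")
    return MOTIF[position - MOTIF_START]


def mutate(position: int, new_base: str) -> str:
    """Return the motif with a single base substituted at the given coordinate."""
    idx = position - MOTIF_START
    if not 0 <= idx < len(MOTIF):
        raise ValueError(f"position {position} lies outside the motif")
    return MOTIF[:idx] + new_base + MOTIF[idx + 1:]


motif_length = MOTIF_END - MOTIF_START + 1           # inclusive span of the motif
fragment_length = FRAGMENT_END - FRAGMENT_START + 1  # inclusive span of the fragment
mutant_motif = mutate(-98, "T")                      # the G-to-T change at -98
```

Under these assumptions the motif spans 9 positions, the fragment spans 60 bp, and position -98 falls on the only G in the first half of the motif, which agrees with the G to T mutation described.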
2014-01-01
Background Bacteroides spp. form a significant part of our gut microbiome and are well known for optimized metabolism of diverse polysaccharides. Initial analysis of the archetypal Bacteroides thetaiotaomicron genome identified 172 glycosyl hydrolases and a large number of uncharacterized proteins associated with polysaccharide metabolism. Results BT_1012 from Bacteroides thetaiotaomicron VPI-5482 is a protein of unknown function and a member of a large protein family consisting entirely of uncharacterized proteins. Initial sequence analysis predicted that this protein has two domains, one at the N-terminus and one at the C-terminus. A PSI-BLAST search found over 150 full-length homologs and over 90 half-size homologs consisting only of the N-terminal domain. The experimentally determined three-dimensional structure of the BT_1012 protein confirms its two-domain architecture, and structural analysis of both domains suggests their specific functions. The N-terminal domain is a putative catalytic domain with significant similarity to known glycoside hydrolases. The C-terminal domain has a beta-sandwich fold typically found in C-terminal domains of other glycosyl hydrolases; such domains, however, are typically involved in substrate binding. We describe the structure of the BT_1012 protein and discuss its sequence-structure relationship and possible functional implications. Conclusions Structural and sequence analyses of the BT_1012 protein identify it as a glycosyl hydrolase, expanding an already impressive catalog of enzymes involved in polysaccharide metabolism in Bacteroides spp. Based on this, we have renamed the Pfam families representing the two domains found in the BT_1012 protein, PF13204 and PF12904, as putative glycoside hydrolase and glycoside hydrolase-associated C-terminal domain, respectively. PMID:24742328
Li, You-Hai; Han, Wen-Jin; Gui, Xi-Wu; Wei, Tao; Tang, Shuang-Yan; Jin, Jian-Ming
2016-01-01
Tentoxin, a cyclic tetrapeptide produced by several Alternaria species, inhibits the F1-ATPase activity of chloroplasts, resulting in chlorosis in sensitive plants. In this study, we report two clustered genes, encoding a putative non-ribosome peptide synthetase (NRPS) TES and a cytochrome P450 protein TES1, that are required for tentoxin biosynthesis in Alternaria alternata strain ZJ33, which was isolated from blighted leaves of Eupatorium adenophorum. Using a pair of primers designed according to the consensus sequences of the adenylation domain of NRPSs, two fragments containing putative adenylation domains were amplified from A. alternata ZJ33, and subsequent PCR analyses demonstrated that these fragments belonged to the same NRPS coding sequence. With no introns, TES consists of a single 15,486 base pair open reading frame encoding a predicted 5161 amino acid protein. Meanwhile, the TES1 gene is predicted to contain five introns and encode a 506 amino acid protein. The TES protein is predicted to be comprised of four peptide synthase modules with two additional N-methylation domains, and the number and arrangement of the modules in TES were consistent with the number and arrangement of the amino acid residues of tentoxin, respectively. Notably, both TES and TES1 null mutants generated via homologous recombination failed to produce tentoxin. This study provides the first evidence concerning the biosynthesis of tentoxin in A. alternata. PMID:27490569
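The relationship between the reported ORF length and protein length for TES can be checked with simple codon arithmetic. This is a minimal sketch with a hypothetical helper name; it assumes the quoted 15,486 bp includes the stop codon, which the abstract does not state explicitly.

```python
def protein_length(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an intron-free ORF of orf_bp base pairs.

    Assumption: when includes_stop is True, the final codon is a stop codon
    and therefore does not contribute a residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


tes_residues = protein_length(15486)  # ORF length quoted in the abstract
```

With the stop codon counted, 15,486 bp gives 5,162 codons and 5,161 residues, matching the predicted protein size. The same check is not meaningful for TES1 from the abstract alone, because its five introns mean the genomic length would not equal the coding length.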
Liu, Min; Zhang, Zhongqi; Zang, Tianzhu; Spahr, Chris; Cheetham, Janet; Ren, Da; Sunny Zhou, Zhaohui
2013-01-01
Characterization of protein crosslinking, particularly without prior knowledge of the chemical nature and site of crosslinking, poses a significant challenge due to their intrinsic structural complexity and the lack of a comprehensive analytical approach. Towards this end, we have developed a generally applicable workflow, XChem-Finder, that involves four stages. (1) Detection of crosslinked peptides via 18O-labeling at C-termini. (2) Determination of the putative partial sequences of each crosslinked peptide pair using a fragment ion mass database search against known protein sequences coupled with a de novo sequence tag search. (3) Extension to full sequences based on protease specificity, the unique combination of mass, and other constraints. (4) Deduction of crosslinking chemistry and site. The mass difference between the sum of two putative full-length peptides and the crosslinked peptide provides the formulas (elemental composition analysis) for the functional groups involved in each cross-linking. Combined with sequence restraint from MS/MS data, plausible crosslinking chemistry and site were inferred, and ultimately, confirmed by matching with all data. Applying our approach to a stressed IgG2 antibody, ten cross-linked peptides were discovered and found to be connected via thioether originating from disulfides at locations that had not been previously recognized. Furthermore, once the crosslink chemistry was revealed, a targeted crosslink search yielded four additional crosslinked peptides that all contain the C-terminus of the light chain. PMID:23634697
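Stage (4) of the workflow rests on a mass-difference calculation, which can be illustrated with a short sketch. This is not the XChem-Finder implementation: the function name, the two-entry candidate table, the tolerance, and the example peptide masses are all assumptions made for illustration, and the peptide masses are assumed to be neutral monoisotopic masses of the free-thiol forms.

```python
# Monoisotopic atomic masses in daltons.
H = 1.007825
S = 31.972071

# Hypothetical candidate table: expected mass change relative to the two
# separate free-thiol peptides. A real search would consider many more chemistries.
CANDIDATES = {
    "disulfide (loss of 2 H)": -2 * H,
    "thioether (loss of H2S)": -(2 * H + S),
}


def infer_crosslink(mass_a: float, mass_b: float, mass_crosslinked: float,
                    tol: float = 0.01):
    """Return the observed mass difference and the candidate chemistries it matches."""
    delta = mass_crosslinked - (mass_a + mass_b)
    matches = [name for name, expected in CANDIDATES.items()
               if abs(delta - expected) <= tol]
    return delta, matches


# Made-up peptide masses, used only to exercise the function.
peptide_a, peptide_b = 1000.5000, 1500.7000
observed = peptide_a + peptide_b - (2 * H + S)
delta, matches = infer_crosslink(peptide_a, peptide_b, observed)
```

In this toy case the difference of about -33.99 Da matches only the thioether entry, the kind of linkage the abstract reports for the stressed IgG2 antibody.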
A novel, highly divergent ssDNA virus identified in Brazil infecting apple, pear and grapevine.
Basso, Marcos Fernando; da Silva, José Cleydson Ferreira; Fajardo, Thor Vinícius Martins; Fontes, Elizabeth Pacheco Batista; Zerbini, Francisco Murilo
2015-12-02
Fruit trees of temperate and tropical climates are of great economic importance worldwide and several viruses have been reported affecting their productivity and longevity. Fruit trees of different Brazilian regions displaying virus-like symptoms were evaluated for infection by circular DNA viruses. Seventy-four fruit trees were sampled and a novel, highly divergent, monopartite circular ssDNA virus was cloned from apple, pear and grapevine trees. Forty-five complete viral genomes were sequenced, with a size of approx. 3.4 kb and organized into five ORFs. Deduced amino acid sequences showed identities in the range of 38% with unclassified circular ssDNA viruses, nanoviruses and alphasatellites (putative Replication-associated protein, Rep), and begomo-, curto- and mastreviruses (putative coat protein, CP, and movement protein, MP). A large intergenic region contains a short palindromic sequence capable of forming a hairpin-like structure with the loop sequence TAGTATTAC, identical to the conserved nonanucleotide of circoviruses, nanoviruses and alphasatellites. Recombination events were not detected and phylogenetic analysis showed a relationship with circo-, nano- and geminiviruses. PCR confirmed the presence of this novel ssDNA virus in field plants. Infectivity tests using the cloned viral genome confirmed its ability to infect apple and pear tree seedlings, but not Nicotiana benthamiana. The name "Temperate fruit decay-associated virus" (TFDaV) is proposed for this novel virus. Copyright © 2015 Elsevier B.V. All rights reserved.
ORF157 from the Archaeal Virus Acidianus Filamentous Virus 1 Defines a New Class of Nuclease
Goulet, Adeline; Pina, Mery; Redder, Peter; Prangishvili, David; Vera, Laura; Lichière, Julie; Leulliot, Nicolas; van Tilbeurgh, Herman; Ortiz-Lombardia, Miguel; Campanacci, Valérie; Cambillau, Christian
2010-01-01
Acidianus filamentous virus 1 (AFV1) (Lipothrixviridae) is an enveloped filamentous virus that was characterized from a crenarchaeal host. It infects Acidianus species that thrive in the acidic hot springs (>85°C and pH <3) of Yellowstone National Park, WY. The AFV1 20.8-kb, linear, double-stranded DNA genome encodes 40 putative open reading frames whose sequences generally show little similarity to other genes in the sequence databases. Because three-dimensional structures are more conserved than sequences and hence are more effective at revealing function, we set out to determine protein structures from putative AFV1 open reading frames (ORF). The crystal structure of ORF157 reveals an α+β protein with a novel fold that remotely resembles the nucleotidyltransferase topology. In vitro, AFV1-157 displays a nuclease activity on linear double-stranded DNA. Alanine substitution mutations demonstrated that E86 is essential to catalysis. AFV1-157 represents a novel class of nuclease, but its exact role in vivo remains to be determined. PMID:20200253
Charles, Jermilia; Firth, Andrew E.; Loroño-Pino, Maria A.; Garcia-Rejon, Julian E.; Farfan-Ale, Jose A.; Lipkin, W. Ian; Briese, Thomas
2016-01-01
Sequences corresponding to a putative, novel rhabdovirus [designated Merida virus (MERDV)] were initially detected in a pool of Culex quinquefasciatus collected in the Yucatan Peninsula of Mexico. The entire genome was sequenced, revealing 11 798 nt and five major ORFs, which encode the nucleoprotein (N), phosphoprotein (P), matrix protein (M), glycoprotein (G) and RNA-dependent RNA polymerase (L). The deduced amino acid sequences of the N, G and L proteins have no more than 24, 38 and 43 % identity, respectively, to the corresponding sequences of all other known rhabdoviruses, whereas those of the P and M proteins have no significant identity with any sequences in GenBank and their identity is only suggested based on their genome position. Using specific reverse transcription-PCR assays established from the genome sequence, 27 571 C. quinquefasciatus which had been sorted in 728 pools were screened to assess the prevalence of MERDV in nature and 25 pools were found positive. The minimal infection rate (calculated as the number of positive mosquito pools per 1000 mosquitoes tested) was 0.9, and similar for both females and males. Screening another 140 pools of 5484 mosquitoes belonging to four other genera identified positive pools of Ochlerotatus spp. mosquitoes, indicating that the host range is not restricted to C. quinquefasciatus. Attempts to isolate MERDV in C6/36 and Vero cells were unsuccessful. In summary, we provide evidence that a previously undescribed rhabdovirus occurs in mosquitoes in Mexico. PMID:26868915
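The minimal infection rate quoted above follows directly from the definition given in the abstract (positive pools per 1000 mosquitoes tested). A minimal sketch, with a hypothetical function name:

```python
def minimal_infection_rate(positive_pools: int, mosquitoes_tested: int) -> float:
    """Positive pools per 1000 mosquitoes tested, as defined in the abstract."""
    if mosquitoes_tested <= 0:
        raise ValueError("mosquitoes_tested must be positive")
    return 1000 * positive_pools / mosquitoes_tested


# Figures from the abstract: 25 positive pools among 27,571 C. quinquefasciatus.
mir = minimal_infection_rate(25, 27571)
```

Rounded to one decimal place this gives 0.9, consistent with the reported value. The calculation treats each positive pool as containing one infected mosquito, which is why it is a lower bound on the true infection rate.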
Rapid Identification of Sequences for Orphan Enzymes to Power Accurate Protein Annotation
Ojha, Sunil; Watson, Douglas S.; Bomar, Martha G.; Galande, Amit K.; Shearer, Alexander G.
2013-01-01
The power of genome sequencing depends on the ability to understand what those genes and their protein products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and sequence. Unfortunately, this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the “back catalog” of enzymology – “orphan enzymes,” those enzymes that have been characterized and yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate how this orphan enzyme “back catalog” is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass-spectrometry based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identifications. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be efficiently applied on a larger scale to make enzymology’s “back catalog” another powerful tool to drive accurate genome annotation. PMID:24386392
Rapid identification of sequences for orphan enzymes to power accurate protein annotation.
Ramkissoon, Kevin R; Miller, Jennifer K; Ojha, Sunil; Watson, Douglas S; Bomar, Martha G; Galande, Amit K; Shearer, Alexander G
2013-01-01
The power of genome sequencing depends on the ability to understand what those genes and their protein products actually do. The automated methods used to assign functions to putative proteins in newly sequenced organisms are limited by the size of our library of proteins with both known function and sequence. Unfortunately, this library grows slowly, lagging well behind the rapid increase in novel protein sequences produced by modern genome sequencing methods. One potential source for rapidly expanding this functional library is the "back catalog" of enzymology--"orphan enzymes," those enzymes that have been characterized and yet lack any associated sequence. There are hundreds of orphan enzymes in the Enzyme Commission (EC) database alone. In this study, we demonstrate how this orphan enzyme "back catalog" is a fertile source for rapidly advancing the state of protein annotation. Starting from three orphan enzyme samples, we applied mass-spectrometry based analysis and computational methods (including sequence similarity networks, sequence and structural alignments, and operon context analysis) to rapidly identify the specific sequence for each orphan while avoiding the most time- and labor-intensive aspects of typical sequence identifications. We then used these three new sequences to more accurately predict the catalytic function of 385 previously uncharacterized or misannotated proteins. We expect that this kind of rapid sequence identification could be efficiently applied on a larger scale to make enzymology's "back catalog" another powerful tool to drive accurate genome annotation.
NemaPath: online exploration of KEGG-based metabolic pathways for nematodes
Wylie, Todd; Martin, John; Abubucker, Sahar; Yin, Yong; Messina, David; Wang, Zhengyuan; McCarter, James P; Mitreva, Makedonka
2008-01-01
Background Nematode.net is a web-accessible resource for investigating gene sequences from parasitic and free-living nematode genomes. Beyond the well-characterized model nematode C. elegans, over 500,000 expressed sequence tags (ESTs) and nearly 600,000 genome survey sequences (GSSs) have been generated from 36 nematode species as part of the Parasitic Nematode Genomics Program undertaken by the Genome Center at Washington University School of Medicine. However, these sequencing data are not present in most publicly available protein databases, which only include sequences in Swiss-Prot. Swiss-Prot, in turn, relies on GenBank/EMBL/DDBJ for predicted proteins from complete genomes or full-length proteins. Description Here we present the NemaPath pathway server, a web-based pathway-level visualization tool for navigating putative metabolic pathways for over 30 nematode species, including 27 parasites. The NemaPath approach consists of two parts: 1) a backend tool to align and evaluate nematode genomic sequences (curated EST contigs) against the annotated Kyoto Encyclopedia of Genes and Genomes (KEGG) protein database; 2) a web viewing application that displays annotated KEGG pathway maps based on desired confidence levels of primary sequence similarity as defined by a user. NemaPath also provides cross-referenced access to nematode genome information provided by other tools available on Nematode.net, including: detailed NemaGene EST cluster information; putative translations; GBrowse EST cluster views; links from nematode data to external databases for corresponding synonymous C. elegans counterparts, subject matches in KEGG's gene database, and also KEGG Ontology (KO) identification. Conclusion The NemaPath server hosts metabolic pathway mappings for 30 nematode species and is available on the World Wide Web at . 
The nematode source sequences used for the metabolic pathway mappings are available via FTP , as provided by the Genome Center at Washington University School of Medicine. PMID:18983679
Ponce, Dalia; Brinkman, Diane L; Potriquet, Jeremy; Mulvenna, Jason
2016-04-05
Jellyfish venoms are rich sources of toxins designed to capture prey or deter predators, but they can also elicit harmful effects in humans. In this study, an integrated transcriptomic and proteomic approach was used to identify putative toxins and their potential role in the venom of the scyphozoan jellyfish Chrysaora fuscescens. A de novo tentacle transcriptome, containing more than 23,000 contigs, was constructed and used in proteomic analysis of C. fuscescens venom to identify potential toxins. From a total of 163 proteins identified in the venom proteome, 27 were classified as putative toxins and grouped into six protein families: proteinases, venom allergens, C-type lectins, pore-forming toxins, glycoside hydrolases and enzyme inhibitors. Other putative toxins identified in the transcriptome, but not the proteome, included additional proteinases as well as lipases and deoxyribonucleases. Sequence analysis also revealed the presence of ShKT domains in two putative venom proteins from the proteome and an additional 15 from the transcriptome, suggesting potential ion channel blockade or modulatory activities. Comparison of these potential toxins to those from other cnidarians provided insight into their possible roles in C. fuscescens venom and an overview of the diversity of potential toxin families in cnidarian venoms.
Comparative Analysis of Predicted Plastid-Targeted Proteomes of Sequenced Higher Plant Genomes
Schaeffer, Scott; Harper, Artemus; Raja, Rajani; Jaiswal, Pankaj; Dhingra, Amit
2014-01-01
Plastids are actively involved in numerous plant processes critical to growth, development and adaptation. They play a primary role in photosynthesis, pigment and monoterpene synthesis, gravity sensing, starch and fatty acid synthesis, as well as oil and protein storage. We applied two complementary methods to analyze the recently published apple genome (Malus × domestica) to identify putative plastid-targeted proteins, the first using TargetP and the second using a custom workflow utilizing a set of predictive programs. Apple shares roughly 40% of its 10,492 putative plastid-targeted proteins with the Arabidopsis (Arabidopsis thaliana) plastid-targeted proteome as identified by the Chloroplast 2010 project, and ∼57% of its entire proteome with Arabidopsis. This suggests that the plastid-targeted proteomes of apple and Arabidopsis are different, and interestingly alludes to the presence of differential targeting of homologs between the two species. Co-expression analysis of 2,224 genes encoding putative plastid-targeted apple proteins suggests that they play a role in plant developmental and intermediary metabolism. Further, an inter-specific comparison of Arabidopsis, Prunus persica (Peach), Malus × domestica (Apple), Populus trichocarpa (Black cottonwood), Fragaria vesca (Woodland Strawberry), Solanum lycopersicum (Tomato) and Vitis vinifera (Grapevine) also identified a large number of novel species-specific plastid-targeted proteins. This analysis also revealed the presence of alternatively targeted homologs across species. Two separate analyses revealed that a small subset of proteins, one representing 289 protein clusters and the other 737 unique protein sequences, are conserved between seven plastid-targeted angiosperm proteomes. The majority of the novel proteins were annotated to play roles in stress response, transport, catabolic processes, and cellular component organization. Our results suggest that the current state of knowledge regarding plastid biology, based predominantly on model systems, is deficient. New plant genomes are expected to enable the identification of potentially new plastid-targeted proteins that will aid in studying novel roles of plastids. PMID:25393533
In Silico Pattern-Based Analysis of the Human Cytomegalovirus Genome
Rigoutsos, Isidore; Novotny, Jiri; Huynh, Tien; Chin-Bow, Stephen T.; Parida, Laxmi; Platt, Daniel; Coleman, David; Shenk, Thomas
2003-01-01
More than 200 open reading frames (ORFs) from the human cytomegalovirus genome have been reported as potentially coding for proteins. We have used two pattern-based in silico approaches to analyze this set of putative viral genes. With the help of an objective annotation method that is based on the Bio-Dictionary, a comprehensive collection of amino acid patterns that describes the currently known natural sequence space of proteins, we have reannotated all of the previously reported putative genes of the human cytomegalovirus. Also, with the help of MUSCA, a pattern-based multiple sequence alignment algorithm, we have reexamined the original human cytomegalovirus gene family definitions. Our analysis of the genome shows that many of the coded proteins comprise amino acid combinations that are unique to either the human cytomegalovirus or the larger group of herpesviruses. We have confirmed that a surprisingly large portion of the analyzed ORFs encode membrane proteins, and we have discovered a significant number of previously uncharacterized proteins that are predicted to be G-protein-coupled receptor homologues. The analysis also indicates that many of the encoded proteins undergo posttranslational modifications such as hydroxylation, phosphorylation, and glycosylation. ORFs encoding proteins with similar functional behavior appear in neighboring regions of the human cytomegalovirus genome. All of the results of the present study can be found and interactively explored online (http://cbcsrv.watson.ibm.com/virus/). PMID:12634390
Winokur, S T; Shiang, R
1998-11-01
The TCOF1 gene product, treacle, responsible for the craniofacial disorder Treacher Collins syndrome, has been predicted to be a member of a class of nucleolar phosphoproteins based on its primary amino acid sequence. Treacle is a low complexity protein with ten repeating units of acidic and basic residues, each of which contains a large number of putative casein kinase 2 and protein kinase C phosphorylation sites. In addition, the C-terminus of treacle contains multiple putative nuclear localization signals. The overall structure of treacle, as well as sequence similarity to several nucleolar phosphoproteins, predicts that treacle is a member of this class of proteins. Using green fluorescent protein fusion constructs with the full-length and deleted domains of the murine homolog of treacle, we demonstrate that the cellular localization of treacle is nucleolar. This localization is mediated by the last 41 residues of the C-terminus (residues 1262-1302). At least two functional nuclear localization signals have been identified in the protein, one between residues 1176 and 1270 and the second within the last 32 residues of the protein (1271-1302). The nucleolar localization signal is disrupted by two constructs that split the C-terminal region between residues 1270 and 1271. This study provides the first direct analysis of treacle and demonstrates that the protein involved in TCOF1 is a nucleolar protein.
In silico pattern-based analysis of the human cytomegalovirus genome.
Rigoutsos, Isidore; Novotny, Jiri; Huynh, Tien; Chin-Bow, Stephen T; Parida, Laxmi; Platt, Daniel; Coleman, David; Shenk, Thomas
2003-04-01
More than 200 open reading frames (ORFs) from the human cytomegalovirus genome have been reported as potentially coding for proteins. We have used two pattern-based in silico approaches to analyze this set of putative viral genes. With the help of an objective annotation method that is based on the Bio-Dictionary, a comprehensive collection of amino acid patterns that describes the currently known natural sequence space of proteins, we have reannotated all of the previously reported putative genes of the human cytomegalovirus. Also, with the help of MUSCA, a pattern-based multiple sequence alignment algorithm, we have reexamined the original human cytomegalovirus gene family definitions. Our analysis of the genome shows that many of the coded proteins comprise amino acid combinations that are unique to either the human cytomegalovirus or the larger group of herpesviruses. We have confirmed that a surprisingly large portion of the analyzed ORFs encode membrane proteins, and we have discovered a significant number of previously uncharacterized proteins that are predicted to be G-protein-coupled receptor homologues. The analysis also indicates that many of the encoded proteins undergo posttranslational modifications such as hydroxylation, phosphorylation, and glycosylation. ORFs encoding proteins with similar functional behavior appear in neighboring regions of the human cytomegalovirus genome. All of the results of the present study can be found and interactively explored online (http://cbcsrv.watson.ibm.com/virus/).
Fanning, T; Singer, M
1987-01-01
Recent work suggests that one or more members of the highly repeated LINE-1 (L1) DNA family found in all mammals may encode one or more proteins. Here we report the sequence of a portion of an L1 cloned from the domestic cat (Felis catus). These data permit comparison of the L1 sequences in four mammalian orders (Carnivore, Lagomorph, Rodent and Primate) and the comparison supports the suggested coding potential. In two separate, noncontiguous regions in the carboxy terminal half of the proteins predicted from the DNA sequences, there are several strongly conserved segments. In one region, these share homology with known or suspected reverse transcriptases, as described by others in rodents and primates. In the second region, closer to the carboxy terminus, the strongly conserved segments are over 90% homologous among the four orders. One of the latter segments is cysteine rich and resembles the putative metal binding domains of nucleic acid binding proteins, including those of TFIIIA and retroviruses. PMID:3562227
Wan, Xuehua; Darris, Maxwell; Hou, Shaobin; Donachie, Stuart P
2017-10-19
Most of the 24 known Chitinophaga species were originally isolated from soils. We report the draft genome sequence of a putatively novel Chitinophaga sp. from a biofilm in an air conditioner condensate pipe. The genome comprises 7,661,303 bp in one scaffold, 5,694 predicted protein-coding sequences, and a G+C content of 47.6%. Copyright © 2017 Wan et al.
Determining Zebrafish Epitope Reactivity to Commercially Available Antibodies.
Villarreal, Michael A; Biediger, Nicole M; Bonner, Natalie A; Miller, Jennifer N; Zepeda, Samantha K; Ricard, Benjamin J; García, Dana M; Lewis, Karen A
2017-08-01
Antibodies raised against mammalian proteins may exhibit cross-reactivity with zebrafish proteins, making these antibodies useful for fish studies. However, zebrafish may express multiple paralogues of similar sequence and size, making them difficult to distinguish by traditional Western blot analysis. To identify the zebrafish proteins that are recognized by an antimammalian antibody, we developed a system to screen putative epitopes by cloning the sequences between the yeast SUMO protein and a C-terminal 6xHis tag. The recombinant fusion protein was expressed in Escherichia coli and analyzed by Western blot to conclusively identify epitopes that exhibit cross-reactivity with the antibodies of interest. This approach can be used to determine the species cross-reactivity and epitope specificity of a wide variety of peptide antigen-derived antibodies.
Hyndman, Timothy H; Marschang, Rachel E; Wellehan, James F X; Nicholls, Philip K
2012-10-01
This paper describes the isolation and molecular identification of a novel paramyxovirus found during an investigation of an outbreak of neurorespiratory disease in a collection of Australian pythons. Using Illumina® high-throughput sequencing, a 17,187 nucleotide sequence was assembled from RNA extracts from infected viper heart cells (VH2) displaying widespread cytopathic effects in the form of multinucleate giant cells. The sequence appears to contain all the coding regions of the genome, including the following predicted paramyxoviral open reading frames (ORFs): 3'--Nucleocapsid (N)--putative Phosphoprotein (P)--Matrix (M)--Fusion (F)--putative attachment protein--Polymerase (L)--5'. There is also a 540 nucleotide ORF between the N and putative P genes that may be an additional coding region. Phylogenetic analyses of the complete N, M, F and L genes support the clustering of this virus within the family Paramyxoviridae but outside both of the current subfamilies: Paramyxovirinae and Pneumovirinae. We propose to name this new virus, Sunshine virus, after the geographic origin of the first isolate--the Sunshine Coast of Queensland, Australia. Copyright © 2012 Elsevier B.V. All rights reserved.
High-Molecular-Mass Multi-c-Heme Cytochromes from Methylococcus capsulatus Bath
Bergmann, David J.; Zahn, James A.; DiSpirito, Alan A.
1999-01-01
The polypeptide and structural gene for a high-molecular-mass c-type cytochrome, cytochrome c553O, was isolated from the methanotroph Methylococcus capsulatus Bath. Cytochrome c553O is a homodimer with a subunit molecular mass of 124,350 Da and an isoelectric point of 6.0. The heme c concentration was estimated to be 8.2 ± 0.4 mol of heme c per subunit. The electron paramagnetic resonance spectrum showed the presence of multiple low spin, S = 1/2, hemes. A degenerate oligonucleotide probe synthesized based on the N-terminal amino acid sequence of cytochrome c553O was used to identify a DNA fragment from M. capsulatus Bath that contains occ, the gene encoding cytochrome c553O. occ is part of a gene cluster which contains three other open reading frames (ORFs). ORF1 encodes a putative periplasmic c-type cytochrome with a molecular mass of 118,620 Da that shows approximately 40% amino acid sequence identity with occ and contains nine c-heme-binding motifs. ORF3 encodes a putative periplasmic c-type cytochrome with a molecular mass of 94,000 Da and contains seven c-heme-binding motifs but shows no sequence homology to occ or ORF1. ORF4 encodes a putative 11,100-Da protein. The four ORFs have no apparent similarity to any proteins in the GenBank database. The subunit molecular masses, arrangement and number of hemes, and amino acid sequences demonstrate that cytochrome c553O and the gene products of ORF1 and ORF3 constitute a new class of c-type cytochrome. PMID:9922265
Mapping HLA-A2, -A3 and -B7 supertype-restricted T-cell epitopes in the ebolavirus proteome.
Lim, Wan Ching; Khan, Asif M
2018-01-19
Ebolavirus (EBOV) is responsible for one of the most fatal diseases encountered by mankind. Cellular T-cell responses have been implicated to be important in providing protection against the virus. Antigenic variation can result in viral escape from immune recognition. Mapping targets of immune responses among the sequence of viral proteins is, thus, an important first step towards understanding the immune responses to viral variants and can aid in the identification of vaccine targets. Herein, we performed a large-scale, proteome-wide mapping and diversity analyses of putative HLA supertype-restricted T-cell epitopes of Zaire ebolavirus (ZEBOV), the most pathogenic species among the EBOV family. All publicly available ZEBOV sequences (14,098) for each of the nine viral proteins were retrieved, removed of irrelevant and duplicate sequences, and aligned. The overall proteome diversity of the non-redundant sequences was studied by use of Shannon's entropy. The sequences were predicted, by use of the NetCTLpan server, for HLA-A2, -A3, and -B7 supertype-restricted epitopes, which are relevant to African and other ethnicities and provide for large (~86%) population coverage. The predicted epitopes were mapped to the alignment of each protein for analyses of antigenic sequence diversity and relevance to structure and function. The putative epitopes were validated by comparison with experimentally confirmed epitopes. ZEBOV proteome was generally conserved, with an average entropy of 0.16. The 185 HLA supertype-restricted T-cell epitopes predicted (82 (A2), 37 (A3) and 66 (B7)) mapped to 125 alignment positions and covered ~24% of the proteome length. Many of the epitopes showed a propensity to co-localize at select positions of the alignment. Thirty (30) of the mapped positions were completely conserved and may be attractive for vaccine design. The remaining (95) positions had one or more epitopes, with or without non-epitope variants. 
A significant number (24) of the putative epitopes matched reported experimentally validated HLA ligands/T-cell epitopes of A2, A3 and/or B7 supertype representative allele restrictions. The epitopes generally corresponded to functional motifs/domains and there was no correlation to localization on the protein 3D structure. These data and the epitope map provide important insights into the interaction between EBOV and the host immune system.
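The diversity measure mentioned in this abstract, Shannon's entropy, can be illustrated with a minimal sketch. The abstract does not say whether entropy was computed per alignment column or over peptide windows, so the per-column form below, the gap handling, and the function names `column_entropy` and `alignment_entropy` are all assumptions made for illustration, not the authors' pipeline.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column.

    Gap characters ('-') are ignored; this is an illustrative choice,
    not something stated in the abstract.
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropy(aligned_seqs):
    """Per-column entropies for a list of equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*aligned_seqs)]


# Toy alignment (hypothetical sequences, not ZEBOV data).
toy = ["MKVL", "MKIL", "MRVL", "MKVL"]
entropies = alignment_entropy(toy)
mean_entropy = sum(entropies) / len(entropies)
```

A fully conserved column gives an entropy of zero, and the value rises as more residue variants appear at a position, which is why a low average entropy is read as a conserved proteome.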
Molecular cloning of chitinase 33 (chit33) gene from Trichoderma atroviride
Matroudi, S.; Zamani, M.R.; Motallebi, M.
2008-01-01
In this study, Trichoderma atroviride was selected as an overproducer of chitinase among 30 different isolates of Trichoderma sp. on the basis of chitinase specific activity. From this isolate the genomic and cDNA clones encoding chit33 have been isolated and sequenced. Comparison of the genomic and cDNA sequences to define the gene structure indicates that this gene contains three short introns and an open reading frame coding for a protein of 321 amino acids. The deduced amino acid sequence includes a 19 aa putative signal peptide. Homology between this sequence and other reported Trichoderma Chit33 proteins is discussed. The coding sequence of the chit33 gene was cloned in the pET26b(+) expression vector and expressed in E. coli. PMID:24031242
Xu, Shou Ling; Shen, Si Shi; Xu, Zhi Hong; Xue, Hong Wei
2002-12-01
Abscisic acid (ABA) is critical in plant seed development and in responses to environmental factors such as stress. To study possible ABA-related signal transduction pathways, we sought to isolate ABA-regulated genes by fluorescent differential display PCR (FDD-PCR) using rice seedlings treated with ABA for 2, 4, 8 and 12 h. Of the 17 fragments isolated, 14 were up-regulated and 3 were down-regulated. Sequence analyses revealed that the encoded proteins were involved in photosynthesis (7 fragments), signal transduction (1 fragment), transcription (2 fragments), and metabolism and resistance (6 fragments), or were of unknown function (1 fragment). Three clones, encoding a putative alpha/beta hydrolase fold protein, a putative vacuolar H+-ATPase B subunit and a putative tyrosine phosphatase, were confirmed by RT-PCR and northern blot analysis to be regulated by ABA treatment. FDD-PCR and possible functional mechanisms of ABA are discussed.
An approach to large scale identification of non-obvious structural similarities between proteins
Cherkasov, Artem; Jones, Steven JM
2004-01-01
Background A new sequence independent bioinformatics approach allowing genome-wide search for proteins with similar three dimensional structures has been developed. By utilizing the numerical output of the sequence threading it establishes putative non-obvious structural similarities between proteins. When applied to the testing set of proteins with known three dimensional structures the developed approach was able to recognize structurally similar proteins with high accuracy. Results The method has been developed to identify pathogenic proteins with low sequence identity and high structural similarity to host analogues. Such protein structure relationships would be hypothesized to arise through convergent evolution or through ancient horizontal gene transfer events, now undetectable using current sequence alignment techniques. The pathogen proteins, which could mimic or interfere with host activities, would represent candidate virulence factors. The developed approach utilizes the numerical outputs from the sequence-structure threading. It identifies the potential structural similarity between a pair of proteins by correlating the threading scores of the corresponding two primary sequences against the library of the standard folds. This approach allowed up to 64% sensitivity and 99.9% specificity in distinguishing protein pairs with high structural similarity. Conclusion Preliminary results obtained by comparison of the genomes of Homo sapiens and several strains of Chlamydia trachomatis have demonstrated the potential usefulness of the method in the identification of bacterial proteins with known or potential roles in virulence. PMID:15147578
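The core idea described above, comparing two proteins by correlating their threading scores against a shared fold library, can be sketched as follows. The abstract does not name the correlation coefficient or the scoring scale, so the use of Pearson correlation, the function names, and the toy score vectors are assumptions for illustration only. The second function shows how sensitivity and specificity figures such as those quoted are conventionally computed from predicted versus known pair labels.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length score profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no ranking information
    return cov / math.sqrt(var_x * var_y)


def sensitivity_specificity(predicted, actual):
    """Sensitivity and specificity from parallel lists of booleans."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    tn = sum((not p) and (not a) for p, a in zip(predicted, actual))
    fp = sum(p and (not a) for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical threading scores of three proteins against four library folds.
profile_a = [1.0, 5.0, 2.0, 8.0]
profile_b = [2.0, 10.0, 4.0, 16.0]   # same fold preferences as profile_a
profile_c = [8.0, 2.0, 5.0, 1.0]     # opposite fold preferences

r_ab = pearson(profile_a, profile_b)
r_ac = pearson(profile_a, profile_c)
sens, spec = sensitivity_specificity(
    [True, True, False, False], [True, False, False, False]
)
```

Under this sketch, a pair would be flagged as structurally similar when the correlation exceeds some threshold; the threshold itself is not given in the abstract and would have to be tuned on proteins of known structure.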
Becker, Y; Asher, Y; Tabor, E; Davidson, I; Malkinson, M
1994-01-01
A DNA segment of the MDV-1 BamHI-D fragment was sequenced, and the open reading frames (ORFs) present in the 4556 nucleotide fragment were analyzed by computer programs. Computer analysis identified 19 putative ORFs in the sequence ranging from a coding capacity of 37 amino acids (aa) (ORF-1a) to 684aa (ORF-1). The special properties of four ORFs (1a, 1, 2, and 3) were investigated. Two adjacent ORFs, ORF-1a and ORF-1, were found by computer analysis to have the properties of two exons encoding a glycoprotein: ORF-1a encodes an aa sequence with the properties of a signal peptide, and ORF-1 encodes a polypeptide with a membrane anchor domain and putative N-glycosylation sites in the aa sequence. ORF-1a and ORF-1 were found to be transcribed in MDV-1-infected cells. Two RNA transcripts were detected: a precursor RNA and its spliced form. Both are transcribed from a promoter located 5' to ORF-1a, and splice donor and acceptor sites are used to splice the mRNA after cleavage of a 71-nucleotide sequence. This finding suggests that ORF-1a and ORF-1 are two exons of a new MDV-1 glycoprotein gene. The DNA sequence containing ORF-1 was transiently expressed in COS-1 cells, and the viral protein produced in these cells was found to react with anti-MDV serotype-1 Antigen B-specific monoclonal antibodies. These studies indicate that the protein encoded by ORF-1 has antigenic properties resembling Antigen B of MDV-1. A gene homologous to ORF-1 was detected in the genome of both MDV-2(SB1) and MDV-3(HVT), which serve as commercial vaccine strains. Two additional ORFs were noted in the 4556 nucleotide sequence: ORF-2, which encodes a 333 aa polypeptide initiating in the UL and terminating in the TRL prior to the putative origin of replication, and ORF-3, which encodes a 155 aa polypeptide that is partly homologous to the phosphoprotein pp38 encoded by the BamHI-H sequence. 
The 65 N-terminal aa of the two gene products are identical, both being derived from the nucleotide sequences in the TRL and IRL, respectively. Additional homologous aa sequences are the hydrophobic aa domain in the middle of both proteins. The functions of ORF-2, ORF-3, and additional ORFs are under study.
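The computer-based ORF identification mentioned in this abstract is not described in detail, so the following is only a generic sketch of how putative ORFs can be scanned for in a nucleotide sequence. It assumes a simple ATG-to-stop definition on the forward strand; the function name `find_orfs`, the `min_codons` parameter, and the toy sequence are hypothetical and do not reproduce the programs used in the study.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end, frame) tuples for ATG...stop ORFs.

    Only the three forward-strand reading frames are scanned.
    `start` is the index of the A in ATG, `end` is one past the stop codon,
    and `min_codons` counts coding codons (the stop codon is excluded).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


# Toy sequence (hypothetical): one ORF, ATG AAA TTT TAG, in frame 2.
toy_orfs = find_orfs("CCATGAAATTTTAGGG")
```

A fuller analysis would also scan the reverse complement and would typically apply a larger minimum length; both are omitted here to keep the sketch short.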
de Castro, Minique Hilda; de Klerk, Daniel; Pienaar, Ronel; Rees, D Jasper G; Mans, Ben J
2017-08-10
Ticks secrete a diverse mixture of secretory proteins into the host to evade its immune response and facilitate blood-feeding, making secretory proteins attractive targets for the production of recombinant anti-tick vaccines. The largely neglected tick species, Rhipicephalus zambeziensis, is an efficient vector of Theileria parva in southern Africa but its available sequence information is limited. Next generation sequencing has advanced sequence availability for ticks in recent years and has assisted the characterisation of secretory proteins. This study focused on the de novo assembly and annotation of the salivary gland transcriptome of R. zambeziensis and the temporal expression of secretory protein transcripts in female and male ticks, before the onset of feeding and during early and late feeding. The sialotranscriptome of R. zambeziensis yielded 23,631 transcripts from which 13,584 non-redundant proteins were predicted. Eighty-six percent of these contained a predicted start and stop codon and were estimated to be putatively full-length proteins. A fifth (2569) of the predicted proteins were annotated as putative secretory proteins and accounted for 52% of the expression in the transcriptome. Expression analyses revealed that 2832 transcripts were differentially expressed among feeding time points and 1209 between the tick sexes. The expression analyses further indicated that 57% of the annotated secretory protein transcripts were differentially expressed. Dynamic expression profiles of secretory protein transcripts were observed during feeding of female ticks: a number of transcripts were upregulated during early feeding, presumably for feeding site establishment, and 52% of these were then downregulated during late feeding, indicating that transcripts were required at specific feeding stages. This suggested that secretory proteins are under stringent transcriptional regulation that fine-tunes their expression in salivary glands during feeding. 
No open reading frames were predicted for 7947 transcripts. This class represented 17% of the differentially expressed transcripts, suggesting a potential transcriptional regulatory function of long non-coding RNA in tick blood-feeding. The assembled sialotranscriptome greatly expands the sequence availability of R. zambeziensis, assists in our understanding of the transcription of secretory proteins during blood-feeding and will be a valuable resource for future vaccine candidate selection.
Genome-Wide Analysis of Corynespora cassiicola Leaf Fall Disease Putative Effectors
Lopez, David; Ribeiro, Sébastien; Label, Philippe; Fumanal, Boris; Venisse, Jean-Stéphane; Kohler, Annegret; de Oliveira, Ricardo R.; Labutti, Kurt; Lipzen, Anna; Lail, Kathleen; Bauer, Diane; Ohm, Robin A.; Barry, Kerrie W.; Spatafora, Joseph; Grigoriev, Igor V.; Martin, Francis M.; Pujade-Renaud, Valérie
2018-01-01
Corynespora cassiicola is an Ascomycetes fungus with a broad host range and diverse life styles. Mostly known as a necrotrophic plant pathogen, it has also been associated with rare cases of human infection. In the rubber tree, this fungus causes the Corynespora leaf fall (CLF) disease, which increasingly affects natural rubber production in Asia and Africa. It has also been found as an endophyte in South American rubber plantations where no CLF outbreak has yet occurred. The C. cassiicola species is genetically highly diverse, but no clear relationship has been evidenced between phylogenetic lineage and pathogenicity. Cassiicolin, a small glycosylated secreted protein effector, is thought to be involved in the necrotrophic interaction with the rubber tree but some virulent C. cassiicola isolates do not have a cassiicolin gene. This study set out to identify other putative effectors involved in CLF. The genome of a highly virulent C. cassiicola isolate from the rubber tree (CCP) was sequenced and assembled. In silico prediction revealed 2870 putative effectors, comprising CAZymes, lipases, peptidases, secreted proteins and enzymes associated with secondary metabolism. Comparison with the genomes of 44 other fungal species, focusing on effector content, revealed a striking proximity with phylogenetically unrelated species (Colletotrichum acutatum, Colletotrichum gloeosporioides, Fusarium oxysporum, Nectria haematococca, and Botryosphaeria dothidea) sharing life style plasticity and broad host range. Candidate effectors involved in the compatible interaction with the rubber tree were identified by transcriptomic analysis. Differentially expressed genes included 92 putative effectors, among them cassiicolin and two other secreted singleton proteins. Finally, the genomes of 35 C. cassiicola isolates representing the genetic diversity of the species were sequenced and assembled, and putative effectors were identified. 
At the intraspecific level, effector-based classification was found to be highly consistent with the phylogenomic trees. Identification of lineage-specific effectors is a key step toward understanding C. cassiicola virulence and host specialization mechanisms. PMID:29551995
Meitinger, T; Meindl, A; Bork, P; Rost, B; Sander, C; Haasemann, M; Murken, J
1993-12-01
The X-linked gene for Norrie disease, which is characterized by blindness, deafness and mental retardation, has been cloned recently. This gene has been thought to code for a putative extracellular factor; its predicted amino acid sequence is homologous to the C-terminal domain of diverse extracellular proteins. Sequence pattern searches and three-dimensional modelling now suggest that the Norrie disease protein (NDP) has a tertiary structure similar to that of transforming growth factor beta (TGF beta). Our model identifies NDP as a member of an emerging family of growth factors containing a cystine knot motif, with direct implications for the physiological role of NDP. The model also sheds light on sequence-related domains such as the C-terminal domain of mucins and of von Willebrand factor.
An insight into the sialome of the blood-sucking bug Triatoma infestans, a vector of Chagas' disease
Assumpção, Teresa C. F.; Francischetti, Ivo M. B.; Andersen, John F.; Schwarz, Alexandra; Santana, Jaime M.; Ribeiro, José M. C.
2008-01-01
Triatoma infestans is a hemipteran vector of Chagas’ disease that feeds exclusively on vertebrate blood in all life stages. Hematophagous insects’ salivary glands (SG) produce potent pharmacological compounds that counteract host hemostasis, including anti-clotting, anti-platelet, and vasodilatory molecules. To obtain further insight into the salivary biochemical and pharmacological complexity of this insect, a cDNA library from its salivary glands was randomly sequenced. In addition, salivary proteins were subjected to two-dimensional gel (2D-gel) electrophoresis followed by MS analysis. We present the analysis of a set of 1,534 SG cDNA sequences, 645 of which coded for proteins of a putative secretory nature. Most salivary proteins described as lipocalins matched peptide sequences obtained from proteomic results. PMID:18207082
Tetrahymena thermophila acidic ribosomal protein L37 contains an archaebacterial type of C-terminus.
Hansen, T S; Andreasen, P H; Dreisig, H; Højrup, P; Nielsen, H; Engberg, J; Kristiansen, K
1991-09-15
We have cloned and characterized a Tetrahymena thermophila macronuclear gene (L37) encoding the acidic ribosomal protein (A-protein) L37. The gene contains a single intron located in the 3'-part of the coding region. Two major and three minor transcription start points (tsp) were mapped 39 to 63 nucleotides upstream from the translational start codon. The uppermost tsp mapped to the first T in a putative T. thermophila RNA polymerase II initiator element, TATAA. The coding region of L37 predicts a protein of 109 amino acid (aa) residues. A substantial part of the deduced aa sequence was verified by protein sequencing. The T. thermophila L37 clearly belongs to the P1-type family of eukaryotic A-proteins, but the C-terminal region has the hallmarks of archaebacterial A-proteins.
2011-01-01
Background Wheat grains accumulate a variety of low molecular weight proteins that are inhibitors of alpha-amylases and proteases and play an important protective role in the grain. These proteins have more balanced amino acid compositions than the major wheat gluten proteins and contribute important reserves for both seedling growth and human nutrition. The alpha-amylase/protease inhibitors also are of interest because they cause IgE-mediated occupational and food allergies and thereby impact human health. Results The complement of genes encoding alpha-amylase/protease inhibitors expressed in the US bread wheat Butte 86 was characterized by analysis of expressed sequence tags (ESTs). Coding sequences for 19 distinct proteins were identified. These included two monomeric (WMAI), four dimeric (WDAI), and six tetrameric (WTAI) inhibitors of exogenous alpha-amylases, two inhibitors of endogenous alpha-amylases (WASI), four putative trypsin inhibitors (CMx and WTI), and one putative chymotrypsin inhibitor (WCI). A number of the encoded proteins were identical or very similar to proteins in the NCBI database. Sequences not reported previously included variants of WTAI-CM3, three CMx inhibitors and WTI. Within the WDAI group, two different genes encoded the same mature protein. Based on numbers of ESTs, transcripts for WTAI-CM3 Bu-1, WMAI Bu-1 and WTAI-CM16 Bu-1 were most abundant in Butte 86 developing grain. Coding sequences for 16 of the inhibitors were unequivocally associated with specific proteins identified by tandem mass spectrometry (MS/MS) in a previous proteomic analysis of milled white flour from Butte 86. Proteins corresponding to WDAI Bu-1/Bu-2, WMAI Bu-1 and the WTAI subunits CM2 Bu-1, CM3 Bu-1 and CM16 Bu-1 were accumulated to the highest levels in flour. 
Conclusions Information on the spectrum of alpha-amylase/protease inhibitor genes and proteins expressed in a single wheat cultivar is central to understanding the importance of these proteins in both plant defense mechanisms and human allergies and facilitates both breeding and biotechnology approaches for manipulating the composition of these proteins in plants. PMID:21774824
Complete mitochondrial DNA sequence of the Eastern keelback mullet Liza affinis.
Gong, Xiaoling; Zhu, Wenjia; Bao, Baolong
2016-05-01
Eastern keelback mullet (Liza affinis) inhabits inlet waters and estuaries of rivers. In this paper, we initially determined the complete mitochondrial genome of Liza affinis. The entire mtDNA sequence is 16,831 bp in length, including 2 rRNA genes, 22 tRNA genes, 13 protein-coding genes and 1 putative control region. Its order and numbers of genes are similar to most bony fishes.
Protein and gene structure of a blue laccase from Pleurotus ostreatus.
Giardina, P; Palmieri, G; Scaloni, A; Fontanella, B; Faraco, V; Cennamo, G; Sannia, G
1999-01-01
A new laccase isoenzyme (POXA1b, where POX is phenol oxidase), produced by Pleurotus ostreatus in cultures supplemented with copper sulphate, has been purified and fully characterized. The main characteristics of this protein (molecular mass in native and denaturing conditions, pI and catalytic properties) are almost identical to the previously studied laccase POXA1w. However, POXA1b contains four copper atoms per molecule instead of one copper, two zinc and one iron atom per molecule of POXA1w. Furthermore, POXA1b shows an unusually high stability at alkaline pH. The gene and cDNA coding for POXA1b have been cloned and sequenced. The gene coding sequence contains 1599 bp, interrupted by 15 introns. Comparison of the structure of the poxa1b gene with the two previously studied P. ostreatus laccase genes (pox1 and poxc) suggests that these genes belong to two different subfamilies. The amino acid sequence of POXA1b deduced from the cDNA sequence has been almost completely verified by means of matrix-assisted laser desorption ionization MS. It has been demonstrated that three out of six putative glycosylation sites are post-translationally modified and the structure of the bound glycosidic moieties has been determined, whereas two other putative glycosylation sites are unmodified. PMID:10417329
Reddy, M K; Nair, S; Singh, B N; Mudgil, Y; Tewari, K K; Sopory, S K
2001-01-24
We report the cloning and sequencing of both cDNA and genomic DNA of a 33 kDa chloroplast ribonucleoprotein (33RNP) from pea. The analysis of the predicted amino acid sequence of the cDNA clone revealed that the encoded protein contains two RNA binding domains, including the conserved consensus ribonucleoprotein sequences CS-RNP1 and CS-RNP2, on the C-terminus half and the presence of a putative transit peptide sequence in the N-terminus region. The phylogenetic and multiple sequence alignment analysis of pea chloroplast RNP along with RNPs reported from the other plant sources revealed that the pea 33RNP is very closely related to Nicotiana sylvestris 31RNP and 28RNP and also to 31RNP and 28RNP of Arabidopsis and spinach, respectively. The pea 33RNP was expressed in Escherichia coli and purified to homogeneity. The in vitro import of precursor protein into chloroplasts confirmed that the N-terminus putative transit peptide is a bona fide transit peptide and 33RNP is localized in the chloroplast. The nucleic acid-binding properties of the recombinant protein, as revealed by South-Western analysis, showed that 33RNP has higher binding affinity for poly (U) and oligo dT than for ssDNA and dsDNA. The steady state transcript level was higher in leaves than in roots and the expression of this gene is light stimulated. Sequence analysis of the genomic clone revealed that the gene contains four exons and three introns. We have also isolated and analyzed the 5' flanking region of the pea 33RNP gene.
Ferrando, Sara; Gallus, Lorenzo; Gambardella, Chiara; Masini, Maria Angela; Cutolo, Alessia; Vacchi, Marino
2012-05-31
The mucosa covering the tongue of Chimaera monstrosa has been investigated with histological and immunohistochemical methods, allowing gustatory structures (taste buds) to be described for the first time in this subclass of cartilaginous fish. G-protein-alpha-subunit-inhibitory-like (Gαi-like) immunoreactivity has been detected in the taste buds of C. monstrosa, as described in other vertebrates. In order to gain confidence in the antiserum used, which is able to recognize three Gαi proteins in mammals, alignments of the antigenic sequence in mammals and other vertebrates were performed. The data were used to search for putative genes in the genome of the holocephalan Callorhinchus milii, to date the only cartilaginous fish with a sequenced genome; the highlighted sequences suggest the presence of all three genes (gnai1, gnai2 and gnai3) in holocephalans. The sequences of the predicted proteins show high identity with the mammalian proteins. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Huang, Kailong; Zhang, Xu-Xiang; Shi, Peng; Wu, Bing; Ren, Hongqiang
2014-11-01
In order to comprehensively investigate bacterial virulence in drinking water, 454 pyrosequencing and Illumina high-throughput sequencing were used to detect potential pathogenic bacteria and virulence factors (VFs) in a full-scale drinking water treatment and distribution system. 16S rRNA gene pyrosequencing revealed high bacterial diversity in the drinking water (441-586 operational taxonomic units). Bacterial diversity decreased after chlorine disinfection, but increased after pipeline distribution. α-Proteobacteria was the most dominant taxonomic class. Alignment against the established pathogen database showed that several types of putative pathogens were present in the drinking water and Pseudomonas aeruginosa had the highest abundance (over 11‰ of total sequencing reads). Many pathogens disappeared after chlorine disinfection, but P. aeruginosa and Leptospira interrogans were still detected in the tap water. High-throughput sequencing revealed prevalence of various pathogenicity islands and virulence proteins in the drinking water, and translocases, transposons, Clp proteases and flagellar motor switch proteins were the predominant VFs. Both diversity and abundance of the detectable VFs increased after the chlorination, and decreased after the pipeline distribution. This study indicates that joint use of 454 pyrosequencing and Illumina sequencing can comprehensively characterize environmental pathogenesis, and several types of putative pathogens and various VFs are prevalent in drinking water. Copyright © 2014 Elsevier Inc. All rights reserved.
Lu, Jia-hai; Zhang, Ding-mei; Wang, Guo-ling; Guo, Zhong-min; Zhang, Chuan-hai; Tan, Bing-yan; Ouyang, Li-ping; Lin, Li; Liu, Yi-min; Chen, Wei-qing; Ling, Wen-hua; Yu, Xin-bing; Zhong, Nan-shan
2005-05-05
The rapid transmission and high mortality rate made severe acute respiratory syndrome (SARS) a global threat for which no efficacious therapy is currently available. Without sufficient knowledge about the SARS coronavirus (SARS-CoV), it is impossible to define candidate anti-SARS targets. The putative non-structural protein 2 (nsp2) (3CL(pro), following the nomenclature by Gao et al, also known as nsp5 in Snijder et al) of SARS-CoV plays an important role in viral transcription and replication, and is an attractive target for anti-SARS drug development, so we carried out this study to gain insight into the putative polymerase nsp2 of the SARS-CoV Guangdong (GD) strain. The SARS-CoV strain was isolated from a SARS patient in Guangdong, China, and cultured in Vero E6 cells. The nsp2 gene was amplified by reverse transcription-polymerase chain reaction (RT-PCR) and cloned into the eukaryotic expression vector pCI-neo (pCI-neo/nsp2). Then the recombinant eukaryotic expression vector pCI-neo/nsp2 was transfected into COS-7 cells using lipofectin reagent to express the nsp2 protein. The expressed SARS-CoV nsp2 protein was analyzed by 7% sodium dodecylsulfate polyacrylamide gel electrophoresis (SDS-PAGE). The nucleotide sequence and protein sequence of GD nsp2 were compared with those of other SARS-CoV strains by nucleotide-nucleotide basic local alignment search tool (BLASTN) and protein-protein basic local alignment search tool (BLASTP) to investigate its variation trend during transmission. The secondary structures of the GD strain and of other strains were predicted by Garnier-Osguthorpe-Robson (GOR) Secondary Structure Prediction. The Three-dimensional-PSSM Protein Fold Recognition (Threading) Server was employed to construct the three-dimensional model of the nsp2 protein. The putative polymerase nsp2 gene of the GD strain was amplified by RT-PCR. The eukaryotic expression vector (pCI-neo/nsp2) was constructed and the protein was successfully expressed in COS-7 cells. 
Sequencing and sequence comparison with other SARS-CoV strains showed that the nsp2 gene was relatively conserved during transmission: a total of five base sites were mutated in about 100 strains investigated, three of which, in the early and middle phases, were synonymous mutations, while the other two, in the late phase, resulted in amino acid substitutions and secondary structure changes. The three-dimensional structure of the nsp2 protein was successfully constructed. The results suggest that polymerase nsp2 is relatively stable during the epidemic. The amino acid and secondary structure changes may be important for viral infection. The fact that the majority of single nucleotide variations (SNVs) are predicted to be synonymous, together with the low mutation rate of the nsp2 gene among the epidemic variants, indicates that nsp2 is conserved and could be a target for anti-SARS drugs. The three-dimensional structure indicates that the nsp2 protein of the GD strain is highly homologous with 3CL(pro) of the SARS-CoV Urbani strain, 3CL(pro) of transmissible gastroenteritis virus and 3CL(pro) of human coronavirus 229E strain, which further suggests that the nsp2 protein of the GD strain possesses the activity of 3CL(pro).
Fu, Shulin; Zhang, Minmin; Xu, Juan; Ou, Jiwen; Wang, Yan; Liu, Huazhen; Liu, Jinlin; Chen, Huanchun; Bei, Weicheng
2013-01-02
Haemophilus parasuis (H. parasuis), the causative agent of swine polyserositis, polyarthritis, and meningitis, is responsible for one of the most important bacterial diseases of pigs worldwide. Few existing vaccines have a significant effect on infections with all pathogenic serovars of H. parasuis. H. parasuis putative outer membrane proteins (OMPs) are potentially essential components of more effective vaccines. Recently, the genomic sequence of H. parasuis serovar 5 strain SH0165 was completed in our laboratory, which allows us to target OMPs for the development of recombinant vaccines. In this study, we focused on 10 putative OMPs, all of which were cloned, expressed and purified as HIS fusion proteins. Primary screening for immunoprotective potential was performed in mice given an LD50 challenge. Of these 10 OMPs, three fusion proteins (rGAPDH, rOapA, and rHPS-0675) were found to be protective in a mouse model of H. parasuis infection. We further evaluated the immune responses and protective efficacy of rGAPDH, rOapA, and rHPS-0675 in pig models. All three proteins elicited humoral antibody responses and conferred different levels of protection against challenge with a lethal dose of H. parasuis SH0165 in pig models. In addition, the antisera against the three individual proteins and the synergistic protein efficiently inhibited bacterial growth in a whole blood assay. The data demonstrated that the three proteins showed high value individually and that the combination of rGAPDH, rOapA, and rHPS-0675 offered the best protection. Our results indicate that rGAPDH, rOapA, and rHPS-0675 induced protection against H. parasuis SH0165 infection, which may facilitate the development of a multi-component vaccine. Copyright © 2012 Elsevier Ltd. All rights reserved.
Genome mining of ascomycetous fungi reveals their genetic potential for ergot alkaloid production.
Gerhards, Nina; Matuschek, Marco; Wallwey, Christiane; Li, Shu-Ming
2015-06-01
Ergot alkaloids are important as mycotoxins or as drugs. Naturally occurring ergot alkaloids as well as their semisynthetic derivatives have been used as pharmaceuticals in modern medicine for decades. We identified 196 putative ergot alkaloid biosynthetic genes belonging to at least 31 putative gene clusters in 31 fungal species by genome mining of the 360 available genome sequences of ascomycetous fungi with known proteins. Detailed analysis showed that these fungi belong to the families Aspergillaceae, Clavicipitaceae, Arthrodermataceae, Helotiaceae and Thermoascaceae. Within the identified families, only a small number of taxa are represented. A literature search revealed a large diversity of ergot alkaloid structures in different fungi of the phylum Ascomycota. However, ergot alkaloid accumulation was only observed in 15 of the sequenced species. Therefore, this study provides a genetic basis for further study of ergot alkaloid production in the sequenced strains.
Dröge, M; Pühler, A; Selbitschka, W
2000-04-01
In order to isolate antibiotic resistance plasmids from bacterial communities found in activated sludge, derivatives of the 3-chlorobenzoate-degrading strain Pseudomonas sp. B13, tagged with the green fluorescent protein as an identification marker, were used as recipients in filter crosses. Transconjugants were selected on agar plates containing 3-chlorobenzoate as the sole carbon source and the antibiotic tetracycline, streptomycin or spectinomycin, and were recovered at frequencies in the range of 10^-5 to 10^-8 per recipient. A total of 12 distinct plasmids, designated pB1-pB12, was identified. Their sizes ranged from 41 to 69 kb and they conferred various patterns of antibiotic resistance on their hosts. Two of the plasmids, pB10 and pB11, also mediated resistance to inorganic mercury. Seven of the 12 plasmids were identified as broad-host-range plasmids, displaying extremely high transfer frequencies in filter crosses, ranging from 10^-1 to 10^-2 per recipient cell. Ten of the 12 plasmids belonged to the IncP incompatibility group, based on replicon typing using IncP group-specific PCR primers. DNA sequencing of PCR amplification products further revealed that eight of the 12 plasmids belonged to the IncPbeta subgroup, whereas two plasmids were identified as IncPalpha plasmids. Analysis of the IncP-specific PCR products revealed considerable differences among the IncPbeta plasmids at the DNA sequence level. In order to characterize the gene "load" of the IncP plasmids, restriction fragments were cloned and their DNA sequences established. A remarkable diversity of putative proteins encoded by these fragments was identified. Besides transposases and proteins involved in antibiotic resistance, two putative DNA invertases belonging to the Din family, a methyltransferase of a type I restriction/modification system, a superoxide dismutase, parts of a putative efflux system belonging to the RND family, and proteins of unknown function were identified.
Zhang, Xiaodong; Allan, Andrew C.; Li, Caixia; Wang, Yuanzhong; Yao, Qiuyang
2015-01-01
Gentiana rigescens is an important medicinal herb in China. The main validated medicinal component gentiopicroside is synthesized in shoots, but is mainly found in the plant’s roots. The gentiopicroside biosynthetic pathway and its regulatory control remain to be elucidated. Genome resources of gentian are limited. Next-generation sequencing (NGS) technologies can aid in supplying global gene expression profiles. In this study we present sequence and transcript abundance data for the root and leaf transcriptome of G. rigescens, obtained using the Illumina HiSeq 2000. Over fifty million clean reads were obtained from leaf and root libraries. This yielded 76,717 unigenes with an average length of 753 bp. Among these, 33,855 unigenes were identified as putative homologs of annotated sequences in public protein and nucleotide databases. Digital abundance analysis identified 3306 unigenes differentially enriched between leaf and root. Unigenes found in both tissues were categorized according to their putative functional categories. Of the differentially expressed genes, over 130 were annotated as related to terpenoid biosynthesis. This work is the first study of global transcriptome analyses in gentian. These sequences and putative functional data comprise a resource for future investigation of terpenoid biosynthesis in Gentianaceae species and annotation of the gentiopicroside biosynthetic pathway and its regulatory mechanisms. PMID:26006235
Khanna, Namita; Ghosh, Ananta Kumar; Huntemann, Marcel; Deshpande, Shweta; Han, James; Chen, Amy; Kyrpides, Nikos; Mavrommatis, Kostas; Szeto, Ernest; Markowitz, Victor; Ivanova, Natalia; Pagani, Ioanna; Pati, Amrita; Pitluck, Sam; Nolan, Matt; Woyke, Tanja; Teshima, Hazuki; Chertkov, Olga; Daligault, Hajnalka; Davenport, Karen; Gu, Wei; Munk, Christine; Zhang, Xiaojing; Bruce, David; Detter, Chris; Xu, Yan; Quintana, Beverly; Reitenga, Krista; Kunde, Yulia; Green, Lance; Erkkila, Tracy; Han, Cliff; Brambilla, Evelyne-Marie; Lang, Elke; Klenk, Hans-Peter; Goodwin, Lynne; Chain, Patrick; Das, Debabrata
2013-12-20
Enterobacter sp. IIT-BT 08 belongs to Phylum: Proteobacteria, Class: Gammaproteobacteria, Order: Enterobacteriales, Family: Enterobacteriaceae. The organism was isolated from the leaves of a local plant near the Kharagpur railway station, Kharagpur, West Bengal, India. It has been extensively studied for fermentative hydrogen production because of its high hydrogen yield. For further enhancement of hydrogen production by strain development, complete genome sequence analysis was carried out. Sequence analysis revealed that the genome was linear, 4.67 Mbp long and had a GC content of 56.01%. The genome encodes 4,393 protein-coding genes and 179 RNA genes. Additionally, a putative pathway of hydrogen production was suggested based on the presence of formate hydrogen lyase complex and other related genes identified in the genome. Thus, in the present study we describe the specific properties of the organism and the generation, annotation and analysis of its genome sequence, and discuss the putative pathway of hydrogen production by this organism.
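As an illustration of how a GC content figure such as the 56.01% quoted above is conventionally derived, the following is a minimal sketch. It assumes the usual definition (G plus C as a percentage of unambiguous A, C, G and T bases); the function name `gc_content` and the toy sequence are hypothetical and are not taken from the study.

```python
def gc_content(sequence: str) -> float:
    """Return GC content as a percentage of unambiguous bases (A, C, G, T).

    Assumption: ambiguous symbols such as N are excluded from the denominator.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / acgt


if __name__ == "__main__":
    # Toy sequence for illustration only, not real genome data.
    print(f"{gc_content('GGCCAATTNN'):.2f}%")
```

For a draft or complete assembly the same calculation would be applied to the concatenated contig or replicon sequences.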
Hsu, Ju-Chun; Chien, Ting-Ying; Hu, Chia-Cheng; Chen, Mei-Ju May; Wu, Wen-Jer; Feng, Hai-Tung; Haymer, David S; Chen, Chien-Yu
2012-01-01
Insecticide resistance has recently become a critical concern for control of many insect pest species. Genome sequencing and global quantification of gene expression through analysis of the transcriptome can provide useful information relevant to this challenging problem. The oriental fruit fly, Bactrocera dorsalis, is one of the world's most destructive agricultural pests, and recently it has been used as a target for studies of genetic mechanisms related to insecticide resistance. However, prior to this study, the molecular data available for this species were largely limited to genes identified through homology. To provide a broader pool of gene sequences of potential interest with regard to insecticide resistance, this study uses whole transcriptome analysis developed through de novo assembly of short reads generated by next-generation sequencing (NGS). The transcriptome of B. dorsalis was initially constructed using Illumina's Solexa sequencing technology. Qualified reads were assembled into contigs and potential splicing variants (isotigs). A total of 29,067 isotigs have putative homologues in the non-redundant (nr) protein database from NCBI, and 11,073 of these correspond to distinct D. melanogaster proteins in the RefSeq database. Approximately 5,546 isotigs contain coding sequences that are at least 80% complete and appear to represent B. dorsalis genes. We observed a strong correlation between the completeness of the assembled sequences and the expression intensity of the transcripts. The assembled sequences were also used to identify large numbers of genes potentially belonging to families related to insecticide resistance. A total of 90 P450-, 42 GST- and 37 COE-related genes, representing three major enzyme families involved in insecticide metabolism and resistance, were identified. In addition, 36 isotigs were discovered to contain target site sequences related to four classes of resistance genes. Identified sequence motifs were also analyzed to characterize putative polypeptide translational products and associate them with specific genes and protein functions.
Novel venom gene discovery in the platypus
2010-01-01
Background To date, few peptides in the complex mixture of platypus venom have been identified and sequenced, in part due to the limited amounts of platypus venom available to study. We have constructed and sequenced a cDNA library from an active platypus venom gland to identify the remaining components. Results We identified 83 novel putative platypus venom genes from 13 toxin families, which are homologous to known toxins from a wide range of vertebrates (fish, reptiles, insectivores) and invertebrates (spiders, sea anemones, starfish). A number of these are expressed in tissues other than the venom gland, and at least three of these families (those with homology to toxins from distant invertebrates) may play non-toxin roles. Thus, further functional testing is required to confirm venom activity. However, the presence of similar putative toxins in such widely divergent species provides further evidence for the hypothesis that there are certain protein families that are selected preferentially during evolution to become venom peptides. We have also used homology with known proteins to speculate on the contributions of each venom component to the symptoms of platypus envenomation. Conclusions This study represents a step towards fully characterizing the first mammal venom transcriptome. We have found similarities between putative platypus toxins and those of a number of unrelated species, providing insight into the evolution of mammalian venom. PMID:20920228
Novel venom gene discovery in the platypus.
Whittington, Camilla M; Papenfuss, Anthony T; Locke, Devin P; Mardis, Elaine R; Wilson, Richard K; Abubucker, Sahar; Mitreva, Makedonka; Wong, Emily S W; Hsu, Arthur L; Kuchel, Philip W; Belov, Katherine; Warren, Wesley C
2010-01-01
To date, few peptides in the complex mixture of platypus venom have been identified and sequenced, in part due to the limited amounts of platypus venom available to study. We have constructed and sequenced a cDNA library from an active platypus venom gland to identify the remaining components. We identified 83 novel putative platypus venom genes from 13 toxin families, which are homologous to known toxins from a wide range of vertebrates (fish, reptiles, insectivores) and invertebrates (spiders, sea anemones, starfish). A number of these are expressed in tissues other than the venom gland, and at least three of these families (those with homology to toxins from distant invertebrates) may play non-toxin roles. Thus, further functional testing is required to confirm venom activity. However, the presence of similar putative toxins in such widely divergent species provides further evidence for the hypothesis that there are certain protein families that are selected preferentially during evolution to become venom peptides. We have also used homology with known proteins to speculate on the contributions of each venom component to the symptoms of platypus envenomation. This study represents a step towards fully characterizing the first mammal venom transcriptome. We have found similarities between putative platypus toxins and those of a number of unrelated species, providing insight into the evolution of mammalian venom.
Kemperman, Robèr; Jonker, Marnix; Nauta, Arjen; Kuipers, Oscar P.; Kok, Jan
2003-01-01
A region of 12 kb flanking the structural gene of the cyclic antibacterial peptide circularin A of Clostridium beijerinckii ATCC 25752 was sequenced, and the putative proteins involved in the production and secretion of circularin A were identified. The genes are tightly organized in overlapping open reading frames. Heterologous expression of circularin A in Enterococcus faecalis was achieved, and five genes were identified as minimally required for bacteriocin production and secretion. Two of the putative proteins, CirB and CirC, are predicted to contain membrane-spanning domains, while CirD contains a highly conserved ATP-binding domain. Together with CirB and CirC, this ATP-binding protein is involved in the production of circularin A. The fifth gene, cirE, confers immunity towards circularin A when expressed in either Lactococcus lactis or E. faecalis and is needed in order to allow the bacteria to produce bacteriocin. Additional resistance against circularin A is conferred by the activity of the putative transporter consisting of CirB and CirD. PMID:14532033
Ponce, Dalia; Brinkman, Diane L.; Potriquet, Jeremy; Mulvenna, Jason
2016-01-01
Jellyfish venoms are rich sources of toxins designed to capture prey or deter predators, but they can also elicit harmful effects in humans. In this study, an integrated transcriptomic and proteomic approach was used to identify putative toxins and their potential role in the venom of the scyphozoan jellyfish Chrysaora fuscescens. A de novo tentacle transcriptome, containing more than 23,000 contigs, was constructed and used in proteomic analysis of C. fuscescens venom to identify potential toxins. From a total of 163 proteins identified in the venom proteome, 27 were classified as putative toxins and grouped into six protein families: proteinases, venom allergens, C-type lectins, pore-forming toxins, glycoside hydrolases and enzyme inhibitors. Other putative toxins identified in the transcriptome, but not the proteome, included additional proteinases as well as lipases and deoxyribonucleases. Sequence analysis also revealed the presence of ShKT domains in two putative venom proteins from the proteome and an additional 15 from the transcriptome, suggesting potential ion channel blockade or modulatory activities. Comparison of these potential toxins to those from other cnidarians provided insight into their possible roles in C. fuscescens venom and an overview of the diversity of potential toxin families in cnidarian venoms. PMID:27058558
Qiu, T; Lu, R H; Zhang, J; Zhu, Z Y
2001-07-01
The complete nucleotide sequence of M6 gene of grass carp hemorrhage virus (GCHV) was determined. It is 2039 nucleotides in length and contains a single large open reading frame that could encode a protein of 648 amino acids with predicted molecular mass of 68.7 kDa. Amino acid sequence comparison revealed that the protein encoded by GCHV M6 is closely related to the protein mu1 of mammalian reovirus. The M6 gene, encoding the major outer-capsid protein, was expressed using the pET fusion protein vector in Escherichia coli and detected by Western blotting using chicken anti-GCHV immunoglobulin (IgY). The result indicates that the protein encoded by M6 may share a putative Asn-42-Pro-43 proteolytic cleavage site with mu1.
Ali, Shawkat; Magne, Maxime; Chen, Shiyan; Côté, Olivier; Stare, Barbara Gerič; Obradovic, Natasa; Jamshaid, Lubna; Wang, Xiaohong; Bélair, Guy; Moffett, Peter
2015-01-01
The potato cyst nematode, Globodera rostochiensis, is an important pest of potato. Like other pathogens, plant parasitic nematodes are presumed to employ effector proteins, secreted into the apoplast as well as the host cytoplasm, to alter plant cellular functions and successfully infect their hosts. We have generated a library of ORFs encoding putative G. rostochiensis apoplastic effectors in vectors for expression in planta. These clones were assessed for morphological and developmental effects on plants as well as their ability to induce or suppress plant defenses. Several CLAVATA3/ESR-like proteins induced developmental phenotypes, whereas predicted cell wall-modifying proteins induced necrosis and chlorosis, consistent with roles in cell fate alteration and tissue invasion, respectively. When directed to the apoplast with a signal peptide, two effectors, a ubiquitin extension protein (GrUBCEP12) and an expansin-like protein (GrEXPB2), suppressed defense responses including NB-LRR signaling induced in the cytoplasm. GrEXPB2 also elicited a defense response in a species- and sequence-specific manner. Our results are consistent with the scenario whereby potato cyst nematodes secrete effectors that modulate host cell fate and metabolism as well as modifying host cell walls. Furthermore, we show a novel role for an apoplastic expansin-like protein in suppressing intra-cellular defense responses. PMID:25606855
Proteomic Profiling of Cereal Aphid Saliva Reveals Both Ubiquitous and Adaptive Secreted Proteins
Wilkinson, Tom L.
2013-01-01
The secreted salivary proteins from two cereal aphid species, Sitobion avenae and Metopolophium dirhodum, were collected from artificial diets and analysed by tandem mass spectrometry. Protein identification was performed by searching MS data against the official protein set from the current pea aphid (Acyrthosiphon pisum) genome assembly and revealed 12 and 7 proteins in the saliva of S. avenae and M. dirhodum, respectively. When combined with a comparable dataset from A. pisum, only three individual proteins were common to all the aphid species; two paralogues of the GMC oxidoreductase family (glucose dehydrogenase; GLD) and ACYPI009881, an aphid specific protein previously identified as a putative component of the salivary sheath. Antibodies were designed from translated protein sequences obtained from partial cDNA sequences for ACYPI009881 and both saliva associated GLDs. The antibodies detected all parent proteins in secreted saliva from the three aphid species, but could only detect ACYPI009881, and not saliva associated GLDs, in protein extractions from the salivary glands. This result was confirmed by immunohistochemistry using whole and sectioned salivary glands, and in addition, localised ACYPI009881 to specific cell types within the principal salivary gland. The implications of these findings for the origin of salivary components and the putative role of the proteins identified are discussed in the context of our limited understanding of the functional relationship between aphid saliva and the plants they feed on. The mass spectrometry data have been deposited to the ProteomeXchange and can be accessed under the identifier PXD000113. PMID:23460852
Proteomic profiling of cereal aphid saliva reveals both ubiquitous and adaptive secreted proteins.
Rao, Sohail A K; Carolan, James C; Wilkinson, Tom L
2013-01-01
The secreted salivary proteins from two cereal aphid species, Sitobion avenae and Metopolophium dirhodum, were collected from artificial diets and analysed by tandem mass spectrometry. Protein identification was performed by searching MS data against the official protein set from the current pea aphid (Acyrthosiphon pisum) genome assembly and revealed 12 and 7 proteins in the saliva of S. avenae and M. dirhodum, respectively. When combined with a comparable dataset from A. pisum, only three individual proteins were common to all the aphid species; two paralogues of the GMC oxidoreductase family (glucose dehydrogenase; GLD) and ACYPI009881, an aphid specific protein previously identified as a putative component of the salivary sheath. Antibodies were designed from translated protein sequences obtained from partial cDNA sequences for ACYPI009881 and both saliva associated GLDs. The antibodies detected all parent proteins in secreted saliva from the three aphid species, but could only detect ACYPI009881, and not saliva associated GLDs, in protein extractions from the salivary glands. This result was confirmed by immunohistochemistry using whole and sectioned salivary glands, and in addition, localised ACYPI009881 to specific cell types within the principal salivary gland. The implications of these findings for the origin of salivary components and the putative role of the proteins identified are discussed in the context of our limited understanding of the functional relationship between aphid saliva and the plants they feed on. The mass spectrometry data have been deposited to the ProteomeXchange and can be accessed under the identifier PXD000113.
Yamaguchi, S; Saito, T; Abe, H; Yamane, H; Murofushi, N; Kamiya, Y
1996-08-01
The first committed step in the formation of diterpenoids leading to gibberellin (GA) biosynthesis is the conversion of geranylgeranyl diphosphate (GGDP) to ent-kaurene. ent-Kaurene synthase A (KSA) catalyzes the conversion of GGDP to copalyl diphosphate (CDP), which is subsequently converted to ent-kaurene by ent-kaurene synthase B (KSB). A full-length KSB cDNA was isolated from developing cotyledons in immature seeds of pumpkin (Cucurbita maxima L.). Degenerate oligonucleotide primers were designed from the amino acid sequences obtained from the purified protein to amplify a cDNA fragment, which was used for library screening. The isolated full-length cDNA was expressed in Escherichia coli as a fusion protein, which demonstrated the KSB activity to cyclize [3H]CDP to [3H]ent-kaurene. The KSB transcript was most abundant in growing tissues, but was detected in every organ in pumpkin seedlings. The deduced amino acid sequence shares significant homology with other terpene cyclases, including the conserved DDXXD motif, a putative divalent metal ion-diphosphate complex binding site. A putative transit peptide sequence that may target the translated product into the plastids is present in the N-terminal region.
Transcriptome and gene expression analysis during flower blooming in Rosa chinensis 'Pallida'.
Yan, Huijun; Zhang, Hao; Chen, Min; Jian, Hongying; Baudino, Sylvie; Caissard, Jean-Claude; Bendahmane, Mohammed; Li, Shubin; Zhang, Ting; Zhou, Ningning; Qiu, Xianqin; Wang, Qigang; Tang, Kaixue
2014-04-25
Rosa chinensis 'Pallida' (Rosa L.) is one of the most important ancient rose cultivars originating from China. It contributed the 'tea scent' trait to modern roses. However, little information is available on the gene regulatory networks involved in scent biosynthesis and metabolism in Rosa. In this study, the transcriptome of R. chinensis 'Pallida' petals at different developmental stages, from flower buds to senescent flowers, was investigated using Illumina sequencing technology. De novo assembly generated 89,614 clusters with an average length of 428bp. Based on sequence similarity search with known proteins, 62.9% of total clusters were annotated. Out of these annotated transcripts, 25,705 and 37,159 sequences were assigned to gene ontology and clusters of orthologous groups, respectively. The dataset provides information on transcripts putatively associated with known scent metabolic pathways. Digital gene expression (DGE) was obtained using RNA samples from flower bud, open flower and senescent flower stages. Comparative DGE and quantitative real time PCR permitted the identification of five transcripts encoding proteins putatively associated with scent biosynthesis in roses. The study provides a foundation for scent-related gene discovery in roses. Copyright © 2014. Published by Elsevier B.V.
Dores, Robert M
2016-01-01
The evolution of the melanocortin receptors (MCRs) is closely associated with the evolution of the melanocortin-2 receptor accessory proteins (MRAPs). Recent annotation of the elephant shark genome project revealed the sequence of a putative MRAP1 ortholog. The presence of this sequence in the genome of a cartilaginous fish raises the possibility that the mrap1 and mrap2 genes in the genomes of gnathostome vertebrates were the result of the chordate 2R genome duplication event. The presence of a putative MRAP1 ortholog in a cartilaginous fish genome is perplexing. Recent studies on melanocortin-2 receptor (MC2R) in the genomes of the elephant shark and the Japanese stingray indicate that these MC2R orthologs can be functionally expressed in CHO cells without co-expression of an exogenous mrap1 cDNA. The novel ligand selectivity of these cartilaginous fish MC2R orthologs is discussed. Finally, the origin of the mc2r and mc5r genes is reevaluated. The distinctive primary sequence conservation of MC2R and MC5R is discussed in light of the physiological roles of these two MCR paralogs.
2013-01-01
Background In recent years biogas plants in Germany have been suspected of being involved in the amplification and dissemination of pathogenic bacteria causing severe infections in humans and animals. In particular, biogas plants have been suggested to contribute to the spread of Escherichia coli infections in humans or chronic botulism in cattle caused by Clostridium botulinum. Metagenome datasets of microbial communities from an agricultural biogas plant as well as from anaerobic lab-scale digesters operating at different temperatures and conditions were analyzed for the presence of putative pathogenic bacteria and virulence determinants by various bioinformatic approaches. Results All datasets featured a low abundance of reads that were taxonomically assigned to the genus Escherichia or further selected genera comprising pathogenic species. Higher numbers of reads were taxonomically assigned to the genus Clostridium. However, only very few sequences were predicted to originate from pathogenic clostridial species. Moreover, mapping of metagenome reads to complete genome sequences of selected pathogenic bacteria revealed that not the pathogenic species themselves, but only species that are more or less related to pathogenic ones are present in the fermentation samples analyzed. Likewise, known virulence determinants could hardly be detected. Only a marginal number of reads showed similarity to sequences described in the Microbial Virulence Database MvirDB such as those encoding protein toxins, virulence proteins or antibiotic resistance determinants. Conclusions Findings of this first study of metagenomic sequence reads of biogas producing microbial communities suggest that the risk of dissemination of pathogenic bacteria by application of digestates from biogas fermentations as fertilizers is low, because the results obtained do not indicate the presence of putative pathogenic microorganisms in the samples analyzed. PMID:23557021
Santibáñez-López, Carlos E; Cid-Uribe, Jimena I; Zamudio, Fernando Z; Batista, Cesar V F; Ortiz, Ernesto; Possani, Lourival D
2017-07-01
The soluble venom from the Mexican scorpion Megacormus gertschi of the family Euscorpiidae was obtained and its biological effects were tested in several animal models. This venom is not toxic to mice at doses of 100 μg per 20 g of mouse weight, while being lethal to arthropods (insects and crustaceans), at doses of 20 μg (for crickets) and 100 μg (for shrimps) per animal. Samples of the venom were separated by high performance liquid chromatography and circa 80 distinct chromatographic fractions were obtained from which 67 components have had their molecular weights determined by mass spectrometry analysis. The N-terminal amino acid sequence of seven protein/peptides were obtained by Edman degradation and are reported. Among the high molecular weight components there are enzymes with experimentally-confirmed phospholipase activity. A pair of telsons from this scorpion species was dissected, from which total RNA was extracted and used for cDNA library construction. Massive sequencing by the Illumina protocol, followed by de novo assembly, resulted in a total of 110,528 transcripts. From those, we were able to annotate 182, which putatively code for peptides/proteins with sequence similarity to previously-reported venom components available from different protein databases. Transcripts seemingly coding for enzymes showed the richest diversity, with 52 sequences putatively coding for proteases, 20 for phospholipases, 8 for lipases and 5 for hyaluronidases. The number of different transcripts potentially coding for peptides with sequence similarity to those that affect ion channels was 19, for putative antimicrobial peptides 19, and for protease inhibitor-like peptides, 18. Transcripts seemingly coding for other venom components were identified and described. The LC/MS analysis of a trypsin-digested venom aliquot resulted in 23 matches with the translated transcriptome database, which validates the transcriptome. 
The proteomic and transcriptomic analyses reported here constitute the first approach to study the venom components from a scorpion species belonging to the family Euscorpiidae. The data certainly show that this venom is different from all the ones described thus far in the literature. Copyright © 2017 Elsevier Ltd. All rights reserved.
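The transcript counts quoted in this abstract can be cross-checked with simple arithmetic. The sketch below only tallies the numbers given in the text; it assumes, as an illustration and not as a statement from the authors, that the listed categories do not overlap, so the remainder corresponds to the "other venom components" that the abstract does not enumerate. All variable and function names are hypothetical.

```python
# Counts quoted in the abstract.
TOTAL_TRANSCRIPTS = 110_528
ANNOTATED = 182

ENZYMES = {
    "proteases": 52,
    "phospholipases": 20,
    "lipases": 8,
    "hyaluronidases": 5,
}
PEPTIDES = {
    "ion channel toxin-like peptides": 19,
    "antimicrobial peptides": 19,
    "protease inhibitor-like peptides": 18,
}


def summarize() -> dict:
    """Tally the listed categories; assumes the categories are disjoint."""
    enzyme_total = sum(ENZYMES.values())
    listed_total = enzyme_total + sum(PEPTIDES.values())
    return {
        "enzyme_total": enzyme_total,
        "listed_total": listed_total,
        # Annotated transcripts not covered by the listed categories.
        "unlisted": ANNOTATED - listed_total,
        "annotated_percent": 100.0 * ANNOTATED / TOTAL_TRANSCRIPTS,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(key, value)
```

Under the stated assumption, the listed categories account for 141 of the 182 annotated transcripts, and the annotated set is well under 1% of the assembled transcripts.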
RNA-ID, a Powerful Tool for Identifying and Characterizing Regulatory Sequences.
Brule, C E; Dean, K M; Grayhack, E J
2016-01-01
The identification and analysis of sequences that regulate gene expression is critical because regulated gene expression underlies biology. RNA-ID is an efficient and sensitive method to discover and investigate regulatory sequences in the yeast Saccharomyces cerevisiae, using fluorescence-based assays to detect green fluorescent protein (GFP) relative to a red fluorescent protein (RFP) control in individual cells. Putative regulatory sequences can be inserted either in-frame or upstream of a superfolder GFP fusion protein whose expression, like that of RFP, is driven by the bidirectional GAL1,10 promoter. In this chapter, we describe the methodology to identify and study cis-regulatory sequences in the RNA-ID system, explaining features and variations of the RNA-ID reporter, as well as some applications of this system. We describe in detail the methods to analyze a single regulatory sequence, from construction of a single GFP variant to assay of variants by flow cytometry, as well as modifications required to screen libraries of different strains simultaneously. We also describe subsequent analyses of regulatory sequences. © 2016 Elsevier Inc. All rights reserved.
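To make the idea of measuring GFP relative to an RFP control in individual cells concrete, here is a minimal sketch of one plausible normalization. It is an assumption for illustration, not the chapter's documented analysis: each cell contributes a GFP/RFP ratio, and a variant is compared with a reference reporter through the ratio of medians. The function names and the fluorescence values are hypothetical.

```python
from statistics import median


def gfp_rfp_ratios(cells):
    """Per-cell GFP/RFP ratios from (gfp, rfp) pairs.

    Cells with non-positive RFP signal are skipped to avoid division by zero.
    """
    return [gfp / rfp for gfp, rfp in cells if rfp > 0]


def relative_expression(variant_cells, reference_cells):
    """Median GFP/RFP of a variant divided by that of a reference reporter."""
    variant = gfp_rfp_ratios(variant_cells)
    reference = gfp_rfp_ratios(reference_cells)
    if not variant or not reference:
        raise ValueError("no usable cells in one of the samples")
    return median(variant) / median(reference)


if __name__ == "__main__":
    # Hypothetical fluorescence values (arbitrary units), one pair per cell.
    reference = [(100.0, 50.0), (200.0, 100.0), (300.0, 150.0)]
    variant = [(50.0, 50.0), (100.0, 100.0), (150.0, 150.0)]
    print(relative_expression(variant, reference))
```

Dividing by the RFP signal from the same cell is what lets an internal control of this kind absorb cell-to-cell differences in promoter activity or plasmid content, which is presumably why both reporters are driven from one bidirectional promoter.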
A putative peroxidase cDNA from turnip and analysis of the encoded protein sequence.
Romero-Gómez, S; Duarte-Vázquez, M A; García-Almendárez, B E; Mayorga-Martínez, L; Cervantes-Avilés, O; Regalado, C
2008-12-01
A putative peroxidase cDNA was isolated from turnip roots (Brassica napus L. var. purple top white globe) by reverse transcriptase-polymerase chain reaction (RT-PCR) and rapid amplification of cDNA ends (RACE). Total RNA extracted from mature turnip roots was used as a template for RT-PCR, using a degenerate primer designed to amplify the highly conserved distal motif of plant peroxidases. The resulting partial sequence was used to design the rest of the specific primers for 5' and 3' RACE. Two cDNA fragments were purified, sequenced, and aligned with the partial sequence from RT-PCR, and a complete overlapping sequence was obtained and labeled as BnPA (GenBank Accession No. AY423440, named as podC). The full-length cDNA is 1167 bp long and contains a 1077 bp open reading frame (ORF) encoding a deduced peroxidase polypeptide of 358 amino acids. The putative peroxidase (BnPA) showed a calculated Mr of 34 kDa and an isoelectric point (pI) of 4.5, with no significant identity with other reported turnip peroxidases. Sequence alignment showed that only three peroxidases share significant identity with BnPA, namely AtP29a (84%) and AtPA2 (81%) from Arabidopsis thaliana, and HRPA2 (82%) from horseradish (Armoracia rusticana). Work is in progress to clone this gene into an adequate host to study the specific role and possible biotechnological applications of this alternative peroxidase source.
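The relationship between the ORF length and the deduced polypeptide length reported above can be checked with a short calculation. This sketch assumes the standard convention that the quoted ORF length includes the stop codon; the function name `orf_to_protein_length` is hypothetical.

```python
def orf_to_protein_length(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of the given length in bp.

    Assumption: when includes_stop is True, the final codon is a stop codon
    and does not contribute an amino acid.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    cdna_bp, orf_bp = 1167, 1077  # values quoted in the abstract
    print(orf_to_protein_length(orf_bp))  # amino acids in the deduced protein
    print(cdna_bp - orf_bp)               # bp left for 5' and 3' untranslated regions
```

With the stop codon counted, 1077 bp corresponds to 359 codons and therefore 358 amino acids, in agreement with the abstract; the remaining 90 bp of the 1167 bp cDNA would be untranslated sequence.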
Alpert, Carl-Alfred; Crutz-Le Coq, Anne-Marie; Malleret, Christine; Zagorec, Monique
2003-01-01
The complete nucleotide sequence of the 13-kb plasmid pRV500, isolated from Lactobacillus sakei RV332, was determined. Sequence analysis enabled the identification of genes coding for a putative type I restriction-modification system, two genes coding for putative recombinases of the integrase family, and a region likely involved in replication. The structural features of this region, comprising a putative ori segment containing 11- and 22-bp repeats and a repA gene coding for a putative initiator protein, indicated that pRV500 belongs to the pUCL287 subfamily of theta-type replicons. A 3.7-kb fragment encompassing this region was fused to an Escherichia coli replicon to produce the shuttle vector pRV566 and was observed to be functional in L. sakei for plasmid replication. The L. sakei replicon alone could not support replication in E. coli. Plasmid pRV500 and its derivative pRV566 were determined to be at very low copy numbers in L. sakei. pRV566 was maintained at a reasonable rate over 20 generations in several lactobacilli, such as Lactobacillus curvatus, Lactobacillus casei, and Lactobacillus plantarum, in addition to L. sakei, making it an interesting basis for developing vectors. Sequence relationships with other plasmids are described and discussed. PMID:12957947
2011-01-01
Background The inorganic (Pi) phosphate transporter (PiT) family comprises known and putative Na+- or H+-dependent Pi-transporting proteins with representatives from all kingdoms. The mammalian members are placed in the outer cell membranes and suggested to supply cells with Pi to maintain house-keeping functions. Alignment of protein sequences representing PiT family members from all kingdoms reveals the presence of conserved amino acids and that bacterial phosphate permeases and putative phosphate permeases from archaea lack substantial parts of the protein sequence when compared to the mammalian PiT family members. Besides being Na+-dependent Pi (NaPi) transporters, the mammalian PiT paralogs, PiT1 and PiT2, also are receptors for gamma-retroviruses. We have here exploited the dual-function of PiT1 and PiT2 to study the structure-function relationship of PiT proteins. Results We show that the human PiT2 histidine, H502, and the human PiT1 glutamate, E70, - both conserved in eukaryotic PiT family members - are critical for Pi transport function. Noticeably, human PiT2 H502 is located in the C-terminal PiT family signature sequence, and human PiT1 E70 is located in ProDom domains characteristic for all PiT family members. A human PiT2 truncation mutant, which consists of the predicted 10 transmembrane (TM) domain backbone without a large intracellular domain (human PiT2ΔR254-V483), was found to be a fully functional Pi transporter. Further truncation of the human PiT2 protein by additional removal of two predicted TM domains together with the large intracellular domain created a mutant that resembles a bacterial phosphate permease and an archaeal putative phosphate permease. This human PiT2 truncation mutant (human PiT2ΔL183-V483) did also support Pi transport albeit at very low levels. Conclusions The results suggest that the overall structure of the Pi-transporting unit of the PiT family proteins has remained unchanged during evolution. 
Moreover, in combination, our studies of the gene structure of the human PiT1 and PiT2 genes (SLC20A1 and SLC20A2, respectively) and alignment of protein sequences of PiT family members from all kingdoms, along with the studies of the dual functions of the human PiT paralogs show that these proteins are excellent as models for studying the evolution of a protein's structure-function relationship. PMID:21586110
Bøttger, Pernille; Pedersen, Lene
2011-05-17
The inorganic (Pi) phosphate transporter (PiT) family comprises known and putative Na(+)- or H(+)-dependent Pi-transporting proteins with representatives from all kingdoms. The mammalian members are placed in the outer cell membranes and suggested to supply cells with Pi to maintain house-keeping functions. Alignment of protein sequences representing PiT family members from all kingdoms reveals the presence of conserved amino acids and that bacterial phosphate permeases and putative phosphate permeases from archaea lack substantial parts of the protein sequence when compared to the mammalian PiT family members. Besides being Na(+)-dependent P(i) (NaP(i)) transporters, the mammalian PiT paralogs, PiT1 and PiT2, also are receptors for gamma-retroviruses. We have here exploited the dual-function of PiT1 and PiT2 to study the structure-function relationship of PiT proteins. We show that the human PiT2 histidine, H(502), and the human PiT1 glutamate, E(70),--both conserved in eukaryotic PiT family members--are critical for P(i) transport function. Noticeably, human PiT2 H(502) is located in the C-terminal PiT family signature sequence, and human PiT1 E(70) is located in ProDom domains characteristic for all PiT family members. A human PiT2 truncation mutant, which consists of the predicted 10 transmembrane (TM) domain backbone without a large intracellular domain (human PiT2ΔR(254)-V(483)), was found to be a fully functional P(i) transporter. Further truncation of the human PiT2 protein by additional removal of two predicted TM domains together with the large intracellular domain created a mutant that resembles a bacterial phosphate permease and an archaeal putative phosphate permease. This human PiT2 truncation mutant (human PiT2ΔL(183)-V(483)) did also support P(i) transport albeit at very low levels. The results suggest that the overall structure of the P(i)-transporting unit of the PiT family proteins has remained unchanged during evolution. 
Moreover, in combination, our studies of the gene structure of the human PiT1 and PiT2 genes (SLC20A1 and SLC20A2, respectively) and alignment of protein sequences of PiT family members from all kingdoms, along with the studies of the dual functions of the human PiT paralogs show that these proteins are excellent as models for studying the evolution of a protein's structure-function relationship. © 2011 Bøttger and Pedersen; licensee BioMed Central Ltd.
Replicase activity of purified recombinant protein P2 of double-stranded RNA bacteriophage phi6.
Makeyev, E V; Bamford, D H
2000-01-04
In nature, synthesis of both minus- and plus-sense RNA strands of all the known double-stranded RNA viruses occurs in the interior of a large protein assembly referred to as the polymerase complex. In addition to other proteins, the complex contains a putative polymerase possessing characteristic sequence motifs. However, none of the previous studies has shown template-dependent RNA synthesis directly with an isolated putative polymerase protein. In this report, recombinant protein P2 of double-stranded RNA bacteriophage phi6 was purified and demonstrated in an in vitro enzymatic assay to act as the replicase. The enzyme efficiently utilizes phage-specific, positive-sense RNA substrates to produce double-stranded RNA molecules, which are formed by newly synthesized, full-length minus-strands base paired with the plus-strand templates. P2-catalyzed replication is also shown to be very effective with a broad range of heterologous single-stranded RNA templates. The importance and implications of these results are discussed.
Liu, Chen; Shen, He Ding; Zhou, Na
2016-01-01
The complete mitochondrial genome sequence of Platevindex sp. is described for the first time in this article. The mitogenome (13,908 bp) contains 22 tRNA genes, 2 ribosomal RNA genes, 13 protein-coding genes, and 1 putative control region (CR). The CR is not well characterized owing to the lack of discrete conserved sequence blocks, a feature shared with the CRs of other invertebrate mitochondrial genomes. This gene composition is typical of bivalvia mitochondrial genomes.
Lieutaud, Philippe; Uversky, Alexey V.; Uversky, Vladimir N.; Longhi, Sonia
2016-01-01
ABSTRACT In the last 2 decades it has become increasingly evident that a large number of proteins are either fully or partially disordered. Intrinsically disordered proteins lack a stable 3D structure, are ubiquitous and fulfill essential biological functions. Their conformational heterogeneity is encoded in their amino acid sequences, thereby allowing intrinsically disordered proteins or regions to be recognized based on properties of these sequences. The identification of disordered regions facilitates the functional annotation of proteins and is instrumental for delineating boundaries of protein domains amenable to structural determination with X-ray crystallization. This article discusses a comprehensive selection of databases and methods currently employed to disseminate experimental and putative annotations of disorder, predict disorder and identify regions involved in induced folding. It also provides a set of detailed instructions that should be followed to perform computational analysis of disorder. PMID:28232901
2008-10-13
Furthermore, the encoded protein of this gene is only 30 kDa. A potential GTG start codon at position 625 also encodes a protein that is too small...horizontal bar and putative alternate translation initiation sites (ATG, GTG, and TTG) are indicated. The sizes and locations of the proteins encoded... gray line with rounded rectangles showing sequence features and motifs, including the Ala- and Pro-rich N-terminal region and the C-terminal Cys and
Li, De-Zhu; Guo, Zhen-Hua
2012-01-01
Background Transcriptome sequencing can be used to determine gene sequences and transcript abundance in non-model species, and the advent of next-generation sequencing (NGS) technologies has greatly decreased the cost and time required for this process. Transcriptome data are especially desirable in bamboo species, as certain members constitute an economically and culturally important group of mostly semelparous plants with remarkable flowering features, yet little bamboo genomic research has been performed. Here we present, for the first time, extensive sequence and transcript abundance data for the floral transcriptome of a key bamboo species, Dendrocalamus latiflorus, obtained using the Illumina GAII sequencing platform. Our further goal was to identify patterns of gene expression during bamboo flower development. Results Approximately 96 million sequencing reads were generated and assembled de novo, yielding 146,395 high quality unigenes with an average length of 461 bp. Of these, 80,418 were identified as putative homologs of annotated sequences in the public protein databases, of which 290 were associated with the floral transition and 47 were related to flower development. Digital abundance analysis identified 26,529 transcripts differentially enriched between two developmental stages, young flower buds and older developing flowers. Unigenes found at each stage were categorized according to their putative functional categories. These sequence and putative function data comprise a resource for future investigation of the floral transition and flower development in bamboo species. Conclusions Our results present the first broad survey of a bamboo floral transcriptome. Although it will be necessary to validate the functions carried out by these genes, these results represent a starting point for future functional research on D. latiflorus and related species. PMID:22916120
Molecular Characterization of a Novel Species of Capillovirus from Japanese Apricot (Prunus mume)
Faure, Chantal; Theil, Sébastien; Candresse, Thierry
2018-01-01
With the increased use of high-throughput sequencing methods, new viruses infecting Prunus spp. are being discovered and characterized, especially in the family Betaflexiviridae. Double-stranded RNAs from symptomatic leaves of a Japanese apricot (Prunus mume) tree from Japan were purified and analyzed by Illumina sequencing. Blast comparisons of reconstructed contigs showed that the P. mume sample was infected by a putative novel virus with homologies to Cherry virus A (CVA) and to the newly described Currant virus A (CuVA), both members of genus Capillovirus. Completion of the genome showed the new agent to have a genomic organization typical of capilloviruses, with two overlapping open reading frames encoding a large replication-associated protein fused to the coat protein (CP), and a putative movement protein (MP). This virus shares only, respectively, 63.2% and 62.7% CP amino acid identity with the most closely related viruses, CVA and CuVA. Considering the species demarcation criteria in the family and phylogenetic analyses, this virus should be considered as representing a new viral species in the genus Capillovirus, for which the name of Mume virus A is proposed. PMID:29570605
Kelly, Shannan; Yamamoto, Hideki
2008-01-01
Purpose We previously reported the differential expression and translation of mRNA and protein in dark- and light-adapted octopus retinas, which may result from cytoplasmic polyadenylation element (CPE)–dependent mRNA masking and unmasking. Here we investigate the presence of CPEs in α-tubulin and S-crystallin mRNA and report the identification of cytoplasmic polyadenylation element binding protein (CPEB) in light- and dark-adapted octopus retinas. Methods 3’-RACE and sequencing were used to isolate and analyze the 3’-UTRs of α-tubulin and S-crystallin mRNA. Total retinal protein isolated from light- and dark-adapted octopus retinas was subjected to western blot analysis followed by CPEB antibody detection, PEP-171 inhibition of CPEB, and dephosphorylation of CPEB. Results The following CPE-like sequence was detected in the 3’-UTR of isolated long S-crystallin mRNA variants: UUUAACA. No CPE or CPE-like sequences were detected in the 3’-UTRs of α-tubulin mRNA or of the short S-crystallin mRNA variants. Western blot analysis detected CPEB as two putative bands migrating between 60-80 kDa, while a third band migrated below 30 kDa in dark- and light-adapted retinas. Conclusions The detection of CPEB and the identification of the putative CPE-like sequences in the S-crystallin 3’-UTR suggest that CPEB may be involved in the activation of masked S-crystallin mRNA, but not in the regulation of α-tubulin mRNA, resulting in increased S-crystallin protein synthesis in dark-adapted octopus retinas. PMID:18682811
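A scan for the CPE-like sequence reported above (UUUAACA) can be sketched in a few lines. This is only an illustration of exact-match motif searching, not the authors' analysis pipeline; the example 3'-UTR string is invented for demonstration and is not an octopus sequence.

```python
def find_motif(rna: str, motif: str = "UUUAACA") -> list[int]:
    """Return 0-based start positions of every exact occurrence of motif.

    The input is upper-cased and any T is converted to U, so cDNA-style
    sequences can be scanned with an RNA motif. Overlapping hits are kept.
    """
    rna = rna.upper().replace("T", "U")
    positions = []
    start = rna.find(motif)
    while start != -1:
        positions.append(start)
        start = rna.find(motif, start + 1)
    return positions


# Hypothetical 3'-UTR fragment, made up for this example
hypothetical_utr = "GGCUUUAACAGGAAUAAA"
print(find_motif(hypothetical_utr))
```

An empty list corresponds to the situation described for the α-tubulin and short S-crystallin 3'-UTRs, where no CPE-like sequence was detected.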
Quaglino, Fabio; Kube, Michael; Jawhari, Maan; Abou-Jawdah, Yusuf; Siewert, Christin; Choueiri, Elia; Sobh, Hana; Casati, Paola; Tedeschi, Rosemarie; Lova, Marina Molino; Alma, Alberto; Bianco, Piero Attilio
2015-07-30
Almond witches'-broom (AlmWB), a devastating disease of almond, peach and nectarine in Lebanon, is associated with 'Candidatus Phytoplasma phoenicium'. In the present study, we generated a draft genome sequence of 'Ca. P. phoenicium' strain SA213, representative of phytoplasma strain populations from different host plants, and determined the genetic diversity among phytoplasma strain populations by phylogenetic analyses of 16S rRNA, groEL, tufB and inmp gene sequences. Sequence-based typing and phylogenetic analysis of the gene inmp, encoding an integral membrane protein, distinguished AlmWB-associated phytoplasma strains originating from diverse host plants, whereas their 16S rRNA, tufB and groEL genes shared 100% sequence identity. Moreover, dN/dS analysis indicated positive selection acting on the inmp gene. Additionally, the analysis of the 'Ca. P. phoenicium' draft genome revealed the presence of integral membrane proteins and effector-like proteins and potential candidates for interaction with hosts. One of the integral membrane proteins was predicted as BI-1, an inhibitor of apoptosis-promoting Bax factor. Bioinformatics analyses revealed the presence of putative BI-1 in draft and complete genomes of other 'Ca. Phytoplasma' species. The genetic diversity within 'Ca. P. phoenicium' strain populations in Lebanon suggested that AlmWB disease could be associated with phytoplasma strains derived from the adaptation of an original strain to diverse hosts. Moreover, the identification of a putative inhibitor of apoptosis-promoting Bax factor (BI-1) in the 'Ca. P. phoenicium' draft genome and within genomes of other 'Ca. Phytoplasma' species suggested its potential role as a phytoplasma fitness-increasing factor by modification of the host-defense response.
Das, Abhishek; Panda, Arijit; Singh, Deeksha; Chandrababunaidu, Mathu Malar; Mishra, Gyan Prakash; Bhan, Sushma
2015-01-01
Scytonema tolypothrichoides VB-61278, a terrestrial cyanobacterium, can be exploited to produce commercially important products. Here, we report for the first time a 10-Mb draft genome assembly of S. tolypothrichoides VB-61278, with 214 scaffolds and 7,148 putative protein-coding genes. PMID:25838486
Learning cellular sorting pathways using protein interactions and sequence motifs.
Lin, Tien-Ho; Bar-Joseph, Ziv; Murphy, Robert F
2011-11-01
Proper subcellular localization is critical for proteins to perform their roles in cellular functions. Proteins are transported by different cellular sorting pathways, some of which take a protein through several intermediate locations until reaching its final destination. The pathway a protein is transported through is determined by carrier proteins that bind to specific sequence motifs. In this article, we present a new method that integrates protein interaction and sequence motif data to model how proteins are sorted through these sorting pathways. We use a hidden Markov model (HMM) to represent protein sorting pathways. The model is able to determine intermediate sorting states and to assign carrier proteins and motifs to the sorting pathways. In simulation studies, we show that the method can accurately recover an underlying sorting model. Using data for yeast, we show that our model leads to accurate prediction of subcellular localization. We also show that the pathways learned by our model recover many known sorting pathways and correctly assign proteins to the path they utilize. The learned model identified new pathways and their putative carriers and motifs and these may represent novel protein sorting mechanisms. Supplementary results and software implementation are available from http://murphylab.web.cmu.edu/software/2010_RECOMB_pathways/.
Shi, Huazhong; Kim, YongSig; Guo, Yan; Stevenson, Becky; Zhu, Jian-Kang
2003-01-01
Cell surface proteoglycans have been implicated in many aspects of plant growth and development, but genetic evidence supporting their function has been lacking. Here, we report that the Salt Overly Sensitive5 (SOS5) gene encodes a putative cell surface adhesion protein and is required for normal cell expansion. The sos5 mutant was isolated in a screen for Arabidopsis salt-hypersensitive mutants. Under salt stress, the root tips of sos5 mutant plants swell and root growth is arrested. The root-swelling phenotype is caused by abnormal expansion of epidermal, cortical, and endodermal cells. The SOS5 gene was isolated through map-based cloning. The predicted SOS5 protein contains an N-terminal signal sequence for plasma membrane localization, two arabinogalactan protein–like domains, two fasciclin-like domains, and a C-terminal glycosylphosphatidylinositol lipid anchor signal sequence. The presence of fasciclin-like domains, which typically are found in animal cell adhesion proteins, suggests a role for SOS5 in cell-to-cell adhesion in plants. The SOS5 protein was present at the outer surface of the plasma membrane. The cell walls are thinner in the sos5 mutant, and those between neighboring epidermal and cortical cells in sos5 roots appear less organized. SOS5 is expressed ubiquitously in all plant organs and tissues, including guard cells in the leaf. PMID:12509519
Gu, Xiao-Cui; Zhang, Ya-Nan; Kang, Ke; Dong, Shuang-Lin; Zhang, Long-Wa
2015-01-01
The red turpentine beetle (RTB), Dendroctonus valens LeConte (Coleoptera: Curculionidae, Scolytinae), is a destructive invasive pest of conifers which has become the second most important forest pest nationwide in China. Dendroctonus valens is known to use host odors and aggregation pheromones, as well as non-host volatiles, in host location and mass-attack modulation, and thus antennal olfaction is of the utmost importance for the beetles' survival and fitness. However, information on the genes underlying olfaction has been lacking in D. valens. Here, we report the antennal transcriptome of D. valens from next-generation sequencing, with the goal of identifying the olfaction gene repertoire that is involved in D. valens odor-processing. We obtained 51 million reads that were assembled into 61,889 genes, including 39,831 contigs and 22,058 unigenes. In total, we identified 68 novel putative odorant reception genes, including 21 transcripts encoding for putative odorant binding proteins (OBP), six chemosensory proteins (CSP), four sensory neuron membrane proteins (SNMP), 22 odorant receptors (OR), four gustatory receptors (GR), three ionotropic receptors (IR), and eight ionotropic glutamate receptors. We also identified 155 odorant/xenobiotic degradation enzymes from the antennal transcriptome, putatively identified to be involved in olfaction processes including cytochrome P450s, glutathione-S-transferases, and aldehyde dehydrogenase. Predicted protein sequences were compared with counterparts in Tribolium castaneum, Megacyllene caryae, Ips typographus, Dendroctonus ponderosae, and Agrilus planipennis. The antennal transcriptome described here represents the first study of the repertoire of odor processing genes in D. valens. The genes reported here provide a significant addition to the pool of identified olfactory genes in Coleoptera, which might represent novel targets for insect management. 
The results from our study also will assist with evolutionary analyses of coleopteran olfaction.
Dong, Shuang-Lin; Zhang, Long-Wa
2015-01-01
Background The red turpentine beetle (RTB), Dendroctonus valens LeConte (Coleoptera: Curculionidae, Scolytinae), is a destructive invasive pest of conifers which has become the second most important forest pest nationwide in China. Dendroctonus valens is known to use host odors and aggregation pheromones, as well as non-host volatiles, in host location and mass-attack modulation, and thus antennal olfaction is of the utmost importance for the beetles’ survival and fitness. However, information on the genes underlying olfaction has been lacking in D. valens. Here, we report the antennal transcriptome of D. valens from next-generation sequencing, with the goal of identifying the olfaction gene repertoire that is involved in D. valens odor-processing. Results We obtained 51 million reads that were assembled into 61,889 genes, including 39,831 contigs and 22,058 unigenes. In total, we identified 68 novel putative odorant reception genes, including 21 transcripts encoding for putative odorant binding proteins (OBP), six chemosensory proteins (CSP), four sensory neuron membrane proteins (SNMP), 22 odorant receptors (OR), four gustatory receptors (GR), three ionotropic receptors (IR), and eight ionotropic glutamate receptors. We also identified 155 odorant/xenobiotic degradation enzymes from the antennal transcriptome, putatively identified to be involved in olfaction processes including cytochrome P450s, glutathione-S-transferases, and aldehyde dehydrogenase. Predicted protein sequences were compared with counterparts in Tribolium castaneum, Megacyllene caryae, Ips typographus, Dendroctonus ponderosae, and Agrilus planipennis. Conclusion The antennal transcriptome described here represents the first study of the repertoire of odor processing genes in D. valens. The genes reported here provide a significant addition to the pool of identified olfactory genes in Coleoptera, which might represent novel targets for insect management. 
The results from our study also will assist with evolutionary analyses of coleopteran olfaction. PMID:25938508
Ramalho-Ortigão, J M; Temporal, P; de Oliveira, S M; Barbosa, A F; Vilela, M L; Rangel, E F; Brazil, R P; Traub-Cseko, Y M
2001-01-01
Molecular studies of insect disease vectors are of paramount importance for understanding parasite-vector relationships. Advances in this area have led to important findings regarding changes in vectors' physiology upon blood feeding and parasite infection. Mechanisms for interfering with the vectorial capacity of insects responsible for the transmission of diseases such as malaria, Chagas disease and dengue fever are being devised with the ultimate goal of developing transgenic insects. A primary necessity for this goal is information on gene expression and control in the target insect. Our group is investigating molecular aspects of the interaction between Leishmania parasites and Lutzomyia sand flies. As an initial step in our studies we have used random sequencing of cDNA clones from two expression libraries made from head/thorax and abdomen of sugar-fed L. longipalpis for the identification of expressed sequence tags (EST). We applied differential display reverse transcriptase-PCR and randomly amplified polymorphic DNA-PCR to characterize differentially expressed mRNA from sugar-fed and blood-fed insects, and, in one case, from a L. (V.) braziliensis-infected L. longipalpis. We identified 37 cDNAs that showed homology to known sequences from GenBank. Of these, 32 cDNAs code for constitutive proteins such as zinc finger protein, glutamine synthetase, G binding protein and ubiquitin conjugating enzyme. Three are putative differentially expressed cDNAs from blood-fed and Leishmania-infected midgut: a chitinase, a V-ATPase and a MAP kinase. Finally, two sequences are homologous to Drosophila melanogaster gene products recently discovered through the Drosophila genome initiative.
Busk, Peter Kamp; Lange, Lene
2013-06-01
Functional prediction of carbohydrate-active enzymes is difficult due to low sequence identity. However, similar enzymes often share a few short motifs, e.g., around the active site, even when the overall sequences are very different. To exploit this notion for functional prediction of carbohydrate-active enzymes, we developed a simple algorithm, peptide pattern recognition (PPR), that can divide proteins into groups of sequences that share a set of short conserved sequences. When this method was used on 118 glycoside hydrolase 5 proteins with 9% average pairwise identity and representing four characterized enzymatic functions, 97% of the proteins were sorted into groups correlating with their enzymatic activity. Furthermore, we analyzed 8,138 glycoside hydrolase 13 proteins including 204 experimentally characterized enzymes with 28 different functions. There was a 91% correlation between group and enzyme activity. These results indicate that the function of carbohydrate-active enzymes can be predicted with high precision by finding short, conserved motifs in their sequences. The glycoside hydrolase 61 family is important for fungal biomass conversion, but only a few proteins of this family have been functionally characterized. Interestingly, PPR divided 743 glycoside hydrolase 61 proteins into 16 subfamilies useful for targeted investigation of the function of these proteins and pinpointed three conserved motifs with putative importance for enzyme activity. Furthermore, the conserved sequences were useful for cloning of new, subfamily-specific glycoside hydrolase 61 proteins from 14 fungi. In conclusion, identification of conserved sequence motifs is a new approach to sequence analysis that can predict carbohydrate-active enzyme functions with high precision.
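The idea of grouping proteins by shared short conserved peptides can be illustrated with a deliberately simplified sketch. This is not the published PPR algorithm: it is an assumption-laden toy that links any two sequences sharing at least `min_shared` identical k-mers and reports the connected groups. The parameter values and the toy sequences are invented for the example.

```python
from itertools import combinations


def kmers(seq: str, k: int) -> set[str]:
    """All distinct peptides of length k found in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def group_by_shared_peptides(seqs: dict[str, str], k: int = 3,
                             min_shared: int = 2) -> list[list[str]]:
    """Group sequence names that are linked by >= min_shared common k-mers.

    Simplified illustration only (single-linkage grouping via union-find);
    the real PPR method is not reproduced here.
    """
    names = list(seqs)
    parent = {n: n for n in names}

    def find(n: str) -> str:
        # Follow parent pointers to the group representative, compressing the path
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    peptide_sets = {n: kmers(seqs[n], k) for n in names}
    for a, b in combinations(names, 2):
        if len(peptide_sets[a] & peptide_sets[b]) >= min_shared:
            parent[find(a)] = find(b)  # merge the two groups

    groups: dict[str, list[str]] = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Toy sequences, invented for illustration
toy = {"a": "MKWDGHTT", "b": "AAWDGHLL", "c": "PPPQQQRR", "d": "SSPQQQTT"}
print(group_by_shared_peptides(toy))
```

In the toy input, "a" and "b" share the peptides WDG and DGH, while "c" and "d" share PQQ and QQQ, so two groups are formed even though the rest of each sequence differs.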
Domain fusion analysis by applying relational algebra to protein sequence and domain databases
Truong, Kevin; Ikura, Mitsuhiko
2003-01-01
Background Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain databases like InterPro continue to grow in size and quality, a computational method to perform domain fusion analysis that leverages these efforts will become increasingly powerful. Results This paper proposes a computational method employing relational algebra to find domain fusions in protein sequence databases. The feasibility of this method was illustrated on the SWISS-PROT+TrEMBL sequence database using domain predictions from the Pfam HMM (hidden Markov model) database. We identified 235 and 189 putative functionally linked protein partners in H. sapiens and S. cerevisiae, respectively. From scientific literature, we were able to confirm many of these functional linkages, while the remainder offer testable experimental hypotheses. Results can be viewed at . Conclusion As the analysis can be computed quickly on any relational database that supports standard SQL (structured query language), it can be dynamically updated along with the sequence and domain databases, thereby improving the quality of predictions over time. PMID:12734020
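A domain fusion query of the kind described can be sketched with standard SQL joins. The example below is a minimal illustration, not the authors' schema or query: the table name `domain_hits`, its columns, the protein names and the domain identifiers are all hypothetical. It uses an in-memory SQLite database to find pairs of proteins in one organism whose separate domains occur together in a single "fusion" protein elsewhere.

```python
import sqlite3

# Hypothetical query: f1/f2 are two domains on the same fusion protein;
# pa/pb are distinct proteins of the target organism carrying one domain each.
FUSION_QUERY = """
SELECT DISTINCT pa.protein, pb.protein, f1.protein
FROM domain_hits AS f1
JOIN domain_hits AS f2
  ON f1.protein = f2.protein AND f1.domain < f2.domain
JOIN domain_hits AS pa
  ON pa.domain = f1.domain AND pa.protein != f1.protein
JOIN domain_hits AS pb
  ON pb.domain = f2.domain AND pb.protein != f1.protein
 AND pb.protein != pa.protein
WHERE pa.organism = ? AND pb.organism = ?
  AND NOT EXISTS (SELECT 1 FROM domain_hits AS x
                  WHERE x.protein = pa.protein AND x.domain = f2.domain)
  AND NOT EXISTS (SELECT 1 FROM domain_hits AS y
                  WHERE y.protein = pb.protein AND y.domain = f1.domain)
ORDER BY pa.protein, pb.protein, f1.protein
"""


def find_fusion_links(rows, organism):
    """Return (protein_a, protein_b, fusion_protein) triples for organism.

    rows: iterable of (protein, organism, domain) tuples (hypothetical schema).
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE domain_hits (protein TEXT, organism TEXT, domain TEXT)")
    con.executemany("INSERT INTO domain_hits VALUES (?, ?, ?)", rows)
    result = con.execute(FUSION_QUERY, (organism, organism)).fetchall()
    con.close()
    return result


# Toy data, invented for illustration
toy_rows = [
    ("FUS1", "E. coli", "DOM_A"),
    ("FUS1", "E. coli", "DOM_B"),
    ("YA", "S. cerevisiae", "DOM_A"),
    ("YB", "S. cerevisiae", "DOM_B"),
    ("YC", "S. cerevisiae", "DOM_C"),
]
print(find_fusion_links(toy_rows, "S. cerevisiae"))
```

Because the logic lives entirely in one SELECT statement, re-running it after the domain table is refreshed updates the predicted links, which is the property the abstract highlights.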
Polypeptide p41 of a Norwalk-Like Virus Is a Nucleic Acid-Independent Nucleoside Triphosphatase
Pfister, Thomas; Wimmer, Eckard
2001-01-01
Southampton virus (SHV) is a member of the Norwalk-like viruses (NLVs), one of four genera of the family Caliciviridae. The genome of SHV contains three open reading frames (ORFs). ORF 1 encodes a polyprotein that is autocatalytically processed into six proteins, one of which is p41. p41 shares sequence motifs with protein 2C of picornaviruses and superfamily 3 helicases. We have expressed p41 of SHV in bacteria. Purified p41 exhibited nucleoside triphosphate (NTP)-binding and NTP hydrolysis activities. The NTPase activity was not stimulated by single-stranded nucleic acids. SHV p41 had no detectable helicase activity. Protein sequence comparison between the consensus sequences of NLV p41 and enterovirus protein 2C revealed regions of high similarity. According to secondary structure prediction, the conserved regions were located within a putative central domain of alpha helices and beta strands. This study reveals for the first time an NTPase activity associated with a calicivirus-encoded protein. Based on enzymatic properties and sequence information, a functional relationship between NLV p41 and enterovirus 2C is discussed in regard to the role of 2C-like proteins in virus replication. PMID:11160659
Comparative analysis of programmed cell death pathways in filamentous fungi.
Fedorova, Natalie D; Badger, Jonathan H; Robson, Geoff D; Wortman, Jennifer R; Nierman, William C
2005-12-08
Fungi can undergo autophagic- or apoptotic-type programmed cell death (PCD) on exposure to antifungal agents, developmental signals, and stress factors. Filamentous fungi can also exhibit a form of cell death called heterokaryon incompatibility (HI) triggered by fusion between two genetically incompatible individuals. With the availability of recently sequenced genomes of Aspergillus fumigatus and several related species, we were able to define putative components of fungi-specific death pathways and the ancestral core apoptotic machinery shared by all fungi and metazoa. Phylogenetic profiling of HI-associated proteins from four Aspergilli and seven other fungal species revealed lineage-specific protein families, orphan genes, and core genes conserved across all fungi and metazoa. The Aspergilli-specific domain architectures include NACHT family NTPases, which may function as key integrators of stress and nutrient availability signals. They are often found fused to putative effector domains such as Pfs, SesB/LipA, and a newly identified domain, HET-s/LopB. Many putative HI inducers and mediators are specific to filamentous fungi and not found in unicellular yeasts. In addition to their role in HI, several of them appear to be involved in regulation of cell cycle, development and sexual differentiation. Finally, the Aspergilli possess many putative downstream components of the mammalian apoptotic machinery including several proteins not found in the model yeast, Saccharomyces cerevisiae. Our analysis identified more than 100 putative PCD associated genes in the Aspergilli, which may help expand the range of currently available treatments for aspergillosis and other invasive fungal diseases. The list includes species-specific protein families as well as conserved core components of the ancestral PCD machinery shared by fungi and metazoa.
Hidalgo, A R; Akond, M A; Kita, K; Kataoka, M; Shimizu, S
2001-12-01
Two conjugated polyketone reductases (CPRs) were isolated from Candida parapsilosis IFO 0708. The primary structures of CPRs (C1 and C2) were analyzed by amino acid sequencing. The amino acid sequences of both enzymes had high similarity to those of several proteins of the aldo-keto-reductase (AKR) superfamily. However, several amino acid residues in the putative active sites of AKRs were not conserved in CPRs-C1 and -C2.
Jia, Haiwei; Zhang, Xiaojuan; Wang, Wenjun; Bai, Yuanyuan; Ling, Youguo; Cao, Cheng; Ma, Runlin Z; Zhong, Hui; Wang, Xue; Xu, Quanbin
2015-02-27
Mps1, an essential component of the mitotic checkpoint, is also an important interphase regulator and has roles in DNA damage response, cytokinesis and centrosome duplication. Mps1 predominantly resides in the cytoplasm and relocates into the nucleus at the late G2 phase. So far, the mechanism underlying the Mps1 translocation between the cytoplasm and nucleus has been unclear. In this work, a dynamic export process of Mps1 from the nucleus to the cytoplasm in interphase was revealed, a process blocked by the Crm1 inhibitor Leptomycin B, suggesting that export of Mps1 is Crm1 dependent. Consistent with this speculation, a direct association between Mps1 and Crm1 was found. Furthermore, a putative nuclear export sequence (pNES) motif at the N-terminus of Mps1 was identified by motif analysis of the Mps1 sequence. This motif shows high sequence similarity to the classic NES, and a fusion of this motif with EGFP results in dramatic exclusion of the fusion protein from the nucleus. Additionally, an Mps1 mutant in which pNES integrity was lost by replacing leucine with alanine showed a diffuse subcellular distribution, compared to the wild-type protein, which resides predominantly in the cytoplasm. Taken together, these findings led to the conclusion that the pNES sequence is sufficient for Mps1 export from the nucleus during interphase.
Zhao, J.; Chen, Y. H.; Kwan, H. S.
2000-01-01
The complete nucleotide sequence of putative glucoamylase gene gla1 from the basidiomycetous fungus Lentinula edodes strain L54 is reported. The coding region of the genomic glucoamylase sequence, which is preceded by eukaryotic promoter elements CAAT and TATA, spans 2,076 bp. The gla1 gene sequence codes for a putative polypeptide of 571 amino acids and is interrupted by seven introns. The open reading frame sequence of the gla1 gene shows strong homology with those of other fungal glucoamylase genes and encodes a protein with an N-terminal catalytic domain and a C-terminal starch-binding domain. The similarity between the Gla1 protein and other fungal glucoamylases is from 45 to 61%, with the region of highest conservation found in catalytic domains and starch-binding domains. We compared the kinetics of glucoamylase activity and levels of gene expression in L. edodes strain L54 grown on different carbon sources (glucose, starch, cellulose, and potato extract) and in various developmental stages (mycelium growth, primordium appearance, and fruiting body formation). Quantitative reverse transcription PCR utilizing pairs of primers specific for gla1 gene expression shows that expression of gla1 was induced by starch and increased during the process of fruiting body formation, which indicates that glucoamylases may play an important role in the morphogenesis of the basidiomycetous fungus. PMID:10831434
The Leptospiral Antigen Lp49 is a Two-Domain Protein with Putative Protein Binding Function
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oliveira Giuseppe,P.; Oliveira Neves, F.; Nascimento, A.
2008-01-01
Pathogenic Leptospira is the etiological agent of leptospirosis, a life-threatening disease that affects populations worldwide. Currently available vaccines have limited effectiveness and therapeutic interventions are complicated by the difficulty in making an early diagnosis of leptospirosis. The genome of Leptospira interrogans was recently sequenced and comparative genomic analysis contributed to the identification of surface antigens, potential candidates for development of new vaccines and serodiagnosis. Lp49 is a membrane-associated protein recognized by antibodies present in sera from early and convalescent phases of leptospirosis patients. Its crystal structure was determined by single-wavelength anomalous diffraction using selenomethionine-labelled crystals and refined at 2.0 Angstroms resolution. Lp49 is composed of two domains and belongs to the all-beta-proteins class. The N-terminal domain folds in an immunoglobulin-like beta-sandwich structure, whereas the C-terminal domain presents a seven-bladed beta-propeller fold. Structural analysis of Lp49 indicates putative protein-protein binding sites, suggesting a role in Leptospira-host interaction. This is the first crystal structure of a leptospiral antigen described to date.
Comparative analysis of Leishmania exoproteomes: implication for host-pathogen interactions.
Peysselon, Franck; Launay, Guillaume; Lisacek, Frédérique; Duclos, Bertrand; Ricard-Blum, Sylvie
2013-12-01
Leishmaniasis is a vector-borne disease caused by the protozoa Leishmania. We have analyzed and compared the sequences of three experimental exoproteomes of Leishmania promastigotes from different species to determine their specific features and to identify new candidate proteins involved in interactions of Leishmania with the host. The exoproteomes differ from the proteomes by a decrease in the average molecular weight per protein, in disordered amino acid residues and in basic proteins. The exoproteome of the visceral species is significantly enriched in sites predicted to be phosphorylated as well as in features frequently associated with molecular interactions (intrinsic disorder, number of disordered binding regions per protein, interaction and/or trafficking motifs) compared to the other species. The visceral species might thus have a larger interaction repertoire with the host than the other species. Less than 10% of the exoproteomes contain heparin-binding and RGD sequences, and ~30% the host targeting signal RXLXE/D/Q. These latter proteins might thus be exported inside the host cell during the intracellular stage of the infection. Furthermore we have identified nine protein families conserved in the three exoproteomes with specific combinations of Pfam domains and selected eleven proteins containing at least three interaction and/or trafficking motifs including two splicing factors, phosphomannomutase, 2,3-bisphosphoglycerate-independent phosphoglycerate mutase, the paraflagellar rod protein-1D and a putative helicase. Their role in host-Leishmania interactions warrants further investigation but the putative ATP-dependent DEAD/H RNA helicase, which contains numerous interaction motifs, a host targeting signal and two disordered regions, is a very promising candidate. © 2013.
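The motif counts reported in this abstract rest on scanning protein sequences for short patterns such as RGD and the host targeting signal RXLXE/D/Q. The following is a minimal sketch of such a scan, not the authors' pipeline: the function name, the toy sequence, and the choice to report overlapping hits are assumptions made for illustration, and X is taken to mean any residue.

```python
import re

# Patterns follow the motifs named in the abstract; X is read as "any residue".
# A lookahead is used so that overlapping occurrences are all reported.
HOST_TARGETING = re.compile(r"(?=(R.L.[EDQ]))")  # RXLXE/D/Q
RGD = re.compile(r"(?=(RGD))")

def find_motifs(seq, pattern):
    """Return (1-based start, matched text) for each hit of pattern in seq."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(seq.upper())]

# Toy sequence invented for this example; it is not a Leishmania protein.
toy = "MKRALTEGGRGDSRQLPQ"
print(find_motifs(toy, HOST_TARGETING))
print(find_motifs(toy, RGD))
```

Applied to a whole exoproteome, the fraction of proteins with at least one hit would give figures comparable in kind to the "less than 10%" and "~30%" values quoted above, although the published numbers may depend on filtering steps not described in the abstract.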
NASA Astrophysics Data System (ADS)
Tian, Z. H.; Jiao, C. Z.
2017-07-01
RIG-I-like receptors (RLRs) play key roles in sensing non-self nucleic acids in the cytoplasm and trigger antiviral innate immune responses in vertebrates, including humans. Here we carried out an in silico analysis to identify and investigate the putative RLRs encoded in the genome of the marine mollusk Crassostrea gigas (cgRLRs), an invertebrate species. We found unusual duplication and variety in the domain architecture of the putative cgRLRs encoded in the C. gigas genome. Three putative cgRLRs (GenBank accession numbers EKC24603, EKC31344.1 and EKC38304.1) have a domain architecture similar to that of human RIG-I or MDA5, and one protein (EKC34573.1) has an architecture similar to that of human LGP2. The fifth putative cgRLR (EKC38303.1) is somewhat similar to human RIG-I/MDA5 except that it has only one caspase activation and recruitment domain (CARD) at its N-terminus. Nine other proteins were found to be partially similar to RLRs but with incomplete sequences, which may reflect partial duplication events of cgRLR genes in the oyster genome.
Wolffe, E J; Gause, W C; Pelfrey, C M; Holland, S M; Steinberg, A D; August, J T
1990-01-05
We describe the isolation and sequencing of a cDNA encoding mouse Pgp-1. An oligonucleotide probe corresponding to the NH2-terminal sequence of the purified protein was synthesized by the polymerase chain reaction and used to screen a mouse macrophage lambda gt11 library. A cDNA clone with an insert of 1.2 kilobases was selected and sequenced. In Northern blot analysis, only cells expressing Pgp-1 contained mRNA species that hybridized with this Pgp-1 cDNA. The nucleotide sequence of the cDNA has a single open reading frame that yields a protein-coding sequence of 1076 base pairs followed by a 132-base pair 3'-untranslated sequence that includes a putative polyadenylation signal but no poly(A) tail. The translated sequence comprises a 13-amino acid signal peptide followed by a polypeptide core of 345 residues corresponding to an Mr of 37,800. Portions of the deduced amino acid sequence were identical to those obtained by amino acid sequence analysis from the purified glycoprotein, confirming that the cDNA encodes Pgp-1. The predicted structure of Pgp-1 includes an NH2-terminal extracellular domain (residues 14-265), a transmembrane domain (residues 266-286), and a cytoplasmic tail (residues 287-358). Portions of the mouse Pgp-1 sequence are highly similar to that of the human CD44 cell surface glycoprotein implicated in cell adhesion. The protein also shows sequence similarity to the proteoglycan tandem repeat sequences found in cartilage link protein and cartilage proteoglycan core protein which are thought to be involved in binding to hyaluronic acid.
Non-B-DNA structures on the interferon-beta promoter?
Robbe, K; Bonnefoy, E
1998-01-01
The high mobility group (HMG) I protein intervenes as an essential factor during the virus-induced expression of the interferon-beta (IFN-beta) gene. It is a non-histone chromatin-associated protein that has the dual capacity of binding to a non-B-DNA structure such as cruciform-DNA as well as to AT-rich B-DNA sequences. In this work we compare the binding affinity of HMGI for a synthetic cruciform-DNA to its binding affinity for the HMGI-binding site present in the positive regulatory domain II (PRDII) of the IFN-beta promoter. Using gel retardation experiments, we show that HMGI protein binds with at least ten times more affinity to the synthetic cruciform-DNA structure than to the PRDII B-DNA sequence. DNA hairpin sequences are present in both the human and the murine PRDII-DNAs. We discuss in this work the presence of as yet putative non-B-DNA structures in the IFN-beta promoter.
Fu, Xiao-Zhe; Shi, Cun-Bin; Li, Ning-Qiu; Pan, Hou-Jun; Chang, Ou-Qin; Wu, Shu-Qin
2007-09-01
The major capsid protein (MCP) gene of lymphocystis disease virus isolated from Rachycentron canadum (LCDV-rc) was amplified and analysed. A 457 bp DNA core fragment was amplified with degenerate primers designed from conserved sequences of the MCP gene of iridoviruses; the flanking sequences adjacent to the core region were then amplified by inverse PCR, and the complete sequence was obtained by combining all of them. The open reading frame of the gene is 1380 bp in length, encoding a putative protein of 459 aa with a molecular weight of 51.12 kD and a pI of 6.87. A phylogenetic tree constructed from the MCP amino acid sequences of iridoviruses indicated that LCDV-rc is most closely related to the other lymphocystis disease viruses, all of which form a single branch. Accordingly, LCDV-rc is identified as a Lymphocystivirus.
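The reported ORF length and protein length can be cross-checked with simple arithmetic. The sketch below assumes that the 1380 bp figure includes the stop codon, which the abstract does not state explicitly; the function name is hypothetical.

```python
def orf_protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_bp nucleotides."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # Assumption: the final codon is a stop codon and encodes no residue.
    return codons - 1 if includes_stop else codons

aa = orf_protein_length(1380)          # 460 codons minus the stop codon
mean_residue_mass = 51120 / aa         # 51.12 kD spread over all residues, in Da
print(aa, round(mean_residue_mass, 1))
```

Under that assumption the 1380 bp ORF yields the 459 aa stated in the abstract, and the implied mean residue mass of roughly 111 Da is in the range typical for proteins.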
Zuotin, a putative Z-DNA binding protein in Saccharomyces cerevisiae
NASA Technical Reports Server (NTRS)
Zhang, S.; Lockshin, C.; Herbert, A.; Winter, E.; Rich, A.
1992-01-01
A putative Z-DNA binding protein, named zuotin, was purified from a yeast nuclear extract by means of a Z-DNA binding assay using [32P]poly(dG-m5dC) and [32P]oligo(dG-Br5dC)22 in the presence of B-DNA competitor. Poly(dG-Br5dC) in the Z-form competed well for the binding of a zuotin-containing fraction, but salmon sperm DNA, poly(dG-dC) and poly(dA-dT) were not effective. Negatively supercoiled plasmid pUC19 did not compete, whereas an otherwise identical plasmid pUC19(CG), which contained a (dG-dC)7 segment in the Z-form, was an excellent competitor. A Southwestern blot using [32P]poly(dG-m5dC) as a probe in the presence of MgCl2 identified a protein having a molecular weight of 51 kDa. The 51 kDa zuotin was partially sequenced at the N-terminus and the gene, ZUO1, was cloned, sequenced and expressed in Escherichia coli; the expressed zuotin showed similar Z-DNA binding activity, but with lower affinity than zuotin that had been partially purified from yeast. Zuotin was deduced to have a number of potential phosphorylation sites including two CDC28 (homologous to the human and Schizosaccharomyces pombe cdc2) phosphorylation sites. The hexapeptide motif KYHPDK was found in zuotin as well as in several yeast proteins, DnaJ of E. coli, csp29 and csp32 proteins of Drosophila and the small t and large T antigens of the polyoma virus. A 60 amino acid segment of zuotin has similarity to several histone H1 sequences. Disruption of ZUO1 in yeast resulted in a slow growth phenotype.
Mesquita, Rafael D.; Vionette-Amaral, Raquel J.; Lowenberger, Carl; Rivera-Pomar, Rolando; Monteiro, Fernando A.; Minx, Patrick; Spieth, John; Carvalho, A. Bernardo; Panzera, Francisco; Lawson, Daniel; Torres, André Q.; Ribeiro, Jose M. C.; Sorgine, Marcos H. F.; Waterhouse, Robert M.; Abad-Franch, Fernando; Alves-Bezerra, Michele; Amaral, Laurence R.; Araujo, Helena M.; Aravind, L.; Atella, Georgia C.; Azambuja, Patricia; Berni, Mateus; Bittencourt-Cunha, Paula R.; Braz, Gloria R. C.; Calderón-Fernández, Gustavo; Carareto, Claudia M. A.; Christensen, Mikkel B.; Costa, Igor R.; Costa, Samara G.; Dansa, Marilvia; Daumas-Filho, Carlos R. O.; De-Paula, Iron F.; Dias, Felipe A.; Dimopoulos, George; Emrich, Scott J.; Esponda-Behrens, Natalia; Fampa, Patricia; Fernandez-Medina, Rita D.; da Fonseca, Rodrigo N.; Fontenele, Marcio; Fronick, Catrina; Fulton, Lucinda A.; Gandara, Ana Caroline; Garcia, Eloi S.; Genta, Fernando A.; Giraldo-Calderón, Gloria I.; Gomes, Bruno; Gondim, Katia C.; Granzotto, Adriana; Guarneri, Alessandra A.; Guigó, Roderic; Harry, Myriam; Hughes, Daniel S. T.; Jablonka, Willy; Jacquin-Joly, Emmanuelle; Juárez, M. Patricia; Koerich, Leonardo B.; Lange, Angela B.; Latorre-Estivalis, José Manuel; Lavore, Andrés; Lawrence, Gena G.; Lazoski, Cristiano; Lazzari, Claudio R.; Lopes, Raphael R.; Lorenzo, Marcelo G.; Lugon, Magda D.; Marcet, Paula L.; Mariotti, Marco; Masuda, Hatisaburo; Megy, Karine; Missirlis, Fanis; Mota, Theo; Noriega, Fernando G.; Nouzova, Marcela; Nunes, Rodrigo D.; Oliveira, Raquel L. L.; Oliveira-Silveira, Gilbert; Ons, Sheila; Orchard, Ian; Pagola, Lucia; Paiva-Silva, Gabriela O.; Pascual, Agustina; Pavan, Marcio G.; Pedrini, Nicolás; Peixoto, Alexandre A.; Pereira, Marcos H.; Pike, Andrew; Polycarpo, Carla; Prosdocimi, Francisco; Ribeiro-Rodrigues, Rodrigo; Robertson, Hugh M.; Salerno, Ana Paula; Salmon, Didier; Santesmasses, Didac; Schama, Renata; Seabra-Junior, Eloy S.; Silva-Cardoso, Livia; Silva-Neto, Mario A. 
C.; Souza-Gomes, Matheus; Sterkel, Marcos; Taracena, Mabel L.; Tojo, Marta; Tu, Zhijian Jake; Tubio, Jose M. C.; Ursic-Bedoya, Raul; Venancio, Thiago M.; Walter-Nuno, Ana Beatriz; Wilson, Derek; Warren, Wesley C.; Wilson, Richard K.; Huebner, Erwin; Dotson, Ellen M.; Oliveira, Pedro L.
2015-01-01
Rhodnius prolixus not only has served as a model organism for the study of insect physiology, but also is a major vector of Chagas disease, an illness that affects approximately seven million people worldwide. We sequenced the genome of R. prolixus, generated assembled sequences covering 95% of the genome (∼702 Mb), including 15,456 putative protein-coding genes, and completed comprehensive genomic analyses of this obligate blood-feeding insect. Although immune-deficiency (IMD)-mediated immune responses were observed, R. prolixus putatively lacks key components of the IMD pathway, suggesting a reorganization of the canonical immune signaling network. Although both Toll and IMD effectors controlled intestinal microbiota, neither affected Trypanosoma cruzi, the causal agent of Chagas disease, implying the existence of evasion or tolerance mechanisms. R. prolixus has experienced an extensive loss of selenoprotein genes, with its repertoire reduced to only two proteins, one of which is a selenocysteine-based glutathione peroxidase, the first found in insects. The genome contained actively transcribed, horizontally transferred genes from Wolbachia sp., which showed evidence of codon use evolution toward the insect use pattern. Comparative protein analyses revealed many lineage-specific expansions and putative gene absences in R. prolixus, including tandem expansions of genes related to chemoreception, feeding, and digestion that possibly contributed to the evolution of a blood-feeding lifestyle. The genome assembly and these associated analyses provide critical information on the physiology and evolution of this important vector species and should be instrumental for the development of innovative disease control methods. PMID:26627243
Mesquita, Rafael D; Vionette-Amaral, Raquel J; Lowenberger, Carl; Rivera-Pomar, Rolando; Monteiro, Fernando A; Minx, Patrick; Spieth, John; Carvalho, A Bernardo; Panzera, Francisco; Lawson, Daniel; Torres, André Q; Ribeiro, Jose M C; Sorgine, Marcos H F; Waterhouse, Robert M; Montague, Michael J; Abad-Franch, Fernando; Alves-Bezerra, Michele; Amaral, Laurence R; Araujo, Helena M; Araujo, Ricardo N; Aravind, L; Atella, Georgia C; Azambuja, Patricia; Berni, Mateus; Bittencourt-Cunha, Paula R; Braz, Gloria R C; Calderón-Fernández, Gustavo; Carareto, Claudia M A; Christensen, Mikkel B; Costa, Igor R; Costa, Samara G; Dansa, Marilvia; Daumas-Filho, Carlos R O; De-Paula, Iron F; Dias, Felipe A; Dimopoulos, George; Emrich, Scott J; Esponda-Behrens, Natalia; Fampa, Patricia; Fernandez-Medina, Rita D; da Fonseca, Rodrigo N; Fontenele, Marcio; Fronick, Catrina; Fulton, Lucinda A; Gandara, Ana Caroline; Garcia, Eloi S; Genta, Fernando A; Giraldo-Calderón, Gloria I; Gomes, Bruno; Gondim, Katia C; Granzotto, Adriana; Guarneri, Alessandra A; Guigó, Roderic; Harry, Myriam; Hughes, Daniel S T; Jablonka, Willy; Jacquin-Joly, Emmanuelle; Juárez, M Patricia; Koerich, Leonardo B; Lange, Angela B; Latorre-Estivalis, José Manuel; Lavore, Andrés; Lawrence, Gena G; Lazoski, Cristiano; Lazzari, Claudio R; Lopes, Raphael R; Lorenzo, Marcelo G; Lugon, Magda D; Majerowicz, David; Marcet, Paula L; Mariotti, Marco; Masuda, Hatisaburo; Megy, Karine; Melo, Ana C A; Missirlis, Fanis; Mota, Theo; Noriega, Fernando G; Nouzova, Marcela; Nunes, Rodrigo D; Oliveira, Raquel L L; Oliveira-Silveira, Gilbert; Ons, Sheila; Orchard, Ian; Pagola, Lucia; Paiva-Silva, Gabriela O; Pascual, Agustina; Pavan, Marcio G; Pedrini, Nicolás; Peixoto, Alexandre A; Pereira, Marcos H; Pike, Andrew; Polycarpo, Carla; Prosdocimi, Francisco; Ribeiro-Rodrigues, Rodrigo; Robertson, Hugh M; Salerno, Ana Paula; Salmon, Didier; Santesmasses, Didac; Schama, Renata; Seabra-Junior, Eloy S; Silva-Cardoso, Livia; Silva-Neto, Mario A C; 
Souza-Gomes, Matheus; Sterkel, Marcos; Taracena, Mabel L; Tojo, Marta; Tu, Zhijian Jake; Tubio, Jose M C; Ursic-Bedoya, Raul; Venancio, Thiago M; Walter-Nuno, Ana Beatriz; Wilson, Derek; Warren, Wesley C; Wilson, Richard K; Huebner, Erwin; Dotson, Ellen M; Oliveira, Pedro L
2015-12-01
Rhodnius prolixus not only has served as a model organism for the study of insect physiology, but also is a major vector of Chagas disease, an illness that affects approximately seven million people worldwide. We sequenced the genome of R. prolixus, generated assembled sequences covering 95% of the genome (∼ 702 Mb), including 15,456 putative protein-coding genes, and completed comprehensive genomic analyses of this obligate blood-feeding insect. Although immune-deficiency (IMD)-mediated immune responses were observed, R. prolixus putatively lacks key components of the IMD pathway, suggesting a reorganization of the canonical immune signaling network. Although both Toll and IMD effectors controlled intestinal microbiota, neither affected Trypanosoma cruzi, the causal agent of Chagas disease, implying the existence of evasion or tolerance mechanisms. R. prolixus has experienced an extensive loss of selenoprotein genes, with its repertoire reduced to only two proteins, one of which is a selenocysteine-based glutathione peroxidase, the first found in insects. The genome contained actively transcribed, horizontally transferred genes from Wolbachia sp., which showed evidence of codon use evolution toward the insect use pattern. Comparative protein analyses revealed many lineage-specific expansions and putative gene absences in R. prolixus, including tandem expansions of genes related to chemoreception, feeding, and digestion that possibly contributed to the evolution of a blood-feeding lifestyle. The genome assembly and these associated analyses provide critical information on the physiology and evolution of this important vector species and should be instrumental for the development of innovative disease control methods.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mavromatis, K; Doyle, C Kuyler; Lykidis, A
2006-01-01
Ehrlichia canis, a small obligately intracellular, tick-transmitted, gram-negative, α-proteobacterium, is the primary etiologic agent of globally distributed canine monocytic ehrlichiosis. Complete genome sequencing revealed that the E. canis genome consists of a single circular chromosome of 1,315,030 bp predicted to encode 925 proteins, 40 stable RNA species, 17 putative pseudogenes, and a substantial proportion of noncoding sequence (27%). Interesting genome features include a large set of proteins with transmembrane helices and/or signal sequences and a unique serine-threonine bias associated with the potential for O glycosylation that was prominent in proteins associated with pathogen-host interactions. Furthermore, two paralogous protein families associated with immune evasion were identified, one of which contains poly(G-C) tracts, suggesting that they may play a role in phase variation and facilitation of persistent infections. Genes associated with pathogen-host interactions were identified, including a small group encoding proteins (n = 12) with tandem repeats and another group encoding proteins with eukaryote-like ankyrin domains (n = 7).
2011-01-01
The genomic DNA sequence of a novel enteric uncultured microphage, ΦCA82 from a turkey gastrointestinal system was determined utilizing metagenomics techniques. The entire circular, single-stranded nucleotide sequence of the genome was 5,514 nucleotides. The ΦCA82 genome is quite different from other microviruses as indicated by comparisons of nucleotide similarity, predicted protein similarity, and functional classifications. Only three genes showed significant similarity to microviral proteins as determined by local alignments using BLAST analysis. ORF1 encoded a predicted phage F capsid protein that was phylogenetically most similar to the Microviridae ΦMH2K member's major coat protein. The ΦCA82 genome also encoded a predicted minor capsid protein (ORF2) and putative replication initiation protein (ORF3) most similar to the microviral bacteriophage SpV4. The distant evolutionary relationship of ΦCA82 suggests that the divergence of this novel turkey microvirus from other microviruses may reflect unique evolutionary pressures encountered within the turkey gastrointestinal system. PMID:21714899
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mavromatis, K.; Kuyler Doyle, C.; Lykidis, A.
2005-09-01
Ehrlichia canis, a small obligately intracellular, tick-transmitted, gram-negative, α-proteobacterium, is the primary etiologic agent of globally distributed canine monocytic ehrlichiosis. Complete genome sequencing revealed that the E. canis genome consists of a single circular chromosome of 1,315,030 bp predicted to encode 925 proteins, 40 stable RNA species, 17 putative pseudogenes, and a substantial proportion of non-coding sequence (27 percent). Interesting genome features include a large set of proteins with transmembrane helices and/or signal sequences, and a unique serine-threonine bias associated with the potential for O-glycosylation that was prominent in proteins associated with pathogen-host interactions. Furthermore, two paralogous protein families associated with immune evasion were identified, one of which contains poly G:C tracts, suggesting that they may play a role in phase variation and facilitation of persistent infections. Proteins associated with pathogen-host interactions were identified including a small group of proteins (12) with tandem repeats and another with eukaryotic-like ankyrin domains (7).
Tsuzuki, Syusaku; Handa, Yoshihiro; Takeda, Naoya; Kawaguchi, Masayoshi
2016-04-01
Arbuscular mycorrhizal (AM) symbiosis is the most widespread association between plants and fungi. To provide novel insights into the molecular mechanisms of AM symbiosis, we screened and investigated genes of the AM fungus Rhizophagus irregularis that contribute to the infection of host plants. R. irregularis genes involved in the infection were explored by RNA-sequencing (RNA-seq) analysis. One of the identified genes was then characterized by a reverse genetic approach using host-induced gene silencing (HIGS), which causes RNA interference in the fungus via the host plant. The RNA-seq analysis revealed that 19 genes are up-regulated by both treatment with strigolactone (SL) (a plant symbiotic signal) and symbiosis. Eleven of the 19 genes were predicted to encode secreted proteins and, of these, SL-induced putative secreted protein 1 (SIS1) showed the largest induction under both conditions. In hairy roots of Medicago truncatula, SIS1 expression is knocked down by HIGS, resulting in significant suppression of colonization and formation of stunted arbuscules. These results suggest that SIS1 is a putative secreted protein that is induced in a wide spatiotemporal range including both the presymbiotic and symbiotic stages and that SIS1 positively regulates colonization of host plants by R. irregularis.
Porcine parvovirus: DNA sequence and genome organization.
Ranz, A I; Manclús, J J; Díaz-Aroca, E; Casal, J I
1989-10-01
We have determined the nucleotide sequence of an almost full-length clone of porcine parvovirus (PPV). The sequence is 4973 nucleotides (nt) long. The 3' end of virion DNA shows a Y-shaped configuration homologous to rodent parvoviruses. The 5' end of virion DNA shows a repetition of 127 nt at the carboxy terminus of the capsid proteins. The overall organization of the PPV genome is similar to those of other autonomous parvoviruses. There are two large open reading frames (ORFs) that almost entirely cover the genome, both located in the same frame of the complementary strand. The left ORF encodes the non-structural protein NS1 and the right ORF encodes the capsid proteins (VP1, VP2 and VP3). Promoter analysis, location of splicing sites and putative amino acid sequences for the viral proteins show a high homology of PPV with feline panleukopenia virus and canine parvoviruses (FPV and CPV) and rodent parvovirus. Therefore we conclude that PPV is related to the Kilham rat virus (KRV) group of autonomous parvoviruses formed by KRV, minute virus of mice, Lu III, H-1, FPV and CPV.
Molecular evolution of miraculin-like proteins in soybean Kunitz super-family.
Selvakumar, Purushotham; Gahloth, Deepankar; Tomar, Prabhat Pratap Singh; Sharma, Nidhi; Sharma, Ashwani Kumar
2011-12-01
Miraculin-like proteins (MLPs) belong to soybean Kunitz super-family and have been characterized from many plant families like Rutaceae, Solanaceae, Rubiaceae, etc. Many of them possess trypsin inhibitory activity and are involved in plant defense. MLPs exhibit significant sequence identity (~30-95%) to native miraculin protein, also belonging to Kunitz super-family compared with a typical Kunitz family member (~30%). The sequence and structure-function comparison of MLPs with that of a classical Kunitz inhibitor have demonstrated that MLPs have evolved to form a distinct group within Kunitz super-family. Sequence analysis of new genes along with available MLP sequences in the literature revealed three major groups for these proteins. A significant feature of Rutaceae MLP type 2 sequences is the presence of phosphorylation motif. Subtle changes are seen in putative reactive loop residues among different MLPs suggesting altered specificities to specific proteases. In phylogenetic analysis, Rutaceae MLP type 1 and type 2 proteins clustered together on separate branches, whereas native miraculin along with other MLPs formed distinct clusters. Site-specific positive Darwinian selection was observed at many sites in both the groups of Rutaceae MLP sequences with most of the residues undergoing positive selection located in loop regions. The results demonstrate the sequence and thereby the structure-function divergence of MLPs as a distinct group within soybean Kunitz super-family due to biotic and abiotic stresses of local environment.
Das, Abhishek; Panda, Arijit; Singh, Deeksha; Chandrababunaidu, Mathu Malar; Mishra, Gyan Prakash; Bhan, Sushma; Adhikary, Siba Prasad; Tripathy, Sucheta
2015-04-02
Scytonema tolypothrichoides VB-61278, a terrestrial cyanobacterium, can be exploited to produce commercially important products. Here, we report for the first time a 10-Mb draft genome assembly of S. tolypothrichoides VB-61278, with 214 scaffolds and 7,148 putative protein-coding genes. Copyright © 2015 Das et al.
The genome of black cottonwood, Populus trichocarpa (Torr. & Gray)
G.A. Tuskan; S. DiFazio; S. Jansson; J. Bohlmann; I. Grigoriev; U. Hellsten; N. Putnam; S. Ralph; S. Rombauts; A. Salamov; J. Schein; L. Sterck; A. Aerts; R.R. Bhalerao; R.P. Bhalerao; D. Blaudez; W. Boerjan; A. Brun; A. Brunner; V. Busov; M. Campbell; J. Carlson; M. Chalot; J. Chapman; G.-L. Chen; D. Cooper; P.M. Coutinho; J. Couturier; S. Covert; Q. Cronk; R. Cunningham; J. Davis; S. Degroeve; A. Dejardin; C. dePamphilis; J. Detter; B. Dirks; U. Dubchak; S. Duplessis; J. Ehlting; B. Ellis; K. Gendler; D. Goodstein; M. Gribskov; J. Grimwood; A. Groover; L. Gunter; B. Hamberger; B. Heinze; Y. Helariutta; B. Henrissat; D. Holligan; R. Holt; W. Huang; N. Islam-Faridi; S. Jones; M. Jones-Rhoades; R. Jorgensen; C. Joshi; J. Kangasjarvi; J. Karlsson; C. Kelleher; R. Kirkpatrick; M. Kirst; A. Kohler; U. Kalluri; F. Larimer; J. Leebens-Mack; J.-C. Leple; P. Locascio; Y. Lou; S. Lucas; F. Martin; B. Montanini; C. Napoli; D.R. Nelson; C. Nelson; K. Nieminen; O. Nilsson; V. Pereda; G. Peter; R. Philippe; G. Pilate; A. Poliakov; J. Razumovskaya; P. Richardson; C. Rinaldi; K. Ritland; P. Rouze; D. Ryaboy; J. Schumtz; J. Schrader; B. Segerman; H. Shin; A. Siddiqui; F. Sterky; A. Terry; C.-J. Tsai; E. Uberbacher; P. Unneberg; J. Vahala; K. Wall; S. Wessler; G. Yang; T. Yin; C. Douglas; M. Marra; G. Sandberg; Y. Van de Peer; D. Rokhsar
2006-01-01
We report the draft genome of the black cottonwood tree, Populus trichocarpa. Integration of shotgun sequence assembly with genetic mapping enabled chromosome-scale reconstruction of the genome. More than 45,000 putative protein-coding genes were identified. Analysis of the assembled genome revealed a whole-genome duplication event; about 8000 pairs...
Iiyama, Kazuhiro; Otao, Masahiro; Mori, Kazuki; Mon, Hiroaki; Lee, Jae Man; Kusakabe, Takahiro; Tashiro, Kousuke; Asano, Shin-Ichiro; Yasunaga-Aoki, Chisa
2014-01-01
To determine the phylogenetic relationship among Paenibacillus species, putative replication origin regions were compared. In the rsmG-gyrA region, gene arrangements in Paenibacillus species were identical to those of Bacillus species, with the exception of an open reading frame (orf14) positioned between gyrB and gyrA, which was observed only in Paenibacillus species. The orf14 product was homologous to the endospore-associated proteins YheC and YheD of Bacillus subtilis. Phylogenetic analysis based on the YheCD proteins suggested that Orf14 could be categorized into the YheC group. In the Paenibacillus genome, DnaA box clusters were found in rpmH-dnaA and dnaA-dnaN intergenic regions, known as box regions C and R, respectively; this localization was similar to that observed in B. halodurans. A phylogenetic tree based on the nucleotide sequences of the whole replication origin regions suggested that P. popilliae, P. thiaminolyticus, and P. dendritiformis are closely related species.
Regulation of the alpha-glucuronidase-encoding gene (aguA) from Aspergillus niger.
de Vries, R P; van de Vondervoort, P J I; Hendriks, L; van de Belt, M; Visser, J
2002-09-01
The alpha-glucuronidase gene aguA from Aspergillus niger was cloned and characterised. Analysis of the promoter region of aguA revealed the presence of four putative binding sites for the major carbon catabolite repressor protein CREA and one putative binding site for the transcriptional activator XLNR. In addition, a sequence motif was detected which differed only in the last nucleotide from the XLNR consensus site. A construct in which part of the aguA coding region was deleted still resulted in production of a stable mRNA upon transformation of A. niger. The putative XLNR binding sites and two of the putative CREA binding sites were mutated individually in this construct and the effects on expression were examined in A. niger transformants. Northern analysis of the transformants revealed that the consensus XLNR site is not actually functional in the aguA promoter, whereas the sequence that diverges from the consensus at a single position is functional. This indicates that XLNR is also able to bind to the sequence GGCTAG, and the XLNR binding site consensus should therefore be changed to GGCTAR. Both CREA sites are functional, indicating that CREA has a strong influence on aguA expression. A detailed expression analysis of aguA in four genetic backgrounds revealed a second regulatory system involved in activation of aguA gene expression. This system responds to the presence of glucuronic and galacturonic acids, and is not dependent on XLNR.
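The revised consensus GGCTAR uses the IUPAC code R (A or G), so it covers both GGCTAA and GGCTAG. A minimal sketch of how such a degenerate motif could be located on both strands of a promoter is given below; the function names and the promoter fragment are invented for illustration and are not taken from the study.

```python
import re

# IUPAC code R stands for a purine (A or G), so GGCTAR covers GGCTAA and GGCTAG.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_sites(seq, motif="GGCTAR"):
    """Return sorted (0-based forward-strand start, strand, matched text) tuples."""
    # A lookahead is used so that overlapping matches are also reported.
    pattern = re.compile("(?=(" + "".join(IUPAC[b] for b in motif) + "))")
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    for m in pattern.finditer(reverse_complement(seq)):
        # Convert the reverse-strand coordinate back to the forward strand.
        hits.append((n - m.start() - len(motif), "-", m.group(1)))
    return sorted(hits)


# Hypothetical promoter fragment, invented for illustration only.
promoter = "TTGGCTAATCCGATCTAGCCAAGGCTAGTT"
sites = find_sites(promoter)
for start, strand, text in sites:
    print(start, strand, text)
```

A scan of this kind only lists candidate sites; as the abstract shows, whether a site is functional has to be established experimentally.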
DOE Office of Scientific and Technical Information (OSTI.GOV)
Leung, Elo; Huang, Amy; Cadag, Eithon; ...
2016-01-20
In this study, we introduce the Protein Sequence Annotation Tool (PSAT), a web-based, sequence annotation meta-server for performing integrated, high-throughput, genome-wide sequence analyses. Our goals in building PSAT were to (1) create an extensible platform for integration of multiple sequence-based bioinformatics tools, (2) enable functional annotations and enzyme predictions over large input protein FASTA data sets, and (3) provide a web interface for convenient execution of the tools. In this paper, we demonstrate the utility of PSAT by annotating the predicted peptide gene products of Herbaspirillum sp. strain RV1423, importing the results of PSAT into EC2KEGG, and using the resulting functional comparisons to identify a putative catabolic pathway, thereby distinguishing RV1423 from a well-annotated Herbaspirillum species. This analysis demonstrates that high-throughput enzyme predictions, provided by PSAT processing, can be used to identify metabolic potential in an otherwise poorly annotated genome. Lastly, PSAT is a meta-server that combines the results from several sequence-based annotation and function prediction codes, and is available at http://psat.llnl.gov/psat/. PSAT stands apart from other sequence-based genome annotation systems in providing a high-throughput platform for rapid de novo enzyme predictions and sequence annotations over large input protein sequence data sets in FASTA. PSAT is most appropriately applied in annotation of large protein FASTA sets that may or may not be associated with a single genome.
Pal, Debojyoti; Sharma, Deepak; Kumar, Mukesh; Sandur, Santosh K
2016-09-01
S-glutathionylation of proteins plays an important role in various biological processes and is known to be a protective modification during oxidative stress. Since experimental detection of S-glutathionylation is labor intensive and time consuming, a bioinformatics-based approach is a viable alternative. Available methods require relatively long sequence information, which may prevent prediction if sequence information is incomplete. Here, we present a model to predict glutathionylation sites from pentapeptide sequences. It is based upon differential association of amino acids with glutathionylated and non-glutathionylated cysteines from a database of experimentally verified sequences. These data were used to calculate position-dependent F-scores, which measure how a particular amino acid at a particular position may affect the likelihood of a glutathionylation event. A glutathionylation score (G-score), indicating the propensity of a sequence to undergo glutathionylation, was calculated using position-dependent F-scores for each amino acid. Cut-off values were used for prediction. Our model returned an accuracy of 58% with a Matthews correlation coefficient (MCC) value of 0.165. On an independent dataset, our model outperformed the currently available model, in spite of needing much less sequence information. Pentapeptide motifs having high abundance among glutathionylated proteins were identified. A list of potential glutathionylation hotspot sequences was obtained by assigning G-scores, and subsequent Protein-BLAST analysis revealed a total of 254 putative glutathionable proteins, a number of which were already known to be glutathionylated. Our model predicted glutathionylation sites in 93.93% of experimentally verified glutathionylated proteins. The outcome of this study may assist in discovering novel glutathionylation sites and finding candidate proteins for glutathionylation.
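The abstract reports both an accuracy and a Matthews correlation coefficient (MCC). As a hedged illustration of how the two metrics relate, the sketch below computes each from a confusion matrix using the standard definitions. The counts are invented so that the accuracy comes out at 58%; they are not the study's data, and the resulting MCC is only of the same order as the reported value.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical confusion-matrix counts, chosen only for illustration.
tp, tn, fp, fn = 60, 56, 44, 40

acc_value = accuracy(tp, tn, fp, fn)
mcc_value = mcc(tp, tn, fp, fn)
print(f"accuracy = {acc_value:.2f}, MCC = {mcc_value:.3f}")
```

Because MCC uses all four cells of the confusion matrix, a classifier that is only modestly better than chance yields a value close to zero even when its accuracy is above 50%.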
Arisue, Nobuko; Sánchez, Lidya B.; Weiss, Louis M.; Müller, Miklós; Hashimoto, Tetsuo
2011-01-01
Genes encoding putative mitochondrial-type heat shock protein 70 (mit-hsp70) were isolated and sequenced from amitochondriate protists, Giardia intestinalis, Entamoeba histolytica, and two microsporidians, Encephalitozoon hellem and Glugea plecoglossi. The deduced mit-hsp70 sequences were analyzed by sequence alignments and phylogenetic reconstructions. The mit-hsp70 sequences of these four amitochondriate protists were divergent from other mit-hsp70 sequences of mitochondriate eukaryotes. However, all of these sequences were clearly located within a eukaryotic mitochondrial clade in the tree including various types of hsp70 sequences, supporting the emerging notion that none of these amitochondriate lineages are primitively amitochondrial, but lost their mitochondria secondarily in their evolutionary past. PMID:11880223
Peng, Jing; Peng, Futian; Zhu, Chunfu; Wei, Shaochong
2008-06-01
A putative isopentenyltransferase (IPT) encoding gene was identified from a pingyitiancha (Malus hupehensis Rehd.) expressed sequence tag database, and the full-length gene was cloned by RACE. Based on expression profile and sequence alignment, the nucleotide sequence of the clone, named MhIPT3, was most similar to AtIPT3, an IPT gene in Arabidopsis. The full-length cDNA contained a 963-bp open reading frame encoding a protein of 321 amino acids with a molecular mass of 37.3 kDa. Sequence analysis of genomic DNA revealed the absence of introns in the frame. Quantitative real-time PCR analysis demonstrated that the gene was expressed in roots, stems and leaves. Application of nitrate to roots of nitrogen-deprived seedlings strongly induced expression of MhIPT3 and was accompanied by the accumulation of cytokinins, whereas MhIPT3 expression was little affected by ammonium application to roots of nitrogen-deprived seedlings. Application of nitrate to leaves also up-regulated the expression of MhIPT3 and corresponded closely with the accumulation of isopentyladenine and isopentyladenosine in leaves.
Lahr, Roni M; Mack, Seshat M; Héroux, Annie; Blagden, Sarah P; Bousquet-Antonelli, Cécile; Deragon, Jean-Marc; Berman, Andrea J
2015-09-18
La-related protein 1 (LARP1) regulates the stability of many mRNAs. These include 5'TOPs, mTOR-kinase responsive mRNAs with pyrimidine-rich 5' UTRs, which encode ribosomal proteins and translation factors. We determined that the highly conserved LARP1-specific C-terminal DM15 region of human LARP1 directly binds a 5'TOP sequence. The crystal structure of this DM15 region refined to 1.86 Å resolution has three structurally related and evolutionarily conserved helix-turn-helix modules within each monomer. These motifs resemble HEAT repeats, ubiquitous helical protein-binding structures, but their sequences are inconsistent with consensus sequences of known HEAT modules, suggesting this structure has been repurposed for RNA interactions. A putative mTORC1-recognition sequence sits within a flexible loop C-terminal to these repeats. We also present modelling of pyrimidine-rich single-stranded RNA onto the highly conserved surface of the DM15 region. These studies lay the foundation necessary for proceeding toward a structural mechanism by which LARP1 links mTOR signalling to ribosome biogenesis. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Lahr, Roni M.; Mack, Seshat M.; Heroux, Annie; ...
2015-07-22
La-related protein 1 (LARP1) regulates the stability of many mRNAs. These include 5'TOPs, mTOR-kinase responsive mRNAs with pyrimidine-rich 5' UTRs, which encode ribosomal proteins and translation factors. We determined that the highly conserved LARP1-specific C-terminal DM15 region of human LARP1 directly binds a 5'TOP sequence. The crystal structure of this DM15 region refined to 1.86 Å resolution has three structurally related and evolutionarily conserved helix-turn-helix modules within each monomer. These motifs resemble HEAT repeats, ubiquitous helical protein-binding structures, but their sequences are inconsistent with consensus sequences of known HEAT modules, suggesting this structure has been repurposed for RNA interactions. A putative mTORC1-recognition sequence sits within a flexible loop C-terminal to these repeats. We also present modelling of pyrimidine-rich single-stranded RNA onto the highly conserved surface of the DM15 region. Ultimately, these studies lay the foundation necessary for proceeding toward a structural mechanism by which LARP1 links mTOR signalling to ribosome biogenesis.
Dores, Robert M.
2016-01-01
The evolution of the melanocortin receptors (MCRs) is closely associated with the evolution of the melanocortin-2 receptor accessory proteins (MRAPs). Recent annotation of the elephant shark genome project revealed the sequence of a putative MRAP1 ortholog. The presence of this sequence in the genome of a cartilaginous fish raises the possibility that the mrap1 and mrap2 genes in the genomes of gnathostome vertebrates were the result of the chordate 2R genome duplication event. The presence of a putative MRAP1 ortholog in a cartilaginous fish genome is perplexing. Recent studies on melanocortin-2 receptor (MC2R) in the genomes of the elephant shark and the Japanese stingray indicate that these MC2R orthologs can be functionally expressed in CHO cells without co-expression of an exogenous mrap1 cDNA. The novel ligand selectivity of these cartilaginous fish MC2R orthologs is discussed. Finally, the origin of the mc2r and mc5r genes is reevaluated. The distinctive primary sequence conservation of MC2R and MC5R is discussed in light of the physiological roles of these two MCR paralogs. PMID:27445982
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lai, Xiaokuang; Davis, F.C.; Ingram, L.O.
1997-02-01
Genomic libraries from nine cellobiose-metabolizing bacteria were screened for cellobiose utilization. Positive clones were recovered from six libraries, all of which encode phosphoenolpyruvate:carbohydrate phosphotransferase system (PTS) proteins. Clones from Bacillus subtilis, Butyrivibrio fibrisolvens, and Klebsiella oxytoca allowed the growth of recombinant Escherichia coli in cellobiose-M9 minimal medium. The K. oxytoca clone, pLOI1906, exhibited an unusually broad substrate range (cellobiose, arbutin, salicin, and methylumbelliferyl derivatives of glucose, cellobiose, mannose, and xylose) and was sequenced. The insert in this plasmid encoded the carboxy-terminal region of a putative regulatory protein, cellobiose permease (single polypeptide), and phospho-β-glucosidase, which appear to form an operon (casRAB). Subclones allowed both casA and casB to be expressed independently, as evidenced by in vitro complementation. An analysis of the translated sequences from the EIIC domains of cellobiose, aryl-β-glucoside, and other disaccharide permeases allowed the identification of a 50-amino-acid conserved region. A disaccharide consensus sequence is proposed for the most conserved segment (13 amino acids), which may represent part of the EIIC active site for binding and phosphorylation. 63 refs., 4 figs., 4 tabs.
The OGCleaner: filtering false-positive homology clusters.
Fujimoto, M Stanley; Suvorov, Anton; Jensen, Nicholas O; Clement, Mark J; Snell, Quinn; Bybee, Seth M
2017-01-01
Detecting homologous sequences in organisms is an essential step in protein structure and function prediction, gene annotation and phylogenetic tree construction. Heuristic methods are often employed for quality control of putative homology clusters. These heuristics, however, usually only apply to pairwise sequence comparison and do not examine clusters as a whole. We present the Orthology Group Cleaner (the OGCleaner), a tool designed for filtering putative orthology groups as homology or non-homology clusters by considering all sequences in a cluster. The OGCleaner relies on high-quality orthologous groups identified in OrthoDB to train machine learning algorithms that are able to distinguish between true-positive and false-positive homology groups. This package aims to improve the quality of phylogenetic tree construction, especially in instances of lower-quality transcriptome assemblies. https://github.com/byucsl/ogcleaner. Contact: sfujimoto@gmail.com. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
2013-01-01
Background: Comparatively little information is available on members of the Myoviridae infecting low G+C content, Gram-positive host bacteria of the phylum Firmicutes. While numerous Bacillus phages have been isolated up till now, only very few Bacillus cereus phages have been characterized in detail. Results: Here we present data on the large, virulent, broad-host-range B. cereus phage vB_BceM_Bc431v3 (Bc431v3). Bc431v3 features a 158,618 bp dsDNA genome, encompassing 239 putative open reading frames (ORFs) and 20 tRNA genes encoding 17 different amino acids. Since pulsed-field gel electrophoresis indicated that the genome of this phage has a size of 155-158 kb, Bc431v3 DNA appears not to contain the long terminal repeats that are found in the genome of Bacillus phage SPO1. Conclusions: Bc431v3 displays significant sequence similarity, at the protein level, to B. cereus phage BCP78, Listeria phage A511 and Enterococcus phage ØEF24C and other morphologically related phages infecting Firmicutes such as Staphylococcus phage K and Lactobacillus phage LP65. Based on these data we suggest that Bc431v3 should be included as a member of the Spounavirinae; however, because of the diverse taxonomic information reported recently, it is difficult to determine the genus. The Bc431v3 phage contains some highly unusual genes such as gp143, encoding a putative tRNAHis guanylyltransferase. In addition, it carries some genes that appear to be related to the host sporulation regulators. These are: gp098, which encodes a putative segregation protein related to FtsK/SpoIIIE DNA transporters; gp105, a putative segregation protein; gp108, RNA polymerase sigma factor F/B; and gp109, encoding RNA polymerase sigma factor G. PMID:23388049
The complete genome sequence of freesia mosaic virus and its relationship to other potyviruses.
Choi, H I; Lim, H R; Song, Y S; Kim, M J; Choi, S H; Song, Y S; Bae, S C; Ryu, K H
2010-07-01
We have completed the genomic sequence of a potyvirus, freesia mosaic virus (FreMV), and compared it to those of other known potyviruses. The full-length genome sequence of FreMV consists of 9,489 nucleotides. It contains one open reading frame, typical of a potyvirus, with an AUG start codon and a UAA stop codon, encoding a large polyprotein of 3,077 amino acids. The polyprotein of FreMV-Kr gives rise to eleven proteins (P1, HC-pro, P3, PIPO, 6K1, CI, 6K2, VPg, NIa, NIb and CP), and putative cleavage sites of each protein were identified by sequence comparison to those of other known potyviruses. Phylogenetic analysis of the polyprotein revealed that FreMV-Kr was most closely related to PeMoV and was related to BtMV, BaRMV and PeLMV, which belong to the BCMV subgroup. This is the first information on the complete genome structure of FreMV, and the sequence information clearly supports the status of FreMV as a member of a distinct species in the genus Potyvirus.
The complete DNA sequence of lymphocystis disease virus.
Tidona, C A; Darai, G
1997-04-14
Lymphocystis disease virus (LCDV) is the causative agent of lymphocystis disease, which has been reported to occur in over 100 different fish species worldwide. LCDV is a member of the family Iridoviridae and the type species of the genus Lymphocystivirus. The virions contain a single linear double-stranded DNA molecule, which is circularly permuted, terminally redundant, and heavily methylated at cytosines in CpG sequences. The complete nucleotide sequence of LCDV-1 (flounder isolate) was determined by automated cycle sequencing and primer walking. The genome of LCDV-1 is 102,653 bp in length and contains 195 open reading frames with coding capacities ranging from 40 to 1199 amino acids. Computer-assisted analyses of the deduced amino acid sequences led to the identification of several putative gene products with significant homologies to entries in protein data banks, such as the two major subunits of the viral DNA-dependent RNA polymerase, DNA polymerase, several protein kinases, two subunits of the ribonucleoside diphosphate reductase, DNA methyltransferase, the viral major capsid protein, insulin-like growth factor, and tumor necrosis factor receptor homolog.
Crystal structure of Bacillus subtilis YdaF protein: a putative ribosomal N-acetyltransferase.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brunzelle, J. S.; Wu, R.; Korolev, S. V.
2004-12-01
Comparative sequence analysis suggests that the ydaF gene encodes a protein (YdaF) that functions as an N-acetyltransferase, more specifically, a ribosomal N-acetyltransferase. Sequence analysis using the basic local alignment search tool (BLAST) suggests that YdaF belongs to a large family of proteins (199 proteins found in 88 unique species of bacteria, archaea, and eukaryotes). YdaF also belongs to COG1670, which includes the Escherichia coli RimL protein that is known to acetylate ribosomal protein L12. N-acetylation (NAT) has been found in all kingdoms. NAT enzymes catalyze the transfer of an acetyl group from acetyl-CoA (AcCoA) to a primary amino group. For example, NATs can acetylate the N-terminal α-amino group, the ε-amino group of lysine residues, aminoglycoside antibiotics, spermine/spermidine, or arylalkylamines such as serotonin. The crystal structure of the alleged ribosomal NAT protein, YdaF, from Bacillus subtilis presented here was determined as a part of the Midwest Center for Structural Genomics. The structure maintains the conserved tertiary structure of other known NATs and a high sequence similarity in the presumed AcCoA binding pocket in spite of a very low overall level of sequence identity to other NATs of known structure.
The proteolytic processing site of the precursor of lysyl oxidase.
Cronshaw, A D; Fothergill-Gilmore, L A; Hulmes, D J
1995-01-01
The precise cleavage site of the N-terminal propeptide region of the precursor of lysyl oxidase has not yet been established, due to N-terminal blocking of the mature protein. Using a combination of peptide fragmentation, amino acid sequencing, time-of-flight m.s. and partial chemical unblocking procedures, it is shown that the mature form of lysyl oxidase begins at residue Asp-169 of the precursor protein (numbered according to the human sequence). The cleavage site is 28 residues to the C-terminal side of the site previously suggested on the basis of apparent molecular mass by SDS/PAGE, with the consequence that the two putative N-linked glycosylation sites and the position of the Arg/Gln sequence polymorphism are now all in the precursor region. PMID:7864821
Virulence and molecular polymorphism of Prunus necrotic ringspot virus isolates.
Hammond, R W; Crosslin, J M
1998-07-01
Prunus necrotic ringspot virus (PNRSV) occurs as numerous strains or isolates that vary widely in their pathogenic, biophysical and serological properties. Prior attempts to distinguish pathotypes based upon physical properties have not been successful; our approach was to examine the molecular properties that may distinguish these isolates. The nucleic acid sequence was determined from 1.65 kbp RT-PCR products derived from RNA 3 of seven distinct isolates of PNRSV that differ serologically and in pathology on sweet cherry. Sequence comparisons of ORF 3a (putative movement protein) and ORF 3b (coat protein) revealed single nucleotide and amino acid differences with strong correlations to serology and symptom types (pathotypes). Sequence differences between serotypes and pathotypes were also reflected in the overall phylogenetic relationships between the isolates.
The leukocyte common antigen (CD45): a putative receptor-linked protein tyrosine phosphatase.
Charbonneau, H; Tonks, N K; Walsh, K A; Fischer, E H
1988-01-01
A major protein tyrosine phosphatase (PTPase 1B) has been isolated in essentially homogeneous form from the soluble and particulate fractions of human placenta. Unexpectedly, partial amino acid sequences displayed no homology with the primary structures of the protein Ser/Thr phosphatases deduced from cDNA clones. However, the sequence is strikingly similar to the tandem C-terminal homologous domains of the leukocyte common antigen (CD45). A 157-residue segment of PTPase 1B displayed 40% and 33% sequence identity with corresponding regions from cytoplasmic domains I and II of human CD45. Similar degrees of identity have been observed among the catalytic domains of families of regulatory proteins such as protein kinases and cyclic nucleotide phosphodiesterases. On this basis, it is proposed that the CD45 family has protein tyrosine phosphatase activity and may represent a set of cell-surface receptors involved in signal transduction. This suggests that the repertoire of signal transduction mechanisms may include the direct control of an intracellular protein tyrosine phosphatase, offering the possibility of a regulatory balance with those protein tyrosine kinases that act at the internal surface of the membrane. PMID:2845400
Quarta, Angela; Mita, Giovanni; Durante, Miriana; Arlorio, Marco; De Paolis, Angelo
2013-07-01
The polyphenol oxidase (PPO) enzyme, which can catalyze the oxidation of phenolics to quinones, has been reported to be involved in undesirable browning in many plant foods. This phenomenon is particularly severe in artichoke heads wounded during the manufacturing process. A full-length cDNA encoding for a putative polyphenol oxidase (designated as CsPPO) along with a 1432 bp sequence upstream of the starting ATG codon was characterized for the first time from [Cynara cardunculus var. scolymus (L.) Fiori]. The 1764 bp CsPPO sequence encodes a putative protein of 587 amino acids with a calculated molecular mass of 65,327 Da and an isoelectric point of 5.50. Analysis of the promoter region revealed the presence of cis-acting elements, some of which are putatively involved in the response to light and wounds. Expression analysis of the gene in wounded capitula indicated that CsPPO was significantly induced after 48 h, even though the browning process had started earlier. This suggests that the early browning event observed in artichoke heads was not directly related to de novo mRNA synthesis. Finally, we provide the complete gene sequence encoding for polyphenol oxidase and the upstream regulative region in artichoke. Copyright © 2013 Elsevier Masson SAS. All rights reserved.
Tumour suppressor protein p53 regulates the stress activated bilirubin oxidase cytochrome P450 2A6
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hu, Hao, E-mail: hao.hu1@uqconnect.edu.au; Yu, Ting, E-mail: t.yu2@uq.edu.au; Arpiainen, Satu, E-mail: Satu.Juhila@orion.fi
2015-11-15
Human cytochrome P450 (CYP) 2A6 enzyme has been proposed to play a role in cellular defence against chemical-induced oxidative stress. The encoding gene is regulated by various stress-activated transcription factors. This paper demonstrates that p53 is a novel transcriptional regulator of the gene. Sequence analysis of the CYP2A6 promoter revealed six putative p53 binding sites in a 3 kb proximate promoter region. The site closest to the transcription start site (TSS) is highly homologous with the p53 consensus sequence. Transfection with various stepwise deletions of CYP2A6-5′-Luc constructs (down to −160 bp from the TSS) showed p53 responsiveness in p53-overexpressing C3A cells. However, a further deletion from −160 to −74 bp, including the putative p53 binding site, totally abolished the p53 responsiveness. Electrophoretic mobility shift assay with a probe containing the putative binding site showed specific binding of p53. A point mutation at the binding site abolished both the binding and responsiveness of the recombinant gene to p53. Up-regulation of the endogenous p53 with benzo[a]pyrene, a well-known p53 activator, increased the expression of the p53-responsive positive control and the CYP2A6-5′-Luc construct containing the intact p53 binding site but not the mutated CYP2A6-5′-Luc construct. Finally, inducibility of the native CYP2A6 gene by benzo[a]pyrene was demonstrated by dose-dependent increases in CYP2A6 mRNA and protein levels along with increased p53 levels in the nucleus. Collectively, the results indicate that p53 protein is a regulator of the CYP2A6 gene in C3A cells and further support the putative cytoprotective role of CYP2A6. Highlights: • CYP2A6 is an immediate target gene of p53. • Six putative p53REs are located in the 3 kb proximate CYP2A6 promoter region. • The region −160 bp from the TSS is highly homologous with the p53 consensus sequence. • p53 specifically binds to the p53RE in the −160 bp region. • HNF4α may interact with p53 in regulating CYP2A6 expression.
A novel subviral agent associated with a geminivirus: The first report of a DNA satellite
Dry, Ian B.; Krake, Leslie R.; Rigden, Justin E.; Rezaian, M. Ali
1997-01-01
Numerous plant RNA viruses have associated with them satellite (sat) RNAs that have little or no nucleotide sequence similarity to either the viral or host genomes but are completely dependent on the helper virus for replication. We report here on the discovery of a 682-nt circular DNA satellite associated with tomato leaf curl geminivirus (TLCV) infection in northern Australia. This is the first demonstration that satellite molecules are not limited to RNA viral systems. The DNA satellite (TLCV sat-DNA) is strictly dependent for replication on the helper virus replication-associated protein and is encapsidated by TLCV coat protein. It has no significant open reading frames, and it shows no significant sequence similarity to the 2766-nt helper-virus genome except for two short motifs present in separate putative stem–loop structures: TAATATTAC, which is universally conserved in all geminiviruses, and AATCGGTGTC, which is identical to a putative replication-associated protein binding motif in TLCV. Replication of TLCV sat-DNA is also supported by other taxonomically distinct geminiviruses, including tomato yellow leaf curl virus, African cassava mosaic virus, and beet curly top virus. Therefore, this unique DNA satellite does not appear to strictly conform with the requirements that dictate the specificity of interaction of geminiviral replication-associated proteins with their cognate origins as predicted by the current model of geminivirus replication. PMID:9192696
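Because the satellite is a circular molecule, a motif such as TAATATTAC may span the arbitrary position at which the sequence record starts. A minimal sketch of a circular motif search is given below; the function name and the 23-nt toy circle are invented for illustration and do not represent TLCV sat-DNA.

```python
def find_circular(seq, motif):
    """Return 0-based start positions of motif in a circular sequence."""
    seq = seq.upper()
    motif = motif.upper()
    # Append the first len(motif) - 1 bases so that matches spanning the
    # origin of the circle are found by an ordinary linear search.
    extended = seq + seq[: len(motif) - 1]
    hits = []
    start = extended.find(motif)
    while start != -1:
        hits.append(start)
        start = extended.find(motif, start + 1)
    return hits


# Hypothetical 23-nt circle, invented for illustration only.
toy_circle = "TTACGGAATCGGTGTCCCTAATA"

nonanucleotide_hits = find_circular(toy_circle, "TAATATTAC")
rep_motif_hits = find_circular(toy_circle, "AATCGGTGTC")
print(nonanucleotide_hits, rep_motif_hits)
```

In the toy circle the nonanucleotide begins near the end of the string and wraps around to its start, which a plain linear search would miss.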
Complete genome sequence of lymphocystis disease virus isolated from China.
Zhang, Qi-Ya; Xiao, Feng; Xie, Jian; Li, Zheng-Qiu; Gui, Jian-Fang
2004-07-01
Lymphocystis diseases in fish throughout the world have been extensively described. Here we report the complete genome sequence of lymphocystis disease virus isolated in China (LCDV-C), an LCDV isolated from cultured flounder (Paralichthys olivaceus) with lymphocystis disease in China. The LCDV-C genome is 186,250 bp, with a base composition of 27.25% G+C. Computer-assisted analysis revealed 240 potential open reading frames (ORFs) and 176 nonoverlapping putative viral genes, which encode polypeptides ranging from 40 to 1,193 amino acids. The percent coding density is 67%, and the average length of each ORF is 702 bp. A search of the GenBank database using the 176 individual putative genes revealed 103 homologues to the corresponding ORFs of LCDV-1 and 73 potential genes that were not found in LCDV-1 and other iridoviruses. Among the 73 genes, there are 8 genes that contain conserved domains of cellular genes and 65 novel genes that do not show any significant homology with the sequences in public databases. Although a certain extent of similarity between putative gene products of LCDV-C and corresponding proteins of LCDV-1 was revealed, no colinearity was detected when their ORF arrangements and coding strategies were compared to each other, suggesting that a high degree of genetic rearrangements between them has occurred. And a large number of tandem and overlapping repeated sequences were observed in the LCDV-C genome. The deduced amino acid sequence of the major capsid protein (MCP) presents the highest identity to those of LCDV-1 and other iridoviruses among the LCDV-C gene products. Furthermore, a phylogenetic tree was constructed based on the multiple alignments of nine MCP amino acid sequences. Interestingly, LCDV-C and LCDV-1 were clustered together, but their amino acid identity is much less than that in other clusters. 
The unexpected levels of divergence between their genomes in size, gene organization, and gene product identity suggest that LCDV-C and LCDV-1 should not belong to the same species and that LCDV-C should be considered a species different from LCDV-1.
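As a rough consistency check on the figures quoted in the LCDV-C abstract, the coding density can be estimated from the number of putative genes, the average ORF length and the genome size. The sketch below assumes that the density is simply the summed length of the 176 nonoverlapping genes divided by the genome length; the abstract does not state how the authors computed it.

```python
# Figures quoted in the abstract.
genome_bp = 186_250
n_genes = 176
mean_orf_bp = 702

# Assumption: coding density = total length of nonoverlapping genes / genome length.
coding_bp = n_genes * mean_orf_bp
density = coding_bp / genome_bp

print(f"coding bases ~ {coding_bp:,} bp, density ~ {density:.1%}")
```

Under this assumption the estimate is about 66%, close to the reported 67%; the small difference may reflect rounding of the average ORF length.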
Complete Genome Sequence of Lymphocystis Disease Virus Isolated from China
Zhang, Qi-Ya; Xiao, Feng; Xie, Jian; Li, Zheng-Qiu; Gui, Jian-Fang
2004-01-01
Lymphocystis diseases in fish throughout the world have been extensively described. Here we report the complete genome sequence of lymphocystis disease virus isolated in China (LCDV-C), an LCDV isolated from cultured flounder (Paralichthys olivaceus) with lymphocystis disease in China. The LCDV-C genome is 186,250 bp, with a base composition of 27.25% G+C. Computer-assisted analysis revealed 240 potential open reading frames (ORFs) and 176 nonoverlapping putative viral genes, which encode polypeptides ranging from 40 to 1,193 amino acids. The percent coding density is 67%, and the average length of each ORF is 702 bp. A search of the GenBank database using the 176 individual putative genes revealed 103 homologues to the corresponding ORFs of LCDV-1 and 73 potential genes that were not found in LCDV-1 and other iridoviruses. Among the 73 genes, there are 8 genes that contain conserved domains of cellular genes and 65 novel genes that do not show any significant homology with the sequences in public databases. Although a certain extent of similarity between putative gene products of LCDV-C and corresponding proteins of LCDV-1 was revealed, no colinearity was detected when their ORF arrangements and coding strategies were compared to each other, suggesting that a high degree of genetic rearrangements between them has occurred. And a large number of tandem and overlapping repeated sequences were observed in the LCDV-C genome. The deduced amino acid sequence of the major capsid protein (MCP) presents the highest identity to those of LCDV-1 and other iridoviruses among the LCDV-C gene products. Furthermore, a phylogenetic tree was constructed based on the multiple alignments of nine MCP amino acid sequences. Interestingly, LCDV-C and LCDV-1 were clustered together, but their amino acid identity is much less than that in other clusters. 
The unexpected levels of divergence between their genomes in size, gene organization, and gene product identity suggest that LCDV-C should be considered a species distinct from LCDV-1 rather than a member of the same species. PMID:15194775
A perchlorate sensitive iodide transporter in frogs
Carr, Deborah L.; Carr, James A.; Willis, Ray E.; Pressley, Thomas A.
2008-01-01
Nucleotide sequence comparisons have identified a gene product in the genome database of African clawed frogs (Xenopus laevis) as a probable member of the solute carrier family of membrane transporters. To confirm its identity as a putative iodide transporter, we examined the function of this sequence after heterologous expression in mammalian cells. A green monkey kidney cell line transfected with the Xenopus nucleotide sequence had significantly greater 125I uptake than sham-transfected control cells. The uptake in carrier-transfected cells was significantly inhibited in the presence of perchlorate, a competitive inhibitor of mammalian Na+/iodide symporter. Tissue distributions of the sequence were also consistent with a role in iodide uptake. The mRNA encoding the carrier was found to be expressed in the thyroid gland, stomach, and kidney of tadpoles from X. laevis, as well as the bullfrog Rana catesbeiana. The ovaries of adult X. laevis also were found to express the carrier. Phylogenetic analysis suggested that the putative X. laevis iodide transporter is orthologous to vertebrate Na+-dependent iodide symporters. We conclude that the amphibian sequence encodes a protein that is indeed a functional Na+/iodide symporter in Xenopus laevis, as well as Rana catesbeiana. PMID:18275962
Maldonado-Borges, Josefina Ines; Ku-Cauich, José Roberto; Escobedo-Graciamedrano, Rosa Maria
2013-01-01
cDNA-AFLP analysis was used to study the genes expressed during zygotic and somatic embryogenesis of Musa acuminata Colla ssp. malaccensis, and the resulting transcript-derived fragments (TDFs) were compared with the sequenced genome of the doubled-haploid (DH) Pahang accession of the malaccensis subspecies, which is available online. A total of 253 TDFs with apparent sizes of 100-4000 bp were detected using 5 pairs of AFLP primers, of which 21 were differentially expressed during the different stages of banana embryogenesis; 15 of the sequences matched DH-Pahang chromosomes, with 7 of them being homologous to gene sequences encoding either known or putative protein domains of higher plants. Four TDF sequences were located in all Musa chromosomes, while the rest were located in one or two chromosomes. Their putative individual functions are briefly reviewed based on published information, and the potential roles of these genes in embryo development are discussed. Thus, the availability of the Musa genome and the TDF sequence information presented here open new possibilities for in-depth molecular and biochemical study of zygotic and somatic embryogenesis in Musa.
Learning Cellular Sorting Pathways Using Protein Interactions and Sequence Motifs
Lin, Tien-Ho; Bar-Joseph, Ziv
2011-01-01
Proper subcellular localization is critical for proteins to perform their roles in cellular functions. Proteins are transported by different cellular sorting pathways, some of which take a protein through several intermediate locations until reaching its final destination. The pathway a protein is transported through is determined by carrier proteins that bind to specific sequence motifs. In this article, we present a new method that integrates protein interaction and sequence motif data to model how proteins are sorted through these sorting pathways. We use a hidden Markov model (HMM) to represent protein sorting pathways. The model is able to determine intermediate sorting states and to assign carrier proteins and motifs to the sorting pathways. In simulation studies, we show that the method can accurately recover an underlying sorting model. Using data for yeast, we show that our model leads to accurate prediction of subcellular localization. We also show that the pathways learned by our model recover many known sorting pathways and correctly assign proteins to the path they utilize. The learned model identified new pathways and their putative carriers and motifs and these may represent novel protein sorting mechanisms. Supplementary results and software implementation are available from http://murphylab.web.cmu.edu/software/2010_RECOMB_pathways/. PMID:21999284
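As a rough illustration of how an HMM can assign a most likely route through intermediate compartments, the sketch below runs standard Viterbi decoding on a toy model. It is not the authors' model or software: the compartment names, the observation labels (`signal_peptide`, `carrier_binding`, `sorting_motif`, `other`) and every probability are hypothetical values chosen only to make the example runnable.

```python
import math


def _log(p):
    # Log of a probability; zero maps to -inf so impossible paths never win.
    return math.log(p) if p > 0 else float("-inf")


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path for the observations."""
    first = observations[0]
    best = [{s: _log(start_p.get(s, 0.0)) + _log(emit_p[s].get(first, 0.0))
             for s in states}]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states,
                       key=lambda p: best[-1][p] + _log(trans_p[p].get(s, 0.0)))
            scores[s] = (best[-1][prev]
                         + _log(trans_p[prev].get(s, 0.0))
                         + _log(emit_p[s].get(obs, 0.0)))
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    state = max(states, key=lambda s: best[-1][s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical toy model: compartments as hidden states, features as emissions.
STATES = ["ER", "Golgi", "vacuole"]
START_P = {"ER": 1.0}
TRANS_P = {
    "ER": {"ER": 0.2, "Golgi": 0.8},
    "Golgi": {"Golgi": 0.3, "vacuole": 0.7},
    "vacuole": {"vacuole": 1.0},
}
EMIT_P = {
    "ER": {"signal_peptide": 0.9, "other": 0.1},
    "Golgi": {"carrier_binding": 0.8, "other": 0.2},
    "vacuole": {"sorting_motif": 0.7, "other": 0.3},
}

route = viterbi(["signal_peptide", "carrier_binding", "sorting_motif"],
                STATES, START_P, TRANS_P, EMIT_P)
print(route)  # ['ER', 'Golgi', 'vacuole']
```

Working in log space keeps long paths from underflowing, and missing dictionary entries are treated as zero probability, so disallowed transitions never need to be listed explicitly.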
Hara, Yasushi; Hayashi, Kyohei; Nakajima, Takuya; Kagawa, Shizuko; Tazumi, Akihiro; Moore, John E; Matsuda, Motoo
2013-09-01
Clustered regularly interspaced short palindromic repeats (CRISPRs), approximately 10,000 base pairs (bp) in length, were shown to occur in the Japanese Taylorella equigenitalis strain EQ59. The locus was composed of the putative CRISPR-associated 5 (cas5), RAMP csd1, csd2, recB, and cas1 loci, a leader region, and 13 CRISPR consensus sequence repeats (each 32 bp; 5'-TCAGCCACGTTCGCGTGGCTGTGTGTTTAAAG-3'). These repeats were in turn separated by 12 nonrepetitive unique spacer regions of similar length. In addition, a leader region, a transposase/IS protein, a leader region, and cas3 were also seen. All seven putative open reading frames carry their own ribosome binding sites. Promoter consensus sequences at the -35 and -10 regions and putative intrinsic ρ-independent transcription terminator regions also occurred. A possible long overlap of 170 bp occurred between the recB and cas1 loci. Positive reverse transcription PCR signals of cas5, RAMP csd1, csd2-recB/cas1, and cas3 were generated. A putative secondary structure of the CRISPR consensus repeats was constructed. The CRISPR results for the T. equigenitalis EQ59 isolate were then compared with those from the Taylorella asinigenitalis MCE3 isolate.
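The repeat-spacer layout described above (n repeats separated by n - 1 spacers) can be sketched in a few lines. The function below only finds exact copies of a known repeat and slices out what lies between them; it is an illustrative sketch, not the method used in the study. The repeat is the 32-bp consensus quoted in the abstract, while the flanking bases and both spacer sequences are invented toy data.

```python
# 32-bp consensus repeat quoted in the abstract.
REPEAT = "TCAGCCACGTTCGCGTGGCTGTGTGTTTAAAG"


def find_spacers(sequence, repeat):
    """Locate exact, non-overlapping copies of `repeat`.

    Returns (starts, spacers): the 0-based start of each repeat and the
    sequences lying between consecutive repeats.
    """
    starts = []
    pos = sequence.find(repeat)
    while pos != -1:
        starts.append(pos)
        pos = sequence.find(repeat, pos + len(repeat))
    spacers = [sequence[a + len(repeat):b] for a, b in zip(starts, starts[1:])]
    return starts, spacers


# Hypothetical toy locus: three repeats, two made-up 32-bp spacers.
TOY_SPACERS = ["ACGT" * 8, "TTGACCATGGCATTAGCCGATAGGCTAACGTA"]
toy_locus = ("GGG" + REPEAT + TOY_SPACERS[0] + REPEAT
             + TOY_SPACERS[1] + REPEAT + "CCC")

starts, spacers = find_spacers(toy_locus, REPEAT)
print(len(starts), len(spacers))  # 3 2
```

Real arrays often contain repeats that deviate slightly from the consensus, so an exact-match search like this one would undercount them; an approximate matcher would be needed for such cases.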
Characterization and mapping of cDNA encoding aspartate aminotransferase in rice, Oryza sativa L.
Song, J; Yamamoto, K; Shomura, A; Yano, M; Minobe, Y; Sasaki, T
1996-10-31
Fifteen cDNA clones, putatively identified as encoding aspartate aminotransferase (AST, EC 2.6.1.1), were isolated and partially sequenced. Together with six previously isolated clones putatively identified as encoding ASTs (Sasaki, et al. 1994, Plant Journal 6, 615-624), their sequences were characterized and classified into 4 cDNA species. Two of the isolated clones, C60213 and C2079, were full-length cDNAs, and their complete nucleotide sequences were determined. C60213 was 1612 bp long and its deduced amino acid sequence showed 88% homology with that of Panicum miliaceum L. mitochondrial AST. The C60213-encoded protein had an N-terminal amino acid sequence that was characteristic of a mitochondrial transit peptide. On the other hand, C2079 was 1546 bp long and had 91% amino acid sequence homology with P. miliaceum L. cytosolic AST but lacked the transit peptide sequence. The homologies of nucleotide sequences and deduced amino acid sequences of C2079 and C60213 were 54% and 52%, respectively. C2079 and C60213 were mapped on chromosomes 1 and 6, respectively, by restriction fragment length polymorphism linkage analysis. Northern blot analysis using C2079 as a probe revealed much higher transcript levels in callus and root than in green and etiolated shoots, suggesting tissue-specific variations of AST gene expression.
Analysis of expressed sequence tags for Frankliniella occidentalis, the western flower thrips.
Rotenberg, D; Whitfield, A E
2010-08-01
Thrips are members of the insect order Thysanoptera and Frankliniella occidentalis (the western flower thrips) is the most economically important pest within this order. F. occidentalis is both a direct pest of crops and an efficient vector of plant viruses, including Tomato spotted wilt virus (TSWV). Despite the world-wide importance of thrips in agriculture, there is little knowledge of the F. occidentalis genome or gene functions at this time. A normalized cDNA library was constructed from first instar thrips and 13 839 expressed sequence tags (ESTs) were obtained. Our EST data assembled into 894 contigs and 11 806 singletons (12 700 nonredundant sequences). We found that 31% of these sequences had significant similarity (E ≤ 10^-10) to protein sequences in the National Center for Biotechnology Information nonredundant (nr) protein database, and 25% were functionally annotated using Blast2GO. We identified 74 sequences with putative homology to proteins associated with insect innate immunity. Sixteen sequences had significant similarity to proteins associated with small RNA-mediated gene silencing pathways (RNA interference; RNAi), including the antiviral pathway (short interfering RNA-mediated pathway). Our EST collection provides new sequence resources for characterizing gene functions in F. occidentalis and other thrips species with regard to vital biological processes, studying the mechanism of interactions with the viruses harboured and transmitted by the vector, and identifying new insect gene-centred targets for plant disease and insect control.
Genome sequence diversity and clues to the evolution of variola (smallpox) virus.
Esposito, Joseph J; Sammons, Scott A; Frace, A Michael; Osborne, John D; Olsen-Rasmussen, Melissa; Zhang, Ming; Govil, Dhwani; Damon, Inger K; Kline, Richard; Laker, Miriam; Li, Yu; Smith, Geoffrey L; Meyer, Hermann; Leduc, James W; Wohlhueter, Robert M
2006-08-11
Comparative genomics of 45 epidemiologically varied variola virus isolates from the past 30 years of the smallpox era indicate low sequence diversity, suggesting that there is probably little difference in the isolates' functional gene content. Phylogenetic clustering inferred three clades coincident with their geographical origin and case-fatality rate; the latter implicated putative proteins that mediate viral virulence differences. Analysis of the viral linear DNA genome suggests that its evolution involved direct descent and DNA end-region recombination events. Knowing the sequences will help understand the viral proteome and improve diagnostic test precision, therapeutics, and systems for their assessment.
Scuotto, Angelo; Djorie, Serge; Colavizza, Michel; Romond, Pierre-Charles; Romond, Marie-Bénédicte
2014-12-01
Extracellular components secreted by Bifidobacterium breve C50 can induce maturation, high IL-10 production and prolonged survival of dendritic cells via a TLR2 pathway. In this study, the components were isolated from the supernatant by gel filtration chromatography. Antibodies raised against the major compounds with molecular weight above 600 kDa (Bb C50BC) also recognized compounds of lower molecular weight (200–600 kDa). TLR2 and TLR6 bound to the components already recognized by the antibodies. Trypsin digestion of Bb C50BC released three major peptides whose sequences displayed close similarities to a putative secreted protein with a CHAP amidase domain from B. breve. The 1300-bp genomic region corresponding to the hypothetical protein was amplified by PCR. The deduced polypeptide started with an N-terminal signal sequence of 45 amino acids, containing the lipobox motif (LAAC) with the cysteine in position 25, and 2 positively charged residues within the first 14 residues of the signal sequence. Lipid detection in Bb C50BC by GC/MS further supported the implication of a lipoprotein. Sugars were also detected in Bb C50BC. Close similarity of two peptides released from the Bb C50BC protein with the glucan-binding protein B from Bifidobacterium animalis suggested that glucose moieties, possibly in glucan form, could be bound to the lipoprotein. Finally, heating at 100 °C for 5 min led to the breakdown of Bb C50BC into compounds of molecular weight below 67 kDa, which suggested that Bb C50BC was an aggregate. One might assume that a basic unit was formed by the lipoprotein putatively bound to glucan. In addition, the other sugars and hexosamines, which were recognized by galectin 1, were localized at the surface of the Bb C50BC aggregate. In conclusion, the extracellular components secreted by B. breve C50 consisted of a lipoprotein putatively associated with glucose moieties and acting in an aggregated form as an agonist of TLR2/TLR6.
Characterization of sequences in human TWIST required for nuclear localization
Singh, Shalini; Gramolini, Anthony O
2009-01-01
Background: Twist is a transcription factor that plays an important role in proliferation and tumorigenesis. Twist is a nuclear protein that regulates a variety of cellular functions controlled by protein-protein interactions and gene transcription events. The focus of this study was to characterize putative nuclear localization signals (NLSs) 37RKRR40 and 73KRGKK77 in the human TWIST (H-TWIST) protein. Results: Using site-specific mutagenesis and immunofluorescence, we observed that the altered TWISTNLS1 K38R and TWISTNLS2 K73R and K77R constructs inhibit nuclear accumulation of H-TWIST in mammalian cells, while TWISTNLS2 K76R expression was unaffected and the protein was retained in the nucleus. Subsequently, TWIST mutants K38R, K73R and K77R co-transfected with E12 formed heterodimers, and nuclear localization was restored despite the NLS mutations. Using a yeast two-hybrid assay, we identified a novel TWIST-interacting candidate, TCF-4, a basic helix-loop-helix transcription factor. The interaction of TWIST with TCF-4 was confirmed using NLS rescue assays, in which nuclear expression of mutant TWISTNLS1 with co-transfected TCF-4 was observed. The interaction of TWIST with TCF-4 was also seen using standard immunoprecipitation assays. Conclusion: Our study demonstrates the presence of two putative NLS motifs in H-TWIST and suggests that these NLS sequences are functional. Furthermore, we identified and confirmed the interaction of TWIST with a novel protein candidate, TCF-4. PMID:19534813
First isolation of West Nile virus from a dromedary camel
Joseph, Sunitha; Wernery, Ulrich; Teng, Jade LL; Wernery, Renate; Huang, Yi; Patteril, Nissy AG; Chan, Kwok-Hung; Elizabeth, Shyna K; Fan, Rachel YY; Lau, Susanna KP; Kinne, Jörg; Woo, Patrick CY
2016-01-01
Although antibodies against West Nile virus (WNV) have been detected in the sera of dromedaries in the Middle East, North Africa and Spain, no WNV has been isolated or amplified from dromedary or Bactrian camels. In this study, WNV was isolated from Vero cells inoculated with both nasal swab and pooled trachea/lung samples from a dromedary calf in Dubai. Complete-genome sequencing and phylogenetic analysis using the near-whole-genome polyprotein revealed that the virus belonged to lineage 1a. There was no clustering of the present WNV with other WNVs isolated in other parts of the Middle East. Within lineage 1a, the dromedary WNV occupied a unique position, although it was most closely related to other WNVs of cluster 2. Comparative analysis revealed that the putative E protein encoded by the genome possessed the original WNV E protein glycosylation motif NYS at E154–156, which contained the N-linked glycosylation site at N-154 associated with increased WNV pathogenicity and neuroinvasiveness. In the putative NS1 protein, the A70S substitution observed in other cluster 2 WNVs and P250, which has been implicated in neuroinvasiveness, were present. In addition, the foo motif in the putative NS2A protein, which has been implicated in neuroinvasiveness, was detected. Notably, the amino-acid residues at 14 positions in the present dromedary WNV genome differed from those in most of the closely related WNV strains in cluster 2 of lineage 1a, with the majority of these differences observed in the putative E and NS5 proteins. The present study is the first to demonstrate the isolation of WNV from dromedaries. This finding expands the possible reservoirs of WNV and sources of WNV infection. PMID:27273223
Demina, Tatiana A; Pietilä, Maija K; Svirskaitė, Julija; Ravantti, Janne J; Atanasova, Nina S; Bamford, Dennis H; Oksanen, Hanna M
2017-02-18
Members of the virus family Sphaerolipoviridae include both archaeal viruses and bacteriophages that possess a tailless icosahedral capsid with an internal membrane. The genera Alpha- and Betasphaerolipovirus comprise viruses that infect halophilic euryarchaea, whereas viruses of thermophilic Thermus bacteria belong to the genus Gammasphaerolipovirus. Both sequence-based and structural clustering of the major capsid proteins and ATPases of sphaerolipoviruses yield three distinct clades corresponding to these three genera. Conserved virion architectural principles observed in sphaerolipoviruses suggest that these viruses belong to the PRD1-adenovirus structural lineage. Here we focus on archaeal alphasphaerolipoviruses and their related putative proviruses. The highest sequence similarities among alphasphaerolipoviruses are observed in the core structural elements of their virions: the two major capsid proteins, the major membrane protein, and a putative packaging ATPase. A recently described tailless icosahedral haloarchaeal virus, Haloarcula californiae icosahedral virus 1 (HCIV-1), has a double-stranded DNA genome and an internal membrane lining the capsid. HCIV-1 shares significant similarities with the other tailless icosahedral internal membrane-containing haloarchaeal viruses of the family Sphaerolipoviridae. The proposal to include a new virus species, Haloarcula virus HCIV1, in the genus Alphasphaerolipovirus was submitted to the International Committee on Taxonomy of Viruses (ICTV) in 2016.
Vanacker, J M; Corbau, R; Adelmant, G; Perros, M; Laudet, V; Rommelaere, J
1996-01-01
The promoter of the thyroid hormone receptor alpha gene (c-erbA-1) is activated by the nonstructural protein 1 (NS1) of parvovirus minute virus of mice (prototype strain [MVMp]) in ras-transformed FREJ4 cells that are permissive for lytic MVMp replication. This stimulation may be related to the sensitivity of host cells to MVMp, as it does not take place in parental FR3T3 cells, which are resistant to the parvovirus killing effect. The analysis of a series of deletion and point mutants of the c-erbA-1 promoter led to the identification of an upstream region that is necessary for NS1-driven transactivation. This sequence harbors a putative hormone-responsive element and is sufficient to render a minimal promoter NS1 inducible in FREJ4 but not in FR3T3 cells, and it is involved in distinct interactions with proteins from the respective cell lines. The NS1-responsive element of the c-erbA-1 promoter bears no homology with sequences that were previously reported to be necessary for NS1 DNA binding and transactivation. Altogether, our data point to a novel, cell-specific mechanism of promoter activation by NS1. PMID:8642664
Bernkopf, Marie; Webersinke, Gerald; Tongsook, Chanakan; Koyani, Chintan N.; Rafiq, Muhammad A.; Ayaz, Muhammad; Müller, Doris; Enzinger, Christian; Aslam, Muhammad; Naeem, Farooq; Schmidt, Kurt; Gruber, Karl; Speicher, Michael R.; Malle, Ernst; Macheroux, Peter; Ayub, Muhammad; Vincent, John B.; Windpassinger, Christian; Duba, Hans-Christoph
2014-01-01
We describe the characterization of a gene for mild nonsyndromic autosomal recessive intellectual disability (ID) in two unrelated families, one from Austria, the other from Pakistan. Genome-wide single nucleotide polymorphism microarray analysis enabled us to define a region of homozygosity by descent on chromosome 17q25. Whole-exome sequencing and analysis of this region in an affected individual from the Austrian family identified a 5 bp frameshifting deletion in the METTL23 gene. By means of Sanger sequencing of METTL23, a nonsense mutation was detected in a consanguineous ID family from Pakistan for which homozygosity-by-descent mapping had identified a region on 17q25. Both changes lead to truncation of the putative METTL23 protein, which disrupts the predicted catalytic domain and alters the cellular localization. 3D-modelling of the protein indicates that METTL23 is strongly predicted to function as an S-adenosyl-methionine (SAM)-dependent methyltransferase. Expression analysis of METTL23 indicated a strong association with heat shock proteins, which suggests that these may act as a putative substrate for methylation by METTL23. A number of methyltransferases have been described recently in association with ID. Disruption of METTL23 presented here supports the importance of methylation processes for intact neuronal function and brain development. PMID:24626631
Brown, D P; Idler, K B; Katz, L
1990-01-01
The 18.1-kilobase plasmid pSE211 integrates into the chromosome of Saccharopolyspora erythraea at a specific attB site. Restriction analysis of the integrated plasmid, pSE211int, and adjacent chromosomal sequences allowed identification of attP, the plasmid attachment site. Nucleotide sequencing of attP, attB, attL, and attR revealed a 57-base-pair sequence common to all sites with no duplications of adjacent plasmid or chromosomal sequences in the integrated state, indicating that integration takes place through conservative, reciprocal strand exchange. An analysis of the sequences indicated the presence of a putative gene for Phe-tRNA at attB which is preserved at attL after integration has occurred. A comparison of the attB site for a number of actinomycete plasmids is presented. Integration at attB was also observed when a 2.4-kilobase segment of pSE211 containing attP and the adjacent plasmid sequence was used to transform a pSE211- host. Nucleotide sequencing of this segment revealed the presence of two complete open reading frames (ORFs) and a segment of a third ORF. The ORF adjacent to attP encodes a putative polypeptide 437 amino acids in length that shows similarity, at its C-terminal domain, to sequences of site-specific recombinases of the integrase family. The adjacent ORF encodes a putative 98-amino-acid basic polypeptide that contains a helix-turn-helix motif at its N terminus which corresponds to domains in the Xis proteins of a number of bacteriophages. A proposal for the function of this polypeptide is presented. The deduced amino acid sequence of the third ORF did not reveal similarities to polypeptide sequences in the current data banks. PMID:2180909
Wittenberger, T; Schaller, H C; Hellebrand, S
2001-03-30
We have developed a comprehensive expressed sequence tag database search method and used it for the identification of new members of the G-protein coupled receptor superfamily. Our approach proved to be especially useful for the detection of expressed sequence tag sequences that do not encode conserved parts of a protein, making it an ideal tool for the identification of members of divergent protein families or of protein parts without conserved domain structures in the expressed sequence tag database. At least 14 of the expressed sequence tags found with this strategy are promising candidates for new putative G-protein coupled receptors. Here, we describe the sequence and expression analysis of five new members of this receptor superfamily, namely GPR84, GPR86, GPR87, GPR90 and GPR91. We also studied the genomic structure and chromosomal localization of the respective genes applying in silico methods. A cluster of six closely related G-protein coupled receptors was found on the human chromosome 3q24-3q25. It consists of four orphan receptors (GPR86, GPR87, GPR91, and H963), the purinergic receptor P2Y1, and the uridine 5'-diphosphoglucose receptor KIAA0001. It seems likely that these receptors evolved from a common ancestor and therefore might have related ligands. In conclusion, we describe a data mining procedure that proved to be useful for the identification and first characterization of new genes and is well applicable for other gene families. Copyright 2001 Academic Press.
RF-Phos: A Novel General Phosphorylation Site Prediction Tool Based on Random Forest.
Ismail, Hamid D; Jones, Ahoi; Kim, Jung H; Newman, Robert H; Kc, Dukka B
2016-01-01
Protein phosphorylation is one of the most widespread regulatory mechanisms in eukaryotes. Over the past decade, phosphorylation site prediction has emerged as an important problem in the field of bioinformatics. Here, we report a new method, termed Random Forest-based Phosphosite predictor 2.0 (RF-Phos 2.0), to predict phosphorylation sites given only the primary amino acid sequence of a protein as input. RF-Phos 2.0, which uses random forest with sequence and structural features, is able to identify putative sites of phosphorylation across many protein families. In side-by-side comparisons based on 10-fold cross validation and an independent dataset, RF-Phos 2.0 compares favorably to other popular mammalian phosphosite prediction methods, such as PhosphoSVM, GPS2.1, and Musite.
Lery, Letícia M S; Bitar, Mainá; Costa, Mauricio G S; Rössle, Shaila C S; Bisch, Paulo M
2010-12-22
G. diazotrophicus and A. vinelandii are aerobic nitrogen-fixing bacteria. Although oxygen is essential for the survival of these organisms, it irreversibly inhibits nitrogenase, the complex responsible for nitrogen fixation. Both microorganisms deal with this paradox through compensatory mechanisms. In A. vinelandii a conformational protection mechanism occurs through the interaction between the nitrogenase complex and the FeSII protein. Previous studies suggested the existence of a similar system in G. diazotrophicus, but the putative protein involved has not yet been described. This study intends to identify the protein coding gene in the recently sequenced genome of G. diazotrophicus and also provide detailed structural information of nitrogenase conformational protection in both organisms. Genomic analysis of G. diazotrophicus sequences revealed a protein coding ORF (Gdia0615) enclosing a conserved "fer2" domain, typical of the ferredoxin family and found in A. vinelandii FeSII. Comparative models of both FeSII and Gdia0615 disclosed a conserved beta-grasp fold. Cysteine residues that coordinate the 2[Fe-S] cluster are in conserved positions towards the metallocluster. Analysis of solvent accessible residues and electrostatic surfaces unveiled a hydrophobic dimerization interface. Dimers assembled by molecular docking presented a stable behaviour and a proper accommodation of regions possibly involved in binding of FeSII to nitrogenase throughout molecular dynamics simulations in aqueous solution. Molecular modeling of the nitrogenase complex of G. diazotrophicus was performed and models were compared to the crystal structure of A. vinelandii nitrogenase. Docking experiments of FeSII and Gdia0615 with their corresponding nitrogenase complexes pointed out in both systems a putative binding site presenting shape and charge complementarities at the Fe-protein/MoFe-protein complex interface. The identification of the putative FeSII coding gene in the G. diazotrophicus genome represents a large step towards the understanding of the conformational protection mechanism of nitrogenase against oxygen. In addition, this is the first study regarding the structural complementarities of FeSII-nitrogenase interactions in diazotrophic bacteria. The combination of bioinformatic tools for genome analysis, comparative protein modeling, docking calculations and molecular dynamics provided a powerful strategy for the elucidation of molecular mechanisms and structural features of FeSII-nitrogenase interaction.
USDA-ARS's Scientific Manuscript database
The gene TtGH28 encoding a putative GH28 polygalacturonase from Pseudothermotoga thermarum DSM 5069 (Theth_0397, NCBI# AEH50492.1) was synthesized, expressed in E. coli, and characterized. Alignment of the amino acid sequence of gene product TtGH28 with other GH28 proteins whose structures and detai...
NASA Astrophysics Data System (ADS)
El-Assaad, Atlal; Dawy, Zaher; Nemer, Georges; Kobeissy, Firas
2017-01-01
The crucial biological role of proteases has become evident with the development of degradomics, the discipline concerned with identifying proteases and their substrates, whose cleavage yields breakdown products (BDPs) that can be used as putative biomarkers of varying biological and clinical significance. In cancer biology, matrix metalloproteinases (MMPs) have been shown to generate protein BDPs that are indicative of malignant growth, while in neural injury, the proteases calpain-2 and caspase-3 generate BDP fragments that are indicative of different neural cell death mechanisms in different injury scenarios. Advanced proteomic techniques have made remarkable progress in identifying these BDPs experimentally. In this work, we present a bioinformatics-based prediction method that identifies protease-associated BDPs with high precision and efficiency. The method utilizes state-of-the-art sequence matching and alignment algorithms. It starts by locating consensus sequence occurrences and their variants in any set of protein substrates and then generates all fragments resulting from cleavage. The method requires O(mn) space and O(Nmn) time, where N, m, and n are the number of protein sequences, the length of the consensus sequence, and the length of each protein sequence, respectively. Finally, the proposed methodology is validated against βII-spectrin, a validated brain injury biomarker.
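The two steps named in the abstract, locating consensus occurrences (with variants) and generating the resulting fragments, can be sketched as follows. This is a simplified stand-in, not the published algorithm: it uses a sliding window with a mismatch budget instead of the alignment machinery the authors describe. The motif DEVD is the classic caspase-3 recognition sequence, used here only as an example; the substrate sequence, the `max_mismatches` parameter and the `X` wildcard convention are assumptions made for illustration.

```python
def cleavage_sites(protein, consensus, cut_offset, max_mismatches=0):
    """Return cut positions for windows matching `consensus`.

    A window matches when it differs from `consensus` in at most
    `max_mismatches` positions ('X' in the consensus matches anything).
    The cut is placed `cut_offset` residues after the window start.
    Runs in O(m * n) time for one protein of length n and motif length m.
    """
    m = len(consensus)
    sites = []
    for start in range(len(protein) - m + 1):
        window = protein[start:start + m]
        mismatches = sum(1 for c, r in zip(consensus, window)
                         if c != "X" and c != r)
        if mismatches <= max_mismatches:
            sites.append(start + cut_offset)
    return sites


def fragments(protein, sites):
    """Split `protein` at every cut position and return the pieces in order."""
    cuts = [0] + sorted(set(sites)) + [len(protein)]
    return [protein[a:b] for a, b in zip(cuts, cuts[1:]) if a < b]


# Hypothetical toy substrate with one exact DEVD site and one variant (DEAD).
substrate = "MKTADEVDGSLQWDEADGKR"

exact = cleavage_sites(substrate, "DEVD", cut_offset=4)
relaxed = cleavage_sites(substrate, "DEVD", cut_offset=4, max_mismatches=1)

print(fragments(substrate, exact))    # ['MKTADEVD', 'GSLQWDEADGKR']
print(fragments(substrate, relaxed))  # ['MKTADEVD', 'GSLQWDEAD', 'GKR']
```

Applying `cleavage_sites` to each of N substrates gives the O(Nmn) running time quoted above. This version only emits the fragments of a complete digest; enumerating partial digests (fragments spanning uncut sites) would need an extra loop over pairs of cut positions.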
Semiz, Asli; Sen, Alaattin
2015-03-01
Cytochrome P450 monooxygenases mediate a broad range of oxidative reactions involved in the biosynthesis of both primary and secondary metabolites in plants. Until now, only two P450 genes, CYP720B1 from Pinus taeda and CYP720B4 from Picea sitchensis, have been functionally characterised and described in the literature. The purpose of this study was to describe the cloning and expression of CYP720B from Pinus brutia due to its suggested role in the synthesis of bioactive compounds used for chemical defence against insects. A PCR product of the P. brutia CYP720B gene was cloned into the pCR8/GW/TOPO cloning vector. After optimising the sequence for codon usage in yeast, it was transferred into the inducible expression vector pYES-DEST52 and transformed into the S. cerevisiae INVSc1 strain. Sequence analysis showed that the P. brutia CYP720B gene contains an open reading frame of 1,464 nucleotides, which encodes a 53,570 Da putative protein of 487 amino acid residues. The putative protein contains the classic heme-binding sequence motif that is conserved in all P450 enzymes. It shares 99 and 61% identity with the deduced amino acid sequences of CYP720B1 from Pinus taeda and CYP720B4 from Picea sitchensis, respectively. Recombinant CYP720B protein expression was confirmed using western blot analysis. Furthermore, recombinant CYP720B was functionally active, showing a Soret peak at approximately 448 nm in the reduced CO difference spectra. These data suggest that the cloned gene is an orthologue of CYP720B in P. brutia and might be involved in diterpene resin acid (DRA) biosynthesis.
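The figures in this abstract can be cross-checked with a little arithmetic. The sketch below assumes that the reported 1,464-nucleotide open reading frame includes the stop codon (the abstract does not say so explicitly); under that assumption the length matches the stated 487 residues. It also shows that the stated mass works out to exactly 110 Da per residue, which is consistent with the common rule-of-thumb average, although the abstract does not state how the mass was obtained.

```python
def protein_length_from_orf(orf_nt, includes_stop=True):
    """Number of amino acids encoded by an ORF of `orf_nt` nucleotides."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nt // 3
    # Assumption: the stop codon is counted in the ORF but encodes no residue.
    return codons - 1 if includes_stop else codons


residues = protein_length_from_orf(1464)
mean_residue_mass = 53570 / residues  # Da per residue, from the stated mass

print(residues)           # 487
print(mean_residue_mass)  # 110.0
```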
Domain fusion analysis by applying relational algebra to protein sequence and domain databases.
Truong, Kevin; Ikura, Mitsuhiko
2003-05-06
Domain fusion analysis is a useful method to predict functionally linked proteins that may be involved in direct protein-protein interactions or in the same metabolic or signaling pathway. As separate domain databases like BLOCKS, PROSITE, Pfam, SMART, PRINTS-S, ProDom, TIGRFAMs, and amalgamated domain databases like InterPro continue to grow in size and quality, a computational method to perform domain fusion analysis that leverages these efforts will become increasingly powerful. This paper proposes a computational method employing relational algebra to find domain fusions in protein sequence databases. The feasibility of this method was illustrated on the SWISS-PROT+TrEMBL sequence database using domain predictions from the Pfam HMM (hidden Markov model) database. We identified 235 and 189 putative functionally linked protein partners in H. sapiens and S. cerevisiae, respectively. From the scientific literature, we were able to confirm many of these functional linkages, while the remainder offer testable experimental hypotheses. Results can be viewed at http://calcium.uhnres.utoronto.ca/pi. As the analysis can be computed quickly on any relational database that supports standard SQL (structured query language), it can be dynamically updated along with the sequence and domain databases, thereby improving the quality of predictions over time.
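The idea of expressing domain fusion analysis as relational joins can be illustrated with a small sketch. The table layout (`hits` with `protein`, `organism`, `domain` columns), the function name, and the toy identifiers below are assumptions made for illustration; the paper's actual schema and queries are not given in the abstract. The query looks for a "fused" protein carrying two domains and pairs it with two separate proteins in the target organism that each carry only one of them.

```python
import sqlite3


def find_domain_fusions(rows, target_organism):
    """Return (partner_a, partner_b, fused) triples.

    rows: iterable of (protein, organism, domain) domain assignments
    (hypothetical layout). partner_a and partner_b are distinct proteins
    of target_organism whose domains co-occur in the single protein 'fused'.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hits (protein TEXT, organism TEXT, domain TEXT)")
    conn.executemany("INSERT INTO hits VALUES (?, ?, ?)", rows)
    query = """
        SELECT DISTINCT a.protein AS partner_a,
                        b.protein AS partner_b,
                        fa.protein AS fused
        FROM hits AS fa
        JOIN hits AS fb ON fa.protein = fb.protein AND fa.domain < fb.domain
        JOIN hits AS a  ON a.domain = fa.domain AND a.organism = ?
        JOIN hits AS b  ON b.domain = fb.domain AND b.organism = ?
        WHERE a.protein <> b.protein
          AND a.protein <> fa.protein
          AND b.protein <> fa.protein
          -- each partner must lack the other partner's domain
          AND NOT EXISTS (SELECT 1 FROM hits x
                          WHERE x.protein = a.protein AND x.domain = fb.domain)
          AND NOT EXISTS (SELECT 1 FROM hits y
                          WHERE y.protein = b.protein AND y.domain = fa.domain)
        ORDER BY partner_a, partner_b, fused
    """
    result = conn.execute(query, (target_organism, target_organism)).fetchall()
    conn.close()
    return result


# Toy data: one fused protein in organism X, separate proteins in organism Y.
toy_rows = [
    ("fusAB", "X", "DOM_A"), ("fusAB", "X", "DOM_B"),
    ("yA", "Y", "DOM_A"), ("yB", "Y", "DOM_B"), ("yC", "Y", "DOM_C"),
]
print(find_domain_fusions(toy_rows, "Y"))  # [('yA', 'yB', 'fusAB')]
```

Because the whole analysis is a single query, it can simply be re-run whenever the underlying tables are refreshed, which matches the abstract's point about dynamic updating.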
The nagA gene of Penicillium chrysogenum encoding beta-N-acetylglucosaminidase.
Díez, Bruno; Rodríguez-Sáiz, Marta; de la Fuente, Juan Luis; Moreno, Miguel Angel; Barredo, José Luis
2005-01-15
We purified the beta-N-acetylglucosaminidase from the filamentous fungus Penicillium chrysogenum and determined its N-terminal sequence, which showed the presence of a mixture of two proteins (P1 and P2). A genomic DNA fragment was cloned by using degenerate oligonucleotides designed from the N-terminal sequences. The nucleotide sequence showed the presence of an ORF (nagA gene) lacking introns, with a length of 1791 bp, and coding for a protein of 66.5 kDa showing similarity to acetylglucosaminidases. The deduced NagA protein includes P1 and P2 as incomplete forms of the mature protein, and contains putative features for protein maturation: an 18-amino acid signal peptide, a KEX2 processing site, and four glycosylation motifs. The sequence just after the signal peptide corresponds to P2 and that after the KEX2 site to P1. The nagA transcript has a size of about 2.1 kb and is present until the end of the fermentation process for penicillin production. NagA is one of the most abundant proteins in P. chrysogenum, and its level increases over the course of the fermentation process. The suitability of the nagA promoter (PnagA) for gene expression in fungi was demonstrated by expressing the bleomycin resistance gene (ble(R)) from Streptoalloteichus hindustanus in P. chrysogenum.
Ventura, Marco; Jankovic, Ivana; Walker, D. Carey; Pridmore, R. David; Zink, Ralf
2002-01-01
We have identified and sequenced the genes encoding the aggregation-promoting factor (APF) protein from six different strains of Lactobacillus johnsonii and Lactobacillus gasseri. Both species harbor two apf genes, apf1 and apf2, which are in the same orientation and encode proteins of 257 to 326 amino acids. Multiple alignments of the deduced amino acid sequences of these apf genes demonstrate a very strong sequence conservation of all of the genes with the exception of their central regions. Northern blot analysis showed that both genes are transcribed, reaching their maximum expression during the exponential phase. Primer extension analysis revealed that apf1 and apf2 harbor a putative promoter sequence that is conserved in all of the genes. Western blot analysis of the LiCl cell extracts showed that APF proteins are located on the cell surface. Intact cells of L. johnsonii revealed the typical cell wall architecture of S-layer-carrying gram-positive eubacteria, which could be selectively removed with LiCl treatment. In addition, the amino acid composition, physical properties, and genetic organization were found to be quite similar to those of S-layer proteins. These results suggest that APF is a novel surface protein of the Lactobacillus acidophilus B-homology group which might belong to an S-layer-like family. PMID:12450842
Oh, Ji-eun; Karlmark, Karlin Raja; Shin, Jooho; Hengstschläger, Markus; Lubec, Gert
2006-05-15
Several protein cascades, including signaling, cytoskeletal, chaperones, metabolic, and antioxidant proteins, have been shown to be involved in the process of neuronal differentiation (ND) of neuroblastoma cell lines. No systematic approach to detect hitherto unknown and unnamed proteins or structures that have been predicted upon nucleic acid sequences in ND has been published so far. We therefore decided to screen hypothetical protein (HP) expression by protein profiling. Two-dimensional gel electrophoresis with subsequent matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF/TOF) identification was used for expression analysis of undifferentiated and dimethylsulfoxide-induced neuronally differentiated N1E-115 cells. We unambiguously identified six HPs: Q8C520, Q99LF4, Q9CXS1, Q9DAF8, Q91WT0, and Q8C5G2. A prefoldin domain was observed in Q91WT0, a t-SNARE domain in Q9CXS1, and a bromodomain in Q8C5G2. For the three remaining proteins, no putative function could be assigned using Pfam, BLOCKS, PROSITE, PRINTS, InterPro, Superfamily, CoPS, and ExPASy. While two proteins were present in both cell lines, Q9CXS1 was switched off (i.e., undetectably low) in differentiated cells only, and Q9DAF8, Q91WT0, and Q8C5G2 were switched on in differentiated cells exclusively. Herein, using a proteomic approach suitable for screening and identification of HPs, we present HP structures that have so far been predicted only from nucleic acid sequences. The four differentially regulated HPs may play a role in the process of ND. (c) 2006 Wiley-Liss, Inc.
Adams, C; Dowling, D N; O'Sullivan, D J; O'Gara, F
1994-06-03
An iron-regulated gene, pbsC, required for siderophore production in fluorescent Pseudomonas sp. strain M114 has been identified. A kanamycin-resistance cassette was inserted at specific restriction sites within a 7 kb genomic fragment of M114 DNA and by marker exchange two siderophore-negative mutants, designated M1 and M2, were isolated. The nucleotide sequence of approximately 4 kb of the region flanking the insertion sites was determined and a large open reading frame (ORF) extending for 2409 bp was identified. This gene was designated pbsC (pseudobactin synthesis C) and its putative protein product termed PbsC. PbsC was found to be homologous to a family of enzymes involved in the biosynthesis of secondary metabolites, including EntF of Escherichia coli. These enzymes are believed to act via ATP-dependent binding of AMP to their substrate. Several areas of high sequence homology between these proteins and PbsC were observed, including a conserved AMP-binding domain. The expression of pbsC is iron-regulated as revealed when a DNA fragment containing the upstream region was cloned in a promoter probe vector and conjugated into the wild-type strain, M114. The nucleotide sequence upstream of the putative translational start site contains a region homologous to previously defined -16 to -25 sequences of iron-regulated genes but did not contain an iron-box consensus sequence. It was noted that inactivation of the pbsC gene also affected other iron-regulated phenotypes of Pseudomonas M114.
Previously unknown and highly divergent ssDNA viruses populate the oceans.
Labonté, Jessica M; Suttle, Curtis A
2013-11-01
Single-stranded DNA (ssDNA) viruses are economically important pathogens of plants and animals, and are widespread in oceans; yet, the diversity and evolutionary relationships among marine ssDNA viruses remain largely unknown. Here we present the results from a metagenomic study of composite samples from temperate (Saanich Inlet, 11 samples; Strait of Georgia, 85 samples) and subtropical (46 samples, Gulf of Mexico) seawater. Most sequences (84%) had no evident similarity to sequenced viruses. In total, 608 putative complete genomes of ssDNA viruses were assembled, almost doubling the number of ssDNA viral genomes in databases. These comprised 129 genetically distinct groups, each represented by at least one complete genome that had no recognizable similarity to each other or to other virus sequences. Given that the seven recognized families of ssDNA viruses have considerable sequence homology within them, this suggests that many of these genetic groups may represent new viral families. Moreover, nearly 70% of the sequences were similar to one of these genomes, indicating that most of the sequences could be assigned to a genetically distinct group. Most sequences fell within 11 well-defined gene groups, each sharing a common gene. Some of these encoded putative replication and coat proteins that had similarity to sequences from viruses infecting eukaryotes, suggesting that these were likely from viruses infecting eukaryotic phytoplankton and zooplankton.
Neuwald, Andrew F
2009-08-01
The patterns of sequence similarity and divergence present within functionally diverse, evolutionarily related proteins contain implicit information about corresponding biochemical similarities and differences. A first step toward accessing such information is to statistically analyze these patterns, which, in turn, requires that one first identify and accurately align a very large set of protein sequences. Ideally, the set should include many distantly related, functionally divergent subgroups. Because it is extremely difficult, if not impossible for fully automated methods to align such sequences correctly, researchers often resort to manual curation based on detailed structural and biochemical information. However, multiply-aligning vast numbers of sequences in this way is clearly impractical. This problem is addressed using Multiply-Aligned Profiles for Global Alignment of Protein Sequences (MAPGAPS). The MAPGAPS program uses a set of multiply-aligned profiles both as a query to detect and classify related sequences and as a template to multiply-align the sequences. It relies on Karlin-Altschul statistics for sensitivity and on PSI-BLAST (and other) heuristics for speed. Using as input a carefully curated multiple-profile alignment for P-loop GTPases, MAPGAPS correctly aligned weakly conserved sequence motifs within 33 distantly related GTPases of known structure. By comparison, the sequence- and structurally based alignment methods hmmalign and PROMALS3D misaligned at least 11 and 23 of these regions, respectively. When applied to a dataset of 65 million protein sequences, MAPGAPS identified, classified and aligned (with comparable accuracy) nearly half a million putative P-loop GTPase sequences. A C++ implementation of MAPGAPS is available at http://mapgaps.igs.umaryland.edu. Supplementary data are available at Bioinformatics online.
Didi, Jennifer; Lemée, Ludovic; Gibert, Laure; Pons, Jean-Louis
2014-01-01
Staphylococcus lugdunensis is an emergent virulent coagulase-negative staphylococcus responsible for severe infections similar to those caused by Staphylococcus aureus. To understand its potentially pathogenic capacity and have further detailed knowledge of the molecular traits of this organism, 93 isolates from various geographic origins were analyzed by multi-virulence-locus sequence typing (MVLST), targeting seven known or putative virulence-associated loci (atlLR2, atlLR3, hlb, isdJ, SLUG_09050, SLUG_16930, and vwbl). The polymorphisms of the putative virulence-associated loci were moderate and comparable to those of the housekeeping genes analyzed by multilocus sequence typing (MLST). However, the MVLST scheme generated 43 virulence types (VTs) compared to 20 sequence types (STs) based on MLST, indicating that MVLST was significantly more discriminating (Simpson's index [D], 0.943). No hypervirulent lineage or cluster specific to carriage strains was defined. The results of multilocus sequence analysis of known and putative virulence-associated loci are consistent with a clonal population structure for S. lugdunensis, suggesting a coevolution of these genes with housekeeping genes. Indeed, the nonsynonymous to synonymous evolutionary substitutions (dN/dS) ratio, the Tajima's D test, and Single-likelihood ancestor counting (SLAC) analysis suggest that all virulence-associated loci were under negative selection, even atlLR2 (AtlL protein) and SLUG_16930 (FbpA homologue), for which the dN/dS ratios were higher. In addition, this analysis of virulence-associated loci allowed us to propose a trilocus sequence typing scheme based on the intragenic regions of atlLR3, isdJ, and SLUG_16930, which is more discriminant than MLST for studying short-term epidemiology and further characterizing the lineages of the rare but highly pathogenic S. lugdunensis. PMID:25078912
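The discriminatory power reported here as Simpson's index (D) can be computed from a list of type assignments. The sketch below assumes the commonly used formulation for typing schemes, D = 1 - Σ n_j(n_j - 1) / (N(N - 1)), where n_j is the number of isolates of type j and N is the total number of isolates; the abstract does not state which variant was used, and the function name and toy labels are illustrative only.

```python
from collections import Counter


def simpson_diversity(type_labels):
    """Simpson's index of diversity for a list of per-isolate type labels.

    Assumes D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)); this is the
    probability that two isolates drawn without replacement differ in type.
    """
    counts = list(Counter(type_labels).values())
    total = sum(counts)
    if total < 2:
        raise ValueError("at least two isolates are required")
    same_type_pairs = sum(c * (c - 1) for c in counts)
    return 1.0 - same_type_pairs / (total * (total - 1))


# Toy example: four isolates, two of which share a virulence type.
print(round(simpson_diversity(["VT1", "VT1", "VT2", "VT3"]), 4))  # 0.8333
```

A scheme that splits the same isolates into more, smaller groups yields a higher D, which is the sense in which 43 virulence types are more discriminating than 20 sequence types for the same 93 isolates.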
Sakai, Yoriko; Ogawa, Naoto; Shimomura, Yumi; Fujii, Takeshi
2014-03-01
Analysis of the complete nucleotide sequence of plasmid pM7012 from 2,4-dichlorophenoxyacetic-acid (2,4-D)-degrading bacterium Burkholderia sp. M701 revealed that the plasmid had 582 142 bp, with 541 putative protein-coding sequences and 39 putative tRNA genes for the transport of the standard 20 aa. pM7012 contains sequences homologous to the regions involved in conjugal transfer and plasmid maintenance found in plasmids byi_2p from Burkholderia sp. YI23 and pBVIE01 from Burkholderia sp. G4. No relaxase gene was found in any of these plasmids, although genes for a type IV secretion system and type IV coupling proteins were identified. Plasmids with no relaxase gene have been classified as non-mobile plasmids. However, nucleotide sequences with a high level of similarity to the genes for plasmid transfer, plasmid maintenance, 2,4-D degradation and arsenic resistance contained on pM7012 were also detected in eight other megaplasmids (~600 or 900 kb) found in seven Burkholderia strains and a strain of Cupriavidus, which were isolated as 2,4-D-degrading bacteria in Japan and the United States. These results suggested that the 2,4-D degradation megaplasmids related to pM7012 are mobile and distributed across various bacterial species worldwide, and that the plasmid group could be distinguished from known mobile plasmid groups.
Complete Genomic Structure of the Bloom-forming Toxic Cyanobacterium Microcystis aeruginosa NIES-843
Kaneko, Takakazu; Nakajima, Nobuyoshi; Okamoto, Shinobu; Suzuki, Iwane; Tanabe, Yuuhiko; Tamaoki, Masanori; Nakamura, Yasukazu; Kasai, Fumie; Watanabe, Akiko; Kawashima, Kumiko; Kishida, Yoshie; Ono, Akiko; Shimizu, Yoshimi; Takahashi, Chika; Minami, Chiharu; Fujishiro, Tsunakazu; Kohara, Mitsuyo; Katoh, Midori; Nakazaki, Naomi; Nakayama, Shinobu; Yamada, Manabu; Tabata, Satoshi; Watanabe, Makoto M.
2007-01-01
The nucleotide sequence of the complete genome of a cyanobacterium, Microcystis aeruginosa NIES-843, was determined. The genome of M. aeruginosa is a single, circular chromosome of 5 842 795 base pairs (bp) in length, with an average GC content of 42.3%. The chromosome comprises 6312 putative protein-encoding genes, two sets of rRNA genes, 42 tRNA genes representing 41 tRNA species, and genes for tmRNA, the B subunit of RNase P, SRP RNA, and 6Sa RNA. Forty-five percent of the putative protein-encoding sequences showed sequence similarity to genes of known function, 32% were similar to hypothetical genes, and the remaining 23% had no apparent similarity to reported genes. A total of 688 kb of the genome, equivalent to 11.8% of the entire genome, were composed of both insertion sequences and miniature inverted-repeat transposable elements. This is indicative of a plasticity of the M. aeruginosa genome, through a mechanism that involves homologous recombination mediated by repetitive DNA elements. In addition to known gene clusters related to the synthesis of microcystin and cyanopeptolin, novel gene clusters that may be involved in the synthesis and modification of toxic small polypeptides were identified. Compared with other cyanobacteria, a relatively small number of genes for two-component systems and a large number of genes for restriction-modification systems were notable characteristics of the M. aeruginosa genome. PMID:18192279
USDA-ARS's Scientific Manuscript database
The concept of utilizing putative and unique gene sequences for the design of species specific probes was tested. The abundance profile of assigned functions within the Lactobacillus plantarum genome was used for the identification of the putative and unique gene sequence, csh. The targeted gene (cs...
G Protein-Coupled Receptors in Anopheles gambiae
NASA Astrophysics Data System (ADS)
Hill, Catherine A.; Fox, A. Nicole; Pitts, R. Jason; Kent, Lauren B.; Tan, Perciliz L.; Chrystal, Mathew A.; Cravchik, Anibal; Collins, Frank H.; Robertson, Hugh M.; Zwiebel, Laurence J.
2002-10-01
We used bioinformatic approaches to identify a total of 276 G protein-coupled receptors (GPCRs) from the Anopheles gambiae genome. These include GPCRs that are likely to play roles in pathways affecting almost every aspect of the mosquito's life cycle. Seventy-nine candidate odorant receptors were characterized for tissue expression and, along with 76 putative gustatory receptors, for their molecular evolution relative to Drosophila melanogaster. Examples of lineage-specific gene expansions were observed as well as a single instance of unusually high sequence conservation.
Pomel, Sébastien; Diogon, Marie; Bouchard, Philippe; Pradel, Lydie; Ravet, Viviane; Coffe, Gérard; Viguès, Bernard
2006-02-01
Previous attempts to identify the membrane skeleton of Paramecium cells have revealed a protein pattern that is both complex and specific. The most prominent structural elements, epiplasmic scales, are centered around ciliary units and are closely apposed to the cytoplasmic side of the inner alveolar membrane. We sought to characterize epiplasmic scale proteins (epiplasmins) at the molecular level. PCR approaches enabled the cloning and sequencing of two closely related genes by amplifications of sequences from a macronuclear genomic library. Using these two genes (EPI-1 and EPI-2), we have contributed to the annotation of the Paramecium tetraurelia macronuclear genome and identified 39 additional (paralogous) sequences. Two orthologous sequences were found in the Tetrahymena thermophila genome. Structural analysis of the 43 sequences indicates that the hallmark of this new multigenic family is a 79 aa domain flanked by two Q-, P- and V-rich stretches of sequence that are much more variable in amino-acid composition. Such features clearly distinguish members of the multigenic family from epiplasmic proteins previously sequenced in other ciliates. The expression of Green Fluorescent Protein (GFP)-tagged epiplasmin showed significant labeling of epiplasmic scales as well as oral structures. We expect that the GFP construct described herein will prove to be a useful tool for comparative subcellular localization of different putative epiplasmins in Paramecium.
Walker, M D; Park, C W; Rosen, A; Aronheim, A
1990-01-01
Cell-specific expression of the insulin gene is achieved through transcriptional mechanisms operating on multiple DNA sequence elements located in the 5' flanking region of the gene. Of particular importance in the rat insulin I gene are two closely similar 9 bp sequences (IEB1 and IEB2): mutation of either of these leads to a 5-10 fold reduction in transcriptional activity. We have screened an expression cDNA library derived from mouse pancreatic endocrine beta cells with a radioactive DNA probe containing multiple copies of the IEB1 sequence. A cDNA clone (A1) isolated by this procedure encodes a protein which shows efficient binding to the IEB1 probe, but much weaker binding to either an unrelated DNA probe or to a probe bearing a single base pair insertion within the recognition sequence. DNA sequence analysis indicates a protein belonging to the helix-loop-helix family of DNA-binding proteins. The ability of the protein encoded by clone A1 to recognize a number of wild type and mutant DNA sequences correlates closely with the ability of each sequence element to support transcription in vivo in the context of the insulin 5' flanking DNA. We conclude that the isolated cDNA may encode a transcription factor that participates in control of insulin gene expression. PMID:2181401
Stieb, Stefanie; Roth, Ziv; Dal Magro, Christina; Fischer, Sabine; Butz, Eric; Sagi, Amir; Khalaila, Isam; Lieb, Bernhard; Schenk, Sven; Hoeger, Ulrich
2014-12-01
The novel discoidal lipoprotein (dLp) recently detected in the crayfish differs from other crustacean lipoproteins in its large size, apoprotein composition and high lipid binding capacity. We identified the dLp sequence by transcriptome analyses of the hepatopancreas and mass spectrometry. Further de novo assembly of the NGS data followed by BLAST searches using the sequence of the high density lipoprotein/β-glucan binding protein (HDL-BGBP) of Astacus leptodactylus as query revealed a putative precursor molecule with an open reading frame of 14.7 kb and a deduced primary structure of 4889 amino acids. The presence of an N-terminal lipid binding domain and a DUF 1943 domain suggests a relationship with the large lipid transfer proteins. Two putative dibasic furin cleavage sites were identified bordering the sequence of the HDL-BGBP. When subjected to mass spectrometric analyses, tryptic peptides of the large apoprotein of dLp matched the N-terminal part of the precursor, while the peptides obtained for its small apoprotein matched the C-terminal part. Repeating the analysis in the prawn Macrobrachium rosenbergii revealed a similar protein with identical domain architecture, suggesting that our findings do not represent an isolated instance. Our results indicate that the above three apolipoproteins (i.e., HDL-BGBP and both the large and the small subunit of dLp) are translated as a large precursor. Cleavage at the furin-type sites releases two subunits forming a heterodimeric dLp particle, while the remaining part forms an HDL-BGBP whose relationship with other lipoproteins as well as specific functions are yet to be elucidated.
Jobin, Michel-Philippe; Garmyn, Dominique; Diviès, Charles; Guzzo, Jean
1999-01-01
Using degenerated primers from conserved regions of previously studied clpX gene products, we cloned the clpX gene of the malolactic bacterium Oenococcus oeni. The clpX gene was sequenced, and the deduced protein of 413 amino acids (predicted molecular mass of 45,650 Da) was highly similar to previously analyzed clpX gene products from other organisms. An open reading frame located upstream of the clpX gene was identified as the tig gene by similarity of its predicted product to other bacterial trigger factors. ClpX was purified by using a maltose binding protein fusion system and was shown to possess an ATPase activity. Northern analyses indicated the presence of two independent 1.6-kb monocistronic clpX and tig mRNAs and also showed an increase in clpX mRNA amount after a temperature shift from 30 to 42°C. The clpX transcript is abundant in the early exponential growth phase and progressively declines to undetectable levels in the stationary phase. Thus, unlike hsp18, the gene encoding one of the major small heat shock proteins of Oenococcus oeni, clpX expression is related to the exponential growth phase and requires de novo protein synthesis. Primer extension analysis identified the 5′ end of clpX mRNA which is located 408 nucleotides upstream of a putative AUA start codon. The putative transcription start site allowed identification of a predicted promoter sequence with a high similarity to the consensus sequence found in the housekeeping gene promoter of gram-positive bacteria as well as Escherichia coli. PMID:10542163
Vela-Corcía, David; Bautista, Rocío; de Vicente, Antonio; Spanu, Pietro D.; Pérez-García, Alejandro
2016-01-01
The cucurbit powdery mildew fungus Podosphaera xanthii is a major limiting factor for cucurbit production worldwide. Despite the fungus’s agronomic and economic importance, very little is known about fundamental aspects of P. xanthii biology, such as obligate biotrophy or pathogenesis. To design more durable control strategies, genomic information about P. xanthii is needed. Powdery mildews are fungal pathogens with large genomes compared with those of other fungi, which contain vast amounts of repetitive DNA sequences, much of which is composed of retrotransposons. To reduce genome complexity, in this work we aimed to obtain and analyse the epiphytic transcriptome of P. xanthii as a starting point for genomic research. Total RNA was isolated from epiphytic fungal material, and the corresponding cDNA library was sequenced using a 454 GS FLX platform. Over 676,562 reads were obtained and assembled into 37,241 contigs. Annotation data identified 8,798 putative genes with different orthologues. As described for other powdery mildew fungi, a similar set of missing core ascomycete genes was found, which may explain obligate biotrophy. To gain insight into the plant-pathogen relationships, special attention was focused on the analysis of the secretome. After this analysis, 137 putative secreted proteins were identified, including 53 candidate secreted effector proteins (CSEPs). Consistent with a putative role in pathogenesis, the expression profile observed for some of these CSEPs showed expression maxima at the beginning of the infection process at 24 h after inoculation, when the primary appressoria are mostly formed. Our data mark the onset of genomics research into this very important pathogen of cucurbits and shed some light on the intimate relationship between this pathogen and its host plant. PMID:27711117
Characterization of two new putative adhesins of Leptospira interrogans.
Figueredo, Jupciana M; Siqueira, Gabriela H; de Souza, Gisele O; Heinemann, Marcos B; Vasconcellos, Silvio A; Chapola, Erica G B; Nascimento, Ana L T O
2017-01-01
We here report the characterization of two novel proteins encoded by the genes LIC11122 and LIC12287, identified in the genome sequences of Leptospira interrogans and annotated, respectively, as a putative sigma factor and a hypothetical protein. The CDSs LIC11122 and LIC12287 have signal peptides SPII and SPI, respectively, and are predicted to be located mainly at the cytoplasmic membrane of the bacteria. The genes were cloned and the proteins expressed using Escherichia coli. Proteinase K digestion showed that both proteins are surface exposed. Evaluation of the interaction of the recombinant proteins with extracellular matrix components revealed that they bind laminin, and they were named Lsa19 (LIC11122) and Lsa14 (LIC12287), for Leptospiral-surface adhesin of 19 and 14 kDa, respectively. Binding was dependent on protein concentration and reached saturation, fulfilling the ligand-binding criteria. Reactivity of the recombinant proteins with human leptospirosis sera showed that Lsa19 and, to a lesser extent, Lsa14 are recognized by antibodies, suggesting that, most probably, Lsa19 is expressed during infection. The proteins interact with plasminogen and generate plasmin in the presence of urokinase-type plasminogen activator. Plasmin generation in Leptospira has been associated with tissue penetration and immune evasion strategies. The presence of a sigma factor on the cell surface playing a secondary role, probably mediating host-pathogen interaction, suggests that LIC11122 is a moonlighting protein candidate. Although establishing the biological significance of these putative adhesins will require the generation of mutants, our data suggest that Lsa19 is a potential candidate for future evaluation of its role in adhesion/colonization activities during L. interrogans infection.
The complete mitochondrial genome sequence of Aesopia cornuta (Pleuronectiformes: Soleidae).
Wang, Shu-Ying; Shi, Wei; Wang, Zhong-Ming; Gong, Li; Kong, Xiao-Yu
2015-02-01
Aesopia cornuta belongs to the family Soleidae of Pleuronectiformes, and its morphological characters are very similar to those of Zebrias. In this article, we sequenced, characterized, and compared the complete mitogenome of A. cornuta for the first time. The genome is 16,737 base pairs in length and consists of the typical 37 genes, including 13 protein-coding genes, two ribosomal RNA genes, and 22 transfer RNA genes, as well as a putative L-strand replication origin and a putative control region. The gene organization is identical to that of typical bony fishes. The overall base composition is 29.1, 28.3, 26.8 and 15.8% for C, A, T and G, respectively, with a slight AT bias of 55.1%. This result is expected to contribute to understanding the systematic evolution of the genus Aesopia and further taxonomic and phylogenetic studies of Soleidae and Pleuronectiformes.
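The AT bias quoted in this abstract follows directly from the reported base composition, as the short sketch below shows. The function names and the toy sequence are illustrative assumptions; the only figures taken from the abstract are the four percentages.

```python
from collections import Counter


def base_composition(sequence):
    """Percentage of each of A, C, G, T in a DNA sequence (other symbols ignored)."""
    counts = Counter(base for base in sequence.upper() if base in "ACGT")
    total = sum(counts.values())
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def at_content(composition):
    """AT content (in percent) from a composition given in percent."""
    return round(composition["A"] + composition["T"], 1)


# Composition as reported in the abstract (percent).
reported = {"C": 29.1, "A": 28.3, "T": 26.8, "G": 15.8}
print(at_content(reported))                  # 55.1
print(round(sum(reported.values()), 1))      # 100.0

# The same calculation starting from a toy sequence.
print(at_content(base_composition("AATTCG")))  # 66.7
```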
Penz, Thomas; Horn, Matthias; Schmitz-Esser, Stephan
2010-01-01
The recently sequenced genome of the obligate intracellular amoeba symbiont 'Candidatus Amoebophilus asiaticus' is unique among prokaryotic genomes due to its extremely large fraction of genes encoding proteins harboring eukaryotic domains such as ankyrin-repeats, TPR/SEL1 repeats, leucine-rich repeats, as well as F- and U-box domains, most of which likely serve in the interaction with the amoeba host. Here we provide evidence for the presence of additional proteins which are presumably presented extracellularly and should thus also be important for host cell interaction. Surprisingly, we did not find homologues of any of the well-known protein secretion systems required to translocate effector proteins into the host cell in the A. asiaticus genome, and the type six secretion system seems to be incomplete. Here we describe the presence of a putative prophage in the A. asiaticus genome, which shows similarity to the antifeeding prophage from the insect pathogen Serratia entomophila. In S. entomophila this system is used to deliver toxins into insect hosts. This putative antifeeding-like prophage might thus represent the missing protein secretion apparatus in A. asiaticus.
Carolan, James C; Fitzroy, Carol I J; Ashton, Peter D; Douglas, Angela E; Wilkinson, Thomas L
2009-05-01
Nine proteins secreted in the saliva of the pea aphid Acyrthosiphon pisum were identified by a proteomics approach using GE-LC-MS/MS and LC-MS/MS, with reference to EST and genomic sequence data for A. pisum. Four proteins were identified by their sequences: a homolog of angiotensin-converting enzyme (an M2 metalloprotease), an M1 zinc-dependent metalloprotease, a glucose-methanol-choline (GMC)-oxidoreductase and a homolog of regucalcin (also known as senescence marker protein 30). The other five proteins are not homologous to any previously described sequence and included an abundant salivary protein (represented by ACYPI009881), with a predicted length of 1161 amino acids and high serine, tyrosine and cysteine content. A. pisum feeds on plant phloem sap and the metalloproteases and regucalcin (a putative calcium-binding protein) are predicted determinants of sustained feeding, by inactivation of plant protein defences and inhibition of calcium-mediated occlusion of phloem sieve elements, respectively. The amino acid composition of ACYPI009881 suggests a role in the aphid salivary sheath that protects the aphid mouthparts from plant defences, and the oxidoreductase may promote gelling of the sheath protein or mediate oxidative detoxification of plant allelochemicals. Further salivary proteins are expected to be identified as more sensitive MS technologies are developed.
Nishida, Takashi; Watanabe, Kenta; Tachibana, Masato; Shimizu, Takashi; Watarai, Masahisa
2017-03-01
In this study, a cryptic plasmid pOfk55 from Legionella pneumophila was isolated and characterized. pOfk55 comprised 2584 bp with a GC content of 37.3% and contained three putative open reading frames (ORFs). orf1 encoded a protein of 195 amino acids and the putative protein shared 39% sequence identity with a putative plasmid replication protein RepL. ORF1 was needed for replication in L. pneumophila but pOfk55 did not replicate in Escherichia coli. orf2 and orf3 encoded putative hypothetical proteins of 114 amino acids and 78 amino acids, respectively, but the functions of the putative proteins ORF2 and ORF3 are not clear. The transfer mechanism for pOfk55 was independent of the type IVB secretion system in the original host. A L. pneumophila-E. coli shuttle vector, pNT562 (5058 bp, KmR), was constructed by In-Fusion Cloning of pOfk55 with a kanamycin-resistance gene from pUTmini-Tn5Km and the origin of replication from pBluescript SK(+) (pNT561). Multiple cloning sites from pBluescript SK(+) as well as the tac promoter region and lacI gene from pAM239-GFP were inserted into pNT561 to construct pNT562. The transformation efficiency of pNT562 in L. pneumophila strains ranged from 1.6×10^1 to 1.0×10^5 CFU/ng. The relative number of pNT562 was estimated at 5.7±1.0 copies and 73.6% of cells maintained the plasmid after 1 week in liquid culture without kanamycin. A green fluorescent protein (GFP) expression vector, pNT563, was constructed by ligating pNT562 with the gfpmut3 gene from pAM239-GFP. pNT563 was introduced into L. pneumophila Lp02 and E. coli DH5α, and both strains expressed GFP successfully. These results suggest that the shuttle vector is useful for genetic studies in L. pneumophila. Copyright © 2017 Elsevier Inc. All rights reserved.
Urra, Félix A; Pulgar, Rodrigo; Gutiérrez, Ricardo; Hodar, Christian; Cambiazo, Verónica; Labra, Antonieta
2015-12-15
Philodryas chamissonis is a rear-fanged snake endemic to Chile. Its bite produces mild to moderate symptoms with proteolytic and anti-coagulant effects. Presently, the composition of the venom, as well as the biochemical and structural characteristics of its toxins, remains unknown. In this study, we cloned and reported the first full-length sequences of five toxin-encoding genes from the venom gland of this species: Type III snake venom metalloprotease (SVMP), snake venom serine protease (SVSP), Cysteine-rich secretory protein (CRISP), α and β subunits of C-type lectin-like protein (CLP) and C-type natriuretic peptide (NP). These genes are highly expressed in the venom gland and their sequences exhibited a putative signal peptide, suggesting that these are components of the venom. These putative toxins had different evolutionary relationships with those reported for some front-fanged snakes: SVMP, SVSP and CRISP of P. chamissonis were closely related to the toxins present in Elapidae species, while NP was more closely related to those of Viperidae species. In addition, analyses suggest that the α and β subunits of CLP of P. chamissonis might have an α-subunit scaffold in common with Viperidae species, whose highly variable C-terminal region might have allowed the diversification into α and β subunits. Our results provide the first molecular description of the toxins possibly implicated in the envenomation of prey and humans by the bite of P. chamissonis. Copyright © 2015 Elsevier Ltd. All rights reserved.
Fritz, David T; Jiang, Shan; Xu, Junwang; Rogers, Melissa B
2006-07-01
The bone morphogenetic protein (BMP)2 gene has been genetically linked to osteoporosis and osteoarthritis. We have shown that the 3'-untranslated regions (UTR) of BMP2 genes from mammals to fishes are extraordinarily conserved. This indicates that the BMP2 3'-UTR is under stringent selective pressure. We present evidence that the conserved region is a strong posttranscriptional regulator of BMP2 expression. Polymorphisms in cis-regulatory elements have been proven to influence susceptibility to a growing number of diseases. A common single nucleotide polymorphism (SNP) disrupts a putative posttranscriptional regulatory motif, an AU-rich element, within the BMP2 3'-UTR. The affinity of specific proteins for the rs15705 SNP sequence differs from their affinity for the normal human sequence. More importantly, the in vitro decay rate of RNAs with the SNP is higher than that of RNAs with the normal sequence. Such changes in mRNA:protein interactions may influence the posttranscriptional mechanisms that control BMP2 gene expression. The consequent alterations in BMP2 protein levels may influence the development or physiology of bone or other BMP2-influenced tissues.
Abebe-Akele, Feseha; Tisa, Louis S; Cooper, Vaughn S; Hatcher, Philip J; Abebe, Eyualem; Thomas, W Kelley
2015-07-18
Entomopathogenic associations between nematodes in the genera Steinernema and Heterorhabditis with their cognate bacteria from the bacterial genera Xenorhabdus and Photorhabdus, respectively, are extensively studied for their potential as biological control agents against invasive insect species. These two highly coevolved associations are the result of convergent evolution. Given the natural abundance of bacteria, nematodes and insects, it is surprising that only these two associations with no intermediate forms are widely studied in the entomopathogenic context. Discovering analogous systems involving novel bacterial and nematode species would shed light on the evolutionary processes involved in the transition from free living organisms to obligatory partners in entomopathogenicity. We report the complete genome sequence of a new member of the enterobacterial genus Serratia that forms a putative entomopathogenic complex with Caenorhabditis briggsae. Analysis of the 5.04 MB chromosomal genome predicts 4599 protein coding genes, seven sets of ribosomal RNA genes, 84 tRNA genes and a 64.8 KB plasmid encoding 74 genes. Comparative genomic analysis with three of the previously sequenced Serratia species, S. marcescens DB11 and S. proteamaculans 568, and Serratia sp. AS12, revealed that these four representatives of the genus share a core set of ~3100 genes and extensive structural conservation. The newly identified species shares a more recent common ancestor with S. marcescens with 99% sequence identity in rDNA sequence and orthology across 85.6% of predicted genes. Of the 39 genes/operons implicated in the virulence, symbiosis, recolonization, immune evasion and bioconversion, 21 (53.8%) were present in Serratia while 33 (84.6%) and 35 (89%) were present in Xenorhabdus and Photorhabdus EPN bacteria respectively. The majority of unique sequences in Serratia sp. SCBI (South African Caenorhabditis briggsae Isolate) are found in ~29 genomic islands of 5 to 65 genes and are enriched in putative functions that are biologically relevant to an entomopathogenic lifestyle, including non-ribosomal peptide synthetases, bacteriocins, fimbrial biogenesis, ushering proteins, toxins, secondary metabolite secretion and multiple drug resistance/efflux systems. By revealing the early stages of adaptation to this lifestyle, the Serratia sp. SCBI genome underscores the fact that in EPN formation the composite end result (killing, bioconversion, cadaver protection and recolonization) can be achieved by dissimilar mechanisms. This genome sequence will enable further study of the evolution of entomopathogenic nematode-bacteria complexes.
HOXBES2: a novel epididymal HOXB2 homeoprotein and its domain-specific association with spermatozoa.
Prabagaran, E; Bandivdekar, A H; Dighe, V; Raghavan, V P
2007-02-01
The sperm from the testis acquires complete fertilizing ability and forward progressive motility following its transit through the epididymis. Acquisition of these characteristics results from the modification of the sperm proteome following interactions with epididymal secretions. In our attempts to identify epididymis-specific sperm plasma membrane proteins, a partial 2.83-kb clone was identified by immunoscreening a monkey epididymal cDNA library with an agglutinating monoclonal antibody raised against washed human spermatozoa. The sequence of the 2.83-kb clone exhibited homology to the region between 1 and 1097 bp of the homeobox gene, Hoxb2. This sequence was found to be species conserved, as revealed by RT-PCR analysis. To obtain a full-length clone of the sequence, 5' RACE-PCR (rapid amplification of cDNA ends PCR) was carried out using rat epididymal RNA as the template. It resulted in a full-length 1.657-kb cDNA encoding a 32.9-kDa putative protein. The protein designated HOXBES2 exhibited homology to the conserved 61-amino acid homeodomain region of the HOXB2 homeoprotein. However, characteristic differences were noted in its amino and carboxyl termini compared with HOXB2. A putative 30-kDa protein was detected in the tissue extracts from adult rat epididymis and caudal spermatozoa, and a 37-kDa protein was detected in the rat embryo when probed with a polyclonal antibody against HOXB2 protein. Multiple tissue Western blot and immunohistochemical analysis further indicated its expression in the cytoplasm of the principal and basal epithelial cells, with maximal expression in the distal epididymal segments. Northern blot analysis detected a single approximately 2.5-kb transcript from the adult epididymis. Indirect immunofluorescence localized the protein to the acrosome, midpiece, and equatorial segments of rat caudal and ejaculated human and monkey spermatozoa, respectively. 
In conclusion, we have identified and characterized a novel epididymal homeoprotein different from HOXB2 protein and hereafter referred to as HOXBES2, (HOXB2 homeodomain containing epididymis-specific sperm protein) with a probable role in fertilization.
Maurino, Fernanda; Dumón, Analía D; Llauger, Gabriela; Alemandri, Vanina; de Haro, Luis A; Mattio, M Fernanda; Del Vas, Mariana; Laguna, Irma Graciela; Giménez Pecci, María de la Paz
2018-01-01
A rhabdovirus infecting maize and wheat crops in Argentina was molecularly characterized. Through next-generation sequencing (NGS) of symptomatic leaf samples, the complete genomes of two isolates of maize yellow striate virus (MYSV), a putative new rhabdovirus, were obtained; they differ by only 0.4% at the nucleotide level. The MYSV genome consists of 12,654 nucleotides for both the maize and wheat virus isolates, and shares 71% nucleotide sequence identity with the complete genome of barley yellow striate mosaic virus (BYSMV, NC028244). Ten open reading frames (ORFs) were predicted in the MYSV genome from the antigenomic strand and were compared with their BYSMV counterparts. The highest amino acid sequence identity of the MYSV and BYSMV proteins was 80% between the L proteins, and the lowest was 37% between the proteins 4. Phylogenetic analysis suggested that the MYSV isolates are new members of the genus Cytorhabdovirus, family Rhabdoviridae. Yellow striate, affecting maize and wheat crops in Argentina, is an emergent disease that presents a potential economic risk for these widely distributed crops.
ProteinSeq: High-Performance Proteomic Analyses by Proximity Ligation and Next Generation Sequencing
Vänelid, Johan; Siegbahn, Agneta; Ericsson, Olle; Fredriksson, Simon; Bäcklin, Christofer; Gut, Marta; Heath, Simon; Gut, Ivo Glynne; Wallentin, Lars; Gustafsson, Mats G.; Kamali-Moghaddam, Masood; Landegren, Ulf
2011-01-01
Despite intense interest, methods that provide enhanced sensitivity and specificity in parallel measurements of candidate protein biomarkers in numerous samples have been lacking. We present herein a multiplex proximity ligation assay with readout via real-time PCR or DNA sequencing (ProteinSeq). We demonstrate improved sensitivity over conventional sandwich assays for simultaneous analysis of sets of 35 proteins in 5 µl of blood plasma. Importantly, we observe a minimal tendency to increased background with multiplexing, compared to a sandwich assay, suggesting that higher levels of multiplexing are possible. We used ProteinSeq to analyze proteins in plasma samples from cardiovascular disease (CVD) patient cohorts and matched controls. Three proteins, namely P-selectin, Cystatin-B and Kallikrein-6, were identified as putative diagnostic biomarkers for CVD. The latter two have not been previously reported in the literature and their potential roles must be validated in larger patient cohorts. We conclude that ProteinSeq is promising for screening large numbers of proteins and samples, and that the technology can provide a much-needed platform for validation of diagnostic markers in biobank samples and in clinical use. PMID:21980495
Arya, Gitanjali; Niven, Donald F
2011-03-24
Members of the Actinobacillus minor/"porcitonsillarum" complex are common inhabitants of the swine respiratory tract. Although avirulent or of low virulence for pigs, these organisms, like pathogens, do grow in vivo and must, therefore, be able to acquire iron within the host. Here, we investigated the abilities of six members of the A. minor/"porcitonsillarum" complex to acquire iron from transferrin and various haemoglobins. Using growth assays, all six strains were shown to acquire iron from porcine, bovine and human haemoglobins but not from porcine transferrin. Analyses of whole genome sequences revealed that A. minor strains NM305(T) and 202, unlike the swine-pathogenic actinobacilli, A. pleuropneumoniae and A. suis, lack not only the transferrin-binding protein genes, tbpA and tbpB, but also the haemoglobin-binding protein gene, hgbA. Strains NM305(T) and 202, however, were found to possess other putative haemin/haemoglobin-binding protein genes that were predicted to encode mature proteins of ∼ 72 and ∼ 75 kDa, respectively. An affinity procedure based on haemin-agarose allowed the isolation of ∼ 65 and ∼ 67 kDa iron-repressible outer membrane polypeptides from membranes derived from strains NM305(T) and 202, respectively, and mass spectrometry revealed that these polypeptides were the products of the putative haemin/haemoglobin-binding protein genes. PCR approaches allowed the amplification and sequencing of homologues of both haemin/haemoglobin-binding protein genes from each of the other four strains, strains 33PN and 7ATS of the A. minor/"porcitonsillarum" complex and "A. porcitonsillarum" strains 9953L55 and 0347, suggesting that such proteins are involved in the utilization of haemoglobin-bound iron, presumably as surface receptors, by all six strains investigated. Copyright © 2010 Elsevier B.V. All rights reserved.
Efficient prediction of human protein-protein interactions at a global scale.
Schoenrock, Andrew; Samanfar, Bahram; Pitre, Sylvain; Hooshyar, Mohsen; Jin, Ke; Phillips, Charles A; Wang, Hui; Phanse, Sadhna; Omidi, Katayoun; Gui, Yuan; Alamgir, Md; Wong, Alex; Barrenäs, Fredrik; Babu, Mohan; Benson, Mikael; Langston, Michael A; Green, James R; Dehne, Frank; Golshani, Ashkan
2014-12-10
Our knowledge of global protein-protein interaction (PPI) networks in complex organisms such as humans is hindered by technical limitations of current methods. On the basis of short co-occurring polypeptide regions, we developed a tool called MP-PIPE capable of predicting a global human PPI network within 3 months. With a recall of 23% at a precision of 82.1%, we predicted 172,132 putative PPIs. We demonstrate the usefulness of these predictions through a range of experiments. The speed and accuracy associated with MP-PIPE can make this a potential tool to study individual human PPI networks (from genomic sequences alone) for personalized medicine.
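The recall and precision figures quoted above follow the standard confusion-matrix definitions. The sketch below illustrates those definitions with made-up counts, then applies the reported precision to the reported number of predictions. Extrapolating a benchmark precision to the whole predicted set is an assumption made here for illustration, not a claim from the abstract.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted interactions that are true: TP / (TP + FP)."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Fraction of true interactions that were predicted: TP / (TP + FN)."""
    return tp / (tp + fn)


# Hypothetical benchmark counts, chosen only to land near the reported figures.
toy_tp, toy_fp, toy_fn = 82, 18, 275
toy_precision = precision(toy_tp, toy_fp)  # 82 / 100
toy_recall = recall(toy_tp, toy_fn)        # 82 / 357

# Figures quoted in the abstract.
predicted_ppis = 172_132
reported_precision = 0.821

# ASSUMPTION: the benchmark precision holds across all predictions.
expected_true_ppis = predicted_ppis * reported_precision
```

Under that assumption, roughly 141,000 of the 172,132 predicted interactions would be expected to be genuine.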
2013-01-01
Background The European spruce bark beetle, Ips typographus, and the North American mountain pine beetle, Dendroctonus ponderosae (Coleoptera: Curculionidae: Scolytinae), are severe pests of coniferous forests. Both bark beetle species utilize aggregation pheromones to coordinate mass-attacks on host trees, while odorants from host and non-host trees modulate the pheromone response. Thus, the bark beetle olfactory sense is of utmost importance for fitness. However, information on the genes underlying olfactory detection has been lacking in bark beetles and is limited in Coleoptera. We assembled antennal transcriptomes from next-generation sequencing of I. typographus and D. ponderosae to identify members of the major chemosensory multi-gene families. Results Gene ontology (GO) annotation indicated that the relative abundance of transcripts associated with specific GO terms was highly similar in the two species. Transcripts with terms related to olfactory function were found in both species. Focusing on the chemosensory gene families, we identified 15 putative odorant binding proteins (OBP), 6 chemosensory proteins (CSP), 3 sensory neuron membrane proteins (SNMP), 43 odorant receptors (OR), 6 gustatory receptors (GR), and 7 ionotropic receptors (IR) in I. typographus; and 31 putative OBPs, 11 CSPs, 3 SNMPs, 49 ORs, 2 GRs, and 15 IRs in D. ponderosae. Predicted protein sequences were compared with counterparts in the flour beetle, Tribolium castaneum, the cerambycid beetle, Megacyllene caryae, and the fruit fly, Drosophila melanogaster. The most notable result was found among the ORs, for which large bark beetle-specific expansions were found. However, some clades contained receptors from all four beetle species, indicating a degree of conservation among some coleopteran OR lineages. Putative GRs for carbon dioxide and orthologues for the conserved antennal IRs were included in the identified receptor sets. 
Conclusions The protein families important for chemoreception have now been identified in three coleopteran species (four species for the ORs). Thus, this study allows for improved evolutionary analyses of coleopteran olfaction. Identification of these proteins in two of the most destructive forest pests, sharing many semiochemicals, is especially important as they might represent novel targets for population control. PMID:23517120
Sequencing and phylogenetic analysis of tobacco virus 2, a polerovirus from Nicotiana tabacum.
Zhou, Benguo; Wang, Fang; Zhang, Xuesong; Zhang, Lina; Lin, Huafeng
2017-07-01
The complete genome sequence of a new virus, provisionally named tobacco virus 2 (TV2), was determined and identified from leaves of tobacco (Nicotiana tabacum) exhibiting leaf mosaic, yellowing, and deformity, in Anhui Province, China. The genome sequence of TV2 comprises 5,979 nucleotides, with 87% nucleotide sequence identity to potato leafroll virus (PLRV). Its genome organization is similar to that of PLRV, containing six open reading frames (ORFs) that potentially encode proteins with putative functions in cell-to-cell movement and suppression of RNA silencing. Phylogenetic analysis of the nucleotide sequence placed TV2 alongside members of the genus Polerovirus in the family Luteoviridae. To the best of our knowledge, this study is the first report of a complete genome sequence of a new polerovirus identified in tobacco.
Bamford, Vicki A; Armour, Maria; Mitchell, Sue A; Cartron, Michaël; Andrews, Simon C; Watson, Kimberly A
2008-09-01
YqjH is a cytoplasmic FAD-containing protein from Escherichia coli; based on homology to ViuB of Vibrio cholerae, it potentially acts as a ferri-siderophore reductase. This work describes its overexpression, purification, crystallization and structure solution at 3.0 Å resolution. YqjH shares high sequence similarity with a number of known siderophore-interacting proteins and its structure was solved by molecular replacement using the siderophore-interacting protein from Shewanella putrefaciens as the search model. The YqjH structure resembles those of other members of the NAD(P)H:flavin oxidoreductase superfamily.
Jonniaux, J L; Coster, F; Purnelle, B; Goffeau, A
1994-12-01
We report the amino acid sequence of 13 open reading frames (ORF > 299 bp) located on a 21.7 kb DNA segment from the left arm of chromosome XIV of Saccharomyces cerevisiae. Five open reading frames had been entirely or partially sequenced previously: WHI3, GCR2, SPX19, SPX18 and a heat shock gene similar to SSB1. The products of 8 other ORFs are new putative proteins among which N1394 is probably a membrane protein. N1346 contains a leucine zipper pattern and the corresponding ORF presents an HAP (global regulator of respiratory genes) upstream activating sequence in the promoting region. N1386 shares homologies with the DNA structure-specific recognition protein family SSRPs and the corresponding ORF is preceded by an MCB (MluI cell cycle box) upstream activating factor.
A bacterial Argonaute with noncanonical guide RNA specificity
Kaya, Emine; Doxzen, Kevin W.; Knoll, Kilian R.; Wilson, Ross C.; Strutt, Steven C.; Kranzusch, Philip J.; Doudna, Jennifer A.
2016-01-01
Eukaryotic Argonaute proteins induce gene silencing by small RNA-guided recognition and cleavage of mRNA targets. Although structural similarities between human and prokaryotic Argonautes are consistent with shared mechanistic properties, sequence and structure-based alignments suggested that Argonautes encoded within CRISPR-cas [clustered regularly interspaced short palindromic repeats (CRISPR)-associated] bacterial immunity operons have divergent activities. We show here that the CRISPR-associated Marinitoga piezophila Argonaute (MpAgo) protein cleaves single-stranded target sequences using 5′-hydroxylated guide RNAs rather than the 5′-phosphorylated guides used by all known Argonautes. The 2.0-Å resolution crystal structure of an MpAgo–RNA complex reveals a guide strand binding site comprising residues that block 5′ phosphate interactions. Using structure-based sequence alignment, we were able to identify other putative MpAgo-like proteins, all of which are encoded within CRISPR-cas loci. Taken together, our data suggest the evolution of an Argonaute subclass with noncanonical specificity for a 5′-hydroxylated guide. PMID:27035975
A Universal Trend among Proteomes Indicates an Oily Last Common Ancestor
Mannige, Ranjan V.; Brooks, Charles L.; Shakhnovich, Eugene I.
2012-01-01
Despite progress in ancestral protein sequence reconstruction, much remains to be unraveled about the nature of the putative last common ancestral proteome that served as the prototype of all extant lifeforms. Here, we present data that indicate a steady decline (oil escape) in proteome hydrophobicity over species evolvedness (node number) evident in 272 diverse proteomes, which indicates a highly hydrophobic (oily) last common ancestor (LCA). This trend, obtained from simple considerations (free from sequence reconstruction methods), was corroborated by regression studies within homologous and orthologous protein clusters as well as phylogenetic estimates of the ancestral oil content. While indicating an inherent irreversibility in molecular evolution, oil escape also serves as a rare and universal reaction-coordinate for evolution (reinforcing Darwin's principle of Common Descent), and may prove important in matters such as (i) explaining the emergence of intrinsically disordered proteins, (ii) developing composition- and speciation-based “global” molecular clocks, and (iii) improving the statistical methods for ancestral sequence reconstruction. PMID:23300421
Prm3p is a pheromone-induced peripheral nuclear envelope protein required for yeast nuclear fusion.
Shen, Shu; Tobery, Cynthia E; Rose, Mark D
2009-05-01
Nuclear membrane fusion is the last step in the mating pathway of the yeast Saccharomyces cerevisiae. We adapted a bioinformatics approach to identify putative pheromone-induced membrane proteins potentially required for nuclear membrane fusion. One protein, Prm3p, was found to be required for nuclear membrane fusion; disruption of PRM3 caused a strong bilateral defect, in which nuclear congression was completed but fusion did not occur. Prm3p was localized to the nuclear envelope in pheromone-responding cells, with significant colocalization with the spindle pole body in zygotes. A previous report, using a truncated protein, claimed that Prm3p is localized to the inner nuclear envelope. Based on biochemistry, immunoelectron microscopy and live cell microscopy, we find that functional Prm3p is a peripheral membrane protein exposed on the cytoplasmic face of the outer nuclear envelope. In support of this, mutations in a putative nuclear localization sequence had no effect on full-length protein function or localization. In contrast, point mutations and deletions in the highly conserved hydrophobic carboxy-terminal domain disrupted both protein function and localization. Genetic analysis, colocalization, and biochemical experiments indicate that Prm3p interacts directly with Kar5p, suggesting that nuclear membrane fusion is mediated by a protein complex.
Zhang, Lin-Lin; Tan, Mei-Juan; Liu, Guang-Lei; Chi, Zhe; Wang, Guang-Yuan; Chi, Zhen-Ming
2015-04-01
The INU1 gene encoding an exo-inulinase from the marine-derived yeast Candida membranifaciens subsp. flavinogenie W14-3 was cloned and characterized. It had an open reading frame 1,536 bp long encoding an inulinase. The coding region was not interrupted by any intron. The cloned gene encoded a protein of 512 amino acid residues with a putative signal peptide of 23 amino acids and a calculated molecular mass of 57.8 kDa. The protein sequence deduced from the inulinase gene contained the inulinase consensus sequences (WMNDPNGL), (RDP), ECP FS and Q. The protein also had six conserved putative N-glycosylation sites. The deduced inulinase from the yeast strain W14-3 was found to be closely related to that from Candida kutaonensis sp. nov. KRF1, Kluyveromyces marxianus, and Cryptococcus aureus G7a. The inulinase gene with its signal peptide encoding sequence was subcloned into the pMIRSC11 expression vector and expressed in Saccharomyces sp. W0. The recombinant yeast strain W14-3-INU-112 obtained could produce 16.8 U/ml of inulinase activity and 12.5% (v/v) ethanol from 250 g/l of inulin within 168 h. Monosaccharides were detected after the hydrolysis of inulin with the crude inulinase (the yeast culture). All the results indicated that the cloned gene and the recombinant yeast strain W14-3-INU-112 had potential applications in biotechnology.
Baird, Fiona J; Su, Xiaopei; Aibinu, Ibukun; Nolan, Matthew J; Sugiyama, Hiromu; Otranto, Domenico; Lopata, Andreas L; Cantacessi, Cinzia
2016-07-01
Food-borne nematodes of the genus Anisakis are responsible for a wide range of illnesses (= anisakiasis), from self-limiting gastrointestinal forms to severe systemic allergic reactions, which are often misdiagnosed and under-reported. In order to enhance and refine current diagnostic tools for anisakiasis, knowledge of the whole spectrum of parasite molecules transcribed and expressed by this parasite, including those acting as potential allergens, is necessary. In this study, we employ high-throughput (Illumina) sequencing and bioinformatics to characterise the transcriptomes of two Anisakis species, A. simplex and A. pegreffii, and utilize this resource to compile lists of potential allergens from these parasites. A total of ~65,000,000 reads were generated from cDNA libraries for each species, and assembled into ~34,000 transcripts (= Unigenes); ~18,000 peptides were predicted from each cDNA library and classified based on homology searches, protein motifs and gene ontology and biological pathway mapping. Using comparative analyses with sequence data available in public databases, 36 (A. simplex) and 29 (A. pegreffii) putative allergens were identified, including sequences encoding 'novel' Anisakis allergenic proteins (i.e. cyclophilins and ABA-1 domain containing proteins). This study represents a first step towards providing the research community with a curated dataset to use as a molecular resource for future investigations of the biology of Anisakis, including molecules putatively acting as allergens, using functional genomics, proteomics and immunological tools. Ultimately, an improved knowledge of the biological functions of these molecules in the parasite, as well as of their immunogenic properties, will assist the development of comprehensive, reliable and robust diagnostic tools.
Torto-Alalibo, Trudy; Tian, Miaoying; Gajendran, Kamal; Waugh, Mark E; van West, Pieter; Kamoun, Sophien
2005-01-01
Background The oomycete Saprolegnia parasitica is one of the most economically important fish pathogens. There has been a dramatic recrudescence of Saprolegnia infections in aquaculture since the use of the toxic organic dye malachite green was banned in 2002. Little is known about the molecular mechanisms underlying pathogenicity in S. parasitica and other animal pathogenic oomycetes. In this study we used a genomics approach to gain a first insight into the transcriptome of S. parasitica. Results We generated 1510 expressed sequence tags (ESTs) from a mycelial cDNA library of S. parasitica. A total of 1279 consensus sequences corresponding to 525944 base pairs were assembled. About half of the unigenes showed similarities to known protein sequences or motifs. The S. parasitica sequences tended to be relatively divergent from Phytophthora sequences. Based on the sequence alignments of 18 conserved proteins, the average amino acid identity between S. parasitica and three Phytophthora species was 77% compared to 93% within Phytophthora. Several S. parasitica cDNAs, such as those with similarity to fungal type I cellulose binding domain proteins, PAN/Apple module proteins, glycosyl hydrolases, proteases, as well as serine and cysteine protease inhibitors, were predicted to encode secreted proteins that could function in virulence. Some of these cDNAs were more similar to fungal proteins than to other eukaryotic proteins, confirming that oomycetes and fungi share some virulence components despite their evolutionary distance. Conclusion We provide a first glimpse into the gene content of S. parasitica, a reemerging oomycete fish pathogen. These resources will greatly accelerate research on this important pathogen. The data are available online through the Oomycete Genomics Database [1]. PMID:16076392
Takeshita, S; Kikuno, R; Tezuka, K; Amann, E
1993-01-01
A cDNA library prepared from the mouse osteoblastic cell line MC3T3-E1 was screened for the presence of specifically expressed genes by employing a combined subtraction hybridization/differential screening approach. A cDNA was identified and sequenced which encodes a protein designated osteoblast-specific factor 2 (OSF-2) comprising 811 amino acids. OSF-2 has a typical signal sequence, followed by a cysteine-rich domain, a fourfold repeated domain and a C-terminal domain. The protein lacks a typical transmembrane region. The fourfold repeated domain of OSF-2 shows homology with the insect protein fasciclin I. RNA analyses revealed that OSF-2 is expressed in bone and to a lesser extent in lung, but not in other tissues. Mouse OSF-2 cDNA was subsequently used as a probe to clone the human counterpart. Mouse and human OSF-2 show a high amino acid sequence conservation except for the signal sequence and two regions in the C-terminal domain in which 'in-frame' insertions or deletions are observed, implying alternative splicing events. On the basis of the amino acid sequence homology with fasciclin I, we suggest that OSF-2 functions as a homophilic adhesion molecule in bone formation. PMID:8363580
Pitre, S; North, C; Alamgir, M; Jessulat, M; Chan, A; Luo, X; Green, J R; Dumontier, M; Dehne, F; Golshani, A
2008-08-01
Protein-protein interaction (PPI) maps provide insight into cellular biology and have received considerable attention in the post-genomic era. While large-scale experimental approaches have generated large collections of experimentally determined PPIs, technical limitations preclude certain PPIs from detection. Recently, we demonstrated that yeast PPIs can be computationally predicted using re-occurring short polypeptide sequences between known interacting protein pairs. However, the computational requirements and low specificity made this method unsuitable for large-scale investigations. Here, we report an improved approach, which exhibits a specificity of approximately 99.95% and executes 16,000 times faster. Importantly, we report the first all-to-all sequence-based computational screen of PPIs in the yeast Saccharomyces cerevisiae, in which we identify 29,589 high confidence interactions among approximately 2 × 10^7 possible pairs. Of these, 14,438 PPIs have not been previously reported and may represent novel interactions. In particular, these results reveal a richer set of membrane protein interactions, not readily amenable to experimental investigations. From the novel PPIs, a novel putative protein complex composed largely of membrane proteins was revealed. In addition, two novel gene functions were predicted and experimentally confirmed to affect the efficiency of non-homologous end-joining, providing further support for the usefulness of the identified PPIs in biological investigations.
Molecular cloning and characterization of a gene encoding glutaminase from Aspergillus oryzae.
Koibuchi, K; Nagasaki, H; Yuasa, A; Kataoka, J; Kitamoto, K
2000-07-01
A glutaminase from Aspergillus oryzae was purified and its molecular weight was determined to be 82,091 by matrix-assisted laser desorption ionization time-of-flight mass spectrometry. Purified glutaminase catalysed the hydrolysis not only of L-glutamine but also of D-glutamine. Both the molecular weight and the substrate specificity of this glutaminase were different from those reported previously [Yano et al. (1998) J Ferment Technol 66: 137-143]. On the basis of its internal amino acid sequences, we have isolated and characterized the glutaminase gene (gtaA) from A. oryzae. The gtaA gene had an open reading frame coding for 690 amino acid residues, including a signal peptide of 20 amino acid residues and a mature protein of 670 amino acid residues. In the 5'-flanking region of the gene, there were three putative CreAp binding sequences and one putative AreAp binding sequence. The gtaA structural gene was introduced into A. oryzae NS4 and a marked increase in activity was detected in comparison with the control strain. The gtaA gene was also isolated from Aspergillus nidulans on the basis of the determined nucleotide sequence of the gtaA gene from A. oryzae.
Margam, Venu M.; Coates, Brad S.; Bayles, Darrell O.; Hellmich, Richard L.; Agunbiade, Tolulope; Seufferheld, Manfredo J.; Sun, Weilin; Kroemer, Jeremy A.; Ba, Malick N.; Binso-Dabire, Clementine L.; Baoua, Ibrahim; Ishiyaku, Mohammad F.; Covas, Fernando G.; Srinivasan, Ramasamy; Armstrong, Joel; Murdock, Larry L.; Pittendrigh, Barry R.
2011-01-01
The legume pod borer, Maruca vitrata (Lepidoptera: Crambidae), is an insect pest species of crops grown by subsistence farmers in tropical regions of Africa. We present the de novo assembly of 3729 contigs from 454- and Sanger-derived sequencing reads for midgut, salivary, and whole adult tissues of this non-model species. Functional annotation predicted that 1320 M. vitrata protein coding genes are present, of which 631 have orthologs within the Bombyx mori gene model. A homology-based analysis assigned M. vitrata genes into a group of paralogs, but these were subsequently partitioned into putative orthologs following phylogenetic analyses. Following sequence quality filtering, a total of 1542 putative single nucleotide polymorphisms (SNPs) were predicted within M. vitrata contig assemblies. Seventy one of 1078 designed molecular genetic markers were used to screen M. vitrata samples from five collection sites in West Africa. Population substructure may be present with significant implications in the insect resistance management recommendations pertaining to the release of biological control agents or transgenic cowpea that express Bacillus thuringiensis crystal toxins. Mutation data derived from transcriptome sequencing is an expeditious and economical source for genetic markers that allow evaluation of ecological differentiation. PMID:21754987
Nucleotide sequence of a resistance breaking mutant of southern bean mosaic virus.
Lee, L; Anderson, E J
1998-01-01
SBMV-S is a resistance-breaking mutant of an Arkansas isolate of the bean strain of southern bean mosaic virus (SBMV-BARK) that is able to move systemically in Phaseolus vulgaris cvs. Pinto and Great Northern, whereas the wild-type SBMV-BARK causes local necrotic lesions and is restricted to the inoculated leaves of these hosts. Sequence analysis of the 4136 nucleotide genomes of SBMV-BARK and SBMV-S revealed seven nucleotide differences, but only four deduced amino acid changes. A single amino acid change occurred in the C-terminal region of the putative RNA-dependent RNA polymerase and three differences were identified in the N-terminal portion of the virus coat protein. SBMV-BARK and SBMV-S were compared with other sobemoviruses and were found to contain a high level of nucleotide sequence identity (91.3%) to SBMV-B. Unlike SBMV-B however, SBMV-BARK and SBMV-S contained four putative overlapping open reading frames, making them more similar in genome organization to the cowpea strain, SBMV-C. The possibility exists that mutations or even errors, that resulted in mis-identification of open reading frames, occurred in previously published information on nucleotide sequence and genomic organization for SBMV-B.
Kathiravan, P; Goyal, S; Kataria, R S; Mishra, B P; Jayakumar, S; Joshi, B K
2011-01-01
The present study was undertaken to characterize the structure of S100A8 gene and its promoter in water buffalo and yak. Sequence data of 2.067 kb, 2.071 kb, and 2.052 kb with respect to complete S100A8 gene including 5' flanking region was generated in river buffalo, swamp buffalo, and yak, respectively. BLAST analysis of coding DNA sequences (CDS) of S100A8 gene revealed 95% homology of buffalo sequence with cattle, 85% with pig and horse, 83% with dog, 72-73% with murines, and around 79% with primates and humans. Phylogenetic analysis of predicted CDS revealed distinct clustering of murines, primates, and domestic animals with bovines and bubalines forming a subcluster among farm animals. In silico translation of predicted CDS revealed a sequence of 89 amino acids with 7 amino acid changes between cattle and buffalo and 2 changes between cattle and yak. The search for Pfam family revealed the N-terminal calcium binding domain and the noncanonical EF hand domain in the carboxy terminus, with more variations being observed in the N-terminal domain among different species. Two amino acid changes observed in carboxy terminal EF hand domain resulted in altered secondary structure of yak S100A8 protein. Analysis of S100A8 gene promoter revealed 14 putative motifs for transcriptional factor binding sites. Two putative motifs viz. C/EBP and v-Myb were found to be absent in swamp buffalo as compared to river buffalo and cattle. Differences in the structure of S100A8 protein and the transcriptional factor binding sites identified in the present study need to be analyzed further for their functional significance in yak and swamp buffalo respectively. Copyright © Taylor & Francis Group, LLC
Huang, Xiaoshuai; Ye, Haihui; Chung, J Sook
2017-08-01
Insulin-like androgenic gland factor (IAG), which is produced by the male androgenic gland (AG), plays a role in sexual differentiation and maintenance of male secondary sex characteristics in decapod crustaceans. Following an earlier finding of IAG expression in a female Callinectes sapidus ovary, we aimed to examine a putative role of IAG during the ovarian development of this species. To this end, the full-length cDNA sequence of the ovarian CasIAG (termed CasIAG-ova) has been isolated. The predicted mature peptide sequence of CasIAG-ova is identical to that of the IAG from the AG, except in their signal peptide regions. The CasIAG-ova contains an alternative initiation codon (UUG) as the start codon, which suggests that the translational regulation of CasIAG-ova may differ from that of the IAG from AG. To define the function of CasIAG-ova, the expression of CasIAG-ova as well as of its putative binding protein, insulin-like peptide binding protein (ILPBP), was measured in ovaries at various developmental stages obtained in different seasons. Season affects both CasIAG and ILPBP expression in the ovary. Overall, summer females at earlier ovarian stages contain higher levels of CasIAG and ILPBP than spring or fall females. These findings indicate that CasIAG-ova and CasILPBP may be involved in ovarian development. When comparing the levels of CasIAG and CasILPBP in the ovary, the latter are much higher (∼10-10000 fold) than the former. Expression patterns of CasILPBP differ from those of CasIAG-ova during ovarian development and by season, suggesting that ILPBP may have an additional role in ovarian development beyond that of a putative binding protein of IAG. Copyright © 2017 Elsevier Inc. All rights reserved.
Ghosh, Pritha; Mathew, Oommen K; Sowdhamini, Ramanathan
2016-10-07
RNA-binding proteins (RBPs) interact with their cognate RNA(s) to form large biomolecular assemblies. They are versatile in their functionality and are involved in a myriad of processes inside the cell. RBPs with similar structural features and common biological functions are grouped together into families and superfamilies. It will be useful to obtain an early understanding and association of the RNA-binding property of sequences of gene products. Here, we report a web server, RStrucFam, to predict the structure, type of cognate RNA(s) and function(s) of proteins, where possible, from mere sequence information. The web server employs Hidden Markov Model scan (hmmscan) to enable association to a back-end database of structural and sequence families. The database (HMMRBP) comprises 437 HMMs of RBP families of known structure that have been generated using structure-based sequence alignments and 746 sequence-centric RBP family HMMs. The input protein sequence is associated with structural or sequence domain families, if structure or sequence signatures exist. In case of association of the protein with a family of known structures, output features such as a multiple structure-based sequence alignment (MSSA) of the query with all other members of that family are provided. Further, cognate RNA partner(s) for that protein, Gene Ontology (GO) annotations, if any, and a homology model of the protein can be obtained. The users can also browse through the database for details pertaining to each family, protein or RNA and their related information based on keyword search or RNA motif search. RStrucFam is a web server that exploits structurally conserved features of RBPs, derived from known family members and imprinted in mathematical profiles, to predict putative RBPs from sequence information. Proteins that fail to associate with such structure-centric families are further queried against the sequence-centric RBP family HMMs in the HMMRBP database. Further, all other essential information pertaining to an RBP, such as overall function annotations, is provided. The web server can be accessed at the following link: http://caps.ncbs.res.in/rstrucfam .
The arbuscular mycorrhizal fungal protein glomalin is a putative homolog of heat shock protein 60.
Gadkar, Vijay; Rillig, Matthias C
2006-10-01
Work on glomalin-related soil protein produced by arbuscular mycorrhizal (AM) fungi (AMF) has been limited because of the unknown identity of the protein. A protein band cross-reactive with the glomalin-specific antibody MAb32B11 from the AM fungus Glomus intraradices was partially sequenced using tandem liquid chromatography-mass spectrometry. A 17 amino acid sequence showing similarity to heat shock protein 60 (hsp 60) was obtained. Based on degenerate PCR, a full-length cDNA of 1773 bp length encoding the hsp 60 gene was isolated from a G. intraradices cDNA library. The ORF was predicted to encode a protein of 590 amino acids. The protein sequence had three N-terminal glycosylation sites and a string of GGM motifs at the C-terminal end. The GiHsp 60 ORF had three introns of 67, 76 and 131 bp length. The GiHsp 60 was expressed using an in vitro translation system, and the protein was purified using the 6xHis-tag system. A dot-blot assay on the purified protein showed that it was highly cross-reactive with the glomalin-specific antibody MAb32B11. The present work provides the first evidence for the identity of the glomalin protein in the model AMF G. intraradices, thus facilitating further characterization of this protein, which is of great interest in soil ecology.
Brzuszkiewicz, Elzbieta; Thürmer, Andrea; Schuldes, Jörg; Leimbach, Andreas; Liesegang, Heiko; Meyer, Frauke-Dorothee; Boelter, Jürgen; Petersen, Heiko; Gottschalk, Gerhard; Daniel, Rolf
2011-12-01
The genome sequences of two Escherichia coli O104:H4 strains derived from two different patients of the 2011 German E. coli outbreak were determined. The two analyzed strains were designated E. coli GOS1 and GOS2 (German outbreak strain). Both isolates comprise one chromosome of approximately 5.31 Mbp and two putative plasmids. Comparisons of the 5,217 (GOS1) and 5,224 (GOS2) predicted protein-encoding genes with various E. coli strains, and a multilocus sequence typing analysis revealed that the isolates were most similar to the entero-aggregative E. coli (EAEC) strain 55989. In addition, one of the putative plasmids of the outbreak strain is similar to pAA-type plasmids of EAEC strains, which contain aggregative adhesion fimbrial operons. The second putative plasmid harbors genes for extended-spectrum β-lactamases. This type of plasmid is widely distributed in pathogenic E. coli strains. A significant difference of the E. coli GOS1 and GOS2 genomes to those of EAEC strains is the presence of a prophage encoding the Shiga toxin, which is characteristic for enterohemorrhagic E. coli (EHEC) strains. The unique combination of genomic features of the German outbreak strain, containing characteristics from pathotypes EAEC and EHEC, suggested that it represents a new pathotype, Entero-Aggregative-Haemorrhagic Escherichia coli (EAHEC).
Evolutionary conservation of regulatory elements in vertebrate HOX gene clusters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Santini, Simona; Boore, Jeffrey L.; Meyer, Axel
2003-12-31
Due to their high degree of conservation, comparisons of DNA sequences among evolutionarily distantly-related genomes permit the identification of functional regions in noncoding DNA. Hox genes are optimal candidate sequences for comparative genome analyses, because they are extremely conserved in vertebrates and occur in clusters. We aligned (Pipmaker) the nucleotide sequences of HoxA clusters of tilapia, pufferfish, striped bass, zebrafish, horn shark, human and mouse (over 500 million years of evolutionary distance). We identified several highly conserved intergenic sequences, likely to be important in gene regulation. Only a few of these putative regulatory elements have been previously described as being involved in the regulation of Hox genes, while several others are new elements that might have regulatory functions. The majority of these newly identified putative regulatory elements contain short fragments that are almost completely conserved and are identical to known binding sites for regulatory proteins (Transfac). The conserved intergenic regions located between the most rostrally expressed genes in the developing embryo are longer and better retained through evolution. We document that presumed regulatory sequences are retained differentially in either A or A clusters resulting from a genome duplication in the fish lineage. This observation supports both the hypothesis that the conserved elements are involved in gene regulation and the Duplication-Deletion-Complementation model.
Mathupala, S P; Lowe, S E; Podkovyrov, S M; Zeikus, J G
1993-08-05
The complete nucleotide sequence of the gene encoding the dual active amylopullulanase of Thermoanaerobacter ethanolicus 39E (formerly Clostridium thermohydrosulfuricum) was determined. The structural gene (apu) contained a single open reading frame 4443 base pairs in length, corresponding to 1481 amino acids, with an estimated molecular weight of 162,780. Analysis of the deduced sequence of apu with sequences of alpha-amylases and alpha-1,6 debranching enzymes enabled the identification of four conserved regions putatively involved in substrate binding and in catalysis. The conserved regions were localized within a 2.9-kilobase pair gene fragment, which encoded a M(r) 100,000 protein that maintained the dual activities and thermostability of the native enzyme. The catalytic residues of amylopullulanase were tentatively identified by using hydrophobic cluster analysis for comparison of amino acid sequences of amylopullulanase and other amylolytic enzymes. Asp597, Glu626, and Asp703 were individually modified to their respective amide form, or the alternate acid form, and in all cases both alpha-amylase and pullulanase activities were lost, suggesting the possible involvement of 3 residues in a catalytic triad, and the presence of a putative single catalytic site within the enzyme. These findings substantiate amylopullulanase as a new type of amylosaccharidase.
Maldonado-Borges, Josefina Ines; Ku-Cauich, José Roberto; Escobedo-GraciaMedrano, Rosa Maria
2013-01-01
cDNA-AFLP analysis was used to study the genes expressed in zygotic and somatic embryogenesis of Musa acuminata Colla ssp. malaccensis, and a comparison was made between the differentially expressed transcript-derived fragments (TDFs) and the sequenced genome of the double haploid- (DH-) Pahang of the malaccensis subspecies that is available online. A total of 253 TDFs were detected with apparent size of 100–4000 bp using 5 pairs of AFLP primers, of which 21 were differentially expressed during the different stages of banana embryogenesis; 15 of the sequences matched DH-Pahang chromosomes, with 7 of them being homologous to gene sequences encoding either known or putative protein domains of higher plants. Four TDF sequences were located in all Musa chromosomes, while the rest were located in one or two chromosomes. Their putative individual function is briefly reviewed based on published information, and the potential roles of these genes in embryo development are discussed. Thus the availability of the genome of Musa and the information on TDF sequences presented here open new possibilities for in-depth molecular and biochemical studies of zygotic and somatic embryogenesis of Musa. PMID:24027442
Musumeci, Matías A; Lozada, Mariana; Rial, Daniela V; Mac Cormack, Walter P; Jansson, Janet K; Sjöling, Sara; Carroll, JoLynn; Dionisi, Hebe M
2017-04-09
The goal of this work was to identify sequences encoding monooxygenase biocatalysts with novel features by in silico mining an assembled metagenomic dataset of polar and subpolar marine sediments. The targeted enzyme sequences were Baeyer-Villiger and bacterial cytochrome P450 monooxygenases (CYP153). These enzymes have wide-ranging applications, from the synthesis of steroids, antibiotics, mycotoxins and pheromones to the synthesis of monomers for polymerization and anticancer precursors, due to their extraordinary enantio-, regio-, and chemo- selectivity that are valuable features for organic synthesis. Phylogenetic analyses were used to select the most divergent sequences affiliated to these enzyme families among the 264 putative monooxygenases recovered from the ~14 million protein-coding sequences in the assembled metagenome dataset. Three-dimensional structure modeling and docking analysis suggested features useful in biotechnological applications in five metagenomic sequences, such as wide substrate range, novel substrate specificity or regioselectivity. Further analysis revealed structural features associated with psychrophilic enzymes, such as broader substrate accessibility, larger catalytic pockets or low domain interactions, suggesting that they could be applied in biooxidations at room or low temperatures, saving costs inherent to energy consumption. This work allowed the identification of putative enzyme candidates with promising features from metagenomes, providing a suitable starting point for further developments.
Musumeci, Matías A.; Lozada, Mariana; Rial, Daniela V.; Mac Cormack, Walter P.; Jansson, Janet K.; Sjöling, Sara; Carroll, JoLynn; Dionisi, Hebe M.
2017-01-01
The goal of this work was to identify sequences encoding monooxygenase biocatalysts with novel features by in silico mining an assembled metagenomic dataset of polar and subpolar marine sediments. The targeted enzyme sequences were Baeyer–Villiger and bacterial cytochrome P450 monooxygenases (CYP153). These enzymes have wide-ranging applications, from the synthesis of steroids, antibiotics, mycotoxins and pheromones to the synthesis of monomers for polymerization and anticancer precursors, due to their extraordinary enantio-, regio-, and chemo- selectivity that are valuable features for organic synthesis. Phylogenetic analyses were used to select the most divergent sequences affiliated to these enzyme families among the 264 putative monooxygenases recovered from the ~14 million protein-coding sequences in the assembled metagenome dataset. Three-dimensional structure modeling and docking analysis suggested features useful in biotechnological applications in five metagenomic sequences, such as wide substrate range, novel substrate specificity or regioselectivity. Further analysis revealed structural features associated with psychrophilic enzymes, such as broader substrate accessibility, larger catalytic pockets or low domain interactions, suggesting that they could be applied in biooxidations at room or low temperatures, saving costs inherent to energy consumption. This work allowed the identification of putative enzyme candidates with promising features from metagenomes, providing a suitable starting point for further developments. PMID:28397770
Dzurová, Lenka; Forneris, Federico; Savino, Simone; Galuszka, Petr; Vrabka, Josef; Frébort, Ivo
2015-08-01
The recently discovered cytokinin (CK)-specific phosphoribohydrolase "Lonely Guy" (LOG) is a key enzyme of CK biosynthesis, converting inactive CK nucleotides into biologically active free bases. We have determined the crystal structures of LOG from Claviceps purpurea (cpLOG) and its complex with the enzymatic product phosphoribose. The structures reveal a dimeric arrangement of Rossmann folds, with the ligands bound to large pockets at the interface between cpLOG monomers. Structural comparisons highlight the homology of cpLOG to putative lysine decarboxylases. Extended sequence analysis enabled identification of a distinguishing LOG sequence signature. Taken together, our data suggest phosphoribohydrolase activity for several proteins of unknown function. © 2015 Wiley Periodicals, Inc.
Ferro, Myriam; Tardif, Marianne; Reguer, Erwan; Cahuzac, Romain; Bruley, Christophe; Vermat, Thierry; Nugues, Estelle; Vigouroux, Marielle; Vandenbrouck, Yves; Garin, Jérôme; Viari, Alain
2008-05-01
PepLine is a fully automated software tool that maps MS/MS fragmentation spectra of tryptic peptides to genomic DNA sequences. The approach is based on Peptide Sequence Tags (PSTs) obtained from partial interpretation of QTOF MS/MS spectra (first module). PSTs are then mapped onto the six-frame translations of genomic sequences (second module), giving hits. Hits are then clustered to detect potential coding regions (third module). Our work aimed at optimizing the algorithms of each component to allow the whole pipeline to proceed in a fully automated manner using raw nucleic acid sequences (i.e., genomes that have not been "reduced" to a database of ORFs or putative exon sequences). The whole pipeline was tested on controlled MS/MS spectra sets from standard proteins and from Arabidopsis thaliana chloroplast envelope samples. Our results demonstrate that PepLine was competitive with protein database search software and was fast enough to potentially tackle large data sets and/or large genomes. We also illustrate the potential of this approach for the detection of the intron/exon structure of genes.
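The second and third modules described above (mapping sequence tags onto six-frame translations, then clustering the hits) can be illustrated with a minimal sketch. This is not PepLine's implementation: it assumes exact string matching of a tag against the standard genetic code, whereas a real PST also carries flanking mass constraints, and it clusters hits by a simple maximum-gap rule. All function names and the `max_gap` parameter are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Reverse complement of an upper-case A/C/G/T string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def six_frame_hits(genome, tag):
    """Return (strand, frame, start) for every exact occurrence of tag.

    start is the 0-based forward-strand coordinate of the leftmost base covered.
    """
    hits = []
    span = 3 * len(tag)
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            protein = translate(seq[frame:])
            pos = protein.find(tag)
            while pos != -1:
                offset = frame + 3 * pos  # offset on the strand being read
                start = offset if strand == "+" else len(genome) - offset - span
                hits.append((strand, frame, start))
                pos = protein.find(tag, pos + 1)
    return hits


def cluster_starts(starts, max_gap):
    """Group sorted hit coordinates; a gap larger than max_gap opens a new cluster."""
    clusters = []
    for s in sorted(starts):
        if clusters and s - clusters[-1][-1] <= max_gap:
            clusters[-1].append(s)
        else:
            clusters.append([s])
    return clusters
```

Searching the raw genome in all six frames, rather than a precomputed ORF database, is what lets hits fall on exons that no gene model yet covers; clusters of nearby hits then point to candidate coding regions.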
Yin, Min; Li, Guiding; Jiang, Yi; Han, Li; Huang, Xueshi; Lu, Tao; Jiang, Chenglin
2017-11-20
Streptomyces albolongus YIM 101047 produces novel bafilomycins and odoriferous sesquiterpenoids with cytotoxic and antimicrobial activities. Here, we report the complete genome sequence of S. albolongus YIM 101047, which consists of an 8,027,788 bp linear chromosome. Forty-six putative biosynthetic gene clusters of secondary metabolites were found. The sesquiterpenoid gene cluster was on the left arm (0.09-0.10 Mb), and the bafilomycin biosynthetic gene cluster was on the right arm (7.46-7.64 Mb) of the chromosome. Twenty-two putative gene clusters with high or moderate similarity to important antibiotic biosynthetic gene clusters were found, including the antitumor agents bafilomycin, epothilone and hedamycin; the antibacterial/antifungal agents clavulanic acid, collismycin A, frontalamides, kanamycin, streptomycin and streptothricin; the protein phosphatase inhibitor RK-682; and the acute iron poisoning medication desferrioxamine B. The genome sequence reported here will enable us to study the biosynthetic mechanism of these important antibiotics and will facilitate the discovery of novel secondary metabolites with potential applications to human health. Copyright © 2017 Elsevier B.V. All rights reserved.
Landès-Devauchelle, C; Bras, F; Dezélée, S; Teninges, D
1995-11-10
The nucleotide sequence of the genes 2 and 3 of the Drosophila rhabdovirus sigma was determined from cDNAs to viral genome and poly(A)+ mRNAs. Gene 2 comprises 1032 nucleotides and contains a long ORF encoding a molecular weight 35,208 polypeptide present in infected cells and in virions which migrates in SDS-PAGE as a doublet of M(r) about 60 kDa. The distribution of acidic charges as well as the electrophoretic properties of the protein are characteristic of the rhabdovirus P proteins. Gene 3 comprises 923 nucleotides and contains a long ORF capable of coding a polypeptide of 298 amino acids of MW 33,790. The putative protein (PP3) is similar in size to a minor component of the virions. Computer analysis shows that the sequence of PP3 contains three motifs related to the conserved motifs of reverse transcriptases.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sakuma, Hitoshi; Inana, G.; Murakami, Akira
1995-05-20
ROM1 is a 351-amino-acid, 37-kDa outer segment membrane protein of rod photoreceptors. ROM1 is related to peripherin/RDS, another outer segment membrane protein found in both rods and cones. The precise function of ROM1 or peripherin/RDS is not known, but they have been suggested to play important roles in the function and/or structure of the rod photoreceptor outer segment disks. A recent report implicated ROM1 in disease by suggesting that RP can be caused by a heterozygous null mutation in ROM1 but only in combination with another heterozygous mutation in peripherin/RDS. Screening of the ROM1 gene using polymerase chain reaction amplification, denaturing gradient gel electrophoresis, and direct DNA sequencing identified the same heterozygous putative null mutation in a family with RP.
Satheesh, Viswanathan; Jagannadham, P Tej Kumar; Chidambaranathan, Parameswaran; Jain, P K; Srinivasan, R
2014-12-01
The NAC (NAM, ATAF and CUC) proteins are plant-specific transcription factors implicated in development and stress responses. In the present study 88 pigeonpea NAC genes were identified from the recently published draft genome of pigeonpea by using homology based and de novo prediction programmes. These sequences were further subjected to phylogenetic, motif and promoter analyses. In motif analysis, highly conserved motifs were identified in the NAC domain and also in the C-terminal region of the NAC proteins. A phylogenetic reconstruction using pigeonpea, Arabidopsis and soybean NAC genes revealed 33 putative stress-responsive pigeonpea NAC genes. Several stress-responsive cis-elements were identified through in silico analysis of the promoters of these putative stress-responsive genes. This analysis is the first report of NAC gene family in pigeonpea and will be useful for the identification and selection of candidate genes associated with stress tolerance.
[Expression, purification and antibody preparation of recombinant SARS-CoV X5 protein].
Wang, Li-Na; Kong, Jian-Qiang; Zhu, Ping; Du, Guan-Hua; Wang, Wei; Cheng, Ke-Di
2008-11-01
X5 protein is one of the putative unknown proteins of SARS-CoV. The recombinant protein was successfully expressed in E. coli as insoluble inclusion bodies. The inclusion bodies were dissolved in a high concentration of urea. Affinity chromatography was performed to purify the denatured protein, and the product was then refolded in a series of urea gradient solutions. The purified protein was obtained with a purity of > 95% and a yield of 93.3 mg/L. A polyclonal antibody against this protein was obtained, and Western blotting indicated that the X5 protein is strongly antigenic. Sixty-eight percent of the recombinant protein sequence was confirmed by LC-ESI-MS/MS analysis.
Shorrosh, B S; Roesler, K R; Shintani, D; van de Loo, F J; Ohlrogge, J B
1995-06-01
Acetyl-coenzyme A carboxylase (ACCase, EC 6.4.1.2) catalyzes the synthesis of malonyl-coenzyme A, which is utilized in the plastid for de novo fatty acid synthesis and outside the plastid for a variety of reactions, including the synthesis of very long chain fatty acids and flavonoids. Recent evidence for both multifunctional and multisubunit ACCase isozymes in dicot plants has been obtained. We describe here the isolation of a tobacco (Nicotiana tabacum L. cv bright yellow 2 [NT1]) cDNA clone (E3) that encodes a 58.4-kD protein that shares 80% sequence similarity and 65% identity with the Anabaena biotin carboxylase subunit of ACCase. Similar to other biotin carboxylase subunits of acetyl-CoA carboxylase, the E3-encoded protein contains a putative ATP-binding motif but lacks a biotin-binding site (methionine-lysine-methionine or methionine-lysine-leucine). The deduced protein sequence contains a putative transit peptide whose function was confirmed by its ability to direct in vitro chloroplast uptake. The subcellular localization of this biotin carboxylase has also been confirmed to be plastidial by western blot analysis of pea (Pisum sativum), alfalfa (Medicago sativa L.), and castor (Ricinus communis L.) plastid preparations. Northern blot analysis indicates that the plastid biotin carboxylase transcripts are expressed at severalfold higher levels in castor seeds than in leaves.
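The biotin-binding site named above (methionine-lysine-methionine or methionine-lysine-leucine, i.e. MKM or MKL in one-letter code) is simple enough to screen for directly in a deduced protein sequence. The following is a minimal sketch of such a screen, assuming a plain tripeptide match with no structural context; the function name is hypothetical and the sequences in the test are toy strings, not the E3 protein.

```python
import re

# Lookahead so that overlapping occurrences (e.g. MKMKL) are all reported.
BIOTIN_MOTIF = re.compile(r"(?=(MK[ML]))")


def find_biotin_motifs(protein):
    """Return (0-based position, tripeptide) for every MKM or MKL in the sequence."""
    return [(m.start(), m.group(1)) for m in BIOTIN_MOTIF.finditer(protein.upper())]
```

An empty result is only weak evidence that a biotin attachment site is absent, since a three-residue pattern can also occur by chance or be missed through sequencing errors; the abstract's conclusion rests on comparison with other biotin carboxylase subunits as well.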
Shorrosh, B S; Roesler, K R; Shintani, D; van de Loo, F J; Ohlrogge, J B
1995-01-01
Acetyl-coenzyme A carboxylase (ACCase, EC 6.4.1.2) catalyzes the synthesis of malonyl-coenzyme A, which is utilized in the plastid for de novo fatty acid synthesis and outside the plastid for a variety of reactions, including the synthesis of very long chain fatty acids and flavonoids. Recent evidence for both multifunctional and multisubunit ACCase isozymes in dicot plants has been obtained. We describe here the isolation of a tobacco (Nicotiana tabacum L. cv bright yellow 2 [NT1]) cDNA clone (E3) that encodes a 58.4-kD protein that shares 80% sequence similarity and 65% identity with the Anabaena biotin carboxylase subunit of ACCase. Similar to other biotin carboxylase subunits of acetyl-CoA carboxylase, the E3-encoded protein contains a putative ATP-binding motif but lacks a biotin-binding site (methionine-lysine-methionine or methionine-lysine-leucine). The deduced protein sequence contains a putative transit peptide whose function was confirmed by its ability to direct in vitro chloroplast uptake. The subcellular localization of this biotin carboxylase has also been confirmed to be plastidial by western blot analysis of pea (Pisum sativum), alfalfa (Medicago sativa L.), and castor (Ricinus communis L.) plastid preparations. Northern blot analysis indicates that the plastid biotin carboxylase transcripts are expressed at severalfold higher levels in castor seeds than in leaves. PMID:7610168
Bernkopf, Marie; Webersinke, Gerald; Tongsook, Chanakan; Koyani, Chintan N; Rafiq, Muhammad A; Ayaz, Muhammad; Müller, Doris; Enzinger, Christian; Aslam, Muhammad; Naeem, Farooq; Schmidt, Kurt; Gruber, Karl; Speicher, Michael R; Malle, Ernst; Macheroux, Peter; Ayub, Muhammad; Vincent, John B; Windpassinger, Christian; Duba, Hans-Christoph
2014-08-01
We describe the characterization of a gene for mild nonsyndromic autosomal recessive intellectual disability (ID) in two unrelated families, one from Austria, the other from Pakistan. Genome-wide single nucleotide polymorphism microarray analysis enabled us to define a region of homozygosity by descent on chromosome 17q25. Whole-exome sequencing and analysis of this region in an affected individual from the Austrian family identified a 5 bp frameshifting deletion in the METTL23 gene. By means of Sanger sequencing of METTL23, a nonsense mutation was detected in a consanguineous ID family from Pakistan for which homozygosity-by-descent mapping had identified a region on 17q25. Both changes lead to truncation of the putative METTL23 protein, which disrupts the predicted catalytic domain and alters the cellular localization. 3D-modelling of the protein indicates that METTL23 is strongly predicted to function as an S-adenosyl-methionine (SAM)-dependent methyltransferase. Expression analysis of METTL23 indicated a strong association with heat shock proteins, which suggests that these may act as a putative substrate for methylation by METTL23. A number of methyltransferases have been described recently in association with ID. Disruption of METTL23 presented here supports the importance of methylation processes for intact neuronal function and brain development. © The Author 2014. Published by Oxford University Press.
Zhang, Jin; Wang, Bing; Dong, Shuanglin; Cao, Depan; Dong, Junfeng; Walker, William B.; Liu, Yang; Wang, Guirong
2015-01-01
To better understand the olfactory mechanisms in two lepidopteran pest model species, Helicoverpa armigera and H. assulta, we conducted transcriptome analysis of the adult antennae using Illumina sequencing technology and compared the chemosensory genes between these two related species. Combined with the chemosensory genes we had identified previously in H. armigera by 454 sequencing, we identified 133 putative chemosensory unigenes in H. armigera including 60 odorant receptors (ORs), 19 ionotropic receptors (IRs), 34 odorant binding proteins (OBPs), 18 chemosensory proteins (CSPs), and 2 sensory neuron membrane proteins (SNMPs). Consistent with these results, 131 putative chemosensory genes including 64 ORs, 19 IRs, 29 OBPs, 17 CSPs, and 2 SNMPs were identified through male and female antennal transcriptome analysis in H. assulta. Reverse Transcription-PCR (RT-PCR) was conducted in H. assulta to examine the accuracy of the assembly and annotation of the transcriptome and the expression profile of these unigenes in different tissues. Most of the ORs, IRs and OBPs were enriched in adult antennae, while almost all the CSPs were expressed in antennae as well as legs. We compared the chemosensory genes of these two species in detail. Our work should provide valuable information for further functional studies of pheromone and host volatile recognition genes in these two related species. PMID:25659090
Lima Leite, Aline; Silva Fernandes, Mileni; Charone, Senda; Whitford, Gary Milton; Everett, Eric T; Buzalaf, Marília Afonso Rabelo
2018-01-01
Enamel formation is a complex 2-step process by which proteins are secreted to form an extracellular matrix, followed by massive protein degradation and subsequent mineralization. Excessive systemic exposure to fluoride can disrupt this process and lead to a condition known as dental fluorosis. The genetic background influences the responses of mineralized tissues to fluoride, such as dental fluorosis, observed in A/J and 129P3/J mice. The aim of the present study was to map the protein profile of enamel matrix from A/J and 129P3/J strains. Enamel matrix samples were obtained from A/J and 129P3/J mice and analyzed by 2-dimensional electrophoresis and liquid chromatography coupled with mass spectrometry. A total of 120 proteins were identified, and 7 of them were classified as putative uncharacterized proteins and analyzed in silico for structural and functional characterization. An interesting finding was the possibility of the uncharacterized sequence Q8BIS2 being an enzyme involved in the degradation of matrix proteins. Thus, the results provide a comprehensive view of the structure and function for putative uncharacterized proteins found in the enamel matrix that could help to elucidate the mechanisms involved in enamel biomineralization and genetic susceptibility to dental fluorosis. © 2018 S. Karger AG, Basel.
Sun, Haiyue; Liu, Yushan; Gai, Yuzhuo; Geng, Jinman; Chen, Li; Liu, Hongdi; Kang, Limin; Tian, Youwen; Li, Yadong
2015-09-02
Cranberries (Vaccinium macrocarpon Ait.), renowned for their excellent health benefits, are an important berry crop. Here, we performed transcriptome sequencing of one cranberry cultivar, from fruits at two different developmental stages, on the Illumina HiSeq 2000 platform. Our main goals were to identify putative genes for major metabolic pathways of bioactive compounds and compare the expression patterns between white fruit (W) and red fruit (R) in cranberry. In this study, two cDNA libraries of W and R were constructed. Approximately 119 million raw sequencing reads were generated and assembled de novo, yielding 57,331 high quality unigenes with an average length of 739 bp. Using BLASTx, 38,460 unigenes were identified as putative homologs of annotated sequences in public protein databases, including NCBI NR, NT, Swiss-Prot, KEGG, COG and GO. Of these, 21,898 unigenes mapped to 128 KEGG pathways, with the metabolic pathways, secondary metabolites, glycerophospholipid metabolism, ether lipid metabolism, starch and sucrose metabolism, purine metabolism, and pyrimidine metabolism being well represented. Among them, many candidate genes were involved in flavonoid biosynthesis, transport and regulation. Furthermore, digital gene expression (DEG) analysis identified 3,257 unigenes that were differentially expressed between the two fruit developmental stages. In addition, 14,473 simple sequence repeats (SSRs) were detected. Our results present comprehensive gene expression information about the cranberry fruit transcriptome that could facilitate our understanding of the molecular mechanisms of fruit development in cranberries. Although it will be necessary to validate the functions carried out by these genes, these results could be used to improve the quality of breeding programs for the cranberry and related species.
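The abstract reports 14,473 SSRs but does not state the detection criteria. As a rough illustration of what SSR detection involves, the Python sketch below finds perfect tandem repeats of 2 to 6 nt motifs with a regular expression. The function name `find_ssrs`, the minimum repeat count, and the example sequence are all assumptions made for illustration; this is not the authors' pipeline.

```python
import re

def find_ssrs(seq, min_repeats=5):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Assumed criteria (not from the abstract): motif length 2-6 nt,
    repeated at least `min_repeats` times in a row. Homopolymer runs
    are reported as repeats of a 2-nt motif such as "AA".
    """
    # Lazy {2,6}? tries the shortest motif first, so "AGAGAG..." is
    # reported with motif "AG" rather than "AGAG".
    pattern = re.compile(r"([ACGT]{2,6}?)\1{" + str(min_repeats - 1) + r",}")
    hits = []
    for m in pattern.finditer(seq):
        motif = m.group(1)
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits

# Hypothetical sequence containing six copies of "AG".
example = "GGC" + "AG" * 6 + "TTCA"
print(find_ssrs(example))
```

Real SSR tools typically use different minimum repeat counts for each motif length and also handle compound repeats, which this sketch does not.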
Halmillawewa, Anupama P; Restrepo-Córdoba, Marcela; Perry, Benjamin J; Yost, Christopher K; Hynes, Michael F
2016-02-01
Bacteriophages may play an important role in regulating population size and diversity of the root nodule symbiont Rhizobium leguminosarum, as well as participating in horizontal gene transfer. Although phages that infect this species have been isolated in the past, our knowledge of their molecular biology, and especially of genome composition, is extremely limited, and this lack of information impacts on the ability to assess phage population dynamics and limits potential agricultural applications of rhizobiophages. To help address this deficit in available sequence and biological information, the complete genome sequence of the Myoviridae temperate phage PPF1 that infects R. leguminosarum biovar viciae strain F1 was determined. The genome is 54,506 bp in length with an average G+C content of 61.9 %. The genome contains 94 putative open reading frames (ORFs) and 74.5 % of these predicted ORFs share homology at the protein level with previously reported sequences in the database. However, putative functions could only be assigned to 25.5 % (24 ORFs) of the predicted genes. PPF1 was capable of efficiently lysogenizing its rhizobial host R. leguminosarum F1. The site-specific recombination system of the phage targets an integration site that lies within a putative tRNA-Pro (CGG) gene in R. leguminosarum F1. Upon integration, the phage is capable of restoring the disrupted tRNA gene, owing to the 50 bp homologous sequence (att core region) it shares with its rhizobial host genome. Phage PPF1 is the first temperate phage infecting members of the genus Rhizobium for which a complete genome sequence, as well as other biological data such as the integration site, is available.
HomPPI: a class of sequence homology based protein-protein interface prediction methods
2011-01-01
Background Although homology-based methods are among the most widely used methods for predicting the structure and function of proteins, the question as to whether interface sequence conservation can be effectively exploited in predicting protein-protein interfaces has been a subject of debate. Results We studied more than 300,000 pair-wise alignments of protein sequences from structurally characterized protein complexes, including both obligate and transient complexes. We identified sequence similarity criteria required for accurate homology-based inference of interface residues in a query protein sequence. Based on these analyses, we developed HomPPI, a class of sequence homology-based methods for predicting protein-protein interface residues. We present two variants of HomPPI: (i) NPS-HomPPI (Non partner-specific HomPPI), which can be used to predict interface residues of a query protein in the absence of knowledge of the interaction partner; and (ii) PS-HomPPI (Partner-specific HomPPI), which can be used to predict the interface residues of a query protein with a specific target protein. Our experiments on a benchmark dataset of obligate homodimeric complexes show that NPS-HomPPI can reliably predict protein-protein interface residues in a given protein, with an average correlation coefficient (CC) of 0.76, sensitivity of 0.83, and specificity of 0.78, when sequence homologs of the query protein can be reliably identified. NPS-HomPPI also reliably predicts the interface residues of intrinsically disordered proteins. Our experiments suggest that NPS-HomPPI is competitive with several state-of-the-art interface prediction servers including those that exploit the structure of the query proteins. 
The partner-specific classifier, PS-HomPPI can, on a large dataset of transient complexes, predict the interface residues of a query protein with a specific target, with a CC of 0.65, sensitivity of 0.69, and specificity of 0.70, when homologs of both the query and the target can be reliably identified. The HomPPI web server is available at http://homppi.cs.iastate.edu/. Conclusions Sequence homology-based methods offer a class of computationally efficient and reliable approaches for predicting the protein-protein interface residues that participate in either obligate or transient interactions. For query proteins involved in transient interactions, the reliability of interface residue prediction can be improved by exploiting knowledge of putative interaction partners. PMID:21682895
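The reported figures (CC, sensitivity, specificity) are per-residue classification metrics. The Python sketch below shows how such values can be computed from true and predicted interface labels using the standard textbook definitions, with CC taken as the Matthews correlation coefficient. The abstract does not give the formulas, so the exact definitions used by HomPPI (in particular for specificity) may differ; the function name and the toy labels are assumptions.

```python
from math import sqrt

def interface_metrics(actual, predicted):
    """Compare per-residue labels (1 = interface, 0 = non-interface)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)

    # Standard definitions; assumed here, not taken from the paper.
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    cc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"cc": cc, "sensitivity": sensitivity, "specificity": specificity}

# Toy example: 8 residues, 3 of them true interface residues.
actual    = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]
print(interface_metrics(actual, predicted))
```

In the toy example two of three interface residues are recovered and one non-interface residue is mislabeled, so the three metrics are 7/15, 2/3 and 4/5 respectively.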
Heinz, Eva; Lithgow, Trevor
2014-01-01
Members of the Omp85/TpsB protein superfamily are ubiquitously distributed in Gram-negative bacteria, and function in protein translocation (e.g., FhaC) or the assembly of outer membrane proteins (e.g., BamA). Several recent findings are suggestive of a further level of variation in the superfamily, including the identification of the novel membrane protein assembly factor TamA and protein translocase PlpD. To investigate the diversity and the causal evolutionary events, we undertook a comprehensive comparative sequence analysis of the Omp85/TpsB proteins. A total of 10 protein subfamilies were apparent, distinguished in their domain structure and sequence signatures. In addition to the proteins FhaC, BamA, and TamA, for which structural and functional information is available, are families of proteins with so far undescribed domain architectures linked to the Omp85 β-barrel domain. This study brings a classification structure to a dynamic protein superfamily of high interest given its essential function for Gram-negative bacteria as well as its diverse domain architecture, and we discuss several scenarios of putative functions of these so far undescribed proteins. PMID:25101071
Pru du 2S albumin or Pru du vicilin?
Garino, Cristiano; De Paolis, Angelo; Coïsson, Jean Daniel; Arlorio, Marco
2015-06-01
A short partial sequence of 28 amino acids is all the information we have so far about the putative allergen 2S albumin from almond. The aim of this work was to analyze this information, mainly using bioinformatics tools, in order to verify its correctness. Based on the results reported in the paper describing this allergen from almond, we analyzed the original amino acid sequencing data with available software. The degree of homology of the almond 12 kDa protein with any other known 2S albumin appears to be much lower than the one reported in the paper that first described it. In a publicly available cDNA library we discovered an expressed sequence tag whose translation yields a protein that perfectly matches both of the sequencing outputs described in the same paper. Further analysis indicated that the latter protein seems to belong to the vicilin superfamily rather than to the prolamin one. The fact that vicilins are also seed storage proteins known to be highly allergenic would explain the IgE reactivity originally observed. Based on our observations we suggest that the IgE-reactive 12 kDa protein from almond currently known as Pru du 2S albumin is in reality the cleaved N-terminal region of a 7S vicilin-like protein. Copyright © 2015 Elsevier Ltd. All rights reserved.
Analysis of functional redundancies within the Arabidopsis TCP transcription factor family.
Danisman, Selahattin; van Dijk, Aalt D J; Bimbo, Andrea; van der Wal, Froukje; Hennig, Lars; de Folter, Stefan; Angenent, Gerco C; Immink, Richard G H
2013-12-01
Analyses of the functions of TEOSINTE-LIKE1, CYCLOIDEA, and PROLIFERATING CELL FACTOR1 (TCP) transcription factors have been hampered by functional redundancy between its individual members. In general, putative functionally redundant genes are predicted based on sequence similarity and confirmed by genetic analysis. In the TCP family, however, identification is impeded by relatively low overall sequence similarity. In a search for functionally redundant TCP pairs that control Arabidopsis leaf development, this work performed an integrative bioinformatics analysis, combining protein sequence similarities, gene expression data, and results of pair-wise protein-protein interaction studies for the 24 members of the Arabidopsis TCP transcription factor family. For this, the work completed any lacking gene expression and protein-protein interaction data experimentally and then performed a comprehensive prediction of potential functional redundant TCP pairs. Subsequently, redundant functions could be confirmed for selected predicted TCP pairs by genetic and molecular analyses. It is demonstrated that the previously uncharacterized class I TCP19 gene plays a role in the control of leaf senescence in a redundant fashion with TCP20. Altogether, this work shows the power of combining classical genetic and molecular approaches with bioinformatics predictions to unravel functional redundancies in the TCP transcription factor family.
Hume, Maxwell A; Barrera, Luis A; Gisselbrecht, Stephen S; Bulyk, Martha L
2015-01-01
The Universal PBM Resource for Oligonucleotide Binding Evaluation (UniPROBE) serves as a convenient source of information on published data generated using universal protein-binding microarray (PBM) technology, which provides in vitro data about the relative DNA-binding preferences of transcription factors for all possible sequence variants of a length k ('k-mers'). The database displays important information about the proteins and displays their DNA-binding specificity data in terms of k-mers, position weight matrices and graphical sequence logos. This update to the database documents the growth of UniPROBE since the last update 4 years ago, and introduces a variety of new features and tools, including a new streamlined pipeline that facilitates data deposition by universal PBM data generators in the research community, a tool that generates putative nonbinding (i.e. negative control) DNA sequences for one or more proteins and novel motifs obtained by analyzing the PBM data using the BEEML-PBM algorithm for motif inference. The UniPROBE database is available at http://uniprobe.org. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
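To make the terms "k-mers" and "position weight matrices" concrete, the Python sketch below enumerates all DNA k-mers and scores them with a simple log-odds matrix built from a handful of aligned sites. This is a generic count-based construction and not the BEEML-PBM algorithm mentioned above; the example sites, the pseudocount, the uniform background, and the function names are assumptions.

```python
from itertools import product
from math import log2

BASES = "ACGT"

def all_kmers(k):
    """Every DNA sequence of length k (4**k strings)."""
    return ["".join(p) for p in product(BASES, repeat=k)]

def pwm_from_sites(sites, pseudocount=1.0, background=0.25):
    """Log-odds position weight matrix from equal-length aligned sites."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        column = [s[i] for s in sites]
        pwm.append({
            b: log2(((column.count(b) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def score(pwm, kmer):
    """Sum of per-position log-odds for a k-mer of matching length."""
    return sum(col[base] for col, base in zip(pwm, kmer))

# Hypothetical aligned binding sites, for illustration only.
sites = ["TAAT", "TAAT", "TGAT", "TAAC"]
pwm = pwm_from_sites(sites)
best = max(all_kmers(4), key=lambda kmer: score(pwm, kmer))
print(best, round(score(pwm, best), 3))
```

Ranking every k-mer by such a score gives a complete ordering of sequence variants by predicted preference, analogous in form to the k-mer level data the database displays.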
2004-01-01
Numerous invertebrate species belonging to several phyla cannot synthesize sterols de novo and rely on a dietary source of the compound. SCPx (sterol carrier protein 2/3-oxoacyl-CoA thiolase) is a protein involved in the trafficking of sterols and oxidation of branched-chain fatty acids. We have isolated SCPx protein from Spodoptera littoralis (cotton leafworm) and have subjected it to limited amino acid sequencing. A reverse-transcriptase PCR-based approach has been used to clone the cDNA (1.9 kb), which encodes a 57 kDa protein. Northern blotting detected two mRNA transcripts, one of 1.9 kb, encoding SCPx, and one of 0.95 kb, presumably encoding SCP2 (sterol carrier protein 2). The former mRNA was highly expressed in midgut and Malpighian tubules during the last larval instar. Furthermore, constitutive expression of the gene was detected in the prothoracic glands, which are the main tissue producing the insect moulting hormone. There was no significant change in the 1.9 kb mRNA in midgut throughout development, but slightly higher expression in the early stages. Conceptual translation of the cDNA and a database search revealed that the gene includes the SCP2 sequence and a putative peroxisomal targeting signal in the C-terminal region. Also a cysteine residue at the putative active site for the 3-oxoacyl-CoA thiolase is conserved. Southern blotting showed that SCPx is likely to be encoded by a single-copy gene. The mRNA expression pattern and the gene structure suggest that SCPx from S. littoralis (a lepidopteran) is evolutionarily closer to that of mammals than to that of dipterans. PMID:15149283
Defense Against Cannibalism: The SdpI Family of Bacterial Immunity/Signal Transduction Proteins
Povolotsky, Tatyana Leonidovna; Orlova, Ekaterina; Tamang, Dorjee G.
2010-01-01
The SdpI family consists of putative bacterial toxin immunity and signal transduction proteins. One member of the family in Bacillus subtilis, SdpI, provides immunity to cells from cannibalism in times of nutrient limitation. SdpI family members are transmembrane proteins with 3, 4, 5, 6, 7, 8, or 12 putative transmembrane α-helical segments (TMSs). These varied topologies appear to be genuine rather than artifacts due to sequencing or annotation errors. The basic and most frequently occurring element of the SdpI family has 6 TMSs. Homologues of all topological types were aligned to determine the homologous TMSs and loop regions, and the positive-inside rule was used to determine sidedness. The two most conserved motifs were identified between TMSs 1 and 2 and TMSs 4 and 5 of the 6 TMS proteins. These showed significant sequence similarity, leading us to suggest that the primordial precursor of these proteins was a 3 TMS–encoding genetic element that underwent intragenic duplication. Various deletional and fusional events, as well as intragenic duplications and inversions, may have yielded SdpI homologues with topologies of varying numbers and positions of TMSs. We propose a specific evolutionary pathway that could have given rise to these distantly related bacterial immunity proteins. We further show that genes encoding SdpI homologues often appear in operons with genes for homologues of SdpR, SdpI’s autorepressor. Our analyses allow us to propose structure–function relationships that may be applicable to most family members. Electronic supplementary material The online version of this article (doi:10.1007/s00232-010-9260-7) contains supplementary material, which is available to authorized users. PMID:20563570
Bennett, Mark; Tu, Shin-Lin; Upton, Chris; McArtor, Cassie; Gillett, Amber; Laird, Tanya; O'Dea, Mark
2017-10-15
Poxviruses have previously been detected in macropods with cutaneous papillomatous lesions, however to date, no comprehensive analysis of a poxvirus from kangaroos has been performed. Here we report the genome sequences of a western grey kangaroo poxvirus (WKPV) and an eastern grey kangaroo poxvirus (EKPV), named for the host species from which they were isolated, western grey (Macropus fuliginosus) and eastern grey (Macropus giganteus) kangaroos. Poxvirus DNA from WKPV and EKPV was isolated and entire coding genome regions determined through Roche GS Junior and Illumina Miseq sequencing, respectively. Viral genomes were assembled using MIRA and SPAdes, and annotations performed using tools available from the Viral Bioinformatics Resource Centre. Histopathology and transmission electron microscopy analysis was also performed on WKPV and its associated lesions. The WKPV and EKPV genomes show 96% identity (nucleotide) to each other and phylogenetic analysis places them on a distinct branch between the established Molluscipoxvirus and Avipoxvirus genera. WKPV and EKPV are 170 kbp and 167 kbp long, containing 165 and 162 putative genes, respectively. Together, their genomes encode up to 47 novel unique hypothetical proteins, and possess virulence proteins including a major histocompatibility complex class II inhibitor, a semaphorin-like protein, a serpin, a 3-β-hydroxysteroid dehydrogenase/δ 5→4 isomerase, and a CD200-like protein. These viruses also encode a large putative protein (WKPV-WA-039 and EKPV-SC-038) with a C-terminal domain that is structurally similar to the C-terminal domain of a cullin, suggestive of a role in the control of host ubiquitination. The relationship of these viruses to members of the Molluscipoxvirus and Avipoxvirus genera is discussed in terms of sequence similarity, gene content and nucleotide composition. 
A novel genus within subfamily Chordopoxvirinae is proposed to accommodate these two poxvirus species from kangaroos; we suggest the name, Thylacopoxvirus (thylaco-: [Gr.] thylakos meaning sac or pouch). Copyright © 2017 Elsevier B.V. All rights reserved.
Proliferating cell nuclear antigen (Pcna) as a direct downstream target gene of Hoxc8
DOE Office of Scientific and Technical Information (OSTI.GOV)
Min, Hyehyun; Lee, Ji-Yeon; Bok, Jinwoong
2010-02-19
Hoxc8 is a member of Hox family transcription factors that play crucial roles in spatiotemporal body patterning during embryogenesis. Hox proteins contain a conserved 61 amino acid homeodomain, which is responsible for recognition and binding of the proteins onto Hox-specific DNA binding motifs and regulates expression of their target genes. Previously, using proteome analysis, we identified Proliferating cell nuclear antigen (Pcna) as one of the putative target genes of Hoxc8. Here, we asked whether Hoxc8 regulates Pcna expression by directly binding to the regulatory sequence of Pcna. In mouse embryos at embryonic day 11.5, the expression pattern of Pcna was similar to that of Hoxc8 along the anteroposterior body axis. Moreover, Pcna transcript levels as well as cell proliferation rate were increased by overexpression of Hoxc8 in C3H10T1/2 mouse embryonic fibroblast cells. Characterization of 2.3 kb genomic sequence upstream of Pcna coding region revealed that the upstream sequence contains several Hox core binding sequences and one Hox-Pbx binding sequence. Direct binding of Hoxc8 proteins to the Pcna regulatory sequence was verified by chromatin immunoprecipitation assay. Taken together, our data suggest that Pcna is a direct downstream target of Hoxc8.
Streptococcus pneumonia YlxR at 1.35 A shows a putative new fold.
Osipiuk, J; Górnicki, P; Maj, L; Dementieva, I; Laskowski, R; Joachimiak, A
2001-11-01
The structure of the YlxR protein of unknown function from Streptococcus pneumoniae was determined to 1.35 Å. YlxR is expressed from the nusA/infB operon in bacteria and belongs to a small protein family (COG2740) that shares a conserved sequence motif GRGA(Y/W). The family shows no significant amino-acid sequence similarity with other proteins. Three-wavelength diffraction MAD data were collected to 1.7 Å from orthorhombic crystals using synchrotron radiation and the structure was determined using a semi-automated approach. The YlxR structure resembles a two-layer alpha/beta sandwich with the overall shape of a cylinder and shows no structural homology to proteins of known structure. Structural analysis revealed that the YlxR structure represents a new protein fold that belongs to the alpha-beta plait superfamily. The distribution of the electrostatic surface potential shows a large positively charged patch on one side of the protein, a feature often found in nucleic acid-binding proteins. Three sulfate ions bind to this positively charged surface. Analysis of potential binding sites uncovered several substantial clefts, with the largest spanning 3/4 of the protein. A similar distribution of binding sites and a large sharply bent cleft are observed in RNA-binding proteins that are unrelated in sequence and structure. It is proposed that YlxR is an RNA-binding protein.
Razban, Rostam M; Gilson, Amy I; Durfee, Niamh; Strobelt, Hendrik; Dinkla, Kasper; Choi, Jeong-Mo; Pfister, Hanspeter; Shakhnovich, Eugene I
2018-05-08
Protein evolution spans time scales and its effects span the length of an organism. A web app named ProteomeVis is developed to provide a comprehensive view of protein evolution in the S. cerevisiae and E. coli proteomes. ProteomeVis interactively creates protein chain graphs, where edges between nodes represent structure and sequence similarities within user-defined ranges, to study the long time scale effects of protein structure evolution. The short time scale effects of protein sequence evolution are studied by sequence evolutionary rate (ER) correlation analyses with protein properties that span from the molecular to the organismal level. We demonstrate the utility and versatility of ProteomeVis by investigating the distribution of edges per node in organismal protein chain universe graphs (oPCUGs) and putative ER determinants. S. cerevisiae and E. coli oPCUGs are scale-free with scaling constants of 1.79 and 1.56, respectively. Both scaling constants can be explained by a previously reported theoretical model describing protein structure evolution (Dokholyan et al., 2002). Protein abundance most strongly correlates with ER among properties in ProteomeVis, with Spearman correlations of -0.49 (p-value < 10⁻¹⁰) and -0.46 (p-value < 10⁻¹⁰) for S. cerevisiae and E. coli, respectively. This result is consistent with previous reports that found protein expression to be the most important ER determinant (Zhang and Yang, 2015). ProteomeVis is freely accessible at http://proteomevis.chem.harvard.edu. Supplementary data are available at Bioinformatics. Contact: shakhnovich@chemistry.harvard.edu.
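The abundance and ER relationship above is summarized with Spearman correlations, that is, Pearson correlations computed on ranks. The dependency-free Python sketch below shows that calculation on made-up numbers; the function names and the toy data are assumptions and are not drawn from ProteomeVis.

```python
def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical abundance and evolutionary-rate values for five proteins.
abundance = [10, 200, 3000, 45000, 500000]
evol_rate = [0.9, 0.5, 0.6, 0.2, 0.1]
print(round(spearman(abundance, evol_rate), 3))
```

Because only ranks matter, the result is unchanged by log-transforming abundance, which is one reason rank correlation suits quantities spanning several orders of magnitude.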
Xia, Qing; Wang, Hong-xia; Wang, Jie; Liu, Bing-yu; Hu, Mei-ru; Zhang, Xue-min; Shen, Bei-fen
2004-10-01
The aim was to identify two differentiation-associated proteins induced by rhIL-6 in M1 mouse myeloid leukemia cells. Protein spots were excised from 2-D gels and digested in-gel with trypsin. The tryptic digests were first analyzed by matrix-assisted laser desorption/ionization-time of flight-mass spectrometry (MALDI-TOF-MS) peptide mass fingerprinting and then subjected to peptide sequencing by nano-electrospray ionization mass spectrometry/mass spectrometry (nano-ESI-MS/MS). The database search was performed with the Mascot search engine (http://www.matrixscience.co.uk) using the data processed through MaxEnt3 and MasSeq. The two proteins could not be identified by peptide mass fingerprinting using MALDI-TOF-MS, but they were identified as Destrin and Putative protein, respectively, once the sequences of their tryptic peptides had been obtained by nano-ESI-MS/MS. The nano-ESI-MS/MS technique can successfully identify the two differentiation-associated proteins induced by rhIL-6 and has a great advantage in protein analysis.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Markussen, Turhan; Jonassen, Christine Monceyron; Numanovic, Sanela
2008-05-10
Infectious salmon anaemia virus (ISAV) is an orthomyxovirus causing a multisystemic, emerging disease in Atlantic salmon. Here we present, for the first time, detailed sequence analyses of the full-genome sequence of a presumed avirulent isolate displaying a full-length hemagglutinin-esterase (HE) gene (HPR0), and compare this with full-genome sequences of 11 Norwegian ISAV isolates from clinically diseased fish. These analyses revealed the presence of a virulence marker immediately upstream of the putative cleavage site R267 in the fusion (F) protein, suggesting a Q266 → L266 substitution to be a prerequisite for virulence. To gain virulence in isolates lacking this substitution, a sequence insertion near the cleavage site seems to be required. This strongly suggests the involvement of a protease recognition pattern at the cleavage site of the fusion protein as a determinant of virulence, as seen in highly pathogenic influenza A virus H5 or H7 and the paramyxovirus Newcastle disease virus.
A third genotype of the human parvovirus PARV4 in sub-Saharan Africa.
Simmonds, Peter; Douglas, Jill; Bestetti, Giovanna; Longhi, Erika; Antinori, Spinello; Parravicini, Carlo; Corbellino, Mario
2008-09-01
PARV4 is a recently discovered human parvovirus widely distributed in injecting drug users in the USA and Europe, particularly in those co-infected with human immunodeficiency virus (HIV). Like parvovirus B19, PARV4 persists in previously exposed individuals. In bone marrow and lymphoid tissue, PARV4 sequences were detected in two sub-Saharan African study subjects with AIDS but without a reported history of parenteral exposure and who were uninfected with hepatitis C virus. PARV4 variants infecting these subjects were phylogenetically distinct from genotypes 1 and 2 (formerly PARV5) that were reported previously. Analysis of near-complete genome sequences demonstrated that they should be classified as a third (equidistant) PARV4 genotype. The availability of a further near-complete genome sequence of this novel genotype facilitated identification of conserved novel open reading frames embedded in the ORF2 coding sequence; one encoded a putative protein with identifiable homology to SAT proteins of members of the genus Parvovirus.
Lagkouvardos, Ilias; Weinmaier, Thomas; Lauro, Federico M; Cavicchioli, Ricardo; Rattei, Thomas; Horn, Matthias
2014-01-01
In the era of metagenomics and amplicon sequencing, comprehensive analyses of available sequence data remain a challenge. Here we describe an approach exploiting metagenomic and amplicon data sets from public databases to elucidate phylogenetic diversity of defined microbial taxa. We investigated the phylum Chlamydiae whose known members are obligate intracellular bacteria that represent important pathogens of humans and animals, as well as symbionts of protists. Despite their medical relevance, our knowledge about chlamydial diversity is still scarce. Most of the nine known families are represented by only a few isolates, while previous clone library-based surveys suggested the existence of yet uncharacterized members of this phylum. Here we identified more than 22 000 high quality, non-redundant chlamydial 16S rRNA gene sequences in diverse databases, as well as 1900 putative chlamydial protein-encoding genes. Even when applying the most conservative approach, clustering of chlamydial 16S rRNA gene sequences into operational taxonomic units revealed an unexpectedly high species, genus and family-level diversity within the Chlamydiae, including 181 putative families. These in silico findings were verified experimentally in one Antarctic sample, which contained a high diversity of novel Chlamydiae. In our analysis, the Rhabdochlamydiaceae, whose known members infect arthropods, represents the most diverse and species-rich chlamydial family, followed by the protist-associated Parachlamydiaceae, and a putative new family (PCF8) with unknown host specificity. Available information on the origin of metagenomic samples indicated that marine environments contain the majority of the newly discovered chlamydial lineages, highlighting this environment as an important chlamydial reservoir. PMID:23949660
El-Assaad, Atlal; Dawy, Zaher; Nemer, Georges; Hajj, Hazem; Kobeissy, Firas H
2017-01-01
Degradomics is a novel discipline that involves determination of the proteases/substrate fragmentation profile, called the substrate degradome, and has been recently applied in different disciplines. A major application of degradomics is its utility in the field of biomarkers where the breakdown products (BDPs) of different proteases have been investigated. Among the major proteases assessed, calpain and caspase proteases have been associated with the execution phases of pro-apoptotic and pro-necrotic cell death, generating caspase/calpain-specific cleaved fragments. The distinction between calpain and caspase protein fragments has been applied to distinguish injury mechanisms. Advanced proteomics technology has been used to identify these BDPs experimentally. However, it has been a challenge to identify these BDPs with high precision and efficiency, especially when targeting a number of proteins at one time. In this chapter, we present a novel bioinformatic detection method that identifies BDPs accurately and efficiently with validation against experimental data. This method aims at predicting the consensus sequence occurrences and their variants in a large set of experimentally detected protein sequences based on state-of-the-art sequence matching and alignment algorithms. After detection, the method generates all the potential fragments cleaved by a specific protease. This space- and time-efficient algorithm is flexible enough to handle the different orientations that the consensus sequence and the protein sequence can take before cleaving. It is O(mn) in space complexity and O(Nmn) in time complexity, where N is the number of protein sequences, m the length of the consensus sequence, and n the length of each protein sequence. Ultimately, this knowledge will feed into the development of a novel tool for researchers to detect diverse types of selected BDPs as putative disease markers, contributing to the diagnosis and treatment of related disorders.
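The detect-then-cleave idea described above can be sketched as follows. This is a simplified illustration, not the chapter's algorithm: it uses a naive sliding-window scan (O(mn) time per protein, hence O(Nmn) over N proteins, but without the alignment matrix implied by the stated O(mn) space bound). The function names, the wildcard convention, the `cut_offset` parameter, and the toy protein are assumptions; "DEXD" stands in for a caspase-like motif purely as an example.

```python
from typing import List


def find_sites(protein: str, consensus: str, wildcard: str = "X") -> List[int]:
    """Return 0-based start indices where the consensus matches the protein.

    The wildcard character in the consensus matches any residue.
    """
    m, n = len(consensus), len(protein)
    sites = []
    for i in range(n - m + 1):  # empty range when the protein is shorter than the motif
        if all(c == wildcard or c == protein[i + j] for j, c in enumerate(consensus)):
            sites.append(i)
    return sites


def cleave(protein: str, consensus: str, cut_offset: int) -> List[str]:
    """Split the protein cut_offset residues after the start of each motif match."""
    cuts = [i + cut_offset for i in find_sites(protein, consensus)]
    # Keep only cuts that fall strictly inside the sequence.
    bounds = [0] + [c for c in cuts if 0 < c < len(protein)] + [len(protein)]
    return [protein[a:b] for a, b in zip(bounds, bounds[1:])]


# Toy protein (made up); cleavage is placed right after the 4-residue motif.
fragments = cleave("MKDEVDGASDEADLL", "DEXD", cut_offset=4)
```

Applying `cleave` to each of N proteins in turn gives the overall O(Nmn) running time quoted in the abstract for this kind of scan.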
Perdomo-Sabogal, Alvaro; Nowick, Katja; Piccini, Ilaria; Sudbrak, Ralf; Lehrach, Hans; Yaspo, Marie-Laure; Warnatz, Hans-Jörg; Querfurth, Robert
2016-01-01
A substantial fraction of phenotypic differences between closely related species are likely caused by differences in gene regulation. Although this was postulated over 30 years ago, only a few examples of evolutionary changes in gene regulation have been verified. Here, we identified and investigated binding sites of the transcription factor GA-binding protein alpha (GABPa) aiming to discover cis-regulatory adaptations on the human lineage. By performing chromatin immunoprecipitation-sequencing experiments in a human cell line, we found 11,619 putative GABPa binding sites. Through sequence comparisons of the human GABPa binding regions with orthologous sequences from 34 mammals, we identified substitutions that have resulted in 224 putative human-specific GABPa binding sites. To experimentally assess the transcriptional impact of those substitutions, we selected four promoters for promoter-reporter gene assays using human and African green monkey cells. We compared the activities of wild-type promoters to mutated forms, in which we introduced one or more substitutions to mimic the ancestral state devoid of the GABPa consensus binding sequence. Similarly, we introduced the human-specific substitutions into chimpanzee and macaque promoter backgrounds. Our results demonstrate that the identified substitutions are functional, both in human and nonhuman promoters. In addition, we performed GABPa knock-down experiments and found 1,215 genes as strong candidates for primary targets. Further analyses of our data sets link GABPa to cognitive disorders, diabetes, KRAB zinc finger (KRAB-ZNF), and human-specific genes. Thus, we propose that differences in GABPa binding sites played important roles in the evolution of human-specific phenotypes. PMID:26814189
Pope, Welkin H.; Weigele, Peter R.; Chang, Juan; Pedulla, Marisa L.; Ford, Michael E.; Houtz, Jennifer M.; Jiang, Wen; Chiu, Wah; Hatfull, Graham F.; Hendrix, Roger W.; King, Jonathan
2010-01-01
Marine Synechococcus spp and marine Prochlorococcus spp are numerically dominant photoautotrophs in the open oceans and contributors to the global carbon cycle. Syn5 is a short-tailed cyanophage isolated from the Sargasso Sea on Synechococcus strain WH8109. Syn5 has been grown in WH8109 to high titer in the laboratory and purified and concentrated retaining infectivity. Genome sequencing and annotation of Syn5 revealed that the linear genome is 46,214bp with a 237bp terminal direct repeat. Sixty-one open reading frames (ORFs) were identified. Based on genomic organization and sequence similarity to known protein sequences within GenBank, Syn5 shares features with T7-like phages. The presence of a putative integrase suggests access to a temperate life-cycle. Assignment of eleven ORFs to structural proteins found within the phage virion was confirmed by mass-spectrometry and N-terminal sequencing. Eight of these identified structural proteins exhibited amino acid sequence similarity to enteric phage proteins. The remaining three virion proteins did not resemble any known phage sequences in GenBank as of August 2006. Cryoelectron micrographs of purified Syn5 virions revealed that the capsid has a single “horn”, a novel fibrous structure protruding from the opposing end of the capsid from the tail of the virion. The tail appendage displayed an apparent three-fold rather than six-fold symmetry. An 18Å-resolution icosahedral reconstruction of the capsid revealed a T=7 lattice, but with an unusual pattern of surface knobs. This phage/host system should allow detailed investigation of the physiology and biochemistry of phage propagation in marine photosynthetic bacteria. PMID:17383677
The complete nucleotide sequence of RNA 3 of a peach isolate of Prunus necrotic ringspot virus.
Hammond, R W; Crosslin, J M
1995-04-01
The complete nucleotide sequence of RNA 3 of the PE-5 peach isolate of Prunus necrotic ringspot ilarvirus (PNRSV) was obtained from cloned cDNA. The RNA sequence is 1941 nucleotides and contains two open reading frames (ORFs). ORF 1 consisted of 284 amino acids with a calculated molecular weight of 31,729 Da and ORF 2 contained 224 amino acids with a calculated molecular weight of 25,018 Da. ORF 2 corresponds to the coat protein gene. Expression of ORF 2 engineered into a pTrcHis vector in Escherichia coli results in a fusion polypeptide of approximately 28 kDa which cross-reacts with PNRSV polyclonal antiserum. Analysis of the coat protein amino acid sequence reveals a putative "zinc-finger" domain at the amino-terminal portion of the protein. Two tetranucleotide AUGC motifs occur in the 3'-UTR of the RNA and may function in coat protein binding and genome activation. ORF 1 homologies to other ilarviruses and alfalfa mosaic virus are confined to limited regions of conserved amino acids. The translated amino acid sequence of the coat protein gene shows 92% similarity to one isolate of apple mosaic virus, a closely related member of the ilarvirus group of plant viruses, but only 66% similarity to the amino acid sequence of the coat protein gene of a second isolate. These relationships are also reflected at the nucleotide sequence level. These results in one instance confirm the close similarities observed at the biophysical and serological levels between these two viruses, but on the other hand call into question the nomenclature used to describe these viruses.
Avian sarcoma virus 17 carries the jun oncogene.
Maki, Y; Bos, T J; Davis, C; Starbuck, M; Vogt, P K
1987-01-01
Biologically active molecular clones of avian sarcoma virus 17 (ASV 17) contain a replication-defective proviral genome of 3.5 kilobases (kb). The genome retains partial gag and env sequences, which flank a cell-derived putative oncogene of 0.93 kb, termed jun. The jun gene lacks preserved coding domains of tyrosine-specific protein kinases. It also shows no significant nucleic acid homology with other known oncogenes. The probable transformation-specific protein in ASV 17-transformed cells is a 55-kDa gag-jun fusion product. PMID:3033666
Analysis of xylem formation in pine by cDNA sequencing
NASA Technical Reports Server (NTRS)
Allona, I.; Quinn, M.; Shoop, E.; Swope, K.; St Cyr, S.; Carlis, J.; Riedl, J.; Retzel, E.; Campbell, M. M.; Sederoff, R.
1998-01-01
Secondary xylem (wood) formation is likely to involve some genes expressed rarely or not at all in herbaceous plants. Moreover, environmental and developmental stimuli influence secondary xylem differentiation, producing morphological and chemical changes in wood. To increase our understanding of xylem formation, and to provide material for comparative analysis of gymnosperm and angiosperm sequences, ESTs were obtained from immature xylem of loblolly pine (Pinus taeda L.). A total of 1,097 single-pass sequences were obtained from 5' ends of cDNAs made from gravistimulated tissue from bent trees. Cluster analysis detected 107 groups of similar sequences, ranging in size from 2 to 20 sequences. A total of 361 sequences fell into these groups, whereas 736 sequences were unique. About 55% of the pine EST sequences show similarity to previously described sequences in public databases. About 10% of the recognized genes encode factors involved in cell wall formation. Sequences similar to cell wall proteins, most known lignin biosynthetic enzymes, and several enzymes of carbohydrate metabolism were found. A number of putative regulatory proteins also are represented. Expression patterns of several of these genes were studied in various tissues and organs of pine. Sequencing novel genes expressed during xylem formation will provide a powerful means of identifying mechanisms controlling this important differentiation pathway.
USDA-ARS's Scientific Manuscript database
Grapevine red blotch-associated virus (GRBaV) is a newly identified virus of grapevines, and a putative member of a new genus within the family Geminiviridae. This virus is associated with red blotch disease that was first reported in California in 2008. It affects the profitability of vineyards by ...
O’Keeffe, Triona; Hill, Colin; Ross, R. Paul
1999-01-01
Enterocin A is a small, heat-stable, antilisterial bacteriocin produced by Enterococcus faecium DPC1146. The sequence of a 10,879-bp chromosomal region containing at least 12 open reading frames (ORFs), 7 of which are predicted to play a role in enterocin biosynthesis, is presented. The genes entA, entI, and entF encode the enterocin A prepeptide, the putative immunity protein, and the induction factor prepeptide, respectively. The deduced proteins EntK and EntR resemble the histidine kinase and response regulator proteins of two-component signal transducing systems of the AgrC-AgrA type. The predicted proteins EntT and EntD are homologous to ABC (ATP-binding cassette) transporters and accessory factors, respectively, of several other bacteriocin systems and to proteins implicated in the signal-sequence-independent export of Escherichia coli hemolysin A. Immediately downstream of the entT and entD genes are two ORFs, the product of one of which, ORF4, is very similar to the product of the yteI gene of Bacillus subtilis and to E. coli protease IV, a signal peptide peptidase known to be involved in outer membrane lipoprotein export. Another potential bacteriocin is encoded in the opposite direction to the other genes in the enterocin cluster. This putative bacteriocin-like peptide is similar to LafX, one of the components of the lactacin F complex. A deletion which included one of two direct repeats upstream of the entA gene abolished enterocin A activity, immunity, and ability to induce bacteriocin production. Transposon insertion upstream of the entF gene also had the same effect, but this mutant could be complemented by exogenously supplied induction factor. The putative EntI peptide was shown to be involved in the immunity to enterocin A. 
Cloning of a 10.5-kb amplicon comprising all predicted ORFs and regulatory regions resulted in heterologous production of enterocin A and induction factor in Enterococcus faecalis, while a four-gene construct (entAITD) under the control of a constitutive promoter resulted in heterologous enterocin A production in both E. faecalis and Lactococcus lactis. PMID:10103244
Zhang, Songyan; Gao, Jiuxiang; Lu, Yiling; Cai, Shasha; Qiao, Xue; Wang, Yipeng; Yu, Haining
2013-08-01
Antifreeze proteins (AFPs) are a class of polypeptides that are produced by certain vertebrates, plants, fungi, and bacteria and that permit their survival in subzero environments. In this study, we report the molecular cloning, sequence analysis and three-dimensional structure of the axolotl antifreeze-like protein (AFLP) by homology modeling of the first caudate amphibian AFLP. We constructed a full-length spleen cDNA library of axolotl (Ambystoma mexicanum). An EST having the highest similarity (∼42%) with freeze-responsive liver protein Li16 from Rana sylvatica was identified, and the full-length cDNA was subsequently obtained by RACE-PCR. The axolotl antifreeze-like protein sequence represents an open reading frame for a putative signal peptide and the mature protein composed of 93 amino acids. The calculated molecular mass and the theoretical isoelectric point (pI) of this mature protein were 10128.6 Da and 8.97, respectively. The molecular characterization of this gene and its deduced protein was further performed by detailed bioinformatics analysis. The three-dimensional structure of the AFLP was predicted by homology modeling, and the conserved residues required for functionality were identified. The homology model constructed could be of use for effective drug design. This is the first report of an antifreeze-like protein identified from a caudate amphibian.
Dinant, S; Lot, H; Albouy, J; Kuziak, C; Meyer, M; Astier-Manifacier, S
1991-01-01
DNA complementary to the 3' terminal 1651 nucleotides of the genome of the common strain of lettuce mosaic virus (LMV-O) has been cloned and sequenced. Microsequencing of the N-terminus enabled localization of the coat protein gene in this sequence. It showed also that the LMV coat protein coding region is at the 3' end of the genome, and that the coat protein is processed from a larger protein by cleavage at an unusual Q/V dipeptide between the polymerase and the coat protein. This is the first report of such a site for cleavage of a potyvirus polyprotein, where only Q/A, Q/S, and Q/G cleavage sites have been reported. The LMV coat protein gene encodes a 278 amino acid polypeptide with a calculated Mr of 31,171 and is flanked by a region which has a high degree of homology with the putative polymerase and a 3' untranslated region of 211 nucleotides in length. Percentage of homology with the coat protein of other potyviruses confirms that LMV is a distinct member of this group. Moreover, amino acid homologies noticed with the coat protein of potexvirus, bymovirus, and carlavirus elongated plant viruses suggest a functional significance for the conserved domains.
Cloning and characterization of a novel zinc finger gene in Xp11.2
DOE Office of Scientific and Technical Information (OSTI.GOV)
Derry, J.M.J.; Jess, U.; Francke, U.
1995-11-20
During a systematic search for open reading frames in chromosome band Xp11.2, a novel gene (ZNF157) that encodes a putative 506-amino-acid protein with the sequence characteristics of a zinc-finger-containing transcription factor was isolated. ZNF157 is encoded by four exons distributed over >20 kb of genomic DNA. The second and third exons contain sequences similar to those of the previously described KRAB-A and KRAB-B domains, motifs that have been shown to mediate transcriptional repression in other members of the protein family. A fourth exon contains 12 zinc finger DNA binding motifs and finger linking regions characteristic of ZNF proteins of the Krueppel family. ZNF157 maps to the telomeric end of a cluster of ZNF genes that includes ZNF21, ZNF41, and ZNF81. 19 refs., 2 figs.
Tomie, Tetsuya; Ishibashi, Jun; Furukawa, Seiichi; Kobayashi, Satoe; Sawahata, Ryoko; Asaoka, Ai; Tagawa, Michito; Yamakawa, Minoru
2003-07-25
A novel antifungal peptide, scarabaecin (4080 Da), was isolated from the coconut rhinoceros beetle, Oryctes rhinoceros. Scarabaecin cDNA was cloned by reverse transcriptase-polymerase chain reactions (RT-PCR) using a primer based on the N-terminal amino acid sequence. The amino acid sequence deduced from scarabaecin cDNA showed no significant similarity to those of reported proteins. Chemically synthesized scarabaecin showed antifungal activity against phytopathogenic fungi such as Pyricularia oryzae, Rhizoctonia solani, and Botrytis cinerea, but not against phytopathogenic bacteria. It showed weak activity against Beauveria bassiana, an insect pathogenic fungus, and Staphylococcus aureus, a pathogenic bacterium. Scarabaecin showed chitin-binding activity, and its K(d) was 1.315 microM. A comparison of putative chitin-binding domains among scarabaecin, invertebrate, and plant chitin-binding proteins suggests that scarabaecin is a new member of chitin-binding antimicrobial proteins.
Torres-Cortés, Gloria; Ghignone, Stefano; Bonfante, Paola; Schüßler, Arthur
2015-06-23
For more than 450 million years, arbuscular mycorrhizal fungi (AMF) have formed intimate, mutualistic symbioses with the vast majority of land plants and are major drivers in almost all terrestrial ecosystems. The obligate plant-symbiotic AMF host additional symbionts, so-called Mollicutes-related endobacteria (MRE). To uncover putative functional roles of these widespread but yet enigmatic MRE, we sequenced the genome of DhMRE living in the AMF Dentiscutata heterogama. Multilocus phylogenetic analyses showed that MRE form a previously unidentified lineage sister to the hominis group of Mycoplasma species. DhMRE possesses a strongly reduced metabolic capacity with 55% of the proteins having unknown function, which reflects unique adaptations to an intracellular lifestyle. We found evidence for transkingdom gene transfer between MRE and their AMF host. At least 27 annotated DhMRE proteins show similarities to nuclear-encoded proteins of the AMF Rhizophagus irregularis, which itself lacks MRE. Nuclear-encoded homologs could moreover be identified for another AMF, Gigaspora margarita, and surprisingly, also the non-AMF Mortierella verticillata. Our data indicate a possible origin of the MRE-fungus association in ancestors of the Glomeromycota and Mucoromycotina. The DhMRE genome encodes an arsenal of putative regulatory proteins with eukaryotic-like domains, some of them encoded in putative genomic islands. MRE are highly interesting candidates to study the evolution and interactions between an ancient, obligate endosymbiotic prokaryote with its obligate plant-symbiotic fungal host. Our data moreover may be used for further targeted searches for ancient effector-like proteins that may be key components in the regulation of the arbuscular mycorrhiza symbiosis.
Feki, Kaouthar; Kamoun, Yosra; Ben Mahmoud, Rihem; Farhat-Khemakhem, Ameny; Gargouri, Ali; Brini, Faiçal
2015-12-01
Catalases are reactive oxygen species scavenging enzymes involved in response to abiotic and biotic stresses. In this study, we describe the isolation and functional characterization of a novel catalase from durum wheat, designated TdCAT1. Molecular phylogenetic analyses showed that wheat TdCAT1 exhibited high amino acid sequence identity to other plant catalases. Sequence homology analysis showed that the TdCAT1 protein contained a putative calmodulin binding domain and a putative conserved internal peroxisomal targeting signal PTS1 motif around its C-terminus. The predicted three-dimensional structural model revealed the presence of four putative distinct structural regions, which are the N-terminal arm, the β-barrel, the wrapping and the α-helical domains. The TdCAT1 protein had a heme pocket composed of five essential residues. TdCAT1 gene expression analysis showed that this gene was induced by various abiotic stresses in durum wheat. The expression of TdCAT1 in yeast cells and Arabidopsis plants conferred tolerance to several abiotic stresses. Compared with the non-transformed plants, the transgenic lines maintained their growth and accumulated more proline under stress treatments. Furthermore, the amount of H2O2 was lower in transgenic lines, which was due to the high CAT and POD activities. Taken together, these data provide evidence for the involvement of durum wheat catalase TdCAT1 in tolerance to multiple abiotic stresses in crop plants. Copyright © 2015 Elsevier Masson SAS. All rights reserved.
Komatsu, Ken; Hirata, Hisae; Fukagawa, Takako; Yamaji, Yasuyuki; Okano, Yukari; Ishikawa, Kazuya; Adachi, Tatsushi; Maejima, Kensaku; Hashimoto, Masayoshi; Namba, Shigetou
2012-07-01
The first open-reading frame (ORF) of apple stem grooving virus (ASGV), of the genus Capillovirus, encodes an apparently chimeric polyprotein containing conserved regions for replicase (Rep) and coat protein (CP). However, our previous study revealed that ASGV mutants with distinct and discontinuous Rep- and CP-coding regions successfully infect plants, indicating that CP expressed via a subgenomic RNA (sgRNA) is sufficient for viability of the virus. Here we identified a transcription start site of the CP sgRNA and revealed that CP translated from the sgRNA is essential for ASGV infection. We mapped the transcription start sites of both the CP and the movement protein (MP) sgRNAs of ASGV and found a hexanucleotide motif, UUAGGU, conserved upstream from both sgRNA transcription start sites. Mutational analysis of the putative CP initiation codon and of the UUAGGU sequence upstream from the transcription start site of CP sgRNA demonstrated their importance for ASGV accumulation. Our results also demonstrated that potato virus T (PVT), an unassigned species closely related to ASGV, produces two sgRNAs putatively deployed for the CP and MP expression and that the same hexanucleotide motif as found in ASGV is located upstream from the transcription start sites of both sgRNAs. This motif, which constituted putative core elements of the sgRNA promoter, is broadly conserved among viruses in the families Alphaflexiviridae and Betaflexiviridae, suggesting that the gene expression strategy of the viruses in both families has been conserved throughout evolution. Copyright © 2012 Elsevier B.V. All rights reserved.
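The check for a conserved motif upstream of a mapped transcription start site can be sketched as a simple window search. This is an illustration only: the function name, the window size, the 0-based coordinate convention, and the toy RNA sequence are all assumptions; only the hexanucleotide UUAGGU comes from the abstract.

```python
from typing import List


def motif_upstream(genome: str, tss: int, motif: str = "UUAGGU", window: int = 30) -> List[int]:
    """Return distances (in nt) from each motif start to the transcription start site.

    Only occurrences lying entirely within `window` nt upstream of `tss`
    (0-based index of the first transcribed nucleotide) are reported.
    """
    start = max(0, tss - window)
    region = genome[start:tss]
    hits = []
    pos = region.find(motif)
    while pos != -1:
        hits.append(tss - (start + pos))
        pos = region.find(motif, pos + 1)  # allow overlapping occurrences
    return hits


# Made-up toy sequence: the motif starts at index 12, the hypothetical TSS is at index 24.
toy_genome = "ACGU" * 3 + "UUAGGU" + "ACGUAC" + "GGGAAACCC"
distances = motif_upstream(toy_genome, tss=24)
```

Running the same search at the start sites of both sgRNAs, and across related genomes, is one way a shared upstream element such as the one reported here could be tabulated.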
Pomati, Francesco; Burns, Brendan P; Neilan, Brett A
2004-08-01
Blooms of the freshwater cyanobacterium Anabaena circinalis are recognized as an important health risk worldwide due to the production of a range of toxins such as saxitoxin (STX) and its derivatives. In this study we used HIP1 octameric-palindrome repeated-sequence PCR to compare the genomic structure of phylogenetically similar Australian isolates of A. circinalis. STX-producing and nontoxic cyanobacterial strains showed different HIP1 (highly iterated octameric palindrome 1) DNA patterns, and characteristic interrepeat amplicons for each group were identified. Suppression subtractive hybridization (SSH) was performed using HIP1 PCR-generated libraries to further identify toxic-strain-specific genes. An STX-producing strain and a nontoxic strain of A. circinalis were chosen as testers in two distinct experiments. The two categories of SSH putative tester-specific sequences were characterized by different families of encoded proteins that may be representative of the differences in metabolism between STX-producing and nontoxic A. circinalis strains. DNA-microarray hybridization and genomic screening revealed a toxic-strain-specific HIP1 fragment coding for a putative Na(+)-dependent transporter. Analysis of this gene demonstrated analogy to the mrpF gene of Bacillus subtilis, whose encoded protein is involved in Na(+)-specific pH homeostasis. The application of this gene as a molecular probe in laboratory and environmental screening for STX-producing A. circinalis strains was demonstrated. The possible role of this putative Na(+)-dependent transporter in the toxic cyanobacterial phenotype is also discussed, in light of recent physiological studies of STX-producing cyanobacteria.
Defining Electron Bifurcation in the Electron-Transferring Flavoprotein Family.
Garcia Costas, Amaya M; Poudel, Saroj; Miller, Anne-Frances; Schut, Gerrit J; Ledbetter, Rhesa N; Fixen, Kathryn R; Seefeldt, Lance C; Adams, Michael W W; Harwood, Caroline S; Boyd, Eric S; Peters, John W
2017-11-01
Electron bifurcation is the coupling of exergonic and endergonic redox reactions to simultaneously generate (or utilize) low- and high-potential electrons. It is the third recognized form of energy conservation in biology and was recently described for select electron-transferring flavoproteins (Etfs). Etfs are flavin-containing heterodimers best known for donating electrons derived from fatty acid and amino acid oxidation to an electron transfer respiratory chain via Etf-quinone oxidoreductase. Canonical examples contain a flavin adenine dinucleotide (FAD) that is involved in electron transfer, as well as a non-redox-active AMP. However, Etfs demonstrated to bifurcate electrons contain a second FAD in place of the AMP. To expand our understanding of the functional variety and metabolic significance of Etfs and to identify amino acid sequence motifs that potentially enable electron bifurcation, we compiled 1,314 Etf protein sequences from genome sequence databases and subjected them to informatic and structural analyses. Etfs were identified in diverse archaea and bacteria, and they clustered into five distinct well-supported groups, based on their amino acid sequences. Gene neighborhood analyses indicated that these Etf group designations largely correspond to putative differences in functionality. Etfs with the demonstrated ability to bifurcate were found to form one group, suggesting that distinct conserved amino acid sequence motifs enable this capability. Indeed, structural modeling and sequence alignments revealed that identifying residues occur in the NADH- and FAD-binding regions of bifurcating Etfs. Collectively, a new classification scheme for Etf proteins that delineates putative bifurcating versus nonbifurcating members is presented and suggests that Etf-mediated bifurcation is associated with surprisingly diverse enzymes. 
IMPORTANCE Electron bifurcation has recently been recognized as an electron transfer mechanism used by microorganisms to maximize energy conservation. Bifurcating enzymes couple thermodynamically unfavorable reactions with thermodynamically favorable reactions in an overall spontaneous process. Here we show that the electron-transferring flavoprotein (Etf) enzyme family exhibits far greater diversity than previously recognized, and we provide a phylogenetic analysis that clearly delineates bifurcating versus nonbifurcating members of this family. Structural modeling of proteins within these groups reveals key differences between the bifurcating and nonbifurcating Etfs. Copyright © 2017 American Society for Microbiology.
Defining Electron Bifurcation in the Electron-Transferring Flavoprotein Family
Garcia Costas, Amaya M.; Poudel, Saroj; Miller, Anne-Frances; Schut, Gerrit J.; Ledbetter, Rhesa N.; Seefeldt, Lance C.; Adams, Michael W. W.
2017-01-01
ABSTRACT Electron bifurcation is the coupling of exergonic and endergonic redox reactions to simultaneously generate (or utilize) low- and high-potential electrons. It is the third recognized form of energy conservation in biology and was recently described for select electron-transferring flavoproteins (Etfs). Etfs are flavin-containing heterodimers best known for donating electrons derived from fatty acid and amino acid oxidation to an electron transfer respiratory chain via Etf-quinone oxidoreductase. Canonical examples contain a flavin adenine dinucleotide (FAD) that is involved in electron transfer, as well as a non-redox-active AMP. However, Etfs demonstrated to bifurcate electrons contain a second FAD in place of the AMP. To expand our understanding of the functional variety and metabolic significance of Etfs and to identify amino acid sequence motifs that potentially enable electron bifurcation, we compiled 1,314 Etf protein sequences from genome sequence databases and subjected them to informatic and structural analyses. Etfs were identified in diverse archaea and bacteria, and they clustered into five distinct well-supported groups, based on their amino acid sequences. Gene neighborhood analyses indicated that these Etf group designations largely correspond to putative differences in functionality. Etfs with the demonstrated ability to bifurcate were found to form one group, suggesting that distinct conserved amino acid sequence motifs enable this capability. Indeed, structural modeling and sequence alignments revealed that identifying residues occur in the NADH- and FAD-binding regions of bifurcating Etfs. Collectively, a new classification scheme for Etf proteins that delineates putative bifurcating versus nonbifurcating members is presented and suggests that Etf-mediated bifurcation is associated with surprisingly diverse enzymes. 
IMPORTANCE Electron bifurcation has recently been recognized as an electron transfer mechanism used by microorganisms to maximize energy conservation. Bifurcating enzymes couple thermodynamically unfavorable reactions with thermodynamically favorable reactions in an overall spontaneous process. Here we show that the electron-transferring flavoprotein (Etf) enzyme family exhibits far greater diversity than previously recognized, and we provide a phylogenetic analysis that clearly delineates bifurcating versus nonbifurcating members of this family. Structural modeling of proteins within these groups reveals key differences between the bifurcating and nonbifurcating Etfs. PMID:28808132
Shu, Benshui; Zhang, Jingjing; Sethuraman, Veeran; Cui, Gaofeng; Yi, Xin; Zhong, Guohua
2017-10-16
As an important botanical pesticide, azadirachtin demonstrates broad insecticidal activity against many agricultural pests. The results of a previous study indicated the toxicity and apoptosis induction of azadirachtin in Spodoptera frugiperda Sf9 cells. However, the lack of genomic data has hindered a deeper investigation of apoptosis in Sf9 cells at a molecular level. In the present study, complete transcriptome data for the Sf9 cell line were obtained using Illumina sequencing technology, and 97 putative apoptosis-related genes were identified through BLAST and KEGG orthologue annotations. Fragments of potential candidate apoptosis-related genes were cloned, and the mRNA expression patterns of ten identified genes regulated by azadirachtin were examined using qRT-PCR. Furthermore, Western blot analysis showed that six putative apoptosis-related proteins were upregulated after treatment with azadirachtin, while the protein Bcl-2 was downregulated. These data suggested that both intrinsic and extrinsic apoptotic signal pathways comprising the identified potential apoptosis-related genes were potentially active in S. frugiperda. In addition, the preliminary results revealed that caspase-dependent or caspase-independent apoptotic pathways could function in azadirachtin-induced apoptosis in Sf9 cells.
Secretome of Aspergillus oryzae in Shaoxing rice wine koji.
Zhang, Bo; Guan, Zheng-Bing; Cao, Yu; Xie, Guang-Fa; Lu, Jian
2012-04-16
Shaoxing rice wine is the most famous and representative Chinese rice wine. Aspergillus oryzae SU16 is used in the manufacture of koji, the Shaoxing rice wine starter culture. In the current study, a comprehensive analysis of the secretome profile of A. oryzae SU16 in Shaoxing rice wine koji was performed for the first time. The proteomic analysis for the identification of the secretory proteins was done using two-dimensional electrophoresis combined with matrix-assisted laser desorption/ionization-tandem time of flight mass spectrometry based on the annotated A. oryzae genome sequence. A total of 41 unique proteins were identified from the secretome. These proteins included 17 extracellular proteins following the classical secretory pathway, and 10 extracellular proteins putatively secreted by the non-classical secretory pathway. The present secretome profile greatly differed from previous reports on A. oryzae growing in other solid-state nutrient sources. Several new secretory or putative secretory proteins were also found. These proteomic data will significantly aid the advancement of research on the secretome of A. oryzae, especially in solid-state cultures, and in elucidating the production process mechanism of Shaoxing rice wine koji. The findings may promote the technological development and innovation of the Shaoxing rice wine industry. Copyright © 2012 Elsevier B.V. All rights reserved.
dos Reis, Sávio Pinho; Tavares, Liliane de Souza Conceição; Costa, Carinne de Nazaré Monteiro; Brígida, Aílton Borges Santa; de Souza, Cláudia Regina Batista
2012-06-01
Cassava (Manihot esculenta Crantz) is one of the world's most important food crops. It is cultivated mainly in developing countries of the tropics, since its root is a major source of calories for low-income people due to its high productivity and resistance to many abiotic and biotic factors. A previous study identified a partial cDNA sequence coding for a putative RING zinc finger in cassava storage root. The RING zinc finger protein is a specialized type of zinc finger protein found in many organisms. Here, we isolated the full-length cDNA sequence coding for M. esculenta RZF (MeRZF) protein by a combination of 5' and 3' RACE assays. BLAST analysis showed that its deduced amino acid sequence has a high level of similarity to plant proteins of the RZF family. The MeRZF protein contains a signature sequence motif for a RING zinc finger at its C-terminal region. In addition, this protein showed a histidine residue at the fifth coordination site, likely belonging to the RING-H2 subgroup, as confirmed by our phylogenetic analysis. There is also a transmembrane domain in its N-terminal region. Finally, semi-quantitative RT-PCR assays showed that MeRZF expression is increased in detached leaves treated with sodium chloride. Here, we report the first evidence of a RING zinc finger gene of cassava showing a potential role in response to salt stress.
Bamford, Vicki A.; Armour, Maria; Mitchell, Sue A.; Cartron, Michaël; Andrews, Simon C.; Watson, Kimberly A.
2008-01-01
YqjH is a cytoplasmic FAD-containing protein from Escherichia coli; based on homology to ViuB of Vibrio cholerae, it potentially acts as a ferri-siderophore reductase. This work describes its overexpression, purification, crystallization and structure solution at 3.0 Å resolution. YqjH shares high sequence similarity with a number of known siderophore-interacting proteins and its structure was solved by molecular replacement using the siderophore-interacting protein from Shewanella putrefaciens as the search model. The YqjH structure resembles those of other members of the NAD(P)H:flavin oxidoreductase superfamily. PMID:18765906
Behera, Pratiksha; Vaishampayan, Parag; Singh, Nitin K; Mishra, Samir R; Raina, Vishakha; Suar, Mrutyunjay; Pattnaik, Ajit K; Rastogi, Gurdeep
2016-09-01
To date, only one draft genome has been reported within the genus Mangrovibacter. Here, we report the second draft genome shotgun sequence, that of Mangrovibacter sp. strain MP23, which was isolated from the roots of Phragmites karka (P. karka), an invasive weed growing in the Chilika Lagoon, Odisha, India. Strain MP23 is a facultatively anaerobic, nitrogen-fixing endophytic bacterium that grows optimally at 37 °C, pH 7.0, and 1% NaCl concentration. The draft genome sequence of strain MP23 contains 4,947,475 bp with an estimated G + C content of 49.9% and a total of 4,392 protein-coding genes. The genome sequence has provided information on putative genes that code for proteins involved in oxidative stress, uptake of nutrients, and nitrogen fixation that might offer niche-specific ecological fitness and explain the invasive success of P. karka in Chilika Lagoon. The draft genome sequence and annotation have been deposited at DDBJ/EMBL/GenBank under the accession number LYRP00000000.
Identification and preliminary characterization of a protein motif related to the zinc finger.
Lovering, R; Hanson, I M; Borden, K L; Martin, S; O'Reilly, N J; Evan, G I; Rahman, D; Pappin, D J; Trowsdale, J; Freemont, P S
1993-01-01
We have identified a protein motif, related to the zinc finger, which defines a newly discovered family of proteins. The motif was found in the sequence of the human RING1 gene, which is proximal to the major histocompatibility complex region on chromosome six. We propose naming this motif the "RING finger" and it is found in 27 proteins, all of which have putative DNA binding functions. We have synthesized a peptide corresponding to the RING1 motif and examined a number of properties, including metal and DNA binding. We provide evidence to support the suggestion that the RING finger motif is the DNA binding domain of this newly defined family of proteins. PMID:7681583
The Nucleotide Sequence and Spliced pol mRNA Levels of the Nonprimate Spumavirus Bovine Foamy Virus
Holzschu, Donald L.; Delaney, Mari A.; Renshaw, Randall W.; Casey, James W.
1998-01-01
We have determined the complete nucleotide sequence of a replication-competent clone of bovine foamy virus (BFV) and have quantitated the amount of spliced pol mRNA processed early in infection. The 544-amino-acid Gag protein precursor has little sequence similarity with its primate foamy virus homologs, but the putative nucleocapsid (NC) protein, like the primate NCs, contains the three glycine-arginine-rich regions that are postulated to bind genomic RNA during virion assembly. The BFV gag and pol open reading frames overlap, with pro and pol in the same translational frame. As with the human foamy virus (HFV) and feline foamy virus, we have detected a spliced pol mRNA by PCR. Quantitatively, this mRNA approximates the level of full-length genomic RNA early in infection. The integrase (IN) domain of reverse transcriptase does not contain the canonical HH-CC zinc finger motif present in all characterized retroviral INs, but it does contain a nearby histidine residue that could conceivably participate as a member of the zinc finger. The env gene encodes a protein that is over 40% identical in sequence to the HFV Env. By comparison, the Gag precursor of BFV is predicted to be only 28% identical to the HFV protein. PMID:9499074
Romanutti, Carina; Gallo Calderón, Marina; Keller, Leticia; Mattion, Nora; La Torre, José
2016-02-01
During 2007-2014, 84 out of 236 (35.6%) samples from domestic dogs submitted to our laboratory for diagnostic purposes were positive for Canine Distemper Virus (CDV), as analyzed by RT-PCR amplification of a fragment of the nucleoprotein gene. Fifty-nine of them (70.2%) were from dogs that had been vaccinated against CDV. The full-length gene encoding the Fusion (F) protein of fifteen isolates was sequenced and compared with those of other CDVs, including wild-type and vaccine strains. Phylogenetic analysis using the F gene full-length sequences grouped all the Argentinean CDV strains in the SA2 clade. Sequence identity with the Onderstepoort vaccine strain was 89.0-90.6%, and the highest divergence was found in the 135 amino acids corresponding to the F protein signal-peptide, Fsp (64.4-66.7% identity). In contrast, this region was highly conserved among the local strains (94.1-100% identity). One extra putative N-glycosylation site was identified in the F gene of CDV Argentinean strains with respect to the vaccine strain. The present report is the first to analyze full-length F protein sequences of CDV strains circulating in Argentina, and contributes to the knowledge of molecular epidemiology of CDV, which may help in understanding future disease outbreaks. Copyright © 2015 Elsevier B.V. All rights reserved.
Song, Wen Jun; Qin, Qi Wei; Qiu, Jin; Huang, Can Hua; Wang, Fan; Hew, Choy Leong
2004-01-01
Here we report the complete genome sequence of Singapore grouper iridovirus (SGIV). Sequencing of the random shotgun and restriction endonuclease genomic libraries showed that the entire SGIV genome consists of 140,131 nucleotide bp. One hundred sixty-two open reading frames (ORFs) from the sense and antisense DNA strands, coding for lengths varying from 41 to 1,268 amino acids, were identified. Computer-assisted analyses of the deduced amino acid sequences revealed that 77 of the ORFs exhibited homologies to known virus genes, 23 of which matched functional iridovirus proteins. Forty-two putative conserved domains or signatures were detected in the National Center for Biotechnology Information CD-Search database and PROSITE database. An assortment of enzyme activities involved in DNA replication, transcription, nucleotide metabolism, cell signaling, etc., were identified. Viruses were cultured on a cell line derived from the embryonated egg of the grouper Epinephelus tauvina, isolated, and purified by sucrose gradient ultracentrifugation. The protein extract from the purified virions was analyzed by polyacrylamide gel electrophoresis followed by in-gel digestion of protein bands. Matrix-assisted laser desorption ionization-time of flight mass spectrometry and database searching led to identification of 26 proteins. Twenty of these represented novel or previously unidentified genes, which were further confirmed by reverse transcription-PCR (RT-PCR) and DNA sequencing of their respective RT-PCR products. PMID:15507645
Tramontano, A; Macchiato, M F
1986-01-01
An algorithm to determine the probability that a reading frame codes for a protein is presented. It is based on the results of our previous studies on the thermodynamic characteristics of a translated reading frame. We also develop a prediction procedure to distinguish between coding and non-coding reading frames. The procedure is based on the characteristics of the putative product of the DNA sequence and not on periodicity characteristics of the sequence, so the prediction is not biased by the presence of overlapping translated reading frames or by the presence of translated reading frames on the complementary DNA strand. PMID:3753761
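The abstract above does not describe the thermodynamic scoring itself, so the following is only a sketch of the step that precedes any such scoring: listing candidate reading frames on both strands, including overlapping ones. The function names, the ATG-to-stop definition of a frame, and the standard stop codons are assumptions made for illustration and are not taken from the paper.

```python
# Illustrative sketch only: enumerate ATG...stop stretches in all six frames.
# This is NOT the thermodynamic algorithm of the abstract; it only produces
# the candidate frames that a coding/non-coding classifier would then score.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOPS = {"TAA", "TAG", "TGA"}  # standard genetic code assumed


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def orfs_in_frame(seq, frame, min_codons=1):
    """Yield (start, end) for each ATG...stop stretch in one frame.

    `end` is exclusive and includes the stop codon; `min_codons` counts
    the codons before the stop.
    """
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i
        elif start is not None and codon in STOPS:
            if (i - start) // 3 >= min_codons:
                yield (start, i + 3)
            start = None


def six_frame_orfs(seq, min_codons=1):
    """Return (strand, frame, start, end) tuples for both strands.

    Coordinates on the "-" strand refer to the reverse-complemented sequence.
    """
    seq = seq.upper()
    found = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for start, end in orfs_in_frame(s, frame, min_codons):
                found.append((strand, frame, start, end))
    return found
```

Because every frame is scanned independently, overlapping frames and frames on the complementary strand are all reported, and each candidate can then be scored on its own.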
Sharan, Malvika; Förstner, Konrad U; Eulalio, Ana; Vogel, Jörg
2017-06-20
RNA-binding proteins (RBPs) have been established as core components of several post-transcriptional gene regulation mechanisms. Experimental techniques such as cross-linking and co-immunoprecipitation have enabled the identification of RBPs, RNA-binding domains (RBDs) and their regulatory roles in the eukaryotic species such as human and yeast in large-scale. In contrast, our knowledge of the number and potential diversity of RBPs in bacteria is poorer due to the technical challenges associated with the existing global screening approaches. We introduce APRICOT, a computational pipeline for the sequence-based identification and characterization of proteins using RBDs known from experimental studies. The pipeline identifies functional motifs in protein sequences using position-specific scoring matrices and Hidden Markov Models of the functional domains and statistically scores them based on a series of sequence-based features. Subsequently, APRICOT identifies putative RBPs and characterizes them by several biological properties. Here we demonstrate the application and adaptability of the pipeline on large-scale protein sets, including the bacterial proteome of Escherichia coli. APRICOT showed better performance on various datasets compared to other existing tools for the sequence-based prediction of RBPs by achieving an average sensitivity and specificity of 0.90 and 0.91 respectively. The command-line tool and its documentation are available at https://pypi.python.org/pypi/bio-apricot. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
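As a rough illustration of two ideas named in the abstract above, position-specific scoring matrices and sensitivity/specificity, a minimal sketch follows. It does not use the APRICOT package or reproduce its scoring; the function names, the uniform background, the pseudocount, and the toy alphabet in the usage note are all assumptions made for this example.

```python
import math


def build_pssm(aligned_sites, alphabet, pseudocount=1.0):
    """Build a log-odds PSSM from equal-length aligned sequences.

    A uniform background over `alphabet` is assumed (an illustrative choice).
    """
    length = len(aligned_sites[0])
    background = 1.0 / len(alphabet)
    pssm = []
    for pos in range(length):
        column = [site[pos] for site in aligned_sites]
        total = len(column) + pseudocount * len(alphabet)
        scores = {}
        for letter in alphabet:
            freq = (column.count(letter) + pseudocount) / total
            scores[letter] = math.log2(freq / background)
        pssm.append(scores)
    return pssm


def best_window(pssm, sequence):
    """Return (score, offset) of the highest-scoring window in `sequence`.

    Returns None if the sequence is shorter than the matrix; every letter
    of `sequence` must be in the alphabet used to build the matrix.
    """
    width = len(pssm)
    best = None
    for offset in range(len(sequence) - width + 1):
        score = sum(pssm[i][sequence[offset + i]] for i in range(width))
        if best is None or score > best[0]:
            best = (score, offset)
    return best


def sensitivity_specificity(tp, fp, tn, fn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)
```

For instance, with a toy four-letter alphabet, `best_window(build_pssm(["ACD", "ACD", "ACE"], "ACDE"), "EEACDEE")` locates the window starting at offset 2. The counts passed to `sensitivity_specificity` would come from comparing predictions against a labelled benchmark set.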
Christie, Andrew E.; Fontanilla, Tiana M.; Nesbit, Katherine T.; Lenz, Petra H.
2013-01-01
Diel vertical migration and seasonal diapause are critical life history events for the copepod Calanus finmarchicus. While much is known about these behaviors phenomenologically, little is known about their molecular underpinnings. Recent studies in insects suggest that some circadian genes/proteins also contribute to the establishment of seasonal diapause. Thus, it is possible that in Calanus these distinct timing regimes share some genetic components. To begin to address this possibility, we used the well-established Drosophila melanogaster circadian system as a reference for mining clock transcripts from a 200,000+ sequence Calanus transcriptome; the proteins encoded by the identified transcripts were also deduced and characterized. Sequences encoding homologs of the Drosophila core clock proteins CLOCK, CYCLE, PERIOD and TIMELESS were identified, as was one encoding CRYPTOCHROME 2, a core clock protein in ancestral insect systems, but absent in Drosophila. Calanus transcripts encoding proteins known to modulate the Drosophila core clock were also identified and characterized, e.g. CLOCKWORK ORANGE, DOUBLETIME, SHAGGY and VRILLE. Alignment and structural analyses of the deduced Calanus proteins with their Drosophila counterparts revealed extensive sequence conservation, particularly in functional domains. Interestingly, reverse BLAST analyses of these sequences against all arthropod proteins typically revealed non-Drosophila isoforms to be most similar to the Calanus queries. This, in combination with the presence of both CRYPTOCHROME 1 (a clock input pathway protein) and CRYPTOCHROME 2 in Calanus, suggests that the organization of the copepod circadian system is an ancestral one, more similar to that of insects like Danaus plexippus than to that of Drosophila. PMID:23727418
Baird, Fiona J.; Su, Xiaopei; Aibinu, Ibukun; Nolan, Matthew J.; Sugiyama, Hiromu; Otranto, Domenico
2016-01-01
Background: Food-borne nematodes of the genus Anisakis are responsible for a wide range of illnesses (= anisakiasis), from self-limiting gastrointestinal forms to severe systemic allergic reactions, which are often misdiagnosed and under-reported. In order to enhance and refine current diagnostic tools for anisakiasis, knowledge of the whole spectrum of parasite molecules transcribed and expressed by this parasite, including those acting as potential allergens, is necessary. Methodology/Principal Findings: In this study, we employ high-throughput (Illumina) sequencing and bioinformatics to characterise the transcriptomes of two Anisakis species, A. simplex and A. pegreffii, and utilize this resource to compile lists of potential allergens from these parasites. A total of ~65,000,000 reads were generated from cDNA libraries for each species, and assembled into ~34,000 transcripts (= Unigenes); ~18,000 peptides were predicted from each cDNA library and classified based on homology searches, protein motifs and gene ontology and biological pathway mapping. Using comparative analyses with sequence data available in public databases, 36 (A. simplex) and 29 (A. pegreffii) putative allergens were identified, including sequences encoding ‘novel’ Anisakis allergenic proteins (i.e. cyclophilins and ABA-1 domain containing proteins). Conclusions/Significance: This study represents a first step towards providing the research community with a curated dataset to use as a molecular resource for future investigations of the biology of Anisakis, including molecules putatively acting as allergens, using functional genomics, proteomics and immunological tools. Ultimately, an improved knowledge of the biological functions of these molecules in the parasite, as well as of their immunogenic properties, will assist the development of comprehensive, reliable and robust diagnostic tools. PMID:27472517
Innate Immune Complexity in the Purple Sea Urchin: Diversity of the Sp185/333 System
Smith, L. Courtney
2012-01-01
The California purple sea urchin, Strongylocentrotus purpuratus, is a long-lived echinoderm with a complex and sophisticated innate immune system. There are several large gene families that function in immunity in this species including the Sp185/333 gene family that has ∼50 (±10) members. The family shows intriguing sequence diversity and encodes a broad array of diverse yet similar proteins. The genes have two exons of which the second encodes the mature protein and has repeats and blocks of sequence called elements. Mosaics of element patterns plus single nucleotide polymorphism-based variants of the elements result in significant sequence diversity among the genes yet maintain a similar structure among the members of the family. Sequence of a bacterial artificial chromosome insert shows a cluster of six, tightly linked Sp185/333 genes that are flanked by GA microsatellites. The sequences between the GA microsatellites, in which the Sp185/333 genes and flanking regions are located, are much more similar to each other than are the sequences outside the microsatellites, suggesting processes such as gene conversion, recombination, or duplication. However, close linkage does not correspond with greater sequence similarity compared to randomly cloned and sequenced genes that are unlikely to be linked. There are three segmental duplications that are bounded by GAT microsatellites and include three almost identical genes plus flanking regions. RNA editing is detectable throughout the mRNAs based on comparisons to the genes, which, in combination with putative post-translational modifications to the proteins, results in broad arrays of Sp185/333 proteins that differ among individuals. The mature proteins have an N-terminal glycine-rich region, a central RGD motif, and a C-terminal histidine-rich region. The Sp185/333 proteins are localized to the cell surface and are found within vesicles in subsets of polygonal and small phagocytes. The coelomocyte proteome shows full-length and truncated proteins, including some with missense sequence. Current results suggest that both native Sp185/333 proteins and a recombinant protein bind bacteria and are likely important in sea urchin innate immunity. PMID:22566951
Austin, Christopher M; Tan, Mun Hua; Lee, Yin Peng; Croft, Laurence J; Meekan, Mark G; Pierce, Simon J; Gan, Han Ming
2016-01-01
The complete mitochondrial genome of the parasitic copepod Pandarus rhincodonicus was obtained from a partial genome scan using the HiSeq sequencing system. The Pandarus rhincodonicus mitogenome has 14,480 base pairs (62% A+T content) made up of 12 protein-coding genes, 2 ribosomal subunit genes, 22 transfer RNAs, and a putative 384 bp non-coding AT-rich region. This Pandarus mitogenome sequence is the first for the family Pandaridae, the second for the order Siphonostomatoida and the sixth for the Copepoda.
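The A+T figure quoted above is a simple base-composition statistic. A minimal sketch of how such a value can be computed from a sequence string follows; the function name and the choice to ignore ambiguous bases such as N are assumptions made for illustration, not details from the study.

```python
def at_fraction(seq):
    """Return the fraction of A and T among unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are ignored (an illustrative choice).
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "AT" for b in bases) / len(bases)
```

Applied to a full mitogenome sequence, multiplying the result by 100 gives a percentage comparable to the reported A+T content.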
Feng, X; Happ, G M
1996-11-14
The cDNA for Sp23, a structural protein of the spermatophore of Tenebrio molitor, had been previously cloned and characterized (Paesen, G.C., Schwartz, M.B., Peferoen, M., Weyda, F. and Happ, G.M. (1992a) Amino acid sequence of Sp23, a structure protein of the spermatophore of the mealworm beetle, Tenebrio molitor. J. Biol. Chem. 257, 18852-18857). Using the labeled cDNA for Sp23 as a probe to screen a library of genomic DNA from Tenebrio molitor, we isolated a genomic clone for Sp23. A 5373-base pair (bp) restriction fragment containing the Sp23 gene was sequenced. The coding region is separated by a 55-bp intron which is located close to the translation start site. Three putative ecdysone response elements (EcRE) are identified in the 5' flanking region of the Sp23 gene. Comparison of the flanking regions of the Sp23 gene with those of the D-protein gene expressed in the accessory glands of Tenebrio reveals similar sequences present in the flanking regions of the two genes. The genomic organization of the coding region of the Sp23 gene shares similarities with that of the D-protein gene, three Drosophila accessory gland genes and two Drosophila 20-OH ecdysone-responsive genes.
AboElkhair, M; Iwamoto, T; Clark, K F; McKenna, P; Siah, A; Greenwood, S J; Berthe, F C J; Casey, J W; Cepica, A
2012-01-01
Haemic neoplasia (HN) is a leukemia-like disease that affects at least 20 species of marine bivalves including the soft shell clam, Mya arenaria. Although the disease was discovered in 1969, its etiology remains unknown. A retroviral etiology has been suggested based on the detection of reverse transcriptase activity and electron microscopic observation of retroviral-like particles using negative staining. To date, however, no virus isolate and no retroviral sequence from HN has been obtained. Moreover, transmission of the disease by cell-free filtrate from affected clams has not been reproduced. In the current study, we reinvestigated the association of HN with a putative retrovirus. Sucrose gradient centrifugation followed by assessment of reverse transcriptase activity, electrophoretic analysis of protein and RNA, and electron microscopic examinations of fractions corresponding to retroviral density were employed. Detection of retroviral pol sequences using degenerate RT-PCR approaches was also attempted. Our results showed visible bands at the expected density of retrovirus in both HN-positive and HN-negative clam tissues, both with reverse transcriptase activity. Electron microscopy, RNA analysis, protein analysis, and PCR systems targeting the pol gene of retroviruses did not, however, provide clear evidence supporting the presence of a retrovirus. We point out that the retrovirus etiology of HN of Mya arenaria proposed some 25 years ago should be reconsidered in the absence of a virus isolate or virus sequences. Copyright © 2011 Elsevier Inc. All rights reserved.
Cho, Min Seok; Joh, Kiseong; Ahn, Tae-Young; Park, Dong Suk
2014-09-01
Escherichia coli serotype O157 is still a major global healthcare problem. However, only limited information is currently available on the molecular and serological detection of pathogenic bacteria. Therefore, the development of appropriate strategies for their rapid identification and monitoring is still needed. In general, sequence analysis based on the stx, slt, eae, hlyA, rfb, and fliCh7 genes is widely employed for the identification of E. coli serotype O157, but this approach has critical defects for diagnosis and identification, in that these genes are also present in other E. coli serogroups. In this study, NCBI-BLAST searches using the nucleotide sequences of the putative regulatory protein gene from E. coli O157:H7 str. Sakai found sequence differences at the serotype level. Specific primers from the putative regulatory protein gene were designed and investigated for their sensitivity and specificity for detecting the pathogen in environmental water samples. The specificity of the primer set was evaluated using genomic DNA from 8 isolates of E. coli serotype O157 and 32 other reference strains. In addition, the sensitivity and specificity of this assay were confirmed by successful identification of E. coli serotype O157 in environmental water samples. In conclusion, this study showed that the newly developed quantitative serotype-specific PCR method is a highly specific and efficient tool for the surveillance and rapid detection of high-risk E. coli serotype O157.
Watanabe, Yoh-ichi; Gray, Michael W.
2000-01-01
A reverse transcription–polymerase chain reaction (RT–PCR) approach was used to clone a cDNA encoding the Euglena gracilis homolog of yeast Cbf5p, a protein component of the box H/ACA class of snoRNPs that mediate pseudouridine formation in eukaryotic rRNA. Cbf5p is a putative pseudouridine synthase, and the Euglena homolog is the first full-length Cbf5p sequence to be reported for an early diverging unicellular eukaryote (protist). Phylogenetic analysis of putative pseudouridine synthase sequences confirms that archaebacterial and eukaryotic (including Euglena) Cbf5p proteins are specifically related and are distinct from the TruB/Pus4p clade that is responsible for formation of pseudouridine at position 55 in eubacterial (TruB) and eukaryotic (Pus4p) tRNAs. Using a bioinformatics approach, we also identified archaebacterial genes encoding candidate homologs of yeast Gar1p and Nop10p, two additional proteins known to be associated with eukaryotic box H/ACA snoRNPs. These observations raise the possibility that pseudouridine formation in archaebacterial rRNA may be dependent on analogs of the eukaryotic box H/ACA snoRNPs, whose evolutionary origin may therefore predate the split between Archaea (archaebacteria) and Eucarya (eukaryotes). Database searches further revealed, in archaebacterial and some eukaryotic genomes, two previously unrecognized groups of genes (here designated ‘PsuX’ and ‘PsuY’) distantly related to the Cbf5p/TruB gene family. PMID:10871366
Nyarko, Afua; Singarapu, Kiran K.; Figueroa, Melania; Manning, Viola A.; Pandelova, Iovanna; Wolpert, Thomas J.; Ciuffetti, Lynda M.; Barbar, Elisar
2014-01-01
Pyrenophora tritici-repentis Ptr ToxB (ToxB) is a proteinaceous host-selective toxin produced by Pyrenophora tritici-repentis (P. tritici-repentis), a plant pathogenic fungus that causes the disease tan spot of wheat. One feature that distinguishes ToxB from other host-selective toxins is that it has naturally occurring homologs in non-pathogenic P. tritici-repentis isolates that lack toxic activity. There are no high-resolution structures for any of the ToxB homologs, or for any protein with >30% sequence identity, and therefore what underlies activity remains an open question. Here, we present the NMR structures of ToxB and its inactive homolog Ptr toxb. Both proteins adopt a β-sandwich fold comprising three strands in each half that are bridged together by two disulfide bonds. The inactive toxb, however, shows higher flexibility localized to the sequence-divergent β-sandwich half. The absence of toxic activity is attributed to a more open structure in the vicinity of one disulfide bond, higher flexibility, and residue differences in an exposed loop that likely impacts interaction with putative targets. We propose that activity is regulated by perturbations in a putative active site loop and changes in dynamics distant from the site of activity. Interestingly, the new structures identify AvrPiz-t, a secreted avirulence protein produced by the rice blast fungus, as a structural homolog to ToxB. This homology suggests that fungal proteins involved in either disease susceptibility such as ToxB or resistance such as AvrPiz-t may have a common evolutionary origin. PMID:25063993
Reid, S D; Green, N M; Buss, J K; Lei, B; Musser, J M
2001-06-19
Species of pathogenic microbes are composed of an array of evolutionarily distinct chromosomal genotypes characterized by diversity in gene content and sequence (allelic variation). The occurrence of substantial genetic diversity has hindered progress in developing a comprehensive understanding of the molecular basis of virulence and new therapeutics such as vaccines. To provide new information that bears on these issues, 11 genes encoding extracellular proteins in the human bacterial pathogen group A Streptococcus identified by analysis of four genomes were studied. Eight of the 11 genes encode proteins with a LPXTG(L) motif that covalently links Gram-positive virulence factors to the bacterial cell surface. Sequence analysis of the 11 genes in 37 geographically and phylogenetically diverse group A Streptococcus strains cultured from patients with different infection types found that recent horizontal gene transfer has contributed substantially to chromosomal diversity. Regions of the inferred proteins likely to interact with the host were identified by molecular population genetic analysis, and Western immunoblot analysis with sera from infected patients confirmed that they were antigenic. Real-time reverse transcriptase-PCR (TaqMan) assays found that transcription of six of the 11 genes was substantially up-regulated in the stationary phase. In addition, transcription of many genes was influenced by the covR and mga trans-acting gene regulatory loci. Multilocus investigation of putative virulence genes by the integrated approach described herein provides an important strategy to aid microbial pathogenesis research and rapidly identify new targets for therapeutics research.
Gillot, Guillaume; Jany, Jean-Luc; Dominguez-Santos, Rebeca; Poirier, Elisabeth; Debaets, Stella; Hidalgo, Pedro I; Ullán, Ricardo V; Coton, Emmanuel; Coton, Monika
2017-04-01
Mycophenolic acid (MPA) is a secondary metabolite produced by various Penicillium species including Penicillium roqueforti. The MPA biosynthetic pathway was recently described in Penicillium brevicompactum. In this study, an in silico analysis of the P. roqueforti FM164 genome sequence localized a 23.5-kb putative MPA gene cluster. The cluster contains seven genes putatively encoding seven proteins (MpaA, MpaB, MpaC, MpaDE, MpaF, MpaG, MpaH) and is highly similar (i.e. gene synteny, sequence homology) to the P. brevicompactum cluster. To confirm the involvement of this gene cluster in MPA biosynthesis, gene silencing using RNA interference targeting mpaC, encoding a putative polyketide synthase, was performed in a high MPA-producing P. roqueforti strain (F43-1). In the obtained transformants, decreased MPA production (measured by LC-Q-TOF/MS) was correlated with reduced mpaC gene expression as measured by Q-RT-PCR. In parallel, mycotoxin quantification on multiple P. roqueforti strains suggested strain-dependent MPA production. Thus, the entire MPA cluster was sequenced for P. roqueforti strains with contrasting MPA production, and a 174 bp deletion in mpaC was observed in low MPA-producers. PCRs directed towards the deleted region among 55 strains showed an excellent correlation with MPA quantification. Our results indicated the clear involvement of the mpaC gene as well as the surrounding cluster in P. roqueforti MPA biosynthesis. Copyright © 2016 Elsevier Ltd. All rights reserved.
Putative cross-kingdom horizontal gene transfer in sponge (Porifera) mitochondria.
Rot, Chagai; Goldfarb, Itay; Ilan, Micha; Huchon, Dorothée
2006-09-14
The mitochondrial genome of Metazoa is usually a compact molecule without introns. Exceptions to this rule have been reported only in corals and sea anemones (Cnidaria), in which group I introns have been discovered in the cox1 and nad5 genes. Here we show several lines of evidence demonstrating that introns can also be found in the mitochondria of sponges (Porifera). A 2,349 bp fragment of the mitochondrial cox1 gene was sequenced from the sponge Tetilla sp. (Spirophorida). This fragment suggests the presence of a 1,143 bp intron. Similar to all the cnidarian mitochondrial introns, the putative intron has group I intron characteristics. The intron is present in the cox1 gene and encodes a putative homing endonuclease. In order to establish the distribution of this intron in sponges, the cox1 gene was sequenced from several representatives of the demosponge diversity. The intron was found only in the sponge order Spirophorida. A phylogenetic analysis of the COI protein sequence and of the intron open reading frame suggests that the intron may have been transmitted horizontally from a fungus donor. Little is known about sponge-associated fungi, although in the last few years the latter have been frequently isolated from sponges. We suggest that the horizontal gene transfer of a mitochondrial intron was facilitated by a symbiotic relationship between fungus and sponge. Ecological relationships are known to have implications at the genomic level. Here, an ecological relationship between sponge and fungus is suggested based on the genomic analysis.
Isvoran, Adriana; Craciun, Dana; Martiny, Virginie; Sperandio, Olivier; Miteva, Maria A
2013-06-14
Protein-Protein Interactions (PPIs) are key for many cellular processes. The characterization of PPI interfaces and the prediction of putative ligand binding sites and hot spot residues are essential to design efficient small-molecule modulators of PPI. Terphenyl and its derivatives are small organic molecules known to mimic one face of protein-binding alpha-helical peptides. In this work we focus on several PPIs mediated by alpha-helical peptides. We performed computational sequence- and structure-based analyses in order to evaluate several key physicochemical and surface properties of proteins known to interact with alpha-helical peptides and/or terphenyl and its derivatives. Sequence-based analysis revealed low sequence identity between some of the analyzed proteins binding alpha-helical peptides. Structure-based analysis was performed to calculate the volume, the fractal dimension roughness and the hydrophobicity of the binding regions. Besides the overall hydrophobic character of the binding pockets, some specificities were detected. We showed that hydrophobicity is not uniformly distributed in different alpha-helix binding pockets, which can help to identify key hydrophobic hot spots. The presence of hydrophobic cavities at the protein surface with a more complex shape than the entire protein surface seems to be an important property related to the ability of proteins to bind alpha-helical peptides and low molecular weight mimetics. Characterization of similarities and specificities of PPI binding sites can be helpful for further development of small molecules targeting alpha-helix binding proteins.
Kato, Hirotomo; Jochim, Ryan C.; Gomez, Eduardo A.; Sakoda, Ryo; Iwata, Hiroyuki; Valenzuela, Jesus G.; Hashiguchi, Yoshihisa
2010-01-01
Triatoma (T.) dimidiata is a hematophagous Hemiptera and a main vector of Chagas disease. The saliva of this and other blood-sucking insects contains potent pharmacologically active components that assist them in counteracting the host hemostatic and inflammatory systems during blood feeding. To describe the repertoire of potential bioactive salivary molecules from this insect, a number of randomly selected transcripts from the salivary gland cDNA library of T. dimidiata were sequenced and analyzed. This analysis showed that 77.5% of the isolated transcripts coded for putative secreted proteins, and 89.9% of these coded for variants of the lipocalin family proteins. The most abundant transcript was a homologue of procalin, the major allergen of T. protracta saliva, and contributed more than 50% of the transcripts coding for putative secreted proteins, suggesting that it may play an important role in the blood-feeding process. Other salivary transcripts encoding lipocalin family proteins had homology to triabin (a thrombin inhibitor), triafestin (an inhibitor of kallikrein–kinin system), pallidipin (an inhibitor of collagen-induced platelet aggregation) and others with unknown function. PMID:19900580
Proteins interacting with cloning scars: a source of false positive protein-protein interactions.
Banks, Charles A S; Boanca, Gina; Lee, Zachary T; Florens, Laurence; Washburn, Michael P
2015-02-23
A common approach for exploring the interactome, the network of protein-protein interactions in cells, uses a commercially available ORF library to express affinity tagged bait proteins; these can be expressed in cells and endogenous cellular proteins that copurify with the bait can be identified as putative interacting proteins using mass spectrometry. Control experiments can be used to limit false-positive results, but in many cases, there are still a surprising number of prey proteins that appear to copurify specifically with the bait. Here, we have identified one source of false-positive interactions in such studies. We have found that a combination of: 1) the variable sequence of the C-terminus of the bait with 2) a C-terminal valine "cloning scar" present in a commercially available ORF library, can in some cases create a peptide motif that results in the aberrant co-purification of endogenous cellular proteins. Control experiments may not identify false positives resulting from such artificial motifs, as aberrant binding depends on sequences that vary from one bait to another. It is possible that such cryptic protein binding might occur in other systems using affinity tagged proteins; this study highlights the importance of conducting careful follow-up studies where novel protein-protein interactions are suspected.
Proteins interacting with cloning scars: a source of false positive protein-protein interactions
Banks, Charles A. S.; Boanca, Gina; Lee, Zachary T.; Florens, Laurence; Washburn, Michael P.
2015-01-01
A common approach for exploring the interactome, the network of protein-protein interactions in cells, uses a commercially available ORF library to express affinity tagged bait proteins; these can be expressed in cells and endogenous cellular proteins that copurify with the bait can be identified as putative interacting proteins using mass spectrometry. Control experiments can be used to limit false-positive results, but in many cases, there are still a surprising number of prey proteins that appear to copurify specifically with the bait. Here, we have identified one source of false-positive interactions in such studies. We have found that a combination of: 1) the variable sequence of the C-terminus of the bait with 2) a C-terminal valine “cloning scar” present in a commercially available ORF library, can in some cases create a peptide motif that results in the aberrant co-purification of endogenous cellular proteins. Control experiments may not identify false positives resulting from such artificial motifs, as aberrant binding depends on sequences that vary from one bait to another. It is possible that such cryptic protein binding might occur in other systems using affinity tagged proteins; this study highlights the importance of conducting careful follow-up studies where novel protein-protein interactions are suspected. PMID:25704442
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fernández-Sainz, I.J.; Largo, E.; Gladue, D.P.
E2, along with E(rns) and E1, is an envelope glycoprotein of Classical Swine Fever Virus (CSFV). E2 is involved in several virus functions: cell attachment, host range susceptibility and virulence in natural hosts. Here we evaluate the role of a specific E2 region, (818)CPIGWTGVIEC(828), containing a putative fusion peptide (FP) sequence. Reverse genetics utilizing a full-length infectious clone of the highly virulent CSFV strain Brescia (BICv) was used to evaluate how individual amino acid substitutions within this region of E2 may affect replication of BICv. A synthetic peptide representing the complete E2 FP amino acid sequence adopted a β-type extended conformation in membrane mimetics, penetrated into model membranes, and perturbed lipid bilayer integrity in vitro. Similar peptides harboring amino acid substitutions adopted comparable conformations but exhibited different membrane activities. Therefore, a preliminary characterization of the putative FP (818)CPIGWTGVIEC(828) indicates a membrane fusion activity and a critical role in virus replication. - Highlights: • A putative fusion peptide (FP) region in CSFV E2 protein was shown to be critical for virus growth. • Synthetic FPs were shown to efficiently penetrate into lipid membranes using an in vitro model. • Individual residues in the FP affecting virus replication were identified by reverse genetics. • The same FP residues are also responsible for mediating membrane fusion.
Elfassi, E; Haseltine, W A; Dienstag, J L
1986-01-01
The genome of the hepatitis B virus (HBV) contains a sequence, designated X, capable of encoding a protein of 154 amino acids. To determine whether the putative protein synthesized from this region is antigenic, we examined the sera of HBV-infected patients for the ability to react with a hybrid protein that contained 133 amino acids encoded by the X region and portions of the bacterial ompF and beta-galactosidase genes. Some HBV-positive sera tested contained antibodies that specifically recognized the hybrid protein. All sera were from patients diagnosed as suffering from chronic active hepatitis. We conclude that the X region of HBV encodes a protein and that this protein is antigenic in some patients. PMID:3515347
Onishi, M; Tachi, H; Kojima, T; Shiraiwa, M; Takahara, H
2006-10-01
We identified a novel salt-inducible soybean gene encoding an acidic isoform of pathogenesis-related protein group 5 (PR-5 protein). The soybean PR-5-homologous gene, designated Glycine max osmotin-like protein, acidic isoform (GmOLPa), encodes a putative polypeptide having an N-terminal signal peptide. The mature GmOLPa protein without the signal peptide has a calculated molecular mass of 21.5 kDa and a pI value of 4.4, and was distinguishable from a known PR-5-homologous gene of soybean (namely P21 protein) through examination of the structural features. A comparison with two intracellular salt-inducible PR-5 proteins, tobacco osmotin and tomato NP24, revealed that GmOLPa did not have a C-terminal extension sequence functioning as a vacuole-targeting motif. The GmOLPa gene was transcribed constitutively in the soybean root and was induced almost exclusively in the root during 24 h of high-salt stress (300 mM NaCl). Interestingly, GmOLPa gene expression in the stem and leaf, not observed until 24 h, was markedly induced at 48 and 72 h after commencement of the high-salt stress. Abscisic acid (ABA) and dehydration also induced expression of the GmOLPa gene in the root; additionally, dehydration slightly induced expression in the stem and leaf. In fact, the 5'-upstream sequence of the GmOLPa gene contained several putative cis-elements known to be involved in responsiveness to ABA and dehydration, e.g. ABA-responsive element (ABRE), MYB/MYC, and low temperature-responsive element (LTRE). These results suggested that GmOLPa may function as a protective PR-5 protein in the extracellular space of the soybean root in response to high-salt stress and dehydration.
Simulating protein folding initiation sites using an alpha-carbon-only knowledge-based force field
Buck, Patrick M.; Bystroff, Christopher
2015-01-01
Protein folding is a hierarchical process where structure forms locally first, then globally. Some short sequence segments initiate folding through strong structural preferences that are independent of their three-dimensional context in proteins. We have constructed a knowledge-based force field in which the energy functions are conditional on local sequence patterns, as expressed in the hidden Markov model for local structure (HMMSTR). Carbon-alpha force field (CALF) builds sequence specific statistical potentials based on database frequencies for α-carbon virtual bond opening and dihedral angles, pairwise contacts and hydrogen bond donor-acceptor pairs, and simulates folding via Brownian dynamics. We introduce hydrogen bond donor and acceptor potentials as α-carbon probability fields that are conditional on the predicted local sequence. Constant temperature simulations were carried out using 27 peptides selected as putative folding initiation sites, each 12 residues in length, representing several different local structure motifs. Each 0.6 μs trajectory was clustered based on structure. Simulation convergence or representativeness was assessed by subdividing trajectories and comparing clusters. For 21 of the 27 sequences, the largest cluster made up more than half of the total trajectory. Of these 21 sequences, 14 had cluster centers that were at most 2.6 Å root mean square deviation (RMSD) from their native structure in the corresponding full-length protein. To assess the adequacy of the energy function on nonlocal interactions, 11 full length native structures were relaxed using Brownian dynamics simulations. Equilibrated structures deviated from their native states but retained their overall topology and compactness. A simple potential that folds proteins locally and stabilizes proteins globally may enable a more realistic understanding of hierarchical folding pathways. PMID:19137613
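The 2.6 Å criterion above is a root mean square deviation between a cluster center and the native structure. The sketch below shows only the RMSD arithmetic for α-carbon coordinates. It assumes the two structures have already been optimally superposed, since the abstract does not say how superposition was handled. The function name and the toy coordinates are illustrative and do not come from the paper.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) coordinates.

    Assumes the structures are already superposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))

# Toy example: every atom is displaced by the vector (3, 4, 0), i.e. by 5 units.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(x + 3.0, y + 4.0, z) for x, y, z in native]
print(rmsd(native, model))
```

In practice a superposition step (for example a Kabsch fit) would normally precede this calculation; it is omitted here to keep the sketch short.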
Didi, Jennifer; Lemée, Ludovic; Gibert, Laure; Pons, Jean-Louis; Pestel-Caron, Martine
2014-10-01
Staphylococcus lugdunensis is an emergent virulent coagulase-negative staphylococcus responsible for severe infections similar to those caused by Staphylococcus aureus. To understand its potentially pathogenic capacity and have further detailed knowledge of the molecular traits of this organism, 93 isolates from various geographic origins were analyzed by multi-virulence-locus sequence typing (MVLST), targeting seven known or putative virulence-associated loci (atlLR2, atlLR3, hlb, isdJ, SLUG_09050, SLUG_16930, and vwbl). The polymorphisms of the putative virulence-associated loci were moderate and comparable to those of the housekeeping genes analyzed by multilocus sequence typing (MLST). However, the MVLST scheme generated 43 virulence types (VTs) compared to 20 sequence types (STs) based on MLST, indicating that MVLST was significantly more discriminating (Simpson's index [D], 0.943). No hypervirulent lineage or cluster specific to carriage strains was defined. The results of multilocus sequence analysis of known and putative virulence-associated loci are consistent with a clonal population structure for S. lugdunensis, suggesting a coevolution of these genes with housekeeping genes. Indeed, the nonsynonymous to synonymous evolutionary substitutions (dN/dS) ratio, the Tajima's D test, and Single-likelihood ancestor counting (SLAC) analysis suggest that all virulence-associated loci were under negative selection, even atlLR2 (AtlL protein) and SLUG_16930 (FbpA homologue), for which the dN/dS ratios were higher. In addition, this analysis of virulence-associated loci allowed us to propose a trilocus sequence typing scheme based on the intragenic regions of atlLR3, isdJ, and SLUG_16930, which is more discriminant than MLST for studying short-term epidemiology and further characterizing the lineages of the rare but highly pathogenic S. lugdunensis. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
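The discriminatory power quoted above (Simpson's index D) can be computed from the number of isolates assigned to each type. The sketch below assumes the Hunter-Gaston form of the index, which is commonly used for typing schemes; the abstract does not state which variant was applied. The function name and the isolate labels are made up for illustration and are not the study's data.

```python
from collections import Counter

def simpsons_index(type_labels):
    """Hunter-Gaston form of Simpson's index of diversity.

    D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)), where n_j is the number of
    isolates of type j and N is the total number of isolates.
    """
    n_total = len(type_labels)
    if n_total < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(type_labels)
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n_total * (n_total - 1))

# Hypothetical typing result for six isolates (not data from the study).
labels = ["VT1", "VT1", "VT2", "VT3", "VT3", "VT4"]
print(round(simpsons_index(labels), 3))
```

A value near 1 means two isolates drawn at random are very likely to be assigned different types, which is why a scheme yielding more types from the same isolates tends to score higher.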
RICD: a rice indica cDNA database resource for rice functional genomics.
Lu, Tingting; Huang, Xuehui; Zhu, Chuanrang; Huang, Tao; Zhao, Qiang; Xie, Kabing; Xiong, Lizhong; Zhang, Qifa; Han, Bin
2008-11-26
The Oryza sativa L. indica subspecies is the most widely cultivated rice. During the last few years, we have collected over 20,000 putative full-length cDNAs and over 40,000 ESTs isolated from various cDNA libraries of two indica varieties Guangluai 4 and Minghui 63. A database of the rice indica cDNAs was therefore built to provide a comprehensive web data source for searching and retrieving the indica cDNA clones. Rice Indica cDNA Database (RICD) is an online MySQL-PHP driven database with a user-friendly web interface. It allows investigators to query the cDNA clones by keyword, genome position, nucleotide or protein sequence, and putative function. It also provides a series of information, including sequences, protein domain annotations, similarity search results, SNPs and InDels information, and hyperlinks to gene annotation in both The Rice Annotation Project Database (RAP-DB) and The TIGR Rice Genome Annotation Resource, expression atlas in RiceGE and variation report in Gramene of each cDNA. The online rice indica cDNA database provides cDNA resource with comprehensive information to researchers for functional analysis of indica subspecies and for comparative genomics. The RICD database is available through our website http://www.ncgr.ac.cn/ricd.
cDNA encoding a polypeptide including a hevein sequence
Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil
1999-05-04
A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.
cDNA encoding a polypeptide including a hevein sequence
Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil
2000-07-04
A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.
cDNA encoding a polypeptide including a hevein sequence
Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.
1999-05-04
A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74--79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli. 12 figs.
cDNA encoding a polypeptide including a hevein sequence
Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil
1995-03-21
A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.
cDNA encoding a polypeptide including a hevein sequence
Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.
1995-03-21
A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1,018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74--79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli. 11 figures.
Cloning and characterization of the gene encoding IMP dehydrogenase from Arabidopsis thaliana.
Collart, F R; Osipiuk, J; Trent, J; Olsen, G J; Huberman, E
1996-10-03
We have cloned and characterized the gene encoding inosine monophosphate dehydrogenase (IMPDH) from Arabidopsis thaliana (At). The transcription unit of the At gene spans approximately 1900 bp and specifies a protein of 503 amino acids with a calculated relative molecular mass (M(r)) of 54,190. The gene is comprised of a minimum of four introns and five exons with all donor and acceptor splice sequences conforming to previously proposed consensus sequences. The deduced IMPDH amino-acid sequence from At shows a remarkable similarity to other eukaryotic IMPDH sequences, with a 48% identity to human Type II enzyme. Allowing for conservative substitutions, the enzyme is 69% similar to human Type II IMPDH. The putative active-site sequence of At IMPDH conforms to the IMP dehydrogenase/guanosine monophosphate reductase motif and contains an essential active-site cysteine residue.
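The distinction drawn above between percent identity and percent similarity (identity plus conservative substitutions) can be illustrated with a small calculation over a pairwise alignment. This is a sketch under stated assumptions: the grouping of conservative residues and the choice to ignore gapped columns are illustrative conventions, not the scoring used by the authors, and the function name is hypothetical.

```python
# Illustrative (assumed) groups of residues treated as conservative substitutions.
CONSERVATIVE_GROUPS = [
    set("ILVM"), set("FWY"), set("KRH"), set("DE"),
    set("ST"), set("NQ"), set("AG"),
]

def identity_and_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Columns containing a gap ('-') in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = similar = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue
        compared += 1
        if res_a == res_b:
            identical += 1
            similar += 1
        elif any(res_a in group and res_b in group for group in CONSERVATIVE_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared

# Toy alignment: two identical columns and two conservative substitutions.
print(identity_and_similarity("ACDK", "ACER"))
```

Similarity is always at least as large as identity under this definition, which matches the pattern of the two percentages reported in the abstract.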
Osato, Naoki
2018-01-19
Transcriptional target genes show functional enrichment of genes. However, how many and how significantly transcriptional target genes include functional enrichments are still unclear. To address these issues, I predicted human transcriptional target genes using open chromatin regions, ChIP-seq data and DNA binding sequences of transcription factors in databases, and examined functional enrichment and gene expression level of putative transcriptional target genes. Gene Ontology annotations showed four times larger numbers of functional enrichments in putative transcriptional target genes than gene expression information alone, independent of transcriptional target genes. To compare the number of functional enrichments of putative transcriptional target genes between cells or search conditions, I normalized the number of functional enrichment by calculating its ratios in the total number of transcriptional target genes. With this analysis, native putative transcriptional target genes showed the largest normalized number of functional enrichments, compared with target genes including 5-60% of randomly selected genes. The normalized number of functional enrichments was changed according to the criteria of enhancer-promoter interactions such as distance from transcriptional start sites and orientation of CTCF-binding sites. Forward-reverse orientation of CTCF-binding sites showed significantly higher normalized number of functional enrichments than the other orientations. Journal papers showed that the top five frequent functional enrichments were related to the cellular functions in the three cell types. The median expression level of transcriptional target genes changed according to the criteria of enhancer-promoter assignments (i.e. interactions) and was correlated with the changes of the normalized number of functional enrichments of transcriptional target genes. Human putative transcriptional target genes showed significant functional enrichments. 
Functional enrichments were related to the cellular functions. The normalized number of functional enrichments of human putative transcriptional target genes changed according to the criteria of enhancer-promoter assignments and correlated with the median expression level of the target genes. These analyses and characters of human putative transcriptional target genes would be useful to examine the criteria of enhancer-promoter assignments and to predict the novel mechanisms and factors such as DNA binding proteins and DNA sequences of enhancer-promoter interactions.
Discovering genes associated with dormancy in the monogonont rotifer Brachionus plicatilis
Denekamp, Nadav Y; Thorne, Michael AS; Clark, Melody S; Kube, Michael; Reinhardt, Richard; Lubzens, Esther
2009-01-01
Background: Microscopic monogonont rotifers, including the euryhaline species Brachionus plicatilis, are typically found in water bodies where environmental factors restrict population growth to short periods lasting days or months. The survival of the population is ensured via the production of resting eggs that show a remarkable tolerance to unfavorable conditions and remain viable for decades. The aim of this study was to generate Expressed Sequence Tags (ESTs) for molecular characterisation of processes associated with the formation of resting eggs, their survival during dormancy and hatching. Results: Four normalized and four subtractive libraries were constructed to provide a resource for rotifer transcriptomics associated with resting-egg formation, storage and hatching. A total of 47,926 sequences were assembled into 18,000 putative transcripts and analyzed using both Blast and GO annotation. About 28–55% (depending on the library) of the clones produced significant matches against the Swissprot and Trembl databases. Genes known to be associated with desiccation tolerance during dormancy in other organisms were identified in the EST libraries. These included genes associated with antioxidant activity, low molecular weight heat shock proteins and Late Embryonic Abundant (LEA) proteins. Real-time PCR confirmed that LEA transcripts, small heat-shock proteins and some antioxidant genes were upregulated in resting eggs, therefore suggesting that desiccation tolerance is a characteristic feature of resting eggs even though they do not necessarily fully desiccate during dormancy. The role of trehalose in resting-egg formation and survival remains unclear since there was no significant difference between resting-egg producing females and amictic females in the expression of the tps-1 gene.
In view of the absence of vitellogenin transcripts, matches to lipoprotein lipase proteins suggest that, similar to the situation in dipterans, these proteins may serve as the yolk proteins in rotifers. Conclusion: The 47,926 ESTs expand significantly the current sequence resource of B. plicatilis. It describes, for the first time, genes putatively associated with resting eggs and will serve as a database for future global expression experiments, particularly for the further identification of dormancy related genes. PMID:19284654
Discovering genes associated with dormancy in the monogonont rotifer Brachionus plicatilis.
Denekamp, Nadav Y; Thorne, Michael A S; Clark, Melody S; Kube, Michael; Reinhardt, Richard; Lubzens, Esther
2009-03-13
Microscopic monogonont rotifers, including the euryhaline species Brachionus plicatilis, are typically found in water bodies where environmental factors restrict population growth to short periods lasting days or months. The survival of the population is ensured via the production of resting eggs that show a remarkable tolerance to unfavorable conditions and remain viable for decades. The aim of this study was to generate Expressed Sequence Tags (ESTs) for molecular characterisation of processes associated with the formation of resting eggs, their survival during dormancy and hatching. Four normalized and four subtractive libraries were constructed to provide a resource for rotifer transcriptomics associated with resting-egg formation, storage and hatching. A total of 47,926 sequences were assembled into 18,000 putative transcripts and analyzed using both Blast and GO annotation. About 28-55% (depending on the library) of the clones produced significant matches against the Swissprot and Trembl databases. Genes known to be associated with desiccation tolerance during dormancy in other organisms were identified in the EST libraries. These included genes associated with antioxidant activity, low molecular weight heat shock proteins and Late Embryonic Abundant (LEA) proteins. Real-time PCR confirmed that LEA transcripts, small heat-shock proteins and some antioxidant genes were upregulated in resting eggs, therefore suggesting that desiccation tolerance is a characteristic feature of resting eggs even though they do not necessarily fully desiccate during dormancy. The role of trehalose in resting-egg formation and survival remains unclear since there was no significant difference between resting-egg producing females and amictic females in the expression of the tps-1 gene. 
In view of the absence of vitellogenin transcripts, matches to lipoprotein lipase proteins suggest that, similar to the situation in dipterans, these proteins may serve as the yolk proteins in rotifers. The 47,926 ESTs expand significantly the current sequence resource of B. plicatilis. It describes, for the first time, genes putatively associated with resting eggs and will serve as a database for future global expression experiments, particularly for the further identification of dormancy related genes.
NASA Astrophysics Data System (ADS)
Qi, Fei; Guo, Huarong; Wang, Jian
2008-02-01
Reversible protein phosphorylation, catalyzed by protein kinases and phosphatases, is an important and versatile mechanism by which eukaryotic cells regulate almost all signaling processes. Protein phosphatase 1 (PP1) is the first and well-characterized member of the protein serine/threonine phosphatase family. In the present study, a full-length cDNA encoding the beta isoform of the catalytic subunit of protein phosphatase 1 (PP1cb), designated SmPP1cb, was isolated and sequenced for the first time from the skin tissue of the flatfish turbot Scophthalmus maximus by the rapid amplification of cDNA ends (RACE) technique. The SmPP1cb cDNA sequence contains a 984 bp open reading frame (ORF), flanked by a complete 39 bp 5' untranslated region and a 462 bp 3' untranslated region. The ORF encodes a putative 327 amino acid protein whose N-terminal section is highly acidic (Met-Ala-Glu-Gly-Glu-Leu-Asp-Val-Asp), a common feature of the PP1 catalytic subunit that is absent in protein phosphatase 2B (PP2B). Its calculated molecular mass is 37,193 Da and its pI is 5.8. Sequence analysis indicated that SmPP1cb is extremely conserved at both the amino acid and nucleotide levels compared with the PP1cb of other vertebrates and invertebrates. Its Kozak motif, contained in the 5'UTR around the ATG start codon, is GXXAXXGXX ATGG, which differs from the mammalian motif at two positions (A-6 and G-3), indicating the possibility of different translation initiation in turbot. The 3'UTR of SmPP1cb is also highly divergent in sequence and length compared with other animals, especially zebrafish. The cloning and sequencing of the SmPP1cb gene lays a good foundation for future work on the biological functions of PP1 in the flatfish turbot.
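The relationship between the 984 bp ORF and the 327-residue protein follows from codon arithmetic: 984 / 3 = 328 codons, one of which is the stop codon. The short sketch below checks this, assuming the reported ORF length includes the stop codon (the abstract does not say so explicitly, but the numbers are consistent with it). The function name is illustrative. The summed length of the 5'UTR, ORF and 3'UTR is also computed; this total is derived here and is not a figure quoted in the abstract.

```python
def orf_protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of the given length in bp."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    # The stop codon occupies one codon but adds no residue.
    return codons - 1 if includes_stop else codons

utr5_bp, orf_bp, utr3_bp = 39, 984, 462

print(orf_protein_length(orf_bp))     # residues encoded by the ORF
print(utr5_bp + orf_bp + utr3_bp)     # summed length of the three regions
```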
Zhu, Ruo-Lin; Lei, Xiao-Ying; Ke, Fei; Yuan, Xiu-Ping; Zhang, Qi-Ya
2011-02-01
The genomic sequence of Scophthalmus maximus rhabdovirus (SMRV) isolated from diseased turbot has been characterized. The complete genome of SMRV comprises 11,492 nucleotides and encodes five typical rhabdovirus genes: N, P, M, G and L. In addition, two open reading frames (ORFs) are predicted to overlap with the P gene, one upstream of P and smaller than P (temporarily called Ps), and another within the P gene that may encode a protein similar to the vesicular stomatitis virus C protein. The C ORF is contained within the P ORF. The five typical proteins share the highest sequence identities (48.9%) with the corresponding proteins of rhabdoviruses in the genus Vesiculovirus. Phylogenetic analysis of the partial L protein sequence indicates that SMRV is close to the genus Vesiculovirus. The first 13 nucleotides at the ends of the SMRV genome are perfectly inverse complementary. The gene junctions between the five genes show a conserved polyadenylation signal (CATGA(7)) and intergenic dinucleotide (CT) followed by the putative transcription initiation sequence A(A/G)(C/G)A(A/G/T), which are different from those of known rhabdoviruses. The entire Ps ORF was cloned and expressed, and used to generate polyclonal antibody in mice. One obvious band could be detected in SMRV-infected carp leucocyte cells (CLCs) by anti-Ps/C serum via Western blot, and the subcellular localization of the Ps-GFP fusion protein showed a cytoplasmic distribution as multiple punctate or doughnut-shaped foci of uneven size. Copyright © 2010 Elsevier B.V. All rights reserved.
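The statement about the genome termini means that the first 13 nucleotides equal the reverse complement of the last 13. The sketch below shows one way such a check could be written. The function names are hypothetical and the toy sequence is invented for illustration; it is not the SMRV genome.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]

def termini_are_inverse_complementary(genome, n=13):
    """True if the first n bases equal the reverse complement of the last n."""
    if len(genome) < 2 * n:
        raise ValueError("sequence is shorter than two terminal windows")
    return genome[:n].upper() == reverse_complement(genome[-n:]).upper()

# Invented toy sequence whose 3' end is built as the reverse complement of its 5' end.
head = "ACGAAGACAAACA"
toy_genome = head + "GGGGCCCCTTTT" + reverse_complement(head)
print(termini_are_inverse_complementary(toy_genome))
```

For a real genome the same check would be run on the full sequence, with `n` set to the window length of interest.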
Seo, H S; Kim, H Y; Jeong, J Y; Lee, S Y; Cho, M J; Bahk, J D
1995-03-01
A cDNA clone, RGA1, was isolated by using a GPA1 cDNA clone of Arabidopsis thaliana G protein alpha subunit as a probe from a rice (Oryza sativa L. IR-36) seedling cDNA library from roots and leaves. Sequence analysis of genomic clone reveals that the RGA1 gene has 14 exons and 13 introns, and encodes a polypeptide of 380 amino acid residues with a calculated molecular weight of 44.5 kDa. The encoded protein exhibits a considerable degree of amino acid sequence similarity to all the other known G protein alpha subunits. A putative TATA sequence (ATATGA), a potential CAAT box sequence (AGCAATAC), and a cis-acting element, CCACGTGG (ABRE), known to be involved in ABA induction are found in the promoter region. The RGA1 protein contains all the consensus regions of G protein alpha subunits except the cysteine residue near the C-terminus for ADP-ribosylation by pertussis toxin. The RGA1 polypeptide expressed in Escherichia coli was, however, ADP-ribosylated by 10 microM [adenylate-32P] NAD and activated cholera toxin. Southern analysis indicates that there are no other genes similar to the RGA1 gene in the rice genome. Northern analysis reveals that the RGA1 mRNA is 1.85 kb long and expressed in vegetative tissues, including leaves and roots, and that its expression is regulated by light.
Lentes, K U; Mathieu, E; Bischoff, R; Rasmussen, U B; Pavirani, A
1993-01-01
Current methods for comparative analyses of protein sequences are 1D-alignments of amino acid sequences based on the maximization of amino acid identity (homology) and the prediction of secondary structure elements. This approach has a major drawback once the amino acid identity drops below 20-25%, since maximization of a homology score does not take into account any structural information. A new technique called Hydrophobic Cluster Analysis (HCA) has been developed by Lemesle-Varloot et al. (Biochimie 72, 555-574, 1990). It consists of comparing several sequences simultaneously and combining homology detection with secondary structure analysis. HCA is primarily based on the detection and comparison of structural segments constituting the hydrophobic core of globular protein domains, with or without transmembrane domains. We have applied HCA to the analysis of different families of G-protein coupled receptors, such as catecholamine receptors as well as peptide hormone receptors. Using HCA, the thrombin receptor, a new and as yet unique member of the family of G-protein coupled receptors, can be clearly classified as being closely related to the family of neuropeptide receptors rather than to the catecholamine receptors, for which the shape of the hydrophobic clusters and the length of their third cytoplasmic loop are very different. Furthermore, the potential of HCA to predict relationships between new putative and already characterized members of this family of receptors will be presented.
Fukumori, F; Saint, C P
1997-01-01
A 9,233-bp HindIII fragment of the aromatic amine catabolic plasmid pTDN1, isolated from a derivative of Pseudomonas putida mt-2 (UCC22), confers the ability to degrade aniline on P. putida KT2442. The fragment encodes six open reading frames which are arranged in the same direction. Their 5' upstream region is part of the direct-repeat sequence of pTDN1. Nucleotide sequence of 1.8 kb of the repeat sequence revealed only a single base pair change compared to the known sequence of IS1071 which is involved in the transposition of the chlorobenzoate genes (C. Nakatsu, J. Ng, R. Singh, N. Straus, and C. Wyndham, Proc. Natl. Acad. Sci. USA 88:8312-8316, 1991). Four open reading frames encode proteins with considerable homology to proteins found in other aromatic-compound degradation pathways. On the basis of sequence similarity, these genes are proposed to encode the large and small subunits of aniline oxygenase (tdnA1 and tdnA2, respectively), a reductase (tdnB), and a LysR-type regulatory gene (tdnR). The putative large subunit has a conserved [2Fe-2S]R Rieske-type ligand center. Two genes, tdnQ and tdnT, which may be involved in amino group transfer, are localized upstream of the putative oxygenase genes. The tdnQ gene product shares about 30% similarity with glutamine synthetases; however, a pUC-based plasmid carrying tdnQ did not support the growth of an Escherichia coli glnA strain in the absence of glutamine. TdnT possesses domains that are conserved among amidotransferases. The tdnQ, tdnA1, tdnA2, tdnB, and tdnR genes are essential for the conversion of aniline to catechol. PMID:8990291
Kobayashi, Michie; Hiraka, Yukie; Abe, Akira; Yaegashi, Hiroki; Natsume, Satoshi; Kikuchi, Hideko; Takagi, Hiroki; Saitoh, Hiromasa; Win, Joe; Kamoun, Sophien; Terauchi, Ryohei
2017-11-22
Downy mildew, caused by the oomycete pathogen Sclerospora graminicola, is an economically important disease of Gramineae crops including foxtail millet (Setaria italica). Plants infected with S. graminicola are generally stunted and often undergo a transformation of flower organs into leaves (phyllody or witches' broom), resulting in serious yield loss. To establish the molecular basis of downy mildew disease in foxtail millet, we carried out whole-genome sequencing and an RNA-seq analysis of S. graminicola. Sequence reads were generated from S. graminicola using an Illumina sequencing platform and assembled de novo into a draft genome sequence comprising approximately 360 Mbp. Of this sequence, 73% comprised repetitive elements, and a total of 16,736 genes were predicted from the RNA-seq data. The predicted genes included those encoding effector-like proteins with high sequence similarity to those previously identified in other oomycete pathogens. Genes encoding jacalin-like lectin-domain-containing secreted proteins were enriched in S. graminicola compared to other oomycetes. Of a total of 1220 genes encoding putative secreted proteins, 91 significantly changed their expression levels during the infection of plant tissues compared to the sporangia and zoospore stages of the S. graminicola lifecycle. We established the draft genome sequence of a downy mildew pathogen that infects Gramineae plants. Based on this sequence and our transcriptome analysis, we generated a catalog of in planta-induced candidate effector genes, providing a solid foundation from which to identify the effectors causing phyllody.
Discriminative Prediction of A-To-I RNA Editing Events from DNA Sequence
Sun, Jiangming; Singh, Pratibha; Bagge, Annika; Valtat, Bérengère; Vikman, Petter; Spégel, Peter; Mulder, Hindrik
2016-01-01
RNA editing is a post-transcriptional alteration of RNA sequences that, via insertions, deletions or base substitutions, can affect protein structure as well as RNA and protein expression. Recently, it has been suggested that RNA editing may be more frequent than previously thought. A great impediment, however, to a deeper understanding of this process is the paramount sequencing effort that needs to be undertaken to identify RNA editing events. Here, we describe an in silico approach, based on machine learning, that ameliorates this problem. Using 41 nucleotide long DNA sequences, we show that novel A-to-I RNA editing events can be predicted from known A-to-I RNA editing events intra- and interspecies. The validity of the proposed method was verified in an independent experimental dataset. Using our approach, 203 202 putative A-to-I RNA editing events were predicted in the whole human genome. Out of these, 9% were previously reported. The remaining sites require further validation, e.g., by targeted deep sequencing. In conclusion, the approach described here is a useful tool to identify potential A-to-I RNA editing events without the requirement of extensive RNA sequencing. PMID:27764195
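The abstract above describes prediction from 41-nucleotide DNA sequences but does not say how those sequences are positioned. As a hedged illustration only, the Python sketch below assumes each window is centered on a candidate adenosine with 20 flanking bases on either side; the function name and the centering are assumptions, not the authors' published pipeline.

```python
from typing import Iterator, Tuple


def adenosine_windows(dna: str, flank: int = 20) -> Iterator[Tuple[int, str]]:
    """Yield (position, window) for every 'A' that has `flank` bases on both sides.

    With flank=20 each window is 41 nt long, with the candidate A in the middle.
    Positions are 0-based indices into the input sequence.
    """
    dna = dna.upper()
    for i in range(flank, len(dna) - flank):
        if dna[i] == "A":
            yield i, dna[i - flank:i + flank + 1]
```

Each window could then be encoded (for example one-hot) and passed to whichever classifier is being trained; that step is left out because the abstract does not specify it.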
Characterization of Urtica dioica agglutinin isolectins and the encoding gene family.
Does, M P; Ng, D K; Dekker, H L; Peumans, W J; Houterman, P M; Van Damme, E J; Cornelissen, B J
1999-01-01
Urtica dioica agglutinin (UDA) has previously been found in roots and rhizomes of stinging nettles as a mixture of UDA-isolectins. Protein and cDNA sequencing have shown that mature UDA is composed of two hevein domains and is processed from a precursor protein. The precursor contains a signal peptide, two in-tandem hevein domains, a hinge region and a carboxyl-terminal chitinase domain. Genomic fragments encoding precursors for UDA-isolectins have been amplified by five independent polymerase chain reactions on genomic DNA from stinging nettle ecotype Weerselo. One amplified gene was completely sequenced. As compared to the published cDNA sequence, the genomic sequence contains, besides two basepair substitutions, two introns located at the same positions as in other plant chitinases. By partial sequence analysis of 40 amplified genes, 16 different genes were identified which encode seven putative UDA-isolectins. The deduced amino acid sequences share 78.9-98.9% identity. In extracts of roots and rhizomes of stinging nettle ecotype Weerselo six out of these seven isolectins were detected by mass spectrometry. One of them is an acidic form, which has not been identified before. Our results demonstrate that UDA is encoded by a large gene family.
Li, Fengmei; Liu, Wuyi
2017-06-01
The basic helix-loop-helix (bHLH) transcription factors (TFs) form a huge superfamily and play crucial roles in many essential developmental, genetic, and physiological-biochemical processes of eukaryotes. In total, 109 putative bHLH TFs were identified and categorized in the genomic databases of cattle, Bos taurus, after removing redundant sequences and merging genetic isoforms. Through phylogenetic analyses, 105 of these bHLH TFs were classified into 44 families with 46, 25, 14, 3, 13, and 4 members in the high-order groups A, B, C, D, E, and F, respectively. The remaining 4 bHLH proteins were classified as 'orphans.' Next, these 109 putative bHLH proteins were found to be significantly enriched in 524 Gene Ontology (GO) annotations (corrected P value ≤ 0.05) and 21 pathways (corrected P value ≤ 0.05) mapped by the web server KOBAS 2.0. Furthermore, 95 bHLH proteins were screened and analyzed together with two uncharacterized proteins in the STRING online database to reconstruct the protein-protein interaction network of cattle bHLH TFs. Ultimately, 89 bHLH proteins were fully mapped in a network with 67 biological processes, 13 molecular functions, 5 KEGG pathways, 12 PFAM protein domains, and 25 INTERPRO classified protein domains and features. These results provide useful information and a good reference for further functional investigations and updated research on cattle bHLH TFs.
Transcriptome Sequencing and Developmental Regulation of Gene Expression in Anopheles aquasalis
Silva, Maria C. P.; Lopes, Adriana R.; Barros, Michele S.; Sá-Nunes, Anderson; Kojin, Bianca B.; Carvalho, Eneas; Suesdek, Lincoln; Silva-Neto, Mário Alberto C.; James, Anthony A.; Capurro, Margareth L.
2014-01-01
Background: Anopheles aquasalis is a major malaria vector in coastal areas of South and Central America where it breeds preferentially in brackish water. This species is very susceptible to Plasmodium vivax and has already been incriminated as the responsible vector in malaria outbreaks. There has been no high-throughput investigation into the sequencing of An. aquasalis genes, transcripts and proteins despite its epidemiological relevance. Here we describe the sequencing, assembly and annotation of the An. aquasalis transcriptome. Methodology/Principal Findings: A total of 419 thousand cDNA sequence reads, encompassing 164 million nucleotides, were assembled in 7544 contigs of ≥2 sequences, and 1999 singletons. The majority of the An. aquasalis transcripts encode proteins with their closest counterparts in another neotropical malaria vector, An. darlingi. Several analyses in different protein databases were used to annotate and predict the putative functions of the deduced An. aquasalis proteins. Larval and adult-specific transcripts were represented by 121 and 424 contig sequences, respectively. Fifty-one transcripts were only detected in blood-fed females. The data also reveal a list of transcripts up- or down-regulated in adult females after a blood meal. Transcripts associated with immunity, signaling networks and blood feeding and digestion are discussed. Conclusions/Significance: This study represents the first large-scale effort to sequence the transcriptome of An. aquasalis. It provides valuable information that will facilitate studies on the biology of this species and may lead to novel strategies to reduce malaria transmission on the South American continent. The An. aquasalis transcriptome is accessible at http://exon.niaid.nih.gov/transcriptome/An_aquasalis/Anaquexcel.xlsx. PMID:25033462
1996-01-01
Mutations in the Caenorhabditis elegans gene unc-89 result in nematodes having disorganized muscle structure in which thick filaments are not organized into A-bands, and there are no M-lines. Beginning with a partial cDNA from the C. elegans sequencing project, we have cloned and sequenced the unc-89 gene. An unc-89 allele, st515, was found to contain an 84-bp deletion and a 10-bp duplication, resulting in an in-frame stop codon within the predicted unc-89 coding sequence. Analysis of the complete coding sequence for unc-89 predicts a novel 6,632 amino acid polypeptide consisting of sequence motifs which have been implicated in protein-protein interactions. UNC-89 begins with 67 residues of unique sequence, followed by SH3, dbl/CDC24, and PH domains, 7 immunoglobulin (Ig) domains, and a putative KSP-containing multiphosphorylation domain, and ends with 46 Ig domains. A polyclonal antiserum raised to a portion of unc-89 encoded sequence reacts to a twitchin-sized polypeptide from wild type, but to truncated polypeptides from st515 and from the amber allele e2338. By immunofluorescent microscopy, this antiserum localizes to the middle of A-bands, consistent with UNC-89 being a structural component of the M-line. Previous studies indicate that myofilament lattice assembly begins with positional cues laid down in the basement membrane and muscle cell membrane. We propose that the intracellular protein UNC-89 responds to these signals, localizes, and then participates in assembling an M-line. PMID:8603916
Opposite Stereoselectivities of Dirigent Proteins in Arabidopsis and Schizandra Species
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, Kye-Won; Moinuddin, Syed G. A.; Atwell, Kathleen M.
2012-08-01
How stereoselective monolignol-derived phenoxy radical-radical coupling reactions are differentially biochemically orchestrated in planta, whereby for example they afford (+)- and (-)-pinoresinols, respectively, is both a fascinating mechanistic and evolutionary question. In earlier work, biochemical control of (+)-pinoresinol formation had been established to be engendered by a (+)-pinoresinol-forming dirigent protein in Forsythia intermedia, whereas the presence of a (-)-pinoresinol-forming dirigent protein was indirectly deduced based on the enantiospecificity of downstream pinoresinol reductases (AtPrRs) in Arabidopsis thaliana root tissue. In this study of 16 putative dirigent protein homologs in Arabidopsis, AtDIR6, AtDIR10, and AtDIR13 were established to be root-specific using a β-glucuronidase reporter gene strategy. Of these three, in vitro analyses established that only recombinant AtDIR6 was a (-)-pinoresinol-forming dirigent protein, whose physiological role was further confirmed using overexpression and RNAi strategies in vivo. Interestingly, its closest homolog, AtDIR5, was also established to be a (-)-pinoresinol-forming dirigent protein based on in vitro biochemical analyses. Both of these were compared in terms of properties with a (+)-pinoresinol-forming dirigent protein from Schizandra chinensis. In this context, sequence analyses, site-directed mutagenesis, and region swapping resulted in identification of putative substrate binding sites/regions and candidate residues controlling distinct stereoselectivities of coupling modes.
Intracellular Localization Map of Human Herpesvirus 8 Proteins
Sander, Gaby; Konrad, Andreas; Thurau, Mathias; Wies, Effi; Leubert, Rene; Kremmer, Elisabeth; Dinkel, Holger; Schulz, Thomas; Neipel, Frank; Stürzl, Michael
2008-01-01
Human herpesvirus 8 (HHV-8) is the etiological agent of Kaposi's sarcoma. We present a localization map of 85 HHV-8-encoded proteins in mammalian cells. Viral open reading frames were cloned with a Myc tag in expression plasmids, confirmed by full-length sequencing, and expressed in HeLa cells. Protein localizations were analyzed by immunofluorescence microscopy. Fifty-one percent of all proteins were localized in the cytoplasm, 22% were in the nucleus, and 27% were found in both compartments. Surprisingly, we detected viral FLIP (v-FLIP) in the nucleus and in the cytoplasm, whereas cellular FLIPs are generally localized exclusively in the cytoplasm. This suggested that v-FLIP may exert additional or alternative functions compared to cellular FLIPs. In addition, it has been shown recently that the K10 protein can bind to at least 15 different HHV-8 proteins. We noticed that K10 and only five of its 15 putative binding factors were localized in the nucleus when the proteins were expressed in HeLa cells individually. Interestingly, in coexpression experiments K10 colocalized with 87% (13 of 15) of its putative binding partners. Colocalization was induced by translocation of either K10 alone or both proteins. These results indicate active intracellular translocation processes in virus-infected cells. Specifically in this framework, the localization map may provide a useful reference to further elucidate the function of HHV-8-encoded genes in human diseases. PMID:18077714
Le, Shuai; He, Xuesong; Tan, Yinling; Huang, Guangtao; Zhang, Lin; Lux, Renate; Shi, Wenyuan; Hu, Fuquan
2013-01-01
The first step in bacteriophage infection is recognition and binding to the host receptor, which is mediated by the phage receptor binding protein (RBP). Different RBPs can lead to differential host specificity. In many bacteriophages, such as Escherichia coli and Lactococcal phages, RBPs have been identified as the tail fiber or protruding baseplate proteins. However, the tail fiber-dependent host specificity in Pseudomonas aeruginosa phages has not been well studied. This study aimed to identify and investigate the binding specificity of the RBP of P. aeruginosa phages PaP1 and JG004. These two phages share high DNA sequence homology but exhibit different host specificities. A spontaneous mutant phage was isolated and exhibited broader host range compared with the parental phage JG004. Sequencing of its putative tail fiber and baseplate region indicated a single point mutation in ORF84 (a putative tail fiber gene), which resulted in the replacement of a positively charged lysine (K) by an uncharged asparagine (N). We further demonstrated that the replacement of the tail fiber gene (ORF69) of PaP1 with the corresponding gene from phage JG004 resulted in a recombinant phage that displayed altered host specificity. Our study revealed the tail fiber-dependent host specificity in P. aeruginosa phages and provided an effective tool for its alteration. These contributions may have potential value in phage therapy. PMID:23874674
Uncovering the defence responses of Eucalyptus to pests and pathogens in the genomics age.
Naidoo, Sanushka; Külheim, Carsten; Zwart, Lizahn; Mangwanda, Ronishree; Oates, Caryn N; Visser, Erik A; Wilken, Febé E; Mamni, Thandekile B; Myburg, Alexander A
2014-09-01
Long-lived tree species are subject to attack by various pests and pathogens during their lifetime. This problem is exacerbated by climate change, which may increase the host range for pathogens and extend the period of infestation by pests. Plant defences may involve preformed barriers or induced resistance mechanisms based on recognition of the invader, complex signalling cascades, hormone signalling, activation of transcription factors and production of pathogenesis-related (PR) proteins with direct antimicrobial or anti-insect activity. Trees have evolved some unique defence mechanisms compared with well-studied model plants, which are mostly herbaceous annuals. The genome sequence of Eucalyptus grandis W. Hill ex Maiden has recently become available and provides a resource to extend our understanding of defence in large woody perennials. This review synthesizes existing knowledge of defence mechanisms in model plants and tree species and features mechanisms that may be important for defence in Eucalyptus, such as anatomical variants and the role of chemicals and proteins. Based on the E. grandis genome sequence, we have identified putative PR proteins based on sequence identity to the previously described plant PR proteins. Putative orthologues for PR-1, PR-2, PR-4, PR-5, PR-6, PR-7, PR-8, PR-9, PR-10, PR-12, PR-14, PR-15 and PR-17 have been identified and compared with their orthologues in Populus trichocarpa Torr. & A. Gray ex Hook and Arabidopsis thaliana (L.) Heynh. The survey of PR genes in Eucalyptus provides a first step in identifying defence gene targets that may be employed for protection of the species in future. Genomic resources available for Eucalyptus are discussed and approaches for improving resistance in these hardwood trees, earmarked as a bioenergy source in future, are considered. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Jakubowska, Agata K; Peters, Sander A; Ziemnicka, Jadwiga; Vlak, Just M; van Oers, Monique M
2006-03-01
The genome sequence of a Polish isolate of Agrotis segetum nucleopolyhedrovirus (AgseNPV-A) was determined and analysed. The circular genome is composed of 147,544 bp and has a G+C content of 45.7 mol%. It contains 153 putative, non-overlapping open reading frames (ORFs) encoding predicted proteins of more than 50 aa, together making up 89.8 % of the genome. The remaining 10.2 % of the DNA constitutes non-coding regions and homologous-repeat regions. One hundred and forty-three AgseNPV-A ORFs are homologues of previously reported baculovirus gene sequences. There are ten unique ORFs and they account for 3 % of the genome in total. All 62 lepidopteran baculovirus genes, including the 29 core baculovirus genes, were found in the AgseNPV-A genome. The gene content and gene order of AgseNPV-A are most similar to those of Spodoptera exigua (Se) multiple NPV and their shared homologous genes are 100 % collinear. Three putative enhancin genes were identified in the AgseNPV-A genome. In phylogenetic analysis, the AgseNPV-A enhancins form a cluster separated from enhancins of the Mamestra species NPVs.
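Two of the summary statistics in the abstract above, G+C content and the share of the genome covered by ORFs, are straightforward to compute once a sequence and ORF coordinates are available. The Python sketch below is illustrative only: the function names are hypothetical, and the coding-percentage helper assumes non-overlapping ORFs, as the abstract states for AgseNPV-A.

```python
from typing import Iterable


def gc_content(seq: str) -> float:
    """Return G+C content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def coding_percent(orf_lengths_bp: Iterable[int], genome_bp: int) -> float:
    """Return the percentage of the genome covered by non-overlapping ORFs."""
    return 100.0 * sum(orf_lengths_bp) / genome_bp
```

Applied to the full 147,544 bp sequence and its 153 ORFs, these helpers would be expected to reproduce figures close to the reported 45.7 mol% and 89.8 %, although the exact values depend on how ambiguous bases and ORF boundaries are handled.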
Luna-Ramírez, Karen; Quintero-Hernández, Veronica; Vargas-Jaimes, Leonel; Batista, Cesar V F; Winkel, Kenneth D; Possani, Lourival D
2013-03-01
The Urodacidae scorpions are the most widely distributed of the four families in Australia and represent half of the species in the continent, yet their venoms remain largely unstudied. This communication reports the first results of a proteome analysis of the venom of the scorpion Urodacus yaschenkoi performed by mass fingerprinting, after high performance liquid chromatography (HPLC) separation. A total of 74 fractions were obtained by HPLC separation, allowing the identification of approximately 274 different molecular masses with molecular weights varying from 287 to 43,437 Da. The most abundant peptides were those of about 1 kDa and 4-5 kDa, representing antimicrobial peptides and putative potassium channel toxins, respectively. Three such peptides were chemically synthesized and tested against Gram-positive and Gram-negative bacteria, showing minimum inhibitory concentrations in the low micromolar range, but with moderate hemolytic activity. This communication also reports a transcriptome analysis of the venom glands of the same scorpion species, undertaken by constructing a cDNA library and conducting random sequencing screening of the transcripts. From the resultant cDNA library 172 expressed sequence tags (ESTs) were analyzed. These transcripts were further clustered into 120 unique sequences (23 contigs and 97 singlets). The identified putative proteins can be sorted into several groups, such as those implicated in common cellular processes, putative neurotoxins and antimicrobial peptides. The scorpion U. yaschenkoi is not known to be dangerous to humans and its venom contains peptides similar to those of Opisthacanthus cayaporum (antibacterial), Scorpio maurus palmatus (maurocalcin), Opistophthalmus carinatus (opistoporines) and Hadrurus gerstchi (scorpine-like molecules), amongst others. Copyright © 2012 Elsevier Ltd. All rights reserved.
Malhotra, Sony; Sowdhamini, Ramanathan
2013-08-01
The interaction of proteins with their respective DNA targets is known to control many high-fidelity cellular processes. Performing a comprehensive survey of the sequenced genomes for DNA-binding proteins (DBPs) will help in understanding their distribution and the associated functions in a particular genome. Availability of fully sequenced genome of Arabidopsis thaliana enables the review of distribution of DBPs in this model plant genome. We used profiles of both structure and sequence-based DNA-binding families, derived from PDB and PFam databases, to perform the survey. This resulted in 4471 proteins, identified as DNA-binding in Arabidopsis genome, which are distributed across 300 different PFam families. Apart from several plant-specific DNA-binding families, certain RING fingers and leucine zippers also had high representation. Our search protocol helped to assign DNA-binding property to several proteins that were previously marked as unknown, putative or hypothetical in function. The distribution of Arabidopsis genes having a role in plant DNA repair were particularly studied and noted for their functional mapping. The functions observed to be overrepresented in the plant genome harbour DNA-3-methyladenine glycosylase activity, alkylbase DNA N-glycosylase activity and DNA-(apurinic or apyrimidinic site) lyase activity, suggesting their role in specialized functions such as gene regulation and DNA repair.
de Melo, Ivan S.; Jimenez-Nuñez, Maria D.; Iglesias, Concepción; Campos-Caro, Antonio; Moreno-Sanchez, David; Ruiz, Felix A.; Bolívar, Jorge
2013-01-01
NOA36/ZNF330 is an evolutionarily well-preserved protein present in the nucleolus and mitochondria of mammalian cells. We have previously reported that the pro-apoptotic activity of this protein is mediated by a characteristic cysteine-rich domain. We now demonstrate that the nucleolar localization of NOA36 is due to a highly-conserved nucleolar localization signal (NoLS) present in residues 1–33. This NoLS is a sequence containing three clusters of two or three basic amino acids. We fused the amino terminal of NOA36 to eGFP in order to characterize this putative NoLS. We show that a cluster of three lysine residues at positions 3 to 5 within this sequence is critical for the nucleolar localization. We also demonstrate that the sequence as found in human is capable of directing eGFP to the nucleolus in several mammal, fish and insect cells. Moreover, this NoLS is capable of specifically directing the cytosolic yeast enzyme polyphosphatase to the target of the nucleolus of HeLa cells, wherein its enzymatic activity was detected. This NoLS could therefore serve as a very useful tool as a nucleolar marker and for directing particular proteins to the nucleolus in distant animal species. PMID:23516598
Wang, Zhen; Anderson, Nicholas Scott; Benning, Christoph
2013-01-01
Chloroplast membrane lipid synthesis relies on the import of glycerolipids from the ER. The TGD (TriGalactosylDiacylglycerol) proteins are required for this lipid transfer process. The TGD1, -2, and -3 proteins form a putative ABC (ATP-binding cassette) transporter transporting ER-derived lipids through the inner envelope membrane of the chloroplast, while TGD4 binds phosphatidic acid (PtdOH) and resides in the outer chloroplast envelope. We identified two sequences in TGD4, amino acids 1–80 and 110–145, which are necessary and sufficient for PtdOH binding. Deletion of both sequences abolished PtdOH binding activity. We also found that TGD4 from 18:3 plants bound PtdOH specifically and with increased affinity. TGD4 did not interact with other proteins and formed a homodimer both in vitro and in vivo. Our results suggest that TGD4 is an integral dimeric β-barrel lipid transfer protein that binds PtdOH with its N terminus and contains dimerization domains at its C terminus. PMID:23297418
Morea, Edna G O; Viviescas, Maria Alejandra; Fernandes, Carlos A H; Matioli, Fabio F; Lira, Cristina B B; Fernandez, Maribel F; Moraes, Barbara S; da Silva, Marcelo S; Storti, Camila B; Fontes, Marcos R M; Cano, Maria Isabel N
2017-11-01
Leishmania spp. telomeres are composed of 5'-TTAGGG-3' repeats associated with proteins. We have previously identified LaRbp38 and LaRPA-1 as proteins that bind the G-rich telomeric strand. At that time, we had also partially characterized a protein:DNA complex, named LaGT1, but we could not identify its protein component. Using protein-DNA interaction and competition assays, we confirmed that LaGT1 is highly specific to the G-rich telomeric single-stranded DNA. Three protein bands with LaGT1 activity were isolated from affinity-purified protein extracts, digested in-gel, and sequenced de novo using mass spectrometry analysis. In silico analysis of the digested peptides identified them as a putative calmodulin with sequences identical to the T. cruzi calmodulin. In the Leishmania genome, the calmodulin ortholog is present in three identical copies. We cloned and sequenced one of the gene copies, named it LCalA, and obtained the recombinant protein. Multiple sequence alignment and molecular modeling showed that LCalA shares homology with most eukaryotic calmodulins. In addition, we demonstrated that LCalA is nuclear, partially co-localizes with telomeres and binds the G-rich telomeric strand in vivo. Recombinant LCalA can bind specifically and with relative affinity to the G-rich telomeric single-strand and to a 3'G-overhang, and DNA binding is calcium dependent. We have described a novel candidate component of Leishmania telomeres, LCalA, a nuclear calmodulin that binds the G-rich telomeric strand with high specificity and relative affinity, in a calcium-dependent manner. LCalA is the first reported calmodulin that binds telomeric DNA in vivo. Copyright © 2017 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Samudrala, Ram; Heffron, Fred; McDermott, Jason E.
2009-04-24
The type III secretion system is an essential component for virulence in many Gram-negative bacteria. Though components of the secretion system apparatus are conserved, its substrates, effector proteins, are not. We have used a machine learning approach to identify new secreted effectors. The method integrates evolutionary measures, such as the pattern of homologs in a range of other organisms, and sequence-based features, such as G+C content, amino acid composition and the N-terminal 30 residues of the protein sequence. The method was trained on known effectors from Salmonella typhimurium and validated on a corresponding set of effectors from Pseudomonas syringae, after eliminating effectors with detectable sequence similarity. The method was able to identify all of the known effectors in P. syringae with a specificity of 84% and sensitivity of 82%. The reciprocal validation, training on P. syringae and validating on S. typhimurium, gave similar results with a specificity of 86% when the sensitivity level was 87%. These results show that type III effectors in disparate organisms share common features. We found that maximal performance is attained by including an N-terminal sequence of only 30 residues, which agrees with previous studies indicating that this region contains the secretion signal. We then used the method to define the most important residues in this putative secretion signal. Finally, we present novel predictions of secreted effectors in S. typhimurium, some of which have been experimentally validated, and apply the method to predict secreted effectors in the genetically intractable human pathogen Chlamydia trachomatis. This approach is a novel and effective way to identify secreted effectors in a broad range of pathogenic bacteria for further experimental characterization and provides insight into the nature of the type III secretion signal.
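To make the sequence-based features and the evaluation metrics in the abstract above more concrete, here is a small Python sketch. It is not the authors' method: it assumes one plausible feature (amino acid composition of the N-terminal 30 residues) and uses the standard definitions of sensitivity and specificity; all function names are hypothetical.

```python
from collections import Counter
from typing import List

# The 20 standard amino acids, in a fixed order so feature vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def nterm_composition(protein: str, n: int = 30) -> List[float]:
    """Return the fraction of each standard amino acid in the first n residues."""
    window = protein[:n].upper()
    counts = Counter(window)
    total = len(window) or 1  # avoid division by zero for an empty sequence
    return [counts[aa] / total for aa in AMINO_ACIDS]


def sensitivity(tp: int, fn: int) -> float:
    """True-positive rate: fraction of real effectors that were predicted."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True-negative rate: fraction of non-effectors correctly rejected."""
    return tn / (tn + fp)
```

A vector from `nterm_composition` could be concatenated with other features, such as G+C content of the gene or homolog presence patterns, before training a classifier; the abstract does not specify the classifier, so that step is omitted.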
Li, Chun; Haug, Tor; Moe, Morten K; Styrvold, Olaf B; Stensvåg, Klara
2010-09-01
As immune effector molecules, antimicrobial peptides (AMPs) play an important role in the invertebrate immune system. Here, we present two novel AMPs, named centrocins 1 (4.5kDa) and 2 (4.4kDa), purified from coelomocyte extracts of the green sea urchin, Strongylocentrotus droebachiensis. The native peptides are cationic and show potent activities against Gram-positive and Gram-negative bacteria. The centrocins have an intramolecular heterodimeric structure, containing a heavy chain (30 amino acids) and a light chain (12 amino acids). The cDNA encoding the peptides and genomic sequences were cloned and sequenced. One putative isoform (centrocin 1b) was identified and one intron was found in the genes coding for the centrocins. The full length protein sequence of centrocin 1 consists of 119 amino acids, whereas centrocin 2 consists of 118 amino acids which both include a preprosequence of 51 or 50 amino acids for centrocins 1 and 2, respectively, and an interchain of 24 amino acids between the heavy and light chain. The difference of molecular mass between the native centrocins and the deduced sequences from cDNA indicates that the native centrocins contain a post-translational brominated tryptophan. In addition, two amino acids at the C-terminal, Gly-Arg, were removed from the light chains during the post-translational processing. The separate peptide chains of centrocin 1 were synthesized and the heavy chain alone was shown to be sufficient for antimicrobial activity. The genome of the closely related species, the purple sea urchin (S. purpuratus), was shown to contain two putative proteins with high similarity to the centrocins. Copyright 2010 Elsevier Ltd. All rights reserved.
Dallery, Jean-Félix; Lapalu, Nicolas; Zampounis, Antonios; Pigné, Sandrine; Luyten, Isabelle; Amselem, Joëlle; Wittenberg, Alexander H J; Zhou, Shiguo; de Queiroz, Marisa V; Robin, Guillaume P; Auger, Annie; Hainaut, Matthieu; Henrissat, Bernard; Kim, Ki-Tae; Lee, Yong-Hwan; Lespinet, Olivier; Schwartz, David C; Thon, Michael R; O'Connell, Richard J
2017-08-29
The ascomycete fungus Colletotrichum higginsianum causes anthracnose disease of brassica crops and the model plant Arabidopsis thaliana. Previous versions of the genome sequence were highly fragmented, causing errors in the prediction of protein-coding genes and preventing the analysis of repetitive sequences and genome architecture. Here, we re-sequenced the genome using single-molecule real-time (SMRT) sequencing technology and, in combination with optical map data, this provided a gapless assembly of all twelve chromosomes except for the ribosomal DNA repeat cluster on chromosome 7. The more accurate gene annotation made possible by this new assembly revealed a large repertoire of secondary metabolism (SM) key genes (89) and putative biosynthetic pathways (77 SM gene clusters). The two mini-chromosomes differed from the ten core chromosomes in being repeat- and AT-rich and gene-poor but were significantly enriched with genes encoding putative secreted effector proteins. Transposable elements (TEs) were found to occupy 7% of the genome by length. Certain TE families showed a statistically significant association with effector genes and SM cluster genes and were transcriptionally active at particular stages of fungal development. All 24 subtelomeres were found to contain one of three highly-conserved repeat elements which, by providing sites for homologous recombination, were probably instrumental in four segmental duplications. The gapless genome of C. higginsianum provides access to repeat-rich regions that were previously poorly assembled, notably the mini-chromosomes and subtelomeres, and allowed prediction of the complete SM gene repertoire. It also provides insights into the potential role of TEs in gene and genome evolution and host adaptation in this asexual pathogen.
Samad, Abdul Fatah A; Nazaruddin, Nazaruddin; Murad, Abdul Munir Abdul; Jani, Jaeyres; Zainal, Zamri; Ismail, Ismanizan
2018-03-01
In the current era, the majority of microRNAs (miRNAs) are discovered through computational approaches, which are largely confined to model plants. Here, for the first time, we describe the identification and characterization of novel miRNAs in a non-model plant, Persicaria minor (P. minor), using a computational approach. Unannotated sequences from deep sequencing were analyzed based on previously well-established parameters. Around 24 putative novel miRNAs were identified from 6,417,780 reads of the unannotated sequence, which represented 11 unique putative miRNA sequences. The PsRobot target prediction tool was deployed to identify the target transcripts of the putative novel miRNAs. Most of the predicted target transcripts (mRNAs) were known to be involved in plant development and stress responses. Gene ontology showed that the majority of the putative novel miRNA targets were involved in cellular component (69.07%), followed by molecular function (30.08%) and biological process (0.85%). Out of 11 unique putative miRNAs, 7 miRNAs were validated through semi-quantitative PCR. These novel miRNA discoveries in P. minor may help expand and update the current public miRNA database.
Wustman, Brandon A; Santos, Rudolpho; Zhang, Bo; Evans, John Spencer
2002-12-05
Fracture resistance in biomineralized structures has been linked to the presence of proteins, some of which possess sequences that are associated with elastic behavior. One such protein superfamily, the Pro,Gly-rich sea urchin intracrystalline spicule matrix proteins, form protein-protein supramolecular assemblies that modify the microstructure and fracture-resistant properties of the calcium carbonate mineral phase within embryonic sea urchin spicules and adult sea urchin spines. In this report, we detail the identification of a repetitive keratin-like "glycine-loop"- or coil-like structure within the 34-AA (AA: amino acid) N-terminal domain, (PGMG)(8)PG, of the spicule matrix protein, PM27. The identification of this repetitive structural motif was accomplished using two capped model peptides: a 9-AA sequence, GPGMGPGMG, and a 34-AA peptide representing the entire motif. Using CD, NMR spectrometry, and molecular dynamics simulated annealing/minimization simulations, we have determined that the 9-AA model peptide adopts a loop-like structure at pH 7.4. The structure of the 34-AA polypeptide resembles a coil structure consisting of repeating loop motifs that do not exhibit long-range ordering. Given that loop structures have been associated with protein elastic behavior and protein motion, it is plausible that the 34-AA Pro,Gly,Met repeat sequence motif in PM27 represents a putative elastic or mobile domain. Copyright 2002 Wiley Periodicals, Inc.
Detection of alternative splice variants at the proteome level in Aspergillus flavus.
Chang, Kung-Yen; Georgianna, D Ryan; Heber, Steffen; Payne, Gary A; Muddiman, David C
2010-03-05
Identification of proteins from proteolytic peptides or intact proteins plays an essential role in proteomics. Researchers use search engines to match the acquired peptide sequences to the target proteins. However, search engines depend on protein databases to provide candidates for consideration. Alternative splicing (AS), the mechanism by which the exons of pre-mRNAs can be spliced and rearranged to generate distinct mRNA and therefore protein variants, enables higher eukaryotic organisms, with only a limited number of genes, to have the requisite complexity and diversity at the proteome level. Multiple alternative isoforms from one gene often share common segments of sequences. However, many protein databases only include a limited number of isoforms to keep redundancy minimal. As a result, the database search might not identify a target protein even with high quality tandem MS data and accurate intact precursor ion mass. We computationally predicted an exhaustive list of putative isoforms of Aspergillus flavus proteins from 20 371 expressed sequence tags to investigate whether an alternative splicing protein database can assign a greater proportion of mass spectrometry data. The newly constructed AS database provided 9807 new alternatively spliced variants in addition to 12 832 previously annotated proteins. The searches of the existing tandem MS spectra data set using the AS database identified 29 new proteins encoded by 26 genes. Nine fungal genes appeared to have multiple protein isoforms. In addition to the discovery of splice variants, the AS database also showed potential to improve genome annotation. In summary, the introduction of an alternative splicing database helps identify more proteins and unveils more information about a proteome.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, C.; Coggill, P.; Bateman, A.
Many Gram-positive lactic acid bacteria (LAB) produce anti-bacterial peptides and small proteins called bacteriocins, which enable them to compete against other bacteria in the environment. These peptides fall structurally into three different classes, I, II, III, with class IIa being pediocin-like single entities and class IIb being two-peptide bacteriocins. Self-protective cognate immunity proteins are usually co-transcribed with these toxins. Several examples of cognates for IIa have already been solved structurally. Streptococcus pyogenes, closely related to LAB, is one of the most common human pathogens, so knowledge of how it competes against other LAB species is likely to prove invaluable. We have solved the crystal structure of the gene-product of locus Spy-2152 from S. pyogenes, (PDB: 2fu2), and found it to comprise an anti-parallel four-helix bundle that is structurally similar to other bacteriocin immunity proteins. Sequence analyses indicate this protein to be a possible immunity protein protective against class IIa or IIb bacteriocins. However, given that S. pyogenes appears to lack any IIa pediocin-like proteins but does possess class IIb bacteriocins, we suggest this protein confers immunity to IIb-like peptides. Combined structural, genomic and proteomic analyses have allowed the identification and in silico characterization of a new putative immunity protein from S. pyogenes, possibly the first structure of an immunity protein protective against potential class IIb two-peptide bacteriocins. We have named the two pairs of putative bacteriocins found in S. pyogenes pyogenecin 1, 2, 3 and 4.
Predicting protein-binding RNA nucleotides with consideration of binding partners.
Tuvshinjargal, Narankhuu; Lee, Wook; Park, Byungkyu; Han, Kyungsook
2015-06-01
In recent years several computational methods have been developed to predict RNA-binding sites in protein. Most of these methods do not consider interacting partners of a protein, so they predict the same RNA-binding sites for a given protein sequence even if the protein binds to different RNAs. Unlike the problem of predicting RNA-binding sites in protein, the problem of predicting protein-binding sites in RNA has received little attention mainly because it is much more difficult and shows a lower accuracy on average. In our previous study, we developed a method that predicts protein-binding nucleotides from an RNA sequence. In an effort to improve the prediction accuracy and usefulness of the previous method, we developed a new method that uses both RNA and protein sequence data. In this study, we identified effective features of RNA and protein molecules and developed a new support vector machine (SVM) model to predict protein-binding nucleotides from RNA and protein sequence data. The new model that used both protein and RNA sequence data achieved a sensitivity of 86.5%, a specificity of 86.2%, a positive predictive value (PPV) of 72.6%, a negative predictive value (NPV) of 93.8% and Matthews correlation coefficient (MCC) of 0.69 in a 10-fold cross validation; it achieved a sensitivity of 58.8%, a specificity of 87.4%, a PPV of 65.1%, a NPV of 84.2% and MCC of 0.48 in independent testing. For comparative purpose, we built another prediction model that used RNA sequence data alone and ran it on the same dataset. In a 10 fold-cross validation it achieved a sensitivity of 85.7%, a specificity of 80.5%, a PPV of 67.7%, a NPV of 92.2% and MCC of 0.63; in independent testing it achieved a sensitivity of 67.7%, a specificity of 78.8%, a PPV of 57.6%, a NPV of 85.2% and MCC of 0.45. 
In both cross-validations and independent testing, the new model that used both RNA and protein sequences showed a better performance than the model that used RNA sequence data alone in most performance measures. To the best of our knowledge, this is the first sequence-based prediction of protein-binding nucleotides in RNA which considers the binding partner of RNA. The new model will provide valuable information for designing biochemical experiments to find putative protein-binding sites in RNA with unknown structure. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
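The performance measures quoted in this abstract (sensitivity, specificity, PPV, NPV and MCC) all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the function name and the example counts are hypothetical and are not taken from the study.

```python
import math


def _ratio(numerator, denominator):
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification measures from confusion-matrix counts.

    tp/fp/tn/fn are counts of true positives, false positives,
    true negatives and false negatives, respectively.
    """
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": _ratio(tp, tp + fn),  # fraction of binding nucleotides recovered
        "specificity": _ratio(tn, tn + fp),  # fraction of non-binding nucleotides rejected
        "ppv": _ratio(tp, tp + fp),          # positive predictive value (precision)
        "npv": _ratio(tn, tn + fn),          # negative predictive value
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    for name, value in binary_metrics(tp=50, fp=10, tn=30, fn=10).items():
        print(f"{name}: {value:.3f}")
```

Because MCC uses all four cells, it is less sensitive to class imbalance than sensitivity or specificity alone, which is one reason it is often reported alongside them.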
Walia, Rasna R; Xue, Li C; Wilkins, Katherine; El-Manzalawy, Yasser; Dobbs, Drena; Honavar, Vasant
2014-01-01
Protein-RNA interactions are central to essential cellular processes such as protein synthesis and regulation of gene expression and play roles in human infectious and genetic diseases. Reliable identification of protein-RNA interfaces is critical for understanding the structural bases and functional implications of such interactions and for developing effective approaches to rational drug design. Sequence-based computational methods offer a viable, cost-effective way to identify putative RNA-binding residues in RNA-binding proteins. Here we report two novel approaches: (i) HomPRIP, a sequence homology-based method for predicting RNA-binding sites in proteins; (ii) RNABindRPlus, a new method that combines predictions from HomPRIP with those from an optimized Support Vector Machine (SVM) classifier trained on a benchmark dataset of 198 RNA-binding proteins. Although highly reliable, HomPRIP cannot make predictions for the unaligned parts of query proteins and its coverage is limited by the availability of close sequence homologs of the query protein with experimentally determined RNA-binding sites. RNABindRPlus overcomes these limitations. We compared the performance of HomPRIP and RNABindRPlus with that of several state-of-the-art predictors on two test sets, RB44 and RB111. On a subset of proteins for which homologs with experimentally determined interfaces could be reliably identified, HomPRIP outperformed all other methods achieving an MCC of 0.63 on RB44 and 0.83 on RB111. RNABindRPlus was able to predict RNA-binding residues of all proteins in both test sets, achieving an MCC of 0.55 and 0.37, respectively, and outperforming all other methods, including those that make use of structure-derived features of proteins. More importantly, RNABindRPlus outperforms all other methods for any choice of tradeoff between precision and recall. 
An important advantage of both HomPRIP and RNABindRPlus is that they rely on readily available sequence and sequence-derived features of RNA-binding proteins. A webserver implementation of both methods is freely available at http://einstein.cs.iastate.edu/RNABindRPlus/.
Nguyen, Thao Thi; Lee, Hyun-Hee; Park, Jungwook; Park, Inmyoung; Seo, Young-Su
2017-04-01
As a step towards discovering novel pathogenesis-related proteins, we performed a genome scale computational identification and characterization of secreted and transmembrane (TM) proteins, which are mainly responsible for bacteria-host interactions and interactions with other bacteria, in the genomes of six representative Burkholderia species. The species comprised plant pathogens (B. glumae BGR1, B. gladioli BSR3), human pathogens (B. pseudomallei K96243, B. cepacia LO6), and plant-growth promoting endophytes (Burkholderia sp. KJ006, B. phytofirmans PsJN). The proportions of putative classically secreted proteins (CSPs) and TM proteins among the species were relatively high, up to approximately 20%. Lower proportions of putative type 3 non-classically secreted proteins (T3NCSPs) (~10%) and unclassified non-classically secreted proteins (NCSPs) (~5%) were observed. The numbers of TM proteins among the three clusters (plant pathogens, human pathogens, and endophytes) were different, while the distribution of these proteins according to the number of TM domains was conserved in which TM proteins possessing 1, 2, 4, or 12 TM domains were the dominant groups in all species. In addition, we observed conservation in the protein size distribution of the secreted protein groups among the species. There were species-specific differences in the functional characteristics of these proteins in the various groups of CSPs, T3NCSPs, and unclassified NCSPs. Furthermore, we assigned the complete sets of the conserved and unique NCSP candidates of the collected Burkholderia species using sequence similarity searching. This study could provide new insights into the relationship among plant-pathogenic, human-pathogenic, and endophytic bacteria.
Phage phenomics: Physiological approaches to characterize novel viral proteins
Sanchez, Savannah E. [San Diego State Univ., San Diego, CA (United States); Cuevas, Daniel A. [San Diego State Univ., San Diego, CA (United States); Rostron, Jason E. [San Diego State Univ., San Diego, CA (United States); Liang, Tiffany Y. [San Diego State Univ., San Diego, CA (United States); Pivaroff, Cullen G. [San Diego State Univ., San Diego, CA (United States); Haynes, Matthew R. [San Diego State Univ., San Diego, CA (United States); Nulton, Jim [San Diego State Univ., San Diego, CA (United States); Felts, Ben [San Diego State Univ., San Diego, CA (United States); Bailey, Barbara A. [San Diego State Univ., San Diego, CA (United States); Salamon, Peter [San Diego State Univ., San Diego, CA (United States); Edwards, Robert A. [San Diego State Univ., San Diego, CA (United States); Argonne National Lab. (ANL), Argonne, IL (United States); Burgin, Alex B. [Broad Institute, Cambridge, MA (United States); Segall, Anca M. [San Diego State Univ., San Diego, CA (United States); Rohwer, Forest [San Diego State Univ., San Diego, CA (United States)
2018-06-21
Current investigations into phage-host interactions are dependent on extrapolating knowledge from (meta)genomes. Interestingly, 60-95% of all phage sequences share no homology to current annotated proteins. As a result, a large proportion of phage genes are annotated as hypothetical. This reality heavily affects the annotation of both structural and auxiliary metabolic genes. Here we present phenomic methods designed to capture the physiological response(s) of a selected host during expression of one of these unknown phage genes. Multi-phenotype Assay Plates (MAPs) are used to monitor the diversity of host substrate utilization and subsequent biomass formation, while metabolomics provides by-product analysis by monitoring metabolite abundance and diversity. Both tools are used simultaneously to provide a phenotypic profile associated with expression of a single putative phage open reading frame (ORF). Thus, representative results for both methods are compared, highlighting the phenotypic profile differences of a host carrying either putative structural or metabolic phage genes. In addition, the visualization techniques and high throughput computational pipelines that facilitated experimental analysis are presented.
A transferrin gene associated with development and 2-tridecanone tolerance in Helicoverpa armigera
Zhang, L; Shang, Q; Lu, Y; Zhao, Q; Gao, X
2015-01-01
The full-length cDNA (2320 bp) encoding a putative iron-binding transferrin protein from Helicoverpa armigera was cloned and named HaTrf. The putative HaTrf sequence included 670 amino acids with a molecular mass of approximately 76 kDa. Quantitative PCR results demonstrated that the transcriptional level of HaTrf was significantly higher in the sixth instar and pupa stages as compared with other developmental stages. HaTrf transcripts were more abundant in fat bodies and in the epidermis than in malpighian tubules. Compared with the control, the expression of HaTrf increased dramatically 24 h after treatment with 2-tridecanone. Apparent growth inhibition with a dramatic body weight decrease was observed in larvae fed with HaTrf double-stranded RNA (dsRNA), as compared with those fed with green fluorescent protein dsRNA. RNA interference of HaTrf also significantly increased the susceptibility of larvae to 2-tridecanone. These results indicate the possible involvement of HaTrf in tolerance to plant secondary chemicals. PMID:25430818
Khairy, Heba; Meinert, Christina; Wübbeler, Jan Hendrik; Poehlein, Anja; Daniel, Rolf; Voigt, Birgit; Riedel, Katharina; Steinbüchel, Alexander
2016-01-01
Rhodococcus erythropolis MI2 has the extraordinary ability to utilize the xenobiotic 4,4´-dithiodibutyric acid (DTDB). Cleavage of DTDB by the disulfide-reductase Nox, which is the only verified enzyme involved in DTDB-degradation, yields 4-mercaptobutyric acid (4MB). 4MB could act as a building block of a novel polythioester with unknown properties. To completely unravel the catabolism of DTDB, the genome of R. erythropolis MI2 was sequenced, and subsequently the proteome was analyzed. The draft genome sequence consists of approximately 7.2 Mbp with an overall G+C content of 62.25% and 6,859 predicted protein-encoding genes. The genome of strain MI2 is composed of three replicons: one chromosome and two megaplasmids with sizes of 6.45, 0.4 and 0.35 Mbp, respectively. When cells of strain MI2 were cultivated with DTDB as sole carbon source and compared to cells grown with succinate, several interesting proteins with significantly higher expression levels were identified using 2D-PAGE and MALDI-TOF mass spectrometry. A putative luciferase-like monooxygenase-class F420-dependent oxidoreductase (RERY_05640), which is encoded by one of the 126 monooxygenase-encoding genes of the MI2-genome, showed a 3-fold increased expression level. This monooxygenase could oxidize the intermediate 4MB into 4-oxo-4-sulfanylbutyric acid. Next, a desulfurization step, which forms succinic acid and volatile hydrogen sulfide, is proposed. One gene coding for a putative desulfhydrase (RERY_06500) was identified in the genome of strain MI2. However, the gene product was not recognized in the proteome analyses. In contrast, a significant expression level with a ratio of up to 7.3 was determined for a putative sulfide:quinone oxidoreductase (RERY_02710), which could also be involved in the abstraction of the sulfur group. 
As response to the toxicity of the intermediates, several stress response proteins were strongly expressed, including a superoxide dismutase (RERY_05600) and an osmotically induced protein (RERY_02670). Accordingly, novel insights in the catabolic pathway of DTDB were gained. PMID:27977722
Barat, Ashoktaru; Sahoo, Prabhati Kumari; Kumar, Rohit; Pande, Veena
2016-10-01
The solute carriers (SLC) are transmembrane proteins that regulate the transport of various substances (sugars, amino acids, nucleotides, inorganic cations/anions, metals, drugs etc.) across the cell membrane. More than 338 solute carriers (slc) have been reported in fishes, where they play a crucial role in cellular influx and efflux. The study of solute carrier families may reveal many answers regarding the function of transporter genes in the species and their effect in the existing environment. Therefore, we performed RNA sequencing of kidney tissue of the golden mahseer (Tor putitora) using the Illumina platform to identify the solute carrier families and characterized 24 putative functional genes under 15 solute carrier families. Out of 24 putative functional genes, 11 genes were differentially expressed in different tissues (head kidney, trunk kidney, spleen, liver, gill, muscle, intestine and brain) using qRT-PCR assay. The slc5a1, slc5a12, slc12a3, slc13a3, slc22a13 and slc26a6 were highly expressed in kidney. The slc15a2, slc25a47, slc33a1 and slc38a2 were highly expressed in brain and slc30a5 was over-expressed in gill. The unrooted phylogenetic trees of slc2, slc5, slc13 and slc33 were constructed using amino acid sequences of Homo sapiens, Salmo salar, Danio rerio, Cyprinus carpio and Tor putitora. It appears that all the putative solute carrier families are very much conserved in human and fish species including the present fish, golden mahseer. This study provides the first-hand database of solute carrier families, particularly transporter-encoding proteins, in the species. Copyright © 2016 Elsevier Inc. All rights reserved.
Pornbanlualap, Somchai; Chalopagorn, Pornchanok
2011-08-01
The sequencing of the genome of Streptomyces coelicolor A3(2) identified seven putative adenine/adenosine deaminases and adenosine deaminase-like proteins, none of which have been biochemically characterized. This report describes recombinant expression, purification and characterization of SCO4901, which had been annotated in databases as a putative adenosine deaminase. The purified putative adenosine deaminase gives a subunit Mr=48,400 on denaturing gel electrophoresis and an oligomer molecular weight of approximately 182,000 by comparative gel filtration. These values are consistent with the active enzyme being composed of four subunits with identical molecular weights. The turnover rate of adenosine is 11.5 s⁻¹ at 30 °C. Since adenine is deaminated ∼10³-fold more slowly by the enzyme than adenosine, these data strongly show that the purified enzyme is an adenosine deaminase (ADA) and not an adenine deaminase (ADE). Other adenine nucleosides/nucleotides, including 9-β-D-arabinofuranosyl-adenine (ara-A), 5'-AMP, 5'-ADP and 5'-ATP, are not substrates for the enzyme. Coformycin and 2'-deoxycoformycin are potent competitive inhibitors of the enzyme with inhibition constants of 0.25 and 3.4 nM, respectively. Amino acid sequence alignment of ScADA with ADAs from other organisms reveals that eight of the nine highly conserved catalytic site residues in other ADAs are also conserved in ScADA. The only non-conserved residue is Asn317, which replaces Asp296 in the murine enzyme. Based on these data, it is suggested here that ADA and ADE proteins are divergently related enzymes that have evolved from a common α/β barrel scaffold to catalyze the deamination of different substrates, using a similar catalytic mechanism. Copyright © 2011 Elsevier Inc. All rights reserved.
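The inference of a four-subunit enzyme in this abstract follows from dividing the native (oligomer) molecular weight by the subunit molecular weight and rounding to the nearest whole number. A minimal sketch of that arithmetic, using the two values quoted in the abstract; the function name is hypothetical.

```python
def estimate_subunit_count(oligomer_mass, subunit_mass):
    """Return (raw ratio, nearest whole number of subunits).

    Assumes a homo-oligomer, i.e. all subunits have the same mass.
    """
    ratio = oligomer_mass / subunit_mass
    return ratio, round(ratio)


if __name__ == "__main__":
    # Values from the abstract: ~182,000 by gel filtration, 48,400 per subunit.
    ratio, n_subunits = estimate_subunit_count(182_000, 48_400)
    print(f"ratio = {ratio:.2f}, nearest integer = {n_subunits}")
```

Gel filtration gives only an approximate native mass, so a ratio near 3.8 is read as a tetramer rather than as evidence for a non-integer stoichiometry.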
Chiriac, Cecilia; Baricz, Andreea
2018-01-01
ABSTRACT The draft genome assembly of Janthinobacterium sp. strain ROICE36 has 207 contigs, with a total genome size of 5,977,006 bp and a G+C content of 62%. Preliminary genome analysis identified 5,363 protein-coding genes and a total of 7 secondary metabolic gene clusters (encoding bacteriocins, nonribosomal peptide-synthetase [NRPS], terpene, hserlactone, and other ketide synthases). PMID:29650588
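The assembly statistics reported in this abstract (number of contigs, total genome size, G+C content) are simple aggregates over the contig sequences. The sketch below shows how such values could be computed from a list of contig strings; it is an illustration with made-up input, not the pipeline used for strain ROICE36.

```python
def assembly_stats(contigs):
    """Summarize a draft assembly given its contig sequences as strings.

    Returns the contig count, the total length in bp, and the G+C
    percentage over all bases (ambiguous bases count toward length only).
    """
    total_bp = sum(len(contig) for contig in contigs)
    gc_bp = sum(
        contig.upper().count("G") + contig.upper().count("C")
        for contig in contigs
    )
    gc_percent = 100.0 * gc_bp / total_bp if total_bp else 0.0
    return {
        "n_contigs": len(contigs),
        "total_bp": total_bp,
        "gc_percent": gc_percent,
    }


if __name__ == "__main__":
    # Toy contigs, for illustration only.
    print(assembly_stats(["GGCC", "ATAT", "gcgcat"]))
```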
A database of annotated tentative orthologs from crop abiotic stress transcripts.
Balaji, Jayashree; Crouch, Jonathan H; Petite, Prasad V N S; Hoisington, David A
2006-10-07
A minimal requirement to initiate a comparative genomics study on plant responses to abiotic stresses is a dataset of orthologous sequences. The availability of a large amount of sequence information, including that derived from stress cDNA libraries, allows for the identification of stress related genes and orthologs associated with the stress response. Orthologous sequences serve as tools to explore genes and their relationships across species. For this purpose, ESTs from stress cDNA libraries across 16 crop species including 6 important cereal crops and 10 dicots were systematically collated and subjected to bioinformatics analysis such as clustering, grouping of tentative orthologous sets, identification of protein motifs/patterns in the predicted protein sequence, and annotation with stress conditions, tissue/library source and putative function. All data are available to the scientific community at http://intranet.icrisat.org/gt1/tog/homepage.htm. We believe that the availability of annotated plant abiotic stress ortholog sets will be a valuable resource for researchers studying the biology of environmental stresses in plant systems, molecular evolution and genomics.
Cristofari, Gaël; Ivanyi-Nagy, Roland; Gabus, Caroline; Boulant, Steeve; Lavergne, Jean-Pierre; Penin, François; Darlix, Jean-Luc
2004-01-01
The hepatitis C virus (HCV) is an important human pathogen causing chronic hepatitis, liver cirrhosis and hepatocellular carcinoma. HCV is an enveloped virus with a positive-sense, single-stranded RNA genome encoding a single polyprotein that is processed to generate viral proteins. Several hundred molecules of the structural Core protein are thought to coat the genome in the viral particle, as do nucleocapsid (NC) protein molecules in Retroviruses, another class of enveloped viruses containing a positive-sense RNA genome. Retroviral NC proteins also possess nucleic acid chaperone properties that play critical roles in the structural remodelling of the genome during retrovirus replication. This analogy between HCV Core and retroviral NC proteins prompted us to investigate the putative nucleic acid chaperoning properties of the HCV Core protein. Here we report that Core protein chaperones the annealing of complementary DNA and RNA sequences and the formation of the most stable duplex by strand exchange. These results show that the HCV Core is a nucleic acid chaperone similar to retroviral NC proteins. We also find that the Core protein directs dimerization of HCV (+) RNA 3′ untranslated region which is promoted by a conserved palindromic sequence possibly involved at several stages of virus replication. PMID:15141033
Zhang, Li; Liang, Shuli; Zhou, Xinying; Jin, Zi; Jiang, Fengchun; Han, Shuangyan; Zheng, Suiping
2013-01-01
Glycosylphosphatidylinositol (GPI)-anchored glycoproteins have various intrinsic functions in yeasts and different uses in vitro. In the present study, the genome of Pichia pastoris GS115 was screened for potential GPI-modified cell wall proteins. Fifty putative GPI-anchored proteins were selected on the basis of (i) the presence of a C-terminal GPI attachment signal sequence, (ii) the presence of an N-terminal signal sequence for secretion, and (iii) the absence of transmembrane domains in mature protein. The predicted GPI-anchored proteins were fused to an alpha-factor secretion signal as a substitute for their own N-terminal signal peptides and tagged with the chimeric reporters FLAG tag and mature Candida antarctica lipase B (CALB). The expression of fusion proteins on the cell surface of P. pastoris GS115 was determined by whole-cell flow cytometry and immunoblotting analysis of the cell wall extracts obtained by β-1,3-glucanase digestion. CALB displayed on the cell surface of P. pastoris GS115 with the predicted GPI-anchored proteins was examined on the basis of potential hydrolysis of p-nitrophenyl butyrate. Finally, 13 proteins were confirmed to be GPI-modified cell wall proteins in P. pastoris GS115, which can be used to display heterologous proteins on the yeast cell surface. PMID:23835174
Zang, Wen; Eckstein, Peter E; Colin, Mark; Voth, Doug; Himmelbach, Axel; Beier, Sebastian; Stein, Nils; Scoles, Graham J; Beattie, Aaron D
2015-07-01
The candidate gene for the barley Un8 true loose smut resistance gene encodes a deduced protein containing two tandem protein kinase domains. In North America, durable resistance against all known isolates of barley true loose smut, caused by the basidiomycete pathogen Ustilago nuda (Jens.) Rostr. (U. nuda), is under the control of the Un8 resistance gene. Previous genetic studies mapped Un8 to the long arm of chromosome 5 (1HL). Here, a population of 4625 lines segregating for Un8 was used to delimit the Un8 gene to a 0.108 cM interval on chromosome arm 1HL, and assign it to fingerprinted contig 546 of the barley physical map. The minimal tiling path was identified for the Un8 locus using two flanking markers and consisted of two overlapping bacterial artificial chromosomes. One gene located close to a marker co-segregating with Un8 showed high sequence identity to a disease resistance gene containing two kinase domains. Sequencing of the candidate gene from the parents of the segregating population, and from an additional 19 barley lines representing a broader spectrum of diversity, showed that there was no intron in alleles present in either resistant or susceptible lines, and that fifteen amino acid variations unique to the deduced protein sequence in resistant lines differentiated it from the deduced protein sequences in susceptible lines. Some of these variations were present within putative functional domains, which may cause a loss of function in the deduced protein sequences within susceptible lines.
Geyer, David D.; Spence, M. Anne; Johannes, Meriam; Flodman, Pamela; Clancy, Kevin P.; Berry, Rebecca; Sparkes, Robert S.; Jonsen, Matthew D.; Isenberg, Sherwin J.; Bateman, J. Bronwyn
2006-01-01
PURPOSE: To further elucidate the cataract phenotype, and identify the gene and mutation for autosomal dominant cataract (ADC) in an American family of European descent (ADC2) by sequencing the major intrinsic protein gene (MIP), a candidate based on linkage to chromosome 12q13. DESIGN: Observational case series and laboratory experimental study. METHODS: We examined two at-risk individuals in ADC2. We PCR-amplified and sequenced all four exons and all intron-exon boundaries of the MIP gene from genomic and cloned DNA in affected members to confirm one variant as the putative mutation. RESULTS: We found a novel single deletion of nucleotide (nt) 3223 (within codon 235) in exon four, causing a frameshift that alters 41 of 45 subsequent amino acids and creates a premature stop codon. CONCLUSIONS: We identified a novel single base pair deletion in the MIP gene and conclude that it is a pathogenic sequence alteration. PMID:16564824
Janeček, Stefan; Blesák, Karol
2011-08-01
The glycoside hydrolase family 57 (GH57) contains α-amylase and a few other amylolytic specificities. It counts ~400 members from Archaea (1/4) and Bacteria (3/4), mostly of extremophilic prokaryotes. Only 17 GH57 enzymes have been biochemically characterized. The main goal of the present bioinformatics study was to analyze sequences having the clear GH57 α-amylase features. Of the 107 GH57 sequences, 59 were evaluated as α-amylases (containing both GH57 catalytic residues), whereas 48 were assigned as GH57 α-amylase-like proteins (having a substitution in one or both catalytic residues). Forty-eight of 59 α-amylases were from Archaea, but 42 of 48 α-amylase-like proteins were of bacterial origin. The catalytic residues were substituted in most cases in Bacteroides and Prevotella by serine (instead of catalytic nucleophile glutamate) and glutamate (instead of proton donor aspartate). The GH57 α-amylase specificity has thus been evolved and kept enzymatically active mainly in Archaea.
Galinier, Richard; van Beurden, Steven; Amilhat, Elsa; Castric, Jeannette; Schoehn, Guy; Verneau, Olivier; Fazio, Géraldine; Allienne, Jean-François; Engelsma, Marc; Sasal, Pierre; Faliex, Elisabeth
2012-06-01
Eel virus European X (EVEX) was first isolated from diseased European eel Anguilla anguilla in Japan at the end of the 1970s. The virus was tentatively classified into the Rhabdoviridae family on the basis of morphology and serological cross-reactivity. This family of viruses is organized into six genera and currently comprises approximately 200 members, many of which are still unassigned because of the lack of molecular data. This work presents the morphological, biochemical and genetic characterizations of EVEX, and proposes a taxonomic classification for this virus. We provide its complete genome sequence, plus a comprehensive sequence comparison between isolates from different geographical origins. The genome encodes the five classical structural proteins plus an overlapping open reading frame in the phosphoprotein gene, coding for a putative C protein. The phylogenetic relationship with other rhabdoviruses indicates that EVEX is most closely related to the Vesiculovirus genus and shares the highest identity with trout rhabdovirus 903/87. Copyright © 2012 Elsevier B.V. All rights reserved.
Liu, Yan-Hua; Liu, Xin-Xin; Zhang, Ming-Hai
2016-07-01
Sika deer (Cervus nippon Temminck 1836) are classified in the order Artiodactyla, family Cervidae, subfamily Cervinae. At present, the phylogenetic studies of C. nippon are problematic. In this study, we first determined and described the complete mitochondrial sequence of the wild C. nippon hortulorum. The complete mitogenome sequence is 16,566 bp in length, including 13 protein-coding genes, two rRNA genes, 22 tRNA genes, a putative control region (CR) and a light-strand replication origin (OL). The overall base composition was 33.4% A, 28.6% T, 24.5% C, 13.5% G, with a 62.0% AT bias. The 13 protein-coding genes encode 3782 amino acids in total. To further validate the newly determined sequences and the phylogeny of Sika deer, phylogenetic trees involving the 15 most closely related species available in the GenBank database were constructed. These results are expected to provide useful molecular data for deer species identification and further phylogenetic studies of Artiodactyla.
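The base composition and AT bias reported above are simple per-base percentages. The following is a minimal Python sketch of that arithmetic, not the authors' pipeline; the function names are hypothetical, and the only data used are the four percentages quoted in the abstract.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Return the percentage of A, T, C and G in a DNA sequence.

    Symbols other than A/T/C/G (gaps, N, IUPAC ambiguity codes) are ignored.
    """
    counts = Counter(b for b in seq.upper() if b in "ATCG")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/T/C/G bases")
    return {b: 100.0 * counts[b] / total for b in "ATCG"}


def at_content(seq: str) -> float:
    """Return the combined A + T percentage of a DNA sequence."""
    comp = base_composition(seq)
    return comp["A"] + comp["T"]


# Consistency check on the percentages quoted in the abstract.
reported = {"A": 33.4, "T": 28.6, "C": 24.5, "G": 13.5}
reported_at = round(reported["A"] + reported["T"], 1)   # A + T
reported_total = round(sum(reported.values()), 1)       # all four bases
```

Applied to a full mitogenome record, `at_content` would be called on the sequence string read from a FASTA file; that file handling is left out here.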
Peoples, R J; Cisco, M J; Kaplan, P; Francke, U
1998-01-01
We have identified a novel gene (WBSCR9) within the common Williams-Beuren syndrome (WBS) deletion by interspecies sequence conservation. The WBSCR9 gene encodes a roughly 7-kb transcript with an open reading frame of 1483 amino acids and a predicted protein product size of 170.8 kDa. WBSCR9 is comprised of at least 20 exons extending over 60 kb. The transcript is expressed ubiquitously throughout development and is subject to alternative splicing. Functional motifs identified by sequence homology searches include a bromodomain; a PHD, or C4HC3, finger; several putative nuclear localization signals; four nuclear receptor binding motifs; a polyglutamate stretch and two PEST sequences. Bromodomains, PHD motifs and nuclear receptor binding motifs are cardinal features of proteins that are involved in chromatin remodeling and modulation of transcription. Haploinsufficiency for WBSCR9 gene products may contribute to the complex phenotype of WBS by interacting with tissue-specific regulatory factors during development.
Proteomic analysis of pollination-induced corolla senescence in petunia.
Bai, Shuangyi; Willard, Belinda; Chapin, Laura J; Kinter, Michael T; Francis, David M; Stead, Anthony D; Jones, Michelle L
2010-02-01
Senescence represents the last phase of petal development during which macromolecules and organelles are degraded and nutrients are recycled to developing tissues. To understand better the post-transcriptional changes regulating petal senescence, a proteomic approach was used to profile protein changes during the senescence of Petunia × hybrida 'Mitchell Diploid' corollas. Total soluble proteins were extracted from unpollinated petunia corollas at 0, 24, 48, and 72 h after flower opening and at 24, 48, and 72 h after pollination. Two-dimensional gel electrophoresis (2-DE) was used to identify proteins that were differentially expressed in non-senescing (unpollinated) and senescing (pollinated) corollas, and image analysis was used to determine which proteins were up- or down-regulated by the experimentally determined cut-off of 2.1-fold for P < 0.05. One hundred and thirty-three differentially expressed protein spots were selected for sequencing. Liquid chromatography-tandem mass spectrometry (LC-MS/MS) was used to determine the identity of these proteins. Searching translated EST databases and the NCBI non-redundant protein database, it was possible to assign a putative identification to greater than 90% of these proteins. Many of the senescence up-regulated proteins were putatively involved in defence and stress responses or macromolecule catabolism. Some proteins, not previously characterized during flower senescence, were identified, including an orthologue of the tomato abscisic acid stress ripening protein 4 (ASR4). Gene expression patterns did not always correlate with protein expression, confirming that both proteomic and genomic approaches will be required to obtain a detailed understanding of the regulation of petal senescence.
Morzunov, Sergey P.; Winton, James R.; Nichol, Stuart T.
1995-01-01
Infectious hematopoietic necrosis virus (IHNV), a member of the family Rhabdoviridae, causes a severe disease with high mortality in salmonid fish. The nucleotide sequence (11,131 bases) of the entire genome was determined for the pathogenic WRAC strain of IHNV from southern Idaho. This allowed detailed analysis of all 6 genes, the deduced amino acid sequences of their encoded proteins, and important control motifs including leader, trailer and gene junction regions. Sequence analysis revealed that the 6 virus genes are located along the genome in the 3′ to 5′ order: nucleocapsid (N), polymerase-associated phosphoprotein (P or M1), matrix protein (M or M2), surface glycoprotein (G), a unique non-virion protein (NV) and virus polymerase (L). The IHNV genome RNA was found to have highly complementary termini (15 of 16 nucleotides). The gene junction regions display the highly conserved sequence UCURUC(U)7RCCGUG(N)4CACR (in the vRNA sense), which includes the typical rhabdovirus transcription termination/polyadenylation signal and a novel putative transcription initiation signal. Phylogenetic analysis of M, G and L protein sequences allowed insights into the evolutionary and taxonomic relationship of rhabdoviruses of fish relative to those of insects or mammals, and a broader sense of the relationship of non-segmented negative-strand RNA viruses. Based on these data, a new genus, piscivirus, is proposed which will initially contain IHNV, viral hemorrhagic septicemia virus and Hirame rhabdovirus.
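The gene junction consensus above is written in IUPAC shorthand (R = A or G, N = any base, and (X)n = n consecutive copies of X). As a hedged illustration of how such a consensus could be scanned for in a sequence, the Python sketch below converts it into a regular expression. The function name and the test sequence are assumptions made up for this example; they do not come from the paper.

```python
import re

# IUPAC RNA symbols needed here, mapped to regex character classes.
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "[AG]",    # purine
    "Y": "[CU]",    # pyrimidine
    "N": "[ACGU]",  # any base
}


def consensus_to_regex(consensus: str) -> str:
    """Translate a consensus such as 'UC(U)7R(N)4' into a regex string."""
    parts = []
    # Each token is either '(X)n' (symbol X repeated n times) or one symbol.
    tokens = re.findall(r"\(([A-Z])\)(\d+)|([A-Z])", consensus)
    for repeated, count, single in tokens:
        if repeated:
            parts.append(f"{IUPAC_RNA[repeated]}{{{count}}}")
        else:
            parts.append(IUPAC_RNA[single])
    return "".join(parts)


JUNCTION = "UCURUC(U)7RCCGUG(N)4CACR"  # consensus quoted in the abstract
junction_regex = consensus_to_regex(JUNCTION)
junction_pattern = re.compile(junction_regex)

# Synthetic sequence built to satisfy the consensus (not real IHNV data).
synthetic_junction = "UCUAUC" + "U" * 7 + "GCCGUG" + "AAAA" + "CACG"
```

Calling `junction_pattern.finditer(genome)` on a vRNA-sense genome string would then list candidate junctions; whether each hit is a functional junction is a separate question.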
Hall, R L; Moyer, R W
1991-01-01
Entomopoxvirus virions are frequently contained within crystalline occlusion bodies, which are composed primarily of a single protein, spheroidin, which is analogous to the polyhedrin protein of baculovirus. The spheroidin gene of Amsacta moorei entomopoxvirus was identified following the microsequencing of polypeptides generated from cyanogen bromide treatment of spheroidin and the subsequent synthesis of oligonucleotide hybridization probes. DNA sequencing of a 6.8-kb region of DNA containing the spheroidin gene showed that the spheroidin protein is derived from a 3.0-kb open reading frame potentially encoding a protein of 115 kDa. Three copies of the heptanucleotide, TTTTTNT, a sequence associated with early gene transcription in the vertebrate poxviruses, and four in-frame translational termination signals were found within 60 bp upstream of the putative spheroidin gene promoter (TAAATG). The spheroidin gene promoter region contains the sequence TAAATG, which is found in many late promoters of the vertebrate poxviruses and which serves as the site of transcriptional initiation, as shown by primer extension. Primer extension experiments also showed that spheroidin gene transcripts contain 5' poly(A) sequences typical of vertebrate poxvirus late transcripts. The 92 bases upstream of the initiating TAAATG are unusually A + T rich and contain only 7 G or C residues. An analysis of open reading frames around the spheroidin gene suggests that the colinear core of "essential genes" typical of the vertebrate poxviruses is absent in A. moorei entomopoxvirus. PMID:1942245
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jackson, P.J.; Walthers, E.A.; Richmond, K.L.
1997-04-01
PCR analysis of 198 Bacillus anthracis isolates revealed a variable region of DNA sequence differing in length among the isolates. Five polymorphisms differed by the presence of two to six copies of the 12-bp tandem repeat 5′-CAATATCAACAA-3′. This variable-number tandem repeat (VNTR) region is located within a larger sequence containing one complete open reading frame that encodes a putative 30-kDa protein. Length variation did not change the reading frame of the encoded protein and only changed the copy number of a 4-amino-acid sequence (QYQQ) from 2 to 6. The structure of the VNTR region suggests that these multiple repeats are generated by recombination or polymerase slippage. Protein structures predicted from the reverse-translated DNA sequence suggest that any structural changes in the encoded protein are confined to the region encoded by the VNTR sequence. Copy number differences in the VNTR region were used to define five different B. anthracis alleles. Characterization of 198 isolates revealed allele frequencies of 6.1, 17.7, 59.6, 5.6, and 11.1% sequentially from shorter to longer alleles. The high degree of polymorphism in the VNTR region provides a criterion for assigning isolates to five allelic categories. There is a correlation between categories and geographic distribution. Such molecular markers can be used to monitor the epidemiology of anthrax outbreaks in domestic and native herbivore populations. 22 refs., 4 figs., 3 tabs.
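Because the repeat unit is 12 bp (four codons), adding or removing copies keeps the reading frame intact, and each copy translates to QYQQ under the standard genetic code. The Python sketch below illustrates both points; it is only an illustration, the function names are hypothetical, and the test sequence is synthetic rather than an actual B. anthracis locus.

```python
REPEAT = "CAATATCAACAA"  # 12-bp repeat unit quoted in the abstract

# Only the two codons that occur in the repeat (standard genetic code).
CODONS = {"CAA": "Q", "TAT": "Y"}


def translate_unit(unit: str = REPEAT) -> str:
    """Translate the repeat unit codon by codon, starting at its first base."""
    return "".join(CODONS[unit[i:i + 3]] for i in range(0, len(unit), 3))


def max_tandem_copies(seq: str, unit: str = REPEAT) -> int:
    """Return the longest run of back-to-back copies of `unit` found in `seq`."""
    seq = seq.upper()
    best = 0
    start = seq.find(unit)
    while start != -1:
        copies, pos = 0, start
        while seq.startswith(unit, pos):   # extend the run one unit at a time
            copies += 1
            pos += len(unit)
        best = max(best, copies)
        start = seq.find(unit, start + 1)  # try every later occurrence too
    return best


# Synthetic amplicon with four copies between made-up flanking bases.
synthetic_amplicon = "GGG" + REPEAT * 4 + "TTT"
```

An allele label could then be assigned from the copy number alone, which mirrors the five categories (two to six copies) described above.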
Goh, C J; Park, D; Lee, J S; Sebastiani, F; Hahn, Y
2018-01-01
Amalgaviridae is a family of double-stranded, monosegmented RNA viruses that are associated with plants, fungi, microsporidians, and animals. A sequence contig derived from the transcriptome of a eudicot, Cistus incanus (the family Cistaceae; commonly known as hoary rockrose), was identified as the genome sequence of a novel plant RNA virus and named Cistus incanus RNA virus 1 (CiRV1). Sequence comparison and phylogenetic analysis indicated that CiRV1 is a novel species of the genus Amalgavirus in the family Amalgaviridae. The CiRV1 genome contig has two overlapping open reading frames (ORFs). ORF1 encodes a putative replication factory matrix-like protein, while ORF2 encodes a RNA-dependent RNA polymerase (RdRp) domain. An ORF1+2 fusion protein, which functions in viral RNA replication, is produced by a +1 programmed ribosomal frameshifting (PRF) mechanism. A +1 PRF motif UUU_CGU, which matches the conserved amalgavirus +1 PRF consensus sequence UUU_CGN, was found at the boundary of CiRV1 ORF1 and ORF2. Comparison of 25 amalgavirus ORF1+2 fusion proteins revealed that only three different positions within a 13-amino acid segment were recurrently used at the boundary, possibly being selected so as not to interfere with correct folding and function of the fusion protein. CiRV1 is the first virus found to be associated with the Cistus species and may be useful for studying amalgaviruses.
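As a rough illustration of how the +1 PRF consensus UUU_CGN could be located in a contig, the Python sketch below scans for the motif and keeps hits where UUU is a codon of the upstream reading frame. Reading the underscore as a codon boundary in the ORF1 frame is an assumption of this sketch, as are the function name and the toy sequence.

```python
import re

# Lookahead so that overlapping candidate sites are all reported.
PRF_MOTIF = re.compile(r"(?=UUUCG[ACGU])")


def find_prf_candidates(rna: str, orf_start: int = 0) -> list:
    """Return 0-based positions of UUU_CGN with UUU in frame with `orf_start`.

    DNA input is accepted: T is converted to U before searching.
    """
    rna = rna.upper().replace("T", "U")
    hits = []
    for match in PRF_MOTIF.finditer(rna):
        offset = match.start() - orf_start
        if offset >= 0 and offset % 3 == 0:
            hits.append(match.start())
    return hits


# Toy sequence, codons in frame 0: AUG GCU UUU CGU AAA (not a real genome).
toy_rna = "AUGGCUUUUCGUAAA"
toy_hits = find_prf_candidates(toy_rna)
```

A hit only marks a sequence-level candidate; confirming a frameshift site would still require checking that ORF2 continues in the +1 frame downstream.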
Park, Dongbin; Goh, Chul Jun; Kim, Hyein; Hahn, Yoonsoo
2018-01-01
The genome sequences of two novel monopartite RNA viruses were identified in a common eelgrass (Zostera marina) transcriptome dataset. Sequence comparison and phylogenetic analyses revealed that these two novel viruses belong to the genus Amalgavirus in the family Amalgaviridae. They were named Zostera marina amalgavirus 1 (ZmAV1) and Zostera marina amalgavirus 2 (ZmAV2). Genomes of both ZmAV1 and ZmAV2 contain two overlapping open reading frames (ORFs). ORF1 encodes a putative replication factory matrix-like protein, while ORF2 encodes a RNA-dependent RNA polymerase (RdRp) domain. The fusion protein (ORF1+2) of ORF1 and ORF2, which mediates RNA replication, was produced using the +1 programmed ribosomal frameshifting (PRF) mechanism. The +1 PRF motif sequence, UUU_CGN, which is highly conserved among known amalgaviruses, was also found in ZmAV1 and ZmAV2. Multiple sequence alignment of the ORF1+2 fusion proteins from 24 amalgaviruses revealed that +1 PRF occurred only at three different positions within the 13-amino acid-long segment, which was surrounded by highly conserved regions on both sides. This suggested that the +1 PRF may be constrained by the structure of fusion proteins. Genome sequences of ZmAV1 and ZmAV2, which are the first viruses to be identified in common eelgrass, will serve as useful resources for studying evolution and diversity of amalgaviruses. PMID:29628822
1991-01-01
We recently described the identification of BOS1 (Newman, A., J. Shim, and S. Ferro-Novick. 1990. Mol. Cell. Biol. 10:3405-3414.). BOS1 is a gene that in multiple copy suppresses the growth and secretion defect of bet1 and sec22, two mutants that disrupt transport from the ER to the Golgi complex in yeast. The ability of BOS1 to specifically suppress mutants blocked at a particular stage of the secretory pathway suggested that this gene encodes a protein that functions in this process. The experiments presented in this study support this hypothesis. Specifically, the BOS1 gene was found to be essential for cellular growth. Furthermore, cells depleted of the Bos1 protein fail to transport pro-alpha-factor and carboxypeptidase Y (CPY) to the Golgi apparatus. This defect in export leads to the accumulation of an extensive network of ER and small vesicles. DNA sequence analysis predicts that Bos1 is a 27-kD protein containing a putative membrane-spanning domain. This prediction is supported by differential centrifugation experiments. Thus, Bos1 appears to be a membrane protein that functions in conjunction with Bet1 and Sec22 to facilitate the transport of proteins at a step subsequent to translocation into the ER but before entry into the Golgi apparatus. PMID:2007627
Różycka, Mirosława; Wojtas, Magdalena; Jakób, Michał; Stigloher, Christian; Grzeszkowiak, Mikołaj; Mazur, Maciej; Ożyhar, Andrzej
2014-01-01
Fish otoliths, biominerals composed of calcium carbonate with a small amount of organic matrix, are involved in the functioning of the inner ear. Starmaker (Stm) from zebrafish (Danio rerio) was the first protein found to be capable of controlling the formation of otoliths. Recently, a gene was identified encoding the Starmaker-like (Stm-l) protein from medaka (Oryzias latipes), a putative homologue of Stm and human dentine sialophosphoprotein. Although there is no sequence similarity between Stm-l and Stm, Stm-l was suggested to be involved in the biomineralization of otoliths, as had been observed for Stm even before. The molecular properties and functioning of Stm-l as a putative regulatory protein in otolith formation have not been characterized yet. A comprehensive biochemical and biophysical analysis of recombinant Stm-l, along with in silico examinations, indicated that Stm-l exhibits properties of a coil-like intrinsically disordered protein. Stm-l possesses an elongated and pliable structure that is able to adopt a more ordered and rigid conformation under the influence of different factors. An in vitro assay of the biomineralization activity of Stm-l indicated that Stm-l affected the size, shape and number of calcium carbonate crystals. The functional significance of intrinsically disordered properties of Stm-l and the possible role of this protein in controlling the formation of calcium carbonate crystals is discussed. PMID:25490041
The organisation and interviral homologies of genes at the 3' end of tobacco rattle virus RNA1
Boccara, Martine; Hamilton, William D. O.; Baulcombe, David C.
1986-01-01
The RNA1 of tobacco rattle virus (TRV) has been cloned as cDNA and the nucleotide sequence determined of 2 kb from the 3'-terminal region. The sequence contains three long open reading frames. One of these starts 5' of the cDNA and probably corresponds to the carboxy-terminal sequence of a 170-K protein encoded on RNA1. The deduced protein sequence from this reading frame shows homology with the putative replicases of tobacco mosaic virus (TMV) and tricornaviruses. The location of the second open reading frame, which encodes a 29-K polypeptide, was shown by Northern blot analysis to coincide with a 1.6-kb subgenomic RNA. The validity of this reading frame was confirmed by showing that the cDNA extending over this region could be transcribed and translated in vitro to produce a polypeptide of the predicted size which co-migrates in electrophoresis with a translation product of authentic viral RNA. The sequence of this 29-K polypeptide showed homology with two regions in the 30-K protein of TMV. This homology includes positions in the TMV 30-K protein where mutations have been identified which affect the transport of virus between cells. The third open reading frame encodes a potential 16-K protein and was shown by Northern blot hybridisation to be contained within the region of a 0.7-kb subgenomic RNA which is found in cellular RNA of infected cells but not virus particles. The many similarities between TRV and TMV in viral morphology, gene organisation and sequence suggest that these two viral groups may share a common viral ancestor. PMID:16453668
McElroy, Kerensa; Mouton, Laurence; Du Pasquier, Louis; Qi, Weihong; Ebert, Dieter
2011-09-01
Collagen-like proteins containing glycine-X-Y repeats have been identified in several pathogenic bacteria potentially involved in virulence. Recently, a collagen-like surface protein, Pcl1a, was identified in Pasteuria ramosa, a spore-forming parasite of Daphnia. Here we characterise 37 novel putative P. ramosa collagen-like protein genes (PCLs). PCR amplification and sequencing across 10 P. ramosa strains showed they were polymorphic, distinguishing genotypes matching known differences in Daphnia/P. ramosa interaction specificity. Thirty PCLs could be divided into four groups based on sequence similarity, conserved N- and C-terminal regions and G-X-Y repeat structure. Group 1, Group 2 and Group 3 PCLs formed triplets within the genome, with one member from each group represented in each triplet. Maximum-likelihood trees suggested that these groups arose through multiple instances of triplet duplication. For Group 1, 2, 3 and 4 PCLs, X was typically proline and Y typically threonine, consistent with other bacterial collagen-like proteins. The amino acid composition of Pcl2 closely resembled Pcl1a, with X typically being glutamic acid or aspartic acid and Y typically being lysine or glutamine. Pcl2 also showed sequence similarity to Pcl1a and contained a predicted signal peptide, cleavage site and transmembrane domain, suggesting that it is a surface protein. Copyright © 2011 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.
Characterization of HIV Transmission in South-East Austria
Hoenigl, Martin; Chaillon, Antoine; Kessler, Harald H.; Haas, Bernhard; Stelzl, Evelyn; Weninger, Karin; Little, Susan J.; Mehta, Sanjay R.
2016-01-01
To gain deeper insight into the epidemiology of HIV-1 transmission in South-East Austria we performed a retrospective analysis of 259 HIV-1 partial pol sequences obtained from unique individuals newly diagnosed with HIV infection in South-East Austria from 2008 through 2014. After quality filtering, putative transmission linkages were inferred when two sequences were ≤1.5% genetically different. Multiple linkages were resolved into putative transmission clusters. Further phylogenetic analyses were performed using BEAST v1.8.1. Finally, we investigated putative links between the 259 sequences from South-East Austria and all publicly available HIV polymerase sequences in the Los Alamos National Laboratory HIV sequence database. We found that 45.6% (118/259) of the sampled sequences were genetically linked with at least one other sequence from South-East Austria forming putative transmission clusters. Clustering individuals were more likely to be men who have sex with men (MSM; p<0.001), infected with subtype B (p<0.001) or subtype F (p = 0.02). Among clustered males who reported only heterosexual (HSX) sex as an HIV risk, 47% clustered closely with MSM (either as pairs or within larger MSM clusters). One hundred and seven of the 259 sequences (41.3%) from South-East Austria had at least one putative inferred linkage with sequences from a total of 69 other countries. In conclusion, analysis of HIV-1 sequences from newly diagnosed individuals residing in South-East Austria revealed a high degree of national and international clustering mainly within MSM. Interestingly, we found that a high number of heterosexual males clustered within MSM networks, suggesting either linkage between risk groups or misrepresentation of sexual risk behaviors by subjects. PMID:26967154
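The linkage rule described above (two sequences are linked when they differ by at most 1.5%, and overlapping linkages are merged into clusters) amounts to single-linkage clustering on a distance threshold. The Python sketch below shows that idea with a union-find structure. It substitutes a plain proportion-of-differences distance for whatever distance model the study used, and all names and toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap or ambiguity code are skipped.
    """
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def transmission_clusters(seqs: dict, threshold: float = 0.015) -> list:
    """Group sequence names into putative clusters (connected components)."""
    parent = {name: name for name in seqs}

    def find(name):
        while parent[name] != name:          # walk up, compressing the path
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)        # merge the two components

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return [g for g in groups.values() if len(g) > 1]  # drop singletons


def _mutate(seq: str, positions: set) -> str:
    """Toy helper: change the base at each chosen position."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    return "".join(swap[c] if i in positions else c for i, c in enumerate(seq))


base = "ACGT" * 50  # 200-site toy alignment, not real pol sequence
toy = {
    "s1": base,
    "s2": _mutate(base, {0, 1}),               # 1.0% from s1
    "s3": _mutate(base, {0, 1, 2, 3}),         # 1.0% from s2, 2.0% from s1
    "s4": _mutate(base, set(range(0, 200, 4))),  # 25% from s1
}
clusters = transmission_clusters(toy)
```

In the toy data, s1 and s3 are not directly linked but end up in the same cluster through s2, which is the behaviour implied by resolving multiple linkages into clusters.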
Bai, Wen L; Zhao, Su J; Wang, Ze Y; Zhu, Yu B; Dang, Yun L; Cong, Yu Y; Xue, Hui L; Wang, Wei; Deng, Liang; Guo, Dan; Wang, Shi Q; Zhu, Yan X; Yin, Rong H
2018-07-03
Long noncoding RNAs (lncRNAs) are a novel class of eukaryotic transcripts. They are thought to act as critical regulators of protein-coding gene expression. Herein, we identified and characterized 13 putative lncRNAs from the expressed sequence tags from the secondary hair follicle of Cashmere goat. Furthermore, we investigated their transcriptional pattern in the secondary hair follicle of Liaoning Cashmere goat during telogen and anagen phases. Also, we generated intracellular regulatory networks of upregulated lncRNAs at anagen in the Wnt signaling pathway based on bioinformatics analysis. The relative expression of six putative lncRNAs (lncRNA-599618, -599556, -599554, -599547, -599531, and -599509) at the anagen phase was significantly higher than that at telogen. Compared with anagen, the relative expression of four putative lncRNAs (lncRNA-599528, -599518, -599511, and -599497) was found to be significantly upregulated at telogen phase. The network generated showed a rich and complex regulatory relationship of the putative lncRNAs and related miRNAs with their target genes in the Wnt signaling pathway. Our results from the present study provided a foundation for further elucidating the functional and regulatory mechanisms of these putative lncRNAs in the development of secondary hair follicle and cashmere fiber growth of Cashmere goat.
Hiessl, Sebastian; Schuldes, Jörg; Thürmer, Andrea; Halbsguth, Tobias; Bröker, Daniel; Angelov, Angel; Liebl, Wolfgang; Daniel, Rolf
2012-01-01
The increasing production of synthetic and natural poly(cis-1,4-isoprene) rubber leads to huge challenges in waste management. Only a few bacteria are known to degrade rubber, and little is known about the mechanism of microbial rubber degradation. The genome of Gordonia polyisoprenivorans strain VH2, which is one of the most effective rubber-degrading bacteria, was sequenced and annotated to elucidate the degradation pathway and other features of this actinomycete. The genome consists of a circular chromosome of 5,669,805 bp and a circular plasmid of 174,494 bp with average GC contents of 67.0% and 65.7%, respectively. It contains 5,110 putative protein-coding sequences, including many candidate genes responsible for rubber degradation and other biotechnically relevant pathways. Furthermore, we detected two homologues of a latex-clearing protein, which is supposed to be a key enzyme in rubber degradation. The deletion of these two genes for the first time revealed clear evidence that latex-clearing protein is essential for the microbial utilization of rubber. Based on the genome sequence, we predict a pathway for the microbial degradation of rubber which is supported by previous and current data on transposon mutagenesis, deletion mutants, applied comparative genomics, and literature search. PMID:22327575
Schübbe, Sabrina; Kube, Michael; Scheffel, André; Wawer, Cathrin; Heyen, Udo; Meyerdierks, Anke; Madkour, Mohamed H.; Mayer, Frank; Reinhardt, Richard; Schüler, Dirk
2003-01-01
Frequent spontaneous loss of the magnetic phenotype was observed in stationary-phase cultures of the magnetotactic bacterium Magnetospirillum gryphiswaldense MSR-1. A nonmagnetic mutant, designated strain MSR-1B, was isolated and characterized. The mutant lacked any structures resembling magnetosome crystals as well as internal membrane vesicles. The growth of strain MSR-1B was impaired under all growth conditions tested, and the uptake and accumulation of iron were drastically reduced under iron-replete conditions. A large chromosomal deletion of approximately 80 kb was identified in strain MSR-1B, which comprised both the entire mamAB and mamDC clusters as well as further putative operons encoding a number of magnetosome-associated proteins. A bacterial artificial chromosome clone partially covering the deleted region was isolated from the genomic library of wild-type M. gryphiswaldense. Sequence analysis of this fragment revealed that all previously identified mam genes were closely linked with genes encoding other magnetosome-associated proteins within less than 35 kb. In addition, this region was remarkably rich in insertion elements and harbored a considerable number of unknown gene families which appeared to be specific for magnetotactic bacteria. Overall, these findings suggest the existence of a putative large magnetosome island in M. gryphiswaldense and other magnetotactic bacteria. PMID:13129949
Chandrapala, Dilini; Kim, Kyumson; Choi, Younho; Senevirathne, Amal; Kang, Dong-Hyun; Ryu, Sangryeol
2014-01-01
Cronobacter sakazakii is an opportunistic pathogen that causes neonatal meningitis and necrotizing enterocolitis. Its interaction with intestinal epithelium is important in the pathogenesis of enteric infections. In this study, we investigated the involvement of the inv gene in the virulence of C. sakazakii ATCC 29544 in vitro and in vivo. Sequence analysis of C. sakazakii ATCC 29544 inv revealed that it is different from other C. sakazakii isolates. In various cell culture models, an Δinv deletion mutant showed significantly lowered invasion efficiency, which was restored upon genetic complementation. Studying invasion potentials using tight-junction-disrupted Caco-2 cells suggested that the inv gene product mediates basolateral invasion of C. sakazakii ATCC 29544. In addition, comparison of invasion potentials of double mutant (ΔompA Δinv) and single mutants (ΔompA and Δinv) provided evidence for an additive effect of the two putative outer membrane proteins. Finally, the importance of inv and the additive effect of putative Inv and OmpA were also proven in an in vivo rat pup model. This report is the first to demonstrate two proteins working synergistically in vitro, as well as in vivo in C. sakazakii pathogenesis. PMID:24549330
Chandrapala, Dilini; Kim, Kyumson; Choi, Younho; Senevirathne, Amal; Kang, Dong-Hyun; Ryu, Sangryeol; Kim, Kwang-Pyo
2014-05-01
Cronobacter sakazakii is an opportunistic pathogen that causes neonatal meningitis and necrotizing enterocolitis. Its interaction with intestinal epithelium is important in the pathogenesis of enteric infections. In this study, we investigated the involvement of the inv gene in the virulence of C. sakazakii ATCC 29544 in vitro and in vivo. Sequence analysis of C. sakazakii ATCC 29544 inv revealed that it is different from other C. sakazakii isolates. In various cell culture models, an Δinv deletion mutant showed significantly lowered invasion efficiency, which was restored upon genetic complementation. Studying invasion potentials using tight-junction-disrupted Caco-2 cells suggested that the inv gene product mediates basolateral invasion of C. sakazakii ATCC 29544. In addition, comparison of invasion potentials of double mutant (ΔompA Δinv) and single mutants (ΔompA and Δinv) provided evidence for an additive effect of the two putative outer membrane proteins. Finally, the importance of inv and the additive effect of putative Inv and OmpA were also proven in an in vivo rat pup model. This report is the first to demonstrate two proteins working synergistically in vitro, as well as in vivo in C. sakazakii pathogenesis.
Evolutionary distance from human homologs reflects allergenicity of animal food proteins.
Jenkins, John A; Breiteneder, Heimo; Mills, E N Clare
2007-12-01
In silico analysis of allergens can identify putative relationships among protein sequence, structure, and allergenic properties. Such systematic analysis reveals that most plant food allergens belong to a restricted number of protein superfamilies, with pollen allergens behaving similarly. We have investigated the structural relationships of animal food allergens and their evolutionary relatedness to human homologs to define how closely a protein must resemble a human counterpart to lose its allergenic potential. Profile-based sequence homology methods were used to classify animal food allergens into Pfam families, and in silico analyses of their evolutionary and structural relationships were performed. Animal food allergens could be classified into 3 main families--tropomyosins, EF-hand proteins, and caseins--along with 14 minor families each composed of 1 to 3 allergens. The evolutionary relationships of each of these allergen superfamilies showed that in general, proteins with a sequence identity to a human homolog above approximately 62% were rarely allergenic. Single substitutions in otherwise highly conserved regions containing IgE epitopes in EF-hand parvalbumins may modulate allergenicity. These data support the premise that certain protein structures are more allergenic than others. Contrasting with plant food allergens, animal allergens, such as the highly conserved tropomyosins, challenge the capability of the human immune system to discriminate between foreign and self-proteins. Such immune responses run close to becoming autoimmune responses. Exploiting the closeness between animal allergens and their human homologs in the development of recombinant allergens for immunotherapy will need to consider the potential for developing unanticipated autoimmune responses.
Heterogeneous RNA-binding protein M4 is a receptor for carcinoembryonic antigen in Kupffer cells.
Bajenova, O V; Zimmer, R; Stolper, E; Salisbury-Rowswell, J; Nanji, A; Thomas, P
2001-08-17
Here we report the isolation of the recombinant cDNA clone from rat macrophages, Kupffer cells (KC) that encodes a protein interacting with carcinoembryonic antigen (CEA). To isolate and identify the CEA receptor gene we used two approaches: screening of a KC cDNA library with a specific antibody and the yeast two-hybrid system for protein interaction using as a bait the N-terminal part of the CEA encoding the binding site. Both techniques resulted in the identification of the rat heterogeneous RNA-binding protein (hnRNP) M4 gene. The rat ortholog cDNA sequence has not been previously described. The open reading frame for this gene contains a 2351-base pair sequence with the polyadenylation signal AATAAA and a termination poly(A) tail. The mRNA shows ubiquitous tissue expression as a 2.4-kilobase transcript. The deduced amino acid sequence comprised a 78-kDa membrane protein with 3 putative RNA-binding domains, arginine/methionine/glutamine-rich C terminus and 3 potential membrane spanning regions. When hnRNP M4 protein is expressed in pGEX4T-3 vector system in Escherichia coli it binds 125I-labeled CEA in a Ca2+-dependent fashion. Transfection of rat hnRNP M4 cDNA into a non-CEA binding mouse macrophage cell line p388D1 resulted in CEA binding. These data provide evidence for a new function of hnRNP M4 protein as a CEA-binding protein in Kupffer cells.
Nucleotide sequences of bovine alpha S1- and kappa-casein cDNAs.
Stewart, A F; Willis, I M; Mackinlay, A G
1984-01-01
The nucleotide sequences corresponding to bovine alpha S1- and kappa-casein mRNAs are presented. An unusual alpha S1-casein cDNA has been characterised whose 5' end commences upstream from its putative TATA box. The alpha S1-casein mRNA is compared to rat alpha-casein mRNA and two components of divergence are identified. Firstly, the two sequences have diverged at a high point mutation rate and the rate of amino acid replacement by this mechanism is at least as great as the rate of divergence of any other part of the mRNAs. Secondly, the protein coding sequence has been subjected to several insertion/deletion events, one of which may be an example of exon shuffling. The kappa-casein mRNA sequence verifies the proposition that it has arisen from a different ancestral gene to the other caseins. PMID:6328443
Abdelkafi, Slim; Ogata, Hiroyuki; Barouh, Nathalie; Fouquet, Benjamin; Lebrun, Régine; Pina, Michel; Scheirlinckx, Frantz; Villeneuve, Pierre; Carrière, Frédéric
2009-11-01
An esterase (CpEst) showing high specific activities on tributyrin and short chain vinyl esters was obtained from Carica papaya latex after an extraction step with zwitterionic detergent and sonication, followed by gel filtration chromatography. Although the protein could not be purified to complete homogeneity due to its presence in high molecular mass aggregates, a major protein band with an apparent molecular mass of 41 kDa was obtained by SDS-PAGE. This material was digested with trypsin and the amino acid sequences of the tryptic peptides were determined by LC/ESI/MS/MS. These sequences were used to identify a partial cDNA (679 bp) from expressed sequence tags (ESTs) of C. papaya. Based upon EST sequences, a full-length gene was identified in the genome of C. papaya, with an open reading frame of 1029 bp encoding a protein of 343 amino acid residues, with a theoretical molecular mass of 38 kDa. From sequence analysis, CpEst was identified as a GDSL-motif carboxylester hydrolase belonging to the SGNH protein family and four potential N-glycosylation sites were identified. The putative catalytic triad was localised (Ser35-Asp307-His310) with the nucleophile serine being part of the GDSL-motif. A 3D-model of CpEst was built from known X-ray structures and sequence alignments and the catalytic triad was found to be exposed at the surface of the molecule, thus confirming the results of CpEst inhibition by tetrahydrolipstatin suggesting a direct accessibility of the inhibitor to the active site.
Wang, H C; Shi, F Y; Hou, M J; Fu, X Y; Long, R J
2016-08-01
The gastrointestinal lumen can directly absorb all di- and tripeptide protein degradation products, and oligopeptide absorption depends on the specific peptide transport carriers, which are located in gastrointestinal epithelial cells on the brush border membrane. Yak (Bos grunniens) use N more efficiently than cattle do, which implies that yak have a specific mechanism of nonprotein utilization including a peptide absorption mechanism. However, this mechanism has not been clarified. Our objective was to explore whether yak possess any adaptive mechanisms of peptide absorption to survive in the harsh foraging environment of the Qinghai-Tibetan plateau. Twelve castrated males of each of 2 genotypes, yak (Bos grunniens) and indigenous cattle (Bos taurus), were fed diets of various N levels. The yak PepT1 (yPepT1) cDNA was cloned in omasum epithelial tissue. Our results showed that the full-length yPepT1 cDNA contains 2,805 bp, and a 2,121-bp open reading frame encodes a putative protein of 707 AA residues. The yPepT1 AA sequence identified 5 putative extracellular N-glycosylation sites (Asn, Asn, Asn, Asn, and Asn), 2 putative intracellular protein kinase A sites (Ser and Thr), and 3 intracellular putative protein kinase C sites (Ser, Ser, and Ser). The yPepT1 AA sequence was 99, 95, 86, and 83% identical to PepT1 from cattle (Bos taurus), sheep (Ovis aries), pigs (Sus scrofa), and humans (Homo sapiens), respectively. The relative PepT1 mRNA expression for indigenous cattle was greater than that for yak in the rumen, omasum, duodenum, ileum, and liver (P < 0.001); however, it was lower in jejunum tissue (P < 0.01). The response of relative PepT1 mRNA expression to increasing dietary N for both genotypes was linear in the rumen and jejunum (P < 0.10); quadratic or cubic in the reticulum (P < 0.01); linear or quadratic in the duodenum, ileum, and liver (P ≤ 0.01); and linear, quadratic, or cubic in the omasum (P < 0.001).
Moreover, there were significant interactions between genotype and dietary N in rumen, reticulum, omasum, duodenum, jejunum, ileum, and liver tissues. In conclusion, the PepT1 profile and expression in gastrointestinal epithelial cells of yak varied from those of cattle, implying that yak have evolved a peptide transport mechanism to adapt to the environment of the Qinghai-Tibetan plateau.
de Souza, C R; Aragão, F J; Moreira, E C O; Costa, C N M; Nascimento, S B; Carvalho, L J
2009-03-24
Cassava is one of the most important tropical food crops for more than 600 million people worldwide. Transgenic technologies can be useful for increasing its nutritional value and its resistance to viral diseases and insect pests. However, tissue-specific promoters that guarantee correct expression of transgenes would be necessary. We used inverse polymerase chain reaction to isolate a promoter sequence of the Mec1 gene coding for Pt2L4, a glutamic acid-rich protein differentially expressed in cassava storage roots. In silico analysis revealed putative cis-acting regulatory elements within this promoter sequence, including root-specific elements that may be required for its expression in vascular tissues. Transient expression experiments showed that the Mec1 promoter is functional, since this sequence was able to drive GUS expression in bean embryonic axes. Results from our computational analysis can serve as a guide for functional experiments to identify regions with tissue-specific Mec1 promoter activity. The DNA sequence that we identified is a new promoter that could be a candidate for genetic engineering of cassava roots.
Cloning and characterization of a Prevotella melaninogenica hemolysin.
Allison, H E; Hillman, J D
1997-01-01
Hemolysins have been proven to be important virulence factors in many medically relevant pathogenic organisms. Their production has also been implicated in the etiology of periodontal disease. Hemolytic strain 361B of Prevotella melaninogenica, a putative etiologic agent of periodontal disease, was used in this study. The cloning, sequencing, and characterization of phyA, the structural gene for a P. melaninogenica hemolysin, is described. No extensive sequence homology could be identified between phyA and any reported sequence at either the nucleotide or amino acid level. As predicted from sequence analysis, this gene produces a 39-kDa protein which has hemolytic activity as measured by zymogram analysis. Unlike many Ca2+-dependent bacterial hemolysins, both the cloned and native PhyA proteins were enhanced by the presence of EDTA in a dose-dependent fashion with 40 mM EDTA allowing maximum activity. Ca2+ and Mg2+ were found to be inhibitory. The hemolytic activity also was found to have a dose-dependent endpoint. Through recovery of hemolytic activity from a spent reaction, this endpoint was shown to be the result of end product inhibition. This is the first report describing the cloning and sequencing of a gene from P. melaninogenica. PMID:9199448
Cloning and characterization of a Prevotella melaninogenica hemolysin.
Allison, H E; Hillman, J D
1997-07-01
Hemolysins have been proven to be important virulence factors in many medically relevant pathogenic organisms. Their production has also been implicated in the etiology of periodontal disease. Hemolytic strain 361B of Prevotella melaninogenica, a putative etiologic agent of periodontal disease, was used in this study. The cloning, sequencing, and characterization of phyA, the structural gene for a P. melaninogenica hemolysin, is described. No extensive sequence homology could be identified between phyA and any reported sequence at either the nucleotide or amino acid level. As predicted from sequence analysis, this gene produces a 39-kDa protein which has hemolytic activity as measured by zymogram analysis. Unlike many Ca2+-dependent bacterial hemolysins, both the cloned and native PhyA proteins were enhanced by the presence of EDTA in a dose-dependent fashion with 40 mM EDTA allowing maximum activity. Ca2+ and Mg2+ were found to be inhibitory. The hemolytic activity also was found to have a dose-dependent endpoint. Through recovery of hemolytic activity from a spent reaction, this endpoint was shown to be the result of end product inhibition. This is the first report describing the cloning and sequencing of a gene from P. melaninogenica.
Study of cnidarian-algal symbiosis in the "omics" age.
Meyer, Eli; Weis, Virginia M
2012-08-01
The symbiotic associations between cnidarians and dinoflagellate algae (Symbiodinium) support productive and diverse ecosystems in coral reefs. Many aspects of this association, including the mechanistic basis of host-symbiont recognition and metabolic interaction, remain poorly understood. The first completed genome sequence for a symbiotic anthozoan is now available (the coral Acropora digitifera), and extensive expressed sequence tag resources are available for a variety of other symbiotic corals and anemones. These resources make it possible to profile gene expression, protein abundance, and protein localization associated with the symbiotic state. Here we review the history of "omics" studies of cnidarian-algal symbiosis and the current availability of sequence resources for corals and anemones, identifying genes putatively involved in symbiosis across 10 anthozoan species. The public availability of candidate symbiosis-associated genes leaves the field of cnidarian-algal symbiosis poised for in-depth comparative studies of sequence diversity and gene expression and for targeted functional studies of genes associated with symbiosis. Reviewing the progress to date suggests directions for future investigations of cnidarian-algal symbiosis that include (i) sequencing of Symbiodinium, (ii) proteomic analysis of the symbiosome membrane complex, (iii) glycomic analysis of Symbiodinium cell surfaces, and (iv) expression profiling of the gastrodermal cells hosting Symbiodinium.
Licht, J D; Hanna-Rose, W; Reddy, J C; English, M A; Ro, M; Grossel, M; Shaknovich, R; Hansen, U
1994-01-01
We previously demonstrated that the Drosophila Krüppel protein is a transcriptional repressor with separable DNA-binding and transcriptional repression activities. In this study, the minimal amino (N)-terminal repression region of the Krüppel protein was defined by transferring regions of the Krüppel protein to a heterologous DNA-binding protein, the lacI protein. Fusion of a predicted alpha-helical region from amino acids 62 to 92 in the N terminus of the Krüppel protein was sufficient to transfer repression activity. This putative alpha-helix has several hydrophobic surfaces, as well as a glutamine-rich surface. Mutants containing multiple amino acid substitutions of the glutamine residues demonstrated that this putative alpha-helical region is essential for repression activity of a Krüppel protein containing the entire N-terminal and DNA-binding regions. Furthermore, one point mutant with only a single glutamine on this surface altered to lysine abolished the ability of the Krüppel protein to repress, indicating the importance of the amino acid at residue 86 for repression. The N terminus also contained an adjacent activation region localized between amino acids 86 and 117. Finally, in accordance with predictions from primary amino acid sequence similarity, a repression region from the Drosophila even-skipped protein, which was six times more potent than that of the Krüppel protein in the mammalian cells, was characterized. This segment included a hydrophobic stretch of 11 consecutive alanine residues and a proline-rich region. PMID:8196644
Gulliver, Emily L; Wright, Amy; Lucas, Deanna Deveson; Mégroz, Marianne; Kleifeld, Oded; Schittenhelm, Ralf B; Powell, David R; Seemann, Torsten; Bulitta, Jürgen B; Harper, Marina; Boyce, John D
2018-05-01
Pasteurella multocida is a Gram-negative bacterium responsible for many important animal diseases. While a number of P. multocida virulence factors have been identified, very little is known about how gene expression and protein production is regulated in this organism. Small RNA (sRNA) molecules are critical regulators that act by binding to specific mRNA targets, often in association with the RNA chaperone protein Hfq. In this study, transcriptomic analysis of the P. multocida strain VP161 revealed a putative sRNA with high identity to GcvB from Escherichia coli and Salmonella enterica serovar Typhimurium. High-throughput quantitative liquid proteomics was used to compare the proteomes of the P. multocida VP161 wild-type strain, a gcvB mutant, and a GcvB overexpression strain. These analyses identified 46 proteins that displayed significant differential production after inactivation of gcvB, 36 of which showed increased production. Of the 36 proteins that were repressed by GcvB, 27 were predicted to be involved in amino acid biosynthesis or transport. Bioinformatic analyses of putative P. multocida GcvB target mRNAs identified a strongly conserved 10 nucleotide consensus sequence, 5'-AACACAACAT-3', with the central eight nucleotides identical to the seed binding region present within GcvB mRNA targets in E. coli and S. Typhimurium. Using a defined set of seed region mutants, together with a two-plasmid reporter system that allowed for quantification of sRNA-mRNA interactions, this sequence was confirmed to be critical for the binding of the P. multocida GcvB to the target mRNA, gltA. © 2018 Gulliver et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.
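As a rough illustration of how candidate target mRNAs might be screened for the 10-nucleotide consensus reported above, the sketch below slides the motif along a sequence and reports windows within a chosen number of mismatches. It is not the authors' bioinformatic method, which the abstract does not describe; the function name `find_motif` and the mismatch tolerance are assumptions made for the example.

```python
from typing import List, Tuple

# Consensus reported in the abstract, written 5'->3' in DNA letters.
GCVB_CONSENSUS = "AACACAACAT"


def find_motif(sequence: str,
               motif: str = GCVB_CONSENSUS,
               max_mismatches: int = 0) -> List[Tuple[int, int]]:
    """Return (0-based start, mismatch count) for each window of the
    sequence that matches the motif with at most max_mismatches.

    RNA input is accepted: U is converted to T before comparison.
    """
    seq = sequence.upper().replace("U", "T")
    pattern = motif.upper().replace("U", "T")
    hits = []
    for start in range(len(seq) - len(pattern) + 1):
        window = seq[start:start + len(pattern)]
        mismatches = sum(a != b for a, b in zip(window, pattern))
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits
```

Setting `max_mismatches` to 1 or 2 would allow degenerate sites to surface, at the cost of more chance matches in long sequences.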
Nishiyama, Milton Yutaka; dos Santos, Maria Beatriz Viana; Santos-da-Silva, Andria de Paula; Chalkidis, Hipócrates de Menezes; Souza-Imberg, Andreia; Candido, Denise Maria; Yamanouye, Norma; Dorce, Valquíria Abrão Coronado; Junqueira-de-Azevedo, Inácio de Loiola Meirelles
2018-01-01
Background: Except for the northern region, where the Amazonian black scorpion, T. obscurus, represents the predominant and most medically relevant scorpion species, Tityus serrulatus, the Brazilian yellow scorpion, is widely distributed throughout Brazil, causing most envenoming and fatalities due to scorpion sting. In order to evaluate and compare the diversity of venom components of Tityus obscurus and T. serrulatus, we performed a transcriptomic investigation of the telsons (venom glands) corroborated by a shotgun proteomic analysis of the venom from the two species. Results: The putative venom components represented 11.4% and 16.7% of the total gene expression for T. obscurus and T. serrulatus, respectively. Transcriptome and proteome data revealed high abundance of metalloproteinase sequences followed by sodium and potassium channel toxins, making the toxin core of the venom. The phylogenetic analysis of metalloproteinases from T. obscurus and T. serrulatus suggested an intraspecific gene expansion, as we previously observed for T. bahiensis, indicating that this enzyme may be under evolutionary pressure for diversification. We also identified several putative venom components such as anionic peptides, antimicrobial peptides, bradykinin-potentiating peptide, cysteine rich protein, serine proteinases, cathepsins, angiotensin-converting enzyme, endothelin-converting enzyme and chymotrypsin like protein, proteinases inhibitors, phospholipases and hyaluronidases. Conclusion: The present work shows that the venom compositions of these two allopatric species of Tityus are considerably similar in terms of the major classes of proteins produced and secreted, although their individual toxin sequences are considerably divergent. These differences at amino acid level may be reflected in different epitopes for the same protein classes in each species, explaining the basis for the poor recognition of T. obscurus venom by the antiserum raised against other species. PMID:29561852
ESTs Analysis Reveals Putative Genes Involved in Symbiotic Seed Germination in Dendrobium officinale
Zhao, Ming-Ming; Zhang, Gang; Zhang, Da-Wei; Hsiao, Yu-Yun; Guo, Shun-Xing
2013-01-01
Dendrobium officinale (Orchidaceae) is one of the world’s most endangered plants with great medicinal value. In nature, D. officinale seeds must establish symbiotic relationships with fungi to germinate. However, the molecular events involved in the interaction between fungus and plant during this process are poorly understood. To isolate the genes involved in symbiotic germination, a suppression subtractive hybridization (SSH) cDNA library of symbiotically germinated D. officinale seeds was constructed. From this library, 1437 expressed sequence tags (ESTs) were clustered to 1074 Unigenes (including 902 singletons and 172 contigs), which were searched against the NCBI non-redundant (NR) protein database (E-value cutoff, 1e-5). Based on sequence similarity with known proteins, 579 differentially expressed genes in D. officinale were identified and classified into different functional categories by Gene Ontology (GO), Clusters of orthologous Groups of proteins (COGs) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The expression levels of 15 selected genes emblematic of symbiotic germination were confirmed via real-time quantitative PCR. These genes were classified into various categories, including defense and stress response, metabolism, transcriptional regulation, transport process and signal transduction pathways. All transcripts were upregulated in the symbiotically germinated seeds (SGS). The functions of these genes in symbiotic germination were predicted. Furthermore, two fungus-induced calcium-dependent protein kinases (CDPKs), which were upregulated 6.76- and 26.69-fold in SGS compared with un-germinated seeds (UGS), were cloned from D. officinale and characterized for the first time. This study provides the first global overview of genes putatively involved in D. officinale symbiotic seed germination and provides a foundation for further functional research regarding symbiotic relationships in orchids. PMID:23967335
Zhao, Ming-Ming; Zhang, Gang; Zhang, Da-Wei; Hsiao, Yu-Yun; Guo, Shun-Xing
2013-01-01
Dendrobium officinale (Orchidaceae) is one of the world's most endangered plants with great medicinal value. In nature, D. officinale seeds must establish symbiotic relationships with fungi to germinate. However, the molecular events involved in the interaction between fungus and plant during this process are poorly understood. To isolate the genes involved in symbiotic germination, a suppression subtractive hybridization (SSH) cDNA library of symbiotically germinated D. officinale seeds was constructed. From this library, 1437 expressed sequence tags (ESTs) were clustered to 1074 Unigenes (including 902 singletons and 172 contigs), which were searched against the NCBI non-redundant (NR) protein database (E-value cutoff, 1e-5). Based on sequence similarity with known proteins, 579 differentially expressed genes in D. officinale were identified and classified into different functional categories by Gene Ontology (GO), Clusters of orthologous Groups of proteins (COGs) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The expression levels of 15 selected genes emblematic of symbiotic germination were confirmed via real-time quantitative PCR. These genes were classified into various categories, including defense and stress response, metabolism, transcriptional regulation, transport process and signal transduction pathways. All transcripts were upregulated in the symbiotically germinated seeds (SGS). The functions of these genes in symbiotic germination were predicted. Furthermore, two fungus-induced calcium-dependent protein kinases (CDPKs), which were upregulated 6.76- and 26.69-fold in SGS compared with un-germinated seeds (UGS), were cloned from D. officinale and characterized for the first time. This study provides the first global overview of genes putatively involved in D. officinale symbiotic seed germination and provides a foundation for further functional research regarding symbiotic relationships in orchids.
Distribution and Genetic Diversity of Bacteriocin Gene Clusters in Rumen Microbial Genomes.
Azevedo, Analice C; Bento, Cláudia B P; Ruiz, Jeronimo C; Queiroz, Marisa V; Mantovani, Hilário C
2015-10-01
Some species of ruminal bacteria are known to produce antimicrobial peptides, but the screening procedures have mostly been based on in vitro assays using standardized methods. Recent sequencing efforts have made available the genome sequences of hundreds of ruminal microorganisms. In this work, we performed genome mining of the complete and partial genome sequences of 224 ruminal bacteria and 5 ruminal archaea to determine the distribution and diversity of bacteriocin gene clusters. A total of 46 bacteriocin gene clusters were identified in 33 strains of ruminal bacteria. Twenty gene clusters were related to lanthipeptide biosynthesis, while 11 gene clusters were associated with sactipeptide production, 7 gene clusters were associated with class II bacteriocin production, and 8 gene clusters were associated with class III bacteriocin production. The frequency of strains whose genomes encode putative antimicrobial peptide precursors was 14.4%. Clusters related to the production of sactipeptides were identified for the first time among ruminal bacteria. BLAST analysis indicated that the majority of the gene clusters (88%) encoding putative lanthipeptides contained all the essential genes required for lanthipeptide biosynthesis. Most strains of Streptococcus (66.6%) harbored complete lanthipeptide gene clusters, in addition to an open reading frame encoding a putative class II bacteriocin. Albusin B-like proteins were found in 100% of the Ruminococcus albus strains screened in this study. The in silico analysis provided evidence of novel biosynthetic gene clusters in bacterial species not previously related to bacteriocin production, suggesting that the rumen microbiota represents an underexplored source of antimicrobial peptides. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
Putative cross-kingdom horizontal gene transfer in sponge (Porifera) mitochondria
Rot, Chagai; Goldfarb, Itay; Ilan, Micha; Huchon, Dorothée
2006-01-01
Background: The mitochondrial genome of Metazoa is usually a compact molecule without introns. Exceptions to this rule have been reported only in corals and sea anemones (Cnidaria), in which group I introns have been discovered in the cox1 and nad5 genes. Here we show several lines of evidence demonstrating that introns can also be found in the mitochondria of sponges (Porifera). Results: A 2,349 bp fragment of the mitochondrial cox1 gene was sequenced from the sponge Tetilla sp. (Spirophorida). This fragment suggests the presence of a 1,143 bp intron. Similar to all the cnidarian mitochondrial introns, the putative intron has group I intron characteristics. The intron is present in the cox1 gene and encodes a putative homing endonuclease. In order to establish the distribution of this intron in sponges, the cox1 gene was sequenced from several representatives of the demosponge diversity. The intron was found only in the sponge order Spirophorida. A phylogenetic analysis of the COI protein sequence and of the intron open reading frame suggests that the intron may have been transmitted horizontally from a fungus donor. Conclusion: Little is known about sponge-associated fungi, although in the last few years the latter have been frequently isolated from sponges. We suggest that the horizontal gene transfer of a mitochondrial intron was facilitated by a symbiotic relationship between fungus and sponge. Ecological relationships are known to have implications at the genomic level. Here, an ecological relationship between sponge and fungus is suggested based on the genomic analysis. PMID:16972986
The Goddard and Saturn Genes Are Essential for Drosophila Male Fertility and May Have Arisen De Novo
Gubala, Anna M.; Schmitz, Jonathan F.; Kearns, Michael J.; Vinh, Tery T.; Bornberg-Bauer, Erich; Wolfner, Mariana F.
2017-01-01
New genes arise through a variety of mechanisms, including the duplication of existing genes and the de novo birth of genes from noncoding DNA sequences. While there are numerous examples of duplicated genes with important functional roles, the functions of de novo genes remain largely unexplored. Many newly evolved genes are expressed in the male reproductive tract, suggesting that these evolutionary innovations may provide advantages to males experiencing sexual selection. Using testis-specific RNA interference, we screened 11 putative de novo genes in Drosophila melanogaster for effects on male fertility and identified two, goddard and saturn, that are essential for spermatogenesis and sperm function. Goddard knockdown (KD) males fail to produce mature sperm, while saturn KD males produce few sperm, and these function inefficiently once transferred to females. Consistent with a de novo origin, both genes are identifiable only in Drosophila and are predicted to encode proteins with no sequence similarity to any annotated protein. However, since high levels of divergence prevented the unambiguous identification of the noncoding sequences from which each gene arose, we consider goddard and saturn to be putative de novo genes. Within Drosophila, both genes have been lost in certain lineages, but show conserved, male-specific patterns of expression in the species in which they are found. Goddard is consistently found in single-copy and evolves under purifying selection. In contrast, saturn has diversified through gene duplication and positive selection. These data suggest that de novo genes can acquire essential roles in male reproduction. PMID:28104747
Komisarczuk, Anna Z; Kongshaug, Heidi; Nilsen, Frank
2018-02-01
Na+/K+-ATPase has a key function in a variety of physiological processes including membrane excitability, osmoregulation, regulation of cell volume, and transport of nutrients. While knowledge about Na+/K+-ATPase function in osmoregulation in crustaceans is extensive, knowledge of the role of this enzyme in other physiological and developmental processes is scarce. Here, we report the characterization, transcriptional distribution and likely functions of the newly identified L. salmonis Na+/K+-ATPase (LsalNa+/K+-ATPase) α subunit in various developmental stages. The complete mRNA sequence was identified, with a 3003 bp open reading frame encoding a putative protein of 1001 amino acids. The putative protein sequence of LsalNa+/K+-ATPase revealed all typical features of Na+/K+-ATPase and demonstrated high sequence identity to those of other invertebrate and vertebrate species. Quantitative RT-PCR analysis revealed higher LsalNa+/K+-ATPase transcript levels in free-living stages than in parasitic stages. In situ hybridization analysis of copepodids and adult lice revealed LsalNa+/K+-ATPase transcript localization in a wide variety of tissues such as nervous system, intestine, reproductive system, and subcuticular and glandular tissue. RNAi-mediated knock-down of LsalNa+/K+-ATPase caused locomotion impairment, and affected reproduction and feeding. Morphological analysis of dsRNA-treated animals revealed muscle degeneration in larval stages, severe changes in oocyte formation and maturation in females, and abnormalities in tegmental glands. Thus, the study represents an important foundation for further functional investigation and identification of physiological pathways in which Na+/K+-ATPase is directly or indirectly involved. Copyright © 2018 Elsevier Inc. All rights reserved.
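The relationship between the reported open reading frame length and protein length can be checked with simple codon arithmetic. This is a minimal sketch with hypothetical variable names. Whether the 3003 bp figure includes the stop codon is not stated in the abstract, so both readings are shown.

```python
# Reported ORF length in base pairs (from the abstract).
orf_length_bp = 3003

# Each codon is three nucleotides; a valid ORF length is a multiple of 3.
codons, remainder = divmod(orf_length_bp, 3)

# Reading 1 (assumption): the 3003 bp figure excludes the stop codon.
residues_if_stop_excluded = codons

# Reading 2 (assumption): the 3003 bp figure includes the stop codon.
residues_if_stop_included = codons - 1

print(codons, remainder, residues_if_stop_excluded, residues_if_stop_included)
```

The reported 1001 amino acids matches the first reading, in which the stop codon is not counted within the 3003 bp.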
Methods for Discovery of Novel Cellulosomal Cellulases Using Genomics and Biochemical Tools.
Ben-David, Yonit; Dassa, Bareket; Bensoussan, Lizi; Bayer, Edward A; Moraïs, Sarah
2018-01-01
Cell wall degradation by cellulases is extensively explored owing to its potential contribution to biofuel production. The cellulosome is an extracellular multienzyme complex that can degrade the plant cell wall very efficiently, and cellulosomal enzymes are therefore of great interest. The cellulosomal cellulases are defined as enzymes that contain a dockerin module, which can interact with a cohesin module contained in multiple copies in a noncatalytic protein, termed scaffoldin. The assembly of the cellulosomal cellulases into the cellulosomal complex occurs via specific protein-protein interactions. Cellulosome systems were initially described in only a few anaerobic cellulolytic bacteria. However, owing to ongoing genome sequencing and metagenomic projects, the discovery of novel cellulosome-producing bacteria and the description of their cellulosomal genes have increased dramatically in recent years. In this chapter, methods for discovery of novel cellulosomal cellulases from a DNA sequence by bioinformatics and biochemical tools are described. Their biochemical characterization is also described, including both the enzymatic activity of the putative cellulases and their assembly into mature designer cellulosomes.
Hammond, R W; Crosslin, J M; Pasini, R; Howell, W E; Mink, G I
1999-07-01
Prunus necrotic ringspot ilarvirus (PNRSV) exists as a number of biologically distinct variants which differ in host specificity, serology, and pathology. Previous nucleotide sequence alignment and phylogenetic analysis of cloned reverse transcription-polymerase chain reaction (RT-PCR) products of several biologically distinct sweet cherry isolates revealed correlations between symptom type and the nucleotide and amino acid sequences of the 3a (putative movement protein) and 3b (coat protein) open reading frames. Based upon this analysis, RT-PCR assays have been developed that can identify isolates displaying different symptoms and serotypes. The incorporation of primers in a multiplex PCR protocol permits rapid detection and discrimination among the strains. The results of PCR amplification using type-specific primers that amplify a portion of the coat protein gene demonstrate that the primer-selection procedure developed for PNRSV constitutes a reliable method of viral strain discrimination in cherry for disease control and will also be useful for examining biological diversity within the PNRSV virus group.
Yu, Haining; Gao, Jiuxiang; Lu, Yiling; Guang, Huijuan; Cai, Shasha; Zhang, Songyan; Wang, Yipeng
2013-11-01
Lysozymes are key proteins that play important roles in innate immune defense in many animal phyla by breaking down bacterial cell walls. In this study, we report the molecular cloning, sequence analysis and phylogeny of the first caudate amphibian g-lysozyme, identified from a full-length spleen cDNA library of axolotl (Ambystoma mexicanum). A goose-type lysozyme (g-lysozyme) EST was identified and the full-length cDNA was obtained using RACE-PCR. The axolotl g-lysozyme sequence contains an open reading frame encoding a putative signal peptide and a mature protein of 184 amino acids. The calculated molecular mass and the theoretical isoelectric point (pI) of this mature protein are 21523.0 Da and 4.37, respectively. Expression of g-lysozyme mRNA is found predominantly in skin, with lower levels in spleen, liver, muscle, and lung. Phylogenetic analysis revealed that caudate amphibian g-lysozyme showed a distinct evolutionary pattern when juxtaposed not only with anuran amphibian sequences but also with those of fish, birds and mammals. Although the first complete cDNA sequence for a caudate amphibian g-lysozyme is reported in the present study, clones encoding other functional immune molecules of the axolotl in the full-length cDNA library will have to be sequenced to gain insight into the fundamental aspects of antibacterial mechanisms in caudates.
A novel paired domain DNA recognition motif can mediate Pax2 repression of gene transcription.
Håvik, B; Ragnhildstveit, E; Lorens, J B; Saelemyr, K; Fauske, O; Knudsen, L K; Fjose, A
1999-12-20
The paired domain (PD) is an evolutionarily conserved DNA-binding domain encoded by the Pax gene family of developmental regulators. The Pax proteins are transcription factors and are involved in a variety of processes such as brain development, patterning of the central nervous system (CNS), and B-cell development. In this report we demonstrate that the zebrafish Pax2 PD can interact in vitro with a novel type of DNA sequence, the triple-A motif, consisting of a heptameric nucleotide sequence G/CAAACA/TC with an invariant core of three adjacent adenosines. This recognition sequence was found to be conserved in known natural Pax5 repressor elements involved in controlling the expression of the p53 and J-chain genes. By identifying similar high-affinity binding sites in potential target genes of the Pax2 protein, including the pax2 gene itself, we obtained further evidence that the triple-A sites are biologically significant. The putative natural target sites also provide a basis for defining an extended consensus recognition sequence. In addition, we observed in transformation assays a direct correlation between Pax2 repressor activity and the presence of triple-A sites. The results suggest that a transcriptional regulatory function of Pax proteins can be modulated by PD binding to different categories of target sequences. Copyright 1999 Academic Press.
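The heptameric triple-A motif described above (G/C, A, A, A, C, A/T, C) can be written as a regular expression and scanned against a DNA sequence. The following is a minimal sketch, not the authors' method: the function names are hypothetical, the input is assumed to contain only A, C, G and T, and an exact pattern match says nothing about binding affinity.

```python
import re

# Triple-A heptamer from the abstract: [G/C] A A A C [A/T] C.
# The lookahead lets overlapping occurrences be reported as well.
TRIPLE_A = re.compile(r"(?=([GC]AAAC[AT]C))")
MOTIF_LEN = 7

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase ACGT string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_triple_a_sites(seq):
    """Return sorted (start, strand, site) tuples for matches on both strands.

    `start` is the 0-based position on the forward strand where the
    7-nt window begins; `site` is the motif as read on its own strand.
    """
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), "+", m.group(1)) for m in TRIPLE_A.finditer(seq)]
    for m in TRIPLE_A.finditer(reverse_complement(seq)):
        # Map the reverse-strand coordinate back onto the forward strand.
        hits.append((n - m.start() - MOTIF_LEN, "-", m.group(1)))
    return sorted(hits)


forward_hits = find_triple_a_sites("TTGAAACACGG")
reverse_hits = find_triple_a_sites("AAGAGTTTGAA")
print(forward_hits, reverse_hits)
```

Scanning both strands is a design choice made here because a binding site can lie in either orientation relative to the sequence as written; the abstract itself does not specify strand handling.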