Conservation and variability of West Nile virus proteins.
Koo, Qi Ying; Khan, Asif M; Jung, Keun-Ok; Ramdas, Shweta; Miotto, Olivo; Tan, Tin Wee; Brusic, Vladimir; Salmon, Jerome; August, J Thomas
2009-01-01
West Nile virus (WNV) has emerged globally as an increasingly important pathogen for humans and domestic animals. Studies of the evolutionary diversity of the virus over its known history will help to elucidate conserved sites, and characterize their correspondence to other pathogens and their relevance to the immune system. We describe a large-scale analysis of the entire WNV proteome, aimed at identifying and characterizing evolutionarily conserved amino acid sequences. This study, which used 2,746 WNV protein sequences collected from the NCBI GenPept database, focused on analysis of peptides of length 9 amino acids or more, which are immunologically relevant as potential T-cell epitopes. Entropy-based analysis of the diversity of WNV sequences revealed the presence of numerous evolutionarily stable nonamer positions across the proteome (entropy value ≤ 1). The representation (frequency) of nonamers variant to the predominant peptide at these stable positions was generally low (≤ 10% of the WNV sequences analyzed). Eighty-eight fragments of length 9-29 amino acids, representing approximately 34% of the WNV polyprotein length, were found to be identical and evolutionarily stable in all analyzed WNV sequences. Of the 88 completely conserved sequences, 67 are also present in other flaviviruses, and several have been associated with the functional and structural properties of viral proteins. Immunoinformatic analysis revealed that the majority (78/88) of conserved sequences are potentially immunogenic, while 44 contained experimentally confirmed human T-cell epitopes. This study identified a comprehensive catalogue of completely conserved WNV sequences, many of which are shared by other flaviviruses, and the majority of which are potential epitopes. The complete conservation of these immunologically relevant sequences through the entire recorded WNV history suggests they will be valuable as components of peptide-specific vaccines or other therapeutic applications, for sequence-specific diagnosis of a wide range of Flavivirus infections, and for studies of homologous sequences among other flaviviruses.
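The entropy criterion in this abstract can be illustrated with a small sketch. It assumes the sequences are already aligned, uses Shannon entropy in bits over the nonamers that start at a given column, and reports the share of sequences that differ from the predominant nonamer. The function names and the toy sequences are hypothetical; the abstract does not describe the authors' actual pipeline.

```python
import math
from collections import Counter

def nonamers_at(aligned_seqs, start, k=9):
    """Collect the k-mers starting at column `start`, skipping gapped or short windows."""
    peptides = (s[start:start + k] for s in aligned_seqs)
    return [p for p in peptides if len(p) == k and "-" not in p]

def nonamer_entropy(aligned_seqs, start, k=9):
    """Shannon entropy (bits) of the k-mer distribution at one alignment position."""
    counts = Counter(nonamers_at(aligned_seqs, start, k))
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())

def variant_fraction(aligned_seqs, start, k=9):
    """Fraction of sequences whose k-mer differs from the predominant one."""
    counts = Counter(nonamers_at(aligned_seqs, start, k))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 1.0 - max(counts.values()) / total

# Toy alignment (made-up sequences, for illustration only).
toy = ["MKNPKKKSGAA", "MKNPKKKSGAA", "MKNPKKKSGAA", "MKNPKKRSGAA"]
for pos in range(len(toy[0]) - 8):
    print(pos, round(nonamer_entropy(toy, pos), 3), variant_fraction(toy, pos))
```

A position would count as stable under the abstract's thresholds when the entropy is at most 1 and the variant fraction is at most 0.10.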
Kaplan, Oktay I; Berber, Burak; Hekim, Nezih; Doluca, Osman
2016-11-02
Many studies show that short non-coding sequences are widely conserved among regulatory elements. More and more conserved sequences are being discovered since the development of next generation sequencing technology. A common approach to identify conserved sequences with regulatory roles relies on topological changes such as hairpin formation at the DNA or RNA level. G-quadruplexes, non-canonical nucleic acid topologies with few established biological roles, are increasingly considered for conserved regulatory element discovery. Since the tertiary structure of G-quadruplexes is strongly dependent on the loop sequence, which is disregarded by the generally accepted algorithm, we hypothesized that G-quadruplexes with similar topology and, indirectly, similar interaction patterns, can be determined using phylogenetic clustering based on differences in the loop sequences. Phylogenetic analysis of 52 G-quadruplex forming sequences in the Escherichia coli genome revealed two conserved G-quadruplex motifs with a potential regulatory role. Further analysis revealed that both motifs tend to form hairpins and G-quadruplexes, as supported by circular dichroism studies. The phylogenetic analysis as described in this work can greatly improve the discovery of functional G-quadruplex structures and may explain unknown regulatory patterns. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
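As a rough illustration of the loop-based idea, the sketch below scans a DNA string for putative G-quadruplexes and returns the three loop sequences of each hit, which is the information a loop-based clustering would start from. The abstract does not name the search algorithm; the pattern used here (four runs of at least three G separated by loops of 1 to 7 nucleotides) is a widely used convention and is an assumption, as are the function name and the test sequence.

```python
import re

# Assumed pattern: G-run, loop, G-run, loop, G-run, loop, G-run.
# Loops are matched lazily so that each loop is as short as possible.
G4_PATTERN = re.compile(
    r"(G{3,})([ACGT]{1,7}?)(G{3,})([ACGT]{1,7}?)(G{3,})([ACGT]{1,7}?)(G{3,})"
)

def find_g4_loops(seq):
    """Return (start, (loop1, loop2, loop3)) for each non-overlapping match."""
    hits = []
    for m in G4_PATTERN.finditer(seq.upper()):
        hits.append((m.start(), m.group(2, 4, 6)))
    return hits

# Made-up sequence with one putative quadruplex on the given strand.
print(find_g4_loops("TTGGGAGGGTCGGGATTGGGCC"))
```

Only the given strand is scanned; a fuller search would also scan the reverse complement (C-rich strand).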
Pérez Sirkin, Daniela I.; Lafont, Anne-Gaëlle; Kamech, Nédia; Somoza, Gustavo M.; Vissio, Paula G.; Dufour, Sylvie
2017-01-01
GnRH-associated peptide (GAP) is the C-terminal portion of the gonadotropin-releasing hormone (GnRH) preprohormone. Although it was reported in mammals that GAP may act as a prolactin-inhibiting factor and can be co-secreted with GnRH into the hypophyseal portal blood, GAP has been practically out of the research circuit for about 20 years. Comparative studies highlighted the low conservation of GAP primary amino acid sequences among vertebrates, contributing to consider that this peptide only participates in the folding or carrying process of GnRH. Considering that the three-dimensional (3D) structure of a protein may define its function, the aim of this study was to evaluate if GAP sequences and 3D structures are conserved in the vertebrate lineage. GAP sequences from various vertebrates were retrieved from databases. Analysis of primary amino acid sequence identity and similarity, molecular phylogeny, and prediction of 3D structures were performed. Amino acid sequence comparison and phylogeny analyses confirmed the large variation of GAP sequences throughout vertebrate radiation. In contrast, prediction of the 3D structure revealed a striking conservation of the 3D structure of GAP1 (GAP associated with the hypophysiotropic type 1 GnRH), despite low amino acid sequence conservation. This GAP1 peptide presented a typical helix-loop-helix (HLH) structure in all the vertebrate species analyzed. This HLH structure could also be predicted for GAP2 in some but not all vertebrate species and in none of the GAP3 analyzed. These results allowed us to infer that selective pressures have maintained GAP1 HLH structure throughout the vertebrate lineage. The conservation of the HLH motif, known to confer biological activity to various proteins, suggests that GAP1 peptides may exert some hypophysiotropic biological functions across vertebrate radiation. PMID:28878737
CODEHOP (COnsensus-DEgenerate Hybrid Oligonucleotide Primer) PCR primer design
Rose, Timothy M.; Henikoff, Jorja G.; Henikoff, Steven
2003-01-01
We have developed a new primer design strategy for PCR amplification of distantly related gene sequences based on consensus-degenerate hybrid oligonucleotide primers (CODEHOPs). An interactive program has been written to design CODEHOP PCR primers from conserved blocks of amino acids within multiply-aligned protein sequences. Each CODEHOP consists of a pool of related primers containing all possible nucleotide sequences encoding 3–4 highly conserved amino acids within a 3′ degenerate core. A longer 5′ non-degenerate clamp region contains the most probable nucleotide predicted for each flanking codon. CODEHOPs are used in PCR amplification to isolate distantly related sequences encoding the conserved amino acid sequence. The primer design software and the CODEHOP PCR strategy have been utilized for the identification and characterization of new gene orthologs and paralogs in different plant, animal and bacterial species. In addition, this approach has been successful in identifying new pathogen species. The CODEHOP designer (http://blocks.fhcrc.org/codehop.html) is linked to BlockMaker and the Multiple Alignment Processor within the Blocks Database World Wide Web (http://blocks.fhcrc.org). PMID:12824413
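The size of the degenerate 3′ core pool follows directly from the genetic code: the number of distinct nucleotide sequences encoding a short peptide is the product of the codon counts of its amino acids. The sketch below, which assumes the standard genetic code and a full-codon core of four residues, picks the window in a conserved block with the smallest pool. It is not the CODEHOP program's scoring scheme; the function names and the example block are hypothetical.

```python
# Number of codons per amino acid in the standard genetic code.
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}

def degeneracy(peptide):
    """Number of distinct nucleotide sequences that encode `peptide`."""
    total = 1
    for aa in peptide:
        total *= CODON_COUNT[aa]
    return total

def best_core(block, core_len=4):
    """Return (degeneracy, offset, core) for the least degenerate window."""
    windows = [
        (degeneracy(block[i:i + core_len]), i, block[i:i + core_len])
        for i in range(len(block) - core_len + 1)
    ]
    return min(windows)  # ties resolve to the leftmost window

# Made-up conserved block, for illustration only.
print(best_core("LSWDMKA"))
```

Cores rich in Trp and Met keep the pool small, while Leu, Ser and Arg (six codons each) inflate it quickly.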
[Comparative genomics and evolutionary analysis of CRISPR loci in acetic acid bacteria].
Xia, Kai; Liang, Xin-le; Li, Yu-dong
2015-12-01
The clustered regularly interspaced short palindromic repeat (CRISPR) is a widespread adaptive immunity system that exists in most archaea and many bacteria against foreign DNA, such as phages, viruses and plasmids. In general, a CRISPR system consists of direct repeats, a leader, spacers and CRISPR-associated sequences. Acetic acid bacteria (AAB) play an important role in industrial fermentation of vinegar and bioelectrochemistry. To investigate the polymorphism and evolution pattern of CRISPR loci in acetic acid bacteria, bioinformatic analyses were performed on 48 species from three main genera (Acetobacter, Gluconacetobacter and Gluconobacter) with whole genome sequences available from the NCBI database. The results showed that the CRISPR system existed in 32 of the 48 species studied. Most of the CRISPR-Cas systems in AAB belonged to the type I CRISPR-Cas system (subtypes E and C), but the type II CRISPR-Cas system, which contains the cas9 gene, was found only in the genera Acetobacter and Gluconacetobacter. The repeat sequences of some CRISPR loci were highly conserved among species from different genera, and the leader sequences of some CRISPR loci possessed conserved motifs, which were associated with regulated promoters. Moreover, phylogenetic analysis of cas1 demonstrated that these genes were suitable for classification of species. The conservation of cas1 genes was associated with that of repeat sequences among different strains, suggesting they were subjected to similar functional constraints. Moreover, the number of spacers was positively correlated with the number of prophages and insertion sequences, indicating the acetic acid bacteria were continually invaded by new foreign DNA. The comparative analysis of CRISPR loci in acetic acid bacteria provided the basis for investigating the molecular mechanism of different acetic acid tolerance and genome stability in acetic acid bacteria.
Liu, Yanli; Huangfu, Jie; Qi, Feng; Kaleem, Imdad; E, Wenwen; Li, Chun
2012-01-01
We cloned the β-glucuronidase gene (AtGUS) from Aspergillus terreus Li-20 encoding 657 amino acids (aa), which can transform glycyrrhizin into glycyrrhetinic acid monoglucuronide (GAMG) and glycyrrhetinic acid (GA). Based on sequence alignment, the C-terminal non-conservative sequence showed low identity with those of other species; thus, the partial sequence AtGUS(-3t) (1–592 aa) was amplified to determine the effects of the non-conservative sequence on the enzymatic properties. AtGUS and AtGUS(-3t) were expressed in E. coli BL21, producing AtGUS-E and AtGUS(-3t)-E, respectively. At similar optimum temperature (55°C) and pH (AtGUS-E, 6.6; AtGUS(-3t)-E, 7.0) conditions, the thermal stability of AtGUS(-3t)-E was enhanced at 65°C, and the metal ions Co2+, Ca2+ and Ni2+ showed opposite effects on AtGUS-E and AtGUS(-3t)-E. Furthermore, the Km of AtGUS(-3t)-E (1.95 mM) was nearly one-seventh that of AtGUS-E (12.9 mM), whereas the catalytic efficiency of AtGUS(-3t)-E was 3.2-fold higher than that of AtGUS-E (7.16 vs. 2.24 mM s−1), revealing that the truncation of the non-conservative sequence can significantly improve the catalytic efficiency of AtGUS. Conformational analysis by circular dichroism (CD) showed a significant difference in secondary structure between AtGUS-E and AtGUS(-3t)-E. The results showed that the truncation of the non-conservative sequence could favorably alter the stability and catalytic efficiency of the enzyme. PMID:22347419
Chakravorty, S; Sarkar, S; Gachhui, R
2015-01-01
The Acetobacteraceae family of the class Alpha Proteobacteria comprises bacteria that tolerate high sugar and acid levels. The Acetic Acid Bacteria are the economically most significant group of this family because of their association with food products such as vinegar and wine. Acetobacteraceae are often hard to culture in laboratory conditions and they also maintain very low abundances in their natural habitats. Thus identification of the organisms in such environments is greatly dependent on modern tools of molecular biology which require a thorough knowledge of specific conserved gene sequences that may act as primers and/or probes. Moreover, unconserved domains in genes also become markers for differentiating closely related genera. In bacteria, the 16S rRNA gene is an ideal candidate for such conserved and variable domains. In order to study the conserved and variable domains of the 16S rRNA gene of Acetic Acid Bacteria and the Acetobacteraceae family, sequences from publicly available databases were aligned and compared. Near complete sequences of the gene were also obtained from Kombucha tea biofilm, a known Acetobacteraceae family habitat, in order to corroborate the domains obtained from the alignment studies. The study indicated that the degree of conservation in the gene is significantly higher among the Acetic Acid Bacteria than the whole Acetobacteraceae family. Moreover, it was also observed that the previously described hypervariable regions V1, V3, V5, V6 and V7 were more or less conserved in the family and the spans of the variable regions are quite distinct as well.
Hemalatha, G. R.; Rao, D. Satyanarayana; Guruprasad, L.
2007-01-01
We have identified four repeats and ten domains that are novel in proteins encoded by the Bacillus anthracis str. Ames proteome using automated in silico methods. A “repeat” corresponds to a region comprising less than 55-amino-acid residues that occur more than once in the protein sequence and sometimes present in tandem. A “domain” corresponds to a conserved region with greater than 55-amino-acid residues and may be present as single or multiple copies in the protein sequence. These correspond to (1) 57-amino-acid-residue PxV domain, (2) 122-amino-acid-residue FxF domain, (3) 111-amino-acid-residue YEFF domain, (4) 109-amino-acid-residue IMxxH domain, (5) 103-amino-acid-residue VxxT domain, (6) 84-amino-acid-residue ExW domain, (7) 104-amino-acid-residue NTGFIG domain, (8) 36-amino-acid-residue NxGK repeat, (9) 95-amino-acid-residue VYV domain, (10) 75-amino-acid-residue KEWE domain, (11) 59-amino-acid-residue AFL domain, (12) 53-amino-acid-residue RIDVK repeat, (13) (a) 41-amino-acid-residue AGQF repeat and (b) 42-amino-acid-residue GSAL repeat. A repeat or domain type is characterized by specific conserved sequence motifs. We discuss the presence of these repeats and domains in proteins from other genomes and their probable secondary structure. PMID:17538688
CoSMoS: Conserved Sequence Motif Search in the proteome
Liu, Xiao I; Korde, Neeraj; Jakob, Ursula; Leichert, Lars I
2006-01-01
Background With the ever-increasing number of gene sequences in the public databases, generating and analyzing multiple sequence alignments becomes increasingly time consuming. Nevertheless it is a task performed on a regular basis by researchers in many labs. Results We have now created a database called CoSMoS to find the occurrences and at the same time evaluate the significance of sequence motifs and amino acids encoded in the whole genome of the model organism Escherichia coli K12. We provide a precomputed set of multiple sequence alignments for each individual E. coli protein with all of its homologues in the RefSeq database. The alignments themselves, information about the occurrence of sequence motifs together with information on the conservation of each of the more than 1.3 million amino acids encoded in the E. coli genome can be accessed via the web interface of CoSMoS. Conclusion CoSMoS is a valuable tool to identify highly conserved sequence motifs, to find regions suitable for mutational studies in functional analyses and to predict important structural features in E. coli proteins. PMID:16433915
Location of a major antigenic site involved in Ross River virus neutralization.
Vrati, S; Fernon, C A; Dalgarno, L; Weir, R C
1988-02-01
The location of a major antigenic domain involved in the neutralization of an alphavirus, Ross River virus, has been defined in terms of its position in the amino acid sequence of the E2 glycoprotein. The domain encompasses three topographically close epitopes which were identified using three E2-specific neutralizing monoclonal antibodies in competitive binding assays. Nucleotide sequencing of the structural protein genes of monoclonal antibody-selected antigenic variants showed that for each variant there was a single nucleotide change in the E2 gene leading to a nonconservative amino acid substitution in E2. Changes were at positions 216, 234, and 246-251 in the amino acid sequence. The epitopes are in a region of E2 which, though not strongly conserved as to sequence among Ross River virus, Semliki Forest virus, and Sindbis virus, is conserved in its hydropathy profile among the three alphaviruses. The epitopes lie between two asparagine-linked glycosylation sites (residues 200 and 262) in E2. They are conserved as to position between the mouse virulent T48 strain and the mouse avirulent NB5092 strain.
P53 Gene Mutagenesis in Breast Cancer
2005-03-01
the wild type T peak. 12 Table 1. Sonic ntations dected by SINtA Individual Cell Sequence Amino Acid Species Conservation 3 ID’ ID Change2 Change... differences in the content of toxic substances in the diet (Biggs et al., 1993; Blaszyk et al., 1996). The development of this p53 mutation load...Changes in the P53 Gene in Single Cells Individual Sequence Amino acid Species conservation ’ ID’ Cell ID change’ change Monkey Mouse Rat Chicken
Patarca, R; Dorta, B; Ramirez, J L
1982-01-01
As part of a project pertaining to the organization of ribosomal genes in Kinetoplastidae, we have created a database of published sequences of ribosomal nucleic acids, with information in Spanish. As a first step in their processing, we have written a computer program which introduces the new feature of determining the length of the fragments produced after single or multiple digestion with any of the known restriction enzymes. With this information we have detected conserved Sau3A sites: (i) at the 5' end of the 5.8S rRNA and at the 3' end of the small subunit rRNA, both included in similar larger sequences; (ii) in the 5.8S rRNA of vertebrates (a second one), which is not present in lower eukaryotes, showing a clear evolutionary divergence; and (iii) at the 5' terminus of the small subunit rRNA, included in a larger conserved sequence. The possible biological importance of these sequences is discussed. PMID:6278402
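The fragment-length calculation described in this abstract can be sketched in a few lines. This is not the authors' program. It assumes a linear molecule, a single strand given 5' to 3', and enzymes described by a recognition site plus a cut offset within the site. Sau3AI cuts before the G of GATC; EcoRI (cutting G^AATTC) is included only to show a multiple digest. The function names and the test sequences are hypothetical.

```python
# (recognition site, cut offset from the start of the site)
SAU3AI = ("GATC", 0)    # cuts immediately 5' of GATC
ECORI = ("GAATTC", 1)   # cuts between G and AATTC

def cut_positions(seq, site, offset):
    """All cut coordinates for one enzyme, allowing overlapping sites."""
    positions = []
    i = seq.find(site)
    while i != -1:
        positions.append(i + offset)
        i = seq.find(site, i + 1)
    return positions

def fragment_lengths(seq, enzymes):
    """Fragment lengths, 5' to 3', after digesting a linear sequence."""
    cuts = sorted({p for site, off in enzymes for p in cut_positions(seq, site, off)})
    bounds = [0] + cuts + [len(seq)]
    # Drop zero-length pieces that arise when a cut falls at either end.
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]

print(fragment_lengths("AAGATCTTTTGATCCC", [SAU3AI]))         # single digest
print(fragment_lengths("AAGATCTTGAATTCCC", [SAU3AI, ECORI]))  # double digest
```

Comparing such fragment lists across aligned rRNA sequences is one simple way to spot restriction sites that recur at equivalent positions.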
A dehydrin cognate protein from pea (Pisum sativum L.) with an atypical pattern of expression.
Robertson, M; Chandler, P M
1994-11-01
Dehydrins are a family of proteins characterised by conserved amino acid motifs, and induced in plants by dehydration or treatment with ABA. An antiserum was raised against a synthetic oligopeptide based on the most highly conserved dehydrin amino acid motif, the lysine-rich block (core sequence KIKEK-LPG). This antiserum detected a novel M(r) 40,000 polypeptide and enabled isolation of a corresponding cDNA clone, pPsB61 (B61). The deduced amino acid sequence contained two lysine-rich blocks; however, the remainder of the sequence differed markedly from other pea dehydrins. Surprisingly, the sequence contained a stretch of serine residues, a characteristic common to dehydrins from many plant species but which is missing in pea dehydrin. The expression patterns of B61 mRNA and polypeptide were distinctively different from those of the pea dehydrins during seed development, germination and in young seedlings exposed to dehydration stress or treated with ABA. In particular, dehydration stress led to slightly reduced levels of B61 RNA, and ABA application to young seedlings had no marked effect on its abundance. The M(r) 40,000 polypeptide is thus related to pea dehydrin by the presence of the most highly conserved amino acid sequence motifs, but lacks the characteristic expression pattern of dehydrin. By analogy with heat shock cognate proteins we refer to this protein as a dehydrin cognate.
DOE Office of Scientific and Technical Information (OSTI.GOV)
White, D.A.; Zilinskas, B.A.
1991-08-01
The authors now report the nucleotide sequence of the cytosolic Cu/Zn SOD cloned from a λgt11 cDNA library constructed from mRNA extracted from leaves of 7- to 10-d pea seedlings (Pisum sativum L.). The clone was isolated using a 22-base synthetic oligonucleotide complementary to the amino acid sequence CGIIGLQG. This sequence, found at the protein's carboxy terminus, is highly conserved among plant cytosolic Cu/Zn SODs but not chloroplastic Cu/Zn SODs. The 738-base pair sequence contains an open reading frame specifying 152 codons and a predicted Mr of 18,024 D. The deduced amino acid sequence is highly homologous (79-82% identity) with the sequences of other known plant cytosolic Cu/Zn SODs but less highly conserved (63-65%) when compared with several chloroplastic Cu/Zn SODs including pea (10).
Human somatostatin I: sequence of the cDNA.
Shen, L P; Pictet, R L; Rutter, W J
1982-01-01
RNA has been isolated from a human pancreatic somatostatinoma and used to prepare a cDNA library. After prescreening, clones containing somatostatin I sequences were identified by hybridization with an anglerfish somatostatin I-cloned cDNA probe. From the nucleotide sequence of two of these clones, we have deduced an essentially full-length mRNA sequence, including the preprosomatostatin coding region, 105 nucleotides from the 5' untranslated region and the complete 150-nucleotide 3' untranslated region. The coding region predicts a 116-amino acid precursor protein (Mr, 12.727) that contains somatostatin-14 and -28 at its COOH terminus. The predicted amino acid sequence of human somatostatin-28 is identical to that of somatostatin-28 isolated from the porcine and ovine species. A comparison of the amino acid sequences of human and anglerfish preprosomatostatin I indicated that the COOH-terminal region encoding somatostatin-14 and the adjacent 6 amino acids are highly conserved, whereas the remainder of the molecule, including the signal peptide region, is more divergent. However, many of the amino acid differences found in the pro region of the human and anglerfish proteins are conservative changes. This suggests that the propeptides have a similar secondary structure, which in turn may imply a biological function for this region of the molecule. PMID:6126875
Manikandan, Selvaraj; Balaji, Seetharaaman; Kumar, Anil; Kumar, Rita
2007-01-01
The molecular basis for the survival of bacteria under extreme conditions in which growth is inhibited is a question of great current interest. A preliminary study was carried out to determine residue pattern conservation among the antiporters of enteric bacteria, responsible for extreme acid sensitivity especially in Escherichia coli and Shigella flexneri. Here we found molecular evidence that proved the relationship between E. coli and S. flexneri. Multiple sequence alignment of the gadC-coded acid-sensitive antiporter showed many conserved residue patterns at regular intervals in the N-terminal region. It was observed that as the alignment approaches the C-terminal, the number of conserved residues decreases, indicating that the N-terminal region of this protein has a much more active role than the carboxyl terminal. The motif FHLVFFLLLGG is well conserved at the amino terminal across all gadC-coded proteins. The motif is also partially conserved among other antiporters that are not coded by gadC but are involved in acid sensitivity/resistance mechanisms. Phylogenetic cluster analysis proves the relationship of Escherichia coli and Shigella flexneri. The gadC-coded proteins converge as a clade and diverge from other antiporters belonging to the amino acid-polyamine-organocation (APC) superfamily. PMID:21670792
Folmar, L.D.; Denslow, N.D.; Wallace, R.A.; LaFleur, G.; Gross, T.S.; Bonomelli, S.; Sullivan, C.V.
1995-01-01
N-terminal amino acid sequences for vitellogenin (Vtg) from six species of teleost fish (striped bass, mummichog, pinfish, brown bullhead, medaka, yellow perch) and the sturgeon are compared with published N-terminal Vtg sequences for the lamprey, clawed frog and domestic chicken. Striped bass and mummichog had 100% identical amino acids between positions 7 and 21, while pinfish, brown bullhead, sturgeon, lamprey, Xenopus and chicken had 87%, 93%, 60%, 47%, 47-60% (for four transcripts) and 40% identical, respectively, with striped bass for the same positions. Partial sequences obtained for medaka and yellow perch were 100% identical between positions 5 and 10. The potential utility of this conserved sequence for studies on the biochemistry, molecular biology and pathology of vitellogenesis is discussed.
Bonen, Linda; Boer, Poppo H.; Gray, Michael W.
1984-01-01
We have determined the sequence of the wheat mitochondrial gene for cytochrome oxidase subunit II (COII) and find that its derived protein sequence differs from that of maize at only three amino acid positions. Unexpectedly, all three replacements are non-conservative ones. The wheat COII gene has a highly-conserved intron at the same position as in maize, but the wheat intron is 1.5 times longer because of an insert relative to its maize counterpart. Hybridization analysis of mitochondrial DNA from rye, pea, broad bean and cucumber indicates strong sequence conservation of COII coding sequences among all these higher plants. However, only rye and maize mitochondrial DNA show homology with wheat COII intron sequences and rye alone with intron-insert sequences. We find that a sequence identical to the region of the 5' exon corresponding to the transmembrane domain of the COII protein is present at a second genomic location in wheat mitochondria. These variations in COII gene structure and size, as well as the presence of repeated COII sequences, illustrate at the DNA sequence level, factors which contribute to higher plant mitochondrial DNA diversity and complexity. PMID:16453565
Vouille, V; Amiche, M; Nicolas, P
1997-09-01
We cloned the genes of two members of the dermaseptin family, broad-spectrum antimicrobial peptides isolated from the skin of the arboreal frog Phyllomedusa bicolor. The dermaseptin gene Drg2 has a 2-exon coding structure interrupted by a small 137-bp intron, wherein exon 1 encoded a 22-residue hydrophobic signal peptide and the first three amino acids of the acidic propiece; exon 2 contained the 18 additional acidic residues of the propiece plus a typical prohormone processing signal Lys-Arg and a 32-residue dermaseptin progenitor sequence. The dermaseptin genes Drg2 and Drg1g2 have conserved sequences at both untranslated ends and in the first and second coding exons. In contrast, Drg1g2 comprises a third coding exon for a short version of the acidic propiece and a second dermaseptin progenitor sequence. Structural conservation between the two genes suggests that Drg1g2 arose recently from an ancestral Drg2-like gene through amplification of part of the second coding exon and 3'-untranslated region. Analysis of the cDNAs coding precursors for several frog skin peptides of highly different structures and activities demonstrates that the signal peptides and part of the acidic propieces are encoded by conserved nucleotides encompassed by the first coding exon of the dermaseptin genes. The organization of the genes that belong to this family, with the signal peptide and the progenitor sequence on separate exons, permits strikingly different peptides to be directed into the secretory pathway. The recruitment of such a homologous 'secretory' exon by otherwise non-homologous genes may have been an early event in the evolution of amphibian.
Pyrin gene and mutants thereof, which cause familial Mediterranean fever
Kastner, Daniel L [Bethesda, MD; Aksentijevichh, Ivona [Bethesda, MD; Centola, Michael [Tacoma Park, MD; Deng, Zuoming [Gaithersburg, MD; Sood, Ramen [Rockville, MD; Collins, Francis S [Rockville, MD; Blake, Trevor [Laytonsville, MD; Liu, P Paul [Ellicott City, MD; Fischel-Ghodsian, Nathan [Los Angeles, CA; Gumucio, Deborah L [Ann Arbor, MI; Richards, Robert I [North Adelaide, AU; Ricke, Darrell O [San Diego, CA; Doggett, Norman A [Santa Cruz, NM; Pras, Mordechai [Tel-Hashomer, IL
2003-09-30
The invention provides the nucleic acid sequence encoding the protein associated with familial Mediterranean fever (FMF). The cDNA sequence is designated as MEFV. The invention is also directed towards fragments of the DNA sequence, as well as the corresponding sequence for the RNA transcript and fragments thereof. Another aspect of the invention provides the amino acid sequence for a protein (pyrin) associated with FMF. The invention is directed towards both the full length amino acid sequence, fusion proteins containing the amino acid sequence and fragments thereof. The invention is also directed towards mutants of the nucleic acid and amino acid sequences associated with FMF. In particular, the invention discloses three missense mutations, clustered within about 40 to 50 amino acids, in the highly conserved rfp (B30.2) domain at the C-terminal of the protein. These mutants include M680I, M694V, K695R, and V726A. Additionally, the invention includes methods for diagnosing a patient at risk for having FMF and kits therefor.
Amino acid sequence analysis of the annexin super-gene family of proteins.
Barton, G J; Newman, R H; Freemont, P S; Crumpton, M J
1991-06-15
The annexins are a widespread family of calcium-dependent membrane-binding proteins. No common function has been identified for the family and, until recently, no crystallographic data existed for an annexin. In this paper we draw together 22 available annexin sequences consisting of 88 similar repeat units, and apply the techniques of multiple sequence alignment, pattern matching, secondary structure prediction and conservation analysis to the characterisation of the molecules. The analysis clearly shows that the repeats cluster into four distinct families and that greatest variation occurs within the repeat 3 units. Multiple alignment of the 88 repeats shows amino acids with conserved physicochemical properties at 22 positions, with only Gly at position 23 being absolutely conserved in all repeats. Secondary structure prediction techniques identify five conserved helices in each repeat unit and patterns of conserved hydrophobic amino acids are consistent with one face of a helix packing against the protein core in predicted helices a, c, d, e. Helix b is generally hydrophobic in all repeats, but contains a striking pattern of repeat-specific residue conservation at position 31, with Arg in repeats 4 and Glu in repeats 2, but unconserved amino acids in repeats 1 and 3. This suggests repeats 2 and 4 may interact via a buried saltbridge. The loop between predicted helices a and b of repeat 3 shows features distinct from the equivalent loop in repeats 1, 2 and 4, suggesting an important structural and/or functional role for this region. No compelling evidence emerges from this study for uteroglobin and the annexins sharing similar tertiary structures, or for uteroglobin representing a derivative of a primordial one-repeat structure that underwent duplication to give the present day annexins. The analyses performed in this paper are re-evaluated in the Appendix, in the light of the recently published X-ray structure for human annexin V. 
The structure confirms most of the predictions and shows the power of techniques for the determination of tertiary structural information from the amino acid sequences of an aligned protein family.
Takaesu, Azusa; Watanabe, Kiyotaka; Takai, Shinji; Sasaki, Yukako; Orino, Koichi
2008-01-01
Background The iron-storage protein ferritin plays a central role in iron metabolism. Ferritin has a dual function: it stores iron and sequesters it to protect against iron-catalyzed reactive oxygen species. Tissue ferritin is composed of two kinds of subunits (H: heavy chain or heart-type subunit; L: light chain or liver-type subunit). Ferritin gene expression is controlled at the translational level in an iron-dependent manner or at the transcriptional level in an iron-independent manner. However, sequencing analysis of marine mammalian ferritin subunits has not yet been performed fully. The purpose of this study is to reveal the cDNA-derived amino acid sequences of cetacean ferritin H and L subunits, and to demonstrate the possibility that expression of these subunits, especially the H subunit, is regulated by iron. Methods Sequence analyses of cetacean ferritin H and L subunits were performed by direct sequencing of polymerase chain reaction (PCR) fragments from cDNAs generated via reverse transcription-PCR of leukocyte total RNA prepared from blood samples of six different dolphin species (Pseudorca crassidens, Lagenorhynchus obliquidens, Grampus griseus, Globicephala macrorhynchus, Tursiops truncatus, and Delphinapterus leucas). The putative iron-responsive element (IRE) sequence in the 5'-untranslated region of the six different dolphin species was revealed by direct sequencing of PCR fragments obtained using leukocyte genomic DNA. Results Dolphin H and L subunits consist of 182 and 174 amino acids, respectively, and amino acid sequence identities of ferritin subunits among these dolphins are highly conserved (H: 99–100%, (99→98) ; L: 98–100%). The conserved 28 bp IRE sequence was located 144 bp upstream from the initiation codon in the six different dolphin species. Conclusion These results indicate that six different dolphin species have conserved ferritin sequences, and suggest that these genes are iron-dependently expressed. PMID:18954429
2014-01-01
Background Ambiscript is a graphically-designed nucleic acid notation that uses symbol symmetries to support sequence complementation, highlight biologically-relevant palindromes, and facilitate the analysis of consensus sequences. Although the original Ambiscript notation was designed to easily represent consensus sequences for multiple sequence alignments, the notation’s black-on-white ambiguity characters are unable to reflect the statistical distribution of nucleotides found at each position. We now propose a color-augmented ambigraphic notation to encode the frequency of positional polymorphisms in these consensus sequences. Results We have implemented this color-coding approach by creating an Adobe Flash® application ( http://www.ambiscript.org) that shades and colors modified Ambiscript characters according to the prevalence of the encoded nucleotide at each position in the alignment. The resulting graphic helps viewers perceive biologically-relevant patterns in multiple sequence alignments by uniquely combining color, shading, and character symmetries to highlight palindromes and inverted repeats in conserved DNA motifs. Conclusion Juxtaposing an intuitive color scheme over the deliberate character symmetries of an ambigraphic nucleic acid notation yields a highly-functional nucleic acid notation that maximizes information content and successfully embodies key principles of graphic excellence put forth by the statistician and graphic design theorist, Edward Tufte. PMID:24447494
Wang, Zhengjia; Huang, Ruiming; Sun, Zhichao; Zhang, Tong; Huang, Jianqin
2017-05-01
MicroRNAs (miRNAs) are important regulators of plant development and fruit formation. Mature embryos of hickory (Carya cathayensis Sarg.) nuts contain more than 70% oil (comprising 90% unsaturated fatty acids), along with a substantial amount of oleic acid. To understand the roles of miRNAs involved in oil and oleic acid production during hickory embryogenesis, three small RNA libraries from different stages of embryogenesis were constructed. Deep sequencing of these three libraries identified 95 conserved miRNAs with 19 miRNA*s, 7 novel miRNAs (as well as their corresponding miRNA*s), and 26 potentially novel miRNAs. The analysis identified 15 miRNAs involved in oil and oleic acid production that are differentially expressed during embryogenesis in hickory. Among them, nine miRNA sequences, including eight conserved and one novel, were confirmed by qRT-PCR. In addition, 145 target genes of the novel miRNAs were predicted using a bioinformatic approach. Our results provide a framework for better understanding the roles of miRNAs during embryogenesis in hickory.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Crooks, Gavin E.
WebLogo is a web-based application designed to make the generation of sequence logos as easy and painless as possible. Sequence logos are a graphical representation of an amino acid or nucleic acid multiple sequence alignment developed by Tom Schneider and Mike Stephens. Each logo consists of stacks of symbols, one stack for each position in the sequence. The overall height of the stack indicates the sequence conservation at that position, while the height of symbols within the stack indicates the relative frequency of each amino or nucleic acid at that position. In general, a sequence logo provides a richer and more precise description of, for example, a binding site, than would a consensus sequence.
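The stack and symbol heights described in this entry can be illustrated with a minimal Python sketch. It assumes the usual information-content definition (log2 of the alphabet size minus the column's Shannon entropy) and omits any small-sample correction; the function name `logo_heights` and the toy alignment are hypothetical and are not taken from WebLogo itself.

```python
import math
from collections import Counter


def logo_heights(column, alphabet_size=4):
    """Return (stack_height_bits, {symbol: symbol_height_bits}) for one column.

    Assumption: stack height = log2(alphabet_size) - Shannon entropy,
    with no small-sample correction applied.
    """
    counts = Counter(column)
    total = sum(counts.values())
    freqs = {symbol: count / total for symbol, count in counts.items()}
    entropy = -sum(p * math.log2(p) for p in freqs.values())
    stack = math.log2(alphabet_size) - entropy
    # Each symbol's height is its relative frequency times the stack height.
    return stack, {symbol: p * stack for symbol, p in freqs.items()}


# Hypothetical toy DNA alignment (rows are sequences, columns are positions).
alignment = ["TATAAT", "TATGAT", "TACAAT", "TATACT"]
for position, col in enumerate(zip(*alignment), start=1):
    stack, symbols = logo_heights("".join(col))
    print(position, round(stack, 3), symbols)
```

A fully conserved DNA column reaches the 2-bit maximum, while a column split evenly between two bases drops to 1 bit, shared equally by the two symbols.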
Dijk, J; van den Broek, R; Nasiulas, G; Beck, A; Reinhardt, R; Wittmann-Liebold, B
1987-08-01
The amino-terminal sequence of ribosomal protein L10 from Halobacterium marismortui has been determined up to residue 54, using both a liquid- and a gas-phase sequenator. The two sequences are in good agreement. The protein is clearly homologous to protein HcuL10 from the related strain Halobacterium cutirubrum. Furthermore, a weaker but distinct homology to ribosomal protein L6 from Escherichia coli and Bacillus stearothermophilus can be detected. In addition to 7 identical amino acids in the first 36 residues in all four sequences a number of conservative replacements occurs, of mainly hydrophobic amino acids. In this common region the pattern of conserved amino acids suggests the presence of a beta-alpha fold as it occurs in ribosomal proteins L12 and L30. Furthermore, several potential cases of homology to other ribosomal components of the three ur-kingdoms have been found.
Nucleotide and amino acid variations of tannase gene from different Aspergillus strains.
Borrego-Terrazas, J A; Lara-Victoriano, F; Flores-Gallegos, A C; Veana, F; Aguilar, C N; Rodríguez-Herrera, R
2014-08-01
Tannase is an enzyme that catalyses the hydrolysis of ester bonds present in tannins. Most of the scientific reports about this biocatalysis focus on aspects related to tannase production and its recovery; on the other hand, reports assessing the molecular aspects of the tannase gene or protein are scarce. In the present study, a tannase gene fragment from several Aspergillus strains isolated from the Mexican semidesert was sequenced and compared with tannase amino acid sequences reported in NCBI database using bioinformatics tools. The genetic relationship among the different tannase sequences was also determined. A conserved region of 7 amino acids was found with the conserved motif GXSXG common to esterases, in which the active-site serine residue is located. In addition, in Aspergillus niger strains GH1 and PSH, we found an extra codon in the tannase sequences encoding glycine. The tannase gene belonging to semidesert fungal strains followed a neutral evolution path with the formation of 10 haplotypes, of which A. niger GH1 and PSH haplotypes are the oldest.
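As an illustration of how a motif such as GXSXG can be located in a protein sequence, the following Python sketch treats X as "any residue" and scans with a regular expression. The helper name `find_motif` and the example sequence are hypothetical; they are not data from the study.

```python
import re


def find_motif(protein, pattern="G.S.G"):
    """Return (1-based start, matched text) for every occurrence, overlaps included.

    The pattern is the GXSXG motif with X written as the regex wildcard '.'.
    A lookahead is used so that overlapping occurrences are not skipped.
    """
    scanner = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in scanner.finditer(protein.upper())]


# Hypothetical fragment containing one GXSXG occurrence (GHSMG).
fragment = "MKTLAGHSMGGALAVV"
print(find_motif(fragment))
```

The serine in the middle of each hit is the candidate active-site residue, sitting two positions after the reported start.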
Beccari, T; Hoade, J; Orlacchio, A; Stirling, J L
1992-01-01
cDNAs encoding the mouse beta-N-acetylhexosaminidase alpha-subunit were isolated from a mouse testis library. The longest of these (1.7 kb) was sequenced and showed 83% similarity with the human alpha-subunit cDNA sequence. The 5' end of the coding sequence was obtained from a genomic DNA clone. Alignment of the human and mouse sequences showed that all three putative N-glycosylation sites are conserved, but that the mouse alpha-subunit has an additional site towards the C-terminus. All eight cysteines in the human sequence are conserved in the mouse. There are an additional two cysteines in the mouse alpha-subunit signal peptide. All amino acids affected in Tay-Sachs-disease mutations are conserved in the mouse. PMID:1379046
Chapell, J D; Goral, M I; Rodgers, S E; dePamphilis, C W; Dermody, T S
1994-01-01
To better understand genetic diversity within mammalian reoviruses, we determined S2 nucleotide and deduced sigma 2 amino acid sequences of nine reovirus strains and compared these sequences with those of prototype strains of the three reovirus serotypes. The S2 gene and sigma 2 protein are highly conserved among the four type 1, one type 2, and seven type 3 strains studied. Phylogenetic analyses based on S2 nucleotide sequences of the 12 reovirus strains indicate that diversity within the S2 gene is independent of viral serotype. Additionally, we found marked topological differences between phylogenetic trees generated from S1 and S2 gene nucleotide sequences of the seven type 3 strains. These results demonstrate that reovirus S1 and S2 genes have distinct evolutionary histories, thus providing phylogenetic evidence for lateral transfer of reovirus genes in nature. When variability among the 12 sigma 2-encoding S2 nucleotide sequences was analyzed at synonymous positions, we found that approximately 60 nucleotides at the 5' terminus and 30 nucleotides at the 3' terminus were markedly conserved in comparison with other sigma 2-encoding regions of S2. Predictions of RNA secondary structures indicate that the more conserved S2 sequences participate in the formation of an extended region of duplex RNA interrupted by a pair of stem-loops. Among the 12 deduced sigma 2 amino acid sequences examined, substitutions were observed at only 11% of amino acid positions. This finding suggests that constraints on the structure or function of sigma 2, perhaps in part because of its location in the virion core, have limited sequence diversity within this protein. PMID:8289378
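A figure such as "substitutions at 11% of amino acid positions" can be read as the fraction of alignment columns that are not identical across all sequences. The Python sketch below shows that calculation under the assumption of a gap-free alignment; the function name `variable_fraction` and the toy sequences are hypothetical.

```python
def variable_fraction(aligned):
    """Fraction of alignment columns containing more than one residue type.

    Assumption: sequences are pre-aligned, of equal length, and gap-free.
    """
    if not aligned or len({len(seq) for seq in aligned}) != 1 or not aligned[0]:
        raise ValueError("need non-empty sequences of equal aligned length")
    columns = list(zip(*aligned))
    variable = sum(1 for col in columns if len(set(col)) > 1)
    return variable / len(columns)


# Hypothetical toy alignment: 2 of 10 columns vary.
toy = ["MARAAFLFKT", "MARAAFLFKT", "MSRAAFLFKT", "MARAAFLYKT"]
print(variable_fraction(toy))
```

Real alignments contain gaps, so a production version would need an explicit rule for gap characters, which this sketch leaves out.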
NASA Technical Reports Server (NTRS)
Fox, G. E.
1985-01-01
Comparisons of complete 16S ribosomal ribonucleic acid (rRNA) sequences established that the secondary structure of these molecules is highly conserved. Earlier work with 5S rRNA secondary structure revealed that when structural conservation exists the alignment of sequences is straightforward. The constancy of structure implies minimal functional change. Under these conditions a uniform evolutionary rate can be expected so that conditions are favorable for phylogenetic tree construction.
Ashkenazy, Haim; Abadi, Shiran; Martz, Eric; Chay, Ofer; Mayrose, Itay; Pupko, Tal; Ben-Tal, Nir
2016-01-01
The degree of evolutionary conservation of an amino acid in a protein or a nucleic acid in DNA/RNA reflects a balance between its natural tendency to mutate and the overall need to retain the structural integrity and function of the macromolecule. The ConSurf web server (http://consurf.tau.ac.il), established over 15 years ago, analyses the evolutionary pattern of the amino/nucleic acids of the macromolecule to reveal regions that are important for structure and/or function. Starting from a query sequence or structure, the server automatically collects homologues, infers their multiple sequence alignment and reconstructs a phylogenetic tree that reflects their evolutionary relations. These data are then used, within a probabilistic framework, to estimate the evolutionary rates of each sequence position. Here we introduce several new features into ConSurf, including automatic selection of the best evolutionary model used to infer the rates, the ability to homology-model query proteins, prediction of the secondary structure of query RNA molecules from sequence, the ability to view the biological assembly of a query (in addition to the single chain), mapping of the conservation grades onto 2D RNA models and an advanced view of the phylogenetic tree that enables interactively rerunning ConSurf with the taxa of a sub-tree. PMID:27166375
Hall, L; Laird, J E; Craig, R K
1984-01-01
Nucleotide sequence analysis of cloned guinea-pig casein B cDNA sequences has identified two casein B variants related to the bovine and rat alpha s1 caseins. Amino acid homology was largely confined to the known bovine or predicted rat phosphorylation sites and within the 'signal' precursor sequence. Comparison of the deduced nucleotide sequence of the guinea-pig and rat alpha s1 casein mRNA species showed greater sequence conservation in the non-coding than in the coding regions, suggesting a functional and possibly regulatory role for the non-coding regions of casein mRNA. The results provide insight into the evolution of the casein genes, and raise questions as to the role of conserved nucleotide sequences within the non-coding regions of mRNA species. PMID:6548375
Chen, Xiaochi; Ansai, Toshihiro; Awano, Shuji; Iida, Toshiya; Barik, Sailen; Takehara, Tadamichi
1999-01-01
A novel acid phosphatase containing phosphotyrosyl phosphatase (PTPase) activity, designated PiACP, from Prevotella intermedia ATCC 25611, an anaerobe implicated in progressive periodontal disease, has been purified and characterized. PiACP, a monomer with an apparent molecular mass of 30 kDa, did not require divalent metal cations for activity and was sensitive to orthovanadate but highly resistant to okadaic acid. The enzyme exhibited substantial activity against tyrosine phosphate-containing peptides derived from the epidermal growth factor receptor. On the basis of N-terminal and internal amino acid sequences of purified PiACP, the gene coding for PiACP was isolated and sequenced. The PiACP gene consisted of 792 bp and coded for a basic protein with an Mr of 29,164. The deduced amino acid sequence exhibited striking similarity (25 to 64%) to those of members of class A bacterial acid phosphatases, including PhoC of Morganella morganii, and involved a conserved phosphatase sequence motif that is shared among several lipid phosphatases and the mammalian glucose-6-phosphatases. The highly conservative motif HCXAGXXR in the active domain of PTPase was not found in PiACP. Mutagenesis of recombinant PiACP showed that His-170 and His-209 were essential for activity. Thus, the class A bacterial acid phosphatases including PiACP may function as atypical PTPases, the biological functions of which remain to be determined. PMID:10559178
Conservation and diversification of Msx protein in metazoan evolution.
Takahashi, Hirokazu; Kamiya, Akiko; Ishiguro, Akira; Suzuki, Atsushi C; Saitou, Naruya; Toyoda, Atsushi; Aruga, Jun
2008-01-01
Msx (/msh) family genes encode homeodomain (HD) proteins that control ontogeny in many animal species. We compared the structures of Msx genes from a wide range of Metazoa (Porifera, Cnidaria, Nematoda, Arthropoda, Tardigrada, Platyhelminthes, Mollusca, Brachiopoda, Annelida, Echiura, Echinodermata, Hemichordata, and Chordata) to gain an understanding of the role of these genes in phylogeny. Exon-intron boundary analysis suggested that the position of the intron located N-terminally to the HDs was widely conserved in all the genes examined, including those of cnidarians. Amino acid (aa) sequence comparison revealed 3 new evolutionarily conserved domains, as well as very strong conservation of the HDs. Two of the three domains were associated with Groucho-like protein binding in both a vertebrate and a cnidarian Msx homolog, suggesting that the interaction between Groucho-like proteins and Msx proteins was established in eumetazoan ancestors. Pairwise comparison among the collected HDs and their C-flanking aa sequences revealed that the degree of sequence conservation varied depending on the animal taxa from which the sequences were derived. Highly conserved Msx genes were identified in the Vertebrata, Cephalochordata, Hemichordata, Echinodermata, Mollusca, Brachiopoda, and Anthozoa. The wide distribution of the conserved sequences in the animal phylogenetic tree suggested that metazoan ancestors had already acquired a set of conserved domains of the current Msx family genes. Interestingly, although strongly conserved sequences were recovered from the Vertebrata, Cephalochordata, and Anthozoa, the sequences from the Urochordata and Hydrozoa showed weak conservation. Because the Vertebrata-Cephalochordata-Urochordata and Anthozoa-Hydrozoa represent sister groups in the Chordata and Cnidaria, respectively, Msx sequence diversification may have occurred differentially in the course of evolution. 
We speculate that selective loss of the conserved domains in Msx family proteins contributed to the diversification of animal body organization.
CodonLogo: a sequence logo-based viewer for codon patterns.
Sharma, Virag; Murphy, David P; Provan, Gregory; Baranov, Pavel V
2012-07-15
Conserved patterns across a multiple sequence alignment can be visualized by generating sequence logos. Sequence logos show each column in the alignment as stacks of symbol(s) where the height of a stack is proportional to its informational content, whereas the height of each symbol within the stack is proportional to its frequency in the column. Sequence logos use symbols of either nucleotide or amino acid alphabets. However, certain regulatory signals in messenger RNA (mRNA) act as combinations of codons. Yet no tool is available for visualization of conserved codon patterns. We present the first application which allows visualization of conserved regions in a multiple sequence alignment in the context of codons. CodonLogo is based on WebLogo3 and uses the same heuristics but treats codons as inseparable units of a 64-letter alphabet. CodonLogo can discriminate patterns of codon conservation from patterns of nucleotide conservation that appear indistinguishable in standard sequence logos. The CodonLogo source code and its implementation (in a local version of the Galaxy Browser) are available at http://recode.ucc.ie/CodonLogo and through the Galaxy Tool Shed at http://toolshed.g2.bx.psu.edu/.
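The idea of treating codons as inseparable units of a 64-letter alphabet can be sketched in Python as follows. This is an illustrative calculation only, not CodonLogo's code: the function names and the two toy codon sets are hypothetical, and information content is taken as log2 of the alphabet size minus the Shannon entropy with no sample-size correction.

```python
import math
from collections import Counter


def information_bits(symbols, alphabet_size):
    """Information content of one column: log2(alphabet size) - Shannon entropy."""
    counts = Counter(symbols)
    total = len(symbols)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return math.log2(alphabet_size) - entropy


def nucleotide_profile(codons):
    """Per-nucleotide information (4-letter alphabet) for the three codon positions."""
    return [information_bits([codon[i] for codon in codons], 4) for i in range(3)]


def codon_information(codons):
    """Information of the same column when codons are units of a 64-letter alphabet."""
    return information_bits(codons, 64)


# Two hypothetical codon columns with identical per-nucleotide profiles.
set_a = ["AAA", "CCC", "AAA", "CCC"]   # only two distinct codons
set_b = ["ACC", "CAA", "AAC", "CCA"]   # four distinct codons
print(nucleotide_profile(set_a), codon_information(set_a))
print(nucleotide_profile(set_b), codon_information(set_b))
```

Both toy sets give 1 bit at each nucleotide position, yet the codon-level values differ (5 bits versus 4 bits), which is the kind of distinction a nucleotide logo cannot show.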
Bahramnejad, Bahman
2014-01-01
P. atlantica subsp. Kurdica, with the local name of Baneh, is a wild medicinal plant which grows in Kurdistan, Iran. The identification of resistance gene analogs holds great promise for the development of resistant cultivars. A PCR approach with degenerate primers designed according to conserved NBS-LRR (nucleotide binding site-leucine rich repeat) regions of known disease-resistance (R) genes was used to amplify and clone homologous sequences from P. atlantica subsp. Kurdica. A DNA fragment of the expected 500-bp size was amplified. The nucleotide sequence of this amplicon was obtained through sequencing, and comparison of the predicted amino acid sequence with the amino acid sequences of known R-genes revealed significant sequence similarity. Alignment of the deduced amino acid sequence of the P. atlantica subsp. Kurdica resistance gene analog (RGA) showed strong identity, ranging from 68% to 77%, to the non-toll interleukin receptor (non-TIR) R-gene subfamily from other plants. A P-loop motif (GMMGGEGKTT), a conserved hydrophobic motif (GLPLAL), a kinase-2a motif (LLVLDDV, replaced by IAVFDDI in PAKRGA1) and a kinase-3a motif (FGPGSRIII) were present in all RGAs. A phylogenetic tree based on the deduced amino acid sequences of PAKRGA1 and RGAs from different species indicated that they separated into two clusters, with PAKRGA1 in cluster II. The isolated NBS analogs can eventually be used as guidelines to isolate numerous R-genes in pistachio. PMID:27843981
The complete amino acid sequence of human skeletal-muscle fructose-bisphosphate aldolase.
Freemont, P S; Dunbar, B; Fothergill-Gilmore, L A
1988-01-01
The complete amino acid sequence of human skeletal-muscle fructose-bisphosphate aldolase, comprising 363 residues, was determined. The sequence was deduced by automated sequencing of CNBr-cleavage, o-iodosobenzoic acid-cleavage, trypsin-digest and staphylococcal-proteinase-digest fragments. Comparison of the sequence with other class I aldolase sequences shows that the mammalian muscle isoenzyme is one of the most highly conserved enzymes known, with only about 2% of the residues changing per 100 million years. Non-mammalian aldolases appear to be evolving at the same rate as other glycolytic enzymes, with about 4% of the residues changing per 100 million years. Secondary-structure predictions are analysed in an accompanying paper [Sawyer, Fothergill-Gilmore & Freemont (1988) Biochem. J. 249, 789-793]. PMID:3355497
Janecek, S
1995-12-11
A short conserved sequence equivalent to the fifth conserved sequence region of alpha-amylases (173_LPDLD, Aspergillus oryzae alpha-amylase) comprising the calcium-ligand aspartate, Asp-175, was identified in the amino acid sequences of several members of the family of (alpha/beta)8-barrel glycosyl hydrolases. Despite the fact that the aspartate is not invariantly conserved, the stretch can be easily recognised in all sequences to be positioned 26-28 amino acid residues in front of the well-known catalytic aspartate (Asp-206, A. oryzae alpha-amylase) located in the beta 4-strand of the barrel. The identification of this region revealed remarkable similarities between some alpha-amylases (those from Bacillus megaterium, Bacillus subtilis and Dictyoglomus thermophilum) on the one hand and several different enzyme specificities (such as oligo-1,6-glucosidase, amylomaltase and neopullulanase, respectively) on the other hand. The most interesting example was offered by B. subtilis alpha-amylase and potato amylomaltase with the regions LYDWN and LYDWK, respectively. These observations support the idea that all members of the family of glycosyl hydrolases adopting the structure of the alpha-amylase-type (alpha/beta)8-barrel are mutually closely related and the strict evolutionary borders separating the individual enzyme specificities can be hardly defined.
Genomic structure of the human D-site binding protein (DBP) gene
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shutler, G.; Glassco, T.; Kang, Xiaolin
1996-06-15
The human gene for the D-Site Binding Protein (DBP) has been sequenced and characterized. This gene is a member of the b/ZIP family of transcription factors and is one of three genes forming the PAR sub-family. DBP has been implicated in the diurnal regulation of a variety of liver-specific genes. Examination of the genomic structure of DBP reveals that the gene is divided into four exons and is contained within a relatively compact region of approximately 6 kb. These exons appear to correspond to functional divisions of the DBP protein. Exon 1 contains a long 5' UTR, and conservation between the rat and the human genes of the presence of small open reading frames within this region suggests that it may play a role in translational control. Exon 2 contains a limited region of similarity to the other PAR domain genes, which may be part of a potential activation domain. Exon 3 contains the PAR domain and differs by only 1 of 71 amino acids between rat and human. Exon 4, containing both the basic and the leucine zipper domains, is likewise highly conserved. The overall degree of homology between the rat and the human cDNA sequences is 82% for the nucleic acid sequence and 92% for the protein sequence. Comparison of the rat and human proximal promoters reveals extensive sequence conservation, with two previously characterized DNA binding sites being conserved at the functional and sequence levels. 31 refs., 4 figs.
Evidence for the Concerted Evolution between Short Linear Protein Motifs and Their Flanking Regions
Chica, Claudia; Diella, Francesca; Gibson, Toby J.
2009-01-01
Background Linear motifs are short modules of protein sequences that play a crucial role in mediating and regulating many protein–protein interactions. The function of linear motifs strongly depends on the context, e.g. functional instances mainly occur inside flexible regions that are accessible for interaction. Sometimes linear motifs appear as isolated islands of conservation in multiple sequence alignments. However, they also occur in larger blocks of sequence conservation, suggesting an active role for the neighbouring amino acids. Results The evolution of regions flanking 116 functional linear motif instances was studied. The conservation of the amino acid sequence and order/disorder tendency of those regions was related to presence/absence of the instance. For the majority of the analysed instances, the pairs of sequences conserving the linear motif were also observed to maintain a similar local structural tendency and/or to have higher local sequence conservation when compared to pairs of sequences where one is missing the linear motif. Furthermore, those instances have a higher chance to co–evolve with the neighbouring residues in comparison to the distant ones. Those findings are supported by examples where the regulation of the linear motif–mediated interaction has been shown to depend on the modifications (e.g. phosphorylation) at neighbouring positions or is thought to benefit from the binding versatility of disordered regions. Conclusion The results suggest that flanking regions are relevant for linear motif–mediated interactions, both at the structural and sequence level. More interestingly, they indicate that the prediction of linear motif instances can be enriched with contextual information by performing a sequence analysis similar to the one presented here. This can facilitate the understanding of the role of these predicted instances in determining the protein function inside the broader context of the cellular network where they arise. 
PMID:19584925
Molecular Characterization of a Catalase from Hydra vulgaris
Dash, Bhagirathi; Phillips, Timothy D.
2012-01-01
Catalase, an antioxidant and hydroperoxidase enzyme, protects the cellular environment from the harmful effects of hydrogen peroxide by facilitating its degradation to oxygen and water. Molecular information on a cnidarian catalase and/or peroxidase is, however, limited. In this work an apparent full length cDNA sequence coding for a catalase (HvCatalase) was isolated from Hydra vulgaris using 3’- and 5’- (RLM) RACE approaches. The 1859 bp HvCatalase cDNA included an open reading frame of 1518 bp encoding a putative protein of 505 amino acids with a predicted molecular mass of 57.44 kDa. The deduced amino acid sequence of HvCatalase contained several highly conserved motifs including the heme-ligand signature sequence RLFSYGDTH and the active site signature FXRERIPERVVHAKGXGA. A comparative analysis showed the presence of conserved catalytic amino acids [His(71), Asn(145), and Tyr(354)] in HvCatalase as well. Homology modeling indicated the presence of the conserved features of the mammalian catalase fold. Hydrae exposed to thermal, starvation, metal and oxidative stress responded by regulating their catalase mRNA transcription. These results indicated that the HvCatalase gene is involved in the cellular stress response and (anti)oxidative processes triggered by stressor and contaminant exposure. PMID:22521743
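The sequence lengths quoted in this entry are internally consistent, which a short Python check makes explicit. The function name `orf_summary` is hypothetical; the only inputs are the four numbers stated in the abstract, and the "one stop codon" reading is an inference from the arithmetic rather than something the abstract states.

```python
def orf_summary(cdna_bp, orf_bp, protein_aa, mass_kda):
    """Derive codon count, untranslated length and mean residue mass from reported sizes."""
    if orf_bp % 3 != 0:
        raise ValueError("an open reading frame should be a multiple of 3 bp")
    codons = orf_bp // 3
    return {
        "codons": codons,
        # Codons beyond the encoded residues; 1 is consistent with a single stop codon.
        "non_coding_codons": codons - protein_aa,
        # Everything outside the ORF (5' plus 3' untranslated sequence combined).
        "utr_bp": cdna_bp - orf_bp,
        "mean_residue_da": mass_kda * 1000 / protein_aa,
    }


# Values as reported for HvCatalase: 1859 bp cDNA, 1518 bp ORF, 505 aa, 57.44 kDa.
print(orf_summary(1859, 1518, 505, 57.44))
```

The 1518 bp ORF corresponds to 506 codons (505 residues plus one further codon), leaves 341 bp outside the ORF, and implies a mean residue mass of roughly 114 Da.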
Comparative analysis of ribosomal protein L5 sequences from bacteria of the genus Thermus.
Jahn, O; Hartmann, R K; Boeckh, T; Erdmann, V A
1991-06-01
The genes for the ribosomal 5S rRNA binding protein L5 have been cloned from three extremely thermophilic eubacteria, Thermus flavus, Thermus thermophilus HB8 and Thermus aquaticus (Jahn et al, submitted). Genes for protein L5 from the three Thermus strains display 95% G/C in third positions of codons. Amino acid sequences deduced from the DNA sequence were shown to be identical for T flavus and T thermophilus, although the corresponding DNA sequences differed by two T to C transitions in the T thermophilus gene. Protein L5 sequences from T flavus and T thermophilus are 95% homologous to L5 from T aquaticus and 56.5% homologous to the corresponding E coli sequence. The lowest degrees of homology were found between the T flavus/T thermophilus L5 proteins and those of yeast L16 (27.5%), Halobacterium marismortui (34.0%) and Methanococcus vannielii (36.6%). From sequence comparison it becomes clear that thermostability of Thermus L5 proteins is achieved by an increase in hydrophobic interactions and/or by restriction of steric flexibility due to the introduction of amino acids with branched aliphatic side chains such as leucine. Alignment of the nine protein sequences equivalent to Thermus L5 proteins led to identification of a conserved internal segment, rich in acidic amino acids, which shows homology to subsequences of E coli L18 and L25. The occurrence of conserved sequence elements in 5S rRNA binding proteins and ribosomal proteins in general is discussed in terms of evolution and function.
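The "G/C in third positions of codons" statistic (often called GC3) can be computed from a coding sequence as in the Python sketch below. The function name `gc3` and the example sequence are hypothetical; the sketch assumes an in-frame coding sequence and counts every codon supplied, including a stop codon if one is present.

```python
def gc3(cds):
    """Fraction of codons whose third base is G or C.

    Assumption: cds is an in-frame coding sequence (length a multiple of 3).
    """
    cds = cds.upper()
    if not cds or len(cds) % 3 != 0:
        raise ValueError("coding sequence must be non-empty and a multiple of 3 bp")
    third_bases = cds[2::3]  # every third base, starting at codon position 3
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Hypothetical five-codon example: ATG GCA CTG AAA GAG -> third bases G, A, G, A, G.
print(gc3("ATGGCACTGAAAGAG"))
```

A value near 0.95, as reported for the Thermus L5 genes, means almost every codon ends in G or C.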
Hidalgo, A R; Akond, M A; Kita, K; Kataoka, M; Shimizu, S
2001-12-01
Two conjugated polyketone reductases (CPRs) were isolated from Candida parapsilosis IFO 0708. The primary structures of CPRs (C1 and C2) were analyzed by amino acid sequencing. The amino acid sequences of both enzymes had high similarity to those of several proteins of the aldo-keto-reductase (AKR) superfamily. However, several amino acid residues in the putative active sites of AKRs were not conserved in CPRs-C1 and -C2.
The complete amino acid sequence of human erythrocyte diphosphoglycerate mutase.
Haggarty, N W; Dunbar, B; Fothergill, L A
1983-01-01
The complete amino acid sequence of human erythrocyte diphosphoglycerate mutase, comprising 239 residues, was determined. The sequence was deduced from the four cyanogen bromide fragments, and from the peptides derived from these fragments after digestion with a number of proteolytic enzymes. Comparison of this sequence with that of the yeast glycolytic enzyme, phosphoglycerate mutase, shows that these enzymes are 47% identical. Most, but not all, of the residues implicated as being important for the activity of the glycolytic mutase are conserved in the erythrocyte diphosphoglycerate mutase. PMID:6313356
Carlow, Chevonne E; Faultless, J Trent; Lee, Christine; Siddiqua, Mahbuba; Edge, Alison; Nassuth, Annette
2017-09-01
The highly conserved CBF pathway is crucial in the regulation of plant responses to low temperatures. Extensive analysis of Arabidopsis CBF proteins revealed that their functions rely on several conserved amino acid domains although the exact function of each domain is disputed. The question was what functions similar domains have in CBFs from other, overwintering woody plants such as Vitis, which likely have a more involved regulation than the model plant Arabidopsis. A total of seven CBF genes were cloned and sequenced from V. riparia and the less frost tolerant V. vinifera. The deduced species-specific amino acid sequences differ in only a few amino acids, mostly in non-conserved regions. Amino acid sequence comparison and phylogenetic analysis showed two distinct groups of Vitis CBFs. One group contains CBF1, CBF2, CBF3 and CBF8 and the other group contains CBF4, CBF5 and CBF6. Transient transactivation assays showed that all Vitis CBFs except CBF5 activate via a CRT or DRE promoter element, whereby Vitis CBF3 and 4 prefer a CRT element. The hydrophobic domains in the C-terminal end of VrCBF6 were shown to be important for how well it activates. The putative nuclear localization domain of Vitis CBF1 was shown to be sufficient for nuclear localization, in contrast to previous reports for AtCBF1, and also important for transactivation. The latter highlights the value of careful analysis of domain functions instead of reliance on computer predictions and published data for other related proteins. Copyright © 2017 Elsevier Masson SAS. All rights reserved.
Use of conserved key amino acid positions to morph protein folds.
Reddy, Boojala V B; Li, Wilfred W; Bourne, Philip E
2002-07-15
By using three-dimensional (3D) structure alignments and a previously published method to determine Conserved Key Amino Acid Positions (CKAAPs) we propose a theoretical method to design mutations that can be used to morph the protein folds. The original Paracelsus challenge, met by several groups, called for the engineering of a stable but different structure by modifying less than 50% of the amino acid residues. We have used the sequences from the Protein Data Bank (PDB) identifiers 1ROP, and 2CRO, which were previously used in the Paracelsus challenge by those groups, and suggest mutation to CKAAPs to morph the protein fold. The total number of mutations suggested is less than 40% of the starting sequence theoretically improving the challenge results. From secondary structure prediction experiments of the proposed mutant sequence structures, we observe that each of the suggested mutant protein sequences likely folds to a different, non-native potentially stable target structure. These results are an early indicator that analyses using structure alignments leading to CKAAPs of a given structure are of value in protein engineering experiments. Copyright 2002 Wiley Periodicals, Inc.
Evolutionary and biophysical relationships among the papillomavirus E2 proteins.
Blakaj, Dukagjin M; Fernandez-Fuentes, Narcis; Chen, Zigui; Hegde, Rashmi; Fiser, Andras; Burk, Robert D; Brenowitz, Michael
2009-01-01
Infection by human papillomavirus (HPV) may result in clinical conditions ranging from benign warts to invasive cancer. The HPV E2 protein represses oncoprotein transcription and is required for viral replication. HPV E2 binds to palindromic DNA sequences of highly conserved four base pair sequences flanking an identical length variable 'spacer'. E2 proteins directly contact the conserved but not the spacer DNA. Variation in naturally occurring spacer sequences results in differential protein affinity that is dependent on their sensitivity to the spacer DNA's unique conformational and/or dynamic properties. This article explores the biophysical character of this core viral protein with the goal of identifying characteristics that are associated with risk of virally caused malignancy. The amino acid sequence, 3D structure and electrostatic features of the E2 protein DNA binding domain are highly conserved; specific interactions with DNA binding sites have also been conserved. In contrast, the E2 protein's transactivation domain does not have extensive surfaces of highly conserved residues. Rather, regions of high conservation are localized to small surface patches. Implications for cancer biology are discussed.
González, Carolina; Tabernero, David; Cortese, Maria Francesca; Gregori, Josep; Casillas, Rosario; Riveiro-Barciela, Mar; Godoy, Cristina; Sopena, Sara; Rando, Ariadna; Yll, Marçal; Lopez-Martinez, Rosa; Quer, Josep; Esteban, Rafael; Buti, Maria; Rodríguez-Frías, Francisco
2018-05-21
To detect hyper-conserved regions in the hepatitis B virus (HBV) X gene ( HBX ) 5' region that could be candidates for gene therapy. The study included 27 chronic hepatitis B treatment-naive patients in various clinical stages (from chronic infection to cirrhosis and hepatocellular carcinoma, both HBeAg-negative and HBeAg-positive), and infected with HBV genotypes A-F and H. In a serum sample from each patient with viremia > 3.5 log IU/mL, the HBX 5' end region [nucleotide (nt) 1255-1611] was PCR-amplified and submitted to next-generation sequencing (NGS). We assessed genotype variants by phylogenetic analysis, and evaluated conservation of this region by calculating the information content of each nucleotide position in a multiple alignment of all unique sequences (haplotypes) obtained by NGS. Conservation at the HBx protein amino acid (aa) level was also analyzed. NGS yielded 1333069 sequences from the 27 samples, with a median of 4578 sequences/sample (2487-9279, IQR 2817). In 14/27 patients (51.8%), phylogenetic analysis of viral nucleotide haplotypes showed a complex mixture of genotypic variants. Analysis of the information content in the haplotype multiple alignments detected 2 hyper-conserved nucleotide regions, one in the HBX upstream non-coding region (nt 1255-1286) and the other in the 5' end coding region (nt 1519-1603). This last region coded for a conserved amino acid region (aa 63-76) that partially overlaps a Kunitz-like domain. Two hyper-conserved regions detected in the HBX 5' end may be of value for targeted gene therapy, regardless of the patients' clinical stage or HBV genotype.
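The per-position information content mentioned above can be sketched as follows. This is an illustration only, assuming the usual definition for nucleotides (log2 of the alphabet size minus the Shannon entropy of the column, in bits), unweighted unique haplotypes, gap-free columns, and no small-sample correction; the abstract does not specify these details. The function names and toy haplotypes are hypothetical.

```python
import math
from collections import Counter


def information_content(column: str, alphabet_size: int = 4) -> float:
    """Information content of one alignment column, in bits.

    Assumed definition: log2(alphabet_size) - Shannon entropy.
    2.0 bits means a fully conserved nucleotide position; 0.0 means
    all four nucleotides are equally frequent.
    """
    counts = Counter(column)
    total = sum(counts.values())
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy


def per_position_information(haplotypes: list[str]) -> list[float]:
    """Information content for every column of equal-length aligned haplotypes."""
    if len({len(h) for h in haplotypes}) != 1:
        raise ValueError("haplotypes must be aligned to the same length")
    return [information_content("".join(col)) for col in zip(*haplotypes)]


# Toy alignment of four unique haplotypes (hypothetical, not HBV data).
toy_haplotypes = ["ACGT", "ACGA", "ACTC", "ACAG"]
for position, bits in enumerate(per_position_information(toy_haplotypes), start=1):
    print(position, round(bits, 2))
```

A run of consecutive positions at or near the maximum would correspond to what the abstract calls a hyper-conserved region; the threshold for "hyper-conserved" is not given in the abstract.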
Scop3D: three-dimensional visualization of sequence conservation.
Vermeire, Tessa; Vermaere, Stijn; Schepens, Bert; Saelens, Xavier; Van Gucht, Steven; Martens, Lennart; Vandermarliere, Elien
2015-04-01
The integration of a protein's structure with its known sequence variation provides insight on how that protein evolves, for instance in terms of (changing) function or immunogenicity. Yet, collating the corresponding sequence variants into a multiple sequence alignment, calculating each position's conservation, and mapping this information back onto a relevant structure is not straightforward. We therefore built the Sequence Conservation on Protein 3D structure (scop3D) tool to perform these tasks automatically. The output consists of two modified PDB files in which the B-values for each position are replaced by the percentage sequence conservation, or the information entropy for each position, respectively. Furthermore, text files with absolute and relative amino acid occurrences for each position are also provided, along with snapshots of the protein from six distinct directions in space. The visualization provided by scop3D can for instance be used as an aid in vaccine development or to identify antigenic hotspots, which we here demonstrate based on an analysis of the fusion proteins of human respiratory syncytial virus and mumps virus. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Erickson, Harold P.
2009-01-01
The eukaryotic cytoskeleton appears to have evolved from ancestral precursors related to prokaryotic FtsZ and MreB. FtsZ and MreB show 40-50% sequence identity across different bacterial and archaeal species. Here I suggest that this represents the limit of divergence that is consistent with maintaining their functions for cytokinesis and cell shape. Previous analyses have noted that tubulin and actin are highly conserved across eukaryotic species, but so divergent from their prokaryotic relatives as to be hardly recognizable from sequence comparisons. One suggestion for this extreme divergence of tubulin and actin is that it occurred as they evolved very different functions from FtsZ and MreB. I will present new arguments favoring this suggestion, and speculate on pathways. Moreover, the extreme conservation of tubulin and actin across eukaryotic species is not due to an intrinsic lack of variability, but is attributed to their acquisition of elaborate mechanisms for assembly dynamics and their interactions with multiple motor and binding proteins. A new structure-based sequence alignment identifies amino acids that are conserved from FtsZ to tubulins. The highly conserved amino acids are not those forming the subunit core or protofilament interface, but those involved in binding and hydrolysis of GTP. PMID:17563102
Siche, Stefanie; Brett, Katharina; Möller, Lars; Kordyukova, Larisa V.; Mintaev, Ramil R.; Alexeevski, Andrei V.; Veit, Michael
2015-01-01
Recruitment of the matrix protein M1 to the assembly site of the influenza virus is thought to be mediated by interactions with the cytoplasmic tail of hemagglutinin (HA). Based on a comprehensive sequence comparison of all sequences present in the database, we analyzed the effect of mutating conserved residues in the cytosol-facing part of the transmembrane region and cytoplasmic tail of HA (A/WSN/33 (H1N1) strain) on virus replication and morphology of virions. Removal of the two cytoplasmic acylation sites and substitution of a neighboring isoleucine by glutamine prevented rescue of infectious virions. In contrast, a conservative exchange of the same isoleucine, non-conservative exchanges of glycine and glutamine, deletion of the acylation site at the end of the transmembrane region and shifting it into the tail did not affect virus morphology and had only subtle effects on virus growth and on the incorporation of M1 and Ribo-Nucleoprotein Particles (RNPs). Thus, assuming that essential amino acids are conserved between HA subtypes we suggest that, besides the two cytoplasmic acylation sites (including adjacent hydrophobic residues), no other amino acids in the cytoplasmic tail of HA are indispensable for virus assembly and budding. PMID:26670246
Nirasawa, Satoru; Nakahara, Kazuhiko; Takahashi, Saori
2018-02-27
Paenidase is the first microorganism-derived D-aspartyl endopeptidase that specifically recognizes an internal D-Asp residue to cleave [D-Asp]-X peptide bonds. Using peptide sequences obtained from the protein, we performed PCR with degenerate primers to amplify the paenidase I-encoding gene. Nucleotide sequencing revealed that mature paenidase I consists of 322 amino acid residues and that the protein is encoded as a pro-protein with a 197-amino-acid N-terminal extension compared to the mature protein. Paenidase I exhibits amino acid sequence similarity to several penicillin-binding proteins. In addition, paenidase I was classified into peptidase family S12 based on a MEROPS database search. Family S12 contains serine-type D-Ala-D-Ala carboxypeptidases that have three active site residues (Ser, Lys, and Tyr) in the conserved motifs Ser-Xaa-Thr-Lys and Tyr-Xaa-Asn. These motifs were conserved in the primary structure of paenidase I, and the role of these residues was confirmed by site-directed mutagenesis.
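The two family S12 motifs named above, Ser-Xaa-Thr-Lys and Tyr-Xaa-Asn, can be located in a one-letter protein sequence with a simple pattern scan. The sketch below is illustrative only: the regular expressions, function name, and toy sequence are assumptions, and a bare pattern match says nothing about whether a hit is a genuine active-site motif.

```python
import re

# One-letter patterns for the motifs named in the abstract; Xaa is any residue.
MOTIFS = {
    "Ser-Xaa-Thr-Lys": "S[A-Z]TK",
    "Tyr-Xaa-Asn": "Y[A-Z]N",
}


def find_motifs(sequence: str) -> dict[str, list[int]]:
    """Return 1-based start positions of each motif in the sequence.

    A lookahead is used so that overlapping occurrences are all reported.
    """
    hits = {}
    for name, pattern in MOTIFS.items():
        hits[name] = [m.start() + 1 for m in re.finditer(f"(?={pattern})", sequence)]
    return hits


# Toy sequence (hypothetical, not paenidase I).
toy_sequence = "MKLSVTKAAGYANLL"
print(find_motifs(toy_sequence))
```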
Hughes, Austin L.
2015-01-01
Members of the aminopeptidase N (APN) gene family of the insect order Lepidoptera (moths and butterflies) bind the naturally insecticidal Cry toxins produced by the bacterium Bacillus thuringiensis. Phylogenetic analysis of amino acid sequences of seven lepidopteran APN classes provided strong support for the hypothesis that lepidopteran APN2 class arose by gene duplication prior to the most recent common ancestor of Lepidoptera and Diptera. The Cry toxin-binding region (BR) of lepidopteran and dipteran APNs was subject to stronger purifying selection within APN classes than was the remainder of the molecule, reflecting conservation of catalytic site and adjoining residues within the BR. Of lepidopteran APN classes, APN2, APN6, and APN8 showed the strongest evidence of functional specialization, both in expression patterns and in the occurrence of conserved derived amino acid residues. The latter three APN classes also shared a convergently evolved conserved residue close to the catalytic site. APN8 showed a particularly strong tendency towards class-specific conserved residues, including one of the catalytic site residues in the BR and ten others in close vicinity to the catalytic site residues. The occurrence of class-specific sequences along with the conservation of enzymatic function is consistent with the hypothesis that the presence of Cry toxins in the environment has been a factor shaping the evolution of this multi-gene family. PMID:24675701
Production of hydroxylated fatty acids in genetically modified plants
Somerville, Chris; Broun, Pierre; van de Loo, Frank
2001-01-01
This invention relates to plant fatty acyl hydroxylases. Methods to use conserved amino acid or nucleotide sequences to obtain plant fatty acyl hydroxylases are described. Also described is the use of cDNA clones encoding a plant hydroxylase to produce a family of hydroxylated fatty acids in transgenic plants.
Nishizawa, M; Nishizawa, K
2000-10-01
The tendency for repetitiveness of nucleotides in DNA sequences has been reported for a variety of organisms. We show that the tendency for repetitive use of amino acids is widespread and is observed even for segments conserved between human and Drosophila melanogaster at the level of >50% amino acid identity. This indicates that repetitiveness influences not only the weakly constrained segments but also those sequence segments conserved among phyla. Not only glutamine (Q) but also many of the 20 amino acids show a comparable level of repetitiveness. Repetitiveness in bases at codon position 3 is stronger for human than for D.melanogaster, whereas local repetitiveness in intron sequences is similar between the two organisms. While genes for immune system-specific proteins, but not ancient human genes (i.e. human homologs of Escherichia coli genes), have repetitiveness at codon bases 1 and 2, repetitiveness at codon base 3 for these groups is similar, suggesting that the human genome has at least two mechanisms generating local repetitiveness. Neither amino acid nor nucleotide repetitiveness is observed beyond the exon boundary, denying the possibility that such repetitiveness could mainly stem from natural selection on mRNA or protein sequences. Analyses of mammalian sequence alignments show that while the 'between gene' GC content heterogeneity, which is linked to 'isochores', is a principal factor associated with the bias in substitution patterns in human, 'within gene' heterogeneity in nucleotide composition is also associated with such bias on a more local scale. The relationship amongst the various types of repetitiveness is discussed.
Nishizawa, Manami; Nishizawa, Kazuhisa
2000-01-01
The tendency for repetitiveness of nucleotides in DNA sequences has been reported for a variety of organisms. We show that the tendency for repetitive use of amino acids is widespread and is observed even for segments conserved between human and Drosophila melanogaster at the level of >50% amino acid identity. This indicates that repetitiveness influences not only the weakly constrained segments but also those sequence segments conserved among phyla. Not only glutamine (Q) but also many of the 20 amino acids show a comparable level of repetitiveness. Repetitiveness in bases at codon position 3 is stronger for human than for D.melanogaster, whereas local repetitiveness in intron sequences is similar between the two organisms. While genes for immune system-specific proteins, but not ancient human genes (i.e. human homologs of Escherichia coli genes), have repetitiveness at codon bases 1 and 2, repetitiveness at codon base 3 for these groups is similar, suggesting that the human genome has at least two mechanisms generating local repetitiveness. Neither amino acid nor nucleotide repetitiveness is observed beyond the exon boundary, denying the possibility that such repetitiveness could mainly stem from natural selection on mRNA or protein sequences. Analyses of mammalian sequence alignments show that while the ‘between gene’ GC content heterogeneity, which is linked to ‘isochores’, is a principal factor associated with the bias in substitution patterns in human, ‘within gene’ heterogeneity in nucleotide composition is also associated with such bias on a more local scale. The relationship amongst the various types of repetitiveness is discussed. PMID:11000273
Conserved Sequence Preferences Contribute to Substrate Recognition by the Proteasome
Yu, Houqing; Singh Gautam, Amit K.; Wilmington, Shameika R.; Wylie, Dennis; Martinez-Fonts, Kirby; Kago, Grace; Warburton, Marie; Chavali, Sreenivas; Inobe, Tomonao; Finkelstein, Ilya J.; Babu, M. Madan
2016-01-01
The proteasome has pronounced preferences for the amino acid sequence of its substrates at the site where it initiates degradation. Here, we report that modulating these sequences can tune the steady-state abundance of proteins over 2 orders of magnitude in cells. This is the same dynamic range as seen for inducing ubiquitination through a classic N-end rule degron. The stability and abundance of His3 constructs dictated by the initiation site affect survival of yeast cells and show that variation in proteasomal initiation can affect fitness. The proteasome's sequence preferences are linked directly to the affinity of the initiation sites to their receptor on the proteasome and are conserved between Saccharomyces cerevisiae, Schizosaccharomyces pombe, and human cells. These findings establish that the sequence composition of unstructured initiation sites influences protein abundance in vivo in an evolutionarily conserved manner and can affect phenotype and fitness. PMID:27226608
Callahan, Courtney; Fox, Karen; Fox, Alvin
2009-01-01
The Bacillus cereus group includes Bacillus anthracis, Bacillus cereus, Bacillus thuringiensis, Bacillus mycoides and Bacillus weihenstephanensis. The small acid-soluble spore protein (SASP) β has been previously demonstrated to be among the biomarkers differentiating B. anthracis and B. cereus; SASP β of B. cereus most commonly exhibits one or two amino acid substitutions when compared to B. anthracis. SASP α is conserved in sequence among these two species. Neither SASP α nor β for B. thuringiensis, B. mycoides and B. weihenstephanensis has been previously characterized as a taxonomic discriminator. In the current work, molecular weight (MW) variation of these SASPs was determined by matrix assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI TOF MS) for representative strains of the 5 species within the B. cereus group. The measured MWs also correlate with calculated MWs of translated amino acid sequences generated from whole genome sequencing projects. SASP α and β demonstrated consistent MW among B. cereus, B. thuringiensis, and B. mycoides strains (group 1). However, B. mycoides (group 2) and B. weihenstephanensis SASP α and β were quite distinct, making them unique among the B. cereus group. Limited sequence changes were observed in SASP α (at most 3 substitutions and 2 deletions), indicating it is a more conserved protein than SASP β (up to 6 substitutions and a deletion). Another even more conserved SASP, SASP α-β type, was described here for the first time. PMID:19616612
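The comparison of measured MWs with MWs calculated from translated sequences rests on summing residue masses. A rough sketch of that calculation is given below, assuming standard average residue masses (rounded, in daltons) plus one water per chain and no post-translational processing; the abstract does not describe how the authors computed their values. The function name and toy peptides are hypothetical.

```python
# Approximate average residue masses in daltons (standard reference values, rounded).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.0153  # one water is added for the free N- and C-termini


def average_molecular_weight(sequence: str) -> float:
    """Average molecular weight of an unmodified, neutral peptide chain."""
    try:
        return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS
    except KeyError as err:
        raise ValueError(f"unknown residue: {err.args[0]}") from None


# A single substitution changes the calculated mass, which is what makes
# mass differences usable as a discriminator (toy peptides, hypothetical).
reference = "GASGA"
variant = "GASGS"  # Ala -> Ser at the last position
shift = average_molecular_weight(variant) - average_molecular_weight(reference)
print(f"mass shift: {shift:.2f} Da")
```

Substitutions between residues of equal mass (for example Leu and Ile) would not be visible this way, which is one limit of a mass-only comparison.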
Cho, Young Sun; Choi, Buyl Nim; Ha, En-Mi; Kim, Ki Hong; Kim, Sung Koo; Kim, Dong Soo; Nam, Yoon Kwon
2005-01-01
Novel metallothionein (MT) complementary DNA and genomic sequences were isolated from a cartilaginous shark species, Scyliorhinus torazame. The full-length open reading frame (ORF) of shark MT cDNA encoded 68 amino acids with a high cysteine content (29%). The genomic ORF sequence (932 bp) of shark MT isolated by polymerase chain reaction (PCR) comprised 3 exons with 2 intervening introns. Shark MT sequence shared many conserved features with other vertebrate MTs: overall amino acid identities of shark MT ranged from 47% to 57% with fish MTs, and 41% to 62% with mammalian MTs. However, in addition to these conserved characteristics, shark MT sequence exhibited some unique characteristics. It contained 4 extra amino acids (Lys-Ala-Gly-Arg) at the end of the beta-domain, which have not been reported in any other vertebrate MTs. The last amino acid residue at the C-terminus was Ser, which also has not been reported in fish and mammalian MTs. The MT messenger RNA levels in shark liver and kidney, assessed by semiquantitative reverse transcriptase PCR and RNA blot hybridization, were significantly affected by experimental exposures to heavy metals (cadmium, copper, and zinc). Generally, the transcriptional activation of shark MT gene was dependent on the dose (0-10 mg/kg body weight for injection and 0-20 microM for immersion) and duration (1-10 days); zinc was a more potent inducer than copper and cadmium.
Strickland, Michelle; Tudorica, Victor; Řezáč, Milan; Thomas, Neil R; Goodacre, Sara L
2018-06-01
Spiders produce multiple silks with different physical properties that allow them to occupy a diverse range of ecological niches, including the underwater environment. Despite this functional diversity, past molecular analyses show a high degree of amino acid sequence similarity between C-terminal regions of silk genes that appear to be independent of the physical properties of the resulting silks; instead, this domain is crucial to the formation of silk fibers. Here, we present an analysis of the C-terminal domain of all known types of spider silk and include silk sequences from the spider Argyroneta aquatica, which spins the majority of its silk underwater. Our work indicates that spiders have retained a highly conserved mechanism of silk assembly, despite the extraordinary diversification of species, silk types and applications of silk over 350 million years. Sequence analysis of the silk C-terminal domain across the entire gene family shows the conservation of two uncommon amino acids that are implicated in the formation of a salt bridge, a functional bond essential to protein assembly. This conservation extends to the novel sequences isolated from A. aquatica. This finding is relevant to research regarding the artificial synthesis of spider silk, suggesting that synthesis of all silk types will be possible using a single process.
Bjorklund, H.V.; Higman, K.H.; Kurath, G.
1996-01-01
The nucleotide sequences of the glycoprotein genes and all of the internal gene junctions of the fish pathogenic rhabdoviruses spring viremia of carp virus (SVCV) and hirame rhabdovirus (HIRRV) have been determined from cDNA clones generated from viral genomic RNA. The SVCV glycoprotein gene sequence is 1588 nucleotides (nt) long and encodes a 509 amino acid (aa) protein. The HIRRV glycoprotein gene sequence comprises 1612 nt, coding for a 508 aa protein. In sequence comparisons of 15 rhabdovirus glycoproteins, the SVCV glycoprotein gene showed the highest amino acid sequence identity (31.2–33.2%) with vesicular stomatitis New Jersey virus (VSNJV), Chandipura virus (CHPV) and vesicular stomatitis Indiana virus (VSIV). The HIRRV glycoprotein gene showed a very high amino acid sequence identity (74.3%) with the glycoprotein gene of another fish pathogenic rhabdovirus, infectious hematopoietic necrosis virus (IHNV), but no significant similarity with glycoproteins of VSIV or rabies virus (RABV). In phylogenetic analyses SVCV was grouped consistently with VSIV, VSNJV and CHPV in the Vesiculovirus genus of Rhabdoviridae. The fish rhabdoviruses HIRRV, IHNV and viral hemorrhagic septicemia virus (VHSV) showed close relationships with each other, but only very distant relationships with mammalian rhabdoviruses. The gene junctions are highly conserved between SVCV and VSIV, well conserved between IHNV and HIRRV, but not conserved between HIRRV/IHNV and RABV. Based on the combined results we suggest that the fish lyssa-type rhabdoviruses HIRRV, IHNV and VHSV may be grouped in their own genus within the family Rhabdoviridae. Aquarhabdovirus has been proposed for the name of this new genus.
USDA-ARS's Scientific Manuscript database
In order to identify amino acid residues crucial for the enzymatic activity of Δ8-sphingolipid desaturases, a sequence comparison was performed among Δ8-sphingolipid desaturases and Δ6-fatty acid desaturase from various plants. In addition to the known conserved cytb5 (cytochrome b5) HPGG motif and...
Coiled-coil length: Size does matter.
Surkont, Jaroslaw; Diekmann, Yoan; Ryder, Pearl V; Pereira-Leal, Jose B
2015-12-01
Protein evolution is governed by processes that alter primary sequence but also the length of proteins. Protein length may change in different ways, but insertions, deletions and duplications are the most common. An optimal protein size is a trade-off between sequence extension, which may change protein stability or lead to acquisition of a new function, and shrinkage that decreases metabolic cost of protein synthesis. Despite the general tendency for length conservation across orthologous proteins, the propensity to accept insertions and deletions is heterogeneous along the sequence. For example, protein regions rich in repetitive peptide motifs are well known to extensively vary their length across species. Here, we analyze length conservation of coiled-coils, domains formed by a ubiquitous, repetitive peptide motif present in all domains of life, that frequently plays a structural role in the cell. We observed that, despite the repetitive nature, the length of coiled-coil domains is generally highly conserved throughout the tree of life, even when the remaining parts of the protein change, including globular domains. Length conservation is independent of primary amino acid sequence variation, and represents a conservation of domain physical size. This suggests that the conservation of domain size is due to functional constraints. © 2015 Wiley Periodicals, Inc.
Janecek, S; Baláz, S
1995-08-01
Twelve different (alpha/beta)8-barrel enzymes belonging to three structurally distinct families were found to contain, near the C-terminus of their strand beta 5, a conserved invariant glutamic acid residue that plays an important functional role in each of these enzymes. The search was based on the idea that a conserved sequence region of an (alpha/beta)8-barrel enzyme should be more or less conserved also in the equivalent part of the structure of the other enzymes with this folding motif owing to their mutual evolutionary relatedness. For this purpose, the sequence region around the well conserved fifth beta-strand of alpha-amylase containing catalytic glutamate (Glu230, Aspergillus oryzae alpha-amylase numbering), was used as the sequence-structural template. The isolated sequence stretches of the 12 (alpha/beta)8-barrels are discussed from both the sequence-structural and the evolutionary point of view, the invariant glutamate residue being proposed to be a joining feature of the studied group of enzymes remaining from their ancestral (alpha/beta)8-barrel.
Arndt, E; Scholzen, T; Krömer, W; Hatakeyama, T; Kimura, M
1991-06-01
Approximately 40 ribosomal proteins from each of Halobacterium marismortui and Bacillus stearothermophilus have been sequenced, either by direct protein sequence analysis or by DNA sequence analysis of the appropriate genes. The comparison of the amino acid sequences from the archaebacterium H. marismortui with the available ribosomal proteins from the eubacterial and eukaryotic kingdoms revealed four different groups of proteins: 24 proteins are related to both eubacterial and eukaryotic proteins; eleven proteins are exclusively related to eukaryotic counterparts; for three proteins only eubacterial relatives could be found; and for another three proteins no counterpart could be found. The similarities of the halobacterial ribosomal proteins are in general somewhat higher to their eukaryotic than to their eubacterial counterparts. The comparison of B. stearothermophilus proteins with their E. coli homologues showed that the proteins evolved at different rates. Some proteins are highly conserved with 64-76% identity, others are poorly conserved with only 25-34% identical amino acid residues.
RNA Editing in Plant Mitochondria
NASA Astrophysics Data System (ADS)
Hiesel, Rudolf; Wissinger, Bernd; Schuster, Wolfgang; Brennicke, Axel
1989-12-01
Comparative sequence analysis of genomic and complementary DNA clones from several mitochondrial genes in the higher plant Oenothera revealed nucleotide sequence divergences between the genomic and the messenger RNA-derived sequences. These sequence alterations could be most easily explained by specific post-transcriptional nucleotide modifications. Most of the nucleotide exchanges in coding regions lead to altered codons in the mRNA that specify amino acids better conserved in evolution than those encoded by the genomic DNA. Several instances show that the genomic arginine codon CGG is edited in the mRNA to the tryptophan codon TGG in amino acid positions that are highly conserved as tryptophan in the homologous proteins of other species. This editing suggests that the standard genetic code is used in plant mitochondria and resolves the frequent coincidence of CGG codons and tryptophan in different plant species. The apparently frequent and non-species-specific equivalency of CGG and TGG codons in particular suggests that RNA editing is a common feature of all higher plant mitochondria.
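The CGG-to-TGG change described above can be made concrete with a few lines of code. The sketch below works in the DNA alphabet, as the abstract does (the mRNA-derived cDNA reads T where the RNA carries U), and assumes the edit is a single substitution at the first codon position. The two-entry codon table is taken from the standard genetic code; the function name is hypothetical.

```python
# Two entries from the standard genetic code, written in the DNA alphabet.
CODON_TO_AMINO_ACID = {"CGG": "Arg", "TGG": "Trp"}


def edit_c_to_t(codon: str, position: int) -> str:
    """Return the codon with the C at `position` (0-based) replaced by T.

    T stands for U in the edited mRNA, as read from cDNA.
    """
    if codon[position] != "C":
        raise ValueError("only a C can be edited in this sketch")
    return codon[:position] + "T" + codon[position + 1:]


genomic_codon = "CGG"
edited_codon = edit_c_to_t(genomic_codon, 0)
print(
    f"{genomic_codon} ({CODON_TO_AMINO_ACID[genomic_codon]}) -> "
    f"{edited_codon} ({CODON_TO_AMINO_ACID[edited_codon]})"
)
```

Read this way, a genomic CGG at a conserved tryptophan position needs no special codon assignment: after editing, the standard code already gives tryptophan.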
Mathupala, S P; Lowe, S E; Podkovyrov, S M; Zeikus, J G
1993-08-05
The complete nucleotide sequence of the gene encoding the dual active amylopullulanase of Thermoanaerobacter ethanolicus 39E (formerly Clostridium thermohydrosulfuricum) was determined. The structural gene (apu) contained a single open reading frame 4443 base pairs in length, corresponding to 1481 amino acids, with an estimated molecular weight of 162,780. Analysis of the deduced sequence of apu with sequences of alpha-amylases and alpha-1,6 debranching enzymes enabled the identification of four conserved regions putatively involved in substrate binding and in catalysis. The conserved regions were localized within a 2.9-kilobase pair gene fragment, which encoded a M(r) 100,000 protein that maintained the dual activities and thermostability of the native enzyme. The catalytic residues of amylopullulanase were tentatively identified by using hydrophobic cluster analysis for comparison of amino acid sequences of amylopullulanase and other amylolytic enzymes. Asp597, Glu626, and Asp703 were individually modified to their respective amide form, or the alternate acid form, and in all cases both alpha-amylase and pullulanase activities were lost, suggesting the possible involvement of 3 residues in a catalytic triad, and the presence of a putative single catalytic site within the enzyme. These findings substantiate amylopullulanase as a new type of amylosaccharidase.
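The sizes quoted above are mutually consistent, as a short check shows. This is arithmetic on the abstract's own numbers only; whether the 4443 bp figure includes the stop codon is not stated, so the sketch simply divides by three.

```python
# Figures taken from the abstract.
orf_length_bp = 4443
amino_acids = 1481
estimated_mw_da = 162_780

# An open reading frame is read in triplets, so its length should be a multiple of 3.
codons, remainder = divmod(orf_length_bp, 3)
print(codons, remainder)

# Mean mass per residue implied by the estimated molecular weight.
mean_residue_mass = estimated_mw_da / amino_acids
print(round(mean_residue_mass, 1))
```

The implied mean residue mass of roughly 110 Da is the figure commonly used as a rule of thumb for converting protein length to molecular weight.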
Defining Electron Bifurcation in the Electron-Transferring Flavoprotein Family.
Garcia Costas, Amaya M; Poudel, Saroj; Miller, Anne-Frances; Schut, Gerrit J; Ledbetter, Rhesa N; Fixen, Kathryn R; Seefeldt, Lance C; Adams, Michael W W; Harwood, Caroline S; Boyd, Eric S; Peters, John W
2017-11-01
Electron bifurcation is the coupling of exergonic and endergonic redox reactions to simultaneously generate (or utilize) low- and high-potential electrons. It is the third recognized form of energy conservation in biology and was recently described for select electron-transferring flavoproteins (Etfs). Etfs are flavin-containing heterodimers best known for donating electrons derived from fatty acid and amino acid oxidation to an electron transfer respiratory chain via Etf-quinone oxidoreductase. Canonical examples contain a flavin adenine dinucleotide (FAD) that is involved in electron transfer, as well as a non-redox-active AMP. However, Etfs demonstrated to bifurcate electrons contain a second FAD in place of the AMP. To expand our understanding of the functional variety and metabolic significance of Etfs and to identify amino acid sequence motifs that potentially enable electron bifurcation, we compiled 1,314 Etf protein sequences from genome sequence databases and subjected them to informatic and structural analyses. Etfs were identified in diverse archaea and bacteria, and they clustered into five distinct well-supported groups, based on their amino acid sequences. Gene neighborhood analyses indicated that these Etf group designations largely correspond to putative differences in functionality. Etfs with the demonstrated ability to bifurcate were found to form one group, suggesting that distinct conserved amino acid sequence motifs enable this capability. Indeed, structural modeling and sequence alignments revealed that identifying residues occur in the NADH- and FAD-binding regions of bifurcating Etfs. Collectively, a new classification scheme for Etf proteins that delineates putative bifurcating versus nonbifurcating members is presented and suggests that Etf-mediated bifurcation is associated with surprisingly diverse enzymes. 
IMPORTANCE Electron bifurcation has recently been recognized as an electron transfer mechanism used by microorganisms to maximize energy conservation. Bifurcating enzymes couple thermodynamically unfavorable reactions with thermodynamically favorable reactions in an overall spontaneous process. Here we show that the electron-transferring flavoprotein (Etf) enzyme family exhibits far greater diversity than previously recognized, and we provide a phylogenetic analysis that clearly delineates bifurcating versus nonbifurcating members of this family. Structural modeling of proteins within these groups reveals key differences between the bifurcating and nonbifurcating Etfs. Copyright © 2017 American Society for Microbiology.
Structure, synthesis, and molecular cloning of dermaseptins B, a family of skin peptide antibiotics.
Charpentier, S; Amiche, M; Mester, J; Vouille, V; Le Caer, J P; Nicolas, P; Delfour, A
1998-06-12
Analysis of antimicrobial activities that are present in the skin secretions of the South American frog Phyllomedusa bicolor revealed six polycationic (lysine-rich) and amphipathic alpha-helical peptides, 24-33 residues long, termed dermaseptins B1 to B6, respectively. Prepro-dermaseptins B all contain an almost identical signal peptide, which is followed by a conserved acidic propiece, a processing signal Lys-Arg, and a dermaseptin progenitor sequence. The 22-residue signal peptide plus the first 3 residues of the acidic propiece are encoded by conserved nucleotides encompassed by the first coding exon of the dermaseptin genes. The 25-residue amino-terminal region of prepro-dermaseptins B shares 50% identity with the corresponding region of precursors for D-amino acid containing opioid peptides or for antimicrobial peptides originating from the skin of distantly related frog species. The remarkable similarity found between prepro-proteins that encode end products with strikingly different sequences, conformations, biological activities and modes of action suggests that the corresponding genes have evolved through dissemination of a conserved "secretory cassette" exon.
Dong, Zheng; Zhou, Hongyu; Tao, Peng
2018-02-01
PAS domains are widespread in archaea, bacteria, and eukaryota, and play important roles in various functions. In this study, we aim to explore functional evolutionary relationships among proteins in the PAS domain superfamily in view of the sequence-structure-dynamics-function relationship. We collected protein sequences and crystal structure data from the RCSB Protein Data Bank for members of the PAS domain superfamily belonging to three biological functions (nucleotide binding, photoreceptor activity, and transferase activity). Protein sequences were aligned and then used to select sequence-conserved residues and build a phylogenetic tree. Three-dimensional structure alignment was also applied to obtain structure-conserved residues. The protein dynamics were analyzed using an elastic network model (ENM) and validated by molecular dynamics (MD) simulation. The results showed that proteins with the same function could be grouped by sequence similarity, and proteins in different functional groups displayed statistically significant differences in their vibrational patterns. Interestingly, in all three functional groups, conserved amino acid residues identified by sequence and structure conservation analysis generally have a lower fluctuation than other residues. In addition, the fluctuation of conserved residues in each biological function group was strongly correlated with the corresponding biological function. This research suggested a direct connection in which the protein sequences were related to various functions through structural dynamics. This is a new attempt to delineate functional evolution of proteins using the integrated information of sequence, structure, and dynamics. © 2017 The Protein Society.
Knowles, D P; Cheevers, W P; McGuire, T C; Brassfield, A L; Harwood, W G; Stem, T A
1991-11-01
To define the structure of the caprine arthritis-encephalitis virus (CAEV) env gene and characterize genetic changes which occur during antigenic variation, we sequenced the env genes of CAEV-63 and CAEV-Co, two antigenic variants of CAEV defined by serum neutralization. The deduced primary translation product of the CAEV env gene consists of a 60- to 80-amino-acid signal peptide followed by an amino-terminal surface protein (SU) and a carboxy-terminal transmembrane protein (TM) separated by an Arg-Lys-Lys-Arg cleavage site. The signal peptide cleavage site was verified by amino-terminal amino acid sequencing of native CAEV-63 SU. In addition, immunoprecipitation of [35S]methionine-labeled CAEV-63 proteins by sera from goats immunized with recombinant vaccinia virus expressing the CAEV-63 env gene confirmed that antibodies induced by env-encoded recombinant proteins react specifically with native virion SU and TM. The env genes of CAEV-63 and CAEV-Co encode 28 conserved cysteines and 25 conserved potential N-linked glycosylation sites. Nucleotide sequence variability results in 62 amino acid changes and one deletion within the SU and 34 amino acid changes within the TM.
Thomsen, Martin Christen Frølund; Nielsen, Morten
2012-01-01
Seq2Logo is a web-based sequence logo generator. Sequence logos are a graphical representation of the information content stored in a multiple sequence alignment (MSA) and provide a compact and highly intuitive representation of the position-specific amino acid composition of binding motifs, active sites, etc. in biological sequences. Accurate generation of sequence logos is often compromised by sequence redundancy and a low number of observations. Moreover, most methods available for sequence logo generation focus on displaying the position-specific enrichment of amino acids, discarding the equally valuable information related to amino acid depletion. Seq2Logo aims to resolve these issues by allowing the user to include sequence weighting to correct for data redundancy, pseudo counts to correct for a low number of observations, and different logotype representations, each capturing different aspects related to amino acid enrichment and depletion. Besides allowing input in the format of peptides and MSA, Seq2Logo accepts input as Blast sequence profiles, providing easy access for non-expert end-users to characterize and identify functionally conserved/variable amino acids in any given protein of interest. The output from the server is a sequence logo and a PSSM. Seq2Logo is available at http://www.cbs.dtu.dk/biotools/Seq2Logo (14 May 2012, date last accessed). PMID:22638583
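The information content that a sequence logo displays can be illustrated with a minimal sketch. This is not Seq2Logo's own procedure: it assumes a uniform pseudocount added to every amino acid and applies no sequence weighting, and the function names `column_information` and `alignment_information` are hypothetical. Each column's information is taken as log2(20) minus its Shannon entropy, in bits.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_information(column, pseudocount=1.0):
    """Information content (bits) of one alignment column.

    Assumption: a uniform pseudocount per amino acid; gaps and
    non-standard symbols are ignored.
    """
    counts = Counter(aa for aa in column if aa in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    if total == 0:
        return 0.0  # nothing observed and no pseudocounts
    entropy = 0.0
    for aa in AMINO_ACIDS:
        p = (counts[aa] + pseudocount) / total
        if p > 0:
            entropy -= p * math.log2(p)
    return math.log2(len(AMINO_ACIDS)) - entropy


def alignment_information(sequences, pseudocount=1.0):
    """Per-position information content for equal-length aligned sequences."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all aligned sequences must have the same length")
    return [
        column_information([s[i] for s in sequences], pseudocount)
        for i in range(length)
    ]
```

With few sequences, the pseudocount pulls every column toward the uniform distribution, so even a perfectly conserved column scores below log2(20) bits. This is the low-observation correction in its simplest form.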
Computational mining for hypothetical patterns of amino acid side chains in protein data bank (PDB)
NASA Astrophysics Data System (ADS)
Ghani, Nur Syatila Ab; Firdaus-Raih, Mohd
2018-04-01
The three-dimensional structure of a protein can provide insights regarding its function. Functional relationships between proteins can be inferred from fold and sequence similarities. In certain cases, sequence or fold comparison fails to establish homology between proteins with similar mechanisms. Since structure is more conserved than sequence, a constellation of functional residues can be similarly arranged among proteins of similar mechanism. Local structural similarity searches are able to detect such constellations of amino acids among distinct proteins, which can be useful for annotating proteins of unknown function. Detection of such patterns of amino acids on a large scale can increase the repertoire of important 3D motifs, since the currently known 3D motifs cannot keep pace with the ever-increasing number of uncharacterized proteins to be annotated. Here, a computational platform for automated detection of 3D motifs is described. A fuzzy-pattern searching algorithm derived from IMagine an Amino Acid 3D Arrangement search EnGINE (IMAAAGINE) was implemented to develop an automated method for searching for hypothetical patterns of amino acid side chains in the Protein Data Bank (PDB), without the need for prior knowledge of the sequence or structure related to the pattern of interest. We present an example search: the detection of a hypothetical pattern derived from the known C2H2 structural motif of zinc fingers. The conservation of particular patterns of amino acid side chains in unrelated proteins is highlighted. This approach can complement available structure- and sequence-based platforms and may contribute to improving functional association between proteins.
Cao, Heping
2015-09-01
Trees contribute to enormous plant oil reserves because many trees contain 50%-80% of oil (triacylglycerols, TAGs) in the fruits and kernels. TAGs accumulate in subcellular structures called oil bodies/droplets, in which TAGs are covered by low-molecular-mass hydrophobic proteins called oleosins (OLEs). The OLEs/TAGs ratio determines the size and shape of intracellular oil bodies. There is a lack of comprehensive sequence analysis and structural information of OLEs among diverse trees. The objectives of this study were to identify OLEs from 22 tree species (e.g., tung tree, tea-oil tree, castor bean), perform genome-wide analysis of OLEs, classify OLEs, identify conserved sequence motifs and amino acid residues, and predict secondary and three-dimensional structures in tree OLEs and OLE subfamilies. Data mining identified 65 OLEs with perfect conservation of the "proline knot" motif (PX5SPX3P) from 19 trees. These OLEs contained >40% hydrophobic amino acid residues. They displayed similar properties and amino acid composition. Genome-wide phylogenetic analysis and multiple sequence alignment demonstrated that these proteins could be classified into five OLE subfamilies. There were distinct patterns of sequence conservation among the OLE subfamilies and within individual tree species. Computational modeling indicated that OLEs were composed of at least three α-helices connected with short coils without any β-strand and that they exhibited distinct 3D structures and ligand binding sites. These analyses provide fundamental information on the similarity and specificity of diverse OLE isoforms within the same subfamily and among the different species, which should facilitate studying the structure-function relationship and identifying critical amino acid residues in OLEs for metabolic engineering of tree TAGs.
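The "proline knot" motif PX5SPX3P can be read as a simple pattern: proline, any five residues, serine, proline, any three residues, proline (12 residues in total). The sketch below shows one way such a motif might be screened for in a protein sequence. It is illustrative only and not the data-mining procedure used in the study; the function name `find_proline_knots` and the test sequence are hypothetical.

```python
import re

# PX5SPX3P: Pro, any 5 residues, Ser, Pro, any 3 residues, Pro.
# The lookahead lets overlapping occurrences be reported as well.
PROLINE_KNOT = re.compile(r"(?=(P[A-Z]{5}SP[A-Z]{3}P))")


def find_proline_knots(sequence):
    """Return (0-based start, matched 12-mer) for each motif occurrence."""
    return [
        (match.start(), match.group(1))
        for match in PROLINE_KNOT.finditer(sequence.upper())
    ]


# Synthetic example sequence (not a real oleosin).
print(find_proline_knots("MAAAPLLLLLSPVVVPGGG"))
```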
Müller, M; Schnitzler, P; Koonin, E V; Darai, G
1995-05-01
Cytoplasmic DNA viruses encode a DNA-dependent RNA polymerase (DdRP) that is essential for transcription of viral genes. The amino acid sequences of the known largest subunits of DdRPs from different species contain highly conserved regions. Oligonucleotide primers, deduced from two conserved domains (RQP[T/S]LH and NADFDGDE) were used for detecting the corresponding gene of fish lymphocystis disease virus (FLCDV), a member of the family Iridoviridae, which replicates in the cytoplasm of infected cells of flatfish. The gene coding for the largest subunit of the DdRP was identified using a PCR-derived probe. The screening of the complete EcoRI gene library of the viral genome led to the identification of the gene locus of the largest subunit of the DdRP within the EcoRI DNA fragment B (12.4 kbp, 0.034 to 0.165 map units). The nucleotide sequence of a part (8334 bp) of the EcoRI DNA fragment B was determined and a large ORF on the lower strand (ATG = 5787; TAA = 2190) was detected which encodes a protein of 1199 amino acids. Comparison of the amino acid sequences of the largest subunits of the DdRP (RPO1) of FLCDV and Chilo iridescent virus (CIV) revealed a dramatic difference in their domain organization. Unlike the 1051 aa RPO1 of CIV, which lacks the C-terminal domain conserved in eukaryotic, eubacterial and other viral RNA polymerases, the 1199 aa RPO1 of FLCDV is fully collinear with its cellular and viral homologues. Despite this difference, comparative analysis of the amino acid sequences of viral and cellular RNA polymerases suggests a common origin for the largest RNA polymerase subunits of FLCDV and CIV.
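The reported ORF coordinates and protein length can be cross-checked with a short calculation. The sketch assumes that "ATG = 5787" and "TAA = 2190" give the position of the first nucleotide of the start and stop codons, so that the distance between them is the coding length excluding the stop codon. This reading is an assumption, not something stated in the abstract, and the function name `orf_length_in_codons` is hypothetical.

```python
def orf_length_in_codons(start_codon_pos, stop_codon_pos):
    """Number of sense codons between a start and a stop codon.

    Assumption: both positions mark the first nucleotide of their codon
    in the reading direction, so the stop codon itself is excluded.
    Taking the absolute difference covers ORFs on either strand.
    """
    span = abs(start_codon_pos - stop_codon_pos)
    if span % 3 != 0:
        raise ValueError("start and stop codons are not in the same frame")
    return span // 3


# 5787 - 2190 = 3597 nucleotides, i.e. 3597 / 3 = 1199 codons,
# matching the 1199-amino-acid protein reported for FLCDV.
print(orf_length_in_codons(5787, 2190))
```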
NASA Technical Reports Server (NTRS)
Gatlin, L. L.
1974-01-01
Concepts of information theory are applied to examine various proteins in terms of their redundancy in natural originators such as animals and plants. The Monte Carlo method is used to derive information parameters for random protein sequences. Real protein sequence parameters are compared with the standard parameters of protein sequences having a specific length. The tendency of a chain to contain some amino acids more frequently than others and the tendency of a chain to contain certain amino acid pairs more frequently than other pairs are used as randomness measures of individual protein sequences. Non-periodic proteins are generally found to have random Shannon redundancies except in cases of constraints due to short chain length and genetic codes. Redundant characteristics of highly periodic proteins are discussed. A degree of periodicity parameter is derived.
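A rough sketch of this kind of comparison follows, assuming the simplest definition of redundancy, R = 1 - H / log2(20), where H is the Shannon entropy of the single-residue composition. The pair-frequency measure and the periodicity parameter mentioned in the abstract are not implemented, and all function names are hypothetical. A Monte Carlo baseline of uniformly random sequences of the same length gives the value expected by chance.

```python
import math
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def shannon_entropy(sequence):
    """Shannon entropy (bits per residue) of the residue composition."""
    n = len(sequence)
    return -sum((c / n) * math.log2(c / n) for c in Counter(sequence).values())


def redundancy(sequence, alphabet_size=20):
    """Redundancy R = 1 - H / log2(alphabet size), between 0 and 1."""
    return 1.0 - shannon_entropy(sequence) / math.log2(alphabet_size)


def random_baseline(length, trials=1000, seed=0):
    """Mean and standard deviation of R for uniformly random sequences."""
    rng = random.Random(seed)
    values = [
        redundancy("".join(rng.choice(AMINO_ACIDS) for _ in range(length)))
        for _ in range(trials)
    ]
    mean = sum(values) / trials
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (trials - 1))
    return mean, sd
```

Under these assumptions, a real protein would be judged non-random when its redundancy lies well outside the baseline for its length. Short random sequences show noticeably higher redundancy than long ones, which is the chain-length constraint the abstract refers to.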
Johnstone, E M; Chaney, M O; Norris, F H; Pascual, R; Little, S P
1991-07-01
Neuritic plaque and cerebrovascular amyloid deposits have been detected in the aged monkey, dog, and polar bear and have rarely been found in aged rodents (Biochem. Biophys. Res. Commun., 120 (1984) 885-890; Proc. Natl. Acad. Sci. U.S.A., 82 (1985) 4245-4249). To determine if the primary structure of the 42-43 residue amyloid peptide is conserved in species that accumulate plaques, the region of the amyloid precursor protein (APP) cDNA that encodes the peptide region was amplified by the polymerase chain reaction and sequenced. The deduced amino acid sequence was compared to those of species where amyloid accumulation has not been detected. The DNA sequences of dog, polar bear, rabbit, cow, sheep, pig and guinea pig were compared and a phylogenetic tree was generated. We conclude that the amino acid sequence of dog and polar bear and other mammals which may form amyloid plaques is conserved and that the species where amyloid has not been detected (mouse, rat) may be an evolutionarily distinct group. In addition, the predicted secondary structure of mouse and rat amyloid differs from that of amyloid-bearing species in its lack of propensity to form a beta-sheet structure. Thus, a cross-species examination of the amyloid peptide may suggest what is essential for amyloid deposition.
Somerville, Chris; Broun, Pierre; van de Loo, Frank
2001-01-01
This invention relates to plant fatty acyl hydroxylases. Methods to use conserved amino acid or nucleotide sequences to obtain plant fatty acyl hydroxylases are described. Also described is the use of cDNA clones encoding a plant hydroxylase to produce a family of hydroxylated fatty acids in transgenic plants. In addition, the use of genes encoding fatty acid hydroxylases or desaturases to alter the level of lipid fatty acid unsaturation in transgenic plants is described.
Zurawski, Gerard; Bohnert, Hans J.; Whitfeld, Paul R.; Bottomley, Warwick
1982-01-01
The gene for the so-called Mr 32,000 rapidly labeled photosystem II thylakoid membrane protein (here designated psbA) of spinach (Spinacia oleracea) chloroplasts is located on the chloroplast DNA in the large single-copy region immediately adjacent to one of the inverted repeat sequences. In this paper we show that the size of the mRNA for this protein is ≈ 1.25 kilobases and that the direction of transcription is towards the inverted repeat unit. The nucleotide sequence of the gene and its flanking regions is presented. The only large open reading frame in the sequence codes for a protein of Mr 38,950. The nucleotide sequence of psbA from Nicotiana debneyi also has been determined, and comparison of the sequences from the two species shows them to be highly conserved (>95% homology) throughout the entire reading frame. Conservation of the amino acid sequence is absolute, there being no changes in a total of 353 residues. This leads us to conclude that the primary translation product of psbA must be a protein of Mr 38,950. The protein is characterized by the complete absence of lysine residues and is relatively rich in hydrophobic amino acids, which tend to be clustered. Transcription of spinach psbA starts about 86 base pairs before the first ATG codon. Immediately upstream from this point there is a sequence typical of that found in E. coli promoters. An almost identical sequence occurs in the equivalent region of N. debneyi DNA. PMID:16593262
Buck, Patrick M.; Kumar, Sandeep; Singh, Satish K.
2013-01-01
The various roles that aggregation prone regions (APRs) are capable of playing in proteins are investigated here via comprehensive analyses of multiple non-redundant datasets containing randomly generated amino acid sequences, monomeric proteins, intrinsically disordered proteins (IDPs) and catalytic residues. Results from this study indicate that the aggregation propensities of monomeric protein sequences have been minimized compared to random sequences with uniform and natural amino acid compositions, as observed by a lower average aggregation propensity and fewer APRs that are shorter in length and more often punctuated by gate-keeper residues. However, evidence for evolutionary selective pressure to disrupt these sequence regions among homologous proteins is inconsistent. APRs are less conserved than average sequence identity among closely related homologues (≥80% sequence identity with a parent) but APRs are more conserved than average sequence identity among homologues that have at least 50% sequence identity with a parent. Structural analyses of APRs indicate that APRs are three times more likely to contain ordered versus disordered residues and that APRs frequently contribute more towards stabilizing proteins than equal length segments from the same protein. Catalytic residues and APRs were also found to be in structural contact significantly more often than expected by random chance. Our findings suggest that proteins have evolved by optimizing their risk of aggregation for cellular environments by both minimizing aggregation prone regions and by conserving those that are important for folding and function. In many cases, these sequence optimizations are insufficient to develop recombinant proteins into commercial products. Rational design strategies aimed at improving protein solubility for biotechnological purposes should carefully evaluate the contributions made by candidate APRs, targeted for disruption, towards protein structure and activity. 
PMID:24146608
Gocayne, J; Robinson, D A; FitzGerald, M G; Chung, F Z; Kerlavage, A R; Lentes, K U; Lai, J; Wang, C D; Fraser, C M; Venter, J C
1987-12-01
Two cDNA clones, lambda RHM-MF and lambda RHB-DAR, encoding the muscarinic cholinergic receptor and the beta-adrenergic receptor, respectively, have been isolated from a rat heart cDNA library. The cDNA clones were characterized by restriction mapping and automated DNA sequence analysis utilizing fluorescent dye primers. The rat heart muscarinic receptor consists of 466 amino acids and has a calculated molecular weight of 51,543. The rat heart beta-adrenergic receptor consists of 418 amino acids and has a calculated molecular weight of 46,890. The two cardiac receptors have substantial amino acid homology (27.2% identity, 50.6% with favored substitutions). The rat cardiac beta receptor has 88.0% homology (92.5% with favored substitutions) with the human brain beta receptor and the rat cardiac muscarinic receptor has 94.6% homology (97.6% with favored substitutions) with the porcine cardiac muscarinic receptor. The muscarinic cholinergic and beta-adrenergic receptors appear to be as conserved as hemoglobin and cytochrome c but less conserved than histones and are clearly members of a multigene family. These data support our hypothesis, based upon biochemical and immunological evidence, that suggests considerable structural homology and evolutionary conservation between adrenergic and muscarinic cholinergic receptors. To our knowledge, this is the first report utilizing automated DNA sequence analysis to determine the structure of a gene.
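Percentages such as "27.2% identity, 50.6% with favored substitutions" are derived from a pairwise alignment. The sketch below shows one common way to compute both figures from two already-aligned sequences. The grouping of "favored" substitutions is a hypothetical choice, since the abstract does not define its scheme, and the denominator (gap-free columns only) is likewise an assumption, as conventions vary.

```python
# Hypothetical groups of conservative ("favored") substitutions;
# the abstract does not specify which substitutions it counted.
FAVORED_GROUPS = ["ILVM", "FYW", "KRH", "DE", "ST", "NQ", "AG"]
GROUP_OF = {aa: i for i, group in enumerate(FAVORED_GROUPS) for aa in group}


def percent_identity_and_similarity(aligned_a, aligned_b):
    """Return (% identity, % identity plus favored substitutions).

    Inputs are two rows of a pairwise alignment with '-' for gaps.
    Assumption: percentages are taken over columns without a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
            similar += 1
        elif a in GROUP_OF and GROUP_OF[a] == GROUP_OF.get(b):
            similar += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100 * identical / compared, 100 * similar / compared
```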
Mandl, C W; Holzmann, H; Kunz, C; Heinz, F X
1993-05-01
The complete nucleotide sequence of the positive-stranded RNA genome of the tick-borne flavivirus Powassan (10,839 nucleotides) was elucidated and the amino acid sequence of all viral proteins was derived. Based on this sequence as well as serological data, Powassan virus represents the most divergent member of the tick-borne serocomplex within the genus Flavivirus, family Flaviviridae. The primary nucleotide sequence and potential RNA secondary structures of the Powassan virus genome as well as the protein sequences and the reactivities of the virion with a panel of monoclonal antibodies were compared to those of other tick-borne and mosquito-borne flaviviruses. These analyses corroborated significant differences between tick-borne and mosquito-borne flaviviruses, but also emphasized structural elements that are conserved among both vector groups. The comparisons among tick-borne flaviviruses revealed conserved sequence elements that might represent important determinants of the tick-borne flavivirus phenotype.
Albornos, Lucía; Martín, Ignacio; Iglesias, Rebeca; Jiménez, Teresa; Labrador, Emilia; Dopico, Berta
2012-11-07
Many proteins with tandem repeats in their sequence have been described and classified according to the length of the repeats: (I) repeats of short oligopeptides (from 2 to 20 amino acids), including structural cell wall proteins and arabinogalactan proteins; (II) repeats that range in length from 20 to 40 residues, including proteins with a well-established three-dimensional structure often involved in mediating protein-protein interactions; (III) longer repeats in the order of 100 amino acids that constitute structurally and functionally independent units. Here we analyse ShooT specific (ST) proteins, a family of proteins with tandem repeats of unknown function that were first found in Leguminosae, and their possible similarities to other proteins with tandem repeats. ST protein sequences were only found in dicotyledonous plants, limited to several plant families, mainly the Fabaceae and the Asteraceae. ST mRNAs accumulate mainly in the roots and under biotic interactions. Most ST proteins have one or several Domain(s) of Unknown Function 2775 (DUF2775). All deduced ST proteins have a signal peptide, indicating that these proteins enter the secretory pathway, and the mature proteins have tandem repeat oligopeptides that share a hexapeptide (E/D)FEPRP followed by 4 partially conserved amino acids, which could determine a putative N-glycosylation signal, and a fully conserved tyrosine. In a phylogenetic tree, the sequences cluster according to taxonomic group. A possible involvement in symbiosis and abiotic stress as well as in plant cell elongation is suggested, although different STs could play different roles in plant development. We describe a new family of proteins called ST whose presence is limited to the plant kingdom, specifically to a few families of dicotyledonous plants. They present 20 to 40 amino acid tandem repeat sequences whose characteristics (signal peptide, DUF2775 domain, conservative repeat regions) differ from those of the described group of 20 to 40 amino acid tandem repeat proteins and also from those of known cell wall proteins with repeat sequences. Several putative roles in plant physiology can be inferred from the characteristics found.
Sequences of heavy and light chain variable regions from four bovine immunoglobulins.
Armour, K L; Tempest, P R; Fawcett, P H; Fernie, M L; King, S I; White, P; Taylor, G; Harris, W J
1994-12-01
Oligodeoxyribonucleotide primers based on the 5' ends of bovine IgG1/2 and lambda constant (C) region genes, together with primers encoding conserved amino acids at the N-terminus of mature variable (V) regions from other species, have been used in cDNA and polymerase chain reactions (PCRs) to amplify heavy and light chain V region cDNA from bovine heterohybridomas. The amino acid sequences of VH and V lambda from four bovine immunoglobulins of different specificities are presented.
Ventura, Marco; Jankovic, Ivana; Walker, D. Carey; Pridmore, R. David; Zink, Ralf
2002-01-01
We have identified and sequenced the genes encoding the aggregation-promoting factor (APF) protein from six different strains of Lactobacillus johnsonii and Lactobacillus gasseri. Both species harbor two apf genes, apf1 and apf2, which are in the same orientation and encode proteins of 257 to 326 amino acids. Multiple alignments of the deduced amino acid sequences of these apf genes demonstrate a very strong sequence conservation of all of the genes with the exception of their central regions. Northern blot analysis showed that both genes are transcribed, reaching their maximum expression during the exponential phase. Primer extension analysis revealed that apf1 and apf2 harbor a putative promoter sequence that is conserved in all of the genes. Western blot analysis of the LiCl cell extracts showed that APF proteins are located on the cell surface. Intact cells of L. johnsonii revealed the typical cell wall architecture of S-layer-carrying gram-positive eubacteria, which could be selectively removed with LiCl treatment. In addition, the amino acid composition, physical properties, and genetic organization were found to be quite similar to those of S-layer proteins. These results suggest that APF is a novel surface protein of the Lactobacillus acidophilus B-homology group which might belong to an S-layer-like family. PMID:12450842
GCPred: a web tool for guanylyl cyclase functional centre prediction from amino acid sequence.
Xu, Nuo; Fu, Dongfang; Li, Shiang; Wang, Yuxuan; Wong, Aloysius
2018-06-15
GCPred is a webserver for the prediction of guanylyl cyclase (GC) functional centres from amino acid sequence. GCs are enzymes that generate the signalling molecule cyclic guanosine 3', 5'-monophosphate from guanosine-5'-triphosphate. A novel class of GC centres (GCCs) has been identified in complex plant proteins. Using currently available experimental data, GCPred was created to automate and facilitate the identification of similar GCCs. The server computes GCC values that take into account the physicochemical properties of the amino acids constituting the GCC and the conserved amino acids within the centre. From a user-supplied amino acid sequence, the server returns a table of GCC values and graphs depicting deviations from mean values. The utility of this server is demonstrated using plant proteins and the human interleukin-1 receptor-associated kinase family of proteins as examples. The GCPred server is available at http://gcpred.com. Supplementary data are available at Bioinformatics online.
Lampel, J S; Aphale, J S; Lampel, K A; Strohl, W R
1992-01-01
The gene encoding a novel milk protein-hydrolyzing proteinase was cloned on a 6.56-kb SstI fragment from Streptomyces sp. strain C5 genomic DNA into Streptomyces lividans 1326 by using the plasmid vector pIJ702. The gene encoding the small neutral proteinase (snpA) was located within a 2.6-kb BamHI-SstI restriction fragment that was partially sequenced. The molecular mass of the deduced amino acid sequence of the mature protein was determined to be 15,740, which corresponds very closely with the relative molecular mass of the purified protein (15,500) determined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. The N-terminal amino acid sequence of the purified neutral proteinase was determined, and the DNA encoding this sequence was found to be located within the sequenced DNA. The deduced amino acid sequence contains a conserved zinc binding site, although secondary ligand binding and active sites typical of thermolysinlike metalloproteinases are absent. The combination of its small size, deduced amino acid sequence, and substrate and inhibition profile indicate that snpA encodes a novel neutral proteinase. PMID:1569011
Moreira, K G; Prates, M V; Andrade, F A C; Silva, L P; Beirão, P S L; Kushmerick, C; Naves, L A; Bloch, C
2010-08-01
Neurotoxicity is a major symptom of envenomation caused by the Brazilian coral snake Micrurus frontalis. Due to the small amount of material that can be collected, no neurotoxin has been fully sequenced from this venom. In this work we report six new three-finger like toxins isolated from the venom of the coral snake M. frontalis which we named Frontoxin (FTx) I-VI. Toxins were purified using multiple steps of RP-HPLC. Molecular masses were determined by MALDI-TOF and ESI ion-trap mass spectrometry. The complete amino acid sequences of FTx II, III, IV and V were determined by sequencing of overlapping proteolytic fragments by Edman degradation and by de novo sequencing. The amino acid sequences of FTx I, II, III and VI predict 4 conserved disulphide bonds and structural similarity to previously reported short-chain alpha-neurotoxins. FTx IV and V each contained 10 conserved cysteines and share high similarity with long-chain alpha-neurotoxins. At the frog neuromuscular junction FTx II, III and IV reduced miniature endplate potential amplitudes in a time- and concentration-dependent manner, suggesting that Frontoxins block nicotinic acetylcholine receptors. Copyright 2010 Elsevier Ltd. All rights reserved.
Coffinet, Stéphanie; Cossu-Leguille, Carole; Rodius, François; Vasseur, Paule
2008-09-01
Glutamate cysteine ligase (GCL; EC 6.3.2.2) is the first enzyme involved in the synthesis of glutathione. An HPLC method with fluorimetric detection was used to measure GCL activity in the gills and the digestive gland of the freshwater bivalve, Unio tumidus. Storage conditions were optimized in order to prevent a decrease in GCL activity and consisted of freezing the cytosolic fraction in the presence of protease (1 mM phenylmethylsulfonyl fluoride) and gamma-glutamyltranspeptidase (1 mM L-serine borate mixture and 0.5 mM acivicin) inhibitors. Seasonal variations of activity in the digestive gland and to a lesser extent in the gills were found, with activity increasing in spring compared to winter. No sex differences were revealed. The GCL coding sequence was identified using degenerate primers designed in the highly conserved regions of the catalytic subunit of GCL. The partial sequence identified encoded 121 amino acids. The comparison of the identified partial coding sequence of U. tumidus with those available from vertebrates and invertebrates indicated that the GCL sequence was highly conserved.
The primary structure of the thymidine kinase gene of fish lymphocystis disease virus.
Schnitzler, P; Handermann, M; Szépe, O; Darai, G
1991-06-01
The DNA nucleotide sequence of the thymidine kinase (TK) gene of fish lymphocystis disease virus (FLDV), which has been localized between map coordinates 0.678 and 0.688 of the viral genome, was determined. The analysis of the DNA nucleotide sequence located between the recognition sites of HindIII (0.669 map unit; nucleotide position 1) and AccI (nucleotide position 2032) revealed the presence of an open reading frame of 954 bp on the lower strand of this region between nucleotide positions 1868 (ATG) and 915 (TAA). It encodes a protein of 318 amino acid residues. The evolutionary relationships of the TK gene of FLDV to the other known TK genes were investigated using the method of progressive sequence alignment. These analyses revealed a high degree of diversity between the protein sequence of the FLDV TK gene and the amino acid composition of other TKs tested. However, significant conservation was detected at several regions of amino acid residues of the FLDV TK protein when compared to the amino acid sequence of TKs of African swine fever virus, fowlpox virus, shope fibroma virus, and vaccinia virus and to the amino acid sequences of the cellular cytoplasmic TK of chicken, mouse, and man.
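The coordinates quoted in this abstract can be checked with a line of arithmetic. The sketch below assumes 1-based, inclusive nucleotide positions (an assumption, since the abstract does not state its convention); the helper name is illustrative.

```python
def orf_length_bp(start: int, end: int) -> int:
    """Inclusive length of an ORF from 1-based coordinates on either strand."""
    return abs(start - end) + 1


# Lower-strand ORF: runs from position 1868 (ATG) down to position 915 (TAA).
length = orf_length_bp(1868, 915)
codons, remainder = divmod(length, 3)
print(length, codons, remainder)
```

Under this assumption the span is 954 bp, which divides evenly into 318 codons. Whether the stop codon is counted among those 318 depends on the coordinate convention used in the original paper, which the abstract does not specify.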
Quaranfil, Johnston Atoll, and Lake Chad viruses are novel members of the family Orthomyxoviridae.
Presti, Rachel M; Zhao, Guoyan; Beatty, Wandy L; Mihindukulasuriya, Kathie A; da Rosa, Amelia P A Travassos; Popov, Vsevolod L; Tesh, Robert B; Virgin, Herbert W; Wang, David
2009-11-01
Arboviral infections are an important cause of emerging infections due to the movements of humans, animals, and hematophagous arthropods. Quaranfil virus (QRFV) is an unclassified arbovirus originally isolated from children with mild febrile illness in Quaranfil, Egypt, in 1953. It has subsequently been isolated in multiple geographic areas from ticks and birds. We used high-throughput sequencing to classify QRFV as a novel orthomyxovirus. The genome of this virus is comprised of multiple RNA segments; five were completely sequenced. Proteins with limited amino acid similarity to conserved domains in polymerase (PA, PB1, and PB2) and hemagglutinin (HA) genes from known orthomyxoviruses were predicted to be present in four of the segments. The fifth sequenced segment shared no detectable similarity to any protein and is of uncertain function. The end-terminal sequences of QRFV are conserved between segments and are different from those of the known orthomyxovirus genera. QRFV is known to cross-react serologically with two other unclassified viruses, Johnston Atoll virus (JAV) and Lake Chad virus (LKCV). The complete open reading frames of PB1 and HA were sequenced for JAV, while a fragment of PB1 of LKCV was identified by mass sequencing. QRFV and JAV PB1 and HA shared 80% and 70% amino acid identity to each other, respectively; the LKCV PB1 fragment shared 83% amino acid identity with the corresponding region of QRFV PB1. Based on phylogenetic analyses, virion ultrastructural features, and the unique end-terminal sequences identified, we propose that QRFV, JAV, and LKCV comprise a novel genus of the family Orthomyxoviridae. PMID:19726499
Production of hydroxylated fatty acids in genetically modified plants
Somerville, Chris [Portola Valley, CA]; Broun, Pierre [Burlingame, CA]; van de Loo, Frank [Weston, AU]; Boddupalli, Sekhar S [Manchester, MI]
2011-08-23
This invention relates to plant fatty acyl hydroxylases. Methods to use conserved amino acid or nucleotide sequences to obtain plant fatty acyl hydroxylases are described. Also described is the use of cDNA clones encoding a plant hydroxylase to produce a family of hydroxylated fatty acids in transgenic plants. In addition, the use of genes encoding fatty acid hydroxylases or desaturases to alter the level of lipid fatty acid unsaturation in transgenic plants is described.
Production of hydroxylated fatty acids in genetically modified plants
Somerville, Chris; Broun, Pierre; van de Loo, Frank; Boddupalli, Sekhar S.
2005-08-30
This invention relates to plant fatty acyl hydroxylases. Methods to use conserved amino acid or nucleotide sequences to obtain plant fatty acyl hydroxylases are described. Also described is the use of cDNA clones encoding a plant hydroxylase to produce a family of hydroxylated fatty acids in transgenic plants. In addition, the use of genes encoding fatty acid hydroxylases or desaturases to alter the level of lipid fatty acid unsaturation in transgenic plants is described.
Drobni, Mirva; Hallberg, Kristina; Öhman, Ulla; Birve, Anna; Persson, Karina; Johansson, Ingegerd; Strömberg, Nicklas
2006-01-01
Background Actinomyces naeslundii genospecies 1 and 2 express type-2 fimbriae (FimA subunit polymers) with variant Galβ binding specificities and Actinomyces odontolyticus a sialic acid specificity to colonize different oral surfaces. However, the fimbrial nature of the sialic acid binding property and sequence information about FimA proteins from multiple strains are lacking. Results Here we have sequenced fimA genes from strains of A. naeslundii genospecies 1 (n = 4) and genospecies 2 (n = 4), both of which harboured variant Galβ-dependent hemagglutination (HA) types, and from A. odontolyticus PK984 with a sialic acid-dependent HA pattern. Three unique subtypes of FimA proteins with 63.8–66.4% sequence identity were present in strains of A. naeslundii genospecies 1 and 2 and A. odontolyticus. The generally high FimA sequence identity (>97.2%) within a genospecies revealed species specific sequences or segments that coincided with binding specificity. All three FimA protein variants contained a signal peptide, pilin motif, E box, proline-rich segment and an LPXTG sorting motif among other conserved segments for secretion, assembly and sorting of fimbrial proteins. The highly conserved pilin, E box and LPXTG motifs are present in fimbriae proteins from other Gram-positive bacteria. Moreover, only strains of genospecies 1 were agglutinated with type-2 fimbriae antisera derived from A. naeslundii genospecies 1 strain 12104, emphasizing that the overall folding of FimA may generate different functionalities. Western blot analyses with FimA antisera revealed monomers and oligomers of FimA in whole cell protein extracts and a purified recombinant FimA preparation, indicating a sortase-independent oligomerization of FimA. Conclusion The genus Actinomyces involves a diversity of unique FimA proteins with conserved pilin, E box and LPXTG motifs, depending on subspecies and associated binding specificity. In addition, a sortase independent oligomerization of FimA subunit proteins in solution was indicated. PMID:16686953
Chen, Yuhuang; Duan, Ran; Li, Xu; Li, Kewei; Liang, Junrong; Liu, Chang; Qiu, Haiyan; Xiao, Yuchun; Jing, Huaiqi; Wang, Xin
2015-12-01
The outer membrane protein A (OmpA) is one of the intra-species conserved proteins with immunogenicity widely found in the family Enterobacteriaceae. Here we first confirmed that OmpA is conserved in the three pathogenic Yersinia: Yersinia pestis, Yersinia pseudotuberculosis and pathogenic Yersinia enterocolitica, with high homology at the nucleotide level and at the amino acid sequence level. The ompA sequences of 262 Y. pestis strains, 134 Y. pseudotuberculosis strains and 219 pathogenic Y. enterocolitica strains are 100%, 98.8% and 97.7% identical, respectively. The main OmpA patterns of the pathogenic Yersinia are 86.2% and 88.8% identical at the nucleotide and amino acid sequence levels, respectively. Immunological analysis showed the immunogenicity of each OmpA and the cross-immunogenicity of OmpA among pathogenic Yersinia, indicating that OmpA may be a vaccine candidate for Y. pestis and other pathogenic Yersinia. Copyright © 2015 Elsevier Ltd. All rights reserved.
Cloning and sequence analysis of Hemonchus contortus HC58cDNA.
Muleke, Charles I; Ruofeng, Yan; Lixin, Xu; Xinwen, Bo; Xiangrui, Li
2007-06-01
The complete coding sequence of Hemonchus contortus HC58cDNA was generated by rapid amplification of cDNA ends and polymerase chain reaction using primers based on the 5' and 3' ends of the parasite mRNA, accession no. AF305964. The HC58cDNA gene was 851 bp long, with an open reading frame of 717 bp encoding a 239-amino-acid precursor of approximately 27 kDa. Analysis of the amino acid sequence revealed conserved residues of cysteine, histidine, asparagine, the occluding loop pattern, the hemoglobinase motif and the glutamine of the oxyanion hole characteristic of cathepsin B like proteases (CBL). Comparison of the predicted amino acid sequences showed that the protein shared 33.5-58.7% identity with cathepsin B homologues in the papain clan CA family (family C1). Phylogenetic analysis revealed close evolutionary proximity of the protein sequence to counterpart sequences in the CBL, suggesting that HC58cDNA was a member of the papain family.
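The figures in this abstract are mutually consistent, which a rough calculation shows. The sketch below assumes the common rule of thumb of about 110 Da per amino acid residue; this is an approximation introduced here for illustration, not the method used in the study, and the constant and function names are hypothetical.

```python
# Rule-of-thumb average residue mass (assumption, not from the abstract).
AVG_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int) -> float:
    """Rough protein mass in kDa from residue count alone."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


orf_bp = 717                      # open reading frame length from the abstract
codons, leftover = divmod(orf_bp, 3)
mass_kda = estimate_mass_kda(codons)
print(codons, leftover, round(mass_kda, 1))
```

The 717 bp frame yields 239 codons, and the estimate lands near 26 kDa, in line with the reported value of approximately 27 kDa. An exact mass would require summing the actual residue masses of the sequence.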
Diversity of the P2 protein among nontypeable Haemophilus influenzae isolates.
Bell, J; Grass, S; Jeanteur, D; Munson, R S
1994-01-01
The genes for outer membrane protein P2 of four nontypeable Haemophilus influenzae strains were cloned and sequenced. The derived amino acid sequences were compared with the outer membrane protein P2 sequence from H. influenzae type b MinnA and the sequences of P2 from three additional nontypeable H. influenzae strains. The sequences were 76 to 94% identical. The sequences had regions with considerable variability separated by regions which were highly conserved. The variable regions mapped to putative surface-exposed loops of the protein. PMID:8188390
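Percent identity figures such as the 76 to 94% reported here are computed from aligned sequences. The following is a minimal sketch, assuming the two sequences are already aligned to equal length with "-" marking gaps, and assuming that gap columns are excluded from the denominator. Other definitions of identity exist, and the abstract does not say which one was used; the function name and toy sequences are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identical residues over ungapped columns of two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue              # skip columns containing a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment, not real P2 sequences.
print(round(percent_identity("MKK-TLAV", "MKRATLGV"), 1))
```

Applying such a function per alignment window, rather than over the full length, is one way to separate variable regions from conserved ones.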
Collin, Matthew A; Clarke, Thomas H; Ayoub, Nadia A; Hayashi, Cheryl Y
2018-07-01
A powerful system for studying protein aggregation, particularly rapid self-assembly, is spider silk. Spider silks are proteinaceous and silk proteins are synthesized and stored within silk glands as liquid dope. As needed, liquid dope is near-instantaneously transformed into solid fibers or viscous adhesives. The dominant constituents of silks are spidroins (spider fibroins) and their terminal domains are vital for the tight control of silk self-assembly. To better understand spidroin termini, we used target capture and deep sequencing to identify spidroin gene sequences from six species representing the araneoid families of Araneidae, Nephilidae, and Theridiidae. We obtained 145 terminal regions, of which 103 are newly annotated here, as well as novel variants within nine diverse spidroin types. Our comparative analyses demonstrated the conservation of acidic, basic, and cysteine amino acid residues across spidroin types that had been proposed to be important for monomer stability, dimer formation, and self-assembly from a limited sampling of spidroins. Computational, protein homology modeling revealed areas of spidroin terminal regions that are highly conserved in three-dimensions despite sequence divergence across spidroin types. Analyses of our dense sampling of terminal regions suggest that most spidroins share stabilization mechanisms, dimer formation, and tertiary structure, despite producing functionally distinct materials. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
Fanning, T; Singer, M
1987-01-01
Recent work suggests that one or more members of the highly repeated LINE-1 (L1) DNA family found in all mammals may encode one or more proteins. Here we report the sequence of a portion of an L1 cloned from the domestic cat (Felis catus). These data permit comparison of the L1 sequences in four mammalian orders (Carnivore, Lagomorph, Rodent and Primate) and the comparison supports the suggested coding potential. In two separate, noncontiguous regions in the carboxy terminal half of the proteins predicted from the DNA sequences, there are several strongly conserved segments. In one region, these share homology with known or suspected reverse transcriptases, as described by others in rodents and primates. In the second region, closer to the carboxy terminus, the strongly conserved segments are over 90% homologous among the four orders. One of the latter segments is cysteine rich and resembles the putative metal binding domains of nucleic acid binding proteins, including those of TFIIIA and retroviruses. PMID:3562227
New Insight Into the Diversity of SemiSWEET Sugar Transporters and the Homologs in Prokaryotes
Jia, Baolei; Hao, Lujiang; Xuan, Yuan Hu; Jeon, Che Ok
2018-01-01
Sugars will eventually be exported transporters (SWEETs) and SemiSWEETs represent a family of sugar transporters in eukaryotes and prokaryotes, respectively. SWEETs contain seven transmembrane helices (TMHs), while SemiSWEETs contain three. The functions of SemiSWEETs are less studied. In this perspective article, we analyzed the diversity and conservation of SemiSWEETs and further proposed the possible functions. 1,922 SemiSWEET homologs were retrieved from the UniProt database, which is not proportional to the sequenced prokaryotic genomes. However, these proteins are very diverse in sequences and can be classified into 19 clusters when >50% sequence identity is required. Moreover, a gene context analysis indicated that several SemiSWEETs are located in the operons that are related to diverse carbohydrate metabolism. Several proteins with seven TMHs can be found in bacteria, and sequence alignment suggested that these proteins in bacteria may be formed by the duplication and fusion. Multiple sequence alignments showed that the amino acids for sugar translocation are still conserved and coevolved, although the sequences show diversity. Among them, the functions of a few amino acids are still not clear. These findings highlight the challenges that exist in SemiSWEETs and provide future researchers the foundation to explore these uncharted areas. PMID:29872447
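Grouping homologs into clusters at an identity threshold, as in the 19 clusters at >50% identity mentioned here, can be illustrated with a greedy procedure. The abstract does not state which clustering method was used, so the sketch below is an assumed stand-in: each sequence joins the first cluster whose representative it matches above the threshold, otherwise it founds a new cluster. It also assumes equal-length, pre-aligned toy sequences; all names are hypothetical.

```python
def identity(a: str, b: str) -> float:
    """Fraction of identical positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(seqs: list[str], threshold: float = 0.5) -> list[list[str]]:
    """Greedy clustering; the first member of each cluster is its representative."""
    clusters: list[list[str]] = []
    for s in seqs:
        for cluster in clusters:
            if identity(s, cluster[0]) > threshold:
                cluster.append(s)
                break
        else:
            clusters.append([s])  # no representative was close enough
    return clusters


# Hypothetical toy sequences, not real SemiSWEETs.
toy = ["AAAAAAAA", "AAAAAAAC", "CCCCCCCC", "CCCCCCAA", "GGGGGGGG"]
clusters = greedy_cluster(toy)
print(len(clusters), clusters)
```

Greedy clustering is order-dependent: processing the sequences in a different order can change the representatives and therefore the cluster count.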
Wang, Yin-qiu; Qian, Ya-ping; Yang, Su; Shi, Hong; Liao, Cheng-hong; Zheng, Hong-Kun; Wang, Jun; Lin, Alice A.; Cavalli-Sforza, L. Luca; Underhill, Peter A.; Chakraborty, Ranajit; Jin, Li; Su, Bing
2005-01-01
Pituitary adenylate cyclase-activating polypeptide (PACAP) is a neuropeptide abundantly expressed in the central nervous system and involved in regulating neurogenesis and neuronal signal transduction. The amino acid sequence of PACAP is extremely conserved across vertebrate species, indicating a strong functional constraint during the course of evolution. However, through comparative sequence analysis, we demonstrated that the PACAP precursor gene underwent an accelerated evolution in the human lineage since the divergence from chimpanzees, and the amino acid substitution rate in humans is at least seven times faster than that in other mammal species resulting from strong Darwinian positive selection. Eleven human-specific amino acid changes were identified in the PACAP precursors, which are conserved from murine to African apes. Protein structural analysis suggested that a putative novel neuropeptide might have originated during human evolution and functioned in the human brain. Our data suggested that the PACAP precursor gene underwent adaptive changes during human origin and may have contributed to the formation of human cognition. PMID:15834139
Adhesive Proteins of Stalked and Acorn Barnacles Display Homology with Low Sequence Similarities
Jonker, Jaimie-Leigh; Abram, Florence; Pires, Elisabete; Varela Coelho, Ana; Grunwald, Ingo; Power, Anne Marie
2014-01-01
Barnacle adhesion underwater is an important phenomenon to understand for the prevention of biofouling and potential biotechnological innovations, yet so far, identifying what makes barnacle glue proteins ‘sticky’ has proved elusive. Examination of a broad range of species within the barnacles may be instructive to identify conserved adhesive domains. We add to extensive information from the acorn barnacles (order Sessilia) by providing the first protein analysis of a stalked barnacle adhesive, Lepas anatifera (order Lepadiformes). It was possible to separate the L. anatifera adhesive into at least 10 protein bands using SDS-PAGE. Intense bands were present at approximately 30, 70, 90 and 110 kilodaltons (kDa). Mass spectrometry for protein identification was followed by de novo sequencing which detected 52 peptides of 7–16 amino acids in length. None of the peptides matched published or unpublished transcriptome sequences, but some amino acid sequence similarity was apparent between L. anatifera and closely-related Dosima fascicularis. Antibodies against two acorn barnacle proteins (ab-cp-52k and ab-cp-68k) showed cross-reactivity in the adhesive glands of L. anatifera. We also analysed the similarity of adhesive proteins across several barnacle taxa, including Pollicipes pollicipes (a stalked barnacle in the order Scalpelliformes). Sequence alignment of published expressed sequence tags clearly indicated that P. pollicipes possesses homologues for the 19 kDa and 100 kDa proteins in acorn barnacles. Homology aside, sequence similarity in amino acid and gene sequences tended to decline as taxonomic distance increased, with minimum similarities of 18–26%, depending on the gene. The results indicate that some adhesive proteins (e.g. 100 kDa) are more conserved within barnacles than others (20 kDa). PMID:25295513
Gocayne, J; Robinson, D A; FitzGerald, M G; Chung, F Z; Kerlavage, A R; Lentes, K U; Lai, J; Wang, C D; Fraser, C M; Venter, J C
1987-01-01
Two cDNA clones, lambda RHM-MF and lambda RHB-DAR, encoding the muscarinic cholinergic receptor and the beta-adrenergic receptor, respectively, have been isolated from a rat heart cDNA library. The cDNA clones were characterized by restriction mapping and automated DNA sequence analysis utilizing fluorescent dye primers. The rat heart muscarinic receptor consists of 466 amino acids and has a calculated molecular weight of 51,543. The rat heart beta-adrenergic receptor consists of 418 amino acids and has a calculated molecular weight of 46,890. The two cardiac receptors have substantial amino acid homology (27.2% identity, 50.6% with favored substitutions). The rat cardiac beta receptor has 88.0% homology (92.5% with favored substitutions) with the human brain beta receptor and the rat cardiac muscarinic receptor has 94.6% homology (97.6% with favored substitutions) with the porcine cardiac muscarinic receptor. The muscarinic cholinergic and beta-adrenergic receptors appear to be as conserved as hemoglobin and cytochrome c but less conserved than histones and are clearly members of a multigene family. These data support our hypothesis, based upon biochemical and immunological evidence, that suggests considerable structural homology and evolutionary conservation between adrenergic and muscarinic cholinergic receptors. To our knowledge, this is the first report utilizing automated DNA sequence analysis to determine the structure of a gene. PMID:2825184
USDA-ARS's Scientific Manuscript database
Our recent study has shown that bovine rhinovirus type 2 (BRV2), a new member of the Aphthovirus genus, shares many motifs and sequence similarities with foot-and-mouth disease virus (FMDV). Despite low sequence conservation (36% amino acid identity) and N- and C-terminus folding differences,...
Structural analysis of Bacillus pumilus phenolic acid decarboxylase, a lipocalin-fold enzyme
DOE Office of Scientific and Technical Information (OSTI.GOV)
Matte, Allan; Grosse, Stephan; Bergeron, Hélène
The decarboxylation of phenolic acids, including ferulic and p-coumaric acids, to their corresponding vinyl derivatives is of importance in the flavoring and polymer industries. Here, the crystal structure of phenolic acid decarboxylase (PAD) from Bacillus pumilus strain UI-670 is reported. The enzyme is a 161-residue polypeptide that forms dimers both in the crystal and in solution. The structure of PAD as determined by X-ray crystallography revealed a β-barrel structure and two α-helices, with a cleft formed at one edge of the barrel. The PAD structure resembles those of the lipocalin-fold proteins, which often bind hydrophobic ligands. Superposition of structurally related proteins bound to their cognate ligands shows that they and PAD bind their ligands in a conserved location within the β-barrel. Analysis of the residue-conservation pattern for PAD-related sequences mapped onto the PAD structure reveals that the conservation mainly includes residues found within the hydrophobic core of the protein, defining a common lipocalin-like fold for this enzyme family. A narrow cleft containing several conserved amino acids was observed as a structural feature and a potential ligand-binding site.
Evolutionarily conserved ELOVL4 gene expression in the vertebrate retina.
Lagali, Pamela S; Liu, Jiafan; Ambasudhan, Rajesh; Kakuk, Laura E; Bernstein, Steven L; Seigel, Gail M; Wong, Paul W; Ayyagari, Radha
2003-07-01
The gene elongation of very long chain fatty acids-4 (ELOVL4) has been shown to underlie phenotypically heterogeneous forms of autosomal dominant macular degeneration. In this study, the extent of evolutionary conservation and the existence and localization of retinal expression of this gene was investigated across a wide variety of species. Southern blot analysis of genomic DNA and bioinformatic analysis using the human ELOVL4 cDNA and protein sequences, respectively, were performed to identify species in which ELOVL4 orthologues and/or homologues are present. Retinal RNA and protein extracts derived from different species were assessed by Northern hybridization and immunoblot techniques to assess evolutionary conservation of gene expression. Immunohistochemical analysis of tissue sections prepared from various mammalian retinas was performed to determine the distribution of ELOVL4 and homologous proteins within specific retinal cell layers. The existence of ELOVL4 sequence orthologues and homologues was confirmed by both Southern blot analysis and in silico searches of protein sequence databases. Phylogenetic analysis places ELOVL4 among a large family of known and putative fatty acid elongase proteins. Northern blot analysis revealed the presence of multiple transcripts corresponding to ELOVL4 homologues expressed in the retina of several different mammalian species. Conserved proteins were also detected among retinal extracts of different mammals and were found to localize predominantly to the photoreceptor cell layer within retinal tissue preparations. The ELOVL4 gene is highly conserved throughout evolution and is expressed in the photoreceptor cells of the retina in a variety of different species, which suggests that it plays a critical role in retinal cell biology.
Zhu, X; Naz, R K
1999-03-01
The deduced ZP3 amino acid (aa) sequences of 13 vertebrate species namely mouse, hamster, rabbit, pig, porcine, cow, dog, cat, human, bonnet, marmoset, carp, and frog were compared using the PILEUP and PRETTY alignment programs (GCG, Wisconsin, USA). The published aa sequences obtained from 13 vertebrate species indicated overall evolutionary conservation in the N-terminus, central region, and C-terminus of the ZP3 polypeptide. More variations of ZP3 polypeptide sequences were seen in the alignments of carp and frog from the 11 mammalian species making the leader sequence more prominent. The canonical furin proteolytic processing signal at the C-terminus was found in all the ZP3 polypeptide sequences except those of carp and frog. In the central region, the ZP3 deduced aa sequences of all the 13 vertebrate species aligned well, and six relatively conserved sequences were found. There are 11 conserved cysteine residues in the central region across all species including carp and frog, indicating that these residues have a longer evolutionary history. The ZP3 aa sequence similarities were examined using the GAP program (GCG). The highest aa similarities are observed between the members of the same order within the class Mammalia, and also (95.4%) between pig (ungulata) and rabbit (lagomorpha). The deduced ZP3 aa sequences per se may not be enough to build a phylogenetic tree.
2012-01-01
Background The detection of conserved residue clusters on a protein structure is one of the effective strategies for the prediction of functional protein regions. Various methods, such as Evolutionary Trace, have been developed based on this strategy. In such approaches, the conserved residues are identified through comparisons of homologous amino acid sequences. Therefore, the selection of homologous sequences is a critical step. It is empirically known that a certain degree of sequence divergence in the set of homologous sequences is required for the identification of conserved residues. However, the development of a method to select homologous sequences appropriate for the identification of conserved residues has not been sufficiently addressed. An objective and general method to select appropriate homologous sequences is desired for the efficient prediction of functional regions. Results We have developed a novel index to select the sequences appropriate for the identification of conserved residues, and implemented the index within our method to predict the functional regions of a protein. The implementation of the index improved the performance of the functional region prediction. The index represents the degree of conserved residue clustering on the tertiary structure of the protein. For this purpose, the structure and sequence information were integrated within the index by the application of spatial statistics. Spatial statistics is a field of statistics in which not only the attributes but also the geometrical coordinates of the data are considered simultaneously. Higher degrees of clustering generate larger index scores. We adopted the set of homologous sequences with the highest index score, under the assumption that the best prediction accuracy is obtained when the degree of clustering is the maximum. 
The set of sequences selected by the index led to higher functional region prediction performance than the sets of sequences selected by other sequence-based methods. Conclusions Appropriate homologous sequences are selected automatically and objectively by the index. Such sequence selection improved the performance of functional region prediction. As far as we know, this is the first approach in which spatial statistics have been applied to protein analyses. Such integration of structure and sequence information would be useful for other bioinformatics problems. PMID:22643026
Basak, Papri; Maitra-Majee, Susmita; Das, Jayanta Kumar; Mukherjee, Abhishek; Ghosh Dastidar, Shubhra; Pal Choudhury, Pabitra
2017-01-01
A molecular evolutionary analysis of a well conserved protein helps to determine the essential amino acids in the core catalytic region. Based on the chemical properties of amino acid residues, phylogenetic analysis of a total of 172 homologous sequences of a highly conserved enzyme, L-myo-inositol 1-phosphate synthase or MIPS from evolutionarily diverse organisms was performed. This study revealed the presence of six phylogenetically conserved blocks, out of which four embrace the catalytic core of the functional protein. Further, specific amino acid modifications targeting the lysine residues, known to be important for MIPS catalysis, were performed at the catalytic site of a MIPS from monocotyledonous model plant, Oryza sativa (OsMIPS1). Following this study, OsMIPS mutants with deletion or replacement of lysine residues in the conserved blocks were made. Based on the enzyme kinetics performed on the deletion/replacement mutants, phylogenetic and structural comparison with the already established crystal structures from non-plant sources, an evolutionarily conserved peptide stretch was identified at the active pocket which contains the two most important lysine residues essential for catalytic activity. PMID:28950028
The Malarial Host-Targeting Signal Is Conserved in the Irish Potato Famine Pathogen
Liolios, Konstantinos; Win, Joe; Kanneganti, Thirumala-Devi; Young, Carolyn; Kamoun, Sophien; Haldar, Kasturi
2006-01-01
Animal and plant eukaryotic pathogens, such as the human malaria parasite Plasmodium falciparum and the potato late blight agent Phytophthora infestans, are widely divergent eukaryotic microbes. Yet they both produce secretory virulence and pathogenic proteins that alter host cell functions. In P. falciparum, export of parasite proteins to the host erythrocyte is mediated by leader sequences shown to contain a host-targeting (HT) motif centered on an RxLx (E, D, or Q) core: this motif appears to signify a major pathogenic export pathway with hundreds of putative effectors. Here we show that a secretory protein of P. infestans, which is perceived by plant disease resistance proteins and induces hypersensitive plant cell death, contains a leader sequence that is equivalent to the Plasmodium HT-leader in its ability to export fusion of green fluorescent protein (GFP) from the P. falciparum parasite to the host erythrocyte. This export is dependent on an RxLR sequence conserved in P. infestans leaders, as well as in leaders of all ten secretory oomycete proteins shown to function inside plant cells. The RxLR motif is also detected in hundreds of secretory proteins of P. infestans, Phytophthora sojae, and Phytophthora ramorum and has high value in predicting host-targeted leaders. A consensus motif further reveals E/D residues enriched within ~25 amino acids downstream of the RxLR, which are also needed for export. Together the data suggest that in these plant pathogenic oomycetes, a consensus HT motif may reside in an extended sequence of ~25–30 amino acids, rather than in a short linear sequence. Evidence is presented that although the consensus is much shorter in P. falciparum, information sufficient for vacuolar export is contained in a region of ~30 amino acids, which includes sequences flanking the HT core. Finally, positional conservation between Phytophthora RxLR and P. falciparum RxLx (E, D, Q) is consistent with the idea that the context of their presentation is constrained. These studies provide the first evidence to our knowledge that eukaryotic microbes share equivalent pathogenic HT signals and thus conserved mechanisms to access host cells across plant and animal kingdoms that may present unique targets for prophylaxis across divergent pathogens. PMID:16733545
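The motif descriptions in this abstract translate naturally into a pattern scan. The following is a minimal sketch, not the authors' pipeline: the function names, the toy sequence, and the 25-residue window default are assumptions chosen for illustration, and real effector prediction would also require signal-peptide detection.

```python
import re

# 'x' in the abstract means any residue, written as '.' in a regular expression.
RXLR = re.compile(r"R.LR")                # oomycete core described in the abstract
PLASMODIUM_HT = re.compile(r"R.L.[EDQ]")  # Plasmodium RxLx(E, D, or Q) core


def find_motifs(seq, pattern):
    """Return (1-based start, matched text) for every hit, including overlapping ones."""
    lookahead = re.compile(f"(?=({pattern.pattern}))")
    return [(m.start() + 1, m.group(1)) for m in lookahead.finditer(seq)]


def acidic_fraction_downstream(seq, motif_start, motif_len=4, window=25):
    """Fraction of E/D residues in the window that follows a motif (hypothetical helper)."""
    begin = motif_start - 1 + motif_len
    region = seq[begin:begin + window]
    if not region:
        return 0.0
    return sum(region.count(aa) for aa in "ED") / len(region)


# Made-up sequence for illustration only; it is not a real leader sequence.
toy = "MAAARSLRTEEDDAKRILSEGG"
rxlr_hits = find_motifs(toy, RXLR)
ht_hits = find_motifs(toy, PLASMODIUM_HT)
acidic = acidic_fraction_downstream(toy, rxlr_hits[0][0])
```

The lookahead wrapper is used so that overlapping matches are reported, which a plain `finditer` over the pattern would skip.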
Nadjar-Boger, Elisabeth; Maccatrozzo, Lisa; Radaelli, Giuseppe; Funkenstein, Bruria
2013-02-01
Myostatin (MSTN) is a member of the transforming growth factor-β superfamily, known as a negative regulator of skeletal muscle development and growth in mammals. In contrast to mammals, fish possess at least two paralogs of MSTN: MSTN-1 and MSTN-2. Here we describe the cloning and sequence analysis of spliced and precursor (unspliced) transcripts as well as the 5' flanking region of MSTN-2 from the marine fish Umbrina cirrosa (ucMSTN-2). In silico analysis revealed numerous putative cis regulatory elements including several E-boxes known as binding sites for myogenic transcription factors. Transient transfection experiments using non-muscle and muscle cell lines showed high transcriptional activity in muscle cells and in differentiated neural cells, in accordance with our previous findings in MSTN-2 promoter from Sparus aurata. Comparative informatics analysis of MSTN-2 from several fish species revealed high conservation of the predicted amino acid sequence as well as the gene structure (exon length) although intron length varied between species. The proximal promoter of MSTN-2 gene was found to be conserved among Perciforms. In conclusion, this study reinforces our conclusion that MSTN-2 promoter is a very strong promoter, especially in muscle cells. In addition, we show that the MSTN-2 gene structure is highly conserved among fishes as is the predicted amino acid sequence of the peptide. Copyright © 2012 Elsevier Inc. All rights reserved.
Sánchez-Navarro, J A; Pallás, V
1997-01-01
The complete nucleotide sequence of an isolate of prunus necrotic ringspot virus (PNRSV) RNA 3 has been determined. Elucidation of the amino acid sequence of the proteins encoded by the two large open reading frames (ORFs) allowed us to carry out comparative and phylogenetic studies on the movement (MP) and coat (CP) proteins in the ilarvirus group. Amino acid sequence comparison of the MP revealed a highly conserved basic sequence motif with an amphipathic alpha-helical structure preceding the conserved motif of the '30K superfamily' proposed by Mushegian and Koonin [26] for MPs. Within this '30K' motif a strictly conserved transmembrane domain is present in all ilarviruses sequenced so far. At the amino-terminal end, prune dwarf virus (PDV) has an extension not present in other ilarviruses but which is observed in all bromo- and cucumoviruses, suggesting a common ancestor or a recombinational event in the Bromoviridae family. Examination of the N-terminus of the CPs of all ilarviruses revealed a highly basic region, part of which resembles the Arg-rich motif that has been characterized in the RNA-binding protein family. This motif has also been found in the other members of the Bromoviridae family, suggesting its involvement in a structural function. Furthermore this region is required for infectivity in ilarviruses. The similarities found in this Arg-rich motif are discussed in terms of this process known as genome activation. Finally, phylogenetic analysis of both the MP and CP proteins revealed a higher relationship of AlMV to PNRSV, apple mosaic virus (ApMV) and PDV than any other member of the ilarvirus group. In that sense, AlMV should be considered as a true ilarvirus instead of forming a distinct group of viruses.
Georgi, Laura; Johnson-Cicalese, Jennifer; Honig, Josh; Das, Sushma Parankush; Rajah, Veeran D; Bhattacharya, Debashish; Bassil, Nahla; Rowland, Lisa J; Polashock, James; Vorsa, Nicholi
2013-03-01
The first genetic map of cranberry (Vaccinium macrocarpon) has been constructed, comprising 14 linkage groups totaling 879.9 cM with an estimated coverage of 82.2 %. This map, based on four mapping populations segregating for field fruit-rot resistance, contains 136 distinct loci. Mapped markers include blueberry-derived simple sequence repeat (SSR) and cranberry-derived sequence-characterized amplified region markers previously used for fingerprinting cranberry cultivars. In addition, SSR markers were developed near cranberry sequences resembling genes involved in flavonoid biosynthesis or defense against necrotrophic pathogens, or conserved orthologous set (COS) sequences. The cranberry SSRs were developed from next-generation cranberry genomic sequence assemblies; thus, the positions of these SSRs on the genomic map provide information about the genomic location of the sequence scaffold from which they were derived. The use of SSR markers near COS and other functional sequences, plus 33 SSR markers from blueberry, facilitates comparisons of this map with maps of other plant species. Regions of the cranberry map were identified that showed conservation of synteny with Vitis vinifera and Arabidopsis thaliana. Positioned on this map are quantitative trait loci (QTL) for field fruit-rot resistance (FFRR), fruit weight, titratable acidity, and sound fruit yield (SFY). The SFY QTL is adjacent to one of the fruit weight QTL and may reflect pleiotropy. Two of the FFRR QTL are in regions of conserved synteny with grape and span defense gene markers, and the third FFRR QTL spans a flavonoid biosynthetic gene.
USDA-ARS's Scientific Manuscript database
In this paper, we report the full-length coding sequence of bovine ATGL cDNA and analyze its expression in bovine tissues. Similar to human, mouse, and pig ATGL sequences, bovine ATGL has a highly conserved patatin domain that is necessary for lipolytic function in mice and humans. Thi...
Xing, Wen-Rui; Hou, Bei-Wei; Guan, Jing-Jiao; Luo, Jing; Ding, Xiao-Yu
2013-04-01
The LEAFY (LFY) homologous gene of Dendrobium moniliforme (L.) Sw. was cloned using new primers designed from the conserved region of known orchid LEAFY gene sequences. A partial LFY homologous gene was cloned by conventional PCR, and the complete LFY homologous gene, DenLFY, was then obtained by TAIL-PCR. The complete sequence of the DenLFY gene was 3,575 bp and contained three exons and two introns. Comparative analysis of the exons of LFY homologous genes using BLAST indicated that the DenLFY gene had high identity with orchid LFY homologues, including the related fragment of PhalLFY (84%) in a Phalaenopsis hybrid cultivar, the LFY homologous gene in Oncidium (90%), and those in other orchids (over 80%). Using MP analysis, Dendrobium was found to be sister to Oncidium and Phalaenopsis. Homology analysis demonstrated that the C-terminal amino acids were highly conserved. When the exons and introns were considered separately, the exons and the amino acid sequence were good markers for functional studies of the DenLFY gene. The second intron can be used in authentication research of Dendrobium based on the length polymorphism between Dendrobium moniliforme and Dendrobium officinale.
Protein structure based prediction of catalytic residues.
Fajardo, J Eduardo; Fiser, Andras
2013-02-22
Worldwide structural genomics projects continue to release new protein structures at an unprecedented pace, so far nearly 6000, but only about 60% of these proteins have any sort of functional annotation. We explored a range of features that can be used for the prediction of functional residues given a known three-dimensional structure. These features include various centrality measures of nodes in graphs of interacting residues: closeness, betweenness and page-rank centrality. We also analyzed the distance of functional amino acids to the general center of mass (GCM) of the structure, relative solvent accessibility (RSA), and the use of relative entropy as a measure of sequence conservation. From the selected features, neural networks were trained to identify catalytic residues. We found that using distance to the GCM together with amino acid type provides a good discriminant function, when combined independently with sequence conservation. Using an independent test set of 29 annotated protein structures, the method returned 411 of the initial 9262 residues as the most likely to be involved in function. The output 411 residues contain 70 of the annotated 111 catalytic residues. This represents an approximately 14-fold enrichment of catalytic residues on the entire input set (corresponding to a sensitivity of 63% and a precision of 17%), a performance competitive with that of other state-of-the-art methods. We found that several of the graph based measures utilize the same underlying feature of protein structures, which can be simply and more effectively captured with the distance to GCM definition. This also has the added advantage of simplicity and easy implementation. Meanwhile sequence conservation remains by far the most influential feature in identifying functional residues. We also found that due to the rapid changes in size and composition of sequence databases, conservation calculations must be recalibrated for specific reference databases.
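The enrichment, sensitivity, and precision figures quoted in this abstract follow directly from the four counts it reports. The short sketch below reproduces that arithmetic; the function name and dictionary keys are illustrative assumptions, and only the counts (9262, 411, 111, 70) come from the abstract.

```python
def enrichment_summary(total, selected, true_total, true_selected):
    """Sensitivity, precision, and fold enrichment over the background rate."""
    sensitivity = true_selected / true_total   # share of catalytic residues recovered
    precision = true_selected / selected       # share of returned residues that are catalytic
    baseline = true_total / total              # catalytic rate in the whole input set
    return {
        "sensitivity": sensitivity,
        "precision": precision,
        "baseline": baseline,
        "fold_enrichment": precision / baseline,
    }


# Counts as reported in the abstract.
stats = enrichment_summary(total=9262, selected=411, true_total=111, true_selected=70)
```

With these counts the sensitivity is 70/111 (about 63%), the precision is 70/411 (about 17%), and the precision is roughly 14 times the background rate of 111/9262, consistent with the values stated above.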
Sequence Diversity Diagram for comparative analysis of multiple sequence alignments.
Sakai, Ryo; Aerts, Jan
2014-01-01
The sequence logo is a graphical representation of a set of aligned sequences, commonly used to depict conservation of amino acid or nucleotide sequences. Although it effectively communicates the amount of information present at every position, this visual representation falls short when the domain task is to compare between two or more sets of aligned sequences. We present a new visual presentation called a Sequence Diversity Diagram and validate our design choices with a case study. Our software was developed using the open-source program called Processing. It loads multiple sequence alignment FASTA files and a configuration file, which can be modified as needed to change the visualization. The redesigned figure improves on the visual comparison of two or more sets, and it additionally encodes information on sequential position conservation. In our case study of the adenylate kinase lid domain, the Sequence Diversity Diagram reveals unexpected patterns and new insights, for example the identification of subgroups within the protein subfamily. Our future work will integrate this visual encoding into interactive visualization tools to support higher level data exploration tasks.
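For readers unfamiliar with the quantity a sequence logo displays, the per-position information content can be sketched as the maximum possible entropy minus the observed Shannon entropy of the column. This is an illustrative assumption-laden sketch, not the Sequence Diversity Diagram software: the function names are hypothetical, gaps are simply ignored, and the small-sample correction that logo tools often apply is omitted.

```python
import math
from collections import Counter


def column_information(column, alphabet_size):
    """Information content in bits of one alignment column: log2(alphabet) - H."""
    residues = [c for c in column if c != "-"]   # gaps are ignored in this sketch
    if not residues:
        return 0.0
    n = len(residues)
    entropy = -sum((k / n) * math.log2(k / n) for k in Counter(residues).values())
    return math.log2(alphabet_size) - entropy


def alignment_information(aligned, alphabet_size=20):
    """Per-column information content for equal-length aligned sequences."""
    return [column_information(col, alphabet_size) for col in zip(*aligned)]


# Toy nucleotide alignment (alphabet of 4), invented for illustration.
toy_alignment = ["ACGT", "ACGA", "ACTA", "AGTA"]
info = alignment_information(toy_alignment, alphabet_size=4)
```

A fully conserved nucleotide column scores the maximum of 2 bits, while a column split evenly between two bases scores 1 bit.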
Selection of the simplest RNA that binds isoleucine
LOZUPONE, CATHERINE; CHANGAYIL, SHANKAR; MAJERFELD, IRENE; YARUS, MICHAEL
2003-01-01
We have identified the simplest RNA binding site for isoleucine using selection-amplification (SELEX), by shrinking the size of the randomized region until affinity selection is extinguished. Such a protocol can be useful because selection does not necessarily make the simplest active motif most prominent, as is often assumed. We find an isoleucine binding site that behaves exactly as predicted for the site that requires fewest nucleotides. This UAUU motif (16 highly conserved positions; 27 total), is also the most abundant site in successful selections on short random tracts. The UAUU site, now isolated independently at least 63 times, is a small asymmetric internal loop. Conserved loop sequences include isoleucine codon and anticodon triplets, whose nucleotides are required for amino acid binding. This reproducible association between isoleucine and its coding sequences supports the idea that the genetic code is, at least in part, a stereochemical residue of the most easily isolated RNA–amino acid binding structures. PMID:14561881
Bricheux, G; Brugerolle, G
1997-08-01
The parasitic protozoan Trichomonas vaginalis is known to contain the ubiquitous and highly conserved protein actin. A genomic library and a cDNA library have been screened to identify and clone the actin gene(s) of T. vaginalis. The nucleotide sequence of one gene and its flanking regions have been determined. The open reading frame encodes a protein of 376 amino acids. The sequence is not interrupted by any introns and the promoter could be represented by a 10 bp motif close to a consensus motif also found upstream of most sequenced T. vaginalis genes. The five different clones isolated from the cDNA library have similar sequences and encode three actin proteins differing only by one or two amino acids. A phylogenetic analysis of 31 actin sequences by distance matrix and parsimony methods, using centractin as outgroup, gives congruent trees with Parabasala branching above Diplomonadida.
Cloning and characterization of the gene encoding IMP dehydrogenase from Arabidopsis thaliana.
Collart, F R; Osipiuk, J; Trent, J; Olsen, G J; Huberman, E
1996-10-03
We have cloned and characterized the gene encoding inosine monophosphate dehydrogenase (IMPDH) from Arabidopsis thaliana (At). The transcription unit of the At gene spans approximately 1900 bp and specifies a protein of 503 amino acids with a calculated relative molecular mass (M(r)) of 54,190. The gene is comprised of a minimum of four introns and five exons with all donor and acceptor splice sequences conforming to previously proposed consensus sequences. The deduced IMPDH amino-acid sequence from At shows a remarkable similarity to other eukaryotic IMPDH sequences, with a 48% identity to human Type II enzyme. Allowing for conservative substitutions, the enzyme is 69% similar to human Type II IMPDH. The putative active-site sequence of At IMPDH conforms to the IMP dehydrogenase/guanosine monophosphate reductase motif and contains an essential active-site cysteine residue.
NASA Astrophysics Data System (ADS)
Prasetyo, Afiono Agung; Dharmawan, Ruben; Sari, Yulia; Sariyatun, Ratna
2017-02-01
Human immunodeficiency virus type 1 (HIV-1) remains a global health problem. Continuous studies of HIV-1 genetic and immunological profiles are important to find strategies against the virus. This study aimed to analyse sequence conservation, the HLA-E-restricted peptide, and best-defined CTL/CD8+ epitopes in p24 (capsid) of HIV-1 subtype B worldwide. The p24-coding sequences from 3,557 HIV subtype B isolates were aligned using MUSCLE and analysed. Some highly conserved regions (sequence conservation ≥95%) were observed. Two considerably long series of sequences with conservation of 100% were observed at bases 349-356 and 550-557 of p24 (HXB2 numbering). The consensus from all aligned isolates was precisely the same as consensus B in the Los Alamos HIV Database. The HLA-E-restricted peptide in amino acid (aa) 14-22 of HIV-1 p24 (AISPRTLNA) was found in 55.9% (1,987/3,557) of HIV-1 subtype B worldwide. Forty-four best-defined CTL/CD8+ epitopes were observed, of which the VKNWMTETL epitope (aa 181-189 of p24) restricted by B*4801 was the most frequent, found in 94.9% of isolates. The results of this study contribute information about HIV-1 subtype B and may benefit further work to develop diagnostic and therapeutic strategies against the virus.
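The kind of per-position conservation analysis described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' method: conservation is taken here as the percentage of sequences carrying the most common character in a column, gap characters are counted like any other character, and the function names and toy alignment are invented.

```python
from collections import Counter


def column_conservation(aligned):
    """Percent of sequences carrying the most common character in each column."""
    n = len(aligned)
    return [100.0 * Counter(col).most_common(1)[0][1] / n for col in zip(*aligned)]


def conserved_runs(scores, threshold=95.0, min_len=1):
    """1-based inclusive (start, end) runs of columns scoring at or above threshold."""
    runs, start = [], None
    for pos, score in enumerate(scores, start=1):
        if score >= threshold:
            if start is None:
                start = pos
        else:
            if start is not None and pos - start >= min_len:
                runs.append((start, pos - 1))
            start = None
    if start is not None and len(scores) - start + 1 >= min_len:
        runs.append((start, len(scores)))
    return runs


# Toy alignment invented for illustration; real input would be a MUSCLE alignment.
toy = ["ACGTAC", "ACGTTC", "ACCTAC", "ACGTAC"]
scores = column_conservation(toy)
runs_any = conserved_runs(scores, threshold=95.0)
runs_long = conserved_runs(scores, threshold=95.0, min_len=2)
```

Raising `min_len` restricts the output to longer conserved stretches, in the spirit of the 100% conserved runs reported above.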
Chen, Chih-Ying; Brodsky, Frances M
2005-02-18
Clathrin heavy and light chains form triskelia, which assemble into polyhedral coats of membrane vesicles that mediate transport for endocytosis and organelle biogenesis. Light chain subunits regulate clathrin assembly in vitro by suppressing spontaneous self-assembly of the heavy chains. The residues that play this regulatory role are at the N terminus of a conserved 22-amino acid sequence that is shared by all vertebrate light chains. Here we show that these regulatory residues and others in the conserved sequence mediate light chain interaction with Hip1 and Hip1R. These related proteins were previously found to be enriched in clathrin-coated vesicles and to promote clathrin assembly in vitro. We demonstrate Hip1R binding preference for light chains associated with clathrin heavy chain and show that Hip1R stimulation of clathrin assembly in vitro is blocked by mutations in the conserved sequence of light chains that abolish interaction with Hip1 and Hip1R. In vivo overexpression of a fragment of clathrin light chain comprising the Hip1R-binding region affected cellular actin distribution. Together these results suggest that the roles of Hip1 and Hip1R in affecting clathrin assembly and actin distribution are mediated by their interaction with the conserved sequence of clathrin light chains.
Two different groups of signal sequence in M-superfamily conotoxins.
Wang, Qi; Jiang, Hui; Han, Yu-Hong; Yuan, Duo-Duo; Chi, Cheng-Wu
2008-04-01
M-superfamily conotoxins can be divided into four branches (M-1, M-2, M-3 and M-4) according to the number of amino acid residues in the third Cys loop. In general, it is widely accepted that the conotoxin signal peptides of each superfamily are strictly conserved. Recently, we cloned six cDNAs of novel M-superfamily conotoxins from Conus leopardus, Conus marmoreus and Conus quercinus, belonging to either the M-1 or the M-3 branch. These conotoxins, judging from the putative peptide sequences deduced from cDNAs, are rich in acidic residues and share highly conserved signal and pro-peptide regions. However, they are quite different from the reported conotoxins of the M-2 and M-4 branches even in their signal peptides, which in general are considered highly conserved for each superfamily of conotoxins. The signal sequences of M-1 and M-3 conotoxins, composed of 24 residues, start with MLKMGVVL-, while those of M-2 and M-4 conotoxins, composed of 25 residues, start with MMSKLGVL-. This is another example, besides the I-conotoxin superfamily, showing that different types of signal peptides can exist within a superfamily. In addition to the different disulfide connectivity of M-1 conotoxins from that of M-4 or M-2 conotoxins, the sequence alignment, preferential Cys codon usage and phylogenetic tree analysis suggest that M-1 and M-3 conotoxins have a much closer relationship, being different from the conotoxins of the other two branches (M-4 and M-2) of the M-superfamily.
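The two signal-sequence starts reported in this abstract suggest a simple prefix check for sorting precursor sequences into the two groups. The sketch below is only an illustration: the function name, labels, and example sequences are hypothetical, and only the two prefixes and the 24- and 25-residue signal lengths come from the abstract.

```python
# Prefixes and signal lengths as stated in the abstract; labels are illustrative.
SIGNAL_PREFIXES = {
    "MLKMGVVL": "M-1/M-3 group (24-residue signal)",
    "MMSKLGVL": "M-2/M-4 group (25-residue signal)",
}


def classify_signal(precursor):
    """Assign a precursor to a signal-sequence group by its N-terminal prefix."""
    sequence = precursor.strip().upper()
    for prefix, label in SIGNAL_PREFIXES.items():
        if sequence.startswith(prefix):
            return label
    return "unclassified"


# Invented sequences: only the first eight residues are taken from the abstract.
example_a = classify_signal("MLKMGVVLAAAAAAAA")
example_b = classify_signal("mmsklgvlAAAAAAAA")
example_c = classify_signal("MKLTCVVIAAAAAAAA")
```

An exact prefix match is deliberately strict; a real analysis would more likely align full signal regions and tolerate substitutions.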
Cloning a Chymotrypsin-Like 1 (CTRL-1) Protease cDNA from the Jellyfish Nemopilema nomurai
Heo, Yunwi; Kwon, Young Chul; Bae, Seong Kyeong; Hwang, Duhyeon; Yang, Hye Ryeon; Choudhary, Indu; Lee, Hyunkyoung; Yum, Seungshic; Shin, Kyoungsoon; Yoon, Won Duk; Kang, Changkeun; Kim, Euikyung
2016-01-01
An enzyme in a nematocyst extract of the Nemopilema nomurai jellyfish, caught off the coast of the Republic of Korea, catalyzed the cleavage of chymotrypsin substrate in an amidolytic kinetic assay, and this activity was inhibited by the serine protease inhibitor, phenylmethanesulfonyl fluoride. We isolated the full-length cDNA sequence of this enzyme, which contains 850 nucleotides, with an open reading frame of 801 encoding 266 amino acids. A blast analysis of the deduced amino acid sequence showed 41% identity with human chymotrypsin-like (CTRL) and the CTRL-1 precursor. Therefore, we designated this enzyme N. nomurai CTRL-1. The primary structure of N. nomurai CTRL-1 includes a leader peptide and a highly conserved catalytic triad of His69, Asp117, and Ser216. The disulfide bonds of chymotrypsin and the substrate-binding sites are highly conserved compared with the CTRLs of other species, including mammalian species. Nemopilema nomurai CTRL-1 is evolutionarily more closely related to Actinopterygii than to Scyphozoan (Aurelia aurita) or Hydrozoan (Hydra vulgaris). The N. nomurai CTRL1 was amplified from the genomic DNA with PCR using specific primers designed based on the full-length cDNA, and then sequenced. The N. nomurai CTRL1 gene contains 2434 nucleotides and four distinct exons. The 5′ donor splice (GT) and 3′ acceptor splice sequences (AG) are wholly conserved. This is the first report of the CTRL1 gene and cDNA structures in the jellyfish N. nomurai. PMID:27399771
Campion, S R; Ameen, A S; Lai, L; King, J M; Munzenmaier, T N
2001-08-15
This report describes the application of a simple computational tool, AAPAIR.TAB, for the systematic analysis of the cysteine-rich EGF, Sushi, and Laminin motif/sequence families at the two-amino acid level. Automated dipeptide frequency/bias analysis detects preferences in the distribution of amino acids in established protein families, by determining which "ordered dipeptides" occur most frequently in comprehensive motif-specific sequence data sets. Graphic display of the dipeptide frequency/bias data revealed family-specific preferences for certain dipeptides, but more importantly detected a shared preference for employment of the ordered dipeptides Gly-Tyr (GY) and Gly-Phe (GF) in all three protein families. The dipeptide Asn-Gly (NG) also exhibited high-frequency and bias in the EGF and Sushi motif families, whereas Asn-Thr (NT) was distinguished in the Laminin family. Evaluation of the distribution of dipeptides identified by frequency/bias analysis subsequently revealed the highly restricted localization of the G(F/Y) and N(G/T) sequence elements at two separate sites of extreme conservation in the consensus sequence of all three sequence families. The similar employment of the high-frequency/bias dipeptides in three distinct protein sequence families was further correlated with the concurrence of these shared molecular determinants at similar positions within the distinctive scaffolds of three structurally divergent, but similarly employed, motif modules.
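Ordered dipeptide counting of the kind described in this abstract is straightforward to sketch. The code below is not AAPAIR.TAB, whose exact bias definition is not given here: the observed/expected ratio based on single-residue frequencies is an assumed stand-in for "bias", and the function names and toy sequences are invented.

```python
from collections import Counter


def dipeptide_counts(sequences):
    """Count ordered dipeptides (overlapping, within each sequence only)."""
    counts = Counter()
    for seq in sequences:
        counts.update(seq[i:i + 2] for i in range(len(seq) - 1))
    return counts


def dipeptide_bias(sequences):
    """Observed/expected ratio per dipeptide; expected assumes independent residues."""
    pair_counts = dipeptide_counts(sequences)
    residue_counts = Counter("".join(sequences))
    total_pairs = sum(pair_counts.values())
    total_residues = sum(residue_counts.values())
    bias = {}
    for pair, observed in pair_counts.items():
        p_first = residue_counts[pair[0]] / total_residues
        p_second = residue_counts[pair[1]] / total_residues
        bias[pair] = observed / (p_first * p_second * total_pairs)
    return bias


# Toy sequences invented for illustration.
toy = ["CGYC", "NGYA"]
counts = dipeptide_counts(toy)
bias = dipeptide_bias(toy)
```

Because the dipeptides are ordered, GY and YG are tallied separately, which matches the "ordered dipeptides" wording of the abstract.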
The ConSurf-DB: pre-calculated evolutionary conservation profiles of protein structures.
Goldenberg, Ofir; Erez, Elana; Nimrod, Guy; Ben-Tal, Nir
2009-01-01
ConSurf-DB is a repository for evolutionary conservation analysis of the proteins of known structures in the Protein Data Bank (PDB). Sequence homologues of each of the PDB entries were collected and aligned using standard methods. The evolutionary conservation of each amino acid position in the alignment was calculated using the Rate4Site algorithm, implemented in the ConSurf web server. The algorithm takes into account the phylogenetic relations between the aligned proteins and the stochastic nature of the evolutionary process explicitly. Rate4Site assigns a conservation level for each position in the multiple sequence alignment using an empirical Bayesian inference. Visual inspection of the conservation patterns on the 3D structure often enables the identification of key residues that comprise the functionally important regions of the protein. The repository is updated with the latest PDB entries on a monthly basis and will be rebuilt annually. ConSurf-DB is available online at http://consurfdb.tau.ac.il/ PMID:18971256
Zhang, Jing-Nan; Song, Ping; Hu, Jia-Rui; Mo, Sai-Jun; Peng, Mao-Yu; Zhou, Wei; Zou, Ji-Xing; Hu, Yin-Chang
2005-01-01
In this study, full-length cDNAs of the growth hormone (GH) gene were isolated from six economically important fishes: Siniperca kneri, Epinephelus coioides, Monopterus albus, Silurus asotus, Misgurnus anguillicaudatus and Carassius auratus gibelio Bloch. Except for E. coioides GH, this is the first time these GH sequences have been cloned. The lengths of the cDNAs are, respectively, 953 bp, 1,023 bp, 825 bp, 1,082 bp, 1,154 bp and 1,180 bp. Each sequence includes an ORF of about 600 bp that encodes a protein of about 200 amino acids: 204 amino acids for the S. kneri, E. coioides and M. albus GHs, 200 amino acids for S. asotus GH, and 210 amino acids for the M. anguillicaudatus and C. auratus gibelio GHs. A detailed sequence analysis of the six GHs together with many other fish GH sequences was then performed. All six sequences showed high homology to other sequences, especially to sequences within the same order, and many conserved residues were identified, most of them localized in five domains. The phylogenetic trees (MP and NJ) of many fish GH ORF sequences (including the six new ones), with Amia calva as outgroup, were generally resolved and largely congruent with the morphology-based tree, although some incongruities were observed, suggesting that the GH ORF deserves more attention in teleostean phylogeny.
Lahr, Roni M; Mack, Seshat M; Héroux, Annie; Blagden, Sarah P; Bousquet-Antonelli, Cécile; Deragon, Jean-Marc; Berman, Andrea J
2015-09-18
La-related protein 1 (LARP1) regulates the stability of many mRNAs. These include 5'TOPs, mTOR-kinase responsive mRNAs with pyrimidine-rich 5' UTRs, which encode ribosomal proteins and translation factors. We determined that the highly conserved LARP1-specific C-terminal DM15 region of human LARP1 directly binds a 5'TOP sequence. The crystal structure of this DM15 region refined to 1.86 Å resolution has three structurally related and evolutionarily conserved helix-turn-helix modules within each monomer. These motifs resemble HEAT repeats, ubiquitous helical protein-binding structures, but their sequences are inconsistent with consensus sequences of known HEAT modules, suggesting this structure has been repurposed for RNA interactions. A putative mTORC1-recognition sequence sits within a flexible loop C-terminal to these repeats. We also present modelling of pyrimidine-rich single-stranded RNA onto the highly conserved surface of the DM15 region. These studies lay the foundation necessary for proceeding toward a structural mechanism by which LARP1 links mTOR signalling to ribosome biogenesis. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Casillas, Rosario; Tabernero, David; Gregori, Josep; Belmonte, Irene; Cortese, Maria Francesca; González, Carolina; Riveiro-Barciela, Mar; López, Rosa Maria; Quer, Josep; Esteban, Rafael; Buti, Maria; Rodríguez-Frías, Francisco
2018-01-01
AIM To determine the variability/conservation of the domain of hepatitis B virus (HBV) preS1 region that interacts with sodium-taurocholate cotransporting polypeptide (hereafter, NTCP-interacting domain) and the prevalence of the rs2296651 polymorphism (S267F, NTCP variant) in a Spanish population. METHODS Serum samples from 246 individuals were included and divided into 3 groups: patients with chronic HBV infection (CHB) (n = 41, 73% Caucasians), patients with resolved HBV infection (n = 100, 100% Caucasians) and an HBV-uninfected control group (n = 105, 100% Caucasians). Variability/conservation of the amino acid (aa) sequences of the NTCP-interacting domain, (aa 2-48 in viral genotype D) and a highly conserved preS1 domain associated with virion morphogenesis (aa 92-103 in viral genotype D) were analyzed by next-generation sequencing and compared in 18 CHB patients with viremia > 4 log IU/mL. The rs2296651 polymorphism was determined in all individuals in all 3 groups using an in-house real-time PCR melting curve analysis. RESULTS The HBV preS1 NTCP-interacting domain showed a high degree of conservation among the examined viral genomes especially between aa 9 and 21 (in the genotype D consensus sequence). As compared with the virion morphogenesis domain, the NTCP-interacting domain had a smaller proportion of HBV genotype-unrelated changes comprising > 1% of the quasispecies (25.5% vs 31.8%), but a larger proportion of genotype-associated viral polymorphisms (34% vs 27.3%), according to consensus sequences from GenBank patterns of HBV genotypes A to H. Variation/conservation in both domains depended on viral genotype, with genotype C being the most highly conserved and genotype E the most variable (limited finding, only 2 genotype E included). Of note, proline residues were highly conserved in both domains, and serine residues showed changes only to threonine or tyrosine in the virion morphogenesis domain. 
The rs2296651 polymorphism was not detected in any participant. CONCLUSION In our CHB population, the NTCP-interacting domain was highly conserved, particularly the proline residues and the essential amino acids involved in the NTCP interaction, and the prevalence of rs2296651 was low/null. PMID:29456407
Korber, B T; Kunstman, K J; Patterson, B K; Furtado, M; McEvilly, M M; Levy, R; Wolinsky, S M
1994-01-01
Human immunodeficiency virus type 1 (HIV-1) sequences were generated from blood and from brain tissue obtained by stereotactic biopsy from six patients undergoing a diagnostic neurosurgical procedure. Proviral DNA was directly amplified by nested PCR, and 8 to 36 clones from each sample were sequenced. Phylogenetic analysis of intrapatient envelope V3-V5 region HIV-1 DNA sequence sets revealed that brain viral sequences were clustered relative to the blood viral sequences, suggestive of tissue-specific compartmentalization of the virus in four of the six cases. In the other two cases, the blood and brain virus sequences were intermingled in the phylogenetic analyses, suggesting trafficking of virus between the two tissues. Slide-based PCR-driven in situ hybridization of two of the patients' brain biopsy samples confirmed our interpretation of the intrapatient phylogenetic analyses. Interpatient V3 region brain-derived sequence distances were significantly less than blood-derived sequence distances. Relative to the tip of the loop, the set of brain-derived viral sequences had a tendency towards negative or neutral charge compared with the set of blood-derived viral sequences. Entropy calculations were used as a measure of the variability at each position in alignments of blood and brain viral sequences. A relatively conserved set of positions was found, with a significantly lower entropy in the brain- than in the blood-derived viral sequences. These sites constitute a brain "signature pattern," or a noncontiguous set of amino acids in the V3 region conserved in viral sequences derived from brain tissue. This brain-derived signature pattern was also well preserved among isolates previously characterized in vitro as macrophage tropic. Macrophage-monocyte tropism may be the biological constraint that results in the conservation of the viral brain signature pattern. PMID:7933130
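The per-position entropy measure mentioned in this abstract is commonly computed as Shannon entropy over the residues in each alignment column. A minimal sketch follows, assuming base-2 logarithms and treating a gap character as an ordinary symbol (both are choices made for this illustration, not details given in the abstract); the function names and the toy sequences are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the residues in one alignment column."""
    counts = Counter(column)
    n = len(column)
    # p * log2(1/p), summed over the distinct residues in the column
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def alignment_entropy(aligned):
    """Entropy at every position of a list of equal-length aligned sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy(col) for col in zip(*aligned)]


# Toy alignment, not real V3 loop data
toy = ["GPGR", "GPGK", "GPGR", "APGQ"]
for position, h in enumerate(alignment_entropy(toy), start=1):
    print(position, round(h, 3))
```

A fully conserved column has entropy 0, and the value grows as more residues share the position more evenly. Comparing the per-position values of two alignments (for example, brain-derived versus blood-derived sets) is then a position-by-position comparison of these lists.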
Lu, Hong; Patil, Prabhu; Van Sluys, Marie-Anne; White, Frank F; Ryan, Robert P; Dow, J Maxwell; Rabinowicz, Pablo; Salzberg, Steven L; Leach, Jan E; Sonti, Ramesh; Brendel, Volker; Bogdanove, Adam J
2008-01-01
Xanthomonas is a large genus of plant-associated and plant-pathogenic bacteria. Collectively, members cause diseases on over 392 plant species. Individually, they exhibit marked host- and tissue-specificity. The determinants of this specificity are unknown. To assess potential contributions to host- and tissue-specificity, pathogenesis-associated gene clusters were compared across genomes of eight Xanthomonas strains representing vascular or non-vascular pathogens of rice, brassicas, pepper and tomato, and citrus. The gum cluster for extracellular polysaccharide is conserved except for gumN and sequences downstream. The xcs and xps clusters for type II secretion are conserved, except in the rice pathogens, in which xcs is missing. In the otherwise conserved hrp cluster, sequences flanking the core genes for type III secretion vary with respect to insertion sequence element and putative effector gene content. Variation at the rpf (regulation of pathogenicity factors) cluster is more pronounced, though genes with established functional relevance are conserved. A cluster for synthesis of lipopolysaccharide varies highly, suggesting multiple horizontal gene transfers and reassortments, but this variation does not correlate with host- or tissue-specificity. Phylogenetic trees based on amino acid alignments of gum, xps, xcs, hrp, and rpf cluster products generally reflect strain phylogeny. However, amino acid residues at four positions correlate with tissue specificity, revealing hpaA and xpsD as candidate determinants. Examination of genome sequences of xanthomonads Xylella fastidiosa and Stenotrophomonas maltophilia revealed that the hrp, gum, and xcs clusters are recent acquisitions in the Xanthomonas lineage. 
Our results provide insight into the ancestral Xanthomonas genome and indicate that differentiation with respect to host- and tissue-specificity involved not major modifications or wholesale exchange of clusters, but subtle changes in a small number of genes or in non-coding sequences, and/or differences outside the clusters, potentially among regulatory targets or secretory substrates.
Shayan, P; Jafari, S; Fattahi, R; Ebrahimzade, E; Amininia, N; Changizi, E
2016-05-01
Ovine theileriosis is an important hemoprotozoal disease of sheep and goats in tropical and subtropical regions that causes high economic losses in the livestock industry. Theileria annulata surface protein (TaSp) was used previously as a tool for serological analysis in livestock. Since the amino acid sequence of TaSp is, at least in part, highly conserved in T. annulata, Theileria lestoquardi and Theileria china I and II, it is very important to determine the amino acid sequence of this protein in Theileria ovis as well, to avoid misinterpretation of serological data based on this protein in small animals. In the present study, the nucleotide sequence and amino acid sequence of T. ovis surface protein (ToSp) were determined. The comparison of the nucleotide sequence of ToSp showed 96, 96, 99, and 86 % homology to the corresponding nucleotide sequences of the TaSp genes of T. annulata, T. China I, T. China II and T. lestoquardi, previously registered in GenBank under accession nos. AJ316260.1, AY274329.1, DQ120058.1, and EF092924.1, respectively. The amino acid sequence analysis showed 95, 81, 98 and 70 % homology to the corresponding amino acid sequences of T. annulata, T. China I, T. China II and T. lestoquardi, registered in GenBank under accession nos. CAC87478.1, AAP36993.1, AAZ30365.1 and AAP36999.11, respectively. Interestingly, in contrast to the C terminus, a significant difference in amino acid sequence was found in the N terminus of the ToSp protein compared with the other known corresponding TaSp sequences, which makes this region attractive for designing a suitable tool for serological diagnosis.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aho, Hanne; Schwemmer, M.; Tessmann, D.
1996-03-01
The mitochondrial capsule selenoprotein (MCS) (HGMW-approved symbol MCSP) is one of three proteins that are important for the maintenance and stabilization of the crescent structure of the sperm mitochondria. We describe here the isolation of a cDNA, the exon-intron organization, the expression, and the chromosomal localization of the human MCS gene. Nucleotide sequence analysis of the human and mouse MCS cDNAs reveals that the 5'- and 3'-untranslated sequences are more conserved (71%) than the coding sequences (59%). The open reading frame encodes a 116-amino-acid protein and lacks the UGA codons, which have been reported to encode the selenocysteines in the N-terminal of the deduced mouse protein. The deduced human protein shows a low degree of amino acid sequence identity to the mouse protein (39%). The most striking homology lies in the dicysteine motifs. Northern and Southern zooblot analyses reveal that the MCS gene in human, baboon, and bovine is more conserved than its counterparts in mouse and rat. The single intron in the human MCS gene is approximately 6 kb and interrupts the 5'-untranslated region at a position equivalent to that in the mouse and rat genes. Northern blot and in situ hybridization experiments demonstrate that the expression of the human MCS gene is restricted to haploid spermatids. The human gene was assigned to q21 of chromosome 1. 30 refs., 9 figs.
Elrobh, Mohamed S.; Alanazi, Mohammad S.; Khan, Wajahatullah; Abduljaleel, Zainularifeen; Al-Amri, Abdullah; Bazzi, Mohammad D.
2011-01-01
Heat shock proteins are ubiquitous, induced under a number of environmental and metabolic stresses, with highly conserved DNA sequences among mammalian species. Camelus dromedarius (the Arabian camel), domesticated under semi-desert environments, is well adapted to tolerate and survive severe drought and high temperatures for extended periods. This is the first report of molecular cloning and characterization of a full-length cDNA encoding a putative stress-induced heat shock HSPA6 protein (also called HSP70B′) from the Arabian camel. A full-length cDNA (2417 bp) was obtained by rapid amplification of cDNA ends (RACE) and cloned in pET-b expression vector. Sequence analysis of the HSPA6 gene showed a 1932 bp-long open reading frame encoding 643 amino acids. The complete cDNA sequence of the Arabian camel HSPA6 gene was submitted to NCBI GenBank (accession number HQ214118.1). BLAST analysis indicated that the C. dromedarius HSPA6 gene shares high nucleotide similarity (77–91%) with the heat shock genes of other mammals. The deduced 643-amino-acid sequence (accession number ADO12067.1) showed that the predicted protein has an estimated molecular weight of 70.5 kDa and a predicted isoelectric point (pI) of 6.0. Comparative analyses of the camel HSPA6 protein sequence with other mammalian heat shock proteins (HSPs) showed high identity (80–94%). The camel HSPA6 protein structure predicted by 3D structural analysis showed high similarity to human and mouse HSPs. Taken together, this study indicates that the cDNA sequence of the HSPA6 gene and its amino acid sequence and protein structure from the Arabian camel are highly conserved and similar to those of other mammalian species. PMID:21845074
Roles of JnRAP2.6-like from the transition zone of black walnut in hormone signaling
Zhonglian Huang; Peng Zhao; Jose Medina; Richard Meilan; Keith Woeste
2013-01-01
An EST sequence, designated JnRAP2-like, was isolated from tissue at the heartwood/sapwood transition zone (TZ) in black walnut (Juglans nigra L). The deduced amino acid sequence of JnRAP2-like protein consists of a single AP2-containing domain with significant similarity to conserved AP2/ERF DNA-binding domains in other...
A TALE-inspired computational screen for proteins that contain approximate tandem repeats.
Perycz, Malgorzata; Krwawicz, Joanna; Bochtler, Matthias
2017-01-01
TAL (transcription activator-like) effectors (TALEs) are bacterial proteins that are secreted from bacteria to plant cells to act as transcriptional activators. TALEs and related proteins (RipTALs, BurrH, MOrTL1 and MOrTL2) contain approximate tandem repeats that differ in conserved positions that define specificity. Using PERL, we screened ~47 million protein sequences for TALE-like architecture characterized by approximate tandem repeats (between 30 and 43 amino acids in length) and sequence variability in conserved positions, without requiring sequence similarity to TALEs. Candidate proteins were scored according to their propensity for nuclear localization, secondary structure, repeat sequence complexity, as well as covariation and predicted structural proximity of variable residues. Biological context was tentatively inferred from co-occurrence of other domains and interactome predictions. Approximate repeats with TALE-like features that merit experimental characterization were found in a protein of chestnut blight fungus, a eukaryotic plant pathogen. PMID:28617832
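The idea of detecting approximate tandem repeats of a given length range can be illustrated with a simple self-comparison. The sketch below is not the authors' Perl screen: it only checks how often a residue equals the residue one candidate period downstream, and the function name, the 0.8 identity cutoff, and the toy sequence are assumptions made for this example. The period range of 30 to 43 residues is taken from the abstract.

```python
def approximate_repeat_periods(seq, min_period=30, max_period=43, min_identity=0.8):
    """Return (period, identity) pairs where seq resembles itself shifted by period.

    identity = fraction of positions i with seq[i] == seq[i + period].
    Periods that do not fit at least two full repeat units are skipped.
    """
    hits = []
    for period in range(min_period, max_period + 1):
        overlap = len(seq) - period
        if overlap < period:
            continue
        matches = sum(seq[i] == seq[i + period] for i in range(overlap))
        identity = matches / overlap
        if identity >= min_identity:
            hits.append((period, identity))
    return hits


# Toy input: three 34-residue units, the middle one differing at two positions
unit = "LTPEQVVAIASNGGGKQALETVQRLLPVLCQAHG"
variant = unit[:11] + "HD" + unit[13:]
toy = unit + variant + unit
best = max(approximate_repeat_periods(toy), key=lambda hit: hit[1])
print(best[0])  # period with the highest self-identity
```

The positions that mismatch at the best period are the candidates for the variable, specificity-defining residues described in the abstract. A real screen would also need to handle insertions and deletions between units, which this position-by-position comparison does not.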
Falk, K.; Batts, W.N.; Kvellestad, A.; Kurath, G.; Wiik-Nielsen, J.; Winton, J.R.
2008-01-01
Atlantic salmon paramyxovirus (ASPV) was isolated in 1995 from gills of farmed Atlantic salmon suffering from proliferative gill inflammation. The complete genome sequence of ASPV was determined, revealing a genome 16,968 nucleotides in length consisting of six non-overlapping genes coding for the nucleo- (N), phospho- (P), matrix- (M), fusion- (F), haemagglutinin-neuraminidase- (HN) and large polymerase (L) proteins in the order 3'-N-P-M-F-HN-L-5'. The various conserved features related to virus replication found in most paramyxoviruses were also found in ASPV. These include: conserved and complementary leader and trailer sequences, tri-nucleotide intergenic regions and highly conserved transcription start and stop signal sequences. The P gene expression strategy of ASPV was like that of the respiro-, morbilli- and henipaviruses, which express the P and C proteins from the primary transcript and edit a portion of the mRNA to encode V and W proteins. Sequence similarities among various features related to virus replication, pairwise comparisons of all deduced ASPV protein sequences with homologous regions from other members of the family Paramyxoviridae, and phylogenetic analyses of these amino acid sequences suggested that ASPV was a novel member of the sub-family Paramyxovirinae, most closely related to the respiroviruses. © 2008 Elsevier B.V. All rights reserved.
Otikovs, Martins; Chen, Gefei; Nordling, Kerstin; Landreh, Michael; Meng, Qing; Jörnvall, Hans; Kronqvist, Nina; Rising, Anna; Johansson, Jan; Jaudzems, Kristaps
2015-08-17
Conversion of spider silk proteins from soluble dope to insoluble fibers involves pH-dependent dimerization of the N-terminal domain (NT). This conversion is tightly regulated to prevent premature precipitation and enable rapid silk formation at the end of the duct. Three glutamic acid residues that mediate this process in the NT from Euprosthenops australis major ampullate spidroin 1 are well conserved among spidroins. However, NTs of minor ampullate spidroins from several species, including Araneus ventricosus ((Av)MiSp NT), lack one of the glutamic acids. Here we investigate the pH-dependent structural changes of (Av)MiSp NT, revealing that it uses the same mechanism but involves a non-conserved glutamic acid residue instead. Homology modeling of the structures of other MiSp NTs suggests that these harbor different compensatory residues. This indicates that, despite sequence variations, the molecular mechanism underlying pH-dependent dimerization of NT is conserved among different silk types. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Wang, Yongkang; Song, Xiaodan; Li, Xiaorong; Yang, Sang-tian; Zou, Xiang
2017-01-04
To explore the genome sequence of Aureobasidium pullulans CCTCC M2012223, analyze the key genes related to the biosynthesis of important metabolites, and provide genetic background for metabolic engineering. The complete genome of A. pullulans CCTCC M2012223 was sequenced on the Illumina HiSeq high-throughput sequencing platform. Fragment assembly, gene prediction, functional annotation, and GO/COG clustering were then carried out and compared with those of five other A. pullulans varieties. The complete genome sequence of A. pullulans CCTCC M2012223 was 30,756,831 bp with an average GC content of 47.49%, and 9452 genes were successfully predicted. Genome-wide analysis showed that A. pullulans CCTCC M2012223 had the largest genome assembly size. Protein sequences involved in the pullulan and polymalic acid pathways were highly conserved in all six A. pullulans varieties. Although A. pullulans CCTCC M2012223 and A. pullulans var. melanogenum are closely related, some point mutations and insertions occurred in protein sequences involved in melanin biosynthesis. The genome of A. pullulans CCTCC M2012223 was annotated and genes involved in the melanin, pullulan and polymalic acid pathways were compared, which provides a theoretical basis for genetic modification of metabolic pathways in A. pullulans.
Berstein, R M; Schluter, S F; Shen, S; Marchalonis, J J
1996-04-16
All immunoglobulins and T-cell receptors throughout phylogeny share regions of highly conserved amino acid sequence. To identify possible primitive immunoglobulins and immunoglobulin-like molecules, we utilized 3' RACE (rapid amplification of cDNA ends) and a highly conserved constant region consensus amino acid sequence to isolate a new immunoglobulin class from the sandbar shark Carcharhinus plumbeus. The immunoglobulin, termed IgW, in its secreted form consists of 782 amino acids and is expressed in both the thymus and the spleen. The molecule overall most closely resembles mu chains of the skate and human and a new putative antigen binding molecule isolated from the nurse shark (NAR). The full-length IgW chain has a variable region resembling human and shark heavy-chain (VH) sequences and a novel joining segment containing the WGXGT motif characteristic of H chains. However, unlike any other H-chain-type molecule, it contains six constant (C) domains. The first C domain contains the cysteine residue characteristic of C mu1 that would allow dimerization with a light (L) chain. The fourth and sixth domains also contain comparable cysteines that would enable dimerization with other H chains or homodimerization. Comparison of the sequences of IgW V and C domains shows homology greater than that found in comparisons among VH and C mu or VL, or CL thereby suggesting that IgW may retain features of the primordial immunoglobulin in evolution.
Identification of a novel vitivirus from grapevines in New Zealand.
Blouin, Arnaud G; Keenan, Sandi; Napier, Kathryn R; Barrero, Roberto A; MacDiarmid, Robin M
2018-01-01
We report a sequence of a novel vitivirus from Vitis vinifera obtained using two high-throughput sequencing (HTS) strategies on RNA. The initial discovery from small-RNA sequencing was confirmed by HTS of the total RNA and Sanger sequencing. The new virus has a genome structure similar to the one reported for other vitiviruses, with five open reading frames (ORFs) coding for the conserved domains described for members of that genus. Phylogenetic analysis of the complete genome sequence confirmed its affiliation to the genus Vitivirus, with the closest described viruses being grapevine virus E (GVE) and Agave tequilana leaf virus (ATLV). However, the virus we report is distinct and shares only 51% amino acid sequence identity with GVE in the replicase polyprotein and 66.8% amino acid sequence identity with ATLV in the coat protein. This is well below the threshold determined by the ICTV for species demarcation, and we propose that this virus represents a new species. It is provisionally named "grapevine virus G".
Brewer, Michael S; Swafford, Lynn; Spruill, Chad L; Bond, Jason E
2013-01-01
Arthropods are the most diverse group of eukaryotic organisms, but their phylogenetic relationships are poorly understood. Herein, we describe three mitochondrial genomes representing orders of millipedes for which complete genomes had not been characterized. Newly sequenced genomes are combined with existing data to characterize the protein coding regions of myriapods and to attempt to reconstruct the evolutionary relationships within the Myriapoda and Arthropoda. The newly sequenced genomes are similar to previously characterized millipede sequences in terms of synteny and length. Unique translocations occurred within the newly sequenced taxa, including one half of the Appalachioria falcifera genome, which is inverted with respect to other millipede genomes. Across myriapods, amino acid conservation levels are highly dependent on the gene region. Additionally, individual loci varied in the level of amino acid conservation. Overall, most gene regions showed low levels of conservation at many sites. Attempts to reconstruct the evolutionary relationships suffered from questionable relationships and low support values. Analyses of phylogenetic informativeness show the lack of signal deep in the trees (i.e., genes evolve too quickly). As a result, the myriapod tree resembles previously published results but lacks convincing support, and, within the arthropod tree, well established groups were recovered as polyphyletic. The novel genome sequences described herein provide useful genomic information concerning millipede groups that had not been investigated. Taken together with existing sequences, the variety of compositions and evolution of myriapod mitochondrial genomes are shown to be more complex than previously thought. Unfortunately, the use of mitochondrial protein-coding regions in deep arthropod phylogenetics appears problematic, a result consistent with previously published studies. Lack of phylogenetic signal renders the resulting tree topologies as suspect. 
As such, these data are likely inappropriate for investigating such ancient relationships.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chang, Soo-Ik; Hammes, G.G.
1989-11-01
Homology analyses of the protein sequences of chicken liver and rat mammary gland fatty acid synthases were carried out. The amino acid sequences of the chicken and rat enzymes are 67% identical. If conservative substitutions are allowed, 78% of the amino acids are matched. A region of low homology exists between the functional domains, in particular around amino acid residues 1059-1264 of the chicken enzyme. Homologies between the active sites of chicken and rat and of chicken and yeast enzymes have been analyzed by an alignment method. A high degree of homology exists between the active sites of the chicken and rat enzymes. However, the chicken and yeast enzymes show a lower degree of homology. The NADPH-binding dinucleotide folds of the β-ketoacyl reductase and the enoyl reductase sites were identified by comparison with a known consensus sequence for the NADP- and FAD-binding dinucleotide folds. The active sites of all of the enzymes are primarily in hydrophobic regions of the protein. This study suggests that the genes for the functional domains of fatty acid synthase were originally separated, and these genes were connected to each other by using different connecting nucleotide sequences in different species. An alternative explanation for the differences in rat and chicken is a common ancestry and mutations in the joining regions during evolution.
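The difference between the two figures reported above (identity versus matches when conservative substitutions are allowed) can be shown on a pair of pre-aligned sequences. This is a minimal sketch: the residue groupings below are one common textbook-style scheme chosen for illustration, not the scheme used in the study, and the function name and toy sequences are hypothetical.

```python
# Assumed conservative-substitution groups (illustrative only)
CONSERVATIVE_GROUPS = [
    set("ILVM"), set("FWY"), set("KRH"), set("DE"),
    set("ST"), set("NQ"), set("AG"),
]


def identity_and_similarity(a, b):
    """Return (% identity, % similarity) for two aligned, equal-length sequences.

    Columns containing a gap ('-') in either sequence are ignored.
    Similarity counts identical residues plus conservative substitutions.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    identical = sum(x == y for x, y in pairs)
    similar = sum(
        x == y or any(x in g and y in g for g in CONSERVATIVE_GROUPS)
        for x, y in pairs
    )
    return 100 * identical / len(pairs), 100 * similar / len(pairs)


# Toy aligned pair, not real fatty acid synthase sequence
identity, similarity = identity_and_similarity("MKVLDS", "MRVIEA")
print(round(identity, 1), round(similarity, 1))
```

Similarity is always at least as large as identity, because every identical column also counts as similar. The exact gap handling and grouping change the numbers, so percentages from different studies are comparable only when the same conventions are used.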
Molecular cloning and sequence analysis of stearoyl-CoA desaturase in milkfish, Chanos chanos.
Hsieh, S L; Liao, W L; Kuo, C M
2001-12-01
Stearoyl-CoA desaturase (EC 1.14.99.5) is a key enzyme in the biosynthesis of polyunsaturated fatty acids and the maintenance of the homeoviscous fluidity of biological membranes. The stearoyl-CoA desaturase cDNA in milkfish (Chanos chanos) was cloned by RT-PCR and RACE, and it was compared with the stearoyl-CoA desaturase in cold-tolerant teleosts, common carp and grass carp. Nucleotide sequence analysis revealed that the cDNA clone has a 972-bp open reading frame encoding 323 amino acid residues. Alignments of the deduced amino acid sequence showed that the milkfish stearoyl-CoA desaturase shares 79% and 75% identity with common carp and grass carp, and 63%-64% with other vertebrates such as sheep, hamsters, rats, mice, and humans. As in common carp and grass carp, the deduced amino acid sequence in milkfish conserves three histidine cluster motifs (one HXXXXH and two HXXHH) that are essential for catalysis of stearoyl-CoA desaturase activity. However, RT-PCR analysis showed that stearoyl-CoA desaturase expression in milkfish is detected in the tissues of liver, muscle, kidney, brain, and gill, and more expression sites were found in milkfish than in common carp and grass carp. Phylogenetic relationships among the deduced stearoyl-CoA desaturase amino acid sequence in milkfish and those in other vertebrates showed that the milkfish stearoyl-CoA desaturase amino acid sequence is phylogenetically closer to those of common carp and grass carp than to other higher vertebrates.
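The histidine cluster motifs named in this abstract (HXXXXH and HXXHH, where X is any residue) translate directly into regular expressions. A minimal sketch using Python's standard `re` module follows; the function name and the toy sequence are made up for illustration and are not the milkfish sequence.

```python
import re

# Lookahead patterns so that overlapping motif occurrences are all reported
HIS_BOX_PATTERNS = {
    "HXXXXH": re.compile(r"(?=(H.{4}H))"),
    "HXXHH": re.compile(r"(?=(H.{2}HH))"),
}


def find_his_boxes(protein):
    """Return (motif name, 1-based start position, matched text), sorted by position."""
    hits = []
    for name, pattern in HIS_BOX_PATTERNS.items():
        for match in pattern.finditer(protein):
            hits.append((name, match.start() + 1, match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


# Toy sequence containing one HXXXXH and one HXXHH motif
toy = "MAHDLYAHLGAHRVHHAA"
for hit in find_his_boxes(toy):
    print(hit)
```

Running the same search over several aligned desaturase sequences and comparing the reported positions is one simple way to check whether the motifs are conserved across species.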
Comparative genomic analysis of the false killer whale (Pseudorca crassidens) LMBR1 locus.
Kim, Dae-Won; Choi, Sang-Haeng; Kim, Ryong Nam; Kim, Sun-Hong; Paik, Sang-Gi; Nam, Seong-Hyeuk; Kim, Dong-Wook; Kim, Aeri; Kang, Aram; Park, Hong-Seog
2010-09-01
The sequencing and comparative genomic analysis of LMBR1 loci in mammals or other species, including human, would be very important in understanding evolutionary genetic changes underlying the evolution of limb development. In this regard, comparative genomic annotation of the false killer whale LMBR1 locus could shed new light on the evolution of limb development. We sequenced two false killer whale BAC clones, corresponding to 156 kb and 144 kb, respectively, harboring the tightly linked RNF32, LMBR1, and NOM1 genes. Our annotation of the false killer whale LMBR1 gene showed that it consists of 17 exons (1473 bp), in contrast to 18 exons (1596 bp) in human, and it displays 93.1% and 95.6% nucleotide and amino acid sequence similarity, respectively, compared with the human gene. In particular, we discovered that exon 10, deleted in the false killer whale LMBR1 gene, is present only in primates, and this fact strongly implies that exon 10 might be crucial in determining primate-specific limb development. ZRS and TFBS sequences have been well conserved across 11 species, suggesting that these regions could be involved in an important function of limb development and limb patterning. The neighboring gene RNF32 showed several lineage-conserved exons, such as exons 2 through 9 conserved in eutherian mammals, exons 3 through 9 conserved in mammals, and exons 5 through 9 conserved in vertebrates. The other neighboring gene, NOM1, had undergone a substitution (ATG→GTA) at the start codon, giving rise to a 36 bp shorter N-terminal sequence compared with the human sequence. Our comparative analysis of the false killer whale LMBR1 genomic locus provides important clues regarding the genetic regions that may play crucial roles in limb development and patterning.
DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats.
de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas
2015-11-16
Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Structure of the horseradish peroxidase isozyme C genes.
Fujiyama, K; Takemura, H; Shibayama, S; Kobayashi, K; Choi, J K; Shinmyo, A; Takano, M; Yamada, Y; Okada, H
1988-05-02
We have isolated, cloned and characterized three cDNAs and two genomic DNAs corresponding to the mRNAs and genes for the horseradish (Armoracia rusticana) peroxidase isoenzyme C (HRP C). The amino acid sequence of HRP C1, deduced from the nucleotide sequence of one of the cDNA clones, pSK1, contained the same primary sequence as that of the purified enzyme established by Welinder [FEBS Lett. 72, 19-23 (1976)] with additional sequences at the N and C termini. All three inserts in the cDNA clones, pSK1, pSK2 and pSK3, coded for peptides of the same size (308 amino acid residues) if these are processed in the same way, and the amino acid sequences were 91-94% homologous to each other. Functional amino acids, including His40, His170, Tyr185 and Arg183 and S-S-bond-forming Cys, were conserved in the three isozymes, but a few N-glycosylation sites were not the same. Two HRP C isoenzyme genomic genes, prxC1 and prxC2, were arranged in tandem on the chromosomal DNA and each gene consisted of four exons and three introns. The positions in the exons interrupted by introns were the same in the two genes. We observed a putative promoter sequence 5' upstream and a poly(A) signal 3' downstream in both genes. The gene product of prxC1 might be processed with a signal sequence of 30 amino acid residues at the N terminus and a peptide consisting of 15 amino acid residues at the C terminus.
Neshich, Goran; Togawa, Roberto C.; Mancini, Adauto L.; Kuser, Paula R.; Yamagishi, Michel E. B.; Pappas, Georgios; Torres, Wellington V.; Campos, Tharsis Fonseca e; Ferreira, Leonardo L.; Luna, Fabio M.; Oliveira, Adilton G.; Miura, Ronald T.; Inoue, Marcus K.; Horita, Luiz G.; de Souza, Dimas F.; Dominiquini, Fabiana; Álvaro, Alexandre; Lima, Cleber S.; Ogawa, Fabio O.; Gomes, Gabriel B.; Palandrani, Juliana F.; dos Santos, Gabriela F.; de Freitas, Esther M.; Mattiuz, Amanda R.; Costa, Ivan C.; de Almeida, Celso L.; Souza, Savio; Baudet, Christian; Higa, Roberto H.
2003-01-01
STING Millennium Suite (SMS) is a new web-based suite of programs and databases providing visualization and a complex analysis of molecular sequence and structure for the data deposited at the Protein Data Bank (PDB). SMS operates with a collection of both publicly available data (PDB, HSSP, Prosite) and its own data (contacts, interface contacts, surface accessibility). Biologists find SMS useful because it provides a variety of algorithms and validated data, wrapped-up in a user friendly web interface. Using SMS it is now possible to analyze sequence to structure relationships, the quality of the structure, nature and volume of atomic contacts of intra and inter chain type, relative conservation of amino acids at the specific sequence position based on multiple sequence alignment, indications of folding essential residue (FER) based on the relationship of the residue conservation to the intra-chain contacts and Cα–Cα and Cβ–Cβ distance geometry. Specific emphasis in SMS is given to interface forming residues (IFR)—amino acids that define the interactive portion of the protein surfaces. SMS may simultaneously display and analyze previously superimposed structures. PDB updates trigger SMS updates in a synchronized fashion. SMS is freely accessible for public data at http://www.cbi.cnptia.embrapa.br, http://mirrors.rcsb.org/SMS and http://trantor.bioc.columbia.edu/SMS. PMID:12824333
In silico analysis of subtilisin from Glaciozyma antarctica PI12
NASA Astrophysics Data System (ADS)
Mustafha, Siti Mardhiah; Murad, Abdul Munir Abdul; Mahadi, Nor Muhammad; Kamaruddin, Shazilah; Bakar, Farah Diba Abu
2015-09-01
Subtilisins are major industrial enzymes with a wide range of applications, especially in the detergent industry. In this study, a cDNA encoding a subtilisin (GaSUBT) was extracted from the psychrophilic yeast Glaciozyma antarctica PI12, PCR amplified and sequenced. Various bioinformatics tools were used to characterize GaSUBT. GaSUBT contains 1587 bp encoding 529 amino acids. The predicted molecular weight of the deduced protein is 55.34 kDa with an isoelectric point of 6.25. GaSUBT was predicted to possess a signal peptide and a pro-peptide consisting of a peptidase inhibitor I9 sequence. Sequence alignment of the deduced amino acid sequence with other subtilisins in the NCBI database showed that the sequences surrounding the catalytic triad that forms the catalytic domain are well conserved.
Sequence similarities and evolutionary relationships of microbial, plant and animal alpha-amylases.
Janecek, S
1994-09-01
Amino acid sequence comparison of 37 alpha-amylases from microbial, plant and animal sources was performed to identify their mutual sequence similarities in addition to the five already described conserved regions. These sequence regions were examined from structure/function and evolutionary perspectives. An unrooted evolutionary tree of alpha-amylases was constructed on a subset of 55 residues from the alignment of sequence similarities along with conserved regions. The most important new information extracted from the tree was as follows: (a) the close evolutionary relationship of Alteromonas haloplanctis alpha-amylase (thermolabile enzyme from an antarctic psychrotroph) with the already known group of homologous alpha-amylases from streptomycetes, Thermomonospora curvata, insects and mammals, and (b) the remarkable 40.1% identity between starch-saccharifying Bacillus subtilis alpha-amylase and the enzyme from the ruminal bacterium Butyrivibrio fibrisolvens, an alpha-amylase with an unusually large polypeptide chain (943 residues in the mature enzyme). Due to a very high degree of similarity, the whole amino acid sequences of three groups of alpha-amylases, namely (a) fungi and yeasts, (b) plants, and (c) A. haloplanctis, streptomycetes, T. curvata, insects and mammals, were aligned independently and their unrooted distance trees were calculated using these alignments. Possible rooting of the trees was also discussed. Based on the knowledge of the location of the five disulfide bonds in the structure of pig pancreatic alpha-amylase, the possible disulfide bridges were established for each of these groups of homologous alpha-amylases.
Sequence repeats and protein structure
NASA Astrophysics Data System (ADS)
Hoang, Trinh X.; Trovato, Antonio; Seno, Flavio; Banavar, Jayanth R.; Maritan, Amos
2012-11-01
Repeats are frequently found in known protein sequences. The level of sequence conservation in tandem repeats correlates with their propensities to be intrinsically disordered. We employ a coarse-grained model of a protein with a two-letter amino acid alphabet, hydrophobic (H) and polar (P), to examine the sequence-structure relationship in the realm of repeated sequences. A fraction of repeated sequences comprises a distinct class of bad folders, whose folding temperatures are much lower than those of random sequences. Imperfection in sequence repetition improves the folding properties of the bad folders while deteriorating those of the good folders. Our results may explain why nature has utilized repeated sequences for their versatility and especially to design functional proteins that are intrinsically unstructured at physiological temperatures.
Matsuoka, Masanari; Sugita, Masatake; Kikuchi, Takeshi
2014-09-18
Proteins that share a high sequence homology while exhibiting drastically different 3D structures are investigated in this study. Recently, artificial proteins related to the sequences of the GA and IgG binding GB domains of human serum albumin have been designed. These artificial proteins, referred to as GA and GB, share 98% amino acid sequence identity but exhibit different 3D structures, namely, a 3α bundle versus a 4β + α structure. Discriminating between their 3D structures based on their amino acid sequences is a very difficult problem. In the present work, in addition to using bioinformatics techniques, an analysis based on inter-residue average distance statistics is used to address this problem. It was hard to distinguish which structure a given sequence would take only with the results of ordinary analyses like BLAST and conservation analyses. However, in addition to these analyses, with the analysis based on the inter-residue average distance statistics and our sequence tendency analysis, we could infer which part would play an important role in its structural formation. The results suggest possible determinants of the different 3D structures for sequences with high sequence identity. The possibility of discriminating between the 3D structures based on the given sequences is also discussed.
Quéméneur, Marianne; Heinrich-Salmeron, Audrey; Muller, Daniel; Lièvremont, Didier; Jauzein, Michel; Bertin, Philippe N; Garrido, Francis; Joulian, Catherine
2008-07-01
A new primer set was designed to specifically amplify ca. 1,100 bp of aoxB genes encoding the As(III) oxidase catalytic subunit from taxonomically diverse aerobic As(III)-oxidizing bacteria. Comparative analysis of AoxB protein sequences showed variable conservation levels and highlighted the conservation of essential amino acids and structural motifs. AoxB phylogeny of pure strains showed well-discriminated taxonomic groups and was similar to 16S rRNA phylogeny. Alphaproteobacteria-, Betaproteobacteria-, and Gammaproteobacteria-related sequences were retrieved from environmental surveys, demonstrating their prevalence in mesophilic As-contaminated soils. Our study underlines the usefulness of the aoxB gene as a functional marker of aerobic As(III) oxidizers.
Characterization of tannase protein sequences of bacteria and fungi: an in silico study.
Banerjee, Amrita; Jana, Arijit; Pati, Bikash R; Mondal, Keshab C; Das Mohapatra, Pradeep K
2012-04-01
The tannase protein sequences of 149 bacteria and 36 fungi were retrieved from the NCBI database. Of these, only the 77 bacterial and 31 fungal tannase sequences with distinct amino acid compositions were retained. These sequences were analysed for different physical and chemical properties, superfamily search, multiple sequence alignment, phylogenetic tree construction and motif finding to identify functional motifs and the evolutionary relationships among them. The superfamily search for these tannases revealed the occurrence of proline iminopeptidase-like, biotin biosynthesis protein BioH, O-acetyltransferase, carboxylesterase/thioesterase 1, carbon-carbon bond hydrolase, haloperoxidase, prolyl oligopeptidase, C-terminal domain and mycobacterial antigens families and the alpha/beta hydrolase superfamily. Some bacterial and fungal sequences showed similarity with different families individually. The multiple sequence alignment of these tannase protein sequences showed conserved regions at different stretches, with maximum homology at amino acid residues 389-469 and 482-523, which could be used for designing degenerate primers or probes specific for tannase-producing bacterial and fungal species. The phylogenetic tree showed two different clusters: one contains only bacteria and the other contains both fungi and bacteria, showing some relationship between these different genera. In the second cluster, however, nearly all fungal species were grouped together, which indicates sequence-level similarity among fungal genera. Analysis of the distribution of fourteen motifs revealed that Motif 1, with a signature sequence of 29 amino acids, i.e. GCSTGGREALKQAQRWPHDYDGIIANNPA, was uniformly observed in 83.3% of the studied tannase sequences, suggesting its participation in the structure and enzymatic function.
Evolution of the arginase fold and functional diversity
Dowling, Daniel P.; Costanzo, Luigi Di; Gennadios, Heather A.; Christianson, David W.
2009-01-01
The large number of protein structures deposited in the Protein Data Bank allows for the identification of novel structural superfamilies based on conservation of fold in addition to conservation of amino acid sequence. Since sequence diverges more rapidly than fold in protein evolution, proteins with little or no significant sequence identity are occasionally observed to adopt similar folds, thereby reflecting unanticipated evolutionary relationships. Here, we review the unique α/β fold first observed in the manganese metalloenzyme rat liver arginase, consisting of a parallel 8 stranded β-sheet surrounded by several helices, and its evolutionary relationship with the zinc-requiring and/or iron-requiring histone deacetylases and acetylpolyamine amidohydrolases. Structural comparisons reveal key features of the core α/β fold that contribute to the divergent metal ion specificity and stoichiometry required for the chemical and biological functions of these enzymes. PMID:18360740
Schuster, W; Wissinger, B; Unseld, M; Brennicke, A
1990-01-01
A number of cytosines are altered to be recognized as uridines in transcripts of the nad3 locus in mitochondria of the higher plant Oenothera. Such nucleotide modifications can be found at 16 different sites within the nad3 coding region. Most of these alterations in the mRNA sequence change codon identities to specify amino acids better conserved in evolution. Individual cDNA clones differ in their degree of editing at five nucleotide positions, three of which are silent, while two lead to codon alterations specifying different amino acids. None of the cDNA clones analysed is maximally edited at all possible sites, suggesting slow processing or lowered stringency of editing at these nucleotides. Differentially edited transcripts could be editing intermediates or could code for differing polypeptides. Two edited nucleotides in an open reading frame located upstream of nad3 change two amino acids in the deduced polypeptide. Part of the well-conserved ribosomal protein gene rps12, which is also encoded downstream of nad3 in other plants, is lost in Oenothera mitochondria by recombination events. The functional rps12 protein must be imported from the cytoplasm since the deleted sequences of this gene are not found in the Oenothera mitochondrial genome. The pseudogene sequence is not edited at any nucleotide position. PMID:1688531
Identification of Group B Streptococcal Sip Protein, Which Elicits Cross-Protective Immunity
Brodeur, Bernard R.; Boyer, Martine; Charlebois, Isabelle; Hamel, Josée; Couture, France; Rioux, Clément R.; Martin, Denis
2000-01-01
A protein of group B streptococci (GBS), named Sip for surface immunogenic protein, which is distinct from previously described surface proteins, was identified after immunological screening of a genomic library. Immunoblots using a Sip-specific monoclonal antibody indicated that a protein band with an approximate molecular mass of 53 kDa which did not vary in size was present in every GBS strain tested. Representatives of all nine GBS serotypes were included in the panel of strains. Cloning and sequencing of the sip gene revealed an open reading frame of 1,305 nucleotides coding for a polypeptide of 434 amino acid residues, with a calculated pI of 6.84 and molecular mass of 45.5 kDa. Comparison of the nucleotide sequences from six different strains confirmed with 98% identity that the sip gene is highly conserved among GBS isolates. N-terminal amino acid sequencing also indicated the presence of a 25-amino-acid signal peptide which is cleaved in the mature protein. More importantly, immunization with the recombinant Sip protein efficiently protected CD-1 mice against deadly challenges with six GBS strains of serotypes Ia/c, Ib, II/R, III, V, and VI. The data presented in this study suggest that this highly conserved protein induces cross-protective immunity against GBS infections and emphasize its potential as a universal vaccine candidate. PMID:10992461
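The lengths reported in the abstract above are consistent with simple codon arithmetic: an open reading frame of 1,305 nucleotides is 435 codons, one of which is the stop codon, leaving 434 residues. The sketch below only illustrates that check. The helper name `orf_to_residues` is hypothetical, and the mature-protein length is plain subtraction of the 25-residue signal peptide rather than a figure stated in the abstract.

```python
def orf_to_residues(orf_nt, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_nt nucleotides.

    If includes_stop is True, the final codon is assumed to be a stop codon
    and is not counted as a residue.
    """
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


# Values taken from the abstract: 1,305-nt ORF, 25-residue signal peptide.
sip_residues = orf_to_residues(1305)
signal_peptide = 25
mature_residues = sip_residues - signal_peptide  # derived here, not reported

print(sip_residues, mature_residues)
```

Whether a reported ORF length includes the stop codon varies between records, which is why the flag is exposed: a length given as 1587 bp for 529 residues, for example, only works out with `includes_stop=False`.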
Conserved thioredoxin fold is present in Pisum sativum L. sieve element occlusion-1 protein
Umate, Pavan; Tuteja, Renu
2010-01-01
A homology-based three-dimensional model of the Pisum sativum sieve element occlusion 1 (Ps.SEO1) (forisome) protein was constructed. A stretch of amino acids (residues 320 to 456) that is well conserved in all known members of the forisome proteins was used to model the 3D structure of Ps.SEO1. The structural prediction was done using the Protein Homology/analogY Recognition Engine (PHYRE) web server. Based on studies of local sequence alignment, the thioredoxin-fold containing protein [Structural Classification of Proteins (SCOP) code d1o73a_], a member of the glutathione peroxidase family, was selected as a template for modeling the spatial structure of Ps.SEO1. Selection was based on comparison of primary sequence, higher match quality and alignment accuracy. Motif 1 (EVF) is conserved in Ps.SEO1, Vicia faba (Vf.For1) and Medicago truncatula (MT.SEO3); motif 2 (KKED) is well conserved across all forisome proteins and motif 3 (IGYIGNP) is conserved in Ps.SEO1 and Vf.For1. PMID:20404566
Comparative and evolutionary studies of vertebrate ALDH1A-like genes and proteins.
Holmes, Roger S
2015-06-05
Vertebrate ALDH1A-like genes encode cytosolic enzymes capable of metabolizing all-trans-retinaldehyde to retinoic acid which is a molecular 'signal' guiding vertebrate development and adipogenesis. Bioinformatic analyses of vertebrate and invertebrate genomes were undertaken using known ALDH1A1, ALDH1A2 and ALDH1A3 amino acid sequences. Comparative analyses of the corresponding human genes provided evidence for distinct modes of gene regulation and expression with putative transcription factor binding sites (TFBS), CpG islands and micro-RNA binding sites identified for the human genes. ALDH1A-like sequences were identified for all mammalian, bird, lizard and frog genomes examined, whereas fish genomes displayed a more restricted distribution pattern for ALDH1A1 and ALDH1A3 genes. The ALDH1A1 gene was absent in many bony fish genomes examined, with the ALDH1A3 gene also absent in the medaka and tilapia genomes. Multiple ALDH1A1-like genes were identified in mouse, rat and marsupial genomes. Vertebrate ALDH1A1, ALDH1A2 and ALDH1A3 subunit sequences were highly conserved throughout vertebrate evolution. Comparative amino acid substitution rates showed that mammalian ALDH1A2 sequences were more highly conserved than for the ALDH1A1 and ALDH1A3 sequences. Phylogenetic studies supported an hypothesis for ALDH1A2 as a likely primordial gene originating in invertebrate genomes and undergoing sequential gene duplication to generate two additional genes, ALDH1A1 and ALDH1A3, in most vertebrate genomes. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Yedavalli, Venkat R. K.; Chappey, Colombe; Matala, Erik; Ahmad, Nafees
1998-01-01
The human immunodeficiency virus type 1 (HIV-1) vif gene is conserved among most lentiviruses, suggesting that vif is important for natural infection. To determine whether an intact vif gene is positively selected during mother-to-infant transmission, we analyzed vif sequences from five infected mother-infant pairs following perinatal transmission. The coding potential of the vif open reading frame directly derived from uncultured peripheral blood mononuclear cell DNA was maintained in most of the 78,912 bp sequenced. We found that 123 of the 137 clones analyzed (89.8%) had intact vif open reading frames. There was a low degree of heterogeneity of vif genes within mothers, within infants, and between epidemiologically linked mother-infant pairs. The distances between vif sequences were greater in epidemiologically unlinked individuals than in epidemiologically linked mother-infant pairs. Furthermore, the epidemiologically linked mother-infant pair vif sequences displayed similar patterns that were not seen in vif sequences from epidemiologically unlinked individuals. The functional domains, including the two cysteines at positions 114 and 133, a serine phosphorylation site at position 144, and the C-terminal basic amino acids essential for vif protein function, were highly conserved in most of the sequences. Phylogenetic analyses of 137 mother-infant pair vif sequences and 187 other available vif sequences from HIV-1 databases revealed distinct clusters for vif sequences from each mother-infant pair and for other vif sequences. Taken together, these findings suggest that vif plays an important role in HIV-1 infection and replication in mothers and their perinatally infected infants. PMID:9445004
Wang, Bo; Guo, Ruiqi; Zuo, Lei; Shao, Hong; Liu, Ying; Wang, Yu; Ju, Yan; Sun, Chao; Wang, Lifeng; Zhang, Yanmin; Liu, Liwen
2017-08-10
This study analyzed the phenotype-genotype correlation of the MYH7-V878A mutation. Exonic amplification and high-throughput sequencing of 96 cardiovascular disease-related genes were carried out on probands from 210 pedigrees affected with hypertrophic cardiomyopathy (HCM). For the probands, their family members, and 300 healthy volunteers, the identified MYH7-V878A mutation was verified by Sanger sequencing. Information on the HCM patients and their family members, including clinical data, physical examination, echocardiography (UCG) and electrocardiography (ECG), was analyzed, together with the conservation of the mutated site among various species. A MYH7-V878A mutation was detected in five HCM pedigrees containing 31 family members. Fourteen members carried the mutation, among whom 11 were diagnosed with HCM, while 3 did not meet the diagnostic criteria. Some of the fourteen members also carried other mutations. Family members not carrying the mutation had normal UCG and ECG. No MYH7-V878A mutation was found among the 300 healthy volunteers. Analysis of sequence conservation showed that the amino acid is located in a region that is highly conserved among various species. MYH7-V878A is a hot spot among ethnic Han Chinese with a high penetrance. Functional analysis of the conserved sequences suggested that the mutation may cause significant alteration of the function. MYH7-V878A has a significant value for the early diagnosis of HCM.
Batianovskiĭ, A V; Filatov, I V; Namiot, V A; Esipova, N G; Volotovskiĭ, I D
2012-01-01
It was shown that selective interactions between helical segments of macromolecules can occur in globular proteins between segments characterized by the same periodicities of charge distribution, i.e. between conformationally conservative oligopeptides. It was found that in the macromolecules of alpha-helical proteins, conformationally conservative oligopeptides are located at distances characteristic of direct interactions. For representatives of many structural families of alpha-type proteins, a specific arrangement of conformationally conservative segments is observed, and this arrangement is inherent to the particular structural family. The arrangement of conformationally conservative segments is not related to homology of the amino acid sequence but reflects peculiarities of the native 3D architectures of protein globules.
Maruyama, Kyonoshin; Todaka, Daisuke; Mizoi, Junya; Yoshida, Takuya; Kidokoro, Satoshi; Matsukura, Satoko; Takasaki, Hironori; Sakurai, Tetsuya; Yamamoto, Yoshiharu Y.; Yoshiwara, Kyouko; Kojima, Mikiko; Sakakibara, Hitoshi; Shinozaki, Kazuo; Yamaguchi-Shinozaki, Kazuko
2012-01-01
The genomes of three plants, Arabidopsis (Arabidopsis thaliana), rice (Oryza sativa), and soybean (Glycine max), have been sequenced, and their many genes and promoters have been predicted. In Arabidopsis, cis-acting promoter elements involved in cold- and dehydration-responsive gene expression have been extensively analysed; however, the characteristics of such cis-acting promoter sequences in cold- and dehydration-inducible genes of rice and soybean remain to be clarified. In this study, we performed microarray analyses using the three species, and compared characteristics of identified cold- and dehydration-inducible genes. Transcription profiles of the cold- and dehydration-responsive genes were similar among these three species, showing representative upregulated (dehydrin/LEA) and downregulated (photosynthesis-related) genes. All (4^6 = 4096) hexamer sequences in the promoters of the three species were investigated, revealing the frequency of conserved sequences in cold- and dehydration-inducible promoters. A core sequence of the abscisic acid-responsive element (ABRE) was the most conserved in dehydration-inducible promoters of all three species, suggesting that transcriptional regulation for dehydration-inducible genes is similar among these three species, with the ABRE-dependent transcriptional pathway. In contrast, for cold-inducible promoters, the conserved hexamer sequences were diversified among these three species, suggesting the existence of diverse transcriptional regulatory pathways for cold-inducible genes among the species. PMID:22184637
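The hexamer survey described in the abstract above rests on the fact that a four-letter DNA alphabet yields 4 to the power 6, i.e. 4096, possible six-base words. A minimal Python sketch of enumerating and counting them follows. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the toy promoter string and the queried hexamer ACGTGT are all hypothetical.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"
K = 6  # hexamers


def all_kmers(k=K):
    """Enumerate every possible k-mer over the DNA alphabet (4**k words)."""
    return ["".join(p) for p in product(ALPHABET, repeat=k)]


def count_kmers(seq, k=K):
    """Count overlapping k-mers in seq, skipping windows with non-ACGT symbols."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set(ALPHABET):
            counts[kmer] += 1
    return counts


hexamers = all_kmers()

# Hypothetical toy promoter; real input would be the predicted promoter sequences.
toy_promoter = "TTACGTGTCCACGTGTAA"
toy_counts = count_kmers(toy_promoter)

print(len(hexamers), toy_counts["ACGTGT"])
```

Comparing such counts between stress-inducible promoters and a background promoter set is one plausible way to rank hexamers by over-representation; the statistic actually used in the study is not given in the abstract.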
A survey of the pyrabactin resistance-like abscisic acid receptor gene family in poplar.
Yu, Jingling; Li, Hejuan; Peng, Yajing; Yang, Lei; Zhao, Fugeng; Luan, Sheng; Lan, Wenzhi
2017-08-03
The conserved PYR/PYL/RCAR family acts as abscisic acid (ABA) receptors that allow land plants to adapt to terrestrial environments. Our recent study reported that the exogenous overexpression of poplar PtPYRL1 and PtPYRL5, the PYR/PYL/RCAR orthologs, promoted the sensitivity of transgenic Arabidopsis to ABA responses. Here, we surveyed the PtPYRL family in poplar, and revealed that although the sequence and structure are relatively conserved among these receptors, PtPYRL members differ in their expression patterns and in their sensitivity to ABA or drought treatment, suggesting that PtPYRLs might be good candidates for future biotechnological use to enhance poplar resistance to water-stress environments.
Ayub, Gohar; Waheed, Yasir
2016-06-01
The 2014 Ebola outbreak was one of the largest that have occurred; it started in Guinea and spread to Nigeria, Liberia and Sierra Leone. Phylogenetic analysis of the current virus species indicated that this outbreak is the result of a divergent lineage of the Zaire ebolavirus. The L protein of Ebola virus (EBOV) is the catalytic subunit of the RNA-dependent RNA polymerase complex, which, with VP35, is key for the replication and transcription of viral RNA. Earlier sequence analysis demonstrated that the L protein of all non-segmented negative-sense (NNS) RNA viruses consists of six domains containing conserved functional motifs. The aim of the present study was to analyze the presence of these motifs in 2014 EBOV isolates and to highlight their function and how they may contribute to the overall pathogenicity of the isolates. For this purpose, 81 2014 EBOV L protein sequences were aligned with 475 other NNS RNA viruses, including Paramyxoviridae and Rhabdoviridae viruses. Phylogenetic analysis of all EBOV outbreak L protein sequences was also performed. Analysis of the amino acid substitutions in the 2014 EBOV outbreak was conducted using sequence analysis. The alignment demonstrated the presence of previously conserved motifs in the 2014 EBOV isolates and novel residues. Notably, all the mutations identified in the 2014 EBOV isolates were tolerant; they were pathogenic, with certain examples occurring within previously determined functional conserved motifs, possibly altering viral pathogenicity, replication and virulence. The phylogenetic analysis demonstrated that all sequences with the exception of the 2014 EBOV sequences were clustered together. The 2014 EBOV isolates have acquired a great number of mutations, which may explain the reasons behind this unprecedented outbreak. Certain residues critical to the function of the polymerase remain conserved and may be targets for the development of antiviral therapeutic agents.
Protein structure based prediction of catalytic residues
2013-01-01
Background: Worldwide structural genomics projects continue to release new protein structures at an unprecedented pace, so far nearly 6000, but only about 60% of these proteins have any sort of functional annotation. Results: We explored a range of features that can be used for the prediction of functional residues given a known three-dimensional structure. These features include various centrality measures of nodes in graphs of interacting residues: closeness, betweenness and page-rank centrality. We also analyzed the distance of functional amino acids to the general center of mass (GCM) of the structure, relative solvent accessibility (RSA), and the use of relative entropy as a measure of sequence conservation. From the selected features, neural networks were trained to identify catalytic residues. We found that using distance to the GCM together with amino acid type provides a good discriminant function, when combined independently with sequence conservation. Using an independent test set of 29 annotated protein structures, the method returned 411 of the initial 9262 residues as the most likely to be involved in function. The output 411 residues contain 70 of the annotated 111 catalytic residues. This represents an approximately 14-fold enrichment of catalytic residues on the entire input set (corresponding to a sensitivity of 63% and a precision of 17%), a performance competitive with that of other state-of-the-art methods. Conclusions: We found that several of the graph-based measures utilize the same underlying feature of protein structures, which can be simply and more effectively captured with the distance to GCM definition. This also has the added advantage of simplicity and easy implementation. Meanwhile, sequence conservation remains by far the most influential feature in identifying functional residues. We also found that due to the rapid changes in size and composition of sequence databases, conservation calculations must be recalibrated for specific reference databases. PMID:23433045
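The enrichment, sensitivity and precision quoted in this abstract follow directly from the four reported counts. The short sketch below reproduces that arithmetic; the function name and the dictionary keys are illustrative choices, not part of the original study.

```python
def enrichment_stats(total_residues, predicted, catalytic_total, catalytic_found):
    """Derive sensitivity, precision and fold enrichment from raw counts."""
    sensitivity = catalytic_found / catalytic_total   # fraction of true sites recovered
    precision = catalytic_found / predicted           # fraction of predictions that are true sites
    background = catalytic_total / total_residues     # catalytic fraction in the whole input set
    return {
        "sensitivity": sensitivity,
        "precision": precision,
        "background": background,
        "enrichment": precision / background,         # fold enrichment over the input set
    }


# Counts as reported in the abstract: 9262 residues, 411 returned,
# 111 annotated catalytic residues, 70 of them among the 411.
stats = enrichment_stats(9262, 411, 111, 70)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

With these counts the sensitivity is about 0.63, the precision about 0.17, and the enrichment about 14.2, in line with the figures stated in the abstract.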
Singh, B N; Mudgil, Yashwanti; Sopory, S K; Reddy, M K
2003-07-01
We have successfully expressed enzymatically active plant topoisomerase II in Escherichia coli for the first time, which has enabled its biochemical characterization. Using a PCR-based strategy, we obtained a full-length cDNA and the corresponding genomic clone of tobacco topoisomerase II. The genomic clone has 18 exons interrupted by 17 introns. Most of the 5' and 3' splice junctions follow the typical canonical consensus dinucleotide sequence GU-AG present in other plant introns. The position of introns and phasing with respect to primary amino acid sequence in tobacco TopII and Arabidopsis TopII are highly conserved, suggesting that the two genes are evolved from the common ancestral type II topoisomerase gene. The cDNA encodes a polypeptide of 1482 amino acids. The primary amino acid sequence shows a striking sequence similarity, preserving all the structural domains that are conserved among eukaryotic type II topoisomerases in an identical spatial order. We have expressed the full-length polypeptide in E. coli and purified the recombinant protein to homogeneity. The full-length polypeptide relaxed supercoiled DNA and decatenated the catenated DNA in a Mg(2+)- and ATP-dependent manner, and this activity was inhibited by 4'-(9-acridinylamino)-3'-methoxymethanesulfonanilide (m-AMSA). The immunofluorescence and confocal microscopic studies, with antibodies developed against the N-terminal region of tobacco recombinant topoisomerase II, established the nuclear localization of topoisomerase II in tobacco BY2 cells. The regulated expression of tobacco topoisomerase II gene under the GAL1 promoter functionally complemented a temperature-sensitive TopII(ts) yeast mutant.
Guiding principles for peptide nanotechnology through directed discovery.
Lampel, A; Ulijn, R V; Tuttle, T
2018-05-21
Life's diverse molecular functions are largely based on only a small number of highly conserved building blocks - the twenty canonical amino acids. These building blocks are chemically simple, but when they are organized in three-dimensional structures of tremendous complexity, new properties emerge. This review explores recent efforts in the directed discovery of functional nanoscale systems and materials based on these same amino acids, but that are not guided by copying or editing biological systems. The review summarises insights obtained using three complementary approaches of searching the sequence space to explore sequence-structure relationships for assembly, reactivity and complexation, namely: (i) strategic editing of short peptide sequences; (ii) computational approaches to predicting and comparing assembly behaviours; (iii) dynamic peptide libraries that explore the free energy landscape. These approaches give rise to guiding principles on controlling order/disorder, complexation and reactivity by peptide sequence design.
Molecular cloning of crustins from the hemocytes of Brazilian penaeid shrimps.
Rosa, Rafael Diego; Bandeira, Paula Terra; Barracco, Margherita Anna
2007-09-01
Crustins are antimicrobial peptides initially identified in the hemocytes of the crab Carcinus maenas (11.5-kDa peptide or carcinin) and recently also recognized in penaeid shrimps and other crustacean species. The aim of this study was to identify sequences encoding for crustins from the hemocytes of four Brazilian penaeid species: Farfantepenaeus paulensis, Farfantepenaeus subtilis, Farfantepenaeus brasiliensis and Litopenaeus schmitti. Using primers based on consensus nucleotide alignment of crustins from different crustaceans, cDNA sequences coding for crustins in all indigenous penaeid species were amplified. The obtained four crustin sequences encoded for peptides containing a hydrophobic N-terminal region rich in glycine repeats and a C-terminal part with 12 cysteine residues and a conserved whey acidic protein domain. All obtained crustin sequences showed high amino acidic similarity among each other and with crustins from litopenaeid shrimps (76-98%). This is the first report of crustins in native Brazilian penaeid shrimps.
Ringwald, M; Schuh, R; Vestweber, D; Eistetter, H; Lottspeich, F; Engel, J; Dölz, R; Jähnig, F; Epplen, J; Mayer, S
1987-01-01
We have determined the amino acid sequence of the Ca2+-dependent cell adhesion molecule uvomorulin as it appears on the cell surface. The extracellular part of the molecule exhibits three internally repeated domains of 112 residues which are most likely generated by gene duplication. Each of the repeated domains contains two highly conserved units which could represent putative Ca2+-binding sites. Secondary structure predictions suggest that the putative Ca2+-binding units are located in external loops at the surface of the protein. The protein sequence exhibits a single membrane-spanning region and a cytoplasmic domain. Sequence comparison reveals extensive homology to the chicken L-CAM. Both uvomorulin and L-CAM are identical in 65% of their entire amino acid sequence suggesting a common origin for both CAMs. PMID:3501370
Horibata, Y; Okino, N; Ichinose, S; Omori, A; Ito, M
2000-10-06
Endoglycoceramidase (EC ) is an enzyme capable of cleaving the glycosidic linkage between oligosaccharides and ceramides in various glycosphingolipids. We report here the purification, characterization, and cDNA cloning of a novel endoglycoceramidase from the jellyfish, Cyanea nozakii. The purified enzyme showed a single protein band estimated to be 51 kDa on SDS-polyacrylamide gel electrophoresis. The enzyme showed a pH optimum of 3.0 and was activated by Triton X-100 and Lubrol PX but not by sodium taurodeoxycholate. This enzyme preferentially hydrolyzed gangliosides, especially GT1b and GQ1b, whereas neutral glycosphingolipids were somewhat resistant to hydrolysis by the enzyme. A full-length cDNA encoding the enzyme was cloned by 5'- and 3'-rapid amplification of cDNA ends using a partial amino acid sequence of the purified enzyme. The open reading frame of 1509 nucleotides encoded a polypeptide of 503 amino acids including a signal sequence of 25 residues and six potential N-glycosylation sites. Interestingly, the Asn-Glu-Pro sequence, which is the putative active site of Rhodococcus endoglycoceramidase, was conserved in the deduced amino acid sequences. This is the first report of the cloning of an endoglycoceramidase from a eukaryote.
Ding, Zhong; Peng, Deliang; Huang, Wenkun; He, Wenting; Gao, Bida
2008-02-01
A cDNA, named Dd-ace-2, encoding an acetylcholinesterase (AChE, EC 3.1.1.7) was isolated from the sweet potato stem nematode, Ditylenchus destructor. The nucleotide and amino acid sequences of different nematode species were compared and analyzed with the DNAMAN 5.0 and MEGA 3.0 software packages. The results showed that the complete nucleotide sequence of the Dd-ace-2 gene of Ditylenchus destructor contains 2425 base pairs, from which a protein of 734 amino acids was deduced (GenBank accession No. EF583058). The homology rates between the Dd-ace-2 amino acid sequence of Ditylenchus destructor and those of Meloidogyne incognita, Caenorhabditis elegans and Dictyocaulus viviparus were 48.0%, 42.7% and 42.1%, respectively. The mature acetylcholinesterase of Ditylenchus destructor may correspond to the first 701 residues of the deduced 734 amino acids. The conserved motifs involved in the catalytic triad, the choline binding site and 10 aromatic residues lining the catalytic gorge were present in the Dd-ace-2 deduced protein. Phylogenetic analysis based on AChEs of other nematodes and species showed that the deduced AChE clustered with the ACE-2s.
Ono, K; Ohtomo, T; Sato, S; Sugamata, Y; Suzuki, M; Hisamoto, N; Ninomiya-Tsuji, J; Tsuchiya, M; Matsumoto, K
2001-06-29
TAK1, a member of the MAPKKK family, is involved in the intracellular signaling pathways mediated by transforming growth factor beta, interleukin 1, and Wnt. TAK1 kinase activity is specifically activated by the TAK1-binding protein TAB1. The C-terminal 68-amino acid sequence of TAB1 (TAB1-C68) is sufficient for TAK1 interaction and activation. Analysis of various truncated versions of TAB1-C68 defined a C-terminal 30-amino acid sequence (TAB1-C30) necessary for TAK1 binding and activation. NMR studies revealed that the TAB1-C30 region has a unique alpha-helical structure. We identified a conserved sequence motif, PYVDXA/TXF, in the C-terminal domain of mammalian TAB1, Xenopus TAB1, and its Caenorhabditis elegans homolog TAP-1, suggesting that this motif constitutes a specific TAK1 docking site. Alanine substitution mutagenesis showed that TAB1 Phe-484, located in the conserved motif, is crucial for TAK1 binding and activation. The C. elegans homolog of TAB1, TAP-1, was able to interact with and activate the C. elegans homolog of TAK1, MOM-4. However, the site in TAP-1 corresponding to Phe-484 of TAB1 is an alanine residue (Ala-364), and changing this residue to Phe abrogates the ability of TAP-1 to interact with and activate MOM-4. These results suggest that the Phe or Ala residue within the conserved motif of the TAB1-related proteins is important for interaction with and activation of specific TAK1 MAPKKK family members in vivo.
The reduced genomes of Parcubacteria (OD1) contain signatures of a symbiotic lifestyle
Nelson, William C.; Stegen, James C.
2015-01-01
Candidate phylum OD1 bacteria (also referred to as Parcubacteria) have been identified in a broad range of anoxic environments through community survey analysis. Although none of these species have been isolated in the laboratory, several genome sequences have been reconstructed from metagenomic sequence data and single-cell sequencing. The organisms have small (generally <1 Mb) genomes with severely reduced metabolic capabilities. We have reconstructed 8 partial to near-complete OD1 genomes from oxic groundwater samples, and compared them against existing genomic data. The conserved core gene set comprises 202 genes, or ~28% of the genomic complement. “Housekeeping” genes and genes for biosynthesis of peptidoglycan and Type IV pilus production are conserved. Gene sets for biosynthesis of cofactors, amino acids, nucleotides, and fatty acids are absent entirely or greatly reduced. The only aspects of energy metabolism conserved are the non-oxidative branch of the pentose-phosphate shunt and central glycolysis. These organisms also lack some activities conserved in almost all other known bacterial genomes, including signal recognition particle, pseudouridine synthase A, and FAD synthase. Pan-genome analysis indicates a broad genotypic diversity and perhaps a highly fluid gene complement, indicating historical adaptation to a wide range of growth environments and a high degree of specialization. The genomes were examined for signatures suggesting either a free-living, streamlined lifestyle, or a symbiotic lifestyle. The lack of biosynthetic capabilities and DNA repair, along with the presence of potential attachment and adhesion proteins suggest that the Parcubacteria are ectosymbionts or parasites of other organisms. The wide diversity of genes that potentially mediate cell-cell contact suggests a broad range of partner/prey organisms across the phylum. PMID:26257709
The reduced genomes of Parcubacteria (OD1) contain signatures of a symbiotic lifestyle
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nelson, William C.; Stegen, James C.
2015-07-21
Candidate phylum OD1 bacteria (also referred to as Parcubacteria) have been identified in a broad range of anoxic environments through community survey analysis. Although none of these species have been isolated in the laboratory, several genome sequences have been reconstructed from metagenomic sequence data and single-cell sequencing. The organisms have small (generally <1 Mb) genomes with severely reduced metabolic capabilities. We have reconstructed 8 partial to near-complete OD1 genomes from oxic groundwater samples, and compared them against existing genomic data. The conserved core gene set comprises 202 genes, or ~28% of the genomic complement. ‘Housekeeping’ genes and genes for biosynthesis of peptidoglycan and Type IV pilus production are conserved. Gene sets for biosynthesis of cofactors, amino acids, nucleotides and fatty acids are absent entirely or greatly reduced. The only aspects of energy metabolism conserved are the non-oxidative branch of the pentose-phosphate shunt and central glycolysis. These organisms also lack some activities conserved in almost all other known bacterial genomes, including signal recognition particle, pseudouridine synthase A, and FAD synthase. Pan-genome analysis indicates a broad genotypic diversity and perhaps a highly fluid gene complement, indicating historical adaptation to a wide range of growth environments and a high degree of specialization. The genomes were examined for signatures suggesting either a free-living, streamlined lifestyle or a symbiotic lifestyle. The lack of biosynthetic capabilities and DNA repair, along with the presence of potential attachment and adhesion proteins suggest the Parcubacteria are ectosymbionts or parasites of other organisms. The wide diversity of genes that potentially mediate cell-cell contact suggests a broad range of partner/prey organisms across the phylum.
2013-01-01
ATP-binding cassette transporter G1 (ABCG1) mediates cholesterol and oxysterol efflux onto lipidated lipoproteins and plays an important role in macrophage reverse cholesterol transport. Here, we identified a highly conserved sequence present in the five ABCG transporter family members. The conserved sequence is located between the nucleotide binding domain and the transmembrane domain and contains five amino acid residues from Asn at position 316 to Phe at position 320 in ABCG1 (NPADF). We found that cells expressing mutant ABCG1, in which Asn316, Pro317, Asp319, and Phe320 in the conserved sequence were replaced with Ala simultaneously, showed impaired cholesterol efflux activity compared with wild type ABCG1-expressing cells. A more detailed mutagenesis study revealed that mutation of Asn316 or Phe 320 to Ala significantly reduced cellular cholesterol and 7-ketocholesterol efflux conferred by ABCG1, whereas replacement of Pro317 or Asp319 with Ala had no detectable effect. To confirm the important role of Asn316 and Phe320, we mutated Asn316 to Asp (N316D) and Gln (N316Q), and Phe320 to Ile (F320I) and Tyr (F320Y). The mutant F320Y showed the same phenotype as wild type ABCG1. However, the efflux of cholesterol and 7-ketocholesterol was reduced in cells expressing ABCG1 mutant N316D, N316Q, or F320I compared with wild type ABCG1. Further, mutations N316Q and F320I impaired ABCG1 trafficking while having no marked effect on the stability and oligomerization of ABCG1. The mutant N316Q and F320I could not be transported to the cell surface efficiently. Instead, the mutant proteins were mainly localized intracellularly. Thus, these findings indicate that the two highly conserved amino acid residues, Asn and Phe, play an important role in ABCG1-dependent export of cellular cholesterol, mainly through the regulation of ABCG1 trafficking. PMID:24320932
NASA Astrophysics Data System (ADS)
Wang, Bin; Shao, Yanchun; Chen, Tao; Chen, Wanping; Chen, Fusheng
2015-12-01
Acetobacter pasteurianus (Ap) CICC 20001 and CGMCC 1.41 are two acetic acid bacteria strains that, because of their strong abilities to produce and tolerate high concentrations of acetic acid, have been widely used to brew vinegar in China. To globally understand the fermentation characteristics, acid-tolerant mechanisms and genetic stabilities, their genomes were sequenced. Genomic comparisons with 9 other sequenced Ap strains revealed that their chromosomes were evolutionarily conserved, whereas the plasmids were unique compared with other Ap strains. Analysis of the acid-tolerant metabolic pathway at the genomic level indicated that the metabolism of some amino acids and the known mechanisms of acetic acid tolerance, might collaboratively contribute to acetic acid resistance in Ap strains. The balance of instability factors and stability factors in the genomes of Ap CICC 20001 and CGMCC 1.41 strains might be the basis for their genetic stability, consistent with their stable industrial performances. These observations provide important insights into the acid resistance mechanism and the genetic stability of Ap strains and lay a foundation for future genetic manipulation and engineering of these two strains.
Kim, Juhan; Kyung, Dohyun; Yun, Hyungdon; Cho, Byung-Kwan; Seo, Joo-Hyun; Cha, Minho; Kim, Byung-Gee
2007-01-01
A novel β-transaminase gene was cloned from Mesorhizobium sp. strain LUK. By using N-terminal sequence and an internal protein sequence, a digoxigenin-labeled probe was made for nonradioactive hybridization, and a 2.5-kb gene fragment was obtained by colony hybridization of a cosmid library. Through Southern blotting and sequence analysis of the selected cosmid clone, the structural gene of the enzyme (1,335 bp) was identified, which encodes a protein of 47,244 Da with a theoretical pI of 6.2. The deduced amino acid sequence of the β-transaminase showed the highest sequence similarity with glutamate-1-semialdehyde aminomutase of transaminase subgroup II. The β-transaminase showed higher activities toward d-β-aminocarboxylic acids such as 3-aminobutyric acid, 3-amino-5-methylhexanoic acid, and 3-amino-3-phenylpropionic acid. The β-transaminase has an unusually broad specificity for amino acceptors such as pyruvate and α-ketoglutarate/oxaloacetate. The enantioselectivity of the enzyme suggested that the recognition mode of β-aminocarboxylic acids in the active site is reversed relative to that of α-amino acids. After comparison of its primary structure with transaminase subgroup II enzymes, it was proposed that R43 interacts with the carboxylate group of the β-aminocarboxylic acids and the carboxylate group on the side chain of dicarboxylic α-keto acids such as α-ketoglutarate and oxaloacetate. R404 is another conserved residue, which interacts with the α-carboxylate group of the α-amino acids and α-keto acids. The β-transaminase was used for the asymmetric synthesis of enantiomerically pure β-aminocarboxylic acids. (3S)-Amino-3-phenylpropionic acid was produced from the ketocarboxylic acid ester substrate by coupled reaction with a lipase using 3-aminobutyric acid as amino donor. PMID:17259358
Matsuo, Taisuke; Yamamoto, Atsushi; Yamamoto, Takenori; Otsuki, Kaoru; Yamazaki, Naoshi; Kataoka, Masatoshi; Terada, Hiroshi; Shinohara, Yasuo
2010-04-01
Liver- and heart/muscle-type isozymes of human carnitine palmitoyltransferase I (L- and M-CPTI, respectively) show a certain similarity in their amino acid sequences, and mutation studies on the conserved amino acids between these two isozymes often show essentially the same effects on their enzymatic properties. Earlier mutation studies on C305 in human M-CPTI and its counterpart residue, C304, in human L-CPTI showed distinct effects of the mutations, especially in the aspect of enzyme stability; however, simple comparison of these effects on the conserved Cys residue between L- and M-CPTI was difficult, because these studies were carried out using different expression systems and distinct amino acids as replacements. In the present study, we carried out mutation studies on the C305 in human M-CPTI using COS cells for the expression system. Our results showed that C305 was replaceable with aspartic acid but that substitution with other amino acids caused both loss of function and reduced expression.
The complete nucleotide sequence of RNA 3 of a peach isolate of Prunus necrotic ringspot virus.
Hammond, R W; Crosslin, J M
1995-04-01
The complete nucleotide sequence of RNA 3 of the PE-5 peach isolate of Prunus necrotic ringspot ilarvirus (PNRSV) was obtained from cloned cDNA. The RNA sequence is 1941 nucleotides and contains two open reading frames (ORFs). ORF 1 consisted of 284 amino acids with a calculated molecular weight of 31,729 Da and ORF 2 contained 224 amino acids with a calculated molecular weight of 25,018 Da. ORF 2 corresponds to the coat protein gene. Expression of ORF 2 engineered into a pTrcHis vector in Escherichia coli results in a fusion polypeptide of approximately 28 kDa which cross-reacts with PNRSV polyclonal antiserum. Analysis of the coat protein amino acid sequence reveals a putative "zinc-finger" domain at the amino-terminal portion of the protein. Two tetranucleotide AUGC motifs occur in the 3'-UTR of the RNA and may function in coat protein binding and genome activation. ORF 1 homologies to other ilarviruses and alfalfa mosaic virus are confined to limited regions of conserved amino acids. The translated amino acid sequence of the coat protein gene shows 92% similarity to one isolate of apple mosaic virus, a closely related member of the ilarvirus group of plant viruses, but only 66% similarity to the amino acid sequence of the coat protein gene of a second isolate. These relationships are also reflected at the nucleotide sequence level. These results in one instance confirm the close similarities observed at the biophysical and serological levels between these two viruses, but on the other hand call into question the nomenclature used to describe these viruses.
Nandi, Soumyadeep; Mehra, Nipun; Lynn, Andrew M; Bhattacharya, Alok
2005-09-09
Theoretical proteome analysis, generated by plotting theoretical isoelectric points (pI) against molecular masses of all proteins encoded by the genome, shows a multimodal distribution for pI. This multimodal distribution is an effect of allowed combinations of the charged amino acids, and not due to evolutionary causes. The variation in this distribution can be correlated to the organism's ecological niche. Contributions to this variation may be mapped to individual proteins by studying the variation in pI of orthologs across microorganism genomes. The distribution of ortholog pI values showed trimodal distributions for all prokaryotic genomes analyzed, similar to whole proteome plots. Pairwise analysis of pI variation shows that a few COGs are conserved within, but most vary between, the acidic and basic regions of the distribution, while molecular mass is more highly conserved. At the level of functional grouping of orthologs, five groups vary significantly from the population of orthologs, which is attributed to either conservation at the level of sequences or a bias for either positively or negatively charged residues contributing to the function. Individual COGs conserved in both the acidic and basic regions of the trimodal distribution are identified, and orthologs that best represent the variation in levels of the acidic and basic regions are listed. The analysis of pI distribution by using orthologs provides a basis for resolution of theoretical proteome comparison at the level of individual proteins. Orthologs identified that significantly vary between the major acidic and basic regions may be used as representative of the variation of the entire proteome.
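A theoretical pI of the kind plotted in this study can be estimated from sequence alone by finding the pH at which the summed Henderson-Hasselbalch charges of the ionizable groups cancel. The sketch below is a minimal illustration of that idea, not the authors' pipeline: the pKa values are an assumed illustrative set (published pKa tables differ, and the abstract does not say which one was used), and the function names are hypothetical.

```python
# Assumed, illustrative side-chain and terminal pKa values.
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM_PKA, C_TERM_PKA = 8.6, 3.6


def net_charge(seq, ph):
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))    # protonated N-terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))   # deprotonated C-terminus
    for aa in seq.upper():
        if aa in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[aa]))
        elif aa in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[aa] - ph))
    return charge


def theoretical_pi(seq, tol=1e-4):
    """Bisect on pH in [0, 14]; net charge decreases monotonically with pH."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid   # still positively charged: pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2.0


for peptide in ("DDDDEEEE", "KKKKRRRR"):
    print(peptide, round(theoretical_pi(peptide), 2))
```

Applying such a function to every predicted protein of a genome and plotting the result against molecular mass gives the kind of pI distribution described above; acid-rich sequences land in the acidic mode and lysine- or arginine-rich sequences in the basic mode.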
Trcek, Janja
2005-10-01
Acetic acid bacteria (AAB) are well known for oxidizing different ethanol-containing substrates into various types of vinegar. They are also used for production of some biotechnologically important products, such as sorbose and gluconic acids. However, their presence is not always appreciated since certain species also spoil wine, juice, beer and fruits. To be able to follow AAB in all these processes, the species involved must be identified accurately and quickly. Because phenotypic analysis of AAB is inaccurate and very time-consuming, the application of molecular methods is necessary. Since the pairwise comparison among the 16S rRNA gene sequences of AAB shows very high similarity (up to 99.9%), other DNA targets should be used. Our previous studies showed that the restriction analysis of the 16S-23S rDNA internal transcribed spacer region is a suitable approach for quick affiliation of an acetic acid bacterium to a distinct group of restriction types and also for quick identification of a potentially novel species of acetic acid bacterium (Trcek & Teuber 2002; Trcek 2002). However, with the exception of two conserved genes, encoding tRNAIle and tRNAAla, the sequences of 16S-23S rDNA are highly divergent among AAB species. For this reason we analyzed in this study a gene encoding PQQ-dependent ADH as a possible DNA target. First we confirmed the expression of subunit I of PQQ-dependent ADH (AdhA) also in Asaia, the only genus of AAB which exhibits little or no ADH activity. Further, we analyzed the partial sequences of adhA among some representative species of the genera Acetobacter, Gluconobacter and Gluconacetobacter. The conserved and variable regions in these sequences made possible the construction of an A. aceti-specific oligonucleotide, the specificity of which was confirmed in PCR using 45 well-defined strains of AAB as DNA templates. The primer was also successfully used in direct identification of A. aceti from homemade cider vinegar as well as for revealing the misclassification of strain IFO 3283 into the species A. aceti.
Kataoka, M; Delacruz-Hidalgo, A-R G; Akond, M A; Sakuradani, E; Kita, K; Shimizu, S
2004-04-01
The genes encoding two conjugated polyketone reductases (CPR-C1, CPR-C2) of Candida parapsilosis IFO 0708 were cloned and sequenced. The genes encoded a total of 304 and 307 amino acid residues for CPR-C1 and CPR-C2, respectively. The deduced amino acid sequences of the two enzymes showed high similarity to each other and to several proteins of the aldo-keto reductase (AKR) superfamily. However, several amino acid residues in putative active sites of AKRs were not conserved in CPR-C1 and CPR-C2. The two CPR genes were overexpressed in Escherichia coli. The E. coli transformant bearing the CPR-C2 gene almost stoichiometrically reduced 30 mg ketopantoyl lactone/ml to D-pantoyl lactone.
2010-01-01
Background Multiple sequence alignments are used to study gene or protein function, phylogenetic relations, genome evolution hypotheses and even gene polymorphisms. Virtually without exception, all available tools focus on conserved segments or residues. Small divergent regions, however, are biologically important for specific quantitative polymerase chain reaction, genotyping, molecular markers and preparation of specific antibodies, and yet have received little attention. As a consequence, they must be selected empirically by the researcher. AlignMiner has been developed to fill this gap in bioinformatic analyses. Results AlignMiner is a Web-based application for detection of conserved and divergent regions in alignments of conserved sequences, focusing particularly on divergence. It accepts alignments (protein or nucleic acid) obtained using any of a variety of algorithms, the choice of which does not appear to have a significant impact on the final results. AlignMiner uses different scoring methods for assessing conserved/divergent regions, Entropy being the method that provides the highest number of regions with the greatest length, and Weighted being the most restrictive. Conserved/divergent regions can be generated either with respect to the consensus sequence or to one master sequence. The resulting data are presented in a graphical interface developed in AJAX, which provides remarkable user interaction capabilities. Users do not need to wait until execution is complete and can even inspect their results on a different computer. Data can be downloaded onto a user disk, in standard formats. In silico and experimental proof-of-concept cases have shown that AlignMiner can be successfully used to design specific polymerase chain reaction primers as well as potential epitopes for antibodies. Primer design is assisted by a module that deploys several oligonucleotide parameters for designing primers "on the fly".
Conclusions AlignMiner can be used to reliably detect divergent regions via several scoring methods that provide different levels of selectivity. Its predictions have been verified by experimental means. Hence, it is expected that its usage will save researchers' time and ensure an objective selection of the best-possible divergent region when closely related sequences are analysed. AlignMiner is freely available at http://www.scbi.uma.es/alignminer. PMID:20525162
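The idea of scoring alignment columns by entropy and reporting runs of high-scoring columns as divergent regions can be sketched in a few lines. This is a generic Shannon-entropy illustration under stated assumptions, not AlignMiner's actual scoring code: the function names, the threshold and the minimum region length are all hypothetical, and gap characters are simply treated as one more symbol.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the symbols in one alignment column."""
    n = len(column)
    return sum((c / n) * math.log2(n / c) for c in Counter(column).values())


def divergent_regions(alignment, threshold=0.5, min_len=2):
    """Return per-column entropies and half-open (start, end) runs above threshold."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = [column_entropy([seq[i] for seq in alignment]) for i in range(length)]
    regions, start = [], None
    for i, score in enumerate(scores):
        if score > threshold and start is None:
            start = i                      # a divergent run begins
        elif score <= threshold and start is not None:
            if i - start >= min_len:
                regions.append((start, i))
            start = None                   # the run ends at a conserved column
    if start is not None and length - start >= min_len:
        regions.append((start, length))    # run extends to the end of the alignment
    return scores, regions


toy_alignment = ["ACGTAC", "ACGAAC", "ACCTAC", "ACTCAC"]
scores, regions = divergent_regions(toy_alignment)
print(scores)
print(regions)
```

In the toy alignment only the two central columns vary, so they form the single reported region; raising the threshold or the minimum length makes the selection more restrictive, which is the kind of selectivity trade-off the abstract attributes to its different scoring methods.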
NASA Astrophysics Data System (ADS)
Sethaphong, Latsavongsakda
This work examines smart material properties of rational self-assembly and molecular recognition found in nano-biosystems. Exploiting the sequence and structural information encoded within nucleic acids and proteins will permit programmed synthesis of nanomaterials and help create molecular machines that may carry out new roles involving chemical catalysis and bioenergy. Responsive to different ionic environments through self-reorganization, nucleic acids (NA) are nature's signature smart material; organisms such as viruses and bacteria use features of NAs to react to their environment and orchestrate their lifecycle. Furthermore, nucleic acid systems (both RNA and DNA) are currently exploited as scaffolds; recent applications have been showcased to build bioelectronics and biotemplated nanostructures via directed assembly of multidimensional nanoelectronic devices 1. Since the most stable and rudimentary structure of nucleic acids is the helical duplex, these were modeled in order to examine the influence of the microenvironment, sequence, and cation-dependent perturbations of their canonical forms. Due to their negatively charged phosphate backbone, NAs rely on counterions to overcome the inherent repulsive forces that arise from the assembly of two complementary strands. As a realistic model system, we chose the HIV-TAR helix (PDB ID: 397D) to study the effect of specific sequence motifs on cation sequestration. At physiologically relevant concentrations of sodium and potassium ions, we observed sequence-based effects where purine stretches were adept at retaining high-residency cations. The transitional space between adenine and guanosine nucleotides (ApG step) in a sequence proved the most favorable. This work was the first to directly show these subtle interactions of sequence-based cationic sequestration and may be useful for controlling metallization of nucleic acids in conductive nanowires.
Extending the study further, we explored the degree to which the structure of NA duplexes alone interacted with cations distinct from a specific sequence. Under physiologically relevant conditions, a duplex of RNA polyguanine-polycytidine was highly responsive and able to sequester cations to the middle of the purine stretches. The least responsive structure was a DNA polyadenine-polythymine duplex. A random sequence DNA duplex contorted into an RNA-like helix resulted in cationic dynamics similar to RNA systems. These studies showed that cation diffusive binding events in nucleic acid duplex structures are sequence specific and heavily influenced by structural aspects of the helical forms, which account for much of the differences observed. Although structural information in nucleic acids is encoded within their sequence, linking amino acid sequence to protein structure is murkier; the structural information within proteins is encoded by the folding process itself: a complex phenomenon driven toward the equilibrium state of the active conformation. Upwards of two thirds of a protein's sequence can be substituted with similar amino acids without significantly perturbing its function; conserved residues of about 10% seem to be vital; since evolutionary selection pressure in proteins operates three-dimensionally, a linear sequence is only partially informative. We explored this problem by folding de novo the cytosolic portion of the membrane protein, cellulose synthase, CESA1 from upland cotton, Gossypium hirsutum (Ghcesa1). The cytoplasmic region was generated by homology modeling and refined with molecular dynamics. These mutations impair local structural flexibility which likely results in cellulose that is produced at a lower rate and is less crystalline. Additional modeling of fragments of cellulose synthases from the model plant, Arabidopsis thaliana, offered novel insights into the function of conserved cytosolic domains within plant cellulose synthases.
Transport mechanisms related to the transmembrane region revealed significant differences between plants and a bacterial complex. These studies generated possible mutations that may allow for the creation of new synthases and identified other avenues of research in order to develop technologies that may alter the crystallinity and other useful properties of cellulose. 1. Karplus, K., SAM-T08, HMM-based protein structure prediction. Nucleic Acids Research, 2009. 37: p. W492-W497.
Taravat, Elham; Zebarjadi, Alireza; Kahrizi, Danial; Yari, Kheirollah
2015-05-01
Among the essential amino acids, phenylalanine, tryptophan, and tyrosine are aromatic amino acids which are synthesized by the shikimate pathway in plants and bacteria. The herbicide glyphosate can inhibit the biosynthesis of these amino acids, so identification of genes that confer tolerance to glyphosate is very important. It has been shown that the common reed or Phragmites australis Cav. (Poaceae) is relatively tolerant to glyphosate. The aim of the current research is identification, cloning, sequencing, and registering of a partial aro A gene of the common reed P. australis. The partial aro A gene of common reed (P. australis) was cloned in Escherichia coli and the amino acid sequence was determined for the first time. This is the first report of isolation, cloning, and sequencing of a part of the aro A gene from the common reed. A 670 bp fragment including two introns (86 bp and 289 bp) was obtained. The open reading frame (ORF) in this part of the gene encodes 98 amino acids. Alignment showed high similarity of this region with Zea mays (L.) (Poaceae) (94.6%), Eleusine indica L. Gaertn (Poaceae) (94.2%), and Zoysia japonica Steud. (Poaceae) (94.2%). The alignment of the amino acid sequence of the investigated part of the gene showed homology with aro A from several other plants. This conserved region forms the enzyme active site. The alignment results of nucleotide and amino acid residues with related sequences showed that there are some differences among them. The relative glyphosate tolerance in the common reed may be related to these differences.
2011-07-27
domain (type 2 phosphatidic acid phosphatase) and may be a PAP2 like superfamily member. In order to localize the promoter(s) for these three genes...which amino acid residue(s) was critical for the enzyme activity. This enzyme possesses a...analyzed the role of eight conserved amino acid residues. The amino acids to be mutated were chosen based on the sequence alignment of several class C
Esmaelizad, Majid; Jelokhani-Niaraki, Saber; Hashemnejad, Khadije; Kamalzadeh, Morteza; Lotfi, Mohsen
2011-12-01
The nucleotide sequence of the VP1 (1D) and partial 3D polymerase (3D(pol)) coding regions of the foot and mouth disease virus (FMDV) vaccine strain A/Iran87, a highly passaged isolate (~150 passages), was determined and aligned with previously published FMDV serotype A sequences. Overall analysis of the amino acid substitutions revealed that the partial 3D(pol) coding region contained four amino acid alterations. Amino acid sequence comparison of the VP1 coding region of the field isolates revealed deletions in the highly passaged Iranian isolate (A/Iran87). The prominent G-H loop of the FMDV VP1 protein contains the conserved arginine-glycine-aspartic acid (RGD) tripeptide, which is a well-known ligand for a specific cell surface integrin. Despite losing the RGD sequence of the VP1 protein and an Asp(26)→Glu substitution in a beta sheet located within a small groove of the 3D(pol) protein, the virus grew in BHK 21 suspension cell cultures. Since this strain has been used as a vaccine strain, it may be inferred that the RGD deletion has no critical role in virus attachment to the cell during the initiation of infection. It is probable that this FMDV subtype can utilize other pathways for cell attachment.
NASA Technical Reports Server (NTRS)
Haney, P. J.; Badger, J. H.; Buldak, G. L.; Reich, C. I.; Woese, C. R.; Olsen, G. J.
1999-01-01
The genome sequence of the extremely thermophilic archaeon Methanococcus jannaschii provides a wealth of data on proteins from a thermophile. In this paper, sequences of 115 proteins from M. jannaschii are compared with their homologs from mesophilic Methanococcus species. Although the growth temperatures of the mesophiles are about 50 degrees C below that of M. jannaschii, their genomic G+C contents are nearly identical. The properties most correlated with the proteins of the thermophile include higher residue volume, higher residue hydrophobicity, more charged amino acids (especially Glu, Arg, and Lys), and fewer uncharged polar residues (Ser, Thr, Asn, and Gln). These are recurring themes, with all trends applying to 83-92% of the proteins for which complete sequences were available. Nearly all of the amino acid replacements most significantly correlated with the temperature change are the same relatively conservative changes observed in all proteins, but in the case of the mesophile/thermophile comparison there is a directional bias. We identify 26 specific pairs of amino acids with a statistically significant (P < 0.01) preferred direction of replacement.
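The "preferred direction of replacement" reported here can be pictured as a test on counts of a replacement observed in each direction. The sketch below is an assumption about how such a test could look, using an exact two-sided sign test; the abstract does not say which test the authors applied, the counts are invented, and no multiple-testing correction is attempted.

```python
from math import comb


def two_sided_sign_test(forward, reverse):
    """Exact two-sided binomial p-value for H0: both directions equally likely.

    `forward` and `reverse` are counts of a replacement seen in each direction
    (for example mesophile -> thermophile versus the opposite)."""
    n = forward + reverse
    if n == 0:
        raise ValueError("no replacements observed")
    extreme = max(forward, reverse)
    # Upper-tail probability under Binomial(n, 0.5); doubled because the
    # null distribution is symmetric.
    tail = sum(comb(n, i) for i in range(extreme, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts, not taken from the paper.
counts = {("K", "R"): (30, 8), ("S", "A"): (12, 11)}
biased = {}
for pair, (fwd, rev) in counts.items():
    p_value = two_sided_sign_test(fwd, rev)
    if p_value < 0.01:
        biased[pair] = p_value
print(sorted(biased))
```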
Tange, N; Jong-Young, L; Mikawa, N; Hirono, I; Aoki, T
1997-12-01
A cDNA clone of rainbow trout (Oncorhynchus mykiss) transferrin was obtained from a liver cDNA library. The 2537-bp cDNA sequence contained an open reading frame encoding 691 amino acids and the 5' and 3' noncoding regions. The amino acid sequences at the iron-binding sites and the two N-linked glycosylation sites, and the cysteine residues were consistent with known, conserved vertebrate transferrin cDNA sequences. Single N-linked glycosylation sites existed on the N- and C-lobe. The deduced amino acid sequence of the rainbow trout transferrin cDNA had 92.9% identities with transferrin of coho salmon (Oncorhynchus kisutch); 85%, Atlantic salmon (Salmo salar); 67.3%, medaka (Oryzias latipes); 61.3% Atlantic cod (Gadus morhua); and 59.7%, Japanese flounder (Paralichthys olivaceus). The long and accurate polymerase chain reaction (LA-PCR) was used to amplify approximately 6.5 kb of the transferrin gene from rainbow trout genomic DNA. Restriction fragment length polymorphisms (RFLPs) of the LA-PCR products revealed three digestion patterns in 22 samples.
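Percent identity figures such as those quoted above depend on how the aligned pair is scored. A minimal sketch, assuming two pre-aligned sequences of equal length and a denominator that skips any column containing a gap (other conventions divide by alignment length or by the shorter sequence); the function name and the toy sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Made-up aligned fragments: 6 ungapped columns, 5 identical.
identity = percent_identity("MKLV-AT", "MKIVGAT")
print(round(identity, 1))
```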
Li, Wenli; Terenius, Olle; Hirai, Makoto; Nilsson, Anders S; Faye, Ingrid
2005-01-01
The Chinese oak silk moth Antheraea pernyi is an important silk producer. To understand microbial resistance of this moth, we cloned Hemolin, encoding a multifunctional immune protein belonging to the immunoglobulin superfamily, and examined the expression in gonads and fat body. The ApHemolin amino acid sequence was compared to other Hemolin sequences in order to predict functional sites. Several sites were conserved; among them a phosphate binding site, which according to 3D structure modelling does not appear in neuroglian, the phylogenetically closest related protein. In addition, two conserved KDG sequences in the C-C' loop of immunoglobulin domains 1 and 3 give rise to gamma-turns, a common motif in the C'-C'' loop of the hypervariable region L2 in vertebrate immunoglobulins. The comparisons also show variable regions of specific interest for future studies of hemolin and its interaction with microbial entities.
[Sequencing and analysis of the complete genome of a rabies virus isolate from Sika deer].
Zhao, Yun-Jiao; Guo, Li; Huang, Ying; Zhang, Li-Shi; Qian, Ai-Dong
2008-05-01
One DRV strain was isolated from Sika deer brain and sequenced. Nine overlapping gene fragments were amplified by RT-PCR together with the 3'-RACE and 5'-RACE methods, and the complete DRV genome sequence was assembled. The complete genome is 11,863 bp in length. The DRV genome organization was similar to that of other rabies viruses: it is composed of five genes, and the initiation and termination sites are highly conserved. There were mutated amino acids in important antigenic sites of the nucleoprotein and glycoprotein. The nucleotide and amino acid homologies of the N, P, M, G and L genes were compared among strains with completed genomic sequencing. A phylogenetic tree was constructed by comparison with the N gene sequences of other typical rabies viruses. These results indicated that DRV belongs to genotype 1. The highest homology, 94%, was with the Chinese vaccine strain 3aG, and the lowest, 71%, was with WCBV. These findings provide a theoretical reference for further research on rabies virus.
De Groot, Anne S; Martin, William; Moise, Leonard; Guirakhoo, Farshad; Monath, Thomas
2007-11-19
T-cell epitope variability is associated with viral immune escape and may influence the outcome of vaccination against the highly variable Japanese Encephalitis Virus (JEV). We computationally analyzed the ChimeriVax-JEV vaccine envelope sequence for T helper epitopes that are conserved in 12 circulating JEV strains and discovered 75% conservation among putative epitopes. Among non-identical epitopes, only minor amino acid changes that would not significantly affect HLA-binding were present. Therefore, in most cases, circulating strain epitopes could be restricted by the same HLA and are likely to stimulate a cross-reactive T-cell response. Based on this analysis, we predict no significant abrogation of ChimeriVax-JEV-conferred protection against circulating JEV strains.
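The conservation figure in this abstract is, in essence, the share of vaccine-derived putative epitopes that recur unchanged in circulating strains. A minimal sketch of that bookkeeping, assuming exact string matching of each peptide against every strain sequence; the peptides and strain fragments below are invented for illustration, and the actual analysis also weighed HLA-binding effects of near-identical peptides, which this sketch ignores.

```python
def conserved_epitopes(epitopes, strains):
    """Return the epitopes present verbatim in every strain, and their fraction."""
    if not epitopes:
        raise ValueError("at least one epitope is required")
    conserved = [e for e in epitopes if all(e in s for s in strains)]
    return conserved, len(conserved) / len(epitopes)


# Made-up data: four 9-mers and two strain fragments differing at one residue.
epitopes = ["AKLMNPQRS", "TTVWYACDE", "KLMNPQRST", "MNPQRSTTV"]
strains = [
    "MAKLMNPQRSTTVWYACDEGG",
    "MAKLMNPQRSTTVWFACDEGG",  # Y -> F change disrupts one peptide
]
conserved, fraction = conserved_epitopes(epitopes, strains)
print(conserved, fraction)
```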
Quéméneur, Marianne; Heinrich-Salmeron, Audrey; Muller, Daniel; Lièvremont, Didier; Jauzein, Michel; Bertin, Philippe N.; Garrido, Francis; Joulian, Catherine
2008-01-01
A new primer set was designed to specifically amplify ca. 1,100 bp of aoxB genes encoding the As(III) oxidase catalytic subunit from taxonomically diverse aerobic As(III)-oxidizing bacteria. Comparative analysis of AoxB protein sequences showed variable conservation levels and highlighted the conservation of essential amino acids and structural motifs. AoxB phylogeny of pure strains showed well-discriminated taxonomic groups and was similar to 16S rRNA phylogeny. Alphaproteobacteria-, Betaproteobacteria-, and Gammaproteobacteria-related sequences were retrieved from environmental surveys, demonstrating their prevalence in mesophilic As-contaminated soils. Our study underlines the usefulness of the aoxB gene as a functional marker of aerobic As(III) oxidizers. PMID:18502920
NASA Astrophysics Data System (ADS)
Yu, Jianzhong; Ma, Xiaolei; Pan, Kehou; Yang, Guanpin; Yu, Wengong
2010-07-01
We constructed and characterized a normalized cDNA library of Nannochloropsis oculata CS-179, and obtained 905 nonredundant sequences (NRSs) ranging from 431 to 1,756 bp in length. Among them, 496 were very similar to nonredundant sequences in GenBank (E ≤ 1.0e-05), and 349 ESTs had significant hits with the clusters of eukaryotic orthologous groups (KOG). Bases G and/or C at the third position of codons of 14 amino acid residues suggested a strong bias in the conserved domain of 362 NRSs (>60%). We also identified the unigenes encoding phosphorus and nitrogen transporters, suggesting that N. oculata could efficiently transport and metabolize phosphorus and nitrogen, and recognized the unigenes involved in the biosynthesis and storage of both fatty acids and polyunsaturated fatty acids (PUFAs), which will facilitate the demonstration of the eicosapentaenoic acid (EPA) biosynthesis pathway of N. oculata. In comparison with the original cDNA library, the normalized library significantly increased the efficiency of random sequencing and of discovering rarely expressed genes, and decreased the frequency of abundant gene sequences.
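The third-position G/C bias mentioned above is commonly summarised as GC3, the fraction of codons ending in G or C. A minimal sketch, assuming an in-frame coding sequence; the function name and the example sequences are hypothetical, and the abstract's per-amino-acid breakdown over 14 residues is not reproduced here.

```python
def gc3_fraction(cds):
    """Fraction of codons whose third base is G or C, for an in-frame CDS."""
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("coding sequence must be a non-empty multiple of 3 nt")
    third_bases = cds[2::3]  # every third base, starting at codon position 3
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Made-up sequences: every codon ends in G/C, versus one codon in four.
high = gc3_fraction("ATGGCCAAGTTC")
low = gc3_fraction("ATGGCAAAATTT")
print(high, low)
```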
Purification, amino acid sequence and characterisation of kangaroo IGF-I.
Yandell, C A; Francis, G L; Wheldrake, J F; Upton, Z
1998-01-01
Insulin-like growth factor-I (IGF-I) and IGF-II have been purified to homogeneity from kangaroo (Macropus fuliginosus) serum, thus this represents the first report of the purification, sequencing and characterisation of marsupial IGFs. N-Terminal protein sequencing reveals that there are six amino acid differences between kangaroo and human IGF-I. Kangaroo IGF-II has been partially sequenced and no differences were found between human and kangaroo IGF-II in the 53 residues identified. Thus the IGFs appear to be remarkably structurally conserved during mammalian radiation. In addition, in vitro characterisation of kangaroo IGF-I demonstrated that the functional properties of human, kangaroo and chicken IGF-I are very similar. In an assay measuring the ability of the proteins to stimulate protein synthesis in rat L6 myoblasts, all IGF-I proteins were found to be equally potent. The ability of all three proteins to compete for binding with radiolabelled human IGF-I to type-1 IGF receptors in L6 myoblasts and in Sminthopsis crassicaudata transformed lung fibroblasts, a marsupial cell line, was comparable. Furthermore, kangaroo and human IGF-I react equally in a human IGF-I RIA using a human reference standard, radiolabelled human IGF-I and a polyclonal antibody raised against recombinant human IGF-I. This study indicates that not only is the primary structure of eutherian and metatherian IGF-I conserved, but also the proteins appear to be functionally similar.
Nucleic and Amino Acid Sequences Support Structure-Based Viral Classification.
Sinclair, Robert M; Ravantti, Janne J; Bamford, Dennis H
2017-04-15
Viral capsids ensure viral genome integrity by protecting the enclosed nucleic acids. Interactions between the genome and capsid and between individual capsid proteins (i.e., capsid architecture) are intimate and are expected to be characterized by strong evolutionary conservation. For this reason, a capsid structure-based viral classification has been proposed as a way to bring order to the viral universe. The seeming lack of sufficient sequence similarity to reproduce this classification has made it difficult to reject structural convergence as the basis for the classification. We reinvestigate whether the structure-based classification for viral coat proteins making icosahedral virus capsids is in fact supported by previously undetected sequence similarity. Since codon choices can influence nascent protein folding cotranslationally, we searched for both amino acid and nucleotide sequence similarity. To demonstrate the sensitivity of the approach, we identify a candidate gene for the pandoravirus capsid protein. We show that the structure-based classification is strongly supported by amino acid and also nucleotide sequence similarities, suggesting that the similarities are due to common descent. The correspondence between structure-based and sequence-based analyses of the same proteins shown here allow them to be used in future analyses of the relationship between linear sequence information and macromolecular function, as well as between linear sequence and protein folds. IMPORTANCE Viral capsids protect nucleic acid genomes, which in turn encode capsid proteins. This tight coupling of protein shell and nucleic acids, together with strong functional constraints on capsid protein folding and architecture, leads to the hypothesis that capsid protein-coding nucleotide sequences may retain signatures of ancient viral evolution. We have been able to show that this is indeed the case, using the major capsid proteins of viruses forming icosahedral capsids. 
Importantly, we detected similarity at the nucleotide level between capsid protein-coding regions from viruses infecting cells belonging to all three domains of life, reproducing a previously established structure-based classification of icosahedral viral capsids. Copyright © 2017 Sinclair et al.
PMID:28122979
Rosas-Santiago, Paul; Lagunas-Gomez, Daniel; Yáñez-Domínguez, Carolina; Vera-Estrella, Rosario; Zimmermannová, Olga; Sychrová, Hana; Pantoja, Omar
2017-10-01
The export of membrane proteins along the secretory pathway is initiated at the endoplasmic reticulum after proteins are folded and packaged inside this organelle by their recruitment into COPII-coated vesicles. It is proposed that cargo receptors are required for the correct transport of proteins to their target membranes; however, little is known about ER export signals for cargo receptors. Erv14/Cornichon proteins belong to a well-conserved protein family in eukaryotes and have been proposed to function as cargo receptors for many transmembrane proteins. Amino acid sequence alignment showed the presence of a conserved acidic motif in the C-terminus of homologues from plants and yeast. Here, we demonstrate that mutation of the C-terminal acidic motif of ScErv14 or OsCNIH1 did not alter the localization of these cargo receptors; however, it modified the proper targeting of the plasma membrane transporters Nha1p, Pdr12p and Qdr2p. Our results suggest that mistargeting of these plasma membrane proteins is a consequence of a weaker interaction between the cargo receptor and cargo proteins caused by the mutation of the C-terminal acidic motif. Copyright © 2017 Elsevier B.V. All rights reserved.
Wu, Fang; Yan, Ming; Li, Yikun; Chang, Shaojie; Song, Xiaomin; Zhou, Zhaocai; Gong, Weimin
2003-12-19
SPE-16 is a new 16 kDa protein that has been purified from the seeds of Pachyrrhizus erosus. Its N-terminal amino acid sequence shows significant sequence homology to pathogenesis-related class 10 proteins. cDNA encoding 150 amino acids was cloned by RT-PCR and the gene sequence proved SPE-16 to be a new member of the PR-10 family. The cDNA was cloned into the pET15b plasmid and expressed in Escherichia coli. The bacterially expressed SPE-16 also demonstrated ribonuclease-like activity in vitro. Site-directed mutants of three conserved amino acids (E95A, E147A and Y150A) and a P-loop truncated form were constructed, and their different effects on ribonuclease activity were observed. SPE-16 is also able to bind the fluorescent probe 8-anilino-1-naphthalenesulfonate (ANS) in the native state. The ANS anion is a much-utilized "hydrophobic probe" for proteins. This binding activity indicated another biological function of SPE-16.
Harper, J R; Prince, J T; Healy, P A; Stuart, J K; Nauman, S J; Stallcup, W B
1991-03-01
We have isolated cDNA clones coding for the human homologue of the neuronal cell adhesion molecule L1. The nucleotide sequence of the cDNA clones and the deduced primary amino acid sequence of the carboxy terminal portion of the human L1 are homologous to the corresponding sequences of mouse L1 and rat NILE glycoprotein, with an especially high sequence identity in the cytoplasmic regions of the proteins. There is also protein sequence homology with the cytoplasmic region of the Drosophila cell adhesion molecule, neuroglian. The conservation of the cytoplasmic domain argues for an important functional role for this portion of the molecule.
Molecular cloning of pepsinogens A and C from adult newt (Cynops pyrrhogaster) stomach.
Inokuchi, Tomofumi; Ikuzawa, Masayuki; Yamazaki, Shin; Watanabe, Yukari; Shiota, Koushiro; Katoh, Takuma; Kobayashi, Ken-Ichiro
2013-08-01
The full-length cDNAs of three pepsinogens (Pgs) were cloned from the stomach of the newt, Cynops pyrrhogaster, and nucleotide sequences of the full-length cDNAs were determined. Molecular phylogenetic analysis showed that two Pgs, named PgC1 and PgC2, belong to the pepsinogen C group, and one Pg, named PgA, belongs to the pepsinogen A group. The sequences contain an open reading frame (ORF) encoding 385 amino acid residues for PgC1, 383 amino acid residues for PgC2 and 377 amino acid residues for PgA. In addition, all three amino acid sequences conserve some unique characteristics such as six cysteine residues and two putative active-site aspartic acid residues. All of the pepsinogen mRNAs were detected in the stomach by RT-PCR but not in other organs. Although a slight difference in the time of the start of expression was seen among the three pepsinogen genes, all of them were expressed in the larval stage after hatching. This is the first report on cloning of pepsinogens from urodele stomach. Copyright © 2013 Elsevier Inc. All rights reserved.
Wang, Zhiwei; Qiao, Yan; Zhang, Jingjing; Shi, Wenhui; Zhang, Jinwen
2017-07-01
Rapeseed (Brassica napus) is an important cash crop and is considered the third largest oil crop worldwide. Rapeseed oil contains various saturated and unsaturated fatty acids; these fatty acids, which can be incorporated into TAG to form the lipids stored in seeds, play various roles in metabolic activity. The fatty acid composition of B. napus seeds determines oil quality and defines whether the oil is edible or must be used as industrial material. miRNAs are a class of non-coding sRNAs that regulate gene expression through post-transcriptional modification of their target transcripts, and they play important roles in plant metabolic activities. We employed high-throughput sequencing to identify the miRNAs and their target transcripts involved in fatty acid and lipid metabolism at different developmental stages of B. napus seeds. As a result, we identified 826 miRNA sequences, including 523 conserved and 303 novel miRNAs. From the degradome sequencing, we found that 589 mRNAs could be targeted by 236 miRNAs, comprising 49 novel miRNAs and 187 conserved miRNAs. The miRNA-target pairs suggest that bna-5p-163957_18, bna-5p-396192_7, miR9563a-p3, miR9563b-p5, miR838-p3, miR156e-p3, miR159c and miR1134 could target PDP, LACS9, MFPA, ADSL1, ACO32, C0401, GDL73, PlCD6, OLEO3 and WSD1. These target transcripts are involved in acetyl-CoA generation and carbon chain desaturation, in regulating the levels of very long chain fatty acids, and in β-oxidation and lipid transport and metabolism. In addition, we employed q-PCR to validate the expression of the miRNAs and their target transcripts involved in fatty acid and lipid metabolism; the results suggested that the expression of the miRNAs and that of their target transcripts are negatively correlated, in accord with the expected relationship between a miRNA and its target transcript. The study findings suggest that the identified miRNAs may play important roles in fatty acid and lipid metabolism in seeds of B. napus. Copyright © 2017 The Author(s). Published by Elsevier B.V. All rights reserved.
Freas, Nicholas; Newton, Peter; Perozich, John
2016-01-01
UDP-glucose dehydrogenase (UDPGDH), UDP-N-acetyl-mannosamine dehydrogenase (UDPNAMDH) and GDP-mannose dehydrogenase (GDPMDH) belong to a family of NAD(+)-linked 4-electron-transferring oxidoreductases called nucleotide diphosphate sugar dehydrogenases (NDP-SDHs). UDPGDH is an enzyme responsible for converting UDP-d-glucose to UDP-d-glucuronic acid, a product that has different roles depending on the organism in which it is found. UDPNAMDH and GDPMDH convert UDP-N-acetyl-mannosamine to UDP-N-acetyl-mannosaminuronic acid and GDP-mannose to GDP-mannuronic acid, respectively, by a similar mechanism to UDPGDH. Their products are used as essential building blocks for the exopolysaccharides found in organisms like Pseudomonas aeruginosa and Staphylococcus aureus. Few studies have investigated the relationships between these enzymes. This study reveals the relationships between the three enzymes by analysing 229 amino acid sequences. Eighteen invariant and several other highly conserved residues were identified, each serving critical roles in maintaining enzyme structure, coenzyme binding or catalytic function. Also, 10 conserved motifs that included most of the conserved residues were identified and their roles proposed. A phylogenetic tree demonstrated relationships between each group and verified group assignment. Finally, group entropy analysis identified novel conservations unique to each NDP-SDH group, including residue positions critical to NDP-sugar substrate interaction, enzyme structure and intersubunit contact. These positions may serve as targets for future research. UDP-glucose dehydrogenase (UDPGDH, EC 1.1.1.22).
Cao, Guangli; Meng, Xiangkun; Xue, Renyu; Zhu, Yuexiong; Zhang, Xiaorong; Pan, Zhonghua; Zheng, Xiaojian; Gong, Chengliang
2012-07-01
A novel Bombyx mori cypovirus 1 was isolated from infected silkworm larvae and tentatively assigned as Bombyx mori cypovirus 1 isolate Suzhou (BmCPV-SZ). The complete nucleotide sequences of genomic segments S1-S10 from BmCPV-SZ were determined. All segments possessed a single open reading frame; however, bioinformatic evidence suggested a short overlapping coding sequence in S1. Each BmCPV-SZ segment possessed the conserved terminal sequences AGUAA and GUUAGCC at the 5' and 3' ends, respectively. The conserved A/G at the -3 position in relation to the AUG codon could be found in the BmCPV-SZ genome, and it was postulated that this conserved A/G may be the most important nucleotide for efficient translation initiation in cypoviruses (CPVs). Examination of the putative amino acid sequences encoded by BmCPV-SZ revealed some characteristic motifs. Homology searches showed that viral structural proteins VP1, VP3, and VP4 had localized homologies with proteins of Rice ragged stunt virus, a member of the genus Oryzavirus within the family Reoviridae. A phylogenetic tree based on RNA-dependent RNA polymerase sequences demonstrated that CPV is more closely related to Rice ragged stunt virus and Aedes pseudoscutellaris reovirus than to other members of Reoviridae, suggesting that they may have originated from common ancestors.
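The two sequence features reported for each segment, the conserved termini and the purine at position -3 relative to AUG, are easy to check programmatically. A minimal sketch, assuming the segment is supplied as a plus-strand string and the start codon position is already known; the function names and the toy segment are hypothetical.

```python
def has_conserved_termini(segment, five_prime="AGUAA", three_prime="GUUAGCC"):
    """True if the segment starts and ends with the terminal sequences
    reported in the abstract (cDNA-style T is read as U)."""
    rna = segment.upper().replace("T", "U")
    return rna.startswith(five_prime) and rna.endswith(three_prime)


def purine_at_minus3(segment, start_index):
    """True if the base at -3 relative to the A of AUG (taken as +1) is A or G.

    `start_index` is the 0-based index of that A; there is no position 0,
    so -3 lies three bases upstream."""
    rna = segment.upper().replace("T", "U")
    if rna[start_index:start_index + 3] != "AUG":
        raise ValueError("start_index does not point at an AUG codon")
    if start_index < 3:
        raise ValueError("no -3 position upstream of this start codon")
    return rna[start_index - 3] in "AG"


# Made-up mini segment: 5' terminus, short leader, AUG, 3' terminus.
toy_segment = "AGUAA" + "CGACC" + "AUGGCU" + "GUUAGCC"
start = toy_segment.index("AUG")
print(has_conserved_termini(toy_segment), purine_at_minus3(toy_segment, start))
```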
Ma, G X; Zhou, R Q; Hu, L; Luo, Y L; Luo, Y F; Zhu, H H
2018-03-01
Toxocara canis is an important but neglected zoonotic parasite, and is the causative agent of human toxocariasis. Chondroitin proteoglycans are biological macromolecules, widely distributed in extracellular matrices, with a great diversity of functions in mammals. However, there is limited information regarding chondroitin proteoglycans in nematode parasites. In the present study, a female-enriched chondroitin proteoglycan 2 gene of T. canis (Tc-cpg-2) was cloned and characterized. Quantitative real-time polymerase chain reaction (qRT-PCR) was employed to measure the transcription levels of Tc-cpg-2 among tissues of male and female adult worms. A 485-amino-acid (aa) polypeptide was predicted from a continuous 1458-nucleotide open reading frame and designated as TcCPG2, which contains a 21-aa signal peptide. Conserved domain searching indicated three chitin-binding peritrophin-A (CBM_14) domains in the amino acid sequence of TcCPG2. Multiple alignment with the inferred amino acid sequences of Caenorhabditis elegans and Ascaris suum showed that CBM_14 domains were well conserved among these species. Phylogenetic analysis suggested that TcCPG2 was closely related to the sequence of chondroitin proteoglycan 2 of A. suum. Interestingly, a high level of Tc-cpg-2 was detected in female germline tissues, particularly in the oviduct, suggesting potential roles of this gene in reproduction (e.g. oogenesis and embryogenesis) of adult T. canis. The functional roles of Tc-cpg-2 in reproduction and development in this parasite and related parasitic nematodes warrant further functional studies.
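The lengths quoted in this abstract are mutually consistent if the 1458-nucleotide open reading frame is counted with its stop codon: 1458 / 3 = 486 codons, of which 485 encode amino acids. A small sketch of that arithmetic, assuming the stop codon is included in the ORF length (the abstract does not say so explicitly); the function name is hypothetical.

```python
def orf_protein_length(orf_nt, includes_stop=True):
    """Number of encoded amino acids for an ORF of `orf_nt` nucleotides."""
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


precursor_aa = orf_protein_length(1458)  # 486 codons, one of them the stop
mature_aa = precursor_aa - 21            # after removal of the 21-aa signal peptide
print(precursor_aa, mature_aa)
```

The mature length is a derived figure that holds only if the signal peptide is cleaved as predicted.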
Samal, Sweety; Kumar, Sachin; Khattar, Sunil K; Samal, Siba K
2011-10-01
A key determinant of Newcastle disease virus (NDV) virulence is the amino acid sequence at the fusion (F) protein cleavage site. The NDV F protein is synthesized as an inactive precursor, F(0), and is activated by proteolytic cleavage between amino acids 116 and 117 to produce two disulfide-linked subunits, F(1) and F(2). The consensus sequence of the F protein cleavage site of virulent [(112)(R/K)-R-Q-(R/K)-R↓F-I(118)] and avirulent [(112)(G/E)-(K/R)-Q-(G/E)-R↓L-I(118)] strains contains a conserved glutamine residue at position 114. Recently, some NDV strains from Africa and Madagascar were isolated from healthy birds and have been reported to contain five basic residues (R-R-R-K-R↓F-I/V or R-R-R-R-R↓F-I/V) at the F protein cleavage site. In this study, we have evaluated the role of this conserved glutamine residue in the replication and pathogenicity of NDV by using the moderately pathogenic Beaudette C strain and by making Q114R, K115R and I118V mutants of the F protein in this strain. Our results showed that changing the glutamine to a basic arginine residue reduced viral replication and attenuated the pathogenicity of the virus in chickens. The pathogenicity was further reduced when the isoleucine at position 118 was replaced with valine.
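The two consensus motifs quoted in this abstract can be written as regular expressions over residues 112-118. The sketch below is illustrative only: the function name `classify_cleavage_site` and the example heptapeptides are hypothetical, and a motif match is not a substitute for a pathogenicity assay (the abstract itself describes five-basic-residue sites found in healthy birds).

```python
import re

# Residues 112-118 of the F protein written as a 7-letter string.
# Patterns transcribed from the consensus sequences quoted in the abstract.
VIRULENT = re.compile(r"[RK]RQ[RK]RFI")
AVIRULENT = re.compile(r"[GE][KR]Q[GE]RLI")


def classify_cleavage_site(site: str) -> str:
    """Report which consensus (if any) a 112-118 heptapeptide matches."""
    site = site.upper()
    if VIRULENT.fullmatch(site):
        return "virulent consensus"
    if AVIRULENT.fullmatch(site):
        return "avirulent consensus"
    return "no consensus match"


# Hypothetical example inputs, not sequences taken from the study.
print(classify_cleavage_site("RRQKRFI"))  # matches the virulent pattern
print(classify_cleavage_site("GRQGRLI"))  # matches the avirulent pattern
print(classify_cleavage_site("RRRKRFI"))  # glutamine 114 replaced: neither
```

A site with arginine at position 114, such as the five-basic-residue sites mentioned above, falls outside both patterns because each consensus requires glutamine there.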
Kellett, Mark; McKechnie, Stephen W
2005-04-01
The coding region of the hsp68 gene has been amplified, cloned, and sequenced from 10 Drosophila species, 5 from the melanogaster subgroup and 5 from the montium subgroup. When the predicted amino acid sequences are compared with available Hsp70 sequences, patterns of conservation suggest that the C-terminal region should be subdivided according to predominant secondary structure. Conservation levels between Hsp68 and Hsp70 proteins were high in the N-terminal ATPase and adjacent beta-sheet domains, medium in the alpha-helix domain, and low in the C-terminal mobile domain (78%, 72%, 41%, and 21% identity, respectively). A number of amino acid sites were found to be "diagnostic" for Hsp68 (28 of approximately 635 residues). A few of these occur in the ATPase domain (385 residues) but most (75%) are concentrated in the beta-sheet and alpha-helix domains (34% of the protein) with none in the short mobile domain. Five of the diagnostic sites in the beta-sheet domain are clustered around, but not coincident with, functional sites known to be involved in substrate binding. Nearly all of the Hsp70 family length variation occurs in the mobile domain. Within montium subgroup species, 2 nearly identical hsp68 PCR products that differed in length are either different alleles or products of an ancestral hsp68 duplication.
Complete Amino Acid Sequence of a Copper/Zinc-Superoxide Dismutase from Ginger Rhizome.
Nishiyama, Yuki; Fukamizo, Tamo; Yoneda, Kazunari; Araki, Tomohiro
2017-04-01
Superoxide dismutase (SOD) is an antioxidant enzyme protecting cells from oxidative stress. Ginger (Zingiber officinale) is known for its antioxidant properties; however, there are no data on SODs from ginger rhizomes. In this study, we purified SOD from the rhizome of Z. officinale (Zo-SOD) and determined its complete amino acid sequence using N-terminal sequencing, amino acid analysis, and de novo sequencing by tandem mass spectrometry. Zo-SOD consists of 151 amino acids with two signature Cu/Zn-SOD motifs and has high similarity to other plant Cu/Zn-SODs. Multiple sequence alignment showed that Cu/Zn-binding residues and cysteines forming a disulfide bond, which are highly conserved in Cu/Zn-SODs, are also present in Zo-SOD. Phylogenetic analysis revealed that plant Cu/Zn-SODs clustered into distinct chloroplastic, cytoplasmic, and intermediate groups. Among them, only chloroplastic enzymes carried amino acid substitutions in the region functionally important for enzymatic activity, suggesting that chloroplastic SODs may have a function distinct from those of SODs localized in other subcellular compartments. The nucleotide sequence of the Zo-SOD coding region was obtained by reverse-translation, and the gene was synthesized, cloned, and expressed. The recombinant Zo-SOD demonstrated pH stability in the range of 5-10, which is similar to other reported Cu/Zn-SODs, and thermal stability in the range of 10-60 °C, which is higher than that for most plant Cu/Zn-SODs but lower compared to the enzyme from a Z. officinale relative, Curcuma aromatica.
Sequence analysis of Jembrana disease virus strains reveals a genetically stable lentivirus.
Desport, Moira; Stewart, Meredith E; Mikosza, Andrew S; Sheridan, Carol A; Peterson, Shane E; Chavand, Olivier; Hartaningsih, Nining; Wilcox, Graham E
2007-06-01
Jembrana disease virus (JDV) is a lentivirus associated with an acute disease syndrome with a 20% case fatality rate in Bos javanicus (Bali cattle) in Indonesia, occurring after a short incubation period and with no recurrence of the disease after recovery. Partial regions of gag and pol and the entire env were examined for sequence variation in DNA samples from cases of Jembrana disease obtained from Bali, Sumatra and South Kalimantan in Indonesian Borneo. A high level of nucleotide conservation (97-100%) was observed in gag sequences from samples taken in Bali and Sumatra, indicating that the source of JDV in Sumatra was most likely to have originated from Bali. The pol sequences and, unexpectedly, the env sequences from Bali samples were also well conserved with low nucleotide (96-99%) and amino acid substitutions (95-99%). However, the sample from South Kalimantan (JDV(KAL/01)) contained more divergent sequences, particularly in env (88% identity). Phylogenetic analysis revealed that the JDV(KAL/01)env sequences clustered with the sequence from the Pulukan sample (Bali) from 2001. JDV appears to be remarkably stable genetically and has undergone minor genetic changes over a period of nearly 20 years in Bali despite becoming endemic in the cattle population of the island.
Astell, C R; Gardiner, E M; Tattersall, P
1986-02-01
The sequence of molecular clones of the genome of MVM(i), a lymphotropic variant of minute virus of mice, was determined and compared with that of MVM(p), the fibrotropic prototype strain. At the nucleotide level there are 163 base changes: 129 transitions and 34 transversions. Most nucleotide changes are silent, with only 27 amino acid changes predicted, of which 22 are conservative. Notable differences between the MVM(i) and MVM(p) genomes which may account for the cell specificities of these viruses occur within the 3' nontranslated regions. The differences discussed include the absence of a 65-base-pair direct repeat in MVM(i), the presence of only two polyadenylation sites in MVM(i) compared with four in MVM(p), and sequences that bear a resemblance to enhancer sequences. Also included in this paper is an important correction to the MVM(p) sequence (C.R. Astell, M. Thomson, M. Merchlinsky, and D. C. Ward, Nucleic Acids Res. 11:999-1018, 1983).
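The split of base changes into transitions and transversions follows a simple rule: a transition swaps a purine for a purine (A/G) or a pyrimidine for a pyrimidine (C/T), and any other substitution is a transversion. A minimal sketch of that count, assuming two pre-aligned DNA strings of equal length (the function name and toy sequences are hypothetical, not data from the paper):

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_substitutions(seq_a: str, seq_b: str) -> tuple[int, int]:
    """Return (transitions, transversions) between two aligned DNA strings."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip identical positions, gaps, and ambiguous bases.
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


# Toy alignment: A->G is a transition, G->T is a transversion.
print(count_substitutions("ACGT", "GCTT"))
```

Applied to the two full genomes, the two counts would be expected to sum to the total number of base changes, as the reported figures do (129 + 34 = 163).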
Suzuki, Akiko; Endo, Takeshi
2002-02-06
We have cloned a cDNA encoding a novel protein referred to as ermelin from mouse C2 skeletal muscle cells. This protein contained six hydrophobic amino acid stretches corresponding to transmembrane domains, two histidine-rich sequences, and a sequence homologous to the fusion peptides of certain fusion proteins. Ermelin also contained a novel modular sequence, designated as HELP domain, which was highly conserved among eukaryotes, from yeast to higher plants and animals. All these HELP domain-containing proteins, including mouse KE4, Drosophila Catsup, and Arabidopsis IAR1, possessed multipass transmembrane domains and histidine-rich sequences. Ermelin was predominantly expressed in brain and testis, and induced during neuronal differentiation of N1E-115 neuroblastoma cells but downregulated during myogenic differentiation of C2 cells. The mRNA was accumulated in hippocampus and cerebellum of brain and central areas of seminiferous tubules in testis. Epitope-tagging experiments located ermelin and KE4 to a network structure throughout the cytoplasm. Staining with the fluorescent dye DiOC(6)(3) identified this structure as the endoplasmic reticulum. These results suggest that at least some, if not all, of the HELP domain-containing proteins are multipass endoplasmic reticulum membrane proteins with functions conserved among eukaryotes.
WebLogo: A Sequence Logo Generator
Crooks, Gavin E.; Hon, Gary; Chandonia, John-Marc; Brenner, Steven E.
2004-01-01
WebLogo generates sequence logos, graphical representations of the patterns within a multiple sequence alignment. Sequence logos provide a richer and more precise description of sequence similarity than consensus sequences and can rapidly reveal significant features of the alignment otherwise difficult to perceive. Each logo consists of stacks of letters, one stack for each position in the sequence. The overall height of each stack indicates the sequence conservation at that position (measured in bits), whereas the height of symbols within the stack reflects the relative frequency of the corresponding amino or nucleic acid at that position. WebLogo has been enhanced recently with additional features and options, to provide a convenient and highly configurable sequence logo generator. A command line interface and the complete, open WebLogo source code are available for local installation and customization. PMID:15173120
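The stack heights described in this abstract follow from a per-column information measure. As a minimal sketch (not WebLogo's own code, and ignoring any small-sample correction a logo tool may apply), the information content of one alignment column can be taken as log2 of the alphabet size minus the Shannon entropy of the observed symbol frequencies, with each letter's height equal to its frequency times that total. The function names and the default alphabet size of 4 (nucleic acids; 20 would suit proteins) are assumptions for illustration.

```python
import math
from collections import Counter


def column_information(column: str, alphabet_size: int = 4) -> float:
    """Information content (bits) of one alignment column."""
    counts = Counter(column)
    total = len(column)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy


def letter_heights(column: str, alphabet_size: int = 4) -> dict[str, float]:
    """Height of each symbol in the stack: frequency times column information."""
    info = column_information(column, alphabet_size)
    total = len(column)
    return {sym: (n / total) * info for sym, n in Counter(column).items()}


# A fully conserved DNA column carries the maximum 2 bits;
# a column with all four bases equally frequent carries none.
print(column_information("AAAA"))
print(column_information("ACGT"))
print(letter_heights("AACC"))
```

In this simplified form, a column split evenly between two bases carries 1 bit, shared equally between the two letters.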
Álvarez-Cervantes, Jorge; Díaz-Godínez, Gerardo; Mercado-Flores, Yuridia; Gupta, Vijai Kumar; Anducho-Reyes, Miguel Angel
2016-01-01
In this paper, the amino acid sequence of the β-xylanase SRXL1 of Sporisorium reilianum, a pathogenic fungus of maize, was used as a model protein to determine its phylogenetic relationship with other xylanases of Ascomycetes and Basidiomycetes; the information obtained allowed a hypothesis of monophyly and of biological role to be established. A total of 84 β-xylanase amino acid sequences obtained from the GenBank database were used. Higher-level grouping analysis in the Pfam database showed that the proteins under study were classified into the GH10 and GH11 families, based on regions of highly conserved amino acids, 233–318 and 180–193 respectively, where glutamate residues are responsible for catalysis. PMID:27040368
Nandi, Soumyadeep; Mehra, Nipun; Lynn, Andrew M; Bhattacharya, Alok
2005-01-01
Background Theoretical proteome analyses, generated by plotting theoretical isoelectric points (pI) against molecular masses of all proteins encoded by the genome, show a multimodal distribution for pI. This multimodal distribution is an effect of allowed combinations of the charged amino acids, and not due to evolutionary causes. The variation in this distribution can be correlated to the organism's ecological niche. Contributions to this variation may be mapped to individual proteins by studying the variation in pI of orthologs across microorganism genomes. Results The distribution of ortholog pI values showed trimodal distributions for all prokaryotic genomes analyzed, similar to whole proteome plots. Pairwise analysis of pI variation shows that a few COGs are conserved within, but most vary between, the acidic and basic regions of the distribution, while molecular mass is more highly conserved. At the level of functional grouping of orthologs, five groups vary significantly from the population of orthologs, which is attributed to either conservation at the level of sequences or a bias for either positively or negatively charged residues contributing to the function. Individual COGs conserved in both the acidic and basic regions of the trimodal distribution are identified, and orthologs that best represent the variation in levels of the acidic and basic regions are listed. Conclusion The analysis of pI distribution by using orthologs provides a basis for resolution of theoretical proteome comparison at the level of individual proteins. Orthologs identified that significantly vary between the major acidic and basic regions may be used as representative of the variation of the entire proteome. PMID:16150155
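A theoretical pI of the kind plotted in this study is the pH at which a protein's computed net charge is zero. The sketch below shows one common way to estimate it: sum Henderson-Hasselbalch charges over the ionizable groups and bisect on pH. It is not the authors' method, and the pKa table is an assumed, illustrative set (published scales differ, and results shift with the scale chosen).

```python
# Assumed, illustrative pKa values; published scales differ.
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
PKA_N_TERM = 9.0
PKA_C_TERM = 2.0


def net_charge(seq: str, ph: float) -> float:
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))   # protonated N-terminus
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))  # deprotonated C-terminus
    for aa in seq.upper():
        if aa in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return charge


def isoelectric_point(seq: str, tol: float = 1e-4) -> float:
    """Bisect on pH; net charge decreases monotonically as pH rises."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical toy peptides: lysine-rich is basic, aspartate-rich is acidic.
print(round(isoelectric_point("KKKK"), 2))
print(round(isoelectric_point("DDDD"), 2))
```

Because only the charged residues and the termini enter the sum, the estimate depends on composition alone, which fits the abstract's point that the shape of the pI distribution comes from the allowed combinations of charged amino acids.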
Gupta, Anamika; Pal, Sudhir K; Pandey, Divya; Fakir, Najneen A; Rathod, Sunita; Sinha, Dhiraj; SivaKumar, S; Sinha, Pallavi; Periera, Mycal; Balgam, Shilpa; Sekar, Gomathi; UmaDevi, K R; Anupurba, Shampa; Nema, Vijay
2017-08-18
The Mycobacterium tuberculosis (M.tb) protein kinase B (PknB), which has now been shown to be essential for the growth and survival of M.tb, is a transmembrane protein with the potential to be a good drug target. However, it is not known whether this target remains conserved in otherwise resistant isolates of clinical origin. The present study describes a conservation analysis of sequences covering the inhibitor binding domain of PknB to assess whether it remains conserved in susceptible and resistant clinical strains of mycobacteria collected from three different geographical areas of India. A total of 116 isolates from North, South and West India, with variable susceptibility profiles towards streptomycin, isoniazid, rifampicin, ethambutol and ofloxacin, were used in the study. Isolates were also spoligotyped in order to determine whether the conservation pattern of the pknB gene remains consistent or differs among spoligotypes. The impact of the variation found in the study was analyzed using molecular dynamics simulations. The sequencing results for 115/116 isolates revealed the conserved nature of pknB sequences irrespective of their susceptibility status and spoligotypes. The only variation was found in one strain, in which the pknB sequence had a G to A mutation at position 664, translating into an amino acid change from valine to isoleucine. Analysis of the impact of this sequence variation using molecular dynamics simulations showed that the variation causes no significant change in protein structure or inhibitor binding. Hence, the study endorses that PknB is an ideal target for drug development and that there is no pre-existing or induced resistance with respect to the sequences involved in inhibitor binding. Also, if the mutation that we are reporting for the first time is found again in subsequent work, it should be checked against the phenotypic profile before concluding that it affects activity in any way. Bioinformatics analysis in our study indicates that it has no significant effect on the binding and hence the activity of the protein.
Martínez-Castilla, León P.; Rodríguez-Sotres, Rogelio
2010-01-01
Background Despite the remarkable progress of bioinformatics, how the primary structure of a protein leads to a three-dimensional fold, and in turn determines its function remains an elusive question. Alignments of sequences with known function can be used to identify proteins with the same or similar function with high success. However, identification of function-related and structure-related amino acid positions is only possible after a detailed study of every protein. Folding pattern diversity seems to be much narrower than sequence diversity, and the amino acid sequences of natural proteins have evolved under a selective pressure comprising structural and functional requirements acting in parallel. Principal Findings The approach described in this work begins by generating a large number of amino acid sequences using ROSETTA [Dantas G et al. (2003) J Mol Biol 332:449–460], a program with notable robustness in the assignment of amino acids to a known three-dimensional structure. The resulting sequence-sets showed no conservation of amino acids at active sites, or protein-protein interfaces. Hidden Markov models built from the resulting sequence sets were used to search sequence databases. Surprisingly, the models retrieved from the database sequences belonged to proteins with the same or a very similar function. Given an appropriate cutoff, the rate of false positives was zero. According to our results, this protocol, here referred to as Rd.HMM, detects fine structural details on the folding patterns, that seem to be tightly linked to the fitness of a structural framework for a specific biological function. Conclusion Because the sequence of the native protein used to create the Rd.HMM model was always amongst the top hits, the procedure is a reliable tool to score, very accurately, the quality and appropriateness of computer-modeled 3D-structures, without the need for spectroscopy data. 
However, Rd.HMM is very sensitive to the conformational features of the models' backbone. PMID:20830209
Ling, P D; Ryon, J J; Hayward, S D
1993-01-01
EBNA-2 contributes to the establishment of Epstein-Barr virus (EBV) latency in B cells and to the resultant alterations in B-cell growth pattern by up-regulating expression from specific viral and cellular promoters. We have taken a comparative approach toward characterizing functional domains within EBNA-2. To this end, we have cloned and sequenced the EBNA-2 gene from the closely related baboon virus herpesvirus papio (HVP). All human EBV isolates have either a type A or type B EBNA-2 gene. However, the HVP EBNA-2 gene falls into neither the type A category nor the type B category, suggesting that the separation into these two subtypes may have been a recent evolutionary event. Comparison of the predicted amino acid sequences indicates 37% amino acid identity with EBV type A EBNA-2 and 35% amino acid identity with type B EBNA-2. To define the domains of EBNA-2 required for transcriptional activation, the DNA binding domain of the GAL4 protein was fused to overlapping segments of EBV EBNA-2. This approach identified a 40-amino-acid (40-aa) EBNA-2 activation domain located between aa 437 and 477. Transactivation ability was completely lost when the amino-terminal boundary of this domain was moved to aa 441, indicating that the motif at aa 437 to 440, Pro-Ile-Leu-Phe, contains residues critical for function. The aa 437 boundary identified in these experiments coincides precisely with a block of conserved sequences in HVP EBNA-2, and the comparable carboxy-terminal region of HVP EBNA-2 also functioned as a strong transcriptional activation domain when fused to the Gal4(1-147) protein. The EBV and HVP EBNA-2 activation domains share a mixed proline-rich, negatively charged character with a striking conservation of positionally equivalent hydrophobic residues. The importance of the individual amino acids making up the Pro-Ile-Leu-Phe motif was examined by mutagenesis. 
Any alteration of these residues was found to reduce transactivation efficiency, with changes at the Pro-437 and Phe-440 positions producing the most deleterious effects. Activation of the EBV latency C promoter by EBNA-2 was shown to be dependent on the presence of the carboxy-terminal activation domain. However, this requirement was generic, rather than specific, since the EBNA-2 activation domain could be replaced with those from the herpes simplex virus (HSV) VP16 protein or the EBV Rta protein. Potential karyophilic signals within EBNA-2 were examined by introducing oligonucleotides encoding positively charged amino acid groupings that might serve in this capacity into a cytoplasmic test protein, HSV delta IE175, and by examining the intracellular localization of the resulting proteins. This assay identified a strong nuclear localization signal between EBV amino acids (aa) 478 to 485, which was conserved in HVP, and a weaker noncanonical signal between EBV aa 341 to 355, which was not conserved in HVP. PMID:8388484
Rhizobium etli asparaginase II: an alternative for acute lymphoblastic leukemia (ALL) treatment.
Huerta-Saquero, Alejandro; Evangelista-Martínez, Zahaed; Moreno-Enriquez, Angélica; Perez-Rueda, Ernesto
2013-01-01
Bacterial L-asparaginase has been a universal component of therapies for childhood acute lymphoblastic leukemia since the 1970s. Two principal enzymes derived from Escherichia coli and Erwinia chrysanthemi are the only options clinically approved to date. We recently reported a study of recombinant L-asparaginase (AnsA) from Rhizobium etli and described a growing group of AnsA family members. Sequence analysis revealed four conserved motifs with notable differences with respect to the conserved regions of amino acid sequences of type I and type II L-asparaginases, particularly in comparison with therapeutic enzymes from E. coli and E. chrysanthemi. These differences suggested a distinct immunological specificity. Here, we report an in silico analysis that revealed immunogenic determinants of AnsA. Also, we used an extensive approach to compare the crystal structures of E. coli and E. chrysanthemi asparaginases with a computational model of AnsA and identified immunogenic epitopes. A three-dimensional model of AnsA revealed, as expected based on sequence dissimilarities, completely different folding and different immunogenic epitopes. This approach could be very useful in transcending the problem of immunogenicity in two major ways: by chemical modifications of epitopes to reduce drug immunogenicity, and by site-directed mutagenesis of amino acid residues to diminish immunogenicity without reduction of enzymatic activity. PMID:22895060
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lai, Xiaokuang; Davis, F.C.; Ingram, L.O.
1997-02-01
Genomic libraries from nine cellobiose-metabolizing bacteria were screened for cellobiose utilization. Positive clones were recovered from six libraries, all of which encode phosphoenolpyruvate:carbohydrate phosphotransferase system (PTS) proteins. Clones from Bacillus subtilis, Butyrivibrio fibrisolvens, and Klebsiella oxytoca allowed the growth of recombinant Escherichia coli in cellobiose-M9 minimal medium. The K. oxytoca clone, pLOI1906, exhibited an unusually broad substrate range (cellobiose, arbutin, salicin, and methylumbelliferyl derivatives of glucose, cellobiose, mannose, and xylose) and was sequenced. The insert in this plasmid encoded the carboxy-terminal region of a putative regulatory protein, cellobiose permease (single polypeptide), and phospho-β-glucosidase, which appear to form an operon (casRAB). Subclones allowed both casA and casB to be expressed independently, as evidenced by in vitro complementation. An analysis of the translated sequences from the EIIC domains of cellobiose, aryl-β-glucoside, and other disaccharide permeases allowed the identification of a 50-amino-acid conserved region. A disaccharide consensus sequence is proposed for the most conserved segment (13 amino acids), which may represent part of the EIIC active site for binding and phosphorylation. 63 refs., 4 figs., 4 tabs.
Balasuriya, U B R; Nadler, S A; Wilson, W C; Pritchard, L I; Smythe, A B; Savini, G; Monaco, F; De Santis, P; Zhang, N; Tabachnick, W J; Maclachlan, N J
2008-01-01
Comparison of the deduced amino acid sequences of the genes (S10) encoding the NS3 protein of 137 strains of bluetongue virus (BTV) from Africa, the Americas, Asia, Australia and the Mediterranean Basin showed limited variation. Common to all NS3 sequences were potential glycosylation sites at amino acid residues 63 and 150 and a cysteine at residue 137, whereas a cysteine at residue 181 was not conserved. The PPXY and PS/TAP late-domain motifs were conserved in all but three of the viruses. Phylogenetic analyses of these same sequences yielded two principal clades that grouped the viruses irrespective of their serotype or year of isolation (1900-2003). All viruses from Asia and Australia were grouped in one clade, whereas those from the other regions were present in both clades. Each clade segregated into distinct subclades that included viruses from single or multiple regions, and the S10 genes of some field viruses were identical to those of live-attenuated BTV vaccines. There was no evidence of positive selection on the S10 gene as assessed by reconstruction of ancestral codon states on the phylogeny, rather the functional constraints of the NS3 protein are expressed through substantial negative (purifying) selection.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tanaka, Yoshiyuki; Matsuoka, Makoto; Yamanoto, Naoki
A cDNA clone for phenylalanine ammonia-lyase (PAL) induced in wounded sweet potato (Ipomoea batatas Lam.) root was obtained by immunoscreening a cDNA library. The protein produced in Escherichia coli cells containing the plasmid pPAL02 was indistinguishable from sweet potato PAL as judged by Ouchterlony double diffusion assays. The M(r) of its subunit was 77,000. The cells converted [14C]-L-phenylalanine into [14C]-t-cinnamic acid and PAL activity was detected in the homogenate of the cells. The activity was dependent on the presence of the pPAL02 plasmid DNA. The nucleotide sequence of the cDNA contained a 2,121-base pair (bp) open reading frame capable of coding for a polypeptide with 707 amino acids (M(r) 77,137), a 22-bp 5'-noncoding region and a 207-bp 3'-noncoding region. The results suggest that the insert DNA fully encoded the amino acid sequence for sweet potato PAL that is induced by wounding. Comparison of the deduced amino acid sequence with that of a PAL cDNA fragment from Phaseolus vulgaris revealed 78.9% homology. The sequence from amino acid residues 258 to 494 was highly conserved, showing 90.7% homology.
Strauss, E G; Levinson, R; Rice, C M; Dalrymple, J; Strauss, J H
1988-05-01
We have sequenced the nsP3 and nsP4 region of two alphaviruses, Ross River virus and O'Nyong-nyong virus, in order to examine these viruses for the presence or absence of an opal termination codon present between nsP3 and nsP4 in many alphaviruses. We found that Ross River virus possesses an in-phase opal termination codon between nsP3 and nsP4, whereas in O'Nyong-nyong virus this termination codon is replaced by an arginine codon. Previous studies have shown that two other alphaviruses, Sindbis virus and Middelburg virus, possess an opal termination codon separating nsP3 and nsP4 [E.G. Strauss, C.M. Rice, and J.H. Strauss (1983), Proc. Natl. Acad. Sci. USA 80, 5271-5275], whereas Semliki Forest virus possesses an arginine codon in lieu of the opal codon [K. Takkinen (1986), Nucleic Acids Res. 14, 5667-5682]. Thus, of the five alphaviruses examined to date, three possess the opal codon and two do not. Production of nsP4 requires readthrough of the opal codon in those alphaviruses that possess this termination codon and the function of the termination codon may be to regulate the amount of nsP4 produced. It is an open question then as to whether alphaviruses with no termination codon use other mechanisms to regulate the activity of this gene. The nsP4s of these five alphaviruses are highly conserved, sharing 71-76% amino acid sequence similarity, and all five contain the Gly-Asp-Asp motif found in many RNA virus replicases. The nsP3s are somewhat less conserved, sharing 52-73% amino acid sequence similarity throughout most of the protein, but each possesses a nonconserved C-terminal domain of 134 to 246 amino acids of unknown function.
Characterization and expression profiles of MaACS and MaACO genes from mulberry (Morus alba L.)
Liu, Chang-ying; Lü, Rui-hua; Li, Jun; Zhao, Ai-chun; Wang, Xi-ling; Diane, Umuhoza; Wang, Xiao-hong; Wang, Chuan-hong; Yu, Ya-sheng; Han, Shu-mei; Lu, Cheng; Yu, Mao-de
2014-01-01
1-Aminocyclopropane-1-carboxylic acid synthase (ACS) and 1-aminocyclopropane-1-carboxylic acid oxidase (ACO) are encoded by multigene families and are involved in fruit ripening by catalyzing the production of ethylene throughout the development of fruit. However, there are no reports on ACS or ACO genes in mulberry, partly because of the limited molecular research background. In this study, we have obtained five ACS gene sequences and two ACO gene sequences from Morus Genome Database. Sequence alignment and phylogenetic analysis of MaACO1 and MaACO2 showed that their amino acids are conserved compared with ACO proteins from other species. MaACS1 and MaACS2 are type I, MaACS3 and MaACS4 are type II, and MaACS5 is type III, with different C-terminal sequences. Quantitative reverse transcriptase polymerase chain reaction (qRT-PCR) expression analysis showed that the transcripts of MaACS genes were strongly expressed in fruit, and more weakly in other tissues. The expression of MaACO1 and MaACO2 showed different patterns in various mulberry tissues. MaACS and MaACO genes demonstrated two patterns throughout the development of mulberry fruit, and both of them were strongly up-regulated by abscisic acid (ABA) and ethephon. PMID:25001221
Molecular cloning of a cDNA encoding the glycoprotein of hen oviduct microsomal signal peptidase.
Newsome, A L; McLean, J W; Lively, M O
1992-01-01
Detergent-solubilized hen oviduct signal peptidase has been characterized previously as an apparent complex of a 19 kDa protein and a 23 kDa glycoprotein (GP23) [Baker & Lively (1987) Biochemistry 26, 8561-8567]. A cDNA clone encoding GP23 from a chicken oviduct lambda gt11 cDNA library has now been characterized. The cDNA encodes a protein of 180 amino acid residues with a single site for asparagine-linked glycosylation that has been directly identified by amino acid sequence analysis of a tryptic-digest peptide containing the glycosylated site. Immunoblot analysis reveals cross-reactivity with a dog pancreas protein. Comparison of the deduced amino acid sequence of GP23 with the 22/23 kDa glycoprotein of dog microsomal signal peptidase [Shelness, Kanwar & Blobel (1988) J. Biol. Chem. 263, 17063-17070], one of five proteins associated with this enzyme, reveals that the amino acid sequences are 90% identical. Thus the signal peptidase glycoprotein is as highly conserved as the sequences of cytochromes c and b from these same species and is likely to be found in a similar form in many, if not all, vertebrate species. The data also show conclusively that the dog and avian signal peptidases have at least one protein subunit in common. PMID:1546959
Carraher, Colm; Authier, Astrid; Steinwender, Bernd; Newcomb, Richard D.
2012-01-01
In insects, odorant receptors detect volatile cues involved in behaviours such as mate recognition, food location and oviposition. We have investigated the evolution of three odorant receptors from five species within the moth genera Ctenopseustis and Planotortrix, family Tortricidae, which fall into distinct clades within the odorant receptor multigene family. One receptor is the orthologue of the co-receptor Or83b, now known as Orco (OR2), and encodes the obligate ion channel subunit of the receptor complex. In comparison, the other two receptors, OR1 and OR3, are ligand-binding receptor subunits, activated by volatile compounds produced by plants - methyl salicylate and citral, respectively. Rates of sequence evolution at non-synonymous sites were significantly higher in OR1 compared with OR2 and OR3. Within the dataset OR1 contains 109 variable amino acid positions that are distributed evenly across the entire protein including transmembrane helices, loop regions and termini, while OR2 and OR3 contain 18 and 16 variable sites, respectively. OR2 shows a high level of amino acid conservation as expected due to its essential role in odour detection; however we found unexpected differences in the rate of evolution between two ligand-binding odorant receptors, OR1 and OR3. OR3 shows high sequence conservation suggestive of a conserved role in odour reception, whereas the higher rate of evolution observed in OR1, particularly at non-synonymous sites, may be suggestive of relaxed constraint, perhaps associated with the loss of an ancestral role in sex pheromone reception. PMID:22701634
te Biesebeke, Rob; Levasseur, Anthony; Boussier, Amandine; Record, Eric; van den Hondel, Cees A M J J; Punt, Peter J
2010-01-01
The fhbA genes encoding putative flavohemoglobins (FHb) from Aspergillus niger and Aspergillus oryzae were isolated. Comparison of the deduced amino acid sequence of the A. niger fhbA gene and other putative filamentous fungal FHb-encoding genes to that of Ralstonia eutropha shows an overall conserved gene structure and completely conserved catalytic amino acids. Several yeasts and filamentous fungi, including both Aspergillus species have been found to contain a small FHb gene family mostly consisting of two family members. Based on these sequences the evolutionary history of the fungal FHb family was reconstructed. The isolated fhbA genes from A. oryzae and A. niger belong to a phylogenetic group, which exclusively contains Aspergillus genes. Different experimental approaches show that fhbA transcript levels appear during active hyphal growth. Moreover, in a pclA-disrupted strain with a hyperbranching growth phenotype, the transcript levels of the fhbA gene were 2–5 times higher compared to the wild-type. These results suggest that FHb from filamentous fungi have a function that is correlated to the hyphal growth phenotype.
Oakley, Aaron J; Coggan, Marjorie; Board, Philip G
2010-03-26
Gamma-glutamylamine cyclotransferase (GGACT) is an enzyme that converts gamma-glutamylamines to free amines and 5-oxoproline. GGACT shows high activity toward gamma-glutamyl-epsilon-lysine, derived from the breakdown of fibrin and other proteins cross-linked by transglutaminases. The enzyme adopts the newly identified cyclotransferase fold, observed in gamma-glutamylcyclotransferase (GGCT), an enzyme with activity toward gamma-glutamyl-alpha-amino acids (Oakley, A. J., Yamada, T., Liu, D., Coggan, M., Clark, A. G., and Board, P. G. (2008) J. Biol. Chem. 283, 22031-22042). Despite the absence of significant sequence identity, several residues are conserved in the active sites of GGCT and GGACT, including a putative catalytic acid/base residue (GGACT Glu(82)). The structure of GGACT in complex with the reaction product 5-oxoproline provides evidence for a common catalytic mechanism in both enzymes. The proposed mechanism, combined with the three-dimensional structures, also explains the different substrate specificities of these enzymes. Despite significant sequence divergence, there are at least three subfamilies in prokaryotes and eukaryotes that have conserved the GGCT fold and GGCT enzymatic activity.
Oakley, Aaron J.; Coggan, Marjorie; Board, Philip G.
2010-01-01
γ-Glutamylamine cyclotransferase (GGACT) is an enzyme that converts γ-glutamylamines to free amines and 5-oxoproline. GGACT shows high activity toward γ-glutamyl-ϵ-lysine, derived from the breakdown of fibrin and other proteins cross-linked by transglutaminases. The enzyme adopts the newly identified cyclotransferase fold, observed in γ-glutamylcyclotransferase (GGCT), an enzyme with activity toward γ-glutamyl-α-amino acids (Oakley, A. J., Yamada, T., Liu, D., Coggan, M., Clark, A. G., and Board, P. G. (2008) J. Biol. Chem. 283, 22031–22042). Despite the absence of significant sequence identity, several residues are conserved in the active sites of GGCT and GGACT, including a putative catalytic acid/base residue (GGACT Glu82). The structure of GGACT in complex with the reaction product 5-oxoproline provides evidence for a common catalytic mechanism in both enzymes. The proposed mechanism, combined with the three-dimensional structures, also explains the different substrate specificities of these enzymes. Despite significant sequence divergence, there are at least three subfamilies in prokaryotes and eukaryotes that have conserved the GGCT fold and GGCT enzymatic activity. PMID:20110353
Generation and reactivation of T-cell receptor A joining region pseudogenes in primates
DOE Office of Scientific and Technical Information (OSTI.GOV)
Thiel, C.; Lanchbury, J.S.; Otting, N.
1996-06-01
Tandemly duplicated T-cell receptor (Tcr) AJ (Jα) segments contribute significantly to TCRA chain junctional region diversity in mammals. Since only limited data exists on TCRA diversity in nonhuman primates, we examined the TCRAJ regions of 37 chimpanzee and 71 rhesus macaque TCRA cDNA clones derived from inverse polymerase chain reaction on peripheral blood mononuclear cell cDNA of healthy animals. Twenty-five different TCRAJ regions were characterized in the chimpanzee and 36 in the rhesus macaque. Each bears a close structural relationship to an equivalent human TCRAJ region. Conserved amino acid motifs are shared between all three species. There are indications that differences between nonhuman primates and humans exist in the generation of TCRAJ pseudogenes. The nucleotide and amino acid sequences of the various characterized TCRAJ of each species are reported and we compare our results to the available information on human genomic sequences. Although we provide evidence of dynamic processes modifying TCRAJ segments during primate evolution, their repertoire and primary structure appears to be relatively conserved. 21 refs., 2 figs.
Wang, Bin; Shao, Yanchun; Chen, Tao; Chen, Wanping; Chen, Fusheng
2015-01-01
Acetobacter pasteurianus (Ap) CICC 20001 and CGMCC 1.41 are two acetic acid bacteria strains that, because of their strong abilities to produce and tolerate high concentrations of acetic acid, have been widely used to brew vinegar in China. To globally understand the fermentation characteristics, acid-tolerant mechanisms and genetic stabilities, their genomes were sequenced. Genomic comparisons with 9 other sequenced Ap strains revealed that their chromosomes were evolutionarily conserved, whereas the plasmids were unique compared with other Ap strains. Analysis of the acid-tolerant metabolic pathway at the genomic level indicated that the metabolism of some amino acids and the known mechanisms of acetic acid tolerance, might collaboratively contribute to acetic acid resistance in Ap strains. The balance of instability factors and stability factors in the genomes of Ap CICC 20001 and CGMCC 1.41 strains might be the basis for their genetic stability, consistent with their stable industrial performances. These observations provide important insights into the acid resistance mechanism and the genetic stability of Ap strains and lay a foundation for future genetic manipulation and engineering of these two strains. PMID:26691589
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gantt, E.; Cunningham, F.X. Jr.; Lipschultz, C.A.
1988-04-01
High molecular weight polypeptides from phycobilisomes, believed to be involved in facilitating the energy flow from phycobilisomes to thylakoids, are conserved in the prokaryote Nostoc sp. and the eukaryote Porphyridium cruentum. Partial N-terminal sequence analysis of the phycobilisome-polypeptides of Nostoc (94 kilodalton) and Porphyridium (92 kilodalton) revealed 55% identity in the first 20 residues, but no significant homology with sequences of other phycobiliproteins or phycobilisome-linkers. Polypeptides (94 and 92 kilodalton) from Nostoc thylakoids free of phycobilisomes, previously presumed to be involved in the phycobilisome-thylakoid linkage exhibit the same immunocrossreactivity but are different from the 94 kilodalton-phycobilisome polypeptide by having blocked N-termini and a different amino acid composition.
Aggarwal, A; Adam, R D; Nash, T E
1989-01-01
The amino acid sequence of a 29.4-kilodalton [corrected] structural protein located in the ventral disk and axostyle of Giardia lamblia was determined. Clone lambda M16 from a mung bean expression library in lambda gt11 expressed a fusion protein recognized by three different isolate-specific antisera and sera from G. lamblia-infected gerbils. One of the three EcoRI fragments (M16; 1.26 kilobases) encoded the recognized protein. Sequence analysis revealed a single open reading frame of 813 base pairs. Two areas showed conservation of the positions of some amino acids. The abundance of arginine, glutamic acid, and threonine was increased. Two potential alpha-helical regions were deduced in the regions of repeats. Antisera to the M16 fusion protein reacted specifically with internal components of the ventral disk and axostyle, as well as Giardia fractions enriched for ventral disk structural proteins. An identical protein was recognized in different isolates by anti-M16, and a single identical band was recognized in Southern blots using the M16 1.26-kilobase fragment as a probe. Therefore, the 29.4-kilodalton [corrected] protein appears to be highly conserved compared with variant surface proteins. PMID:2925253
Saito, Yohtaro; Ashida, Hiroki; Sakiyama, Tomoko; de Marsac, Nicole Tandeau; Danchin, Antoine; Sekowska, Agnieszka; Yokota, Akiho
2009-05-08
The sequences classified as genes for various ribulose-1,5-bisphosphate (RuBP) carboxylase/oxygenase (RuBisCO)-like proteins (RLPs) are widely distributed among bacteria, archaea, and eukaryota. In the phylogenic tree constructed with these sequences, RuBisCOs and RLPs are grouped into four separate clades, forms I-IV. In RuBisCO enzymes encoded by form I, II, and III sequences, 19 conserved amino acid residues are essential for CO(2) fixation; however, 1-11 of these 19 residues are substituted with other amino acids in form IV RLPs. Among form IV RLPs, the only enzymatic activity detected to date is a 2,3-diketo-5-methylthiopentyl 1-phosphate (DK-MTP-1-P) enolase reaction catalyzed by Bacillus subtilis, Microcystis aeruginosa, and Geobacillus kaustophilus form IV RLPs. RLPs from Rhodospirillum rubrum, Rhodopseudomonas palustris, Chlorobium tepidum, and Bordetella bronchiseptica were inactive in the enolase reaction. DK-MTP-1-P enolase activity of B. subtilis RLP required Mg(2+) for catalysis and, like RuBisCO, was stimulated by CO(2). Four residues that are essential for the enolization reaction of RuBisCO, Lys(175), Lys(201), Asp(203), and Glu(204), were conserved in RLPs and were essential for DK-MTP-1-P enolase catalysis. Lys(123), the residue conserved in DK-MTP-1-P enolases, was also essential for B. subtilis RLP enolase activity. Similarities between the active site structures of RuBisCO and B. subtilis RLP were examined by analyzing the effects of structural analogs of RuBP on DK-MTP-1-P enolase activity. A transition state analog for the RuBP carboxylation of RuBisCO was a competitive inhibitor in the DK-MTP-1-P enolase reaction with a K(i) value of 103 μM. RuBP and d-phosphoglyceric acid, the substrate and product, respectively, of RuBisCO, were weaker competitive inhibitors. These results suggest that the amino acid residues utilized in the B. subtilis RLP enolase reaction are the same as those utilized in the RuBisCO RuBP enolization reaction.
Rosa, Rafael Diego; Stoco, Patricia Hermes; Barracco, Margherita Anna
2008-11-01
Anti-lipopolysaccharide factors (ALFs) are antimicrobial peptides found in limulids and crustaceans that have a potent and broad range of antimicrobial activity. We report here the identification and molecular characterisation of new sequences encoding for ALFs in the haemocytes of the freshwater prawn Macrobrachium olfersi and also in two Brazilian penaeid species, Farfantepenaeus paulensis and Litopenaeus schmitti. All obtained sequences encoded for highly cationic peptides containing two conserved cysteine residues flanking a putative LPS-binding domain. They exhibited a significant amino acid similarity with crustacean and limulid ALF sequences, especially with those of penaeid shrimps. This is the first identification of ALF in a freshwater prawn.
Hepatitis delta genotypes in chronic delta infection in the northeast of Spain (Catalonia).
Cotrina, M; Buti, M; Jardi, R; Quer, J; Rodriguez, F; Pascual, C; Esteban, R; Guardia, J
1998-06-01
Based on genetic analysis of variants obtained around the world, three genotypes of the hepatitis delta virus have been defined. Hepatitis delta virus variants have been associated with different disease patterns and geographic distributions. To determine the prevalence of hepatitis delta virus genotypes in the northeast of Spain (Catalonia) and the correlation with transmission routes and clinical disease, we studied the nucleotide divergence of the consensus sequence of HDV RNA obtained from 33 patients with chronic delta hepatitis (24 were intravenous drug users and nine had no risk factors), and four patients with acute self-limited delta infection. Serum HDV RNA was amplified by the polymerase chain reaction technique and a fragment of 350 nucleotides (nt 910 to 1259) was directly sequenced. Genetic analysis of the nucleotide consensus sequence obtained showed a high degree of conservation among sequences (93% of mean). Comparison of these sequences with those derived from different geographic areas and pertaining to genotypes I, II and III, showed a mean sequence identity of 92% with genotype I, 73% with genotype II and 61% with genotype III. At the amino acid level (aa 115 to 214), the mean identity was 87% with genotype I, 63% with genotype II and 56% with genotype III. Conserved regions included the RNA editing domain, the carboxyl terminal 19 amino acids of the hepatitis delta antigen and the polyadenylation signal of the viral mRNA. Hepatitis delta virus isolates in the northeast of Spain are exclusively genotype I, independently of the transmission route and the type of infection. No hepatitis delta virus subgenotypes were found, suggesting that the origin of hepatitis delta virus infection in our geographical area is homogeneous.
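The mean sequence identities quoted above can be illustrated with a small Python sketch. It assumes pre-aligned sequences of equal length and excludes any column containing a gap; the abstract does not say how gaps were treated, so that choice, the function names, and the toy sequences below are all hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity between two aligned sequences of equal length.

    Assumption: columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # gap column: not counted
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def mean_identity(isolates: list[str], reference: str) -> float:
    """Average percent identity of several aligned isolates against one reference."""
    return sum(percent_identity(s, reference) for s in isolates) / len(isolates)


# Toy data only, not taken from the study.
print(mean_identity(["ACGT", "ACGA"], "ACGT"))
```

The same functions apply to nucleotide or amino acid alignments, since they only compare characters column by column.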
Conservation of Fold and Topology of Functional Elements in Thiamin Pyrophosphate Enzymes
NASA Technical Reports Server (NTRS)
Dominiak, P.; Ciszak, E. M.
2005-01-01
Thiamin pyrophosphate (TPP)-dependent enzymes are a highly divergent family of proteins binding both TPP and metal ions. They perform decarboxylation of α-ketoacids and of a common -(O=)C-C(OH)- fragment of α-hydroxyaldehydes. Prior to knowledge of three-dimensional structures of these enzymes, the GDGY25-30NN sequence was used to identify these enzymes. Subsequently, a number of structural studies on those enzymes revealed multi-subunit organization and the features of the two duplicate cofactor binding sites. Analyzing the structures of 44 structurally known enzymes, we found that the common structure of these enzymes is reduced to 180-220 amino acid long fragments of two PP and two PYR domains that form the [PP:PYR]2 binding center of two cofactor molecules. The structures of PP and PYR are arranged in a similar fold consisting of a six-stranded β-sheet with triplets of helices on both sides. Residues surrounding the cofactors are not strictly conserved, but they provide the same interatomic contacts required for the catalytic functions that these enzymes perform while maintaining interactive structural integrity. These structural and functional amino acids are topological counterparts located in the same positions of the conserved fold of sets of PP and PYR domains. Additional parallels include short fragments of sequences that link these amino acids to the fold and function. This report on the structural commonalities amongst TPP dependent enzymes is thought to contribute new approaches to annotation that may assist in advancing the functional proteomics of TPP dependent enzymes, and trace their complexity within evolutionary context.
Oluwayelu, D O; Todd, D; Olaleye, O D
2008-12-01
This work reports the first molecular analysis study of chicken anaemia virus (CAV) in backyard chickens in Africa using molecular cloning and sequence analysis to characterize CAV strains obtained from commercial chickens and Nigerian backyard chickens. Partial VP1 gene sequences were determined for three CAVs from commercial chickens and for six CAV variants present in samples from a backyard chicken. Multiple alignment analysis revealed that the 6% and 4% nucleotide diversity obtained respectively for the commercial and backyard chicken strains translated to only 2% amino acid diversity for each breed. Overall, the amino acid composition of Nigerian CAVs was found to be highly conserved. Since the partial VP1 gene sequence of two backyard chicken cloned CAV strains (NGR/CI-8 and NGR/CI-9) were almost identical and evolutionarily closely related to the commercial chicken strains NGR-1, and NGR-4 and NGR-5, respectively, we concluded that CAV infections had crossed the farm boundary.
Sweedler, J V; Li, L; Floyd, P; Gilly, W
2000-12-01
A matrix-assisted laser desorption/ionization (MALDI) mass spectrometric (MS) survey of the major peptides in the stellar, fin and pallial nerves and the posterior chromatophore lobe of the cephalopods Sepia officinalis, Loligo opalescens and Dosidicus gigas has been performed. Although a large number of putative peptides are distinct among the three species, several molecular masses are conserved. In addition to peptides, characterization of the lipid content of the nerves is reported, and these lipid peaks account for many of the lower molecular masses observed. One conserved set of peaks corresponds to the FMRFamide-related peptides (FRPs). The Loligo opalescens FMRFa gene has been sequenced. It encodes a 331 amino acid residue prohormone that is processed into 14 FRPs, which are both predicted by the nucleotide sequence and confirmed by MALDI MS. The FRPs predicted by this gene (FMRFa, FLRFa/FIRFa and ALSGDAFLRFa) are observed in all three species, indicating that members of this peptide family are highly conserved across cephalopods.
Richards, Stephen; Liu, Yue; Bettencourt, Brian R.; Hradecky, Pavel; Letovsky, Stan; Nielsen, Rasmus; Thornton, Kevin; Hubisz, Melissa J.; Chen, Rui; Meisel, Richard P.; Couronne, Olivier; Hua, Sujun; Smith, Mark A.; Zhang, Peili; Liu, Jing; Bussemaker, Harmen J.; van Batenburg, Marinus F.; Howells, Sally L.; Scherer, Steven E.; Sodergren, Erica; Matthews, Beverly B.; Crosby, Madeline A.; Schroeder, Andrew J.; Ortiz-Barrientos, Daniel; Rives, Catharine M.; Metzker, Michael L.; Muzny, Donna M.; Scott, Graham; Steffen, David; Wheeler, David A.; Worley, Kim C.; Havlak, Paul; Durbin, K. James; Egan, Amy; Gill, Rachel; Hume, Jennifer; Morgan, Margaret B.; Miner, George; Hamilton, Cerissa; Huang, Yanmei; Waldron, Lenée; Verduzco, Daniel; Clerc-Blankenburg, Kerstin P.; Dubchak, Inna; Noor, Mohamed A.F.; Anderson, Wyatt; White, Kevin P.; Clark, Andrew G.; Schaeffer, Stephen W.; Gelbart, William; Weinstock, George M.; Gibbs, Richard A.
2005-01-01
We have sequenced the genome of a second Drosophila species, Drosophila pseudoobscura, and compared this to the genome sequence of Drosophila melanogaster, a primary model organism. Throughout evolution the vast majority of Drosophila genes have remained on the same chromosome arm, but within each arm gene order has been extensively reshuffled, leading to a minimum of 921 syntenic blocks shared between the species. A repetitive sequence is found in the D. pseudoobscura genome at many junctions between adjacent syntenic blocks. Analysis of this novel repetitive element family suggests that recombination between offset elements may have given rise to many paracentric inversions, thereby contributing to the shuffling of gene order in the D. pseudoobscura lineage. Based on sequence similarity and synteny, 10,516 putative orthologs have been identified as a core gene set conserved over 25–55 million years (Myr) since the pseudoobscura/melanogaster divergence. Genes expressed in the testes had higher amino acid sequence divergence than the genome-wide average, consistent with the rapid evolution of sex-specific proteins. Cis-regulatory sequences are more conserved than random and nearby sequences between the species—but the difference is slight, suggesting that the evolution of cis-regulatory elements is flexible. Overall, a pattern of repeat-mediated chromosomal rearrangement, and high coadaptation of both male genes and cis-regulatory sequences emerges as important themes of genome divergence between these species of Drosophila. PMID:15632085
BayesMotif: de novo protein sorting motif discovery from impure datasets.
Hu, Jianjun; Zhang, Fan
2010-01-18
Protein sorting is the process by which newly synthesized proteins are transported to their target locations within or outside of the cell. This process is precisely regulated by protein sorting signals in different forms. A major category of sorting signals is amino acid sub-sequences usually located at the N-termini or C-termini of protein sequences. Genome-wide experimental identification of protein sorting signals is extremely time-consuming and costly. Effective computational algorithms for de novo discovery of protein sorting signals are needed to improve the understanding of protein sorting mechanisms. We formulated the protein sorting motif discovery problem as a classification problem and proposed a Bayesian classifier based algorithm (BayesMotif) for de novo identification of a common type of protein sorting motif in which a highly conserved anchor is present along with less conserved motif regions. A false positive removal procedure is developed to iteratively remove sequences that are unlikely to contain true motifs so that the algorithm can identify motifs from impure input sequences. Experiments on both implanted motif datasets and real-world datasets showed that the enhanced BayesMotif algorithm can identify anchored sorting motifs from pure or impure protein sequence datasets. They also show that the false positive removal procedure can help to identify true motifs even when only 20% of the input sequences contain true motif instances. We proposed BayesMotif, a novel Bayesian classification based algorithm for de novo discovery of a special category of anchored protein sorting motifs from impure datasets. Compared to conventional motif discovery algorithms such as MEME, our algorithm can find less-conserved motifs with short highly conserved anchors. Our algorithm also has the advantage of easy incorporation of additional meta-sequence features such as hydrophobicity or charge of the motifs which may help to overcome the limitations of the PWM (position weight matrix) motif model.
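For readers unfamiliar with the PWM (position weight matrix) model that the abstract contrasts BayesMotif with, the following Python sketch shows the conventional idea. It is not the BayesMotif algorithm. The uniform background frequency, the pseudocount of 1, the function names, and the toy KDEL/HDEL instances are all illustrative assumptions.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pwm(instances: list[str], pseudocount: float = 1.0) -> list[dict[str, float]]:
    """Build a log-odds PWM from equal-length motif instances.

    Assumption: a uniform background (1/20 per residue) and additive pseudocounts.
    """
    width = len(instances[0])
    if any(len(s) != width for s in instances):
        raise ValueError("all motif instances must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(instances) + pseudocount * len(AMINO_ACIDS)
    pwm = []
    for pos in range(width):
        counts = Counter(s[pos] for s in instances)
        column = {
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        pwm.append(column)
    return pwm


def best_window(pwm: list[dict[str, float]], sequence: str):
    """Return (start, score) of the highest-scoring window, or None if too short.

    Only the 20 standard residues are supported in this sketch.
    """
    width = len(pwm)
    best = None
    for start in range(len(sequence) - width + 1):
        score = sum(pwm[i][sequence[start + i]] for i in range(width))
        if best is None or score > best[1]:
            best = (start, score)
    return best


# Toy example: four hypothetical motif instances and one query sequence.
pwm = build_pwm(["KDEL", "KDEL", "HDEL", "KDEL"])
print(best_window(pwm, "MSTAAKDEL"))
```

A PWM of this kind treats every position independently, which is one of the limitations the abstract refers to when it mentions adding features such as hydrophobicity or charge.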
Batts, William N.; LaPatra, Scott E.; Katona, Ryan; Leis, Eric; Fei Fan Ng, Terry; Bruieuc, Marine S.O.; Breyta, Rachel; Purcell, Maureen; Waltzek, Thomas B.; Delwart, Eric; Winton, James
2017-01-01
A novel virus, rainbow trout orthomyxovirus (RbtOV), was isolated in 1997 and again in 2000 from commercially-reared rainbow trout (Oncorhynchus mykiss) in Idaho, USA. The virus grew optimally in the CHSE-214 cell line at 15°C producing a diffuse cytopathic effect; however, juvenile rainbow trout exposed to cell culture-grown virus showed no mortality or gross pathology. Electron microscopy of preparations from infected cell cultures revealed the presence of typical orthomyxovirus particles. The complete genome of RbtOV is comprised of eight linear segments of single-stranded, negative-sense RNA having highly conserved 5′ and 3′-terminal nucleotide sequences. Another virus isolated in 2014 from steelhead trout (also O. mykiss) in Wisconsin, USA, and designated SttOV was found to have eight genome segments with high amino acid sequence identities (89–99%) to the corresponding genes of RbtOV, suggesting these new viruses are isolates of the same virus species and may be more widespread than currently realized. The new isolates had the same genome segment order and the closest pairwise amino acid sequence identities of 16–42% with Infectious salmon anemia virus (ISAV), the type species and currently only member of the genus Isavirus in the family Orthomyxoviridae. However, pairwise comparisons of the predicted amino acid sequences of the 10 RbtOV and SttOV proteins with orthologs from representatives of the established orthomyxoviral genera and a phylogenetic analysis using the PB1 protein showed that while RbtOV and SttOV clustered most closely with ISAV, they diverged sufficiently to merit consideration as representatives of a novel genus. A set of PCR primers was designed using conserved regions of the PB1 gene to produce amplicons that may be sequenced for identification of similar fish orthomyxoviruses in the future.
Cloning and characterization of an abalone (Haliotis discus hannai) actin gene
NASA Astrophysics Data System (ADS)
Ma, Hongming; Xu, Wei; Mai, Kangsen; Liufu, Zhiguo; Chen, Hong
2004-10-01
An actin encoding gene was cloned by using RT-PCR, 3′ RACE and 5′ RACE from abalone Haliotis discus hannai. The full length of the gene is 1532 base pairs, which contains a long 3′ untranslated region of 307 base pairs and 79 base pairs of 5′ untranslated sequence. The open reading frame encodes 376 amino acid residues. Sequence comparison with those of human and other mollusks showed high conservation among species at amino acid level. The identities were 96%, 97% and 96% respectively compared with Aplysia californica, Biomphalaria glabrata and Homo sapiens β-actin. It is also indicated that this actin is more similar to the human cytoplasmic actin (β-actin) than to human muscle actin.
Site-Specific Pyrolysis Induced Cleavage at Aspartic Acid Residue in Peptides and Proteins
Zhang, Shaofeng; Basile, Franco
2011-01-01
A simple and site-specific non-enzymatic method based on pyrolysis has been developed to cleave peptides and proteins. Pyrolytic cleavage was found to be specific and rapid as it induced a cleavage at the C-terminal side of aspartic acid in the temperature range of 220–250 °C in 10 seconds. Electrospray Ionization (ESI) mass spectrometry (MS) and tandem-MS (MS/MS) were used to characterize and identify pyrolysis cleavage products, confirming that sequence information is conserved after the pyrolysis process in both peptides and protein tested. This suggests that pyrolysis-induced cleavage at aspartyl residues can be used as a rapid protein digestion procedure for the generation of sequence specific protein biomarkers. PMID:17388620
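The cleavage rule described above (a break at the C-terminal side of each aspartic acid) can be mimicked in silico to predict which fragments to expect. This is only a minimal Python sketch: it assumes complete cleavage at every Asp and no other fragmentation, which the abstract does not claim, and the function name and example peptide are hypothetical.

```python
def cleave_after_asp(sequence: str) -> list[str]:
    """Split a one-letter peptide sequence after every aspartic acid (D).

    Assumption: cleavage is complete and occurs only C-terminal to D.
    """
    fragments = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue == "D":
            fragments.append(sequence[start:i + 1])  # fragment ends with the D
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])  # trailing fragment without a D
    return fragments


# Hypothetical peptide, for illustration only.
print(cleave_after_asp("ACDEFDGH"))  # ['ACD', 'EFD', 'GH']
```

Each predicted fragment keeps its residues in the original order, which is why sequence information survives the cleavage and can be read out by MS/MS.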
Nadin-Davis, S A; Huang, W; Wandeler, A I
1996-03-01
Since its recognition as a discrete epizootic in Florida in the early 1950s, the raccoon strain of rabies virus (RV) has spread over almost the entire eastern seaboard of the US and now threatens to enter the southernmost regions of Canada. To characterise this RV strain in more detail, nucleotide sequencing of the N and G genes, encoding the nucleoprotein and glycoprotein, respectively, of representative isolates has been undertaken. This sequence information generated a conserved restriction map of the N gene, thereby permitting unequivocal identification of this strain by molecular techniques. Comparisons of the predicted nucleoprotein and glycoprotein products with those of other RV strains identified a number of amino acid sequence variations conserved only in the raccoon strain. This information was used to design strain-specific primers targeted to the N gene sequences encoding these residues. The incorporation of these primers into a multiplex polymerase chain reaction (PCR) protocol permitted easy and rapid discrimination between the raccoon RV strain and indigenous Ontario RVs.
Shpakovskiĭ, G V; Lebedenko, E N
1996-12-01
The rpb10+ cDNA from the fission yeast Schizosaccharomyces pombe was cloned using two independent approaches (PCR and genetic suppression). The cloned cDNA encoded the Rpb10 subunit common for all three RNA polymerases. Comparison of the deduced amino acid sequence of the Sz. pombe Rpb10 subunit (71 amino acid residues) with those of the homologous subunits of RNA polymerases I, II, and III from Saccharomyces cerevisiae and Homo sapiens revealed that heptapeptides RCFT/SCGK (residues 6-12), RYCCRRM (residues 43-49), and HVDLIEK (residues 53-59) were evolutionarily the most conserved structural motifs of these subunits. It is shown that the Rpb10 subunit from Sz. pombe can substitute for its homolog (ABC10 beta) in the baker's yeast S. cerevisiae.
Molecular cloning of Kazal-type proteinase inhibitor of the shrimp Fenneropenaeus chinensis.
Kong, Hee Jeong; Cho, Hyun Kook; Park, Eun-Mi; Hong, Gyeong-Eun; Kim, Young-Ok; Nam, Bo-Hye; Kim, Woo-Jin; Lee, Sang-Jun; Han, Hyon Sob; Jang, In-Kwon; Lee, Chang Hoon; Cheong, Jaehun; Choi, Tae-Jin
2009-01-01
Proteinase inhibitors play important roles in host defence systems involving blood coagulation and pathogen digestion. We isolated and characterized a cDNA clone for a Kazal-type proteinase inhibitor (KPI) from a hemocyte cDNA library of the oriental white shrimp Fenneropenaeus chinensis. The KPI gene consists of three exons and two introns. KPI cDNA contains an open reading frame of 396 bp, a polyadenylation signal sequence AATAAA, and a poly (A) tail. KPI cDNA encodes a polypeptide of 131 amino acids with a putative signal peptide of 21 amino acids. The deduced amino acid sequence of KPI contains two homologous Kazal domains, each with six conserved cysteine residues. The mRNA of KPI is expressed in the hemocytes of healthy shrimp, and the higher expression of KPI transcript is observed in shrimp infected with the white spot syndrome virus (WSSV), suggesting a potential role for KPI in host defence mechanisms.
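The relationship between the reported open reading frame length (396 bp) and polypeptide length (131 amino acids) is simple codon arithmetic: 396 / 3 = 132 codons, one of which is the stop codon. The Python sketch below makes that check explicit; it assumes the quoted ORF length includes the stop codon, which is consistent with the numbers given but is not stated in the abstract, and the function name is hypothetical.

```python
def residues_from_orf(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of the given length in base pairs.

    Assumption: when includes_stop is True, the final codon is a stop codon
    and does not contribute a residue.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


print(residues_from_orf(396))  # 131
```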
DOE Office of Scientific and Technical Information (OSTI.GOV)
Largen, M.; Mills, S.E.; Rowe, J.
1978-01-25
Anthranilate-5-phosphoribosylpyrophosphate phosphoribosyltransferase was purified from the bacterium Erwinia carotovora, a member of the Enterobacteriaceae. The enzyme was homogeneous according to the criteria of gel electrophoresis and NH2-terminal amino acid sequence analysis. The molecular weight of the enzyme as determined on a calibrated Sephadex G-200 column was 67,000 ± 2,000. Sodium dodecyl sulfate-polyacrylamide gels gave a subunit molecular weight of 40,000 ± 1,000, suggesting that the enzyme was a dimer. A comparison of the NH2-terminal sequence of the enzyme with the (previously determined) homologue from Serratia marcescens, a monomer with a molecular weight of 45,000, showed that the larger Serratia subunit came into register with amino acid 14 of the Erwinia subunit. The register for the length of the known overlap, 26 amino acids, was highly conserved.
Kolpakova, E; Frengen, E; Stokke, T; Olsnes, S
2000-01-01
Acidic fibroblast growth factor (aFGF) intracellular binding protein (FIBP) is a protein found mainly in the nucleus that might be involved in the intracellular function of aFGF. Here we present a comparative analysis of the deduced amino acid sequences of human, murine and Drosophila FIBP analogues and demonstrate that FIBP is an evolutionarily conserved protein. The human gene spans more than 5 kb, comprising ten exons and nine introns, and maps to chromosome 11q13.1. Two slightly different splice variants found in different tissues were isolated and characterized. Sequence analysis of the region surrounding the translation start revealed a CpG island, a classical feature of widely expressed genes. Functional studies of the promoter region with a luciferase reporter system suggested a strong transcriptional activity residing within 600 bp of the 5' flanking region. PMID:11104667
Ramesh, M V; Podkovyrov, S M; Lowe, S E; Zeikus, J G
1994-01-01
The amylopullulanase gene (apu) of the thermophilic anaerobic bacterium Thermoanaerobacterium saccharolyticum B6A-RI was cloned into Escherichia coli. The complete nucleotide sequence of the gene was determined. It encoded a protein consisting of 1,288 amino acids with a signal peptide of 35 amino acids. The enzyme purified from E. coli was a monomer with an M(r) of 142,000 ± 2,000 and had the same catalytic and thermal characteristics as the native glycoprotein from T. saccharolyticum B6A. Linear alignment and hydrophobic cluster analysis were used to compare this amylopullulanase with other amylolytic enzymes. Both methods revealed strictly conserved amino acid residues among these enzymes, and it is proposed that Asp-594, Asp-700, and Glu-623 form a putative catalytic triad of the T. saccharolyticum B6A-RI amylopullulanase. PMID:8117096
Bezsudnova, Ekaterina Yu; Dibrova, Daria V; Nikolaeva, Alena Yu; Rakitina, Tatiana V; Popov, Vladimir O
2018-04-10
New class IV transaminases with activity towards L-Leu, which is typical of branched-chain amino acid aminotransferases (BCAT), and with activity towards (R)-(+)-1-phenylethylamine ((R)-PEA), which is typical of (R)-selective (R)-amine:pyruvate transaminases, were identified by bioinformatics analysis, obtained in recombinant form, and analyzed. The values of catalytic activities in the reaction with L-Leu and (R)-PEA are comparable to those measured for characteristic transaminases with the corresponding specificity. Earlier, (R)-selective class IV transaminases were found to be active, apart from (R)-PEA, only with some other (R)-primary amines and D-amino acids. Sequences encoding new transaminases with mixed type of activity were found by searching for changes in the conserved motifs of sequences of BCAT by different bioinformatics tools. Copyright © 2018 Elsevier B.V. All rights reserved.
Human mRNA polyadenylate binding protein: evolutionary conservation of a nucleic acid binding motif.
Grange, T; de Sa, C M; Oddos, J; Pictet, R
1987-01-01
We have isolated a full-length cDNA coding for the human poly(A) binding protein. The cDNA-derived 73 kd basic translation product has the same Mr, isoelectric point and peptide map as the poly(A) binding protein. DNA sequence analysis reveals a 70,244 dalton protein. The N-terminal part, highly homologous to the yeast poly(A) binding protein, is sufficient for poly(A) binding activity. This domain consists of a four-fold repeated unit of approximately 80 amino acids present in other nucleic acid binding proteins. In the C-terminal part there is, as in the yeast protein, a sequence of approximately 150 amino acids, rich in proline, alanine and glutamine, which together account for 48% of the residues. A 2.9 kb mRNA corresponding to this cDNA has been detected in several vertebrate cell types and in Drosophila melanogaster at every developmental stage including oogenesis. PMID:2885805
Kim, Jong-Hyun; Sunako, Michihiro; Ono, Hisayo; Murooka, Yoshikatsu; Fukusaki, Eiichiro; Yamashita, Mitsuo
2008-11-01
A starch-hydrolyzing lactic acid bacterium, Lactobacillus plantarum L137, was isolated from traditional fermented food made from fish and rice in the Philippines. A gene (apuA) encoding an amylolytic enzyme from Lactobacillus plantarum L137 was cloned, and its nucleotide sequence was determined. The apuA gene consisted of an open reading frame of 6171 bp encoding a protein of 2056 amino acids, the molecular mass of which was calculated to be 215,625 Da. The catalytic domains of amylase and pullulanase were located in the same region within the middle of the N-terminal region. The deduced amino acid sequence revealed four highly conserved regions that are common among amylolytic enzymes. In the N-terminal region, a six-amino-acid sequence (Asp-Ala/Thr-Ala-Asn-Ser-Thr) is repeated 39 times, and a three-amino-acid sequence (Gln-Pro-Thr) is repeated 50 times in the C-terminal region. The apuA gene was subcloned in L. plantarum NCL21, which is a plasmid-cured derivative of the wild-type L137 strain and has no amylopullulanase activity, and the gene was overexpressed under the control of its own promoter. The ApuA enzyme from this recombinant L. plantarum NCL21 harboring apuA gene was purified. The enzyme has both alpha-amylase and pullulanase activities. The N-terminal sequence of the purified enzyme showed that the signal peptide was cleaved at Ala(36) and the molecular mass of the mature extracellular enzyme is 211,537 Da. The major reaction products from soluble starch were maltotriose (G3) and maltotetraose (G4). Only maltotriose (G3) was produced from pullulan. From these results, we concluded that ApuA is an amylolytic enzyme belonging to the amylopullulanase family.
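The relationship between the reported open reading frame length and the protein length can be checked with simple arithmetic. The sketch below assumes that the quoted ORF length includes the stop codon (the abstract does not say so explicitly); the function name `protein_length_from_orf` is illustrative.

```python
def protein_length_from_orf(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_bp nucleotides.

    Assumes one codon is three nucleotides and, when includes_stop is True,
    that the final codon is a stop codon that adds no residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# apuA figures quoted in the abstract: a 6171 bp ORF and a 2056-residue protein.
print(protein_length_from_orf(6171))
```

Under the stop-codon assumption, 6171 bp corresponds to 2057 codons and therefore 2056 amino acids, which agrees with the figure in the abstract.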
Song, B; Hou, Y L; Ding, X; Wang, T; Wang, F; Zhong, J C; Xu, T; Zhong, J; Hou, W R; Shuai, S R
2014-02-20
Fatty acid binding proteins (FABPs) are a family of small, highly conserved cytoplasmic proteins that bind long-chain fatty acids and other hydrophobic ligands. In this study, cDNA and genomic sequences of FABP4 and FABP5 were cloned successfully from the giant panda (Ailuropoda melanoleuca) using reverse transcription polymerase chain reaction (RT-PCR) technology and touchdown-PCR. The cDNAs of FABP4 and FABP5 cloned from the giant panda were 400 and 413 bp in length, containing open reading frames of 399 and 408 bp, encoding 132 and 135 amino acids, respectively. The genomic sequences of FABP4 and FABP5 were 3976 and 3962 bp, respectively, each containing four exons and three introns. Sequence alignment indicated a high degree of homology with reported FABP sequences of other mammals at both the amino acid and DNA levels. Topology prediction revealed seven protein kinase C phosphorylation sites, two casein kinase II phosphorylation sites, two N-myristoylation sites, and one cytosolic fatty acid-binding protein signature in the FABP4 protein, and three N-glycosylation sites, three protein kinase C phosphorylation sites, one casein kinase II phosphorylation site, one N-myristoylation site, one amidation site, and one cytosolic fatty acid-binding protein signature in the FABP5 protein. The FABP4 and FABP5 genes were overexpressed in Escherichia coli BL21 and produced the expected 16.8- and 17.0-kDa polypeptides. The results obtained in this study provide information for further in-depth research on this system, which is of both theoretical and practical significance.
Satoh, Dan; Hiraoka, Yasutaka; Colman, Brian; Matsuda, Yusuke
2001-01-01
A single intracellular carbonic anhydrase (CA) was detected in air-grown and, at reduced levels, in high CO2-grown cells of the marine diatom Phaeodactylum tricornutum (UTEX 642). No external CA activity was detected irrespective of growth CO2 conditions. Ethoxyzolamide (0.4 mM), a CA-specific inhibitor, severely inhibited high-affinity photosynthesis at low concentrations of dissolved inorganic carbon, whereas 2 mM acetazolamide had little effect on the affinity for dissolved inorganic carbon, suggesting that internal CA is crucial for the operation of a carbon concentrating mechanism in P. tricornutum. Internal CA was purified 36.7-fold relative to cell homogenates by ammonium sulfate precipitation and two-step column chromatography on diethylaminoethyl-sephacel and p-aminomethylbenzene sulfone amide agarose. The purified CA was shown, by SDS-PAGE, to comprise an electrophoretically single polypeptide of 28 kD under both reduced and nonreduced conditions. The entire sequence of the cDNA of this CA was obtained by the rapid amplification of cDNA ends method and indicated that the cDNA encodes 282 amino acids. Comparison of this putative precursor sequence with the N-terminal amino acid sequence of the purified CA indicated that it included a possible signal sequence of up to 46 amino acids at the N terminus. The mature CA was found to consist of 236 amino acids and the sequence was homologous to β-type CAs. Even though the zinc-ligand amino acid residues were shown to be completely conserved, the amino acid residues that may constitute a CO2-binding site appeared to be unique among the β-CAs so far reported. PMID:11500545
Multiple sequence alignment using multi-objective based bacterial foraging optimization algorithm.
Rani, R Ranjani; Ramyachitra, D
2016-12-01
Multiple sequence alignment (MSA) is a widespread approach in computational biology and bioinformatics. MSA arranges nucleotide or amino acid sequences so that they align with a minimum number of gaps, which points to the functional, evolutionary and structural relationships among the sequences. Computing an MSA that is both accurate and statistically significant remains a challenging task. In this work, the Bacterial Foraging Optimization Algorithm was employed to align biological sequences, which resulted in a non-dominated optimal solution. It uses multiple objectives: maximization of similarity, non-gap percentage and conserved blocks, and minimization of gap penalty. The BAliBASE 3.0 benchmark database was used to examine the proposed algorithm against other methods. In this paper, two algorithms have been proposed: Hybrid Genetic Algorithm with Artificial Bee Colony (GA-ABC) and Bacterial Foraging Optimization Algorithm (BFO). It was found that GA-ABC performed better than the existing optimization algorithms, but the conserved blocks were not obtained using GA-ABC. BFO was then used for the alignment and the conserved blocks were obtained. The proposed Multi-Objective Bacterial Foraging Optimization Algorithm (MO-BFO) was compared with the widely used MSA methods Clustal Omega, Kalign, MUSCLE, MAFFT, Genetic Algorithm (GA), Ant Colony Optimization (ACO), Artificial Bee Colony (ABC), Particle Swarm Optimization (PSO) and Hybrid Genetic Algorithm with Artificial Bee Colony (GA-ABC). The final results show that the proposed MO-BFO algorithm yields better alignments than most widely used methods. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
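The four objectives named in this abstract can be illustrated on a toy alignment. The abstract does not give the authors' exact formulas, so the definitions below are assumptions chosen for clarity: similarity is counted as identical non-gap residue pairs summed over all sequence pairs, a conserved column is a gap-free column with a single residue type, and the gap penalty is affine with hypothetical `gap_open` and `gap_extend` values. The optimization algorithm itself (bacterial foraging) is not implemented here; this only shows what such an optimizer might score.

```python
from itertools import combinations


def non_gap_percentage(aln):
    """Percentage of alignment cells that are residues rather than gaps."""
    total = sum(len(seq) for seq in aln)
    gaps = sum(seq.count("-") for seq in aln)
    return 100.0 * (total - gaps) / total


def conserved_columns(aln):
    """Count gap-free columns in which every sequence has the same residue."""
    return sum(1 for col in zip(*aln) if "-" not in col and len(set(col)) == 1)


def pairwise_identity_score(aln):
    """Sum over all sequence pairs of columns with identical non-gap residues."""
    score = 0
    for a, b in combinations(aln, 2):
        score += sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return score


def affine_gap_penalty(aln, gap_open=10.0, gap_extend=0.5):
    """Affine penalty: gap_open for the first gap cell of a run, gap_extend after."""
    penalty = 0.0
    for seq in aln:
        in_gap = False
        for ch in seq:
            if ch == "-":
                penalty += gap_extend if in_gap else gap_open
                in_gap = True
            else:
                in_gap = False
    return penalty


# Toy alignment, invented purely for illustration.
aln = ["MKT-LV", "MKTALV", "MR--LV"]
print(non_gap_percentage(aln), conserved_columns(aln),
      pairwise_identity_score(aln), affine_gap_penalty(aln))
```

A multi-objective optimizer would try to raise the first three values while lowering the fourth, keeping the set of alignments for which no objective can be improved without worsening another (the non-dominated solutions mentioned in the abstract).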
Martin, Joanne; Kabat, Peter; Herniou, Elisabeth; Tristem, Michael
2002-01-01
A novel group of retroviruses found within the order Crocodylia are described. Phylogenetic analyses demonstrate that they are probably the most divergent members of the Retroviridae described to date; even the most conserved regions of Pol show an average of only 23% amino acid identity when compared to other retroviruses. PMID:11932432
Ding, Hai; Liu, Baoming; Zhao, Chengyu; Yang, Jingxian; Yan, Chunhui; Yan, Ling; Zhuang, Hui; Li, Tong
2014-02-01
Entire C-genotype small hepatitis B surface (SHBs) sequences were isolated from 139 nucleos(t)ide analogue (NA)-naïve and 74 lamivudine (LMV)-treated chronic hepatitis B (CHB) patients. The conservation and variability of all 226 amino acids (AAs) within the sequences were determined individually, revealing a significantly higher mutant isolate rate and mutation frequency in the LMV-treated cohort than in the NA-naïve one (P=0.009 and 0.0001, respectively). Three absolutely conserved fragments (s16-s19, s176-s181 and s185-s188) and seven moderately conserved regions (a few AA sites acquiring increased variability after LMV treatment) were identified. The significant increase in mutation rate after LMV treatment occurred primarily in the major hydrophilic region (except the 'a' determinant) and transmembrane domain 3/4, but not in other upstream functional regions of SHBs. With little influence on immune escape-associated mutation frequencies within the 'a' determinant, LMV monotherapy significantly induced classical LMVr-associated mirror changes sE164D/rtV173L, sI195M/rtM204V and sW196L/S/rtM204I, as well as non-classical ones sG44E/rtS53N, sT47K/A/rtH55R/Q and sW182stop/rtV191I outside the 'a' determinant. Interestingly, another newly identified truncation mutation, sC69stop/rtS78T, decreased from 7.91% (11/139) in the NA-naïve cohort to 2.70% (2/74) in the LMV-treated one. Altogether, the altered AA conservation and diversity in SHBs sequences after LMV treatment in genotype-C HBV infection might shed new light on how LMV therapy affects SHBs variant evolution and antigenicity. Copyright © 2013 Elsevier B.V. All rights reserved.
Cloning and expression of a cDNA coding for catalase from zebrafish (Danio rerio).
Ken, C F; Lin, C T; Wu, J L; Shaw, J F
2000-06-01
A full-length complementary DNA (cDNA) clone encoding a catalase was amplified by the rapid amplification of cDNA ends-polymerase chain reaction (RACE-PCR) technique from zebrafish (Danio rerio) mRNA. Nucleotide sequence analysis of this cDNA clone revealed that it comprised a complete open reading frame coding for 526 amino acid residues and that the encoded protein had a molecular mass of 59 654 Da. The deduced amino acid sequence showed high similarity with the sequences of catalase from swine (86.9%), mouse (85.8%), rat (85%), human (83.7%), fruit fly (75.6%), nematode (71.1%), and yeast (58.6%). The amino acid residues for secondary structures are apparently conserved as they are present in other mammal species. Furthermore, the coding region of zebrafish catalase was introduced into an expression vector, pET-20b(+), and transformed into Escherichia coli expression host BL21(DE3)pLysS. A 60-kDa active catalase protein was expressed and detected by Coomassie blue staining as well as activity staining on polyacrylamide gel following electrophoresis.
Wozniak, D J; Hsu, L Y; Galloway, D R
1988-01-01
Exotoxin A (ETA) is recognized as the most toxic product associated with the opportunistic pathogen Pseudomonas aeruginosa. Identification of the amino acids in the polypeptide sequence that are required for toxin activity is critical for vaccine development. By defining the nucleotide sequence of the structural gene of a mutant that encodes an enzymatically inactive ETA (CRM 66), we identified an essential amino acid (His-426), which is involved in the ADP-ribosyltransferase activity associated with functional ETA. A monoclonal antibody that inhibits ETA enzymatic activity in vitro fails to react with ETA variants that have a His-426→Tyr substitution. Several mono-ADP-ribosylating toxins, including diphtheria and pertussis toxins, carry within their primary amino acid sequences a histidine residue that is conserved in spacing and in location with respect to other critical residues. Analysis of the three-dimensional structure of ETA revealed that His-426 is not associated with the proposed NAD+ binding site. These findings should be useful for the design and construction of toxin vaccines. PMID:3143111
Identification and characterization of novel reptile cathelicidins from elapid snakes.
Zhao, Hui; Gan, Tong-Xiang; Liu, Xiao-Dong; Jin, Yang; Lee, Wen-Hui; Shen, Ji-Hong; Zhang, Yun
2008-10-01
Three cDNA sequences coding for elapid cathelicidins were cloned from constructed venom gland cDNA libraries of Naja atra, Bungarus fasciatus and Ophiophagus hannah. The open reading frames of the cloned elapid cathelicidins were all composed of 576 bp and coded for protein precursors of 191 amino acid residues. Each of the deduced elapid cathelicidins has a 22 amino acid residue signal peptide, a conserved cathelin domain of 135 amino acid residues and a mature antimicrobial peptide of 34 amino acid residues. Unlike the highly divergent cathelicidins in mammals, the nucleotide and deduced protein sequences of the three cloned elapid cathelicidins were remarkably conserved. All the elapid mature cathelicidins were predicted to be cleaved at Valine157 by elastase. OH-CATH, the deduced mature cathelicidin from king cobra, was chemically synthesized and showed strong antibacterial activity against various bacteria, with minimal inhibitory concentrations of 1-20 μg/ml in the presence of 1% NaCl. Meanwhile, the synthetic peptide showed no haemolytic activity toward human red blood cells even at a high dose of 200 μg/ml. Phylogenetic analysis of cathelicidins from vertebrates suggested that elapid and viperid cathelicidins were grouped together in the tree. Snake cathelicidins were evolutionarily closely related to the neutrophilic granule proteins (NGPs) from mouse, rat and rabbit. Snake cathelicidins also showed a close relationship with avian fowlicidins (1-3) and chicken myeloid antimicrobial peptide 27. Elapid cathelicidins might be used as models for the development of novel therapeutic drugs.
Bernardes, Juliana S; Carbone, Alessandra; Zaverucha, Gerson
2011-03-23
Remote homology detection is a hard computational problem. Most approaches have trained computational models by using either full protein sequences or multiple sequence alignments (MSA), including all positions. However, when we deal with proteins in the "twilight zone" we can observe that only some segments of sequences (motifs) are conserved. We introduce a novel logical representation that allows us to represent physico-chemical properties of sequences, conserved amino acid positions and conserved physico-chemical positions in the MSA. From this, Inductive Logic Programming (ILP) finds the most frequent patterns (motifs) and uses them to train propositional models, such as decision trees and support vector machines (SVM). We use the SCOP database to perform our experiments by evaluating protein recognition within the same superfamily. Our results show that our methodology, when using SVM, performs significantly better than some of the state-of-the-art methods, and comparably to others. In addition, our method provides a comprehensible set of logical rules that can help to understand what determines a protein function. The strategy of selecting only the most frequent patterns is effective for remote homology detection. This is possible through a suitable first-order logical representation of homologous properties, and through a set of frequent patterns, found by an ILP system, that summarizes essential features of protein functions. PMID:21429187
Hatton, Leslie; Warr, Gregory
2015-01-01
That the physicochemical properties of amino acids constrain the structure, function and evolution of proteins is not in doubt. However, principles derived from information theory may also set bounds on the structure (and thus also the evolution) of proteins. Here we analyze the global properties of the full set of proteins in release 13-11 of the SwissProt database, showing by experimental test of predictions from information theory that their collective structure exhibits properties that are consistent with their being guided by a conservation principle. This principle (Conservation of Information) defines the global properties of systems composed of discrete components each of which is in turn assembled from discrete smaller pieces. In the system of proteins, each protein is a component, and each protein is assembled from amino acids. Central to this principle is the inter-relationship of the unique amino acid count and total length of a protein and its implications for both average protein length and occurrence of proteins with specific unique amino acid counts. The unique amino acid count is simply the number of distinct amino acids (including those that are post-translationally modified) that occur in a protein, and is independent of the number of times that the particular amino acid occurs in the sequence. Conservation of Information does not operate at the local level (it is independent of the physicochemical properties of the amino acids) where the influences of natural selection are manifest in the variety of protein structure and function that is well understood. Rather, this analysis implies that Conservation of Information would define the global bounds within which the whole system of proteins is constrained; thus it appears to be acting to constrain evolution at a level different from natural selection, a conclusion that appears counter-intuitive but is supported by the studies described herein.
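The "unique amino acid count" defined in this abstract is straightforward to compute. The sketch below is a simplification: it works on one-letter sequence strings and therefore cannot distinguish post-translationally modified residues, which the abstract says are counted as distinct. The function name and the example string are illustrative only.

```python
def unique_amino_acid_count(sequence):
    """Number of distinct one-letter residue codes in a protein sequence.

    Independent of how often each residue occurs. Modified residues are not
    distinguished here, unlike in the analysis described in the abstract.
    """
    return len(set(sequence.upper()))


# Toy peptide, invented for illustration: total length 6, four distinct residues.
peptide = "MKKLLA"
print(len(peptide), unique_amino_acid_count(peptide))
```

The pair (total length, unique count) is the quantity whose inter-relationship the abstract describes as central to the proposed conservation principle.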
Identification of two allelic IgG1 C(H) coding regions (Cgamma1) of cat.
Kanai, T H; Ueda, S; Nakamura, T
2000-01-31
Two types of cDNA encoding IgG1 heavy chain (gamma1) were isolated from a single domestic short-hair cat. Sequence analysis indicated a higher level of similarity of these Cgamma1 sequences to human Cgamma1 sequence (76.9 and 77.0%) than to mouse sequence (70.0 and 69.7%) at the nucleotide level. Predicted primary structures of both the feline Cgamma1 genes, designated as Cgamma1a and Cgamma1b, were similar to that of human Cgamma1 gene, for instance, as to the size of constant domains, the presence of six conserved cysteine residues involved in formation of the domain structure, and the location of a conserved N-linked glycosylation site. Sequence comparison between the two alleles showed that 7 out of 10 nucleotide differences were within the C(H)3 domain coding region, all leading to nonsynonymous changes in amino acid residues. Partial sequence analysis of genomic clones showed three nucleotide substitutions between the two Cgamma1 alleles in the intron between the C(H)2 and C(H)3 domain coding regions. In 12 domestic short-hair cats used in this study, the frequency of Cgamma1a allele (62.5%) was higher than that of the Cgamma1b allele (37.5%).
Kashuk, Carl S.; Stone, Eric A.; Grice, Elizabeth A.; Portnoy, Matthew E.; Green, Eric D.; Sidow, Arend; Chakravarti, Aravinda; McCallion, Andrew S.
2005-01-01
The ability to discriminate between deleterious and neutral amino acid substitutions in the genes of patients remains a significant challenge in human genetics. The increasing availability of genomic sequence data from multiple vertebrate species allows sequence conservation and physicochemical properties of residues to be used for functional prediction. In this study, the RET receptor tyrosine kinase serves as a model disease gene in which a broad spectrum (≥116) of disease-associated mutations has been identified among patients with Hirschsprung disease and multiple endocrine neoplasia type 2. We report the alignment of the human RET protein sequence with the orthologous sequences of 12 non-human vertebrates (eight mammalian, one avian, and three teleost species), their comparative analysis, the evolutionary topology of the RET protein, and predicted tolerance for all published missense mutations. We show that, although evolutionary conservation alone provides significant information to predict the effect of a RET mutation, a model that combines comparative sequence data with analysis of physicochemical properties in a quantitative framework provides far greater accuracy. Although the ability to discern the impact of a mutation is imperfect, our analyses permit substantial discrimination between predicted functional classes of RET mutations and disease severity even for a multigenic disease such as Hirschsprung disease. PMID:15956201
Adam, Benoit; Charloteaux, Benoit; Beaufays, Jerome; Vanhamme, Luc; Godfroid, Edmond; Brasseur, Robert; Lins, Laurence
2008-01-01
Background: Lipocalins are widely distributed in nature and are found in bacteria, plants, arthropods and vertebrates. In hematophagous arthropods, they are implicated in the successful accomplishment of the blood meal, interfering with platelet aggregation, blood coagulation and inflammation, and in the transmission of disease parasites such as Trypanosoma cruzi and Borrelia burgdorferi. The pairwise sequence identity is low among this family, often below 30%, despite a well conserved tertiary structure. Under the 30% identity threshold, alignment methods do not correctly assign and align proteins. The only safe way to assign a sequence to that family is by experimental determination. However, these procedures are long and costly and cannot always be applied. A way to circumvent the experimental approach is sequence and structure analysis. To further help in that task, the residues implicated in the stabilisation of the lipocalin fold were determined. This was done by analyzing the conserved interactions for ten lipocalins having a maximum pairwise identity of 28% and various functions. Results: By analysing the ten lipocalin structures and sequences, it was determined that two hydrophobic clusters of residues are conserved. One cluster is internal to the barrel, involving all strands and the 3₁₀ helix. The other is external, involving four strands and the helix lying parallel to the barrel surface. These clusters are also present in RaHBP2, an unusual "outlier" lipocalin from the tick Rhipicephalus appendiculatus. This information was used to assess the assignment of LIR2, a protein from Ixodes ricinus, and to build a 3D model that helps to predict function. FTIR data support the lipocalin fold for this protein. Conclusion: By sequence and structural analyses, two conserved clusters of interacting hydrophobic residues have been identified in lipocalins. Since the residues implicated are not conserved for function, they should provide the minimal subset necessary to confer the lipocalin fold. This information has been used to assign LIR2 to lipocalins and to investigate its structure/function relationship. This study could be applied to other protein families with low pairwise similarity, such as the structurally related fatty acid binding proteins or avidins. PMID:18190694
de Vries, G E; Arfman, N; Terpstra, P; Dijkhuizen, L
1992-01-01
The gene (mdh) coding for methanol dehydrogenase (MDH) of the thermotolerant methylotroph Bacillus methanolicus C1 has been cloned and sequenced. The deduced amino acid sequence of the mdh gene exhibited similarity to those of five other alcohol dehydrogenase (type III) enzymes, which are distinct from the long-chain zinc-containing (type I) or short-chain zinc-lacking (type II) enzymes. Highly efficient expression of the mdh gene in Escherichia coli was probably driven from its own promoter sequence. After purification of MDH from E. coli, the kinetic and biochemical properties of the enzyme were investigated. The physiological effect of MDH synthesis in E. coli and the role of conserved sequence patterns in type III alcohol dehydrogenases have been analyzed and are discussed. PMID:1644761
Hunt, C; Morimoto, R I
1985-01-01
We have determined the nucleotide sequence of the human hsp70 gene and 5' flanking region. The hsp70 gene is transcribed as an uninterrupted primary transcript of 2440 nucleotides composed of a 5' noncoding leader sequence of 212 nucleotides, a 3' noncoding region of 242 nucleotides, and a continuous open reading frame of 1986 nucleotides that encodes a protein with predicted molecular mass of 69,800 daltons. Upstream of the 5' terminus are the canonical TATAAA box, the sequence ATTGG that corresponds in the inverted orientation to the CCAAT motif, and the dyad sequence CTGGAAT/ATTCCCG that shares homology in 12 of 14 positions with the consensus transcription regulatory sequence common to Drosophila heat shock genes. Comparison of the predicted amino acid sequences of human hsp70 with the published sequences of Drosophila hsp70 and Escherichia coli dnaK reveals that human hsp70 is 73% identical to Drosophila hsp70 and 47% identical to E. coli dnaK. Surprisingly, the nucleotide sequences of the human and Drosophila genes are 72% identical and human and E. coli genes are 50% identical, which is more highly conserved than necessary given the degeneracy of the genetic code. The lack of accumulated silent nucleotide substitutions leads us to propose that there may be additional information in the nucleotide sequence of the hsp70 gene or the corresponding mRNA that precludes the maximum divergence allowed in the silent codon positions. PMID:3931075
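Two of the quantities in this abstract lend themselves to a quick check: the transcript length as the sum of its parts, and percent identity between aligned sequences. The sketch below assumes that identity is computed over aligned columns in which neither sequence has a gap; the abstract does not state which denominator the authors used. The function names and the short example strings are hypothetical and are not hsp70 sequence.

```python
def transcript_length(five_prime_utr, orf, three_prime_utr):
    """Total length of an uninterrupted transcript from its three segments."""
    return five_prime_utr + orf + three_prime_utr


def percent_identity(a, b):
    """Percent identical positions between two pre-aligned, equal-length sequences.

    Columns where either sequence has a gap ('-') are skipped (an assumption).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    matches = sum(1 for x, y in compared if x == y)
    return 100.0 * matches / len(compared)


# Segment lengths quoted in the abstract: 212 + 1986 + 242 nucleotides.
print(transcript_length(212, 1986, 242))

# Toy aligned fragments, invented for illustration: one mismatch in six columns.
print(round(percent_identity("ATGGCC", "ATGGCT"), 2))
```

The segment lengths sum to 2440, matching the primary transcript length given in the abstract. Applying `percent_identity` at both the nucleotide and amino acid level is the kind of comparison behind the authors' observation that the nucleotide sequences are more conserved than codon degeneracy would require.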
Single Amino Acid Repeats in the Proteome World: Structural, Functional, and Evolutionary Insights
Kumar, Amitha Sampath; Sowpati, Divya Tej; Mishra, Rakesh K.
2016-01-01
Microsatellites or simple sequence repeats (SSR) are abundant, highly diverse stretches of short DNA repeats present in all genomes. Tandem mono/tri/hexanucleotide repeats in the coding regions contribute to single amino acids repeats (SAARs) in the proteome. While SSRs in the coding region always result in amino acid repeats, a majority of SAARs arise due to a combination of various codons representing the same amino acid and not as a consequence of SSR events. Certain amino acids are abundant in repeat regions indicating a positive selection pressure behind the accumulation of SAARs. By analysing 22 proteomes including the human proteome, we explored the functional and structural relationship of amino acid repeats in an evolutionary context. Only ~15% of repeats are present in any known functional domain, while ~74% of repeats are present in the disordered regions, suggesting that SAARs add to the functionality of proteins by providing flexibility, stability and act as linker elements between domains. Comparison of SAAR containing proteins across species reveals that while shorter repeats are conserved among orthologs, proteins with longer repeats, >15 amino acids, are unique to the respective organism. Lysine repeats are well conserved among orthologs with respect to their length and number of occurrences in a protein. Other amino acids such as glutamic acid, proline, serine and alanine repeats are generally conserved among the orthologs with varying repeat lengths. These findings suggest that SAARs have accumulated in the proteome under positive selection pressure and that they provide flexibility for optimal folding of functional/structural domains of proteins. The insights gained from our observations can help in effective designing and engineering of proteins with novel features. PMID:27893794
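As an illustration of what detecting a single amino acid repeat involves, the sketch below scans a protein string for runs of one residue. The minimum run length `min_len=5` is an assumed threshold chosen for the example (the abstract does not state the cutoff used), and `find_saars` is a hypothetical helper name.

```python
import re
from typing import List, Tuple


def find_saars(protein: str, min_len: int = 5) -> List[Tuple[str, int, int]]:
    """Return (residue, start_index, run_length) for each run of a single
    amino acid that is at least min_len residues long."""
    # ([A-Z]) captures one residue; \1{n,} requires n or more further copies.
    pattern = re.compile(r"([A-Z])\1{%d,}" % (min_len - 1))
    return [
        (m.group(1), m.start(), m.end() - m.start())
        for m in pattern.finditer(protein)
    ]


# Hypothetical toy sequence with a lysine run and a glutamine run.
toy_protein = "MKKKKKKAQQQQQPLS"
print(find_saars(toy_protein))
```

Start indices are 0-based. Classifying repeats by length, for example the >15-residue class mentioned above, would only need a filter on the third tuple element.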
Wu, Qinglong; Shah, Nagendra P
2017-11-22
γ-Aminobutyric acid (GABA) and GABA-rich foods have shown anti-hypertensive and anti-depressant activities as the major functions in humans and animals. Hence, high GABA-producing lactic acid bacteria (LAB) could be used as functional starters for manufacturing novel fermented dairy foods. Glutamic acid decarboxylases (GADs) from LAB are highly conserved at the species level based on the phylogenetic tree of GADs from LAB. Moreover, two functionally distinct GADs and one intact gad operon were observed in all the completely sequenced Lactobacillus brevis strains suggesting its common capability to synthesize GABA. Difficulties and strategies for the manufacture of GABA-rich fermented dairy foods have been discussed and proposed, respectively. In addition, a genetic survey on the sequenced LAB strains demonstrated the absence of cell envelope proteinases in the majority of LAB including Lb. brevis, which diminishes their cell viabilities in milk environments due to their non-proteolytic nature. Thus, several strategies have been proposed to overcome the non-proteolytic nature of Lb. brevis in order to produce GABA-rich dairy foods.
Inverse statistical physics of protein sequences: a key issues review.
Cocco, Simona; Feinauer, Christoph; Figliuzzi, Matteo; Monasson, Rémi; Weigt, Martin
2018-03-01
In the course of evolution, proteins undergo important changes in their amino acid sequences, while their three-dimensional folded structure and their biological function remain remarkably conserved. Thanks to modern sequencing techniques, sequence data accumulate at unprecedented pace. This provides large sets of so-called homologous, i.e. evolutionarily related protein sequences, to which methods of inverse statistical physics can be applied. Using sequence data as the basis for the inference of Boltzmann distributions from samples of microscopic configurations or observables, it is possible to extract information about evolutionary constraints and thus protein function and structure. Here we give an overview over some biologically important questions, and how statistical-mechanics inspired modeling approaches can help to answer them. Finally, we discuss some open questions, which we expect to be addressed over the next years.
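Inference of this kind usually starts from empirical statistics of a multiple sequence alignment, such as single-site and pairwise residue frequencies. The sketch below computes those quantities and their connected correlation on a toy alignment. It is a simplified illustration under stated assumptions (no sequence reweighting, no pseudocounts, hypothetical function names) and not the specific procedure of the review.

```python
from collections import Counter
from typing import Dict, List, Tuple


def site_frequencies(msa: List[str], i: int) -> Dict[str, float]:
    """Empirical frequency f_i(a) of each residue a in alignment column i."""
    counts = Counter(seq[i] for seq in msa)
    return {aa: c / len(msa) for aa, c in counts.items()}


def pair_frequencies(msa: List[str], i: int, j: int) -> Dict[Tuple[str, str], float]:
    """Empirical joint frequency f_ij(a, b) for columns i and j."""
    counts = Counter((seq[i], seq[j]) for seq in msa)
    return {pair: c / len(msa) for pair, c in counts.items()}


def connected_correlation(msa: List[str], i: int, j: int, a: str, b: str) -> float:
    """c_ij(a, b) = f_ij(a, b) - f_i(a) * f_j(b); zero if columns are independent."""
    f_i = site_frequencies(msa, i).get(a, 0.0)
    f_j = site_frequencies(msa, j).get(b, 0.0)
    f_ij = pair_frequencies(msa, i, j).get((a, b), 0.0)
    return f_ij - f_i * f_j


# Hypothetical toy alignment: columns 0 and 1 co-vary, column 2 is invariant.
toy_msa = ["AKL", "AKL", "GEL", "GEL"]
print(connected_correlation(toy_msa, 0, 1, "A", "K"))
print(connected_correlation(toy_msa, 0, 2, "A", "L"))
```

A fully conserved column carries no covariation signal, so its connected correlation with any other column is zero, whereas the two co-varying columns give a positive value.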
Kumar, Rajinder; Adams, Brian; Oldenburg, Anja; Musiyenko, Alla; Barik, Sailen
2002-01-01
Background: Reversible protein phosphorylation is relatively unexplored in the intracellular protozoa of the Apicomplexa family that includes the genus Plasmodium, to which belong the causative agents of malaria. Members of the PP1 family represent the most highly conserved protein phosphatase sequences in phylogeny and play essential regulatory roles in various cellular pathways. Previous evidence suggested a PP1-like activity in Plasmodium falciparum, not yet identified at the molecular level. Results: We have identified a PP1 catalytic subunit from P. falciparum and named it PfPP1. The predicted primary structure of the 304-amino acid long protein was highly similar to PP1 sequences of other species, and showed conservation of all the signature motifs. The purified recombinant protein exhibited potent phosphatase activity in vitro. Its sensitivity to specific phosphatase inhibitors was characteristic of the PP1 class. The authenticity of the PfPP1 cDNA was further confirmed by mutational analysis of strategic amino acid residues important in catalysis. The protein was expressed in all erythrocytic stages of the parasite. Abrogation of PP1 expression by synthetic short interfering RNA (siRNA) led to inhibition of parasite DNA synthesis. Conclusions: The high sequence similarity of PfPP1 with other PP1 members suggests conservation of function. Phenotypic gene knockdown studies using siRNA confirmed its essential role in the parasite. Detailed studies of PfPP1 and its regulation may unravel the role of reversible protein phosphorylation in the signalling pathways of the parasite, including glucose metabolism and parasitic cell division. The use of siRNA could be an important tool in the functional analysis of Apicomplexan genes. PMID:12057017
Jaiswal, Mamta; Dvorsky, Radovan; Ahmadian, Mohammad Reza
2013-02-08
The diffuse B-cell lymphoma (Dbl) family of the guanine nucleotide exchange factors is a direct activator of the Rho family proteins. The Rho family proteins are involved in almost every cellular process that ranges from fundamental (e.g. the establishment of cell polarity) to highly specialized processes (e.g. the contraction of vascular smooth muscle cells). Abnormal activation of the Rho proteins is known to play a crucial role in cancer, infectious and cognitive disorders, and cardiovascular diseases. However, the existence of 74 Dbl proteins and 25 Rho-related proteins in humans, which are largely uncharacterized, has led to increasing complexity in identifying specific upstream pathways. Thus, we comprehensively investigated sequence-structure-function-property relationships of 21 representatives of the Dbl protein family regarding their specificities and activities toward 12 Rho family proteins. The meta-analysis approach provides an unprecedented opportunity to broadly profile functional properties of Dbl family proteins, including catalytic efficiency, substrate selectivity, and signaling specificity. Our analysis has provided novel insights into the following: (i) understanding of the relative differences of various Rho protein members in nucleotide exchange; (ii) comparing and defining individual and overall guanine nucleotide exchange factor activities of a large representative set of the Dbl proteins toward 12 Rho proteins; (iii) grouping the Dbl family into functionally distinct categories based on both their catalytic efficiencies and their sequence-structural relationships; (iv) identifying conserved amino acids as fingerprints of the Dbl and Rho protein interaction; and (v) defining amino acid sequences conserved within, but not between, Dbl subfamilies. Therefore, the characteristics of such specificity-determining residues identified the regions or clusters conserved within the Dbl subfamilies.
Mapping Interaction Sites on Human Chemokine Receptors by Deep Mutational Scanning.
Heredia, Jeremiah D; Park, Jihye; Brubaker, Riley J; Szymanski, Steven K; Gill, Kevin S; Procko, Erik
2018-06-01
Chemokine receptors CXCR4 and CCR5 regulate WBC trafficking and are engaged by the HIV-1 envelope glycoprotein gp120 during infection. We combine a selection of human CXCR4 and CCR5 libraries comprising nearly all of ∼7000 single amino acid substitutions with deep sequencing to define sequence-activity landscapes for surface expression and ligand interactions. After consideration of sequence constraints for surface expression, known interaction sites with HIV-1-blocking Abs were appropriately identified as conserved residues following library sorting for Ab binding, validating the use of deep mutational scanning to map functional interaction sites in G protein-coupled receptors. Chemokine CXCL12 was found to interact with residues extending asymmetrically into the CXCR4 ligand-binding cavity, similar to the binding surface of CXCR4 recognized by an antagonistic viral chemokine previously observed crystallographically. CXCR4 mutations distal from the chemokine binding site were identified that enhance chemokine recognition. This included disruptive mutations in the G protein-coupling site that diminished calcium mobilization, as well as conservative mutations to a membrane-exposed site (CXCR4 residues H79^2.45 and W161^4.50) that increased ligand binding without loss of signaling. Compared with CXCR4-CXCL12 interactions, CCR5 residues conserved for gp120 (HIV-1 BaL strain) interactions map to a more expansive surface, mimicking how the cognate chemokine CCL5 makes contacts across the entire CCR5 binding cavity. Acidic substitutions in the CCR5 N terminus and extracellular loops enhanced gp120 binding. This study demonstrates how comprehensive mutational scanning can define functional interaction sites on receptors, and novel mutations that enhance receptor activities can be found simultaneously. Copyright © 2018 by The American Association of Immunologists, Inc.
Yao, Q; Fischer, K P; Tyrrell, D L; Gutfreund, K S
2015-04-01
Programmed death ligand-1 (PD-L1) plays an important role in the attenuation of adaptive immune responses in higher vertebrates. Here, we describe the identification of the Pekin duck PD-L1 orthologue (duPD-L1) and its gene structure. The duPD-L1 cDNA encodes a 311-amino acid protein that has an amino acid identity of 78% and 42% with chicken and human PD-L1, respectively. Mapping of the duPD-L1 cDNA with duck genomic sequences revealed an exonic structure of its coding sequence similar to those of other vertebrates but lacked a noncoding exon 1. Homology modelling of the duPD-L1 extracellular domain was compatible with the tandem IgV-like and IgC-like IgSF domain structure of human PD-L1 (PDB ID: 3BIS). Residues known to be important for receptor binding of human PD-L1 were mostly conserved in duPD-L1 within the N-terminus and the G sheet, and partially conserved within the F sheet but not within sheets C and C'. DuPD-L1 mRNA was constitutively expressed in all tissues examined with highest expression levels in lung and spleen and very low levels of expression in muscle, kidney and brain. Mitogen stimulation of duck peripheral blood mononuclear cells transiently increased duPD-L1 mRNA expression. Our observations demonstrate evolutionary conservation of the exonic structure of its coding sequence, the extracellular domain structure and residues implicated in receptor binding, but the role of the longer cytoplasmic tail in avian PD-L1 proteins remains to be determined. © 2014 John Wiley & Sons Ltd.
Sakoda, H; Imanaka, T
1992-02-01
Using Bacillus subtilis as a host and pTB524 as a vector plasmid, we cloned the thermostable alcohol dehydrogenase (ADH-T) gene (adhT) from Bacillus stearothermophilus NCA1503 and determined its nucleotide sequence. The deduced amino acid sequence (337 amino acids) was compared with the sequences of ADHs from four different origins. The amino acid residues responsible for the catalytic activity of horse liver ADH had been clarified on the basis of three-dimensional structure. Since those catalytic amino acid residues were fairly conserved in ADH-T and other ADHs, ADH-T was inferred to have basically the same proton release system as horse liver ADH. The putative proton release system of ADH-T was elucidated by introducing point mutations at the catalytic amino acid residues, Cys-38 (cysteine at position 38), Thr-40, and His-43, with site-directed mutagenesis. The mutant enzyme Thr-40-Ser (Thr-40 was replaced by serine) showed a little lower level of activity than wild-type ADH-T did. The result indicates that the OH group of serine instead of threonine can also be used for the catalytic activity. To change the pKa value of the putative system, His-43 was replaced by the more basic amino acid arginine. As a result, the optimum pH of the mutant enzyme His-43-Arg was shifted from 7.8 (wild-type enzyme) to 9.0. His-43-Arg exhibited a higher level of activity than wild-type enzyme at the optimum pH. PMID:1735726
Huang, C.; Chien, M.S.; Landolt, M.L.; Batts, W.; Winton, J.
1996-01-01
Twelve neutralizing monoclonal antibodies (MAbs) against the fish rhabdovirus, infectious haematopoietic necrosis virus (IHNV), were used to select 20 MAb escape mutants. The nucleotide sequence of the entire glycoprotein (G) gene was determined for six mutants representing differing cross-neutralization patterns and each had a single nucleotide change leading to a single amino acid substitution within one of three regions of the protein. These data were used to design nested PCR primers to amplify portions of the G gene of the 14 remaining mutants. When the PCR products from these mutants were sequenced, they also had single nucleotide substitutions coding for amino acid substitutions at the same, or nearby, locations. Of the 20 mutants for which all or part of the glycoprotein gene was sequenced, two MAbs selected mutants with substitutions at amino acids 230-231 (antigenic site I) and the remaining MAbs selected mutants with substitutions at amino acids 272-276 (antigenic site II). Two MAbs that selected mutants mapping to amino acids 272-276, selected other mutants that mapped to amino acids 78-81, raising the possibility that this portion of the N terminus of the protein was part of a discontinuous epitope defining antigenic site II. CLUSTAL alignment of the glycoproteins of rabies virus, vesicular stomatitis virus and IHNV revealed similarities in the location of the neutralizing epitopes and a high degree of conservation among cysteine residues, indicating that the glycoproteins of three different genera of animal rhabdoviruses may share a similar three-dimensional structure in spite of extensive sequence divergence.
Nozaki, T; Arase, T; Shigeta, Y; Asai, T; Leustek, T; Takeuchi, T
1998-12-08
A gene encoding adenosine-5'-triphosphate sulfurylase (AS) was cloned from the enteric protozoan parasite Entamoeba histolytica by polymerase chain reaction using degenerate oligonucleotide primers corresponding to conserved regions of the protein from a variety of organisms. The deduced amino acid sequence of E. histolytica AS revealed a calculated molecular mass of 47925 Da and an unusual basic pI of 9.38. The amebic protein sequence showed 23-48% identities with AS from bacteria, yeasts, fungi, plants, and animals with the highest identities being to Synechocystis sp. and Bacillus subtilis (48 and 44%, respectively). Four conserved blocks including putative sulfate-binding and phosphate-binding regions were highly conserved in the E. histolytica AS. The upstream region of the AS gene contained three conserved elements reported for other E. histolytica genes. A recombinant E. histolytica AS revealed enzymatic activity, measured in both the forward and reverse directions. Expression of the E. histolytica AS complemented cysteine auxotrophy of the AS-deficient Escherichia coli strains. Genomic hybridization revealed that the AS gene exists as a single copy gene. In the literature, this is the first description of an AS gene in Protozoa.
Functional dissection of the alphavirus capsid protease: sequence requirements for activity.
Thomas, Saijo; Rai, Jagdish; John, Lijo; Günther, Stephan; Drosten, Christian; Pützer, Brigitte M; Schaefer, Stephan
2010-11-18
The alphavirus capsid is multifunctional and plays a key role in the viral life cycle. The nucleocapsid domain is released by the self-cleavage activity of the serine protease domain within the capsid. All alphaviruses analyzed to date show this autocatalytic cleavage. Here we have analyzed the sequence requirements for the cleavage activity of the capsid protease of Chikungunya virus, a member of the genus Alphavirus. Amongst alphaviruses, the C-terminal amino acid tryptophan (W261) is conserved and was found to be important for the cleavage. Mutating tryptophan to alanine (W261A) completely inactivated the protease. Other amino acids near W261 had no effect on the activity of this protease. However, the serine protease inhibitor AEBSF did not inhibit the activity. Through error-prone PCR we found that isoleucine 227 is important for effective activity. The loss of activity was analyzed further by molecular modelling and comparison of WT and mutant structures. It was found that lysine introduced at position 227 is spatially very close to the catalytic triad and may disrupt electrostatic interactions in the catalytic site and thus inactivate the enzyme. We are also examining other sequence requirements for this protease activity. We analyzed various amino acid sequence requirements for the activity of ChikV capsid protease and found that amino acids outside the catalytic triad are important for the activity.
Genomic Characterization of Phenylalanine Ammonia Lyase Gene in Buckwheat
Thiyagarajan, Karthikeyan; Vitali, Fabio; Tolaini, Valentina; Galeffi, Patrizia; Cantale, Cristina; Vikram, Prashant; Singh, Sukhwinder; De Rossi, Patrizia; Nobili, Chiara; Procacci, Silvia; Del Fiore, Antonella; Antonini, Alessandro; Presenti, Ombretta; Brunori, Andrea
2016-01-01
The phenylalanine ammonia lyase (PAL) gene, which plays a key role in the biosynthesis of the medicinally important compounds rutin and quercetin, was sequence-characterized to support its application in genomics. These compounds possess anti-diabetic and anti-cancer properties and are predominantly produced by Fagopyrum spp. In the present study, the PAL gene was sequenced from three Fagopyrum spp. (F. tataricum, F. esculentum and F. dibotrys) and showed three SNPs and four insertions/deletions at the intra- and interspecific level. Among them, a potential SNP (G>C at position 949 bp) at a parsimony-informative site was selected and successfully used to identify the zygosity/allelic variation of 16 F. tataricum varieties. Insertion mutations were identified in the coding region, which changed a stretch of 39 amino acids in the putative protein. Our study revealed that the autogamous species (F. tataricum) has a lower frequency of observed SNPs than the allogamous species (F. dibotrys and F. esculentum). The SNPs identified in F. tataricum did not result in amino acid changes, while in the other two species they caused both conservative and non-conservative variations. The consistent pattern of SNPs across the species revealed their phylogenetic importance. We found two groups of F. tataricum, one of which was closely related to F. dibotrys. The sequence characterization of the PAL gene reported in the present investigation can be used in the genetic improvement of buckwheat with reference to its medicinal value. PMID:26990297
NASA Astrophysics Data System (ADS)
Yusof, Nik Yusnoraini; Bakar, Farah Diba Abu; Mahadi, Nor Muhammad; Raih, Mohd Firdaus; Murad, Abdul Munir Abdul
2015-09-01
A cDNA encoding an Fe(II)- and 2-oxoglutarate (2OG)-dependent dioxygenase was isolated from the psychrophilic yeast Glaciozyma antarctica PI12. We successfully amplified a 1,029 bp cDNA sequence that encodes 342 amino acids with a predicted molecular weight of 38 kDa. The predicted protein was analysed using various bioinformatics tools to explore its properties. Based on a BLAST search analysis, the Fe2OX amino acid sequence showed 61% identity to the sequence of an oxoglutarate/iron-dependent oxygenase from Rhodosporidium toruloides NP11. SignalP prediction showed that the Fe2OX protein contains no putative signal peptide, which suggests that this enzyme is most probably localised intracellularly. The structure of Fe2OX was predicted by homology modelling using MODELLER9v11. The model with the lowest objective function was selected from one hundred models generated using MODELLER9v11. Analysis of the structure revealed a longer loop in Fe2OX from G. antarctica that might be responsible for the flexibility of the structure, which contributes to its adaptation to low temperatures. Fe2OX holds a highly conserved Fe(II)-binding HXD/E…H triad motif. In the binding site for 2-oxoglutarate, Arg280 was found to be conserved among reported studies, whereas Phe268 was found to be different in Fe2OX.
de-Couet, H. G.; Fong, KSK.; Weeds, A. G.; McLaughlin, P. J.; Miklos, GLG.
1995-01-01
The flightless locus of Drosophila melanogaster has been analyzed at the genetic, molecular, ultrastructural and comparative crystallographic levels. The gene gives rise to a single transcript encoding a protein consisting of a leucine-rich amino-terminal half and a carboxy-terminal half with high sequence similarity to gelsolin. We determined the genomic sequence of the flightless landscape, the breakpoints of four chromosomal rearrangements, and the molecular lesions in two lethal and two viable alleles of the gene. The two alleles that lead to flight muscle abnormalities encode mutant proteins exhibiting amino acid replacements within the S1-like domain of their gelsolin-like region. Furthermore, the deduced intron-exon structure of the D. melanogaster gene has been compared with that of the Caenorhabditis elegans homologue. In addition, the sequence similarities of the flightless protein with gelsolin allow it to be evaluated in the context of the published crystallographic structure of the S1 domain of gelsolin. Amino acids considered essential for the structural integrity of the core are found to be highly conserved in the predicted flightless protein. Some of the residues considered essential for actin and calcium binding in gelsolin S1 and villin V1 are also well conserved. These data are discussed in light of the phenotypic characteristics of the mutants and the putative functions of the protein. PMID:8582612
Chalcone synthase genes from milk thistle (Silybum marianum): isolation and expression analysis.
Sanjari, Sepideh; Shobbar, Zahra Sadat; Ebrahimi, Mohsen; Hasanloo, Tahereh; Sadat-Noori, Seyed-Ahmad; Tirnaz, Soodeh
2015-12-01
Silymarin is a flavonoid compound derived from milk thistle (Silybum marianum) seeds which has several pharmacological applications. Chalcone synthase (CHS) is a key enzyme in the biosynthesis of flavonoids; the identification of CHS-encoding genes in the milk thistle plant is therefore of great importance. In the current research, fragments of CHS genes were amplified using degenerate primers based on the conserved parts of Asteraceae CHS genes, and then cloned and sequenced. Analysis of the resultant nucleotide and deduced amino acid sequences led to the identification of two different members of the CHS gene family, SmCHS1 and SmCHS2. The full-length cDNA of a third member (SmCHS3) was isolated by rapid amplification of cDNA ends (RACE); its open reading frame contained 1239 bp, comprising exon 1 (190 bp) and exon 2 (1049 bp), which encode 63 and 349 amino acids, respectively. In silico analysis showed that the SmCHS3 sequence contains all the conserved CHS sites and shares high homology with CHS proteins from other plants. Real-time PCR analysis indicated that SmCHS1 and SmCHS3 had the highest transcript level in petals in the early flowering stage and in the stem of five upper leaves, followed by five upper leaves in the mid-flowering stage, and are most probably involved in anthocyanin and silymarin biosynthesis.
Genome sequences of a mouse-avirulent and a mouse-virulent strain of Ross River virus.
Faragher, S G; Meek, A D; Rice, C M; Dalgarno, L
1988-04-01
The nucleotide sequence of the genomic RNA of a mouse-avirulent strain of Ross River virus, RRV NB5092 (isolated in 1969), has been determined and the corresponding sequence for the prototype mouse-virulent strain, RRV T48 (isolated in 1959), has been completed. The RRV NB5092 genome is approximately 11,674 nucleotides in length, compared with 11,853 nucleotides for RRV T48. RRV NB5092 and RRV T48 have the same genome organization. For both viruses an untranslated region of 80 nucleotides at the 5' end of the genome is followed by a 7440-nucleotide open reading frame which is interrupted after 5586 nucleotides by a single opal termination codon. By homology with other alphaviruses, the 5586-nucleotide open reading frame encodes the nonstructural proteins nsP1, nsP2, and nsP3; a fourth nonstructural protein, nsP4, is produced by read-through of the opal codon. The RRV nonstructural proteins show strong homology with the corresponding proteins of Sindbis virus and Semliki Forest virus in terms of size, net charge, and hydropathy characteristics. However, homology is not uniform between or within the proteins; nsP1, nsP2, and nsP4 contain extended domains which are highly conserved between alphaviruses, while the C-terminal region of nsP3 shows little conservation in sequence or length between alphaviruses. An untranslated "junction" region of 44 nucleotides (for RRV NB5092) or 47 nucleotides (for RRV T48) separates the nonstructural and structural protein coding regions. The structural proteins (capsid-E3-E2-6K-E1) are translated from an open reading frame of 3762 nucleotides which is followed by a 3'-untranslated region of approximately 348 nucleotides (for RRV NB5092) or 524 nucleotides (for RRV T48). Excluding deletions and insertions, the genomes of RRV NB5092 and RRV T48 differ at 284 nucleotides, representing a sequence divergence of 2.38%. 
Sequence deletions or insertions were found only in the noncoding regions and include a 173-nucleotide deletion in the 3'-untranslated region of RRV NB5092, compared with RRV T48. In the coding regions, most of the nucleotide differences are silent; there are 36 amino acid differences in the nonstructural proteins and 12 in the structural proteins. The distribution of amino acid differences between the two RRV strains correlates with the location of domains which are poorly conserved in sequence between alphaviruses. The possible role of amino acid differences in envelope glycoproteins E1 and E2 in determining the different antigenic and biological properties of RRV NB5092 and RRV T48 is discussed.
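The segment lengths quoted above can be checked against the stated genome sizes with simple addition. The sketch below does so; the dictionary keys are labels chosen for this illustration, and the 3'-untranslated lengths are the approximate values given in the abstract.

```python
# Segment lengths in nucleotides, as quoted in the abstract. The 3' UTR
# values are described there as approximate. Key names are illustrative.
SEGMENTS = {
    "NB5092": {
        "5_utr": 80,
        "nonstructural_orf": 7440,
        "junction": 44,
        "structural_orf": 3762,
        "3_utr": 348,
    },
    "T48": {
        "5_utr": 80,
        "nonstructural_orf": 7440,
        "junction": 47,
        "structural_orf": 3762,
        "3_utr": 524,
    },
}


def genome_length(strain: str) -> int:
    """Sum the quoted segment lengths for one strain."""
    return sum(SEGMENTS[strain].values())


for strain in SEGMENTS:
    print(strain, genome_length(strain))
# prints "NB5092 11674" then "T48 11853"
```

Both sums reproduce the genome lengths stated in the abstract, and the difference between the strains comes entirely from the two noncoding segments (junction and 3' UTR), consistent with the statement that insertions and deletions occur only in noncoding regions.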
Evolution of the insulin molecule: insights into structure-activity and phylogenetic relationships.
Conlon, J M
2001-07-01
The conformation of insulin in the crystalline state has been known for more than 30 years but there remains uncertainty regarding the biologically active conformation and the structural features that constitute the receptor-binding domain. The primary structure of insulin has been determined for at least 100 vertebrate species. In addition to the invariant cysteines, only ten amino acids (GlyA1, IleA2, ValA3, TyrA19, LeuB6, GlyB8, LeuB11, ValB12, GlyB23 and PheB24) have been fully conserved during vertebrate evolution. This observation supports the hypothesis derived from alanine-scanning mutagenesis studies that five of these invariant residues (IleA2, ValA3, TyrA19, GlyB23, and PheB24) interact directly with the receptor and five additional conserved residues (LeuB6, GlyB8, LeuB11, GluB13 and PheB25) are important in maintaining the receptor-binding conformation. With the exception of the hagfish, only conservative substitutions are found at B13 (Glu --> Asp) and B25 (Phe --> Tyr). In contrast, amino acid residues that were also considered to be important in receptor binding based upon the crystal structure of insulin (GluA4, GlnA5, AsnA21, TyrB16, TyrB26) have been much less well conserved and are probably not components of the receptor-binding domain. The hypothesis that LeuA13 and LeuB17 form part of a second receptor-binding site in the insulin molecule finds some support in terms of their conservation during vertebrate evolution, although the site is probably absent in some hystricomorph insulins. In general, the amino acid sequences of insulins are not useful in cladistic analyses, especially when evolutionarily distant taxa are compared, but among related species in a particular order or family, the presence of unusual structural features in the insulin molecule may permit a meaningful phylogenetic inference. For example, analysis of insulin sequences supports monophyletic status for Dipnoi, Elasmobranchii, Holocephali and Petromyzontiformes.
Protection of CpG islands from DNA methylation is DNA-encoded and evolutionarily conserved.
Long, Hannah K; King, Hamish W; Patient, Roger K; Odom, Duncan T; Klose, Robert J
2016-08-19
DNA methylation is a repressive epigenetic modification that covers vertebrate genomes. Regions known as CpG islands (CGIs), which are refractory to DNA methylation, are often associated with gene promoters and play central roles in gene regulation. Yet how CGIs in their normal genomic context evade the DNA methylation machinery and whether these mechanisms are evolutionarily conserved remains enigmatic. To address these fundamental questions we exploited a transchromosomic animal model and genomic approaches to understand how the hypomethylated state is formed in vivo and to discover whether mechanisms governing CGI formation are evolutionarily conserved. Strikingly, insertion of a human chromosome into mouse revealed that promoter-associated CGIs are refractory to DNA methylation regardless of host species, demonstrating that DNA sequence plays a central role in specifying the hypomethylated state through evolutionarily conserved mechanisms. In contrast, elements distal to gene promoters exhibited more variable methylation between host species, uncovering a widespread dependence on nucleotide frequency and occupancy of DNA-binding transcription factors in shaping the DNA methylation landscape away from gene promoters. This was exemplified by young CpG rich lineage-restricted repeat sequences that evaded DNA methylation in the absence of co-evolved mechanisms targeting methylation to these sequences, and species specific DNA binding events that protected against DNA methylation in CpG poor regions. Finally, transplantation of mouse chromosomal fragments into the evolutionarily distant zebrafish uncovered the existence of a mechanistically conserved and DNA-encoded logic which shapes CGI formation across vertebrate species. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Gemovic, Branislava; Perovic, Vladimir; Glisic, Sanja; Veljkovic, Nevena
2013-01-01
There are more than 500 amino acid substitutions in each human genome, and bioinformatics tools contribute indispensably to determining their functional effects. We have developed a feature-based algorithm for the detection of mutations outside conserved functional domains (CFDs) and compared its classification efficacy with the most commonly used phylogeny-based tools, PolyPhen-2 and SIFT. The new algorithm is based on the informational spectrum method (ISM), a feature-based technique, and statistical analysis. Our dataset contained neutral polymorphisms and mutations associated with myeloid malignancies from the epigenetic regulators ASXL1, DNMT3A, EZH2, and TET2. PolyPhen-2 and SIFT had significantly lower accuracies than expected in predicting the effects of amino acid substitutions outside CFDs, with especially low sensitivity. In contrast, only the ISM algorithm showed statistically significant classification of these sequences. It outperformed PolyPhen-2 and SIFT by 15% and 13%, respectively. These results suggest that feature-based methods, like ISM, are more suitable for the classification of amino acid substitutions outside CFDs than phylogeny-based tools.
Ainsztein, Alexandra M.; Kandels-Lewis, Stefanie E.; Mackay, Alastair M.; Earnshaw, William C.
1998-01-01
The inner centromere protein (INCENP) has a modular organization, with domains required for chromosomal and cytoskeletal functions concentrated near the amino and carboxyl termini, respectively. In this study we have identified an autonomous centromere- and midbody-targeting module in the amino-terminal 68 amino acids of INCENP. Within this module, we have identified two evolutionarily conserved amino acid sequence motifs: a 13–amino acid motif that is required for targeting to centromeres and transfer to the spindle, and an 11–amino acid motif that is required for transfer to the spindle by molecules that have targeted previously to the centromere. To begin to understand the mechanisms of INCENP function in mitosis, we have performed a yeast two-hybrid screen for interacting proteins. These and subsequent in vitro binding experiments identify a physical interaction between INCENP and heterochromatin protein HP1Hsα. Surprisingly, this interaction does not appear to be involved in targeting INCENP to the centromeric heterochromatin, but may instead have a role in its transfer from the chromosomes to the anaphase spindle. PMID:9864353
Provencher, Cathy; LaPointe, Gisèle; Sirois, Stéphane; Van Calsteren, Marie-Rose; Roy, Denis
2003-01-01
A primer design strategy named CODEHOP (consensus-degenerate hybrid oligonucleotide primer) for amplification of distantly related sequences was used to detect the priming glycosyltransferase (GT) gene in strains of the Lactobacillus casei group. Each hybrid primer consisted of a short 3′ degenerate core based on four highly conserved amino acids and a longer 5′ consensus clamp region based on six sequences of the priming GT gene products from exopolysaccharide (EPS)-producing bacteria. The hybrid primers were used to detect the priming GT gene of 44 commercial isolates and reference strains of Lactobacillus rhamnosus, L. casei, Lactobacillus zeae, and Streptococcus thermophilus. The priming GT gene was detected in the genome of both non-EPS-producing (EPS−) and EPS-producing (EPS+) strains of L. rhamnosus. The sequences of the cloned PCR products were similar to those of the priming GT gene of various gram-negative and gram-positive EPS+ bacteria. Specific primers designed from the L. rhamnosus RW-9595M GT gene were used to sequence the end of the priming GT gene in selected EPS+ strains of L. rhamnosus. Phylogenetic analysis revealed that Lactobacillus spp. form a distinctive group apart from other lactic acid bacteria for which GT genes have been characterized to date. Moreover, the sequences show a divergence existing among strains of L. rhamnosus with respect to the terminal region of the priming GT gene. Thus, the PCR approach with consensus-degenerate hybrid primers designed with CODEHOP is a practical approach for the detection of similar genes containing conserved motifs in different bacterial genomes. PMID:12788729
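One reason a degenerate primer core is kept to a handful of amino acids is that the number of encoding nucleotide sequences grows multiplicatively with each residue. The sketch below is only an illustration under stated assumptions: it counts full-codon degeneracy under the standard genetic code, the function name `core_degeneracy` and the two example peptides are hypothetical, and it does not reproduce the CODEHOP program itself.

```python
# Number of codons per amino acid in the standard genetic code (61 sense codons).
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def core_degeneracy(peptide: str) -> int:
    """Count the distinct nucleotide sequences that encode `peptide`.

    Hypothetical helper: every residue is back-translated as a full codon.
    """
    total = 1
    for aa in peptide.upper():
        total *= CODON_COUNTS[aa]
    return total


# Hypothetical four-residue cores, not taken from the study.
print(core_degeneracy("WMDF"))  # 1 * 1 * 2 * 2 = 4
print(core_degeneracy("LSRG"))  # 6 * 6 * 6 * 4 = 864
```

A core built from residues with few codons (Trp, Met, Asp, Phe) stays far less degenerate than one built from Leu, Ser and Arg, which is why conserved positions with low codon redundancy are attractive anchors for the 3′ end.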
Karim, Kazi Muhammad Rezaul; Husaini, Ahmad; Sing, Ngieng Ngui; Sinang, Fazia Mohd; Roslan, Hairul Azman; Hussain, Hasnain
2018-04-01
In this study, an alpha-amylase enzyme from a locally isolated Aspergillus flavus NSH9 was purified and characterized. The extracellular α-amylase was purified by ammonium sulfate precipitation and anion-exchange chromatography at a final yield of 2.55-fold and recovery of 11.73%. The molecular mass of the purified α-amylase was estimated to be 54 kDa using SDS-PAGE and the enzyme exhibited optimal catalytic activity at pH 5.0 and a temperature of 50 °C. The enzyme was also thermally stable at 50 °C, with 87% residual activity after 60 min. As a calcium-containing metalloenzyme, the purified α-amylase showed significantly increased enzyme activity in the presence of Ca2+ ions. Further gene isolation and characterization showed that the α-amylase gene of A. flavus NSH9 contained eight introns and an open reading frame that encodes 499 amino acids, with the first 21 amino acids presumed to be a signal peptide. Analysis of the deduced peptide sequence showed the presence of three conserved catalytic residues of α-amylase, two Ca2+-binding sites, seven conserved peptide sequences, and several other properties that indicate the protein belongs to glycosyl hydrolase family 13 and is capable of acting on α-1,4-bonds only. Based on sequence similarity, the deduced peptide sequence of A. flavus NSH9 α-amylase was also found to carry two potential surface/secondary-binding site (SBS) residues (Trp 237 and Tyr 409) that might play crucial roles in both the enzyme activity and the binding of starch granules.
Pettigrew, Christopher; Wayte, Nicola; Lovelock, Paul K; Tavtigian, Sean V; Chenevix-Trench, Georgia; Spurdle, Amanda B; Brown, Melissa A
2005-01-01
Introduction: Aberrant pre-mRNA splicing can be more detrimental to the function of a gene than changes in the length or nature of the encoded amino acid sequence. Although predicting the effects of changes in consensus 5' and 3' splice sites near intron:exon boundaries is relatively straightforward, predicting the possible effects of changes in exonic splicing enhancers (ESEs) remains a challenge. Methods: As an initial step toward determining which ESEs predicted by the web-based tool ESEfinder in the breast cancer susceptibility gene BRCA1 are likely to be functional, we have determined their evolutionary conservation and compared their location with known BRCA1 sequence variants. Results: Using the default settings of ESEfinder, we initially detected 669 potential ESEs in the coding region of the BRCA1 gene. Increasing the threshold score reduced the total number to 464, while taking into consideration the proximity to splice donor and acceptor sites reduced the number to 211. Approximately 11% of these ESEs (23/211) either are identical at the nucleotide level in human, primates, mouse, cow, dog and opossum Brca1 (conserved) or are detectable by ESEfinder in the same position in the Brca1 sequence (shared). The frequency of conserved and shared predicted ESEs between human and mouse is higher in BRCA1 exons (2.8 per 100 nucleotides) than in introns (0.6 per 100 nucleotides). Of conserved or shared putative ESEs, 61% (14/23) were predicted to be affected by sequence variants reported in the Breast Cancer Information Core database. Applying the filters described above increased the colocalization of predicted ESEs with missense changes, in-frame deletions and unclassified variants predicted to be deleterious to protein function, whereas they decreased the colocalization with known polymorphisms or unclassified variants predicted to be neutral.
Conclusion: In this report we show that evolutionary conservation analysis may be used to improve the specificity of an ESE prediction tool. This is the first report on the prediction of the frequency and distribution of ESEs in the BRCA1 gene, and it is the first reported attempt to predict which ESEs are most likely to be functional and therefore which sequence variants in ESEs are most likely to be pathogenic. PMID:16280041
Pudupakam, Raghavendra Sumanth; Raghunath, Shobana; Pudupakam, Meghanath; Daggupati, Sreenivasulu
2017-03-01
Sequence analysis and phylogenetic studies based on non-structural protein-3 (NS3) gene are important in understanding the evolution and epidemiology of bluetongue virus (BTV). This study was aimed at characterizing the NS3 gene sequence of Indian BTV serotype-2 (BTV2) to elucidate its genetic relationship to global BTV isolates. The NS3 gene of BTV2 was amplified from infected BHK-21 cell cultures, cloned and subjected to sequence analysis. The generated NS3 gene sequence was compared with the corresponding sequences of different BTV serotypes across the world, and a phylogenetic relationship was established. The NS3 gene of BTV2 showed moderate levels of variability in comparison to different BTV serotypes, with nucleotide sequence identities ranging from 81% to 98%. The region showed high sequence homology of 93-99% at amino acid level with various BTV serotypes. The PPXY/PTAP late domain motifs, glycosylation sites, hydrophobic domains, and the amino acid residues critical for virus-host interactions were conserved in NS3 protein. Phylogenetic analysis revealed that BTV isolates segregate into four topotypes and that the Indian BTV2 in subclade IA is closely related to Asian and Australian origin strains. Analysis of the NS3 gene indicated that Indian BTV2 isolate is closely related to strains from Asia and Australia, suggesting a common origin of infection. Although the pattern of evolution of BTV2 isolate is different from other global isolates, the deduced amino acid sequence of NS3 protein demonstrated high molecular stability.
Pudupakam, Raghavendra Sumanth; Raghunath, Shobana; Pudupakam, Meghanath; Daggupati, Sreenivasulu
2017-01-01
Aim: Sequence analysis and phylogenetic studies based on non-structural protein-3 (NS3) gene are important in understanding the evolution and epidemiology of bluetongue virus (BTV). This study was aimed at characterizing the NS3 gene sequence of Indian BTV serotype-2 (BTV2) to elucidate its genetic relationship to global BTV isolates. Materials and Methods: The NS3 gene of BTV2 was amplified from infected BHK-21 cell cultures, cloned and subjected to sequence analysis. The generated NS3 gene sequence was compared with the corresponding sequences of different BTV serotypes across the world, and a phylogenetic relationship was established. Results: The NS3 gene of BTV2 showed moderate levels of variability in comparison to different BTV serotypes, with nucleotide sequence identities ranging from 81% to 98%. The region showed high sequence homology of 93-99% at amino acid level with various BTV serotypes. The PPXY/PTAP late domain motifs, glycosylation sites, hydrophobic domains, and the amino acid residues critical for virus-host interactions were conserved in NS3 protein. Phylogenetic analysis revealed that BTV isolates segregate into four topotypes and that the Indian BTV2 in subclade IA is closely related to Asian and Australian origin strains. Conclusion: Analysis of the NS3 gene indicated that Indian BTV2 isolate is closely related to strains from Asia and Australia, suggesting a common origin of infection. Although the pattern of evolution of BTV2 isolate is different from other global isolates, the deduced amino acid sequence of NS3 protein demonstrated high molecular stability. PMID:28435199
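Identity figures such as the 81% to 98% nucleotide identity quoted above are normally derived from pairwise alignments. A minimal sketch of that calculation follows, assuming the two sequences are already aligned to equal length with `-` as the gap character and that gapped columns are excluded from the denominator (one of several conventions in use). The function name and the toy sequences are hypothetical and are not BTV data.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over ungapped columns of two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical): one mismatch in ten columns.
print(percent_identity("ATGGCTAAGC", "ATGGCAAAGC"))  # 90.0
```

Applying the same function to translated, aligned protein sequences gives the amino acid level figure; synonymous nucleotide changes are why that value is often higher than the nucleotide identity.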
van der Leij, F R; Visser, R G; Ponstein, A S; Jacobsen, E; Feenstra, W J
1991-08-01
The genomic sequence of the potato gene for starch granule-bound starch synthase (GBSS; "waxy protein") has been determined for the wild-type allele of a monoploid genotype from which an amylose-free (amf) mutant was derived, and for the mutant part of the amf allele. Comparison of the wild-type sequence with a cDNA sequence from the literature and a newly isolated cDNA revealed the presence of 13 introns, the first of which is located in the untranslated leader. The promoter contains a G-box-like sequence. The deduced amino acid sequence of the precursor of GBSS shows a high degree of identity with monocot waxy protein sequences in the region corresponding to the mature form of the enzyme. The transit peptide of 77 amino acids, required for routing of the precursor to the plastids, shows much less identity with the transit peptides of the other waxy preproteins, but resembles the hydropathic distributions of these peptides. Alignment of the amino acid sequences of the four mature starch synthases with the Escherichia coli glgA gene product revealed the presence of at least three conserved boxes; there is no homology with previously proposed starch-binding domains of other enzymes involved in starch metabolism. We report the use of chimeric constructs with wild-type and amf sequences to localize, via complementation experiments, the region of the amf allele in which the mutation resides. Direct sequencing of polymerase chain reaction products confirmed that the amf mutation is a deletion of a single AT basepair in the region coding for the transit peptide.(ABSTRACT TRUNCATED AT 250 WORDS)
Sirakova, T D; Markaryan, A; Kolattukudy, P E
1994-01-01
An extracellular elastinolytic metalloproteinase, purified from Aspergillus fumigatus isolated from an aspergillosis patient, and an internal peptide derived from it were subjected to N-terminal sequencing. Oligonucleotide primers based on these sequences were used to PCR amplify a segment of the metalloproteinase cDNA, which was used as a probe to isolate the cDNA and gene for this enzyme. The gene sequence matched exactly with the cDNA sequence except for the four introns that interrupted the open reading frame. According to the deduced amino acid sequence, the metalloproteinase has a signal sequence and 227 additional amino acids preceding the sequence for the mature protein of 389 amino acids with a calculated molecular mass of 42 kDa, which is close to the size of the purified mature fungal proteinase. This sequence contains segments that matched both the N terminus of the mature protein and the internal peptide. A. fumigatus metalloproteinase contains some of the conserved zinc-binding and active-site motifs characteristic of metalloproteinases but shows no overall homology with known metalloproteinases. The cDNA of the mature protein when introduced into Escherichia coli directed the expression of a protein with a size, N-terminal sequence, and immunological cross-reactivity identical to those of the native fungal enzyme. Although the enzyme in the inclusion bodies could not be renatured, expression at 30 °C yielded soluble enzyme that showed chromatographic behavior identical to that of the native fungal enzyme and catalyzed hydrolysis of elastin. The metalloproteinase gene described here was not found in Aspergillus flavus. PMID:7927676
Xue, Yufei; Chen, Baojun; Wang, Rui; Win, Aung Naing; Li, Jiana; Chai, Yourong
2018-02-01
Rapeseed (Brassica napus) is an important oilseed crop worldwide, and fatty acid (FA) compositions determine the nutritional and economic value of its seed oil. Fatty acid desaturases (FADs) play a pivotal role in regulating FA compositions, but to date, no comprehensive genome-wide analysis of the FAD gene family in rapeseed and its parent species has been reported. In this study, using homology searches, 84, 45, and 44 FAD genes were identified in the rapeseed, Brassica rapa, and Brassica oleracea genomes, respectively. These FAD genes were unevenly located in 17 chromosomes and 2 scaffolds of rapeseed, 9 chromosomes and 1 scaffold of B. rapa, and all the chromosomes of B. oleracea. Phylogenetic analysis showed that the soluble and membrane-bound FADs in the three Brassica species were divided into four and six subfamilies, respectively. Generally, the soluble FADs contained two conserved histidine boxes, while the membrane-bound FADs harbored three highly conserved histidine boxes. Exon-intron structure, intron phase, and motif composition and position were highly conserved in each FAD subfamily. Putative subcellular locations of FAD proteins in the three Brassica species were consistent with those of corresponding known FADs. In total, 25 simple sequence repeat (SSR) loci were found in FAD genes of the three Brassica species. Transcripts of selected FAD genes in the three species were examined in various organs/tissues or stress treatments using the NCBI expressed sequence tag (EST) database. This study provides a critical molecular basis for quality improvement of rapeseed oil and facilitates our understanding of key roles of FAD genes in plant growth and development and stress response.
ScaffoldSeq: Software for characterization of directed evolution populations.
Woldring, Daniel R; Holec, Patrick V; Hackel, Benjamin J
2016-07-01
ScaffoldSeq is software designed for the numerous applications, including directed evolution analysis, in which a user generates a population of DNA sequences encoding partially diverse proteins with related functions and would like to characterize the single-site and pairwise amino acid frequencies across the population. A common scenario for enzyme maturation, antibody screening, and alternative scaffold engineering involves naïve and evolved populations that contain diversified regions, varying in both sequence and length, within a conserved framework. Analyzing the diversified regions of such populations is facilitated by high-throughput sequencing platforms; however, length variability within these regions (e.g., antibody CDRs) encumbers the alignment process. To overcome this challenge, the ScaffoldSeq algorithm takes advantage of conserved framework sequences to quickly identify diverse regions. Beyond this, unintended biases in sequence frequency are generated throughout the experimental workflow required to evolve and isolate clones of interest prior to DNA sequencing. ScaffoldSeq software uniquely handles this issue by providing tools to quantify and remove background sequences, cluster similar protein families, and dampen the impact of dominant clones. The software produces graphical and tabular summaries for each region of interest, allowing users to evaluate diversity in a site-specific manner as well as identify epistatic pairwise interactions. The code and detailed information are freely available at http://research.cems.umn.edu/hackel. Proteins 2016; 84:869-874. © 2016 Wiley Periodicals, Inc.
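The single-site and pairwise amino acid frequencies mentioned in this abstract can be illustrated with a short sketch. It is not the ScaffoldSeq implementation: it assumes the diversified region has already been extracted and that all sequences have equal length, so it skips the framework matching, background removal, clustering and clone dampening the abstract describes. The function names and the toy population are hypothetical.

```python
from collections import Counter
from itertools import combinations


def site_frequencies(seqs):
    """Per-position amino acid frequencies for equal-length sequences."""
    n = len(seqs)
    return [
        {aa: count / n for aa, count in Counter(column).items()}
        for column in zip(*seqs)
    ]


def pairwise_frequencies(seqs):
    """Joint frequencies of residue pairs for every pair of positions."""
    n = len(seqs)
    length = len(seqs[0])
    result = {}
    for i, j in combinations(range(length), 2):
        counts = Counter((s[i], s[j]) for s in seqs)
        result[(i, j)] = {pair: c / n for pair, c in counts.items()}
    return result


# Toy population of four-residue diversified regions (hypothetical).
population = ["ACDK", "ACEK", "GCDK", "ACDR"]
single = site_frequencies(population)
pairs = pairwise_frequencies(population)
print(single[0])      # {'A': 0.75, 'G': 0.25}
print(pairs[(0, 2)])  # {('A', 'D'): 0.5, ('A', 'E'): 0.25, ('G', 'D'): 0.25}
```

Comparing a joint frequency with the product of the two single-site frequencies is one simple way to flag position pairs that co-vary more than independence would predict.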
Gonzales, Bianca; Yang, Hushan; Henning, Dale; Valdez, Benigno C
2005-10-10
Treacher Collins syndrome (TCS) is an autosomal dominant disorder of craniofacial development caused by mutations in the TCOF1 gene, which encodes the nucleolar phosphoprotein treacle. We previously reported a function for mammalian treacle in ribosomal DNA gene transcription by its interaction with upstream binding factor. As an initial step in the development of a TCS model in frog, the cDNA that encodes the Xenopus laevis treacle was cloned. Although the derived amino acid sequence shows poor homology with its mammalian orthologues, Xenopus treacle has 11 highly homologous direct repeats near the center of the protein molecule similar to those present in its human, dog and mouse orthologues. Comparison of their amino acid compositions indicates conservation of predominant specific amino acid residues. Antisense-mediated down-regulation of treacle expression in X. laevis oocytes resulted in inhibition of rDNA gene transcription. The results suggest evolutionary conservation of the function of treacle in ribosomal RNA biogenesis in higher eukaryotes.
Zolotarov, Yevgen; Strömvik, Martina
2015-01-01
Plants accumulate dehydrins in response to osmotic stresses. Dehydrins are divided into five different classes, which are thought to be regulated in different manners. To better understand differences in transcriptional regulation of the five dehydrin classes, de novo motif discovery was performed on 350 dehydrin promoter sequences from a total of 51 plant genomes. Overrepresented motifs were identified in the promoters of five dehydrin classes. The Kn dehydrin promoters contain motifs linked with meristem specific expression, as well as motifs linked with cold/dehydration and abscisic acid response. KS dehydrin promoters contain a motif with a GATA core. SKn and YnSKn dehydrin promoters contain motifs that match elements connected with cold/dehydration, abscisic acid and light response. YnKn dehydrin promoters contain motifs that match abscisic acid and light response elements, but not cold/dehydration response elements. Conserved promoter motifs are present in the dehydrin classes and across different plant lineages, indicating that dehydrin gene regulation is likely also conserved.
Conservation of small RNA pathways in platypus
Murchison, Elizabeth P.; Kheradpour, Pouya; Sachidanandam, Ravi; Smith, Carly; Hodges, Emily; Xuan, Zhenyu; Kellis, Manolis; Grützner, Frank; Stark, Alexander; Hannon, Gregory J.
2008-01-01
Small RNA pathways play evolutionarily conserved roles in gene regulation and defense from parasitic nucleic acids. The character and expression patterns of small RNAs show conservation throughout animal lineages, but specific animal clades also show variations on these recurring themes, including species-specific small RNAs. The monotremes, with only platypus and four species of echidna as extant members, represent the basal branch of the mammalian lineage. Here, we examine the small RNA pathways of monotremes by deep sequencing of six platypus and echidna tissues. We find that highly conserved microRNA species display their signature tissue-specific expression patterns. In addition, we find a large rapidly evolving cluster of microRNAs on platypus chromosome X1, which is unique to monotremes. Platypus and echidna testes contain a robust Piwi-interacting (piRNA) system, which appears to be participating in ongoing transposon defense. PMID:18463306
Conservation of small RNA pathways in platypus.
Murchison, Elizabeth P; Kheradpour, Pouya; Sachidanandam, Ravi; Smith, Carly; Hodges, Emily; Xuan, Zhenyu; Kellis, Manolis; Grützner, Frank; Stark, Alexander; Hannon, Gregory J
2008-06-01
Small RNA pathways play evolutionarily conserved roles in gene regulation and defense from parasitic nucleic acids. The character and expression patterns of small RNAs show conservation throughout animal lineages, but specific animal clades also show variations on these recurring themes, including species-specific small RNAs. The monotremes, with only platypus and four species of echidna as extant members, represent the basal branch of the mammalian lineage. Here, we examine the small RNA pathways of monotremes by deep sequencing of six platypus and echidna tissues. We find that highly conserved microRNA species display their signature tissue-specific expression patterns. In addition, we find a large rapidly evolving cluster of microRNAs on platypus chromosome X1, which is unique to monotremes. Platypus and echidna testes contain a robust Piwi-interacting (piRNA) system, which appears to be participating in ongoing transposon defense.
Ficarelli, A; Tassi, F; Restivo, F M
1999-03-01
We have isolated two full length cDNA clones encoding Nicotiana plumbaginifolia NADH-glutamate dehydrogenase. Both clones share amino acid boxes of homology corresponding to conserved GDH catalytic domains and putative mitochondrial targeting sequence. One clone shows a putative EF-hand loop. The level of the two transcripts is affected differently by carbon source.
Modeling of the Ebola Virus Delta Peptide Reveals a Potential Lytic Sequence Motif
Gallaher, William R.; Garry, Robert F.
2015-01-01
Filoviruses, such as Ebola and Marburg viruses, cause severe outbreaks of human infection, including the extensive epidemic of Ebola virus disease (EVD) in West Africa in 2014. In the course of examining mutations in the glycoprotein gene associated with 2014 Ebola virus (EBOV) sequences, a differential level of conservation was noted between the soluble form of glycoprotein (sGP) and the full length glycoprotein (GP), which are both encoded by the GP gene via RNA editing. In the region of the proteins encoded after the RNA editing site sGP was more conserved than the overlapping region of GP when compared to a distant outlier species, Tai Forest ebolavirus. Half of the amino acids comprising the “delta peptide”, a 40 amino acid carboxy-terminal fragment of sGP, were identical between otherwise widely divergent species. A lysine-rich amphipathic peptide motif was noted at the carboxyl terminus of delta peptide with high structural relatedness to the cytolytic peptide of the non-structural protein 4 (NSP4) of rotavirus. EBOV delta peptide is a candidate viroporin, a cationic pore-forming peptide, and may contribute to EBOV pathogenesis. PMID:25609303
Insect sex determination: it all evolves around transformer.
Verhulst, Eveline C; van de Zande, Louis; Beukeboom, Leo W
2010-08-01
Insects exhibit a variety of sex determining mechanisms including male or female heterogamety and haplodiploidy. The primary signal that starts sex determination is processed by a cascade of genes ending with the conserved switch doublesex that controls sexual differentiation. Transformer is the doublesex splicing regulator and has been found in all examined insects, indicating its ancestral function as a sex-determining gene. Despite this conserved function, the variation in transformer nucleotide sequence, amino acid composition and protein structure can accommodate a multitude of upstream sex determining signals. Transformer regulation of doublesex and its taxonomic distribution indicate that the doublesex-transformer axis is conserved among all insects and that transformer is the key gene around which variation in sex determining mechanisms has evolved.
Sequence Alignment to Predict Across Species Susceptibility ...
Conservation of a molecular target across species can be used as a line-of-evidence to predict the likelihood of chemical susceptibility. The web-based Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool was developed to simplify, streamline, and quantitatively assess protein sequence/structural similarity across taxonomic groups as a means to predict relative intrinsic susceptibility. The intent of the tool is to allow for evaluation of any potential protein target, so it is amenable to variable degrees of protein characterization, depending on available information about the chemical/protein interaction and the molecular target itself. To allow for flexibility in the analysis, a layered strategy was adopted for the tool. The first level of the SeqAPASS analysis compares primary amino acid sequences to a query sequence, calculating a metric for sequence similarity (including detection of candidate orthologs), the second level evaluates sequence similarity within selected domains (e.g., ligand-binding domain, DNA binding domain), and the third level of analysis compares individual amino acid residue positions identified as being of importance for protein conformation and/or ligand binding upon chemical perturbation. Each level of the SeqAPASS analysis provides increasing evidence to apply toward rapid, screening-level assessments of probable cross species susceptibility. Such analyses can support prioritization of chemicals for further ev
Bosselut, R; Levin, J; Adjadj, E; Ghysdael, J
1993-11-11
Ets proteins form a family of sequence-specific DNA binding proteins which bind DNA through a conserved 85-amino-acid domain, the Ets domain, whose sequence is unrelated to any other characterized DNA binding domain. Unlike all other known Ets proteins, which bind specific DNA sequences centered over either GGAA or GGAT core motifs, E74 and Elf1 selectively bind to GGAA core-containing sites. Elf1 and E74 differ from other Ets proteins in three residues located in an otherwise highly conserved region of the Ets domain, referred to as conserved region III (CRIII). We show that a restricted selectivity for GGAA core-containing sites could be conferred to Ets1 upon changing a single lysine residue within CRIII to the threonine found in Elf1 and E74 at this position. Conversely, the reciprocal mutation in Elf1 confers to this protein the ability to bind to GGAT core-containing EBS. This, together with the fact that mutation of two invariant arginine residues in CRIII abolishes DNA binding, indicates that CRIII plays a key role in Ets domain recognition of the GGAA/T core motif and leads us to discuss a model of Ets proteins--core motif interaction.
Takeshita, S; Kikuno, R; Tezuka, K; Amann, E
1993-01-01
A cDNA library prepared from the mouse osteoblastic cell line MC3T3-E1 was screened for the presence of specifically expressed genes by employing a combined subtraction hybridization/differential screening approach. A cDNA was identified and sequenced which encodes a protein designated osteoblast-specific factor 2 (OSF-2) comprising 811 amino acids. OSF-2 has a typical signal sequence, followed by a cysteine-rich domain, a fourfold repeated domain and a C-terminal domain. The protein lacks a typical transmembrane region. The fourfold repeated domain of OSF-2 shows homology with the insect protein fasciclin I. RNA analyses revealed that OSF-2 is expressed in bone and to a lesser extent in lung, but not in other tissues. Mouse OSF-2 cDNA was subsequently used as a probe to clone the human counterpart. Mouse and human OSF-2 show a high amino acid sequence conservation except for the signal sequence and two regions in the C-terminal domain in which 'in-frame' insertions or deletions are observed, implying alternative splicing events. On the basis of the amino acid sequence homology with fasciclin I, we suggest that OSF-2 functions as a homophilic adhesion molecule in bone formation. PMID:8363580
Differential signatures of bacterial and mammalian IMP dehydrogenase enzymes.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, R.; Evans, G.; Rotella, F.
1999-06-01
IMP dehydrogenase (IMPDH) is an essential enzyme of de novo guanine nucleotide synthesis. IMPDH inhibitors have clinical utility as antiviral, anticancer or immunosuppressive agents. The essential nature of this enzyme suggests its therapeutic applications may be extended to the development of antimicrobial agents. Bacterial IMPDH enzymes show biochemical and kinetic characteristics that are different than the mammalian IMPDH enzymes, suggesting IMPDH may be an attractive target for the development of antimicrobial agents. We suggest that the biochemical and kinetic differences between bacterial and mammalian enzymes are a consequence of the variance of specific, identifiable amino acid residues. Identification of these residues or combination of residues that impart this mammalian or bacterial enzyme signature is a prerequisite for the rational identification of agents that specifically target the bacterial enzyme. We used sequence alignments of IMPDH proteins to identify sequence signatures associated with bacterial or eukaryotic IMPDH enzymes. These selections were further refined to discern those likely to have a role in catalysis using information derived from the bacterial and mammalian IMPDH crystal structures and site-specific mutagenesis. Candidate bacterial sequence signatures identified by this process include regions involved in subunit interactions, the active site flap and the NAD binding region. Analysis of sequence alignments in these regions indicates a pattern of catalytic residues conserved in all enzymes and a secondary pattern of amino acid conservation associated with the major phylogenetic groups. Elucidation of the basis for this mammalian/bacterial IMPDH signature will provide insight into the catalytic mechanism of this enzyme and the foundation for the development of highly specific inhibitors.
Jiang, W; Gupta, D; Gallagher, D; Davis, S; Bhavanandan, V P
2000-04-01
We previously elucidated five distinct protein domains (I-V) for bovine submaxillary mucin, which is encoded by two genes, BSM1 and BSM2. Using Southern blot analysis, genomic cloning and sequencing of the BSM1 gene, we now show that the central domain (V) consists of approximately 55 tandem repeats of 329 amino acids and that domains III-V are encoded by a 58.4-kb exon, the largest exon known for any gene to date. The BSM1 gene was mapped by fluorescence in situ hybridization to the proximal half of chromosome 5 at bands q2.2-q2.3. The amino-acid sequences of six tandem repeats (two full and four partial) were found to have only 92-94% identity. We propose that the variability in the amino-acid sequences of the mucin tandem repeat is important for generating the combinatorial library of saccharides that are necessary for the protective function of mucins. The deduced peptide sequences of the central domain match those determined from the purified bovine submaxillary mucin and also show 68-94% identity to published peptide sequences of ovine submaxillary mucin. This indicates that the core protein of ovine submaxillary mucin is closely related to that of bovine submaxillary mucin and contains similar tandem repeats in the central domain. In contrast, the central domain of porcine submaxillary mucin is reported to consist of 81-amino-acid tandem repeats. However, both bovine submaxillary mucin and porcine submaxillary mucin contain similar N-terminal and C-terminal domains and the corresponding genes are in the conserved linkage regions of the respective genomes.
Dong, J G; Kim, W T; Yip, W K; Thompson, G A; Li, L; Bennett, A B; Yang, S F
1991-08-01
1-Aminocyclopropane-1-carboxylate (ACC) synthase (EC 4.4.1.14) purified from apple (Malus sylvestris Mill.) fruit was subjected to trypsin digestion. Following separation by reversed-phase high-pressure liquid chromatography, ten tryptic peptides were sequenced. Based on the sequences of three tryptic peptides, three sets of mixed oligonucleotide probes were synthesized and used to screen a plasmid cDNA library prepared from poly(A)(+) RNA of ripe apple fruit. A 1.5-kb (kilobase) cDNA clone which hybridized to all three probes was isolated. The clone contained an open reading frame of 1214 base pairs (bp) encoding a sequence of 404 amino acids. While the polyadenine tail at the 3'-end was intact, it lacked a portion of sequence at the 5'-end. Using the RNA-based polymerase chain reaction, an additional sequence of 148 bp was obtained at the 5'-end. Thus, 1362 bp were sequenced and they encode 454 amino acids. The deduced amino-acid sequence contained peptide sequences corresponding to all ten tryptic fragments, confirming the identity of the cDNA clone. Comparison of the deduced amino-acid sequence of ACC synthase from apple fruit with those from tomato (Lycopersicon esculentum Mill.) and winter squash (Cucurbita maxima Duch.) fruits demonstrated the presence of seven highly conserved regions, including the previously identified region for the active site. The size of the translation product of ACC-synthase mRNA was similar to that of the mature protein on sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE), indicating that apple ACC-synthase undergoes only minor, if any, post-translational proteolytic processing. Analysis of ACC-synthase mRNA by in-vitro translation-immunoprecipitation and by Northern blotting indicates that the ACC-synthase mRNA was undetectable in unripe fruit, but accumulated massively during the ripening process. These data demonstrate that the expression of the ACC-synthase gene is developmentally regulated.
NASA Astrophysics Data System (ADS)
Zhao, Chunling; Ju, Jiyu
2015-06-01
The full-length cDNA of a protease gene from a marine annelid, Arenicola cristata, was amplified through the rapid amplification of cDNA ends technique and sequenced. The cDNA was 936 bp in length, including an open reading frame encoding a polypeptide of 270 amino acid residues. The deduced amino acid sequence consisted of pro- and mature sequences. The protease belonged to the serine protease family because it contained the highly conserved sequence GDSGGP. This protease was novel as it showed a low amino acid sequence similarity (< 40%) to other serine proteases. The gene encoding the active form of A. cristata serine protease was cloned and expressed in E. coli. Purified recombinant protease in a supernatant could dissolve an artificial fibrin plate with plasminogen-rich fibrin, whereas the plasminogen-free fibrin showed no clear zone caused by hydrolysis. This result suggested that the recombinant protease showed an indirect fibrinolytic activity of dissolving fibrin, and was probably a plasminogen activator. A rat model with venous thrombosis was established to demonstrate that the recombinant protease could also hydrolyze blood clots in vivo. Therefore, this recombinant protease may be used as a thrombolytic agent for thrombosis treatment. To our knowledge, this study is the first to report the fibrinolytic serine protease gene in A. cristata.
Bäumlein, H; Wobus, U; Pustell, J; Kafatos, F C
1986-01-01
The field bean, Vicia faba L. var. minor, possesses two sub-families of 11 S legumin genes named A and B. We isolated from a genomic library a B-type gene (LeB4) and determined its primary DNA sequence. Gene LeB4 codes for a 484 amino acid residue prepropolypeptide, encompassing a signal peptide of 22 amino acid residues, an acidic, very hydrophilic alpha-chain of 281 residues and a basic, somewhat hydrophobic beta-chain of 181 residues. The latter two coding regions are immediately contiguous, but each is interrupted by a short intron. Type A legumin genes from soybean and pea are known to have introns in the same two positions, in addition to an extra intron (within the alpha-coding sequence). Sequence comparisons of legumin genes from these three plants revealed a highly conserved sequence element of at least 28 bp, centered at approximately 100 bp upstream of each cap site. The element is absent from the equivalent position of all non-legumin and other plant and fungal genes examined. We tentatively name this element "legumin box" and suggest that it may have a function in the regulation of legumin gene expression. PMID:3960730
Kerovuo, Janne; Lauraeus, Marko; Nurminen, Päivi; Kalkkinen, Nisse; Apajalahti, Juha
1998-01-01
The Bacillus subtilis strain VTT E-68013 was chosen for purification and characterization of its excreted phytase. Purified enzyme had maximal phytase activity at pH 7 and 55°C. Isolated enzyme required calcium for its activity and/or stability and was readily inhibited by EDTA. The enzyme proved to be highly specific since, of the substrates tested, only phytate, ADP, and ATP were hydrolyzed (100, 75, and 50% of the relative activity, respectively). The phytase gene (phyC) was cloned from the B. subtilis VTT E-68013 genomic library. The deduced amino acid sequence (383 residues) showed no homology to the sequences of other phytases nor to those of any known phosphatases. PhyC did not have the conserved RHGXRXP sequence found in the active site of known phytases, and therefore PhyC appears not to be a member of the phytase subfamily of histidine acid phosphatases but a novel enzyme having phytase activity. Due to its pH profile and optimum, it could be an interesting candidate for feed applications. PMID:9603817
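The motif check described in this abstract can be illustrated with a short sketch. It assumes that X in RHGXRXP stands for any residue, and the two toy sequences are hypothetical strings made up for illustration, not real phytase or PhyC sequences.

```python
import re

# RHGXRXP written as a regular expression, assuming X means "any residue".
PHYTASE_MOTIF = re.compile(r"RHG.R.P")


def find_motif(sequence: str) -> list[int]:
    """Return the 1-based start position of every motif match in a protein sequence."""
    return [m.start() + 1 for m in PHYTASE_MOTIF.finditer(sequence.upper())]


# Hypothetical toy sequences (not real proteins).
with_motif = "MKTLLRHGARAPTQ"      # contains RHGARAP starting at residue 6
without_motif = "MKTLLKHGARAPTQ"   # first motif residue changed, so no match

print(find_motif(with_motif))
print(find_motif(without_motif))
```

An empty result for a candidate sequence is the kind of evidence the abstract cites for PhyC lying outside the histidine acid phosphatase group, although a real analysis would also rely on alignment rather than a single pattern.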
A new ALF from Litopenaeus vannamei and its SNPs related to WSSV resistance
NASA Astrophysics Data System (ADS)
Liu, Jingwen; Yu, Yang; Li, Fuhua; Zhang, Xiaojun; Xiang, Jianhai
2014-11-01
Anti-lipopolysaccharide factors (ALFs) are basic components of the crustacean immune system that defend against a range of pathogens. The cDNA sequence of a new ALF, designated nLvALF2, with an open reading frame encoding 132 amino acids was cloned. Its deduced amino acid sequence contained the conserved functional domain of ALFs, the LPS binding domain (LBD). Its genomic sequence consisted of three exons and four introns. nLvALF2 was mainly expressed in the Oka organ and gills of shrimps. The transcriptional level of nLvALF2 increased significantly after white spot syndrome virus (WSSV) infection, suggesting its important roles in protecting shrimps from WSSV. Single nucleotide polymorphisms (SNPs) were found in the genomic sequence of nLvALF2, of which 38 were analyzed for associations with the susceptibility/resistance of shrimps to WSSV. The loci g.2422 A>G, g.2466 T>C, and g.2529 G>A were significantly associated with the resistance to WSSV (P<0.05). These SNP loci could be developed as markers for selection of WSSV-resistant varieties of Litopenaeus vannamei.
Vakili Azghandi, Masoume; Nasiri, Mohammadreza; Shamsa, Ali; Jalali, Mohsen; Shariati, Mohammad Mahdi
2016-04-01
The SRY gene (SRY) provides instructions for making a transcription factor called the sex-determining region Y protein. The sex-determining region Y protein causes a fetus to develop as a male. In this study, SRY sequences from 15 species (human, chimpanzee, dog, pig, rat, cattle, buffalo, goat, sheep, horse, zebra, frog, urial, dolphin and killer whale) were compared to determine their bioinformatic differences. Nucleotide sequences of SRY were retrieved from the NCBI databank. Bioinformatic analysis of SRY was done with CLC Main Workbench version 5.5, ClustalW (http://www.ebi.ac.uk/clustalw/) and MEGA6 software. The multiple sequence alignment results indicated that SRY protein sequences from Orcinus orca (killer whale) and Tursiops aduncus (dolphin) have the lowest genetic distance among these 15 species, 0.33, and are 99.67% identical at the amino acid level. Homo sapiens and Pan troglodytes (chimpanzee) have the next lowest genetic distance, 1.35, and are 98.65% identical at the amino acid level. These findings indicate that the SRY proteins are conserved in the 15 species, and their evolutionary relationships are similar.
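The reported figures pair up as distance = 100 - percent identity (0.33 with 99.67%, 1.35 with 98.65%). The sketch below computes both quantities for two already aligned sequences under that assumption; it is not the procedure used by CLC Main Workbench, ClustalW or MEGA6, whose distance models may differ, and the example strings are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical residues over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must come from the same alignment (equal length)")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # skip columns containing a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


def percent_distance(a: str, b: str) -> float:
    """Distance expressed as 100 minus percent identity (an assumption, see lead-in)."""
    return 100.0 - percent_identity(a, b)


# Hypothetical aligned fragments, not real SRY sequences.
seq_a = "MQDRVKRPMN"
seq_b = "MQDRVKRPLN"
print(percent_identity(seq_a, seq_b), percent_distance(seq_a, seq_b))
```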
Naz, Sadia; Ngo, Tony; Farooq, Umar
2017-01-01
Background The rapid increase in antibiotic resistance by various bacterial pathogens underlies the significance of developing new therapies and exploring different drug targets. A fraction of bacterial pathogens abbreviated as ESKAPE by the European Center for Disease Prevention and Control have been considered a major threat due to the rise in nosocomial infections. Here, we compared putative drug binding pockets of twelve essential and mostly conserved metabolic enzymes in numerous bacterial pathogens including those of the ESKAPE group and Mycobacterium tuberculosis. The comparative analysis will provide guidelines for the likelihood of transferability of the inhibitors from one species to another. Methods Nine bacterial species including six ESKAPE pathogens, Mycobacterium tuberculosis along with Mycobacterium smegmatis and Escherichia coli, two non-pathogenic bacteria, have been selected for drug binding pocket analysis of twelve essential enzymes. The amino acid sequences were obtained from Uniprot, aligned using ICM v3.8-4a and matched against the Pocketome encyclopedia. We used known co-crystal structures of selected target enzyme orthologs to evaluate the location of their active sites and binding pockets and to calculate a matrix of pairwise sequence identities for each target enzyme across the different species. This was used to generate sequence maps. Results High sequence identity of enzyme binding pockets, derived from experimentally determined co-crystallized structures, was observed among various species. Comparison at both full sequence level and for drug binding pockets of key metabolic enzymes showed that binding pockets are highly conserved (sequence similarity up to 100%) among various ESKAPE pathogens as well as Mycobacterium tuberculosis. Enzyme orthologs having conserved binding sites may have the potential to interact with inhibitors in a similar way and might be helpful for the design of a similar class of inhibitors for a particular species. 
The derived pocket alignments and distance-based maps provide guidelines for drug discovery and repurposing. In addition they also provide recommendations for the relevant model bacteria that may be used for initial drug testing. Discussion Comparing ligand binding sites through sequence identity calculation could be an effective approach to identify conserved orthologs as drug binding pockets have shown higher level of conservation among various species. By using this approach we could avoid the problems associated with full sequence comparison. We identified essential metabolic enzymes among ESKAPE pathogens that share high sequence identity in their putative drug binding pockets (up to 100%), of which known inhibitors can potentially antagonize these identical pockets in the various species in a similar manner. PMID:28948099
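The pairwise identity matrix described in the Methods can be sketched as follows. This is only an illustration of the idea, not the ICM or Pocketome workflow: the species labels, aligned strings and pocket column indices are hypothetical, and pocket residues are assumed to be given as 0-based alignment columns.

```python
from itertools import combinations


def column_identity(a: str, b: str, columns: list[int]) -> float:
    """Percent identity between two aligned sequences over the chosen alignment columns."""
    same = sum(1 for i in columns if a[i] == b[i])
    return 100.0 * same / len(columns)


def identity_matrix(aligned: dict[str, str], columns: list[int]) -> dict[str, dict[str, float]]:
    """Symmetric matrix of pairwise percent identities, stored as nested dicts."""
    names = list(aligned)
    matrix = {name: {name: 100.0} for name in names}
    for p, q in combinations(names, 2):
        value = column_identity(aligned[p], aligned[q], columns)
        matrix[p][q] = value
        matrix[q][p] = value
    return matrix


# Hypothetical aligned orthologs; columns 2-4 play the role of the binding pocket.
aligned = {
    "species_A": "MKGDSLTA",
    "species_B": "MRGDSLSA",
    "species_C": "MKGESLTV",
}
pocket = identity_matrix(aligned, [2, 3, 4])
full = identity_matrix(aligned, list(range(8)))
print(pocket["species_A"]["species_B"], full["species_A"]["species_B"])
```

In this toy case species_A and species_B are identical in the pocket columns but only 75% identical over the full length, which mirrors the abstract's point that pockets can be more conserved than whole sequences.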
Naz, Sadia; Ngo, Tony; Farooq, Umar; Abagyan, Ruben
2017-01-01
The rapid increase in antibiotic resistance by various bacterial pathogens underlies the significance of developing new therapies and exploring different drug targets. A fraction of bacterial pathogens abbreviated as ESKAPE by the European Center for Disease Prevention and Control have been considered a major threat due to the rise in nosocomial infections. Here, we compared putative drug binding pockets of twelve essential and mostly conserved metabolic enzymes in numerous bacterial pathogens including those of the ESKAPE group and Mycobacterium tuberculosis. The comparative analysis will provide guidelines for the likelihood of transferability of the inhibitors from one species to another. Nine bacterial species including six ESKAPE pathogens, Mycobacterium tuberculosis along with Mycobacterium smegmatis and Escherichia coli, two non-pathogenic bacteria, have been selected for drug binding pocket analysis of twelve essential enzymes. The amino acid sequences were obtained from Uniprot, aligned using ICM v3.8-4a and matched against the Pocketome encyclopedia. We used known co-crystal structures of selected target enzyme orthologs to evaluate the location of their active sites and binding pockets and to calculate a matrix of pairwise sequence identities for each target enzyme across the different species. This was used to generate sequence maps. High sequence identity of enzyme binding pockets, derived from experimentally determined co-crystallized structures, was observed among various species. Comparison at both full sequence level and for drug binding pockets of key metabolic enzymes showed that binding pockets are highly conserved (sequence similarity up to 100%) among various ESKAPE pathogens as well as Mycobacterium tuberculosis. Enzyme orthologs having conserved binding sites may have the potential to interact with inhibitors in a similar way and might be helpful for the design of a similar class of inhibitors for a particular species. 
The derived pocket alignments and distance-based maps provide guidelines for drug discovery and repurposing. In addition they also provide recommendations for the relevant model bacteria that may be used for initial drug testing. Comparing ligand binding sites through sequence identity calculation could be an effective approach to identify conserved orthologs as drug binding pockets have shown higher level of conservation among various species. By using this approach we could avoid the problems associated with full sequence comparison. We identified essential metabolic enzymes among ESKAPE pathogens that share high sequence identity in their putative drug binding pockets (up to 100%), of which known inhibitors can potentially antagonize these identical pockets in the various species in a similar manner.
Tharia, Hazel A; Shrive, Annette K; Mills, John D; Arme, Chris; Williams, Gwyn T; Greenhough, Trevor J
2002-02-22
The serum amyloid P component (SAP)-like pentraxin Limulus polyphemus SAP is a recently discovered, distinct pentraxin species, of known structure, which does not bind phosphocholine and whose N-terminal sequence has been shown to differ markedly from the highly conserved N terminus of all other known horseshoe crab pentraxins. The complete cDNA sequence of Limulus SAP, and the derived amino acid sequence, the first invertebrate SAP-like pentraxin sequence, have been determined. Two sequences were identified that differed only in the length of the 3' untranslated region. Limulus SAP is synthesised as a precursor protein of 234 amino acid residues, the first 17 residues encoding a signal peptide that is absent from the mature protein. Phylogenetic analysis clusters Limulus SAP pentraxin with the horseshoe crab C-reactive proteins (CRPs) rather than the mammalian SAPs, which are clustered with mammalian CRPs. The deduced amino acid sequence shares 22% identity with both human SAP and CRP, which are 51% identical, and 31-35% with horseshoe crab CRPs. These analyses indicate that gene duplication of CRP (or SAP), followed by sequence divergence and the evolution of CRP and/or SAP function, occurred independently along the chordate and arthropod evolutionary lines rather than in a common ancestor. They further indicate that the CRP/SAP gene duplication event in Limulus occurred before both the emergence of the Limulus CRP variants and the mammalian CRP/SAP gene duplication. Limulus SAP, which does not exhibit the CRP characteristic of calcium-dependent binding to phosphocholine, is established as a pentraxin species distinct from all other known horseshoe crab pentraxins that exist in many variant forms sharing a high level of sequence homology. Copyright 2002 Elsevier Science Ltd.
Applying the Concept of Peptide Uniqueness to Anti-Polio Vaccination.
Kanduc, Darja; Fasano, Candida; Capone, Giovanni; Pesce Delfino, Antonella; Calabrò, Michele; Polimeno, Lorenzo
2015-01-01
Although rare, adverse events may be associated with anti-poliovirus vaccination, possibly hampering global polio eradication. The aim was to design peptide-based anti-polio vaccines exempt from potential cross-reactivity risks and possibly able to reduce rare potential adverse events such as postvaccine paralytic poliomyelitis, which is due to the tendency of the poliovirus genome to mutate. Proteins from poliovirus type 1, strain Mahoney, were analyzed for amino acid sequence identity to the human proteome at the pentapeptide level, searching for sequences that (1) have zero percent identity to human proteins, (2) are potentially endowed with immunologic potential, and (3) are highly conserved among poliovirus strains. Sequence analyses produced a set of consensus epitopic peptides potentially able to generate specific anti-polio immune responses exempt from cross-reactivity with the human host. Peptide sequences unique to poliovirus proteins and conserved among polio strains might help formulate a specific and universal anti-polio vaccine able to react with multiple viral strains and exempt from the burden of possible cross-reactions with human proteins. As an additional advantage, using a peptide-based vaccine instead of current anti-polio DNA vaccines would eliminate the rare post-polio poliomyelitis cases and other disabling symptoms that may appear following vaccination.
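The pentapeptide-level screen can be sketched as a set comparison of overlapping 5-mers. This is a minimal illustration of criteria (1) and (3) only: it assumes exact string matching, the sequences are hypothetical toy strings rather than poliovirus or human proteins, and the immunogenicity criterion (2) is not modeled.

```python
def kmers(sequence: str, k: int = 5) -> set[str]:
    """All overlapping k-residue windows of a protein sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def unique_conserved_peptides(reference: str, strains: list[str],
                              host_proteins: list[str], k: int = 5) -> list[str]:
    """k-mers of the reference that are absent from every host protein
    and present in every listed viral strain, in reference order."""
    host_kmers: set[str] = set()
    for protein in host_proteins:
        host_kmers |= kmers(protein, k)
    strain_kmers = [kmers(s, k) for s in strains]
    result = []
    for i in range(len(reference) - k + 1):
        peptide = reference[i:i + k]
        if peptide in host_kmers:
            continue  # shared with the host: possible cross-reactivity
        if all(peptide in sk for sk in strain_kmers):
            result.append(peptide)
    return result


# Hypothetical toy data.
reference = "MGAQVSSQK"
strains = ["MGAQVSSQK", "MGAQVSSQR"]
host = ["TTGAQVSLL"]
print(unique_conserved_peptides(reference, strains, host))
```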
Xiao, Shijun; Wang, Panpan; Dong, Linsong; Zhang, Yaguang; Han, Zhaofang; Wang, Qiurong
2016-01-01
Whole-genome single-nucleotide polymorphism (SNP) markers are valuable genetic resources for association and conservation studies. Genome-wide SNP development in many teleost species is still challenging because of genome complexity and the cost of re-sequencing. Genotyping-By-Sequencing (GBS) provides an efficient reduced-representation method to cut the cost of SNP detection; however, most recent GBS applications have been reported in plants. In this work, we applied an EcoRI-NlaIII based GBS protocol to the teleost large yellow croaker, an important commercial fish in China and East Asia, and report the first whole-genome SNP development for the species. A total of 69,845 high-quality SNP markers, evenly distributed along the genome, were detected in at least 80% of 500 individuals. Nearly 95% of randomly selected genotypes were successfully validated by Sequenom MassARRAY assay. Association studies with muscle eicosapentaenoic acid (EPA) and docosahexaenoic acid (DHA) content discovered 39 significant SNP markers, contributing up to ∼63% of the genetic variance explained by all markers. Functional genes involved in the fat digestion and absorption pathway were identified, such as APOB, CRAT and OSBPL10. Notably, the PPT2 gene, previously identified in an association study of plasma n-3 and n-6 polyunsaturated fatty acid levels in humans, was re-discovered in large yellow croaker. Our study verified that EcoRI-NlaIII based GBS can produce quality SNP markers in a cost-efficient manner in a teleost genome. The developed SNP markers and the EPA- and DHA-associated SNP loci provide invaluable resources for studies of population structure, conservation genetics and genomic selection in large yellow croaker and other fish. PMID:28028455
NASA Technical Reports Server (NTRS)
Reddy, A. S.; Czernik, A. J.; An, G.; Poovaiah, B. W.
1992-01-01
We cloned and sequenced a plant cDNA that encodes U1 small nuclear ribonucleoprotein (snRNP) 70K protein. The plant U1 snRNP 70K protein cDNA is not full length and lacks the coding region for 68 amino acids in the amino-terminal region as compared to human U1 snRNP 70K protein. Comparison of the deduced amino acid sequence of the plant U1 snRNP 70K protein with the amino acid sequences of animal and yeast U1 snRNP 70K proteins showed a high degree of homology. The plant U1 snRNP 70K protein is more closely related to the human counterpart than to the yeast 70K protein. The carboxy-terminal half is less well conserved but, like the vertebrate 70K proteins, is rich in charged amino acids. Northern analysis with the RNA isolated from different parts of the plant indicates that the snRNP 70K gene is expressed in all of the parts tested. Southern blotting of genomic DNA using the cDNA indicates that the U1 snRNP 70K protein is coded by a single gene.
A new earthworm cellulase and its possible role in the innate immunity.
Park, In Yong; Cha, Ju Roung; Ok, Suk-Mi; Shin, Chuog; Kim, Jin-Se; Kwak, Hee-Jin; Yu, Yun-Sang; Kim, Yu-Kyung; Medina, Brenda; Cho, Sung-Jin; Park, Soon Cheol
2017-02-01
A new endogenous cellulase (Ean-EG) from the earthworm Eisenia andrei and its expression pattern are described. The open reading frame (ORF) of Ean-EG consisted of 1368 bp, corresponding to a deduced polypeptide of 456 amino acid residues that contains the conserved region specific to GHF9, including the amino acid residues essential for enzyme activity. In multiple alignments and phylogenetic analysis, the deduced amino acid sequence of Ean-EG showed the highest sequence similarity (about 79%) to that of an annelid (Pheretima hilgendorfi) and clustered together with other GHF9 cellulases, indicating that Ean-EG can be categorized as a member of GHF9, to which most animal cellulases belong. The histological expression pattern of Ean-EG mRNA, examined by in situ hybridization, revealed that the most distinct expression was in epithelial cells, with positive hybridization signals also in epidermis, chloragogen tissue cells, coelomic cell aggregates, and even blood vessels. This strongly supports the view that, at least in the earthworm Eisenia andrei, cellulase function may not be limited to the digestive process but may extend to innate immunity. Copyright © 2016 Elsevier Ltd. All rights reserved.
Bowie, Michael V.; Reddy, G. Roman; Semu, Shalt M.; Mahan, Suman M.; Barbet, Anthony F.
1999-01-01
Cowdria ruminantium is the etiologic agent of heartwater, a disease causing major economic loss in ruminants in sub-Saharan Africa and the Caribbean. Development of a serodiagnostic test is essential for determining the carrier status of animals from regions where heartwater is endemic, but most available tests give false-positive reactions with sera against related Ehrlichia species. Current approaches rely on molecular methods to define proteins and epitopes that may allow specific diagnosis. Two major antigenic proteins (MAPs), MAP1 and MAP2, have been examined for their use as antigens in the serodiagnosis of heartwater. The objectives of this study were (i) to determine if MAP2 is conserved among five geographically divergent strains of C. ruminantium and (ii) to determine if MAP2 homologs are present in Ehrlichia canis, the causative agent of canine ehrlichiosis, and Ehrlichia chaffeensis, the organism responsible for human monocytic ehrlichiosis. These two agents are closely related to C. ruminantium. The map2 gene from four strains of C. ruminantium was cloned, sequenced, and compared with the previously reported map2 gene from the Crystal Springs strain. Only 10 nucleic acid differences between the strains were identified, and they translate to only 3 amino acid changes, indicating that MAP2 is highly conserved. Genes encoding MAP2 homologs from E. canis and E. chaffeensis also were cloned and sequenced. Amino acid analysis of MAP2 homologs of E. chaffeensis and E. canis with MAP2 of C. ruminantium revealed 83.4 and 84.4% identities, respectively. Further analysis of MAP2 and its homologs revealed that the whole protein lacks specificity for heartwater diagnosis. The development of epitope-specific assays using this sequence information may produce diagnostic tests suitable for C. ruminantium and also other related rickettsiae. PMID:10066656
Zhu, Fuxiang; Sun, Ying; Wang, Yan; Pan, Hongyu; Wang, Fengting; Zhang, Xianghui; Zhang, Yanhua; Liu, Jinliang
2016-06-04
Turnip mosaic virus (TuMV) infects crops of plant species in the family Brassicaceae worldwide. TuMV isolates were clustered to five lineages corresponding to basal-B, basal-BR, Asian-BR, world-B and OMs. Here, we determined the complete genome sequences of three TuMV basal-BR isolates infecting radish from Shandong and Jilin Provinces in China. Their genomes were all composed of 9833 nucleotides, excluding the 3'-terminal poly(A) tail. They contained two open reading frames (ORFs), with the large one encoding a polyprotein of 3164 amino acids and the small overlapping ORF encoding a PIPO protein of 61 amino acids, which contained the typically conserved motifs found in members of the genus Potyvirus. In pairwise comparison with 30 other TuMV genome sequences, these three isolates shared their highest identities with isolates from Eurasian countries (Germany, Italy, Turkey and China). Recombination analysis showed that the three isolates in this study had no "clear" recombination. The analyses of conserved amino acids changed between groups showed that the codons in the TuMV out group (OGp) and OMs group were the same at three codon sites (852, 1006, 1548), and the other TuMV groups (basal-B, basal-BR, Asian-BR, world-B) were different. This pattern suggests that the codon in the OMs progenitor did not change but that in the other TuMV groups the progenitor sequence did change at divergence. Genetic diversity analyses indicate that the PIPO gene was under the highest selection pressure and the selection pressure on P3N-PIPO and P3 was almost the same. It suggests that most of the selection pressure on P3 was probably imposed through P3N-PIPO.
Kobayashi, Hiroshi; Parton, Angela; Czechanski, Anne; Durkin, Christopher; Kong, Chi-Chon; Barnes, David
2008-01-01
The multidrug resistance-associated protein 3 (MRP3/Mrp3) is a member of the ATP-binding cassette (ABC) protein family of membrane transporters and related proteins that act on a variety of xenobiotic and anionic molecules to transfer these substrates in an ATP-dependent manner. In recent years, useful comparative information regarding evolutionarily conserved structure and transport functions of these proteins has accrued through the use of primitive marine animals such as cartilaginous fish. Until recently, one missing tool in comparative studies with cartilaginous fish was cell culture. We have derived from the embryo of Squalus acanthias, the spiny dogfish shark, the S. acanthias embryo (SAE) mesenchymal stem cell line. This is the first continuously proliferating cell line from a cartilaginous fish. We identified expression of Mrp3 in this cell line, cloned the molecule, and examined molecular and cellular physiological aspects of the protein. Shark Mrp3 is characterized by three membrane-spanning domains and two nucleotide-binding domains. Multiple alignments with other species showed that the shark Mrp3 amino acid sequence was well conserved. The shark sequence was overall 64% identical to human MRP3, 72% identical to chicken Mrp3, and 71% identical to frog and stickleback Mrp3. Highest identity between shark and human amino acid sequence (82%) was seen in the carboxyl-terminal nucleotide-binding domain of the proteins. Cell culture experiments showed that mRNA for the protein was induced as much as 25-fold by peptide growth factors, fetal bovine serum, and lipid nutritional components, with the largest effect mediated by a combination of lipids including unsaturated and saturated fatty acids, cholesterol, and vitamin E. PMID:18284333
Kobayashi, Hiroshi; Parton, Angela; Czechanski, Anne; Durkin, Christopher; Kong, Chi-Chon; Barnes, David
2007-01-01
The multidrug resistance-associated protein 3 (MRP3/Mrp3) is a member of the ATP-binding cassette (ABC) protein family of membrane transporters and related proteins that act on a variety of xenobiotic and anionic molecules to transfer these substrates in an ATP-dependent manner. In recent years, useful comparative information regarding evolutionarily conserved structure and transport functions of these proteins has accrued through the use of primitive marine animals such as cartilaginous fish. Until recently, one missing tool in comparative studies with cartilaginous fish was cell culture. We have derived from the embryo of Squalus acanthias, the spiny dogfish shark, the S. acanthias embryo (SAE) mesenchymal stem cell line. This is the first continuously proliferating cell line from a cartilaginous fish. We identified expression of Mrp3 in this cell line, cloned the molecule, and examined molecular and cellular physiological aspects of the protein. Shark Mrp3 is characterized by three membrane-spanning domains and two nucleotide-binding domains. Multiple alignments with other species showed that the shark Mrp3 amino acid sequence was well conserved. The shark sequence was overall 64% identical to human MRP3, 72% identical to chicken Mrp3, and 71% identical to frog and stickleback Mrp3. Highest identity between shark and human amino acid sequence (82%) was seen in the carboxyl-terminal nucleotide-binding domain of the proteins. Cell culture experiments showed that mRNA for the protein was induced as much as 25-fold by peptide growth factors, fetal bovine serum, and lipid nutritional components, with the largest effect mediated by a combination of lipids including unsaturated and saturated fatty acids, cholesterol, and vitamin E.
Crystal structure of AFV3-109, a highly conserved protein from crenarchaeal viruses
Keller, Jenny; Leulliot, Nicolas; Cambillau, Christian; Campanacci, Valérie; Porciero, Stéphanie; Prangishvili, David; Forterre, Patrick; Cortez, Diego; Quevillon-Cheruel, Sophie; van Tilbeurgh, Herman
2007-01-01
The extraordinary morphologies of viruses infecting hyperthermophilic archaea clearly distinguish them from bacterial and eukaryotic viruses. Moreover, their genomes code for proteins that to a large extent have no related sequences in the extant databases. However, a small pool of genes is shared by overlapping subsets of these viruses, and the most conserved gene, exemplified by ORF109 of the Acidianus Filamentous Virus 3 (AFV3), is present in the genomes of members of three viral families, the Lipothrixviridae, Rudiviridae, and "Bicaudaviridae", as well as in that of the unclassified Sulfolobus Turreted Icosahedral Virus (STIV). We present here the crystal structure of the protein (Mr = 13.1 kDa, 109 residues) encoded by AFV3 ORF109 in two different crystal forms at 1.5 and 1.3 Å resolution. The structure of AFV3-109 is a five-stranded β-sheet with loops on one side and three helices on the other. It forms a dimer adopting the shape of a cradle that encompasses the best conserved regions of the sequence. No protein with a related fold could be identified except for the ortholog from STIV1, whose structure was deposited at the Protein Data Bank. We could clearly identify a well-bound glycerol inside the cradle, contacting exclusively totally conserved residues. This interaction was confirmed in solution by fluorescence titration. Although the function of AFV3-109 cannot be deduced directly from its structure, structural homology with the STIV1 protein, and the size and charge distribution of the cavity, suggested it could interact with nucleic acids. Fluorescence quenching titrations also showed that AFV3-109 interacts with dsDNA. Genomic sequence analysis revealed bacterial homologs of AFV3-109 as part of putative, previously unidentified prophage sequences in some Firmicutes. PMID:17241456
Amino acid and structural variability of Yersinia pestis LcrV protein
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anisimov, A P; Dentovskaya, S V; Panfertsev, E A
2009-11-09
The LcrV protein is a multifunctional virulence factor and protective antigen of the plague bacterium which is generally conserved between the epidemic strains of Yersinia pestis. The authors investigated the diversity in the LcrV sequences among non-epidemic Y. pestis strains which have a limited virulence in selected animal models and for humans. Sequencing of lcrV genes from ten Y. pestis strains belonging to different phylogenetic groups (subspecies) showed that the LcrV proteins possess four major variable hotspots at positions 18, 72, 273, and 324-326. These major variations, together with other minor substitutions in amino acid sequences, allowed them to classify the LcrV alleles into five sequence types (A-E). They observed that the strains of different Y. pestis subspecies can have the same type of LcrV, and different types of LcrV can exist within the same natural plague focus. The LcrV polymorphisms were structurally analyzed by comparing the modeled structures of LcrV from all available strains. All changes except one occurred either in flexible regions or on the surface of the protein, but local chemical properties (i.e. those of a hydrophobic, hydrophilic, amphipathic, or charged nature) were conserved across all of the strains. Polymorphisms in flexible and surface regions are likely subject to less selective pressure, and have a limited impact on the structure. In contrast, the substitution of tryptophan at position 113 with either glutamic acid or glycine likely has a serious influence on the regional structure of the protein, and these mutations might have an effect on the function of LcrV. The polymorphisms at positions 18, 72 and 273 accounted for differences in oligomerization of LcrV. The importance of the latter property in the emergence of epidemic strains of Y. pestis during evolution of this pathogen will need to be further investigated.
Crystal structure of the Msx-1 homeodomain/DNA complex.
Hovde, S; Abate-Shen, C; Geiger, J H
2001-10-09
The Msx-1 homeodomain protein plays a crucial role in craniofacial, limb, and nervous system development. Homeodomain DNA-binding domains are composed of 60 amino acids that show a high degree of evolutionary conservation. We have determined the structure of the Msx-1 homeodomain complexed to DNA at 2.2 Å resolution. The structure has an unusually well-ordered N-terminal arm with a unique trajectory across the minor groove of the DNA. DNA specificity conferred by bases flanking the core TAAT sequence is explained by well-ordered water-mediated interactions at Q50. Most interactions seen at the TAAT sequence are typical of the interactions seen in other homeodomain structures. Comparison of the Msx-1-HD structure to all other high-resolution HD-DNA complex structures indicates a remarkably well-conserved sphere of hydration between the DNA and protein in these complexes.
Fiermonte, G; Runswick, M J; Walker, J E; Palmieri, F
1992-01-01
A human cDNA has been isolated previously from a thyroid library with the aid of serum from a patient with Graves' disease. It encodes a protein belonging to the mitochondrial metabolite carrier family, referred to as the Graves' disease carrier protein (GDC). Using primers based on this sequence, overlapping cDNAs encoding the bovine homologue of the GDC have been isolated from total bovine heart poly(A)+ cDNA. The bovine protein is 18 amino acids shorter than the published human sequence, but if a frame shift requiring the removal of one nucleotide is introduced into the human cDNA sequence, the human and bovine proteins become identical in their C-terminal regions, and 308 out of 330 amino acids are conserved over their entire sequences. The bovine cDNA has been used to investigate the expression of the GDC in various bovine tissues. In the tissues that were examined, the GDC is most strongly expressed in the thyroid, but substantial amounts of its mRNA were also detected in liver, lung and kidney, and lesser amounts in heart and skeletal muscle.
Human endomembrane H+ pump strongly resembles the ATP-synthetase of Archaebacteria.
Südhof, T C; Fried, V A; Stone, D K; Johnston, P A; Xie, X S
1989-01-01
Preparations of mammalian H+ pumps that acidify intracellular vesicles contain eight or nine polypeptides, ranging in size from 116 to 17 kDa. Biochemical analysis indicates that the 70- and 58-kDa polypeptides are subunits critical for ATP hydrolysis. The amino acid sequences of the major catalytic subunits (58 and 70 kDa) of the endomembrane H+ pump are unknown from animal cells. We report here the complete sequence of the 58-kDa subunit derived from a human kidney cDNA clone and partial sequences of the 70- and 58-kDa subunits purified from clathrin-coated vesicles of bovine brain. The amino acid sequences of both proteins strongly resemble the sequences of the corresponding subunits of the vacuolar H+ pumps of Archaebacteria, plants, and fungi. The archaebacterial enzyme is believed to use a H+ gradient to synthesize ATP. Thus, a common ancestral protein has given rise to a H+ pump that synthesizes ATP in one organism and hydrolyzes it in another and is highly conserved from prokaryotes to humans. The same pump appears to mediate the acidification of intracellular organelles, including coated vesicles, lysosomes, and secretory granules, as well as extracellular fluids such as urine. PMID:2527371
Delgado-Gaytán, María F; Rosas-Rodríguez, Jesús A; Yepiz-Plascencia, Gloria; Figueroa-Soto, Ciria G; Valenzuela-Soto, Elisa M
2017-10-01
The enzyme betaine aldehyde dehydrogenase (BADH) catalyzes the irreversible oxidation of betaine aldehyde to glycine betaine (GB), a very efficient osmolyte accumulated during osmotic stress. In this study, we determined the nucleotide sequence of the cDNA for the BADH from the white shrimp Litopenaeus vannamei (LvBADH). The cDNA was 1882 bp long, with a complete open reading frame of 1524 bp, encoding 507 amino acids with a predicted molecular mass of 54.15 kDa and a pI of 5.4. The predicted LvBADH amino acid sequence shares a high degree of identity with marine invertebrate BADHs. Catalytic residues (C-298, E-264 and N-167) and the decapeptide VTLELGGKSP, which is involved in nucleotide binding and highly conserved in BADHs, were identified in the amino acid sequence. Phylogenetic analyses classified LvBADH in a clade that includes ALDH9 sequences from marine invertebrates. Molecular modeling of LvBADH revealed that the protein has amino acid residues and sequence motifs essential for the function of the ALDH9 family of enzymes. LvBADH modeling showed three potential monovalent cation binding sites: one located in an intra-subunit cavity, another in an inter-subunit cavity, and a third in a central cavity of the protein. The results show that LvBADH shares a high degree of identity with BADH sequences from marine invertebrates and enzymes that belong to the ALDH9 family. Our findings suggest that LvBADH has molecular mechanisms of regulation similar to those of other BADHs belonging to the ALDH9 family, and that BADH might play a role in the osmoregulation capacity of L. vannamei. Copyright © 2017 Elsevier B.V. All rights reserved.
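The ORF and protein lengths reported in this abstract can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the quoted 1524 bp open reading frame includes the stop codon; the function name is hypothetical and the mean residue mass is only a rough plausibility check, not a figure from the paper.

```python
def residues_from_orf(orf_bp: int) -> int:
    """Amino acids encoded by a complete ORF whose length includes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("a complete ORF must be a multiple of 3 bp")
    codons = orf_bp // 3
    return codons - 1  # the stop codon is not translated


# Values quoted in the abstract: 1524 bp ORF, 54.15 kDa predicted mass.
lvbadh_residues = residues_from_orf(1524)

# Rough plausibility check (assumption: mass in Da divided by residue count).
mean_residue_mass_da = 54.15 * 1000 / lvbadh_residues
```

Under that assumption the 1524 bp ORF gives 507 residues, matching the abstract, and the mean residue mass comes out a little under 107 Da, which is in the usual range for proteins.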
Cloning and sequencing of the cDNA species for mammalian dimeric dihydrodiol dehydrogenases.
Arimitsu, E; Aoki, S; Ishikura, S; Nakanishi, K; Matsuura, K; Hara, A
1999-01-01
Cynomolgus and Japanese monkey kidneys, dog and pig livers and rabbit lens contain dimeric dihydrodiol dehydrogenase (EC 1.3.1.20) associated with high carbonyl reductase activity. Here we have isolated cDNA species for the dimeric enzymes by reverse transcriptase-PCR from human intestine in addition to the above five animal tissues. The amino acid sequences deduced from the monkey, pig and dog cDNA species perfectly matched the partial sequences of peptides digested from the respective enzymes of these animal tissues, and active recombinant proteins were expressed in a bacterial system from the monkey and human cDNA species. Northern blot analysis revealed the existence of a single 1.3 kb mRNA species for the enzyme in these animal tissues. The human enzyme shared 94%, 85%, 84% and 82% amino acid identity with the enzymes of the two monkey strains (their sequences were identical), the dog, the pig and the rabbit respectively. The sequences of the primate enzymes consisted of 335 amino acid residues and lacked one amino acid compared with the other animal enzymes. In contrast with previous reports that other types of dihydrodiol dehydrogenase, carbonyl reductases and enzymes with either activity belong to the aldo-keto reductase family or the short-chain dehydrogenase/reductase family, dimeric dihydrodiol dehydrogenase showed no sequence similarity with the members of the two protein families. The dimeric enzyme aligned with low degrees of identity (14-25%) with several prokaryotic proteins, in which 47 residues are strictly or highly conserved. Thus dimeric dihydrodiol dehydrogenase has a primary structure distinct from the previously known mammalian enzymes and is suggested to constitute a novel protein family with the prokaryotic proteins. PMID:10477285
The CD8α gene in duck (Anatidae): cloning, characterization, and expression during viral infection.
Xu, Qi; Chen, Yang; Zhao, Wen Ming; Huang, Zheng Yang; Duan, Xiu Jun; Tong, Yi Yu; Zhang, Yang; Li, Xiu; Chang, Guo Bin; Chen, Guo Hong
2015-02-01
Cluster of differentiation 8 alpha (CD8α) is critical for cell-mediated immune defense and T-cell development. Although CD8α sequences have been reported for several species, very little is known about CD8α in ducks. To elucidate the mechanisms involved in the innate and adaptive immune responses of ducks, we cloned CD8α coding sequences from domestic, Muscovy, Mallard, and Spotbill ducks using reverse transcription polymerase chain reaction (RT-PCR). Each sequence consisted of 714 nucleotides and encoded a signal peptide, an IgV-like domain, a stalk region, a transmembrane region, and a cytoplasmic tail. We identified 58 nucleotide differences and 37 amino acid differences among the four types of duck; of these, 53 nucleotide and 33 amino acid differences were between Muscovy ducks and the other duck species. The CD8α cDNA sequence from domestic duck consisted of a 61-nucleotide 5' untranslated region (UTR), a 714-nucleotide open reading frame, and an 849-nucleotide 3' UTR. Multiple sequence alignments showed that the amino acid sequence of CD8α is conserved in vertebrates. RT-PCR revealed that expression of CD8α mRNA of domestic ducks was highest in the thymus and very low in the kidney, cerebrum, cerebellum, and muscle. Immunohistochemical analyses detected CD8α on the splenic corpuscle and periarterial lymphatic sheath of the spleen. CD8α mRNA in domestic ducklings was initially up-regulated, and then down-regulated, in the thymus, spleen, and liver after treatment with duck hepatitis virus type I (DHV-1) or the immunostimulant polyriboinosinic polyribocytidylic acid (poly I:C).
Pasricha, Gunisha; Mishra, Akhilesh C; Chakrabarti, Alok K
2013-07-01
PB1F2 is the 11th protein of influenza A virus, translated from the +1 alternate reading frame of the PB1 gene. Since its discovery, varying sizes and functions of the PB1F2 protein of influenza A viruses have been reported. Selection of the PB1 gene segment in the pandemics, and the variable size and pleiotropic effect of PB1F2, intrigued us to analyze amino acid sequences of this protein in various influenza A viruses. Amino acid sequences for the PB1F2 protein of influenza A H5N1, H1N1, H2N2, and H3N2 subtypes were obtained from the Influenza Research Database. Multiple sequence alignments of the PB1F2 protein sequences of the aforementioned subtypes were used to determine the size and the variable and conserved domains, and to perform mutational analysis. Analysis showed that 96·4% of the H5N1 influenza viruses harbored full-length PB1F2 protein. Except for the 2009 pandemic H1N1 virus, all the subtypes of the 20th-century pandemic influenza viruses contained full-length PB1F2 protein. Through the years, the PB1F2 protein of the H1N1 and H3N2 viruses has undergone much variation. PB1F2 protein sequences of H5N1 viruses showed both human- and avian host-specific conserved domains. The global database of PB1F2 protein revealed that the N66S mutation was present in only 3·8% of the H5N1 strains. We found a novel mutation, N84S, in the PB1F2 protein of 9·35% of the highly pathogenic avian influenza H5N1 viruses. Varying sizes and mutations of the PB1F2 protein in different influenza A virus subtypes with pandemic potential were obtained. There was genetic divergence of the protein in various hosts, which highlighted the host-specific evolution of the virus. However, studies are required to correlate this sequence variability with virulence and pathogenicity. © 2012 John Wiley & Sons Ltd.
Characterization of cDNAs and genomic DNAs for human threonyl- and cysteinyl-tRNA synthetases
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cruzen, M.E.
1993-01-01
Techniques of molecular biology were used to clone, sequence and map two human aminoacyl-tRNA synthetase (aaRS) cDNAs: threonyl-tRNA synthetase (ThrRS), a class II enzyme, and cysteinyl-tRNA synthetase (CysRS), a class I enzyme. The predicted protein sequence of human ThrRS is highly homologous to that of lower eukaryotic and prokaryotic ThrRSs, particularly in the regions containing the three structural motifs common to all class II synthetases. Signature regions 1 and 2, which characterize the class IIa subgroup (SerRS, ThrRS and HisRS), are highly conserved from bacteria to human. Structural predictions for human ThrRS based on the known structure of the closely related SerRS from E. coli implicate strongly conserved residues in the signature sequences as important in substrate binding. The amino-terminal 100 residues of the deduced amino acid sequence of ThrRS share structural similarity with SerRS, consistent with forming an antiparallel helix implicated in tRNA binding. The 5' untranslated sequence of the human ThrRS gene shares short stretches of common sequence with the gene for hamster HisRS, including a binding site for the promoter-specific transcription factor sp-1. The deduced amino acid sequence of human CysRS has a high degree of sequence identity to E. coli CysRS. Human CysRS possesses the classic characteristics of a class I synthetase and is most closely related to the MetRS subgroup. The amino-terminal half of human CysRS can be modeled as a nucleotide binding fold and shares significant sequence and structural similarity with the other enzymes in this subgroup. The CysRS structural gene (CARS) was mapped to human chromosome 11p15.5 by fluorescent in situ hybridization. CARS is the first aaRS gene to be mapped to chromosome 11. The steady-state levels of both CysRS and ThrRS mRNA were quantitated in several human tissues. Message levels for these enzymes appear to be subjected to differential regulation in different cell types.
FoxP2 in song-learning birds and vocal-learning mammals.
Webb, D M; Zhang, J
2005-01-01
FoxP2 is the first identified gene that is specifically involved in speech and language development in humans. Population genetic studies of FoxP2 revealed a selective sweep in recent human history associated with two amino acid substitutions in exon 7. Avian song learning and human language acquisition share many behavioral and neurological similarities. To determine whether FoxP2 plays a similar role in song-learning birds, we sequenced exon 7 of FoxP2 in multiple song-learning and nonlearning birds. We show extreme conservation of FoxP2 sequences in birds, including unusually low rates of synonymous substitutions. However, no amino acid substitutions are shared between the song-learning birds and humans. Furthermore, sequences from vocal-learning whales, dolphins, and bats do not share the human-unique substitutions. While FoxP2 appears to be under strong functional constraints in mammals and birds, we find no evidence for its role during the evolution of vocal learning in nonhuman animals as in humans.
Tao, Yaqiong; Zeng, Bo; Xu, Liu; Yue, Bisong; Yang, Dong; Zou, Fangdong
2010-01-01
Interferon-gamma (IFN-gamma) is the only member of type II IFN and is vital in the regulation of immune and inflammatory responses. Herein we report the cloning, expression, and sequence analysis of IFN-gamma from the giant panda (Ailuropoda melanoleuca). The open reading frame of this gene is 501 base pairs in length and encodes a polypeptide consisting of 166 amino acids. All conserved N-linked glycosylation sites and cysteine residues among carnivores were found in the predicted amino acid sequence of the giant panda. Recombinant giant panda IFN-gamma with a V5 epitope and polyhistidine tag was expressed in HEK293 host cells and confirmed by Western blotting. Phylogenetic analysis of mammalian IFN-gamma-coding sequences indicated that the giant panda IFN-gamma was closest to that of carnivores, then to ungulates and dolphin, and shared a distant relationship with mouse and human. These results represent a first step into the study of IFN-gamma in giant panda.
Allen, Margaret L.; Mertens, Jeffrey A.
2008-01-01
Three unique cDNAs encoding putative polygalacturonase enzymes were isolated from the tarnished plant bug, Lygus lineolaris (Palisot de Beauvois) (Hemiptera: Miridae). The three nucleotide sequences were dissimilar to one another, but the deduced amino acid sequences were similar to each other and to other polygalacturonases from insects, fungi, plants, and bacteria. Four conserved segments characteristic of polygalacturonases were present, but with some notable semiconservative substitutions. Two of four expected disulfide bridge-forming cysteine pairs were present. All three inferred protein translations included predicted signal sequences of 17 to 20 amino acids. Amplification of genomic DNA identified an intron in one of the genes, Llpg1, in the 5′ untranslated region. Semiquantitative RT-PCR revealed expression in all stages of the insect except the eggs. Expression in adults, male and female, was highly variable, indicating a family of highly inducible and diverse enzymes adapted to the generalist polyphagous nature of this important pest. PMID:20233096
Buchko, Garry W.; Berg, Howard R.; Kaur, Jagdeep; Pandurangi, Raghu S.; Smith, Thomas J.; Shah, Dilip M.
2013-01-01
MtDef4 is a 47-amino acid, cysteine-rich, evolutionarily conserved defensin from the model legume Medicago truncatula. It is an apoplast-localized plant defense protein that inhibits the growth of the ascomycetous fungal pathogen Fusarium graminearum in vitro at micromolar concentrations. Little is known about the mechanisms by which MtDef4 mediates its antifungal activity. In this study, we show that MtDef4 rapidly permeabilizes the fungal plasma membrane and is internalized by the fungal cells, where it accumulates in the cytoplasm. Furthermore, analysis of the structure of MtDef4 reveals the presence of a positively charged γ-core motif composed of β2 and β3 strands connected by a positively charged RGFRRR loop. Replacement of the RGFRRR sequence with AAAARR or RGFRAA abolishes the ability of MtDef4 to enter fungal cells, suggesting that the RGFRRR loop is a translocation signal required for the internalization of the protein. MtDef4 binds to phosphatidic acid (PA), a precursor for the biosynthesis of membrane phospholipids and a signaling lipid known to recruit cytosolic proteins to membranes. Amino acid substitutions in the RGFRRR sequence which abolish the ability of MtDef4 to enter fungal cells also impair its ability to bind PA. These findings suggest that MtDef4 is a novel antifungal plant defensin capable of entering into fungal cells and affecting intracellular targets and that these processes are mediated by the highly conserved cationic RGFRRR loop via its interaction with PA. PMID:24324798
Xu, Songtao; Zhang, Yan; Zhu, Zhen; Liu, Chunyu; Mao, Naiying; Ji, Yixin; Wang, Huiling; Jiang, Xiaohong; Li, Chongshan; Tang, Wei; Feng, Daxing; Wang, Changyin; Zheng, Lei; Lei, Yue; Ling, Hua; Zhao, Chunfang; Ma, Yan; He, Jilan; Wang, Yan; Li, Ping; Guan, Ronghui; Zhou, Shujie; Zhou, Jianhui; Wang, Shuang; Zhang, Hong; Zheng, Huanying; Liu, Leng; Ma, Hemuti; Guan, Jing; Lu, Peishan; Feng, Yan; Zhang, Yanjun; Zhou, Shunde; Xiong, Ying; Ba, Zhuoma; Chen, Hui; Yang, Xiuhui; Bo, Fang; Ma, Yujie; Liang, Yong; Lei, Yake; Gu, Suyi; Liu, Wei; Chen, Meng; Featherstone, David; Jee, Youngmee; Bellini, William J; Rota, Paul A; Xu, Wenbo
2013-01-01
China experienced several large measles outbreaks in the past two decades, and a series of enhanced control measures were implemented to achieve the goal of measles elimination. Molecular epidemiologic surveillance of wild-type measles viruses (MeV) provides valuable information about the viral transmission patterns. Since 1993, virologic surveillance has confirmed that viruses of a single endemic genotype, H1, have been predominantly circulating in China. A component of molecular surveillance is to monitor the genetic characteristics of the hemagglutinin (H) gene of MeV, the major target for virus-neutralizing antibodies. Analysis of the sequences of the complete H gene from 56 representative wild-type MeV strains circulating in China during 1993-2009 showed that the H gene sequences were clustered into 2 groups, cluster 1 and cluster 2. Cluster 1 strains were the most frequently detected and had a widespread distribution in China after 2000. The predicted amino acid sequences of the H protein were relatively conserved at most of the functionally significant amino acid positions. However, most of the genotype H1 cluster 1 viruses had an amino acid substitution (Ser240Asn), which removed a predicted N-linked glycosylation site. In addition, the substitution Pro397Leu in the hemagglutinin noose epitope (HNE) was identified in 23 of 56 strains. The evolutionary rate of the H gene of the genotype H1 viruses was estimated to be approximately 0.76 × 10^-3 substitutions per site per year, and the ratio of dN to dS (dN/dS) was <1, indicating the absence of selective pressure. Although H genes of the genotype H1 strains were conserved and not subjected to selective pressure, several amino acid substitutions were observed in functionally important positions. Therefore the antigenic and genetic properties of H genes of wild-type MeVs should be monitored as part of routine molecular surveillance for measles in China.
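To give a sense of scale for the evolutionary rate quoted in this abstract, the sketch below multiplies the reported rate by the length of the sampling window. It assumes a constant rate along a single lineage (a strict molecular clock), which is a simplification introduced here and not a claim from the study; the function name is hypothetical.

```python
# Rate quoted in the abstract: substitutions per site per year.
RATE_PER_SITE_PER_YEAR = 0.76e-3


def expected_substitutions_per_site(rate: float, years: float) -> float:
    """Expected substitutions per site along one lineage at a constant rate."""
    return rate * years


# Sampling window quoted in the abstract: 1993 to 2009.
span_years = 2009 - 1993
per_site = expected_substitutions_per_site(RATE_PER_SITE_PER_YEAR, span_years)
percent_divergence = 100 * per_site
```

Over the 16-year window this gives roughly 0.012 substitutions per site, or about 1.2% along one lineage. Two lineages diverging from a common ancestor over the same period would be expected to differ by about twice that amount under the same assumption.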
Porcine MYF6 gene: sequence, homology analysis, and variation in the promoter region.
Wyszyńska-Koko, J; Kurył, J
2004-01-01
The MYF6 gene codes for a bHLH transcription factor belonging to the MyoD family. Its expression accompanies the processes of differentiation and maturation of myotubes during embryogenesis and continues at a relatively high level after birth, affecting the muscle phenotype. The porcine MYF6 gene was amplified, sequenced, and compared with MYF6 gene sequences of other species. The amino acid sequence was deduced and an interspecies homology analysis was performed. The Myf-6 protein shows high conservation among species, with 99 and 97% identity when comparing pig with cow and human, respectively, and 93% when comparing pig with mouse and rat. A single nucleotide polymorphism (SNP) was revealed within the promoter region, which appeared to be a T → C transition recognized by the MspI restriction enzyme.
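Identity percentages such as those quoted for Myf-6 are typically computed from a pairwise alignment. The sketch below shows one common convention, counting identical residues over aligned columns that contain no gap; the abstract does not state which convention was used, and the two sequences are made-up toy strings, not Myf-6 data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gap columns (one of several possible conventions)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 7 ungapped columns, 6 of them identical.
toy_identity = percent_identity("MKLV-TRA", "MKIVATRA")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so figures from different studies are only comparable when the denominator is known.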
Characterization of Clostridium perfringens iota-toxin genes and expression in Escherichia coli.
Perelle, S; Gibert, M; Boquet, P; Popoff, M R
1993-12-01
The iota toxin, which is produced by Clostridium perfringens type E, is a binary toxin consisting of two independent polypeptides: Ia, which is an ADP-ribosyltransferase, and Ib, which is involved in the binding and internalization of the toxin into the cell. Two degenerate oligonucleotide probes deduced from the partial amino acid sequence of each component of C. spiroforme toxin, which is closely related to the iota toxin, were used to clone three overlapping DNA fragments containing the iota-toxin genes from C. perfringens type E plasmid DNA. Two genes, in the same orientation, coding for Ia (387 amino acids) and Ib (875 amino acids) and separated by 243 noncoding nucleotides were identified. A predicted signal peptide was found for each component, and the secreted Ib displays two domains, the propeptide (172 amino acids) and the mature protein (664 amino acids). The Ia gene has been expressed in Escherichia coli and C. perfringens, under the control of its own promoter. The recombinant polypeptide obtained was recognized by Ia antibodies and ADP-ribosylated actin. The expression of the Ib gene was obtained in E. coli harboring a recombinant plasmid encompassing the putative promoter upstream of the Ia gene and the Ia and Ib genes. Two residues which have been found to be involved in the NAD+ binding site of diphtheria and Pseudomonas toxins are conserved in the predicted Ia sequence (Glu-14 and Trp-19). The predicted Ib amino acid sequence shows 33.9% identity with and 54.4% similarity to the protective antigen of the anthrax toxin complex. In particular, the central region of Ib, which contains a predicted transmembrane segment (Leu-292 to Ser-308), presents 45% identity with the corresponding protective antigen sequence which is involved in the translocation of the toxin across the cell membrane.
Pasricha, Gunisha; Mishra, Akhilesh C.; Chakrabarti, Alok K.
2012-01-01
Please cite this paper as: Pasricha et al. (2012) Comprehensive global amino acid sequence analysis of PB1F2 protein of influenza A H5N1 viruses and the Influenza A virus subtypes responsible for the 20th-century pandemics. Influenza and Other Respiratory Viruses 7(4), 497–505. Background: PB1F2 is the 11th protein of influenza A virus translated from +1 alternate reading frame of PB1 gene. Since the discovery, varying sizes and functions of the PB1F2 protein of influenza A viruses have been reported. Selection of PB1 gene segment in the pandemics, variable size and pleiotropic effect of PB1F2 intrigued us to analyze amino acid sequences of this protein in various influenza A viruses. Methods: Amino acid sequences for PB1F2 protein of influenza A H5N1, H1N1, H2N2, and H3N2 subtypes were obtained from Influenza Research Database. Multiple sequence alignments of the PB1F2 protein sequences of the aforementioned subtypes were used to determine the size, variable and conserved domains and to perform mutational analysis. Results: Analysis showed that 96·4% of the H5N1 influenza viruses harbored full-length PB1F2 protein. Except for the 2009 pandemic H1N1 virus, all the subtypes of the 20th-century pandemic influenza viruses contained full-length PB1F2 protein. Through the years, PB1F2 protein of the H1N1 and H3N2 viruses has undergone much variation. PB1F2 protein sequences of H5N1 viruses showed both human- and avian host-specific conserved domains. Global database of PB1F2 protein revealed that N66S mutation was present only in 3·8% of the H5N1 strains. We found a novel mutation, N84S in the PB1F2 protein of 9·35% of the highly pathogenic avian influenza H5N1 influenza viruses. Conclusions: Varying sizes and mutations of the PB1F2 protein in different influenza A virus subtypes with pandemic potential were obtained. There was genetic divergence of the protein in various hosts which highlighted the host-specific evolution of the virus. However, studies are required to correlate this sequence variability with the virulence and pathogenicity. PMID:22788742
Santos, Regie Lyn P.; El-Shanti, Hatem; Sikandar, Shaheen; Lee, Kwanghyuk; Bhatti, Attya; Yan, Kai; Chahrour, Maria H.; McArthur, Nathan; Pham, Thanh L.; Mahasneh, Amjad Abdullah; Ahmad, Wasim
2010-01-01
To date, 37 genes have been identified for nonsyndromic hearing impairment (NSHI). Identifying the functional sequence variants within these genes and knowing their population-specific frequencies is of public health value, in particular for genetic screening for NSHI. To determine putatively functional sequence variants in the transmembrane inner ear (TMIE) gene in Pakistani and Jordanian families with autosomal recessive (AR) NSHI, four Jordanian and 168 Pakistani families with ARNSHI that is not due to GJB2 (CX26) were submitted to a genome scan. Two-point and multipoint parametric linkage analyses were performed, and families with logarithmic odds (LOD) scores of 1.0 or greater within the TMIE region underwent further DNA sequencing. The evolutionary conservation and location in predicted protein domains of amino acid residues where sequence variants occurred were studied to elucidate the possible effects of these sequence variants on function. Of seven families that were screened for TMIE, putatively functional sequence variants were found to segregate with hearing impairment in four families but were not seen in at least 110 ethnically matched control chromosomes. The previously reported c.241C>T (p.R81C) variant was observed in two Pakistani families. Two novel variants, c.92A>G (p.E31G) and the splice site mutation c.212–2A>C, were identified in one Pakistani and one Jordanian family, respectively. The c.92A>G (p.E31G) variant occurred at a residue that is conserved in the mouse and is predicted to be extracellular. Conservation and potential functionality of previously published mutations were also examined. The prevalence of functional TMIE variants in Pakistani families is 1.7% [95% confidence interval (CI) 0.3–4.8]. Further studies on the spectrum, prevalence rates, and functional effect of sequence variants in the TMIE gene in other populations should demonstrate the true importance of this gene as a cause of hearing impairment. PMID:16389551
Vaira, A M; Accotto, G P; Costantini, A; Milne, R G
2003-06-01
A 4018 nucleotide sequence was obtained for RNA 1 of Ranunculus white mottle virus (RWMV), genus Ophiovirus, representing an incomplete ORF of 1339 aa. Amino acid sequence analysis revealed significant similarities with RNA polymerases of viruses in the family Rhabdoviridae and a conserved domain of 685 aa, corresponding to the RdRp domain of those in the order Mononegavirales. Phylogenetic analysis indicated that the genus Ophiovirus is not related to the genus Tenuivirus or the family Bunyaviridae, with which it has been linked, and probably deserves a special taxonomic position, within a new family. A pair of degenerate primers was designed from a consensus sequence obtained from a relatively conserved region in the RNA 1 of two members of the genus, Citrus psorosis virus (CPsV) and RWMV. The primers, used in RT-PCR experiments, amplified a 136 bp DNA fragment from all the three recognized members of the genus, i.e. CPsV, RWMV and Tulip mild mottle mosaic virus (TMMMV) and from two tentative ophioviruses from lettuce and freesia. The amplified DNAs were sequenced and compared with the corresponding sequences of CPsV and RWMV and phylogenetic relationships were evaluated. Assays using extracts from plants infected by viruses belonging to the genera Tospovirus, Tenuivirus, Rhabdovirus and Varicosavirus indicated that the primers are genus-specific.
Dictionary-driven protein annotation.
Rigoutsos, Isidore; Huynh, Tien; Floratos, Aris; Parida, Laxmi; Platt, Daniel
2002-09-01
Computational methods seeking to automatically determine the properties (functional, structural, physicochemical, etc.) of a protein directly from the sequence have long been the focus of numerous research groups. With the advent of advanced sequencing methods and systems, the number of amino acid sequences that are being deposited in the public databases has been increasing steadily. This has in turn generated a renewed demand for automated approaches that can annotate individual sequences and complete genomes quickly, exhaustively and objectively. In this paper, we present one such approach that is centered around and exploits the Bio-Dictionary, a collection of amino acid patterns that completely covers the natural sequence space and can capture functional and structural signals that have been reused during evolution, within and across protein families. Our annotation approach also makes use of a weighted, position-specific scoring scheme that is unaffected by the over-representation of well-conserved proteins and protein fragments in the databases used. For a given query sequence, the method permits one to determine, in a single pass, the following: local and global similarities between the query and any protein already present in a public database; the likeness of the query to all available archaeal/bacterial/eukaryotic/viral sequences in the database as a function of amino acid position within the query; the character of secondary structure of the query as a function of amino acid position within the query; the cytoplasmic, transmembrane or extracellular behavior of the query; the nature and position of binding domains, active sites, post-translationally modified sites, signal peptides, etc. In terms of performance, the proposed method is exhaustive, objective and allows for the rapid annotation of individual sequences and full genomes.
Annotation examples are presented and discussed in Results, including individual queries and complete genomes that were released publicly after we built the Bio-Dictionary that is used in our experiments. Finally, we have computed the annotations of more than 70 complete genomes and made them available on the World Wide Web at http://cbcsrv.watson.ibm.com/Annotations/.
Li, Yang; Ren, Yi
2017-01-01
Pseudomonas sp. QTF5 was isolated from the continuous permafrost near the bitumen layers in the Qiangtang basin of the Qinghai-Tibetan Plateau in China (5,111 m above sea level). It is psychrotolerant, tolerates a wide range of heavy metals at high levels, and can metabolize benzoic acid and salicylic acid. To gain insight into the genetic basis for its adaptation, we performed whole genome sequencing and analyzed the resistance genes and metabolic pathways. Based on 120 published and annotated genomes representing 31 species in the genus Pseudomonas, in silico genomic DNA-DNA hybridization (<54%) and average nucleotide identity calculation (<94%) revealed that QTF5 is closest to Pseudomonas lini and should be classified as a novel species. This study provides the genetic basis for identifying the genes linked to its specific mechanisms of adaptation to extreme environments and for applying this microorganism in environmental conservation. PMID:29270429
Sequencing Conservation Actions Through Threat Assessments in the Southeastern United States
Robert D. Sutter; Christopher C. Szell
2006-01-01
The identification of conservation priorities is one of the leading issues in conservation biology. We present a project of The Nature Conservancy, called Sequencing Conservation Actions, which prioritizes conservation areas and identifies foci for crosscutting strategies at various geographic scales. We use the term "Sequencing" to mean an ordering of actions over...
GibbsCluster: unsupervised clustering and alignment of peptide sequences.
Andreatta, Massimo; Alvarez, Bruno; Nielsen, Morten
2017-07-03
Receptor interactions with short linear peptide fragments (ligands) are at the base of many biological signaling processes. Conserved and information-rich amino acid patterns, commonly called sequence motifs, shape and regulate these interactions. Because of the properties of a receptor-ligand system or of the assay used to interrogate it, experimental data often contain multiple sequence motifs. GibbsCluster is a powerful tool for unsupervised motif discovery because it can simultaneously cluster and align peptide data. The GibbsCluster 2.0 presented here is an improved version incorporating insertions and deletions to account for variations in motif length in the peptide input. In basic terms, the program takes as input a set of peptide sequences and clusters them into meaningful groups. It returns the optimal number of clusters it identified, together with the sequence alignment and sequence motif characterizing each cluster. Several parameters are available to customize cluster analysis, including adjustable penalties for small clusters and overlapping groups and a trash cluster to remove outliers. As an example application, we used the server to deconvolute multiple specificities in large-scale peptidome data generated by mass spectrometry. The server is available at http://www.cbs.dtu.dk/services/GibbsCluster-2.0. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Sequence similarity is more relevant than species specificity in probabilistic backtranslation.
Ferro, Alfredo; Giugno, Rosalba; Pigola, Giuseppe; Pulvirenti, Alfredo; Di Pietro, Cinzia; Purrello, Michele; Ragusa, Marco
2007-02-21
Backtranslation is the process of decoding a sequence of amino acids into the corresponding codons. All synthetic gene design systems include a backtranslation module. The degeneracy of the genetic code makes backtranslation potentially ambiguous since most amino acids are encoded by multiple codons. The common approach to overcome this difficulty is based on imitation of codon usage within the target species. This paper describes EasyBack, a new parameter-free, fully-automated software for backtranslation using Hidden Markov Models. EasyBack is not based on imitation of codon usage within the target species, but instead uses a sequence-similarity criterion. The model is trained with a set of proteins with known cDNA coding sequences, constructed from the input protein by querying the NCBI databases with BLAST. Unlike existing software, the proposed method allows the quality of prediction to be estimated. When tested on a group of proteins that show different degrees of sequence conservation, EasyBack outperforms other published methods in terms of precision. The prediction quality of a protein backtranslation method is markedly increased by replacing the criterion of most used codon in the same species with a Hidden Markov Model trained with a set of most similar sequences from all species. Moreover, the proposed method allows the quality of prediction to be estimated probabilistically.
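To make the ambiguity described in this abstract concrete, the following minimal Python sketch implements the baseline "most used codon" criterion that the abstract contrasts with EasyBack. It is not EasyBack's Hidden Markov Model. The codon-usage table is a hypothetical, partial one with made-up frequencies, and the function names are illustrative assumptions.

```python
# Hypothetical, partial codon-usage table: the relative frequencies are
# invented for illustration and do not come from any real species.
CODON_USAGE = {
    "M": {"ATG": 1.0},
    "K": {"AAA": 0.43, "AAG": 0.57},
    "L": {"CTG": 0.40, "CTC": 0.20, "TTG": 0.13,
          "CTT": 0.13, "TTA": 0.07, "CTA": 0.07},
    "W": {"TGG": 1.0},
}


def backtranslate_most_used(protein, usage=CODON_USAGE):
    """Pick, for every residue, the codon with the highest usage frequency."""
    codons = []
    for aa in protein:
        if aa not in usage:
            raise KeyError(f"no codon usage data for residue {aa!r}")
        table = usage[aa]
        codons.append(max(table, key=table.get))
    return "".join(codons)


def count_candidates(protein, usage=CODON_USAGE):
    """Number of distinct coding sequences compatible with the protein."""
    total = 1
    for aa in protein:
        total *= len(usage[aa])
    return total


print(backtranslate_most_used("MKLW"))
print(count_candidates("MKLW"))
```

Even this four-residue toy peptide has several compatible coding sequences under the table above, which is why a selection criterion (codon usage, or sequence similarity as in the abstract) is needed at all.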
Wang, Xiao-Jing; Wang, Xiao-Xing; Wang, Ya-Jun; Wang, Xi-Zhong; He, Guang-Xin; Chen, Hong-Wei; Fei, Li-Song
2002-09-01
Activin, which is included in the transforming growth factor-beta (TGF beta) superfamily of proteins and receptors, is known to have broad-ranging effects in animals. The mature peptide of the beta A subunit of this gene, one of the most highly conserved sequences, can elevate the basal secretion of follicle-stimulating hormone (FSH) in the pituitary, and FSH is pivotal to an organism's reproduction. Impaired reproduction is one of the main reasons the giant panda is threatened with extinction. The sequence of the Activin beta A subunit gene mature peptide has been successfully amplified from giant panda, red panda and Malayan sun bear genomic DNA by polymerase chain reaction (PCR) with a pair of degenerate primers. The PCR products were cloned into the vector pBlueScript+ in Escherichia coli. Sequence analysis of the Activin beta A subunit gene mature peptides shows that the length of this gene segment is the same (359 bp) and that it contains no intron in any of the three species. The sequence encodes a peptide of 119 amino acid residues. The homology comparison demonstrates 93.9% DNA homology and 99% amino acid homology among these three species. Both the GenBank BLAST search results and the restriction enzyme map reveal that the sequences of the Activin beta A subunit gene mature peptides of different species have been highly conserved during evolution. Phylogenetic analysis was performed with the PHYLIP software package. A consistent phylogenetic tree was obtained with three different methods. The outcome of the software analysis accords with the academic view that the giant panda is more closely related to the Malayan sun bear than to the red panda. The giant panda should be grouped into the bear family (Ursidae) with the Malayan sun bear. The red panda would be better placed in a family of its own (the red panda family) because of the great difference between the red panda and the bears (Ursidae).
NASA Astrophysics Data System (ADS)
Weigt, Martin
Over the last years, biological research has been revolutionized by experimental high-throughput techniques, in particular by next-generation sequencing technology. Unprecedented amounts of data are accumulating, and there is a growing request for computational methods unveiling the information hidden in raw data, thereby increasing our understanding of complex biological systems. Statistical-physics models based on the maximum-entropy principle have, in the last few years, played an important role in this context. To give a specific example, proteins and many non-coding RNA show a remarkable degree of structural and functional conservation in the course of evolution, despite a large variability in amino acid sequences. We have developed a statistical-mechanics inspired inference approach - called Direct-Coupling Analysis - to link this sequence variability (easy to observe in sequence alignments, which are available in public sequence databases) to bio-molecular structure and function. In my presentation I will show, how this methodology can be used (i) to infer contacts between residues and thus to guide tertiary and quaternary protein structure prediction and RNA structure prediction, (ii) to discriminate interacting from non-interacting protein families, and thus to infer conserved protein-protein interaction networks, and (iii) to reconstruct mutational landscapes and thus to predict the phenotypic effect of mutations. References [1] M. Figliuzzi, H. Jacquier, A. Schug, O. Tenaillon and M. Weigt ''Coevolutionary landscape inference and the context-dependence of mutations in beta-lactamase TEM-1'', Mol. Biol. Evol. (2015), doi: 10.1093/molbev/msv211 [2] E. De Leonardis, B. Lutz, S. Ratz, S. Cocco, R. Monasson, A. Schug, M. Weigt ''Direct-Coupling Analysis of nucleotide coevolution facilitates RNA secondary and tertiary structure prediction'', Nucleic Acids Research (2015), doi: 10.1093/nar/gkv932 [3] F. Morcos, A. Pagnani, B. Lunt, A. Bertolino, D. Marks, C. 
Sander, R. Zecchina, J.N. Onuchic, T. Hwa, M. Weigt, ''Direct-coupling analysis of residue co-evolution captures native contacts across many protein families'', Proc. Natl. Acad. Sci. 108, E1293-E1301 (2011).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sobell, J.L.; Lind, T.J.; Sommer, S.S.
To determine whether mutations in the D5 dopamine receptor (D5DR) gene are associated with schizophrenia, the gene was examined in 78 unrelated schizophrenic individuals. After amplification by the polymerase chain reaction, products were examined by dideoxy fingerprinting (ddF), a highly sensitive screening method related to single strand conformational polymorphism analysis. All samples with unusual ddF patterns were sequenced to precisely identify the sequence change. In the 156 D5DR alleles examined, nine sequence changes were identified. Four of the nine did not affect protein structure; of these, three were silent changes and one was a transition in the 3′ untranslated region. The remaining five sequence changes result in protein alterations: of these, one is a missense change in a non-conserved amino acid, 3 are missense changes in amino acids that are conserved in some dopamine D5 receptors and the last is a nonsense mutation. To investigate whether the nonsense mutation was associated with schizophrenia, 400 additional schizophrenic cases of western European descent and 1914 ethnically-similar controls were screened for the change. One additional schizophrenic carrier was identified and verified by direct genomic sequencing (allele frequency: .0013), but eight carriers also were found and confirmed among the non-schizophrenics (allele frequency: .0021)(p>.25). The gene was re-examined in all newly identified carriers of the nonsense mutation by direct sequencing and/or ddF in search of additional mutations. None were identified. Family studies also were conducted to investigate possible cosegregation of the mutation with other neuropsychiatric diseases, but this was not demonstrated. Thus, the mutation does not appear to be associated with an increased risk of schizophrenia nor does an initial analysis suggest cosegregation with other neuropsychiatric disorders or symptom complexes.
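The allele frequencies quoted in this abstract can be reproduced with simple arithmetic. The sketch below assumes, as the abstract does not state explicitly, that every carrier is heterozygous (one mutant copy) and that each screened individual contributes two alleles; the function name is an illustrative assumption.

```python
def allele_frequency(carriers, individuals, copies_per_carrier=1):
    """Mutant allele frequency in a diploid sample.

    Assumes each carrier holds `copies_per_carrier` mutant alleles
    (1 = heterozygous, an assumption not stated in the abstract)
    and each individual contributes two alleles in total.
    """
    if individuals <= 0:
        raise ValueError("individuals must be positive")
    return carriers * copies_per_carrier / (2 * individuals)


# 1 carrier among the 400 additional cases: 1/800 = 0.00125 (reported as .0013)
cases = allele_frequency(carriers=1, individuals=400)

# 8 carriers among 1914 controls: 8/3828, about 0.0021 (reported as .0021)
controls = allele_frequency(carriers=8, individuals=1914)

print(f"cases: {cases:.5f}, controls: {controls:.5f}")
```

Under these assumptions the computed values agree with the reported figures, with the mutant allele slightly more frequent in controls than in cases.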
In silico analysis of β-1,3-glucanase from a psychrophilic yeast, Glaciozyma antarctica PI12
NASA Astrophysics Data System (ADS)
Mohammadi, Salimeh; Bakar, Farah Diba Abu; Rabu, Amir; Murad, Abdul Munir Abdul
2014-09-01
1,3-beta-glucanase is an industrially important enzyme with a wide range of applications, especially in the food industry. It is crucial to gain an understanding of the structural and functional aspects of the various beta-1,3-glucanases produced from diverse sources. In this study, a cDNA encoding β-1,3-glucanase (GaExg55) was isolated from a psychrophilic yeast, Glaciozyma antarctica PI12. The cDNA sequence has been submitted to GenBank under accession number KJ436377. Subsequently, the predicted protein was analyzed using various bioinformatics tools to explore its properties. GaExg55 consists of 1,440 bp encoding 480 amino acid residues. Alignment of the deduced amino acid sequence of GaExg55 with other exo-β-1,3-glucanases available in the NCBI database indicates that they share a consensus motif, NEP, which is the signature pattern of GH5 hydrolases. The predicted molecular weight of GaExg55 is 53.66 kDa. The GaExg55 sequence possesses a signal peptide and is highly conserved with other fungal exo-beta-1,3-glucanases.
Earl, P L; Jones, E V; Moss, B
1986-01-01
A 5400-base-pair segment of the vaccinia virus genome was sequenced and an open reading frame of 938 codons was found precisely where the DNA polymerase had been mapped by transfer of a phosphonoacetate-resistance marker. A single nucleotide substitution changing glycine at position 347 to aspartic acid accounts for the drug resistance of the mutant vaccinia virus. The 5' end of the DNA polymerase mRNA was located 80 base pairs before the methionine codon initiating the open reading frame. Correspondence between the predicted Mr 108,577 polypeptide and the 110,000 purified enzyme indicates that little or no proteolytic processing occurs. Extensive homology, extending over 435 amino acids, was found upon comparing the DNA polymerase of vaccinia virus and DNA polymerase of Epstein-Barr virus. A highly conserved sequence of 14 amino acids in the carboxyl-terminal regions of the above DNA polymerases is also present at a similar location in adenovirus DNA polymerase. This structure, which is predicted to form a turn flanked by beta-pleated sheets, may form part of an essential binding or catalytic site that accounts for its presence in DNA polymerases of poxviruses, herpesviruses, and adenoviruses. PMID:3012524
Wang, Y; Conlon, J M
1995-04-01
Vasoactive intestinal polypeptide (VIP) was purified from extracts of the stomachs of the rainbow trout, Oncorhynchus mykiss, and the bowfin, Amia calva. The primary structure of VIP from both species was the same: His-Ser-Asp-Ala-Ile-Phe-Thr-Asp-Asn-Tyr10-Ser-Arg-Phe-Arg-Lys-Gln-Met-Ala-Val-Lys20-Lys-Tyr-Leu-Asn-Ser-Val-Leu-Thr. This amino acid sequence shows only one amino acid substitution (Val5→Ile) compared with the common sequence of VIP from the chicken, alligator, and European green frog. The structural identity of VIP from the trout and bowfin is consistent with the close phylogenetic relationship between the Salmoniformes and the Amiiformes and the data indicate that pressure to conserve the complete primary structure of VIP during vertebrate evolution has been very strong.
Positive selection in octopus haemocyanin indicates functional links to temperature adaptation.
Oellermann, Michael; Strugnell, Jan M; Lieb, Bernhard; Mark, Felix C
2015-07-05
Octopods have successfully colonised the world's oceans from the tropics to the poles. Yet, successful persistence in these habitats has required adaptations of their advanced physiological apparatus to compensate impaired oxygen supply. Their oxygen transporter haemocyanin plays a major role in cold tolerance and accordingly has undergone functional modifications to sustain oxygen release at sub-zero temperatures. However, it remains unknown how molecular properties evolved to explain the observed functional adaptations. We thus aimed to assess whether natural selection affected molecular and structural properties of haemocyanin that explains temperature adaptation in octopods. Analysis of 239 partial sequences of the haemocyanin functional units (FU) f and g of 28 octopod species of polar, temperate, subtropical and tropical origin revealed natural selection was acting primarily on charge properties of surface residues. Polar octopods contained haemocyanins with higher net surface charge due to decreased glutamic acid content and higher numbers of basic amino acids. Within the analysed partial sequences, positive selection was present at site 2545, positioned between the active copper binding centre and the FU g surface. At this site, methionine was the dominant amino acid in polar octopods and leucine was dominant in tropical octopods. Sites directly involved in oxygen binding or quaternary interactions were highly conserved within the analysed sequence. This study has provided the first insight into molecular and structural mechanisms that have enabled octopods to sustain oxygen supply from polar to tropical conditions. Our findings imply modulation of oxygen binding via charge-charge interaction at the protein surface, which stabilize quaternary interactions among functional units to reduce detrimental effects of high pH on venous oxygen release. 
Of the observed partial haemocyanin sequence, residue 2545 formed a close link between the FU g surface and the active centre, suggesting a role as allosteric binding site. The prevalence of methionine at this site in polar octopods, implies regulation of oxygen affinity via increased sensitivity to allosteric metal binding. High sequence conservation of sites directly involved in oxygen binding indicates that functional modifications of octopod haemocyanin rather occur via more subtle mechanisms, as observed in this study.
Kotlyar, S; Weihrauch, D; Paulsen, R S; Towle, D W
2000-08-01
Phosphagen kinases catalyze the reversible dephosphorylation of guanidino phosphagens such as phosphocreatine and phosphoarginine, contributing to the restoration of adenosine triphosphate concentrations in cells experiencing high and variable demands on their reserves of high-energy phosphates. The major invertebrate phosphagen kinase, arginine kinase, is expressed in the gills of two species of euryhaline crabs, the blue crab Callinectes sapidus and the shore crab Carcinus maenas, in which energy-requiring functions include monovalent ion transport, acid-base balance, nitrogen excretion and gas exchange. The enzymatic activity of arginine kinase approximately doubles in the ion-transporting gills of C. sapidus, a strong osmoregulator, when the crabs are transferred from high to low salinity, but does not change in C. maenas, a more modest osmoregulator. Amplification and sequencing of arginine kinase cDNA from both species, accomplished by reverse transcription of gill mRNA and the polymerase chain reaction, revealed an open reading frame coding for a 357-amino-acid protein. The predicted amino acid sequences showed a minimum of 75 % identity with arginine kinase sequences of other arthropods. Ten of the 11 amino acid residues believed to participate in arginine binding are completely conserved among the arthropod sequences analyzed. An estimation of arginine kinase mRNA abundance indicated that acclimation salinity has no effect on arginine kinase gene transcription. Thus, the observed enhancement of enzyme activity in C. sapidus probably results from altered translation rates or direct activation of pre-existing enzyme protein.
Kiriake, Aya; Shiomi, Kazuo
2011-11-01
Lionfish, members of the genera Pterois, Parapterois and Dendrochirus, are well known to be venomous, having venomous glandular tissues in dorsal, pelvic and anal spines. The lionfish toxins have been shown to cross-react with the stonefish toxins by neutralization tests using the commercial stonefish antivenom, although their chemical properties including structures have been little characterized. In this study, an antiserum against neoverrucotoxin, the stonefish Synanceia verrucosa toxin, was first raised in a guinea pig and used in immunoblotting and inhibition immunoblotting to confirm that two species of Pterois lionfish (P. antennata and P. volitans) contain a 75kDa protein (corresponding to the toxin subunit) cross-reacting with neoverrucotoxin. Then, the amino acid sequences of the P. antennata and P. volitans toxins were successfully determined by cDNA cloning using primers designed from the highly conserved sequences of the stonefish toxins. Notably, either α-subunits (699 amino acid residues) or β-subunits (698 amino acid residues) of the P. antennata and P. volitans toxins share as high as 99% sequence identity with each other. Furthermore, both α- and β-subunits of the lionfish toxins exhibit high sequence identity (70-80% identity) with each other and also with the β-subunits of the stonefish toxins. As reported for the stonefish toxins, the lionfish toxins also contain a B30.2/SPRY domain (comprising nearly 200 amino acid residues) in the C-terminal region of each subunit. Copyright © 2011 Elsevier Ltd. All rights reserved.
Combined sequence and structure analysis of the fungal laccase family.
Kumar, S V Suresh; Phale, Prashant S; Durani, S; Wangikar, Pramod P
2003-08-20
Plant and fungal laccases belong to the family of multi-copper oxidases and show much broader substrate specificity than other members of the family. Laccases have consequently been of interest for potential industrial applications. We have analyzed the essential sequence features of fungal laccases based on multiple sequence alignments of more than 100 laccases. This has resulted in identification of a set of four ungapped sequence regions, L1-L4, as the overall signature sequences that can be used to identify the laccases, distinguishing them within the broader class of multi-copper oxidases. The 12 amino acid residues in the enzymes serving as the copper ligands are housed within these four identified conserved regions, of which L2 and L4 conform to the earlier reported copper signature sequences of multi-copper oxidases while L1 and L3 are distinctive to the laccases. The mapping of regions L1-L4 on to the three-dimensional structure of the Coprinus cinerius laccase indicates that many of the non-copper-ligating residues of the conserved regions could be critical in maintaining a specific, more or less C-2 symmetric, protein conformational motif characterizing the active site apparatus of the enzymes. The observed intraprotein homologies between L1 and L3 and between L2 and L4 at both the structure and the sequence levels suggest that the quasi C-2 symmetric active site conformational motif may have arisen from a structural duplication event that neither the sequence homology analysis nor the structure homology analysis alone would have unraveled. Although the sequence and structure homology is not detectable in the rest of the protein, the relative orientation of region L1 with L2 is similar to that of L3 with L4. 
The structure duplication of first-shell and second-shell residues has become cryptic because the intraprotein sequence homology noticeable for a given laccase becomes significant only after comparing the conservation pattern in several fungal laccases. The identified motifs, L1-L4, can be useful in searching the newly sequenced genomes for putative laccase enzymes. Copyright 2003 Wiley Periodicals, Inc. Biotechnol Bioeng 83: 386-394, 2003.
Zhang, Xu; Diekwisch, Thomas G H; Luan, Xianghong
2011-12-01
The functional significance of extracellular matrix proteins in the life of vertebrates is underscored by a high level of sequence variability in tandem with a substantial degree of conservation in terms of cell-cell and cell-matrix adhesion interactions. Many extracellular matrix proteins feature multiple adhesion domains for successful attachment to substrates, such as integrin, CD63, and heparin. Here we have used homology and ab initio modeling algorithms to compare mouse ameloblastin (mAMBN) and human ameloblastin (hAMBN) isoforms and to analyze their potential for cell adhesion and interaction with other matrix molecules as well as calcium binding. Sequence comparison between mAMBN and hAMBN revealed a 26-amino-acid deletion in mAMBN, corresponding to a helix-loop-helix frameshift. The human AMBN domain (174Q-201G), homologous to the mAMBN 157E-178I helix-loop-helix region, formed a helix-loop motif with an extended loop, suggesting a higher degree of flexibility of hAMBN compared with mAMBN, as confirmed by molecular dynamics simulation. Heparin-binding domains, CD63-interaction domains, and calcium-binding sites in both hAMBN and mAMBN support the concept of AMBN as an extracellular matrix protein. The high level of conservation between AMBN functional domains related to adhesion and differentiation was remarkable when compared with only 61% amino acid sequence homology. © 2011 Eur J Oral Sci.
Miller, Bradley R; Sundlov, Jesse A; Drake, Eric J; Makin, Thomas A; Gulick, Andrew M
2014-10-01
Nonribosomal peptide synthetases (NRPSs) are multimodular proteins capable of producing important peptide natural products. Using an assembly line process, the amino acid substrate and peptide intermediates are passed between the active sites of different catalytic domains of the NRPS while bound covalently to a peptidyl carrier protein (PCP) domain. Examination of the linker sequences that join the NRPS adenylation and PCP domains identified several conserved proline residues that are not found in standalone adenylation domains. We examined the roles of these proline residues and neighboring conserved sequences through mutagenesis and biochemical analysis of the reaction catalyzed by the adenylation domain and the fully reconstituted NRPS pathway. In particular, we identified a conserved LPxP motif at the start of the adenylation-PCP linker. The LPxP motif interacts with a region on the adenylation domain to stabilize a critical catalytic lysine residue belonging to the A10 motif that immediately precedes the linker. Further, this interaction with the C-terminal subdomain of the adenylation domain may coordinate movement of the PCP with the conformational change of the adenylation domain. Through this work, we extend the conserved A10 motif of the adenylation domain and identify residues that enable proper adenylation domain function. © 2014 Wiley Periodicals, Inc.
On the relationship between residue structural environment and sequence conservation in proteins.
Liu, Jen-Wei; Lin, Jau-Ji; Cheng, Chih-Wen; Lin, Yu-Feng; Hwang, Jenn-Kang; Huang, Tsun-Tsao
2017-09-01
Residues that are crucial to protein function or structure are usually evolutionarily conserved. To identify the important residues in a protein, sequence conservation is estimated, and current methods rely upon the unbiased collection of homologous sequences. Surprisingly, our previous studies have shown that sequence conservation is closely correlated with the weighted contact number (WCN), a measure of packing density for a residue's structural environment, calculated based only on the Cα positions of a protein structure. Moreover, studies have shown that sequence conservation is correlated with environment-related structural properties calculated based on different protein substructures, such as a protein's all atoms, backbone atoms, side-chain atoms, or side-chain centroid. To determine whether the Cα atomic positions are adequate to show the relationship between residue environment and sequence conservation, here we compared Cα atoms with other substructures in their contributions to sequence conservation. Our results show that Cα positions are substantially equivalent to the other substructures in calculations of various measures of residue environment. As a result, the overlapping contributions between Cα atoms and the other substructures are high, yielding similar structure-conservation relationships. Taking the WCN as an example, the average overlapping contribution to sequence conservation is 87% between Cα and all-atom substructures. These results indicate that the Cα atoms of a protein structure alone could reflect sequence conservation at the residue level. © 2017 Wiley Periodicals, Inc.
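The weighted contact number mentioned in this abstract can be sketched in a few lines of Python. The abstract does not give the formula; the version below assumes the commonly used inverse-square form, WCN_i = Σ_{j≠i} 1/r_ij², where r_ij is the distance between the Cα atoms of residues i and j. The function name and the toy coordinates are illustrative assumptions, not data from the study.

```python
def weighted_contact_numbers(coords):
    """Return the WCN of every residue from a list of (x, y, z) positions.

    Assumes the inverse-square definition WCN_i = sum_{j != i} 1 / r_ij**2,
    with one coordinate triple (e.g. the C-alpha atom) per residue.
    """
    n = len(coords)
    wcn = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            # Squared Euclidean distance between residues i and j.
            d2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            if d2 == 0:
                raise ValueError(f"residues {i} and {j} share a position")
            weight = 1.0 / d2
            wcn[i] += weight
            wcn[j] += weight
    return wcn


# Toy example: three points on a line, one unit apart (not real C-alpha data).
toy = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(weighted_contact_numbers(toy))
```

In the toy example the middle point has the largest value because it is closest to both neighbours, mirroring the idea that densely packed residues receive a higher WCN.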
Amino Acid Properties Conserved in Molecular Evolution
Rudnicki, Witold R.; Mroczek, Teresa; Cudek, Paweł
2014-01-01
That amino acid properties are responsible for the way protein molecules evolve is natural and is also reasonably well supported both by the structure of the genetic code and, to a large extent, by the experimental measures of the amino acid similarity. Nevertheless, there remains a significant gap between observed similarity matrices and their reconstructions from amino acid properties. Therefore, we introduce a simple theoretical model of amino acid similarity matrices, which allows splitting the matrix into two parts – one that depends only on mutabilities of amino acids and another that depends on pairwise similarities between them. Then the new synthetic amino acid properties are derived from the pairwise similarities and used to reconstruct similarity matrices covering a wide range of information entropies. Our model allows us to explain up to 94% of the variability in the BLOSUM family of amino acid similarity matrices in terms of amino acid properties. The new properties derived from amino acid similarity matrices correlate highly with properties known to be important for molecular evolution such as hydrophobicity, size, shape and charge of amino acids. This result closes the gap in our understanding of the influence of amino acids on evolution at the molecular level. The method was applied to a single family of similarity matrices often used in general sequence homology searches, but it is general and can also be used for more specific matrices. The new synthetic properties can be used in analyses of protein sequences in various biological applications. PMID:24967708
2014-01-01
Background Neisseria meningitidis expresses type four pili (Tfp) which are important for colonisation and virulence. Tfp have been considered as one of the most variable structures on the bacterial surface due to high frequency gene conversion, resulting in amino acid sequence variation of the major pilin subunit (PilE). Meningococci express either a class I or a class II pilE gene and recent work has indicated that class II pilins do not undergo antigenic variation, as class II pilE genes encode conserved pilin subunits. The purpose of this work was to use whole genome sequences to further investigate the frequency and variability of the class II pilE genes in meningococcal isolate collections. Results We analysed over 600 publicly available whole genome sequences of N. meningitidis isolates to determine the sequence and genomic organization of pilE. We confirmed that meningococcal strains belonging to a limited number of clonal complexes (ccs, namely cc1, cc5, cc8, cc11 and cc174) harbour a class II pilE gene which is conserved in terms of sequence and chromosomal context. We also identified pilS cassettes in all isolates with class II pilE; however, our analysis indicates that these do not serve as donor sequences for pilE/pilS recombination. Furthermore, our work reveals that the class II pilE locus lacks the DNA sequence motifs that enable (G4) or enhance (Sma/Cla repeat) pilin antigenic variation. Finally, through analysis of pilin genes in commensal Neisseria species we found that meningococcal class II pilE genes are closely related to pilE from Neisseria lactamica and Neisseria polysaccharea, suggesting horizontal transfer among these species. Conclusions Class II pilins can be defined by their amino acid sequence and genomic context and are present in meningococcal isolates which have persisted and spread globally. 
The absence of G4 and Sma/Cla sequences adjacent to the class II pilE genes is consistent with the lack of pilin subunit variation in these isolates, although horizontal transfer may generate class II pilin diversity. This study supports the suggestion that high frequency antigenic variation of pilin is not universal in pathogenic Neisseria. PMID:24690385
Joshi, R K; Mohanty, S; Subudhi, E; Nayak, S
2010-09-08
Turmeric (Curcuma longa), an important asexually reproducing spice crop of the family Zingiberaceae is highly susceptible to bacterial and fungal pathogens. The identification of resistance gene analogs holds great promise for development of resistant turmeric cultivars. Degenerate primers designed based on known resistance genes (R-genes) were used in combinations to elucidate resistance gene analogs from Curcuma longa cultivar surama. The three primers resulted in amplicons with expected sizes of 450-600 bp. The nucleotide sequence of these amplicons was obtained through sequencing; their predicted amino acid sequences compared to each other and to the amino acid sequences of known R-genes revealed significant sequence similarity. The finding of conserved domains, viz., kinase-1a, kinase-2 and hydrophobic motif, provided evidence that the sequences belong to the NBS-LRR class gene family. The presence of tryptophan as the last residue of kinase-2 motif further qualified them to be in the non-TIR-NBS-LRR subfamily of resistance genes. A cluster analysis based on the neighbor-joining method was carried out using Curcuma NBS analogs together with several resistance gene analogs and known R-genes, which classified them into two distinct subclasses, corresponding to clades N3 and N4 of non-TIR-NBS sequences described in plants. The NBS analogs that we isolated can be used as guidelines to eventually isolate numerous R-genes in turmeric.
Three closely related herpesviruses are associated with fibropapillomatosis in marine turtles
Quackenbush, S.L.; Work, Thierry M.; Balazs, George H.; Casey, Rufina N.; Rovnak, J.; Chaves, A.; duToit, L.; Baines, J.D.; Parrish, C.R.; Bowser, Paul R.; Casey, James W.
1998-01-01
Green turtle fibropapillomatosis is a neoplastic disease of increasingly significant threat to the survivability of this species. Degenerate PCR primers that target highly conserved regions of genes encoding herpesvirus DNA polymerases were used to amplify a DNA sequence from fibropapillomas and fibromas from Hawaiian and Florida green turtles. All of the tumors tested (n= 23) were found to harbor viral DNA, whereas no viral DNA was detected in skin biopsies from tumor-negative turtles. The tissue distribution of the green turtle herpesvirus appears to be generally limited to tumors where viral DNA was found to accumulate at approximately two to five copies per cell and is occasionally detected, only by PCR, in some tissues normally associated with tumor development. In addition, herpesviral DNA was detected in fibropapillomas from two loggerhead and four olive ridley turtles. Nucleotide sequencing of a 483-bp fragment of the turtle herpesvirus DNA polymerase gene determined that the Florida green turtle and loggerhead turtle sequences are identical and differ from the Hawaiian green turtle sequence by five nucleotide changes, which results in two amino acid substitutions. The olive ridley sequence differs from the Florida and Hawaiian green turtle sequences by 15 and 16 nucleotide changes, respectively, resulting in four amino acid substitutions, three of which are unique to the olive ridley sequence. Our data suggest that these closely related turtle herpesviruses are intimately involved in the genesis of fibropapillomatosis.
Jimenez, Karim L; Zavaleta, Amparo I; Izaguirre, Victor; Yarleque, Armando; Inga, Rosio R
2010-01-01
To isolate and characterize in silico the phospholipase A(2) (PLA(2)) gene from Lachesis muta venom of the Peruvian Amazon. RT-PCR was performed on total RNA using specific primers, and the amplified DNA product was inserted into the pGEM vector for subsequent sequencing. Bioinformatic analysis identified an open reading frame of 414 nucleotides that encoded 138 amino acids, including a signal peptide of 16 amino acids; the molecular weight and pI were 13.976 kDa and 5.66, respectively. The amino acid sequence, named Lm-PLA(2)-Peru, contains an aspartate at position 49; this amino acid, together with other conserved residues such as Tyr-28, Gly-30, Gly-32, His-48, Tyr-52 and Asp-99, is important for enzymatic activity. Comparison with amino acid sequence databases showed similarity to PLA(2) from Lachesis stenophrys (93%) and other snake venom PLA(2)s, and over 80% similarity to other sPLA(2)s from venoms of the family Viperidae. A phylogenetic analysis showed that Lm-PLA(2)-Peru grouped with other acidic [Asp(49)] sPLA(2)s previously isolated from Bothriechis schlegelii venom, showing 89% nucleotide sequence identity. Finally, computer modeling indicated that the enzyme had the characteristic structure of group II sPLA(2), consisting of three α-helices, a β-wing, a short helix and a calcium-binding loop. The nucleotide sequence corresponds to the first PLA(2) gene transcript cloned from the venom of Lachesis muta, a snake from the Peruvian rainforest.
Identification and expression analysis of cDNA encoding insulin-like growth factor 2 in horses
KIKUCHI, Kohta; SASAKI, Keisuke; AKIZAWA, Hiroki; TSUKAHARA, Hayato; BAI, Hanako; TAKAHASHI, Masashi; NAMBO, Yasuo; HATA, Hiroshi; KAWAHARA, Manabu
2017-01-01
Insulin-like growth factor 2 (IGF2) is responsible for a broad range of physiological processes during fetal development and adulthood, but genomic analyses of IGF2 containing the 5ʹ- and 3ʹ-untranslated regions (UTRs) in equines have been limited. In this study, we characterized the IGF2 mRNA containing the UTRs, and determined its expression pattern in the fetal tissues of horses. The complete equine IGF2 mRNA sequence harboring another exon approximately 2.8 kb upstream from the canonical transcription start site was identified as a new transcript variant. As this upstream exon did not contain the start codon, the amino acid sequence was identical to the canonical variant. Analysis of the deduced amino acid sequence revealed that the protein possessed two major domains, IlGF and IGF2_C, and analysis of IGF2 sequence polymorphism in fetal tissues of Hokkaido native horse and Thoroughbreds revealed a single nucleotide polymorphism (T to C transition) at position 398 in Thoroughbreds, which caused an amino acid substitution at position 133 in the IGF2 sequence. Furthermore, the expression pattern of the IGF2 mRNA in the fetal tissues of horses was determined for the first time, and was found to be consistent with those of other species. Taken together, these results suggested that the transcriptional and translational products of the IGF2 gene have conserved functions in the fetal development of mammals, including horses. PMID:29151450
Molecular evaluation of five cardiac genes in Doberman Pinschers with dilated cardiomyopathy.
Meurs, Kathryn M; Hendrix, Kristina P; Norgard, Michelle M
2008-08-01
To sequence the exonic and splice site regions of 5 cardiac genes associated with the human form of familial dilated cardiomyopathy (DCM) in Doberman Pinschers with DCM and to identify a causative mutation. 5 unrelated Doberman Pinschers with DCM and 2 unaffected Labrador Retrievers (control dogs). Exonic and splice site regions of the 5 genes encoding the cardiac proteins troponin C, lamin A/C, cysteine- and glycine-rich protein 3, cardiac troponin T, and the beta-myosin heavy chain were sequenced. Sequences were compared for nucleotide changes between affected dogs and the published canine sequences and 2 control dogs. Base pair changes were considered to be causative for DCM if they were present in an affected dog but not in the control dogs or published sequences and if they involved a conserved amino acid and changed that amino acid to a different polarity, acid-base status, or structure. A causative mutation for DCM in Doberman Pinschers was not identified, although single nucleotide polymorphisms were detected in some dogs in the cysteine- and glycine-rich protein 3, beta-myosin heavy chain, and troponin T genes. Mutations in 5 of the cardiac genes associated with the development of DCM in humans did not appear to be causative for DCM in Doberman Pinschers. Continued evaluation of additional candidate genes or a focused approach with an association analysis is warranted to elucidate the molecular cause of this important cardiac disease in Doberman Pinschers.
NASA Astrophysics Data System (ADS)
Hamid, Nur Athirah Abd; Ismail, Ismanizan
2013-11-01
Polygonum minus, locally known as Kesum, is an aromatic herb with a high secondary metabolite content. Alcohol dehydrogenase (ADH) is an important enzyme that catalyzes the reversible oxidation of alcohol and aldehyde in the presence of NAD(P)(H) as a cofactor. The main focus of this research is to identify the ADH gene. Total RNA was extracted from leaves of P. minus treated with 150 μM jasmonic acid. The full-length cDNA sequence of ADH was isolated via rapid amplification of cDNA ends (RACE). Subsequently, in silico analysis was conducted on the full-length cDNA sequence, and PCR was performed on genomic DNA to determine the exon and intron organization. Two sequences of ADH, designated PmADH1 and PmADH2, were successfully isolated. Both sequences have an ORF of 801 bp, which encodes 266 aa residues. Nucleotide sequence comparison of PmADH1 and PmADH2 indicated that the two sequences are highly similar in the ORF region but divergent in the 3' untranslated regions (UTRs). The amino acid sequences differ at residue 107: PmADH1 contains a Gly (G) residue, while PmADH2 contains a Cys (C) residue. The intron-exon organization of both sequences is also the same, with 3 introns and 4 exons. Based on in silico analysis, both sequences contain the "classical" short chain alcohol dehydrogenases/reductases ((c) SDRs) conserved domain. The results suggest that both sequences are members of the short chain alcohol dehydrogenase family.
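The ORF figures quoted in this abstract (801 bp, 266 residues) can be checked with simple codon arithmetic. The sketch below assumes that the reported ORF length includes the terminating stop codon, which the abstract does not state explicitly; the helper name and its `includes_stop` flag are hypothetical.

```python
def orf_protein_length(orf_bp, includes_stop=True):
    """Number of encoded residues for an ORF of orf_bp nucleotides.

    Assumption: when includes_stop is True, the last codon is a stop
    codon and does not contribute an amino acid.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# 801 bp -> 267 codons -> 266 residues plus one stop codon
pm_adh_residues = orf_protein_length(801)
```

Other records in this listing appear to report ORF lengths without the stop codon, so the flag should be set according to the convention each source uses.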
Li, Guang-Qi; Zang, Xiao-Nan; Zhang, Xue-Cheng; Lu, Ning; Ding, Yan; Gong, Le; Chen, Wen-Chao
2014-03-15
To study the response of Gracilaria lemaneiformis to heat stress, two key enzymes of the Ubiquitin/26S proteasome pathway (UPP), ubiquitin-activating enzyme (E1) and ubiquitin-conjugating enzyme (E2), were studied in three strains of G. lemaneiformis: wild type, heat-tolerant cultivar 981 and heat-tolerant cultivar 07-2. The full-length DNA sequence of E1 contained only one exon. The open reading frame (ORF) sequence was 981 nucleotides encoding 326 amino acids, which contained conserved ATP binding sites (LYDRQIRLWGLE, ELAKNVLLAGV, LKEMN, VVCAI) and the ubiquitin-activating domains (VVCAI…LMTEAC, VFLDLGDEYSYQ, AIVGGMWGRE). The gene sequence of E2 contained four exons and three introns. The sum of the four exons gave an open reading frame sequence of 444 nucleotides encoding 147 amino acids, which contained a conserved ubiquitin-activating domain (GSICLDIL), ubiquitin-conjugating domains (RIYHPNIN, KVLLSICSLL, DDPLV) and ubiquitin-ligase (E3) recognition sites (KRI, YPF, WSP). Real-time PCR analysis of transcription levels of E1 and E2 under heat shock conditions (28°C and 32°C) showed that in wild type, transcription of E1 and E2 was up-regulated at 28°C, while at 32°C, transcription of the two enzymes was below the normal level. In cultivar 981 and cultivar 07-2 of G. lemaneiformis, the transcription levels of the two enzymes were up-regulated at 32°C, and the transcription level in cultivar 07-2 was even higher than that in cultivar 981. These results suggest that the UPP plays an important role in high temperature resistance of G. lemaneiformis and that the bioactivity of the UPP is directly related to the heat-resistant ability of G. lemaneiformis. Copyright © 2013 Elsevier B.V. All rights reserved.
Bulau, Patrick; Okuno, Atsuro; Thome, Elke; Schmitz, Tina; Peter-Katalinic, Jasna; Keller, Rainer
2005-11-01
The structure of the precursor of a molt-inhibiting hormone (MIH) of the American crayfish, Orconectes limosus, was determined by cloning of a cDNA based on RNA from the neurosecretory perikarya of the X-organ in the eyestalk ganglia. The open reading frame includes the complete precursor sequence, consisting of a signal peptide of 29 amino acids and the MIH sequence of 77 amino acids. In addition, the mature peptide was isolated by HPLC from the neurohemal sinus gland and analyzed by ESI-MS and MALDI-TOF-MS peptide mapping. This showed that the mature peptide (mass 8664.29 Da) consists of only 75 amino acids, having Ala75-NH2 as C-terminus. Thus, C-terminal Arg77 of the precursor is removed during processing, and Gly76 serves as an amide donor. Sequence comparison confirms this peptide as a novel member of the large family that includes crustacean hyperglycaemic hormone (CHH), MIH and gonad (vitellogenesis)-inhibiting hormone (GIH/VIH). The lack of a CPRP (CHH-precursor related peptide) in the hormone precursor, the size and specific sequence characteristics show that Orl MIH belongs to the MIH/GIH(VIH) subgroup of this larger family. Comparison with the MIH of Procambarus clarkii, the only other MIH that has thus far been identified in freshwater crayfish, shows extremely high sequence conservation. The two MIHs differ in only one amino acid residue (approximately 99% identity), whereas the sequence identity to several other known MIHs is between 40 and 46%.
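The "approximately 99% identity" figure follows from one mismatch in an ungapped alignment of this length. The sketch below is a minimal illustration that assumes identity is computed over the 75-residue mature peptide; the abstract does not say which length was used, and the sequences here are placeholders, not the real MIH sequences.

```python
def percent_identity(seq_a, seq_b):
    """Percent identical positions between two equal-length aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Placeholder sequences: 75 residues differing at a single position
mih_a = "A" * 75
mih_b = "A" * 74 + "G"
identity = percent_identity(mih_a, mih_b)  # 74 of 75 positions match
```

Using the 77-residue precursor-derived sequence instead would give 76 of 77 matching positions, which also rounds to about 99%.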
Formighieri, Eduardo F; Tiburcio, Ricardo A; Armas, Eduardo D; Medrano, Francisco J; Shimo, Hugo; Carels, Nicolas; Góes-Neto, Aristóteles; Cotomacci, Carolina; Carazzolle, Marcelo F; Sardinha-Pinto, Naiara; Thomazella, Daniela P T; Rincones, Johana; Digiampietri, Luciano; Carraro, Dirce M; Azeredo-Espin, Ana M; Reis, Sérgio F; Deckmann, Ana C; Gramacho, Karina; Gonçalves, Marilda S; Moura Neto, José P; Barbosa, Luciana V; Meinhardt, Lyndel W; Cascardo, Júlio C M; Pereira, Gonçalo A G
2008-10-01
We present here the sequence of the mitochondrial genome of the basidiomycete phytopathogenic hemibiotrophic fungus Moniliophthora perniciosa, causal agent of the Witches' Broom Disease in Theobroma cacao. The DNA is a circular molecule of 109,103 base pairs, with 31.9% GC, and is the largest sequenced so far. This size is due essentially to the presence of numerous non-conserved hypothetical ORFs. It contains the 14 genes coding for proteins involved in the oxidative phosphorylation, the two rRNA genes, one ORF coding for a ribosomal protein (rps3), and a set of 26 tRNA genes that recognize codons for all amino acids. Seven homing endonucleases are located inside introns. Except atp8, all conserved known genes are in the same orientation. Phylogenetic analysis based on the cox genes agrees with the commonly accepted fungal taxonomy. An uncommon feature of this mitochondrial genome is the presence of a region that contains a set of four, relatively small, nested, inverted repeats enclosing two genes coding for polymerases with an invertron-type structure and three conserved hypothetical genes interpreted as the stable integration of a mitochondrial linear plasmid. The integration of this plasmid seems to be a recent evolutionary event that could have implications in fungal biology. This sequence is available under GenBank accession number AY376688.
Rapid comparison of protein binding site surfaces with Property Encoded Shape Distributions (PESD)
Das, Sourav; Kokardekar, Arshad
2009-01-01
Patterns in shape and property distributions on the surface of binding sites are often conserved across functional proteins without significant conservation of the underlying amino-acid residues. To explore similarities of these sites from the viewpoint of a ligand, a sequence and fold-independent method was created to rapidly and accurately compare binding sites of proteins represented by property-mapped triangulated Gauss-Connolly surfaces. Within this paradigm, signatures for each binding site surface are produced by calculating their property-encoded shape distributions (PESD), a measure of the probability that a particular property will be at a specific distance to another on the molecular surface. Similarity between the signatures can then be treated as a measure of similarity between binding sites. As postulated, the PESD method rapidly detected high levels of similarity in binding site surface characteristics even in cases where there was very low similarity at the sequence level. In a screening experiment involving each member of the PDBBind 2005 dataset as a query against the rest of the set, PESD was able to retrieve a binding site with identical E.C. (Enzyme Commission) numbers as the top match in 79.5% of cases. The ability of the method to detect similarity in binding sites with low sequence conservation was compared with that of state-of-the-art binding site comparison methods. PMID:19919089
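The idea of a property-encoded shape distribution can be sketched as a normalized histogram of pairwise distances between labelled surface points. The code below is only a toy illustration under several assumptions not taken from the abstract: points are given directly rather than sampled from a triangulated surface, the bin width is 1 Å, signatures are compared with an L1 distance, and all names and property labels are hypothetical.

```python
from collections import Counter
from itertools import combinations
import math


def pesd_signature(points, bin_width=1.0):
    """Build a signature from labelled surface points.

    points: list of ((x, y, z), property_label).
    Returns a dict mapping (sorted label pair, distance bin) to the
    fraction of point pairs falling in that cell.
    """
    counts = Counter()
    for (p, a), (q, b) in combinations(points, 2):
        d = math.dist(p, q)  # Euclidean distance (Python 3.8+)
        key = (tuple(sorted((a, b))), int(d // bin_width))
        counts[key] += 1
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()} if total else {}


def signature_distance(sig_a, sig_b):
    """L1 distance between two signatures; 0.0 means identical histograms."""
    keys = set(sig_a) | set(sig_b)
    return sum(abs(sig_a.get(k, 0.0) - sig_b.get(k, 0.0)) for k in keys)


# Toy sites: site_b is site_a translated, site_c has different properties
site_a = [((0, 0, 0), "donor"), ((2, 0, 0), "acceptor"), ((0, 3, 0), "hydrophobic")]
site_b = [((5, 5, 5), "donor"), ((7, 5, 5), "acceptor"), ((5, 8, 5), "hydrophobic")]
site_c = [((0, 0, 0), "donor"), ((2, 0, 0), "donor"), ((0, 3, 0), "donor")]
sig_a, sig_b, sig_c = (pesd_signature(s) for s in (site_a, site_b, site_c))
```

Because only inter-point distances and labels enter the signature, the comparison is unaffected by rigid translation or rotation of a site, which is why no structural alignment is needed in this kind of scheme.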
Wen, Shijie; Liu, Hao; Li, Xingyu; Chen, Xiaoping; Hong, Yanbin; Li, Haifen; Lu, Qing; Liang, Xuanqiang
2018-05-01
This is the first creation of high oleic acid peanut varieties using transcription activator-like effector nuclease (TALEN)-mediated targeted mutagenesis of Fatty Acid Desaturase 2 (FAD2). Transcription activator-like effector nucleases (TALENs), which allow the precise editing of DNA, have already been developed and applied for genome engineering in diverse organisms. However, they are scarcely used in higher plant studies and crop improvement, especially in allopolyploid plants. In the present study, we aimed to create targeted mutagenesis by TALENs in peanut. Targeted mutations in the conserved coding sequence of Arachis hypogaea fatty acid desaturase 2 (AhFAD2) were created by TALENs. Genetic stability of AhFAD2 mutations was identified by DNA sequencing in up to 9.52 and 4.11% of the regenerated plants at two different targeted sites, respectively. Mutation frequencies among AhFAD2 mutant lines were significantly correlated with oleic acid accumulation. Genetically stable individuals of positive mutant lines displayed a 0.5-2 fold increase in oleic acid content compared with non-transgenic controls. This finding suggests that TALEN-mediated targeted mutagenesis could increase the oleic acid content in edible peanut oil. Furthermore, this is the first report of a peanut genome editing event, and the high oleic mutants obtained could serve peanut breeding projects.
Kudryavtseva, A A; Osetrova, M S; Livinyuk, V Ya; Manukhov, I V; Zavilgelsky, G B
2017-01-01
Antirestriction proteins of the ArdB/KlcA family are specific inhibitors of restriction (endonuclease) activity of type-I restriction/modification enzymes. The effect of conserved amino acid residues on the antirestriction activity of the ArdB protein encoded by the transmissible R64 (IncI1) plasmid has been investigated. An analysis of the amino acid sequences of ArdB homologues demonstrated the presence of four groups of conserved residues ((1) R16, E32, and W51; (2) Y46 and G48; (3) S81, D83 and E132, and (4) N77, L(I)140, and D141) on the surface of the protein globule. Amino acid residues of the fourth group showed a unique localization pattern with the terminal residue protruding beyond the globule surface. The replacement of two conserved amino acids (D141 and N77) located in the close vicinity of each other on the globule surface showed that the C-terminal D141 is essential for the antirestriction activity of ArdB. The deletion of this residue, as well as replacement by a hydrophobic threonine residue (D141T), completely abolished the antirestriction activity of ArdB. The synonymous replacement of D141 by a glutamic acid residue (D141E) caused an approximately 30-fold decrease of the antirestriction activity of ArdB, and the point mutation N77A caused an approximately 20-fold decrease in activity. The residues D141 and N77 located on the surface of the protein globule are presumably essential for the formation of a contact between ArdB and a currently unknown factor that modulates the activity of type-I restriction/modification enzymes.
Zheng, Lu-Lu; Niu, Shen; Hao, Pei; Feng, KaiYan; Cai, Yu-Dong; Li, Yixue
2011-01-01
Pyrrolidone carboxylic acid (PCA) is formed during a common post-translational modification (PTM) of extracellular and multi-pass membrane proteins. In this study, we developed a new predictor of PCA modification sites based on maximum relevance minimum redundancy (mRMR) and incremental feature selection (IFS). We incorporated 727 features belonging to 7 kinds of protein properties to predict the modification sites, including sequence conservation, residual disorder, amino acid factor, secondary structure and solvent accessibility, gain/loss of amino acid during evolution, propensity of amino acid to be conserved at protein-protein interface and protein surface, and deviation of side chain carbon atom number. Among these 727 features, 244 features were selected by mRMR and IFS as the optimized features for the prediction, with which the prediction model achieved a maximum MCC of 0.7812. Feature analysis showed that all feature types contributed to the modification process. Further site-specific feature analysis showed that the features derived from PCA's surrounding sites contributed more to the determination of PCA sites than other sites. The detailed feature analysis in this paper might provide important clues for understanding the mechanism of PCA formation and guide relevant experimental validations. PMID:22174779
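The MCC reported here is the Matthews correlation coefficient, which summarizes a binary confusion matrix in a single value between -1 and 1. The sketch below uses the standard formula; the counts in the example are invented for illustration and are not taken from the study, and the function name is hypothetical.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for a site predictor evaluated on 100 candidate sites
example_mcc = matthews_corrcoef(tp=40, tn=45, fp=5, fn=10)
```

In an incremental feature selection loop, a score like this would be computed for each candidate feature subset and the subset with the highest value retained.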
Ren, Siyuan; Yang, Guang; He, Youyu; Wang, Yiguo; Li, Yixue; Chen, Zhengjun
2008-10-01
Many well-represented domains recognize primary sequences usually less than 10 amino acids in length, called Short Linear Motifs (SLiMs). Accurate prediction of SLiMs has been difficult because they are short (often < 10 amino acids) and highly degenerate. In this study, we combined scoring matrices derived from peptide libraries and conservation analysis to identify protein classes enriched in functional SLiMs recognized by SH2, SH3, PDZ and S/T kinase domains. Our combined approach revealed that SLiMs are highly conserved in proteins from functional classes that are known to interact with a specific domain, but that they are not conserved in most other protein groups. We found that SLiMs recognized by SH2 domains were highly conserved in receptor kinases/phosphatases, adaptor molecules, and tyrosine kinases/phosphatases, that SLiMs recognized by SH3 domains were highly conserved in cytoskeletal and cytoskeletal-associated proteins, that SLiMs recognized by PDZ domains were highly conserved in membrane proteins such as channels and receptors, and that SLiMs recognized by S/T kinase domains were highly conserved in adaptor molecules, S/T kinases/phosphatases, and proteins involved in transcription or cell cycle control. We studied Tyr-SLiMs recognized by SH2 domains in more detail, and found that SH2-recognized Tyr-SLiMs on the cytoplasmic side of membrane proteins are more highly conserved than those on the extra-cellular side. Also, we found that SH2-recognized Tyr-SLiMs that are associated with SH3 motifs and a tyrosine kinase phosphorylation motif are more highly conserved. The interactome of protein domains is reflected by the evolutionary conservation of SLiMs recognized by these domains. By combining scoring matrices derived from peptide libraries with conservation analysis, we can find those protein groups that are more likely to interact with specific domains.
Ren, Siyuan; Yang, Guang; He, Youyu; Wang, Yiguo; Li, Yixue; Chen, Zhengjun
2008-01-01
Background Many well-represented domains recognize primary sequences usually less than 10 amino acids in length, called Short Linear Motifs (SLiMs). Accurate prediction of SLiMs has been difficult because they are short (often < 10 amino acids) and highly degenerate. In this study, we combined scoring matrices derived from peptide libraries and conservation analysis to identify protein classes enriched in functional SLiMs recognized by SH2, SH3, PDZ and S/T kinase domains. Results Our combined approach revealed that SLiMs are highly conserved in proteins from functional classes that are known to interact with a specific domain, but that they are not conserved in most other protein groups. We found that SLiMs recognized by SH2 domains were highly conserved in receptor kinases/phosphatases, adaptor molecules, and tyrosine kinases/phosphatases, that SLiMs recognized by SH3 domains were highly conserved in cytoskeletal and cytoskeletal-associated proteins, that SLiMs recognized by PDZ domains were highly conserved in membrane proteins such as channels and receptors, and that SLiMs recognized by S/T kinase domains were highly conserved in adaptor molecules, S/T kinases/phosphatases, and proteins involved in transcription or cell cycle control. We studied Tyr-SLiMs recognized by SH2 domains in more detail, and found that SH2-recognized Tyr-SLiMs on the cytoplasmic side of membrane proteins are more highly conserved than those on the extra-cellular side. Also, we found that SH2-recognized Tyr-SLiMs that are associated with SH3 motifs and a tyrosine kinase phosphorylation motif are more highly conserved. Conclusion The interactome of protein domains is reflected by the evolutionary conservation of SLiMs recognized by these domains. By combining scoring matrices derived from peptide libraries with conservation analysis, we can find those protein groups that are more likely to interact with specific domains. PMID:18828911
Agarwal, Pragati; Singh, Jyoti; Singh, R P
2017-05-01
Aspergillus niger PA2, a novel strain isolated from waste effluents of the food industry, is a potential extracellular tyrosinase producer. Enzyme activity and L-DOPA production were maximal when glucose and peptone were employed as the carbon and nitrogen sources, respectively, in the medium, and were enhanced notably when copper was supplemented, indicating the significance of copper in tyrosinase activity. The tyrosinase-encoding gene from the fungus was cloned; amplification of the tyrosinase gene yielded a 1127-bp DNA fragment encoding a 374 amino acid product, a predicted protein of 42.3 kDa with an isoelectric point of 4.8. Primary sequence analysis of A. niger PA2 tyrosinase showed approximately 99% identity with that of A. niger CBS 513.88, which was further confirmed by phylogenetic analysis. The inferred amino acid sequence of A. niger tyrosinase contained two putative copper-binding sites comprising six histidines, a characteristic feature of type-3 copper proteins, which were highly conserved in all tyrosinases throughout the Aspergillus species. When superimposed onto the tertiary structure of A. oryzae tyrosinase, the conserved residues from both organisms occupied the same spatial positions to provide a di-copper-binding peptide groove.
Prohibitin-2 gene reveals sex-related differences in the salmon louse Caligus rogercresseyi.
Farlora, Rodolfo; Nuñez-Acuña, Gustavo; Gallardo-Escárate, Cristian
2015-06-10
Prohibitins are evolutionarily conserved proteins present in multiple cellular compartments, and are involved in diverse cellular processes, including steroid hormone transcription and gametogenesis. In the present study, we report for the first time the characterization of the prohibitin-2 (Phb2) gene in the sea louse Caligus rogercresseyi. The CrPhb2 cDNA had a total length of 1406 bp and contained a predicted open reading frame (ORF) of 894 base pairs (bp) encoding 298 amino acids. Multiple sequence alignments of prohibitin proteins from other arthropods revealed a high degree of amino acid sequence conservation. In silico Illumina read counts and RT-qPCR analyses showed a sex-dependent differential expression, with mRNA levels exhibiting a 1.7-fold (RT-qPCR) increase in adult females compared with adult males. A total of nine single nucleotide polymorphisms (SNPs) were identified: three were located in the 5' UTR of the Phb2 messenger and six in the ORF, but no mutations associated with sex were found. These results expand the present knowledge of the reproduction-related genes in C. rogercresseyi, and may be useful in future experiments aimed at controlling the impacts of sea lice in fish farming. Copyright © 2015 Elsevier B.V. All rights reserved.
Palma, Leopoldo; Muñoz, Delia; Berry, Colin; Murillo, Jesús; Ruiz de Escudero, Iñigo; Caballero, Primitivo
2014-01-01
This study describes the insecticidal activity of a novel Bacillus thuringiensis Cry-related protein with a deduced 799 amino acid sequence (~89 kDa) and ~19% pairwise identity to the 95-kDa-aphidicidal protein (sequence number 204) from patent US 8318900 and ~40% pairwise identity to the cancer cell killing Cry proteins (parasporins Cry41Ab1 and Cry41Aa1), respectively. This novel Cry-related protein contained the five conserved amino acid blocks and the three conserved domains commonly found in 3-domain Cry proteins. The protein exhibited toxic activity against the green peach aphid, Myzus persicae (Sulzer) (Homoptera: Aphididae) with the lowest mean lethal concentration (LC50 = 32.7 μg/mL) reported to date for a given Cry protein and this insect species, whereas it had no lethal toxicity against the Lepidoptera of the family Noctuidae Helicoverpa armigera (Hübner), Mamestra brassicae (L.), Spodoptera exigua (Hübner), S. frugiperda (J.E. Smith) and S. littoralis (Boisduval), at concentrations as high as ~3.5 μg/cm2. This novel Cry-related protein may become a promising environmentally friendly tool for the biological control of M. persicae and possibly also for other sap sucking insect pests. PMID:25384108
Forster, Karine M; Hartwig, Daiane D; Seixas, Fabiana K; Bacelo, Kátia L; Amaral, Marta; Hartleben, Cláudia P; Dellagostin, Odir A
2013-05-01
The leptospiral immunoglobulin-like (Lig) proteins LigA and LigB possess immunoglobulin-like domains with 90-amino-acid repeats and are adhesion molecules involved in pathogenicity. They are conserved in pathogenic Leptospira spp. and thus are of interest for use as serodiagnostic antigens and in recombinant vaccine formulations. The N-terminal amino acid sequences of the LigA and LigB proteins are identical, but the C-terminal sequences vary. In this study, we evaluated the protective potential of five truncated forms of LigA and LigB proteins from Leptospira interrogans serovar Canicola as DNA vaccines using the pTARGET mammalian expression vector. Hamsters immunized with the DNA vaccines were subjected to a heterologous challenge with L. interrogans serovar Copenhageni strain Spool via the intraperitoneal route. Immunization with a DNA vaccine encoding LigBrep resulted in the survival of 5/8 (62.5%) hamsters against lethal infection (P < 0.05). None of the control hamsters or animals immunized with the other vaccine preparations survived. The vaccine induced an IgG antibody response and, additionally, conferred sterilizing immunity in 80% of the surviving animals. Our results indicate that the LigBrep DNA vaccine is a promising candidate for inclusion in a protective leptospiral vaccine.
PMID:23486420
Lashbrook, C C; Gonzalez-Bosch, C; Bennett, A B
1994-01-01
Two structurally divergent endo-beta-1,4-glucanase (EGase) cDNAs were cloned from tomato. Although both cDNAs (Cel1 and Cel2) encode potentially glycosylated, basic proteins of 51 to 53 kD and possess multiple amino acid domains conserved in both plant and microbial EGases, Cel1 and Cel2 exhibit only 50% amino acid identity at the overall sequence level. Amino acid sequence comparisons to other plant EGases indicate that tomato Cel1 is most similar to bean abscission zone EGase (68%), whereas Cel2 exhibits greatest sequence identity to avocado fruit EGase (57%). Sequence comparisons suggest the presence of at least two structurally divergent EGase families in plants. Unlike ripening avocado fruit and bean abscission zones in which a single EGase mRNA predominates, EGase expression in tomato reflects the overlapping accumulation of both Cel1 and Cel2 transcripts in ripening fruit and in plant organs undergoing cell separation. Cel1 mRNA contributes significantly to total EGase mRNA accumulation within plant organs undergoing cell separation (abscission zones and mature anthers), whereas Cel2 mRNA is most abundant in ripening fruit. The overlapping expression of divergent EGase genes within a single species may suggest that multiple activities are required for the cooperative disassembly of cell wall components during fruit ripening, floral abscission, and anther dehiscence. PMID:7994180
Degenerative minimalism in the genome of a psyllid endosymbiont.
Clark, M A; Baumann, L; Thao, M L; Moran, N A; Baumann, P
2001-03-01
Psyllids, like aphids, feed on plant phloem sap and are obligately associated with prokaryotic endosymbionts acquired through vertical transmission from an ancestral infection. We have sequenced 37 kb of DNA of the genome of Carsonella ruddii, the endosymbiont of psyllids, and found that it has a number of unusual properties revealing a more extreme case of degeneration than was previously reported from studies of eubacterial genomes, including that of the aphid endosymbiont Buchnera aphidicola. Among the unusual properties are an exceptionally low guanine-plus-cytosine content (19.9%), almost complete absence of intergenic spaces, operon fusion, and lack of the usual promoter sequences upstream of 16S rDNA. These features suggest the synthesis of long mRNAs and translational coupling. The most extreme instances of base compositional bias occur in the genes encoding proteins that have less highly conserved amino acid sequences; the guanine-plus-cytosine content of some protein-coding sequences is as low as 10%. The shift in base composition has a large effect on proteins: in polypeptides of C. ruddii, half of the residues consist of five amino acids with codons low in guanine plus cytosine. Furthermore, the proteins of C. ruddii are reduced in size, with an average of about 9% fewer amino acids than in homologous proteins of related bacteria. These observations suggest that the C. ruddii genome is not subject to constraints that limit the evolution of other known eubacteria.
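The guanine-plus-cytosine percentages quoted in this abstract are plain base-composition fractions. A minimal sketch of that calculation follows; the function name `gc_content` and the example fragment are illustrative assumptions, not data from the paper.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C among the A/C/G/T bases in seq."""
    seq = seq.upper()
    # Ignore gaps and ambiguity codes so they do not dilute the fraction.
    counted = [base for base in seq if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(1 for base in counted if base in "GC")
    return gc / len(counted)


# Hypothetical AT-rich fragment (20 bases, one G and one C); not from the paper.
example = "ATTATAAATGTTTAAATCAA"
print(f"{gc_content(example):.1%}")  # 10.0%
```

Run over a whole gene or contig, the same function would give a figure comparable to the 19.9% and 10% values reported, assuming the same convention of counting only unambiguous bases.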
Pfeiffer, M; Klein, A; Steinert, P; Schomburg, D
The 25 amino acid long subunit VhuU of the F420-non-reducing hydrogenase from Methanococcus voltae contains selenocysteine within the consensus sequence of known [NiFe] hydrogenases DP(C or U)CxxCxxH (U = selenocysteine). The sulfur-analogue VhuUc was chemically synthesized and purified, and its metal binding capability, catalytic properties, and structural features were investigated. The polypeptide was able to bind nickel, but did not catalyse the heterolytic activation of H2. 2D-NMR spectroscopy revealed an alpha-helical secondary structure for the 15 N-terminal amino acids in 50% TFE. Nickel only binds to the C-terminus, which contains the conserved amino acid motif. Structures derived from the NMR data are compatible with the participation of both sulfur atoms from the conserved cysteine residues in metal ion binding. Structures obtained from the data sets for Ni.VhuUc as well as Zn.VhuUc showed no further ligands. The informational value for Ni.VhuUc was low due to paramagnetism.
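The consensus DP(C or U)CxxCxxH can be written as a regular expression, which makes it easy to scan a protein sequence for the motif. The sketch below assumes that x stands for any single residue; the peptide sequences and the name `find_motif` are hypothetical and are not the real VhuU sequence.

```python
import re

# DP(C or U)CxxCxxH: U is selenocysteine, "." matches any one residue.
NIFE_MOTIF = re.compile(r"DP[CU]C..C..H")


def find_motif(protein: str):
    """Return (1-based start, matched text) for each motif occurrence."""
    return [(m.start() + 1, m.group())
            for m in NIFE_MOTIF.finditer(protein.upper())]


# Hypothetical 25-residue peptides for illustration only.
seleno = "MKLAEELLKKAGVDPUCAACAAHAA"
sulfur = seleno.replace("U", "C")  # sulfur analogue: U replaced by C

print(find_motif(seleno))  # [(14, 'DPUCAACAAH')]
print(find_motif(sulfur))  # [(14, 'DPCCAACAAH')]
```

The character class `[CU]` is what lets one pattern cover both the selenocysteine form and a cysteine-only analogue such as VhuUc.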
Cytochrome oxidase subunit II gene in mitochondria of Oenothera has no intron
Hiesel, Rudolf; Brennicke, Axel
1983-01-01
The cytochrome oxidase subunit II gene has been localized in the mitochondrial genome of Oenothera berteriana and the nucleotide sequence has been determined. The coding sequence contains 777 bp and, unlike the corresponding gene in Zea mays, is not interrupted by an intron. No TGA codon is found within the open reading frame. The codon CGG, as in the maize gene, is used in place of tryptophan codons of corresponding genes in other organisms. At position 742 in the Oenothera sequence the TGG of maize is changed into a CGG codon, where Trp is conserved as the amino acid in other organisms. Homologous sequences occur more than once in the mitochondrial genome as several mitochondrial DNA species hybridize with DNA probes of the cytochrome oxidase subunit II gene. PMID:16453484
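The effect of reading CGG as tryptophan can be illustrated with a small translation routine that takes the codon table as a parameter. This is only a sketch: the table named `PLANT_MITO` encodes the abstract's reading of CGG and nothing else, and the toy sequence is invented.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order (third base varies fastest).
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
STANDARD = {"".join(codon): aa
            for codon, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumption for illustration: CGG read as Trp, as the abstract describes.
PLANT_MITO = {**STANDARD, "CGG": "W"}


def translate(dna: str, table: dict) -> str:
    """Translate complete codons of dna with the given codon table."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # drop any trailing partial codon
    return "".join(table[dna[i:i + 3]] for i in range(0, usable, 3))


toy = "ATGCGGTGGTGA"  # invented sequence: ATG CGG TGG TGA
print(translate(toy, STANDARD))    # MRW*
print(translate(toy, PLANT_MITO))  # MWW*
print(divmod(777, 3))              # (259, 0): 777 bp is a whole number of codons
```

Passing the table in explicitly keeps the standard code untouched and makes the single reassigned codon easy to see.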
Comparative Analysis and Distribution of Omega-3 lcPUFA Biosynthesis Genes in Marine Molluscs
Surm, Joachim M.; Prentis, Peter J.; Pavasovic, Ana
2015-01-01
Recent research has identified marine molluscs as an excellent source of omega-3 long-chain polyunsaturated fatty acids (lcPUFAs), based on their potential for endogenous synthesis of lcPUFAs. In this study we generated a representative list of fatty acyl desaturase (Fad) and elongation of very long-chain fatty acid (Elovl) genes from major orders of Phylum Mollusca, through the interrogation of transcriptome and genome sequences, and various publicly available databases. We have identified novel and uncharacterised Fad and Elovl sequences in the following species: Anadara trapezia, Nerita albicilla, Nerita melanotragus, Crassostrea gigas, Lottia gigantea, Aplysia californica, Loligo pealeii and Chlamys farreri. Based on alignments of translated protein sequences of Fad and Elovl genes, the haeme binding motif and histidine boxes of Fad proteins, and the histidine box and seventeen important amino acids in Elovl proteins, were highly conserved. Phylogenetic analysis of aligned reference sequences was used to reconstruct the evolutionary relationships for Fad and Elovl genes separately. Multiple, well resolved clades for both the Fad and Elovl sequences were observed, suggesting that repeated rounds of gene duplication best explain the distribution of Fad and Elovl proteins across the major orders of molluscs. For Elovl sequences, one clade contained the functionally characterised Elovl5 proteins, while another clade contained proteins hypothesised to have Elovl4 function. Additional well resolved clades consisted only of uncharacterised Elovl sequences. One clade from the Fad phylogeny contained only uncharacterised proteins, while the other clade contained functionally characterised delta-5 desaturase proteins. The discovery of an uncharacterised Fad clade is particularly interesting as these divergent proteins may have novel functions. 
Overall, this paper presents a number of novel Fad and Elovl genes suggesting that many mollusc groups possess most of the required enzymes for the synthesis of lcPUFAs. PMID:26308548
An atypical topoisomerase II sequence from the slime mold Physarum polycephalum.
Hugodot, Yannick; Dutertre, Murielle; Duguet, Michel
2004-01-21
We have determined the complete nucleotide sequence of the cDNA encoding DNA topoisomerase II from Physarum polycephalum. Using degenerate primers, based on the conserved amino acid sequences of other eukaryotic enzymes, a 250-bp fragment was amplified by polymerase chain reaction (PCR). This fragment was used as a probe to screen a Physarum cDNA library. A partial cDNA clone was isolated that was truncated at the 3' end. Rapid amplification of cDNA ends (RACE)-PCR was employed to isolate the remaining portion of the gene. The complete sequence of 4613 bp contains an open reading frame of 4494 bp that codes for 1498 amino acid residues with a theoretical molecular weight of 167 kDa. The predicted amino acid sequence shares similarity with those of other eukaryotes and shows the highest degree of identity with the enzyme of Dictyostelium discoideum. However, the enzyme of P. polycephalum contains an atypical amino-terminal domain very rich in serine and proline, whose function is unknown. Remarkably, both a mitochondrial targeting sequence and a nuclear localization signal were predicted, at the amino and carboxy termini of the protein, respectively, as in the case of human topoisomerase III alpha. At the Physarum genomic level, the topoisomerase II gene encompasses a region of about 16 kbp, suggesting a large proportion of intronic sequences, an unusual situation for a gene of a lower eukaryote, often free of introns. Finally, expression of topoisomerase II mRNA does not appear significantly dependent on the plasmodium cycle stage, possibly due to the lack of G1 phase and/or to a mitochondrial localization of the enzyme.
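The length and mass figures in this abstract can be cross-checked with simple arithmetic. The sketch below makes two assumptions: that the 4494 bp count sense codons only (the abstract's numbers match under that reading), and that an average residue weighs about 110 Da, a common rule of thumb rather than a value from the paper.

```python
def orf_summary(orf_bp: int, avg_residue_da: float = 110.0) -> dict:
    """Summarise an ORF length as a codon count and a rough protein mass.

    avg_residue_da is a rule-of-thumb average (an assumption), so the
    mass is only an order-of-magnitude check.
    """
    codons, leftover = divmod(orf_bp, 3)
    if leftover:
        raise ValueError("ORF length is not a multiple of three")
    return {"codons": codons, "approx_kda": codons * avg_residue_da / 1000}


summary = orf_summary(4494)
print(summary["codons"])                # 1498, matching the reported residues
print(round(summary["approx_kda"], 1))  # 164.8, close to the reported 167 kDa
print(4613 - 4494)                      # 119 bp of the cDNA lie outside the ORF
```

The small gap between the estimate and 167 kDa is expected, since the actual mass depends on the amino acid composition.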
Deppenmeier, U; Blaut, M; Lentes, S; Herzberg, C; Gottschalk, G
1995-01-15
DNA encompassing the structural genes of two membrane-bound hydrogenases from Methanosarcina mazei Gö1 was cloned and sequenced. The genes, arranged in the order vhoG and vhoA as well as vhtG and vhtA, were identified as those encoding the small and the large subunits of the NiFe hydrogenases [Deppenmeier, U., Blaut, M., Schmidt, B. & Gottschalk, G. (1992) Arch. Microbiol. 157, 505-511]. Northern-blot analysis revealed that the structural genes formed part of two operons, both containing one additional open reading frame (vhoC and vhtC) which codes for a cytochrome b. This conclusion was drawn from the homology of the deduced N-terminal amino acid sequences of vhoC and vhtC and the N-terminus of a 27-kDa cytochrome isolated from Ms. mazei C16. VhoC and VhtC contain four tentative hydrophobic segments which might span the cytoplasmic membrane. Hydropathy plots suggest that His23 and His50 are involved in heme coordination. The comparison of the sequencing data of vhoG and vhtG with the experimentally determined N-terminus of the small subunit indicates the presence of a 48-amino-acid leader peptide in front of the polypeptides. VhoA and VhtA contained the conserved sequence DPCXXC in the C-terminal region, which excludes the presence of a selenocysteine residue in these hydrogenases. Promoter sequences were found upstream of vhoG and vhtG, respectively. Downstream of vhoC, a putative terminator sequence was identified. Alignments of the deduced amino acid sequences of the gene clusters vhoGAC and vhtGAC showed 92-97% identity. Only the C-termini of VhoC and VhtC were not similar.
Bryant, Kevin F; Yan, Zhipeng; Dreyfus, David H; Knipe, David M
2012-06-01
Herpes simplex virus 1 (HSV-1) ICP8 is a single-stranded DNA-binding protein that is necessary for viral DNA replication and exhibits recombinase activity in vitro. Alignment of the HSV-1 ICP8 amino acid sequence with ICP8 homologs from other herpesviruses revealed conserved aspartic acid (D) and glutamic acid (E) residues. Amino acid residue D1087 was conserved in every ICP8 homolog analyzed, indicating that it is likely critical for ICP8 function. We took a genetic approach to investigate the functions of the conserved ICP8 D and E residues in HSV-1 replication. The E1086A D1087A mutant form of ICP8 failed to support the replication of an ICP8 mutant virus in a complementation assay. E1086A D1087A mutant ICP8 bound DNA, albeit with reduced affinity, demonstrating that the protein is not globally misfolded. This mutant form of ICP8 was also recognized by a conformation-specific antibody, further indicating that its overall structure was intact. A recombinant virus expressing E1086A D1087A mutant ICP8 was defective in viral replication, viral DNA synthesis, and late gene expression in Vero cells. A class of enzymes called DDE recombinases utilize conserved D and E residues to coordinate divalent metal cations in their active sites. We investigated whether the conserved D and E residues in ICP8 were also required for binding metal cations and found that the E1086A D1087A mutant form of ICP8 exhibited altered divalent metal binding in an in vitro iron-induced cleavage assay. These results identify a novel divalent metal cation-binding site in ICP8 that is required for ICP8 functions during viral replication.
Kawakami, Ryushi; Sakuraba, Haruhiko; Ohshima, Toshihisa
2007-01-01
NAD-dependent l-glutamate dehydrogenase (NAD-GDH) activity was detected in cell extract from the psychrophile Janthinobacterium lividum UTB1302, which was isolated from cold soil and purified to homogeneity. The native enzyme (1,065 kDa, determined by gel filtration) is a homohexamer composed of 170-kDa subunits (determined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis). Consistent with these findings, gene cloning and sequencing enabled deduction of the amino acid sequence of the subunit, which proved to be comprised of 1,575 amino acids with a combined molecular mass of 169,360 Da. The enzyme from this psychrophile thus appears to belong to the GDH family characterized by very large subunits, like those expressed by Streptomyces clavuligerus and Pseudomonas aeruginosa (about 180 kDa). The entire amino acid sequence of the J. lividum enzyme showed about 40% identity with the sequences from S. clavuligerus and P. aeruginosa enzymes, but the central domains showed higher homology (about 65%). Within the central domain, the residues related to substrate and NAD binding were highly conserved, suggesting that this is the enzyme's catalytic domain. In the presence of NAD, but not in the presence of NADP, this GDH exclusively catalyzed the oxidative deamination of l-glutamate. The stereospecificity of the hydride transfer to NAD was pro-S, which is the same as that of the other known GDHs. Surprisingly, NAD-GDH activity was markedly enhanced by the addition of various amino acids, such as l-aspartate (1,735%) and l-arginine (936%), which strongly suggests that the N- and/or C-terminal domains play regulatory roles and are involved in the activation of the enzyme by these amino acids. PMID:17526698
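The homohexamer assignment follows from dividing the native mass by the subunit mass. A short worked check is sketched below using only figures quoted in the abstract; the variable and function names are illustrative.

```python
NATIVE_KDA = 1065         # native enzyme, gel filtration
SUBUNIT_SDS_KDA = 170     # subunit, SDS-PAGE
SUBUNIT_SEQ_DA = 169_360  # subunit, deduced from the sequence
RESIDUES = 1575           # residues per subunit


def subunit_count(native_kda: float, subunit_kda: float) -> int:
    """Estimate the number of subunits as the nearest whole ratio."""
    return round(native_kda / subunit_kda)


n = subunit_count(NATIVE_KDA, SUBUNIT_SDS_KDA)
hexamer_from_sequence_kda = 6 * SUBUNIT_SEQ_DA / 1000

print(n)                                    # 6
print(hexamer_from_sequence_kda)            # 1016.16
print(round(SUBUNIT_SEQ_DA / RESIDUES, 1))  # 107.5 Da per residue on average
```

The raw ratio is about 6.3, so rounding to six subunits is consistent with the reported homohexamer given the usual uncertainty of gel filtration.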
Comino, Cinzia; Lanteri, Sergio; Portis, Ezio; Acquadro, Alberto; Romani, Annalisa; Hehn, Alain; Larbat, Romain; Bourgaud, Frédéric
2007-01-01
Background Cynara cardunculus L. is an edible plant of pharmaceutical interest, in particular with respect to the polyphenolic content of its leaves. It includes three taxa: globe artichoke, cultivated cardoon, and wild cardoon. The dominating phenolics are the di-caffeoylquinic acids (such as cynarin), which are largely restricted to Cynara species, along with their precursor, chlorogenic acid (CGA). The scope of this study is to better understand CGA synthesis in this plant. Results A gene sequence encoding a hydroxycinnamoyltransferase (HCT) involved in the synthesis of CGA, was identified. Isolation of the gene sequence was achieved by using a PCR strategy with degenerated primers targeted to conserved regions of orthologous HCT sequences available. We have isolated a 717 bp cDNA which shares 84% aminoacid identity and 92% similarity with a tobacco gene responsible for the biosynthesis of CGA from p-coumaroyl-CoA and quinic acid. In silico studies revealed the globe artichoke HCT sequence clustering with one of the main acyltransferase groups (i.e. anthranilate N-hydroxycinnamoyl/benzoyltransferase). Heterologous expression of the full length HCT (GenBank accession DQ104740) cDNA in E. coli demonstrated that the recombinant enzyme efficiently synthesizes both chlorogenic acid and p-coumaroyl quinate from quinic acid and caffeoyl-CoA or p-coumaroyl-CoA, respectively, confirming its identity as a hydroxycinnamoyl-CoA: quinate HCT. Variable levels of HCT expression were shown among wild and cultivated forms of C. cardunculus subspecies. The level of expression was correlated with CGA content. Conclusion The data support the predicted involvement of the Cynara cardunculus HCT in the biosynthesis of CGA before and/or after the hydroxylation step of hydroxycinnamoyl esters. PMID:17374149
Liu, G Y; Gao, S Z
2009-01-01
The complete coding sequences of three sheep genes (BCKDHA, NAGA and HEXA) were amplified using the reverse transcriptase polymerase chain reaction (RT-PCR), based on the conserved sequence information of the mouse or other mammals. The nucleotide sequences of these three genes revealed that the sheep BCKDHA gene encodes a protein of 313 amino acids which has high homology with the BCKDHA gene that encodes a protein of 447 amino acids that has high homology with the branched chain keto acid dehydrogenase E1, alpha polypeptide (BCKDHA) of five species: chimpanzee (93%), human (96%), crab-eating macaque (93%), bovine (98%) and mouse (91%). The sheep NAGA gene encodes a protein of 411 amino acids that has high homology with the alpha-N-acetylgalactosaminidase (NAGA) of five species: human (85%), bovine (94%), mouse (91%), rat (83%) and chicken (74%). The sheep HEXA gene encodes a protein of 529 amino acids that has high homology with the hexosaminidase A (HEXA) of five species: bovine (98%), human (84%), Bornean orangutan (84%), rat (80%) and mouse (81%). Finally, these three novel sheep genes were assigned to GeneIDs 100145857, 100145858 and 100145856. The phylogenetic tree analysis revealed that the sheep BCKDHA, NAGA, and HEXA all have closer genetic relationships to the BCKDHA, NAGA, and HEXA of bovine. Tissue expression profile analysis was also carried out, and the results revealed that the sheep BCKDHA, NAGA and HEXA genes were differentially expressed in tissues including muscle, heart, liver, fat, kidney, lung, and small and large intestine. Our experiment is the first to establish the primary foundation for further research on these three sheep genes.
Sperm Bindin Divergence under Sexual Selection and Concerted Evolution in Sea Stars.
Patiño, Susana; Keever, Carson C; Sunday, Jennifer M; Popovic, Iva; Byrne, Maria; Hart, Michael W
2016-08-01
Selection associated with competition among males or sexual conflict between mates can create positive selection for high rates of molecular evolution of gamete recognition genes and lead to reproductive isolation between species. We analyzed coding sequence and repetitive domain variation in the gene encoding the sperm acrosomal protein bindin in 13 diverse sea star species. We found that bindin has a conserved coding sequence domain structure in all 13 species, with several repeated motifs in a large central region that is similar among all sea stars in organization but highly divergent among genera in nucleotide and predicted amino acid sequence. More bindin codons and lineages showed positive selection for high relative rates of amino acid substitution in genera with gonochoric outcrossing adults (and greater expected strength of sexual selection) than in selfing hermaphrodites. That difference is consistent with the expectation that selfing (a highly derived mating system) may moderate the strength of sexual selection and limit the accumulation of bindin amino acid differences. The results implicate both positive selection on single codons and concerted evolution within the repetitive region in bindin divergence, and suggest that both single amino acid differences and repeat differences may affect sperm-egg binding and reproductive compatibility. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Shahinyan, Grigor; Margaryan, Armine; Panosyan, Hovik; Trchounian, Armen
2017-05-02
Among the huge diversity of thermophilic bacteria, mainly bacilli have been reported as active producers of thermostable lipases. Geothermal springs serve as the main source for the isolation of thermostable lipase-producing bacilli. Thermostable lipolytic enzymes, which function under harsh conditions, have promising applications in the processing of organic chemicals, detergent formulation, synthesis of biosurfactants, and pharmaceutical processing, among others. In order to study the distribution of lipase-producing thermophilic bacilli and the primary structures of their lipases, three lipase producers from different genera were isolated from mesothermal (27.5-70 °C) springs distributed across the territory of Armenia and Nagorno Karabakh. Based on phenotypic characteristics and 16S rRNA gene sequencing, the isolates were identified as Geobacillus sp., Bacillus licheniformis and Anoxibacillus flavithermus strains. The lipase genes of the isolates were sequenced using initially designed primer sets. Multiple alignments generated from the primary structures of the lipase proteins and annotated lipase protein sequences, together with conserved region analysis and amino acid composition, illustrated the similarity (98-99%) of the lipases to true lipases (family I) and the GDSL esterase family (family II). A conserved sequence block that determines thermostability was identified in the multiple alignments of the lipase proteins. The results shed light on the distribution of lipase-producing bacilli in geothermal springs in Armenia and Nagorno Karabakh. The newly isolated bacilli strains could be a prospective source of thermostable lipases and their genes.
Matthews, R J; Cahir, E D; Thomas, M L
1990-01-01
Protein-tyrosine-phosphatases (protein-tyrosine-phosphate phosphohydrolase, EC 3.1.3.48) have been implicated in the regulation of cell growth; however, to date few tyrosine phosphatases have been characterized. To identify additional family members, the cDNA for the human tyrosine phosphatase leukocyte common antigen (LCA; CD45) was used to screen, under low stringency, a mouse pre-B-cell cDNA library. Two cDNA clones were isolated and sequence analysis predicts a protein sequence of 793 amino acids. We have named the molecule LRP (LCA-related phosphatase). RNA transfer analysis indicates that the cDNAs were derived from a 3.2-kilobase mRNA. The LRP mRNA is transcribed in a wide variety of tissues. The predicted protein structure can be divided into the following structural features: a short 19-amino acid leader sequence, an exterior domain of 123 amino acids that is predicted to be highly glycosylated, a 24-amino acid membrane-spanning region, and a 627-amino acid cytoplasmic region. The cytoplasmic region contains two approximately 260-amino acid domains, each with homology to the tyrosine phosphatase family. One of the cDNA clones differed in that it had a 108-base-pair insertion that, while preserving the reading frame, would disrupt the first protein-tyrosine-phosphatase domain. Analysis of genomic DNA indicates that the insertion is due to an alternatively spliced exon. LRP appears to be evolutionarily conserved as a putative homologue has been identified in the invertebrate Styela plicata. PMID:2162042
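Two numerical statements in this abstract can be verified directly: the four regions add up to the predicted protein length, and a 108-base-pair insertion preserves the reading frame because 108 is a multiple of three. The sketch below uses only the figures quoted above; the dictionary keys are illustrative labels.

```python
# Region lengths (amino acids) as listed in the abstract.
LRP_REGIONS = {
    "leader": 19,
    "exterior": 123,
    "membrane_spanning": 24,
    "cytoplasmic": 627,
}
PREDICTED_LENGTH = 793
INSERT_BP = 108

total = sum(LRP_REGIONS.values())
extra_codons, frame_shift = divmod(INSERT_BP, 3)

print(total, total == PREDICTED_LENGTH)  # 793 True
print(extra_codons, frame_shift)         # 36 0: in frame, 36 added codons
```

A remainder of zero is what "preserving the reading frame" means here: the insertion adds 36 codons without shifting the downstream codons.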
Polypeptide p41 of a Norwalk-Like Virus Is a Nucleic Acid-Independent Nucleoside Triphosphatase
Pfister, Thomas; Wimmer, Eckard
2001-01-01
Southampton virus (SHV) is a member of the Norwalk-like viruses (NLVs), one of four genera of the family Caliciviridae. The genome of SHV contains three open reading frames (ORFs). ORF 1 encodes a polyprotein that is autocatalytically processed into six proteins, one of which is p41. p41 shares sequence motifs with protein 2C of picornaviruses and superfamily 3 helicases. We have expressed p41 of SHV in bacteria. Purified p41 exhibited nucleoside triphosphate (NTP)-binding and NTP hydrolysis activities. The NTPase activity was not stimulated by single-stranded nucleic acids. SHV p41 had no detectable helicase activity. Protein sequence comparison between the consensus sequences of NLV p41 and enterovirus protein 2C revealed regions of high similarity. According to secondary structure prediction, the conserved regions were located within a putative central domain of alpha helices and beta strands. This study reveals for the first time an NTPase activity associated with a calicivirus-encoded protein. Based on enzymatic properties and sequence information, a functional relationship between NLV p41 and enterovirus 2C is discussed in regard to the role of 2C-like proteins in virus replication. PMID:11160659
Haseloff, J; Goelet, P; Zimmern, D; Ahlquist, P; Dasgupta, R; Kaesberg, P
1984-01-01
The plant viruses alfalfa mosaic virus (AMV) and brome mosaic virus (BMV) each divide their genetic information among three RNAs while tobacco mosaic virus (TMV) contains a single genomic RNA. Amino acid sequence comparisons suggest that the single proteins encoded by AMV RNA 1 and BMV RNA 1 and by AMV RNA 2 and BMV RNA 2 are related to the NH2-terminal two-thirds and the COOH-terminal one-third, respectively, of the largest protein encoded by TMV. Separating these two domains in the TMV RNA sequence is an amber termination codon, whose partial suppression allows translation of the downstream domain. Many of the residues that the TMV read-through domain and the segmented plant viruses have in common are also conserved in a read-through domain found in the nonstructural polyprotein of the animal alphaviruses Sindbis and Middelburg. We suggest that, despite substantial differences in gene organization and expression, all of these viruses use related proteins for common functions in RNA replication. Reassortment of functional modules of coding and regulatory sequence from preexisting viral or cellular sources, perhaps via RNA recombination, may be an important mechanism in RNA virus evolution. PMID:6611550
Fu, L; Hou, Y L; Ding, X; Du, Y J; Zhu, H Q; Zhang, N; Hou, W R
2016-08-30
The complementary DNA (cDNA) of the giant panda (Ailuropoda melanoleuca) ferritin light polypeptide (FTL) gene was successfully cloned using reverse transcription-polymerase chain reaction technology. We constructed a recombinant expression vector containing FTL cDNA and overexpressed it in Escherichia coli using pET28a plasmids. The expressed protein was then purified by nickel chelate affinity chromatography. The cloned cDNA fragment was 580 bp long and contained an open reading frame of 525 bp. The deduced protein sequence was composed of 175 amino acids and had an estimated molecular weight of 19.90 kDa, with an isoelectric point of 5.53. Topology prediction revealed one N-glycosylation site, two casein kinase II phosphorylation sites, one N-myristoylation site, two protein kinase C phosphorylation sites, and one cell attachment sequence. Alignment indicated that the nucleotide and deduced amino acid sequences are highly conserved across several mammals, including Homo sapiens, Cavia porcellus, Equus caballus, and Felis catus, among others. The FTL gene was readily expressed in E. coli, which gave rise to the accumulation of a polypeptide of the expected size (25.50 kDa, including an N-terminal polyhistidine tag).
Genomic Structure of the Luciferase Gene from the Bioluminescent Beetle, Nyctophila cf. Caucasica
Day, John C.; Chaichi, Mohammad J.; Najafil, Iraj; Whiteley, Andrew S.
2006-01-01
The gene coding for beetle luciferase, the enzyme responsible for bioluminescence in over two thousand coleopteran species has, to date, only been characterized from one Palearctic species of Lampyridae. Here we report the characterization of the luciferase gene from a female beetle of an Iranian lampyrid species, Nyctophila cf. caucasica (Coleoptera:Lampyridae). The luciferase gene was composed of seven exons, coding for 547 amino acids, separated by six introns spanning 1976 bp of genomic DNA. The deduced amino acid sequences of the luciferase gene of N. caucasica showed 98.9% homology to that of the Palearctic species Lampyris noctiluca. Analysis of the 810 bp upstream region of the luciferase gene revealed three TATA boxes and several other consensus transcriptional factor recognition sequences presenting evidence for a putative core promoter region conserved in Lampyrinae from -190 through to -155 upstream of the luciferase start codon. Along with the core promoter region the luciferase gene was compared with orthologous sequences from other lampyrid species and found to have greatest identity to Lampyris turkistanicus and Lampyris noctiluca. The significant sequence identity to the former is discussed in relation to taxonomic issues of Iranian lampyrids. PMID:20298115
In silico identification and analysis of phytoene synthase genes in plants.
Han, Y; Zheng, Q S; Wei, Y P; Chen, J; Liu, R; Wan, H J
2015-08-14
In this study, we examined phytoene synthetase (PSY), the first key limiting enzyme in the synthesis of carotenoids and catalyzing the formation of geranylgeranyl pyrophosphate in terpenoid biosynthesis. We used known amino acid sequences of the PSY gene in tomato plants to conduct a genome-wide search and identify putative candidates in 34 sequenced plants. A total of 101 homologous genes were identified. Phylogenetic analysis revealed that PSY evolved independently in algae as well as monocotyledonous and dicotyledonous plants. Our results showed that the amino acid structures exhibited 5 motifs (motifs 1 to 5) in algae, and that those in higher plants were highly conserved. The PSY gene structures showed that the number of introns in algae varied widely, while the number of introns in higher plants was 4 to 5. Identification of PSY genes in plants and the analysis of the gene structure may provide a theoretical basis for studying evolutionary relationships in future analyses.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Borziak, Kirill; Jouline, Igor B
2007-01-01
Motivation: Sensory domains that are conserved among Bacteria, Archaea and Eucarya are important detectors of common signals detected by living cells. Due to their high sequence divergence, sensory domains are difficult to identify. We systematically look for novel sensory domains using sensitive profile-based searches initiated with regions of signal transduction proteins where no known domains can be identified by current domain models. Results: Using profile searches followed by multiple sequence alignment, structure prediction, and domain architecture analysis, we have identified a novel sensory domain termed FIST, which is present in signal transduction proteins from Bacteria, Archaea and Eucarya. Remote similarity to a known ligand-binding fold and chromosomal proximity of FIST-encoding genes to those coding for proteins involved in amino acid metabolism and transport suggest that FIST domains bind small ligands, such as amino acids.
Expression and characterization of aiiA gene from Bacillus subtilis BS-1.
Pan, Jieru; Huang, Tianpei; Yao, Fan; Huang, Zhipeng; Powell, Charles A; Qiu, Sixin; Guan, Xiong
2008-01-01
AHL-lactonase (AiiA), a metallo-beta-lactamase produced by Bacillus thuringiensis, Bacillus cereus and Bacillus anthracis, specifically hydrolyzes N-acyl-homoserine lactones (AHLs) secreted by Gram-negative bacteria and thereby attenuates the symptoms caused by plant pathogens. In this study, an aiiA gene was cloned from Bacillus subtilis BS-1 by PCR with a pair of degenerate primers. The deduced 250 amino acid sequence contained two small conserved regions, 103SHLHFDH109 and 166TPGHTPGH173, which are characteristic of the metallo-beta-lactamase family. Homology comparison revealed that the deduced amino acid sequence had a high degree of similarity with those of the known AiiA proteins in the B. cereus group. Additionally, the aiiA gene was expressed in Escherichia coli BL21 (DE3) pLysS and the expressed AiiA protein could attenuate the soft rot symptoms caused by Erwinia carotovora var. carotovora.
Brown, D P; Idler, K B; Katz, L
1990-01-01
The 18.1-kilobase plasmid pSE211 integrates into the chromosome of Saccharopolyspora erythraea at a specific attB site. Restriction analysis of the integrated plasmid, pSE211int, and adjacent chromosomal sequences allowed identification of attP, the plasmid attachment site. Nucleotide sequencing of attP, attB, attL, and attR revealed a 57-base-pair sequence common to all sites with no duplications of adjacent plasmid or chromosomal sequences in the integrated state, indicating that integration takes place through conservative, reciprocal strand exchange. An analysis of the sequences indicated the presence of a putative gene for Phe-tRNA at attB which is preserved at attL after integration has occurred. A comparison of the attB site for a number of actinomycete plasmids is presented. Integration at attB was also observed when a 2.4-kilobase segment of pSE211 containing attP and the adjacent plasmid sequence was used to transform a pSE211- host. Nucleotide sequencing of this segment revealed the presence of two complete open reading frames (ORFs) and a segment of a third ORF. The ORF adjacent to attP encodes a putative polypeptide 437 amino acids in length that shows similarity, at its C-terminal domain, to sequences of site-specific recombinases of the integrase family. The adjacent ORF encodes a putative 98-amino-acid basic polypeptide that contains a helix-turn-helix motif at its N terminus which corresponds to domains in the Xis proteins of a number of bacteriophages. A proposal for the function of this polypeptide is presented. The deduced amino acid sequence of the third ORF did not reveal similarities to polypeptide sequences in the current data banks. PMID:2180909
Várnai, Csilla; Burkoff, Nikolas S; Wild, David L
2017-01-01
Evolutionary information stored in multiple sequence alignments (MSAs) has been used to identify the interaction interface of protein complexes, by measuring either co-conservation or co-mutation of amino acid residues across the interface. Recently, maximum entropy related correlated mutation measures (CMMs) such as direct information, decoupling direct from indirect interactions, have been developed to identify residue pairs interacting across the protein complex interface. These studies have focussed on carefully selected protein complexes with large, good-quality MSAs. In this work, we study protein complexes with a more typical MSA consisting of fewer than 400 sequences, using a set of 79 intramolecular protein complexes. Using a maximum entropy based CMM at the residue level, we develop an interface level CMM score to be used in re-ranking docking decoys. We demonstrate that our interface level CMM score compares favourably to the complementarity trace score, an evolutionary information-based score measuring co-conservation, when combined with the number of interface residues, a knowledge-based potential and the variability score of individual amino acid sites. We also demonstrate that, since co-mutation and co-complementarity in the MSA contain orthogonal information, the best prediction performance using evolutionary information can be achieved by combining the co-mutation information of the CMM with co-conservation information of a complementarity trace score, predicting a near-native structure as the top prediction for 41% of the dataset. The method presented is not restricted to small MSAs, and will likely also improve interface prediction for complexes with large and good-quality MSAs.
Cross, Megan; Lepage, Romain; Rajan, Siji; Biberacher, Sonja; Young, Neil D; Kim, Bo-Na; Coster, Mark J; Gasser, Robin B; Kim, Jeong-Sun; Hofmann, Andreas
2017-03-01
The trehalose biosynthetic pathway is of great interest for the development of novel therapeutics because trehalose is an essential disaccharide in many pathogens but is neither required nor synthesized in mammalian hosts. As such, trehalose-6-phosphate phosphatase (TPP), a key enzyme in trehalose biosynthesis, is likely an attractive target for novel chemotherapeutics. Based on a survey of genomes from a panel of parasitic nematodes and bacterial organisms and by way of a structure-based amino acid sequence alignment, we derive the topological structure of monoenzyme TPPs and classify them into 3 groups. Comparison of the functional roles of amino acid residues located in the active site for TPPs belonging to different groups reveals nuanced variations. Because current literature on this enzyme family shows a tendency to infer functional roles for individual amino acid residues, we investigated the roles of the strictly conserved aspartate tetrad in TPPs of the nematode Brugia malayi by using a conservative mutation approach. In contrast to aspartate-213, the residue inferred to carry out the nucleophilic attack on the substrate, we found that aspartate-215 and aspartate-428 of Bm TPP are involved in the chemistry steps of enzymatic hydrolysis of the substrate. Therefore, we suggest that homology-based inference of functionally important amino acids by sequence comparison for monoenzyme TPPs should only be carried out for each of the 3 groups.-Cross, M., Lepage, R., Rajan, S., Biberacher, S., Young, N. D., Kim, B.-N., Coster, M. J., Gasser, R. B., Kim, J.-S., Hofmann, A. Probing function and structure of trehalose-6-phosphate phosphatases from pathogenic organisms suggests distinct molecular groupings. © FASEB.
Yang, Zhifan; Chen, Jun; Chen, Yongqin; Jiang, Sijing
2010-01-01
A full cDNA encoding an acetylcholinesterase (AChE, EC 3.1.1.7) was cloned and characterized from the brown planthopper, Nilaparvata lugens Stål (Hemiptera: Delphacidae). The complete cDNA (2467 bp) contains a 1938-bp open reading frame encoding 646 amino acid residues. The amino acid sequence of the AChE deduced from the cDNA consists of 30 residues for a putative signal peptide and 616 residues for the mature protein with a predicted molecular weight of 69,418. The three residues (Ser242, Glu371, and His485) that putatively form the catalytic triad and the six Cys that form intra-subunit disulfide bonds are completely conserved, and 10 out of the 14 aromatic residues lining the active site gorge of the AChE are also conserved. Northern blot analysis of poly(A)+ RNA showed an approximately 2.6-kb transcript, and Southern blot analysis revealed there likely was just a single copy of this gene in N. lugens. The deduced protein sequence is most similar to AChE of Nephotettix cincticeps with 83% amino acid identity. Phylogenetic analysis constructed with 45 AChEs from 30 species showed that the deduced N. lugens AChE formed a cluster with the other 8 insect AChE2s. Additionally, the hypervariable region and amino acids specific to insect AChE2 also existed in the AChE of N. lugens. The results revealed that the AChE cDNA cloned in this work belongs to the insect AChE2 subgroup, which is orthologous to Drosophila AChE. Comparison of the AChEs between the susceptible and resistant strains revealed that a point mutation, Gly185Ser, is likely responsible for the insensitivity of the AChE to methamidophos in the resistant strain.
Wang, Pei; Song, Fan; Cai, Wanzhi
2014-01-01
Insect mitochondrial genomes are very important for understanding molecular evolution as well as for phylogenetic and phylogeographic studies of insects. The Miridae are the largest family of Heteroptera, encompassing more than 11,000 described species, and are of great economic importance. To better understand the diversity and the evolution of plant bugs, we sequenced five new mitochondrial genomes and present the first comparative analysis of nine mitochondrial genomes of mirids available to date. Our results showed that gene content, gene arrangement, base composition and sequences of mitochondrial transcription termination factor were conserved in plant bugs. Intra-genus species shared more conserved genomic characteristics, such as nucleotide and amino acid composition of protein-coding genes, secondary structure and anticodon mutations of tRNAs, and non-coding sequences. The control region possessed several distinct characteristics, including variable size, abundant tandem repetitions, and intra-genus conservation, and was useful in evolutionary and population genetic studies. The AGG codon reassignments were investigated between serine and lysine in the genera Adelphocoris and other cimicomorphans. Our analysis revealed correlated evolution between reassignments of the AGG codon and specific point mutations at the anticodons of tRNALys and tRNASer(AGN). Phylogenetic analysis indicated that mitochondrial genome sequences were useful in resolving family level relationship of Cimicomorpha. Comparative evolutionary analysis of plant bug mitochondrial genomes allowed the identification of previously neglected coding genes or non-coding regions as potential molecular markers. The finding of the AGG codon reassignments between serine and lysine indicated the parallel evolution of the genetic code in Hemiptera mitochondrial genomes. PMID:24988409
Antimicrobial Peptides from Plants
Tam, James P.; Wang, Shujing; Wong, Ka H.; Tan, Wei Liang
2015-01-01
Plant antimicrobial peptides (AMPs) have evolved differently from AMPs from other life forms. They are generally rich in cysteine residues which form multiple disulfides. In turn, the disulfides cross-brace plant AMPs as cystine-rich peptides, conferring on them extraordinarily high chemical, thermal and proteolytic stability. The cystine-rich peptides, commonly known as cysteine-rich peptides (CRPs), of plant AMPs are classified into families based on their sequence similarity, cysteine motifs that determine their distinctive disulfide bond patterns and tertiary structure fold. Cystine-rich plant AMP families include thionins, defensins, hevein-like peptides, knottin-type peptides (linear and cyclic), lipid transfer proteins, α-hairpinin and snakins family. In addition, there are AMPs which are rich in other amino acids. Plant AMPs organize into specific families with conserved structural folds, and sequence variation of the non-Cys residues encased in the same scaffold enables members of a particular family to play multiple functions. Furthermore, the ability of plant AMPs to tolerate hypervariable sequences using a conserved scaffold provides diversity to recognize different targets by varying the sequence of the non-cysteine residues. These properties bode well for developing plant AMPs as potential therapeutics and for protection of crops through transgenic methods. This review provides an overview of the major families of plant AMPs, including their structures, functions, and putative mechanisms. PMID:26580629
Applying the Concept of Peptide Uniqueness to Anti-Polio Vaccination
Kanduc, Darja; Fasano, Candida; Capone, Giovanni; Pesce Delfino, Antonella; Calabrò, Michele; Polimeno, Lorenzo
2015-01-01
Background. Although rare, adverse events may associate with anti-poliovirus vaccination thus possibly hampering global polio eradication worldwide. Objective. To design peptide-based anti-polio vaccines exempt from potential cross-reactivity risks and possibly able to reduce rare potential adverse events such as the postvaccine paralytic poliomyelitis due to the tendency of the poliovirus genome to mutate. Methods. Proteins from poliovirus type 1, strain Mahoney, were analyzed for amino acid sequence identity to the human proteome at the pentapeptide level, searching for sequences that (1) have zero percent of identity to human proteins, (2) are potentially endowed with an immunologic potential, and (3) are highly conserved among poliovirus strains. Results. Sequence analyses produced a set of consensus epitopic peptides potentially able to generate specific anti-polio immune responses exempt from cross-reactivity with the human host. Conclusion. Peptide sequences unique to poliovirus proteins and conserved among polio strains might help formulate a specific and universal anti-polio vaccine able to react with multiple viral strains and exempt from the burden of possible cross-reactions with human proteins. As an additional advantage, using a peptide-based vaccine instead of current anti-polio DNA vaccines would eliminate the rare post-polio poliomyelitis cases and other disabling symptoms that may appear following vaccination. PMID:26568962
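The pentapeptide-level comparison described in the Methods above can be illustrated with a minimal sketch. It assumes plain string sequences and exact matching only; the function names `kmers` and `unique_kmers` and the toy sequences are hypothetical, not taken from the study, and the immunogenicity and strain-conservation filters are not modelled.

```python
def kmers(seq, k=5):
    """Return the set of all overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def unique_kmers(viral_seq, host_seqs, k=5):
    """List (position, k-mer) pairs of the viral sequence that are
    absent from every host sequence (zero identity at the k-mer level)."""
    host = set()
    for s in host_seqs:
        host |= kmers(s, k)
    result = []
    for i in range(len(viral_seq) - k + 1):
        pep = viral_seq[i:i + k]
        if pep not in host:
            result.append((i, pep))
    return result


if __name__ == "__main__":
    # Toy data for illustration only; not real proteome sequences.
    viral = "MGAQVSSQKV"
    host = ["AAMGAQVAA", "SSQKVLL"]
    for pos, pep in unique_kmers(viral, host):
        print(pos, pep)
```

In a real analysis the host set would be built once from the full human proteome and reused across all viral proteins, since building it dominates the cost.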
Niskanen, Einari A; Hytönen, Vesa P; Grapputo, Alessandro; Nordlund, Henri R; Kulomaa, Markku S; Laitinen, Olli H
2005-01-01
Background A chicken egg contains several biotin-binding proteins (BBPs), whose complete DNA and amino acid sequences are not known. In order to identify and characterise these genes and proteins we studied chicken cDNAs and genes available in the NCBI database and chicken genome database using the reported N-terminal amino acid sequences of chicken egg-yolk BBPs as search strings. Results Two separate hits showing significant homology for these N-terminal sequences were discovered. For one of these hits, the chromosomal location in the immediate proximity of the avidin gene family was found. Both of these hits encode proteins having high sequence similarity with avidin suggesting that chicken BBPs are paralogous to avidin family. In particular, almost all residues corresponding to biotin binding in avidin are conserved in these putative BBP proteins. One of the found DNA sequences, however, seems to encode a carboxy-terminal extension not present in avidin. Conclusion We describe here the predicted properties of the putative BBP genes and proteins. Our present observations link BBP genes together with avidin gene family and shed more light on the genetic arrangement and variability of this family. In addition, comparative modelling revealed the potential structural elements important for the functional and structural properties of the putative BBP proteins. PMID:15777476
Kawano, Mitsuoki; Oshima, Taku; Kasai, Hiroaki; Mori, Hirotada
2002-07-01
Genome sequence analyses of Escherichia coli K-12 revealed four copies of long repetitive elements. These sequences are designated as long direct repeat (LDR) sequences. Three of the repeats (LDR-A, -B, -C), each approximately 500 bp in length, are located as tandem repeats at 27.4 min on the genetic map. Another copy (LDR-D), 450 bp in length and nearly identical to LDR-A, -B and -C, is located at 79.7 min, a position that is directly opposite the position of LDR-A, -B and -C. In this study, we demonstrate that LDR-D encodes a 35-amino-acid peptide, LdrD, the overexpression of which causes rapid cell killing and nucleoid condensation of the host cell. Northern blot and primer extension analysis showed constitutive transcription of a stable mRNA (approximately 370 nucleotides) encoding LdrD and an unstable cis-encoded antisense RNA (approximately 60 nucleotides), which functions as a trans-acting regulator of ldrD translation. We propose that LDR encodes a toxin-antitoxin module. LDR-homologous sequences are not present on any known plasmids but are conserved in Salmonella and other enterobacterial species.
Oliveira-Neto, Osmundo B; Batista, João A N; Rigden, Daniel J; Fragoso, Rodrigo R; Silva, Rodrigo O; Gomes, Eliane A; Franco, Octávio L; Dias, Simoni C; Cordeiro, Célia M T; Monnerat, Rose G; Grossi-De-Sá, Maria F
2004-09-01
Fourteen different cDNA fragments encoding serine proteinases were isolated by reverse transcription-PCR from cotton boll weevil (Anthonomus grandis) larvae. A large diversity between the sequences was observed, with a mean pairwise identity of 22% in the amino acid sequence. The cDNAs encompassed 11 trypsin-like sequences classifiable into three families and three chymotrypsin-like sequences belonging to a single family. Using a combination of 5' and 3' RACE, the full-length sequence was obtained for five of the cDNAs, named Agser2, Agser5, Agser6, Agser10 and Agser21. The encoded proteins included amino acid sequence motifs of serine proteinase active sites, conserved cysteine residues, and both zymogen activation and signal peptides. Southern blotting analysis suggested that one or two copies of these serine proteinase genes exist in the A. grandis genome. Northern blotting analysis of Agser2 and Agser5 showed that for both genes, expression is induced upon feeding and is concentrated in the gut of larvae and adult insects. Reverse northern analysis of the 14 cDNA fragments showed that only two trypsin-like and two chymotrypsin-like were expressed at detectable levels. Under the effect of the serine proteinase inhibitors soybean Kunitz trypsin inhibitor and black-eyed pea trypsin/chymotrypsin inhibitor, expression of one of the trypsin-like sequences was upregulated while expression of the two chymotrypsin-like sequences was downregulated. Copyright 2004 Elsevier Ltd.
Apple miRNAs and tasiRNAs with novel regulatory networks
2012-01-01
Background MicroRNAs (miRNAs) and their regulatory functions have been extensively characterized in model species but whether apple has evolved similar or unique regulatory features remains unknown. Results We performed deep small RNA-seq and identified 23 conserved, 10 less-conserved and 42 apple-specific miRNAs or families with distinct expression patterns. The identified miRNAs target 118 genes representing a wide range of enzymatic and regulatory activities. Apple also conserves two TAS gene families with similar but unique trans-acting small interfering RNA (tasiRNA) biogenesis profiles and target specificities. Importantly, we found that miR159, miR828 and miR858 can collectively target up to 81 MYB genes potentially involved in diverse aspects of plant growth and development. These miRNA target sites are differentially conserved among MYBs, which is largely influenced by the location and conservation of the encoded amino acid residues in MYB factors. Finally, we found that 10 of the 19 miR828-targeted MYBs undergo small interfering RNA (siRNA) biogenesis at the 3' cleaved, highly divergent transcript regions, generating over 100 sequence-distinct siRNAs that potentially target over 70 diverse genes as confirmed by degradome analysis. Conclusions Our work identified and characterized apple miRNAs, their expression patterns, targets and regulatory functions. We also discovered that three miRNAs and the ensuing siRNAs exploit both conserved and divergent sequence features of MYB genes to initiate distinct regulatory networks targeting a multitude of genes inside and outside the MYB family. PMID:22704043
Chen, Wentian; Zhong, Yaogang; Qin, Yannan; Sun, Shisheng; Li, Zheng
2012-01-01
Two glycoproteins, hemagglutinin (HA) and neuraminidase (NA), on the surface of influenza viruses play crucial roles in transfaunation, membrane fusion and the release of progeny virions. To explore the distribution of N-glycosylation sites (glycosites) in these two glycoproteins, we collected and aligned the amino acid sequences of all the HA and NA subtypes. Two glycosites were located at HA0 cleavage sites and fusion peptides and were strikingly conserved in all HA subtypes, while the remaining glycosites were unique to their subtypes. Two to four conserved glycosites were found in the stalk domain of NA, but these are affected by the deletion of specific stalk domain sequences. Another highly conserved glycosite appeared at the top center of the tetrameric global domain, while the other glycosites were distributed around the global domain. Here we present a detailed investigation of the distribution and the evolutionary pattern of the glycosites in the envelope glycoproteins of IVs, and further focus on the H5N1 virus and conclude that the glycosites in H5N1 have become more complicated in HA and less influential in NA in the last five years. PMID:23133677
Algorithm, applications and evaluation for protein comparison by Ramanujan Fourier transform.
Zhao, Jian; Wang, Jiasong; Hua, Wei; Ouyang, Pingkai
2015-12-01
The amino acid sequence of a protein determines its chemical properties, chain conformation and biological functions. Protein sequence comparison is of great importance to identify similarities of protein structures and infer their functions. Many properties of a protein correspond to the low-frequency signals within the sequence. Low frequency modes in protein sequences are linked to the secondary structures, membrane protein types, and sub-cellular localizations of the proteins. In this paper, we present Ramanujan Fourier transform (RFT) with a fast algorithm to analyze the low-frequency signals of protein sequences. The RFT method is applied to similarity analysis of protein sequences with the Resonant Recognition Model (RRM). The results show that the proposed fast RFT method on protein comparison is more efficient than commonly used discrete Fourier transform (DFT). RFT can detect common frequencies as significant feature for specific protein families, and the RFT spectrum heat-map of protein sequences demonstrates the information conservation in the sequence comparison. The proposed method offers a new tool for pattern recognition, feature extraction and structural analysis on protein sequences. Copyright © 2015 Elsevier Ltd. All rights reserved.
Molecular cloning and characterization of alpha-galactosidase gene from Glaciozyma antarctica
NASA Astrophysics Data System (ADS)
Moheer, Reyad Qaed Al; Bakar, Farah Diba Abu; Murad, Abdul Munir Abdul
2015-09-01
Psychrophilic enzymes are proteins produced by psychrophilic organisms which recently are the limelight for industrial applications. A gene encoding α-galactosidase from a psychrophilic yeast, Glaciozyma antarctica PI12 which belongs to glycoside hydrolase family 27, was isolated and analyzed using several bioinformatic tools. The cDNA of the gene with the size of 1,404-bp encodes a protein with 467 amino acid residues. Predicted molecular weight of protein was 48.59 kDa and hence we name the gene encoding α-galactosidase as GAL48. We found that the predicted protein sequences possessed signal peptide sequence and are highly conserved among other fungal α-galactosidase.
Structural study of Bombyx mori silk fibroin during processing for regeneration
NASA Astrophysics Data System (ADS)
Ha, Sung-Won
Bombyx mori silk fibroin has excellent mechanical properties combined with flexibility, tissue compatibility, and high oxygen permeability in the wet condition. This important material should be dissolved and regenerated to be utilized as useful forms such as gel, film, fiber, powder, or non-woven. However, it has long been a problem that the regenerated fibroin materials show poor mechanical properties and brittleness. These problems were technically solved by improving a fiber processing method reported here. The regenerated fibroin fibers showed much better mechanical properties compared to the original silk fibers. This improved technique for the fiber processing of Bombyx mori silk fibroin may be used as a model system for other semi-crystalline fiber forming proteins, becoming available through biotechnology. The physical and chemical properties of the regenerated fibers were characterized by SinTechRTM tensile testing, X-ray diffraction, solid state 13C NMR spectroscopy, and SEM. Unlike synthetic polymers, the molecular weight distribution of Bombyx mori silk fibroin is mono-disperse because silk fibroin is synthesized from DNA template. Genetic studies have revealed the entire amino acid sequence of Bombyx mori silk fibroin. It is known that the crystalline silk II structure is composed of hexa-amino acid sequences, GAGAGS. However, in the amino acid sequence of Bombyx mori silk fibroin heavy chain, there are present 11 chemically irregular but evolutionarily conserved sequences with about 31 amino acid residues (irregular GT˜GT sequences). The structure and role of these irregular sequences have remained unknown. One of the most frequently appearing irregular sequences was synthesized by a peptide synthesizer. The three-dimensional structure of this irregular silk peptide was studied by the high resolution two-dimensional NMR technique. 
The three-dimensional structure of this peptide shows that it makes a turn or loop structure (distorted O shape), which means the proceeding backbone direction is changed 180° by this sequence. This may facilitate the beta-sheet formation of the crystal forming building blocks, GAGAGS/GY˜GY sequences, in fibroin heavy chain. It may also facilitate the solubilization of the fibroin heavy chain within the silk gland.
Khan, A S
1984-01-01
The sequence of 363 nucleotides near the 3' end of the pol gene and 564 nucleotides from the 5' terminus of the env gene in an endogenous murine leukemia viral (MuLV) DNA segment, cloned from AKR/J mouse DNA and designated as A-12, was obtained. For comparison, the nucleotide sequence in an analogous portion of AKR mink cell focus-forming (MCF) 247 MuLV provirus was also determined. Sequence features unique to MCF247 MuLV DNA in the 3' pol and 5' env regions were identified by comparison with nucleotide sequences in analogous regions of NFS-Th-1 xenotropic and AKR ecotropic MuLV proviruses. These included (i) an insertion of 12 base pairs encoding four amino acids located 60 base pairs from the 3' terminus of the pol gene and immediately preceding the env gene, (ii) the deletion of 12 base pairs (encoding four amino acids) and the insertion of 3 base pairs (encoding one amino acid) in the 5' portion of the env gene, and (iii) single base substitutions resulting in 2 MCF247-specific amino acids in the 3' pol and 23 in the 5' env regions. Nucleotide sequence comparison involving the 3' pol and 5' env regions of AKR MCF247, NFS xenotropic, and AKR ecotropic MuLV proviruses with the cloned endogenous MuLV DNA indicated that MCF247 proviral DNA sequences were conserved in the cloned endogenous MuLV proviral segment. In fact, total nucleotide sequence identity existed between the endogenous MuLV DNA and the MCF247 MuLV provirus in the 3' portion of the pol gene. In the 5' env region, only 4 of 564 nucleotides were different, resulting in three amino acid changes between AKR MCF247 MuLV DNA and the endogenous MuLV DNA present in clone A-12. In addition, nucleotide sequence comparison indicated that Moloney- and Friend-MCF MuLVs were also highly related in the 3' pol and 5' env regions to the cloned endogenous MuLV DNA. These results establish the role of endogenous MuLV DNA segments in generation of recombinant MCF viruses. PMID:6328017
Bastien, Olivier; Maréchal, Eric
2008-08-07
Confidence in pairwise alignments of biological sequences, obtained by various methods such as Blast or Smith-Waterman, is critical for automatic analyses of genomic data. Two statistical models have been proposed. In the asymptotic limit of long sequences, the Karlin-Altschul model is based on the computation of a P-value, assuming that the number of high scoring matching regions above a threshold is Poisson distributed. Alternatively, the Lipman-Pearson model is based on the computation of a Z-value from a random score distribution obtained by a Monte-Carlo simulation. Z-values allow the deduction of an upper bound of the P-value (1/Z-value²) following the TULIP theorem. Simulated Z-value distributions are known to fit a Gumbel law. This remarkable property was not demonstrated and had no obvious biological support. We built a model of evolution of sequences based on aging, as meant in Reliability Theory, using the fact that the amount of information shared between an initial sequence and the sequences in its lineage (i.e., mutual information in Information Theory) is a decreasing function of time. This quantity is simply measured by a sequence alignment score. In systems aging, the failure rate is related to the system's longevity. The system can be a machine with structured components, or a living entity or population. "Reliability" refers to the ability to operate properly according to a standard. Here, the "reliability" of a sequence refers to the ability to conserve a sufficient functional level at the folded and maturated protein level (positive selection pressure). Homologous sequences were considered as systems 1) having a high redundancy of information reflected by the magnitude of their alignment scores, 2) whose components are the amino acids that can independently be damaged by random DNA mutations. 
From these assumptions, we deduced that information shared at each amino acid position evolved with a constant rate, corresponding to the information hazard rate, and that pairwise sequence alignment scores should follow a Gumbel distribution, which parameters could find some theoretical rationale. In particular, one parameter corresponds to the information hazard rate. Extreme value distribution of alignment scores, assessed from high scoring segments pairs following the Karlin-Altschul model, can also be deduced from the Reliability Theory applied to molecular sequences. It reflects the redundancy of information between homologous sequences, under functional conservative pressure. This model also provides a link between concepts of biological sequence analysis and of systems biology.
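The Monte-Carlo Z-value and the 1/Z² upper bound on the P-value mentioned in this abstract can be sketched as follows. This is an illustrative sketch only: the simple match/mismatch/linear-gap scoring, the shuffle count, and the function names `sw_score`, `z_value` and `p_upper_bound` are assumptions made here, not the authors' implementation or a real substitution matrix.

```python
import random
import statistics


def sw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            s = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(s)
            best = max(best, s)
        prev = cur
    return best


def z_value(a, b, n_shuffles=200, seed=0):
    """Z-value of the observed score against scores of shuffled copies of b."""
    rng = random.Random(seed)
    observed = sw_score(a, b)
    letters = list(b)
    scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        scores.append(sw_score(a, "".join(letters)))
    mu = statistics.mean(scores)
    sigma = statistics.stdev(scores)
    if sigma == 0:
        # Degenerate case: every shuffle scored the same.
        return float("inf") if observed > mu else 0.0
    return (observed - mu) / sigma


def p_upper_bound(z):
    """Upper bound on the P-value, 1/Z^2, capped at 1."""
    return min(1.0, 1.0 / z ** 2) if z > 0 else 1.0


if __name__ == "__main__":
    seq = "MKTAYIAKQRQISFVKSHFSRQ"  # toy sequence, for illustration only
    z = z_value(seq, seq)
    print(z, p_upper_bound(z))
```

Shuffling preserves the amino acid composition of the second sequence, so the Z-value measures how far the observed score lies above what composition alone would produce.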
Brunak, S; Engelbrecht, J
1996-06-01
A direct comparison of experimentally determined protein structures and their corresponding protein coding mRNA sequences has been performed. We examine whether real world data support the hypothesis that clusters of rare codons correlate with the location of structural units in the resulting protein. The degeneracy of the genetic code allows for a biased selection of codons which may control the translational rate of the ribosome, and may thus in vivo have a catalyzing effect on the folding of the polypeptide chain. A complete search for GenBank nucleotide sequences coding for structural entries in the Brookhaven Protein Data Bank produced 719 protein chains with matching mRNA sequence, amino acid sequence, and secondary structure assignment. By neural network analysis, we found strong signals in mRNA sequence regions surrounding helices and sheets. These signals do not originate from the clustering of rare codons, but from the similarity of codons coding for very abundant amino acid residues at the N- and C-termini of helices and sheets. No correlation between the positioning of rare codons and the location of structural units was found. The mRNA signals were also compared with conserved nucleotide features of 16S-like ribosomal RNA sequences and related to mechanisms for maintaining the correct reading frame by the ribosome.
Gültas, Mehmet; Düzgün, Güncel; Herzog, Sebastian; Jäger, Sven Joachim; Meckbach, Cornelia; Wingender, Edgar; Waack, Stephan
2014-04-03
The identification of functionally or structurally important non-conserved residue sites in protein MSAs is an important challenge for understanding the structural basis and molecular mechanism of protein functions. Despite the rich literature on compensatory mutations as well as sequence conservation analysis for the detection of those important residues, previous methods often rely on classical information-theoretic measures. However, these measures usually do not take into account dis/similarities of amino acids which are likely to be crucial for those residues. In this study, we present a new method, the Quantum Coupled Mutation Finder (QCMF) that incorporates significant dis/similar amino acid pair signals in the prediction of functionally or structurally important sites. The result of this study is twofold. First, using the essential sites of two human proteins, namely epidermal growth factor receptor (EGFR) and glucokinase (GCK), we tested the QCMF-method. The QCMF includes two metrics based on quantum Jensen-Shannon divergence to measure both sequence conservation and compensatory mutations. We found that the QCMF reaches an improved performance in identifying essential sites from MSAs of both proteins with a significantly higher Matthews correlation coefficient (MCC) value in comparison to previous methods. Second, using a data set of 153 proteins, we made a pairwise comparison between QCMF and three conventional methods. This comparison study strongly suggests that QCMF complements the conventional methods for the identification of correlated mutations in MSAs. QCMF utilizes the notion of entanglement, which is a major resource of quantum information, to model significant dissimilar and similar amino acid pair signals in the detection of functionally or structurally important sites. 
Our results suggest that, on the one hand, QCMF significantly outperforms the previous method, which mainly focuses on dissimilar amino acid signals, in detecting essential sites in proteins. On the other hand, it is complementary to the existing methods for the identification of correlated mutations. The QCMF method is computationally intensive. To ensure a feasible computation time for the QCMF algorithm, we leveraged Compute Unified Device Architecture (CUDA). The QCMF server is freely accessible at http://qcmf.informatik.uni-goettingen.de/.
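The abstract compares methods by their Matthews correlation coefficient (MCC). A minimal sketch of that metric follows, assuming predictions of essential sites have already been reduced to confusion-matrix counts; the function name and the example counts are hypothetical and are not results from the study.

```python
import math

def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical counts: 13 truly essential sites among 100 alignment columns.
mcc_example = matthews_corrcoef(tp=8, fp=2, tn=85, fn=5)
```

MCC is preferred over plain accuracy here because essential sites are a small minority of alignment columns, so a predictor that labels everything non-essential would still score high accuracy but an MCC of zero.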
Monfregola, Jlenia; Cevenini, Armando; Terracciano, Antonio; van Vlies, Naomi; Arbucci, Salvatore; Wanders, Ronald J A; D'Urso, Michele; Vaz, Frédéric M; Ursini, Matilde Valeria
2005-09-01
epsilon-N-Trimethyllysine hydroxylase (TMLH) (EC 1.14.11.8) is a non-heme ferrous iron hydroxylase, Fe(++) and 2-oxoglutarate (2OG) dependent, catalyzing the first of four enzymatic reactions of the highly conserved carnitine biosynthetic pathway. Unlike all the other enzymes of carnitine biosynthesis, TMLH was found to be associated with the mitochondrial fraction. We here report the molecular cloning of two alternatively spliced forms of TMLH, which appear to be ubiquitously expressed in human adult and fetal tissues. The deduced proteins are designated TMLH-a and TMLH-b, and contain 421 and 399 amino acids, respectively. They share the first N-terminal 332 amino acids, including a mitochondrial targeting signal, but diverge at the C-terminal end. Exogenous expression of TMLH-a and TMLH-b in COS-1 cells shows that the first 15 amino acids are necessary and sufficient for mitochondrial import. Furthermore, comparative evolutionary analysis of the C-terminal portion of TMLH-a identifies a conserved domain characterized by a key triad of residues, His242-Glu244-His389, predicted to bind 2OG end. This sequence is conserved in the TMLH enzyme from all species but is partially substituted by a unique sequence in the TMLH-b variant. Indeed, neither TMLH-b nor a TMLH-H389L mutant produced by site-directed mutagenesis is functional by itself. Of great interest, we found that TMLH-b and TMLH-H389L, individually co-expressed with TMLH-a in COS-1 cells, negatively affect TMLH activity. Therefore, our studies on the TMLH alternative form provide relevant novel information: first, that the C-terminal region of TMLH contains the main determinants of its enzymatic activity, including a key H389 residue, and second, that TMLH-b could act as a crucial physiological negative regulator of TMLH. Copyright 2005 Wiley-Liss, Inc.
On the conservative nature of intragenic recombination
Drummond, D. Allan; Silberg, Jonathan J.; Meyer, Michelle M.; Wilke, Claus O.; Arnold, Frances H.
2005-01-01
Intragenic recombination rapidly creates protein sequence diversity compared with random mutation, but little is known about the relative effects of recombination and mutation on protein function. Here, we compare recombination of the distantly related β-lactamases PSE-4 and TEM-1 to mutation of PSE-4. We show that, among β-lactamase variants containing the same number of amino acid substitutions, variants created by recombination retain function with a significantly higher probability than those generated by random mutagenesis. We present a simple model that accurately captures the differing effects of mutation and recombination in real and simulated proteins with only four parameters: (i) the amino acid sequence distance between parents, (ii) the number of substitutions, (iii) the average probability that random substitutions will preserve function, and (iv) the average probability that substitutions generated by recombination will preserve function. Our results expose a fundamental functional enrichment in regions of protein sequence space accessible by recombination and provide a framework for evaluating whether the relative rates of mutation and recombination observed in nature reflect the underlying imbalance in their effects on protein function. PMID:15809422
On the conservative nature of intragenic recombination.
Drummond, D Allan; Silberg, Jonathan J; Meyer, Michelle M; Wilke, Claus O; Arnold, Frances H
2005-04-12
Intragenic recombination rapidly creates protein sequence diversity compared with random mutation, but little is known about the relative effects of recombination and mutation on protein function. Here, we compare recombination of the distantly related beta-lactamases PSE-4 and TEM-1 to mutation of PSE-4. We show that, among beta-lactamase variants containing the same number of amino acid substitutions, variants created by recombination retain function with a significantly higher probability than those generated by random mutagenesis. We present a simple model that accurately captures the differing effects of mutation and recombination in real and simulated proteins with only four parameters: (i) the amino acid sequence distance between parents, (ii) the number of substitutions, (iii) the average probability that random substitutions will preserve function, and (iv) the average probability that substitutions generated by recombination will preserve function. Our results expose a fundamental functional enrichment in regions of protein sequence space accessible by recombination and provide a framework for evaluating whether the relative rates of mutation and recombination observed in nature reflect the underlying imbalance in their effects on protein function.
Characterization and expression of the calpastatin gene in Cyprinus carpio.
Chen, W X; Ma, Y
2015-07-03
Calpastatin, an important protein involved in regulating meat quality traits in animals, is encoded by the CAST gene. The aim of the present study was to clone the cDNA sequence of the CAST gene and detect the expression of CAST in the tissues of Cyprinus carpio. The cDNA of the C. carpio CAST gene, amplified using rapid amplification of cDNA ends PCR, is 2834 bp in length (accession No. JX275386), contains a 2634-bp open reading frame, and encodes a protein with 877 amino acid residues. The deduced amino acid sequence of C. carpio CAST was 88, 80, and 59% identical to the sequences observed in grass carp, zebrafish, and other fish, respectively. The C. carpio CAST was observed to contain four conserved domains with 54 serine phosphorylation loci, 28 threonine phosphorylation loci, 1 tyrosine phosphorylation locus, and 6 specific protein kinase C phosphorylation loci. The CAST gene showed widespread expression in different tissues of C. carpio. Surprisingly, the relative expression of the CAST transcript in the muscle and heart tissues of C. carpio was significantly higher than in other tissues (P < 0.01).
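The reported 2634-bp open reading frame and 877-residue protein are mutually consistent if the ORF length counts the stop codon, which is an assumption since the abstract does not say so. A small sketch of that arithmetic, with a hypothetical helper name:

```python
def protein_length_from_orf(orf_bp, includes_stop=True):
    """Number of encoded residues for an ORF of orf_bp nucleotides.

    Assumes the ORF is in frame (length divisible by 3). When
    includes_stop is True, the final codon is a stop codon and
    does not contribute a residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons

cast_residues = protein_length_from_orf(2634)  # 878 codons, one of them a stop
```

The same check is a quick way to catch transcription errors in reported ORF and protein lengths.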
Sequence analysis and molecular characterization of Wnt4 gene in metacestodes of Taenia solium.
Hou, Junling; Luo, Xuenong; Wang, Shuai; Yin, Cai; Zhang, Shaohua; Zhu, Xueliang; Dou, Yongxi; Cai, Xuepeng
2014-04-01
Wnt proteins are a family of secreted glycoproteins that are evolutionarily conserved and considered to be involved in extensive developmental processes in metazoan organisms. The characterization of wnt genes may improve understanding of the parasite's development. In the present study, a wnt4 gene encoding 491 amino acids was amplified from cDNA of metacestodes of Taenia solium using reverse transcription PCR (RT-PCR). Bioinformatics tools were used for sequence analysis. The conserved domain of the wnt gene family was predicted. The expression profile of Wnt4 was investigated using real-time PCR. Wnt4 expression was found to be dramatically increased in scolex-evaginated cysticerci when compared to invaginated cysticerci. In situ hybridization showed that the wnt4 gene was distributed in the posterior end of the worm along the primary body axis in evaginated cysticerci. These findings indicate that wnt4 may take part in the process of cysticerci evagination and play a role in scolex/bladder development of cysticerci of T. solium.
Petrov, Artem; Arzhanik, Vladimir; Makarov, Gennady; Koliasnikov, Oleg
2016-08-01
Antibodies are a family of proteins responsible for antigen recognition. Computational modeling of the interaction between an antigen and an antibody is very important when a crystallographic structure is unavailable. In this research, we have discovered a correlation between the amino acid sequence of an antibody and its specific binding characteristics, using the example of a novel conserved binding motif, which consists of four residues: Arg H52, Tyr H33, Thr H59, and Glu H61. These residues are specifically oriented in the binding site and interact with each other in a specific manner. The residues of the binding motif interact strictly with negatively charged groups of antigens and form a binding complex. The mechanism of interaction and the characteristics of the complex were also determined. The results of this research can be used to increase the accuracy of computational antibody-antigen interaction modeling and for post-modeling quality control of the modeled structures.
Structural Plasticity of Helical Nanotubes Based on Coiled-Coil Assemblies
Egelman, Edward H.; Xu, C.; DiMaio, F.; ...
2015-01-22
Numerous instances can be seen in evolution in which protein quaternary structures have diverged while the sequences of the building blocks have remained fairly conserved. However, the path through which such divergence has taken place is usually not known. We have designed two synthetic 29-residue α-helical peptides, based on the coiled-coil structural motif, that spontaneously self-assemble into helical nanotubes in vitro. Using electron cryomicroscopy with a newly available direct electron detection capability, we can achieve near-atomic resolution of these thin structures. We show how conservative changes of only one or two amino acids result in dramatic changes in quaternary structure, in which the assemblies can be switched between two very different forms. This system provides a framework for understanding how small sequence changes in evolution can translate into very large changes in supramolecular structure, a phenomenon that may have significant implications for the de novo design of synthetic peptide assemblies.
Isolation and Characterization of the PKAr Gene From a Plant Pathogen, Curvularia lunata.
Liu, T; Ma, B C; Hou, J M; Zuo, Y H
2014-09-01
Using an EST database from a full-length cDNA library of Curvularia lunata, we isolated a 2.9 kb cDNA, termed PKAr. An ORF of 1,383 bp encoding a polypeptide of 460 amino acids with a molecular weight of 50.1 kDa (GenBank Acc. No. KF675744) was cloned. The deduced amino acid sequence of PKAr shows 90 and 88 % identity with the cAMP-dependent protein kinase A regulatory subunit from Alternaria alternata and Pyrenophora tritici-repentis Pt-1C-BFP, respectively. Database analysis revealed that the deduced amino acid sequence of PKAr shares considerable similarity with that of PKA regulatory subunits in other organisms, particularly in the conserved regions. No introns were identified within the 1,383 bp ORF when compared with the PKAr genomic DNA sequence. Southern blot analysis indicated that PKAr exists as a single copy per genome. The mRNA expression level of PKAr at different developmental stages was determined using real-time quantitative PCR. The results showed that the level of PKAr expression was highest in vegetative growth mycelium, which indicates that it might play an important role in the vegetative growth of C. lunata. These results provide a foundation for further research on the function of PKAr in the plant pathogen C. lunata.
Pseudomonas fluorescens-like bacteria from the stomach: a microbiological and molecular study.
Patel, Saurabh Kumar; Pratap, Chandra Bhan; Verma, Ajay Kumar; Jain, Ashok Kumar; Dixit, Vinod Kumar; Nath, Gopal
2013-02-21
The aim was to characterize oxidase- and urease-producing bacterial isolates, grown aerobically, that originated from antral biopsies of patients suffering from acid peptic diseases. A total of 258 antral biopsy specimens were subjected to isolation of bacteria followed by tests for oxidase and urease production, acid tolerance and aerobic growth. The selected isolates were further characterized by molecular techniques, viz. amplifications for 16S rRNA using universal eubacterial and HSP60 gene-specific primers. The amplicons were subjected to restriction analysis and partial sequencing. A phylogenetic tree was generated using the unweighted pair group method with arithmetic mean (UPGMA) from evolutionary distances computed with a bootstrap test of phylogeny. Assessment of the acid tolerance of bacteria isolated from the antrum was performed using hydrochloric acid from 10^-7 mol/L to 10^-1 mol/L. Of the 258 antral biopsy specimens collected from patients, 179 (69.4%) were positive for urease production by rapid urease test and 31% (80/258) yielded typical Helicobacter pylori (H. pylori) after 5-7 d of incubation under a microaerophilic environment. A total of 240 (93%) antral biopsies yielded homogeneous, semi-translucent, small colonies after overnight incubation. The partial 16S rRNA sequences revealed that the isolates had 99% similarity with Pseudomonas species. A phylogenetic tree based on 16S rRNA sequences indicated that JQ927226 and JQ927227 were likely to be related to Pseudomonas fluorescens (P. fluorescens). On the basis of HSP60 sequences applied to the UPGMA phylogenetic tree, it was observed that strains isolated in an aerobic environment were likely to be P. fluorescens, and HSP60 sequences had more discriminatory potential than 16S rRNA sequences. Interestingly, this bacterium was acid tolerant for hours at low pH. 
Further, a total of 250 (96.9%) genomic DNA samples of 258 biopsy specimens and DNA from 240 bacterial isolates were positive for the 613 bp amplicons by targeting P. fluorescens-specific conserved putative outer membrane protein gene sequences. This study indicates that bacterial isolates from antral biopsies grown aerobically were P. fluorescens, and thus acid-tolerant bacteria other than H. pylori can also colonize the stomach and may be implicated in pathogenesis/protection.
Prediction and Identification of Krüppel-Like Transcription Factors by Machine Learning Method.
Liao, Zhijun; Wang, Xinrui; Chen, Xingyong; Zou, Quan
2017-01-01
The Krüppel-like factors (KLFs) are a family of zinc finger (ZF) motif-containing transcription factors with 18 members in the human genome; among them, KLF18 is predicted by bioinformatics. KLFs have various physiological functions and are involved in a number of cancers and other diseases. Here we perform a binary classification of KLFs and non-KLFs by machine learning methods. The protein sequences of KLFs and non-KLFs were retrieved from UniProt and randomly separated into a training dataset (containing positive and negative sequences) and a test dataset (containing only negative sequences); after extracting 188-dimensional (188D) feature vectors, we carried out classification with four classifiers (GBDT, libSVM, RF, and k-NN). For the human KLFs, we further examined the evolutionary relationships and motif distribution, and finally analyzed the conserved amino acid residues of the three zinc fingers. The classifier models from the training dataset were well constructed; the highest specificity (Sp) was 99.83%, obtained with a library for support vector machines (libSVM), and all the correctly classified rates were over 70% for 10-fold cross-validation on the test dataset. The 18 human KLFs can be further divided into 7 groups. The zinc finger domains were located at the carboxyl terminus and contained many conserved amino acid residues, including cysteine and histidine, and the span and interval between them were consistent in the three ZF domains. Two classification models for KLF prediction have been built by novel machine learning methods. Copyright© Bentham Science Publishers.
Gao, Quanxin; Yue, Yanfeng; Min, Minghua; Peng, Shiming; Shi, Zhaohong; Sheng, Wenquan; Zhang, Tao
2018-06-08
Toll-like receptors (TLRs) 5 and 9 are important members of the TLR family that play key roles in innate immunity in all vertebrates. In this study, paTLR5 and paTLR9 were identified in silver pomfret (Pampus argenteus), a marine teleost of great economic value. The open reading frames (ORFs) of paTLR5 and paTLR9 are 2646 and 3225 bp, encoding polypeptides of 881 and 1074 amino acids, respectively. Sequence analysis revealed several conserved characteristic features, including signal peptides, leucine-rich repeat (LRR) motifs, and a Toll/interleukin-1 receptor (TIR) domain. Sequence, phylogenetic and synteny analysis revealed high sequence identity with counterparts in other teleosts, confirming their correct nomenclature and conservation during evolution. Quantitative real-time PCR revealed that both TLRs were ubiquitously expressed in all investigated tissues, most abundantly in liver, kidney, spleen, intestine and gill, but at lower levels in muscle and skin. In vitro immunostimulation experiments revealed that Aeromonas hydrophila lipopolysaccharide (LPS) and Vibrio anguillarum flagellin induced higher levels of paTLR9 and paTLR5 mRNA expression in isolated fish intestinal epithelial cells (FIECs) than Lactobacillus plantarum lipoteichoic acid (LTA), but all increased the secretion of IL-6 and TNF-α and induced cell apoptosis and necrosis. Together, these results indicate that paTLR5 and paTLR9 may function in the response to bacterial pathogens. Our findings enhance our understanding of the function of TLRs in the innate immune system of silver pomfret and other teleosts. Copyright © 2018. Published by Elsevier Ltd.
PACCMIT/PACCMIT-CDS: identifying microRNA targets in 3' UTRs and coding sequences.
Šulc, Miroslav; Marín, Ray M; Robins, Harlan S; Vaníček, Jiří
2015-07-01
The purpose of the proposed web server, publicly available at http://paccmit.epfl.ch, is to provide a user-friendly interface to two algorithms for predicting messenger RNA (mRNA) molecules regulated by microRNAs: (i) PACCMIT (Prediction of ACcessible and/or Conserved MIcroRNA Targets), which identifies primarily mRNA transcripts targeted in their 3' untranslated regions (3' UTRs), and (ii) PACCMIT-CDS, designed to find mRNAs targeted within their coding sequences (CDSs). While PACCMIT belongs among the accurate algorithms for predicting conserved microRNA targets in the 3' UTRs, the main contribution of the web server is 2-fold: PACCMIT provides an accurate tool for predicting targets also of weakly conserved or non-conserved microRNAs, whereas PACCMIT-CDS addresses the lack of similar portals adapted specifically for targets in CDS. The web server asks the user for microRNAs and mRNAs to be analyzed, accesses the precomputed P-values for all microRNA-mRNA pairs from a database for all mRNAs and microRNAs in a given species, ranks the predicted microRNA-mRNA pairs, evaluates their significance according to the false discovery rate and finally displays the predictions in a tabular form. The results are also available for download in several standard formats. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
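The abstract states that predicted microRNA-mRNA pairs are ranked and their significance evaluated according to the false discovery rate, without naming the procedure. As one common way to do this, and purely as an assumption about what such a step could look like, here is a sketch of the Benjamini-Hochberg adjustment applied to a list of P-values; the function name and the example P-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values (q-values).

    The output is in the same order as the input list.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending P
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonic q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues

# Hypothetical precomputed P-values for four microRNA-mRNA pairs.
pair_pvalues = [0.01, 0.04, 0.03, 0.005]
pair_qvalues = benjamini_hochberg(pair_pvalues)
```

Pairs would then be reported when their q-value falls below a chosen FDR threshold, for example 0.05.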
Ancient genomic architecture for mammalian olfactory receptor clusters
Aloni, Ronny; Olender, Tsviya; Lancet, Doron
2006-01-01
Background: Mammalian olfactory receptor (OR) genes reside in numerous genomic clusters of up to several dozen genes. Whole-genome sequence alignment nets of five mammals allow their comprehensive comparison, aimed at reconstructing the ancestral olfactory subgenome. Results: We developed a new and general tool for genome-wide definition of genomic gene clusters conserved in multiple species. Syntenic orthologs, defined as gene pairs showing conservation of both genomic location and coding sequence, were subjected to a graph theory algorithm for discovering CLICs (clusters in conservation). When applied to ORs in five mammals, including the marsupial opossum, more than 90% of the OR genes were found within a framework of 48 multi-species CLICs, invoking a general conservation of gene order and composition. A detailed analysis of individual CLICs revealed multiple differences among species, interpretable through species-specific genomic rearrangements and reflecting complex mammalian evolutionary dynamics. One significant instance involves CLIC #1, which lacks a human member, implying the human-specific deletion of an OR cluster, whose mouse counterpart has been tentatively associated with isovaleric acid odorant detection. Conclusion: The identified multi-species CLICs demonstrate that most of the mammalian OR clusters have a common ancestry, preceding the split between marsupials and placental mammals. However, only two of these CLICs were capable of incorporating chicken OR genes, parsimoniously implying that all other CLICs emerged subsequent to the avian-mammalian divergence. PMID:17010214
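The abstract does not specify which graph theory algorithm is used to discover CLICs. As a loose illustration only, the sketch below groups genes linked by syntenic-ortholog pairs into connected components, which is one simple way to turn pairwise relations into multi-species groups; the function name and gene identifiers are hypothetical, and the actual CLIC procedure may differ substantially.

```python
from collections import defaultdict

def connected_components(edges):
    """Group nodes into connected components of an undirected graph.

    edges: iterable of (gene_a, gene_b) pairs, here standing in for
    syntenic-ortholog pairs between species.
    """
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        components.append(component)
    return components

# Hypothetical ortholog pairs across three species.
pairs = [("human_OR1", "mouse_OR1"), ("mouse_OR1", "dog_OR1"),
         ("human_OR9", "mouse_OR9")]
groups = connected_components(pairs)
```

A component with no member from a given species would correspond to a lineage-specific loss or gain, in the spirit of the CLIC #1 case described in the abstract.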
Hybridization capture reveals evolution and conservation across the entire Koala retrovirus genome.
Tsangaras, Kyriakos; Siracusa, Matthew C; Nikolaidis, Nikolas; Ishida, Yasuko; Cui, Pin; Vielgrader, Hanna; Helgen, Kristofer M; Roca, Alfred L; Greenwood, Alex D
2014-01-01
The koala retrovirus (KoRV) is the only retrovirus known to be in the midst of invading the germ line of its host species. Hybridization capture and next generation sequencing were used on modern and museum DNA samples of koala (Phascolarctos cinereus) to examine ca. 130 years of evolution across the full KoRV genome. Overall, the entire proviral genome appeared to be conserved across time in sequence, protein structure and transcriptional binding sites. A total of 138 polymorphisms were detected, of which 72 were found in more than one individual. At every polymorphic site in the museum koalas, one of the character states matched that of modern KoRV. Among non-synonymous polymorphisms, radical substitutions involving large physicochemical differences between amino acids were elevated in env, potentially reflecting anti-viral immune pressure or avoidance of receptor interference. Polymorphisms were not detected within two functional regions believed to affect infectivity. Host sequences flanking proviral integration sites were also captured, with few proviral loci shared among koalas. Recently described variants of KoRV, designated KoRV-B and KoRV-J, were not detected in museum samples, suggesting that these variants may be of recent origin.
Ferriol, I; Silva Junior, D M; Nigg, J C; Zamora-Macorra, E J; Falk, B W
2016-11-01
Torradoviruses, family Secoviridae, are emergent bipartite RNA plant viruses. RNA1 is ca. 7 kb and has one open reading frame (ORF) encoding the protease, helicase and RNA-dependent RNA polymerase (RdRp). RNA2 is ca. 5 kb and has two ORFs. RNA2-ORF1 encodes a putative protein with unknown function(s). RNA2-ORF2 encodes a putative movement protein and three capsid proteins. Little is known about the replication and polyprotein processing strategies of torradoviruses. Here, the cleavage sites in the RNA2-ORF2-encoded polyproteins of two torradoviruses, Tomato marchitez virus isolate M (ToMarV-M) and tomato chocolate spot virus, were determined by N-terminal sequencing, revealing that the amino acid (aa) at the -1 position of the cleavage sites is a glutamine. Multiple aa sequence comparison confirmed that this glutamine is conserved among other torradoviruses. Finally, site-directed mutagenesis of conserved aas in the ToMarV-M RdRp and protease prevented substantial accumulation of viral coat proteins or RNAs. Copyright © 2016 Elsevier Inc. All rights reserved.
Li, Yang; Wang, Yixin; Fang, Lichun; Fu, Jiayuan; Cui, Shuai; Zhao, Yingjie; Cui, Zhizhong; Chang, Shuang; Zhao, Peng
2016-01-01
Antibody to chicken infectious anemia virus (CIAV) was detected by ELISA in a specific pathogen-free (SPF) chicken population in our previous inspection, indicating a possible infection with CIAV. In this study, blood samples collected from the SPF chickens were used to isolate CIAV by inoculation into MSB1 cells and PCR amplification. A CIAV strain (SD1403) was isolated and successfully identified. Three overlapping genomic fragments were obtained by PCR amplification and sequencing. The full genome sequence of the SD1403 strain was obtained by aligning the sequences. The genome of the SD1403 strain was 2293 bp, with a nucleotide identity of 94.8% to 98.5% when compared with 30 reference CIAV strains. The viral proteins VP2 and VP3 were highly conserved, whereas VP1 was comparatively less conserved. Both amino acids 139 and 144 of VP1 were glutamine, which is in accord with low pathogenic characteristics. In this study, we report for the first time that CIAV exists in Chinese SPF chicken populations, which may be an important reason why attenuated vaccines can be contaminated with CIAV. PMID:27298822
Hybridization Capture Reveals Evolution and Conservation across the Entire Koala Retrovirus Genome
Ishida, Yasuko; Cui, Pin; Vielgrader, Hanna; Helgen, Kristofer M.; Roca, Alfred L.; Greenwood, Alex D.
2014-01-01
The koala retrovirus (KoRV) is the only retrovirus known to be in the midst of invading the germ line of its host species. Hybridization capture and next generation sequencing were used on modern and museum DNA samples of koala (Phascolarctos cinereus) to examine ca. 130 years of evolution across the full KoRV genome. Overall, the entire proviral genome appeared to be conserved across time in sequence, protein structure and transcriptional binding sites. A total of 138 polymorphisms were detected, of which 72 were found in more than one individual. At every polymorphic site in the museum koalas, one of the character states matched that of modern KoRV. Among non-synonymous polymorphisms, radical substitutions involving large physicochemical differences between amino acids were elevated in env, potentially reflecting anti-viral immune pressure or avoidance of receptor interference. Polymorphisms were not detected within two functional regions believed to affect infectivity. Host sequences flanking proviral integration sites were also captured, with few proviral loci shared among koalas. Recently described variants of KoRV, designated KoRV-B and KoRV-J, were not detected in museum samples, suggesting that these variants may be of recent origin. PMID:24752422
Transcriptional regulation of fatty acid biosynthesis in mycobacteria
Mondino, S.; Gago, G.; Gramajo, H.
2013-01-01
Summary: The main purpose of our study is to understand how mycobacteria exert control over the biosynthesis of their membrane lipids and to identify the key components of the regulatory network that controls fatty acid biosynthesis at the transcriptional level. In this paper we describe the identification and purification of FasR, a transcriptional regulator from Mycobacterium sp. that controls the expression of the fatty acid synthase (fas) and the 4'-phosphopantetheinyl transferase (acpS) encoding genes, whose products are involved in the fatty acid and mycolic acid biosynthesis pathways. In vitro studies demonstrated that the fas and acpS genes are part of the same transcriptional unit and that FasR specifically binds to three conserved operator sequences present in the fas-acpS promoter region (Pfas). The construction and further characterization of a fasR conditional mutant confirmed that FasR is a transcriptional activator of the fas-acpS operon and that this protein is essential for mycobacterial viability. Furthermore, the combined use of Pfas-lacZ fusions in different fasR backgrounds and electrophoretic mobility shift assays strongly suggested that long-chain acyl-CoAs are the effector molecules that modulate the affinity of FasR for its DNA binding sequences and therefore the expression of the essential fas-acpS operon. PMID:23721164
Functionally conserved enhancers with divergent sequences in distant vertebrates
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, Song; Oksenberg, Nir; Takayama, Sachiko
To examine the contributions of sequence and function conservation in the evolution of enhancers, we systematically identified enhancers whose sequences are not conserved among distant groups of vertebrate species, but have homologous function and are likely to be derived from a common ancestral sequence. In conclusion, our approach combined comparative genomics and epigenomics to identify potential enhancer sequences in the genomes of three groups of distantly related vertebrate species.
Functionally conserved enhancers with divergent sequences in distant vertebrates
Yang, Song; Oksenberg, Nir; Takayama, Sachiko; ...
2015-10-30
To examine the contributions of sequence and function conservation in the evolution of enhancers, we systematically identified enhancers whose sequences are not conserved among distant groups of vertebrate species, but have homologous function and are likely to be derived from a common ancestral sequence. In conclusion, our approach combined comparative genomics and epigenomics to identify potential enhancer sequences in the genomes of three groups of distantly related vertebrate species.
Sumi, S; Tsuneyoshi, T; Furutani, H
1993-09-01
Rod-shaped flexuous viruses were partially purified from garlic plants (Allium sativum) showing typical mosaic symptoms. The genome was shown to be composed of RNA with a poly(A) tail and to have an estimated size of 10 kb, as shown by denaturing agarose gel electrophoresis. We constructed cDNA libraries and screened four independent clones, which were designated GV-A, GV-B, GV-C and GV-D, using Northern and Southern blot hybridization. Nucleotide sequence determination of the cDNAs, two of which correspond to nearly one-third of the virus genomic RNA, shows that all of these viruses possess an identical genomic structure and that at least four proteins are encoded in the viral cDNA, with M(r)s estimated to be 15K, 27K, 40K and 11K. The 15K open reading frame (ORF) encodes the core-like sequence of a zinc finger protein preceded by a cluster of basic amino acid residues. The 27K ORF probably encodes the viral coat protein (CP), based on both the existence of some conserved sequences observed in many other rod-shaped or flexuous virus CPs and an overall amino acid sequence similarity to potexvirus and carlavirus CPs. The 11K ORF shows significant amino acid sequence similarities to the corresponding 12K proteins of the potexviruses and carlaviruses. On the other hand, the 40K ORF product does not resemble any other plant virus gene products reported so far. The genomic organization in the 3' region of the garlic viruses resembles, but clearly differs from, that of carlaviruses. Phylogenetic analysis based upon the amino acid sequence of the viral capsid protein also indicates that the garlic viruses have a unique and distinct domain different from those of the potexvirus and carlavirus groups. The results suggest that the garlic viruses described here belong to an unclassified and new virus group closely related to the carlaviruses.
NASA Astrophysics Data System (ADS)
Qi, Fei; Guo, Huarong; Wang, Jian
2008-02-01
Reversible protein phosphorylation, catalyzed by protein kinases and phosphatases, is an important and versatile mechanism by which eukaryotic cells regulate almost all signaling processes. Protein phosphatase 1 (PP1) is the first, and a well-characterized, member of the protein serine/threonine phosphatase family. In the present study, a full-length cDNA encoding the beta isoform of the catalytic subunit of protein phosphatase 1 (PP1cb), designated SmPP1cb, was isolated and sequenced for the first time from the skin tissue of the flatfish turbot Scophthalmus maximus by the rapid amplification of cDNA ends (RACE) technique. The SmPP1cb cDNA sequence contains a 984 bp open reading frame (ORF), flanked by a complete 39 bp 5' untranslated region and a 462 bp 3' untranslated region. The ORF encodes a putative 327 amino acid protein whose N-terminal section, Met-Ala-Glu-Gly-Glu-Leu-Asp-Val-Asp, is highly acidic, a common feature of the PP1 catalytic subunit that is absent in protein phosphatase 2B (PP2B). Its calculated molecular mass is 37 193 Da and its pI is 5.8. Sequence analysis indicated that SmPP1cb is extremely conserved at both the amino acid and nucleotide levels compared with the PP1cb of other vertebrates and invertebrates. Its Kozak motif, contained in the 5'UTR around the ATG start codon, is GXXAXXGXX ATGG, which differs from the mammalian motif at two positions, A-6 and G-3, indicating the possibility of different initiation of translation in turbot. The 3'UTR of SmPP1cb is highly diverse in sequence similarity and length compared with those of other animals, especially zebrafish. The cloning and sequencing of the SmPP1cb gene lays a good foundation for future work on the biological functions of PP1 in the flatfish turbot.
Coronado, Liani; Liniger, Matthias; Muñoz-González, Sara; Postel, Alexander; Pérez, Lester Josue; Pérez-Simó, Marta; Perera, Carmen Laura; Frías-Lepoureau, Maria Teresa; Rosell, Rosa; Grundhoff, Adam; Indenbirken, Daniela; Alawi, Malik; Fischer, Nicole; Becher, Paul; Ruggli, Nicolas; Ganges, Llilianne
2017-03-01
In this study, we compared the virulence in weaner pigs of the Pinar del Rio isolate and the virulent Margarita strain. The latter caused the Cuban classical swine fever (CSF) outbreak of 1993. Our results showed that the Pinar del Rio virus isolated during an endemic phase is clearly of low virulence. We analysed the complete nucleotide sequence of the Pinar del Rio virus isolated after persistence in newborn piglets, as well as the genome sequence of the inoculum. The consensus genome sequence of the Pinar del Rio virus remained completely unchanged after 28 days of persistent infection in swine. More importantly, a unique poly-uridine tract was discovered in the 3'UTR of the Pinar del Rio virus, which was not found in the Margarita virus or any other known CSFV sequences. Based on RNA secondary structure prediction, the poly-uridine tract results in a long single-stranded intervening sequence (SS) between the stem-loops I and II of the 3'UTR, without major changes in the stem-loop structures when compared to the Margarita virus. The possible implications of this novel insertion on persistence and attenuation remain to be investigated. In addition, comparison of the amino acid sequence of the viral proteins Erns, E1, E2 and p7 of the Margarita and Pinar del Rio viruses showed that all non-conservative amino acid substitutions acquired by the Pinar del Rio isolate clustered in E2, with two of them being located within the B/C domain. Immunisation and cross-neutralisation experiments in pigs and rabbits suggest differences between these two viruses, which may be attributable to the amino acid differences observed in E2. Altogether, these data provide fresh insights into viral molecular features which might be associated with the attenuation and adaptation of CSFV for persistence in the field.
The abundant extrachromosomal DNA content of the Spiroplasma citri GII3-3X genome
Saillard, Colette; Carle, Patricia; Duret-Nurbel, Sybille; Henri, Raphaël; Killiny, Nabil; Carrère, Sébastien; Gouzy, Jérome; Bové, Joseph-Marie; Renaudin, Joël; Foissac, Xavier
2008-01-01
Background Spiroplasma citri, the causal agent of citrus stubborn disease, is a bacterium of the class Mollicutes and is transmitted by phloem-feeding leafhopper vectors. In order to characterize candidate genes potentially involved in spiroplasma transmission and pathogenicity, the genome of S. citri strain GII3-3X is currently being deciphered. Results Assembling 20,000 sequencing reads generated seven circular contigs, none of which fit the 1.8 Mb chromosome map or carried chromosomal markers. These contigs correspond to seven plasmids: pSci1 to pSci6, with sizes ranging from 12.9 to 35.3 kbp and pSciA of 7.8 kbp. Plasmids pSci were detected as multiple copies in strain GII3-3X. Plasmid copy numbers of pSci1-6, as deduced from sequencing coverage, were estimated at 10 to 14 copies per spiroplasma cell, representing 1.6 Mb of extrachromosomal DNA. Genes encoding proteins of the TrsE-TraE, Mob, TraD-TraG, and Soj-ParA protein families were predicted in most of the pSci sequences, in addition to members of 14 protein families of unknown function. Plasmid pSci6 encodes protein P32, a marker of insect transmissibility. Plasmids pSci1-5 code for eight different S. citri adhesion-related proteins (ScARPs) that are homologous to the previously described protein P89 and the S. kunkelii SkARP1. Conserved signal peptides and C-terminal transmembrane alpha helices were predicted in all ScARPs. The predicted surface-exposed N-terminal region possesses the following elements: (i) 6 to 8 repeats of 39 to 42 amino acids each (sarpin repeats), (ii) a central conserved region of 330 amino acids followed by (iii) a more variable domain of about 110 amino acids. The C-terminus, predicted to be cytoplasmic, consists of a 27 amino acid stretch enriched in arginine and lysine (KR) and an optional 23 amino acid stretch enriched in lysine, aspartate and glutamate (KDE). 
Plasmids pSci mainly present a linear increase of cumulative GC skew except in regions presenting conserved hairpin structures. Conclusion The genome of S. citri GII3-3X is characterized by abundant extrachromosomal elements. The pSci plasmids could not only be vertically inherited but also horizontally transmitted, as they encode proteins usually involved in DNA element partitioning and cell to cell DNA transfer. Because plasmids pSci1-5 encode surface proteins of the ScARP family and pSci6 was recently shown to confer insect transmissibility, diversity and abundance of S. citri plasmids may essentially aid the rapid adaptation of S. citri to more efficient transmission by different insect vectors and to various plant hosts. PMID:18442384
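The cumulative GC skew mentioned in the record above can be illustrated with a minimal sketch. It assumes the common definition of skew, (G - C) / (G + C), summed over non-overlapping windows; the window size, the function name `cumulative_gc_skew` and the toy sequence are illustrative assumptions and are not taken from the study.

```python
def cumulative_gc_skew(seq, window=100):
    """Return the running sum of GC skew, one value per non-overlapping window.

    Hypothetical helper: skew per window is (G - C) / (G + C).
    Trailing bases that do not fill a whole window are ignored.
    """
    seq = seq.upper()
    total = 0.0
    curve = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        # A window with no G or C contributes zero skew.
        skew = (g - c) / (g + c) if (g + c) else 0.0
        total += skew
        curve.append(total)
    return curve


# Toy sequence (not pSci data): a G-rich half followed by a C-rich half.
toy = "GGGA" * 50 + "CCCA" * 50
curve = cumulative_gc_skew(toy, window=50)
print(curve)
```

On a real plasmid, a steadily rising curve would correspond to the linear increase described above, and a change of slope would mark a region that departs from it.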
Gao, F; Cao, X F; Si, J P; Chen, Z Y; Duan, C L
2016-05-06
Dendrobium officinale is one of the most well-known traditional Chinese medicines, and polysaccharide is its main active ingredient. Many studies have investigated the synthesis and accumulation mechanisms of polysaccharide, but until recently, little was known about the molecular mechanism of how polysaccharide is synthesized because no related genes have been cloned. In this study, we cloned an alkaline/neutral invertase gene from D. officinale (DoNI) by the rapid amplification of cDNA ends (RACE) method. DoNI was 2231 bp long and contained an open reading frame that predicted a 62.8-kDa polypeptide with 554 amino acid residues. An alkaline/neutral invertase conserved domain was predicted from this deduced amino acid sequence, and the deduced amino acid sequence of DoNI was similar to those of Setaria italica and Oryza brachyantha. We also found that DoNI expression in different tissues was closely related to DoNI activity, and more importantly, polysaccharide level. Our results indicate that DoNI is associated with polysaccharide accumulation in D. officinale.
Kobayashi, Hiroko; Motoyoshi, Naomi; Itagaki, Tadashi; Suzuki, Mamoru; Inokuchi, Norio
2015-01-01
RNase He1 from Hericium erinaceus, a member of the RNase T1 family, has high identity with RNase Po1 from Pleurotus ostreatus with complete conservation of the catalytic sequence. However, the optimal pH for RNase He1 activity is lower than that of RNase Po1, and the enzyme shows little inhibition of human tumor cell proliferation. Hence, to investigate the potential antitumor activity of recombinant RNase He1 and to possibly enhance its optimum pH, we generated RNase He1 mutants by replacing 12 Asn/Gln residues with Asp/Glu residues; the amino acid sequence of RNase Po1 was taken as reference. These mutants were then expressed in Escherichia coli. Using site-directed mutagenesis, we successfully modified the optimal pH for enzyme activity and generated a recombinant RNase He1 that inhibited the proliferation of cells in the human leukemia cell line. These properties are extremely important in the production of anticancer biologics that are based on RNase activity.
Martínez-Quintana, José A; Peregrino-Uriarte, Alma B; Gollas-Galván, Teresa; Gómez-Jiménez, Silvia; Yepiz-Plascencia, Gloria
2014-12-01
During hypoxia the shrimp Litopenaeus vannamei accelerates anaerobic glycolysis to obtain energy; therefore, an adequate supply of glucose to the cells is needed. Facilitated glucose transport across the cells is mediated by a group of membrane-embedded integral proteins called GLUT, with GLUT1 being the most ubiquitous form. In this work, we report the first cDNA nucleotide and deduced amino acid sequences of a glucose transporter 1 from L. vannamei. A 1619 bp sequence was obtained by RT-PCR and RACE approaches. The 5' UTR is 161 bp and the poly(A) tail follows immediately after the stop codon in the mRNA. The ORF is 1485 bp and codes for 485 amino acids. The deduced protein sequence has high identity to GLUT1 proteins from several species and contains all the main features of glucose transporter proteins, including twelve transmembrane domains and the conserved motifs and amino acids involved in transport activity, ligand binding and membrane anchoring. Therefore, we decided to name this sequence glucose transporter 1 of L. vannamei (LvGLUT1). A partial gene sequence of 8.87 Kbp was also obtained; it contains the complete coding sequence divided into 10 exons. LvGlut1 expression was detected in hemocytes, hepatopancreas, intestine, gills, muscle and pleopods. The highest relative expression was found in gills and the lowest in hemocytes. This indicates that LvGlut1 is ubiquitously expressed but that its levels are tissue-specific. Upon short-term hypoxia, the GLUT1 transcripts increase 3.7-fold in hepatopancreas and gills. To our knowledge, this is the first evidence of expression of GLUT1 in crustaceans.
Analysis of the mitochondrial genome of cheetahs (Acinonyx jubatus) with neurodegenerative disease.
Burger, Pamela A; Steinborn, Ralf; Walzer, Christian; Petit, Thierry; Mueller, Mathias; Schwarzenberger, Franz
2004-08-18
The complete mitochondrial genome of Acinonyx jubatus was sequenced and mitochondrial DNA (mtDNA) regions were screened for polymorphisms as candidates for the cause of a neurodegenerative demyelinating disease affecting captive cheetahs. The mtDNA reference sequences were established on the basis of the complete sequences of two diseased and two nondiseased animals as well as partial sequences of 26 further individuals. The A. jubatus mitochondrial genome is 17,047-bp long and shows a high sequence similarity (91%) to the domestic cat. Based on single nucleotide polymorphisms (SNPs) in the control region (CR) and pedigree information, the 18 myelopathic and 12 non-myelopathic cheetahs included in this study were classified into haplotypes I, II and III. In view of the phenotypic comparability of the neurodegenerative disease observed in cheetahs and human mtDNA-associated diseases, specific coding regions including the tRNAs leucine UUR, lysine, serine UCN, and partial complex I and V sequences were screened. We identified a heteroplasmic and a homoplasmic SNP at codon 507 in the subunit 5 (MTND5) of complex I. The heteroplasmic haplotype I-specific valine to methionine substitution represents a nonconservative amino acid change and was found in 11 myelopathic and eight non-myelopathic cheetahs with levels ranging from 29% to 79%. The homoplasmic conservative amino acid substitution valine to alanine was identified in two myelopathic animals of haplotype II. In addition, a synonymous SNP in the codon 76 of the MTND4L gene was found in the single haplotype III animal. The amino acid exchanges in the MTND5 gene were not associated with the occurrence of neurodegenerative disease in captive cheetahs.
Zhang, L J; Dong, W X; Guo, S M; Wang, Y X; Wang, A D; Lu, X J
2015-11-19
This study aims to explore the roles of somatic embryogenesis receptor-like kinase (SERK) in Malus hupehensis (Pingyi Tiancha). The full-length sequences of SERK1 in triploid Pingyi Tiancha (3n) and a tetraploid hybrid strain 33# (4n) were cloned, sequenced, and designated as MhSERK1 and MhdSERK1, respectively. Multiple alignments of amino acid sequences were conducted to identify similarity between MhSERK1 and MhdSERK1 and SERK sequences in other species, and a neighbor-joining phylogenetic tree was constructed to elucidate their phylogenetic relations. Expression levels of MhSERK1 and MhdSERK1 in different tissues and developmental stages were investigated using quantitative real-time PCR. The coding sequence lengths of MhSERK1 and MhdSERK1 were 1899 bp (encoding 632 amino acids) and 1881 bp (encoding 626 amino acids), respectively. Sequence analysis demonstrated that MhSERK1 and MhdSERK1 display high similarity to SERKs in other species, with a conserved intron/exon structure that is unique to members of the SERK family. Additionally, the phylogenetic tree showed that MhSERK1 and MhdSERK1 clustered with orange CitSERK (93%). Furthermore, MhSERK1 and MhdSERK1 were mainly expressed in the reproductive organs, in particular the ovary. Their expression levels were highest in young flowers and they differed among different tissues and organs. Our results suggest that MhSERK1 and MhdSERK1 are related to plant reproduction, and that MhSERK1 is related to apomixis in triploid Pingyi Tiancha.
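The relationship between the coding sequence lengths and the residue counts reported above can be checked with a short sketch. It assumes that each reported length includes the terminal stop codon, which is an assumption made here rather than a statement from the abstract; the function name `residues_from_cds` is hypothetical.

```python
def residues_from_cds(cds_length_bp, includes_stop=True):
    """Number of amino acids encoded by a coding sequence of the given length.

    Hypothetical helper: three nucleotides per codon, and the stop codon
    (if counted in the length) encodes no residue.
    """
    if cds_length_bp % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    codons = cds_length_bp // 3
    return codons - 1 if includes_stop else codons


mh_serk1 = residues_from_cds(1899)   # MhSERK1 length from the abstract
mhd_serk1 = residues_from_cds(1881)  # MhdSERK1 length from the abstract
print(mh_serk1, mhd_serk1)
```

Under that assumption the two lengths give 632 and 626 residues, matching the figures in the abstract.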
Bergman, C M; Kreitman, M
2001-08-01
Comparative genomic approaches to gene and cis-regulatory prediction are based on the principle that differential DNA sequence conservation reflects variation in functional constraint. Using this principle, we analyze noncoding sequence conservation in Drosophila for 40 loci with known or suspected cis-regulatory function encompassing >100 kb of DNA. We estimate the fraction of noncoding DNA conserved in both intergenic and intronic regions and describe the length distribution of ungapped conserved noncoding blocks. On average, 22%-26% of noncoding sequences surveyed are conserved in Drosophila, with median block length approximately 19 bp. We show that point substitution in conserved noncoding blocks exhibits transition bias as well as lineage effects in base composition, and occurs more than an order of magnitude more frequently than insertion/deletion (indel) substitution. Overall, patterns of noncoding DNA structure and evolution differ remarkably little between intergenic and intronic conserved blocks, suggesting that the effects of transcription per se contribute minimally to the constraints operating on these sequences. The results of this study have implications for the development of alignment and prediction algorithms specific to noncoding DNA, as well as for models of cis-regulatory DNA sequence evolution.
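The idea of measuring ungapped conserved blocks and their length distribution can be sketched as follows. This is a deliberately simplified illustration and not the authors' procedure: it treats a block as a run of identical, gap-free columns in a pairwise alignment, whereas the blocks in the study also tolerate point substitutions. The names `conserved_blocks` and `summarize`, the `min_len` threshold and the toy alignment are all assumptions.

```python
from statistics import median


def conserved_blocks(aln_a, aln_b, min_len=10):
    """Return (start, length) for each run of identical, ungapped columns."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    blocks = []
    run_start = None
    for i, (a, b) in enumerate(zip(aln_a, aln_b)):
        match = a == b and a != "-"
        if match and run_start is None:
            run_start = i
        elif not match and run_start is not None:
            if i - run_start >= min_len:
                blocks.append((run_start, i - run_start))
            run_start = None
    # Close a run that extends to the end of the alignment.
    if run_start is not None and len(aln_a) - run_start >= min_len:
        blocks.append((run_start, len(aln_a) - run_start))
    return blocks


def summarize(aln_a, aln_b, min_len=10):
    """Fraction of alignment columns inside blocks, and median block length."""
    lengths = [length for _, length in conserved_blocks(aln_a, aln_b, min_len)]
    fraction = sum(lengths) / len(aln_a) if aln_a else 0.0
    return fraction, (median(lengths) if lengths else 0)


# Toy pairwise alignment (not Drosophila data).
seq_a = "ACGTACGTAA--GGCCTTAAGG"
seq_b = "ACGTACGTCCTTGGCCTTAAGG"
blocks = conserved_blocks(seq_a, seq_b, min_len=4)
fraction, median_len = summarize(seq_a, seq_b, min_len=4)
print(blocks, fraction, median_len)
```

A less strict variant would allow a bounded number of mismatches per block, which is closer to how conserved noncoding blocks are usually defined.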
Missihoun, Tagnon D.; Kotchoni, Simeon O.; Bartels, Dorothea
2016-01-01
Plant aldehyde dehydrogenases (ALDHs) play important roles in cell wall biosynthesis, growth, development, and tolerance to biotic and abiotic stresses. The Reduced Epidermal Fluorescence1 is encoded by the subfamily 2C of ALDHs and was shown to oxidise coniferaldehyde and sinapaldehyde to ferulic acid and sinapic acid in the phenylpropanoid pathway, respectively. This knowledge has been gained from work in the dicotyledon model species Arabidopsis thaliana and then used to functionally annotate ALDH2C isoforms in other species, based on the orthology principle. However, the extent to which the ALDH isoforms differ between monocotyledons and dicotyledons has rarely been assessed side-by-side. In this study, we used a phylogenetic approach to address this question. We have analysed the ALDH genes in Brachypodium distachyon, alongside those of other sequenced monocotyledon and dicotyledon species to examine traits supporting either a convergent or divergent evolution of the ALDH2C/REF1-type proteins. We found that B. distachyon, like other grasses, contains more ALDH2C/REF1 isoforms than A. thaliana and other dicotyledon species. Some amino acid residues in ALDH2C/REF1 isoforms were found to be conserved in dicotyledons but substituted by non-equivalent residues in monocotyledons. One example of those substitutions concerns a conserved phenylalanine and a conserved tyrosine in monocotyledons and dicotyledons, respectively. Protein structure modelling suggests that the presence of tyrosine would widen the substrate-binding pocket in the dicotyledons, and thereby influence substrate specificity. We discussed the importance of these findings as new hints to investigate why ferulic acid contents and cell wall digestibility differ between the dicotyledon and monocotyledon species. PMID:27798665
Huang, Youhua; Huang, Xiaohong; Liu, Hong; Gong, Jie; Ouyang, Zhengliang; Cui, Huachun; Cao, Jianhao; Zhao, Yingtao; Wang, Xiujie; Jiang, Yulin; Qin, Qiwei
2009-01-01
Background Soft-shelled turtle iridovirus (STIV) is the causative agent of severe systemic diseases in cultured soft-shelled turtles (Trionyx sinensis). To our knowledge, the only molecular information available on STIV mainly concerns the highly conserved STIV major capsid protein. The complete sequence of the STIV genome is not yet available. Therefore, determining the genome sequence of STIV and providing a detailed bioinformatic analysis of its genome content and evolution status will facilitate further understanding of the taxonomic elements of STIV and the molecular mechanisms of reptile iridovirus pathogenesis. Results We determined the complete nucleotide sequence of the STIV genome using 454 Life Science sequencing technology. The STIV genome is 105 890 bp in length with a base composition of 55.1% G+C. Computer assisted analysis revealed that the STIV genome contains 105 potential open reading frames (ORFs), which encode polypeptides ranging from 40 to 1,294 amino acids and 20 microRNA candidates. Among the putative proteins, 20 share homology with the ancestral proteins of the nuclear and cytoplasmic large DNA viruses (NCLDVs). Comparative genomic analysis showed that STIV has the highest degree of sequence conservation and a colinear arrangement of genes with frog virus 3 (FV3), followed by Tiger frog virus (TFV), Ambystoma tigrinum virus (ATV), Singapore grouper iridovirus (SGIV), Grouper iridovirus (GIV) and other iridovirus isolates. Phylogenetic analysis based on conserved core genes and complete genome sequence of STIV with other virus genomes was performed. Moreover, analysis of the gene gain-and-loss events in the family Iridoviridae suggested that the genes encoded by iridoviruses have evolved for favoring adaptation to different natural host species. Conclusion This study has provided the complete genome sequence of STIV. 
Phylogenetic analysis suggested that STIV and FV3 are strains of the same viral species belonging to the Ranavirus genus in the Iridoviridae family. Given virus-host co-evolution and the phylogenetic relationship among vertebrates from fish to reptiles, we propose that iridovirus might transmit between reptiles and amphibians and that STIV and FV3 are strains of the same viral species in the Ranavirus genus. PMID:19439104
Biological function in the twilight zone of sequence conservation.
Ponting, Chris P
2017-08-16
Strong DNA conservation among divergent species is an indicator of enduring functionality. With weaker sequence conservation we enter a vast 'twilight zone' in which sequence subject to transient or lower constraint cannot be distinguished easily from neutrally evolving, non-functional sequence. Twilight zone functional sequence is illuminated instead by principles of selective constraint and positive selection using genomic data acquired from within a species' population. Application of these principles reveals that despite being biochemically active, most twilight zone sequence is not functional.
Vettore, André L.; da Silva, Felipe R.; Kemper, Edson L.; Souza, Glaucia M.; da Silva, Aline M.; Ferro, Maria Inês T.; Henrique-Silva, Flavio; Giglioti, Éder A.; Lemos, Manoel V.F.; Coutinho, Luiz L.; Nobrega, Marina P.; Carrer, Helaine; França, Suzelei C.; Bacci, Maurício; Goldman, Maria Helena S.; Gomes, Suely L.; Nunes, Luiz R.; Camargo, Luis E.A.; Siqueira, Walter J.; Van Sluys, Marie-Anne; Thiemann, Otavio H.; Kuramae, Eiko E.; Santelli, Roberto V.; Marino, Celso L.; Targon, Maria L.P.N.; Ferro, Jesus A.; Silveira, Henrique C.S.; Marini, Danyelle C.; Lemos, Eliana G.M.; Monteiro-Vitorello, Claudia B.; Tambor, José H.M.; Carraro, Dirce M.; Roberto, Patrícia G.; Martins, Vanderlei G.; Goldman, Gustavo H.; de Oliveira, Regina C.; Truffi, Daniela; Colombo, Carlos A.; Rossi, Magdalena; de Araujo, Paula G.; Sculaccio, Susana A.; Angella, Aline; Lima, Marleide M.A.; de Rosa, Vicente E.; Siviero, Fábio; Coscrato, Virginia E.; Machado, Marcos A.; Grivet, Laurent; Di Mauro, Sonia M.Z.; Nobrega, Francisco G.; Menck, Carlos F.M.; Braga, Marilia D.V.; Telles, Guilherme P.; Cara, Frank A.A.; Pedrosa, Guilherme; Meidanis, João; Arruda, Paulo
2003-01-01
To contribute to our understanding of the genome complexity of sugarcane, we undertook a large-scale expressed sequence tag (EST) program. More than 260,000 cDNA clones were partially sequenced from 26 standard cDNA libraries generated from different sugarcane tissues. After the processing of the sequences, 237,954 high-quality ESTs were identified. These ESTs were assembled into 43,141 putative transcripts. Of the assembled sequences, 35.6% presented no matches with existing sequences in public databases. A global analysis of the whole SUCEST data set indicated that 14,409 assembled sequences (33% of the total) contained at least one cDNA clone with a full-length insert. Annotation of the 43,141 assembled sequences associated almost 50% of the putative identified sugarcane genes with protein metabolism, cellular communication/signal transduction, bioenergetics, and stress responses. Inspection of the translated assembled sequences for conserved protein domains revealed 40,821 amino acid sequences with 1415 Pfam domains. Reassembling the consensus sequences of the 43,141 transcripts revealed a 22% redundancy in the first assembly. This indicated that possibly 33,620 unique genes had been identified and that >90% of the sugarcane expressed genes were tagged. PMID:14613979
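The figures in the record above can be related to one another with a few lines of arithmetic. The sketch below uses only numbers quoted in the abstract; the variable names are illustrative, and reading the small gap between the two estimates as a rounding effect is an assumption.

```python
assembled = 43_141          # putative transcripts from the first assembly
redundancy = 0.22           # redundancy as quoted (rounded) in the abstract
reported_unique = 33_620    # unique genes as quoted in the abstract
full_length = 14_409        # assembled sequences with a full-length clone

# Unique genes implied by the rounded 22% redundancy.
unique_estimate = round(assembled * (1 - redundancy))

# Redundancy implied by the reported unique-gene count.
implied_redundancy = 1 - reported_unique / assembled

# Share of assembled sequences with at least one full-length insert.
full_length_fraction = full_length / assembled

print(unique_estimate, round(implied_redundancy * 100, 1),
      round(full_length_fraction * 100, 1))
```

The rounded 22% gives about 33,650 unique genes, while the reported 33,620 corresponds to a redundancy of about 22.1%, so the two figures agree to within rounding.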
Constancy and diversity in the flavivirus fusion peptide.
Seligman, Stephen J
2008-02-14
Flaviviruses include the mosquito-borne dengue, Japanese encephalitis, yellow fever and West Nile and the tick-borne encephalitis viruses. They are responsible for considerable world-wide morbidity and mortality. Viral entry is mediated by a conserved fusion peptide containing 16 amino acids located in domain II of the envelope protein E. Highly orchestrated conformational changes initiated by exposure to acidic pH accompany the fusion process and are important factors limiting amino acid changes in the fusion peptide that still permit fusion with host cell membranes in both arthropod and vertebrate hosts. The cell-fusing related agents, growing only in mosquitoes or insect cell lines, possess a different homologous peptide. Analysis of 46 named flaviviruses deposited in the Entrez Nucleotides database extended the constancy in the canonical fusion peptide sequences of mosquito-borne, tick-borne and viruses with no known vector to include more recently-sequenced viruses. The mosquito-borne signature amino acid, G104, was also found in flaviviruses with no known vector and with the cell-fusion related viruses. Despite the constancy in the canonical sequences in pathogenic flaviviruses, mutations were surprisingly frequent with a 27% prevalence of nonsynonymous mutations in yellow fever virus fusion peptide sequences, and 0 to 7.4% prevalence in the others. Six of seven yellow fever patients whose virus had fusion peptide mutations died. In the cell-fusing related agents, not enough sequences have been deposited to estimate reliably the prevalence of fusion peptide mutations. However, the canonical sequences homologous to the fusion peptide and the pattern of disulfide linkages in protein E differed significantly from the other flaviviruses. The constancy of the canonical fusion peptide sequences in the arthropod-borne flaviviruses contrasts with the high prevalence of mutations in most individual viruses. 
The discrepancy may be the result of a survival advantage accompanying sequence diversity (quasispecies) involving the fusion peptide. Limited clinical data with yellow fever virus suggest that the presence of fusion peptide mutants is not associated with a decreased case fatality rate. The cell-fusing related agents may have substantial differences from other flaviviruses in their mechanism of viral entry into the host cell.
Dictionary-driven protein annotation
Rigoutsos, Isidore; Huynh, Tien; Floratos, Aris; Parida, Laxmi; Platt, Daniel
2002-01-01
Computational methods seeking to automatically determine the properties (functional, structural, physicochemical, etc.) of a protein directly from the sequence have long been the focus of numerous research groups. With the advent of advanced sequencing methods and systems, the number of amino acid sequences that are being deposited in the public databases has been increasing steadily. This has in turn generated a renewed demand for automated approaches that can annotate individual sequences and complete genomes quickly, exhaustively and objectively. In this paper, we present one such approach that is centered around and exploits the Bio-Dictionary, a collection of amino acid patterns that completely covers the natural sequence space and can capture functional and structural signals that have been reused during evolution, within and across protein families. Our annotation approach also makes use of a weighted, position-specific scoring scheme that is unaffected by the over-representation of well-conserved proteins and protein fragments in the databases used. For a given query sequence, the method permits one to determine, in a single pass, the following: local and global similarities between the query and any protein already present in a public database; the likeness of the query to all available archaeal/bacterial/eukaryotic/viral sequences in the database as a function of amino acid position within the query; the character of secondary structure of the query as a function of amino acid position within the query; the cytoplasmic, transmembrane or extracellular behavior of the query; the nature and position of binding domains, active sites, post-translationally modified sites, signal peptides, etc. In terms of performance, the proposed method is exhaustive, objective and allows for the rapid annotation of individual sequences and full genomes. 
Annotation examples are presented and discussed in Results, including individual queries and complete genomes that were released publicly after we built the Bio-Dictionary that is used in our experiments. Finally, we have computed the annotations of more than 70 complete genomes and made them available on the World Wide Web at http://cbcsrv.watson.ibm.com/Annotations/. PMID:12202776
Jesús, Torres; Rogelio, López; Abraham, Cetina; Uriel, López; J- Daniel, García; Alfonso, Méndez-Tenorio; Lilia, Barrón Blanca
2012-01-01
There are very few antiviral drugs available to fight viral infections, and the appearance of viral strains resistant to these antivirals is not a rare event. Hence, the design of new antiviral drugs is important. We describe the prediction of peptides with antiviral activity (AVP) derived from the viral glycoproteins involved in the entry of herpes simplex (HSV) and influenza A viruses into their host cells. It is known that during this event viral glycoproteins undergo several conformational changes due to protein-protein interactions, which lead to membrane fusion between the viral envelope and the cellular membrane. Our hypothesis is that AVPs can be derived from these viral glycoproteins, specifically from regions that are highly conserved in amino acid sequence and that are also highly exposed (antigenic), hydrophilic, flexible, and charged, since these properties are important for protein-protein interactions. To test this, we separately analyzed HSV glycoproteins H and B and influenza A virus hemagglutinin (HA), using several bioinformatics tools. A set of multiple alignments was carried out to find the most conserved regions in the amino acid sequences. Then, the physicochemical properties indicated above were analyzed. We predicted several peptides 12-20 amino acids in length which, by docking analysis, were able to interact with the viral fusion glycoproteins and thus may prevent conformational changes in them, blocking the viral infection. Our strategy to design AVPs seems to be very promising, since the peptides were synthesized and their antiviral activities have produced very encouraging results. PMID:23144542
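The conserved-region step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline (the abstract names no specific tools); the function names, the toy alignment, and the thresholds are assumptions made only for illustration. The sketch scans a pre-computed multiple alignment and reports gap-free runs of conserved columns that are at least 12 residues long, the lower bound of the peptide lengths quoted above.

```python
from collections import Counter


def column_identity(column):
    """Fraction of sequences that share the most common residue in one column."""
    counts = Counter(column)
    return counts.most_common(1)[0][1] / len(column)


def conserved_runs(alignment, min_len=12, threshold=1.0):
    """Return (start, end, consensus) for each maximal run of conserved columns.

    `alignment` is a list of equal-length aligned sequences ('-' marks a gap).
    A column counts as conserved when it has no gap and the share of its most
    common residue is >= threshold. `end` is exclusive.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")

    runs = []
    start = None
    consensus = []
    for i in range(length + 1):  # one extra step to flush the final run
        conserved = False
        if i < length:
            column = [seq[i] for seq in alignment]
            conserved = "-" not in column and column_identity(column) >= threshold
        if conserved:
            if start is None:
                start = i
                consensus = []
            consensus.append(Counter(column).most_common(1)[0][0])
        elif start is not None:
            if i - start >= min_len:
                runs.append((start, i, "".join(consensus)))
            start = None
    return runs


# Hypothetical toy alignment, not real glycoprotein data.
toy_alignment = [
    "MKTAYIAKQRQISFVKSHFSRQ",
    "MKTAYIAKQRQISFVKSHFARQ",
    "MRTAYIAKQRQISFVKSHFSRQ",
]
print(conserved_runs(toy_alignment))
```

Candidate windows found this way would still need the physicochemical filters mentioned in the abstract (exposure, hydrophilicity, flexibility, charge), which this sketch does not attempt.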
Henry, J S; Lance, V A; Conlon, J M
1993-02-01
Within the order Perissodactyla, the primary structure of insulin has been strongly conserved. Insulin from Przewalski's horse and the mountain zebra (suborder Hippomorpha) is the same as that from the domestic horse and differs from insulin from the white rhinoceros and mountain tapir (suborder Ceratomorpha) by a single substitution (Gly-->Ser) at position 9 in the A-chain. A second molecular form of Przewalski's horse insulin isolated in this study was shown to represent the gamma-ethyl ester of the Glu17 residue of the A-chain. This component was probably formed during the extraction of the pancreas with acidified ethanol. The amino acid sequence of the C-peptide of proinsulin has been less well conserved. Zebra C-peptide comprises 31 amino acid residues and differs from Przewalski's horse and domestic horse C-peptide by one substitution (Gln30-->Pro). Rhino C-peptide was isolated only in a truncated form corresponding to residues (1-23) of intact C-peptide. Its amino acid sequence contains three substitutions compared with the corresponding region of horse C-peptide. It is postulated that the substitution (Pro23-->Thr) renders rhino C-peptide more liable to proteolytic cleavage by a chymotrypsin-like enzyme than horse C-peptide. C-peptide could not be identified in the extract of tapir pancreas, suggesting that proteolytic degradation may have been more extensive than in the rhino. In contrast to the ox and pig (order Artiodactyla), there was no evidence for the expression of more than one proinsulin gene in the species of Perissodactyla examined.
Pons, Tirso; Naumoff, Daniil G; Martínez-Fleites, Carlos; Hernández, Lázaro
2004-02-15
Multiple-sequence alignment of glycoside hydrolase (GH) families 32, 43, 62, and 68 revealed three conserved blocks, each containing an acidic residue at an equivalent position in all the enzymes. A detailed analysis of the site-directed mutations so far performed on invertases (GH32), arabinanases (GH43), and bacterial fructosyltransferases (GH68) indicated a direct implication of the conserved residues Asp/Glu (block I), Asp (block II), and Glu (block III) in substrate binding and hydrolysis. These residues are close in space in the 5-bladed beta-propeller fold determined for Cellvibrio japonicus alpha-L-arabinanase Arb43A [Nurizzo et al., Nat Struct Biol 2002;9:665-668] and Bacillus subtilis endo-1,5-alpha-L-arabinanase. A sequence-structure compatibility search using 3D-PSSM, mGenTHREADER, INBGU, and SAM-T02 programs predicted indistinctly the 5-bladed beta-propeller fold of Arb43A and the 6-bladed beta-propeller fold of sialidase/neuraminidase (GH33, GH34, and GH83) as the most reliable topologies for GH families 32, 62, and 68. We conclude that the identified acidic residues are located at the active site of a beta-propeller architecture in GH32, GH43, GH62, and GH68, operating with a canonical reaction mechanism of either inversion (GH43 and likely GH62) or retention (GH32 and GH68) of the anomeric configuration. Also, we propose that the beta-propeller architecture accommodates distinct binding sites for the acceptor saccharide in glycosyl transfer reaction. Copyright 2003 Wiley-Liss, Inc.
Li, Jun Hua; Chang, Ming Xian; Xue, Na Na; Nie, P
2013-08-01
Peptidoglycan recognition proteins (PGRPs), which are evolutionarily conserved from insects to mammals, recognize bacterial peptidoglycan (PGN) and function in antibacterial innate immunity. In this study, a short-form PGRP, designated gcPGRP5, was identified from grass carp Ctenopharyngodon idella. The deduced amino acid sequence of gcPGRP5 is composed of 180 residues with a conserved PGRP domain at the C-terminus. The gcPGRP5 gene consists of four exons and three introns, spanning approximately 2.3 kb of genomic sequence. Phylogenetic analysis demonstrated that gcPGRP5 clusters with other PGRP-S identified in teleost fish. gcPGRP5 is constitutively expressed in all organs/tissues examined, and its expression was significantly induced in CIK cells treated with lipoteichoic acid (LTA), polyinosinic polycytidylic acid (Poly I:C) and PGN. Fluorescence analysis showed that gcPGRP5 is distributed in the cytoplasm of CIK cells, and cell lysates from CIK cells transfected with pTurbo-gcPGRP5-GFP and ptGFP1-gcPGRP5 plasmids display binding activity and peptidoglycan-lytic amidase activity toward Lys-PGN from Staphylococcus aureus and Dap-PGN from Bacillus subtilis. Furthermore, heat-shock protein 70 (Hsp70) and MyD88, an adaptor molecule in the Toll-like receptor pathway, showed increased expression in CIK cells overexpressing gcPGRP5. These results indicate that gcPGRP5 exhibits amidase activity and also has roles in the stress response and in the Toll-like receptor signaling pathway. Copyright © 2013 Elsevier Ltd. All rights reserved.
Characterization of Clostridium perfringens iota-toxin genes and expression in Escherichia coli.
Perelle, S; Gibert, M; Boquet, P; Popoff, M R
1993-01-01
The iota toxin, which is produced by Clostridium perfringens type E, is a binary toxin consisting of two independent polypeptides: Ia, which is an ADP-ribosyltransferase, and Ib, which is involved in the binding and internalization of the toxin into the cell. Two degenerate oligonucleotide probes deduced from partial amino acid sequence of each component of C. spiroforme toxin, which is closely related to the iota toxin, were used to clone three overlapping DNA fragments containing the iota-toxin genes from C. perfringens type E plasmid DNA. Two genes, in the same orientation, coding for Ia (387 amino acids) and Ib (875 amino acids) and separated by 243 noncoding nucleotides were identified. A predicted signal peptide was found for each component, and the secreted Ib displays two domains, the propeptide (172 amino acids) and the mature protein (664 amino acids). The Ia gene has been expressed in Escherichia coli and C. perfringens, under the control of its own promoter. The recombinant polypeptide obtained was recognized by Ia antibodies and ADP-ribosylated actin. The expression of the Ib gene was obtained in E. coli harboring a recombinant plasmid encompassing the putative promoter upstream of the Ia gene and the Ia and Ib genes. Two residues which have been found to be involved in the NAD+ binding site of diphtheria and pseudomonas toxins are conserved in the predicted Ia sequence (Glu-14 and Trp-19). The predicted amino acid Ib sequence shows 33.9% identity with and 54.4% similarity to the protective antigen of the anthrax toxin complex. In particular, the central region of Ib, which contains a predicted transmembrane segment (Leu-292 to Ser-308), presents 45% identity with the corresponding protective antigen sequence which is involved in the translocation of the toxin across the cell membrane. PMID:8225592
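The identity and similarity percentages quoted in the abstract above come from a pairwise alignment. The following is a rough sketch of how such figures can be computed from two already-aligned sequences. The residue groups used for "similar" and the choice of denominator (columns without gaps) are assumptions; the abstract does not say which scoring scheme or convention was used, and different conventions give different numbers.

```python
# Assumed conservative-substitution groups; other groupings are equally common.
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def identity_and_similarity(seq_a, seq_b):
    """Return (% identity, % similarity) over gap-free columns of a pairwise alignment."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no gap-free columns to compare")
    identical = sum(a == b for a, b in pairs)
    # Identical residues also count as similar.
    similar = sum(
        a == b or any(a in group and b in group for group in SIMILAR_GROUPS)
        for a, b in pairs
    )
    return 100 * identical / len(pairs), 100 * similar / len(pairs)


# Hypothetical aligned fragments, not taken from Ib or protective antigen.
identity, similarity = identity_and_similarity("LEKDSTA-W", "IEKESAGGW")
print(f"identity {identity:.1f}%, similarity {similarity:.1f}%")
```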
Zhu, Zhen; Liu, Chunyu; Mao, Naiying; Ji, Yixin; Wang, Huiling; Jiang, Xiaohong; Li, Chongshan; Tang, Wei; Feng, Daxing; Wang, Changyin; Zheng, Lei; Lei, Yue; Ling, Hua; Zhao, Chunfang; Ma, Yan; He, Jilan; Wang, Yan; Li, Ping; Guan, Ronghui; Zhou, Shujie; Zhou, Jianhui; Wang, Shuang; Zhang, Hong; Zheng, Huanying; Liu, Leng; Ma, Hemuti; Guan, Jing; Lu, Peishan; Feng, Yan; Zhang, Yanjun; Zhou, Shunde; Xiong, Ying; Ba, Zhuoma; Chen, Hui; Yang, Xiuhui; Bo, Fang; Ma, Yujie; Liang, Yong; Lei, Yake; Gu, Suyi; Liu, Wei; Chen, Meng; Featherstone, David; Jee, Youngmee; Bellini, William J.; Rota, Paul A.; Xu, Wenbo
2013-01-01
Background: China experienced several large measles outbreaks in the past two decades, and a series of enhanced control measures were implemented to achieve the goal of measles elimination. Molecular epidemiologic surveillance of wild-type measles viruses (MeV) provides valuable information about the viral transmission patterns. Since 1993, virologic surveillance has confirmed that viruses of a single endemic genotype, H1, have been predominantly circulating in China. A component of molecular surveillance is to monitor the genetic characteristics of the hemagglutinin (H) gene of MeV, the major target for virus neutralizing antibodies. Principal Findings: Analysis of the sequences of the complete H gene from 56 representative wild-type MeV strains circulating in China during 1993–2009 showed that the H gene sequences were clustered into 2 groups, cluster 1 and cluster 2. Cluster 1 strains were the most frequently detected and had a widespread distribution in China after 2000. The predicted amino acid sequences of the H protein were relatively conserved at most of the functionally significant amino acid positions. However, most of the genotype H1 cluster 1 viruses had an amino acid substitution (Ser240Asn), which removed a predicted N-linked glycosylation site. In addition, the substitution Pro397Leu in the hemagglutinin noose epitope (HNE) was identified in 23 of 56 strains. The evolutionary rate of the H gene of the genotype H1 viruses was estimated to be approximately 0.76 × 10^-3 substitutions per site per year, and the ratio of dN to dS (dN/dS) was <1, indicating the absence of selective pressure. Conclusions: Although H genes of the genotype H1 strains were conserved and not subjected to selective pressure, several amino acid substitutions were observed in functionally important positions. Therefore the antigenic and genetic properties of H genes of wild-type MeVs should be monitored as part of routine molecular surveillance for measles in China. PMID:24073194
Borhani Dizaji, Nahid; Basseri, Hamid Reza; Naddaf, Saied Reza; Heidari, Mansour
2014-10-25
Transmission blocking vaccines (TBVs) that target the antigens on the midgut epithelium of Anopheles mosquitoes are among the promising tools for the elimination of the malaria parasite. Characterization and analysis of effective antigens is the first step in designing TBVs. Calreticulin (CRT), a lectin-like protein from the Anopheles albimanus midgut, has shown antigenic features, suggesting a promising and novel TBV target. CRT is a highly conserved protein with similar features in vertebrates and invertebrates, including anophelines. We cloned the full-length crt gene from the malaria vector Anopheles stephensi (AsCrt) and explored the interaction of recombinant AsCrt protein, expressed in a prokaryotic system (pGEX-6p-1), with surface proteins of Plasmodium berghei ookinetes by immunofluorescence assay. The cellular localization of AsCrt was determined using the baculovirus expression system. Sequence analysis of the whole cDNA of AsCrt revealed that AsCrt contains an ORF of 1221 bp. The amino acid sequence of the AsCrt protein obtained in this study showed 64% homology with the corresponding human protein. AsCrt shares the most common features of CRTs from other species. The gene encodes a 406-amino-acid protein with a molecular mass of 46 kDa, which contains a predicted 16-amino-acid signal peptide, conserved cysteine residues, a proline-rich region, and a highly acidic C-terminal domain with the endoplasmic reticulum retrieval sequence HDEL. The production of GST-AsCrt recombinant protein was confirmed by Western blot analysis using an antibody against the GST protein. The FITC-labeled GST-AsCrt exhibited a significant interaction with P. berghei ookinete surface proteins. Purified recombinant GST-AsCrt, labeled with FITC, displayed specific binding to the surface of P. berghei ookinetes in comparison with control. Moreover, the expression of AsCrt in the baculovirus expression system indicated that AsCrt was localized on the surface of Sf9 cells.
Our results suggest that AsCrt could be utilized as a potential target for future studies in TBV area for malaria control. Copyright © 2014 Elsevier B.V. All rights reserved.
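The ORF and protein sizes reported for AsCrt in the abstract above can be cross-checked with simple arithmetic. The sketch below assumes the stated ORF length includes the stop codon, and uses the common rule of thumb of roughly 110 Da per amino acid residue for a rough mass estimate; both are assumptions, not details given in the abstract, and the function names are illustrative.

```python
AVERAGE_RESIDUE_MASS_DA = 110  # rule-of-thumb average, an approximation only


def protein_length_from_orf(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of `orf_bp` base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons


def approx_mass_kda(n_residues):
    """Rough molecular mass in kDa from residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000


residues = protein_length_from_orf(1221)  # 407 codons, one of them the stop
print(residues, round(approx_mass_kda(residues), 1))
```

Under these assumptions the 1221-bp ORF gives 406 residues, matching the abstract, and the rough mass estimate lands near the reported 46 kDa.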
Goettel, Wolfgang; Ramirez, Martha; Upchurch, Robert G; An, Yong-Qiang Charles
2016-08-01
Identification and characterization of a 254-kb genomic deletion on a duplicated chromosome segment that resulted in a low level of palmitic acid in soybean seeds using transcriptome sequencing. A large number of soybean genotypes varying in seed oil composition and content have been identified. Understanding the molecular mechanisms underlying these variations is important for breeders to effectively utilize them as a genetic resource. Through design and application of a bioinformatics approach, we identified nine co-regulated gene clusters by comparing seed transcriptomes of nine soybean genotypes varying in oil composition and content. We demonstrated that four gene clusters in the genotypes M23, Jack and N0304-303-3 coincided with large-scale genome rearrangements. The co-regulated gene clusters in M23 and Jack mapped to a previously described 164-kb deletion and a copy number amplification of the Rhg1 locus, respectively. The coordinately down-regulated gene clusters in N0304-303-3 were caused by a 254-kb deletion containing 19 genes including a fatty acyl-ACP thioesterase B gene (FATB1a). This deletion was associated with reduced palmitic acid content in seeds and was the molecular cause of a previously reported nonfunctional FATB1a allele, fap nc. The M23 and N0304-303-3 deletions were located in duplicated genome segments retained from the Glycine-specific whole genome duplication that occurred 13 million years ago. The homoeologous genes in these duplicated regions shared a strong similarity in both their encoded protein sequences and transcript accumulation levels, suggesting that they may have conserved and important functions in seeds. The functional conservation of homoeologous genes may result in genetic redundancy and gene dosage effects for their associated seed traits, explaining why the large deletion did not cause lethal effects or completely eliminate palmitic acid in N0304-303-3.
A strategy for detecting the conservation of folding-nucleus residues in protein superfamilies.
Michnick, S W; Shakhnovich, E
1998-01-01
Nucleation-growth theory predicts that fast-folding peptide sequences fold to their native structure via structures in a transition-state ensemble that share a small number of native contacts (the folding nucleus). Experimental and theoretical studies of proteins suggest that residues participating in folding nuclei are conserved among homologs. We attempted to determine if this is true in proteins with highly diverged sequences but identical folds (superfamilies). We describe a strategy based on comparisons of residue conservation in natural superfamily sequences with simulated sequences (generated with a Monte-Carlo sequence design strategy) for the same proteins. The basic assumptions of the strategy were that natural sequences will conserve residues needed for folding and stability plus function, the simulated sequences contain no functional conservation, and nucleus residues make native contacts with each other. Based on these assumptions, we identified seven potential nucleus residues in ubiquitin superfamily members. Non-nucleus conserved residues were also identified; these are proposed to be involved in stabilizing native interactions. We found that all superfamily members conserved the same potential nucleus residue positions, except those for which the structural topology is significantly different. Our results suggest that the conservation of the nucleus of a specific fold can be predicted by comparing designed simulated sequences with natural highly diverged sequences that fold to the same structure. We suggest that such a strategy could be used to help plan protein folding and design experiments, to identify new superfamily members, and to subdivide superfamilies further into classes having a similar folding mechanism.
Tan, Yung-Chie; Ang, Cheng-Liang; Wong, Mui-Yun; Ho, Chai-Ling
2016-01-01
Plant defensins are plant defence peptides that have many different biological activities, including antifungal, antimicrobial, and insecticidal activities. A cDNA (EgDFS) encoding defensin was isolated from Elaeis guineensis. The open reading frame of EgDFS contained 231 nucleotides encoding a 71-amino acid protein with a predicted molecular weight of 8.69 kDa and a potential signal peptide. The eight highly conserved cysteine sites in plant defensins were also conserved in EgDFS. The EgDFS sequence lacking 30 amino acid residues at its N-terminus (EgDFSm) was cloned into Escherichia coli BL21 (DE3) pLysS and successfully expressed as a soluble recombinant protein. The recombinant EgDFSm was found to be a thermally stable peptide that inhibited the growth of G. boninense, possibly by inhibiting starch assimilation. The role of EgDFSm in the oil palm defence system against infection by the pathogen G. boninense is discussed.
Zhou, Zhanping; Zhao, Shuangzhi; Liu, Yang; Chang, Zhengying; Ma, Yanhe; Li, Jian; Song, Jiangning
2016-11-01
The chitosanase from Bacillus sp. TS (CsnTS) is an enzyme belonging to the glycoside hydrolase family 8. The sequence of CsnTS shares 98% identity with the chitosanase from Bacillus sp. K17. Crystallographic analysis and site-directed mutagenesis of the chitosanase from Bacillus sp. K17 identified the important residues involved in the catalytic interaction and substrate binding. However, despite progress in understanding the catalytic mechanism of chitosanases from family GH8, the functional roles of some residues that are highly conserved throughout this family have not been fully elucidated. This study focused on one of these residues, the aspartic acid residue at position 318. We found that mutation of Asp318 to any residue other than asparagine resulted in significant loss of enzyme activity. In-depth investigations showed that mutation of this residue not only impaired enzymatic activity but also affected substrate binding. Taken together, our results showed that Asp318 plays an important role in CsnTS activity.
Nguyen, Tuan; Ruan, Zheng; Oruganty, Krishnadev; Kannan, Natarajan
2015-01-01
Mitogen activated protein kinases (MAPKs) form a closely related family of kinases that control critical pathways associated with cell growth and survival. Although MAPKs have been extensively characterized at the biochemical, cellular, and structural level, an integrated evolutionary understanding of how MAPKs differ from other closely related protein kinases is currently lacking. Here, we perform statistical sequence comparisons of MAPKs and related protein kinases to identify sequence and structural features associated with MAPK functional divergence. We show, for the first time, that virtually all MAPK-distinguishing sequence features, including an unappreciated short insert segment in the β4-β5 loop, physically couple distal functional sites in the kinase domain to the D-domain peptide docking groove via the C-terminal flanking tail (C-tail). The coupling mediated by MAPK-specific residues confers an allosteric regulatory mechanism unique to MAPKs. In particular, the regulatory αC-helix conformation is controlled by a MAPK-conserved salt bridge interaction between an arginine in the αC-helix and an acidic residue in the C-tail. The salt-bridge interaction is modulated in unique ways in individual sub-families to achieve regulatory specificity. Our study is consistent with a model in which the C-tail co-evolved with the D-domain docking site to allosterically control MAPK activity. Our study provides testable mechanistic hypotheses for biochemical characterization of MAPK-conserved residues and new avenues for the design of allosteric MAPK inhibitors. PMID:25799139
Driggers, Camden M.; Hartman, Steven J.; Karplus, P. Andrew
2015-01-01
In some bacteria, cysteine is converted to cysteine sulfinic acid by cysteine dioxygenases (CDO) that are only ~15–30% identical in sequence to mammalian CDOs. Among bacterial proteins having this range of sequence similarity to mammalian CDO are some that conserve an active site Arg residue (“Arg-type” enzymes) and some having a Gln substituted for this Arg (“Gln-type” enzymes). Here, we describe a structure from each of these enzyme types by analyzing structures originally solved by structural genomics groups but not published: a Bacillus subtilis “Arg-type” enzyme that has cysteine dioxygenase activity (BsCDO), and a Ralstonia eutropha “Gln-type” CDO homolog of uncharacterized activity (ReCDOhom). The BsCDO active site is well conserved with mammalian CDO, and a cysteine complex captured in the active site confirms that the cysteine binding mode is also similar. The ReCDOhom structure reveals a new active site Arg residue that is hydrogen bonding to an iron-bound diatomic molecule we have interpreted as dioxygen. Notably, the Arg position is not compatible with the mode of Cys binding seen in both rat CDO and BsCDO. As sequence alignments show that this newly discovered active site Arg is well conserved among “Gln-type” CDO enzymes, we conclude that the “Gln-type” CDO homologs are not authentic CDOs but will have substrate specificity more similar to 3-mercaptopropionate dioxygenases.
Hambly, Emma; Tétart, Francoise; Desplats, Carine; Wilson, William H.; Krisch, Henry M.; Mann, Nicholas H.
2001-01-01
Sequence analysis of a 10-kb region of the genome of the marine cyanomyovirus S-PM2 reveals a homology to coliphage T4 that extends as a contiguous block from gene (g)18 to g23. The order of the S-PM2 genes in this region is similar to that of T4, but there are insertions and deletions of small ORFs of unknown function. In T4, g18 codes for the tail sheath, g19, the tail tube, g20, the head portal protein, g21, the prohead core protein, g22, a scaffolding protein, and g23, the major capsid protein. Thus, the entire module that determines the structural components of the phage head and contractile tail is conserved between T4 and this cyanophage. The significant differences in the morphology of these phages must reflect the considerable divergence of the amino acid sequence of their homologous virion proteins, which uniformly exceeds 50%. We suggest that their enormous diversity in the sea could be a result of genetic shuffling between disparate phages mediated by such commonly shared modules. These conserved sequences could facilitate genetic exchange by providing partially homologous substrates for recombination between otherwise divergent phage genomes. Such a mechanism would thus expand the pool of phage genes accessible by recombination to all those phages that share common modules. PMID:11553768
Woyda-Ploszczyca, Andrzej M; Jarmuszkiewicz, Wieslawa
2017-01-01
Uncoupling proteins (UCPs) belong to the mitochondrial anion carrier protein family and mediate regulated proton leak across the inner mitochondrial membrane. Free fatty acids, aldehydes such as hydroxynonenal, and retinoids activate UCPs. However, whether retinoids and aldehydes are effective on their own remains controversial; thus, only free fatty acids are commonly accepted as positive effectors of UCPs. Purine nucleotides such as GTP inhibit UCP-mediated mitochondrial proton leak. In turn, membranous coenzyme Q may play a role as a redox state-dependent metabolic sensor that modulates the complete activation/inhibition of UCPs. Such regulation has been observed for UCPs in microorganisms, plant and animal UCP1 homologues, and UCP1 in mammalian brown adipose tissue. The origin of UCPs is still under debate, but UCP homologues have been identified in all systematic groups of eukaryotes. Despite the differing levels of amino acid/DNA sequence similarity, functional studies in unicellular and multicellular organisms, from amoebae to mammals, suggest that the mechanistic regulation of UCP activity is evolutionarily well conserved. This review focuses on the regulatory feedback loops of UCPs involving free fatty acids, aldehydes, retinoids, purine nucleotides, and coenzyme Q (particularly its reduction level), which may derive from the early stages of evolution when UCPs first emerged. Copyright © 2016 Elsevier B.V. All rights reserved.