Science.gov

Sample records for acid sequence conservation

  1. The Chinese hamster Alu-equivalent sequence: a conserved highly repetitious, interspersed deoxyribonucleic acid sequence in mammals has a structure suggestive of a transposable element.

    PubMed Central

    Haynes, S R; Toomey, T P; Leinwand, L; Jelinek, W R

    1981-01-01

    A consensus sequence has been determined for a major interspersed deoxyribonucleic acid repeat in the genome of Chinese hamster ovary cells (CHO cells). This sequence is extensively homologous to (i) the human Alu sequence (P. L. Deininger et al., J. Mol. Biol., in press), (ii) the mouse B1 interspersed repetitious sequence (Krayev et al., Nucleic Acids Res. 8:1201-1215, 1980), (iii) an interspersed repetitious sequence from African green monkey deoxyribonucleic acid (Dhruva et al., Proc. Natl. Acad. Sci. U.S.A. 77:4514-4518, 1980) and (iv) the CHO and mouse 4.5S ribonucleic acid (this report; F. Harada and N. Kato, Nucleic Acids Res. 8:1273-1285, 1980). Because the CHO consensus sequence shows significant homology to the human Alu sequence it is termed the CHO Alu-equivalent sequence. A conserved structure surrounding CHO Alu-equivalent family members can be recognized. It is similar to that surrounding the human Alu and the mouse B1 sequences, and is represented as follows: direct repeat-CHO-Alu-A-rich sequence-direct repeat. A composite interspersed repetitious sequence has been identified. Its structure is represented as follows: direct repeat-residue 47 to 107 of CHO-Alu-non-Alu repetitious sequence-A-rich sequence-direct repeat. Because the Alu flanking sequences resemble those that flank known transposable elements, we think it likely that the Alu sequence dispersed throughout the mammalian genome by transposition. PMID:9279371

  2. Conservation of Shannon's redundancy for proteins. [information theory applied to amino acid sequences]

    NASA Technical Reports Server (NTRS)

    Gatlin, L. L.

    1974-01-01

    Concepts of information theory are applied to examine various proteins in terms of their redundancy in natural originators such as animals and plants. The Monte Carlo method is used to derive information parameters for random protein sequences. Real protein sequence parameters are compared with the standard parameters of protein sequences having a specific length. The tendency of a chain to contain some amino acids more frequently than others and the tendency of a chain to contain certain amino acid pairs more frequently than other pairs are used as randomness measures of individual protein sequences. Non-periodic proteins are generally found to have random Shannon redundancies except in cases of constraints due to short chain length and genetic codes. Redundant characteristics of highly periodic proteins are discussed. A degree of periodicity parameter is derived.
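
    The two randomness measures named in this abstract (unequal residue frequencies and unequal residue-pair frequencies) can be sketched with Shannon's standard entropy definitions. This is a minimal illustration and not the report's own procedure: the function names, the 20-letter alphabet constant, and the shuffle-based Monte Carlo baseline are assumptions made for the example.

```python
import math
import random
from collections import Counter

ALPHABET_SIZE = 20  # assumed: the 20 standard amino acids


def entropy(counts):
    """Shannon entropy in bits of a Counter of observed symbols."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def redundancy_measures(seq):
    """Return (d1, d2, redundancy) for one amino acid sequence.

    d1: departure from equal residue frequencies.
    d2: departure from independence of neighbouring residues.
    redundancy: (d1 + d2) / log2(ALPHABET_SIZE).
    """
    h_max = math.log2(ALPHABET_SIZE)
    h_single = entropy(Counter(seq))
    # Conditional entropy of a residue given its predecessor:
    # H(pair) - H(first residue of the pair).
    h_cond = entropy(Counter(zip(seq, seq[1:]))) - entropy(Counter(seq[:-1]))
    d1 = h_max - h_single
    d2 = h_single - h_cond
    return d1, d2, (d1 + d2) / h_max


def shuffled_baseline(seq, trials=200, seed=0):
    """Monte Carlo mean of d2 over random shuffles with the same composition."""
    rng = random.Random(seed)
    residues = list(seq)
    total = 0.0
    for _ in range(trials):
        rng.shuffle(residues)
        total += redundancy_measures("".join(residues))[1]
    return total / trials
```

    Comparing a real chain's d2 with `shuffled_baseline` matters because short chains give a non-zero d2 even when residue order is random, which is one way to read the abstract's remark about constraints due to short chain length.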

  3. Nucleotide sequence of the capsid protein gene of two serotypes of San Miguel sea lion virus: identification of conserved and non-conserved amino acid sequences among calicivirus capsid proteins.

    PubMed

    Neill, J D

    1992-07-01

    The San Miguel sea lion viruses, members of the calicivirus family, are closely related to the vesicular exanthema of swine viruses which can cause severe disease in swine. In order to begin the molecular characterization of these viruses, the nucleotide sequence of the capsid protein gene of two San Miguel sea lion viruses (SMSV), serotypes 1 and 4, was determined. The coding sequences for the capsid precursor protein were located within the 3' terminal 2620 bases of the genomic RNAs of both viruses. The encoded capsid precursor proteins were 79,500 and 77,634 Da for SMSV 1 and SMSV 4, respectively. The SMSV 1 protein was 47.7% and SMSV 4 was 48.6% homologous to the feline calicivirus (FCV) capsid precursor protein while the two SMSV capsid precursors were 73% homologous to each other. Six distinct regions within the capsid precursors (denoted as regions A-F) were identified based on amino acid sequence alignment analysis of the two SMSV serotypes with FCV and the rabbit hemorrhagic disease virus (RHDV) capsid protein. Three regions showed similarity among all four viruses (regions B, D and F) and one region showed a very high degree of homology between the SMSV serotypes but only limited similarity with FCV (region A). RHDV contained only a truncated region A. A fifth region, consisting of approximately 100 residues, was not conserved among any of the viruses (region E) and, in SMSV, may contain the serotype-specific determinants. Another small region (region C) contained between 15 and 27 amino acids and showed little sequence conservation. Region B showed the highest degree of conservation among the four viruses and contained the residues which had homology to the picornavirus VP3 structural protein. An open reading frame, found in the 3' terminal 514 bases of the SMSV genomes, encoded small proteins (12,575 and 12,522 Da, respectively for SMSV 1 and SMSV 4) of which 32% of the conserved amino acids were basic residues, implying a possible nucleic acid

  4. Evolutionarily conserved sequences on human chromosome 21

    SciTech Connect

    Frazer, Kelly A.; Sheehan, John B.; Stokowski, Renee P.; Chen, Xiyin; Hosseini, Roya; Cheng, Jan-Fang; Fodor, Stephen P.A.; Cox, David R.; Patil, Nila

    2001-09-01

    Comparison of human sequences with the DNA of other mammals is an excellent means of identifying functional elements in the human genome. Here we describe the utility of high-density oligonucleotide arrays as a rapid approach for comparing human sequences with the DNA of multiple species whose sequences are not presently available. High-density arrays representing approximately 22.5 Mb of nonrepetitive human chromosome 21 sequence were synthesized and then hybridized with mouse and dog DNA to identify sequences conserved between humans and mice (human-mouse elements) and between humans and dogs (human-dog elements). Our data show that sequence comparison of multiple species provides a powerful empiric method for identifying actively conserved elements in the human genome. A large fraction of these evolutionarily conserved elements are present in regions on chromosome 21 that do not encode known genes.

  5. Sequence Fingerprints of MicroRNA Conservation

    PubMed Central

    Shi, Bing; Gao, Wei; Wang, Juan

    2012-01-01

    It is known that the conservation of protein-coding genes is associated with their sequences in various species, such as animals and plants. However, the association between microRNA (miRNA) conservation and their sequences in various species remains unexplored. Here we report the association of miRNA conservation with its sequence features, such as base content and cleavage sites, suggesting that miRNA sequences contain the fingerprints for miRNA conservation. More interestingly, different species show different and even opposite patterns between miRNA conservation and sequence features. For example, mammalian miRNAs show a positive/negative correlation between conservation and AU/GC content, whereas plant miRNAs show a negative/positive correlation between conservation and AU/GC content. Further analysis puts forward the hypothesis that the introns of protein-coding genes may be a main driving force for the origin and evolution of mammalian miRNAs. At the 5′ end, conserved miRNAs have a preference for base U, while less-conserved miRNAs have a preference for a non-U base in mammals. This difference does not exist in insects and plants, in which both conserved miRNAs and less-conserved miRNAs have a preference for base U at the 5′ end. We further revealed that the non-U preference at the 5′ end of less-conserved mammalian miRNAs is associated with miRNA function diversity, which may have evolved from the pressure of a highly sophisticated environmental stimulus the mammals encountered during evolution. These results indicated that miRNA sequences contain the fingerprints for conservation, and these fingerprints vary according to species. More importantly, the results suggest that although species share common mechanisms by which miRNAs originate and evolve, mammals may develop a novel mechanism for miRNA origin and evolution. In addition, the fingerprint found in this study can be a predictor of miRNA conservation, and the findings are helpful in achieving a

  6. Conservation of sequence in recombination signal sequence spacers.

    PubMed Central

    Ramsden, D A; Baetz, K; Wu, G E

    1994-01-01

    The variable domains of immunoglobulins and T cell receptors are assembled through the somatic, site specific recombination of multiple germline segments (V, D, and J segments) or V(D)J rearrangement. The recombination signal sequence (RSS) is necessary and sufficient for cell type specific targeting of the V(D)J rearrangement machinery to these germline segments. Previously, the RSS has been described as possessing both a conserved heptamer and a conserved nonamer motif. The heptamer and nonamer motifs are separated by a 'spacer' that was not thought to possess significant sequence conservation; however, the length of the spacer could be either 12 +/- 1 bp or 23 +/- 1 bp. In this report we have assembled and analyzed an extensive database of published RSS. We have derived, through extensive consensus comparison, a more detailed description of the RSS than has previously been reported. Our analysis indicates that RSS spacers possess significant conservation of sequence, and that the conserved sequence in 12 bp spacers is similar to the conserved sequence in the first half of 23 bp spacers. PMID:8208601
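
    A consensus comparison of this kind can be illustrated by counting bases column by column across equal-length, pre-aligned spacers. The sketch below is only an assumption about how such a tally might look: `column_consensus` is a hypothetical helper, and the four 12 bp spacers are invented for illustration and are not taken from the published RSS database.

```python
from collections import Counter


def column_consensus(aligned):
    """Return [(most common base, its frequency)] for each alignment column."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*aligned):
        base, count = Counter(column).most_common(1)[0]
        result.append((base, count / len(aligned)))
    return result


# Hypothetical 12 bp spacers, invented for illustration only.
toy_spacers = [
    "CTACAGACTGGA",
    "CTGCAGACTGAA",
    "ATACAGCCTGGA",
    "CTACATACTGGT",
]
consensus = "".join(base for base, _ in column_consensus(toy_spacers))
```

    Columns whose top frequency stays well above what base composition alone would give are the candidates for conserved spacer positions; the same tally applied to the first 12 columns of 23 bp spacers would allow the comparison described in the abstract.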

  7. Sequence conservation on the Y chromosome

    SciTech Connect

    Gibson, L.H.; Yang-Feng, L.; Lau, C.

    1994-09-01

    The Y chromosome is present in all mammals and is considered to be essential to sex determination. Despite intense genomic research, only a few genes have been identified and mapped to this chromosome in humans. Several of them, such as SRY and ZFY, have been demonstrated to be conserved and Y-located in other mammals. In order to address the issue of sequence conservation on the Y chromosome, we performed fluorescence in situ hybridization (FISH) with DNA from a human Y cosmid library as a probe to study the Y chromosomes from other mammalian species. Total DNA from 3,000-4,500 cosmid pools was labeled with biotinylated-dUTP and hybridized to metaphase chromosomes. For human and primate preparations, human cot1 DNA was included in the hybridization mixture to suppress the hybridization from repeat sequences. FISH signals were detected on the Y chromosomes of human, gorilla, orangutan and baboon (Old World monkey) and were absent on those of squirrel monkey (New World monkey), Indian muntjac, wood lemming, Chinese hamster, rat and mouse. Since sequence analysis suggested that specific genes, e.g. SRY and ZFY, are conserved between these two groups, the lack of detectable hybridization in the latter group implies either that conservation of the human Y sequences is limited to the Y chromosomes of the great apes and Old World monkeys, or that the size of the syntenic segment is too small to be detected under the resolution of FISH, or that homologous sequences have undergone considerable divergence. Further studies with reduced hybridization stringency are currently being conducted. Our results provide some clues as to Y-sequence conservation across species and demonstrate the limitations of FISH across species with total DNA sequences from a particular chromosome.

  8. Conserved Sequence Preferences Contribute to Substrate Recognition by the Proteasome

    PubMed Central

    Yu, Houqing; Singh Gautam, Amit K.; Wilmington, Shameika R.; Wylie, Dennis; Martinez-Fonts, Kirby; Kago, Grace; Warburton, Marie; Chavali, Sreenivas; Inobe, Tomonao; Finkelstein, Ilya J.; Babu, M. Madan

    2016-01-01

    The proteasome has pronounced preferences for the amino acid sequence of its substrates at the site where it initiates degradation. Here, we report that modulating these sequences can tune the steady-state abundance of proteins over 2 orders of magnitude in cells. This is the same dynamic range as seen for inducing ubiquitination through a classic N-end rule degron. The stability and abundance of His3 constructs dictated by the initiation site affect survival of yeast cells and show that variation in proteasomal initiation can affect fitness. The proteasome's sequence preferences are linked directly to the affinity of the initiation sites to their receptor on the proteasome and are conserved between Saccharomyces cerevisiae, Schizosaccharomyces pombe, and human cells. These findings establish that the sequence composition of unstructured initiation sites influences protein abundance in vivo in an evolutionarily conserved manner and can affect phenotype and fitness. PMID:27226608

  9. Conserved Noncoding Sequences in the Grasses

    PubMed Central

    Inada, Dan Choffnes; Bashir, Ali; Lee, Chunghau; Thomas, Brian C.; Ko, Cynthia; Goff, Stephen A.; Freeling, Michael

    2003-01-01

    As orthologous genes from related species diverge over time, some sequences are conserved in noncoding regions. In mammals, large phylogenetic footprints, or conserved noncoding sequences (CNSs), are known to be common features of genes. Here we present the first large-scale analysis of plant genes for CNSs. We used maize and rice, maximally diverged members of the grass family of monocots. Using a local sequence alignment set to deliver only significant alignments, we found one or more CNSs in the noncoding regions of the majority of genes studied. Grass genes have dramatically fewer and much smaller CNSs than mammalian genes. Twenty-seven percent of grass gene comparisons revealed no CNSs. Genes functioning in upstream regulatory roles, such as transcription factors, are greatly enriched for CNSs relative to genes encoding enzymes or structural proteins. Further, we show that a CNS cluster in an intron of the knotted1 homeobox gene serves as a site of negative regulation. We show that CNSs in the adh1 gene do not correlate with known cis-acting sites. We discuss the potential meanings of CNSs and their value as analytical tools and evolutionary characters. We advance the idea that many CNSs function to lock-in gene regulatory decisions. PMID:12952874

  10. Composition for nucleic acid sequencing

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2008-08-26

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  11. The highly conserved amino acid sequence motif Tyr-Gly-Asp-Thr-Asp-Ser in alpha-like DNA polymerases is required by phage phi 29 DNA polymerase for protein-primed initiation and polymerization.

    PubMed Central

    Bernad, A; Lázaro, J M; Salas, M; Blanco, L

    1990-01-01

    The alpha-like DNA polymerases from bacteriophage phi 29 and other viruses, prokaryotes and eukaryotes contain an amino acid consensus sequence that has been proposed to form part of the dNTP binding site. We have used site-directed mutants to study five of the six highly conserved consecutive amino acids corresponding to the most conserved C-terminal segment (Tyr-Gly-Asp-Thr-Asp-Ser). Our results indicate that in phi 29 DNA polymerase this consensus sequence, although irrelevant for the 3'→5' exonuclease activity, is essential for initiation and elongation. Based on these results and on its homology with known or putative metal-binding amino acid sequences, we propose that in phi 29 DNA polymerase the Tyr-Gly-Asp-Thr-Asp-Ser consensus motif is part of the dNTP binding site, involved in the synthetic activities of the polymerase (i.e., initiation and polymerization), and that it is involved particularly in the metal binding associated with the dNTP site. PMID:2191296

  12. Conservation of cysteine residues in fungal histidine acid phytases.

    PubMed

    Mullaney, Edward J; Ullah, Abul H J

    2005-03-11

    Amino acid sequence analysis of fungal histidine acid phosphatases displaying phytase activity has revealed a conserved eight-cysteine motif. These conserved amino acids are not directly associated with catalytic function; rather they appear to be essential in the formation of disulfide bridges. Their role is seen as being similar to another eight-cysteine motif recently reported in the amino acid sequence of nearly 500 plant polypeptides. An additional disulfide bridge formed by two cysteines at the N-terminus of all the filamentous ascomycete phytases was also observed. Disulfide bridges are known to increase both stability and heat tolerance in proteins. It is therefore plausible that this extra disulfide bridge contributes to the higher stability found in phytase from some Aspergillus species. To engineer an enhanced phytase for the feed industry, it is imperative that the role of disulfide bridges be taken into cognizance and possibly be increased in number to further elevate stability in this enzyme.

  13. High speed nucleic acid sequencing

    SciTech Connect

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid. Each type of labeled nucleotide comprises an acceptor fluorophore attached to a phosphate portion of the nucleotide such that the fluorophore is removed upon incorporation into a growing strand. Fluorescent signal is emitted via fluorescent resonance energy transfer between the donor fluorophore and the acceptor fluorophore as each nucleotide is incorporated into the growing strand. The sequence is deduced by identifying which base is being incorporated into the growing strand.

  14. Functionally conserved enhancers with divergent sequences in distant vertebrates

    DOE PAGES

    Yang, Song; Oksenberg, Nir; Takayama, Sachiko; ...

    2015-10-30

    To examine the contributions of sequence and function conservation in the evolution of enhancers, we systematically identified enhancers whose sequences are not conserved among distant groups of vertebrate species, but have homologous function and are likely to be derived from a common ancestral sequence. In conclusion, our approach combined comparative genomics and epigenomics to identify potential enhancer sequences in the genomes of three groups of distantly related vertebrate species.

  15. Nucleotide sequence conservation in paramyxoviruses; the concept of codon constellation.

    PubMed

    Rima, Bert K

    2015-05-01

    The stability and conservation of the sequences of RNA viruses in the field and the high error rates measured in vitro are paradoxical. The field stability indicates that there are very strong selective constraints on sequence diversity. The nature of these constraints is discussed. Apart from constraints on variation in cis-acting RNA and the amino acid sequences of viral proteins, there are other ones relating to the presence of specific dinucleotides such as CpG and UpA as well as the importance of RNA secondary structures and RNA degradation rates. Other constraints recently identified in other RNA viruses, such as effects of secondary RNA structure on protein folding or modification of cellular tRNA complements, are also discussed. Using the family Paramyxoviridae, I show that the codon usage pattern (CUP) is (i) specific for each virus species and (ii) that it is markedly different from the host - it does not vary even in vaccine viruses that have been derived by passage in a number of inappropriate host cells. The CUP might thus be an additional constraint on variation, and I propose the concept of codon constellation to indicate the informational content of the sequences of RNA molecules relating not only to stability and structure but also to the efficiency of translation of a viral mRNA resulting from the CUP and the numbers and position of rare codons.
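
    One common way to summarize a codon usage pattern is the share of each codon among the synonymous codons for the same amino acid. The sketch below assumes that summary and the standard genetic code; it is not necessarily the measure used in the article, and `codon_usage` is a hypothetical helper name.

```python
from collections import Counter, defaultdict

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codons enumerated in TCAG order map onto AMINO.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_usage(coding_seq):
    """Fraction of each codon among the codons used for the same amino acid."""
    # RNA input is accepted; codons are reported in DNA letters (U -> T).
    seq = coding_seq.upper().replace("U", "T")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    # Codons containing ambiguous bases are skipped.
    counts = Counter(c for c in codons if c in CODON_TABLE)
    per_amino = defaultdict(int)
    for codon, n in counts.items():
        per_amino[CODON_TABLE[codon]] += n
    return {codon: n / per_amino[CODON_TABLE[codon]] for codon, n in counts.items()}
```

    Computing this table for each open reading frame of a virus and for a set of host genes would give the kind of species-by-species comparison the abstract describes.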

  16. A Developmental Sequence of Skills Leading to Conservation

    ERIC Educational Resources Information Center

    Walker, Alice A.

    1978-01-01

    Examines the developmental sequence of skills involved in the understanding of relational concepts and in the development of conservation. Fifty kindergarten children participated in the study. (BD/BR)

  17. Housekeeping genes tend to show reduced upstream sequence conservation

    PubMed Central

    Farré, Domènec; Bellora, Nicolás; Mularoni, Loris; Messeguer, Xavier; Albà, M Mar

    2007-01-01

    Background: Understanding the constraints that operate in mammalian gene promoter sequences is of key importance to understand the evolution of gene regulatory networks. The level of promoter conservation varies greatly across orthologous genes, denoting differences in the strength of the evolutionary constraints. Here we test the hypothesis that the number of tissues in which a gene is expressed is related in a significant manner to the extent of promoter sequence conservation. Results: We show that mammalian housekeeping genes, expressed in all or nearly all tissues, show significantly lower promoter sequence conservation, especially upstream of position -500 with respect to the transcription start site, than genes expressed in a subset of tissues. In addition, we evaluate the effect of gene function, CpG island content and protein evolutionary rate on promoter sequence conservation. Finally, we identify a subset of transcription factors that bind to motifs that are specifically over-represented in housekeeping gene promoters. Conclusion: This is the first report that shows that the promoters of housekeeping genes show reduced sequence conservation with respect to genes expressed in a more tissue-restricted manner. This is likely to be related to simpler gene expression, requiring a smaller number of functional cis-regulatory motifs. PMID:17626644

  18. Highly conserved repetitive DNA sequences are present at human centromeres.

    PubMed Central

    Grady, D L; Ratliff, R L; Robinson, D L; McCanlies, E C; Meyne, J; Moyzis, R K

    1992-01-01

    Highly conserved repetitive DNA sequence clones, largely consisting of (GGAAT)n repeats, have been isolated from a human recombinant repetitive DNA library by high-stringency hybridization with rodent repetitive DNA. This sequence, the predominant repetitive sequence in human satellites II and III, is similar to the essential core DNA of the Saccharomyces cerevisiae centromere, centromere DNA element (CDE) III. In situ hybridization to human telophase and Drosophila polytene chromosomes shows localization of the (GGAAT)n sequence to centromeric regions. Hyperchromicity studies indicate that the (GGAAT)n sequence exhibits unusual hydrogen bonding properties. The purine-rich strand alone has the same thermal stability as the duplex. Hyperchromicity studies of synthetic DNA variants indicate that all sequences with the composition (AATGN)n exhibit this unusual thermal stability. DNA-mobility-shift assays indicate that specific HeLa-cell nuclear proteins recognize this sequence with a relative affinity greater than 10^5. The extreme evolutionary conservation of this DNA sequence, its centromeric location, its unusual hydrogen bonding properties, its high affinity for specific nuclear proteins, and its similarity to functional centromeres isolated from yeast suggest that this sequence may be a component of the functional human centromere. PMID:1542662
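
    Runs of a tandem repeat such as (GGAAT)n can be located in a sequence string with a regular expression. The sketch below is an illustration only: the function name and the `min_copies` threshold are assumptions, and it scans the given strand without checking the reverse complement.

```python
import re


def find_tandem_repeats(seq, unit="GGAAT", min_copies=3):
    """Return (start, end, copies) for each run of >= min_copies tandem units."""
    pattern = re.compile("(?:%s){%d,}" % (re.escape(unit), min_copies))
    return [
        (m.start(), m.end(), (m.end() - m.start()) // len(unit))
        for m in pattern.finditer(seq.upper())
    ]
```

    Positions are 0-based with an exclusive end, matching Python slicing, so `seq[start:end]` returns the repeat run itself.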

  19. Chip-based sequencing nucleic acids

    DOEpatents

    Beer, Neil Reginald

    2014-08-26

    A system for fast DNA sequencing by amplification of genetic material within microreactors, denaturing, demulsifying, and then sequencing the material, while retaining it in a PCR/sequencing zone by a magnetic field. One embodiment includes sequencing nucleic acids on a microchip that includes a microchannel flow channel in the microchip. The nucleic acids are isolated and hybridized to magnetic nanoparticles or to magnetic polystyrene-coated beads. Microreactor droplets are formed in the microchannel flow channel. The microreactor droplets containing the nucleic acids and the magnetic nanoparticles are retained in a magnetic trap in the microchannel flow channel and sequenced.

  20. Multigenome DNA sequence conservation identifies Hox cis-regulatory elements

    PubMed Central

    Kuntz, Steven G.; Schwarz, Erich M.; DeModena, John A.; De Buysscher, Tristan; Trout, Diane; Shizuya, Hiroaki; Sternberg, Paul W.; Wold, Barbara J.

    2008-01-01

    To learn how well ungapped sequence comparisons of multiple species can predict cis-regulatory elements in Caenorhabditis elegans, we made such predictions across the large, complex ceh-13/lin-39 locus and tested them transgenically. We also examined how prediction quality varied with different genomes and parameters in our comparisons. Specifically, we sequenced ∼0.5% of the C. brenneri and C. sp. 3 PS1010 genomes, and compared five Caenorhabditis genomes (C. elegans, C. briggsae, C. brenneri, C. remanei, and C. sp. 3 PS1010) to find regulatory elements in 22.8 kb of noncoding sequence from the ceh-13/lin-39 Hox subcluster. We developed the MUSSA program to find ungapped DNA sequences with N-way transitive conservation, applied it to the ceh-13/lin-39 locus, and transgenically assayed 21 regions with both high and low degrees of conservation. This identified 10 functional regulatory elements whose activities matched known ceh-13/lin-39 expression, with 100% specificity and a 77% recovery rate. One element was so well conserved that a similar mouse Hox cluster sequence recapitulated the native nematode expression pattern when tested in worms. Our findings suggest that ungapped sequence comparisons can predict regulatory elements genome-wide. PMID:18981268
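
    The idea of ungapped, N-way transitive conservation can be sketched as follows: a set of same-width windows, one per sequence, counts as a hit only if every pair of windows in the set meets a match threshold without gaps. This brute-force version is an assumption-laden illustration for short sequences and is not the MUSSA implementation; the function names and parameters are hypothetical.

```python
from itertools import combinations, product


def count_matches(a, b):
    """Number of identical positions between two equal-length windows."""
    return sum(x == y for x, y in zip(a, b))


def nway_windows(seqs, width, threshold):
    """Return tuples of window starts (one per sequence) in which every
    pair of windows shares at least `threshold` identical positions."""
    starts = [range(len(s) - width + 1) for s in seqs]
    hits = []
    for combo in product(*starts):
        windows = [s[i:i + width] for s, i in zip(seqs, combo)]
        if all(count_matches(u, v) >= threshold
               for u, v in combinations(windows, 2)):
            hits.append(combo)
    return hits
```

    The running time grows with the product of the sequence lengths, so a practical tool would first find pairwise hits and then chain them; the sketch keeps only the all-pairs criterion that makes a hit transitive.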

  1. The nucleotide sequence of the human int-1 mammary oncogene; evolutionary conservation of coding and non-coding sequences.

    PubMed Central

    van Ooyen, A; Kwee, V; Nusse, R

    1985-01-01

    The mouse mammary tumor virus can induce mammary tumors in mice by proviral activation of an evolutionarily conserved cellular oncogene called int-1. Here we present the nucleotide sequence of the human homologue of int-1, and compare it with the mouse gene. Like the mouse gene, the human homologue contains a reading frame of 370 amino acids, with only four substitutions. The amino acid changes are all in the hydrophobic leader domain of the int-1 encoded protein, and do not significantly alter its hydropathic index. The conservation between the mouse and the human int-1 genes is not restricted to exons; extensive parts of the introns are also homologous. Thus, int-1 ranks among the most conserved genes known, a property shared with other oncogenes. PMID:2998762

  2. Protein sequence conservation and stable molecular evolution reveals influenza virus nucleoprotein as a universal druggable target.

    PubMed

    Babar, Mustafeez Mujtaba; Zaidi, Najam-us-Sahar Sadaf

    2015-08-01

    The high mutation rate in influenza virus genome and appearance of drug resistance calls for a constant effort to identify alternate drug targets and develop new antiviral strategies. The internal proteins of the virus can be exploited as a potential target for therapeutic interventions. Among these, the nucleoprotein (NP) is the most abundant protein that provides structural and functional support to the viral replication machinery. The current study aims at analysis of protein sequence polymorphism patterns, degree of molecular evolution and sequence conservation as a function of potential druggability of nucleoprotein. We analyzed a universal set of amino acid sequences (n=22,000) and, in order to identify and correlate the functionally conserved, druggable regions across different parameters, classified them on the basis of host organism, strain type and continental region of sample isolation. The results indicated that around 95% of the sequence length was conserved, with at least 7 regions conserved across the protein among various classes. Moreover, the highly variable regions, though very limited in number, were found to be positively selected indicating, thereby, the high degree of protein stability against various hosts and spatio-temporal references. Furthermore, on mapping the conserved regions on the protein, 7 drug binding pockets in the functionally important regions of the protein were revealed. The results, therefore, collectively indicate that nucleoprotein is a highly conserved and stable viral protein that can potentially be exploited for development of broadly effective antiviral strategies.

  3. Distinguishing Proteins From Arbitrary Amino Acid Sequences

    PubMed Central

    Yau, Stephen S.-T.; Mao, Wei-Guang; Benson, Max; He, Rong Lucy

    2015-01-01

    What kinds of amino acid sequences could possibly be protein sequences? From all existing databases that we can find, known proteins are only a small fraction of all possible combinations of amino acids. Beginning with Sanger's first detailed determination of a protein sequence in 1952, previous studies have focused on describing the structure of existing protein sequences in order to construct the protein universe. No one, however, has developed a criterion for determining whether an arbitrary amino acid sequence can be a protein. Here we show that when the collection of arbitrary amino acid sequences is viewed in an appropriate geometric context, the protein sequences cluster together. This leads to a new computational test, described here, that has proved to be remarkably accurate at determining whether an arbitrary amino acid sequence can be a protein. Even more, if the results of this test indicate that the sequence can be a protein, and it is indeed a protein sequence, then its identity as a protein sequence is uniquely defined. We anticipate our computational test will be useful for those who are attempting to complete the job of discovering all proteins, or constructing the protein universe. PMID:25609314

  4. The complete amino acid sequence of prochymosin.

    PubMed Central

    Foltmann, B; Pedersen, V B; Jacobsen, H; Kauffman, D; Wybrandt, G

    1977-01-01

    The total sequence of 365 amino acid residues in bovine prochymosin is presented. Alignment with the amino acid sequence of porcine pepsinogen shows that 204 amino acid residues are common to the two zymogens. Further comparison and alignment with the amino acid sequence of penicillopepsin shows that 66 residues are located at identical positions in all three proteases. The three enzymes belong to a large group of proteases with two aspartate residues in the active center. This group forms a family derived from one common ancestor. PMID:329280

  5. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-05-30

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  6. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-06-06

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  7. Amino acid sequence of porcine spleen cathepsin D.

    PubMed Central

    Shewale, J G; Tang, J

    1984-01-01

    The amino acid sequence of porcine spleen cathepsin D heavy chain has been determined and, hence, the complete structure of this enzyme is now known. The sequence of heavy chain was constructed by aligning the structures of peptides generated by cyanogen bromide, trypsin, and endo-proteinase Lys C cleavages. The structure of the light chain has been published previously. The cathepsin D molecule contains 339 amino acid residues in two polypeptide chains: a 97-residue light chain and a 242-residue heavy chain, with a combined Mr of 36,779 (without carbohydrate). There are two carbohydrate units linked to asparagine residues 70 and 192. The disulfide bond arrangement in cathepsin D is probably similar to that of pepsin, because the positions of six half-cystine residues are conserved. The active site aspartyl residues, corresponding to aspartic acid-32 and -215 of pepsin, are located at residues 33 and 224 in the cathepsin D molecule. The amino acid sequence around these aspartyl residues is strongly conserved. Cathepsin D shows a strong homology with other acid proteases. When the sequences of cathepsin D, renin, and pepsin are aligned, 32.7% of the residues are identical. The homology is observed throughout the length of the molecules, indicating that three-dimensional structures of all three molecules are similar. PMID:6587385
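
    A figure such as "32.7% of the residues are identical" comes from counting alignment columns in which every sequence carries the same residue. The sketch below is a minimal illustration of that counting, not the procedure used in the paper: the function name, the toy sequences and the choice of denominator (all alignment columns, with gapped columns counted as non-identical) are assumptions made here.

```python
def percent_identity(aligned):
    """Percent of alignment columns in which all sequences share one residue.

    Assumption: columns containing a gap ('-') count as non-identical, and the
    denominator is the full alignment length. Other conventions exist.
    """
    if not aligned:
        raise ValueError("need at least one sequence")
    length = len(aligned[0])
    if length == 0 or any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same non-zero length")
    identical = sum(
        1
        for column in zip(*aligned)
        if "-" not in column and len(set(column)) == 1
    )
    return 100.0 * identical / length


# Hypothetical toy alignment of three sequences (not real protease data).
toy_alignment = ["MKDTAL", "MRDTAV", "MKDSAL"]
print(percent_identity(toy_alignment))
```

    The reported percentage depends on the denominator chosen (alignment length, shortest sequence, or ungapped columns only), so values from different studies are comparable only when the convention is stated.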

  8. Local Function Conservation in Sequence and Structure Space

    PubMed Central

    Weinhold, Nils; Sander, Oliver; Domingues, Francisco S.; Lengauer, Thomas; Sommer, Ingolf

    2008-01-01

    We assess the variability of protein function in protein sequence and structure space. Various regions in this space exhibit considerable difference in the local conservation of molecular function. We analyze and capture local function conservation by means of logistic curves. Based on this analysis, we propose a method for predicting molecular function of a query protein with known structure but unknown function. The prediction method is rigorously assessed and compared with a previously published function predictor. Furthermore, we apply the method to 500 functionally unannotated PDB structures and discuss selected examples. The proposed approach provides a simple yet consistent statistical model for the complex relations between protein sequence, structure, and function. The GOdot method is available online (http://godot.bioinf.mpi-inf.mpg.de). PMID:18604264

  9. Local function conservation in sequence and structure space.

    PubMed

    Weinhold, Nils; Sander, Oliver; Domingues, Francisco S; Lengauer, Thomas; Sommer, Ingolf

    2008-07-04

    We assess the variability of protein function in protein sequence and structure space. Various regions in this space exhibit considerable difference in the local conservation of molecular function. We analyze and capture local function conservation by means of logistic curves. Based on this analysis, we propose a method for predicting molecular function of a query protein with known structure but unknown function. The prediction method is rigorously assessed and compared with a previously published function predictor. Furthermore, we apply the method to 500 functionally unannotated PDB structures and discuss selected examples. The proposed approach provides a simple yet consistent statistical model for the complex relations between protein sequence, structure, and function. The GOdot method is available online (http://godot.bioinf.mpi-inf.mpg.de).

  10. Automatic identification of highly conserved family regions and relationships in genome wide datasets including remote protein sequences.

    PubMed

    Doğan, Tunca; Karaçalı, Bilge

    2013-01-01

    Identifying shared sequence segments along amino acid sequences generally requires a collection of closely related proteins, most often curated manually from the sequence datasets to suit the purpose at hand. Currently developed statistical methods are strained, however, when the collection contains remote sequences with poor alignment to the rest, or sequences containing multiple domains. In this paper, we propose a completely unsupervised and automated method to identify the shared sequence segments observed in a diverse collection of protein sequences including those present in a smaller fraction of the sequences in the collection, using a combination of sequence alignment, residue conservation scoring and graph-theoretical approaches. Since shared sequence fragments often imply conserved functional or structural attributes, the method produces a table of associations between the sequences and the identified conserved regions that can reveal previously unknown protein families as well as new members to existing ones. We evaluated the biological relevance of the method by clustering the proteins in gold standard datasets and assessing the clustering performance in comparison with previous methods from the literature. We have then applied the proposed method to a genome wide dataset of 17793 human proteins and generated a global association map to each of the 4753 identified conserved regions. Investigations on the major conserved regions revealed that they corresponded strongly to annotated structural domains. This suggests that the method can be useful in predicting novel domains on protein sequences.

  11. Internal epitope tagging informed by relative lack of sequence conservation

    PubMed Central

    Burg, Leonard; Zhang, Karen; Bonawitz, Tristan; Grajevskaja, Viktorija; Bellipanni, Gianfranco; Waring, Richard; Balciunas, Darius

    2016-01-01

    Many experimental techniques rely on specific recognition and stringent binding of proteins by antibodies. This can readily be achieved by introducing an epitope tag. We employed an approach that uses a relative lack of evolutionary conservation to inform epitope tag site selection, followed by integration of the tag-coding sequence into the endogenous locus in zebrafish. We demonstrate that an internal epitope tag is accessible for antibody binding, and that tagged proteins retain wild type function. PMID:27892520

  12. Properties of Sequence Conservation in Upstream Regulatory and Protein Coding Sequences among Paralogs in Arabidopsis thaliana

    NASA Astrophysics Data System (ADS)

    Richardson, Dale N.; Wiehe, Thomas

    Whole genome duplication (WGD) has catalyzed the formation of new species, genes with novel functions, altered expression patterns, complexified signaling pathways and has provided organisms a level of genetic robustness. We studied the long-term evolution and interrelationships of 5’ upstream regulatory sequences (URSs), protein coding sequences (CDSs) and expression correlations (EC) of duplicated gene pairs in Arabidopsis. Three distinct methods revealed significant evolutionary conservation between paralogous URSs and were highly correlated with microarray-based expression correlation of the respective gene pairs. Positional information on exact matches between sequences unveiled the contribution of micro-chromosomal rearrangements on expression divergence. A three-way rank analysis of URS similarity, CDS divergence and EC uncovered specific gene functional biases. Transcription factor activity was associated with gene pairs exhibiting conserved URSs and divergent CDSs, whereas a broad array of metabolic enzymes was found to be associated with gene pairs showing diverged URSs but conserved CDSs.

  13. Conservation patterns in different functional sequence categories of divergent Drosophila species

    SciTech Connect

    Papatsenko, Dmitri; Kislyuk, Andrey; Levine, Michael; Dubchak, Inna

    2005-10-01

    We have explored the distributions of fully conserved ungapped blocks in genome-wide pairwise alignments of recently completed species of Drosophila: D. yakuba, D. ananassae, D. pseudoobscura, D. virilis and D. mojavensis. Based on these distributions we have found that nearly every functional sequence category possesses its own distinctive conservation pattern, sometimes independent of the overall sequence conservation level. In the coding and regulatory regions, the ungapped blocks were longer than in introns, UTRs and non-functional sequences. At the same time, the blocks in the coding regions carried 3N+2 signature characteristic to synonymic substitutions in the 3rd codon positions. Larger block sizes in transcription regulatory regions can be explained by the presence of conserved arrays of binding sites for transcription factors. We also have shown that the longest ungapped blocks, or 'ultraconserved' sequences, are associated with specific gene groups, including those encoding ion channels and components of the cytoskeleton. We discussed how restrained conservation patterns may help in mapping functional sequence categories and improving genome annotation.
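
    The distribution of fully conserved ungapped blocks can be illustrated with a short sketch. It assumes a pairwise alignment given as two equal-length strings with '-' for gaps; the function name and the toy alignment are hypothetical, and this is not the authors' pipeline. A block is taken to be a maximal run of columns in which both sequences carry the same base and neither has a gap.

```python
def conserved_block_lengths(seq_a, seq_b):
    """Lengths of maximal runs of identical, gap-free alignment columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    lengths = []
    run = 0
    for a, b in zip(seq_a, seq_b):
        if a == b and a != "-":
            run += 1          # column extends the current conserved block
        else:
            if run:
                lengths.append(run)
            run = 0           # a mismatch or gap closes the block
    if run:
        lengths.append(run)
    return lengths


# Hypothetical toy alignment, not Drosophila data.
blocks = conserved_block_lengths("ACGTAC-GTTA", "ACGAACGGTTA")
print(blocks)

# Blocks whose length has the form 3N+2 (2, 5, 8, ...), the signature the
# abstract associates with substitutions at third codon positions.
print([n for n in blocks if n % 3 == 2])
```

    A block bounded on both sides by third-position substitutions spans two bases plus a whole number of codons, which is where the 3N+2 lengths come from.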

  14. Conservation patterns in angiosperm rDNA ITS2 sequences.

    PubMed Central

    Hershkovitz, M A; Zimmer, E A

    1996-01-01

    The two internal transcribed spacers (ITS1 and ITS2) of nuclear ribosomal DNA have become commonly exploited sources of informative variation for interspecific-/intergeneric-level phylogenetic analyses among angiosperms and other eukaryotes. We present an alignment in which one-third to one-half of the ITS2 sequence is alignable above the family level in angiosperms and a phenetic analysis showing that ITS2 contains information sufficient to diagnose lineages at several hierarchical levels. Base compositional analysis shows that angiosperm ITS2 is inherently GC-rich, and that the proportion of T is much more variable than that for other bases. We propose a general model of angiosperm ITS2 secondary structure that shows common pairing relationships for most of the conserved sequence tracts. Variations in our secondary structure predictions for sequences from different taxa indicate that compensatory mutation is not limited to paired positions. PMID:8760866

  15. Conservative Patch Algorithm and Mesh Sequencing for PAB3D

    NASA Technical Reports Server (NTRS)

    Pao, S. P.; Abdol-Hamid, K. S.

    2005-01-01

    A mesh-sequencing algorithm and a conservative patched-grid-interface algorithm (hereafter Patch Algorithm) have been incorporated into the PAB3D code, which is a computer program that solves the Navier-Stokes equations for the simulation of subsonic, transonic, or supersonic flows surrounding an aircraft or other complex aerodynamic shapes. These algorithms are efficient, flexible, and have added tremendously to the capabilities of PAB3D. The mesh-sequencing algorithm makes it possible to perform preliminary computations using only a fraction of the grid cells (provided the original cell count is divisible by an integer) along any grid coordinate axis, independently of the other axes. The patch algorithm addresses another critical need in multi-block grid situations where the cell faces of adjacent grid blocks may not coincide, leading to errors in calculating fluxes of conserved physical quantities across interfaces between the blocks. The patch algorithm, based on the Stokes integral formulation of the applicable conservation laws, effectively matches each of the interfacial cells on one side of the block interface to the corresponding fractional cell area pieces on the other side. This approach is comprehensive and unified such that all interface topology is automatically processed without user intervention. This algorithm is implemented in a preprocessing code that creates a cell-by-cell database that will maintain flux conservation at any level of full or reduced grid density as the user may choose by way of the mesh-sequencing algorithm. These two algorithms have enhanced the numerical accuracy of the code, reduced the time and effort for grid preprocessing, and provided users with the flexibility of performing computations at any desired full or reduced grid resolution to suit their specific computational requirements.
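
    The divisibility condition on mesh sequencing can be made concrete with a small sketch. It is an illustration under stated assumptions, not PAB3D code: the function name and the example grid dimensions are hypothetical, and only the rule stated in the abstract is modelled (each axis may be coarsened independently by an integer factor that divides its cell count).

```python
def sequenced_cell_counts(cell_counts, factors):
    """Reduced cell count per axis after coarsening each axis by its factor.

    Each axis is treated independently; a factor is accepted only if it is a
    positive integer that divides the original cell count on that axis.
    """
    if len(cell_counts) != len(factors):
        raise ValueError("need one coarsening factor per axis")
    reduced = []
    for axis, (cells, factor) in enumerate(zip(cell_counts, factors)):
        if factor < 1 or cells % factor != 0:
            raise ValueError(
                f"axis {axis}: {cells} cells not divisible by factor {factor}"
            )
        reduced.append(cells // factor)
    return tuple(reduced)


# Hypothetical block of 64 x 48 x 32 cells, coarsened by 2, 4 and 1.
print(sequenced_cell_counts((64, 48, 32), (2, 4, 1)))
```

    A factor of 1 leaves an axis at full resolution, which mirrors the abstract's point that each coordinate direction can be reduced independently of the others.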

  16. HIV-1 conserved-element vaccines: relationship between sequence conservation and replicative capacity.

    PubMed

    Rolland, Morgane; Manocheewa, Siriphan; Swain, J Victor; Lanxon-Cookson, Erinn C; Kim, Moon; Westfall, Dylan H; Larsen, Brendan B; Gilbert, Peter B; Mullins, James I

    2013-05-01

    To overcome the problem of HIV-1 variability, candidate vaccine antigens have been designed to be composed of conserved elements of the HIV-1 proteome. Such candidate vaccines could be improved with a better understanding of both HIV-1 evolutionary constraints and the fitness cost of specific mutations. We evaluated the in vitro fitness cost of 23 mutations engineered in the HIV-1 subtype B Gag-p24 Center-of-Tree (COT) protein through fitness competition assays. While some mutations at conserved sites exacted a high fitness cost, as expected under the assumption that the most conserved residue confers the highest fitness, there was no overall strong relationship between sequence conservation and replicative capacity. By comparing sites that have evolved since the beginning of the epidemic to those that have remained unchanged, we found that sites that have evolved over time were more likely to correspond to HLA-associated sites and that their mutation had limited fitness costs. Our data showed no transcendent link between high conservation and high fitness cost, indicating that merely focusing on conserved segments of HIV-1 would not be sufficient for a successful vaccine strategy. Nonetheless, a subset of sites exacted a high fitness cost upon mutation--these sites have been under selective pressure to change since the beginning of the epidemic but have proved virtually nonmutable and could constitute preferred targets for vaccine design.

  17. In Vivo Enhancer Analysis Chromosome 16 Conserved Noncoding Sequences

    SciTech Connect

    Pennacchio, Len A.; Ahituv, Nadav; Moses, Alan M.; Nobrega, Marcelo; Prabhakar, Shyam; Shoukry, Malak; Minovitsky, Simon; Visel, Axel; Dubchak, Inna; Holt, Amy; Lewis, Keith D.; Plajzer-Frick, Ingrid; Akiyama, Jennifer; De Val, Sarah; Afzal, Veena; Black, Brian L.; Couronne, Olivier; Eisen, Michael B.; Rubin, Edward M.

    2006-02-01

    The identification of enhancers with predicted specificities in vertebrate genomes remains a significant challenge that is hampered by a lack of experimentally validated training sets. In this study, we leveraged extreme evolutionary sequence conservation as a filter to identify putative gene regulatory elements and characterized the in vivo enhancer activity of human-fish conserved and ultraconserved noncoding elements on human chromosome 16 as well as such elements from elsewhere in the genome. We initially tested 165 of these extremely conserved sequences in a transgenic mouse enhancer assay and observed that 48 percent (79/165) functioned reproducibly as tissue-specific enhancers of gene expression at embryonic day 11.5. While driving expression in a broad range of anatomical structures in the embryo, the majority of the 79 enhancers drove expression in various regions of the developing nervous system. Studying a set of DNA elements that specifically drove forebrain expression, we identified DNA signatures specifically enriched in these elements and used these parameters to rank all ~3,400 human-fugu conserved noncoding elements in the human genome. The testing of the top predictions in transgenic mice resulted in a three-fold enrichment for sequences with forebrain enhancer activity. These data dramatically expand the catalogue of in vivo-characterized human gene enhancers and illustrate the future utility of such training sets for a variety of biological applications including decoding the regulatory vocabulary of the human genome.

  18. Extensive amino acid sequence homologies between animal lectins

    SciTech Connect

    Paroutaud, P.; Levi, G.; Teichberg, V.I.; Strosberg, A.D.

    1987-09-01

    The authors have established the amino acid sequence of the β-D-galactoside binding lectin from the electric eel and the sequences of several peptides from a similar lectin isolated from human placenta. These sequences were compared with the published sequences of peptides derived from the β-D-galactoside binding lectin from human lung and with sequences deduced from cDNAs assigned to the β-D-galactoside binding lectins from chicken embryo skin and human hepatomas. Significant homologies were observed. One of the highly conserved regions that contains a tryptophan residue and two glutamic acid residues is probably part of the β-D-galactoside binding site, which, on the basis of spectroscopic studies of the electric eel lectin, is expected to contain such residues. The similarity of the hydropathy profiles and the predicted secondary structure of the lectins from chicken skin and electric eel, in spite of differences in their amino acid sequences, strongly suggests that these proteins have maintained structural homologies during evolution and together with the other β-D-galactoside binding lectins were derived from a common ancestor gene.

  19. AlignMiner: a Web-based tool for detection of divergent regions in multiple sequence alignments of conserved sequences

    PubMed Central

    2010-01-01

    Background Multiple sequence alignments are used to study gene or protein function, phylogenetic relations, genome evolution hypotheses and even gene polymorphisms. Virtually without exception, all available tools focus on conserved segments or residues. Small divergent regions, however, are biologically important for specific quantitative polymerase chain reaction, genotyping, molecular markers and preparation of specific antibodies, and yet have received little attention. As a consequence, they must be selected empirically by the researcher. AlignMiner has been developed to fill this gap in bioinformatic analyses. Results AlignMiner is a Web-based application for detection of conserved and divergent regions in alignments of conserved sequences, focusing particularly on divergence. It accepts alignments (protein or nucleic acid) obtained using any of a variety of algorithms, which does not appear to have a significant impact on the final results. AlignMiner uses different scoring methods for assessing conserved/divergent regions, Entropy being the method that provides the highest number of regions with the greatest length, and Weighted being the most restrictive. Conserved/divergent regions can be generated either with respect to the consensus sequence or to one master sequence. The resulting data are presented in a graphical interface developed in AJAX, which provides remarkable user interaction capabilities. Users do not need to wait until execution is complete and can even inspect their results on a different computer. Data can be downloaded onto a user disk, in standard formats. In silico and experimental proof-of-concept cases have shown that AlignMiner can be successfully used to design specific polymerase chain reaction primers as well as potential epitopes for antibodies. Primer design is assisted by a module that deploys several oligonucleotide parameters for designing primers "on the fly". Conclusions AlignMiner can be used to reliably detect
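
    The idea behind an entropy-based score for conserved and divergent alignment columns can be sketched as follows. This is a generic Shannon-entropy calculation and should not be read as AlignMiner's actual Entropy method; the function names, the threshold value and the toy alignment are assumptions, and gaps are simply treated as one more symbol.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the symbols in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def divergent_columns(aligned, threshold=1.0):
    """0-based indices of columns whose entropy is at least `threshold`.

    The threshold of 1.0 bit is an arbitrary value chosen for illustration.
    """
    return [
        index
        for index, column in enumerate(zip(*aligned))
        if column_entropy(column) >= threshold
    ]


# Hypothetical toy alignment of four nucleotide sequences.
toy_alignment = ["ACGT", "ACGA", "ACCT", "ATGC"]
print(divergent_columns(toy_alignment))
```

    A fully conserved column has entropy 0, and entropy rises as more distinct symbols appear in more even proportions, so low-entropy stretches mark conserved regions and high-entropy stretches mark divergent ones.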

  20. Phenolic acid esterases, coding sequences and methods

    DOEpatents

    Blum, David L.; Kataeva, Irina; Li, Xin-Liang; Ljungdahl, Lars G.

    2002-01-01

    Described herein are four phenolic acid esterases, three of which correspond to domains of previously unknown function within bacterial xylanases, from XynY and XynZ of Clostridium thermocellum and from a xylanase of Ruminococcus. The fourth specifically exemplified xylanase is a protein encoded within the genome of Orpinomyces PC-2. The amino acids of these polypeptides and nucleotide sequences encoding them are provided. Recombinant host cells, expression vectors and methods for the recombinant production of phenolic acid esterases are also provided.

  1. CKAAPs DB: a conserved key amino acid positions database.

    PubMed

    Li, W W; Reddy, B V; Shindyalov, I N; Bourne, P E

    2001-01-01

    The Conserved Key Amino Acid Positions DataBase (CKAAPs DB) provides access to an analysis of structurally similar proteins with dissimilar sequences where key residues within a common fold are identified. The derivation and significance of CKAAPs starting from pairwise structure alignments is described fully in Reddy et al. [Reddy,B.V.B., Li,W.W., Shindyalov,I.N. and Bourne,P.E. (2000) PROTEINS:, in press]. The CKAAPs identified from this theoretical analysis are provided to experimentalists and theoreticians for potential use in protein engineering and modeling. It has been suggested that CKAAPs may be crucial features for protein folding, structural stability and function. Over 170 substructures, as defined by the Combinatorial Extension (CE) database, which are found in approximately 3000 representative polypeptide chains have been analyzed and are available in the CKAAPs DB. CKAAPs DB also provides CKAAPs of the representative set of proteins derived from the CE and FSSP databases. Thus the database contains over 5000 representative polypeptide chains, covering all known structures in the PDB. A web interface to a relational database permits fast retrieval of structure-sequence alignments, CKAAPs and associated statistics. Users may query by PDB ID, protein name, function and Enzyme Classification number. Users may also submit protein alignments of their own to obtain CKAAPs. An interface to display CKAAPs on each structure from a web browser is also being implemented. CKAAPs DB is maintained by the San Diego Supercomputer Center and accessible at the URL http://ckaaps.sdsc.edu.

  2. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-07-21

    A method is disclosed for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe. 11 figs.

  3. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe.

  4. Sequence and domain conservation of the coelacanth Hsp40 and Hsp90 chaperones suggests conservation of function.

    PubMed

    Bishop, Özlem Tastan; Edkins, Adrienne Lesley; Blatch, Gregory Lloyd

    2014-09-01

    Molecular chaperones and their associated co-chaperones play an important role in preserving and regulating the active conformational state of cellular proteins. The chaperone complement of the Indonesian Coelacanth, Latimeria menadoensis, was elucidated using transcriptomic sequences. Heat shock protein 90 (Hsp90) and heat shock protein 40 (Hsp40) chaperones, and associated co-chaperones were focused on, and homologous human sequences were used to search the sequence databases. Coelacanth homologs of the cytosolic, mitochondrial and endoplasmic reticulum (ER) homologs of human Hsp90 were identified, as well as all of the major co-chaperones of the cytosolic isoform. Most of the human Hsp40s were found to have coelacanth homologs, and the data suggested that all of the chaperone machinery for protein folding at the ribosome, protein translocation to cellular compartments such as the ER and protein degradation were conserved. Some interesting similarities and differences were identified when interrogating human, mouse, and zebrafish homologs. For example, DnaJB13 is predicted to be a non-functional Hsp40 in humans, mouse, and zebrafish due to a corrupted histidine-proline-aspartic acid (HPD) motif, while the coelacanth homolog has an intact HPD. These and other comparisons enabled important functional and evolutionary questions to be posed for future experimental studies.
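
    Checking whether a homolog retains an intact histidine-proline-aspartic acid (HPD) motif amounts to a substring search on the protein sequence. The sketch below shows that check on made-up fragments; the function name and both toy sequences are hypothetical and are not taken from DnaJB13 or any real Hsp40.

```python
def motif_positions(sequence, motif="HPD"):
    """Return the 0-based start position of every occurrence of `motif`."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        # Resume one residue later so overlapping matches are not skipped.
        start = sequence.find(motif, start + 1)
    return positions


# Hypothetical toy fragments: one with an intact HPD, one with D replaced by Q.
intact_fragment = "KAYRKLALKYHPDKNPEG"
corrupted_fragment = "KAYRKLALKYHPQKNPEG"

print(motif_positions(intact_fragment))
print(motif_positions(corrupted_fragment))
```

    An empty result only shows that the exact tripeptide is absent; judging whether a motif is functionally corrupted, as the abstract does for DnaJB13, would also need its position within the aligned J domain.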

  5. Inter-specific sequence conservation and intra-individual sequence variation in a spider silk gene.

    PubMed

    Tai, Pei-Ling; Hwang, Guang-Yuh; Tso, I-Min

    2004-10-01

    Currently, studies on major ampullate spidroin 1 (MaSp1) genes of non-orb weaving spiders are few, and it is not clear whether genes of these organisms exhibit the same characteristics as those of orb-weavers. In addition, many studies have proposed that MaSp1 might be a single gene with allelic variants, but supporting evidence is still lacking. In this study, we compared partial DNA and amino acid sequences of MaSp1 cloned from different spider guilds. We also cloned partial MaSp1 sequences from genomic DNA and cDNA of the same individuals of spiders using the same primer combination to see if different molecular forms existed. In the repetitive region of partial MaSp1 sequences obtained, GGX, GA and poly-A motifs were present in all Araneomorphae and Mygalomorphae species examined. An extreme similarity in MaSp1 non-repetitive portions was found in sequences of ecribellate, cribellate and Mygalomorphae web-builders and such a result suggested that this sequence might exhibit an important function. A comparison of sequences amplified from the same individual showed that substitutions in amino acids occurred in both repetitive and non-repetitive regions, with a much higher variation in the former. These results suggest that the MaSp1 of Araneomorphae spiders exhibits several forms in an individual spider and it might be either a multiple gene or a single gene with a multiple exon/intron organization.

  6. Methods for analyzing nucleic acid sequences

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid. The method provides a complex comprising a polymerase enzyme, a target nucleic acid molecule, and a primer, wherein the complex is immobilized on a support. A fluorescent label is attached to a terminal phosphate group of the nucleotide or nucleotide analog. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The time duration of the signal from labeled nucleotides or nucleotide analogs that become incorporated is distinguished from freely diffusing labels by a longer retention in the observation volume for the nucleotides or nucleotide analogs that become incorporated than for the freely diffusing labels.

  7. Low molecular weight serine protease inhibitors from insects are proteins with highly conserved sequences.

    PubMed

    Boigegrain, R A; Pugnière, M; Paroutaud, P; Castro, B; Brehélin, M

    2000-02-01

    A low molecular weight protease inhibitor peptide found in ovaries of the desert locust Schistocerca gregaria (SGPI-2), was purified from plasma of the same locust and sequenced. It was named SGCI. It was found active towards chymotrypsin and human leukocyte elastase. SGCI was synthesized using a solid-phase procedure and the sequence of its reactive site for chymotrypsin was determined. Compared with an inhibitor purified earlier from another locust species, the total sequence of SGCI showed 88% identity. In particular, the sequence of the reactive site of these inhibitors was identical. Our search for a closely related peptide in an insect species far removed from locusts, the lepidopteran Spodoptera littoralis, was unfruitful but a different chymotrypsin inhibitor, belonging to the Kazal family, was found whose mass is greater than that of SGCI (20 vs 3.6 kDa). Its N-terminal sequence shares 80% identity with that of a chymotrypsin inhibitor purified earlier from the haemolymph of another lepidopteran. Conservation of the amino acid sequence in the reactive site seems to be an exception among protease inhibitors.

  8. Relationship between sequence conservation and three-dimensional structure in a large family of esterases, lipases, and related proteins.

    PubMed Central

    Cygler, M.; Schrag, J. D.; Sussman, J. L.; Harel, M.; Silman, I.; Gentry, M. K.; Doctor, B. P.

    1993-01-01

    Based on the recently determined X-ray structures of Torpedo californica acetylcholinesterase and Geotrichum candidum lipase and on their three-dimensional superposition, an improved alignment of a collection of 32 related amino acid sequences of other esterases, lipases, and related proteins was obtained. On the basis of this alignment, 24 residues are found to be invariant in 29 sequences of hydrolytic enzymes, and an additional 49 are well conserved. The conservation in the three remaining sequences is somewhat lower. The conserved residues include the active site, disulfide bridges, salt bridges, and residues in the core of the proteins. Most invariant residues are located at the edges of secondary structural elements. A clear structural basis for the preservation of many of these residues can be determined from comparison of the two X-ray structures. PMID:8453375

  9. Conserved sequence motifs among bacterial, eukaryotic, and archaeal phosphatases that define a new phosphohydrolase superfamily.

    PubMed Central

    Thaller, M. C.; Schippa, S.; Rossolini, G. M.

    1998-01-01

    Members of a new molecular family of bacterial nonspecific acid phosphatases (NSAPs), indicated as class C, were found to share significant sequence similarities to bacterial class B NSAPs and to some plant acid phosphatases, representing the first example of a family of bacterial NSAPs that has a relatively close eukaryotic counterpart. Despite the lack of an overall similarity, conserved sequence motifs were also identified among the above enzyme families (class B and class C bacterial NSAPs, and related plant phosphatases) and several other families of phosphohydrolases, including bacterial phosphoglycolate phosphatases, histidinol-phosphatase domains of the bacterial bifunctional enzymes imidazole-glycerolphosphate dehydratases, and bacterial, eukaryotic, and archaeal phosphoserine phosphatases and trehalose-6-phosphatases. These conserved motifs are clustered within two domains, separated by a variable spacer region, according to the pattern [FILMAVT]-D-[ILFRMVY]-D-[GSNDE]-[TV]-[ILVAM]-[ATSVILMC]-X-{YFWHKR}-X-{YFWHNQ}-X(102,191)-{KRHNQ}-G-D-{FYWHILVMC}-{QNH}-{FWYGP}-D-{PSNQYW}. The dephosphorylating activity common to all these proteins supports the definition of this phosphatase motif and the inclusion of these enzymes into a superfamily of phosphohydrolases that we propose to indicate as "DDDD" after the presence of the four invariant aspartate residues. Database searches retrieved various hypothetical proteins of unknown function containing this or similar motifs, for which a phosphohydrolase activity could be hypothesized. PMID:9684901
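
    A pattern of this kind can be searched for with an ordinary regular expression. The sketch below is illustrative only and is not taken from the paper: it assumes the pattern follows PROSITE conventions (square brackets list allowed residues, braces list excluded residues, X is any residue, and X(102,191) is a spacer of 102 to 191 residues). The function name `prosite_to_regex` and the test sequence are hypothetical.

```python
import re

# Pattern from the abstract, read in PROSITE notation (assumption: braces mark
# excluded residues and the spacer is X(102,191)).
DDDD_PATTERN = (
    "[FILMAVT]-D-[ILFRMVY]-D-[GSNDE]-[TV]-[ILVAM]-[ATSVILMC]-X-{YFWHKR}-X-"
    "{YFWHNQ}-X(102,191)-{KRHNQ}-G-D-{FYWHILVMC}-{QNH}-{FWYGP}-D-{PSNQYW}"
)


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.replace(" ", "").split("-"):
        # Separate an optional repeat suffix, e.g. "X(102,191)" -> "X", "102,191".
        match = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        spec, repeat = match.group(1), match.group(2)
        if spec.upper() == "X":
            token = "."                      # any residue
        elif spec.startswith("{"):
            token = "[^" + spec[1:-1] + "]"  # excluded residues
        else:
            token = spec                     # allowed set or a single residue
        if repeat:
            token += "{" + repeat + "}"
        parts.append(token)
    return "".join(parts)


dddd_regex = re.compile(prosite_to_regex(DDDD_PATTERN))

# Hypothetical toy sequence: first domain, a 102-residue spacer, second domain.
toy_sequence = "FDIDGTIAAAAA" + "S" * 102 + "AGDAAADA"
print(bool(dddd_regex.search(toy_sequence)))
```

    The regular expression treats the spacer as any 102 to 191 residues, so a hit only indicates that a sequence is compatible with the motif, not that the protein is a phosphohydrolase.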

  10. Discovering conserved insect microRNAs from expressed sequence tags.

    PubMed

    Jia, Qidong; Lin, Kejian; Liang, Jingdong; Yu, Lun; Li, Fei

    2010-12-01

    MicroRNAs (miRNA) participate in regulating diverse biological pathways by translational repression in animals. They have attracted increasing attention recently. However, little work has been done on the miRNA genes in agriculturally important pests. Because the transcripts of most miRNA genes are the products of type-II RNA polymerase, pri-miRNA has a poly(A) tail and appears in expressed sequence tags (EST). We developed a computational pipeline to identify miRNA genes from insect ESTs. First, 980,697 ESTs from 63 insects were collected and used to search the nr database. The ESTs which did not share significant similarities with any known protein-coding genes were treated as non-coding ESTs. Next, known mature miRNAs were used to align with non-coding ESTs. The ESTs which contain the sequence of mature miRNA were treated as candidate ESTs. Finally, putative precursors were extracted flanking the mature miRNA region in candidate ESTs and evaluated by the Triplet-SVM algorithm. As a result, 86 miRNAs from 30 insect species were found based on a strict criterion while 330 miRNAs from 51 species were found based on a loose criterion. Evolution analysis indicated that mir-467, mir-297 and mir-466 were the most highly conserved miRNA families in insects. To confirm the reliability of putative insect miRNAs, the expression profile of nine predicted miRNAs in Locusta migratoria was investigated. Eight miRNAs were successfully detected by RT-PCR. Most miRNAs were expressed ubiquitously in all examined tissues and developmental stages whereas Lmi-mir-509 was specifically expressed in the thorax of the 2nd, 4th and 5th instars and adult locust. In all, our work reported an efficient computational strategy for predicting miRNA genes from insect ESTs and presented tens of miRNAs in diverse insect species which are expected to participate in many important physiological processes.
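
    The candidate-selection and precursor-extraction steps can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes exact forward-strand matching in place of whatever alignment the study used, a hypothetical flank length, and toy sequences. The protein-coding filter and the Triplet-SVM evaluation are not reproduced here.

```python
def candidate_precursors(ests, mature_mirnas, flank=70):
    """Return (est_id, mirna_id, window) for each EST containing a mature miRNA.

    `flank` (bases kept on each side of the match) is a hypothetical setting;
    the abstract does not state the window size that was used.
    """
    hits = []
    for est_id, est_seq in ests.items():
        est_seq = est_seq.upper()
        for mirna_id, mirna_seq in mature_mirnas.items():
            # Mature miRNAs are written as RNA; ESTs are DNA, so map U to T.
            query = mirna_seq.upper().replace("U", "T")
            start = est_seq.find(query)
            if start == -1:
                continue  # this EST does not contain the mature sequence
            # Keep the mature region plus up to `flank` bases on each side.
            left = max(0, start - flank)
            right = min(len(est_seq), start + len(query) + flank)
            hits.append((est_id, mirna_id, est_seq[left:right]))
    return hits


# Toy data with hypothetical identifiers and sequences.
toy_ests = {
    "est_1": "AAAA" + "TGAGGTAGTAGGTTGTATAGTT" + "CCCC",
    "est_2": "GGGGGGGGGGGG",
}
toy_mirnas = {"toy_mirna": "UGAGGUAGUAGGUUGUAUAGUU"}

for hit in candidate_precursors(toy_ests, toy_mirnas, flank=2):
    print(hit)
```

    Each returned window would then be passed to a precursor classifier; only the forward strand is searched in this sketch.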

  11. 77 FR 65537 - Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-10-29

    ... Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request. SUMMARY: The United States....'' SUPPLEMENTARY INFORMATION: I. Abstract Patent applications that contain nucleotide and/or amino acid...

  12. Mitochondrial genome sequences illuminate maternal lineages of conservation concern in a rare carnivore

    PubMed Central

    2011-01-01

    Background Science-based wildlife management relies on genetic information to infer population connectivity and identify conservation units. The most commonly used genetic marker for characterizing animal biodiversity and identifying maternal lineages is the mitochondrial genome. Mitochondrial genotyping figures prominently in conservation and management plans, with much of the attention focused on the non-coding displacement ("D") loop. We used massively parallel multiplexed sequencing to sequence complete mitochondrial genomes from 40 fishers, a threatened carnivore that possesses low mitogenomic diversity. This allowed us to test a key assumption of conservation genetics, specifically, that the D-loop accurately reflects genealogical relationships and variation of the larger mitochondrial genome. Results Overall mitogenomic divergence in fishers is exceedingly low, with 66 segregating sites and an average pairwise distance between genomes of 0.00088 across their aligned length (16,290 bp). Estimates of variation and genealogical relationships from the displacement (D) loop region (299 bp) are contradicted by the complete mitochondrial genome, as well as the protein coding fraction of the mitochondrial genome. The sources of this contradiction trace primarily to the near-absence of mutations marking the D-loop region of one of the most divergent lineages, and secondarily to independent (recurrent) mutations at two nucleotide positions in the D-loop amplicon. Conclusions Our study has two important implications. First, inferred genealogical reconstructions based on the fisher D-loop region contradict inferences based on the entire mitogenome to the point that the populations of greatest conservation concern cannot be accurately resolved. Whole-genome analysis identifies Californian haplotypes from the northern-most populations as highly distinctive, with a significant excess of amino acid changes that may be indicative of molecular adaptation; D-loop sequences fail
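
    The two summary statistics quoted in the Results (segregating sites and average pairwise distance) can be computed from an alignment as sketched below. This is an illustration under stated assumptions, not the study's method: it takes equal-length, gap-free aligned sequences and uses the uncorrected proportion of differing sites as the distance. The toy alignment is hypothetical.

```python
from itertools import combinations


def segregating_sites(alignment):
    """Count alignment columns in which more than one nucleotide occurs."""
    return sum(1 for column in zip(*alignment) if len(set(column)) > 1)


def mean_pairwise_distance(alignment):
    """Average uncorrected distance (differences per site) over all pairs."""
    length = len(alignment[0])
    pairs = list(combinations(alignment, 2))
    total = sum(
        sum(a != b for a, b in zip(first, second)) / length
        for first, second in pairs
    )
    return total / len(pairs)


# Hypothetical toy alignment of three sequences, eight sites each.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
print(segregating_sites(toy_alignment))
print(mean_pairwise_distance(toy_alignment))
```

    Running the same functions on a short amplicon and on the full alignment is one way to see how a small region can give a different picture of variation than the whole genome.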

  13. Computational Prediction of Phylogenetically Conserved Sequence Motifs for Five Different Candidate Genes in Type II Diabetic Nephropathy

    PubMed Central

    Sindhu, T; Rajamanikandan, S; Srinivasan, P

    2012-01-01

    Background: Computational identification of phylogenetic motifs helps in understanding known functional features, including catalytic sites, substrate-binding epitopes, and protein-protein interfaces. Furthermore, these motifs are strongly conserved among orthologs, indicating their evolutionary importance. The study aimed to analyze five candidate genes involved in type II diabetic nephropathy and to predict phylogenetic motifs from their corresponding orthologous protein sequences. Methods: AKR1B1, APOE, ENPP1, ELMO1 and IGFBP1 are genes that have been identified as important targets for type II diabetic nephropathy through experimental studies. Their corresponding protein sequences, structures, and orthologous sequences were retrieved from the UniProtKB, PDB, and PHOG databases, respectively. Multiple sequence alignments were constructed using ClustalW and phylogenetic motifs were identified using MINER. The occurrence of amino acids in the obtained phylogenetic motifs was generated using WebLogo and false positive expectations were calculated against phylogenetic similarity. Results: In total, 17 phylogenetic motifs were identified from the five proteins; residues such as glycine, leucine, tryptophan, and aspartic acid were found at appreciable frequency, whereas arginine was identified in all the predicted phylogenetic motifs (PMs). The results imply that these residues may be important to the functional and structural roles of the proteins, and the calculated false positive expectations imply that they were generally conserved in the traditional sense. Conclusion: The prediction of phylogenetic motifs is an accurate method for detecting functionally important conserved residues. The conserved motifs can be used as potential drug targets for type II diabetic nephropathy. PMID:23113206

  14. Complete nucleotide sequence of the Actinomyces viscosus T14V sialidase gene: presence of a conserved repeating sequence among strains of Actinomyces spp.

    PubMed Central

    Yeung, M K

    1993-01-01

    The nucleotide sequence of the Actinomyces viscosus T14V sialidase gene (nanH) and flanking regions was determined. An open reading frame of 2,703 nucleotides that encodes a predominately hydrophobic protein of 901 amino acids (M(r), 92,871) was identified. The amino acid sequence at the amino terminus of the predicted protein exhibited properties characteristic of a typical leader peptide. Five 12-amino-acid units that shared between 33 and 67% sequence identity were noted within the central domain of the protein. Each unit contained the sequence Ser-X-Asp-X-Gly-X-Thr-Trp, which is conserved among other bacterial and trypanosoma sp. sialidases. Thus, the A. viscosus T14V nanH gene and the other prokaryotic and eukaryotic sialidase genes evolved from a common ancestor. Southern hybridization analyses under conditions of high stringency revealed the existence of DNA sequences homologous to A. viscosus T14V nanH in the genomes of 18 strains of five Actinomyces species that expressed various levels of sialidase activity. The data demonstrate that the sialidase genes from divergent groups of Actinomyces spp. are highly conserved. Images PMID:8418033

  15. The amino acid sequence of chymopapain from Carica papaya.

    PubMed Central

    Watson, D C; Yaguchi, M; Lynn, K R

    1990-01-01

    Chymopapain is a polypeptide of 218 amino acid residues. It has considerable structural similarity with papain and papaya proteinase omega, including conservation of the catalytic site and of the disulphide bonding. Chymopapain is like papaya proteinase omega in carrying four extra residues between papain positions 168 and 169, but differs from both papaya proteinases in the composition of its S2 subsite, as well as in having a second thiol group, Cys-117. Some evidence for the amino acid sequence of chymopapain has been deposited as Supplementary Publication SUP 50153 (12 pages) at the British Library Document Supply Centre, Boston Spa., Wetherby, West Yorkshire LS23 7BQ, U.K., from whom copies may be obtained on the terms indicated in Biochem. J. (1990) 265, 5. The information comprises Supplement Tables 1-4, which contain, in order, amino acid compositions of peptides from tryptic, peptic, CNBr and mild acid cleavages, Supplement Fig. 1, showing re-fractionation of selected peaks from Fig. 2 of the main paper, Supplement Fig. 2, showing cation-exchange chromatography of the earliest-eluted peak of Fig. 3 of the main paper, Supplement Fig. 3, showing reverse-phase h.p.l.c. of the later-eluted peak from Fig. 3 of the main paper, and Supplement Fig. 4, showing the separation of peptides after mild acid hydrolysis of CNBr-cleavage fragment CB3. PMID:2106878

  16. Patterns of sequence conservation in the S-Layer proteins and related sequences in Clostridium difficile.

    PubMed

    Calabi, Emanuela; Fairweather, Neil

    2002-07-01

    Clostridium difficile is the etiological agent of antibiotic-associated diarrhea. Among the factors that may play a role in infection are S-layer proteins (SLPs). Previous work has shown these to consist mainly of two components, resulting from the cleavage of a precursor encoded by the slpA gene. The high-molecular-weight (MW) subunit is related both to amidases from B. subtilis and to at least another 28 gene products in C. difficile strain 630. To gain insight into the functions of the SLPs and related proteins, we have further investigated the pattern of variability both at the slpA locus and at six nearby paralogs. Sequencing of the slpA gene from an S-layer group II strain and a variant S-layer group strain confirms a high degree of divergence in the low-MW SLP, which may result from diversifying selection. A highly conserved motif, however, is found at the C terminus in all low-MW subunits and may be essential for SlpA precursor cleavage. In strain 167, a variant cleavage product is present, suggesting a secondary processing site. Southern blotting analysis shows slpA-like open reading frames (ORFs) 2 to 7 to be conserved in all nine strains tested, with one exception: ORF2, which encodes a 66-kDa polypeptide coextracted at low pH with the main SLPs in strain 630, may be partially deleted in strain 167. Polymorphism within the slpA-ORF7 cluster may be more pronounced in the region proximal to the slpA gene. Unexpectedly, a high-MW subunit probe cross hybridizes to sequences outside the slpA locus, which appear to vary in number in different strains.

  17. Amino acid sequence of a mouse immunoglobulin mu chain.

    PubMed Central

    Kehry, M; Sibley, C; Fuhrman, J; Schilling, J; Hood, L E

    1979-01-01

    The complete amino acid sequence of the mouse mu chain from the BALB/c myeloma tumor MOPC 104E is reported. The C mu region contains four consecutive homology regions of approximately 110 residues and a COOH-terminal region of 19 residues. A comparison of this mu chain from mouse with a complete mu sequence from human (Ou) and a partial mu chain sequence from dog (Moo) reveals a striking gradient of increasing homology from the NH2-terminal to the COOH-terminal portion of these mu chains, with the former being the least and the latter the most highly conserved. Four of the five sites of carbohydrate attachment appear to be at identical residue positions when the constant regions of the mouse and human mu chains are compared. The mu chain of MOPC 104E has a carbohydrate moiety attached in the second hypervariable region. This is particularly interesting in view of the fact that MOPC 104E binds alpha-(1→3)-dextran, a simple carbohydrate. The structural and functional constraints imposed by these comparative sequence analyses are discussed. PMID:111247

  18. Evolutionary diversification of aminopeptidase N in Lepidoptera by conserved clade-specific amino acid residues.

    PubMed

    Hughes, Austin L

    2014-07-01

    Members of the aminopeptidase N (APN) gene family of the insect order Lepidoptera (moths and butterflies) bind the naturally insecticidal Cry toxins produced by the bacterium Bacillus thuringiensis. Phylogenetic analysis of amino acid sequences of seven lepidopteran APN classes provided strong support for the hypothesis that lepidopteran APN2 class arose by gene duplication prior to the most recent common ancestor of Lepidoptera and Diptera. The Cry toxin-binding region (BR) of lepidopteran and dipteran APNs was subject to stronger purifying selection within APN classes than was the remainder of the molecule, reflecting conservation of catalytic site and adjoining residues within the BR. Of lepidopteran APN classes, APN2, APN6, and APN8 showed the strongest evidence of functional specialization, both in expression patterns and in the occurrence of conserved derived amino acid residues. The latter three APN classes also shared a convergently evolved conserved residue close to the catalytic site. APN8 showed a particularly strong tendency towards class-specific conserved residues, including one of the catalytic site residues in the BR and ten others in close vicinity to the catalytic site residues. The occurrence of class-specific sequences along with the conservation of enzymatic function is consistent with the hypothesis that the presence of Cry toxins in the environment has been a factor shaping the evolution of this multi-gene family.

  19. Detection of nucleic acid sequences by invader-directed cleavage

    DOEpatents

    Brow, Mary Ann D.; Hall, Jeff Steven Grotelueschen; Lyamichev, Victor; Olive, David Michael; Prudent, James Robert

    1999-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The 5' nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge.

  20. Sequence Conservation, Radial Distance and Packing Density in Spherical Viral Capsids

    PubMed Central

    Lee, Chi-Wen; Huang, Tsun-Tsao; Shih, Chung-Shiuan; Hwang, Jenn-Kang

    2015-01-01

    The conservation level of a residue is a useful measure of the importance of that residue in protein structure and function. Much information about sequence conservation comes from aligning homologous sequences. Profiles showing the variation of the conservation level along the sequence are usually interpreted in evolutionary terms and dictated by site similarities of a proper set of homologous sequences. Here, we report that, for viral icosahedral capsids, the sequence conservation profile can be determined by variations in the distances between residues and the centroid of the capsid – with a direct inverse proportionality between the conservation level and the centroid distance – as well as by the spatial variations in local packing density. Examining both the centroid and the packing density models against a dataset of 51 crystal structures of nonhomologous icosahedral capsids, we found that many global patterns and minor features derived from the viral structures are consistent with those present in the sequence conservation profiles. The quantitative link between the level of conservation and structural features like centroid-distance or packing density allows us to look at residue conservation from a structural viewpoint as well as from an evolutionary viewpoint. PMID:26132081
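
    The centroid-distance idea can be illustrated with a small calculation. The sketch below is not the authors' model: it assumes one coordinate per residue, takes the centroid as the mean of the supplied coordinates, and uses a Pearson correlation to compare distances with conservation scores. All names and numbers are hypothetical toy values.

```python
import math


def centroid_distances(coords):
    """Distance of each residue coordinate from the centroid of all coordinates."""
    n = len(coords)
    centroid = tuple(sum(point[axis] for point in coords) / n for axis in range(3))
    return [math.dist(point, centroid) for point in coords]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# Hypothetical toy data: four residue positions and their conservation scores.
toy_coords = [(1, 0, 0), (-1, 0, 0), (0, 2, 0), (0, -2, 0)]
toy_conservation = [1.0, 1.0, 0.5, 0.5]

distances = centroid_distances(toy_coords)
print(distances)
print(pearson(distances, toy_conservation))
```

    A negative correlation between distance and conservation, as in this toy case, is the direction of the relationship the abstract describes; a real analysis would use the coordinates of the full capsid.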

  1. Los Alamos sequence analysis package for nucleic acids and proteins.

    PubMed Central

    Kanehisa, M I

    1982-01-01

    An interactive system for computer analysis of nucleic acid and protein sequences has been developed for the Los Alamos DNA Sequence Database. It provides a convenient way to search or verify various sequence features, e.g., restriction enzyme sites, protein coding frames, and properties of coded proteins. Further, the comprehensive analysis package on a large-scale database can be used for comparative studies on sequence and structural homologies in order to find unnoted information stored in nucleic acid sequences. PMID:6174934

  2. Analysis of cloned cDNA and genomic sequences for phytochrome: complete amino acid sequences for two gene products expressed in etiolated Avena.

    PubMed Central

    Hershey, H P; Barker, R F; Idler, K B; Lissemore, J L; Quail, P H

    1985-01-01

    Cloned cDNA and genomic sequences have been analyzed to deduce the amino acid sequence of phytochrome from etiolated Avena. Restriction endonuclease site polymorphism between clones indicates that at least four phytochrome genes are expressed in this tissue. Sequence analysis of two complete and one partial coding region shows approximately 98% homology at both the nucleotide and amino acid levels, with the majority of amino acid changes being conservative. High sequence homology is also found in the 5'-untranslated region but significant divergence occurs in the 3'-untranslated region. The phytochrome polypeptides are 1128 amino acid residues long corresponding to a molecular mass of 125 kdaltons. The known protein sequence at the chromophore attachment site occurs only once in the polypeptide, establishing that phytochrome has a single chromophore per monomer covalently linked to Cys-321. Computer analyses of the amino acid sequences have provided predictions regarding a number of structural features of the phytochrome molecule. PMID:3001642

  3. Conservation of acid waterlogged shipwrecks: nanotechnologies for de-acidification

    NASA Astrophysics Data System (ADS)

    Giorgi, R.; Chelazzi, D.; Baglioni, P.

    2006-06-01

    Preservation of waterlogged wooden artifacts, and in particular ancient wrecks, is a challenge in cultural heritage conservation. Samples, from the Swedish warship Vasa, are under investigation in order to develop innovative methods for wood de-acidification and preservation. The Vasa represents a unique case in the study of ancient wrecks. In the past four years the problem of the acidity of wood emerged as a strong threat to its conservation. The production of sulphuric acid inside the ship wood might be the cause of both chemical damage through the acid hydrolysis of cellulose, and of physical damage of the wood’s pore structure, due to the crystallization of sulphate minerals in the wood pores. In this paper we show that wood acidity can be neutralized by the application of nanoparticles of alkaline-earth carbonates and/or hydroxides. The treatment provides an alkaline reservoir inside the wood. Nanoparticles absorbed in the wood from an alcoholic dispersion adhere to the wood wall and release hydroxyl ions leading to the wood neutralization. Oak and pine samples from the Vasa wreck were characterized and treated with alkaline magnesium or calcium nanoparticle dispersions in non-aqueous solvents. De-acidification was monitored by pH changes and thermal analysis, and all the treated samples were submitted to thermal artificial ageing in order to demonstrate the efficacy of the method. The results obtained opened a new perspective in wood conservation.

  4. Primary structure of the merozoite surface antigen 1 of Plasmodium vivax reveals sequences conserved between different Plasmodium species.

    PubMed Central

    del Portillo, H A; Longacre, S; Khouri, E; David, P H

    1991-01-01

    Merozoite surface antigen 1 (MSA1) of several species of plasmodia has been shown to be a promising candidate for a vaccine directed against the asexual blood stages of malaria. We report the cloning and characterization of the MSA1 gene of the human malaria parasite Plasmodium vivax. This gene, which we call Pv200, encodes a polypeptide of 1726 amino acids and displays features described for MSA1 genes of other species, such as signal peptide and anchoring sequences, conserved cysteine residues, number of potential N-glycosylation sites, and repeats consisting here of 23 glutamine residues in a row. When the nucleotide and deduced amino acid sequences of the MSA1 of P. vivax are compared to those of another human malaria parasite, Plasmodium falciparum, and to those of the rodent parasite Plasmodium yoelii, 10 regions of high amino acid similarity are observed despite the very different dG + dC contents of the corresponding genes. All of the interspecies conserved regions reside within the conserved or semiconserved blocks delimited by the sequences of different alleles of the MSA1 gene of P. falciparum. Images PMID:2023952

  5. Global geno-proteomic analysis reveals cross-continental sequence conservation and druggable sites among influenza virus polymerases.

    PubMed

    Babar, Mustafeez Mujtaba; Zaidi, Najam-us-Sahar Sadaf; Tahir, Muhammad

    2014-12-01

    Influenza virus is one of the major causes of mortality and morbidity associated with respiratory diseases. The high rate of mutation in the viral proteome provides it with the ability to survive in a variety of host species. This property helps it in maintaining and developing its pathogenicity, transmission and drug resistance. Alternate drug targets, particularly the internal proteins, can potentially be exploited for addressing the resistance issues. In the current analysis, the degree of conservation of influenza virus polymerases has been studied as one of the essential elements for establishing its candidature as a potential target of antiviral therapy. We analyzed more than 130,000 nucleotide and amino acid sequences by classifying them on the basis of continental presence of host organisms. Computational analyses including genetic polymorphism study, mutation pattern determination, molecular evolution and geophylogenetic analysis were performed to establish the high degree of conservation among the sequences. These studies lead to establishing the polymerases, in particular PB1, as highly conserved proteins. Moreover, we mapped the conservation percentage on the tertiary structures of proteins to identify the conserved, druggable sites. The research study, hence, revealed that the influenza virus polymerases are highly conserved (95-99%) proteins with a very slow mutation rate. Potential drug binding sites on various polymerases have also been reported. A scheme for drug target candidate development that can be applied to rapidly mutating proteins has been presented. Moreover, the research output can help in designing new therapeutic molecules against the identified targets.

  6. Hybridization and sequencing of nucleic acids using base pair mismatches

    DOEpatents

    Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

    2001-01-01

    Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.

  7. Comparative genomic analysis of equilibrative nucleoside transporters suggests conserved protein structure despite limited sequence identity.

    PubMed

    Sankar, Narendra; Machado, Jerry; Abdulla, Parween; Hilliker, Arthur J; Coe, Imogen R

    2002-10-15

    Equilibrative nucleoside transporters (ENTs) are a recently characterized and poorly understood group of membrane proteins that are important in the uptake of endogenous nucleosides required for nucleic acid and nucleoside triphosphate synthesis. Despite their central importance in cellular metabolism and nucleoside analog chemotherapy, no human ENT gene has been described and nothing is known about gene structure and function. To gain insight into the ENT gene family, we used experimental and in silico comparative genomic approaches to identify ENT genes in three evolutionarily diverse organisms with completely (or almost completely) sequenced genomes, Homo sapiens, Caenorhabditis elegans and Drosophila melanogaster. We describe the chromosomal location, the predicted ENT gene structure and putative structural topologies of predicted ENT proteins derived from the open reading frames. Despite variations in genomic layout and limited ortholog protein sequence identity (≤27.45%), predicted topologies of ENT proteins are strikingly similar, suggesting an evolutionary conservation of a prototypic structure. In addition, a similar distribution of protein domains on exons is apparent in all three taxa. These data demonstrate that comparative sequence analyses should be combined with other approaches (such as genomic and proteomic analyses) to fully understand structure, function and evolution of protein families.

  8. Remarkable intron and exon sequence conservation in human and mouse homeobox Hox 1.3 genes

    SciTech Connect

    Tournier-Lasserve, E.; Odenwald, W.F.; Garbern, J.; Trojanowski, J.; Lazzarini, R.A.

    1989-05-01

    A high degree of conservation exists between the Hox 1.3 homeobox genes of mice and humans. The two genes occupy the same relative positions in their respective Hox 1 gene clusters, they show extensive sequence similarities in their coding and noncoding portions, and both are transcribed into multiple transcripts of similar sizes. The predicted human Hox 1.3 protein differs from its murine counterpart in only 7 of 270 amino acids. The sequence similarity in the 250 base pairs upstream of the initiation codon is 98%, the similarity between the two introns, both 960 base pairs long, is 72%, and the similarity in the 3' noncoding region from termination codon to polyadenylation signal is 90%. Both mouse and human Hox 1.3 introns contain a sequence with homology to a mating-type-controlled cis element of the yeast Ty1 transposon. DNA-binding studies with a recombinant mouse Hox 1.3 protein identified two binding sites in the intron, both of which were within the region of shared homology with this Ty1 cis element.

  9. High sequence conservation among cucumber mosaic virus isolates from lily.

    PubMed

    Chen, Y K; Derks, A F; Langeveld, S; Goldbach, R; Prins, M

    2001-08-01

    For classification of Cucumber mosaic virus (CMV) isolates from ornamental crops of different geographical areas, these were characterized by comparing the nucleotide sequences of RNAs 4 and the encoded coat proteins. Within the ornamental-infecting CMV viruses both subgroups were represented. CMV isolates of Alstroemeria and crocus were classified as subgroup II isolates, whereas 8 other isolates, from lily, gladiolus, amaranthus, larkspur, and lisianthus, were identified as subgroup I members. In general, nucleotide sequence comparisons correlated well with geographic distribution, with one notable exception: the analyzed nucleotide sequences of 5 lily isolates showed remarkably high homology despite different origins.

  10. Conservative and nonconservative inhibitors of gastric acid secretion

    SciTech Connect

    Ekblad, E.B.M.; Licko, V.

    1987-09-01

    Inhibitors of the initial step (H₂-antagonist) and of the final step (thiocyanate, SCN⁻; and nitrite, NO₂⁻) were used to study the dynamics of acid secretion in isolated frog gastric mucosa. Tissues were mounted in flow-through chambers, and the acid secretion rate (SR) was recorded on a pH-stat microprocessor. Continuous presence of H₂-antagonist decreases the SR to a lower steady state, and on removal the SR returns to basal SR, causing a net loss of acid, the nonconservative effect. The amount of lost acid is a unique function of exposure and is thus independent of the pattern (pulses or steps) of inhibition. In contrast, continuous presence of SCN⁻ or NO₂⁻ (below 3 mM) results in an undershoot in SR with a return to basal SR, whereas at higher concentrations there is no return. Removal of these inhibitors causes an overshoot in SR with return to basal SR. The rebound acid is equal to the acid suppressed by NO₂⁻ and by low concentrations of SCN⁻, resulting in no net loss of acid, the conservative effect, whereas at high concentrations of SCN⁻ there is an apparent loss of acid. In maximally secreting tissue the overshoot of SR is not observed. However, the acid is not lost, merely delayed. In resting tissue NO₂⁻ also merely delays the exit of the acid produced in response to forskolin. The rebound acid is proposed to reside in a sequestered acid pool that is stable for at least 120 min. Results with NO₂⁻ and SCN⁻ suggest an effect on a saturable exit enzyme, possibly the K⁺-H⁺-ATPase.

  11. Sequence of a cDNA encoding nitrite reductase from the tree Betula pendula and identification of conserved protein regions.

    PubMed

    Friemann, A; Brinkmann, K; Hachtel, W

    1992-02-01

    The sequence of an mRNA encoding nitrite reductase (NiR, EC 1.7.7.1.) from the tree Betula pendula was determined. A cDNA library constructed from leaf poly(A)+ mRNA was screened with an oligonucleotide probe deduced from NiR sequences from spinach and maize. A 2.5 kb cDNA was isolated that hybridized to an mRNA, the steady-state level of which increased markedly upon induction with nitrate. The nucleotide sequence of the cDNA contains a reading frame encoding a protein of 583 amino acids that reveals 79% identity with NiR from spinach. The transit peptide of the NiR precursor from birch was determined to be 22 amino acids in size by sequence comparison with NiR from spinach and maize and is the shortest transit peptide reported so far. A graphical evaluation of identities found in the NiR sequence alignment revealed nine well conserved sections each exceeding ten amino acids in size. Sequence comparisons with related redox proteins identified essential residues involved in cofactor binding. A putative binding site for ferredoxin was found in the N-terminal half of the protein.

  12. WeederH: an algorithm for finding conserved regulatory motifs and regions in homologous sequences

    PubMed Central

    Pavesi, Giulio; Zambelli, Federico; Pesole, Graziano

    2007-01-01

    Background: This work addresses the problem of detecting conserved transcription factor binding sites, and regulatory regions in general, through the analysis of sequences from homologous genes, an approach that is becoming more and more widely used given the ever increasing amount of genomic data available. Results: We present an algorithm that identifies conserved transcription factor binding sites in a given sequence by comparing it to one or more homologs, adapting a framework we previously introduced for the discovery of sites in sequences from co-regulated genes. Unlike the most commonly used methods, the approach we present neither needs nor computes an alignment of the sequences investigated, nor does it resort to descriptors of the binding specificity of known transcription factors. The main novel idea we introduce is a relative measure of conservation, assuming that true functional elements should present a higher level of conservation than the rest of the sequence surrounding them. We present tests where we applied the algorithm to the identification of conserved annotated sites in homologous promoters, as well as in distal regions like enhancers. Conclusion: Results of the tests show that the algorithm can provide fast and reliable predictions of conserved transcription factor binding sites regulating the transcription of a gene, with better performance than other available methods for the same task. We also show examples of how the algorithm can be successfully employed when promoter annotations of the genes investigated are missing, or when regulatory sites and regions are located far away from the genes. PMID:17286865

  13. Multilign: an algorithm to predict secondary structures conserved in multiple RNA sequences

    PubMed Central

    Xu, Zhenjiang; Mathews, David H.

    2011-01-01

    Motivation: With recent advances in sequencing, structural and functional studies of RNA lag behind the discovery of sequences. Computational analysis of RNA is increasingly important to reveal structure–function relationships with low cost and speed. The purpose of this study is to use multiple homologous sequences to infer a conserved RNA structure. Results: A new algorithm, called Multilign, is presented to find the lowest free energy RNA secondary structure common to multiple sequences. Multilign is based on Dynalign, which is a program that simultaneously aligns and folds two sequences to find the lowest free energy conserved structure. For Multilign, Dynalign is used to progressively construct a conserved structure from multiple pairwise calculations, with one sequence used in all pairwise calculations. A base pair is predicted only if it is contained in the set of low free energy structures predicted by all Dynalign calculations. In this way, Multilign improves prediction accuracy by keeping the genuine base pairs and excluding competing false base pairs. Multilign has computational complexity that scales linearly in the number of sequences. Multilign was tested on extensive datasets of sequences with known structure and its prediction accuracy is among the best of available algorithms. Multilign can run on long sequences (> 1500 nt) and an arbitrarily large number of sequences. Availability: The algorithm is implemented in ANSI C++ and can be downloaded as part of the RNAstructure package at: http://rna.urmc.rochester.edu Contact: david_mathews@urmc.rochester.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21193521
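
    The filtering rule stated above (a base pair is kept only if every pairwise Dynalign calculation contains it) amounts to a set intersection. The sketch below illustrates that single step and nothing more: it does not run Dynalign, the function name `consensus_pairs` is hypothetical, and the base-pair sets are made-up toy data, with each pair written as (i, j) positions in the sequence shared by all pairwise calculations.

```python
BasePair = tuple[int, int]  # (5' index, 3' index) in the shared sequence


def consensus_pairs(pairwise_predictions: list[set[BasePair]]) -> set[BasePair]:
    """Keep only the base pairs present in every pairwise prediction set."""
    if not pairwise_predictions:
        return set()
    return set.intersection(*pairwise_predictions)


# Toy data (hypothetical): low free energy pairs from three pairwise runs.
runs = [
    {(1, 20), (2, 19), (5, 12)},
    {(1, 20), (2, 19)},
    {(1, 20), (2, 19), (7, 15)},
]
print(sorted(consensus_pairs(runs)))
```

    In this toy input the pairs (5, 12) and (7, 15) each appear in only one run, so they are dropped as competing candidates, while (1, 20) and (2, 19) survive.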

  14. Purification, characterization and partial amino acid sequence of glycogen synthase from Saccharomyces cerevisiae.

    PubMed Central

    Carabaza, A; Arino, J; Fox, J W; Villar-Palasi, C; Guinovart, J J

    1990-01-01

    Glycogen synthase from Saccharomyces cerevisiae was purified to homogeneity. The enzyme showed a subunit molecular mass of 80 kDa. The holoenzyme appears to be a tetramer. Antibodies developed against purified yeast glycogen synthase inactivated the enzyme in yeast extracts and allowed the detection of the protein in Western blots. Amino acid analysis showed that the enzyme is very rich in glutamate and/or glutamine residues. The N-terminal sequence (11 amino acid residues) was determined. In addition, selected tryptic-digest peptides were purified by reverse-phase h.p.l.c. and submitted to gas-phase sequencing. Up to eight sequences (79 amino acid residues) could be aligned with the human muscle enzyme sequence. Levels of identity range between 37 and 100%, indicating that, although human and yeast glycogen synthases probably share some conserved regions, significant differences in their primary structure should be expected. Images Fig. 1. Fig. 2. Fig. 3. PMID:2114092

  15. High Throughput Sequencing of T Cell Antigen Receptors Reveals a Conserved TCR Repertoire.

    PubMed

    Hou, Xianliang; Lu, Chong; Chen, Sisi; Xie, Qian; Cui, Guangying; Chen, Jianing; Chen, Zhi; Wu, Zhongwen; Ding, Yulong; Ye, Ping; Dai, Yong; Diao, Hongyan

    2016-03-01

    The T-cell receptor (TCR) repertoire is a mirror of the human immune system that reflects processes caused by infections, cancer, autoimmunity, and aging. Next-generation sequencing has become a powerful tool for deep TCR profiling. Herein, we used this technology to study the repertoire features of TCR beta chain in the blood of healthy individuals. Peripheral blood samples were collected from 10 healthy donors. T cells were isolated with anti-human CD3 magnetic beads according to the manufacturer's protocol. We then combined multiplex-PCR, Illumina sequencing, and IMGT/High V-QUEST to analyze the characteristics and polymorphisms of the TCR. Most of the individual T cell clones were present at very low frequencies, suggesting that they had not undergone clonal expansion. The usage frequencies of the TCR beta variable, beta joining, and beta diversity gene segments were similar among T cells from different individuals. Notably, the usage frequency of individual nucleotides and amino acids within complementarity-determining region (CDR3) intervals was remarkably consistent between individuals. Moreover, our data show that terminal deoxynucleotidyl transferase activity was biased toward the insertion of G (31.92%) and C (27.14%) over A (21.82%) and T (19.12%) nucleotides. Some conserved features could be observed in the composition of CDR3, which may inform future studies of human TCR gene recombination.

  16. High Throughput Sequencing of T Cell Antigen Receptors Reveals a Conserved TCR Repertoire

    PubMed Central

    Hou, Xianliang; Lu, Chong; Chen, Sisi; Xie, Qian; Cui, Guangying; Chen, Jianing; Chen, Zhi; Wu, Zhongwen; Ding, Yulong; Ye, Ping; Dai, Yong; Diao, Hongyan

    2016-01-01

    The T-cell receptor (TCR) repertoire is a mirror of the human immune system that reflects processes caused by infections, cancer, autoimmunity, and aging. Next-generation sequencing has become a powerful tool for deep TCR profiling. Herein, we used this technology to study the repertoire features of TCR beta chain in the blood of healthy individuals. Peripheral blood samples were collected from 10 healthy donors. T cells were isolated with anti-human CD3 magnetic beads according to the manufacturer's protocol. We then combined multiplex-PCR, Illumina sequencing, and IMGT/High V-QUEST to analyze the characteristics and polymorphisms of the TCR. Most of the individual T cell clones were present at very low frequencies, suggesting that they had not undergone clonal expansion. The usage frequencies of the TCR beta variable, beta joining, and beta diversity gene segments were similar among T cells from different individuals. Notably, the usage frequency of individual nucleotides and amino acids within complementarity-determining region (CDR3) intervals was remarkably consistent between individuals. Moreover, our data show that terminal deoxynucleotidyl transferase activity was biased toward the insertion of G (31.92%) and C (27.14%) over A (21.82%) and T (19.12%) nucleotides. Some conserved features could be observed in the composition of CDR3, which may inform future studies of human TCR gene recombination. PMID:26962778

  17. Conserved intergenic sequences revealed by CTAG-profiling in Salmonella: thermodynamic modeling for function prediction

    PubMed Central

    Tang, Le; Zhu, Songling; Mastriani, Emilio; Fang, Xin; Zhou, Yu-Jie; Li, Yong-Guo; Johnston, Randal N.; Guo, Zheng; Liu, Gui-Rong; Liu, Shu-Lin

    2017-01-01

    Highly conserved short sequences help identify functional genomic regions and facilitate genomic annotation. We used Salmonella as the model to search the genome for evolutionarily conserved regions and focused on the tetranucleotide sequence CTAG for its potentially important functions. In Salmonella, CTAG is highly conserved across the lineages and large numbers of CTAG-containing short sequences fall in intergenic regions, strongly indicating their biological importance. Computer modeling demonstrated stable stem-loop structures in some of the CTAG-containing intergenic regions, and substitution of a nucleotide of the CTAG sequence would radically rearrange the free energy and disrupt the structure. The postulated degeneration of CTAG takes distinct patterns among Salmonella lineages and provides novel information about genomic divergence and evolution of these bacterial pathogens. Comparison of the vertically and horizontally transmitted genomic segments showed different CTAG distribution landscapes, with the genome amelioration process to remove CTAG taking place inward from both terminals of the horizontally acquired segment. PMID:28262684
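
    Profiling a tetranucleotide such as CTAG starts with counting its occurrences in a region. The sketch below is a minimal illustration of that counting step, not the authors' pipeline: the function names and the toy sequence are hypothetical. It also checks that CTAG equals its own reverse complement, so counting on one strand already covers both strands.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase or lowercase DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_motif(seq: str, motif: str = "CTAG") -> int:
    """Count occurrences of motif with a sliding window (overlaps allowed)."""
    seq, motif = seq.upper(), motif.upper()
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def motifs_per_kb(seq: str, motif: str = "CTAG") -> float:
    """Motif density per 1000 bases; 0.0 for an empty sequence."""
    return 1000.0 * count_motif(seq, motif) / len(seq) if seq else 0.0


toy_region = "ACTAGGTCTAGA"  # hypothetical sequence, not Salmonella data
print(count_motif(toy_region), reverse_complement("CTAG") == "CTAG")
```

    Comparing `motifs_per_kb` between regions supplied by a genome annotation (for example intergenic versus coding segments) would be one simple way to look at the kind of distribution differences the abstract describes.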

  18. Conserved intergenic sequences revealed by CTAG-profiling in Salmonella: thermodynamic modeling for function prediction

    NASA Astrophysics Data System (ADS)

    Tang, Le; Zhu, Songling; Mastriani, Emilio; Fang, Xin; Zhou, Yu-Jie; Li, Yong-Guo; Johnston, Randal N.; Guo, Zheng; Liu, Gui-Rong; Liu, Shu-Lin

    2017-03-01

    Highly conserved short sequences help identify functional genomic regions and facilitate genomic annotation. We used Salmonella as the model to search the genome for evolutionarily conserved regions and focused on the tetranucleotide sequence CTAG for its potentially important functions. In Salmonella, CTAG is highly conserved across the lineages and large numbers of CTAG-containing short sequences fall in intergenic regions, strongly indicating their biological importance. Computer modeling demonstrated stable stem-loop structures in some of the CTAG-containing intergenic regions, and substitution of a nucleotide of the CTAG sequence would radically rearrange the free energy and disrupt the structure. The postulated degeneration of CTAG takes distinct patterns among Salmonella lineages and provides novel information about genomic divergence and evolution of these bacterial pathogens. Comparison of the vertically and horizontally transmitted genomic segments showed different CTAG distribution landscapes, with the genome amelioration process to remove CTAG taking place inward from both terminals of the horizontally acquired segment.

  19. Conserved intergenic sequences revealed by CTAG-profiling in Salmonella: thermodynamic modeling for function prediction.

    PubMed

    Tang, Le; Zhu, Songling; Mastriani, Emilio; Fang, Xin; Zhou, Yu-Jie; Li, Yong-Guo; Johnston, Randal N; Guo, Zheng; Liu, Gui-Rong; Liu, Shu-Lin

    2017-03-06

    Highly conserved short sequences help identify functional genomic regions and facilitate genomic annotation. We used Salmonella as the model to search the genome for evolutionarily conserved regions and focused on the tetranucleotide sequence CTAG for its potentially important functions. In Salmonella, CTAG is highly conserved across the lineages and large numbers of CTAG-containing short sequences fall in intergenic regions, strongly indicating their biological importance. Computer modeling demonstrated stable stem-loop structures in some of the CTAG-containing intergenic regions, and substitution of a nucleotide of the CTAG sequence would radically rearrange the free energy and disrupt the structure. The postulated degeneration of CTAG takes distinct patterns among Salmonella lineages and provides novel information about genomic divergence and evolution of these bacterial pathogens. Comparison of the vertically and horizontally transmitted genomic segments showed different CTAG distribution landscapes, with the genome amelioration process to remove CTAG taking place inward from both terminals of the horizontally acquired segment.

  20. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2006-07-04

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  1. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2002-01-01

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  2. Kit for detecting nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2001-01-01

    A kit is provided for detecting a target nucleic acid sequence in a sample, the kit comprising: a first hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the first hybridization probe including a first complexing agent for forming a binding pair with a second complexing agent; and a second hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the first hybridization probe does not selectively hybridize, the second hybridization probe including a detectable marker; a third hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the third hybridization probe including the same detectable marker as the second hybridization probe; and a fourth hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the third hybridization probe does not selectively hybridize, the fourth hybridization probe including the first complexing agent for forming a binding pair with the second complexing agent; wherein the first and second hybridization probes are capable of simultaneously hybridizing to the target sequence and the third and fourth hybridization probes are capable of simultaneously hybridizing to the target sequence, the detectable marker is not present on the first or fourth hybridization probes and the first, second, third, and fourth hybridization probes each include a competitive nucleic acid sequence which is sufficiently complementary to a third portion of the target sequence that the competitive sequences of the first, second, third, and fourth hybridization probes compete with each other to hybridize to the third portion of 
the

  3. Analysis and Annotation of Nucleic Acid Sequence

    SciTech Connect

    States, David J.

    2004-07-28

    The aims of this project were to develop improved methods for computational genome annotation and to apply these methods to improve the annotation of genomic sequence data with a specific focus on human genome sequencing. The project resulted in a substantial body of published work. Notable contributions of this project were the identification of basecalling and lane tracking as error processes in genome sequencing and contributions to improved methods for these steps in genome sequencing. This technology improved the accuracy and throughput of genome sequence analysis. Probabilistic methods for physical map construction were developed. Improved methods for sequence alignment, alternative splicing analysis, promoter identification and NF kappa B response gene prediction were also developed.

  4. Septal localization by membrane targeting sequences and a conserved sequence essential for activity at the COOH-terminus of Bacillus subtilis cardiolipin synthase.

    PubMed

    Kusaka, Jin; Shuto, Satoshi; Imai, Yukiko; Ishikawa, Kazuki; Saito, Tomo; Natori, Kohei; Matsuoka, Satoshi; Hara, Hiroshi; Matsumoto, Kouji

    2016-04-01

    The acidic phospholipid cardiolipin (CL) is localized on polar and septal membranes and plays an important physiological role in Bacillus subtilis cells. ClsA, the enzyme responsible for CL synthesis, is also localized on septal membranes. We found that GFP fusion proteins of the enzyme with NH2-terminal and internal deletions retained septal localization. However, derivatives with deletions starting from the COOH-terminus (Leu482) ceased to localize to the septum once the deletion passed the Ile residue at 448, indicating that the sequence responsible for septal localization is confined within a short distance from the COOH-terminus. Two sequences, Ile436-Leu450 and Leu466-Leu478, are predicted to individually form an amphipathic α-helix. This configuration is known as a membrane targeting sequence (MTS) and we therefore refer to them as MTS2 and MTS1, respectively. Either one has the ability to affect septal localization, and each of these sequences by itself localizes to the septum. Membrane association of the constructs of this enzyme containing the MTSs was verified by subcellular fractionation of the cells. CL synthesis, in contrast, was abolished after deleting just the last residue, Leu482, in the COOH-terminal four amino acid residue sequence, Ser-Pro-Ile-Leu, which is highly conserved among bacterial CL synthases.

  5. Solid phase sequencing of double-stranded nucleic acids

    DOEpatents

    Fu, Dong-Jing; Cantor, Charles R.; Koster, Hubert; Smith, Cassandra L.

    2002-01-01

    This invention relates to methods for detecting and sequencing of target double-stranded nucleic acid sequences, to nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probe comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include nucleic acids in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated determination of molecular weights and identification of the target sequence.

  6. Detection of Weakly Conserved Ancestral Mammalian Regulatory Sequences by Primate Comparisons

    SciTech Connect

    Wang, Qian-fei; Prabhakar, Shyam; Chanan, Sumita; Cheng, Jan-Fang; Rubin, Edward M.; Boffelli, Dario

    2006-06-01

    Genomic comparisons between human and distant, non-primate mammals are commonly used to identify cis-regulatory elements based on constrained sequence evolution. However, these methods fail to detect cryptic functional elements, which are too weakly conserved among mammals to distinguish from nonfunctional DNA. To address this problem, we explored the potential of deep intra-primate sequence comparisons. We sequenced the orthologs of 558 kb of human genomic sequence, covering multiple loci involved in cholesterol homeostasis, in 6 nonhuman primates. Our analysis identified 6 noncoding DNA elements displaying significant conservation among primates, but undetectable in more distant comparisons. In vitro and in vivo tests revealed that at least three of these 6 elements have regulatory function. Notably, the mouse orthologs of these three functional human sequences had regulatory activity despite their lack of significant sequence conservation, indicating that they are cryptic ancestral cis-regulatory elements. These regulatory elements could still be detected in a smaller set of three primate species including human, rhesus and marmoset. Since the human and rhesus genome sequences are already available, and the marmoset genome is actively being sequenced, the primate-specific conservation analysis described here can be applied in the near future on a whole-genome scale, to complement the annotation provided by more distant species comparisons.

  7. Accelerated Evolution of Conserved Noncoding Sequences in the Human Genome

    SciTech Connect

    Prabhakar, Shyam; Noonan, James P.; Paabo, Svante; Rubin, Edward M.

    2006-07-06

    Genomic comparisons between human and distant, non-primate mammals are commonly used to identify cis-regulatory elements based on constrained sequence evolution. However, these methods fail to detect "cryptic" functional elements, which are too weakly conserved among mammals to distinguish from nonfunctional DNA. To address this problem, we explored the potential of deep intra-primate sequence comparisons. We sequenced the orthologs of 558 kb of human genomic sequence, covering multiple loci involved in cholesterol homeostasis, in 6 nonhuman primates. Our analysis identified 6 noncoding DNA elements displaying significant conservation among primates, but undetectable in more distant comparisons. In vitro and in vivo tests revealed that at least three of these 6 elements have regulatory function. Notably, the mouse orthologs of these three functional human sequences had regulatory activity despite their lack of significant sequence conservation, indicating that they are cryptic ancestral cis-regulatory elements. These regulatory elements could still be detected in a smaller set of three primate species including human, rhesus and marmoset. Since the human and rhesus genome sequences are already available, and the marmoset genome is actively being sequenced, the primate-specific conservation analysis described here can be applied in the near future on a whole-genome scale, to complement the annotation provided by more distant species comparisons.

  8. Conservation of sequence and function in fertilization of the cortical granule serine protease in echinoderms.

    PubMed

    Oulhen, Nathalie; Xu, Dongdong; Wessel, Gary M

    2014-08-01

    Conservation of the cortical granule serine protease during fertilization in echinoderms was tested both functionally in sea stars, and computationally throughout the echinoderm phylum. We find that the inhibitor of serine protease (soybean trypsin inhibitor) effectively blocks proper transition of the sea star fertilization envelope into a protective sperm repellent, whereas inhibitors of the other main types of proteases had no effect. Scanning the transcriptomes of 15 different echinoderm ovaries revealed sequences of high conservation to the originally identified sea urchin cortical serine protease, CGSP1. These conserved sequences contained the catalytic triad necessary for enzymatic activity, and the tandemly repeated LDLr-like repeats. We conclude that the protease involved in the slow block to polyspermy is an essential and conserved element of fertilization in echinoderms, and may provide an important reagent for identification and testing of the cell surface proteins in eggs necessary for sperm binding.

  9. Conservation of sequence and function in fertilization of the cortical granule serine protease in echinoderms

    PubMed Central

    Oulhen, Nathalie; Xu, Dongdong; Wessel, Gary M.

    2014-01-01

    Conservation of the cortical granule serine protease during fertilization in echinoderms was tested both functionally in sea stars, and computationally throughout the echinoderm phylum. We find that the inhibitor of serine protease (soybean trypsin inhibitor) effectively blocks proper transition of the sea star fertilization envelope into a protective sperm repellent, whereas inhibitors of the other main types of proteases had no effect. Scanning the transcriptomes of 15 different echinoderm ovaries revealed sequences of high conservation to the originally identified sea urchin cortical serine protease, CGSP1. These conserved sequences contained the catalytic triad necessary for enzymatic activity, and the tandemly repeated LDLr-like repeats. We conclude that the protease involved in the slow block to polyspermy is an essential and conserved element of fertilization in echinoderms, and may provide an important reagent for identification and testing of the cell surface proteins in eggs necessary for sperm binding. PMID:24878526

  10. Dipeptide Sequence Determination: Analyzing Phenylthiohydantoin Amino Acids by HPLC

    NASA Astrophysics Data System (ADS)

    Barton, Janice S.; Tang, Chung-Fei; Reed, Steven S.

    2000-02-01

    Amino acid composition and sequence determination, important techniques for characterizing peptides and proteins, are essential for predicting conformation and studying sequence alignment. This experiment presents improved, fundamental methods of sequence analysis for an upper-division biochemistry laboratory. Working in pairs, students use the Edman reagent to prepare phenylthiohydantoin derivatives of amino acids for determination of the sequence of an unknown dipeptide. With a single HPLC technique, students identify both the N-terminal amino acid and the composition of the dipeptide. This method yields good precision of retention times and allows use of a broad range of amino acids as components of the dipeptide. Students learn fundamental principles and techniques of sequence analysis and HPLC.

  11. Conservation of the human telomere sequence (TTAGGG)n among vertebrates.

    PubMed Central

    Meyne, J; Ratliff, R L; Moyzis, R K

    1989-01-01

    To determine the evolutionary origin of the human telomere sequence (TTAGGG)n, biotinylated oligodeoxynucleotides of this sequence were hybridized to metaphase spreads from 91 different species, including representative orders of bony fish, reptiles, amphibians, birds, and mammals. Under stringent hybridization conditions, fluorescent signals were detected at the telomeres of all chromosomes, in all 91 species. The conservation of the (TTAGGG)n sequence and its telomeric location, in species thought to share a common ancestor over 400 million years ago, strongly suggest that this sequence is the functional vertebrate telomere. Images PMID:2780561
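
    The study above detected (TTAGGG)n by hybridization to chromosomes, not by computation. For readers working with sequence data instead, the same repeat can be located in silico with a regular expression. The sketch below is an illustrative analogue only: the function name `longest_telomere_run`, the `min_repeats` threshold, and the toy sequence are all assumptions, not part of the paper.

```python
import re

REPEAT_UNIT = "TTAGGG"


def longest_telomere_run(seq: str, min_repeats: int = 2) -> int:
    """Return the number of units in the longest tandem (TTAGGG)n run.

    Runs shorter than min_repeats units are ignored; returns 0 if none exist.
    """
    pattern = re.compile(r"(?:%s){%d,}" % (REPEAT_UNIT, min_repeats))
    runs = (m.group(0) for m in pattern.finditer(seq.upper()))
    return max((len(run) // len(REPEAT_UNIT) for run in runs), default=0)


# Toy sequence (hypothetical): a 4-unit run and a 2-unit run.
toy = "ACGT" + "TTAGGG" * 4 + "CCA" + "TTAGGG" * 2
print(longest_telomere_run(toy))
```

    Only the forward-strand unit is searched here; a fuller scan would also look for the reverse-complement unit CCCTAA.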

  12. The role of evolutionary conserved germline DH sequence in B-1 cell development and natural antibody production

    PubMed Central

    Vale, Andre M.; Nobrega, Alberto; Schroeder, Harry W.

    2015-01-01

    Due to N addition and variation in the site of V–D–J joining, the third complementarity-determining region of the heavy chain (CDR-H3) is the most diverse component of the initial immunoglobulin antigen-binding site repertoire. A large component of the peritoneal cavity B-1 cell component is the product of fetal and perinatal B cell production. The CDR-H3 repertoire is thus depleted of N addition, which increases dependency on germ-line sequence. Cross-species comparisons have shown that DH gene sequence demonstrates conservation of amino acid preferences by reading frame. Preference for reading frame 1, which is enriched for tyrosine and glycine, is created both by rearrangement patterns and by pre-BCR and BCR selection. In previous studies, we have assessed the role of conserved DH sequence by examining peritoneal cavity B-1 cell numbers and antibody production in BALB/c mice with altered DH loci. Here, we review our finding that changes in the constraints normally imposed by germ line–encoded amino acids within the CDR-H3 repertoire profoundly affect B-1 cell development, especially B-1a cells, and thus natural antibody immunity. Our studies suggest that both natural and somatic selection operate to create a restricted B-1 cell CDR-H3 repertoire. PMID:26104486

  13. The role of evolutionarily conserved germ-line DH sequence in B-1 cell development and natural antibody production.

    PubMed

    Vale, Andre M; Nobrega, Alberto; Schroeder, Harry W

    2015-12-01

    Because of N addition and variation in the site of VDJ joining, the third complementarity-determining region of the heavy chain (CDR-H3) is the most diverse component of the initial immunoglobulin antigen-binding site repertoire. A large component of the peritoneal cavity B-1 cell component is the product of fetal and perinatal B cell production. The CDR-H3 repertoire is thus depleted of N addition, which increases dependency on germ-line sequence. Cross-species comparisons have shown that DH gene sequence demonstrates conservation of amino acid preferences by reading frame. Preference for reading frame 1, which is enriched for tyrosine and glycine, is created both by rearrangement patterns and by pre-BCR and BCR selection. In previous studies, we have assessed the role of conserved DH sequence by examining peritoneal cavity B-1 cell numbers and antibody production in BALB/c mice with altered DH loci. Here, we review our finding that changes in the constraints normally imposed by germ-line-encoded amino acids within the CDR-H3 repertoire profoundly affect B-1 cell development, especially B-1a cells, and thus natural antibody immunity. Our studies suggest that both natural and somatic selection operate to create a restricted B-1 cell CDR-H3 repertoire.

  14. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence.

    PubMed

    Gordon, Kacy L; Arthur, Robert K; Ruvinsky, Ilya

    2015-05-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements.

  15. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence

    PubMed Central

    Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya

    2015-01-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930

  16. Complete cDNA and derived amino acid sequence of human factor V

    SciTech Connect

    Jenny, R.J.; Pittman, D.D.; Toole, J.J.; Kriz, R.W.; Aldape, R.A.; Hewick, R.M.; Kaufman, R.J.; Mann, K.G.

    1987-07-01

    cDNA clones encoding human factor V have been isolated from an oligo(dT)-primed human fetal liver cDNA library prepared with vector Charon 21A. The cDNA sequence of factor V from three overlapping clones includes a 6672-base-pair (bp) coding region, a 90-bp 5' untranslated region, and a 163-bp 3' untranslated region within which is a poly(A) tail. The deduced amino acid sequence consists of 2224 amino acids inclusive of a 28-amino acid leader peptide. Direct comparison with human factor VIII reveals considerable homology between proteins in amino acid sequence and domain structure: a triplicated A domain and duplicated C domain show approx. 40% identity with the corresponding domains in factor VIII. As in factor VIII, the A domains of factor V share approx. 40% amino acid-sequence homology with the three highly conserved domains in ceruloplasmin. The B domain of factor V contains 35 tandem and approx. 9 additional semiconserved repeats of nine amino acids of the form Asp-Leu-Ser-Gln-Thr-Thr/Asn-Leu-Ser-Pro and 2 additional semiconserved repeats of 17 amino acids. Factor V contains 37 potential N-linked glycosylation sites, 25 of which are in the B domain, and a total of 19 cysteine residues.

  17. Amino acid sequence of mouse submaxillary gland renin.

    PubMed Central

    Misono, K S; Chang, J J; Inagami, T

    1982-01-01

    The complete amino acid sequences of the heavy chain and light chain of mouse submaxillary gland renin have been determined. The heavy chain consists of 288 amino acid residues having a Mr of 31,036 calculated from the sequence. The light chain contains 48 amino acid residues with a Mr of 5,458. The sequence of the heavy chain was determined by automated Edman degradations of the cyanogen bromide peptides and tryptic peptides generated after citraconylation, as well as other peptides generated therefrom. The sequence of the light chain was derived from sequence analyses of the peptides generated by cyanogen bromide cleavage or by digestion with Staphylococcus aureus protease. The sequences in the active site regions in renin containing two catalytically essential aspartyl residues 32 and 215 were found identical with those in pepsin, chymosin, and penicillopepsin. Comparison of the amino acid sequence of renin with that of porcine pepsin indicated a 42% sequence identity of the heavy chain with the amino-terminal and middle regions and a 46% identity of the light chain with the carboxyl-terminal region of the porcine pepsin sequence. Residues identical in renin and pepsin are distributed throughout the length of the molecules, suggesting a similarity in their overall structures. PMID:6812055

  18. The nucleotide sequence of the nitrogen-regulation gene ntrA of Klebsiella pneumoniae and comparison with conserved features in bacterial RNA polymerase sigma factors.

    PubMed Central

    Merrick, M J; Gibbins, J R

    1985-01-01

    The nucleotide sequence of the Klebsiella pneumoniae ntrA gene has been determined. NtrA encodes a 53,926 Dalton acidic polypeptide; a calculated molecular weight which is significantly lower than that determined by SDS polyacrylamide gel analysis. NtrA is followed by another open-reading frame (orf) of at least 75 amino acids. In the spacer region between ntrA and orf there are no apparent transcription termination or promoter sequences and therefore orf may be co-transcribed with ntrA. Previous authors have proposed that NtrA could act as an RNA polymerase sigma factor but the NtrA amino acid sequence does not show a high level of homology to any known sigma factor. However, analysis of sequences of five sigma factors from E. coli and B. subtilis has identified two conserved sequences at the C-terminal end of all these polypeptides. These sequences resemble those found in known site-specific DNA-binding domains and may be involved in recognition of conserved -35 and -10 promoter sequences. A similar pair of sequences is present at the C-terminus of NtrA and could play a role in recognition of ntr-activatable promoters. PMID:2999700

  19. Amino Acid Sequence of Human Cholinesterase

    DTIC Science & Technology

    1985-10-01

    liquid chromatography (HPLC). Activity testing of the aged, DFP-labeled cholinesterase showed that 99.8% of the active sites had been labeled, since...acids were quantitated by ninhydrin at the AAA Labs, or by derivatization with phenylisothiocyanate at the University of Michigan. The latter method

  20. Highly conserved d-loop sequences in woolly mouse opossums Marmosa (Micoureus).

    PubMed

    Rocha, Rita Gomes; Leite, Yuri Luiz Reis; Ferreira, Eduardo; Justino, Juliana; Costa, Leonora Pires

    2012-04-01

    This study reports the occurrence of highly conserved d-loop sequences in the mitochondrial genome of the woolly mouse opossum genus Marmosa subgenus Micoureus (Mammalia, Didelphimorphia, Didelphidae). Sixty-six sequences of Marmosa (Micoureus) demerarae, Marmosa (Micoureus) constantiae, and Marmosa (Micoureus) paraguayanus were amplified using universal d-loop primers and virtually no genetic differences were detected within and among species. These sequences matched the control region of the mitochondrial marsupial genome. Analyses of qualitative aspects of these sequences revealed that their structural composition is very similar to the d-loop region of other didelphid species. However, the total lack of variability has not been reported from other closely related species. The data analyzed here support the occurrence of highly conserved d-loop sequences, and we found no support for the hypothesis that these sequences are d-loop-like nuclear pseudogenes. Furthermore, the control and flanking regions obtained with different primers corroborate the lack of variability of the d-loop sequences in the mitochondrial genome of Marmosa (Micoureus).

  1. Cystatin. Amino acid sequence and possible secondary structure.

    PubMed Central

    Schwabe, C; Anastasi, A; Crow, H; McDonald, J K; Barrett, A J

    1984-01-01

    The amino acid sequence of cystatin, the protein from chicken egg-white that is a tight-binding inhibitor of many cysteine proteinases, is reported. Cystatin is composed of 116 amino acid residues, and the Mr is calculated to be 13 143. No striking similarity to any other known sequence has been detected. The results of computer analysis of the sequence and c.d. spectrometry indicate that the secondary structure includes relatively little alpha-helix (about 20%) and that the remainder is mainly beta-structure. PMID:6712597

  2. Detecting Remote Sequence Homology in Disordered Proteins: Discovery of Conserved Motifs in the N-Termini of Mononegavirales phosphoproteins

    PubMed Central

    Karlin, David; Belshaw, Robert

    2012-01-01

    Paramyxovirinae are a large group of viruses that includes measles virus and parainfluenza viruses. The viral Phosphoprotein (P) plays a central role in viral replication. It is composed of a highly variable, disordered N-terminus and a conserved C-terminus. A second viral protein alternatively expressed, the V protein, also contains the N-terminus of P, fused to a zinc finger. We suspected that, despite their high variability, the N-termini of P/V might all be homologous; however, using standard approaches, we could previously identify sequence conservation only in some Paramyxovirinae. We now compared the N-termini using sensitive sequence similarity search programs, able to detect residual similarities unnoticeable by conventional approaches. We discovered that all Paramyxovirinae share a short sequence motif in their first 40 amino acids, which we called soyuz1. Despite its short length (11–16aa), several arguments allow us to conclude that soyuz1 probably evolved by homologous descent, unlike linear motifs. Conservation across such evolutionary distances suggests that soyuz1 plays a crucial role and experimental data suggest that it binds the viral nucleoprotein to prevent its illegitimate self-assembly. In some Paramyxovirinae, the N-terminus of P/V contains a second motif, soyuz2, which might play a role in blocking interferon signaling. Finally, we discovered that the P of related Mononegavirales contain similarly overlooked motifs in their N-termini, and that their C-termini share a previously unnoticed structural similarity suggesting a common origin. Our results suggest several testable hypotheses regarding the replication of Mononegavirales and suggest that disordered regions with little overall sequence similarity, common in viral and eukaryotic proteins, might contain currently overlooked motifs (intermediate in length between linear motifs and disordered domains) that could be detected simply by comparing orthologous proteins. PMID:22403617

  3. Characterization of the dead ringer gene identifies a novel, highly conserved family of sequence-specific DNA-binding proteins.

    PubMed Central

    Gregory, S L; Kortschak, R D; Kalionis, B; Saint, R

    1996-01-01

    We reported the identification of a new family of DNA-binding proteins from our characterization of the dead ringer (dri) gene of Drosophila melanogaster. We show that dri encodes a nuclear protein that contains a sequence-specific DNA-binding domain that bears no similarity to known DNA-binding domains. A number of proteins were found to contain sequences homologous to this domain. Other proteins containing the conserved motif include yeast SWI1, two human retinoblastoma binding proteins, and other mammalian regulatory proteins. A mouse B-cell-specific regulator exhibits 75% identity with DRI over the 137-amino-acid DNA-binding domains of these proteins, indicating a high degree of conservation of this domain. Gel retardation and optimal binding site screens revealed that the in vitro sequence specificity of DRI is strikingly similar to that of many homeodomain proteins, although the sequence and predicted secondary structure do not resemble a homeodomain. The early general expression of dri and the similarity of DRI and homeodomain in vitro DNA-binding specificity compound the problem of understanding the in vivo specificity of action of these proteins. Maternally derived dri product is found throughout the embryo until germ band extension, when dri is expressed in a developmentally regulated set of tissues, including salivary gland ducts, parts of the gut, and a subset of neural cells. The discovery of this new, conserved DNA-binding domain offers an explanation for the regulatory activity of several important members of this class and predicts significant regulatory roles for the others. PMID:8622680

  4. Significance of satellite DNA revealed by conservation of a widespread repeat DNA sequence among angiosperms.

    PubMed

    Mehrotra, Shweta; Goel, Shailendra; Raina, Soom Nath; Rajpal, Vijay Rani

    2014-08-01

    The analysis of plant genome structure and evolution requires comprehensive characterization of repetitive sequences that make up the majority of plant nuclear DNA. In the present study, we analyzed the nature of pCtKpnI-I and pCtKpnI-II tandem repeated sequences, reported earlier in Carthamus tinctorius. Interestingly, a homolog of the pCtKpnI-I repeat sequence was also found to be present in widely divergent families of angiosperms. pCtKpnI-I showed high sequence similarity but low copy number among various taxa of different families of angiosperms analyzed. In comparison, pCtKpnI-II was specific to the genus Carthamus and was not present in any other taxa analyzed. The molecular structure of pCtKpnI-I was analyzed in various unrelated taxa of angiosperms to decipher the evolutionarily conserved nature of the sequence and its possible functional role.

  5. Mouse Vk gene classification by nucleic acid sequence similarity.

    PubMed

    Strohal, R; Helmberg, A; Kroemer, G; Kofler, R

    1989-01-01

    Analyses of immunoglobulin (Ig) variable (V) region gene usage in the immune response, estimates of V gene germline complexity, and other nucleic acid hybridization-based studies depend on the extent to which such genes are related (i.e., sequence similarity) and their organization in gene families. While mouse Igh heavy chain V region (VH) gene families are relatively well-established, a corresponding systematic classification of Igk light chain V region (Vk) genes has not been reported. The present analysis, in the course of which we reviewed the known extent of the Vk germline gene repertoire and Vk gene usage in a variety of responses to foreign and self antigens, provides a classification of mouse Vk genes in gene families composed of members with greater than 80% overall nucleic acid sequence similarity. This classification differed in several aspects from that of VH genes: only some Vk gene families were as clearly separated (by greater than 25% sequence dissimilarity) as typical VH gene families; most Vk gene families were closely related and, in several instances, members from different families were very similar (greater than 80%) over large sequence portions; frequently, classification by nucleic acid sequence similarity diverged from existing classifications based on amino-terminal protein sequence similarity. Our data have implications for Vk gene analyses by nucleic acid hybridization and describe potentially important differences in sequence organization between VH and Vk genes.
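
    The family definition in the abstract above (members sharing greater than 80% overall nucleic acid sequence similarity) can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length, scores similarity as simple percent identity over ungapped columns, and groups genes by single linkage. The abstract does not state how similarity was scored or how families were assembled, so the function names, the linkage rule, and the toy sequences are all hypothetical.

```python
from __future__ import annotations

from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent of aligned columns with identical bases.

    Columns with a gap ('-') in either sequence are skipped.
    Assumes the two sequences are pre-aligned to equal length.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def group_into_families(seqs: dict[str, str], threshold: float = 80.0) -> list[set[str]]:
    """Single-linkage grouping (an assumption, not the paper's stated method):
    two genes are linked when their identity exceeds the threshold."""
    parent = {name: name for name in seqs}

    def find(name: str) -> str:
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if percent_identity(seqs[a], seqs[b]) > threshold:
            parent[find(a)] = find(b)

    families: dict[str, set[str]] = {}
    for name in seqs:
        families.setdefault(find(name), set()).add(name)
    return list(families.values())


# Hypothetical toy sequences, not real Vk genes.
toy = {
    "geneA": "ACGTACGTAC",
    "geneB": "ACGTACGTAA",
    "geneC": "TTTTGGGGCC",
}
families = group_into_families(toy)
```

    With single linkage, two genes can land in the same family through an intermediate member even when their own pairwise identity is below the threshold; a stricter complete-linkage rule would behave differently.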

  6. Amino acid sequence of myoglobin from white-tailed deer (Odocoileus virginianus).

    PubMed

    Joseph, Poulson; Suman, Surendranath P; Li, Shuting; Fontaine, Michele; Steinke, Laurey

    2012-10-01

    Our objective was to determine the primary structure of white-tailed deer myoglobin (Mb). White-tailed deer Mb was isolated from cardiac muscles employing ammonium sulfate precipitation and gel-filtration chromatography. The amino acid sequence was determined by Edman degradation. Sequence analyses of intact Mb as well as tryptic- and cyanogen bromide-peptides yielded the complete primary structure of white-tailed deer Mb, which shared 100% similarity with red deer Mb. White-tailed deer Mb consists of 153 amino acid residues and shares more than 96% sequence similarity with myoglobins from meat-producing ruminants, such as cattle, buffalo, sheep, and goat. Similar to sheep and goat myoglobins, white-tailed deer Mb contains 12 histidine residues. Proximal (position 93) and distal (position 64) histidine residues responsible for maintaining the stability of heme are conserved in white-tailed deer Mb.

  7. Nucleic and Amino Acid Sequences Support Structure-Based Viral Classification

    PubMed Central

    Sinclair, Robert M.; Ravantti, Janne J.

    2017-01-01

    ABSTRACT Viral capsids ensure viral genome integrity by protecting the enclosed nucleic acids. Interactions between the genome and capsid and between individual capsid proteins (i.e., capsid architecture) are intimate and are expected to be characterized by strong evolutionary conservation. For this reason, a capsid structure-based viral classification has been proposed as a way to bring order to the viral universe. The seeming lack of sufficient sequence similarity to reproduce this classification has made it difficult to reject structural convergence as the basis for the classification. We reinvestigate whether the structure-based classification for viral coat proteins making icosahedral virus capsids is in fact supported by previously undetected sequence similarity. Since codon choices can influence nascent protein folding cotranslationally, we searched for both amino acid and nucleotide sequence similarity. To demonstrate the sensitivity of the approach, we identify a candidate gene for the pandoravirus capsid protein. We show that the structure-based classification is strongly supported by amino acid and also nucleotide sequence similarities, suggesting that the similarities are due to common descent. The correspondence between structure-based and sequence-based analyses of the same proteins shown here allows them to be used in future analyses of the relationship between linear sequence information and macromolecular function, as well as between linear sequence and protein folds. IMPORTANCE Viral capsids protect nucleic acid genomes, which in turn encode capsid proteins. This tight coupling of protein shell and nucleic acids, together with strong functional constraints on capsid protein folding and architecture, leads to the hypothesis that capsid protein-coding nucleotide sequences may retain signatures of ancient viral evolution. We have been able to show that this is indeed the case, using the major capsid proteins of viruses forming icosahedral capsids

  8. Domains in microbial beta-1, 4-glycanases: sequence conservation, function, and enzyme families.

    PubMed Central

    Gilkes, N R; Henrissat, B; Kilburn, D G; Miller, R C; Warren, R A

    1991-01-01

    Several types of domain occur in beta-1, 4-glycanases. The best characterized of these are the catalytic domains and the cellulose-binding domains. The domains may be joined by linker sequences rich in proline or hydroxyamino acids or both. Some of the enzymes contain repeated sequences up to 150 amino acids in length. The enzymes can be grouped into families on the basis of sequence similarities between the catalytic domains. There are sequence similarities between the cellulose-binding domains, of which two types have been identified, and also between some domains of unknown function. The beta-1, 4-glycanases appear to have arisen by the shuffling of a relatively small number of progenitor sequences. PMID:1886523

  9. Rice pseudomolecule-anchored cross-species DNA sequence alignments indicate regional genomic variation in expressed sequence conservation

    PubMed Central

    Armstead, Ian; Huang, Lin; King, Julie; Ougham, Helen; Thomas, Howard; King, Ian

    2007-01-01

    Background: Various methods have been developed to explore inter-genomic relationships among plant species. Here, we present a sequence similarity analysis based upon comparison of transcript-assembly and methylation-filtered databases from five plant species and physically anchored rice coding sequences. Results: A comparison of the frequency of sequence alignments, determined by MegaBLAST, between rice coding sequences in TIGR pseudomolecules and annotations vs 4.0 and comprehensive transcript-assembly and methylation-filtered databases from Lolium perenne (ryegrass), Zea mays (maize), Hordeum vulgare (barley), Glycine max (soybean) and Arabidopsis thaliana (thale cress) was undertaken. Each rice pseudomolecule was divided into 10 segments, each containing 10% of the functionally annotated, expressed genes. This indicated a correlation between relative segment position in the rice genome and numbers of alignments with all the queried monocot and dicot plant databases. Colour-coded moving windows of 100 functionally annotated, expressed genes along each pseudomolecule were used to generate 'heat-maps'. These revealed consistent intra- and inter-pseudomolecule variation in the relative concentrations of significant alignments with the tested plant databases. Analysis of the annotations and derived putative expression patterns of rice genes from 'hot-spots' and 'cold-spots' within the heat maps indicated possible functional differences. A similar comparison relating to ancestral duplications of the rice genome indicated that duplications were often associated with 'hot-spots'. Conclusion: Physical positions of expressed genes in the rice genome are correlated with the degree of conservation of similar sequences in the transcriptomes of other plant species. This relative conservation is associated with the distribution of different sized gene families and segmentally duplicated loci and may have functional and evolutionary implications. PMID:17708759
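
    The two summaries described in the abstract above (ten segments each holding 10% of the genes, and moving windows of 100 genes) can be sketched roughly as follows. The sketch assumes each gene along a pseudomolecule has already been assigned a count of significant alignments; the function names and the toy counts are hypothetical, and the colour coding of the published heat-maps is not reproduced.

```python
from __future__ import annotations


def segment_totals(hits: list[int], n_segments: int = 10) -> list[int]:
    """Split per-gene alignment counts (in chromosomal order) into
    n_segments blocks of near-equal gene number and total each block."""
    n = len(hits)
    # Integer boundaries so every gene falls in exactly one segment.
    bounds = [i * n // n_segments for i in range(n_segments + 1)]
    return [sum(hits[bounds[i]:bounds[i + 1]]) for i in range(n_segments)]


def moving_window_totals(hits: list[int], window: int = 100) -> list[int]:
    """Total alignment count in every window of `window` consecutive genes,
    sliding one gene at a time."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(hits) < window:
        return []
    total = sum(hits[:window])
    totals = [total]
    for i in range(window, len(hits)):
        # Add the gene entering the window, drop the one leaving it.
        total += hits[i] - hits[i - window]
        totals.append(total)
    return totals


# Hypothetical toy data: alignment counts for ten consecutive genes.
toy_hits = [1, 0, 2, 1, 3, 0, 0, 1, 2, 1]
per_segment = segment_totals(toy_hits, n_segments=5)
per_window = moving_window_totals(toy_hits, window=3)
```

    High window totals would correspond to the 'hot-spots' and low totals to the 'cold-spots' mentioned in the abstract, although the thresholds used in the study are not given here.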

  10. Seq2Logo: a method for construction and visualization of amino acid binding motifs and sequence profiles including sequence weighting, pseudo counts and two-sided representation of amino acid enrichment and depletion

    PubMed Central

    Thomsen, Martin Christen Frølund; Nielsen, Morten

    2012-01-01

    Seq2Logo is a web-based sequence logo generator. Sequence logos are a graphical representation of the information content stored in a multiple sequence alignment (MSA) and provide a compact and highly intuitive representation of the position-specific amino acid composition of binding motifs, active sites, etc. in biological sequences. Accurate generation of sequence logos is often compromised by sequence redundancy and a low number of observations. Moreover, most methods available for sequence logo generation focus on displaying the position-specific enrichment of amino acids, discarding the equally valuable information related to amino acid depletion. Seq2Logo aims to resolve these issues by allowing the user to include sequence weighting to correct for data redundancy, pseudo counts to correct for a low number of observations and different logotype representations each capturing different aspects related to amino acid enrichment and depletion. Besides allowing input in the format of peptides and MSA, Seq2Logo accepts input as Blast sequence profiles, providing easy access for non-expert end-users to characterize and identify functionally conserved/variable amino acids in any given protein of interest. The output from the server is a sequence logo and a PSSM. Seq2Logo is available at http://www.cbs.dtu.dk/biotools/Seq2Logo (14 May 2012, date last accessed). PMID:22638583

  11. Seq2Logo: a method for construction and visualization of amino acid binding motifs and sequence profiles including sequence weighting, pseudo counts and two-sided representation of amino acid enrichment and depletion.

    PubMed

    Thomsen, Martin Christen Frølund; Nielsen, Morten

    2012-07-01

    Seq2Logo is a web-based sequence logo generator. Sequence logos are a graphical representation of the information content stored in a multiple sequence alignment (MSA) and provide a compact and highly intuitive representation of the position-specific amino acid composition of binding motifs, active sites, etc. in biological sequences. Accurate generation of sequence logos is often compromised by sequence redundancy and a low number of observations. Moreover, most methods available for sequence logo generation focus on displaying the position-specific enrichment of amino acids, discarding the equally valuable information related to amino acid depletion. Seq2Logo aims to resolve these issues by allowing the user to include sequence weighting to correct for data redundancy, pseudo counts to correct for a low number of observations and different logotype representations each capturing different aspects related to amino acid enrichment and depletion. Besides allowing input in the format of peptides and MSA, Seq2Logo accepts input as Blast sequence profiles, providing easy access for non-expert end-users to characterize and identify functionally conserved/variable amino acids in any given protein of interest. The output from the server is a sequence logo and a PSSM. Seq2Logo is available at http://www.cbs.dtu.dk/biotools/Seq2Logo (14 May 2012, date last accessed).
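
    To make the notion of per-position information content concrete, the following is a minimal sketch of a classic Shannon-style calculation over the columns of an MSA, with a flat additive pseudocount. It is not the Seq2Logo algorithm: the uniform background, the flat pseudocount, the absence of sequence weighting, and all function names are simplifying assumptions made for illustration only.

```python
from __future__ import annotations

import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_information(column: str, pseudocount: float = 0.0) -> float:
    """Information content (bits) of one alignment column:
    log2(20) minus the Shannon entropy of the observed frequencies.
    Gaps and non-standard symbols are ignored."""
    counts = Counter(c for c in column.upper() if c in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    if total == 0:
        return 0.0
    entropy = 0.0
    for aa in AMINO_ACIDS:
        # Flat additive pseudocount (an assumption; real tools may use
        # substitution-matrix-based pseudocounts instead).
        p = (counts[aa] + pseudocount) / total
        if p > 0:
            entropy -= p * math.log2(p)
    return math.log2(len(AMINO_ACIDS)) - entropy


def alignment_information(msa: list[str], pseudocount: float = 0.0) -> list[float]:
    """Information content for every column of an equal-length alignment."""
    if len({len(seq) for seq in msa}) > 1:
        raise ValueError("all aligned sequences must have the same length")
    return [column_information("".join(col), pseudocount) for col in zip(*msa)]


# Hypothetical toy alignment of four 3-residue peptides.
toy_msa = ["GYA", "GYC", "GFD", "GFE"]
raw = alignment_information(toy_msa)
smoothed = alignment_information(toy_msa, pseudocount=1.0)
```

    In the toy alignment the fully conserved first column carries the most information and the fully variable third column the least; adding pseudocounts lowers the apparent information of a column supported by only four observations, which is the small-sample correction the abstract alludes to.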

  12. Sense-antisense gene pairs: sequence, transcription, and structure are not conserved between human and mouse

    PubMed Central

    Wood, Emily J.; Chin-Inmanu, Kwanrutai; Jia, Hui; Lipovich, Leonard

    2013-01-01

    Previous efforts to characterize conservation between the human and mouse genomes focused largely on sequence comparisons. These studies are inherently limited because they don't account for gene structure differences, which may exist despite genomic sequence conservation. Recent high-throughput transcriptome studies have revealed widespread and extensive overlaps between genes, and transcripts, encoded on both strands of the genomic sequence. This overlapping gene organization, which produces sense-antisense (SAS) gene pairs, is capable of effecting regulatory cascades through established mechanisms. We present an evolutionary conservation assessment of SAS pairs, on three levels: genomic, transcriptomic, and structural. From a genome-wide dataset of human SAS pairs, we first identified orthologous loci in the mouse genome, then assessed their transcription in the mouse, and finally compared the genomic structures of SAS pairs expressed in both species. We found that approximately half of human SAS loci have single orthologous locations in the mouse genome; however, only half of those orthologous locations have SAS transcriptional activity in the mouse. This suggests that high human-mouse gene conservation overlooks widespread distinctions in SAS pair incidence and expression. We compared gene structures at orthologous SAS loci, finding frequent differences in gene structure between human and orthologous mouse SAS pair members. Our categorization of human SAS pairs with respect to mouse conservation of expression as well as structure points to limitations of mouse models. Gene structure differences, including at SAS loci, may account for some of the phenotypic distinctions between primates and rodents. Genes in non-conserved SAS pairs may contribute to evolutionary lineage-specific regulatory outcomes. PMID:24133500

  13. Conservation of Tubulin-Binding Sequences in TRPV1 throughout Evolution

    PubMed Central

    Sardar, Puspendu; Kumar, Abhishek; Bhandari, Anita; Goswami, Chandan

    2012-01-01

    Background: Transient Receptor Potential Vanilloid subtype 1 (TRPV1), commonly known as the capsaicin receptor, can detect multiple stimuli, including noxious compounds, low pH, temperature, and electromagnetic waves at different ranges. In addition, this receptor is involved in multiple physiological and sensory processes. Therefore, the functions of TRPV1 directly influence adaptation and further evolution. The availability of various eukaryotic genomic sequences in the public domain facilitates the study of the molecular evolution of the TRPV1 protein and the conservation of certain domains, motifs and interacting regions that are functionally important. Methodology and Principal Findings: Using statistical and bioinformatics tools, our analysis reveals that TRPV1 evolved ∼420 million years ago (MYA). Our analysis reveals that specific regions, domains and motifs of TRPV1 have gone through different selection pressures and thus have different levels of conservation. We found that, among all of these, the TRP box is the most conserved and thus has functional significance. Our results also indicate that the tubulin binding sequences (TBS) have evolutionary significance, as these sequence stretches are more conserved than many other essential regions of TRPV1. The overall distribution of positively charged residues within the TBS motifs is conserved throughout evolution. In silico analysis reveals that the TBS-1 and TBS-2 of TRPV1 can form helical structures and may play an important role in TRPV1 function. Conclusions and Significance: Our analysis identifies the regions of TRPV1 that are important for its structure-function relationship. This analysis indicates that tubulin binding sequence-1 (TBS-1) near the TRP-box forms a potential helix and the tubulin interactions with TRPV1 via TBS-1 have evolutionary significance. This interaction may be required for the proper channel function and regulation and may also have significance in the context of Taxol

  14. Studying RNA homology and conservation with Infernal: from single sequences to RNA families

    PubMed Central

    Barquist, Lars; Burge, Sarah W.; Gardner, Paul P.

    2016-01-01

    Emerging high-throughput technologies have led to a deluge of putative non-coding RNA (ncRNA) sequences identified in a wide variety of organisms. Systematic characterization of these transcripts will be a tremendous challenge. Homology detection is critical to making maximal use of functional information gathered about ncRNAs: identifying homologous sequence allows us to transfer information gathered in one organism to another quickly and with a high degree of confidence. ncRNA presents a challenge for homology detection, as the primary sequence is often poorly conserved and de novo secondary structure prediction and search remains difficult. This protocol introduces methods developed by the Rfam database for identifying “families” of homologous ncRNAs starting from single “seed” sequences using manually curated sequence alignments to build powerful statistical models of sequence and structure conservation known as covariance models (CMs), implemented in the Infernal software package. We provide a step-by-step iterative protocol for identifying ncRNA homologs, then constructing an alignment and corresponding CM. We also work through an example for the bacterial small RNA MicA, discovering a previously unreported family of divergent MicA homologs in genus Xenorhabdus in the process. PMID:27322404

  15. Studying RNA Homology and Conservation with Infernal: From Single Sequences to RNA Families.

    PubMed

    Barquist, Lars; Burge, Sarah W; Gardner, Paul P

    2016-06-20

    Emerging high-throughput technologies have led to a deluge of putative non-coding RNA (ncRNA) sequences identified in a wide variety of organisms. Systematic characterization of these transcripts will be a tremendous challenge. Homology detection is critical to making maximal use of functional information gathered about ncRNAs: identifying homologous sequence allows us to transfer information gathered in one organism to another quickly and with a high degree of confidence. ncRNA presents a challenge for homology detection, as the primary sequence is often poorly conserved and de novo secondary structure prediction and search remain difficult. This unit introduces methods developed by the Rfam database for identifying "families" of homologous ncRNAs starting from single "seed" sequences, using manually curated sequence alignments to build powerful statistical models of sequence and structure conservation known as covariance models (CMs), implemented in the Infernal software package. We provide a step-by-step iterative protocol for identifying ncRNA homologs and then constructing an alignment and corresponding CM. We also work through an example for the bacterial small RNA MicA, discovering a previously unreported family of divergent MicA homologs in genus Xenorhabdus in the process. © 2016 by John Wiley & Sons, Inc.

  16. Mathematical Characterization of Protein Sequences Using Patterns as Chemical Group Combinations of Amino Acids.

    PubMed

    Das, Jayanta Kumar; Das, Provas; Ray, Korak Kumar; Choudhury, Pabitra Pal; Jana, Siddhartha Sankar

    2016-01-01

    Comparison of amino acid sequence similarity is the fundamental concept behind the protein phylogenetic tree formation. By virtue of this method, we can explain the evolutionary relationships, but further explanations are not possible unless sequences are studied through the chemical nature of individual amino acids. Here we develop a new methodology to characterize the protein sequences on the basis of the chemical nature of the amino acids. We design various algorithms for studying the variation of chemical group transitions and various chemical group combinations as patterns in the protein sequences. The amino acid sequence of conventional myosin II head domain of 14 family members are taken to illustrate this new approach. We find two blocks of maximum length 6 aa as 'FPKATD' and 'Y/FTNEKL' without repeating the same chemical nature and one block of maximum length 20 aa with the repetition of chemical nature which are common among all 14 members. We also check commonality with another motor protein sub-family kinesin, KIF1A. Based on our analysis we find a common block of length 8 aa both in myosin II and KIF1A. This motif is located in the neck linker region which could be responsible for the generation of mechanical force, enabling us to find the unique blocks which remain chemically conserved across the family. We also validate our methodology with different protein families such as MYOI, Myosin light chain kinase (MLCK) and Rho-associated protein kinase (ROCK), Na+/K+-ATPase and Ca2+-ATPase. Altogether, our studies provide a new methodology for investigating the conserved amino acids' pattern in different proteins.
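
    The idea of scanning a sequence for blocks in which no chemical group repeats can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the eight-group classification in `GROUPS` and the function names are assumptions made for the example, and the abstract does not state which grouping the paper actually uses.

```python
# Hypothetical grouping of the 20 amino acids by side-chain chemistry.
# This classification is an assumption for illustration only.
GROUPS = {
    "acidic": "DE",
    "basic": "KRH",
    "amide": "NQ",
    "hydroxyl": "ST",
    "sulfur": "CM",
    "aromatic": "FWY",
    "aliphatic": "AGILV",
    "cyclic": "P",
}
AA_TO_GROUP = {aa: group for group, aas in GROUPS.items() for aa in aas}


def encode(seq):
    """Translate a protein sequence into a list of chemical-group labels."""
    return [AA_TO_GROUP[aa] for aa in seq.upper()]


def longest_nonrepeating_block(seq):
    """Return (start, block) for the longest stretch of residues in which
    every residue belongs to a different chemical group."""
    groups = encode(seq)
    last_seen = {}          # group label -> most recent index
    start = 0               # left edge of the current window
    best_start, best_len = 0, 0
    for i, group in enumerate(groups):
        # If the group already occurs inside the window, shrink the window.
        if group in last_seen and last_seen[group] >= start:
            start = last_seen[group] + 1
        last_seen[group] = i
        if i - start + 1 > best_len:
            best_start, best_len = start, i - start + 1
    return best_start, seq[best_start:best_start + best_len]


if __name__ == "__main__":
    print(longest_nonrepeating_block("AAFPKATDD"))
```

    Under this assumed grouping, the six residues of the block 'FPKATD' quoted in the abstract each fall into a different group, so the whole block is returned for a sequence that contains it.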

  17. Integrated genome analysis suggests that most conserved non-coding sequences are regulatory factor binding sites

    PubMed Central

    Hemberg, Martin; Gray, Jesse M.; Cloonan, Nicole; Kuersten, Scott; Grimmond, Sean; Greenberg, Michael E.; Kreiman, Gabriel

    2012-01-01

    More than 98% of a typical vertebrate genome does not code for proteins. Although non-coding regions are sprinkled with short (<200 bp) islands of evolutionarily conserved sequences, the function of most of these unannotated conserved islands remains unknown. One possibility is that unannotated conserved islands could encode non-coding RNAs (ncRNAs); alternatively, unannotated conserved islands could serve as promoter-distal regulatory factor binding sites (RFBSs) like enhancers. Here we assess these possibilities by comparing unannotated conserved islands in the human and mouse genomes to transcribed regions and to RFBSs, relying on a detailed case study of one human and one mouse cell type. We define transcribed regions by applying a novel transcript-calling algorithm to RNA-Seq data obtained from total cellular RNA, and we define RFBSs using ChIP-Seq and DNAse-hypersensitivity assays. We find that unannotated conserved islands are four times more likely to coincide with RFBSs than with unannotated ncRNAs. Thousands of conserved RFBSs can be categorized as insulators based on the presence of CTCF or as enhancers based on the presence of p300/CBP and H3K4me1. While many unannotated conserved RFBSs are transcriptionally active to some extent, the transcripts produced tend to be unspliced, non-polyadenylated and expressed at levels 10 to 100-fold lower than annotated coding or ncRNAs. Extending these findings across multiple cell types and tissues, we propose that most conserved non-coding genomic DNA in vertebrate genomes corresponds to promoter-distal regulatory elements. PMID:22684627

  18. Mutational Studies on Resurrected Ancestral Proteins Reveal Conservation of Site-Specific Amino Acid Preferences throughout Evolutionary History

    PubMed Central

    Risso, Valeria A.; Manssour-Triedo, Fadia; Delgado-Delgado, Asunción; Arco, Rocio; Barroso-delJesus, Alicia; Ingles-Prieto, Alvaro; Godoy-Ruiz, Raquel; Gavira, Jose A.; Gaucher, Eric A.; Ibarra-Molero, Beatriz; Sanchez-Ruiz, Jose M.

    2015-01-01

    Local protein interactions (“molecular context” effects) dictate amino acid replacements and can be described in terms of site-specific, energetic preferences for any different amino acid. It has been recently debated whether these preferences remain approximately constant during evolution or whether, due to coevolution of sites, they change strongly. Such research highlights an unresolved and fundamental issue with far-reaching implications for phylogenetic analysis and molecular evolution modeling. Here, we take advantage of the recent availability of phenotypically supported laboratory resurrections of Precambrian thioredoxins and β-lactamases to experimentally address the change of site-specific amino acid preferences over long geological timescales. Extensive mutational analyses support the notion that evolutionary adjustment to a new amino acid may occur, but to a large extent this is insufficient to erase the primitive preference for amino acid replacements. Generally, site-specific amino acid preferences appear to remain conserved throughout evolutionary history despite local sequence divergence. We show such preference conservation to be readily understandable in molecular terms and we provide crystallographic evidence for an intriguing structural-switch mechanism: Energetic preference for an ancestral amino acid in a modern protein can be linked to reorganization upon mutation to the ancestral local structure around the mutated site. Finally, we point out that site-specific preference conservation naturally leads to one plausible evolutionary explanation for the existence of intragenic global suppressor mutations. PMID:25392342

  19. The penicillin gene cluster is amplified in tandem repeats linked by conserved hexanucleotide sequences.

    PubMed Central

    Fierro, F; Barredo, J L; Díez, B; Gutierrez, S; Fernández, F J; Martín, J F

    1995-01-01

    The penicillin biosynthetic genes (pcbAB, pcbC, penDE) of Penicillium chrysogenum AS-P-78 were located in a 106.5-kb DNA region that is amplified in tandem repeats (five or six copies) linked by conserved TTTACA sequences. The wild-type strains P. chrysogenum NRRL 1951 and Penicillium notatum ATCC 9478 (Fleming's isolate) contain a single copy of the 106.5-kb region. This region was bordered by the same TTTACA hexanucleotide found between tandem repeats in strain AS-P-78. A penicillin overproducer strain, P. chrysogenum E1, contains a large number of copies in tandem of a 57.9-kb DNA fragment, linked by the same hexanucleotide or its reverse complementary TGTAAA sequence. The deletion mutant P. chrysogenum npe10 showed a deletion of 57.9 kb that corresponds exactly to the DNA fragment that is amplified in E1. The conserved hexanucleotide sequence was reconstituted at the deletion site. The amplification has occurred within a single chromosome (chromosome I). The tandem reiteration and deletion appear to arise by mutation-induced site-specific recombination at the conserved hexanucleotide sequences. PMID:7597101
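
    The relationship between the TTTACA hexanucleotide and its reverse complement TGTAAA can be checked with a few lines of code. The sketch below is illustrative only; the function names and the toy DNA string are assumptions and do not come from the paper.

```python
# Complement table for the four unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motif_both_strands(dna, motif):
    """List (position, strand) for each occurrence of the motif ('+')
    or of its reverse complement ('-') in the given DNA string."""
    dna, motif = dna.upper(), motif.upper()
    rc = reverse_complement(motif)
    width = len(motif)
    hits = []
    for i in range(len(dna) - width + 1):
        window = dna[i:i + width]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits


if __name__ == "__main__":
    print(reverse_complement("TTTACA"))
    print(find_motif_both_strands("GGTTTACAGGTGTAAAGG", "TTTACA"))
```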

  20. Amino acid sequence repertoire of the bacterial proteome and the occurrence of untranslatable sequences

    PubMed Central

    Navon, Sharon Penias; Kornberg, Guy; Chen, Jin; Schwartzman, Tali; Tsai, Albert; Puglisi, Elisabetta Viani; Puglisi, Joseph D.; Adir, Noam

    2016-01-01

    Bioinformatic analysis of Escherichia coli proteomes revealed that all possible amino acid triplet sequences occur at their expected frequencies, with four exceptions. Two of the four underrepresented sequences (URSs) were shown to interfere with translation in vivo and in vitro. Enlarging the URS by a single amino acid resulted in increased translational inhibition. Single-molecule methods revealed stalling of translation at the entrance of the peptide exit tunnel of the ribosome, adjacent to ribosomal nucleotides A2062 and U2585. Interaction with these same ribosomal residues is involved in regulation of translation by longer, naturally occurring protein sequences. The E. coli exit tunnel has evidently evolved to minimize interaction with the exit tunnel and maximize the sequence diversity of the proteome, although allowing some interactions for regulatory purposes. Bioinformatic analysis of the human proteome revealed no underrepresented triplet sequences, possibly reflecting an absence of regulation by interaction with the exit tunnel. PMID:27307442
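
    A comparison of observed and expected triplet counts might be sketched as below. The abstract does not say how expected frequencies were computed, so this example assumes the simplest null model, in which each residue is drawn independently with its overall proteome frequency; the function name and toy input are likewise hypothetical.

```python
from collections import Counter
from itertools import product


def triplet_enrichment(proteins):
    """Return {triplet: observed / expected} for every triplet over the
    residues present in the input, assuming independent residues."""
    residue_counts = Counter()
    triplet_counts = Counter()
    for seq in proteins:
        residue_counts.update(seq)
        triplet_counts.update(seq[i:i + 3] for i in range(len(seq) - 2))

    total_residues = sum(residue_counts.values())
    total_triplets = sum(triplet_counts.values())
    if total_triplets == 0:
        return {}

    freq = {aa: n / total_residues for aa, n in residue_counts.items()}
    ratios = {}
    for triplet in map("".join, product(sorted(freq), repeat=3)):
        # Expected count = product of single-residue frequencies * windows.
        expected = (freq[triplet[0]] * freq[triplet[1]] * freq[triplet[2]]
                    * total_triplets)
        ratios[triplet] = triplet_counts[triplet] / expected
    return ratios


if __name__ == "__main__":
    ratios = triplet_enrichment(["ABAB"])
    # Triplets with a ratio far below 1 would be candidates for
    # underrepresentation under this null model.
    print(sorted(ratios.items(), key=lambda item: item[1]))
```

    On a real proteome, a significance test and a correction for multiple comparisons would be needed before calling any triplet underrepresented; the ratio alone is only a first screen.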

  1. Amino acid sequence repertoire of the bacterial proteome and the occurrence of untranslatable sequences.

    PubMed

    Navon, Sharon Penias; Kornberg, Guy; Chen, Jin; Schwartzman, Tali; Tsai, Albert; Puglisi, Elisabetta Viani; Puglisi, Joseph D; Adir, Noam

    2016-06-28

    Bioinformatic analysis of Escherichia coli proteomes revealed that all possible amino acid triplet sequences occur at their expected frequencies, with four exceptions. Two of the four underrepresented sequences (URSs) were shown to interfere with translation in vivo and in vitro. Enlarging the URS by a single amino acid resulted in increased translational inhibition. Single-molecule methods revealed stalling of translation at the entrance of the peptide exit tunnel of the ribosome, adjacent to ribosomal nucleotides A2062 and U2585. Interaction with these same ribosomal residues is involved in regulation of translation by longer, naturally occurring protein sequences. The E. coli exit tunnel has evidently evolved to minimize interaction with the exit tunnel and maximize the sequence diversity of the proteome, although allowing some interactions for regulatory purposes. Bioinformatic analysis of the human proteome revealed no underrepresented triplet sequences, possibly reflecting an absence of regulation by interaction with the exit tunnel.

  2. Protein engineering of selected residues from conserved sequence regions of a novel Anoxybacillus α-amylase

    PubMed Central

    Ranjani, Velayudhan; Janeček, Štefan; Chai, Kian Piaw; Shahir, Shafinaz; Rahman, Raja Noor Zaliha Raja Abdul; Chan, Kok-Gan; Goh, Kian Mau

    2014-01-01

    The α-amylases from Anoxybacillus species (ASKA and ADTA), Bacillus aquimaris (BaqA) and Geobacillus thermoleovorans (GTA, Pizzo and GtamyII) were proposed as a novel group of the α-amylase family GH13. An ASKA yielding a high percentage of maltose upon its reaction on starch was chosen as a model to study the residues responsible for the biochemical properties. Four residues from conserved sequence regions (CSRs) were thus selected, and the mutants F113V (CSR-I), Y187F and L189I (CSR-II) and A161D (CSR-V) were characterised. Few changes in the optimum reaction temperature and pH were observed for all mutants. Whereas the Y187F (t1/2 43 h) and L189I (t1/2 36 h) mutants had a lower thermostability at 65°C than the native ASKA (t1/2 48 h), the mutants F113V and A161D exhibited an improved t1/2 of 51 h and 53 h, respectively. Among the mutants, only the A161D had a specific activity, kcat and kcat/Km higher (1.23-, 1.17- and 2.88-times, respectively) than the values determined for the ASKA. The replacement of the Ala-161 in the CSR-V with an aspartic acid also caused a significant reduction in the ratio of maltose formed. This finding suggests the Ala-161 may contribute to the high maltose production of the ASKA. PMID:25069018

  3. In silico comparative analysis of DNA and amino acid sequences for prion protein gene.

    PubMed

    Kim, Y; Lee, J; Lee, C

    2008-01-01

    Genetic variability might contribute to species specificity of prion diseases in various organisms. In this study, structures of the prion protein gene (PRNP) and its amino acids were compared among species for which sequence data were available. Comparisons of PRNP DNA sequences among 12 species including human, chimpanzee, monkey, bovine, ovine, dog, mouse, rat, wallaby, opossum, chicken and zebrafish allowed us to identify candidate regulatory regions in intron 1 and the 3'-untranslated region (UTR) in addition to the coding region. Highly conserved putative binding sites for transcription factors, such as heat shock factor 2 (HSF2) and myocyte enhancer factor 2 (MEF2), were discovered in intron 1. In the 3'-UTR, the functional sequence (ATTAAA) for nucleus-specific polyadenylation was found in all the analysed species. The functional sequence (TTTTTAT) for maturation-specific polyadenylation was observed without mismatches only in ovine, and with one or two nucleotide mismatches in the other species. A comparison of the amino acid sequences in 53 species revealed a large sequence identity. In particular, the octapeptide repeat region was observed in all the species except frog and zebrafish. Functional changes and susceptibility to prion diseases with various isoforms of prion protein could be caused by numeric variability and conformational changes discovered in the repeat sequences.
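
    Searching a 3'-UTR for a polyadenylation signal while tolerating one or two mismatches can be sketched with a Hamming-distance scan. The function names and the toy sequences are assumptions for illustration; the abstract does not describe the software the authors used.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_with_mismatches(seq, motif, max_mismatches):
    """Return (position, matched_window, mismatches) for every window of
    seq that differs from motif at no more than max_mismatches positions."""
    seq, motif = seq.upper(), motif.upper()
    width = len(motif)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        distance = hamming(window, motif)
        if distance <= max_mismatches:
            hits.append((i, window, distance))
    return hits


if __name__ == "__main__":
    # Exact search for the ATTAAA signal in a toy sequence.
    print(find_with_mismatches("GGATTAAAGG", "ATTAAA", 0))
    # TTTTTAT signal allowing a single mismatch.
    print(find_with_mismatches("CCTTTTCATCC", "TTTTTAT", 1))
```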

  4. Annotation of cis-regulatory elements by identification, subclassification, and functional assessment of multispecies conserved sequences

    PubMed Central

    Hughes, Jim R.; Cheng, Jan-Fang; Ventress, Nicki; Prabhakar, Shyam; Clark, Kevin; Anguita, Eduardo; De Gobbi, Marco; de Jong, Pieter; Rubin, Eddy; Higgs, Douglas R.

    2005-01-01

    An important step toward improving the annotation of the human genome is to identify cis-acting regulatory elements from primary DNA sequence. One approach is to compare sequences from multiple, divergent species. This approach distinguishes multispecies conserved sequences (MCS) in noncoding regions from more rapidly evolving neutral DNA. Here, we have analyzed a region of ≈238kb containing the human α globin cluster that was sequenced and/or annotated across the syntenic region in 22 species spanning 500 million years of evolution. Using a variety of bioinformatic approaches and correlating the results with many aspects of chromosome structure and function in this region, we were able to identify and evaluate the importance of 24 individual MCSs. This approach sensitively and accurately identified previously characterized regulatory elements but also discovered unidentified promoters, exons, splicing, and transcriptional regulatory elements. Together, these studies demonstrate an integrated approach by which to identify, subclassify, and predict the potential importance of MCSs. PMID:15998734

  5. In Silico Structure and Sequence Analysis of Bacterial Porins and Specific Diffusion Channels for Hydrophilic Molecules: Conservation, Multimericity and Multifunctionality

    PubMed Central

    Vollan, Hilde S.; Tannæs, Tone; Vriend, Gert; Bukholm, Geir

    2016-01-01

    Diffusion channels are involved in the selective uptake of nutrients and form the largest outer membrane protein (OMP) family in Gram-negative bacteria. Differences in pore size and amino acid composition contribute to the specificity. Structure-based multiple sequence alignments shed light on the structure-function relations for all eight subclasses. Entropy-variability analysis results are correlated to known structural and functional aspects, such as structural integrity, multimericity, specificity and biological niche adaptation. The high mutation rate in their surface-exposed loops is likely an important mechanism for host immune system evasion. Multiple sequence alignments for each subclass revealed conserved residue positions that are involved in substrate recognition and specificity. An analysis of monomeric protein channels revealed particular sequence patterns of amino acids that were observed in other classes at multimeric interfaces. This adds to the emerging evidence that all members of the family exist in a multimeric state. Our findings are important for understanding the role of members of this family in a wide range of bacterial processes, including bacterial food uptake, survival and adaptation mechanisms. PMID:27110766
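
    The two per-column quantities that an entropy-variability analysis relates, Shannon entropy and the number of distinct residue types, can be computed from a multiple sequence alignment roughly as follows. This is a sketch under assumptions: gaps are simply ignored, the function names are hypothetical, and the thresholds used in the paper to classify positions are not reproduced.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of the residues in one alignment column,
    ignoring gap characters."""
    counts = Counter(c for c in column if c != "-")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def column_variability(column):
    """Number of distinct residue types in one column, ignoring gaps."""
    return len({c for c in column if c != "-"})


def entropy_variability(alignment):
    """Return a list of (entropy, variability) pairs, one per column.
    The alignment is a list of equal-length aligned sequences."""
    return [(column_entropy(col), column_variability(col))
            for col in zip(*alignment)]


if __name__ == "__main__":
    toy_alignment = ["ACD", "ACE", "AGF", "AGG"]
    for position, (h, v) in enumerate(entropy_variability(toy_alignment)):
        print(position, round(h, 3), v)
```

    Low entropy with low variability marks a strictly conserved position, whereas high values of both are typical of positions such as surface-exposed loops that tolerate many substitutions.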

  6. Identification of conserved and novel microRNAs in Aquilaria sinensis based on small RNA sequencing and transcriptome sequence data.

    PubMed

    Gao, Zhi-Hui; Wei, Jian-He; Yang, Yun; Zhang, Zheng; Xiong, Huan-Ying; Zhao, Wen-Ting

    2012-08-15

    Agarwood is in great demand for its high value in medicine, incense, and perfume across Asia, the Middle East, and Europe. As agarwood is formed only when Aquilaria trees are wounded or infected by certain microbes, overharvesting and habitat loss are threatening some populations of agarwood-producing species. Aquilaria sinensis is one such economically significant tree species. To improve production efficiency and protect the resource of A. sinensis, it is critical to reveal the regulatory mechanisms of stress-induced agarwood formation. MicroRNAs (miRNAs), key gene expression regulators involved in various plant stress responses and metabolic processes, might function in agarwood formation, but no report concerning miRNAs in Aquilaria is available. In this study, small RNA high-throughput sequencing and 454 transcriptome data were used to identify both conserved and novel miRNAs in A. sinensis. Deep sequencing showed that the small RNA (sRNA) population of A. sinensis was complex and the length of sRNAs varied. By in silico analysis of the small RNA deep sequencing data and transcriptome data, we discovered 27 novel miRNAs in A. sinensis. Based on mature miRNA sequence conservation, we identified 74 putative conserved miRNAs from A. sinensis, and 10 of them were confirmed with hairpin-forming precursors. Interestingly, a novel miRNA sequence was determined to be the miRNA* of asi-miR408, but with accumulation much higher than asi-miR408. The expression levels of ten stress-responsive miRNAs were examined during the time course after wound treatment. Eight were shown to be wound-responsive. This not only shows the existence of miRNAs in this economically significant Asian tree species but also indicates their critical role in stress-induced agarwood formation. The highly accumulated miRNA* of asi-miR408 implies that miRNA*s, as well as miRNAs, may be functional in plants.

  7. Amino acid sequences of proteins from Leptospira serovar pomona.

    PubMed

    Alves, S F; Lefebvre, R B; Probert, W

    2000-01-01

    This report describes partial amino acid sequences from three putative outer envelope proteins of Leptospira serovar pomona. To obtain internal fragments for protein sequencing, enzymatic and chemical digestions were performed. The enzyme clostripain was used to digest the 32 kDa and 45 kDa proteins. In situ digestion of the 40 kDa protein was accomplished using cyanogen bromide. The 32 kDa protein generated two fragments, one of 21 kDa and another of 10 kDa that yielded five residues. A 24 kDa fragment that yielded nineteen amino acid residues was obtained from the 45 kDa protein. A 20 kDa fragment yielding a twenty-amino-acid sequence was obtained from the 40 kDa protein.

  8. The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans

    PubMed Central

    Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

    2015-01-01

    Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. PMID:26199191

  9. Sequence-related human proteins cluster by degree of evolutionary conservation

    NASA Astrophysics Data System (ADS)

    Mrowka, Ralf; Patzak, Andreas; Herzel, Hanspeter; Holste, Dirk

    2004-11-01

    Gene duplication followed by adaptive evolution is thought to be a central mechanism for the emergence of novel genes. To illuminate the contribution of duplicated protein-coding sequences to the complexity of the human genome, we study the connectivity of pairwise sequence-related human proteins and construct a network (N) of linked protein sequences with shared similarities. We find that (i) the connectivity distribution P(k) for k sequence-related proteins decays as a power law P(k) ∼ k^{-γ} with γ ≈ 1.2, (ii) the top rank of N consists of a single large cluster of proteins (≈70%), while bottom ranks consist of multiple isolated clusters, and (iii) structural characteristics of N show both a high degree of clustering and an intermediate connectivity (“small-world” features). We gain further insight into structural properties of N by studying the relationship between the connectivity distribution and the phylogenetic conservation of proteins in bacteria, plants, invertebrates, and vertebrates. We find that (iv) the proportion of sequence-related proteins increases with increasing extent of evolutionary conservation. Our results support that small-world network properties constitute a footprint of an evolutionary mechanism and extend the traditional interpretation of protein families.
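
    A power-law exponent such as the γ reported above can be estimated from a network's connectivity distribution. The sketch below assumes the simplest approach, an ordinary least-squares fit of log P(k) against log k; the abstract does not state the fitting method, and the function names are hypothetical. Maximum-likelihood estimators are generally more reliable on real, noisy data.

```python
import math
from collections import Counter


def degree_distribution(edges):
    """Return {k: P(k)}, the fraction of nodes having exactly k links,
    from an iterable of undirected (node_a, node_b) edges."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    histogram = Counter(degree.values())
    n_nodes = len(degree)
    return {k: count / n_nodes for k, count in sorted(histogram.items())}


def fit_power_law_exponent(pk):
    """Estimate gamma in P(k) ~ k^(-gamma) by a least-squares line
    through the points (log k, log P(k)). Needs two or more distinct k."""
    xs = [math.log(k) for k in pk]
    ys = [math.log(p) for p in pk.values()]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope


if __name__ == "__main__":
    # Synthetic distribution that follows the power law exactly.
    synthetic = {k: k ** -1.2 for k in range(1, 50)}
    print(round(fit_power_law_exponent(synthetic), 3))
```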

  10. Predicting RNA-binding residues from evolutionary information and sequence conservation

    PubMed Central

    2010-01-01

    Background: RNA-binding proteins (RBPs) play crucial roles in post-transcriptional control of RNA. RBPs are designed to efficiently recognize specific RNA sequences after the RNA is derived from the DNA sequence. To satisfy diverse functional requirements, RNA-binding proteins are composed of multiple blocks of RNA-binding domains (RBDs) presented in various structural arrangements to provide versatile functions. The ability to computationally predict RNA-binding residues in an RNA-binding protein can help biologists identify important targets for site-directed mutagenesis in wet-lab experiments. Results: The proposed prediction framework, named “ProteRNA”, combines an SVM-based classifier with conserved residue discovery by WildSpan to identify the residues that interact with RNA in an RNA-binding protein. Although these conserved residues can be either functionally or structurally conserved, they provide clues to the important residues in a protein sequence. On the independent testing dataset, ProteRNA delivers an overall accuracy of 89.78%, MCC of 0.2628, F-score of 0.3075, and F0.5-score of 0.3546. Conclusions: This article presents the design of a sequence-based predictor that aims to identify the RNA-binding residues in an RNA-binding protein by combining machine learning and pattern mining approaches. RNA-binding proteins have diverse functions while interacting with different categories of RNAs because these proteins are composed of multiple copies of RNA-binding domains presented in various structural arrangements to expand the functional repertoire of RNA-binding proteins. Furthermore, predicting RNA-binding residues in an RNA-binding protein can help biologists identify important targets for site-directed mutagenesis in wet-lab experiments. PMID:21143803
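
    The evaluation measures quoted above (accuracy, MCC, F-score and F0.5-score) follow standard definitions and can be computed from a confusion matrix as sketched below. The counts in the usage example are invented for illustration and are not the paper's data; they only show how a high accuracy can coexist with a modest MCC when non-binding residues greatly outnumber binding ones.

```python
import math


def binary_metrics(tp, fp, tn, fn, beta=1.0):
    """Accuracy, precision, recall, MCC and F-beta from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)

    # Matthews correlation coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    # F-beta weights recall beta times as much as precision, so
    # beta = 0.5 emphasises precision.
    b2 = beta * beta
    if precision + recall:
        f_beta = (1 + b2) * precision * recall / (b2 * precision + recall)
    else:
        f_beta = 0.0

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "mcc": mcc, "f_beta": f_beta}


if __name__ == "__main__":
    # Hypothetical, heavily imbalanced counts (20 binding, 980 non-binding).
    print(binary_metrics(tp=10, fp=10, tn=970, fn=10))
    print(binary_metrics(tp=10, fp=10, tn=970, fn=10, beta=0.5)["f_beta"])
```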

  11. Mathematical Characterization of Protein Sequences Using Patterns as Chemical Group Combinations of Amino Acids

    PubMed Central

    Choudhury, Pabitra Pal; Jana, Siddhartha Sankar

    2016-01-01

    Comparison of amino acid sequence similarity is the fundamental concept behind the protein phylogenetic tree formation. By virtue of this method, we can explain the evolutionary relationships, but further explanations are not possible unless sequences are studied through the chemical nature of individual amino acids. Here we develop a new methodology to characterize the protein sequences on the basis of the chemical nature of the amino acids. We design various algorithms for studying the variation of chemical group transitions and various chemical group combinations as patterns in the protein sequences. The amino acid sequence of conventional myosin II head domain of 14 family members are taken to illustrate this new approach. We find two blocks of maximum length 6 aa as ‘FPKATD’ and ‘Y/FTNEKL’ without repeating the same chemical nature and one block of maximum length 20 aa with the repetition of chemical nature which are common among all 14 members. We also check commonality with another motor protein sub-family kinesin, KIF1A. Based on our analysis we find a common block of length 8 aa both in myosin II and KIF1A. This motif is located in the neck linker region which could be responsible for the generation of mechanical force, enabling us to find the unique blocks which remain chemically conserved across the family. We also validate our methodology with different protein families such as MYOI, Myosin light chain kinase (MLCK) and Rho-associated protein kinase (ROCK), Na+/K+-ATPase and Ca2+-ATPase. Altogether, our studies provide a new methodology for investigating the conserved amino acids’ pattern in different proteins. PMID:27930687

  12. Evolutionarily conserved sequences of striated muscle myosin heavy chain isoforms. Epitope mapping by cDNA expression.

    PubMed

    Miller, J B; Teal, S B; Stockdale, F E

    1989-08-05

    A cDNA expression strategy was used to localize amino acid sequences which were specific for fast, as opposed to slow, isoforms of the chicken skeletal muscle myosin heavy chain (MHC) and which were conserved in vertebrate evolution. Five monoclonal antibodies (mAbs), termed F18, F27, F30, F47, and F59, were prepared that reacted with all of the known chicken fast MHC isoforms but did not react with any of the known chicken slow nor with smooth muscle MHC isoforms. The epitopes recognized by mAbs F18, F30, F47, and F59 were on the globular head fragment of the MHC, whereas the epitope recognized by mAb F27 was on the helical tail or rod fragment. Reactivity of all five mAbs also was confined to fast MHCs in the rat, with the exception of mAb F59, which also reacted with the beta-cardiac MHC, the single slow MHC isoform common to both the rat heart and skeletal muscle. None of the five epitopes was expressed on amphioxus, nematode, or Dictyostelium MHC. The F27 and F59 epitopes were found on shark, electric ray, goldfish, newt, frog, turtle, chicken, quail, rabbit, and rat MHCs. The epitopes recognized by these mAbs were conserved, therefore, to varying degrees through vertebrate evolution and differed in sequence from homologous regions of a number of invertebrate MHCs and myosin-like proteins. The sequence of those epitopes on the head were mapped using a two-part cDNA expression strategy. First, Bal31 exonuclease digestion was used to rapidly generate fragments of a chicken embryonic fast MHC cDNA that were progressively deleted from the 3' end. These cDNA fragments were expressed as beta-galactosidase/MHC fusion proteins using the pUR290 vector; the fusion proteins were tested by immunoblotting for reactivity with the mAbs; and the approximate locations of the epitopes were determined from the sizes of the cDNA fragments that encoded a particular epitope. The epitopes were then precisely mapped by expression of overlapping cDNA fragments of known sequence that

  13. Predicting protein ligand binding sites by combining evolutionary sequence conservation and 3D structure.

    PubMed

    Capra, John A; Laskowski, Roman A; Thornton, Janet M; Singh, Mona; Funkhouser, Thomas A

    2009-12-01

    Identifying a protein's functional sites is an important step towards characterizing its molecular function. Numerous structure- and sequence-based methods have been developed for this problem. Here we introduce ConCavity, a small molecule binding site prediction algorithm that integrates evolutionary sequence conservation estimates with structure-based methods for identifying protein surface cavities. In large-scale testing on a diverse set of single- and multi-chain protein structures, we show that ConCavity substantially outperforms existing methods for identifying both 3D ligand binding pockets and individual ligand binding residues. As part of our testing, we perform one of the first direct comparisons of conservation-based and structure-based methods. We find that the two approaches provide largely complementary information, which can be combined to improve upon either approach alone. We also demonstrate that ConCavity has state-of-the-art performance in predicting catalytic sites and drug binding pockets. Overall, the algorithms and analysis presented here significantly improve our ability to identify ligand binding sites and further advance our understanding of the relationship between evolutionary sequence conservation and structural and functional attributes of proteins. Data, source code, and prediction visualizations are available on the ConCavity web site (http://compbio.cs.princeton.edu/concavity/).

  14. Detection of conserved segments in proteins: iterative scanning of sequence databases with alignment blocks.

    PubMed Central

    Tatusov, R L; Altschul, S F; Koonin, E V

    1994-01-01

    We describe an approach to analyzing protein sequence databases that, starting from a single uncharacterized sequence or group of related sequences, generates blocks of conserved segments. The procedure involves iterative database scans with an evolving position-dependent weight matrix constructed from a coevolving set of aligned conserved segments. For each iteration, the expected distribution of matrix scores under a random model is used to set a cutoff score for the inclusion of a segment in the next iteration. This cutoff may be calculated to allow the chance inclusion of either a fixed number or a fixed proportion of false positive segments. With sufficiently high cutoff scores, the procedure converged for all alignment blocks studied, with varying numbers of iterations required. Different methods for calculating weight matrices from alignment blocks were compared. The most effective of those tested was a logarithm-of-odds, Bayesian-based approach that used prior residue probabilities calculated from a mixture of Dirichlet distributions. The procedure described was used to detect novel conserved motifs of potential biological importance. PMID:7991589
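    The weight-matrix step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a uniform background and a flat pseudocount in place of the Dirichlet-mixture priors the abstract mentions, takes the cutoff as a plain argument rather than deriving it from a random model, and all function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def log_odds_matrix(block, background=None, pseudocount=1.0):
    """Build a position-specific log-odds matrix from equal-length aligned segments.

    Assumption: a flat pseudocount stands in for the Dirichlet-mixture prior.
    """
    if background is None:
        # Assumption: uniform background residue frequencies.
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    width = len(block[0])
    if any(len(segment) != width for segment in block):
        raise ValueError("all segments in a block must have the same length")
    total = len(block) + pseudocount * len(AMINO_ACIDS)
    matrix = []
    for col in range(width):
        counts = Counter(segment[col] for segment in block)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts.get(aa, 0) + pseudocount) / total
            column[aa] = math.log2(freq / background[aa])
        matrix.append(column)
    return matrix


def score_segment(matrix, segment):
    """Sum the per-position log-odds; unknown residues contribute 0."""
    return sum(column.get(aa, 0.0) for column, aa in zip(matrix, segment))


def scan(matrix, sequence, cutoff):
    """Return (start, window, score) for every window scoring at or above cutoff."""
    width = len(matrix)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = score_segment(matrix, window)
        if score >= cutoff:
            hits.append((start, window, score))
    return hits


if __name__ == "__main__":
    toy_block = ["GDSGG", "GDSGG", "GDSGA"]  # made-up segments for illustration
    pssm = log_odds_matrix(toy_block)
    for hit in scan(pssm, "MKGDSGGPL", cutoff=5.0):
        print(hit)
```

    In an iterative scheme of the kind described, the windows returned by `scan` would be added to the block and the matrix rebuilt until no new segments pass the cutoff.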

  15. A phylogenetically conserved sequence within viral 3' untranslated RNA pseudoknots regulates translation.

    PubMed Central

    Leathers, V; Tanguay, R; Kobayashi, M; Gallie, D R

    1993-01-01

    Both the 68-base 5' leader (omega) and the 205-base 3' untranslated region (UTR) of tobacco mosaic virus (TMV) promote efficient translation. A 35-base region within omega is necessary and sufficient for the regulation. Within the 3' UTR, a 52-base region, composed of two RNA pseudoknots, is required for regulation. These pseudoknots are phylogenetically conserved among seven viruses from two different viral groups and one satellite virus. The pseudoknots contained significant conservation at the secondary and tertiary levels and at several positions at the primary sequence level. Mutational analysis of the sequences determined that the primary sequence in several conserved positions, particularly within the third pseudoknot, was essential for function. The higher-order structure of the pseudoknots was also required. Both the leader and the pseudoknot region were specifically recognized by, and competed for, the same proteins in extracts made from carrot cell suspension cells and wheat germ. Binding of the proteins is much stronger to omega than to the pseudoknot region. Synergism was observed between the TMV 3' UTR and the cap and to a lesser extent between omega and the 3' UTR. The functional synergism and the protein binding data suggest that the cap, TMV 5' leader, and 3' UTR interact to establish an efficient level of translation. PMID:8355685

  16. Conservation of uORF repressiveness and sequence features in mouse, human and zebrafish

    PubMed Central

    Chew, Guo-Liang; Pauli, Andrea; Schier, Alexander F.

    2016-01-01

    Upstream open reading frames (uORFs) are ubiquitous repressive genetic elements in vertebrate mRNAs. While much is known about the regulation of individual genes by their uORFs, the range of uORF-mediated translational repression in vertebrate genomes is largely unexplored. Moreover, it is unclear whether the repressive effects of uORFs are conserved across species. To address these questions, we analyse transcript sequences and ribosome profiling data from human, mouse and zebrafish. We find that uORFs are depleted near coding sequences (CDSes) and have initiation contexts that diminish their translation. Linear modelling reveals that sequence features at both uORFs and CDSes modulate the translation of CDSes. Moreover, the ratio of translation over 5′ leaders and CDSes is conserved between human and mouse, and correlates with the number of uORFs. These observations suggest that the prevalence of vertebrate uORFs may be explained by their conserved role in repressing CDS translation. PMID:27216465

  17. A highly conserved N-terminal sequence for teleost vitellogenin with potential value to the biochemistry, molecular biology and pathology of vitellogenesis

    USGS Publications Warehouse

    Folmar, L.D.; Denslow, N.D.; Wallace, R.A.; LaFleur, G.; Gross, T.S.; Bonomelli, S.; Sullivan, C.V.

    1995-01-01

    N-terminal amino acid sequences for vitellogenin (Vtg) from six species of teleost fish (striped bass, mummichog, pinfish, brown bullhead, medaka, yellow perch) and the sturgeon are compared with published N-terminal Vtg sequences for the lamprey, clawed frog and domestic chicken. Striped bass and mummichog had 100% identical amino acids between positions 7 and 21, while pinfish, brown bullhead, sturgeon, lamprey, Xenopus and chicken were 87%, 93%, 60%, 47%, 47-60% (for four transcripts) and 40% identical, respectively, to striped bass over the same positions. Partial sequences obtained for medaka and yellow perch were 100% identical between positions 5 and 10. The potential utility of this conserved sequence for studies on the biochemistry, molecular biology and pathology of vitellogenesis is discussed.
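    As a worked illustration of the percent-identity figures in this record: positions 7 to 21 span 15 residues, so each reported percentage corresponds to a whole number of matching positions. The sketch below assumes simple position-by-position comparison of two pre-aligned sequences; the function name is hypothetical, and the match counts are inferred from the rounded percentages rather than stated in the abstract.

```python
def percent_identity(seq_a, seq_b, start, end):
    """Percent identical residues over 1-based, inclusive positions start..end.

    Assumes the two sequences are already aligned position by position.
    """
    window_a = seq_a[start - 1:end]
    window_b = seq_b[start - 1:end]
    if not window_a or len(window_a) != len(window_b):
        raise ValueError("both sequences must cover the requested positions")
    matches = sum(a == b for a, b in zip(window_a, window_b))
    return 100.0 * matches / len(window_a)


WINDOW_LENGTH = 21 - 7 + 1  # positions 7..21 inclusive = 15 residues

if __name__ == "__main__":
    # Match counts out of 15 that round to the percentages quoted in the record.
    for matches in (15, 14, 13, 9, 7, 6):
        print(matches, "of", WINDOW_LENGTH, "->", round(100 * matches / WINDOW_LENGTH), "%")
```

    Under this reading, 87%, 93%, 60%, 47% and 40% are consistent with 13, 14, 9, 7 and 6 identical positions out of 15.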

  18. Assembly of transmembrane helices of simple polytopic membrane proteins from sequence conservation patterns.

    PubMed

    Park, Yungki; Helms, Volkhard

    2006-09-01

    The transmembrane (TM) domains of most membrane proteins consist of helix bundles. The seemingly simple task of TM helix bundle assembly has turned out to be extremely difficult. This is true even for simple TM helix bundle proteins, i.e., those that have the simple form of compact TM helix bundles. Herein, we present a computational method that is capable of generating native-like structural models for simple TM helix bundle proteins having modest numbers of TM helices based on sequence conservation patterns. Thus, the only requirement for our method is the presence of more than 30 homologous sequences for an accurate extraction of sequence conservation patterns. The prediction method first computes a number of representative well-packed conformations for each pair of contacting TM helices, and then a library of tertiary folds is generated by overlaying overlapping TM helices of the representative conformations. This library is scored using sequence conservation patterns, and a subsequent clustering analysis yields five final models. Assuming that neighboring TM helices in the sequence contact each other (but not that TM helices A and G contact each other), the method produced structural models of Cα atom root-mean-square deviation (CA RMSD) of 3-5 Å from corresponding crystal structures for bacteriorhodopsin, halorhodopsin, sensory rhodopsin II, and rhodopsin. In blind predictions, this type of contact knowledge is not available. Mimicking this, predictions were made for the rotor of the V-type Na+-adenosine triphosphatase without such knowledge. The CA RMSD between the best model and its crystal structure is only 3.4 Å, and its contact accuracy reaches 55%. Furthermore, the model correctly identifies the binding pocket for sodium ion. These results demonstrate that the method can be readily applied to ab initio structure prediction of simple TM helix bundle proteins having modest numbers of TM helices.

  19. Variation in conserved non-coding sequences on chromosome 5q and susceptibility to asthma and atopy

    SciTech Connect

    Donfack, Joseph; Schneider, Daniel H.; Tan, Zheng; Kurz, Thorsten; Dubchak, Inna; Frazer, Kelly A.; Ober, Carole

    2005-09-10

    Background: Evolutionarily conserved sequences likely have biological function. Methods: To determine whether variation in conserved sequences in non-coding DNA contributes to risk for human disease, we studied six conserved non-coding elements in the Th2 cytokine cluster on human chromosome 5q31 in a large Hutterite pedigree and in samples of outbred European American and African American asthma cases and controls. Results: Among six conserved non-coding elements (>100 bp, >70 percent identity; human-mouse comparison), we identified one single nucleotide polymorphism (SNP) in each of two conserved elements and six SNPs in the flanking regions of three conserved elements. We genotyped our samples for four of these SNPs and an additional three SNPs each in the IL13 and IL4 genes. While there was only modest evidence for association with single SNPs in the Hutterite and European American samples (P<0.05), there were highly significant associations in European Americans between asthma and haplotypes comprised of SNPs in the IL4 gene (P<0.001), including a SNP in a conserved non-coding element. Furthermore, variation in the IL13 gene was strongly associated with total IgE (P = 0.00022) and allergic sensitization to mold allergens (P = 0.00076) in the Hutterites, and more modestly associated with sensitization to molds in the European Americans and African Americans (P<0.01). Conclusion: These results indicate that there is overall little variation in the conserved non-coding elements on 5q31, but variation in IL4 and IL13, including possibly one SNP in a conserved element, influences asthma and atopic phenotypes in diverse populations.

  20. Computational analysis of conserved coil functional residues in the mitochondrial genomic sequences of dermatophytes

    PubMed Central

    Gupta, Bulbul; Kaur, Jaspreet

    2016-01-01

    Dermatophytes are a group of closely related fungi that have the capacity to invade keratinized tissue of humans and other animals. The infection known as dermatophytosis, caused by members of the genera Microsporum, Trichophyton, and Epidermophyton, includes infection of the groin (tinea cruris), beard (tinea barbae), scalp (tinea capitis), feet (tinea pedis), glabrous skin (tinea corporis), nail (tinea unguium), and hand (tinea manuum). Identifying the evolutionary relationships among these three genera of dermatophytes is epidemiologically important for understanding their pathogenicity. Mitochondrial DNA evolves more rapidly than nuclear DNA because of its higher mutation rate but is much less affected by genetic recombination, making it an important tool for phylogenetic studies. Thus, here we present a novel scheme to identify the conserved coil functional residues of Trichophyton rubrum, Trichophyton mentagrophytes, Epidermophyton floccosum and Microsporum canis. Protein coding sequences of the mitochondrial genome were aligned for their similar sequences and homology modelling was performed for structure and pocket identification. The results obtained from comparative analysis of the protein sequences revealed the presence of functionally active sites in all the species of the genera Trichophyton and Microsporum. However, in Epidermophyton floccosum they were observed in three of the five protein sequences studied. The absence of these conserved coil functional residues in E. floccosum may be correlated with lesser infectivity of this organism. The functional residues identified in the present study could be responsible for the disease and thus can act as putative target sites for drug design. PMID:28149055

  1. Nucleotide sequence of the Klebsiella pneumoniae nifD gene and predicted amino acid sequence of the alpha-subunit of nitrogenase MoFe protein.

    PubMed Central

    Ioannidis, I; Buck, M

    1987-01-01

    The nucleotide sequence of the Klebsiella pneumoniae nifD gene is presented and together with the accompanying paper [Holland, Zilberstein, Zamir & Sussman (1987) Biochem. J. 247, 277-285] completes the sequence of the nifHDK genes encoding the nitrogenase polypeptides. The K. pneumoniae nifD gene encodes the 483-amino acid-residue nitrogenase alpha-subunit polypeptide of Mr 54156. The alpha-subunit has five strongly conserved cysteine residues at positions 63, 89, 155, 184 and 275, some occurring in a region showing both primary sequence and potential structural homology to the K. pneumoniae nitrogenase beta-subunit. A comparison with six other alpha-subunit amino acid sequences has been made, which indicates a number of potentially important domains within alpha-subunits. PMID:3322262

  2. Using a color-coded ambigraphic nucleic acid notation to visualize conserved palindromic motifs within and across genomes

    PubMed Central

    2014-01-01

    Background: Ambiscript is a graphically-designed nucleic acid notation that uses symbol symmetries to support sequence complementation, highlight biologically-relevant palindromes, and facilitate the analysis of consensus sequences. Although the original Ambiscript notation was designed to easily represent consensus sequences for multiple sequence alignments, the notation’s black-on-white ambiguity characters are unable to reflect the statistical distribution of nucleotides found at each position. We now propose a color-augmented ambigraphic notation to encode the frequency of positional polymorphisms in these consensus sequences. Results: We have implemented this color-coding approach by creating an Adobe Flash® application (http://www.ambiscript.org) that shades and colors modified Ambiscript characters according to the prevalence of the encoded nucleotide at each position in the alignment. The resulting graphic helps viewers perceive biologically-relevant patterns in multiple sequence alignments by uniquely combining color, shading, and character symmetries to highlight palindromes and inverted repeats in conserved DNA motifs. Conclusion: Juxtaposing an intuitive color scheme over the deliberate character symmetries of an ambigraphic nucleic acid notation yields a highly-functional nucleic acid notation that maximizes information content and successfully embodies key principles of graphic excellence put forth by the statistician and graphic design theorist, Edward Tufte. PMID:24447494

  3. Sequence-conserved and antibody-accessible sites in the V1V2 domain of HIV-1 gp120 envelope protein.

    PubMed

    Shmelkov, Evgeny; Grigoryan, Arsen; Krachmarov, Chavdar; Abagyan, Ruben; Cardozo, Timothy

    2014-09-01

    The immune-correlates analysis of the RV144 trial suggested that epitopes targeted by protective antibodies (Abs) reside in the V1V2 domain of gp120. We mapped V1V2 positional sequence variation onto the conserved V1V2 structural fold and showed that while most of the solvent-accessible V1V2 amino acids vary between strains, there are two accessible molecular surface regions that are conserved and also naturally antigenic. These sites may contain epitopes targeted by broadly cross-reactive anti-V1V2 antibodies.

  4. GC Content Heterogeneity Transition of Conserved Noncoding Sequences Occurred at the Emergence of Vertebrates

    PubMed Central

    Hettiarachchi, Nilmini; Saitou, Naruya

    2016-01-01

    Conserved non-coding sequences (CNSs) of Eukaryotes are known to be significantly enriched in regulatory sequences. CNSs of diverse lineages follow different patterns in abundance, sequence composition, and location. Here, we report a thorough analysis of CNSs in diverse groups of Eukaryotes with respect to GC content heterogeneity. We examined 24 fungi, 19 invertebrates, and 12 non-mammalian vertebrates so as to find lineage specific features of CNSs. We found that fungi and invertebrate CNSs are predominantly GC rich as in plants we previously observed, whereas vertebrate CNSs are GC poor. This result suggests that the CNS GC content transition occurred from the ancestral GC rich state of Eukaryotes to GC poor in the vertebrate lineage due to the enrollment of GC poor transcription factor binding sites that are lineage specific. CNS GC content is closely linked with the nucleosome occupancy that determines the location and structural architecture of DNAs. PMID:28040773

  5. Massive microRNA sequence conservation and prevalence in human and chimpanzee introns.

    PubMed

    Hill, Aubrey E; Sorscher, Eric J

    2013-06-01

    Human and chimpanzee introns contain numerous sequences strongly related to known microRNA hairpin structures. The relative frequency is precisely maintained across all chromosomes, suggesting the possible co-evolution of gene networks dependent upon microRNA regulation and with origins corresponding to the advent of primate transposable elements (TEs). While the motifs are known to be derived from transposable elements, the most common are far more numerous than expected from the number of TEs and their paralogous sequences, and exhibit striking conservation in comparison to the surrounding TE sequence context. Several of these motifs also exhibit structural complementarity to each other, suggesting a pairing function at the level of DNA or RNA. These "pseudomicroRNAs," in semblance to pseudogenes, include hundreds of thousands of vestigial paralogs of primate microRNAs, many of which may have functioned historically or remain active today.

  6. Amino acid sequence of myoglobin from emu (Dromaius novaehollandiae) skeletal muscle.

    PubMed

    Suman, S P; Joseph, P; Li, S; Beach, C M; Fontaine, M; Steinke, L

    2010-11-01

    The objective of the present study was to characterize the primary structure of emu myoglobin (Mb). Emu Mb was isolated from Iliofibularis muscle employing gel-filtration chromatography. Matrix Assisted Laser Desorption Ionization-Time of Flight Mass Spectrometry was employed to determine the exact molecular mass of emu Mb in comparison with horse Mb, and Edman degradation was utilized to characterize the amino acid sequence. The molecular mass of emu Mb was 17,380 Da and was close to those reported for ratite and poultry myoglobins. Similar to myoglobins from meat-producing livestock and birds, emu Mb has 153 amino acids. Emu Mb contains 9 histidines. Proximal and distal histidines, responsible for coordinating the oxygen-binding property of Mb, are conserved in emu. Emu Mb shared more than 90% homology with ratite and chicken myoglobins, whereas it showed less than 70% sequence similarity with ruminant myoglobins.

  7. Highly conserved D-loop-like nuclear mitochondrial sequences (Numts) in tiger (Panthera tigris).

    PubMed

    Zhang, Wenping; Zhang, Zhihe; Shen, Fujun; Hou, Rong; Lv, Xiaoping; Yue, Bisong

    2006-08-01

    Using oligonucleotide primers designed to match hypervariable segments I (HVS-1) of Panthera tigris mitochondrial DNA (mtDNA), we amplified two different PCR products (500 bp and 287 bp) in the tiger (Panthera tigris), but got only one PCR product (287 bp) in the leopard (Panthera pardus). Sequence analyses indicated that the sequence of 287 bp was a D-loop-like nuclear mitochondrial sequence (Numts), indicating a nuclear transfer that occurred approximately 4.8-17 million years ago in the tiger and 4.6-16 million years ago in the leopard. Although the mtDNA D-loop sequence has a rapid rate of evolution, the 287-bp Numts are highly conserved; they are nearly identical in tiger subspecies and only 1.742% different between tiger and leopard. Thus, such sequences represent molecular 'fossils' that can shed light on evolution of the mitochondrial genome and may be the most appropriate outgroup for phylogenetic analysis. This is also proved by comparing the phylogenetic trees reconstructed using the D-loop sequence of snow leopard and the 287-bp Numts as outgroup.

  8. Sample sequencing of vascular plants demonstrates widespread conservation and divergence of microRNAs.

    PubMed

    Chávez Montes, Ricardo A; de Fátima Rosas-Cárdenas, Flor; De Paoli, Emanuele; Accerbi, Monica; Rymarquis, Linda A; Mahalingam, Gayathri; Marsch-Martínez, Nayelli; Meyers, Blake C; Green, Pamela J; de Folter, Stefan

    2014-04-23

    Small RNAs are pivotal regulators of gene expression that guide transcriptional and post-transcriptional silencing mechanisms in eukaryotes, including plants. Here we report a comprehensive atlas of sRNA and miRNA from 3 species of algae and 31 representative species across vascular plants, including non-model plants. We sequence and quantify sRNAs from 99 different tissues or treatments across species, resulting in a data set of over 132 million distinct sequences. Using miRBase mature sequences as a reference, we identify the miRNA sequences present in these libraries. We apply diverse profiling methods to examine critical sRNA and miRNA features, such as size distribution, tissue-specific regulation and sequence conservation between species, as well as to predict putative new miRNA sequences. We also develop database resources, computational analysis tools and a dedicated website, http://smallrna.udel.edu/. This study provides new insights on plant sRNAs and miRNAs, and a foundation for future studies.

  9. Reptiles and mammals have differentially retained long conserved noncoding sequences from the amniote ancestor.

    PubMed

    Janes, D E; Chapus, C; Gondo, Y; Clayton, D F; Sinha, S; Blatti, C A; Organ, C L; Fujita, M K; Balakrishnan, C N; Edwards, S V

    2011-01-01

    Many noncoding regions of genomes appear to be essential to genome function. Conservation of large numbers of noncoding sequences has been reported repeatedly among mammals but not thus far among birds and reptiles. By searching genomes of chicken (Gallus gallus), zebra finch (Taeniopygia guttata), and green anole (Anolis carolinensis), we quantified the conservation among birds and reptiles and across amniotes of long, conserved noncoding sequences (LCNS), which we define as sequences ≥500 bp in length and exhibiting ≥95% similarity between species. We found 4,294 LCNS shared between chicken and zebra finch and 574 LCNS shared by the two birds and Anolis. The percent of genomes comprised by LCNS in the two birds (0.0024%) is notably higher than the percent in mammals (<0.0003% to <0.001%), differences that we show may be explained in part by differences in genome-wide substitution rates. We reconstruct a large number of LCNS for the amniote ancestor (ca. 8,630) and hypothesize differential loss and substantial turnover of these sites in descendent lineages. By contrast, we estimated a small role for recruitment of LCNS via acquisition of novel functions over time. Across amniotes, LCNS are significantly enriched with transcription factor binding sites for many developmental genes, and 2.9% of LCNS shared between the two birds show evidence of expression in brain expressed sequence tag databases. These results show that the rate of retention of LCNS from the amniote ancestor differs between mammals and Reptilia (including birds) and that this may reflect differing roles and constraints in gene regulation.

  10. Reptiles and Mammals Have Differentially Retained Long Conserved Noncoding Sequences from the Amniote Ancestor

    PubMed Central

    Janes, D.E.; Chapus, C.; Gondo, Y.; Clayton, D.F.; Sinha, S.; Blatti, C.A.; Organ, C.L.; Fujita, M.K.; Balakrishnan, C.N.; Edwards, S.V.

    2011-01-01

    Many noncoding regions of genomes appear to be essential to genome function. Conservation of large numbers of noncoding sequences has been reported repeatedly among mammals but not thus far among birds and reptiles. By searching genomes of chicken (Gallus gallus), zebra finch (Taeniopygia guttata), and green anole (Anolis carolinensis), we quantified the conservation among birds and reptiles and across amniotes of long, conserved noncoding sequences (LCNS), which we define as sequences ≥500 bp in length and exhibiting ≥95% similarity between species. We found 4,294 LCNS shared between chicken and zebra finch and 574 LCNS shared by the two birds and Anolis. The percent of genomes comprised by LCNS in the two birds (0.0024%) is notably higher than the percent in mammals (<0.0003% to <0.001%), differences that we show may be explained in part by differences in genome-wide substitution rates. We reconstruct a large number of LCNS for the amniote ancestor (ca. 8,630) and hypothesize differential loss and substantial turnover of these sites in descendent lineages. By contrast, we estimated a small role for recruitment of LCNS via acquisition of novel functions over time. Across amniotes, LCNS are significantly enriched with transcription factor binding sites for many developmental genes, and 2.9% of LCNS shared between the two birds show evidence of expression in brain expressed sequence tag databases. These results show that the rate of retention of LCNS from the amniote ancestor differs between mammals and Reptilia (including birds) and that this may reflect differing roles and constraints in gene regulation. PMID:21183607

  11. Active site amino acid sequence of human factor D.

    PubMed

    Davis, A E

    1980-08-01

    Factor D was isolated from human plasma by chromatography on CM-Sephadex C50, Sephadex G-75, and hydroxylapatite. Digestion of reduced, S-carboxymethylated factor D with cyanogen bromide resulted in three peptides which were isolated by chromatography on Sephadex G-75 (superfine) equilibrated in 20% formic acid. NH2-Terminal sequences were determined by automated Edman degradation with a Beckman 890C sequencer using a 0.1 M Quadrol program. The smallest peptide (CNBr III) consisted of the NH2-terminal 14 amino acids. The other two peptides had molecular weights of 17,000 (CNBr I) and 7000 (CNBr II). Overlap of the NH2-terminal sequence of factor D with the NH2-terminal sequence of CNBr I established the order of the peptides. The NH2-terminal 53 residues of factor D are somewhat more homologous with the group-specific protease of rat intestine than with other serine proteases. The NH2-terminal sequence of CNBr II revealed the active site serine of factor D. The typical serine protease active site sequence (Gly-Asp-Ser-Gly-Gly-Pro) was found at residues 12-17. The region surrounding the active site serine does not appear to be more highly homologous with any one of the other serine proteases. The structural data obtained point out the similarities between factor D and the other proteases. However, complete definition of the degree of relationship between factor D and other proteases will require determination of the remainder of the primary structure.

  12. The amino acid sequence of iguana (Iguana iguana) pancreatic ribonuclease.

    PubMed

    Zhao, W; Beintema, J J; Hofsteenge, J

    1994-01-15

    The pyrimidine-specific ribonuclease superfamily constitutes a group of homologous proteins so far found only in higher vertebrates. Four separate families are found in mammals, which have resulted from gene duplications in mammalian ancestors. To learn more about the evolutionary history of this superfamily, the primary structure and other characteristics of the pancreatic enzyme from iguana (Iguana iguana), a herbivorous lizard species belonging to the reptiles, have been determined. The polypeptide chain consists of 119 amino acid residues. The positions of insertions and deletions in the sequence are identical to those in the enzyme from snapping turtle. However, the two enzymes differ at 54% of the amino acid positions. Iguana ribonuclease contains no carbohydrate, although the enzyme possesses three recognition sites for carbohydrate attachment, and has a high number of acidic residues in a localized part of the sequence.

  13. Sequence conservation, HLA-E-Restricted peptide, and best-defined CTL/CD8+ epitopes in gag P24 (capsid) of HIV-1 subtype B

    NASA Astrophysics Data System (ADS)

    Prasetyo, Afiono Agung; Dharmawan, Ruben; Sari, Yulia; Sariyatun, Ratna

    2017-02-01

    Human immunodeficiency virus type 1 (HIV-1) remains a global health problem. Continued study of HIV-1 genetic and immunological profiles is important for finding strategies against the virus. This study aimed to analyse sequence conservation, the HLA-E-restricted peptide, and best-defined CTL/CD8+ epitopes in p24 (capsid) of HIV-1 subtype B worldwide. The p24-coding sequences from 3,557 HIV subtype B isolates were aligned using MUSCLE and analysed. Some highly conserved regions (sequence conservation ≥95%) were observed. Two relatively long stretches with 100% conservation were observed at bases 349-356 and 550-557 of p24 (HXB2 numbering). The consensus from all aligned isolates was precisely the same as consensus B in the Los Alamos HIV Database. The HLA-E-restricted peptide in amino acids (aa) 14-22 of HIV-1 p24 (AISPRTLNA) was found in 55.9% (1,987/3,557) of HIV-1 subtype B isolates worldwide. Forty-four best-defined CTL/CD8+ epitopes were observed, of which the VKNWMTETL epitope (aa 181-189 of p24) restricted by B*4801 was the most frequent, found in 94.9% of isolates. The results contribute information about HIV-1 subtype B and may benefit further work aimed at developing diagnostic and therapeutic strategies against the virus.
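    The per-position conservation measure used in this record can be sketched as follows. This is an illustrative reading, not the authors' pipeline: it assumes conservation at a column is the fraction of aligned sequences carrying the most common character, treats gaps as ordinary characters, and uses hypothetical function names and a made-up toy alignment.

```python
from collections import Counter


def column_conservation(aligned):
    """Fraction of sequences sharing the most common character at each column.

    Assumes all sequences come from one alignment and have equal length.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned)
        scores.append(counts.most_common(1)[0][1] / len(aligned))
    return scores


def conserved_runs(scores, threshold=0.95, min_length=1):
    """Return 1-based inclusive (start, end) runs where every score >= threshold."""
    runs = []
    run_start = None
    for position, score in enumerate(scores, start=1):
        if score >= threshold:
            if run_start is None:
                run_start = position
        else:
            if run_start is not None and position - run_start >= min_length:
                runs.append((run_start, position - 1))
            run_start = None
    if run_start is not None and len(scores) - run_start + 1 >= min_length:
        runs.append((run_start, len(scores)))
    return runs


if __name__ == "__main__":
    toy_alignment = ["ACGTAC", "ACGTTC", "ACGAAC", "ACGTAC"]  # made-up data
    scores = column_conservation(toy_alignment)
    print(scores)
    print(conserved_runs(scores, threshold=0.95))
```

    Setting `threshold=1.0` with a larger `min_length` would pick out fully conserved stretches of the kind reported for bases 349-356 and 550-557.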

  14. cDNA cloning and sequencing of human fibrillarin, a conserved nucleolar protein recognized by autoimmune antisera

    SciTech Connect

    Aris, J.P.; Blobel, G.

    1991-02-01

    The authors have isolated a 1.1-kilobase cDNA clone that encodes human fibrillarin by screening a hepatoma library in parallel with DNA probes derived from the fibrillarin genes of Saccharomyces cerevisiae (NOP1) and Xenopus laevis. RNA blot analysis indicates that the corresponding mRNA is approximately 1,300 nucleotides in length. Human fibrillarin expressed in vitro migrates on SDS gels as a 36-kDa protein that is specifically immunoprecipitated by antisera from humans with scleroderma autoimmune disease. Human fibrillarin contains an amino-terminal repetitive domain approximately 75-80 amino acids in length that is rich in glycine and arginine residues and is similar to amino-terminal domains in the yeast and Xenopus fibrillarins. The occurrence of a putative RNA-binding domain and an RNP consensus sequence within the protein is consistent with the association of fibrillarin with small nucleolar RNAs. Protein sequence alignments show that 67% of amino acids from human fibrillarin are identical to those in yeast fibrillarin and that 81% are identical to those in Xenopus fibrillarin. This identity suggests the evolutionary conservation of an important function early in the pathway for ribosome biosynthesis.

  15. Alignment of 700 globin sequences: extent of amino acid substitution and its correlation with variation in volume.

    PubMed Central

    Kapp, O. H.; Moens, L.; Vanfleteren, J.; Trotman, C. N.; Suzuki, T.; Vinogradov, S. N.

    1995-01-01

    Seven hundred globin sequences, including 146 nonvertebrate sequences, were aligned on the basis of conservation of secondary structure and the avoidance of gap penalties. Of the 182 positions needed to accommodate all the globin sequences, only 84 are common to all, including the absolutely conserved PheCD1 and HisF8. The mean number of amino acid substitutions per position ranges from 8 to 13 for all globins and 5 to 9 for internal positions. Although the total sequence volumes have a variation of approximately 2-3%, the variation in volume per position ranges from approximately 13% for the internal to approximately 21% for the surface positions. Plausible correlations exist between amino acid substitution and the variation in volume per position for the 84 common and the internal but not the surface positions. The amino acid substitution matrix derived from the 84 common positions was used to evaluate sequence similarity within the globins and between the globins and phycocyanins C and colicins A, via calculation of pairwise similarity scores. The scores for globin-globin comparisons over the 84 common positions overlap the globin-phycocyanin and globin-colicin scores, with the former being intermediate. For the subset of internal positions, overlap is minimal between the three groups of scores. These results imply a continuum of amino acid sequences able to assume the common three-on-three alpha-helical structure and suggest that the determinants of the latter include sites other than those inaccessible to solvent. PMID:8535255

  16. Identification of evolutionarily conserved non-AUG-initiated N-terminal extensions in human coding sequences

    PubMed Central

    Ivanov, Ivaylo P.; Firth, Andrew E.; Michel, Audrey M.; Atkins, John F.; Baranov, Pavel V.

    2011-01-01

    In eukaryotes, it is generally assumed that translation initiation occurs at the AUG codon closest to the messenger RNA 5′ cap. However, in certain cases, initiation can occur at codons differing from AUG by a single nucleotide, especially the codons CUG, UUG, GUG, ACG, AUA and AUU. While non-AUG initiation has been experimentally verified for a handful of human genes, the full extent to which this phenomenon is utilized—both for increased coding capacity and potentially also for novel regulatory mechanisms—remains unclear. To address this issue, and hence to improve the quality of existing coding sequence annotations, we developed a methodology based on phylogenetic analysis of predicted 5′ untranslated regions from orthologous genes. We use evolutionary signatures of protein-coding sequences as an indicator of translation initiation upstream of annotated coding sequences. Our search identified novel conserved potential non-AUG-initiated N-terminal extensions in 42 human genes including VANGL2, FGFR1, KCNN4, TRPV6, HDGF, CITED2, EIF4G3 and NTF3, and also affirmed the conservation of known non-AUG-initiated extensions in 17 other genes. In several instances, we have been able to obtain independent experimental evidence of the expression of non-AUG-initiated products from the previously published literature and ribosome profiling data. PMID:21266472
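
    The abstract above refers to codons that differ from AUG by a single nucleotide. The following minimal sketch enumerates all such codons and checks that the six named in the abstract are among them. The function and variable names are illustrative assumptions; this is not the authors' phylogenetic method, only the combinatorics behind the phrase "differing from AUG by a single nucleotide".

```python
BASES = "ACGU"  # RNA alphabet


def near_cognate_codons(start: str = "AUG") -> list[str]:
    """Return every codon that differs from `start` at exactly one position."""
    variants = []
    for pos in range(len(start)):
        for base in BASES:
            if base != start[pos]:
                # Substitute one base, keep the other two unchanged.
                variants.append(start[:pos] + base + start[pos + 1:])
    return variants


codons = near_cognate_codons()

# The six codons singled out in the abstract.
highlighted = {"CUG", "UUG", "GUG", "ACG", "AUA", "AUU"}

# Single-substitution codons that the abstract does not list.
not_listed = sorted(set(codons) - highlighted)
```

    Three positions times three alternative bases gives nine candidates; the three that the abstract does not list are AAG, AGG and AUC.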

  17. Inference of transcriptional networks in Arabidopsis through conserved noncoding sequence analysis.

    PubMed

    Van de Velde, Jan; Heyndrickx, Ken S; Vandepoele, Klaas

    2014-07-01

    Transcriptional regulation plays an important role in establishing gene expression profiles during development or in response to (a)biotic stimuli. Transcription factor binding sites (TFBSs) are the functional elements that determine transcriptional activity, and the identification of individual TFBS in genome sequences is a major goal to inferring regulatory networks. We have developed a phylogenetic footprinting approach for the identification of conserved noncoding sequences (CNSs) across 12 dicot plants. Whereas both alignment and non-alignment-based techniques were applied to identify functional motifs in a multispecies context, our method accounts for incomplete motif conservation as well as high sequence divergence between related species. We identified 69,361 footprints associated with 17,895 genes. Through the integration of known TFBS obtained from the literature and experimental studies, we used the CNSs to compile a gene regulatory network in Arabidopsis thaliana containing 40,758 interactions, of which two-thirds act through binding events located in DNase I hypersensitive sites. This network shows significant enrichment toward in vivo targets of known regulators, and its overall quality was confirmed using five different biological validation metrics. Finally, through the integration of detailed expression and function information, we demonstrate how static CNSs can be converted into condition-dependent regulatory networks, offering opportunities for regulatory gene annotation.

  18. Lack of evidence of conserved lentiviral sequences in pigs with post weaning multisystemic wasting syndrome.

    PubMed Central

    Bratanich, A; Lairmore, M; Heneine, W; Konoby, C; Harding, J; West, K; Vasquez, G; Allan, G; Ellis, J

    1999-01-01

    In order to investigate the role of retroviruses in the recently described porcine postweaning multisystemic wasting syndrome (PMWS), serum and leukocytes were screened for reverse transcriptase (RT) activity, and tissues were examined for the presence of conserved lentiviral sequences using degenerate primers in a polymerase chain reaction (PCR). Serum and stimulated leukocytes from the blood and lymph nodes from pigs with PMWS, as well as from control pigs, had RT activity that was detected by the sensitive Amp-RT assay. A 257-bp fragment was amplified from DNA from the blood and bone marrow of pigs with PMWS. This fragment was identical in size to conserved lentiviral sequences that were amplified from plasmids containing DNA from several lentiviruses. Cloning and sequencing of the fragment from affected pigs, however, did not reveal homology with the recognized lentiviruses. Together the results of these analyses suggest that the RT activity present in tissues from control and affected pigs is the result of endogenous retrovirus expression, and that a lentivirus is not a primary pathogen in PMWS. PMID:10480463

  19. Amino acid sequence of bovine heart coupling factor 6.

    PubMed Central

    Fang, J K; Jacobs, J W; Kanner, B I; Racker, E; Bradshaw, R A

    1984-01-01

    The amino acid sequence of bovine heart mitochondrial coupling factor 6 (F6) has been determined by automated Edman degradation of the whole protein and derived peptides. Preparations based on heat precipitation and ethanol extraction showed allotypic variation at three positions while material further purified by HPLC yielded only one sequence that also differed by a Phe-Thr replacement at residue 62. The mature protein contains 76 amino acids with a calculated molecular weight of 9006 and a pI of approximately 5, in good agreement with experimentally measured values. The charged amino acids are mainly clustered at the termini and in one section in the middle; these three polar segments are separated by two segments relatively rich in nonpolar residues. Chou-Fasman analysis suggests three stretches of alpha-helix coinciding with (or within) the high-charge-density sequences, with a single beta-turn at the first polar-nonpolar junction. Comparison of the F6 sequence with those of other proteins did not reveal any homologous structures. PMID:6149548

  20. Amino acid sequence and comparative antigenicity of chicken metallothionein.

    PubMed Central

    McCormick, C C; Fullmer, C S; Garvey, J S

    1988-01-01

    The complete amino acid sequence of metallothionein (MT) from chicken liver is reported. The primary structure was determined by automated sequence analysis of peptides produced by limited acid hydrolysis and by trypsin digestion. The comparative antigenicity of chicken MT was determined by radioimmunoassay using rabbit anti-rat MT polyclonal antibody. Chicken MT consists of 63 amino acids as compared to 61 found in MTs from mammals. One insertion (and two substitutions) occurs in the amino-terminal region, a region considered invariant among mammalian MTs. Eighteen of the 20 cysteines in chicken MT were aligned with cysteines from other mammalian sequences. Two cysteines near the carboxyl terminus are shifted by one residue due to the insertion of proline in that region. Overall, the chicken protein showed approximately 68% sequence identity in a comparison with various mammalian MTs. The affinity of the polyclonal antibody for chicken MT was decreased by 2 orders of magnitude in comparison to that of a mammalian MT (rat MT isoforms). This reduced affinity is attributed to major substitutions in chicken MT in the regions of the principal determinants of mammalian MTs. Theoretical analysis of the primary structure predicted the secondary structure to consist of reverse turns and random coils with no stable beta or helix conformations. There is no evidence that chicken MT differs functionally from mammalian MTs. PMID:2448773

  1. A Collection of Conserved Noncoding Sequences to Study Gene Regulation in Flowering Plants

    PubMed Central

    2016-01-01

    Transcription factors (TFs) regulate gene expression by binding cis-regulatory elements, of which the identification remains an ongoing challenge owing to the prevalence of large numbers of nonfunctional TF binding sites. Powerful comparative genomics methods, such as phylogenetic footprinting, can be used for the detection of conserved noncoding sequences (CNSs), which are functionally constrained and can greatly help in reducing the number of false-positive elements. In this study, we applied a phylogenetic footprinting approach for the identification of CNSs in 10 dicot plants, yielding 1,032,291 CNSs associated with 243,187 genes. To annotate CNSs with TF binding sites, we made use of binding site information for 642 TFs originating from 35 TF families in Arabidopsis (Arabidopsis thaliana). In three species, the identified CNSs were evaluated using TF chromatin immunoprecipitation sequencing data, resulting in significant overlap for the majority of data sets. To identify ultraconserved CNSs, we included genomes of additional plant families and identified 715 binding sites for 501 genes conserved in dicots, monocots, mosses, and green algae. Additionally, we found that genes that are part of conserved mini-regulons have a higher coherence in their expression profile than other divergent gene pairs. All identified CNSs were integrated in the PLAZA 3.0 Dicots comparative genomics platform (http://bioinformatics.psb.ugent.be/plaza/versions/plaza_v3_dicots/) together with new functionalities facilitating the exploration of conserved cis-regulatory elements and their associated genes. The availability of this data set in a user-friendly platform enables the exploration of functional noncoding DNA to study gene regulation in a variety of plant species, including crops. PMID:27261064

  2. Structure-function studies of nerve growth factor: functional importance of highly conserved amino acid residues.

    PubMed Central

    Ibáñez, C F; Hallböök, F; Ebendal, T; Persson, H

    1990-01-01

    Selected amino acid residues in chicken nerve growth factor (NGF) were replaced by site-directed mutagenesis. Mutated NGF sequences were transiently expressed in COS cells and the yield of NGF protein in conditioned medium was quantified by Western blotting. Binding of each mutant to NGF receptors on PC12 cells was evaluated in a competition assay. The biological activity was determined by measuring stimulation of neurite outgrowth from chick sympathetic ganglia. The residues homologous to the proposed receptor binding site of insulin (Ser18, Met19, Val21, Asp23) were substituted by Ala. Replacement of Ser18, Met19 and Asp23 did not affect NGF activity. Modification of Val21 notably reduced both receptor binding and biological activity, suggesting that this residue is important to retain a fully active NGF. The highly conserved Tyr51 and Arg99 were converted into Phe and Lys respectively, without changing the biological properties of the molecule. However, binding and biological activity were greatly impaired after the simultaneous replacement of both Arg99 and Arg102 by Gly. The three conserved Trp residues at positions 20, 75 and 98 were substituted by Phe. The Trp mutated proteins retained 15-60% of receptor binding and 40-80% of biological activity, indicating that the Trp residues are not essential for NGF activity. However, replacement of Trp20 significantly reduced the amount of NGF in the medium, suggesting that this residue may be important for protein stability. PMID:2328722

  3. Sequence conservation in the Ancylostoma secreted protein-2 of Necator americanus (Na-ASP-2) from hookworm infected individuals in Thailand.

    PubMed

    Ungcharoensuk, Charoenchai; Putaporntip, Chaturong; Pattanawong, Urassaya; Jongwutiwes, Somchai

    2012-12-01

    The Ancylostoma secreted protein-2 of Necator americanus (Na-ASP-2) was one of the promising vaccine candidates against the most prevalent human hookworm species as adverse vaccine reaction has compromised further human vaccine trials. To elucidate the gene structure and the extent of sequence diversity, we determined the complete nucleotide sequence of the Na-asp-2 gene of individual larvae from 32 infected subjects living in 3 different endemic areas of Thailand. Sequence analysis revealed that the gene encoding Na-ASP-2 comprised 8 exons. Of 3 nucleotide substitutions in these exons, only one causes an amino acid change from leucine to methionine. A consensus conserved GT and AG at the 5' and the 3' boundaries of each intron was observed akin to those found in other eukaryotic genes. Introns of Na-asp-2 contained 23 nucleotide substitutions and 0-18 indels. The mean number of nucleotide substitutions per site (d) in introns was not significantly different from the mean number of synonymous substitutions per synonymous site (d(S)) in exons, whereas d in introns significantly exceeded d(N) (the mean number of nonsynonymous substitutions per nonsynonymous site) in exons (p<0.05), suggesting that introns and synonymous sites in exons may evolve at a similar rate whereas functional constraints at the amino acid level could limit amino acid substitutions in Na-ASP-2. A recombination site was identified in an intron near the 3' portion of the gene. The positions of introns and the intron phases in the Na-asp-2 gene compared with those in other pathogenesis-related-1 proteins of Loa loa, Onchocerca volvulus, Heterodera glycines, Caenorhabditis elegans and human were relatively conserved, suggesting evolutionary conservation of these genes. Sequence conservation in Na-ASP-2 may not compromise further vaccine design if adverse vaccine effects could be resolved whereas microheterogeneity in introns of this locus may be useful for population genetics analysis of N. americanus.

  4. Sequences Of Amino Acids For Human Serum Albumin

    NASA Technical Reports Server (NTRS)

    Carter, Daniel C.

    1992-01-01

    Sequences of amino acids defined for use in making polypeptides one-third to one-sixth as large as parent human serum albumin molecule. Smaller, chemically stable peptides have diverse applications including service as artificial human serum and as active components of biosensors and chromatographic matrices. In applications involving production of artificial sera from new sequences, little or no concern about viral contaminants. Smaller genetically engineered polypeptides more easily expressed and produced in large quantities, making commercial isolation and production more feasible and profitable.

  5. A highly conserved repeated chromosomal sequence in the radioresistant bacterium Deinococcus radiodurans SARK

    SciTech Connect

    Lennon, E.; Gutman, P.D.; Hanlong Yao; Minton, K.W.

    1991-03-01

    A DNA fragment containing a portion of a DNA damage-inducible gene from Deinococcus radiodurans SARK hybridized to numerous fragments of SARK genomic DNA because of a highly conserved repetitive chromosomal element. The element is of variable length, ranging from 150 to 192 bp, depending on the absence or presence of one or two 21-bp sequences located internally. A putative translational start site of the damage-inducible gene is within the reiterated element. The element contains dyad symmetries that suggest modes of transcriptional and/or translational control.

  6. Conserved amino acid motifs from the novel Piv/MooV family of transposases and site-specific recombinases are required for catalysis of DNA inversion by Piv.

    PubMed

    Tobiason, D M; Buchner, J M; Thiel, W H; Gernert, K M; Karls, A C

    2001-02-01

    Piv, a site-specific invertase from Moraxella lacunata, exhibits amino acid homology with the transposases of the IS110/IS492 family of insertion elements. The functions of conserved amino acid motifs that define this novel family of both transposases and site-specific recombinases (Piv/MooV family) were examined by mutagenesis of fully conserved amino acids within each motif in Piv. All Piv mutants altered in conserved residues were defective for in vivo inversion of the M. lacunata invertible DNA segment, but competent for in vivo binding to Piv DNA recognition sequences. Although the primary amino acid sequences of the Piv/MooV recombinases do not contain a conserved DDE motif, which defines the retroviral integrase/transposase (IN/Tnps) family, the predicted secondary structural elements of Piv align well with those of the IN/Tnps for which crystal structures have been determined. Molecular modelling of Piv based on these alignments predicts that E59, conserved as either E or D in the Piv/MooV family, forms a catalytic pocket with the conserved D9 and D101 residues. Analysis of Piv E59G confirms a role for E59 in catalysis of inversion. These results suggest that Piv and the related IS110/IS492 transposases mediate DNA recombination by a common mechanism involving a catalytic DED or DDD motif.

  7. Sequences of conserved region in the A subunit of DNA gyrase from nine species of the genus Mycobacterium: phylogenetic analysis and implication for intrinsic susceptibility to quinolones.

    PubMed

    Guillemin, I; Cambau, E; Jarlier, V

    1995-09-01

    The sequences of a conserved region in the A subunit of DNA gyrase corresponding to the quinolone resistance-determining region were determined for nine mycobacterial species and were compared. Although the nucleotide sequences were highly conserved, they clearly differentiated one species from another. The results of the phylogenetic analysis based on the sequences of the quinolone resistance-determining regions were compared with those provided by the 16S rRNA sequences. Deduced amino acid sequences were identical within the nine species except for amino acid 83, which was frequently involved in acquired resistance to quinolones in many genera, including mycobacteria. The presence at position 83 of an alanine for seven mycobacterial species (M. tuberculosis, M. bovis BCG, M. leprae, M. avium, M. kansasii, M. chelonae, and M. smegmatis) and of a serine for the two remaining mycobacterial species (M. fortuitum and M. aurum) correlated well with the MICs of ofloxacin for both groups of species, suggesting the role of this residue in intrinsic susceptibility to quinolones in mycobacteria.

  8. Conservation.

    ERIC Educational Resources Information Center

    National Audubon Society, New York, NY.

    This set of teaching aids consists of seven Audubon Nature Bulletins, providing the teacher and student with informational reading on various topics in conservation. The bulletins have these titles: Plants as Makers of Soil, Water Pollution Control, The Ground Water Table, Conservation--To Keep This Earth Habitable, Our Threatened Air Supply,…

  9. New insights into SRY regulation through identification of 5' conserved sequences

    PubMed Central

    Ross, Diana GF; Bowles, Josephine; Koopman, Peter; Lehnert, Sigrid

    2008-01-01

    Background SRY is the pivotal gene initiating male sex determination in most mammals, but how its expression is regulated is still not understood. In this study we derived novel SRY 5' flanking genomic sequence data from bovine and caprine genomic BAC clones. Results We identified four intervals of high homology upstream of SRY by comparison of human, bovine, pig, goat and mouse genomic sequences. These conserved regions contain putative binding sites for a large number of known transcription factor families, including several that have been implicated previously in sex determination and early gonadal development. Conclusion Our results reveal potentially important SRY regulatory elements, mutations in which might underlie cases of idiopathic human XY sex reversal. PMID:18851760

  10. Nanopores and nucleic acids: prospects for ultrarapid sequencing

    NASA Technical Reports Server (NTRS)

    Deamer, D. W.; Akeson, M.

    2000-01-01

    DNA and RNA molecules can be detected as they are driven through a nanopore by an applied electric field at rates ranging from several hundred microseconds to a few milliseconds per molecule. The nanopore can rapidly discriminate between pyrimidine and purine segments along a single-stranded nucleic acid molecule. Nanopore detection and characterization of single molecules represents a new method for directly reading information encoded in linear polymers. If single-nucleotide resolution can be achieved, it is possible that nucleic acid sequences can be determined at rates exceeding a thousand bases per second.

  11. Conserved Plasmid Hydrogen-Uptake (hup)-Specific Sequences within Hup+Rhizobium leguminosarum Strains

    PubMed Central

    Leyva, Antonio; Palacios, José M.; Ruiz-Argüeso, Tomás

    1987-01-01

    Thirteen Rhizobium leguminosarum strains previously reported as H2-uptake hydrogenase positive (Hup+) or negative (Hup−) were analyzed for the presence and conservation of DNA sequences homologous to cloned Bradyrhizobium japonicum hup-specific DNA from cosmid pHU1 (M. A. Cantrell, R. A. Haugland, and H. J. Evans, Proc. Natl. Acad. Sci. USA 80:181-185, 1983). The Hup phenotype of these strains was reexamined by determining hydrogenase activity induced in bacteroids from pea nodules. Five strains, including H2 oxidation-ATP synthesis-coupled and -uncoupled strains, induced significant rates of H2-uptake hydrogenase activity and contained DNA sequences homologous to three probe DNA fragments (5.9-kilobase [kb] HindIII, 2.9-kb EcoRI, and 5.0-kb EcoRI) from pHU1. The pattern of genomic DNA HindIII and EcoRI fragments with significant homology to each of the three probes was identical in all five strains regardless of the H2-dependent ATP generation trait. The restriction fragments containing the homology totalled about 22 kb of DNA common to the five strains. In all instances the putative hup sequences were located on a plasmid that also contained nif genes. The molecular sizes of the identified hup-sym plasmids ranged between 184 and 212 megadaltons. No common DNA sequences homologous to B. japonicum hup DNA were found in genomic DNA from any of the eight remaining strains showing no significant hydrogenase activity in pea bacteroids. These results suggest that the identified DNA region contains genes essential for hydrogenase activity in R. leguminosarum and that its organization is highly conserved within Hup+ strains in this symbiotic species. PMID:16347471

  12. Yeast general transcription factor GFI: sequence requirements for binding to DNA and evolutionary conservation.

    PubMed Central

    Dorsman, J C; van Heeswijk, W C; Grivell, L A

    1990-01-01

    GFI is an abundant DNA binding protein in the yeast S. cerevisiae. The protein binds to specific sequences in both ARS elements and the upstream regions of a large number of genes and is likely to play an important role in yeast cell growth. To get insight into the relative strength of the various GFI-DNA binding sites within the yeast genome, we have determined dissociation rates for several GFI-DNA complexes and found them to vary over a 70-fold range. Strong binding sites for GFI are present in the upstream activating sequences of the gene encoding the 40 kDa subunit II of the QH2:cytochrome c reductase, the gene encoding ribosomal protein S33 and in the intron of the actin gene. The binding site in the ARS1-TRP1 region is of intermediate strength. All strong binding sites conform to the sequence 5'-RTCRYYYNNNACG-3'. Modification interference experiments and studies with mutant binding sites indicate that critical bases for GFI recognition are within the two elements of the consensus DNA recognition sequence. Proteins with the DNA binding specificities of GFI and GFII can also be detected in the yeast K. lactis, suggesting evolutionary conservation of at least the respective DNA-binding domains in both yeasts. PMID:2187179
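
    The consensus RTCRYYYNNNACG reported above uses standard IUPAC ambiguity codes (R = A or G, Y = C or T, N = any base). As a minimal sketch, such a consensus can be translated into a regular expression and used to scan a DNA string. The helper names and the toy test sequence are assumptions made for illustration; the scan covers the given strand only, reports non-overlapping matches, and says nothing about binding strength, which the study measured experimentally.

```python
import re

# Subset of the IUPAC nucleotide codes needed for this consensus.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any base
}


def consensus_to_regex(consensus: str) -> re.Pattern:
    """Translate an IUPAC consensus string into a compiled regex."""
    return re.compile("".join(IUPAC[symbol] for symbol in consensus))


GFI_SITE = consensus_to_regex("RTCRYYYNNNACG")


def find_sites(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in GFI_SITE.finditer(seq.upper())]


# Hypothetical toy sequence, not taken from the yeast genome.
hits = find_sites("ttATCACTCGGGACGaa")
```

    To cover both strands, the same scan can be repeated on the reverse complement of the input sequence.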

  13. Sequence of Radiotherapy and Chemotherapy in Breast Cancer After Breast-Conserving Surgery

    SciTech Connect

    Jobsen, Jan J.; Palen, Job van der; Brinkhuis, Marieel; Ong, Francisca; Struikmans, Henk

    2012-04-01

    Purpose: The optimal sequence of radiotherapy and chemotherapy in breast-conserving therapy is unknown. Methods and Materials: From 1983 through 2007, a total of 641 patients with 653 instances of breast-conserving therapy (BCT), received both chemotherapy and radiotherapy and are the basis of this analysis. Patients were divided into three groups. Groups A and B comprised patients treated before 2005, Group A radiotherapy first and Group B chemotherapy first. Group C consisted of patients treated from 2005 onward, when we had a fixed sequence of radiotherapy first, followed by chemotherapy. Results: Local control did not show any differences among the three groups. For distant metastasis, no difference was shown between Groups A and B. Group C, when compared with Group A, showed, on univariate and multivariate analyses, a significantly better distant metastasis-free survival. The same was noted for disease-free survival. With respect to disease-specific survival, no differences were shown on multivariate analysis among the three groups. Conclusion: Radiotherapy, as an integral part of the primary treatment of BCT, should be administered first, followed by adjuvant chemotherapy.

  14. Sequence of a cDNA encoding the bi-specific NAD(P)H-nitrate reductase from the tree Betula pendula and identification of conserved protein regions.

    PubMed

    Friemann, A; Brinkmann, K; Hachtel, W

    1991-05-01

    Nitrate reductase (NR) assays revealed a bispecific NAD(P)H-NR (EC 1.6.6.2.) to be the only nitrate-reducing enzyme in leaves of hydroponically grown birches. To obtain the primary structure of the NAD(P)H-NR, leaf poly(A)+ mRNA was used to construct a cDNA library in the lambda gt11 phage. Recombinant clones were screened with heterologous gene probes encoding NADH-NR from tobacco and squash. A 3.0 kb cDNA was isolated which hybridized to a 3.2 kb mRNA whose level was significantly higher in plants grown on nitrate than in those grown on ammonia. The nucleotide sequence of the cDNA comprises a reading frame encoding a protein of 898 amino acids which reveals 67%-77% identity with NADH-nitrate reductase sequences from higher plants. To identify conserved and variable regions of the multicentre electron-transfer protein a graphical evaluation of identities found in NR sequence alignments was carried out. Thirteen well-conserved sections exceeding a size of 10 amino acids were found in higher plant nitrate reductases. Sequence comparisons with related redox proteins indicate that about half of the conserved NR regions are involved in cofactor binding. The most striking difference in the birch NAD(P)H-NR sequence in comparison to NADH-NR sequences was found at the putative pyridine nucleotide binding site. Southern analysis indicates that the bi-specific NR is encoded by a single copy gene in birch.

  15. The amino acid sequence of the aspartate aminotransferase from baker's yeast (Saccharomyces cerevisiae).

    PubMed Central

    Cronin, V B; Maras, B; Barra, D; Doonan, S

    1991-01-01

    1. The single (cytosolic) aspartate aminotransferase was purified in high yield from baker's yeast (Saccharomyces cerevisiae). 2. Amino-acid-sequence analysis was carried out by digestion of the protein with trypsin and with CNBr; some of the peptides produced were further subdigested with Staphylococcus aureus V8 proteinase or with pepsin. Peptides were sequenced by the dansyl-Edman method and/or by automated gas-phase methods. The amino acid sequence obtained was complete except for a probable gap of two residues as indicated by comparison with the structures of counterpart proteins in other species. 3. The N-terminus of the enzyme is blocked. Fast-atom-bombardment m.s. was used to identify the blocking group as an acetyl one. 4. Alignment of the sequence of the enzyme with those of vertebrate cytosolic and mitochondrial aspartate aminotransferases and with the enzyme from Escherichia coli showed that about 25% of residues are conserved between these distantly related forms. 5. Experimental details and confirmatory data for the results presented here are given in a Supplementary Publication (SUP 50164, 25 pages) that has been deposited at the British Library Document Supply Centre, Boston Spa, Wetherby, West Yorkshire LS23 7BQ, U.K., from whom copies can be obtained on the terms indicated in Biochem. J. (1991) 273, 5. PMID:1859361

  16. The complementary deoxyribonucleic acid sequence of guinea pig endometrial prorelaxin.

    PubMed

    Lee, Y A; Bryant-Greenwood, G D; Mandel, M; Greenwood, F C

    1992-03-01

    The nucleotide sequence of the relaxin gene transcript in the endometrium of the late pregnant guinea pig has been determined. The strategy used was a combination of polymerase chain reaction (PCR) with primers designed from the mRNA sequence of porcine preprorelaxin, rapid amplification of cDNA ends-PCR, and blunt end cloning in M13 mp18. With heterologous primers, a 226-basepair (bp) segment of the guinea pig relaxin gene sequence was obtained and was used to design a guinea pig-specific primer for use with the rapid amplification of cDNA ends-PCR method. The latter allowed completion of the sequence of 336 bp, with a 96-bp overlap. The sequence obtained shows greater homology at both the nucleotide and amino acid levels with porcine and human relaxins H1 and H2 than with rat relaxin, supporting the thesis that the guinea pig is not a rodent. The transcription of the guinea pig endometrial relaxin gene during pregnancy was confirmed by Northern analysis of guinea pig endometrial tissues with a species-specific cDNA probe. The endometrial relaxin gene is transcribed during pregnancy, but not in lactation, consistent with the observed immunostaining for relaxin.

  17. Quantum-Sequencing: Biophysics of quantum tunneling through nucleic acids

    NASA Astrophysics Data System (ADS)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    Tunneling microscopy and spectroscopy have been used extensively in physical surface sciences to study quantum tunneling, to measure the electronic local density of states of nanomaterials and to characterize adsorbed species. Quantum-Sequencing (Q-Seq) is a new method based on tunneling microscopy for electronic sequencing of single molecules of nucleic acids. A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free single-molecule sequencing method. Here, we present the unique "electronic fingerprints" for all nucleotides on DNA and RNA using Q-Seq, along with their intrinsic biophysical parameters. We have analyzed tunneling spectra for the nucleotides at different pH conditions and analyzed the HOMO, LUMO and energy gap for all of them. In addition we show a number of biophysical parameters to further characterize all nucleobases (electron and hole transition voltage and energy barriers). These results highlight the robustness of Q-Seq as a technique for next-generation sequencing.

  18. The expressed TCRβ CDR3 repertoire is dominated by conserved DNA sequences in channel catfish.

    PubMed

    Findly, R Craig; Niagro, Frank D; Dickerson, Harry W

    2017-03-01

    We analyzed by high-throughput sequencing T cell receptor beta CDR3 repertoires expressed by αβ T cells in outbred channel catfish before and after an immunizing infection with the parasitic protozoan Ichthyophthirius multifiliis. We compared CDR3 repertoires in caudal fin before infection and at three weeks after infection, and in skin, PBL, spleen and head kidney at seven and twenty-one weeks after infection. Public clonotypes with the same CDR3 amino acid sequence were expressed by αβ T cells that underwent clonal expansion following development of immunity. These clonally expanded αβ T cells were primarily located in spleen and skin, which is a site of infection. Although multiple DNA sequences were expected to code for each public clonotype, each public clonotype was predominately coded by an identical CDR3 DNA sequence in combination with the same J gene in all fish. The processes underlying this shared use of CDR3 DNA sequences are not clear.

  19. Purification, amino acid sequence and characterisation of kangaroo IGF-I.

    PubMed

    Yandell, C A; Francis, G L; Wheldrake, J F; Upton, Z

    1998-01-01

    Insulin-like growth factor-I (IGF-I) and IGF-II have been purified to homogeneity from kangaroo (Macropus fuliginosus) serum, thus this represents the first report of the purification, sequencing and characterisation of marsupial IGFs. N-Terminal protein sequencing reveals that there are six amino acid differences between kangaroo and human IGF-I. Kangaroo IGF-II has been partially sequenced and no differences were found between human and kangaroo IGF-II in the 53 residues identified. Thus the IGFs appear to be remarkably structurally conserved during mammalian radiation. In addition, in vitro characterisation of kangaroo IGF-I demonstrated that the functional properties of human, kangaroo and chicken IGF-I are very similar. In an assay measuring the ability of the proteins to stimulate protein synthesis in rat L6 myoblasts, all IGF-I proteins were found to be equally potent. The ability of all three proteins to compete for binding with radiolabelled human IGF-I to type-1 IGF receptors in L6 myoblasts and in Sminthopsis crassicaudata transformed lung fibroblasts, a marsupial cell line, was comparable. Furthermore, kangaroo and human IGF-I react equally in a human IGF-I RIA using a human reference standard, radiolabelled human IGF-I and a polyclonal antibody raised against recombinant human IGF-I. This study indicates that not only is the primary structure of eutherian and metatherian IGF-I conserved, but also the proteins appear to be functionally similar.

  20. CSTminer: a web tool for the identification of coding and noncoding conserved sequence tags through cross-species genome comparison.

    PubMed

    Castrignanò, Tiziana; Canali, Alessandro; Grillo, Giorgio; Liuni, Sabino; Mignone, Flavio; Pesole, Graziano

    2004-07-01

    The identification and characterization of genome tracts that are highly conserved across species during evolution may contribute significantly to the functional annotation of whole-genome sequences. Indeed, such sequences are likely to correspond to known or unknown coding exons or regulatory motifs. Here, we present a web server implementing a previously developed algorithm that, by comparing user-submitted genome sequences, is able to identify statistically significant conserved blocks and assess their coding or noncoding nature through the measure of a coding potential score. The web tool, available at http://www.caspur.it/CSTminer/, is dynamically interconnected with the Ensembl genome resources and produces a graphical output showing a map of detected conserved sequences and annotated gene features.

  1. Evolutionary conservation of sequence and secondary structures in CRISPR repeats

    SciTech Connect

    Kunin, Victor; Sorek, Rotem; Hugenholtz, Philip

    2006-09-01

    Clustered Regularly Interspaced Palindromic Repeats (CRISPRs) are a novel class of direct repeats, separated by unique spacer sequences of similar length, that are present in approximately 40% of bacterial and all archaeal genomes analyzed to date. More than 40 gene families, called CRISPR-associated sequences (CAS), appear in conjunction with these repeats and are thought to be involved in the propagation and functioning of CRISPRs. It has been proposed that the CRISPR/CAS system samples, maintains a record of, and inactivates invasive DNA that the cell has encountered, and therefore constitutes a prokaryotic analog of an immune system. Here we analyze CRISPR repeats identified in 195 microbial genomes and show that they can be organized into multiple clusters based on sequence similarity. All individual repeats in any given cluster were inferred to form characteristic RNA secondary structure, ranging from non-existent to pronounced. Stable secondary structures included G:U base pairs and exhibited multiple compensatory base changes in the stem region, indicating evolutionary conservation and functional importance. We also show that the repeat-based classification corresponds to, and expands upon, a previously reported CAS gene-based classification including specific relationships between CRISPR and CAS subtypes.
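
The idea of a compensatory base change in a stem can be illustrated with a small sketch: a substitution on one arm breaks a pair, and a second substitution on the opposite arm restores it. The code below checks whether two stem arms pair antiparallel, allowing Watson-Crick and G:U wobble pairs. The function name and the four-base arms are hypothetical toy inputs, not sequences or methods from the report.

```python
# Allowed RNA base pairs: Watson-Crick plus the G:U wobble pair.
ALLOWED_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def stem_is_paired(five_prime_arm: str, three_prime_arm: str) -> bool:
    """Return True if every position of the two arms can pair.

    Both arms are written 5' to 3'; the 3' arm is reversed so that the
    first base of one arm faces the last base of the other.
    """
    if len(five_prime_arm) != len(three_prime_arm):
        raise ValueError("stem arms must have equal length")
    facing = reversed(three_prime_arm.upper())
    return all(
        (x, y) in ALLOWED_PAIRS
        for x, y in zip(five_prime_arm.upper(), facing)
    )


if __name__ == "__main__":
    print(stem_is_paired("GGCU", "AGCC"))  # original toy stem
    print(stem_is_paired("GGAU", "AGCC"))  # single change breaks a pair
    print(stem_is_paired("GGAU", "AUCC"))  # second change restores pairing
```

A real analysis would predict structure with a folding program; this check only shows why paired substitutions are read as evidence that the stem itself is under selection.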

  2. The complete nucleotide sequence of the Crossostoma lacustre mitochondrial genome: conservation and variations among vertebrates.

    PubMed Central

    Tzeng, C S; Hui, C F; Shen, S C; Huang, P C

    1992-01-01

    The complete mitochondrial (mt) genome of Crossostoma lacustre, a freshwater loach from mountain streams of Taiwan, has been cloned and sequenced. This fish mt genome, consisting of 16558 base pairs, encodes genes for 13 proteins, two rRNAs, and 22 tRNAs, in addition to a regulatory sequence for replication and transcription (D-loop), and is similar to those of other vertebrates in both the order and orientation of these genes. The protein-coding and ribosomal RNA genes are highly homologous, both in size and composition, to their counterparts in mammals, birds, amphibians, and invertebrates, and use essentially the same set of codons, including both the initiation and termination signals, and the tRNAs. Differences do exist, however, in the lengths and sequences of the D-loop regions, and in the space between genes, which account for the variations in total lengths of the genomes. Our observations provide evidence for the first time for the conservation of genetic information in the fish mitochondrial genome, especially among the vertebrates. PMID:1408800

  3. Sequence conservation in avian CR1: an interspersed repetitive DNA family evolving under functional constraints.

    PubMed Central

    Chen, Z Q; Ritzel, R G; Lin, C C; Hodgetts, R B

    1991-01-01

    CR1 is a short interspersed repetitive DNA element originally identified in the domestic chicken (Gallus gallus). However, unlike virtually all other such sequences described to date, CR1 is not confined to one or a few closely related species. It is probably a ubiquitous component of the avian genome, having been detected in representatives of nine orders encompassing a wide spectrum of the class Aves. This identification was made possible by using the polymerase chain reaction (PCR), which revealed interspecific similarities not detected by conventional Southern analysis. DNA sequence comparisons between a CR1 element isolated from a sarus crane (Grus antigone) and those isolated from an emu (Dromaius novaehollandiae) showed that two short highly conserved regions are present. These are included within two regions previously characterized in the CR1 units of domestic fowl. One of these behaves as a transcriptional silencer and the other is a binding site for a nuclear protein. Our observations suggest that CR1 has evolved under functional constraints and that interspersed repetitive sequences as a class may constitute a more significant component of the eukaryotic genome than is generally acknowledged. Images PMID:1829530

  4. cDNA sequence, genomic organization, and evolutionary conservation of a novel gene from the WAGR region

    SciTech Connect

    Schwartz, F.; Eisenman, R.; Knoll, J.; Bruns, G.

    1995-09-20

    A new gene (239FB) with predominant and differential expression in fetal brain has recently been isolated from a chromosome 11p13-p14 boundary area near FSHB. The corresponding mRNA has an open reading frame of 294 amino acids, a 3' untranslated region of 1247 nucleotides, and a highly GC-rich 5' untranslated region. The coding and 3' UT sequence is specified by 6 exons within nearly 87 kb of isolated genomic locus. The 5' end region of the transcript maps adjacent to the only genomically defined CpG island in a chromosomal subregion that may be associated with part of the mental retardation of some WAGR (Wilms tumor, aniridia, genitourinary anomalies, and mental retardation) syndrome patients. In addition to nucleotide and amino acid similarity to an EST from a normalized infant brain cDNA library, the predicted protein has extensive similarity to Caenorhabditis elegans polypeptides of, as yet, unknown function. The 239FB locus is, therefore, likely part of a family of genes with two members expressed in human brain. The extensive conservation of the predicted protein suggests a fundamental function of the gene product and will enable evaluation of the role of the 239FB gene in neurogenesis in model organisms. 48 refs., 4 figs., 1 tab.

  5. Molecular cloning and amino acid sequence of human 5-lipoxygenase

    SciTech Connect

    Matsumoto, T.; Funk, C.D.; Radmark, O.; Hoeoeg, J.O.; Joernvall, H.; Samuelsson, B.

    1988-01-01

    5-Lipoxygenase (EC 1.13.11.34), a Ca2+- and ATP-requiring enzyme, catalyzes the first two steps in the biosynthesis of the peptidoleukotrienes and the chemotactic factor leukotriene B4. A cDNA clone corresponding to 5-lipoxygenase was isolated from a human lung lambda gt11 expression library by immunoscreening with a polyclonal antibody. Additional clones from a human placenta lambda gt11 cDNA library were obtained by plaque hybridization with the 32P-labeled lung cDNA clone. Sequence data obtained from several overlapping clones indicate that the composite DNAs contain the complete coding region for the enzyme. From the deduced primary structure, 5-lipoxygenase encodes a 673 amino acid protein with a calculated molecular weight of 77,839. Direct analysis of the native protein and its proteolytic fragments confirmed the deduced composition, the amino-terminal amino acid sequence, and the structure of many internal segments. 5-Lipoxygenase has no apparent sequence homology with leukotriene A4 hydrolase or Ca2+-binding proteins. RNA blot analysis indicated substantial amounts of an mRNA species of approximately 2700 nucleotides in leukocytes, lung, and placenta.

  6. Nucleic acid sequence detection using multiplexed oligonucleotide PCR

    DOEpatents

    Nolan, John P.; White, P. Scott

    2006-12-26

    Methods for rapidly detecting single or multiple sequence alleles in a sample nucleic acid are described. Provided are all of the oligonucleotide pairs capable of annealing specifically to a target allele and discriminating among possible sequences thereof, and ligating to each other to form an oligonucleotide complex when a particular sequence feature is present (or, alternatively, absent) in the sample nucleic acid. The design of each oligonucleotide pair permits the subsequent high-level PCR amplification of a specific amplicon when the oligonucleotide complex is formed, but not when the oligonucleotide complex is not formed. The presence or absence of the specific amplicon is used to detect the allele. Detection of the specific amplicon may be achieved using a variety of methods well known in the art, including without limitation, oligonucleotide capture onto DNA chips or microarrays, oligonucleotide capture onto beads or microspheres, electrophoresis, and mass spectrometry. Various labels and address-capture tags may be employed in the amplicon detection step of multiplexed assays, as further described herein.

  7. Conserved aspartic acid 233 and alanine 231 are not required for poliovirus polymerase function in replicons

    PubMed Central

    Freistadt, Marion S; Eberle, Karen E

    2007-01-01

    Nucleic acid polymerases have similar structures and motifs. The function of an aspartic acid (conserved in all classes of nucleic acid polymerases) in motif A remains poorly understood in RNA-dependent RNA polymerases. We mutated this residue to alanine in a poliovirus replicon. The resulting mutant could still replicate, although at a reduced level. In addition, mutation A231C (also in motif A) yielded high levels of replication. Taken together these results show that poliovirus polymerase conserved residues D233 and A231 are not essential to poliovirus replicon function. PMID:17352827

  8. The amino acid sequence of rabbit cardiac troponin I.

    PubMed Central

    Grand, R J; Wilkinson, J M

    1976-01-01

    The complete amino acid sequence of troponin I from rabbit cardiac muscle was determined by the isolation of four unique CNBr fragments, together with overlapping tryptic peptides containing radioactive methionine residues. Overlap data for residues 35-36, 93-94 and 140-145 are incomplete, the sequence at these positions being based on homology with the sequence of the fast-skeletal-muscle protein. Cardiac troponin I is a single polypeptide chain of 206 residues with mol.wt. 23550 and an extinction coefficient, E 1%,1cm/280, of 4.37. The protein has a net positive charge of 14 and is thus somewhat more basic than troponin I from fast-skeletal muscle. Comparison of the sequences of troponin I from cardiac and fast skeletal muscle shows that the cardiac protein has 26 extra residues at the N-terminus, which account for the larger size of the protein. In the remainder of the sequence there is a considerable degree of homology, this being greater in the C-terminal two-thirds of the molecule. The region in the cardiac protein corresponding to the peptide with inhibitory activity from the fast-skeletal-muscle protein is very similar and it seems unlikely that this is the cause of the difference in inhibitory activity between the two proteins. The region responsible for binding troponin C, however, possesses a lower degree of homology. Detailed evidence on which the sequence is based has been deposited as Supplementary Publication SUP 50072 (20 pages), at the British Library Lending Division, Boston Spa, Wetherby, West Yorkshire LS23 7QB, U.K., from whom copies may be obtained on the terms given in Biochem. J. (1976) 153, 5. PMID:1008822

  9. Identification of tropomyosins as major allergens in antarctic krill and mantis shrimp and their amino acid sequence characteristics.

    PubMed

    Motoyama, Kanna; Suma, Yota; Ishizaki, Shoichiro; Nagashima, Yuji; Lu, Ying; Ushio, Hideki; Shiomi, Kazuo

    2008-01-01

    Tropomyosin represents a major allergen of decapod crustaceans such as shrimps and crabs, and its highly conserved amino acid sequence (>90% identity) is a molecular basis of the immunoglobulin E (IgE) cross-reactivity among decapods. At present, however, little information is available about allergens in edible crustaceans other than decapods. In this study, the major allergen in two species of edible crustaceans, Antarctic krill Euphausia superba and mantis shrimp Oratosquilla oratoria that are taxonomically distinct from decapods, was demonstrated to be tropomyosin by IgE-immunoblotting using patient sera. The cross-reactivity of the tropomyosins from both species with decapod tropomyosins was also confirmed by inhibition IgE immunoblotting. Sequences of the tropomyosins from both species were determined by complementary deoxyribonucleic acid cloning. The mantis shrimp tropomyosin has high sequence identity (>90% identity) with decapod tropomyosins, especially with fast-type tropomyosins. On the other hand, the Antarctic krill tropomyosin is characterized by diverse alterations in region 13-42, the amino acid sequence of which is highly conserved for decapod tropomyosins, and hence, it shares somewhat lower sequence identity (82.4-89.8% identity) with decapod tropomyosins than the mantis shrimp tropomyosin. Quantification by enzyme-linked immunosorbent assay revealed that Antarctic krill contains tropomyosin at almost the same level as decapods, suggesting that its allergenicity is equivalent to decapods. However, mantis shrimp was assumed to be substantially not allergenic because of the extremely low content of tropomyosin.
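
Percent identity figures such as those quoted above are computed from a pairwise alignment. The sketch below shows one common convention: matches divided by the number of aligned columns in which neither sequence has a gap. The abstract does not state which denominator the authors used, so this convention, the function name and the toy strings are assumptions for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over ungapped columns of a pairwise alignment.

    Both inputs must already be aligned (equal length, '-' for gaps).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy ten-residue strings differing at one position.
    print(percent_identity("MDAIKKKMQA", "MEAIKKKMQA"))
```

Other conventions divide by the full alignment length or by the shorter sequence length, which gives lower values when gaps are present.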

  10. The highly conserved aspartic acid residue between hypervariable regions 1 and 2 of human immunodeficiency virus type 1 gp120 is important for early stages of virus replication.

    PubMed Central

    Wang, W K; Essex, M; Lee, T H

    1995-01-01

    Between hypervariable regions V1 and V2 of human immunodeficiency virus type 1 (HIV-1) gp120 lies a cluster of relatively conserved residues. The contribution of nine charged residues in this region to virus infectivity was evaluated by single-amino-acid substitutions in an infectious provirus clone. Three of the HIV-1 mutants studied had slower growth kinetics than the wild-type virus. The delay was most pronounced in a mutant with an alanine substituted for an aspartic acid residue at position 180. This aspartic acid is conserved by all HIV-1 isolates with known nucleotide sequences. Substitutions with three other residues at this position, including a negatively charged glutamic acid, all affected virus infectivity. The defect identified in these mutants suggests that this aspartic acid residue is involved in the early stages of HIV-1 replication. PMID:7983752

  11. Position-specific prediction of methylation sites from sequence conservation based on information theory.

    PubMed

    Shi, Yinan; Guo, Yanzhi; Hu, Yayun; Li, Menglong

    2015-07-23

    Protein methylation plays vital roles in many biological processes and has been implicated in various human diseases. To fully understand the mechanisms underlying methylation for use in drug design and work in methylation-related diseases, an initial but crucial step is to identify methylation sites. The use of high-throughput bioinformatics methods has become imperative to predict methylation sites. In this study, we developed a novel method that is based only on sequence conservation to predict protein methylation sites. Conservation difference profiles between methylated and non-methylated peptides were constructed by the information entropy (IE) in a wider neighbor interval around the methylation sites that fully incorporated all of the environmental information. Then, the distinctive neighbor residues were identified by the importance scores of information gain (IG). The most representative model was constructed by support vector machine (SVM) for Arginine and Lysine methylation, respectively. This model yielded a promising result on both the benchmark dataset and independent test set. The model was used to screen the entire human proteome, and many unknown substrates were identified. These results indicate that our method can serve as a useful supplement to elucidate the mechanism of protein methylation and facilitate hypothesis-driven experimental design and validation.
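
A conservation difference profile of the kind described can be sketched with per-position Shannon entropy. The code below computes the entropy of each column in a set of equal-length peptide windows and subtracts the profile of one set from another. It is a minimal illustration under assumed inputs: the function names and toy peptides are hypothetical, and the information-gain ranking and SVM steps of the published method are not reproduced.

```python
from collections import Counter
from math import log2


def column_entropy(column) -> float:
    """Shannon entropy (bits) of the residues in one alignment column."""
    column = list(column)
    total = len(column)
    counts = Counter(column)
    return sum(-(n / total) * log2(n / total) for n in counts.values())


def entropy_profile(peptides) -> list:
    """Per-position entropy for a list of equal-length peptide windows."""
    if len({len(p) for p in peptides}) != 1:
        raise ValueError("all peptides must have the same length")
    return [column_entropy(col) for col in zip(*peptides)]


def difference_profile(methylated, non_methylated) -> list:
    """Entropy of the non-methylated set minus the methylated set.

    Positive values mark positions that are more conserved (lower entropy)
    around methylated sites than around non-methylated ones.
    """
    pos = entropy_profile(methylated)
    neg = entropy_profile(non_methylated)
    return [n - p for p, n in zip(pos, neg)]


if __name__ == "__main__":
    # Hypothetical three-residue windows centred on an arginine.
    print(difference_profile(["GRG", "GRG", "GRG"], ["ARK", "GRG", "SRL"]))
```

Lower entropy means stronger conservation, so positions with large positive differences would be candidates for the distinctive neighbor residues the abstract mentions.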

  12. Ultrasensitive nucleic acid sequence detection by single-molecule electrophoresis

    SciTech Connect

    Castro, A; Shera, E.B.

    1996-09-01

    This is the final report of a one-year laboratory-directed research and development project at Los Alamos National Laboratory. There has been considerable interest in the development of very sensitive clinical diagnostic techniques over the last few years. Many pathogenic agents are often present in extremely small concentrations in clinical samples, especially at the initial stages of infection, making their detection very difficult. This project sought to develop a new technique for the detection and accurate quantification of specific bacterial and viral nucleic acid sequences in clinical samples. The scheme involved the use of novel hybridization probes for the detection of nucleic acids combined with our recently developed technique of single-molecule electrophoresis. This project is directly relevant to the DOE's Defense Programs strategic directions in the area of biological warfare counter-proliferation.

  13. Sequence of the canine herpesvirus thymidine kinase gene: taxon-preferred amino acid residues in the alphaherpesviral thymidine kinases.

    PubMed

    Rémond, M; Sheldrick, P; Lebreton, F; Foulon, T

    1995-12-01

    Multiple sequence alignments of evolutionarily related proteins are finding increasing use as indicators of critical amino acid residues necessary for structural stability or involved in functional domains responsible for catalytic activities. In the past, a number of alignments have provided such information for the herpesviral thymidine kinases, for which three-dimensional structures are not yet available. We have sequenced the thymidine kinase gene of a canine herpesvirus, and with a multiple alignment have identified amino acids preferentially conserved in either of two taxons, the genera Varicellovirus and Simplexvirus, of the subfamily Alphaherpesvirinae. Since some regions of the thymidine kinases show otherwise elevated levels of substitutional tolerance, these conserved amino acids are candidates for critical residues which have become fixed through selection during the evolutionary divergence of these enzymes. Several pairs with distinctive patterns of distribution among the various viruses occur in or near highly conserved sequence motifs previously proposed to form the catalytic site, and we speculate that they may represent interacting, co-ordinately variable residues.

  14. A conservative amino acid substitution alters the regiospecificity of CYP94A2, a fatty acid hydroxylase from the plant Vicia sativa.

    PubMed

    Kahn, R A; Le Bouquin, R; Pinot, F; Benveniste, I; Durst, F

    2001-07-15

    Fatty acid omega-hydroxylation is involved in the biosynthesis of the plant cuticle, formation of plant defense signaling molecules, and possibly in the rapid catabolism of free fatty acids liberated under stress conditions. CYP94A2 is a cytochrome P450-dependent medium-chain fatty acid hydroxylase that was recently isolated from Vicia sativa. Contrary to CYP94A1 and CYP86A1, two other fatty acid hydroxylases previously characterized in V. sativa and Arabidopsis thaliana, CYP94A2 is not a strict omega-hydroxylase, but exhibits chain-length-dependent regioselectivity of oxidative attack. Sequence alignments of CYP94A2 with CYP94A1 and molecular modeling studies suggested that F494, located in SRS-6 (substrate recognition site) was involved in substrate recognition and positioning. Indeed, a conservative amino acid substitution at that position markedly altered the regiospecificity of CYP94A2. The observed shift from omega toward omega-1 hydroxylation was prominent with lauric acid as substrate and declined with increasing fatty acid chain length.

  15. Nucleic acid (cDNA) and amino acid sequences of alpha-type gliadins from wheat (Triticum aestivum).

    PubMed Central

    Kasarda, D D; Okita, T W; Bernardin, J E; Baecker, P A; Nimmo, C C; Lew, E J; Dietler, M D; Greene, F C

    1984-01-01

    The complete amino acid sequence for an alpha-type gliadin protein of wheat (Triticum aestivum Linnaeus) endosperm has been derived from a cloned cDNA sequence. An additional cDNA clone that corresponds to about 75% of a similar alpha-type gliadin has been sequenced and shows some important differences. About 97% of the composite sequence of A-gliadin (an alpha-type gliadin fraction) has also been obtained by direct amino acid sequencing. This sequence shows a high degree of similarity with amino acid sequences derived from both cDNA clones and is virtually identical to one of them. On the basis of sequence information, after loss of the signal sequence, the mature alpha-type gliadins may be divided into five different domains, two of which may have evolved from an ancestral gliadin gene, whereas the remaining three contain repeating sequences that may have developed independently. Images PMID:6589619

  16. Origin and spread of photosynthesis based upon conserved sequence features in key bacteriochlorophyll biosynthesis proteins.

    PubMed

    Gupta, Radhey S

    2012-11-01

    The origin of photosynthesis and how this capability has spread to other bacterial phyla remain important unresolved questions. I describe here a number of conserved signature indels (CSIs) in key proteins involved in bacteriochlorophyll (Bchl) biosynthesis that provide important insights in these regards. The proteins BchL and BchX, which are essential for Bchl biosynthesis, are derived by gene duplication in a common ancestor of all phototrophs. More ancient gene duplication gave rise to the BchX-BchL proteins and the NifH protein of the nitrogenase complex. The sequence alignment of NifH-BchX-BchL proteins contain two CSIs that are uniquely shared by all NifH and BchX homologs, but not by any BchL homologs. These CSIs and phylogenetic analysis of NifH-BchX-BchL protein sequences strongly suggest that the BchX homologs are ancestral to BchL and that the Bchl-based anoxygenic photosynthesis originated prior to the chlorophyll (Chl)-based photosynthesis in cyanobacteria. Another CSI in the BchX-BchL sequence alignment that is uniquely shared by all BchX homologs and the BchL sequences from Heliobacteriaceae, but absent in all other BchL homologs, suggests that the BchL homologs from Heliobacteriaceae are primitive in comparison to all other photosynthetic lineages. Several other identified CSIs in the BchN homologs are commonly shared by all proteobacterial homologs and a clade consisting of the marine unicellular Cyanobacteria (Clade C). These CSIs in conjunction with the results of phylogenetic analyses and pair-wise sequence similarity on the BchL, BchN, and BchB proteins, where the homologs from Clade C Cyanobacteria and Proteobacteria exhibited close relationship, provide strong evidence that these two groups have incurred lateral gene transfers. Additionally, phylogenetic analyses and several CSIs in the BchL-N-B proteins that are uniquely shared by all Chlorobi and Chloroflexi homologs provide evidence that the genes for these proteins have also been

  17. Structural gene and complete amino acid sequence of Vibrio alginolyticus collagenase.

    PubMed Central

    Takeuchi, H; Shibano, Y; Morihara, K; Fukushima, J; Inami, S; Keil, B; Gilles, A M; Kawamoto, S; Okuda, K

    1992-01-01

    The DNA encoding the collagenase of Vibrio alginolyticus was cloned, and its complete nucleotide sequence was determined. When the cloned gene was ligated to pUC18, the Escherichia coli expression vector, bacteria carrying the gene exhibited both collagenase antigen and collagenase activity. The open reading frame from the ATG initiation codon was 2442 bp in length for the collagenase structural gene. The amino acid sequence, deduced from the nucleotide sequence, revealed that the mature collagenase consists of 739 amino acids with an Mr of 81875. The amino acid sequences of 20 polypeptide fragments were completely identical with the deduced amino acid sequences of the collagenase gene. The amino acid composition predicted from the DNA sequence was similar to the chemically determined composition of purified collagenase reported previously. The analyses of both the DNA and amino acid sequences of the collagenase gene were rigorously performed, but we could not detect any significant sequence similarity to other collagenases. PMID:1311172

  18. Development of a protein-ligand-binding site prediction method based on interaction energy and sequence conservation.

    PubMed

    Tsujikawa, Hiroto; Sato, Kenta; Wei, Cao; Saad, Gul; Sumikoshi, Kazuya; Nakamura, Shugo; Terada, Tohru; Shimizu, Kentaro

    2016-09-01

    We present a new method for predicting protein-ligand-binding sites based on protein three-dimensional structure and amino acid conservation. This method involves calculation of the van der Waals interaction energy between a protein and many probes placed on the protein surface and subsequent clustering of the probes with low interaction energies to identify the most energetically favorable locus. In addition, it uses amino acid conservation among homologous proteins. Ligand-binding sites were predicted by combining the interaction energy and the amino acid conservation score. The performance of our prediction method was evaluated using a non-redundant dataset of 348 ligand-bound and ligand-unbound protein structure pairs, constructed by filtering entries in a ligand-binding site structure database, LigASite. Ligand-bound structure prediction (bound prediction) indicated that 74.0 % of predicted ligand-binding sites overlapped with real ligand-binding sites by over 25 % of their volume. Ligand-unbound structure prediction (unbound prediction) indicated that 73.9 % of predicted ligand-binding residues overlapped with real ligand-binding residues. The amino acid conservation score improved the average prediction accuracy by 17.0 and 17.6 points for the bound and unbound predictions, respectively. These results demonstrate the effectiveness of the combined use of the interaction energy and amino acid conservation in the ligand-binding site prediction.

  19. Discovery and profiling of novel and conserved microRNAs during flower development in Carya cathayensis via deep sequencing.

    PubMed

    Wang, Zheng Jia; Huang, Jian Qin; Huang, You Jun; Li, Zheng; Zheng, Bing Song

    2012-08-01

    Hickory (Carya cathayensis Sarg.) is an economically important woody plant in China, but its long juvenile phase delays yield. MicroRNAs (miRNAs) are critical regulators of genes and important for normal plant development and physiology, including flower development. We used Solexa technology to sequence two small RNA libraries from two floral differentiation stages in hickory to identify miRNAs related to flower development. We identified 39 conserved miRNA sequences from 114 loci belonging to 23 families as well as two novel and ten potential novel miRNAs belonging to nine families. Moreover, 35 conserved miRNA*s and two novel miRNA*s were detected. Twenty miRNA sequences from 49 loci belonging to 11 families were differentially expressed; all were up-regulated at the later stage of flower development in hickory. Quantitative real-time PCR of 12 conserved miRNA sequences, five novel miRNA families, and two novel miRNA*s validated that all were expressed during hickory flower development, and the expression patterns were similar to those detected with Solexa sequencing. Finally, a total of 146 targets of the novel and conserved miRNAs were predicted. This study identified a diverse set of miRNAs that were closely related to hickory flower development and that could help in plant floral induction.

  20. A conserved intronic U1 snRNP-binding sequence promotes trans-splicing in Drosophila.

    PubMed

    Gao, Jun-Li; Fan, Yu-Jie; Wang, Xiu-Ye; Zhang, Yu; Pu, Jia; Li, Liang; Shao, Wei; Zhan, Shuai; Hao, Jianjiang; Xu, Yong-Zhen

    2015-04-01

    Unlike typical cis-splicing, trans-splicing joins exons from two separate transcripts to produce chimeric mRNA and has been detected in most eukaryotes. Trans-splicing in trypanosomes and nematodes has been characterized as a spliced leader RNA-facilitated reaction; in contrast, its mechanism in higher eukaryotes remains unclear. Here we investigate mod(mdg4), a classic trans-spliced gene in Drosophila, and report that two critical RNA sequences in the middle of the last 5' intron, TSA and TSB, promote trans-splicing of mod(mdg4). In TSA, a 13-nucleotide (nt) core motif is conserved across Drosophila species and is essential and sufficient for trans-splicing, which binds U1 small nuclear RNP (snRNP) through strong base-pairing with U1 snRNA. In TSB, a conserved secondary structure acts as an enhancer. Deletions of TSA and TSB using the CRISPR/Cas9 system result in developmental defects in flies. Although it is not clear how the 5' intron finds the 3' introns, compensatory changes in U1 snRNA rescue trans-splicing of TSA mutants, demonstrating that U1 recruitment is critical to promote trans-splicing in vivo. Furthermore, TSA core-like motifs are found in many other trans-spliced Drosophila genes, including lola. These findings represent a novel mechanism of trans-splicing, in which RNA motifs in the 5' intron are sufficient to bring separate transcripts into close proximity to promote trans-splicing.

  1. Structural Conservation of Ligand Binding Reveals a Bile Acid-like Signaling Pathway in Nematodes*

    PubMed Central

    Zhi, Xiaoyong; Zhou, X. Edward; Melcher, Karsten; Motola, Daniel L.; Gelmedin, Verena; Hawdon, John; Kliewer, Steven A.; Mangelsdorf, David J.; Xu, H. Eric

    2012-01-01

    Bile acid-like molecules named dafachronic acids (DAs) control the dauer formation program in Caenorhabditis elegans through the nuclear receptor DAF-12. This mechanism is conserved in parasitic nematodes to regulate their dauer-like infective larval stage, and as such, the DAF-12 ligand binding domain has been identified as an important therapeutic target in human parasitic hookworm species that infect more than 600 million people worldwide. Here, we report two x-ray crystal structures of the hookworm Ancylostoma ceylanicum DAF-12 ligand binding domain in complex with DA and cholestenoic acid (a bile acid-like metabolite), respectively. Structure analysis and functional studies reveal key residues responsible for species-specific ligand responses of DAF-12. Furthermore, DA binds to DAF-12 mechanistically and is structurally similar to bile acids binding to the mammalian bile acid receptor farnesoid X receptor. Activation of DAF-12 by cholestenoic acid and the cholestenoic acid complex structure suggest that bile acid-like signaling pathways have been conserved in nematodes and mammals. Together, these results reveal the molecular mechanism for the interplay between parasite and host, provide a structural framework for DAF-12 as a promising target in treating nematode parasitism, and provide insight into the evolution of gut parasite hormone-signaling pathways. PMID:22170062

  2. Clostridium sticklandii, a specialist in amino acid degradation: revisiting its metabolism through its genome sequence

    PubMed Central

    2010-01-01

    Background Clostridium sticklandii belongs to a cluster of non-pathogenic proteolytic clostridia which utilize amino acids as carbon and energy sources. Isolated by T.C. Stadtman in 1954, it has been generally regarded as a "gold mine" for novel biochemical reactions and is used as a model organism for studying metabolic aspects such as the Stickland reaction, coenzyme-B12- and selenium-dependent reactions of amino acids. With the goal of revisiting its carbon, nitrogen, and energy metabolism, and comparing studies with other clostridia, its genome has been sequenced and analyzed. Results C. sticklandii is one of the best biochemically studied proteolytic clostridial species. Useful additional information has been obtained from the sequencing and annotation of its genome, which is presented in this paper. Besides, experimental procedures reveal that C. sticklandii degrades amino acids in a preferential and sequential way. The organism prefers threonine, arginine, serine, cysteine, proline, and glycine, whereas glutamate, aspartate and alanine are excreted. Energy conservation is primarily obtained by substrate-level phosphorylation in fermentative pathways. The reactions catalyzed by different ferredoxin oxidoreductases and the exergonic NADH-dependent reduction of crotonyl-CoA point to a possible chemiosmotic energy conservation via the Rnf complex. C. sticklandii possesses both the F-type and V-type ATPases. The discovery of an as yet unrecognized selenoprotein in the D-proline reductase operon suggests a more detailed mechanism for NADH-dependent D-proline reduction. A rather unusual metabolic feature is the presence of genes for all the enzymes involved in two different CO2-fixation pathways: C. sticklandii harbours both the glycine synthase/glycine reductase and the Wood-Ljungdahl pathways. This unusual pathway combination has retrospectively been observed in only four other sequenced microorganisms. Conclusions Analysis of the C. sticklandii genome and

  3. Morphological transformation of calcite crystal growth by prismatic "acidic" polypeptide sequences.

    SciTech Connect

    Kim, I; Giocondi, J L; Orme, C A; Collino, J; Evans, J S

    2007-02-13

    Many of the interesting mechanical and materials properties of the mollusk shell are thought to stem from the prismatic calcite crystal assemblies within this composite structure. It is now evident that proteins play a major role in the formation of these assemblies. Recently, a superfamily of 7 conserved prismatic layer-specific mollusk shell proteins, Asprich, were sequenced, and the 42 AA C-terminal sequence region of this protein superfamily was found to introduce surface voids or porosities on calcite crystals in vitro. Using AFM imaging techniques, we further investigate the effect that this 42 AA domain (Fragment-2) and its constituent subdomains, DEAD-17 and Acidic-2, have on the morphology and growth kinetics of calcite dislocation hillocks. We find that Fragment-2 adsorbs on terrace surfaces and pins acute steps, accelerates then decelerates the growth of obtuse steps, forms clusters and voids on terrace surfaces, and transforms calcite hillock morphology from a rhombohedral form to a rounded one. These results mirror yet are distinct from some of the earlier findings obtained for nacreous polypeptides. The subdomains Acidic-2 and DEAD-17 were found to accelerate then decelerate obtuse steps and induce oval rather than rounded hillock morphologies. Unlike DEAD-17, Acidic-2 does form clusters on terrace surfaces and exhibits stronger obtuse velocity inhibition effects than either DEAD-17 or Fragment-2. Interestingly, a 1:1 mixture of both subdomains induces an irregular polygonal morphology to hillocks, and exhibits the highest degree of acute step pinning and obtuse step velocity inhibition. This suggests that there is some interplay between subdomains within an intra (Fragment-2) or intermolecular (1:1 mixture) context, and sequence interplay phenomena may be employed by biomineralization proteins to exert net effects on crystal growth and morphology.

  4. Bacterial periplasmic sialic acid-binding proteins exhibit a conserved binding site

    SciTech Connect

    Gangi Setty, Thanuja; Cho, Christine; Govindappa, Sowmya; Apicella, Michael A.; Ramaswamy, S.

    2014-07-01

    Structure–function studies of sialic acid-binding proteins from F. nucleatum, P. multocida, V. cholerae and H. influenzae reveal a conserved network of hydrogen bonds involved in conformational change on ligand binding. Sialic acids are a family of related nine-carbon sugar acids that play important roles in both eukaryotes and prokaryotes. These sialic acids are incorporated/decorated onto lipooligosaccharides as terminal sugars in multiple bacteria to evade the host immune system. Many pathogenic bacteria scavenge sialic acids from their host and use them for molecular mimicry. The first step of this process is the transport of sialic acid to the cytoplasm, which often takes place using a tripartite ATP-independent transport system consisting of a periplasmic binding protein and a membrane transporter. In this paper, the structural characterization of periplasmic binding proteins from the pathogenic bacteria Fusobacterium nucleatum, Pasteurella multocida and Vibrio cholerae and their thermodynamic characterization are reported. The binding affinities of several mutations in the Neu5Ac binding site of the Haemophilus influenzae protein are also reported. The structure and the thermodynamics of the binding of sugars suggest that all of these proteins have a very well conserved binding pocket and similar binding affinities. A significant conformational change occurs when these proteins bind the sugar. While the C1 carboxylate has been identified as the primary binding site, a second conserved hydrogen-bonding network is involved in the initiation and stabilization of the conformational states.

  5. Snake venom toxins. The amino acid sequence of toxin Vi2, a homologue of pancreatic trypsin inhibitor, from Dendroaspis polylepis polylepis (black mamba) venom.

    PubMed

    Strydom, D J

    1977-04-25

    The amino acid sequence of venom component Vi2, a protein of low toxicity from Dendroaspis polylepis polylepis venom was determined by automatic sequence analysis in combination with sequence studies on tryptic peptides. This protein, the most retarded fraction of this venom on a cation-exchange resin, is a homologue of bovine pancreatic trypsin inhibitor consisting of a single chain of 57 amino acid residues containing six half-cystine residues. The active site lysyl residue of bovine trypsin inhibitor is conserved in Vi2 although large differences are found in the rest of the molecule.

  6. Real-Time Nucleic Acid Sequence-Based Amplification Assay for Detection of Hepatitis A Virus

    PubMed Central

    Abd El Galil, Khaled H.; El Sokkary, M. A.; Kheira, S. M.; Salazar, Andre M.; Yates, Marylynn V.; Chen, Wilfred; Mulchandani, Ashok

    2005-01-01

    A nucleic acid sequence-based amplification (NASBA) assay in combination with a molecular beacon was developed for the real-time detection and quantification of hepatitis A virus (HAV). A 202-bp, highly conserved 5′ noncoding region of HAV was targeted. The sensitivity of the real-time NASBA assay was tested with 10-fold dilutions of viral RNA, and a detection limit of 1 PFU was obtained. The specificity of the assay was demonstrated by testing with other environmental pathogens and indicator microorganisms, with only HAV positively identified. When combined with immunomagnetic separation, the NASBA assay successfully detected as few as 10 PFU from seeded lake water samples. Due to its isothermal nature, its speed, and its similar sensitivity compared to the real-time RT-PCR assay, this newly reported real-time NASBA method will have broad applications for the rapid detection of HAV in contaminated food or water. PMID:16269748

  7. Conservation analysis predicts in vivo occupancy of glucocorticoid receptor-binding sequences at glucocorticoid-induced genes.

    PubMed

    So, Alex Yick-Lun; Cooper, Samantha B; Feldman, Brian J; Manuchehri, Mitra; Yamamoto, Keith R

    2008-04-15

    The glucocorticoid receptor (GR) interacts with specific GR-binding sequences (GBSs) at glucocorticoid response elements (GREs) to orchestrate transcriptional networks. Although the sequences of the GBSs are highly variable among different GREs, the precise sequence within an individual GRE is highly conserved. In this study, we examined whether sequence conservation of sites resembling GBSs is sufficient to predict GR occupancy of GREs at genes responsive to glucocorticoids. Indeed, we found that the level of conservation of these sites at genes up-regulated by glucocorticoids in mouse C3H10T1/2 mesenchymal stem-like cells correlated directly with the extent of occupancy by GR. In striking contrast, we failed to observe GR occupancy of GBSs at genes repressed by glucocorticoids, despite the occurrence of these sites at a frequency similar to that of the induced genes. Thus, GR occupancy of the GBS motif correlates with induction but not repression, and GBS conservation alone is sufficient to predict GR occupancy and GRE function at induced genes.

  8. THE GRK4 SUBFAMILY OF G PROTEIN-COUPLED RECEPTOR KINASES: ALTERNATIVE SPLICING, GENE ORGANIZATION, AND SEQUENCE CONSERVATION

    EPA Science Inventory

    The GRK4 subfamily of G protein-coupled receptor kinases. Alternative splicing, gene organization, and sequence conservation.

    Premont RT, Macrae AD, Aparicio SA, Kendall HE, Welch JE, Lefkowitz RJ.

    Department of Medicine, Howard Hughes Medical Institute, Duke Univer...

  9. Nucleic acid (cDNA) and amino acid sequences of the maize endosperm protein glutelin-2.

    PubMed Central

    Prat, S; Cortadas, J; Puigdomènech, P; Palau, J

    1985-01-01

    The cDNA coding for a glutelin-2 protein from maize endosperm has been cloned and the complete amino acid sequence of the protein derived for the first time. An immature maize endosperm cDNA bank was screened for the expression of a beta-lactamase:glutelin-2 (G2) fusion polypeptide by using antibodies against the purified 28 kd G2 protein. A clone corresponding to the 28 kd G2 protein was sequenced and the primary structure of this protein was derived. Five regions can be defined in the protein sequence: an 11 residue N-terminal part, a repeated region formed by eight units of the sequence Pro-Pro-Pro-Val-His-Leu, an alternating Pro-X stretch 21 residues long, a Cys rich domain and a C-terminal part rich in Gln. The protein sequence is preceded by 19 residues which have the characteristics of the signal peptide found in secreted proteins. Unlike zeins, the main maize storage proteins, 28 kd glutelin-2 has several homologous sequences in common with other cereal storage proteins. Images PMID:3839076

  10. Molecular cloning, encoding sequence, and expression of vaccinia virus nucleic acid-dependent nucleoside triphosphatase gene.

    PubMed Central

    Rodriguez, J F; Kahn, J S; Esteban, M

    1986-01-01

    A rabbit poxvirus genomic library contained within the expression vector lambda gt11 was screened with polyclonal antiserum prepared against vaccinia virus nucleic acid-dependent nucleoside triphosphatase (NTPase)-I enzyme. Five positive phage clones containing from 0.72- to 2.5-kilobase-pair (kbp) inserts expressed a beta-galactosidase fusion protein that was reactive by immunoblotting with the NTPase-I antibody. Hybridization analysis allowed the location of this gene within the vaccinia HindIIID restriction fragment. From the known nucleotide sequence of the 16-kbp vaccinia HindIIID fragment, we identified a region that contains a 1896-base open reading frame coding for a 631-amino acid protein. Analysis of the complete sequence revealed a highly basic protein, with hydrophilic COOH and NH2 termini, various hydrophobic domains, and no significant homology to other known proteins. Translational studies demonstrate that NTPase-I belongs to a late class of viral genes. This protein is highly conserved among Orthopoxviruses. Images PMID:3025846

  11. Genotyping by sequencing resolves shallow population structure to inform conservation of Chinook salmon (Oncorhynchus tshawytscha)

    PubMed Central

    Larson, Wesley A; Seeb, Lisa W; Everett, Meredith V; Waples, Ryan K; Templin, William D; Seeb, James E

    2014-01-01

    Recent advances in population genomics have made it possible to detect previously unidentified structure, obtain more accurate estimates of demographic parameters, and explore adaptive divergence, potentially revolutionizing the way genetic data are used to manage wild populations. Here, we identified 10 944 single-nucleotide polymorphisms using restriction-site-associated DNA (RAD) sequencing to explore population structure, demography, and adaptive divergence in five populations of Chinook salmon (Oncorhynchus tshawytscha) from western Alaska. Patterns of population structure were similar to those of past studies, but our ability to assign individuals back to their region of origin was greatly improved (>90% accuracy for all populations). We also calculated effective size with and without removing physically linked loci identified from a linkage map, a novel method for nonmodel organisms. Estimates of effective size were generally above 1000 and were biased downward when physically linked loci were not removed. Outlier tests based on genetic differentiation identified 733 loci and three genomic regions under putative selection. These markers and genomic regions are excellent candidates for future research and can be used to create high-resolution panels for genetic monitoring and population assignment. This work demonstrates the utility of genomic data to inform conservation in highly exploited species with shallow population structure. PMID:24665338

  12. Conserved noncoding genomic sequences associated with a flowering-time quantitative trait locus in maize

    PubMed Central

    Salvi, Silvio; Sponza, Giorgio; Morgante, Michele; Tomes, Dwight; Niu, Xiaomu; Fengler, Kevin A.; Meeley, Robert; Ananiev, Evgueni V.; Svitashev, Sergei; Bruggemann, Edward; Li, Bailin; Hainey, Christine F.; Radovic, Slobodanka; Zaina, Giusi; Rafalski, J.-Antoni; Tingey, Scott V.; Miao, Guo-Hua; Phillips, Ronald L.; Tuberosa, Roberto

    2007-01-01

    Flowering time is a fundamental trait of maize adaptation to different agricultural environments. Although a large body of information is available on the map position of quantitative trait loci for flowering time, little is known about the molecular basis of quantitative trait loci. Through positional cloning and association mapping, we resolved the major flowering-time quantitative trait locus, Vegetative to generative transition 1 (Vgt1), to an ≈2-kb noncoding region positioned 70 kb upstream of an Ap2-like transcription factor that we have shown to be involved in flowering-time control. Vgt1 functions as a cis-acting regulatory element as indicated by the correlation of the Vgt1 alleles with the transcript expression levels of the downstream gene. Additionally, within Vgt1, we identified evolutionarily conserved noncoding sequences across the maize–sorghum–rice lineages. Our results support the notion that changes in distant cis-acting regulatory regions are a key component of plant genetic adaptation throughout breeding and evolution. PMID:17595297

  13. Conserved regulators of Rag GTPases orchestrate amino acid-dependent TORC1 signaling

    PubMed Central

    Powis, Katie; De Virgilio, Claudio

    2016-01-01

    The highly conserved target of rapamycin complex 1 (TORC1) is the central component of a signaling network that couples a vast range of internal and external stimuli to cell growth, proliferation and metabolism. TORC1 deregulation is associated with a number of human pathologies, including many cancers and metabolic disorders, underscoring its importance in cellular and organismal growth control. The activity of TORC1 is modulated by multiple inputs; however, the presence of amino acids is a stimulus that is essential for its activation. Amino acid sufficiency is communicated to TORC1 via the highly conserved family of Rag GTPases, which assemble as heterodimeric complexes on lysosomal/vacuolar membranes and are regulated by their guanine nucleotide loading status. Studies in yeast, fly and mammalian model systems have revealed a multitude of conserved Rag GTPase modulators, which have greatly expanded our understanding of amino acid sensing by TORC1. Here we review the major known modulators of the Rag GTPases, focusing on recent mechanistic insights that highlight the evolutionary conservation and divergence of amino acid signaling to TORC1. PMID:27462445

  14. [Identification of new conserved and variable regions in the 16S rRNA gene of acetic acid bacteria and acetobacteraceae family].

    PubMed

    Chakravorty, S; Sarkar, S; Gachhui, R

    2015-01-01

    The Acetobacteraceae family of the class Alphaproteobacteria comprises bacteria that tolerate high sugar and acid concentrations. The Acetic Acid Bacteria are the economically most significant group of this family because of their association with food products such as vinegar and wine. Acetobacteraceae are often hard to culture under laboratory conditions, and they also maintain very low abundances in their natural habitats. Identification of the organisms in such environments therefore depends greatly on modern tools of molecular biology, which require a thorough knowledge of specific conserved gene sequences that may act as primers and/or probes. Moreover, unconserved domains in genes also become markers for differentiating closely related genera. In bacteria, the 16S rRNA gene is an ideal candidate for such conserved and variable domains. In order to study the conserved and variable domains of the 16S rRNA gene of Acetic Acid Bacteria and the Acetobacteraceae family, sequences from publicly available databases were aligned and compared. Near-complete sequences of the gene were also obtained from Kombucha tea biofilm, a known Acetobacteraceae family habitat, in order to corroborate the domains obtained from the alignment studies. The study indicated that the degree of conservation in the gene is significantly higher among the Acetic Acid Bacteria than in the whole Acetobacteraceae family. Moreover, it was also observed that the previously described hypervariable regions V1, V3, V5, V6 and V7 were more or less conserved in the family, and the spans of the variable regions are quite distinct as well.
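
    The alignment comparison described in this record can be illustrated with a per-column conservation score. The sketch below is not the authors' method: it assumes pre-aligned, equal-length sequences and uses a Shannon-entropy score with the gap treated as a fifth symbol. The function name `column_conservation` and the scoring choice are hypothetical.

```python
import math
from collections import Counter

# Assumption: 4 nucleotides plus the gap character give 5 possible symbols.
MAX_ENTROPY = math.log2(5)

def column_conservation(aligned_seqs):
    """Return one score per alignment column: 1.0 means invariant, lower means more variable."""
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")
    scores = []
    for column in zip(*aligned_seqs):
        counts = Counter(column)
        total = len(column)
        # Shannon entropy of the symbol distribution in this column.
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        scores.append(1.0 - entropy / MAX_ENTROPY)
    return scores

# Toy alignment: the first three columns are invariant, the last is fully variable.
toy_alignment = ["ACGT", "ACGA", "ACGC", "ACGG"]
toy_scores = column_conservation(toy_alignment)
```

    Runs of high-scoring columns would be candidate primer or probe sites, and runs of low-scoring columns candidate variable regions. Any threshold separating the two is a further assumption not given in the abstract.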

  15. Nucleotide sequence of a cluster of early and late genes in a conserved segment of the vaccinia virus genome.

    PubMed Central

    Plucienniczak, A; Schroeder, E; Zettlmeissl, G; Streeck, R E

    1985-01-01

    The nucleotide sequence of a 7.6 kb vaccinia DNA segment from a genomic region conserved among different orthopoxviruses has been determined. This segment contains a tight cluster of 12 partly overlapping open reading frames, most of which can be correlated with previously identified early and late proteins and mRNAs. Regulatory signals used by vaccinia virus have been studied. Presumptive promoter regions are rich in A and T and carry the consensus sequences TATA and AATAA spaced at 20-24 base pairs. Tandem repeats of a CTATTC consensus sequence are proposed to be involved in the termination of early transcription. PMID:2987815

  16. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  17. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2012-07-01 2012-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  18. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2014-07-01 2014-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  19. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2011-07-01 2011-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  20. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2013-07-01 2013-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  1. Evolutionary conservation analysis increases the colocalization of predicted exonic splicing enhancers in the BRCA1 gene with missense sequence changes and in-frame deletions, but not polymorphisms

    PubMed Central

    Pettigrew, Christopher; Wayte, Nicola; Lovelock, Paul K; Tavtigian, Sean V; Chenevix-Trench, Georgia; Spurdle, Amanda B; Brown, Melissa A

    2005-01-01

    Introduction Aberrant pre-mRNA splicing can be more detrimental to the function of a gene than changes in the length or nature of the encoded amino acid sequence. Although predicting the effects of changes in consensus 5' and 3' splice sites near intron:exon boundaries is relatively straightforward, predicting the possible effects of changes in exonic splicing enhancers (ESEs) remains a challenge. Methods As an initial step toward determining which ESEs predicted by the web-based tool ESEfinder in the breast cancer susceptibility gene BRCA1 are likely to be functional, we have determined their evolutionary conservation and compared their location with known BRCA1 sequence variants. Results Using the default settings of ESEfinder, we initially detected 669 potential ESEs in the coding region of the BRCA1 gene. Increasing the threshold score reduced the total number to 464, while taking into consideration the proximity to splice donor and acceptor sites reduced the number to 211. Approximately 11% of these ESEs (23/211) either are identical at the nucleotide level in human, primates, mouse, cow, dog and opossum Brca1 (conserved) or are detectable by ESEfinder in the same position in the Brca1 sequence (shared). The frequency of conserved and shared predicted ESEs between human and mouse is higher in BRCA1 exons (2.8 per 100 nucleotides) than in introns (0.6 per 100 nucleotides). Of conserved or shared putative ESEs, 61% (14/23) were predicted to be affected by sequence variants reported in the Breast Cancer Information Core database. Applying the filters described above increased the colocalization of predicted ESEs with missense changes, in-frame deletions and unclassified variants predicted to be deleterious to protein function, whereas they decreased the colocalization with known polymorphisms or unclassified variants predicted to be neutral. 
    Conclusion In this report we show that evolutionary conservation analysis may be used to improve the specificity of an ESE

  2. Prediction of flexible/rigid regions from protein sequences using k-spaced amino acid pairs

    PubMed Central

    Chen, Ke; Kurgan, Lukasz A; Ruan, Jishou

    2007-01-01

    Background Traditionally, it is believed that the native structure of a protein corresponds to a global minimum of its free energy. However, with the growing number of known tertiary (3D) protein structures, researchers have discovered that some proteins can alter their structures in response to a change in their surroundings or with the help of other proteins or ligands. Such structural shifts play a crucial role with respect to the protein function. To this end, we propose a machine learning method for the prediction of the flexible/rigid regions of proteins (referred to as FlexRP); the method is based on a novel sequence representation and feature selection. Knowledge of the flexible/rigid regions may provide insights into the protein folding process and the 3D structure prediction. Results The flexible/rigid regions were defined based on a dataset, which includes protein sequences that have multiple experimental structures, and which was previously used to study the structural conservation of proteins. Sequences drawn from this dataset were represented based on feature sets that were proposed in prior research, such as PSI-BLAST profiles, composition vector and binary sequence encoding, and a newly proposed representation based on frequencies of k-spaced amino acid pairs. These representations were processed by feature selection to reduce the dimensionality. Several machine learning methods for the prediction of flexible/rigid regions and two recently proposed methods for the prediction of conformational changes and unstructured regions were compared with the proposed method. The FlexRP method, which applies Logistic Regression and collocation-based representation with 95 features, obtained 79.5% accuracy. The two runner-up methods, which apply the same sequence representation and Support Vector Machines (SVM) and Naïve Bayes classifiers, obtained 79.2% and 78.4% accuracy, respectively. 
    The remaining considered methods are characterized by accuracies below 70
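
    The k-spaced amino acid pair representation named in this record can be sketched as follows. The abstract does not define the feature precisely, so this example assumes that a k-spaced pair is two residues separated by exactly k intervening positions and that each feature is the pair's relative frequency in the sequence. The function name `k_spaced_pair_freqs` is hypothetical.

```python
from collections import Counter

def k_spaced_pair_freqs(seq, k):
    """Relative frequencies of residue pairs separated by exactly k positions.

    k = 0 gives adjacent pairs (dipeptides); k = 1 skips one residue, and so on.
    """
    n_pairs = len(seq) - k - 1
    if n_pairs <= 0:
        return {}
    counts = Counter(seq[i] + seq[i + k + 1] for i in range(n_pairs))
    return {pair: n / n_pairs for pair, n in counts.items()}

# Toy sequence "ACAC": with k = 1 the pairs are positions (0, 2) and (1, 3).
adjacent = k_spaced_pair_freqs("ACAC", 0)
one_gap = k_spaced_pair_freqs("ACAC", 1)
```

    Concatenating these dictionaries over several values of k would give a fixed-length feature vector (400 entries per k for the 20 standard residues), which a feature-selection step could then reduce.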

  3. QColors: an algorithm for conservative viral quasispecies reconstruction from short and non-contiguous next generation sequencing reads.

    PubMed

    Huang, Austin; Kantor, Rami; DeLong, Allison; Schreier, Leeann; Istrail, Sorin

    Next generation sequencing technologies have recently been applied to characterize mutational spectra of the heterogeneous population of viral genotypes (known as a quasispecies) within HIV-infected patients. Such information is clinically relevant because minority genetic subpopulations of HIV within patients enable viral escape from selection pressures such as the immune response and antiretroviral therapy. However, methods for quasispecies sequence reconstruction from next generation sequencing reads are not yet widely used and remain an emerging area of research. Furthermore, the majority of research methodology in HIV has focused on 454 sequencing, while many next-generation sequencing platforms used in practice are limited to shorter read lengths relative to 454 sequencing. Little work has been done in determining how best to address the read length limitations of other platforms. The approach described here incorporates graph representations of both read differences and read overlap to conservatively determine the regions of the sequence with sufficient variability to separate quasispecies sequences. Within these tractable regions of quasispecies inference, we use constraint programming to solve for an optimal quasispecies subsequence determination via vertex coloring of the conflict graph, a representation which also lends itself to data with non-contiguous reads such as paired-end sequencing. We demonstrate the utility of the method by applying it to simulations based on actual intra-patient clonal HIV-1 sequencing data.
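
    The conflict-graph idea in this record can be illustrated with a small sketch. It is not the QColors implementation: the abstract describes constraint programming for an optimal coloring, whereas this example swaps in a simple greedy coloring. It assumes that reads are (start position, sequence) tuples and that two reads conflict when they overlap and disagree at any shared position. All names are hypothetical.

```python
def reads_conflict(read_a, read_b):
    """True if the two reads overlap and differ at an overlapping position."""
    (start_a, seq_a), (start_b, seq_b) = read_a, read_b
    lo = max(start_a, start_b)
    hi = min(start_a + len(seq_a), start_b + len(seq_b))
    return any(seq_a[p - start_a] != seq_b[p - start_b] for p in range(lo, hi))

def build_conflict_graph(reads):
    """Adjacency sets: an edge joins reads that cannot share a haplotype."""
    adjacency = {i: set() for i in range(len(reads))}
    for i in range(len(reads)):
        for j in range(i + 1, len(reads)):
            if reads_conflict(reads[i], reads[j]):
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency

def greedy_coloring(adjacency):
    """Greedy vertex coloring (highest degree first); not guaranteed optimal."""
    colors = {}
    for v in sorted(adjacency, key=lambda u: -len(adjacency[u])):
        used = {colors[n] for n in adjacency[v] if n in colors}
        color = 0
        while color in used:
            color += 1
        colors[v] = color
    return colors

# Toy reads: 0 and 1 agree where they overlap, as do 2 and 3; the two groups disagree.
toy_reads = [(0, "ACGT"), (2, "GTAA"), (2, "GAAA"), (0, "ACGA")]
toy_graph = build_conflict_graph(toy_reads)
toy_colors = greedy_coloring(toy_graph)
```

    Each color class is a set of mutually compatible reads, so the number of colors gives an estimate of how many distinct sequences are needed to explain the reads. A greedy pass can use more colors than the optimum that a constraint solver would find.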

  4. Deletion of conserved sequences in IG-DMR at Dlk1-Gtl2 locus suggests their involvement in expression of paternally expressed genes in mice

    PubMed Central

    SAITO, Takeshi; HARA, Satoshi; TAMANO, Moe; ASAHARA, Hiroshi; TAKADA, Shuji

    2016-01-01

    Expression regulation of the Dlk1-Dio3 imprinted domain by the intergenic differentially methylated region (IG-DMR) is essential for normal embryonic development in mammals. In this study, we investigated conserved IG-DMR genomic sequences in eutherians to elucidate their role in genomic imprinting of the Dlk1-Dio3 domain. Using a comparative genomics approach, we identified three highly conserved sequences in IG-DMR. To elucidate the functions of these sequences in vivo, we generated mutant mice lacking each of the identified highly conserved sequences using the CRISPR/Cas9 system. Although mutant mice did not exhibit the gross phenotype, deletions of the conserved sequences altered the expression levels of paternally expressed imprinted genes in the mutant embryos without skewing imprinting status. These results suggest that the conserved sequences in IG-DMR are involved in the expression regulation of some of the imprinted genes in the Dlk1-Dio3 domain. PMID:27904015

  5. Deletion of conserved sequences in IG-DMR at Dlk1-Gtl2 locus suggests their involvement in expression of paternally expressed genes in mice.

    PubMed

    Saito, Takeshi; Hara, Satoshi; Tamano, Moe; Asahara, Hiroshi; Takada, Shuji

    2017-02-16

    Expression regulation of the Dlk1-Dio3 imprinted domain by the intergenic differentially methylated region (IG-DMR) is essential for normal embryonic development in mammals. In this study, we investigated conserved IG-DMR genomic sequences in eutherians to elucidate their role in genomic imprinting of the Dlk1-Dio3 domain. Using a comparative genomics approach, we identified three highly conserved sequences in IG-DMR. To elucidate the functions of these sequences in vivo, we generated mutant mice lacking each of the identified highly conserved sequences using the CRISPR/Cas9 system. Although mutant mice did not exhibit the gross phenotype, deletions of the conserved sequences altered the expression levels of paternally expressed imprinted genes in the mutant embryos without skewing imprinting status. These results suggest that the conserved sequences in IG-DMR are involved in the expression regulation of some of the imprinted genes in the Dlk1-Dio3 domain.

  6. Human liver apolipoprotein B-100 cDNA: complete nucleic acid and derived amino acid sequence.

    PubMed Central

    Law, S W; Grant, S M; Higuchi, K; Hospattankar, A; Lackner, K; Lee, N; Brewer, H B

    1986-01-01

    Human apolipoprotein B-100 (apoB-100), the ligand on low density lipoproteins that interacts with the low density lipoprotein receptor and initiates receptor-mediated endocytosis and low density lipoprotein catabolism, has been cloned, and the complete nucleic acid and derived amino acid sequences have been determined. ApoB-100 cDNAs were isolated from normal human liver cDNA libraries utilizing immunoscreening as well as filter hybridization with radiolabeled apoB-100 oligodeoxynucleotides. The apoB-100 mRNA is 14.1 kilobases long encoding a mature apoB-100 protein of 4536 amino acids with a calculated amino acid molecular weight of 512,723. ApoB-100 contains 20 potential glycosylation sites, and 12 of a total of 25 cysteine residues are located in the amino-terminal region of the apolipoprotein providing a potential globular structure of the amino terminus of the protein. ApoB-100 contains relatively few regions of amphipathic helices, but compared to other human apolipoproteins it is enriched in beta-structure. The delineation of the entire human apoB-100 sequence will now permit a detailed analysis of the conformation of the protein, the low density lipoprotein receptor binding domain(s), and the structural relationship between apoB-100 and apoB-48 and will provide the basis for the study of genetic defects in apoB-100 in patients with dyslipoproteinemias. PMID:3464946
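
    The figures quoted in this abstract can be cross-checked with a little arithmetic. The sketch below uses only the numbers stated above (4536 residues, a 14.1 kb mRNA, a calculated molecular weight of 512,723); the variable names are illustrative, and reading the leftover nucleotides as non-coding sequence is an assumption, since the abstract does not break the mRNA down further.

```python
# Figures quoted in the abstract above.
MATURE_RESIDUES = 4536        # amino acids in mature apoB-100
MRNA_LENGTH_NT = 14_100       # "14.1 kilobases"
MOLECULAR_WEIGHT = 512_723    # calculated amino acid molecular weight

# Three nucleotides encode one amino acid.
coding_nt = MATURE_RESIDUES * 3

# Assumption: whatever is left over is untranslated or otherwise
# not part of the mature-protein coding sequence.
remaining_nt = MRNA_LENGTH_NT - coding_nt

# Average mass contributed by each residue.
mean_residue_mass = MOLECULAR_WEIGHT / MATURE_RESIDUES

print(coding_nt, remaining_nt, round(mean_residue_mass, 2))
```

    The mature protein therefore accounts for roughly 13.6 kb of the 14.1 kb message, and the mean residue mass comes out near 113, which is in the range expected for an average protein.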

  7. Computer selection of oligonucleotide probes from amino acid sequences for use in gene library screening.

    PubMed

    Yang, J H; Ye, J H; Wallace, D C

    1984-01-11

    We present a computer program, FINPROBE, which utilizes known amino acid sequence data to deduce minimum redundancy oligonucleotide probes for use in screening cDNA or genomic libraries or in primer extension. The user enters the amino acid sequence of interest, the desired probe length, the number of probes sought, and the constraints on oligonucleotide synthesis. The computer generates a table of possible probes listed in increasing order of redundancy and provides the location of each probe in the protein and mRNA coding sequence. Activation of a next function provides the amino acid and mRNA sequences of each probe of interest as well as the complementary sequence and the minimum dissociation temperature of the probe. A final routine prints out the amino acid sequence of the protein in parallel with the mRNA sequence listing all possible codons for each amino acid.
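
    The core idea of ranking probe sites by redundancy can be sketched as follows. This is not the FINPROBE code, which is not shown in the record; it is a minimal illustration that assumes redundancy is the product of the number of codons for each amino acid in a window, that the probe length is a multiple of three, and that the standard genetic code applies. The function name `rank_probe_sites` and the test peptide are hypothetical.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def rank_probe_sites(protein, probe_nt, n_probes):
    """Return up to n_probes (redundancy, start, peptide) tuples.

    redundancy is the number of distinct oligonucleotides that could
    encode the peptide window; start is the 1-based residue position.
    Results are listed in increasing order of redundancy.
    """
    if probe_nt % 3:
        raise ValueError("this sketch only handles probe lengths divisible by 3")
    window = probe_nt // 3
    sites = []
    for start in range(len(protein) - window + 1):
        peptide = protein[start:start + window]
        redundancy = prod(CODON_COUNT[aa] for aa in peptide)
        sites.append((redundancy, start + 1, peptide))
    sites.sort()
    return sites[:n_probes]


# Hypothetical 10-residue peptide, 15-nt probes, two best sites.
best = rank_probe_sites("MKWFDEMQLS", 15, 2)
print(best)
```

    Windows rich in methionine and tryptophan (one codon each) rank first, while leucine, serine and arginine (six codons each) push a window down the list. A fuller tool would also drop the wobble base of the final codon and apply synthesis constraints, as the abstract describes.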

  8. Two distinct nuclear factors bind the conserved regulatory sequences of a rabbit major histocompatibility complex class II gene.

    PubMed Central

    Sittisombut, N

    1988-01-01

    The constitutive coexpression of the major histocompatibility complex (MHC) class II genes in B lymphocytes requires positive, trans-acting transcriptional factors. The need for these trans-acting factors has been suggested by the reversion of the MHC class II-negative phenotype of rare B-lymphocyte mutants through somatic cell fusion with B cells or T-cell lines. The mechanism by which the trans-acting factors exert their effect on gene transcription is unknown. The possibility that two highly conserved DNA sequences, located 90 to 100 base pairs (bp) (the A sequence) and 60 to 70 bp (the B sequence) upstream of the transcription start site of the class II genes, are recognized by the trans-acting factors was investigated in this study. By using the gel electrophoresis retardation assay, a minimum of two proteins which specifically bound the conserved A or B sequence of a rabbit DP beta gene were identified in murine nuclear extracts of a B-lymphoma cell line, A20-2J. Fractionation of nuclear extract through a heparin-agarose column allowed the identification of one protein, designated NF-MHCIIB, which bound an oligonucleotide containing the B sequence and protected the entire B sequence in the DNase I protection analysis. Another protein, designated NF-MHCIIA, which bound an oligonucleotide containing the A sequence and partially protected the 3' half of this sequence, was also identified. NF-MHCIIB did not protect a CCAAT sequence located 17 bp downstream of the B sequence. The possible relationship between these DNA-binding factors and the trans-acting factors identified in the cell fusion experiments is discussed. Images PMID:3133552

  9. Human IgE-binding protein: A soluble lectin exhibiting a highly conserved interspecies sequence and differential recognition of IgE glycoforms

    SciTech Connect

    Robertson, M.W.; Albrandt, K.; Keller, D.; Liu, Fu-Tong

    1990-09-04

    IgE-binding protein (εBP) refers to a protein originally identified in rat basophilic leukemia cells by virtue of its affinity for IgE. It is now known to be a β-galactoside-binding lectin equivalent to carbohydrate-binding protein 35 (CBP 35). More recently, its identity to Mac-2, a macrophage cell-surface protein, has been established. cDNA coding for human εBP has been cloned from a human HeLa cell cDNA library and contains an open reading frame of 750 base pairs encoding a 250 amino acid protein. Like the rat and murine counterparts, the human εBP amino acid sequence can be divided into two domains with the amino-terminal domain consisting of a highly conserved repetitive sequence (YPGXXXPGA) and the carboxyl-terminal domain containing sequences shared by other S-type lectins. The human εBP sequence exhibits extensive homology to murine and rat εBP with 84% and 82% identity, respectively. The homology is particularly striking in the carboxyl-terminal domain where 95% identity is found between human and murine sequences in a stretch of over 70 amino acids. A survey of εBP mRNA expression from several lymphocyte cell lines revealed that the level of εBP transcription may reflect a relationship between cell differentiation and εBP expression. Finally, human εBP was purified from several human cell lines and shown to possess lactose-binding characteristics and cross-species reactivity to murine IgE. Surprisingly, three different human myeloma IgE proteins did not show reactivity to human εBP. However, after neuraminidase treatment of each human IgE, pronounced binding to εBP was observed, thereby indicating that only specific IgE glycoforms can be recognized by εBP.
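
    A repeat such as YPGXXXPGA, where X stands for any residue, can be located in a protein sequence with a regular expression. The sketch below is illustrative only: the function name `find_repeats` and the test sequence are hypothetical and are not taken from the cloned protein. A lookahead is used so that overlapping copies of the repeat would also be reported.

```python
import re

# YPG, then any three residues, then PGA. The lookahead makes the match
# zero-width so that overlapping repeats are not skipped.
REPEAT = re.compile(r"(?=(YPG[A-Z]{3}PGA))")


def find_repeats(protein):
    """Return (1-based start, matched repeat) for every YPGXXXPGA copy."""
    return [(m.start() + 1, m.group(1)) for m in REPEAT.finditer(protein)]


# Hypothetical toy sequence containing two copies of the repeat.
hits = find_repeats("MADYPGQAPPGAWYPGAPAPGAK")
print(hits)
```

    Run against a real amino-terminal domain, the same pattern would give the number and spacing of the repeats; relaxing individual positions (for example allowing a conservative substitution in PGA) is a matter of editing the character classes.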

  10. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  11. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  12. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  13. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  14. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  15. Deep small RNA sequencing from the nematode Ascaris reveals conservation, functional diversification, and novel developmental profiles.

    PubMed

    Wang, Jianbin; Czech, Benjamin; Crunk, Amanda; Wallace, Adam; Mitreva, Makedonka; Hannon, Gregory J; Davis, Richard E

    2011-09-01

    Eukaryotic cells express several classes of small RNAs that regulate gene expression and ensure genome maintenance. Endogenous siRNAs (endo-siRNAs) and Piwi-interacting RNAs (piRNAs) mainly control gene and transposon expression in the germline, while microRNAs (miRNAs) generally function in post-transcriptional gene silencing in both somatic and germline cells. To provide an evolutionary and developmental perspective on small RNA pathways in nematodes, we identified and characterized known and novel small RNA classes through gametogenesis and embryo development in the parasitic nematode Ascaris suum and compared them with known small RNAs of Caenorhabditis elegans. piRNAs, Piwi-clade Argonautes, and other proteins associated with the piRNA pathway have been lost in Ascaris. miRNAs are synthesized immediately after fertilization in utero, before pronuclear fusion, and before the first cleavage of the zygote. This is the earliest expression of small RNAs ever described at a developmental stage long thought to be transcriptionally quiescent. A comparison of the two classes of Ascaris endo-siRNAs, 22G-RNAs and 26G-RNAs, to those in C. elegans, suggests great diversification and plasticity in the use of small RNA pathways during spermatogenesis in different nematodes. Our data reveal conserved characteristics of nematode small RNAs as well as features unique to Ascaris that illustrate significant flexibility in the use of small RNA pathways, some of which are likely an adaptation to Ascaris' life cycle and parasitism. The transcriptome assembly has been submitted to NCBI Transcriptome Shotgun Assembly Sequence Database (http://www.ncbi.nlm.nih.gov/genbank/TSA.html) under accession numbers JI163767–JI182837 and JI210738–JI257410.

  16. Using Caenorhabditis elegans to Uncover Conserved Functions of Omega-3 and Omega-6 Fatty Acids

    PubMed Central

    Watts, Jennifer L.

    2016-01-01

    The nematode Caenorhabditis elegans is a powerful model organism to study functions of polyunsaturated fatty acids. The ability to alter fatty acid composition with genetic manipulation and dietary supplementation permits the dissection of the roles of omega-3 and omega-6 fatty acids in many biological processes including reproduction, aging and neurobiology. Studies in C. elegans to date have mostly identified overlapping functions of 20-carbon omega-6 and omega-3 fatty acids in reproduction and in neurons; however, specific roles for either omega-3 or omega-6 fatty acids are beginning to emerge. Recent findings with importance to human health include the identification of a conserved Cox-independent prostaglandin synthesis pathway, critical functions for cytochrome P450 derivatives of polyunsaturated fatty acids, the requirements for omega-6 and omega-3 fatty acids in sensory neurons, and the importance of fatty acid desaturation for long lifespan. Furthermore, the ability of C. elegans to interconvert omega-6 to omega-3 fatty acids using the FAT-1 omega-3 desaturase has been exploited in mammalian studies and biotechnology approaches to generate mammals capable of exogenous generation of omega-3 fatty acids. PMID:26848697

  17. Using Caenorhabditis elegans to Uncover Conserved Functions of Omega-3 and Omega-6 Fatty Acids.

    PubMed

    Watts, Jennifer L

    2016-02-02

    The nematode Caenorhabditis elegans is a powerful model organism to study functions of polyunsaturated fatty acids. The ability to alter fatty acid composition with genetic manipulation and dietary supplementation permits the dissection of the roles of omega-3 and omega-6 fatty acids in many biological processes including reproduction, aging and neurobiology. Studies in C. elegans to date have mostly identified overlapping functions of 20-carbon omega-6 and omega-3 fatty acids in reproduction and in neurons; however, specific roles for either omega-3 or omega-6 fatty acids are beginning to emerge. Recent findings with importance to human health include the identification of a conserved Cox-independent prostaglandin synthesis pathway, critical functions for cytochrome P450 derivatives of polyunsaturated fatty acids, the requirements for omega-6 and omega-3 fatty acids in sensory neurons, and the importance of fatty acid desaturation for long lifespan. Furthermore, the ability of C. elegans to interconvert omega-6 to omega-3 fatty acids using the FAT-1 omega-3 desaturase has been exploited in mammalian studies and biotechnology approaches to generate mammals capable of exogenous generation of omega-3 fatty acids.

  18. A conserved 11 nucleotide sequence contains an essential promoter element of the maize mitochondrial atp1 gene.

    PubMed Central

    Rapp, W D; Stern, D B

    1992-01-01

    To determine the structure of a functional plant mitochondrial promoter, we have partially purified an RNA polymerase activity that correctly initiates transcription at the maize mitochondrial atp1 promoter in vitro. Using a series of 5' deletion constructs, we found that essential sequences are located within -19 nucleotides (nt) of the transcription initiation site. The region surrounding the initiation site includes conserved sequence motifs previously proposed to be maize mitochondrial promoter elements. Deletion of a conserved 11 nt sequence showed that it is critical for promoter function, but deletion or alteration of conserved upstream G(A/T)3-4 repeats had no effect. When the atp1 11 nt sequence was inserted into different plasmids lacking mitochondrial promoter activity, transcription was only observed for one of these constructs. We infer from these data that the functional promoter extends beyond this motif, most likely in the 5' direction. The maize mitochondrial cox3 and atp6 promoters also direct transcription initiation in this in vitro system, suggesting that it may be widely applicable for studies of mitochondrial transcription in this species. Images PMID:1372246

  19. Use of a Drosophila Genome-Wide Conserved Sequence Database to Identify Functionally Related cis-Regulatory Enhancers

    PubMed Central

    Brody, Thomas; Yavatkar, Amarendra S; Kuzin, Alexander; Kundu, Mukta; Tyson, Leonard J; Ross, Jermaine; Lin, Tzu-Yang; Lee, Chi-Hon; Awasaki, Takeshi; Lee, Tzumin; Odenwald, Ward F

    2012-01-01

    Background: Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full advantage of the sequence information within enhancer CSCs. Results: We have generated a Drosophila genome-wide database of conserved DNA consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally-restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and analyze their structural organization. Conclusions: cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process. Developmental Dynamics 241:169–189, 2012. © 2011 Wiley Periodicals, Inc. Key findings: A genome-wide catalog of Drosophila conserved DNA sequence clusters. cis-Decoder discovers functionally related enhancers. Functionally related enhancers share balanced sequence element copy numbers. Many enhancers function during multiple phases of development. PMID:22174086

  20. Human retroviruses and AIDS 1996. A compilation and analysis of nucleic acid and amino acid sequences

    SciTech Connect

    Myers, G.; Foley, B.; Korber, B.; Mellors, J.W.; Jeang, K.T.; Wain-Hobson, S.

    1997-04-01

    This compendium and the accompanying floppy diskettes are the result of an effort to compile and rapidly publish all relevant molecular data concerning the human immunodeficiency viruses (HIV) and related retroviruses. The scope of the compendium and database is best summarized by the five parts that it comprises: (1) Nucleic Acid Alignments and Sequences; (2) Amino Acid Alignments; (3) Analysis; (4) Related Sequences; and (5) Database Communications. Information within all the parts is updated throughout the year on the Web site, http://hiv-web.lanl.gov. While this publication could take the form of a review or sequence monograph, it is not so conceived. Instead, the literature from which the database is derived has simply been summarized and some elementary computational analyses have been performed upon the data. Interpretation and commentary have been avoided insofar as possible so that the reader can form his or her own judgments concerning the complex information. In addition to the general descriptions of the parts of the compendium, the user should read the individual introductions for each part.

  1. Transcriptome Sequencing in Response to Salicylic Acid in Salvia miltiorrhiza

    PubMed Central

    Zhang, Xiaoru; Dong, Juane; Liu, Hailong; Wang, Jiao; Qi, Yuexin; Liang, Zongsuo

    2016-01-01

    Salvia miltiorrhiza is a traditional Chinese herbal medicine, whose quality and yield are often affected by diseases and environmental stresses during its growing season. Salicylic acid (SA) plays a significant role in plants responding to biotic and abiotic stresses, but the involved regulatory factors and their signaling mechanisms are largely unknown. In order to identify the genes involved in SA signaling, the RNA sequencing (RNA-seq) strategy was employed to evaluate the transcriptional profiles in S. miltiorrhiza cell cultures. A total of 50,778 unigenes were assembled, in which 5,316 unigenes were differentially expressed among 0-, 2-, and 8-h SA induction. The up-regulated genes were mainly involved in stimulus response and multi-organism process. A core set of candidate novel genes coding SA signaling component proteins was identified. Many transcription factors (e.g., WRKY, bHLH and GRAS) and genes involved in hormone signal transduction were differentially expressed in response to SA induction. Detailed analysis revealed that genes associated with defense signaling, such as antioxidant system genes, cytochrome P450s and ATP-binding cassette transporters, were significantly overexpressed, which can be used as genetic tools to investigate disease resistance. Our transcriptome analysis will help understand SA signaling and its mechanism of defense systems in S. miltiorrhiza. PMID:26808150

  2. The Runt domain of AML1 (RUNX1) binds a sequence-conserved RNA motif that mimics a DNA element

    PubMed Central

    Fukunaga, Junichi; Nomura, Yusuke; Tanaka, Yoichiro; Amano, Ryo; Tanaka, Taku; Nakamura, Yoshikazu; Kawai, Gota; Sakamoto, Taiichi; Kozu, Tomoko

    2013-01-01

    AML1 (RUNX1) is a key transcription factor for hematopoiesis that binds to the Runt-binding double-stranded DNA element (RDE) of target genes through its N-terminal Runt domain. Aberrations in the AML1 gene are frequently found in human leukemia. To better understand AML1 and its potential utility for diagnosis and therapy, we obtained RNA aptamers that bind specifically to the AML1 Runt domain. Enzymatic probing and NMR analyses revealed that Apt1-S, which is a truncated variant of one of the aptamers, has a CACG tetraloop and two stem regions separated by an internal loop. All the isolated aptamers were found to contain the conserved sequence motif 5′-NNCCAC-3′ and 5′-GCGMGN′N′-3′ (M:A or C; N and N′ form Watson–Crick base pairs). The motif contains one AC mismatch and one base bulged out. Mutational analysis of Apt1-S showed that three guanines of the motif are important for Runt binding as are the three guanines of RDE, which are directly recognized by three arginine residues of the Runt domain. Mutational analyses of the Runt domain revealed that the amino acid residues used for Apt1-S binding were similar to those used for RDE binding. Furthermore, the aptamer competed with RDE for binding to the Runt domain in vitro. These results demonstrated that the Runt domain of the AML1 protein binds to the motif of the aptamer that mimics DNA. Our findings should provide new insights into RNA function and utility in both basic and applied sciences. PMID:23709277
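
    The two-part motif described above, 5′-NNCCAC-3′ and 5′-GCGMGN′N′-3′, can be written as a pair of patterns plus a base-pairing check. The sketch below is an illustration under stated assumptions: the pairing register (first N with last N′, second N with first N′, as in an antiparallel stem) is assumed rather than taken from the paper, the function name `matches_motif` is hypothetical, and the test strands are made up.

```python
import re

# Watson-Crick partners in RNA.
WC_PARTNER = {"A": "U", "U": "A", "G": "C", "C": "G"}

# 5'-NNCCAC-3' : two free positions followed by CCAC.
FIVE_PRIME_SIDE = re.compile(r"[ACGU]{2}CCAC")
# 5'-GCGMGN'N'-3' : M is A or C; the last two positions must pair with NN.
THREE_PRIME_SIDE = re.compile(r"GCG[AC]G[ACGU]{2}")


def matches_motif(side_a, side_b):
    """Check two RNA segments (both written 5'->3') against the motif."""
    if not FIVE_PRIME_SIDE.fullmatch(side_a):
        return False
    if not THREE_PRIME_SIDE.fullmatch(side_b):
        return False
    # Assumed antiparallel register: side_a[0] pairs with side_b[6],
    # and side_a[1] pairs with side_b[5].
    return (WC_PARTNER[side_a[0]] == side_b[6]
            and WC_PARTNER[side_a[1]] == side_b[5])


# Hypothetical strands: G pairs with C and A pairs with U, and M = A.
print(matches_motif("GACCAC", "GCGAGUC"))
```

    The check only covers sequence and canonical pairing; it does not model the AC mismatch or the bulged base that the abstract reports within the motif.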

  3. Human retroviruses and AIDS, 1992. A compilation and analysis of nucleic acid and amino acid sequences

    SciTech Connect

    Myers, G.; Korber, B.; Berzofsky, J.A.; Pavlakis, G.N.; Smith, R.F.

    1992-10-01

    This compendium and the accompanying floppy diskettes are the result of an effort to compile and rapidly publish all relevant molecular data concerning the human immunodeficiency viruses (HIV) and related retroviruses. The scope of the compendium and database is best summarized by the five parts that it comprises: (I) HIV and SIV Nucleotide Sequences; (II) Amino Acid Sequences; (III) Analyses; (IV) Related Sequences; and (V) Database Communications. Information within all the parts is updated at least twice in each year, which accounts for the modes of binding and pagination in the compendium. While this publication could take the form of a review or sequence monograph, it is not so conceived. Instead, the literature from which the database is derived has simply been summarized and some elementary computational analyses have been performed upon the data. Interpretation and commentary have been avoided insofar as possible so that the reader can form his or her own judgments concerning the complex information. In addition to the general descriptions below of the parts of the compendium, the user should read the individual introductions for each part.

  4. Dominant sequences of human major histocompatibility complex conserved extended haplotypes from HLA-DQA2 to DAXX.

    PubMed

    Larsen, Charles E; Alford, Dennis R; Trautwein, Michael R; Jalloh, Yanoh K; Tarnacki, Jennifer L; Kunnenkeri, Sushruta K; Fici, Dolores A; Yunis, Edmond J; Awdeh, Zuheir L; Alper, Chester A

    2014-10-01

    We resequenced and phased 27 kb of DNA within 580 kb of the MHC class II region in 158 population chromosomes, most of which were conserved extended haplotypes (CEHs) of European descent or contained their centromeric fragments. We determined the single nucleotide polymorphism and deletion-insertion polymorphism alleles of the dominant sequences from HLA-DQA2 to DAXX for these CEHs. Nine of 13 CEHs remained sufficiently intact to possess a dominant sequence extending at least to DAXX, 230 kb centromeric to HLA-DPB1. We identified the regions centromeric to HLA-DQB1 within which single instances of eight "common" European MHC haplotypes previously sequenced by the MHC Haplotype Project (MHP) were representative of those dominant CEH sequences. Only two MHP haplotypes had a dominant CEH sequence throughout the centromeric and extended class II region and one MHP haplotype did not represent a known European CEH anywhere in the region. We identified the centromeric recombination transition points of other MHP sequences from CEH representation to non-representation. Several CEH pairs or groups shared sequence identity in small blocks but had significantly different (although still conserved for each separate CEH) sequences in surrounding regions. These patterns partly explain strong calculated linkage disequilibrium over only short (tens to hundreds of kilobases) distances in the context of a finite number of observed megabase-length CEHs comprising half a population's haplotypes. Our results provide a clearer picture of European CEH class II allelic structure and population haplotype architecture, improved regional CEH markers, and raise questions concerning regional recombination hotspots.

  5. High-throughput genomic sequencing of cassava bacterial blight strains identifies conserved effectors to target for durable resistance.

    PubMed

    Bart, Rebecca; Cohn, Megan; Kassen, Andrew; McCallum, Emily J; Shybut, Mikel; Petriello, Annalise; Krasileva, Ksenia; Dahlbeck, Douglas; Medina, Cesar; Alicai, Titus; Kumar, Lava; Moreira, Leandro M; Rodrigues Neto, Júlio; Verdier, Valerie; Santana, María Angélica; Kositcharoenkul, Nuttima; Vanderschuren, Hervé; Gruissem, Wilhelm; Bernal, Adriana; Staskawicz, Brian J

    2012-07-10

    Cassava bacterial blight (CBB), incited by Xanthomonas axonopodis pv. manihotis (Xam), is the most important bacterial disease of cassava, a staple food source for millions of people in developing countries. Here we present a widely applicable strategy for elucidating the virulence components of a pathogen population. We report Illumina-based draft genomes for 65 Xam strains and deduce the phylogenetic relatedness of Xam across the areas where cassava is grown. Using an extensive database of effector proteins from animal and plant pathogens, we identify the effector repertoire for each sequenced strain and use a comparative sequence analysis to deduce the least polymorphic of the conserved effectors. These highly conserved effectors have been maintained over 11 countries, three continents, and 70 y of evolution and as such represent ideal targets for developing resistance strategies.

  6. Identification of conserved genomic regions and variation therein amongst Cetartiodactyla species using next generation sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background Next Generation Sequencing has created an opportunity to genetically characterize an individual both inexpensively and comprehensively. In earlier work produced in our collaboration [1], it was demonstrated that, for animals without a reference genome, their Next Generation Sequence data ...

  7. Genome-wide discovery and differential regulation of conserved and novel microRNAs in chickpea via deep sequencing.

    PubMed

    Jain, Mukesh; Chevala, V V S Narayana; Garg, Rohini

    2014-11-01

    MicroRNAs (miRNAs) are essential components of complex gene regulatory networks that orchestrate plant development. Although several genomic resources have been developed for the legume crop chickpea, miRNAs have not been discovered until now. For genome-wide discovery of miRNAs in chickpea (Cicer arietinum), we sequenced the small RNA content from seven major tissues/organs employing Illumina technology. About 154 million reads were generated, which represented more than 20 million distinct small RNA sequences. We identified a total of 440 conserved miRNAs in chickpea based on sequence similarity with known miRNAs in other plants. In addition, 178 novel miRNAs were identified using a miRDeep pipeline with plant-specific scoring. Some of the conserved and novel miRNAs with significant sequence similarity were grouped into families. The chickpea miRNAs targeted a wide range of mRNAs involved in diverse cellular processes, including transcriptional regulation (transcription factors), protein modification and turnover, signal transduction, and metabolism. Our analysis revealed several miRNAs with differential spatial expression. Many of the chickpea miRNAs were expressed in a tissue-specific manner. The conserved and differential expression of members of the same miRNA family in different tissues was also observed. Some of the same family members were predicted to target different chickpea mRNAs, which suggested the specificity and complexity of miRNA-mediated developmental regulation. This study, for the first time, reveals a comprehensive set of conserved and novel miRNAs along with their expression patterns and putative targets in chickpea, and provides a framework for understanding regulation of developmental processes in legumes.

  8. Purification of a marsupial insulin: amino-acid sequence of insulin from the eastern grey kangaroo Macropus giganteus.

    PubMed

    Treacy, G B; Shaw, D C; Griffiths, M E; Jeffrey, P D

    1989-03-24

    Insulin has been purified from kangaroo pancreas by acidic ethanol extraction, diethyl ether precipitation and gel filtration. The amino-acid sequence of this, the first marsupial insulin to be studied, is reported. It differs from human insulin by only four amino-acid substitutions, all in regions of the molecule previously known to be variable. However, it should be noted that one of these, asparagine for threonine at A8, has not been reported before. Computer comparisons of all 43 insulin sequences reported to date with kangaroo insulin show it to be most closely related to a group of mammalian insulins (dog, pig, cow, human) known to be of high biological potency. The measurement of blood glucose lowering in the rabbit by kangaroo insulin is consistent with this conclusion. Comparisons of amino-acid sequences of other proteins with their kangaroo counterparts show a greater difference, in line with the time of divergence of marsupials. The limited differences observed in insulin and cytochrome c suggest that their structures need to be closely conserved in order to maintain function.

  9. Completion of the amino acid sequence of the alpha 1 chain from type I calf skin collagen. Amino acid sequence of alpha 1(I)B8.

    PubMed Central

    Glanville, R W; Breitkreutz, D; Meitinger, M; Fietzek, P P

    1983-01-01

    The complete amino acid sequence of the 279-residue CNBr peptide CB8 from the alpha 1 chain of type I calf skin collagen is presented. It was determined by sequencing overlapping fragments of CB8 produced by Staphylococcus aureus V8 proteinase, trypsin, Endoproteinase Arg-C and hydroxylamine. Tryptic cleavages were also made specific for lysine by blocking arginine residues with cyclohexane-1,2-dione. This completes the amino acid sequence analysis of the 1054-residue-long alpha 1(I) chain of calf skin collagen. PMID:6354180

  10. Conserved sequence-specific lincRNA-steroid receptor interactions drive transcriptional repression and direct cell fate

    SciTech Connect

    Hudson, William H.; Pickard, Mark R.; de Vera, Ian Mitchelle S.; Kuiper, Emily G.; Mourtada-Maarabouni, Mirna; Conn, Graeme L.; Kojetin, Douglas J.; Williams, Gwyn T.; Ortlund, Eric A.

    2014-12-23

    The majority of the eukaryotic genome is transcribed, generating a significant number of long intergenic noncoding RNAs (lincRNAs). Although lincRNAs represent the most poorly understood product of transcription, recent work has shown lincRNAs fulfill important cellular functions. In addition to low sequence conservation, poor understanding of structural mechanisms driving lincRNA biology hinders systematic prediction of their function. Here we report the molecular requirements for the recognition of steroid receptors (SRs) by the lincRNA growth arrest-specific 5 (Gas5), which regulates steroid-mediated transcriptional regulation, growth arrest and apoptosis. We identify the functional Gas5-SR interface and generate point mutations that ablate the SR-Gas5 lincRNA interaction, altering Gas5-driven apoptosis in cancer cell lines. Further, we find that the Gas5 SR-recognition sequence is conserved among haplorhines, with its evolutionary origin as a splice acceptor site. This study demonstrates that lincRNAs can recognize protein targets in a conserved, sequence-specific manner in order to affect critical cell functions.

  11. Identification and characterization of novel and conserved microRNAs in radish (Raphanus sativus L.) using high-throughput sequencing.

    PubMed

    Xu, Liang; Wang, Yan; Xu, Yuanyuan; Wang, Liangju; Zhai, Lulu; Zhu, Xianwen; Gong, Yiqin; Ye, Shan; Liu, Liwang

    2013-03-01

    MicroRNAs (miRNAs) are endogenous, non-coding, small RNAs that play significant regulatory roles in plant growth, development, and biotic and abiotic stress responses. To date, a great number of conserved and species-specific miRNAs have been identified in many important plant species such as Arabidopsis, rice and poplar. However, little is known about identification of miRNAs and their target genes in radish (Raphanus sativus L.). In the present study, a small RNA library from radish root was constructed and sequenced using the high-throughput Solexa sequencing. Through sequence alignment and secondary structure prediction, a total of 545 conserved miRNA families as well as 15 novel (with their miRNA* strand) and 64 potentially novel miRNAs were identified. Quantitative real-time PCR (qRT-PCR) analysis confirmed that both conserved and novel miRNAs were expressed in radish, and some of them were preferentially expressed in certain tissues. A total of 196 potential target genes were predicted for 42 novel radish miRNAs. Gene ontology (GO) analysis showed that most of the targets were involved in plant growth, development, metabolism and stress responses. This study represents a first large-scale identification and characterization of radish miRNAs and their potential target genes. These results could lead to the further identification of radish miRNAs and enhance our understanding of radish miRNA regulatory mechanisms in diverse biological and metabolic processes.

  12. Regulation of ABCC6 trafficking and stability by a conserved C-terminal PDZ-like sequence.

    PubMed

    Xue, Peng; Crum, Chelsea M; Thibodeau, Patrick H

    2014-01-01

    Mutations in the ABCC6 ABC-transporter are causative of pseudoxanthoma elasticum (PXE). The loss of functional ABCC6 protein in the basolateral membrane of the kidney and liver is putatively associated with altered secretion of a circulatory factor. As a result, systemic changes in elastic tissues are caused by progressive mineralization and degradation of elastic fibers. Premature arteriosclerosis, loss of skin and vascular tone, and a progressive loss of vision result from this ectopic mineralization. However, the identity of the circulatory factor and the specific role of ABCC6 in disease pathophysiology are not known. Though recessive loss-of-function alleles are associated with alterations in ABCC6 expression and function, the molecular pathologies associated with the majority of PXE-causing mutations are also not known. Sequence analysis of orthologous ABCC6 proteins indicates the C-terminal sequences are highly conserved and share high similarity to the PDZ sequences found in other ABCC subfamily members. Genetic testing of PXE patients suggests that at least one disease-causing mutation is located in a PDZ-like sequence at the extreme C-terminus of the ABCC6 protein. To evaluate the role of this C-terminal sequence in the biosynthesis and trafficking of ABCC6, a series of mutations were utilized to probe changes in ABCC6 biosynthesis, membrane stability and turnover. Removal of this PDZ-like sequence resulted in decreased steady-state ABCC6 levels, decreased cell surface expression and stability, and mislocalization of the ABCC6 protein in polarized cells. These data suggest that the conserved, PDZ-like sequence promotes the proper biosynthesis and trafficking of the ABCC6 protein.

  13. An Integrated Sequence-Structure Database incorporating matching mRNA sequence, amino acid sequence and protein three-dimensional structure data.

    PubMed Central

    Adzhubei, I A; Adzhubei, A A; Neidle, S

    1998-01-01

    We have constructed a non-homologous database, termed the Integrated Sequence-Structure Database (ISSD) which comprises the coding sequences of genes, amino acid sequences of the corresponding proteins, their secondary structure and straight phi,psi angles assignments, and polypeptide backbone coordinates. Each protein entry in the database holds the alignment of nucleotide sequence, amino acid sequence and the PDB three-dimensional structure data. The nucleotide and amino acid sequences for each entry are selected on the basis of exact matches of the source organism and cell environment. The current version 1.0 of ISSD is available on the WWW at http://www.protein.bio.msu.su/issd/ and includes 107 non-homologous mammalian proteins, of which 80 are human proteins. The database has been used by us for the analysis of synonymous codon usage patterns in mRNA sequences showing their correlation with the three-dimensional structure features in the encoded proteins. Possible ISSD applications include optimisation of protein expression, improvement of the protein structure prediction accuracy, and analysis of evolutionary aspects of the nucleotide sequence-protein structure relationship. PMID:9399866

  14. Nucleotide sequence conservation of novel and established cis-regulatory sites within the tyrosine hydroxylase gene promoter

    PubMed Central

    Wang, Meng; Banerjee, Kasturi; Baker, Harriet; Cave, John W.

    2015-01-01

    Tyrosine hydroxylase (TH) is the rate-limiting enzyme in catecholamine biosynthesis and its gene proximal promoter (<1 kb upstream from the transcription start site) is essential for regulating transcription in both the developing and adult nervous systems. Several putative regulatory elements within the TH proximal promoter have been reported, but evolutionary conservation of these elements has not been thoroughly investigated. Since many vertebrate species are used to model development, function and disorders of human catecholaminergic neurons, identifying evolutionarily conserved transcription regulatory mechanisms is a high priority. In this study, we align TH proximal promoter nucleotide sequences from several vertebrate species to identify evolutionarily conserved motifs. This analysis identified three elements (a TATA box, cyclic AMP response element (CRE) and a 5′-GGTGG-3′ site) that constitute the core of an ancient vertebrate TH promoter. Focusing on only eutherian mammals, two regions of high conservation within the proximal promoter were identified: a ∼250 bp region adjacent to the transcription start site and a ∼85 bp region located approximately 350 bp further upstream. Within both regions, conservation of previously reported cis-regulatory motifs and human single nucleotide variants was evaluated. Transcription reporter assays in a TH-expressing cell line demonstrated the functionality of highly conserved motifs in the proximal promoter regions and electromobility shift assays showed that brain-region specific complexes assemble on these motifs. These studies also identified a non-canonical CRE binding (CREB) protein recognition element in the proximal promoter. Together, these studies provide a detailed analysis of evolutionary conservation within the TH promoter and identify potential cis-regulatory motifs that underlie a core set of regulatory mechanisms in mammals. PMID:25774193
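
    As a rough illustration of how a short site such as 5′-GGTGG-3′ might be located in a promoter sequence, the sketch below scans both strands for exact matches. It is not the alignment method used in the study; the function names and the toy sequence are hypothetical, and real analyses would typically work on multi-species alignments rather than a single string.

```python
# Minimal sketch: find exact occurrences of a short motif on both strands.
# The motif GGTGG comes from the abstract; everything else is illustrative.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq: str, motif: str = "GGTGG") -> list[tuple[int, str]]:
    """Return sorted (0-based start, strand) pairs for every exact match.

    A '-' strand hit means the motif is present on the reverse strand,
    reported at its position on the forward-strand coordinates.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)  # allow overlapping hits
    return sorted(hits)


# Hypothetical toy sequence, not a real TH promoter fragment.
toy_promoter = "TTGGTGGAACCACCTT"
sites = find_sites(toy_promoter)
print(sites)
```

    Exact matching is the simplest possible choice here; a position weight matrix would be the usual next step when degenerate sites need to be scored.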

  15. Complete amino acid sequence and structure characterization of the taste-modifying protein, miraculin.

    PubMed

    Theerasilp, S; Hitotsuya, H; Nakajo, S; Nakaya, K; Nakamura, Y; Kurihara, Y

    1989-04-25

    The taste-modifying protein, miraculin, has the unusual property of modifying sour taste into sweet taste. The complete amino acid sequence of miraculin purified from miracle fruits by a newly developed method (Theerasilp, S., and Kurihara, Y. (1988) J. Biol. Chem. 263, 11536-11539) was determined by an automatic Edman degradation method. Miraculin was a single polypeptide with 191 amino acid residues. The calculated molecular weight based on the amino acid sequence and the carbohydrate content (13.9%) was 24,600. Asn-42 and Asn-186 were linked N-glycosidically to carbohydrate chains. High homology was found between the amino acid sequences of miraculin and soybean trypsin inhibitor.
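
    The figures quoted above can be checked for internal consistency with a short calculation. The sketch assumes that the 13.9% carbohydrate content is a mass fraction of the 24,600 total; the abstract does not state this explicitly, and the function names are illustrative.

```python
# Consistency check on the miraculin figures quoted in the abstract.
# Assumption: carbohydrate content (13.9%) is a fraction of the total mass.

TOTAL_MW = 24_600              # calculated molecular weight, glycosylated (Da)
CARBOHYDRATE_FRACTION = 0.139  # 13.9% carbohydrate
N_RESIDUES = 191               # amino acid residues in the polypeptide


def polypeptide_mass(total_mw: float, carb_fraction: float) -> float:
    """Mass attributable to the polypeptide chain alone."""
    return total_mw * (1.0 - carb_fraction)


def mean_residue_mass(total_mw: float, carb_fraction: float, n_residues: int) -> float:
    """Average mass per amino acid residue implied by the figures."""
    return polypeptide_mass(total_mw, carb_fraction) / n_residues


protein_mass = polypeptide_mass(TOTAL_MW, CARBOHYDRATE_FRACTION)
per_residue = mean_residue_mass(TOTAL_MW, CARBOHYDRATE_FRACTION, N_RESIDUES)
print(f"polypeptide mass ~ {protein_mass:.0f} Da")
print(f"mean residue mass ~ {per_residue:.1f} Da")
```

    Under that assumption the polypeptide accounts for roughly 21,200 Da, or about 111 Da per residue, which is close to the commonly used average of about 110 Da per amino acid residue.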

  16. Characterization of an Unusually Conserved AluI Highly Reiterated DNA Sequence Family from the Honeybee, Apis mellifera

    PubMed Central

    Tares, S.; Cornuet, J. M.; Abad, P.

    1993-01-01

    An AluI family of highly reiterated nontranscribed sequences has been found in the genome of the honeybee Apis mellifera. This repeated sequence is shown to be present at approximately 23,000 copies per haploid genome constituting about 2% of the total genomic DNA. The nucleotide sequence of 10 monomers was determined. The consensus sequence is 176 nucleotides long and has an A + T content of 58%. There are clusters of both direct and inverted repeats. Internal subrepeating units ranging from 11 to 17 nucleotides are observed, suggesting that it could have evolved from a shorter sequence. DNA sequence data reveal that this repeat class is unusually homogeneous compared to the other class of invertebrate highly reiterated DNA sequences. The average pairwise sequence divergence between the repeats is 2.5%. In spite of this unusual homogeneity, divergence has been found in the repeated sequence hybridization ladder between four different honeybee subspecies. Therefore, the AluI highly reiterated sequences provide a new probe for fingerprinting in A. m. mellifera. PMID:8104160
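
    The copy number, monomer length and genome fraction reported above imply an approximate haploid genome size, and the 2.5% divergence figure is an average over pairwise comparisons. The sketch below works through both; the helper names and the toy monomers are hypothetical, and the divergence function assumes pre-aligned sequences of equal length with no gaps.

```python
from itertools import combinations

# Figures taken from the abstract.
COPIES = 23_000          # approximate copies per haploid genome
MONOMER_LEN = 176        # consensus length in nucleotides
GENOME_FRACTION = 0.02   # "about 2%" of total genomic DNA

repeat_bp = COPIES * MONOMER_LEN
implied_genome_bp = repeat_bp / GENOME_FRACTION
print(f"repeat DNA ~ {repeat_bp / 1e6:.2f} Mb")
print(f"implied haploid genome ~ {implied_genome_bp / 1e6:.0f} Mb")


def pairwise_divergence(a: str, b: str) -> float:
    """Fraction of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_pairwise_divergence(seqs: list[str]) -> float:
    """Average divergence over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(pairwise_divergence(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy monomers, far shorter than the real 176-nt repeat.
toy_monomers = ["ACGTACGT", "ACGTACGA", "ACGTTCGT"]
print(f"toy mean divergence = {mean_pairwise_divergence(toy_monomers):.3f}")
```

    The implied genome size is only as precise as the rounded inputs ("approximately 23,000" and "about 2%"), so it should be read as an order-of-magnitude check rather than a measurement.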

  17. Detection and isolation of nucleic acid sequences using a bifunctional hybridization probe

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2000-01-01

    A method for detecting and isolating a target sequence in a sample of nucleic acids is provided using a bifunctional hybridization probe capable of hybridizing to the target sequence that includes a detectable marker and a first complexing agent capable of forming a binding pair with a second complexing agent. A kit is also provided for detecting a target sequence in a sample of nucleic acids using a bifunctional hybridization probe according to this method.

  18. Biochemical Roles for Conserved Residues in the Bacterial Fatty Acid-binding Protein Family

    PubMed Central

    Broussard, Tyler C.; Miller, Darcie J.; Jackson, Pamela; Nourse, Amanda; White, Stephen W.; Rock, Charles O.

    2016-01-01

    Fatty acid kinase (Fak) is a ubiquitous Gram-positive bacterial enzyme consisting of an ATP-binding protein (FakA) that phosphorylates the fatty acid bound to FakB. In Staphylococcus aureus, Fak is a global regulator of virulence factor transcription and is essential for the activation of exogenous fatty acids for incorporation into phospholipids. The 1.2-Å x-ray structure of S. aureus FakB2, activity assays, solution studies, site-directed mutagenesis, and in vivo complementation were used to define the functions of the five conserved residues that define the FakB protein family (Pfam02645). The fatty acid tail is buried within the protein, and the exposed carboxyl group is bound by a Ser-93-fatty acid carboxyl-Thr-61-His-266 hydrogen bond network. The guanidinium of the invariant Arg-170 is positioned to potentially interact with a bound acylphosphate. The reduced thermal denaturation temperatures of the T61A, S93A, and H266A FakB2 mutants illustrate the importance of the hydrogen bond network in protein stability. The FakB2 T61A, S93A, and H266A mutants are 1000-fold less active in the Fak assay, and the R170A mutant is completely inactive. All FakB2 mutants form FakA(FakB2)2 complexes except FakB2(R202A), which is deficient in FakA binding. Allelic replacement shows that strains expressing FakB2 mutants are defective in fatty acid incorporation into phospholipids and virulence gene transcription. These conserved residues are likely to perform the same critical functions in all bacterial fatty acid-binding proteins. PMID:26774272

  19. The complete mitochondrial genome sequence of the liverwort Pleurozia purpurea reveals extremely conservative mitochondrial genome evolution in liverworts.

    PubMed

    Wang, Bin; Xue, Jiayu; Li, Libo; Liu, Yang; Qiu, Yin-Long

    2009-12-01

    Plant mitochondrial genomes have been known to be highly unusual in their large sizes, frequent intra-genomic rearrangement, and generally conservative sequence evolution. Recent studies show that in early land plants the mitochondrial genomes exhibit a mixed mode of conservative yet dynamic evolution. Here, we report the completely sequenced mitochondrial genome from the liverwort Pleurozia purpurea. The circular genome has a size of 168,526 base pairs, containing 43 protein-coding genes, 3 rRNA genes, 25 tRNA genes, and 31 group I or II introns. It differs from the Marchantia polymorpha mitochondrial genome, the only other liverwort chondriome that has been sequenced, in lacking two genes (trnRucg and trnTggu) and one intron (rrn18i1065gII). The two genomes have identical gene orders and highly similar sequences in exons, introns, and intergenic spacers. Finally, a comparative analysis of duplicated trnRucu and other trnR genes from the two liverworts and several other organisms identified the recent lateral origin of trnRucg in Marchantia mtDNA through modification of a duplicated trnRucu. This study shows that the mitochondrial genomes evolve extremely slowly in liverworts, the earliest-diverging lineage of extant land plants, in stark contrast to what is known of highly dynamic evolution of mitochondrial genomes in seed plants.

  20. Structure and sequence conservation of hao cluster genes of autotrophic ammonia-oxidizing bacteria: evidence for their evolutionary history.

    PubMed

    Bergmann, David J; Hooper, Alan B; Klotz, Martin G

    2005-09-01

    Comparison of the organization and sequence of the hao (hydroxylamine oxidoreductase) gene clusters from the gammaproteobacterial autotrophic ammonia-oxidizing bacterium (aAOB) Nitrosococcus oceani and the betaproteobacterial aAOB Nitrosospira multiformis and Nitrosomonas europaea revealed a highly conserved gene cluster encoding the following proteins: hao, hydroxylamine oxidoreductase; orf2, a putative protein; cycA, cytochrome c(554); and cycB, cytochrome c(m)(552). The deduced protein sequences of HAO, c(554), and c(m)(552) were highly similar in all aAOB despite their differences in species evolution and codon usage. Phylogenetic inference revealed a broad family of multi-c-heme proteins, including HAO, the pentaheme nitrite reductase, and tetrathionate reductase. The c-hemes of this group also have a nearly identical geometry of heme orientation, which has remained conserved during divergent evolution of function. High sequence similarity is also seen within a protein family, including cytochromes c(m)(552), NrfH/B, and NapC/NirT. It is proposed that the hydroxylamine oxidation pathway evolved from a nitrite reduction pathway involved in anaerobic respiration (denitrification) during the radiation of the Proteobacteria. Conservation of the hydroxylamine oxidation module was maintained by functional pressure, and the module expanded into two separate narrow taxa after a lateral gene transfer event between gamma- and betaproteobacterial ancestors of extant aAOB. HAO-encoding genes were also found in six non-aAOB, either singly or tandemly arranged with an orf2 gene, whereas a c(554) gene was lacking. The conservation of the hao gene cluster in general and the uniqueness of the c(554) gene in particular make it a suitable target for the design of primers and probes useful for molecular ecology approaches to detect aAOB.

  1. Structure and Sequence Conservation of hao Cluster Genes of Autotrophic Ammonia-Oxidizing Bacteria: Evidence for Their Evolutionary History

    PubMed Central

    Bergmann, David J.; Hooper, Alan B.; Klotz, Martin G.

    2005-01-01

    Comparison of the organization and sequence of the hao (hydroxylamine oxidoreductase) gene clusters from the gammaproteobacterial autotrophic ammonia-oxidizing bacterium (aAOB) Nitrosococcus oceani and the betaproteobacterial aAOB Nitrosospira multiformis and Nitrosomonas europaea revealed a highly conserved gene cluster encoding the following proteins: hao, hydroxylamine oxidoreductase; orf2, a putative protein; cycA, cytochrome c554; and cycB, cytochrome cm552. The deduced protein sequences of HAO, c554, and cm552 were highly similar in all aAOB despite their differences in species evolution and codon usage. Phylogenetic inference revealed a broad family of multi-c-heme proteins, including HAO, the pentaheme nitrite reductase, and tetrathionate reductase. The c-hemes of this group also have a nearly identical geometry of heme orientation, which has remained conserved during divergent evolution of function. High sequence similarity is also seen within a protein family, including cytochromes cm552, NrfH/B, and NapC/NirT. It is proposed that the hydroxylamine oxidation pathway evolved from a nitrite reduction pathway involved in anaerobic respiration (denitrification) during the radiation of the Proteobacteria. Conservation of the hydroxylamine oxidation module was maintained by functional pressure, and the module expanded into two separate narrow taxa after a lateral gene transfer event between gamma- and betaproteobacterial ancestors of extant aAOB. HAO-encoding genes were also found in six non-aAOB, either singly or tandemly arranged with an orf2 gene, whereas a c554 gene was lacking. The conservation of the hao gene cluster in general and the uniqueness of the c554 gene in particular make it a suitable target for the design of primers and probes useful for molecular ecology approaches to detect aAOB. PMID:16151127

  2. High-throughput sequencing discovery of conserved and novel microRNAs in Chinese cabbage (Brassica rapa L. ssp. pekinensis).

    PubMed

    Wang, Fengde; Li, Libin; Liu, Lifeng; Li, Huayin; Zhang, Yihui; Yao, Yingyin; Ni, Zhongfu; Gao, Jianwei

    2012-07-01

    MicroRNAs (miRNAs) are a class of 21-24 nucleotide non-coding RNAs that down-regulate gene expression by cleaving or inhibiting the translation of target gene transcripts. miRNAs have been extensively analyzed in a few model plant species such as Arabidopsis, rice and Populus, and partially investigated in other non-model plant species. However, only a few conserved miRNAs have been identified in Chinese cabbage, a common and economically important crop in Asia. To identify novel and conserved miRNAs in Chinese cabbage (Brassica rapa L. ssp. pekinensis) we constructed a small RNA library. Using high-throughput Solexa sequencing to identify microRNAs we found 11,210 unique sequences belonging to 321 conserved miRNA families and 228 novel miRNAs. We ran a Blast search with these sequences against the Chinese cabbage mRNA database and found 2,308 and 736 potential target genes for 221 conserved and 125 novel miRNAs, respectively. The BlastX search against the Arabidopsis genome and GO analysis suggested most of the targets were involved in plant growth, metabolism, development and stress response. This study provides the first large-scale cloning and characterization of Chinese cabbage miRNAs and their potential targets. These miRNAs add to the growing database of new miRNAs, prompt further study on Chinese cabbage miRNA regulation mechanisms, and help toward a greater understanding of the important roles of miRNAs in Chinese cabbage.

  3. Cloning, sequence analysis, and expression in Escherichia coli of the gene encoding an alpha-amino acid ester hydrolase from Acetobacter turbidans.

    PubMed

    Polderman-Tijmes, Jolanda J; Jekel, Peter A; de Vries, Erik J; van Merode, Annet E J; Floris, René; van der Laan, Jan-Metske; Sonke, Theo; Janssen, Dick B

    2002-01-01

    The alpha-amino acid ester hydrolase from Acetobacter turbidans ATCC 9325 is capable of hydrolyzing and synthesizing beta-lactam antibiotics, such as cephalexin and ampicillin. N-terminal amino acid sequencing of the purified alpha-amino acid ester hydrolase allowed cloning and genetic characterization of the corresponding gene from an A. turbidans genomic library. The gene, designated aehA, encodes a polypeptide with a molecular weight of 72,000. Comparison of the determined N-terminal sequence and the deduced amino acid sequence indicated the presence of an N-terminal leader sequence of 40 amino acids. The aehA gene was subcloned in the pET9 expression plasmid and expressed in Escherichia coli. The recombinant protein was purified and found to be dimeric with subunits of 70 kDa. A sequence similarity search revealed 26% identity with a glutaryl 7-ACA acylase precursor from Bacillus laterosporus, but no homology was found with other known penicillin or cephalosporin acylases. There was some similarity to serine proteases, including the conservation of the active site motif, GXSYXG. Together with database searches, this suggested that the alpha-amino acid ester hydrolase is a beta-lactam antibiotic acylase that belongs to a class of hydrolases that is different from the Ntn hydrolase superfamily to which the well-characterized penicillin acylase from E. coli belongs. The alpha-amino acid ester hydrolase of A. turbidans represents a subclass of this new class of beta-lactam antibiotic acylases.
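
    A pattern such as GXSYXG, where X stands for any amino acid, maps naturally onto a regular expression. The sketch below is one way to scan a protein sequence for it; the function name and the toy sequences are hypothetical and are not taken from the aehA gene product.

```python
import re

# X in the motif is read as "any of the 20 standard amino acids".
ANY_RESIDUE = "[ACDEFGHIKLMNPQRSTVWY]"
# A lookahead is used so that overlapping occurrences are also reported.
MOTIF = re.compile(f"(?=(G{ANY_RESIDUE}SY{ANY_RESIDUE}G))")


def find_gxsyxg(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched residues) for each motif hit."""
    protein = protein.upper()
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(protein)]


# Hypothetical toy sequences for illustration only.
print(find_gxsyxg("MKTAGGSYGGLLA"))
print(find_gxsyxg("MKTAGGSAYGLLA"))
```

    Positions are reported 1-based, following the usual convention for residue numbering in protein sequences.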

  4. Phospho-N-Acetyl-Muramyl-Pentapeptide Translocase from Escherichia coli: Catalytic Role of Conserved Aspartic Acid Residues

    PubMed Central

    Lloyd, Adrian J.; Brandish, Philip E.; Gilbey, Andrea M.; Bugg, Timothy D. H.

    2004-01-01

    Phospho-N-acetyl-muramyl-pentapeptide translocase (translocase 1) catalyzes the first of a sequence of lipid-linked steps that ultimately assemble the peptidoglycan layer of the bacterial cell wall. This essential enzyme is the target of several natural product antibiotics and has recently been the focus of antimicrobial drug discovery programs. The catalytic mechanism of translocase 1 is believed to proceed via a covalent intermediate formed between phospho-N-acetyl-muramyl-pentapeptide and a nucleophilic amino acid residue. Amino acid sequence alignments of the translocase 1 family and members of the related transmembrane phosphosugar transferase superfamily revealed only three conserved residues that possess nucleophilic side chains: the aspartic acid residues D115, D116, and D267. Here we report the expression and partial purification of Escherichia coli translocase 1 as a C-terminal hexahistidine (C-His6) fusion protein. Three enzymes with the site-directed mutations D115N, D116N, and D267N were constructed, expressed, and purified as C-His6 fusions. Enzymatic analysis established that all three mutations eliminated translocase 1 activity, and this finding verified the essential role of these residues. By analogy with the structural environment of the double aspartate motif found in prenyl transferases, we propose a model whereby D115 and D116 chelate a magnesium ion that coordinates with the pyrophosphate bridge of the UDP-N-acetyl-muramyl-pentapeptide substrate and in which D267 therefore fulfills the role of the translocase 1 active-site nucleophile. PMID:14996806

  5. The amino-acid sequence of the 2S sulphur-rich proteins from seeds of Brazil nut (Bertholletia excelsa H.B.K.).

    PubMed

    Ampe, C; Van Damme, J; de Castro, L A; Sampaio, M J; Van Montagu, M; Vandekerckhove, J

    1986-09-15

    Storage proteins of the albumin solubility fraction from seeds of Bertholletia excelsa H.B.K. were separated by reversed-phase high-performance liquid chromatography and their primary structures were determined by gas-phase sequencing on intact polypeptides and on the overlapping tryptic and thermolysin peptides. The 2S storage proteins consist of two subunits linked by disulphide bridges. The large subunit (8.5 kDa) is expressed in at least six different isoforms while the small subunit (3.6 kDa) consists of only one form. These proteins are extremely rich in glutamine, glutamic acid, arginine and the sulphur-containing amino acids cysteine and methionine. One of the variants even contains a sequence of six methionine residues in a row. Comparison with known sequences of 2S proteins of other dicotyledonous plants shows limited but distinct sequence homology. In particular, the positions of the cysteine residues relative to each other appear to be completely conserved, suggesting that tertiary structure constraints imposed by disulphide bridges dominate sequence conservation. It has been proposed that the two subunits of a related protein (the Brassica napus storage protein) are cleaved from a precursor polypeptide [Crouch, M. L., Tenbarge, K. M., Simon, A. E. & Ferl, R. (1983) J. Mol. Appl. Genet. 2, 273-283]. The amino acid sequence homology of the Brazil nut protein with the former suggests that a similar protein processing event could occur.

  6. Identification and profiling of conserved and novel microRNAs involved in oil and oleic acid production during embryogenesis in Carya cathayensis Sarg.

    PubMed

    Wang, Zhengjia; Huang, Ruiming; Sun, Zhichao; Zhang, Tong; Huang, Jianqin

    2017-01-11

    MicroRNAs (miRNAs) are important regulators of plant development and fruit formation. Mature embryos of hickory (Carya cathayensis Sarg.) nuts contain more than 70% oil (comprising 90% unsaturated fatty acids), along with a substantial amount of oleic acid. To understand the roles of miRNAs involved in oil and oleic acid production during hickory embryogenesis, three small RNA libraries from different stages of embryogenesis were constructed. Deep sequencing of these three libraries identified 95 conserved miRNAs with 19 miRNA*s, 7 novel miRNAs (as well as their corresponding miRNA*s), and 26 potentially novel miRNAs. The analysis identified 15 miRNAs involved in oil and oleic acid production that are differentially expressed during embryogenesis in hickory. Among them, nine miRNA sequences, including eight conserved and one novel, were confirmed by qRT-PCR. In addition, 145 target genes of the novel miRNAs were predicted using a bioinformatic approach. Our results provide a framework for better understanding the roles of miRNAs during embryogenesis in hickory.

  7. Sequence-Based Screening for Rare Enzymes: New Insights into the World of AMDases Reveal a Conserved Motif and 58 Novel Enzymes Clustering in Eight Distinct Families

    PubMed Central

    Maimanakos, Janine; Chow, Jennifer; Gaßmeyer, Sarah K.; Güllert, Simon; Busch, Florian; Kourist, Robert; Streit, Wolfgang R.

    2016-01-01

    Arylmalonate Decarboxylases (AMDases, EC 4.1.1.76) are very rare and mostly underexplored enzymes. Currently only four known and biochemically characterized representatives exist. However, their ability to decarboxylate α-disubstituted malonic acid derivatives to optically pure products without cofactors makes them attractive and promising candidates for the use as biocatalysts in industrial processes. Until now, AMDases could not be separated from other members of the aspartate/glutamate racemase superfamily based on their gene sequences. Within this work, a search algorithm was developed that enables a reliable prediction of AMDase activity for potential candidates. Based on specific sequence patterns and screening methods 58 novel AMDase candidate genes could be identified in this work. Thereby, AMDases with the conserved sequence pattern of Bordetella bronchiseptica’s prototype appeared to be limited to the classes of Alpha-, Beta-, and Gamma-proteobacteria. Amino acid homologies and comparison of gene surrounding sequences enabled the classification of eight enzyme clusters. Particularly striking is the accumulation of genes coding for different transporters of the tripartite tricarboxylate transporters family, TRAP transporters and ABC transporters as well as genes coding for mandelate racemases/muconate lactonizing enzymes that might be involved in substrate uptake or degradation of AMDase products. Further, three novel AMDases were characterized which showed a high enantiomeric excess (>99%) of the (R)-enantiomer of flurbiprofen. These are the recombinant AmdA and AmdV from Variovorax sp. strains HH01 and HH02, originated from soil, and AmdP from Polymorphum gilvum found by a data base search. Altogether our findings give new insights into the class of AMDases and reveal many previously unknown enzyme candidates with high potential for bioindustrial processes. PMID:27610105

  8. Lariat sequencing in a unicellular yeast identifies regulated alternative splicing of exons that are evolutionarily conserved with humans.

    PubMed

    Awan, Ali R; Manfredo, Amanda; Pleiss, Jeffrey A

    2013-07-30

    Alternative splicing is a potent regulator of gene expression that vastly increases proteomic diversity in multicellular eukaryotes and is associated with organismal complexity. Although alternative splicing is widespread in vertebrates, little is known about the evolutionary origins of this process, in part because of the absence of phylogenetically conserved events that cross major eukaryotic clades. Here we describe a lariat-sequencing approach, which offers high sensitivity for detecting splicing events, and its application to the unicellular fungus, Schizosaccharomyces pombe, an organism that shares many of the hallmarks of alternative splicing in mammalian systems but for which no previous examples of exon-skipping had been demonstrated. Over 200 previously unannotated splicing events were identified, including examples of regulated alternative splicing. Remarkably, an evolutionary analysis of four of the exons identified here as subject to skipping in S. pombe reveals high sequence conservation and perfect length conservation with their homologs in scores of plants, animals, and fungi. Moreover, alternative splicing of two of these exons has been documented in multiple vertebrate organisms, making these the first demonstrations of identical alternative-splicing patterns in species that are separated by over 1 billion years of evolution.

  9. The sequence and antiapoptotic functional domains of the human cytomegalovirus UL37 exon 1 immediate early protein are conserved in multiple primary strains.

    PubMed

    Hayajneh, W A; Colberg-Poley, A M; Skaletskaya, A; Bartle, L M; Lesperance, M M; Contopoulos-Ioannidis, D G; Kedersha, N L; Goldmacher, V S

    2001-01-05

    The human cytomegalovirus UL37 exon 1 gene encodes the immediate early protein pUL37x1 that has antiapoptotic and regulatory activities. Deletion mutagenesis analysis of the open reading frame of UL37x1 identified two domains that are necessary and sufficient for its antiapoptotic activity. These domains are confined within the segments between amino acids 5 to 34, and 118 to 147, respectively. The first domain provides the targeting of the protein to mitochondria. Direct PCR sequencing of UL37 exon 1 amplified from 26 primary strains of human cytomegalovirus demonstrated that the promoter, polyadenylation signal, and the two segments of pUL37x1 required for its antiapoptotic function were invariant in all sequenced strains and identical to those in AD169 pUL37x1. In total, UL37 exon 1 varies between 0.0 and 1.6% at the nucleotide level from strain AD169. Only 11 amino acids were found to vary in one or more viral strains, and these variations occurred only in the domains of pUL37x1 dispensable for its antiapoptotic function. We infer from this remarkable conservation of pUL37x1 in primary strains that this protein and, probably, its antiapoptotic function are required for productive replication of human cytomegalovirus in humans.

  10. Structure-sequence based analysis for identification of conserved regions in proteins

    DOEpatents

    Zemla, Adam T; Zhou, Carol E; Lam, Marisa W; Smith, Jason R; Pardes, Elizabeth

    2013-05-28

    Disclosed are computational methods, and associated hardware and software products, for scoring conservation in a protein structure based on a computationally identified family or cluster of protein structures. A method of computationally identifying a family or cluster of protein structures is also disclosed herein.

  11. Trichomonas vaginalis acidic phospholipase A2: isolation and partial amino acid sequence.

    PubMed

    Escobedo-Guajardo, Brenda L; González-Salazar, Francisco; Palacios-Corona, Rebeca; Torres de la Cruz, Víctor M; Morales-Vallarta, Mario; Mata-Cárdenas, Benito D; Garza-González, Jesús N; Rivera-Silva, Gerardo; Vargas-Villarreal, Javier

    2013-12-01

    Sexually transmitted diseases are a major cause of acute disease worldwide, and trichomoniasis is the most common and curable disease, generating more than 170 million cases annually worldwide. Trichomonas vaginalis is the causal agent of trichomoniasis and has the ability to destroy in vitro cell monolayers of the vaginal mucosa, where the phospholipases A2 (PLA2) have been reported as potential virulence factors. These enzymes have been partially characterized from the subcellular fraction S30 of pathogenic T. vaginalis strains. The main objective of this study was to purify a phospholipase A2 from T. vaginalis, make a partial characterization, obtain a partial amino acid sequence, and determine its enzymatic participation as a hemolytic factor causing lysis of erythrocytes. Trichomonas S30, RF30 and UFF30 sub-fractions from the GT-15 strain have the capacity to hydrolyze [2-(14)C-PA]-PC at pH 6.0. Proteins from the UFF30 sub-fraction were separated by affinity chromatography into two eluted fractions with detectable PLA2 activity. The EDTA-eluted fraction was analyzed by HPLC using on-line HPLC-tandem mass spectrometry and two protein peaks were observed at 8.2 and 13 kDa. Peptide sequences were identified from the proteins present in the eluted EDTA UFF30 fraction; bioinformatic analysis using Protein Link Global Server charged with T. vaginalis protein database suggests that eluted peptides correspond to a putative ubiquitin protein in the 8.2 kDa fraction and a phospholipase preserved in the 13 kDa fraction. The EDTA-eluted fraction hydrolyzed [2-(14)C-PA]-PC and lysed erythrocytes from Sprague-Dawley rats in a time- and dose-dependent manner. The acidic hemolytic activity decreased by 84% with the addition of 100 μM of Rosenthal's inhibitor.

  12. A conserved patch of hydrophobic amino acids modulates Myb activity by mediating protein-protein interactions.

    PubMed

    Dukare, Sandeep; Klempnauer, Karl-Heinz

    2016-07-01

    The transcription factor c-Myb plays a key role in the control of proliferation and differentiation in hematopoietic progenitor cells and has been implicated in the development of leukemia and certain non-hematopoietic tumors. c-Myb activity is highly dependent on the interaction with the coactivator p300 which is mediated by the transactivation domain of c-Myb and the KIX domain of p300. We have previously observed that conservative valine-to-isoleucine amino acid substitutions in a conserved stretch of hydrophobic amino acids have a profound effect on Myb activity. Here, we have explored the function of the hydrophobic region as a mediator of protein-protein interactions. We show that the hydrophobic region facilitates Myb self-interaction and binding of the histone acetyl transferase Tip60, a previously identified Myb interacting protein. We show that these interactions are affected by the valine-to-isoleucine amino acid substitutions and suppress Myb activity by interfering with the interaction of Myb and the KIX domain of p300. Taken together, our work identifies the hydrophobic region in the Myb transactivation domain as a binding site for homo- and heteromeric protein interactions and leads to a picture of the c-Myb transactivation domain as a composite protein binding region that facilitates interdependent protein-protein interactions of Myb with regulatory proteins.

  13. L-Rhamnose-binding lectin from eggs of the Echinometra lucunter: Amino acid sequence and molecular modeling.

    PubMed

    Carneiro, Rômulo Farias; Teixeira, Claudener Souza; de Melo, Arthur Alves; de Almeida, Alexandra Sampaio; Cavada, Benildo Sousa; de Sousa, Oscarina Viana; da Rocha, Bruno Anderson Matias; Nagano, Celso Shiniti; Sampaio, Alexandre Holanda

    2015-01-01

    An L-rhamnose-binding lectin named ELEL was isolated from eggs of the rock boring sea urchin Echinometra lucunter by affinity chromatography on lactosyl-agarose. ELEL is a homodimer linked by a disulfide bond with subunits of 11 kDa each. The new lectin was inhibited by saccharides possessing the same configuration of hydroxyl groups at C-2 and C-4, such as L-rhamnose, melibiose, galactose and lactose. The amino acid sequence of ELEL was determined by tandem mass spectrometry. The ELEL subunit has 103 amino acids, including nine cysteine residues involved in four conserved intrachain disulfide bonds and one interchain disulfide bond. The full sequence of ELEL presents conserved motifs commonly found in rhamnose-binding lectins, including YGR, DPC and KYL. A three-dimensional model of ELEL was created, and molecular docking revealed favorable binding energies for interactions between ELEL and rhamnose, melibiose and Gb3 (Galα1-4Galβ1-4Glcβ1-Cer). Furthermore, ELEL was able to agglutinate Gram-positive bacterial cells, suggesting its ability to recognize pathogens.

  14. Conserved hypothetical protein Rv1977 in Mycobacterium tuberculosis strains contains sequence polymorphisms and might be involved in ongoing immune evasion.

    PubMed

    Jiang, Yi; Liu, Haican; Wang, Xuezhi; Li, Guilian; Qiu, Yan; Dou, Xiangfeng; Wan, Kanglin

    2015-01-01

    Host immune pressure and associated parasite immune evasion are key features of host-pathogen co-evolution. A previous study showed that human T cell epitopes of Mycobacterium tuberculosis are evolutionarily hyperconserved and thus it was deduced that M. tuberculosis lacks antigenic variation and immune evasion. Here, we selected 151 clinical Mycobacterium tuberculosis isolates from China, amplified the gene encoding Rv1977 and compared the sequences. The results showed that Rv1977, a conserved hypothetical protein, is not conserved in M. tuberculosis strains and that polymorphisms exist in the protein. Some mutations, especially one frameshift mutation, occurred in the antigen Rv1977, which is uncommon in M. tuberculosis strains and may alter the function of the protein. Mutations and deletions in the gene all affect one of three T cell epitopes, and the changed T cell epitope contained more than one variable position, which may suggest ongoing immune evasion.

  15. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-03-24

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration. 14 figs.

  16. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration.

  17. Species identification using genetic tools: the value of nuclear and mitochondrial gene sequences in whale conservation.

    PubMed

    Palumbi, S R; Cipriano, F

    1998-01-01

    DNA sequence analysis is a powerful tool for identifying the source of samples thought to be derived from threatened or endangered species. Analysis of mitochondrial DNA (mtDNA) from retail whale meat markets has shown consistently that the expected baleen whale in these markets, the minke whale, makes up only about half the products analyzed. The other products are either unregulated small toothed whales like dolphins or are protected baleen whales such as humpback, Bryde's, fin, or blue whales. Independent verification of such mtDNA identifications requires analysis of nuclear genetic loci, but this is technically more difficult than standard mtDNA sequencing. In addition, evolution of species-specific sequences (i.e., fixation of sequence differences to produce reciprocally monophyletic gene trees) is slower in nuclear than in mitochondrial genes primarily because genetic drift is slower at nuclear loci. When will use of nuclear sequences allow forensic DNA identification? Comparison of neutral theories of coalescence of mitochondrial and nuclear loci suggests a simple rule of thumb. The "three-times rule" suggests that phylogenetic sorting at nuclear loci is likely to produce species-specific sequences when mitochondrial alleles are reciprocally monophyletic and the branches leading to the mtDNA sequences of a species are three times longer than the average difference observed within species. A preliminary test of the three-times rule, which depends on many assumptions about the species and genes involved, suggests that blue and fin whales should have species-specific sequences at most neutral nuclear loci, whereas humpback and fin whales should show species-specific sequences at fewer nuclear loci. Partial sequences of actin introns from these species confirm the predictions of the three-times rule and show that blue and fin whales are reciprocally monophyletic at this locus. These intron sequences are thus good tools for the identification of these species.
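    The "three-times rule" stated in this abstract can be written as a simple check. The sketch below is only an illustration of the rule as worded above: the function name, the parameter names and the numeric inputs are hypothetical, and reciprocal monophyly is taken as a yes/no input instead of being inferred from a tree.

```python
def passes_three_times_rule(
    branch_length: float,
    mean_within_species_diff: float,
    reciprocally_monophyletic: bool,
    factor: float = 3.0,
) -> bool:
    """Rule of thumb: mtDNA alleles are reciprocally monophyletic AND the branch
    leading to a species' mtDNA sequences is at least `factor` times the average
    within-species difference."""
    if branch_length < 0 or mean_within_species_diff < 0:
        raise ValueError("distances must be non-negative")
    long_enough = branch_length >= factor * mean_within_species_diff
    return reciprocally_monophyletic and long_enough


# Hypothetical distances (substitutions per site), not values from the paper.
print(passes_three_times_rule(0.06, 0.01, True))   # long branch, low diversity
print(passes_three_times_rule(0.02, 0.01, True))   # branch too short
```

    Both distances must be measured in the same units, and the abstract itself notes that the rule rests on many assumptions about the species and genes involved.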

  18. The amino acid sequence of protein CM-3 from Dendroaspis polylepis polylepis (black mamba) venom.

    PubMed

    Joubert, F J

    1985-01-01

    Protein CM-3 from Dendroaspis polylepis polylepis venom was purified by gel filtration and ion exchange chromatography. It comprises 65 amino acids including eight half-cystines. The complete amino acid sequence of protein CM-3 has been elucidated. The sequence (residues 1-50) resembles that of the N-terminal sequence of the subunits of a synergistic type protein and residues 51-65 that of the C-terminal sequence of an angusticeps type protein. Mixtures of protein CM-3 and angusticeps type proteins showed no apparent synergistic effect, in that their toxicity in combination was no greater than the sum of their individual toxicities.

  19. The amino acid sequences of the Fd fragments of two human γ heavy chains

    PubMed Central

    Press, E. M.; Hogg, N. M.

    1970-01-01

    The amino acid sequences of the Fd fragments of two human pathological immunoglobulins of the immunoglobulin G1 class are reported. Comparison of the two sequences shows that the heavy-chain variable regions are similar in length to those of the light chains. The existence of heavy chain variable region subgroups is also deduced, from a comparison of these two sequences with those of another γ 1 chain, Eu, a μ chain, Ou, and the partial sequence of a fourth γ 1 chain, Ste. Carbohydrate has been found to be linked to an aspartic acid residue in the variable region of one of the γ 1 chains, Cor. PMID:5449120

  20. Proteaselike sequence in hepatitis B virus core antigen is not required for e antigen generation and may not be part of an aspartic acid-type protease.

    PubMed Central

    Nassal, M; Galle, P R; Schaller, H

    1989-01-01

    The hepatitis B virus (HBV) C gene directs the synthesis of two major gene products: HBV core antigen (HBcAg[p21c]), which forms the nucleocapsid, and HBV e antigen (HBeAg [p17e]), a secreted antigen that is produced by several processing events during its maturation. These proteins contain an amino acid sequence similar to the active-site residues of aspartic acid and retroviral proteases. On the basis of this sequence similarity, which is highly conserved among mammalian hepadnaviruses, a model has been put forward according to which processing to HBeAg is due to self-cleavage of p21c involving the proteaselike sequence. Using site-directed mutagenesis in conjunction with transient expression of HBV proteins in the human hepatoma cell line HepG2, we tested this hypothesis. Our results with HBV mutants in which one or two of the conserved amino acids have been replaced by others suggest strongly that processing to HBeAg does not depend on the presence of an intact proteaselike sequence in the core protein. Attempts to detect an influence of this sequence on the processing of HBV P gene products into enzymatically active viral polymerase also gave no conclusive evidence for the existence of an HBV protease. Mutations replacing the putatively essential aspartic acid showed little effect on polymerase activity. Additional substitution of the likewise conserved threonine residue by alanine, in contrast, almost abolished the activity of the polymerase. We conclude that an HBV protease, if it exists, is functionally different from aspartic acid and retroviral proteases. PMID:2657101

  1. The amino acid sequence of goat beta-lactoglobulin.

    PubMed

    Préaux, G; Braunitzer, G; Schrank, B; Stangl, A

    1979-11-01

    The isolation of beta-lactoglobulin from milk of the goat is described. The purified protein was checked for purity and has been characterized by its gross composition and end groups. The native or the modified protein was then degraded by tryptic and cyanogen bromide cleavage. The cleavage products were isolated and sequenced in the sequenator using a Quadrol and propyne program. These data provide the complete sequence of beta-lactoglobulin of the goat. The results are discussed and compared particularly with bovine beta-lactoglobulin components AB. Some biological aspects are described.

  2. Conservation of nucleotide sequences for molecular diagnosis of Middle East respiratory syndrome coronavirus, 2015.

    PubMed

    Furuse, Yuki; Okamoto, Michiko; Oshitani, Hitoshi

    2015-11-01

    Infection due to the Middle East respiratory syndrome coronavirus (MERS-CoV) is widespread. The present study was performed to assess the protocols used for the molecular diagnosis of MERS-CoV by analyzing the nucleotide sequences of viruses detected between 2012 and 2015, including sequences from the large outbreak in eastern Asia in 2015. Although the diagnostic protocols were established only 2 years ago, mismatches between the sequences of primers/probes and viruses were found for several of the assays. Such mismatches could lead to a lower sensitivity of the assay, thereby leading to false-negative diagnosis. A slight modification in the primer design is suggested. Protocols for the molecular diagnosis of viral infections should be reviewed regularly after they are established, particularly for viruses that pose a great threat to public health such as MERS-CoV.
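    The kind of primer/target mismatch check described in this abstract can be sketched as a sliding-window comparison. This is an assumed, simplified approach and not the authors' protocol: it scans one strand only, ignores degenerate bases and reverse-complement binding, and the sequences are invented toy data.

```python
def count_mismatches(primer: str, site: str) -> int:
    """Number of positions at which a primer differs from an equal-length site."""
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    return sum(p != s for p, s in zip(primer.upper(), site.upper()))


def best_binding_site(primer: str, genome: str) -> tuple[int, int]:
    """Return (0-based position, mismatches) of the window that matches best."""
    if not primer or len(primer) > len(genome):
        raise ValueError("primer must be non-empty and no longer than the genome")
    width = len(primer)
    hits = (
        (pos, count_mismatches(primer, genome[pos:pos + width]))
        for pos in range(len(genome) - width + 1)
    )
    return min(hits, key=lambda hit: hit[1])


# Toy data (hypothetical, not MERS-CoV sequences).
genome = "ACGTTGCAAGGCTTACGATCGATT"
primer = "GGCTTACCATCG"
position, mismatches = best_binding_site(primer, genome)
print(f"best site at {position} with {mismatches} mismatch(es)")
```

    Running such a check against newly deposited sequences is one way to carry out the regular protocol review the abstract recommends.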

  3. Layered materials with coexisting acidic and basic sites for catalytic one-pot reaction sequences.

    PubMed

    Motokura, Ken; Tada, Mizuki; Iwasawa, Yasuhiro

    2009-06-17

    Acidic montmorillonite-immobilized primary amines (H-mont-NH(2)) were found to be excellent acid-base bifunctional catalysts for one-pot reaction sequences, making them the first materials with coexisting acid and base sites active for acid-base tandem reactions. For example, tandem deacetalization-Knoevenagel condensation proceeded successfully with the H-mont-NH(2), affording the corresponding condensation product in a quantitative yield. The acidity of the H-mont-NH(2) was strongly influenced by the preparation solvent, and the base-catalyzed reactions were enhanced by interlayer acid sites.

  4. Histone-dependent IgG conservation in octanoic acid precipitation and its mechanism.

    PubMed

    Chen, Quan; Toh, Phyllicia; Sun, Yue; Latiff, Sarah Maria Abdul; Hoi, Aina; Xian, Mo; Zhang, Haibo; Nian, Rui; Zhang, Wei; Gagnon, Pete

    2016-12-01

    Octanoic acid (OA) precipitation has long been used in protein purification. Recently, we reported a new cell culture clarification method for immunoglobulin G (IgG) purification, employing an advance elimination of chromatin heteroaggregates with a hybrid OA-solid phase system. This treatment reduced DNA more than 3 logs, histone below the detection limit (LOD), and non-histone host cell proteins (nh-HCP) by 90 % while conserving more than 90 % of the IgG monomer. In this study, we further investigated the conservation of IgG monomer and antibody light chain (LC) to the addition of OA/OA-solid phase complex, with or without histone and DNA in different combinations. The results showed that highly basic histone protein was the prime target in OA/OA-solid phase precipitation system for IgG purification, and the selective conservation of IgG monomer in this system was histone dependent. Our findings partially support the idea that OA works by sticking to electropositive hydrophobic domains on proteins, reducing their solubility, and causing them to agglomerate into large particles that precipitate from solution. Our findings also provide a new perspective for IgG purification and emphasize the necessity to re-examine the roles of various host contaminants in IgG purification.

  5. Synthesis of gamma,delta-unsaturated glycolic acids via sequenced Brook and Ireland-Claisen rearrangements.

    PubMed

    Schmitt, Daniel C; Johnson, Jeffrey S

    2010-03-05

    Organozinc, -magnesium, and -lithium nucleophiles initiate a Brook/Ireland-Claisen rearrangement sequence of allylic silyl glyoxylates resulting in the formation of gamma,delta-unsaturated alpha-silyloxy acids.

  6. Computer Simulation of the Determination of Amino Acid Sequences in Polypeptides

    ERIC Educational Resources Information Center

    Daubert, Stephen D.; Sontum, Stephen F.

    1977-01-01

    Describes a computer program that generates a random string of amino acids and guides the student in determining the correct sequence of a given protein by using experimental analytic data for that protein. (MLH)

  7. Length heterogeneity at conserved sequence block 2 in human mitochondrial DNA acts as a rheostat for RNA polymerase POLRMT activity

    PubMed Central

    Tan, Benedict G.; Wellesley, Frederick C.; Savery, Nigel J.; Szczelkun, Mark D.

    2016-01-01

    The guanine (G)-tract of conserved sequence block 2 (CSB 2) in human mitochondrial DNA can result in transcription termination due to formation of a hybrid G-quadruplex between the nascent RNA and the nontemplate DNA strand. This structure can then influence genome replication, stability and localization. Here we surveyed the frequency of variation in sequence identity and length at CSB 2 amongst human mitochondrial genomes and used in vitro transcription to assess the effects of this length heterogeneity on the activity of the mitochondrial RNA polymerase, POLRMT. In general, increased G-tract length correlated with increased termination levels. However, variation in the population favoured CSB 2 sequences which produced efficient termination while particularly weak or strong signals were avoided. For all variants examined, the 3′ end of the transcripts mapped to the same downstream sequences and were prevented from terminating by addition of the transcription factor TEFM. We propose that CSB 2 length heterogeneity allows variation in the efficiency of transcription termination without affecting the position of the products or the capacity for regulation by TEFM. PMID:27436287
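    Surveying length heterogeneity at a G-tract starts with measuring runs of consecutive guanines. The sketch below shows one assumed way to do that with a regular expression; the function name, the `min_len` threshold and the toy sequence are hypothetical and are not taken from the study.

```python
import re


def g_tract_lengths(seq: str, min_len: int = 2) -> list[int]:
    """Lengths of all runs of at least `min_len` consecutive G residues, in order."""
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    pattern = re.compile("G{%d,}" % min_len)
    return [len(match.group()) for match in pattern.finditer(seq.upper())]


# Toy sequence (hypothetical): two G-runs separated by a single A.
toy_csb2 = "CA" + "G" * 6 + "A" + "G" * 8 + "TC"
print(g_tract_lengths(toy_csb2))
```

    Applying the same function to many mitochondrial genomes and tabulating the results would give a frequency distribution of tract lengths.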

  8. Complete sequence of the mitochondrial DNA in the sea urchin Arbacia lixula: conserved features of the echinoid mitochondrial genome.

    PubMed

    De Giorgi, C; Martiradonna, A; Lanave, C; Saccone, C

    1996-04-01

    The complete nucleotide sequence (15,719 nucleotides) of the mitochondrial DNA (mtDNA) from the sea urchin Arbacia lixula is presented. The comparison of gene arrangement between different echinoderm orders of the same class provides evidence that the gene organization is conserved within the same echinoderm class. The peculiarities of sea urchin mtDNA features, already described, are confirmed by the A. lixula mtDNA sequence. The comparison of the entire sequences of mtDNA among A. lixula, Paracentrotus lividus, and Strongylocentrotus purpuratus allowed us to detect peculiar features, common to the three sea urchin species, that can represent the molecular signature of the mt genome in the sea urchin group. Analysis of the nucleotide composition indicates that A. lixula mtDNA, in contrast with the mtDNA of other sea urchins, shows a bias in the use of T and tends to avoid the use of C, most evident in the neutral part of the molecule, such as the third codon positions. This observation indicates that the three sea urchin mtDNAs evolve under different mutation pressure. Analysis of the sequence evolution allowed us to confirm the phylogenetic tree. However, the absolute divergence time, calculated on the basis of paleontological estimates, largely diverged from the expected one.
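    The compositional bias described here is measured at third codon positions. A minimal sketch of that calculation for a single in-frame coding sequence follows, assuming the sequence starts at codon position 1 and contains only A, C, G and T; the function name and the toy sequence are illustrative.

```python
from collections import Counter


def third_position_composition(cds: str) -> dict[str, float]:
    """Fraction of each base at third codon positions of an in-frame sequence."""
    cds = cds.upper()
    if not cds or len(cds) % 3 != 0:
        raise ValueError("sequence must be non-empty and a multiple of 3 in length")
    thirds = cds[2::3]  # every third base, starting from codon position 3
    counts = Counter(thirds)
    return {base: counts[base] / len(thirds) for base in "ACGT"}


# Toy coding sequence (hypothetical): codons ATT GCT TTA GGT.
print(third_position_composition("ATTGCTTTAGGT"))
```

    A high T fraction and a low C fraction at these positions would correspond to the bias the abstract reports for A. lixula.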

  9. Evolution of ITS1 rDNA in the Digenea (Platyhelminthes: Trematoda): 3' end sequence conservation and its phylogenetic utility.

    PubMed

    vd Schulenburg, J H; Englisch, U; Wägele, J W

    1999-01-01

    A comparison of ribosomal internal transcribed spacer 1 (ITS1) elements of digenetic trematodes (Platyhelminthes) including unidentified digeneans isolated from Cyathura carinata (Crustacea: Isopoda) revealed DNA sequence similarities at more than half of the spacer at its 3' end. Primary sequence similarity was shown to be associated with secondary structure conservation, which suggested that similarity is due to identity by descent and not chance. Using an analysis of apomorphies, the sequence data were shown to produce a distinct phylogenetic signal. This was confirmed by the consistency of results of different tree reconstruction methods such as distance approaches, maximum parsimony, and maximum likelihood. Morphological evidence additionally supported the phylogenetic tree based on ITS1 data and the inferred phylogenetic position of the unidentified digeneans of C. carinata met the expectations from known trematode life-cycle patterns. Although ribosomal ITS1 elements are generally believed to be too variable for phylogenetic analysis above the species or genus level, the overall consistency of the results of this study strongly suggests that this is not the case in digenetic trematodes. Here, 3' end ITS1 sequence data seem to provide a valuable tool for elucidating phylogenetic relationships of a broad range of phylogenetically distinct taxa.

  10. Conservation of the C-type lectin fold for massive sequence variation in a Treponema diversity-generating retroelement

    SciTech Connect

    Le Coq, Johanne; Ghosh, Partho

    2012-06-19

    Anticipatory ligand binding through massive protein sequence variation is rare in biological systems, having been observed only in the vertebrate adaptive immune response and in a phage diversity-generating retroelement (DGR). Earlier work has demonstrated that the prototypical DGR variable protein, major tropism determinant (Mtd), meets the demands of anticipatory ligand binding by novel means through the C-type lectin (CLec) fold. However, because of the low sequence identity among DGR variable proteins, it has remained unclear whether the CLec fold is a general solution for DGRs. We have addressed this problem by determining the structure of a second DGR variable protein, TvpA, from the pathogenic oral spirochete Treponema denticola. Despite its weak sequence identity to Mtd (~16%), TvpA was found to also have a CLec fold, with predicted variable residues exposed in a ligand-binding site. However, this site in TvpA was markedly more variable than the one in Mtd, reflecting the unprecedented potential variability of TvpA, approximately 10^20. In addition, similarity between TvpA and Mtd with formylglycine-generating enzymes was detected. These results provide strong evidence for the conservation of the formylglycine-generating enzyme-type CLec fold among DGRs as a means of accommodating massive sequence variation.

  11. Genome sequence of the acid-tolerant strain Rhizobium sp. LPU83.

    PubMed

    Wibberg, Daniel; Tejerizo, Gonzalo Torres; Del Papa, María Florencia; Martini, Carla; Pühler, Alfred; Lagares, Antonio; Schlüter, Andreas; Pistorio, Mariano

    2014-04-20

    Rhizobia are important members of the soil microbiome since they enter into nitrogen-fixing symbiosis with different legume host plants. Rhizobium sp. LPU83 is an acid-tolerant Rhizobium strain featuring a broad-host-range. However, it is ineffective in nitrogen fixation. Here, the improved draft genome sequence of this strain is reported. Genome sequence information provides the basis for analysis of its acid tolerance, symbiotic properties and taxonomic classification.

  12. Coupling DNA-binding and ATP hydrolysis in Escherichia coli RecQ: role of a highly conserved aromatic-rich sequence.

    PubMed

    Zittel, Morgan C; Keck, James L

    2005-01-01

    RecQ enzymes are broadly conserved Superfamily-2 (SF-2) DNA helicases that play critical roles in DNA metabolism. RecQ proteins use the energy of ATP hydrolysis to drive DNA unwinding; however, the mechanisms by which RecQ links ATPase activity to DNA-binding/unwinding are unknown. In many Superfamily-1 (SF-1) DNA helicases, helicase sequence motif III links these activities by binding both single-stranded (ss) DNA and ATP. However, the ssDNA-binding aromatic-rich element in motif III present in these enzymes is missing from SF-2 helicases, raising the question of how these enzymes link ATP hydrolysis to DNA-binding/unwinding. We show that Escherichia coli RecQ contains a conserved aromatic-rich loop in its helicase domain between motifs II and III. Although placement of the RecQ aromatic-rich loop is topologically distinct relative to the SF-1 enzymes, both loops map to similar tertiary structural positions. We examined the functions of the E.coli RecQ aromatic-rich loop using RecQ variants with single amino acid substitutions within the segment. Our results indicate that the aromatic-rich loop in RecQ is critical for coupling ATPase and DNA-binding/unwinding activities. Our studies also suggest that RecQ's aromatic-rich loop might couple ATP hydrolysis to DNA-binding in a mechanistically distinct manner from SF-1 helicases.

  13. A conserved unusual posttranscriptional processing mediated by short, direct repeated (SDR) sequences in plants.

    PubMed

    Niu, Xiangli; Luo, Di; Gao, Shaopei; Ren, Guangjun; Chang, Lijuan; Zhou, Yuke; Luo, Xiaoli; Li, Yuxiang; Hou, Pei; Tang, Wei; Lu, Bao-Rong; Liu, Yongsheng

    2010-01-01

    In several stress-responsive gene loci of monocot cereal crops, we have previously identified an unusual posttranscriptional processing mediated by the paired presence of short direct repeated (SDR) sequences at 5' and 3' splicing junctions that are distinct from conventional (U2/U12-type) splicing boundaries. By using the known SDR-containing sequences as probes, 24 plant candidate genes involved in diverse functional pathways from both monocots and dicots that potentially possess SDR-mediated posttranscriptional processing were predicted in the GenBank database. The SDR-mediated posttranscriptional processing events, including cis- and trans-actions, were experimentally detected in the majority of the predicted candidates. Extensive sequence analysis demonstrates several types of SDR-associated splicing peculiarities including partial exon deletion, exon fragment repetition, exon fragment scrambling and trans-splicing that result in either loss of partial exon or unusual exonic sequence rearrangements within or between RNA molecules. In addition, we show that the paired presence of SDR is necessary but not sufficient in SDR-mediated splicing in transient expression and stable transformation systems. We also show that prokaryotes are incapable of SDR-mediated pre-mRNA splicing.

  14. The amino acid sequence of monal pheasant lysozyme and its activity.

    PubMed

    Araki, T; Matsumoto, T; Torikata, T

    1998-10-01

    The amino acid sequence of monal pheasant lysozyme and its activity were analyzed. Carboxymethylated lysozyme was digested with trypsin and the resulting peptides were sequenced. The established amino acid sequence had one amino acid substitution at position 102 (Arg to Gly) compared with Indian peafowl lysozyme and four amino acid substitutions at positions 3 (Phe to Tyr), 15 (His to Leu), 41 (Gln to His), and 121 (Gln to His) compared with chicken lysozyme. Analysis of the time-courses of reaction using N-acetylglucosamine pentamer as a substrate showed a difference of binding free energy change (-0.4 kcal/mol) at subsite A between monal pheasant and Indian peafowl lysozyme. This was assumed to be caused by the amino acid substitution at subsite A with loss of a positive charge at position 102 (Arg102 to Gly).

  15. Comprehensive Sequence Analysis of the Human IL23A Gene Defines New Variation Content and High Rate of Evolutionary Conservation

    PubMed Central

    Tindall, Elizabeth A.; Hayes, Vanessa M.

    2010-01-01

    A newly described heterodimeric cytokine, interleukin-23 (IL-23) is emerging as a key player in both the innate and the adaptive T helper (Th)17 driven immune response as well as an initiator of several autoimmune diseases. The rate-limiting element of IL-23 production is believed to be driven by expression of the unique p19 subunit encoded by IL23A. We set out to perform comprehensive DNA sequencing of this previously under-studied gene in 96 individuals from two evolutionarily distinct human population groups, Southern African Bantu and European. We observed a total of 33 different DNA variants within these two groups, 22 (67%) of which are currently not reported in any available database. We further demonstrate both inter-population and intra-species sequence conservation within the coding and known regulatory regions of IL23A, supporting a critical physiological role for IL-23. We conclude that IL23A may have undergone positive selection pressure directed towards conservation, suggesting that functional genetic variants within IL23A will have a significant impact on the host immune response. PMID:20154336

  16. Touchdown digital polymerase chain reaction for quantification of highly conserved sequences in the HIV-1 genome.

    PubMed

    De Spiegelaere, Ward; Malatinkova, Eva; Kiselinova, Maja; Bonczkowski, Pawel; Verhofstede, Chris; Vogelaers, Dirk; Vandekerckhove, Linos

    2013-08-15

    Digital polymerase chain reaction (PCR) is an emerging absolute quantification method based on the limiting dilution principle and end-point PCR. This methodology provides high flexibility in assay design without influencing quantitative accuracy. This article describes an assay to quantify HIV DNA that targets a highly conserved region of the HIV-1 genome that hampers optimal probe design. To maintain high specificity and allow probe binding and hydrolysis of a probe with low melting temperature, a two-stage touchdown PCR was designed with a first round of amplification at high temperature and a subsequent round at low temperature to allow accumulation of fluorescence.
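
    The limiting dilution principle mentioned above is commonly turned into a copy-number estimate with a Poisson correction: if a fraction p of partitions is positive at end point, the mean number of target copies per partition is -ln(1 - p). The sketch below is illustrative only and is not taken from the cited article; the function name and the example counts are hypothetical.

```python
import math


def copies_per_partition(positive: int, total: int) -> float:
    """Poisson-corrected mean number of target copies per partition.

    positive: number of partitions scored positive at end point
    total:    number of partitions analysed
    """
    if total <= 0 or not 0 <= positive < total:
        # With every partition positive the estimate is unbounded,
        # so that case is rejected rather than returned as infinity.
        raise ValueError("need 0 <= positive < total and total > 0")
    fraction_positive = positive / total
    return -math.log(1.0 - fraction_positive)


if __name__ == "__main__":
    # Hypothetical run: 50 of 100 partitions positive.
    print(copies_per_partition(50, 100))
```

    Multiplying the result by the number of partitions gives an estimate of the total copies loaded, which is why the method needs no standard curve.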

  17. Single-chain structure of human ceruloplasmin: the complete amino acid sequence of the whole molecule.

    PubMed Central

    Takahashi, N; Ortel, T L; Putnam, F W

    1984-01-01

    We have determined the amino acid sequence of the amino-terminal 67,000-dalton (67-kDa) fragment of human ceruloplasmin and have established overlapping sequences between the 67-kDa and 50-kDa fragments and between the 50-kDa and 19-kDa fragments. The 67-kDa fragment contains 480 amino acid residues and three glucosamine oligosaccharides. These results together with our previous sequence data for the 50-kDa and 19-kDa fragments complete the amino acid sequence of human ceruloplasmin. The polypeptide chain has a total of 1,046 amino acid residues (Mr 120,085) and has attachment sites for four glucosamine oligosaccharides; together these account for the total molecular mass of human ceruloplasmin (132 kDa). The sequence analysis of the peptides overlapping the fragments showed that one additional amino acid, arginine, is present between the 67-kDa and 50-kDa fragments, and another, lysine, is between the 50-kDa and 19-kDa fragments. Only two apparent sites of amino acid interchange have been identified in the polypeptide chain. Both involve a single-point interchange of glycine and lysine that would result in a difference in charge. The results of the complete sequence analysis verified that human ceruloplasmin is composed of a single polypeptide chain and that the subunit-like fragments are produced by proteolytic cleavage during purification (and possibly also in vivo). PMID:6582496

  18. Multiple Genome Sequences of Important Beer-Spoiling Lactic Acid Bacteria

    PubMed Central

    Geissler, Andreas J.; Vogel, Rudi F.

    2016-01-01

    Seven strains of important beer-spoiling lactic acid bacteria were sequenced using single-molecule real-time sequencing. Complete genomes were obtained for strains of Lactobacillus paracollinoides, Lactobacillus lindneri, and Pediococcus claussenii. The analysis of these genomes emphasizes the role of plasmids as the genomic foundation of beer-spoiling ability. PMID:27795248

  19. Cloning, sequence analysis and expression of the F1F0-ATPase beta-subunit from wine lactic acid bacteria.

    PubMed

    Sievers, Martin; Uermösi, Christina; Fehlmann, Marc; Krieger, Sibylle

    2003-09-01

    The nucleotide sequences of the genes encoding the F1F0-ATPase beta-subunit from Oenococcus oeni, Leuconostoc mesenteroides subsp. mesenteroides, Pediococcus damnosus, Pediococcus parvulus, Lactobacillus brevis and Lactobacillus hilgardii were determined. Their deduced amino acid sequences showed homology values of 79-98%. Data from the alignment and ATPase tree indicated that O. oeni and L. mesenteroides subsp. mesenteroides formed a group well-separated from P. damnosus and P. parvulus and from the group comprising L. brevis and L. hilgardii. The N-terminus of the F1F0-ATPase beta-subunit of O. oeni contains a stretch of 38 additional amino acid residues. The catalytic site of the ATPase beta-subunit of the investigated strains is characterized by the two conserved motifs GGAGVGKT and GERTRE. The amplified atpD coding sequences were inserted into the pCRT7/CT-TOPO vector using a TA-cloning strategy and transformed into Escherichia coli. SDS-PAGE and Western blot analyses confirmed that O. oeni has an ATPase beta-subunit protein which is larger in size than the corresponding molecules from the investigated strains.

  20. PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences.

    PubMed

    Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong; Warnow, Tandy

    2015-05-01

    We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate--slightly better than SATé trees, but with substantial improvements relative to other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory.

  1. CDvist: A webserver for identification and visualization of conserved domains in protein sequences

    SciTech Connect

    Adebali, Ogun; Ortega, Davi R.; Zhulin, Igor B.

    2014-12-18

    Identification of domains in protein sequences allows their assigning to biological functions. Several webservers exist for identification of protein domains using similarity searches against various databases of protein domain models. However, none of them provides comprehensive domain coverage while allowing bulk querying and their visualization schemes can be improved. To address these issues, we developed CDvist (a comprehensive domain visualization tool), which combines the best available search algorithms and databases into a user-friendly framework. First, a given protein sequence is matched to domain models using high-specificity tools and only then unmatched segments are subjected to more sensitive algorithms resulting in a best possible comprehensive coverage. In conclusion, bulk querying and rich visualization and download options provide improved functionality to domain architecture analysis.

  2. CDvist: A webserver for identification and visualization of conserved domains in protein sequences

    DOE PAGES

    Adebali, Ogun; Ortega, Davi R.; Zhulin, Igor B.

    2014-12-18

    Identification of domains in protein sequences allows their assigning to biological functions. Several webservers exist for identification of protein domains using similarity searches against various databases of protein domain models. However, none of them provides comprehensive domain coverage while allowing bulk querying and their visualization schemes can be improved. To address these issues, we developed CDvist (a comprehensive domain visualization tool), which combines the best available search algorithms and databases into a user-friendly framework. First, a given protein sequence is matched to domain models using high-specificity tools and only then unmatched segments are subjected to more sensitive algorithms resulting in a best possible comprehensive coverage. In conclusion, bulk querying and rich visualization and download options provide improved functionality to domain architecture analysis.

  3. The First Myriapod Genome Sequence Reveals Conservative Arthropod Gene Content and Genome Organisation in the Centipede Strigamia maritima

    PubMed Central

    Chipman, Ariel D.; Ferrier, David E. K.; Brena, Carlo; Qu, Jiaxin; Hughes, Daniel S. T.; Schröder, Reinhard; Torres-Oliva, Montserrat; Znassi, Nadia; Jiang, Huaiyang; Almeida, Francisca C.; Alonso, Claudio R.; Apostolou, Zivkos; Aqrawi, Peshtewani; Arthur, Wallace; Barna, Jennifer C. J.; Blankenburg, Kerstin P.; Brites, Daniela; Capella-Gutiérrez, Salvador; Coyle, Marcus; Dearden, Peter K.; Du Pasquier, Louis; Duncan, Elizabeth J.; Ebert, Dieter; Eibner, Cornelius; Erikson, Galina; Evans, Peter D.; Extavour, Cassandra G.; Francisco, Liezl; Gabaldón, Toni; Gillis, William J.; Goodwin-Horn, Elizabeth A.; Green, Jack E.; Griffiths-Jones, Sam; Grimmelikhuijzen, Cornelis J. P.; Gubbala, Sai; Guigó, Roderic; Han, Yi; Hauser, Frank; Havlak, Paul; Hayden, Luke; Helbing, Sophie; Holder, Michael; Hui, Jerome H. L.; Hunn, Julia P.; Hunnekuhl, Vera S.; Jackson, LaRonda; Javaid, Mehwish; Jhangiani, Shalini N.; Jiggins, Francis M.; Jones, Tamsin E.; Kaiser, Tobias S.; Kalra, Divya; Kenny, Nathan J.; Korchina, Viktoriya; Kovar, Christie L.; Kraus, F. Bernhard; Lapraz, François; Lee, Sandra L.; Lv, Jie; Mandapat, Christigale; Manning, Gerard; Mariotti, Marco; Mata, Robert; Mathew, Tittu; Neumann, Tobias; Newsham, Irene; Ngo, Dinh N.; Ninova, Maria; Okwuonu, Geoffrey; Ongeri, Fiona; Palmer, William J.; Patil, Shobha; Patraquim, Pedro; Pham, Christopher; Pu, Ling-Ling; Putman, Nicholas H.; Rabouille, Catherine; Ramos, Olivia Mendivil; Rhodes, Adelaide C.; Robertson, Helen E.; Robertson, Hugh M.; Ronshaugen, Matthew; Rozas, Julio; Saada, Nehad; Sánchez-Gracia, Alejandro; Scherer, Steven E.; Schurko, Andrew M.; Siggens, Kenneth W.; Simmons, DeNard; Stief, Anna; Stolle, Eckart; Telford, Maximilian J.; Tessmar-Raible, Kristin; Thornton, Rebecca; van der Zee, Maurijn; von Haeseler, Arndt; Williams, James M.; Willis, Judith H.; Wu, Yuanqing; Zou, Xiaoyan; Lawson, Daniel; Muzny, Donna M.; Worley, Kim C.; Gibbs, Richard A.; Akam, Michael; Richards, Stephen

    2014-01-01

    Myriapods (e.g., centipedes and millipedes) display a simple homonomous body plan relative to other arthropods. All members of the class are terrestrial, but they attained terrestriality independently of insects. Myriapoda is the only arthropod class not represented by a sequenced genome. We present an analysis of the genome of the centipede Strigamia maritima. It retains a compact genome that has undergone less gene loss and shuffling than previously sequenced arthropods, and many orthologues of genes conserved from the bilaterian ancestor that have been lost in insects. Our analysis locates many genes in conserved macro-synteny contexts, and many small-scale examples of gene clustering. We describe several examples where S. maritima shows different solutions from insects to similar problems. The insect olfactory receptor gene family is absent from S. maritima, and olfaction in air is likely effected by expansion of other receptor gene families. For some genes S. maritima has evolved paralogues to generate coding sequence diversity, where insects use alternate splicing. This is most striking for the Dscam gene, which in Drosophila generates more than 100,000 alternate splice forms, but in S. maritima is encoded by over 100 paralogues. We see an intriguing linkage between the absence of any known photosensory proteins in a blind organism and the additional absence of canonical circadian clock genes. The phylogenetic position of myriapods allows us to identify where in arthropod phylogeny several particular molecular mechanisms and traits emerged. For example, we conclude that juvenile hormone signalling evolved with the emergence of the exoskeleton in the arthropods and that RR-1 containing cuticle proteins evolved in the lineage leading to Mandibulata. We also identify when various gene expansions and losses occurred. The genome of S. maritima offers us a unique glimpse into the ancestral arthropod genome, while also displaying many adaptations to its specific

  4. Sorting out relationships among the grouse and ptarmigan using intron, mitochondrial, and ultra-conserved element sequences.

    PubMed

    Persons, Nicholas W; Hosner, Peter A; Meiklejohn, Kelly A; Braun, Edward L; Kimball, Rebecca T

    2016-05-01

    The Holarctic phasianid clade of the grouse and ptarmigan has received substantial attention in areas such as evolution of mating systems, display behavior, and population ecology related to their conservation and management as wild game species. There are multiple molecular phylogenetic studies that focus on grouse and ptarmigan. In spite of this, there is little consensus regarding historical relationships, particularly among genera, which has led to unstable and partial taxonomic revisions. We estimated the phylogeny of all currently recognized species using a combination of novel data from seven nuclear loci (largely intron sequences) and published data from one additional autosomal locus, two W-linked loci, and four mitochondrial regions. To explore relationships among genera and assess paraphyly of one genus more rigorously, we then added over 3000 ultra-conserved element (UCE) loci (over 1.7 million bp) gathered using Illumina sequencing. The UCE topology agreed with that of the combined nuclear intron and previously published sequence data with 100% bootstrap support for all relationships. These data strongly support previous studies separating Bonasa from Tetrastes and Dendragapus from Falcipennis. However, the placement of Lagopus differed from previous studies, and we found no support for Falcipennis monophyly. Biogeographic analysis suggests that the ancestors of grouse and ptarmigan were distributed in the New World and subsequently underwent at least four dispersal events between the Old and New Worlds. Divergence time estimates from maternally-inherited and autosomal markers show stark differences across this clade, with divergence time estimates from maternally-inherited markers being nearly half that of the autosomal markers at some nodes, and nearly twice that at other nodes.

  5. The first myriapod genome sequence reveals conservative arthropod gene content and genome organisation in the centipede Strigamia maritima.

    PubMed

    Chipman, Ariel D; Ferrier, David E K; Brena, Carlo; Qu, Jiaxin; Hughes, Daniel S T; Schröder, Reinhard; Torres-Oliva, Montserrat; Znassi, Nadia; Jiang, Huaiyang; Almeida, Francisca C; Alonso, Claudio R; Apostolou, Zivkos; Aqrawi, Peshtewani; Arthur, Wallace; Barna, Jennifer C J; Blankenburg, Kerstin P; Brites, Daniela; Capella-Gutiérrez, Salvador; Coyle, Marcus; Dearden, Peter K; Du Pasquier, Louis; Duncan, Elizabeth J; Ebert, Dieter; Eibner, Cornelius; Erikson, Galina; Evans, Peter D; Extavour, Cassandra G; Francisco, Liezl; Gabaldón, Toni; Gillis, William J; Goodwin-Horn, Elizabeth A; Green, Jack E; Griffiths-Jones, Sam; Grimmelikhuijzen, Cornelis J P; Gubbala, Sai; Guigó, Roderic; Han, Yi; Hauser, Frank; Havlak, Paul; Hayden, Luke; Helbing, Sophie; Holder, Michael; Hui, Jerome H L; Hunn, Julia P; Hunnekuhl, Vera S; Jackson, LaRonda; Javaid, Mehwish; Jhangiani, Shalini N; Jiggins, Francis M; Jones, Tamsin E; Kaiser, Tobias S; Kalra, Divya; Kenny, Nathan J; Korchina, Viktoriya; Kovar, Christie L; Kraus, F Bernhard; Lapraz, François; Lee, Sandra L; Lv, Jie; Mandapat, Christigale; Manning, Gerard; Mariotti, Marco; Mata, Robert; Mathew, Tittu; Neumann, Tobias; Newsham, Irene; Ngo, Dinh N; Ninova, Maria; Okwuonu, Geoffrey; Ongeri, Fiona; Palmer, William J; Patil, Shobha; Patraquim, Pedro; Pham, Christopher; Pu, Ling-Ling; Putman, Nicholas H; Rabouille, Catherine; Ramos, Olivia Mendivil; Rhodes, Adelaide C; Robertson, Helen E; Robertson, Hugh M; Ronshaugen, Matthew; Rozas, Julio; Saada, Nehad; Sánchez-Gracia, Alejandro; Scherer, Steven E; Schurko, Andrew M; Siggens, Kenneth W; Simmons, DeNard; Stief, Anna; Stolle, Eckart; Telford, Maximilian J; Tessmar-Raible, Kristin; Thornton, Rebecca; van der Zee, Maurijn; von Haeseler, Arndt; Williams, James M; Willis, Judith H; Wu, Yuanqing; Zou, Xiaoyan; Lawson, Daniel; Muzny, Donna M; Worley, Kim C; Gibbs, Richard A; Akam, Michael; Richards, Stephen

    2014-11-01

    Myriapods (e.g., centipedes and millipedes) display a simple homonomous body plan relative to other arthropods. All members of the class are terrestrial, but they attained terrestriality independently of insects. Myriapoda is the only arthropod class not represented by a sequenced genome. We present an analysis of the genome of the centipede Strigamia maritima. It retains a compact genome that has undergone less gene loss and shuffling than previously sequenced arthropods, and many orthologues of genes conserved from the bilaterian ancestor that have been lost in insects. Our analysis locates many genes in conserved macro-synteny contexts, and many small-scale examples of gene clustering. We describe several examples where S. maritima shows different solutions from insects to similar problems. The insect olfactory receptor gene family is absent from S. maritima, and olfaction in air is likely effected by expansion of other receptor gene families. For some genes S. maritima has evolved paralogues to generate coding sequence diversity, where insects use alternate splicing. This is most striking for the Dscam gene, which in Drosophila generates more than 100,000 alternate splice forms, but in S. maritima is encoded by over 100 paralogues. We see an intriguing linkage between the absence of any known photosensory proteins in a blind organism and the additional absence of canonical circadian clock genes. The phylogenetic position of myriapods allows us to identify where in arthropod phylogeny several particular molecular mechanisms and traits emerged. For example, we conclude that juvenile hormone signalling evolved with the emergence of the exoskeleton in the arthropods and that RR-1 containing cuticle proteins evolved in the lineage leading to Mandibulata. We also identify when various gene expansions and losses occurred. The genome of S. maritima offers us a unique glimpse into the ancestral arthropod genome, while also displaying many adaptations to its specific

  6. Co-conservation of rRNA tetraloop sequences and helix length suggests involvement of the tetraloops in higher-order interactions

    NASA Technical Reports Server (NTRS)

    Hedenstierna, K. O.; Siefert, J. L.; Fox, G. E.; Murgola, E. J.

    2000-01-01

    Terminal loops containing four nucleotides (tetraloops) are common in structural RNAs, and they frequently conform to one of three sequence motifs, GNRA, UNCG, or CUUG. Here we compare available sequences and secondary structures for rRNAs from bacteria, and we show that helices capped by phylogenetically conserved GNRA loops display a strong tendency to be of conserved length. The simplest interpretation of this correlation is that the conserved GNRA loops are involved in higher-order interactions, intramolecular or intermolecular, resulting in a selective pressure for maintaining the lengths of these helices. A small number of conserved UNCG loops were also found to be associated with conserved length helices, consistent with the possibility that this type of tetraloop also takes part in higher-order interactions.
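
    The three tetraloop motifs named above use IUPAC nucleotide codes, in which N stands for any base and R for a purine (A or G). As a minimal sketch of how a four-nucleotide loop could be assigned to one of these families, the following uses regular expressions; the function and dictionary names are hypothetical and are not taken from the cited study.

```python
import re
from typing import Optional

# IUPAC codes expanded by hand: N = any of A, C, G, U; R = A or G.
TETRALOOP_MOTIFS = {
    "GNRA": re.compile(r"G[ACGU][AG]A"),
    "UNCG": re.compile(r"U[ACGU]CG"),
    "CUUG": re.compile(r"CUUG"),
}


def classify_tetraloop(loop: str) -> Optional[str]:
    """Return the motif family of a 4-nt loop, or None if it fits none."""
    loop = loop.upper().replace("T", "U")  # accept DNA-style input as well
    if len(loop) != 4:
        raise ValueError("a tetraloop has exactly four nucleotides")
    for name, pattern in TETRALOOP_MOTIFS.items():
        if pattern.fullmatch(loop):
            return name
    return None


if __name__ == "__main__":
    for candidate in ("GAAA", "UUCG", "CUUG", "AAAA"):
        print(candidate, classify_tetraloop(candidate))
```

    Classifying the loops is only the first step; the correlation reported in the abstract additionally requires the length of the helix that each loop caps, which this sketch does not model.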

  7. SETG: Nucleic Acid Extraction and Sequencing for In Situ Life Detection on Mars

    NASA Astrophysics Data System (ADS)

    Mojarro, A.; Hachey, J.; Tani, J.; Smith, A.; Bhattaru, S. A.; Pontefract, A.; Doebler, R.; Brown, M.; Ruvkun, G.; Zuber, M. T.; Carr, C. E.

    2016-10-01

    We are developing an integrated nucleic acid extraction and sequencing instrument: the Search for Extra-Terrestrial Genomes (SETG) for in situ life detection on Mars. Our goals are to identify related or unrelated nucleic acid-based life on Mars.

  8. Draft Genome Sequence of Cyanobacterium sp. Strain IPPAS B-1200 with a Unique Fatty Acid Composition

    PubMed Central

    Starikov, Alexander Y.; Usserbaeva, Aizhan A.; Sinetova, Maria A.; Sarsekeyeva, Fariza K.; Zayadan, Bolatkhan K.; Ustinova, Vera V.; Kupriyanova, Elena V.; Los, Dmitry A.

    2016-01-01

    Here, we report the draft genome of Cyanobacterium sp. IPPAS strain B-1200, isolated from Lake Balkhash, Kazakhstan, and characterized by the unique fatty acid composition of its membrane lipids, which are enriched with myristic and myristoleic acids. The approximate genome size is 3.4 Mb, and the predicted number of coding sequences is 3,119. PMID:27856596

  9. Sequencing and computational analysis of complete genome sequences of Citrus yellow mosaic badna virus from acid lime and pummelo.

    PubMed

    Borah, Basanta K; Johnson, A M Anthony; Sai Gopal, D V R; Dasgupta, Indranil

    2009-08-01

    Citrus yellow mosaic badna virus (CMBV), a member of the Family Caulimoviridae, Genus Badnavirus, is the causative agent of Citrus mosaic disease in India. Although the virus has been detected in several citrus species, only two full-length genomes, one each from Sweet orange and Rangpur lime, are available in publicly accessible databases. In order to obtain a better understanding of the genetic variability of the virus in other citrus mosaic-affected citrus species, we performed the cloning and sequence analysis of complete genomes of CMBV from two additional citrus species, Acid lime and Pummelo. We show that CMBV genomes from the two hosts share high homology with previously reported CMBV sequences and hence conclude that the new isolates represent variants of the virus present in these species. Based on in silico sequence analysis, we predict the possible function of the protein encoded by one of the five ORFs.

  10. Parvalbumins from coelacanth muscle. III. Amino acid sequence of the major component.

    PubMed

    Jauregui-Adell, J; Pechere, J F

    1978-09-26

    The primary structure of the major parvalbumin (pI = 4.52) from coelacanth muscle (Latimeria chalumnae) has been determined. Sequence analysis of the tryptic peptides, in some cases obtained with beta-trypsin, accounts for the total amino acid content of the protein. Chymotryptic peptides provide appropriate sequence overlaps, to complete the localization of the tryptic peptides. Examination of the amino acid sequence of this protein shows the typical structure of a beta-parvalbumin. Its position in the dendrogram of related calcium-binding proteins corresponds to that usually accepted for crossopterygians.

  11. Amino acid sequence of anionic peroxidase from the windmill palm tree Trachycarpus fortunei.

    PubMed

    Baker, Margaret R; Zhao, Hongwei; Sakharov, Ivan Yu; Li, Qing X

    2014-12-10

    Palm peroxidases are extremely stable and have uncommon substrate specificity. This study was designed to fill in the knowledge gap about the structures of a peroxidase from the windmill palm tree Trachycarpus fortunei. The complete amino acid sequence and partial glycosylation were determined by MALDI-top-down sequencing of native windmill palm tree peroxidase (WPTP), MALDI-TOF/TOF MS/MS of WPTP tryptic peptides, and cDNA sequencing. The propeptide of WPTP contained N- and C-terminal signal sequences which contained 21 and 17 amino acid residues, respectively. Mature WPTP was 306 amino acids in length, and its carbohydrate content ranged from 21% to 29%. Comparison to closely related royal palm tree peroxidase revealed structural features that may explain differences in their substrate specificity. The results can be used to guide engineering of WPTP and its novel applications.

  12. Horse domestication and conservation genetics of Przewalski's horse inferred from sex chromosomal and autosomal sequences.

    PubMed

    Lau, Allison N; Peng, Lei; Goto, Hiroki; Chemnick, Leona; Ryder, Oliver A; Makova, Kateryna D

    2009-01-01

    Despite their ability to interbreed and produce fertile offspring, there is continued disagreement about the genetic relationship of the domestic horse (Equus caballus) to its endangered wild relative, Przewalski's horse (Equus przewalskii). Analyses have differed as to whether or not Przewalski's horse is placed phylogenetically as a separate sister group to domestic horses. Because Przewalski's horse and domestic horse are so closely related, genetic data can also be used to infer domestication-specific differences between the two. To investigate the genetic relationship of Przewalski's horse to the domestic horse and to address whether evolution of the domestic horse is driven by males or females, five homologous introns (a total of approximately 3 kb) were sequenced on the X and Y chromosomes in two Przewalski's horses and three breeds of domestic horses: Arabian horse, Mongolian domestic horse, and Dartmoor pony. Five autosomal introns (a total of approximately 6 kb) were sequenced for these horses as well. The sequences of sex chromosomal and autosomal introns were used to determine nucleotide diversity and the forces driving evolution in these species. As a result, X chromosomal and autosomal data do not place Przewalski's horses in a separate clade within phylogenetic trees for horses, suggesting a close relationship between domestic and Przewalski's horses. It was also found that there was a lack of nucleotide diversity on the Y chromosome and higher nucleotide diversity than expected on the X chromosome in domestic horses as compared with the Y chromosome and autosomes. This supports the hypothesis that very few male horses along with numerous female horses founded the various domestic horse breeds. Patterns of nucleotide diversity among different types of chromosomes were distinct for Przewalski's in contrast to domestic horses, supporting unique evolutionary histories of the two species.

  13. Comparative Mitogenomics of the Genus Odontobutis (Perciformes: Gobioidei: Odontobutidae) Revealed Conserved Gene Rearrangement and High Sequence Variations

    PubMed Central

    Ma, Zhihong; Yang, Xuefen; Bercsenyi, Miklos; Wu, Junjie; Yu, Yongyao; Wei, Kaijian; Fan, Qixue; Yang, Ruibin

    2015-01-01

    To understand the molecular evolution of mitochondrial genomes (mitogenomes) in the genus Odontobutis, the mitogenome of Odontobutis yaluensis was sequenced and compared with those of another four Odontobutis species. Our results displayed similar mitogenome features among species in genome organization, base composition, codon usage, and gene rearrangement. The identical gene rearrangement of trnS-trnL-trnH tRNA cluster observed in mitogenomes of these five closely related freshwater sleepers suggests that this unique gene order is conserved within Odontobutis. Additionally, the present gene order and the positions of associated intergenic spacers of these Odontobutis mitogenomes indicate that this unusual gene rearrangement results from tandem duplication and random loss of large-scale gene regions. Moreover, these mitogenomes exhibit a high level of sequence variation, mainly due to the differences of corresponding intergenic sequences in gene rearrangement regions and the heterogeneity of tandem repeats in the control regions. Phylogenetic analyses support Odontobutis species with shared gene rearrangement forming a monophyletic group, and the interspecific phylogenetic relationships are associated with structural differences among their mitogenomes. The present study contributes to understanding the evolutionary patterns of Odontobutidae species. PMID:26492246

  14. Genome-wide analyses of Epstein-Barr virus reveal conserved RNA structures and a novel stable intronic sequence RNA

    PubMed Central

    2013-01-01

    Background: Epstein-Barr virus (EBV) is a human herpesvirus implicated in cancer and autoimmune disorders. Little is known concerning the roles of RNA structure in this important human pathogen. This study provides the first comprehensive genome-wide survey of RNA and RNA structure in EBV. Results: Novel EBV RNAs and RNA structures were identified by computational modeling and RNA-Seq analyses of EBV. Scans of the genomic sequences of four EBV strains (EBV-1, EBV-2, GD1, and GD2) and of the closely related Macacine herpesvirus 4 using the RNAz program discovered 265 regions with high probability of forming conserved RNA structures. Secondary structure models are proposed for these regions based on a combination of free energy minimization and comparative sequence analysis. The analysis of RNA-Seq data uncovered the first observation of a stable intronic sequence RNA (sisRNA) in EBV. The abundance of this sisRNA rivals that of the well-known and highly expressed EBV-encoded non-coding RNAs (EBERs). Conclusion: This work identifies regions of the EBV genome likely to generate functional RNAs and RNA structures, provides structural models for these regions, and discusses potential functions suggested by the modeled structures. Enhanced understanding of the EBV transcriptome will guide future experimental analyses of the discovered RNAs and RNA structures. PMID:23937650

  15. Comparative Mitogenomics of the Genus Odontobutis (Perciformes: Gobioidei: Odontobutidae) Revealed Conserved Gene Rearrangement and High Sequence Variations.

    PubMed

    Ma, Zhihong; Yang, Xuefen; Bercsenyi, Miklos; Wu, Junjie; Yu, Yongyao; Wei, Kaijian; Fan, Qixue; Yang, Ruibin

    2015-10-20

    To understand the molecular evolution of mitochondrial genomes (mitogenomes) in the genus Odontobutis, the mitogenome of Odontobutis yaluensis was sequenced and compared with those of another four Odontobutis species. Our results displayed similar mitogenome features among species in genome organization, base composition, codon usage, and gene rearrangement. The identical gene rearrangement of trnS-trnL-trnH tRNA cluster observed in mitogenomes of these five closely related freshwater sleepers suggests that this unique gene order is conserved within Odontobutis. Additionally, the present gene order and the positions of associated intergenic spacers of these Odontobutis mitogenomes indicate that this unusual gene rearrangement results from tandem duplication and random loss of large-scale gene regions. Moreover, these mitogenomes exhibit a high level of sequence variation, mainly due to the differences of corresponding intergenic sequences in gene rearrangement regions and the heterogeneity of tandem repeats in the control regions. Phylogenetic analyses support Odontobutis species with shared gene rearrangement forming a monophyletic group, and the interspecific phylogenetic relationships are associated with structural differences among their mitogenomes. The present study contributes to understanding the evolutionary patterns of Odontobutidae species.

  16. Peculiar patterns of amino acid substitution and conservation in the fast evolving tunicate Oikopleura dioica.

    PubMed

    Berná, Luisa; D'Onofrio, Giuseppe; Alvarez-Valin, Fernando

    2012-02-01

    We analyze the patterns and rates of amino acid evolution in tunicates with special interest in the extremely fast evolving Oikopleura dioica. We show that this species, on average, is twice as fast as the already fast evolving Ciona intestinalis. The acceleration in both species seems to be affected by similar evolutionary forces yet to different extents, since a substantial proportion of the most and least accelerated genes are orthologous between the two species. Among the possible causes that underlie the genome wide acceleration in Oikopleura, relaxation of functional constraints appears to be an important one, since all amino acids exhibit surprisingly homogeneous levels of divergence. Such homogeneity, however, is not observed in Ciona. Apart from the genome wide acceleration, detailed analysis of functional groups of genes revealed that genes associated with regulatory functions (transcription regulators, chromatin remodeling proteins and metabolic regulators) have been subjected to an even more extreme process of acceleration, suggesting that adaptive evolution is the most probable cause of their unusually exacerbated rates. Another remarkable observation is that cysteine is among the less conserved amino acids, contrary to what is commonly observed in other species. The possible causes of this particular behavior are discussed.

  17. TranslatorX: multiple alignment of nucleotide sequences guided by amino acid translations.

    PubMed

    Abascal, Federico; Zardoya, Rafael; Telford, Maximilian J

    2010-07-01

    We present TranslatorX, a web server designed to align protein-coding nucleotide sequences based on their corresponding amino acid translations. Many comparisons between biological sequences (nucleic acids and proteins) involve the construction of multiple alignments. Alignments represent a statement regarding the homology between individual nucleotides or amino acids within homologous genes. As protein-coding DNA sequences evolve as triplets of nucleotides (codons) and it is known that sequence similarity degrades more rapidly at the DNA than at the amino acid level, alignments are generally more accurate when based on amino acids than on their corresponding nucleotides. TranslatorX novelties include: (i) use of all documented genetic codes and the possibility of assigning different genetic codes for each sequence; (ii) a battery of different multiple alignment programs; (iii) translation of ambiguous codons when possible; (iv) an innovative criterion to clean nucleotide alignments with GBlocks based on protein information; and (v) a rich output, including Jalview-powered graphical visualization of the alignments, codon-based alignments coloured according to the corresponding amino acids, measures of compositional bias and first, second and third codon position specific alignments. The TranslatorX server is freely available at http://translatorx.co.uk.
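
    The core idea in this abstract, aligning at the amino acid level and then threading the codons back onto that alignment, can be illustrated with a minimal sketch. This is not TranslatorX's own code; the function name `back_translate` and the toy sequences are hypothetical, and the sketch assumes each coding sequence is in frame, has no stop codon, and matches its protein residue for residue.

```python
def back_translate(aligned_protein: str, nucleotides: str) -> str:
    """Thread an unaligned coding sequence onto its gapped protein alignment row.

    Each residue consumes one codon; each protein gap '-' becomes '---'.
    """
    residues = aligned_protein.replace("-", "")
    if len(nucleotides) != 3 * len(residues):
        raise ValueError("coding sequence must be exactly 3 nt per residue")
    # Split the coding sequence into consecutive codons.
    codons = iter(nucleotides[i:i + 3] for i in range(0, len(nucleotides), 3))
    return "".join("---" if ch == "-" else next(codons) for ch in aligned_protein)


# Toy input (hypothetical): a protein alignment and the matching coding sequences.
protein_alignment = {"seq1": "MK-L", "seq2": "MKAL"}
coding = {"seq1": "ATGAAACTG", "seq2": "ATGAAGGCTCTG"}

codon_alignment = {
    name: back_translate(protein_alignment[name], coding[name]) for name in coding
}
for name, row in codon_alignment.items():
    print(name, row)
```

    Because gaps are only ever inserted in multiples of three, the reading frame of every sequence is preserved in the nucleotide alignment.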

  18. Amino acid sequence of homologous rat atrial peptides: natriuretic activity of native and synthetic forms.

    PubMed Central

    Seidah, N G; Lazure, C; Chrétien, M; Thibault, G; Garcia, R; Cantin, M; Genest, J; Nutt, R F; Brady, S F; Lyle, T A

    1984-01-01

    A substance called atrial natriuretic factor (ANF), localized in secretory granules of atrial cardiocytes, was isolated as four homologous natriuretic peptides from homogenates of rat atria. The complete sequence of the longest form showed that it is composed of 33 amino acids. The three other shorter forms (2-33, 3-33, and 8-33) represent amino-terminally truncated versions of the 33 amino acid parent molecule as shown by analysis of sequence, amino acid composition, or both. The proposed primary structure agrees entirely with the amino acid composition and reveals no significant sequence homology with any known protein or segment of protein. The short form ANF-(8-33) was synthesized by a multi-fragment condensation approach and the synthetic product was shown to exhibit specific activity comparable to that of the natural ANF-(3-33). PMID:6232612

  19. Nucleotide and deduced amino acid sequences of a new subtilisin from an alkaliphilic Bacillus isolate.

    PubMed

    Saeki, Katsuhisa; Magallones, Marietta V; Takimura, Yasushi; Hatada, Yuji; Kobayashi, Tohru; Kawai, Shuji; Ito, Susumu

    2003-10-01

    The gene for a new subtilisin from the alkaliphilic Bacillus sp. KSM-LD1 was cloned and sequenced. The open reading frame of the gene encoded a 97 amino-acid prepro-peptide plus a 307 amino-acid mature enzyme that contained a possible catalytic triad of residues, Asp32, His66, and Ser224. The deduced amino acid sequence of the mature enzyme (LD1) showed approximately 65% identity to those of subtilisins SprC and SprD from alkaliphilic Bacillus sp. LG12. The amino acid sequence identities of LD1 to those of previously reported true subtilisins and high-alkaline proteases were below 60%. LD1 was characteristically stable during incubation with surfactants and chemical oxidants. Interestingly, an oxidizable Met residue is located next to the catalytic Ser224 of the enzyme as in the cases of the oxidation-susceptible subtilisins reported to date.

  20. Shark myelin basic protein: amino acid sequence, secondary structure, and self-association.

    PubMed

    Milne, T J; Atkins, A R; Warren, J A; Auton, W P; Smith, R

    1990-09-01

    Myelin basic protein (MBP) from the Whaler shark (Carcharhinus obscurus) has been purified from acid extracts of a chloroform/methanol pellet from whole brains. The amino acid sequence of the majority of the protein has been determined and compared with the sequences of other MBPs. The shark protein has only 44% homology with the bovine protein, but, in common with other MBPs, it has basic residues distributed throughout the sequence and no extensive segments that are predicted to have an ordered secondary structure in solution. Shark MBP lacks the triproline sequence previously postulated to form a hairpin bend in the molecule. The region containing the putative consensus sequence for encephalitogenicity in the guinea pig contains several substitutions, thus accounting for the lack of activity of the shark protein. Studies of the secondary structure and self-association have shown that shark MBP possesses solution properties similar to those of the bovine protein, despite the extensive differences in primary structure.

  1. Canine Polydactyl Mutations With Heterogeneous Origin in the Conserved Intronic Sequence of LMBR1

    PubMed Central

    Park, Kiyun; Kang, Joohyun; Subedi, Krishna Pd.; Ha, Ji-Hong; Park, Chankyu

    2008-01-01

    Canine preaxial polydactyly (PPD) in the hind limb is a developmental trait that restores the first digit lost during canine evolution. Using a linkage analysis, we previously demonstrated that the affected gene in a Korean breed is located on canine chromosome 16. The candidate locus was further limited to a linkage disequilibrium (LD) block of <213 kb composing the single gene, LMBR1, by LD mapping with single nucleotide polymorphisms (SNPs) for affected individuals from both Korean and Western breeds. The ZPA regulatory sequence (ZRS) in intron 5 of LMBR1 was implicated in mammalian polydactyly. An analysis of the LD haplotypes around the ZRS for various dog breeds revealed that only a subset is assigned to Western breeds. Furthermore, two distinct affected haplotypes for Asian and Western breeds were found, each containing different single-base changes in the upstream sequence (pZRS) of the ZRS. Unlike the previously characterized cases of PPD identified in the mouse and human ZRS regions, the canine mutations in pZRS lacked the ectopic expression of sonic hedgehog in the anterior limb bud, distinguishing its role in limb development from that of the ZRS. PMID:18689889

  2. The complete mitochondrial genome sequence of the tubeworm Lamellibrachia satsuma and structural conservation in the mitochondrial genome control regions of Order Sabellida.

    PubMed

    Patra, Ajit Kumar; Kwon, Yong Min; Kang, Sung Gyun; Fujiwara, Yoshihiro; Kim, Sang-Jin

    2016-04-01

    The control region of the mitochondrial genomes shows high variation in conserved sequence organizations, which follow distinct evolutionary patterns in different species or taxa. In this study, we sequenced the complete mitochondrial genome of Lamellibrachia satsuma from the cold-seep region of Kagoshima Bay, as a part of whole genome study and extensively studied the structural features and patterns of the control region sequences. We obtained 15,037 bp of mitochondrial genome using Illumina sequencing and identified the non-coding AT-rich region or control region (354 bp, AT=83.9%) located between trnH and trnR. We found 7 conserved sequence blocks (CSB), scattered throughout the control region of L. satsuma and other taxa of Annelida. The poly-TA stretches, which commonly form the stem of multiple stem-loop structures, are most conserved in the CSB-I and CSB-II regions. The mitochondrial genome of L. satsuma encodes a unique repetitive sequence in the control region, which forms a unique secondary structure in comparison to Lamellibrachia luymesi. Phylogenetic analyses of all protein-coding genes indicate that L. satsuma forms a monophyletic clade with L. luymesi along with other tubeworms found in cold-seep regions (genera: Lamellibrachia, Escarpia, and Seepiophila). In general, the control region sequences of Annelida could be aligned with certainty within each genus, and to some extent within the family, but with a higher rate of variation in conserved regions.

  3. On the conservation of the slow conformational dynamics within the amino acid kinase family: NAGK the paradigm.

    PubMed

    Marcos, Enrique; Crehuet, Ramon; Bahar, Ivet

    2010-04-08

    N-acetyl-L-glutamate kinase (NAGK) is the structural paradigm for examining the catalytic mechanisms and dynamics of amino acid kinase family members. Given that the slow conformational dynamics of the NAGK (at the microseconds time scale or slower) may be rate-limiting, it is of importance to assess the mechanisms of the most cooperative modes of motion intrinsically accessible to this enzyme. Here, we present the results from normal mode analysis using an elastic network model representation, which shows that the conformational mechanisms for substrate binding by NAGK strongly correlate with the intrinsic dynamics of the enzyme in the unbound form. We further analyzed the potential mechanisms of allosteric signalling within NAGK using a Markov model for network communication. Comparative analysis of the dynamics of family members strongly suggests that the low-frequency modes of motion and the associated intramolecular couplings that establish signal transduction are highly conserved among family members, in support of the paradigm sequence-->structure-->dynamics-->function.

  4. A highly Conserved Aspartic Acid Residue of the Chitosanase from Bacillus Sp. TS Is Involved in the Substrate Binding.

    PubMed

    Zhou, Zhanping; Zhao, Shuangzhi; Liu, Yang; Chang, Zhengying; Ma, Yanhe; Li, Jian; Song, Jiangning

    2016-11-01

    The chitosanase from Bacillus sp. TS (CsnTS) is an enzyme belonging to the glycoside hydrolase family 8. The sequence of CsnTS shares 98% identity with the chitosanase from Bacillus sp. K17. Crystallography analysis and site-directed mutagenesis of the chitosanase from Bacillus sp. K17 identified the important residues involved in the catalytic interaction and substrate binding. However, despite progress in understanding the catalytic mechanism of the chitosanase from the family GH8, the functional roles of some residues that are highly conserved throughout this family have not been fully elucidated. This study focused on one of these residues, i.e., the aspartic acid residue at position 318. We found that apart from asparagine, mutation of Asp318 resulted in significant loss of enzyme activity. In-depth investigations showed that mutation of this residue not only impaired enzymatic activity but also affected substrate binding. Taken together, our results showed that Asp318 plays an important role in CsnTS activity.

  5. A conserved amino acid residue critical for product and substrate specificity in plant triterpene synthases

    PubMed Central

    Salmon, Melissa; Thimmappa, Ramesha B.; Minto, Robert E.; Melton, Rachel E.; O’Maille, Paul E.; Hemmings, Andrew M.; Osbourn, Anne

    2016-01-01

    Triterpenes are structurally complex plant natural products with numerous medicinal applications. They are synthesized through an origami-like process that involves cyclization of the linear 30 carbon precursor 2,3-oxidosqualene into different triterpene scaffolds. Here, through a forward genetic screen in planta, we identify a conserved amino acid residue that determines product specificity in triterpene synthases from diverse plant species. Mutation of this residue results in a major change in triterpene cyclization, with production of tetracyclic rather than pentacyclic products. The mutated enzymes also use the more highly oxygenated substrate dioxidosqualene in preference to 2,3-oxidosqualene when expressed in yeast. Our discoveries provide new insights into triterpene cyclization, revealing hidden functional diversity within triterpene synthases. They further open up opportunities to engineer novel oxygenated triterpene scaffolds by manipulating the precursor supply. PMID:27412861

  6. An analysis of amino acid sequences surrounding archaeal glycoprotein sequons.

    PubMed

    Abu-Qarn, Mehtap; Eichler, Jerry

    2007-05-01

    Despite having provided the first example of a prokaryal glycoprotein, little is known of the rules governing the N-glycosylation process in Archaea. As in Eukarya and Bacteria, archaeal N-glycosylation takes place at the Asn residues of Asn-X-Ser/Thr sequons. Since not all sequons are utilized, it is clear that other factors, including the context in which a sequon exists, affect glycosylation efficiency. As yet, the contribution to N-glycosylation made by sequon-bordering residues and other related factors in Archaea remains unaddressed. In the following, the surroundings of Asn residues confirmed by experiment as modified were analyzed in an attempt to define sequence rules and requirements for archaeal N-glycosylation.
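
    As an illustration of the first step such an analysis needs, the sketch below locates every Asn-X-Ser/Thr sequon in a protein sequence. It is not the authors' method; the function name `find_sequons` and the example sequence are hypothetical. X is taken here as any residue, exactly as the motif is written in the abstract; excluding proline at X, as is commonly done for eukaryotic sequons, would only require changing the character class.

```python
import re

# Asn-X-Ser/Thr; the lookahead lets overlapping sequons (e.g. N-N-T-T) all be reported.
SEQUON = re.compile(r"(?=N[A-Z][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of the Asn residue of each Asn-X-Ser/Thr sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Hypothetical example sequence, not taken from the study.
example = "MNASNNTTK"
positions = find_sequons(example)
print(positions)
```

    A sequon found this way is only a candidate site; as the abstract notes, not all sequons are utilized, so the flanking residues around each reported position are what a context analysis would examine next.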

  7. Structural sequences are conserved in the genes coding for the alpha, alpha' and beta-subunits of the soybean 7S seed storage protein.

    PubMed Central

    Schuler, M A; Ladin, B F; Pollaco, J C; Freyer, G; Beachy, R N

    1982-01-01

    Cloned DNAs encoding four different proteins have been isolated from recombinant cDNA libraries constructed with Glycine max seed mRNAs. Two cloned DNAs code for the alpha and alpha'-subunits of the 7S seed storage protein (conglycinin). The other cloned cDNAs code for proteins which are synthesized in vitro as 68,000 d., 60,000 d. or 53,000 d. polypeptides. Hybrid selection experiments indicate that, under low stringency hybridization conditions, all four cDNAs hybridize with mRNAs for the alpha and alpha'-subunits and the 68,000 d., 60,000 d. and 53,000 d. in vitro translation products. Within three of the mRNAs, there is a conserved sequence of 155 nucleotides which is responsible for this hybridization. The conserved nucleotides in the alpha and alpha'-subunit cDNAs and the 68,000 d. polypeptide cDNAs span both coding and noncoding sequences. The differences in the coding nucleotides outside the conserved region are extensive. This suggests that selective pressure to maintain the 155 conserved nucleotides has been influenced by the structure of the seed mRNA. RNA blot hybridizations demonstrate that mRNA encoding the other major subunit (beta) of the 7S seed storage protein also shares sequence homology with the conserved 155 nucleotide sequence of the alpha and alpha'-subunit mRNAs, but not with other coding sequences. Images PMID:6897678

  8. The Number, Organization, and Size of Polymorphic Membrane Protein Coding Sequences as well as the Most Conserved Pmp Protein Differ within and across Chlamydia Species.

    PubMed

    Van Lent, Sarah; Creasy, Heather Huot; Myers, Garry S A; Vanrompay, Daisy

    2016-01-01

    Variation is a central trait of the polymorphic membrane protein (Pmp) family. The number of pmp coding sequences differs between Chlamydia species, but it is unknown whether the number of pmp coding sequences is constant within a Chlamydia species. The level of conservation of the Pmp proteins has previously only been determined for Chlamydia trachomatis. As different Pmp proteins might be indispensable for the pathogenesis of different Chlamydia species, this study investigated the conservation of Pmp proteins both within and across C. trachomatis, C. pneumoniae, C. abortus, and C. psittaci. The pmp coding sequences were annotated in 16 C. trachomatis, 6 C. pneumoniae, 2 C. abortus, and 16 C. psittaci genomes. The number and organization of polymorphic membrane coding sequences differed within and across the analyzed Chlamydia species. The length of coding sequences of pmpA, pmpB, and pmpH was conserved among all analyzed genomes, while the length of pmpE/F and pmpG, and remarkably also of the subtype pmpD, differed among the analyzed genomes. PmpD, PmpA, PmpH, and PmpA were the most conserved Pmp in C. trachomatis, C. pneumoniae, C. abortus, and C. psittaci, respectively. PmpB was the most conserved Pmp across the 4 analyzed Chlamydia species.

  9. High-Throughput Sequencing Reveals Diverse Sets of Conserved, Nonconserved, and Species-Specific miRNAs in Jute.

    PubMed

    Islam, Md Tariqul; Ferdous, Ahlan Sabah; Najnin, Rifat Ara; Sarker, Suprovath Kumar; Khan, Haseena

    2015-01-01

    MicroRNAs play a pivotal role in regulating a broad range of biological processes, acting by cleaving mRNAs or by translational repression. A group of plant microRNAs are evolutionarily conserved; however, others are expressed in a species-specific manner. Jute is an agroeconomically important fibre crop; nonetheless, no practical information is available for microRNAs in jute to date. In this study, Illumina sequencing revealed a total of 227 known microRNAs and 17 potential novel microRNA candidates in jute, of which 164 belong to 23 conserved families and the remaining 63 belong to 58 nonconserved families. Among a total of 81 identified microRNA families, 116 potential target genes were predicted for 39 families and 11 targets were predicted for 4 among the 17 identified novel microRNAs. For understanding better the functions of microRNAs, target genes were analyzed by Gene Ontology and their pathways illustrated by KEGG pathway analyses. The presence of microRNAs identified in jute was validated by stem-loop RT-PCR followed by end point PCR and qPCR for randomly selected 20 known and novel microRNAs. This study exhaustively identifies microRNAs and their target genes in jute which will ultimately pave the way for understanding their role in this crop and other crops.

  10. High-Throughput Sequencing Reveals Diverse Sets of Conserved, Nonconserved, and Species-Specific miRNAs in Jute

    PubMed Central

    Islam, Md. Tariqul; Ferdous, Ahlan Sabah; Najnin, Rifat Ara; Sarker, Suprovath Kumar; Khan, Haseena

    2015-01-01

    MicroRNAs play a pivotal role in regulating a broad range of biological processes, acting by cleaving mRNAs or by translational repression. A group of plant microRNAs are evolutionarily conserved; however, others are expressed in a species-specific manner. Jute is an agroeconomically important fibre crop; nonetheless, no practical information is available for microRNAs in jute to date. In this study, Illumina sequencing revealed a total of 227 known microRNAs and 17 potential novel microRNA candidates in jute, of which 164 belong to 23 conserved families and the remaining 63 belong to 58 nonconserved families. Among a total of 81 identified microRNA families, 116 potential target genes were predicted for 39 families and 11 targets were predicted for 4 among the 17 identified novel microRNAs. For understanding better the functions of microRNAs, target genes were analyzed by Gene Ontology and their pathways illustrated by KEGG pathway analyses. The presence of microRNAs identified in jute was validated by stem-loop RT-PCR followed by end point PCR and qPCR for randomly selected 20 known and novel microRNAs. This study exhaustively identifies microRNAs and their target genes in jute which will ultimately pave the way for understanding their role in this crop and other crops. PMID:25861616

  11. Complete mitochondrial DNA sequence of the endangered giant sable antelope (Hippotragus niger variani): insights into conservation and taxonomy.

    PubMed

    Espregueira Themudo, Gonçalo; Rufino, Ana C; Campos, Paula F

    2015-02-01

    The giant sable antelope is one of the most endangered African bovids. Populations of this iconic animal, the national symbol of Angola, were recently rediscovered, after many decades of presumed extinction. Even so, their numbers are scarce and hence conservation plans are essential. However, fundamental information such as its taxonomic position, time of divergence and degree of genetic variation are still lacking. Here, we used a museum preserved horn as a source of DNA to describe, for the first time, the complete mitochondrial genome of the giant sable antelope, and provide insights into its evolutionary history. Reads generated by shotgun sequencing were mapped against the mitochondrial genome of common sable antelope and the nuclear genomes of cow and sheep. Phylogenetic reconstruction and divergence time estimate give support to the monophyly of the giant sable and a maximum divergence time of 170 thousand years to the closest subspecies. About 7% of the nuclear genome was mapped against the reference. The genetic resources reported here are now available for future work in the field of conservation genetics and phylogeny, in this and related species.

  12. Classification of mouse VK groups based on the partial amino acid sequence to the first invariant tryptophan: impact of 14 new sequences from IgG myeloma proteins.

    PubMed

    Potter, M; Newell, J B; Rudikoff, S; Haber, E

    1982-12-01

    Fourteen new VK sequences derived from BALB/c IgG myeloma proteins were determined to the first invariant tryptophan (Trp 35). These partial sequences were compared with 65 other published VK sequences using a computer program. The 79 sequences were organized according to the length of the sequence from the amino terminus to the first invariant tryptophan (Trp 35), into seven groups (33, 34, 35, 36, 39, 40 and 41aa). A distance matrix of all 79 sequences was then computed, i.e. the number of amino acid substitutions necessary to convert one sequence to another was determined. From these data a dendrogram was constructed. Most of the VK sequences fell into clusters or closely related groups. The definition of a sequence group is arbitrary but facilitates the classification of VK proteins. We used 12 substitutions as the basis for defining a sequence group based on the known number of substitutions that are found in the VK21 proteins. By this criterion there were 18 groups in the Trp 35 dendrogram. Twelve of the 14 new sequences fell into one of these sequence groups; two formed new sequence groups. Collective amino acid sequencing is still encountering new VK structures indicating more sequences will be required to attain an accurate estimate of the total number of VK groups. Updated dendrograms can be quickly generated to include newly generated sequences.
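
    The procedure described here (count substitutions between every pair of sequences, then group sequences that fall within a cutoff) can be sketched as follows. This is an illustrative reconstruction, not the authors' program. It assumes equal-length sequences, treats "within the cutoff" as "at most `threshold` substitutions", and uses single-linkage grouping; all three are assumptions, as are the toy sequences and names. The toy cutoff of 2 stands in for the 12 substitutions used in the study.

```python
from itertools import combinations


def substitutions(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def distance_matrix(seqs: dict[str, str]) -> dict[tuple[str, str], int]:
    """Pairwise substitution counts, keyed by (name1, name2)."""
    return {(p, q): substitutions(seqs[p], seqs[q]) for p, q in combinations(seqs, 2)}


def sequence_groups(seqs: dict[str, str], threshold: int) -> list[list[str]]:
    """Single-linkage groups: link any pair differing by <= threshold substitutions."""
    parent = {name: name for name in seqs}

    def find(name: str) -> str:
        while parent[name] != name:
            parent[name] = parent[parent[name]]  # path halving
            name = parent[name]
        return name

    for (p, q), dist in distance_matrix(seqs).items():
        if dist <= threshold:
            parent[find(p)] = find(q)

    groups: dict[str, list[str]] = {}
    for name in seqs:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Toy amino-terminal fragments (hypothetical, not real VK sequences).
toy = {"vk_a": "DIVMTQ", "vk_b": "DIVLTQ", "vk_c": "EIQMSP"}
matrix = distance_matrix(toy)
groups = sequence_groups(toy, threshold=2)
print(matrix)
print(groups)
```

    Adding a newly determined sequence only requires adding one entry to the dictionary and rerunning the grouping, which mirrors the abstract's point that updated dendrograms can be regenerated quickly.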

  13. Oxygen affinity and amino acid sequence of myoglobins from endothermic and ectothermic fish.

    PubMed

    Marcinek, D J; Bonaventura, J; Wittenberg, J B; Block, B A

    2001-04-01

    Myoglobin (Mb) buffers intracellular O2 and facilitates diffusion of O2 through the cell. These functions of Mb will be most effective when intracellular PO2 is near the partial pressure of oxygen at which Mb is half saturated (P50) of the molecule. We test the hypothesis that Mb oxygen affinity has evolved such that it is conserved when adjusted for body temperature among closely related animals. We measure oxygen P50s tonometrically and oxygen dissociation rate constants with stopped flow and generate amino acid sequence from cDNA of Mbs from fish with different body temperatures. P50s for the endothermic bluefin tuna, skipjack tuna, and blue marlin at 20 degrees C were 0.62 +/- 0.02, 0.59 +/- 0.01, 0.58 +/- 0.04 mmHg, respectively, and were significantly lower than those for ectothermic bonito (1.03 +/- 0.07 mmHg) and mackerel (1.39 +/- 0.03 mmHg). Because the oxygen affinity of Mb decreases with increasing temperature, the above differences in oxygen affinity between endothermic and ectothermic fish are reduced when adjusted for the in vivo muscle temperature of the animal. Oxygen dissociation rate constants at 20 degrees C for the endothermic species ranged from 34.1 to 49.3 s(-1), whereas those for mackerel and bonito were 102 and 62 s(-1), respectively. Correlated with the low oxygen affinity and fast dissociation kinetics of mackerel Mb is a substitution of alanine for proline that would likely result in a more flexible mackerel protein.
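
    To make the role of P50 concrete, the sketch below computes the fractional oxygen saturation of Mb from P50, assuming simple non-cooperative (hyperbolic) binding, Y = PO2 / (PO2 + P50). The P50 values are the 20 degrees C means reported in the abstract; the intracellular PO2 of 1.0 mmHg is a hypothetical value chosen only for illustration, and the function name is an assumption.

```python
def mb_saturation(po2_mmhg: float, p50_mmhg: float) -> float:
    """Fractional saturation of a single-site O2 binder: Y = PO2 / (PO2 + P50)."""
    if po2_mmhg < 0 or p50_mmhg <= 0:
        raise ValueError("PO2 must be >= 0 and P50 must be > 0")
    return po2_mmhg / (po2_mmhg + p50_mmhg)


# Mean P50 values at 20 degrees C as reported in the abstract (mmHg).
p50_at_20c = {
    "bluefin tuna": 0.62,
    "skipjack tuna": 0.59,
    "blue marlin": 0.58,
    "bonito": 1.03,
    "mackerel": 1.39,
}

po2 = 1.0  # hypothetical intracellular PO2 in mmHg, for illustration only
saturation = {species: mb_saturation(po2, p50) for species, p50 in p50_at_20c.items()}
for species, y in saturation.items():
    print(f"{species}: {y:.2f}")
```

    Saturation is exactly 0.5 when PO2 equals P50, which is why Mb buffers O2 most effectively when intracellular PO2 is near P50. At a common temperature and PO2, the lower-P50 endotherm proteins are more saturated than those of bonito and mackerel.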

  14. Molecular cloning, nucleotide sequence, and abscisic acid induction of a suberization-associated highly anionic peroxidase.

    PubMed

    Roberts, E; Kolattukudy, P E

    1989-06-01

    A highly anionic peroxidase induced in suberizing cells was suggested to be the key enzyme involved in polymerization of phenolic monomers to generate the aromatic matrix of suberin. The enzyme encoded by a potato cDNA was found to be highly homologous to the anionic peroxidase induced in suberizing tomato fruit. A tomato genomic library was screened using the potato anionic peroxidase cDNA and one genomic clone was isolated that contained two tandemly oriented anionic peroxidase genes. These genes were sequenced and were 96% and 87% identical to the mRNA for potato anionic peroxidase. Both genes consist of three exons with the relative positions of their two introns being conserved between the two genes. Primer extension analysis showed that only one of the genes is expressed in the periderm of 3 day wound-healed tomato fruits. Southern blot analyses suggested that there are two copies each of the two highly homologous genes per haploid genome in both potato and tomato. Abscisic acid (ABA) induced the accumulation of the anionic peroxidase transcripts in potato and tomato callus tissues. Northern blots showed that peroxidase mRNA was detectable at 2 days and was maximal at 8 days after transfer of potato callus to solid agar media containing 10(-4) M ABA. The transcripts induced by ABA in both potato and tomato callus were identical in size to those induced in wound-healing potato tuber and tomato fruit. The anionic peroxidase peptide was detected in extracts of potato callus grown on the ABA-containing media by western blot analysis. The results support the suggestion that stimulation of suberization by ABA involves the induction of the highly anionic peroxidase.

  15. Amino acid sequence of myoglobin from the chiton Liolophura japonica and a phylogenetic tree for molluscan globins.

    PubMed

    Suzuki, T; Furukohri, T; Okamoto, S

    1993-02-01

    Myoglobin was isolated from the radular muscle of the chiton Liolophura japonica, a primitive archigastropodic mollusc. Liolophura contains three monomeric myoglobins (I, II, and III), and the complete amino acid sequence of myoglobin I has been determined. It is composed of 145 amino acid residues, and the molecular mass was calculated to be 16,070 D. The E7 distal histidine, which is replaced by valine or glutamine in several molluscan globins, is conserved in Liolophura myoglobin. The autoxidation rate at physiological conditions indicated that Liolophura oxymyoglobin is fairly stable when compared with other molluscan myoglobins. The amino acid sequence of Liolophura myoglobin shows low homology (11-21%) with molluscan dimeric myoglobins and hemoglobins, but shows higher homology (26-29%) with monomeric myoglobins from the gastropodic molluscs Aplysia, Dolabella, and Bursatella. A phylogenetic tree was constructed from 19 molluscan globin sequences. The tree separated them into two distinct clusters, a cluster for muscle myoglobins and a cluster for erythrocyte or gill hemoglobins. The myoglobin cluster is divided further into two subclusters, corresponding to monomeric and dimeric myoglobins, respectively. Liolophura myoglobin was placed on the branch of monomeric myoglobin lineage, showing that it diverged earlier from other monomeric myoglobins. The hemoglobin cluster is also divided into two subclusters. One cluster contains homodimeric, heterodimeric, tetrameric, and didomain chains of erythrocyte hemoglobins of the blood clams Anadara, Scapharca, and Barbatia. Of special interest is the other subcluster. It consists of three hemoglobin chains derived from the bacterial symbiont-harboring clams Calyptogena and Lucina, in which hemoglobins are supposed to play an important role in maintaining the symbiosis with sulfide bacteria.

  16. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1997-04-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided. 7 figs.

  17. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1997-01-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided.

  18. Identification and characterization of flowering genes in kiwifruit: sequence conservation and role in kiwifruit flower development

    PubMed Central

    2011-01-01

    Background: Flower development in kiwifruit (Actinidia spp.) is initiated in the first growing season, when undifferentiated primordia are established in latent shoot buds. These primordia can differentiate into flowers in the second growing season, after the winter dormancy period and upon accumulation of adequate winter chilling. Kiwifruit is an important horticultural crop, yet little is known about the molecular regulation of flower development. Results: To study kiwifruit flower development, nine MADS-box genes were identified and functionally characterized. Protein sequence alignment, phenotypes obtained upon overexpression in Arabidopsis and expression patterns suggest that the identified genes are required for floral meristem and floral organ specification. Their role during budbreak and flower development was studied. A spontaneous kiwifruit mutant was utilized to correlate the extended expression domains of these flowering genes with abnormal floral development. Conclusions: This study provides a description of flower development in kiwifruit at the molecular level. It has identified markers for flower development, and candidates for manipulation of kiwifruit growth, phase change and time of flowering. The expression in normal and aberrant flowers provided a model for kiwifruit flower development. PMID:21521532

  19. Amino acid sequence around the active-site serine residue in the acyltransferase domain of goat mammary fatty acid synthetase.

    PubMed Central

    Mikkelsen, J; Højrup, P; Rasmussen, M M; Roepstorff, P; Knudsen, J

    1985-01-01

    Goat mammary fatty acid synthetase was labelled in the acyltransferase domain by formation of O-ester intermediates by incubation with [1-14C]acetyl-CoA and [2-14C]malonyl-CoA. Tryptic-digest and CNBr-cleavage peptides were isolated and purified by high-performance reverse-phase and ion-exchange liquid chromatography. The sequences of the malonyl- and acetyl-labelled peptides were shown to be identical. The results confirm the hypothesis that both acetyl and malonyl groups are transferred to the mammalian fatty acid synthetase complex by the same transferase. The sequence is compared with those of other fatty acid synthetase transferases. PMID:3922356

  20. Adiponectin receptor 1 conserves docosahexaenoic acid and promotes photoreceptor cell survival

    PubMed Central

    Rice, Dennis S.; Calandria, Jorgelina M.; Gordon, William C.; Jun, Bokkyoo; Zhou, Yongdong; Gelfman, Claire M.; Li, Songhua; Jin, Minghao; Knott, Eric J.; Chang, Bo; Abuin, Alex; Issa, Tawfik; Potter, David; Platt, Kenneth A.; Bazan, Nicolas G.

    2015-01-01

    The identification of pathways necessary for photoreceptor and retinal pigment epithelium (RPE) function is critical to uncover therapies for blindness. Here we report the discovery of adiponectin receptor 1 (AdipoR1) as a regulator of these cells’ functions. Docosahexaenoic acid (DHA) is avidly retained in photoreceptors, while mechanisms controlling DHA uptake and retention are unknown. Thus, we demonstrate that AdipoR1 ablation results in DHA reduction. In situ hybridization reveals photoreceptor and RPE cell AdipoR1 expression, blunted in AdipoR1−/− mice. We also find decreased photoreceptor-specific phosphatidylcholine containing very long-chain polyunsaturated fatty acids and severely attenuated electroretinograms. These changes precede progressive photoreceptor degeneration in AdipoR1−/− mice. RPE-rich eyecup cultures from AdipoR1−/− reveal impaired DHA uptake. AdipoR1 overexpression in RPE cells enhances DHA uptake, whereas AdipoR1 silencing has the opposite effect. These results establish AdipoR1 as a regulatory switch of DHA uptake, retention, conservation and elongation in photoreceptors and RPE, thus preserving photoreceptor cell integrity. PMID:25736573

  1. Ligation with nucleic acid sequence-based amplification.

    PubMed

    Ong, Carmichael; Tai, Warren; Sarma, Aartik; Opal, Steven M; Artenstein, Andrew W; Tripathi, Anubhav

    2012-01-01

    This work presents a novel method for detecting nucleic acid targets using a ligation step along with an isothermal, exponential amplification step. We use an engineered ssDNA with two variable regions on the ends, allowing us to design the probe for optimal reaction kinetics and primer binding. This two-part probe is ligated by T4 DNA Ligase only when both parts bind adjacently to the target. The assay demonstrates that the expected 72-nt RNA product appears only when the synthetic target, T4 ligase, and both probe fragments are present during the ligation step. An extraneous 38-nt RNA product also appears due to linear amplification of unligated probe (P3), but its presence does not cause a false-positive result. In addition, 40 mmol/L KCl in the final amplification mix was found to be optimal. It was also found that increasing P5 in excess of P3 helped with ligation and reduced the extraneous 38-nt RNA product. The assay was also tested with a single nucleotide polymorphism target, changing one base at the ligation site. The assay was able to yield a negative signal despite only a single-base change. Finally, using P3 and P5 with longer binding sites results in increased overall sensitivity of the reaction, showing that increasing ligation efficiency can improve the assay overall. We believe that this method can be used effectively for a number of diagnostic assays.

  2. The Putative Leishmania Telomerase RNA (LeishTER) Undergoes Trans-Splicing and Contains a Conserved Template Sequence

    PubMed Central

    da Silva, Marcelo S.; Segatto, Marcela; Myler, Peter J.; Cano, Maria Isabel N.

    2014-01-01

    Telomerase RNAs (TERs) are highly divergent between species, varying in size and sequence composition. Here, we identify a candidate for the telomerase RNA component of Leishmania genus, which includes species that cause leishmaniasis, a neglected tropical disease. Merging a thorough computational screening combined with RNA-seq evidence, we mapped a non-coding RNA gene localized in a syntenic locus on chromosome 25 of five Leishmania species that shares partial synteny with both Trypanosoma brucei TER locus and a putative TER candidate-containing locus of Crithidia fasciculata. Using target-driven molecular biology approaches, we detected a ∼2,100 nt transcript (LeishTER) that contains a 5′ spliced leader (SL) cap, a putative 3′ polyA tail and a predicted C/D box snoRNA domain. LeishTER is expressed at similar levels in the logarithmic and stationary growth phases of promastigote forms. A 5′SL capped LeishTER co-immunoprecipitated and co-localized with the telomerase protein component (TERT) in a cell cycle-dependent manner. Prediction of its secondary structure strongly suggests the existence of a bona fide single-stranded template sequence and a conserved C[U/C]GUCA motif-containing helix II, representing the template boundary element. This study paves the way for further investigations on the biogenesis of parasite TERT ribonucleoproteins (RNPs) and its role in parasite telomere biology. PMID:25391020
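
    The conserved C[U/C]GUCA motif named in this abstract can be written directly as a regular expression. The following is a minimal sketch, assuming plain-string RNA input; the function name `find_motif` and the toy sequence are hypothetical illustrations and are not taken from the study, which located the motif through secondary-structure prediction rather than a bare pattern scan.

```python
import re

# C[U/C]GUCA from the abstract: C, then U or C, then GUCA.
MOTIF = re.compile(r"C[UC]GUCA")


def find_motif(rna: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping motif hit."""
    # Accept lower case and DNA-style input by normalising to RNA letters.
    rna = rna.upper().replace("T", "U")
    return [(m.start(), m.group()) for m in MOTIF.finditer(rna)]


# Hypothetical toy sequence (not LeishTER) containing both motif variants.
toy = "GGACUGUCAAACCGUCAGG"
print(find_motif(toy))  # [(3, 'CUGUCA'), (11, 'CCGUCA')]
```

    A pattern hit alone says nothing about whether the site sits in helix II; it only flags candidate positions for structural inspection.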

  3. The putative Leishmania telomerase RNA (LeishTER) undergoes trans-splicing and contains a conserved template sequence.

    PubMed

    Vasconcelos, Elton J R; Nunes, Vinícius S; da Silva, Marcelo S; Segatto, Marcela; Myler, Peter J; Cano, Maria Isabel N

    2014-01-01

    Telomerase RNAs (TERs) are highly divergent between species, varying in size and sequence composition. Here, we identify a candidate for the telomerase RNA component of Leishmania genus, which includes species that cause leishmaniasis, a neglected tropical disease. Merging a thorough computational screening combined with RNA-seq evidence, we mapped a non-coding RNA gene localized in a syntenic locus on chromosome 25 of five Leishmania species that shares partial synteny with both Trypanosoma brucei TER locus and a putative TER candidate-containing locus of Crithidia fasciculata. Using target-driven molecular biology approaches, we detected a ∼2,100 nt transcript (LeishTER) that contains a 5' spliced leader (SL) cap, a putative 3' polyA tail and a predicted C/D box snoRNA domain. LeishTER is expressed at similar levels in the logarithmic and stationary growth phases of promastigote forms. A 5'SL capped LeishTER co-immunoprecipitated and co-localized with the telomerase protein component (TERT) in a cell cycle-dependent manner. Prediction of its secondary structure strongly suggests the existence of a bona fide single-stranded template sequence and a conserved C[U/C]GUCA motif-containing helix II, representing the template boundary element. This study paves the way for further investigations on the biogenesis of parasite TERT ribonucleoproteins (RNPs) and its role in parasite telomere biology.

  4. Thin-film technology for direct visual detection of nucleic acid sequences: applications in clinical research.

    PubMed

    Jenison, Robert D; Bucala, Richard; Maul, Diana; Ward, David C

    2006-01-01

    Certain optical conditions permit the unaided eye to detect thickness changes on surfaces on the order of 20 Å, which are of similar dimensions to monomolecular interactions between proteins or hybridization of complementary nucleic acid sequences. Such detection exploits specific interference of reflected white light, wherein thickness changes are perceived as surface color changes. This technology, termed thin-film detection, allows for the visualization of subattomole amounts of nucleic acid targets, even in complex clinical samples. Thin-film technology has been applied to a broad range of clinically relevant indications, including the detection of pathogenic bacterial and viral nucleic acid sequences and the discrimination of sequence variations in human genes causally related to susceptibility or severity of disease.

  5. RNA internal standard synthesis by nucleic acid sequence-based amplification for competitive quantitative amplification reactions.

    PubMed

    Lo, Wan-Yu; Baeumner, Antje J

    2007-02-15

    Nucleic acid sequence-based amplification (NASBA) reactions have been demonstrated to successfully synthesize new sequences based on deletion and insertion reactions. Two RNA internal standards were synthesized for use in competitive amplification reactions in which quantitative analysis can be achieved by coamplifying the internal standard with the wild type sample. The sequences were created in two consecutive NASBA reactions using the E. coli clpB mRNA sequence as model analyte. The primer sequences of the wild type sequence were maintained, and a 20-nt-long segment inside the amplicon region was exchanged for a new segment of similar GC content and melting temperature. The new RNA sequence was thus amplifiable using the wild type primers and detectable via a new inserted sequence. In the first reaction, the forwarding primer and an additional 20-nt-long sequence were deleted and replaced by a new 20-nt-long sequence. In the second reaction, a forwarding primer carrying the wild type primer sequence as a 5' overhang was used. The presence of pure internal standard was verified using electrochemiluminescence and RNA lateral-flow biosensor analysis. Additional sequence deletion in order to shorten the internal standard amplicons and thus generate higher detection signals was found not to be required. Finally, a competitive NASBA reaction between one internal standard and the wild type sequence was carried out, proving its functionality. This new rapid construction method via NASBA provides advantages over the traditional techniques since it requires no traditional cloning procedures, no thermocyclers, and can be completed in less than 4 h.

  6. Mutagenesis of conserved amino acids of Helicobacter pylori fur reveals residues important for function.

    PubMed

    Carpenter, Beth M; Gancz, Hanan; Benoit, Stéphane L; Evans, Sarah; Olsen, Cara H; Michel, Sarah L J; Maier, Robert J; Merrell, D Scott

    2010-10-01

    The ferric uptake regulator (Fur) of the medically important pathogen Helicobacter pylori is unique in that it has been shown to function as a repressor both in the presence of an Fe2+ cofactor and in its apo (non-Fe2+-bound) form. However, virtually nothing is known concerning the amino acid residues that are important for Fur functioning. Therefore, mutations in six conserved amino acid residues of H. pylori Fur were constructed and analyzed for their impact on both iron-bound and apo repression. In addition, accumulation of the mutant proteins, protein secondary structure, DNA binding ability, iron binding capacity, and the ability to form higher-order structures were also examined for each mutant protein. While none of the mutated residues completely abrogated the function of Fur, we were able to identify residues that were critical for both iron-bound and apo-Fur repression. One mutation, V64A, did not alter regulation of any target genes. However, each of the five remaining mutations showed an effect on either iron-bound or apo regulation. Of these, H96A, E110A, and E117A mutations altered iron-bound Fur regulation and were all shown to influence iron binding to different extents. Additionally, the H96A mutation was shown to alter Fur oligomerization, and the E110A mutation was shown to impact oligomerization and DNA binding. Conversely, the H134A mutant exhibited changes in apo-Fur regulation that were the result of alterations in DNA binding. Although the E90A mutant exhibited alterations in apo-Fur regulation, this mutation did not affect any of the assessed protein functions. This study is the first for H. pylori to analyze the roles of specific amino acid residues of Fur in function and continues to highlight the complexity of Fur regulation in this organism.

  7. Identification of conserved hepatic transcriptomic responses to 17β-estradiol using high-throughput sequencing in brown trout

    PubMed Central

    Uren Webster, Tamsyn M.; Shears, Janice A.; Moore, Karen

    2015-01-01

    Estrogenic chemicals are major contaminants of surface waters and can threaten the sustainability of natural fish populations. Characterization of the global molecular mechanisms of toxicity of environmental contaminants has been conducted primarily in model species rather than species with limited existing transcriptomic or genomic sequence information. We aimed to investigate the global mechanisms of toxicity of an endocrine disrupting chemical of environmental concern [17β-estradiol (E2)] using high-throughput RNA sequencing (RNA-Seq) in an environmentally relevant species, brown trout (Salmo trutta). We exposed mature males to measured concentrations of 1.94, 18.06, and 34.38 ng E2/l for 4 days and sequenced three individual liver samples per treatment using an Illumina HiSeq 2500 platform. Exposure to 34.4 ng E2/L resulted in 2,113 differentially regulated transcripts (FDR < 0.05). Functional analysis revealed upregulation of processes associated with vitellogenesis, including lipid metabolism, cellular proliferation, and ribosome biogenesis, together with a downregulation of carbohydrate metabolism. Using real-time quantitative PCR, we validated the expression of eight target genes and identified significant differences in the regulation of several known estrogen-responsive transcripts in fish exposed to the lower treatment concentrations (including esr1 and zp2.5). We successfully used RNA-Seq to identify highly conserved responses to estrogen and also identified some estrogen-responsive transcripts that have been less well characterized, including nots and tgm2l. These results demonstrate the potential application of RNA-Seq as a valuable tool for assessing mechanistic effects of pollutants in ecologically relevant species for which little genomic information is available. PMID:26082144

  8. Structural analysis of the regulatory elements of the type-II procollagen gene. Conservation of promoter and first intron sequences between human and mouse.

    PubMed Central

    Vikkula, M; Metsäranta, M; Syvänen, A C; Ala-Kokko, L; Vuorio, E; Peltonen, L

    1992-01-01

    Transcription of the type-II procollagen gene (COL2A1) is very specifically restricted to a limited number of tissues, particularly cartilages. In order to identify transcription-control motifs we have sequenced the promoter region and the first intron of the human and mouse COL2A1 genes. With the assumption that these motifs should be well conserved during evolution, we have searched for potential elements important for the tissue-specific transcription of the COL2A1 gene by aligning the two sequences with each other and with the available rat type-II procollagen sequence for the promoter. With this approach we could identify specific evolutionarily well-conserved motifs in the promoter area. On the other hand, several suggested regulatory elements in the promoter region did not show evolutionary conservation. In the middle of the first intron we found a cluster of well-conserved transcription-control elements and we conclude that these conserved motifs most probably possess a significant function in the control of the tissue-specific transcription of the COL2A1 gene. We also describe locations of additional, highly conserved nucleotide stretches, which are good candidate regions in the search for binding sites of yet-uncharacterized cartilage-specific transcription regulators of the COL2A1 gene. PMID:1637314

  9. Amino acid sequences of two nonspecific lipid-transfer proteins from germinated castor bean.

    PubMed

    Takishima, K; Watanabe, S; Yamada, M; Suga, T; Mamiya, G

    1988-11-01

    The amino acid sequences of two nonspecific lipid-transfer proteins (nsLTP), B and C, from germinated castor bean seeds have been determined. Both proteins consist of 92 residues, as does the nsLTP reported previously, and their calculated Mr values are 9847 and 9593 for nsLTP-B and nsLTP-C, respectively. The sequences of nsLTP-B and nsLTP-C, compared to the known sequence of nsLTP-A from the same source, are 68% and 35% similar, respectively. No variation was found at the positions of the cysteine residues, indicating that they might be involved in disulfide bridges.

  10. A classification of glycosyl hydrolases based on amino acid sequence similarities.

    PubMed Central

    Henrissat, B

    1991-01-01

    The amino acid sequences of 301 glycosyl hydrolases and related enzymes have been compared. A total of 291 sequences corresponding to 39 EC entries could be classified into 35 families. Only ten sequences (less than 5% of the sample) could not be assigned to any family. With the sequences available for this analysis, 18 families were found to be monospecific (containing only one EC number) and 17 were found to be polyspecific (containing at least two EC numbers). Implications on the folding characteristics and mechanism of action of these enzymes and on the evolution of carbohydrate metabolism are discussed. With the steady increase in sequence and structural data, it is suggested that the enzyme classification system should perhaps be revised. PMID:1747104
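
    The monospecific/polyspecific distinction used in this abstract depends only on how many distinct EC numbers fall into each family. The following is a minimal sketch of that labelling rule; the function name `classify_families` and the family labels in the example are hypothetical placeholders and do not reproduce the actual family assignments of the study.

```python
from collections import defaultdict


def classify_families(assignments):
    """Label each family by the number of distinct EC numbers it contains.

    `assignments` is an iterable of (family, ec_number) pairs. A family with
    exactly one EC number is 'monospecific'; two or more is 'polyspecific'.
    """
    ec_by_family = defaultdict(set)
    for family, ec in assignments:
        ec_by_family[family].add(ec)
    return {
        family: "monospecific" if len(ecs) == 1 else "polyspecific"
        for family, ecs in ec_by_family.items()
    }


# Hypothetical toy input: family names are placeholders, not real assignments.
toy = [
    ("famA", "3.2.1.4"),
    ("famA", "3.2.1.4"),   # a second sequence with the same EC number
    ("famB", "3.2.1.4"),
    ("famB", "3.2.1.91"),
]
print(classify_families(toy))  # {'famA': 'monospecific', 'famB': 'polyspecific'}

# Consistency of the counts quoted in the abstract.
assert 291 + 10 == 301       # classified + unassigned sequences
assert 18 + 17 == 35         # monospecific + polyspecific families
assert 10 / 301 < 0.05       # "less than 5% of the sample"
```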

  11. Complete amino acid sequence of the N-terminal extension of calf skin type III procollagen.

    PubMed Central

    Brandt, A; Glanville, R W; Hörlein, D; Bruckner, P; Timpl, R; Fietzek, P P; Kühn, K

    1984-01-01

    The N-terminal extension peptide of type III procollagen, isolated from foetal-calf skin, contains 130 amino acid residues. To determine its amino acid sequence, the peptide was reduced and carboxymethylated or aminoethylated and fragmented with trypsin, Staphylococcus aureus V8 proteinase and bacterial collagenase. Pyroglutamate aminopeptidase was used to deblock the N-terminal collagenase fragment to enable amino acid sequencing. The type III collagen extension peptide is homologous to that of the alpha 1 chain of type I procollagen with respect to a three-domain structure. The N-terminal 79 amino acids, which contain ten of the 12 cysteine residues, form a compact globular domain. The next 39 amino acids are in a collagenase triplet sequence (Gly-Xaa-Yaa)n with a high hydroxyproline content. Finally, another short non-collagenous domain of 12 amino acids ends at the cleavage site for procollagen aminopeptidase, which cleaves a proline-glutamine bond. In contrast with type I procollagen, the type III procollagen extension peptides contain interchain disulphide bridges located at the C-terminus of the triple-helical domain. PMID:6331392

  12. Identification of Structural and Catalytic Classes of Highly Conserved Amino Acid Residues in Lysine 2,3-Aminomutase

    PubMed Central

    Chen, Dawei; Frey, Perry A.; Lepore, Bryan W.; Ringe, Dagmar; Ruzicka, Frank J.

    2008-01-01

    Lysine 2,3-aminomutase (LAM) from Clostridium subterminale SB4 catalyzes the interconversion of (S)-lysine and (S)-β-lysine by a radical mechanism involving coenzymatic actions of S-adenosylmethionine (SAM), a [4Fe-4S] cluster, and pyridoxal-5′-phosphate (PLP). The enzyme contains a number of conserved acidic residues and a cysteine- and arginine-rich motif that binds iron and sulfide in the [4Fe–4S] cluster. The results of activity and iron, sulfide, and PLP analysis of variants resulting from site-specific mutations of the conserved acidic residues and the arginine residues in the iron-sulfide binding motif indicate two classes of conserved residues of each type. Mutations of the conserved residues Arg134, Asp293, and Asp330 abolish all enzymatic activity. Based on the x-ray crystal structure, these residues bind the ε-aminium and α-carboxylate groups of (S)-lysine. However, among these residues only Asp293 appears to be important for stabilizing the [4Fe–4S] cluster. Members of a second group of conserved residues appear to stabilize the structure of LAM. Mutations of arginine residues 130, 135, and 136 and acidic residues Glu86, Asp165, Glu236, and Asp172 dramatically decrease iron and sulfide contents in the purified variants. Mutation of Asp96 significantly decreases iron and sulfide content. Variants in Arg130 or Asp172 display no detectable activity, whereas variants in the other positions display low to very low activities. Structural roles are assigned to this latter class of conserved amino acids. In particular, a network of hydrogen bonded interactions of Arg130, Glu86, Arg135 and the main chain carbonyl groups of Cys132 and Leu55 appears to stabilize the [4Fe–4S] cluster. PMID:17042481

  13. Conserved biosynthetic pathways for phosalacine, bialaphos and newly discovered phosphonic acid natural products

    PubMed Central

    Blodgett, Joshua A. V; Zhang, Jun Kai; Yu, Xiaomin; Metcalf, William W.

    2015-01-01

    Natural products containing phosphonic or phosphinic acid functionalities often display potent biological activities with applications in medicine and agriculture. The herbicide phosphinothricin-tripeptide (PTT) was the first phosphinate natural product discovered, yet despite numerous studies, questions remain surrounding key transformations required for its biosynthesis. In particular, the enzymology required to convert phosphonoformate to carboxyphosphonoenolpyruvate and the mechanisms underlying phosphorus-methylation remain poorly understood. In addition, the model for NRPS assembly of the intact tripeptide product has undergone numerous revisions that have yet to be experimentally tested. To further investigate the biosynthesis of this unusual natural product, we completely sequenced the PTT biosynthetic locus from Streptomyces hygroscopicus and compared it to the orthologous cluster from Streptomyces viridochromogenes. We also sequenced and analysed the closely related phosalacine (PAL) biosynthetic locus from Kitasatospora phosalacinea. Using data drawn from the comparative analysis of the PTT and PAL pathways, we also evaluate three related recently discovered phosphonate biosynthetic loci from Streptomyces sviceus, Streptomyces sp. WM6386 and Frankia alni. Our observations address long-standing biosynthetic questions related to PTT and PAL production and suggest that additional members of this pharmacologically important class await discovery. PMID:26328935

  14. Conserved biosynthetic pathways for phosalacine, bialaphos and newly discovered phosphonic acid natural products.

    PubMed

    Blodgett, Joshua A V; Zhang, Jun Kai; Yu, Xiaomin; Metcalf, William W

    2016-01-01

    Natural products containing phosphonic or phosphinic acid functionalities often display potent biological activities with applications in medicine and agriculture. The herbicide phosphinothricin-tripeptide (PTT) was the first phosphinate natural product discovered, yet despite numerous studies, questions remain surrounding key transformations required for its biosynthesis. In particular, the enzymology required to convert phosphonoformate to carboxyphosphonoenolpyruvate and the mechanisms underlying phosphorus methylation remain poorly understood. In addition, the model for non-ribosomal peptide synthetase assembly of the intact tripeptide product has undergone numerous revisions that have yet to be experimentally tested. To further investigate the biosynthesis of this unusual natural product, we completely sequenced the PTT biosynthetic locus from Streptomyces hygroscopicus and compared it with the orthologous cluster from Streptomyces viridochromogenes. We also sequenced and analyzed the closely related phosalacine (PAL) biosynthetic locus from Kitasatospora phosalacinea. Using data drawn from the comparative analysis of the PTT and PAL pathways, we also evaluate three related recently discovered phosphonate biosynthetic loci from Streptomyces sviceus, Streptomyces sp. WM6386 and Frankia alni. Our observations address long-standing biosynthetic questions related to PTT and PAL production and suggest that additional members of this pharmacologically important class await discovery.

  15. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... readable form may be created by any means, such as word processors, nucleotide/amino acid sequence...

  16. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... readable form may be created by any means, such as word processors, nucleotide/amino acid sequence...

  17. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... readable form may be created by any means, such as word processors, nucleotide/amino acid sequence...

  18. Ultra-deep sequencing of ribosome-associated poly-adenylated RNA in early Drosophila embryos reveals hundreds of conserved translated sORFs.

    PubMed

    Li, Hongmei; Hu, Chuansheng; Bai, Ling; Li, Hua; Li, Mingfa; Zhao, Xiaodong; Czajkowsky, Daniel M; Shao, Zhifeng

    2016-12-01

    There is growing recognition that small open reading frames (sORFs) encoding peptides shorter than 100 amino acids are an important class of functional elements in the eukaryotic genome, with several already identified to play critical roles in growth, development, and disease. However, our understanding of their biological importance has been hindered owing to the significant technical challenges limiting their annotation. Here we combined ultra-deep sequencing of ribosome-associated poly-adenylated RNAs with rigorous conservation analysis to identify a comprehensive population of translated sORFs during early Drosophila embryogenesis. In total, we identify 399 sORFs, including those previously annotated but without evidence of translational capacity, those found within transcripts previously classified as non-coding, and those not previously known to be transcribed. Further, we find, for the first time, evidence for translation of many sORFs with different isoforms, suggesting their regulation is as complex as longer ORFs. Furthermore, many sORFs are found not associated with ribosomes in late-stage Drosophila S2 cells, suggesting that many of the translated sORFs may have stage-specific functions during embryogenesis. These results thus provide the first comprehensive annotation of the sORFs present during early Drosophila embryogenesis, a necessary basis for a detailed delineation of their function in embryogenesis and other biological processes.
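
    The working definition of an sORF in this abstract (an open reading frame encoding a peptide shorter than 100 amino acids) can be illustrated with a simple sequence scan. The following is a minimal sketch, assuming forward-strand DNA input and the standard stop codons; the function name `find_sorfs` is hypothetical, and this naive scan is not the authors' method, which relied on ribosome-associated RNA sequencing and conservation analysis.

```python
STOPS = {"TAA", "TAG", "TGA"}  # standard stop codons


def find_sorfs(seq, max_codons=100):
    """Return (start, end, n_codons) for ATG..stop ORFs shorter than max_codons.

    Scans the three forward frames only. `start` and `end` are 0-based,
    end-exclusive, and `end` includes the stop codon. `n_codons` counts the
    encoded amino acids (start codon included, stop codon excluded).
    """
    seq = seq.upper()
    hits = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame ATG opens a candidate ORF
            elif start is not None and codon in STOPS:
                n_codons = (i - start) // 3
                if n_codons < max_codons:
                    hits.append((start, i + 3, n_codons))
                start = None  # resume the search after this stop
    return sorted(hits)


# Hypothetical toy sequence: ATG AAA TTT TAG embedded in flanking bases.
print(find_sorfs("CCATGAAATTTTAGCC"))  # [(2, 14, 3)]
```

    A scan of this kind finds far more candidates than are actually translated, which is why the study filters by ribosome association and conservation.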

  19. Purification to homogeneity and partial amino acid sequence of a fragment which includes the methyl acceptor site of the human DNA repair protein for O6-methylguanine.

    PubMed

    Major, G N; Gardner, E J; Carne, A F; Lawley, P D

    1990-03-25

    DNA repair by O6-methylguanine-DNA methyltransferase (O6-MT) is accomplished by removal by the enzyme of the methyl group from premutagenic O6-methylguanine-DNA, thereby restoring native guanine in DNA. The methyl group is transferred to an acceptor site cysteine thiol group in the enzyme, which causes the irreversible inactivation of O6-MT. We detected a variety of different forms of the methylated, inactivated enzyme in crude extracts of human spleen of molecular weights higher and lower than the usually observed 21-24 kDa for the human O6-MT. Several apparent fragments of the methylated form of the protein were purified to homogeneity following reaction of partially-purified extract enzyme with O6-[3H-CH3]methylguanine-DNA substrate. One of these fragments yielded amino acid sequence information spanning fifteen residues, which was identified as probably belonging to human methyltransferase by virtue of both its significant sequence homology to three procaryote forms of O6-MT encoded by the ada, ogt (both from E. coli) and dat (B. subtilis) genes, and sequence position of the radiolabelled methyl group which matched the position of the conserved procaryote methyl acceptor site cysteine residue. Statistical prediction of secondary structure indicated good homologies between the human fragment and corresponding regions of the constitutive form of O6-MT in procaryotes (ogt and dat gene products), but not with the inducible ada protein, indicating the possibility that we had obtained partial amino acid sequence for a non-inducible form of the human enzyme. The identity of the fragment sequence as belonging to human methyltransferase was more recently confirmed by comparison with cDNA-derived amino acid sequence from the cloned human O6-MT gene from HeLa cells (1). The two sequences compared well, with only three out of fifteen amino acids being different (and two of them by only one nucleotide in each codon).

  20. Complete amino acid sequence of branched-chain amino acid aminotransferase (transaminase B) of Salmonella typhimurium, identification of the coenzyme-binding site and sequence comparison analysis

    SciTech Connect

    Feild, M.J.

    1988-01-01

    The complete amino acid sequence of the subunit of branched-chain amino acid aminotransferase of Salmonella typhimurium was determined by automated Edman degradation of peptide fragments generated by chemical and enzymatic digestion of S-carboxymethylated and S-pyridylethylated transaminase B. Peptide fragments of transaminase B were generated by treatment of the enzyme with trypsin, Staphylococcus aureus V8 protease, endoproteinase Lys-C, and cyanogen bromide. Protocols were developed for separation of the peptide fragments by reverse-phase high performance liquid chromatography (HPLC), ion-exchange HPLC, and SDS-urea gel electrophoresis. The enzyme subunit contains 308 amino acid residues and has a molecular weight of 33,920 daltons. The coenzyme-binding site was determined by treatment of the enzyme, containing bound pyridoxal 5-phosphate, with tritiated sodium borohydride prior to trypsin digestion. Monitoring radioactivity incorporation and peptide map comparisons with an apoenzyme tryptic digest, allowed identification of the pyridoxylated-peptide which was isolated by reverse-phase HPLC and sequenced. The coenzyme-binding site is a lysyl residue at position 159. Some peptides were further characterized by fast atom bombardment mass spectrometry.

  1. Marker genes that are less conserved in their sequences are useful for predicting genome-wide similarity levels between closely related prokaryotic strains

    SciTech Connect

    Lan, Yemin; Rosen, Gail; Hershberg, Ruth

    2016-05-03

    The 16s rRNA gene is so far the most widely used marker for taxonomical classification and separation of prokaryotes. Since it is universally conserved among prokaryotes, it is possible to use this gene to classify a broad range of prokaryotic organisms. At the same time, it has often been noted that the 16s rRNA gene is too conserved to separate between prokaryotes at finer taxonomic levels. In this paper, we examine how well levels of similarity of 16s rRNA and 73 additional universal or nearly universal marker genes correlate with genome-wide levels of gene sequence similarity. We demonstrate that the percent identity of 16s rRNA predicts genome-wide levels of similarity very well for distantly related prokaryotes, but not for closely related ones. In closely related prokaryotes, we find that there are many other marker genes for which levels of similarity are much more predictive of genome-wide levels of gene sequence similarity. Finally, we show that the identities of the markers that are most useful for predicting genome-wide levels of similarity within closely related prokaryotic lineages vary greatly between lineages. However, the most useful markers are always those that are least conserved in their sequences within each lineage. In conclusion, our results show that by choosing markers that are less conserved in their sequences within a lineage of interest, it is possible to better predict genome-wide gene sequence similarity between closely related prokaryotes than is possible using the 16s rRNA gene. We point readers towards a database we have created (POGO-DB) that can be used to easily establish which markers show lowest levels of sequence conservation within different prokaryotic lineages.
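
    The percent identity referred to in this abstract is, in its simplest form, the share of aligned positions at which two sequences carry the same residue. The following is a minimal sketch, assuming the two sequences are already aligned to equal length and that columns with a gap in either sequence are excluded from the denominator; that gap convention and the function name `percent_identity` are assumptions for illustration, and the paper may define the quantity differently.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of gap-free aligned columns in which a and b are identical.

    Both inputs must be pre-aligned strings of equal length, with '-' marking
    gaps. Columns containing a gap in either sequence are skipped (one of
    several conventions in use).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"
    ]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignments.
print(percent_identity("ACGTACGT", "ACGTTCGT"))  # 87.5
print(percent_identity("AC-GT", "ACAGT"))        # 100.0
```

    Applying the same function per marker gene and comparing the values against a genome-wide average is the kind of comparison the abstract describes, although the alignment step itself is outside this sketch.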

  2. Marker genes that are less conserved in their sequences are useful for predicting genome-wide similarity levels between closely related prokaryotic strains

    DOE PAGES

    Lan, Yemin; Rosen, Gail; Hershberg, Ruth

    2016-05-03

    The 16s rRNA gene is so far the most widely used marker for taxonomical classification and separation of prokaryotes. Since it is universally conserved among prokaryotes, it is possible to use this gene to classify a broad range of prokaryotic organisms. At the same time, it has often been noted that the 16s rRNA gene is too conserved to separate between prokaryotes at finer taxonomic levels. In this paper, we examine how well levels of similarity of 16s rRNA and 73 additional universal or nearly universal marker genes correlate with genome-wide levels of gene sequence similarity. We demonstrate that the percent identity of 16s rRNA predicts genome-wide levels of similarity very well for distantly related prokaryotes, but not for closely related ones. In closely related prokaryotes, we find that there are many other marker genes for which levels of similarity are much more predictive of genome-wide levels of gene sequence similarity. Finally, we show that the identities of the markers that are most useful for predicting genome-wide levels of similarity within closely related prokaryotic lineages vary greatly between lineages. However, the most useful markers are always those that are least conserved in their sequences within each lineage. In conclusion, our results show that by choosing markers that are less conserved in their sequences within a lineage of interest, it is possible to better predict genome-wide gene sequence similarity between closely related prokaryotes than is possible using the 16s rRNA gene. We point readers towards a database we have created (POGO-DB) that can be used to easily establish which markers show lowest levels of sequence conservation within different prokaryotic lineages.

  3. Sequence conservation in the C-terminal region of spider silk proteins (Spidroin) from Nephila clavipes (Tetragnathidae) and Araneus bicentenarius (Araneidae).

    PubMed

    Beckwitt, R; Arcidiacono, S

    1994-03-04

    The polymerase chain reaction (PCR) has been used to amplify the portion of the Spidroin 1 gene that codes for the C-terminal part of the silk protein of the spider Nephila clavipes. Along with some substitution mutations of minor consequence, the PCR-derived sequence reveals an additional base missing from the previously published Nephila Spidroin 1 sequence. Comparison of the PCR-derived sequence with the equivalent region of Spidroin 2 indicates that the insertion of this single base results in greatly increased similarity in the resulting amino acid sequences of Spidroin 1 and Spidroin 2 (75% over 97 amino acids). The same PCR primers also amplified a fragment of the same length from Araneus bicentenarius. This sequence is also very similar to Spidroin 1 of Nephila (71% over 238 bases excluding the PCR primers, which translates into 76% over 79 amino acids).

  4. The amino acid sequence of cytochromes c-551 from three species of Pseudomonas

    PubMed Central

    Ambler, R. P.; Wynn, Margaret

    1973-01-01

    The amino acid sequences of the cytochromes c-551 from three species of Pseudomonas have been determined. Each resembles the protein from Pseudomonas strain P6009 (now known to be Pseudomonas aeruginosa, not Pseudomonas fluorescens) in containing 82 amino acids in a single peptide chain, with a haem group covalently attached to cysteine residues 12 and 15. In all four sequences 43 residues are identical. Although by bacteriological criteria the organisms are closely related, the differences between pairs of sequences range from 22% to 39%. These values should be compared with the differences in the sequence of mitochondrial cytochrome c between mammals and amphibians (about 18%) or between mammals and insects (about 33%). Detailed evidence for the amino acid sequences of the proteins has been deposited as Supplementary Publication SUP 50015 at the National Lending Library for Science and Technology, Boston Spa, Yorks. LS23 7BQ, U.K., from whom copies can be obtained on the terms indicated in Biochem. J. (1973), 131, 5. PMID:4352718
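    The pairwise differences and the count of residues identical in all sequences quoted above can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length (as the 82-residue cytochromes are); the toy sequences and the names `percent_difference` and `invariant_positions` are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def percent_difference(a: str, b: str) -> float:
    """Percentage of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return 100.0 * mismatches / len(a)


def invariant_positions(seqs: list[str]) -> int:
    """Number of aligned columns in which every sequence has the same residue."""
    return sum(1 for column in zip(*seqs) if len(set(column)) == 1)


# Hypothetical 10-residue toy sequences, NOT real cytochrome c-551 data.
toy = {
    "sp1": "ACDEFGHIKL",
    "sp2": "ACDEYGHIKM",
    "sp3": "ACNEFGHLKL",
}

# Percent difference for every unordered pair of sequences.
pairwise = {
    (m, n): percent_difference(toy[m], toy[n]) for m, n in combinations(toy, 2)
}
# Columns conserved across all sequences.
shared = invariant_positions(list(toy.values()))

for pair, diff in pairwise.items():
    print(pair, f"{diff:.0f}% different")
print("invariant positions:", shared)
```

    With real data, the same two functions would be applied to the four aligned 82-residue sequences.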

  5. Draft Genome Sequence of Sorghum Grain Mold Fungus Epicoccum sorghinum, a Producer of Tenuazonic Acid

    PubMed Central

    Oliveira, Rodrigo C.; Davenport, Karen W.; Hovde, Blake; Silva, Danielle; Chain, Patrick S. G.; Correa, Benedito

    2017-01-01

    ABSTRACT The facultative plant pathogen Epicoccum sorghinum is associated with grain mold of sorghum and produces the mycotoxin tenuazonic acid. This fungus can have serious economic impact on sorghum production. Here, we report the draft genome sequence of E. sorghinum (USPMTOX48). PMID:28126937

  6. Snake venom. The amino acid sequence of protein A from Dendroaspis polylepis polylepis (black mamba) venom.

    PubMed

    Joubert, F J; Strydom, D J

    1980-12-01

    Protein A from Dendroaspis polylepis polylepis venom comprises 81 amino acids, including ten half-cystine residues. The complete primary structures of protein A and its variant A' were elucidated. The sequences of proteins A and A', which differ in a single position, show no homology with various neurotoxins and non-neurotoxic proteins and represent a new type of elapid venom protein.

  7. Draft Genome Sequence of Bacillus coagulans NL01, a Wonderful l-Lactic Acid Producer

    PubMed Central

    Zheng, Zhaojuan; Jiang, Ting; Lin, Xi; Zhou, Jie

    2015-01-01

    Here, we report the draft genome sequence of Bacillus coagulans NL01, which could produce high optically pure l-lactic acid using xylose as a sole carbon source. The draft genome is 3,505,081 bp, with 144 contigs. About 3,903 protein-coding genes and 92 rRNAs are predicted from this assembly. PMID:26089419

  8. Amino acid sequences of heterotrophic and photosynthetic ferredoxins from the tomato plant (Lycopersicon esculentum Mill.).

    PubMed

    Kamide, K; Sakai, H; Aoki, K; Sanada, Y; Wada, K; Green, L S; Yee, B C; Buchanan, B B

    1995-11-01

    Several forms (isoproteins) of ferredoxin in roots, leaves, and green and red pericarps in tomato plants (Lycopersicon esculentum Mill.) were earlier identified on the basis of N-terminal amino acid sequence and chromatographic behavior (Green et al. 1991). In the present study, a large scale preparation made possible determination of the full length amino acid sequence of the two ferredoxins from leaves. The ferredoxins characteristic of fruit and root were sequenced from the amino terminus to the 30th residue or beyond. The leaf ferredoxins were confirmed to be expressed in pericarp of both green and red fruit. The ferredoxins characteristic of fruit and root appeared to be restricted to those tissues. The results extend earlier findings in demonstrating that ferredoxin occurs in the major organs of the tomato plant where it appears to function irrespective of photosynthetic competence.

  9. ICAP-1, a Novel β1 Integrin Cytoplasmic Domain–associated Protein, Binds to a Conserved and Functionally Important NPXY Sequence Motif of β1 Integrin

    PubMed Central

    Chang, David D.; Wong, Carol; Smith, Healy; Liu, Jenny

    1997-01-01

    The cytoplasmic domains of integrins are essential for cell adhesion. We report identification of a novel protein, ICAP-1 (integrin cytoplasmic domain–associated protein-1), which binds to the β1 integrin cytoplasmic domain. The interaction between ICAP-1 and β1 integrins is highly specific, as demonstrated by the lack of interaction between ICAP-1 and the cytoplasmic domains of other β integrins, and requires a conserved and functionally important NPXY sequence motif found in the COOH-terminal region of the β1 integrin cytoplasmic domain. Mutational studies reveal that Asn and Tyr of the NPXY motif and a Val residue located NH2-terminal to this motif are critical for the ICAP-1 binding. Two isoforms of ICAP-1, a 200–amino acid protein (ICAP-1α) and a shorter 150–amino acid protein (ICAP-1β), derived from alternatively spliced mRNA, are expressed in most cells. ICAP-1α is a phosphoprotein and the extent of its phosphorylation is regulated by the cell–matrix interaction. First, an enhancement of ICAP-1α phosphorylation is observed when cells are plated on fibronectin-coated but not on nonspecific poly-l-lysine–coated surfaces. Second, the expression of a constitutively activated RhoA protein that disrupts the cell–matrix interaction results in dephosphorylation of ICAP-1α. The regulation of ICAP-1α phosphorylation by the cell–matrix interaction suggests an important role of ICAP-1 during integrin-dependent cell adhesion. PMID:9281591

  10. Nucleotide sequence and the encoded amino acids of human apolipoprotein A-I mRNA.

    PubMed Central

    Law, S W; Brewer, H B

    1984-01-01

    The cDNA clones encoding the precursor form of human liver apolipoprotein A-I (apoA-I), preproapoA-I, have been isolated from a cDNA library. A 17-base synthetic oligonucleotide based on residues 108-113 of apoA-I and a 26-base primer-extended, dideoxynucleotide-terminated cDNA were used as hybridization probes to select for recombinant plasmids bearing the apoA-I sequence. The complete nucleic acid sequence of human liver preproapoA-I has been determined by analysis of the cloned cDNA. The sequence is composed of 801 nucleotides encoding 267 amino acid residues. PreproapoA-I contains an 18-amino-acid prepeptide and a 6-amino-acid propeptide connected to the amino terminus of the 243-amino acid mature apoA-I. Southern blotting analysis of chromosomal DNA obtained from peripheral blood indicated the apoA-I gene is contained in a 2.1-kilobase-pair Pst I fragment and there is no gross difference in structural organization between the normal apoA-I gene and the Tangier disease apoA-I gene. PMID:6198645
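    The lengths reported above are internally consistent, which a short arithmetic sketch makes explicit: 801 nucleotides read as triplets give 267 codons, and the three segments of the precursor sum to the same figure. The variable names are illustrative only; the numbers are those stated in the abstract.

```python
CODON_LENGTH = 3  # nucleotides per codon

coding_nucleotides = 801               # length of the coding sequence (from the abstract)
prepeptide, propeptide, mature = 18, 6, 243  # segment lengths in residues (from the abstract)

# Number of whole codons, and any leftover nucleotides that would not fit a triplet.
encoded_residues, remainder = divmod(coding_nucleotides, CODON_LENGTH)

# Total length of preproapoA-I as the sum of its three segments.
precursor_residues = prepeptide + propeptide + mature

print(encoded_residues, remainder, precursor_residues)
```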

  11. CodaChrome: a tool for the visualization of proteome conservation across all fully sequenced bacterial genomes

    PubMed Central

    2014-01-01

    Background The relationships between bacterial genomes are complicated by rampant horizontal gene transfer, varied selection pressures, acquisition of new genes, loss of genes, and divergence of genes, even in closely related lineages. As more and more bacterial genomes are sequenced, organizing and interpreting the incredible amount of relational information that connects them becomes increasingly difficult. Results We have developed CodaChrome (http://www.sourceforge.com/p/codachrome), a one-versus-all proteome comparison tool that allows the user to visually investigate the relationship between a bacterial proteome of interest and the proteomes encoded by every other bacterial genome recorded in GenBank in a massive interactive heat map. This tool has allowed us to rapidly identify the most highly conserved proteins encoded in the bacterial pan-genome, fast-clock genes useful for subtyping of bacterial species, the evolutionary history of an indel in the Sphingobium lineage, and an example of horizontal gene transfer from a member of the genus Enterococcus to a recent ancestor of Helicobacter pylori. Conclusion CodaChrome is a user-friendly and powerful tool for simultaneously visualizing relationships between thousands of proteomes. PMID:24460813

  12. A novel, evolutionarily conserved gene family with putative sequence-specific single-stranded DNA-binding activity.

    PubMed

    Castro, Patricia; Liang, Hong; Liang, Jan C; Nagarajan, Lalitha

    2002-07-01

    Complete and partial deletions of chromosome 5q are recurrent cytogenetic anomalies associated with aggressive myeloid malignancies. Earlier, we identified an approximately 1.5-Mb region of loss at 5q13.3 between the loci D5S672 and D5S620 in primary leukemic blasts. A leukemic cell line, ML3, is diploid for all of chromosome 5, except for an inversion-coupled translocation within the D5S672-D5S620 interval. Here, we report the development of a bacterial artificial chromosome (BAC) contig to define the breakpoint and the identification of a novel gene SSBP2, the target of disruption in ML3 cells. A preliminary evaluation of SSBP2 as a tumor suppressor gene in primary leukemic blasts and cell lines suggests that the remaining allele does not undergo intragenic mutations. SSBP2 is one of three members of a closely related, evolutionarily conserved, and ubiquitously expressed gene family. SSBP3 is the human ortholog of a chicken gene, CSDP, that encodes a sequence-specific single-stranded DNA-binding protein. SSBP3 localizes to chromosome 1p31.3, and the third member, SSBP4, maps to chromosome 19p13.1. Chromosomal localization and the putative single-stranded DNA-binding activity suggest that all three members of this family are capable of potential tumor suppressor activity by gene dosage or other epigenetic mechanisms.

  13. The sequence organization of Yp/proximal Xq homologous regions of the human sex chromosomes is highly conserved

    SciTech Connect

    Sargent, C.A.; Briggs, H.; Chalmers, I.J.

    1996-03-01

    Detailed deletion analysis of patients with breakpoints in Yp has allowed the definition of two distinct intervals on the Y chromosome short arm outside the pseudoautosomal region that are homologous to Xq21.3. Detailed YAC contigs have been developed over these regions on both the X and Y chromosomes, and the relative order of markers has been compared to assess whether rearrangements on either sex chromosome have occurred since the transposition events creating these patterns of homology. On the X chromosome, the region forms almost one contiguous block of homology, whereas on the Y chromosome, there has been one major rearrangement leading to the two separate Yp-Xq21 blocks of homology. The rearrangement breakpoint has been mapped. Within these separate X-Y homologous blocks on Yp, the order of loci homologous to X has been conserved to a high degree between the sex chromosomes. With the exception of the amelogenin gene (proximal Yp block), all the X-Y homologous sequences in the two Yp blocks have homologues in Xq21.3, with the former having its X counterpart in Xp22.2. This suggests an independent evolutionary event leading to the formation of the amelogenin X-Y homology. 45 refs., 4 figs., 1 tab.

  14. Comparative Genome Sequence Analysis Reveals the Extent of Diversity and Conservation for Glycan-Associated Proteins in Burkholderia spp.

    PubMed Central

    Ong, Hui San; Mohamed, Rahmah; Firdaus-Raih, Mohd

    2012-01-01

    Members of the Burkholderia family occupy diverse ecological niches. In pathogenic family members, glycan-associated proteins are often linked to functions that include virulence, protein conformation maintenance, surface recognition, cell adhesion, and immune system evasion. Comparative analysis of available Burkholderia genomes has revealed a core set of 178 glycan-associated proteins shared by all Burkholderia of which 68 are homologous to known essential genes. The genome sequence comparisons revealed insights into species-specific gene acquisitions through gene transfers, identified an S-layer protein, and proposed that significantly reactive surface proteins are associated to sugar moieties as a potential means to circumvent host defense mechanisms. The comparative analysis using a curated database of search queries enabled us to gain insights into the extent of conservation and diversity, as well as the possible virulence-associated roles of glycan-associated proteins in members of the Burkholderia spp. The curated list of glycan-associated proteins used can also be directed to screen other genomes for glycan-associated homologs. PMID:22991502

  15. Site-directed mutagenesis of conserved amino acids in the alpha subunit of toluene dioxygenase: potential mononuclear non-heme iron coordination sites.

    PubMed Central

    Jiang, H; Parales, R E; Lynch, N A; Gibson, D T

    1996-01-01

    The terminal oxygenase component of toluene dioxygenase from Pseudomonas putida F1 is an iron-sulfur protein (ISP(TOL)) that requires mononuclear iron for enzyme activity. Alignment of all available predicted amino acid sequences for the large (alpha) subunits of terminal oxygenases showed a conserved cluster of potential mononuclear iron-binding residues. These were between amino acids 210 and 230 in the alpha subunit (TodC1) of ISP(TOL). The conserved amino acids, Glu-214, Asp-219, Tyr-221, His-222, and His-228, were each independently replaced with an alanine residue by site-directed mutagenesis. Tyr-266 in TodC1, which has been suggested as an iron ligand, was treated in an identical manner. To assay toluene dioxygenase activity in the presence of TodC1 and its mutant forms, conditions for the reconstitution of wild-type ISP(TOL) activity from TodC1 and purified TodC2 (beta subunit) were developed and optimized. A mutation at Glu-214, Asp-219, His-222, or His-228 completely abolished toluene dioxygenase activity. TodC1 with an alanine substitution at either Tyr-221 or Tyr-266 retained partial enzyme activity (42 and 12%, respectively). In experiments with [14C]toluene, the two Tyr-->Ala mutations caused a reduction in the amount of cis-[14C]-toluene dihydrodiol formed, whereas a mutation at Glu-214, Asp-219, His-222, or His-228 eliminated cis-toluene dihydrodiol formation. The expression level of all of the mutated TodC1 proteins was equivalent to that of wild-type TodC1 as judged by sodium dodecyl sulfate-polyacrylamide gel electrophoresis and Western blot (immunoblot) analyses. These results, in conjunction with the predicted amino acid sequences of 22 oxygenase components, suggest that the conserved motif Glu-X3-4-Asp-X2-His-X4-5-His is critical for catalytic function and the glutamate, aspartate, and histidine residues may act as mononuclear iron ligands at the site of oxygen activation. PMID:8655491
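    The conserved motif described above (Glu, 3 to 4 residues, Asp, 2 residues, His, 4 to 5 residues, His) maps naturally onto a regular expression over one-letter amino acid codes. The following is a minimal sketch, not the authors' method: the function name `find_motif` and the toy sequence are hypothetical, and the real TodC1 sequence is not reproduced here.

```python
import re

# Glu-X(3-4)-Asp-X(2)-His-X(4-5)-His in one-letter code; "." stands for any residue.
MOTIF = re.compile(r"E.{3,4}D.{2}H.{4,5}H")


def find_motif(seq: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched segment) for each non-overlapping hit."""
    return [(m.start() + 1, m.group()) for m in MOTIF.finditer(seq)]


# Hypothetical toy sequence with one embedded motif, NOT a real oxygenase sequence.
toy_sequence = "MKT" + "EAAAADGGHLLLLLH" + "SVR"
hits = find_motif(toy_sequence)
print(hits)
```

    The spacing agrees with the residue numbers in the abstract: Glu-214 to Asp-219 leaves four intervening residues, Asp-219 to His-222 leaves two, and His-222 to His-228 leaves five.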

  16. Software scripts for quality checking of high-throughput nucleic acid sequencers.

    PubMed

    Lazo, G R; Tong, J; Miller, R; Hsia, C; Rausch, C; Kang, Y; Anderson, O D

    2001-06-01

    We have developed a graphical interface to allow the researcher to view and assess the quality of sequencing results using a series of program scripts developed to process data generated by automated sequencers. The scripts are written in the Perl programming language and are executable under the cgi-bin directory of a Web server environment. The scripts direct nucleic acid sequencing trace file data output from automated sequencers to be analyzed by the phred molecular biology program and are displayed as graphical hypertext mark-up language (HTML) pages. The scripts are mainly designed to handle 96-well microtiter dish samples, but the scripts are also able to read data from 384-well microtiter dishes 96 samples at a time. The scripts may be customized for different laboratory environments and computer configurations. Web links to the sources and discussion page are provided.
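    Two ideas in this abstract lend themselves to a short sketch: the phred quality score, which is conventionally defined as Q = -10 log10(P) for base-call error probability P, and the processing of a 384-well plate 96 samples at a time. The sketch below is in Python rather than the Perl of the original scripts, and it is not a reconstruction of them; in particular, the simple sequential batching and the well labels are assumptions, since the abstract does not say how wells are grouped.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Convert a phred quality score Q to the base-call error probability P."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Convert a base-call error probability P to a phred quality score Q."""
    return -10 * math.log10(p)


def batches(items: list[str], size: int = 96) -> list[list[str]]:
    """Split a list of samples into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


# Hypothetical well labels; sequential grouping is an assumption, not the scripts' scheme.
wells_384 = [f"well_{i:03d}" for i in range(1, 385)]
plate_batches = batches(wells_384)

print(len(plate_batches), "batches of", len(plate_batches[0]))
print("Q20 error probability:", phred_to_error_probability(20))
```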

  17. NemaFootPrinter: a web based software for the identification of conserved non-coding genome sequence regions between C. elegans and C. briggsae

    PubMed Central

    Rambaldi, Davide; Guffanti, Alessandro; Morandi, Paolo; Cassata, Giuseppe

    2005-01-01

    Background NemaFootPrinter (Nematode Transcription Factor Scan Through Philogenetic Footprinting) is a web-based software for interactive identification of conserved, non-exonic DNA segments in the genomes of C. elegans and C. briggsae. It has been implemented according to the following project specifications: a) Automated identification of orthologous gene pairs. b) Interactive selection of the boundaries of the genes to be compared. c) Pairwise sequence comparison with a range of different methods. d) Identification of putative transcription factor binding sites on conserved, non-exonic DNA segments. Results Starting from a C. elegans or C. briggsae gene name or identifier, the software identifies the putative ortholog (if any), based on information derived from public nematode genome annotation databases. The investigator can then retrieve the genome DNA sequences of the two orthologous genes; visualize graphically the genes' intron/exon structure and the surrounding DNA regions; select, through an interactive graphical user interface, subsequences of the two gene regions. Using a bioinformatics toolbox (Blast2seq, Dotmatcher, Ssearch and connection to the rVista database) the investigator is able at the end of the procedure to identify and analyze significant sequence similarities, detecting the presence of transcription factor binding sites corresponding to the conserved segments. The software automatically masks exons. Discussion This software is intended as a practical and intuitive tool for the researchers interested in the identification of non-exonic conserved sequence segments between C. elegans and C. briggsae. These sequences may contain regulatory transcriptional elements since they are conserved between two related, but rapidly evolving genomes. This software also highlights the power of genome annotation databases when they are conceived as an open resource and the possibilities offered by seamless integration of different web services via the http

  18. Amino acid sequence of band-3 protein from rainbow trout erythrocytes derived from cDNA.

    PubMed Central

    Hübner, S; Michel, F; Rudloff, V; Appelhans, H

    1992-01-01

    In this report we present the first complete band-3 cDNA sequence of a poikilothermic lower vertebrate. The primary structure of the anion-exchange protein band 3 (AE1) from rainbow trout erythrocytes was determined by nucleotide sequencing of cDNA clones. The overlapping clones have a total length of 3827 bp with a 5'-terminal untranslated region of 150 bp, a 2754 bp open reading frame and a 3'-untranslated region of 924 bp. Band-3 protein from trout erythrocytes consists of 918 amino acid residues with a calculated molecular mass of 101 827 Da. Comparison of its amino acid sequence revealed a 60-65% identity within the transmembrane spanning sequence of band-3 proteins published so far. An additional insertion of 24 amino acid residues within the membrane-associated domain of trout band-3 protein was identified, which until now was thought to be a general feature only of mammalian band-3-related proteins. PMID:1637296

  19. Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    ScienceCinema

    Patel, Kamlesh D. (Ken) [SNL]

    2016-07-12

    Kamlesh (Ken) Patel from Sandia National Laboratories (Livermore, California) presents "Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June, 2012 in Santa Fe, NM.

  20. Cross-species conservation of complementary amino acid-ribonucleobase interactions and their potential for ribosome-free encoding

    PubMed Central

    Cannon, John G. D.; Sherman, Rachel M.; Wang, Victoria M. Y.; Newman, Grace A.

    2015-01-01

    The role of amino acid-RNA nucleobase interactions in the evolution of RNA translation and protein-mRNA autoregulation remains an open area of research. We describe the inference of pairwise amino acid-RNA nucleobase interaction preferences using structural data from known RNA-protein complexes. We observed significant matching between an amino acid’s nucleobase affinity and corresponding codon content in both the standard genetic code and mitochondrial variants. Furthermore, we showed that knowledge of nucleobase preferences allows statistically significant prediction of protein primary sequence from mRNA using purely physiochemical information. Interestingly, ribosomal primary sequences were more accurately predicted than non-ribosomal sequences, suggesting a potential role for direct amino acid-nucleobase interactions in the genesis of amino acid-based ribosomal components. Finally, we observed matching between amino acid-nucleobase affinities and corresponding mRNA sequences in 35 evolutionarily diverse proteomes. We believe these results have important implications for the study of the evolutionary origins of the genetic code and protein-mRNA cross-regulation. PMID:26656258

  1. Role of the two-component leader sequence and mature amino acid sequences in extracellular export of endoglucanase EGL from Pseudomonas solanacearum.

    PubMed Central

    Huang, J Z; Schell, M A

    1992-01-01

    The egl gene of Pseudomonas solanacearum encodes a 43-kDa extracellular endoglucanase (mEGL) involved in wilt disease caused by this phytopathogen. Egl is initially translated with a 45-residue, two-part leader sequence. The first 19 residues are apparently removed by signal peptidase II during export of Egl across the inner membrane (IM); the remaining residues of the leader sequence (modified with palmitate) are removed during export across the outer membrane (OM). Localization of Egl-PhoA fusion proteins showed that the first 26 residues of the Egl leader sequence are required and sufficient to direct lipid modification, processing, and export of Egl or PhoA across the IM but not the OM. Fusions of the complete 45-residue leader sequence or of the leader and increasing portions of mEgl sequences to PhoA did not cause its export across the OM. In-frame deletion of portions of mEGL-coding sequences blocked export of the truncated polypeptides across the OM without affecting export across the IM. These results indicate that the first part of the leader sequence functions independently to direct export of Egl across the IM while the second part and sequences and structures in mEGL are involved in export across the OM. Computer analysis of the mEgl amino acid sequence obtained from its nucleotide sequence identified a region of mEGL similar in amino acid sequence to regions in other prokaryotic endoglucanases. PMID:1735723

  2. Studies on adenosine triphosphate transphosphorylases. Amino acid sequence of rabbit muscle ATP-AMP transphosphorylase.

    PubMed

    Kuby, S A; Palmieri, R H; Frischat, A; Fischer, A H; Wu, L H; Maland, L; Manship, M

    1984-05-22

    The total amino acid sequence of rabbit muscle adenylate kinase has been determined, and the single polypeptide chain of 194 amino acid residues starts with N-acetylmethionine and ends with leucyllysine at its carboxyl terminus, in agreement with the earlier data on its amino acid composition [Mahowald, T. A., Noltmann, E. A., & Kuby, S. A. (1962) J. Biol. Chem. 237, 1138-1145] and its carboxyl-terminus sequence [Olson, O. E., & Kuby, S. A. (1964) J. Biol. Chem. 239, 460-467]. Elucidation of the primary structure was based on tryptic and chymotryptic cleavages of the performic acid oxidized protein, cyanogen bromide cleavages of the 14C-labeled S-carboxymethylated protein at its five methionine sites (followed by maleylation of peptide fragments), and tryptic cleavages at its 12 arginine sites of the maleylated 14C-labeled S-carboxymethylated protein. Calf muscle myokinase, whose sequence has also been established, differs from rabbit muscle myokinase primarily in the following: His-30 is replaced by Gln-30; Lys-56 is replaced by Met-56; Ala-84 and Asp-85 are replaced by Val-84 and Asn-85. A comparison of the four muscle-type adenylate kinases, whose covalent structures have now been determined, viz., rabbit, calf, porcine, and human [for the latter two sequences see Heil, A., Müller, G., Noda, L., Pinder, T., Schirmer, H., Schirmer, I., & Von Zabern, I. (1974) Eur. J. Biochem. 43, 131-144, and Von Zabern, I., Wittmann-Liebold, B., Untucht-Grau, R., Schirmer, R. H., & Pai, E. F. (1976) Eur. J. Biochem. 68, 281-290], demonstrates an extraordinary degree of homology.(ABSTRACT TRUNCATED AT 250 WORDS)

  3. The complete amino acid sequence of a trypsin inhibitor from Bauhinia variegata var. candida seeds.

    PubMed

    Di Ciero, L; Oliva, M L; Torquato, R; Köhler, P; Weder, J K; Camillo Novello, J; Sampaio, C A; Oliveira, B; Marangoni, S

    1998-11-01

    Trypsin inhibitors of two varieties of Bauhinia variegata seeds have been isolated and characterized. Bauhinia variegata candida trypsin inhibitor (BvcTI) and B. variegata lilac trypsin inhibitor (BvlTI) are proteins with Mr of about 20,000 without free sulfhydryl groups. Amino acid analysis shows a high content of aspartic acid, glutamic acid, serine, and glycine, and a low content of histidine, tyrosine, methionine, and lysine in both inhibitors. Isoelectric focusing for both varieties detected three isoforms (pI 4.85, 5.00, and 5.15), which were resolved by HPLC procedure. The trypsin inhibitors show Ki values of 6.9 and 1.2 nM for BvcTI and BvlTI, respectively. The N-terminal sequences of the three trypsin inhibitor isoforms from both varieties of Bauhinia variegata and the complete amino acid sequence of B. variegata var. candida L. trypsin inhibitor isoform 3 (BvcTI-3) are presented. The sequences have been determined by automated Edman degradation of the reduced and carboxymethylated proteins of the peptides resulting from Staphylococcus aureus protease and trypsin digestion. BvcTI-3 is composed of 167 residues and has a calculated molecular mass of 18,529. Homology studies with other trypsin inhibitors show that BvcTI-3 belongs to the Kunitz family. The putative active site encompasses Arg (63)-Ile (64).

  4. Multiple site-selective insertions of non-canonical amino acids into sequence-repetitive polypeptides

    PubMed Central

    Wu, I-Lin; Patterson, Melissa A.; Carpenter Desai, Holly E.; Mehl, Ryan A.; Giorgi, Gianluca

    2013-01-01

    A simple and efficient method is described for introduction of non-canonical amino acids at multiple, structurally defined sites within recombinant polypeptide sequences. E. coli MRA30, a bacterial host strain with attenuated activity for release factor 1 (RF1), is assessed for its ability to support the incorporation of a diverse range of non-canonical amino acids in response to multiple encoded amber (TAG) codons within genetic templates derived from superfolder GFP and an elastin-mimetic protein polymer. Suppression efficiency and isolated protein yield were observed to depend on the identity of the orthogonal aminoacyl-tRNA synthetase/tRNACUA pair and the non-canonical amino acid substrate. This approach afforded elastin-mimetic protein polymers containing non-canonical amino acid derivatives at up to twenty-two positions within the repeat sequence with high levels of substitution. The identity and position of the variant residues was confirmed by mass spectrometric analysis of the full-length polypeptides and proteolytic cleavage fragments resulting from thermolysin digestion. The accumulated data suggest that this multi-site suppression approach permits the preparation of protein-based materials in which novel chemical functionality can be introduced at precisely defined positions within the polypeptide sequence. PMID:23625817

  5. Deduced amino acid sequence of human pulmonary surfactant proteolipid: SPL(pVal)

    SciTech Connect

    Whitsett, J.A.; Glasser, S.W.; Korfhagen, T.R.; Weaver, T.E.; Clark, J.; Pilot-Matias, T.; Meuth, J.; Fox, J.L.

    1987-05-01

    A hydrophobic, proteolipid-like protein of Mr 6500 was isolated from ether/ethanol extracts of human, canine and bovine pulmonary surfactant. Amino acid composition of the protein demonstrated a remarkable abundance of hydrophobic residues, particularly valine and leucine. The N-terminal amino acid sequence of the human protein was determined: N-Leu-Ile-Pro-Cys-Cys-Pro-Val-Asn-Leu-Lys-Arg-Leu-Leu-Ile-Val4... An oligonucleotide probe was used to screen an adult human lung cDNA library and resulted in detection of cDNA clones with predicted amino acid sequence with close identity to the N-terminal amino acid sequence of the human peptide. SPL(pVal) was found within the reading frame of a larger peptide. SPL(pVal) results from proteolytic processing of a larger preprotein. Northern blot analysis detected a single 1.0-kilobase SPL(pVal) RNA that was less abundant in fetal than in adult lung. Mixtures of purified canine and bovine SPL(pVal) and synthetic phospholipids display properties of rapid adsorption and surface tension lowering activity characteristic of surfactant. Human SPL(pVal) is a pulmonary surfactant proteolipid which may therefore be useful in combination with phospholipids and/or other surfactant proteins for the treatment of surfactant deficiency such as hyaline membrane disease in newborn infants.

  6. SUBGROUPS OF AMINO ACID SEQUENCES IN THE VARIABLE REGIONS OF IMMUNOGLOBULIN HEAVY CHAINS*

    PubMed Central

    Cunningham, Bruce A.; Pflumm, Mollie N.; Rutishauser, Urs; Edelman, Gerald M.

    1969-01-01

    The amino acid sequence of the first 133 residues of the heavy (γ) chain from a human γG immunoglobulin (He) has been determined. This γ-chain is identical in Gm type to that of protein Eu, the complete sequence of which has been reported. Comparison of the two sequences substantiates the previous suggestion that there are subgroups of variable regions of heavy chains. The variable region of Eu has been assigned to subgroup I and that of He to subgroup II; on the other hand, the constant regions of the two proteins appear to be identical. Comparison of the sequence of the heavy chain of He with the heavy chain sequences determined in other laboratories suggests that the variable region of subgroup II is at least 118 residues long. The nature and distribution of amino acid variations in this heavy chain subgroup resemble those observed in light chain subgroups. These studies provide evidence that the translocation hypothesis applies to heavy as well as to light chains, viz., genes for variable regions (V) are somatically translocated to genes for constant regions (C) to form complete VC structural genes. Images PMID:5264153

  7. Complete nucleic acid sequence of Penaeus stylirostris densovirus (PstDNV) from India.

    PubMed

    Rai, Praveen; Safeena, Muhammed P; Karunasagar, Iddya; Karunasagar, Indrani

    2011-06-01

    Infectious hypodermal and hematopoietic necrosis virus (IHHNV) of shrimp has recently been classified as Penaeus stylirostris densovirus (PstDNV). The complete nucleic acid sequence of PstDNV from India was obtained by cloning and sequencing different DNA fragments of the virus. The genome organisation of PstDNV revealed three major coding domains: a left ORF (NS1) of 2001 bp, a mid ORF (NS2) of 1092 bp and a right ORF (VP) of 990 bp. The complete genome and the amino acid sequences of the three proteins, viz. NS1, NS2 and VP, were compared with the genomes of the virus reported from Hawaii, China and Mexico and with partial sequences available from isolates from different regions. The phylogenetic analysis of shrimp, insect and vertebrate parvovirus sequences showed that the Indian PstDNV isolate is phylogenetically more closely related to one of the three isolates from Taiwan (AY355307), and two isolates (AY362547 and AY102034) from Thailand.
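
    As a quick consistency check on the three coding domains reported above, the following sketch converts each ORF length into a codon count. It assumes, which the abstract does not state, that each reported length includes the stop codon; the function name `orf_codons` is illustrative only.

```python
def orf_codons(length_bp: int) -> int:
    """Return the number of codons in an ORF, requiring a whole number of codons."""
    if length_bp % 3 != 0:
        raise ValueError(f"{length_bp} bp is not a multiple of 3")
    return length_bp // 3


# ORF lengths (bp) as reported in the abstract.
orfs = {"NS1": 2001, "NS2": 1092, "VP": 990}

for name, length_bp in orfs.items():
    codons = orf_codons(length_bp)
    # Assumption: the stop codon is counted in the reported length,
    # so the encoded protein has one residue fewer than the codon count.
    print(name, codons, codons - 1)
```

    All three lengths are multiples of three, as expected for complete open reading frames.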

  8. DNA Cloning of Plasmodium falciparum Circumsporozoite Gene: Amino Acid Sequence of Repetitive Epitope

    NASA Astrophysics Data System (ADS)

    Enea, Vincenzo; Ellis, Joan; Zavala, Fidel; Arnot, David E.; Asavanich, Achara; Masuda, Aoi; Quakyi, Isabella; Nussenzweig, Ruth S.

    1984-08-01

    A clone of complementary DNA encoding the circumsporozoite (CS) protein of the human malaria parasite Plasmodium falciparum has been isolated by screening an Escherichia coli complementary DNA library with a monoclonal antibody to the CS protein. The DNA sequence of the complementary DNA insert encodes a four-amino acid sequence: proline-asparagine-alanine-asparagine, tandemly repeated 23 times. The CS β-lactamase fusion protein specifically binds monoclonal antibodies to the CS protein and inhibits the binding of these antibodies to native Plasmodium falciparum CS protein. These findings provide a basis for the development of a vaccine against Plasmodium falciparum malaria.
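
    A tandem repeat of this kind can be located in a protein sequence with a simple pattern match. The sketch below is illustrative only: the repeat unit is written in one-letter code (PNAN for Pro-Asn-Ala-Asn), and the flanking residues in the toy sequence are invented, not taken from the CS protein.

```python
import re


def longest_tandem_run(sequence: str, unit: str) -> int:
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(sequence)]
    return max(runs, default=0)


# Hypothetical toy sequence: invented flanks around 23 copies of the repeat unit.
toy = "MKL" + "PNAN" * 23 + "GSA"
print(longest_tandem_run(toy, "PNAN"))
```

    Because the repeat is tandem, the same stretch could equally be read in another register (for example as NANP); the count then depends on which register is chosen as the unit.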

  9. Amino-Acid Sequence of NADP-Specific Glutamate Dehydrogenase of Neurospora crassa

    PubMed Central

    Wootton, John C.; Chambers, Geoffrey K.; Holder, Anthony A.; Baron, Andrew J.; Taylor, John G.; Fincham, John R. S.; Blumenthal, Kenneth M.; Moon, Kenneth; Smith, Emil L.

    1974-01-01

    A tentative primary structure of the NADP-specific glutamate dehydrogenase [L-glutamate: NADP oxidoreductase (deaminating), EC 1.4.1.4] from Neurospora crassa has been determined. The proposed sequence contains 452 amino-acid residues in each of the identical subunits of the hexameric enzyme. Comparison of the sequence with that of the bovine liver enzyme reveals considerable homology in the amino-terminal portion of the chain, including the vicinity of the reactive lysine, with only shorter stretches of homology within the carboxyl-terminal regions. The significance of this distribution of homologous regions is discussed. PMID:4155068

  10. Comparative sequence and structure analysis reveals the conservation and diversity of nucleotide positions and their associated tertiary interactions in the riboswitches.

    PubMed

    Appasamy, Sri D; Ramlan, Effirul Ikhwan; Firdaus-Raih, Mohd

    2013-01-01

    The tertiary motifs in complex RNA molecules play vital roles either in stabilizing the formation of RNA 3D structure or in providing important biological functionality to the molecule. In order to better understand the roles of these tertiary motifs in riboswitches, we examined 11 representative riboswitch PDB structures for potential agreement of both motif occurrences and conservations. A total of 61 unique tertiary interactions were found in the reference structures. In addition to the expected common A-minor motifs and base-triples mainly involved in linking distant regions of the riboswitch structures, three highly conserved variants of A-minor interactions called G-minors were found in the SAM-I and FMN riboswitches, where they appear to be involved in the recognition of the respective ligand's functional groups. From our structural survey as well as corresponding structure and sequence alignments, the agreement between motif occurrences and conservations is very prominent across the representative riboswitches. Our analysis provides evidence that some of these tertiary interactions are essential components in forming the structure, with their sequence positions conserved despite a high degree of diversity in other parts of the respective riboswitch sequences. This is indicative of a vital role for these tertiary interactions in determining the specific biological function of the riboswitch.

  11. Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

    DOEpatents

    Studier, F.W.

    1995-04-18

    Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient. 2 figs.
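
    To put the library sizes quoted above in context, the following sketch compares each one with the total number of possible primers of that length (4 to the power of the length, for the four bases). This is plain arithmetic on the figures in the abstract and says nothing about how the patent selects the primers; the function name is illustrative.

```python
def possible_primers(length: int) -> int:
    """Number of distinct oligonucleotides of the given length over A, C, G, T."""
    return 4 ** length


# Library sizes quoted in the abstract, keyed by primer length in bases.
libraries = {8: 10_000, 9: 14_200, 10: 44_000}

for length, size in libraries.items():
    total = possible_primers(length)
    print(f"{length}-mers: {size} of {total} possible ({100 * size / total:.1f}%)")
```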

  12. Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

    DOEpatents

    Studier, F. William

    1995-04-18

    Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient.

  13. Lactobacillus kefiri shows inter-strain variations in the amino acid sequence of the S-layer proteins.

    PubMed

    Malamud, Mariano; Carasi, Paula; Bronsoms, Sílvia; Trejo, Sebastián A; Serradell, María de Los Angeles

    2017-04-01

    The S-layer is a proteinaceous envelope constituted by subunits that self-assemble to form a two-dimensional lattice that covers the surface of different species of Bacteria and Archaea, and it could be involved in cell recognition of microbes, among several other distinct functions. In this work, both proteomic and genomic approaches were used to gain knowledge about the sequences of the genes encoding S-layer proteins (SLPs) expressed by six aggregative and sixteen non-aggregative strains of potentially probiotic Lactobacillus kefiri. Peptide mass fingerprint (PMF) analysis confirmed the identity of SLPs extracted from L. kefiri, and based on the homology with phylogenetically related species, primers located outside and inside the SLP-genes were employed to amplify genomic DNA. The O-glycosylation site SASSAS was found in all L. kefiri SLPs. Ten strains were selected for sequencing of the complete genes. The total length of the mature proteins varies from 492 to 576 amino acids, and all SLPs have a calculated pI between 9.37 and 9.60. The N-terminal region is relatively conserved and shows a high percentage of positively charged amino acids. Major differences among strains are found in the C-terminal region. Different groups could be distinguished regarding the mature SLPs and the similarities observed in the PMF spectra. Interestingly, SLPs of the aggregative strains are 100% homologous, although these strains were isolated from different kefir grains. This knowledge provides relevant data for better understanding of the mechanisms involved in SLPs functionality and could contribute to the development of products of biotechnological interest from potentially probiotic bacteria.

  14. Complete amino acid sequence and in vitro expression of rat NF-M, the middle molecular weight neurofilament protein.

    PubMed

    Napolitano, E W; Chin, S S; Colman, D R; Liem, R K

    1987-08-01

    A lambda gt11 expression library was prepared from rat brain and screened with a polyclonal antiserum, which recognizes both NF-H and NF-M. An NF-M cDNA clone (pNF-M3C = 1.6 kb) was isolated and characterized. The fusion protein of NF-M3C, when used as an affinity matrix for the anti-neurofilament serum, isolated a subpopulation of antibodies specific for NF-M. Northern analysis demonstrates a single band of approximately 3000 nt and a constant message level for NF-M during postnatal development from postnatal day 0 (P0) to adulthood. Using pNF-M3C as a probe, a second cDNA clone was isolated from a lambda gt11 rat brain expression library (pNF-M2D = 2.7 kb). The 2 clones were sequenced and pNF-M2D was found to encode the entire rat NF-M protein. The calculated molecular weight is 95,600, which is only 65% of the molecular weight determined by SDS-PAGE. The amino acid sequence of rat NF-M shows the conserved rod segment present in all intermediate filament proteins. The molecule also contains an unusual C-terminal extension with stretches of glutamic acid, which could contribute to the anomalous migration of this protein on SDS-PAGE and the fact that NF-M does not readily assemble into filaments. The pNF-M2D clone was transcribed and translated in vitro utilizing a rabbit reticulocyte lysate system. The resulting radiolabeled translation products were unexpectedly shown to comigrate with purified rat NF-M on 1- and 2-dimensional gels, even though the translated protein is not phosphorylated.
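
    The abstract gives the calculated molecular weight and its ratio to the SDS-PAGE value but not the SDS-PAGE value itself. The sketch below back-calculates that apparent value from the two stated figures; the result is only as precise as the rounded 65% figure, and the function name is illustrative.

```python
def apparent_mw(calculated_mw: float, fraction_of_apparent: float) -> float:
    """Back-calculate the apparent molecular weight from the calculated one."""
    if not 0 < fraction_of_apparent <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return calculated_mw / fraction_of_apparent


# Figures from the abstract: calculated MW 95,600, stated as 65% of the SDS-PAGE value.
estimate = apparent_mw(95_600, 0.65)
print(f"Implied SDS-PAGE molecular weight: about {estimate:,.0f}")
```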

  15. Identification of conserved and novel microRNAs in the Pacific oyster Crassostrea gigas by deep sequencing.

    PubMed

    Xu, Fei; Wang, Xiaotong; Feng, Yue; Huang, Wen; Wang, Wei; Li, Li; Fang, Xiaodong; Que, Huayong; Zhang, Guofan

    2014-01-01

    MicroRNAs (miRNAs) play important roles in regulatory processes in various organisms. To date many studies have been performed in the investigation of miRNAs of numerous bilaterians, but limited numbers of miRNAs have been identified in the few species belonging to the clade Lophotrochozoa. In the current study, deep sequencing was conducted to identify the miRNAs of Crassostrea gigas (Lophotrochozoa) at a genomic scale, using 21 libraries that included different developmental stages and adult organs. A total of 100 hairpin precursor loci were predicted to encode miRNAs. Of these, 19 precursors (pre-miRNA) were novel in the oyster. As many as 53 (53%) miRNAs were distributed in clusters and 49 (49%) precursors were intragenic, which suggests two important biogenetic sources of miRNAs. Different developmental stages were characterized with specific miRNA expression patterns that highlighted regulatory variation along a temporal axis. Conserved miRNAs were expressed universally throughout different stages and organs, whereas novel miRNAs tended to be more specific and may be related to the determination of the novel body plan. Furthermore, we developed an index named the miRNA profile age index (miRPAI) to integrate the evolutionary age and expression levels of miRNAs during a particular developmental stage. We found that the swimming stages were characterized by the youngest miRPAIs. Indeed, the large-scale expression of novel miRNAs indicated the importance of these stages during development, particularly from organogenetic and evolutionary perspectives. Some potentially important miRNAs were identified for further study through significant changes between expression patterns in different developmental events, such as metamorphosis. This study broadened the knowledge of miRNAs in animals and indicated the presence of sophisticated miRNA regulatory networks related to the biological processes in lophotrochozoans.

  16. Identification of Conserved and Novel MicroRNAs in the Pacific Oyster Crassostrea gigas by Deep Sequencing

    PubMed Central

    Xu, Fei; Wang, Xiaotong; Feng, Yue; Huang, Wen; Wang, Wei; Li, Li; Fang, Xiaodong; Que, Huayong; Zhang, Guofan

    2014-01-01

    MicroRNAs (miRNAs) play important roles in regulatory processes in various organisms. To date many studies have been performed in the investigation of miRNAs of numerous bilaterians, but limited numbers of miRNAs have been identified in the few species belonging to the clade Lophotrochozoa. In the current study, deep sequencing was conducted to identify the miRNAs of Crassostrea gigas (Lophotrochozoa) at a genomic scale, using 21 libraries that included different developmental stages and adult organs. A total of 100 hairpin precursor loci were predicted to encode miRNAs. Of these, 19 precursors (pre-miRNA) were novel in the oyster. As many as 53 (53%) miRNAs were distributed in clusters and 49 (49%) precursors were intragenic, which suggests two important biogenetic sources of miRNAs. Different developmental stages were characterized with specific miRNA expression patterns that highlighted regulatory variation along a temporal axis. Conserved miRNAs were expressed universally throughout different stages and organs, whereas novel miRNAs tended to be more specific and may be related to the determination of the novel body plan. Furthermore, we developed an index named the miRNA profile age index (miRPAI) to integrate the evolutionary age and expression levels of miRNAs during a particular developmental stage. We found that the swimming stages were characterized by the youngest miRPAIs. Indeed, the large-scale expression of novel miRNAs indicated the importance of these stages during development, particularly from organogenetic and evolutionary perspectives. Some potentially important miRNAs were identified for further study through significant changes between expression patterns in different developmental events, such as metamorphosis. This study broadened the knowledge of miRNAs in animals and indicated the presence of sophisticated miRNA regulatory networks related to the biological processes in lophotrochozoans. PMID:25137038

  17. Sequence-specific thermodynamic properties of nucleic acids influence both transcriptional pausing and backtracking in yeast

    PubMed Central

    2017-01-01

    RNA Polymerase II pauses and backtracks during transcription, with many consequences for gene expression and cellular physiology. Here, we show that the energy required to melt double-stranded nucleic acids in the transcription bubble predicts pausing in Saccharomyces cerevisiae far more accurately than nucleosome roadblocks do. In addition, the same energy difference also determines when the RNA polymerase backtracks instead of continuing to move forward. This data-driven model corroborates—in a genome wide and quantitative manner—previous evidence that sequence-dependent thermodynamic features of nucleic acids influence both transcriptional pausing and backtracking. PMID:28301878

  18. Respiratory syncytial virus fusion glycoprotein: nucleotide sequence of mRNA, identification of cleavage activation site and amino acid sequence of N-terminus of F1 subunit.

    PubMed Central

    Elango, N; Satake, M; Coligan, J E; Norrby, E; Camargo, E; Venkatesan, S

    1985-01-01

    The amino acid sequence of respiratory syncytial virus fusion protein (Fo) was deduced from the sequence of a partial cDNA clone of mRNA and from the 5' mRNA sequence obtained by primer extension and dideoxysequencing. The encoded protein of 574 amino acids is extremely hydrophobic and has a molecular weight of 63371 daltons. The site of proteolytic cleavage within this protein was accurately mapped by determining a partial amino acid sequence of the N-terminus of the larger subunit (F1) purified by radioimmunoprecipitation using monoclonal antibodies. Alignment of the N-terminus of the F1 subunit within the deduced amino acid sequence of Fo permitted us to identify a sequence of lys-lys-arg-lys-arg-arg at the C-terminus of the smaller N-terminal F2 subunit that appears to represent the cleavage/activation domain. Five potential sites of glycosylation, four within the F2 subunit, were also identified. Three extremely hydrophobic domains are present in the protein; a) the N-terminal signal sequence, b) the N-terminus of the F1 subunit that is analogous to the N-terminus of the paramyxovirus F1 subunit and the HA2 subunit of influenza virus hemagglutinin, and c) the putative membrane anchorage domain near the C-terminus of F1. Images PMID:2987829

  19. Analysis of protein function and its prediction from amino acid sequence.

    PubMed

    Clark, Wyatt T; Radivojac, Predrag

    2011-07-01

    Understanding protein function is one of the keys to understanding life at the molecular level. It is also important in the context of human disease because many conditions arise as a consequence of alterations of protein function. The recent availability of relatively inexpensive sequencing technology has resulted in thousands of complete or partially sequenced genomes with millions of functionally uncharacterized proteins. Such a large volume of data, combined with the lack of high-throughput experimental assays to functionally annotate proteins, contributes to the growing importance of automated function prediction. Here, we study proteins annotated by Gene Ontology (GO) terms and estimate the accuracy of functional transfer from protein sequence only. We find that the transfer of GO terms by pairwise sequence alignments is only moderately accurate, showing a surprisingly small influence of sequence identity (SID) in a broad range (30-100%). We developed and evaluated a new predictor of protein function, functional annotator (FANN), from amino acid sequence. The predictor exploits a multioutput neural network framework which is well suited to simultaneously modeling dependencies between functional terms. Experiments provide evidence that FANN-GO (predictor of GO terms; available from http://www.informatics.indiana.edu/predrag) outperforms standard methods such as transfer by global or local SID as well as GOtcha, a method that incorporates the structure of GO.
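
    Sequence identity (SID), the quantity whose influence is examined above, can be computed from a pairwise alignment in several ways. The sketch below shows one common convention, counting identical residues over the columns where neither sequence has a gap. The abstract does not say which definition the authors used, so this is an assumption, and the function name and toy alignment are illustrative.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over ungapped columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment (invented): five ungapped columns, four of them identical.
print(percent_identity("ACDE-G", "ACDQFG"))
```

    Other conventions divide by the alignment length or by the length of the shorter sequence, which gives lower values for the same alignment.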

  20. The Complete Genome Sequence of the Lactic Acid Bacterium Lactococcus lactis ssp. lactis IL1403

    PubMed Central

    Bolotin, Alexander; Wincker, Patrick; Mauger, Stéphane; Jaillon, Olivier; Malarme, Karine; Weissenbach, Jean; Ehrlich, S. Dusko; Sorokin, Alexei

    2001-01-01

    Lactococcus lactis is a nonpathogenic AT-rich gram-positive bacterium closely related to the genus Streptococcus and is the most commonly used cheese starter. It is also the best-characterized lactic acid bacterium. We sequenced the genome of the laboratory strain IL1403, using a novel two-step strategy that comprises diagnostic sequencing of the entire genome and a shotgun polishing step. The genome contains 2,365,589 base pairs and encodes 2310 proteins, including 293 protein-coding genes belonging to six prophages and 43 insertion sequence (IS) elements. Nonrandom distribution of IS elements indicates that the chromosome of the sequenced strain may be a product of recent recombination between two closely related genomes. A complete set of late competence genes is present, indicating the ability of L. lactis to undergo DNA transformation. Genomic sequence revealed new possibilities for fermentation pathways and for aerobic respiration. It also indicated a horizontal transfer of genetic information from Lactococcus to gram-negative enteric bacteria of Salmonella-Escherichia group. [The sequence data described in this paper has been submitted to the GenBank data library under accession no. AE005176.] PMID:11337471

  1. Alignment of U3 region sequences of mammalian type C viruses: identification of highly conserved motifs and implications for enhancer design.

    PubMed Central

    Golemis, E A; Speck, N A; Hopkins, N

    1990-01-01

    We aligned published sequences for the U3 region of 35 type C mammalian retroviruses. The alignment reveals that certain sequence motifs within the U3 region are strikingly conserved. A number of these motifs correspond to previously identified sites. In particular, we found that the enhancer region of most of the viruses examined contains a binding site for leukemia virus factor b, a viral corelike element, the consensus motif for nuclear factor 1, and the glucocorticoid response element. Most viruses containing more than one copy of enhancer sequences include these binding sites in both copies of the repeat. We consider this set of binding sites to constitute a framework for the enhancers of this set of viruses. Other highly conserved motifs in the U3 region include the retrovirus inverted repeat sequence, a negative regulatory element, and the CCAAT and TATA boxes. In addition, we identified two novel motifs in the promoter region that were exceptionally highly conserved but have not been previously described. PMID:2153223

  2. Stereochemical Sequence Ion Selectivity: Proline versus Pipecolic-acid-containing Protonated Peptides

    NASA Astrophysics Data System (ADS)

    Abutokaikah, Maha T.; Guan, Shanshan; Bythell, Benjamin J.

    2017-01-01

    Substitution of proline by pipecolic acid, the six-membered ring congener of proline, results in vastly different tandem mass spectra. The well-known proline effect is eliminated and amide bond cleavage C-terminal to pipecolic acid dominates instead. Why do these two ostensibly similar residues produce dramatically differing spectra? Recent evidence indicates that the proton affinities of these residues are similar, so are unlikely to explain the result [Raulfs et al., J. Am. Soc. Mass Spectrom. 25, 1705-1715 (2014)]. An additional hypothesis based on increased flexibility was also advocated. Here, we provide a computational investigation of the "pipecolic acid effect," to test this and other hypotheses to determine if theory can shed additional light on this fascinating result. Our calculations provide evidence for both the increased flexibility of pipecolic-acid-containing peptides, and structural changes in the transition structures necessary to produce the sequence ions. The most striking computational finding is inversion of the stereochemistry of the transition structures leading to "proline effect"-type amide bond fragmentation between the proline/pipecolic acid-congeners: R (proline) to S (pipecolic acid). Additionally, our calculations predict substantial stabilization of the amide bond cleavage barriers for the pipecolic acid congeners by reduction in deleterious steric interactions and provide evidence for the importance of experimental energy regime in rationalizing the spectra.

  3. Self-sequencing of amino acids and origins of polyfunctional protocells

    NASA Technical Reports Server (NTRS)

    Fox, S. W.

    1984-01-01

    The role of proteins in the origin of living things is discussed. It has been experimentally established that amino acids can sequence themselves under simulated geological conditions with highly nonrandom products which accordingly contain diverse information. Multiple copies of each type of macromolecule are formed, resulting in greater power for any protoenzymic molecule than would accrue from a single copy of each type. Thermal proteins are readily incorporated into laboratory protocells. The experimental evidence for original polyfunctional protocells is discussed.

  4. Amino acid sequence of atrial natriuretic peptides in human coronary sinus plasma.

    PubMed

    Yandle, T; Crozier, I; Nicholls, G; Espiner, E; Carne, A; Brennan, S

    1987-07-31

    Two atrial natriuretic peptides were purified from pooled human coronary sinus plasma by Sep-Pak extraction, immunoaffinity chromatography and reverse phase HPLC. The amino acid sequences of the two peptides were homologous with 99-126 human atrial natriuretic peptide (hANP) and 106-126 hANP, the latter being most probably linked to 99-105 ANP by the disulphide bond. The molar ratio of the peptides in plasma, as assessed by radioimmunoassay was 10:3.

  5. Amino Acid Sequences Mediating Vascular Cell Adhesion Molecule 1 Binding to Integrin Alpha 4: Homologous DSP Sequence Found for JC Polyoma VP1 Coat Protein

    PubMed Central

    Meyer, Michael Andrew

    2013-01-01

    The JC polyoma viral coat protein VP1 was analyzed for amino acid sequence homologies to the IDSP sequence which mediates binding of VLA-4 (integrin alpha 4) to vascular cell adhesion molecule 1. Although the full sequence was not found, a DSP sequence was located near the critical arginine residue linked to infectivity of the virus and binding to sialic acid containing molecules such as integrins (3). For the JC polyoma virus, a DSP sequence was found at residues 70, 71 and 72 with homology also noted for the mouse polyoma virus and SV40 virus. Three dimensional modeling of the VP1 molecule suggests that the DSP loop has an accessible site for interaction from the external side of the assembled viral capsid pentamer. PMID:24147211

  6. Amino Acid Sequences Mediating Vascular Cell Adhesion Molecule 1 Binding to Integrin Alpha 4: Homologous DSP Sequence Found for JC Polyoma VP1 Coat Protein.

    PubMed

    Meyer, Michael Andrew

    2013-01-01

    The JC polyoma viral coat protein VP1 was analyzed for amino acid sequence homologies to the IDSP sequence which mediates binding of VLA-4 (integrin alpha 4) to vascular cell adhesion molecule 1. Although the full sequence was not found, a DSP sequence was located near the critical arginine residue linked to infectivity of the virus and binding to sialic acid containing molecules such as integrins (3). For the JC polyoma virus, a DSP sequence was found at residues 70, 71 and 72 with homology also noted for the mouse polyoma virus and SV40 virus. Three dimensional modeling of the VP1 molecule suggests that the DSP loop has an accessible site for interaction from the external side of the assembled viral capsid pentamer.

  7. Amino acid sequence similarity between rabies virus glycoprotein and snake venom curaremimetic neurotoxins.

    PubMed

    Lentz, T L; Wilson, P T; Hawrot, E; Speicher, D W

    1984-11-16

    Evidence was presented earlier that a host-cell receptor for the highly neurotropic rabies virus might be the acetylcholine receptor. The amino acid sequence of the glycoprotein of rabies virus was compared by computer analysis with that of snake venom curaremimetic neurotoxins, potent ligands of the acetylcholine receptor. A statistically significant sequence relation was found between a segment of the rabies glycoprotein and the entire sequence of long neurotoxins. The greatest identity occurs with residues considered most important in neurotoxicity, including those interacting with the acetylcholine binding site of the acetylcholine receptor. Because of the similarity between the glycoprotein and the receptor-binding region of the neurotoxins, this region of the viral glycoprotein may function as a recognition site for the acetylcholine receptor. Direct binding of the rabies virus glycoprotein to the acetylcholine receptor could contribute to the neurotropism of this virus.

  8. Partial amino acid sequence of human pancreatic stone protein, a novel pancreatic secretory protein.

    PubMed Central

    Montalto, G; Bonicel, J; Multigner, L; Rovery, M; Sarles, H; De Caro, A

    1986-01-01

    Pancreatic stone protein (PSP) is the major organic component of human pancreatic stones. With the use of monoclonal antibody immunoadsorbents, five immunoreactive forms (PSP-S) with close Mr values (14,000-19,000) were isolated from normal pancreatic juice. By CM-Trisacryl M chromatography the lowest-Mr form (PSP-S1) was separated from the others and some of its molecular characteristics were investigated. The Mr of the PSP-S1 polypeptide chain calculated from the amino acid composition was about 16,100. The N-terminal sequences (40 residues) of PSP and PSP-S1 are identical, which suggests that the peptide backbone is the same for both of these polypeptides. The PSP-S1 sequence was determined up to residue 65 and was found to be different from all other known protein sequences. Images Fig. 1. PMID:3541906

  9. Characterization of the microbial acid mine drainage microbial community using culturing and direct sequencing techniques.

    PubMed

    Auld, Ryan R; Myre, Maxine; Mykytczuk, Nadia C S; Leduc, Leo G; Merritt, Thomas J S

    2013-05-01

    We characterized the bacterial community from an AMD tailings pond using both classical culturing and modern direct sequencing techniques and compared the two methods. Acid mine drainage (AMD) is produced by the environmental and microbial oxidation of minerals dissolved from mining waste. Surprisingly, we know little about the microbial communities associated with AMD, despite the fundamental ecological roles of these organisms and large-scale economic impact of these waste sites. AMD microbial communities have classically been characterized by laboratory culturing-based techniques and more recently by direct sequencing of marker gene sequences, primarily the 16S rRNA gene. In our comparison of the techniques, we find that their results are complementary, overall indicating very similar community structure with similar dominant species, but with each method identifying some species that were missed by the other. We were able to culture the majority of species that our direct sequencing results indicated were present, primarily species within the Acidithiobacillus and Acidiphilium genera, although estimates of relative species abundance were only obtained from direct sequencing. Interestingly, our culture-based methods recovered four species that had been overlooked from our sequencing results because of the rarity of the marker gene sequences, likely members of the rare biosphere. Further, direct sequencing indicated that a single genus, completely missed in our culture-based study, Legionella, was a dominant member of the microbial community. Our results suggest that while either method does a reasonable job of identifying the dominant members of the AMD microbial community, together the methods combine to give a more complete picture of the true diversity of this environment.

  10. Biosynthesis of D-alanyl-lipoteichoic acid: cloning, nucleotide sequence, and expression of the Lactobacillus casei gene for the D-alanine-activating enzyme.

    PubMed Central

    Heaton, M P; Neuhaus, F C

    1992-01-01

    The D-alanine-activating enzyme (Dae; EC 6.3.2.4) encoded by the dae gene from Lactobacillus casei ATCC 7469 is a cytosolic protein essential for the formation of the D-alanyl esters of membrane-bound lipoteichoic acid. The gene has been cloned, sequenced, and expressed in Escherichia coli, an organism which does not possess Dae activity. The open reading frame is 1,518 nucleotides and codes for a protein of 55.867 kDa, a value in agreement with the 56 kDa obtained by electrophoresis. A putative promoter and ribosome-binding site immediately precede the dae gene. A second open reading frame contiguous with the dae gene has also been partially sequenced. The organization of these genetic elements suggests that more than one enzyme necessary for the biosynthesis of D-alanyl-lipoteichoic acid may be present in this operon. Analysis of the amino acid sequence deduced from the dae gene identified three regions with significant homology to proteins in the following groups of ATP-utilizing enzymes: (i) the acid-thiol ligases, (ii) the activating enzymes for the biosynthesis of enterobactin, and (iii) the synthetases for tyrocidine, gramicidin S, and penicillin. From these comparisons, a common motif (GXXGXPK) has been identified that is conserved in the 19 protein domains analyzed. This motif may represent the phosphate-binding loop of an ATP-binding site for this class of enzymes. A DNA fragment (1,568 nucleotides) containing the dae gene and its putative ribosome-binding site has been subcloned and expressed in E. coli. Approximately 0.5% of the total cell protein is active Dae, whereas 21% is in the form of inclusion bodies. The isolation of this minimal fragment without a native promoter sequence provides the basis for designing a genetic system for modulating the D-alanine ester content of lipoteichoic acid. PMID:1385594
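
The figures quoted in this abstract can be cross-checked with a little arithmetic, and the GXXGXPK motif maps naturally onto a regular expression. The sketch below is illustrative only: it assumes the 1,518-nucleotide open reading frame includes one stop codon (the abstract does not say), and the function names and the test peptide are hypothetical.

```python
import re

def residues_from_orf(nt_length, includes_stop=True):
    """Number of amino acids encoded by an ORF of nt_length nucleotides."""
    if nt_length % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = nt_length // 3
    # Assumption: the quoted ORF length counts the terminating stop codon.
    return codons - 1 if includes_stop else codons

# 'X' in the motif means "any residue", which is '.' in a regular expression.
MOTIF = re.compile(r"G..G.PK")

def find_motif(protein):
    """Return the 1-based start position of every GXXGXPK match."""
    return [m.start() + 1 for m in MOTIF.finditer(protein)]

n_res = residues_from_orf(1518)   # residue count under the stop-codon assumption
avg_mass = 55867 / n_res          # average mass per residue, in Da
toy_peptide = "MKTAGSSGEPKLLV"    # hypothetical sequence, not from the paper
hits = find_motif(toy_peptide)
```

An average of roughly 110 Da per residue is typical for proteins, so the quoted mass and ORF length are mutually consistent under this assumption.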

  11. [MOLECULAR EVOLUTION OF ION CHANNELS: AMINO ACID SEQUENCES AND 3D STRUCTURES].

    PubMed

    Korkosh, V S; Zhorov, B S; Tikhonov, D B

    2016-01-01

    An integral part of modern evolutionary biology is comparative analysis of structure and function of macromolecules such as proteins. The first and critical step to understand evolution of homologous proteins is their amino acid sequence alignment. However, standard algorithms do not provide unambiguous sequence alignments for proteins of poor homology. More reliable results can be obtained by comparing experimental 3D structures obtained at atomic resolution, for instance, with the aid of X-ray structural analysis. If such structures are lacking, homology modeling is used, which may take into account indirect experimental data on functional roles of individual amino-acid residues. An important problem is that the sequence alignment, which reflects genetic modifications, does not necessarily correspond to the functional homology. The latter depends on three-dimensional structures which are critical for natural selection. Since alignment techniques relying only on the analysis of primary structures carry no information on the functional properties of proteins, including 3D structures into consideration is very important. Here we consider several examples involving ion channels and demonstrate that alignment of their three-dimensional structures can significantly improve sequence alignments obtained by traditional methods.
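
To make the ambiguity of sequence-only alignment concrete, the following sketch implements a basic Needleman-Wunsch global alignment and also counts how many distinct alignments reach the optimal score. It is not the method used by the authors; the scoring values (match +1, mismatch -1, gap -1) and the function name are assumptions chosen for illustration.

```python
def nw_score_and_count(a, b, match=1, mismatch=-1, gap=-1):
    """Return (optimal global alignment score, number of co-optimal alignments)."""
    # First row: aligning the empty prefix of `a` against prefixes of `b`.
    prev_score = [j * gap for j in range(len(b) + 1)]
    prev_ways = [1] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr_score = [i * gap]
        curr_ways = [1]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            candidates = (
                (prev_score[j - 1] + sub, prev_ways[j - 1]),  # align a[i-1] with b[j-1]
                (prev_score[j] + gap, prev_ways[j]),          # gap in b
                (curr_score[j - 1] + gap, curr_ways[j - 1]),  # gap in a
            )
            best = max(score for score, _ in candidates)
            # Every predecessor that reaches the best score adds its paths.
            ways = sum(w for score, w in candidates if score == best)
            curr_score.append(best)
            curr_ways.append(ways)
        prev_score, prev_ways = curr_score, curr_ways
    return prev_score[-1], prev_ways[-1]

# "A" against "AA": the single A can pair with either A at the same score.
ambiguous = nw_score_and_count("A", "AA")
unambiguous = nw_score_and_count("AC", "AGC")
```

Even this two-residue case yields more than one optimal alignment, which is the kind of ambiguity that structural information can help resolve for weakly similar proteins.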

  12. Analysis of amino acid sequence variations and immunoglobulin E-binding epitopes of German cockroach tropomyosin.

    PubMed

    Jeong, Kyoung Yong; Lee, Jongweon; Lee, In-Yong; Ree, Han-Il; Hong, Chein-Soo; Yong, Tai-Soon

    2004-09-01

    The allergenicities of tropomyosins from different organisms have been reported to vary. The cDNA encoding German cockroach tropomyosin (Bla g 7) was isolated, expressed, and characterized previously. In the present study, the amino acid sequence variations in German cockroach tropomyosin were analyzed in order to investigate their influence on allergenicity. We also undertook the identification of immunodominant peptides containing immunoglobulin E (IgE) epitopes which may facilitate the development of diagnostic and immunotherapeutic strategies based on the recombinant proteins. Two-dimensional gel electrophoresis and immunoblot analysis with mouse anti-recombinant German cockroach tropomyosin serum were performed to investigate the isoforms at the protein level. Reverse transcriptase PCR (RT-PCR) was applied to examine the sequence diversity. Eleven different variants of the deduced amino acid sequences were identified by RT-PCR. German cockroach tropomyosin has only minor sequence variations that did not seem to affect its allergenicity significantly. These results support the molecular basis underlying the cross-reactivities of arthropod tropomyosins. Recombinant fragments were also generated by PCR, and IgE-binding epitopes were assessed by enzyme-linked immunosorbent assay. Sera from seven patients revealed heterogeneous IgE-binding responses. This study demonstrates multiple IgE-binding epitope regions in a single molecule, suggesting that full-length tropomyosin should be used for the development of diagnostic and therapeutic reagents.

  13. Complete amino acid sequence of a histidine-rich proteolytic fragment of human ceruloplasmin.

    PubMed

    Kingston, I B; Kingston, B L; Putnam, F W

    1979-04-01

    The complete amino acid sequence has been determined for a fragment of human ceruloplasmin [ferroxidase; iron(II):oxygen oxidoreductase, EC 1.16.3.1]. The fragment (designated Cp F5) contains 159 amino acid residues and has a molecular weight of 18,650; it lacks carbohydrate, is rich in histidine, and contains one free cysteine that may be part of a copper-binding site. This fragment is present in most commercial preparations of ceruloplasmin, probably owing to proteolytic degradation, but can also be obtained by limited cleavage of single-chain ceruloplasmin with plasmin. Cp F5 probably is an intact domain attached to the COOH-terminal end of single-chain ceruloplasmin via a labile interdomain peptide bond. A model of the secondary structure predicted by empirical methods suggests that almost one-third of the amino acid residues are distributed in alpha helices, about a third in beta-sheet structure, and the remainder in beta turns and unidentified structures. Computer analysis of the amino acid sequence has not demonstrated a statistically significant relationship between this ceruloplasmin fragment and any other protein, but there is some evidence for an internal duplication.

  14. Processing and amino acid sequence analysis of the mouse mammary tumor virus env gene product.

    PubMed Central

    Arthur, L O; Copeland, T D; Oroszlan, S; Schochetman, G

    1982-01-01

    The envelope proteins of mouse mammary tumor virus (MMTV) are synthesized from a subgenomic 24S mRNA as a 75,000-dalton glycosylated precursor polyprotein which is eventually processed to the mature glycoproteins gp52 and gp36. In vivo synthesis of this env precursor in the presence of the core glycosylation inhibitor tunicamycin yielded a precursor of approximately 61,000 daltons (P61env). However, a 67,000-dalton protein (P67env) was obtained from cell-free translation with the MMTV 24S mRNA as the template. To determine whether the portion of the protein cleaved from P67env to give P61env was removed from the NH2-terminal end of P67env and as such would represent a leader sequence, the NH2-terminal amino acid sequence of the terminal peptide gp52 was determined. Glutamic acid, and not methionine, was found to be the amino-terminal residue of gp52, indicating that the cleaved portion was derived from the NH2-terminal end of P67env. The NH2-terminal amino acid sequences of gp52's from endogenous and exogenous C3H MMTVs were determined through 46 residues and found to be identical. However, amino acid composition and type-specific gp52 radioimmunoassays from MMTVs grown in heterologous cells indicated primary structure differences between gp52's of the two viruses. The nucleic acid sequence of cloned MMTV DNA fragments (J. Majors and H. E. Varmus, personal communication) in conjunction with the NH2-terminal sequence of gp52 allowed localization of the env gene in the MMTV genome. Nucleotides coding for the NH2 terminus of gp52 begin approximately 0.8 kilobase to the 3' side of the single EcoRI cleavage site. Localization of the env gene at that point agrees with the proposed gene order -gag-pol-env- and also allows sufficient coding potential for the glycoprotein precursor without extending into the long terminal repeat. Images PMID:6281457

  15. Complete Genome Sequence of a thermotolerant sporogenic lactic acid bacterium, Bacillus coagulans strain 36D1

    PubMed Central

    Rhee, Mun Su; Moritz, Brélan E.; Xie, Gary; Glavina del Rio, T.; Dalin, E.; Tice, H.; Bruce, D.; Goodwin, L.; Chertkov, O.; Brettin, T.; Han, C.; Detter, C.; Pitluck, S.; Land, Miriam L.; Patel, Milind; Ou, Mark; Harbrucker, Roberta; Ingram, Lonnie O.; Shanmugam, K. T.

    2011-01-01

    Bacillus coagulans is a ubiquitous soil bacterium that grows at 50-55 °C and pH 5.0 and ferments various sugars that constitute plant biomass to L(+)-lactic acid. The ability of this sporogenic lactic acid bacterium to grow at 50-55 °C and pH 5.0 makes this organism an attractive microbial biocatalyst for production of optically pure lactic acid at industrial scale not only from glucose derived from cellulose but also from xylose, a major constituent of hemicellulose. This bacterium is also considered as a potential probiotic. Complete genome sequence of a representative strain, B. coagulans strain 36D1, is presented and discussed. PMID:22675583

  16. BeadCons: detection of nucleic acid sequences by flow cytometry.

    PubMed

    Horejsh, Douglas; Martini, Federico; Capobianchi, Maria Rosaria

    2005-11-01

    Molecular beacons are single-stranded nucleic acid structures with a terminal fluorophore and a distal, terminal quencher. These molecules are typically used in real-time PCR assays, but have also been conjugated with solid matrices. This unit describes protocols related to molecular beacon-conjugated beads (BeadCons), whose specific hybridization with complementary target sequences can be resolved by cytometry. Assay sensitivity is achieved through the concentration of fluorescence signal on discrete particles. By using molecular beacons with different fluorophores and microspheres of different sizes, it is possible to construct a fluid array system with each bead corresponding to a specific target nucleic acid. Methods are presented for the design, construction, and use of BeadCons for the specific, multiplexed detection of unlabeled nucleic acids in solution. The use of bead-based detection methods will likely lead to the design of new multiplex molecular diagnostic tools.

  17. Measuring nanometer distances in nucleic acids using a sequence-independent nitroxide probe

    PubMed Central

    Qin, Peter Z; Haworth, Ian S; Cai, Qi; Kusnetzow, Ana K; Grant, Gian Paola G; Price, Eric A; Sowa, Glenna Z; Popova, Anna; Herreros, Bruno; He, Honghang

    2008-01-01

    This protocol describes the procedures for measuring nanometer distances in nucleic acids using a nitroxide probe that can be attached to any nucleotide within a given sequence. Two nitroxides are attached to phosphorothioates that are chemically substituted at specific sites of DNA or RNA. Inter-nitroxide distances are measured using a four-pulse double electron–electron resonance technique, and the measured distances are correlated to the parent structures using a Web-accessible computer program. Four to five days are needed for sample labeling, purification and distance measurement. The procedures described herein provide a method for probing global structures and studying conformational changes of nucleic acids and protein/nucleic acid complexes. PMID:17947978

  18. Complete Genome Sequence of a thermotolerant sporogenic lactic acid bacterium, Bacillus coagulans strain 36D1.

    PubMed

    Rhee, Mun Su; Moritz, Brélan E; Xie, Gary; Glavina Del Rio, T; Dalin, E; Tice, H; Bruce, D; Goodwin, L; Chertkov, O; Brettin, T; Han, C; Detter, C; Pitluck, S; Land, Miriam L; Patel, Milind; Ou, Mark; Harbrucker, Roberta; Ingram, Lonnie O; Shanmugam, K T

    2011-12-31

    Bacillus coagulans is a ubiquitous soil bacterium that grows at 50-55 °C and pH 5.0 and ferments various sugars that constitute plant biomass to L(+)-lactic acid. The ability of this sporogenic lactic acid bacterium to grow at 50-55 °C and pH 5.0 makes this organism an attractive microbial biocatalyst for production of optically pure lactic acid at industrial scale not only from glucose derived from cellulose but also from xylose, a major constituent of hemicellulose. This bacterium is also considered as a potential probiotic. Complete genome sequence of a representative strain, B. coagulans strain 36D1, is presented and discussed.

  19. Human ERCC5 cDNA-cosmid complementation for excision repair and bipartite amino acid domains conserved with RAD proteins of Saccharomyces cerevisiae and Schizosaccharomyces pombe.

    PubMed Central

    MacInnes, M A; Dickson, J A; Hernandez, R R; Learmonth, D; Lin, G Y; Mudgett, J S; Park, M S; Schauer, S; Reynolds, R J; Strniste, G F

    1993-01-01

    Several human genes related to DNA excision repair (ER) have been isolated via ER cross-species complementation (ERCC) of UV-sensitive CHO cells. We have now isolated and characterized cDNAs for the human ERCC5 gene that complement CHO UV135 cells. The ERCC5 mRNA size is about 4.6 kb. Our available cDNA clones are partial length, and no single clone was active for UV135 complementation. When cDNAs were mixed pairwise with a cosmid clone containing an overlapping 5'-end segment of the ERCC5 gene, DNA transfer produced UV-resistant colonies with 60 to 95% correction of UV resistance relative to either a genomic ERCC5 DNA transformant or the CHO AA8 progenitor cells. cDNA-cosmid transformants regained intermediate levels (20 to 45%) of ER-dependent reactivation of a UV-damaged pSVCATgpt reporter plasmid. Our evidence strongly implicates an in situ recombination mechanism in cDNA-cosmid complementation for ER. The complete deduced amino acid sequence of ERCC5 was reconstructed from several cDNA clones encoding a predicted protein of 1,186 amino acids. The ERCC5 protein has extensive sequence similarities, in bipartite domains A and B, to products of RAD repair genes of two yeasts, Saccharomyces cerevisiae RAD2 and Schizosaccharomyces pombe rad13. Sequence, structural, and functional data taken together indicate that ERCC5 and its relatives are probable functional homologs. A second locus represented by S. cerevisiae YKL510 and S. pombe rad2 genes is structurally distinct from the ERCC5 locus but retains vestigial A and B domain similarities. Our analyses suggest that ERCC5 is a nuclear-localized protein with one or more highly conserved helix-loop-helix segments within domains A and B. Images PMID:8413238

  20. Human ERCC5 cDNA-cosmid complementation for excision repair and bipartite amino acid domains conserved with RAD proteins of Saccharomyces cerevisiae and Schizosaccharomyces pombe

    SciTech Connect

    MacInnes, M.A.; Dickson, J.A.; Hernandez, R.R.; Lin, G.Y.; Park, M.S.; Schauer, S.; Reynolds, R.J.; Strniste, G.F.; Learmonth, D.; Mudgett, J.S.; Yu, J.Y.

    1993-10-01

    Several human genes related to DNA excision repair (ER) have been isolated via ER cross-species complementation (ERCC) of UV-sensitive CHO cells. The authors have now isolated and characterized cDNAs for the human ERCC5 gene that complement CHO UV135 cells. The ERCC5 mRNA size is about 4.6 kb. Their available cDNA clones are partial length, and no single clone was active for UV135 complementation. When cDNAs were mixed pairwise with a cosmid clone containing an overlapping 5'-end segment of the ERCC5 gene, DNA transfer produced UV-resistant colonies with 60 to 95% correction of UV resistance relative to either a genomic ERCC5 DNA transformant or the CHO AA8 progenitor cells. cDNA-cosmid transformants regained intermediate levels (20 to 45%) of ER-dependent reactivation of a UV-damaged pSVCATgpt reporter plasmid. Their evidence strongly implicates an in situ recombination mechanism in cDNA-cosmid complementation for ER. The complete deduced amino acid sequence of ERCC5 was reconstructed from several cDNA clones encoding a predicted protein of 1,186 amino acids. The ERCC5 protein has extensive sequence similarities, in bipartite domains A and B, to products of RAD repair genes of two yeasts, Saccharomyces cerevisiae RAD2 and Schizosaccharomyces pombe rad13. Sequence, structural, and functional data taken together indicate that ERCC5 and its relatives are probable functional homologs. A second locus represented by S. cerevisiae YKL510 and S. pombe rad2 genes is structurally distinct from the ERCC5 locus but retains vestigial A and B domain similarities. Their analyses suggest that ERCC5 is a nuclear-localized protein with one or more highly conserved helix-loop-helix segments within domains A and B. 69 refs., 6 figs., 1 tab.

  1. Amino acid sequence and structural comparison of BACE1 and BACE2 using evolutionary trace method.

    PubMed

    Mirsafian, Hoda; Mat Ripen, Adiratna; Merican, Amir Feisal; Bin Mohamad, Saharuddin

    2014-01-01

    Beta-amyloid precursor protein cleavage enzyme 1 (BACE1) and beta-amyloid precursor protein cleavage enzyme 2 (BACE2), members of the aspartyl protease family, are close homologues and have high similarity in their protein crystal structures. However, their enzymatic properties differ, leading to disparate clinical consequences. In order to identify the residues that are responsible for such differences, we used the evolutionary trace (ET) method to compare the amino acid conservation patterns of BACE1 and BACE2 in several mammalian species. We found that, in BACE1 and BACE2 structures, most of the ligand binding sites are conserved, which indicates their enzymatic property as aspartyl protease family members. The other conserved residues are more or less randomly localized in other parts of the structures. Four group-specific residues were identified at the ligand binding site of BACE1 and BACE2. We postulated that these residues would be essential for selectivity of BACE1 and BACE2 biological functions and could be sites of interest for the design of selective inhibitors targeting either BACE1 or BACE2.

  2. The amino acid sequence of Lady Amherst's pheasant (Chrysolophus amherstiae) and golden pheasant (Chrysolophus pictus) egg-white lysozymes.

    PubMed

    Araki, T; Kuramoto, M; Torikata, T

    1990-09-01

    The amino acids of Lady Amherst's pheasant and golden pheasant egg-white lysozymes have been sequenced. The carboxymethylated lysozymes were digested with trypsin followed by sequencing of the tryptic peptides. Lady Amherst's pheasant lysozyme proved to consist of 129 amino acid residues, and a relative molecular mass of 14,423 Da was calculated. This lysozyme had 6 amino acid substitutions when compared with hen egg-white lysozyme: Phe3 to Tyr, His15 to Leu, Gln41 to His, Asn77 to His, Gln121 to Asn, and a newly found substitution of Ile124 to Thr. The amino acid sequence of golden pheasant lysozyme was identical to that of Lady Amherst's pheasant lysozyme. The phylogenetic tree constructed by the comparison of amino acid sequences of phasianoid bird lysozymes revealed a minimum genetic distance between these pheasants and the turkey-peafowl group.
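
The substitution notation used in this abstract ("Phe3 to Tyr") is easy to parse and apply programmatically. The sketch below is a minimal illustration, assuming the three-letter codes listed in the abstract; the helper names and the short test sequence are hypothetical, and the real hen egg-white lysozyme sequence is deliberately not reproduced here.

```python
import re

# Three-letter to one-letter codes for the residues named in the abstract.
THREE_TO_ONE = {"Phe": "F", "Tyr": "Y", "His": "H", "Leu": "L",
                "Gln": "Q", "Asn": "N", "Ile": "I", "Thr": "T"}

SUBSTITUTIONS = ("Phe3 to Tyr, His15 to Leu, Gln41 to His, "
                 "Asn77 to His, Gln121 to Asn, Ile124 to Thr")

def parse_substitutions(text):
    """Parse 'Phe3 to Tyr' style notation into (old, position, new) tuples."""
    pattern = re.compile(r"([A-Z][a-z]{2})\s*(\d+) to ([A-Z][a-z]{2})")
    return [(THREE_TO_ONE[old], int(pos), THREE_TO_ONE[new])
            for old, pos, new in pattern.findall(text)]

def apply_substitutions(sequence, substitutions):
    """Apply 1-based substitutions, checking the reference residue first."""
    residues = list(sequence)
    for old, pos, new in substitutions:
        if residues[pos - 1] != old:
            raise ValueError(f"expected {old} at position {pos}, "
                             f"found {residues[pos - 1]}")
        residues[pos - 1] = new
    return "".join(residues)

parsed = parse_substitutions(SUBSTITUTIONS)
# Hypothetical 5-residue sequence used only to exercise the function.
mutated = apply_substitutions("AAFAA", [("F", 3, "Y")])
```

Checking the reference residue before substituting guards against off-by-one numbering errors when the positions are applied to a full-length sequence.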

  3. Conservation of the sizes of 53 introns and over 100 intronic sequences for the binding of common transcription factors in the human and mouse genes for type II procollagen (COL2A1).

    PubMed Central

    Ala-Kokko, L; Kvist, A P; Metsäranta, M; Kivirikko, K I; de Crombrugghe, B; Prockop, D J; Vuorio, E

    1995-01-01

    Over 11,000 bp of previously undefined sequences of the human COL2A1 gene were defined. The results made it possible to compare the intron structures of a highly complex gene from man and mouse. Surprisingly, the sizes of the 53 introns of the two genes were highly conserved with a mean difference of 13%. After alignment of the sequences, 69% of the intron sequences were identical. The introns contained consensus sequences for the binding of over 100 different transcription factors that were conserved in the introns of the two genes. The first intron of the gene contained 80 conserved consensus sequences and the remaining 52 introns of the gene contained 106 conserved sequences for the binding of transcription factors. The 5'-end of intron 2 in both genes had a potential for forming a stem loop in RNA transcripts. Images Figure 4 PMID:8948452
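
A figure such as "69% of the intron sequences were identical" depends on how identity is counted over an alignment. The sketch below shows one common convention, assumed here rather than taken from the paper: identical columns divided by columns in which neither sequence has a gap. The function name and example strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence is gapped."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # assumption: gap columns are excluded from the denominator
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared

# Hypothetical aligned fragments: 4 identical out of 5 ungapped columns.
example = percent_identity("ACGT-A", "ACCTTA")
```

Counting gap columns in the denominator instead would lower the value, so the convention should be stated whenever such percentages are compared across studies.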

  4. Identification of critical amino acid residues and functional conservation of the Neurospora crassa and Rattus norvegicus orthologues of neuronal calcium sensor-1.

    PubMed

    Gohain, Dibakar; Deka, Rekha; Tamuli, Ranjan

    2016-12-01

    Neuronal calcium sensor-1 (NCS-1) is a member of the neuronal calcium sensor family of proteins consisting of an amino terminal myristoylation domain and four conserved calcium (Ca(2+)) binding EF-hand domains. We performed site-directed mutational analysis of three key amino acid residues: the glycine in the conserved site for N-terminal myristoylation, a conserved glutamic acid residue responsible for Ca(2+) binding in the third EF-hand (EF3), and an unusual non-conserved arginine at position 175 in the Neurospora crassa NCS-1. The N. crassa strains possessing the ncs-1 mutant alleles of these three amino acid residues showed impairment in functions including growth, Ca(2+) stress tolerance, and ultraviolet survival. In addition, heterologous expression of the NCS-1 from Rattus norvegicus in N. crassa confirmed its interspecies functional conservation. Moreover, the function of glutamic acid at position 120, the first Ca(2+) binding residue among all the EF-hands of the R. norvegicus NCS-1, was found to be conserved. Thus, we identified three critical amino acid residues of N. crassa NCS-1, and demonstrated its functional conservation across species using the orthologue from R. norvegicus.

  5. Sequencing of variable regions of the 16S rRNA gene for identification of lactic acid bacteria isolated from the intestinal microbiota of healthy salmonids.

    PubMed

    Balcázar, José Luis; de Blas, Ignacio; Ruiz-Zarzuela, Imanol; Vendrell, Daniel; Gironés, Olivia; Muzquiz, José Luis

    2007-03-01

    The aim of this study was to identify lactic acid bacteria (LAB) using polymerase chain reaction (PCR) amplification of variable regions of the 16S rRNA gene. Thirteen LAB strains were isolated from the intestinal microbiota of healthy salmonids. An approximately 500-bp region of the highly conserved 16S rRNA gene was PCR-amplified and following this, a portion of the amplicon (272-bp) including the V1 and V2 variable regions was sequenced. The sequence containing both the V1 and V2 region provided strong evidence for the identification of LAB. The LAB strains were identified as Carnobacterium maltaromaticum, Lactobacillus curvatus, Lactobacillus sakei, Lactobacillus plantarum, Lactococcus lactis subsp. cremoris, Lactococcus lactis subsp. lactis, and Leuconostoc mesenteroides. The method described was found to be a very simple, rapid, specific, and low-cost tool for the identification of unknown strains of LAB.

  6. A 25-Amino Acid Sequence of the Arabidopsis TGD2 Protein Is Sufficient for Specific Binding of Phosphatidic Acid*

    PubMed Central

    Lu, Binbin; Benning, Christoph

    2009-01-01

    Genetic analysis suggests that the TGD2 protein of Arabidopsis is required for the biosynthesis of endoplasmic reticulum derived thylakoid lipids. TGD2 is proposed to be the substrate-binding protein of a presumed lipid transporter consisting of the TGD1 (permease) and TGD3 (ATPase) proteins. The TGD1, -2, and -3 proteins are localized in the inner chloroplast envelope membrane. TGD2 appears to be anchored with an N-terminal membrane-spanning domain into the inner envelope membrane, whereas the C-terminal domain faces the intermembrane space. It was previously shown that the C-terminal domain of TGD2 binds phosphatidic acid (PtdOH). To investigate the PtdOH binding site of TGD2 in detail, the C-terminal domain of the TGD2 sequence lacking the transit peptide and transmembrane sequences was fused to the C terminus of the Discosoma sp. red fluorescent protein (DR). This greatly improved the solubility of the resulting DR-TGD2C fusion protein following production in Escherichia coli. The DR-TGD2C protein bound PtdOH with high specificity, as demonstrated by membrane lipid-protein overlay and liposome association assays. Internal deletion and truncation mutagenesis identified a previously undescribed minimal 25-amino acid fragment in the C-terminal domain of TGD2 that is sufficient for PtdOH binding. Binding characteristics of this 25-mer were distinctly different from those of TGD2C, suggesting that additional sequences of TGD2 providing the proper context for this 25-mer are needed for wild type-like PtdOH binding. PMID:19416982

  7. Nucleotide sequence of the luxC gene encoding fatty acid reductase of the lux operon from Photobacterium leiognathi.

    PubMed

    Lin, J W; Chao, Y F; Weng, S F

    1993-02-26

    The nucleotide sequence of the luxC gene (EMBL Accession No. 65156) encoding fatty acid reductase (FAR) of the lux operon from Photobacterium leiognathi PL741 was determined and the encoded amino acid sequence deduced. The fatty acid reductase is a component of the fatty acid reductase complex. The complex is responsible for converting fatty acid to aldehyde which serves as the substrate in the luciferase-catalyzed bioluminescent reaction. The protein comprises 478 amino acid residues and has a calculated M(r) of 53,858. Alignment and comparison of the fatty acid reductase of P. leiognathi with that of Vibrio harveyi B392 and Vibrio fischeri ATCC 7744 shows 70% and 59% amino acid residue identity, respectively.

  8. "Opening" the ferritin pore for iron release by mutation of conserved amino acids at interhelix and loop sites.

    PubMed

    Jin, W; Takagi, H; Pancorbo, B; Theil, E C

    2001-06-26

    Ferritin concentrates, stores, and detoxifies iron in most organisms. The iron is a solid, ferric oxide mineral (≤4500 Fe) inside the protein shell. Eight pores are formed by subunit trimers of the 24 subunit protein. A role for the protein in controlling reduction and dissolution of the iron mineral was suggested in preliminary experiments [Takagi et al. (1998) J. Biol. Chem. 273, 18685-18688] with a proline/leucine substitution near the pore. Localized pore disorder in frog L134P crystals coincided with enhanced iron exit, triggered by reduction. In this report, nine additional substitutions of conserved amino acids near L134 were studied for effects on iron release. Alterations of a conserved hydrophobic pair, a conserved ion pair, and a loop at the ferritin pores all increased iron exit (3-30-fold). Protein assembly was unchanged, except for a slight decrease in volume (measured by gel filtration); ferroxidase activity was still in the millisecond range, but a small decrease indicates slight alteration of the channel from the pore to the oxidation site. The sensitivity of reductive iron exit rates to changes in conserved residues near the ferritin pores, associated with localized unfolding, suggests that the structure around the ferritin pores is a target for regulated protein unfolding and iron release.

  9. Comparison of orthologous and paralogous DNA flanking the wheat high molecular weight glutenin genes: sequence conservation and divergence, transposon distribution, and matrix-attachment regions.

    PubMed

    Anderson, O D; Larka, L; Christoffers, M J; McCue, K F; Gustafson, J P

    2002-04-01

    Extended flanking DNA sequences were characterized for five members of the wheat high molecular weight (HMW) glutenin gene family to understand more of the structure, control, and evolution of these genes. Analysis revealed more sequence conservation among orthologous regions than between paralogous regions, with differences mainly owing to transposition events involving putative retrotransposons and several miniature inverted transposable elements (MITEs). Both gypsy-like long terminal repeat (LTR) and non-LTR retrotransposon sequences are represented in the flanking DNAs. One of the MITEs is a novel class, but another MITE is related to the maize Stowaway family and is widely represented in Triticeae expressed sequence tags (ESTs). Flanking DNA of the longest sequence, a 20 425-bp fragment including and surrounding the HMW-glutenin Bx7 gene, showed additional cereal gene-like sequences both immediately 5' and 3' to the HMW-glutenin coding region. The transcriptional activities of sequences related to these flanking putative genes and the retrotransposon-related regions were indicated by matches to wheat and other Triticeae ESTs. Predictive analysis of matrix-attachment regions (MARs) of the HMW glutenin and several alpha-, gamma-, and omega-gliadin flanking DNAs indicates potential MARs immediately flanking each of the genes. Matrix binding activity in the predicted regions was confirmed for two of the HMW-glutenin genes.

  10. Identification of individual barley chromosomes based on repetitive sequences: conservative distribution of Afa-family repetitive sequences on the chromosomes of barley and wheat.

    PubMed

    Tsujimoto, H; Mukai, Y; Akagawa, K; Nagaki, K; Fujigaki, J; Yamamoto, M; Sasakuma, T

    1997-10-01

    The Afa-family repetitive sequences were isolated from barley (Hordeum vulgare, 2n = 14) and cloned as pHvA14. This sequence distinguished each barley chromosome by in situ hybridization. Double color fluorescence in situ hybridization using pHvA14 and 5S rDNA or HvRT-family sequence (subtelomeric sequence of barley) allocated individual barley chromosomes showing a specific pattern of pHvA14 to chromosomes 1H to 7H. As in the case of the D genome chromosomes of Aegilops squarrosa and common wheat (Triticum aestivum) hybridized with their Afa-family sequences, the signals of pHvA14 in barley chromosomes tended to appear in the distal regions that do not carry many chromosome band markers. In the telomeric regions these signals were always placed in more proximal portions than those of HvRT-family. Based on the distribution patterns of Afa-family sequences in the chromosomes of barley and D genome chromosomes of wheat, we discuss a possible mechanism of amplification of the repetitive sequences during the evolution of Triticeae. In addition, we show here that HvRT-family also could be used to distinguish individual barley chromosomes from the patterns of in situ hybridization.

  11. Genome wide identification of microRNAs involved in fatty acid and lipid metabolism of Brassica napus by small RNA and degradome sequencing.

    PubMed

    Wang, Zhiwei; Qiao, Yan; Zhang, Jingjing; Shi, Wenhui; Zhang, Jinwen

    2017-04-01

    Rapeseed (Brassica napus) is an important cash crop considered the third largest oil crop worldwide. Rapeseed oil contains various saturated and unsaturated fatty acids; these fatty acids, which can be incorporated into TAG to form the lipids stored in seeds, play various roles in metabolic activity. The different fatty acids in B. napus seeds determine oil quality and define whether the oil is edible or must be used as industrial material. miRNAs are a kind of non-coding sRNA that can regulate gene expression through post-transcriptional modification of their target transcripts, playing important roles in plant metabolic activities. We employed high-throughput sequencing to identify the miRNAs and their target transcripts involved in fatty acid and lipid metabolism at different developmental stages of B. napus seeds. As a result, we identified 826 miRNA sequences, including 523 conserved and 303 novel miRNAs. From the degradome sequencing, we found that 589 mRNAs could be targeted by 236 miRNAs, including 49 novel miRNAs and 187 conserved miRNAs. The miRNA-target pairs suggest that bna-5p-163957_18, bna-5p-396192_7, miR9563a-p3, miR9563b-p5, miR838-p3, miR156e-p3, miR159c and miR1134 could target PDP, LACS9, MFPA, ADSL1, ACO32, C0401, GDL73, PlCD6, OLEO3 and WSD1. These target transcripts are involved in acetyl-CoA generation and carbon chain desaturation, regulation of the levels of very long chain fatty acids, β-oxidation, and lipid transport and metabolism. At the same time, we employed q-PCR to validate the expression of the miRNAs and their target transcripts involved in fatty acid and lipid metabolism; the results suggested that the expression of the miRNAs and that of their target transcripts are negatively correlated, in accord with the expression of each miRNA and its target transcript. The study findings suggest that the identified miRNAs may play an important role in fatty acid and lipid metabolism in seeds of B. napus.

  12. Complete amino acid sequence of the A chain of human complement-classical-pathway enzyme C1r.

    PubMed Central

    Arlaud, G J; Willis, A C; Gagnon, J

    1987-01-01

    The amino acid sequence of human C1r A chain was determined, from sequence analysis performed on fragments obtained from C1r autolytic cleavage, cleavage of methionyl bonds, tryptic cleavages at arginine and lysine residues, and cleavages by staphylococcal proteinase. The polypeptide chain has an N-terminal serine residue and contains 446 amino acid residues (Mr 51,200). The sequence data allow chemical characterization of fragments alpha (positions 1-211), beta (positions 212-279) and gamma (positions 280-446) yielded from C1r autolytic cleavage, and identification of the two major cleavage sites generating these fragments. Position 150 of C1r A chain is occupied by a modified amino acid residue that, upon acid hydrolysis, yields erythro-beta-hydroxyaspartic acid, and that is located in a sequence homologous to the beta-hydroxyaspartic acid-containing regions of Factor IX, Factor X, protein C and protein Z. Sequence comparison reveals internal homology between two segments (positions 10-78 and 186-257). Two carbohydrate moieties are attached to the polypeptide chain, both via asparagine residues at positions 108 and 204. Combined with the previously determined sequence of C1r B chain [Arlaud & Gagnon (1983) Biochemistry 22, 1758-1764], these data give the complete sequence of human C1r. PMID:3036070

  13. Functional characterization of the conserved amino acids in Pop1p, the largest common protein subunit of yeast RNases P and MRP

    PubMed Central

    Xiao, Shaohua; Hsieh, John; Nugent, Rebecca L.; Coughlin, Daniel J.; Fierke, Carol A.; Engelke, David R.

    2006-01-01

    RNase P and RNase MRP are ribonucleoprotein enzymes required for 5′-end maturation of precursor tRNAs (pre-tRNAs) and processing of precursor ribosomal RNAs, respectively. In yeast, RNase P and MRP holoenzymes have eight protein subunits in common, with Pop1p being the largest at >100 kDa. Little is known about the functions of Pop1p, beyond the fact that it binds specifically to the RNase P RNA subunit, RPR1 RNA. In this study, we refined the previous Pop1 phylogenetic sequence alignment and found four conserved regions. Highly conserved amino acids in yeast Pop1p were mutagenized by randomization and conditionally defective mutations were obtained. Effects of the Pop1p mutations on pre-tRNA processing, pre-rRNA processing, and stability of the RNA subunits of RNase P and MRP were examined. In most cases, functional defects in RNase P and RNase MRP in vivo were consistent with assembly defects of the holoenzymes, although moderate kinetic defects in RNase P were also observed. Most mutations affected both pre-tRNA and pre-rRNA processing, but a few mutations preferentially interfered with only RNase P or only RNase MRP. In addition, one temperature-sensitive mutation had no effect on either tRNA or rRNA processing, consistent with an additional role for RNase P, RNase MRP, or Pop1p in some other form. This study shows that the Pop1p subunit plays multiple roles in the assembly and function of RNases P and MRP, and that the functions can be differentiated through the mutations in conserved residues. PMID:16618965

  14. Nucleotide sequences of the Pseudomonas savastanoi indoleacetic acid genes show homology with Agrobacterium tumefaciens T-DNA

    PubMed Central

    Yamada, Tetsuji; Palm, Curtis J.; Brooks, Bob; Kosuge, Tsune

    1985-01-01

    We report the nucleotide sequences of iaaM and iaaH, the genetic determinants for, respectively, tryptophan 2-monooxygenase and indoleacetamide hydrolase, the enzymes that catalyze the conversion of L-tryptophan to indoleacetic acid in the tumor-forming bacterium Pseudomonas syringae pv. savastanoi. The sequence analysis indicates that the iaaM locus contains an open reading frame encoding 557 amino acids that would comprise a protein with a molecular weight of 61,783; the iaaH locus contains an open reading frame of 455 amino acids that would comprise a protein with a molecular weight of 48,515. Significant amino acid sequence homology was found between the predicted sequence of the tryptophan monooxygenase of P. savastanoi and the deduced product of the T-DNA tms-1 gene of the octopine-type plasmid pTiA6NC from Agrobacterium tumefaciens. Strong homology was found in the 25 amino acid sequence in the putative FAD-binding region of tryptophan monooxygenase. Homology was also found in the amino acid sequences representing the central regions of the putative products of iaaH and tms-2 T-DNA. The results suggest a strong similarity in the pathways for indoleacetic acid synthesis encoded by genes in P. savastanoi and in A. tumefaciens T-DNA. Images PMID:16593610

  15. Nucleic and amino acid sequences relating to a novel transketolase, and methods for the expression thereof

    DOEpatents

    Croteau, Rodney Bruce; Wildung, Mark Raymond; Lange, Bernd Markus; McCaskill, David G.

    2001-01-01

    cDNAs encoding 1-deoxyxylulose-5-phosphate synthase from peppermint (Mentha piperita) have been isolated and sequenced, and the corresponding amino acid sequences have been determined. Accordingly, isolated DNA sequences (SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7) are provided which code for the expression of 1-deoxyxylulose-5-phosphate synthase from plants. In another aspect the present invention provides for isolated, recombinant DXPS proteins, such as the proteins having the sequences set forth in SEQ ID NO:4, SEQ ID NO:6 and SEQ ID NO:8. In other aspects, replicable recombinant cloning vehicles are provided which code for plant 1-deoxyxylulose-5-phosphate synthases, or for a base sequence sufficiently complementary to at least a portion of 1-deoxyxylulose-5-phosphate synthase DNA or RNA to enable hybridization therewith. In yet other aspects, modified host cells are provided that have been transformed, transfected, infected and/or injected with a recombinant cloning vehicle and/or DNA sequence encoding a plant 1-deoxyxylulose-5-phosphate synthase. Thus, systems and methods are provided for the recombinant expression of the aforementioned recombinant 1-deoxyxylulose-5-phosphate synthase that may be used to facilitate its production, isolation and purification in significant amounts. Recombinant 1-deoxyxylulose-5-phosphate synthase may be used to obtain expression or enhanced expression of 1-deoxyxylulose-5-phosphate synthase in plants in order to enhance the production of 1-deoxyxylulose-5-phosphate, or its derivatives such as isopentenyl diphosphate (IPP), or may be otherwise employed for the regulation or expression of 1-deoxyxylulose-5-phosphate synthase, or the production of its products.

  16. Gene sequence and predicted amino acid sequence of the motA protein, a membrane-associated protein required for flagellar rotation in Escherichia coli.

    PubMed Central

    Dean, G E; Macnab, R M; Stader, J; Matsumura, P; Burks, C

    1984-01-01

    The motA and motB gene products of Escherichia coli are integral membrane proteins necessary for flagellar rotation. We determined the DNA sequence of the region containing the motA gene and its promoter. Within this sequence, there is an open reading frame of 885 nucleotides, which with high probability (98% confidence level) meets criteria for a coding sequence. The 295-residue amino acid translation product had a molecular weight of 31,974, in good agreement with the value determined experimentally by gel electrophoresis. The amino acid sequence, which was quite hydrophobic, was subjected to a theoretical analysis designed to predict membrane-spanning alpha-helical segments of integral membrane proteins; four such hydrophobic helices were predicted by this treatment. Additional amphipathic helices may also be present. A remarkable feature of the sequence is the existence of two segments of high uncompensated charge density, one positive and the other negative. Possible organization of the protein in the membrane is discussed. Asymmetry in the amino acid composition of translated DNA sequences was used to distinguish between two possible initiation codons. The use of this method as a criterion for authentication of coding regions is described briefly in an Appendix. PMID:6090403

  17. Valproic acid increases conservative homologous recombination frequency and reactive oxygen species formation: a potential mechanism for valproic acid-induced neural tube defects.

    PubMed

    Defoort, Ericka N; Kim, Perry M; Winn, Louise M

    2006-04-01

    Valproic acid, a commonly used antiepileptic agent, is associated with a 1 to 2% incidence of neural tube defects when taken during pregnancy; however, the molecular mechanism by which this occurs has not been elucidated. Previous research suggests that valproic acid exposure leads to an increase in reactive oxygen species (ROS). DNA damage due to ROS can result in DNA double-strand breaks, which can be repaired through homologous recombination (HR), a process that is not error-free and can result in detrimental genetic changes. Because the developing embryo requires tight regulation of gene expression to develop properly, we propose that the loss or dysfunction of genes involved in embryonic development through aberrant HR may ultimately cause neural tube defects. To determine whether valproic acid induces HR, Chinese hamster ovary 3-6 cells, containing a neomycin direct repeat recombination substrate, were exposed to valproic acid for 4 or 24 h. A significant increase in HR after exposure to valproic acid (5 and 10 mM) for 24 h was observed, which seems to occur through a conservative HR mechanism. We also demonstrated that exposure to valproic acid (5 and 10 mM) significantly increased intracellular ROS levels, which were attenuated by preincubation with polyethylene glycol-conjugated (PEG)-catalase. A significant change in the ratio of 8-hydroxy-2'-deoxyguanosine/2'-deoxyguanosine, a measure of DNA oxidation, was not observed after valproic acid exposure; however, preincubation with PEG-catalase significantly blocked the increase in HR. These data demonstrate that valproic acid increases HR frequency and provide a possible mechanism for valproic acid-induced neural tube defects.

  18. Genome Sequence Analysis of the Naphthenic Acid Degrading and Metal Resistant Bacterium Cupriavidus gilardii CR3

    PubMed Central

    Xiao, Jingfa; Hao, Lirui; Crowley, David E.; Zhang, Zhewen; Yu, Jun; Huang, Ning; Huo, Mingxin; Wu, Jiayan

    2015-01-01

    Cupriavidus sp. are generally heavy metal tolerant bacteria with the ability to degrade a variety of aromatic hydrocarbon compounds, although the degradation pathways and substrate versatilities remain largely unknown. Here we studied the bacterium Cupriavidus gilardii strain CR3, which was isolated from a natural asphalt deposit, and which was shown to utilize naphthenic acids as a sole carbon source. Genome sequencing of C. gilardii CR3 was carried out to elucidate possible mechanisms for the naphthenic acid biodegradation. The genome of C. gilardii CR3 was composed of two circular chromosomes chr1 and chr2 of respectively 3,539,530 bp and 2,039,213 bp in size. The genome for strain CR3 encoded 4,502 putative protein-coding genes, 59 tRNA genes, and many other non-coding genes. Many genes were associated with xenobiotic biodegradation and metal resistance functions. Pathway prediction for degradation of cyclohexanecarboxylic acid, a representative naphthenic acid, suggested that naphthenic acid undergoes initial ring-cleavage, after which the ring fission products can be degraded via several plausible degradation pathways including a mechanism similar to that used for fatty acid oxidation. The final metabolic products of these pathways are unstable or volatile compounds that were not toxic to CR3. Strain CR3 was also shown to have tolerance to at least 10 heavy metals, which was mainly achieved by self-detoxification through ion efflux, metal-complexation and metal-reduction, and a powerful DNA self-repair mechanism. Our genomic analysis suggests that CR3 is well adapted to survive the harsh environment in natural asphalts containing naphthenic acids and high concentrations of heavy metals. PMID:26301592

  19. Genome Sequence Analysis of the Naphthenic Acid Degrading and Metal Resistant Bacterium Cupriavidus gilardii CR3.

    PubMed

    Wang, Xiaoyu; Chen, Meili; Xiao, Jingfa; Hao, Lirui; Crowley, David E; Zhang, Zhewen; Yu, Jun; Huang, Ning; Huo, Mingxin; Wu, Jiayan

    2015-01-01

    Cupriavidus sp. are generally heavy metal tolerant bacteria with the ability to degrade a variety of aromatic hydrocarbon compounds, although the degradation pathways and substrate versatilities remain largely unknown. Here we studied the bacterium Cupriavidus gilardii strain CR3, which was isolated from a natural asphalt deposit, and which was shown to utilize naphthenic acids as a sole carbon source. Genome sequencing of C. gilardii CR3 was carried out to elucidate possible mechanisms for the naphthenic acid biodegradation. The genome of C. gilardii CR3 was composed of two circular chromosomes chr1 and chr2 of respectively 3,539,530 bp and 2,039,213 bp in size. The genome for strain CR3 encoded 4,502 putative protein-coding genes, 59 tRNA genes, and many other non-coding genes. Many genes were associated with xenobiotic biodegradation and metal resistance functions. Pathway prediction for degradation of cyclohexanecarboxylic acid, a representative naphthenic acid, suggested that naphthenic acid undergoes initial ring-cleavage, after which the ring fission products can be degraded via several plausible degradation pathways including a mechanism similar to that used for fatty acid oxidation. The final metabolic products of these pathways are unstable or volatile compounds that were not toxic to CR3. Strain CR3 was also shown to have tolerance to at least 10 heavy metals, which was mainly achieved by self-detoxification through ion efflux, metal-complexation and metal-reduction, and a powerful DNA self-repair mechanism. Our genomic analysis suggests that CR3 is well adapted to survive the harsh environment in natural asphalts containing naphthenic acids and high concentrations of heavy metals.

  20. Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

    DOEpatents

    Weier, Heinz-Ulrich G.; Gray, Joe W.

    1995-01-01

    A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity.

  1. Unconventional amino acid sequence of the sun anemone (Stoichactis helianthus) polypeptide neurotoxin

    SciTech Connect

    Kem, W.; Dunn, B.; Parten, B.; Pennington, M.; Price, D.

    1986-05-01

    A 5000 dalton polypeptide neurotoxin (Sh-NI) purified by G50 Sephadex, P-cellulose, and SP-Sephadex chromatography was homogeneous by isoelectric focusing. Sh-NI was highly toxic to crayfish (LD₅₀ 0.6 μg/kg) but without effect upon mice at 15,000 μg/kg (i.p. injection). The reduced, ³H-carboxymethylated toxin and its fragments were subjected to automatic Edman degradation and the resulting PTH-amino acids were identified by HPLC, back hydrolysis, and scintillation counting. Peptides resulting from proteolytic (clostripain, staphylococcal protease) and chemical (tryptophan) cleavage were sequenced. The sequence is: AACKCDDEGPDIRTAPLTGTVDLGSCNAGWEKCASYYTIIADCCRKKK. This sequence differs considerably from the homologous Anemonia and Anthopleura toxins; many of the identical residues (6 half-cystines, G9, P10, R13, G19, G29, W30) are probably critical for folding rather than receptor recognition. However, the Sh-NI sequence closely resembles Radioanthus macrodactylus neurotoxin III and R. paumotensis II. The authors propose that Sh-NI and related Radioanthus toxins act upon a different site on the sodium channel.
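
    The residue numbering quoted in this abstract can be checked directly against the printed sequence. The short Python sketch below is an illustration added for the reader, not part of the original record; the helper names are hypothetical. It counts the half-cystines and confirms that each listed residue sits at the stated 1-based position.

```python
# Sequence copied from the abstract above (one-letter amino acid codes).
SH_NI = "AACKCDDEGPDIRTAPLTGTVDLGSCNAGWEKCASYYTIIADCCRKKK"

# Residues the abstract lists as identical to the homologous toxins,
# written as (one-letter code, 1-based position).
LISTED = [("G", 9), ("P", 10), ("R", 13), ("G", 19), ("G", 29), ("W", 30)]


def residue_at(seq: str, pos: int) -> str:
    """Return the residue at a 1-based position (hypothetical helper)."""
    return seq[pos - 1]


def cysteine_positions(seq: str) -> list[int]:
    """Return the 1-based positions of every cysteine in the sequence."""
    return [i for i, aa in enumerate(seq, start=1) if aa == "C"]


if __name__ == "__main__":
    print("length:", len(SH_NI))
    print("half-cystine positions:", cysteine_positions(SH_NI))
    for aa, pos in LISTED:
        status = "ok" if residue_at(SH_NI, pos) == aa else "MISMATCH"
        print(f"{aa}{pos}: {status}")
```

    The same check can be reused for any abstract that quotes residue positions alongside a full sequence.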

  2. Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

    DOEpatents

    Weier, H.U.G.; Gray, J.W.

    1995-06-27

    A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity. 18 figs.

  3. Sequence-defined bioactive macrocycles via an acid-catalysed cascade reaction

    NASA Astrophysics Data System (ADS)

    Porel, Mintu; Thornlow, Dana N.; Phan, Ngoc N.; Alabi, Christopher A.

    2016-06-01

    Synthetic macrocycles derived from sequence-defined oligomers are a unique structural class whose ring size, sequence and structure can be tuned via precise organization of the primary sequence. Similar to peptides and other peptidomimetics, these well-defined synthetic macromolecules become pharmacologically relevant when bioactive side chains are incorporated into their primary sequence. In this article, we report the synthesis of oligothioetheramide (oligoTEA) macrocycles via a one-pot acid-catalysed cascade reaction. The versatility of the cyclization chemistry and modularity of the assembly process was demonstrated via the synthesis of >20 diverse oligoTEA macrocycles. Structural characterization via NMR spectroscopy revealed the presence of conformational isomers, which enabled the determination of local chain dynamics within the macromolecular structure. Finally, we demonstrate the biological activity of oligoTEA macrocycles designed to mimic facially amphiphilic antimicrobial peptides. The preliminary results indicate that macrocyclic oligoTEAs with just two-to-three cationic charge centres can elicit potent antibacterial activity against Gram-positive and Gram-negative bacteria.

  4. Complete amino acid sequence of ananain and a comparison with stem bromelain and other plant cysteine proteases.

    PubMed Central

    Lee, K L; Albee, K L; Bernasconi, R J; Edmunds, T

    1997-01-01

    The amino acid sequences of ananain (EC 3.4.22.31) and stem bromelain (EC 3.4.22.32), two cysteine proteases from pineapple stem, are similar, yet ananain and stem bromelain possess distinct specificities towards synthetic peptide substrates and different reactivities towards the cysteine protease inhibitors E-64 and chicken egg white cystatin. We present here the complete amino acid sequence of ananain and compare it with the reported sequences of pineapple stem bromelain, papain and chymopapain from papaya and actinidin from kiwifruit. Ananain is comprised of 216 residues with a theoretical mass of 23464 Da. This primary structure includes a sequence insert between residues 170 and 174 not present in stem bromelain or papain and a hydrophobic series of amino acids adjacent to His-157. It is possible that these sequence differences contribute to the different substrate and inhibitor specificities exhibited by ananain and stem bromelain. PMID:9355753

  5. Microbial community dynamics in bioaugmented sequencing batch reactors for bromoamine acid removal.

    PubMed

    Qu, Yuanyuan; Zhou, Jiti; Wang, Jing; Fu, Xiang; Xing, Linlin

    2005-05-01

    Sphingomonas xenophaga QYY with the ability to degrade bromoamine acid (BAA) was previously isolated from sludge samples. The enhancement of BAA removal by strain QYY in sequencing batch reactors (SBRs) was investigated in this study. The results showed that augmented SBRs exhibited stronger abilities to degrade BAA than the non-augmented control one. In order to estimate the relationship between community dynamics and function of augmented SBRs, a combined method based on fingerprints (ribosomal intergenic spacer analysis, RISA) and 16S rRNA gene sequencing was used. The results indicated that the microbial community dynamics were substantially changed, and the introduced strain QYY was persistent in the augmented systems. This study suggests that it is feasible and potentially useful to enhance BAA removal using BAA-degrading bacteria, such as S. xenophaga QYY.

  6. The complete sequence of a Spanish isolate of Broad bean wilt virus 1 (BBWV-1) reveals a high variability and conserved motifs in the genus Fabavirus.

    PubMed

    Ferrer, R M; Guerri, J; Luis-Arteaga, M S; Moreno, P; Rubio, L

    2005-10-01

    The genome of a Spanish isolate of Broad bean wilt virus-1 (BBWV-1) was completely sequenced and compared with available sequences of other isolates of the genus Fabavirus (BBWV-1 and BBWV-2). This consisted of two RNAs of 5814 and 3431 nucleotides, respectively, and their organization was similar to that of other members of the family Comoviridae. Its mean nucleotide identity with a BBWV-1 American isolate was 81.5%, and between 59.8 and 63.5% with seven BBWV-2 isolates. Our analysis showed sequence stretches in the 5' non-coding regions which are conserved in both genomic RNAs and in BBWV-1 and BBWV-2 isolates.

  7. [Measurement of the amino acid sequence for the fusion protein FP3 with LC-MS/MS].

    PubMed

    Li, Xiang; Gao, Xiang-Dong; Tao, Lei; Pei, De-Ning; Guo, Ying; Rao, Chun-Ming; Wang, Jun-Zhi

    2012-02-01

    The amino acid sequence of the fusion protein FP3 was measured by two types of LC-MS/MS and its primary structure was confirmed. After reduction and alkylation, the protein was digested with trypsin, and glycosyl groups in the glycopeptides were removed by PNGase F. The mixed peptides were separated by LC; Q-TOF and ion trap tandem mass spectrometry were then used to measure the b and y fragment ions of each peptide to analyze the amino acid sequence of fusion protein FP3. Seventy-six percent of the full amino acid sequence of the fusion protein FP3 was measured by LC-ESI-Q-TOF, with the remaining 24% completed by LC-ESI-Trap. Because LC-MS and tandem mass spectrometry are rapid, sensitive and accurate for measuring protein amino acid sequences, they are an important approach to the structural analysis and identification of recombinant proteins.
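
    For readers unfamiliar with b and y fragment ions, the Python sketch below shows how their singly charged m/z values are conventionally computed from monoisotopic residue masses. It is an illustration only and not the authors' software: the peptide "ACDEK" is hypothetical, and carbamidomethylation of cysteine (+57.02146 Da) is an assumed alkylation chemistry, since the abstract does not name the reagent.

```python
# Monoisotopic residue masses in Da (standard values).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728
CARBAMIDOMETHYL = 57.02146  # assumption: cysteines alkylated with iodoacetamide


def residue_masses(peptide, alkylated_cys=True):
    """Per-residue masses, optionally adding the assumed Cys modification."""
    masses = []
    for aa in peptide:
        mass = MONO[aa]
        if aa == "C" and alkylated_cys:
            mass += CARBAMIDOMETHYL
        masses.append(mass)
    return masses


def b_ions(peptide, alkylated_cys=True):
    """Singly charged b1..b(n-1): N-terminal residue sums plus one proton."""
    ions, running = [], 0.0
    for mass in residue_masses(peptide, alkylated_cys)[:-1]:
        running += mass
        ions.append(running + PROTON)
    return ions


def y_ions(peptide, alkylated_cys=True):
    """Singly charged y1..y(n-1): C-terminal residue sums plus water and a proton."""
    ions, running = [], 0.0
    for mass in reversed(residue_masses(peptide, alkylated_cys)[1:]):
        running += mass
        ions.append(running + WATER + PROTON)
    return ions


if __name__ == "__main__":
    peptide = "ACDEK"  # hypothetical tryptic peptide, for illustration only
    for i, mz in enumerate(b_ions(peptide), start=1):
        print(f"b{i}: {mz:.4f}")
    for i, mz in enumerate(y_ions(peptide), start=1):
        print(f"y{i}: {mz:.4f}")
```

    Complementary ions satisfy b(i) + y(n-i) = M + 2 protons, where M is the neutral peptide mass; this identity is a quick sanity check when matching a fragment ladder to a candidate sequence.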

  8. NullSeq: A Tool for Generating Random Coding Sequences with Desired Amino Acid and GC Contents

    PubMed Central

    Liu, Sophia S.; Hockenberry, Adam J.; Lancichinetti, Andrea; Jewett, Michael C.

    2016-01-01

    The existence of over- and under-represented sequence motifs in genomes provides evidence of selective evolutionary pressures on biological mechanisms such as transcription, translation, ligand-substrate binding, and host immunity. In order to accurately identify motifs and other genome-scale patterns of interest, it is essential to be able to generate accurate null models that are appropriate for the sequences under study. While many tools have been developed to create random nucleotide sequences, protein coding sequences are subject to a unique set of constraints that complicates the process of generating appropriate null models. There are currently no tools available that allow users to create random coding sequences with specified amino acid composition and GC content for the purpose of hypothesis testing. Using the principle of maximum entropy, we developed a method that generates unbiased random sequences with pre-specified amino acid and GC content, which we have developed into a Python package. Our method is the simplest way to obtain maximally unbiased random sequences that are subject to GC usage and primary amino acid sequence constraints. Furthermore, this approach can easily be expanded to create unbiased random sequences that incorporate more complicated constraints such as individual nucleotide usage or even di-nucleotide frequencies. The ability to generate correctly specified null models will allow researchers to accurately identify sequence motifs, which will lead to a better understanding of biological processes as well as more effective engineering of biological systems. PMID:27835644
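
    The maximum-entropy idea in this abstract can be illustrated with a small Python sketch. This is not the NullSeq package or its API; every name below is hypothetical. Under the assumption that the only constraints are a fixed amino acid sequence and a target mean GC content, the maximum-entropy distribution over synonymous codons weights each codon by exp(lam × number of G/C bases), where `lam` is a free bias parameter that one would tune (for example by bisection) until the expected GC content matches the target.

```python
import math
import random
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Group synonymous codons by the amino acid they encode.
SYNONYMS = {}
for codon, aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def gc_count(codon):
    """Number of G or C bases in a codon."""
    return sum(base in "GC" for base in codon)


def sample_coding_sequence(protein, lam, rng):
    """Draw one codon per residue with weight exp(lam * GC count).

    lam = 0 gives uniform synonymous codon usage; positive values favour
    GC-rich codons and negative values favour AT-rich codons.
    """
    codons = []
    for aa in protein:
        options = SYNONYMS[aa]
        weights = [math.exp(lam * gc_count(c)) for c in options]
        codons.append(rng.choices(options, weights=weights, k=1)[0])
    return "".join(codons)


def translate(dna):
    """Translate an in-frame DNA string back to one-letter amino acids."""
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


def gc_fraction(dna):
    """Fraction of G or C bases in a DNA string."""
    return sum(base in "GC" for base in dna) / len(dna)


if __name__ == "__main__":
    protein = "MKTAYIAKQR"  # hypothetical peptide, for illustration only
    rng = random.Random(0)
    for lam in (-1.0, 0.0, 1.0):
        dna = sample_coding_sequence(protein, lam, rng)
        print(lam, dna, round(gc_fraction(dna), 2))
```

    Every sampled sequence encodes the same protein, so differences between real and sampled sequences cannot be attributed to amino acid composition; only the codon-level GC bias varies with `lam`.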

  9. A common set of conserved motifs in a vast variety of putative nucleic acid-dependent ATPases including MCM proteins involved in the initiation of eukaryotic DNA replication.

    PubMed Central

    Koonin, E V

    1993-01-01

    A new superfamily of (putative) DNA-dependent ATPases is described that includes the ATPase domains of prokaryotic NtrC-related transcription regulators, MCM proteins involved in the initiation of eukaryotic DNA replication, and a group of uncharacterized bacterial and chloroplast proteins. MCM proteins are shown to contain a modified form of the ATP-binding motif and are predicted to mediate ATP-dependent opening of double-stranded DNA in the replication origins. In a second line of investigation, it is demonstrated that the products of unidentified open reading frames from Marchantia mitochondria and from yeast, and a domain of a baculovirus protein involved in viral DNA replication are related to the superfamily III of DNA and RNA helicases that previously has been known to include only proteins of small viruses. Comparison of the multiple alignments showed that the proteins of the NtrC superfamily and the helicases of superfamily III share three related sequence motifs tightly packed in the ATPase domain that consists of 100-150 amino acid residues. A similar array of conserved motifs is found in the family of DnaA-related ATPases. It is hypothesized that the three large groups of nucleic acid-dependent ATPases have similar structure of the core ATPase domain and have evolved from a common ancestor. PMID:8332451

  10. The Ras protein superfamily: Evolutionary tree and role of conserved amino acids

    PubMed Central

    Fuentes, Gloria; Rausell, Antonio

    2012-01-01

    The Ras superfamily is a fascinating example of functional diversification in the context of a preserved structural framework and a prototypic GTP binding site. Thanks to the availability of complete genome sequences of species representing important evolutionary branch points, we have analyzed the composition and organization of this superfamily at a greater level than was previously possible. Phylogenetic analysis of gene families at the organism and sequence level revealed complex relationships between the evolution of this protein superfamily sequence and the acquisition of distinct cellular functions. Together with advances in computational methods and structural studies, the sequence information has helped to identify features important for the recognition of molecular partners and the functional specialization of different members of the Ras superfamily. PMID:22270915

  11. Comparative genomic analysis of a neurotoxigenic Clostridium species using partial genome sequence: Phylogenetic analysis of a few conserved proteins involved in cellular processes and metabolism.

    PubMed

    Alam, Syed Imteyaz; Dixit, Aparna; Tomar, Arvind; Singh, Lokendra

    2010-04-01

    Clostridial organisms produce neurotoxins, which are generally regarded as the most potent toxic substances of biological origin and potential biological warfare agents. Clostridium tetani produces tetanus neurotoxin and is responsible for the fatal tetanus disease. In spite of the extensive immunization regimen, the disease is an important cause of death especially among neonates. Strains of C. tetani have not been genetically characterized except the complete genome sequencing of strain E88. The present study reports the genetic makeup and phylogenetic affiliations of an environmental strain of this bacterium with respect to C. tetani E88 and other clostridia. A shotgun library was constructed from the genomic DNA of C. tetani drde, isolated from a decaying fish sample. Unique clones were sequenced and the sequences compared with its closest relative, C. tetani E88. A total of 275 clones were obtained and 32,457 bases of non-redundant sequence were generated. A total of 150 base changes were observed over the entire length of sequence obtained, including additions, deletions and base substitutions. Of the total 120 ORFs detected, 48 exhibited closest similarity to E88 proteins of which three are hypothetical proteins. Eight of the ORFs exhibited similarity with hypothetical proteins from other organisms and 10 aligned with other proteins from unrelated organisms. There is an overall conservation of protein sequences among the two strains of C. tetani and. Selected ORFs involved in cellular processes and metabolism were subjected to phylogenetic analysis.

  12. Sequence selective recognition of double-stranded RNA using triple helix-forming peptide nucleic acids.

    PubMed

    Zengeya, Thomas; Gupta, Pankaj; Rozners, Eriks

    2014-01-01

    Noncoding RNAs are attractive targets for molecular recognition because of the central role they play in gene expression. Since most noncoding RNAs are in a double-helical conformation, recognition of such structures is a formidable problem. Herein, we describe a method for sequence-selective recognition of biologically relevant double-helical RNA (illustrated on ribosomal A-site RNA) using peptide nucleic acids (PNA) that form a triple helix in the major groove of RNA under physiologically relevant conditions. Protocols for PNA preparation and binding studies using isothermal titration calorimetry are described in detail.

  13. Fast computational methods for predicting protein structure from primary amino acid sequence

    DOEpatents

    Agarwal, Pratul Kumar

    2011-07-19

    The present invention provides a method utilizing primary amino acid sequence of a protein, energy minimization, molecular dynamics and protein vibrational modes to predict three-dimensional structure of a protein. The present invention also determines possible intermediates in the protein folding pathway. The present invention has important applications to the design of novel drugs as well as protein engineering. The present invention predicts the three-dimensional structure of a protein independent of size of the protein, overcoming a significant limitation in the prior art.

  14. Ophthalmic acid accumulation in an Escherichia coli mutant lacking the conserved pyridoxal 5'-phosphate-binding protein YggS.

    PubMed

    Ito, Tomokazu; Yamauchi, Ayako; Hemmi, Hisashi; Yoshimura, Tohru

    2016-12-01

    Escherichia coli YggS is a highly conserved pyridoxal 5'-phosphate (PLP)-binding protein whose biochemical function is currently unknown. A previous study with a yggS-deficient E. coli strain (ΔyggS) demonstrated that YggS controls l-Ile- and l-Val-metabolism by modulating 2-ketobutyrate (2-KB), l-2-aminobutyrate (l-2-AB), and/or coenzyme A (CoA) availability in a PLP-dependent fashion. In this study, we found that ΔyggS accumulates an unknown metabolite as judged by amino acid analyses. LC/MS and MS/MS analyses of the compound with propyl chloroformate derivatization, and co-chromatography analysis identified this compound as γ-l-glutamyl-l-2-aminobutyryl-glycine (ophthalmic acid), a glutathione (GSH) analogue in which the l-Cys moiety is replaced by l-2-AB. We also determine the metabolic consequence of the yggS mutation. Absence of YggS initially increases l-2-AB availability, and then causes ophthalmic acid accumulation and CoA limitation in the cell. The expression of a γ-glutamylcysteine synthetase and a glutathione synthetase in a ΔyggS background causes high-level accumulation of ophthalmic acid in the cells (∼1.2 nmol/mg cells) in a minimal synthetic medium. This opens the possibility of a first fermentative production of ophthalmic acid.

  15. Fluorescence energy transfer as a probe for nucleic acid structures and sequences.

    PubMed Central

    Mergny, J L; Boutorine, A S; Garestier, T; Belloc, F; Rougée, M; Bulychev, N V; Koshkin, A A; Bourson, J; Lebedev, A V; Valeur, B

    1994-01-01

    The primary or secondary structure of single-stranded nucleic acids has been investigated with fluorescent oligonucleotides, i.e., oligonucleotides covalently linked to a fluorescent dye. Five different chromophores were used: 2-methoxy-6-chloro-9-amino-acridine, coumarin 500, fluorescein, rhodamine and ethidium. The chemical synthesis of derivatized oligonucleotides is described. Hybridization of two fluorescent oligonucleotides to adjacent nucleic acid sequences led to fluorescence excitation energy transfer between the donor and the acceptor dyes. This phenomenon was used to probe primary and secondary structures of DNA fragments and the orientation of oligodeoxynucleotides synthesized with the alpha-anomers of nucleoside units. Fluorescence energy transfer can be used to reveal the formation of hairpin structures and the translocation of genes between two chromosomes. PMID:8152922

  16. Amino acid sequence of two neurotoxins from the venom of the Egyptian black snake (Walterinnesia aegyptia).

    PubMed

    Samejima, Y; Aoki-Tomomatsu, Y; Yanagisawa, M; Mebs, D

    1997-02-01

    The venom of the Egyptian black snake Walterinnesia aegyptia contains at least three toxins, which act postsynaptically to block the neuromuscular transmission of isolated rat phrenic nerve-diaphragm and chicken biventer cervicis muscle. The complete amino acid sequences of two of the toxins, W-III and W-IV, each consisting of 62 amino acid residues, were elucidated by Edman degradation of fragments obtained after Staphylococcus aureus protease and prolylpeptidase digestion. Although the toxins exhibit close structural homology to other short-chain postsynaptic neurotoxins from Elapidae venoms, toxin W-IV is unique in having a free SH-group (cysteine) at position 16. In position 35 of W-III, which is located at the tip of the central loop, threonine is replaced by lysine, which may alter the interaction of the toxin with the acetylcholine receptor, since the toxin is seven times less lethal than toxin W-IV.

  17. Complete genome sequence of Lactococcus lactis IO-1, a lactic acid bacterium that utilizes xylose and produces high levels of L-lactic acid.

    PubMed

    Kato, Hiroaki; Shiwa, Yuh; Oshima, Kenshiro; Machii, Miki; Araya-Kojima, Tomoko; Zendo, Takeshi; Shimizu-Kadota, Mariko; Hattori, Masahira; Sonomoto, Kenji; Yoshikawa, Hirofumi

    2012-04-01

    We report the complete genome sequence of Lactococcus lactis IO-1 (= JCM7638). It is a nondairy lactic acid bacterium that produces nisin Z, ferments xylose, and produces predominantly L-lactic acid at high xylose concentrations. From ortholog analysis with five other L. lactis strains, IO-1 was identified as L. lactis subsp. lactis.

  18. Complete genome sequence of Bacillus amyloliquefaciens LL3, which exhibits glutamic acid-independent production of poly-γ-glutamic acid.

    PubMed

    Geng, Weitao; Cao, Mingfeng; Song, Cunjiang; Xie, Hui; Liu, Li; Yang, Chao; Feng, Jun; Zhang, Wei; Jin, Yinghong; Du, Yang; Wang, Shufang

    2011-07-01

    Bacillus amyloliquefaciens is one of the most prevalent Gram-positive aerobic spore-forming bacteria with the ability to synthesize polysaccharides and polypeptides. Here, we report the complete genome sequence of B. amyloliquefaciens LL3, which was isolated from fermented food and exhibits glutamic acid-independent production of poly-γ-glutamic acid.

  19. Formation Sequences of Iron Minerals in the Acidic Alteration Products and Variation of Hydrothermal Fluid Conditions

    NASA Astrophysics Data System (ADS)

    Isobe, H.; Yoshizawa, M.

    2008-12-01

    Iron minerals have an important role in environmental issues not only on the Earth but also on other terrestrial planets. Iron mineral species related to alteration products of primary minerals with surface or subsurface fluids are characterized by temperature, acidity and redox conditions of the fluids. Various iron-bearing alteration products can be seen around fumaroles in geothermal/volcanic areas. In this study, zonal structures of iron minerals in alteration products of the geothermal area are observed to elucidate temporal and spatial variation of hydrothermal fluids. The pyroxene-amphibole andesite of Garan-dake volcano, Oita, Japan, is altered by acidic hydrothermal fluid, which leaches out elements other than Si to form cristobalite. Hand specimens with an unaltered or weakly altered core and a cristobalite crust show various sequences of layers. XRD analysis revealed that the degree of alteration is represented by the abundance of cristobalite. Intermediately altered layers are characterized by the occurrence of minerals including alunite, pyrite, kaolinite, goethite and hematite. A specimen with a reddish brown core surrounded by a cristobalite-rich white crust has brown colored layers at the boundary of the core and the crust. The reddish core is characterized by the occurrence of crystalline hematite, as shown by XRD. Another hand specimen has a light gray core, which represents reduced conditions, and a white cristobalite crust with light brown and reddish brown layers of ferric iron minerals between the core and the crust. On the other hand, hornblende crystals, a typical ferrous iron-bearing mineral of the host rock, are well preserved in some samples with strongly decolorized cristobalite-rich groundmass. Hydrothermal alteration experiments on iron-rich basaltic material show that iron mineral species depend on the acidity and temperature of the fluid. Oxidation states of the iron-bearing mineral species are strongly influenced by the acidity and redox conditions.
Variations of alteration

  20. Design, synthesis, and characterization of a protein sequencing reagent yielding amino acid derivatives with enhanced detectability by mass spectrometry.

    PubMed Central

    Aebersold, R.; Bures, E. J.; Namchuk, M.; Goghari, M. H.; Shushan, B.; Covey, T. C.

    1992-01-01

    We report the design, chemical synthesis, and structural and functional characterization of a novel reagent for protein sequence analysis by the Edman degradation, yielding amino acid derivatives rapidly detectable at high sensitivity by ion-evaporation mass spectrometry. We demonstrate that the reagent 3-[4'(ethylene-N,N,N-trimethylamino)phenyl]-2-isothiocyanate is chemically stable and shows coupling and cyclization/cleavage yields comparable to phenylisothiocyanate, the standard reagent in chemical sequence analysis, under conditions typically encountered in manual or automated sequence analysis. Amino acid derivatives generated with this reagent were detectable by ion-evaporation mass spectrometry at the subfemtomole sensitivity level at a pace of one sample per minute. Furthermore, derivatives were identified by their mass, thus permitting the rapid and highly sensitive determination of the molecular nature of modified amino acids. Derivatives of amino acids with acidic, basic, polar, or hydrophobic side chains were reproducibly detectable at comparable sensitivities. The polar nature of the reagent required covalent immobilization of polypeptides prior to automated sequence analysis. This reagent, used in automated sequence analysis, has the potential for overcoming the limitations in sensitivity, speed, and the ability to characterize modified amino acid residues inherent in the chemical sequencing methods that are currently used. PMID:1304351

  1. Complete Genome Sequence of Enterobacter cloacae UW5, a Rhizobacterium Capable of High Levels of Indole-3-Acetic Acid Production.

    PubMed

    Coulson, Thomas J D; Patten, Cheryl L

    2015-08-06

    We report the complete genome sequence of Enterobacter cloacae UW5, an indole-3-acetic acid-producing rhizobacterium originally isolated from the rhizosphere of grass. The 4.9-Mbp genome has a G+C content of 54% and contains 4,496 protein-coding sequences.

  2. Complete Genome Sequence of Enterobacter cloacae UW5, a Rhizobacterium Capable of High Levels of Indole-3-Acetic Acid Production

    PubMed Central

    Coulson, Thomas J. D.

    2015-01-01

    We report the complete genome sequence of Enterobacter cloacae UW5, an indole-3-acetic acid-producing rhizobacterium originally isolated from the rhizosphere of grass. The 4.9-Mbp genome has a G+C content of 54% and contains 4,496 protein-coding sequences. PMID:26251488

  3. Genome Sequence of the Lactic Acid Bacterium Lactococcus lactis subsp. lactis TOMSC161, Isolated from a Nonscalded Curd Pressed Cheese

    PubMed Central

    Velly, H.; Abraham, A.-L.; Loux, V.; Delacroix-Buchet, A.; Fonseca, F.; Bouix, M.

    2014-01-01

    Lactococcus lactis is a lactic acid bacterium used in the production of many fermented foods, such as dairy products. Here, we report the genome sequence of L. lactis subsp. lactis TOMSC161, isolated from nonscalded curd pressed cheese. This genome sequence provides information in relation to dairy environment adaptation. PMID:25377704

  4. Amino acid sequence and carbohydrate-binding analysis of the N-acetyl-D-galactosamine-specific C-type lectin, CEL-I, from the Holothuroidea, Cucumaria echinata.

    PubMed

    Hatakeyama, Tomomitsu; Matsuo, Noriaki; Shiba, Kouhei; Nishinohara, Shoichi; Yamasaki, Nobuyuki; Sugawara, Hajime; Aoyagi, Haruhiko

    2002-01-01

    CEL-I is one of the Ca2+-dependent lectins that has been isolated from the sea cucumber, Cucumaria echinata. This protein is composed of two identical subunits held by a single disulfide bond. The complete amino acid sequence of CEL-I was determined by sequencing the peptides produced by proteolytic fragmentation of S-pyridylethylated CEL-I. A subunit of CEL-I is composed of 140 amino acid residues. Two intrachain (Cys3-Cys14 and Cys31-Cys135) and one interchain (Cys36) disulfide bonds were also identified from an analysis of the cystine-containing peptides obtained from the intact protein. The similarity between the sequence of CEL-I and that of other C-type lectins was low, while the C-terminal region, including the putative Ca2+ and carbohydrate-binding sites, was relatively well conserved. When the carbohydrate-binding activity was examined by a solid-phase microplate assay, CEL-I showed much higher affinity for N-acetyl-D-galactosamine than for other galactose-related carbohydrates. The association constant of CEL-I for p-nitrophenyl N-acetyl-beta-D-galactosaminide (NP-GalNAc) was determined to be 2.3 × 10^4 M^-1, and the maximum number of bound NP-GalNAc was estimated to be 1.6 by an equilibrium dialysis experiment.

  5. Deoxyribonucleic acid sequence of araBAD promoter mutants of Escherichia coli.

    PubMed

    Horwitz, A H; Morandi, C; Wilcox, G

    1980-05-01

    The controlling site region for the araBAD operon is defined, in part, by two classes of cis-acting constitutive mutations. The araIc mutations allow low-level constitutive expression of araBAD in the absence of the positive regulatory protein coded for by the araC gene, whereas the araXc mutations allow expression of araBAD in the absence of the cyclic adenosine monophosphate receptor protein. Six independently isolated araIc mutations and three independently isolated araXc mutations were cloned onto the plasmid pBR322 using in vitro recombinant deoxyribonucleic acid techniques and in vivo recombination between plasmid and chromosomal deoxyribonucleic acid. The location of these mutations was determined by deoxyribonucleic acid sequence analysis. All of the araIc mutations occurred at position -35 within the araBAD promoter (+1 = messenger ribonucleic acid start for araBAD) and resulted from an AT → GC transition. All of the araXc mutations occurred at position -10 within the araBAD promoter and resulted from a GC → AT transition. Models are presented to explain the mode of action of the araIc and araXc mutations.
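
    Both mutation classes above are described as transitions, that is, a purine replaced by a purine or a pyrimidine by a pyrimidine on a given strand. The following is a minimal illustrative sketch of that classification; the function name is hypothetical and nothing here comes from the paper itself.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("reference and altered base are identical")
    # A transition stays within one chemical class; a transversion crosses classes.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# An AT-to-GC base-pair change reads as A->G on one strand and T->C on the other.
print(classify_substitution("A", "G"), classify_substitution("T", "C"))
# A GC-to-AT base-pair change reads as G->A on one strand and C->T on the other.
print(classify_substitution("G", "A"), classify_substitution("C", "T"))
```

    Either strand of the changed base pair gives the same answer, which is why a base-pair change can be called a transition without naming a strand.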

  6. In the TTF-1 homeodomain the contribution of several amino acids to DNA recognition depends on the bound sequence.

    PubMed Central

    Fabbro, D; Tell, G; Leonardi, A; Pellizzari, L; Pucillo, C; Lonigro, R; Formisano, S; Damante, G

    1996-01-01

    The thyroid transcription factor-1 homeodomain (TTF-1HD) shows a peculiar DNA binding specificity, preferentially recognizing sequences containing the 5'-CAAG-3' core motif. Most other homeodomains instead recognize sites containing the 5'-TAAT-3' core motif. Here, we show that TTF-1HD efficiently recognizes another sequence, called D1, devoid of the 5'-CAAG-3' core motif. Different experimental approaches indicate that TTF-1HD contacts the D1 sequence in a manner which is different to that used to interact with sequences containing the 5'-CAAG-3' core motif. The binding activities that mutants of TTF-1HD display with the D1 sequence or with the sequence containing the 5'-CAAG-3' core motif indicate that the role of several DNA-contacting amino acids is different. In particular, during recognition of the D1 sequence, backbone-interacting amino acids not relevant in binding to sequences containing the 5'-CAAG-3' core motif play an important role. In the TTF-1HD, therefore, the contribution of several amino acids to DNA recognition depends on the bound sequence. These data indicate that although a common bonding network exists in all of the HD/DNA complexes, peculiarities important for DNA recognition may occur in single cases. PMID:8811078

  7. Rdh10a Provides a Conserved Critical Step in the Synthesis of Retinoic Acid during Zebrafish Embryogenesis

    PubMed Central

    D’Aniello, Enrico; Ravisankar, Padmapriyadarshini; Waxman, Joshua S.

    2015-01-01

    The first step in the conversion of vitamin A into retinoic acid (RA) in embryos requires retinol dehydrogenases (RDHs). Recent studies have demonstrated that RDH10 is a critical core component of the machinery that produces RA in mouse and Xenopus embryos. Whether the conservation of Rdh10 function in the production of RA extends to teleost embryos has not been investigated. Here, we report that zebrafish Rdh10a deficient embryos have defects consistent with loss of RA signaling, including anteriorization of the nervous system and enlarged hearts with increased cardiomyocyte number. While knockdown of Rdh10a alone produces relatively mild RA deficient phenotypes, Rdh10a can sensitize embryos to RA deficiency and enhance phenotypes observed when Aldh1a2 function is perturbed. Moreover, excess Rdh10a enhances embryonic sensitivity to retinol, which has relatively mild teratogenic effects compared to retinal and RA treatment. Performing Rdh10a regulatory expression analysis, we also demonstrate that a conserved teleost rdh10a enhancer requires Pax2 sites to drive expression in the eyes of transgenic embryos. Altogether, our results demonstrate that Rdh10a has a conserved requirement in the first step of RA production within vertebrate embryos. PMID:26394147

  8. The amino acid sequences and activities of synergistic hemolysins from Staphylococcus cohnii.

    PubMed

    Mak, Pawel; Maszewska, Agnieszka; Rozalska, Malgorzata

    2008-10-01

    Staphylococcus cohnii ssp. cohnii and S. cohnii ssp. urealyticus are coagulase-negative staphylococci that were long considered unable to cause infections. This situation changed recently and pathogenic strains of these bacteria were isolated from hospital environments, patients and medical staff. Most of the isolated strains were resistant to many antibiotics. The present work describes isolation and characterization of several synergistic peptide hemolysins produced by these bacteria and acting as virulence factors responsible for hemolytic and cytotoxic activities. Amino acid sequences of respective hemolysins from S. cohnii ssp. cohnii (named as H1C, H2C and H3C) and S. cohnii ssp. urealyticus (H1U, H2U and H3U) were identical. Peptides H1 and H3 possessed significant amino acid homology to three synergistic hemolysins secreted by Staphylococcus lugdunensis and to a putative antibacterial peptide produced by Staphylococcus saprophyticus ssp. saprophyticus. On the other hand, hemolysin H2 had a unique sequence. All isolated peptides lysed red cells from different mammalian species and exerted a cytotoxic effect on human fibroblasts.

  9. Complete amino acid sequence of a Lolium perenne (perennial rye grass) pollen allergen, Lol p II.

    PubMed

    Ansari, A A; Shenbagamurthi, P; Marsh, D G

    1989-07-05

    The complete amino acid sequence of a Lolium perenne (rye grass) pollen allergen, Lol p II was determined by automated Edman degradation of the protein and selected fragments. Cleavage of the protein by enzymatic and chemical techniques established an unambiguous sequence for the protein. Lol p II contains 97 amino acid residues, with a calculated molecular weight of 10,882. The protein lacks cysteine and glutamine and shows no evidence of glycosylation. Theoretical predictions by Fraga's (Fraga, S. (1982) Can. J. Chem. 60, 2606-2610) and Hopp and Woods' (Hopp, T. P., and Woods, K. R. (1981) Proc. Natl. Acad. Sci. U.S.A. 78, 3824-3828) methods indicate the presence of four hydrophilic regions, which may contribute to sequential or parts of conformational B-cell epitopes. Analysis of amphipathic regions by Berzofsky's method indicates the presence of a highly amphipathic region, which may contain, or contribute to, an Ia/T-cell epitope. This latter segment of Lol p II was found to be highly homologous with an antibody-binding segment of the major rye allergen Lol p I and may explain why immune responsiveness to both the allergens is associated with HLA-DR3.

  10. The Sequence-Specific Cellular Uptake of Spherical Nucleic Acid Nanoparticle Conjugates

    PubMed Central

    Narayan, Suguna P.; Choi, Chung Hang J.; Hao, Liangliang; Calabrese, Colin M.; Auyeung, Evelyn; Zhang, Chuan; Goor, Olga J.G.M.

    2015-01-01

    We investigated the sequence-dependent cellular uptake of spherical nucleic acid nanoparticle conjugates (SNAs). This process occurs by interaction with class A scavenger receptors (SR-A) and caveolae-mediated endocytosis. It is known that linear poly(guanine) (poly G) is a natural ligand for SR-A, and it has been proposed that interaction of poly G with SR-A is dependent on the formation of G-quadruplexes. Since G-rich oligonucleotides are known to interact strongly with SR-A, we hypothesized that SNAs with higher G contents would be able to enter cells in larger amounts than SNAs composed of other nucleotides, and as such we measured cellular internalization of SNAs as a function of constituent oligonucleotide sequence. Indeed, SNAs with enriched G content show the highest cellular uptake. Using this hypothesis, we chemically conjugated a small molecule (camptothecin) with SNAs to create drug-SNA conjugates and observed that poly G SNAs deliver the most camptothecin to cells and have the highest cytotoxicity in cancer cells. Our data elucidate important design considerations for enhancing the intracellular delivery of spherical nucleic acids. PMID:26097111

  11. Partial amino acid sequences around sulfhydryl groups of soybean beta-amylase.

    PubMed

    Nomura, K; Mikami, B; Morita, Y

    1987-08-01

    Sulfhydryl (SH) groups of soybean beta-amylase were modified with 5-(iodoacetamidoethyl)aminonaphthalene-1-sulfonate (IAEDANS) and the SH-containing peptides exhibiting fluorescence were purified after chymotryptic digestion of the modified enzyme. The sequence analysis of the peptides derived from the modification of all SH groups in the denatured enzyme revealed the existence of six SH groups, in contrast to five reported previously. One of them was found to have extremely low reactivity toward SH-reagents without reduction. In the native state, IAEDANS reacted with 2 mol of SH groups per mol of the enzyme (SH1 and SH2), accompanied by inactivation of the enzyme owing to the modification of SH2 located near the active site of this enzyme. The selective modification of SH2 with IAEDANS was attained after the blocking of SH1 with 5,5'-dithiobis-(2-nitrobenzoic acid). The amino acid sequences of the peptides containing SH1 and SH2 were determined to be Cys-Ala-Asn-Pro-Gln and His-Gln-Cys-Gly-Gly-Asn-Val-Gly-Asp-Ile-Val-Asn-Ile-Pro-Ile-Pro-Gln-Trp, respectively.
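
    The two peptide sequences above are given in three-letter code. As a small illustrative sketch (the helper names are hypothetical, not from the paper), they can be converted to one-letter code and the position of the SH-bearing cysteine located within each peptide:

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(peptide: str) -> str:
    """Convert a hyphen-separated three-letter peptide to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in peptide.split("-"))


def cysteine_positions(sequence: str) -> list[int]:
    """Return 1-based positions of cysteine within a one-letter sequence."""
    return [i for i, aa in enumerate(sequence, start=1) if aa == "C"]


# Peptide sequences exactly as quoted in the abstract.
SH1_PEPTIDE = "Cys-Ala-Asn-Pro-Gln"
SH2_PEPTIDE = ("His-Gln-Cys-Gly-Gly-Asn-Val-Gly-Asp-Ile-"
               "Val-Asn-Ile-Pro-Ile-Pro-Gln-Trp")

for name, peptide in (("SH1", SH1_PEPTIDE), ("SH2", SH2_PEPTIDE)):
    seq = to_one_letter(peptide)
    print(name, seq, len(seq), cysteine_positions(seq))
```

    Positions here are counted within each peptide only; the abstract does not give residue numbers in the full enzyme, so none are assumed.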

  12. Genome Sequence of Lactobacillus rhamnosus Strain CASL, an Efficient l-Lactic Acid Producer from Cheap Substrate Cassava

    PubMed Central

    Yu, Bo; Su, Fei; Wang, Limin; Zhao, Bo; Qin, Jiayang; Ma, Cuiqing; Xu, Ping; Ma, Yanhe

    2011-01-01

    Lactobacillus rhamnosus is a type of probiotic bacteria with industrial potential for l-lactic acid production. We announce the draft genome sequence of L. rhamnosus CASL (2,855,156 bp with a G+C content of 46.6%), which is an efficient producer of l-lactic acid from cheap, nonfood substrate cassava with a high production titer. PMID:22123765
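
    G+C content, as reported above for the draft genome, is the percentage of bases that are G or C. A minimal sketch of the calculation follows; the function name and the short test sequence are hypothetical, and ignoring ambiguous bases such as N is an assumption rather than a rule taken from the paper.

```python
def gc_content(sequence: str) -> float:
    """Return the percentage of G and C among the unambiguous bases."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T bases")
    gc = sum(base in "GC" for base in counted)
    return 100.0 * gc / len(counted)


# Toy sequence for illustration only; it is not taken from any genome.
toy = "ATGCGCGTTAACGGN"
print(f"G+C content: {gc_content(toy):.1f}%")
```

    For a real assembly the same function would be applied to the concatenated contigs, for example after reading them from a FASTA file.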

  13. Regulation of DNA replication at the end of the mitochondrial D-loop involves the helicase TWINKLE and a conserved sequence element

    PubMed Central

    Jemt, Elisabeth; Persson, Örjan; Shi, Yonghong; Mehmedovic, Majda; Uhler, Jay P.; Dávila López, Marcela; Freyer, Christoph; Gustafsson, Claes M.; Samuelsson, Tore; Falkenberg, Maria

    2015-01-01

    The majority of mitochondrial DNA replication events are terminated prematurely. The nascent DNA remains stably associated with the template, forming a triple-stranded displacement loop (D-loop) structure. However, the function of the D-loop region of the mitochondrial genome remains poorly understood. Using a comparative genomics approach we here identify two closely related 15 nt sequence motifs of the D-loop, strongly conserved among vertebrates. One motif is at the D-loop 5′-end and is part of the conserved sequence block 1 (CSB1). The other motif, here denoted coreTAS, is at the D-loop 3′-end. Both these sequences may prevent transcription across the D-loop region, since light and heavy strand transcription is terminated at CSB1 and coreTAS, respectively. Interestingly, the replication of the nascent D-loop strand, occurring in a direction opposite to that of heavy strand transcription, is also terminated at coreTAS, suggesting that coreTAS is involved in termination of both transcription and replication. Finally, we demonstrate that the loading of the helicase TWINKLE at coreTAS is reversible, implying that this site is a crucial component of a switch between D-loop formation and full-length mitochondrial DNA replication. PMID:26253742

  14. Amino acid sequence of versutoxin, a lethal neurotoxin from the venom of the funnel-web spider Atrax versutus.

    PubMed

    Brown, M R; Sheumack, D D; Tyler, M I; Howden, M E

    1988-03-01

    The complete amino acid sequence of versutoxin, a lethal neurotoxic polypeptide isolated from the venom of male and female funnel-web spiders of the species Atrax versutus, was determined. Sequencing was performed in a gas-phase protein sequencer by automated Edman degradation of the S-carboxymethylated toxin and fragments of it produced by reaction with CNBr. Versutoxin consisted of a single chain of 42 amino acid residues. It was found to have a high proportion of basic residues and of cystine. The primary structure showed marked homology with that of robustoxin, a novel neurotoxin recently isolated from the venom of another funnel-web-spider species, Atrax robustus.