Science.gov

Sample records for acid sequences homologous

  1. Partial amino acid sequence of human factor D: homology with serine proteases.

    PubMed Central

    Volanakis, J E; Bhown, A; Bennett, J C; Mole, J E

    1980-01-01

    Human factor D purified to homogeneity by a modified procedure was subjected to NH2-terminal amino acid sequence analysis by using a modified automated Beckman sequencer. We identified 48 of the first 57 NH2-terminal amino acids in a single sequencer run, using microgram quantities of factor D. The deduced amino acid sequence represents approximately 25% of the primary structure of factor D. This extended NH2-terminal amino acid sequence of factor D was compared to that of other trypsin-related serine proteases. By visual inspection, strong homologies (33-50% identity) were observed with all the serine proteases included in the comparison. Interestingly, factor D showed a higher degree of homology to serine proteases of pancreatic origin than to those of serum origin. PMID:6987665

  2. Partial amino acid sequence of apolipoprotein(a) shows that it is homologous to plasminogen

    SciTech Connect

    Eaton, D.L.; Fless, G.M.; Kohr, W.J.; McLean, J.W.; Xu, Q.T.; Miller, C.G.; Lawn, R.M.; Scanu, A.M.

    1987-05-01

    Apolipoprotein(a) (apo(a)) is a glycoprotein with M_r of approximately 280,000 that is disulfide linked to apolipoprotein B in lipoprotein(a) particles. Elevated plasma levels of lipoprotein(a) are correlated with atherosclerosis. Partial amino acid sequence of apo(a) shows that it has striking homology to plasminogen. Plasminogen is a plasma serine protease zymogen that consists of five homologous and tandemly repeated domains called kringles and a trypsin-like protease domain. The amino-terminal sequence obtained for apo(a) is homologous to the beginning of kringle 4 but not the amino terminus of plasminogen. Apo(a) was subjected to limited proteolysis by trypsin or V8 protease, and fragments generated were isolated and sequenced. Sequences obtained from several of these fragments are highly (77-100%) homologous to plasminogen residues 391-421, which reside within kringle 4. Analysis of these internal apo(a) sequences revealed that apo(a) may contain at least two kringle 4-like domains. A sequence obtained from another tryptic fragment also shows homology to the end of kringle 4 and the beginning of kringle 5. Sequence data obtained from the two tryptic fragments shows homology with the protease domain of plasminogen. One of these sequences is homologous to the sequences surrounding the activation site of plasminogen. Plasminogen is activated by the cleavage of a specific arginine residue by urokinase and tissue plasminogen activator; however, the corresponding site in apo(a) is a serine that would not be cleaved by tissue plasminogen activator or urokinase. Using a plasmin-specific assay, no proteolytic activity could be demonstrated for lipoprotein(a) particles. These results suggest that apo(a) contains kringle-like domains and an inactive protease domain.

  3. Characterization of mouse cellular deoxyribonucleic acid homologous to Abelson murine leukemia virus-specific sequences.

    PubMed Central

    Dale, B; Ozanne, B

    1981-01-01

    The genome of Abelson murine leukemia virus (A-MuLV) consists of sequences derived from both BALB/c mouse deoxyribonucleic acid and the genome of Moloney murine leukemia virus. Using deoxyribonucleic acid linear intermediates as a source of retroviral deoxyribonucleic acid, we isolated a recombinant plasmid which contained 1.9 kilobases of the 3.5-kilobase mouse-derived sequences found in A-MuLV (A-MuLV-specific sequences). We used this clone, designated pSA-17, as a probe in restriction enzyme and Southern blot analyses to examine the arrangement of homologous sequences in BALB/c deoxyribonucleic acid (endogenous Abelson sequences). The endogenous Abelson sequences within the mouse genome were interrupted by noncoding regions, suggesting that a rearrangement of the cell sequences was required to produce the sequence found in the virus. Endogenous Abelson sequences were arranged similarly in mice that were susceptible to A-MuLV tumors and in mice that were resistant to A-MuLV tumors. An examination of three BALB/c plasmacytomas and a BALB/c early B-cell tumor likewise revealed no alteration in the arrangement of the endogenous Abelson sequences. Homology to pSA-17 was also observed in deoxyribonucleic acids prepared from rat, hamster, chicken, and human cells. An isolate of A-MuLV which encoded a 160,000-dalton transforming protein (P160) contained 700 more base pairs of mouse sequences than the standard A-MuLV isolate, which encoded a 120,000-dalton transforming protein (P120). PMID:9279386

  4. ISHAN: sequence homology analysis package.

    PubMed

    Shil, Pratip; Dudani, Niraj; Vidyasagar, Pandit B

    2006-01-01

    Sequence-based homology studies play an important role in evolutionary tracing and classification of proteins. Various methods are available to analyze biological sequence information. However, with the advent of the proteomics era, there is a growing demand for analysis of huge amounts of biological sequence information, and it has become necessary to have programs that provide speedy analysis. ISHAN has been developed as a homology analysis package, built on various sequence analysis tools, viz. FASTA, ALIGN, CLUSTALW, PHYLIP and CODONW (for DNA sequences). This JAVA application offers the user a choice of analysis tools. For testing, ISHAN was applied to perform phylogenetic analysis for sets of Caspase 3 DNA sequences and NF-kappaB p105 amino acid sequences. By integrating several tools it has made analysis much faster and reduced manual intervention. PMID:17274766

  5. Homology analyses of the protein sequences of fatty acid synthases from chicken liver, rat mammary gland, and yeast

    SciTech Connect

    Chang, Soo-Ik; Hammes, G.G.

    1989-11-01

    Homology analyses of the protein sequences of chicken liver and rat mammary gland fatty acid synthases were carried out. The amino acid sequences of the chicken and rat enzymes are 67% identical. If conservative substitutions are allowed, 78% of the amino acids are matched. A region of low homology exists between the functional domains, in particular around amino acid residues 1059-1264 of the chicken enzyme. Homologies between the active sites of chicken and rat and of chicken and yeast enzymes have been analyzed by an alignment method. A high degree of homology exists between the active sites of the chicken and rat enzymes. However, the chicken and yeast enzymes show a lower degree of homology. The NADPH-binding dinucleotide folds of the β-ketoacyl reductase and the enoyl reductase sites were identified by comparison with a known consensus sequence for the NADP- and FAD-binding dinucleotide folds. The active sites of all of the enzymes are primarily in hydrophobic regions of the protein. This study suggests that the genes for the functional domains of fatty acid synthase were originally separated, and these genes were connected to each other by using different connecting nucleotide sequences in different species. An alternative explanation for the differences in rat and chicken is a common ancestry and mutations in the joining regions during evolution.
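
    The two figures quoted in this abstract (percent identity, and percent matched when conservative substitutions are allowed) can be illustrated with a minimal sketch. It assumes the two sequences are already aligned with `-` as the gap character. The grouping of amino acids into conservative classes is a hypothetical choice made for illustration; the abstract does not say which substitutions were counted as conservative.

```python
# Hypothetical conservative-substitution classes (an assumption for this sketch,
# not the classes used in the study).
CONSERVATIVE_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"), set("DE"),
    set("NQ"), set("ST"), set("AG"),
]


def identity_and_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) over ungapped columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            identical += 1
            similar += 1
        elif any(a in group and b in group for group in CONSERVATIVE_GROUPS):
            similar += 1  # different residues from the same class
    if compared == 0:
        return 0.0, 0.0
    return 100.0 * identical / compared, 100.0 * similar / compared


if __name__ == "__main__":
    print(identity_and_similarity("ACDE", "ACEE"))
```

    Similarity is always at least as large as identity, which matches the ordering of the two percentages reported above.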

  6. [Partial sequence homology of FtsZ in phylogenetics analysis of lactic acid bacteria].

    PubMed

    Zhang, Bin; Dong, Xiu-zhu

    2005-10-01

    FtsZ is a structurally conserved protein, which is universal among the prokaryotes. It plays a key role in prokaryote cell division. A partial fragment of the ftsZ gene about 800 bp in length was amplified and sequenced, and a partial FtsZ protein phylogenetic tree for the lactic acid bacteria was constructed. By comparing the FtsZ phylogenetic tree with the 16S rDNA tree, it was shown that the two trees were similar in topology. Both trees revealed that Pediococcus spp. were closely related to the L. casei group of Lactobacillus spp., but less related to other lactic acid cocci such as Enterococcus and Streptococcus. The results also showed that the discriminative power of FtsZ was higher than that of 16S rDNA at both the inter-species and inter-genus levels, and that FtsZ could be a very useful tool in species identification of lactic acid bacteria. PMID:16342751

  7. Amino-terminal amino acid sequence of the major structural polypeptides of avian retroviruses: sequence homology between reticuloendotheliosis virus p30 and p30s of mammalian retroviruses.

    PubMed Central

    Hunter, E; Bhown, A S; Bennett, J C

    1978-01-01

    The major structural polypeptides, p30 of reticuloendotheliosis virus (REV) (strain T) and p27 of avian sarcoma virus B77, have been compared with regard to amino acid composition, NH2-terminal amino acid sequence, and immunological crossreactions. The amino acid composition of the two polypeptides is distinct, and a comparison of the first 30 NH2-terminal amino acids of REV p30 with that for the first 25 of B77 p27 yields only three homologous residues. In competition radioimmunoassays the polypeptides show no crossreactivity. A comparison of the amino acid composition and NH2-terminal amino acid sequence of REV p30 with those reported for several mammalian retrovirus p30s shows remarkable similarities. Both REV and mammalian p30s contain a large number of polar residues in their amino acid composition and show approximately 40% homology in the first 30 NH2-terminal amino acids. No crossreactivity could be observed, however, in competition radioimmunoassays between Rauscher murine leukemia virus p30 and that of REV. The observations reported here suggest a close evolutionary relationship between REV and the mammalian retroviruses. PMID:208072

  8. High-affinity homologous peptide nucleic acid probes for targeting a quadruplex-forming sequence from a MYC promoter element.

    PubMed

    Roy, Subhadeep; Tanious, Farial A; Wilson, W David; Ly, Danith H; Armitage, Bruce A

    2007-09-18

    Guanine-rich DNA and RNA sequences are known to fold into secondary structures known as G-quadruplexes. Recent biochemical evidence along with the discovery of an increasing number of sequences in functionally important regions of the genome capable of forming G-quadruplexes strongly indicates important biological roles for these structures. Thus, molecular probes that can selectively target quadruplex-forming sequences (QFSs) are envisioned as tools to delineate biological functions of quadruplexes as well as potential therapeutic agents. Guanine-rich peptide nucleic acids have been previously shown to hybridize to homologous DNA or RNA sequences forming PNA-DNA (or RNA) quadruplexes. For this paper we studied the hybridization of an eight-mer G-rich PNA to a quadruplex-forming sequence derived from the promoter region of the MYC proto-oncogene. UV melting analysis, fluorescence assays, and surface plasmon resonance experiments reveal that this PNA binds to the MYC QFS in a 2:1 stoichiometry and with an average binding constant Ka = (2.0 ± 0.2) × 10^8 M^-1 or Kd = 5.0 nM. In addition, experiments carried out with short DNA targets revealed a dependence of the affinity on the sequence of bases in the loop region of the DNA. A structural model for the hybrid quadruplex is proposed, and implications for gene targeting by G-rich PNAs are discussed. PMID:17718513
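
    The quoted Kd follows from the quoted Ka because the dissociation constant is the reciprocal of the association constant. The short check below reproduces that arithmetic. The uncertainty on Kd is not given in the abstract; the value computed here is a first-order propagation of the stated Ka error and is an assumption of this sketch. The function name is illustrative.

```python
def kd_from_ka(ka_per_molar, ka_error_per_molar=0.0):
    """Convert an association constant (M^-1) to a dissociation constant in nM.

    The error term uses first-order propagation: the relative error of Kd
    equals the relative error of Ka (an assumption of this sketch).
    """
    kd_nanomolar = 1e9 / ka_per_molar  # Kd = 1 / Ka, converted from M to nM
    kd_error = kd_nanomolar * (ka_error_per_molar / ka_per_molar)
    return kd_nanomolar, kd_error


if __name__ == "__main__":
    kd, err = kd_from_ka(2.0e8, 0.2e8)
    print(f"Kd = {kd:.1f} nM (propagated error about {err:.1f} nM)")
```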

  9. Establishing homologies in protein sequences

    NASA Technical Reports Server (NTRS)

    Dayhoff, M. O.; Barker, W. C.; Hunt, L. T.

    1983-01-01

    Computer-based statistical techniques used to determine homologies between proteins occurring in different species are reviewed. The technique is based on comparison of two protein sequences, either by relating all segments of a given length in one sequence to all segments of the second or by finding the best alignment of the two sequences. Approaches discussed include selection using printed tabulations, identification of very similar sequences, and computer searches of a database. The use of the SEARCH, RELATE, and ALIGN programs (Dayhoff, 1979) is explained; sample data are presented in graphs, diagrams, and tables and the construction of scoring matrices is considered.
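
    The first comparison strategy described here, relating all segments of a given length in one sequence to all segments of the second, can be sketched as follows. This is not the RELATE program. It is a simplified illustration that scores each segment pair by counting identical positions, whereas the reviewed techniques use scoring matrices. The function name and the `top` parameter are hypothetical.

```python
def best_segment_pairs(seq_a, seq_b, length, top=3):
    """Score every pair of length-`length` segments by number of identities.

    Returns up to `top` tuples (score, start_in_a, start_in_b), best first.
    Start positions are 0-based.
    """
    scored = []
    for i in range(len(seq_a) - length + 1):
        segment_a = seq_a[i:i + length]
        for j in range(len(seq_b) - length + 1):
            segment_b = seq_b[j:j + length]
            score = sum(x == y for x, y in zip(segment_a, segment_b))
            scored.append((score, i, j))
    # Highest score first; ties broken by position for a stable result.
    scored.sort(key=lambda item: (-item[0], item[1], item[2]))
    return scored[:top]


if __name__ == "__main__":
    print(best_segment_pairs("MKTAYIAK", "GGKTAYGG", length=4, top=1))
```

    The nested loops make the cost proportional to the product of the two sequence lengths times the segment length, which is acceptable for a pair of proteins but not for a database search.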

  10. Complete amino acid sequence of human plasma Zn-α2-glycoprotein and its homology to histocompatibility antigens

    SciTech Connect

    Araki, T.; Gejyo, F.; Takagaki, K.; Haupt, H.; Schwick, H.G.; Buergi, W.; Marti, T.; Schaller, J.; Rickli, E.; Brossmer, R.

    1988-02-01

    In the present study the complete amino acid sequence of human plasma Zn-α2-glycoprotein was determined. This protein, whose biological function is unknown, consists of a single polypeptide chain of 276 amino acid residues including 8 tryptophan residues and has a pyroglutamyl residue at the amino terminus. The location of the two disulfide bonds in the polypeptide chain was also established. The three glycans, whose structure was elucidated with the aid of 500 MHz 1H NMR spectroscopy, were sialylated N-biantennas. The molecular weight calculated from the polypeptide and carbohydrate structure is 38,478, which is close to the reported value of approximately 41,000 based on physicochemical measurements. The predicted secondary structure appeared to comprise 23% α-helix, 27% β-sheet, and 22% β-turns. The three N-glycans were found to be located in β-turn regions. An unexpected finding was made by computer analysis of the sequence data; this revealed that Zn-α2-glycoprotein is closely related to antigens of the major histocompatibility complex in amino acid sequence and in domain structure. There was an unusually high degree of sequence homology with the α chains of class I histocompatibility antigens. Moreover, this plasma protein was shown to be a member of the immunoglobulin gene superfamily. Zn-α2-glycoprotein appears to be a truncated secretory major histocompatibility complex-related molecule, and it may have a role in the expression of the immune response.

  11. Amino acid sequence homology between Piv, an essential protein in site-specific DNA inversion in Moraxella lacunata, and transposases of an unusual family of insertion elements.

    PubMed Central

    Lenich, A G; Glasgow, A C

    1994-01-01

    Deletion analysis of the subcloned DNA inversion region of Moraxella lacunata indicates that Piv is the only M. lacunata-encoded factor required for site-specific inversion of the tfpQ/tfpI pilin segment. The predicted amino acid sequence of Piv shows significant homology solely with the transposases/integrases of a family of insertion sequence elements, suggesting that Piv is a novel site-specific recombinase. PMID:8021196

  12. Fold homology detection using sequence fragment composition profiles of proteins.

    PubMed

    Solis, Armando D; Rackovsky, Shalom R

    2010-10-01

    The effectiveness of sequence alignment in detecting structural homology among protein sequences decreases markedly when pairwise sequence identity is low (the so-called "twilight zone" problem of sequence alignment). Alternative sequence comparison strategies able to detect structural kinship among highly divergent sequences are necessary to address this need. Among them are alignment-free methods, which use global sequence properties (such as amino acid composition) to identify structural homology in a rapid and straightforward way. We explore the viability of using tetramer sequence fragment composition profiles in finding structural relationships that lie undetected by traditional alignment. We establish a strategy to recast any given protein sequence into a tetramer sequence fragment composition profile, using a series of amino acid clustering steps that have been optimized for mutual information. Our method has the effect of compressing the set of 160,000 unique tetramers (if using the 20-letter amino acid alphabet) into a more tractable number of reduced tetramers (approximately 15-30), so that a meaningful tetramer composition profile can be constructed. We test remote homology detection at the topology and fold superfamily levels using a comprehensive set of fold homologs, culled from the CATH database that share low pairwise sequence similarity. Using the receiver-operating characteristic measure, we demonstrate potentially significant improvement in using information-optimized reduced tetramer composition, over methods relying only on the raw amino acid composition or on traditional sequence alignment, in homology detection at or below the "twilight zone". PMID:20635424
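
    The idea of recasting a sequence as a reduced tetramer composition profile can be sketched as below. The 20-letter alphabet gives 20^4 = 160,000 possible tetramers, as stated in the abstract. The two-class hydrophobic/polar alphabet used here is a hypothetical stand-in chosen for brevity: the study derives its reduced tetramers through clustering steps optimized for mutual information, which this sketch does not reproduce.

```python
from collections import Counter
from itertools import product

# Hypothetical two-letter reduced alphabet (H = hydrophobic, P = polar).
# This is an illustrative assumption, not the clustering used in the study.
REDUCED = {aa: "H" for aa in "AVLIMFWC"}
REDUCED.update({aa: "P" for aa in "GSTYNQDEKRHP"})


def tetramer_profile(sequence):
    """Return the relative frequency of each reduced tetramer in `sequence`."""
    reduced = "".join(REDUCED[aa] for aa in sequence.upper() if aa in REDUCED)
    counts = Counter(reduced[i:i + 4] for i in range(len(reduced) - 3))
    total = sum(counts.values())
    all_tetramers = ["".join(p) for p in product("HP", repeat=4)]
    return {t: (counts[t] / total if total else 0.0) for t in all_tetramers}


if __name__ == "__main__":
    print(20 ** 4)  # number of unique tetramers over the 20-letter alphabet
    profile = tetramer_profile("AVLIGSTY")
    print({t: f for t, f in profile.items() if f > 0})
```

    Two profiles built this way are fixed-length vectors, so they can be compared with any vector distance without aligning the underlying sequences.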

  13. Gene Sequence Homology of Chemokines Across Species

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The abundance of expressed gene and protein sequences available in the biological information databases facilitates comparison of protein homologies. A high degree of sequence similarity typically implies homology regarding structure and function and may provide clues to antibody cross-reactivities...

  14. GENE SEQUENCE HOMOLOGY OF CHEMOKINES ACROSS SPECIES

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The abundance of expressed gene and protein sequences available in the biological information databases facilitates comparison of protein homologies. A high degree of sequence similarity typically implies homology regarding structure and function and may provide clues to antibody cross-react...

  15. Domain structures and molecular evolution of class I and class II major histocompatibility gene complex (MHC) products deduced from amino acid and nucleotide sequence homologies

    NASA Astrophysics Data System (ADS)

    Ohnishi, Koji

    1984-12-01

    Domain structures of class I and class II MHC products were analyzed from the viewpoint of amino acid and nucleotide sequence homologies. Alignment statistics revealed that class I (transplantation) antigen H chains consist of four mutually homologous domains, and that class II (HLA-DR) antigen β and α chains are both composed of three mutually homologous ones. The N-terminal three and two domains of class I and class II (both β and α) gene products, respectively, all of which are ~90 residues long, were concluded to be homologous to β2-microglobulin (β2M). The membrane-embedded C-terminal shorter domains of these MHC products were also found to be homologous to one another and to the third domain of class I H chains. Class I H chains were found to be more closely related to class II α chains than to class II β chains. Based on these findings, an exon duplication history from a common ancestral gene encoding a β2M-like primordial protein of one-domain length up to the contemporary MHC products was proposed.

  16. DNA Sequence Alignment during Homologous Recombination.

    PubMed

    Greene, Eric C

    2016-05-27

    Homologous recombination allows for the regulated exchange of genetic information between two different DNA molecules of identical or nearly identical sequence composition, and is a major pathway for the repair of double-stranded DNA breaks. A key facet of homologous recombination is the ability of recombination proteins to perfectly align the damaged DNA with homologous sequence located elsewhere in the genome. This reaction is referred to as the homology search and is akin to the target searches conducted by many different DNA-binding proteins. Here I briefly highlight early investigations into the homology search mechanism, and then describe more recent research. Based on these studies, I summarize a model that includes a combination of intersegmental transfer, short-distance one-dimensional sliding, and length-specific microhomology recognition to efficiently align DNA sequences during the homology search. I also suggest some future directions to help further our understanding of the homology search. Where appropriate, I direct the reader to other recent reviews describing various issues related to homologous recombination. PMID:27129270

  17. Simplified computer programs for search of homology within nucleotide sequences.

    PubMed Central

    Kröger, M; Kröger-Block, A

    1984-01-01

    Four new computer programs for search of homology within nucleotide sequences are presented. The main scope of the program design is flexibility, independence of sequence length and the capability to be used by any molecular biologist without any prior computer experience. The programs offer a linear search, a search for maximal identity, an alignment along a given sequence and a search based on homology within the amino acid coding capacity of nucleotide sequences. The language is Fortran V. Copies are available on request. PMID:6546417

  18. Towards Scalable Optimal Sequence Homology Detection

    SciTech Connect

    Daily, Jeffrey A.; Krishnamoorthy, Sriram; Kalyanaraman, Anantharaman

    2012-12-26

    The field of bioinformatics and computational biology is experiencing a data revolution: experimental techniques to procure data have increased in throughput, improved in accuracy and reduced in costs. This has spurred an array of high profile sequencing and data generation projects. While the data repositories represent untapped reservoirs of rich information critical for scientific breakthroughs, the analytical software tools that are needed to analyze large volumes of such sequence data have significantly lagged behind in their capacity to scale. In this paper, we address homology detection, which is a fundamental problem in large-scale sequence analysis with numerous applications. We present a scalable framework to conduct large-scale optimal homology detection on massively parallel supercomputing platforms. Our approach employs distributed memory work stealing to effectively parallelize optimal pairwise alignment computation tasks. Results on 120,000 cores of the Hopper Cray XE6 supercomputer demonstrate strong scaling and up to 2.42 × 10^7 optimal pairwise sequence alignments computed per second (PSAPS), the highest reported in the literature.
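
    The unit of work that such a framework distributes is one optimal pairwise alignment. The sketch below shows a single-process version of that task, assuming a Smith-Waterman-style local alignment score with a linear gap penalty. The abstract does not name the alignment algorithm or its scoring parameters, so both are assumptions, and the work-stealing scheduler itself is not reproduced here.

```python
from itertools import combinations


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score by dynamic programming (two rows of memory).

    Scoring values are illustrative defaults, not those of the paper.
    """
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            current[j] = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            best = max(best, current[j])
        previous = current
    return best


def all_pairs_scores(sequences):
    """Score every unordered pair; each pair is an independent task."""
    return {
        (i, j): local_alignment_score(sequences[i], sequences[j])
        for i, j in combinations(range(len(sequences)), 2)
    }


if __name__ == "__main__":
    print(all_pairs_scores(["ACGT", "ACGT", "TTTT"]))
```

    Because each pair is scored independently and the cost of a pair depends on the two sequence lengths, the tasks are uneven in size, which is the kind of imbalance that dynamic load balancing such as work stealing is meant to absorb.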

  19. Bipartite mRNA for chicken alpha-fibrinogen potentially encodes an amino acid sequence homologous to beta- and gamma-fibrinogens.

    PubMed Central

    Weissbach, L; Grieninger, G

    1990-01-01

    Overlapping cDNAs derived from the chicken alpha-fibrinogen mRNA have been sequenced, beginning from within the coding region for the signal peptide of this subunit and terminating within the poly(A) extension. The predicted size of chicken alpha-fibrinogen is 54,187 daltons, which is the smallest of any alpha chain reported; the oligopeptide repeats that characterize the central regions of the other alpha subunits were conspicuously absent. A further unexpected finding was the presence on the mRNA of a separate, long open reading frame (752 nucleotides), beginning 312 nucleotides downstream from the alpha-fibrinogen coding sequence and containing intron-like features near its 5' end. The protein sequence predicted from this second open reading frame lacks an initiating methionine but is homologous to the C-terminal regions of all known beta- and gamma-fibrinogens as well as the C termini of two nonfibrinogen proteins: cytotactin (tenascin), an extracellular matrix protein, and pT49, a putative protein specific to cytotoxic T cells. The intron-like features of the second open reading frame immediately precede the region of common homology, and the beginnings of the corresponding homologous segments in the beta- and gamma-fibrinogen sequences are marked by aligned intron positions. Based on these findings, it is proposed that fibrinogen gene evolution included a fusion of two distinct ancestral genes. PMID:2367530

  20. Assessment of sequence homology and cross-reactivity

    SciTech Connect

    Aalberse, Rob C.

    2005-09-01

    Three aspects of allergenicity assessment are discussed: IgE immunogenicity, IgE cross-reactivity and T cell cross-reactivity, all with emphasis on in-silico predictability: from amino acid sequence via 3D structure to allergenicity. (1) IgE immunogenicity depends to an overwhelming degree on factors other than the protein itself: the context and history of the protein by the time it reaches the immune system. Without specification of these two factors very few foreign proteins can be claimed to be absolutely non-allergenic. Any antigen may be allergenic, particularly if it avoids activation of TH2-suppressive mechanisms (CD8 cells, TH1 cells, other regulatory T cells and regulatory cytokines). (2) IgE cross-reactivity can be much more reliably assessed by a combination of in-silico homology searches and in vitro IgE antibody assays. The in-silico homology search is unlikely to miss potential cross-reactivity with sequenced allergens. So far, no biologically relevant cross-reactivity at the antibody level has been demonstrated between proteins without easily demonstrable homology. (3) T cell cross-reactivity is much more difficult to predict than B cell cross-reactivity, and its effects are more diverse. Yet, pre-existing cross-reactive T cell activity is likely to influence the outcome not only of the immune response, but also of the effector phase of the allergic reaction.

  1. Sequence homology between RNAs encoding rat alpha-fetoprotein and rat serum albumin.

    PubMed Central

    Jagodzinski, L L; Sargent, T D; Yang, M; Glackin, C; Bonner, J

    1981-01-01

    We have determined the sequences of the recombinant DNA inserts of three bacterial plasmid cDNA clones containing most of the rat alpha-fetoprotein mRNA. The resultant nucleotide sequence of alpha-fetoprotein was exhaustively compared to the nucleotide sequence of the mRNA encoding rat serum albumin. These two mRNAs have extensive homology (50%) throughout and the same intron locations. The amino acid sequence of rat alpha-fetoprotein has been deduced from the nucleotide sequence, and its comparison to rat serum albumin's amino acid sequence reveals a 34% homology. The regularly spaced positions of the cysteines found in serum albumin are conserved in rat alpha-fetoprotein, indicating that these two proteins may have a similar secondary folding structure. These homologies indicate that alpha-fetoprotein and serum albumin were derived by duplication of a common ancestral gene and constitute a gene family. PMID:6167988

  2. Aspartyl-tRNA synthetase from Escherichia coli: cloning and characterisation of the gene, homologies of its translated amino acid sequence with asparaginyl- and lysyl-tRNA synthetases.

    PubMed Central

    Eriani, G; Dirheimer, G; Gangloff, J

    1990-01-01

    By screening an Escherichia coli plasmid library using antibodies against aspartyl-tRNA synthetase (AspRS), several clones were obtained containing aspS, the gene coding for AspRS. We report here the nucleotide sequence of aspS and the corresponding primary structure of the aspartyl-tRNA synthetase, a protein of 590 amino acid residues with an Mr of 65,913, a value in close agreement with that observed for the purified protein. Primer extension analysis of the aspS mRNA using reverse transcriptase located its 5'-end at 94 nucleotides upstream of the translation initiation AUG; nuclease S1 analysis located the 3'-end at 126 nucleotides downstream of the stop codon UGA. Comparison of the DNA-derived protein sequence with known aminoacyl-tRNA sequences revealed important homologies with asparaginyl- and lysyl-tRNA synthetases from E. coli; more than 25% of their amino acid residues are identical, the homologies being distributed preferentially in the first part and the carboxy-terminal end of the molecule. Mutagenesis directed towards a consensus tetrapeptide (Gly-Leu-Asp-Arg) and the carboxy-terminal end showed that both domains could be implicated in catalysis as well as in ATP binding. PMID:2129559

  3. Why do Sequence Signatures Predict Enzyme Mechanism? Homology versus Chemistry

    PubMed Central

    Beattie, Kirsten E.; De Ferrari, Luna; Mitchell, John B. O.

    2015-01-01

    First, we identify InterPro sequence signatures representing evolutionary relatedness and, second, signatures identifying specific chemical machinery. Thus, we predict the chemical mechanisms of enzyme-catalyzed reactions from catalytic and non-catalytic subsets of InterPro signatures. We first scanned our 249 sequences using InterProScan and then used the MACiE database to identify those amino acid residues that are important for catalysis. The sequences were mutated in silico to replace these catalytic residues with glycine and then again scanned using InterProScan. Those signature matches from the original scan that disappeared on mutation were called catalytic. Mechanism was predicted using all signatures, only the 78 “catalytic” signatures, or only the 519 “non-catalytic” signatures. The non-catalytic signatures gave indistinguishable results from those for the whole feature set, with precision of 0.991 and sensitivity of 0.970. The catalytic signatures alone gave less impressive predictivity, with precision and sensitivity of 0.791 and 0.735, respectively. These results show that our successful prediction of enzyme mechanism is mostly by homology rather than by identifying catalytic machinery. PMID:26740739
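
    The in silico mutation step and the resulting split into catalytic and non-catalytic signatures can be sketched as below. The scanning itself is not modelled: the sketch simply takes the sets of signature identifiers matched before and after mutation as inputs. The identifiers in the example (`SIG_A`, `SIG_B`) are placeholders, not real InterPro accessions, and the function names and the 1-based position convention are assumptions.

```python
def mutate_to_glycine(sequence, catalytic_positions):
    """Replace residues at the given 1-based positions with glycine (G)."""
    residues = list(sequence)
    for position in catalytic_positions:
        if not 1 <= position <= len(residues):
            raise IndexError(f"position {position} is outside the sequence")
        residues[position - 1] = "G"
    return "".join(residues)


def split_signatures(matches_before, matches_after):
    """Split signature matches into (catalytic, non_catalytic) sets.

    A signature is called catalytic if it matched the original sequence
    but no longer matches once the catalytic residues are mutated.
    """
    before, after = set(matches_before), set(matches_after)
    return before - after, before & after


if __name__ == "__main__":
    print(mutate_to_glycine("MKSHD", [3, 5]))
    # Placeholder signature identifiers, for illustration only.
    print(split_signatures({"SIG_A", "SIG_B"}, {"SIG_A"}))
```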

  4. FAB overlapping: a strategy for sequencing homologous proteins

    NASA Astrophysics Data System (ADS)

    Ferranti, P.; Malorni, A.; Marino, G.; Pucci, P.; di Luccia, A.; Ferrara, L.

    1991-12-01

    Extensive similarity has been shown to exist between the primary structures of closely related proteins from different species, the only differences being restricted to a few amino acid variations. A new mass spectrometric procedure, which has been called FAB-overlapping, has been developed for sequencing highly homologous proteins based on the detection of these small differences as compared with a known protein used as a reference. Several complementary peptide maps are constructed using fast atom bombardment mass spectrometry (FAB-MS) analysis of different proteolytic digests of the unknown protein and the mass values are related to those expected on the basis of the sequence of the reference protein. The mass signals exhibiting unusual mass values identify those regions where variations have taken place; fine location of the mutations can be obtained by coupling simple protein chemistry methodologies with FAB-MS. Using the FAB-overlapping procedure, it was possible to determine the sequence of α1, α3 and β globins from water buffalo (Bubalus bubalis) hemoglobins (phenotype AA). Two amino acid substitutions were detected in the buffalo β chain (Lys16 → His and Asn118 → His), whereas the α1 and α3 chains were found to contain four amino acid replacements, three of which were identical (Glu23 → Asp, Glu71 → Gly, Phe117 → Cys), and the insertion of an alanine residue in position 124. The only differences between α1 and α3 globins were identified in the C-terminal region; α1 contains a Phe residue at position 130 whereas α3 shows serine at position 132.
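
    The core comparison in this procedure, relating an observed peptide mass to the mass expected from the reference sequence, can be illustrated with the substitutions named in the abstract. The residue masses below are standard monoisotopic values as commonly tabulated; they are not taken from the abstract, and the original measurements may have used average masses. The function names and the tolerance are assumptions of this sketch.

```python
# Standard monoisotopic residue masses in daltons (commonly tabulated values;
# an assumption of this sketch, not data from the abstract).
RESIDUE_MASS = {
    "A": 71.03711, "C": 103.00919, "D": 115.02694, "E": 129.04259,
    "F": 147.06841, "G": 57.02146, "H": 137.05891, "K": 128.09496,
    "N": 114.04293,
}


def substitution_mass_shift(old, new):
    """Mass change of a peptide when residue `old` is replaced by `new`."""
    return RESIDUE_MASS[new] - RESIDUE_MASS[old]


def explain_shift(observed_shift, tolerance=0.01):
    """List single substitutions (old, new) whose shift matches the observation."""
    return [
        (old, new)
        for old in RESIDUE_MASS
        for new in RESIDUE_MASS
        if old != new
        and abs(substitution_mass_shift(old, new) - observed_shift) <= tolerance
    ]


if __name__ == "__main__":
    for old, new in [("K", "H"), ("N", "H"), ("E", "D"), ("E", "G"), ("F", "C")]:
        print(f"{old} -> {new}: {substitution_mass_shift(old, new):+.4f} Da")
```

    A mass shift alone locates the peptide that carries a variation and narrows down the candidate substitutions; the abstract notes that fine location of the mutation requires additional protein chemistry.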

  5. Biochemical characterization of NfsA, the Escherichia coli major nitroreductase exhibiting a high amino acid sequence homology to Frp, a Vibrio harveyi flavin oxidoreductase.

    PubMed Central

    Zenno, S; Koike, H; Kumar, A N; Jayaraman, R; Tanokura, M; Saigo, K

    1996-01-01

    We identified the nfsA gene, encoding the major oxygen-insensitive nitroreductase in Escherichia coli, and determined its position on the E. coli map to be 19 min. We also purified its gene product, NfsA, to homogeneity. It was suggested that NfsA is a nonglobular protein with a molecular weight of 26,799 and is associated tightly with a flavin mononucleotide. Its amino acid sequence is highly similar to that of Frp, a flavin oxidoreductase from Vibrio harveyi (B. Lei, M. Liu, S. Huang, and S.-C. Tu, J. Bacteriol. 176:3552-3558, 1994), an observation supporting the notion that E. coli nitroreductase and luminescent-bacterium flavin reductase families are intimately related in evolution. Although no appreciable sequence similarity was detected between two E. coli nitroreductases, NfsA and NfsB, NfsA exhibited a low level of the flavin reductase activity and a broad electron acceptor specificity similar to those of NfsB. NfsA reduced nitrofurazone by a ping-pong Bi-Bi mechanism possibly to generate a two-electron transfer product. PMID:8755878

  6. Homology and the optimization of DNA sequence data

    NASA Technical Reports Server (NTRS)

    Wheeler, W.

    2001-01-01

    Three methods of nucleotide character analysis are discussed. Their implications for molecular sequence homology and phylogenetic analysis are compared. The criterion of inter-data set congruence, both character based and topological, is applied to two data sets to elucidate and potentially discriminate among these parsimony-based ideas. © 2001 The Willi Hennig Society.

  7. DNA sequence alignment by microhomology sampling during homologous recombination

    PubMed Central

    Qi, Zhi; Redding, Sy; Lee, Ja Yil; Gibb, Bryan; Kwon, YoungHo; Niu, Hengyao; Gaines, William A.; Sung, Patrick

    2015-01-01

    Homologous recombination (HR) mediates the exchange of genetic information between sister or homologous chromatids. During HR, members of the RecA/Rad51 family of recombinases must somehow search through vast quantities of DNA sequence to align and pair ssDNA with a homologous dsDNA template. Here we use single-molecule imaging to visualize Rad51 as it aligns and pairs homologous DNA sequences in real-time. We show that Rad51 uses a length-based recognition mechanism while interrogating dsDNA, enabling robust kinetic selection of 8-nucleotide (nt) tracts of microhomology, which kinetically confines the search to sites with a high probability of being a homologous target. Successful pairing with a 9th nucleotide coincides with an additional reduction in binding free energy and subsequent strand exchange occurs in precise 3-nt steps, reflecting the base triplet organization of the presynaptic complex. These findings provide crucial new insights into the physical and evolutionary underpinnings of DNA recombination. PMID:25684365

  8. Human DNA sequence homologous to the transforming gene (mos) of Moloney murine sarcoma virus.

    PubMed Central

    Watson, R; Oskarsson, M; Vande Woude, G F

    1982-01-01

    We describe the molecular cloning of a 9-kilobase-pair BamHI fragment from human placental DNA containing a sequence homologous to the transforming gene (v-mos) of Moloney murine sarcoma virus. The DNA sequence of the homologous region of human DNA (termed humos) was resolved and compared to that of the mouse cellular homolog of v-mos (termed mumos) [Van Beveren, C., van Straaten, F., Galleshaw, J.A. & Verma, I.M. (1981) Cell 27, 97-108]. The humos gene contained an open reading frame of 346 codons that was aligned with the equivalent mumos DNA sequence by the introduction of two gaps of 15 and 3 bases into the mumos DNA and a single gap of 9 bases into the humos DNA. The aligned coding sequences were 77% homologous and terminated at equivalent opal codons. The humos open reading frame initiated at an ATG found internally in the mumos coding sequence. The polypeptides predicted from the DNA sequence to be encoded by humos and mumos also were found to be extensively homologous, and 253 of 337 amino acids were shared between the two polypeptides. The first five NH2-terminal and last two COOH-terminal amino acids of the humos gene product were in common with those of mumos. In addition, near the middle of the polypeptide chains, four regions ranging from 19 to 26 consecutive amino acids were conserved. However, we have not been able to transform mouse cells with transfected humos DNA fragments or with hybrid DNA recombinants containing humos and retroviral long terminal repeat (LTR) sequences. PMID:6287464

  9. Sequence homologies in the protamine gene family of rainbow trout.

    PubMed Central

    Aiken, J M; McKenzie, D; Zhao, H Z; States, J C; Dixon, G H

    1983-01-01

    We have sequenced five different rainbow trout protamine genes plus their flanking regions. The genes are not clustered and do not contain intervening sequences. There is an extremely high degree of sequence conservation in the coding and 3' untranslated regions of the gene. Downstream sequences exhibit little homology, though conserved regions are found 250 base pairs 3' to the gene. There are four regions upstream of the gene that are highly conserved in the six clones, including the canonical Goldberg-Hogness box, which is 45 base pairs 5' to the coding region. A second homologous region is found 90 bases upstream. Although in the same approximate location as the CAAT box found upstream of other genes, it does not contain the canonical CAAT sequence. Further upstream of the protamine genes at -115 there is an A-T rich sequence, while a 25 base pair conserved sequence is located 150 bases upstream. In addition, we report the presence of a potential Z-DNA region of predominantly A-C repeats approximately one kilobase downstream of one of the genes. PMID:6308564

  10. Optimised fine and coarse parallelism for sequence homology search.

    PubMed

    Meng, Xiandong; Chaudhary, Vipin

    2006-01-01

    New biological experimental techniques are continuing to generate large amounts of data using DNA, RNA, human genome and protein sequences. The quantity and quality of data from these experiments make analyses of their results very time-consuming, expensive and impractical. Searching DNA and protein databases using sequence comparison algorithms has become one of the most powerful techniques to better understand the functionality of a particular DNA, RNA, genome, or protein sequence. This paper presents a technique to effectively combine fine and coarse grain parallelism using general-purpose processors for sequence homology database searches. The results show that the classic Smith-Waterman sequence alignment algorithm achieves super linear performance with proper scheduling and multi-level parallel computing at no additional cost. PMID:18048183
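
    For orientation, the Smith-Waterman recurrence named in this abstract can be sketched in a few lines. This is a plain score-only version with a linear gap penalty, not the paper's optimised implementation; the scoring values, the function names, and the `map_fn` hook are illustrative assumptions. Coarse-grained parallelism is represented only by letting the caller supply a parallel `map`, for example `Executor.map` from `concurrent.futures`.

```python
from functools import partial


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between a and b (linear gap penalty)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def search_database(query, database, map_fn=map):
    """Score the query against every database entry, best hit first.

    map_fn defaults to the built-in map; passing a pool's map method
    spreads the independent comparisons across workers.
    """
    names = list(database)
    subjects = [database[name] for name in names]
    scores = map_fn(partial(smith_waterman_score, query), subjects)
    return sorted(zip(names, scores), key=lambda hit: hit[1], reverse=True)


# Toy database (hypothetical sequences).
database = {"seq1": "TTACGTTT", "seq2": "CCCC"}
hits = search_database("GGACGTGG", database)
print(hits)
```

    Each database comparison is independent, which is what makes the coarse-grained level easy to distribute. The fine-grained level described in the abstract would sit inside the inner loop and is not shown here.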

  11. Nucleotide sequence analysis of a cloned DNA fragment from human cells reveals homology to retrotransposons.

    PubMed Central

    Flügel, R M; Maurer, B; Bannert, H; Rethwilm, A; Schnitzler, P; Darai, G

    1987-01-01

    During molecular cloning of proviral DNA of human spumaretrovirus, various recombinant clones were established and analyzed. Blot hybridization revealed that one of the recombinant plasmids had the characteristic features of a member of the long interspersed repetitive sequences family. The DNA element was analyzed by restriction mapping and nucleotide sequencing. It showed a high degree of amino acid sequence homology of 54.3% when compared with the 5'-terminal part of the pol gene product of the murine retrotransposon L1Md. The 3' region of the cloned DNA element encodes proteins with an even higher degree of homology of 67.4% in comparison to the corresponding parts of a member of the primate KpnI sequence family. PMID:3031462

  12. An expert system for processing sequence homology data.

    PubMed

    Sonnhammer, E L; Durbin, R

    1994-01-01

    When confronted with the task of finding homology to large numbers of sequences, database searching tools such as Blast and Fasta generate prohibitively large amounts of information. An automatic way of making most of the decisions a trained sequence analyst would make was developed by means of a rule-based expert system combined with an algorithm to avoid non-informative biased residue composition matches. The results found relevant by the system are presented in a very concise and clear way, so that the homology can be assessed with minimum effort. The expert system, HSPcrunch, was implemented to process the output of the programs in the BLAST suite. HSPcrunch embodies rules on detecting distant similarities when pairs of weak matches are consistent with a larger gapped alignment, i.e. when Blast has broken a longer gapped alignment up into smaller ungapped ones. This way, more distant similarities can be detected with no or little side-effects of more spurious matches. The rules for how small the gaps must be to be considered significant have been derived empirically. Currently a set of rules is used that operates on two different scoring levels, one for very weak matches that have very small gaps and one for medium weak matches that have slightly larger gaps. This set of rules proved to be robust for most cases and gives high fidelity separation between real homologies and spurious matches. One of the most important rules for reducing the amount of output is to limit the number of overlapping matches to the same region of the query sequence.(ABSTRACT TRUNCATED AT 250 WORDS) PMID:7584413

  13. An expert system for processing sequence homology data

    SciTech Connect

    Sonnhammer, E.L.L.; Durbin, R.

    1994-12-31

    When confronted with the task of finding homology to large numbers of sequences, database searching tools such as Blast and Fasta generate prohibitively large amounts of information. An automatic way of making most of the decisions a trained sequence analyst would make was developed by means of a rule-based expert system combined with an algorithm to avoid non-informative biased residue composition matches. The results found relevant by the system are presented in a very concise and clear way, so that the homology can be assessed with minimum effort. The expert system, HSPcrunch, was implemented to process the output of the programs in the BLAST suite. HSPcrunch embodies rules on detecting distant similarities when pairs of weak matches are consistent with a larger gapped alignment, i.e. when Blast has broken a longer gapped alignment up into smaller ungapped ones. This way, more distant similarities can be detected with no or little side-effects of more spurious matches. The rules for how small the gaps must be to be considered significant have been derived empirically. Currently a set of rules is used that operates on two different scoring levels, one for very weak matches that have very small gaps and one for medium weak matches that have slightly larger gaps. This set of rules proved to be robust for most cases and gives high fidelity separation between real homologies and spurious matches. One of the most important rules for reducing the amount of output is to limit the number of overlapping matches to the same region of the query sequence. This way, a region with many high-scoring matches will not dominate the output and hide weaker but relevant matches to other regions. This is particularly valuable for multi-domain queries.
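
    The overlap-limiting rule described at the end of this abstract can be sketched as a greedy filter. This is an illustration only and not HSPcrunch's actual rule set: the `HSP` record, the overlap test, and the `max_per_region` parameter are all hypothetical, and the abstract does not say how the real program counts overlaps.

```python
from typing import NamedTuple


class HSP(NamedTuple):
    """A high-scoring segment pair, reduced to its query coordinates."""
    subject: str
    q_start: int  # first query position covered (inclusive)
    q_end: int    # last query position covered (inclusive)
    score: float


def overlaps(a, b):
    """True when two matches share at least one query position."""
    return a.q_start <= b.q_end and b.q_start <= a.q_end


def limit_overlapping(hsps, max_per_region=2):
    """Keep the best matches, dropping any that pile onto a crowded region.

    Matches are visited from highest to lowest score; a match is kept only
    if fewer than max_per_region already-kept matches overlap it.
    """
    kept = []
    for hsp in sorted(hsps, key=lambda h: h.score, reverse=True):
        if sum(overlaps(hsp, k) for k in kept) < max_per_region:
            kept.append(hsp)
    return kept


# Toy matches (hypothetical): three crowd one domain, one hits another.
matches = [
    HSP("A", 1, 100, 500.0),
    HSP("B", 10, 90, 400.0),
    HSP("C", 20, 110, 300.0),
    HSP("D", 200, 260, 40.0),
]
kept = limit_overlapping(matches)
print([h.subject for h in kept])
```

    With a limit of two, the third match to the crowded region is dropped while the weak match to the second region survives, which is the behaviour the abstract describes for multi-domain queries.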

  14. Homolog detection using global sequence properties suggests an alternate view of structural encoding in protein sequences

    PubMed Central

    Scheraga, Harold A.; Rackovsky, S.

    2014-01-01

    We show that a Fourier-based sequence distance function is able to identify structural homologs of target sequences with high accuracy. It is shown that Fourier distances correlate very strongly with independently determined structural distances between molecules, a property of the method that is not attainable using conventional representations. It is further shown that the ability of the Fourier approach to identify protein folds is statistically far in excess of random expectation. It is then shown that, in actual searches for structural homologs of selected target sequences, the Fourier approach gives excellent results. On the basis of these results, we suggest that the global information detected by the Fourier representation is an essential feature of structure encoding in protein sequences and a key to structural homology detection. PMID:24706836

  15. Identification, localization, and sequencing of fetal bovine VASA homolog.

    PubMed

    Bartholomew, Rachel A; Parks, John E

    2007-10-01

    The vasa gene, first described in Drosophila, is purported to be important in germ cell development. Vasa is present across several invertebrate and vertebrate taxa, including frogs, fish, chickens, and humans. Vasa, a DEAD (aspartate-glutamate-alanine-aspartate) box protein shown to function as an RNA helicase in vitro, has not been investigated previously in fetal stage cattle. Total RNA was extracted from bovine fetal gonads obtained at 35-55 days, 55-80 days, and 80-120 days of gestation to amplify a 296 bp reverse transcription polymerase chain reaction (RT-PCR) product using primers for human vasa. The complete coding sequence of bovine vasa was cloned with 5' and 3' rapid amplification of cDNA ends polymerase chain reaction (RACE-PCR) and subsequently identified as bovine vasa homolog (BVH). Northern blot analysis revealed that among the tissues examined (gonad, liver, heart, brain, and femur), the vasa gene was expressed in the gonad. This localization, the conserved pattern of gene expression, and the gene sequence suggest that BVH plays a role in bovine germ cell development as proposed for other mammalian species. PMID:17150314

  16. On the use of sequence homologies to predict protein structure: identical pentapeptides can have completely different conformations.

    PubMed Central

    Kabsch, W; Sander, C

    1984-01-01

    The search for amino acid sequence homologies can be a powerful tool for predicting protein structure. Discovered sequence homologies are currently used in predicting the function of oncogene proteins. To sharpen this tool, we investigated the structural significance of short sequence homologies by searching proteins of known three-dimensional structure for subsequence identities. In 62 proteins with 10,000 residues, we found that the longest isolated homologies between unrelated proteins are five residues long. In 6 (out of 25) cases we saw surprising structural adaptability: the same five residues are part of an alpha-helix in one protein and part of a beta-strand in another protein. These examples show quantitatively that pentapeptide structure within a protein is strongly dependent on sequence context, a fact essentially ignored in most protein structure prediction methods: just considering the local sequence of five residues is not sufficient to predict correctly the local conformation (secondary structure). Cooperativity of length six or longer must be taken into account. Also, we are warned that in the growing practice of comparing a new protein sequence with a data base of known sequences, finding an identical pentapeptide sequence between two proteins is not a significant indication of structural similarity or of evolutionary kinship. PMID:6422466
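
    The kind of subsequence-identity search this abstract describes can be sketched as a k-mer lookup. This is a minimal illustration and not the authors' program: the function name and the toy sequences are hypothetical, and the sketch reports every shared pentapeptide without checking whether the match is "isolated" in the abstract's sense.

```python
def shared_kmers(seq_a, seq_b, k=5):
    """Return (kmer, pos_in_a, pos_in_b) for every identical k-residue run.

    Positions are 0-based start indices.
    """
    index = {}
    for i in range(len(seq_a) - k + 1):
        index.setdefault(seq_a[i:i + k], []).append(i)

    hits = []
    for j in range(len(seq_b) - k + 1):
        kmer = seq_b[j:j + k]
        for i in index.get(kmer, []):
            hits.append((kmer, i, j))
    return hits


# Toy sequences (hypothetical, not taken from the 62-protein data set).
hits = shared_kmers("MKTAYIAKQR", "GGAYIAKLL")
print(hits)
```

    As the abstract warns, a hit from a search like this says nothing by itself about shared secondary structure or common ancestry; it only marks where two sequences happen to agree over five residues.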

  17. Sequence analysis and homology modeling of peroxidase from Medicago sativa

    PubMed Central

    Hooda, Vinita; Gundala, Prasada babu; Chinthala, Paramageetham

    2012-01-01

    Plant peroxidases are one of the most extensively studied groups of enzymes, which find applications in environmental, health, pharmaceutical, chemical and biotechnological processes. Class III secretory peroxidase from alfalfa (Medicago sativa) has been characterized using a bioinformatics approach. Physicochemical properties and topology of alfalfa peroxidase were compared with those of soybean and horseradish peroxidase, the two most popular commercially available peroxidase preparations. A lower value of the instability index as predicted by ProtParam and the presence of extra disulphide linkages as predicted by Cys_REC suggested alfalfa peroxidase to be more stable than either of the commercial preparations. Multiple Sequence Alignment (MSA) with other functionally similar proteins revealed the presence of highly conserved catalytic residues. A three-dimensional model of alfalfa peroxidase was constructed based on the crystal structure of soybean peroxidase (PDB Id: 1FHF A) by a homology modelling approach. The model was checked for stereochemical quality by the PROCHECK, VERIFY 3D, WHAT IF, ERRAT, 3D MATCH and ProSA servers. The best model was selected, energy minimized and used to analyze the structure-function relationship with the substrate hydrogen peroxide by Autodock 4.0. The enzyme-substrate complex was viewed with Swiss PDB viewer and one residue, ASP43, was found to stabilize the interaction by hydrogen bonds. The results of the study may be a guiding point for further investigations on alfalfa peroxidase. PMID:23275690

  18. Nucleotide sequence of the L1 ribosomal protein gene of Xenopus laevis: remarkable sequence homology among introns.

    PubMed Central

    Loreni, F; Ruberti, I; Bozzoni, I; Pierandrei-Amaldi, P; Amaldi, F

    1985-01-01

    Ribosomal protein L1 is encoded by two genes in Xenopus laevis. The comparison of two cDNA sequences shows that the two L1 gene copies (L1a and L1b) have diverged in many silent sites and very few substitution sites; moreover a small duplication occurred at the very end of the coding region of the L1b gene which thus codes for a product five amino acids longer than that coded by L1a. Quantitatively the divergence between the two L1 genes confirms that a whole genome duplication took place in Xenopus laevis approximately 30 million years ago. A genomic fragment containing one of the two L1 gene copies (L1a), with its nine introns and flanking regions, has been completely sequenced. The 5' end of this gene has been mapped within a 20-pyrimidine stretch as already found for other vertebrate ribosomal protein genes. Four of the nine introns have a 60-nucleotide sequence with 80% homology; within this region some boxes, one of which is 16 nucleotides long, are 100% homologous among the four introns. This feature of L1a gene introns is interesting since we have previously shown that the activity of this gene is regulated at a post-transcriptional level and it involves the block of the normal splicing of some intron sequences. PMID:3841512

  19. Sequence analysis and characterization of a 40-kilodalton Borrelia hermsii glycerophosphodiester phosphodiesterase homolog.

    PubMed Central

    Shang, E S; Skare, J T; Erdjument-Bromage, H; Blanco, D R; Tempst, P; Miller, J N; Lovett, M A

    1997-01-01

    We report the purification, molecular cloning, and characterization of a 40-kDa glycerophosphodiester phosphodiesterase homolog from Borrelia hermsii. The 40-kDa protein was solubilized from whole organisms with 0.1% Triton X-100, phase partitioned into the Triton X-114 detergent phase, and purified by fast-performance liquid chromatography (FPLC). The gene encoding the 40-kDa protein was cloned from a B. hermsii chromosomal DNA lambda EXlox expression library and identified by using affinity antibodies generated against the purified native protein. The deduced amino acid sequence included a 20-amino-acid signal peptide encoding a putative leader peptidase II cleavage site, indicating that the 40-kDa protein was a lipoprotein. Based on significant homology (31 to 52% identity) of the 40-kDa protein to glycerophosphodiester phosphodiesterases of Escherichia coli (GlpQ), Bacillus subtilis (GlpQ), and Haemophilus influenzae (Hpd; protein D), we have designated this B. hermsii 40-kDa lipoprotein a glycerophosphodiester phosphodiesterase (Gpd) homolog, the first B. hermsii lipoprotein to have a putative functional assignment. A nonlipidated form of the Gpd homolog was overproduced as a fusion protein in E. coli BL21(DE3)(pLysE) and was used to immunize rabbits to generate specific antiserum. Immunoblot analysis with anti-Gpd serum recognized recombinant H. influenzae protein D, and conversely, antiserum to H. influenzae protein D recognized recombinant B. hermsii Gpd (rGpd), indicating antigenic conservation between these proteins. Antiserum to rGpd also identified native Gpd as a constituent of purified outer membrane vesicles prepared from B. hermsii. Screening of other pathogenic spirochetes with anti-rGpd serum revealed the presence of antigenically related proteins in Borrelia burgdorferi, Treponema pallidum, and Leptospira kirschneri. Further sequence analysis both upstream and downstream of the Gpd homolog showed additional homologs of glycerol metabolism

  20. Phylogenetic analysis of sequences from diverse bacteria with homology to the Escherichia coli rho gene.

    PubMed Central

    Opperman, T; Richardson, J P

    1994-01-01

    Genes from Pseudomonas fluorescens, Chromatium vinosum, Micrococcus luteus, Deinococcus radiodurans, and Thermotoga maritima with homology to the Escherichia coli rho gene were cloned and sequenced, and their sequences were compared with other available sequences. The species for all of the compared sequences are members of five bacterial phyla, including Thermotogales, the most deeply diverged phylum. This suggests that a rho-like gene is ubiquitous in the Bacteria and was present in their common ancestor. The comparative analysis revealed that the Rho homologs are highly conserved, exhibiting a minimum identity of 50% of their amino acid residues in pairwise comparisons. The ATP-binding domain had a particularly high degree of conservation, consisting of some blocks with sequences of residues that are very similar to segments of the alpha and beta subunits of F1-ATPase and of other blocks with sequences that are unique to Rho. The RNA-binding domain is more diverged than the ATP-binding domain. However, one of its most highly conserved segments includes a RNP1-like sequence, which is known to be involved in RNA binding. Overall, the degree of similarity is lowest in the first 50 residues (the first half of the RNA-binding domain), in the putative connector region between the RNA-binding and the ATP-binding domains, and in the last 50 residues of the polypeptide. Since functionally defective mutants for E. coli Rho exist in all three of these segments, they represent important parts of Rho that have undergone adaptive evolution. PMID:8051015

  1. Adhesive Proteins of Stalked and Acorn Barnacles Display Homology with Low Sequence Similarities

    PubMed Central

    Jonker, Jaimie-Leigh; Abram, Florence; Pires, Elisabete; Varela Coelho, Ana; Grunwald, Ingo; Power, Anne Marie

    2014-01-01

    Barnacle adhesion underwater is an important phenomenon to understand for the prevention of biofouling and potential biotechnological innovations, yet so far, identifying what makes barnacle glue proteins ‘sticky’ has proved elusive. Examination of a broad range of species within the barnacles may be instructive to identify conserved adhesive domains. We add to extensive information from the acorn barnacles (order Sessilia) by providing the first protein analysis of a stalked barnacle adhesive, Lepas anatifera (order Lepadiformes). It was possible to separate the L. anatifera adhesive into at least 10 protein bands using SDS-PAGE. Intense bands were present at approximately 30, 70, 90 and 110 kilodaltons (kDa). Mass spectrometry for protein identification was followed by de novo sequencing which detected 52 peptides of 7–16 amino acids in length. None of the peptides matched published or unpublished transcriptome sequences, but some amino acid sequence similarity was apparent between L. anatifera and closely-related Dosima fascicularis. Antibodies against two acorn barnacle proteins (ab-cp-52k and ab-cp-68k) showed cross-reactivity in the adhesive glands of L. anatifera. We also analysed the similarity of adhesive proteins across several barnacle taxa, including Pollicipes pollicipes (a stalked barnacle in the order Scalpelliformes). Sequence alignment of published expressed sequence tags clearly indicated that P. pollicipes possesses homologues for the 19 kDa and 100 kDa proteins in acorn barnacles. Homology aside, sequence similarity in amino acid and gene sequences tended to decline as taxonomic distance increased, with minimum similarities of 18–26%, depending on the gene. The results indicate that some adhesive proteins (e.g. 100 kDa) are more conserved within barnacles than others (20 kDa). PMID:25295513

  2. Purification and characterization of elastase-specific inhibitor. Sequence homology with mucus proteinase inhibitor.

    PubMed

    Sallenave, J M; Ryle, A P

    1991-01-01

    Elastase-specific inhibitor (ESI) was purified from sputum of patients with chronic bronchitis and compared with mucus proteinase inhibitor (MPI, BrI) isolated, without the use of affinity chromatography on an enzyme, from non-purulent sputum of a patient with bronchial carcinoma. The N-terminal sequence of 27 residues of the latter was determined and showed serine as the only N-terminus. The partial N-terminal amino-acid sequence of ESI shows some homology with MPI, especially around the reactive site of MPI for human neutrophil elastase. This region could therefore be the reactive site of ESI. The thermodynamic and kinetic constants of the reactions of ESI with human neutrophil elastase and with porcine pancreatic elastase show that ESI is a fast-acting inhibitor. PMID:2039600

  3. DNA sequence, structure, and tyrosine kinase activity of the Drosophila melanogaster abelson proto-oncogene homolog

    SciTech Connect

    Henkemeyer, M.J.; Bennett, R.L.; Gertler, F.B.; Hoffmann, F.M.

    1988-02-01

    The authors report their molecular characterization of the Drosophila melanogaster Abelson gene (abl), a gene in which recessive loss-of-function mutations result in lethality at the pupal stage of development. This essential gene consists of 10 exons extending over 26 kilobase pairs of genomic DNA. The DNA sequence encodes a protein of 1,520 amino acids with strong sequence similarity to the human c-abl proto-oncogene beginning in the type 1b 5' exon and extending through the region essential for tyrosine kinase activity. When the tyrosine kinase homologous region was expressed in Escherichia coli, phosphorylation of proteins on tyrosine residues was observed with an antiphosphotyrosine antibody. These results show that the abl gene is highly conserved through evolution and encodes a functional tyrosine protein kinase required for Drosophila development.

  4. Reconstruction of cyclooxygenase evolution in animals suggests variable, lineage-specific duplications, and homologs with low sequence identity.

    PubMed

    Havird, Justin C; Kocot, Kevin M; Brannock, Pamela M; Cannon, Johanna T; Waits, Damien S; Weese, David A; Santos, Scott R; Halanych, Kenneth M

    2015-04-01

    Cyclooxygenase (COX) enzymatically converts arachidonic acid into prostaglandin G/H in animals and has importance during pregnancy, digestion, and other physiological functions in mammals. COX genes have mainly been described from vertebrates, where gene duplications are common, but few studies have examined COX in invertebrates. Given the increasing ease in generating genomic data, as well as recent, although incomplete descriptions of potential COX sequences in Mollusca, Crustacea, and Insecta, assessing COX evolution across Metazoa is now possible. Here, we recover 40 putative COX orthologs by searching publicly available genomic resources as well as ~250 novel invertebrate transcriptomic datasets. Results suggest the common ancestor of Cnidaria and Bilateria possessed a COX homolog similar to those of vertebrates, although such homologs were not found in poriferan and ctenophore genomes. COX was found in most crustaceans and the majority of molluscs examined, but only specific taxa/lineages within Cnidaria and Annelida. For example, all octocorallians appear to have COX, while no COX homologs were found in hexacorallian datasets. Most species examined had a single homolog, although species-specific COX duplications were found in members of Annelida, Mollusca, and Cnidaria. Additionally, COX genes were not found in Hemichordata, Echinodermata, or Platyhelminthes, and the few previously described COX genes in Insecta lacked appreciable sequence homology (although structural analyses suggest these may still be functional COX enzymes). This analysis provides a benchmark for identifying COX homologs in future genomic and transcriptomic datasets, and identifies lineages for future studies of COX. PMID:25758350

  5. Multilocus Sequence Typing Reveals Evidence of Homologous Recombination Linked to Antibiotic Resistance in the Genus Salinispora

    PubMed Central

    Freel, Kelle C.; Millán-Aguiñaga, Natalie

    2013-01-01

    The three closely related species that currently comprise the genus Salinispora were analyzed using a multilocus sequence typing approach targeting 48 strains derived from four geographic locations. Phylogenetic congruence and a well-supported concatenated tree provide strong support for the delineation of the three species as currently described and the basal relationship of Salinispora arenicola to the more recently diverged sister taxa S. tropica and S. pacifica. The phylogeny of the initial region of the rpoB gene sequenced was atypical, placing the related genera Micromonospora and Verrucosispora within the Salinispora clade. This phylogenetic incongruence was subsequently ascribed to a homologous-recombination event in a portion of the gene associated with resistance to compounds in the rifamycin class, which target RpoB. All S. arenicola strains produced compounds in this class and possessed resistance-conferring amino acid changes in RpoB. The phylogeny of a region of the rpoB gene that is not associated with rifamycin resistance was congruent with the other housekeeping genes. The link between antibiotic resistance and homologous recombination suggests that incongruent phylogenies provide opportunities to identify the molecular targets of secondary metabolites, an observation with potential relevance for drug discovery efforts. Low ratios of interspecies recombination to mutation, even among cooccurring strains, coupled with high levels of within-species recombination suggest that the three species have been described in accordance with natural barriers to recombination. PMID:23892741

  6. Composition for nucleic acid sequencing

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2008-08-26

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  7. Sequence comparisons in the aminoacyl-tRNA synthetases with emphasis on regions of likely homology with sequences in the Rossmann fold in the methionyl and tyrosyl enzymes.

    PubMed

    Walker, E J; Jeffrey, P D

    1988-02-01

    Amino acid sequences of aminoacyl-tRNA synthetases specific for 12 different amino acids have now been published. Differences in origin at the species and organelle level result in 20 distinct sequences being available for comparison. Some of these were compared in small groups as they were determined and, although some homologies were detected, it was generally concluded that there was surprisingly little sequence homology in this functionally related group of enzymes. We have made comparisons of all of the available sequences by using a combination of computer and manual alignment methods and knowledge of the sequences in the Rossmann fold region of methionyl-tRNA synthetase from E. coli and tyrosyl-tRNA synthetase from B. stearothermophilus, enzymes whose three-dimensional structures have been described. It emerges that all of the aminoacyl-tRNA synthetase sequences thus examined show considerable homology with each other over at least parts of this region, some over virtually all of it. We conclude that a great deal more similarity than had previously been suspected exists in these proteins. In particular, the alignments we have made strongly imply the existence of a mononucleotide binding site of the Rossmann fold configuration in all of the synthetases compared. PMID:3283733

  8. Callatostatins: neuropeptides from the blowfly Calliphora vomitoria with sequence homology to cockroach allatostatins.

    PubMed Central

    Duve, H; Johnsen, A H; Scott, A G; Yu, C G; Yagi, K J; Tobe, S S; Thorpe, A

    1993-01-01

    Five neuropeptides with C-terminal amino acid sequence homology to cockroach allatostatins have been identified in the blowfly Calliphora vomitoria. Three have the same pentapeptide C-terminal amino acid sequence as allatostatin 1 of the cockroach Diploptera punctata. A hexadecapeptide designated callatostatin 1, isolated from thoracic ganglia, brains, and heads, has the sequence Asp-Pro-Leu-Asn-Glu-Glu-Arg-Arg-Ala-Asn-Arg-Tyr-Gly-Phe-Gly-Leu-NH2. Callatostatins 2 and 3 have been isolated from heads and thoracic ganglia, respectively; they comprise the last 14 and 8 residues of callatostatin 1. Callatostatin 4, isolated from thoracic ganglia, has the sequence Xaa-Arg-Pro-Tyr-Ser-Phe-Gly-Leu-NH2, where Xaa is either Asp or Asn. This peptide, with a serine substitution for glycine at position 5, has a C-terminal pentapeptide sequence identical to that of allatostatins 3 and 4 of D. punctata. Callatostatin 5, with the sequence Gly-Pro-Pro-Tyr-Asp-Phe-Gly-Met-NH2, was identified from whole flies. All five peptides inhibit juvenile hormone production by the corpora allata of D. punctata in vitro. Callatostatin 5 was the most potent allatostatin so far tested in this species, with maximum inhibition occurring at 1 nM. In contrast, none of the callatostatins or the allatostatins showed allatostatic activity in mature female C. vomitoria when tested at concentrations of 100 to 0.1 microM. In accordance with these results, immunoreactivity to an antiserum directed against the common C terminus of callatostatin 1 and allatostatin 1 was observed in the corpora allata of D. punctata but not in the corpus allatum of C. vomitoria, despite its presence in neurons of the brain. Neurons in the thoracic ganglion of C. vomitoria that are immunoreactive against this antiserum project to the hindgut, rectum, rectal papillae, and oviduct, suggestive of a function different from that of a true allatostatin. Images Fig. 5 PMID:8460157
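
    The relationships among the peptides stated above can be checked directly from the published sequences. The following Python sketch uses only the sequences given in the abstract; the helper names are illustrative.

```python
# Sequences as given in the abstract (three-letter codes, C-terminal amide omitted).
CALLATOSTATIN_1 = "Asp-Pro-Leu-Asn-Glu-Glu-Arg-Arg-Ala-Asn-Arg-Tyr-Gly-Phe-Gly-Leu".split("-")
CALLATOSTATIN_4 = "Xaa-Arg-Pro-Tyr-Ser-Phe-Gly-Leu".split("-")


def c_terminal(residues, n):
    """Return the last n residues of a peptide."""
    return residues[-n:]


def mismatches(a, b):
    """List (1-based position, residue in a, residue in b) where two equal-length peptides differ."""
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Callatostatins 2 and 3 are the last 14 and 8 residues of callatostatin 1.
callatostatin_2 = c_terminal(CALLATOSTATIN_1, 14)
callatostatin_3 = c_terminal(CALLATOSTATIN_1, 8)

# Compare the C-terminal pentapeptides of callatostatins 1 and 4.
penta_1 = c_terminal(CALLATOSTATIN_1, 5)
penta_4 = c_terminal(CALLATOSTATIN_4, 5)
print("-".join(callatostatin_3))
print(mismatches(penta_1, penta_4))
```

    The single pentapeptide mismatch (Gly to Ser) corresponds to position 5 of the callatostatin 4 octapeptide.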

  9. Studying RNA Homology and Conservation with Infernal: From Single Sequences to RNA Families.

    PubMed

    Barquist, Lars; Burge, Sarah W; Gardner, Paul P

    2016-01-01

    Emerging high-throughput technologies have led to a deluge of putative non-coding RNA (ncRNA) sequences identified in a wide variety of organisms. Systematic characterization of these transcripts will be a tremendous challenge. Homology detection is critical to making maximal use of functional information gathered about ncRNAs: identifying homologous sequence allows us to transfer information gathered in one organism to another quickly and with a high degree of confidence. ncRNA presents a challenge for homology detection, as the primary sequence is often poorly conserved and de novo secondary structure prediction and search remain difficult. This unit introduces methods developed by the Rfam database for identifying "families" of homologous ncRNAs starting from single "seed" sequences, using manually curated sequence alignments to build powerful statistical models of sequence and structure conservation known as covariance models (CMs), implemented in the Infernal software package. We provide a step-by-step iterative protocol for identifying ncRNA homologs and then constructing an alignment and corresponding CM. We also work through an example for the bacterial small RNA MicA, discovering a previously unreported family of divergent MicA homologs in genus Xenorhabdus in the process. © 2016 by John Wiley & Sons, Inc. PMID:27322404
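
    One round of the build-and-search cycle can be sketched as a short Python helper that assembles the Infernal command lines. This is a minimal sketch, not the protocol from the unit: the file names are hypothetical, and the helper only builds the commands (cmbuild, cmcalibrate, cmsearch) without running them.

```python
def infernal_round(alignment_sto, model_cm, sequence_db, hits_tbl):
    """Return the command lines for one build/calibrate/search round.

    alignment_sto: curated seed alignment (Stockholm format, with structure)
    model_cm:      covariance model file to write
    sequence_db:   sequence database to search
    hits_tbl:      tabular hit file to write
    All four paths are hypothetical placeholders.
    """
    return [
        ["cmbuild", model_cm, alignment_sto],          # build the CM from the alignment
        ["cmcalibrate", model_cm],                     # calibrate E-value statistics
        ["cmsearch", "--tblout", hits_tbl, model_cm, sequence_db],  # search for homologs
    ]


commands = infernal_round("seed.sto", "family.cm", "genomes.fa", "hits.tbl")
for command in commands:
    print(" ".join(command))
```

    In an iterative workflow, trusted hits from the search would be added to the alignment and the round repeated; passing each list to subprocess.run would execute it on a system where Infernal is installed.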

  10. Homology between the deoxyribonucleic acid of fertility factor P and Vibrio cholerae chromosomal deoxyribonucleic acid.

    PubMed Central

    Wohlhieter, J A; Datta, A; Brenner, D J; Baron, L S

    1975-01-01

    The deoxyribonucleic acid (DNA) of the Vibrio cholerae fertility factor P was isolated by the dye-buoyant density method and hybridized to V. cholerae chromosomal DNA. The DNA of this fertility plasmid had between 35 and 40% homology with the V. cholerae chromosomal DNA. Little or no homology was detected between the P factor DNA and DNA of the Escherichia coli sex factor F. PMID:1092651

  11. Identification of viruses and viroids by next-generation sequencing and homology-dependent and homology-independent algorithms.

    PubMed

    Wu, Qingfa; Ding, Shou-Wei; Zhang, Yongjiang; Zhu, Shuifang

    2015-01-01

    A fast, accurate, and full indexing of viruses and viroids in a sample for the inspection and quarantine services and disease management is desirable but was unrealistic until recently. This article reviews the rapid and exciting recent progress in the use of next-generation sequencing (NGS) technologies for the identification of viruses and viroids in plants. A total of four viroids/viroid-like RNAs and 49 new plant RNA and DNA viruses from 18 known or unassigned virus families have been identified from plants since 2009. A comparison of enrichment strategies reveals that full indexing of RNA and DNA viruses as well as viroids in a plant sample at single-nucleotide resolution is made possible by one NGS run of total small RNAs, followed by data mining with homology-dependent and homology-independent computational algorithms. Major challenges in the application of NGS technologies to pathogen discovery are discussed. PMID:26047558

  12. HorA web server to infer homology between proteins using sequence and structural similarity.

    PubMed

    Kim, Bong-Hyun; Cheng, Hua; Grishin, Nick V

    2009-07-01

    The biological properties of proteins are often gleaned through comparative analysis of evolutionary relatives. Although protein structure similarity search methods detect more distant homologs than purely sequence-based methods, structural resemblance can result from either homology (common ancestry) or analogy (similarity without common ancestry). While many existing web servers detect structural neighbors, they do not explicitly address the question of homology versus analogy. Here, we present a web server named HorA (Homology or Analogy) that identifies likely homologs for a query protein structure. Unlike other servers, HorA combines sequence information from state-of-the-art profile methods with structure information from spatial similarity measures using an advanced computational technique. HorA aims to identify biologically meaningful connections rather than purely 3D-geometric similarities. The HorA method finds approximately 90% of remote homologs defined in the manually curated database SCOP. HorA will be especially useful for finding remote homologs that might be overlooked by other sequence or structural similarity search servers. The HorA server is available at http://prodata.swmed.edu/horaserver. PMID:19417074

  13. HorA web server to infer homology between proteins using sequence and structural similarity

    PubMed Central

    Kim, Bong-Hyun; Cheng, Hua; Grishin, Nick V.

    2009-01-01

    The biological properties of proteins are often gleaned through comparative analysis of evolutionary relatives. Although protein structure similarity search methods detect more distant homologs than purely sequence-based methods, structural resemblance can result from either homology (common ancestry) or analogy (similarity without common ancestry). While many existing web servers detect structural neighbors, they do not explicitly address the question of homology versus analogy. Here, we present a web server named HorA (Homology or Analogy) that identifies likely homologs for a query protein structure. Unlike other servers, HorA combines sequence information from state-of-the-art profile methods with structure information from spatial similarity measures using an advanced computational technique. HorA aims to identify biologically meaningful connections rather than purely 3D-geometric similarities. The HorA method finds ∼90% of remote homologs defined in the manually curated database SCOP. HorA will be especially useful for finding remote homologs that might be overlooked by other sequence or structural similarity search servers. The HorA server is available at http://prodata.swmed.edu/horaserver. PMID:19417074

  14. Nucleotide sequence analysis of the L gene of Newcastle disease virus: homologies with Sendai and vesicular stomatitis viruses.

    PubMed Central

    Yusoff, K; Millar, N S; Chambers, P; Emmerson, P T

    1987-01-01

    The nucleotide sequence of the L gene of the Beaudette C strain of Newcastle disease virus (NDV) has been determined. The L gene is 6704 nucleotides long and encodes a protein of 2204 amino acids with a calculated molecular weight of 248822. Mung bean nuclease mapping of the 5' terminus of the L gene mRNA indicates that the transcription of the L gene is initiated 11 nucleotides upstream of the translational start site. Comparison with the amino acid sequences of the L genes of Sendai virus and vesicular stomatitis virus (VSV) suggests that there are several regions of homology between the sequences. These data provide further evidence for an evolutionary relationship between the Paramyxoviridae and the Rhabdoviridae. A non-coding sequence of 46 nucleotides downstream of the presumed polyadenylation site of the L gene may be part of a negative strand leader RNA. Images PMID:3035486
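
    The figures quoted above can be cross-checked with simple arithmetic. The sketch below assumes that the 6704-nucleotide gene length covers the coding region, one stop codon, and the flanking untranslated sequence; that assumption is not stated in the abstract.

```python
GENE_LENGTH_NT = 6704      # from the abstract
PROTEIN_LENGTH_AA = 2204   # from the abstract
MOLECULAR_WEIGHT = 248822  # from the abstract

# Three nucleotides per codon, plus one stop codon (assumed to be counted in the gene).
coding_nt = PROTEIN_LENGTH_AA * 3
coding_with_stop_nt = coding_nt + 3

# Whatever remains is non-coding sequence under the assumption above.
noncoding_nt = GENE_LENGTH_NT - coding_with_stop_nt

# Mean mass per residue, a quick plausibility check on the calculated weight.
mean_residue_mass = MOLECULAR_WEIGHT / PROTEIN_LENGTH_AA

print(coding_with_stop_nt, noncoding_nt, round(mean_residue_mass, 1))
```

    A mean residue mass near 113 is in the range expected for an average protein, so the reported length and molecular weight are mutually consistent.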

  15. Sequence analysis and homology modeling of laccase from Pycnoporus cinnabarinus.

    PubMed

    Meshram, Rohan J; Gavhane, Aj; Gaikar, Rb; Bansode, Ts; Maskar, Au; Gupta, Ak; Sohni, Sk; Patidar, Ma; Pandey, Tr; Jangle, Sn

    2010-01-01

    Industrial effluents from the textile, paper, and leather industries contain various toxic dyes as waste material. These dyes have a major impact on human health as well as on the environment. Laccase from the white rot fungus Pycnoporus cinnabarinus is generally used to degrade these toxic dyes. To decipher the mechanism by which laccase degrades dyes, it is essential to know its 3D structure. In the present work, homology modeling was performed by satisfaction of spatial restraints using the Modeller program, which is considered a standard in this field, to generate a 3D structure of laccase; in parallel, the SWISSMODEL web server was used to generate and verify alternative models. We observed that the models created using Modeller performed better on structure evaluation tests. This study can further be used in molecular docking to understand the interaction of the enzyme with mediators such as 2,2'-azinobis(3-ethylbenzthiazoline-6-sulfonate) (ABTS) and vanillin, which are known to enhance laccase activity. PMID:21364777

  16. Proline-rich sequences that bind to Src homology 3 domains with individual specificities.

    PubMed Central

    Alexandropoulos, K; Cheng, G; Baltimore, D

    1995-01-01

    To study the binding specificity of Src homology 3 (SH3) domains, we have screened a mouse embryonic expression library for peptide fragments that interact with them. Several clones were identified that express fragments of proteins which, through proline-rich binding sites, exhibit differential binding specificity to various SH3 domains. Src-SH3-specific binding uses a sequence of 7 aa of the consensus RPLPXXP, in which the N-terminal arginine is very important. The SH3 domains of the Src-related kinases Fyn, Lyn, and Hck bind to this sequence with the same affinity as that of the Src SH3. In contrast, a quite different proline-rich sequence from the Btk protein kinase binds to the Fyn, Lyn, and Hck SH3 domains, but not to the Src SH3. Specific binding of the Abl SH3 requires a longer, more proline-rich sequence but no arginine. One clone that binds to both Src and Abl SH3 domains through a common site exhibits reversed binding orientation, in that an arginine indispensable for binding to all tested SH3 domains occurs at the C terminus. Another clone contains overlapping yet distinct Src and Abl SH3 binding sites. Binding to the SH3 domains is mediated by a common PXXP amino acid sequence motif present on all ligands, and specificity comes about from other interactions, often ones involving arginine. The rules governing in vivo usage of particular sites by particular SH3 domains are not clear, but one binding orientation may be more specific than another. Images Fig. 1 Fig. 2 Fig. 3 PMID:7536925
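
    The two motifs named above, the common PXXP core and the Src-SH3 consensus RPLPXXP, translate naturally into regular expressions. The Python sketch below scans a peptide for both; the example peptide is hypothetical and is not taken from the study.

```python
import re

# X is any residue; lookahead groups allow overlapping matches to be reported.
PXXP = re.compile(r"(?=(P..P))")
SRC_CONSENSUS = re.compile(r"(?=(RPLP..P))")


def find_motifs(pattern, sequence):
    """Return (1-based start position, matched segment) for every occurrence."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]


# Hypothetical proline-rich peptide in one-letter code.
peptide = "AARPLPPLPGS"
pxxp_hits = find_motifs(PXXP, peptide)
src_hits = find_motifs(SRC_CONSENSUS, peptide)
print(pxxp_hits)
print(src_hits)
```

    A motif match is only a necessary condition here: as the abstract notes, specificity comes from additional interactions outside the PXXP core.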

  17. Using homology relations within a database markedly boosts protein sequence similarity search.

    PubMed

    Tong, Jing; Sadreyev, Ruslan I; Pei, Jimin; Kinch, Lisa N; Grishin, Nick V

    2015-06-01

    Inference of homology from protein sequences provides an essential tool for analyzing protein structure, function, and evolution. Current sequence-based homology search methods are still unable to detect many similarities evident from protein spatial structures. In computer science a search engine can be improved by considering networks of known relationships within the search database. Here, we apply this idea to protein-sequence-based homology search and show that it dramatically enhances the search accuracy. Our new method, COMPADRE (COmparison of Multiple Protein sequence Alignments using Database RElationships) assesses the relationship between the query sequence and a hit in the database by considering the similarity between the query and hit's known homologs. This approach increases detection quality, boosting the precision rate from 18% to 83% at half-coverage of all database homologs. The increased precision rate allows detection of a large fraction of protein structural relationships, thus providing structure and function predictions for previously uncharacterized proteins. Our results suggest that this general approach is applicable to a wide variety of methods for detection of biological similarities. The web server is available at prodata.swmed.edu/compadre. PMID:26038555
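
    The core idea, judging a query-hit pair by also comparing the query with the hit's known homologs, can be illustrated with a toy rescoring function. This is not the published COMPADRE scoring scheme: the plain average, the score table, and all identifiers below are assumptions made for illustration.

```python
def rescored(query, hit, similarity, known_homologs):
    """Average the direct query-hit score with the query's scores against the hit's known homologs."""
    partners = [hit] + sorted(known_homologs.get(hit, []))
    scores = [similarity(query, partner) for partner in partners]
    return sum(scores) / len(scores)


# Hypothetical pairwise scores (arbitrary units) and database relationships.
SCORES = {("q", "h1"): 20, ("q", "h2"): 80, ("q", "h3"): 80, ("q", "x1"): 25}
KNOWN_HOMOLOGS = {"h1": ["h2", "h3"]}  # x1 has no recorded homologs


def lookup(a, b):
    """Look up a precomputed score; unknown pairs score zero."""
    return SCORES.get((a, b), 0)


weak_hit_with_support = rescored("q", "h1", lookup, KNOWN_HOMOLOGS)
weak_hit_alone = rescored("q", "x1", lookup, KNOWN_HOMOLOGS)
print(weak_hit_with_support, weak_hit_alone)
```

    In this toy example the hit h1 scores below x1 on direct comparison but ranks above it once the scores against its known homologs are taken into account.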

  18. Sequence homology and structural analysis of the clostridial neurotoxins.

    PubMed

    Lacy, D B; Stevens, R C

    1999-09-01

    The clostridial neurotoxins (CNTs), comprised of tetanus neurotoxin (TeNT) and the seven serotypes of botulinum neurotoxin (BoNT A-G), specifically bind to neuronal cells and disrupt neurotransmitter release by cleaving proteins involved in synaptic vesicle membrane fusion. In this study, multiple CNT sequences were analyzed within the context of the 1277 residue BoNT/A crystal structure to gain insight into the events of binding, pore formation, translocation, and catalysis that are required for toxicity. A comparison of the TeNT-binding domain structure to that of BoNT/A reveals striking differences in their surface properties. Further, the solvent accessibility of a key tryptophan in the C terminus of the BoNT/A-binding domain refines the location of the ganglioside-binding site. Data collected from a single frozen crystal of BoNT/A are included in this study, revealing slight differences in the binding domain orientation as well as density for a previously unobserved translocation domain loop. This loop and the conservation of charged residues with structural proximity to putative pore-forming sequences lend insight into the CNT mechanism of pore formation and translocation. The sequence analysis of the catalytic domain revealed an area near the active site likely to account for specificity differences between the CNTs. It also revealed a tertiary structure, highly conserved in primary sequence, which seems critical to catalysis but is 30 Å from the active-site zinc ion. This observation, along with an analysis of the 54 residue "belt" from the translocation domain, is discussed with respect to the mechanism of catalysis. PMID:10518945

  19. Using homology relations within a database markedly boosts protein sequence similarity search

    PubMed Central

    Tong, Jing; Sadreyev, Ruslan I.; Pei, Jimin; Kinch, Lisa N.; Grishin, Nick V.

    2015-01-01

    Inference of homology from protein sequences provides an essential tool for analyzing protein structure, function, and evolution. Current sequence-based homology search methods are still unable to detect many similarities evident from protein spatial structures. In computer science a search engine can be improved by considering networks of known relationships within the search database. Here, we apply this idea to protein-sequence–based homology search and show that it dramatically enhances the search accuracy. Our new method, COMPADRE (COmparison of Multiple Protein sequence Alignments using Database RElationships) assesses the relationship between the query sequence and a hit in the database by considering the similarity between the query and hit’s known homologs. This approach increases detection quality, boosting the precision rate from 18% to 83% at half-coverage of all database homologs. The increased precision rate allows detection of a large fraction of protein structural relationships, thus providing structure and function predictions for previously uncharacterized proteins. Our results suggest that this general approach is applicable to a wide variety of methods for detection of biological similarities. The web server is available at prodata.swmed.edu/compadre. PMID:26038555

  20. pGraph: Efficient Parallel Construction of Large-Scale Protein Sequence Homology Graphs

    SciTech Connect

    Wu, Changjun; Kalyanaraman, Anantharaman; Cannon, William R.

    2012-09-15

    Detecting sequence homology between protein sequences is a fundamental problem in computational molecular biology, with a pervasive application in nearly all analyses that aim to structurally and functionally characterize protein molecules. While detecting the homology between two protein sequences is relatively inexpensive, detecting pairwise homology for a large number of protein sequences can become computationally prohibitive for modern inputs, often requiring millions of CPU hours. Yet, there is currently no robust support to parallelize this kernel. In this paper, we identify the key characteristics that make this problem particularly hard to parallelize, and then propose a new parallel algorithm that is suited for detecting homology on large data sets using distributed memory parallel computers. Our method, called pGraph, is a novel hybrid between the hierarchical multiple-master/worker model and producer-consumer model, and is designed to break the irregularities imposed by alignment computation and work generation. Experimental results show that pGraph achieves linear scaling on a 2,048 processor distributed memory cluster for a wide range of inputs ranging from as small as 20,000 sequences to 2,560,000 sequences. In addition to demonstrating strong scaling, we present an extensive report on the performance of the various system components and related parametric studies.
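
    The producer-consumer half of such a design can be sketched with Python threads and a bounded queue. This is a small shared-memory illustration under stated assumptions, not pGraph itself: the position-wise identity function stands in for a real alignment, and the sequences, threshold, and function names are hypothetical.

```python
import itertools
import queue
import threading


def identity(a, b):
    """Toy similarity: matching positions over the longer length (no real alignment)."""
    return sum(x == y for x, y in zip(a, b)) / max(len(a), len(b))


def build_homology_graph(sequences, threshold=0.8, n_consumers=4):
    """Return sorted edges (id_a, id_b) for all pairs scoring at or above the threshold."""
    tasks = queue.Queue(maxsize=64)  # bounded queue decouples work generation from alignment
    edges, lock = [], threading.Lock()

    def producer():
        for pair in itertools.combinations(sorted(sequences), 2):
            tasks.put(pair)
        for _ in range(n_consumers):
            tasks.put(None)  # one stop signal per consumer

    def consumer():
        while True:
            pair = tasks.get()
            if pair is None:
                break
            a, b = pair
            if identity(sequences[a], sequences[b]) >= threshold:
                with lock:
                    edges.append((a, b))

    threads = [threading.Thread(target=producer)]
    threads += [threading.Thread(target=consumer) for _ in range(n_consumers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(edges)


# Hypothetical input sequences.
toy = {"s1": "MKTAYIAKQR", "s2": "MKTAYIAKQK", "s3": "GGGGGGGGGG"}
graph_edges = build_homology_graph(toy)
print(graph_edges)
```

    The bounded queue is the point of the sketch: pair generation and alignment proceed at different, irregular rates, and the queue lets neither side stall the other for long.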

  1. High speed nucleic acid sequencing

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid. Each type of labeled nucleotide comprises an acceptor fluorophore attached to a phosphate portion of the nucleotide such that the fluorophore is removed upon incorporation into a growing strand. Fluorescent signal is emitted via fluorescent resonance energy transfer between the donor fluorophore and the acceptor fluorophore as each nucleotide is incorporated into the growing strand. The sequence is deduced by identifying which base is being incorporated into the growing strand.

  2. Transitive Homology-Guided Structural Studies Lead to Discovery of Cro Proteins With 40% Sequence Identity But Different Folds

    SciTech Connect

    Roessler, C.G.; Hall, B.M.; Anderson, W.J.; Ingram, W.M.; Roberts, S.A.; Montfort, W.R.; Cordes, M.H.J.

    2009-05-27

    Proteins that share common ancestry may differ in structure and function because of divergent evolution of their amino acid sequences. For a typical diverse protein superfamily, the properties of a few scattered members are known from experiment. A satisfying picture of functional and structural evolution in relation to sequence changes, however, may require characterization of a larger, well chosen subset. Here, we employ a 'stepping-stone' method, based on transitive homology, to target sequences intermediate between two related proteins with known divergent properties. We apply the approach to the question of how new protein folds can evolve from preexisting folds and, in particular, to an evolutionary change in secondary structure and oligomeric state in the Cro family of bacteriophage transcription factors, initially identified by sequence-structure comparison of distant homologs from phages P22 and λ. We report crystal structures of two Cro proteins, Xfaso 1 and Pfl 6, with sequences intermediate between those of P22 and λ. The domains show 40% sequence identity but differ by switching of α-helix to β-sheet in a C-terminal region spanning ~25 residues. Sedimentation analysis also suggests a correlation between helix-to-sheet conversion and strengthened dimerization.
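
    A sequence identity figure such as the 40% quoted above is computed from a pairwise alignment. The Python sketch below shows one common convention, identical residues divided by the number of aligned (gap-free) columns; other denominators are also in use, and the aligned strings are hypothetical rather than the Cro sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments (one-letter code, '-' marks a gap).
identity_pct = percent_identity("MKV-LA", "MRVQLS")
print(identity_pct)
```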

  3. A work stealing based approach for enabling scalable optimal sequence homology detection

    SciTech Connect

    Daily, Jeffrey A.; Kalyanaraman, Anantharaman; Krishnamoorthy, Sriram; Vishnu, Abhinav

    2015-05-01

    Sequence homology detection is central to a number of bioinformatics applications including genome sequencing and protein family characterization. Given millions of sequences, the goal is to identify all pairs of sequences that are highly similar (or “homologous”) on the basis of alignment criteria. While there are optimal alignment algorithms to compute pairwise homology, their deployment at large scale is currently not feasible; instead, heuristic methods are used at the expense of quality. Here, we present the design and evaluation of a parallel implementation for conducting optimal homology detection on distributed memory supercomputers. Our approach uses a combination of techniques from asynchronous load balancing (viz. work stealing, dynamic task counters), data replication, and exact-matching filters to achieve homology detection at scale. Results for 2.56M sequences on up to 8K cores show parallel efficiencies of ~75-100%, a time-to-solution of 33s, and a rate of ~2.0M alignments per second.
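
    Work stealing, one of the load-balancing techniques named above, can be illustrated with a deterministic single-threaded simulation. This is a sketch of the general idea only, not the authors' distributed implementation: the round-robin scheduling, the choice of the fullest queue as victim, and the integer task labels are all assumptions.

```python
from collections import deque


def run_with_work_stealing(task_lists):
    """Simulate workers that drain their own queue and steal when it is empty.

    Returns (tasks completed per worker, number of steals).
    """
    queues = [deque(tasks) for tasks in task_lists]
    done = [[] for _ in queues]
    steals = 0
    while any(queues):
        for worker, own in enumerate(queues):
            if own:
                task = own.pop()  # owner takes from the tail of its own queue
            else:
                # Pick the fullest queue as the victim (an assumed policy).
                victim = max(range(len(queues)), key=lambda i: len(queues[i]))
                if not queues[victim]:
                    continue  # nothing left to steal anywhere
                task = queues[victim].popleft()  # thief takes from the victim's head
                steals += 1
            done[worker].append(task)
    return done, steals


# Imbalanced start: all six alignment tasks sit on worker 0.
completed, steal_count = run_with_work_stealing([[1, 2, 3, 4, 5, 6], []])
print(completed, steal_count)
```

    Taking from opposite ends of the queue keeps the owner and the thief from contending for the same task, which is the usual motivation for this layout.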

  4. LocARNAscan: Incorporating thermodynamic stability in sequence and structure-based RNA homology search

    PubMed Central

    2013-01-01

    Background The search for distant homologs has become an important issue in genome annotation. A particular difficulty is posed by divergent homologs that have lost recognizable sequence similarity. This same problem also arises in the recognition of novel members of large classes of RNAs such as snoRNAs or microRNAs that consist of families unrelated by common descent. Current homology search tools for structured RNAs are either based entirely on sequence similarity (such as blast or hmmer) or combine sequence and secondary structure. The most prominent example of the latter class of tools is Infernal. Alternatives are descriptor-based methods. In most practical applications published to-date, however, the information contained in covariance models or manually prescribed search patterns is dominated by sequence information. Here we ask two related questions: (1) Is secondary structure alone informative for homology search and the detection of novel members of RNA classes? (2) To what extent is the thermodynamic propensity of the target sequence to fold into the correct secondary structure helpful for this task? Results Sequence-structure alignment can be used as an alternative search strategy. In this scenario, the query consists of a base pairing probability matrix, which can be derived either from a single sequence or from a multiple alignment representing a set of known representatives. Sequence information can be optionally added to the query. The target sequence is pre-processed to obtain local base pairing probabilities. As a search engine we devised a semi-global scanning variant of LocARNA’s algorithm for sequence-structure alignment. The LocARNAscan tool is optimized for speed and low memory consumption. In benchmarking experiments on artificial data we observe that the inclusion of thermodynamic stability is helpful, albeit only in a regime of extremely low sequence information in the query. We observe, furthermore, that the sensitivity is bounded in

  5. Sequence divergence and chromosomal rearrangements during the evolution of human pseudoautosomal genes and their mouse homologs

    SciTech Connect

    Ellison, J.; Li, X.; Francke, U.

    1994-09-01

    The pseudoautosomal region (PAR) is an area of sequence identity between the X and Y chromosomes and is important for mediating X-Y pairing during male meiosis. Of the seven genes assigned to the human PAR, none of the mouse homologs have been isolated by a cross-hybridization strategy. Two of these homologs, Csfgmra and Il3ra, have been isolated using a functional assay for the gene products. These genes are quite different in sequence from their human homologs, showing only 60-70% sequence similarity. The Csfgmra gene has been found to further differ from its human homolog in being located not on the sex chromosomes, but on a mouse autosome (chromosome 19). Using a mouse-hamster somatic cell hybrid mapping panel, we have mapped the Il3ra gene to yet another mouse autosome, chromosome 14. Attempts to clone the mouse homolog of the ANT3 locus resulted in the isolation of two related genes, Ant1 and Ant2, but failed to yield the Ant3 gene. Southern blot analysis of the ANT/Ant genes showed the Ant1 and Ant2 sequences to be well-conserved among all of a dozen mammals tested. In contrast, the ANT3 gene only showed hybridization to non-rodent mammals, suggesting it is either greatly divergent or has been deleted in the rodent lineage. Similar experiments with other human pseudoautosomal probes likewise showed a lack of hybridization to rodent sequences. The results show a definite trend of extensive divergence of pseudoautosomal sequences in addition to chromosomal rearrangements involving X;autosome translocations and perhaps gene deletions. Such observations have interesting implications regarding the evolution of this important region of the sex chromosomes.

  6. Solid phase sequencing of double-stranded nucleic acids

    DOEpatents

    Fu, Dong-Jing; Cantor, Charles R.; Koster, Hubert; Smith, Cassandra L.

    2002-01-01

    This invention relates to methods for detecting and sequencing of target double-stranded nucleic acid sequences, to nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include nucleic acids in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated determination of molecular weights and identification of the target sequence.
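
    How a sequence can follow from fragment molecular weights is easiest to see for a nested set of fragments that differ by one nucleotide each. The Python sketch below makes that assumption, which the abstract does not state, and uses approximate average masses for nucleotide residues within DNA; the mass values, tolerance, and function names are illustrative only.

```python
# Approximate average residue masses (Da) of nucleotides within a DNA chain (assumed values).
RESIDUE_MASS = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}


def ladder_masses(sequence, start_mass=0.0):
    """Masses of the nested fragments obtained by adding one nucleotide at a time."""
    masses, total = [], start_mass
    for base in sequence:
        total += RESIDUE_MASS[base]
        masses.append(round(total, 2))
    return masses


def decode_ladder(masses, start_mass=0.0, tolerance=0.5):
    """Infer the sequence from the mass differences between successive fragments."""
    sequence, previous = [], start_mass
    for mass in sorted(masses):
        delta = mass - previous
        base = min(RESIDUE_MASS, key=lambda b: abs(RESIDUE_MASS[b] - delta))
        if abs(RESIDUE_MASS[base] - delta) > tolerance:
            raise ValueError(f"mass step {delta:.2f} matches no nucleotide")
        sequence.append(base)
        previous = mass
    return "".join(sequence)


# Round trip on a hypothetical sequence.
decoded = decode_ladder(ladder_masses("GATTACA"))
print(decoded)
```

    The smallest difference between two residue masses in this table is about 9 Da (A versus T), which sets the mass resolution such a read-out would need.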

  7. A potent antimicrobial protein from onion seeds showing sequence homology to plant lipid transfer proteins.

    PubMed Central

    Cammue, B P; Thevissen, K; Hendriks, M; Eggermont, K; Goderis, I J; Proost, P; Van Damme, J; Osborn, R W; Guerbette, F; Kader, J C

    1995-01-01

    An antimicrobial protein of about 10 kD, called Ace-AMP1, was isolated from onion (Allium cepa L.) seeds. Based on the near-complete amino acid sequence of this protein, oligonucleotides were designed for polymerase chain reaction-based cloning of the corresponding cDNA. The mature protein is homologous to plant nonspecific lipid transfer proteins (nsLTPs), but it shares only 76% of the residues that are conserved among all known plant nsLTPs and is unusually rich in arginine. Ace-AMP1 inhibits all 12 tested plant pathogenic fungi at concentrations below 10 micrograms mL-1. Its antifungal activity is either unaffected or only weakly affected by the presence of different cations at concentrations approximating physiological ionic strength conditions. Ace-AMP1 is also active on two Gram-positive bacteria but is apparently not toxic for Gram-negative bacteria and cultured human cells. In contrast to nsLTPs such as those isolated from radish or maize seeds, Ace-AMP1 was unable to transfer phospholipids from liposomes to mitochondria. On the other hand, lipid transfer proteins from wheat and maize seeds showed little or no antimicrobial activity, whereas the radish lipid transfer protein displayed antifungal activity only in media with low cation concentrations. The relevance of these findings with regard to the function of nsLTPs is discussed. PMID:7480341

  8. PyMod: sequence similarity searches, multiple sequence-structure alignments, and homology modeling within PyMOL

    PubMed Central

    2012-01-01

    Background In recent years, an exponentially growing number of tools for protein sequence analysis, editing and modeling tasks have been put at the disposal of the scientific community. Although the vast majority of these tools have been released as open source software, their steep learning curves often discourage even the most experienced users. Results A simple and intuitive interface, PyMod, between the popular molecular graphics system PyMOL and several other tools (i.e., [PSI-]BLAST, ClustalW, MUSCLE, CEalign and MODELLER) has been developed, to show how the integration of the individual steps required for homology modeling and sequence/structure analysis within the PyMOL framework can hugely simplify these tasks. Sequence similarity searches, generation and editing of multiple sequence and structural alignments, and even the possibility to merge sequence and structure alignments have been implemented in PyMod, with the aim of creating a simple, yet powerful tool for sequence and structure analysis and building of homology models. Conclusions PyMod represents a new tool for the analysis and the manipulation of protein sequences and structures. The ease of use and the integration with many sequence retrieval and alignment tools and with PyMOL, one of the most used molecular visualization systems, are the key features of this tool. Source code, installation instructions, video tutorials and a user's guide are freely available at the URL http://schubert.bio.uniroma1.it/pymod/index.html PMID:22536966

  9. T-Coffee: a web server for the multiple sequence alignment of protein and RNA sequences using structural information and homology extension

    PubMed Central

    Di Tommaso, Paolo; Moretti, Sebastien; Xenarios, Ioannis; Orobitg, Miquel; Montanyola, Alberto; Chang, Jia-Ming; Taly, Jean-François; Notredame, Cedric

    2011-01-01

    This article introduces a new interface for T-Coffee, a consistency-based multiple sequence alignment program. This interface provides an easy and intuitive access to the most popular functionality of the package. These include the default T-Coffee mode for protein and nucleic acid sequences, the M-Coffee mode that allows combining the output of any other aligners, and template-based modes of T-Coffee that deliver high accuracy alignments while using structural or homology derived templates. These three available template modes are Expresso for the alignment of protein with a known 3D-Structure, R-Coffee to align RNA sequences with conserved secondary structures and PSI-Coffee to accurately align distantly related sequences using homology extension. The new server benefits from recent improvements of the T-Coffee algorithm and can align up to 150 sequences as long as 10 000 residues and is available from both http://www.tcoffee.org and its main mirror http://tcoffee.crg.cat. PMID:21558174

  10. EUGENE'HOM: A generic similarity-based gene finder using multiple homologous sequences.

    PubMed

    Foissac, Sylvain; Bardou, Philippe; Moisan, Annick; Cros, Marie-Josée; Schiex, Thomas

    2003-07-01

    EUGENE'HOM is a gene prediction software for eukaryotic organisms based on comparative analysis. EUGENE'HOM is able to take into account multiple homologous sequences from more or less closely related organisms. It integrates the results of TBLASTX analysis, splice site and start codon prediction and a robust coding/non-coding probabilistic model which allows EUGENE'HOM to handle sequences from a variety of organisms. The current target of EUGENE'HOM is plant sequences. The EUGENE'HOM web site is available at http://genopole.toulouse.inra.fr/bioinfo/eugene/EuGeneHom/cgi-bin/EuGeneHom.pl. PMID:12824408

  11. EUGÈNE'HOM: a generic similarity-based gene finder using multiple homologous sequences

    PubMed Central

    Foissac, Sylvain; Bardou, Philippe; Moisan, Annick; Cros, Marie-Josée; Schiex, Thomas

    2003-01-01

    EUGÈNE'HOM is a gene prediction software for eukaryotic organisms based on comparative analysis. EUGÈNE'HOM is able to take into account multiple homologous sequences from more or less closely related organisms. It integrates the results of TBLASTX analysis, splice site and start codon prediction and a robust coding/non-coding probabilistic model which allows EUGÈNE'HOM to handle sequences from a variety of organisms. The current target of EUGÈNE'HOM is plant sequences. The EUGÈNE'HOM web site is available at http://genopole.toulouse.inra.fr/bioinfo/eugene/EuGeneHom/cgi-bin/EuGeneHom.pl. PMID:12824408

  12. Characterization of RAD51-Independent Break-Induced Replication That Acts Preferentially with Short Homologous Sequences

    PubMed Central

    Ira, Grzegorz; Haber, James E.

    2002-01-01

    Repair of double-strand breaks by gene conversions between homologous sequences located on different Saccharomyces cerevisiae chromosomes or plasmids requires RAD51. When repair occurs between inverted repeats of the same plasmid, both RAD51-dependent and RAD51-independent repairs are found. Completion of RAD51-independent plasmid repair events requires RAD52, RAD50, RAD59, TID1 (RDH54), and SRS2 and appears to involve break-induced replication coupled to single-strand annealing. Surprisingly, RAD51-independent recombination requires much less homology (30 bp) for strand invasion than does RAD51-dependent repair (approximately 100 bp); in fact, the presence of Rad51p impairs recombination with short homology. The differences between the RAD51- and RAD50/RAD59-dependent pathways account for the distinct ways that two different recombination processes maintain yeast telomeres in the absence of telomerase. PMID:12192038

  13. Sequence analysis of the NgoPII methyltransferase gene from Neisseria gonorrhoeae P9: homologies with other enzymes recognizing the sequence 5'-GGCC-3'.

    PubMed Central

    Sullivan, K M; Saunders, J R

    1988-01-01

    Recombinant plasmids harbouring the functional M.NgoPII methyltransferase (specificity 5'-GGCC-3') were isolated from amplified gene libraries of gonococcal chromosomal DNA cloned in pBR322 and in Escherichia coli RR1. The M.NgoPII gene was localized by sub-cloning and the nucleotide sequence of a cloned 1.6 kb segment of Neisseria gonorrhoeae DNA harbouring the methylase gene was determined. These data, coupled with sub-cloning experiments and in vitro transcription-translation studies, indicate a theoretical size of 38.5 kd for the methylase protein. The predicted amino acid sequence of the methylase contains significant regions of homology with the projected sequences of other cytosine-modifying methylases, upon which the activity of these enzymes is likely to depend. PMID:2837733

  14. Cloning of human papilloma virus genomic DNAs and analysis of homologous polynucleotide sequences.

    PubMed

    Heilman, C A; Law, M F; Israel, M A; Howley, P M

    1980-11-01

    The complete DNA genomes of four distinct human papilloma viruses (human papilloma virus subtype 1a [HPV-1a], HPV-1b, HPV-2a, and HPV-4) were molecularly cloned in Escherichia coli, using the certified plasmid vector pBR322. The restriction endonuclease patterns of the cloned HPV-1a and HPV-1b DNAs were similar to those already published for uncloned DNAs. Physical maps were constructed for HPV-2a DNA and HPV-4 DNA, since these viral DNAs had not been previously mapped. By using the cloned DNAs, the genomes of HPV-1a, HPV-2a, and HPV-4 were analyzed for nucleotide sequence homology. Under standard hybridization conditions (Tm − 28 °C), no homology was detectable among the genomes of these papilloma viruses, in agreement with previous reports. However, under less stringent conditions (i.e., Tm − 50 °C), stable DNA hybrids could be detected between these viral DNAs, indicating homologous segments in the genomes with approximately 30% base mismatch. By using specific DNA fragments immobilized on nitrocellulose filters, these regions of homology were mapped. Hybridization experiments between radiolabeled bovine papilloma virus type 1 (BPV-1) DNA and the unlabeled HPV-1a, HPV-2a, or HPV-4 DNA restriction fragments under low-stringency conditions indicated that the regions of homology among the HPV DNAs are also conserved in the BPV-1 genome with approximately the same degree of base mismatch. PMID:6253665

  15. Detection of sequences homologous to human retroviral DNA in multiple sclerosis by gene amplification

    SciTech Connect

    Greenberg, S.J.; Ehrlich, G.D.; Abbott, M.A.; Hurwitz, B.J.; Waldmann, T.A.; Poiesz, B.J.

    1989-04-01

    Twenty-one patients with multiple sclerosis, chronic progressive type, were examined for DNA sequences homologous to a human retrovirus. Genomic DNA from peripheral blood mononuclear cells was analyzed for the presence of sequences homologous to the human T-cell leukemia/lymphoma virus type I (HTLV-I) long terminal repeat, 3′ gag, pol, and env domains by the enzymatic in vitro gene amplification technique, polymerase chain reaction. Positive identification of homologous pol sequences was made in the amplified DNA from six of these patients (29%). Three of these six patients (14%) also tested positive for the env region, but not for the other regions tested. In contrast, none of the samples from 35 normal individuals studied was positive when amplified and tested with the same primers and probes. Comparison of patterns obtained from controls and from patients with adult T-cell leukemia or tropical spastic paraparesis suggests that the DNA sequences identified are exogenous to the human genome and may correspond to a human retroviral species. The data support the detection of a human retroviral agent in some patients with multiple sclerosis.

  16. A Scalable Parallel Algorithm for Large-Scale Protein Sequence Homology Detection

    SciTech Connect

    Wu, Changjun; Kalyanaraman, Anantharaman; Cannon, William R.

    2010-09-13

    Protein sequence homology detection is a fundamental problem in computational molecular biology, with a pervasive application in nearly all analyses that aim to structurally and functionally characterize protein molecules. While detecting homology between two protein sequences is computationally inexpensive, detecting pairwise homology at a large scale becomes prohibitive, requiring millions of CPU hours. Yet, there is currently no efficient method available to parallelize this kernel. In this paper, we present the key characteristics that make this problem particularly hard to parallelize, and then propose a new parallel algorithm that is suited for large-scale protein sequence data. Our method, called pGraph, is designed using a hierarchical multiple-master multiple-worker model, where the processor space is partitioned into subgroups and the hierarchy helps ensure that the workload is distributed in a load-balanced fashion despite the inherent irregularity that may originate in the input. Experimental evaluation demonstrates that our method scales linearly on all input sizes tested (up to 640K sequences) on a 1,024-node supercomputer. In addition to demonstrating strong scaling, we present an extensive study of the various components of the system and related parametric studies.

  17. Sequence Conversion by Single Strand Oligonucleotide Donors via Non-homologous End Joining in Mammalian Cells*

    PubMed Central

    Liu, Jia; Majumdar, Alokes; Liu, Jilan; Thompson, Lawrence H.; Seidman, Michael M.

    2010-01-01

    Double strand breaks (DSBs) can be repaired by homology independent nonhomologous end joining (NHEJ) pathways involving proteins such as Ku70/80, DNAPKcs, Xrcc4/Ligase 4, and the Mre11/Rad50/Nbs1 (MRN) complex. DSBs can also be repaired by homology-dependent pathways (HDR), in which the MRN and CtIP nucleases produce single strand ends that engage homologous sequences either by strand invasion or strand annealing. The entry of ends into HDR pathways underlies protocols for genomic manipulation that combine site-specific DSBs with appropriate informational donors. Most strategies utilize long duplex donors that participate by strand invasion. Work in yeast indicates that single strand oligonucleotide (SSO) donors are also active, over considerable distance, via a single strand annealing pathway. We examined the activity of SSO donors in mammalian cells at DSBs induced either by a restriction nuclease or by a targeted interstrand cross-link. SSO donors were effective immediately adjacent to the break, but activity declined sharply beyond ∼100 nucleotides. Overexpression of the resection nuclease CtIP increased the frequency of SSO-mediated sequence modulation distal to the break site, but had no effect on the activity of an SSO donor adjacent to the break. Genetic and in vivo competition experiments showed that sequence conversion by SSOs in the immediate vicinity of the break was not by strand invasion or strand annealing pathways. Instead these donors competed for ends that would have otherwise entered NHEJ pathways. PMID:20489199

  18. Sequence, expression divergence, and complementation of homologous ALCATRAZ loci in Brassica napus.

    PubMed

    Hua, Shuijin; Shamsi, Imran Haider; Guo, Yuan; Pak, Haksong; Chen, Mingxun; Shi, Congguang; Meng, Huabing; Jiang, Lixi

    2009-08-01

    The genomic era provides new perspectives in understanding polyploidy evolution, mostly on the genome-wide scale. In this paper, we show the sequence and expression divergence between the homologous ALCATRAZ (ALC) loci in Brassica napus, responsible for silique dehiscence. We cloned two homologous ALC loci, namely BnaC.ALC.a and BnaA.ALC.a, in B. napus. Driven by the 35S promoter, both loci complemented the alc mutation of Arabidopsis thaliana, yet only the expression of BnaC.ALC.a was detectable in the siliques of B. napus. Sequence alignment indicated that BnaC.ALC.a and BolC.ALC.a, or BnaA.ALC.a and BraA.ALC.a, possess a high level of similarity. Understanding the sequence and expression divergence among homologous loci of a gene is important for effective gene manipulation and for TILLING (or ECOTILLING) analysis of allelic DNA variation at a given locus. PMID:19504267

  19. Expanding the nitrogen regulatory protein superfamily: Homology detection at below random sequence identity.

    PubMed

    Kinch, Lisa N; Grishin, Nick V

    2002-07-01

    Nitrogen regulatory (PII) proteins are signal transduction molecules involved in controlling nitrogen metabolism in prokaryotes. PII proteins integrate the signals of intracellular nitrogen and carbon status into the control of enzymes involved in nitrogen assimilation. Using elaborate sequence similarity detection schemes, we show that five clusters of orthologs (COGs) and several small divergent protein groups belong to the PII superfamily and predict their structure to be a (βαβ)2 ferredoxin-like fold. Proteins from the newly emerged PII superfamily are present in all major phylogenetic lineages. The PII homologs are quite diverse, with below random (as low as 1%) pairwise sequence identities between some members of distant groups. Despite this sequence diversity, evidence suggests that the different subfamilies retain the PII trimeric structure important for ligand-binding site formation and maintain a conservation of conservations at residue positions important for PII function. Because most of the orthologous groups within the PII superfamily are composed entirely of hypothetical proteins, our remote homology-based structure prediction provides the only information about them. Analogous to structural genomics efforts, such prediction gives clues to the biological roles of these proteins and allows us to hypothesize about locations of functional sites on model structures or rationalize about available experimental information. For instance, conserved residues in one of the families map in close proximity to each other on PII structure, allowing for a possible metal-binding site in the proteins coded by the locus known to affect sensitivity to divalent metal ions. The presented analysis pushes the limits of sequence similarity searches and exemplifies one of the extreme cases of reliable sequence-based structure prediction. In conjunction with structural genomics efforts to shed light on protein function, our strategies make it possible to detect

  20. Online homology modelling as a means of bridging the sequence-structure gap.

    PubMed

    Sheehan, David; O'Sullivan, Siobhán

    2011-01-01

    For even the best-studied species, there is a large gap between their representation in the Protein Data Bank (PDB) and their representation in sequence databases. Typically, less than 2% of sequences are represented in the PDB. This is partly due to the considerable experimental challenge and manual inputs required to solve three-dimensional structures by methods such as X-ray diffraction and multi-dimensional nuclear magnetic resonance (NMR) spectroscopy in comparison to high-throughput sequencing. This gap is made even wider by the high level of redundancy within the PDB and under-representation of some protein categories such as membrane-associated proteins, which comprise approximately 25% of proteins encoded in genomes. A traditional route to closing the sequence-structure gap is offered by homology modelling, whereby the sequence of a target protein is modelled on a template represented in the PDB using in silico energy minimisation approaches. More recently, online homology servers have become available which automatically generate models from proffered sequences. However, many online servers give little indication of the structural plausibility of the generated model. In this paper, the online homology server Geno3D will be described. This server uses similar software to that used in modelling structures during structure determination and thus generates data allowing determination of the structural plausibility of models. For illustration, modelling of a chemotaxis protein (CheY) from Pseudomonas entomophila L48 (accession YP_609298) on a template (PDB id. 1mvo), the phosphorylation domain of an outer membrane protein PhoP from Bacillus subtilis, will be described. PMID:22064508

  1. Rapid and accurate identification of microorganisms contaminating cosmetic products based on DNA sequence homology.

    PubMed

    Fujita, Y; Shibayama, H; Suzuki, Y; Karita, S; Takamatsu, S

    2005-12-01

    The aim of this study was to develop rapid and accurate procedures to identify microorganisms contaminating cosmetic products, based on the identity of the nucleotide sequences of the internal transcribed spacer (ITS) region of the ribosomal RNA coding DNA (rDNA). Five types of microorganisms were isolated from the inner portion of lotion bottle caps, skin care lotions, and cleansing gels. The rDNA ITS region of the microorganisms was amplified by colony-direct PCR or by conventional PCR using DNA extracts as templates. The nucleotide sequences of the amplified DNA were determined and subjected to a homology search of a publicly available DNA database. For all five organisms analyzed, the databases returned DNA sequences with high similarity to the query sequences. The traditional identification procedure requires expert skills and approximately 1 month to identify the microorganisms. In contrast, 3-7 days were sufficient to complete all the procedures employed in the current method, including isolation and cultivation of organisms, DNA sequencing, and the database homology search. Moreover, it was possible to develop the skills necessary to perform the molecular techniques required for the identification procedures within 1 week. Consequently, the current method is useful for rapid and accurate identification of microorganisms contaminating cosmetics. PMID:18492168

  2. Incorporation of partial polyhedrin homology sequences (PPHS) enhances the production of cloned foreign genes in a baculovirus expression system.

    PubMed

    Gong, Zhaohui; Jin, Yongfeng; Zhang, Yaozhou

    2006-03-01

    Baculovirus expression vector systems (BEVSs) have been used extensively for high-level expression of cloned foreign genes. In many instances, the levels of recombinant protein(s) produced in insect cells and larvae are insufficient for experimental purposes. Thus new techniques and methods are needed to increase significantly the protein expression levels in BEVS. In the present paper, we describe the incorporation of a 15 bp element derived from the 5'-end partial sequence of the polyhedrin gene, which contains the non-coding sequence ATAAAT and the coding sequence ATGCCGAAT, into the 5'-end of the CTB (cholera toxin B subunit)-INS (insulin) fusion gene. With the addition of the PPHS (partial polyhedrin homology sequences), two extra amino acids (Pro-Asn) were added to the N-terminus of the mCTB-INS (modified CTB-INS) fusion protein. This new fusion protein was expressed in both insect cells and larvae using BEVSs. We found that the addition of PPHS enhanced 4-fold the expression of CTB-INS in both insect cells and larvae. Further analysis revealed that the additional two amino acids in mCTB-INS did not significantly affect binding affinity for G(M1) ganglioside. Therefore the PPHS can be used as a constitutive element immediately downstream of the polyhedrin promoter to induce significant increases in the expression levels of cloned foreign genes. PMID:16313236
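
    As a quick check of the arithmetic in this abstract, the following minimal Python sketch joins the two parts of the element and translates the coding part. The two sequences are taken from the abstract; the function name, the variable names and the three-entry codon table are illustrative assumptions and not code from the paper.

```python
# Only the three codons that occur in the element (standard genetic code).
CODON_TABLE = {"ATG": "Met", "CCG": "Pro", "AAT": "Asn"}

NONCODING = "ATAAAT"   # non-coding part of the element, as given in the abstract
CODING = "ATGCCGAAT"   # coding part of the element, as given in the abstract


def translate(dna):
    """Translate an in-frame DNA string codon by codon."""
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3)]


pphs = NONCODING + CODING
print(len(pphs))          # total length of the element in bp
print(translate(CODING))  # residues encoded by the coding part
```

    The element is 15 bp long and its coding part reads Met-Pro-Asn, which is consistent with the two extra residues (Pro-Asn) reported at the N-terminus of the modified fusion protein.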

  3. Matrix genes of measles virus and canine distemper virus: cloning, nucleotide sequences, and deduced amino acid sequences.

    PubMed Central

    Bellini, W J; Englund, G; Richardson, C D; Rozenblatt, S; Lazzarini, R A

    1986-01-01

    The nucleotide sequences encoding the matrix (M) proteins of measles virus (MV) and canine distemper virus (CDV) were determined from cDNA clones containing these genes in their entirety. In both cases, single open reading frames specifying basic proteins of 335 amino acid residues were predicted from the nucleotide sequences. Both viral messages were composed of approximately 1,450 nucleotides and contained 400 nucleotides of presumptive noncoding sequences at their respective 3' ends. MV and CDV M-protein-coding regions were 67% homologous at the nucleotide level and 76% homologous at the amino acid level. Only chance homology was observed in the 400-nucleotide trailer sequences. Comparisons of the M protein sequences of MV and CDV with the sequence reported for Sendai virus (B. M. Blumberg, K. Rose, M. G. Simona, L. Roux, C. Giorgi, and D. Kolakofsky, J. Virol. 52:656-663; Y. Hidaka, T. Kanda, K. Iwasaki, A. Nomoto, T. Shioda, and H. Shibuta, Nucleic Acids Res. 12:7965-7973) indicated the greatest homology among these M proteins in the carboxy-terminal third of the molecule. Secondary-structure analyses of this shared region indicated a structurally conserved, hydrophobic sequence which possibly interacted with the lipid bilayer. PMID:3754588

  4. Facile Formation of β-Hydroxyboronate Esters by a Cu-Catalyzed Diboration/Matteson Homologation Sequence

    PubMed Central

    2015-01-01

    The copper-catalyzed diboration of aldehydes was used in conjunction with the Matteson homologation, providing the efficient synthesis of β-hydroxyboronate esters. The oxygen-bound boronate ester was found to play a key role in mediating the homologation reaction, which was compared to the α-hydroxyboronate ester (isolated hydrolysis product). The synthetic utility of the diboration/homologation sequence was demonstrated through the oxidation of one product to provide a 1,2-diol. PMID:25412356

  5. Nucleotide Sequence of the Envelope Gene of Gardner-Arnstein Feline Leukemia Virus B Reveals Unique Sequence Homologies with a Murine Mink Cell Focus-Forming Virus †

    PubMed Central

    Elder, John H.; Mullins, James I.

    1983-01-01

    The nucleotide sequence of the envelope gene and the adjacent 3′ long terminal repeat (LTR) of Gardner-Arnstein feline leukemia virus of subgroup B (GA-FeLV-B) has been determined. Comparison of the derived amino acid sequence of the gp70-p15E polyprotein to those of several previously reported murine retroviruses revealed striking homologies between GA-FeLV-B gp70 and the gp70 of a Moloney virus-derived mink cell focus-forming virus. These homologies were located within the substituted (presumably xenotropic) portion of the mink cell focus-forming virus envelope gene and comprised amino acid sequences not present in three ecotropic virus gp70s. In addition, areas of insertions and deletions, in general, were the same between GA-FeLV-B and Moloney mink cell focus-forming virus, although the sizes of the insertions and deletions differed. Homologies between GA-FeLV-B and mink cell focus-forming virus gp70s are functionally significant in that they both possess expanded host ranges, a property dictated by gp70. The amino acid sequence of FeLV-B contains 12 Asn-X-Ser/Thr sequences, indicating 12 possible sites of N-linked glycosylation as compared with 7 or 8 for its murine counterparts. Comparison of the 3′ LTR of GA-FeLV-B to AKR and Moloney virus LTRs revealed extensive conservation in several regions including the “CCAAT” and Goldberg-Hogness (TATA) boxes thought to be involved in promotion of transcription and in the repeat region of the LTR. The inverted repeats that flanked the LTR of GA-FeLV-B were identical to the murine inverted repeats, but were one base longer than the latter. The region of U3 corresponding to the approximately 75-nucleotide “enhancer sequence” is present in GA-FeLV-B, but contains deletions relative to AKR and Moloney virus and is not repeated. An interesting palindrome in the repeat region immediately 3′ to the U3 region was noted in all the LTRs, but was particularly pronounced in GA-FeLV-B. Possible roles for this
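
    Counting Asn-X-Ser/Thr sequences, as done for the glycosylation sites in this abstract, can be illustrated with a short Python sketch. The peptide below is an invented toy string and not FeLV-B sequence, and the function name is hypothetical. The pattern follows the abstract's wording, with X standing for any residue. A zero-width lookahead is used so that overlapping motifs are all reported.

```python
import re

# Asn, then any residue, then Ser or Thr (one-letter codes).
# The lookahead makes each match zero-width, so overlapping motifs are found.
SEQUON = re.compile(r"(?=N[A-Z][ST])")


def find_sequons(protein):
    """Return 0-based start positions of Asn-X-Ser/Thr motifs."""
    return [m.start() for m in SEQUON.finditer(protein)]


toy_peptide = "MNGSANPTQNNTS"  # invented example sequence
sites = find_sequons(toy_peptide)
print(sites, len(sites))
```

    In the toy peptide the motifs starting at positions 9 and 10 overlap, which is why a plain non-overlapping search would undercount.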

  6. Meiotic recombination at the Lmp2 hotspot tolerates minor sequence divergence between homologous chromosomes

    SciTech Connect

    Yoshino, Masayasu; Sagai, Tomoko; Shiroishi, Toshihiko

    1996-06-01

    Recombination is widely considered to depend linearly on the length of the homologous sequences. An 11% mismatch decreases the rate of phage-plasmid recombination 240-fold. Two single nucleotide mismatches, which reduce the longest uninterrupted stretch of similarity from 232 base pairs (bp) to 134 bp, reduce gene conversion in mouse L cells 20-fold. The efficiency of gene targeting through homologous recombination in mouse embryonic stem cells can be increased by using an isogenic, rather than a non-isogenic, DNA construct. In this study we asked whether a high degree of sequence identity between homologous mouse chromosomes enhances meiotic recombination at a hotspot. Sites of meiotic recombination in the mouse major histocompatibility complex (MHC) class II region are not randomly distributed but are almost all clustered within short segments known as recombinational hotspots. The wm7 MHC haplotype, derived from Japanese wild mice (Mus musculus molossinus), enhances meiotic recombination at a hotspot near the Lmp2 gene. Heterozygotes between the wm7 haplotype and the b or k haplotypes have yielded a high frequency of recombination (2.1%) in a 1.3-kilobase (kb) segment of this hotspot. 20 refs., 2 figs.
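
    The notion of a "longest uninterrupted stretch of similarity" used above can be made concrete with a minimal Python sketch. The 232 bp and 134 bp figures come from the abstract; the sequences, the mismatch positions (134 and 200) and the function names are invented for illustration and do not reproduce the cited experiment.

```python
def longest_identical_run(seq_a, seq_b):
    """Length of the longest run of consecutive identical positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length")
    best = run = 0
    for x, y in zip(seq_a, seq_b):
        run = run + 1 if x == y else 0  # a mismatch resets the current run
        best = max(best, run)
    return best


def substitute(seq, position):
    """Return seq with the base at `position` replaced by a different base."""
    replacement = "C" if seq[position] != "C" else "G"
    return seq[:position] + replacement + seq[position + 1:]


donor = "ACGT" * 58  # 232 bp toy sequence
# Two single-nucleotide mismatches at hypothetical positions.
recipient = substitute(substitute(donor, 134), 200)

print(longest_identical_run(donor, donor))      # perfectly matched pair
print(longest_identical_run(donor, recipient))  # pair with two mismatches
```

    With these toy positions, two single-base changes leave 230 of 232 positions identical yet cut the longest uninterrupted stretch from 232 bp to 134 bp, which is the kind of quantity the abstract relates to the drop in gene conversion.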

  7. Nucleotide sequence of the Salmonella typhimurium mutS gene required for mismatch repair: homology of MutS and HexA of Streptococcus pneumoniae.

    PubMed Central

    Haber, L T; Pang, P P; Sobell, D I; Mankovich, J A; Walker, G C

    1988-01-01

    The mutS gene product of Escherichia coli and Salmonella typhimurium is one of at least four proteins required for methyl-directed mismatch repair in these organisms. A functionally similar repair system in Streptococcus pneumoniae requires the hex genes. We have sequenced the S. typhimurium mutS gene, showing that it encodes a 96-kilodalton protein. Amino-terminal amino acid sequencing of purified S. typhimurium MutS protein confirmed the initial portion of the deduced amino acid sequence. The S. typhimurium MutS protein is homologous to the S. pneumoniae HexA protein, suggesting that they arose from a common ancestor before the gram-negative and gram-positive bacteria diverged. Overall, approximately 36% of the amino acids of the two proteins are identical when the sequences are optimally aligned, including regions of stronger homology which are of particular interest. One such region is close to the amino terminus. Another, located closer to the carboxy terminus, includes homology to a consensus sequence thought to be diagnostic of nucleotide-binding sites. A third one, adjacent to the second, is homologous to the consensus sequence for the helix-turn-helix motif found in many DNA-binding proteins. We found that the S. typhimurium MutS protein can substitute for the E. coli MutS protein in vitro as it can in vivo, but we have not yet been able to demonstrate a similar in vitro complementation by the S. pneumoniae HexA protein. PMID:3275609
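
    A figure such as "approximately 36% of the amino acids ... are identical when the sequences are optimally aligned" is computed from a pairwise alignment. The following minimal Python sketch shows one common way to do so, assuming identity is counted over aligned columns in which neither sequence has a gap; the abstract does not state which denominator was used. The toy alignment and the function name are invented and are not MutS or HexA data.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be the same length")
    compared = matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns containing a gap
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Invented toy alignment: 6 gap-free columns, 5 of them identical.
aln_a = "MKT-LLVA"
aln_b = "MRTALL-A"
print(round(percent_identity(aln_a, aln_b), 1))
```

    Other conventions divide by the full alignment length or by the length of the shorter sequence, so reported identities are only comparable when the same convention is used.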

  8. Sequence homology and immunologic cross-reactivity of human cytomegalovirus with HLA-DR beta chain: a means for graft rejection and immunosuppression.

    PubMed Central

    Fujinami, R S; Nelson, J A; Walker, L; Oldstone, M B

    1988-01-01

    A peptide (Leu-Gly-Arg-Pro-Asp-Glu-Asp-Ser-Ser-Ser-Ser-Ser-Ser-Ser-Cys) that was identical to residues 82 through 96 of a predicted protein of 208 amino acids from the immediate-early region (IE-2) nucleic acid sequence of human cytomegalovirus was chemically synthesized. By computer analysis, the first five amino acids of this peptide showed sequence homology to the beta chain of the human histocompatibility complex HLA-DR. The homologous amino acids, 53 through 57, were located in a region that is conserved between the human DR beta chain and the beta chain of the H-2 class II histocompatibility antigen for mice. The shared regions of the IE-2 protein and the DR beta chain were similar in both hydrophilicity and predicted beta-turn potential. The IE-2 viral peptide induced antibodies that specifically recognized the human DR beta chain. These observations describe a protein encoded by the IE-2 region of human cytomegalovirus that contains sequence homology and shows immunologic cross-reactivity with a conserved domain of HLA-DR and suggest a mechanism to explain how human cytomegalovirus infection contributes to graft rejection after transplantation. PMID:2446012

  9. Mining Novel Allergens from Coconut Pollen Employing Manual De Novo Sequencing and Homology-Driven Proteomics.

    PubMed

    Saha, Bodhisattwa; Sircar, Gaurab; Pandey, Naren; Gupta Bhattacharya, Swati

    2015-11-01

    Coconut pollen, one of the major palm pollen grains, is an important constituent among vectors of inhalant allergens in India and a major sensitizer for respiratory allergy in susceptible patients. To gain insight into its allergenic components, pollen proteins were analyzed by two-dimensional electrophoresis and immunoblotted with sera from coconut pollen-sensitive patients, followed by mass spectrometry of IgE-reactive proteins. Because coconut is largely unsequenced, a proteomic workflow was devised that combines the conventional database-dependent analysis of tandem mass spectral data and manual de novo sequencing followed by a homology-based search for identifying the allergenic proteins. N-terminal acetylation helped to distinguish "b" ions from others, facilitating reliable sequencing. This led to the identification of 12 allergenic proteins. Cluster analysis with individual patient sera recognized vicilin-like protein as a major allergen, which was purified to assess its in vitro allergenicity and then partially sequenced. Other IgE-sensitive spots showed significant homology with well-known allergenic proteins such as 11S globulin, enolase, and isoflavone reductase, along with a few that are reported as novel allergens. The allergens identified can be used as potential candidates to develop hypoallergenic vaccines, to design specific immunotherapy trials, and to enrich the repertoire of existing IgE-reactive proteins. PMID:26426307

  10. SVM-BALSA: Remote Homology Detection based on Bayesian Sequence Alignment

    SciTech Connect

    Webb-Robertson, Bobbie-Jo M.; Oehmen, Chris S.; Matzke, Melissa M.

    2005-11-10

    Using biopolymer sequence comparison methods to identify evolutionarily related proteins is one of the most common tasks in bioinformatics. Recently, support vector machines (SVMs) utilizing statistical learning theory have been employed in the problem of remote homology detection and shown to outperform iterative profile methods such as PSI-BLAST. In this study we demonstrate that using a Bayesian alignment score, which accounts for the uncertainty of all possible alignments, in the SVM construction improves sensitivity compared to the traditional dynamic programming implementation.

  11. Chip-based sequencing nucleic acids

    DOEpatents

    Beer, Neil Reginald

    2014-08-26

    A system for fast DNA sequencing by amplification of genetic material within microreactors, denaturing, demulsifying, and then sequencing the material, while retaining it in a PCR/sequencing zone by a magnetic field. One embodiment includes sequencing nucleic acids on a microchip that includes a microchannel flow channel in the microchip. The nucleic acids are isolated and hybridized to magnetic nanoparticles or to magnetic polystyrene-coated beads. Microreactor droplets are formed in the microchannel flow channel. The microreactor droplets containing the nucleic acids and the magnetic nanoparticles are retained in a magnetic trap in the microchannel flow channel and sequenced.

  12. G-quadruplex formation between G-rich PNA and homologous sequences in oligonucleotides and supercoiled plasmid DNA.

    PubMed

    Gaynutdinov, Timur I; Englund, Ethan A; Appella, Daniel H; Onyshchenko, Mykola I; Neumann, Ronald D; Panyutin, Igor G

    2015-04-01

    Guanine (G)-rich DNA sequences can adopt four-stranded quadruplex conformations that may play a role in the regulation of genetic processes. To explore the possibility of targeted molecular recognition of DNA sequences with short G-rich peptide nucleic acids (PNA) and to assess the strand arrangement in such complexes, we used PNA and DNA with the Oxytricha nova telomeric sequence d(G4T4G4) as a model. PNA probes were complexed with DNA targets in the following forms: single-stranded oligonucleotides, a loop of DNA in a hairpin conformation, and as supercoiled plasmid with the (G4T4G4)/(C4A4C4) insert. Gel-shift mobility assays demonstrated formation of stable hybrid complexes between the homologous G4T4G4 PNA and DNA with multiple modes of binding. Chemical and enzymatic probing revealed sequence-specific and G-quadruplex dependent binding of G4T4G4 PNA to dsDNA. Spectroscopic and electrophoretic analysis of the complex formed between PNA and the synthetic DNA hairpin containing the G4T4G4 loop showed that the stoichiometry of a prevailing complex is three PNA strands per one DNA strand. We speculate how this new PNA-DNA complex architecture can help to design more selective, quadruplex-specific PNA probes. PMID:25650982
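
    The relationship between the two strands of the (G4T4G4)/(C4A4C4) insert mentioned above can be verified with a minimal Python sketch: the C-rich strand is the reverse complement of the G-rich strand. The function and variable names are illustrative assumptions; only the two motifs come from the abstract.

```python
# Translation table for Watson-Crick base pairing in DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (read 5' to 3')."""
    return dna.translate(COMPLEMENT)[::-1]


g_strand = "G" * 4 + "T" * 4 + "G" * 4  # d(G4T4G4)
c_strand = "C" * 4 + "A" * 4 + "C" * 4  # d(C4A4C4)

print(g_strand, reverse_complement(g_strand))
print(reverse_complement(g_strand) == c_strand)
```

    The comparison evaluates to true, confirming that d(C4A4C4) is the complementary strand that pairs with d(G4T4G4) in the duplex insert.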

  13. Intrachromosomal recombination between well-separated, homologous sequences in mammalian cells.

    PubMed Central

    Baker, M D; Read, L R; Ng, P; Beatty, B G

    1999-01-01

    In the present study, we investigated intrachromosomal homologous recombination in a murine hybridoma in which the recipient for recombination, the haploid, endogenous chromosomal immunoglobulin mu-gene bearing a mutation in the constant (Cmu) region, was separated from the integrated single copy wild-type donor Cmu region by approximately 1 Mb along the hybridoma chromosome. Homologous recombination between the donor and recipient Cmu region occurred with high frequency, correcting the mutant chromosomal mu-gene in the hybridoma. This enabled recombinant hybridomas to synthesize normal IgM and to be detected as plaque-forming cells (PFC). Characterization of the recombinants revealed that they could be placed into three distinct classes. The generation of the class I recombinants was consistent with a simple unequal sister chromatid exchange (USCE) between the donor and recipient Cmu region, as they contained the three Cmu-bearing fragments expected from this recombination, the original donor Cmu region along with both products of the single reciprocal crossover. However, a simple mechanism of homologous recombination was not sufficient in explaining the more complex Cmu region structures characterizing the class II and class III recombinants. To explain these recombinants, a model is proposed in which unequal pairing between the donor and recipient Cmu regions located on sister chromatids resulted in two crossover events. One crossover resulted in the deletion of sequences from one chromatid forming a DNA circle, which then integrated into the sister chromatid by a second reciprocal crossover. PMID:10353910

  14. GPU-Acceleration of Sequence Homology Searches with Database Subsequence Clustering.

    PubMed

    Suzuki, Shuji; Kakuta, Masanori; Ishida, Takashi; Akiyama, Yutaka

    2016-01-01

    Sequence homology searches are used in various fields and require large amounts of computation time, especially for metagenomic analysis, owing to the large number of queries and the database size. To accelerate computing analyses, graphics processing units (GPUs) are widely used as a low-cost, high-performance computing platform. Therefore, we mapped the time-consuming steps involved in GHOSTZ, which is a state-of-the-art homology search algorithm for protein sequences, onto a GPU and implemented it as GHOSTZ-GPU. In addition, we optimized memory access for GPU calculations and for communication between the CPU and GPU. As per results of the evaluation test involving metagenomic data, GHOSTZ-GPU with 12 CPU threads and 1 GPU was approximately 3.0- to 4.1-fold faster than GHOSTZ with 12 CPU threads. Moreover, GHOSTZ-GPU with 12 CPU threads and 3 GPUs was approximately 5.8- to 7.7-fold faster than GHOSTZ with 12 CPU threads. PMID:27482905

  15. GPU-Acceleration of Sequence Homology Searches with Database Subsequence Clustering

    PubMed Central

    Suzuki, Shuji; Kakuta, Masanori; Ishida, Takashi; Akiyama, Yutaka

    2016-01-01

    Sequence homology searches are used in various fields and require large amounts of computation time, especially for metagenomic analysis, owing to the large number of queries and the database size. To accelerate computing analyses, graphics processing units (GPUs) are widely used as a low-cost, high-performance computing platform. Therefore, we mapped the time-consuming steps involved in GHOSTZ, which is a state-of-the-art homology search algorithm for protein sequences, onto a GPU and implemented it as GHOSTZ-GPU. In addition, we optimized memory access for GPU calculations and for communication between the CPU and GPU. As per results of the evaluation test involving metagenomic data, GHOSTZ-GPU with 12 CPU threads and 1 GPU was approximately 3.0- to 4.1-fold faster than GHOSTZ with 12 CPU threads. Moreover, GHOSTZ-GPU with 12 CPU threads and 3 GPUs was approximately 5.8- to 7.7-fold faster than GHOSTZ with 12 CPU threads. PMID:27482905

  16. Distinguishing Proteins From Arbitrary Amino Acid Sequences

    PubMed Central

    Yau, Stephen S.-T.; Mao, Wei-Guang; Benson, Max; He, Rong Lucy

    2015-01-01

    What kinds of amino acid sequences could possibly be protein sequences? From all existing databases that we can find, known proteins are only a small fraction of all possible combinations of amino acids. Beginning with Sanger's first detailed determination of a protein sequence in 1952, previous studies have focused on describing the structure of existing protein sequences in order to construct the protein universe. No one, however, has developed a criterion for determining whether an arbitrary amino acid sequence can be a protein. Here we show that when the collection of arbitrary amino acid sequences is viewed in an appropriate geometric context, the protein sequences cluster together. This leads to a new computational test, described here, that has proved to be remarkably accurate at determining whether an arbitrary amino acid sequence can be a protein. Even more, if the results of this test indicate that the sequence can be a protein, and it is indeed a protein sequence, then its identity as a protein sequence is uniquely defined. We anticipate our computational test will be useful for those who are attempting to complete the job of discovering all proteins, or constructing the protein universe. PMID:25609314

  17. Critical assessment of sequence-based protein-protein interaction prediction methods that do not require homologous protein sequences

    PubMed Central

    2009-01-01

    Background: Protein-protein interactions underlie many important biological processes. Computational prediction methods can nicely complement experimental approaches for identifying protein-protein interactions. Recently, a unique category of sequence-based prediction methods has been put forward, unique in the sense that it does not require homologous protein sequences. This enables it to be universally applicable to all protein sequences, unlike many previous sequence-based prediction methods. If effective as claimed, these new sequence-based, universally applicable prediction methods would have far-reaching utility in many areas of biology research. Results: Upon close survey, I realized that many of these new methods were ill-tested. In addition, newer methods were often published without performance comparison with previous ones. Thus, it is not clear how good they are and whether there are significant performance differences among them. In this study, I have implemented and thoroughly tested 4 different methods on large-scale, non-redundant data sets. It reveals several important points. First, significant performance differences are noted among different methods. Second, data sets typically used for training prediction methods appear significantly biased, limiting the general applicability of prediction methods trained with them. Third, there is still ample room for further developments. In addition, my analysis illustrates the importance of complementary performance measures coupled with right-sized data sets for meaningful benchmark tests. Conclusions: The current study reveals the potentials and limits of the new category of sequence-based protein-protein interaction prediction methods, which in turn provides a firm ground for future endeavours in this important area of contemporary bioinformatics. PMID:20003442

  18. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-05-30

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
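
    As a rough illustration of the deduction step described above, the sketch below recovers a template sequence from the temporal order of incorporated bases. It assumes plain A/T/G/C calls with no errors; the function name `template_from_incorporations` and the example events are hypothetical and are not taken from the patent. Because the growing strand is synthesized 5'->3' while the template is read 3'->5', the template is the reverse complement of the incorporation order.

```python
# Hypothetical illustration only: names and example data are assumptions,
# not part of the patented method.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def template_from_incorporations(incorporated):
    """Return the template strand (written 5'->3') implied by base additions.

    `incorporated` holds the bases in the order they were added to the
    growing strand. Each added base pairs with one template base, and the
    template is traversed 3'->5', so the result is the reverse complement.
    """
    template_3_to_5 = [COMPLEMENT[base] for base in incorporated]
    return "".join(reversed(template_3_to_5))


if __name__ == "__main__":
    # Bases observed in temporal order during polymerization (made-up data).
    events = ["A", "T", "G", "C", "C"]
    print(template_from_incorporations(events))  # prints GGCAT
```

    A real instrument would also have to handle missed or spurious incorporation events, which this sketch ignores.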

  19. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-06-06

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  20. Detecting sequence homology at the gene cluster level with MultiGeneBlast.

    PubMed

    Medema, Marnix H; Takano, Eriko; Breitling, Rainer

    2013-05-01

    The genes encoding many biomolecular systems and pathways are genomically organized in operons or gene clusters. With MultiGeneBlast, we provide a user-friendly and effective tool to perform homology searches with operons or gene clusters as basic units, instead of single genes. The contextualization offered by MultiGeneBlast allows users to get a better understanding of the function, evolutionary history, and practical applications of such genomic regions. The tool is fully equipped with applications to generate search databases from GenBank or from the user's own sequence data. Finally, an architecture search mode allows searching for gene clusters with novel configurations, by detecting genomic regions with any user-specified combination of genes. Sources, precompiled binaries, and a graphical tutorial of MultiGeneBlast are freely available from http://multigeneblast.sourceforge.net/. PMID:23412913

  1. Characterization of a small family (CAIII) of microsatellite-containing sequences with X-Y homology.

    PubMed

    Malaspina, P; Ciminelli, B M; Viggiano, L; Jodice, C; Cruciani, F; Santolamazza, P; Sellitto, D; Scozzari, R; Terrenato, L; Rocchi, M; Novelletto, A

    1997-06-01

    Four X-linked loci showing homology with a previously described Y-linked polymorphic locus (DYS413) were identified and characterized. By fluorescent in situ hybridization (FISH), somatic cell hybrids, and YAC screening, the X-linked members of this small family of sequences (CAIII) all map in Xp22, while the Y members map in Yq11. These loci contribute to the overall similarity of the two genomic regions. All of the CAIII loci contain an internal microsatellite of the (CA)n type. The microsatellites display extensive length polymorphism in two of the X-linked members as well as in the Y members. In addition, common sequence variants are found in the portions flanking the microsatellites in two of the X-linked members. Our results indicate that, during the evolution of this family, length variation on the Y chromosome was accumulated at a rate not slower than that on the X chromosome. Finally, these sequences represent a model system with which to analyze human populations for similar X- and Y-linked polymorphisms. PMID:9169558

  2. Formation of large palindromic DNA by homologous recombination of short inverted repeat sequences in Saccharomyces cerevisiae.

    PubMed Central

    Butler, David K; Gillespie, David; Steele, Brandi

    2002-01-01

    Large DNA palindromes form sporadically in many eukaryotic and prokaryotic genomes and are often associated with amplified genes. The presence of a short inverted repeat sequence near a DNA double-strand break has been implicated in the formation of large palindromes in a variety of organisms. Previously we have established that in Saccharomyces cerevisiae a linear DNA palindrome is efficiently formed from a single-copy circular plasmid when a DNA double-strand break is introduced next to a short inverted repeat sequence. In this study we address whether the linear palindromes form by an intermolecular reaction (that is, a reaction between two identical fragments in a head-to-head arrangement) or by an unusual intramolecular reaction, as it apparently does in other examples of palindrome formation. Our evidence supports a model in which palindromes are primarily formed by an intermolecular reaction involving homologous recombination of short inverted repeat sequences. We have also extended our investigation into the requirement for DNA double-strand break repair genes in palindrome formation. We have found that a deletion of the RAD52 gene significantly reduces palindrome formation by intermolecular recombination and that deletions of two other genes in the RAD52-epistasis group (RAD51 and MRE11) have little or no effect on palindrome formation. In addition, palindrome formation is dramatically reduced by a deletion of the nucleotide excision repair gene RAD1. PMID:12136011

  3. Mutant gene phenotypes mediated by a Drosophila melanogaster retrotransposon require sequences homologous to mammalian enhancers.

    PubMed Central

    Geyer, P K; Green, M M; Corces, V G

    1988-01-01

    We have analyzed the molecular structure of phenotypic revertants of gypsy-induced mutations to understand the molecular mechanisms by which this retrotransposon causes mutant phenotypes in Drosophila melanogaster. The independent partial revertants analyzed are caused by the insertion of different transposons into the same region of gypsy. One partial revertant of the yellow allele y2 arose as a consequence of the insertion of the jockey mobile element into gypsy sequences, whereas a second incomplete revertant is due to the insertion of the hobo transposon. In addition, a previously isolated partial revertant of the Hairy-wing allele Hw1 resulted from the integration of the BS transposable element into the same gypsy sequences. The region affected by the insertion of the three transposons contains 12 copies of a repeated motif that shows striking homology to mammalian transcriptional enhancers. Our results suggest that these sequences, which might be involved in the transcriptional control of the gypsy element, are also responsible for the induction of mutant phenotypes by this retrotransposon. PMID:2847167

  4. Nucleotide sequence of the Porphyromonas gingivalis W83 recA homolog and construction of a recA-deficient mutant.

    PubMed Central

    Fletcher, H M; Morgan, R M; Macrina, F L

    1997-01-01

    Degenerate oligonucleotide primers were used in PCR to amplify a region of the recA homolog from Porphyromonas gingivalis W83. The resulting PCR fragment was used as a probe to identify a recombinant lambda DASH phage (L10) carrying the P. gingivalis recA homolog. The recA homolog was localized to a 2.1-kb BamHI fragment. The nucleotide sequence of this 2.1-kb fragment was determined, and a 1.02-kb open reading frame (341 amino acids) was detected. The predicted amino acid sequence was strikingly similar (90% identical residues) to the RecA protein from Bacteroides fragilis. No SOS box, characteristic of LexA-regulated promoters, was found in the 5' upstream region of the P. gingivalis recA homolog. In both methyl methanesulfonate and UV survival experiments the recA homolog from P. gingivalis complemented the recA mutation of Escherichia coli HB101. The cloned P. gingivalis recA gene was insertionally inactivated with the ermF-ermAM antibiotic resistance cassette to create a recA-deficient mutant (FLL33) by allelic exchange. The recA-deficient mutant was significantly more sensitive to UV irradiation than the wild-type strain, W83. W83 and FLL33 showed the same level of virulence in in vivo experiments using a mouse model. These results suggest that the recA gene in P. gingivalis W83 plays the expected role of repairing DNA damage caused by UV irradiation. However, inactivation of this gene did not alter the virulence of P. gingivalis in the mouse model. PMID:9353038

  5. The Chinese hamster Alu-equivalent sequence: a conserved highly repetitious, interspersed deoxyribonucleic acid sequence in mammals has a structure suggestive of a transposable element.

    PubMed Central

    Haynes, S R; Toomey, T P; Leinwand, L; Jelinek, W R

    1981-01-01

    A consensus sequence has been determined for a major interspersed deoxyribonucleic acid repeat in the genome of Chinese hamster ovary cells (CHO cells). This sequence is extensively homologous to (i) the human Alu sequence (P. L. Deininger et al., J. Mol. Biol., in press), (ii) the mouse B1 interspersed repetitious sequence (Krayev et al., Nucleic Acids Res. 8:1201-1215, 1980) (iii) an interspersed repetitious sequence from African green monkey deoxyribonucleic acid (Dhruva et al., Proc. Natl. Acad. Sci. U.S.A. 77:4514-4518, 1980) and (iv) the CHO and mouse 4.5S ribonucleic acid (this report; F. Harada and N. Kato, Nucleic Acids Res. 8:1273-1285, 1980). Because the CHO consensus sequence shows significant homology to the human Alu sequence it is termed the CHO Alu-equivalent sequence. A conserved structure surrounding CHO Alu-equivalent family members can be recognized. It is similar to that surrounding the human Alu and the mouse B1 sequences, and is represented as follows: direct repeat-CHO-Alu-A-rich sequence-direct repeat. A composite interspersed repetitious sequence has been identified. Its structure is represented as follows: direct repeat-residue 47 to 107 of CHO-Alu-non-Alu repetitious sequence-A-rich sequence-direct repeat. Because the Alu flanking sequences resemble those that flank known transposable elements, we think it likely that the Alu sequence dispersed throughout the mammalian genome by transposition. PMID:9279371

  6. Amino acid sequence of bovine heart coupling factor 6.

    PubMed Central

    Fang, J K; Jacobs, J W; Kanner, B I; Racker, E; Bradshaw, R A

    1984-01-01

    The amino acid sequence of bovine heart mitochondrial coupling factor 6 (F6) has been determined by automated Edman degradation of the whole protein and derived peptides. Preparations based on heat precipitation and ethanol extraction showed allotypic variation at three positions while material further purified by HPLC yielded only one sequence that also differed by a Phe-Thr replacement at residue 62. The mature protein contains 76 amino acids with a calculated molecular weight of 9006 and a pI of approximately 5, in good agreement with experimentally measured values. The charged amino acids are mainly clustered at the termini and in one section in the middle; these three polar segments are separated by two segments relatively rich in nonpolar residues. Chou-Fasman analysis suggests three stretches of alpha-helix coinciding with (or within) the high-charge-density sequences with a single beta-turn at the first polar-nonpolar junction. Comparison of the F6 sequence with those of other proteins did not reveal any homologous structures. PMID:6149548

  7. VITAL NMR: Using Chemical Shift Derived Secondary Structure Information for a Limited Set of Amino Acids to Assess Homology Model Accuracy

    SciTech Connect

    Brothers, Michael C; Nesbitt, Anna E; Hallock, Michael J; Rupasinghe, Sanjeewa; Tang, Ming; Harris, Jason B; Baudry, Jerome Y; Schuler, Mary A; Rienstra, Chad M

    2011-01-01

    Homology modeling is a powerful tool for predicting protein structures, whose success depends on obtaining a reasonable alignment between a given structural template and the protein sequence being analyzed. In order to leverage greater predictive power for proteins with few structural templates, we have developed a method to rank homology models based upon their compliance to secondary structure derived from experimental solid-state NMR (SSNMR) data. Such data is obtainable in a rapid manner by simple SSNMR experiments (e.g., (13)C-(13)C 2D correlation spectra). To test our homology model scoring procedure for various amino acid labeling schemes, we generated a library of 7,474 homology models for 22 protein targets culled from the TALOS+/SPARTA+ training set of protein structures. Using subsets of amino acids that are plausibly assigned by SSNMR, we discovered that pairs of the residues Val, Ile, Thr, Ala and Leu (VITAL) emulate an ideal dataset where all residues are site specifically assigned. Scoring the models with a predicted VITAL site-specific dataset and calculating secondary structure with the Chemical Shift Index resulted in a Pearson correlation coefficient (-0.75) commensurate to the control (-0.77), where secondary structure was scored site specifically for all amino acids (ALL 20) using STRIDE. This method promises to accelerate structure procurement by SSNMR for proteins with unknown folds through guiding the selection of remotely homologous protein templates and assessing model quality.

  8. Complete amino acid sequence of the Mu heavy chain of a human IgM immunoglobulin.

    PubMed

    Putnam, F W; Florent, G; Paul, C; Shinoda, T; Shimizu, A

    1973-10-19

    The amino acid sequence of the mu chain of a human IgM immunoglobulin, including the location of all disulfide bridges and oligosaccharides, has been determined. The homology of the constant regions of immunoglobulin mu, gamma, alpha, and epsilon heavy chains reveals evolutionary relationships and suggests that two genes code for each heavy chain. PMID:4742735

  9. Alpha 1(XVIII), a collagen chain with frequent interruptions in the collagenous sequence, a distinct tissue distribution, and homology with type XV collagen.

    PubMed Central

    Rehn, M; Pihlajaniemi, T

    1994-01-01

    We report on the isolation of mouse cDNA clones which encode a collagenous sequence designated here as the alpha 1 chain of type XVIII collagen. The overlapping clones cover 2.8 kilobases and encode an open reading frame of 928 amino acid residues comprising a putative signal peptide of 25 residues, an amino-terminal noncollagenous domain of 301 residues, and a primarily collagenous stretch of 602 residues. The clones do not cover the carboxyl-terminal end of the polypeptide, since the translation stop codon is absent. Characteristic of the deduced polypeptide is the possession of eight noncollagenous interruptions varying in length from 10 to 24 residues in the collagenous amino acid sequence. Other features include the presence of several putative sites for both N-linked glycosylation and O-linked glycosaminoglycan attachment and homology of the amino-terminal noncollagenous domain with thrombospondin. It is of particular interest that five of the eight collagenous sequences of type XVIII show homology to the previously reported type XV collagen, suggesting that the two form a distinct subgroup among the diverse family of collagens. Northern blot hybridization analysis revealed a striking tissue distribution for type XVIII collagen mRNAs, as the clones hybridized strongly with mRNAs of 4.3 and 5.3 kilobases that were present only in lung and liver of the eight mouse tissues studied. PMID:8183894

  10. Inference of Homologous Recombination in Bacteria Using Whole-Genome Sequences

    PubMed Central

    Didelot, Xavier; Lawson, Daniel; Darling, Aaron; Falush, Daniel

    2010-01-01

    Bacteria and archaea reproduce clonally, but sporadically import DNA into their chromosomes from other organisms. In many of these events, the imported DNA replaces a homologous segment in the recipient genome. Here we present a new method to reconstruct the history of recombination events that affected a given sample of bacterial genomes. We introduce a mathematical model that represents both the donor and the recipient of each DNA import as an ancestor of the genomes in the sample. The model represents a simplification of the previously described coalescent with gene conversion. We implement a Monte Carlo Markov chain algorithm to perform inference under this model from sequence data alignments and show that inference is feasible for whole-genome alignments through parallelization. Using simulated data, we demonstrate accurate and reliable identification of individual recombination events and global recombination rate parameters. We applied our approach to an alignment of 13 whole genomes from the Bacillus cereus group. We find, as expected from laboratory experiments, that the recombination rate is higher between closely related organisms and also that the genome contains several broad regions of elevated levels of recombination. Application of the method to the genomic data sets that are becoming available should reveal the evolutionary history and private lives of populations of bacteria and archaea. The methods described in this article have been implemented in a computer software package, ClonalOrigin, which is freely available from http://code.google.com/p/clonalorigin/. PMID:20923983

  11. Not all transmembrane helices are born equal: Towards the extension of the sequence homology concept to membrane proteins

    PubMed Central

    2011-01-01

    Background: Sequence homology considerations widely used to transfer functional annotation to uncharacterized protein sequences require special precautions in the case of non-globular sequence segments including membrane-spanning stretches composed of non-polar residues. Simple, quantitative criteria are desirable for identifying transmembrane helices (TMs) that must be included into or should be excluded from start sequence segments in similarity searches aimed at finding distant homologues. Results: We found that there are two types of TMs in membrane-associated proteins. On the one hand, there are so-called simple TMs with elevated hydrophobicity, low sequence complexity and extraordinary enrichment in long aliphatic residues. They merely serve as a membrane-anchoring device. In contrast, so-called complex TMs have lower hydrophobicity, higher sequence complexity and some functional residues. These TMs have additional roles besides membrane anchoring such as intra-membrane complex formation, ligand binding or a catalytic role. Simple and complex TMs can occur both in single- and multi-membrane-spanning proteins essentially in any type of topology. Whereas simple TMs have the potential to confuse searches for sequence homologues and to generate unrelated hits with seemingly convincing statistical significance, complex TMs contain essential evolutionary information. Conclusion: For extending the homology concept onto membrane proteins, we provide a necessary quantitative criterion to distinguish simple TMs (and a sufficient criterion for complex TMs) in query sequences prior to their usage in homology searches based on assessment of hydrophobicity and sequence complexity of the TM sequence segments. Reviewers: This article was reviewed by Shamil Sunyaev, L. Aravind and Arcady Mushegian. PMID:22024092
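
    The two quantities this abstract relies on, hydrophobicity and sequence complexity of a TM segment, can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not say which hydrophobicity scale or complexity measure the authors used, so the Kyte-Doolittle hydropathy scale and Shannon entropy are stand-ins chosen here, and no classification threshold is implied. The function names and example segments are hypothetical.

```python
import math
from collections import Counter

# Kyte-Doolittle hydropathy values (assumed scale; the abstract names none).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment):
    """Average Kyte-Doolittle hydropathy over a non-empty segment."""
    return sum(KYTE_DOOLITTLE[aa] for aa in segment) / len(segment)


def shannon_entropy(segment):
    """Shannon entropy (bits) of the residue composition of a segment.

    Lower values mean fewer distinct residue types dominate the segment,
    used here as a simple proxy for low sequence complexity.
    """
    counts = Counter(segment)
    n = len(segment)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


if __name__ == "__main__":
    # Made-up segments: one leucine-rich, one with mixed composition.
    for name, seg in [("aliphatic-rich", "LLLLLLLLLLVLLLLLLLLL"),
                      ("mixed", "GYSTWNAFCHLMQTGSPVIE")]:
        print(name, round(mean_hydropathy(seg), 2),
              round(shannon_entropy(seg), 2))
```

    In this sketch a segment that scores high on hydropathy and low on entropy would resemble the "simple" TMs described above, but the actual published criterion should be taken from the article itself.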

  12. Nucleotide sequences of genes encoding penicillin-binding proteins from Streptococcus pneumoniae and Streptococcus oralis with high homology to Escherichia coli penicillin-binding proteins 1a and 1b.

    PubMed Central

    Martin, C; Briese, T; Hakenbeck, R

    1992-01-01

    The nucleotide sequence of a 3,378-bp DNA fragment of Streptococcus pneumoniae that included the structural gene for penicillin-binding protein (PBP) 1a (ponA), which encodes 719 amino acids, was determined. Homologous DNA fragments from an S. oralis strain were amplified with ponA-specific oligonucleotides. The 2,524-bp S. oralis sequence contained the coding region for the first 636 amino acids of a PBP. The coding sequence differed by 437 nucleotides (27%) and one additional triplet, resulting in 87 amino acid substitutions (14%), from S. pneumoniae PBP 1a. Both PBPs are highly homologous to bifunctional high-M(r) Escherichia coli PBPs 1a and 1b. PMID:1624444
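
    Difference figures like those quoted above come from counting mismatched positions between two aligned sequences; for instance, 87 substitutions over 636 residues is about 14%. The sketch below shows that count for two equal-length, already aligned sequences. It is an illustration only: the function name `percent_difference` and the toy sequences are hypothetical, and gaps such as the additional triplet mentioned in the abstract are not handled.

```python
def percent_difference(seq_a, seq_b):
    """Count mismatches between two aligned, equal-length sequences.

    Returns (number of differing positions, percentage of positions that
    differ). Gap handling is deliberately left out of this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches, 100.0 * mismatches / len(seq_a)


if __name__ == "__main__":
    # Toy sequences, not real ponA data.
    count, pct = percent_difference("ATGGCATTAG", "ATGACATTGG")
    print(count, pct)  # prints 2 20.0
```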

  13. A sequence homologous to kappa-deleting element is located 5' to the human Jκ locus

    SciTech Connect

    Kennedy, M.A.; Morris, C.M.; Fitzgerald, P.H.

    1989-01-25

    The human kappa deleting element (Kde) mediates loss of Cκ and Jκ genes in B cells. A probe for Kde detects two genomic sequences on Southern blots. The Kde is located 24 kb 3' to Cκ, but the position of the homologous sequence is unknown. The authors in situ hybridized m141-2 to metaphase cells of JC11, a B-cell line bearing a t(2;14)(p11;q32) in which the chromosome 2 breakpoint is within Jκ or the Vκ-Jκ intron. Three peaks of labelled sites were obtained. Southern analysis of BamHI-digested DNA showed that Kde (14 kb) and the homologous sequence (3 kb) were both intact. Kde accounts for hybridization to 14q+ and the 2p- signal presumably derives from the related sequence. This locates the sequence homologous to Kde upstream from Jκ, possibly within the Vκ cluster, and may reflect transposition or some other duplicative event as proposed for the evolution of other regions of the kappa locus.

  14. Recombination-Independent Recognition of DNA Homology for Repeat-Induced Point Mutation (RIP) Is Modulated by the Underlying Nucleotide Sequence.

    PubMed

    Gladyshev, Eugene; Kleckner, Nancy

    2016-05-01

    Haploid germline nuclei of many filamentous fungi have the capacity to detect homologous nucleotide sequences present on the same or different chromosomes. Once recognized, such sequences can undergo cytosine methylation or cytosine-to-thymine mutation specifically over the extent of shared homology. In Neurospora crassa this process is known as Repeat-Induced Point mutation (RIP). Previously, we showed that RIP did not require MEI-3, the only RecA homolog in Neurospora, and that it could detect homologous trinucleotides interspersed with a matching periodicity of 11 or 12 base-pairs along participating chromosomal segments. This pattern was consistent with a mechanism of homology recognition that involved direct interactions between co-aligned double-stranded (ds) DNA molecules, where sequence-specific dsDNA/dsDNA contacts could be established using no more than one triplet per turn. In the present study we have further explored the DNA sequence requirements for RIP. In our previous work, interspersed homologies were always examined in the context of a relatively long adjoining region of perfect homology. Using a new repeat system lacking this strong interaction, we now show that interspersed homologies with overall sequence identity of only 36% can be efficiently detected by RIP in the absence of any perfect homology. Furthermore, in this new system, where the total amount of homology is near the critical threshold required for RIP, the nucleotide composition of participating DNA molecules is identified as an important factor. Our results specifically pinpoint the triplet 5'-GAC-3' as a particularly efficient unit of homology recognition. Finally, we present experimental evidence that the process of homology sensing can be uncoupled from the downstream mutation. Taken together, our results advance the notion that sequence information can be compared directly between double-stranded DNA molecules during RIP and, potentially, in other processes where homologous

  15. Recombination-Independent Recognition of DNA Homology for Repeat-Induced Point Mutation (RIP) Is Modulated by the Underlying Nucleotide Sequence

    PubMed Central

    Kleckner, Nancy

    2016-01-01

    Haploid germline nuclei of many filamentous fungi have the capacity to detect homologous nucleotide sequences present on the same or different chromosomes. Once recognized, such sequences can undergo cytosine methylation or cytosine-to-thymine mutation specifically over the extent of shared homology. In Neurospora crassa this process is known as Repeat-Induced Point mutation (RIP). Previously, we showed that RIP did not require MEI-3, the only RecA homolog in Neurospora, and that it could detect homologous trinucleotides interspersed with a matching periodicity of 11 or 12 base-pairs along participating chromosomal segments. This pattern was consistent with a mechanism of homology recognition that involved direct interactions between co-aligned double-stranded (ds) DNA molecules, where sequence-specific dsDNA/dsDNA contacts could be established using no more than one triplet per turn. In the present study we have further explored the DNA sequence requirements for RIP. In our previous work, interspersed homologies were always examined in the context of a relatively long adjoining region of perfect homology. Using a new repeat system lacking this strong interaction, we now show that interspersed homologies with overall sequence identity of only 36% can be efficiently detected by RIP in the absence of any perfect homology. Furthermore, in this new system, where the total amount of homology is near the critical threshold required for RIP, the nucleotide composition of participating DNA molecules is identified as an important factor. Our results specifically pinpoint the triplet 5'-GAC-3' as a particularly efficient unit of homology recognition. Finally, we present experimental evidence that the process of homology sensing can be uncoupled from the downstream mutation. Taken together, our results advance the notion that sequence information can be compared directly between double-stranded DNA molecules during RIP and, potentially, in other processes where homologous

  16. Molecular cloning and amino acid sequence of human 5-lipoxygenase

    SciTech Connect

    Matsumoto, T.; Funk, C.D.; Radmark, O.; Hoeoeg, J.O.; Joernvall, H.; Samuelsson, B.

    1988-01-01

    5-Lipoxygenase (EC 1.13.11.34), a Ca2+- and ATP-requiring enzyme, catalyzes the first two steps in the biosynthesis of the peptidoleukotrienes and the chemotactic factor leukotriene B4. A cDNA clone corresponding to 5-lipoxygenase was isolated from a human lung lambda gt11 expression library by immunoscreening with a polyclonal antibody. Additional clones from a human placenta lambda gt11 cDNA library were obtained by plaque hybridization with the 32P-labeled lung cDNA clone. Sequence data obtained from several overlapping clones indicate that the composite DNAs contain the complete coding region for the enzyme. From the deduced primary structure, 5-lipoxygenase is a 673-amino acid protein with a calculated molecular weight of 77,839. Direct analysis of the native protein and its proteolytic fragments confirmed the deduced composition, the amino-terminal amino acid sequence, and the structure of many internal segments. 5-Lipoxygenase has no apparent sequence homology with leukotriene A4 hydrolase or Ca2+-binding proteins. RNA blot analysis indicated substantial amounts of an mRNA species of approx. 2700 nucleotides in leukocytes, lung, and placenta.

  17. Molecular cloning and sequence analysis of the Sta58 major antigen gene of Rickettsia tsutsugamushi: sequence homology and antigenic comparison of Sta58 to the 60-kilodalton family of stress proteins.

    PubMed Central

    Stover, C K; Marana, D P; Dasch, G A; Oaks, E V

    1990-01-01

    The scrub typhus 58-kilodalton (kDa) antigen (Sta58) of Rickettsia tsutsugamushi is a major protein antigen often recognized by humans infected with scrub typhus rickettsiae. A 2.9-kilobase HindIII fragment containing a complete sta58 gene was cloned in Escherichia coli and found to express the entire Sta58 antigen and a smaller protein with an apparent molecular mass of 11 kDa (Stp11). DNA sequence analysis of the 2.9-kilobase HindIII fragment revealed two adjacent open reading frames encoding proteins of 11 (Stp11) and 60 (Sta58) kDa. Comparisons of deduced amino acid sequences disclosed a high degree of homology between the R. tsutsugamushi proteins Stp11 and Sta58 and the E. coli proteins GroES and GroEL, respectively, and the family of primordial heat shock proteins designated Hsp10 and Hsp60. Although the sequence homology between the Sta58 antigen and the Hsp60 protein family is striking, the Sta58 protein appeared to be antigenically distinct among a sample of other bacterial Hsp60 homologs, including the typhus group of rickettsiae. The antigenic uniqueness of the Sta58 antigen indicates that this protein may be a potentially protective antigen and a useful diagnostic reagent for scrub typhus fever. Images PMID:2108930

  18. External and semi-internal controls for PCR amplification of homologous sequences in mixed templates.

    PubMed

    Kalle, Elena; Gulevich, Alexander; Rensing, Christopher

    2013-11-01

    In a mixed template, the presence of homologous target DNA sequences creates environments that almost inevitably give rise to artifacts and biases during PCR. Heteroduplexes, chimeras, and skewed template-to-product ratios are the exclusive attributes of mixed template PCR and never occur in a single template assay. Yet, multi-template PCR has been used without appropriate attention to quality control and assay validation, in spite of the fact that such practice diminishes the reliability of results. External and internal amplification controls became obligatory elements of good laboratory practice in different PCR assays. We propose the inclusion of an analogous approach as a quality control system for multi-template PCR applications. The amplification controls must take into account the characteristics of multi-template PCR and be able to effectively monitor particular assay performance. This study demonstrated the efficiency of a model mixed template as an adequate external amplification control for a particular PCR application. The conditions of multi-template PCR do not allow implementation of a classic internal control; therefore we developed a convenient semi-internal control as an acceptable alternative. In order to evaluate the effects of inhibitors, a model multi-template mix was amplified in a mixture with DNAse-treated sample. Semi-internal control allowed establishment of intervals for robust PCR performance for different samples, thus enabling correct comparison of the samples. The complexity of the external and semi-internal amplification controls must be comparable with the assumed complexity of the samples. We also emphasize that amplification controls should be applied in multi-template PCR regardless of the post-assay method used to analyze products. PMID:24076226

  19. Phenolic acid esterases, coding sequences and methods

    DOEpatents

    Blum, David L.; Kataeva, Irina; Li, Xin-Liang; Ljungdahl, Lars G.

    2002-01-01

    Described herein are four phenolic acid esterases, three of which correspond to domains of previously unknown function within bacterial xylanases, from XynY and XynZ of Clostridium thermocellum and from a xylanase of Ruminococcus. The fourth specifically exemplified xylanase is a protein encoded within the genome of Orpinomyces PC-2. The amino acids of these polypeptides and nucleotide sequences encoding them are provided. Recombinant host cells, expression vectors and methods for the recombinant production of phenolic acid esterases are also provided.

  20. Amino-Acid Sequence of Porcine Pepsin

    PubMed Central

    Tang, J.; Sepulveda, P.; Marciniszyn, J.; Chen, K. C. S.; Huang, W-Y.; Tao, N.; Liu, D.; Lanier, J. P.

    1973-01-01

    As the culmination of several years of experiments, we propose a complete amino-acid sequence for porcine pepsin, an enzyme containing 327 amino-acid residues in a single polypeptide chain. In the sequence determination, the enzyme was treated with cyanogen bromide. Five resulting fragments were purified. The amino-acid sequence of four of the fragments accounted for 290 residues. Because the structure of a 37-residue carboxyl-terminal fragment was already known, it was not studied. The alignment of these fragments was determined from the sequence of methionyl-peptides we had previously reported. We also discovered the locations of active-site aspartyl residues, as well as the pairing of the three disulfide bridges. A minor component of commercial crystalline pepsin was found to contain two extra amino-acid residues, Ala-Leu-, at the amino-terminus of the molecule. This minor component was apparently derived from a different site of cleavage during the activation of porcine pepsinogen. PMID:4587252

  1. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe.

  2. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-07-21

    A method is disclosed for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe. 11 figs.

  3. CBH1 homologs and variant CBH1 cellulases

    SciTech Connect

    Goedegebuur, Frits; Gualfetti, Peter; Mitchinson, Colin; Neefe, Paulien

    2014-07-01

    Disclosed are a number of homologs and variants of Hypocrea jecorina Cel7A (formerly Trichoderma reesei cellobiohydrolase I or CBH1), nucleic acids encoding the same and methods for producing the same. The homologs and variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted and/or deleted.

  4. CBH1 homologs and variant CBH1 cellulases

    DOEpatents

    Goedegebuur, Frits; Gualfetti, Peter; Mitchinson, Colin; Neefe, Paulien

    2011-05-31

    Disclosed are a number of homologs and variants of Hypocrea jecorina Cel7A (formerly Trichoderma reesei cellobiohydrolase I or CBH1), nucleic acids encoding the same and methods for producing the same. The homologs and variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted and/or deleted.

  5. CBH1 homologs and variant CBH1 cellulases

    DOEpatents

    Goedegebuur, Frits; Gualfetti, Peter; Mitchinson, Colin; Neefe, Paulien

    2008-11-18

    Disclosed are a number of homologs and variants of Hypocrea jecorina Cel7A (formerly Trichoderma reesei cellobiohydrolase I or CBH1), nucleic acids encoding the same and methods for producing the same. The homologs and variant cellulases have the amino acid sequence of a glycosyl hydrolase of family 7A wherein one or more amino acid residues are substituted and/or deleted.

  6. Complete cDNA and derived amino acid sequence of human factor V

    SciTech Connect

    Jenny, R.J.; Pittman, D.D.; Toole, J.J.; Kriz, R.W.; Aldape, R.A.; Hewick, R.M.; Kaufman, R.J.; Mann, K.G.

    1987-07-01

    cDNA clones encoding human factor V have been isolated from an oligo(dT)-primed human fetal liver cDNA library prepared with vector Charon 21A. The cDNA sequence of factor V from three overlapping clones includes a 6672-base-pair (bp) coding region, a 90-bp 5' untranslated region, and a 163-bp 3' untranslated region within which is a poly(A) tail. The deduced amino acid sequence consists of 2224 amino acids inclusive of a 28-amino acid leader peptide. Direct comparison with human factor VIII reveals considerable homology between proteins in amino acid sequence and domain structure: a triplicated A domain and duplicated C domain show approx. 40% identity with the corresponding domains in factor VIII. As in factor VIII, the A domains of factor V share approx. 40% amino acid-sequence homology with the three highly conserved domains in ceruloplasmin. The B domain of factor V contains 35 tandem and approx. 9 additional semiconserved repeats of nine amino acids of the form Asp-Leu-Ser-Gln-Thr-Thr/Asn-Leu-Ser-Pro and 2 additional semiconserved repeats of 17 amino acids. Factor V contains 37 potential N-linked glycosylation sites, 25 of which are in the B domain, and a total of 19 cysteine residues.
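
    The lengths quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the function name is hypothetical, and it assumes the stated 6672-bp coding region is read as whole codons with the stop codon not counted in that figure.

```python
def translated_length(coding_bp: int) -> int:
    """Number of codons in a coding region whose length is given in base pairs.

    Assumption for this sketch: the region is an exact multiple of three
    and does not include the stop codon.
    """
    if coding_bp % 3 != 0:
        raise ValueError("coding region is not a whole number of codons")
    return coding_bp // 3


# Figures taken from the abstract above.
CODING_BP = 6672   # coding region, base pairs
LEADER_AA = 28     # leader peptide, amino acids

precursor_aa = translated_length(CODING_BP)   # full deduced sequence
mature_aa = precursor_aa - LEADER_AA          # after removal of the leader peptide

print(precursor_aa, mature_aa)
```

    Under these assumptions the codon count agrees with the 2224 residues reported for the deduced sequence, which leaves 2196 residues once the 28-residue leader peptide is removed.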

  7. Nucleotide sequence of the 3'-noncoding region of alfalfa mosaic virus RNA 4 and its homology with the genomic RNAs.

    PubMed Central

    Koper-Zwarthoff, E C; Brederode, F T; Walstra, P; Bol, J F

    1979-01-01

    A 226-nucleotide fragment was derived from alfalfa mosaic virus RNA 4 (AlMV RNA 4), the subgenomic messenger for viral coat protein, and its sequence was deduced by in vitro labeling with polynucleotide kinase and application of RNA sequencing techniques. The fragment contains the 3'-terminal 45 nucleotides of the coat protein cistron and the complete 3'-noncoding region of 182 nucleotides. The total length of RNA 4 was calculated to be 881 nucleotides. AlMV RNAs 1, 2 and 3 were elongated with a 3'-terminal poly(A) stretch and subjected to sequence analysis by using a specific primer, reverse transcriptase and chain terminators. This revealed an extensive homology between the 3'-terminal 140 to 150 nucleotides of all four AlMV RNAs. Despite a number of base substitutions, the secondary structure of the homologous region is highly conserved. The observed homology indicates that, as with RNA 4, the sites with a high affinity for the viral coat protein are located at the 3'-termini of the genomic RNAs. Images PMID:537914

  8. Methods for analyzing nucleic acid sequences

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid. The method provides a complex comprising a polymerase enzyme, a target nucleic acid molecule, and a primer, wherein the complex is immobilized on a support. A fluorescent label is attached to a terminal phosphate group of the nucleotide or nucleotide analog. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. Labeled nucleotides or nucleotide analogs that become incorporated are distinguished from freely diffusing labels by the time duration of their signal: incorporated labels are retained longer in the observation volume than freely diffusing ones.

  9. Fast method of homology and purine-pyrimidine mutual relations between DNA sequences search.

    PubMed

    Korotkov, E V

    1994-01-01

    A new algorithm for scanning sequences is described. This algorithm uses the boolean operators AND and OR. The mutual information between the sequences is used as a measure of sequence interrelation. It allows evaluation of the probability of accidental sequence interrelation in a quantitative manner. The proposed algorithm was used for searching for MB1 repeats in human and other mammalian sequences. PMID:7841466
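
    The abstract names mutual information as the measure of sequence interrelation but gives no formula. The sketch below is not the author's algorithm (it does not use the boolean AND/OR scanning step); it only illustrates, under stated assumptions, how mutual information between two aligned sequences can be computed after recoding bases as purine (R) or pyrimidine (Y). The function names are hypothetical, and the input is assumed to contain only A, C, G and T.

```python
from collections import Counter
from math import log2

PURINES = set("AG")


def to_ry(seq: str) -> str:
    """Recode A/G as 'R' (purine) and every other base as 'Y' (pyrimidine).

    Assumes the input contains only A, C, G and T.
    """
    return "".join("R" if base in PURINES else "Y" for base in seq.upper())


def mutual_information(x: str, y: str) -> float:
    """Mutual information, in bits, between symbols at aligned positions."""
    if len(x) != len(y) or not x:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(x)
    joint = Counter(zip(x, y))   # counts of aligned symbol pairs
    px = Counter(x)              # marginal counts in the first sequence
    py = Counter(y)              # marginal counts in the second sequence
    # Sum of p(a,b) * log2( p(a,b) / (p(a) * p(b)) ) over observed pairs.
    return sum(
        (c / n) * log2(c * n / (px[a] * py[b]))
        for (a, b), c in joint.items()
    )


same = mutual_information(to_ry("ACGT"), to_ry("GTAC"))      # RYRY vs RYRY
flipped = mutual_information(to_ry("ACGT"), to_ry("CATG"))   # RYRY vs YRYR
unrelated = mutual_information(to_ry("AGCT"), to_ry("ACGT")) # RRYY vs RYRY
print(same, flipped, unrelated)
```

    Note that a consistent purine-to-pyrimidine swap scores as highly as an identical pattern, so a measure of this kind can register relations between sequences that simple percent identity would miss.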

  10. Desulfovibrio desulfuricans PglB homolog possesses oligosaccharyltransferase activity with relaxed glycan specificity and distinct protein acceptor sequence requirements.

    PubMed

    Ielmini, Maria V; Feldman, Mario F

    2011-06-01

    Oligosaccharyltransferases (OTases) are responsible for the transfer of carbohydrates from lipid carriers to acceptor proteins and are present in all domains of life. In bacteria, the most studied member of this family is PglB from Campylobacter jejuni (PglB(Cj)). This enzyme is functional in Escherichia coli and, contrary to its eukaryotic counterparts, has the ability to transfer a variety of oligo- and polysaccharides to protein carriers in vivo. Phylogenetic analysis revealed that in the delta proteobacteria Desulfovibrio sp., the PglB homolog is more closely related to eukaryotic and archaeal OTases than to its Campylobacter counterparts. Genetic analysis revealed the presence of a putative operon that might encode all enzymes required for N-glycosylation in Desulfovibrio desulfuricans. D. desulfuricans PglB (PglB(Dd)) was cloned and successfully expressed in E. coli, and its activity was confirmed by transferring the C. jejuni heptasaccharide onto the model protein acceptor AcrA. In contrast to PglB(Cj), which adds two glycan chains to AcrA, a single oligosaccharide was attached to the protein by PglB(Dd). Site-directed mutagenesis of the five putative N-X-S/T glycosylation sites in AcrA and mass spectrometry analysis showed that PglB(Dd) does not recognize the "conventional bacterial glycosylation sequon" consisting of the sequence D/E-X(1)-N-X(2)-S/T (where X(1) and X(2) are any amino acid except proline), and instead used a different site for the attachment of the oligosaccharide than PglB(Cj). Furthermore, PglB(Dd) exhibited relaxed glycan specificity, being able to transfer mono- and polysaccharides to AcrA. Our analysis constitutes the first characterization of an OTase from delta-proteobacteria involved in N-linked protein glycosylation. PMID:21098514
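
    The two sequon definitions quoted in this abstract translate directly into regular expressions. The sketch below is an illustration only: the helper name and the test peptide are hypothetical (the peptide is not AcrA), and proline is excluded at the X position of the shorter N-X-S/T motif as an assumption, since the abstract states that exclusion only for the extended sequon.

```python
import re

# Extended bacterial sequon D/E-X1-N-X2-S/T, with X1 and X2 not proline.
# A lookahead is used so that overlapping motifs are all reported.
BACTERIAL_SEQUON = re.compile(r"(?=([DE][^P]N[^P][ST]))")
# Shorter N-X-S/T motif (X not proline is an assumption of this sketch).
NXST_SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(protein: str, pattern: re.Pattern, asn_offset: int):
    """Return (1-based position of the acceptor Asn, matched motif) pairs.

    asn_offset is the index of the acceptor Asn inside the motif:
    2 for D/E-X1-N-X2-S/T and 0 for N-X-S/T.
    """
    return [
        (m.start() + asn_offset + 1, m.group(1))
        for m in pattern.finditer(protein.upper())
    ]


# Hypothetical peptide used only to exercise the patterns.
peptide = "MKDANGTAAQNPSLENNSTK"
print(find_sequons(peptide, NXST_SEQUON, 0))
print(find_sequons(peptide, BACTERIAL_SEQUON, 2))
```

    A fixed offset is used for the acceptor position rather than searching for the first N in the match, because X1 may itself be asparagine.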

  11. Snake venoms. The amino-acid sequence of trypsin inhibitor E of Dendroaspis polylepis polylepis (Black Mamba) venom.

    PubMed

    Joubert, F J; Strydom, D J

    1978-06-01

    Trypsin inhibitor E from black mamba venom comprises 59 amino acid residues in a single polypeptide chain, cross-linked by three intrachain disulphide bridges. The complete primary structure of inhibitor E was elucidated. The sequence is homologous with trypsin inhibitors from different sources. Unique among this homologous series of proteinase inhibitors, inhibitor E has an affinity for transition metal ions, exemplified here by Cu2+ and Co2+. PMID:668688

  12. (+/-)-3-Oxocyclohexanecarboxylic and -acetic acids: contrasting hydrogen-bonding patterns in two homologous keto acids.

    PubMed

    Barcon, Alan; Brunskill, Andrew P J; Lalancette, Roger A; Thompson, Hugh W

    2002-03-01

    The crystal structures for the title compounds reveal fundamentally different hydrogen-bonding patterns. (+/-)-3-Oxocyclohexanecarboxylic acid, C(7)H(10)O(3), displays acid-to-ketone catemers having a glide relationship for successive components of the hydrogen-bonding chains which advance simultaneously by two cells in a and one in c [O...O = 2.683 (3) Å and O-H...O = 166°]. A pair of intermolecular close contacts exists involving the acid carbonyl group. The asymmetric unit in (+/-)-3-oxocyclohexaneacetic acid, C(8)H(12)O(3), utilizes only one of two available isoenthalpic conformers and its aggregation involves mutual hydrogen bonding by centrosymmetric carboxyl dimerization [O...O = 2.648 (3) Å and O-H...O = 171°]. Intermolecular close contacts exist for both the ketone and the acid carbonyl group. PMID:11870311

  13. Evolution and homologous recombination of the hemagglutinin-esterase gene sequences from porcine torovirus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The objective of the present study was to gain new insights into the evolution, homologous recombination and selection pressures imposed on the porcine torovirus (PToV), by examining changes in the hemagglutinin-esterase (HE) gene. The most recent common ancestor of PToV was estimated to have emerge...

  14. Selective anticancer activity of a hexapeptide with sequence homology to a non-kinase domain of Cyclin Dependent Kinase 4

    PubMed Central

    2011-01-01

    Background Cyclin-dependent kinases 2, 4 and 6 (Cdk2, Cdk4, Cdk6) are closely structurally homologous proteins which are classically understood to control the transition from the G1 to the S-phases of the cell cycle by combining with their appropriate cyclin D or cyclin E partners to form kinase-active holoenzymes. Deregulation of Cdk4 is widespread in human cancer; CDK4 gene knockout is highly protective against chemical and oncogene-mediated epithelial carcinogenesis, despite the continued presence of CDK2 and CDK6; and overexpression of Cdk4 promotes skin carcinogenesis. Surprisingly, however, Cdk4 kinase inhibitors have not yet fulfilled their expectation as 'blockbuster' anticancer agents. Resistance to inhibition of Cdk4 kinase in some cases could potentially be due to a non-kinase activity, as recently reported with epidermal growth factor receptor. Results A search for a potential functional site of non-kinase activity present in Cdk4 but not Cdk2 or Cdk6 revealed a previously-unidentified loop on the outside of the C'-terminal non-kinase domain of Cdk4, containing a central amino-acid sequence, Pro-Arg-Gly-Pro-Arg-Pro (PRGPRP). An isolated hexapeptide with this sequence and its cyclic amphiphilic congeners are selectively lethal at high doses to a wide range of human cancer cell lines whilst sparing normal diploid keratinocytes and fibroblasts. Treated cancer cells do not exhibit the wide variability of dose response typically seen with other anticancer agents. Cancer cell killing by PRGPRP, in a cyclic amphiphilic cassette, requires cells to be in cycle but does not perturb cell cycle distribution and is accompanied by altered relative Cdk4/Cdk1 expression and selective decrease in ATP levels. Morphological features of apoptosis are absent and cancer cell death does not appear to involve autophagy. Conclusion These findings suggest a potential new paradigm for the development of broad-spectrum cancer specific therapeutics with a companion diagnostic

  15. Sequence conservation, phylogenetic relationships, and expression profiles of nondigestive serine proteases and serine protease homologs in Manduca sexta.

    PubMed

    Cao, Xiaolong; He, Yan; Hu, Yingxia; Zhang, Xiufeng; Wang, Yang; Zou, Zhen; Chen, Yunru; Blissard, Gary W; Kanost, Michael R; Jiang, Haobo

    2015-07-01

    Serine protease (SP) and serine protease homolog (SPH) genes in insects encode a large family of proteins involved in digestion, development, immunity, and other processes. While 68 digestive SPs and their close homologs are reported in a companion paper (Kuwar et al., in preparation), we have identified 125 other SPs/SPHs in Manduca sexta and studied their structure, evolution, and expression. Fifty-two of them contain cystine-stabilized structures for molecular recognition, including clip, LDLa, Sushi, Wonton, TSP, CUB, Frizzle, and SR domains. There are nineteen groups of genes evolved from relatively recent gene duplication and sequence divergence. Thirty-five SPs and seven SPHs contain 1, 2 or 5 clip domains. Multiple sequence alignment and molecular modeling of the 54 clip domains have revealed structural diversity of these regulatory modules. Sequence comparison with their homologs in Drosophila melanogaster, Anopheles gambiae and Tribolium castaneum allows us to classify them into five subfamilies: A are SPHs with 1 or 5 group-3 clip domains, B are SPs with 1 or 2 group-2 clip domains, C, D1 and D2 are SPs with a single clip domain in group-1a, 1b and 1c, respectively. We have classified into six categories the 125 expression profiles of SP-related proteins in fat body, brain, midgut, Malpighian tubule, testis, and ovary at different stages, suggesting that they participate in various physiological processes. Through RNA-Seq-based gene annotation and expression profiling, as well as intragenomic sequence comparisons, we have established a framework of information for future biochemical research of nondigestive SPs and SPHs in this model species. PMID:25530503

  16. Complete amino acid sequence and structure characterization of the taste-modifying protein, miraculin.

    PubMed

    Theerasilp, S; Hitotsuya, H; Nakajo, S; Nakaya, K; Nakamura, Y; Kurihara, Y

    1989-04-25

    The taste-modifying protein, miraculin, has the unusual property of modifying sour taste into sweet taste. The complete amino acid sequence of miraculin purified from miracle fruits by a newly developed method (Theerasilp, S., and Kurihara, Y. (1988) J. Biol. Chem. 263, 11536-11539) was determined by an automatic Edman degradation method. Miraculin was a single polypeptide with 191 amino acid residues. The calculated molecular weight based on the amino acid sequence and the carbohydrate content (13.9%) was 24,600. Asn-42 and Asn-186 were linked N-glycosidically to carbohydrate chains. High homology was found between the amino acid sequences of miraculin and soybean trypsin inhibitor. PMID:2708331

  17. Detection of a neurofibromatosis type I (NF1) homologous sequence by PCR: implications for the diagnosis and screening of genetic diseases.

    PubMed

    Gasparini, P; Grifa, A; Origone, P; Coviello, D; Antonacci, R; Rocchi, M

    1993-10-01

    The neurofibromatosis type I (NF1) gene was extensively screened for mutations using single strand conformation polymorphism (SSCP) technology. During the analysis of the NF1 GAP-related domain, electrophoretically abnormal fragments were detected. Direct sequencing of these fragments allowed us to identify the presence of an NF1 highly homologous sequence (NF1HHS). A detailed analysis of a hybrid panel located this sequence on chromosome 15q24→qter. An accurate search through several data banks demonstrated that this sequence is a new NF1 homologue. This report shows how it is possible to find homologous sequences at random, and subsequently to make wrong interpretations. PMID:8264676

  18. Isolation of Insertion Sequence ISRLdTAL1145-1 from a Rhizobium sp. (Leucaena diversifolia) and Distribution of Homologous Sequences Identifying Cross-Inoculation Group Relationships †

    PubMed Central

    Rice, Douglas J.; Somasegaran, Padma; MacGlashan, Kathryn; Bohlool, B. Ben

    1994-01-01

    Insertion sequence (IS) element ISRLdTAL1145-1 from Rhizobium sp. (Leucaena diversifolia) strain TAL 1145 was entrapped in the sacB gene of the positive selection vector pUCD800 by insertional inactivation. A hybridization probe prepared from the whole 2.5-kb element was used to determine the distribution of homologous sequences in a diverse collection of 135 Rhizobium and Bradyrhizobium strains. The IS probe hybridized strongly to Southern blots of genomic DNAs from 10 rhizobial strains that nodulate both Phaseolus vulgaris (beans) and Leucaena leucocephala (leguminous trees), 1 Rhizobium sp. that nodulates Leucaena spp., 9 R. meliloti (alfalfa) strains, 4 Rhizobium spp. that nodulate Sophora chrysophylla (leguminous trees), and 1 nonnodulating bacterium associated with the nodules of Pithecellobium dulce from the Leucaena cross-inoculation group, producing distinguishing IS patterns for each strain. Hybridization analysis revealed that ISRLdTAL1145-1 was strongly homologous with and closely related to a previously isolated element, ISRm USDA1024-1 from R. meliloti, while restriction enzyme analysis found structural similarities and differences between the two IS homologs. Two internal segments of these IS elements were used to construct hybridization probes of 1.2 kb and 380 bp that delineate a structural similarity and a difference, respectively, of the two IS homologs. The internal segment probes were used to analyze the structures of homologous IS elements in other strains. Five types of structural variation in homologous IS elements were found. The predominant IS structural type naturally occurring in a strain can reasonably identify the strain's cross-inoculation group relationships. Three IS structural types were found in Rhizobium species that nodulate beans and Leucaena species, one of which included the designated type IIB strain of R. tropici (CIAT 899). Weak homology to the whole IS probe, but not with the internal segments, was found with two

  19. Nucleotide sequence of a gene from Phanerochaete chrysosporium that shows homology to the facA gene of Aspergillus nidulans.

    PubMed

    Birch, P R; Sims, P F; Broda, P

    1992-01-01

    Heterologous hybridisation was used to isolate a genomic DNA sequence from Phanerochaete chrysosporium using the facA (acetyl CoA synthetase) gene from Aspergillus nidulans as a probe. The cloned sequence hybridises to a 2.2 kb transcript in poly(A)+ RNA prepared from mycelium grown on acetate as the sole carbon source. Comparison of the DNA sequence obtained with those of the A. nidulans facA and N. crassa acu5 genes reveals an ORF that appears to be interrupted by five typical fungal introns. Two possible candidates for the translation initiation codon were observed. Homology with the facA and acu5 genes is revealed after the second ATG codon. PMID:1352996

  20. The nucleotide sequences of several tRNA genes from rat mitochondria: common features and relatedness to homologous species.

    PubMed Central

    Cantatore, P; De Benedetto, C; Gadaleta, G; Gallerani, R; Kroon, A M; Holtrop, M; Lanave, C; Pepe, G; Quagliariello, C; Saccone, C; Sbisa, E

    1982-01-01

    We have determined the nucleotide sequences of thirteen rat mt tRNA genes. The features of the primary and secondary structures of these tRNAs show that those for Gln, Ser, and f-Met resemble, while those for Lys, Cys, and Trp depart strikingly from the universal type. The remainder are slightly abnormal. Among many mammalian mt DNA sequences, those of mt tRNA genes are highly conserved, thus suggesting for those genes an additional, perhaps regulatory, function. A simple evolutionary relationship between the tRNAs of animal mitochondria and those of eukaryotic cytoplasm, of lower eukaryotic mitochondria or of prokaryotes, is not evident owing to the extreme divergence of the tRNA sequences in the two groups. However, a slightly higher homology does exist between a few animal mt tRNAs and those from prokaryotes or from lower eukaryotic mitochondria. PMID:7099963

  1. Snake venom. The amino acid sequence of protein A from Dendroaspis polylepis polylepis (black mamba) venom.

    PubMed

    Joubert, F J; Strydom, D J

    1980-12-01

    Protein A from Dendroaspis polylepis polylepis venom comprises 81 amino acids, including ten half-cystine residues. The complete primary structures of protein A and its variant A' were elucidated. The sequences of proteins A and A', which differ in a single position, show no homology with various neurotoxins and non-neurotoxic proteins and represent a new type of elapid venom protein. PMID:7461607

  2. Homology of the 3' terminal sequences of the 18S rRNA of Bombyx mori and the 16S rRNA of Escherichia coli.

    PubMed Central

    Samols, D R; Hagenbuchle, O; Gage, L P

    1979-01-01

    The terminal 220 base pairs (bp) of the gene for 18S rRNA and 18 bp of the adjoining spacer rDNA of the silkworm Bombyx mori have been sequenced. Comparison with the sequence of the 16S rRNA gene of Escherichia coli has shown that a region including 45 bp of the B. mori sequence at the 3' end is remarkably homologous with the 3' terminal E. coli sequence. Other homologies occur in the terminal regions of the 18S and 16S rRNAs, including a perfectly conserved stretch of 13 bp within a longer homology located 150-200 bp from the 3' termini. These homologies are the most extensive so far reported between prokaryotic and eukaryotic genomic DNA. PMID:390496

  3. Mass spectrometrical analysis of the processed metastasis-inducing anterior gradient protein 2 homolog reveals 100% sequence coverage.

    PubMed

    Myung, J-K; Frischer, T; Afjehi-Sadat, L; Pollak, A; Lubec, G

    2008-08-01

    Anterior gradient protein 2 homolog is a metastasis-inducing protein in a rat model of breast cancer and prognostic for outcome in hormonally treated breast cancer patients. Carrying out protein profiling in several mammalian cells and tissues, we detected this protein (synonym: secreted cement gland protein XAG-2 homolog) that was originally described in toad skin, in human bronchial epithelia. Tissues obtained from biopsies were homogenised and extracted proteins were run on two-dimensional gel electrophoresis. Following in-gel digestion with the proteases trypsin, AspN, LysC and chymotrypsin, mass spectrometrical analysis was carried out by MALDI-TOF/TOF. The use of MS following multi-enzyme digestion of the protein resulted in 100% sequence coverage. MS/MS analysis enabled sequencing of 87% of the protein structure. This percentage does not include the signal peptide that was not observed in our protein due to processing. No posttranslational modifications were detectable and no sequence conflicts were observed. Complete analysis, unambiguous identification and characterisation of this biologically important protein could be shown, which is relevant for the definition of a marker protein that has been described so far by immunochemical methods only. Complete analysis is of importance as it forms the basis for all future work on this protein and, moreover, may serve as an analytical tool for further studies. PMID:17497304
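
    A sequence-coverage figure such as the one reported in this abstract is commonly computed as the share of residues that fall inside at least one identified peptide. The sketch below illustrates that calculation only; the function name, the toy protein string, and the peptides are hypothetical and are not taken from the study.

```python
def sequence_coverage(protein: str, peptides: list[str]) -> float:
    """Return the percentage of residues in `protein` covered by at least one peptide."""
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue
        # Mark every occurrence of the peptide, including overlapping ones.
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Toy data (hypothetical, not the protein from the abstract).
protein = "MKTAYIAKQRQISFVK"
coverage = sequence_coverage(protein, ["MKTAYIAK", "QRQISFVK"])
partial = sequence_coverage(protein, ["MKTAYIAK"])
```

    Combining peptides from several digests, as the abstract describes for its four proteases, would correspond here to passing the pooled peptide list to a single call.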

  4. Sequences homologous to yeast mitochondrial and bacteriophage T3 and T7 RNA polymerases are widespread throughout the eukaryotic lineage.

    PubMed Central

    Cermakian, N; Ikeda, T M; Cedergren, R; Gray, M W

    1996-01-01

    Although mitochondria and chloroplasts are considered to be descendants of eubacteria-like endosymbionts, the mitochondrial RNA polymerase of yeast is a nucleus-encoded, single-subunit enzyme homologous to bacteriophage T3 and T7 RNA polymerases, rather than a multi-component, eubacterial-type alpha 2 beta beta' enzyme, as encoded in chloroplast DNA. To broaden our knowledge of the mitochondrial transcriptional apparatus, we have used a polymerase chain reaction (PCR) approach designed to amplify an internal portion of phage T3/T7-like RNA polymerase genes. Using this strategy, we have recovered sequences homologous to yeast mitochondrial and phage T3/T7 RNA polymerases from a phylogenetically broad range of multicellular and unicellular eukaryotes. These organisms display diverse patterns of mitochondrial genome organization and expression, and include species that separated from the main eukaryotic line early in the evolution of this lineage. In certain cases, we can deduce that PCR-amplified sequences, some of which contain small introns, are localized in nuclear DNA. We infer that the T3/T7-like RNA polymerase sequences reported here are likely derived from genes encoding the mitochondrial RNA polymerase in the organisms in which they occur, suggesting a phage T3/T7-like RNA polymerase was recruited to act in transcription in the mitochondrion at an early stage in the evolution of this organelle. PMID:8604305

  5. Neisseria meningitidis C114 contains silent, truncated pilin genes that are homologous to Neisseria gonorrhoeae pil sequences.

    PubMed Central

    Perry, A C; Nicolson, I J; Saunders, J R

    1988-01-01

    Neisseria meningitidis pili can be classified into two groups: those (referred to here as class I pili) which are similar to gonococcal pili in that they react with monoclonal antibody SM1 and those that are dissimilar to gonococcal pili in that they lack the SM1-reactive epitope (class II pili). Pilus expression in N. meningitidis C114, a class II pilus-producing isolate, was investigated. The sole genomic segment of this strain that bore extensive homology with the pilE locus of Neisseria gonorrhoeae P9 was cloned in Escherichia coli. The production of the pilus structural subunit (pilin) from this meningococcal segment could not be detected by immunological and coupled in vitro transcription-translation analyses. Nucleotide sequence analysis revealed the presence in the C114 genome of two variant, tandemly arranged pilin genes (copies 1 and 2). Copies 1 and 2 are partial pilin genes that constitute part of a silent meningococcal pilin gene (pil gene) region, designated pilS. Both copies are truncated, corresponding to variable domains of the gonococcal pilE gene but lacking homologous N-terminal coding sequences. Located within sequences surrounding copies 1 and 2 were several classes of repeated elements that are associated with pil loci in N. gonorrhoeae. PMID:2895102

  6. Partial amino acid sequence of fructose-1,6-bisphosphatase from the blue-green algae Synechococcus leopoliensis.

    PubMed

    Marcus, F; Latshaw, S P; Steup, M; Gerbling, K P

    1989-08-01

    Purified fructose-1,6-bisphosphatase from the cyanobacterium Synechococcus leopoliensis was S-carboxymethylated and cleaved with trypsin. The resulting peptides were purified by reversed-phase high performance liquid chromatography and the amino acid sequence of six of the purified peptides was determined by gas-phase microsequencing. The results revealed sequence homology with other fructose-1,6-bisphosphatases. The obtained sequence data provides information required for the design of oligonucleotide hybridization probes to screen existing libraries of cyanobacterial DNA. The determination of the amino acid sequence of cyanobacterial proteins may yield important information with respect to the endosymbiotic theory of evolution. PMID:2550924

  7. dRHP-PseRA: detecting remote homology proteins using profile-based pseudo protein sequence and rank aggregation.

    PubMed

    Chen, Junjie; Long, Ren; Wang, Xiao-Long; Liu, Bin; Chou, Kuo-Chen

    2016-01-01

    Protein remote homology detection is an important task in computational proteomics. Some computational methods have been proposed, which detect remote homology proteins based on different features and algorithms. As noted in previous studies, their predictive results are complementary to each other. Therefore, it is intriguing to explore whether these methods can be combined into one package so as to further enhance the performance power and application convenience. In view of this, we introduced a protein representation called profile-based pseudo protein sequence to extract the evolutionary information from the relevant profiles. Based on the concept of pseudo proteins, a new predictor, called "dRHP-PseRA", was developed by combining four state-of-the-art predictors (PSI-BLAST, HHblits, Hmmer, and Coma) via the rank aggregation approach. Cross-validation tests on a SCOP benchmark dataset have demonstrated that the new predictor has remarkably outperformed any of the existing methods for the same purpose on ROC50 scores. Accordingly, it is anticipated that dRHP-PseRA holds very high potential to become a useful high throughput tool for detecting remote homology proteins. For the convenience of most experimental scientists, a web-server for dRHP-PseRA has been established at http://bioinformatics.hitsz.edu.cn/dRHP-PseRA/. PMID:27581095
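
    The abstract says that the ranked outputs of four predictors are combined by rank aggregation but does not state which aggregation rule is used. As a minimal sketch of the general idea, the example below substitutes a simple Borda count; the predictor labels, hit identifiers, and function name are hypothetical, and this is not presented as the method implemented in dRHP-PseRA.

```python
from collections import defaultdict


def borda_aggregate(rankings: dict[str, list[str]]) -> list[str]:
    """Merge several ranked hit lists into one consensus ranking (Borda count).

    Each list is ordered best-first. A hit at position `pos` earns
    `n - pos` points, where `n` is the number of distinct hits overall;
    a hit missing from a list earns nothing from that list.
    """
    candidates = set().union(*rankings.values())
    n = len(candidates)
    scores: dict[str, float] = defaultdict(float)
    for ranked in rankings.values():
        for pos, hit in enumerate(ranked):
            scores[hit] += n - pos
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(candidates, key=lambda h: (-scores[h], h))


# Toy ranked lists from four hypothetical predictors.
rankings = {
    "predictor_a": ["p1", "p2", "p3"],
    "predictor_b": ["p2", "p1", "p4"],
    "predictor_c": ["p2", "p3", "p1"],
    "predictor_d": ["p1", "p2"],
}
consensus = borda_aggregate(rankings)
```

    Borda counting is chosen here only because it is short and easy to verify by hand; other aggregation rules would slot into the same interface.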

  8. Structural and sequence similarities of hydra xeroderma pigmentosum A protein to human homolog suggest early evolution and conservation.

    PubMed

    Barve, Apurva; Ghaskadbi, Saroj; Ghaskadbi, Surendra

    2013-01-01

    Xeroderma pigmentosum group A (XPA) is a protein that binds to damaged DNA, verifies presence of a lesion, and recruits other proteins of the nucleotide excision repair (NER) pathway to the site. Though its homologs from yeast, Drosophila, humans, and so forth are well studied, XPA has not so far been reported from protozoa and lower animal phyla. Hydra is a fresh-water cnidarian with a remarkable capacity for regeneration and apparent lack of organismal ageing. Cnidarians are among the first metazoa with a defined body axis, tissue grade organisation, and nervous system. We report here for the first time presence of XPA gene in hydra. Putative protein sequence of hydra XPA contains nuclear localization signal and bears the zinc-finger motif. It contains two conserved Pfam domains and various characterized features of XPA proteins like regions for binding to excision repair cross-complementing protein-1 (ERCC1) and replication protein A 70 kDa subunit (RPA70) proteins. Hydra XPA shows a high degree of similarity with vertebrate homologs and clusters with deuterostomes in phylogenetic analysis. Homology modelling corroborates the very close similarity between hydra and human XPA. The protein thus most likely functions in hydra in the same manner as in other animals, indicating that it arose early in evolution and has been conserved across animal phyla. PMID:24083246

  9. Structural and Sequence Similarities of Hydra Xeroderma Pigmentosum A Protein to Human Homolog Suggest Early Evolution and Conservation

    PubMed Central

    Ghaskadbi, Saroj

    2013-01-01

    Xeroderma pigmentosum group A (XPA) is a protein that binds to damaged DNA, verifies presence of a lesion, and recruits other proteins of the nucleotide excision repair (NER) pathway to the site. Though its homologs from yeast, Drosophila, humans, and so forth are well studied, XPA has not so far been reported from protozoa and lower animal phyla. Hydra is a fresh-water cnidarian with a remarkable capacity for regeneration and apparent lack of organismal ageing. Cnidarians are among the first metazoa with a defined body axis, tissue grade organisation, and nervous system. We report here for the first time presence of XPA gene in hydra. Putative protein sequence of hydra XPA contains nuclear localization signal and bears the zinc-finger motif. It contains two conserved Pfam domains and various characterized features of XPA proteins like regions for binding to excision repair cross-complementing protein-1 (ERCC1) and replication protein A 70 kDa subunit (RPA70) proteins. Hydra XPA shows a high degree of similarity with vertebrate homologs and clusters with deuterostomes in phylogenetic analysis. Homology modelling corroborates the very close similarity between hydra and human XPA. The protein thus most likely functions in hydra in the same manner as in other animals, indicating that it arose early in evolution and has been conserved across animal phyla. PMID:24083246

  10. dRHP-PseRA: detecting remote homology proteins using profile-based pseudo protein sequence and rank aggregation

    PubMed Central

    Chen, Junjie; Long, Ren; Wang, Xiao-long; Liu, Bin; Chou, Kuo-Chen

    2016-01-01

    Protein remote homology detection is an important task in computational proteomics. Some computational methods have been proposed, which detect remote homology proteins based on different features and algorithms. As noted in previous studies, their predictive results are complementary to each other. Therefore, it is intriguing to explore whether these methods can be combined into one package so as to further enhance the performance power and application convenience. In view of this, we introduced a protein representation called profile-based pseudo protein sequence to extract the evolutionary information from the relevant profiles. Based on the concept of pseudo proteins, a new predictor, called “dRHP-PseRA”, was developed by combining four state-of-the-art predictors (PSI-BLAST, HHblits, Hmmer, and Coma) via the rank aggregation approach. Cross-validation tests on a SCOP benchmark dataset have demonstrated that the new predictor has remarkably outperformed any of the existing methods for the same purpose on ROC50 scores. Accordingly, it is anticipated that dRHP-PseRA holds very high potential to become a useful high throughput tool for detecting remote homology proteins. For the convenience of most experimental scientists, a web-server for dRHP-PseRA has been established at http://bioinformatics.hitsz.edu.cn/dRHP-PseRA/. PMID:27581095

  11. Top-Down-Assisted Bottom-Up Method for Homologous Protein Sequencing: Hemoglobin from 33 Bird Species

    NASA Astrophysics Data System (ADS)

    Song, Yang; Laskay, Ünige A.; Vilcins, Inger-Marie E.; Barbour, Alan G.; Wysocki, Vicki H.

    2015-11-01

    Ticks are vectors for disease transmission because they are indiscriminate in their feeding on multiple vertebrate hosts, transmitting pathogens between their hosts. Identifying the hosts on which ticks have fed is important for disease prevention and intervention. We have previously shown that hemoglobin (Hb) remnants from a host on which a tick fed can be used to reveal the host's identity. For the present research, blood was collected from 33 bird species that are common in the U.S. as hosts for ticks but that have unknown Hb sequences. A top-down-assisted bottom-up mass spectrometry approach with a customized searching database, based on variability in known bird hemoglobin sequences, has been devised to facilitate fast and complete sequencing of hemoglobin from birds with unknown sequences. These hemoglobin sequences will be added to a hemoglobin database and used for tick host identification. The general approach has the potential to sequence any set of homologous proteins completely in a rapid manner.

  12. Accuracy of sequence alignment and fold assessment using reduced amino acid alphabets.

    PubMed

    Melo, Francisco; Marti-Renom, Marc A

    2006-06-01

    Reduced or simplified amino acid alphabets group the 20 naturally occurring amino acids into a smaller number of representative protein residues. To date, several reduced amino acid alphabets have been proposed, which have been derived and optimized by a variety of methods. The resulting reduced amino acid alphabets have been applied to pattern recognition, generation of consensus sequences from multiple alignments, protein folding, and protein structure prediction. In this work, amino acid substitution matrices and statistical potentials were derived based on several reduced amino acid alphabets and their performance assessed in a large benchmark for the tasks of sequence alignment and fold assessment of protein structure models, using as a reference frame the standard alphabet of 20 amino acids. The results showed that a large reduction in the total number of residue types does not necessarily translate into a significant loss of discriminative power for sequence alignment and fold assessment. Therefore, some definitions of a few residue types are able to encode most of the relevant sequence/structure information that is present in the 20 standard amino acids. Based on these results, we suggest that the use of reduced amino acid alphabets may help increase the accuracy of current substitution matrices and statistical potentials for the prediction of protein structure of remote homologs. PMID:16506243
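
    To make the idea of a reduced alphabet concrete, the sketch below maps the 20 standard residues onto five groups and compares two pre-aligned sequences before and after reduction. The particular grouping is a hypothetical illustration based on rough physicochemical classes; it is not one of the alphabets evaluated in the study, and the function names are assumptions.

```python
# Hypothetical five-group reduction; each group is represented by its first letter.
GROUPS = {
    "aliphatic": "AVLIMC",
    "aromatic": "FWYH",
    "polar": "STNQ",
    "charged": "KRDE",
    "special": "GP",
}
REDUCE = {aa: members[0] for members in GROUPS.values() for aa in members}


def reduce_sequence(seq: str) -> str:
    """Rewrite a protein sequence in the reduced alphabet ('X' for unknown residues)."""
    return "".join(REDUCE.get(aa, "X") for aa in seq.upper())


def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Toy pair: dissimilar in the 20-letter alphabet, identical once reduced.
seq_a, seq_b = "ALKDG", "VIREG"
full_identity = identity(seq_a, seq_b)
reduced_identity = identity(reduce_sequence(seq_a), reduce_sequence(seq_b))
```

    The toy pair shows why a reduced alphabet can expose similarity between remote homologs: conservative substitutions collapse onto the same symbol.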

  13. Detection of nucleic acid sequences by invader-directed cleavage

    DOEpatents

    Brow, Mary Ann D.; Hall, Jeff Steven Grotelueschen; Lyamichev, Victor; Olive, David Michael; Prudent, James Robert

    1999-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The 5' nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge.

  14. Amino acid sequences of lysozymes newly purified from invertebrates imply wide distribution of a novel class in the lysozyme family.

    PubMed

    Ito, Y; Yoshikawa, A; Hotani, T; Fukuda, S; Sugimura, K; Imoto, T

    1999-01-01

    Lysozymes were purified from three invertebrates: a marine bivalve, a marine conch, and an earthworm. The purified lysozymes all showed a similar molecular weight of 13 kDa on SDS/PAGE. Their N-terminal sequences up to the 33rd residue determined here were apparently homologous among them; in addition, they had a homology with a partial sequence of a starfish lysozyme which had been reported before. The complete sequence of the bivalve lysozyme was determined by peptide mapping and subsequent sequence analysis. This was composed of 123 amino acids including as many as 14 cysteine residues and did not show a clear homology with the known types of lysozymes. However, the homology search of this protein on the protein or nucleic acid database revealed two homologous proteins. One of them was a gene product, CELF22 A3.6 of C. elegans, which was a functionally unknown protein. The other was an isopeptidase of a medicinal leech, named destabilase. Thus, a new type of lysozyme found in at least four species across the three classes of the invertebrates demonstrates a novel class of protein/lysozyme family in invertebrates. The bivalve lysozyme, first characterized here, showed extremely high protein stability and hen lysozyme-like enzymatic features. PMID:9914527

  15. The amino acid sequence of protein SCMK-B2A from the high-sulphur fraction of wool keratin

    PubMed Central

    Elleman, T. C.

    1972-01-01

    1. The amino acid sequence of protein SCMK-B2A, a reduced and S-carboxymethylated protein from the high-sulphur fraction of wool, has been determined. 2. This protein of 171 amino acid residues displays both a high degree of internal homology and extensive external homology with other members of the SCMK-B2 group of proteins. 3. Evidence is presented which suggests that the SCMK-B2 group of proteins are produced by separate non-allelic genes. PMID:4679226

  16. DNA sequence and genetic analysis of the Rhodobacter capsulatus nifENX gene region: homology between NifX and NifB suggests involvement of NifX in processing of the iron-molybdenum cofactor.

    PubMed

    Moreno-Vivian, C; Schmehl, M; Masepohl, B; Arnold, W; Klipp, W

    1989-04-01

    Rhodobacter capsulatus genes homologous to Klebsiella pneumoniae nifE, nifN and nifX were identified by DNA sequence analysis of a 4282 bp fragment of nif region A. Four open reading frames coding for a 51,188 (NifE), a 49,459 (NifN), a 17,459 (NifX) and a 17,472 (ORF4) dalton protein were detected. A typical NifA activated consensus promoter and two imperfect putative NifA binding sites were located in the 377 bp sequence in front of the nifE coding region. Comparison of the deduced amino acid sequences of R. capsulatus NifE and NifN revealed homologies not only to analogous gene products of other organisms but also to the alpha and beta subunits of the nitrogenase iron-molybdenum protein. In addition, the R. capsulatus nifE and nifN proteins shared considerable homology with each other. The map position of nifX downstream of nifEN corresponded in R. capsulatus and K. pneumoniae and the deduced molecular weights of both proteins were nearly identical. Nevertheless, R. capsulatus NifX was more related to the C-terminal end of NifY from K. pneumoniae than to NifX. A small domain of approximately 33 amino acid residues showing the highest degree of homology between NifY and NifX was also present in all nifB proteins analyzed so far. This homology indicated an evolutionary relationship of nifX, nifY and nifB and also suggested that NifX and NifY might play a role in maturation and/or stability of the iron-molybdenum cofactor. The open reading frame (ORF4) downstream of nifX in R. capsulatus is also present in Azotobacter vinelandii but not in K. pneumoniae. (ABSTRACT TRUNCATED AT 250 WORDS) PMID:2747620

  17. Complete cDNA sequence of human complement C1s and close physical linkage of the homologous genes C1s and C1r

    SciTech Connect

    Tosi, M.; Duponchel, C.; Meo, T.; Julier, C.

    1987-12-29

    Overlapping molecular clones encoding the complement subcomponent C1s were isolated from a human liver cDNA library. The nucleotide sequence reconstructed from these clones spans about 85% of the length of the liver C1s messenger RNAs, which occur in three distinct size classes around 3 kilobases in length. Comparisons with the sequence of C1r, the other enzymatic subcomponent of C1, reveal 40% amino acid identity and conservation of all the cysteine residues. Besides the serine protease domain, the following sequence motifs, previously described in C1r, were also found in C1s: (a) two repeats of the type found in the Ba fragment of complement factor B and in several other complement but also noncomplement proteins, (b) a cysteine-rich segment homologous to the repeats of epidermal growth factor precursor, and (c) a duplicated segment found only in C1r and C1s. Differences in each of these structural motifs provide significant clues for the interpretation of the functional divergence of these interacting serine protease zymogens. Hybridizations of C1r and C1s probes to restriction endonuclease fragments of genomic DNA demonstrate close physical linkage of the corresponding genes. The implications of this finding are discussed with respect to the evolution of C1r and C1s after their origin by tandem gene duplication and to the previously observed combined hereditary deficiencies of C1r and C1s.

  18. beta-Keratins in crocodiles reveal amino acid homology with avian keratins.

    PubMed

    Ye, Changjiang; Wu, Xiaobing; Yan, Peng; Amato, George

    2010-03-01

    The DNA sequences encoding beta-keratin have been obtained from Marsh Mugger (Crocodylus palustris) and Orinoco Crocodiles (Crocodylus intermedius). The deduced amino acid sequences show that these proteins are rich in glycine, proline and serine. The central region of the proteins is composed of two beta-folded regions and shows a high degree of identity with beta-keratins of aves and squamates. This central part is thought to be the site of polymerization to build the framework of beta-keratin filaments. It is believed that the beta-keratins in reptiles and birds share a common ancestry. Near the C-terminus, these beta-keratins contain a peptide rich in glycine-X and glycine-X-X, and the distinctive feature of the region is some 12-amino acid repeats, which are similar to the 13-amino acid repeats in chick scale keratin but absent from avian feather keratin. From our phylogenetic analysis, the beta-keratins in crocodile have a closer relationship with avian keratins than the other keratins in reptiles. PMID:19266314

  19. Hybridization and sequencing of nucleic acids using base pair mismatches

    DOEpatents

    Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

    2001-01-01

    Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.

  20. Comparative genomic survey, exon-intron annotation and phylogenetic analysis of NAT-homologous sequences in archaea, protists, fungi, viruses, and invertebrates

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We have previously published extensive genomic surveys [1-3], reporting NAT-homologous sequences in hundreds of sequenced bacterial, fungal and vertebrate genomes. We present here the results of our latest search of 2445 genomes, representing 1532 (70 archaeal, 1210 bacterial, 43 protist, 97 fungal,...

  1. Powerful Sequence Similarity Search Methods and In-Depth Manual Analyses Can Identify Remote Homologs in Many Apparently “Orphan” Viral Proteins

    PubMed Central

    Kuchibhatla, Durga B.; Chung, Betty Y. W.; Cook, Shelley; Schneider, Georg; Eisenhaber, Birgit

    2014-01-01

    The genome sequences of new viruses often contain many “orphan” or “taxon-specific” proteins apparently lacking homologs. However, because viral proteins evolve very fast, commonly used sequence similarity detection methods such as BLAST may overlook homologs. We analyzed a data set of proteins from RNA viruses characterized as “genus specific” by BLAST. More powerful methods developed recently, such as HHblits or HHpred (available through web-based, user-friendly interfaces), could detect distant homologs of a quarter of these proteins, suggesting that these methods should be used to annotate viral genomes. In-depth manual analyses of a subset of the remaining sequences, guided by contextual information such as taxonomy, gene order, or domain cooccurrence, identified distant homologs of another third. Thus, a combination of powerful automated methods and manual analyses can uncover distant homologs of many proteins thought to be orphans. We expect these methodological results to be also applicable to cellular organisms, since they generally evolve much more slowly than RNA viruses. As an application, we reanalyzed the genome of a bee pathogen, Chronic bee paralysis virus (CBPV). We could identify homologs of most of its proteins thought to be orphans; in each case, identifying homologs provided functional clues. We discovered that CBPV encodes a domain homologous to the Alphavirus methyltransferase-guanylyltransferase; a putative membrane protein, SP24, with homologs in unrelated insect viruses and insect-transmitted plant viruses having different morphologies (cileviruses, higreviruses, blunerviruses, negeviruses); and a putative virion glycoprotein, ORF2, also found in negeviruses. SP24 and ORF2 are probably major structural components of the virions. PMID:24155369

  2. Multilocus sequence typing of Lactobacillus casei reveals a clonal population structure with low levels of homologous recombination.

    PubMed

    Diancourt, Laure; Passet, Virginie; Chervaux, Christian; Garault, Peggy; Smokvina, Tamara; Brisse, Sylvain

    2007-10-01

    Robust genotyping methods for Lactobacillus casei are needed for strain tracking and collection management, as well as for population biology research. A collection of 52 strains initially labeled L. casei or Lactobacillus paracasei was first subjected to rplB gene sequencing together with reference strains of Lactobacillus zeae, Lactobacillus rhamnosus, and other species. Phylogenetic analysis showed that all 52 strains belonged to a single compact L. casei-L. paracasei sequence cluster, together with strain CIP107868 (= ATCC 334) but clearly distinct from L. rhamnosus and from a cluster with L. zeae and CIP103137(T) (= ATCC 393(T)). The strains were genotyped using amplified fragment length polymorphism, multilocus sequence typing based on internal portions of the seven housekeeping genes fusA, ileS, lepA, leuS, pyrG, recA, and recG, and tandem repeat variation (multilocus variable-number tandem repeats analysis [MLVA] using nine loci). Very high concordance was found between the three methods. Although amounts of nucleotide variation were low for the seven genes (pi ranging from 0.0038 to 0.0109), 3 to 12 alleles were distinguished, resulting in 31 sequence types. One sequence type (ST1) was frequent (17 strains), but most others were represented by a single strain. Attempts to subtype ST1 strains by MLVA, ribotyping, clustered regularly interspaced short palindromic repeat characterization, and single nucleotide repeat variation were unsuccessful. We found clear evidence for homologous recombination during the diversification of L. casei clones, including a putative intragenic import of DNA into one strain. Nucleotides were estimated to change four times more frequently by recombination than by mutation. However, statistical congruence between individual gene trees was retained, indicating that recombination is not frequent enough to disrupt the phylogenetic signal. The developed multilocus sequence typing scheme should be useful for future studies of L. casei
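
    In multilocus sequence typing, each strain is summarized by its allele numbers at the typed loci, and strains sharing the same allelic profile share a sequence type (ST). The sketch below illustrates that bookkeeping for the seven loci named in the abstract. The strain names and allele numbers are hypothetical, and numbering STs in order of first appearance is an assumption made for illustration; curated typing schemes assign ST numbers through a reference database.

```python
# The seven housekeeping loci named in the abstract.
LOCI = ("fusA", "ileS", "lepA", "leuS", "pyrG", "recA", "recG")


def assign_sequence_types(profiles: dict[str, tuple[int, ...]]) -> dict[str, int]:
    """Give each distinct allelic profile an ST number, in order of first appearance."""
    st_of_profile: dict[tuple[int, ...], int] = {}
    result: dict[str, int] = {}
    for strain, profile in profiles.items():
        if len(profile) != len(LOCI):
            raise ValueError(f"{strain}: expected {len(LOCI)} allele numbers")
        if profile not in st_of_profile:
            st_of_profile[profile] = len(st_of_profile) + 1
        result[strain] = st_of_profile[profile]
    return result


# Hypothetical allelic profiles, one allele number per locus in LOCI order.
profiles = {
    "strain_A": (1, 1, 1, 1, 1, 1, 1),
    "strain_B": (1, 1, 1, 1, 1, 1, 1),
    "strain_C": (1, 2, 1, 1, 3, 1, 1),
}
sequence_types = assign_sequence_types(profiles)
```

    Here strain_A and strain_B share one ST while strain_C, which differs at two loci, receives another.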

  3. Multilocus Sequence Typing of Lactobacillus casei Reveals a Clonal Population Structure with Low Levels of Homologous Recombination

    PubMed Central

    Diancourt, Laure; Passet, Virginie; Chervaux, Christian; Garault, Peggy; Smokvina, Tamara; Brisse, Sylvain

    2007-01-01

    Robust genotyping methods for Lactobacillus casei are needed for strain tracking and collection management, as well as for population biology research. A collection of 52 strains initially labeled L. casei or Lactobacillus paracasei was first subjected to rplB gene sequencing together with reference strains of Lactobacillus zeae, Lactobacillus rhamnosus, and other species. Phylogenetic analysis showed that all 52 strains belonged to a single compact L. casei-L. paracasei sequence cluster, together with strain CIP107868 (= ATCC 334) but clearly distinct from L. rhamnosus and from a cluster with L. zeae and CIP103137T (= ATCC 393T). The strains were genotyped using amplified fragment length polymorphism, multilocus sequence typing based on internal portions of the seven housekeeping genes fusA, ileS, lepA, leuS, pyrG, recA, and recG, and tandem repeat variation (multilocus variable-number tandem repeats analysis [MLVA] using nine loci). Very high concordance was found between the three methods. Although amounts of nucleotide variation were low for the seven genes (π ranging from 0.0038 to 0.0109), 3 to 12 alleles were distinguished, resulting in 31 sequence types. One sequence type (ST1) was frequent (17 strains), but most others were represented by a single strain. Attempts to subtype ST1 strains by MLVA, ribotyping, clustered regularly interspaced short palindromic repeat characterization, and single nucleotide repeat variation were unsuccessful. We found clear evidence for homologous recombination during the diversification of L. casei clones, including a putative intragenic import of DNA into one strain. Nucleotides were estimated to change four times more frequently by recombination than by mutation. However, statistical congruence between individual gene trees was retained, indicating that recombination is not frequent enough to disrupt the phylogenetic signal. The developed multilocus sequence typing scheme should be useful for future studies of L. casei

  4. 77 FR 65537 - Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-10-29

    ... Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request. SUMMARY: The United States....'' SUPPLEMENTARY INFORMATION: I. Abstract Patent applications that contain nucleotide and/or amino acid sequence disclosures must include a copy of the sequence listing in accordance with the requirements in 37 CFR...

  5. Nucleotide and derived amino acid sequences of the major porin of Comamonas acidovorans and comparison of porin primary structures.

    PubMed Central

    Gerbl-Rieger, S; Peters, J; Kellermann, J; Lottspeich, F; Baumeister, W

    1991-01-01

    The DNA sequence of the gene which codes for the major outer membrane porin (Omp32) of Comamonas acidovorans has been determined. The structural gene encodes a precursor consisting of 351 amino acid residues with a signal peptide of 19 amino acid residues. Comparisons with amino acid sequences of outer membrane proteins and porins from several other members of the class Proteobacteria and of the Chlamydia trachomatis porin and the Neurospora crassa mitochondrial porin revealed a motif of eight regions of local homology. The results of this analysis are discussed with regard to common structural features of porins. PMID:1848840

  6. A Neurospora crassa ribosomal protein gene, homologous to yeast CRY1, contains sequences potentially coordinating its transcription with rRNA genes.

    PubMed Central

    Tyler, B M; Harrison, K

    1990-01-01

    We have isolated and sequenced a Neurospora crassa ribosomal protein gene (designated crp-2) strongly homologous to the rp59 gene (CRY1) of yeast and the S14 ribosomal protein gene of mammals. The inferred sequence of the crp-2 protein is more homologous (83%) to the mammalian S14 sequence than to the yeast rp59 sequence (69%). The gene has three intervening sequences (IVSs), two of which are offset 7 bp from the position of IVSs in the mammalian genes. None correspond to the position of the IVS in the yeast gene. Crp-2 was mapped by RFLP analysis to the right arm of linkage group III. The 5' region of the gene contains three copies of a sequence, the Ribo box, previously shown to be required for transcription of both 5S and 40S rRNA genes. We speculate that the Ribo box may coordinate ribosomal protein and rRNA gene transcription. PMID:1977135
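    Percentages such as the 83% and 69% above are commonly obtained by counting matching positions in a pairwise alignment. The following is a minimal sketch under the assumption that the figure is percent identity over aligned, non-gap columns; the abstract does not state the convention used, and other denominators (for example, the shorter sequence length) give different values. The function name and the example strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # Skip gapped columns (assumed convention).
        compared += 1
        if residue_a == residue_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, not sequences from the paper.
print(round(percent_identity("MKT-LLVA", "MKSALLIA"), 1))
```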

  7. On the necessity of dissecting sequence similarity scores into segment-specific contributions for inferring protein homology, function prediction and annotation

    PubMed Central

    2014-01-01

    Background: Protein sequence similarities to any types of non-globular segments (coiled coils, low complexity regions, transmembrane regions, long loops, etc. where either positional sequence conservation is the result of a very simple, physically induced pattern or rather integral sequence properties are critical) are pertinent sources for mistaken homologies. Regretfully, these considerations regularly escape attention in large-scale annotation studies since, often, there is no substitute for manual handling of these cases. Quantitative criteria are required to suppress events of function annotation transfer as a result of false homology assignments. Results: The sequence homology concept is based on the similarity comparison between the structural elements, the basic building blocks for conferring the overall fold of a protein. We propose to dissect the total similarity score into fold-critical and other, remaining contributions and suggest that, for a valid homology statement, the fold-relevant score contribution should at least be significant on its own. As part of the article, we provide the DissectHMMER software program for dissecting HMMER2/3 scores into segment-specific contributions. We show that DissectHMMER reproduces HMMER2/3 scores with sufficient accuracy and that it is useful in automated decisions about homology for instructive sequence examples. To generalize the dissection concept for cases without 3D structural information, we find that a dissection based on alignment quality is an appropriate surrogate. The approach was applied to a large-scale study of SMART and PFAM domains in the space of seed sequences and in the space of UniProt/SwissProt. Conclusions: Sequence similarity score dissection with regard to fold-critical and other contributions systematically suppresses false hits and, additionally, recovers previously obscured homology relationships such as the one between aquaporins and formate/nitrite transporters that, so far, was only

  8. Progressive structure-based alignment of homologous proteins: Adopting sequence comparison strategies.

    PubMed

    Joseph, Agnel Praveen; Srinivasan, Narayanaswamy; de Brevern, Alexandre G

    2012-09-01

    Comparison of multiple protein structures has a broad range of applications in the analysis of protein structure, function and evolution. Multiple structure alignment tools (MSTAs) are necessary to obtain a simultaneous comparison of a family of related folds. In this study, we have developed a method for multiple structure comparison largely based on sequence alignment techniques. A widely used Structural Alphabet named Protein Blocks (PBs) was used to transform the information on 3D protein backbone conformation into a 1D sequence string. A progressive alignment strategy similar to CLUSTALW was adopted for multiple PB sequence alignment (mulPBA). Highly similar stretches identified by the pairwise alignments are given higher weights during the alignment. The residue equivalences from PB-based alignments are used to obtain a three-dimensional fit of the structures followed by an iterative refinement of the structural superposition. Systematic comparisons using benchmark datasets of MSTAs underline that the alignment quality is better than MULTIPROT, MUSTANG and the alignments in HOMSTRAD, in more than 85% of the cases. Comparison with other rigid-body and flexible MSTAs also indicates that mulPBA alignments are superior to most of the rigid-body MSTAs and highly comparable to the flexible alignment methods. PMID:22676903

  9. A novel antimicrobial protein isolated from potato (Solanum tuberosum) shares homology with an acid phosphatase.

    PubMed

    Feng, Jie; Yuan, Fenghua; Gao, Yin; Liang, Chenggang; Xu, Jin; Zhang, Changling; He, Liyuan

    2003-12-01

    The nucleotide and amino acid sequences for AP(1) will appear in the GenBank® and NCBI databases under accession number AY297449. A novel antimicrobial protein (AP(1)) was purified from leaves of the potato (Solanum tuberosum, variety MS-42.3) with a procedure involving ammonium sulphate fractionation, molecular sieve chromatography with Sephacryl S-200 and hydrophobic chromatography with Butyl-Sepharose using an FPLC system. The inhibition spectrum investigation showed that AP(1) had good inhibition activity against five different strains of Ralstonia solanacearum from potato or other crops, and two fungal pathogens, Rhizoctonia solani and Alternaria solani from potato. The full-length cDNA encoding AP(1) has been successfully cloned by screening a cDNA expression library of potato with an anti-AP(1) antibody and RACE (rapid amplification of cDNA ends) PCR. Determination of the nucleotide sequences revealed the presence of an open reading frame encoding 343 amino acids. At the C-terminus of AP(1) there is an ATP-binding domain, and the N-terminus exhibits 58% identity with an acid phosphatase from Mesorhizobium loti. SDS/PAGE and Western blotting analysis suggested that the AP(1) gene can be successfully expressed in Escherichia coli and recognized by an antibody against AP(1). The expressed protein also showed the same inhibition activity as the original AP(1) protein isolated from potato. We suggest that AP(1) most likely belongs to a new group of proteins with antimicrobial characteristics in vitro and functions in relation to phosphorylation and energy metabolism of plants. PMID:12927022

  10. Use of a structural alphabet to find compatible folds for amino acid sequences

    PubMed Central

    Mahajan, Swapnil; de Brevern, Alexandre G; Sanejouand, Yves-Henri; Srinivasan, Narayanaswamy; Offmann, Bernard

    2015-01-01

    The structural annotation of proteins with no detectable homologs of known 3D structure identified using sequence-search methods is a major challenge today. We propose an original method that computes the conditional probabilities for the amino-acid sequence of a protein to fit to known protein 3D structures using a structural alphabet, known as “Protein Blocks” (PBs). PBs constitute a library of 16 local structural prototypes that approximate every part of protein backbone structures. It is used to encode 3D protein structures into 1D PB sequences and to capture sequence-to-structure relationships. Our method relies on amino acid occurrence matrices, one for each PB, to score global and local threading of query amino acid sequences to protein folds encoded into PB sequences. It does not use any information from residue contacts or sequence-search methods or explicit incorporation of hydrophobic effect. The performance of the method was assessed with independent test datasets derived from SCOP 1.75A. With a Z-score cutoff that achieved 95% specificity (i.e., less than 5% false positives), global and local threading showed sensitivity of 64.1% and 34.2%, respectively. We further tested its performance on 57 difficult CASP10 targets that had no known homologs in PDB: 38 compatible templates were identified by our approach and 66% of these hits yielded correctly predicted structures. This method scales up well and offers promising perspectives for structural annotations at genomic level. It has been implemented in the form of a web-server that is freely available at http://www.bo-protscience.fr/forsa. PMID:25297700
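    The relationship between a Z-score cutoff, specificity and sensitivity described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: a hit is called when the score is strictly greater than the cutoff, the cutoff is chosen from the scores of known non-matching (negative) pairs, and all function names and scores are hypothetical.

```python
import math


def cutoff_for_specificity(negative_scores, target_specificity=0.95):
    """Smallest observed negative score that keeps at least the target
    fraction of negatives at or below the cutoff."""
    ranked = sorted(negative_scores)
    # round() guards against floating-point noise such as 18.999999999999996.
    needed = math.ceil(round(target_specificity * len(ranked), 9))
    return ranked[max(needed, 1) - 1]


def specificity(negative_scores, cutoff):
    """Fraction of negatives that are not called as hits."""
    return sum(score <= cutoff for score in negative_scores) / len(negative_scores)


def sensitivity(positive_scores, cutoff):
    """Fraction of true matches that are called as hits."""
    return sum(score > cutoff for score in positive_scores) / len(positive_scores)


# Hypothetical Z-scores (assumption for illustration only).
negatives = [-1.2, -0.8, -0.5, -0.3, -0.1, 0.0, 0.2, 0.4, 0.5, 0.7,
             0.9, 1.0, 1.1, 1.3, 1.4, 1.6, 1.7, 1.9, 2.0, 3.1]
positives = [2.5, 3.0, 1.0, 4.2, 0.3]

cutoff = cutoff_for_specificity(negatives, 0.95)
print(cutoff, specificity(negatives, cutoff), sensitivity(positives, cutoff))
```

    Fixing specificity first and then reporting sensitivity, as the abstract does, makes methods comparable at the same false-positive rate.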

  11. Use of a structural alphabet to find compatible folds for amino acid sequences.

    PubMed

    Mahajan, Swapnil; de Brevern, Alexandre G; Sanejouand, Yves-Henri; Srinivasan, Narayanaswamy; Offmann, Bernard

    2015-01-01

    The structural annotation of proteins with no detectable homologs of known 3D structure identified using sequence-search methods is a major challenge today. We propose an original method that computes the conditional probabilities for the amino-acid sequence of a protein to fit to known protein 3D structures using a structural alphabet, known as "Protein Blocks" (PBs). PBs constitute a library of 16 local structural prototypes that approximate every part of protein backbone structures. It is used to encode 3D protein structures into 1D PB sequences and to capture sequence-to-structure relationships. Our method relies on amino acid occurrence matrices, one for each PB, to score global and local threading of query amino acid sequences to protein folds encoded into PB sequences. It does not use any information from residue contacts or sequence-search methods or explicit incorporation of hydrophobic effect. The performance of the method was assessed with independent test datasets derived from SCOP 1.75A. With a Z-score cutoff that achieved 95% specificity (i.e., less than 5% false positives), global and local threading showed sensitivity of 64.1% and 34.2%, respectively. We further tested its performance on 57 difficult CASP10 targets that had no known homologs in PDB: 38 compatible templates were identified by our approach and 66% of these hits yielded correctly predicted structures. This method scales up well and offers promising perspectives for structural annotations at genomic level. It has been implemented in the form of a web-server that is freely available at http://www.bo-protscience.fr/forsa. PMID:25297700

  12. Suberoylanilide Hydroxamic Acid (SAHA) enhances olaparib activity by targeting homologous recombination DNA repair in ovarian cancer

    PubMed Central

    Konstantinopoulos, Panagiotis A.; Wilson, Andrew J.; Saskowski, Jeanette; Wass, Erica; Khabele, Dineo

    2015-01-01

    Objectives: Approximately 50% of serous epithelial ovarian cancers (EOC) contain molecular defects in homologous recombination (HR) DNA repair pathways. Poly(ADP-ribose) polymerase inhibitors (PARPi) have efficacy in HR-deficient, but not HR-proficient, EOC tumors as a single agent. Our goal was to determine whether the histone deacetylase inhibitor, suberoylanilide hydroxamic acid (SAHA), can sensitize HR-proficient ovarian cancer cells to the PARPi AZD-2281 (olaparib). Methods: Ovarian cancer cell lines (SKOV-3, OVCAR-8, NCI/ADR-Res, UWB1.289 BRCA1-null and UWB1.289 + BRCA1 wild-type) were treated with saline vehicle, olaparib, SAHA or olaparib/SAHA. Sulforhodamine B (SRB) assays assessed cytotoxicity, and immunofluorescence and Western blot assays assessed markers of apoptosis (cleaved PARP) and DNA damage (pH2AX and RAD51). Drug effects were also tested in SKOV-3 xenografts in nude mice. Affymetrix microarray experiments were performed in vehicle and SAHA-treated SKOV-3 cells. Results: In a microarray analysis, SAHA induced coordinated down-regulation of HR pathway genes, including RAD51 and BRCA1. Nuclear co-expression of RAD51 and pH2AX, a marker of efficient HR repair, was reduced approximately 40% by SAHA treatment alone and combined with olaparib. SAHA combined with olaparib induced apoptosis and pH2AX expression to a greater extent than either drug alone. Olaparib reduced cell viability at increasing concentrations and SAHA enhanced these effects in 4 of 5 cell lines, including BRCA1-null and wild-type cells, in vitro and in SKOV-3 xenografts in vivo. Conclusions: These results provide preclinical rationale for targeting DNA damage response pathways by combining small molecule PARPi with HDACi as a mechanism for reducing HR efficiency in ovarian cancer. PMID:24631446

  13. High-efficiency transformation of Pichia stipitis based on its URA3 gene and a homologous autonomous replication sequence, ARS2.

    PubMed Central

    Yang, V W; Marks, J A; Davis, B P; Jeffries, T W

    1994-01-01

    This paper describes the first high-efficiency transformation system for the xylose-fermenting yeast Pichia stipitis. The system includes integrating and autonomously replicating plasmids based on the gene for orotidine-5'-phosphate decarboxylase (URA3) and an autonomous replicating sequence (ARS) element (ARS2) isolated from P. stipitis CBS 6054. Ura- auxotrophs were obtained by selecting for resistance to 5-fluoroorotic acid and were identified as ura3 mutants by transformation with P. stipitis URA3. P. stipitis URA3 was cloned by its homology to Saccharomyces cerevisiae URA3, with which it is 69% identical in the coding region. P. stipitis ARS elements were cloned functionally through plasmid rescue. These sequences confer autonomous replication when cloned into vectors bearing the P. stipitis URA3 gene. P. stipitis ARS2 has features similar to those of the consensus ARS of S. cerevisiae and other ARS elements. Circular plasmids bearing the P. stipitis URA3 gene with various amounts of flanking sequences produced 600 to 8,600 Ura+ transformants per microgram of DNA by electroporation. Most transformants obtained with circular vectors arose without integration of vector sequences. One vector yielded 5,200 to 12,500 Ura+ transformants per microgram of DNA after it was linearized at various restriction enzyme sites within the P. stipitis URA3 insert. Transformants arising from linearized vectors produced stable integrants, and integration events were site specific for the genomic ura3 in 20% of the transformants examined. Plasmids bearing the P. stipitis URA3 gene and ARS2 element produced more than 30,000 transformants per microgram of plasmid DNA. Autonomously replicating plasmids were stable for at least 50 generations in selection medium and were present at an average of 10 copies per nucleus. PMID:7811063

  14. Predicting intrinsic disorder from amino acid sequence.

    PubMed

    Obradovic, Zoran; Peng, Kang; Vucetic, Slobodan; Radivojac, Predrag; Brown, Celeste J; Dunker, A Keith

    2003-01-01

    Blind predictions of intrinsic order and disorder were made on 42 proteins subsequently revealed to contain 9,044 ordered residues, 284 disordered residues in 26 segments of length 30 residues or less, and 281 disordered residues in 2 disordered segments of length greater than 30 residues. The accuracies of the six predictors used in this experiment ranged from 77% to 91% for the ordered regions and from 56% to 78% for the disordered segments. The average of the order and disorder predictions ranged from 73% to 77%. The prediction of disorder in the shorter segments was poor, from 25% to 66% correct, while the prediction of disorder in the longer segments was better, from 75% to 95% correct. Four of the predictors were composed of ensembles of neural networks. This enabled them to deal more efficiently with the large asymmetry in the training data through diversified sampling from the significantly larger ordered set and achieve better accuracy on ordered and long disordered regions. The exclusive use of long disordered regions for predictor training likely contributed to the disparity of the predictions on long versus short disordered regions, while averaging the output values over 61-residue windows to eliminate short predictions of order or disorder probably contributed to the even greater disparity for three of the predictors. This experiment supports the predictability of intrinsic disorder from amino acid sequence. PMID:14579347
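    The "average of the order and disorder predictions" above can be read as the mean of the two per-class accuracies, which differs from overall accuracy when ordered residues greatly outnumber disordered ones. The sketch below assumes that reading; the labels ("O" for ordered, "D" for disordered), function names and toy strings are hypothetical.

```python
def per_class_accuracy(true_labels, predicted_labels, label):
    """Fraction of residues with the given true label that were predicted correctly."""
    relevant = [(t, p) for t, p in zip(true_labels, predicted_labels) if t == label]
    if not relevant:
        raise ValueError(f"no residues with true label {label!r}")
    return sum(t == p for t, p in relevant) / len(relevant)


def average_of_class_accuracies(true_labels, predicted_labels):
    """Mean of the ordered ('O') and disordered ('D') accuracies."""
    ordered = per_class_accuracy(true_labels, predicted_labels, "O")
    disordered = per_class_accuracy(true_labels, predicted_labels, "D")
    return (ordered + disordered) / 2.0


# Hypothetical per-residue labels: 8 ordered and 2 disordered residues.
truth = "OOOOOOOODD"
guess = "OOOOOOODDO"
overall = sum(t == p for t, p in zip(truth, guess)) / len(truth)
print(overall, average_of_class_accuracies(truth, guess))
```

    With heavily imbalanced classes, a predictor that labels almost everything as ordered scores well on overall accuracy, so the class-averaged figure is the more informative summary.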

  15. CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction

    PubMed Central

    Cui, Xuefeng; Lu, Zhiwu; Wang, Sheng; Jing-Yan Wang, Jim; Gao, Xin

    2016-01-01

    Motivation: Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment, threading and alignment-free methods, protein homology detection remains a challenging open problem. Recently, network methods that try to find transitive paths in the protein structure space demonstrate the importance of incorporating network information of the structure space. Yet, current methods merge the sequence space and the structure space into a single space, and thus introduce inconsistency in combining different sources of information. Method: We present a novel network-based protein homology detection method, CMsearch, based on cross-modal learning. Instead of exploring a single network built from the mixture of sequence and structure space information, CMsearch builds two separate networks to represent the sequence space and the structure space. It then learns sequence–structure correlation by simultaneously taking sequence information, structure information, sequence space information and structure space information into consideration. Results: We tested CMsearch on two challenging tasks, protein homology detection and protein structure prediction, by querying all 8332 PDB40 proteins. Our results demonstrate that CMsearch is insensitive to the similarity metrics used to define the sequence and the structure spaces. By using HMM–HMM alignment as the sequence similarity metric, CMsearch clearly outperforms state-of-the-art homology detection methods and the CASP-winning template-based protein structure prediction methods. Availability and implementation: Our program is freely available for download from http://sfb.kaust.edu.sa/Pages/Software.aspx. Contact: xin.gao@kaust.edu.sa Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27307635

  16. Analysis of the functional domains of biosynthetic threonine deaminase by comparison of the amino acid sequences of three wild-type alleles to the amino acid sequence of biodegradative threonine deaminase.

    PubMed

    Taillon, B E; Little, R; Lawther, R P

    1988-03-31

    The nucleotide sequence of the gene, ilvA, for biosynthetic threonine deaminase (Tda) from Salmonella typhimurium was determined. The deduced amino acid sequence was compared with the deduced amino acid sequences of the biosynthetic Tda from Escherichia coli K-12 (ilvA) and Saccharomyces cerevisiae (ILV1) and the biodegradative Tda from E. coli K-12 (tdc). The comparison indicated the presence of two types of blocks of homologous amino acids. The first type of homology is in the N-terminal portion of all four isozymes of Tda and probably indicates amino acids involved in catalysis. The second type of homology is found in the C-terminal portion of the three biosynthetic isozymes and presumably is involved in either (i) the binding or interaction of the allosteric effector isoleucine with the enzyme, or (ii) subunit interactions. The sites of amino acid changes of two E. coli K-12 ilvA alleles with altered response to isoleucine are consistent with the conclusion that the C-terminal portion of biosynthetic Tda is involved in allosteric regulation. PMID:3290055

  17. Slr2019, lipid A transporter homolog, is essential for acidic tolerance in Synechocystis sp. PCC6803.

    PubMed

    Matsuhashi, Ayumi; Tahara, Hiroko; Ito, Yutaro; Uchiyama, Junji; Ogawa, Satoru; Ohta, Hisataka

    2015-08-01

    Living organisms must defend themselves against various environmental stresses. Extracellular polysaccharide-producing cells exhibit enhanced tolerance toward adverse environmental stress. In Synechocystis sp. PCC6803 (Synechocystis), lipopolysaccharide (LPS) may play a role in this protection. To examine the relationship between stress tolerance of Synechocystis and LPS, we focused on Slr2019 because Slr2019 is homologous to MsbA in Escherichia coli, which is related to LPS synthesis. First, to obtain a defective mutant of LPS, we constructed the slr2019 insertion mutant (slr2019) strain. Sodium deoxycholate-polyacrylamide gel electrophoresis indicated that the slr2019 strain did not synthesize normal LPS. Second, to clarify the participation of LPS in acid tolerance, wild type (WT) and the slr2019 strain were grown under acid stress; slr2019 strain growth was significantly weaker than WT growth. Third, to examine influences on stress tolerance, the slr2019 strain was grown under various stresses. Under salinity and temperature stress, the slr2019 strain grew significantly slower than WT. To confirm cell morphology, the cell shape and envelope of the slr2019 strain were observed by transmission electron microscopy; slr2019 cells contained more electron-transparent bodies than WT cells. Finally, to confirm whether the electron-transparent bodies are poly-3-hydroxybutyrate (PHB), the slr2019 strain was stained with Nile Blue A, a PHB detector, and observed by fluorescence microscopy. The PHB granule content ratios of WT and the slr2019 strain grown in BG-11 at pH 8.0 were 7.18% and 8.41%, respectively. At pH 6.0, the PHB granule content ratios of WT and the slr2019 strain were 2.99% and 2.60%, respectively. However, the PHB granule content ratios of WT and the slr2019 strain grown in BG-11N-reduced were 10.82% and 0.56%, respectively. Because PHB was significantly decreased in the slr2019 strain compared with WT under BG-11N-reduced, LPS synthesis may be related to PHB under particular conditions. These results indicated that Slr2019 is necessary for

  18. Ingi, a 5.2-kb dispersed sequence element from Trypanosoma brucei that carries half of a smaller mobile element at either end and has homology with mammalian LINEs.

    PubMed Central

    Kimmel, B E; ole-MoiYoi, O K; Young, J R

    1987-01-01

    A dispersed repetitive element named ingi, which is present in the genome of the protozoan parasite Trypanosoma brucei, is described. One complete 5.2-kilobase element and the ends of two others were sequenced. There were no direct or inverted terminal repeats. Rather, the ends consisted of two halves of a previously described 512-base-pair transposable element (G. Hasan, M.J. Turner, and J.S. Cordingley, Cell 37:333-341, 1984). Oligo(dA) tails and possible insertion site duplications suggested that ingi is a retroposon. The sequenced element appears to be a pseudogene copy of an original retroposon with one or more open reading frames occupying most of its length. Significant homologies of the encoded amino acid sequences with reverse transcriptases and mammalian long interspersed nuclear element sequences suggest a remote evolutionary origin for this kind of retroposon. PMID:3037321

  19. Conservation of the function counts: homologous neurons express sequence-related neuropeptides that originate from different genes.

    PubMed

    Neupert, Susanne; Huetteroth, Wolf; Schachtner, Joachim; Predel, Reinhard

    2009-11-01

    By means of single-cell matrix-assisted laser desorption/ionization time-of-flight mass spectrometry, we analysed neuropeptide expression in all FXPRLamide/pheromone biosynthesis activating neuropeptide synthesizing neurons of the adult tobacco hawk moth, Manduca sexta. Mass spectra clearly suggest a completely identical processing of the pheromone biosynthesis activating neuropeptide-precursor in the mandibular, maxillary and labial neuromeres of the subesophageal ganglion. Only in the pban-neurons of the labial neuromere, products of two neuropeptide genes, namely the pban-gene and the capa-gene, were detected. Both of these genes expressed, amongst others, sequence-related neuropeptides (extended WFGPRLamides). We speculate that the expression of the two neuropeptide genes is a plesiomorphic character typical of moths. A detailed comparison of the neuroanatomy and the peptidome of the (two) pban-neurons in the labial neuromere of moths with homologous neurons of different insects indicates a strong conservation of the function of this neuroendocrine system. In other insects, however, the labial neurons either express products of the fxprl-gene or products of the capa-gene. The processing of the respective genes is reduced to extended WFGPRLamides in each case and yields a unique peptidome in the labial cells. Thus, sequence-related messenger molecules are always produced in these cells and it seems that the respective neurons recruited different neuropeptide genes for this motif. PMID:19712058

  20. Logistic regression models to predict solvent accessible residues using sequence- and homology-based qualitative and quantitative descriptors applied to a domain-complete X-ray structure learning set

    PubMed Central

    Nepal, Reecha; Spencer, Joanna; Bhogal, Guneet; Nedunuri, Amulya; Poelman, Thomas; Kamath, Thejas; Chung, Edwin; Kantardjieff, Katherine; Gottlieb, Andrea; Lustig, Brooke

    2015-01-01

    A working example of relative solvent accessibility (RSA) prediction for proteins is presented. Novel logistic regression models with various qualitative descriptors that include amino acid type and quantitative descriptors that include 20- and six-term sequence entropy have been built and validated. A domain-complete learning set of over 1300 proteins is used to fit initial models with various sequence homology descriptors as well as query residue qualitative descriptors. Homology descriptors are derived from BLASTp sequence alignments, whereas the RSA values are determined directly from the crystal structure. The logistic regression models are fitted using dichotomous responses indicating buried or accessible solvent, with binary classifications obtained from the RSA values. The fitted models determine binary predictions of residue solvent accessibility with accuracies comparable to other less computationally intensive methods using the standard RSA threshold criteria 20 and 25% as solvent accessible. When an additional non-homology descriptor describing Lobanov–Galzitskaya residue disorder propensity is included, incremental improvements in accuracy are achieved with 25% threshold accuracies of 76.12 and 74.79% for the Manesh-215 and CASP(8+9) test sets, respectively. Moreover, the described software and the accompanying learning and validation sets allow students and researchers to explore the utility of RSA prediction with simple, physically intuitive models in any number of related applications. PMID:26664348
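    The modelling setup described above (RSA values binarized at a threshold, then a logistic regression fitted to the binary response) can be sketched with the standard library alone. This is a one-descriptor toy under stated assumptions, not the authors' software: the single descriptor stands in for sequence entropy, the gradient-descent fit, function names, and all numbers are hypothetical.

```python
import math


def binarize_rsa(rsa_percent, threshold=25.0):
    """Return 1 (accessible) if RSA meets the threshold, else 0 (buried)."""
    return 1 if rsa_percent >= threshold else 0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(features, labels, learning_rate=0.5, epochs=2000):
    """Fit weight and bias of a one-feature logistic model by batch gradient descent."""
    weight, bias = 0.0, 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(features, labels):
            error = sigmoid(weight * x + bias) - y  # Gradient of the log-loss.
            grad_w += error * x
            grad_b += error
        weight -= learning_rate * grad_w / n
        bias -= learning_rate * grad_b / n
    return weight, bias


def predict_accessible(x, weight, bias):
    """Probability that a residue with descriptor value x is solvent accessible."""
    return sigmoid(weight * x + bias)


# Hypothetical residues: descriptor values and RSA (%) taken from a structure.
entropy = [0.2, 0.5, 0.9, 1.4, 1.8, 2.3]
rsa = [3.0, 12.0, 22.0, 27.0, 45.0, 60.0]
labels = [binarize_rsa(value, threshold=25.0) for value in rsa]
weight, bias = fit_logistic(entropy, labels)
print(labels, predict_accessible(2.3, weight, bias) > 0.5)
```

    Changing `threshold` from 25.0 to 20.0 relabels the third toy residue as accessible, which shows why accuracies are reported separately for each threshold.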

  1. PSI/TM-Coffee: a web server for fast and accurate multiple sequence alignments of regular and transmembrane proteins using homology extension on reduced databases.

    PubMed

    Floden, Evan W; Tommaso, Paolo D; Chatzou, Maria; Magis, Cedrik; Notredame, Cedric; Chang, Jia-Ming

    2016-07-01

    The PSI/TM-Coffee web server performs multiple sequence alignment (MSA) of proteins by combining homology extension with a consistency-based alignment approach. Homology extension is performed with Position Specific Iterative (PSI) BLAST searches against a choice of redundant and non-redundant databases. The main novelty of this server is to allow databases of reduced complexity to rapidly perform homology extension. This server also allows the use of transmembrane protein (TMP) reference databases for even faster homology extension on this important category of proteins. Aside from an MSA, the server also outputs topological prediction of TMPs using the HMMTOP algorithm. Previous benchmarking of the method has shown this approach outperforms the most accurate alignment methods such as MSAProbs, Kalign, PROMALS, MAFFT, ProbCons and PRALINE™. The web server is available at http://tcoffee.crg.cat/tmcoffee. PMID:27106060

  2. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2002-01-01

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  3. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2006-07-04

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  4. Kit for detecting nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2001-01-01

    A kit is provided for detecting a target nucleic acid sequence in a sample, the kit comprising: a first hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the first hybridization probe including a first complexing agent for forming a binding pair with a second complexing agent; and a second hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the first hybridization probe does not selectively hybridize, the second hybridization probe including a detectable marker; a third hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the third hybridization probe including the same detectable marker as the second hybridization probe; and a fourth hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the third hybridization probe does not selectively hybridize, the fourth hybridization probe including the first complexing agent for forming a binding pair with the second complexing agent; wherein the first and second hybridization probes are capable of simultaneously hybridizing to the target sequence and the third and fourth hybridization probes are capable of simultaneously hybridizing to the target sequence, the detectable marker is not present on the first or fourth hybridization probes and the first, second, third, and fourth hybridization probes each include a competitive nucleic acid sequence which is sufficiently complementary to a third portion of the target sequence that the competitive sequences of the first, second, third, and fourth hybridization probes compete with each other to hybridize to the third portion of the

  5. The biosynthetic gene cluster for coronamic acid, an ethylcyclopropyl amino acid, contains genes homologous to amino acid-activating enzymes and thioesterases.

    PubMed Central

    Ullrich, M; Bender, C L

    1994-01-01

    Coronamic acid (CMA), an ethylcyclopropyl amino acid derived from isoleucine, functions as an intermediate in the biosynthesis of coronatine, a chlorosis-inducing phytotoxin produced by Pseudomonas syringae pv. glycinea PG4180. The DNA required for CMA biosynthesis (6.9 kb) was sequenced, revealing three distinct open reading frames (ORFs) which share a common orientation for transcription. The deduced amino acid sequence of a 2.7-kb ORF designated cmaA contained six core sequences and two conserved motifs which are present in a variety of amino acid-activating enzymes, including nonribosomal peptide synthetases. Furthermore, CmaA contained a spatial arrangement of histidine, aspartate, and arginine residues which are conserved in the ferrous active site of some nonheme iron(II) enzymes which catalyze oxidative cyclizations. The deduced amino acid sequence of a 1.2-kb ORF designated cmaT was related to thioesterases of both procaryotic and eucaryotic origins. These data suggest that CMA assembly is similar to the thiotemplate mechanism of nonribosomal peptide synthesis. No significant similarities between a 0.9-kb ORF designated cmaU and other database entries were found. The start sites of two transcripts required for CMA biosynthesis were identified in the present study. pRG960sd, a vector containing a promoterless glucuronidase gene, was used to localize and study the promoter regions upstream of the two transcripts. Data obtained in the present study indicate that CMA biosynthesis is regulated at the transcriptional level by temperature. Images PMID:8002582

  6. Relationships in the Caryophyllales as suggested by phylogenetic analyses of partial chloroplast DNA ORF2280 homolog sequences.

    PubMed

    Downie, S; Katz-Downie, D; Cho, K

    1997-02-01

    Phylogenetic relationships within the angiosperm order Caryophyllales were investigated by comparative sequencing of two portions of the highly conserved inverted repeat (totaling some 1100 base pairs) coinciding with the region occupied by ORF2280 in Nicotiana, the largest gene in the plastid genomes of most land plants. Data were obtained for 33 species in 11 families within the order and for one species each of Plumbaginaceae, Polygonaceae, and Nepenthaceae. These data, when analyzed along with previously published ORF (open reading frame) sequences from Nicotiana, Spinacia, Epifagus, and Pelargonium using parsimony, neighbor-joining, and maximum likelihood methods, reveal that: (1) Amaranthus, Celosia, and Froelichia (all Amaranthaceae) do not comprise a monophyletic group; (2) Amaranthus may be nested within a paraphyletic Chenopodiaceae; (3) Sarcobatus (Chenopodiaceae) is allied with Nyctaginaceae + Phytolaccaceae (the latter family excluding Stegnosperma but including Petiveria); and (4) Caryophyllaceae (with Corrigiola basal within the clade) are sister group to Chenopodiaceae + Amaranthaceae. Basal relations within the order remain obscure. Sequence divergence values in pairwise comparisons across all Caryophyllales taxa ranged from 0.1 to 5% of nucleotides. However, despite these low values, 23 insertion and deletion events were apparent, of which five were informative phylogenetically and bolstered several of the relationships listed above. A polymerase chain reaction (PCR) survey for ORF homolog length variants in representatives from 70 additional angiosperm families revealed major deletions, of 100 to 1400 base pairs, in 19 of these families. Although the ORF is located within the mutationally retarded inverted repeat region of most angiosperm chloroplast DNAs, this gene appears particularly prone to length mutation. PMID:21712205
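
    The pairwise sequence divergence quoted above (0.1 to 5% of nucleotides) can be illustrated with a minimal sketch. It assumes the simplest possible measure, the uncorrected proportion of differing sites in an existing alignment, with gapped positions skipped so that indels are counted separately; the abstract does not say which distance measure the authors used. The function name `p_distance` and the two toy sequences are hypothetical and are not data from the study.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Percent of aligned, ungapped nucleotide positions that differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns; indels are tallied separately
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * mismatches / compared


# Hypothetical toy alignment (one substitution, one gapped column).
taxon_1 = "ATGCCGTTAGACCTGA-TGC"
taxon_2 = "ATGCCGTTAGATCTGAATGC"
print(f"{p_distance(taxon_1, taxon_2):.1f}% divergence")
```

    Skipping gapped columns keeps substitutions and length mutations apart, which matches the way the abstract reports divergence values and indel events as separate observations.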

  7. Complete Unique Genome Sequence, Expression Profile, and Salivary Gland Tissue Tropism of the Herpesvirus 7 Homolog in Pigtailed Macaques

    PubMed Central

    Staheli, Jeannette P.; Dyen, Michael R.; Deutsch, Gail H.; Basom, Ryan S.; Fitzgibbon, Matthew P.; Lewis, Patrick

    2016-01-01

    ABSTRACT Human herpesvirus 6A (HHV-6A), HHV-6B, and HHV-7 are classified as roseoloviruses and are highly prevalent in the human population. Roseolovirus reactivation in an immunocompromised host can cause severe pathologies. While the pathogenic potential of HHV-7 is unclear, it can reactivate HHV-6 from latency and thus contributes to severe pathological conditions associated with HHV-6. Because of the ubiquitous nature of roseoloviruses, their roles in such interactions and the resulting pathological consequences have been difficult to study. Furthermore, the lack of a relevant animal model for HHV-7 infection has hindered a better understanding of its contribution to roseolovirus-associated diseases. Using next-generation sequencing analysis, we characterized the unique genome of an uncultured novel pigtailed macaque roseolovirus. Detailed genomic analysis revealed the presence of gene homologs to all 84 known HHV-7 open reading frames. Phylogenetic analysis confirmed that the virus is a macaque homolog of HHV-7, which we have provisionally named Macaca nemestrina herpesvirus 7 (MneHV7). Using high-throughput RNA sequencing, we observed that the salivary gland tissue samples from nine different macaques had distinct MneHV7 gene expression patterns and that the overall number of viral transcripts correlated with viral loads in parotid gland tissue and saliva. Immunohistochemistry staining confirmed that, like HHV-7, MneHV7 exhibits a natural tropism for salivary gland ductal cells. We also observed staining for MneHV7 in peripheral nerve ganglia present in salivary gland tissues, suggesting that HHV-7 may also have a tropism for the peripheral nervous system. Our data demonstrate that MneHV7-infected macaques represent a relevant animal model that may help clarify the causality between roseolovirus reactivation and diseases. IMPORTANCE Human herpesvirus 6A (HHV-6A), HHV-6B, and HHV-7 are classified as roseoloviruses. We have recently discovered that pigtailed

  8. Peptide mapping and amino acid sequencing of two catechol 1,2-dioxygenases (CD I1 and CD I2) from Acinetobacter lwoffii K24.

    PubMed

    Kim, S I; Ha, K S

    1997-10-31

    The partial amino acid sequences of two catechol 1,2-dioxygenases (CD I1 and CD I2) from Acinetobacter lwoffii K24 have been determined by analysis of peptides after cleavages with endopeptidase Lys-C, endopeptidase Glu-C, trypsin, and chemicals (cyanogen bromide and BNPS-skatole). They include 248 amino acid sequences (4 fragments) of CD I1 and 211 amino acid sequences (5 fragments) of CD I2. The two enzymes have more than 50% sequence homology with type I catechol 1,2-dioxygenases and less than 30% sequence homology with type II catechol 1,2-dioxygenases. The two enzymes have similar hydropathy profiles in the N-terminal region, suggesting that they have similar secondary structures. PMID:9387151

  9. Analysis and Annotation of Nucleic Acid Sequence

    SciTech Connect

    States, David J.

    2004-07-28

    The aims of this project were to develop improved methods for computational genome annotation and to apply these methods to improve the annotation of genomic sequence data with a specific focus on human genome sequencing. The project resulted in a substantial body of published work. Notable contributions of this project were the identification of basecalling and lane tracking as error processes in genome sequencing and contributions to improved methods for these steps in genome sequencing. This technology improved the accuracy and throughput of genome sequence analysis. Probabilistic methods for physical map construction were developed. Improved methods for sequence alignment, alternative splicing analysis, promoter identification and NF kappa B response gene prediction were also developed.

  10. Analysis and Annotation of Nucleic Acid Sequence

    SciTech Connect

    David J. States

    1998-08-01

    The aims of this project were to develop improved methods for computational genome annotation and to apply these methods to improve the annotation of genomic sequence data with a specific focus on human genome sequencing. The project resulted in a substantial body of published work. Notable contributions of this project were the identification of basecalling and lane tracking as error processes in genome sequencing and contributions to improved methods for these steps in genome sequencing. This technology improved the accuracy and throughput of genome sequence analysis. Probabilistic methods for physical map construction were developed. Improved methods for sequence alignment, alternative splicing analysis, promoter identification and NF kappa B response gene prediction were also developed.

  11. The world of beta- and gamma-peptides comprised of homologated proteinogenic amino acids and other components.

    PubMed

    Seebach, Dieter; Beck, Albert K; Bierbaum, Daniel J

    2004-08-01

    The origins of our nearly ten-year research program of chemical and biological investigations into peptides based on homologated proteinogenic amino acids are described. The road from the biopolymer poly[ethyl (R)-3-hydroxybutanoate] to the beta-peptides was primarily a step from organic synthesis methodology (the preparation of enantiomerically pure compounds (EPCs)) to supramolecular chemistry (higher-order structures maintained through non-covalent interactions). The performing of biochemical and biological tests on the beta- and gamma-peptides, which differ from natural peptides/proteins by a single or two additional CH(2) groups per amino acid, then led into bioorganic chemistry and medicinal chemistry. The individual chapters of this review article begin with descriptions of work on beta-amino acids, beta-peptides, and polymers (Nylon-3) that dates back to the 1960s, even to the times of Emil Fischer, but did not yield insights into structures or biological properties. The numerous, often highly physiologically active, or even toxic, natural products containing beta- and gamma-amino acid moieties are then presented. Chapters on the preparation of homologated amino acids with proteinogenic side chains, their coupling to provide the corresponding peptides, both in solution (including thioligation) and on the solid phase, their isolation by preparative HPLC, and their characterization by mass spectrometry (HR-MS and MS sequencing) follow. After that, their structures, predominantly determined by NMR spectroscopy in methanolic solution, are described: helices, pleated sheets, and turns, together with stack-, crankshaft-, paddlewheel-, and staircase-like patterns. 
    The presence of the additional C--C bonds in the backbones of the new peptides did not give rise to a chaotic increase in their secondary structures as many protein specialists might have expected: while there are indeed more structure types than are observed in the alpha-peptide realm - three different

  12. Increased fatty acid unsaturation and production of arachidonic acid by homologous over-expression of the mitochondrial malic enzyme in Mortierella alpina

    PubMed Central

    Hao, Guangfei; Du, Kai; Huang, Xiaoyun; Song, Yuanda; Gu, Zhennan; Wang, Lei; Zhang, Hao; Chen, Wei; Chen, Yong Q.

    2015-01-01

    Malic enzyme (ME) catalyses the oxidative decarboxylation of L-malate to pyruvate and provides NADPH for intracellular metabolism, such as fatty acid synthesis. Here, the mitochondrial ME (mME) gene from Mortierella alpina was homologously over-expressed. Compared with controls, fungal arachidonic acid (ARA; 20:4 n-6) content increased by 60% without affecting the total fatty acid content. Our results suggest that enhancing mME activity may be an effective means to increase industrial production of ARA in M. alpina. PMID:24863290

  13. The human and mouse homologs of the yeast RAD52 gene: cDNA cloning, sequence analysis, assignment to human chromosome 12p12.2-p13, and mRNA expression in mouse tissues

    SciTech Connect

    Shen, Z.; Chen, D.J.; Denison, K.

    1995-01-01

    The yeast Saccharomyces cerevisiae RAD52 gene is involved in DNA double-strand break repair and mitotic/meiotic recombination. The N-terminal amino acid sequence of yeast S. cerevisiae, Schizosaccharomyces pombe, and Kluyveromyces lactis and chicken is highly conserved. Using the technology of mixed oligonucleotide primed amplification of cDNA (MOPAC), two mouse RAD52 homologous cDNA fragments were amplified and sequenced. Subsequently, we have cloned the cDNA of the human and mouse homologs of yeast RAD52 gene by screening cDNA libraries using the identified mouse cDNA fragments. Sequence analysis of cDNA derived amino acid revealed a highly conserved N-terminus among human, mouse, chicken, and yeast RAD52 genes. The human RAD52 gene was assigned to chromosome 12p12.2-p13 by fluorescence in situ hybridization, R-banding, and DNA analysis of somatic cell hybrids. Unlike chicken RAD52 and mouse RAD51, no significant difference in mouse RAD52 mRNA level was found among mouse heart, brain, spleen, lung, liver, skeletal muscle, kidney, and testis. In addition to an approximately 1.9-kb RAD52 mRNA band that is present in all of the tested tissues, an extra mRNA species of approximately 0.85 kb was detectable in mouse testis. 40 refs., 7 figs., 1 tab.

  14. A novel nucleoid-associated protein of Mycobacterium tuberculosis is a sequence homolog of GroEL

    PubMed Central

    Basu, Debashree; Khare, Garima; Singh, Shashi; Tyagi, Anil; Khosla, Sanjeev; Mande, Shekhar C.

    2009-01-01

    The Mycobacterium tuberculosis genome sequence reveals remarkable absence of many nucleoid-associated proteins (NAPs), such as HNS, Hfq or DPS. In order to characterize the nucleoids of M. tuberculosis, we have attempted to identify NAPs, and report an interesting finding that a chaperonin-homolog, GroEL1, is nucleoid associated. We report that M. tuberculosis GroEL1 binds DNA with low specificity but high affinity, suggesting that it might have naturally evolved to bind DNA. We are able to demonstrate that GroEL1 can effectively function as a DNA-protecting agent against DNase I or hydroxyl-radicals. Moreover, Atomic Force Microscopic studies reveal that GroEL1 can condense a large DNA into a compact structure. We also provide in vivo evidence that includes presence of GroEL1 in purified nucleoids, in vivo crosslinking followed by Southern hybridizations and immunofluorescence imaging in M. tuberculosis confirming that GroEL1:DNA interactions occur in natural biological settings. These findings therefore reveal that M. tuberculosis GroEL1 has evolved to be associated with nucleoids. PMID:19528065

  15. Next generation sequencing identifies mutations in Atonal homolog 7 (ATOH7) in families with global eye developmental defects

    PubMed Central

    Khan, Kamron; Logan, Clare V.; McKibbin, Martin; Sheridan, Eamonn; Elçioglu, Nursel H.; Yenice, Ozlem; Parry, David A.; Fernandez-Fuentes, Narcis; Abdelhamed, Zakia I.A.; Al-Maskari, Ahmed; Poulter, James A.; Mohamed, Moin D.; Carr, Ian M.; Morgan, Joanne E.; Jafri, Hussain; Raashid, Yasmin; Taylor, Graham R.; Johnson, Colin A.; Inglehearn, Chris F.; Toomes, Carmel; Ali, Manir

    2012-01-01

    The atonal homolog 7 (ATOH7) gene encodes a transcription factor involved in determining the fate of retinal progenitor cells and is particularly required for optic nerve and ganglion cell development. Using a combination of autozygosity mapping and next generation sequencing, we have identified homozygous mutations in this gene, p.E49V and p.P18RfsX69, in two consanguineous families diagnosed with multiple ocular developmental defects, including severe vitreoretinal dysplasia, optic nerve hypoplasia, persistent fetal vasculature, microphthalmia, congenital cataracts, microcornea, corneal opacity and nystagmus. Most of these clinical features overlap with defects in the Norrin/β-catenin signalling pathway that is characterized by dysgenesis of the retinal and hyaloid vasculature. Our findings document Mendelian mutations within ATOH7 and imply a role for this molecule in the development of structures at the front as well as the back of the eye. This work also provides further insights into the function of ATOH7, especially its importance in retinal vascular development and hyaloid regression. PMID:22068589

  16. From Artificial Amino Acids to Sequence-Defined Targeted Oligoaminoamides.

    PubMed

    Morys, Stephan; Wagner, Ernst; Lächelt, Ulrich

    2016-01-01

    Artificial oligoamino acids with appropriate protecting groups can be used for the sequential assembly of oligoaminoamides on solid-phase. With the help of these oligoamino acids multifunctional nucleic acid (NA) carriers can be designed and produced in highly defined topologies. Here we describe the synthesis of the artificial oligoamino acid Fmoc-Stp(Boc3)-OH, the subsequent assembly into sequence-defined oligomers and the formulation of tumor-targeted plasmid DNA (pDNA) polyplexes. PMID:27436323

  17. Nucleotide and predicted amino acid sequences of cloned human and mouse preprocathepsin B cDNAs.

    PubMed Central

    Chan, S J; San Segundo, B; McCormick, M B; Steiner, D F

    1986-01-01

    Cathepsin B is a lysosomal thiol proteinase that may have additional extralysosomal functions. To further our investigations on the structure, mode of biosynthesis, and intracellular sorting of this enzyme, we have determined the complete coding sequences for human and mouse preprocathepsin B by using cDNA clones isolated from human hepatoma and kidney phage libraries. The nucleotide sequences predict that the primary structure of preprocathepsin B contains 339 amino acids organized as follows: a 17-residue NH2-terminal prepeptide sequence followed by a 62-residue propeptide region, 254 residues in mature (single chain) cathepsin B, and a 6-residue extension at the COOH terminus. A comparison of procathepsin B sequences from three species (human, mouse, and rat) reveals that the homology between the propeptides is relatively conserved with a minimum of 68% sequence identity. In particular, two conserved sequences in the propeptide that may be functionally significant include a potential glycosylation site and the presence of a single cysteine at position 59. Comparative analysis of the three sequences also suggests that processing of procathepsin B is a multistep process, during which enzymatically active intermediate forms may be generated. The availability of the cDNA clones will facilitate the identification of possible active or inactive intermediate processive forms as well as studies on the transcriptional regulation of the cathepsin B gene. PMID:3463996
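
    The domain arithmetic in this abstract (17 + 62 + 254 + 6 = 339 residues) can be checked with a short sketch. The segment lengths are taken from the abstract; the 1-based residue ranges it prints are an assumption that numbering starts at the first residue of the prepeptide, which may differ from the numbering used elsewhere in the paper (for example, for the cysteine at position 59). The helper name `segment_ranges` is hypothetical.

```python
# Segment lengths in residues, as stated in the abstract.
segments = [
    ("prepeptide", 17),
    ("propeptide", 62),
    ("mature cathepsin B", 254),
    ("COOH-terminal extension", 6),
]


def segment_ranges(parts):
    """Return (name, first, last) tuples using 1-based residue numbering.

    Assumption: numbering starts at residue 1 of the prepeptide.
    """
    ranges = []
    start = 1
    for name, length in parts:
        end = start + length - 1
        ranges.append((name, start, end))
        start = end + 1
    return ranges


ranges = segment_ranges(segments)
total = sum(length for _, length in segments)
for name, first, last in ranges:
    print(f"{name}: residues {first}-{last}")
print("total residues:", total)
```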

  18. Detecting frame shifts by amino acid sequence comparison.

    PubMed

    Claverie, J M

    1993-12-20

    Various amino acid substitution scoring matrices are used in conjunction with local alignments programs to detect regions of similarity and infer potential common ancestry between proteins. The usual scoring schemes derive from the implicit hypothesis that related proteins evolve from a common ancestor by the accumulation of point mutations and that amino acids tend to be progressively substituted by others with similar properties. However, other frequent single mutation events, like nucleotide insertion or deletion and gene inversion, change the translation reading frame and cause previously encoded amino acid sequences to become unrecognizable at once. Here, I derive five new types of scoring matrix, each capable of detecting a specific frame shift (deletion, insertion and inversion in 3 frames) and use them with a regular local alignments program to detect amino acid sequences that may have derived from alternative reading frames of the same nucleotide sequence. Frame shifts are inferred from the sole comparison of the protein sequences. The five scoring matrices were used with the BLASTP program to compare all the protein sequences in the Swissprot database. Surprisingly, the searches revealed hundreds of highly significant frame shift matches, of which many are likely to represent sequencing errors. Others provide some evidence that frame shift mutations might be used in protein evolution as a way to create new amino acid sequences from pre-existing coding regions. PMID:7903399
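
    Why a single-nucleotide deletion makes the encoded protein "unrecognizable at once" can be shown with a minimal sketch. It is not the scoring-matrix method derived in the paper; it only translates a hypothetical DNA string before and after deleting one base, using the standard genetic code, so the downstream change in amino acids is visible. The sequence and the names `translate`, `original`, and `shifted` are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate frame 1 of a DNA string; a trailing partial codon is dropped."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical coding sequence, not taken from any database entry.
original = "ATGGCTGAAAAGCTGTTC"
# Delete the single nucleotide at index 4; every downstream codon shifts.
shifted = original[:4] + original[5:]

print(translate(original))
print(translate(shifted))
```

    Only the first residue survives the deletion in this toy case, which is why ordinary substitution matrices cannot relate the two proteins and frame-specific matrices of the kind described in the abstract are needed.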

  19. Segments of amino acid sequence similarity in beta-amylases.

    PubMed

    Friedberg, F; Rhodes, C

    1988-01-01

    In alpha-amylases from animals, plants and bacteria and in beta-amylases from plants and bacteria a number of segments exhibit amino acid sequence similarity specific to the alpha or to the beta type, respectively. In the case of the beta-amylases the similar sequence regions are extensive and they are disrupted only by short interspersed dissimilar regions. Close to the C terminus, however, no such sequence similarity exists. PMID:2464171

  20. Complete amino acid sequence of the medium-chain S-acyl fatty acid synthetase thio ester hydrolase from rat mammary gland

    SciTech Connect

    Randhawa, Z.I.; Smith, S.

    1987-03-10

    The complete amino acid sequence of the medium-chain S-acyl fatty acid synthetase thio ester hydrolase (thioesterase II) from rat mammary gland is presented. Most of the sequence was derived by analysis of [14C]-labelled peptide fragments produced by cleavage at methionyl, glutamyl, lysyl, arginyl, and tryptophanyl residues. A small section of the sequence was deduced from a previously analyzed cDNA clone. The protein consists of 260 residues and has a blocked amino-terminal methionine and calculated M_r of 29,212. The carboxy-terminal sequence, verified by Edman degradation of the carboxy-terminal cyanogen bromide fragment and carboxypeptidase Y digestion of the intact thioesterase II, terminates with a serine residue and lacks three additional residues predicted by the cDNA sequence. The native enzyme contains three cysteine residues but no disulfide bridges. The active site serine residue is located at position 101. The rat mammary gland thioesterase II exhibits approximately 40% homology with a thioesterase from mallard uropygial gland, the sequence of which was recently determined by cDNA analysis. Thus the two enzymes may share similar structural features and a common evolutionary origin. The location of the active site in these thioesterases differs from that of other serine active site esterases; indeed, the enzymes do not exhibit any significant homology with other serine esterases, suggesting that they may constitute a separate new family of serine active site enzymes.

  1. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... acids are not intended to be embraced by this definition. Any amino acid sequence that contains post-translationally modified amino acids may be described as the amino acid sequence that is initially translated... sequence of four or more amino acids or an unbranched sequence of ten or more nucleotides....

  2. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... acids are not intended to be embraced by this definition. Any amino acid sequence that contains post-translationally modified amino acids may be described as the amino acid sequence that is initially translated... sequence of four or more amino acids or an unbranched sequence of ten or more nucleotides....

  3. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... acids are not intended to be embraced by this definition. Any amino acid sequence that contains post-translationally modified amino acids may be described as the amino acid sequence that is initially translated... sequence of four or more amino acids or an unbranched sequence of ten or more nucleotides....

  4. Computer analysis between nucleotide and amino acid sequences of bean golden mosaic virus and those of maize streak, wheat dwarf, chloris striate mosaic, and beet curly top viruses.

    PubMed

    Ikegami, M

    1989-01-01

    Bean golden mosaic virus (BGMV) DNA 1 and 2 have little sequence homology with maize streak virus (MSV), wheat dwarf virus (WDV), and chloris striate mosaic virus (CSMV) DNAs. BGMV DNA 1 and beet curly top virus (BCTV) DNA are closely related, whereas BGMV DNA 2 and BCTV DNA are not related. Direct amino acid homologies of predicted proteins between BGMV ORFs and MSV ORFs, WDV ORFs or CSMV ORFs were 40-50%. BGMV 1L1 and BCTV L1, and BGMV 1L3 and BCTV L4 were highly conserved. The sequence TAATATTAC was detected in the loops of hairpin structures of 5 gemini-viruses. PMID:2615677

  5. A method to find palindromes in nucleic acid sequences.

    PubMed

    Anjana, Ramnath; Shankar, Mani; Vaishnavi, Marthandan Kirti; Sekar, Kanagaraj

    2013-01-01

    Various types of sequences in the human genome are known to play important roles in different aspects of genomic functioning. Among these sequences, palindromic nucleic acid sequences are one such type that have been studied in detail and found to influence a wide variety of genomic characteristics. For a nucleotide sequence to be considered as a palindrome, its complementary strand must read the same in the opposite direction. For example, both the strands, i.e., the strand going from 5' to 3' and its complementary strand from 3' to 5', must be complementary. A typical nucleotide palindromic sequence would be TATA (5' to 3') and its complementary sequence from 3' to 5' would be ATAT. Thus, a new method has been developed using dynamic programming to fetch the palindromic nucleic acid sequences. The new method uses less memory and thereby it increases the overall speed and efficiency. The proposed method has been tested using the bacterial (3891 KB bases) and human chromosomal sequences (Chr-18: 74366 kb and Chr-Y: 25554 kb) and the computation time for finding the palindromic sequences is in milliseconds. PMID:23515654
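
    The palindrome definition in this abstract (a sequence that equals its own reverse complement, as with TATA) can be sketched as follows. This is a plain brute-force window scan, not the dynamic-programming method the paper describes, and it assumes sequences contain only A, C, G, and T. The function names and the `min_len`/`max_len` parameters are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse, giving the other strand 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindrome(seq: str) -> bool:
    """True when the sequence reads the same on both strands, 5' to 3'."""
    seq = seq.upper()
    return len(seq) > 0 and seq == reverse_complement(seq)


def find_palindromes(seq: str, min_len: int = 4, max_len: int = 12):
    """Brute-force scan returning (start_index, palindrome) pairs.

    Only even lengths are tried (min_len is assumed to be even): in an
    odd-length window the middle base would have to be its own complement.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq)):
        for length in range(min_len, max_len + 1, 2):
            end = start + length
            if end > len(seq):
                break
            window = seq[start:end]
            if is_palindrome(window):
                hits.append((start, window))
    return hits


print(is_palindrome("TATA"))
print(find_palindromes("CCGAATTCGG", min_len=6, max_len=6))
```

    The scan costs time proportional to sequence length times the number of window sizes times the window length, so it suits short inputs only; chromosome-scale searches of the kind reported in the abstract would need a more economical approach.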

  6. Gene-related strain variation of Staphylococcus aureus for homologous resistance response to acid stress.

    PubMed

    Lee, Soomin; Ahn, Sooyeon; Lee, Heeyoung; Kim, Won-Il; Kim, Hwang-Yong; Ryu, Jae-Gee; Kim, Se-Ri; Choi, Kyoung-Hee; Yoon, Yohan

    2014-10-01

    This study investigated the effect of adaptation of Staphylococcus aureus strains to the acidic condition of tomato in response to environmental stresses, such as heat and acid. S. aureus ATCC 13565, ATCC 14458, ATCC 23235, ATCC 27664, and NCCP10826 habituated in tomato extract at 35°C for 24 h were inoculated in tryptic soy broth. The culture suspensions were then subjected to heat challenge or acid challenge at 60°C and pH 3.0, respectively, for 60 min. In addition, transcriptional analysis using quantitative real-time PCR was performed to evaluate the expression level of acid-shock genes, such as clpB, zwf, nuoF, and gnd, from five S. aureus strains after the acid habituation of strains in tomato at 35°C for 15 min and 60 min in comparison with that of the nonhabituated strains. In comparison with the nonhabituated strains, the five tomato-habituated S. aureus strains did not show cross protection to heat, but tomato-habituated S. aureus ATCC 23235 showed acid resistance. In quantitative real-time-PCR analysis, the relative expression levels of acid-shock genes (clpB, zwf, nuoF, and gnd) were increased the most in S. aureus ATCC 23235 after 60 min of tomato habituation, but there was little difference in the expression levels among the five S. aureus strains after 15 min of tomato habituation. These results indicate that the variation of acid resistance of S. aureus is related to the expression of acid-shock genes during acid habituation. PMID:25285500

  7. Amino acid sequence repertoire of the bacterial proteome and the occurrence of untranslatable sequences.

    PubMed

    Navon, Sharon Penias; Kornberg, Guy; Chen, Jin; Schwartzman, Tali; Tsai, Albert; Puglisi, Elisabetta Viani; Puglisi, Joseph D; Adir, Noam

    2016-06-28

    Bioinformatic analysis of Escherichia coli proteomes revealed that all possible amino acid triplet sequences occur at their expected frequencies, with four exceptions. Two of the four underrepresented sequences (URSs) were shown to interfere with translation in vivo and in vitro. Enlarging the URS by a single amino acid resulted in increased translational inhibition. Single-molecule methods revealed stalling of translation at the entrance of the peptide exit tunnel of the ribosome, adjacent to ribosomal nucleotides A2062 and U2585. Interaction with these same ribosomal residues is involved in regulation of translation by longer, naturally occurring protein sequences. The E. coli exit tunnel has evidently evolved to minimize interaction with the exit tunnel and maximize the sequence diversity of the proteome, although allowing some interactions for regulatory purposes. Bioinformatic analysis of the human proteome revealed no underrepresented triplet sequences, possibly reflecting an absence of regulation by interaction with the exit tunnel. PMID:27307442
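
    The comparison of observed and expected amino acid triplet frequencies can be illustrated with a minimal sketch. The expected count here assumes residues occur independently at their overall background frequencies; the abstract does not state the null model the authors used, so this is an assumption. The toy protein strings and the names `triplet_counts` and `expected_count` are hypothetical.

```python
from collections import Counter


def triplet_counts(proteins):
    """Count every overlapping three-residue window across all proteins."""
    counts = Counter()
    for protein in proteins:
        protein = protein.upper()
        for i in range(len(protein) - 2):
            counts[protein[i:i + 3]] += 1
    return counts


def expected_count(triplet, proteins):
    """Expected count of a triplet if residues were independent (assumption)."""
    residues = Counter("".join(proteins).upper())
    total_residues = sum(residues.values())
    total_windows = sum(max(len(p) - 2, 0) for p in proteins)
    probability = 1.0
    for aa in triplet.upper():
        probability *= residues[aa] / total_residues
    return probability * total_windows


# Hypothetical toy "proteome", not real E. coli data.
proteins = ["MKKLA", "KKLM"]
counts = triplet_counts(proteins)
observed = counts["KKL"]
expected = expected_count("KKL", proteins)
print(f"KKL observed={observed} expected={expected:.3f}")
```

    A triplet whose observed count falls far below its expected count over a whole proteome would be a candidate underrepresented sequence in the sense used by the abstract.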

  8. Molecular cloning, sequencing and expression in Escherichia coli of the 25-kDa growth-related protein of Ehrlich ascites tumor and its homology to mammalian stress proteins.

    PubMed

    Gaestel, M; Gross, B; Benndorf, R; Strauss, M; Schunk, W H; Kraft, R; Otto, A; Böhm, H; Stahl, J; Drabsch, H

    1989-01-15

    The growth-related 25-kDa protein (p25) of Ehrlich ascites tumor (EAT) has been characterized by molecular cloning and sequencing of cDNA clones detected by hybridization with oligonucleotide probes synthesized according to the amino acid sequence of a tryptic peptide of p25. Detection of p25 mRNA in EAT of the exponential growth phase and of the stationary phase using cDNA-derived RNA probes demonstrated that the abundance of p25 mRNA is also growth-related. High-level expression of p25 in Escherichia coli has been established by oligonucleotide-directed mutagenesis of cDNA and insertion of the mutated cDNA into a T7-promoter expression vector. Recombinant p25 from the expressed cDNA sequence has been shown to comigrate with EAT p25 in electrophoresis and to react with antibodies against the EAT p25. On the amino acid level, p25 shows about 80% sequence homology to the human stress protein hsp27. Furthermore, p25 has similar isoforms of phosphorylation as demonstrated for small mammalian stress proteins from rat and human. From the results obtained, it is concluded that p25 is a mammalian stress protein, the abundance of which is related to growth characteristics of the Ehrlich ascites tumor. PMID:2645135

  9. DNA sequence analysis of a 5.27-kb direct repeat occurring adjacent to the regions of S-episome homology in maize mitochondria.

    PubMed Central

    Houchins, J P; Ginsburg, H; Rohrbaugh, M; Dale, R M; Schardl, C L; Hodge, T P; Lonsdale, D M

    1986-01-01

    The DNA sequence of the 5270-bp repeated DNA element from the mitochondrial genome of the fertile cytoplasm of maize has been determined. The repeat is a major site of recombination within the mitochondrial genome and sequences related to the R1(S1) and R2(S2) linear episomes reside immediately adjacent to the repeat. The terminal inverted repeats of the R1 and R2 homologous sequences form one of the two boundaries of the repeat. Frame-shift mutations have introduced 11 translation termination codons into the transcribed S2/R2 URFI gene. The repeated sequence, though recombinantly active, appears to serve no biological function. Images Fig. 7. PMID:3792299

  10. Homologous electron transport components fail to increase fatty acid hydroxylation in transgenic Arabidopsis thaliana

    PubMed Central

    Wayne, Laura L.; Browse, John

    2013-01-01

    Ricinoleic acid, a hydroxylated fatty acid (HFA) present in castor (Ricinus communis) seeds, is an important industrial commodity used in products ranging from inks and paints to polymers and fuels. However, due to the deadly toxin ricin and allergens also present in castor, it would be advantageous to produce ricinoleic acid in a different agricultural crop. Unfortunately, repeated efforts at heterologous expression of the castor fatty acid hydroxylase (RcFAH12) in the model plant Arabidopsis thaliana have produced only 17-19% HFA in the seed triacylglycerols (TAG), whereas castor seeds accumulate up to 90% ricinoleic acid in the endosperm TAG. RcFAH12 requires an electron supply from NADH:cytochrome b5 reductase (CBR1) and cytochrome b5 (Cb5) to synthesize ricinoleic acid. Previously, our laboratory found a mutation in the Arabidopsis CBR1 gene, cbr1-1, that caused an 85% decrease in HFA levels in the RcFAH12 Arabidopsis line. These results raise the possibility that electron supply to the heterologous RcFAH12 may limit the production of HFA. Therefore, we hypothesized that by heterologously expressing RcCb5, the reductant supply to RcFAH12 would be improved and lead to increased HFA accumulation in Arabidopsis seeds. Contrary to this proposal, heterologous expression of the top three RcCb5 candidates did not increase HFA accumulation. Furthermore, coexpression of RcCBR1 and RcCb5 in RcFAH12 Arabidopsis also did not increase HFA levels compared to the parental lines. These results demonstrate that the Arabidopsis electron transfer system is supplying sufficient reductant to RcFAH12 and that there must be other bottlenecks limiting the accumulation of HFA. PMID:24555099

  11. On Quantum Algorithm for Multiple Alignment of Amino Acid Sequences

    NASA Astrophysics Data System (ADS)

    Iriyama, Satoshi; Ohya, Masanori

    2009-02-01

    The alignment of genome sequences or amino acid sequences is one of the fundamental operations for the study of life. The usual computational complexity for the multiple alignment of N sequences with common length L by dynamic programming is O(L^N). This alignment is considered one of the NP problems, so it is desirable to find an efficient algorithm for the multiple alignment. Thus in this paper we propose a quantum algorithm for the multiple alignment based on the works [12, 1, 2] in which the NP-complete problem was shown to be a P problem by means of a quantum algorithm and chaos information dynamics.
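
The O(L^N) figure quoted above comes from the size of the dynamic-programming table, which has one axis per sequence. The following is a minimal illustrative sketch, not the algorithm of the paper: it scores a pairwise (N = 2) global alignment with the standard Needleman-Wunsch recurrence and counts table cells for larger N. The function names and the match/mismatch/gap values are assumptions chosen for illustration.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score for two sequences.

    Fills a (len(a)+1) x (len(b)+1) table, so the work is O(L^2) for
    two sequences of length L. Scoring values are illustrative only.
    """
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty string costs one gap per residue.
    for i in range(1, rows):
        table[i][0] = i * gap
    for j in range(1, cols):
        table[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = table[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            table[i][j] = max(diagonal, table[i - 1][j] + gap, table[i][j - 1] + gap)
    return table[-1][-1]


def dp_table_cells(length, n_sequences):
    """Number of cells in the N-dimensional table for N sequences of equal length."""
    return (length + 1) ** n_sequences


if __name__ == "__main__":
    print(global_alignment_score("GATTACA", "GATTACA"))
    for n in (2, 3, 4):
        print(n, dp_table_cells(100, n))
```

Each additional sequence multiplies the table size by roughly L, which is why exact dynamic programming becomes impractical beyond a handful of sequences.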

  12. Prebiotically plausible mechanisms increase compositional diversity of nucleic acid sequences

    PubMed Central

    Derr, Julien; Manapat, Michael L.; Rajamani, Sudha; Leu, Kevin; Xulvi-Brunet, Ramon; Joseph, Isaac; Nowak, Martin A.; Chen, Irene A.

    2012-01-01

    During the origin of life, the biological information of nucleic acid polymers must have increased to encode functional molecules (the RNA world). Ribozymes tend to be compositionally unbiased, as is the vast majority of possible sequence space. However, ribonucleotides vary greatly in synthetic yield, reactivity and degradation rate, and their non-enzymatic polymerization results in compositionally biased sequences. While natural selection could lead to complex sequences, molecules with some activity are required to begin this process. Was the emergence of compositionally diverse sequences a matter of chance, or could prebiotically plausible reactions counter chemical biases to increase the probability of finding a ribozyme? Our in silico simulations using a two-letter alphabet show that template-directed ligation and high concatenation rates counter compositional bias and shift the pool toward longer sequences, permitting greater exploration of sequence space and stable folding. We verified experimentally that unbiased DNA sequences are more efficient templates for ligation, thus increasing the compositional diversity of the pool. Our work suggests that prebiotically plausible chemical mechanisms of nucleic acid polymerization and ligation could predispose toward a diverse pool of longer, potentially structured molecules. Such mechanisms could have set the stage for the appearance of functional activity very early in the emergence of life. PMID:22319215

  13. The amino-acid sequence of kangaroo pancreatic ribonuclease.

    PubMed

    Gaastra, W; Welling, G W; Beintema, J J

    1978-05-01

    Red kangaroo (Macropus rufus) ribonuclease was isolated from pancreatic tissue by affinity chromatography. The amino acid sequence was determined by automatic sequencing of overlapping large fragments and by analysis of shorter peptides obtained by digestion with a number of proteolytic enzymes. The polypeptide chain consists of 122 amino acid residues. Compared to other ribonucleases, the N-terminal residue and residue 114 are deleted. In other pancreatic ribonucleases position 114 is occupied by a cis proline residue in an external loop at the surface of the molecule. Other remarkable substitutions are the presence of a tyrosine residue at position 123 instead of a serine which forms a hydrogen bond with the pyrimidine ring of a nucleotide substrate, and a number of hydrophobic-hydrophilic interchanges in the sequence 51-55, which forms part of an alpha-helix in bovine ribonuclease and exhibits few substitutions in the placental mammals. Kangaroo ribonuclease contains no carbohydrate, although the enzyme possesses a recognition site for carbohydrate attachment in the sequence Asn-Val-Thr (62-64). The enzyme differs at about 35-40% of the positions from all other mammalian pancreatic ribonucleases sequenced to date, which is in agreement with the early divergence between the marsupials and the placental mammals. From fragmentary data a tentative sequence of red-necked wallaby (Macropus rufogriseus) pancreatic ribonuclease has been derived. Eight differences with the kangaroo sequence were found. PMID:658039

  14. Assay of deoxyribonucleic acid homology using a single-strand-specific nuclease at 75 C.

    PubMed Central

    Barth, P T; Grinter, N J

    1975-01-01

    We investigated the conditions under which a crude preparation of endonuclease S1 gives maximal hydrolysis of denatured deoxyribonucleic acid (DNA) while giving minimal hydrolysis of native DNA. The hydrolysis was measured by filtering and determining the acid-insoluble reaction product using 3H-labeled substrates. We also investigated various parameters in making this measurement. Under appropriate conditions (in 1 mM ZnSO4, 0.168 M NaCl at pH 4.8) denatured DNA is hydrolyzed within 3% of completion whereas native DNA is essentially unaffected. The reaction was applied to assay plasmid DNA homo- and heteroduplexes, for which the method proves to be simple, fast, and reproducible. PMID:234416

  15. Nucleotide and predicted amino acid sequence of a cDNA clone encoding part of human transketolase.

    PubMed

    Abedinia, M; Layfield, R; Jones, S M; Nixon, P F; Mattick, J S

    1992-03-31

    Transketolase is a key enzyme in the pentose-phosphate pathway which has been implicated in the latent human genetic disease, Wernicke-Korsakoff syndrome. Here we report the cloning and partial characterisation of the coding sequences encoding human transketolase from a human brain cDNA library. The library was screened with oligonucleotide probes based on the amino acid sequence of proteolytic fragments of the purified protein. Northern blots showed that the transketolase mRNA is approximately 2.2 kb, close to the minimum expected, of which approximately 60% was represented in the largest cDNA clone. Sequence analysis of the transketolase coding sequences reveals a number of homologies with related enzymes from other species. PMID:1567394

  16. Unconventional amino acid sequence of the sun anemone (Stoichactis helianthus) polypeptide neurotoxin

    SciTech Connect

    Kem, W.; Dunn, B.; Parten, B.; Pennington, M.; Price, D.

    1986-05-01

    A 5000 dalton polypeptide neurotoxin (Sh-NI) purified by G50 Sephadex, P-cellulose, and SP-Sephadex chromatography was homogeneous by isoelectric focusing. Sh-NI was highly toxic to crayfish (LD50 0.6 μg/kg) but without effect upon mice at 15,000 μg/kg (i.p. injection). The reduced, 3H-carboxymethylated toxin and its fragments were subjected to automatic Edman degradation and the resulting PTH-amino acids were identified by HPLC, back hydrolysis, and scintillation counting. Peptides resulting from proteolytic (clostripain, staphylococcal protease) and chemical (tryptophan) cleavage were sequenced. The sequence is: AACKCDDEGPDIRTAPLTGTVDLGSCNAGWEKCASYYTIIADCCRKKK. This sequence differs considerably from the homologous Anemonia and Anthopleura toxins; many of the identical residues (6 half-cystines, G9, P10, R13, G19, G29, W30) are probably critical for folding rather than receptor recognition. However, the Sh-NI sequence closely resembles Radioanthus macrodactylus neurotoxin III and R. paumotensis II. The authors propose that Sh-NI and related Radioanthus toxins act upon a different site on the sodium channel.
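
The residue labels in the abstract (G9, P10, and so on) use 1-based numbering from the N-terminus. A short sketch along the following lines can be used to check such labels against the quoted sequence; the variable and function names are illustrative assumptions, and only the sequence string and the residue labels come from the abstract.

```python
# Sh-NI sequence exactly as quoted in the abstract (one-letter amino acid codes).
SH_NI = "AACKCDDEGPDIRTAPLTGTVDLGSCNAGWEKCASYYTIIADCCRKKK"

# Residues the abstract names as shared with the Anemonia and Anthopleura
# toxins, keyed by 1-based position.
NAMED_RESIDUES = {9: "G", 10: "P", 13: "R", 19: "G", 29: "G", 30: "W"}


def residue_at(sequence, position):
    """Return the residue at a 1-based position."""
    return sequence[position - 1]


def positions_of(sequence, residue):
    """Return all 1-based positions at which a residue occurs."""
    return [i + 1 for i, aa in enumerate(sequence) if aa == residue]


if __name__ == "__main__":
    print("length:", len(SH_NI))
    for pos, aa in sorted(NAMED_RESIDUES.items()):
        print(f"{aa}{pos}:", residue_at(SH_NI, pos) == aa)
    print("half-cystine positions:", positions_of(SH_NI, "C"))
```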

  17. [Analysis of DNA homology and 16S rDNA sequence of rhizobia, a new phenotypic subgroup, isolated from Xizang Autonomous Region of China].

    PubMed

    Wang, Su-ying; Yang, Xiao-li; Li, Hai-feng; Liu, Jie

    2006-02-01

    Numerical taxonomy studies showed that seven rhizobial strains isolated from the root nodules of the leguminous plants Trigonella spp. and Astragalus spp. growing in the Xizang Autonomous Region of China constitute a new phenotypic subgroup. Wide phenotypic and genotypic diversity among legume crops has been reported in this region, owing to its complex terrain and varied climate. The new phenotypic subgroup was further characterized to clarify its taxonomic position by DNA homology analysis and 16S rDNA gene sequencing. The G + C content of the DNA among members of the new subgroup ranged from 59.5 to 63.3 mol% as determined by Tm assay. The levels of DNA relatedness among the members of the new subgroup, determined by the DNA liquid hybridization method, were between 74.3% and 92.3%, while the level of DNA relatedness between the central strain XZ2-3 of the new subgroup and the type strains of known species of Rhizobium was less than 47.4%. These results indicated that the new phenotypic subgroup is a DNA homology group different from the described species of Rhizobium. Therefore, this new phenotypic subgroup was considered to be a new species in the genus Rhizobium, since strains of the same species generally exhibit levels of DNA homology ranging from 70 to 100%. A systematic identification method, 16S rDNA gene sequence comparison, was carried out to determine the phylogenetic relationships of the new subgroup with the described species of Rhizobium. The GenBank accession number for the 16S rDNA sequence of the central strain XZ2-3 of the new subgroup is DQ099745. The full-length 16S rDNA gene sequence was determined by the chain terminator technique and analyzed with PHYLIP. The phylogenetic trees were constructed using the program DRAWTREE. The phylogenetic analysis indicated that the new subgroup occupies an independent sub-branch in the phylogenetic tree. The sequence similarities between the central strain XZ2-3 and the closest relatives, strain R. leguminosarum USDA

  18. Cloning, DNA sequencing, and characterization of a nifD-homologous gene from the archaeon Methanosarcina barkeri 227 which resembles nifD1 from the eubacterium Clostridium pasteurianum.

    PubMed Central

    Chien, Y T; Zinder, S H

    1994-01-01

    L. Sibold, M. Henriquet, O. Possot, and J.-P. Aubert (Res. Microbiol. 142:5-12, 1991) cloned and sequenced two nifH-homologous open reading frames (ORFs) from Methanosarcina barkeri 227. Phylogenetic analysis of the deduced amino acid sequences of the nifH ORFs from M. barkeri showed that nifH1 clusters with nifH genes from alternative nitrogenases, while nifH2 clusters with nifH1 from the gram-positive eubacterium Clostridium pasteurianum. The N-terminal sequence of the purified nitrogenase component 2 (the nifH gene product) from M. barkeri was identical with that predicted for nifH2, and dot blot analysis of RNA transcripts indicated that nifH2 (and nifDK2) was expressed in M. barkeri when grown diazotrophically in Mo-containing medium. To obtain nifD2 from M. barkeri, a 4.7-kbp BamHI fragment of M. barkeri DNA was cloned which contained at least five ORFs, including nifH2, ORF105, and ORF125 (previously described by Sibold et al.), as well as nifD2 and part of nifK2. ORFnifD2 is 1,596 bp long and encodes 532 amino acid residues, while the nifK2 fragment is 135 bp long. The deduced amino acid sequences for nifD2 and the nifK2 fragment from M. barkeri cluster most closely with the corresponding nifDK1 gene products from C. pasteurianum. The predicted M. barkeri nifD2 product contains a 50-amino acid insert near the C terminus which has previously been found only in the clostridial nifD1 product. Previous biochemical and sequencing evidence indicates that the C. pasteurianum nitrogenase is the most divergent of known eubacterial Mo-nitrogenases, most likely representing a distinct nif gene family, which now also contains M. barkeri as a member. The similarity between the methanogen and clostridial nif sequences is especially intriguing in light of the recent findings of sequence similarities between gene products from archaea and from low-G+C gram-positive eubacteria for glutamate dehydrogenase, glutamine synthetase I, and heat shock protein 70. It is not clear

  19. Characterization of Group V Dubnium Homologs on DGA Extraction Chromatography Resin from Nitric and Hydrofluoric Acid Matrices

    SciTech Connect

    Despotopulos, J D; Sudowe, R

    2012-02-21

    somewhere between Nb and Pa. Much more recent studies have examined the properties of Db from HNO₃/HF matrices, and suggest Db forms complexes similar to those of Pa. Very little experimental work into the behavior of element 114 has been performed. Thermochromatography experiments of three atoms of element 114 indicate that element 114 is at least as volatile as Hg, At, and element 112. Lead was shown to deposit on gold at temperatures about 1000 C higher than the atoms of element 114. Results indicate a substantially increased stability of element 114. No liquid phase studies of element 114 or its homologs (Pb, Sn, Ge) or pseudo-homologs (Hg, Cd) have been performed. Theoretical predictions indicate that element 114 should have a much more stable +2 oxidation state and neutral state than Pb, which would result in element 114 being less reactive and less metallic than Pb. The relativistic effects on the 7p₁/₂ electrons are predicted to cause a diagonal relationship to be introduced into the periodic table. Therefore, 114²⁺ is expected to behave as if it were somewhere between Hg²⁺, Cd²⁺, and Pb²⁺. In this work two commercially available extraction chromatography resins are evaluated, one for the separation of Db homologs and pseudo-homologs from each other as well as from potential interfering elements such as Group IV Rf homologs and actinides, and the other for separation of element 114 homologs. One resin, Eichrom's DGA resin, contains an N,N,N',N'-tetra-n-octyldiglycolamide extractant, which separates analytes based on both size and charge characteristics of the solvated metal species, coated on an inert support. The DGA resin was examined for Db chemical systems, and shows a high degree of selectivity for tri-, tetra-, and hexavalent metal ions in multiple acid matrices with fast kinetics. The other resin, Eichrom's Pb resin, contains a di-t-butylcyclohexano 18-crown-6 extractant with isodecanol solvent, which separates

  20. Amino acid sequence of Salmonella typhimurium branched-chain amino acid aminotransferase.

    PubMed

    Feild, M J; Nguyen, D C; Armstrong, F B

    1989-06-13

    The complete amino acid sequence of the subunit of branched-chain amino acid aminotransferase (transaminase B, EC 2.6.1.42) of Salmonella typhimurium was determined. An Escherichia coli recombinant containing the ilvGEDAY gene cluster of Salmonella was used as the source of the hexameric enzyme. The peptide fragments used for sequencing were generated by treatment with trypsin, Staphylococcus aureus V8 protease, endoproteinase Lys-C, and cyanogen bromide. The enzyme subunit contains 308 residues and has a molecular weight of 33,920. To determine the coenzyme-binding site, the pyridoxal 5'-phosphate-containing enzyme was treated with tritiated sodium borohydride prior to trypsin digestion. Peptide map comparisons with an apoenzyme tryptic digest and monitoring of radioactivity incorporation allowed identification of the pyridoxylated peptide, which was then isolated and sequenced. The coenzyme-binding site is the lysyl residue at position 159. The amino acid sequence of Salmonella transaminase B is 97.4% identical with that of Escherichia coli, differing in only eight amino acid positions. Sequence comparisons of transaminase B to other known aminotransferase sequences revealed limited sequence similarity (24-33%) when conserved amino acid substitutions were allowed and alignments were forced to occur on the coenzyme-binding site. PMID:2669973
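
The 97.4% identity figure follows directly from the numbers in the abstract (308 residues, eight differing positions). The sketch below reproduces that arithmetic, assuming identity is computed over a gap-free alignment of all 308 positions; the function names are illustrative.

```python
def percent_identity(aligned_length, differing_positions):
    """Percent identity over a gap-free alignment of the given length."""
    return 100.0 * (aligned_length - differing_positions) / aligned_length


def mean_residue_mass(molecular_weight, n_residues):
    """Average mass per residue, in daltons."""
    return molecular_weight / n_residues


if __name__ == "__main__":
    # Values taken from the abstract: 308 residues, 8 differences, Mr 33,920.
    print(round(percent_identity(308, 8), 1))
    print(round(mean_residue_mass(33920, 308), 1))
```

The mean residue mass of about 110 Da is in the range typically seen for proteins, which is a quick sanity check on the reported subunit size.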

  1. Tousled kinase activator, gallic acid, promotes homologous recombinational repair and suppresses radiation cytotoxicity in salivary gland cells.

    PubMed

    Timiri Shanmugam, Prakash Srinivasan; Nair, Renjith Parameshwaran; De Benedetti, Arrigo; Caldito, Gloria; Abreo, Fleurette; Sunavala-Dossabhoy, Gulshan

    2016-04-01

    Accidental or medical radiation exposure of the salivary glands can gravely impact oral health. Previous studies have shown the importance of Tousled-like kinase 1 (TLK1) and its alternate start variant TLK1B in cell survival against genotoxic stresses. Through a high-throughput library screening of natural compounds, the phenolic phytochemical, gallic acid (GA), was identified as a modulator of TLK1/1B. This small molecule possesses anti-oxidant and free radical scavenging properties, but in this study, we report that in vitro it promotes survival of human salivary acinar cells, NS-SV-AC, through repair of ionizing radiation damage. Irradiated cells treated with GA show improved clonogenic survival compared to untreated controls. Analyses of DNA repair kinetics by alkaline single-cell gel electrophoresis and γ-H2AX foci immunofluorescence indicate rapid resolution of DNA breaks in drug-treated cells. Study of DR-GFP transgene repair indicates GA facilitates homologous recombinational repair to establish a functional GFP gene. In contrast, inactivation of TLK1 or its shRNA knockdown suppressed resolution of radiation-induced DNA tails in NS-SV-AC, and homology directed repair in DR-GFP cells. Consistent with our results in culture, animals treated with GA after exposure to fractionated radiation showed better preservation of salivary function compared to saline-treated animals. Our results suggest that GA-mediated transient modulation of TLK1 activity promotes DNA repair and suppresses radiation cytotoxicity in salivary gland cells. PMID:26855419

  2. Full Genome Virus Detection in Fecal Samples Using Sensitive Nucleic Acid Preparation, Deep Sequencing, and a Novel Iterative Sequence Classification Algorithm

    PubMed Central

    Cotten, Matthew; Oude Munnink, Bas; Canuti, Marta; Deijs, Martin; Watson, Simon J.; Kellam, Paul; van der Hoek, Lia

    2014-01-01

    We have developed a full genome virus detection process that combines sensitive nucleic acid preparation optimised for virus identification in fecal material with Illumina MiSeq sequencing and a novel post-sequencing virus identification algorithm. Enriched viral nucleic acid was converted to double-stranded DNA and subjected to Illumina MiSeq sequencing. The resulting short reads were processed with a novel iterative Python algorithm SLIM for the identification of sequences with homology to known viruses. De novo assembly was then used to generate full viral genomes. The sensitivity of this process was demonstrated with a set of fecal samples from HIV-1 infected patients. A quantitative assessment of the mammalian, plant, and bacterial virus content of this compartment was generated and the deep sequencing data were sufficient to assemble 12 complete viral genomes from 6 virus families. The method detected high levels of enteropathic viruses that are normally controlled in healthy adults, but may be involved in the pathogenesis of HIV-1 infection and will provide a powerful tool for virus detection and for analyzing changes in the fecal virome associated with HIV-1 progression and pathogenesis. PMID:24695106

  3. Gastropod arginine kinases from Cellana grata and Aplysia kurodai. Isolation and cDNA-derived amino acid sequences.

    PubMed

    Suzuki, T; Inoue, N; Higashi, T; Mizobuchi, R; Sugimura, N; Yokouchi, K; Furukohri, T

    2000-12-01

    Arginine kinase (AK) was isolated from the radular muscle of the gastropod molluscs Cellana grata (subclass Prosobranchia) and Aplysia kurodai (subclass Opisthobranchia) by ammonium sulfate fractionation, Sephadex G-75 gel filtration and DEAE-ion exchange chromatography. The denatured relative molecular mass values were estimated to be 40 kDa by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. The isolated enzyme from Aplysia gave a Km value of 0.6 mM for arginine and a Vmax value of 13 μmol Pi min⁻¹ mg protein⁻¹ for the forward reaction. These values are comparable to other molluscan AKs. The cDNAs encoding Cellana and Aplysia AKs were amplified by polymerase chain reaction, and the nucleotide sequences of 1,608 and 1,239 bp, respectively, were determined. The open reading frame for Cellana AK is 1044 nucleotides in length and encodes a protein with 347 amino acid residues, and that for A. kurodai is 1077 nucleotides and 354 residues. The cDNA-derived amino acid sequences were validated by chemical sequencing of internal lysyl endopeptidase peptides. The amino acid sequences of Cellana and Aplysia AKs showed the highest percent identity (66-73%) with those of the abalone Nordotis and turbanshell Battilus belonging to the same class Gastropoda. These AK sequences still have a strong homology (63-71%) with that of the chiton Liolophura (class Polyplacophora), which is believed to be one of the most primitive molluscs. On the other hand, these AK sequences are less homologous (55-57%) with that of the clam Pseudocardium (class Bivalvia), suggesting that the biological position of the class Polyplacophora should be reconsidered. PMID:11281267
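
To make the reported kinetic constants concrete, the sketch below evaluates the Michaelis-Menten rate law with the Km and Vmax values given for the Aplysia enzyme. It assumes simple single-substrate Michaelis-Menten behaviour with respect to arginine, which the abstract does not state explicitly; the function name and the chosen arginine concentrations are illustrative.

```python
def michaelis_menten_rate(substrate_mM, vmax=13.0, km_mM=0.6):
    """Initial rate v = Vmax * [S] / (Km + [S]).

    Defaults are the Aplysia AK values from the abstract:
    Vmax in micromol Pi per min per mg protein, Km in mM arginine.
    """
    return vmax * substrate_mM / (km_mM + substrate_mM)


if __name__ == "__main__":
    # Illustrative arginine concentrations (mM); not taken from the paper.
    for s in (0.06, 0.6, 6.0):
        print(f"[arginine] = {s} mM -> v = {michaelis_menten_rate(s):.2f}")
```

By definition the rate at an arginine concentration equal to Km is half of Vmax.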

  4. Sequences Of Amino Acids For Human Serum Albumin

    NASA Technical Reports Server (NTRS)

    Carter, Daniel C.

    1992-01-01

    Sequences of amino acids defined for use in making polypeptides one-third to one-sixth as large as parent human serum albumin molecule. Smaller, chemically stable peptides have diverse applications including service as artificial human serum and as active components of biosensors and chromatographic matrices. In applications involving production of artificial sera from new sequences, little or no concern about viral contaminants. Smaller genetically engineered polypeptides more easily expressed and produced in large quantities, making commercial isolation and production more feasible and profitable.

  5. Novel method for PIK3CA mutation analysis: locked nucleic acid--PCR sequencing.

    PubMed

    Ang, Daphne; O'Gara, Rebecca; Schilling, Amy; Beadling, Carol; Warrick, Andrea; Troxell, Megan L; Corless, Christopher L

    2013-05-01

    Somatic mutations in PIK3CA are commonly seen in invasive breast cancer and several other carcinomas, occurring in three hotspots: codons 542 and 545 of exon 9 and in codon 1047 of exon 20. We designed a locked nucleic acid (LNA)-PCR sequencing assay to detect low levels of mutant PIK3CA DNA with attention to avoiding amplification of a pseudogene on chromosome 22 that has >95% homology to exon 9 of PIK3CA. We tested 60 FFPE breast DNA samples with known PIK3CA mutation status (48 cases had one or more PIK3CA mutations, and 12 were wild type) as identified by PCR-mass spectrometry. PIK3CA exons 9 and 20 were amplified in the presence or absence of LNA-oligonucleotides designed to bind to the wild-type sequences for codons 542, 545, and 1047, and partially suppress their amplification. LNA-PCR sequencing confirmed all 51 PIK3CA mutations; however, the mutation detection rate by standard Sanger sequencing was only 69% (35 of 51). Of the 12 PIK3CA wild-type cases, LNA-PCR sequencing detected three additional H1047R mutations in "normal" breast tissue and one E545K in usual ductal hyperplasia. Histopathological review of these three normal breast specimens showed columnar cell change in two (both with known H1047R mutations) and apocrine metaplasia in one. The novel LNA-PCR shows higher sensitivity than standard Sanger sequencing and did not amplify the known pseudogene. PMID:23541593

  6. The amino acid sequences and activities of synergistic hemolysins from Staphylococcus cohnii.

    PubMed

    Mak, Pawel; Maszewska, Agnieszka; Rozalska, Malgorzata

    2008-10-01

    Staphylococcus cohnii ssp. cohnii and S. cohnii ssp. urealyticus are coagulase-negative staphylococci long considered unable to cause infections. This situation changed recently and pathogenic strains of these bacteria were isolated from hospital environments, patients and medical staff. Most of the isolated strains were resistant to many antibiotics. The present work describes isolation and characterization of several synergistic peptide hemolysins produced by these bacteria and acting as virulence factors responsible for hemolytic and cytotoxic activities. Amino acid sequences of respective hemolysins from S. cohnii ssp. cohnii (named as H1C, H2C and H3C) and S. cohnii ssp. urealyticus (H1U, H2U and H3U) were identical. Peptides H1 and H3 possessed significant amino acid homology to three synergistic hemolysins secreted by Staphylococcus lugdunensis and to putative antibacterial peptide produced by Staphylococcus saprophyticus ssp. saprophyticus. On the other hand, hemolysin H2 had a unique sequence. All isolated peptides lysed red cells from different mammalian species and exerted a cytotoxic effect on human fibroblasts. PMID:18752624

  7. Method for the detection of specific nucleic acid sequences by polymerase nucleotide incorporation

    DOEpatents

    Castro, Alonso

    2004-06-01

    A method for rapid and efficient detection of a target DNA or RNA sequence is provided. A primer having a 3'-hydroxyl group at one end and having a sequence of nucleotides sufficiently homologous with an identifying sequence of nucleotides in the target DNA is selected. The primer is hybridized to the identifying sequence of nucleotides on the DNA or RNA sequence and a reporter molecule is synthesized on the target sequence by progressively binding complementary nucleotides to the primer, where the complementary nucleotides include nucleotides labeled with a fluorophore. Fluorescence emitted by fluorophores on single reporter molecules is detected to identify the target DNA or RNA sequence.

  8. Nanopores and nucleic acids: prospects for ultrarapid sequencing

    NASA Technical Reports Server (NTRS)

    Deamer, D. W.; Akeson, M.

    2000-01-01

    DNA and RNA molecules can be detected as they are driven through a nanopore by an applied electric field at rates ranging from several hundred microseconds to a few milliseconds per molecule. The nanopore can rapidly discriminate between pyrimidine and purine segments along a single-stranded nucleic acid molecule. Nanopore detection and characterization of single molecules represents a new method for directly reading information encoded in linear polymers. If single-nucleotide resolution can be achieved, it is possible that nucleic acid sequences can be determined at rates exceeding a thousand bases per second.

  9. RNABindRPlus: a predictor that combines machine learning and sequence homology-based methods to improve the reliability of predicted RNA-binding residues in proteins.

    PubMed

    Walia, Rasna R; Xue, Li C; Wilkins, Katherine; El-Manzalawy, Yasser; Dobbs, Drena; Honavar, Vasant

    2014-01-01

    Protein-RNA interactions are central to essential cellular processes such as protein synthesis and regulation of gene expression and play roles in human infectious and genetic diseases. Reliable identification of protein-RNA interfaces is critical for understanding the structural bases and functional implications of such interactions and for developing effective approaches to rational drug design. Sequence-based computational methods offer a viable, cost-effective way to identify putative RNA-binding residues in RNA-binding proteins. Here we report two novel approaches: (i) HomPRIP, a sequence homology-based method for predicting RNA-binding sites in proteins; (ii) RNABindRPlus, a new method that combines predictions from HomPRIP with those from an optimized Support Vector Machine (SVM) classifier trained on a benchmark dataset of 198 RNA-binding proteins. Although highly reliable, HomPRIP cannot make predictions for the unaligned parts of query proteins and its coverage is limited by the availability of close sequence homologs of the query protein with experimentally determined RNA-binding sites. RNABindRPlus overcomes these limitations. We compared the performance of HomPRIP and RNABindRPlus with that of several state-of-the-art predictors on two test sets, RB44 and RB111. On a subset of proteins for which homologs with experimentally determined interfaces could be reliably identified, HomPRIP outperformed all other methods achieving an MCC of 0.63 on RB44 and 0.83 on RB111. RNABindRPlus was able to predict RNA-binding residues of all proteins in both test sets, achieving an MCC of 0.55 and 0.37, respectively, and outperforming all other methods, including those that make use of structure-derived features of proteins. More importantly, RNABindRPlus outperforms all other methods for any choice of tradeoff between precision and recall. An important advantage of both HomPRIP and RNABindRPlus is that they rely on readily available sequence and sequence
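
The abstract reports performance as the Matthews correlation coefficient (MCC). For reference, a minimal sketch of the standard MCC formula for binary per-residue predictions is given below; it is not taken from the RNABindRPlus code, and the confusion-matrix counts in the usage example are hypothetical.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical counts: residues predicted as RNA-binding or not.
    print(round(matthews_corrcoef(tp=80, tn=800, fp=60, fn=60), 3))
```

MCC ranges from -1 to 1 and remains informative when binding residues are a small minority of all residues, which is why it is often preferred over accuracy for this kind of task.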

  10. Strategies for Development of Functionally Equivalent Promoters with Minimum Sequence Homology for Transgene Expression in Plants: cis-Elements in a Novel DNA Context versus Domain Swapping

    PubMed Central

    Bhullar, Simran; Chakravarthy, Suma; Advani, Sonia; Datta, Sudipta; Pental, Deepak; Burma, Pradeep Kumar

    2003-01-01

    The cauliflower mosaic virus 35S (35S) promoter has been extensively used for the constitutive expression of transgenes in dicotyledonous plants. The repetitive use of the same promoter is known to induce transgene inactivation due to promoter homology. As a way to circumvent this problem, we tested two different strategies for the development of synthetic promoters that are functionally equivalent but have a minimum sequence homology. Such promoters can be generated by (a) introducing known cis-elements in a novel or synthetic stretch of DNA or (b) “domain swapping,” wherein domains of one promoter can be replaced with functionally equivalent domains from other heterologous promoters. We evaluated the two strategies for promoter modifications using domain A (consisting of minimal promoter and subdomain A1) of the 35S promoter as a model. A set of modified 35S promoters was developed whose strength was compared with the 35S promoter per se using β-glucuronidase as the reporter gene. Analysis of the expression of the reporter gene in a transient assay system showed that domain swapping led to a significant fall in promoter activity. In contrast, promoters developed by placing cis-elements in a novel DNA context showed levels of expression comparable with that of the 35S. Two promoter constructs, Mod2A1T and Mod3A1T, were then designed by placing the core sequences of minimal promoter and subdomain A1 in divergent DNA sequences. Transgenics developed in tobacco (Nicotiana tabacum) with the two constructs and with 35S as control were used to assess the promoter activity in different tissues of primary transformants. Mod2A1T and Mod3A1T were found to be active in all of the tissues tested, at levels comparable with that of 35S. Further, the expression of the Mod2A1T promoter in the seedlings of the T1 generation was also similar to that of the 35S promoter. The present strategy opens up the possibility of creating a set of synthetic promoters with minimum sequence

  11. Role of arabidopsis MYC and MYB homologs in drought- and abscisic acid-regulated gene expression.

    PubMed Central

    Abe, H; Yamaguchi-Shinozaki, K; Urao, T; Iwasaki, T; Hosokawa, D; Shinozaki, K

    1997-01-01

    In Arabidopsis, the induction of a dehydration-responsive gene, rd22, is mediated by abscisic acid (ABA) and requires protein biosynthesis for ABA-dependent gene expression. Previous experiments established that a 67-bp DNA fragment of the rd22 promoter is sufficient for dehydration- and ABA-induced gene expression and that this DNA fragment contains two closely located putative recognition sites for the basic helix-loop-helix protein MYC and one putative recognition site for MYB. We have carefully analyzed the 67-bp region of the rd22 promoter in transgenic tobacco plants and found that both the first MYC site and the MYB recognition site function as cis-acting elements in the dehydration-induced expression of the rd22 gene. A cDNA encoding a MYC-related DNA binding protein was isolated by DNA-ligand binding screening, using the 67-bp region as a probe, and designated rd22BP1. The rd22BP1 cDNA encodes a 68-kD protein that has a typical DNA binding domain of a basic region helix-loop-helix leucine zipper motif in MYC-related transcription factors. The rd22BP1 protein binds specifically to the first MYC recognition site in the 67-bp fragment. RNA gel blot analysis revealed that transcription of the rd22BP1 gene is induced by dehydration stress and ABA treatment, and its induction precedes that of rd22. We have reported a drought- and ABA-inducible gene that encodes the MYB-related protein ATMYB2. In a transient transactivation experiment using Arabidopsis leaf protoplasts, we demonstrated that both the rd22BP1 and ATMYB2 proteins activate transcription of the rd22 promoter fused to the beta-glucuronidase reporter gene. These results indicate that both the rd22BP1 (MYC) and ATMYB2 (MYB) proteins function as transcriptional activators in the dehydration- and ABA-inducible expression of the rd22 gene. PMID:9368419

  12. Amino acid sequence of the Amur tiger prion protein.

    PubMed

    Wu, Changde; Pang, Wanyong; Zhao, Deming

    2006-10-01

    Prion diseases are fatal neurodegenerative disorders in humans and animals associated with conformational conversion of a cellular prion protein (PrP(C)) into the pathologic isoform (PrP(Sc)). Various data indicate that polymorphisms within the open reading frame (ORF) of PrP are associated with susceptibility and control the species barrier in prion diseases. In the present study, partial Prnp sequences from 25 Amur tigers (tPrnp) were cloned and screened for polymorphisms. Four single nucleotide polymorphisms (T423C, A501G, C511A, A610G) were found; the C511A and A610G nucleotide substitutions resulted in the amino acid changes Lysine171Glutamine and Alanine204Threonine, respectively. The tPrnp amino acid sequence is similar to those of the house cat (Felis catus) and sheep, but differs significantly from two other cat Prnp sequences that were previously deposited in GenBank. PMID:16780982

  13. DNA sequence of the control region of phage D108: the N-terminal amino acid sequences of repressor and transposase are similar both in phage D108 and in its relative, phage Mu.

    PubMed Central

    Mizuuchi, M; Weisberg, R A; Mizuuchi, K

    1986-01-01

    We have determined the DNA sequence of the control region of phage D108 up to position 1419 at the left end of the phage genome. Open reading frames for the repressor gene, ner gene, and the 5' part of the A gene (which codes for transposase) are found in the sequence. The genetic organization of this region of phage D108 is quite similar to that of phage Mu in spite of considerable divergence, both in the nucleotide sequence and in the amino acid sequences of the regulatory proteins of the two phages. The N-terminal amino acid sequences of the transposases of the two phages also share only limited homology. On the other hand, a significant amino acid sequence homology was found within each phage between the N-terminal parts of the repressor and transposase. We propose that the N-terminal domains of the repressor and transposase of each phage interact functionally in the process of making the decision between the lytic and the lysogenic mode of growth. PMID:3012481

  14. Evolution of vertebrate IgM: complete amino acid sequence of the constant region of Ambystoma mexicanum mu chain deduced from cDNA sequence.

    PubMed

    Fellah, J S; Wiles, M V; Charlemagne, J; Schwager, J

    1992-10-01

    cDNA clones coding for the constant region of the Mexican axolotl (Ambystoma mexicanum) mu heavy immunoglobulin chain were selected from total spleen RNA, using a cDNA polymerase chain reaction technique. The specific 5'-end primer was an oligonucleotide homologous to the JH segment of Xenopus laevis mu chain. One of the clones, JHA/3, corresponded to the complete constant region of the axolotl mu chain, consisting of a 1362-nucleotide sequence coding for a polypeptide of 454 amino acids followed in 3' direction by a 179-nucleotide untranslated region and a polyA+ tail. The axolotl C mu is divided into four typical domains (C mu 1-C mu 4) and can be aligned with the Xenopus C mu with an overall identity of 56% at the nucleotide level. Percent identities were particularly high between C mu 1 (59%) and C mu 4 (71%). The C-terminal 20-amino acid segment which constitutes the secretory part of the mu chain is strongly homologous to the equivalent sequences of chondrichthyans and of other tetrapods, including a conserved N-linked oligosaccharide, the penultimate cysteine and the C-terminal lysine. The four C mu domains of 13 vertebrate species ranging from chondrichthyans to mammals were aligned and compared at the amino acid level. The significant number of mu-specific residues which are conserved into each of the four C mu domains argues for a continuous line of evolution of the vertebrate mu chain. This notion was confirmed by the ability to reconstitute a consistent vertebrate evolution tree based on the phylogenic parsimony analysis of the C mu 4 sequences. PMID:1382992

  15. Heterotrophic bacteria from cultures of autotrophic Thiobacillus ferrooxidans: relationships as studied by means of deoxyribonucleic acid homology.

    PubMed

    Harrison, A P; Jarvis, B W; Johnson, J L

    1980-07-01

    From several presumably pure cultures of Thiobacillus ferrooxidans, we isolated a pair of stable phenotypes. One was a strict autotroph utilizing sulfur or ferrous iron as the energy source and unable to utilize glucose; the other phenotype was an acidophilic obligate heterotroph capable of utilizing glucose but not sulfur or ferrous iron. The acidophilic obligate heterotroph not only was encountered in cultures of T. ferrooxidans, but also was isolated with glucose-mineral salts medium, pH 2.0, directly from coal refuse. By means of deoxyribonucleic acid homology, we have demonstrated that the acidophilic heterotrophs are of a different genotype from T. ferrooxidans, not closely related to this species; we have shown also that the acidophilic obligate heterotrophs, regardless of their source of isolation, are related to each other. Therefore, cultures of T. ferrooxidans reported capable of utilizing organic compounds should be carefully examined for contamination. The acidophilic heterotrophs isolated by us are different from T. acidophilus, which is also associated with T. ferrooxidans but is facultative, utilizing both glucose and elemental sulfur as energy sources. Since they are so common and tenacious in T. ferrooxidans cultures, the heterotrophs must be associated with T. ferrooxidans in the natural habitat. PMID:7400100

  16. Salicylic Acid Based Small Molecule Inhibitor for the Oncogenic Src Homology-2 Domain Containing Protein Tyrosine Phosphatase-2 (SHP2)

    SciTech Connect

    Zhang, Xian; He, Yantao; Liu, Sijiu; Yu, Zhihong; Jiang, Zhong-Xing; Yang, Zhenyun; Dong, Yuanshu; Nabinger, Sarah C.; Wu, Li; Gunawan, Andrea M.; Wang, Lina; Chan, Rebecca J.; Zhang, Zhong-Yin

    2010-08-13

    The Src homology-2 domain containing protein tyrosine phosphatase-2 (SHP2) plays a pivotal role in growth factor and cytokine signaling. Gain-of-function SHP2 mutations are associated with Noonan syndrome, various kinds of leukemias, and solid tumors. Thus, there is considerable interest in SHP2 as a potential target for anticancer and antileukemia therapy. We report a salicylic acid based combinatorial library approach aimed at binding both active site and unique nearby subpockets for enhanced affinity and selectivity. Screening of the library led to the identification of a SHP2 inhibitor II-B08 (compound 9) with highly efficacious cellular activity. Compound 9 blocks growth factor stimulated ERK1/2 activation and hematopoietic progenitor proliferation, providing supporting evidence that chemical inhibition of SHP2 may be therapeutically useful for anticancer and antileukemia treatment. X-ray crystallographic analysis of the structure of SHP2 in complex with 9 reveals molecular determinants that can be exploited for the acquisition of more potent and selective SHP2 inhibitors.

  17. Bone morphogenetic protein 4 and retinoic acid trigger bovine VASA homolog expression in differentiating bovine induced pluripotent stem cells.

    PubMed

    Malaver-Ortega, Luis F; Sumer, Huseyin; Jain, Kanika; Verma, Paul J

    2016-02-01

    Primordial germ cells (PGCs) are the earliest identifiable and completely committed progenitors of female and male gametes. They are obvious targets for genome editing because they assure the transmission of desirable or introduced traits to future generations. PGCs are established at the earliest stages of embryo development and are difficult to propagate in vitro--two characteristics that pose a problem for their practical application. One alternative method to enrich for PGCs in vitro is to differentiate them from pluripotent stem cells derived from adult tissues. Here, we establish a reporter system for germ cell identification in bovine pluripotent stem cells based on green fluorescent protein expression driven by the minimal essential promoter of the bovine Vasa homolog (BVH) gene, whose regulatory elements were identified by orthologous modelling of regulatory units. We then evaluated the potential of bovine induced pluripotent stem cell (biPSC) lines carrying the reporter construct to differentiate toward the germ cell lineage. Our results showed that biPSCs undergo differentiation as embryoid bodies, and a fraction of the differentiating cells expressed BVH. The rate of differentiation towards BVH-positive cells increased up to tenfold in the presence of bone morphogenetic protein 4 or retinoic acid. Finally, we determined that the expression of key PGC genes, such as BVH or SOX2, can be modified by pre-differentiation cell culture conditions, although this increase is not necessarily mirrored by an increase in the rate of differentiation. PMID:26660942

  18. Homologies between T cell receptor junctional sequences unique to multiple sclerosis and T cells mediating experimental allergic encephalomyelitis.

    PubMed Central

    Allegretta, M; Albertini, R J; Howell, M D; Smith, L R; Martin, R; McFarland, H F; Sriram, S; Brostoff, S; Steinman, L

    1994-01-01

    The selection of T cell clones with mutations in the hypoxanthine guanine phosphoribosyltransferase (hprt) gene has been used to isolate T cells reactive to myelin basic protein (MBP) in patients with multiple sclerosis (MS). These T cell clones are activated in vivo, and are not found in healthy individuals. The third complementarity determining regions (CDR3) of the T cell receptor (TCR) alpha and beta chains are the putative contact sites for peptide fragments of MBP bound in the groove of the HLA molecule. The TCR V gene usage and CDR3s of these MBP-reactive hprt-T cell clones are homologous to TCRs from other T cells relevant to MS, including T cells causing experimental allergic encephalomyelitis (EAE) and T cells found in brain lesions and in the cerebrospinal fluid (CSF) of MS patients. In vivo activated MBP-reactive T cells in MS patients may be critical in the pathogenesis of MS. PMID:8040252

  19. Quantum-Sequencing: Biophysics of quantum tunneling through nucleic acids

    NASA Astrophysics Data System (ADS)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    Tunneling microscopy and spectroscopy have been used extensively in the physical surface sciences to study quantum tunneling, to measure the electronic local density of states of nanomaterials, and to characterize adsorbed species. Quantum-Sequencing (Q-Seq) is a new method based on tunneling microscopy for electronic sequencing of single molecules of nucleic acids. A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free single-molecule sequencing method. Here, we present the unique "electronic fingerprints" for all nucleotides on DNA and RNA using Q-Seq, along with their intrinsic biophysical parameters. We have analyzed tunneling spectra for the nucleotides at different pH conditions and analyzed the HOMO, LUMO and energy gap for all of them. In addition, we show a number of biophysical parameters to further characterize all nucleobases (electron and hole transition voltage and energy barriers). These results highlight the robustness of Q-Seq as a technique for next-generation sequencing.

  20. Multiple Amino Acid Sequence Alignment Nitrogenase Component 1: Insights into Phylogenetics and Structure-Function Relationships

    PubMed Central

    Howard, James B.; Kechris, Katerina J.; Rees, Douglas C.; Glazer, Alexander N.

    2013-01-01

    Amino acid residues critical for a protein's structure-function are retained by natural selection and these residues are identified by the level of variance in co-aligned homologous protein sequences. The relevant residues in the nitrogen fixation Component 1 α- and β-subunits were identified by the alignment of 95 protein sequences. Proteins were included from species encompassing multiple microbial phyla and diverse ecological niches as well as the nitrogen fixation genotypes, anf, nif, and vnf, which encode proteins associated with cofactors differing at one metal site. After adjusting for differences in sequence length, insertions, and deletions, the remaining >85% of the sequence co-aligned the subunits from the three genotypes. Six Groups, designated Anf, Vnf, and Nif I-IV, were assigned based upon genetic origin, sequence adjustments, and conserved residues. Both subunits subdivided into the same groups. Invariant and single variant residues were identified and were defined as “core” for nitrogenase function. Three species in Group Nif-III, Candidatus Desulforudis audaxviator, Desulfotomaculum kuznetsovii, and Thermodesulfatator indicus, were found to have a seleno-cysteine that replaces one cysteinyl ligand of the 8Fe:7S, P-cluster. Subsets of invariant residues, limited to individual groups, were identified; these unique residues help identify the gene of origin (anf, nif, or vnf) yet should not be considered diagnostic of the metal content of associated cofactors. Fourteen of the 19 residues that compose the cofactor pocket are invariant or single variant; the other five residues are highly variable but do not correlate with the putative metal content of the cofactor. The variable residues are clustered on one side of the cofactor, away from other functional centers in the three dimensional structure. Many of the invariant and single variant residues were not previously recognized as potentially critical and their identification provides the bases

  1. Correlation between fibroin amino acid sequence and physical silk properties.

    PubMed

    Fedic, Robert; Zurovec, Michal; Sehnal, Frantisek

    2003-09-12

    The fiber properties of lepidopteran silk depend on the amino acid repeats that interact during H-fibroin polymerization. The aim of our research was to relate repeat composition to insect biology and fiber strength. Representative regions of the H-fibroin genes were sequenced and analyzed in three pyralid species: wax moth (Galleria mellonella), European flour moth (Ephestia kuehniella), and Indian meal moth (Plodia interpunctella). The amino acid repeats are species-specific, evidently a diversification of an ancestral region of 43 residues, and include three types of regularly dispersed motifs: modifications of GSSAASAA sequence, stretches of tripeptides GXZ where X and Z represent bulky residues, and sequences similar to PVIVIEE. No concatenations of GX dipeptide or alanine, which are typical for Bombyx silkworms and Antheraea silk moths, respectively, were found. Despite different repeat structure, the silks of G. mellonella and E. kuehniella exhibit similar tensile strength as the Bombyx and Antheraea silks. We suggest that in these latter two species, variations in the repeat length obstruct repeat alignment, but sufficiently long stretches of iterated residues get superposed to interact. In the pyralid H-fibroins, interactions of the widely separated and diverse motifs depend on the precision of repeat matching; silk is strong in G. mellonella and E. kuehniella, with 2-3 types of long homogeneous repeats, and nearly 10 times weaker in P. interpunctella, with seven types of shorter erratic repeats. The high proportion of large amino acids in the H-fibroin of pyralids has probably evolved in connection with the spinning habit of caterpillars that live in protective silk tubes and spin continuously, enlarging the tubes on one end and partly devouring the other one. The silk serves as a depot of energetically rich and essential amino acids that may be scarce in the diet. PMID:12816957

  2. Amino acid sequence of the nonsecretory ribonuclease of human urine.

    PubMed

    Beintema, J J; Hofsteenge, J; Iwama, M; Morita, T; Ohgi, K; Irie, M; Sugiyama, R H; Schieven, G L; Dekker, C A; Glitz, D G

    1988-06-14

    The amino acid sequence of a nonsecretory ribonuclease isolated from human urine was determined except for the identity of the residue at position 7. Sequence information indicates that the ribonucleases of human liver and spleen and an eosinophil-derived neurotoxin are identical or very closely related gene products. The sequence is identical at about 30% of the amino acid positions with those of all of the secreted mammalian ribonucleases for which information is available. Identical residues include active-site residues histidine-12, histidine-119, and lysine-41, other residues known to be important for substrate binding and catalytic activity, and all eight half-cystine residues common to these enzymes. Major differences include a deletion of six residues in the (so-called) S-peptide loop, insertions of two, and nine residues, respectively, in three other external loops of the molecule, and an addition of three residues at the amino terminus. The sequence shows the human nonsecretory ribonuclease to belong to the same ribonuclease superfamily as the mammalian secretory ribonucleases, turtle pancreatic ribonuclease, and human angiogenin. Sequence data suggest that a gene duplication occurred in an ancient vertebrate ancestor; one branch led to the nonsecretory ribonuclease, while the other branch led to a second duplication, with one line leading to the secretory ribonucleases (in mammals) and the second line leading to pancreatic ribonuclease in turtle and an angiogenic factor in mammals (human angiogenin). The nonsecretory ribonuclease has five short carbohydrate chains attached via asparagine residues at the surface of the molecule; these chains may have been shortened by exoglycosidase action.(ABSTRACT TRUNCATED AT 250 WORDS) PMID:3166997

  3. Characterization and amino acid sequence of a fatty acid-binding protein from human heart.

    PubMed

    Offner, G D; Brecher, P; Sawlivich, W B; Costello, C E; Troxler, R F

    1988-05-15

    The complete amino acid sequence of a fatty acid-binding protein from human heart was determined by automated Edman degradation of CNBr, BNPS-skatole [3'-bromo-3-methyl-2-(2-nitrobenzenesulphenyl)indolenine], hydroxylamine, Staphylococcus aureus V8 proteinase, tryptic and chymotryptic peptides, and by digestion of the protein with carboxypeptidase A. The sequence of the blocked N-terminal tryptic peptide from citraconylated protein was determined by collisionally induced decomposition mass spectrometry. The protein contains 132 amino acid residues, is enriched with respect to threonine and lysine, lacks cysteine, has an acetylated valine residue at the N-terminus, and has an Mr of 14768 and an isoelectric point of 5.25. This protein contains two short internal repeated sequences from residues 48-54 and from residues 114-119 located within regions of predicted beta-structure and decreasing hydrophobicity. These short repeats are contained within two longer repeated regions from residues 48-60 and residues 114-125, which display 62% sequence similarity. These regions could accommodate the charged and uncharged moieties of long-chain fatty acids and may represent fatty acid-binding domains consistent with the finding that human heart fatty acid-binding protein binds 2 mol of oleate or palmitate/mol of protein. Detailed evidence for the amino acid sequences of the peptides has been deposited as Supplementary Publication SUP 50143 (23 pages) at the British Library Lending Division, Boston Spa, Yorkshire LS23 7BQ, U.K., from whom copies may be obtained as indicated in Biochem. J. (1988) 249, 5. PMID:3421901

  4. Drosophila topoisomerase II double-strand DNA cleavage: analysis of DNA sequence homology at the cleavage site.

    PubMed Central

    Sander, M; Hsieh, T S

    1985-01-01

    In order to study the sequence specificity of double-strand DNA cleavage by Drosophila topoisomerase II, we have mapped and sequenced 16 strong and 47 weak cleavage sites in the recombinant plasmid pπ25.1. Analysis of the nucleotide and dinucleotide frequencies in the region near the site of phosphodiester bond breakage revealed a nonrandom distribution. The nucleotide frequencies observed would occur by chance with a probability less than 0.05. The consensus sequence we derived is 5' GT.A/TAY↓ATT.AT..G 3', where a dot means no preferred nucleotide, Y is for pyrimidine, and the arrow shows the point of bond cleavage. On average, strong sites match the consensus better than weak sites. PMID:2987816
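
    The idea of a site "matching the consensus" can be illustrated with a short Python sketch. This is not the authors' procedure; it is a minimal example that assumes "A/T" denotes a single position accepting either A or T (giving 15 positions in total) and that simply counts how many of the specified consensus positions a 15-nucleotide window satisfies. The names CONSENSUS, match_score, and best_sites are hypothetical.

```python
# Consensus from the abstract, one entry per position; "N" = no preferred base.
# ASSUMPTION: "A/T" is read as one position that accepts either A or T.
CONSENSUS = ["G", "T", "N", "AT", "A", "CT",            # 5' side of the cleaved bond
             "A", "T", "T", "N", "A", "T", "N", "N", "G"]  # 3' side
CLEAVAGE_OFFSET = 6  # the arrow falls after the sixth consensus position


def match_score(window):
    """Count the specified (non-N) consensus positions matched by a window."""
    if len(window) != len(CONSENSUS):
        raise ValueError("window must be as long as the consensus")
    return sum(
        1
        for base, allowed in zip(window.upper(), CONSENSUS)
        if allowed != "N" and base in allowed
    )


def best_sites(seq, top=3):
    """Return (score, cleavage_index) pairs for the best-matching windows.

    cleavage_index is the 0-based index of the first base 3' of the
    putative cleaved bond.
    """
    width = len(CONSENSUS)
    scored = [
        (match_score(seq[i:i + width]), i + CLEAVAGE_OFFSET)
        for i in range(len(seq) - width + 1)
    ]
    return sorted(scored, key=lambda t: (-t[0], t[1]))[:top]
```

    Under this scheme a window can score at most 11, the number of positions for which the consensus states a preference. A position weight matrix built from the mapped sites would be a more sensitive alternative, but the abstract does not give the underlying frequencies.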

  5. Nucleic acid sequence detection using multiplexed oligonucleotide PCR

    DOEpatents

    Nolan, John P.; White, P. Scott

    2006-12-26

    Methods for rapidly detecting single or multiple sequence alleles in a sample nucleic acid are described. Provided are all of the oligonucleotide pairs capable of annealing specifically to a target allele and discriminating among possible sequences thereof, and ligating to each other to form an oligonucleotide complex when a particular sequence feature is present (or, alternatively, absent) in the sample nucleic acid. The design of each oligonucleotide pair permits the subsequent high-level PCR amplification of a specific amplicon when the oligonucleotide complex is formed, but not when the oligonucleotide complex is not formed. The presence or absence of the specific amplicon is used to detect the allele. Detection of the specific amplicon may be achieved using a variety of methods well known in the art, including without limitation, oligonucleotide capture onto DNA chips or microarrays, oligonucleotide capture onto beads or microspheres, electrophoresis, and mass spectrometry. Various labels and address-capture tags may be employed in the amplicon detection step of multiplexed assays, as further described herein.

  6. The amino acid sequence of rabbit muscle triose phosphate isomerase.

    PubMed Central

    Corran, P H; Waley, S G

    1975-01-01

    The amino acid sequence of rabbit muscle triose phosphate isomerase was deduced by characterizing peptides that overlap the tryptic peptides. Thiol groups were modified by oxidation, carboxymethylation or aminoethylation. About 50 peptides that provided information about overlaps were isolated; the peptides were mostly characterized by their compositions and N-terminal residues. The peptide chains contain 248 amino acid residues, and no evidence for dissimilarity of the two subunits that comprise the native enzyme was found. The sequence of the rabbit muscle enzyme may be compared with that of the coelacanth enzyme (Kolb et al., 1974): 84% of the residues are in identical positions. Similarly, comparison of the sequence with that inferred for the chicken enzyme (Furth et al., 1974) shows that 87% of the residues are in identical positions. Limited though these comparisons are, they suggest that triose phosphate isomerase has one of the lowest rates of evolutionary change. An extended version of the present paper has been deposited as Supplementary Publication SUP 50040 (42 pages) at the British Library (Lending Division) (formerly the National Lending Library for Science and Technology), Boston Spa, Yorks. LS23 7BQ, U.K., from whom copies can be obtained on the terms given in Biochem. J. (1975) 145, 5. PMID:1171682

  7. The amino acid sequence of chymopapain from Carica papaya.

    PubMed Central

    Watson, D C; Yaguchi, M; Lynn, K R

    1990-01-01

    Chymopapain is a polypeptide of 218 amino acid residues. It has considerable structural similarity with papain and papaya proteinase omega, including conservation of the catalytic site and of the disulphide bonding. Chymopapain is like papaya proteinase omega in carrying four extra residues between papain positions 168 and 169, but differs from both papaya proteinases in the composition of its S2 subsite, as well as in having a second thiol group, Cys-117. Some evidence for the amino acid sequence of chymopapain has been deposited as Supplementary Publication SUP 50153 (12 pages) at the British Library Document Supply Centre, Boston Spa, Wetherby, West Yorkshire LS23 7BQ, U.K., from whom copies may be obtained on the terms indicated in Biochem. J. (1990) 265, 5. The information comprises Supplement Tables 1-4, which contain, in order, amino acid compositions of peptides from tryptic, peptic, CNBr and mild acid cleavages; Supplement Fig. 1, showing re-fractionation of selected peaks from Fig. 2 of the main paper; Supplement Fig. 2, showing cation-exchange chromatography of the earliest-eluted peak of Fig. 3 of the main paper; Supplement Fig. 3, showing reverse-phase h.p.l.c. of the later-eluted peak from Fig. 3 of the main paper; and Supplement Fig. 4, showing the separation of peptides after mild acid hydrolysis of CNBr-cleavage fragment CB3. PMID:2106878

  8. The amino acid sequence of chymopapain from Carica papaya.

    PubMed

    Watson, D C; Yaguchi, M; Lynn, K R

    1990-02-15

    Chymopapain is a polypeptide of 218 amino acid residues. It has considerable structural similarity with papain and papaya proteinase omega, including conservation of the catalytic site and of the disulphide bonding. Chymopapain is like papaya proteinase omega in carrying four extra residues between papain positions 168 and 169, but differs from both papaya proteinases in the composition of its S2 subsite, as well as in having a second thiol group, Cys-117. Some evidence for the amino acid sequence of chymopapain has been deposited as Supplementary Publication SUP 50153 (12 pages) at the British Library Document Supply Centre, Boston Spa, Wetherby, West Yorkshire LS23 7BQ, U.K., from whom copies may be obtained on the terms indicated in Biochem. J. (1990) 265, 5. The information comprises Supplement Tables 1-4, which contain, in order, amino acid compositions of peptides from tryptic, peptic, CNBr and mild acid cleavages; Supplement Fig. 1, showing re-fractionation of selected peaks from Fig. 2 of the main paper; Supplement Fig. 2, showing cation-exchange chromatography of the earliest-eluted peak of Fig. 3 of the main paper; Supplement Fig. 3, showing reverse-phase h.p.l.c. of the later-eluted peak from Fig. 3 of the main paper; and Supplement Fig. 4, showing the separation of peptides after mild acid hydrolysis of CNBr-cleavage fragment CB3. PMID:2106878

  9. Two novel PRPF31 premessenger ribonucleic acid processing factor 31 homolog mutations including a complex insertion-deletion identified in Chinese families with retinitis pigmentosa

    PubMed Central

    Dong, Bing; Chen, Jieqiong; Zhang, Xiaohui; Pan, Zhe; Bai, Fengge

    2013-01-01

    Objective: To identify the causative mutations in two Chinese families with retinitis pigmentosa (RP), and to describe the associated phenotype. Methods: Individuals from two unrelated families underwent full ophthalmic examinations. After informed consent was obtained, genomic DNA was extracted from the venous blood of all participants. Linkage analysis was performed on the known genetic loci for autosomal dominant retinitis pigmentosa with a panel of polymorphic markers in the two families, and then all coding exons of the PRP31 premessenger ribonucleic acid processing factor 31 homolog (PRPF31) gene were screened for mutations with direct sequencing of PCR-amplified DNA fragments. Allele-specific PCR was used to validate a substitution in all available family members and 100 normal controls. A large deletion was detected with real-time quantitative PCR (RQ-PCR) using a panel of primers from regions around the PRPF31 gene. Long-range PCR, followed by DNA sequencing, was used to define the breakpoints. Results: Clinical examination and pedigree analysis revealed two four-generation families (RP24 and RP106) with autosomal dominant retinitis pigmentosa. A significant two-point logarithm of odds (LOD) score was generated at marker D19S926 (Zmax=3.55, θ=0) for family RP24 and D19S571 (Zmax=3.21, θ=0) for family RP106, and further linkage and haplotype studies confined the disease locus to chromosome 19q13.42 where the PRPF31 gene is located. Mutation screening of the PRPF31 gene revealed a novel deletion c.1215delG (p.G405fs+7X) in family RP106. The deletion cosegregated with the family’s disease phenotype, but was not found in 100 normal controls. No disease-causing mutation was detected in family RP24 with PCR-based sequencing analysis. RQ-PCR and long-range PCR analysis revealed a complex insertion-deletion (indel) in the patients of family RP24. The deletion is more than 19 kb and encompasses part of the PRPF31 gene (exons 1–3), together with three adjacent

  10. Highly recurring sequence elements identified in eukaryotic DNAs by computer analysis are often homologous to regulatory sequences or protein binding sites.

    PubMed Central

    Bodnar, J W; Ward, D C

    1987-01-01

    We have used computer assisted dot matrix and oligonucleotide frequency analyses to identify highly recurring sequence elements of 7-11 base pairs in eukaryotic genes and viral DNAs. Such elements are found much more frequently than expected, often with an average spacing of a few hundred base pairs. Furthermore, the most abundant repetitive elements observed in the ovalbumin locus, the beta-globin gene cluster, the metallothionein gene and the viral genomes of SV40, polyoma, Herpes simplex-1 and Mouse Mammary Tumor Virus were sequences shown previously to be protein binding sites or sequences important for regulating gene expression. These sequences were present in both exons and introns as well as promoter regions. These observations suggest that such sequences are often highly overrepresented within the specific gene segments with which they are associated. Computer analysis of other genetic units, including viral genomes and oncogenes, has identified a number of highly recurring sequence elements that could serve similar regulatory or protein-binding functions. A model for the role of such reiterated sequence elements in DNA organization and function is presented. PMID:3822840
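
    An oligonucleotide frequency analysis of the kind described can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' program: it assumes a simple null model in which bases occur independently at their observed frequencies, and the function names and the min_ratio threshold are hypothetical choices.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def overrepresented(seq, k, min_ratio=5.0):
    """List k-mers seen at least twice and at least min_ratio times more
    often than expected if bases were drawn independently at their
    observed frequencies (an assumed null model).

    Returns (kmer, observed, expected) tuples, most frequent first.
    """
    seq = seq.upper()
    n = len(seq)
    base_freq = {base: count / n for base, count in Counter(seq).items()}
    windows = n - k + 1
    hits = []
    for kmer, observed in kmer_counts(seq, k).items():
        probability = 1.0
        for base in kmer:
            probability *= base_freq[base]
        expected = probability * windows
        if observed >= 2 and observed / expected >= min_ratio:
            hits.append((kmer, observed, expected))
    return sorted(hits, key=lambda hit: -hit[1])
```

    Running overrepresented(sequence, k) for each k from 7 to 11 would cover the element lengths mentioned in the abstract. The spacing between recurrences could be examined by recording window start positions instead of bare counts.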

  11. Multiple overlapping homologies between two rheumatoid antigens and immunosuppressive viruses.

    PubMed Central

    Douvas, A; Sobelman, S

    1991-01-01

    Amino acid (aa) sequence homologies between viruses and autoimmune nuclear antigens are suggestive of viral involvement in disorders such as systemic lupus erythematosus (SLE) and scleroderma. We analyzed the frequency of exact homologies of greater than or equal to 5 aa between 61 viral proteins (19,827 aa), 8 nuclear antigens (3813 aa), and 41 control proteins (11,743 aa). Both pentamer and hexamer homologies between control proteins and viruses are unexpectedly abundant, with hexamer matches occurring in 1 of 3 control proteins (or once every 769 aa). However, 2 nuclear antigens, the SLE-associated 70-kDa antigen and the scleroderma-associated CENP-B protein, are highly unusual in containing multiple homologies to a group of synergizing immunosuppressive viruses. Two viruses, herpes simplex virus 1 (HSV-1) and human immunodeficiency virus 1 (HIV-1), contain sequences exactly duplicated at 15 sites in the 70-kDa antigen and at 10 sites in CENP-B protein. The immediate-early (IE) protein of HSV-1, which activates HIV-1 regulatory functions, contains three homologies to the 70-kDa antigen (two hexamers and a pentamer) and two to CENP-B (a hexamer and pentamer). There are four homologies (including a hexamer) common to the 70-kDa antigen and Epstein-Barr virus, and three homologies (including two hexamers) common to CENP-B and cytomegalovirus. The majority of homologies in both nuclear antigens are clustered in highly charged C-terminal domains containing epitopes for human autoantibodies. Furthermore, most homologies have a contiguous or overlapping distribution, thereby creating a high density of potential epitopes. In addition to the exact homologies tabulated, motifs of matching sequences are repeated frequently in these domains. Our analysis suggests that coexpression of heterologous viruses having common immunosuppressive functions may generate autoantibodies cross-reacting with certain nuclear proteins. PMID:1712488
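
    The exact pentamer and hexamer matches counted in this record amount to shared fixed-length substrings between two protein sequences. The sketch below shows one simple way to find them; the function name and both peptide strings are hypothetical and are not taken from the proteins analyzed.

```python
def shared_peptides(protein_a, protein_b, k):
    """Return the set of exact k-residue peptides present in both sequences."""
    kmers_a = {protein_a[i:i + k] for i in range(len(protein_a) - k + 1)}
    kmers_b = {protein_b[i:i + k] for i in range(len(protein_b) - k + 1)}
    return kmers_a & kmers_b


# Hypothetical one-letter-code fragments, chosen only to show the mechanics.
antigen_fragment = "MKRERRRSAL"
viral_fragment = "GGRERRRSTT"

pentamers = shared_peptides(antigen_fragment, viral_fragment, 5)
hexamers = shared_peptides(antigen_fragment, viral_fragment, 6)
print(sorted(pentamers), sorted(hexamers))
```

    Because one shared hexamer always contains two shared pentamers, counts at different lengths overlap and should be reported separately or merged into maximal matches.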

  12. Alcohol homologation

    DOEpatents

    Wegman, Richard W.; Moloy, Kenneth G.

    1988-01-01

    A process for the homologation of an alkanol by reaction with synthesis gas in contact with a system containing rhodium atom, ruthenium atom, iodine atom and a bis(diorganophosphino) alkane to selectively produce the next higher homologue.

  13. Alcohol homologation

    DOEpatents

    Wegman, R.W.; Moloy, K.G.

    1988-02-23

    A process is described for the homologation of an alkanol by reaction with synthesis gas in contact with a system containing rhodium atom, ruthenium atom, iodine atom and a bis(diorganophosphino) alkane to selectively produce the next higher homologue.

  14. Amino acid sequence of mouse nidogen, a multidomain basement membrane protein with binding activity for laminin, collagen IV and cells.

    PubMed Central

    Mann, K; Deutzmann, R; Aumailley, M; Timpl, R; Raimondi, L; Yamada, Y; Pan, T C; Conway, D; Chu, M L

    1989-01-01

    The whole amino acid sequence of nidogen was deduced from cDNA clones isolated from expression libraries and confirmed to approximately 50% by Edman degradation of peptides. The protein consists of some 1217 amino acid residues and a 28-residue signal peptide. The data support a previously proposed dumb-bell model of nidogen by demonstrating a large N-terminal globular domain (641 residues), five EGF-like repeats constituting the rod-like domain (248 residues) and a smaller C-terminal globule (328 residues). Two more EGF-like repeats interrupt the N-terminal and terminate the C-terminal sequences. Weak sequence homologies (25%) were detected between some regions of nidogen, the LDL receptor, thyroglobulin and the EGF precursor. Nidogen contains two consensus sequences for tyrosine sulfation and for asparagine beta-hydroxylation, two N-linked carbohydrate acceptor sites and, within one of the EGF-like repeats, an Arg-Gly-Asp sequence. The latter was shown to be functional in cell attachment to nidogen. Binding sites for laminin and collagen IV are present on the C-terminal globule but not yet precisely localized. PMID:2496973

  15. Amino acid sequence prerequisites for the formation of cn ions.

    PubMed

    Downard, K M; Biemann, K

    1993-11-01

    Amino acid sequence prerequisites are described for the formation of cn ions observed in high-energy collision-induced decomposition spectra of peptides. It is shown that the formation of cn ions is promoted by the nature of the amino acid C-terminal to the cleavage site. A propensity for cn cleavage preceding threonine, and to a lesser extent tryptophan, lysine, and serine, is demonstrated where fragmentation is directed N-terminally at these residues. In addition, the nature of the residue N-terminal to the cleavage site is shown to have little effect on cn ion formation. A mechanism for cn ion formation is proposed and its applicability to the results observed is discussed. PMID:24227531

  16. Ultrasensitive nucleic acid sequence detection by single-molecule electrophoresis

    SciTech Connect

    Castro, A; Shera, E.B.

    1996-09-01

    This is the final report of a one-year laboratory-directed research and development project at Los Alamos National Laboratory. There has been considerable interest in the development of very sensitive clinical diagnostic techniques over the last few years. Many pathogenic agents are often present in extremely small concentrations in clinical samples, especially at the initial stages of infection, making their detection very difficult. This project sought to develop a new technique for the detection and accurate quantification of specific bacterial and viral nucleic acid sequences in clinical samples. The scheme involved the use of novel hybridization probes for the detection of nucleic acids combined with our recently developed technique of single-molecule electrophoresis. This project is directly relevant to the DOE's Defense Programs strategic directions in the area of biological warfare counter-proliferation.

  17. Homologous recombination among three intragene Alu sequences causes an inversion-deletion resulting in the hereditary bleeding disorder glanzmann thrombasthenia

    SciTech Connect

    Li, L.; Bray, P.F.

    1993-07-01

    The crucial role of the human platelet fibrinogen receptor in maintaining normal hemostasis is best exemplified by the autosomal recessive bleeding disorder Glanzmann thrombasthenia (GT). The platelet fibrinogen receptor is a heterodimer composed of glycoproteins IIb (GPIIb) and IIIa (GPIIIa). Platelets from patients with GT have a quantitative or qualitative abnormality in GPIIb and GPIIIa and can neither bind fibrinogen nor aggregate. Very few genetic defects have been identified that cause this disorder. The authors describe a kindred with GT in which the affected individuals have a unique inversion-deletion mutation in the gene for GPIIIa. Patient platelets lacked both GPIIIa protein and mRNA. Southern blots of patient genomic DNA probed with an internal 1.0-kb GPIIIa cDNA suggested a large rearrangement of this gene but were normal when probed with small GPIIIa cDNA fragments that were outside the mutation. Cytogenetics and pulsed-field gel analysis of the GPIIIa gene were normal, making a translocation or a very large rearrangement unlikely. Additional Southern analyses suggested that the abnormality was not a small insertion. The authors constructed a patient genomic DNA library and isolated fragments containing the 5' and 3' breakpoints of the mutation. The nucleotide sequence from these genomic clones was determined and revealed that, relative to the normal gene, the mutant allele contained a 1-kb deletion immediately preceding a 15-kb inversion. The DNA breaks occurred in two inverted and one forward Alu sequence within the gene for GPIIIa and in the left, right, and left arms, respectively, of these sequences. There was a 5-bp repeat at the 3' terminus of the inversion. One copy of the repeat remained in the mutant allele breakpoint junction. The alignment and orientation of the different Alu sequences, as well as the position of the breakpoints, suggest that the inversion preceded the deletion in this complex rearrangement. 41 refs., 5 figs.

  18. Determining Structure and Function of Steroid Dehydrogenase Enzymes by Sequence Analysis, Homology Modeling, and Rational Mutational Analysis

    PubMed Central

    DUAX, WILLIAM L.; THOMAS, JAMES; PLETNEV, VLADIMIR; ADDLAGATTA, ANTHONY; HUETHER, ROBERT; HABEGGER, LUKAS; WEEKS, CHARLES M.

    2006-01-01

    The short-chain oxidoreductase (SCOR) family of enzymes includes over 6,000 members identified in sequenced genomes. Of these enzymes, ~300 have been characterized functionally, and the three-dimensional crystal structures of ~40 have been reported. Since some SCOR enzymes are steroid dehydrogenases involved in hypertension, diabetes, breast cancer, and polycystic kidney disease, it is important to characterize the other members of the family for which the biological functions are currently unknown and to determine their three-dimensional structure and mechanism of action. Although the SCOR family appears to have only a single fully conserved residue, it was possible, using bioinformatics methods, to determine characteristic fingerprints composed of 30–40 residues that are conserved at the 70% or greater level in SCOR subgroups. These fingerprints permit reliable prediction of several important structure-function features including cofactor preference, catalytic residues, and substrate specificity. Human type 1 3β-hydroxysteroid dehydrogenase isomerase (3β-HSDI) has 30% sequence identity with a human UDP galactose 4-epimerase (UDPGE), a SCOR family enzyme for which an X-ray structure has been reported. Both UDPGE and 3β-HSDI appear to trace their origins back to bacterial 3α,20β-HSD. Combining three-dimensional structural information and sequence data on the 3α,20β-HSD, UDPGE, and 3β-HSDI subfamilies with mutational analysis, we were able to identify the residues critical to the dehydrogenase function of 3β-HSDI. We also identified the residues most probably responsible for the isomerase activity of 3β-HSDI. We test our predictions by specific mutations based on sequence analysis and our structure-based model. PMID:16467263

  19. Homology-Driven Proteomics of Dinoflagellates with Unsequenced Genomes Using MALDI-TOF/TOF and Automated De Novo Sequencing

    PubMed Central

    Wang, Da-Zhi; Li, Cheng; Xie, Zhang-Xian; Dong, Hong-Po; Lin, Lin; Hong, Hua-Sheng

    2011-01-01

    This study developed a multilayered, gel-based, and underivatized strategy for de novo protein sequence analysis of unsequenced dinoflagellates using a MALDI-TOF/TOF mass spectrometer with the assistance of DeNovo Explorer software. MASCOT was applied as the first layer screen to identify either known or unknown proteins sharing identical peptides presented in a database. Once the confident identifications were removed after searching against the NCBInr database, the remainder was searched against the dinoflagellate expressed sequence tag database. In the last layer, those borderline and nonconfident hits were further subjected to de novo interpretation using DeNovo Explorer software. The de novo sequences passing a reliability filter were subsequently submitted to nonredundant MS-BLAST search. Using this layer identification method, 216 protein spots representing 158 unique proteins out of 220 selected protein spots from Alexandrium tamarense, a dinoflagellate with unsequenced genome, were confidently or tentatively identified by database searching. These proteins were involved in various intracellular physiological activities. This study is the first effort to develop a completely automated approach to identify proteins from unsequenced dinoflagellate databases and establishes a preliminary protein database for various physiological studies of dinoflagellates in the future. PMID:21977052

  20. Homology of pCS1 Plasmid Sequences with Chromosomal DNA in Clavibacter michiganense subsp. sepedonicum: Evidence for the Presence of a Repeated Sequence and Plasmid Integration

    PubMed Central

    Mogen, Bradley D.; Oleson, Arland E.

    1987-01-01

    Restriction fragments of pCS1, a 50.6-kilobase (kb) plasmid present in many strains of Clavibacter michiganense subsp. sepedonicum (“Corynebacterium sepedonicum”), have been cloned in an M13mp11 phage vector. Radiolabeled forms of these cloned fragments have been used as Southern hybridization probes for the presence of plasmid sequences in chromosomal DNA of this organism. These studies have shown that all tested strains lacking the covalently closed circular form of pCS1 contain the plasmid in integrated form. In each case the site of integration exists on a single plasmid restriction fragment with a size of 5.1 kb. Southern hybridizations with these probes have also revealed the existence of a major repeated sequence in C. michiganense subsp. sepedonicum. Hybridizations of chromosomal DNA with deletion subclones of a 2.9-kb plasmid fragment containing the repeated sequence indicate that the size of the repeated sequence is approximately 1.3 kb. One of the copies of the repeated sequence is on the plasmid fragment containing the site of integration. PMID:16347464

  1. Homology Analysis of Pathogenic Yersinia Species Yersinia enterocolitica, Yersinia pseudotuberculosis, and Yersinia pestis Based on Multilocus Sequence Typing

    PubMed Central

    Duan, Ran; Liang, Junrong; Shi, Guoxiang; Cui, Zhigang; Hai, Rong; Wang, Peng; Xiao, Yuchun; Li, Kewei; Qiu, Haiyan; Gu, Wenpeng; Du, Xiaoli

    2014-01-01

    We developed a multilocus sequence typing (MLST) scheme and used it to study the population structure and evolutionary relationships of three pathogenic Yersinia species. MLST of these three Yersinia species showed a complex of two clusters, one composed of Yersinia pseudotuberculosis and Yersinia pestis and the other composed of Yersinia enterocolitica. Within the first cluster, the predominant Y. pestis sequence type 90 (ST90) was linked to Y. pseudotuberculosis ST43 by one locus difference, and 81.25% of the ST43 strains were from serotype O:1b, supporting the hypothesis that Y. pestis descended from the O:1b serotype of Y. pseudotuberculosis. We also found that the worldwide-prevalent serotypes O:1a, O:1b, and O:3 were predominated by specific STs. The second cluster consisted of pathogenic and nonpathogenic Y. enterocolitica strains, two of which may not have identical STs. The pathogenic Y. enterocolitica strains formed a relatively conserved group; most strains clustered within ST186 and ST187. Serotypes O:3, O:8, and O:9 were separated into three distinct blocks. Nonpathogenic Y. enterocolitica STs were more heterogeneous, reflecting genetic diversity through evolution. By providing a better and effective MLST procedure for use with the Yersinia community, valuable information and insights into the genetic evolutionary differences of these pathogens were obtained. PMID:24131695
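
    The statement that two sequence types are linked "by one locus difference" can be made concrete by comparing allelic profiles locus by locus. The sketch below is illustrative only: the locus names and allele numbers are hypothetical placeholders and are not the actual profiles of ST90 or ST43.

```python
def locus_differences(profile_a, profile_b):
    """Return the loci at which two MLST allelic profiles carry different alleles."""
    if profile_a.keys() != profile_b.keys():
        raise ValueError("profiles must be typed at the same loci")
    return [locus for locus in profile_a if profile_a[locus] != profile_b[locus]]


# Hypothetical seven-locus profiles (allele numbers invented for illustration).
st_x = {"locus1": 3, "locus2": 1, "locus3": 7, "locus4": 2,
        "locus5": 5, "locus6": 1, "locus7": 4}
st_y = dict(st_x, locus5=9)  # identical except at one locus

differing = locus_differences(st_x, st_y)
print(len(differing), differing)
```

    Sequence types that differ at a single locus are commonly called single-locus variants and are often grouped into the same clonal complex.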

  2. FFPred 2.0: improved homology-independent prediction of gene ontology terms for eukaryotic protein sequences.

    PubMed

    Minneci, Federico; Piovesan, Damiano; Cozzetto, Domenico; Jones, David T

    2013-01-01

    To understand fully cell behaviour, biologists are making progress towards cataloguing the functional elements in the human genome and characterising their roles across a variety of tissues and conditions. Yet, functional information - either experimentally validated or computationally inferred by similarity - remains completely missing for approximately 30% of human proteins. FFPred was initially developed to bridge this gap by targeting sequences with distant or no homologues of known function and by exploiting clear patterns of intrinsic disorder associated with particular molecular activities and biological processes. Here, we present an updated and improved version, which builds on larger datasets of protein sequences and annotations, and uses updated component feature predictors as well as revised training procedures. FFPred 2.0 includes support vector regression models for the prediction of 442 Gene Ontology (GO) terms, which largely expand the coverage of the ontology and of the biological process category in particular. The GO term list mainly revolves around macromolecular interactions and their role in regulatory, signalling, developmental and metabolic processes. Benchmarking experiments on newly annotated proteins show that FFPred 2.0 provides more accurate functional assignments than its predecessor and the ProtFun server do; also, its assignments can complement information obtained using BLAST-based transfer of annotations, improving especially prediction in the biological process category. Furthermore, FFPred 2.0 can be used to annotate proteins belonging to several eukaryotic organisms with a limited decrease in prediction quality. We illustrate all these points through the use of both precision-recall plots and of the COGIC scores, which we recently proposed as an alternative numerical evaluation measure of function prediction accuracy. PMID:23717476

  3. Mycobacterium tuberculosis expresses two chaperonin-60 homologs.

    PubMed Central

    Kong, T H; Coates, A R; Butcher, P D; Hickman, C J; Shinnick, T M

    1993-01-01

    A 65-kDa protein and a 10-kDa protein are two of the more strongly immunoreactive components of Mycobacterium tuberculosis, the causative agent of tuberculosis. The 65-kDa antigen has homology with members of the GroEL or chaperonin-60 (Cpn60) family of heat shock proteins. The 10-kDa antigen has homology with the GroES or chaperonin-10 family of heat shock proteins. These two proteins are encoded by separate genes in M. tuberculosis. The studies reported here reveal that M. tuberculosis contains a second Cpn60 homolog located 98 bp downstream of the 10-kDa antigen gene. The second Cpn60 homolog (Cpn60-1) displays 61% amino acid sequence identity with the 65-kDa antigen (Cpn60-2) and 53% and 41% identity with the Escherichia coli GroEL protein and the human P60 protein, respectively. Primer-extension analysis revealed that transcription starts 29 bp upstream of the translation start of the Cpn60-1 homolog and protein purification studies indicate that the cpn60-1 gene is expressed as an approximately 60-kDa polypeptide. PMID:7681982

  4. Trypsin inhibitors from ridged gourd (Luffa acutangula Linn.) seeds: purification, properties, and amino acid sequences.

    PubMed

    Haldar, U C; Saha, S K; Beavis, R C; Sinha, N K

    1996-02-01

    Two trypsin inhibitors, LA-1 and LA-2, have been isolated from ridged gourd (Luffa acutangula Linn.) seeds and purified to homogeneity by gel filtration followed by ion-exchange chromatography. The isoelectric point is at pH 4.55 for LA-1 and at pH 5.85 for LA-2. The Stokes radius of each inhibitor is 11.4 Å. The fluorescence emission spectrum of each inhibitor is similar to that of free tyrosine. The bimolecular rate constant of acrylamide quenching is 1.0 x 10^9 M^-1 sec^-1 for LA-1 and 0.8 x 10^9 M^-1 sec^-1 for LA-2, and that of K2HPO4 quenching is 1.6 x 10^11 M^-1 sec^-1 for LA-1 and 1.2 x 10^11 M^-1 sec^-1 for LA-2. Analysis of the circular dichroic spectra yields 40% alpha-helix and 60% beta-turn for LA-1 and 45% alpha-helix and 55% beta-turn for LA-2. Inhibitors LA-1 and LA-2 consist of 28 and 29 amino acid residues, respectively. They lack threonine, alanine, valine, and tryptophan. Both inhibitors strongly inhibit trypsin by forming enzyme-inhibitor complexes at a molar ratio of unity. A chemical modification study suggests the involvement of arginine of LA-1 and lysine of LA-2 in their reactive sites. The inhibitors are very similar in their amino acid sequences, and show sequence homology with other squash family inhibitors. PMID:8924202

  5. A protective surface protein from type V group B streptococci shares N-terminal sequence homology with the alpha C protein.

    PubMed

    Lachenauer, C S; Madoff, L C

    1996-10-01

    Infection by group B streptococci (GBS) is an important cause of bacterial disease in neonates, pregnant women, and nonpregnant adults. Historically, serotypes Ia, Ib, II, and III have been most prevalent among disease cases; recently, type V strains have emerged as important strains in the United States and elsewhere. In addition to type-specific capsular polysaccharides, many GBS strains possess surface proteins which demonstrate a laddering pattern on sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) and resistance to trypsin digestion. These include the alpha C protein, the R proteins, and protein Rib. Some of these proteins elicit protective antibodies in animals. We demonstrate a trypsin-resistant laddering protein purified from a type V GBS strain by mutanolysin extraction and column chromatography. This protein contains a major 90-kDa band and a series of smaller bands spaced approximately 10 kDa apart on SDS-PAGE. Cross-reactivity of the type V protein with the alpha C protein and with R1 was demonstrated on Western blot (immunoblot). N-terminal sequence analysis of the protein revealed residue identity with 17 of 18 residues at corresponding positions on the alpha protein. Western blot of SDS extracts of 41 clinical type V isolates with rabbit antiserum to the protein demonstrated a homologous protein in 25 isolates (61%); two additional strains exhibited a heterologous pattern which was also demonstrated with 4G8, a monoclonal antibody directed to the alpha C protein repeat region. Rabbit antiserum raised to the type V protein conferred protection in neonatal mice against a type V strain bearing a homologous protein. These data support the hypothesis that there exists a family of trypsin-resistant, laddering GBS surface proteins which may play a role in immunity to GBS infection. PMID:8926097

  6. A protective surface protein from type V group B streptococci shares N-terminal sequence homology with the alpha C protein.

    PubMed Central

    Lachenauer, C S; Madoff, L C

    1996-01-01

    Infection by group B streptococci (GBS) is an important cause of bacterial disease in neonates, pregnant women, and nonpregnant adults. Historically, serotypes Ia, Ib, II, and III have been most prevalent among disease cases; recently, type V strains have emerged as important strains in the United States and elsewhere. In addition to type-specific capsular polysaccharides, many GBS strains possess surface proteins which demonstrate a laddering pattern on sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) and resistance to trypsin digestion. These include the alpha C protein, the R proteins, and protein Rib. Some of these proteins elicit protective antibodies in animals. We demonstrate a trypsin-resistant laddering protein purified from a type V GBS strain by mutanolysin extraction and column chromatography. This protein contains a major 90-kDa band and a series of smaller bands spaced approximately 10 kDa apart on SDS-PAGE. Cross-reactivity of the type V protein with the alpha C protein and with R1 was demonstrated on Western blot (immunoblot). N-terminal sequence analysis of the protein revealed residue identity with 17 of 18 residues at corresponding positions on the alpha protein. Western blot of SDS extracts of 41 clinical type V isolates with rabbit antiserum to the protein demonstrated a homologous protein in 25 isolates (61%); two additional strains exhibited a heterologous pattern which was also demonstrated with 4G8, a monoclonal antibody directed to the alpha C protein repeat region. Rabbit antiserum raised to the type V protein conferred protection in neonatal mice against a type V strain bearing a homologous protein. These data support the hypothesis that there exists a family of trypsin-resistant, laddering GBS surface proteins which may play a role in immunity to GBS infection. PMID:8926097

  7. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... in the sequence. (4) The enumeration of amino acids may start at the first amino acid of the first..., counting backwards starting with the amino acid next to number 1. Otherwise, the enumeration of amino acids... sequence every 5 amino acids. The enumeration method for amino acid sequences that is set forth......

  8. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... in the sequence. (4) The enumeration of amino acids may start at the first amino acid of the first..., counting backwards starting with the amino acid next to number 1. Otherwise, the enumeration of amino acids... sequence every 5 amino acids. The enumeration method for amino acid sequences that is set forth......

  9. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... in the sequence. (4) The enumeration of amino acids may start at the first amino acid of the first..., counting backwards starting with the amino acid next to number 1. Otherwise, the enumeration of amino acids... sequence every 5 amino acids. The enumeration method for amino acid sequences that is set forth......

  10. Predicting protein disorder by analyzing amino acid sequence

    PubMed Central

    Yang, Jack Y; Yang, Mary Qu

    2008-01-01

    Background Many protein regions and some entire proteins have no definite tertiary structure, presenting instead as dynamic, disordered ensembles under different physicochemical circumstances. These proteins and regions are known as Intrinsically Unstructured Proteins (IUP). IUP have been associated with a wide range of protein functions, along with roles in diseases characterized by protein misfolding and aggregation. Results Identifying IUP is an important task in structural and functional genomics. We extract useful features from sequences and develop machine learning algorithms for this task. We compare our IUP predictor with PONDRs (mainly neural-network-based predictors), disEMBL (also based on neural networks) and Globplot (based on disorder propensity). Conclusion We find that augmenting features derived from physicochemical properties of amino acids (such as hydrophobicity, complexity etc.) and using an ensemble method proved beneficial. The IUP predictor is a viable alternative software tool for identifying IUP protein regions and proteins. PMID:18831799

  11. Canine preprorelaxin: nucleic acid sequence and localization within the canine placenta.

    PubMed

    Klonisch, T; Hombach-Klonisch, S; Froehlich, C; Kauffold, J; Steger, K; Steinetz, B G; Fischer, B

    1999-03-01

    Employing uteroplacental tissue at Day 35 of gestation, we determined the nucleic acid sequence of canine preprorelaxin using reverse transcription- and rapid amplification of cDNA ends-polymerase chain reaction. Canine preprorelaxin cDNA consisted of 534 base pairs encoding a protein of 177 amino acids with a signal peptide of 25 amino acids (aa), a B domain of 35 aa, a C domain of 93 aa, and an A domain of 24 aa. The putative receptor binding region in the N'-terminal part of the canine relaxin B domain GRDYVR contained two substitutions from the classical motif (E-->D and L-->Y). Canine preprorelaxin shared highest homology with porcine and equine preprorelaxin. Northern analysis revealed a 1-kilobase transcript present in total RNA of canine uteroplacental tissue but not of kidney tissue. Uteroplacental tissue from two bitches each at Days 30 and 35 of gestation were studied by in situ hybridization to localize relaxin mRNA. Immunohistochemistry for relaxin, cytokeratin, vimentin, and von Willebrand factor was performed on uteroplacental tissue at Day 30 of gestation. The basal cell layer at the core of the chorionic villi was devoid of relaxin mRNA and immunoreactive relaxin or vimentin but was immunopositive for cytokeratin and identified as cytotrophoblast cells. The cell layer surrounding the chorionic villi displayed specific hybridization signals for relaxin mRNA and immunoreactivity for relaxin and cytokeratin but not for vimentin, and was identified as syncytiotrophoblast. Those areas of the chorioallantoic tissue with most intense relaxin immunoreactivity were highly vascularized as demonstrated by immunoreactive von Willebrand factor expressed on vascular endothelium. The uterine glands and nonplacental uterine areas of the canine zonary girdle placenta were devoid of relaxin mRNA and relaxin. We conclude that the syncytiotrophoblast is the source of relaxin in the canine placenta. PMID:10026098
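
    The domain lengths and cDNA size quoted in this record can be cross-checked with simple arithmetic. The sketch below assumes, as an interpretation made here and not stated in the abstract, that the 534 base pairs correspond to the open reading frame including one stop codon.

```python
# Domain lengths in amino acids, as quoted in the abstract.
domains = {
    "signal peptide": 25,
    "B domain": 35,
    "C domain": 93,
    "A domain": 24,
}

total_residues = sum(domains.values())

# Assumption: three bases per codon, plus one stop codon at the end.
coding_base_pairs = (total_residues + 1) * 3

print(total_residues, coding_base_pairs)
```

    Under that assumption the four domains account for all 177 residues, and 177 codons plus a stop codon give the 534 base pairs reported.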

  12. Snake venoms. The amino acid sequences of two proteinase inhibitor homologues from Dendroaspis angusticeps venom.

    PubMed

    Joubert, F J; Taljaard, N

    1980-05-01

    Toxins C13S1C3 and C13S2C3 from D. angusticeps venom were purified by gel filtration and ion exchange chromatography. Whereas C13S1C3 contains 57 amino acids, C13S2C3 contains 59, but each includes six half-cystine residues. The complete primary structures of the low toxicity proteins have been elucidated. The sequences and the invariant residues of toxins C13S1C3 and C13S2C3 from D. angusticeps venom resemble, respectively, those of the proteinase inhibitor homologues K and I from D. polylepis polylepis venom and they are also homologous to the active proteinase inhibitors from various sources. In C13S1C3 and K the active site lysyl residue of active bovine pancreatic proteinase inhibitor is conserved but the site residue, alanine, is replaced by lysine. In C13S2C3 and I the active site residue is replaced by tyrosine. PMID:7429422

  13. Solid phase sequencing of biopolymers

    DOEpatents

    Cantor, Charles; Koster, Hubert

    2010-09-28

    This invention relates to methods for detecting and sequencing target nucleic acid sequences, to mass modified nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include DNA or RNA in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated molecular weight analysis and identification of the target sequence.

  14. Genomic evolution in Barrett’s adenocarcinoma cells: critical roles of elevated hsRAD51, homologous recombination and Alu sequences in the genome

    PubMed Central

    Pal, J; Bertheau, R; Buon, L; Qazi, A; Batchu, RB; Bandyopadhyay, S; Ali-Fehmi, R; Beer, DG; Weaver, DW; Reis, RJ Shmookler; Goyal, RK; Huang, Q; Munshi, NC; Shammas, MA

    2012-01-01

    A prominent feature of most cancers including Barrett’s adenocarcinoma (BAC) is genetic instability, which is associated with development and progression of disease. In this study, we investigated the role of recombinase (hsRAD51), a key component of homologous recombination (HR)/repair, in evolving genomic changes and growth of BAC cells. We show that the expression of RAD51 is elevated in BAC cell lines and tissue specimens, relative to normal cells. HR activity is also elevated and significantly correlates with RAD51 expression in BAC cells. The suppression of RAD51 expression, by short hairpin RNA (shRNA) specifically targeting this gene, significantly prevented BAC cells from acquiring genomic changes to either copy number or heterozygosity (P<0.02) in several independent experiments employing single-nucleotide polymorphism arrays. The reduction in copy-number changes, following shRNA treatment, was confirmed by Comparative Genome Hybridization analyses of the same DNA samples. Moreover, the chromosomal distributions of mutations correlated strongly with frequencies and locations of Alu interspersed repetitive elements on individual chromosomes. We conclude that the hsRAD51 protein level is systematically elevated in BAC, contributes significantly to genomic evolution during serial propagation of these cells and correlates with disease progression. Alu sequences may serve as substrates for elevated HR during cell proliferation in vitro, as they have been reported to do during the evolution of species, and thus may provide additional targets for prevention or treatment of this disease. PMID:21423218

  15. On–off system for PI3-kinase–Akt signaling through S-nitrosylation of phosphatase with sequence homology to tensin (PTEN)

    PubMed Central

    Numajiri, Naoki; Takasawa, Kumi; Nishiya, Tadashi; Tanaka, Hirotaka; Ohno, Kazuki; Hayakawa, Wataru; Asada, Mariko; Matsuda, Hiromi; Azumi, Kaoru; Kamata, Hideaki; Nakamura, Tomohiro; Hara, Hideaki; Minami, Masabumi; Lipton, Stuart A.; Uehara, Takashi

    2011-01-01

    Nitric oxide (NO) physiologically regulates numerous cellular responses through S-nitrosylation of protein cysteine residues. We performed antibody-array screening in conjunction with biotin-switch assays to look for S-nitrosylated proteins. Using this combination of techniques, we found that phosphatase with sequence homology to tensin (PTEN) is selectively S-nitrosylated by low concentrations of NO at a specific cysteine residue (Cys-83). S-nitrosylation of PTEN (forming SNO-PTEN) inhibits enzymatic activity and consequently stimulates the downstream Akt cascade, indicating that Cys-83 is a critical site for redox regulation of PTEN function. In ischemic mouse brain, we observed SNO-PTEN in the core and penumbra regions but found SNO-Akt, which is known to inhibit Akt activity, only in the ischemic core. These findings suggest that low concentrations of NO, as found in the penumbra, preferentially S-nitrosylate PTEN, whereas higher concentrations of NO, known to exist in the ischemic core, also S-nitrosylate Akt. In the penumbra, inhibition of PTEN (but not Akt) activity by S-nitrosylation would be expected to contribute to cell survival by means of enhanced Akt signaling. In contrast, in the ischemic core, SNO-Akt formation would inhibit this neuroprotective pathway. In vitro model systems support this notion. Thus, we identify unique sites of PTEN and Akt regulation by means of S-nitrosylation, resulting in an “on–off” pattern of control of Akt signaling. PMID:21646525

  16. Cloning and nucleotide sequencing of genes for three small, acid-soluble proteins from Bacillus subtilis spores.

    PubMed Central

    Connors, M J; Mason, J M; Setlow, P

    1986-01-01

    Three Bacillus subtilis genes (termed sspA, sspB, and sspD) which code for small, acid-soluble spore proteins (SASPs) have been cloned, and their complete nucleotide sequence has been determined. The amino acid sequences of the SASPs coded for by these genes are similar to each other and to those of the SASP-1 of B. subtilis (coded for by the sspC gene) and the SASP-A/C family of B. megaterium. The sspA and sspB genes are expressed only in sporulation, in parallel with each other and with the sspC gene. Two regions upstream of the postulated transcription start sites for the sspA and B genes have significant homology with the analogous regions of the sspC gene and the SASP-A/C gene family. Purification of two of the three major B. subtilis SASPs (alpha and beta) and determination of their amino-terminal sequences indicated that the sspA gene codes for SASP-alpha and that the sspB gene codes for SASP-beta. This was confirmed by the introduction of deletion mutations into the cloned sspA and sspB genes and transfer of these deletions into the B. subtilis chromosome with concomitant loss of the wild-type gene. Images PMID:3009398

  17. The amino acid sequence of protein SCMK-B2C from the high-sulphur fraction of wool keratin

    PubMed Central

    Elleman, T. C.

    1972-01-01

    1. The amino acid sequence of a protein from the reduced and carboxymethylated high-sulphur fraction of wool has been determined. 2. The sequence of this S-carboxymethylkerateine (SCMK-B2C) of 151 amino acid residues displays much internal homology and an unusual residue distribution. Thus a ten-residue sequence occurs four times near the N-terminus and five times near the C-terminus with few changes. These regions contain much of the molecule's half-cystine, whereas between them there is a region of 19 residues that are mainly small and devoid of cystine and proline. 3. Certain models of the wool fibre based on its mechanical and physical properties propose a matrix of small compact globular units linked together to form beaded chains. The unusual distribution of the component residues of protein SCMK-B2C suggests structures in the wool-fibre matrix compatible with certain features of the proposed models. PMID:4678578

  18. Structural gene and complete amino acid sequence of Pseudomonas aeruginosa IFO 3455 elastase.

    PubMed Central

    Fukushima, J; Yamamoto, S; Morihara, K; Atsumi, Y; Takeuchi, H; Kawamoto, S; Okuda, K

    1989-01-01

    The DNA encoding the elastase of Pseudomonas aeruginosa IFO 3455 was cloned, and its complete nucleotide sequence was determined. When the cloned gene was ligated to pUC18, the Escherichia coli expression vector, bacteria carrying the gene exhibited high levels of both elastase activity and elastase antigens. The amino acid sequence, deduced from the nucleotide sequence, revealed that the mature elastase consisted of 301 amino acids with a relative molecular mass of 32,926 daltons. The amino acid composition predicted from the DNA sequence was quite similar to the chemically determined composition of purified elastase reported previously. We also observed nucleotide sequence encoding a signal peptide and "pro" sequence consisting of 197 amino acids upstream from the mature elastase protein gene. The amino acid sequence analysis revealed that both the N-terminal sequence of the purified elastase and the N-terminal side sequences of the C-terminal tryptic peptide as well as the internal lysyl peptide fragment were completely identical to the deduced amino acid sequences. The pattern of identity of amino acid sequences was quite evident in the regions that include structurally and functionally important residues of Bacillus subtilis thermolysin. PMID:2493453

  19. Human retroviruses and AIDS 1996. A compilation and analysis of nucleic acid and amino acid sequences

    SciTech Connect

    Myers, G.; Foley, B.; Korber, B.; Mellors, J.W.; Jeang, K.T.; Wain-Hobson, S.

    1997-04-01

    This compendium and the accompanying floppy diskettes are the result of an effort to compile and rapidly publish all relevant molecular data concerning the human immunodeficiency viruses (HIV) and related retroviruses. The scope of the compendium and database is best summarized by the five parts that it comprises: (1) Nucleic Acid Alignments and Sequences; (2) Amino Acid Alignments; (3) Analysis; (4) Related Sequences; and (5) Database Communications. Information within all the parts is updated throughout the year on the Web site, http://hiv-web.lanl.gov. While this publication could take the form of a review or sequence monograph, it is not so conceived. Instead, the literature from which the database is derived has simply been summarized and some elementary computational analyses have been performed upon the data. Interpretation and commentary have been avoided insofar as possible so that the reader can form his or her own judgments concerning the complex information. In addition to the general descriptions of the parts of the compendium, the user should read the individual introductions for each part.

  20. Natural vs. random protein sequences: Discovering combinatorics properties on amino acid words.

    PubMed

    Santoni, Daniele; Felici, Giovanni; Vergni, Davide

    2016-02-21

    Random mutations and natural selection have driven the evolution of the protein amino acid sequences that we observe in nature today. Which of the two is the dominant force in protein evolution still lacks an unambiguous answer. Random mutations tend to randomize protein sequences, whereas selection mechanisms are expected to impose rigid constraints on amino acid sequences in order to preserve correct functionality. Moreover, the space of all possible amino acid sequences is so astonishingly large that a well-tuned amino acid sequence could reasonably be indistinguishable from a random one. To study whether random and natural amino acid sequences can be discriminated, we introduce different measures of association between pairs of amino acids in a sequence, and apply them to a dataset of 1047 natural protein sequences and 10,470 random sequences, carefully generated in order to preserve the relative length and amino acid distribution of the natural proteins. We analyze the multidimensional measures with machine learning techniques and show that, to a reasonable extent, natural protein sequences can be differentiated from random ones. PMID:26656109
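
The abstract above describes random sequences that preserve the length and amino acid distribution of the natural proteins, but it does not say how they were generated. The following is a minimal sketch of one simple way to do this, by shuffling each natural sequence. The function name `shuffled_decoys`, the example sequence, and the choice of ten decoys per protein (suggested only by the 1047 to 10,470 ratio) are illustrative assumptions, not the authors' procedure.

```python
import random
from collections import Counter


def shuffled_decoys(sequence, n_decoys=10, seed=0):
    """Return n_decoys random permutations of `sequence`.

    Shuffling keeps the length and the amino acid counts of the input
    exactly. This is an assumed stand-in for the generation step, which
    the abstract does not specify.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    residues = list(sequence)
    decoys = []
    for _ in range(n_decoys):
        rng.shuffle(residues)
        decoys.append("".join(residues))
    return decoys


# Made-up example sequence (hypothetical, not taken from the dataset).
natural = "MKTAYIAKQRQISFVKSHFSRQ"
decoys = shuffled_decoys(natural)

# Every decoy has the same length and composition as the natural sequence.
same_composition = all(Counter(d) == Counter(natural) for d in decoys)
```

Shuffling preserves composition exactly. Sampling residues from the observed frequencies would preserve it only on average, so the two approaches are not interchangeable.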

  1. Transcriptome Sequencing in Response to Salicylic Acid in Salvia miltiorrhiza

    PubMed Central

    Zhang, Xiaoru; Dong, Juane; Liu, Hailong; Wang, Jiao; Qi, Yuexin; Liang, Zongsuo

    2016-01-01

    Salvia miltiorrhiza is a traditional Chinese herbal medicine, whose quality and yield are often affected by diseases and environmental stresses during its growing season. Salicylic acid (SA) plays a significant role in plants responding to biotic and abiotic stresses, but the involved regulatory factors and their signaling mechanisms are largely unknown. In order to identify the genes involved in SA signaling, the RNA sequencing (RNA-seq) strategy was employed to evaluate the transcriptional profiles in S. miltiorrhiza cell cultures. A total of 50,778 unigenes were assembled, in which 5,316 unigenes were differentially expressed among 0-, 2-, and 8-h SA induction. The up-regulated genes were mainly involved in stimulus response and multi-organism process. A core set of candidate novel genes coding SA signaling component proteins was identified. Many transcription factors (e.g., WRKY, bHLH and GRAS) and genes involved in hormone signal transduction were differentially expressed in response to SA induction. Detailed analysis revealed that genes associated with defense signaling, such as antioxidant system genes, cytochrome P450s and ATP-binding cassette transporters, were significantly overexpressed, which can be used as genetic tools to investigate disease resistance. Our transcriptome analysis will help understand SA signaling and its mechanism of defense systems in S. miltiorrhiza. PMID:26808150

  2. Transcriptome Sequencing in Response to Salicylic Acid in Salvia miltiorrhiza.

    PubMed

    Zhang, Xiaoru; Dong, Juane; Liu, Hailong; Wang, Jiao; Qi, Yuexin; Liang, Zongsuo

    2016-01-01

    Salvia miltiorrhiza is a traditional Chinese herbal medicine, whose quality and yield are often affected by diseases and environmental stresses during its growing season. Salicylic acid (SA) plays a significant role in plants responding to biotic and abiotic stresses, but the involved regulatory factors and their signaling mechanisms are largely unknown. In order to identify the genes involved in SA signaling, the RNA sequencing (RNA-seq) strategy was employed to evaluate the transcriptional profiles in S. miltiorrhiza cell cultures. A total of 50,778 unigenes were assembled, in which 5,316 unigenes were differentially expressed among 0-, 2-, and 8-h SA induction. The up-regulated genes were mainly involved in stimulus response and multi-organism process. A core set of candidate novel genes coding SA signaling component proteins was identified. Many transcription factors (e.g., WRKY, bHLH and GRAS) and genes involved in hormone signal transduction were differentially expressed in response to SA induction. Detailed analysis revealed that genes associated with defense signaling, such as antioxidant system genes, cytochrome P450s and ATP-binding cassette transporters, were significantly overexpressed, which can be used as genetic tools to investigate disease resistance. Our transcriptome analysis will help understand SA signaling and its mechanism of defense systems in S. miltiorrhiza. PMID:26808150

  3. Redesigning Aldolase Stereoselectivity by Homologous Grafting

    PubMed Central

    Henßen, Birgit; Metz, Alexander; Gohlke, Holger; Pietruszka, Jörg

    2016-01-01

    The 2-deoxy-d-ribose-5-phosphate aldolase (DERA) offers access to highly desirable building blocks for organic synthesis by catalyzing a stereoselective C-C bond formation between acetaldehyde and certain electrophilic aldehydes. DERA's potential is particularly highlighted by the ability to catalyze sequential, highly enantioselective aldol reactions. However, its synthetic use is limited by the absence of an enantiocomplementary enzyme. Here, we introduce the concept of homologous grafting to identify stereoselectivity-determining amino acid positions in DERA. We identified such positions by structural analysis of the homologous aldolases 2-keto-3-deoxy-6-phosphogluconate aldolase (KDPG) and the enantiocomplementary enzyme 2-keto-3-deoxy-6-phosphogalactonate aldolase (KDPGal). Mutation of these positions led to a slightly inversed enantiopreference of both aldolases to the same extent. By transferring these sequence motifs onto DERA we achieved the intended change in enantioselectivity. PMID:27327271

  4. Redesigning Aldolase Stereoselectivity by Homologous Grafting.

    PubMed

    Bisterfeld, Carolin; Classen, Thomas; Küberl, Irene; Henßen, Birgit; Metz, Alexander; Gohlke, Holger; Pietruszka, Jörg

    2016-01-01

    The 2-deoxy-d-ribose-5-phosphate aldolase (DERA) offers access to highly desirable building blocks for organic synthesis by catalyzing a stereoselective C-C bond formation between acetaldehyde and certain electrophilic aldehydes. DERA's potential is particularly highlighted by the ability to catalyze sequential, highly enantioselective aldol reactions. However, its synthetic use is limited by the absence of an enantiocomplementary enzyme. Here, we introduce the concept of homologous grafting to identify stereoselectivity-determining amino acid positions in DERA. We identified such positions by structural analysis of the homologous aldolases 2-keto-3-deoxy-6-phosphogluconate aldolase (KDPG) and the enantiocomplementary enzyme 2-keto-3-deoxy-6-phosphogalactonate aldolase (KDPGal). Mutation of these positions led to a slightly inversed enantiopreference of both aldolases to the same extent. By transferring these sequence motifs onto DERA we achieved the intended change in enantioselectivity. PMID:27327271

  5. Arabidopsis Glutamate Receptor Homolog3.5 Modulates Cytosolic Ca2+ Level to Counteract Effect of Abscisic Acid in Seed Germination

    PubMed Central

    Kong, Dongdong; Ju, Chuanli; Parihar, Aisha; Kim, So; Cho, Daeshik; Kwak, June M.

    2015-01-01

    Seed germination is a critical step in a plant’s life cycle that allows successful propagation and is therefore strictly controlled by endogenous and environmental signals. However, the molecular mechanisms underlying germination control remain elusive. Here, we report that the Arabidopsis (Arabidopsis thaliana) glutamate receptor homolog3.5 (AtGLR3.5) is predominantly expressed in germinating seeds and increases cytosolic Ca2+ concentration that counteracts the effect of abscisic acid (ABA) to promote germination. Repression of AtGLR3.5 impairs cytosolic Ca2+ concentration elevation, significantly delays germination, and enhances ABA sensitivity in seeds, whereas overexpression of AtGLR3.5 results in earlier germination and reduced seed sensitivity to ABA. Furthermore, we show that Ca2+ suppresses the expression of ABSCISIC ACID INSENSITIVE4 (ABI4), a key transcription factor involved in ABA response in seeds, and that ABI4 plays a fundamental role in modulation of Ca2+-dependent germination. Taken together, our results provide molecular genetic evidence that AtGLR3.5-mediated Ca2+ influx stimulates seed germination by antagonizing the inhibitory effects of ABA through suppression of ABI4. These findings establish, to our knowledge, a new and pivotal role of the plant glutamate receptor homolog and Ca2+ signaling in germination control and uncover the orchestrated modulation of the AtGLR3.5-mediated Ca2+ signal and ABA signaling via ABI4 to fine-tune the crucial developmental process, germination, in Arabidopsis. PMID:25681329

  6. Adsorption of the Lighter Homologs of Element 104 and Element 105 on DGA Resin from Various Mineral Acids

    SciTech Connect

    Bennett, M E; Sudowe, R

    2008-11-17

    The goal of studying transactinide elements is to further understand the fundamental principles that govern the periodic table. The current periodic table arrangement allows for the prediction of the chemical behavior of elements. The correct position of a transactinide element can be assessed by investigating its chemical behavior and comparing it to that of the homologs and pseudo-homologs of a transactinide element. Homologs of a transactinide element are the elements in the same group of the periodic table as the transactinide. A pseudo-homolog of a transactinide element is an element with a similar main oxidation state and similar ionic radius to the transactinide element. For example, the homologs of rutherfordium, Rf, are titanium, zirconium and hafnium (Ti, Zr and Hf); the pseudo-homologs of Rf are thorium, Th, and plutonium, Pu. Understanding the chemical behavior of a transactinide element compared to its homologs and pseudo-homologs also allows for the assessment of the role of relativistic effects. Relativistic effects occur when the velocity of the s orbital electrons closest to the nucleus approaches the speed of light. These electrons approach the speed of light because they have no orbital momentum. This causes two effects. First, there is a decrease in the Bohr radius of the inner electronic orbitals, which accompanies the relativistic increase in particle mass. A contraction of the outer s and p orbitals is also seen. The contraction of these orbitals results in an energy destabilization of the outermost shell; in the case of transactinides, this would be the 5f and 6d orbitals. The outermost d shell and all f shells can also experience a radial expansion because these orbitals are screened from the effective nuclear charge. Another relativistic effect is the 'spin-orbit splitting' of p, d and f orbitals into j = l ± 1/2 states, where j is the total angular momentum and l is the angular quantum number. All of these effects have the same order of

  7. Rhizobial homologs of the fatty acid transporter FadL facilitate perception of long-chain acyl-homoserine lactone signals

    PubMed Central

    Krol, Elizaveta; Becker, Anke

    2014-01-01

    Quorum sensing (QS) using N-acyl homoserine lactones (AHLs) as signal molecules is a common strategy used by diverse Gram-negative bacteria. A widespread mechanism of AHL sensing involves binding of these molecules by cytosolic LuxR-type transcriptional regulators, which requires uptake of external AHLs. The outer membrane is supposed to be an efficient barrier for diffusion of long-chain AHLs. Here we report evidence that in Sinorhizobium meliloti, sensing of AHLs with acyl chains composed of 14 or more carbons is facilitated by the outer membrane protein FadLSm, a homolog of the Escherichia coli FadLEc long-chain fatty acid transporter. The effect of fadLSm on AHL sensing was more prominent for longer and more hydrophobic signal molecules. Using reporter gene fusions to QS target genes, we found that fadLSm increased AHL sensitivity and accelerated the course of QS. In contrast to FadLEc, FadLSm did not support uptake of oleic acid, but did contribute to growth on palmitoleic acid. FadLSm homologs from related symbiotic α-rhizobia and the plant pathogen Agrobacterium tumefaciens differed in their ability to facilitate long-chain AHL sensing or to support growth on oleic acid. FadLAt was found to be ineffective toward long-chain AHLs. We obtained evidence that the predicted extracellular loop 5 of FadLSm and further α-rhizobial FadL proteins contains determinants of specificity to long-chain AHLs. Replacement of a part of loop 5 by the corresponding region from α-rhizobial FadL proteins transferred sensitivity for long-chain AHLs to FadLAt. PMID:25002473

  8. Identification of the amino acid sequence that targets peroxiredoxin 6 to lysosome-like structures of lung epithelial cells.

    PubMed

    Sorokina, Elena M; Feinstein, Sheldon I; Milovanova, Tatyana N; Fisher, Aron B

    2009-11-01

    Peroxiredoxin 6 (Prdx6), an enzyme with glutathione peroxidase and PLA2 (aiPLA2) activities, is highly expressed in respiratory epithelium, where it participates in phospholipid turnover and antioxidant defense. Prdx6 has been localized by immunocytochemistry and subcellular fractionation to acidic organelles (lung lamellar bodies and lysosomes) and cytosol. On the basis of their pH optima, we have postulated that protein subcellular localization determines the balance between the two activities of Prdx6. Using green fluorescent protein-labeled protein expression in alveolar epithelial cell lines, we showed Prdx6 localization to organellar structures resembling lamellar bodies in mouse lung epithelial (MLE-12) cells and lysosomes in A549 cells. Localization within lamellar bodies/lysosomes was in the luminal compartment. Targeting to lysosome-like organelles was abolished by the deletion of amino acids 31-40 from the Prdx6 NH2-terminal region; deletion of the COOH-terminal region had no effect. A green fluorescent protein-labeled peptide containing only amino acids 31-40 showed lysosomal targeting that was abolished by mutation of S32 or G34 within the peptide. Studies with mutated protein indicated that lipid binding was not necessary for Prdx6 targeting. This peptide sequence has no homology to known organellar targeting motifs. These studies indicate that the localization of Prdx6 in acidic organelles and consequent PLA2 activity depend on a novel 10-aa peptide located at positions 31-40 of the protein. PMID:19700648

  9. A Possible Mechanism of Zika Virus Associated Microcephaly: Imperative Role of Retinoic Acid Response Element (RARE) Consensus Sequence Repeats in the Viral Genome.

    PubMed

    Kumar, Ashutosh; Singh, Himanshu N; Pareek, Vikas; Raza, Khursheed; Dantham, Subrahamanyam; Kumar, Pavan; Mochan, Sankat; Faiq, Muneeb A

    2016-01-01

    Owing to the reports of microcephaly as a consistent outcome in the fetuses of pregnant women infected with ZIKV in Brazil, Zika virus (ZIKV)-microcephaly etiomechanistic relationship has recently been implicated. Researchers, however, are still struggling to establish an embryological basis for this interesting causal handcuff. The present study reveals robust evidence in favor of a plausible ZIKV-microcephaly cause-effect liaison. The rationale is based on: (1) sequence homology between ZIKV genome and the response element of an early neural tube developmental marker "retinoic acid" in human DNA and (2) comprehensive similarities between the details of brain defects in ZIKV-microcephaly and retinoic acid embryopathy. Retinoic acid is considered as the earliest factor for regulating anteroposterior axis of neural tube and positioning of structures in developing brain through retinoic acid response elements (RARE) consensus sequence (5'-AGGTCA-3') in promoter regions of retinoic acid-dependent genes. We screened genomic sequences of already reported virulent ZIKV strains (including those linked to microcephaly) and other viruses available in National Institute of Health genetic sequence database (GenBank) for the RARE consensus repeats and obtained results strongly bolstering our hypothesis that ZIKV strains associated with microcephaly may act through precipitation of dysregulation in retinoic acid-dependent genes by introducing extra stretches of RARE consensus sequence repeats in the genome of developing brain cells. Additional support to our hypothesis comes from our findings that screening of other viruses for RARE consensus sequence repeats is positive only for those known to display neurotropism and cause fetal brain defects (for which maternal-fetal transmission during developing stage may be required). The numbers of RARE sequence repeats appeared to match with the virulence of screened positive viruses. 
Although, bioinformatic evidence and embryological

  10. Detection and isolation of nucleic acid sequences using a bifunctional hybridization probe

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2000-01-01

    A method for detecting and isolating a target sequence in a sample of nucleic acids is provided using a bifunctional hybridization probe capable of hybridizing to the target sequence that includes a detectable marker and a first complexing agent capable of forming a binding pair with a second complexing agent. A kit is also provided for detecting a target sequence in a sample of nucleic acids using a bifunctional hybridization probe according to this method.

  11. Application of combined mass spectrometry and partial amino acid sequence to the identification of gel-separated proteins.

    PubMed

    Patterson, S D; Thomas, D; Bradshaw, R A

    1996-05-01

    The combined use of peptide mass information with amino acid sequence information derived by chemical sequencing or mass spectrometry (MS)-based approaches provides a powerful means of protein identification. We have used a two-part strategy to identify proteins from nerve growth factor (NGF)-stimulated rat adrenal pheochromocytoma cell line PC-12 cell lysates that associate with the adaptor protein Shc (Shc homologous and collagen protein). Initial experiments with metabolically radiolabeled cell extracts separated by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) revealed a number of proteins that coimmunoprecipitated with anti-Shc antibody compared with control (unstimulated) cell extracts. The experiment was scaled up and cell lysate from NGF-stimulated PC-12 cells was applied to a glutathione-S-transferase (GST)-Shc affinity column, eluted, separated by SDS-PAGE and blotted to Immobilon-CD. The blotted proteins were proteolytically digested in situ, and the masses obtained from the extracted peptides were used in a peptide-mass search program in an attempt to identify the protein. Even if a strong candidate was found using this search, an additional step was performed to confirm the identification. The mixtures were fractionated by reversed-phase high-performance liquid chromatography (RP-HPLC) and subjected to chemical sequencing to obtain (partial) sequence information, or post-source decay (PSD-) matrix-assisted laser-desorption ionization (MALDI)-MS to obtain sequence-specific fragment ions. This data was used in a peptide-sequence tag search to confirm the identity of the proteins. This combined approach allowed identification of four proteins of M(r) 43,000 to 200,000. In one case the identified protein clearly did not correspond to the radiolabeled band, but to a protein contaminant from the column. The advantages and pitfalls of the approach are discussed. PMID:8783013
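
The peptide-mass search step described above can be illustrated with a small sketch: digest a candidate protein in silico, compute peptide masses, and count how many observed masses fall within a tolerance. This is an assumed, simplified version of such a search, not the program used in the study. The function names, the example sequence, and the 0.5 Da tolerance are hypothetical. The cleavage rule (after K or R, not before P) is the common trypsin convention, and the masses are neutral monoisotopic values.

```python
# Monoisotopic residue masses in daltons (standard values).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_peptides(protein):
    """Split after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


def count_matches(observed, protein, tol=0.5):
    """Count observed masses within `tol` Da of any theoretical peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_peptides(protein)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed)


# Hypothetical candidate sequence and observed mass list.
candidate = "GASKPLRVTK"
peptides = tryptic_peptides(candidate)
hits = count_matches([peptide_mass("VTK"), 999.0], candidate)
```

A real search would rank every database entry by such a score and would also handle missed cleavages, modifications such as those introduced by sample preparation, and the charge state of the measured ions. As the abstract notes, even a strong candidate from this step was confirmed with sequence information.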

  12. The complete amino acid sequence of the major Kunitz trypsin inhibitor from the seeds of Prosopsis juliflora.

    PubMed

    Negreiros, A N; Carvalho, M M; Xavier Filho, J; Blanco-Labra, A; Shewry, P R; Richardson, M

    1991-01-01

    The major inhibitor of trypsin in seeds of Prosopsis juliflora was purified by precipitation with ammonium sulphate, ion-exchange column chromatography on DEAE- and CM-Sepharose and preparative reverse phase HPLC on a Vydac C-18 column. The protein inhibited trypsin in the stoichiometric ratio of 1:1, but had only weak activity against chymotrypsin and did not inhibit human salivary or porcine pancreatic alpha-amylases. SDS-PAGE indicated that the inhibitor has a Mr of ca 20,000, and IEF-PAGE showed that the pI is 8.8. The complete amino acid sequence was determined by automatic degradation, and by DABITC/PITC microsequence analysis of peptides obtained from enzyme digestions of the reduced and S-carboxymethylated protein with trypsin, chymotrypsin, elastase, the Glu-specific protease from S. aureus and the Lys-specific protease from Lysobacter enzymogenes. The inhibitor consisted of two polypeptide chains, of 137 residues (alpha chain) and 38 residues (beta chain) linked together by a single disulphide bond. The amino acid sequence of the protein exhibited homology with a number of Kunitz proteinase inhibitors from other legume seeds, the bifunctional subtilisin/alpha-amylase inhibitors from cereals and the taste-modifying protein miraculin. PMID:1367792

  13. The amino acid sequence of a cereal Bowman-Birk type trypsin inhibitor from seeds of Jobs' tears (Coix lachryma-jobi L.).

    PubMed

    Ary, M B; Shewry, P R; Richardson, M

    1988-02-29

    The major trypsin inhibitor from seeds of Jobs' tears (Coix lachryma-jobi) was purified by heat treatment, fractional precipitation with (NH4)2SO4, ion-exchange chromatography on DEAE-Sepharose, gel-filtration on Sephadex G-75 and preparative reverse-phase HPLC. The complete amino acid sequence was determined by analysis of peptides derived from the reduced and S-carboxymethylated protein by digestion with trypsin, chymotrypsin and the S. aureus V8 protease. The polypeptide contained 64 amino acids with a high content of cysteine. The sequence exhibited strong homology with a number of Bowman-Birk inhibitors from legume seeds and similar proteins recently isolated from wheat and rice. PMID:3162215

  14. Insights into the bile acid transportation system: the human ileal lipid-binding protein-cholyltaurine complex and its comparison with homologous structures.

    PubMed

    Kurz, Michael; Brachvogel, Volker; Matter, Hans; Stengelin, Siegfried; Thüring, Harald; Kramer, Werner

    2003-02-01

    Bile acids are generated in vivo from cholesterol in the liver, and they undergo an enterohepatic circulation involving the small intestine, liver, and kidney. To understand the molecular mechanism of this transportation, it is essential to gain insight into the three-dimensional (3D) structures of proteins involved in the bile acid recycling in free and complexed form and to compare them with homologous members of this protein family. Here we report the solution structure of the human ileal lipid-binding protein (ILBP) in free form and in complex with cholyltaurine. Both structures are compared with a previously published structure of the porcine ILBP-cholylglycine complex and with related lipid-binding proteins. Protein structures were determined in solution by using two-dimensional (2D)- and 3D-homo and heteronuclear NMR techniques, leading to an almost complete resonance assignment and a significant number of distance constraints for distance geometry and restrained molecular dynamics simulations. The identification of several intermolecular distance constraints unambiguously determines the cholyltaurine-binding site. The bile acid is deeply buried within ILBP with its flexible side-chain situated close to the fatty acid portal as entry region into the inner ILBP core. This binding mode differs significantly from the orientation of cholylglycine in porcine ILBP. A detailed analysis using the GRID/CPCA strategy reveals differences in favorable interactions between protein-binding sites and potential ligands. This characterization will allow for the rational design of potential inhibitors for this relevant system. PMID:12486725

  15. Amino acid sequences of two novel long-chain neurotoxins from the venom of the sea snake Laticauda colubrina.

    PubMed

    Kim, H S; Tamiya, N

    1982-11-01

    From the venom of a population of the sea snake Laticauda colubrina from the Solomon Islands, a neurotoxic component, Laticauda colubrina a (toxin Lc a), was isolated in 16.6% (A280) yield. Similarly, from the venom of a population of L. colubrina from the Philippines, a neurotoxic component, Laticauda colubrina b (toxin Lc b), was obtained in 10.0% (A280) yield. The LD50 values of these toxins were 0.12 microgram/g body wt. on intramuscular injection in mice. Toxins Lc a and Lc b were each composed of molecules containing 69 amino acid residues with eight half-cystine residues. The complete amino acid sequences of these two toxins were elucidated. Toxins Lc a and Lc b are different from each other at five positions of their sequences, namely at positions 31 (Phe/Ser), 32 (Leu/Ile), 33 (Lys/Arg), 50 (Pro/Arg) and 53 (Asp/His) (residues in parentheses give the residues in toxins Lc a and Lc b respectively). Toxins Lc a and Lc b have a novel structure in that they have only four disulphide bridges, although the whole amino acid sequences are homologous to those of other known long-chain neurotoxins. It is remarkable that toxins Lc a and Lc b are not coexistent at the detection error of 6% of the other toxin. Populations of Laticauda colubrina from the Solomon Islands and from the Philippines have either toxin Lc a or toxin Lc b and not both of them. PMID:7159381

  16. Solid phase sequencing of biopolymers

    DOEpatents

    Cantor, Charles R.; Hubert, Koster

    2014-06-24

    This invention relates to methods for detecting and sequencing target nucleic acid sequences, to mass modified nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Probes may be affixed to a solid support such as a hybridization chip to facilitate automated molecular weight analysis and identification of the target sequence.

  17. Amino acid sequence of Japanese quail (Coturnix japonica) and northern bobwhite (Colinus virginianus) myoglobin.

    PubMed

    Goodson, John; Beckstead, Robert B; Payne, Jason; Singh, Rakesh K; Mohan, Anand

    2015-08-15

    Myoglobin has an important physiological role in vertebrates, and as the primary sarcoplasmic pigment in meat, influences quality perception and consumer acceptability. In this study, the amino acid sequences of Japanese quail and northern bobwhite myoglobin were deduced by cDNA cloning of the coding sequence from mRNA. Japanese quail myoglobin was isolated from quail cardiac muscles, purified using ammonium sulphate precipitation and gel-filtration, and subjected to multiple enzymatic digestions. Mass spectrometry corroborated the deduced protein amino acid sequence at the protein level. Sequence analysis revealed both species' myoglobin structures consist of 153 amino acids, differing at only three positions. When compared with chicken myoglobin, Japanese quail showed 98% sequence identity, and northern bobwhite 97% sequence identity. The myoglobin in both quail species contained eight histidine residues instead of the nine present in chicken and turkey. PMID:25794748
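
    As a rough illustration of the arithmetic behind statements such as "153 amino acids, differing at only three positions", the sketch below counts mismatches between two pre-aligned sequences and converts them to a percent identity. It assumes equal-length, ungapped sequences; the function names and the toy sequences are hypothetical and are not the quail or chicken myoglobin sequences from the study.

```python
def aligned_differences(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, residue_a, residue_b) for every mismatch.

    Assumes the two sequences are already aligned, ungapped, and equal in length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions carrying the same residue."""
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = len(aligned_differences(seq_a, seq_b))
    return 100.0 * (len(seq_a) - mismatches) / len(seq_a)


# Toy data only: a 153-residue placeholder and a copy with three substitutions.
toy_a = "A" * 153
toy_b_list = list(toy_a)
toy_b_list[9], toy_b_list[49], toy_b_list[99] = "G", "S", "T"
toy_b = "".join(toy_b_list)

differences = aligned_differences(toy_a, toy_b)
identity = percent_identity(toy_a, toy_b)
```

    Under these assumptions, three differences in 153 residues correspond to 150/153, or about 98.0% identity.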

  18. The mouse and human excitatory amino acid transporter gene (EAAT1) maps to mouse chromosome 15 and a region of syntenic homology on human chromosome 5

    SciTech Connect

    Kirschner, M.A.; Arriza, J.L.; Amara, S.G.

    1994-08-01

    The gene for human excitatory amino acid transporter (EAAT1) was localized to the distal region of human chromosome 5p13 by in situ hybridization of metaphase chromosome spreads. Interspecific backcross analysis identified the mouse Eaat1 locus in a region of 5p13 homology on mouse chromosome 15. Markers that are linked with EAAT1 on both human and mouse chromosomes include the receptors for leukemia inhibitory factor, interleukin-7, and prolactin. The Eaat1 locus appears not to be linked to the epilepsy mutant stg locus, which is also on chromosome 15. The EAAT1 locus is located in a region of 5p deletions that have been associated with mental retardation and microcephaly. 22 refs., 2 figs.

  19. Genomic homologous recombination in planta.

    PubMed Central

    Gal, S; Pisan, B; Hohn, T; Grimsley, N; Hohn, B

    1991-01-01

    A system for monitoring intrachromosomal homologous recombination in whole plants is described. A multimer of cauliflower mosaic virus (CaMV) sequences, arranged such that CaMV could only be produced by recombination, was integrated into Brassica napus nuclear DNA. This set-up allowed scoring of recombination events by the appearance of viral symptoms. The repeated homologous regions were derived from two different strains of CaMV so that different recombinant viruses (i.e. different recombination events) could be distinguished. In most of the transgenic plants, a single major virus species was detected. About half of the transgenic plants contained viruses of the same type, suggesting a hotspot for recombination. The remainder of the plants contained viruses with cross-over sites distributed throughout the rest of the homologous sequence. Sequence analysis of two recombinant molecules suggests that mismatch repair is linked to the recombination process. Images PMID:2026150

  20. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration.

  1. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-03-24

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration. 14 figs.

  2. tax and rex Sequences of bovine leukaemia virus from globally diverse isolates: rex amino acid sequence more variable than tax.

    PubMed

    McGirr, K M; Buehring, G C

    2005-02-01

    Bovine leukaemia virus (BLV) is an important agricultural problem with high costs to the dairy industry. Here, we examine the variation of the tax and rex genes of BLV. The tax and rex genes share 420 bases and have overlapping reading frames. The tax gene encodes a protein that functions as a transactivator of the BLV promoter, is required for viral replication, acts on cellular promoters, and is responsible for oncogenesis. The rex facilitates the export of viral mRNAs from the nucleus and regulates transcription. We have sequenced five new isolates of the tax/rex gene. We examined the five new and three previously published tax/rex DNA and predicted amino acid sequences of BLV isolates from cattle in representative regions worldwide. The highest variation among nucleic acid sequences for tax and rex was 7% and 5%, respectively; among predicted amino acid sequences for Tax and Rex, 9% and 11%, respectively. Significantly more nucleotide changes resulted in predicted amino acid changes in the rex gene than in the tax gene (P ≤ 0.0006). This variability is higher than previously reported for any region of the viral genome. This research may also have implications for the development of Tax-based vaccines. PMID:15702995

  3. The amino acid sequence of protein CM-3 from Dendroaspis polylepis polylepis (black mamba) venom.

    PubMed

    Joubert, F J

    1985-01-01

    Protein CM-3 from Dendroaspis polylepis polylepis venom was purified by gel filtration and ion exchange chromatography. It comprises 65 amino acids including eight half-cystines. The complete amino acid sequence of protein CM-3 has been elucidated. The sequence (residues 1-50) resembles that of the N-terminal sequence of the subunits of a synergistic type protein and residues 51-65 that of the C-terminal sequence of an angusticeps type protein. Mixtures of protein CM-3 and angusticeps type proteins showed no apparent synergistic effect, in that their toxicity in combination was no greater than the sum of their individual toxicities. PMID:4029488

  4. Homology-independent metrics for comparative genomics.

    PubMed

    Coutinho, Tarcisio José Domingos; Franco, Glória Regina; Lobo, Francisco Pereira

    2015-01-01

    A mainstream procedure to analyze the wealth of genomic data available nowadays is the detection of homologous regions shared across genomes, followed by the extraction of biological information from the patterns of conservation and variation observed in such regions. Although of pivotal importance, comparative genomic procedures that rely on homology inference are obviously not applicable if no homologous regions are detectable. This fact excludes a considerable portion of "genomic dark matter" with no significant similarity - and, consequently, no inferred homology to any other known sequence - from several downstream comparative genomic methods. In this review we compile several sequence metrics that do not rely on homology inference and can be used to compare nucleotide sequences and extract biologically meaningful information from them. These metrics comprise several compositional parameters calculated from sequence data alone, such as GC content, dinucleotide odds ratio, and several codon bias metrics. They also share other interesting properties, such as pervasiveness (patterns persist on smaller scales) and phylogenetic signal. We also cite examples where these homology-independent metrics have been successfully applied to support several bioinformatics challenges, such as taxonomic classification of biological sequences without homology inference. They were also used to detect higher-order patterns of interactions in biological systems, ranging from detecting coevolutionary trends between the genomes of viruses and their hosts to characterization of gene pools of entire microbial communities. We argue that, if correctly understood and applied, homology-independent metrics can add important layers of biological information in comparative genomic studies without prior homology inference. PMID:26029354
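
    Two of the compositional metrics named in this abstract, GC content and the dinucleotide odds ratio, can be computed from a single sequence without any alignment. The following is a minimal sketch, assuming an uppercase or lowercase DNA string containing only A, C, G and T, and using the common single-strand definition rho_xy = f_xy / (f_x * f_y). The function names are illustrative and are not taken from the review.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of G and C among the A/C/G/T bases of the sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    return sum(b in "GC" for b in bases) / len(bases)


def dinucleotide_odds_ratio(seq: str) -> dict[str, float]:
    """Observed/expected ratio for every dinucleotide present in the sequence.

    rho_xy = f_xy / (f_x * f_y), where f_xy is the frequency of the
    dinucleotide among overlapping pairs and f_x, f_y are base frequencies.
    Assumes the sequence contains only A, C, G and T.
    """
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two bases")
    mono = Counter(seq)
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_mono, n_pairs = len(seq), len(seq) - 1
    return {
        pair: (count / n_pairs)
        / ((mono[pair[0]] / n_mono) * (mono[pair[1]] / n_mono))
        for pair, count in pairs.items()
    }
```

    Values of rho well below 1 indicate an under-represented dinucleotide and values well above 1 an over-represented one, relative to what the base composition alone would predict.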

  5. Antigenic and protein sequence homology between VP13/14, a herpes simplex virus type 1 tegument protein, and gp10, a glycoprotein of equine herpesvirus 1 and 4.

    PubMed Central

    Whittaker, G R; Riggio, M P; Halliburton, I W; Killington, R A; Allen, G P; Meredith, D M

    1991-01-01

    Monospecific polyclonal antisera raised against VP13/14, a major tegument protein of herpes simplex virus type 1 cross-reacted with structural equine herpesvirus 1 and 4 proteins of Mr 120,000 and 123,000, respectively; these proteins are identical in molecular weight to the corresponding glycoprotein 10 (gp10) of each virus. Using a combination of immune precipitation and Western immunoblotting techniques, we confirmed that anti-VP13/14 and a monoclonal antibody to gp10 reacted with the same protein. Sequence analysis of a lambda gt11 insert of equine herpesvirus 1 gp10 identified an open reading frame in equine herpesvirus 4 with which it showed strong homology; this open reading frame also shared homology with gene UL47 of herpes simplex virus type 1 and gene 11 of varicella-zoster virus. This showed that, in addition to immunological cross-reactivity, VP13/14 and gp10 have protein sequence homology; it also allowed identification of VP13/14 as the gene product of UL47. Images PMID:1850013

  6. Computer Simulation of the Determination of Amino Acid Sequences in Polypeptides

    ERIC Educational Resources Information Center

    Daubert, Stephen D.; Sontum, Stephen F.

    1977-01-01

    Describes a computer program that generates a random string of amino acids and guides the student in determining the correct sequence of a given protein by using experimental analytic data for that protein. (MLH)

  7. The amino acid sequence of monal pheasant lysozyme and its activity.

    PubMed

    Araki, T; Matsumoto, T; Torikata, T

    1998-10-01

    The amino acid sequence of monal pheasant lysozyme and its activity were analyzed. Carboxymethylated lysozyme was digested with trypsin and the resulting peptides were sequenced. The established amino acid sequence had one amino acid substitution at position 102 (Arg to Gly) compared with Indian peafowl lysozyme and four amino acid substitutions at positions 3 (Phe to Tyr), 15 (His to Leu), 41 (Gln to His), and 121 (Gln to His) compared with chicken lysozyme. Analysis of the time-courses of reaction using N-acetylglucosamine pentamer as a substrate showed a difference of binding free energy change (-0.4 kcal/mol) at subsites A between monal pheasant and Indian peafowl lysozyme. This was assumed to be caused by the amino acid substitution at subsite A with loss of a positive charge at position 102 (Arg102 to Gly). PMID:9836434

  8. Studies on monotreme proteins. VII. Amino acid sequence of myoglobin from the platypus, Ornithorhynchus anatinus.

    PubMed

    Fisher, W K; Thompson, E O

    1976-03-01

    Myoglobin isolated from skeletal muscle of the platypus contains 153 amino acid residues. The complete amino acid sequence has been determined following cleavage with cyanogen bromide and further digestion of the four fragments with trypsin, chymotrypsin, pepsin and thermolysin. Sequences of the purified peptides were determined by the dansyl-Edman procedure. The amino acid sequence showed 25 differences from human myoglobin and 24 from kangaroo myoglobin. Amino acid sequences in myoglobins are more conserved than sequences in the alpha- and beta-globin chains, and platypus myoglobin shows a similar number of variations in sequence to kangaroo myoglobin when compared with myoglobin of other species. The date of divergence of the platypus from other mammals was estimated at 102 +/- 31 million years, based on the number of amino acid differences between species and allowing for mutations during the evolutionary period. This estimate differs widely from the estimate given by similar treatment of the alpha- and beta-chain sequences and a constant rate of mutation of globin chains is not supported. PMID:962722

  9. cDNA-derived amino acid sequences of myoglobins from nine species of whales and dolphins.

    PubMed

    Iwanami, Kentaro; Mita, Hajime; Yamamoto, Yasuhiko; Fujise, Yoshihiro; Yamada, Tadasu; Suzuki, Tomohiko

    2006-10-01

    We determined the myoglobin (Mb) cDNA sequences of nine cetaceans, of which six are the first reports of Mb sequences: sei whale (Balaenoptera borealis), Bryde's whale (Balaenoptera edeni), pygmy sperm whale (Kogia breviceps), Stejneger's beaked whale (Mesoplodon stejnegeri), Longman's beaked whale (Indopacetus pacificus), and melon-headed whale (Peponocephala electra), and three confirm the previously determined chemical amino acid sequences: sperm whale (Physeter macrocephalus), common minke whale (Balaenoptera acutorostrata) and pantropical spotted dolphin (Stenella attenuata). We found two types of Mb in the skeletal muscle of pantropical spotted dolphin: Mb I with the same amino acid sequence as that deposited in the protein database, and Mb II, which differs at two amino acid residues compared with Mb I. Using an alignment of the amino acid or cDNA sequences of cetacean Mb, we constructed a phylogenetic tree by the NJ method. Clustering of cetacean Mb amino acid and cDNA sequences essentially follows the classical taxonomy of cetaceans, suggesting that Mb sequence data is valid for classification of cetaceans at least to the family level. PMID:16962803

  10. Evolution of alpha-lactalbumins. The complete amino acid sequence of the alpha-lactalbumin from a marsupial (Macropus rufogriseus) and corrections to regions of sequence in bovine and goat alpha-lactalbumins.

    PubMed

    Shewale, J G; Sinha, S K; Brew, K

    1984-04-25

    alpha-Lactalbumin was purified from a whey protein fraction of the milk of the red-necked wallaby (Macropus rufogriseus). The complete amino acid sequence was determined from the results of automatic sequenator analyses of the intact protein, the three cyanogen bromide fragments, and of peptides generated from the larger, COOH-terminal CNBr fragment by digestion with trypsin or staphylococcal protease. This is the first sequence to be determined of an alpha-lactalbumin from a marsupial and differs from known eutherian alpha-lactalbumins in size and locations of deletions in alignments with the homologous type c lysozymes, as well as in having amino acid substitutions at 8 sites that are invariant in known eutherian proteins. Some corrections are also reported for two regions of sequence in both bovine and goat alpha-lactalbumins. The new and previously published information on alpha-lactalbumin sequences is analyzed in relation to the evolutionary history of the alpha-lactalbumin line as well as the relationship of structure to function in these proteins. PMID:6715332

  11. The amino-acid sequence of the glucose/mannose-specific lectin isolated from Parkia platycephala seeds reveals three tandemly arranged jacalin-related domains.

    PubMed

    Mann, K; Farias, C M; Del Sol, F G; Santos, C F; Grangeiro, T B; Nagano, C S; Cavada, B S; Calvete, J J

    2001-08-01

    A mannose/glucose-specific lectin was isolated from seeds of Parkia platycephala, the most primitive subfamily of Leguminosae plants. The molecular mass of the purified lectin determined by mass spectrometry was 47 946 +/- 6 Da (by electrospray ionization) and 47 951 +/- 9 Da (by matrix-assisted laser-desorption ionization). The apparent molecular mass of the lectin in solutions of pH in the range 4.5-8.5 determined by analytical ultracentrifugation equilibrium sedimentation was 94 +/- 3 kDa, showing that the protein behaved as a non-pH-dependent dimer. The amino-acid sequence of the Parkia lectin was determined by Edman degradation of overlapping peptides. This is the first report of the primary structure of a Mimosoideae lectin. The protein contained a blocked N-terminus and a single, nonglycosylated polypeptide chain composed of three tandemly arranged homologous domains. Each of these domains shares sequence similarity with jacalin-related lectin monomers from Asteraceae, Convolvulaceae, Moraceae, Musaceae, Gramineae, and Fagaceae plant families. Based on this homology, we predict that each Parkia lectin repeat may display a beta prism fold similar to that observed in the crystal structure of the lectin from Helianthus tuberosus. The P. platycephala lectin also shows sequence similarity with stress- and pathogen-upregulated defence genes of a number of different plants, suggesting a common ancestry for jacalin-related lectins and inducible defence proteins. PMID:11502201

  12. Draft Genome Sequences of Two Novel Acidimicrobiaceae Members from an Acid Mine Drainage Biofilm Metagenome.

    PubMed

    Pinto, Ameet J; Sharp, Jonathan O; Yoder, Michael J; Almstrand, Robert

    2016-01-01

    Bacteria belonging to the family Acidimicrobiaceae are frequently encountered in heavy metal-contaminated acidic environments. However, their phylogenetic and metabolic diversity is poorly resolved. We present draft genome sequences of two novel and phylogenetically distinct Acidimicrobiaceae members assembled from an acid mine drainage biofilm metagenome. PMID:26769942

  13. Draft Genome Sequences of Two Novel Acidimicrobiaceae Members from an Acid Mine Drainage Biofilm Metagenome

    PubMed Central

    Pinto, Ameet J.; Sharp, Jonathan O.; Yoder, Michael J.

    2016-01-01

    Bacteria belonging to the family Acidimicrobiaceae are frequently encountered in heavy metal-contaminated acidic environments. However, their phylogenetic and metabolic diversity is poorly resolved. We present draft genome sequences of two novel and phylogenetically distinct Acidimicrobiaceae members assembled from an acid mine drainage biofilm metagenome. PMID:26769942

  14. Complete Genome Sequence Analysis of Acute and Mild Strains of Classical Swine Fever Virus Subgenotype 3.2.

    PubMed

    Lim, Seong-In; Han, Song-Hee; Hyun, HyeSook; Lim, Ji-Ae; Song, Jae-Young; Cho, In-Soo; An, Dong-Jun

    2016-01-01

    We report the complete genome sequences of two classical swine fever virus strains (JJ9811 and YI9908). Both belong to subgenotype 3.2. Strain JJ9811 causes mild symptoms and strain YI9908 causes acute symptoms. The sequences were 95.7% homologous at the nucleotide level and 95.6% homologous at the amino acid level. PMID:26823570

  15. Complete Genome Sequence Analysis of Acute and Mild Strains of Classical Swine Fever Virus Subgenotype 3.2

    PubMed Central

    Lim, Seong-In; Han, Song-Hee; Hyun, HyeSook; Lim, Ji-Ae; Song, Jae-Young; Cho, In-Soo

    2016-01-01

    We report the complete genome sequences of two classical swine fever virus strains (JJ9811 and YI9908). Both belong to subgenotype 3.2. Strain JJ9811 causes mild symptoms and strain YI9908 causes acute symptoms. The sequences were 95.7% homologous at the nucleotide level and 95.6% homologous at the amino acid level. PMID:26823570

  16. Nucleotide sequences and characterization of liv genes encoding components of the high-affinity branched-chain amino acid transport system in Salmonella typhimurium.

    PubMed

    Matsubara, K; Ohnishi, K; Kiritani, K

    1992-07-01

    A 7.6-kb fragment of Salmonella typhimurium LT2 containing the liv gene cluster, which specifies the high-affinity branched-chain amino acid transport system (LIV-I), has been isolated. The upstream region contains the livB and livC genes encoding the leucine-isoleucine-valine-threonine and leucine-specific binding proteins, respectively. In this study, the nucleotide sequence of the 4-kb downstream segment was determined and found to contain four reading frames, designated as livA, livE, livF, and livG, that encode putative membrane-associated proteins. The livA and livE genes encode hydrophobic proteins composed of 308 and 425 amino acid residues, respectively. The livF and livG genes encode hydrophilic proteins of 255 and 237 amino acids, respectively; both the proteins contain consensus amino acid sequences found in proteins with ATP-binding sites. These four genes linked together have a potential rho-independent transcriptional terminator adjacent to the 3'-end of livG. No promoter sequence was found in the immediate upstream region of the livAEFG cluster. The livA, livE, livF, and livG gene products were identified as proteins with apparent M(r)s of 25,500, 34,500, 28,000, and 26,000, respectively, by SDS-polyacrylamide gel electrophoresis. The deduced amino acid sequences of these four proteins showed strong homology to those of the corresponding membrane-associated proteins required for the high-affinity branched-chain amino acid transport systems from both Escherichia coli and Pseudomonas aeruginosa. PMID:1429514

  17. Complete amino acid sequence of the lentil trypsin-chymotrypsin inhibitor LCI-1.7 and a discussion of atypical binding sites of Bowman-Birk inhibitors.

    PubMed

    Weder, Jürgen K P; Hinkers, Sabine C

    2004-06-30

    The complete primary structure of the lentil (Lens culinaris) trypsin-chymotrypsin inhibitor LCI-1.7 was determined by conventional methods in order to find relationships between partial sequences and the difference in action against human and bovine chymotrypsin. As other Bowman-Birk type inhibitors, LCI-1.7 contained 68 amino acid residues, seven disulfide bridges, and two reactive sites, Arg16-Ser17 for trypsin and Tyr42-Ser43 for chymotrypsin. Evaluation of sequence homologies showed that it belonged to the group III Bowman-Birk inhibitors. The atypical additional binding site of LCI-1.7 for human chymotrypsin was discussed and compared with such binding sites of two other Bowman-Birk inhibitors, the Bowman-Birk soybean proteinase inhibitor BBI, and the lima bean proteinase inhibitor LBI I, for human and bovine trypsin and chymotrypsin. A concept to reduce the action of these inhibitors against human enzymes by genetic engineering was proposed. PMID:15212472

  18. Integration of hup cosmid pHU52 into the chromosomal DNA of Cicer-Rhizobium using Tn5 as an homologous sequence.

    PubMed

    Kunnimalaiyaan, M; Lodha, M L; Sreekumar, K R

    1992-11-01

    Cosmid pHU52, which carries hup genes of Bradyrhizobium japonicum, has been integrated into the Cicer-Rhizobium G36-84 genome via Tn5-mediated homologous recombination. Tn5 was inserted into both the cosmid pHU52 and the chromosome of Cicer-Rhizobium to provide a region of DNA homology, without affecting the expression of necessary genes. An incompatible plasmid, pPH1JI, was used to select those few cells that had undergone recombination. The integration of the cosmid was demonstrated by Southern blot analysis. Chromosomal integration of the hup genes maximized stability and minimized the potential for their horizontal transfer to other bacterial species. The integrated hup genes were found to express ex planta as well as in nodules. The method described illustrates how a given gene can be stably integrated into the chromosome. PMID:24425601

  19. Two distinct ferredoxins from Rhodobacter capsulatus: complete amino acid sequences and molecular evolution.

    PubMed

    Saeki, K; Suetsugu, Y; Yao, Y; Horio, T; Marrs, B L; Matsubara, H

    1990-09-01

    Two distinct ferredoxins were purified from Rhodobacter capsulatus SB1003. Their complete amino acid sequences were determined by a combination of protease digestion, BrCN cleavage and Edman degradation. Ferredoxins I and II were composed of 64 and 111 amino acids, respectively, with molecular weights of 6,728 and 12,549 excluding iron and sulfur atoms. Both contained two Cys clusters in their amino acid sequences. The first cluster of ferredoxin I and the second cluster of ferredoxin II had a sequence, CxxCxxCxxxCP, in common with the ferredoxins found in Clostridia. The second cluster of ferredoxin I had a sequence, CxxCxxxxxxxxCxxxCM, with extra amino acids between the second and third Cys, which has been reported for other photosynthetic bacterial ferredoxins and putative ferredoxins (nif-gene products) from nitrogen-fixing bacteria, and with a unique occurrence of Met. The first cluster of ferredoxin II had a CxxCxxxxCxxxCP sequence, with two additional amino acids between the second and third Cys, a characteristic feature of Azotobacter-[3Fe-4S] [4Fe-4S]-ferredoxin. Ferredoxin II was also similar to Azotobacter-type ferredoxins with an extended carboxyl (C-) terminal sequence compared to the common Clostridium-type. The evolutionary relationship of the two together with a putative one recently found to be encoded in nifENXQ region in this bacterium [Moreno-Vivian et al. (1989) J. Bacteriol. 171, 2591-2598] is discussed. PMID:2277040
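
    Cysteine-cluster patterns written in the "x" notation used above (for example CxxCxxCxxxCP, where x stands for any residue) can be searched for with a regular expression. The sketch below is one simple way to do this, assuming one-letter amino acid codes; the helper names are hypothetical and the test peptide is invented for illustration, not taken from either R. capsulatus ferredoxin.

```python
import re


def motif_to_regex(motif: str) -> str:
    """Convert an 'x'-style motif (x = any residue) into a regex pattern."""
    return "".join("." if ch == "x" else re.escape(ch) for ch in motif)


def find_motif(sequence: str, motif: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched segment) for each match.

    Matches are non-overlapping, as produced by re.finditer.
    """
    pattern = re.compile(motif_to_regex(motif))
    return [(m.start() + 1, m.group()) for m in pattern.finditer(sequence.upper())]


# Invented peptide containing one Clostridium-type cluster, for illustration only.
toy_sequence = "MAYKIADSCVSCGACASECPVNA"
hits = find_motif(toy_sequence, "CxxCxxCxxxCP")
```

    The lowercase x is treated as a wildcard while uppercase letters are matched literally, so the three motifs quoted in the abstract can be passed to find_motif unchanged.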

  20. Amino Acid Sequence of Anionic Peroxidase from the Windmill Palm Tree Trachycarpus fortunei

    PubMed Central

    2015-01-01

    Palm peroxidases are extremely stable and have uncommon substrate specificity. This study was designed to fill in the knowledge gap about the structures of a peroxidase from the windmill palm tree Trachycarpus fortunei. The complete amino acid sequence and partial glycosylation were determined by MALDI-top-down sequencing of native windmill palm tree peroxidase (WPTP), MALDI-TOF/TOF MS/MS of WPTP tryptic peptides, and cDNA sequencing. The propeptide of WPTP contained N- and C-terminal signal sequences which contained 21 and 17 amino acid residues, respectively. Mature WPTP was 306 amino acids in length, and its carbohydrate content ranged from 21% to 29%. Comparison to closely related royal palm tree peroxidase revealed structural features that may explain differences in their substrate specificity. The results can be used to guide engineering of WPTP and its novel applications. PMID:25383699

  1. Homology study of two polyhydroxyalkanoate (PHA) synthases from Pseudomonas aureofaciens.

    PubMed

    Umeda, F; Nishikawa, T; Miyasaka, H; Maeda, I; Kawase, M; Yagi, K

    2001-11-01

    Recently, we have cloned and analyzed two polyhydroxyalkanoate (PHA) synthase genes (phaC1 and phaC2 in the pha cluster) from Pseudomonas aureofaciens. In this report, the deduced amino acid (AA) sequences of PHA synthase 1 and PHA synthase 2 from P. aureofaciens are compared with those from three other bacterial strains (Pseudomonas sp. 61-3, P. oleovorans and P. aeruginosa) containing the homologous pha cluster. The level of homology of either PHA synthase 1 or PHA synthase 2 was high with each enzyme from these three bacterial strains. Furthermore, multialignment of PHA synthase AA sequences implied that both enzymes of PHA synthase 1 and PHA synthase 2 were highly conserved in the four strains including P. aureofaciens. PMID:11916262

  2. Protein chemotaxonomy. XIII. Amino acid sequence of ferredoxin from Panax ginseng.

    PubMed

    Mino, Yoshiki

    2006-08-01

    The complete amino acid sequence of [2Fe-2S] ferredoxin from Panax ginseng (Araliaceae) has been determined by automated Edman degradation of the entire S-carboxymethylcysteinyl protein and of the peptides obtained by enzymatic digestion. This ferredoxin has a unique amino acid sequence, which includes an insertion of Tyr at the 3rd position from the amino-terminus and a deletion of two amino acid residues at the carboxyl terminus. This ferredoxin had 18 differences in its amino acid sequence compared to that of Petroselinum sativum (Umbelliferae). In contrast, 23-33 differences were observed compared to other dicotyledonous plants. This suggests that Panax ginseng is related taxonomically to umbelliferous plants. PMID:16880642

  3. Despite sequence homologies to gluten, salivary proline-rich proteins do not elicit immune responses central to the pathogenesis of celiac disease.

    PubMed

    Tian, Na; Leffler, Daniel A; Kelly, Ciaran P; Hansen, Joshua; Marietta, Eric V; Murray, Joseph A; Schuppan, Detlef; Helmerhorst, Eva J

    2015-12-01

    Celiac disease (CD) is an inflammatory disorder triggered by ingested gluten, causing immune-mediated damage to the small-intestinal mucosa. Gluten proteins are strikingly similar in amino acid composition and sequence to proline-rich proteins (PRPs) in human saliva. On the basis of this feature and their shared destination in the gastrointestinal tract, we hypothesized that salivary PRPs may modulate gluten-mediated immune responses in CD. Parotid salivary secretions were collected from CD patients, refractory CD patients, non-CD patients with functional gastrointestinal complaints, and healthy controls. Structural similarities of PRPs with gluten were probed with anti-gliadin antibodies. Immune responses to PRPs were investigated toward CD patient-derived peripheral blood mononuclear cells and in a humanized transgenic HLA-DQ2/DQ8 mouse model for CD. Anti-gliadin antibodies weakly cross-reacted with the abundant salivary amylase but not with PRPs. Likewise, the R5 antibody, recognizing potential antigenic gluten epitopes, showed negligible reactivity to salivary proteins from all groups. Inflammatory responses in peripheral blood mononuclear cells were provoked by gliadins whereas responses to PRPs were similar to control levels, and PRPs did not compete with gliadins in immune stimulation. In vivo, PRP peptides were well tolerated and nonimmunogenic in the transgenic HLA-DQ2/DQ8 mouse model. Collectively, although structurally similar to dietary gluten, salivary PRPs were nonimmunogenic in CD patients and in a transgenic HLA-DQ2/DQ8 mouse model for CD. It is possible that salivary PRPs play a role in tolerance induction to gluten early in life. Deciphering the structural basis for the lack of immunogenicity of salivary PRPs may further our understanding of the toxicity of gluten. PMID:26505973

  4. N-terminal sequence of amino acids and some properties of an acid-stable alpha-amylase from citric acid-koji (Aspergillus usamii var.).

    PubMed

    Suganuma, T; Tahara, N; Kitahara, K; Nagahama, T; Inuzuka, K

    1996-01-01

    An acid-stable alpha-amylase (AA) was purified from an acidic extract of citric acid-koji (A. usamii var.). The N-terminal sequence of the first 20 amino acids of the enzyme was identical with that of AA from A. niger, but the two enzymes differed in molecular weight. HPLC analysis for identifying the anomers of products indicated that the AA hydrolyzed maltopentaose (G5) at the third glycoside bond predominantly, which differed from Taka-amylase A and the neutral alpha-amylase (NA) from the citric acid-koji. PMID:8824843

  5. Homology, Analogy, and Ethology.

    ERIC Educational Resources Information Center

    Beer, Colin G.

    1984-01-01

    Because the main criterion of structural homology (the principle of connections) does not exist for behavioral homology, the utility of the ethological concept of homology has been questioned. The confidence with which behavioral homologies can be claimed varies inversely with taxonomic distance. Thus, conjectures about long-range phylogenetic…

  6. A salicylic acid-based small molecule inhibitor for the oncogenic Src homology-2 domain containing protein tyrosine phosphatase-2 (SHP2)

    PubMed Central

    Zhang, Xian; He, Yantao; Liu, Sijiu; Yu, Zhihong; Jiang, Zhong-Xing; Yang, Zhenyun; Dong, Yuanshu; Nabinger, Sarah C.; Wu, Li; Gunawan, Andrea M.; Wang, Lina; Chan, Rebecca J.; Zhang, Zhong-Yin

    2010-01-01

    The Src homology-2 domain containing protein tyrosine phosphatase-2 (SHP2) plays a pivotal role in growth factor and cytokine signaling. Gain-of-function SHP2 mutations are associated with Noonan syndrome, various kinds of leukemias and solid tumors. Thus there is considerable interest in SHP2 as a potential target for anti-cancer and anti-leukemia therapy. We report a salicylic acid-based combinatorial library approach aimed to bind both active site and unique nearby sub-pockets for enhanced affinity and selectivity. Screening of the library led to the identification of a SHP2 inhibitor II-B08 (compound 9) with highly efficacious cellular activity. Compound 9 blocks growth factor stimulated ERK1/2 activation and hematopoietic progenitor proliferation, providing supporting evidence that chemical inhibition of SHP2 may be therapeutically useful for anti-cancer and anti-leukemia treatment. X-ray crystallographic analysis of the structure of SHP2 in complex with 9 reveals molecular determinants that can be exploited for the acquisition of more potent and selective SHP2 inhibitors. PMID:20170098

  7. Small activating ribonucleic acid reverses tyrosine kinase inhibitor resistance in epidermal growth factor receptor‐mutant lung cancer by increasing the expression of phosphatase and tensin homolog

    PubMed Central

    Li, Meng; Peng, Zhongmin; Ren, Wangang

    2016-01-01

    Background: Epidermal growth factor receptor‐tyrosine kinase inhibitors (TKI‐EGFRs) present a new prospect for the treatment of lung cancer. However, in clinical application, the majority of patients become TKI resistant within a year. A growing number of studies have shown that a loss of phosphatase and tensin homolog (PTEN) expression is associated with TKI resistance. An alternative method of upregulating PTEN expression may reverse TKI resistance. Methods: We designed five candidate small activating ribonucleic acids (saRNAs) to target PTEN, and transfected them into H‐157 cells to screen for functional saRNA. We used reverse transcriptase‐polymerase chain reaction and Western blot to evaluate the effect of saRNA on PTEN expression. We then analyzed the growth and apoptosis of cells transfected with saRNA under TKI treatment to investigate whether saRNAs can reverse TKI resistance by upregulating PTEN expression. Results: The functional saRNA we designed could upregulate PTEN expression. The H‐157 cells transfected with saRNA grew more slowly in the presence of TKI drugs than the cells that were not transfected with saRNA. The apoptosis rate was also markedly higher. Conclusions: Our study proves that loss of PTEN expression is an important mechanism of TKI resistance. It is possible to control TKI resistance by upregulating PTEN expression using RNA activation technology. PMID:27385992

  8. LISTA, LISTA-HOP and LISTA-HON: a comprehensive compilation of protein encoding sequences and its associated homology databases from the yeast Saccharomyces.

    PubMed Central

    Dölz, R; Mossé, M O; Slonimski, P P; Bairoch, A; Linder, P

    1994-01-01

    We continued our effort to make a comprehensive database (LISTA) for the yeast Saccharomyces cerevisiae. In this database each sequence has been attributed a single genetic name. In the case of duplicated sequences a simple method has been applied to distinguish between sequences of one and the same gene from non-allelic sequences of duplicated genes. If necessary, synonyms are given in the case of allelic duplicated sequences. Thus sequences can be found either by the name or by synonyms given in LISTA. Each entry contains the genetic name, the mnemonic from the EMBL data bank, the codon bias, reference of the publication of the sequence, Chromosomal location as far as known, Swissprot and EMBL accession numbers. To obtain more information on the included sequences, each entry has been screened against non-redundant nucleotide and protein data bank collections resulting in LISTA-HON and LISTA-HOP. The LISTA data base can be linked to the associated data sets or to nucleotide and protein banks by the Sequence Retrieval System (SRS). PMID:7937046

  9. RecA-mediated sequence homology recognition as an example of how searching speed in self-assembly systems can be optimized by balancing entropic and enthalpic barriers

    PubMed Central

    Jiang, Lili; Prentiss, Mara

    2016-01-01

    Ideally, self-assembly should rapidly and efficiently produce stable correctly assembled structures. We study the tradeoff between enthalpic and entropic cost in self-assembling systems using RecA-mediated homology search as an example. Earlier work suggested that RecA searches could produce stable final structures with high stringency using a slow testing process that follows an initial rapid search of ~9–15 bases. In this work, we will show that as a result of entropic and enthalpic barriers, simultaneously testing all ~9–15 bases as separate individual units results in a longer overall searching time than testing them in groups and stages. PMID:25215755

  10. Isolation and Characterization of Two Saccharomyces cerevisiae Genes Encoding Homologs of the Bacterial HexA and MutS Mismatch Repair Proteins

    PubMed Central

    Reenan, R. A.; Kolodner, R. D.

    1992-01-01

    Homologs of the Escherichia coli (mutL, S and uvrD) and Streptococcus pneumoniae (hexA, B) genes involved in mismatch repair are known in several distantly related organisms. Degenerate oligonucleotide primers based on conserved regions of E. coli MutS protein and its homologs from Salmonella typhimurium, S. pneumoniae and human were used in the polymerase chain reaction (PCR) to amplify and clone mutS/hexA homologs from Saccharomyces cerevisiae. Two DNA sequences were amplified whose deduced amino acid sequences both shared a high degree of homology with MutS. These sequences were then used to clone the full-length genes from a yeast genomic library. Sequence analysis of the two MSH genes (MSH = mutS homolog), MSH1 and MSH2, revealed open reading frames of 2877 bp and 2898 bp. The deduced amino acid sequences predict polypeptides of 109.3 kD and 109.1 kD, respectively. The overall amino acid sequence identity with the E. coli MutS protein is 28.6% for MSH1 and 25.2% for MSH2. Features previously found to be shared by MutS homologs, such as the nucleotide binding site and the helix-turn-helix DNA binding motif as well as other highly conserved regions whose functions remain unknown, were also found in the two yeast homologs. Evidence presented in this and a companion study suggests that MSH1 is involved in repair of mitochondrial DNA and that MSH2 is involved in nuclear DNA repair. PMID:1459447

  11. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1997-01-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided.

  12. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1997-04-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided. 7 figs.

  13. Uses of Phage Display in Agriculture: Sequence Analysis and Comparative Modeling of Late Embryogenesis Abundant Client Proteins Suggest Protein-Nucleic Acid Binding Functionality

    PubMed Central

    Kushwaha, Rekha; Downie, A. Bruce; Payne, Christina M.

    2013-01-01

    A group of intrinsically disordered, hydrophilic proteins—Late Embryogenesis Abundant (LEA) proteins—has been linked to survival in plants and animals in periods of stress, putatively through safeguarding enzymatic function and prevention of aggregation in times of dehydration/heat. Yet despite decades of effort, the molecular-level mechanisms defining this protective function remain unknown. A recent effort to understand LEA functionality began with the unique application of phage display, wherein phage display and biopanning over recombinant Seed Maturation Protein homologs from Arabidopsis thaliana and Glycine max were used to retrieve client proteins at two different temperatures, with one intended to represent heat stress. From this previous study, we identified 21 client proteins for which clones were recovered, sometimes repeatedly. Here, we use sequence analysis and homology modeling of the client proteins to ascertain common sequence and structural properties that may contribute to binding affinity with the protective LEA protein. Our methods uncover what appears to be a predilection for protein-nucleic acid interactions among LEA client proteins, which is suggestive of subcellular residence. The results from this initial computational study will guide future efforts to uncover the protein protective mechanisms during heat stress, potentially leading to phage-display-directed evolution of synthetic LEA molecules. PMID:23956788

  14. Conservation of Shannon's redundancy for proteins. [information theory applied to amino acid sequences

    NASA Technical Reports Server (NTRS)

    Gatlin, L. L.

    1974-01-01

    Concepts of information theory are applied to examine various proteins in terms of their redundancy in natural originators such as animals and plants. The Monte Carlo method is used to derive information parameters for random protein sequences. Real protein sequence parameters are compared with the standard parameters of protein sequences having a specific length. The tendency of a chain to contain some amino acids more frequently than others and the tendency of a chain to contain certain amino acid pairs more frequently than other pairs are used as randomness measures of individual protein sequences. Non-periodic proteins are generally found to have random Shannon redundancies except in cases of constraints due to short chain length and genetic codes. Redundant characteristics of highly periodic proteins are discussed. A degree of periodicity parameter is derived.
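
    The composition-based randomness measure described above can be illustrated with a first-order Shannon calculation. The following is a minimal sketch, assuming redundancy is taken as R = 1 - H/H_max with H_max = log2(20) for the 20 standard amino acids; the function names and this exact definition are illustrative assumptions, and the parameters derived in the report (which also cover amino acid pairs) may differ.

```python
import math
from collections import Counter


def shannon_entropy(seq: str) -> float:
    """First-order Shannon entropy (bits per residue) of a sequence."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def redundancy(seq: str, alphabet_size: int = 20) -> float:
    """Assumed definition: R = 1 - H / H_max, with H_max = log2(alphabet_size)."""
    h_max = math.log2(alphabet_size)
    return 1.0 - shannon_entropy(seq) / h_max


if __name__ == "__main__":
    # Hypothetical sequences: one uses every residue once, one is a homopolymer.
    print(redundancy("ACDEFGHIKLMNPQRSTVWY"))  # close to 0 (maximally even composition)
    print(redundancy("AAAAAAAAAA"))            # 1.0 (fully redundant composition)
```

    A sequence that uses all residues equally often has redundancy near zero, whereas a biased composition pushes the value toward one.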

  15. Conversion of amino-acid sequence in proteins to classical music: search for auditory patterns

    PubMed Central

    2007-01-01

    We have converted genome-encoded protein sequences into musical notes to reveal auditory patterns without compromising musicality. We derived a reduced range of 13 base notes by pairing similar amino acids and distinguishing them using variations of three-note chords and codon distribution to dictate rhythm. The conversion will help make genomic coding sequences more approachable for the general public, young children, and vision-impaired scientists. PMID:17477882

  16. Analysis on the sequence of the whole genome of an isolated enterovirus 71 strain

    PubMed Central

    Gou, Enjin; Li, Qing; Li, Xiangxue; Gu, Shengli; Han, Yun; Tang, Zhengzhen; Li, Ying; Huang, Bo

    2015-01-01

    An enterovirus 71 (EV71) strain Query was isolated from a patient specimen in 2015. To understand its genetic evolution, this study amplified gene fragments of the isolated strain by RT-PCR and sequenced the whole genome. The homology and genetic evolution of the gene sequence of the virus strain were analyzed. The results showed that the isolated EV71 strain had higher homology of nucleotide sequence and amino acid sequence with other virus strains, which was 80%-97% and 88%-92%, respectively, but it had lower homology with Cox.A16 (homology of nucleotide sequence and amino acid sequence with Cox.A16 was 81% and 79%, respectively). Comparison of homologous sequences in the VP1 coding region demonstrated that the isolated EV71 strain had higher homology of amino acid sequence in the VP1 region with other virus strains. Genetic evolution of the nucleotide sequence in the VP1 region of the identified strain and other EV71 strains was analyzed, and the results demonstrated that the gene sequences in the VP1 region and 5'-UTR region of the isolated strain and the SDLY017 strain were on the same branch, both of which belonged to C4a, a subtype of type C4. PMID:26884986

  17. Protein location prediction using atomic composition and global features of the amino acid sequence

    SciTech Connect

    Cherian, Betsy Sheena; Nair, Achuthsankar S.

    2010-01-22

    The subcellular location of a protein is valuable information for determining its function, screening for drug candidates, vaccine design, annotation of gene products and selecting relevant proteins for further studies. Computational prediction of subcellular localization deals with predicting the location of a protein from its amino acid sequence. For a computational localization prediction method to be more accurate, it should exploit all possible relevant biological features that contribute to the subcellular localization. In this work, we extracted the biological features from the full-length protein sequence to incorporate more biological information. A new biological feature, the distribution of atomic composition, is used effectively with multiple physicochemical properties, amino acid composition, three-part amino acid composition, and sequence similarity for predicting the subcellular location of the protein. Support Vector Machines are designed for four modules and prediction is made by a weighted voting system. Our system makes predictions with accuracies of 100, 82.47, and 88.81 for the self-consistency test, jackknife test and independent data test, respectively. Our results provide evidence that prediction based on biological features derived from the full-length amino acid sequence gives better accuracy than prediction based on features derived from the N-terminal alone. Considering the features as a distribution within the entire sequence brings out the underlying property distribution in greater detail and enhances the prediction accuracy.
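
    The weighted voting step mentioned above can be sketched as follows. This is a minimal illustration only: the module names, weights, and location labels are hypothetical, and the abstract does not say how the weights were actually chosen or how the four SVM modules were combined in detail.

```python
from collections import defaultdict


def weighted_vote(predictions: dict, weights: dict) -> str:
    """Combine per-module class predictions by summing module weights per class.

    predictions: module name -> predicted location label
    weights:     module name -> non-negative weight (hypothetical values)
    Ties are broken alphabetically so the result is deterministic.
    """
    totals = defaultdict(float)
    for module, label in predictions.items():
        totals[label] += weights[module]
    return max(sorted(totals), key=totals.get)


if __name__ == "__main__":
    # Hypothetical outputs of four feature-specific classifiers for one protein.
    predictions = {
        "atomic_composition": "nucleus",
        "physicochemical": "cytoplasm",
        "aa_composition": "nucleus",
        "sequence_similarity": "cytoplasm",
    }
    weights = {
        "atomic_composition": 0.2,
        "physicochemical": 0.3,
        "aa_composition": 0.1,
        "sequence_similarity": 0.5,
    }
    print(weighted_vote(predictions, weights))  # cytoplasm
```

    Summing weights per label lets a single reliable module outvote several weaker ones, which is the usual motivation for weighting votes instead of counting them.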

  18. LISTA, LISTA-HOP and LISTA-HON: a comprehensive compilation of protein encoding sequences and its associated homology databases from the yeast Saccharomyces.

    PubMed Central

    Dölz, R; Mossé, M O; Slonimski, P P; Bairoch, A; Linder, P

    1996-01-01

    We continued our effort to make a comprehensive database (LISTA) for the yeast Saccharomyces cerevisiae. As in previous editions, the genetic names are consistently associated with each sequence with a known and confirmed ORF. If necessary, synonyms are given in the case of allelic duplicated sequences. Although the first publication of a sequence gives, according to our rules, the genetic name of a gene, in some instances more commonly used names are given to avoid nomenclature problems and the use of ancient designations which are no longer used. In these cases the old designation is given as a synonym. Thus sequences can be found either by the name or by synonyms given in LISTA. Each entry contains the genetic name, the mnemonic from the EMBL data bank, the codon bias, reference of the publication of the sequence, chromosomal location as far as known, SWISSPROT and EMBL accession numbers. New entries will also contain the name from the systematic sequencing efforts. Since the release of LISTA4.1 we have updated the database continuously. To obtain more information on the included sequences, each entry has been screened against non-redundant nucleotide and protein data bank collections resulting in LISTA-HON and LISTA-HOP. This release includes reports from full Smith and Waterman peptide-level searches against a non-redundant protein sequence database. The LISTA database can be linked to the associated data sets or to nucleotide and protein banks by the Sequence Retrieval System (SRS). The database is available by FTP and on the World Wide Web. PMID:8594599
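
    The Smith and Waterman searches mentioned above rest on local alignment by dynamic programming. The sketch below computes only the best local alignment score, and it assumes a simple match/mismatch scheme with a linear gap penalty; these scoring values are illustrative assumptions. Real peptide-level searches typically use an amino acid substitution matrix and affine gap penalties, and the settings used for LISTA are not given in the abstract.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (Smith-Waterman recurrence).

    Scoring is a toy scheme (assumed values); cells are floored at 0 so an
    alignment can start anywhere, and the maximum cell is the local optimum.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # Hypothetical peptides sharing the local segment "ACDE".
    print(smith_waterman_score("XXACDEXX", "YYACDEYY"))  # 8
```

    Only two rows of the score matrix are kept, which is enough for the score; recovering the aligned residues would require storing the full matrix for traceback.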

  19. Ab initio detection of fuzzy amino acid tandem repeats in protein sequences

    PubMed Central

    2012-01-01

    Background: Tandem repetitions within protein amino acid sequences often correspond to regular secondary structures and form multi-repeat 3D assemblies of varied size and function. Developing internal repetitions is one of the evolutionary mechanisms that proteins employ to adapt their structure and function under evolutionary pressure. While there is keen interest in understanding such phenomena, detection of repeating structures based only on sequence analysis is considered an arduous task, since structure and function are often preserved even under considerable sequence divergence (fuzzy tandem repeats). Results: In this paper we present PTRStalker, a new algorithm for ab initio detection of fuzzy tandem repeats in protein amino acid sequences. In the reported results we show that by feeding PTRStalker with amino acid sequences from the UniProtKB/Swiss-Prot database we detect novel tandemly repeated structures not captured by other state-of-the-art tools. Experiments with membrane proteins indicate that PTRStalker can detect global symmetries in the primary structure which are then reflected in the tertiary structure. Conclusions: PTRStalker is able to detect fuzzy tandem repeating structures in protein sequences, with performance beyond the current state of the art. Such a tool may be a valuable support for investigating protein structural properties when tertiary X-ray data are not available. PMID:22536906

  20. Multimodal phylogeny for taxonomy: integrating information from nucleotide and amino acid sequences.

    PubMed

    Bicego, Manuele; Dellaglio, Franco; Felis, Giovanna E

    2007-10-01

    The crucial role played by the analysis of microbial diversity in biotechnology-based innovations has increased the interest in the microbial taxonomy research area. Phylogenetic sequence analyses have contributed significantly to the advances in this field, also in view of the large amount of sequence data collected in recent years. Phylogenetic analyses can be based on protein-encoding nucleotide sequences or on the encoded amino acid sequences: these two approaches have different characteristics, although they start from two alternative representations of the same information. This complementarity could be exploited to achieve a multimodal phylogenetic scheme that is able to integrate gene and protein information in order to produce a single final tree. This aspect has been poorly addressed in the literature. In this paper, we propose to integrate the two phylogenetic analyses using basic schemes derived from multimodality fusion theory (or multiclassifier systems theory), a well-founded and rigorous field whose power has already been demonstrated in other pattern recognition contexts. The proposed approach can be applied to distance matrix-based phylogenetic techniques (like neighbor joining), resulting in a smart and fast method. The proposed methodology has been tested in a real case involving sequences of some species of lactic acid bacteria. With this dataset, both nucleotide sequence- and amino acid sequence-based phylogenetic analyses present some drawbacks, which are overcome with the multimodal analysis. PMID:17933011
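
    One simple way to picture fusing two distance matrices before tree building is a normalized weighted average (a sum-rule style combination). The sketch below is an assumption made for illustration: the abstract says only that basic fusion schemes were applied to distance matrix-based techniques, and it does not state which rule or normalization the authors used. The matrices are hypothetical.

```python
def normalize(matrix):
    """Scale a distance matrix so that its largest entry is 1 (no-op if all zero)."""
    peak = max(max(row) for row in matrix)
    if peak == 0:
        return [row[:] for row in matrix]
    return [[value / peak for value in row] for row in matrix]


def fuse_distances(d_nt, d_aa, weight=0.5):
    """Weighted average of normalized nucleotide and amino acid distance matrices.

    weight is the share given to the nucleotide matrix (assumed parameter).
    """
    n_nt, n_aa = normalize(d_nt), normalize(d_aa)
    return [
        [weight * x + (1 - weight) * y for x, y in zip(row_nt, row_aa)]
        for row_nt, row_aa in zip(n_nt, n_aa)
    ]


if __name__ == "__main__":
    # Hypothetical pairwise distances among three taxa.
    d_nt = [[0, 1, 2], [1, 0, 2], [2, 2, 0]]
    d_aa = [[0, 4, 4], [4, 0, 2], [4, 2, 0]]
    for row in fuse_distances(d_nt, d_aa):
        print(row)
```

    The fused matrix could then be passed to any distance-based tree method such as neighbor joining; normalizing first keeps one data type from dominating merely because its distances are on a larger scale.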

  1. Chromosomally-retained RNA mediates homologous pairing.

    PubMed

    Ding, Da-Qiao; Haraguchi, Tokuko; Hiraoka, Yasushi

    2012-01-01

    Pairing and recombination of homologous chromosomes are essential for ensuring correct segregation of chromosomes in meiosis. In S. pombe, chromosomes are first bundled at the telomeres (forming a telomere bouquet) and then aligned by oscillatory movement of the elongated "horsetail" nucleus. Telomere clustering and subsequent chromosome alignment promote pairing of homologous chromosomes. However, this telomere-bundled alignment of chromosomes cannot be responsible for the specificity of chromosome pairing. Thus, there must be some mechanism to facilitate recognition of homologous partners after telomere clustering. Recent studies in S. pombe have shown that RNA transcripts retained on the chromosome, or RNA bodies, may play a role in recognition of homologous chromosomes for pairing. Acting as fiducial markers of homologous loci they would abrogate the need for direct DNA sequence homology searching. PMID:23117617

  2. The amino-acid sequence of leghemoglobin component a from Phaseolus vulgaris (kidney bean).

    PubMed

    Lehtovaara, P; Ellfolk, N

    1975-06-01

    1. Leghemoglobin component a from Phaseolus vulgaris (kidney bean) was digested with trypsin; 15 tryptic peptides and free lysine were purified and the amino acid sequences of the peptides determined. 2. The internal order of the tryptic peptides was determined by the bridge peptides obtained from the thermolytic digest and the dilute acid hydrolyzate of kidney bean leghemoglobin a; 12 thermolytic peptides and two acid hydrolysis peptides were purified and the sequences were partially or completely determined. 3. The complete amino acid sequence of kidney bean leghemoglobin a is compared to that of leghemoglobin a from soybean (Glycine max) and to some animal globins. As regards sequence, the kidney bean globin has 79% identity with the soybean globin and 21% identity with human hemoglobin gamma-chain. Seven of the 14 amino acid residues common to most globins are found in the kidney bean globin. Trp-15 and Tyr-145 are evolutionarily conserved in this globin, which confirms the concept of a common origin of animal and plant globins. PMID:809270
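
    Identity figures such as the 79% and 21% quoted above are computed from aligned sequences. The following is a minimal sketch, assuming the two sequences are already aligned to equal length with '-' marking gaps and that gapped columns are excluded from the denominator; other conventions for the denominator exist, and the abstract does not state which one was used. The example strings are hypothetical, not the globin sequences.

```python
def percent_identity(aln_a: str, aln_b: str) -> float:
    """Percent identity over ungapped columns of two pre-aligned sequences."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = 0
    for x, y in zip(aln_a, aln_b):
        if x == "-" or y == "-":
            continue  # skip columns that contain a gap in either sequence
        compared += 1
        if x == y:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


if __name__ == "__main__":
    # Hypothetical aligned fragments: 3 of 4 ungapped columns match.
    print(percent_identity("AC-DE", "ACQDF"))  # 75.0
```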

  3. Amino acid sequence of the ligand-binding domain of the aryl hydrocarbon receptor 1 predicts sensitivity of wild birds to effects of dioxin-like compounds.

    PubMed

    Farmahin, Reza; Manning, Gillian E; Crump, Doug; Wu, Dongmei; Mundy, Lukas J; Jones, Stephanie P; Hahn, Mark E; Karchner, Sibel I; Giesy, John P; Bursian, Steven J; Zwiernik, Matthew J; Fredricks, Timothy B; Kennedy, Sean W

    2013-01-01

    The sensitivity of avian species to the toxic effects of dioxin-like compounds (DLCs) varies up to 1000-fold among species, and this variability has been associated with interspecies differences in aryl hydrocarbon receptor 1 ligand-binding domain (AHR1 LBD) sequence. We previously showed that LD50 values, based on in ovo exposures to DLCs, were significantly correlated with in vitro EC50 values obtained with a luciferase reporter gene (LRG) assay that measures AHR1-mediated induction of cytochrome P4501A in COS-7 cells transfected with avian AHR1 constructs. Those findings suggest that the AHR1 LBD sequence and the LRG assay can be used to predict avian species sensitivity to DLCs. In the present study, the AHR1 LBD sequences of 86 avian species were studied, and differences at amino acid sites 256, 257, 297, 324, 337, and 380 were identified. Site-directed mutagenesis, the LRG assay, and homology modeling highlighted the importance of each amino acid site in AHR1 sensitivity to 2,3,7,8-tetrachlorodibenzo-p-dioxin and other DLCs. The results of the study revealed that (1) only amino acids at sites 324 and 380 affect the sensitivity of AHR1 expression constructs of the 86 avian species to DLCs and (2) in vitro luciferase activity of AHR1 constructs containing only the LBD of the species of interest is significantly correlated (r² = 0.93, p < 0.0001) with in ovo toxicity data for those species. These results indicate promise for the use of AHR1 LBD amino acid sequences independently, or combined with the LRG assay, to predict avian species sensitivity to DLCs. PMID:22923492

  4. Draft genome sequence of the docosahexaenoic acid producing thraustochytrid Aurantiochytrium sp. T66.

    PubMed

    Liu, Bin; Ertesvåg, Helga; Aasen, Inga Marie; Vadstein, Olav; Brautaset, Trygve; Heggeset, Tonje Marita Bjerkan

    2016-06-01

    Thraustochytrids are unicellular, marine protists, and there is a growing industrial interest in these organisms, particularly because some species, including strains belonging to the genus Aurantiochytrium, accumulate high levels of docosahexaenoic acid (DHA). Here, we report the draft genome sequence of Aurantiochytrium sp. T66 (ATCC PRA-276), with a size of 43 Mbp, and 11,683 predicted protein-coding sequences. The data has been deposited at DDBJ/EMBL/Genbank under the accession LNGJ00000000. The genome sequence will contribute new insight into DHA biosynthesis and regulation, providing a basis for metabolic engineering of thraustochytrids. PMID:27222814

  5. A classification of glycosyl hydrolases based on amino acid sequence similarities.

    PubMed Central

    Henrissat, B

    1991-01-01

    The amino acid sequences of 301 glycosyl hydrolases and related enzymes have been compared. A total of 291 sequences corresponding to 39 EC entries could be classified into 35 families. Only ten sequences (less than 5% of the sample) could not be assigned to any family. With the sequences available for this analysis, 18 families were found to be monospecific (containing only one EC number) and 17 were found to be polyspecific (containing at least two EC numbers). Implications on the folding characteristics and mechanism of action of these enzymes and on the evolution of carbohydrate metabolism are discussed. With the steady increase in sequence and structural data, it is suggested that the enzyme classification system should perhaps be revised. PMID:1747104

  6. New families in the classification of glycosyl hydrolases based on amino acid sequence similarities.

    PubMed Central

    Henrissat, B; Bairoch, A

    1993-01-01

    301 glycosyl hydrolases and related enzymes corresponding to 39 EC entries of the I.U.B. classification system have been classified into 35 families on the basis of amino-acid-sequence similarities [Henrissat (1991) Biochem. J. 280, 309-316]. Approximately half of the families were found to be monospecific (containing only one EC number), whereas the other half were found to be polyspecific (containing at least two EC numbers). A > 60% increase in sequence data for glycosyl hydrolases (181 additional enzymes or enzyme domains sequences have since become available) allowed us to update the classification not only by the addition of more members to already identified families, but also by the finding of ten new families. On the basis of a comparison of 482 sequences corresponding to 52 EC entries, 45 families, out of which 22 are polyspecific, can now be defined. This classification has been implemented in the SWISS-PROT protein sequence data bank. PMID:8352747

  7. Sequence-specific purification of nucleic acids by PNA-controlled hybrid selection.

    PubMed

    Orum, H; Nielsen, P E; Jørgensen, M; Larsson, C; Stanley, C; Koch, T

    1995-09-01

    Using an oligohistidine peptide nucleic acid (oligohistidine-PNA) chimera, we have developed a rapid hybrid selection method that allows efficient, sequence-specific purification of a target nucleic acid. The method exploits two fundamental features of PNA. First, PNA binds with high affinity and specificity to its complementary nucleic acid. Second, amino acids are easily attached to the PNA oligomer during synthesis. We show that a (His)6-PNA chimera exhibits strong binding to chelated Ni2+ ions without compromising its native PNA hybridization properties. We further show that these characteristics allow the (His)6-PNA/DNA complex to be purified by the well-established method of metal ion affinity chromatography using a Ni2+-NTA (nitrilotriacetic acid) resin. Specificity and efficiency are the touchstones of any nucleic acid purification scheme. We show that the specificity of the (His)6-PNA selection approach is such that oligonucleotides differing by only a single nucleotide can be selectively purified. We also show that large RNAs (2224 nucleotides) can be captured with high efficiency by using multiple (His)6-PNA probes. PNA can hybridize to nucleic acids in low-salt concentrations that destabilize native nucleic acid structures. We demonstrate that this property of PNA can be utilized to purify an oligonucleotide in which the target sequence forms part of an intramolecular stem/loop structure. PMID:7495562

  8. In silico comparative analysis of DNA and amino acid sequences for prion protein gene.

    PubMed

    Kim, Y; Lee, J; Lee, C

    2008-01-01

    Genetic variability might contribute to species specificity of prion diseases in various organisms. In this study, structures of the prion protein gene (PRNP) and its amino acids were compared among species for which sequence data were available. Comparisons of PRNP DNA sequences among 12 species including human, chimpanzee, monkey, bovine, ovine, dog, mouse, rat, wallaby, opossum, chicken and zebrafish allowed us to identify candidate regulatory regions in intron 1 and the 3'-untranslated region (UTR) in addition to the coding region. Highly conserved putative binding sites for transcription factors, such as heat shock factor 2 (HSF2) and myocyte enhancer factor 2 (MEF2), were discovered in intron 1. In the 3'-UTR, the functional sequence (ATTAAA) for nucleus-specific polyadenylation was found in all the analysed species. The functional sequence (TTTTTAT) for maturation-specific polyadenylation was observed identically only in ovine, with one or two nucleotide mismatches in the other species. A comparison of the amino acid sequences in 53 species revealed a large sequence identity. In particular, the octapeptide repeat region was observed in all the species except frog and zebrafish. Functional changes and susceptibility to prion diseases with various isoforms of prion protein could be caused by numeric variability and conformational changes discovered in the repeat sequences. PMID:18397498
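
    Scanning a 3'-UTR for a signal such as ATTAAA or TTTTTAT while tolerating one or two mismatches can be sketched with a sliding window. This is an illustrative approach only; the abstract does not describe the software actually used, and the example sequences below are hypothetical, not PRNP data.

```python
def find_motif(seq: str, motif: str, max_mismatches: int = 0):
    """Return (start, window, mismatches) for each window within the mismatch limit.

    Positions are 0-based; comparison is a plain Hamming distance, so
    insertions and deletions are not considered.
    """
    hits = []
    width = len(motif)
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        mismatches = sum(1 for x, y in zip(window, motif) if x != y)
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


if __name__ == "__main__":
    # Hypothetical UTR fragments.
    print(find_motif("GGATTAAACC", "ATTAAA"))                      # [(2, 'ATTAAA', 0)]
    print(find_motif("CCTTTTCATGG", "TTTTTAT", max_mismatches=1))  # [(2, 'TTTTCAT', 1)]
```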

  9. Antibody-specific model of amino acid substitution for immunological inferences from alignments of antibody sequences.

    PubMed

    Mirsky, Alexander; Kazandjian, Linda; Anisimova, Maria

    2015-03-01

    Antibodies are glycoproteins produced by the immune system as a dynamically adaptive line of defense against invading pathogens. Very elegant and specific mutational mechanisms allow B lymphocytes to produce a large and diversified repertoire of antibodies, which is modified and enhanced throughout adulthood. One of these mechanisms is somatic hypermutation, which stochastically mutates nucleotides in the antibody genes, forming new sequences with different properties and, eventually, higher affinity and selectivity to the pathogenic target. As somatic hypermutation involves fast mutation of antibody sequences, this process can be described using a Markov substitution model of molecular evolution. Here, using large sets of antibody sequences from mice and humans, we infer an empirical amino acid substitution model AB, which is specific to antibody sequences. Compared with existing general amino acid models, we show that the AB model provides a significantly better description of the somatic evolution of mouse and human antibody sequences, as demonstrated on large next generation sequencing (NGS) antibody data. General amino acid models are reflective of conservation at the protein level due to functional constraints, with the most frequent amino acid exchanges taking place between residues with the same or similar physicochemical properties. In contrast, within the variable part of antibody sequences we observed an elevated frequency of exchanges between amino acids with distinct physicochemical properties. This is indicative of a sui generis mutational mechanism, specific to antibody somatic hypermutation. We illustrate this property of antibody sequences by a comparative analysis of the network modularity implied by the AB model and general amino acid substitution models. We recommend using the new model for computational studies of antibody sequence maturation, including inference of alignments and phylogenetic trees describing antibody somatic hypermutation in

  10. cDNA-derived amino acid sequence of rat mitochondrial 3-oxoacyl-CoA thiolase with no transient presequence: structural relationship with peroxisomal isozyme.

    PubMed Central

    Arakawa, H; Takiguchi, M; Amaya, Y; Nagata, S; Hayashi, H; Mori, M

    1987-01-01

    The sorting of homologous proteins between two separate intracellular organelles is a major unsolved problem. 3-Oxoacyl-CoA thiolase is localized in mitochondria and peroxisomes, and provides a good system for studying this problem. Unlike most mitochondrial matrix proteins, mitochondrial 3-oxoacyl-CoA thiolase in rats is synthesized with no transient presequence and possesses information for mitochondrial targeting and import in the mature protein. Two overlapping cDNA clones contained an open reading frame encoding a polypeptide of 397 amino acid residues (predicted Mr = 41,868), a 5' untranslated sequence of 164 bp, a 3' untranslated sequence of 264 bp and a poly(A) tract. The amino acid sequence of the mitochondrial thiolase is 37% identical with that of the mature portion of rat peroxisomal 3-oxoacyl-CoA thiolase precursor. These results suggest that the two thiolases have a common origin and obtained information for targeting to their respective organelles during evolution. Two portions in the mitochondrial thiolase that may serve as a mitochondrial targeting signal are presented. PMID:3038520

  11. A Possible Mechanism of Zika Virus Associated Microcephaly: Imperative Role of Retinoic Acid Response Element (RARE) Consensus Sequence Repeats in the Viral Genome

    PubMed Central

    Kumar, Ashutosh; Singh, Himanshu N.; Pareek, Vikas; Raza, Khursheed; Dantham, Subrahamanyam; Kumar, Pavan; Mochan, Sankat; Faiq, Muneeb A.

    2016-01-01

    Owing to the reports of microcephaly as a consistent outcome in the fetuses of pregnant women infected with ZIKV in Brazil, Zika virus (ZIKV)—microcephaly etiomechanistic relationship has recently been implicated. Researchers, however, are still struggling to establish an embryological basis for this interesting causal handcuff. The present study reveals robust evidence in favor of a plausible ZIKV-microcephaly cause-effect liaison. The rationale is based on: (1) sequence homology between ZIKV genome and the response element of an early neural tube developmental marker “retinoic acid” in human DNA and (2) comprehensive similarities between the details of brain defects in ZIKV-microcephaly and retinoic acid embryopathy. Retinoic acid is considered as the earliest factor for regulating anteroposterior axis of neural tube and positioning of structures in developing brain through retinoic acid response elements (RARE) consensus sequence (5′–AGGTCA–3′) in promoter regions of retinoic acid-dependent genes. We screened genomic sequences of already reported virulent ZIKV strains (including those linked to microcephaly) and other viruses available in National Institute of Health genetic sequence database (GenBank) for the RARE consensus repeats and obtained results strongly bolstering our hypothesis that ZIKV strains associated with microcephaly may act through precipitation of dysregulation in retinoic acid-dependent genes by introducing extra stretches of RARE consensus sequence repeats in the genome of developing brain cells. Additional support to our hypothesis comes from our findings that screening of other viruses for RARE consensus sequence repeats is positive only for those known to display neurotropism and cause fetal brain defects (for which maternal-fetal transmission during developing stage may be required). The numbers of RARE sequence repeats appeared to match with the virulence of screened positive viruses. Although, bioinformatic evidence and

  12. Localization in the Tomato Genome of DNA Restriction Fragments Containing Sequences Homologous to the rRNA (45s), the Major Chlorophyll a/b Binding Polypeptide and the Ribulose Bisphosphate Carboxylase Genes

    PubMed Central

    Vallejos, C. E.; Tanksley, S. D.; Bernatzky, R.

    1986-01-01

    DNA restriction fragments containing sequences homologous to the ribosomal RNA (45s), the major chlorophyll a/b binding polypeptide (CAB) and the small subunit of ribulose bisphosphate carboxylase (RBCS) genes have been localized and mapped in the tomato nuclear genome by linkage analysis. Ribosomal RNA genes map to a single locus, R45s, which resides in a terminal position on the short arm of chromosome 2 and corresponds to the Nucleolar Organizer Region. The size of the 45s repeating unit is estimated to be approximately 9 kb in Lycopersicon esculentum and 11 kb in Lycopersicon pennellii. Five loci were found to contain CAB sequences. Two of the loci, Cab-1 (chromosome 2) and Cab-3 (chromosome 8), together accounted for more than 80% of the hybridization signal. These loci contain more than one CAB structural gene. The other three loci, Cab-2 (chromosome 8), Cab-4 (chromosome 7) and Cab-5 (chromosome 12), each account for <10% of the total signal and may contain only a single copy of the CAB structural sequence. Three loci were found to contain RBCS sequences. Rbcs-2 (chromosome 3) and Rbcs-3 (chromosome 2) were responsible for >80% of the signal, with the remainder being associated with Rbcs-1 (chromosome 2). Rbcs-2 and Rbcs-3 may contain more than one copy of the gene. PMID:17246311

  13. Homology-Independent Metrics for Comparative Genomics

    PubMed Central

    Coutinho, Tarcisio José Domingos; Franco, Glória Regina; Lobo, Francisco Pereira

    2015-01-01

    A mainstream procedure to analyze the wealth of genomic data available nowadays is the detection of homologous regions shared across genomes, followed by the extraction of biological information from the patterns of conservation and variation observed in such regions. Although of pivotal importance, comparative genomic procedures that rely on homology inference are obviously not applicable if no homologous regions are detectable. This fact excludes a considerable portion of “genomic dark matter” with no significant similarity (and, consequently, no inferred homology to any other known sequence) from several downstream comparative genomic methods. In this review we compile several sequence metrics that do not rely on homology inference and can be used to compare nucleotide sequences and extract biologically meaningful information from them. These metrics comprise several compositional parameters calculated from sequence data alone, such as GC content, dinucleotide odds ratio, and several codon bias metrics. They also share other interesting properties, such as pervasiveness (patterns persist on smaller scales) and phylogenetic signal. We also cite examples where these homology-independent metrics have been successfully applied to support several bioinformatics challenges, such as taxonomic classification of biological sequences without homology inference. They were also used to detect higher-order patterns of interactions in biological systems, ranging from detecting coevolutionary trends between the genomes of viruses and their hosts to characterization of gene pools of entire microbial communities. We argue that, if correctly understood and applied, homology-independent metrics can add important layers of biological information in comparative genomic studies without prior homology inference. PMID:26029354
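
    Two of the compositional metrics named above, GC content and the dinucleotide odds ratio, can be computed from a single sequence with no alignment. The sketch below is illustrative only and is not taken from the review: it uses the plain single-strand form rho_xy = f_xy / (f_x * f_y), whereas strand-symmetrized variants also exist; the function names and the toy sequence are assumptions.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("no A/C/G/T bases in sequence")
    return (seq.count("G") + seq.count("C")) / acgt


def dinucleotide_odds_ratio(seq: str, dinuc: str) -> float:
    """rho_xy = f_xy / (f_x * f_y), computed on the given strand only."""
    seq, dinuc = seq.upper(), dinuc.upper()
    if len(dinuc) != 2:
        raise ValueError("dinuc must have exactly two bases")
    n = len(seq)
    if n < 2:
        raise ValueError("sequence must contain at least two bases")
    x, y = dinuc
    mono = Counter(seq)
    # Overlapping dinucleotides: n - 1 windows for a sequence of length n.
    pairs = Counter(seq[i:i + 2] for i in range(n - 1))
    f_x, f_y = mono[x] / n, mono[y] / n
    if f_x == 0 or f_y == 0:
        raise ValueError("a base of the dinucleotide is absent from the sequence")
    f_xy = pairs[dinuc] / (n - 1)
    return f_xy / (f_x * f_y)


toy = "ACGTACGTACGT"  # toy sequence, invented for illustration
print(gc_content(toy))
print(dinucleotide_odds_ratio(toy, "CG"))
```

    A ratio near 1 means the dinucleotide occurs about as often as expected from the base frequencies alone; values well below or above 1 indicate under- or over-representation.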

  14. Definition of Mycobacterium tuberculosis culture filtrate proteins by two-dimensional polyacrylamide gel electrophoresis, N-terminal amino acid sequencing, and electrospray mass spectrometry.

    PubMed Central

    Sonnenberg, M G; Belisle, J T

    1997-01-01

    A number of the culture filtrate proteins secreted by Mycobacterium tuberculosis are known to contribute to the immunology of tuberculosis and to possess enzymatic activities associated with pathogenicity. However, a complete analysis of the protein composition of this fraction has been lacking. By using two-dimensional polyacrylamide gel electrophoresis, detailed maps of the culture filtrate proteins of M. tuberculosis H37Rv were generated. In total, 205 protein spots were observed. The coupling of this electrophoretic technique with Western blot analysis allowed the identification and mapping of 32 proteins. Further molecular characterization of abundant proteins within this fraction was achieved by N-terminal amino acid sequencing and liquid chromatography-mass spectrometry. Eighteen proteins were subjected to N-group analysis; of these, only 10 could be sequenced by Edman degradation. Among the most interesting were a novel 52-kDa protein demonstrating significant homology to an alpha-hydroxysteroid dehydrogenase of Eubacterium sp. strain VPI 12708, a 25-kDa protein corresponding to open reading frame 28 of the M. tuberculosis cosmid MTCY1A11, and a 31-kDa protein exhibiting an amino acid sequence identical to that of antigen 85A and 85B. This latter product migrated with an isoelectric point between those of antigen 85A and 85C but did not react with the antibody specific for this complex, suggesting that there is a fourth member of the antigen 85 complex. Novel N-terminal amino acid sequences were obtained for three additional culture filtrate proteins; however, these did not yield significant homology to known protein sequences. A protein cluster of 85 to 88 kDa, recognized by the monoclonal antibodies IT-57 and IT-42 and known to react with sera from a large proportion of tuberculosis patients, was refractory to N-group analysis. Nevertheless, mass spectrometry of peptides obtained from one member of this complex identified it as the M. tuberculosis Kat

  15. Amino acid sequence of a vitamin K-dependent Ca2+-binding peptide from bovine prothrombin.

    PubMed

    Howard, J B; Fausch, M D

    1975-08-10

    The amino acid sequence of a 31-residue peptide from bovine prothrombin has been determined. This peptide has been shown to contain the vitamin K-dependent modification required for Ca2+ binding (Nelsestuen, G. L., and Suttie, J. W. (1973) Proc. Natl. Acad. Sci. U. S. A. 70, 3366-3370) and the modified amino acid, gamma-carboxyglutamic acid (Nelsestuen, G. L., Zytkovicz, T., and Howard, J. B. (1974) J. Biol. Chem. 249, 6347-6350). The peptide was shown to correspond to residues 12 to 42 of prothrombin. PMID:807581

  16. Amino acid sequences around the cysteine residues of rabbit muscle triose phosphate isomerase

    PubMed Central

    Miller, Janet C.; Waley, S. G.

    1971-01-01

    1. The nature of the subunits in rabbit muscle triose phosphate isomerase has been investigated. 2. Amino acid analyses show that there are five cysteine residues and two methionine residues/subunit. 3. The amino acid sequences around the cysteine residues have been determined; these account for about 75 residues. 4. Cleavage at the methionine residues with cyanogen bromide gave three fragments. 5. These results show that the subunits correspond to polypeptide chains, containing about 230 amino acid residues. The chains in triose phosphate isomerase seem to be shorter than those of other glycolytic enzymes. PMID:5165707

  17. Draft Genome Sequence of the Butyric Acid Producer Clostridium tyrobutyricum Strain CIP I-776 (IFP923)

    PubMed Central

    Clément, Benjamin; Lopes Ferreira, Nicolas

    2016-01-01

    Here, we report the draft genome sequence of Clostridium tyrobutyricum CIP I-776 (IFP923), an efficient producer of butyric acid. The genome consists of a single chromosome of 3.19 Mb and provides useful data concerning the metabolic capacities of the strain. PMID:26941139

  18. Draft Genome Sequence of Perfluorooctane Acid-Degrading Bacterium Pseudomonas parafulva YAB-1

    PubMed Central

    Tang, Chongjian; Peng, Qingjing; Peng, Qingzhong

    2015-01-01

    Pseudomonas parafulva YAB-1, isolated from perfluorinated compound-contaminated soil, has the ability to degrade perfluorooctane acid (PFOA) compound. Here, we report the draft genome sequence and annotation of the PFOA-degrading bacterium P. parafulva YAB-1. The data provide the basis to investigate the molecular mechanism of PFOA metabolism. PMID:26337877

  19. The amino acid sequence of cytochrome c-555 from the methane-oxidizing bacterium Methylococcus capsulatus.

    PubMed Central

    Ambler, R P; Dalton, H; Meyer, T E; Bartsch, R G; Kamen, M D

    1986-01-01

    The amino acid sequence of the cytochrome c-555 from the obligate methanotroph Methylococcus capsulatus strain Bath (N.C.I.B. 11132) was determined. It is a single polypeptide chain of 96 residues, binding a haem group through the cysteine residues at positions 19 and 22, and the only methionine residue is at position 59. The sequence does not closely resemble that of any other cytochrome c that has yet been characterized. Detailed evidence for the amino acid sequence of the protein has been deposited as Supplementary Publication SUP 50131 (12 pages) at the British Library Lending Division, Boston Spa, West Yorkshire LS23 7BQ, U.K., from whom copies are available on prepayment. PMID:3006666

  20. Cloning, nucleotide sequences, and identification of products of the Pseudomonas aeruginosa PAO bra genes, which encode the high-affinity branched-chain amino acid transport system.

    PubMed Central

    Hoshino, T; Kose, K

    1990-01-01

    A DNA fragment of Pseudomonas aeruginosa PAO containing genes specifying the high-affinity branched-chain amino acid transport system (LIV-I) was isolated. The fragment contained the braC gene, encoding the binding protein for branched-chain amino acids, and the 4-kilobase DNA segment adjacent to 3' of braC. The nucleotide sequence of the 4-kilobase DNA fragment was determined and found to contain four open reading frames, designated braD, braE, braF, and braG. The braD and braE genes specify very hydrophobic proteins of 307 and 417 amino acid residues, respectively. The braD gene product showed extensive homology (67% identical) to the livH gene product, a component required for the Escherichia coli high-affinity branched-chain amino acid transport systems. The braF and braG genes encode proteins of 255 and 233 amino acids, respectively, both containing amino acid sequences typical of proteins with ATP-binding sites. By using a T7 RNA polymerase/promoter system together with plasmids having various deletions in the braDEFG region, the braD, braE, braF, and braG gene products were identified as proteins with apparent Mrs of 25,500, 34,000, 30,000, and 27,000, respectively. These proteins were found among cell membrane proteins on a sodium dodecyl sulfate-polyacrylamide gel stained with Coomassie blue. PMID:2120183

  1. Allelic polymorphism in arabian camel ribonuclease and the amino acid sequence of bactrian camel ribonuclease.

    PubMed

    Welling, G W; Mulder, H; Beintema, J J

    1976-04-01

    Pancreatic ribonucleases from several species (whitetail deer, roe deer, guinea pig, and arabian camel) exhibit more than one amino acid at particular positions in their amino acid sequences. Since these enzymes were isolated from pooled pancreas, the origin of this heterogeneity is not clear. The pancreatic ribonucleases from 11 individual arabian camels (Camelus dromedarius) have been investigated with respect to the lysine-glutamine heterogeneity at position 103 (Welling et al., 1975). Six ribonucleases showed only one basic band and five showed two bands after polyacrylamide gel electrophoresis, suggesting a gene frequency of about 0.75 for the Lys gene and about 0.25 for the Gln gene. The amino acid sequence of bactrian camel (Camelus bactrianus) ribonuclease isolated from individual pancreatic tissue was determined and compared with that of arabian camel ribonuclease. The only difference was observed at position 103. In the ribonucleases from two unrelated bactrian camels, only glutamine was observed at that position. PMID:962846
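
    The gene-frequency estimate above can be reproduced by simple allele counting. The sketch below assumes, which the abstract does not state explicitly, that single-band animals are Lys/Lys homozygotes and two-band animals are Lys/Gln heterozygotes; the function name is illustrative.

```python
def allele_frequencies(n_single_band: int, n_two_band: int) -> tuple[float, float]:
    """Return (Lys frequency, Gln frequency) by direct allele counting.

    Assumption: single-band animals carry two Lys alleles and two-band
    animals carry one Lys and one Gln allele.
    """
    n_animals = n_single_band + n_two_band
    if n_animals == 0:
        raise ValueError("at least one animal is required")
    total_alleles = 2 * n_animals              # diploid: two alleles per animal
    lys = 2 * n_single_band + n_two_band       # homozygotes give 2, heterozygotes 1
    gln = n_two_band                           # only heterozygotes carry Gln
    return lys / total_alleles, gln / total_alleles


# Counts from the abstract: 6 one-band and 5 two-band ribonucleases.
lys_freq, gln_freq = allele_frequencies(6, 5)
print(f"Lys: {lys_freq:.2f}, Gln: {gln_freq:.2f}")
```

    Under that assumption the counts give 17/22 and 5/22, in line with the approximate values of 0.75 and 0.25 quoted in the abstract.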

  2. Sequences homologous to the human X- and Y-borne zinc finger protein genes (ZFX/Y) are autosomal in monotreme mammals

    SciTech Connect

    Watson, J.M.; Frost, C.; Graves, M.J.A.; Spencer, J.A.

    1993-02-01

    The human zinc finger protein genes (ZFX/Y) were identified as a result of a systematic search for the testis-determining factor gene on the human Y chromosome. Although they play no direct role in sex determination, they are of particular interest because they are highly conserved among mammals, birds, and amphibians and because, in eutherian mammals at least, they have active alleles on both the X and the Y chromosomes outside the pseudoautosomal region. We used in situ hybridization to localize the homologues of the zinc finger protein gene to chromosome 1 of the Australian echidna and to an equivalent position on chromosomes 1 and 2 of the platypus. The localization to platypus chromosome 1 was confirmed by Southern analysis of a Chinese hamster × platypus cell hybrid retaining most of platypus chromosome 1. This localization is consistent with the cytological homology of chromosome 1 between the two species. The zinc finger protein gene homologues were localized to regions of platypus chromosomes 1 and 2 that included a number of other genes situated near ZFX on the short arm of the human X chromosome. These results support the hypothesis that many of the genes located on the short arm of the human X were originally autosomal and have been translocated to the X chromosome since the eutherian-metatherian divergence. 34 refs., 3 figs., 2 tabs.

  3. Sequence homology requirements for transcriptional silencing of 35S transgenes and post-transcriptional silencing of nitrite reductase (trans)genes by the tobacco 271 locus.

    PubMed

    Thierry, D; Vaucheret, H

    1996-12-01

    The transgene locus of the tobacco plant 271 (271 locus) is located on a telomere and consists of multiple copies of a plasmid carrying an NptII marker gene driven by the cauliflower mosaic virus (CaMV) 19S promoter and the leaf-specific nitrite reductase Nii1 cDNA cloned in the antisense orientation under the control of the CaMV 35S promoter. Previous analysis of gene expression in leaves has shown that this locus triggers both post-transcriptional silencing of the host leaf-specific Nii genes and transcriptional silencing of transgenes driven by the 19S or 35S promoter irrespective of their coding sequence and of their location in the genome. In this paper we show that silencing of transgenes carrying Nii1 sequences occurs irrespective of the promoter driving their expression and of their location within the genome. This phenomenon occurs in roots as well as in leaves although root Nii genes share only 84% identity with leaf-specific Nii1 sequences carried by the 271 locus. Conversely, transgenes carrying the bean Nii gene (which shares 76% identity with the tobacco Nii1 gene) escape silencing by the 271 locus. We also show that transgenes driven by the figwort mosaic virus 34S promoter (which shares 63% identity with the 35S promoter) also escape silencing by the 271 locus. Taken together, these results indicate that a high degree of sequence similarity is required between the sequences of the silencing locus and of the target (trans)genes for both transcriptional and post-transcriptional silencing. PMID:9002606

  4. Complementary DNA and derived amino acid sequence of the β subunit of human complement protein C8: identification of a close structural and ancestral relationship to the α subunit and C9

    SciTech Connect

    Howard, O.M.Z.; Rao, A.G.; Sodetz, J.M.

    1987-06-16

    A cDNA clone encoding the β subunit (Mr 64,000) of the eighth component of complement (C8) has been isolated from a human liver cDNA library. This clone has a cDNA insert of 1.95 kilobases (kb) and contains the entire β sequence (1608 base pairs (bp)). Analysis of total cellular RNA isolated from the hepatoma cell line HepG2 revealed the mRNA for β to be approx. 2.5 kb. This is similar to the message size for the α subunit of C8 and confirms the existence of different mRNAs for α and β. This finding supports genetic evidence that α and β are encoded at different loci. Analysis of the derived amino acid sequence revealed several membrane surface seeking segments that may facilitate β interaction with target membranes during complement-mediated cytolysis. Determination of the carbohydrate composition indicated 1 or 2 asparagine-linked but no O-linked oligosaccharide chains. Comparison of the β sequence to that reported earlier for α and to that of human C9 revealed a striking homology between all three proteins. For β and α, the overall homology is 33% on the basis of identity and 53% when conserved substitutions are allowed. For β and C9, the values are 26% and 47%, respectively. All three have a large internal domain that is nearly cysteine free and N- and C-termini that are cysteine-rich and homologous to the low-density lipoprotein receptor repeat and epidermal growth factor type sequences, respectively. The overall homology and similarities in size and structural organization are indicative of a close ancestral relationship. It is concluded that α, β and C9 are members of a family of structurally related proteins that are capable of interacting to produce a hydrophilic to amphiphilic transition and membrane association.

  5. Software scripts for quality checking of high-throughput nucleic acid sequencers.

    PubMed

    Lazo, G R; Tong, J; Miller, R; Hsia, C; Rausch, C; Kang, Y; Anderson, O D

    2001-06-01

    We have developed a graphical interface to allow the researcher to view and assess the quality of sequencing results using a series of program scripts developed to process data generated by automated sequencers. The scripts are written in the Perl programming language and are executable under the cgi-bin directory of a Web server environment. The scripts direct nucleic acid sequencing trace file data output from automated sequencers to be analyzed by the phred molecular biology program, and the results are displayed as graphical hypertext mark-up language (HTML) pages. The scripts are mainly designed to handle 96-well microtiter dish samples, but they are also able to read data from 384-well microtiter dishes 96 samples at a time. The scripts may be customized for different laboratory environments and computer configurations. Web links to the sources and discussion page are provided. PMID:11414222

  6. Elucidation of the sequence of canine (pro)-calcitonin. A molecular biological and protein chemical approach.

    PubMed

    Mol, J A; Kwant, M M; Arnold, I C; Hazewinkel, H A

    1991-09-01

    From the canine thyroid gland a calcitonin (CT) immunoreactive peptide was purified by successive aqueous acid acetone extraction, gel filtration and HPLC. Gas-phase sequencing of the purified peptide showed that the first 25 amino acids had 65% sequence homology with the amino-terminus of the human CT prohormone. A canine cDNA library was then made from the thyroid gland. A plasmid was isolated containing a sequence that is homologous to part of exon 3, and the complete sequence of exon 4 of the human mRNA encoding preproCT. From this cDNA the amino acid sequence of canine CT is predicted. In comparison with well-known CT sequences of other species, the strongest homology exists with bovine, porcine and ovine CT. PMID:1758974

  7. Efficient Nucleic Acid Extraction and 16S rRNA Gene Sequencing for Bacterial Community Characterization.

    PubMed

    Anahtar, Melis N; Bowman, Brittany A; Kwon, Douglas S

    2016-01-01

    There is a growing appreciation for the role of microbial communities as critical modulators of human health and disease. High throughput sequencing technologies have allowed for the rapid and efficient characterization of bacterial communities using 16S rRNA gene sequencing from a variety of sources. Although readily available tools for 16S rRNA sequence analysis have standardized computational workflows, sample processing for DNA extraction remains a continued source of variability across studies. Here we describe an efficient, robust, and cost-effective method for extracting nucleic acid from swabs. We also delineate downstream methods for 16S rRNA gene sequencing, including generation of sequencing libraries, data quality control, and sequence analysis. The workflow can accommodate multiple sample types, including stool and swabs collected from a variety of anatomical locations and host species. Additionally, recovered DNA and RNA can be separated and used for other applications, including whole genome sequencing or RNA-seq. The method described allows for a common processing approach for multiple sample types and accommodates downstream analysis of genomic, metagenomic and transcriptional information. PMID:27168460

  8. Efficient Nucleic Acid Extraction and 16S rRNA Gene Sequencing for Bacterial Community Characterization

    PubMed Central

    Anahtar, Melis N.; Bowman, Brittany A.; Kwon, Douglas S.

    2016-01-01

    There is a growing appreciation for the role of microbial communities as critical modulators of human health and disease. High throughput sequencing technologies have allowed for the rapid and efficient characterization of bacterial communities using 16S rRNA gene sequencing from a variety of sources. Although readily available tools for 16S rRNA sequence analysis have standardized computational workflows, sample processing for DNA extraction remains a continued source of variability across studies. Here we describe an efficient, robust, and cost-effective method for extracting nucleic acid from swabs. We also delineate downstream methods for 16S rRNA gene sequencing, including generation of sequencing libraries, data quality control, and sequence analysis. The workflow can accommodate multiple sample types, including stool and swabs collected from a variety of anatomical locations and host species. Additionally, recovered DNA and RNA can be separated and used for other applications, including whole genome sequencing or RNA-seq. The method described allows for a common processing approach for multiple sample types and accommodates downstream analysis of genomic, metagenomic and transcriptional information. PMID:27168460

  9. Purification, characterization, and complete amino acid sequence of a trypsin inhibitor from amaranth (Amaranthus hypochondriacus) seeds.

    PubMed Central

    Valdes-Rodriguez, S; Segura-Nieto, M; Chagolla-Lopez, A; Verver y Vargas-Cortina, A; Martinez-Gallardo, N; Blanco-Labra, A

    1993-01-01

    A protein proteinase inhibitor was purified from a seed extract of amaranth (Amaranthus hypochondriacus) by precipitation with (NH4)2SO4, gel-filtration chromatography, ion-exchange chromatography, and reverse-phase high-performance liquid chromatography. It is a 69-amino acid protein with a high content of valine, arginine, and glutamic acid, but lacking in methionine. The inhibitor has a relative molecular weight of 7400 and an isoelectric point of 7.5. It is a serine proteinase inhibitor that recognizes chymotrypsin, trypsin, and trypsin-like proteinase activities extracted from larvae of the insect Prostephanus truncatus. This inhibitor belongs to the potato-I inhibitor family, showing the closest homology (59.5%) with the Lycopersicum peruvianum trypsin inhibitor, and (51%) with the proteinase inhibitor 5 extracted from the seeds of Cucurbita maxima. The lysine-aspartic acid residues present in the active site of the amaranth inhibitor are found in almost the same relative position as in the inhibitor from C. maxima. PMID:8290633

  10. Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    ScienceCinema

    Patel, Kamlesh D [Ken]; SNL,

    2013-01-25

    Kamlesh (Ken) Patel from Sandia National Laboratories (Livermore, California) presents "Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June 2012 in Santa Fe, NM.

  11. The amino acid sequence of ribonuclease U2 from Ustilago sphaerogena.

    PubMed Central

    Sato, S; Uchida, T

    1975-01-01

    1. RNAase (ribonuclease) U2, a purine-specific RNAase, was reduced, aminoethylated and hydrolysed with trypsin, chymotrypsin and thermolysin. On the basis of the analyses of the resulting peptides, the complete amino acid sequence of RNAase U2 was determined. 2. When the sequence was compared with the amino acid sequence of RNAase T1 (EC 3.1.4.8), the following regions were found to be similar in the two enzymes: Tyr-Pro-His-Gln-Tyr (38-42) in RNAase U2 and Tyr-Pro-His-Lys-Tyr (38-42) in RNAase T1, Glu-Phe-Pro-Leu-Val (61-65) in RNAase U2 and Glu-Trp-Pro-Ile-Leu (58-62) in RNAase T1, Asp-Arg-Val-Ile-Tyr-Gln (83-88) in RNAase U2 and Asp-Arg-Val-Phe-Asn (76-81) in RNAase T1 and Val-Thr-His-Thr-Gly-Ala (98-103) in RNAase U2 and Ile-Thr-His-Thr-Gly-Ala (90-95) in RNAase T1. All of the amino acid residues, histidine-40, glutamate-58, arginine-77 and histidine-92, which were found to play a crucial role in the biological activity of RNAase T1, were included in the regions cited here. 3. Detailed evidence for the amino acid sequence of the protein has been deposited as Supplementary Publication SUP 50041 (33 pages) at the British Library (Lending Division) (formerly the National Lending Library for Science and Technology), Boston Spa, Yorks. LS23 7BQ, U.K., from whom copies can be obtained on the terms indicated in Biochem. J. (1975), 145, 5. PMID:1156364
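
    The similarity of the listed regions can be quantified by counting identical residues position by position. The sketch below is illustrative and not part of the original analysis: the function name is an assumption, the comparison is ungapped, and only the three segment pairs that have equal lengths as printed in the abstract are included (the 83-88 / 76-81 pair is printed with six and five residues and is left out).

```python
def segment_identity(seg_a: str, seg_b: str) -> tuple[int, int]:
    """Return (identical positions, segment length) for two hyphen-separated
    three-letter-code peptides of equal length, compared without gaps."""
    res_a, res_b = seg_a.split("-"), seg_b.split("-")
    if len(res_a) != len(res_b):
        raise ValueError("segments must have the same number of residues")
    matches = sum(x == y for x, y in zip(res_a, res_b))
    return matches, len(res_a)


# Segment pairs copied from the abstract (RNAase U2 first, RNAase T1 second).
pairs = [
    ("Tyr-Pro-His-Gln-Tyr", "Tyr-Pro-His-Lys-Tyr"),
    ("Glu-Phe-Pro-Leu-Val", "Glu-Trp-Pro-Ile-Leu"),
    ("Val-Thr-His-Thr-Gly-Ala", "Ile-Thr-His-Thr-Gly-Ala"),
]

for u2, t1 in pairs:
    same, length = segment_identity(u2, t1)
    print(f"{u2} vs {t1}: {same}/{length} identical")
```

    Counting strict identities understates similarity where the substitutions are conservative (for example Leu/Ile), so a scoring scheme that allows conserved substitutions would rate the second pair higher.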

  12. Deduced amino acid sequence of human pulmonary surfactant proteolipid: SPL(pVal)

    SciTech Connect

    Whitsett, J.A.; Glasser, S.W.; Korfhagen, T.R.; Weaver, T.E.; Clark, J.; Pilot-Matias, T.; Meuth, J.; Fox, J.L.

    1987-05-01

    A hydrophobic, proteolipid-like protein of Mr 6500 was isolated from ether/ethanol extracts of human, canine and bovine pulmonary surfactant. Amino acid composition of the protein demonstrated a remarkable abundance of hydrophobic residues, particularly valine and leucine. The N-terminal amino acid sequence of the human protein was determined: N-Leu-Ile-Pro-Cys-Cys-Pro-Val-Asn-Leu-Lys-Arg-Leu-Leu-Ile-Val4... An oligonucleotide probe was used to screen an adult human lung cDNA library and resulted in detection of cDNA clones with predicted amino acid sequence with close identity to the N-terminal amino acid sequence of the human peptide. SPL(pVal) was found within the reading frame of a larger peptide. SPL(pVal) results from proteolytic processing of a larger preprotein. Northern blot analysis detected a single 1.0 kilobase SPL(pVal) RNA which was less abundant in fetal than in adult lung. Mixtures of purified canine and bovine SPL(pVal) and synthetic phospholipids display properties of rapid adsorption and surface tension lowering activity characteristic of surfactant. Human SPL(pVal) is a pulmonary surfactant proteolipid which may therefore be useful in combination with phospholipids and/or other surfactant proteins for the treatment of surfactant deficiency such as hyaline membrane disease in newborn infants.

  13. Complete nucleic acid sequence of Penaeus stylirostris densovirus (PstDNV) from India.

    PubMed

    Rai, Praveen; Safeena, Muhammed P; Karunasagar, Iddya; Karunasagar, Indrani

    2011-06-01

    Infectious hypodermal and hematopoietic necrosis virus (IHHNV) of shrimp has recently been classified as Penaeus stylirostris densovirus (PstDNV). The complete nucleic acid sequence of PstDNV from India was obtained by cloning and sequencing of different DNA fragments of the virus. The genome organisation of PstDNV revealed that there were three major coding domains: a left ORF (NS1) of 2001 bp, a mid ORF (NS2) of 1092 bp and a right ORF (VP) of 990 bp. The complete genome and amino acid sequences of three proteins viz., NS1, NS2 and VP were compared with the genomes of the virus reported from Hawaii, China and Mexico and with partial sequence available from isolates from different regions. The phylogenetic analysis of shrimp, insect and vertebrate parvovirus sequences showed that the Indian PstDNV isolate is phylogenetically more closely related to one of the three isolates from Taiwan (AY355307), and two isolates (AY362547 and AY102034) from Thailand. PMID:21402111

  14. Human liver type pyruvate kinase: complete amino acid sequence and the expression in mammalian cells.

    PubMed Central

    Tani, K; Fujii, H; Nagata, S; Miwa, S

    1988-01-01

    Pyruvate kinase (PK) has four isozymes (L, R, M1, M2) that are encoded by two different genes. Among these isozymes, abnormalities of liver (L)-type PK are considered to be associated with hereditary nonspherocytic hemolytic anemia in humans. We isolated and determined the full-length sequence of human L-type PK cDNA. The cDNA contains 1629 base pairs encoding 543 amino acids, 68 base pairs of 5'-noncoding sequence, and 734 base pairs of 3'-noncoding sequence. The similarity between human and rat L-type PK was 86.9% at the nucleotide sequence level and 92.4% at the amino acid sequence level. The full-length L-type PK cDNA was placed under the promoter of simian virus 40 and introduced into monkey COS cells. Human L-type PK activity was detected in the extract of COS cells by the classical PK electrophoresis method. Images PMID:3126495
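
    The figures quoted for the L-type PK cDNA can be checked with a few lines of arithmetic. This is a minimal sketch using only the numbers in the abstract; the variable names are illustrative, and the summed length assumes the three listed segments make up the whole clone, which the abstract does not state explicitly.

```python
# Segment lengths in base pairs, as given in the abstract.
CODING_BP = 1629
FIVE_PRIME_NONCODING_BP = 68
THREE_PRIME_NONCODING_BP = 734

# Each amino acid is specified by one codon of three nucleotides.
codons, leftover_bases = divmod(CODING_BP, 3)

# Assumption: the clone is just these three segments laid end to end.
total_bp = FIVE_PRIME_NONCODING_BP + CODING_BP + THREE_PRIME_NONCODING_BP

print(f"{CODING_BP} bp / 3 = {codons} codons, {leftover_bases} bases left over")
print(f"Summed length of the listed segments: {total_bp} bp")
```

    The coding region divides evenly into 543 codons, matching the 543 amino acids reported, and the three segments sum to 2431 base pairs.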

  15. Human liver type pyruvate kinase: Complete amino acid sequence and the expression in mammalian cells

    SciTech Connect

    Tani, Kenzaburo; Nagata, Shigekazu; Fujii, Hisaichi; Miwa, Shiro

    1988-03-01

    Pyruvate kinase (PK) has four isozymes (L, R, M1, M2) that are encoded by two different genes. Among these isozymes, abnormalities of liver (L)-type PK are considered to be associated with hereditary nonspherocytic hemolytic anemia in humans. The authors isolated and determined the full-length sequence of human L-type PK cDNA. The cDNA contains 1,629 base pairs encoding 543 amino acids, 68 base pairs of 5'-noncoding sequence, and 734 base pairs of 3'-noncoding sequence. The similarity between human and rat L-type PK was 86.9% at the nucleotide sequence level and 92.4% at the amino acid sequence level. The full-length L-type PK cDNA was placed under the promoter of simian virus 40 and introduced into monkey COS cells. Human L-type PK activity was detected in the extract of COS cells by the classical PK electrophoresis method.

  16. Potential basis for regulation of the coordinately expressed fibrinogen genes: homology in the 5' flanking regions.

    PubMed Central

    Fowlkes, D M; Mullis, N T; Comeau, C M; Crabtree, G R

    1984-01-01

    The three chains of fibrinogen are encoded by three separate genes whose transcription is coordinately regulated. The breakdown of fibrinogen during the acute-phase reaction leads to a simultaneous increase in alpha-, beta-, and gamma-fibrinogen mRNA in the liver. In a search for the basis of this coordinate increase in transcription, we have determined the sequences of the regions surrounding the points of transcriptional initiation of the three rat fibrinogen genes, 1490 nucleotides upstream and 730 nucleotides downstream. Two unique regions of homology have been found. One region consists of 15 nucleotides that have a common 6-nucleotide core lying between -116 and -160; the other is approximately 100 nucleotides long and is in the -165 to -472 region. In this region, the beta- and gamma-fibrinogen genes are approximately 65% homologous. alpha-Fibrinogen has somewhat less homology with both beta- and gamma-fibrinogen. In addition, the beta-fibrinogen gene has 22 nucleotides at position -480 that are homologous to sequences that have been noted to occur in glucocorticosteroid-regulated genes in a similar position. We feel that these areas of conserved sequences play a role in the regulation of the transcription of fibrinogen. The fibrinogen chains are synthesized as precursor peptides, and the amino-terminal portion, the so-called signal peptide, is removed during the translocation of the peptide chain across the endoplasmic reticulum. We have determined those sequences that encode the signal peptides. Homology in the amino acid sequence between the rat and human signal peptides varies between 52% for alpha-fibrinogen and 66% for beta-fibrinogen. This homology implies that there has been strong selective pressure on this portion of these genes. PMID:6232608

  17. Molecular cytogenetics by polymerase catalyzed amplification or in situ labelling of specific nucleic acid sequences

    SciTech Connect

    Bolund, L.; Brandt, C.; Hindkjaer, J.; Koch, J.; Koelvraa, S.; Pedersen, S.

    1993-01-01

    The Polymerase Chain Reaction (PCR) can be performed on isolated cells or chromosomes and the product can be analyzed by DNA technology or by FISH to test metaphases. The authors have had good experience analyzing aberrant chromosomes by FACS sorting, PCR with degenerate primers and painting of test metaphases with the PCR product. They also utilize polymerases for PRimed IN Situ labelling (PRINS) of specific nucleic acid sequences. In PRINS, oligonucleotides are hybridized to their target sequences and labeled nucleotides are incorporated at the site of hybridization with the oligonucleotide as primer. PRINS may eventually allow the study of individual genes, gene expression and even somatic mutations (in mRNA) in single cells.

  18. DNA Cloning of Plasmodium falciparum Circumsporozoite Gene: Amino Acid Sequence of Repetitive Epitope

    NASA Astrophysics Data System (ADS)

    Enea, Vincenzo; Ellis, Joan; Zavala, Fidel; Arnot, David E.; Asavanich, Achara; Masuda, Aoi; Quakyi, Isabella; Nussenzweig, Ruth S.

    1984-08-01

    A clone of complementary DNA encoding the circumsporozoite (CS) protein of the human malaria parasite Plasmodium falciparum has been isolated by screening an Escherichia coli complementary DNA library with a monoclonal antibody to the CS protein. The DNA sequence of the complementary DNA insert encodes a four-amino acid sequence: proline-asparagine-alanine-asparagine, tandemly repeated 23 times. The CS β-lactamase fusion protein specifically binds monoclonal antibodies to the CS protein and inhibits the binding of these antibodies to native Plasmodium falciparum CS protein. These findings provide a basis for the development of a vaccine against Plasmodium falciparum malaria.
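
    A tandem repeat such as the proline-asparagine-alanine-asparagine unit described here is easy to detect programmatically. The following is a rough sketch rather than anything taken from the paper: the function name longest_tandem_run is hypothetical, the repeat is written in one-letter code (PNAN), and the flanking residues in the test sequence are invented purely for illustration.

```python
def longest_tandem_run(sequence, unit):
    """Return the largest number of back-to-back copies of `unit` found in `sequence`."""
    best = 0
    for start in range(len(sequence) - len(unit) + 1):
        run = 0
        position = start
        # Count consecutive copies of the unit beginning at this start position.
        while sequence.startswith(unit, position):
            run += 1
            position += len(unit)
        best = max(best, run)
    return best


# Pro-Asn-Ala-Asn in one-letter code.
REPEAT_UNIT = "PNAN"

# Synthetic example: 23 copies of the unit between made-up flanking residues.
example = "MK" + REPEAT_UNIT * 23 + "GS"

print(longest_tandem_run(example, REPEAT_UNIT))  # prints 23
```

    Scanning every start position keeps the function correct for repeat units that overlap themselves, at the cost of some redundant work on long sequences.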

  19. Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

    DOEpatents

    Studier, F.W.

    1995-04-18

    Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient. 2 figs.
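
    To put the library sizes quoted above in context, they can be compared with the number of distinct primers of each length. The sketch below assumes a four-letter DNA alphabet, so there are 4**k possible primers of length k; the function name possible_primers is illustrative and the calculation is not taken from the patent itself.

```python
def possible_primers(length, alphabet_size=4):
    """Number of distinct oligonucleotides of the given length over A, C, G, T."""
    return alphabet_size ** length


# Minimum library sizes quoted in the abstract, keyed by primer length.
library_sizes = {8: 10_000, 9: 14_200, 10: 44_000}

for length, library_size in library_sizes.items():
    total = possible_primers(length)
    share = 100 * library_size / total
    print(f"{length}-mers: {library_size:,} of {total:,} possible ({share:.1f}%)")
```

    On these figures the libraries would cover roughly 15% of all octamers, 5% of all nonamers and 4% of all decamers.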

  20. Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

    DOEpatents

    Studier, F. William

    1995-04-18

    Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient.

  1. Amino acid sequence alignment of bacterial and mammalian pancreatic serine proteases based on topological equivalences.

    PubMed

    James, M N; Delbaere, L T; Brayer, G D

    1978-06-01

    The three-dimensional structures of the bacterial serine proteases SGPA, SGPB, and alpha-lytic protease have been compared with those of the pancreatic enzymes alpha-chymotrypsin and elastase. This comparison shows that approximately 60% (55-64%) of the alpha-carbon atom positions of the bacterial serine proteases are topologically equivalent to the alpha-carbon atom positions of the pancreatic enzymes. The corresponding value for a comparison of the bacterial enzymes among themselves is approximately 84%. The results of these topological comparisons have been used to deduce an experimentally sound sequence alignment for these several enzymes. This alignment shows that there is extensive tertiary structural homology among the bacterial and pancreatic enzymes without significant primary sequence identity (less than 21%). The acquisition of a zymogen function by the pancreatic enzymes is accompanied by two major changes to the bacterial enzymes' architecture: an insertion of 9 residues to increase the length of the N-terminal loop, and one of 12 residues to a loop near the activation salt bridge. In addition, in these two enzyme families, the methionine loop (residues 164-182) adopts very different conformations which are associated with their altered substrate specificities. PMID:96920

  2. Autoreactive T-cell receptor (Vbeta/D/Jbeta) sequences in diabetes are homologous to insulin, glucagon, the insulin receptor, and the glucagon receptor.

    PubMed

    Root-Bernstein, Robert

    2009-01-01

    The hypervariable (Vbeta/D/Jbeta) regions of T-cell receptors (TCR) have been sequenced in a variety of autoimmune diseases by various investigators. An analysis of some of these sequences shows that TCR from both human diabetics and NOD mice mimic insulin, glucagon, the insulin receptor, and the glucagon receptor. Such similarities are not found in the TCR produced in other human autoimmune diseases. These data may explain how insulin, glucagon, and their receptors are targets of autoimmunity in diabetes and also suggest that TCR mimicking insulin and its receptor may be targets of anti-insulin autoantibodies. Such intra-systemic mimicry of self-proteins also raises complex questions about how "self" and "nonself" are regulated during TCR production, especially in light of the complementarity of insulin for its receptor and glucagon for its receptor. The data presented here suggest that some TCR may be complementary to other TCR in autoimmune diseases, a possibility that is experimentally testable. Such complementarity, if it exists, could either serve to down-regulate the clones bearing such TCR or, alternatively, trigger an intra-immune system civil war between them. PMID:19051206

  3. Biological sequence classification with multivariate string kernels.

    PubMed

    Kuksa, Pavel P

    2013-01-01

    String kernel-based machine learning methods have yielded great success in practical tasks of structured/sequential data analysis. They often exhibit state-of-the-art performance on many practical tasks of sequence analysis such as biological sequence classification, remote homology detection, or protein superfamily and fold prediction. However, typical string kernel methods rely on the analysis of discrete 1D string data (e.g., DNA or amino acid sequences). In this paper, we address the multiclass biological sequence classification problems using multivariate representations in the form of sequences of feature vectors (as in biological sequence profiles, or sequences of individual amino acid physicochemical descriptors) and a class of multivariate string kernels that exploit these representations. On three protein sequence classification tasks, the proposed multivariate representations and kernels show significant 15-20 percent improvements compared to existing state-of-the-art sequence classification methods. PMID:24384708
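
    For readers unfamiliar with string kernels, the discrete 1D case mentioned in this abstract can be illustrated with the standard k-mer spectrum kernel. This is a generic textbook sketch, not the multivariate kernel proposed in the paper; the function names and the toy sequences are assumptions made for illustration.

```python
from collections import Counter


def spectrum(sequence, k):
    """Count every contiguous substring of length k (k-mer) in the sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def spectrum_kernel(seq_x, seq_y, k):
    """Inner product of the two k-mer count vectors."""
    counts_x = spectrum(seq_x, k)
    counts_y = spectrum(seq_y, k)
    # Counter returns 0 for k-mers absent from seq_y, so only shared k-mers contribute.
    return sum(count * counts_y[kmer] for kmer, count in counts_x.items())


# Toy sequences: "ABAB" has AB twice and BA once; "BABA" has BA twice and AB once.
print(spectrum_kernel("ABAB", "BABA", 2))  # prints 4
```

    The resulting similarity values can be assembled into a kernel matrix and passed to any kernel-based classifier; a multivariate kernel of the kind the abstract describes would replace the discrete k-mers with sequences of feature vectors.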

  4. Biological Sequence Analysis with Multivariate String Kernels.

    PubMed

    Kuksa, Pavel P

    2013-03-01

    String kernel-based machine learning methods have yielded great success in practical tasks of structured/sequential data analysis. They often exhibit state-of-the-art performance on many practical tasks of sequence analysis such as biological sequence classification, remote homology detection, or protein superfamily and fold prediction. However, typical string kernel methods rely on analysis of discrete one-dimensional (1D) string data (e.g., DNA or amino acid sequences). In this work we address the multi-class biological sequence classification problems using multivariate representations in the form of sequences of feature vectors (as in biological sequence profiles, or sequences of individual amino acid physico-chemical descriptors) and a class of multivariate string kernels that exploit these representations. On a number of protein sequence classification tasks, the proposed multivariate representations and kernels show significant 15-20% improvements compared to existing state-of-the-art sequence classification methods. PMID:23509193

  5. The Complete Genome Sequence of the Lactic Acid Bacterium Lactococcus lactis ssp. lactis IL1403

    PubMed Central

    Bolotin, Alexander; Wincker, Patrick; Mauger, Stéphane; Jaillon, Olivier; Malarme, Karine; Weissenbach, Jean; Ehrlich, S. Dusko; Sorokin, Alexei

    2001-01-01

    Lactococcus lactis is a nonpathogenic AT-rich gram-positive bacterium closely related to the genus Streptococcus and is the most commonly used cheese starter. It is also the best-characterized lactic acid bacterium. We sequenced the genome of the laboratory strain IL1403, using a novel two-step strategy that comprises diagnostic sequencing of the entire genome and a shotgun polishing step. The genome contains 2,365,589 base pairs and encodes 2310 proteins, including 293 protein-coding genes belonging to six prophages and 43 insertion sequence (IS) elements. Nonrandom distribution of IS elements indicates that the chromosome of the sequenced strain may be a product of recent recombination between two closely related genomes. A complete set of late competence genes is present, indicating the ability of L. lactis to undergo DNA transformation. Genomic sequence revealed new possibilities for fermentation pathways and for aerobic respiration. It also indicated a horizontal transfer of genetic information from Lactococcus to gram-negative enteric bacteria of Salmonella-Escherichia group. [The sequence data described in this paper has been submitted to the GenBank data library under accession no. AE005176.] PMID:11337471

  6. Low molecular weight (C1-C10) monocarboxylic acids, dissolved organic carbon and major inorganic ions in alpine snow pit sequence from a high mountain site, central Japan

    NASA Astrophysics Data System (ADS)

    Kawamura, Kimitaka; Matsumoto, Kohei; Tachibana, Eri; Aoki, Kazuma

    2012-12-01

    Snowpack samples were collected from a snow pit sequence (6 m in depth) at the Murodo-Daira site near the summit of Mt. Tateyama, central Japan, an outflow region of Asian dusts. The snow samples were analyzed for a homologous series of low molecular weight normal (C1-C10) and branched (iC4-iC6) monocarboxylic acids as well as aromatic (benzoic) and hydroxy (glycolic and lactic) acids, together with major inorganic ions and dissolved organic carbon (DOC). The molecular distributions of organic acids were characterized by a predominance of acetic (range 7.8-76.4 ng g-1-snow, av. 34.8 ng g-1) or formic acid (2.6-48.1 ng g-1, 27.7 ng g-1), followed by propionic acid (0.6-5.2 ng g-1, 2.8 ng g-1). Concentrations of normal organic acids generally decreased with an increase in carbon chain length, although nonanoic acid (C9) showed a maximum in the range of C5-C10. Higher concentrations were found in the snowpack samples containing a dust layer. Benzoic acid (0.18-4.1 ng g-1, 1.4 ng g-1) showed positive correlation with nitrate (r = 0.70), sulfate (0.67), Na+ (0.78), Ca2+ (0.86) and Mg2+ (0.75), suggesting that this aromatic acid is involved with anthropogenic sources and Asian dusts. Higher concentrations of Ca2+ and SO42- were found in the dusty snow samples. We found a weak positive correlation (r = 0.43) between formic acid and Ca2+, suggesting that gaseous formic acid may react with Asian dusts in the atmosphere during long-range transport. However, acetic acid did not show any positive correlations with major inorganic ions. Hydroxyacids (0.03-5.7 ng g-1, 1.5 ng g-1) were more abundant in the granular and dusty snow. Total monocarboxylic acids (16-130 ng g-1, 74 ng g-1) were found to account for 1-6% of DOC (270-1500 ng g-1, 630 ng g-1) in the snow samples.

  7. On human disease-causing amino acid variants: statistical study of sequence and structural patterns

    PubMed Central

    Alexov, Emil

    2015-01-01

    Statistical analysis was carried out on a large set of naturally occurring human amino acid variations and it was demonstrated that there is a preference for some amino acid substitutions to be associated with diseases. At an amino acid sequence level, it was shown that the disease-causing variants frequently involve drastic changes of amino acid physico-chemical properties of proteins such as charge, hydrophobicity and geometry. Structural analysis of variants involved in diseases and being frequently observed in human population showed similar trends: disease-causing variants tend to cause more changes of hydrogen bond network and salt bridges as compared with harmless amino acid mutations. Analysis of thermodynamics data reported in literature, both experimental and computational, indicated that disease-causing variants tend to destabilize proteins and their interactions, which prompted us to investigate the effects of amino acid mutations on large databases of experimentally measured energy changes in unrelated proteins. Although the experimental datasets were neither linked to diseases nor exclusive to human proteins, the observed trends were the same: amino acid mutations tend to destabilize proteins and their interactions. Having in mind that structural and thermodynamics properties are interrelated, it is pointed out that any large change of any of them is anticipated to cause a disease. PMID:25689729

  8. Self-sequencing of amino acids and origins of polyfunctional protocells.

    PubMed

    Fox, S W

    1984-01-01

    The primal role of the origins of proteins in molecular evolution is discussed. On the basis of this premise, the significance of the experimentally established self-sequencing of amino acids under simulated geological conditions is explained as due to the fact that the products are highly nonrandom and accordingly contain many kinds of information. When such thermal proteins are aggregated into laboratory protocells, an action that occurs readily, the resultant protocells also contain many kinds of information. Residue-by-residue order, enzymic activities, and lipid quality accordingly occur within each preparation of proteinoid (thermal protein). In this paper are reviewed briefly the phenomenon of self-sequencing of amino acids, its relationship to evolutionary processes, other significance of such self-ordering, and the experimental evidence for original polyfunctional protocells. PMID:6462684

  9. Self-Sequencing of Amino Acids and Origins of Polyfunctional Protocells

    NASA Astrophysics Data System (ADS)

    Fox, Sidney W.

    1984-12-01

    The primal role of the origins of proteins in molecular evolution is discussed. On the basis of this premise, the significance of the experimentally established self-sequencing of amino acids under simulated geological conditions is explained as due to the fact that the products are highly nonrandom and accordingly contain many kinds of information. When such thermal proteins are aggregated into laboratory protocells, an action that occurs readily, the resultant protocells also contain many kinds of information. Residue-by-residue order, enzymic activities, and lipid quality accordingly occur within each preparation of proteinoid (thermal protein). In this paper are reviewed briefly the phenomenon of self-sequencing of amino acids, its relationship to evolutionary processes, other significance of such self-ordering, and the experimental evidence for original polyfunctional protocells.

  10. Two RNAs or DNAs May Artificially Fuse Together at a Short Homologous Sequence (SHS) during Reverse Transcription or Polymerase Chain Reactions, and Thus Reporting an SHS-Containing Chimeric RNA Requires Extra Caution

    PubMed Central

    Xie, Bingkun; Yang, Wei; Ouyang, Yongchang; Chen, Lichan; Jiang, Hesheng; Liao, Yuying; Liao, D. Joshua

    2016-01-01

    Tens of thousands of chimeric RNAs have been reported. Most of them contain a short homologous sequence (SHS) at the joining site of the two partner genes but are not associated with a fusion gene. We hypothesize that many of these chimeras may be technical artifacts derived from SHS-caused mis-priming in reverse transcription (RT) or polymerase chain reactions (PCR). We cloned six chimeric complementary DNAs (cDNAs) formed by human mitochondrial (mt) 16S rRNA sequences at an SHS, which were similar to several expression sequence tags (ESTs). These chimeras, which could not be detected with cDNA protection assay, were likely formed because some regions of the 16S rRNA are reversely complementary to another region to form an SHS, which allows the downstream sequence to loop back and anneal at the SHS to prime the synthesis of its complementary strand, yielding a palindromic sequence that can form a hairpin-like structure. We identified a 16S rRNA that ended at the 4th nucleotide (nt) of the mt-tRNA-leu was dominant and thus should be the wild type. We also cloned a mouse Bcl2-Nek9 chimeric cDNA that contained a 5-nt unmatchable sequence between the two partners, contained two copies of the reverse primer in the same direction but did not contain the forward primer, making it unclear how this Bcl2-Nek9 was formed and amplified. Moreover, a cDNA was amplified because one primer has 4 nts matched to the template, suggesting that there may be many more artificial cDNAs than we have realized, because the nuclear and mt genomes have many more 4-nt than 5-nt or longer homologues. Altogether, the chimeric cDNAs we cloned are good examples suggesting that many cDNAs may be artifacts due to SHS-caused mis-priming and thus greater caution should be taken when new sequence is obtained from a technique involving DNA polymerization. PMID:27148738

  11. Two RNAs or DNAs May Artificially Fuse Together at a Short Homologous Sequence (SHS) during Reverse Transcription or Polymerase Chain Reactions, and Thus Reporting an SHS-Containing Chimeric RNA Requires Extra Caution.

    PubMed

    Xie, Bingkun; Yang, Wei; Ouyang, Yongchang; Chen, Lichan; Jiang, Hesheng; Liao, Yuying; Liao, D Joshua

    2016-01-01

    Tens of thousands of chimeric RNAs have been reported. Most of them contain a short homologous sequence (SHS) at the joining site of the two partner genes but are not associated with a fusion gene. We hypothesize that many of these chimeras may be technical artifacts derived from SHS-caused mis-priming in reverse transcription (RT) or polymerase chain reactions (PCR). We cloned six chimeric complementary DNAs (cDNAs) formed by human mitochondrial (mt) 16S rRNA sequences at an SHS, which were similar to several expression sequence tags (ESTs). These chimeras, which could not be detected with cDNA protection assay, were likely formed because some regions of the 16S rRNA are reversely complementary to another region to form an SHS, which allows the downstream sequence to loop back and anneal at the SHS to prime the synthesis of its complementary strand, yielding a palindromic sequence that can form a hairpin-like structure. We identified a 16S rRNA that ended at the 4th nucleotide (nt) of the mt-tRNA-leu was dominant and thus should be the wild type. We also cloned a mouse Bcl2-Nek9 chimeric cDNA that contained a 5-nt unmatchable sequence between the two partners, contained two copies of the reverse primer in the same direction but did not contain the forward primer, making it unclear how this Bcl2-Nek9 was formed and amplified. Moreover, a cDNA was amplified because one primer has 4 nts matched to the template, suggesting that there may be many more artificial cDNAs than we have realized, because the nuclear and mt genomes have many more 4-nt than 5-nt or longer homologues. Altogether, the chimeric cDNAs we cloned are good examples suggesting that many cDNAs may be artifacts due to SHS-caused mis-priming and thus greater caution should be taken when new sequence is obtained from a technique involving DNA polymerization. PMID:27148738

  12. Nitrogenase and Homologs

    PubMed Central

    2014-01-01

    Nitrogenase catalyzes biological nitrogen fixation, a key step in the global nitrogen cycle. Three homologous nitrogenases have been identified to date, along with several structural and/or functional homologs of this enzyme that are involved in nitrogenase assembly, bacteriochlorophyll biosynthesis and methanogenic process, respectively. In this article, we provide an overview of the structures and functions of nitrogenase and its homologs, which highlights the similarity and disparity of this uniquely versatile group of enzymes. PMID:25491285

  13. Dualities in Persistent (Co)Homology

    SciTech Connect

    de Silva, Vin; Morozov, Dmitriy; Vejdemo-Johansson, Mikael

    2011-09-16

    We consider sequences of absolute and relative homology and cohomology groups that arise naturally for a filtered cell complex. We establish algebraic relationships between their persistence modules, and show that they contain equivalent information. We explain how one can use the existing algorithm for persistent homology to process any of the four modules, and relate it to a recently introduced persistent cohomology algorithm. We present experimental evidence for the practical efficiency of the latter algorithm.

  14. Single Point Mutation in Bin/Amphiphysin/Rvs (BAR) Sequence of Endophilin Impairs Dimerization, Membrane Shaping, and Src Homology 3 Domain-mediated Partnership

    PubMed Central

    Gortat, Anna; San-Roman, Mabel Jouve; Vannier, Christian; Schmidt, Anne A.

    2012-01-01

    Bin/Amphiphysin/Rvs (BAR) domain-containing proteins are essential players in the dynamics of intracellular compartments. The BAR domain is an evolutionarily conserved dimeric module characterized by a crescent-shaped structure whose intrinsic curvature, flexibility, and ability to assemble into highly ordered oligomers contribute to inducing the curvature of target membranes. Endophilins, diverging into A and B subgroups, are BAR and SH3 domain-containing proteins. They exert activities in membrane dynamic processes such as endocytosis, autophagy, mitochondrial dynamics, and permeabilization during apoptosis. Here, we report on the involvement of the third α-helix of the endophilin A BAR sequence in dimerization and identify leucine 215 as a key residue within a network of hydrophobic interactions stabilizing the entire BAR dimer interface. With the combination of N-terminal truncation retaining the high dimerization capacity of the third α-helices of endophilin A and leucine 215 substitution by aspartate (L215D), we demonstrate the essential role of BAR sequence-mediated dimerization on SH3 domain partnership. In comparison with wild type, full-length endophilin A2 heterodimers with one protomer bearing the L215D substitution exhibit very significant changes in membrane binding and shaping activities as well as a dramatic decrease of SH3 domain partnership. This suggests that subtle changes in the conformation and/or rigidity of the BAR domain impact both the control of membrane curvature and downstream binding to effectors. Finally, we show that expression, in mammalian cells, of endophilin A2 bearing the L215D substitution impairs the endocytic recycling of transferrin receptors. PMID:22167186

  15. Comparative studies on tree pollen allergens. X. Further purification and N-terminal amino acid sequence analyses of the major allergen of birch pollen (Betula verrucosa).

    PubMed

    Vik, H; Elsayed, S

    1986-01-01

    The previously isolated major allergen of birch pollen (fraction BV45), Int. Archs Allergy appl. Immun. 68: 70-78 (1982), was further purified by recycling chromatography. The purified preparation was run on a high-performance liquid chromatography (HPLC) TSK-G-2000 gel filtration chromatography column and, finally, on paper high-volt electrophoresis. The protein recovered met the homogeneity criteria required for performing the N-terminal sequence analysis. The allergenic and antigenic reactivities of the HPLC-purified protein, designated BV45B, were examined. A single homogeneous precipitation line in crossed immunoelectrophoresis (CIE) was shown. Specific IgE-inhibition tests and immuno-autoradiographic prints indicated that this allergen could bind reaginic IgE specifically and with good affinity. The homogeneity of BV45B was examined by isoelectric focusing (IEF). Several minor bands of pI differences of less than 0.1 units were visible, demonstrating the existence of some molecular variants of this protein. The N-terminal sequence analysis of the molecule was performed, and the following four amino acids were tentatively shown by sequential cleavage: NH2-Ala-Gly-Ile-Val-. The demonstration of one dominant N-terminal 1-dimethyl-amino-5-naphthalene sulphonyl (DNS)-amino acid by polyamide thin-layer chromatography at each sequence step confirmed that the N-terminal residue of the protein was not blocked; the heterogeneity shown by the IEF system was merely due to the presence of several homologous polymorphic proteins with identical N-terminal amino acid, the adequacy of the purification repertoire used. PMID:3957444

  16. Sequence of morphological transitions in two-dimensional pattern growth from aqueous ascorbic Acid solutions.

    PubMed

    Paranjpe, A S

    2002-08-12

    A sequence of morphological transitions in two-dimensional dehydration patterns of aqueous solutions of ascorbic acid is observed with humidity as a control parameter. Change in morphology occurs due to humidity induced variation in the concentration of the metastable supersaturated solution phase formed after initial solvent evaporation. As percent humidity is varied from 40 to 80, patterns change from compact circular --> radial --> density modulated radial (a new morphology) --> density modulated circular --> density modulated dendritic (a new morphology) --> dense branching. PMID:12190528

  17. Self-sequencing of amino acids and origins of polyfunctional protocells

    NASA Technical Reports Server (NTRS)

    Fox, S. W.

    1984-01-01

    The role of proteins in the origin of living things is discussed. It has been experimentally established that amino acids can sequence themselves under simulated geological conditions with highly nonrandom products which accordingly contain diverse information. Multiple copies of each type of macromolecule are formed, resulting in greater power for any protoenzymic molecule than would accrue from a single copy of each type. Thermal proteins are readily incorporated into laboratory protocells. The experimental evidence for original polyfunctional protocells is discussed.

  18. Characterization of the microbial acid mine drainage microbial community using culturing and direct sequencing techniques.

    PubMed

    Auld, Ryan R; Myre, Maxine; Mykytczuk, Nadia C S; Leduc, Leo G; Merritt, Thomas J S

    2013-05-01

    We characterized the bacterial community from an AMD tailings pond using both classical culturing and modern direct sequencing techniques and compared the two methods. Acid mine drainage (AMD) is produced by the environmental and microbial oxidation of minerals dissolved from mining waste. Surprisingly, we know little about the microbial communities associated with AMD, despite the fundamental ecological roles of these organisms and large-scale economic impact of these waste sites. AMD microbial communities have classically been characterized by laboratory culturing-based techniques and more recently by direct sequencing of marker gene sequences, primarily the 16S rRNA gene. In our comparison of the techniques, we find that their results are complementary, overall indicating very similar community structure with similar dominant species, but with each method identifying some species that were missed by the other. We were able to culture the majority of species that our direct sequencing results indicated were present, primarily species within the Acidithiobacillus and Acidiphilium genera, although estimates of relative species abundance were only obtained from direct sequencing. Interestingly, our culture-based methods recovered four species that had been overlooked from our sequencing results because of the rarity of the marker gene sequences, likely members of the rare biosphere. Further, direct sequencing indicated that a single genus, completely missed in our culture-based study, Legionella, was a dominant member of the microbial community. Our results suggest that while either method does a reasonable job of identifying the dominant members of the AMD microbial community, together the methods combine to give a more complete picture of the true diversity of this environment. PMID:23485423

  19. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... approved by the Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51... base or modified or unusual amino acid may be presented in a given sequence as the corresponding unmodified base or amino acid if the modified base or modified or unusual amino acid is one of those...

  20. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... approved by the Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51... base or modified or unusual amino acid may be presented in a given sequence as the corresponding unmodified base or amino acid if the modified base or modified or unusual amino acid is one of those...

  1. Nanopore Analysis of Nucleic Acids: Single-Molecule Studies of Molecular Dynamics, Structure, and Base Sequence

    NASA Astrophysics Data System (ADS)

    Olasagasti, Felix; Deamer, David W.

    Nucleic acids are linear polynucleotides in which each base is covalently linked to a pentose sugar and a phosphate group carrying a negative charge. If a pore having roughly the cross-sectional diameter of a single-stranded nucleic acid is embedded in a thin membrane and a voltage of 100 mV or more is applied, individual nucleic acids in solution can be captured by the electrical field in the pore and translocated through by single-molecule electrophoresis. The dimensions of the pore cannot accommodate anything larger than a single strand, so each base in the molecule passes through the pore in strict linear sequence. The nucleic acid strand occupies a large fraction of the pore's volume during translocation and therefore produces a transient blockade of the ionic current created by the applied voltage. If it could be demonstrated that each nucleotide in the polymer produced a characteristic modulation of the ionic current during its passage through the nanopore, the sequence of current modulations would reflect the sequence of bases in the polymer. According to this basic concept, nanopores are analogous to a Coulter counter that detects nanoscopic molecules rather than microscopic ones [1,2]. However, the advantage of nanopores is that individual macromolecules can be characterized because different chemical and physical properties affect their passage through the pore. Because macromolecules can be captured in the pore as well as translocated, the nanopore can be used to detect individual functional complexes that form between a nucleic acid and an enzyme. No other technique has this capability.

  2. Structural investigations of the p53/p73 homologs from the tunicate species Ciona intestinalis reveal the sequence requirements for the formation of a tetramerization domain.

    PubMed

    Heering, Jan; Jonker, Hendrik R A; Löhr, Frank; Schwalbe, Harald; Dötsch, Volker

    2016-02-01

    Most members of the p53 family of transcription factors form tetramers. Responsible for determining the oligomeric state is a short oligomerization domain consisting of one β-strand and one α-helix. With the exception of human p53 all other family members investigated so far contain a second α-helix as part of their tetramerization domain. Here we have used nuclear magnetic resonance spectroscopy to characterize the oligomerization domains of the two p53-like proteins from the tunicate Ciona intestinalis, representing the closest living relative of vertebrates. Structure determination reveals for one of the two proteins a new type of packing of this second α-helix on the core domain that was not predicted based on the sequence, while the other protein does not form a second helix despite the presence of crucial residues that are conserved in all other family members that form a second helix. By mutational analysis, we identify a proline as well as large hydrophobic residues in the hinge region between both helices as the crucial determinant for the formation of a second helix. PMID:26473758

  3. The ATP binding site of the chromatin remodeling homolog Lsh is required for nucleosome density and de novo DNA methylation at repeat sequences

    PubMed Central

    Ren, Jianke; Briones, Victorino; Barbour, Samantha; Yu, Weishi; Han, Yixing; Terashima, Minoru; Muegge, Kathrin

    2015-01-01

    Lsh, a chromatin remodeling protein of the SNF2 family, is critical for normal heterochromatin structure. In particular, DNA methylation at repeat elements, a hallmark of heterochromatin, is greatly reduced in Lsh−/− (KO) cells. Here, we examined the presumed nucleosome remodeling activity of Lsh on chromatin in the context of DNA methylation. We found that dynamic CG methylation was dependent on Lsh in embryonic stem cells. Moreover, we demonstrate that ATP function is critical for de novo methylation at repeat sequences. The ATP binding site of Lsh is in part required to promote stable association of the DNA methyltransferase 3b with the repeat locus. By performing nucleosome occupancy assays, we found distinct nucleosome occupancy in KO ES cells compared to WT ES cells after differentiation. Nucleosome density was restored to wild-type level by re-expressing wild-type Lsh but not the ATP mutant in KO ES cells. Our results suggest that ATP-dependent nucleosome remodeling is the primary molecular function of Lsh, which may promote de novo methylation in differentiating ES cells. PMID:25578963

  4. Complete amino acid sequence of a histidine-rich proteolytic fragment of human ceruloplasmin.

    PubMed

    Kingston, I B; Kingston, B L; Putnam, F W

    1979-04-01

    The complete amino acid sequence has been determined for a fragment of human ceruloplasmin [ferroxidase; iron(II):oxygen oxidoreductase, EC 1.16.3.1]. The fragment (designated Cp F5) contains 159 amino acid residues and has a molecular weight of 18,650; it lacks carbohydrate, is rich in histidine, and contains one free cysteine that may be part of a copper-binding site. This fragment is present in most commercial preparations of ceruloplasmin, probably owing to proteolytic degradation, but can also be obtained by limited cleavage of single-chain ceruloplasmin with plasmin. Cp F5 probably is an intact domain attached to the COOH-terminal end of single-chain ceruloplasmin via a labile interdomain peptide bond. A model of the secondary structure predicted by empirical methods suggests that almost one-third of the amino acid residues are distributed in alpha helices, about a third in beta-sheet structure, and the remainder in beta turns and unidentified structures. Computer analysis of the amino acid sequence has not demonstrated a statistically significant relationship between this ceruloplasmin fragment and any other protein, but there is some evidence for an internal duplication. PMID:287005

  5. The amino acid sequence of Lady Amherst's pheasant (Chrysolophus amherstiae) and golden pheasant (Chrysolophus pictus) egg-white lysozymes.

    PubMed

    Araki, T; Kuramoto, M; Torikata, T

    1990-09-01

    The amino acids of Lady Amherst's pheasant and golden pheasant egg-white lysozymes have been sequenced. The carboxymethylated lysozymes were digested with trypsin followed by sequencing of the tryptic peptides. Lady Amherst's pheasant lysozyme proved to consist of 129 amino acid residues, and a relative molecular mass of 14,423 Da was calculated. This lysozyme had 6 amino acid substitutions when compared with hen egg-white lysozyme: Phe3 to Tyr, His15 to Leu, Gln41 to His, Asn77 to His, Gln121 to Asn, and a newly found substitution of Ile124 to Thr. The amino acid sequence of golden pheasant lysozyme was identical to that of Lady Amherst's pheasant lysozyme. The phylogenetic tree constructed by the comparison of amino acid sequences of phasianoid bird lysozymes revealed a minimum genetic distance between these pheasants and the turkey-peafowl group. PMID:1368578

  6. Complete Genome Sequence of a thermotolerant sporogenic lactic acid bacterium, Bacillus coagulans strain 36D1

    PubMed Central

    Rhee, Mun Su; Moritz, Brélan E.; Xie, Gary; Glavina del Rio, T.; Dalin, E.; Tice, H.; Bruce, D.; Goodwin, L.; Chertkov, O.; Brettin, T.; Han, C.; Detter, C.; Pitluck, S.; Land, Miriam L.; Patel, Milind; Ou, Mark; Harbrucker, Roberta; Ingram, Lonnie O.; Shanmugam, K. T.

    2011-01-01

    Bacillus coagulans is a ubiquitous soil bacterium that grows at 50-55 °C and pH 5.0 and ferments various sugars that constitute plant biomass to L (+)-lactic acid. The ability of this sporogenic lactic acid bacterium to grow at 50-55 °C and pH 5.0 makes this organism an attractive microbial biocatalyst for production of optically pure lactic acid at industrial scale not only from glucose derived from cellulose but also from xylose, a major constituent of hemicellulose. This bacterium is also considered as a potential probiotic. Complete genome sequence of a representative strain, B. coagulans strain 36D1, is presented and discussed. PMID:22675583

  7. Complete amino acid sequence of globin chains and biological activity of fragmented crocodile hemoglobin (Crocodylus siamensis).

    PubMed

    Srihongthong, Saowaluck; Pakdeesuwan, Anawat; Daduang, Sakda; Araki, Tomohiro; Dhiravisit, Apisak; Thammasirirak, Sompong

    2012-08-01

    Hemoglobin, α-chain, β-chain and fragmented hemoglobin of Crocodylus siamensis demonstrated both antibacterial and antioxidant activities. Antibacterial and antioxidant properties of the hemoglobin did not depend on the heme structure but could result from the compositions of amino acid residues and structures present in their primary structure. Furthermore, thirteen purified active peptides were obtained by RP-HPLC analyses, corresponding to fragments in the α-globin chain and the β-globin chain which are mostly located at the N-terminal and C-terminal parts. These active peptides operate on the bacterial cell membrane. The globin chains of Crocodylus siamensis showed amino acid sequences similar to those of Crocodylus niloticus. The novel amino acid substitutions of α-chain and β-chain are not associated with the heme binding site or the bicarbonate ion binding site, but could be important through their interactions with membranes of bacteria. PMID:22648692

  8. Towards Alignment Independent Quantitative Assessment of Homology Detection

    PubMed Central

    Kliger, Yossef

    2006-01-01

    Identification of homologous proteins provides a basis for protein annotation. Sequence alignment tools reliably identify homologs sharing high sequence similarity. However, identification of homologs that share low sequence similarity remains a challenge. Lowering the cutoff value could enable the identification of diverged homologs, but also introduces numerous false hits. Methods are being continuously developed to minimize this problem. Estimation of the fraction of homologs in a set of protein alignments can help in the assessment and development of such methods, and provides the users with intuitive quantitative assessment of protein alignment results. Herein, we present a computational approach that estimates the amount of homologs in a set of protein pairs. The method requires a prevalent and detectable protein feature that is conserved between homologs. By analyzing the feature prevalence in a set of pairwise protein alignments, the method can estimate the number of homolog pairs in the set independently of the alignments' quality. Using the HomoloGene database as a standard of truth, we implemented this approach in a proteome-wide analysis. The results revealed that this approach, which is independent of the alignments themselves, works well for estimating the number of homologous proteins in a wide range of homology values. In summary, the presented method can accompany homology searches and method development, provides validation to search results, and allows tuning of tools and methods. PMID:17205117
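
    The abstract above does not state the estimator itself. As a rough illustration only, one simple way to turn feature prevalence into a homolog count is a two-component mixture: if aligned pairs that are true homologs share the feature with one probability and unrelated pairs share it with a lower one, the observed rate of agreement fixes the homolog fraction. The function name, parameter names, and example numbers below are all hypothetical and are not taken from the paper.

```python
def estimate_homolog_fraction(observed_agreement,
                              agreement_if_homologous,
                              agreement_if_unrelated):
    """Estimate the fraction of true homolog pairs in a set of alignments.

    Assumes (hypothetically) a two-component mixture:
        observed = f * p_homologous + (1 - f) * p_unrelated
    and solves for f. All three inputs are probabilities in [0, 1].
    """
    span = agreement_if_homologous - agreement_if_unrelated
    if span <= 0:
        raise ValueError("the feature must be shared more often by homologs")
    fraction = (observed_agreement - agreement_if_unrelated) / span
    # Sampling noise can push the raw estimate slightly outside [0, 1].
    return min(1.0, max(0.0, fraction))


def estimate_homolog_pairs(n_pairs, *rates):
    """Convert the estimated fraction into a number of pairs."""
    return round(n_pairs * estimate_homolog_fraction(*rates))


# Illustrative numbers only: 65% of pairs agree on the feature, homologs
# agree 90% of the time, unrelated pairs 40% of the time.
print(estimate_homolog_pairs(1000, 0.65, 0.90, 0.40))
```

    Because the estimate uses only how often the feature is shared, it does not depend on alignment scores, which is the property the abstract emphasizes.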

  9. Comparative characterization of random-sequence proteins consisting of 5, 12, and 20 kinds of amino acids.

    PubMed

    Tanaka, Junko; Doi, Nobuhide; Takashima, Hideaki; Yanagawa, Hiroshi

    2010-04-01

    Screening of functional proteins from a random-sequence library has been used to evolve novel proteins in the field of evolutionary protein engineering. However, random-sequence proteins consisting of the 20 natural amino acids tend to aggregate, and the occurrence rate of functional proteins in a random-sequence library is low. From the viewpoint of the origin of life, it has been proposed that primordial proteins consisted of a limited set of amino acids that could have been abundantly formed early during chemical evolution. We have previously found that members of a random-sequence protein library constructed with five primitive amino acids show high solubility (Doi et al., Protein Eng Des Sel 2005;18:279-284). Although such a library is expected to be appropriate for finding functional proteins, the functionality may be limited, because they have no positively charged amino acid. Here, we constructed three libraries of 120-amino acid, random-sequence proteins using alphabets of 5, 12, and 20 amino acids by preselection using mRNA display (to eliminate sequences containing stop codons and frameshifts) and characterized and compared the structural properties of random-sequence proteins arbitrarily chosen from these libraries. We found that random-sequence proteins constructed with the 12-member alphabet (including five primitive amino acids and positively charged amino acids) have higher solubility than those constructed with the 20-member alphabet, though other biophysical properties are very similar in the two libraries. Thus, a library of moderate complexity constructed from 12 amino acids may be a more appropriate resource for functional screening than one constructed from 20 amino acids. PMID:20162614

  10. Homological stabilizer codes

    SciTech Connect

    Anderson, Jonas T.

    2013-03-15

    In this paper we define homological stabilizer codes on qubits which encompass codes such as Kitaev's toric code and the topological color codes. These codes are defined solely by the graphs they reside on. This feature allows us to use properties of topological graph theory to determine the graphs which are suitable as homological stabilizer codes. We then show that all toric codes are equivalent to homological stabilizer codes on 4-valent graphs. We show that the topological color codes and toric codes correspond to two distinct classes of graphs. We define the notion of label set equivalencies and show that under a small set of constraints the only homological stabilizer codes without local logical operators are equivalent to Kitaev's toric code or to the topological color codes. - Highlights: ► We show that Kitaev's toric codes are equivalent to homological stabilizer codes on 4-valent graphs. ► We show that toric codes and color codes correspond to homological stabilizer codes on distinct graphs. ► We find and classify all 2D homological stabilizer codes. ► We find optimal codes among the homological stabilizer codes.

  11. N-Terminal Amino Acid Sequence Determination of Proteins by N-Terminal Dimethyl Labeling: Pitfalls and Advantages When Compared with Edman Degradation Sequence Analysis.

    PubMed

    Chang, Elizabeth; Pourmal, Sergei; Zhou, Chun; Kumar, Rupesh; Teplova, Marianna; Pavletich, Nikola P; Marians, Kenneth J; Erdjument-Bromage, Hediye

    2016-07-01

    In recent history, alternative approaches to Edman sequencing have been investigated, and to this end, the Association of Biomolecular Resource Facilities (ABRF) Protein Sequencing Research Group (PSRG) initiated studies in 2014 and 2015, looking into bottom-up and top-down N-terminal (Nt) dimethyl derivatization of standard quantities of intact proteins with the aim to determine Nt sequence information. We have expanded this initiative and used low picomole amounts of myoglobin to determine the efficiency of Nt-dimethylation. Application of this approach on protein domains, generated by limited proteolysis of overexpressed proteins, confirms that it is a universal labeling technique and is very sensitive when compared with Edman sequencing. Finally, we compared Edman sequencing and Nt-dimethylation of the same polypeptide fragments; results confirm that there is agreement in the identity of the Nt amino acid sequence between these 2 methods. PMID:27006647

  12. N-Terminal Amino Acid Sequence Determination of Proteins by N-Terminal Dimethyl Labeling: Pitfalls and Advantages When Compared with Edman Degradation Sequence Analysis

    PubMed Central

    Chang, Elizabeth; Pourmal, Sergei; Zhou, Chun; Kumar, Rupesh; Teplova, Marianna; Pavletich, Nikola P.; Marians, Kenneth J.

    2016-01-01

    In recent history, alternative approaches to Edman sequencing have been investigated, and to this end, the Association of Biomolecular Resource Facilities (ABRF) Protein Sequencing Research Group (PSRG) initiated studies in 2014 and 2015, looking into bottom-up and top-down N-terminal (Nt) dimethyl derivatization of standard quantities of intact proteins with the aim to determine Nt sequence information. We have expanded this initiative and used low picomole amounts of myoglobin to determine the efficiency of Nt-dimethylation. Application of this approach on protein domains, generated by limited proteolysis of overexpressed proteins, confirms that it is a universal labeling technique and is very sensitive when compared with Edman sequencing. Finally, we compared Edman sequencing and Nt-dimethylation of the same polypeptide fragments; results confirm that there is agreement in the identity of the Nt amino acid sequence between these 2 methods. PMID:27006647

  13. Protein sequence analysis by incorporating modified chaos game and physicochemical properties into Chou's general pseudo amino acid composition.

    PubMed

    Xu, Chunrui; Sun, Dandan; Liu, Shenghui; Zhang, Yusen

    2016-10-01

    In this contribution we introduced a novel graphical method to compare protein sequences. By mapping a protein sequence into 3D space based on codons and physicochemical properties of 20 amino acids, we are able to get a unique P-vector from the 3D curve. This approach is consistent with wobble theory of amino acids. We compute the distance between sequences by their P-vectors to measure similarities/dissimilarities among protein sequences. Finally, we use our method to analyze four datasets and get better results compared with previous approaches. PMID:27375218

  14. Covalent structure of human haptoglobin: a serine protease homolog.

    PubMed Central

    Kurosky, A; Barnett, D R; Lee, T H; Touchstone, B; Hay, R E; Arnott, M S; Bowman, B H; Fitch, W M

    1980-01-01

    The complete amino acid sequences and the disulfide arrangements of the two chains of human haptoglobin 1-1 were established. The alpha 1 and beta chains of haptoglobin contain 83 and 245 residues, respectively. Comparison of the primary structure of haptoglobin with that of the chymotrypsinogen family of serine proteases revealed a significant degree of chemical similarity. The probability was less than 10^-5 that the chemical similarity of the beta chain of haptoglobin to the proteases was due to chance. The amino acid sequence of the beta chain of haptoglobin is 29--33% identical to bovine trypsin, bovine chymotrypsin, porcine elastase, human thrombin, or human plasmin. Comparison of haptoglobin alpha 1 chain to activation peptide regions of the zymogens revealed an identity of 25% to the fifth "kringle" region of the activation peptide of plasminogen. The probability was less than 0.014 that this similarity was due to chance. These results strongly indicate haptoglobin to be a homolog of the chymotrypsinogen family of serine proteases. Alignment of the beta-chain sequence of haptoglobin to the serine proteases is remarkably consistent except for an insertion of 16 residues in the region corresponding to the methionyl loop of the serine proteases. The active-site residues typical of the serine proteases, histidine-57 and serine-195, are replaced in haptoglobin by lysine and alanine, respectively; however, aspartic acid-102 and the trypsin specificity residue, aspartic acid-189, do occur in haptoglobin. Haptoglobin and the serine proteases represent a striking example of homologous proteins with different biological functions. PMID:6997877

  15. beta-Hydroxyaspartic acid or beta-hydroxyasparagine in bovine low density lipoprotein receptor and in bovine thrombomodulin.

    PubMed

    Stenflo, J; Ohlin, A K; Owen, W G; Schneider, W J

    1988-01-01

    All of the vitamin K-dependent plasma proteins with domains that are homologous to the epidermal growth factor (EGF) precursor have 1 hydroxylated aspartic acid residue in the NH2-terminal EGF-homology region. In addition, protein S has 1 hydroxylated asparagine residue in each of the three COOH-terminal EGF-homology regions. All of these proteins have been found to have the amino acid sequence, CX(D or N)XXXX(F or Y)XCXC (corresponding to residues 20 to 33 in EGF), where the Asp or Asn residue is hydroxylated. This sequence also appears in two of the three EGF-homology regions of the human low density lipoprotein receptor and in two of the six EGF-homology regions of bovine thrombomodulin so far identified, suggesting that they may have the modified amino acid. We have now identified beta-hydroxyaspartic acid in acid hydrolysates of both these proteins. PMID:2826439
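
    The consensus quoted above, CX(D or N)XXXX(F or Y)XCXC, translates directly into a regular expression. The sketch below is only an illustration of scanning a one-letter protein sequence for that pattern; the function name and the demonstration sequence are made up, and a match shows only that the motif is present, not that the Asp or Asn is actually hydroxylated.

```python
import re

# C X (D or N) X X X X (F or Y) X C X C, where X is any residue.
# The lookahead lets overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=(C[A-Z][DN][A-Z]{4}[FY][A-Z]C[A-Z]C))")


def find_hydroxylation_motifs(sequence):
    """Return (motif_start, motif_text, asp_or_asn_position) for each hit.

    Positions are 1-based; the third residue of the motif is the candidate
    hydroxylated Asp/Asn.
    """
    hits = []
    for match in MOTIF.finditer(sequence.upper()):
        start = match.start(1)  # 0-based index of the leading Cys
        hits.append((start + 1, match.group(1), start + 3))
    return hits


# Made-up sequence containing one copy of the motif, for illustration only.
demo = "AAACGDGSQLFTCKCAAA"
print(find_hydroxylation_motifs(demo))  # [(4, 'CGDGSQLFTCKC', 6)]
```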

  16. Flexible mapping of homology onto structure with Homolmapper

    PubMed Central

    Rockwell, Nathan C; Lagarias, J Clark

    2007-01-01

    Background Over the past decade, a number of tools have emerged for the examination of homology relationships among protein sequences in a structural context. Most recent software implementations for such analysis are tied to specific molecular viewing programs, which can be problematic for collaborations involving multiple viewing environments. Incorporation into larger packages also adds complications for users interested in adding their own scoring schemes or in analyzing proteins incorporating unusual amino acid residues such as selenocysteine. Results We describe homolmapper, a command-line application for mapping information from a multiple protein sequence alignment onto a protein structure for analysis in the viewing software of the user's choice. Homolmapper is small (under 250 K for the application itself) and is written in Python to ensure portability. It is released for non-commercial use under a modified University of California BSD license. Homolmapper permits facile import of additional scoring schemes and can incorporate arbitrary additional amino acids to allow handling of residues such as selenocysteine or pyrrolysine. Homolmapper also provides tools for defining and analyzing subfamilies relative to a larger alignment, for mutual information analysis, and for rapidly visualizing the locations of mutations and multi-residue motifs. Conclusion Homolmapper is a useful tool for analysis of homology relationships among proteins in a structural context. There is also extensive, example-driven documentation available. More information about homolmapper is available at . PMID:17428344

  17. Nucleotide sequence of the phosphoglycerate kinase gene from the extreme thermophile Thermus thermophilus. Comparison of the deduced amino acid sequence with that of the mesophilic yeast phosphoglycerate kinase.

    PubMed Central

    Bowen, D; Littlechild, J A; Fothergill, J E; Watson, H C; Hall, L

    1988-01-01

    Using oligonucleotide probes derived from amino acid sequencing information, the structural gene for phosphoglycerate kinase from the extreme thermophile, Thermus thermophilus, was cloned in Escherichia coli and its complete nucleotide sequence determined. The gene consists of an open reading frame corresponding to a protein of 390 amino acid residues (calculated Mr 41,791) with an extreme bias for G or C (93.1%) in the codon third base position. Comparison of the deduced amino acid sequence with that of the corresponding mesophilic yeast enzyme indicated a number of significant differences. These are discussed in terms of the unusual codon bias and their possible role in enhanced protein thermal stability. PMID:3052437
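
    The third-position G or C bias reported above is a simple count over codons. A minimal sketch follows, assuming the open reading frame is supplied as an in-frame DNA string; the function name and the short test sequence are hypothetical, and the sequence is not the Thermus thermophilus gene.

```python
def gc3_fraction(orf):
    """Fraction of codons whose third base is G or C.

    The input is assumed to start in frame; trailing bases that do not
    complete a codon are ignored.
    """
    seq = orf.upper()
    usable = len(seq) - len(seq) % 3   # drop any incomplete final codon
    third_bases = seq[2:usable:3]      # every third base, starting at index 2
    if not third_bases:
        raise ValueError("need at least one complete codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Toy ORF: ATG GCC AAG TAA has third bases G, C, G, A.
print(gc3_fraction("ATGGCCAAGTAA"))  # 0.75
```

    For scale, a value of 93.1% over 390 codons corresponds to roughly 363 codons ending in G or C, assuming the stop codon is not counted.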

  18. Bacteria obtained from a sequencing batch reactor that are capable of growth on dehydroabietic acid.

    PubMed Central

    Mohn, W W

    1995-01-01

    Eleven isolates capable of growth on the resin acid dehydroabietic acid (DhA) were obtained from a sequencing batch reactor designed to treat a high-strength process stream from a paper mill. The isolates belonged to two groups, represented by strains DhA-33 and DhA-35, which were characterized. In the bioreactor, bacteria like DhA-35 were more abundant than those like DhA-33. The population in the bioreactor of organisms capable of growth on DhA was estimated to be 1.1 x 10^6 propagules per ml, based on a most-probable-number determination. Analysis of small-subunit rRNA partial sequences indicated that DhA-33 was most closely related to Sphingomonas yanoikuyae (Sab = 0.875) and that DhA-35 was most closely related to Zoogloea ramigera (Sab = 0.849). Both isolates additionally grew on other abietanes, i.e., abietic and palustric acids, but not on the pimaranes, pimaric and isopimaric acids. For DhA-33 and DhA-35 with DhA as the sole organic substrate, doubling times were 2.7 and 2.2 h, respectively, and growth yields were 0.30 and 0.25 g of protein per g of DhA, respectively. Glucose as a cosubstrate stimulated growth of DhA-33 on DhA and stimulated DhA degradation by the culture. Pyruvate as a cosubstrate did not stimulate growth of DhA-35 on DhA and reduced the specific rate of DhA degradation of the culture. DhA induced DhA and abietic acid degradation activities in both strains, and these activities were heat labile. Cell suspensions of both strains consumed DhA at a rate of 6 µmol mg of protein^-1 h^-1. (ABSTRACT TRUNCATED AT 250 WORDS) PMID:7793937
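
    The doubling times quoted above can be converted to specific growth rates with the standard exponential-growth relation, mu = ln(2) / t_d. The sketch below assumes simple exponential growth; the function name is hypothetical and the resulting rates are derived here, not reported in the abstract.

```python
import math


def specific_growth_rate(doubling_time_h):
    """Specific growth rate (per hour) for exponential growth: ln(2) / t_d."""
    if doubling_time_h <= 0:
        raise ValueError("doubling time must be positive")
    return math.log(2) / doubling_time_h


# Doubling times from the abstract, with DhA as the sole organic substrate.
for strain, t_d in (("DhA-33", 2.7), ("DhA-35", 2.2)):
    print(f"{strain}: mu = {specific_growth_rate(t_d):.3f} per hour")
# DhA-33: mu = 0.257 per hour
# DhA-35: mu = 0.315 per hour
```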

  19. Nucleic and amino acid sequences relating to a novel transketolase, and methods for the expression thereof

    DOEpatents

    Croteau, Rodney Bruce; Wildung, Mark Raymond; Lange, Bernd Markus; McCaskill, David G.

    2001-01-01

    cDNAs encoding 1-deoxyxylulose-5-phosphate synthase from peppermint (Mentha piperita) have been isolated and sequenced, and the corresponding amino acid sequences have been determined. Accordingly, isolated DNA sequences (SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7) are provided which code for the expression of 1-deoxyxylulose-5-phosphate synthase from plants. In another aspect the present invention provides for isolated, recombinant DXPS proteins, such as the proteins having the sequences set forth in SEQ ID NO:4, SEQ ID NO:6 and SEQ ID NO:8. In other aspects, replicable recombinant cloning vehicles are provided which code for plant 1-deoxyxylulose-5-phosphate synthases, or for a base sequence sufficiently complementary to at least a portion of 1-deoxyxylulose-5-phosphate synthase DNA or RNA to enable hybridization therewith. In yet other aspects, modified host cells are provided that have been transformed, transfected, infected and/or injected with a recombinant cloning vehicle and/or DNA sequence encoding a plant 1-deoxyxylulose-5-phosphate synthase. Thus, systems and methods are provided for the recombinant expression of the aforementioned recombinant 1-deoxyxylulose-5-phosphate synthase that may be used to facilitate its production, isolation and purification in significant amounts. Recombinant 1-deoxyxylulose-5-phosphate synthase may be used to obtain expression or enhanced expression of 1-deoxyxylulose-5-phosphate synthase in plants in order to enhance the production of 1-deoxyxylulose-5-phosphate, or its derivatives such as isopentenyl diphosphate (IPP), or may be otherwise employed for the regulation or expression of 1-deoxyxylulose-5-phosphate synthase, or the production of its products.

  20. Genome Sequence Analysis of the Naphthenic Acid Degrading and Metal Resistant Bacterium Cupriavidus gilardii CR3.

    PubMed

    Wang, Xiaoyu; Chen, Meili; Xiao, Jingfa; Hao, Lirui; Crowley, David E; Zhang, Zhewen; Yu, Jun; Huang, Ning; Huo, Mingxin; Wu, Jiayan

    2015-01-01

    Cupriavidus sp. are generally heavy metal tolerant bacteria with the ability to degrade a variety of aromatic hydrocarbon compounds, although the degradation pathways and substrate versatilities remain largely unknown. Here we studied the bacterium Cupriavidus gilardii strain CR3, which was isolated from a natural asphalt deposit, and which was shown to utilize naphthenic acids as a sole carbon source. Genome sequencing of C. gilardii CR3 was carried out to elucidate possible mechanisms for the naphthenic acid biodegradation. The genome of C. gilardii CR3 was composed of two circular chromosomes chr1 and chr2 of respectively 3,539,530 bp and 2,039,213 bp in size. The genome for strain CR3 encoded 4,502 putative protein-coding genes, 59 tRNA genes, and many other non-coding genes. Many genes were associated with xenobiotic biodegradation and metal resistance functions. Pathway prediction for degradation of cyclohexanecarboxylic acid, a representative naphthenic acid, suggested that naphthenic acid undergoes initial ring-cleavage, after which the ring fission products can be degraded via several plausible degradation pathways including a mechanism similar to that used for fatty acid oxidation. The final metabolic products of these pathways are unstable or volatile compounds that were not toxic to CR3. Strain CR3 was also shown to have tolerance to at least 10 heavy metals, which was mainly achieved by self-detoxification through ion efflux, metal-complexation and metal-reduction, and a powerful DNA self-repair mechanism. Our genomic analysis suggests that CR3 is well adapted to survive the harsh environment in natural asphalts containing naphthenic acids and high concentrations of heavy metals. PMID:26301592

  1. Genome Sequence Analysis of the Naphthenic Acid Degrading and Metal Resistant Bacterium Cupriavidus gilardii CR3

    PubMed Central

    Xiao, Jingfa; Hao, Lirui; Crowley, David E.; Zhang, Zhewen; Yu, Jun; Huang, Ning; Huo, Mingxin; Wu, Jiayan

    2015-01-01

    Cupriavidus sp. are generally heavy metal tolerant bacteria with the ability to degrade a variety of aromatic hydrocarbon compounds, although the degradation pathways and substrate versatilities remain largely unknown. Here we studied the bacterium Cupriavidus gilardii strain CR3, which was isolated from a natural asphalt deposit, and which was shown to utilize naphthenic acids as a sole carbon source. Genome sequencing of C. gilardii CR3 was carried out to elucidate possible mechanisms for the naphthenic acid biodegradation. The genome of C. gilardii CR3 was composed of two circular chromosomes chr1 and chr2 of respectively 3,539,530 bp and 2,039,213 bp in size. The genome for strain CR3 encoded 4,502 putative protein-coding genes, 59 tRNA genes, and many other non-coding genes. Many genes were associated with xenobiotic biodegradation and metal resistance functions. Pathway prediction for degradation of cyclohexanecarboxylic acid, a representative naphthenic acid, suggested that naphthenic acid undergoes initial ring-cleavage, after which the ring fission products can be degraded via several plausible degradation pathways including a mechanism similar to that used for fatty acid oxidation. The final metabolic products of these pathways are unstable or volatile compounds that were not toxic to CR3. Strain CR3 was also shown to have tolerance to at least 10 heavy metals, which was mainly achieved by self-detoxification through ion efflux, metal-complexation and metal-reduction, and a powerful DNA self-repair mechanism. Our genomic analysis suggests that CR3 is well adapted to survive the harsh environment in natural asphalts containing naphthenic acids and high concentrations of heavy metals. PMID:26301592

  2. Bile acid sulfotransferase I from rat liver sulfates bile acids and 3-hydroxy steroids: purification, N-terminal amino acid sequence, and kinetic properties.

    PubMed

    Barnes, S; Buchina, E S; King, R J; McBurnett, T; Taylor, K B

    1989-04-01

    A bile acid:3'phosphoadenosine-5'phosphosulfate:sulfotransferase (BAST I) from adult female rat liver cytosol has been purified 157-fold by a two-step isolation procedure. The N-terminal amino acid sequence of the 30,000 subunit has been determined for the first 35 residues. The Vmax of purified BAST I is 18.7 nmol/min per mg protein with N-(3-hydroxy-5 beta-cholanoyl)glycine (glycolithocholic acid) as substrate, comparable to that of the corresponding purified human BAST (Chen, L-J., and I. H. Segel, 1985. Arch. Biochem. Biophys. 241: 371-379). BAST I activity has a broad pH optimum from 5.5-7.5. Although maximum activity occurs with 5 mM MgCl2, Mg2+ is not essential for BAST I activity. The greatest sulfotransferase activity and the highest substrate affinity is observed with bile acids or steroids that have a steroid nucleus containing a 3 beta-hydroxy group and a 5-6 double bond or a trans A-B ring junction. These substrates have normal hyperbolic initial velocity curves with substrate inhibition occurring above 5 microM. Of the saturated 5 beta-bile acids, those with a single 3-hydroxy group are the most active. The addition of a second hydroxy group at the 6- or 7-position eliminates more than 99% of the activity. In contrast, 3 alpha,12 alpha-dihydroxy-5 beta-cholan-24-oic acid (deoxycholic acid) is an excellent substrate. The initial velocity curves for glycolithocholic and deoxycholic acid conjugates are sigmoidal rather than hyperbolic, suggestive of an allosteric effect. Maximum activity is observed at 80 microM for glycolithocholic acid. All substrates, bile acids and steroids, are inhibited by the 5 beta-bile acid, 3-keto-5 beta-cholanoic acid. The data suggest that BAST I is the same protein as hydrosteroid sulfotransferase 2 (Marcus, C. J., et al. 1980. Anal. Biochem. 107: 296-304). PMID:2754334

  3. Three Surface Layer Homology Domains at the N Terminus of the Clostridium cellulovorans Major Cellulosomal Subunit EngE

    PubMed Central

    Tamaru, Yutaka; Doi, Roy H.

    1999-01-01

    The gene engE, coding for endoglucanase E, one of the three major subunits of the Clostridium cellulovorans cellulosome, has been isolated and sequenced. engE is comprised of an open reading frame (ORF) of 3,090 bp and encodes a protein of 1,030 amino acids with a molecular weight of 111,796. The amino acid sequence derived from engE revealed a structure consisting of catalytic and noncatalytic domains. The N-terminal-half region of EngE consisted of a signal peptide of 31 amino acid residues and three repeated surface layer homology (SLH) domains, which were highly conserved and homologous to an S-layer protein from the gram-negative bacterium Caulobacter crescentus. The C-terminal-half region, which is necessary for the enzymatic function of EngE and for binding of EngE to the scaffolding protein CbpA, consisted of a catalytic domain homologous to that of family 5 of the glycosyl hydrolases, a domain of unknown function, and a duplicated sequence (DS or dockerin) at its C terminus. engE is located downstream of an ORF, ORF1, that is homologous to the Bacillus subtilis phosphomethylpyrimidine kinase (pmk) gene. The unique presence of three SLH domains and a DS suggests that EngE is capable of binding both to CbpA to form a CbpA-EngE cellulosome complex and to the surface layer of C. cellulovorans. PMID:10322032
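The sequence figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, the abstract does not say whether the 3,090 bp include the stop codon, and the mature-protein length assumes the 31-residue signal peptide is cleaved off, which the abstract does not state.

```python
# Figures quoted in the EngE abstract above.
ORF_LENGTH_BP = 3090
PROTEIN_LENGTH_AA = 1030
MOLECULAR_WEIGHT_DA = 111_796
SIGNAL_PEPTIDE_AA = 31


def codons_in_orf(length_bp: int) -> int:
    """Return the number of complete codons in an ORF of the given length."""
    if length_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length_bp // 3


codons = codons_in_orf(ORF_LENGTH_BP)

# Average mass per residue implied by the quoted molecular weight.
mean_residue_mass = MOLECULAR_WEIGHT_DA / PROTEIN_LENGTH_AA

# Assumption: the signal peptide is removed from the mature protein.
mature_length = PROTEIN_LENGTH_AA - SIGNAL_PEPTIDE_AA

print(codons, round(mean_residue_mass, 1), mature_length)
```

The codon count matches the reported protein length, and the implied mean residue mass falls in the range usually expected for proteins (roughly 110 Da per residue).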

  4. Sequence-defined bioactive macrocycles via an acid-catalysed cascade reaction

    NASA Astrophysics Data System (ADS)

    Porel, Mintu; Thornlow, Dana N.; Phan, Ngoc N.; Alabi, Christopher A.

    2016-06-01

    Synthetic macrocycles derived from sequence-defined oligomers are a unique structural class whose ring size, sequence and structure can be tuned via precise organization of the primary sequence. Similar to peptides and other peptidomimetics, these well-defined synthetic macromolecules become pharmacologically relevant when bioactive side chains are incorporated into their primary sequence. In this article, we report the synthesis of oligothioetheramide (oligoTEA) macrocycles via a one-pot acid-catalysed cascade reaction. The versatility of the cyclization chemistry and modularity of the assembly process was demonstrated via the synthesis of >20 diverse oligoTEA macrocycles. Structural characterization via NMR spectroscopy revealed the presence of conformational isomers, which enabled the determination of local chain dynamics within the macromolecular structure. Finally, we demonstrate the biological activity of oligoTEA macrocycles designed to mimic facially amphiphilic antimicrobial peptides. The preliminary results indicate that macrocyclic oligoTEAs with just two-to-three cationic charge centres can elicit potent antibacterial activity against Gram-positive and Gram-negative bacteria.

  5. Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

    DOEpatents

    Weier, H.U.G.; Gray, J.W.

    1995-06-27

    A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity. 18 figs.

  6. Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

    DOEpatents

    Weier, Heinz-Ulrich G.; Gray, Joe W.

    1995-01-01

    A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity.

  7. Detection of Nucleic Acids with Graphene Nanopores: Ab Initio Characterization of a Novel Sequencing Device

    NASA Astrophysics Data System (ADS)

    Nelson, Tammie; Zhang, Bo; Prezhdo, Oleg

    2010-03-01

    We report an ab initio study of the interaction of two nucleobases, cytosine and adenine, with a novel graphene nanopore device for detecting the base sequence of a single-stranded nucleic acid (ssDNA or RNA). The nucleobases were inserted into a pore in a graphene nanoribbon, and the electrical current and conductance spectra were calculated as functions of voltage applied across the nanoribbon. The conductance spectra and charge densities were analyzed in the presence of each nucleobase in the graphene nanopore. The results indicate that, due to significant differences in the conductance spectra, the proposed device has adequate sensitivity to discriminate between different nucleotides. Moreover, we show that the conductance spectrum of a nucleotide is not affected by its orientation inside the graphene nanopore. The proposed technique may be extremely useful for real applications in developing ultrafast, low cost DNA sequencing methods.

  8. Morphological transformation of calcite crystal growth by prismatic "acidic" polypeptide sequences.

    SciTech Connect

    Kim, I; Giocondi, J L; Orme, C A; Collino, J; Evans, J S

    2007-02-13

    Many of the interesting mechanical and materials properties of the mollusk shell are thought to stem from the prismatic calcite crystal assemblies within this composite structure. It is now evident that proteins play a major role in the formation of these assemblies. Recently, a superfamily of 7 conserved prismatic layer-specific mollusk shell proteins, Asprich, were sequenced, and the 42 AA C-terminal sequence region of this protein superfamily was found to introduce surface voids or porosities on calcite crystals in vitro. Using AFM imaging techniques, we further investigate the effect that this 42 AA domain (Fragment-2) and its constituent subdomains, DEAD-17 and Acidic-2, have on the morphology and growth kinetics of calcite dislocation hillocks. We find that Fragment-2 adsorbs on terrace surfaces and pins acute steps, accelerates then decelerates the growth of obtuse steps, forms clusters and voids on terrace surfaces, and transforms calcite hillock morphology from a rhombohedral form to a rounded one. These results mirror yet are distinct from some of the earlier findings obtained for nacreous polypeptides. The subdomains Acidic-2 and DEAD-17 were found to accelerate then decelerate obtuse steps and induce oval rather than rounded hillock morphologies. Unlike DEAD-17, Acidic-2 does form clusters on terrace surfaces and exhibits stronger obtuse velocity inhibition effects than either DEAD-17 or Fragment-2. Interestingly, a 1:1 mixture of both subdomains induces an irregular polygonal morphology to hillocks, and exhibits the highest degree of acute step pinning and obtuse step velocity inhibition. This suggests that there is some interplay between subdomains within an intra (Fragment-2) or intermolecular (1:1 mixture) context, and sequence interplay phenomena may be employed by biomineralization proteins to exert net effects on crystal growth and morphology.

  9. Structures of Arg- and Gln-type bacterial cysteine dioxygenase homologs: Arg- and Gln-type Bacterial CDO Homologs

    DOE PAGESBeta

    Driggers, Camden M.; Hartman, Steven J.; Karplus, P. Andrew

    2015-01-01

    In some bacteria, cysteine is converted to cysteine sulfinic acid by cysteine dioxygenases (CDO) that are only ~15–30% identical in sequence to mammalian CDOs. Among bacterial proteins having this range of sequence similarity to mammalian CDO are some that conserve an active site Arg residue (“Arg-type” enzymes) and some having a Gln substituted for this Arg (“Gln-type” enzymes). Here, we describe a structure from each of these enzyme types by analyzing structures originally solved by structural genomics groups but not published: a Bacillus subtilis “Arg-type” enzyme that has cysteine dioxygenase activity (BsCDO), and a Ralstonia eutropha “Gln-type” CDO homolog of uncharacterized activity (ReCDOhom). The BsCDO active site is well conserved with mammalian CDO, and a cysteine complex captured in the active site confirms that the cysteine binding mode is also similar. The ReCDOhom structure reveals a new active site Arg residue that is hydrogen bonding to an iron-bound diatomic molecule we have interpreted as dioxygen. Notably, the Arg position is not compatible with the mode of Cys binding seen in both rat CDO and BsCDO. As sequence alignments show that this newly discovered active site Arg is well conserved among “Gln-type” CDO enzymes, we conclude that the “Gln-type” CDO homologs are not authentic CDOs but will have substrate specificity more similar to 3-mercaptopropionate dioxygenases.

  10. Structures of Arg- and Gln-type bacterial cysteine dioxygenase homologs: Arg- and Gln-type Bacterial CDO Homologs

    SciTech Connect

    Driggers, Camden M.; Hartman, Steven J.; Karplus, P. Andrew

    2015-01-01

    In some bacteria, cysteine is converted to cysteine sulfinic acid by cysteine dioxygenases (CDO) that are only ~15–30% identical in sequence to mammalian CDOs. Among bacterial proteins having this range of sequence similarity to mammalian CDO are some that conserve an active site Arg residue (“Arg-type” enzymes) and some having a Gln substituted for this Arg (“Gln-type” enzymes). Here, we describe a structure from each of these enzyme types by analyzing structures originally solved by structural genomics groups but not published: a Bacillus subtilis “Arg-type” enzyme that has cysteine dioxygenase activity (BsCDO), and a Ralstonia eutropha “Gln-type” CDO homolog of uncharacterized activity (ReCDOhom). The BsCDO active site is well conserved with mammalian CDO, and a cysteine complex captured in the active site confirms that the cysteine binding mode is also similar. The ReCDOhom structure reveals a new active site Arg residue that is hydrogen bonding to an iron-bound diatomic molecule we have interpreted as dioxygen. Notably, the Arg position is not compatible with the mode of Cys binding seen in both rat CDO and BsCDO. As sequence alignments show that this newly discovered active site Arg is well conserved among “Gln-type” CDO enzymes, we conclude that the “Gln-type” CDO homologs are not authentic CDOs but will have substrate specificity more similar to 3-mercaptopropionate dioxygenases.

  11. Correlation between the presence of sequences homologous to the vir region of Salmonella dublin plasmid pSDL2 and the virulence of twenty-two Salmonella serotypes in mice.

    PubMed Central

    Roudier, C; Krause, M; Fierer, J; Guiney, D G

    1990-01-01

    Large plasmids encoding important virulence properties have been found in several Salmonella serotypes. We have studied the relationship between the presence of a highly conserved 4-kilobase (kb) EcoRI fragment from the plasmid virulence region and pathogenicity for mice of 53 isolates representing 22 serotypes of Salmonella. Only strains possessing the homologous 4-kb region were virulent for mice. In addition, we transferred the virulence plasmid from S. dublin into nine different serotypes, including S. typhi and S. paratyphi A, that lack a native virulence plasmid. Only S. heidelberg and S. newport were rendered mouse virulent by the introduction of the S. dublin plasmid. This study demonstrates that plasmid-mediated virulence sequences are required for Salmonella virulence in mice, but many strains, including the agents of human typhoid fever, also lack chromosomal genes necessary to produce lethal systemic disease in mice. Since all the major Salmonella strains that are host-adapted to animals carry virulence plasmids, it appears that these plasmids are important in mediating systemic infection in animals and may contribute to septicemic, nontyphoid salmonellosis in humans. PMID:2323813

  12. Large and small subunits of the Aujeszky's disease virus ribonucleotide reductase: nucleotide sequence and putative structure.

    PubMed

    Kaliman, A V; Boldogköi, Z; Fodor, I

    1994-09-13

    We determined the entire DNA sequence of two adjacent open reading frames of Aujeszky's disease virus encoding ribonucleotide reductase genes, separated by an intergenic sequence of 9 bp. From the sequence analysis we deduce that the ORFs encode large and small subunits, with sizes of 835 and 303 amino acids, respectively. Amino acid sequence comparison of ADV RR2 with that of equine herpesvirus type 1, bovine herpesvirus type 1, HSV-1 and varicella zoster virus revealed that 48% of amino acids represent clusters of residues conserved in all compared sequences. In the N-terminal part, ADV RR1 shows low homology to the RR1 of other herpesviruses. The rest of the RR1 protein contains highly conserved amino acid sequences divided by blocks of low homology. PMID:8086454

  13. Fast computational methods for predicting protein structure from primary amino acid sequence

    DOEpatents

    Agarwal, Pratul Kumar

    2011-07-19

    The present invention provides a method utilizing primary amino acid sequence of a protein, energy minimization, molecular dynamics and protein vibrational modes to predict three-dimensional structure of a protein. The present invention also determines possible intermediates in the protein folding pathway. The present invention has important applications to the design of novel drugs as well as protein engineering. The present invention predicts the three-dimensional structure of a protein independent of size of the protein, overcoming a significant limitation in the prior art.

  14. Purification and amino acid sequence of aminopeptidase P from pig kidney.

    PubMed

    Vergas Romero, C; Neudorfer, I; Mann, K; Schäfer, W

    1995-04-01

    Aminopeptidase P from kidney cortex was purified in high yield (recovery greater than or equal to 20%) by a series of column chromatographic steps after solubilization of the membrane-bound glycoprotein with n-butanol. A coupled enzymic assay, using Gly-Pro-Pro-NH-Nap as substrate and dipeptidyl-peptidase IV as auxiliary enzyme, was used to monitor the purification. The purification procedure yielded two forms of aminopeptidase P differing in their carbohydrate composition (glycoforms). Both enzyme preparations were homogeneous as assessed by SDS/PAGE silver staining, and isoelectric focusing. Both forms possessed the same substrate specificity, catalysed the same reaction, and consisted of identical protein chains. The amino acid sequence determined by Edman degradation and mass spectrometry consisted of 623 amino acids. Six N-glycosylation sites, all contained in the N-terminal half of the protein, were characterized. PMID:7744038

  15. Draft Genome Sequence of Cupriavidus sp. Strain SK-3, a 4-Chlorobiphenyl- and 4-Chlorobenzoic Acid-Degrading Bacterium

    PubMed Central

    Vilo, Claudia; Benedik, Michael J.; Ilori, Matthew

    2014-01-01

    We report the draft genome sequence of Cupriavidus sp. strain SK-3, which can use 4-chlorobiphenyl and 4-chlorobenzoic acid as the sole carbon source for growth. The draft genome sequence allowed the study of the polychlorinated biphenyl degradation mechanism and the recharacterization of the strain SK-3 as a Cupriavidus species. PMID:24994805

  16. Draft Genome Sequence of Bacillus subtilis subsp. natto Strain CGMCC 2108, a High Producer of Poly-γ-Glutamic Acid

    PubMed Central

    Tan, Siyuan; Su, Anping; Zhang, Chen; Ren, Yuanyuan

    2016-01-01

    Here, we report the 4.1-Mb draft genome sequence of Bacillus subtilis subsp. natto strain CGMCC 2108, a high producer of poly-γ-glutamic acid (γ-PGA). This sequence will provide further help for the biosynthesis of γ-PGA and will greatly facilitate research efforts in metabolic engineering of B. subtilis subsp. natto strain CGMCC 2108. PMID:27231363

  17. New monoclonal antibodies to the Ebola virus glycoprotein: Identification and analysis of the amino acid sequence of the variable domains.

    PubMed

    Panina, A A; Aliev, T K; Shemchukova, O B; Dement'yeva, I G; Varlamov, N E; Pozdnyakova, L P; Bokov, M N; Dolgikh, D A; Sveshnikov, P G; Kirpichnikov, M P

    2016-03-01

    We determined the nucleotide and amino acid sequences of variable domains of three new monoclonal antibodies to the glycoprotein of Ebola virus capsid. The framework and hypervariable regions of immunoglobulin heavy and light chains were identified. The primary structures were confirmed using mass spectrometry analysis. Immunoglobulin database search showed the uniqueness of the sequences obtained. PMID:27193713

  18. Genome Sequence of the Lactic Acid Bacterium Lactococcus lactis subsp. lactis TOMSC161, Isolated from a Nonscalded Curd Pressed Cheese

    PubMed Central

    Velly, H.; Abraham, A.-L.; Loux, V.; Delacroix-Buchet, A.; Fonseca, F.; Bouix, M.

    2014-01-01

    Lactococcus lactis is a lactic acid bacterium used in the production of many fermented foods, such as dairy products. Here, we report the genome sequence of L. lactis subsp. lactis TOMSC161, isolated from nonscalded curd pressed cheese. This genome sequence provides information in relation to dairy environment adaptation. PMID:25377704

  19. Draft Genome Sequence of Bacillus subtilis subsp. natto Strain CGMCC 2108, a High Producer of Poly-γ-Glutamic Acid.

    PubMed

    Tan, Siyuan; Meng, Yonghong; Su, Anping; Zhang, Chen; Ren, Yuanyuan

    2016-01-01

    Here, we report the 4.1-Mb draft genome sequence of Bacillus subtilis subsp. natto strain CGMCC 2108, a high producer of poly-γ-glutamic acid (γ-PGA). This sequence will provide further help for the biosynthesis of γ-PGA and will greatly facilitate research efforts in metabolic engineering of B. subtilis subsp. natto strain CGMCC 2108. PMID:27231363

  20. Amino Acid Substitutions in Homologs of the STAY-GREEN Protein Are Responsible for the green-flesh and chlorophyll retainer Mutations of Tomato and Pepper

    PubMed Central

    Barry, Cornelius S.; McQuinn, Ryan P.; Chung, Mi-Young; Besuden, Anna; Giovannoni, James J.

    2008-01-01

    Color changes often accompany the onset of ripening, leading to brightly colored fruits that serve as attractants to seed-dispersing organisms. In many fruits, including tomato (Solanum lycopersicum) and pepper (Capsicum annuum), there is a sharp decrease in chlorophyll content and a concomitant increase in the synthesis of carotenoids as a result of the conversion of chloroplasts into chromoplasts. The green-flesh (gf) and chlorophyll retainer (cl) mutations of tomato and pepper, respectively, are inhibited in their ability to degrade chlorophyll during ripening, leading to the production of ripe fruits characterized by both chlorophyll and carotenoid accumulation and are thus brown in color. Using a positional cloning approach, we have identified a point mutation at the gf locus that causes an amino acid substitution in an invariant residue of a tomato homolog of the STAY-GREEN (SGR) protein of rice (Oryza sativa). Similarly, the cl mutation also carries an amino acid substitution at an invariant residue in a pepper homolog of SGR. Both GF and CL expression are highly induced at the onset of fruit ripening, coincident with the ripening-associated decline in chlorophyll. Phylogenetic analysis indicates that there are two distinct groups of SGR proteins in plants. The SGR subfamily is required for chlorophyll degradation and operates through an unknown mechanism. A second subfamily, which we have termed SGR-like, has an as-yet undefined function. PMID:18359841

  1. Homology, convergence and parallelism.

    PubMed

    Ghiselin, Michael T

    2016-01-01

    Homology is a relation of correspondence between parts of parts of larger wholes. It is used when tracking objects of interest through space and time and in the context of explanatory historical narratives. Homologues can be traced through a genealogical nexus back to a common ancestral precursor. Homology being a transitive relation, homologues remain homologous however much they may come to differ. Analogy is a relationship of correspondence between parts of members of classes having no relationship of common ancestry. Although homology is often treated as an alternative to convergence, the latter is not a kind of correspondence: rather, it is one of a class of processes that also includes divergence and parallelism. These often give rise to misleading appearances (homoplasies). Parallelism can be particularly hard to detect, especially when not accompanied by divergences in some parts of the body. PMID:26598721

  2. ANTICALIgN: visualizing, editing and analyzing combined nucleotide and amino acid sequence alignments for combinatorial protein engineering.

    PubMed

    Jarasch, Alexander; Kopp, Melanie; Eggenstein, Evelyn; Richter, Antonia; Gebauer, Michaela; Skerra, Arne

    2016-07-01

    ANTICALIgN is an interactive software developed to simultaneously visualize, analyze and modify alignments of DNA and/or protein sequences that arise during combinatorial protein engineering, design and selection. ANTICALIgN combines powerful functions known from currently available sequence analysis tools with unique features for protein engineering, in particular the possibility to display and manipulate nucleotide sequences and their translated amino acid sequences at the same time. ANTICALIgN offers both template-based multiple sequence alignment (MSA), using the unmutated protein as reference, and conventional global alignment, to compare sequences that share an evolutionary relationship. The application of similarity-based clustering algorithms facilitates the identification of duplicates or of conserved sequence features among a set of selected clones. Imported nucleotide sequences from DNA sequence analysis are automatically translated into the corresponding amino acid sequences and displayed, offering numerous options for selecting reading frames, highlighting of sequence features and graphical layout of the MSA. The MSA complexity can be reduced by hiding the conserved nucleotide and/or amino acid residues, thus putting emphasis on the relevant mutated positions. ANTICALIgN is also able to handle suppressed stop codons or even to incorporate non-natural amino acids into a coding sequence. We demonstrate crucial functions of ANTICALIgN in an example of Anticalins selected from a lipocalin random library against the fibronectin extradomain B (ED-B), an established marker of tumor vasculature. Apart from engineered protein scaffolds, ANTICALIgN provides a powerful tool in the area of antibody engineering and for directed enzyme evolution. PMID:27261456
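The abstract describes translating imported nucleotide sequences in a selectable reading frame and handling suppressed stop codons. The following is a minimal sketch of that general idea using the standard genetic code; it is not taken from the software described above, and the function name `translate`, its parameters and the example sequences are all hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0, suppressed=None):
    """Translate a DNA string in reading frame 0, 1 or 2.

    `suppressed` optionally maps stop codons to the residue inserted by a
    suppressor tRNA. Unknown codons become 'X'; a trailing partial codon
    is ignored.
    """
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1 or 2")
    overrides = suppressed or {}
    seq = dna.upper()
    residues = []
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        residues.append(overrides.get(codon, CODON_TABLE.get(codon, "X")))
    return "".join(residues)


print(translate("ATGGCCAAATAA"))                      # frame 0
print(translate("ATGGCCAAATAA", frame=1))             # shifted by one base
print(translate("ATGTAGAAA", suppressed={"TAG": "Q"}))  # amber codon read as Gln
```

Passing the suppression map as an argument keeps the codon table itself unchanged, so the same function covers both ordinary and suppressor-strain translation.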

  3. Formation Sequences of Iron Minerals in the Acidic Alteration Products and Variation of Hydrothermal Fluid Conditions

    NASA Astrophysics Data System (ADS)

    Isobe, H.; Yoshizawa, M.

    2008-12-01

    Iron minerals play an important role in environmental issues not only on Earth but also on other terrestrial planets. Iron mineral species related to alteration products of primary minerals with surface or subsurface fluids are characterized by temperature, acidity and redox conditions of the fluids. Various iron-bearing alteration products can be seen around fumaroles in geothermal/volcanic areas. In this study, zonal structures of iron minerals in alteration products of the geothermal area are observed to elucidate temporal and spatial variation of hydrothermal fluids. Acidic hydrothermal fluid alters the pyroxene-amphibole andesite of Garan-dake volcano, Oita, Japan, leaching out elements other than Si to form cristobalite. Hand specimens with unaltered or weakly altered core and cristobalite crust show various sequences of layers. XRD analysis revealed that the alteration degree is represented by abundance of cristobalite. Intermediately altered layers are characterized by the occurrence of alunite, pyrite, kaolinite, goethite and hematite. A specimen with reddish brown core surrounded by cristobalite-rich white crust has brown colored layers at the boundary of the core and the crust. The reddish core is characterized by occurrence of crystalline hematite by XRD. Another hand specimen has a light gray core, which represents reduced conditions, and white cristobalite crust with light brown and reddish brown layers of ferric iron minerals between the core and the crust. On the other hand, hornblende crystals, a typical ferrous iron-bearing mineral of the host rock, are well preserved in some samples with strongly decolorized cristobalite-rich groundmass. Hydrothermal alteration experiments of iron-rich basaltic material show that iron mineral species depend on acidity and temperature of the fluid. Oxidation states of the iron-bearing mineral species are strongly influenced by the acidity and redox conditions.
Variations of alteration

  4. Identification of Tuber borchii Vittad. mycelium proteins separated by two-dimensional polyacrylamide gel electrophoresis using amino acid analysis and sequence tagging.

    PubMed

    Vallorani, L; Bernardini, F; Sacconi, C; Pierleoni, R; Pieretti, B; Piccoli, G; Buffalini, M; Stocchi, V

    2000-11-01

    This paper reports the first results in the proteome analysis of Tuber borchii Vittad. mycelium, an ectomycorrhizal fungus poorly defined genetically, but known for its generation of edible fruit bodies known as white truffles. Employing isoelectric focusing on immobilized pH gradients, followed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis, we obtained an electropherogram presenting over 800 spots within the window of isoelectric points (pI) 3.5-9 and a molecular mass of 10-200 kDa. Different reducing agents were tested in the sample preparation buffers, and the standard lysis buffer plus 2% w/v polyvinylpolypyrrolidone allowed the best solubilization and resolution of the proteins. The T. borchii proteins separated in micropreparative gels were electroblotted onto polyvinylidene difluoride membranes and visualized by Coomassie staining. Twenty-three proteins were excised and analyzed by the combination of amino acid and N-terminal analysis. One protein was identified by matching its amino acid composition, estimated isoelectric point and molecular mass against the SWISS-PROT and EMBL databases. Four spots were successfully tagged by Edman microsequencing but no homologous sequences were found in databases. PMID:11271490

  5. Draft Genome Sequences of Gluconobacter cerinus CECT 9110 and Gluconobacter japonicus CECT 8443, Acetic Acid Bacteria Isolated from Grape Must

    PubMed Central

    Sainz, Florencia

    2016-01-01

    We report here the draft genome sequences of Gluconobacter cerinus strain CECT 9110 and Gluconobacter japonicus CECT 8443, acetic acid bacteria isolated from grape must. Gluconobacter species are well known for their ability to oxidize sugar alcohols into the corresponding acids. Our objective was to select strains that oxidize D-glucose effectively. PMID:27365351

  6. Identification of SHIP-1 and SHIP-2 homologs in channel catfish, Ictalurus punctatus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Src homology domain 2 (SH2) domain-containing inositol 5’-phosphatases (SHIP) proteins have diverse roles in signal transduction. SHIP-1 and SHIP-2 homologs were identified in channel catfish, Ictalurus punctatus, based on sequence homology to murine and human SHIP sequences. Full-length cDNAs for ...

  7. Swfoldrate: predicting protein folding rates from amino acid sequence with sliding window method.

    PubMed

    Cheng, Xiang; Xiao, Xuan; Wu, Zhi-cheng; Wang, Pu; Lin, Wei-zhong

    2013-01-01

    Protein folding is the process by which a protein proceeds from its denatured state to its specific biologically active conformation. Understanding the relationship between sequences and the folding rates of proteins remains an important challenge. Most previous methods of predicting protein folding rate require the tertiary structure of a protein as an input. In this study, long-range and short-range contacts in proteins were used to derive an extended version of the pseudo amino acid composition based on a sliding window method. This method is capable of predicting protein folding rates from the amino acid sequence alone, without the aid of any structural class information. We systematically studied the contributions of individual features to folding rate prediction. The optimal feature selection procedure combines forward feature selection with sequential backward selection. Using the jackknife cross validation test, the method was demonstrated on a large dataset. The predictor was built on numerous physicochemical and statistical features of proteins using a nonlinear support vector machine (SVM) regression model, and it achieved excellent agreement between predicted and experimentally observed folding rates of proteins. The correlation coefficient is 0.9313 and the standard error is 2.2692. The prediction server is freely available at http://www.jci-bioinfo.cn/swfrate/input.jsp. PMID:22933332
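
    The sliding-window idea in the record above can be illustrated with a minimal sketch. It assumes the window simply averages a per-residue property along the sequence. The Kyte-Doolittle hydropathy scale and the function name `sliding_window_means` are illustrative choices that do not come from the abstract, and the actual Swfoldrate feature set is not reproduced here.

```python
from typing import Dict, List

# Kyte-Doolittle hydropathy values, used only as an illustrative
# per-residue property (an assumption, not the published feature set).
KYTE_DOOLITTLE: Dict[str, float] = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sliding_window_means(seq: str, scale: Dict[str, float], window: int) -> List[float]:
    """Average a per-residue property over every window of `window` residues."""
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    values = [scale[aa] for aa in seq]  # KeyError on unknown residue codes
    total = sum(values[:window])
    means = [total / window]
    # Slide the window one residue at a time, updating the running sum.
    for i in range(window, len(values)):
        total += values[i] - values[i - window]
        means.append(total / window)
    return means


if __name__ == "__main__":
    # Hypothetical short peptide, for demonstration only.
    print(sliding_window_means("MKTAYIAKQR", KYTE_DOOLITTLE, window=5))
```

    A sequence of length n yields n - window + 1 values, so sequence-level features would have to summarize these values (for example by their mean or extremes) before being passed to a regression model.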

  8. From amino acid sequence to bioactivity: The biomedical potential of antitumor peptides.

    PubMed

    Blanco-Míguez, Aitor; Gutiérrez-Jácome, Alberto; Pérez-Pérez, Martín; Pérez-Rodríguez, Gael; Catalán-García, Sandra; Fdez-Riverola, Florentino; Lourenço, Anália; Sánchez, Borja

    2016-06-01

    Chemoprevention is the use of natural and/or synthetic substances to block, reverse, or retard the process of carcinogenesis. In this field, the use of antitumor peptides is of interest as, (i) these molecules are small in size, (ii) they show good cell diffusion and permeability, (iii) they affect one or more specific molecular pathways involved in carcinogenesis, and (iv) they are not usually genotoxic. We have checked the Web of Science Database (23/11/2015) in order to collect papers reporting on bioactive peptide (1691 registers), which was further filtered searching terms such as "antiproliferative," "antitumoral," or "apoptosis" among others. Works reporting the amino acid sequence of an antiproliferative peptide were kept (60 registers), and this was complemented with the peptides included in CancerPPD, an extensive resource for antiproliferative peptides and proteins. Peptides were grouped according to one of the following mechanism of action: inhibition of cell migration, inhibition of tumor angiogenesis, antioxidative mechanisms, inhibition of gene transcription/cell proliferation, induction of apoptosis, disorganization of tubulin structure, cytotoxicity, or unknown mechanisms. The main mechanisms of action of those antiproliferative peptides with known amino acid sequences are presented and finally, their potential clinical usefulness and future challenges on their application is discussed. PMID:27010507

  9. The sequence, and its evolutionary implications, of a Thermococcus celer protein associated with transcription

    NASA Technical Reports Server (NTRS)

    Kaine, B. P.; Mehr, I. J.; Woese, C. R.

    1994-01-01

    Through random search, a gene from Thermococcus celer has been identified and sequenced that appears to encode a transcription-associated protein (110 amino acid residues). The sequence has clear homology to approximately the last half of an open reading frame reported previously for Sulfolobus acidocaldarius [Langer, D. & Zillig, W. (1993) Nucleic Acids Res. 21, 2251]. The protein translations of these two archaeal genes in turn are homologs of a small subunit found in eukaryotic RNA polymerase I (A12.2) and the counterpart of this from RNA polymerase II (B12.6). Homology is also seen with the eukaryotic transcription factor TFIIS, but it involves only the terminal 45 amino acids of the archaeal proteins. Evolutionary implications of these homologies are discussed.

  10. Clostridium sticklandii, a specialist in amino acid degradation:revisiting its metabolism through its genome sequence

    PubMed Central

    2010-01-01

    Background Clostridium sticklandii belongs to a cluster of non-pathogenic proteolytic clostridia which utilize amino acids as carbon and energy sources. Isolated by T.C. Stadtman in 1954, it has been generally regarded as a "gold mine" for novel biochemical reactions and is used as a model organism for studying metabolic aspects such as the Stickland reaction, coenzyme-B12- and selenium-dependent reactions of amino acids. With the goal of revisiting its carbon, nitrogen, and energy metabolism, and comparing studies with other clostridia, its genome has been sequenced and analyzed. Results C. sticklandii is one of the best biochemically studied proteolytic clostridial species. Useful additional information has been obtained from the sequencing and annotation of its genome, which is presented in this paper. Besides, experimental procedures reveal that C. sticklandii degrades amino acids in a preferential and sequential way. The organism prefers threonine, arginine, serine, cysteine, proline, and glycine, whereas glutamate, aspartate and alanine are excreted. Energy conservation is primarily obtained by substrate-level phosphorylation in fermentative pathways. The reactions catalyzed by different ferredoxin oxidoreductases and the exergonic NADH-dependent reduction of crotonyl-CoA point to a possible chemiosmotic energy conservation via the Rnf complex. C. sticklandii possesses both the F-type and V-type ATPases. The discovery of an as yet unrecognized selenoprotein in the D-proline reductase operon suggests a more detailed mechanism for NADH-dependent D-proline reduction. A rather unusual metabolic feature is the presence of genes for all the enzymes involved in two different CO2-fixation pathways: C. sticklandii harbours both the glycine synthase/glycine reductase and the Wood-Ljungdahl pathways. This unusual pathway combination has retrospectively been observed in only four other sequenced microorganisms. Conclusions Analysis of the C. sticklandii genome and

  11. Complete amino acid sequence of the myoglobin from the Pacific spotted dolphin, Stenella attenuata graffmani.

    PubMed

    Jones, B N; Wang, C C; Dwulet, F E; Lehman, L D; Meuth, J L; Bogardt, R A; Gurd, F R

    1979-04-25

    The complete amino acid sequence of the major component myoglobin from the Pacific spotted dolphin, Stenella attenuata graffmani, was determined by the automated Edman degradation of several large peptides obtained by specific cleavage of the protein. The acetimidated apomyoglobin was selectively cleaved at its two methionyl residues with cyanogen bromide and at its three arginyl residues by trypsin. By subjecting four of these peptides and the apomyoglobin to automated Edman degradation, over 80% of the primary structure of the protein was obtained. The remainder of the covalent structure was determined by the sequence analysis of peptides that resulted from further digestion of the central cyanogen bromide fragment. This fragment was cleaved at its glutamyl residues with staphylococcal protease and its lysyl residues with trypsin. The action of trypsin was restricted to the lysyl residues by chemical modification of the single arginyl residue of the fragment with 1,2-cyclohexanedione. The primary structure of this myoglobin proved to be identical with that from the Atlantic bottlenosed dolphin and Pacific common dolphin but differs from the myoglobins of the killer whale and pilot whale at two positions. The above sequence identities and differences reflect the close taxonomic relationship of these five species of Cetacea. PMID:454657

  12. Isolation and amino acid sequences of squirrel monkey (Saimiri sciurea) insulin and glucagon.

    PubMed Central

    Yu, J H; Eng, J; Yalow, R S

    1990-01-01

    It was reported two decades ago that insulin was not detectable in the glucose-stimulated state in Saimiri sciurea, the New World squirrel monkey, by a radioimmunoassay system developed with guinea pig anti-pork insulin antibody and labeled pork insulin. With the same system, reasonable levels were observed in rhesus monkeys and chimpanzees. This suggested that New World monkeys, like the New World hystricomorph rodents such as the guinea pig and the coypu, might have insulins whose sequences differ markedly from those of Old World mammals. In this report we describe the purification and amino acid sequences of squirrel monkey insulin and glucagon. We demonstrate that the substitutions at B29, B27, A2, A4, and A17 of squirrel monkey insulin are identical with those previously found in another New World primate, the owl monkey (Aotus trivirgatus). The immunologic cross-reactivity of this insulin in our immunoassay system is only a few percent of that of human insulin. Squirrel monkey glucagon is identical with the usual glucagon found in Old World mammals, which predicts that the glucagons of other New World monkeys would not differ from the usual Old World mammalian glucagon. It appears that the peptides of the New World monkeys have diverged less from those of the Old World mammals than have those of the New World hystricomorph rodents. The striking improvements in peptide purification and sequencing have the potential for adding new information concerning the evolutionary divergence of species. PMID:2263627

  13. Isolation and amino acid sequences of squirrel monkey (Saimiri sciurea) insulin and glucagon

    SciTech Connect

    Yu, Jinghua; Eng, J.; Yalow, R.S. (City Univ. of New York, NY)

    1990-12-01

    It was reported two decades ago that insulin was not detectable in the glucose-stimulated state in Saimiri sciurea, the New World squirrel monkey, by a radioimmunoassay system developed with guinea pig anti-pork insulin antibody and labeled pork insulin. With the same system, reasonable levels were observed in rhesus monkeys and chimpanzees. This suggested that New World monkeys, like the New World hystricomorph rodents such as the guinea pig and the coypu, might have insulins whose sequences differ markedly from those of Old World mammals. In this report the authors describe the purification and amino acid sequences of squirrel monkey insulin and glucagon. They demonstrate that the substitutions at B29, B27, A2, A4, and A17 of squirrel monkey insulin are identical with those previously found in another New World primate, the owl monkey (Aotus trivirgatus). The immunologic cross-reactivity of this insulin in their immunoassay system is only a few percent of that of human insulin. It appears that the peptides of the New World monkeys have diverged less from those of the Old World mammals than have those of the New World hystricomorph rodents. The striking improvements in peptide purification and sequencing have the potential for adding new information concerning the evolutionary divergence of species.

  14. Binding site discovery from nucleic acid sequences by discriminative learning of hidden Markov models

    PubMed Central

    Maaskola, Jonas; Rajewsky, Nikolaus

    2014-01-01

    We present a discriminative learning method for pattern discovery of binding sites in nucleic acid sequences based on hidden Markov models. Sets of positive and negative example sequences are mined for sequence motifs whose occurrence frequency varies between the sets. The method offers several objective functions, but we concentrate on mutual information of condition and motif occurrence. We perform a systematic comparison of our method and numerous published motif-finding tools. Our method achieves the highest motif discovery performance, while being faster than most published methods. We present case studies of data from various technologies, including ChIP-Seq, RIP-Chip and PAR-CLIP, of embryonic stem cell transcription factors and of RNA-binding proteins, demonstrating practicality and utility of the method. For the alternative splicing factor RBM10, our analysis finds motifs known to be splicing-relevant. The motif discovery method is implemented in the free software package Discrover. It is applicable to genome- and transcriptome-scale data, makes use of available repeat experiments and aside from binary contrasts also more complex data configurations can be utilized. PMID:25389269
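
    The objective named in the record above, mutual information of condition and motif occurrence, can be sketched for the simplest case of one positive and one negative sequence set. This is a minimal illustration that assumes each sequence is scored only as containing the motif or not; the function name and its arguments are hypothetical and do not reflect the Discrover implementation.

```python
import math


def mutual_information(pos_with: int, pos_total: int,
                       neg_with: int, neg_total: int) -> float:
    """Mutual information (in bits) between set membership and motif presence.

    pos_with / neg_with: sequences containing the motif in each set.
    pos_total / neg_total: total sequences in each set.
    """
    if not (0 <= pos_with <= pos_total and 0 <= neg_with <= neg_total):
        raise ValueError("counts must satisfy 0 <= with <= total")
    n = pos_total + neg_total
    if n == 0:
        raise ValueError("at least one sequence is required")
    # 2x2 contingency table: rows = condition, columns = motif present/absent.
    counts = [
        [pos_with, pos_total - pos_with],
        [neg_with, neg_total - neg_with],
    ]
    row = [sum(r) for r in counts]
    col = [counts[0][j] + counts[1][j] for j in range(2)]
    mi = 0.0
    for i in range(2):
        for j in range(2):
            c = counts[i][j]
            if c > 0:  # empty cells contribute nothing to the sum
                mi += (c / n) * math.log2(c * n / (row[i] * col[j]))
    return mi


if __name__ == "__main__":
    # Hypothetical counts: motif in 80 of 100 positives and 10 of 100 negatives.
    print(mutual_information(80, 100, 10, 100))
```

    The value is zero when the motif is equally frequent in both sets and reaches one bit when it perfectly separates two equally sized sets, which is why it can serve as a score for discriminative motifs.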

  15. Sequence and transcription analysis of the human cytomegalovirus DNA polymerase gene

    SciTech Connect

    Kouzarides, T.; Bankier, A.T.; Satchwell, S.C.; Weston, K.; Tomlinson, P.; Barrell, B.G.

    1987-01-01

    DNA sequence analysis has revealed that the gene coding for the human cytomegalovirus (HCMV) DNA polymerase is present within the long unique region of the virus genome. Identification is based on extensive amino acid homology between the predicted HCMV open reading frame HFLF2 and the DNA polymerase of herpes simplex virus type 1. The authors present here a 5280 base-pair DNA sequence containing the HCMV pol gene, along with the analysis of transcripts encoded within this region. Since HCMV pol also shows homology to the predicted Epstein-Barr virus pol, they were able to analyze the extent of homology between the DNA polymerases of three distantly related herpes viruses, HCMV, Epstein-Barr virus, and herpes simplex virus. The comparison shows that these DNA polymerases exhibit considerable amino acid homology and highlights a number of highly conserved regions; two such regions show homology to sequences within the adenovirus type 2 DNA polymerase. The HCMV pol gene is flanked by open reading frames with homology to those of other herpes viruses; upstream, there is a reading frame homologous to the glycoprotein B gene of herpes simplex virus type I and Epstein-Barr virus, and downstream there is a reading frame homologous to BFLF2 of Epstein-Barr virus.

  16. Molecular characterization of the body site-specific human epidermal cytokeratin 9: cDNA cloning, amino acid sequence, and tissue specificity of gene expression.

    PubMed

    Langbein, L; Heid, H W; Moll, I; Franke, W W

    1993-12-01

    Differentiation of human plantar and palmar epidermis is characterized by the suprabasal synthesis of a major special intermediate-sized filament (IF) protein, the type I (acidic) cytokeratin 9 (CK 9). Using partial amino acid (aa) sequence information obtained by direct Edman sequencing of peptides resulting from proteolytic digestion of purified CK 9, we synthesized several redundant primers by 'back-translation'. Amplification by polymerase chain reaction (PCR) of cDNAs obtained by reverse transcription of mRNAs from human foot sole epidermis, including 5'-primer extension, resulted in multiple overlapping cDNA clones, from which the complete cDNA (2353 bp) could be constructed. This cDNA encoded the CK 9 polypeptide with a calculated molecular weight of 61,987 and an isoelectric point at about pH 5.0. The aa sequence deduced from cDNA was verified in several parts by comparison with the peptide sequences and showed the typical structure of type I CKs, with a head (153 aa), an alpha-helical coiled-coil-forming rod (306 aa), and a tail (163 aa) domain. The protein displayed the highest homology to human CK 10, not only in the highly conserved rod domain but also in large parts of the head and the tail domains. On the other hand, the aa sequence revealed some remarkable differences from CK 10 and other CKs, even in the most conserved segments of the rod domain. The nuclease digestion pattern seen on Southern blot analysis of human genomic DNA indicated the existence of a unique CK 9 gene. Using CK 9-specific riboprobes for hybridization on Northern blots of RNAs from various epithelia, a mRNA of about 2.4 kb in length could be identified only in foot sole epidermis, and a weaker cross-hybridization signal was seen in RNA from bovine heel pad epidermis at about 2.0 kb. A large number of tissues and cell cultures were examined by PCR of mRNA-derived cDNAs, using CK 9-specific primers. But even with this very sensitive signal amplification, only palmar

  17. The σ enigma: bacterial σ factors, archaeal TFB and eukaryotic TFIIB are homologs.

    PubMed

    Burton, Samuel P; Burton, Zachary F

    2014-01-01

    Structural comparisons of initiating RNA polymerase complexes and structure-based amino acid sequence alignments of general transcription initiation factors (eukaryotic TFIIB, archaeal TFB and bacterial σ factors) show that these proteins are homologs. TFIIB and TFB each have two-five-helix cyclin-like repeats (CLRs) that include a C-terminal helix-turn-helix (HTH) motif (CLR/HTH domains). Four homologous HTH motifs are present in bacterial σ factors that are relics of CLR/HTH domains. Sequence similarities clarify models for σ factor and TFB/TFIIB evolution and function and suggest models for promoter evolution. Commitment to alternate modes for transcription initiation appears to be a major driver of the divergence of bacteria and archaea. PMID:25483602

  18. The σ enigma: Bacterial σ factors, archaeal TFB and eukaryotic TFIIB are homologs

    PubMed Central

    Burton, Samuel P; Burton, Zachary F

    2014-01-01

    Structural comparisons of initiating RNA polymerase complexes and structure-based amino acid sequence alignments of general transcription initiation factors (eukaryotic TFIIB, archaeal TFB and bacterial σ factors) show that these proteins are homologs. TFIIB and TFB each have two-five-helix cyclin-like repeats (CLRs) that include a C-terminal helix-turn-helix (HTH) motif (CLR/HTH domains). Four homologous HTH motifs are present in bacterial σ factors that are relics of CLR/HTH domains. Sequence similarities clarify models for σ factor and TFB/TFIIB evolution and function and suggest models for promoter evolution. Commitment to alternate modes for transcription initiation appears to be a major driver of the divergence of bacteria and archaea. PMID:25483602

  19. Amino acid sequence analysis and characterization of a ribonuclease from starfish Asterias amurensis.

    PubMed

    Motoyoshi, Naomi; Kobayashi, Hiroko; Itagaki, Tadashi; Inokuchi, Norio

    2016-09-01

    The aim of this study was to phylogenetically characterize the location of the RNase T2 enzyme in the starfish (Asterias amurensis). We isolated an RNase T2 ribonuclease (RNase Aa) from the ovaries of starfish and determined its amino acid sequence by protein chemistry and cloning cDNA encoding RNase Aa. The isolated protein had 231 amino acid residues, a predicted molecular mass of 25,906 Da, and an optimal pH of 5.0. RNase Aa preferentially released guanylic acid from the RNA. The catalytic sites of the RNase T2 family are conserved in RNase Aa; furthermore, the distribution of the cysteine residues in RNase Aa is similar to that in other animal and plant T2 RNases. RNase Aa is cleaved at two points: 21 residues from the N-terminus and 29 residues from the C-terminus; however, both fragments may remain attached to the protein via disulfide bridges, leading to the maintenance of its conformation, as suggested by circular dichroism spectrum analysis. The phylogenetic analysis revealed that starfish RNase Aa is evolutionarily an intermediate between protozoan and oyster RNases. PMID:26920046

  20. Braid Floer homology

    NASA Astrophysics Data System (ADS)

    van den Berg, J. B.; Ghrist, R.; Vandervorst, R. C.; Wójcik, W.

    2015-09-01

    Area-preserving diffeomorphisms of a 2-disc can be regarded as time-1 maps of (non-autonomous) Hamiltonian flows on R/Z × D^2. The periodic flow-lines define braid (conjugacy) classes, up to full twists. We examine the dynamics relative to such braid classes and define a new invariant for such classes, the braid Floer homology. This refinement of Floer homology, originally used for the Arnol'd Conjecture, yields a Morse-type forcing theory for periodic points of area-preserving diffeomorphisms of the 2-disc based on braiding. Contributions of this paper include (1) a monotonicity lemma for the behavior of the nonlinear Cauchy-Riemann equations with respect to algebraic lengths of braids; (2) establishment of the topological invariance of the resulting braid Floer homology; (3) a shift theorem describing the effect of twisting braids in terms of shifting the braid Floer homology; (4) computation of examples; and (5) a forcing theorem for the dynamics of Hamiltonian disc maps based on braid Floer homology.

  1. Complete amino acid sequence and characterization of the reaction mechanism of a glucosamine-induced novel alcohol dehydrogenase from Agrobacterium radiobacter (tumefaciens).

    PubMed

    Iwamoto, Ryoko; Kubota, Humie; Hosoki, Tomoko; Ikehara, Kenji; Tanaka, Mieko

    2002-02-15

    A glucosamine-induced novel alcohol dehydrogenase has been isolated from Agrobacterium radiobacter (tumefaciens) and its fundamental properties have been characterized. The enzyme catalyzes NAD-dependent dehydrogenation of aliphatic alcohols and amino alcohols. In this work, the complete amino acid sequence of the alcohol dehydrogenase was determined by a PCR method using genomic DNA of A. radiobacter as template. The enzyme comprises 336 amino acids and has a molecular mass of 36 kDa. The primary structure of the enzyme demonstrates a high homology to structures of alcohol dehydrogenases from Sinorhizobium meliloti (83% identity, 90% positive) and Pseudomonas aeruginosa (65% identity, 76% positive). The two Zn(2+) ion binding sites, both the active site and another site that contributes to stabilization of the enzyme, are conserved in those enzymes. Sequence analysis of the NAD-dependent dehydrogenase family using a hypothetical phylogenetic tree indicates that these three enzymes form a new group distinct from other members of the Zn-containing long-chain alcohol dehydrogenase family. The physicochemical properties of alcohol dehydrogenase from A. radiobacter were characterized as follows. (1) Stereospecificity of the hydride transfer from ethanol to NADH was categorized as pro-R type by NMR spectra of NADH formed in the enzymatic reaction when ethanol-D(6) was used as substrate. (2) Optimal pH for all alcohols with no amino group examined was pH 8.5 (of the C(2)-C(6) alcohols, n-amyl alcohol demonstrated the highest activity). Conversely, glucosaminitol was optimally dehydrogenated at pH 10.0. (3) The rate-determining step of the dehydrogenase for ethanol is deprotonation of the enzyme-NAD-Zn-OHCH(2)CH(3) complex to enzyme-NAD-Zn-O(-)CH(2)CH(3) complex and that for glucosaminitol is H(2)O addition to enzyme-Zn-NADH complex. PMID:11831851
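
    Percent identity figures such as those quoted in the record above are computed from a pairwise alignment. The sketch below shows one common convention, assuming two already aligned sequences of equal length with "-" as the gap character and counting only columns where neither sequence has a gap. Other tools use different denominators, and the function name is hypothetical. The "positive" percentages additionally require a substitution matrix and are not covered here.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of gap-free alignment columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences carry a residue.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != gap and y != gap]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical aligned fragments, for demonstration only.
    print(percent_identity("MKV-LAGT", "MRVALA-T"))
```

    Because the denominator convention changes the result, identity values from different reports are only comparable when the same convention and alignment method were used.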

  2. The highly conserved amino acid sequence motif Tyr-Gly-Asp-Thr-Asp-Ser in alpha-like DNA polymerases is required by phage phi 29 DNA polymerase for protein-primed initiation and polymerization.

    PubMed Central

    Bernad, A; Lázaro, J M; Salas, M; Blanco, L

    1990-01-01

    The alpha-like DNA polymerases from bacteriophage phi 29 and other viruses, prokaryotes and eukaryotes contain an amino acid consensus sequence that has been proposed to form part of the dNTP binding site. We have used site-directed mutants to study five of the six highly conserved consecutive amino acids corresponding to the most conserved C-terminal segment (Tyr-Gly-Asp-Thr-Asp-Ser). Our results indicate that in phi 29 DNA polymerase this consensus sequence, although irrelevant for the 3'→5' exonuclease activity, is essential for initiation and elongation. Based on these results and on its homology with known or putative metal-binding amino acid sequences, we propose that in phi 29 DNA polymerase the Tyr-Gly-Asp-Thr-Asp-Ser consensus motif is part of the dNTP binding site, involved in the synthetic activities of the polymerase (i.e., initiation and polymerization), and that it is involved particularly in the metal binding associated with the dNTP site. PMID:2191296

  3. Siro(haem)amide in Allochromatium vinosum and relevance of DsrL and DsrN, a homolog of cobyrinic acid a,c-diamide synthase, for sulphur oxidation.

    PubMed

    Lübbe, Yvonne J; Youn, Hyung-Sun; Timkovich, Russell; Dahl, Christiane

    2006-08-01

    In the purple sulphur bacterium Allochromatium vinosum, the prosthetic group of dissimilatory sulphite reductase (DsrAB) was identified as siroamide, an amidated form of the classical sirohaem. The genes dsrAB are the first two of a large cluster of genes necessary for the oxidation of sulphur globules stored intracellularly during growth on sulphide and thiosulphate. DsrN is homologous to cobyrinic acid a,c-diamide synthase and may therefore catalyze glutamine-dependent amidation of sirohaem. Indeed, an A. vinosum ΔdsrN in-frame deletion mutant showed a significantly reduced sulphur oxidation rate that was fully restored upon complementation with dsrN in trans. Sulphite reductase was still present in the ΔdsrN mutant. DsrL is a homolog of the small subunits of bacterial glutamate synthases and was proposed to deliver glutamine for sirohaem amidation. However, recombinant DsrL does not exhibit glutamate synthase activity nor does the gene complement a glutamate synthase-deficient Escherichia coli strain. Deletion of dsrL showed that the encoded protein is absolutely essential for sulphur oxidation in A. vinosum. PMID:16907720

  4. Evolutionary connections of biological kingdoms based on protein and nucleic acid sequence evidence

    NASA Technical Reports Server (NTRS)

    Dayhoff, M. O.

    1983-01-01

    Prokaryotic and eukaryotic evolutionary trees are developed from protein and nucleic-acid sequences by the methods of numerical taxonomy. Trees are presented for bacterial ferredoxins, 5S ribosomal RNA, c-type cytochromes, cytochromes c2 and c', and 5.8S ribosomal RNA; the implications for early evolution are discussed; and a composite tree showing the branching of the anaerobes, aerobes, archaebacteria, and eukaryotes is shown. Single lines are found for all oxygen-evolving photosynthetic forms and for the salt-loving and high-temperature forms of archaebacteria. It is argued that the eukaryote mitochondria, chloroplasts, and cytoplasmic host material are descended from free-living prokaryotes that formed symbiotic associations, with more than one symbiotic event involved in the evolution of each organelle.

  5. The amino acid alphabet and the architecture of the protein sequence-structure map. I. Binary alphabets.

    PubMed

    Ferrada, Evandro

    2014-12-01

    The correspondence between protein sequences and structures, or sequence-structure map, relates to fundamental aspects of structural, evolutionary and synthetic biology. The specifics of the mapping, such as the fraction of accessible sequences and structures, or the sequences' ability to fold fast, are dictated by the type of interactions between the monomers that compose the sequences. The set of possible interactions between monomers is encapsulated by the potential energy function. In this study, I explore the impact of the relative forces of the potential on the architecture of the sequence-structure map. My observations rely on simple exact models of proteins and random samples of the space of potential energy functions of binary alphabets. I adopt a graph perspective and study the distribution of viable sequences and the structures they produce, as networks of sequences connected by point mutations. I observe that the relative proportion of attractive, neutral and repulsive forces defines types of potentials, that induce sequence-structure maps of vastly different architectures. I characterize the properties underlying these differences and relate them to the structure of the potential. Among these properties are the expected number and relative distribution of sequences associated to specific structures and the diversity of structures as a function of sequence divergence. I study the types of binary potentials observed in natural amino acids and show that there is a strong bias towards only some types of potentials, a bias that seems to characterize the folding code of natural proteins. I discuss implications of these observations for the architecture of the sequence-structure map of natural proteins, the construction of random libraries of peptides, and the early evolution of the natural amino acid alphabet. PMID:25473967

  6. The Amino Acid Alphabet and the Architecture of the Protein Sequence-Structure Map. I. Binary Alphabets

    PubMed Central

    Ferrada, Evandro

    2014-01-01

    The correspondence between protein sequences and structures, or sequence-structure map, relates to fundamental aspects of structural, evolutionary and synthetic biology. The specifics of the mapping, such as the fraction of accessible sequences and structures, or the sequences' ability to fold fast, are dictated by the type of interactions between the monomers that compose the sequences. The set of possible interactions between monomers is encapsulated by the potential energy function. In this study, I explore the impact of the relative forces of the potential on the architecture of the sequence-structure map. My observations rely on simple exact models of proteins and random samples of the space of potential energy functions of binary alphabets. I adopt a graph perspective and study the distribution of viable sequences and the structures they produce, as networks of sequences connected by point mutations. I observe that the relative proportion of attractive, neutral and repulsive forces defines types of potentials, that induce sequence-structure maps of vastly different architectures. I characterize the properties underlying these differences and relate them to the structure of the potential. Among these properties are the expected number and relative distribution of sequences associated to specific structures and the diversity of structures as a function of sequence divergence. I study the types of binary potentials observed in natural amino acids and show that there is a strong bias towards only some types of potentials, a bias that seems to characterize the folding code of natural proteins. I discuss implications of these observations for the architecture of the sequence-structure map of natural proteins, the construction of random libraries of peptides, and the early evolution of the natural amino acid alphabet. PMID:25473967

  7. Microfluidic platform for isolating nucleic acid targets using sequence specific hybridization

    PubMed Central

    Wang, Jingjing; Morabito, Kenneth; Tang, Jay X.; Tripathi, Anubhav

    2013-01-01

    The separation of target nucleic acid sequences from biological samples has emerged as a significant process in today's diagnostics and detection strategies. In addition to the possible clinical applications, the fundamental understanding of target and sequence specific hybridization on surface modified magnetic beads is of high value. In this paper, we describe a novel microfluidic platform that utilizes a mobile magnetic field in static microfluidic channels, where single stranded DNA (ssDNA) molecules are isolated via nucleic acid hybridization. We first established efficient isolation of biotinylated capture probe (BP) using streptavidin-coated magnetic beads. Subsequently, we investigated the hybridization of target ssDNA with BP bound to beads and explained these hybridization kinetics using a dual-species kinetic model. The number of hybridized target ssDNA molecules was determined to be about 6.5 times less than that of BP on the bead surface, due to steric hindrance effects. The hybridization of target ssDNA with non-complementary BP bound to beads was also examined, and non-specific hybridization was found to be insignificant. Finally, we demonstrated highly efficient capture and isolation of target ssDNA in the presence of non-target ssDNA, where as little as 1% target ssDNA can be detected in a mixture. The microfluidic method described in this paper is broadly applicable, especially to point-of-care biological diagnostic platforms that require binding and separation of known target biomolecules, such as RNA, ssDNA, or protein. PMID:24404041

  8. Homologous Pairing between Long DNA Double Helices

    NASA Astrophysics Data System (ADS)

    Mazur, Alexey K.

    2016-04-01

    Molecular recognition between two double stranded (ds) DNA with homologous sequences may not seem compatible with the B-DNA structure because the sequence information is hidden when it is used for joining the two strands. Nevertheless, it has to be invoked to account for various biological data. Using quantum chemistry, molecular mechanics, and hints from recent genetics experiments, I show here that direct recognition between homologous dsDNA is possible through the formation of short quadruplexes due to direct complementary hydrogen bonding of major-groove surfaces in parallel alignment. The constraints imposed by the predicted structures of the recognition units determine the mechanism of complexation between long dsDNA. This mechanism and concomitant predictions agree with the available experimental data and shed light upon the sequence effects and the possible involvement of topoisomerase II in the recognition.

  9. Homologous Pairing between Long DNA Double Helices.

    PubMed

    Mazur, Alexey K

    2016-04-15

    Molecular recognition between two double stranded (ds) DNA with homologous sequences may not seem compatible with the B-DNA structure because the sequence information is hidden when it is used for joining the two strands. Nevertheless, it has to be invoked to account for various biological data. Using quantum chemistry, molecular mechanics, and hints from recent genetics experiments, I show here that direct recognition between homologous dsDNA is possible through the formation of short quadruplexes due to direct complementary hydrogen bonding of major-groove surfaces in parallel alignment. The constraints imposed by the predicted structures of the recognition units determine the mechanism of complexation between long dsDNA. This mechanism and concomitant predictions agree with the available experimental data and shed light upon the sequence effects and the possible involvement of topoisomerase II in the recognition. PMID:27127987

  10. Nucleotide sequence of a glucosyltransferase gene from Streptococcus sobrinus MFe28.

    PubMed Central

    Ferretti, J J; Gilpin, M L; Russell, R R

    1987-01-01

    The complete nucleotide sequence was determined for the Streptococcus sobrinus MFe28 gtfI gene, which encodes a glucosyltransferase that produces an insoluble glucan product. A single open reading frame encodes a mature glucosyltransferase protein of 1,559 amino acids (Mr, 172,983) and a signal peptide of 38 amino acids. In the C-terminal one-third of the protein there are six repeating units containing 35 amino acids of partial homology and two repeating units containing 48 amino acids of complete homology. The functional role of these repeating units remains to be determined, although truncated forms of glucosyltransferase containing only the first two repeating units of partial homology maintained glucosyltransferase activity and the ability to bind glucan. Regions of homology with alpha-amylase and glycogen phosphorylase were identified in the glucosyltransferase protein and may represent regions involved in functionally similar domains. PMID:3040686

  11. Characterization of N-glycosylation and amino acid sequence features of immunoglobulins from swine.

    PubMed

    Lopez, Paul G; Girard, Lauren; Buist, Marjorie; de Oliveira, Andrey Giovanni Gomes; Bodnar, Edward; Salama, Apolline; Soulillou, Jean-Paul; Perreault, Hélène

    2016-02-01

    The primary goal of this study was to develop a method to study the N-glycosylation of IgG from swine in order to detect epitopes containing N-glycolylneuraminic acid (Neu5Gc) and/or terminal galactose residues linked in α1-3 susceptible to cause xenograft-related problems. Samples of immunoglobulin were isolated from porcine serum using protein-A affinity chromatography. The eluate was then separated on electrophoretic gel, and bands corresponding to the N-glycosylated heavy chains were cut off the gel and subjected to tryptic digestion. Peptides and glycopeptides were separated by reversed phase liquid chromatography and fractions were collected for matrix-assisted laser desorption/ionization time-of-flight mass spectrometric (MALDI-TOF-MS) analysis. Overall no α1-3 galactose was detected, as demonstrated by complete susceptibility of terminal galactose residues to β-galactosidase digestion. Neu5Gc was detected on singly sialylated structures. Two major N-glycopeptides were found, EEQFNSTYR and EAQFNSTYR as determined by tandem MS (MS/MS), as previously reported by Butler et al. (Immunogenetics, 61, 2009, 209-230), who found 11 subclasses for porcine IgG. Out of the 11, ten include the sequence corresponding to EEQFNSTYR, and only one codes for EAQFNSTYR. In this study, glycosylation patterns associated with both chains were slightly different, in that EEQFNSTYR had a higher content of galactose. The last step of this study consisted of peptide-mapping the 11 reported porcine IgG sequences. Although there was considerable overlap, at least one unique tryptic peptide was found per IgG sequence. The workflow presented in this manuscript constitutes the first study to use MALDI-TOF-MS in the investigation of porcine IgG structural features. PMID:26586247
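
    The peptide-mapping step described above can be illustrated with a minimal sketch of an in silico tryptic digest. It assumes the common simplified rule that trypsin cleaves C-terminal to Lys (K) or Arg (R) unless the next residue is Pro (P). The function names and the toy sequences are hypothetical; they are not the porcine IgG sequences or the software used in the study.

```python
def tryptic_digest(sequence):
    """Split a sequence after K or R, except when the next residue is P."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep the C-terminal remainder that does not end in K or R.
        peptides.append(sequence[start:])
    return peptides


def unique_peptides(sequences):
    """Map each sequence name to the tryptic peptides found in no other sequence."""
    digests = {name: set(tryptic_digest(seq)) for name, seq in sequences.items()}
    unique = {}
    for name, peptides in digests.items():
        others = set().union(*(p for n, p in digests.items() if n != name))
        unique[name] = peptides - others
    return unique


# Toy sequences (hypothetical); each embeds one of the two reported glycopeptides.
toy = {
    "variant_1": "TKPREEQFNSTYRVVSVLK",
    "variant_2": "TKPREAQFNSTYRVVSVLK",
}
print(unique_peptides(toy))
```

    In this toy case the only distinguishing peptide of each variant is its glycopeptide, which mirrors the idea of finding at least one unique tryptic peptide per sequence.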

  12. Human Retroviruses and AIDS. A compilation and analysis of nucleic acid and amino acid sequences: I--II; III--V

    SciTech Connect

    Myers, G.; Korber, B.; Wain-Hobson, S.; Smith, R.F.; Pavlakis, G.N.

    1993-12-31

    This compendium and the accompanying floppy diskettes are the result of an effort to compile and rapidly publish all relevant molecular data concerning the human immunodeficiency viruses (HIV) and related retroviruses. The scope of the compendium and database is best summarized by the five parts that it comprises: (I) HIV and SIV Nucleotide Sequences; (II) Amino Acid Sequences; (III) Analyses; (IV) Related Sequences; and (V) Database Communications. Information within all the parts is updated at least twice in each year, which accounts for the modes of binding and pagination in the compendium.

  13. Isolation and sequence of complementary DNA encoding human extracellular superoxide dismutase

    SciTech Connect

    Hjalmarsson, K.; Marklund, S.L.; Engstroem, A.; Edlund, T.

    1987-09-01

    A complementary DNA (cDNA) clone from a human placenta cDNA library encoding extracellular superoxide dismutase has been isolated and the nucleotide sequence determined. The cDNA has a very high G + C content. EC-SOD is synthesized with a putative 18-amino acid signal peptide, preceding the 222 amino acids in the mature enzyme, indicating that the enzyme is a secretory protein. The first 95 amino acids of the mature enzyme show no sequence homology with other sequenced proteins and there is one possible N-glycosylation site (Asn-89). The amino acid sequence from residues 96-193 shows strong homology (approx. 50%) with the final two-thirds of the sequences of all known eukaryotic CuZn SODs, whereas the homology with the P. leiognathi CuZn SOD is clearly lower. The ligands to Cu and Zn, the cysteines forming the intrasubunit disulfide bridge in the CuZn SODs, and the arginine found in all CuZn SODs in the entrance to the active site can all be identified in EC-SOD. A comparison with bovine CuZn SOD, the three-dimensional structure of which is known, reveals that the homologies occur in the active site and the divergencies are in the part constituting the subunit contact area in CuZn SOD. Amino acid sequence 194-222 in the carboxyl-terminal end of EC-SOD is strongly hydrophilic and contains nine amino acids with a positive charge. This sequence probably confers the affinity of EC-SOD for heparin and heparan sulfate. An analysis of the amino acid sequence homologies with CuZn SODs from various species indicates that the EC-SODs may have evolved from the CuZn SODs before the evolution of fungi and plants.

  14. Homology and phylogeny and their automated inference

    NASA Astrophysics Data System (ADS)

    Fuellen, Georg

    2008-06-01

    The analysis of the ever-increasing amount of biological and biomedical data can be pushed forward by comparing the data within and among species. For example, an integrative analysis of data from the genome sequencing projects for various species traces the evolution of the genomes and identifies conserved and innovative parts. Here, I review the foundations and advantages of this “historical” approach and evaluate recent attempts at automating such analyses. Biological data is comparable if a common origin exists (homology), as is the case for members of a gene family originating via duplication of an ancestral gene. If the family has relatives in other species, we can assume that the ancestral gene was present in the ancestral species from which all the other species evolved. In particular, describing the relationships among the duplicated biological sequences found in the various species is often possible by a phylogeny, which is more informative than homology statements. Detecting and elaborating on common origins may answer how certain biological sequences developed, and predict what sequences are in a particular species and what their function is. Such knowledge transfer from sequences in one species to the homologous sequences of the other is based on the principle of ‘my closest relative looks and behaves like I do’, often referred to as ‘guilt by association’. To enable knowledge transfer on a large scale, several automated ‘phylogenomics pipelines’ have been developed in recent years, and seven of these will be described and compared. Overall, the examples in this review demonstrate that homology and phylogeny analyses, done on a large (and automated) scale, can give insights into function in biology and biomedicine.

  15. Lactic acid production from potato peel waste by anaerobic sequencing batch fermentation using undefined mixed culture.

    PubMed

    Liang, Shaobo; McDonald, Armando G; Coats, Erik R

    2015-11-01

    Lactic acid (LA) is a necessary industrial feedstock for producing the bioplastic, polylactic acid (PLA), which is currently produced by pure culture fermentation of food carbohydrates. This work presents an alternative to produce LA from potato peel waste (PPW) by anaerobic fermentation in a sequencing batch reactor (SBR) inoculated with undefined mixed culture from a municipal wastewater treatment plant. A statistical design of experiments approach was employed using a set of 0.8 L SBRs with gelatinized PPW at a solids content range from 30 to 50 g L(-1) and a solids retention time (SRT) of 2-4 days for yield and productivity optimization. The maximum LA production yield of 0.25 g g(-1) PPW and highest productivity of 125 mg g(-1) d(-1) were achieved. A scale-up SBR trial using neat gelatinized PPW (at 80 g L(-1) solids content) at the 3 L scale was employed and the highest LA yield of 0.14 g g(-1) PPW and a productivity of 138 mg g(-1) d(-1) were achieved with a 1 d SRT. PMID:25708409

  16. Bacterial community compositions in sediment polluted by perfluoroalkyl acids (PFAAs) using Illumina high-throughput sequencing.

    PubMed

    Sun, Yajun; Wang, Tieyu; Peng, Xiawei; Wang, Pei; Lu, Yonglong

    2016-06-01

    The characterization of bacterial community compositions and the change in perfluoroalkyl acids (PFAAs) along a natural river distribution system were explored in the present study. Illumina high-throughput sequencing was used to explore bacterial community diversity and structure in sediment polluted by PFAAs from the Xiaoqing River, the area with concentrated fluorochemical facilities in China. The concentration of PFAAs was in the range of 8.44-465.60 ng/g dry weight (dw) in sediment. Perfluorooctanoic acid (PFOA) was the dominant PFAA in all samples, which accounted for 94.2 % of total PFAAs. High-level PFOA could lead to an obvious increase in relative abundance of Proteobacteria, ε-Proteobacteria, Thiobacillus, and Sulfurimonas and the decrease in relative abundance of other bacteria. Redundancy analysis revealed that PFOA played an important role in the formation of bacterial community, and PFOA at higher concentration could reduce the diversity of bacterial community. When the concentration of PFOA was below 100 ng/g dw in sediment, no significant effect on microbial community structure was observed. Thiobacillus and Sulfurimonas were positively correlated with the concentration of PFOA, suggesting that both genera were resistant to PFOA contamination. PMID:26780047

  17. Mass spectrometric detection of the amino acid sequence polymorphism of the hepatitis C virus antigen.

    PubMed

    Kaysheva, A L; Ivanov, Yu D; Frantsuzov, P A; Krohin, N V; Pavlova, T I; Uchaikin, V F; Konev, V A; Kovalev, O B; Ziborov, V S; Archakov, A I

    2016-03-01

    A method for detection and identification of the hepatitis C virus antigen (HCVcoreAg) in human serum with consideration for possible amino acid substitutions is proposed. The method is based on a combination of biospecific capturing and concentrating of the target protein on the surface of the chip for atomic force microscope (AFM chip) with subsequent protein identification by tandem mass spectrometric (MS/MS) analysis. Biospecific AFM-capturing of viral particles containing HCVcoreAg from serum samples was performed by use of AFM chips with monoclonal antibodies (anti-HCVcore) covalently immobilized on the surface. Biospecific complexes were registered and counted by AFM. Further MS/MS analysis allowed reliable identification of the HCVcoreAg in the complexes formed on the AFM chip surface. Analysis of MS/MS spectra, taking into account the possible polymorphisms in the amino acid sequence of the HCVcoreAg, enabled us to increase the number of identified peptides. PMID:26773170

  18. Acidosis Blocks CCAAT/Enhancer-Binding Protein Homologous Protein (CHOP)- and c-Jun-Mediated Induction of p53-Upregulated Mediator of Apoptosis (PUMA) during Amino Acid Starvation

    PubMed Central

    Ryder, Christopher B.; McColl, Karen; Distelhorst, Clark W.

    2012-01-01

    Cancer cells must avoid succumbing to a variety of noxious conditions within their surroundings. Acidosis is one such prominent feature of the tumor microenvironment that surprisingly promotes tumor survival and progression. We recently reported that acidosis prevents apoptosis of starved or stressed lymphoma cells through regulation of several Bcl-2 family members (Ryder et al., JBC, 2012). Mechanistic studies in that work focused on the acid-mediated upregulation of anti-apoptotic Bcl-2 and Bcl-xL, while additionally showing inhibition of glutamine starvation-induced expression of pro-apoptotic PUMA by acidosis. Herein we report that amino acid (AA) starvation elevates PUMA, an effect that is blocked by extracellular acidity. Knockdown studies confirm that PUMA induction during AA starvation requires expression of both CHOP and c-Jun. Interestingly, acidosis strongly attenuates AA starvation-mediated c-Jun expression, which correlates with PUMA repression. As c-Jun exerts a tumor suppressive function in this and other contexts, its inhibition by acidosis has broader implications for survival of cancer cells in the acidic tumor milieu. PMID:23261451

  19. Peptide sequencing by using a combination of partial acid hydrolysis and fast-atom-bombardment mass spectrometry.

    PubMed Central

    De Angelis, F; Botta, M; Ceccarelli, S; Nicoletti, R

    1986-01-01

    To overcome the limit of the intensity of ions carrying sequence information in structural determinations of peptides by fast-atom-bombardment m.s., we have developed a method that consists in taking spectra of the peptide acid hydrolysates at different hydrolysis times. Peaks correspond to the oligomers arising from the peptide partial hydrolysis. The sequence can then be identified from the structurally overlapping fragments. PMID:2428356
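
    The final step above, reading a sequence from structurally overlapping fragments, can be sketched as a greedy overlap merge. This is an illustration only: it assumes each fragment has already been resolved to a string of one-letter residue codes (the actual method works from mass spectra), and the function names and the example peptide are hypothetical.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def assemble(fragments):
    """Greedily merge the best-overlapping pair until no overlaps remain."""
    # Drop duplicates and fragments fully contained in another fragment.
    frags = sorted(
        f for f in set(fragments)
        if not any(f != g and f in g for g in fragments)
    )
    while len(frags) > 1:
        best_k, best_a, best_b = 0, None, None
        for a in frags:
            for b in frags:
                if a != b:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_a, best_b = k, a, b
        if best_k == 0:
            break  # remaining fragments cannot be joined
        frags.remove(best_a)
        frags.remove(best_b)
        frags.append(best_a + best_b[best_k:])
    return frags


# Hypothetical partial-hydrolysis fragments of a seven-residue peptide.
print(assemble(["GAVL", "VLIF", "IFK", "AV"]))
```

    Greedy merging can fail on repetitive sequences, so a real assignment would also check each candidate against the observed fragment masses.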

  20. Purification and partial amino acid sequence of the chloroplast cytochrome b-559.

    PubMed

    Widger, W R; Cramer, W A; Hermodson, M; Meyer, D; Gullifor, M

    1984-03-25

    The hydrophobic cytochrome b-559, purified from unstacked, ethanol-washed spinach thylakoid membranes, using extraction with 2% Triton X-100 in 4 M urea and three chromatographic steps in the presence of protease inhibitors, has a dominant band on sodium dodecyl sulfate-urea gels corresponding to Mr = 10,000. The yield of this preparation is 30-50% (5-10 mg) starting with 600 mg of chlorophyll. The heme content yields a calculated molecular weight of no more than 17,500/heme, and perhaps somewhat smaller after correction for impurities. The Mr = 10,000 band is stained by the tetramethylbenzidine-H2O2 heme reagent on lithium dodecyl sulfate gels run at 0 degrees C. The Mr = 10,000 protein, further separated by high performance liquid chromatography, contains a unique NH2 terminus that is not blocked, and the amino acid sequence for the first 27 residues is NH2-Ser-Gly-Ser-Thr-Gly-Glu-Arg-Ser-Phe-Ala-Asp-Ile-Ile-Thr-Ser-Ile-Arg-Tyr-Trp-Val-Ile-X-Ser-Ile-Thr-Ile-Pro. . . COOH. Approximately 55% of the amino acids are hydrophobic, based on amino acid analysis of the Mr = 10,000 peptide, which also indicated the presence of at least one histidine. Only one cytochrome b-559 component could be identified, whose yield indicated that it arises from a single b-559 protein in chloroplasts corresponding to the in situ high potential cytochrome of the chloroplast photosystem II. PMID:6706983

  1. Sequence-Specific Electrical Purification of Nucleic Acids with Nanoporous Gold Electrodes.

    PubMed

    Daggumati, Pallavi; Appelt, Sandra; Matharu, Zimple; Marco, Maria L; Seker, Erkin

    2016-06-22

    Nucleic-acid-based biosensors have enabled rapid and sensitive detection of pathogenic targets; however, these devices often require purified nucleic acids for analysis since the constituents of complex biological fluids adversely affect sensor performance. This purification step is typically performed outside the device, thereby increasing sample-to-answer time and introducing contaminants. We report a novel approach using a multifunctional matrix, nanoporous gold (np-Au), which enables both detection of specific target sequences in a complex biological sample and their subsequent purification. The np-Au electrodes modified with 26-mer DNA probes (via thiol-gold chemistry) enabled sensitive detection and capture of complementary DNA targets in the presence of complex media (fetal bovine serum) and other interfering DNA fragments in the range of 50-1500 base pairs. Upon capture, the noncomplementary DNA fragments and serum constituents of varying sizes were washed away. Finally, the surface-bound DNA-DNA hybrids were released by electrochemically cleaving the thiol-gold linkage, and the hybrids were iontophoretically eluted from the nanoporous matrix. The optical and electrophoretic characterization of the analytes before and after the detection-purification process revealed that low target DNA concentrations (80 pg/μL) can be successfully detected in complex biological fluids and subsequently released to yield pure hybrids free of polydisperse digested DNA fragments and serum biomolecules. Taken together, this multifunctional platform is expected to enable seamless integration of detection and purification of nucleic acid biomarkers of pathogens and diseases in miniaturized diagnostic devices. PMID:27244455

  2. Cloning and sequencing of a cDNA for Akazara scallop troponin T.

    PubMed

    Inoue, A; Ojima, T; Nishita, K

    1996-10-01

    A cDNA clone encoding troponin T of Akazara scallop (Chlamys nipponensis akazara) striated adductor muscle has been isolated and sequenced. The complete sequence deduced consists of 314 amino acid residues with a molecular weight of 37,206. Akazara scallop troponin T contains 55 amino acid residues more and 82 residues fewer than rabbit skeletal muscle troponin T and Drosophila melanogaster troponin T, respectively, showing almost the lowest sequence homology with rabbit troponin T (26%) but the highest homology with Drosophila troponin T (33%). Further, high sequence homology was seen in the functional regions: residues 33-120 and 174-227, corresponding respectively to residues 71-158 and 197-250 of rabbit troponin T (tropomyosin-binding regions); and residues 200-204, corresponding to 223-227 of rabbit troponin T (troponin I-binding region). In residues 1-70 (tropomyosin-binding region), however, only six residues are identical with rabbit troponin T. PMID:8947849

  3. Negative Ion In-Source Decay Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry for Sequencing Acidic Peptides

    NASA Astrophysics Data System (ADS)

    McMillen, Chelsea L.; Wright, Patience M.; Cassady, Carolyn J.

    2016-05-01

    Matrix-assisted laser desorption/ionization (MALDI) in-source decay was studied in the negative ion mode on deprotonated peptides to determine its usefulness for obtaining extensive sequence information for acidic peptides. Eight biological acidic peptides, ranging in size from 11 to 33 residues, were studied by negative ion mode ISD (nISD). The matrices 2,5-dihydroxybenzoic acid, 2-aminobenzoic acid, 2-aminobenzamide, 1,5-diaminonaphthalene, 5-amino-1-naphthol, 3-aminoquinoline, and 9-aminoacridine were used with each peptide. Optimal fragmentation was produced with 1,5-diaminonaphthalene (DAN), and extensive sequence informative fragmentation was observed for every peptide except hirudin(54-65). Cleavage at the N-Cα bond of the peptide backbone, producing c' and z' ions, was dominant for all peptides. Cleavage of the N-Cα bond N-terminal to proline residues was not observed. The formation of c and z ions is also found in electron transfer dissociation (ETD), electron capture dissociation (ECD), and positive ion mode ISD, which are considered to be radical-driven techniques. Oxidized insulin chain A, which has four highly acidic oxidized cysteine residues, had less extensive fragmentation. This peptide also exhibited the only charged localized fragmentation, with more pronounced product ion formation adjacent to the highly acidic residues. In addition, spectra were obtained by positive ion mode ISD for each protonated peptide; more sequence informative fragmentation was observed via nISD for all peptides. Three of the peptides studied had no product ion formation in ISD, but extensive sequence informative fragmentation was found in their nISD spectra. The results of this study indicate that nISD can be used to readily obtain sequence information for acidic peptides.

  4. Investigating homology between proteins using energetic profiles.

    PubMed

    Wrabl, James O; Hilser, Vincent J

    2010-03-01

    Accumulated experimental observations demonstrate that protein stability is often preserved upon conservative point mutation. In contrast, less is known about the effects of large sequence or structure changes on the stability of a particular fold. Almost completely unknown is the degree to which stability of different regions of a protein is generally preserved throughout evolution. In this work, these questions are addressed through thermodynamic analysis of a large representative sample of protein fold space based on remote, yet accepted, homology. More than 3,000 proteins were computationally analyzed using the structural-thermodynamic algorithm COREX/BEST. Estimated position-specific stability (i.e., local Gibbs free energy of folding) and its component enthalpy and entropy were quantitatively compared between all proteins in the sample according to all-vs.-all pairwise structural alignment. It was discovered that the local stabilities of homologous pairs were significantly more correlated than those of non-homologous pairs, indicating that local stability was indeed generally conserved throughout evolution. However, the position-specific enthalpy and entropy underlying stability were less correlated, suggesting that the overall regional stability of a protein was more important than the thermodynamic mechanism utilized to achieve that stability. Finally, two different types of statistically exceptional evolutionary structure-thermodynamic relationships were noted. First, many homologous proteins contained regions of similar thermodynamics despite localized structure change, suggesting a thermodynamic mechanism enabling evolutionary fold change. Second, some homologous proteins with extremely similar structures nonetheless exhibited different local stabilities, a phenomenon previously observed experimentally in this laboratory. These two observations, in conjunction with the principal conclusion that homologous proteins generally conserved local stability, may

  5. The F-actin capping proteins of Physarum polycephalum: cap42(a) is very similar, if not identical, to fragmin and is structurally and functionally very homologous to gelsolin; cap42(b) is Physarum actin.

    PubMed Central

    Ampe, C; Vandekerckhove, J

    1987-01-01

    We have carried out a primary structure analysis of the F-actin capping proteins of Physarum polycephalum. Cap42(b) was completely sequenced and was found to be identical with Physarum actin. Approximately 88% of the sequence of cap42(a) was determined. Cap42(a) and fragmin were found to be identical by amino acid composition, isoelectric point, mol. wt, elution time on reversed-phase chromatography and amino acid sequence of their tryptic peptides. The available sequence of cap42(a) is greater than 36% homologous with the NH2-terminal 42-kd domain of human gelsolin. A highly homologous region of 16 amino acids is also shared between cap42(a), gelsolin and the Acanthamoeba profilins. Cap42(a) binds two actin molecules in a similar way to gelsolin suggesting a mechanism of F-actin modulation that has been conserved during evolution. PMID:2832154

  6. The complete amino acid sequence of the A-chain of human plasma alpha 2HS-glycoprotein.

    PubMed

    Yoshioka, Y; Gejyo, F; Marti, T; Rickli, E E; Bürgi, W; Offner, G D; Troxler, R F; Schmid, K

    1986-02-01

    Normal human plasma alpha 2HS-glycoprotein has earlier been shown to be comprised of two polypeptide chains. Recently, the amino acid and carbohydrate sequences of the short chain were elucidated (Gejyo, F., Chang, J.-L., Bürgi, W., Schmid, K., Offner, G. D., Troxler, R.F., van Halbeck, H., Dorland, L., Gerwig, G. J., and Vliegenthart, J.F.G. (1983) J. Biol. Chem. 258, 4966-4971). In the present study, the amino acid sequence of the long chain of this protein, designated A-chain, was determined and found to consist of 282 amino acid residues. Twenty-four amino acid doublets were found; the most abundant of these are Pro-Pro and Ala-Ala which each occur five times. Of particular interest is the presence of three Gly-X-Pro and one Gly-Pro-X sequences that are characteristic of the repeating sequences of collagens. Chou-Fasman evaluation of the secondary structure suggested that the A-chain contains 29% alpha-helix, 24% beta-pleated sheet, and 26% reverse turns and, thus, approximately 80% of the polypeptide chain may display ordered structure. Four glycosylation sites were identified. The two N-glycosidic oligosaccharides were found in the center region (residues 138 and 158), whereas the two O-glycosidic heterosaccharides, both linked to threonine (residues 238 and 252), occur within the carboxyl-terminal region. The N-glycans are linked to Asn residues in beta-turns, while the O-glycans are located in short random segments. Comparison of the sequence of the amino- and carboxyl-terminal 30 residues with protein sequences in a data bank demonstrated that the A-chain is not significantly related to any known proteins. However, the proline-rich carboxyl-terminal region of the A-chain displays some sequence similarity to collagens and the collagen-like domains of complement subcomponent C1q. PMID:3944104
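
    The doublet and collagen-like motif counts reported above can be reproduced on any sequence with a short script. The sketch below assumes that "doublet" means two identical adjacent residues (as in Pro-Pro and Ala-Ala) and that X stands for any residue; the function names and the toy sequence are hypothetical and do not represent the A-chain.

```python
import re
from collections import Counter


def count_doublets(sequence):
    """Count adjacent identical residue pairs (e.g. PP, AA), allowing overlaps."""
    return Counter(a + b for a, b in zip(sequence, sequence[1:]) if a == b)


def find_motif(sequence, pattern):
    """Return 1-based start positions of a motif; the lookahead permits overlapping hits."""
    return [m.start() + 1 for m in re.finditer(f"(?=({pattern}))", sequence)]


toy = "AAGSPGPPAAGTPPP"  # hypothetical sequence for illustration

print(count_doublets(toy))     # doublet counts per residue pair
print(find_motif(toy, "G.P"))  # Gly-X-Pro start positions
print(find_motif(toy, "GP."))  # Gly-Pro-X start positions
```

    Overlapping pairs are counted separately here, so a run of three prolines contributes two Pro-Pro doublets; whether the original study counted runs this way is not stated.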

  7. Cloning and characterization of two vertebrate homologs of the Drosophila eyes absent gene.

    PubMed

    Zimmerman, J E; Bui, Q T; Steingrímsson, E; Nagle, D L; Fu, W; Genin, A; Spinner, N B; Copeland, N G; Jenkins, N A; Bucan, M; Bonini, N M

    1997-02-01

    The Drosophila eyes absent (eya) gene plays an essential role in the events that lead to proper development of the fly eye and embryo. Here we report the analysis of two human and two mouse homologs of the fly eya gene. Sequence comparison reveals a large domain of approximately 270 amino acids in the carboxyl terminus of the predicted mammalian proteins that shows 53% identity between the fly sequence and all of the vertebrate homologs. This Eya-homology domain is of novel sequence, with no previously identified motifs. RNA hybridization studies indicate that the mouse genes are expressed during embryogenesis and in select tissues of the adult. Both mouse Eya genes are expressed in the eye, suggesting that these genes may function in eye development in vertebrates as eya does in the fly. The mouse Eya2 gene maps to chromosome 2 in the region syntenic with human chromosome 20q13, and the mouse Eya3 gene maps to chromosome 4 in the region syntenic with human chromosome 1p36. Our findings support the notion that several families of genes (Pax-6/eyeless, Six-3/sine oculis, and Eya) play related and critical roles in the eye for both flies and vertebrates. PMID:9049631

  8. A novel mutation in TFL1 homolog affecting determinacy in cowpea (Vigna unguiculata).

    PubMed

    Dhanasekar, P; Reddy, K S

    2015-02-01

    Mutations in the widely conserved Arabidopsis Terminal Flower 1 (TFL1) gene and its homologs have been demonstrated to result in determinacy across genera, the knowledge of which is lacking in cowpea. Understanding the molecular events leading to determinacy of apical meristems could hasten development of cowpea varieties with suitable ideotypes. Isolation and characterization of a novel mutation in cowpea TFL1 homolog (VuTFL1) affecting determinacy is reported here for the first time. Cowpea TFL1 homolog was amplified using primers designed based on conserved sequences in related genera and sequence variation was analysed in three gamma ray-induced determinate mutants, their indeterminate parent "EC394763" and two indeterminate varieties. The analyses of sequence variation exposed a novel SNP distinguishing the determinate mutants from the indeterminate types. The non-synonymous point mutation in exon 4 at position 1,176 resulted from transversion of cytosine (C) to adenine (A) leading to an amino acid change (Pro-136 to His) in determinate mutants. The effect of the mutation on protein function and stability was predicted to be detrimental using different bioinformatics/computational tools. The functionally significant novel substitution mutation is hypothesized to affect determinacy in the cowpea mutants. Development of suitable regeneration protocols in this hitherto recalcitrant crop and subsequent complementation assay in mutants or over-expressing assay in parents could decisively conclude the role of the SNP in regulating determinacy in these cowpea mutants. PMID:25146839

  9. The developmental transcriptome landscape of bovine skeletal muscle defined by Ribo-Zero ribonucleic acid sequencing.

    PubMed

    Sun, X; Li, M; Sun, Y; Cai, H; Li, R; Wei, X; Lan, X; Huang, Y; Lei, C; Chen, H

    2015-12-01

    Ribonucleic acid sequencing (RNA-Seq) libraries are normally prepared with oligo(dT) selection of poly(A)+ mRNA, but this approach depends on intact total RNA samples. Recent studies have described Ribo-Zero technology, a novel method that can capture both poly(A)+ and poly(A)- transcripts from intact or fragmented RNA samples. We report here the first application of Ribo-Zero RNA-Seq for the analysis of the bovine embryonic, neonatal, and adult skeletal muscle whole transcriptome at an unprecedented depth. Overall, 19,893 genes were found to be expressed, with a high correlation of expression levels between the calf and the adult. Hundreds of genes were found to be highly expressed in the embryo and decreased at least 10-fold after birth, indicating their potential roles in embryonic muscle development. In addition, we present for the first time the analysis of global transcript isoform discovery in bovine skeletal muscle and identified 36,694 transcript isoforms. Transcriptomic data were also analyzed to unravel sequence variations; 185,036 putative SNP and 12,428 putative short insertions-deletions (InDel) were detected. Specifically, many stop-gain, stop-loss, and frameshift mutations were identified that probably change the relative protein production and subsequently affect the gene function. Notably, the numbers of stage-specific transcripts, alternative splicing events, SNP, and InDel were greater in the embryo than in the calf and the adult, suggesting that gene expression is most active in the embryo. The resulting view of the transcriptome at a single-base resolution greatly enhances the comprehensive transcript catalog and uncovers the global trends in gene expression during bovine skeletal muscle development. PMID:26641174

  10. Cloning and nucleotide sequences of livB and livC, the structural genes encoding binding proteins of the high-affinity branched-chain amino acid transport in Salmonella typhimurium.

    PubMed

    Ohnishi, K; Nakazima, A; Matsubara, K; Kiritani, K

    1990-02-01

    The liv gene cluster responsible for encoding the high-affinity branched-chain amino acid transport proteins in Salmonella typhimurium was mapped in the 7.6-kilobase HindIII-SacI segment of plasmid pMN12 by utilizing the gene dosage effect. By subcloning and biochemical analysis, the livB and livC structural genes encoding the leucine-, isoleucine-, valine-, threonine-binding protein (LIVT-BP) and the leucine-specific binding protein (L-BP), respectively, were localized within the 3,617-base HindIII-BstEII segment. Upon determining the nucleotide sequence of the 3,617 bases, we found that the coding sequence of the livB gene (1,095 base pairs) starts at position 355 and specifies the precursor LIVT-BP of 365 amino acid residues, and the livC gene (1,107 base pairs) starts at position 2,452 and encodes the precursor L-BP of 369 amino acid residues. The two genes, separated by a 1-kilobase intergenic region, each possess potential promoters and rho-independent transcriptional terminators. The mature LIVT-BP and L-BP are produced by removing the putative 21- and 23-residue signal peptides from the respective precursors. In comparison with the analogous two binding proteins from Escherichia coli K-12, strong homologies are observed. PMID:2193932
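
The coordinates quoted in this abstract are internally consistent, which a few lines of arithmetic can confirm. This is a minimal sketch: the helper names are invented here, coordinates are assumed to be 1-based and inclusive, and the figures 21 and 23 are read as signal-peptide lengths in residues.

```python
def cds_end(start: int, length_bp: int) -> int:
    """Last base of a coding sequence, using 1-based inclusive coordinates."""
    return start + length_bp - 1


def codon_count(length_bp: int) -> int:
    """Number of complete codons in a coding sequence of the given length."""
    if length_bp % 3 != 0:
        raise ValueError("coding length is not a multiple of 3")
    return length_bp // 3


# Values taken from the abstract.
SEGMENT_LENGTH = 3617
LIVB_START, LIVB_LENGTH, LIVB_SIGNAL = 355, 1095, 21
LIVC_START, LIVC_LENGTH, LIVC_SIGNAL = 2452, 1107, 23

livb_end = cds_end(LIVB_START, LIVB_LENGTH)
livc_end = cds_end(LIVC_START, LIVC_LENGTH)

# Bases lying strictly between the end of livB and the start of livC.
intergenic_bp = LIVC_START - livb_end - 1

# Mature protein lengths after removal of the assumed signal peptides.
mature_livt_bp = codon_count(LIVB_LENGTH) - LIVB_SIGNAL
mature_l_bp = codon_count(LIVC_LENGTH) - LIVC_SIGNAL

if __name__ == "__main__":
    print("livB:", LIVB_START, "to", livb_end, "=", codon_count(LIVB_LENGTH), "codons")
    print("livC:", LIVC_START, "to", livc_end, "=", codon_count(LIVC_LENGTH), "codons")
    print("intergenic region:", intergenic_bp, "bp")
    print("mature lengths:", mature_livt_bp, "and", mature_l_bp, "residues")
```

Under these assumptions the gap between the two genes comes to about 1 kilobase, matching the abstract, and both genes fit inside the 3,617-base segment.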

  11. Characterization and cDNA sequence of Bothriechis schlegelii l-amino acid oxidase with antibacterial activity.

    PubMed

    Vargas Muñoz, Leidy Johana; Estrada-Gomez, Sebastian; Núñez, Vitelbina; Sanz, Libia; Calvete, Juan J

    2014-08-01

    Snake venoms are complex mixtures of proteins including l-amino acid oxidase (lAAO). An lAAO (named BslAAO) with a mass of 56 kDa and a theoretical pI of 5.79 was purified from Bothriechis schlegelii venom through size-exclusion, ion-exchange and affinity chromatography. The entire protein sequence of 498 amino acids was determined from cDNA using reverse-transcribed mRNA isolated from venom gland. The enzyme showed dose-dependent inhibition of bacterial growth. BslAAO showed an inhibitory effect against S. aureus with a MIC of 4 μg/mL and an MBC of 8 μg/mL. Against Acinetobacter baumannii, it showed a MIC of 2 μg/mL and an MBC of 4 μg/mL. No effect was observed in Escherichia coli. This antibacterial activity was inhibited by catalase, indicating that antimicrobial activity was due to H2O2 production. BslAAO did not show any cytotoxic activity toward mouse myoblast cell line C2C12 or peripheral blood mononuclear cells. The enzyme oxidized l-Leu, with a Km of 16.37 μM and a Vmax of 0.39 μM/min. Snake venom lAAOs are potential frameworks for different therapeutic molecules, since these enzymes exhibit low MICs and MBCs and appear to be harmless to human cells, because microorganisms are generally several-fold more sensitive to reactive oxygen species than human tissues. PMID:24875315
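
The Km and Vmax reported for l-Leu can be turned into predicted reaction rates if the enzyme is assumed to follow simple Michaelis-Menten kinetics. The abstract reports only the two constants, so that assumption, the function name and the example concentrations are illustrative choices made here rather than the authors' analysis.

```python
def michaelis_menten_rate(
    substrate_um: float,
    km_um: float = 16.37,
    vmax_um_per_min: float = 0.39,
) -> float:
    """Initial rate v = Vmax * [S] / (Km + [S]), in micromolar per minute.

    Defaults are the l-Leu constants quoted in the abstract; the
    Michaelis-Menten form itself is an assumption of this sketch.
    """
    if substrate_um < 0:
        raise ValueError("substrate concentration must be non-negative")
    return vmax_um_per_min * substrate_um / (km_um + substrate_um)


if __name__ == "__main__":
    # Hypothetical substrate concentrations, chosen only for illustration.
    for concentration in (1.0, 16.37, 100.0, 1000.0):
        rate = michaelis_menten_rate(concentration)
        print(f"[S] = {concentration:8.2f} uM -> v = {rate:.4f} uM/min")
```

At a substrate concentration equal to Km the predicted rate is half of Vmax, and the rate approaches Vmax as the concentration grows.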

  12. Genome Sequence of a Candidate World Health Organization Reference Strain of Zika Virus for Nucleic Acid Testing

    PubMed Central

    Trösemeier, Jan-Hendrik; Musso, Didier; Blümel, Johannes; Thézé, Julien; Pybus, Oliver G.

    2016-01-01

    We report here the sequence of a candidate reference strain of Zika virus (ZIKV) developed on behalf of the World Health Organization (WHO). The ZIKV reference strain is intended for use in nucleic acid amplification (NAT)-based assays for the detection and quantification of ZIKV RNA. PMID:27587826

  13. Genome Sequence of Schizochytrium sp. CCTCC M209059, an Effective Producer of Docosahexaenoic Acid-Rich Lipids

    PubMed Central

    Ji, Xiao-Jun; Mo, Kai-Qiang; Ren, Lu-Jing; Li, Gan-Lu; Huang, Jian-Zhong

    2015-01-01

    Schizochytrium is an effective species for producing omega-3 docosahexaenoic acid (DHA). Here, we report a genome sequence of Schizochytrium sp. CCTCC M209059, which has a genome size of 39.09 Mb. It will provide the genomic basis for further insights into the metabolic and regulatory mechanisms underlying DHA formation. PMID:26251485

  14. Evolutionary Distance of Amino Acid Sequence Orthologs across Macaque Subspecies: Identifying Candidate Genes for SIV Resistance in Chinese Rhesus Macaques

    PubMed Central

    Ross, Cody T.; Roodgar, Morteza; Smith, David Glenn

    2015-01-01

    We use the Reciprocal Smallest Distance (RSD) algorithm to identify amino acid sequence orthologs in the Chinese and Indian rhesus macaque draft sequences and estimate the evolutionary distance between such orthologs. We then use GOanna to map gene function annotations and human gene identifiers to the rhesus macaque amino acid sequences. We conclude methodologically by cross-tabulating a list of amino acid orthologs with large divergence scores with a list of genes known to be involved in SIV or HIV pathogenesis. We find that many of the amino acid sequences with large evolutionary divergence scores, as calculated by the RSD algorithm, have been shown to be related to HIV pathogenesis in previous laboratory studies. Four of the strongest candidate genes for SIVmac resistance in Chinese rhesus macaques identified in this study are CDK9, CXCL12, TRIM21, and TRIM32. Additionally, ANKRD30A, CTSZ, GORASP2, GTF2H1, IL13RA1, MUC16, NMDAR1, Notch1, NT5M, PDCD5, RAD50, and TM9SF2 were identified as possible candidates, among others. We failed to find many laboratory experiments contrasting the effects of Indian and Chinese orthologs at these sites on SIVmac pathogenesis, but future comparative studies might hold fertile ground for research into the biological mechanisms underlying innate resistance to SIVmac in Chinese rhesus macaques. PMID:25884674
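
The reciprocal step at the heart of the RSD approach can be sketched in a few lines. This is a simplified illustration, not the published algorithm: a plain proportion of differing positions (p-distance) on pre-aligned, equal-length toy sequences stands in for the evolutionary distance estimate, and all names and sequences below are hypothetical.

```python
def p_distance(a: str, b: str) -> float:
    """Fraction of differing positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def nearest(query: str, candidates: dict) -> str:
    """Name of the candidate sequence with the smallest distance to the query."""
    return min(candidates, key=lambda name: p_distance(query, candidates[name]))


def reciprocal_smallest_distance(genome_a: dict, genome_b: dict) -> list:
    """Return (name_a, name_b, distance) for mutually nearest sequence pairs."""
    pairs = []
    for name_a, seq_a in genome_a.items():
        name_b = nearest(seq_a, genome_b)
        # Keep the pair only if the search from genome B points back to name_a.
        if nearest(genome_b[name_b], genome_a) == name_a:
            pairs.append((name_a, name_b, p_distance(seq_a, genome_b[name_b])))
    return pairs


if __name__ == "__main__":
    # Toy, made-up sequences; not real macaque data.
    chinese = {"c1": "MKTAYIAK", "c2": "GGSLLPWQ"}
    indian = {"i1": "MKTAYIAR", "i2": "GGSLLPWQ", "i3": "AAAAAAAA"}
    for name_a, name_b, distance in reciprocal_smallest_distance(chinese, indian):
        print(name_a, name_b, distance)
```

Pairs with the largest distances would then be the ones to cross-tabulate against a list of genes of interest, as the abstract describes.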

  15. Evolutionary distance of amino acid sequence orthologs across macaque subspecies: identifying candidate genes for SIV resistance in Chinese rhesus macaques.

    PubMed

    Ross, Cody T; Roodgar, Morteza; Smith, David Glenn

    2015-01-01

    We use the Reciprocal Smallest Distance (RSD) algorithm to identify amino acid sequence orthologs in the Chinese and Indian rhesus macaque draft sequences and estimate the evolutionary distance between such orthologs. We then use GOanna to map gene function annotations and human gene identifiers to the rhesus macaque amino acid sequences. We conclude methodologically by cross-tabulating a list of amino acid orthologs with large divergence scores with a list of genes known to be involved in SIV or HIV pathogenesis. We find that many of the amino acid sequences with large evolutionary divergence scores, as calculated by the RSD algorithm, have been shown to be related to HIV pathogenesis in previous laboratory studies. Four of the strongest candidate genes for SIVmac resistance in Chinese rhesus macaques identified in this study are CDK9, CXCL12, TRIM21, and TRIM32. Additionally, ANKRD30A, CTSZ, GORASP2, GTF2H1, IL13RA1, MUC16, NMDAR1, Notch1, NT5M, PDCD5, RAD50, and TM9SF2 were identified as possible candidates, among others. We failed to find many laboratory experiments contrasting the effects of Indian and Chinese orthologs at these sites on SIVmac