Science.gov

Sample records for embl nucleotide sequence

  1. The EMBL nucleotide sequence database.

    PubMed Central

    Stoesser, G; Moseley, M A; Sleep, J; McGowran, M; Garcia-Pastor, M; Sterk, P

    1998-01-01

    The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl.html) constitutes Europe's primary nucleotide sequence resource. DNA and RNA sequences are directly submitted from researchers and genome sequencing groups and collected from the scientific literature and patent applications (Fig. 1). In collaboration with DDBJ and GenBank the database is produced, maintained and distributed at the European Bioinformatics Institute. Database releases are produced quarterly and are distributed on CD-ROM. EBI's network services allow access to the most up-to-date data collection via Internet and World Wide Web interface, providing database searching and sequence similarity facilities plus access to a large number of additional databases. PMID:9399791

  2. Automated Identification of Nucleotide Sequences

    NASA Technical Reports Server (NTRS)

    Osman, Shariff; Venkateswaran, Kasthuri; Fox, George; Zhu, Dian-Hui

    2007-01-01

    STITCH is a computer program that processes raw nucleotide-sequence data to automatically remove unwanted vector information, perform reverse-complement comparison, stitch shorter sequences together to make longer ones to which the shorter ones presumably belong, and search against the user's choice of private and Internet-accessible public 16S rRNA databases. ["16S rRNA" denotes a ribosomal ribonucleic acid (rRNA) sequence that is common to all organisms.] In STITCH, a template 16S rRNA sequence is used to position forward and reverse reads. STITCH then automatically searches known 16S rRNA sequences in the user's chosen database(s) to find the sequence most similar to (the sequence that lies at the smallest edit distance from) each spliced sequence. The result of processing by STITCH is the identification of the most similar well-described bacterium. Whereas previously commercially available software for analyzing genetic sequences operates on one sequence at a time, STITCH can manipulate multiple sequences simultaneously to perform the aforementioned operations. A typical analysis of several dozen sequences (length of the order of 10^3 base pairs) by use of STITCH is completed in a few minutes, whereas such an analysis performed by use of prior software takes hours or days.
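
The two operations the STITCH abstract names, reverse-complement comparison and a smallest-edit-distance search, can be illustrated with a minimal sketch. This is not STITCH's code: the function names, the use of plain Levenshtein distance, and the tiny reference dictionary are all assumptions made for illustration.

```python
# Hypothetical sketch: reverse-complement handling plus nearest-reference
# search by Levenshtein (edit) distance. Not taken from STITCH itself.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def closest_reference(query: str, references: dict) -> tuple:
    """Return (name, distance) of the reference nearest to the query,
    trying the query in both orientations."""
    orientations = (query, reverse_complement(query))
    distance, name = min(
        (min(edit_distance(o, ref) for o in orientations), name)
        for name, ref in references.items()
    )
    return name, distance
```

A real 16S rRNA search over thousands of references would need a faster alignment method than this quadratic-time comparison; the sketch only shows what "smallest edit distance" means.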

  3. Nucleotide sequences encoding a thermostable alkaline protease

    DOEpatents

    Wilson, David B.; Lao, Guifang

    1998-01-01

    Nucleotide sequences, derived from a thermophilic actinomycete microorganism, which encode a thermostable alkaline protease are disclosed. Also disclosed are variants of the nucleotide sequences which encode a polypeptide having thermostable alkaline proteolytic activity. Recombinant thermostable alkaline protease or recombinant polypeptide may be obtained by culturing in a medium a host cell genetically engineered to contain and express a nucleotide sequence according to the present invention, and recovering the recombinant thermostable alkaline protease or recombinant polypeptide from the culture medium.

  4. Nucleotide sequences encoding a thermostable alkaline protease

    DOEpatents

    Wilson, D.B.; Lao, G.

    1998-01-06

    Nucleotide sequences, derived from a thermophilic actinomycete microorganism, which encode a thermostable alkaline protease are disclosed. Also disclosed are variants of the nucleotide sequences which encode a polypeptide having thermostable alkaline proteolytic activity. Recombinant thermostable alkaline protease or recombinant polypeptide may be obtained by culturing in a medium a host cell genetically engineered to contain and express a nucleotide sequence according to the present invention, and recovering the recombinant thermostable alkaline protease or recombinant polypeptide from the culture medium. 3 figs.

  5. Submitting MIGS, MIMS, MIENS Information to EMBL and Standards and the Sequencing Pipelines of the Gordon and Betty Moore Foundation (GSC8 Meeting)

    SciTech Connect

    Vaughan, Bob; Kaye, Jon

    2009-09-09

    The Genomic Standards Consortium was formed in September 2005. It is an international, open-membership working body which promotes standardization in the description of genomes and the exchange and integration of genomic data. The 2009 meeting was an activity of a five-year funding "Research Coordination Network" from the National Science Foundation and was held at the DOE Joint Genome Institute with organizational support provided by the JGI and by the University of California - San Diego. Bob Vaughan of EMBL speaks on submitting MIGS/MIMS/MIENS information to EMBL-EBI's system, followed by a brief talk from Jon Kaye of the Gordon and Betty Moore Foundation on standards and the foundation's sequencing pipelines at the Genomic Standards Consortium's 8th meeting at the DOE JGI in Walnut Creek, Calif. on Sept. 9, 2009.

  6. Submitting MIGS, MIMS, MIENS Information to EMBL and Standards and the Sequencing Pipelines of the Gordon and Betty Moore Foundation (GSC8 Meeting)

    ScienceCinema

    Vaughan, Bob [EMBL]; Kaye, Jon [Gordon and Betty Moore Foundation]

    2016-07-12

    The Genomic Standards Consortium was formed in September 2005. It is an international, open-membership working body which promotes standardization in the description of genomes and the exchange and integration of genomic data. The 2009 meeting was an activity of a five-year funding "Research Coordination Network" from the National Science Foundation and was held at the DOE Joint Genome Institute with organizational support provided by the JGI and by the University of California - San Diego. Bob Vaughan of EMBL speaks on submitting MIGS/MIMS/MIENS information to EMBL-EBI's system, followed by a brief talk from Jon Kaye of the Gordon and Betty Moore Foundation on standards and the foundation's sequencing pipelines at the Genomic Standards Consortium's 8th meeting at the DOE JGI in Walnut Creek, Calif. on Sept. 9, 2009.

  7. Long-range correlations in nucleotide sequences

    NASA Astrophysics Data System (ADS)

    Peng, C.-K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Sciortino, F.; Simons, M.; Stanley, H. E.

    1992-03-01

    DNA sequences have been analysed using models, such as an n-step Markov chain, that incorporate the possibility of short-range nucleotide correlations. We propose here a method for studying the stochastic properties of nucleotide sequences by constructing a 1:1 map of the nucleotide sequence onto a walk, which we term a 'DNA walk'. We then use the mapping to provide a quantitative measure of the correlation between nucleotides over long distances along the DNA chain. Thus we uncover in the nucleotide sequence a remarkably long-range power law correlation that implies a new scale-invariant property of DNA. We find such long-range correlations in intron-containing genes and in nontranscribed regulatory DNA sequences, but not in complementary DNA sequences or intron-less genes.

  8. Long-range correlations in nucleotide sequences

    NASA Technical Reports Server (NTRS)

    Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Sciortino, F.; Simons, M.; Stanley, H. E.

    1992-01-01

    DNA sequences have been analysed using models, such as an n-step Markov chain, that incorporate the possibility of short-range nucleotide correlations. We propose here a method for studying the stochastic properties of nucleotide sequences by constructing a 1:1 map of the nucleotide sequence onto a walk, which we term a 'DNA walk'. We then use the mapping to provide a quantitative measure of the correlation between nucleotides over long distances along the DNA chain. Thus we uncover in the nucleotide sequence a remarkably long-range power law correlation that implies a new scale-invariant property of DNA. We find such long-range correlations in intron-containing genes and in nontranscribed regulatory DNA sequences, but not in complementary DNA sequences or intron-less genes.
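
The 'DNA walk' mapping can be sketched in a few lines. The abstract does not state the step rule, so the one used here (step +1 for a pyrimidine, -1 for a purine) and the root-mean-square fluctuation measure are assumptions chosen for illustration, as are the function names.

```python
# Hypothetical sketch of a DNA walk and its fluctuation over a window.
# Assumed step rule: pyrimidine (C, T) -> +1, purine (A, G) -> -1.
from math import sqrt

STEP = {"C": 1, "T": 1, "A": -1, "G": -1}


def dna_walk(seq: str) -> list:
    """Return walker positions y(0), y(1), ..., y(n), starting at 0."""
    positions = [0]
    for base in seq:
        positions.append(positions[-1] + STEP[base])
    return positions


def fluctuation(walk: list, window: int) -> float:
    """Standard deviation of the displacement y(i + window) - y(i),
    taken over every start position i that fits in the walk."""
    diffs = [walk[i + window] - walk[i] for i in range(len(walk) - window)]
    mean = sum(diffs) / len(diffs)
    mean_square = sum(d * d for d in diffs) / len(diffs)
    return sqrt(max(mean_square - mean * mean, 0.0))
```

Plotting `fluctuation` against `window` on log-log axes is one way to look for a power law: an uncorrelated sequence would be expected to give a slope near 0.5, and a clearly different slope would suggest long-range correlation.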

  9. Nucleotide Sequence-Based Multitarget Identification

    PubMed Central

    Vinayagamoorthy, T.; Mulatz, Kirk; Hodkinson, Roger

    2003-01-01

    MULTIGEN technology (T. Vinayagamoorthy, U.S. patent 6,197,510, March 2001) is a modification of conventional sequencing technology that generates a single electropherogram consisting of short nucleotide sequences from a mixture of known DNA targets. The target sequences may be present on the same or different nucleic acid molecules. For example, when two DNA targets are sequenced, the first and second sequencing primers are annealed to their respective target sequences, and then a polymerase causes chain extension by the addition of new deoxyribose nucleotides. Since the electrophoretic separation depends on the relative molecular weights of the truncated molecules, the molecular weight of the second sequencing primer was specifically designed to be higher than the combined molecular weight of the first sequencing primer plus the molecular weight of the largest truncated molecule generated from the first target sequence. Thus, the series of truncated molecules produced by the second sequencing primer will have higher molecular weights than those produced by the first sequencing primer. Hence, the truncated molecules produced by these two sequencing primers can be effectively separated in a single lane by standard gel electrophoresis in a single electropherogram without any overlapping of the nucleotide sequences. By using sequencing primers with progressively higher molecular weights, multiple short DNA sequences from a variety of targets can be determined simultaneously. We describe here the basic concept of MULTIGEN technology and three applications: detection of sexually transmitted pathogens (Neisseria gonorrhoeae, Chlamydia trachomatis, and Ureaplasma urealyticum), detection of contaminants in meat samples (coliforms, fecal coliforms, and Escherichia coli O157:H7), and detection of single-nucleotide polymorphisms in the human N-acetyltransferase (NAT1) gene (S. Fronhoffs et al., Carcinogenesis 22:1405-1412, 2001). PMID:12843076

  10. The EMBL-EBI channel.

    PubMed

    McEntyre, Jo; Birney, Ewan

    2016-01-01

    This editorial introduces the EMBL-EBI channel in F1000Research. The aims of the channel are to present EMBL-EBI outputs and collate research published on F1000Research contributed, in whole or in part, by EMBL-EBI researchers. PMID:26913196

  11. Complete Nucleotide Sequence of Tn10

    PubMed Central

    Chalmers, Ronald; Sewitz, Sven; Lipkow, Karen; Crellin, Paul

    2000-01-01

    The complete nucleotide sequence of Tn10 has been determined. The dinucleotide signature and percent G+C of the sequence had no discontinuities, indicating that Tn10 constitutes a homogeneous unit. The new sequence contained three new open reading frames corresponding to a glutamate permease, repressors of heavy metal resistance operons, and a hypothetical protein in Bacillus subtilis. The glutamate permease was fully functional when expressed, but Tn10 did not protect Escherichia coli from the toxic effects of various metals. PMID:10781570

  12. Remote access to ACNUC nucleotide and protein sequence databases at PBIL.

    PubMed

    Gouy, Manolo; Delmotte, Stéphane

    2008-04-01

    The ACNUC biological sequence database system provides powerful and fast query and extraction capabilities to a variety of nucleotide and protein sequence databases. The collection of ACNUC databases served by the Pôle Bio-Informatique Lyonnais includes the EMBL, GenBank, RefSeq and UniProt nucleotide and protein sequence databases and a series of other sequence databases that support comparative genomics analyses: HOVERGEN and HOGENOM containing families of homologous protein-coding genes from vertebrate and prokaryotic genomes, respectively; Ensembl and Genome Reviews for analyses of prokaryotic and of selected eukaryotic genomes. This report describes the main features of the ACNUC system and the access to ACNUC databases from any internet-connected computer. Such access was made possible by the definition of a remote ACNUC access protocol and the implementation of Application Programming Interfaces between the C, Python and R languages and this communication protocol. Two retrieval programs for ACNUC databases, Query_win, with a graphical user interface and raa_query, with a command line interface, are also described. Altogether, these bioinformatics tools provide users with either ready-to-use means of querying remote sequence databases through a variety of selection criteria, or a simple way to endow application programs with an extensive access to these databases. Remote access to ACNUC databases is open to all and fully documented (http://pbil.univ-lyon1.fr/databases/acnuc/acnuc.html).

  13. Reading biological processes from nucleotide sequences

    NASA Astrophysics Data System (ADS)

    Murugan, Anand

    Cellular processes have traditionally been investigated by techniques of imaging and biochemical analysis of the molecules involved. The recent rapid progress in our ability to manipulate and read nucleic acid sequences gives us direct access to the genetic information that directs and constrains biological processes. While sequence data is being used widely to investigate genotype-phenotype relationships and population structure, here we use sequencing to understand biophysical mechanisms. We present work on two different systems. First, in chapter 2, we characterize the stochastic genetic editing mechanism that produces diverse T-cell receptors in the human immune system. We do this by inferring statistical distributions of the underlying biochemical events that generate T-cell receptor coding sequences from the statistics of the observed sequences. This inferred model quantitatively describes the potential repertoire of T-cell receptors that can be produced by an individual, providing insight into its potential diversity and the probability of generation of any specific T-cell receptor. Then in chapter 3, we present work on understanding the functioning of regulatory DNA sequences in both prokaryotes and eukaryotes. Here we use experiments that measure the transcriptional activity of large libraries of mutagenized promoters and enhancers and infer models of the sequence-function relationship from this data. For the bacterial promoter, we infer a physically motivated 'thermodynamic' model of the interaction of DNA-binding proteins and RNA polymerase determining the transcription rate of the downstream gene. For the eukaryotic enhancers, we infer heuristic models of the sequence-function relationship and use these models to find synthetic enhancer sequences that optimize inducibility of expression. 
    Both projects demonstrate the utility of sequence information in conjunction with sophisticated statistical inference techniques for dissecting underlying biophysical

  14. Nucleotide sequence of SHV-2 beta-lactamase gene

    SciTech Connect

    Garbarg-Chenon, A.; Godard, V.; Labia, R.; Nicolas, J.C.

    1990-07-01

    The nucleotide sequence of plasmid-mediated beta-lactamase SHV-2 from Salmonella typhimurium (SHV-2pHT1) was determined. The gene was very similar to chromosomally encoded beta-lactamase LEN-1 of Klebsiella pneumoniae. Compared with the sequence of the Escherichia coli SHV-2 enzyme (SHV-2E.coli) obtained by protein sequencing, the deduced amino acid sequence of SHV-2pHT1 differed by three amino acid substitutions.

  15. [Tabular excel editor for analysis of aligned nucleotide sequences].

    PubMed

    Demkin, V V

    2010-01-01

    The Excel platform was used to convert multiple nucleotide sequence alignments obtained from the BLAST network service into a form suited to visual analysis and editing. Two macros for MS Excel 2007 were written. An array of aligned sequences transformed into an Excel table and processed with these macros is better suited to analysis than the initial HTML data.

  16. 77 FR 65537 - Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-10-29

    ... Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request. SUMMARY: The United States....'' SUPPLEMENTARY INFORMATION: I. Abstract Patent applications that contain nucleotide and/or amino acid...

  17. Nucleotide sequence stability of the genome of hepatitis delta virus.

    PubMed Central

    Netter, H J; Wu, T T; Bockol, M; Cywinski, A; Ryu, W S; Tennant, B C; Taylor, J M

    1995-01-01

    Cultured cells were cotransfected with a fully sequenced 1,679-base cDNA clone of human hepatitis delta virus (HDV) RNA genome and a cDNA for the genome of woodchuck hepatitis virus (WHV). The HDV particles released were able to infect a woodchuck that was chronically infected with WHV. The HDV so produced was passaged a total of six times in woodchucks in order to determine the stability of the HDV nucleotide sequence. During a final chronic infection with such virus, liver RNA was extracted, and the HDV nucleotide sequence for the 352-base region, positions 905 to 1256, was obtained. By means of PCR, we obtained double-stranded cDNA both for direct sequencing and also for molecular cloning followed by sequencing. By direct sequencing, we found that a consensus sequence existed and was identical to the original sequence. From the sequences of 31 clones, we found 32% (10 of 31) to be identical to the original single nucleotide sequence. For the remainder, there were neither insertions nor deletions but there was a small number of single-nucleotide changes. These changes were predominantly transitions rather than transversions. Furthermore, the transitions were largely of just two types, uridine to cytidine and adenosine to guanosine. Of the 40 changes detected on HDV, 35% (14 of 40) occurred within an eight-nucleotide region that included position 1012, previously shown to be a site of RNA editing. These findings may have significant implications regarding both the stability of the HDV RNA genome and the mechanism of RNA editing. PMID:7853505
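
The distinction the abstract draws between transitions and transversions can be made concrete with a short sketch. It assumes two RNA sequences of equal length (the abstract reports no insertions or deletions); the function names and the example strings are hypothetical and are not data from the study.

```python
# Hypothetical sketch: list single-nucleotide changes between a reference
# and a clone, labelling each as a transition or a transversion.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U"}


def classify_substitution(ref: str, alt: str) -> str:
    """A transition stays within purines or within pyrimidines;
    a transversion swaps one class for the other."""
    if ref == alt:
        raise ValueError("not a substitution")
    pair = {ref, alt}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def substitutions(reference: str, clone: str) -> list:
    """Return (position, ref_base, clone_base, kind) for each mismatch,
    with positions counted from 1."""
    if len(reference) != len(clone):
        raise ValueError("sequences must be the same length")
    return [
        (pos, r, c, classify_substitution(r, c))
        for pos, (r, c) in enumerate(zip(reference, clone), start=1)
        if r != c
    ]
```

Under this scheme the two change types the abstract highlights, uridine to cytidine and adenosine to guanosine, are both transitions.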

  18. The nucleotide sequence of cowpea mosaic virus B RNA

    PubMed Central

    Lomonossoff, G.P.; Shanks, M.

    1983-01-01

    The complete sequence of the bottom component RNA (B RNA) of cowpea mosaic virus (CPMV) has been determined. Restriction enzyme fragments of double-stranded cDNA were cloned in M13 and the sequence of the inserts was determined by a combination of enzymatic and chemical sequencing techniques. Additional sequence information was obtained by primed synthesis on first strand cDNA. The complete sequence deduced is 5889 nucleotides long excluding the 3' poly(A), and contains an open reading frame sufficient to code for a polypeptide of mol. wt. 207 760. The coding region is flanked by a 5' leader sequence of 206 nucleotides and a 3' non-coding region of 82 residues which does not contain a polyadenylation signal. PMID:16453487

  19. Nucleotide sequence composition and method for detection of Neisseria gonorrhoeae

    SciTech Connect

    Lo, A.; Yang, H.L.

    1990-02-13

    This patent describes a composition of matter that is specific for Neisseria gonorrhoeae. It comprises at least one nucleotide sequence for which the ratio of the amount of the sequence which hybridizes to chromosomal DNA of Neisseria gonorrhoeae to the amount of the sequence which hybridizes to chromosomal DNA of Neisseria meningitidis is greater than about five. The ratio is obtained by a method described.

  20. Nucleotide sequence of the tobacco (Nicotiana tabacum) anionic peroxidase gene

    SciTech Connect

    Diaz-De-Leon, F.; Klotz, K.L.; Lagrimini, L.M.

    1993-03-01

    Peroxidases have been implicated in numerous physiological processes including lignification (Grisebach, 1981), wound-healing (Espelie et al., 1986), phenol oxidation (Lagrimini, 1991), pathogen defense (Ye et al., 1990), and the regulation of cell elongation through the formation of interchain covalent bonds between various cell wall polymers (Fry, 1986; Goldberg et al., 1986; Bradley et al., 1992). However, a complete description of peroxidase action in vivo is not available because of the vast number of potential substrates and the existence of multiple isoenzymes. The tobacco anionic peroxidase is one of the better-characterized isoenzymes. This enzyme has been shown to oxidize a number of significant plant secondary compounds in vitro including cinnamyl alcohols, phenolic acids, and indole-3-acetic acid (Maeder, 1980; Lagrimini, 1991). A cDNA encoding the enzyme has been obtained, and this enzyme was shown to be expressed at the highest levels in lignifying tissues (xylem and tracheary elements) and also in epidermal tissue (Lagrimini et al., 1987). It was shown at this time that there were four distinct copies of the anionic peroxidase gene in tobacco (Nicotiana tabacum). A tobacco genomic DNA library was constructed in the λ phage EMBL3, from which two unique peroxidase genes were sequenced. One of these clones, λPOD1, was designated as a pseudogene when the exonic sequences were found to differ from the cDNA sequences by 1%, and several frame shifts in the coding sequences indicated a dysfunctional gene (the authors' unpublished results). The other clone, λPOD3, described in this manuscript, was designated as the functional tobacco anionic peroxidase gene because of 100% homology with the cDNA. Significant structural elements include an AS-2 box indicated in shoot-specific expression (Lam and Chua, 1989), a TATA box, and two intervening sequences. 10 refs., 1 tab.

  1. Moss Phylogeny Reconstruction Using Nucleotide Pangenome of Complete Mitogenome Sequences.

    PubMed

    Goryunov, D V; Nagaev, B E; Nikolaev, M Yu; Alexeevski, A V; Troitsky, A V

    2015-11-01

    Stability of composition and sequence of genes was shown earlier in 13 mitochondrial genomes of mosses (Rensing, S. A., et al. (2008) Science, 319, 64-69). It is of interest to study the evolution of mitochondrial genomes not only at the gene level, but also on the level of nucleotide sequences. To do this, we have constructed a "nucleotide pangenome" for mitochondrial genomes of 24 moss species. The nucleotide pangenome is a set of aligned nucleotide sequences of orthologous genome fragments covering the totality of all genomes. The nucleotide pangenome was constructed using specially developed new software, NPG-explorer (NPGe). The stable part of the mitochondrial genome (232 stable blocks) is shown to be, on average, 45% of its length. In the joint alignment of stable blocks, 82% of positions are conserved. The phylogenetic tree constructed with the NPGe program is in good correlation with other phylogenetic reconstructions. With the NPGe program, 30 blocks have been identified with repeats no shorter than 50 bp. The maximal length of a block with repeats is 140 bp. Duplications in the mitochondrial genomes of mosses are rare. On average, the genome contains about 500 bp in large duplications. The total length of insertions and deletions was determined in each genome. The losses and gains of DNA regions are rather active in mitochondrial genomes of mosses, and such rearrangements presumably can be used as additional markers in the reconstruction of phylogeny. PMID:26615445

  2. Information capacity of nucleotide sequences and its applications.

    PubMed

    Sadovsky, M G

    2006-05-01

    The information capacity of nucleotide sequences is defined through the specific entropy of frequency dictionary of a sequence determined with respect to another one containing the most probable continuations of shorter strings. This measure distinguishes a sequence both from a random one, and from ordered entity. A comparison of sequences based on their information capacity is studied. An order within the genetic entities is found at the length scale ranged from 3 to 8. Some other applications of the developed methodology to genetics, bioinformatics, and molecular biology are discussed.
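
One possible reading of this definition can be written out as a short sketch. The abstract gives no formula, so everything here is an assumption: the sequence is treated as a ring, the "most probable continuation" of shorter strings is taken to be f(w[:-1]) * f(w[1:]) / f(w[1:-1]), and the capacity is taken to be the relative entropy (in bits) of the observed q-word frequencies with respect to that reconstruction. The function names are hypothetical.

```python
# Hypothetical sketch of an information-capacity measure for a sequence.
# Assumptions: circular word counting; expected frequency of a q-word built
# from its two overlapping (q-1)-words and their shared (q-2)-word.
from collections import Counter
from math import log2


def frequency_dictionary(seq: str, q: int) -> dict:
    """Frequencies of all words of length q, reading the sequence as a ring."""
    if q == 0:
        return {"": 1.0}
    n = len(seq)
    ring = seq + seq[:q - 1]
    counts = Counter(ring[i:i + q] for i in range(n))
    return {word: count / n for word, count in counts.items()}


def information_capacity(seq: str, q: int) -> float:
    """Relative entropy of observed q-word frequencies against the
    frequencies reconstructed from shorter words (requires q >= 2)."""
    if q < 2:
        raise ValueError("q must be at least 2")
    observed = frequency_dictionary(seq, q)
    shorter = frequency_dictionary(seq, q - 1)
    overlap = frequency_dictionary(seq, q - 2)
    total = 0.0
    for word, freq in observed.items():
        expected = shorter[word[:-1]] * shorter[word[1:]] / overlap[word[1:-1]]
        total += freq * log2(freq / expected)
    return total
```

With this reading, a value near zero at length q means the q-words are fully predictable from shorter words, while larger values mean that words of that length carry structure the shorter ones do not.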

  3. Method for the detection of specific nucleic acid sequences by polymerase nucleotide incorporation

    DOEpatents

    Castro, Alonso

    2004-06-01

    A method for rapid and efficient detection of a target DNA or RNA sequence is provided. A primer having a 3'-hydroxyl group at one end and having a sequence of nucleotides sufficiently homologous with an identifying sequence of nucleotides in the target DNA is selected. The primer is hybridized to the identifying sequence of nucleotides on the DNA or RNA sequence and a reporter molecule is synthesized on the target sequence by progressively binding complementary nucleotides to the primer, where the complementary nucleotides include nucleotides labeled with a fluorophore. Fluorescence emitted by fluorophores on single reporter molecules is detected to identify the target DNA or RNA sequence.

  4. The nucleotide sequence of cloned wheat dwarf virus DNA

    PubMed Central

    MacDowell, S. W.; Macdonald, H.; Hamilton, W. D. O.; Coutts, R. H. A.; Buck, K. W.

    1985-01-01

    Restriction analysis and cloning of virus-specific double-stranded DNA isolated from plants infected with wheat dwarf virus (WDV) indicated that the virus genome, like that of maize streak virus (MSV), consists of a single DNA circle. The complete nucleotide sequence of cloned WDV DNA (2749 nucleotides) has been determined. Comparison of the potential coding regions in WDV DNA with those in the DNA of two strains of MSV suggests that these viruses encode at least two functional proteins, the coat protein read in the virion (+) DNA sense and a composite protein, formed from two open reading regions, in the complementary (−) DNA sense. Although WDV and MSV are serologically unrelated, their coat proteins showed 35% direct amino acid sequence homology and their DNAs showed 46% nucleotide sequence homology. There was too little homology between the DNAs of WDV and those of two geminiviruses with bipartite genomes, cassava latent virus (CLV) and tomato golden mosaic virus (TGMV), to align the sequences. However, comparison of the amino acid sequences of predicted proteins of WDV, MSV, TGMV and CLV revealed clear relationships between these viruses and suggested that the monopartite and the bipartite geminiviruses have a common ancestral origin. Four inverted repeat sequences which have the potential to form hairpin structures of ΔG ≥ -14 kcal/mol were detected in WDV DNA. The sequence TAATATTAC present in the loop of one of these hairpins is conserved in similar putative structures in MSV DNA and in both DNA components of CLV and TGMV and may function as a recognition sequence for a protein involved in virus DNA replication. PMID:15938050

  5. The primary nucleotide sequence of U4 RNA.

    PubMed

    Reddy, R; Henning, D; Busch, H

    1981-04-10

    U4 RNA is one of the "capped" nuclear snRNAs recently found to be precipitable by anti-Sm antibodies as ribonucleoprotein particles. U4 RNA, along with other snRNAs, has been implicated in hnRNA processing, mRNA transport, or both (Lerner, M. R., Boyle, J., Mount, S., Wolin, S., and Steitz, J. A. (1980) Nature 283, 220-224). Since the proteins bound to different snRNAs appear to be the same, the functions of different snRNPs might be dependent on the RNA components. To help understand the function of U4 RNP, the nucleotide sequence of U4 RNA was determined. The sequence is (formula see text). In addition to the modified nucleotides in the "cap," U4 RNA contains Am at position 63 and m6A at position 98. It also exhibited A-C microheterogeneity at position 97. PMID:6162848

  6. Nucleotide-Specific Contrast for DNA Sequencing by Electron Spectroscopy.

    PubMed

    Mankos, Marian; Persson, Henrik H J; N'Diaye, Alpha T; Shadman, Khashayar; Schmid, Andreas K; Davis, Ronald W

    2016-01-01

    DNA sequencing by imaging in an electron microscope is an approach that holds promise to deliver long reads with low error rates and without the need for amplification. Earlier work using transmission electron microscopes, which use high electron energies on the order of 100 keV, has shown that low contrast and radiation damage necessitate the use of heavy atom labeling of individual nucleotides, which increases the read error rates. Other prior work using scattering electrons with much lower energy has been shown to suppress beam damage on DNA. Here we explore possibilities to increase contrast by employing two methods, X-ray photoelectron and Auger electron spectroscopy. Using bulk DNA samples with monomers of each base, both methods are shown to provide contrast mechanisms that can distinguish individual nucleotides without labels. Both spectroscopic techniques can be readily implemented in a low energy electron microscope, which may enable label-free DNA sequencing by direct imaging. PMID:27149617

  7. Nucleotide-Specific Contrast for DNA Sequencing by Electron Spectroscopy

    PubMed Central

    Schmid, Andreas K.; Davis, Ronald W.

    2016-01-01

    DNA sequencing by imaging in an electron microscope is an approach that holds promise to deliver long reads with low error rates and without the need for amplification. Earlier work using transmission electron microscopes, which use high electron energies on the order of 100 keV, has shown that low contrast and radiation damage necessitate the use of heavy atom labeling of individual nucleotides, which increases the read error rates. Other prior work using scattering electrons with much lower energy has been shown to suppress beam damage on DNA. Here we explore possibilities to increase contrast by employing two methods, X-ray photoelectron and Auger electron spectroscopy. Using bulk DNA samples with monomers of each base, both methods are shown to provide contrast mechanisms that can distinguish individual nucleotides without labels. Both spectroscopic techniques can be readily implemented in a low energy electron microscope, which may enable label-free DNA sequencing by direct imaging. PMID:27149617

  8. The complete nucleotide sequence of pelargonium leaf curl virus.

    PubMed

    McGavin, Wendy J; MacFarlane, Stuart A

    2016-05-01

    Investigation of a tombusvirus isolated from tulip plants in Scotland revealed that it was pelargonium leaf curl virus (PLCV) rather than the originally suggested tomato bushy stunt virus. The complete sequence of the PLCV genome was determined for the first time, revealing it to be 4789 nucleotides in size and to have an organization similar to that of the other, previously described tombusviruses. Primers derived from the sequence were used to construct a full-length infectious clone of PLCV that recapitulates the disease symptoms of leaf curling in systemically infected pelargonium plants.

  9. The complete nucleotide sequence of pelargonium leaf curl virus.

    PubMed

    McGavin, Wendy J; MacFarlane, Stuart A

    2016-05-01

    Investigation of a tombusvirus isolated from tulip plants in Scotland revealed that it was pelargonium leaf curl virus (PLCV) rather than the originally suggested tomato bushy stunt virus. The complete sequence of the PLCV genome was determined for the first time, revealing it to be 4789 nucleotides in size and to have an organization similar to that of the other, previously described tombusviruses. Primers derived from the sequence were used to construct a full-length infectious clone of PLCV that recapitulates the disease symptoms of leaf curling in systemically infected pelargonium plants. PMID:26906694

  10. Fluorogenic sequencing using halogen-fluorescein-labeled nucleotides.

    PubMed

    Chen, Zitian; Duan, Haifeng; Qiao, Shuo; Zhou, Wenxiong; Qiu, Haiwei; Kang, Li; Xie, X Sunney; Huang, Yanyi

    2015-05-26

    Fluorogenic sequencing is a sequencing-by-synthesis technology that combines the advantages of pyrosequencing and fluorescence detection. With native duplex DNA as the major product, we employ polymerase to incorporate the complementarily matched terminal phosphate-labeled fluorogenic nucleotides into the DNA template and release halogen-fluorescein as the reporter. This red-emitting fluorophore successfully avoids spectral overlap with the autofluorescence background of the flow chip. We fully characterized the enzymatic reaction kinetics of the new substrates, and performed a 35-base sequencing experiment with 60 reaction cycles. Our achievement expands the substrate repertoire for fluorogenic sequencing, and extends the spectral range to obtain better signal-to-background performance.

  11. Comparing compressed sequences for faster nucleotide BLAST searches.

    PubMed

    Cameron, Michael; Williams, Hugh E

    2007-01-01

    Molecular biologists, geneticists, and other life scientists use the BLAST homology search package as their first step for discovery of information about unknown or poorly annotated genomic sequences. There are two main variants of BLAST: BLASTP for searching protein collections and BLASTN for nucleotide collections. Surprisingly, BLASTN has had very little attention; for example, the algorithms it uses do not follow those described in the 1997 BLAST paper and no exact description has been published. It is important that BLASTN is state-of-the-art: Nucleotide collections such as GenBank dwarf the protein collections in size, they double in size almost yearly, and they take many minutes to search on modern general purpose workstations. This paper proposes significant improvements to the BLASTN algorithms. Each of our schemes is based on compressed bytepacked formats that allow queries and collection sequences to be compared four bases at a time, permitting very fast query evaluation using lookup tables and numeric comparisons. Our most significant innovations are two new, fast gapped alignment schemes that allow accurate sequence alignment without decompression of the collection sequences. Overall, our innovations more than double the speed of BLASTN with no effect on accuracy and have been integrated into our new version of BLAST that is freely available for download from http://www.fsa-blast.org/. PMID:17666756
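    The idea of comparing sequences four bases at a time can be illustrated with a minimal sketch. It assumes a 2-bit code per base (A=0, C=1, G=2, T=3) and a 256-entry lookup table indexed by the XOR of two packed bytes; the function names and the encoding are hypothetical choices for illustration, not the format or algorithms used by the authors' software.

```python
# Illustrative only: a 2-bit, four-bases-per-byte packing (an assumption,
# not the authors' actual on-disk format).
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def pack(seq):
    """Pack a DNA string into bytes, four bases per byte."""
    if len(seq) % 4:
        raise ValueError("this sketch requires a length that is a multiple of 4")
    packed = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | CODE[base]
        packed.append(byte)
    return bytes(packed)


# For every possible XOR of two packed bytes, precompute how many of the
# four 2-bit fields are zero, i.e. how many bases are identical.
MATCHES = [
    sum(1 for shift in (0, 2, 4, 6) if (x >> shift) & 3 == 0)
    for x in range(256)
]


def count_matches(a, b):
    """Count identical bases between two packed sequences of equal length."""
    return sum(MATCHES[x ^ y] for x, y in zip(a, b))


query = pack("ACGTACGT")
subject = pack("ACGAACGT")
print(count_matches(query, subject))
```

    Each table lookup scores four aligned bases at once, so an ungapped comparison touches one byte per four nucleotides and never unpacks the collection sequence.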

  13. Bioinformatics comparison of sulfate-reducing metabolism nucleotide sequences

    NASA Astrophysics Data System (ADS)

    Tremberger, G.; Dehipawala, Sunil; Nguyen, A.; Cheung, E.; Sullivan, R.; Holden, T.; Lieberman, D.; Cheung, T.

    2015-09-01

    Sulfate-reducing bacteria can be traced back to 3.5 billion years ago. The thermodynamic details of the sulfur cycle have been well documented. A recent sulfate-reducing bacteria report (Robator, Jungbluth, et al., 2015 Jan, Front. Microbiol) with GenBank nucleotide data has been analyzed in terms of the sulfite reductase (dsrAB) via fractal dimension and entropy values. Comparison to oil field sulfate-reducing sequences was included. The AUCG translational mass fractal dimension versus ATCG transcriptional mass fractal dimension for the low temperature dsrB and dsrA sequences reported in Reference Thirteen shows correlation R-sq ~ 0.79, with a probability of about 3% in simulation. A recent report of using a Cystathionine gamma-lyase sequence to produce CdS quantum dots by a biological method, where the sulfur is reduced just as in the H2S production process, was included for comparison. The AUCG mass fractal dimension versus ATCG mass fractal dimension for the Cystathionine gamma-lyase sequences was found to have R-sq of 0.72, similar to the low temperature dissimilatory sulfite reductase dsr group with 3% probability, in contrast to the oil field group having R-sq ~ 0.94, a highly probable outcome in the simulation. The other two simulation histograms, namely, fractal dimension versus entropy R-sq outcome values, and di-nucleotide entropy versus mono-nucleotide entropy R-sq outcome values, are also discussed in the data analysis focusing on low probability outcomes.

  14. Cytochrome b nucleotide sequence variation among the Atlantic Alcidae.

    PubMed

    Friesen, V L; Montevecchi, W A; Davidson, W S

    1993-01-01

    Analysis of cytochrome b nucleotide sequences of the six extant species of Atlantic alcids and a gull revealed an excess of adenines and cytosines and a deficit of guanines at silent sites on the coding strand. Phylogenetic analyses grouped the sequences of the common (Uria aalge) and Brünnich's (U. lomvia) guillemots, followed by the razorbill (Alca torda) and little auk (Alle alle). The black guillemot (Cepphus grylle) sequence formed a sister taxon, and the puffin (Fratercula arctica) fell outside the other alcids. Phylogenetic comparisons of substitutions indicated that mutabilities of bases did not differ, but that C was much more likely to be incorporated than was G. Imbalances in base composition appear to result from a strand bias in replication errors, which may result from selection on secondary RNA structure and/or the energetics of codon-anticodon interactions. PMID:7916741

  15. Detection of protein similarities using nucleotide sequence databases.

    PubMed

    Henikoff, S; Wallace, J C

    1988-07-11

    A simple procedure is described for finding similarities between proteins using nucleotide sequence databases. The approach is illustrated by several examples of previously unknown correspondences with important biological implications: Drosophila elongation factor Tu is shown to be encoded by two genes that are differently expressed during development; a cluster of three Drosophila genes likely encode maltases; a flesh-fly fat body protein resembles the hypothesized Drosophila alcohol dehydrogenase ancestral protein; an unknown protein encoded at the multifunctional E. coli hisT locus resembles aspartate beta-semialdehyde dehydrogenase; and the E. coli tyrR protein is related to nitrogen regulatory proteins. These and other matches were discovered using a personal computer of the type available in most laboratories collecting DNA sequence data. As relatively few sequences were sampled to find these matches, it is likely that much of the existing data has not been adequately examined.

  16. Petabyte-scale innovations at the European Nucleotide Archive.

    PubMed

    Cochrane, Guy; Akhtar, Ruth; Bonfield, James; Bower, Lawrence; Demiralp, Fehmi; Faruque, Nadeem; Gibson, Richard; Hoad, Gemma; Hubbard, Tim; Hunter, Christopher; Jang, Mikyung; Juhos, Szilveszter; Leinonen, Rasko; Leonard, Steven; Lin, Quan; Lopez, Rodrigo; Lorenc, Dariusz; McWilliam, Hamish; Mukherjee, Gaurab; Plaister, Sheila; Radhakrishnan, Rajesh; Robinson, Stephen; Sobhany, Siamak; Hoopen, Petra Ten; Vaughan, Robert; Zalunin, Vadim; Birney, Ewan

    2009-01-01

    Dramatic increases in the throughput of nucleotide sequencing machines, and the promise of ever greater performance, have thrust bioinformatics into the era of petabyte-scale data sets. Sequence repositories, which provide the feed for these data sets into the worldwide computational infrastructure, are challenged by the impact of these data volumes. The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/embl), comprising the EMBL Nucleotide Sequence Database and the Ensembl Trace Archive, has identified challenges in the storage, movement, analysis, interpretation and visualization of petabyte-scale data sets. We present here our new repository for next generation sequence data, a brief summary of contents of the ENA and provide details of major developments to submission pipelines, high-throughput rule-based validation infrastructure and data integration approaches.

  17. Nucleotide sequence and expression of a Drosophila metallothionein.

    PubMed

    Lastowski-Perry, D; Otto, E; Maroni, G

    1985-02-10

    A Drosophila melanogaster cDNA clone was isolated based on its more intense hybridization to RNA sequences from copper-fed larvae than from control larval RNA. This clone showed strong hybridization to mouse metallothionein I cDNA at reduced stringency. Its nucleotide sequence includes an open reading segment which codes for a 40-amino acid protein; this protein is identified as metallothionein based on its similarity to the amino-terminal portion of mammalian and crab metalloproteins. The 10 cysteine residues present occur in five pairs of near vicinal cysteines (Cys-X-Cys). This cDNA sequence hybridized to a 400-nucleotide polyadenylated RNA whose presence in the cells of the alimentary canal of larvae was stimulated by ingestion of cadmium or copper; in other tissues this RNA was present at much lower levels. Mercury, silver, and zinc induced metallothionein to a lesser extent. The level of metallothionein RNA increased very soon after the initiation of metal treatment and reached a maximum after approximately 36 h. PMID:2578462

  18. Nucleotide sequence of the vaccinia virus hemagglutinin gene.

    PubMed

    Shida, H

    1986-04-30

    Vaccinia virus hemagglutinin (HA) is expressed late in the infection cycle and is nonessential for virus growth. The HA structural gene was located, by hybrid-arrested and hybrid-selected translation methods, at the right terminus of the HindIII A fragment. The position of the HA gene was confirmed by the production of the complete HA protein in cells transfected with a plasmid containing that region. Examination of the nucleotide sequence revealed the positions of cleavage sites for a number of restriction endonucleases. The deduced amino acid sequence revealed that the HA protein is a typical surface membrane glycoprotein. Comparison of the nucleotide sequence upstream of the HA coding region with the corresponding regions of other late genes suggested the existence of the consensus decanucleotide TTCATTTa/tGT between 34 and 18 bp upstream of the initiation codon, followed by a cluster of A or T, a unique feature of the late genes of vaccinia virus. These results, in conjunction with the ease of isolating HA- mutants, provide a basis for a new site suitable for inserting foreign genes.

  19. Nucleotide sequence of Bacillus phage Nf terminal protein gene.

    PubMed Central

    Leavitt, M C; Ito, J

    1987-01-01

    The nucleotide sequence of Bacillus phage Nf gene E has been determined. Gene E codes for phage terminal protein, which is the primer necessary for the initiation of DNA replication. The deduced amino acid sequence of Nf terminal protein is approximately 66% homologous with the terminal proteins of Bacillus phages PZA and φ29, and shows similar hydropathy and secondary structure predictions. A serine that has been identified as the residue which covalently links the protein to the 5' end of the genome in φ29 is conserved in all three phages. The hydropathic and secondary structural environment of this serine is similar in these phage terminal proteins and also similar to the linking serine of adenovirus terminal protein. PMID:3601672

  20. Genome nucleotide composition shapes variation in simple sequence repeats.

    PubMed

    Tian, Xiangjun; Strassmann, Joan E; Queller, David C

    2011-02-01

    Simple sequence repeats (SSRs) or microsatellites are a common component of genomes but vary greatly across species in their abundance. We tested the hypothesis that this variation is due in part to AT/GC content of genomes, with genomes biased toward either high AT or high CG generating more short random repeats that are long enough to enhance expansion through slippage during replication. To test this hypothesis, we identified repeats with perfect tandem iterations of 1-6 bp from 25 protists with complete or near-complete genome sequences. As expected, the density and the frequency are highly related to genome AT content, with excellent fits to quadratic regressions with minima near a 50% AT content and rising toward both extremes. Within species, the same trends hold, except the limited variation in AT content within each species places each mainly on the descending (GC rich), middle, or ascending (AT rich) part of the curve. The base usages of repeat motifs are also significantly correlated with genome nucleotide compositions: Percentages of AT-rich motifs rise with the increase of genome AT content but vice versa for GC-rich subgroups. Amino acid homopolymer repeats also show the expected quadratic relationship, with higher abundance in species with AT content biased in either direction. Our results show that genome nucleotide composition explains up to half of the variance in the abundance and motif constitution of SSRs.
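    Detecting perfect tandem iterations of 1-6 bp motifs, and computing the AT content they are compared against, can be sketched as follows. The minimum of three copies and the function names are assumptions made for this illustration; the abstract does not state the thresholds the authors used.

```python
import re


def find_ssrs(seq, min_copies=3):
    """Return (start, motif, copies) for perfect tandem repeats of 1-6 bp motifs.

    min_copies is a hypothetical threshold chosen for this sketch.
    The lazy quantifier makes the regex prefer the shortest motif,
    so "ATATAT" is reported as (AT)x3 rather than (ATAT)x1.5.
    """
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_copies - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits


def at_content(seq):
    """Fraction of A and T bases in the sequence."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)


example = "GGCATATATATCC"
print(find_ssrs(example))
print(round(at_content(example), 3))
```

    Repeat density per genome (hits per unit length) could then be plotted against AT content to look for the quadratic relationship described above.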

  1. Nucleotide sequences specific to Yersinia pestis and methods for the detection of Yersinia pestis

    DOEpatents

    McCready, Paula M.; Radnedge, Lyndsay; Andersen, Gary L.; Ott, Linda L.; Slezak, Thomas R.; Kuczmarski, Thomas A.; Motin, Vladinir L.

    2009-02-24

    Nucleotide sequences specific to Yersinia pestis that serve as markers or signatures for identification of this bacterium were identified. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.

  2. Nucleotide sequences specific to Brucella and methods for the detection of Brucella

    SciTech Connect

    McCready, Paula M.; Radnedge, Lyndsay; Andersen, Gary L.; Ott, Linda L.; Slezak, Thomas R.; Kuczmarski, Thomas A.

    2009-02-24

    Nucleotide sequences specific to Brucella that serve as markers or signatures for identification of this bacterium were identified. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.

  3. Nucleotide sequences specific to Francisella tularensis and methods for the detection of Francisella tularensis

    DOEpatents

    McCready, Paula M.; Radnedge, Lyndsay; Andersen, Gary L.; Ott, Linda L.; Slezak, Thomas R.; Kuczmarski, Thomas A.; Vitalis, Elizabeth A

    2007-02-06

    Described herein is the identification of nucleotide sequences specific to Francisella tularensis that serve as markers or signatures for identification of this bacterium. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.

  4. Nucleotide sequences specific to Francisella tularensis and methods for the detection of Francisella tularensis

    DOEpatents

    McCready, Paula M.; Radnedge, Lyndsay; Andersen, Gary L.; Ott, Linda L.; Slezak, Thomas R.; Kuczmarski, Thomas A.; Vitalis, Elizabeth A

    2009-02-24

    Described herein is the identification of nucleotide sequences specific to Francisella tularensis that serve as markers or signatures for identification of this bacterium. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.

  5. Generalized Levy-walk model for DNA nucleotide sequences

    NASA Technical Reports Server (NTRS)

    Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Simons, M.; Stanley, H. E.

    1993-01-01

    We propose a generalized Levy walk to model fractal landscapes observed in noncoding DNA sequences. We find that this model provides a very close approximation to the empirical data and explains a number of statistical properties of genomic DNA sequences such as the distribution of strand-biased regions (those with an excess of one type of nucleotide) as well as local changes in the slope of the correlation exponent $\alpha$. The generalized Levy-walk model simultaneously accounts for the long-range correlations in noncoding DNA sequences and for the apparently paradoxical finding of long subregions of biased random walks (length $l_j$) within these correlated sequences. In the generalized Levy-walk model, the $l_j$ are chosen from a power-law distribution $P(l_j) \propto l_j^{-\mu}$. The correlation exponent $\alpha$ is related to $\mu$ through $\alpha = 2 - \mu/2$ if $2 < \mu < 3$. The model is consistent with the finding of "repetitive elements" of variable length interspersed within noncoding DNA.
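    A rough sketch of such a walk is given below. It draws segment lengths from a power law with exponent mu by inverse-transform sampling and gives each segment a randomly oriented bias. The bias strength, the minimum segment length of 1, and all function names are assumptions made for illustration; the abstract does not specify how the bias within a subregion is parameterized.

```python
import random


def alpha_from_mu(mu):
    """Correlation exponent from the relation alpha = 2 - mu/2, valid for 2 < mu < 3."""
    if not 2 < mu < 3:
        raise ValueError("the relation is stated only for 2 < mu < 3")
    return 2 - mu / 2


def sample_segment_length(mu, rng, l_min=1.0):
    """Draw a length from P(l) ~ l^(-mu) for l >= l_min (inverse-transform sampling)."""
    u = rng.random()  # uniform in [0, 1), so 1 - u is in (0, 1]
    return int(l_min * (1.0 - u) ** (-1.0 / (mu - 1.0)))


def levy_walk_steps(n_steps, mu, bias=0.2, seed=0):
    """Generate +1/-1 steps made of biased segments with power-law lengths.

    bias is a hypothetical parameter: within a segment, the favoured
    direction is taken with probability 0.5 + bias / 2.
    """
    rng = random.Random(seed)
    steps = []
    while len(steps) < n_steps:
        # Cap the segment so a rare, very long draw cannot overrun the walk.
        length = min(sample_segment_length(mu, rng), n_steps - len(steps))
        sign = rng.choice((-1, 1))
        p_up = 0.5 + sign * bias / 2
        steps.extend(1 if rng.random() < p_up else -1 for _ in range(length))
    return steps


steps = levy_walk_steps(1000, mu=2.5)
print(alpha_from_mu(2.5), len(steps), sum(steps))
```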

  6. Empirical Bayes Estimation of Coalescence Times from Nucleotide Sequence Data.

    PubMed

    King, Leandra; Wakeley, John

    2016-09-01

    We demonstrate the advantages of using information at many unlinked loci to better calibrate estimates of the time to the most recent common ancestor (TMRCA) at a given locus. To this end, we apply a simple empirical Bayes method to estimate the TMRCA. This method is both asymptotically optimal, in the sense that the estimator converges to the true value when the number of unlinked loci for which we have information is large, and has the advantage of not making any assumptions about demographic history. The algorithm works as follows: we first split the sample at each locus into inferred left and right clades to obtain many estimates of the TMRCA, which we can average to obtain an initial estimate of the TMRCA. We then use nucleotide sequence data from other unlinked loci to form an empirical distribution that we can use to improve this initial estimate. PMID:27440864

  7. Cloning, nucleotide sequence, and expression of Achromobacter protease I gene.

    PubMed

    Ohara, T; Makino, K; Shinagawa, H; Nakata, A; Norioka, S; Sakiyama, F

    1989-12-01

    Achromobacter protease I (API) is a lysine-specific serine protease that specifically hydrolyzes lysyl peptide bonds. A gene coding for API was cloned from Achromobacter lyticus M497-1. The nucleotide sequence of the cloned DNA fragment revealed that the gene codes for a single polypeptide chain of 653 amino acids. The N-terminal 205 amino acids, including the signal peptide, and the threonine/serine-rich C-terminal 180 amino acids flank the 268-amino acid mature protein, which was identified by protein sequencing. Escherichia coli carrying a plasmid containing the cloned API gene overproduced and secreted a protein of Mr 50,000 (API') into the periplasm. This protein also exhibited a distinct endopeptidase activity specific for lysyl bonds. The N-terminal amino acid sequence of API' was the same as that of mature API, suggesting that the enzyme retained the C-terminal extended peptide chain. The present experiments indicate that API, an extracellular protease produced by gram-negative bacteria, is synthesized in vivo as a precursor protein bearing long extended peptide chains at both N and C termini. PMID:2684982

  8. Complete nucleotide sequence of Nootka lupine vein-clearing virus.

    PubMed

    Robertson, Nancy L; Côté, Fabien; Paré, Christine; Leblanc, Eric; Bergeron, Michel G; Leclerc, Denis

    2007-12-01

    The complete genome sequence of Nootka lupine vein-clearing virus (NLVCV) was determined to be 4,172 nucleotides in length, containing four open reading frames (ORFs) with a genetic organization similar to that of virus species in the genus Carmovirus, family Tombusviridae. The order and gene product size, starting from the 5'-proximal ORF, consisted of: (1) polymerase/replicase gene, ORF1 (p27) and ORF1RT (readthrough) (p87), (2) movement proteins ORF2 (p7) and ORF3 (p9), and (3) the 3'-proximal coat protein ORF4 (p37). The genomic 5'- and 3'-proximal termini contained a short (59 nt) and a relatively longer (405 nt) untranslated region, respectively. The longer replicase gene product contained the GDD motif common to RNA-dependent RNA polymerases. Phylogenetically, NLVCV formed a subgroup with the following four carmoviruses when separately comparing the amino acids of the coat protein or replicase protein: Angelonia flower break virus (AnFBV), Carnation mottle virus (CarMV), Pelargonium flower break virus (PFBV), and Saguaro cactus virus (SgCV). Whole genome nucleotide analysis (percent identities) among the carmoviruses with NLVCV suggested a similar pattern. The species demarcation criteria in the genus Carmovirus for the amino acid sequence identity of the polymerase (<52%) and coat (<41%) protein genes excluded NLVCV as a distinct species and instead placed it as a tentative strain of CarMV, PFBV, or SgCV when both the polymerase and CP were used as the determining factors. In contrast, the species criteria that included different host ranges with no overlap and lack of serological relatedness between NLVCV and the carmoviruses suggested that NLVCV was a distinct species. The relatively low cutoff percentages that allow the polymerase and CP genes to dictate the inclusion/exclusion of a distinct carmovirus species should be reevaluated. Therefore, at this time we have concluded that NLVCV should be classified as a tentative new species in the genus Carmovirus.

  9. DNA sequencing using differential extension with nucleotide subsets (DENS).

    PubMed Central

    Raja, M C; Zevin-Sonkin, D; Shwartzburd, J; Rozovskaya, T A; Sobolev, I A; Chertkov, O; Ramanathan, V; Lvovsky, L; Ulanovsky, L E

    1997-01-01

    Here we describe template directed enzymatic synthesis of unique primers, avoiding the chemical synthesis step in primer walking. We have termed this conceptually new technique DENS (differential extension with nucleotide subsets). DENS works by selectively extending a short primer, making it a long one at the intended site only. The procedure starts with a limited initial extension of the primer (at 20-30 degrees C) in the presence of only two out of the four possible dNTPs. The primer is extended by 6-9 bases or longer at the intended priming site, which is deliberately selected, (as is the two-dNTP set), to maximize the extension length. The subsequent termination reaction at 60-65 degrees C then accepts the extended primer at the intended site, but not at alternative sites, where the initial extension (if any) is generally much shorter. DENS allows the use of primers as long as 8mers (degenerate in two positions) which prime much more strongly than modular primers involving 5-7mers and which (unlike the latter) can be used with thermostable polymerases, thus allowing cycle-sequencing with dye-terminators compatible with Taq DNA polymerase, as well as making double-stranded DNA sequencing more robust. PMID:9016632
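    The selection step described above, picking the two-dNTP set that maximizes the initial extension at the intended site, can be sketched as follows. The sketch assumes the template is given as the string of bases the polymerase reads immediately downstream of the annealed primer, in reading order; the function names and the example template are hypothetical.

```python
from itertools import combinations

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def extension_length(template_downstream, dntps):
    """Number of bases added before a nucleotide outside the dNTP subset is required.

    template_downstream: template bases in the order the polymerase reads them,
    starting right after the primer's 3' end (an assumption of this sketch).
    """
    length = 0
    for base in template_downstream:
        if COMPLEMENT[base] not in dntps:
            break
        length += 1
    return length


def best_dntp_pair(template_downstream):
    """Return the two-dNTP subset giving the longest extension, with that length."""
    pairs = combinations("ACGT", 2)
    best = max(pairs, key=lambda pair: extension_length(template_downstream, pair))
    return best, extension_length(template_downstream, best)


# Hypothetical template region: the first five positions need only A or T.
print(best_dntp_pair("TTATAGCA"))
```

    Running the same function over alternative priming sites would show how much shorter the extension is expected to be there, which is the property the higher-temperature termination reaction relies on.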

  10. Nucleotide sequence determines the accelerated rate of point mutations.

    PubMed

    Kini, R Manjunatha; Chinnasamy, Arunkumar

    2010-09-01

    Although the theory of evolution was put forth about 150 years ago our understanding of how molecules drive evolution remains poor. It is well-established that proteins evolve at different rates, essentially based on their functional role and three-dimensional structure. However, the highly variable rates of evolution of different proteins - especially the rapidly evolving ones - within a single organism are poorly understood. Using examples of genes for fast-evolving toxins and human hereditary diseases, we show for the first time that specific nucleotide sequences appear to determine point mutation rates. Based on mutation rates, we have classified triplets (not just codons) into stable, unstable and intermediate groups. Toxin genes contain a relatively higher percentage of unstable triplets in their exons compared to introns, whereas non-toxin genes contain a higher percentage of unstable triplets in their introns. Thus the distribution of stable and unstable triplets is correlated with and may explain the accelerated evolution of point mutations in toxins. Similarly, at the genomic level, lower organisms with genes that evolve faster contain a higher percentage of unstable triplets compared to higher organisms. These findings show that mutation rates of proteins, and hence of the organisms, are DNA sequence-dependent and thus provide a proximate mechanism of evolution at the molecular level. PMID:20362603

  11. Single nucleotide polymorphisms associated with rat expressed sequences.

    PubMed

    Guryev, Victor; Berezikov, Eugene; Malik, Rainer; Plasterk, Ronald H A; Cuppen, Edwin

    2004-07-01

    Single nucleotide polymorphisms (SNPs) are the most common source of genetic variation in populations and are thus most likely to account for the majority of phenotypic and behavioral differences between individuals or strains. Although the rat is extensively studied for the latter, data on naturally occurring polymorphisms are mostly lacking. We have used publicly available sequences consisting of whole-genome shotgun (WGS), expressed sequence tag (EST), and mRNA data as a source for the in silico identification of SNPs in gene-coding regions and have identified a large collection of 33,305 high-quality candidate SNPs. Experimental verification of 471 candidate SNPs using a limited set of rat isolates revealed a confirmation rate of approximately 50%. Although the majority of SNPs were identified between Sprague-Dawley (EST data) and Brown Norway (WGS data) strains, we found that 66% of the verified variations are common among different rat strains. All SNPs were extensively annotated, including chromosomal and genetic map information, and nonsynonymous SNPs were analyzed by SIFT and PolyPhen prediction programs for their potential deleterious effect on protein function. Interestingly, we retrieved three SNPs from the database that result in the introduction of a premature stop codon and that could be confirmed experimentally. Two of these "in silico-identified knockouts" reside in interesting QTL regions. Data are publicly available via a Web interface (http://cascad.niob.knaw.nl), allowing simple and advanced search queries.

  12. The nucleotide sequence of the uvrD gene of E. coli.

    PubMed Central

    Finch, P W; Emmerson, P T

    1984-01-01

    The nucleotide sequence of a cloned section of the E. coli chromosome containing the uvrD gene has been determined. The coding region for the UvrD protein consists of 2,160 nucleotides which would direct the synthesis of a polypeptide 720 amino acids long with a calculated molecular weight of 82 kd. The predicted amino acid sequence of the UvrD protein has been compared with the amino acid sequences of other known adenine nucleotide binding proteins and a common sequence has been identified, thought to contribute towards adenine nucleotide binding. PMID:6379604

  13. Spatially localized generation of nucleotide sequence-specific DNA damage

    PubMed Central

    Oh, Dennis H.; King, Brett A.; Boxer, Steven G.; Hanawalt, Philip C.

    2001-01-01

    Psoralens linked to triplex-forming oligonucleotides (psoTFOs) have been used in conjunction with laser-induced two-photon excitation (TPE) to damage a specific DNA target sequence. To demonstrate that TPE can initiate photochemistry resulting in psoralen–DNA photoadducts, target DNA sequences were incubated with psoTFOs to form triple-helical complexes and then irradiated in liquid solution with pulsed 765-nm laser light, which is half the quantum energy required for conventional one-photon excitation, as used in psoralen + UV A radiation (320–400 nm) therapy. Target DNA acquired strand-specific psoralen monoadducts in a light dose-dependent fashion. To localize DNA damage in a model tissue-like medium, a DNA–psoTFO mixture was prepared in a polyacrylamide gel and then irradiated with a converging laser beam targeting the rear of the gel. The highest number of photoadducts formed at the rear while relatively sparing DNA at the front of the gel, demonstrating spatial localization of sequence-specific DNA damage by TPE. To assess whether TPE treatment could be extended to cells without significant toxicity, cultured monolayers of normal human dermal fibroblasts were incubated with tritium-labeled psoralen without TFO to maximize detectable damage and irradiated by TPE. DNA from irradiated cells treated with psoralen exhibited a 4- to 7-fold increase in tritium activity relative to untreated controls. Functional survival assays indicated that the psoralen–TPE treatment was not toxic to cells. These results demonstrate that DNA damage can be simultaneously manipulated at the nucleotide level and in three dimensions. This approach for targeting photochemical DNA damage may have photochemotherapeutic applications in skin and other optically accessible tissues. PMID:11572980

  14. Spatially localized generation of nucleotide sequence-specific DNA damage.

    PubMed

    Oh, D H; King, B A; Boxer, S G; Hanawalt, P C

    2001-09-25

    Psoralens linked to triplex-forming oligonucleotides (psoTFOs) have been used in conjunction with laser-induced two-photon excitation (TPE) to damage a specific DNA target sequence. To demonstrate that TPE can initiate photochemistry resulting in psoralen-DNA photoadducts, target DNA sequences were incubated with psoTFOs to form triple-helical complexes and then irradiated in liquid solution with pulsed 765-nm laser light, which is half the quantum energy required for conventional one-photon excitation, as used in psoralen + UV A radiation (320-400 nm) therapy. Target DNA acquired strand-specific psoralen monoadducts in a light dose-dependent fashion. To localize DNA damage in a model tissue-like medium, a DNA-psoTFO mixture was prepared in a polyacrylamide gel and then irradiated with a converging laser beam targeting the rear of the gel. The highest number of photoadducts formed at the rear while relatively sparing DNA at the front of the gel, demonstrating spatial localization of sequence-specific DNA damage by TPE. To assess whether TPE treatment could be extended to cells without significant toxicity, cultured monolayers of normal human dermal fibroblasts were incubated with tritium-labeled psoralen without TFO to maximize detectable damage and irradiated by TPE. DNA from irradiated cells treated with psoralen exhibited a 4- to 7-fold increase in tritium activity relative to untreated controls. Functional survival assays indicated that the psoralen-TPE treatment was not toxic to cells. These results demonstrate that DNA damage can be simultaneously manipulated at the nucleotide level and in three dimensions. This approach for targeting photochemical DNA damage may have photochemotherapeutic applications in skin and other optically accessible tissues. PMID:11572980

  15. Nucleotide sequence and temporal expression of a baculovirus regulatory gene.

    PubMed

    Guarino, L A; Summers, M D

    1987-07-01

    The nucleotide sequence of a trans-activating regulatory gene (IE-1) of the baculovirus Autographa californica nuclear polyhedrosis virus has been determined. This gene encodes a protein of 581 amino acids with a predicted molecular weight of 66,856. A DNA fragment containing the entire coding sequence of IE-1 was inserted downstream of an RNA promoter. Subsequent cell-free transcription and translation directed the synthesis of a single peptide with an apparent molecular weight of 70,000. Quantitative S1 nuclease analysis indicated that IE-1 was maximally synthesized during a 1-h virus adsorption period and that steady-state levels of IE-1 message were maintained during the first 24 h of infection. Northern blot hybridization indicated that several late transcripts which overlap the IE-1 gene were transcribed from both strands. The precise locations of the 5' and 3' ends of these overlapping transcripts were mapped using S1 nuclease. The overlapping transcripts were grouped in two transcriptional units. One unit was composed of IE-1 and overlapping gamma transcripts which initiated upstream of IE-1 and terminated downstream of IE-1. The other unit, transcribed from the opposite strand, consisted of gamma transcripts with coterminal 5' ends and extended 3' ends. The shorter, more abundant transcripts in this unit overlapped 30 to 40 bases of IE-1 at the 3' end, while the longer transcripts overlapped the entire IE-1 gene. Transcription of several early A. californica nuclear polyhedrosis virus genes, in addition to 39K, was shown to be trans-activated by IE-1, indicating that IE-1 may have a central role in the regulation of beta-gene expression. PMID:16789264

  16. Complete nucleotide sequence of a monopartite Begomovirus and associated satellites infecting Carica papaya in Nepal.

    PubMed

    Shahid, M S; Yoshida, S; Khatri-Chhetri, G B; Briddon, R W; Natsuaki, K T

    2013-06-01

    Carica papaya (papaya) is a fruit crop that is cultivated mostly in kitchen gardens throughout Nepal. Leaf samples of C. papaya plants with leaf curling, vein darkening, vein thickening, and a reduction in leaf size were collected from a garden in Darai village, Rampur, Nepal in 2010. Full-length clones of a monopartite Begomovirus, a betasatellite and an alphasatellite were isolated. The complete nucleotide sequence of the Begomovirus showed the arrangement of genes typical of Old World begomoviruses with the highest nucleotide sequence identity (>99 %) to an isolate of Ageratum yellow vein virus (AYVV), confirming it as an isolate of AYVV. The complete nucleotide sequence of the betasatellite showed greater than 89 % nucleotide sequence identity to an isolate of Tomato leaf curl Java betasatellite originating from Indonesia. The sequence of the alphasatellite displayed 92 % nucleotide sequence identity to Sida yellow vein China alphasatellite. This is the first identification of these components in Nepal and the first time they have been identified in papaya.

  17. The nucleotide sequence of the amiE gene of Pseudomonas aeruginosa.

    PubMed

    Brammar, W J; Charles, I G; Matfield, M; Liu, C P; Drew, R E; Clarke, P H

    1987-05-11

    The nucleotide sequence of the amiE gene, encoding the aliphatic amidase of Pseudomonas aeruginosa, has been determined. The sequence of 1038 nucleotides shows a strong bias in favour of codons with G or C in the third position, and only 44 different codons are utilised.
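
    The two statistics quoted above (third-position G/C bias and the number of distinct codons used) can be computed from any in-frame coding sequence. The following is a minimal sketch, assuming a plain string input; the function names and the toy sequence are illustrative and are not taken from the paper. For scale, 1038 nucleotides correspond to 346 codons.

```python
from collections import Counter

def codon_usage(cds: str) -> Counter:
    """Count codons in an in-frame coding sequence (a trailing partial codon is ignored)."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))

def gc3_fraction(usage: Counter) -> float:
    """Fraction of codons whose third base is G or C."""
    total = sum(usage.values())
    gc3 = sum(n for codon, n in usage.items() if codon[2] in "GC")
    return gc3 / total if total else 0.0

# Toy sequence for illustration only; this is NOT the amiE gene.
toy_cds = "ATGGCCGACCTGAAGTAA"
usage = codon_usage(toy_cds)
print(len(usage), "distinct codons, GC3 fraction", round(gc3_fraction(usage), 2))
```

    Applied to a real coding sequence, `len(usage)` gives the number of different codons utilised and `gc3_fraction` the strength of the third-position bias.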

  18. Analysis Tool Web Services from the EMBL-EBI.

    PubMed

    McWilliam, Hamish; Li, Weizhong; Uludag, Mahmut; Squizzato, Silvano; Park, Young Mi; Buso, Nicola; Cowley, Andrew Peter; Lopez, Rodrigo

    2013-07-01

    Since 2004 the European Bioinformatics Institute (EMBL-EBI) has provided access to a wide range of databases and analysis tools via Web Services interfaces. This comprises services to search across the databases available from the EMBL-EBI and to explore the network of cross-references present in the data (e.g. EB-eye), services to retrieve entry data in various data formats and to access the data in specific fields (e.g. dbfetch), and analysis tool services, for example, sequence similarity search (e.g. FASTA and NCBI BLAST), multiple sequence alignment (e.g. Clustal Omega and MUSCLE), pairwise sequence alignment and protein functional analysis (e.g. InterProScan and Phobius). The REST/SOAP Web Services (http://www.ebi.ac.uk/Tools/webservices/) interfaces to these databases and tools allow their integration into other tools, applications, web sites, pipeline processes and analytical workflows. To get users started using the Web Services, sample clients are provided covering a range of programming languages and popular Web Service tool kits, and a brief guide to Web Services technologies, including a set of tutorials, is available for those wishing to learn more and develop their own clients. Users of the Web Services are informed of improvements and updates via a range of methods.

  19. Complete nucleotide sequence of the temperate bacteriophage LBR48, a new member of the family Myoviridae.

    PubMed

    Jang, Se Hwan; Yoon, Bo Hyun; Chang, Hyo Ihl

    2011-02-01

    The complete genomic sequence of LBR48, a temperate bacteriophage induced from a lysogenic strain of Lactobacillus brevis, was found to be 48,211 nucleotides long and to contain 90 putative open reading frames. Based on structural characteristics obtained from microscopic analysis and nucleic acid sequence determination, phage LBR48 can be classified as a member of the family Myoviridae. Analysis of the genome showed the conserved gene order of previously reported phages of the family Siphoviridae from lactic acid bacteria, despite low nucleotide sequence similarity. Analysis of the attachment sites revealed 15-nucleotide-long core sequences. PMID:20976608

  20. Nucleotide sequence of HS-beta satellite DNA from kangaroo rat Dipodomys ordii.

    PubMed

    Fry, K; Poon, R; Whitcome, P; Idriss, J; Salser, W; Mazrimas, J; Hatch, F

    1973-09-01

    The sequence of the highly repetitive satellite HS-beta DNA fraction from kangaroo rat Dipodomys ordii was determined independently by RNA and DNA sequencing techniques. A basic iterated sequence of 10 nucleotides with several mutational variations was found. Base-composition data are consistent with the proposed sequence and revealed a high content of 5-methylcytosine. DNA and RNA sequencing techniques used gave identical results, showing that the fidelity of synthesis of riboguanidine-substituted DNA under our conditions is adequate for nucleotide sequence studies.

  1. Complete nucleotide sequence of a 16S ribosomal RNA gene from Escherichia coli.

    PubMed Central

    Brosius, J; Palmer, M L; Kennedy, P J; Noller, H F

    1978-01-01

    The complete nucleotide sequence of the 16S RNA gene from the rrnB cistron of Escherichia coli has been determined by using three rapid DNA sequencing methods. Nearly all of the structure has been confirmed by two to six independent sequence determinations on both DNA strands. The length of the 16S rRNA chain inferred from the DNA sequence is 1541 nucleotides, in close agreement with previous estimates. We note discrepancies between this sequence and the most recent version of it reported from direct RNA sequencing [Ehresmann, C., Stiegler, P., Carbon, P. & Ebel, J.P. (1977) FEBS Lett. 84, 337-341]. A few of these may be explained by heterogeneity among 16S rRNA sequences from different cistrons. No nucleotide sequences were found in the 16S rRNA gene that cannot be reconciled with RNase digestion products of mature 16S rRNA. PMID:368799

  2. [Evolution of non-coding nucleotide sequences in Newcastle disease virus genomes].

    PubMed

    Xu, Huaiying; Qin, Zhuoming; Qi, Lihong; Zhang, Wei; Wang, Youling; Liu, Jinhua

    2014-09-01

    [OBJECTIVE] Although the coding genes of Newcastle disease virus (NDV) have been studied extensively, few papers address its non-coding sequences. In this paper, the evolutionary trend of the non-coding sequences was studied. [METHODS] NDV strain LC12, isolated from a duck with egg drop syndrome in 2012, and the genome cDNA sequences of 35 other strains of different NDV genotypes were sought and obtained from GenBank. Nucleotide homology, nucleotide alignment and phylogenetic tree analyses were applied to the leader sequences, trailer sequences, intergenic sequences (IGS), and the 5' and 3' UTR nucleotides flanking the coding genes, respectively. [RESULTS] The location and length of the non-coding sequences are highly conserved, and the variation trend of the non-coding sequences is synchronous with that of the entire genomes and coding genes. [CONCLUSION] Across the NDV genome, the molecular variation of the coding genes was indistinguishable from that of the non-coding sequences. PMID:25522596

  3. Diversity of preferred nucleotide sequences around the translation initiation codon in eukaryote genomes.

    PubMed

    Nakagawa, So; Niimura, Yoshihito; Gojobori, Takashi; Tanaka, Hiroshi; Miura, Kin-ichiro

    2008-02-01

    Understanding regulatory mechanisms of protein synthesis in eukaryotes is essential for the accurate annotation of genome sequences. Kozak reported that the nucleotide sequence GCCGCC(A/G)CCAUGG (AUG is the initiation codon) was frequently observed in vertebrate genes and that this 'consensus' sequence enhanced translation initiation. However, later studies using invertebrate, fungal and plant genes reported different 'consensus' sequences. In this study, we conducted extensive comparative analyses of nucleotide sequences around the initiation codon by using genomic data from 47 eukaryote species including animals, fungi, plants and protists. The analyses revealed that preferred nucleotide sequences are quite diverse among different species, but differences between patterns of nucleotide bias roughly reflect the evolutionary relationships of the species. We also found strong biases of A/G at position -3, A/C at position -2 and C at position +5 that were commonly observed in all species examined. Genes with higher expression levels showed stronger signals, suggesting that these nucleotides are responsible for the regulation of translation initiation. The diversity of preferred nucleotide sequences around the initiation codon might be explained by differences in relative contributions from two distinct patterns, GCCGCCAUG and AAAAAAAUG, which implies the presence of multiple molecular mechanisms for controlling translation initiation.
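
    The per-position biases described above (for example A/G at position -3) come from tabulating base frequencies across many start-codon contexts. The sketch below shows that tabulation under stated assumptions: all contexts have the same length, the index of the A of AUG is known, and positions use the usual convention in which the A of AUG is +1 and the base immediately upstream is -1. The function name and the four toy contexts are hypothetical and are not data from the study.

```python
from collections import Counter

def position_frequencies(contexts, start_index):
    """Base frequencies at each position relative to the start codon.

    contexts: equal-length RNA/DNA strings that all contain the start codon
    start_index: 0-based index of the A of AUG in every string
    """
    length = len(contexts[0])
    if any(len(seq) != length for seq in contexts):
        raise ValueError("all context sequences must have the same length")
    table = {}
    for i in range(length):
        offset = i - start_index
        # No position 0: downstream positions start at +1, upstream at -1.
        position = offset + 1 if offset >= 0 else offset
        counts = Counter(seq[i].upper().replace("T", "U") for seq in contexts)
        table[position] = {base: counts[base] / len(contexts) for base in "ACGU"}
    return table

# Hypothetical contexts: 6 nt upstream + AUG + 2 nt downstream.
toy_contexts = ["GCCACCAUGGC", "AAAAAAAUGGC", "GCCGCCAUGGA", "UCCAAAAUGUC"]
freqs = position_frequencies(toy_contexts, start_index=6)
print("position -3:", freqs[-3])
print("position +5:", freqs[5])
```

    Comparing such tables between species, or between highly and weakly expressed genes, is one simple way to look for the kind of signal differences the abstract reports.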

  4. Cloning and nucleotide sequence of the aroA gene of Bordetella pertussis.

    PubMed Central

    Maskell, D J; Morrissey, P; Dougan, G

    1988-01-01

    The aroA locus of Bordetella pertussis, encoding 5-enolpyruvylshikimate 3-phosphate synthase, has been cloned into Escherichia coli by using a cosmid vector. The gene is expressed in E. coli and complemented an E. coli aroA mutant. The nucleotide sequence of the B. pertussis aroA gene was determined and contains an open reading frame encoding 442 amino acids, with a calculated molecular weight for 5-enolpyruvylshikimate 3-phosphate synthase of 46,688. The amino acid sequence derived from the nucleotide sequence shows homology with the published amino acid sequences of aroA gene products of other microorganisms. PMID:2897356

  5. Evaluation of intra- and interspecific divergence of satellite DNA sequences by nucleotide frequency calculation and pairwise sequence comparison

    PubMed Central

    2003-01-01

    Satellite DNA sequences are known to be highly variable and to have been subjected to concerted evolution that homogenizes member sequences within species. We have analyzed the mode of evolution of satellite DNA sequences in four fishes from the genus Diplodus by calculating the nucleotide frequency of the sequence array and the phylogenetic distances between member sequences. Calculation of nucleotide frequency and pairwise sequence comparison enabled us to characterize the divergence among member sequences in this satellite DNA family. The results suggest that the evolutionary rate of satellite DNA in D. bellottii is about two-fold greater than the average of the other three fishes, and that the sequence homogenization event occurred in D. puntazzo more recently than in the others. The procedures described here are effective to characterize mode of evolution of satellite DNA. PMID:12734555

  6. Evaluation of intra- and interspecific divergence of satellite DNA sequences by nucleotide frequency calculation and pairwise sequence comparison.

    PubMed

    Kato, Mikio

    2003-01-01

    Satellite DNA sequences are known to be highly variable and to have been subjected to concerted evolution that homogenizes member sequences within species. We have analyzed the mode of evolution of satellite DNA sequences in four fishes from the genus Diplodus by calculating the nucleotide frequency of the sequence array and the phylogenetic distances between member sequences. Calculation of nucleotide frequency and pairwise sequence comparison enabled us to characterize the divergence among member sequences in this satellite DNA family. The results suggest that the evolutionary rate of satellite DNA in D. bellottii is about two-fold greater than the average of the other three fishes, and that the sequence homogenization event occurred in D. puntazzo more recently than in the others. The procedures described here are effective to characterize mode of evolution of satellite DNA. PMID:12734555
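
    The two calculations named in this abstract, nucleotide frequency across a sequence array and pairwise comparison of member sequences, can be sketched as follows. This is an illustrative outline only, assuming pre-aligned repeat units of equal length and using the uncorrected proportion of differing sites (p-distance) as the pairwise measure; the paper's own distance measure is not stated in the abstract, and the toy repeats are invented.

```python
from collections import Counter
from itertools import combinations

def column_frequencies(alignment):
    """Base frequencies at each column of an alignment of equal-length sequences."""
    n = len(alignment)
    return [
        {base: count / n for base, count in Counter(column).items()}
        for column in zip(*alignment)
    ]

def p_distance(a, b):
    """Proportion of differing sites, skipping columns where either sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)

# Invented repeat units for illustration; not Diplodus data.
repeats = ["ACGTACGT", "ACGTACGA", "ACCTACGT"]
freqs = column_frequencies(repeats)
dists = {
    (i, j): p_distance(repeats[i], repeats[j])
    for i, j in combinations(range(len(repeats)), 2)
}
print(freqs[2])
print(dists)
```

    Low within-species distances combined with fixed differences between species would be the pattern expected under concerted evolution.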

  7. Complete nucleotide sequences of two adjacent early vaccinia virus genes located within the inverted terminal repetition.

    PubMed

    Venkatesan, S; Gershowitz, A; Moss, B

    1982-11-01

    The proximal part of the 10,000-base pair (bp) inverted terminal repetition of vaccinia virus DNA encodes at least three early mRNAs. A 2,236-bp segment of the repetition was sequenced to characterize two of the genes. This task was facilitated by constructing a series of recombinants containing overlapping deletions; oligonucleotide linkers with synthetic restriction sites provided points for radioactive labeling before sequencing by the chemical degradation method of Maxam and Gilbert (Methods Enzymol. 65:499-560, 1980). The ends of the transcripts were mapped by hybridizing labeled DNA fragments to early viral RNA and resolving nuclease S1-protected fragments in sequencing gels, by sequencing cDNA clones, and from the lengths of the RNAs. The nucleotide sequences for at least 60 bp upstream of both transcriptional initiation sites are more than 80% adenine·thymine rich and contain long runs of adenines and thymines with some homology to procaryotic and eucaryotic consensus sequences. The gene transcribed in the rightward direction encodes an RNA of approximately 530 nucleotides with a single open reading frame of 420 nucleotides. Preceding the first AUG, there is a heptanucleotide that can hybridize to the 3' end of 18S rRNA with only one mismatch. The derived amino acid sequence of the protein indicated a molecular weight of 15,500. The gene transcribed in the leftward direction encodes an RNA 1,000 to 1,100 nucleotides long with an open reading frame of 996 nucleotides and a leader sequence of only 5 to 6 nucleotides. The derived amino acid sequence of this protein indicated a molecular weight of 38,500. The 3' ends of the two transcripts were located within 100 bp of each other. Although there are adenine·thymine-rich clusters near the putative transcriptional termination sites, specific AATAAA polyadenylic acid signal sequences are absent.

  8. Nucleotide sequence of 3' untranslated portion of human alpha globin mRNA.

    PubMed Central

    Wilson, J T; deRiel, J K; Forget, B G; Marotta, C A; Weissman, S M

    1977-01-01

    We have determined the nucleotide sequence of 75 nucleotides of the 3'-untranslated portion of normal human alpha globin mRNA which corresponds to the elongated amino acid sequence of the chain termination mutant Hb Constant Spring. This was accomplished by sequence analysis of cDNA fragments obtained by restriction endonuclease or T4 endonuclease IV cleavage of human globin cDNA synthesized from globin mRNA by use of viral reverse transcriptase. Analysis of cRNA synthesized from cDNA by use of RNA polymerase provided additional confirmatory sequence information. Possible polymorphism has been identified at one site of the sequence. Our sequence overlaps with, and extends the sequence of 43 nucleotides determined by Proudfoot and coworkers for the very 3'-terminal portion of human alpha globin mRNA. The complete 3'-untranslated sequence of human alpha globin mRNA (112 nucleotides including termination codon) shows little homology to that of the human or rabbit beta globin mRNAs except for the presence of the hexanucleotide sequence AAUAAA which is found in most eukaryotic mRNAs near the 3'-terminal poly(A). PMID:909779

  9. The nucleotide sequences of 5S rRNAs from three ciliated protozoa.

    PubMed Central

    Kumazaki, T; Hori, H; Osawa, S; Mita, T; Higashinakagawa, T

    1982-01-01

    The nucleotide sequences of 5S rRNAs from three ciliated protozoa, Paramecium tetraurelia, Tetrahymena thermophila and Blepharisma japonicum have been determined. All of them are 120 nucleotides long and the sequence of the probable tRNA binding site at positions 41-44 is GAAC, which is characteristic of the plant 5S rRNAs. The percent sequence similarities are 87% (Paramecium/Tetrahymena), 86% (Paramecium/Blepharisma) and 79% (Tetrahymena/Blepharisma), suggesting a close relationship of these three ciliates. PMID:7122243

  10. The nucleotide sequence of 5S rRNA from a cellular slime mold Dictyostelium discoideum.

    PubMed

    Hori, H; Osawa, S; Iwabuchi, M

    1980-12-11

    The nucleotide sequence of ribosomal 5S rRNA from a cellular slime mold Dictyostelium discoideum is GUAUACGGCCAUACUAGGUUGGAAACACAUCAUCCCGUUCGAUCUGAUA AGUAAAUCGACCUCAGGCCUUCCAAGUACUCUGGUUGGAGACAACAGGGGAACAUAGGGUGCUGUAUACU. A model for the secondary structure of this 5S rRNA is proposed. The sequence is more similar to those of animals (62% similarity on average) than to those of yeasts (56%).

  11. Nucleotide sequences of 5S rRNAs from four jellyfishes.

    PubMed

    Hori, H; Ohama, T; Kumazaki, T; Osawa, S

    1982-11-25

    The nucleotide sequences of 5S rRNAs from four jellyfishes, Spirocodon saltatrix, Nemopsis dofleini, Aurelia aurita and Chrysaora quinquecirrha have been determined. The sequences are highly similar to each other. A fairly high similarity was also found between these jellyfishes and a sea anemone, Anthopleura japonica.

  12. Nucleotide sequences of 5S rRNAs from four jellyfishes.

    PubMed

    Hori, H; Ohama, T; Kumazaki, T; Osawa, S

    1982-11-25

    The nucleotide sequences of 5S rRNAs from four jellyfishes, Spirocodon saltatrix, Nemopsis dofleini, Aurelia aurita and Chrysaora quinquecirrha have been determined. The sequences are highly similar to each other. A fairly high similarity was also found between these jellyfishes and a sea anemone, Anthopleura japonica. PMID:6130512

  13. Diverse nucleotide compositions and sequence fluctuation in Rubisco protein genes

    NASA Astrophysics Data System (ADS)

    Holden, Todd; Dehipawala, S.; Cheung, E.; Bienaime, R.; Ye, J.; Tremberger, G., Jr.; Schneider, P.; Lieberman, D.; Cheung, T.

    2011-10-01

    The Rubisco protein-enzyme is arguably the most abundant protein on Earth. The biology dogma of transcription and translation necessitates the study of the Rubisco genes and Rubisco-like genes in various species. Stronger correlation of fractal dimension of the atomic number fluctuation along a DNA sequence with Shannon entropy has been observed in the studied Rubisco-like gene sequences, suggesting a more diverse evolutionary pressure and constraints in the Rubisco sequences. The strategy of using metal for structural stabilization appears to be an ancient mechanism, with data from the porphobilinogen deaminase gene in Capsaspora owczarzaki and Monosiga brevicollis. Using the chi-square distance probability, our analysis supports the conjecture that the more ancient Rubisco-like sequence in Microcystis aeruginosa would have experienced very different evolutionary pressure and bio-chemical constraint as compared to Bordetella bronchiseptica, the two microbes occupying either end of the correlation graph. Our exploratory study would indicate that high fractal dimension Rubisco sequence would support high carbon dioxide rate via the Michaelis-Menten coefficient; with implication for the control of the whooping cough pathogen Bordetella bronchiseptica, a microbe containing a high fractal dimension Rubisco-like sequence (2.07). Using the internal comparison of chi-square distance probability for 16S rRNA (~ E-22) versus radiation repair Rec-A gene (~ E-05) in high GC content Deinococcus radiodurans, our analysis supports the conjecture that high GC content microbes containing Rubisco-like sequence are likely to include an extra-terrestrial origin, relative to Deinococcus radiodurans. Similar photosynthesis process that could utilize host star radiation would not compete with radiation resistant process from the biology dogma perspective in environments such as Mars and exoplanets.

  14. Whole-genome sequence of Clostridium lituseburense L74, isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus.

    PubMed

    Lee, Yookyung; Lim, Sooyeon; Rhee, Moon-Soo; Chang, Dong-Ho; Kim, Byoung-Chan

    2016-03-01

    Clostridium lituseburense L74 was isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus, collected in Yeong-dong, Chuncheongbuk-do, South Korea, subjected to whole-genome sequencing on the HiSeq platform, and annotated with RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession NZ_LITJ00000000.

  15. TranslatorX: multiple alignment of nucleotide sequences guided by amino acid translations.

    PubMed

    Abascal, Federico; Zardoya, Rafael; Telford, Maximilian J

    2010-07-01

    We present TranslatorX, a web server designed to align protein-coding nucleotide sequences based on their corresponding amino acid translations. Many comparisons between biological sequences (nucleic acids and proteins) involve the construction of multiple alignments. Alignments represent a statement regarding the homology between individual nucleotides or amino acids within homologous genes. As protein-coding DNA sequences evolve as triplets of nucleotides (codons) and it is known that sequence similarity degrades more rapidly at the DNA than at the amino acid level, alignments are generally more accurate when based on amino acids than on their corresponding nucleotides. TranslatorX novelties include: (i) use of all documented genetic codes and the possibility of assigning different genetic codes for each sequence; (ii) a battery of different multiple alignment programs; (iii) translation of ambiguous codons when possible; (iv) an innovative criterion to clean nucleotide alignments with GBlocks based on protein information; and (v) a rich output, including Jalview-powered graphical visualization of the alignments, codon-based alignments coloured according to the corresponding amino acids, measures of compositional bias and first, second and third codon position specific alignments. The TranslatorX server is freely available at http://translatorx.co.uk.
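
    The core idea described above, aligning at the amino acid level and then mapping the result back onto the nucleotides, can be illustrated with a small back-translation step. This is a sketch under stated assumptions and not the TranslatorX implementation: the protein alignment is assumed to have been produced already by some external aligner, each coding sequence is assumed to be in frame with one codon per residue, and the function name and toy data are hypothetical.

```python
def codon_align(aligned_protein: str, cds: str) -> str:
    """Thread an unaligned coding sequence onto its gapped protein alignment row.

    Each residue consumes one codon; each protein gap becomes a three-base gap.
    """
    residues = aligned_protein.replace("-", "")
    if len(cds) < 3 * len(residues):
        raise ValueError("coding sequence is too short for the protein")
    out, pos = [], 0
    for aa in aligned_protein:
        if aa == "-":
            out.append("---")
        else:
            out.append(cds[pos:pos + 3])
            pos += 3
    return "".join(out)

# Toy input: a two-sequence protein alignment and the matching coding sequences.
protein_rows = {"seqA": "MK-L", "seqB": "MKAL"}
coding = {"seqA": "ATGAAACTG", "seqB": "ATGAAGGCTCTT"}
aligned = {name: codon_align(protein_rows[name], coding[name]) for name in coding}
for name, row in aligned.items():
    print(name, row)
```

    Because gaps are always inserted in multiples of three, the reading frame is preserved in the resulting nucleotide alignment.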

  16. The EMBL-EBI bioinformatics web and programmatic tools framework.

    PubMed

    Li, Weizhong; Cowley, Andrew; Uludag, Mahmut; Gur, Tamer; McWilliam, Hamish; Squizzato, Silvano; Park, Young Mi; Buso, Nicola; Lopez, Rodrigo

    2015-07-01

    Since 2009 the EMBL-EBI Job Dispatcher framework has provided free access to a range of mainstream sequence analysis applications. These include sequence similarity search services (https://www.ebi.ac.uk/Tools/sss/) such as BLAST, FASTA and PSI-Search, multiple sequence alignment tools (https://www.ebi.ac.uk/Tools/msa/) such as Clustal Omega, MAFFT and T-Coffee, and other sequence analysis tools (https://www.ebi.ac.uk/Tools/pfa/) such as InterProScan. Through these services users can search mainstream sequence databases such as ENA, UniProt and Ensembl Genomes, utilising a uniform web interface or systematically through Web Services interfaces (https://www.ebi.ac.uk/Tools/webservices/) using common programming languages, and obtain enriched results with novel visualisations. Integration with EBI Search (https://www.ebi.ac.uk/ebisearch/) and the dbfetch retrieval service (https://www.ebi.ac.uk/Tools/dbfetch/) further expands the usefulness of the framework. New tools and updates such as NCBI BLAST+, InterProScan 5 and PfamScan, new categories such as RNA analysis tools (https://www.ebi.ac.uk/Tools/rna/), new databases such as ENA non-coding, WormBase ParaSite, Pfam and Rfam, and new workflow methods, together with the retirement of deprecated services, ensure that the framework remains relevant to today's biological community. PMID:25845596
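
    As a rough illustration of programmatic access to the dbfetch retrieval service mentioned above, the sketch below builds a retrieval URL and wraps the HTTP call in a function. Only the base URL comes from the abstract; the `dbfetch` path segment, the query parameter names (`db`, `id`, `format`, `style`), the database name `ena_sequence` and the placeholder identifier are assumptions that should be checked against the current dbfetch documentation before use.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

DBFETCH_BASE = "https://www.ebi.ac.uk/Tools/dbfetch/"  # URL quoted in the abstract

def dbfetch_url(db: str, entry_id: str, fmt: str = "fasta") -> str:
    """Build a retrieval URL. The parameter names are assumed, not confirmed by the abstract."""
    query = urlencode({"db": db, "id": entry_id, "format": fmt, "style": "raw"})
    return f"{DBFETCH_BASE}dbfetch?{query}"

def fetch_entry(db: str, entry_id: str, fmt: str = "fasta", timeout: float = 30.0) -> str:
    """Download one entry as text (performs a network request when called)."""
    with urlopen(dbfetch_url(db, entry_id, fmt), timeout=timeout) as response:
        return response.read().decode("utf-8")

# "ena_sequence" and "HYPOTHETICAL_ID" are placeholders for illustration only.
url = dbfetch_url("ena_sequence", "HYPOTHETICAL_ID")
print(url)
```

    Keeping URL construction separate from the network call makes the request easy to inspect and test before any data are actually fetched.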

  17. The EMBL-EBI bioinformatics web and programmatic tools framework.

    PubMed

    Li, Weizhong; Cowley, Andrew; Uludag, Mahmut; Gur, Tamer; McWilliam, Hamish; Squizzato, Silvano; Park, Young Mi; Buso, Nicola; Lopez, Rodrigo

    2015-07-01

    Since 2009 the EMBL-EBI Job Dispatcher framework has provided free access to a range of mainstream sequence analysis applications. These include sequence similarity search services (https://www.ebi.ac.uk/Tools/sss/) such as BLAST, FASTA and PSI-Search, multiple sequence alignment tools (https://www.ebi.ac.uk/Tools/msa/) such as Clustal Omega, MAFFT and T-Coffee, and other sequence analysis tools (https://www.ebi.ac.uk/Tools/pfa/) such as InterProScan. Through these services users can search mainstream sequence databases such as ENA, UniProt and Ensembl Genomes, utilising a uniform web interface or systematically through Web Services interfaces (https://www.ebi.ac.uk/Tools/webservices/) using common programming languages, and obtain enriched results with novel visualisations. Integration with EBI Search (https://www.ebi.ac.uk/ebisearch/) and the dbfetch retrieval service (https://www.ebi.ac.uk/Tools/dbfetch/) further expands the usefulness of the framework. New tools and updates such as NCBI BLAST+, InterProScan 5 and PfamScan, new categories such as RNA analysis tools (https://www.ebi.ac.uk/Tools/rna/), new databases such as ENA non-coding, WormBase ParaSite, Pfam and Rfam, and new workflow methods, together with the retirement of deprecated services, ensure that the framework remains relevant to today's biological community.

  18. The EMBL-EBI bioinformatics web and programmatic tools framework

    PubMed Central

    Li, Weizhong; Cowley, Andrew; Uludag, Mahmut; Gur, Tamer; McWilliam, Hamish; Squizzato, Silvano; Park, Young Mi; Buso, Nicola; Lopez, Rodrigo

    2015-01-01

    Since 2009 the EMBL-EBI Job Dispatcher framework has provided free access to a range of mainstream sequence analysis applications. These include sequence similarity search services (https://www.ebi.ac.uk/Tools/sss/) such as BLAST, FASTA and PSI-Search, multiple sequence alignment tools (https://www.ebi.ac.uk/Tools/msa/) such as Clustal Omega, MAFFT and T-Coffee, and other sequence analysis tools (https://www.ebi.ac.uk/Tools/pfa/) such as InterProScan. Through these services users can search mainstream sequence databases such as ENA, UniProt and Ensembl Genomes, utilising a uniform web interface or systematically through Web Services interfaces (https://www.ebi.ac.uk/Tools/webservices/) using common programming languages, and obtain enriched results with novel visualisations. Integration with EBI Search (https://www.ebi.ac.uk/ebisearch/) and the dbfetch retrieval service (https://www.ebi.ac.uk/Tools/dbfetch/) further expands the usefulness of the framework. New tools and updates such as NCBI BLAST+, InterProScan 5 and PfamScan, new categories such as RNA analysis tools (https://www.ebi.ac.uk/Tools/rna/), new databases such as ENA non-coding, WormBase ParaSite, Pfam and Rfam, and new workflow methods, together with the retirement of deprecated services, ensure that the framework remains relevant to today's biological community. PMID:25845596

  19. Methods for making nucleotide probes for sequencing and synthesis

    SciTech Connect

    Church, George M; Zhang, Kun; Chou, Joseph

    2014-07-08

    Compositions and methods for making a plurality of probes for analyzing a plurality of nucleic acid samples are provided. Compositions and methods for analyzing a plurality of nucleic acid samples to obtain sequence information in each nucleic acid sample are also provided.

  20. The nucleotide sequence of Saccharomyces cerevisiae chromosome XII.

    PubMed

    Johnston, M; Hillier, L; Riles, L; Albermann, K; André, B; Ansorge, W; Benes, V; Brückner, M; Delius, H; Dubois, E; Düsterhöft, A; Entian, K D; Floeth, M; Goffeau, A; Hebling, U; Heumann, K; Heuss-Neitzel, D; Hilbert, H; Hilger, F; Kleine, K; Kötter, P; Louis, E J; Messenguy, F; Mewes, H W; Hoheisel, J D

    1997-05-29

    The yeast Saccharomyces cerevisiae is the pre-eminent organism for the study of basic functions of eukaryotic cells. All of the genes of this simple eukaryotic cell have recently been revealed by an international collaborative effort to determine the complete DNA sequence of its nuclear genome. Here we describe some of the features of chromosome XII.

  1. Finding similar nucleotide sequences using network BLAST searches.

    PubMed

    Ladunga, Istvan

    2009-06-01

    The Basic Local Alignment Search Tool (BLAST) is a keystone of bioinformatics due to its performance and user-friendliness. Beginner and intermediate users will learn how to design and submit blastn and Megablast searches on the Web pages at the National Center for Biotechnology Information. We map nucleic acid sequences to genomes, find identical or similar mRNA, expressed sequence tag, and noncoding RNA sequences, and run Megablast searches, which are much faster than blastn. Understanding results is assisted by taxonomy reports, genomic views, and multiple alignments. We interpret expected frequency thresholds, biological significance, and statistical significance. Weak hits provide no evidence, but hints for further analyses. We find genes that may code for homologous proteins by translated BLAST. We reduce false positives by filtering out low-complexity regions. Parsed BLAST results can be integrated into analysis pipelines. Links in the output connect to Entrez, PUBMED, structural, sequence, interaction, and expression databases. This facilitates integration with a wide spectrum of biological knowledge. PMID:19496060
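
    The statistical significance mentioned above is usually summarised by the expect value (E-value), which under Karlin-Altschul statistics is E = K·m·n·exp(-λS) for raw score S, query length m and database length n; the bit score S' = (λS - ln K)/ln 2 gives the equivalent form E = m·n·2^(-S'). The sketch below evaluates both. The λ and K values and the lengths are placeholders chosen for illustration, not parameters taken from any BLAST run or from this chapter.

```python
import math

def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Normalised bit score S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def expect_value(raw_score: float, lam: float, k: float,
                 query_len: int, db_len: int) -> float:
    """Expected number of chance hits scoring at least raw_score: E = K*m*n*exp(-lambda*S)."""
    return k * query_len * db_len * math.exp(-lam * raw_score)

# Placeholder statistical parameters and search-space sizes (assumptions).
LAM, K = 1.37, 0.71
QUERY_LEN, DB_LEN = 500, 1_000_000

bits = bit_score(50, LAM, K)
e_value = expect_value(50, LAM, K, QUERY_LEN, DB_LEN)
print(f"bit score {bits:.1f}, E-value {e_value:.3g}")
```

    Because E scales with the size of the search space, the same alignment score is less significant against a larger database, which is one reason thresholds need to be interpreted in context.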

  2. Nucleotide sequence of a human tRNA gene heterocluster

    SciTech Connect

    Chang, Y.N.; Pirtle, I.L.; Pirtle, R.M.

    1986-05-01

    Leucine tRNA from bovine liver was used as a hybridization probe to screen a human gene library harbored in Charon-4A of bacteriophage lambda. The human DNA inserts from plaque-pure clones were characterized by restriction endonuclease mapping and Southern hybridization techniques, using both (3'-³²P)-labeled bovine liver leucine tRNA and total tRNA as hybridization probes. An 8-kb Hind III fragment of one of these γ-clones was subcloned into the Hind III site of pBR322. Subsequent fine restriction mapping and DNA sequence analysis of this plasmid DNA indicated the presence of four tRNA genes within the 8-kb DNA fragment. A leucine tRNA gene with an anticodon of AAG and a proline tRNA gene with an anticodon of AGG are in a 1.6-kb subfragment. A threonine tRNA gene with an anticodon of UGU and an as yet unidentified tRNA gene are located in a 1.1-kb subfragment. These two different subfragments are separated by 2.8 kb. The coding regions of the three sequenced genes contain characteristic internal split promoter sequences and do not have intervening sequences. The 3'-flanking regions of these three genes have typical RNA polymerase III termination sites of at least four consecutive T residues.

  3. Nucleotide sequence conservation in paramyxoviruses; the concept of codon constellation.

    PubMed

    Rima, Bert K

    2015-05-01

    The stability and conservation of the sequences of RNA viruses in the field and the high error rates measured in vitro are paradoxical. The field stability indicates that there are very strong selective constraints on sequence diversity. The nature of these constraints is discussed. Apart from constraints on variation in cis-acting RNA and the amino acid sequences of viral proteins, there are other ones relating to the presence of specific dinucleotides such as CpG and UpA as well as the importance of RNA secondary structures and RNA degradation rates. Other constraints recently identified in other RNA viruses, such as effects of secondary RNA structure on protein folding or modification of cellular tRNA complements, are also discussed. Using the family Paramyxoviridae, I show that the codon usage pattern (CUP) is (i) specific for each virus species and (ii) that it is markedly different from the host - it does not vary even in vaccine viruses that have been derived by passage in a number of inappropriate host cells. The CUP might thus be an additional constraint on variation, and I propose the concept of codon constellation to indicate the informational content of the sequences of RNA molecules relating not only to stability and structure but also to the efficiency of translation of a viral mRNA resulting from the CUP and the numbers and position of rare codons.

  4. Water buffalo (Bubalus bubalis): complete nucleotide mitochondrial genome sequence.

    PubMed

    Parma, Pietro; Erra-Pujada, Marta; Feligini, Maria; Greppi, Gianfranco; Enne, Giuseppe

    2004-01-01

    In this work, we report the whole sequence of the water buffalo (Bubalus bubalis) mitochondrial genome. The water buffalo mt molecule is 16,355 base pairs in length and shows a genome organization similar to those reported for other mitochondrial genomes. These new data provide a useful tool for many research areas, e.g. evolutionary studies and identification of food origin.

  5. Statistical analysis of nucleotide runs in coding and noncoding DNA sequences.

    PubMed

    Sprizhitsky YuA; Nechipurenko YuD; Alexandrov, A A; Volkenstein, M V

    1988-10-01

    A statistical analysis of the occurrence of particular nucleotide runs in DNA sequences of different species has been carried out. There are considerable differences of run distributions in DNA sequences of procaryotes, invertebrates and vertebrates. There is an abundance of short runs (1-2 nucleotides long) in the coding sequences and there is a deficiency of such runs in the noncoding regions. However, some interesting exceptions from this rule exist for the run distribution of adenine in procaryotes and for the arrangement of purine-pyrimidine runs in eucaryotes. The similarity in the distributions of such runs in the coding and noncoding regions may be due to some structural features of the DNA molecule as a whole. Runs of guanine (or cytosine) of three to six nucleotides occur predominantly in noncoding DNA regions in eucaryotes, especially in vertebrates.
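
    A run-length tally of the kind described can be sketched as follows. This is only an illustration, not the authors' procedure: the function names and the example sequence are made up, and purine-pyrimidine runs are obtained by first recoding A/G as R and C/T as Y.

```python
from collections import Counter
from itertools import groupby


def run_length_counts(seq, symbol):
    """Count maximal runs of `symbol` in `seq`, keyed by run length."""
    counts = Counter()
    for base, group in groupby(seq.upper()):
        if base == symbol:
            counts[sum(1 for _ in group)] += 1
    return counts


def to_purine_pyrimidine(seq):
    """Recode a DNA sequence as purines (R) and pyrimidines (Y)."""
    return seq.upper().translate(str.maketrans("AGCT", "RRYY"))


example = "AAGGGCTTAGGGGGGA"  # made-up sequence, for illustration only
g_runs = run_length_counts(example, "G")
a_runs = run_length_counts(example, "A")
purine_runs = run_length_counts(to_purine_pyrimidine(example), "R")
print(dict(g_runs), dict(a_runs), dict(purine_runs))
```

    Comparing such tallies between annotated coding and noncoding regions, normalised by region length, would be one way to look for the excess or deficiency of short runs that the abstract reports.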

  6. Nucleotide sequence of equine caspase-1 cDNA.

    PubMed

    Wardlow, S; Penha-Goncalves, M N; Argyle, D J; Onions, D E; Nicolson, L

    1999-01-01

    Caspases are a family of cysteine proteases which have important roles in activation of cytokines and in apoptosis. Caspase-1, or interleukin-1 beta converting enzyme (ICE), promotes maturation of interleukin-1 beta (IL-1 beta) and interleukin-18 (IL-18) by proteolytic cleavage of precursor forms to generate biologically active peptides. We report the cloning and sequencing of equine caspase-1 cDNA. Equine caspase-1 is 405 amino acids in length and has 72% and 63% identity to human and mouse caspase-1, respectively, at the amino acid level. Sites of proteolytic cleavage and catalytic activity as identified in human caspase-1, are conserved. PMID:10376217

  7. The nucleotide sequence of spinach chloroplast tryptophan transfer RNA.

    PubMed Central

    Canaday, J; Guillemaut, P; Gloeckler, R; Weil, J H

    1981-01-01

    Spinach chloroplast tRNATrp, purified by column chromatography and two-dimensional gel electrophoresis, has been sequenced using in vitro labeling techniques. The sequence is: pG-C-G-C-U-C-U-U-A-G-U-U-C-A-G-U-U-C-Gm-G-D-A-G-A-A-C-m2G-psi-G-G-G-psi-C-U-C-A-A*-A-A-C-C-C-G-A-U-G-N-C-G-U-A-G-G-T-psi-C-A-A-G-U-C-C-U-A-C-A-G-A-G-C-G-U-G-C-C-AOH. Like the E. coli suppressor tRNA psu+UGA which translates both the opal terminator codon U-G-A and the tryptophan codon U-G-G, spinach chloroplast tRNATrp has C-C-A as an anticodon and contains an A-U pair in the D-stem. PMID:6907845

  8. Complete nucleotide sequence and transcriptional analysis of snakehead fish retrovirus.

    PubMed Central

    Hart, D; Frerichs, G N; Rambaut, A; Onions, D E

    1996-01-01

    The complete genome of the snakehead fish retrovirus has been cloned and sequenced, and its transcriptional profile in cell culture has been determined. The 11.2-kb provirus displays a complex expression pattern capable of encoding accessory proteins and is unique in the predicted location of the env initiation codon and signal peptide upstream of gag and the common splice donor site. The virus is distinguishable from all known retrovirus groups by the presence of an arginine tRNA primer binding site. The coding regions are highly divergent and show a number of unusual characteristics, including a large Gag coiled-coil region, a Pol domain of unknown function, and a long, lentiviral-like, Env cytoplasmic domain. Phylogenetic analysis of the Pol sequence emphasizes the divergent nature of the virus from the avian and mammalian retroviruses. The snakehead virus is also distinct from a previously characterized complex fish retrovirus, suggesting that discrete groups of these viruses have yet to be identified in the lower vertebrates. PMID:8648695

  9. Complete nucleotide sequence and transcriptional analysis of snakehead fish retrovirus.

    PubMed

    Hart, D; Frerichs, G N; Rambaut, A; Onions, D E

    1996-06-01

    The complete genome of the snakehead fish retrovirus has been cloned and sequenced, and its transcriptional profile in cell culture has been determined. The 11.2-kb provirus displays a complex expression pattern capable of encoding accessory proteins and is unique in the predicted location of the env initiation codon and signal peptide upstream of gag and the common splice donor site. The virus is distinguishable from all known retrovirus groups by the presence of an arginine tRNA primer binding site. The coding regions are highly divergent and show a number of unusual characteristics, including a large Gag coiled-coil region, a Pol domain of unknown function, and a long, lentiviral-like, Env cytoplasmic domain. Phylogenetic analysis of the Pol sequence emphasizes the divergent nature of the virus from the avian and mammalian retroviruses. The snakehead virus is also distinct from a previously characterized complex fish retrovirus, suggesting that discrete groups of these viruses have yet to be identified in the lower vertebrates.

  10. Nucleotide sequences of five IncF plasmid finP alleles.

    PubMed Central

    Finlay, B B; Frost, L S; Paranchych, W; Willetts, N S

    1986-01-01

    The nucleotide sequences of five finP alleles from various IncF plasmids (finP types I to V) as well as of three finP mutations were determined and compared. The finP gene specificity could be attributed to a variable, six-to-seven-nucleotide loop located between inverted repeats, and the sequence data were consistent with the product of finP being an RNA molecule rather than a protein. The finP mutations interrupted a proposed finP promoter or destabilized a predicted stem-and-loop structure in the finP RNA molecule. PMID:2426248

  11. Nucleotide sequence of a small cryptic plasmid from Acidithiobacillus ferrooxidans strain A-6

    SciTech Connect

    F. Roberto

    2003-10-01

    A 2.1 kb cryptic plasmid from Acidithiobacillus ferrooxidans strain A-6 was isolated and cloned into the E. coli vector plasmid, pUC128. The cloned plasmid was mapped by restriction enzyme fragment analysis and subsequently sequenced. At this time over half the plasmid sequence has been determined and compared to sequences in the GenBank nucleotide and protein sequence databases. Much of the plasmid remains cryptic, but substantial nucleotide and protein sequence similarities have been observed to the putative replication protein, RepA, of the small cryptic plasmids pAYS and pAYL found in the ammonia-oxidizing Nitrosomonas sp. Strain ENI-11. These results suggest an entirely new class of plasmid is maintained in at least one strain of Acidithiobacillus ferrooxidans and other acidophilic bacteria, and raise interesting questions about the origin of this plasmid in acidic environments.

  12. The complete nucleotide sequence and genomic characterization of tropical soda apple mosaic virus.

    PubMed

    Fillmer, Kornelia; Adkins, Scott; Pongam, Patchara; D'Elia, Tom

    2016-08-01

    We report the first complete genome sequence of tropical soda apple mosaic virus (TSAMV), a tobamovirus originally isolated from tropical soda apple (Solanum viarum) collected in Okeechobee, Florida. The complete genome of TSAMV is 6,350 nucleotides long and contains four open reading frames encoding the following proteins: i) 126-kDa methyltransferase/helicase (3354 nt), ii) 183-kDa polymerase (4839 nt), iii) movement protein (771 nt) and iv) coat protein (483 nt). The complete genome sequence of TSAMV shares 80.4 % nucleotide sequence identity with pepper mild mottle virus (PMMoV) and 71.2-74.2 % identity with other tobamoviruses naturally infecting members of the Solanaceae plant family. Phylogenetic analysis of the deduced amino acid sequences of the 126-kDa and 183-kDa proteins and the complete genome sequence place TSAMV in a subcluster with PMMoV within the Solanaceae-infecting subgroup of tobamoviruses.

  13. Relationships amongst bluetongue viruses revealed by comparisons of capsid and outer coat protein nucleotide sequences.

    PubMed

    Gould, A R; Pritchard, L I

    1990-08-01

    Sequence data from the gene segments coding for the capsid protein, VP3, of all eight Australian bluetongue virus serotypes were compared. The high degree of nucleotide sequence homology for VP3 genes amongst BTV isolates from the same geographic region supported previous studies (Gould, 1987; 1988b, c; Gould et al., 1988b) and was proposed as a basis for "topotyping" a bluetongue virus isolate (Gould et al., 1989). The complete nucleotide sequences which coded for the VP2 outer coat proteins of South African BTV serotypes 1 and 3 (vaccine strains) were determined and compared to cognate gene sequences from North American and Australian BTVs. These VP2 comparisons demonstrated that BTVs of the same serotype, but from different geographical regions, were closely related at the nucleotide and amino acid levels. However, close inter-relationships were also demonstrated amongst other BTVs irrespective of serotype or geographic origin. These data enabled phylogenic relationships of the BTV serotypes to be analysed using VP2 nucleotide sequences as a determinant.

  14. Nucleotide composition of CO1 sequences in Chelicerata (Arthropoda): detecting new mitogenomic rearrangements.

    PubMed

    Arabi, Juliette; Judson, Mark L I; Deharveng, Louis; Lourenço, Wilson R; Cruaud, Corinne; Hassanin, Alexandre

    2012-02-01

    Here we study the evolution of nucleotide composition in third codon-positions of CO1 sequences of Chelicerata, using a phylogenetic framework, based on 180 taxa and three markers (CO1, 18S, and 28S rRNA; 5,218 nt). The analyses of nucleotide composition were also extended to all CO1 sequences of Chelicerata found in GenBank (1,701 taxa). The results show that most species of Chelicerata have a positive strand bias in CO1, i.e., in favor of C nucleotides, including all Amblypygi, Palpigradi, Ricinulei, Solifugae, Uropygi, and Xiphosura. However, several taxa show a negative strand bias, i.e., in favor of G nucleotides: all Scorpiones, Opisthothelae spiders and several taxa within Acari, Opiliones, Pseudoscorpiones, and Pycnogonida. Several reversals of strand-specific bias can be attributed to either a rearrangement of the control region or an inversion of a fragment containing the CO1 gene. Key taxa for which sequencing of complete mitochondrial genomes will be necessary to determine the origin and nature of mtDNA rearrangements involved in the reversals are identified. Acari, Opiliones, Pseudoscorpiones, and Pycnogonida were found to show a strong variability in nucleotide composition. In addition, both mitochondrial and nuclear genomes have been affected by higher substitution rates in Acari and Pseudoscorpiones. The results therefore indicate that these two orders are more liable to fix mutations of all types, including base substitutions, indels, and genomic rearrangements.
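
    The strand bias at third codon positions can be summarised with a simple skew statistic. The sketch below assumes the measure (C - G) / (C + G), so that positive values favor C and negative values favor G as in the abstract; whether the authors used exactly this formula is an assumption, and the function name and test sequence are hypothetical.

```python
def third_position_cg_skew(cds):
    """Return (C - G) / (C + G) over third codon positions of an in-frame coding sequence.

    Assumes `cds` starts at codon position 1; returns 0.0 when no C or G is present.
    """
    third = cds.upper()[2::3]  # every third base, starting at codon position 3
    c = third.count("C")
    g = third.count("G")
    if c + g == 0:
        return 0.0
    return (c - g) / (c + g)


# Made-up fragment: codons ATC GGC TTC AAG have third positions C, C, C, G.
skew = third_position_cg_skew("ATCGGCTTCAAG")
print(skew)
```

    A reversal of sign between related taxa would be the kind of signal that the abstract attributes to a rearranged control region or an inverted CO1-containing fragment.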

  15. Single nucleotide polymorphism mining and nucleotide sequence analysis of Mx1 gene in exonic regions of Japanese quail

    PubMed Central

    Niraj, Diwesh Kumar; Kumar, Pushpendra; Mishra, Chinmoy; Narayan, Raj; Bhattacharya, Tarun Kumar; Shrivastava, Kush; Bhushan, Bharat; Tiwari, Ashok Kumar; Saxena, Vishesh; Sahoo, Nihar Ranjan; Sharma, Deepak

    2015-01-01

    Aim: An attempt has been made to study the Myxovirus resistant (Mx1) gene polymorphism in Japanese quail. Materials and Methods: In the present investigation, four fragments, viz. Fragment I of 185 bp (Exon 3 region), Fragment II of 148 bp (Exon 5 region), Fragment III of 161 bp (Exon 7 region), and Fragment IV of 176 bp (Exon 13 region) of the Mx1 gene, were amplified and screened for polymorphism by the polymerase chain reaction-single-strand conformation polymorphism technique in 170 Japanese quail birds. Results: Out of the four fragments, one fragment (Fragment II) was found to be polymorphic. The remaining three fragments (Fragments I, III, and IV) were found to be monomorphic, which was confirmed by custom sequencing. Overall nucleotide sequence analysis of the Mx1 gene of Japanese quail showed 100% homology with common quail and more than 80% homology with the reported sequences of chicken breeds. Conclusion: The Mx1 gene is mostly conserved in Japanese quail. There is an urgent need for comprehensive analysis of other regions of the Mx1 gene along with its possible association with the traits of economic importance in Japanese quail. PMID:27047057

  16. Complete nucleotide sequence of the chlorarachniophyte nucleomorph: nature's smallest nucleus.

    PubMed

    Gilson, Paul R; Su, Vanessa; Slamovits, Claudio H; Reith, Michael E; Keeling, Patrick J; McFadden, Geoffrey I

    2006-06-20

    The introduction of plastids into different heterotrophic protists created lineages of algae that diversified explosively, proliferated in marine and freshwater environments, and radically altered the biosphere. The origins of these secondary plastids are usually inferred from the presence of additional plastid membranes. However, two examples provide unique snapshots of secondary-endosymbiosis-in-action, because they retain a vestige of the endosymbiont nucleus known as the nucleomorph. These are chlorarachniophytes and cryptomonads, which acquired their plastids from a green and red alga respectively. To allow comparisons between them, we have sequenced the nucleomorph genome from the chlorarachniophyte Bigelowiella natans: at a mere 373,000 bp and with only 331 genes, the smallest nuclear genome known and a model for extreme reduction. The genome is eukaryotic in nature, with three linear chromosomes containing densely packed genes with numerous overlaps. The genome is replete with 852 introns, but these are the smallest introns known, being only 18, 19, 20, or 21 nt in length. These pygmy introns are shown to be miniaturized versions of normal-sized introns present in the endosymbiont at the time of capture. Seventeen nucleomorph genes encode proteins that function in the plastid. The other nucleomorph genes are housekeeping entities, presumably underpinning maintenance and expression of these plastid proteins. Chlorarachniophyte plastids are thus serviced by three different genomes (plastid, nucleomorph, and host nucleus) requiring remarkable coordination and targeting. Although originating by two independent endosymbioses, chlorarachniophyte and cryptomonad nucleomorph genomes have converged upon remarkably similar architectures but differ in many molecular details that reflect two distinct trajectories to hypercompaction and reduction.

  17. On the feasibility of using the intrinsic fluorescence of nucleotides for DNA sequencing.

    SciTech Connect

    Chowdhury, M. H.; Ray, K.; Johnson, R. L.; Gray, S. K.; Pond, J.; Lakowicz, J. R.; Univ. of Maryland; Univ. of Virginia; Lumerical Solutions, Inc.

    2010-04-29

    There is presently a worldwide effort to increase the speed and decrease the cost of DNA sequencing as exemplified by the goal of the National Human Genome Research Institute (NHGRI) to sequence a human genome for under $1000. Several high throughput technologies are under development. Among these, single strand sequencing using exonuclease appears very promising. However, this approach requires complete labeling of at least two bases at a time, with extrinsic high quantum yield probes. This is necessary because nucleotides absorb in the deep ultraviolet (UV) and emit with extremely low quantum yields. Hence intrinsic emission from DNA and nucleotides is not being exploited for DNA sequencing. In the present paper we consider the possibility of identifying single nucleotides using their intrinsic emission. We used the finite-difference time-domain (FDTD) method to calculate the effects of aluminum nanoparticles on nearby fluorophores that emit in the UV. We find that the radiated power of UV fluorophores is significantly increased when they are in close proximity to aluminum nanostructures. We show that there will be increased localized excitation near aluminum particles at wavelengths used to excite intrinsic nucleotide emission. Using FDTD simulation we show that a typical DNA base when coupled to appropriate aluminum nanostructures leads to highly directional emission. Additionally we present experimental results showing that a thin film of nucleotides shows enhanced emission when in close proximity to aluminum nanostructures. Finally we provide Monte Carlo simulations that predict high levels of base calling accuracy for an assumed number of photons that is derived from the emission spectra of the intrinsic fluorescence of the bases. Our results suggest that single nucleotides can be detected and identified using aluminum nanostructures that enhance their intrinsic emission. This capability would be valuable for the ongoing efforts toward the $1000 genome.

  18. Secretory pancreatic stone protein messenger RNA. Nucleotide sequence and expression in chronic calcifying pancreatitis.

    PubMed Central

    Giorgi, D; Bernard, J P; Rouquier, S; Iovanna, J; Sarles, H; Dagorn, J C

    1989-01-01

    The pancreatic stone protein and its secretory form (PSP-S) are inhibitors of CaCO3 crystal growth, possibly involved in the stabilization of pancreatic juice. We have established the structure of PSP-S mRNA and monitored its expression in chronic calcifying pancreatitis (CCP). A cDNA encoding pre-PSP-S has been cloned from a human pancreatic cDNA library. Its nucleotide sequence revealed that it comprised all but the 5' end of PSP-S mRNA, which was obtained by sequencing the first exon of the PSP-S gene. The complete mRNA sequence is 775 nucleotides long, including 5'- and 3'- noncoding regions of 80 and 197 nucleotides, respectively, attached to a poly(A) tail of approximately 125 nucleotides. It encodes a preprotein of 166 amino acids, including a prepeptide of 22 amino acids. No overall sequence homology was found between PSP-S and other pancreatic proteins. Some homology with several serine proteases was observed in the COOH-terminal region, however. The mRNA levels of PSP-S, trypsinogen, chymotrypsinogen, and colipase in CCP and control pancreas were compared. PSP-S mRNA was three times lower in CCP than in control, whereas the others were not altered. It was concluded that PSP-S gene expression is specifically reduced in CCP patients. PMID:2525567

  19. Nucleotide sequence of the alpha-amylase-pullulanase gene from Clostridium thermohydrosulfuricum.

    PubMed

    Melasniemi, H; Paloheimo, M; Hemiö, L

    1990-03-01

    The nucleotide sequence of the gene (apu) encoding the thermostable alpha-amylase-pullulanase of Clostridium thermohydrosulfuricum was determined. An open reading frame of 4425 bp was present. The deduced polypeptide (Mr 165,600), including a 31 amino acid putative signal sequence, comprised 1475 amino acids, with no cysteine residues. The structural gene was preceded by the consensus promoter sequence TTGACA TATAAT, a putative regulatory sequence and a putative ribosome-binding sequence AAAGGGGG. The codon usage resembled that of Bacillus genes. The deduced sequence of the mature apu product showed similarities to various amylolytic enzymes, especially the neopullulanase of Bacillus stearothermophilus, whereas the signal sequence showed similarity to those of the alpha-amylases of B. stearothermophilus and B. subtilis. Three regions thought to be highly conserved in the primary structure of alpha-amylases could also be distinguished in the apu product, two being partly 'duplicated' in this alpha-1,4/alpha-1,6-active enzyme.

  20. Nucleotide sequence of a cloned woodchuck hepatitis virus genome: evolutional relationship between hepadnaviruses.

    PubMed Central

    Kodama, K; Ogasawara, N; Yoshikawa, H; Murakami, S

    1985-01-01

    We have determined the complete nucleotide sequence of a cloned DNA of woodchuck hepatitis virus (WHV), the most oncogenic virus among hepadnaviruses. The genome, designated WHV2, is 3,320 base pairs long and contains four major open reading frames (ORFs) coded on the same strand of nucleotide sequence as in the human hepatitis B virus (HBV) genome. Comparison of the nucleotide sequence and amino acid sequences deduced from it among the genomes of various hepadnaviruses demonstrates that each protein shows an intrinsic property in conserving its amino acid sequence. A parameter, the ratio of the number of triplets with one-letter change but no amino acid substitution to the total number of triplets in which one-letter change occurred, was introduced to measure the intrinsic properties quantitatively. For each ORF, the parameter gave characteristic values in all combinations. Therefore, the relative evolutional distance between these hepadnaviruses can be measured by the amino acid substitution rate of any ORF. These comparisons suggest that (i) the difference between two WHV clones, WHV1 and WHV2, corresponds to that among clones of a HBV subtype, HBVadr, and (ii) WHV and ground squirrel hepatitis virus can be categorized in a way similar to the subgroups of HBV. PMID:3855246
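
    The parameter defined in this abstract (silent one-letter triplet changes divided by all one-letter triplet changes) can be illustrated with a short sketch. It assumes two in-frame, gap-free, aligned coding sequences and the standard genetic code; the function name and the test sequences are hypothetical, and this is not the authors' implementation.

```python
from itertools import product

# Standard genetic code, DNA alphabet, bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def silent_change_ratio(cds_a, cds_b):
    """Fraction of one-letter triplet differences that leave the amino acid unchanged.

    Returns None when no aligned triplet differs by exactly one letter.
    """
    one_letter = 0
    silent = 0
    for i in range(0, min(len(cds_a), len(cds_b)) - 2, 3):
        a = cds_a[i:i + 3].upper()
        b = cds_b[i:i + 3].upper()
        if a not in CODON_TABLE or b not in CODON_TABLE:
            continue  # skip triplets with ambiguous or gap characters
        if sum(x != y for x, y in zip(a, b)) != 1:
            continue  # keep only triplets with exactly one changed letter
        one_letter += 1
        if CODON_TABLE[a] == CODON_TABLE[b]:
            silent += 1
    return silent / one_letter if one_letter else None


# Made-up example: CTT>CTC (Leu, silent), GCT=GCT (ignored),
# AAA>AGA (Lys>Arg, not silent), GGG>GGA (Gly, silent).
ratio = silent_change_ratio("CTTGCTAAAGGG", "CTCGCTAGAGGA")
print(ratio)
```

    A value near 1 would indicate an ORF whose one-letter changes are almost all silent, that is, a strongly conserved amino acid sequence in the sense used above.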

  1. Drosophila melanogaster mitochondrial DNA: completion of the nucleotide sequence and evolutionary comparisons.

    PubMed

    Lewis, D L; Farr, C L; Kaguni, L S

    1995-11-01

    The nucleotide sequence of the regions flanking the A+T region of Drosophila melanogaster mitochondrial DNA (mtDNA) has been determined. Included are the genes encoding the transfer RNAs for valine, isoleucine, glutamine and methionine, the small ribosomal RNA and the 5'-coding sequences of the large ribosomal RNA and NADH dehydrogenase subunit II. This completes the nucleotide sequence of the D. melanogaster mitochondrial genome. The circular mtDNA of D. melanogaster varies in size among different populations largely due to length differences in the control region (Fauron & Wolstenholme, 1976; Fauron & Wolstenholme, 1980a, b); the mtDNA region we have sequenced, combined with those sequenced by others, yields a composite genome that is 19,517 bp in length as compared to 16,019 bp for the mtDNA of D. yakuba. D. melanogaster mtDNA exhibits an extreme bias in base composition; it comprises 82.2% deoxyadenylate and thymidylate residues as compared to 78.6% in D. yakuba mtDNA. All genes encoded in the mtDNA of both species are in identical locations and orientations. Nucleotide substitution analysis reveals that tRNA and rRNA genes evolve at less than half the rate of protein coding genes.
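
    The base-composition figures quoted here are A+T fractions of the sequenced strand. A minimal sketch of that calculation follows; the function name is hypothetical and ambiguous symbols such as N are simply ignored, which is an assumption rather than a documented convention of the study.

```python
def at_fraction(seq):
    """Return the fraction of A and T among unambiguous bases (A, C, G, T) in `seq`."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "AT" for b in bases) / len(bases)


# Made-up fragment: 4 of the 6 unambiguous bases are A or T.
fraction = at_fraction("AATTGCN")
print(f"{fraction:.1%}")
```

    Applied to a complete mitochondrial genome record, a value of about 0.82 would correspond to the 82.2% reported for D. melanogaster.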

  2. The human myelin oligodendrocyte glycoprotein (MOG) gene: Complete nucleotide sequence and structural characterization

    SciTech Connect

    Paule Roth, M.; Malfroy, L.; Offer, C.; Sevin, J.; Enault, G.; Borot, N.; Pontarotti, P.; Coppin, H.

    1995-07-20

    Human myelin oligodendrocyte glycoprotein (MOG), a myelin component of the central nervous system, is a candidate target antigen for autoimmune-mediated demyelination. We have isolated and sequenced part of a cosmid clone that contains the entire human MOG gene. The primary nuclear transcript, extending from the putative start of transcription to the site of poly(A) addition, is 15,561 nucleotides in length. The human MOG gene contains 8 exons, separated by 7 introns; canonical intron/exon boundary sites are observed at each junction. The introns vary in size from 242 to 6484 bp and contain numerous repetitive DNA elements, including 14 Alu sequences within 3 introns. Another Alu element is located in the 3'-untranslated region of the gene. Alu sequences were classified with respect to subfamily assignment. Seven hundred sixty-three nucleotides 5' of the transcription start and 1214 nucleotides 3' of the poly(A) addition sites were also sequenced. The 5'-flanking region revealed the presence of several consensus sequences that could be relevant in the transcription of the MOG gene, in particular binding sites in common with other myelin gene promoters. Two polymorphic intragenic dinucleotide (CA)n and tetranucleotide (TAAA)n repeats were identified and may provide genetic marker tools for association and linkage studies. 50 refs., 3 figs., 3 tabs.

  3. Nucleotide sequence and genome organization of atractylodes mottle virus, a new member of the genus Carlavirus.

    PubMed

    Zhao, Fumei; Igori, Davaajargal; Lim, Seungmo; Yoo, Ran Hee; Lee, Su-Heon; Moon, Jae Sun

    2015-11-01

    The complete genome sequence of a member of a distinct species of the genus Carlavirus in the family Betaflexiviridae, tentatively named atractylodes mottle virus (AtrMoV), has been determined. Analysis of its genomic organization indicates that it has a single-stranded, positive-sense genomic RNA of 8866 nucleotides, excluding the poly(A) tail, and consists of six open reading frames typical of members of the genus Carlavirus. The individual open reading frames of AtrMoV show moderately low sequence similarity to those of other carlaviruses at the nucleotide and amino acid sequence levels. Pairwise comparison and phylogenetic analysis suggest that AtrMoV is most closely related to chrysanthemum virus B. PMID:26264403

  4. Analysis of the complete nucleotide sequence of the Agrobacterium tumefaciens virB operon.

    PubMed

    Thompson, D V; Melchers, L S; Idler, K B; Schilperoort, R A; Hooykaas, P J

    1988-05-25

    The complete nucleotide sequence of the virB locus, from the octopine Ti plasmid of Agrobacterium tumefaciens strain 15955, has been determined. In the large virB-operon (9600 nucleotides) we have identified eleven open reading frames, designated virB1 to virB11. From DNA sequence analysis it is proposed that nearly all VirB products, i.e. VirB1 to VirB9, are secreted or membrane associated proteins. Interestingly, both a membrane protein (VirB4) and a potential cytoplasmic protein (VirB11) contain the consensus amino acid sequence of ATP-binding proteins. In view of the conjugative T-DNA transfer model, the VirB proteins are suggested to act at the bacterial surface and there play an important role in directing T-DNA transfer to plant cells. PMID:2837739

  5. A sequence of seventy-three nucleotides from the coliphage R17 genome

    PubMed Central

    Rensing, Ulrich F. E.

    1973-01-01

    1. A sequence of 73 nucleotides of the RNA genome from coliphage R17 was determined. It can be read through in only one translational frame. The fragment is not part of the coat protein cistron (Min Jou et al., 1972), nor does it come from the untranslated sequences described previously (Steitz, 1969; Nichols, 1970; Cory et al., 1970; de Wachter et al., 1971; Contreras et al., 1971; Cory et al., 1972). It contains two sequences of 23 and 24 nucleotides, 22 of which are identical. This kind of reiteration is the first one found in bacteriophage nucleic acid. 2. Improved conditions were found and tested for blocking oligonucleotides with carbodi-imide and cleaving by ribonuclease A at cytidylate residues. 3. A synthetic medium is described which allows labelling in vivo with 32P to give specific radioactivities higher than those obtained in the procedures used previously. PMID:4352721

  6. Complete nucleotide sequence of a subviral DNA molecule of porcine circovirus type 2.

    PubMed

    Wen, Han

    2016-07-01

    Porcine circovirus type 2 (PCV2) is a member of the genus Circovirus in the family Circoviridae. Most subgenomic molecules of PCV2 have been mapped. Here, the first full-length sequence of a subviral molecule of PCV2 (CH-IVT12) containing a reverse complement sequence of the PCV2 genome was determined by sequencing DNA extracted from PK15 cells infected with PCV2. The circular CH-IVT12 DNA consists of 1136 nucleotides and contains one major open reading frame. PMID:27084550

  7. Sequence selective naked-eye detection of DNA harnessing extension of oligonucleotide-modified nucleotides.

    PubMed

    Verga, Daniela; Welter, Moritz; Marx, Andreas

    2016-02-01

    DNA polymerases can efficiently and sequence selectively incorporate oligonucleotide (ODN)-modified nucleotides and the incorporated oligonucleotide strand can be employed as primer in rolling circle amplification (RCA). The effective amplification of the DNA primer by Φ29 DNA polymerase allows the sequence-selective hybridisation of the amplified strand with a G-quadruplex DNA sequence that has horse radish peroxidase-like activity. Based on these findings we develop a system that allows DNA detection with single-base resolution by naked eye.

  8. Sequence selective naked-eye detection of DNA harnessing extension of oligonucleotide-modified nucleotides.

    PubMed

    Verga, Daniela; Welter, Moritz; Marx, Andreas

    2016-02-01

    DNA polymerases can efficiently and sequence selectively incorporate oligonucleotide (ODN)-modified nucleotides and the incorporated oligonucleotide strand can be employed as primer in rolling circle amplification (RCA). The effective amplification of the DNA primer by Φ29 DNA polymerase allows the sequence-selective hybridisation of the amplified strand with a G-quadruplex DNA sequence that has horse radish peroxidase-like activity. Based on these findings we develop a system that allows DNA detection with single-base resolution by naked eye. PMID:26774580

  9. The nucleotide sequence at the termini of adenovirus type 5 DNA.

    PubMed Central

    Steenbergh, P H; Maat, J; van Ormondt, H; Sussenbach, J S

    1977-01-01

    The sequences of the first 194 base pairs at both termini of adenovirus type 5 (Ad5) DNA have been determined, using the chemical degradation technique developed by Maxam and Gilbert (Proc. Nat. Acad. Sci. USA 74 (1977), pp. 560-564). The nucleotide sequences 1-75 were confirmed by analysis of labeled RNA transcribed from the terminal HhaI fragments in vitro. The sequence data show that Ad5 DNA has a perfect inverted terminal repetition 103 base pairs long. PMID:600799
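
    A perfect inverted terminal repetition means that the 5' end of one strand reads the same as the 5' end of the complementary strand. The sketch below measures the length of such a repeat for a linear sequence; the function names and the toy sequence are hypothetical, and for a sequence that is palindromic over its whole length the result is simply the full length.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def inverted_terminal_repeat_length(seq):
    """Length of the perfect inverted terminal repeat of a linear DNA sequence.

    Compares the sequence with its reverse complement from the 5' end and
    stops at the first mismatch.
    """
    top = seq.upper()
    bottom = reverse_complement(top)
    length = 0
    for left, right in zip(top, bottom):
        if left != right:
            break
        length += 1
    return length


# Made-up molecule: 5-base terminus CATCA, unrelated middle, then its reverse complement TGATG.
toy = "CATCA" + "GGTTTT" + "TGATG"
itr = inverted_terminal_repeat_length(toy)
print(itr)
```

    For a genome with the structure described in the abstract, this comparison would be expected to run for 103 positions before the first mismatch.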

  10. Complete nucleotide sequence analysis of a Dengue-1 virus isolated on Easter Island, Chile.

    PubMed

    Cáceres, C; Yung, V; Araya, P; Tognarelli, J; Villagra, E; Vera, L; Fernández, J

    2008-01-01

    Dengue-1 viruses responsible for the dengue fever outbreak in Easter Island in 2002 were isolated from acute-phase sera of dengue fever patients. In order to analyze the complete genome sequence, we designed primers to amplify contiguous segments across the entire sequence of the viral genome. RT-PCR products obtained were cloned, and complete nucleotide and deduced amino acid sequences were determined. This report constitutes the first complete genetic characterization of a DENV-1 isolate from Chile. Phylogenetic analysis shows that an Easter Island isolate is most closely related to Pacific DENV-1 genotype IV viruses.

  11. Cloning and nucleotide sequence of the alpha-galactosidase cDNA from Cyamopsis tetragonoloba (guar).

    PubMed

    Overbeeke, N; Fellinger, A J; Toonen, M Y; van Wassenaar, D; Verrips, C T

    1989-11-01

    Polyadenylated mRNA was purified from the aleurone cells of Cyamopsis tetragonoloba (guar) seeds germinated for 18 h and used for the construction of a cDNA library. Clones with the alpha-galactosidase encoding gene were identified using mixed oligonucleotide probes based on the NH2-terminal amino acid sequence and on the sequence of an internal peptide. The nucleotide sequence of the cDNA clone showed that the enzyme is synthesized as a precursor with a 47 amino acid NH2-terminal extension. This pre-sequence most likely functions to target the protein outside the aleurone cells into the endosperm. Based upon structural features, it is proposed to divide the precursor into a pre-(signal sequence) part and a glycosylated pro-part comparable with those of the yeast mat A/alpha factor and killer factor. A comparison of the derived amino acid sequence of this plant alpha-galactosidase revealed significant stretches of homology with the amino acid sequences of the enzymes from Saccharomyces cerevisiae and from human origin, but only minor homology with the alpha-galactosidase from Escherichia coli.

  12. Complete nucleotide sequence of wound tumor virus genomic segments encoding nonstructural polypeptides.

    PubMed

    Anzola, J V; Dall, D J; Xu, Z K; Nuss, D L

    1989-07-01

    Sequence analysis of the genomic segments which encode the five wound tumor virus nonstructural polypeptides has been completed. The complete nucleotide sequences of segments S4 (2565 bp), S6 (1700 bp), S9 (1182 bp), and S10 (1172 bp) are presented in this report, while the sequence of segment S12 (851 bp) has been described previously (T. Asamizu, D. Summers, M. B. Motika, J. V. Anzola, and D. L. Nuss, 1985, Virology 144, 398-409). Comparison of the only published sequence for another member of the genus Phytoreovirus, that of rice dwarf virus segment S10, with the combined available wound tumor virus sequence data revealed similarity with WTV segment S10: 54.9 and 30.6% at the nucleotide and amino acid level, respectively. Although wound tumor virus and rice dwarf virus differ in plant host range, tissue specificity, vector range, and disease symptom expression, the level of sequence similarity shared by the two segments suggests a common origin for these viruses. The potential use of a phytoreovirus sequence database for predicting functions of viral encoded gene products is considered.

  13. Nucleotide binding database NBDB – a collection of sequence motifs with specific protein-ligand interactions

    PubMed Central

    Zheng, Zejun; Goncearenco, Alexander; Berezovsky, Igor N.

    2016-01-01

    The NBDB database describes protein motifs, elementary functional loops (EFLs), that are involved in binding of nucleotide-containing ligands and other biologically relevant cofactors/coenzymes, including ATP, AMP, ATP, GMP, GDP, GTP, CTP, PAP, PPS, FMN, FAD(H), NAD(H), NADP, cAMP, cGMP, c-di-AMP and c-di-GMP, ThPP, THD, F-420, ACO, CoA, PLP and SAM. The database is freely available online at http://nbdb.bii.a-star.edu.sg. In total, NBDB contains data on 249 motifs that work in interactions with 24 ligands. Sequence profiles of EFL motifs were derived de novo from nonredundant Uniprot proteome sequences. Conserved amino acid residues in the profiles interact specifically with distinct chemical parts of nucleotide-containing ligands, such as nitrogenous bases, phosphate groups, ribose, nicotinamide, and flavin moieties. Each EFL profile in the database is characterized by a pattern of corresponding ligand–protein interactions found in crystallized ligand–protein complexes. The NBDB database helps to explore the determinants of nucleotide and cofactor binding in different protein folds and families. NBDB can also detect fragments that match profiles of particular EFLs in a protein sequence provided by the user. Comprehensive information on sequence, structures, and interactions of EFLs with ligands provides a foundation for experimental and computational efforts on design of required protein functions. PMID:26507856

  15. Nucleotide binding database NBDB--a collection of sequence motifs with specific protein-ligand interactions.

    PubMed

    Zheng, Zejun; Goncearenco, Alexander; Berezovsky, Igor N

    2016-01-01

    The NBDB database describes protein motifs, elementary functional loops (EFLs), that are involved in binding of nucleotide-containing ligands and other biologically relevant cofactors/coenzymes, including ATP, AMP, ATP, GMP, GDP, GTP, CTP, PAP, PPS, FMN, FAD(H), NAD(H), NADP, cAMP, cGMP, c-di-AMP and c-di-GMP, ThPP, THD, F-420, ACO, CoA, PLP and SAM. The database is freely available online at http://nbdb.bii.a-star.edu.sg. In total, NBDB contains data on 249 motifs that work in interactions with 24 ligands. Sequence profiles of EFL motifs were derived de novo from nonredundant Uniprot proteome sequences. Conserved amino acid residues in the profiles interact specifically with distinct chemical parts of nucleotide-containing ligands, such as nitrogenous bases, phosphate groups, ribose, nicotinamide, and flavin moieties. Each EFL profile in the database is characterized by a pattern of corresponding ligand-protein interactions found in crystallized ligand-protein complexes. The NBDB database helps to explore the determinants of nucleotide and cofactor binding in different protein folds and families. NBDB can also detect fragments that match profiles of particular EFLs in a protein sequence provided by the user. Comprehensive information on sequence, structures, and interactions of EFLs with ligands provides a foundation for experimental and computational efforts on design of required protein functions. PMID:26507856

  17. Linking the human cytogenetic map with nucleotide sequence: the CCAP clone set.

    PubMed

    Jang, Wonhee; Yonescu, Raluca; Knutsen, Turid; Brown, Theresa; Reppert, Tricia; Sirotkin, Karl; Schuler, Gregory D; Ried, Thomas; Kirsch, Ilan R

    2006-07-15

    We present the completed dataset and clone repository of the Cancer Chromosome Aberration Project (CCAP), an initiative developed and funded through the intramural program of the U.S. National Cancer Institute, to provide seamless linkage of human cytogenetic markers with the primary nucleotide sequence of the human genome. Spaced at 1-2 Mb intervals across the human genome, 1,339 bacterial artificial chromosome (BAC) clones have been localized to chromosomal bands through high-resolution fluorescence in situ hybridization (FISH) mapping. Of these clones, 99.8% can be positioned on the primary human genome sequence and 95% are placed at or close to their precise nucleotide starts and stops. This dataset can be studied and manipulated within generally available public Web sites. The clones are available from a commercial repository. The CCAP BAC clone set provides anchors for the interrogation of gene and sequence involvement in oncogenic and developmental disorders when the starting point is the recognition of a structural, numerical, or interstitial chromosomal aberration. This dataset also provides a current view of the quality and coherence of the available genome sequence and insight into the nucleotide and three-dimensional structures that manifest as Giemsa light and dark chromosomal banding patterns. PMID:16843097

  18. The EBI Search engine: providing search and retrieval functionality for biological data from EMBL-EBI.

    PubMed

    Squizzato, Silvano; Park, Young Mi; Buso, Nicola; Gur, Tamer; Cowley, Andrew; Li, Weizhong; Uludag, Mahmut; Pundir, Sangya; Cham, Jennifer A; McWilliam, Hamish; Lopez, Rodrigo

    2015-07-01

    The European Bioinformatics Institute (EMBL-EBI; https://www.ebi.ac.uk) provides free and unrestricted access to data across all major areas of biology and biomedicine. Searching and extracting knowledge across these domains requires a fast and scalable solution that addresses the requirements of domain experts as well as casual users. We present the EBI Search engine, referred to here as 'EBI Search', an easy-to-use fast text search and indexing system with powerful data navigation and retrieval capabilities. API integration provides access to analytical tools, allowing users to further investigate the results of their search. The interconnectivity that exists between data resources at EMBL-EBI provides easy, quick and precise navigation and a better understanding of the relationship between different data types including sequences, genes, gene products, proteins, protein domains, protein families, enzymes and macromolecular structures, together with relevant life science literature.

  19. Cloning and nucleotide sequence of the anaerobically regulated pepT gene of Salmonella typhimurium.

    PubMed Central

    Miller, C G; Miller, J L; Bagga, D A

    1991-01-01

    The anaerobically regulated pepT gene of Salmonella typhimurium has been cloned in pBR328. Strains carrying the pepT plasmid, pJG17, overproduce peptidase T by approximately 70-fold. The nucleotide sequence of a 2.5-kb region including pepT has been determined. The sequence codes for a protein of 44,855 Da, consistent with a molecular weight of approximately 46,000 for peptidase T (as determined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis and gel filtration). The N-terminal amino acid sequence of peptidase T purified from a pJG17-containing strain matches that predicted by the nucleotide sequence. A plasmid carrying an anaerobically regulated pepT::lacZ transcriptional fusion contains only 165 bp 5' to the start of translation. This region contains a sequence highly homologous to that identified in Escherichia coli as the site of action of the FNR protein, a positive regulator of anaerobic gene expression. A region of the deduced amino acid sequence of peptidase T is similar to segments of Pseudomonas carboxypeptidase G2, the E. coli peptidase encoded by the iap gene, and E. coli peptidase D. PMID:1904438

  20. Nucleotide sequence of the hypervariable region of the human C2 gene

    SciTech Connect

    Zhu, Z.B.; Volanakis, J.V.

    1991-03-15

    It has been previously suggested that the multiallelic BamHI/SstI RFLPs of the human C2 gene arose through deletion/insertion of a tandemly repeated minisatellite region. In this study the authors subcloned and sequenced the SstI polymorphic fragment of the b haplotype of the C2 gene. This restriction fragment is 2,450 bp long and maps 1,550 bp 3' of exon 3. Its nucleotide sequence is characterized by the presence of at least 4 different repeated regions varying in size from 18 to 58 bp. One of these regions, starting at position 1,413, is 48 bp long and is repeated five times. The first 3 repeats are in tandem and are separated by 72 bp from two additional tandem repeats. Sequence homology among the 5 repeats ranges between 93 and 98%. Eighty-three percent of the nucleotides of the repeated region are G or C. It seems likely that this nucleotide repeat resulted in the multiallelic RFLPs through a mechanism of unequal recombination or replication slippage.

  1. Complete nucleotide sequence and coding strategy of rice hoja blanca virus RNA4.

    PubMed

    Ramirez, B C; Lozano, I; Constantino, L M; Haenni, A L; Calvert, L A

    1993-11-01

    The complete sequence of rice hoja blanca virus (RHBV) RNA4 has been determined, based on the sequence of the corresponding cDNA clones. RNA4 consists of 1991 nucleotides with two open reading frames (ORFs). One putative ORF is located in the 5'-proximal region of the viral RNA4; it encodes a protein of predicted M(r) 20,076 which corresponds to the major non-structural protein that accumulates in RHBV-infected rice plants, and which bears limited sequence identity with the helper component of tobacco vein mottling potyvirus. The other ORF is located in the 5'-proximal region of the viral complementary RNA4 and encodes a protein of predicted M(r) 32,469. Between the two ORFs is an intergenic region of 524 nucleotides, part of which can theoretically adopt a stable stem-loop structure; the 5' and 3' ends can potentially base-pair over 16 nucleotides, producing a pan-handle configuration. These characteristics are in favour of an ambisense coding strategy for RHBV RNA4. PMID:8245863

  2. Complete Nucleotide Sequence of a French Isolate of Maize rough dwarf virus, a Fijivirus Member in the Family Reoviridae

    PubMed Central

    Svanella-Dumas, L.; Marais, A.; Faure, C.; Theil, S.; Thibord, J. B.

    2016-01-01

    The complete nucleotide sequence of a French isolate of Maize rough dwarf virus (MRDV) was determined by next-generation sequencing and compared with the single available complete sequence and with the partial sequences of two additional isolates available in online databases. PMID:27445367

  3. Nucleotide sequence of a hop stunt viroid variant isolated from citrus growing in Taiwan.

    PubMed

    Hsu, Y H; Chen, W; Owens, R A

    1995-01-01

    The 303 nucleotide sequence of HSVd-citrus(T), a hop stunt viroid (HSVd) variant present in Etrog citron growing in Taiwan, was determined from cDNAs amplified by the polymerase chain reaction. HSVd-citrus(T) is very similar to several HSVd isolates previously recovered from citrus or cucumber, and exhibits microsequence heterogeneity at positions 154 and 181. Phylogenetic analysis using maximum parsimony grouped HSVd-citrus(T) with seven other isolates from citrus and cucumber in a large cluster of "citrus-type" isolates. A similar analysis revealed marked differences in both the extent and distribution of sequence variation among naturally occurring isolates of potato spindle tuber viroid.

  4. Nucleotide sequencing and characterization of the genes encoding benzene oxidation enzymes of Pseudomonas putida

    SciTech Connect

    Irie, S.; Doi, S.; Yorifuji, T.; Takagi, M.; Yano, K.

    1987-11-01

    The nucleotide sequence of the genes from Pseudomonas putida encoding oxidation of benzene to catechol was determined. Five open reading frames were found in the sequence. Four corresponding protein molecules were detected by a DNA-directed in vitro translation system. Escherichia coli cells containing the fragment with the four open reading frames transformed benzene to cis-benzene glycol, which is an intermediate of the oxidation of benzene to catechol. The relation between the product of each cistron and the components of the benzene oxidation enzyme system is discussed.

  5. Nucleotide sequences of immunoglobulin eta genes of chimpanzee and orangutan: DNA molecular clock and hominoid evolution

    SciTech Connect

    Sakoyama, Y.; Hong, K.J.; Byun, S.M.; Hisajima, H.; Ueda, S.; Yaoita, Y.; Hayashida, H.; Miyata, T.; Honjo, T.

    1987-02-01

    To determine the phylogenetic relationships among hominoids and the dates of their divergence, the complete nucleotide sequences of the constant region of the immunoglobulin eta-chain (Cη1) genes from chimpanzee and orangutan have been determined. These sequences were compared with the human eta-chain constant-region sequence. A molecular clock (silent molecular clock), measured by the degree of sequence divergence at the synonymous (silent) positions of protein-encoding regions, was introduced for the present study. From the comparison of nucleotide sequences of α1-antitrypsin and β- and δ-globulin genes between humans and Old World monkeys, the silent molecular clock was calibrated: the mean evolutionary rate of silent substitution was determined to be 1.56 × 10^-9 substitutions per site per year. Using the silent molecular clock, the mean divergence dates of chimpanzee and orangutan from the human lineage were estimated as 6.4 ± 2.6 million years and 17.3 ± 4.5 million years, respectively. It was also shown that the evolutionary rate of primate genes is considerably slower than those of other mammalian genes.
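    How a calibrated silent-substitution rate turns sequence divergence into a date can be sketched as follows. The abstract does not give the formula or the measured divergence values, so this assumes the commonly used relation T = K / (2r), where K is the synonymous divergence between two lineages and the factor 2 accounts for substitutions accumulating along both branches. The K value below is hypothetical, chosen only for illustration; only the rate comes from the abstract.

```python
def divergence_time_years(k_s: float, rate_per_site_per_year: float) -> float:
    """Estimate time since divergence from synonymous divergence.

    Assumes T = K / (2 * r): substitutions accumulate independently on
    both lineages, each at rate r (an assumption, not stated in the source).
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return k_s / (2.0 * rate_per_site_per_year)


SILENT_RATE = 1.56e-9      # substitutions per site per year, from the abstract
hypothetical_k_s = 0.02    # illustrative divergence value, NOT from the paper

years = divergence_time_years(hypothetical_k_s, SILENT_RATE)
print(f"{years / 1e6:.2f} million years")
```

    Under this relation, a synonymous divergence of about 2% corresponds to a date in the range reported for the human-chimpanzee split; the published estimates also carry the stated error margins, which this sketch does not model.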

  6. Nucleotide sequence of the p53 cDNA of beluga whale (Delphinapterus leucas).

    PubMed

    Xu, Ning; Shiraki, Takashi; Yamada, Tadasu; Nakajima, Masayuki; Gauthier, Julie M; Pfeiffer, Carl J; Sato, Shigeaki

    2002-04-17

    The cDNA (DNA complementary to RNA) of the p53 gene of the beluga whale (Delphinapterus leucas) was sequenced by 5'- and 3'-rapid amplification of cDNA ends (RACE), using cDNA made from RNA obtained from fresh peripheral blood leukocytes isolated from two animals. Primers for the RACE method were synthesized based on the sequence of the beluga whale DNA corresponding to exon 5 of the human p53 gene, which was determined after amplification of DNA isolated from the liver of a beluga whale by using a pair of primers for the human sequence. The sequenced cDNA was 2150 nucleotides long and contained the whole region corresponding to human exons 1 through 11. The reading frame was 1164 bp (base pairs) long, began in exon 2 and ended in exon 11, and coded for a 387-amino acid protein. The nucleotide sequence of the reading frame showed high similarity (over 85%) to the pig, sheep, cow, and human genes. The similarities with the former two animals at the amino acid level were also more than 85%. Lower similarity of the beluga whale p53 gene was also found with those of lower tetrapods, fish and invertebrates.
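    The reading-frame figures above are consistent with each other, as a quick check shows: 1164 bp is 388 codons, which is 387 amino acids plus one stop codon. The sketch below assumes the stated reading-frame length includes the stop codon; the function name is illustrative.

```python
def protein_length_from_orf(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an open reading frame.

    Assumes three bases per codon and, by default, that the ORF length
    counts the terminating stop codon.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


print(protein_length_from_orf(1164))  # prints 387
```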

  7. Comparison of Sequencing Platforms for Single Nucleotide Variant Calls in a Human Sample

    PubMed Central

    Miller, Webb; Guillory, Joseph; Stinson, Jeremy; Seshagiri, Somasekar

    2013-01-01

    Next-generation sequencing platforms coupled with advanced bioinformatic tools enable re-sequencing of the human genome at high speed and with large cost savings. We compare sequencing platforms from Roche/454 (GS FLX), Illumina/HiSeq (HiSeq 2000), and Life Technologies/SOLiD (SOLiD 3 ECC) for their ability to identify single nucleotide substitutions in whole genome sequences from the same human sample. We report on significant GC-related bias observed in the data sequenced on Illumina and SOLiD platforms. The differences in the variant calls were investigated with regard to coverage and sequencing error. Some of the variants called by only one or two of the platforms were experimentally tested using mass spectrometry, a method that is independent of DNA sequencing. We establish several causes why variants remained unreported, specific to each platform. We report the indels called using the three sequencing technologies, and from the obtained results we conclude that sequencing human genomes with more than a single platform and multiple libraries is beneficial when a high level of accuracy is required. PMID:23405114

  8. Comparison of sequencing platforms for single nucleotide variant calls in a human sample.

    PubMed

    Ratan, Aakrosh; Miller, Webb; Guillory, Joseph; Stinson, Jeremy; Seshagiri, Somasekar; Schuster, Stephan C

    2013-01-01

    Next-generation sequencing platforms coupled with advanced bioinformatic tools enable re-sequencing of the human genome at high speed and with large cost savings. We compare sequencing platforms from Roche/454 (GS FLX), Illumina/HiSeq (HiSeq 2000), and Life Technologies/SOLiD (SOLiD 3 ECC) for their ability to identify single nucleotide substitutions in whole genome sequences from the same human sample. We report on significant GC-related bias observed in the data sequenced on Illumina and SOLiD platforms. The differences in the variant calls were investigated with regard to coverage and sequencing error. Some of the variants called by only one or two of the platforms were experimentally tested using mass spectrometry, a method that is independent of DNA sequencing. We establish several causes why variants remained unreported, specific to each platform. We report the indels called using the three sequencing technologies, and from the obtained results we conclude that sequencing human genomes with more than a single platform and multiple libraries is beneficial when a high level of accuracy is required.

  9. Large-scale detection and application of expressed sequence tag single nucleotide polymorphisms in Nicotiana.

    PubMed

    Wang, Y; Zhou, D; Wang, S; Yang, L

    2015-01-01

    Single nucleotide polymorphisms (SNPs) are widespread in the Nicotiana genome. Using an alignment and variation detection method, we developed 20,607,973 SNPs, based on the expressed sequence tag sequences of 10 Nicotiana species. The replacement rate was much higher than the transversion rate in the SNPs, and SNPs widely exist in the Nicotiana. In vitro verification indicated that all of the SNPs were high quality and accurate. Evolutionary relationships between 15 varieties were investigated by polymerase chain reaction with a special primer; the specific 302 locus of these sequence results clearly indicated the origin of Zhongyan 100. A database of Nicotiana SNPs (NSNP) was developed to store and search for SNPs in Nicotiana. NSNP is a tool for researchers to develop SNP markers of sequence data. PMID:26214460

  10. Large-scale detection and application of expressed sequence tag single nucleotide polymorphisms in Nicotiana.

    PubMed

    Wang, Y; Zhou, D; Wang, S; Yang, L

    2015-07-14

    Single nucleotide polymorphisms (SNPs) are widespread in the Nicotiana genome. Using an alignment and variation detection method, we developed 20,607,973 SNPs, based on the expressed sequence tag sequences of 10 Nicotiana species. The replacement rate was much higher than the transversion rate in the SNPs, and SNPs widely exist in the Nicotiana. In vitro verification indicated that all of the SNPs were high quality and accurate. Evolutionary relationships between 15 varieties were investigated by polymerase chain reaction with a special primer; the specific 302 locus of these sequence results clearly indicated the origin of Zhongyan 100. A database of Nicotiana SNPs (NSNP) was developed to store and search for SNPs in Nicotiana. NSNP is a tool for researchers to develop SNP markers of sequence data.

  11. Nucleotide sequence of the SrRNA gene and phylogenetic analysis of Trichomonas tenax.

    PubMed

    Fukura, K; Yamamoto, A; Hashimoto, T; Goto, N

    1996-01-01

    The small subunit ribosomal RNA (SrRNA) gene of Trichomonas tenax ATCC30207 was amplified by PCR and the 1.55-kb product was cloned into plasmid vector pUC18. Four clones were isolated and sequenced. The insert DNAs were 1,552 bp long and their G+C contents were 48.1%; three of them had exactly the same DNA sequences and one had only one nucleotide change. A representative SrRNA sequence was analyzed and a phylogenetic tree was estimated by the neighbor-joining (NJ) method. Among the protists examined, T. tenax was placed as the closest relative of Tritrichomonas foetus, as expected from the traditional taxonomy. The total homology between the two SrRNA sequences was 89.2%.
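    The G+C content quoted above (48.1% for the 1,552-bp inserts) is the percentage of bases that are G or C. A minimal sketch of that calculation follows; the input string is a made-up toy sequence, not the T. tenax gene, and the function name is an assumption for illustration.

```python
def gc_content(seq: str) -> float:
    """Percentage of bases in seq that are G or C (case-insensitive)."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


toy_sequence = "GGCCAATT"  # hypothetical example, 4 of 8 bases are G or C
print(f"{gc_content(toy_sequence):.1f}%")  # prints 50.0%
```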

  12. Nucleotide sequence of Crithidia fasciculata cytosol 5S ribosomal ribonucleic acid.

    PubMed

    MacKay, R M; Gray, M W; Doolittle, W F

    1980-11-11

    The complete nucleotide sequence of the cytosol 5S ribosomal ribonucleic acid of the trypanosomatid protozoan Crithidia fasciculata has been determined by a combination of T1-oligonucleotide catalog and gel sequencing techniques. The sequence is: GAGUACGACCAUACUUGAGUGAAAACACCAUAUCCCGUCCGAUUUGUGAAGUUAAGCACCCACAGGCUUAGUUAGUACUGAGGUCAGUGAUGACUCGGGAACCCUGAGUGCCGUACUCCC-OH. This 5S ribosomal RNA is unique in having GAUU in place of the GAAC or GAUC found in all other prokaryotic and eukaryotic 5S RNAs, and thought to be involved in interactions with tRNAs. Comparisons to other eukaryotic cytosol 5S ribosomal RNA sequences indicate that the four major eukaryotic kingdoms (animals, plants, fungi, and protists) are about equally remote from each other, and that the latter kingdom may be the most internally diverse.
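    The GAUU tetranucleotide mentioned above can be located in the published sequence with a simple search. The sketch below copies the sequence from the abstract (without the 3'-OH marker) and reports 0-based start positions; the function name is illustrative. A plain string search cannot tell which occurrence of a motif sits at the conserved site, which would require aligning the sequence against other 5S RNAs.

```python
import re

# Sequence as given in the abstract, split into two 60-base halves.
CRITHIDIA_5S = (
    "GAGUACGACCAUACUUGAGUGAAAACACCAUAUCCCGUCCGAUUUGUGAAGUUAAGCACC"
    "CACAGGCUUAGUUAGUACUGAGGUCAGUGAUGACUCGGGAACCCUGAGUGCCGUACUCCC"
)


def find_motif(seq: str, motif: str) -> list[int]:
    """Return 0-based start positions of all (possibly overlapping) matches."""
    pattern = f"(?={re.escape(motif)})"  # lookahead so overlaps are found
    return [m.start() for m in re.finditer(pattern, seq)]


print(find_motif(CRITHIDIA_5S, "GAUU"))  # prints [40]
```

    In 1-based numbering this places GAUU at residues 41-44 of the sequence as printed.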

  13. Nucleotide sequence variation of chitin synthase genes among ectomycorrhizal fungi and its potential use in taxonomy.

    PubMed Central

    Mehmann, B; Brunner, I; Braus, G H

    1994-01-01

    DNA sequences of single-copy genes coding for chitin synthases (UDP-N-acetyl-D-glucosamine:chitin 4-beta-N-acetylglucosaminyltransferase; EC 2.4.1.16) were used to characterize ectomycorrhizal fungi. Degenerate primers deduced from short, completely conserved amino acid stretches flanking a region of about 200 amino acids of zymogenic chitin synthases allowed the amplification of DNA fragments of several members of this gene family. Different DNA band patterns were obtained from basidiomycetes because of variation in the number and length of amplified fragments. Cloning and sequencing of the most prominent DNA fragments revealed that these differences were due to various introns at conserved positions. The presence of introns in basidiomycetous fungi therefore has a potential use in identification of genera by analyzing PCR-generated DNA fragment patterns. Analyses of the nucleotide sequences of cloned fragments revealed variations in nucleotide sequences from 4 to 45%. By comparison of the deduced amino acid sequences, the majority of the DNA fragments were identified as members of genes for chitin synthase class II. The deduced amino acid sequences from species of the same genus differed only in one amino acid residue, whereas identity between the amino acid sequences of ascomycetous and basidiomycetous fungi within the same taxonomic class was found to be approximately 43 to 66%. Phylogenetic analysis of the amino acid sequence of class II chitin synthase-encoding gene fragments by using parsimony confirmed the current taxonomic groupings. In addition, our data revealed a fourth class of putative zymogenic chitin synthesis. PMID:7944356

  14. The mouse collagen X gene: complete nucleotide sequence, exon structure and expression pattern.

    PubMed Central

    Elima, K; Eerola, I; Rosati, R; Metsäranta, M; Garofalo, S; Perälä, M; De Crombrugghe, B; Vuorio, E

    1993-01-01

    Overlapping genomic clones covering the 7.2 kb mouse alpha 1(X) collagen gene, 0.86 kb of promoter and 1.25 kb of 3'-flanking sequences were isolated from two genomic libraries and characterized by nucleotide sequencing. Typical features of the gene include a unique three-exon structure, similar to that in the chick gene, with the entire triple-helical domain of 463 amino acids coded by a single large exon. The highest degree of amino acid and nucleotide sequence conservation was seen in the coding region for the collagenous and C-terminal non-collagenous domains between the mouse and known chick, bovine and human collagen type X sequences. More divergence between the sequences occurred in the N-terminal non-collagenous domain. Similarity between the mammalian collagen X sequences extended into the 3'-untranslated sequence, particularly near the polyadenylation site. The promoter of the mouse collagen X gene was found to contain two TATAA boxes 159 bp apart; primer extension analyses of the transcription start site revealed that both were functional. The promoter has an unusual structure with a very low G + C content of 28% between positions -220 and -1 of the upstream transcription start site. Northern and in situ hybridization analyses confirmed that the expression of the alpha 1(X) collagen gene is restricted to hypertrophic chondrocytes in tissues undergoing endochondral calcification. The detailed sequence information of the gene is useful for studies on the promoter activity of the gene and for generation of transgenic mice. PMID:8424763

  15. PEG-Labeled Nucleotides and Nanopore Detection for Single Molecule DNA Sequencing by Synthesis

    PubMed Central

    Kumar, Shiv; Tao, Chuanjuan; Chien, Minchen; Hellner, Brittney; Balijepalli, Arvind; Robertson, Joseph W. F.; Li, Zengmin; Russo, James J.; Reiner, Joseph E.; Kasianowicz, John J.; Ju, Jingyue

    2012-01-01

    We describe a novel single molecule nanopore-based sequencing by synthesis (Nano-SBS) strategy that can accurately distinguish four bases by detecting 4 different sized tags released from 5′-phosphate-modified nucleotides. The basic principle is as follows. As each nucleotide is incorporated into the growing DNA strand during the polymerase reaction, its tag is released and enters a nanopore in release order. This produces a unique ionic current blockade signature due to the tag's distinct chemical structure, thereby determining DNA sequence electronically at single molecule level with single base resolution. As proof of principle, we attached four different length PEG-coumarin tags to the terminal phosphate of 2′-deoxyguanosine-5′-tetraphosphate. We demonstrate efficient, accurate incorporation of the nucleotide analogs during the polymerase reaction, and excellent discrimination among the four tags based on nanopore ionic currents. This approach coupled with polymerase attached to the nanopores in an array format should yield a single-molecule electronic Nano-SBS platform. PMID:23002425

  16. Mouse Mammary Tumor Virus-Like Nucleotide Sequences in Canine and Feline Mammary Tumors

    PubMed Central

    Hsu, Wei-Li; Lin, Hsing-Yi; Chiou, Shyan-Song; Chang, Chao-Chin; Wang, Szu-Pong; Lin, Kuan-Hsun; Chulakasian, Songkhla; Wong, Min-Liang; Chang, Shih-Chieh

    2010-01-01

    Mouse mammary tumor virus (MMTV) has been speculated to be involved in human breast cancer. Companion animals such as dogs and cats, which have close contact with humans, may contribute to the transmission of MMTV between mouse and human. The aim of this study was to detect MMTV-like nucleotide sequences in canine and feline mammary tumors by nested PCR. Results showed that MMTV-like env and LTR sequences were present in 3.49% (3/86) and 18.60% (16/86) of canine malignant mammary tumors, respectively. For feline malignant mammary tumors, both env and LTR sequences were found in 22.22% (2/9). Nevertheless, the MMTV-like LTR and env sequences also were detected in normal mammary glands of dogs and cats. In comparisons of the MMTV-like DNA sequences of our findings to those of NIH 3T3 (MMTV-positive murine cell line) and human breast cancer cells, the sequence similarities ranged from 94 to 98%. Phylogenetic analysis revealed intermixing among sequences identified from tissues of different hosts (mouse, dog, cat, and human), indicating that MMTV-like DNA exists in these hosts. Moreover, the env transcript was detected in 1 of the 19 MMTV-positive samples by reverse transcription-PCR. Taken together, our study provides evidence for the existence and expression of MMTV-like sequences in neoplastic and normal mammary glands of dogs and cats. PMID:20881168

  17. Cloning, nucleotide sequence, and expression of the Pasteurella haemolytica A1 glycoprotease gene.

    PubMed Central

    Abdullah, K M; Lo, R Y; Mellors, A

    1991-01-01

    Pasteurella haemolytica serotype A1 secretes a glycoprotease which is specific for O-sialoglycoproteins such as glycophorin A. The gene encoding the glycoprotease enzyme has been cloned in the recombinant plasmid pPH1, and its nucleotide sequence has been determined. The gene (designated gcp) codes for a protein of 35.2 kDa, and an active enzyme protein of this molecular mass can be observed in Escherichia coli clones carrying pPH1. In vivo labeling of plasmid-encoded proteins in E. coli maxicells demonstrated the expression of a 35-kDa protein from pPH1. The amino-terminal sequence of the heterologously expressed protein corresponds to that predicted from the nucleotide sequence. The glycoprotease is a neutral metalloprotease, and the predicted amino acid sequence of the glycoprotease contains a putative zinc-binding site. The gene shows no significant homology with the genes for other proteases of procaryotic or eucaryotic origin. However, there is substantial homology between gcp and an E. coli gene, orfX, whose product is believed to function in the regulation of macromolecule biosynthesis. PMID:1885539

  18. Cloning and nucleotide sequence of the Salmonella typhimurium LT2 metF gene and its homology with the corresponding sequence of Escherichia coli.

    PubMed

    Stauffer, G V; Stauffer, L T

    1988-05-01

    The Salmonella typhimurium LT2 metF gene, encoding 5,10-methylenetetrahydrofolate reductase, has been cloned. Strains with multicopy plasmids carrying the metF gene overproduce the enzyme 44-fold. The nucleotide sequence of the metF gene was determined, and an open reading frame of 888 nucleotides was identified. The polypeptide deduced from the DNA sequence contains 296 amino acids and has a molecular weight of 33,135 daltons. Mung bean nuclease mapping experiments located the transcription start point and possible transcription termination region for the gene. There is a 25 bp nucleotide sequence between the translation termination site and the possible transcription termination region. This region possesses a GC-rich sequence that could form a stable stem and loop structure once transcribed (delta G = -9 kcal/mol), followed by an AT-rich sequence, both of which are characteristic of rho-independent transcription terminators. The nucleotide and deduced amino acid sequences of the S. typhimurium metF gene are compared with the corresponding sequences of the Escherichia coli metF gene. The nucleotide sequences show 85% homology. Most of the nucleotide differences found do not alter the amino acid sequences, which show 95% homology. The results also show that a change has occurred in the metF region of the S. typhimurium chromosome as compared to the E. coli chromosome.
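
    Two pieces of arithmetic underlie this abstract: an 888-nucleotide open reading frame corresponds to 888 / 3 = 296 codons, and nucleotide differences that fall on synonymous codons leave the protein unchanged, which is why the amino acid figure (95%) can exceed the nucleotide figure (85%). The sketch below illustrates both points in Python. The function names, the partial codon table, and the two nine-base sequences are assumptions made for illustration only; they are not the metF sequences, and "percent identity" here means identical positions in a pre-aligned pair, which is what the abstract reports as "homology".

```python
# Partial codon table: only the codons used in the toy example below
# (standard genetic code; a real analysis would use the full table).
CODON_TABLE = {"ATG": "M", "CTG": "L", "CTA": "L", "AAA": "K", "AAG": "K"}


def codon_count(orf_length_nt: int) -> int:
    """Number of codons in an ORF whose length is a multiple of three."""
    if orf_length_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_length_nt // 3


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon using the partial table above."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions in two equal-length, pre-aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


print(codon_count(888))  # codons in an 888-nucleotide ORF

# Hypothetical toy sequences (NOT the S. typhimurium or E. coli metF genes).
seq_a = "ATGCTGAAA"
seq_b = "ATGCTAAAG"  # two synonymous third-position changes
print(round(percent_identity(seq_a, seq_b), 1))                # nucleotide level
print(percent_identity(translate(seq_a), translate(seq_b)))    # amino acid level
```

    In the toy pair, both differences sit in third codon positions and encode the same amino acids, so the protein-level identity is higher than the nucleotide-level identity, mirroring the pattern the abstract describes.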

  19. Nucleotide sequence and revised map location of the arn gene from bacteriophage T4.

    PubMed

    Kim, B C; Kim, K; Park, E H; Lim, C J

    1997-10-31

    Non-glucosylated (Glu-) T-even phage DNAs are restricted by Escherichia coli RglA and RglB endonucleases with different specificities. RglB endonuclease activity is strongly inhibited by anti-restriction endonuclease (Arn) encoded by the bacteriophage T4 genome. The nucleotide sequence of the arn gene encoding Arn was determined. The product of the cloned arn gene was overexpressed by the T7 RNA polymerase/promoter system, and its molecular size is consistent with that predicted from the open reading frame of the arn gene. The arn gene is located between the asiA gene and motA gene in the region of 161,300-161,578 nucleotides.

  20. Nucleotide sequence of a cloned duck hepatitis B virus genome: comparison with woodchuck and human hepatitis B virus sequences.

    PubMed Central

    Mandart, E; Kay, A; Galibert, F

    1984-01-01

    The nucleotide sequence of an EcoRI duck hepatitis B virus (DHBV) clone was elucidated by using the Maxam and Gilbert method. This sequence, which is 3,021 nucleotides long, was compared with the two previously analyzed hepatitis B-like viruses (human and woodchuck). From this comparison, it was shown that DHBV is derived from an ancestor common to the two others but has a slightly different genomic organization. There was no intergenic region between genes 5 and 8, which were fused into a single open reading frame in DHBV. Genes for the surface and core proteins were assigned to open reading frames 7 and 5/8. Amino acid comparisons showed some structural relationship between gene 6 product and avian reverse transcriptase, suggesting either evolution from a common ancestor or convergence to some particular structure to fulfill a specific function. This should be correlated with the synthesis of an RNA intermediate during DNA replication. This is also taken as an argument in favor of the hypothesis that gene 6 codes for the DNA polymerase that is found within the virion. DNA sequence comparison also showed that the two mammalian hepatitis B viruses are more homologous to each other than they are to DHBV, indicating that DHBV starts to evolve on its own earlier than the two other viruses, as do birds compared with mammals. From this it is proposed that the viruses evolved in a fashion parallel to the species they infect. PMID:6699938

  1. Single nucleotide polymorphisms from Theobroma cacao expressed sequence tags associated with witches' broom disease in cacao.

    PubMed

    Lima, L S; Gramacho, K P; Carels, N; Novais, R; Gaiotto, F A; Lopes, U V; Gesteira, A S; Zaidan, H A; Cascardo, J C M; Pires, J L; Micheli, F

    2009-07-14

    In order to increase the efficiency of cacao tree resistance to witches' broom disease, which is caused by Moniliophthora perniciosa (Tricholomataceae), we looked for molecular markers that could help in the selection of resistant cacao genotypes. Among the different markers useful for developing marker-assisted selection, single nucleotide polymorphisms (SNPs) constitute the most common type of sequence difference between alleles and can be easily detected by in silico analysis from expressed sequence tag libraries. We report the first detection and analysis of SNPs from cacao-M. perniciosa interaction expressed sequence tags, using bioinformatics. Selection based on analysis of these SNPs should be useful for developing cacao varieties resistant to this devastating disease.

  2. Conservation of nucleotide sequences for molecular diagnosis of Middle East respiratory syndrome coronavirus, 2015.

    PubMed

    Furuse, Yuki; Okamoto, Michiko; Oshitani, Hitoshi

    2015-11-01

    Infection due to the Middle East respiratory syndrome coronavirus (MERS-CoV) is widespread. The present study was performed to assess the protocols used for the molecular diagnosis of MERS-CoV by analyzing the nucleotide sequences of viruses detected between 2012 and 2015, including sequences from the large outbreak in eastern Asia in 2015. Although the diagnostic protocols were established only 2 years ago, mismatches between the sequences of primers/probes and viruses were found for several of the assays. Such mismatches could lead to a lower sensitivity of the assay, thereby leading to false-negative diagnosis. A slight modification in the primer design is suggested. Protocols for the molecular diagnosis of viral infections should be reviewed regularly after they are established, particularly for viruses that pose a great threat to public health such as MERS-CoV.
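
    The kind of check described here, comparing a primer or probe against the corresponding site in each viral genome, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `mismatch_positions` is hypothetical, the sequences are invented and are not MERS-CoV primers or genome sequences, and the comparison assumes the primer and target site are already aligned, in the same orientation, and of equal length.

```python
def mismatch_positions(primer: str, target_site: str) -> list[int]:
    """Return 0-based positions where an aligned primer and target site differ."""
    if len(primer) != len(target_site):
        raise ValueError("primer and target site must have the same length")
    pairs = zip(primer.upper(), target_site.upper())
    return [i for i, (p, t) in enumerate(pairs) if p != t]


# Invented sequences for illustration only (NOT real diagnostic primers).
primer = "ACGTTGCAAGCTTGACCA"
target_site = "ACGTTGCAAGCTTGATCA"

positions = mismatch_positions(primer, target_site)
for pos in positions:
    # Distance from the 3' end is reported because the position of a
    # mismatch within the primer is commonly considered when judging impact.
    print(f"mismatch at position {pos}, {len(primer) - 1 - pos} bases from the 3' end")
```

    Running such a comparison across all available genome sequences, and repeating it as new sequences are deposited, is one way to implement the regular review of protocols that the abstract recommends.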

  3. Nucleotide-sequence of a canine oral papillomavirus containing a long noncoding region.

    PubMed

    Isegawa, N; Ohta, M; Shirasawa, H; Tokita, H; Yamaura, A; Simizu, B

    1995-07-01

    The DNA genome of a canine oral papillomavirus (COPV) was completely sequenced and found to consist of 8607 base pairs, making it the longest of all known papillomavirus (PV) genomes. Its organization was similar to that of other PVs except that it lacked early gene 5 (E5) and possessed a unique long noncoding region (L-NCR) between the end of the early genes and the beginning of the late genes. COPV also possessed a short noncoding region (S-NCR) which contained a putative upper regulatory region (URR), which is commonly found in PVs. The L-NCR did not show any similarity to known PV DNAs or to other DNA sequences in the GenBank database. Nucleotide sequence analysis of COPV showed that it was closely related to human papillomavirus type 1 (HPV 1) and animal PVs associated with cutaneous lesions in rabbit, European elk, deer and cow, as we reported previously. PMID:21552821

  4. Comparisons of the Distribution of Nucleotides and Common Sequences in Deoxyribonucleic Acid from Selected Bacteriophages

    PubMed Central

    Skalka, A.; Hanson, P.

    1972-01-01

    Results from comparisons of deoxyribonucleic acid (DNA) from several classes of bacteriophages suggest that most phage chromosomes contain either a homogeneous distribution of nucleotides or are made up of a few, rather large segments of different guanine plus cytosine (G + C) contents which are internally homogeneous. Among those temperate phages tested, most contained segmented DNA. Comparisons of sequence similarities among segments from lambdoid phage DNA species revealed the following order in relatedness to λ: 82 (and 434) > 21 > 424 > φ80. Most common sequences are found in the highest G + C segments, which in λ contain head and tail genes. Hybridization tests with λ and 186 or P2 DNA species verified that the lambdoids and 186 and P2 belong to two distinct groups. There are fewer homologous sequences between the DNA species of coliphages λ and P2 or 186 than there are between the DNA species of coliphage λ and salmonella phage P22. PMID:4553679

  5. Nucleotide sequence of a satellite RNA associated with carrot motley dwarf in parsley and carrot.

    PubMed

    Menzel, Wulf; Maiss, Edgar; Vetten, H Josef

    2009-02-01

    Carrot motley dwarf (CMD) is known to result from a mixed infection by two viruses, the polerovirus Carrot red leaf virus and one of the umbraviruses Carrot mottle mimic virus or Carrot mottle virus. Some umbraviruses have been shown to be associated with small satellite (sat) RNAs, but none have been reported for the latter two. A CMD-affected parsley plant was used for sap transmission to test plants, which were used for dsRNA isolation. The presence of a 0.8-kbp dsRNA indicated the occurrence of a hitherto unrecognized satRNA associated with CMD. The satRNAs of the CMD isolate from parsley and an isolate from carrot have been sequenced and showed 94% sequence identity. Nucleotide sequences and putative translation products had no significant similarities to GenBank entries. To our knowledge, this is the first report of satRNAs associated with CMD.

  6. Characterization and partial nucleotide sequence of endogenous type C retrovirus segments in human chromosomal DNA.

    PubMed Central

    Repaske, R; O'Neill, R R; Steele, P E; Martin, M A

    1983-01-01

    Twenty-six different murine leukemia virus (MuLV)-related clones have been isolated from a human DNA library and characterized by restriction enzyme mapping and reciprocal nucleic acid hybridization reactions. The sequence of approximately 2,600 nucleotides, spanning more than 4.0 kilobases, of one of the MuLV-related cloned human DNAs was also determined. The deduced amino acid sequence permitted the alignment of this prototype cloned human DNA segment with the p12 gag, p30 gag, p10 gag, and pol regions of Moloney MuLV. A majority of the endogenous type C retrovirus-related segments present in human DNA are approximately 6.0 kilobases in size and appear to contain a deletion of env sequences. PMID:6298769

  7. Nucleotide sequence analysis of the L gene of Newcastle disease virus: homologies with Sendai and vesicular stomatitis viruses.

    PubMed Central

    Yusoff, K; Millar, N S; Chambers, P; Emmerson, P T

    1987-01-01

    The nucleotide sequence of the L gene of the Beaudette C strain of Newcastle disease virus (NDV) has been determined. The L gene is 6704 nucleotides long and encodes a protein of 2204 amino acids with a calculated molecular weight of 248822. Mung bean nuclease mapping of the 5' terminus of the L gene mRNA indicates that the transcription of the L gene is initiated 11 nucleotides upstream of the translational start site. Comparison with the amino acid sequences of the L genes of Sendai virus and vesicular stomatitis virus (VSV) suggests that there are several regions of homology between the sequences. These data provide further evidence for an evolutionary relationship between the Paramyxoviridae and the Rhabdoviridae. A non-coding sequence of 46 nucleotides downstream of the presumed polyadenylation site of the L gene may be part of a negative strand leader RNA. PMID:3035486

  8. Developing Single Nucleotide Polymorphism (SNP) markers from transcriptome sequences for the identification of longan (Dimocarpus longan) germplasm

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Longan (Dimocarpus longan Lour.) is an important tropical fruit tree crop. Accurate varietal identification is essential for germplasm management and breeding. Using longan transcriptome sequences from public databases, we developed single nucleotide polymorphism (SNP) markers; validated 60 SNPs in...

  9. Nucleotide sequences of the 3' terminal region of onion yellow dwarf virus isolates from Allium plants in Japan.

    PubMed

    Tsuneyoshi, T; Ikeda, Y; Sumi, S

    1997-01-01

    The 2032 nucleotide sequence of the 3' terminal region of onion yellow dwarf virus (OYDV) isolated from Allium wakegi, bearing the genes for viral coat protein (CP) and a truncated RNA-dependent RNA polymerase, has been determined. Respective homologies of the nucleotide sequence in the corresponding region and the deduced amino acid sequence of CP with the equivalents of leek yellow stripe virus (LYSV) from garlic were 68.0 and 59.3%. Variation in the nucleotide sequence is concentrated in the boundary region between the putative RNA-dependent RNA polymerase gene and the CP gene as well as in the 3' noncoding region. These sequence divergences, including the deletion of 79 nucleotides, resulted both in alterations to the amino acid sequence and the absence of 28 amino acid residues in the amino terminal region of OYDV CP in comparison with LYSV CP. In addition, the length of the 3' noncoding sequence of OYDV was one-third that of LYSV. Comparison of the 3' terminal 1197-nucleotide sequence of OYDV with sequences of the respective cDNAs cloned by RT-PCR directly from the total RNA of infected Allium plants that included two varieties of A. fistulosum, "Wakenegi" and "Shimonita-negi", and A. chinense, showed 90.7% overall identity, even though they have long been cultivated in locally restricted areas in Japan. These findings appear to suggest that a single strain of OYDV invaded Japanese Allium plants long ago and spread throughout them. PMID:9354273

  10. Nucleotide sequences derived from pheasant DNA in the genome of recombinant avian leukosis viruses with subgroup F specificity.

    PubMed

    Keshet, E; Temin, H M

    1977-11-01

    Recombination between viral and cellular genes can give rise to new strains of retroviruses. For example, Rous-associated virus 61 (RAV-61) is a recombinant between the Bryan high-titer strain of Rous sarcoma virus (RSV) and normal pheasant DNA. Nucleic acid hybridization techniques were used to study the genome of RAV-61 and another RAV with subgroup F specificity (RAV-F) obtained by passage of RSV-RAV-0 in cells from a ring-necked pheasant embryo. The nucleotide sequences acquired by these two independent isolates of RAV-F that were not shared with the parental virus comprised 20 to 25% of the RAV-F genomes and were indistinguishable by nucleic acid hybridization. (In addition, RAV-F genomes had another set of nucleotide sequences that were homologous to some pheasant nucleotide sequences and also were present in the parental viruses.) A specific complementary DNA, containing only nucleotide sequences complementary to those acquired by RAV-61 through recombination, was prepared. These nucleotide sequences were pheasant derived and were not present in the genomes of reticuloendotheliosis viruses, pheasant viruses, and avian leukosis-sarcoma viruses of subgroups A, B, C, D, and E. They were partially endogenous, however, to avian DNA other than pheasant. The fraction of these nucleotide sequences present in other avian DNAs generally paralleled the genetic relatedness of these avian species to pheasants. However, there was a high degree of homology between these pheasant nucleotide sequences and related nucleotide sequences in the DNA of normal chickens as indicated by the identical melting profiles of the respective hybrids.

  11. Nucleotide sequence of the Shiga-like toxin genes of Escherichia coli.

    PubMed Central

    Calderwood, S B; Auclair, F; Donohue-Rolfe, A; Keusch, G T; Mekalanos, J J

    1987-01-01

    We have determined the nucleotide sequence of the sltA and sltB genes that encode the Shiga-like toxin (SLT) produced by Escherichia coli phage H19B. The amino acid composition of the A and B subunits of SLT is very similar to that previously established for Shiga toxin from Shigella dysenteriae 1, and the deduced amino acid sequence of the B subunit of SLT is identical with that reported for the B subunit of Shiga toxin. The genes for the A and B subunits of SLT apparently constitute an operon, with only 12 nucleotides separating the coding regions. There is a 21-base-pair region of dyad symmetry overlapping the proposed promoter of the slt operon that may be involved in regulation of SLT production by iron. The peptide sequence of the A subunit of SLT is homologous to the A subunit of the plant toxin ricin, providing evidence for the hypothesis that certain prokaryotic toxins may be evolutionarily related to eukaryotic enzymes. PMID:3299365

  12. Genomic DNA enrichment using sequence capture microarrays: a novel approach to discover sequence nucleotide polymorphisms (SNP) in Brassica napus L.

    PubMed

    Clarke, Wayne E; Parkin, Isobel A; Gajardo, Humberto A; Gerhardt, Daniel J; Higgins, Erin; Sidebottom, Christine; Sharpe, Andrew G; Snowdon, Rod J; Federico, Maria L; Iniguez-Luy, Federico L

    2013-01-01

    Targeted genomic selection methodologies, or sequence capture, allow for DNA enrichment and large-scale resequencing and characterization of natural genetic variation in species with complex genomes, such as rapeseed canola (Brassica napus L., AACC, 2n=38). The main goal of this project was to combine sequence capture with next generation sequencing (NGS) to discover single nucleotide polymorphisms (SNPs) in specific areas of the B. napus genome historically associated (via quantitative trait loci -QTL- analysis) to traits of agronomical and nutritional importance. A 2.1 million feature sequence capture platform was designed to interrogate DNA sequence variation across 47 specific genomic regions, representing 51.2 Mb of the Brassica A and C genomes, in ten diverse rapeseed genotypes. All ten genotypes were sequenced using the 454 Life Sciences chemistry and to assess the effect of increased sequence depth, two genotypes were also sequenced using Illumina HiSeq chemistry. As a result, 589,367 potentially useful SNPs were identified. Analysis of sequence coverage indicated a four-fold increased representation of target regions, with 57% of the filtered SNPs falling within these regions. Sixty percent of discovered SNPs corresponded to transitions while 40% were transversions. Interestingly, fifty eight percent of the SNPs were found in genic regions while 42% were found in intergenic regions. Further, a high percentage of genic SNPs was found in exons (65% and 64% for the A and C genomes, respectively). Two different genotyping assays were used to validate the discovered SNPs. Validation rates ranged from 61.5% to 84% of tested SNPs, underpinning the effectiveness of this SNP discovery approach. Most importantly, the discovered SNPs were associated with agronomically important regions of the B. napus genome generating a novel data resource for research and breeding this crop species.
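
    The transition/transversion split reported above follows from a simple rule: a substitution between two purines (A, G) or two pyrimidines (C, T) is a transition, and any purine-pyrimidine substitution is a transversion. The Python sketch below applies that rule to a short list of SNPs. The function name `classify_snp` and the example SNP list are hypothetical and do not come from the study's data.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref: str, alt: str) -> str:
    """Classify a single-base substitution as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError("ref and alt must be two different bases from A, C, G, T")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Hypothetical (ref, alt) pairs for illustration; not SNPs from B. napus.
snps = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("G", "T")]
counts = Counter(classify_snp(ref, alt) for ref, alt in snps)
for kind, n in counts.items():
    print(f"{kind}: {n} ({n / len(snps):.0%})")
```

    Tallying the two classes over a full SNP set in this way yields percentages of the kind quoted in the abstract (60% transitions, 40% transversions).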

  13. Nucleotide Sequences and Modifications That Determine RIG-I/RNA Binding and Signaling Activities

    PubMed Central

    Uzri, Dina; Gehrke, Lee

    2009-01-01

    Cytoplasmic viral RNAs with 5′ triphosphates (5′ppp) are detected by the RNA helicase RIG-I, initiating downstream signaling and alpha/beta interferon (IFN-α/β) expression that establish an antiviral state. We demonstrate here that the hepatitis C virus (HCV) 3′ untranslated region (UTR) RNA has greater activity as an immune stimulator than several flavivirus UTR RNAs. We confirmed that the HCV 3′-UTR poly(U/UC) region is the determinant for robust activation of RIG-I-mediated innate immune signaling and that its antisense sequence, poly(AG/A), is an equivalent RIG-I activator. The poly(U/UC) region of the fulminant HCV JFH-1 strain was a relatively weak activator, while the antisense JFH-1 strain poly(AG/A) RNA was very potent. Poly(U/UC) activity does not require primary nucleotide sequence adjacency to the 5′ppp, suggesting that RIG-I recognizes two independent RNA domains. Whereas poly(U) 50-nt or poly(A) 50-nt sequences were minimally active, inserting a single C or G nucleotide, respectively, into these RNAs increased IFN-β expression. Poly(U/UC) RNAs transcribed in vitro using modified uridine 2′ fluoro or pseudouridine ribonucleotides lacked signaling activity while functioning as competitive inhibitors of RIG-I binding and IFN-β expression. Nucleotide base and ribose modifications that convert activator RNAs into competitive inhibitors of RIG-I signaling may be useful as modulators of RIG-I-mediated innate immune responses and as tools to dissect the RNA binding and conformational events associated with signaling. PMID:19224987

  14. The nucleotide sequences of some large ribonuclease T1 products from bacteriophage R17 ribonucleic acid

    PubMed Central

    Jeppesen, Peter G. N.

    1971-01-01

    A method of 'fingerprinting' high-molecular-weight 32P-labelled RNA species, using a two-dimensional thin-layer-chromatographic separation of ribonuclease T1 digestion products, has been applied to RNA from the Escherichia coli bacteriophage R17. The 'fingerprinting' technique, besides giving a unique pattern that can be used as a characterization of the RNA, has made it possible to isolate a number of the larger oligonucleotides and to determine their nucleotide sequences. PMID:5158505

  15. The Complete Nucleotide Sequence of the Mitochondrial Genome of Bactrocera minax (Diptera: Tephritidae)

    PubMed Central

    Zhang, Bin; Nardi, Francesco; Hull-Sanders, Helen; Wan, Xuanwu; Liu, Yinghong

    2014-01-01

    The complete 16,043 bp mitochondrial genome (mitogenome) of Bactrocera minax (Diptera: Tephritidae) has been sequenced. The genome encodes 37 genes usually found in insect mitogenomes. The mitogenome information for B. minax was compared to the homologous sequences of Bactrocera oleae, Bactrocera tryoni, Bactrocera philippinensis, Bactrocera carambolae, Bactrocera papayae, Bactrocera dorsalis, Bactrocera correcta, Bactrocera cucurbitae and Ceratitis capitata. The analysis indicated the structure and organization are typical of, and similar to, the nine closely related species mentioned above, although it contains the lowest genome-wide A+T content (67.3%). Four short intergenic spacers with a high degree of conservation among the nine tephritid species mentioned above and B. minax were observed, which also have clear counterparts in the control regions (CRs). Correlation analysis among these ten tephritid species revealed close positive correlations among the A+T content of zero-fold degenerate sites (P0FD), the ratio of nucleotide substitution frequency at P0FD sites to all degenerate sites (zero-fold degenerate sites, two-fold degenerate sites and four-fold degenerate sites), and amino acid sequence distance (ASD). Further, significant positive correlation was observed between the A+T content of four-fold degenerate sites (P4FD) and the ratio of nucleotide substitution frequency at P4FD sites to all degenerate sites; however, we found significant negative correlation between ASD and the A+T content of P4FD, and the ratio of nucleotide substitution frequency at P4FD sites to all degenerate sites. A higher nucleotide substitution frequency at non-synonymous sites compared to synonymous sites was observed in nad4, the first time this has been observed in an insect mitogenome. A poly(T) stretch at the 5′ end of the CR followed by a [TA(A)]n-like stretch was also found. In addition, a highly conserved G+A-rich sequence block was observed in front of the

  16. The nucleotide sequence of glutamate tRNA4 of Drosophila melanogaster.

    PubMed Central

    Altwegg, M; Kubli, E

    1980-01-01

    The nucleotide sequence of Drosophila melanogaster glutamate tRNA4 was determined to be: pU-C-C-C-A-U-A-U-G-G-U-C-psi-A-G-D-G-G-C-D-A-G-G-A-U-A-U-C-U-G-G-C (m) -U-U-U-C-A-C-C-A-G-A-A-G-G-C-C-C-G-G-G-T-psi-U-C-G-A-U-U-C-C-C-G-G-U-A-U-G-G-G-A-A-C-C-AOH. A partial modified C is found at position 32 in the anticodon loop. PMID:6775307

  17. Nucleotide sequence and organization of copper resistance genes from Pseudomonas syringae pv. tomato

    SciTech Connect

    Mellano, M.A.; Cooksey, D.A.

    1988-06-01

    The nucleotide sequence of a 4.5-kilobase copper resistance determinant from Pseudomonas syringae pv. tomato revealed four open reading frames (ORFs) in the same orientation. Deletion and site-specific mutational analyses indicated that the first two ORFs were essential for copper resistance; the last two ORFs were required for full resistance, but low-level resistance could be conferred in their absence. Five highly conserved, direct 24-base repeats were found near the beginning of the second ORF, and a similar, but less conserved, repeated region was found in the middle of the first ORF.

  18. Nucleotide deletion and P addition in V(D)J recombination: a determinant role of the coding-end sequence.

    PubMed Central

    Nadel, B; Feeney, A J

    1997-01-01

    During V(D)J recombination, the coding ends to be joined are extensively modified. Those modifications, termed coding-end processing, consist of removal and addition of various numbers of nucleotides. We previously showed in vivo that coding-end processing is specific for each coding end, suggesting that specific motifs in a coding-end sequence influence nucleotide deletion and P-region formation. In this study, we created a panel of recombination substrates containing actual immunoglobulin and T-cell receptor coding-end sequences and dissected the role of each motif by comparing its processing pattern with those of variants containing minimal nucleotide changes from the original sequence. Our results demonstrate the determinant role of specific sequence motifs on coding-end processing and also the importance of the context in which they are found. We show that minimal nucleotide changes in key positions of a coding-end sequence can result in dramatic changes in the processing pattern. We propose that each coding-end sequence dictates a unique hairpin structure, the result of a particular energy conformation between nucleotides organizing the loop and the stem, and that the interplay between this structure and specific sequence motifs influences the frequency and location of nicks which open the coding-end hairpin. These findings indicate that the sequences of the coding ends determine their own processing and have a profound impact on the development of the primary B- and T-cell repertoires. PMID:9199310

  19. Nucleotide sequence of the gene for the b subunit of human factor XIII

    SciTech Connect

    Bottenus, R.E.; Ichinose, A.; Davie, E.W.

    1990-12-01

    Factor XIII (M_r 320 000) is a blood coagulation factor that stabilizes and strengthens the fibrin clot. It circulates in blood as a tetramer composed of two a subunits (M_r 75 000 each) and two b subunits (M_r 80 000 each). The b subunit consists of 641 amino acids and includes 10 tandem repeats of 60 amino acids known as GP-I structures, short consensus repeats (SCR), or sushi domains. In the present study, the human gene for the b subunit has been isolated from three different genomic libraries prepared in λ phage. Fifteen independent phage with inserts coding for the entire gene were isolated and characterized by restriction mapping, Southern blotting, and DNA sequencing. The gene was found to be 28 kilobases in length and consisted of 12 exons (I-XII) separated by 11 intervening sequences. The leader sequence was encoded by exon I, while the carboxyl-terminal region of the protein was encoded by exon XII. Exons II-XI each coded for a single sushi domain, suggesting that the gene evolved through exon shuffling and duplication. The 12 exons in the gene ranged in size from 64 to 222 base pairs, while the introns ranged in size from 87 to 9970 nucleotides and made up 92% of the gene. One nucleotide change was found in the coding region of the gene when its sequence was compared to that of the cDNA. This difference, however, did not result in a change in the amino acid sequence of the protein.

  20. The ChEMBL database as linked open data

    PubMed Central

    2013-01-01

    Background Making data available as Linked Data using Resource Description Framework (RDF) promotes integration with other web resources. RDF documents can natively link to related data, and others can link back using Uniform Resource Identifiers (URIs). RDF makes the data machine-readable and uses extensible vocabularies for additional information, making it easier to scale up inference and data analysis. Results This paper describes recent developments in an ongoing project converting data from the ChEMBL database into RDF triples. Relative to earlier versions, this updated version of ChEMBL-RDF uses recently introduced ontologies, including CHEMINF and CiTO; exposes more information from the database; and is now available as dereferencable, linked data. To demonstrate these new features, we present novel use cases showing further integration with other web resources, including Bio2RDF, Chem2Bio2RDF, and ChemSpider, and showing the use of standard ontologies for querying. Conclusions We have illustrated the advantages of using open standards and ontologies to link the ChEMBL database to other databases. Using those links and the knowledge encoded in standards and ontologies, the ChEMBL-RDF resource creates a foundation for integrated semantic web cheminformatics applications, such as the presented decision support. PMID:23657106

  1. Mining of haplotype-based expressed sequence tag single nucleotide polymorphisms in citrus

    PubMed Central

    2013-01-01

    Background Single nucleotide polymorphisms (SNPs), the most abundant variations in a genome, have been widely used in various studies. Detection and characterization of citrus haplotype-based expressed sequence tag (EST) SNPs will greatly facilitate further utilization of these gene-based resources. Results In this paper, haplotype-based SNPs were mined out of publicly available citrus expressed sequence tags (ESTs) from different citrus cultivars (genotypes) individually and collectively for comparison. There were a total of 567,297 ESTs belonging to 27 cultivars in varying numbers and consequentially yielding different numbers of haplotype-based quality SNPs. Sweet orange (SO) had the most (213,830) ESTs, generating 11,182 quality SNPs in 3,327 out of 4,228 usable contigs. Summed from all the individually mining results, a total of 25,417 quality SNPs were discovered – 15,010 (59.1%) were transitions (AG and CT), 9,114 (35.9%) were transversions (AC, GT, CG, and AT), and 1,293 (5.0%) were insertion/deletions (indels). A vast majority of SNP-containing contigs consisted of only 2 haplotypes, as expected, but the percentages of 2 haplotype contigs varied widely in these citrus cultivars. BLAST of the 25,417 25-mer SNP oligos to the Clementine reference genome scaffolds revealed 2,947 SNPs had “no hits found”, 19,943 had 1 unique hit / alignment, 1,571 had one hit and 2+ alignments per hit, and 956 had 2+ hits and 1+ alignment per hit. Of the total 24,293 scaffold hits, 23,955 (98.6%) were on the main scaffolds 1 to 9, and only 338 were on 87 minor scaffolds. Most alignments had 100% (25/25) or 96% (24/25) nucleotide identities, accounting for 93% of all the alignments. Considering almost all the nucleotide discrepancies in the 24/25 alignments were at the SNP sites, it served well as in silico validation of these SNPs, in addition to and consistent with the rate (81%) validated by sequencing and SNaPshot assay. Conclusions High-quality EST-SNPs from different

  2. Infectious hepatitis B virus from cloned DNA of known nucleotide sequence.

    PubMed Central

    Will, H; Cattaneo, R; Darai, G; Deinhardt, F; Schellekens, H; Schaller, H

    1985-01-01

    The infectivity of cloned hepatitis B viral DNA (HBV) has been tested in chimpanzees to identify a fully functional HBV genome and to assess the risk associated with its handling. Only one of two HBV DNA sequence variants tested was shown to be infectious. "Clone purified" virus of predicted nucleotide sequence was produced from the infectious HBV DNA, and the cloned viral genome was identical in structure with naturally occurring HBV. Infection could be initiated independent of whether circular monomeric or plasmid integrated dimeric forms of the viral genome were inoculated, but the infectivity of the DNA depended on liver cell transfection or intrahepatic injection. Intravenous injection of high doses of infectious HBV DNA did not induce hepatitis, suggesting that there is virtually no risk associated with routine laboratory handling of cloned HBV DNA. PMID:2983320

  3. The uteroglobin gene region: hormonal regulation, repetitive elements and complete nucleotide sequence of the gene.

    PubMed Central

    Suske, G; Wenz, M; Cato, A C; Beato, M

    1983-01-01

    Differential uteroglobin induction represents an appropriate model for the molecular analysis of the mechanism by which steroid hormones control gene expression in mammals. We have analyzed the structure and hormonal regulation of a 35 kb region of genomic DNA in which the uteroglobin gene is located. The complete sequence of 3,700 nucleotides including the uteroglobin gene and its flanking regions has been determined, and the limits of the gene established by S1 nuclease mapping. Several regions containing repeated sequences were mapped by blot hybridization, one of which is located within the large intron in the uteroglobin gene. Analysis of the RNAs extracted from endometrium, lung and liver after treatment with estrogen and/or progesterone shows that within the 35 kb region, the uteroglobin gene is the only DNA segment whose transcription into stable RNA is induced by progesterone. PMID:6304644

  4. Using mitochondrial nucleotide sequences to investigate diversity and genealogical relationships within common carp (Cyprinus carpio L.).

    PubMed

    Thai, B T; Burridge, C P; Pham, T A; Austin, C M

    2005-02-01

    Direct sequencing of mitochondrial DNA (mtDNA) D-loop (745 bp) and MTATPase6/MTATPase8 (857 bp) regions was used to investigate genetic variation within common carp and develop a global genealogy of common carp strains. The D-loop region was more variable than the MTATPase6/MTATPase8 region, but given the wide distribution of carp the overall levels of sequence divergence were low. Levels of haplotype diversity varied widely among countries with Chinese, Indonesian and Vietnamese carp showing the greatest diversity whereas Japanese Koi and European carp had undetectable nucleotide variation. A genealogical analysis supports a close relationship between Vietnamese, Koi and Chinese Color carp strains and to a lesser extent, European carp. Chinese and Indonesian carp strains were the most divergent, and their relationships do not support the evolution of independent Asian and European lineages and current taxonomic treatments.

  5. Nucleotide sequence of nifD from Frankia alni strain ArI3: phylogenetic inferences.

    PubMed

    Normand, P; Gouy, M; Cournoyer, B; Simonet, P

    1992-05-01

    The complete nucleotide sequence of the nifD gene encoding the alpha subunit of component I of nitrogenase from Frankia alni strain ArI3 was determined. The coding region is 1,458 bp in length and encodes a polypeptide of 486 residues with a predicted molecular weight of 53,500. Phylogenetic inferences with 12 complete published nifD sequences were drawn using a variety of approaches. Frankia nifD clusters with proteobacteria rather than with Clostridium pasteurianum, the other Gram-positive bacterium studied. Extant eubacterial nif genes seem to have at least three distinct evolutionary origins as a result of ancient gene duplications. Within the Gram-positive bacterial phylum, functional nif genes descend from different duplicates. PMID:1584016

  6. Nucleotide sequence alignment of hdcA from Gram-positive bacteria.

    PubMed

    Diaz, Maria; Ladero, Victor; Redruello, Begoña; Sanchez-Llana, Esther; Del Rio, Beatriz; Fernandez, Maria; Martin, Maria Cruz; Alvarez, Miguel A

    2016-03-01

    The decarboxylation of histidine, carried out mainly by some Gram-positive bacteria, yields the toxic dietary biogenic amine histamine (Ladero et al. 2010 〈10.2174/157340110791233256〉 [1], Linares et al. 2016 〈http://dx.doi.org/10.1016/j.foodchem.2015.11.013〉 [2]). The reaction is catalyzed by a pyruvoyl-dependent histidine decarboxylase (Linares et al. 2011 〈10.1080/10408398.2011.582813〉 [3]), which is encoded by the gene hdcA. In order to locate conserved regions in the hdcA gene of Gram-positive bacteria, this article provides a nucleotide sequence alignment of all the hdcA sequences from Gram-positive bacteria present in databases. For further utility and discussion, see 〈http://dx.doi.org/10.1016/j.foodcont.2015.11.035〉 [4].

  7. Nucleotide sequence alignment of hdcA from Gram-positive bacteria

    PubMed Central

    Diaz, Maria; Ladero, Victor; Redruello, Begoña; Sanchez-Llana, Esther; del Rio, Beatriz; Fernandez, Maria; Martin, Maria Cruz; Alvarez, Miguel A.

    2016-01-01

    The decarboxylation of histidine, carried out mainly by some Gram-positive bacteria, yields the toxic dietary biogenic amine histamine (Ladero et al. 2010 〈10.2174/157340110791233256〉 [1], Linares et al. 2016 〈http://dx.doi.org/10.1016/j.foodchem.2015.11.013〉 [2]). The reaction is catalyzed by a pyruvoyl-dependent histidine decarboxylase (Linares et al. 2011 〈10.1080/10408398.2011.582813〉 [3]), which is encoded by the gene hdcA. In order to locate conserved regions in the hdcA gene of Gram-positive bacteria, this article provides a nucleotide sequence alignment of all the hdcA sequences from Gram-positive bacteria present in databases. For further utility and discussion, see 〈http://dx.doi.org/10.1016/j.foodcont.2015.11.035〉 [4]. PMID:26958625

  8. The complete nucleotide sequence and genome organization of pea streak virus (genus Carlavirus).

    PubMed

    Su, Li; Li, Zhengnan; Bernardy, Mike; Wiersma, Paul A; Cheng, Zhihui; Xiang, Yu

    2015-10-01

    Pea streak virus (PeSV) is a member of the genus Carlavirus in the family Betaflexiviridae. Here, the first complete genome sequence of PeSV was determined by deep sequencing of a cDNA library constructed from dsRNA extracted from a PeSV-infected sample and Rapid Amplification of cDNA Ends (RACE) PCR. The PeSV genome consists of 8041 nucleotides excluding the poly(A) tail and contains six open reading frames (ORFs). The putative peptide encoded by the PeSV ORF6 has an estimated molecular mass of 6.6 kDa and shows no similarity to any known proteins. This differs from typical carlaviruses, whose ORF6 encodes a 12- to 18-kDa cysteine-rich nucleic-acid-binding protein.

  9. Unique nucleotide sequence-guided assembly of repetitive DNA parts for synthetic biology applications

    SciTech Connect

    Torella, JP; Lienert, F; Boehm, CR; Chen, JH; Way, JC; Silver, PA

    2014-08-07

    Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts, and they hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies (for example, repeated terminator and insulator sequences) that complicate recombination-based assembly. We and others have recently developed DNA assembly methods, which we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly assembled constructs, or into high-quality combinatorial libraries in only 2-3 d. If the DNA parts must be generated from scratch, an additional 2-5 d are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques.

  10. Unique nucleotide sequence (UNS)-guided assembly of repetitive DNA parts for synthetic biology applications

    PubMed Central

    Torella, Joseph P.; Lienert, Florian; Boehm, Christian R.; Chen, Jan-Hung; Way, Jeffrey C.; Silver, Pamela A.

    2016-01-01

    Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts and hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies — for example repeated terminator and insulator sequences — that complicate recombination-based assembly. We and others have recently developed DNA assembly methods that we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly-assembled constructs, or into high-quality combinatorial libraries in only 2–3 days. If the DNA parts must be generated from scratch, an additional 2–5 days are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques. PMID:25101822

  11. Nucleotide sequence of the DNA packaging and capsid synthesis genes of bacteriophage P2.

    PubMed Central

    Linderoth, N A; Ziermann, R; Haggård-Ljungquist, E; Christie, G E; Calendar, R

    1991-01-01

    Overlapping DNA fragments containing the DNA packaging and capsid synthesis gene region of bacteriophage P2 were cloned and sequenced. In this report we present the complete nucleotide sequence of this 6550 bp region. Each of six open reading frames found in the interval was assigned to one of the essential genes (Q, P, O, N, M and L) by correlating genetic, physical and mutational data with DNA and protein sequence information. Polypeptides predicted were: a capsid completion protein, gpL; the major capsid precursor, gpN; the presumed capsid scaffolding protein, gpO; the ATPase and proposed endonuclease subunits of terminase, gpP and gpM, respectively; and a candidate for the portal protein, gpQ. These gene and protein sequences exhibited no homology to analogous genes or proteins of other bacteriophages. Expression of gene Q in E. coli from a plasmid caused production of a Mr 39,000 Da protein that restored Qam34 growth. This sequence analysis found only genes previously known from analysis of conditional-lethal mutations. No new capsid genes were found. PMID:1837355

  12. Complete nucleotide sequence and genome organization of Pelargonium flower break virus.

    PubMed

    Rico, P; Hernández, C

    2004-03-01

    The complete nucleotide sequence of Pelargonium flower break virus (PFBV) has been determined. The genomic RNA is 3923 nucleotides (nt) long and contains five open reading frames (ORFs). The 5'-proximal ORF encodes a 27 kDa protein (p27) and terminates with an amber codon which may be read through into an in-frame p56 ORF to generate an 86 kDa protein (p86) containing the viral RNA-dependent RNA polymerase motifs. Two small ORFs, located in the central part of the viral genome, encode polypeptides of 7 (p7) and 12 kDa (p12), respectively, which are very likely involved in virus movement. Interestingly, p12 presents a leucine zipper motif that has not been previously reported in related proteins. The 3'-proximal ORF encodes a 37 kDa capsid protein (CP). The p12 ORF is in-frame with the p86 ORF and a double read-through protein of 99 kDa (p99) may be produced. Amino acid sequence comparisons revealed that the proteins encoded by ORFs 2, 3 and 4 are more similar to the corresponding gene products of Carnation mottle virus than to those of other carmoviruses, whereas the p27 and the CP show higher identity with the equivalent proteins of Saguaro cactus virus. Phylogenetic analysis conducted with the different viral products confirmed the assignment of PFBV to the genus Carmovirus. PMID:14991450

  13. Genome-wide association study reveals five nucleotide sequence variants for carcass traits in beef cattle.

    PubMed

    Kim, Y; Ryu, J; Woo, J; Kim, J B; Kim, C Y; Lee, C

    2011-08-01

    Genetic associations of nucleotide sequence variants with carcass traits in beef cattle were investigated using a genome-wide single nucleotide polymorphism (SNP) assay. Three hundred and thirteen Korean cattle were genotyped with the Illumina BovineSNP50 BeadChip, and 39,129 SNPs from 311 animals were analysed for each carcass phenotype after filtering by quality assurance. Five sequence markers were associated with one of the meat quantity or quality traits: rs109593638 on chromosome 3 with marbling score, rs109821175 on chromosome 11 and rs110862496 on chromosome 13 with backfat thickness (BFT), and rs110228023 on chromosome 6 and rs110201414 on chromosome 16 with eye muscle area (EMA) (P < 1.27 × 10^-6, Bonferroni P < 0.05). The ss96319521 SNP, located within a gene with functions of muscle development, dishevelled homolog 1 (DVL1), would be a desirable candidate marker. Individuals with genotype CC at this gene appeared to have both increased EMA and increased carcass weight. Fine-mapping would be required to refine each of the five association signals shown in the current study for future application in marker-assisted selection for genetic improvement of beef quality and quantity.
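
    The significance cutoff quoted in this abstract is consistent with a plain Bonferroni correction over the number of SNPs tested. The sketch below reproduces that arithmetic; the function name is hypothetical, and the assumption that the threshold was obtained as alpha divided by the SNP count is an inference from the reported numbers, not something the abstract states.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test P-value cutoff that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


N_SNPS = 39_129  # SNPs analysed after quality filtering (from the abstract)
ALPHA = 0.05     # family-wise significance level (from the abstract)

threshold = bonferroni_threshold(ALPHA, N_SNPS)
print(f"per-SNP threshold: {threshold:.3g}")
```

    Under these assumptions the cutoff comes out at about 1.28 × 10^-6, which matches the reported 1.27 × 10^-6 if the published figure was truncated rather than rounded.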

  14. Essential nucleotide sequences and secondary structure elements of the hairpin ribozyme.

    PubMed Central

    Berzal-Herranz, A; Joseph, S; Chowrira, B M; Butcher, S E; Burke, J M

    1993-01-01

    In vitro selection experiments have been used to isolate active variants of the 50 nt hairpin catalytic RNA motif following randomization of individual ribozyme domains and intensive mutagenesis of the ribozyme-substrate complex. Active and inactive variants were characterized by sequencing, analysis of RNA cleavage activity in cis and in trans, and by substrate binding studies. Results precisely define base-pairing requirements for ribozyme helices 3 and 4, and identify eight essential nucleotides (G8, A9, A10, G21, A22, A23, A24 and C25) within the catalytic core of the ribozyme. Activity and substrate binding assays show that point mutations at these eight sites eliminate cleavage activity but do not significantly decrease substrate binding, demonstrating that these bases contribute to catalytic function. The mutation U39C has been isolated from different selection experiments as a second-site suppressor of the down mutants G21U and A43G. Assays of the U39C mutation in the wild-type ribozyme and in a variety of mutant backgrounds show that this variant is a general up mutation. Results from selection experiments involving populations totaling more than 10^10 variants are summarized, and consensus sequences including 16 essential nucleotides and a secondary structure model of four short helices, encompassing 18 bp for the ribozyme-substrate complex, are derived. PMID:8508779

  15. Mapping DNA methylation by transverse current sequencing: Reduction of noise from neighboring nucleotides

    NASA Astrophysics Data System (ADS)

    Alvarez, Jose; Massey, Steven; Kalitsov, Alan; Velev, Julian

    Nanopore sequencing via transverse current has emerged as a competitive candidate for mapping DNA methylation without the need for bisulfite treatment, fluorescent tags, or PCR amplification. By eliminating the error-producing amplification step, long read lengths become feasible, which greatly simplifies the assembly process and reduces the time and the cost inherent in current technologies. However, due to the large error rates of nanopore sequencing, single-base resolution has not been reached. A very important source of noise is the intrinsic structural noise in the electric signature of the nucleotide arising from the influence of neighboring nucleotides. In this work we perform calculations of the tunneling current through DNA molecules in nanopores using the non-equilibrium electron transport method within an effective multi-orbital tight-binding model derived from first-principles calculations. We develop a base-calling algorithm accounting for the correlations of the current through neighboring bases, which in principle can reduce the error rate below any desired precision. Using this method we show that we can clearly distinguish DNA methylation and other base modifications based on the reading of the tunneling current.

  16. Evidence for Balancing Selection from Nucleotide Sequence Analyses of Human G6PD

    PubMed Central

    Verrelli, Brian C.; McDonald, John H.; Argyropoulos, George; Destro-Bisol, Giovanni; Froment, Alain; Drousiotou, Anthi; Lefranc, Gerard; Helal, Ahmed N.; Loiselet, Jacques; Tishkoff, Sarah A.

    2002-01-01

    Glucose-6-phosphate dehydrogenase (G6PD) mutations that result in reduced enzyme activity have been implicated in malarial resistance and constitute one of the best examples of selection in the human genome. In the present study, we characterize the nucleotide diversity across a 5.2-kb region of G6PD in a sample of 160 Africans and 56 non-Africans, to determine how selection has shaped patterns of DNA variation at this gene. Our global sample of enzymatically normal B alleles and A, A−, and Med alleles with reduced enzyme activities reveals many previously uncharacterized silent-site polymorphisms. In comparison with the absence of amino acid divergence between human and chimpanzee G6PD sequences, we find that the number of G6PD amino acid polymorphisms in human populations is significantly high. Unlike many other G6PD-activity alleles with reduced activity, we find that the age of the A variant, which is common in Africa, may not be consistent with the recent emergence of severe malaria and therefore may have originally had a historically different adaptive function. Overall, our observations strongly support previous genotype-phenotype association studies that proposed that balancing selection maintains G6PD deficiencies within human populations. The present study demonstrates that nucleotide sequence analyses can reveal signatures of both historical and recent selection in the genome and may elucidate the impact that infectious disease has had during human evolution. PMID:12378426

  17. Haplotype structure and population genetic inferences from nucleotide-sequence variation in human lipoprotein lipase.

    PubMed Central

    Clark, A G; Weiss, K M; Nickerson, D A; Taylor, S L; Buchanan, A; Stengård, J; Salomaa, V; Vartiainen, E; Perola, M; Boerwinkle, E; Sing, C F

    1998-01-01

    Allelic variation in 9.7 kb of genomic DNA sequence from the human lipoprotein lipase gene (LPL) was scored in 71 healthy individuals (142 chromosomes) from three populations: African Americans (24) from Jackson, MS; Finns (24) from North Karelia, Finland; and non-Hispanic Whites (23) from Rochester, MN. The sequences had a total of 88 variable sites, with a nucleotide diversity (site-specific heterozygosity) of 0.002 ± 0.001 across this 9.7-kb region. The frequency spectrum of nucleotide variation exhibited a slight excess of heterozygosity, but, in general, the data fit expectations of the infinite-sites model of mutation and genetic drift. Allele-specific PCR helped resolve linkage phases, and a total of 88 distinct haplotypes were identified. For 1,410 (64%) of the 2,211 site pairs, all four possible gametes were present in these haplotypes, reflecting a rich history of past recombination. Despite the strong evidence for recombination, extensive linkage disequilibrium was observed. The number of haplotypes generally is much greater than the number expected under the infinite-sites model, but there was sufficient multisite linkage disequilibrium to reveal two major clades, which appear to be very old. Variation in this region of LPL may depart from the variation expected under a simple, neutral model, owing to complex historical patterns of population founding, drift, selection, and recombination. These data suggest that the design and interpretation of disease-association studies may not be as straightforward as often is assumed. PMID:9683608
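
    The "all four possible gametes" criterion in this abstract is the four-gamete test: for two biallelic sites, seeing all four two-site combinations among the haplotypes implies recombination (or recurrent mutation) under the infinite-sites model. A minimal sketch follows, assuming haplotypes are encoded as equal-length strings of 0/1 alleles; the function name and the toy data are made up for illustration and are not taken from the study.

```python
from itertools import combinations


def four_gamete_pairs(haplotypes):
    """Return (pairs showing all four gametes, total site pairs)."""
    if not haplotypes:
        return 0, 0
    n_sites = len(haplotypes[0])
    if any(len(h) != n_sites for h in haplotypes):
        raise ValueError("all haplotypes must have the same length")
    hits = 0
    total = 0
    for i, j in combinations(range(n_sites), 2):
        # Distinct two-site gametes observed at sites i and j.
        gametes = {(h[i], h[j]) for h in haplotypes}
        total += 1
        if len(gametes) == 4:
            hits += 1
    return hits, total


# Toy haplotypes over three biallelic sites (illustrative only).
toy = ["000", "010", "110", "111", "001"]
print(four_gamete_pairs(toy))  # (2, 3)
```

    As a side note, 2,211 is the number of pairs that can be formed from 67 sites, so the pairwise test was presumably run on a subset of the 88 variable sites; that reading is an inference from the arithmetic, not a statement in the abstract.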

  18. Nucleotide sequence and newly formed phosphodiester bond of spontaneously ligated satellite tobacco ringspot virus RNA.

    PubMed Central

    Buzayan, J M; Hampel, A; Bruening, G

    1986-01-01

    The satellite RNA of tobacco ringspot virus (STobRV RNA) replicates and becomes encapsidated in association with tobacco ringspot virus. Previous results show that the infected tissue produces multimeric STobRV RNAs of both polarities. RNA that is complementary to encapsidated STobRV RNA, designated as having the (-) polarity, cleaves autolytically at a specific ApG bond. Purified autolysis products spontaneously join in a non-enzymic reaction. We report characteristics of this RNA ligation reaction: the terminal groups that react, the type of bond in the newly formed junction and the nucleotide sequence of the joined RNA. The nucleotide sequence of the ligated RNA shows that joining of the reacting RNAs restored an ApG bond. The junction ApG has a 3'-to-5' phosphodiester bond. Thus the net ligation reaction of STobRV (-)RNA is the precise reversal of autolysis. We discuss this new type of RNA ligation reaction and its implications for the formation of multimeric STobRV RNAs during replication. PMID:2433680

  19. Complete nucleotide sequence of the Nilaparvata lugens reovirus: a putative member of the genus Fijivirus.

    PubMed

    Nakashima, N; Koizumi, M; Watanabe, H; Noda, H

    1996-01-01

    The nucleotide sequences of all genome segments of the Nilaparvata lugens reovirus (NLRV), which is found in the brown planthopper Nilaparvata lugens, have been determined and some genes have been assigned to structural and functional proteins. The genome of NLRV consists of 28 699 nucleotides and contains at least 11 large open reading frames (ORFs). The genome of NLRV is the largest among viruses of the family Reoviridae reported to date. The deduced amino acid sequence of genome segment S1 contained the major motifs of RNA polymerase and that of S7 had the purine NTP-binding motif. Based on the molecular masses of the deduced proteins and the particle structure of NLRV, segments S1, S3 and S7 were assigned to the 160, 140 and 75 kDa proteins, respectively, that are located in the inner core. It was deduced that S2 codes for the 135 kDa protein (B spike), which is located on the surface of the inner core. Most reported ORFs of rice black streaked dwarf virus (RBSDV), which shares many properties with NLRV, had similarities with the corresponding ORFs of NLRV. An exception was S7 ORF2, which is found in RBSDV but not NLRV and may therefore be involved in multiplication of RBSDV in rice plants. These results and our previous observations indicate that NLRV should be classified in the genus Fijivirus.

  20. Characterization, nucleotide sequence, and conserved genomic locations of insertion sequence ISRm5 in Rhizobium meliloti.

    PubMed Central

    Laberge, S; Middleton, A T; Wheatcroft, R

    1995-01-01

    A target for ISRm3 transposition in Rhizobium meliloti IZ450 is another insertion sequence element, named ISRm5. ISRm5 is 1,340 bp in length and possesses terminal inverted repeats of unequal lengths (27 and 28 bp) that contain five mismatches. An open reading frame that spans 89% of the length of one DNA strand encodes a putative transposase with significant similarity to the putative transposases of 11 insertion sequence elements from diverse bacterial species, including ISRm3 from R. meliloti. Multiple copies and variants of ISRm5 occur in the R. meliloti genome, often in close association with ISRm3. Five ISRm5 copies in two strains were studied, and each was found to be located between 8-bp direct repeats. At two of these loci, which were shown to be highly conserved in R. meliloti, the copies of ISRm5 were found to be associated with pairs of short inverted repeats resembling transcription terminators. This structural arrangement not only may provide a conserved niche for ISRm5 but also may be a preferred target for transposition. PMID:7768811

  1. Identification and nucleotide sequence of the glycoprotein gB gene of equine herpesvirus 4.

    PubMed

    Riggio, M P; Cullinane, A A; Onions, D E

    1989-03-01

    The nucleotide sequence of the glycoprotein gB gene of equine herpesvirus 4 (EHV-4) was determined. The gene was located within a BamHI genomic library by a combination of Southern and dot-blot hybridization with probes derived from the herpes simplex virus type 1 (HSV-1) gB DNA sequence. The predominant portion of the coding sequences was mapped to a 2.95-kilobase BamHI-EcoRI subfragment at the left-hand end of BamHI-C. Potential TATA box, CAT box, and mRNA start site sequences and the translational initiation codon were located in the BamHI M fragment of the virus, which is located immediately to the left of BamHI-C. A polyadenylation signal, AATAAA, occurs nine nucleotides past the chain termination codon. Translation of these sequences would give a 110-kilodalton protein possessing a 5' hydrophobic signal sequence, a hydrophilic surface domain containing 11 potential N-linked glycosylation sites, a hydrophobic transmembrane domain, and a 3' highly charged cytoplasmic domain. A potential internal proteolytic cleavage site, Arg-Arg/Ser, was identified at residues 459 to 461. Analysis of this protein revealed amino acid sequence homologies of 47% with HSV-1 gB, 54% with pseudorabies virus gpII, 51% with varicella-zoster virus gpII, 29% with human cytomegalovirus gB, and 30% with Epstein-Barr virus gB. Alignment of EHV-4 gB with HSV-1 (KOS) gB further revealed that four potential N-linked glycosylation sites and all 10 cysteine residues on the external surface of the molecules are perfectly conserved, suggesting that the proteins possess similar secondary and tertiary structures. Thus, we showed that EHV-4 gB is highly conserved with the gB and gpII glycoproteins of other herpesviruses, suggesting that this glycoprotein has a similar overall function in each virus. PMID:2915378
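
    Potential N-linked glycosylation sites such as those counted in this abstract are conventionally located by scanning the protein sequence for the sequon Asn-X-Ser/Thr, where X is any residue except proline. The sketch below shows one way to do that scan; the abstract does not say how the authors identified their sites, so the function name, the regular expression and the toy peptide are all illustrative assumptions.

```python
import re

# Zero-width lookahead so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != Pro)."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Toy peptide (not the EHV-4 gB sequence): NGS and NVT qualify, NPT does not.
print(find_sequons("MNGSANPTNVTQ"))  # [2, 9]
```

    A hit from this scan only marks a candidate site; whether a given asparagine is actually glycosylated depends on factors the motif alone does not capture.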

  2. Remarkable similarity in genome nucleotide sequences between the Schwarz FF-8 and AIK-C measles virus vaccine strains and apparent nucleotide differences in the phosphoprotein gene.

    PubMed

    Ito, Chie; Ohgimoto, Shinji; Kato, Seiichi; Sharma, Luna Bhatta; Ayata, Minoru; Komase, Katsuhiro; Takeuchi, Kaoru; Ihara, Toshiaki; Ogura, Hisashi

    2011-07-01

    The Schwarz FF-8 (FF-8) and AIK-C measles virus vaccine strains are currently used for vaccination in Japan. Here, the complete genome nucleotide sequence of the FF-8 strain has been determined and its genome sequence found to be remarkably similar to that of the AIK-C strain. These two strains are differentiated only by two nucleotide differences in the phosphoprotein gene. Since the FF-8 strain does not possess the amino acid substitutions in the phospho- and fusion proteins which are responsible for the temperature-sensitivity and small syncytium formation phenotypes of the AIK-C strain, respectively, other unidentified common mechanisms likely attenuate both the FF-8 and AIK-C strains.

  3. The bioinformatics of nucleotide sequence coding for proteins requiring metal coenzymes and proteins embedded with metals

    NASA Astrophysics Data System (ADS)

    Tremberger, G.; Dehipawala, Sunil; Cheung, E.; Holden, T.; Sullivan, R.; Nguyen, A.; Lieberman, D.; Cheung, T.

    2015-09-01

    All metallo-proteins need post-translation metal incorporation. In fact, the isotope ratios of Fe, Cu, and Zn have emerged as an important tool in physiology and oncology. The nickel-containing F430 is the prosthetic group of the enzyme methyl coenzyme M reductase, which catalyzes the release of methane in the final step of methanogenesis, a prime energy-metabolism candidate for life-exploration space missions in the solar system. The 3.5 Gyr early life sulfite reductase as a life switch energy metabolism had Fe-Mo clusters. The nitrogenase for nitrogen fixation 3 billion years ago had Mo. The early life arsenite oxidase needed for anoxygenic photosynthesis energy metabolism 2.8 billion years ago had Mo and Fe. The selection pressure in metal incorporation inside a protein would be quantifiable in terms of the related nucleotide sequence complexity with fractal dimension and entropy values. A simulation model showed that the studied metal-required energy metabolism sequences had at least ten times more selection pressure than the horizontally transferred sequences in Mealybug, guided by the outcome histogram of the correlation R-sq values. The metal energy metabolism sequence group was compared to the circadian clock KaiC sequence group using magnesium atomic level bond shifting mechanism in the protein, and the simulation model would suggest a much higher selection pressure for the energy life switch sequence group. The possibility of using Kepler 444 as an example of ancient life in Galaxy with the associated exoplanets has been proposed and is further discussed in this report. Examples of arsenic metal bonding shift probed by Synchrotron-based X-ray spectroscopy data and Zn controlled FOXP2 regulated pathways in human and chimp brain studied tissue samples are studied in relationship to the sequence bioinformatics. The analysis results suggest that relatively large metal bonding shift amount is associated with low probability correlation R

  4. araB Gene and nucleotide sequence of the araC gene of Erwinia carotovora.

    PubMed Central

    Lei, S P; Lin, H C; Heffernan, L; Wilcox, G

    1985-01-01

    The araB and araC genes of Erwinia carotovora were expressed in Escherichia coli and Salmonella typhimurium. The araB and araC genes in E. coli, E. carotovora, and S. typhimurium were transcribed in divergent directions. In E. carotovora, the araB and araC genes were separated by 3.5 kilobase pairs, whereas in E. coli and S. typhimurium they were separated by 147 base pairs. The nucleotide sequence of the E. carotovora araC gene was determined. The predicted sequence of AraC protein of E. carotovora was 18 and 29 amino acids longer than that of AraC protein of E. coli and S. typhimurium, respectively. The DNA sequence of the araC gene of E. carotovora was 58% homologous to that of E. coli and 59% homologous to that of S. typhimurium, with respect to the common region they share. The predicted amino acid sequence of AraC protein was 57% homologous to that of E. coli and 58% homologous to that of S. typhimurium. The 5' noncoding regions of the araB and araC genes of E. carotovora had little homology to either of the other two species. PMID:3902795

  5. Genome-wide analysis of single-nucleotide polymorphisms in human expressed sequences.

    PubMed

    Irizarry, K; Kustanovich, V; Li, C; Brown, N; Nelson, S; Wong, W; Lee, C J

    2000-10-01

    Single-nucleotide polymorphisms (SNPs) have been explored as a high-resolution marker set for accelerating the mapping of disease genes. Here we report 48,196 candidate SNPs detected by statistical analysis of human expressed sequence tags (ESTs), associated primarily with coding regions of genes. We used Bayesian inference to weigh evidence for true polymorphism versus sequencing error, misalignment or ambiguity, misclustering or chimaeric EST sequences, assessing data such as raw chromatogram height, sharpness, overlap and spacing, sequencing error rates, context-sensitivity and cDNA library origin. Three separate validations, namely comparison with 54 genes screened for SNPs independently, verification of HLA-A polymorphisms, and restriction fragment length polymorphism (RFLP) testing, verified 70%, 89% and 71% of our predicted SNPs, respectively. Our method detects tenfold more true HLA-A SNPs than previous analyses of the EST data. We found SNPs in a large fraction of known disease genes, including some disease-causing mutations (for example, the HbS sickle-cell mutation). Our comprehensive analysis of human coding region polymorphism provides a public resource for mapping of disease genes (available at http://www.bioinformatics.ucla.edu/snp).

  6. Cloning and nucleotide sequence of the gene coding for citrate synthase from a thermotolerant Bacillus sp.

    PubMed Central

    Schendel, F J; August, P R; Anderson, C R; Hanson, R S; Flickinger, M C

    1992-01-01

    The structural gene coding for citrate synthase from the gram-positive soil isolate Bacillus sp. strain C4 (ATCC 55182) capable of secreting acetic acid at pH 5.0 to 7.0 in the presence of dolime has been cloned from a genomic library by complementation of an Escherichia coli auxotrophic mutant lacking citrate synthase. The nucleotide sequence of the entire 3.1-kb HindIII fragment has been determined, and one major open reading frame was found coding for citrate synthase (ctsA). Citrate synthase from Bacillus sp. strain C4 was found to be a dimer (Mr, 84,500) with a subunit with an Mr of 42,000. The N-terminal sequence was found to be identical with that predicted from the gene sequence. The kinetics were best fit to a bisubstrate enzyme with an ordered mechanism. Bacillus sp. strain C4 citrate synthase was not activated by potassium chloride and was not inhibited by NADH, ATP, ADP, or AMP at levels up to 1 mM. The predicted amino acid sequence was compared with that of the E. coli, Acinetobacter anitratum, Pseudomonas aeruginosa, Rickettsia prowazekii, porcine heart, and Saccharomyces cerevisiae cytoplasmic and mitochondrial enzymes. PMID:1311544

  7. WAViS server for handling, visualization and presentation of multiple alignments of nucleotide or amino acids sequences.

    PubMed

    Zika, Radek; Paces, Jan; Pavlícek, Adam; Paces, Václav

    2004-07-01

    Web Alignment Visualization Server contains a set of web-tools designed for quick generation of publication-quality color figures of multiple alignments of nucleotide or amino acids sequences. It can be used for identification of conserved regions and gaps within many sequences using only common web browsers. The server is accessible at http://wavis.img.cas.cz.

  8. Cloning and genomic nucleotide sequence of the matrix attachment region binding protein from the halotolerant alga Dunaliella salina.

    PubMed

    Wang, Peng-Ju; Wang, Tian-Yun; Wang, Ya-Feng; Yang, Rui; Li, Zhao-Xi

    2013-07-01

    In our previous study, the sequence of a matrix attachment region binding protein (MBP) cDNA was cloned from the unicellular green alga Dunaliella salina. However, the nucleotide sequence of this gene has not been reported so far. In this paper, the nucleotide sequence of MBP was cloned and characterized, and its gene copy number was determined. The MBP nucleotide sequence is 5641 bp long, and interrupted by 12 introns ranging from 132 to 562 bp. All the introns in the D. salina MBP gene have orthodox splice sites, exhibiting GT at the 5' end and AG at the 3' end. Southern blot analysis showed that MBP only has one copy in the D. salina genome. PMID:22961592
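
    The splice-site observation above (GT at the 5' end and AG at the 3' end of every intron) is easy to verify once intron coordinates are known. The following is a minimal sketch on a hypothetical toy gene; the function names and coordinates are illustrative assumptions and are not taken from the D. salina MBP sequence.

```python
def has_canonical_splice_sites(intron: str) -> bool:
    """Return True if the intron (DNA, 5'->3') begins with GT and ends with AG."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def check_introns(gene: str, introns: list[tuple[int, int]]) -> list[bool]:
    """Check each intron, given as 0-based half-open (start, end) coordinates."""
    return [has_canonical_splice_sites(gene[start:end]) for start, end in introns]


# Hypothetical toy gene: two exons (lowercase) flanking one intron (uppercase).
toy_gene = "atggcc" + "GTAAGTCTTTTCAG" + "gattaa"
print(check_introns(toy_gene, [(6, 20)]))
```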

  10. Complete nucleotide sequences of two isolates of cherry green ring mottle virus from peach (Prunus persica) in China.

    PubMed

    Wang, Lihui; Jiang, Dongmei; Niu, Feiqing; Lu, Meiguang; Wang, Hongqing; Li, Shifang

    2013-03-01

    Two complete nucleotide sequences of cherry green ring mottle virus (CGRMV) isolated from peach in Hebei (Hs10) and Fujian (F9) Provinces, China, were determined. Five open reading frames (ORFs) were found in the genomes of both isolates. The F9 and Hs10 isolates shared 82.2 % and 83.4-94.4 % nucleotide sequence identity, respectively, with two CGRMV isolates from cherry. Analysis of the nucleotide and amino acid sequences from the five ORFs of both isolates showed that Hs10 shares the greatest sequence identity with P1A (GenBank AJ291761) from cherry. Phylogenetic analysis indicated that CGRMV isolates from peach and cherry are closely related to members of the genus Foveavirus.
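
    Identity figures such as those quoted above are computed from a pairwise alignment. The sketch below shows one common convention, assumed here for illustration: matches divided by the number of aligned columns that contain no gap. Other tools use different denominators, and the abstract does not say which convention the authors used.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns, skipping columns with a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


# Hypothetical aligned fragments; '-' marks an alignment gap.
print(round(percent_identity("ACGTACGT-A", "ACGTTCGTAA"), 1))
```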

  11. Single nucleotide polymorphism analysis of Korean native chickens using next generation sequencing data.

    PubMed

    Seo, Dong-Won; Oh, Jae-Don; Jin, Shil; Song, Ki-Duk; Park, Hee-Bok; Heo, Kang-Nyeong; Shin, Younhee; Jung, Myunghee; Park, Junhyung; Jo, Cheorun; Lee, Hak-Kyo; Lee, Jun-Heon

    2015-02-01

    There are five native chicken lines in Korea, which are mainly classified by plumage colors (black, white, red, yellow, gray). These five lines are very important genetic resources in the Korean poultry industry. Based on a next generation sequencing technology, whole genome sequence and reference assemblies were performed using Gallus_gallus_4.0 (NCBI) with whole genome sequences from these lines to identify common and novel single nucleotide polymorphisms (SNPs). We obtained 36,660,731,136 ± 1,257,159,120 bp of raw sequence and average 26.6-fold of 25-29 billion reference assembly sequences representing 97.288 % coverage. Also, 4,006,068 ± 97,534 SNPs were observed from 29 autosomes and the Z chromosome and, of these, 752,309 SNPs are the common SNPs across lines. Among the identified SNPs, the number of novel- and known-location assigned SNPs was 1,047,951 ± 14,956 and 2,948,648 ± 81,414, respectively. The number of unassigned known SNPs was 1,181 ± 150 and unassigned novel SNPs was 8,238 ± 1,019. Synonymous SNPs, non-synonymous SNPs, and SNPs having character changes were 26,266 ± 1,456, 11,467 ± 604, 8,180 ± 458, respectively. Overall, 443,048 ± 26,389 SNPs in each bird were identified by comparing with dbSNP in NCBI. The presently obtained genome sequence and SNP information in Korean native chickens have wide applications for further genome studies such as genetic diversity studies to detect causative mutations for economic and disease related traits.

  12. The ChEMBL bioactivity database: an update.

    PubMed

    Bento, A Patrícia; Gaulton, Anna; Hersey, Anne; Bellis, Louisa J; Chambers, Jon; Davies, Mark; Krüger, Felix A; Light, Yvonne; Mak, Lora; McGlinchey, Shaun; Nowotka, Michal; Papadatos, George; Santos, Rita; Overington, John P

    2014-01-01

    ChEMBL is an open large-scale bioactivity database (https://www.ebi.ac.uk/chembl), previously described in the 2012 Nucleic Acids Research Database Issue. Since then, a variety of new data sources and improvements in functionality have contributed to the growth and utility of the resource. In particular, more comprehensive tracking of compounds from research stages through clinical development to market is provided through the inclusion of data from United States Adopted Name applications; a new richer data model for representing drug targets has been developed; and a number of methods have been put in place to allow users to more easily identify reliable data. Finally, access to ChEMBL is now available via a new Resource Description Framework format, in addition to the web-based interface, data downloads and web services. PMID:24214965

  13. Plastid sequence evolution: a new pattern of nucleotide substitutions in the Cucurbitaceae.

    PubMed

    Decker-Walters, Deena S; Chung, Sang-Min; Staub, Jack E

    2004-05-01

    Nucleotide substitutions (i.e., point mutations) are the primary driving force in generating DNA variation upon which selection can act. Substitutions called transitions, which entail exchanges between purines (A = adenine, G = guanine) or pyrimidines (C = cytosine, T = thymine), typically outnumber transversions (e.g., exchanges between a purine and a pyrimidine) in a DNA strand. With an increasing number of plant studies revealing a transversion rather than transition bias, we chose to perform a detailed substitution analysis for the plant family Cucurbitaceae using data from several short plastid DNA sequences. We generated a phylogenetic tree for 19 taxa of the tribe Benincaseae and related genera and then scored conservative substitution changes (e.g., those not exhibiting homoplasy or reversals) from the unambiguous branches of the tree. Neither the transition nor (A+T)/(G+C) biases found in previous studies were supported by our overall data. More importantly, we found a novel and symmetrical substitution bias in which Gs had been preferentially replaced by A, As by C, Cs by T, and Ts by G, resulting in the G→A→C→T→G substitution series. Understanding this pattern will lead to new hypotheses concerning plastid evolution, which in turn will affect the choices of substitution models and other tree-building algorithms for phylogenetic analyses based on nucleotide data.
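
    The transition/transversion distinction defined above can be written down directly. The sketch below classifies each step of the reported substitution series; the function and variable names are illustrative and not part of the study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Label a single-base DNA substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases:
        raise ValueError("bases must be one of A, C, G, T")
    if ref == alt:
        raise ValueError("identical bases are not a substitution")
    same_class = ({ref, alt} <= PURINES) or ({ref, alt} <= PYRIMIDINES)
    return "transition" if same_class else "transversion"


# The four steps of the series described in the abstract.
series = [("G", "A"), ("A", "C"), ("C", "T"), ("T", "G")]
labels = {f"{r}->{a}": classify_substitution(r, a) for r, a in series}
print(labels)
```

    Under this classification the series alternates between the two kinds of change: G to A and C to T are transitions, while A to C and T to G are transversions.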

  14. Nucleotide Sequence Analyses and Predicted Coding of Bunyavirus Genome RNA Species

    PubMed Central

    Clerx-van Haaster, Corrie M.; Akashi, Hiroomi; Auperin, David D.; Bishop, David H. L.

    1982-01-01

    We performed 3′ RNA sequence analyses of [32P]pCp-end-labeled La Crosse (LAC) virus, alternate LAC virus isolate L74, and snowshoe hare bunyavirus large (L), medium (M), and small (S) negative-stranded viral RNA species to determine the coding capabilities of these species. These analyses were confirmed by dideoxy primer extension studies in which we used a synthetic oligodeoxynucleotide primer complementary to the conserved 3′-terminal decanucleotide of the three viral RNA species (Clerx-van Haaster and Bishop, Virology 105:564-574, 1980). The deduced sequences predicted translation of two S-RNA gene products that were read in overlapping reading frames. So far, only single contiguous open reading frames have been identified for the viral M- and L-RNA species. For the negative-stranded M-RNA species of all three viruses, the single reading frame developed from the first 3′-proximal UAC triplet. Likewise, for the L-RNA of the alternate LAC isolate, a single open reading frame developed from the first 3′-proximal UAC triplet. The corresponding L-RNA sequences of prototype LAC and snowshoe hare viruses initiated open reading frames; however, for both viral L-RNA species there was a preceding 3′-proximal UAC triplet in another reading frame that was followed shortly afterward by a termination codon. A comparison of the sequence data obtained for snowshoe hare virus, LAC virus, and the alternate LAC virus isolate showed that the identified nucleotide substitutions were sufficient to account for some of the fingerprint differences in the L-, M-, and S-RNA species of the three viruses. Unlike the distribution of the L- and M-RNA substitutions, significantly fewer nucleotide substitutions occurred after the initial UAC triplet of the S-RNA species than before this triplet, implying that the overlapping genes of the S RNA provided a constraint against evolution by point mutation. The comparative sequence analyses predicted amino acid differences among the

  15. Guanine nucleotide-binding proteins that enhance choleragen ADP-ribosyltransferase activity: nucleotide and deduced amino acid sequence of an ADP-ribosylation factor cDNA.

    PubMed Central

    Price, S R; Nightingale, M; Tsai, S C; Williamson, K C; Adamik, R; Chen, H C; Moss, J; Vaughan, M

    1988-01-01

    Three (two soluble and one membrane) guanine nucleotide-binding proteins (G proteins) that enhance ADP-ribosylation of the Gs alpha stimulatory subunit of the adenylyl cyclase (EC 4.6.1.1) complex by choleragen have recently been purified from bovine brain. To further define the structure and function of these ADP-ribosylation factors (ARFs), we isolated a cDNA clone (lambda ARF2B) from a bovine retinal library by screening with a mixed heptadecanucleotide probe whose sequence was based on the partial amino acid sequence of one of the soluble ARFs from bovine brain. Comparison of the deduced amino acid sequence of lambda ARF2B with sequences of peptides from the ARF protein (total of 60 amino acids) revealed only two differences. Whether these are cloning artifacts or reflect the existence of more than one ARF protein remains to be determined. Deduced amino acid sequences of ARF, Go alpha (the alpha subunit of a G protein that may be involved in regulation of ion fluxes), and c-Ha-ras gene product p21 show similarities in regions believed to be involved in guanine nucleotide binding and GTP hydrolysis. ARF apparently lacks a site analogous to that ADP-ribosylated by choleragen in G-protein alpha subunits. Although both the ARF proteins and the alpha subunits bind guanine nucleotides and serve as choleragen substrates, they must interact with the toxin A1 peptide in different ways. In addition to serving as an ADP-ribose acceptor, ARF interacts with the toxin in a manner that modifies its catalytic properties. PMID:3135549

  16. The complete nucleotide sequence of a new bipartite begomovirus from Brazil infecting Abutilon.

    PubMed

    Paprotka, T; Metzler, V; Jeske, H

    2010-05-01

    The complete nucleotide sequence of Abutilon mosaic Brazil virus (AbMBV), a new bipartite begomovirus from Bahia, Brazil, is described and analyzed phylogenetically. Its DNA A is most closely related to those of Sida-infecting begomoviruses from Brazil and forms a phylogenetic cluster with pepper- and Euphorbia-infecting begomoviruses from Central America. The DNA B component forms a cluster with different Sida- and okra-infecting begomoviruses from Brazil. Both components are distinct from those of the classical Abutilon mosaic virus originating from the West Indies. AbMBV is transmissible to Nicotiana benthamiana and Malva parviflora by biolistics of rolling-circle amplification products and induces characteristic mosaic and vein-clearing symptoms in M. parviflora.

  17. High-Throughput Sequencing Reveals Single Nucleotide Variants in Longer-Kernel Bread Wheat

    PubMed Central

    Chen, Feng; Zhu, Zibo; Zhou, Xiaobian; Yan, Yan; Dong, Zhongdong; Cui, Dangqun

    2016-01-01

    The transcriptomes of bread wheat Yunong 201 and its ethyl methanesulfonate derivative Yunong 3114 were obtained by next-generation sequencing technology. Single nucleotide variants (SNVs) in the wheat strains were explored and compared. A total of 5907 and 6287 non-synonymous SNVs were acquired for Yunong 201 and 3114, respectively. A total of 4021 genes with SNVs were obtained. The genes that underwent non-synonymous SNVs were significantly involved in ATP binding, protein phosphorylation, and cellular protein metabolic process. The heat map analysis also indicated that most of these mutant genes were significantly differentially expressed at different developmental stages. The SNVs in these genes possibly contribute to the longer kernel length of Yunong 3114. Our data provide useful information on wheat transcriptome for future studies on wheat functional genomics. This study could also help in illustrating the gene functions of the non-synonymous SNVs of Yunong 201 and 3114. PMID:27551288

  18. Developing single nucleotide polymorphism (SNP) markers from transcriptome sequences for identification of longan (Dimocarpus longan) germplasm

    PubMed Central

    Wang, Boyi; Tan, Hua-Wei; Fang, Wanping; Meinhardt, Lyndel W; Mischke, Sue; Matsumoto, Tracie; Zhang, Dapeng

    2015-01-01

    Longan (Dimocarpus longan Lour.) is an important tropical fruit tree crop. Accurate varietal identification is essential for germplasm management and breeding. Using longan transcriptome sequences from public databases, we developed single nucleotide polymorphism (SNP) markers; validated 60 SNPs in 50 longan germplasm accessions, including cultivated varieties and wild germplasm; and designated 25 SNP markers that unambiguously identified all tested longan varieties with high statistical rigor (P<0.0001). Multiple trees from the same clone were verified and off-type trees were identified. Diversity analysis revealed genetic relationships among analyzed accessions. Cultivated varieties differed significantly from wild populations (Fst=0.300; P<0.001), demonstrating untapped genetic diversity for germplasm conservation and utilization. Within cultivated varieties, apparent differences between varieties from China and those from Thailand and Hawaii indicated geographic patterns of genetic differentiation. These SNP markers provide a powerful tool to manage longan genetic resources and breeding, with accurate and efficient genotype identification. PMID:26504559

  19. High-Throughput Sequencing Reveals Single Nucleotide Variants in Longer-Kernel Bread Wheat.

    PubMed

    Chen, Feng; Zhu, Zibo; Zhou, Xiaobian; Yan, Yan; Dong, Zhongdong; Cui, Dangqun

    2016-01-01

    The transcriptomes of bread wheat Yunong 201 and its ethyl methanesulfonate derivative Yunong 3114 were obtained by next-generation sequencing technology. Single nucleotide variants (SNVs) in the wheat strains were explored and compared. A total of 5907 and 6287 non-synonymous SNVs were acquired for Yunong 201 and 3114, respectively. A total of 4021 genes with SNVs were obtained. The genes that underwent non-synonymous SNVs were significantly involved in ATP binding, protein phosphorylation, and cellular protein metabolic process. The heat map analysis also indicated that most of these mutant genes were significantly differentially expressed at different developmental stages. The SNVs in these genes possibly contribute to the longer kernel length of Yunong 3114. Our data provide useful information on wheat transcriptome for future studies on wheat functional genomics. This study could also help in illustrating the gene functions of the non-synonymous SNVs of Yunong 201 and 3114. PMID:27551288

  20. Complete nucleotide sequence of the mitochondrial genome of a salamander, Mertensiella luschani.

    PubMed

    Zardoya, Rafael; Malaga-Trillo, Edward; Veith, Michael; Meyer, Axel

    2003-10-23

    The complete nucleotide sequence (16,650 bp) of the mitochondrial genome of the salamander Mertensiella luschani (Caudata, Amphibia) was determined. This molecule conforms to the consensus vertebrate mitochondrial gene order. However, it is characterized by a long non-coding intervening sequence with two 124-bp repeats between the tRNA(Thr) and tRNA(Pro) genes. The new sequence data were used to reconstruct a phylogeny of jawed vertebrates. Phylogenetic analyses of all mitochondrial protein-coding genes at the amino acid level recovered a robust vertebrate tree in which lungfishes are the closest living relatives of tetrapods, salamanders and frogs are grouped together to the exclusion of caecilians (the Batrachia hypothesis) in a monophyletic amphibian clade, turtles show diapsid affinities and are placed as sister group of crocodiles+birds, and the marsupials are grouped together with monotremes and basal to placental mammals. The deduced phylogeny was used to characterize the molecular evolution of vertebrate mitochondrial proteins. Amino acid frequencies were analyzed across the main lineages of jawed vertebrates, and leucine and cysteine were found to be the most and least abundant amino acids in mitochondrial proteins, respectively. Patterns of amino acid replacements were conserved among vertebrates. Overall, cartilaginous fishes showed the least variation in amino acid frequencies and replacements. Constancy of rates of evolution among the main lineages of jawed vertebrates was rejected.

  1. Regulatory regions of two transport operons under nitrogen control: nucleotide sequences.

    PubMed Central

    Higgins, C F; Ames, G F

    1982-01-01

    We have determined the nucleotide sequences of the regulatory regions from two amino acid transport operons from Salmonella typhimurium: dhuA, which regulates the histidine transport operon, and argTr, which regulates argT, the gene encoding the lysine-arginine-ornithine-binding protein, LAO. The promoter for the histidine transport operon has been identified from the sequence change in the promoter-up mutation dhuA1. Neither regulatory region has any of the features typical of the regulatory regions of the amino acid biosynthetic operons, indicating that regulation of at least these transport genes does not involve a transcription attenuation mechanism. We have identified three interesting features, present in both of these sequences, which may be of importance in the regulation of these and other operons: a "stem-loop-foot" structure, a region of specific homology, and a mirror symmetry. The region of mirror symmetry may be a protein recognition site important in regulating expression of these and other operons in response to nitrogen availability. Mirror symmetry as a structure for DNA-protein interaction sites has not been proposed previously. PMID:7041112
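
    To make the notion of mirror symmetry concrete, the sketch below contrasts it with the more familiar inverted repeat (palindrome). It assumes the usual definitions: a mirror repeat reads the same forwards and backwards on one strand, whereas an inverted repeat equals its own reverse complement. The abstract does not give the sequences involved, so the examples are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_mirror_repeat(seq: str) -> bool:
    """True if the sequence equals its own reversal on the same strand."""
    s = seq.upper()
    return s == s[::-1]


def is_inverted_repeat(seq: str) -> bool:
    """True if the sequence equals its reverse complement (a DNA palindrome)."""
    s = seq.upper()
    return s == s.translate(COMPLEMENT)[::-1]


# Hypothetical examples: AGCCGA is a mirror repeat, GAATTC an inverted repeat.
for site in ("AGCCGA", "GAATTC"):
    print(site, is_mirror_repeat(site), is_inverted_repeat(site))
```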

  2. High-throughput nucleotide sequence analysis of diverse bacterial communities in leachates of decomposing pig carcasses

    PubMed Central

    Yang, Seung Hak; Lim, Joung Soo; Khan, Modabber Ahmed; Kim, Bong Soo; Choi, Dong Yoon; Lee, Eun Young; Ahn, Hee Kwon

    2015-01-01

    The leachate generated by the decomposition of animal carcasses has been implicated as an environmental contaminant surrounding the burial site. High-throughput nucleotide sequencing was conducted to investigate the bacterial communities in leachates from the decomposition of pig carcasses. We acquired 51,230 reads from six different samples (1, 2, 3, 4, 6 and 14 week-old carcasses) and found that sequences representing the phylum Firmicutes predominated. The diversity of bacterial 16S rRNA gene sequences in the leachate was the highest at 6 weeks, in contrast to those at 2 and 14 weeks. The relative abundance of Firmicutes was reduced, while the proportion of Bacteroidetes and Proteobacteria increased from 3 to 6 weeks. The representation of phyla was restored after 14 weeks. However, the community structures between the samples taken at 1–2 and 14 weeks differed at the bacterial classification level. The trend in pH was similar to the changes seen in bacterial communities, indicating that the pH of the leachate could be related to the shift in the microbial community. The results indicate that the composition of bacterial communities in leachates of decomposing pig carcasses shifted continuously during the study period and might be influenced by the burial site. PMID:26500442

  3. Complete nucleotide sequence of watermelon chlorotic stunt virus originating from Oman.

    PubMed

    Khan, Akhtar J; Akhtar, Sohail; Briddon, Rob W; Ammara, Um; Al-Matrooshi, Abdulrahman M; Mansoor, Shahid

    2012-07-01

    Watermelon chlorotic stunt virus (WmCSV) is a bipartite begomovirus (genus Begomovirus, family Geminiviridae) that causes economic losses to cucurbits, particularly watermelon, across the Middle East and North Africa. Recently squash (Cucurbita moschata) grown in an experimental field in Oman was found to display symptoms such as leaf curling, yellowing and stunting, typical of a begomovirus infection. Sequence analysis of the virus isolated from squash showed 97.6-99.9% nucleotide sequence identity to previously described WmCSV isolates for the DNA A component and 93-98% identity for the DNA B component. Agrobacterium-mediated inoculation to Nicotiana benthamiana resulted in the development of symptoms fifteen days post inoculation. This is the first bipartite begomovirus identified in Oman. Overall the Oman isolate showed the highest levels of sequence identity to a WmCSV isolate originating from Iran, which was confirmed by phylogenetic analysis. This suggests that WmCSV present in Oman has been introduced from Iran. The significance of this finding is discussed.

  4. Complete nucleotide sequence of the mitochondrial genome of a salamander, Mertensiella luschani.

    PubMed

    Zardoya, Rafael; Malaga-Trillo, Edward; Veith, Michael; Meyer, Axel

    2003-10-23

    The complete nucleotide sequence (16,650 bp) of the mitochondrial genome of the salamander Mertensiella luschani (Caudata, Amphibia) was determined. This molecule conforms to the consensus vertebrate mitochondrial gene order. However, it is characterized by a long non-coding intervening sequence with two 124-bp repeats between the tRNA(Thr) and tRNA(Pro) genes. The new sequence data were used to reconstruct a phylogeny of jawed vertebrates. Phylogenetic analyses of all mitochondrial protein-coding genes at the amino acid level recovered a robust vertebrate tree in which lungfishes are the closest living relatives of tetrapods, salamanders and frogs are grouped together to the exclusion of caecilians (the Batrachia hypothesis) in a monophyletic amphibian clade, turtles show diapsid affinities and are placed as sister group of crocodiles+birds, and the marsupials are grouped together with monotremes and basal to placental mammals. The deduced phylogeny was used to characterize the molecular evolution of vertebrate mitochondrial proteins. Amino acid frequencies were analyzed across the main lineages of jawed vertebrates, and leucine and cysteine were found to be the most and least abundant amino acids in mitochondrial proteins, respectively. Patterns of amino acid replacements were conserved among vertebrates. Overall, cartilaginous fishes showed the least variation in amino acid frequencies and replacements. Constancy of rates of evolution among the main lineages of jawed vertebrates was rejected. PMID:14604788

  5. Whole genome sequencing of a single Bos taurus animal for single nucleotide polymorphism discovery

    PubMed Central

    Eck, Sebastian H; Benet-Pagès, Anna; Flisikowski, Krzysztof; Meitinger, Thomas; Fries, Ruedi; Strom, Tim M

    2009-01-01

    Background: The majority of the 2 million bovine single nucleotide polymorphisms (SNPs) currently available in dbSNP have been identified in a single breed, Hereford cattle, during the bovine genome project. In an attempt to evaluate the variance of a second breed, we have produced a whole genome sequence at low coverage of a single Fleckvieh bull. Results: We generated 24 gigabases of sequence, mainly using 36-bp paired-end reads, resulting in an average 7.4-fold sequence depth. This coverage was sufficient to identify 2.44 million SNPs, 82% of which were previously unknown, and 115,000 small indels. A comparison with the genotypes of the same animal, generated on a 50 k oligonucleotide chip, revealed a detection rate of 74% and 30% for homozygous and heterozygous SNPs, respectively. The false positive rate, as determined by comparison with genotypes determined for 196 randomly selected SNPs, was approximately 1.1%. We further determined the allele frequencies of the 196 SNPs in 48 Fleckvieh and 48 Braunvieh bulls. 95% of the SNPs were polymorphic with an average minor allele frequency of 24.5% and with 83% of the SNPs having a minor allele frequency larger than 5%. Conclusions: This work provides the first single cattle genome by next-generation sequencing. The chosen approach - low to medium coverage re-sequencing - added more than 2 million novel SNPs to the currently publicly available SNP resource, providing a valuable resource for the construction of high density oligonucleotide arrays in the context of genome-wide association studies. PMID:19660108
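
    The figures in this abstract can be cross-checked with simple arithmetic. The sketch below uses only numbers quoted above. It assumes mean depth is total sequenced bases divided by the length covered, which may differ from how the authors computed depth (for example, if only mapped bases were counted).

```python
def implied_genome_length(total_bases: float, mean_depth: float) -> float:
    """Length (bp) implied by depth = total_bases / length."""
    return total_bases / mean_depth


def novel_snp_count(total_snps: float, novel_fraction: float) -> int:
    """Number of previously unknown SNPs, rounded to the nearest integer."""
    return round(total_snps * novel_fraction)


# Values quoted in the abstract: 24 gigabases, 7.4-fold depth, 2.44 million SNPs, 82% novel.
length_bp = implied_genome_length(24e9, 7.4)
novel = novel_snp_count(2.44e6, 0.82)
print(f"implied length: {length_bp / 1e9:.2f} Gb, novel SNPs: {novel:,}")
```

    The result of roughly 2.0 million novel SNPs agrees with the "more than 2 million novel SNPs" stated in the conclusions.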

  6. Detection and quantitation of single nucleotide polymorphisms, DNA sequence variations, DNA mutations, DNA damage and DNA mismatches

    DOEpatents

    McCutchen-Maloney, Sandra L.

    2002-01-01

    DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dipsticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such as TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding, which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera, which gives a decrease in signal.

  7. Real-time single-molecule electronic DNA sequencing by synthesis using polymer-tagged nucleotides on a nanopore array.

    PubMed

    Fuller, Carl W; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Bibillo, Arek; Stranges, P Benjamin; Dorwart, Michael; Tao, Chuanjuan; Li, Zengmin; Guo, Wenjing; Shi, Shundi; Korenblum, Daniel; Trans, Andrew; Aguirre, Anne; Liu, Edward; Harada, Eric T; Pollard, James; Bhat, Ashwini; Cech, Cynthia; Yang, Alexander; Arnold, Cleoma; Palla, Mirkó; Hovis, Jennifer; Chen, Roger; Morozova, Irina; Kalachikov, Sergey; Russo, James J; Kasianowicz, John J; Davis, Randy; Roever, Stefan; Church, George M; Ju, Jingyue

    2016-05-10

    DNA sequencing by synthesis (SBS) offers a robust platform to decipher nucleic acid sequences. Recently, we reported a single-molecule nanopore-based SBS strategy that accurately distinguishes four bases by electronically detecting and differentiating four different polymer tags attached to the 5'-phosphate of the nucleotides during their incorporation into a growing DNA strand catalyzed by DNA polymerase. Further developing this approach, we report here the use of nucleotides tagged at the terminal phosphate with oligonucleotide-based polymers to perform nanopore SBS on an α-hemolysin nanopore array platform. We designed and synthesized several polymer-tagged nucleotides using tags that produce different electrical current blockade levels and verified they are active substrates for DNA polymerase. A highly processive DNA polymerase was conjugated to the nanopore, and the conjugates were complexed with primer/template DNA and inserted into lipid bilayers over individually addressable electrodes of the nanopore chip. When an incoming complementary-tagged nucleotide forms a tight ternary complex with the primer/template and polymerase, the tag enters the pore, and the current blockade level is measured. The levels displayed by the four nucleotides tagged with four different polymers captured in the nanopore in such ternary complexes were clearly distinguishable and sequence-specific, enabling continuous sequence determination during the polymerase reaction. Thus, real-time single-molecule electronic DNA sequencing data with single-base resolution were obtained. The use of these polymer-tagged nucleotides, combined with polymerase tethering to nanopores and multiplexed nanopore sensors, should lead to new high-throughput sequencing methods. PMID:27091962

  8. Real-time single-molecule electronic DNA sequencing by synthesis using polymer-tagged nucleotides on a nanopore array

    PubMed Central

    Fuller, Carl W.; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Bibillo, Arek; Stranges, P. Benjamin; Dorwart, Michael; Tao, Chuanjuan; Li, Zengmin; Guo, Wenjing; Shi, Shundi; Korenblum, Daniel; Trans, Andrew; Aguirre, Anne; Liu, Edward; Harada, Eric T.; Pollard, James; Bhat, Ashwini; Cech, Cynthia; Yang, Alexander; Arnold, Cleoma; Palla, Mirkó; Hovis, Jennifer; Chen, Roger; Morozova, Irina; Kalachikov, Sergey; Russo, James J.; Kasianowicz, John J.; Davis, Randy; Roever, Stefan; Church, George M.; Ju, Jingyue

    2016-01-01

    DNA sequencing by synthesis (SBS) offers a robust platform to decipher nucleic acid sequences. Recently, we reported a single-molecule nanopore-based SBS strategy that accurately distinguishes four bases by electronically detecting and differentiating four different polymer tags attached to the 5′-phosphate of the nucleotides during their incorporation into a growing DNA strand catalyzed by DNA polymerase. Further developing this approach, we report here the use of nucleotides tagged at the terminal phosphate with oligonucleotide-based polymers to perform nanopore SBS on an α-hemolysin nanopore array platform. We designed and synthesized several polymer-tagged nucleotides using tags that produce different electrical current blockade levels and verified they are active substrates for DNA polymerase. A highly processive DNA polymerase was conjugated to the nanopore, and the conjugates were complexed with primer/template DNA and inserted into lipid bilayers over individually addressable electrodes of the nanopore chip. When an incoming complementary-tagged nucleotide forms a tight ternary complex with the primer/template and polymerase, the tag enters the pore, and the current blockade level is measured. The levels displayed by the four nucleotides tagged with four different polymers captured in the nanopore in such ternary complexes were clearly distinguishable and sequence-specific, enabling continuous sequence determination during the polymerase reaction. Thus, real-time single-molecule electronic DNA sequencing data with single-base resolution were obtained. The use of these polymer-tagged nucleotides, combined with polymerase tethering to nanopores and multiplexed nanopore sensors, should lead to new high-throughput sequencing methods. PMID:27091962

  9. Assessment of the nucleotide sequence variability in the bovine T-cell receptor alpha delta joining gene region.

    PubMed

    Fries, R; Ewald, D; Thaller, G; Buitkamp, J

    2001-05-01

    The sequence of 2,193 nucleotides from the bovine T-cell receptor alpha/delta joining gene region (TCRADJ) was determined and compared with the corresponding human and murine sequences. The identity was 75.3% for the comparison of the Bos taurus vs. the Homo sapiens sequence and 63.8% for the Bos taurus vs. the Mus musculus sequence. This comparison permitted the identification of the putatively functional elements within the bovine sequence. Direct sequencing of 2,110 nucleotides in nine animals revealed 12 variable sites. Estimates, based on direct sequencing in three Holstein Friesian animals, for the two measures of sequence variability, nucleotide polymorphism (θ) and nucleotide diversity (π), were 0.00050 (±0.00036) and 0.00077 (±0.00056), respectively. The test statistic, Tajima's D, for the comparison of the two measures indicates that the difference between θ and π is close to significance (P < 0.05), suggesting the possibility of selective forces acting on the studied genomic region. Allelic variation at 5 of the 12 variable sites was analysed in 359 animals (48 Anatolian Black, 56 Braunvieh, 115 Fleckvieh, 47 Holstein Friesian, 50 Simmental and 43 Pinzgauer) using the oligonucleotide ligation assay (OLA) in combination with the enzyme-linked immunosorbent assay (ELISA). Nine unambiguous haplotypes could be derived based on animals with a maximum of one heterozygous site. Four to seven haplotypes were present in the different breeds. When taking into account the frequencies of the haplotypes in the different breeds, especially in Anatolian Black, an ancestral cattle population, we could establish the likely phylogenetic relationships of the haplotypes. Such haplotype trees are the basis for cladistic candidate gene analysis. Our study demonstrates that the systematic search of single nucleotide polymorphisms (SNPs) is useful for analysing all aspects of variability of a given genomic region.
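    The comparison of the two variability measures via Tajima's D can be sketched as follows. This is a minimal illustration of the standard textbook formula applied to a toy alignment, not the authors' analysis: the function name, the toy sequences, and the choice to return per-locus rather than per-site values are all assumptions made here.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Return (S, pi, theta_w, D) for equal-length aligned sequences.

    S       : number of segregating (variable) sites
    pi      : mean number of pairwise differences (per locus)
    theta_w : Watterson's estimator S / a1 (per locus)
    D       : Tajima's D, or None when it is undefined
    Divide pi and theta_w by the alignment length for per-site values.
    """
    n = len(seqs)
    if n < 2 or any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("need at least two aligned sequences of equal length")

    # Count alignment columns that contain more than one allele.
    S = sum(1 for column in zip(*seqs) if len(set(column)) > 1)

    # Average the Hamming distance over all sequence pairs.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    theta_w = S / a1

    # Variance constants from Tajima's formula.
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n**2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * S + e2 * S * (S - 1)
    # The variance vanishes for very small samples (n < 4) or when S == 0.
    d = (pi - theta_w) / sqrt(variance) if variance > 1e-12 else None
    return S, pi, theta_w, d


# Toy alignment of four sequences (hypothetical data, for illustration only).
toy = ["AAAA", "AAAT", "AATT", "ATTT"]
S, pi, theta_w, d = tajimas_d(toy)
print(S, round(pi, 3), round(theta_w, 3), d)
```

    A positive D, as in this toy case, means that π exceeds θ, which is the direction of the difference reported in the abstract (0.00077 versus 0.00050).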

  10. Whole-genome sequence of Sunxiuqinia dokdonensis DH1(T), isolated from deep sub-seafloor sediment in Dokdo Island.

    PubMed

    Lim, Sooyeon; Chang, Dong-Ho; Kim, Byoung-Chan

    2016-09-01

    Sunxiuqinia dokdonensis DH1(T) was isolated from deep sub-seafloor sediment at a depth of 900 m below the seafloor off Seo-do (the west part of Dokdo Island) in the East Sea of the Republic of Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession LGIA00000000.

  11. Whole-genome sequence of Sunxiuqinia dokdonensis DH1(T), isolated from deep sub-seafloor sediment in Dokdo Island.

    PubMed

    Lim, Sooyeon; Chang, Dong-Ho; Kim, Byoung-Chan

    2016-09-01

    Sunxiuqinia dokdonensis DH1(T) was isolated from deep sub-seafloor sediment at a depth of 900 m below the seafloor off Seo-do (the west part of Dokdo Island) in the East Sea of the Republic of Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession LGIA00000000. PMID:27437183

  12. Cloning and nucleotide sequence of the gene coding for citrate synthase from a thermotolerant Bacillus sp

    SciTech Connect

    Schendel, F.J.; August, P.R.; Anderson, C.R.; Flickinger, M.C.; Hanson, R.S.

    1992-01-01

    Acetate salts are emerging as potentially attractive bulk chemicals for a variety of environmental applications, for example, as catalysts to facilitate combustion of high-sulfur coal by electrical utilities and as the biodegradable noncorrosive highway deicing salt calcium magnesium acetate. The structural gene coding for citrate synthase from the gram-positive soil isolate Bacillus sp. strain C4 (ATCC 55182) capable of secreting acetic acid at pH 5.0 to 7.0 in the presence of dolime has been cloned from a genomic library by complementation of an Escherichia coli auxotrophic mutant lacking citrate synthase. The nucleotide sequence of the entire 3.1-kb HindIII fragment has been determined, and one major open reading frame was found coding for citrate synthase (ctsA). Citrate synthase from Bacillus sp. strain C4 was found to be a dimer (M_r, 84,500) with a subunit with an M_r of 42,000. The N-terminal sequence was found to be identical with that predicted from the gene sequence. The kinetics were best fit to a bisubstrate enzyme with an ordered mechanism. Bacillus sp. strain C4 citrate synthase was not activated by potassium chloride and was not inhibited by NADH, ATP, ADP, or AMP at levels up to 1 mM. The predicted amino acid sequence was compared with that of the E. coli, Acinetobacter anitratum, Pseudomonas aeruginosa, Rickettsia prowazekii, porcine heart, and Saccharomyces cerevisiae cytoplasmic and mitochondrial enzymes.

  13. The qa repressor gene of Neurospora crassa: wild-type and mutant nucleotide sequences.

    PubMed Central

    Huiet, L; Giles, N H

    1986-01-01

    The qa-1S gene, one of two regulatory genes in the qa gene cluster of Neurospora crassa, encodes the qa repressor. The qa-1S gene together with the qa-1F gene, which encodes the qa activator protein, control the expression of all seven qa genes, including those encoding the inducible enzymes responsible for the utilization of quinic acid as a carbon source. The nucleotide sequence of the qa-1S gene and its flanking regions has been determined. The deduced coding sequence for the qa-1S protein encodes 918 amino acids with a calculated molecular weight of 100,650 and is interrupted by a single 66-base-pair intervening sequence. Both constitutive and noninducible mutants occur in the qa-1S gene and two different mutations of each type have been cloned and sequenced. All four mutations occur within the predicted coding region of the qa-1S gene. This result strongly supports the hypothesis that the qa-1S gene encodes a repressor. All four mutations are located within codons for the last 300 amino acids of the qa-1S protein. The mutations in three of the mutants involve amino acid substitutions, while the fourth mutant, which has a constitutive phenotype, contains a frameshift mutation. The two constitutive mutations occur in the most distal region of the gene, possibly implicating the COOH-terminal region of the qa repressor in binding to its target. The two noninducible mutations occur in a region proximal to the constitutive mutations, possibly implicating this region of the qa repressor in binding the inducer. PMID:3010294

  14. Human secreted carbonic anhydrase: cDNA cloning, nucleotide sequence, and hybridization histochemistry

    SciTech Connect

    Aldred, P.; Fu, Ping; Barrett, G.; Penschow, J.D.; Wright, R.D.; Coghlan, J.P.; Fernley, R.T.

    1991-01-01

    Complementary DNA clones coding for the human secreted carbonic anhydrase isozyme (CA VI) have been isolated and their nucleotide sequences determined. These clones identify a 1.45-kb mRNA that is present in high levels in parotid and submandibular salivary glands but absent in other tissues such as the sublingual gland, kidney, liver, and prostate gland. Hybridization histochemistry of human salivary glands shows mRNA for CA VI located in the acinar cells of these glands. The cDNA clones encode a protein of 308 amino acids that includes a 17 amino acid leader sequence typical of secreted proteins. The mature protein has 291 amino acids compared to 259 or 260 for the cytoplasmic isozymes, with most of the extra amino acids present as a carboxyl terminal extension. In comparison, sheep CA VI has a 45 amino acid extension. Overall the human CA VI protein has a sequence identity of 35% with human CA II, while residues involved in the active site of the enzymes have been conserved. The human and sheep secreted carbonic anhydrases have a sequence identity of 72%. This includes the two cysteine residues that are known to be involved in an intramolecular disulfide bond in the sheep CA VI. The enzyme is known to be glycosylated and three potential N-glycosylation sites (Asn-X-Thr/Ser) have been identified. Two of these are known to be glycosylated in sheep CA VI. Southern analysis of human DNA indicates that there is only one gene coding for CA VI.

  15. Nucleotide sequence of an immediate-early frog virus 3 gene.

    PubMed

    Willis, D; Foglesong, D; Granoff, A

    1984-12-01

    We have used "gene walking" with synthetic oligonucleotides and M13 dideoxynucleotide sequencing techniques to obtain the complete coding and flanking sequences of the gene encoding a major immediate-early RNA (molecular weight, 169,000) of frog virus 3. R-loop mapping of the cloned XbaI K fragment of frog virus 3 DNA with immediate-early RNA from infected cells showed that an RNA of approximately 500 to 600 nucleotides (the right size to code for the immediate-early viral 18-kilodalton protein of unknown function) hybridized to a region within 100 base pairs of one end of the XbaI K fragment; no evidence for splicing was observed in the electron microscope or by single-strand nuclease analysis. Further restriction mapping narrowed the location of the gene to the XbaI end of a 2-kilobase-pair XbaI-BglII fragment, which was bidirectionally subcloned into the bacteriophage pair mp10 and mp11 for sequencing. Mung bean nuclease mapping was used to identify both the 5' and the 3' ends of the mRNA. The 5' end mapped within an AT-rich region 19 base pairs upstream from two in-phase AUG start codons that were immediately followed by an open reading frame of 157 amino acids. Another AT-rich sequence was found at -29 base pairs from the 5' end of the mRNA start site; this sequence may function as a TATA box. The 3' end of the message displayed considerable microheterogeneity, but clearly terminated within a third AT-rich region 50 to 60 base pairs from the translation stop codon. The eucaryotic polyadenylic acid addition signal (AATAAA) was not present, a finding to be expected since frog virus 3 mRNA is not polyadenylated. Both the single-stranded mp10 clone of the XbaI-BglII fragment and a 15-base oligonucleotide complementary to the region flanking the two AUG translation start codons inhibited translation of the immediate-early 18-kilodalton protein in vitro, confirming the identity of the sequenced gene. As the regulatory sequences of this gene did not resemble those of

  16. Complete nucleotide sequence of a plant tumor-inducing Ti plasmid.

    PubMed

    Suzuki, K; Hattori, Y; Uraji, M; Ohta, N; Iwata, K; Murata, K; Kato, A; Yoshida, K

    2000-01-25

    Crown gall tumor disease in dicot plants is caused by Agrobacterium tumefaciens harboring a giant tumor-inducing (Ti) plasmid. Here, for the first time among agrobacterial plasmids, the nucleotide sequence of a typical nopaline-type Ti plasmid (pTi-SAKURA) was determined completely. In total, 195 open reading frames (ORFs) were estimated in the 206479 bp long sequence. 20 genes for conjugation, three for replication, 22 for pathogenesis and 37 for genetic colonization of host plants were found within two-thirds of the plasmid. These genes formed seven functional gene clusters with narrow inter-cluster spaces. In the remaining one-third of the plasmid, novel genes including homologs of mutT, Rhizobium nodQ and Sphingomonas ligE genes were found, which are likely to be responsible for the broad host range. Restriction fragment length variation indicates extreme plasticity of the part required for conjugational gene transfer and the above-mentioned one-third of the plasmid, even among closely related Ti plasmids. PMID:10721727

  17. Complete Nucleotide Sequence Analysis of the Norovirus GII.4 Sydney Variant in South Korea

    PubMed Central

    Park, Ji-Sun; Lee, Sung-Geun; Cho, Han-Gil; Jheong, Weon-Hwa; Paik, Soon-Young

    2015-01-01

    Norovirus is the primary cause of acute gastroenteritis in individuals of all ages. In Australia, a new strain of norovirus (GII.4) was identified in March 2012, and this strain has spread rapidly around the world. In August 2012, this new GII.4 strain was identified in patients in South Korea. Therefore, to examine the characteristics of the epidemic norovirus GII.4 2012 variant in South Korea, we conducted KM272334 full-length genomic analysis. The genome of the gg-12-08-04 strain consisted of 7,558 bp and contained three open reading frame (ORF) composites throughout the whole genome: ORF1 (5,100 bp), ORF2 (1,623 bp), and ORF3 (807 bp). Phylogenetic analyses showed that gg-12-08-04 belonged to the GII.4 Sydney 2012 variant, sharing 98.92% nucleotide similarity with this variant strain. According to SimPlot analysis, the gg-12-08-04 strain was a recombinant strain with breakpoint at the ORF1/2 junction between Osaka 2007 and Apeldoorn 2008 strains. This study is the first report of the complete sequence of the GII.4 Sydney 2012 strain in South Korea. Therefore, this may represent the standard sequence of the norovirus GII.4 2012 variant in South Korea and could therefore be useful for the development of norovirus vaccines. PMID:25688356
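    A figure such as "98.92% nucleotide similarity" is typically derived from an alignment of the two genomes. The sketch below shows one simple way to compute it, treating similarity as plain percent identity over aligned, gap-free columns. The function name and the toy sequences are hypothetical, and the abstract does not say which tool or definition the authors used.

```python
def percent_identity(a, b):
    """Percent of identical bases between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped, which is
    one common convention; other tools count gaps differently.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no gap-free columns to compare")
    matches = sum(x == y for x, y in compared)
    return 100.0 * matches / len(compared)


# Toy aligned sequences (hypothetical): 9 of 10 positions match.
identity = percent_identity("ATGGCGTACG", "ATGGCATACG")
print(f"{identity:.2f}%")
```

    Under this definition, and assuming the comparison covered all 7,558 positions of the genome, 98.92% identity would correspond to roughly 82 differing sites.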

  18. Nucleotide sequence and structural organization of the human vasopressin pituitary receptor (V3) gene.

    PubMed

    René, P; Lenne, F; Ventura, M A; Bertagna, X; de Keyzer, Y

    2000-01-01

    In the pituitary, vasopressin triggers ACTH release through a specific receptor subtype, termed V3 or V1b. We cloned the V3 cDNA and showed that its expression was almost exclusive to pituitary corticotrophs and some corticotroph tumors. To study the determinants of this tissue specificity, we have now cloned the gene for the human (h) V3 receptor and characterized its structure. It is composed of two exons, spanning 10 kb, with the coding region interrupted between transmembrane domains 6 and 7. We established that the transcription initiation site is located 498 nucleotides upstream of the initiator codon and showed that two polyadenylation sites may be used, the most downstream being the most frequent. Sequence analysis of the promoter region showed no TATA box but identified consensus binding motifs for Sp1, CREB, and half sites of the estrogen receptor binding site. However, comparison with another corticotroph-specific gene, proopiomelanocortin, did not identify common regulatory elements in the two promoters except for a short GC-rich region. Unexpectedly, hV3 gene analysis revealed that a formerly cloned 'artifactual' hV3 cDNA indeed corresponded to a spliced antisense transcript, overlapping the 5' part of the coding sequence in exon 1 and the promoter region. This transcript, hV3rev, was detected in normal pituitary and in many corticotroph tumors expressing hV3 sense mRNA and may therefore play a role in hV3 gene expression.

  19. Predicting Mendelian Disease-Causing Non-Synonymous Single Nucleotide Variants in Exome Sequencing Studies

    PubMed Central

    Bao, Su-Ying; Yang, Wanling; Ho, Shu-Leong; Song, Yong-Qiang; Sham, Pak C.

    2013-01-01

    Exome sequencing is becoming a standard tool for mapping Mendelian disease-causing (or pathogenic) non-synonymous single nucleotide variants (nsSNVs). Minor allele frequency (MAF) filtering approach and functional prediction methods are commonly used to identify candidate pathogenic mutations in these studies. Combining multiple functional prediction methods may increase accuracy in prediction. Here, we propose to use a logit model to combine multiple prediction methods and compute an unbiased probability of a rare variant being pathogenic. Also, for the first time we assess the predictive power of seven prediction methods (including SIFT, PolyPhen2, CONDEL, and logit) in predicting pathogenic nsSNVs from other rare variants, which reflects the situation after MAF filtering is done in exome-sequencing studies. We found that a logit model combining all or some original prediction methods outperforms other methods examined, but is unable to discriminate between autosomal dominant and autosomal recessive disease mutations. Finally, based on the predictions of the logit model, we estimate that an individual has around 5% of rare nsSNVs that are pathogenic and carries ∼22 pathogenic derived alleles at least, which if made homozygous by consanguineous marriages may lead to recessive diseases. PMID:23341771
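    The idea of combining several prediction scores through a logit model can be sketched as below. This is only an illustration of the general technique: the coefficients, intercept, and example scores are hypothetical placeholders, not values from the study, and in practice the coefficients would be fitted on variants with known labels. The sign convention assumed here is that low SIFT scores and high PolyPhen2 or CONDEL scores point toward a damaging variant.

```python
from math import exp


def logit_combine(scores, weights, intercept):
    """Combine per-method scores into one probability via a logistic model.

    scores    : dict mapping method name -> score for one variant
    weights   : dict mapping method name -> coefficient (hypothetical here)
    intercept : model intercept (hypothetical here)
    """
    z = intercept + sum(weights[name] * scores[name] for name in weights)
    return 1.0 / (1.0 + exp(-z))


# Hypothetical coefficients, chosen only to make the example run.
weights = {"SIFT": -2.0, "PolyPhen2": 2.5, "CONDEL": 1.5}
intercept = -1.0

# Two made-up variants: one scored as damaging by all methods, one as tolerated.
damaging_like = {"SIFT": 0.01, "PolyPhen2": 0.98, "CONDEL": 0.85}
tolerated_like = {"SIFT": 0.60, "PolyPhen2": 0.05, "CONDEL": 0.20}

p_damaging = logit_combine(damaging_like, weights, intercept)
p_tolerated = logit_combine(tolerated_like, weights, intercept)
print(f"{p_damaging:.3f} {p_tolerated:.3f}")
```

    A single combined probability makes it possible to rank all rare variants of an individual on one scale instead of reconciling several separate predictions.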

  20. A single-nucleotide substitution mutator phenotype revealed by exome sequencing of human colon adenomas.

    PubMed

    Nikolaev, Sergey I; Sotiriou, Sotirios K; Pateras, Ioannis S; Santoni, Federico; Sougioultzis, Stavros; Edgren, Henrik; Almusa, Henrikki; Robyr, Daniel; Guipponi, Michel; Saarela, Janna; Gorgoulis, Vassilis G; Antonarakis, Stylianos E; Halazonetis, Thanos D

    2012-12-01

    Oncogene-induced DNA replication stress is thought to drive genomic instability in cancer. In particular, replication stress can explain the high prevalence of focal genomic deletions mapping within very large genes in human tumors. However, the origin of single-nucleotide substitutions (SNS) in nonfamilial cancers is strongly debated. Some argue that cancers have a mutator phenotype, whereas others argue that the normal DNA replication error rates are sufficient to explain the number of observed SNSs. Here, we sequenced the exomes of 24, mostly precancerous, colon polyps. Analysis of the sequences revealed mutations in the APC, CTNNB1, and BRAF genes as the presumptive cancer-initiating events and many passenger SNSs. We used the number of SNSs in the various lesions to calculate mutation rates for normal colon and adenomas and found that colon adenomas exhibit a mutator phenotype. Interestingly, the SNSs in the adenomas mapped more often than expected within very large genes, where focal deletions in response to DNA replication stress also map. We propose that single-stranded DNA generated in response to oncogene-induced replication stress compromises the repair of deaminated cytosines and other damaged bases, leading to the observed SNS mutator phenotype.

  1. Nucleotide sequences and mutational analysis of the structural genes for nitrogenase 2 of Azotobacter vinelandii.

    PubMed Central

    Joerger, R D; Loveless, T M; Pau, R N; Mitchenall, L A; Simon, B H; Bishop, P E

    1990-01-01

    The nucleotide sequence (6,559 base pairs) of the genomic region containing the structural genes for nitrogenase 2 (V nitrogenase) from Azotobacter vinelandii was determined. The open reading frames present in this region are organized into two transcriptional units. One contains vnfH (encoding dinitrogenase reductase 2) and a ferredoxinlike open reading frame (Fd). The second one includes vnfD (encoding the alpha subunit of dinitrogenase 2), vnfG (encoding a product similar to the delta subunit of dinitrogenase 2 from A. chroococcum), and vnfK (encoding the beta subunit of dinitrogenase 2). The 5'-flanking regions of vnfH and vnfD contain sequences similar to ntrA-dependent promoters. This gene arrangement allows independent expression of vnfH-Fd and vnfDGK. Mutant strains (CA80 and CA11.80) carrying an insertion in vnfH are still able to synthesize the alpha and beta subunits of dinitrogenase 2 when grown in N-free, Mo-deficient, V-containing medium. A strain (RP1.11) carrying a deletion-plus-insertion mutation in the vnfDGK region produced only dinitrogenase reductase 2. PMID:2345152

  2. Nucleotide sequences and mutational analysis of the structural genes for nitrogenase 2 of Azotobacter vinelandii.

    PubMed

    Joerger, R D; Loveless, T M; Pau, R N; Mitchenall, L A; Simon, B H; Bishop, P E

    1990-06-01

    The nucleotide sequence (6,559 base pairs) of the genomic region containing the structural genes for nitrogenase 2 (V nitrogenase) from Azotobacter vinelandii was determined. The open reading frames present in this region are organized into two transcriptional units. One contains vnfH (encoding dinitrogenase reductase 2) and a ferredoxinlike open reading frame (Fd). The second one includes vnfD (encoding the alpha subunit of dinitrogenase 2), vnfG (encoding a product similar to the delta subunit of dinitrogenase 2 from A. chroococcum), and vnfK (encoding the beta subunit of dinitrogenase 2). The 5'-flanking regions of vnfH and vnfD contain sequences similar to ntrA-dependent promoters. This gene arrangement allows independent expression of vnfH-Fd and vnfDGK. Mutant strains (CA80 and CA11.80) carrying an insertion in vnfH are still able to synthesize the alpha and beta subunits of dinitrogenase 2 when grown in N-free, Mo-deficient, V-containing medium. A strain (RP1.11) carrying a deletion-plus-insertion mutation in the vnfDGK region produced only dinitrogenase reductase 2.

  3. Nucleotide sequence and phylogenetic analysis of a new potexvirus: Malva mosaic virus.

    PubMed

    Côté, Fabien; Paré, Christine; Majeau, Nathalie; Bolduc, Marilène; Leblanc, Eric; Bergeron, Michel G; Bernardy, Michael G; Leclerc, Denis

    2008-01-01

    A filamentous virus isolated from Malva neglecta Wallr. (common mallow) and propagated in Chenopodium quinoa was grown and cloned, and its complete nucleotide sequence was determined (GenBank accession # DQ660333). The genomic RNA is 6858 nt in length and contains five major open reading frames (ORFs). The genomic organization is similar to that of members of the Potexvirus genus in the Flexiviridae family, and the virus-encoded proteins share homology with those of this group. Phylogenetic analysis revealed a close relationship with narcissus mosaic virus (NMV), scallion virus X (ScaVX) and, to a lesser extent, to Alstroemeria virus X (AlsVX) and pepino mosaic virus (PepMV). A novel putative pseudoknot structure is predicted in the 3'-UTR of a subgroup of potexviruses, including this newly described virus. The consensus GAAAA sequence is detected at the 5'-end of the genomic RNA and experimental data strongly suggest that this motif could be a distinctive hallmark of this genus. The name Malva mosaic virus is proposed. PMID:18054524

  4. Complete nucleotide sequence of rose yellow leaf virus, a new member of the family Tombusviridae.

    PubMed

    Mollov, Dimitre; Lockhart, Ben; Zlesak, David C

    2014-10-01

    The genome of the rose yellow leaf virus (RYLV) has been determined to be 3918 nucleotides long and to contain seven open reading frames (ORFs). ORF1 encodes a 27-kDa peptide (p27). ORF2 shares a common start codon with ORF1 and continues through the amber stop codon of p27 to encode an 87-kDa (p87) protein that has amino acid similarity to the RNA-dependent RNA polymerase (RdRp) of members of the family Tombusviridae. ORFs 3 and 4 have no significant amino acid similarity to known functional viral ORFs. ORF5 encodes a 6-kDa (p6) protein that has similarity to movement proteins of members of the Tombusviridae. ORF5A has no conventional start codon and overlaps with p6. A putative +1 frameshift mechanism allows p6 translation to continue through the stop codon and results in a 12-kDa protein that has high homology to the carmovirus p13 movement protein. The 37-kDa protein encoded by ORF6 has amino acid sequence similarity to coat proteins (CP) of members of the Tombusviridae. ORF7 has no significant amino acid similarity to known viral ORFs. Phylogenetic analysis of the RdRp amino acid sequences grouped RYLV together with the unclassified Rosa rugosa leaf distortion virus (RrLDV), pelargonium line pattern virus (PLPV), and pelargonium chlorotic ring pattern virus (PCRPV) in a distinct subgroup of the family Tombusviridae. PMID:24838852

  5. Predicting mendelian disease-causing non-synonymous single nucleotide variants in exome sequencing studies.

    PubMed

    Li, Miao-Xin; Kwan, Johnny S H; Bao, Su-Ying; Yang, Wanling; Ho, Shu-Leong; Song, Yong-Qiang; Sham, Pak C

    2013-01-01

    Exome sequencing is becoming a standard tool for mapping Mendelian disease-causing (or pathogenic) non-synonymous single nucleotide variants (nsSNVs). Minor allele frequency (MAF) filtering approach and functional prediction methods are commonly used to identify candidate pathogenic mutations in these studies. Combining multiple functional prediction methods may increase accuracy in prediction. Here, we propose to use a logit model to combine multiple prediction methods and compute an unbiased probability of a rare variant being pathogenic. Also, for the first time we assess the predictive power of seven prediction methods (including SIFT, PolyPhen2, CONDEL, and logit) in predicting pathogenic nsSNVs from other rare variants, which reflects the situation after MAF filtering is done in exome-sequencing studies. We found that a logit model combining all or some original prediction methods outperforms other methods examined, but is unable to discriminate between autosomal dominant and autosomal recessive disease mutations. Finally, based on the predictions of the logit model, we estimate that an individual has around 5% of rare nsSNVs that are pathogenic and carries ~22 pathogenic derived alleles at least, which if made homozygous by consanguineous marriages may lead to recessive diseases. PMID:23341771

  6. Mutations in core nucleotide sequence of hepatitis B virus correlate with fulminant and severe hepatitis.

    PubMed Central

    Ehata, T; Omata, M; Chuang, W L; Yokosuka, O; Ito, Y; Hosoda, K; Ohto, M

    1993-01-01

    Infection with hepatitis B virus leads to a wide spectrum of liver injury, including self-limited acute hepatitis, fulminant hepatitis, and chronic hepatitis with progression to cirrhosis or acute exacerbation to liver failure, as well as an asymptomatic chronic carrier state. Several studies have suggested that the hepatitis B core antigen could be an immunological target of cytotoxic T lymphocytes. To investigate the reason why the extreme immunological attack occurred in fulminant hepatitis and severe exacerbation patients, the entire precore and core region of hepatitis B virus DNA was sequenced in 24 subjects (5 fulminant, 10 severe fatal exacerbation, and 9 self-limited acute hepatitis patients). No significant change in the nucleotide sequence and deduced amino acid residue was noted in the nine self-limited acute hepatitis patients. In contrast, clustering changes in a small segment of 16 amino acids (codon 84-99 from the start of the core gene) were found in all seven adr subtype infected fulminant and severe exacerbation patients. A different segment with clustering substitutions (codon 48-60) was also found in seven of eight adw subtype infected fulminant and severe exacerbation patients. Of the 15 patients, 2 lacked the precore stop mutation, which was previously reported to be associated with fulminant hepatitis. These data suggest that these core regions with mutations may play an important role in the pathogenesis of hepatitis B viral disease, and such mutations are related to severe liver damage. PMID:8450049

  7. Complete nucleotide sequence analysis of the norovirus GII.4 Sydney variant in South Korea.

    PubMed

    Park, Ji-Sun; Lee, Sung-Geun; Jin, Ji-Young; Cho, Han-Gil; Jheong, Weon-Hwa; Paik, Soon-Young

    2015-01-01

    Norovirus is the primary cause of acute gastroenteritis in individuals of all ages. In Australia, a new strain of norovirus (GII.4) was identified in March 2012, and this strain has spread rapidly around the world. In August 2012, this new GII.4 strain was identified in patients in South Korea. Therefore, to examine the characteristics of the epidemic norovirus GII.4 2012 variant in South Korea, we conducted KM272334 full-length genomic analysis. The genome of the gg-12-08-04 strain consisted of 7,558 bp and contained three open reading frame (ORF) composites throughout the whole genome: ORF1 (5,100 bp), ORF2 (1,623 bp), and ORF3 (807 bp). Phylogenetic analyses showed that gg-12-08-04 belonged to the GII.4 Sydney 2012 variant, sharing 98.92% nucleotide similarity with this variant strain. According to SimPlot analysis, the gg-12-08-04 strain was a recombinant strain with breakpoint at the ORF1/2 junction between Osaka 2007 and Apeldoorn 2008 strains. This study is the first report of the complete sequence of the GII.4 Sydney 2012 strain in South Korea. Therefore, this may represent the standard sequence of the norovirus GII.4 2012 variant in South Korea and could therefore be useful for the development of norovirus vaccines.

  8. Cloning and nucleotide sequence of the hemA gene of Agrobacterium radiobacter.

    PubMed

    Drolet, M; Sasarman, A

    1991-04-01

    The hemA gene of Agrobacterium radiobacter ATCC4718 was identified by hybridization with a hemA probe from Rhizobium meliloti and cloned by complementation of a hemA mutant of Escherichia coli K12. E. coli hemA transformants carrying the hemA gene of Agrobacterium showed delta-aminolevulinic acid synthetase (delta-ALAS) activity in vitro. The hemA gene was carried on a 4.4 kb EcoRI fragment which could be reduced to a 2.6 kb EcoRI-SstI fragment without affecting its complementing or delta-ALAS activity. The sequence of the hemA gene showed an open reading frame of 1215 nucleotides, which could code for a protein of 44,361 Da. This is very close to the molecular weight of the HemA protein obtained using an in vitro coupled transcription-translation system (45,000 Da). Comparison of amino acid sequences of the delta-ALAS of A. radiobacter and Bradyrhizobium japonicum showed strong homology between the two enzymes; less, but still significant, homology was observed when A. radiobacter and human delta-ALAS were compared. Primer extension experiments enabled us to identify two promoters for the hemA gene of A. radiobacter. One of these promoters shows some similarity to the first promoter of the hemA gene of R. meliloti.

  9. Characterization of Sri Lanka rabies virus isolates using nucleotide sequence analysis of nucleoprotein gene.

    PubMed

    Arai, Y T; Takahashi, H; Kameoka, Y; Shiino, T; Wimalaratne, O; Lodmell, D L

    2001-01-01

    Thirty-four suspected rabid brain samples from 2 humans, 24 dogs, 4 cats, 2 mongooses, 1 jackal and 1 water buffalo were collected in 1995-1996 in Sri Lanka. Total RNA was extracted directly from brain suspensions and examined using a one-step reverse transcription-polymerase chain reaction (RT-PCR) for the rabies virus nucleoprotein (N) gene. Twenty-eight samples were found positive for the virus N gene by RT-PCR and also for the virus antigens by fluorescent antibody (FA) test. Rabies virus isolates obtained from different animal species in different regions of Sri Lanka were genetically homogenous. Sequences of 203 nucleotides (nt)-long RT-PCR products obtained from 16 of 27 samples were found identical. Sequences of 1350 nt of N genes of 14 RT-PCR products were determined. The Sri Lanka isolates under study formed a specific cluster that included also an earlier isolate from India but did not include the known isolates from China, Thailand, Malaysia, Israel, Iran, Oman, Saudi Arabia, Russia, Nepal, Philippines, Japan and from several other countries. These results suggest that one type of rabies virus is circulating among human, dog, cat, mongoose, jackal and water buffalo living near Colombo City and in other five remote regions in Sri Lanka.

  10. Escherichia coli gene purR encoding a repressor protein for purine nucleotide synthesis. Cloning, nucleotide sequence, and interaction with the purF operator.

    PubMed

    Rolfes, R J; Zalkin, H

    1988-12-25

    The Escherichia coli gene purR, encoding a repressor protein, was cloned by complementation of a purR mutation. Gene purR on a multicopy plasmid repressed expression of purF and purF-lacZ and reduced the growth rate of host cells by limiting the rate of de novo purine nucleotide synthesis. The level of a 1.3-kilobase purR mRNA was higher in cells grown with excess adenine, suggesting that synthesis of the repressor may be regulated. The chromosomal locus of purR was mapped to coordinate 1755-kb on the E. coli restriction map (Kohara, Y., Akiyama, K., and Isono, K. (1987) Cell 50, 495-508). Pur repressor bound specifically to purF operator DNA as determined by gel retardation and DNase I footprinting assays. The amino acid sequence of Pur repressor was derived from the nucleotide sequence. Pur repressor subunit contains 341 amino acids and has a calculated Mr of 38,179. Pur repressor is 31-35% identical with the galR and cytR repressors and 26% identical with the lacI repressor. These four repressors are likely homologous. Amino acid sequence similarity is greatest in an amino-terminal region presumed to contain a DNA-binding domain. A similarity is also noted in the operator sites for these repressors.

  11. Nucleotide sequences of genome segments S6, S7 and S10 of Dendrolimus punctatus cypovirus 1.

    PubMed

    Hong, J J; Duan, J L; Zhao, S L; Xu, H G; Peng, H Y

    2004-01-01

    The nucleotide sequences of genome segments S6, S7 and S10 of Dendrolimus punctatus cypovirus 1 Hunan I (DpCPV-HN(I)) and DpCPV-HN(I)-Se(3) (DpCPV-HN(I) passed three times in Spodoptera exigua) were determined. Segment S10 was 944 nucleotides in length and encoded a polyhedrin of 248 amino acids (28,439 Da). Only two nucleotide mutations were found between DpCPV-HN(I) S10 and DpCPV-HN(I)-Se3 S10, and the deduced amino acid sequences of the polyhedrin proteins were identical. Segment S7, 1,501 nucleotides, encoded a protein of 448 amino acids (approximately 50 kDa; p50). Thirty-one nucleotide mutations were found between DpCPV-HN(I) S7 and DpCPV-HN(I)-Se3 S7, but these resulted in only four amino acid changes. DpCPV-HN(I) S6 encoded a protein of 561 amino acids (63,688 Da; p64). The amino acid sequence of p64 had a high leucine content (10%), and contained a leucine zipper motif and one ATP/GTP-binding site motif.

  12. Cloning, mutagenesis, and nucleotide sequence of a siderophore biosynthetic gene (amoA) from Aeromonas hydrophila.

    PubMed Central

    Barghouthi, S; Payne, S M; Arceneaux, J E; Byers, B R

    1991-01-01

    Many isolates of the Aeromonas species produce amonabactin, a phenolate siderophore containing 2,3-dihydroxybenzoic acid (2,3-DHB). An amonabactin biosynthetic gene (amoA) was identified (in a Sau3A1 gene library of Aeromonas hydrophila 495A2 chromosomal DNA) by its complementation of the requirement of Escherichia coli SAB11 for exogenous 2,3-DHB to support siderophore (enterobactin) synthesis. The gene amoA was subcloned as a SalI-HindIII 3.4-kb DNA fragment into pSUP202, and the complete nucleotide sequence of amoA was determined. A putative iron-regulatory sequence resembling the Fur repressor protein-binding site overlapped a possible promoter region. A translational reading frame, beginning with valine and encoding 396 amino acids, was open for 1,188 bp. The C-terminal portion of the deduced amino acid sequence showed 58% identity and 79% similarity with the E. coli EntC protein (isochorismate synthetase), the first enzyme in the E. coli 2,3-DHB biosynthetic pathway, suggesting that amoA probably encodes a step in 2,3-DHB biosynthesis and is the A. hydrophila equivalent of the E. coli entC gene. An isogenic amonabactin-negative mutant, A. hydrophila SB22, was isolated after marker exchange mutagenesis with Tn5-inactivated amoA (amoA::Tn5). The mutant excreted neither 2,3-DHB nor amonabactin, was more sensitive than the wild-type to growth inhibition by iron restriction, and used amonabactin to overcome iron starvation. PMID:1830579

  13. ANTICALIgN: visualizing, editing and analyzing combined nucleotide and amino acid sequence alignments for combinatorial protein engineering.

    PubMed

    Jarasch, Alexander; Kopp, Melanie; Eggenstein, Evelyn; Richter, Antonia; Gebauer, Michaela; Skerra, Arne

    2016-07-01

    ANTICALIgN is interactive software developed to simultaneously visualize, analyze and modify alignments of DNA and/or protein sequences that arise during combinatorial protein engineering, design and selection. ANTICALIgN combines powerful functions known from currently available sequence analysis tools with unique features for protein engineering, in particular the possibility to display and manipulate nucleotide sequences and their translated amino acid sequences at the same time. ANTICALIgN offers both template-based multiple sequence alignment (MSA), using the unmutated protein as reference, and conventional global alignment, to compare sequences that share an evolutionary relationship. The application of similarity-based clustering algorithms facilitates the identification of duplicates or of conserved sequence features among a set of selected clones. Imported nucleotide sequences from DNA sequence analysis are automatically translated into the corresponding amino acid sequences and displayed, offering numerous options for selecting reading frames, highlighting of sequence features and graphical layout of the MSA. The MSA complexity can be reduced by hiding the conserved nucleotide and/or amino acid residues, thus putting emphasis on the relevant mutated positions. ANTICALIgN is also able to handle suppressed stop codons or even to incorporate non-natural amino acids into a coding sequence. We demonstrate crucial functions of ANTICALIgN in an example of Anticalins selected from a lipocalin random library against the fibronectin extradomain B (ED-B), an established marker of tumor vasculature. Apart from engineered protein scaffolds, ANTICALIgN provides a powerful tool in the area of antibody engineering and for directed enzyme evolution.
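    The automatic translation of nucleotide sequences in a chosen reading frame, as described above, can be sketched in a few lines. This is a generic illustration using the standard genetic code, not the tool's own implementation; the function name `translate` and its `frame` parameter are assumptions made for the example.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame=0):
    """Translate a DNA or RNA string in reading frame 0, 1 or 2.

    Codons containing characters other than A, C, G, T are reported as 'X';
    a trailing partial codon is ignored.
    """
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1 or 2")
    dna = seq.upper().replace("U", "T")
    protein = []
    for i in range(frame, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


frame0 = translate("ATGGCCTAA")           # ATG GCC TAA
frame1 = translate("ATGGCCTAA", frame=1)  # TGG CCT (trailing AA ignored)
```

    Handling suppressed stop codons or non-natural amino acids, which the abstract mentions, would need an adjustable codon table rather than the fixed one used here.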

  14. Complete Nucleotide Sequences and Genome Organization of Two Pepper Mild Mottle Virus Isolates from Capsicum annuum in South Korea.

    PubMed

    Choi, Seung-Kook; Choi, Gug-Seoun; Kwon, Sun-Jung; Yoon, Ju-Yeon

    2016-05-19

    The complete genome sequences of pepper mild mottle virus (PMMoV)-P2 and -P3 were determined by the Sanger sequencing method. Although PMMoV-P2 and PMMoV-P3 have different pathogenicity in some pepper cultivars, the complete genome sequences of PMMoV-P2 and -P3 are composed of 6,356 nucleotides (nt). In this study, we report the complete genome sequences and genome organization of PMMoV-P2 and -P3 isolates from pepper species in South Korea.

  15. Complete Nucleotide Sequences and Genome Organization of Two Pepper Mild Mottle Virus Isolates from Capsicum annuum in South Korea

    PubMed Central

    Choi, Seung-Kook; Choi, Gug-Seoun; Kwon, Sun-Jung

    2016-01-01

    The complete genome sequences of pepper mild mottle virus (PMMoV)-P2 and -P3 were determined by the Sanger sequencing method. Although PMMoV-P2 and PMMoV-P3 have different pathogenicity in some pepper cultivars, the complete genome sequences of PMMoV-P2 and -P3 are composed of 6,356 nucleotides (nt). In this study, we report the complete genome sequences and genome organization of PMMoV-P2 and -P3 isolates from pepper species in South Korea. PMID:27198033

  16. [Polymorphism of DNA nucleotide sequence as a source of enhancement of the discrimination potential of the STR-markers].

    PubMed

    Zemskova, E Yu; Timoshenko, T V; Leonov, S N; Ivanov, P L

    2016-01-01

    The objective of the present pilot investigation was to reveal and to study polymorphism of nucleotide sequence in the alleles of STR loci of human autosomal DNA with special reference to the role of this phenomenon as a source of the differences between homonymous allelic variants. The secondary objective was to evaluate the possibility of using the data thus obtained for the enhancement of the informative value of the forensic medical genotyping of STR loci by means of identification of single nucleotide polymorphisms (SNP) for the purpose of extending their allelic spectrum. The methodological basis of the study was constituted by the comprehensive amplified fragment length polymorphism (AFLP) analysis and amplified fragment sequence polymorphisms (AFSP) analysis of DNA with the use of the PLEX-ID™ analytical mass-spectrometry platform (Abbott Molecular, USA). The study has demonstrated that polymorphism of DNA nucleotide sequence can be regarded as the possible source of enhancement of the discriminating potential of STR markers. It means that the analysis of polymorphism of DNA nucleotide sequence for genotyping AFLP-type markers of chromosomal DNA can considerably increase the effectiveness of their application as individualizing markers for the purpose of molecular genetic expert analyses.

  17. A high-density simple sequence repeat and single nucleotide polymorphism genetic map of the tetraploid cotton genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Cotton genome complexity was investigated with a saturated molecular genetic map that combined several sets of microsatellites or simple sequence repeats (SSR) and the first major public set of single nucleotide polymorphism (SNP) markers in cotton genomes (Gossypium spp.), and that was constructed ...

  18. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... is DNA, RNA, or PRT (protein). If a nucleotide sequence contains both DNA and RNA fragments, the type shall be “DNA.” In addition, the combined DNA/RNA molecule shall be further described in the to feature... combined DNA/RNA” Name/Key Provide appropriate identifier for feature, preferably from WIPO Standard...

  19. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... is DNA, RNA, or PRT (protein). If a nucleotide sequence contains both DNA and RNA fragments, the type shall be “DNA.” In addition, the combined DNA/RNA molecule shall be further described in the to feature... combined DNA/RNA” Name/Key Provide appropriate identifier for feature, preferably from WIPO Standard...

  20. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... is DNA, RNA, or PRT (protein). If a nucleotide sequence contains both DNA and RNA fragments, the type shall be “DNA.” In addition, the combined DNA/RNA molecule shall be further described in the to feature... combined DNA/RNA” Name/Key Provide appropriate identifier for feature, preferably from WIPO Standard...

  1. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... is DNA, RNA, or PRT (protein). If a nucleotide sequence contains both DNA and RNA fragments, the type shall be “DNA.” In addition, the combined DNA/RNA molecule shall be further described in the to feature... combined DNA/RNA” Name/Key Provide appropriate identifier for feature, preferably from WIPO Standard...

  2. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... is DNA, RNA, or PRT (protein). If a nucleotide sequence contains both DNA and RNA fragments, the type shall be “DNA.” In addition, the combined DNA/RNA molecule shall be further described in the to feature... combined DNA/RNA” Name/Key Provide appropriate identifier for feature, preferably from WIPO Standard...

  3. [Polymorphism of DNA nucleotide sequence as a source of enhancement of the discrimination potential of the STR-markers].

    PubMed

    Zemskova, E Yu; Timoshenko, T V; Leonov, S N; Ivanov, P L

    2016-01-01

    The objective of the present pilot investigation was to reveal and to study polymorphism of nucleotide sequence in the alleles of STR loci of human autosomal DNA with special reference to the role of this phenomenon as a source of the differences between homonymous allelic variants. The secondary objective was to evaluate the possibility of using the data thus obtained for the enhancement of the informative value of the forensic medical genotyping of STR loci by means of identification of single nucleotide polymorphisms (SNP) for the purpose of extending their allelic spectrum. The methodological basis of the study was constituted by the comprehensive amplified fragment length polymorphism (AFLP) analysis and amplified fragment sequence polymorphisms (AFSP) analysis of DNA with the use of the PLEX-ID™ analytical mass-spectrometry platform (Abbott Molecular, USA). The study has demonstrated that polymorphism of DNA nucleotide sequence can be regarded as the possible source of enhancement of the discriminating potential of STR markers. It means that the analysis of polymorphism of DNA nucleotide sequence for genotyping AFLP-type markers of chromosomal DNA can considerably increase the effectiveness of their application as individualizing markers for the purpose of molecular genetic expert analyses. PMID:27500481

  4. Compilation of 5S rRNA and 5S rRNA gene sequences

    PubMed Central

    Specht, Thomas; Wolters, Jörn; Erdmann, Volker A.

    1990-01-01

    The BERLIN RNA DATABANK as of December 31, 1989, contains a total of 667 sequences of 5S rRNAs or their genes, which is an increase of 114 new sequence entries over the last compilation (1). It covers sequences from 44 archaebacteria, 267 eubacteria, 20 plastids, 6 mitochondria, 319 eukaryotes and 11 eukaryotic pseudogenes. The hardcopy shows only the list (Table 1) of those organisms whose sequences have been determined. The BERLIN RNA DATABANK uses the format of the EMBL Nucleotide Sequence Data Library complemented by a Sequence Alignment (SA) field including secondary structure information. PMID:1692116

  5. Molecular cloning and nucleotide sequence of a transforming gene detected by transfection of chicken B-cell lymphoma DNA

    NASA Astrophysics Data System (ADS)

    Goubin, Gerard; Goldman, Debra S.; Luce, Judith; Neiman, Paul E.; Cooper, Geoffrey M.

    1983-03-01

    A transforming gene detected by transfection of chicken B-cell lymphoma DNA has been isolated by molecular cloning. It is homologous to a conserved family of sequences present in normal chicken and human DNAs but is not related to transforming genes of acutely transforming retroviruses. The nucleotide sequence of the cloned transforming gene suggests that it encodes a protein that is partially homologous to the amino terminus of transferrin and related proteins although only about one tenth the size of transferrin.

  6. Single nucleotide polymorphisms in the IS900 sequence of Mycobacterium avium subsp. paratuberculosis are strain type specific.

    PubMed

    Castellanos, Elena; Aranaz, Alicia; de Juan, Lucia; Alvarez, Julio; Rodríguez, Sabrina; Romero, Beatriz; Bezos, Javier; Stevenson, Karen; Mateos, Ana; Domínguez, Lucas

    2009-07-01

    Insertion sequence IS900 is used as a target for the identification of Mycobacterium avium subsp. paratuberculosis. Previous reports have revealed single nucleotide polymorphisms within IS900. This study, which analyzed the IS900 sequences of a panel of isolates representing M. avium subsp. paratuberculosis strain types I, II, and III, revealed conserved type-specific polymorphisms that could be utilized as a tool for diagnostic and epidemiological purposes.

  7. A single nucleotide polymorphism and sequence analysis of CSN1S1 gene promoter region in Chinese Bos grunniens (yak).

    PubMed

    Bai, W L; Yin, R H; Dou, Q L; Yang, J C; Zhao, S J; Ma, Z J; Yin, R L; Luo, G B; Zhao, Z H

    2010-01-01

    The aim of this study was to investigate the polymorphism of the CSN1S1 gene promoter region in 4 Chinese yak breeds, and compare the yak CSN1S1 gene promoter region sequences with other ruminants. A Polymerase Chain Reaction-Single Strand Conformation Polymorphism protocol was developed for rapid genotyping of the yak CSN1S1 gene. One hundred fifty-eight animals from 4 Chinese yak breeds were genotyped at the CSN1S1 locus using the protocol developed. A single nucleotide polymorphism of the CSN1S1 gene promoter region has been identified in all yak breeds investigated. The polymorphism consists of a single nucleotide substitution G→A at position 386 of the CSN1S1 gene promoter region, resulting in two alleles named, respectively, G(386) and A(386), based on the nucleotide at position 386. The allele G(386) was found to be more common in the animals investigated. The corresponding nucleotide sequences in GenBank of yak (having the same nucleotides as allele G(386) in this study), bovine, water buffalo, sheep, and goat had similarity of 99.68%, 99.35%, 97.42%, 95.14%, and 94.19%, respectively, with the yak allele A(386).
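    Similarity percentages of the kind reported above are commonly computed as the share of matching positions between two aligned sequences. The sketch below shows that calculation on short made-up sequences; the function name and the example strings are hypothetical, and the abstract does not state which alignment method or similarity definition the authors used.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of identical positions between two equal-length aligned sequences.

    Gap characters are compared like any other symbol, so a gap opposite a
    base counts as a mismatch (a simplifying assumption for this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(aligned_a.upper(), aligned_b.upper()))
    return 100.0 * matches / len(aligned_a)


# Made-up 8-base fragments differing by a single G/A substitution.
identity = percent_identity("ACGTGGCA", "ACGTAGCA")
```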

  8. The Coding of Biological Information: From Nucleotide Sequence to Protein Recognition

    NASA Astrophysics Data System (ADS)

    Štambuk, Nikola

    The paper reviews the classic results of Swanson, Dayhoff, Grantham, Blalock and Root-Bernstein, which link genetic code nucleotide patterns to the protein structure, evolution and molecular recognition. Symbolic representation of the binary addresses defining particular nucleotide and amino acid properties is discussed, with consideration of: structure and metric of the code, direct correspondence between amino acid and nucleotide information, and molecular recognition of the interacting protein motifs coded by the complementary DNA and RNA strands.

  9. Isolation of a family of resistance gene analogue sequences of the nucleotide binding site (NBS) type from Lens species.

    PubMed

    Yaish, M W F; Sáenz de Miera, L E; Pérez de la Vega, M

    2004-08-01

    Most known plant disease-resistance genes (R genes) include in their encoded products domains such as a nucleotide-binding site (NBS) or leucine-rich repeats (LRRs). Sequences with unknown function, but encoding these conserved domains, have been defined as resistance gene analogues (RGAs). The conserved motifs within plant NBS domains make it possible to use degenerate primers and PCR to isolate RGAs. We used degenerate primers deduced from conserved motifs in the NBS domain of NBS-LRR resistance proteins to amplify genomic sequences from Lens species. Fragments from approximately 500-850 bp were obtained. The nucleotide sequence analysis of these fragments revealed 32 different RGA sequences in Lens species with a high similarity (up to 91%) to RGAs from other plants. The predicted amino acid sequences showed that lentil sequences contain all the conserved motifs (P-loop, kinase-2, kinase-3a, GLPL, and MHD) present in the majority of other known plant NBS-LRR resistance genes. Phylogenetic analyses grouped the Lens NBS sequences with the Toll and interleukin-1 receptor (TIR) subclass of NBS-LRR genes, as well as with RGA sequences isolated from other legume species. Using inverse PCR on one putative RGA of lentil, we were able to amplify the flanking regions of this sequence, which contained features found in R proteins.

  10. Structural organization, nucleotide sequence, and regulation of the Haemophilus influenzae rec-1+ gene.

    PubMed Central

    Zulty, J J; Barcak, G J

    1993-01-01

    The Haemophilus influenzae rec-1+ protein plays a central role in DNA metabolism, participating in general homologous recombination, recombinational (postreplication) DNA repair, and prophage induction. Although many H. influenzae rec-1 mutants have been phenotypically characterized, little is known about the rec-1+ gene at the molecular level. In this study, we present the genetic organization of the rec-1+ locus, the DNA sequence of rec-1+, and studies of the transcriptional regulation of rec-1+ during cellular assault by DNA-damaging agents and during the induction of competence for genetic transformation. Although little is known about promoter structure in H. influenzae, we identified a potential rec-1+ promoter that is identical in 11 of 12 positions to the bacterial sigma 70-dependent promoter consensus sequence. Results from a primer extension analysis revealed that the start site of rec-1+ transcription is centered 6 nucleotides downstream of this promoter. We identified potential DNA binding sites in the rec-1+ gene for LexA, integration host factor, and cyclic AMP receptor protein. We obtained evidence that at least one of the proposed cyclic AMP receptor protein binding sites is active in modulating rec-1+ transcription. This finding makes rec-1+ control circuitry novel among recA+ homologs. Two H. influenzae DNA uptake sequences that may function as a transcription termination signal were identified in inverted orientations at the end of the rec-1+ coding sequence. In addition, we report the first use of the Escherichia coli lacZ operon fusion technique in H. influenzae to study the transcriptional control of rec-1+. Our results indicate that rec-1+ is transcriptionally induced about threefold during DNA-damaging events. Furthermore, we show that rec-1+ can substitute for recA+ in E. coli to modulate SOS induction of dinB1 expression. Surprisingly, although 5% of the H. influenzae genome is in the form of single-stranded DNA during competence for

  11. Proteus mirabilis MR/P fimbrial operon: genetic organization, nucleotide sequence, and conditions for expression.

    PubMed Central

    Bahrani, F K; Mobley, H L

    1994-01-01

    Proteus mirabilis, an agent of urinary tract infection, expresses at least four fimbrial types. Among these are the MR/P (mannose-resistant/Proteus-like) fimbriae. MrpA, the structural subunit, is optimally expressed at 37 °C in Luria broth cultured statically for 48 h by each of seven strains examined. Genes encoding this fimbria were isolated, and the complete nucleotide sequence was determined. The mrp gene cluster encoded by 7,293 bp predicts eight polypeptides: MrpI (22,133 Da), MrpA (17,909 Da), MrpB (19,632 Da), MrpC (96,823 Da), MrpD (27,886 Da), MrpE (19,470 Da), MrpF (17,363 Da), and MrpG (13,169 Da). mrpI is upstream of the gene encoding the major structural subunit gene mrpA and is transcribed in the direction opposite to that of the rest of the operon. All predicted polypeptides share ≥25% amino acid identity with at least one other enteric fimbrial gene product encoded by the pap, fim, smf, fan, or mrk gene clusters. PMID:7910820

  12. Nucleotide sequence and mutational analysis of the vnfENX region of Azotobacter vinelandii.

    PubMed

    Wolfinger, E D; Bishop, P E

    1991-12-01

    The nucleotide sequence (3,600 bp) of a second copy of nifENX-like genes in Azotobacter vinelandii has been determined. These genes are located immediately downstream from vnfA and have been designated vnfENX. The vnfENX genes appear to be organized as a single transcriptional unit that is preceded by a potential RpoN-dependent promoter. While the nifEN genes are thought to be evolutionarily related to nifDK, the vnfEN genes appear to be more closely related to nifEN than to either nifDK, vnfDK, or anfDK. Mutant strains (CA47 and CA48) carrying insertions in vnfE and vnfN, respectively, are able to grow diazotrophically in molybdenum (Mo)-deficient medium containing vanadium (V) (Vnf+) and in medium lacking both Mo and V (Anf+). However, a double mutant (strain DJ42.48) which contains a nifEN deletion and an insertion in vnfE is unable to grow diazotrophically in Mo-sufficient medium or in Mo-deficient medium with or without V. This suggests that NifE and NifN substitute for VnfE and VnfN when the vnfEN genes are mutationally inactivated. AnfA is not required for the expression of a vnfN-lacZ transcriptional fusion, even though this fusion is expressed under Mo- and V-deficient diazotrophic growth conditions.

  13. Whole-genome sequencing identifies genomic heterogeneity at a nucleotide and chromosomal level in bladder cancer

    PubMed Central

    Morrison, Carl D.; Liu, Pengyuan; Woloszynska-Read, Anna; Zhang, Jianmin; Luo, Wei; Qin, Maochun; Bshara, Wiam; Conroy, Jeffrey M.; Sabatini, Linda; Vedell, Peter; Xiong, Donghai; Liu, Song; Wang, Jianmin; Shen, He; Li, Yinwei; Omilian, Angela R.; Hill, Annette; Head, Karen; Guru, Khurshid; Kunnev, Dimiter; Leach, Robert; Eng, Kevin H.; Darlak, Christopher; Hoeflich, Christopher; Veeranki, Srividya; Glenn, Sean; You, Ming; Pruitt, Steven C.; Johnson, Candace S.; Trump, Donald L.

    2014-01-01

    Using complete genome analysis, we sequenced five bladder tumors accrued from patients with muscle-invasive transitional cell carcinoma of the urinary bladder (TCC-UB) and identified a spectrum of genomic aberrations. In three tumors, complex genotype changes were noted. All three had tumor protein p53 mutations and a relatively large number of single-nucleotide variants (SNVs; average of 11.2 per megabase), structural variants (SVs; average of 46), or both. This group was best characterized by chromothripsis and the presence of subclonal populations of neoplastic cells or intratumoral mutational heterogeneity. Here, we provide evidence that the process of chromothripsis in TCC-UB is mediated by nonhomologous end-joining using kilobase, rather than megabase, fragments of DNA, which we refer to as “stitchers,” to repair this process. We postulate that a potential unifying theme among tumors with the more complex genotype group is a defective replication–licensing complex. A second group (two bladder tumors) had no chromothripsis, and a simpler genotype, WT tumor protein p53, had relatively few SNVs (average of 5.9 per megabase) and only a single SV. There was no evidence of a subclonal population of neoplastic cells. In this group, we used a preclinical model of bladder carcinoma cell lines to study a unique SV (translocation and amplification) of the gene glutamate receptor ionotropic N-methyl D-aspartate as a potential new therapeutic target in bladder cancer. PMID:24469795

  14. Isolation and nucleotide sequencing of lactose carrier mutants that transport maltose.

    PubMed Central

    Brooker, R J; Wilson, T H

    1985-01-01

    The wild-type lactose carrier of Escherichia coli has a poor ability to transport the disaccharide maltose. However, it is possible to select lactose carrier mutants that have an enhanced ability to transport maltose by growing E. coli cells on maltose minimal plates in the presence of isopropyl thiogalactoside (an inducer of the lac operon). We have utilized this approach to isolate 18 independent lactose permease mutants that transport maltose. The relevant DNA sequences have been determined, and all of the mutations were found to be single base pair changes either at triplet 177 or at triplet 236. The nucleotide changes replace alanine-177 with valine or threonine, or tyrosine-236 with phenylalanine, asparagine, serine, or histidine. Transport experiments indicate that all of the mutants have faster maltose transport compared with the wild-type strain. Position 177 mutants retain the ability to transport galactosides, such as lactose and melibiose, at rates similar to the rate of the wild-type strain. In contrast, the position 236 mutants are markedly defective in the ability to transport galactosides. With regard to secondary structure, alanine-177 and tyrosine-236 are located on adjacent hydrophobic segments of the lactose carrier that are predicted to span the membrane. Thus, the results of this study indicate that the substrate recognition site of the lactose carrier is located within the plane of the lipid bilayer. In addition, a tertiary structure model is proposed that suggests how certain transmembrane segments might be localized relative to one another. PMID:3889919

  15. Whole-genome sequencing identifies genomic heterogeneity at a nucleotide and chromosomal level in bladder cancer.

    PubMed

    Morrison, Carl D; Liu, Pengyuan; Woloszynska-Read, Anna; Zhang, Jianmin; Luo, Wei; Qin, Maochun; Bshara, Wiam; Conroy, Jeffrey M; Sabatini, Linda; Vedell, Peter; Xiong, Donghai; Liu, Song; Wang, Jianmin; Shen, He; Li, Yinwei; Omilian, Angela R; Hill, Annette; Head, Karen; Guru, Khurshid; Kunnev, Dimiter; Leach, Robert; Eng, Kevin H; Darlak, Christopher; Hoeflich, Christopher; Veeranki, Srividya; Glenn, Sean; You, Ming; Pruitt, Steven C; Johnson, Candace S; Trump, Donald L

    2014-02-11

    Using complete genome analysis, we sequenced five bladder tumors accrued from patients with muscle-invasive transitional cell carcinoma of the urinary bladder (TCC-UB) and identified a spectrum of genomic aberrations. In three tumors, complex genotype changes were noted. All three had tumor protein p53 mutations and a relatively large number of single-nucleotide variants (SNVs; average of 11.2 per megabase), structural variants (SVs; average of 46), or both. This group was best characterized by chromothripsis and the presence of subclonal populations of neoplastic cells or intratumoral mutational heterogeneity. Here, we provide evidence that the process of chromothripsis in TCC-UB is mediated by nonhomologous end-joining using kilobase, rather than megabase, fragments of DNA, which we refer to as "stitchers," to repair this process. We postulate that a potential unifying theme among tumors with the more complex genotype group is a defective replication-licensing complex. A second group (two bladder tumors) had no chromothripsis, and a simpler genotype, WT tumor protein p53, had relatively few SNVs (average of 5.9 per megabase) and only a single SV. There was no evidence of a subclonal population of neoplastic cells. In this group, we used a preclinical model of bladder carcinoma cell lines to study a unique SV (translocation and amplification) of the gene glutamate receptor ionotropic N-methyl D-aspartate as a potential new therapeutic target in bladder cancer.

  16. Associations of single nucleotide polymorphisms in the Pygo2 coding sequence with idiopathic oligospermia and azoospermia.

    PubMed

    Ge, S-Q; Grifin, J; Liu, L-H; Aston, K I; Simon, L; Jenkins, T G; Emery, B R; Carrell, D T

    2015-08-07

    Male infertility is often associated with a decreased sperm count. The Pygo2 gene is expressed in the elongating spermatid during chromatin remodeling; thus impairment in PYGO2 function might lead to spermatogenic arrest, sperm count reduction, and subsequent infertility. The aim of this study was to identify mutations in Pygo2 that might lead to idiopathic oligospermia and azoospermia. DNA was isolated from venous blood from 77 men with normal fertility and 195 men with idiopathic oligospermia or azoospermia. Polymerase chain reaction-sequencing analysis was performed for the three Pygo2 coding regions. Non-synonymous single nucleotide polymorphisms (SNPs) were detected and analyzed using SIFT, Polyphen-2, and Mutation Taster software to identify possible changes in protein structure that could affect phenotype. Pygo2 sequencing was successful for 178 patients (30 with mild or moderate oligospermia, 57 with severe oligospermia, and 91 with azoospermia). Three previously reported non-synonymous SNPs were identified in patients with azoospermia or severe oligospermia but not in those with mild or moderate oligozoospermia or normozoospermia. SNPs rs61758740 (M141I) and rs141722381 (N240I) cause the replacement of one hydrophobic or hydrophilic amino acid, respectively, with another, and SNP rs61758741 (K261E) causes the replacement of a basic amino acid with an acidic one. The software predictions demonstrated that SNP rs141722381 would likely result in disrupted tertiary protein structure and thus could be involved in disease pathogenesis. Overall, this study demonstrated that SNPs in the coding region of Pygo2 might be one of the causative factors in idiopathic oligospermia and azoospermia, resulting in male infertility.

  17. Postzygotic single-nucleotide mosaicisms in whole-genome sequences of clinically unremarkable individuals

    PubMed Central

    Huang, August Y; Xu, Xiaojing; Ye, Adam Y; Wu, Qixi; Yan, Linlin; Zhao, Boxun; Yang, Xiaoxu; He, Yao; Wang, Sheng; Zhang, Zheng; Gu, Bowen; Zhao, Han-Qing; Wang, Meng; Gao, Hua; Gao, Ge; Zhang, Zhichao; Yang, Xiaoling; Wu, Xiru; Zhang, Yuehua; Wei, Liping

    2014-01-01

    Postzygotic single-nucleotide mutations (pSNMs) have been studied in cancer and a few other overgrowth human disorders at whole-genome scale and found to play critical roles. However, in clinically unremarkable individuals, pSNMs have never been identified at whole-genome scale largely due to technical difficulties and lack of matched control tissue samples, and thus the genome-wide characteristics of pSNMs remain unknown. We developed a new Bayesian-based mosaic genotyper and a series of effective error filters, using which we were able to identify 17 SNM sites from ∼80× whole-genome sequencing of peripheral blood DNAs from three clinically unremarkable adults. The pSNMs were thoroughly validated using pyrosequencing, Sanger sequencing of individual cloned fragments, and multiplex ligation-dependent probe amplification. The mutant allele fraction ranged from 5%-31%. We found that C→T and C→A were the predominant types of postzygotic mutations, similar to the somatic mutation profile in tumor tissues. Simulation data showed that the overall mutation rate was an order of magnitude lower than that in cancer. We detected varied allele fractions of the pSNMs among multiple samples obtained from the same individuals, including blood, saliva, hair follicle, buccal mucosa, urine, and semen samples, indicating that pSNMs could affect multiple sources of somatic cells as well as germ cells. Two of the adults have children who were diagnosed with Dravet syndrome. We identified two non-synonymous pSNMs in SCN1A, a causal gene for Dravet syndrome, from these two unrelated adults and found that the mutant alleles were transmitted to their children, highlighting the clinical importance of detecting pSNMs in genetic counseling. PMID:25312340

  18. Quantitative theory of entropic forces acting on constrained nucleotide sequences applied to viruses.

    PubMed

    Greenbaum, Benjamin D; Cocco, Simona; Levine, Arnold J; Monasson, Rémi

    2014-04-01

    We outline a theory to quantify the interplay of entropic and selective forces on nucleotide organization and apply it to the genomes of single-stranded RNA viruses. We quantify these forces as intensive variables that can easily be compared between sequences, outline a computationally efficient transfer-matrix method for their calculation, and apply this method to influenza and HIV viruses. We find viruses altering their dinucleotide motif use under selective forces, with these forces on CpG dinucleotides growing stronger in influenza the longer it replicates in humans. For a subset of genes in the human genome, many involved in antiviral innate immunity, the forces acting on CpG dinucleotides are even greater than the forces observed in viruses, suggesting that both effects are in response to similar selective forces involving the innate immune system. We further find that the dynamics of entropic forces balancing selective forces can be used to predict how long it will take a virus to adapt to a new host, and that it would take H1N1 several centuries to adapt to humans from birds, typically contributing many of its synonymous substitutions to the forcible removal of CpG dinucleotides. By examining the probability landscape of dinucleotide motifs, we predict where motifs are likely to appear using only a single-force parameter and uncover the localization of UpU motifs in HIV. Essentially, we extend the natural language and concepts of statistical physics, such as entropy and conjugated forces, to understanding viral sequences and, more generally, constrained genome evolution.
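
    The abstract above centres on CpG dinucleotide usage. The sketch below is not the transfer-matrix force calculation described there; it swaps in the much simpler and widely used observed/expected CpG ratio, (number of CG dinucleotides × sequence length) / (number of C × number of G), to illustrate what depletion of a dinucleotide motif looks like numerically. The function name and the treatment of U as T are assumptions made for illustration.

```python
from collections import Counter


def cpg_observed_expected(seq):
    """Observed/expected CpG ratio for a single nucleotide sequence.

    Expected count assumes C and G occur independently:
    expected = count(C) * count(G) / length.
    Returns NaN when the expected count is zero.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA input as well
    n = len(seq)
    counts = Counter(seq)
    observed = sum(1 for i in range(n - 1) if seq[i:i + 2] == "CG")
    expected = counts["C"] * counts["G"] / n if n else 0.0
    return observed / expected if expected else float("nan")


if __name__ == "__main__":
    # Toy sequences, not real viral genomes.
    print(cpg_observed_expected("CGCG"))
    print(cpg_observed_expected("CCGG"))
```

    A ratio well below 1 suggests the motif is under-represented relative to base composition alone; comparing the ratio across isolates sampled at different times would be one crude way to look at the trend the abstract describes for influenza.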

  19. Phylogenetic analysis of beta-papillomaviruses as inferred from nucleotide and amino acid sequence data.

    PubMed

    Gottschling, Marc; Köhler, Anja; Stockfleth, Eggert; Nindl, Ingo

    2007-01-01

    Human papillomaviruses (HPV) of the beta-group seem to be involved in the pathogenesis of non-melanoma skin cancer. Papillomaviruses are host specific and are considered closely co-evolving with their hosts. Evolutionary incongruence between early genes and late genes has been reported among oncogenic genital alpha-papillomaviruses and considerably challenges phylogenetic reconstructions. We investigated the relationships of 29 beta-HPV (25 types plus four putative new types, subtypes, or variants) as inferred from codon aligned and amino acid sequence data of the genes E1, E2, E6, E7, L1, and L2 using likelihood, distance, and parsimony approaches. An analysis of a L1 fragment included additional nucleotide and amino acid sequences from seven non-human beta-papillomaviruses. Early genes and late genes evolution did not conflict significantly in beta-papillomaviruses based on partition homogeneity tests (p ≥ 0.001). As inferred from the complete genome analyses, beta-papillomaviruses were monophyletic and segregated into four highly supported monophyletic assemblages corresponding to the species 1, 2, 3, and fused 4/5. They basically split into the species 1 and the remainder of beta-papillomaviruses, whose species 3, 4, and 5 constituted the sister group of species 2. beta-Papillomaviruses have been isolated from humans, apes, and monkeys, and phylogenetic analyses of the L1 fragment showed non-human papillomaviruses highly polyphyletic nesting within the HPV species. Thus, host and virus phylogenies were not congruent in beta-papillomaviruses, and multiple invasions across species borders may contribute (additionally to host-linked evolution) to their diversification.

  20. Nucleotide sequences and genetic analysis of hydrogen oxidation (hox) genes in Azotobacter vinelandii.

    PubMed Central

    Menon, A L; Mortenson, L E; Robson, R L

    1992-01-01

    Azotobacter vinelandii contains a heterodimeric, membrane-bound [NiFe]hydrogenase capable of catalyzing the reversible oxidation of H2. The beta and alpha subunits of the enzyme are encoded by the structural genes hoxK and hoxG, respectively, which appear to form part of an operon that contains at least one further potential gene (open reading frame 3 [ORF3]). In this study, determination of the nucleotide sequence of a region of 2,344 bp downstream of ORF3 revealed four additional closely spaced or overlapping ORFs. These ORFs, ORF4 through ORF7, potentially encode polypeptides with predicted masses of 22.8, 11.4, 16.3, and 31 kDa, respectively. Mutagenesis of the chromosome of A. vinelandii in the area sequenced was carried out by introduction of antibiotic resistance gene cassettes. Disruption of hoxK and hoxG by a kanamycin resistance gene abolished whole-cell hydrogenase activity coupled to O2 and led to loss of the hydrogenase alpha subunit. Insertional mutagenesis of ORF3 through ORF7 with a promoterless lacZ-Kmr cassette established that the region is transcriptionally active and involved in H2 oxidation. We propose to call ORF3 through ORF7 hoxZ, hoxM, hoxL, hoxO, and hoxQ, respectively. The predicted hox gene products resemble those encoded by genes from hydrogenase-related operons in other bacteria, including Escherichia coli and Alcaligenes eutrophus. PMID:1624446

  1. T box transcription antitermination riboswitch: Influence of nucleotide sequence and orientation on tRNA binding by the antiterminator element

    PubMed Central

    Fauzi, Hamid; Agyeman, Akwasi; Hines, Jennifer V.

    2008-01-01

    Many bacteria utilize riboswitch transcription regulation to monitor and appropriately respond to cellular levels of important metabolites or effector molecules. The T box transcription antitermination riboswitch responds to cognate uncharged tRNA by specifically stabilizing an antiterminator element in the 5′-untranslated mRNA leader region and precluding formation of a thermodynamically more stable terminator element. Stabilization occurs when the tRNA acceptor end base pairs with the first four nucleotides in the seven nucleotide bulge of the highly conserved antiterminator element. The significance of the conservation of the antiterminator bulge nucleotides that do not base pair with the tRNA is unknown, but they are required for optimal function. In vitro selection was used to determine if the isolated antiterminator bulge context alone dictates the mode in which the tRNA acceptor end binds the bulge nucleotides. No sequence conservation beyond complementarity was observed and the location was not constrained to the first four bases of the bulge. The results indicate that formation of a structure that recognizes the tRNA acceptor end in isolation is not the determinant driving force for the high phylogenetic sequence conservation observed within the antiterminator bulge. Additional factors or T box leader features more likely influenced the phylogenetic sequence conservation. PMID:19152843

  2. Structural dynamics of cereal mitochondrial genomes as revealed by complete nucleotide sequencing of the wheat mitochondrial genome.

    PubMed

    Ogihara, Yasunari; Yamazaki, Yukiko; Murai, Koji; Kanno, Akira; Terachi, Toru; Shiina, Takashi; Miyashita, Naohiko; Nasuda, Shuhei; Nakamura, Chiharu; Mori, Naoki; Takumi, Shigeo; Murata, Minoru; Futo, Satoshi; Tsunewaki, Koichiro

    2005-01-01

    The application of a new gene-based strategy for sequencing the wheat mitochondrial genome shows its structure to be a 452 528 bp circular molecule, and provides nucleotide-level evidence of intra-molecular recombination. Single, reciprocal and double recombinant products, and the nucleotide sequences of the repeats that mediate their formation have been identified. The genome has 55 genes with exons, including 35 protein-coding, 3 rRNA and 17 tRNA genes. Nucleotide sequences of seven wheat genes have been determined here for the first time. Nine genes have an exon-intron structure. Gene amplification responsible for the production of multicopy mitochondrial genes, in general, is species-specific, suggesting the recent origin of these genes. About 16, 17, 15, 3.0 and 0.2% of wheat mitochondrial DNA (mtDNA) may be of genic (including introns), open reading frame, repetitive sequence, chloroplast and retro-element origin, respectively. The gene order of the wheat mitochondrial gene map shows little synteny to the rice and maize maps, indicative that thorough gene shuffling occurred during speciation. Almost all unique mtDNA sequences of wheat, as compared with rice and maize mtDNAs, are redundant DNA. Features of the gene-based strategy are discussed, and a mechanistic model of mitochondrial gene amplification is proposed. PMID:16260473

  3. Structural dynamics of cereal mitochondrial genomes as revealed by complete nucleotide sequencing of the wheat mitochondrial genome

    PubMed Central

    Ogihara, Yasunari; Yamazaki, Yukiko; Murai, Koji; Kanno, Akira; Terachi, Toru; Shiina, Takashi; Miyashita, Naohiko; Nasuda, Shuhei; Nakamura, Chiharu; Mori, Naoki; Takumi, Shigeo; Murata, Minoru; Futo, Satoshi; Tsunewaki, Koichiro

    2005-01-01

    The application of a new gene-based strategy for sequencing the wheat mitochondrial genome shows its structure to be a 452 528 bp circular molecule, and provides nucleotide-level evidence of intra-molecular recombination. Single, reciprocal and double recombinant products, and the nucleotide sequences of the repeats that mediate their formation have been identified. The genome has 55 genes with exons, including 35 protein-coding, 3 rRNA and 17 tRNA genes. Nucleotide sequences of seven wheat genes have been determined here for the first time. Nine genes have an exon–intron structure. Gene amplification responsible for the production of multicopy mitochondrial genes, in general, is species-specific, suggesting the recent origin of these genes. About 16, 17, 15, 3.0 and 0.2% of wheat mitochondrial DNA (mtDNA) may be of genic (including introns), open reading frame, repetitive sequence, chloroplast and retro-element origin, respectively. The gene order of the wheat mitochondrial gene map shows little synteny to the rice and maize maps, indicative that thorough gene shuffling occurred during speciation. Almost all unique mtDNA sequences of wheat, as compared with rice and maize mtDNAs, are redundant DNA. Features of the gene-based strategy are discussed, and a mechanistic model of mitochondrial gene amplification is proposed. PMID:16260473

  4. Sequence-Specific Incorporation of Enzyme-Nucleotide Chimera by DNA Polymerases.

    PubMed

    Welter, Moritz; Verga, Daniela; Marx, Andreas

    2016-08-16

    DNA polymerases select the right nucleotide for the growing polynucleotide chain based on the shape and geometry of the nascent nucleotide pairs and thereby ensure high DNA replication selectivity. High-fidelity DNA polymerases are believed to possess tight active sites that allow little deviation from the canonical structures. However, DNA polymerases are known to use nucleotides with small modifications as substrates, which is key for numerous core biotechnology applications. We show that even high-fidelity DNA polymerases are capable of efficiently using nucleotide chimera modified with a large protein like horseradish peroxidase as substrates for template-dependent DNA synthesis, despite this "cargo" being more than 100-fold larger than the natural substrates. We exploited this capability for the development of systems that enable naked-eye detection of DNA and RNA at single nucleotide resolution. PMID:27392211

  5. Nucleotide sequence of a complementary DNA encoding pea cytosolic copper/zinc superoxide dismutase. [Pisum sativum L

    SciTech Connect

    White, D.A.; Zilinskas, B.A.

    1991-08-01

    The authors now report the nucleotide sequence of the cytosolic Cu/Zn SOD cloned from a λgt11 cDNA library constructed from mRNA extracted from leaves of 7- to 10-d pea seedlings (Pisum sativum L.). The clone was isolated using a 22-base synthetic oligonucleotide complementary to the amino acid sequence CGIIGLQG. This sequence, found at the protein's carboxy terminus, is highly conserved among plant cytosolic Cu/Zn SODs but not chloroplastic Cu/Zn SODs. The 738-base pair sequence contains an open reading frame specifying 152 codons and a predicted M_r of 18,024 D. The deduced amino acid sequence is highly homologous (79-82% identity) with the sequences of other known plant cytosolic Cu/Zn SODs but less highly conserved (63-65%) when compared with several chloroplastic Cu/Zn SODs including pea (10).

  6. Nucleotide Sequence and Genetic Structure of a Novel Carbaryl Hydrolase Gene (cehA) from Rhizobium sp. Strain AC100

    PubMed Central

    Hashimoto, Masayuki; Fukui, Mitsuru; Hayano, Kouichi; Hayatsu, Masahito

    2002-01-01

    Rhizobium sp. strain AC100, which is capable of degrading carbaryl (1-naphthyl-N-methylcarbamate), was isolated from soil treated with carbaryl. This bacterium hydrolyzed carbaryl to 1-naphthol and methylamine. Carbaryl hydrolase from the strain was purified to homogeneity, and its N-terminal sequence, molecular mass (82 kDa), and enzymatic properties were determined. The purified enzyme hydrolyzed 1-naphthyl acetate and 4-nitrophenyl acetate, indicating that the enzyme is an esterase. We then cloned the carbaryl hydrolase gene (cehA) from the plasmid DNA of the strain and determined the nucleotide sequence of the 10-kb region containing cehA. No homologous sequences were found by a database homology search using the nucleotide and deduced amino acid sequences of the cehA gene. Six open reading frames including the cehA gene were found in the 10-kb region, and sequencing analysis shows that the cehA gene is flanked by two copies of an insertion sequence-like sequence, suggesting that it forms part of a composite transposon. PMID:11872471

  7. Nucleotide sequence and mutational analysis of the vnfENX region of Azotobacter vinelandii.

    PubMed Central

    Wolfinger, E D; Bishop, P E

    1991-01-01

    The nucleotide sequence (3,600 bp) of a second copy of nifENX-like genes in Azotobacter vinelandii has been determined. These genes are located immediately downstream from vnfA and have been designated vnfENX. The vnfENX genes appear to be organized as a single transcriptional unit that is preceded by a potential RpoN-dependent promoter. While the nifEN genes are thought to be evolutionarily related to nifDK, the vnfEN genes appear to be more closely related to nifEN than to either nifDK, vnfDK, or anfDK. Mutant strains (CA47 and CA48) carrying insertions in vnfE and vnfN, respectively, are able to grow diazotrophically in molybdenum (Mo)-deficient medium containing vanadium (V) (Vnf+) and in medium lacking both Mo and V (Anf+). However, a double mutant (strain DJ42.48) which contains a nifEN deletion and an insertion in vnfE is unable to grow diazotrophically in Mo-sufficient medium or in Mo-deficient medium with or without V. This suggests that NifE and NifN substitute for VnfE and VnfN when the vnfEN genes are mutationally inactivated. AnfA is not required for the expression of a vnfN-lacZ transcriptional fusion, even though this fusion is expressed under Mo- and V-deficient diazotrophic growth conditions. PMID:1938952

  8. Organization and nucleotide sequence of a gene cluster comprising the translation elongation factor 1 alpha, ribosomal protein S10 and tRNA(Ala) from Halobacterium halobium.

    PubMed

    Fujita, T; Itoh, T

    1995-09-01

    A lambda EMBL clone containing a gene cluster coding for the translation elongation factor 1 alpha, ribosomal protein S10 and tRNA(Ala) was identified in a genomic library for the halophilic archaebacterium Halobacterium halobium using a PCR probe amplified by two oligonucleotide primers for conserved amino acid sequences of the elongation factor 1 alpha family. The gene coding for elongation factor EF-2 was also found 4.3 kb upstream from the 5' end of the elongation factor 1 alpha gene by hybridization analysis using a DNA fragment specific for EF-2 from Halobacterium halobium [1]. Halobacterial and eukaryotic elongation factor 1 alpha homologues are very similar in sequence and in length and appear to be more closely related to each other than to the eubacterial protein. PMID:8653072

  9. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51. Copies of WIPO... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid...

  10. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51. Copies of WIPO... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid...

  11. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51. Copies of WIPO... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid...

  12. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51. Copies of WIPO... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid...

  13. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51. Copies of WIPO... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid...

  14. A comparison of nucleotide sequences of measles virus L genes derived from wild-type viruses and SSPE brain tissues.

    PubMed

    Komase, K; Rima, B K; Pardowitz, I; Kunz, C; Billeter, M A; ter Meulen, V; Baczko, K

    1995-04-20

    The nucleotide sequences of the large protein (L) gene derived from two wild-type measles viruses (MV) and two SSPE brain-derived viruses have been determined. All sequences have single large open reading frames encoding 2183 amino acid residues. The deduced L proteins are well conserved and the proposed functional domains which have been identified for rhabdo- and paramyxoviruses are completely conserved in all strains. The degree of variability of L proteins is the lowest of all structural proteins of MV, reflecting its role in virus reproduction and persistence. Biased hypermutation was not observed in the L genes derived from SSPE brain tissue. None of the nucleotide changes can be associated with the attenuated phenotype of the Edmonston vaccine viruses. PMID:7747453

  15. Genetic divergence between subpopulations of the eastern Pacific goose barnacle Pollicipes elegans: mitochondrial cytochrome c subunit 1 nucleotide sequences.

    PubMed

    Van Syoc, R J

    1994-12-01

    Nucleotide sequence data derived from polymerase chain reaction products from the cytochrome oxidase subunit 1 gene of mitochondrial DNA provide evidence for interrupted gene flow and subsequent genetic divergence between geographically separate subpopulations of the edible goose barnacle, Pollicipes elegans, with a 4400-km latitudinal distribution in the eastern Pacific Ocean. The amphitropical subpopulations of Pollicipes elegans have a net nucleotide sequence divergence of about 1.2%. A range of mutation rates is applied to calculate estimates for the timing of this divergence. The earliest estimated time of divergence agrees with a Pliocene time of general warming in the eastern Pacific. The latest estimated times coincide with the Pleistocene epoch and periods of cooling and warming that could have allowed for a series of expansions and contractions of P. elegans populations in the eastern tropical Pacific. These expansions and contractions may, therefore, represent alternating periods of genetic exchange and isolation of the two populations.
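
    The abstract above converts a sequence divergence of about 1.2% into divergence times by applying a range of mutation rates. A minimal sketch of that arithmetic follows, assuming a simple molecular clock in which time equals divergence divided by the pairwise divergence rate. The rates in the loop are hypothetical placeholders, not values from the paper, and the p-distance helper computes a raw pairwise distance rather than the net (within-population-corrected) divergence the abstract refers to.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in compared) / len(compared)


def divergence_time_myr(divergence, pairwise_rate_per_myr):
    """Time since divergence (million years) under a strict clock.

    `pairwise_rate_per_myr` is the divergence accumulated between two
    lineages per million years (i.e. twice the per-lineage rate).
    """
    return divergence / pairwise_rate_per_myr


if __name__ == "__main__":
    divergence = 0.012  # about 1.2%, as reported in the abstract
    # Hypothetical pairwise rates (fraction per Myr), for illustration only.
    for rate in (0.005, 0.01, 0.02):
        print(rate, divergence_time_myr(divergence, rate))
```

    Slower assumed rates give older divergence estimates and faster rates give younger ones, which is why a range of rates yields a range of candidate time periods.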

  16. Molecular Identification of Necrophagous Muscidae and Sarcophagidae Fly Species Collected in Korea by Mitochondrial Cytochrome c Oxidase Subunit I Nucleotide Sequences

    PubMed Central

    Ham, Chan Seon; Kim, Seong Yoon; Ko, Kwang Soo; Jo, Tae-Ho; Son, Gi Hoon

    2014-01-01

    Identification of insect species is an important task in forensic entomology. For more convenient species identification, the nucleotide sequences of cytochrome c oxidase subunit I (COI) gene have been widely utilized. We analyzed full-length COI nucleotide sequences of 10 Muscidae and 6 Sarcophagidae fly species collected in Korea. After DNA extraction from collected flies, PCR amplification and automatic sequencing of the whole COI sequence were performed. Obtained sequences were analyzed for a phylogenetic tree and a distance matrix. Our data showed very low intraspecific sequence distances and species-level monophylies. However, sequence comparison with previously reported sequences revealed a few inconsistencies or paraphylies requiring further investigation. To the best of our knowledge, this study is the first report of COI nucleotide sequences from Hydrotaea occulta, Muscina angustifrons, Muscina pascuorum, Ophyra leucostoma, Sarcophaga haemorrhoidalis, Sarcophaga harpax, and Phaonia aureola. PMID:24982938

  17. Characterization of Newcastle disease virus isolates by reverse transcription PCR coupled to direct nucleotide sequencing and development of sequence database for pathotype prediction and molecular epidemiological analysis.

    PubMed Central

    Seal, B S; King, D J; Bennett, J D

    1995-01-01

    Degenerate oligonucleotide primers were synthesized to amplify nucleotide sequences from portions of the fusion protein and matrix protein genes of Newcastle disease virus (NDV) genomic RNA that could be used diagnostically. These primers were used in a single-tube reverse transcription PCR of NDV genomic RNA coupled to direct nucleotide sequencing of the amplified product to characterize more than 30 NDV isolates. In agreement with previous reports, differences in the fusion protein cleavage sequence that correlated genotypically with virulence among various NDV pathotypes were detected. By using sequences generated from the matrix protein gene coding for the nuclear localization signal, lentogenic viruses were again grouped phylogenetically separate from other pathotypes. These techniques were applied to compare neurotropic velogenic viruses isolated from an outbreak of Newcastle disease in cormorants and turkeys. Cormorant NDV isolates and an NDV isolate from an infected turkey flock in North Dakota had the fusion protein cleavage sequence 109SRGRRQKRFVG119. The R-for-G substitution at position 110 may be unique for the cormorant-type isolates. Although the amino acid sequences from the fusion protein cleavage site were identical, nucleotide sequence data correlate the outbreak in turkeys to a cormorant virus isolate from Minnesota and not to a cormorant virus isolate from Michigan. On the basis of sequence information, the cormorant isolates are virulent viruses related to isolates of psittacine origin, possibly genotypically distinct from other velogenic NDV isolates. These techniques can be used reliably for Newcastle disease epidemiology and for prediction of pathotypes of NDV isolates without traditional live-bird inoculations. PMID:8567895
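
    The abstract above writes the fusion protein cleavage sequence in the compact form 109SRGRRQKRFVG119, where the flanking numbers give the positions of the first and last residues. The sketch below shows one way to expand that notation into a position-to-residue map so that a statement such as "R at position 110" can be checked directly. The function name parse_numbered_motif is an assumption introduced for illustration.

```python
import re


def parse_numbered_motif(motif):
    """Expand a motif like '109SRGRRQKRFVG119' into {position: residue}."""
    match = re.fullmatch(r"(\d+)([A-Z]+)(\d+)", motif)
    if match is None:
        raise ValueError(f"unrecognised motif notation: {motif!r}")
    start, residues, end = int(match[1]), match[2], int(match[3])
    if end - start + 1 != len(residues):
        raise ValueError("residue count does not match the numbered span")
    return {start + offset: aa for offset, aa in enumerate(residues)}


if __name__ == "__main__":
    site = parse_numbered_motif("109SRGRRQKRFVG119")
    print(site[110])  # residue at position 110 in the cormorant-type motif
```

    With the motif from the abstract, position 110 maps to arginine (R), matching the R-for-G substitution described for the cormorant-type isolates.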

  18. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms.

    PubMed

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies have investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared obtained sequence information with the available donkey draft genome (and the Illumina reads from which it originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than the number aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionarily important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources.

  20. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms

    PubMed Central

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies have investigated the donkey (Equus asinus) at the whole-genome level so far. Here, we sequenced the genome of two male donkeys using a next-generation semiconductor-based sequencing platform (the Ion Proton sequencer) and compared the obtained sequence information with the available donkey draft genome (and the Illumina reads from which it originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next-generation sequencing reads aligned with the EquCab2.0 horse genome was larger than the number aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionarily important in equids. Comparing Y-chromosome regions, we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated by combining sequencing data from Ion Proton (whole-genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources. PMID:26151450
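
    The abstract above attributes the difference in aligned reads to the larger N50 of the horse assembly. As a minimal sketch of what that statistic measures, the function below computes N50 from a list of contig or scaffold lengths. The function name and the example lengths are illustrative assumptions, not values from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Hypothetical contig lengths (arbitrary units), not data from the paper.
example_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(example_lengths))
```

    A larger N50 means that longer contiguous sequences make up half of the assembly, which generally makes it easier for a read to align without running into a contig boundary.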

  1. Nucleotide sequencing and serological evidence that the recently recognized deer tick virus is a genotype of Powassan virus.

    PubMed

    Beasley, D W; Suderman, M T; Holbrook, M R; Barrett, A D

    2001-11-01

    Deer tick virus (DTV) is a recently recognized North American virus isolated from Ixodes dammini ticks. Nucleotide sequencing of fragments of structural and non-structural protein genes suggested that this virus was most closely related to the tick-borne flavivirus Powassan (POW), which causes potentially fatal encephalitis in humans. To determine whether DTV represents a new and distinct member of the Flavivirus genus of the family Flaviviridae, we sequenced the structural protein genes and 5' and 3' non-coding regions of this virus. In addition, we compared the reactivity of DTV and POW in hemagglutination inhibition tests with a panel of polyclonal and monoclonal antisera, and performed cross-neutralization experiments using anti-DTV antisera. Nucleotide sequencing revealed a high degree of homology between DTV and POW at both nucleotide (>80% homology) and amino acid (>90% homology) levels, and the two viruses were indistinguishable in serological assays and mouse neuroinvasiveness. On the basis of these results, we suggest that DTV should be classified as a genotype of POW virus. PMID:11551648

  3. Complete nucleotide sequence and gene rearrangement of the mitochondrial genome of the bell-ring frog, Buergeria buergeri (family Rhacophoridae).

    PubMed

    Sano, Naomi; Kurabayashi, Atsushi; Fujii, Tamotsu; Yonekawa, Hiromichi; Sumida, Masayuki

    2004-06-01

    In this study we determined the complete nucleotide sequence (19,959 bp) of the mitochondrial DNA of the rhacophorid frog Buergeria buergeri. The gene content, nucleotide composition, and codon usage of B. buergeri conformed to those of typical vertebrate patterns. However, due to an accumulation of lengthy repetitive sequences in the D-loop region, this species possesses the largest mitochondrial genome among all the vertebrates examined so far. Comparison of the gene organizations among amphibian species (Rana, Xenopus, salamanders and caecilians) revealed that the positioning of four tRNA genes and the ND5 gene in the mtDNA of B. buergeri diverged from the common vertebrate gene arrangement shared by Xenopus, salamanders and caecilians. The unique positions of the tRNA genes in B. buergeri are shared by ranid frogs, indicating that the rearrangements of the tRNA genes occurred in a common ancestral lineage of ranids and rhacophorids. On the other hand, the novel position of the ND5 gene seems to have arisen in a lineage leading to rhacophorids (and other closely related taxa) after ranid divergence. Phylogenetic analysis based on nucleotide sequence data of all mitochondrial genes also supported the gene rearrangement pathway. PMID:15329496

  4. The nucleotide sequences of 5S rRNAs from a sea-cucumber, a starfish and a sea-urchin.

    PubMed Central

    Ohama, T; Hori, H; Osawa, S

    1983-01-01

    The nucleotide sequences of 5S rRNA from three echinoderms, a sea-cucumber Stichopus oshimae, a starfish Asterina pectinifera and a sea-urchin Hemicentrotus pulcherrimus have been determined. These 5S rRNAs are all 120 nucleotides long. The echinoderm sequences are more related to the sequences of protostome animals such as molluscs, annelids and some others (87% identity on average) than to those of vertebrates (82% identity on average). PMID:6878041
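
    Several records in this listing report percent identity between sequences. The sketch below shows one common way to compute it from two sequences that have already been aligned to the same length. The treatment of gaps (columns with "-" are skipped) is an assumption chosen for illustration; published studies use differing conventions, and the method used in the paper above is not stated in its abstract.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over gap-free columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: ignore columns that contain a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy 8-nucleotide fragments, invented for the example.
print(percent_identity("GCCUACGG", "GCCUACGA"))
```

    Nucleotide divergence, as quoted in other records, is often reported as 100 minus this value, although model-corrected distances are also common.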

  6. Testing evolutionary models to explain the process of nucleotide substitution in gut bacterial 16S rRNA gene sequences.

    PubMed

    Garcia-Mazcorro, Jose F

    2013-09-01

    The 16S rRNA gene has been widely used as a marker of gut bacterial diversity and phylogeny, yet we do not know the model of evolution that best explains the differences in its nucleotide composition within and among taxa. Over 46 000 good-quality near-full-length 16S rRNA gene sequences from five bacterial phyla were obtained from the ribosomal database project (RDP) by study and, when possible, by within-study characteristics (e.g. anatomical region). Using alignments (RDPX and MUSCLE) of unique sequences, the FINDMODEL tool available at http://www.hiv.lanl.gov/ was utilized to find the model of character evolution (28 models were available) that best describes the input sequence data, based on the Akaike information criterion. The results showed variable levels of agreement (from 33% to 100%) in the chosen models between the RDP-based and the MUSCLE-based alignments among the taxa. Moreover, subgroups of sequences (using either alignment method) from the same study were often explained by different models. Nonetheless, the different representatives of the gut microbiota were explained by different proportions of the available models. This is the first report using evolutionary models to explain the process of nucleotide substitution in gut bacterial 16S rRNA gene sequences. PMID:23808388
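
    The abstract above selects substitution models by the Akaike information criterion (AIC), defined as AIC = 2k - 2 ln L for a model with k free parameters and maximized likelihood L. The sketch below illustrates the selection step only. The log-likelihood values are invented, the parameter counts exclude branch lengths, and nothing here reproduces the internals of the FINDMODEL tool.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def best_model(fits):
    """Pick the model with the lowest AIC.

    fits maps a model name to (maximized log-likelihood, free parameters).
    Returns the winning name and the full table of AIC scores.
    """
    scores = {name: aic(lnl, k) for name, (lnl, k) in fits.items()}
    winner = min(scores, key=scores.get)
    return winner, scores


# Hypothetical fits for three nucleotide substitution models.
fits = {
    "JC69": (-1250.0, 0),
    "HKY85": (-1200.0, 4),
    "GTR": (-1199.0, 8),
}
winner, scores = best_model(fits)
print(winner, scores)
```

    In this made-up case the richer GTR model fits slightly better than HKY85, but not by enough to offset the penalty for its extra parameters.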

  7. MoD Tools: regulatory motif discovery in nucleotide sequences from co-regulated or homologous genes.

    PubMed

    Pavesi, Giulio; Mereghetti, Paolo; Zambelli, Federico; Stefani, Marco; Mauri, Giancarlo; Pesole, Graziano

    2006-07-01

    Understanding the complex mechanisms regulating gene expression at the transcriptional and post-transcriptional levels is one of the greatest challenges of the post-genomic era. The MoD (MOtif Discovery) Tools web server comprises a set of tools for the discovery of novel conserved sequence and structure motifs in nucleotide sequences, motifs that in turn are good candidates for regulatory activity. The server includes the following programs: Weeder, for the discovery of conserved transcription factor binding sites (TFBSs) in nucleotide sequences from co-regulated genes; WeederH, for the discovery of conserved TFBSs and distal regulatory modules in sequences from homologous genes; RNAProfile, for the discovery of conserved secondary structure motifs in unaligned RNA sequences whose secondary structure is not known. In this way, a given gene can be compared with other co-regulated genes or with its homologs, or its mRNA can be analyzed for conserved motifs regulating its post-transcriptional fate. The web server thus provides researchers with different strategies and methods to investigate the regulation of gene expression, at both the transcriptional and post-transcriptional levels. Available at http://www.pesolelab.it/modtools/ and http://www.beacon.unimi.it/modtools/.

  8. Palindrome analyser - A new web-based server for predicting and evaluating inverted repeats in nucleotide sequences.

    PubMed

    Brázda, Václav; Kolomazník, Jan; Lýsek, Jiří; Hároníková, Lucia; Coufal, Jan; Št'astný, Jiří

    2016-09-30

    DNA cruciform structures play an important role in the regulation of natural processes including gene replication and expression, as well as nucleosome structure and recombination. They have also been implicated in the evolution and development of diseases such as cancer and neurodegenerative disorders. Cruciform structures are formed by inverted repeats, and their stability is enhanced by DNA supercoiling and protein binding. They have received broad attention because of their important roles in biology. Computational approaches to study inverted repeats have allowed detailed analysis of genomes. However, currently there are no easily accessible and user-friendly tools that can analyse inverted repeats, especially among long nucleotide sequences. We have developed a web-based server, Palindrome analyser, which is a user-friendly application for analysing inverted repeats in various DNA (or RNA) sequences including genome sequences and oligonucleotides. It allows users to search and retrieve desired gene/nucleotide sequence entries from the NCBI databases, and provides data on length, sequence, locations and energy required for cruciform formation. Palindrome analyser also features an interactive graphical data representation of the distribution of the inverted repeats, with options for sorting according to the length of inverted repeat, length of loop, and number of mismatches. Palindrome analyser can be accessed at http://bioinformatics.ibp.cz. PMID:27603574
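
    The inverted repeats described above consist of a stem, an optional loop (spacer), and the reverse complement of the stem. The following is a minimal sketch of an exact-match search for such repeats in a DNA string. It is not the algorithm used by the Palindrome analyser server: it allows no mismatches, does not estimate cruciform formation energy, and the function and parameter names are assumptions made for the example.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence made of A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_inverted_repeats(seq, stem_len=6, max_loop=10):
    """Return (start, stem_len, loop_len) for each perfect inverted repeat.

    A hit means seq[start:start + stem_len] is followed, after loop_len
    spacer bases, by its own reverse complement.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 2 * stem_len + 1):
        target = reverse_complement(seq[start:start + stem_len])
        for loop_len in range(max_loop + 1):
            right_start = start + stem_len + loop_len
            right = seq[right_start:right_start + stem_len]
            if len(right) < stem_len:
                break  # ran off the end of the sequence
            if right == target:
                hits.append((start, stem_len, loop_len))
    return hits


# ACG + 4-base loop + CGT (the reverse complement of ACG).
print(find_inverted_repeats("ACGTTTTCGT", stem_len=3, max_loop=5))
```

    This brute-force scan takes time proportional to sequence length times max_loop, which is adequate for short oligonucleotides but would need indexing or a smarter search for whole genomes.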

  9. SMRT Sequencing of Long Tandem Nucleotide Repeats in SCA10 Reveals Unique Insight of Repeat Expansion Structure

    PubMed Central

    Landrian, Ivette; Godiska, Ronald; Shanker, Savita; Yu, Fahong; Farmerie, William G.; Ashizawa, Tetsuo

    2015-01-01

    A large, non-coding ATTCT repeat expansion causes the neurodegenerative disorder, spinocerebellar ataxia type 10 (SCA10). In a subset of SCA10 patients, interruption motifs are present at the 5’ end of the expansion and strongly correlate with epileptic seizures. Thus, interruption motifs are a predictor of the epileptic phenotype and are hypothesized to act as a phenotypic modifier in SCA10. Yet, the exact internal sequence structure of SCA10 expansions remains unknown due to limitations in current technologies for sequencing across long extended tracts of tandem nucleotide repeats. We used the third generation sequencing technology, Single Molecule Real Time (SMRT) sequencing, to obtain full-length contiguous expansion sequences, ranging from 2.5 to 4.4 kb in length, from three SCA10 patients with different clinical presentations. We obtained sequence spanning the entire length of the expansion and identified the structure of known and novel interruption motifs within the SCA10 expansion. The exact interruption patterns in expanded SCA10 alleles will allow us to further investigate the potential contributions of these interrupting sequences to the pathogenic modification leading to the epilepsy phenotype in SCA10. Our results also demonstrate that SMRT sequencing is useful for deciphering long tandem repeats that pose as “gaps” in the human genome sequence. PMID:26295943

  10. ChEMBL web services: streamlining access to drug discovery data and utilities.

    PubMed

    Davies, Mark; Nowotka, Michał; Papadatos, George; Dedman, Nathan; Gaulton, Anna; Atkinson, Francis; Bellis, Louisa; Overington, John P

    2015-07-01

    ChEMBL is now a well-established resource in the fields of drug discovery and medicinal chemistry research. The ChEMBL database curates and stores standardized bioactivity, molecule, target and drug data extracted from multiple sources, including the primary medicinal chemistry literature. Programmatic access to ChEMBL data has been improved by a recent update to the ChEMBL web services (version 2.0.x, https://www.ebi.ac.uk/chembl/api/data/docs), which exposes significantly more data from the underlying database and introduces new functionality. To complement the data-focused services, a utility service (version 1.0.x, https://www.ebi.ac.uk/chembl/api/utils/docs), which provides RESTful access to commonly used cheminformatics methods, has also been concurrently developed. The ChEMBL web services can be used together or independently to build applications and data processing workflows relevant to drug discovery and chemical biology.

  11. Nucleotide sequence variation of the VP7 gene of two G3-type rotaviruses isolated from dogs.

    PubMed

    Martella, V; Pratelli, A; Greco, G; Gentile, M; Fiorente, P; Tempesta, M; Buonavoglia, C

    2001-04-01

    The sequence of the VP7 gene of two rotaviruses isolated from dogs in southern Italy was determined and the inferred amino acid sequence was compared with that of other rotavirus strains. There was very high nucleotide and amino acid identity between canine strain RV198/95 and other canine strains, and to the human strain HCR3A. Strain RV52/96, however, was found to have about 95% identity to the G3 serotype canine strains K9, A79-10 and CU-1 and 96% identity to strain RV198/95 and to the simian strain RRV. Therefore both of the canine strains belong to the G3 serotype. Nevertheless, detailed analysis of the VP7 variable regions revealed that RV52/96 possesses amino acid substitutions uncommon to the other canine isolates. In addition, strain RV52/96 exhibited a nucleotide divergence greater than 16% from all the other canine strains studied; however, it revealed the closest identity (90.4%) to the simian strain RRV. With only a few exceptions, phylogenetic analysis allowed clear differentiation of the G3 rotaviruses on the basis of the species of origin. The nucleotide and amino acid variations observed in strain RV52/96 could account for the existence of a canine rotavirus G3 sub-type. PMID:11226570

  12. Nucleotide sequences and mutations of the 5'-nontranslated region (5'NTR) of natural isolates of an epidemic echovirus 11' (prime).

    PubMed

    Szendrõi, A; El-Sageyer, M; Takács, M; Mezey, I; Berencsi, G

    2000-01-01

    An echovirus 11' (prime) virus caused an epidemic in Hungary in 1989. The leading clinical form of the disease was myocarditis. Hemorrhagic hepatitis syndromes also occurred, however, with lethal outcome in 13 newborn babies. Altogether 386 children suffered from registered clinical disease. No accumulation of serous meningitis cases and intrauterine death were observed during the epidemic, and the monovalent oral poliovirus vaccination campaign has prevented the further circulation of the virus. The 5'-nontranslated region (5'-NTR) of 12 natural isolates was sequenced (nucleotides: 260-577). The 5'-NTR was found to be different from that of the prototype Gregory strain (X80059) of EV11 (less than 90% identity), but related to the swine vesicular disease virus (D16364) SVDV and EV9 (X92886) as indicated by the best fitting dendrogram. The examination of the variable nucleotides in the internal ribosomal entry site (IRES) revealed that the nucleotide sequence of a region of the epidemic 5'-NTR was identical to that of coxsackievirus B2. Five of the epidemic isolates were found to carry mutations. Seven EV11' IRES elements possessed identical sequences, indicating that the virus had evolved before its arrival in Hungary. The comparative examination of the suboptimal secondary structures revealed that none of the mutations affected the secondary structure of stem-loop structures IV and V in the IRES elements. Although it has been shown previously that the echovirus group is genetically coherent and related to coxsackie B viruses, the sequence differences in the epidemic isolates resulted in profound modification of the central stem (residues 477-529) of stem-loop structure V, known to affect neurovirulence of polioviruses. Two alternate cloverleaf (stem-loop) structures were also recognised (nucleotides 376 to 460 and 540 to 565) which seem to mask both regions of the IRES element complementary to the 3'-end of the 18 S rRNA (460 to 466 and 561 to 570

  13. An Interpretation of the Ancestral Codon from Miller’s Amino Acids and Nucleotide Correlations in Modern Coding Sequences

    PubMed Central

    Carels, Nicolas; de Leon, Miguel Ponce

    2015-01-01

    Purine bias, which is usually referred to as an “ancestral codon”, is known to result in short-range correlations between nucleotides in coding sequences, and it is common in all species. We demonstrate that RWY is a more appropriate pattern than the classical RNY, and purine bias (Rrr) is the product of a network of nucleotide compensations induced by functional constraints on the physicochemical properties of proteins. Through deductions from universal correlation properties, we also demonstrate that amino acids from Miller’s spark discharge experiment are compatible with functional primeval proteins at the dawn of living cell radiation on earth. These amino acids match the hydropathy and secondary structures of modern proteins. PMID:25922573
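
    The codon patterns named above use IUPAC ambiguity codes: R is a purine (A or G), Y a pyrimidine (C or T), W a weak base (A or T), and N any base. As a rough illustration of what it means for a coding sequence to follow RNY or RWY, the sketch below counts the fraction of in-frame codons matching a three-letter pattern. It is a toy measure on an invented sequence, not the correlation analysis carried out in the paper.

```python
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "W": "AT",    # weak
    "N": "ACGT",  # any base
}


def pattern_fraction(cds, pattern):
    """Fraction of in-frame codons in cds that match an IUPAC codon pattern."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # drop a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence has no complete codon")
    hits = sum(
        all(base in IUPAC[code] for base, code in zip(codon, pattern))
        for codon in codons
    )
    return hits / len(codons)


toy_cds = "GATACCGGCAAA"  # invented codons: GAT ACC GGC AAA
print(pattern_fraction(toy_cds, "RNY"), pattern_fraction(toy_cds, "RWY"))
```

    Because W is a subset of N, every RWY codon is also an RNY codon, so the RWY fraction can never exceed the RNY fraction for the same sequence.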

  14. Characterization and nucleotide sequence of a chicken gene encoding an opal suppressor tRNA and its flanking DNA segments.

    PubMed Central

    Hatfield, D L; Dudock, B S; Eden, F C

    1983-01-01

    A naturally occurring opal suppressor serine tRNA has been purified from chicken liver and used as a probe to isolate the corresponding gene from a library of chicken DNA in bacteriophage lambda. This minor tRNA is encoded by a single-copy gene that is not part of a tRNA gene cluster. DNA sequence analysis of the gene and its flanking DNA segments shows that the gene is encoded in an 87-base-pair segment without intervening sequences and specifies a tRNA that reads the termination codon UGA. This gene has additional nucleotides in the 5' internal promoter region but has a normal 3' internal promoter sequence and the usual termination signal. PMID:6308662

  15. The nucleotide sequences of several tRNA genes from rat mitochondria: common features and relatedness to homologous species.

    PubMed Central

    Cantatore, P; De Benedetto, C; Gadaleta, G; Gallerani, R; Kroon, A M; Holtrop, M; Lanave, C; Pepe, G; Quagliariello, C; Saccone, C; Sbisa, E

    1982-01-01

    We have determined the nucleotide sequences of thirteen rat mt tRNA genes. The features of the primary and secondary structures of these tRNAs show that those for Gln, Ser, and f-Met resemble, while those for Lys, Cys, and Trp depart strikingly from the universal type. The remainder are slightly abnormal. Among many mammalian mt DNA sequences, those of mt tRNA genes are highly conserved, thus suggesting for those genes an additional, perhaps regulatory, function. A simple evolutionary relationship between the tRNAs of animal mitochondria and those of eukaryotic cytoplasm, of lower eukaryotic mitochondria or of prokaryotes, is not evident owing to the extreme divergence of the tRNA sequences in the two groups. However, a slightly higher homology does exist between a few animal mt tRNAs and those from prokaryotes or from lower eukaryotic mitochondria. PMID:7099963

  16. Nucleotide sequences of fic and fic-1 genes involved in cell filamentation induced by cyclic AMP in Escherichia coli.

    PubMed Central

    Kawamukai, M; Matsuda, H; Fujii, W; Utsumi, R; Komano, T

    1989-01-01

    The nucleotide sequences of fic-1 involved in the cell filamentation induced by cyclic AMP in Escherichia coli and its normal counterpart fic were analyzed. The open reading frame of both fic-1 and fic coded for 200 amino acids. The Gly at position 55 in the Fic protein was changed to Arg in the Fic-1 protein. The promoter activity of fic was confirmed by fusing fic and lacZ. The gene downstream from fic was found to be pabA (p-aminobenzoate). There is an open reading frame (ORF190) coding for 190 amino acids upstream from the fic gene. Computer-assisted analysis showed that Fic has sequence similarity with part of CDC28 of Saccharomyces cerevisiae, CDC2 of Schizosaccharomyces pombe, and FtsA of E. coli. In addition, ORF190 has sequence similarity with the cyclosporin A-binding protein cyclophilin. PMID:2546924

  17. Are sites with multiple single nucleotide variants in cancer genomes a consequence of drivers, hypermutable sites or sequencing errors?

    PubMed Central

    Carr, Antony M.

    2016-01-01

    Across independent cancer genomes it has been observed that some sites have been recurrently hit by single nucleotide variants (SNVs). Such recurrently hit sites might be either (i) drivers of cancer that are positively selected during oncogenesis, (ii) due to mutation rate variation, or (iii) due to sequencing and assembly errors. We have investigated the cause of recurrently hit sites in a dataset of >3 million SNVs from 507 complete cancer genome sequences. We find evidence that many sites have been hit significantly more often than one would expect by chance, even taking into account the effect of the adjacent nucleotides on the rate of mutation. We find that the density of these recurrently hit sites is higher in non-coding than coding DNA and hence conclude that most of them are unlikely to be drivers. We also find that most of them are found in parts of the genome that are not uniquely mappable and hence are likely to be due to mapping errors. In support of the error hypothesis, we find that recurrently hit sites are not randomly distributed across sequences from different laboratories. We fit a model to the data in which the rate of mutation is constant across sites but the rate of error varies. This model suggests that ∼4% of all SNVs are errors in this dataset, but that the rate of error varies by thousands-of-fold between sites. PMID:27688957

  19. Nucleotide sequence of dengue 2 RNA and comparison of the encoded proteins with those of other flaviviruses.

    PubMed

    Hahn, Y S; Galler, R; Hunkapiller, T; Dalrymple, J M; Strauss, J H; Strauss, E G

    1988-01-01

    We have determined the complete sequence of the RNA of dengue 2 virus (S1 candidate vaccine strain derived from the PR-159 isolate) with the exception of about 15 nucleotides at the 5' end. The genome organization is the same as that deduced earlier for other flaviviruses and the amino acid sequences of the encoded dengue 2 proteins show striking homology to those of other flaviviruses. The overall amino acid sequence similarity between dengue 2 and yellow fever virus is 44.7%, whereas that between dengue 2 and West Nile virus is 50.7%. These viruses represent three different serological subgroups of mosquito-borne flaviviruses. Comparison of the amino acid sequences shows that amino acid sequence homology is not uniformly distributed among the proteins; highest homology is found in some domains of nonstructural protein NS5 and lowest homology in the hydrophobic polypeptides ns2a and 2b. In general the structural proteins are less well conserved than the nonstructural proteins. Hydrophobicity profiles, however, are remarkably similar throughout the translated region. Comparison of the dengue 2 PR-159 sequence to partial sequence data from dengue 4 and another strain of dengue 2 virus reveals amino acid sequence homologies of about 64 and 96%, respectively, in the structural protein region. Thus as a general rule for flaviviruses examined to date, members of different serological subgroups demonstrate 50% or less amino acid sequence homology, members of the same subgroup average 65-75% homology, and strains of the same virus demonstrate greater than 95% amino acid sequence similarity.

  20. DNA sequencing by a single molecule detection of labeled nucleotides sequentially cleaved from a single strand of DNA

    SciTech Connect

    Goodwin, P.M.; Schecker, J.A.; Wilkerson, C.W.; Hammond, M.L.; Ambrose, W.P.; Jett, J.H.; Martin, J.C.; Marrone, B.L.; Keller, R.A.; Haces, A.; Shih, P.J.; Harding, J.D.

    1993-01-01

    We are developing a laser-based technique for the rapid sequencing of large DNA fragments (several kb in size) at a rate of 100 to 1000 bases per second. Our approach relies on fluorescent labeling of the bases in a single fragment of DNA, attachment of this labeled DNA fragment to a support, movement of the supported DNA into a flowing sample stream, sequential cleavage of the end nucleotide from the DNA fragment with an exonuclease, and detection of the individual fluorescently labeled bases by laser-induced fluorescence.

  1. DNA sequencing by a single molecule detection of labeled nucleotides sequentially cleaved from a single strand of DNA

    SciTech Connect

    Goodwin, P.M.; Schecker, J.A.; Wilkerson, C.W.; Hammond, M.L.; Ambrose, W.P.; Jett, J.H.; Martin, J.C.; Marrone, B.L.; Keller, R.A.; Haces, A.; Shih, P.J.; Harding, J.D.

    1993-02-01

    We are developing a laser-based technique for the rapid sequencing of large DNA fragments (several kb in size) at a rate of 100 to 1000 bases per second. Our approach relies on fluorescent labeling of the bases in a single fragment of DNA, attachment of this labeled DNA fragment to a support, movement of the supported DNA into a flowing sample stream, sequential cleavage of the end nucleotide from the DNA fragment with an exonuclease, and detection of the individual fluorescently labeled bases by laser-induced fluorescence.

  2. Analysis of a nucleotide-binding site of 5-lipoxygenase by affinity labelling: binding characteristics and amino acid sequences.

    PubMed Central

    Zhang, Y Y; Hammarberg, T; Radmark, O; Samuelsson, B; Ng, C F; Funk, C D; Loscalzo, J

    2000-01-01

    5-Lipoxygenase (5LO) catalyses the first two steps in the biosynthesis of leukotrienes, which are inflammatory mediators derived from arachidonic acid. 5LO activity is stimulated by ATP; however, a consensus ATP-binding site or nucleotide-binding site has not been found in its protein sequence. In the present study, affinity and photoaffinity labelling of 5LO with 5'-p-fluorosulphonylbenzoyladenosine (FSBA) and 2-azido-ATP showed that 5LO bound to the ATP analogues quantitatively and specifically and that the incorporation of either analogue inhibited ATP stimulation of 5LO activity. The stoichiometry of the labelling was 1.4 mol of FSBA/mol of 5LO (of which ATP competed with 1 mol/mol) or 0.94 mol of 2-azido-ATP/mol of 5LO (of which ATP competed with 0.77 mol/mol). Labelling with FSBA prevented further labelling with 2-azido-ATP, indicating that the same binding site was occupied by both analogues. Other nucleotides (ADP, AMP, GTP, CTP and UTP) also competed with 2-azido-ATP labelling, suggesting that the site was a general nucleotide-binding site rather than a strict ATP-binding site. Ca(2+), which also stimulates 5LO activity, had no effect on the labelling of the nucleotide-binding site. Digestion with trypsin and peptide sequencing showed that two fragments of 5LO were labelled by 2-azido-ATP. These fragments correspond to residues 73-83 (KYWLNDDWYLK, in single-letter amino acid code) and 193-209 (FMHMFQSSWNDFADFEK) in the 5LO sequence. Trp-75 and Trp-201 in these peptides were modified by the labelling, suggesting that they were immediately adjacent to the C-2 position of the adenine ring of ATP. Given the stoichiometry of the labelling, the two peptide sequences of 5LO were probably near each other in the enzyme's tertiary structure, composing or surrounding the ATP-binding site of 5LO. PMID:11042125

  3. Phylogenetic analysis of Brassiceae based on the nucleotide sequences of the S-locus related gene, SLR1.

    PubMed

    Inaba, Ryuichi; Nishio, Takeshi

    2002-12-01

    Nucleotide sequences of orthologs of the S-locus related gene, SLR1, in 20 species of Brassicaceae were determined and compared with the previously reported SLR1 sequences of six species. Identities of deduced amino-acid sequences with Brassica oleracea SLR1 ranged from 66.0% to 97.6%, and those with B. oleracea SRK and SLR2 were less than 62% and 55%, respectively. In multiple alignment of deduced amino-acid sequences, the 180-190th amino-acid residues from the initial methionine were highly variable, this variable region corresponding to hypervariable region I of SLG and SRK. A phylogenetic tree based on the deduced amino-acid sequences showed a close relationship of SLR1 orthologs of species in the Brassicinae and Raphaninae. Brassica nigra SLR1 was found to belong to the same clade as Sinapis arvensis and Diplotaxis siifolia, while the sequences of the other Brassica species belonged to another clade together with B. oleracea and Brassica rapa. The phylogenetic tree was similar to previously reported trees constructed using the data of electrophoretic band patterns of chloroplast DNA, though minor differences were found. Based on synonymous substitution rates in SLR1, the diversification time of SLR1 orthologs between species in the Brassicinae was estimated. The evolution and function of SLR1 and the phylogenetic relationship of Brassiceae plants are discussed.

  4. PerPlot & PerScan: tools for analysis of DNA curvature-related periodicity in genomic nucleotide sequences

    PubMed Central

    2011-01-01

    Background: Periodic spacing of short adenine or thymine runs phased with DNA helical period of ~10.5 bp is associated with intrinsic DNA curvature and deformability, which play important roles in DNA-protein interactions and in the organization of chromosomes in both eukaryotes and prokaryotes. Local differences in DNA sequence periodicity have been linked to differences in gene expression in some organisms. Despite the significance of these periodic patterns, there are virtually no publicly accessible tools for their analysis. Results: We present novel tools suitable for assessments of DNA curvature-related sequence periodicity in nucleotide sequences at the genome scale. Utility of the present software is demonstrated on a comparison of sequence periodicities in the genomes of Haemophilus influenzae, Methanocaldococcus jannaschii, Saccharomyces cerevisiae, and Arabidopsis thaliana. The software can be accessed through a web interface and the programs are also available for download. Conclusions: The present software is suitable for comparing DNA curvature-related sequence periodicity among different genomes as well as for analysis of intrachromosomal heterogeneity of the sequence periodicity. It provides a quick and convenient way to detect anomalous regions of chromosomes that could have unusual structural and functional properties and/or distinct evolutionary history. PMID:22587738
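
    The idea of periodically spaced A or T runs can be sketched in a few lines. This is not the PerPlot or PerScan algorithm, whose details are not given in the abstract; it is a simplified illustration that assumes a run means at least two identical A or T bases and that simply counts the distances between run starts. All function names and parameters are hypothetical.

```python
from collections import Counter


def at_run_starts(seq, run=2):
    """Start positions of runs of at least `run` identical A's or T's."""
    seq = seq.upper()
    starts = []
    i = 0
    while i < len(seq):
        j = i
        while j < len(seq) and seq[j] == seq[i]:
            j += 1  # extend the current run of identical bases
        if seq[i] in "AT" and j - i >= run:
            starts.append(i)
        i = j
    return starts


def spacing_histogram(positions, max_dist=30):
    """Count distances (up to max_dist) between all pairs of run starts."""
    counts = Counter()
    for idx, p in enumerate(positions):
        for q in positions[idx + 1:]:
            d = q - p
            if d > max_dist:
                break  # positions are sorted, so later ones are farther away
            counts[d] += 1
    return counts


def dominant_spacing(seq, min_dist=5, max_dist=20):
    """Most frequent spacing between A/T runs in [min_dist, max_dist], or None."""
    hist = spacing_histogram(at_run_starts(seq), max_dist)
    candidates = {d: c for d, c in hist.items() if d >= min_dist}
    return max(candidates, key=candidates.get) if candidates else None
```

    Because distances are integers, a real ~10.5 bp signal would be expected to spread across spacings of 10 and 11; a genome-scale tool would also need windowing and a background model, which this sketch omits.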

  5. Phylogeny of Populus (Salicaceae) based on nucleotide sequences of chloroplast trnT-trnF region and nuclear rDNA.

    PubMed

    Hamzeh, Mona; Dayanandan, Selvadurai

    2004-09-01

    The species of the genus Populus, collectively known as poplars, are widely distributed over the northern hemisphere and well known for their ecological, economic, and evolutionary importance. The extensive interspecific hybridization and high morphological diversity in this group pose difficulties in identifying taxonomic units for comparative evolutionary studies and systematics. To understand the evolutionary relationships among poplars and to provide a framework for biosystematic classification, we reconstructed a phylogeny of the genus Populus based on nucleotide sequences of three noncoding regions of the chloroplast DNA (intron of trnL and intergenic regions of trnT-trnL and trnL-trnF) and ITS1 and ITS2 of the nuclear rDNA. The resulting phylogenetic trees showed polyphyletic relationships among species in the sections Tacamahaca and Aigeiros. Based on chloroplast DNA sequence data, P. nigra had a close affinity to species of section Populus, whereas nuclear DNA sequence data suggested a close relationship between P. nigra and species of the section Aigeiros, suggesting a possible hybrid origin for P. nigra. Similarly, the chloroplast DNA sequences of P. tristis and P. szechuanica were similar to those of the species of section Aigeiros, while the nuclear sequences revealed a close affinity to species of the section Tacamahaca, suggesting a hybrid origin for these two Asiatic balsam poplars. The incongruence between phylogenetic trees based on nuclear- and chloroplast-DNA sequence data suggests a reticulate evolution in the genus Populus.

  6. Infectivity and complete nucleotide sequence of the genome of a genetically distinct strain of maize streak virus from Reunion Island.

    PubMed

    Peterschmitt, M; Granier, M; Frutos, R; Reynaud, B

    1996-01-01

    A complete infectious genome of an isolate of maize streak subgroup 1 geminivirus from Reunion Island (MSV-R) was cloned and sequenced. Using an Agrobacterium tumefaciens Ti plasmid delivery system, the cloned 2.7 kb circular DNA was shown to be infectious in maize. The agroinfected virus could be transmitted by Cicadulina mbila, the most common vector species of MSV in Reunion. Analysis of open reading frames (ORFs) revealed seven potential coding regions including the 4 ORFs conserved in all geminiviruses infecting monocotyledonous plants, the 2 on the viral "+" strand (MP, CP), and the 2 on the complementary "-" strand (RepA, RepB). The nucleotide sequence of MSV-R was compared to the previously determined sequences of three African clones from Nigeria (MSV-N), Kenya (MSV-K), and South Africa (MSV-S). More similarity was found between the African clones (97.0-97.3%) than between these and MSV-R (94.4-95.3%). Nucleotide substitutions were frequent in the large intergenic region, particularly in and around the most likely TATA box for the complementary sense genes, and in the 5' end of ORF V1. The comparison of the predicted peptide sequences of the proteins encoded by ORFs MP, RepA and RepB confirmed the higher similarity between the African clones (97.8-99.3%) than between these and MSV-R (95.1-97.1%). However, the amino acid sequences of the protein encoded by ORF CP (capsid protein) were highly conserved among all 4 clones, suggesting a high selection pressure on this ORF. PMID:8893787

  7. Molecular cloning, nucleotide sequence, and expression in Escherichia coli of a hemolytic toxin (aerolysin) gene from Aeromonas trota

    SciTech Connect

    Khan, A.A.; Kim, E.; Cerniglia, C.E.

    1998-07-01

    Aeromonas trota AK2, which was derived from ATCC 49659 and produces the extracellular pore-forming hemolytic toxin aerolysin, was mutagenized with the transposon mini-Tn5Km1 to generate a hemolysin-deficient mutant, designated strain AK253. Southern blotting data indicated that an 8.7-kb NotI fragment of the genomic DNA of strain AK253 contained the kanamycin resistance gene of mini-Tn5Km1. The 8.7-kb NotI DNA fragment was cloned into the vector pGEM5Zf(-) by selecting for kanamycin resistance, and the resultant clone, pAK71, showed aerolysin activity in Escherichia coli JM109. The nucleotide sequence of the aerA gene, located on the 1.8-kb ApaI-EcoRI fragment, was determined to consist of 1,479 bp and to have an ATG initiation codon and a TAA termination codon. An in vitro coupled transcription-translation analysis of the 1.8-kb region suggested that the aerA gene codes for a 54-kDa protein, in agreement with nucleotide sequence data. The deduced amino acid sequence of the aerA gene product of A. trota exhibited 99% homology with the amino acid sequence of the aerA product of Aeromonas sobria AB3 and 57% homology with the amino acid sequences of the products of the aerA genes of Aeromonas salmonicida 17-2 and A. sobria 33.

  8. Rapid DNA Sequencing by Direct Nanoscale Reading of Nucleotide Bases on Individual DNA Chains

    SciTech Connect

    Lee, James Weifu; Meller, Amit

    2007-01-01

    Since the independent invention of DNA sequencing by Sanger and by Gilbert 30 years ago, it has grown from a small scale technique capable of reading several kilobase-pair of sequence per day into today's multibillion dollar industry. This growth has spurred the development of new sequencing technologies that do not involve either electrophoresis or Sanger sequencing chemistries. Sequencing by Synthesis (SBS) involves multiple parallel micro-sequencing addition events occurring on a surface, where data from each round is detected by imaging. New High Throughput Technologies for DNA Sequencing and Genomics is the second volume in the Perspectives in Bioanalysis series, which looks at the electroanalytical chemistry of nucleic acids and proteins, development of electrochemical sensors and their application in biomedicine and in the new fields of genomics and proteomics. The authors have expertly formatted the information for a wide variety of readers, including new developments that will inspire students and young scientists to create new tools for science and medicine in the 21st century. Reviews of complementary developments in Sanger and SBS sequencing chemistries, capillary electrophoresis and microdevice integration, MS sequencing and applications set the framework for the book.

  9. [Classification of nucleotide sequences over their frequency dictionaries reveals a relation between the structure of sequences and taxonomy of their bearers].

    PubMed

    Gorban', A N; Popova, T G; Sadovskiĭ, M G

    2003-01-01

    The classification of 16S RNA sequences by their frequency dictionaries, both real and transformed, was studied. Two entities were considered structurally close to each other if their frequency dictionaries were close in the Euclidean metric. A transformation procedure for a frequency dictionary was implemented that reveals the peculiarities of the information structure of a nucleotide sequence. A comparative study was carried out of two classifications, one developed over the real frequency dictionary and one over the transformed frequency dictionary. A strong correlation was revealed between the classification and the taxonomy of the 16S RNA bearer. For the classes isolated, the information-valuable words were identified; these words are the main factors that differentiate the classes. The frequency dictionaries containing words of length 3 exhibit the best correlation between a class and a genus. A genus, as a rule, falls entirely within a single class, and exceptions are sporadic. Development of a hierarchical classification over the transformed frequency dictionaries separated one or two taxonomic groups at each stage of classification. Unexpectedly frequent or, conversely, unexpectedly rare occurrences of words (of length 3) in the entities under consideration account for the structural difference between the classes of nucleotide sequences.
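
    A frequency dictionary and the Euclidean comparison described above can be sketched as follows. This is a minimal illustration that assumes overlapping words of length 3 and covers only the "real" dictionary; the transformation procedure mentioned in the abstract is not specified there and is not reproduced. The function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def frequency_dictionary(seq, k=3):
    """Relative frequencies of all overlapping words of length k in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence is shorter than the word length")
    return {word: n / total for word, n in counts.items()}


def euclidean_distance(dict_a, dict_b):
    """Euclidean distance between two frequency dictionaries."""
    words = set(dict_a) | set(dict_b)  # words missing from one side count as 0
    return sqrt(sum((dict_a.get(w, 0.0) - dict_b.get(w, 0.0)) ** 2 for w in words))
```

    Two sequences would then be treated as structurally close when `euclidean_distance` between their dictionaries is small; any clustering method applied on top of these distances is a separate choice.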

  10. Nucleotide sequence of the pnd gene in plasmid R483 and role of the pnd gene product in plasmolysis.

    PubMed

    Ono, K; Akimoto, S; Ohnishi, Y

    1987-01-01

    The pnd gene of R plasmid R483, like the srnB gene of the F plasmid, increases the degradation of stable RNA in Escherichia coli. The nucleotide sequence of the pnd locus was determined and compared with that of the srnB locus. The genes have open reading frames that are 54% homologous, and both have an upstream inverted repeat sequence. The pnd gene expression seems to decrease the osmotic barrier of the cytoplasmic membrane, since no plasmolytic vacuoles were formed in the cells carrying the gene when the cells were exposed to hypertonic sucrose solution. This result suggests that RNase I in the periplasm passes through the altered membrane to degrade stable RNA in the cytoplasm.

  11. Purification of the gam gene-product of bacteriophage Mu and determination of the nucleotide sequence of the gam gene.

    PubMed Central

    Akroyd, J E; Clayson, E; Higgins, N P

    1986-01-01

    The gam gene of bacteriophage Mu encodes a protein which protects linear double stranded DNA from exonuclease degradation in vitro and in vivo. We purified the Mu gam gene product to apparent homogeneity from cells in which it is over-produced from a plasmid clone. The purified protein is a dimer of identical subunits of 18.9 kd. It can aggregate DNA into large, rapidly sedimenting complexes and is a potent exonuclease inhibitor when bound to DNA. The N-terminal amino acid sequence of the purified protein was determined by automated degradation and the nucleotide sequence of the Mu gam gene is presented to accurately map its position in the Mu genome. PMID:2945162

  12. A Simple Sequence Repeat- and Single-Nucleotide Polymorphism-Based Genetic Linkage Map of the Brown Planthopper, Nilaparvata lugens

    PubMed Central

    Jairin, Jirapong; Kobayashi, Tetsuya; Yamagata, Yoshiyuki; Sanada-Morimura, Sachiyo; Mori, Kazuki; Tashiro, Kosuke; Kuhara, Satoru; Kuwazaki, Seigo; Urio, Masahiro; Suetsugu, Yoshitaka; Yamamoto, Kimiko; Matsumura, Masaya; Yasui, Hideshi

    2013-01-01

    In this study, we developed the first genetic linkage map for the major rice insect pest, the brown planthopper (BPH, Nilaparvata lugens). The linkage map was constructed by integrating linkage data from two backcross populations derived from three inbred BPH strains. The consensus map consists of 474 simple sequence repeats, 43 single-nucleotide polymorphisms, and 1 sequence-tagged site, for a total of 518 markers at 472 unique positions in 17 linkage groups. The linkage groups cover 1093.9 cM, with an average distance of 2.3 cM between loci. The average number of marker loci per linkage group was 27.8. The sex-linkage group was identified by exploiting X-linked and Y-specific markers. Our linkage map and the newly developed markers used to create it constitute an essential resource and a useful framework for future genetic analyses in BPH. PMID:23204257

  13. Fusion protein of the paramyxovirus simian virus 5: nucleotide sequence of mRNA predicts a highly hydrophobic glycoprotein.

    PubMed Central

    Paterson, R G; Harris, T J; Lamb, R A

    1984-01-01

    The nucleotide sequence of the mRNA coding for the fusion glycoprotein (F) of the paramyxovirus, simian virus 5, has been obtained. There is a single large open reading frame on the mRNA that encodes a protein of 529 amino acids with a molecular weight of 56,531. The proteolytic cleavage/activation site of F, to yield F2 and F1, contains five arginine residues. Six potential glycosylation sites were identified in the protein, two on F2 and four on F1. The deduced amino acid sequence indicates that F is extensively hydrophobic over the length of the polypeptide chain. Three regions are very hydrophobic and could interact directly with membranes: these are the NH2-terminal putative signal peptide, the COOH-terminal putative membrane anchorage domain, and the NH2-terminal region of F1. PMID:6093114

  14. The complete nucleotide sequence and genomic characterization of tropical soda apple mosaic virus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Tropical soda apple mosaic virus (TSAMV) was first identified in tropical soda apple (Solanum viarum), a noxious weed, in Florida in 2002. This report provides the first full genome sequence of TSAMV. The full genome sequence of this virus will enable research scientists to develop additional spec...

  15. The nucleotide sequence of the mouse embryonic beta-like y-globin messenger RNA as determined from cloned cDNA.

    PubMed

    Vanin, E F; Farace, M G; Gambari, R; Fantoni, A

    1981-12-01

    We have determined the nucleotide sequence of two cloned cDNAs corresponding to the mRNA of mouse embryonic y2 globin. The combined overlapping sequences span a total of 480 bp, beginning at the codon corresponding to amino acid residue 21 and extending to the AATAAA sequence in the 3' untranslated region. Therefore, when the amino acid sequence encoded by the cDNA is combined with the available amino acid sequence, a complete y2 protein sequence can be obtained. Comparisons, at the nucleotide level, between the known beta- and beta-like globin sequences and the y2 sequence show that the embryonic, fetal-adult duplication occurred approx. 160 million years (MY) ago and that the embryonic-fetal duplication occurred approx. 100 MY ago.

  16. Prioritization Of Nonsynonymous Single Nucleotide Variants For Exome Sequencing Studies Via Integrative Learning On Multiple Genomic Data

    PubMed Central

    Wu, Mengmeng; Wu, Jiaxin; Chen, Ting; Jiang, Rui

    2015-01-01

    The rapid advancement of next generation sequencing technology has greatly accelerated the progress for understanding human inherited diseases via such innovations as exome sequencing. Nevertheless, the identification of causative variants from sequencing data remains a great challenge. Traditional statistical genetics approaches such as linkage analysis and association studies have limited power in analyzing exome sequencing data, while approaches that rely on simple filtration strategies and predicted functional implications of mutations to pinpoint pathogenic variants are prone to produce false positives. To overcome these limitations, we herein propose a supervised learning approach, termed snvForest, to prioritize candidate nonsynonymous single nucleotide variants for a specific type of disease by integrating 11 functional scores at the variant level and 8 association scores at the gene level. We conduct a series of large-scale in silico validation experiments, demonstrating the effectiveness of snvForest across 2,511 diseases of different inheritance styles and the superiority of our approach over two state-of-the-art methods. We further apply snvForest to three real exome sequencing data sets of epileptic encephalopathies and intellectual disability to show the ability of our approach to identify causative de novo mutations for these complex diseases. The online service and standalone software of snvForest are found at http://bioinfo.au.tsinghua.edu.cn/jianglab/snvforest. PMID:26459872

  17. Isolation and nucleotide sequence of a sesquiterpene cyclase gene from the trichothecene-producing fungus Fusarium sporotrichioides.

    PubMed

    Hohn, T M; Beremand, P D

    1989-06-30

    The trichodiene synthase gene (Tox5) has been isolated from the fungus Fusarium sporotrichioides, and its nucleotide (nt) sequence determined. A lambda gt11 library of F. sporotrichioides DNA was screened with antiserum against trichodiene synthase (TS). DNA fragments were isolated which encode a portion of the Tox5 gene. In subsequent screening of the library we employed one of these DNAs as a probe and identified several recombinant phage containing the entire Tox5 gene. The gene consists of a 1182-nt open reading frame (ORF) which contains a 60-nt intron and specifies a Mr 43,999 protein. The deduced amino acid sequence of the ORF was identical to sequences determined for several CNBr peptides from purified TS. Southern and Northern analyses indicated that the Tox5 gene is present in a single copy and is transcribed into an mRNA of about 1450 nt. Upstream from the start codon, 'TATA'-like sequences and a short repeated sequence resembling the 'CCAAT' box were observed. The primary structure described for TS is the first such report for a member of the terpene cyclase group of enzymes. PMID:2777086

  18. Prioritization Of Nonsynonymous Single Nucleotide Variants For Exome Sequencing Studies Via Integrative Learning On Multiple Genomic Data.

    PubMed

    Wu, Mengmeng; Wu, Jiaxin; Chen, Ting; Jiang, Rui

    2015-01-01

    The rapid advancement of next generation sequencing technology has greatly accelerated the progress for understanding human inherited diseases via such innovations as exome sequencing. Nevertheless, the identification of causative variants from sequencing data remains a great challenge. Traditional statistical genetics approaches such as linkage analysis and association studies have limited power in analyzing exome sequencing data, while approaches that rely on simple filtration strategies and predicted functional implications of mutations to pinpoint pathogenic variants are prone to produce false positives. To overcome these limitations, we herein propose a supervised learning approach, termed snvForest, to prioritize candidate nonsynonymous single nucleotide variants for a specific type of disease by integrating 11 functional scores at the variant level and 8 association scores at the gene level. We conduct a series of large-scale in silico validation experiments, demonstrating the effectiveness of snvForest across 2,511 diseases of different inheritance styles and the superiority of our approach over two state-of-the-art methods. We further apply snvForest to three real exome sequencing data sets of epileptic encephalopathies and intellectual disability to show the ability of our approach to identify causative de novo mutations for these complex diseases. The online service and standalone software of snvForest are found at http://bioinfo.au.tsinghua.edu.cn/jianglab/snvforest. PMID:26459872

  19. Characterization, nucleotide sequence and genome organization of leek white stripe virus, a putative new species of the genus Necrovirus.

    PubMed

    Lot, H; Rubino, L; Delecolle, B; Jacquemond, M; Turturo, C; Russo, M

    1996-01-01

    White stripe is a disease affecting leek in France with which an isometric virus c. 30 nm in diameter is associated. The most evident symptom is the presence of white stripes on the leaves extending to the stem. Attempts to demonstrate transmission through the soil by sowing or transplanting leek in contaminated soil were unsuccessful. The virus was transmitted by sap inoculation to a narrow range of herbaceous hosts, all of which were infected only locally. Virus purification was from infected leek tissues, where it accumulated in large amounts, as demonstrated by ultrastructural observations. RNA was extracted from purified virus preparations and cDNA clones were prepared. The complete nucleotide sequence of the viral RNA was determined: The genome is 3,662 nucleotides long and contains five open reading frames (ORFs). The first (ORF 1) encodes a putative translation product of M(r) 23,803 (p24) and read through of its amber stop codon results in a protein of M(r) 82,625 (p83) (ORF 2). ORF 3 and ORF 4 encode two small polypeptides of M(r) 11,280 (p11) and M(r) 6,261 (p6), respectively. ORF 5 encodes the capsid protein of M(r) 27,460 (p27). The genome organization and sequence alignments with the corresponding products of necroviruses suggest that the virus isolated from leek is a new species in the genus Necrovirus, for which the name of leek white stripe virus (LWSV) is proposed.

  20. Nucleotide sequence analysis of genes encoding a toluene/benzene-2-monooxygenase from Pseudomonas sp. strain JS150

    SciTech Connect

    Johnson, G.R.; Olsen, R.H.

    1995-09-01

    Pseudomonas sp. strain JS150 metabolizes benzene and alkyl- and chloro-substituted benzenes by using dioxygenase-initiated pathways coupled with multiple downstream metabolic pathways to accommodate catechol metabolism. By cloning genes encoding benzene-degradative enzymes, strain JS150 was also found to carry genes for a toluene/benzene-2-monooxygenase. The gene cluster encoding a 2-monooxygenase and its cognate regulator was cloned from a plasmid carried by strain JS150. Oxygen (¹⁸O₂) incorporation experiments using Pseudomonas aeruginosa strains carrying the cloned genes confirmed toluene hydroxylation was catalyzed through an authentic monooxygenase reaction to yield ortho-cresol. Encoding the toluene-2-monooxygenase and regulatory gene product was localized in two regions of the cloned fragment. The nucleotide sequence of the toluene/benzene-2-monooxygenase locus was determined, revealing six open reading frames that were then designated tbmA, tbmB, tbmC, tbmD, tbmE, and tbmF. The deduced amino acid sequences for these genes showed the presence of motifs similar to well-conserved functional domains of multicomponent oxygenases. This analysis allowed the tentative identification of two terminal oxygenase subunits (TbmB and TbmD) and an electron transport protein (TbmF) for the monooxygenase enzyme. All the tbm polypeptides shared significant homology with protein components from other bacterial multicomponent monooxygenases. Overall, the tbm gene products shared greater similarity with polypeptides from the phenol hydroxylases of Pseudomo-KR1 and Burkholderia (Pseudomonas) pickettii PKO1. The relationship found between the phenol hydroxylases and a toluene-2-monooxygenase, characterized in this study for the first time at the nucleotide sequence level, suggested DNA probes used for surveys of environmental populations should be carefully selected to reflect DNA sequences corresponding to the metabolic pathway of interest. 58 refs., 8 figs., 1 tab.

  1. Nucleotide sequence of the gene encoding the nitrogenase iron protein of Thiobacillus ferrooxidans

    SciTech Connect

    Pretorius, I.M.; Rawlings, D.E.; O'Neill, E.G.; Jones, W.A.; Kirby, R.; Woods, D.R.

    1987-01-01

    The DNA sequence was determined for the cloned Thiobacillus ferrooxidans nifH and part of the nifD genes. The DNA chains were radiolabeled with (α-³²P)dCTP (3000 Ci/mmol) or (α-³⁵S)dCTP (400 Ci/mmol). A putative T. ferrooxidans nifH promoter was identified whose sequences showed perfect consensus with those of the Klebsiella pneumoniae nif promoter. Two putative consensus upstream activator sequences were also identified. The amino acid sequence was deduced from the DNA sequence. In a comparison of nifH DNA sequences from T. ferrooxidans and eight other nitrogen-fixing microbes, a Rhizobium sp. isolated from Parasponia andersonii showed the greatest homology (74%) and Clostridium pasteurianum (nifH1) showed the least homology (54%). In the comparison of the amino acid sequences of the Fe proteins, the Rhizobium sp. and Rhizobium japonicum showed the greatest homology (both 86%) and C. pasteurianum (nifH1 gene product) demonstrated the least homology (56%) to the T. ferrooxidans Fe protein.

  2. The method to compare nucleotide sequences based on the minimum entropy principle.

    PubMed

    Sadovsky, Michael G

    2003-03-01

    A new method to compare two (or several) symbol sequences is developed. The method is based on the comparison of the frequencies of the small fragments of the compared sequences; it requires neither string editing, nor other transformations of the compared objects. The comparison is executed through a calculation of the specific entropy of a frequency dictionary against the special dictionary called the hybrid one; this latter is the statistical ancestor of the group of sequences under comparison. Some applications of the developed method in the fields of genetics and bioinformatics are discussed.
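
    The comparison described above can be sketched under stated assumptions. The abstract does not define the hybrid dictionary, so this sketch assumes it is the arithmetic mean of the individual frequency dictionaries, which is the dictionary that minimizes the summed relative entropy of the group. "Specific entropy" is taken here to be the relative entropy of one dictionary against the hybrid. The function names and the word length `k` are hypothetical.

```python
from collections import Counter
from math import log


def frequency_dictionary(seq, k=2):
    """Relative frequencies of all overlapping words of length k in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence is shorter than the word length")
    return {word: n / total for word, n in counts.items()}


def hybrid_dictionary(dicts):
    """Assumed hybrid: word-by-word arithmetic mean of the dictionaries."""
    words = set().union(*dicts)
    n = len(dicts)
    return {w: sum(d.get(w, 0.0) for d in dicts) / n for w in words}


def specific_entropy(d, hybrid):
    """Relative entropy of dictionary d against the hybrid dictionary."""
    # hybrid[w] > 0 whenever d[w] > 0, provided d was used to build the hybrid.
    return sum(f * log(f / hybrid[w]) for w, f in d.items() if f > 0)
```

    With this definition, identical sequences give an entropy of zero, and the value grows as a sequence's word frequencies depart from those of the group; no alignment or string editing is involved.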

  3. The nucleotide sequence of Beneckea harveyi 5S rRNA. [bioluminescent marine bacterium]

    NASA Technical Reports Server (NTRS)

    Luehrsen, K. R.; Fox, G. E.

    1981-01-01

    The primary sequence of the 5S ribosomal RNA isolated from the free-living bioluminescent marine bacterium Beneckea harveyi is reported and discussed in regard to indications of phylogenetic relationships with the bacteria Escherichia coli and Photobacterium phosphoreum. Sequences were determined for oligonucleotide products generated by digestion with ribonuclease T1, pancreatic ribonuclease and ribonuclease T2. The presence of heterogeneity is indicated for two sites. The B. harveyi sequence can be arranged into the same four-helix secondary structure as E. coli and other prokaryotic 5S rRNAs. Examination of the 5S rRNA sequences of the three bacteria indicates that B. harveyi and P. phosphoreum are specifically related and share a common ancestor which diverged from an ancestor of E. coli at a somewhat earlier time, consistent with previous studies.

  4. Targeted capture enrichment and sequencing identifies extensive nucleotide variation in the turkey MHC-B.

    PubMed

    Reed, Kent M; Mendoza, Kristelle M; Settlage, Robert E

    2016-03-01

    Variation in the major histocompatibility complex (MHC) is increasingly associated with disease susceptibility and resistance in avian species of agricultural importance. This variation includes sequence polymorphisms but also structural differences (gene rearrangement) and copy number variation (CNV). The MHC has now been described for multiple galliform species including the best defined assemblies of the chicken (Gallus gallus) and domestic turkey (Meleagris gallopavo). Using this sequence resource, this study applied high-throughput sequencing to investigate MHC variation in turkeys of North America (NA turkeys). An MHC-specific SureSelect (Agilent) capture array was developed, and libraries were created for 14 turkeys representing domestic (commercial bred), heritage breed, and wild turkeys. In addition, a representative of the Ocellated turkey (M. ocellata) and chicken (G. gallus) was included to test cross-species applicability of the capture array allowing for identification of new species-specific polymorphisms. Libraries were hybridized to ∼12 K cRNA baits and the resulting pools were sequenced. On average, 98% of processed reads mapped to the turkey whole genome sequence and 53% to the MHC target. In addition to the MHC, capture hybridization recovered sequences corresponding to other MHC regions. Sequence alignment and de novo assembly indicated the presence of several additional BG genes in the turkey with evidence for CNV. Variant detection identified an average of 2245 polymorphisms per individual for the NA turkeys, 3012 for the Ocellated turkey, and 462 variants in the chicken (RJF-256). This study provides an extensive sequence resource for examining MHC variation and its relation to health of this agriculturally important group of birds.

  5. Four novel cystic fibrosis mutations in splice junction sequences affecting the CFTR nucleotide binding folds

    SciTech Connect

    Doerk, T.; Wulbrand, U.; Tuemmler, B.

    1993-03-01

    Single cases of the four novel splice site mutations 1525−1 G→A (intron 9), 3601−2 A→G (intron 18), 3850−3 T→G (intron 19), and 4374+1 G→T (intron 23) were detected in the CFTR gene of cystic fibrosis patients of Indo-Iranian, Turkish, Polish, and German descent. The nucleotide substitutions at the +1, −1, and −2 positions all destroy splice sites and lead to severe disease alleles associated with features typical of gastrointestinal and pulmonary cystic fibrosis disease. The 3850−3 T-to-G change was discovered in a very mildly affected 33-year-old ΔF508 compound heterozygote, suggesting that the T-to-G transversion at the less conserved −3 position of the acceptor splice site may retain some wildtype function. 13 refs., 1 fig., 2 tabs.

  6. Pyruvate decarboxylase from Pisum sativum. Properties, nucleotide and amino acid sequences.

    PubMed

    Mücke, U; Wohlfarth, T; Fiedler, U; Bäumlein, H; Rücknagel, K P; König, S

    1996-04-15

    To study the molecular structure and function of pyruvate decarboxylase (PDC) from plants the protein was isolated from pea seeds and partially characterised. The active enzyme which occurs in the form of higher oligomers consists of two different subunits appearing in SDS/PAGE and mass spectroscopy experiments. For further experiments, like X-ray crystallography, it was necessary to elucidate the protein sequence. Partial cDNA clones encoding pyruvate decarboxylase from seeds of Pisum sativum cv. Miko have been obtained by means of polymerase chain reaction techniques. The first sequences were found using degenerate oligonucleotide primers designated according to conserved amino acid sequences of known pyruvate decarboxylases. The missing parts of one cDNA were amplified applying the 3'- and 5'-rapid amplification of cDNA ends systems. The amino acid sequence deduced from the entire cDNA sequence displays strong similarity to pyruvate decarboxylases from other organisms, especially from plants. A molecular mass of 64 kDa was calculated for this protein correlating with estimations for the smaller subunit of the oligomeric enzyme. The PCR experiments led to at least three different clones representing the middle part of the PDC cDNA indicating the existence of three isozymes. Two of these isoforms could be confirmed on the protein level by sequencing tryptic peptides. Only anaerobically treated roots showed a positive signal for PDC mRNA in Northern analysis although the cDNA from imbibed seeds was successfully used for PCR.

  7. Nucleotide sequence variation of GLABRA1 contributing to phenotypic variation of leaf hairiness in Brassicaceae vegetables.

    PubMed

    Li, Feng; Zou, Zhongwei; Yong, Hui-Yee; Kitashiba, Hiroyasu; Nishio, Takeshi

    2013-05-01

    GLABRA1 (GL1) belongs to the group of R2R3-MYB transcription factors and is known to be essential for trichome initiation in Arabidopsis. In our previous study, we identified a GL1 ortholog in Brassica rapa as a candidate for the gene controlling leaf hairiness by QTL analysis and suggested that a 5-bp deletion (B-allele) and a 2-bp deletion (D-allele) in the exon 3 of BrGL1 and a non-synonymous SNP (C-allele) in the second nucleotide of exon 3 possibly cause leaf hairlessness. In this study, we transformed a B. rapa line having the B-allele with the A-allele (wild type) or the C-allele of BrGL1 under the control of the CaMV 35S promoter. The transgenic plants with the A-allele showed dense coverage of seedling tissues including stems, young leaves and hypocotyls with trichomes, whereas the phenotypes of those with the C-allele were unchanged. In order to obtain more information about allelic variation of GL1 in different plant lineages and its correlation with leaf hairiness, two GL1 homologs, i.e., RsGL1a and RsGL1b, in Raphanus sativus were analyzed. Allelic variation of RsGL1a between a hairless line and a hairy line was completely associated with hairiness in their BC1F1 population. Comparison of the full-length of RsGL1a in the hairless and hairy lines showed great variation of nucleotides in the 3' end, which might be essential for its function and expression.

  8. Isolation, expression, and nucleotide sequencing of the pilin structural gene of the Brazilian purpuric fever clone of Haemophilus influenzae biogroup aegyptius.

    PubMed Central

    St Geme, J W; Falkow, S

    1993-01-01

    In this study we isolated the pilin gene from the Brazilian purpuric fever (BPF) clone of Haemophilus influenzae biogroup aegyptius, expressed the gene in Escherichia coli, and determined its nucleotide sequence. Comparison of the nucleotide sequence of the BPF pilin gene with the sequences of pilin genes from strains of H. influenzae sensu stricto demonstrated a high degree of identity. Consistent with this observation, hemagglutination inhibition studies performed with a series of glycoconjugates indicated that BPF pili and H. influenzae type b pili possess the same erythrocyte receptor specificity. PMID:8478116

  9. Nucleotide sequence and genome organization of human parvovirus B19 isolated from the serum of a child during aplastic crisis.

    PubMed

    Shade, R O; Blundell, M C; Cotmore, S F; Tattersall, P; Astell, C R

    1986-06-01

    The nucleotide sequence of an almost-full-length clone of human parvovirus B19 was determined. Whereas the extreme left and right ends of this genomic clone are incomplete, the sequence clearly indicates that the two ends of viral DNA are related by inverted terminal repeats similar to those of the Dependovirus genus. The coding regions are complete in the cloned DNA, and the two large open reading frames which span almost the entire genome are restricted to one strand, as has been found for all other parvoviruses characterized to date. From the DNA sequence we conclude that the organization of the B19 transcription units is similar although not identical to those of other parvoviruses. In particular, we predict that the B19 genome may utilize a fourth promoter to transcribe mRNA encoding the major structural polypeptide, VP2. Analysis of the putative polypeptides confirms that B19 is only distantly related to the other parvoviruses but reveals that there is a small region in the gene probably encoding the major nonstructural protein of B19, which is closely conserved between all of the parvovirus genomes for which sequence information is currently available.

  11. Novel technologies applied to the nucleotide sequencing and comparative sequence analysis of the genomes of infectious agents in veterinary medicine.

    PubMed

    Granberg, F; Bálint, Á; Belák, S

    2016-04-01

    Next-generation sequencing (NGS), also referred to as deep, high-throughput or massively parallel sequencing, is a powerful new tool that can be used for the complex diagnosis and intensive monitoring of infectious disease in veterinary medicine. NGS technologies are also being increasingly used to study the aetiology, genomics, evolution and epidemiology of infectious disease, as well as host-pathogen interactions and other aspects of infection biology. This review briefly summarises recent progress and achievements in this field by first introducing a range of novel techniques and then presenting examples of NGS applications in veterinary infection biology. Various work steps and processes for sampling and sample preparation, sequence analysis and comparative genomics, and improving the accuracy of genomic prediction are discussed, as are bioinformatics requirements. Examples of sequencing-based applications and comparative genomics in veterinary medicine are then provided. This review is based on novel references selected from the literature and on experiences of the World Organisation for Animal Health (OIE) Collaborating Centre for the Biotechnology-based Diagnosis of Infectious Diseases in Veterinary Medicine, Uppsala, Sweden. PMID:27217166

  12. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... submissions in computer readable form. (a) The computer readable form required by § 1.821(e) shall meet the following requirements: (1) The computer readable form shall contain a single “Sequence Listing” as either...

  13. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... submissions in computer readable form. (a) The computer readable form required by § 1.821(e) shall meet the following requirements: (1) The computer readable form shall contain a single “Sequence Listing” as either...

  14. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... submissions in computer readable form. (a) The computer readable form required by § 1.821(e) shall meet the following requirements: (1) The computer readable form shall contain a single “Sequence Listing” as either...

  15. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... submissions in computer readable form. (a) The computer readable form required by § 1.821(e) shall meet the following requirements: (1) The computer readable form shall contain a single “Sequence Listing” as either...

  16. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... submissions in computer readable form. (a) The computer readable form required by § 1.821(e) shall meet the following requirements: (1) The computer readable form shall contain a single “Sequence Listing” as either...

  17. Nucleotide sequence of the structural gene for diphtheria toxin carried by corynebacteriophage beta.

    PubMed Central

    Greenfield, L; Bjorn, M J; Horn, G; Fong, D; Buck, G A; Collier, R J; Kaplan, D A

    1983-01-01

    A 1,942-base-pair DNA segment encoding the structural gene for diphtheria toxin was sequenced, and the primary structure of the toxin was deduced. Restriction enzyme fragments corresponding to nontoxic or hypotoxic peptides of the toxin were isolated from corynebacteriophage beta and cloned into Escherichia coli on plasmid pBR322, and the sequence was determined. The mature toxin molecule deduced from the sequence has 535 amino acid residues and a molecular weight of 58,342. The deduced sequence for the fragment A moiety was the same as that determined at the protein level, except for a single serine residue, which had been mispositioned in the earlier study. Several differences were noted with respect to the partial sequence data available on the fragment B moiety, some or all of which may reflect genetic variations among populations of corynephages carrying the toxin gene. The DNA sequence predicts a 25-residue leader peptide preceding the mature protein, which is presumably involved in secretion of the toxin from lysogenized Corynebacterium diphtheriae. We infer that initiation of translation probably occurs at a GTG codon (codon -25). Cloned restriction fragments containing sequences for the amino-terminal region of toxin, together with 5' flanking regions, were expressed in E. coli. Toxin-related peptides were synthesized and secreted into the periplasmic space. These results provide a basis for applying recombinant DNA methods to the study of diphtheria toxin and for producing novel, genetically altered forms of the toxin suited to the construction of new classes of immunotoxins. PMID:6316330

  18. Functional analysis and nucleotide sequence of the promoter region of the murine hck gene.

    PubMed Central

    Lock, P; Stanley, E; Holtzman, D A; Dunn, A R

    1990-01-01

    The structure and function of the promoter region and exon 1 of the murine hck gene have been characterized in detail. RNase protection analysis has established that hck transcripts initiate from heterogeneous start sites located within the hck gene. Fusion gene constructs containing hck 5'-flanking sequences and the bacterial Neor gene have been introduced into the hematopoietic cell lines FDC-P1 and WEHI-265 by using a self-inactivating retroviral vector. The transcriptional start sites of the fusion gene are essentially identical to those of the endogenous hck gene. Analysis of infected WEHI-265 cell lines treated with bacterial lipopolysaccharide (LPS) reveals a 3- to 5-fold elevation in the levels of endogenous hck mRNA and a 1.4- to 2.6-fold increase in the level of Neor fusion gene transcripts, indicating that hck 5'-flanking sequences are capable of conferring LPS responsiveness on the Neor gene. The 5'-flanking region of the hck gene contains sequences similar to an element which is thought to be involved in the LPS responsiveness of the class II major histocompatibility gene A alpha k. A subset of these sequences are also found in the 5'-flanking regions of other LPS-responsive genes. Moreover, this motif is related to the consensus binding sequence of NF-kappa B, a transcription factor which is known to be regulated by LPS. PMID:2388619

  19. The phylogenetic utility of nucleotide sequences of sorbitol 6-phosphate dehydrogenase in Prunus (Rosaceae).

    PubMed

    Bortiri, Esteban; Oh, Sang-Hun; Gao, Fang-You; Potter, Dan

    2002-10-01

    Sequences from s6pdh, a gene that encodes sorbitol-6-phosphate dehydrogenase in the Rosaceae, are used to reconstruct the phylogeny of 22 species of Prunus. The s6pdh sequences alone and in combination with previously published sequences of the internal transcribed spacer (ITS) and the cpDNA trnL-trnF spacer are analyzed using parsimony and maximum likelihood methods. Both methods reconstructed the same phylogeny when s6pdh sequences are used alone and in combination with ITS and trnL-trnF, and the topology is in agreement with previous studies that used a larger sample size. The s6pdh sequences have about twice as many informative sites as ITS. A molecular clock is rejected for s6pdh, most likely due to greater rates of evolution in subgenera Padus and Laurocerasus than in the rest of the genus. Phylogenetic reconstruction of Prunus as determined by analysis of the combined data set suggests an early split into two clades. One is composed of subgenera Cerasus, Laurocerasus, and Padus. The second includes subgenera Amygdalus, Emplectocladus, and Prunus. Species of section Microcerasus (formerly in subgenus Cerasus) are nested within subgenus Prunus. The order of branching and relationships among early diverging lineages is weakly supported, as a result of very short branches that may indicate rapid radiation. PMID:21665596

  20. Nucleotide sequence of the transcriptional initiation region of the yeast GAL7 gene.

    PubMed Central

    Nogi, Y; Fukasawa, T

    1983-01-01

    The GAL7 gene of Saccharomyces cerevisiae encodes Gal-1-P uridylyl transferase, the second enzyme of the Leloir pathway for galactose catabolism. We have determined the sequence of 1003 base pairs surrounding and upstream of the transcriptional initiation site of the GAL7 gene. The region sequenced also encompasses the 3' end of the GAL10 gene. The 5' end of GAL7 mRNA was mapped on the DNA sequence by S1 nuclease and exonuclease VII mapping and is located 21 to 22 base pairs upstream from the translation-initiating ATG codon. The primary structure of the GAL7 5' flanking region has many features common to those of multicellular eukaryotic genes. The 3' end of GAL10 mRNA was also determined, by the same mapping technique with single-strand-specific nucleases, to be about 600 base pairs upstream from the 5' end of GAL7 mRNA. PMID:6324089

  1. Comparison and analysis of the nucleotide sequences of pilin genes from Haemophilus influenzae type b strains Eagan and M43.

    PubMed Central

    Forney, L J; Marrs, C F; Bektesh, S L; Gilsdorf, J R

    1991-01-01

    Previous studies have demonstrated antigenic differences among the pili expressed by various strains of Haemophilus influenzae type b (Hib). In order to understand the molecular basis for these differences, the structural gene for pilin was cloned from Hib strain Eagan (p+) and the nucleotide sequence was compared to those of strains M43 (p+) and 770235 b0f+, which had been previously determined. The pilin gene of Hib strain Eagan (p+) had a 648-bp open reading frame that encoded a 20-amino-acid leader sequence followed by the 196 amino acids found in mature pilin. The translated sequence was three amino acids larger than pilins of strains M43 (p+) and 770235 b0f+ and was 78% identical and 95% homologous when conservative amino acid substitutions were considered. Differences between the amino acid sequences were not localized to any one region but rather were distributed throughout the proteins. Comparison of protein hydrophilicity profiles showed several hydrophilic regions with sequences that were conserved between strain Eagan (p+) and pilins of other Hib strains, and these regions represent potentially conserved antigenic domains. Southern blot analyses using an intragenic probe from the pilin gene of strain Eagan (p+) showed that the pilin gene was conserved among all type b and nontypeable strains of H. influenzae examined, and only a single copy was present in these strains. Homologous genes were not present in the phylogenetically related species Pasteurella multocida, Pasteurella haemolytica, and Actinobacillus pleuropneumoniae. These data indicate that the pilin gene was highly conserved among different strains of H. influenzae and that small differences in the pilin amino acid sequences account for the observed antigenic differences of assembled pili from these strains. PMID:2037360

  2. Nucleotide sequence discrepancies within the GA strain of Marek's disease virus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Comparative genomics between 9 gallid herpesvirus type 2 strains have singled out the virulent (v) prototype strain GA as phylogenetically distant from other v pathotypes. Multiple amino acid alignments of otherwise highly conserved unique long (UL) genes have indicated sequence discrepancies within...

  3. Symbolic complexity for nucleotide sequences: a sign of the genome structure

    NASA Astrophysics Data System (ADS)

    Salgado-García, R.; Ugalde, E.

    2016-11-01

    We introduce a method for estimating the complexity function (which counts the number of observable words of a given length) of a finite symbolic sequence, which we use to estimate the complexity function of coding DNA sequences for several species of the Hominidae family. In all cases, the obtained symbolic complexities show the same characteristic behavior: exponential growth for small word lengths, followed by linear growth for larger word lengths. The symbolic complexities of the species we consider exhibit a systematic trend in correspondence with the phylogenetic tree. Using our method, we estimate the complexity function of sequences obtained by some known evolution models, and in some cases we observe the characteristic exponential-linear growth of the Hominidae coding DNA complexity. Analysis of the symbolic complexity of sequences obtained from a specific evolution model points to the following conclusion: linear growth arises from the random duplication of large segments during the evolution of the genome, while the decrease in the overall complexity from one species to another is due to a difference in the speed of accumulation of point mutations.
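
    The quantity being estimated above can be made concrete with a short sketch. It computes only the naive observed complexity, that is, the number of distinct words of each length actually present in one finite sequence. It is not the estimation method the abstract introduces, and the function name is hypothetical.

```python
def complexity(seq, max_len):
    """Return {n: number of distinct words of length n observed in seq}."""
    return {
        n: len({seq[i:i + n] for i in range(len(seq) - n + 1)})
        for n in range(1, max_len + 1)
    }
```

    For a four-letter alphabet the observed count at word length n can never exceed either 4**n or the number of windows, len(seq) - n + 1. The second bound is why a naive count from a finite sequence flattens out at large n and why an estimator is needed.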

  4. The nucleotide sequence of the equine herpesvirus 4 gC gene homologue.

    PubMed

    Nicolson, L; Onions, D E

    1990-11-01

    The genomic position of an equine herpesvirus 4 (EHV-4) gene homologue of the herpes simplex virus 1 (HSV-1) gC gene was determined by Southern analysis and DNA sequencing. The gene lies within a 2-kbp BglII-EcoRI fragment mapping between 0.15 and 0.17 within the long unique component of the EHV-4 genome and is transcribed from right to left. Putative promoter elements were identified upstream of the 1455-bp open reading frame which encodes a 485-amino-acid protein of unglycosylated molecular weight 52,513. Computer-assisted analysis of the primary sequence predicts the protein possesses a domain structure characteristic of a type 1 integral membrane glycoprotein. Four domains were distinguished: (i) an N-terminal signal sequence, (ii) a large extracellular domain containing 11 putative N-linked glycosylation sites, (iii) a hydrophobic transmembrane domain, and (iv) a C-terminal charged domain. Comparison of the predicted amino acid sequence to that of other herpesvirus glycoproteins indicated identities of between 22 and 29% with HSV-1 gC, HSV-2 gC, VZV gpV, PRV gIII, BHV-1 gIII, and MDV A antigen and of 79% with EHV-1 gp13. A gene with no apparent homologue in HSV-1 or VZV maps immediately downstream of the EHV-4 gC gene homologue. PMID:2171212
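
    Putative N-linked glycosylation sites such as those counted above are commonly located by scanning a protein sequence for the sequon Asn-X-Ser/Thr, where X is any residue except proline. The abstract does not say how its sites were predicted, so the sketch below is an assumption about the approach, and the function name is hypothetical.

```python
import re

# Lookahead so that overlapping sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")

def sequon_positions(protein):
    """Return 0-based positions of Asn residues in N-X-S/T motifs (X != Pro)."""
    return [m.start() for m in SEQUON.finditer(protein)]
```

    A match shows only that the motif is present. Whether a site is actually glycosylated depends on context that a pattern scan cannot see, which is why the abstract calls the sites putative.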

  5. Phylogenetic discovery bias in Bacillus anthracis using single-nucleotide polymorphisms from whole-genome sequencing

    PubMed Central

    Pearson, Talima; Busch, Joseph D.; Ravel, Jacques; Read, Timothy D.; Rhoton, Shane D.; U'Ren, Jana M.; Simonson, Tatum S.; Kachur, Sergey M.; Leadem, Rebecca R.; Cardon, Michelle L.; Van Ert, Matthew N.; Huynh, Lynn Y.; Fraser, Claire M.; Keim, Paul

    2004-01-01

    Phylogenetic reconstruction using molecular data is often subject to homoplasy, leading to inaccurate conclusions about phylogenetic relationships among operational taxonomic units. Compared with other molecular markers, single-nucleotide polymorphisms (SNPs) exhibit extremely low mutation rates, making them rare in recently emerged pathogens, but they are less prone to homoplasy and thus extremely valuable for phylogenetic analyses. Despite their phylogenetic potential, ascertainment bias occurs when SNP characters are discovered through biased taxonomic sampling; by using whole-genome comparisons of five diverse strains of Bacillus anthracis to facilitate SNP discovery, we show that only polymorphisms lying along the evolutionary pathway between reference strains will be observed. We illustrate this in theoretical and simulated data sets in which complex phylogenetic topologies are reduced to linear evolutionary models. Using a set of 990 SNP markers, we also show how divergent branches in our topologies collapse to single points but provide accurate information on internodal distances and points of origin for ancestral clades. These data allowed us to determine the ancestral root of B. anthracis, showing that it lies closer to a newly described “C” branch than to either of two previously described “A” or “B” branches. In addition, subclade rooting of the C branch revealed unequal evolutionary rates that seem to be correlated with ecological parameters and strain attributes. Our use of nonhomoplastic whole-genome SNP characters allows branch points and clade membership to be estimated with great precision, providing greater insight into epidemiological, ecological, and forensic questions. PMID:15347815

  6. Isolation and nucleotide sequence of an autonomously replicating sequence (ARS) element functional in Candida albicans and Saccharomyces cerevisiae.

    PubMed

    Cannon, R D; Jenkinson, H F; Shepherd, M G

    1990-04-01

    An 8.6-kb fragment was isolated from an EcoRI digest of Candida albicans ATCC 10261 genomic DNA which conferred the property of autonomous replication in Saccharomyces cerevisiae on the otherwise non-replicative plasmid pMK155 (5.6 kb). The DNA responsible for the replicative function was subcloned as a 1.2-kb fragment onto a non-replicative plasmid (pRC3915) containing the C. albicans URA3 and LEU2 genes to form plasmid pRC3920. This plasmid was capable of autonomous replication in both S. cerevisiae and C. albicans and transformed S. cerevisiae AH22 (leu2-) to Leu+ at a frequency of 2.15 × 10^3 transformants per microgram DNA, and transformed C. albicans SGY-243 (Δura3) to Ura+ at a frequency of 1.91 × 10^3 transformants per microgram DNA. Sequence analysis of the cloned DNA revealed the presence of two identical regions of eleven base pairs (5'TTTTATGTTTT3') which agreed with the consensus of autonomously replicating sequence (ARS) cores functional in S. cerevisiae. In addition there were two 10/11 and numerous 9/11 matches to the core consensus. The two 11/11 matches to the consensus, CaARS1 and CaARS2, were located on opposite strands in a non-coding AT-rich region and were separated by 107 bp. Also present on the C. albicans DNA, 538 bp from the ARS cores, was a gene for 5S rRNA which showed sequence homology with several other yeast 5S rRNA genes. (ABSTRACT TRUNCATED AT 250 WORDS) PMID:2196431
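
    The search for 11/11, 10/11 and 9/11 matches on both strands can be sketched as a sliding-window scan. The sketch compares against a single fixed 11-mer (the one quoted in the abstract) instead of a degenerate consensus, and it assumes uppercase sequences containing only A, C, G and T. The function names and the reporting convention are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase ACGT-only DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def scan(seq, motif, min_matches):
    """List (start, strand, matches) for windows matching motif at
    min_matches or more positions; start is in plus-strand coordinates."""
    k = len(motif)
    hits = []
    for strand, target in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(target) - k + 1):
            matches = sum(a == b for a, b in zip(target[i:i + k], motif))
            if matches >= min_matches:
                # Map a minus-strand window back onto the plus strand.
                start = i if strand == "+" else len(seq) - k - i
                hits.append((start, strand, matches))
    return hits
```

    Calling scan(sequence, "TTTTATGTTTT", 9) would list every window with at most two mismatches. Filtering the result on the match count then separates the exact copies from the 10/11 and 9/11 matches.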

  7. Complete nucleotide sequences of okra isolates of Cotton leaf curl Gezira virus and their associated DNA-beta from Niger.

    PubMed

    Shih, S L; Kumar, S; Tsai, W S; Lee, L M; Green, S K

    2009-01-01

    Okra (Abelmoschus esculentus) is a major crop in Niger. In the fall of 2007, okra leaf curl disease was observed in Niger, and a begomovirus and a DNA-beta satellite were found to be associated with the disease. The complete nucleotide sequences of DNA-A (FJ469626 and FJ469627) and associated DNA-beta satellites (FJ469628 and FJ469629) were determined from two samples. This is the first report of molecular characterization of okra-infecting begomoviruses and their associated DNA-beta from Niger. The begomovirus and DNA-beta have been identified as Cotton leaf curl Gezira virus and Cotton leaf curl Gezira betasatellite, respectively, which are reported to also infect okra in Egypt, Mali and Sudan.

  8. Complete nucleotide sequences of a new bipartite begomovirus from Malvastrum sp. plants with bright yellow mosaic symptoms in South Texas.

    PubMed

    Alabi, Olufemi J; Villegas, Cecilia; Gregg, Lori; Murray, K Daniel

    2016-06-01

    Two isolates of a novel bipartite begomovirus, tentatively named malvastrum bright yellow mosaic virus (MaBYMV), were molecularly characterized from naturally infected plants of the genus Malvastrum showing bright yellow mosaic disease symptoms in South Texas. Six complete DNA-A and five DNA-B genome sequences of MaBYMV obtained from the isolates ranged in length from 2,608 to 2,609 nucleotides (nt) and 2,578 to 2,605 nt, respectively. Both genome segments shared a 178- to 180-nt common region. In pairwise comparisons, the complete DNA-A and DNA-B sequences of MaBYMV were most similar (87-88 % and 79-81 % identity, respectively) and phylogenetically related to the corresponding sequences of sida mosaic Sinaloa virus-[MX-Gua-06]. Further analysis revealed that MaBYMV is a putative recombinant virus, thus supporting the notion that malvaceous hosts may be influencing the evolution of several begomoviruses. The design of new diagnostic primers enabled the detection of MaBYMV in cohorts of Bemisia tabaci collected from symptomatic Malvastrum sp. plants, thus implicating whiteflies as potential vectors of the virus. PMID:27016928

  9. Conserved nucleotide sequences in the open reading frame and 3' untranslated region of selenoprotein P mRNA.

    PubMed Central

    Hill, K E; Lloyd, R S; Burk, R F

    1993-01-01

    Rat liver selenoprotein P contains 10 selenocysteine residues in its primary structure (deduced). It is the only selenoprotein characterized to date that has more than one selenocysteine residue. Selenoprotein P cDNA has been cloned from human liver and heart cDNA libraries and sequenced. The open reading frames are identical and contain a signal peptide, indicating that the protein is secreted by both organs and is therefore not exclusively produced in the liver. Ten selenocysteine residues (deduced) are present. Comparison of the open reading frame of the human cDNA with the rat cDNA reveals a 69% identity of the nucleotide sequence and 72% identity of the deduced amino acid sequence. Two regions in the 3' untranslated portion have high conservation between human and rat. Each of these regions contains a predicted stable stem-loop structure similar to the single stem-loop structures reported in 3' untranslated regions of type I iodothyronine 5'-deiodinase and glutathione peroxidase. The stem-loop structure of type I iodothyronine 5'-deiodinase has been shown to be necessary for incorporation of the selenocysteine residue at the UGA codon. Because only two stem-loop structures are present in the 3' untranslated region of selenoprotein P mRNA, it can be concluded that a separate stem-loop structure is not required for each selenocysteine residue. PMID:8421687

  10. Cloning and nucleotide sequence of the glpD gene encoding sn-glycerol-3-phosphate dehydrogenase of Pseudomonas aeruginosa.

    PubMed Central

    Schweizer, H P; Po, C

    1994-01-01

    Nitrosoguanidine-induced Pseudomonas aeruginosa mutants which were unable to utilize glycerol as a carbon source were isolated. By utilizing PAO104, a mutant defective in glycerol transport and sn-glycerol-3-phosphate dehydrogenase (glpD), the glpD gene was cloned by a phage mini-D3112-based in vivo cloning method. The cloned gene was able to complement an Escherichia coli glpD mutant. Restriction analysis and recloning of DNA fragments located the glpD gene to a 1.6-kb EcoRI-SphI DNA fragment. In E. coli, a single 56,000-Da protein was expressed from the cloned DNA fragments. An in-frame glpD'-'lacZ translational fusion was isolated and used to determine the reading frame of glpD by sequencing across the fusion junction. The nucleotide sequence of a 1,792-bp fragment containing the glpD region was determined. The glpD gene encodes a protein containing 510 amino acids and with a predicted molecular weight of 56,150. Compared with the aerobic sn-glycerol-3-phosphate dehydrogenase from E. coli, P. aeruginosa GlpD is 56% identical and 69% similar. A similar comparison with GlpD from Bacillus subtilis reveals 21% identity and 40% similarity. A flavin-binding domain near the amino terminus which shared the consensus sequence reported for other bacterial flavoproteins was identified. PMID:8157588

  11. Nucleotide-Resolution Profiling of RNA Recombination in the Encapsidated Genome of a Eukaryotic RNA Virus by Next-Generation Sequencing

    PubMed Central

    Routh, Andrew; Ordoukhanian, Phillip; Johnson, John E.

    2012-01-01

    Next-Generation Sequencing has been used in numerous investigations to characterize and quantify the genetic diversity of a virus sample through the mapping of polymorphisms and measurement of mutation frequencies. Next-Generation Sequencing has also been employed to identify recombination events occurring within the genomes of higher organisms, for example, detecting alternative RNA splicing events and oncogenic chromosomal rearrangements. Here, we combine these two approaches to profile RNA recombination within the encapsidated genome of a eukaryotic RNA virus, Flock House Virus. We detect hundreds of thousands of recombination events, with single-nucleotide resolution, which result in diversity in the encapsidated genome rivaling that due to mismatch mutation. We detect previously identified Defective-RNAs as well as many other abundant and novel Defective-RNAs. Our approach is exceptionally sensitive, unbiased, and requires no prior knowledge beyond the virus genome sequence. RNA recombination is a powerful driving force behind the evolution and adaptation of RNA viruses. The strategy implemented here is widely applicable and provides a highly detailed description of the complex mutational landscape of the transmissible viral genome. PMID:23069247

  12. Gene Cloning and Nucleotide Sequencing and Properties of a Cocaine Esterase from Rhodococcus sp. Strain MB1

    PubMed Central

    Bresler, Matthew M.; Rosser, Susan J.; Basran, Amrik; Bruce, Neil C.

    2000-01-01

    A strain of Rhodococcus designated MB1, which was capable of utilizing cocaine as a sole source of carbon and nitrogen for growth, was isolated from rhizosphere soil of the tropane alkaloid-producing plant Erythroxylum coca. A cocaine esterase was found to initiate degradation of cocaine, which was hydrolyzed to ecgonine methyl ester and benzoate; both of these esterolytic products were further metabolized by Rhodococcus sp. strain MB1. The structural gene encoding a cocaine esterase, designated cocE, was cloned from Rhodococcus sp. strain MB1 genomic libraries by screening recombinant strains of Rhodococcus erythropolis CW25 for growth on cocaine. The nucleotide sequence of cocE corresponded to an open reading frame of 1,724 bp that codes for a protein of 574 amino acids. The amino acid sequence of cocaine esterase has a region of similarity with the active serine consensus of X-prolyl dipeptidyl aminopeptidases, suggesting that the cocaine esterase is a serine esterase. The cocE coding sequence was subcloned into the pCFX1 expression plasmid and expressed in Escherichia coli. The recombinant cocaine esterase was purified to apparent homogeneity and was found to be monomeric, with an Mr of approximately 65,000. The apparent Km of the enzyme (mean ± standard deviation) for cocaine was measured as 1.33 ± 0.085 mM. These findings are of potential use in the development of a linked assay for the detection of illicit cocaine. PMID:10698749

  13. The complete nucleotide sequence and genomic organization of Citrus Leprosis associated Virus, Cytoplasmatic type (CiLV-C).

    PubMed

    Pascon, Renata C; Kitajima, João Paulo; Breton, Michèle C; Assumpção, Laura; Greggio, Christian; Zanca, Almir S; Okura, Vagner Katsumi; Alegria, Marcos C; Camargo, Maria E; Silva, Giovana G C; Cardozo, Jussara C; Vallim, Marcelo A; Franco, Sulamita F; Silva, Vitor H; Jordão, Hamilton; Oliveira, Fernanda; Giachetto, Poliana F; Ferrari, Fernanda; Aguilar-Vildoso, Carlos I; Franchiscini, Fabrício J B; Silva, José M F; Arruda, Paulo; Ferro, Jesus A; Reinach, Fernando; da Silva, Ana Cláudia Rasera

    2006-06-01

    The Citrus leprosis disease (CiL) is associated with a virus (CiLV) transmitted by Brevipalpus spp. mites (Acari: Tenuipalpidae). CiL is endemic in Brazil and its recent spread to Central America represents a threat to the citrus industry in the USA. Electron microscopy images show two forms of CiLV: a rare nuclear form, characterized by rod-shaped naked particles (CiLV-N), and a common cytoplasmic form (CiLV-C) associated with bacilliform enveloped particles and cytoplasmic viroplasm. Due to this morphological feature, CiLV-C has been treated as Rhabdovirus-like. In this paper we present the complete nucleotide sequence and genomic organization of CiLV-C. It is a bipartite virus with sequence similarity to ssRNA positive plant viruses. RNA1 encodes a putative replicase polyprotein and an ORF with no known function. RNA2 encodes 4 ORFs. p15, p24 and p61 have no significant similarity to any known proteins and p32 encodes a protein with similarity to a viral movement protein. The CiLV-C sequences are associated with typical symptoms of CiL by RT-PCR. Phylogenetic analysis suggests that CiLV-C is probably a member of a new family of plant viruses evolutionarily related to Tobamovirus. PMID:16732481

  14. InPhaDel: integrative shotgun and proximity-ligation sequencing to phase deletions with single nucleotide polymorphisms

    PubMed Central

    Patel, Anand; Edge, Peter; Selvaraj, Siddarth; Bansal, Vikas; Bafna, Vineet

    2016-01-01

    Phasing of single nucleotide (SNV), and structural variations into chromosome-wide haplotypes in humans has been challenging, and required either trio sequencing or restricting phasing to population-based haplotypes. Selvaraj et al. demonstrated single individual SNV phasing is possible with proximity ligated (HiC) sequencing. Here, we demonstrate HiC can phase structural variants into phased scaffolds of SNVs. Since HiC data is noisy, and SV calling is challenging, we applied a range of supervised classification techniques, including Support Vector Machines and Random Forest, to phase deletions. Our approach was demonstrated on deletion calls and phasings on the NA12878 human genome. We used three NA12878 chromosomes and simulated chromosomes to train model parameters. The remaining NA12878 chromosomes withheld from training were used to evaluate phasing accuracy. Random Forest had the highest accuracy and correctly phased 86% of the deletions with allele-specific read evidence. Allele-specific read evidence was found for 76% of the deletions. HiC provides significant read evidence for accurately phasing 33% of the deletions. Also, eight of eight top ranked deletions phased by only HiC were validated using long range polymerase chain reaction and Sanger. Thus, deletions from a single individual can be accurately phased using a combination of shotgun and proximity ligation sequencing. InPhaDel software is available at: http://l337x911.github.io/inphadel/. PMID:27105843

  15. Confirming single nucleotide polymorphisms from expressed sequence tag datasets derived from three cattle cDNA libraries.

    PubMed

    Lee, Seung-Hwan; Park, Eung-Woo; Cho, Yong-Min; Lee, Ji-Woong; Kim, Hyoung-Yong; Lee, Jun-Heon; Oh, Sung-Jong; Cheong, Il-Cheong; Yoon, Du-Hak

    2006-03-31

    Using the Phred/Phrap/Polyphred/Consed pipeline established in the National Livestock Research Institute of Korea, we predicted candidate coding single nucleotide polymorphisms (cSNPs) from 7,600 expressed sequence tags (ESTs) derived from three cDNA libraries (liver, M. longissimus dorsi, and intermuscular fat) of Hanwoo (Korean native cattle) steers. From the 7,600 ESTs, 829 contigs comprising more than two EST reads were assembled using the Phrap assembler. Based on the contig analysis, 201 candidate cSNPs were identified in 129 contigs, in which transitions (69%) outnumbered transversions (31%). To verify whether the predicted cSNPs are real, 17 SNPs involved in lipid and energy metabolism were selected from the ESTs. Twelve of these were confirmed to be real while five were identified as artifacts, possibly due to expressed sequence tag sequence error. Further analysis of the 12 verified cSNPs was performed using the program BLASTX. Five were identified as nonsynonymous cSNPs, five were synonymous cSNPs, and two SNPs were located in 3'-UTRs. Our data indicated that a relatively high SNP prediction rate (71%) from a large EST database could produce abundant cSNPs rapidly, which can be used as valuable genetic markers in cattle.
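
    The transition/transversion split reported above can be illustrated with a short sketch. This is not the Phred/Phrap/Polyphred/Consed pipeline: the function name `classify_snp` and the example allele pairs are hypothetical, and the only thing assumed is the standard definition that purine-to-purine or pyrimidine-to-pyrimidine changes are transitions and all other substitutions are transversions.

```python
# Classify a biallelic SNP as a transition or a transversion.
# Hypothetical helper for illustration; not part of the cited pipeline.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref, alt):
    """Return 'transition' or 'transversion' for two different DNA bases."""
    ref, alt = ref.upper(), alt.upper()
    alleles = {ref, alt}
    if ref == alt or not alleles <= (PURINES | PYRIMIDINES):
        raise ValueError("expected two different bases from A, C, G, T")
    # Both alleles in the same chemical class means a transition.
    same_class = alleles <= PURINES or alleles <= PYRIMIDINES
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    # Made-up allele pairs, not data from the study.
    snps = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]
    for ref, alt in snps:
        print(ref, alt, classify_snp(ref, alt))
```

    Tallying the two labels over a list of candidate cSNPs gives percentages of the kind quoted in the abstract (69% versus 31%).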

  16. Unusual sequence effects on nucleotide excision repair of arylamine lesions: DNA bending/distortion as a primary recognition factor

    PubMed Central

    Jain, Vipin; Hilton, Benjamin; Lin, Bin; Patnaik, Satyakam; Liang, Fengting; Darian, Eva; Zou, Yue; MacKerell, Alexander D.; Cho, Bongsup P.

    2013-01-01

    The environmental arylamine mutagens are implicated in the etiology of various sporadic human cancers. Arylamine-modified dG lesions were studied in two fully paired 11-mer duplexes with a -G*CN- sequence context, in which G* is a C8-substituted dG adduct derived from fluorinated analogs of 4-aminobiphenyl (FABP), 2-aminofluorene (FAF) or 2-acetylaminofluorene (FAAF), and N is either dA or dT. The FABP and FAF lesions exist in a simple mixture of ‘stacked’ (S) and ‘B-type’ (B) conformers, whereas the N-acetylated FAAF also samples a ‘wedge’ (W) conformer. FAAF is repaired three to four times more efficiently than FABP and FAF. A simple A- to -T polarity swap in the G*CA/G*CT transition produced a dramatic increase in syn-conformation and resulted in 2- to 3-fold lower nucleotide excision repair (NER) efficiencies in Escherichia coli. These results indicate that lesion-induced DNA bending/thermodynamic destabilization is an important DNA damage recognition factor, more so than the local S/B-conformational heterogeneity that was observed previously for FAF and FAAF in certain sequence contexts. This work represents a novel 3′-next flanking sequence effect as a unique NER factor for bulky arylamine lesions in E. coli. PMID:23180767

  17. Nucleotide sequence of the DNA polymerase gene of herpes simplex virus type 2 and comparison with the type 1 counterpart.

    PubMed

    Tsurumi, T; Maeno, K; Nishiyama, Y

    1987-01-01

    The complete nucleotide sequence of the DNA polymerase gene of herpes simplex virus (HSV) type 2 strain 186 has been determined. The gene included a 3720-bp major open reading frame capable of encoding 1240 amino acids. The predicted primary translation product had an Mr of 137,354, which was slightly larger than its HSV-1 counterpart. A comparison of the predicted functional amino acid sequences of the HSV-1 and HSV-2 DNA polymerases revealed 95.5% overall amino acid homology, the value of which was the highest among those of the other known polypeptides encoded by HSV-1 and HSV-2. The functional amino acid changes were spread in the N-terminal one-third of the protein, whereas the C-terminal two-thirds was almost identical between the two types except for a particular hydrophilic region. A highly conserved sequence of 6 aa, YGDTDS, which has been observed in DNA polymerases of HSV-1, Epstein-Barr virus, adenovirus, and vaccinia virus, was also present at positions 889 to 894 in the C-terminal region of HSV-2 DNA polymerase.
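
    Two pieces of arithmetic in this abstract are easy to get wrong by one: the codon count of the open reading frame and the 1-based, inclusive residue positions of a motif. The sketch below illustrates both. The helper names and the toy protein string are hypothetical; only the motif YGDTDS and the 3720-bp figure come from the abstract, and the count assumes the quoted length excludes the stop codon.

```python
# Codon arithmetic and 1-based motif coordinates (illustrative helpers).


def codon_count(orf_bp):
    """Number of codons in an ORF whose length is a multiple of three."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return orf_bp // 3


def find_motif(protein, motif):
    """Return (start, end) pairs, 1-based and inclusive, for every hit."""
    hits = []
    start = protein.find(motif)
    while start != -1:
        hits.append((start + 1, start + len(motif)))
        start = protein.find(motif, start + 1)
    return hits


if __name__ == "__main__":
    print(codon_count(3720))  # 1240, matching the abstract

    # Toy sequence, not the HSV-2 polymerase.
    toy_protein = "M" + "A" * 10 + "YGDTDS" + "K" * 5
    print(find_motif(toy_protein, "YGDTDS"))  # [(12, 17)]
```

    A six-residue motif reported at positions 889 to 894 follows the same convention: end minus start plus one equals the motif length.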

  18. Escherichia coli purB gene: cloning, nucleotide sequence, and regulation by purR.

    PubMed

    He, B; Smith, J M; Zalkin, H

    1992-01-01

    Escherichia coli purB encodes adenylosuccinate lyase (ASL), the enzyme that catalyzes step 8 in the pathway for de novo synthesis of IMP and also the final reaction in the two-step sequence from IMP to AMP. Gene purB was cloned and found to encode an ASL protein of 435 amino acids having a calculated molecular weight of 49,225. E. coli ASL is homologous to the corresponding enzymes from Bacillus subtilis and chickens and also to fumarase from B. subtilis. Gene phoP is 232 bp downstream of purB. Gene purB is regulated threefold by the purine pool and purR. Transcriptional regulation of purB involves binding of the purine repressor to the 16-bp conserved pur regulon operator. The purB operator is 224 bp downstream of the transcription start site and overlaps codons 62 to 67 in the protein-coding sequence.

  19. Nucleotide sequence of the regulatory locus controlling expression of bacterial genes for bioluminescence.

    PubMed Central

    Engebrecht, J; Silverman, M

    1987-01-01

    Production of light by the marine bacterium Vibrio fischeri and by recombinant hosts containing cloned lux genes is controlled by the density of the culture. Density-dependent regulation of lux gene expression has been shown to require a locus consisting of the luxR and luxI genes and two closely linked divergent promoters. As part of a genetic analysis to understand the regulation of bioluminescence, we have sequenced the region of DNA containing this control circuit. Open reading frames corresponding to luxR and luxI were identified; transcription start sites were defined by S1 nuclease mapping and sequences resembling promoter elements were located. PMID:3697093

  20. Design and synthesis of fluorescence-labeled nucleotide with a cleavable azo linker for DNA sequencing.

    PubMed

    Tan, Lianjiang; Liu, Yazhi; Yang, Qinglai; Li, Xiaowei; Wu, Xin-Yan; Gong, Bing; Shen, Yu-Mei; Shao, Zhifeng

    2016-01-18

    A cleavable azo linker was synthesized and reacted with 5-(6)-carboxytetramethyl rhodamine succinimidyl ester, followed by further reactions with di(N-succinimidyl) carbonate and 5-(3-amino-1-propynyl)-2'-deoxyuridine 5'-triphosphate [dUTP(AP3)] to obtain the terminal product dUTP-azo linker-TAMRA as a potential reversible terminator for DNA sequencing by synthesis with no need for 3'-OH blocking. PMID:26587573

  1. Nucleotide sequence analysis of a candidate gene for ataxia-telangiectasia group D (ATDC)

    SciTech Connect

    Leonhardt, E.A.; Kapp, L.N.; Young, B.R.; Murnane, J.P.

    1994-01-01

    A radioresistant cell clone (1B3) was previously isolated after transfection of an ataxia-telangiectasia (AT) group D cell line with a human cosmid library. A cosmid rescued from the integration site in 1B3 contained human DNA from chromosome position 11q23, the same region shown by both genetic linkage and chromosome transfer to contain the genes for AT complementation groups A/B, C, and D. A gene within the cosmid (ATDC) was found to produce mRNAs of different sizes. A cDNA for one of the most abundant mRNAs (3.0 kb) was isolated from a HeLa cell library. In the present study, the authors sequenced the 3.0-kb cDNA and the surrounding intron DNA in the cosmids. They used polymerase chain reaction, with primers in the introns, to confirm the number of exons and to analyze DNA from AT group D cells for mutations within this gene. Although no mutations were found, they do not rule out the possibility that mutations may be present within the regulatory sequences or coding sequences found in other mRNAs specific for this gene. From the sequence analysis, they found that the ATDC gene product is one of a group of proteins that share multiple zinc finger motifs and an adjacent leucine zipper motif. These proteins have been proposed to form homo- or hetero-dimers involved in nucleic acid binding, consistent with the fact that many of these proteins appear to be transcriptional regulatory factors involved in carcinogenesis and/or differentiation. The likelihood that the ATDC gene product is involved in transcriptional regulation could explain the pleiomorphic characteristics of AT, including abnormal cell cycle regulation. 36 refs., 5 figs., 2 tabs.

  2. Analysis of SCAR marker nucleotide sequences in maize (Zea mays L.) somaclones.

    PubMed

    Osipova, E S; Lysenko, E A; Troitsky, A V; Dolgikh, Yu I; Shamina, Z B; Gostimskii, S A

    2011-02-01

    SCAR (sequence characterized amplified region) markers allow the reliable identification of unique somaclonal variations. Six SCAR markers were developed previously and were thought to be exclusively characteristic of eight maize somaclones. However, we detected two of these markers in maize lines and a cultivar unrelated to the progenitor line of the somaclones. Therefore, we sequenced these markers and performed bioinformatic searches to understand the molecular events that may underlie the variability observed in the somaclones. All changes were found in noncoding sequences and were induced by different molecular events, such as the insertion of long terminal repeat (LTR) transposon(s), precise miniature inverted repeat transposable element (MITE) excision, microdeletion, recombination, and a change in the pool of mitochondrial DNA. For example, the SCAR marker QR is represented by the two variants QR-A and QR-2. The sequences of the two variants were similar, except for a 457-bp fragment found only in QR-A; this region was denoted as Q. Region Q was flanked by the direct 3-bp repeat 5'-TAA-3' (target site duplication; TSD) and the inverted 14-bp repeat 5'-GGGCCTGTTTGGAA-3' (terminal inverted repeats; TIRs). These features confer the Q region with similarity to the nonautonomic Tourist-like MITE. In two groups of independently produced somaclones, the same features (morphological, molecular) were variable, which confirms the theory of 'hot spots' occurring in the genome. The distribution of one of the SCAR markers was confirmed using Southern blot hybridization. The presence of the same molecular markers in the somaclones and in different non-somaclonal maize variants suggests that in some cases, the same mechanisms determine both in vitro and in vivo variability and that cell culture enhances the rate of heritable genomic changes that naturally occur in living organisms. PMID:21421376
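
    The structure described for region Q (a direct 3-bp target site duplication flanking 14-bp terminal inverted repeats) can be checked mechanically. The following is a minimal sketch, assuming perfect repeats; real MITEs often carry mismatches in their TIRs, and the function names and the toy element are hypothetical. Only the TSD (TAA) and the TIR sequence are taken from the abstract.

```python
# Check a candidate insertion for a TSD + TIR signature (illustrative only).

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def has_tsd_and_tir(element, tsd_len=3, tir_len=14):
    """True if the element is bounded by identical direct repeats (TSD)
    and, just inside them, by perfectly inverted terminal repeats (TIR)."""
    if len(element) < 2 * (tsd_len + tir_len):
        return False
    left_tsd = element[:tsd_len]
    right_tsd = element[-tsd_len:]
    left_tir = element[tsd_len:tsd_len + tir_len]
    right_tir = element[-(tsd_len + tir_len):-tsd_len]
    return left_tsd == right_tsd and reverse_complement(left_tir) == right_tir


if __name__ == "__main__":
    tsd = "TAA"
    tir = "GGGCCTGTTTGGAA"
    filler = "ACGT" * 10  # made-up internal sequence
    toy = tsd + tir + filler + reverse_complement(tir) + tsd
    print(reverse_complement(tir))  # TTCCAAACAGGCCC
    print(has_tsd_and_tir(toy))     # True
```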

  3. Finding the right coverage: the impact of coverage and sequence quality on single nucleotide polymorphism genotyping error rates.

    PubMed

    Fountain, Emily D; Pauli, Jonathan N; Reid, Brendan N; Palsbøll, Per J; Peery, M Zachariah

    2016-07-01

    Restriction-enzyme-based sequencing methods enable the genotyping of thousands of single nucleotide polymorphism (SNP) loci in nonmodel organisms. However, in contrast to traditional genetic markers, genotyping error rates in SNPs derived from restriction-enzyme-based methods remain largely unknown. Here, we estimated genotyping error rates in SNPs genotyped with double digest RAD sequencing from Mendelian incompatibilities in known mother-offspring dyads of Hoffman's two-toed sloth (Choloepus hoffmanni) across a range of coverage and sequence quality criteria, for both reference-aligned and de novo-assembled data sets. Genotyping error rates were more sensitive to coverage than sequence quality and low coverage yielded high error rates, particularly in de novo-assembled data sets. For example, coverage ≥5 yielded median genotyping error rates of ≥0.03 and ≥0.11 in reference-aligned and de novo-assembled data sets, respectively. Genotyping error rates declined to ≤0.01 in reference-aligned data sets with a coverage ≥30, but remained ≥0.04 in the de novo-assembled data sets. We observed approximately 10- and 13-fold declines in the number of loci sampled in the reference-aligned and de novo-assembled data sets when coverage was increased from ≥5 to ≥30 at quality score ≥30, respectively. Finally, we assessed the effects of genotyping coverage on a common population genetic application, parentage assignments, and showed that the proportion of incorrectly assigned maternities was relatively high at low coverage. Overall, our results suggest that the trade-off between sample size and genotyping error rates be considered prior to building sequencing libraries, reporting genotyping error rates become standard practice, and that effects of genotyping errors on inference be evaluated in restriction-enzyme-based SNP studies.
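
    The idea of scoring Mendelian incompatibilities in mother-offspring dyads can be sketched as follows. This is an assumption-laden illustration, not the authors' estimator: genotypes are written as two-letter strings, the helper names are hypothetical, and a dyad is flagged only when mother and offspring share no allele (opposite homozygotes), so the resulting proportion is a lower bound on the true error rate.

```python
# Flag mother-offspring genotype pairs that cannot arise by Mendelian
# inheritance at a biallelic SNP (illustrative helpers, made-up data).


def is_incompatible(mother, offspring):
    """True when mother and offspring share no allele, e.g. 'AA' vs 'GG'."""
    return not (set(mother) & set(offspring))


def incompatibility_rate(dyads):
    """Proportion of (mother, offspring) genotype pairs that are incompatible.
    Pairs with a missing genotype (None) are skipped."""
    scored = [(m, o) for m, o in dyads if m is not None and o is not None]
    if not scored:
        raise ValueError("no dyads with both genotypes called")
    bad = sum(is_incompatible(m, o) for m, o in scored)
    return bad / len(scored)


if __name__ == "__main__":
    # One entry per locus for a single dyad; values are invented.
    dyads = [("AA", "AG"), ("AA", "GG"), ("AG", "GG"), ("GG", "GG"), (None, "AG")]
    print(incompatibility_rate(dyads))  # 0.25
```

    Filtering the input by read depth before scoring is how one would explore the coverage thresholds discussed in the abstract.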

  4. A single origin and moderate bottleneck during domestication of soybean (Glycine max): implications from microsatellites and nucleotide sequences.

    PubMed

    Guo, Juan; Wang, Yunsheng; Song, Chi; Zhou, Jianfeng; Qiu, Lijuan; Huang, Hongwen; Wang, Ying

    2010-09-01

    Background and Aims: It is essential to illuminate the evolutionary history of crop domestication in order to understand further the origin and development of modern cultivation and agronomy; however, despite being one of the most important crops, the domestication origin and bottleneck of soybean (Glycine max) are poorly understood. In the present study, microsatellites and nucleotide sequences were employed to elucidate the domestication genetics of soybean. Methods: The genomes of 79 landrace soybeans (endemic cultivated soybeans) and 231 wild soybeans (G. soja) that represented the species-wide distribution of wild soybean in East Asia were scanned with 56 microsatellites to identify the genetic structure and domestication origin of soybean. To understand better the domestication bottleneck, four nucleotide sequences were selected to simulate the domestication bottleneck. Key Results: Model-based analysis revealed that most of the landrace genotypes were assigned to the inferred wild soybean cluster of south China, South Korea and Japan. Phylogeny for wild and landrace soybeans showed that all landrace soybeans formed a single cluster supporting a monophyletic origin of all the cultivars. The populations of the nearest branches which were basal to the cultivar lineage were wild soybeans from south China. The coalescent simulation detected a bottleneck severity of K' = 2 during soybean domestication, which could be explained by a foundation population of 6000 individuals if domestication duration lasted 3000 years. Conclusions: As a result of integrating geographic distribution with microsatellite genotype assignment and phylogeny between landrace and wild soybeans, a single origin of soybean in south China is proposed. The coalescent simulation revealed a moderate genetic bottleneck with an effective wild soybean population used for domestication estimated to be approximately 2 % of the total number of ancestral wild soybeans. Wild soybeans in Asia, especially in

  5. Complete nucleotide sequence and organization of the mitochondrial genome of Sirthenea flavipes (Hemiptera: Reduviidae: Peiratinae) and comparison with other assassin bugs.

    PubMed

    Gao, Jianyu; Li, Hu; Truong, Xuan Lam; Dai, Xun; Chang, Jian; Cai, Wanzhi

    2013-01-01

    The complete sequence of the mitochondrial (mt) genome of the assassin bug, Sirthenea flavipes (Stål), was determined. The circular genome is 15,961 bp long and contains a standard gene complement, i.e., the large and small ribosomal RNA (rRNA) subunits, 22 transfer RNA (tRNA) genes, 13 protein-coding genes (PCGs), and the 1,295 bp control region. The nucleotide composition of the S. flavipes mt genome is 71.8% AT-rich, reflected in the predominance of AT-rich codons in PCGs. Compared with the other three reduviid species for which complete mt genomes are available, the genome architecture as well as the nucleotide composition, codon usage, and amino acid composition reflected high similarity. All PCGs use standard initiation codons (ATN) except ND4L and ND1, which start with GTG. Canonical TAA and TAG termination codons are found in nine PCGs; the remaining four (COIII, ND3, ND5, and ND1) have incomplete termination codons. All tRNAs have the typical clover-leaf structure, except that the dihydrouridine (DHU) arm of tRNASer(AGN) forms a simple loop as seen in many other metazoans. Secondary structure models of the ribosomal RNA genes of S. flavipes are presented and are similar to those proposed for other insects. The structure of rrnL is more conservative than that of rrnS among sequenced assassin bugs. The monophyly of Reduviidae is highly supported by Bayesian inferences, and the Peiratinae presents a sister position to the Triatominae + (Salyavatinae + Harpactorinae). PMID:26312315
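
    Two of the summary statistics in this abstract, AT content and the ATN start-codon rule, reduce to a few lines of code. The sketch below is illustrative only: the function names are hypothetical and the test sequence is invented rather than taken from the S. flavipes genome.

```python
# AT content and ATN start-codon check (illustrative helpers).


def at_content(seq):
    """Fraction of A and T among unambiguous bases (A, C, G, T)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(b in "AT" for b in bases) / len(bases)


def is_atn_start(codon):
    """True for ATA, ATC, ATG or ATT, the 'ATN' initiation codons."""
    codon = codon.upper()
    return len(codon) == 3 and codon[:2] == "AT" and codon[2] in "ACGT"


if __name__ == "__main__":
    print(at_content("ATGCATAT"))  # 0.75
    print(is_atn_start("ATG"))     # True
    print(is_atn_start("GTG"))     # False: a non-ATN start, like the GTG starts noted above
```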

  6. Inferring Multiple Refugia and Phylogeographical Patterns in Pinus massoniana Based on Nucleotide Sequence Variation and DNA Fingerprinting

    PubMed Central

    Lin, Chung-Jian; Huang, Chi-Chung; Huang, Chao-Ching; Chiang, Yu-Chung; Chiang, Tzen-Yuh

    2012-01-01

    Background: Pinus massoniana, an ecologically and economically important conifer, is widespread across central and southern mainland China and Taiwan. In this study, we tested the central–marginal paradigm that predicts that the marginal populations tend to be less polymorphic than the central ones in their genetic composition, and examined a founders' effect in the island population. Methodology/Principal Findings: We examined the phylogeography and population structuring of P. massoniana based on nucleotide sequences of cpDNA atpB-rbcL intergenic spacer, intron regions of the AdhC2 locus, and microsatellite fingerprints. SAMOVA analysis of nucleotide sequences indicated that most genetic variants resided among geographical regions. High levels of genetic diversity in the marginal populations in the south region, a pattern seemingly contradicting the central–marginal paradigm, and the fixation of private haplotypes in most populations indicate that multiple refugia may have existed over the glacial maxima. STRUCTURE analyses on microsatellites revealed that genetic structure of mainland populations was mediated with recent genetic exchanges mostly via pollen flow, and that the genetic composition in east region was intermixed between south and west regions, a pattern likely shaped by gene introgression and maintenance of ancestral polymorphisms. As expected, the small island population in Taiwan was genetically differentiated from mainland populations. Conclusions/Significance: The marginal populations in south region possessed divergent gene pools, suggesting that the past glaciations might have low impacts on these populations at low latitudes. Estimates of ancestral population sizes interestingly reflect a recent expansion in mainland from a rather smaller population, a pattern that seemingly agrees with the pollen record. PMID:22952747

  7. The influence of nucleotide sequence and temperature on the activity of thermostable DNA polymerases.

    PubMed

    Montgomery, Jesse L; Rejali, Nick; Wittwer, Carl T

    2014-05-01

    Extension rates of a thermostable, deletion-mutant polymerase were measured from 50°C to 90°C using a fluorescence activity assay adapted for real-time PCR instruments. Substrates with a common hairpin (6-base loop and a 14-bp stem) were synthesized with different 10-base homopolymer tails. Rates for A, C, G, T, and 7-deaza-G incorporation at 75°C were 81, 150, 214, 46, and 120 s⁻¹. Rates for U were half as fast as T and did not increase with increasing concentration. Hairpin substrates with 25-base tails from 0% to 100% GC content had maximal extension rates near 60% GC and were predicted from the template sequence and mononucleotide incorporation rates to within 30% for most sequences. Addition of dimethyl sulfoxide at 7.5% increased rates to within 1% to 17% of prediction for templates with 40% to 90% GC. When secondary structure was designed into the template region, extension rates decreased. Oligonucleotide probes reduced extension rates by 65% (5'-3' exo-) and 70% (5'-3' exo+). When using a separate primer and a linear template to form a polymerase substrate, rates were dependent on both the primer melting temperature (Tm) and the annealing/extension temperature. Maximum rates were observed from Tm to Tm - 5°C with little extension by Tm + 5°C. Defining the influence of sequence and temperature on polymerase extension will enable more rapid and efficient PCR. PMID:24607271
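
    The abstract does not state how extension rates were predicted from mononucleotide incorporation rates. One simple possibility, shown here purely as an assumption, is an additive-time model: each incorporation takes 1/k seconds, where k is the homopolymer rate for that nucleotide, so the overall rate is the number of bases divided by the summed step times. The function name is hypothetical; the four rates are the 75°C values quoted above.

```python
# Additive-time sketch of extension rate from per-nucleotide rates.
# ASSUMPTION: this model is for illustration and may differ from the
# prediction method used in the cited study.

RATES_PER_SECOND = {"A": 81.0, "C": 150.0, "G": 214.0, "T": 46.0}


def predicted_rate(incorporated):
    """Average nucleotides per second for the newly synthesized strand,
    treating each incorporation as an independent step of 1/k seconds."""
    incorporated = incorporated.upper()
    if not incorporated:
        raise ValueError("empty sequence")
    total_time = sum(1.0 / RATES_PER_SECOND[base] for base in incorporated)
    return len(incorporated) / total_time


if __name__ == "__main__":
    # A homopolymer reproduces its own measured rate.
    print(round(predicted_rate("A" * 10), 6))
    # A mixed sequence lands between the slowest and fastest single rates.
    print(predicted_rate("ACGTACGTAC"))
```

    Under this model the slowest nucleotide dominates the total time, so the prediction is a harmonic rather than arithmetic mean of the individual rates. Because it ignores sequence context and secondary structure, it cannot by itself reproduce effects such as the observed maximum near 60% GC.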

  8. Statistical tests to identify appropriate types of nucleotide sequence recoding in molecular phylogenetics

    PubMed Central

    2014-01-01

    Background: Under a Markov model of evolution, recoding, or lumping, of the four nucleotides into fewer groups may permit analysis under simpler conditions but may unfortunately yield misleading results unless the evolutionary process of the recoded groups remains Markovian. If a Markov process is lumpable, then the evolutionary process of the recoded groups is Markovian. Results: We consider stationary, reversible, and homogeneous Markov processes on two taxa and compare three tests for lumpability: one using an ad hoc test statistic, which is based on an index that is evaluated using a bootstrap approximation of its distribution; one that is based on a test proposed specifically for Markov chains; and one using a likelihood-ratio test. We show that the likelihood-ratio test is more powerful than the index test, which is more powerful than that based on the Markov chain test statistic. We also show that for stationary processes on binary trees with more than two taxa, the tests can be applied to all pairs. Finally, we show that if the process is lumpable, then estimates obtained under the recoded model agree with estimates obtained under the original model, whereas, if the process is not lumpable, then these estimates can differ substantially. We apply the new likelihood-ratio test for lumpability to two primate data sets, one with a mitochondrial origin and one with a nuclear origin. Conclusions: Recoding may result in biased phylogenetic estimates because the original evolutionary process is not lumpable. Accordingly, testing for lumpability should be done prior to phylogenetic analysis of recoded data. PMID:24564837
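
    The lumpability condition itself (as opposed to the statistical tests compared in the abstract) is easy to state for a known transition matrix: a chain is lumpable with respect to a partition if, for every pair of groups, all states in the first group have the same total probability of moving into the second. The sketch below checks that condition for purine/pyrimidine (RY) recoding. The function name and both matrices are invented for illustration, and a discrete-time matrix is used for simplicity.

```python
# Check strong lumpability of a Markov transition matrix for a partition.
# Matrices are made up for illustration; this is not a statistical test.


def is_lumpable(matrix, states, partition, tol=1e-9):
    """True if every state in a group has the same summed transition
    probability into each group of the partition."""
    index = {state: i for i, state in enumerate(states)}
    for source_group in partition:
        for target_group in partition:
            sums = [
                sum(matrix[index[s]][index[t]] for t in target_group)
                for s in source_group
            ]
            if max(sums) - min(sums) > tol:
                return False
    return True


STATES = ["A", "C", "G", "T"]
RY = [("A", "G"), ("C", "T")]  # purines versus pyrimidines

# Equal transversion probabilities from every base: lumpable under RY.
K2P_LIKE = [
    [0.70, 0.05, 0.20, 0.05],
    [0.05, 0.70, 0.05, 0.20],
    [0.20, 0.05, 0.70, 0.05],
    [0.05, 0.20, 0.05, 0.70],
]

# A leaves the purine group more often than G does: not lumpable under RY.
UNEQUAL = [
    [0.65, 0.10, 0.20, 0.05],
    [0.05, 0.70, 0.05, 0.20],
    [0.20, 0.05, 0.70, 0.05],
    [0.05, 0.20, 0.05, 0.70],
]

if __name__ == "__main__":
    print(is_lumpable(K2P_LIKE, STATES, RY))  # True
    print(is_lumpable(UNEQUAL, STATES, RY))   # False
```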

  9. High nucleotide and amino acid sequence similarities in tumour necrosis factor-alpha amongst Indian buffalo (Bubalus bubalis), Indian cattle (Bos indicus) and other ruminants.

    PubMed

    Gupta, P K; Bind, R B; Walunj, S S; Saini, M

    2004-08-01

    Tumour necrosis factor-alpha (TNF-alpha) mRNA from Indian water buffalo (Bubalus bubalis) and Indian cattle (Bos indicus) was reverse transcribed and amplified using reverse transcriptase-polymerase chain reaction (RT-PCR). The nucleotide sequences of cDNAs were determined after cloning into pGEM-T-Easy vector (Promega, Madison, WI) and compared with reported nucleotide sequences of TNF-alpha cDNA from other species. The nucleotide sequences of TNF-alpha from Indian cattle revealed significantly high similarities at nucleotide (99.2%) and amino acid (100%) levels with those of cattle (Bos taurus; Zebu). The sequences from buffalo had 98.4% nucleotide and 99.1% amino acid similarities with Indian cattle, indicating functional cross-reactivity. One amino acid deletion at position 63 and one substitution (A→P) at position 64 were observed in buffalo compared with Indian cattle. The amino acid deletion at position 63 was predicted to be due to differences in pre-mRNA splicing.

  10. Cloning and nucleotide sequence of luxR, a regulatory gene controlling bioluminescence in Vibrio harveyi.

    PubMed Central

    Showalter, R E; Martin, M O; Silverman, M R

    1990-01-01

    Mutagenesis with transposon mini-Mulac was used previously to identify a regulatory locus necessary for expression of bioluminescence genes, lux, in Vibrio harveyi (M. Martin, R. Showalter, and M. Silverman, J. Bacteriol. 171:2406-2414, 1989). Mutants with transposon insertions in this regulatory locus were used to construct a hybridization probe which was used in this study to detect recombinants in a cosmid library containing the homologous DNA. Recombinant cosmids with this DNA stimulated expression of the genes encoding enzymes for luminescence, i.e., the luxCDABE operon, which were positioned in trans on a compatible replicon in Escherichia coli. Transposon mutagenesis and analysis of the DNA sequence of the cloned DNA indicated that regulatory function resided in a single gene of about 0.6 kilobases named luxR. Expression of bioluminescence in V. harveyi and in the fish light-organ symbiont Vibrio fischeri is controlled by density-sensing mechanisms involving the accumulation of small signal molecules called autoinducers, but similarity of the two luminescence systems at the molecular level was not apparent in this study. The amino acid sequence of the LuxR product of V. harveyi, which indicates a structural relationship to some DNA-binding proteins, is not similar to the sequence of the protein that regulates expression of luminescence in V. fischeri. In addition, reconstitution of autoinducer-controlled luminescence in recombinant E. coli, already achieved with lux genes cloned from V. fischeri, was not accomplished with the isolation of luxR from V. harveyi, suggesting a requirement for an additional regulatory component. PMID:2160932

  11. Nucleotide sequences of three distinct clones coding for rat heavy chain class 1 major histocompatibility antigens

    SciTech Connect

    Wang, M.; Stepkowski, S.M.; Tain, L.

    1996-09-01

    Poly(A)+ RNAs were isolated from Concanavalin A-stimulated splenocytes of BUF (RT1.A^b), PVG (RT1.A^c), or PVG.1U (RT1.A^u) rats using a Micro-Fast Track kit. After reverse transcription with a synthetic oligo-d(T) primer (5'-CAT GAT CGA ATT CAC GCG TCT AGA TTT TTT TTT TTT TTT TTT TTT TTT TVN-3', V = A+G+C, N = A+T+G+C; Genosys, Woodland, TX), 1.6 kilobase products, which encode the entire MHC class I protein and the 3' non-translated region including the poly-A tail, were amplified by polymerase chain reaction (PCR) using two synthetic oligonucleotide primers (Genosys). The upstream primer (5'-GTC CGG GWT CTC AGA TGG GG C-3', W = A+T) was designed based upon the published rat class I sequences of eight genes: RT1.1^a M31018; rat LW2 gene X70066; RT1.1^1, L26224 X79719; RT1.A^u X82669, and RT1.Aw3 L40363, RT1.E^u L40365, RT1.C^1 L40362. The downstream primer (5'-ATG ATC GAA TTC ACG CGT CTA GA-3') was the portion of the oligo-d(T) primer used for reverse transcription. The purified PCR products were inserted into pCR II cloning vectors (Invitrogen). Automated sequencing results for plasmid cDNAs from the positive clones, which were identified by restriction enzyme mapping and obtained from three repeated PCR amplifications, were reproducible. The new sequences of the heavy chain class I genes are compared with those available in GenBank. 7 refs., 1 fig.
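
    The degenerate positions defined in the abstract (V = A+G+C, N = A+T+G+C, W = A+T) mean that each primer is really a small pool of sequences. The sketch below expands such a primer into its concrete variants. It is an illustration only: the function name is hypothetical, and the lookup table covers just the codes used in this record rather than the full IUPAC set.

```python
from itertools import product

# Degenerate codes as defined in the abstract, plus the four plain bases.
CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "AT",    # W = A+T
    "V": "ACG",   # V = A+G+C
    "N": "ACGT",  # N = A+T+G+C
}


def expand_primer(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    seq = primer.replace(" ", "").upper()
    choices = [CODES[base] for base in seq]
    return ["".join(combo) for combo in product(*choices)]


# Upstream primer from the abstract: one W position gives two variants.
upstream = "GTC CGG GWT CTC AGA TGG GG C"
for variant in expand_primer(upstream):
    print(variant)

# The 3' anchor of the oligo-d(T) primer (TVN) encodes 3 x 4 sequences.
print(len(expand_primer("TVN")))
```

    The number of variants is the product of the number of bases allowed at each position, which is why a single W doubles the pool and the VN anchor multiplies it by twelve.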

  12. Short nucleotide sequences in herpesviral genomes identical to the human DNA.

    PubMed

    Filatov, Felix; Shargunov, Alexander

    2015-05-01

    In 2010, we described many similar DNA sequences in human and viral genomes, including herpesviral ones. The data obtained allowed us to suggest that these motifs may provide antiviral protection by pairing with a complementary potential target and destroying it catalytically, like small interfering RNA (siRNA). Since we had analyzed these viruses as a group, two issues remained of interest: (1) the number of such motifs in the genomes of various herpesvirus types, and (2) the distribution of these motifs within an individual viral genome. Here we searched only the herpesviral genomes for short (>20 nt) continuous sequences (hits) that are completely identical to sequences of human DNA. We found that different viral genes, and the genomes of different herpesviruses, contain different numbers of such hits. Assuming, as in our previous paper, that the density of these hits in viral genes is associated with the probability of being targets for cellular siRNA, we consider the genomic distribution of this density as a hypothetical targetome map of the human herpesviruses. We combined all nine types of herpesviruses into three groups according to the hit concentration in their genomes and found that the resulting sequence corresponds to the type of cellular pathology caused by a virus. We do not assert now that this trend also relates to other human viruses or other viruses in general. As GenBank continues to fill, it would be highly advisable to conduct further relevant research. We also suggest that the high hit concentration we found in the gene RL1 (ICP34.5) of herpes simplex virus type 1 (HSV1) can make this gene a likely target for putative cellular endogenous siRNA. Artificial blockade of the gene RL1 confers oncolytic properties on HSV1, and we do not exclude the possibility that the part of the HSV1 population in humans with RL1 blocked in vivo may participate in early anti-cancer protection during reactivation of the virus from the latent state. PMID:25728788
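
    One simple way to look for exact matches of a fixed length between two sequences is to index every k-mer of one sequence and scan the other. The sketch below does this for a single strand; it is not the authors' pipeline, the function name and toy sequences are hypothetical, and k = 21 is chosen only because the abstract specifies hits longer than 20 nt.

```python
def shared_kmer_positions(virus, human, k=21):
    """Return start positions in `virus` of k-mers that also occur in `human`.

    Forward strand only; a real search would also consider the reverse
    complement and would need an index that scales to a whole genome.
    """
    human_kmers = {human[i:i + k] for i in range(len(human) - k + 1)}
    return [i for i in range(len(virus) - k + 1)
            if virus[i:i + k] in human_kmers]


# Toy sequences with a short k so the example stays readable.
virus = "AAACGTACGTTT"
human = "GGCGTACGGG"
print(shared_kmer_positions(virus, human, k=6))
```

    Counting the returned positions per gene and dividing by gene length would give a hit density of the kind the abstract describes, although consecutive matching k-mers do not by themselves prove a longer contiguous match.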

  13. Molecular cloning, nucleotide sequence and expression of a Sulfolobus solfataricus gene encoding a class II fumarase.

    PubMed

    Colombo, S; Grisa, M; Tortora, P; Vanoni, M

    1994-01-01

    Fumarase catalyzes the interconversion of L-malate and fumarate. A Sulfolobus solfataricus fumarase gene (fumC) was cloned and sequenced. Typical archaebacterial regulatory sites were identified in the region flanking the fumC open reading frame. The fumC gene encodes a protein of 438 amino acids (47,899 Da) which shows several significant similarities with class II fumarases from both eubacterial and eukaryotic sources as well as with aspartases. S. solfataricus fumarase expressed in Escherichia coli retains enzymatic activity and its thermostability is comparable to that of S. solfataricus purified enzyme despite an 11-amino-acid C-terminal deletion.

  14. Complete nucleotide sequences of two new begomoviruses infecting the wild malvaceous plant Melochia sp. in Brazil.

    PubMed

    Fiallo-Olivé, Elvira; Zerbini, F Murilo; Navas-Castillo, Jesús

    2015-12-01

    Wild malvaceous plants are hosts for a large number of begomoviruses (genus Begomovirus, family Geminiviridae) in both the Old World and the New World. Here, we report the complete genome sequences of two new begomoviruses from Melochia sp. plants from Brazil. The cloned bipartite genomes, composed of DNA-A and DNA-B, showed the typical organization of the New World begomoviruses but they were distantly related to the genomes of other begomoviruses. We propose the names Melochia mosaic virus and Melochia yellow mosaic virus for these begomoviruses.

  15. A genome walking strategy for the identification of nucleotide sequences adjacent to known regions.

    PubMed

    Wang, Hailong; Yao, Ting; Cai, Mei; Xiao, Xiuqing; Ding, Xuezhi; Xia, Liqiu

    2013-02-01

    To identify the transposon insertion sites in a soil actinomycete, Saccharopolyspora spinosa, a genome walking approach, termed SPTA-PCR, was developed. In SPTA-PCR, a simple procedure consisting of TA cloning and a high stringency PCR, following the single primer-mediated, randomly-primed PCR, can eliminate non-target DNA fragments and obtain target fragments specifically. Using SPTA-PCR, the DNA sequence adjacent to the highly conserved region of lectin coding gene in onion plant, Allium chinense, was also cloned. PMID:23108875

  16. Species-wide genome sequence and nucleotide polymorphisms from the model allopolyploid plant Brassica napus

    PubMed Central

    Schmutzer, Thomas; Samans, Birgit; Dyrszka, Emmanuelle; Ulpinnis, Chris; Weise, Stephan; Stengel, Doreen; Colmsee, Christian; Lespinasse, Denis; Micic, Zeljko; Abel, Stefan; Duchscherer, Peter; Breuer, Frank; Abbadi, Amine; Leckband, Gunhild; Snowdon, Rod; Scholz, Uwe

    2015-01-01

    Brassica napus (oilseed rape, canola) is one of the world’s most important sources of vegetable oil for human nutrition and biofuel, and also a model species for studies investigating the evolutionary consequences of polyploidisation. Strong bottlenecks during its recent origin from interspecific hybridisation, and subsequently through intensive artificial selection, have severely depleted the genetic diversity available for breeding. On the other hand, high-throughput genome profiling technologies today provide unprecedented scope to identify, characterise and utilise genetic diversity in primary and secondary crop gene pools. Such methods also enable implementation of genomic selection strategies to accelerate breeding progress. The key prerequisite is availability of high-quality sequence data and identification of high-quality, genome-wide sequence polymorphisms representing relevant gene pools. We present comprehensive genome resequencing data from a panel of 52 highly diverse natural and synthetic B. napus accessions, along with a stringently selected panel of 4.3 million high-confidence, genome-wide SNPs. The data is of great interest for genomics-assisted breeding and for evolutionary studies on the origins and consequences in allopolyploidisation in plants. PMID:26647166

  17. The structure and complete nucleotide sequence of the human cyclophilin 40 (PPID) gene

    SciTech Connect

    Yokoi, Haruhiko; Shimizu, Yukiko; Anazawa, Hideharu

    1996-08-01

    Cyclophilin 40 is a recently identified member of the cyclophilin family that is found in an unactivated steroid hormone receptor complex. Cyclophilin 40 possesses a region homologous to FKBP59, a member of the FK506-binding protein family that is also a component of the receptor complex. We report the isolation and sequencing of the entire human cyclophilin 40 (hCyP40) gene (human gene symbol PPID). The gene contains 10 exons (43 to 698 bp) and 9 introns encompassing 14.2 kb. The exon organization of the cyclophilin-like region is not similar to that of the human cyclophilin A gene (PPIA), suggesting their early divergence in evolution. Determination of the sequence of the 5' end of the hCyP40 mRNA by an "anchor-ligation PCR" procedure showed that transcription is initiated from a cluster of sites about 80 bp upstream from the first in-frame ATG. The immediate 5'-flanking region of the gene lacks typical TATA and CAAT boxes, but is GC-rich and contains Sp1 sites, features characteristic of promoters associated with housekeeping genes. The hCyP40 gene was mapped to chromosome 4 by PCR with genomic DNA from somatic cell hybrids. As shown by "Zoo blot" analysis, the cyclophilin 40 gene appears to be highly conserved throughout evolution. 47 refs., 4 figs., 1 tab.

  18. Species-wide genome sequence and nucleotide polymorphisms from the model allopolyploid plant Brassica napus.

    PubMed

    Schmutzer, Thomas; Samans, Birgit; Dyrszka, Emmanuelle; Ulpinnis, Chris; Weise, Stephan; Stengel, Doreen; Colmsee, Christian; Lespinasse, Denis; Micic, Zeljko; Abel, Stefan; Duchscherer, Peter; Breuer, Frank; Abbadi, Amine; Leckband, Gunhild; Snowdon, Rod; Scholz, Uwe

    2015-01-01

    Brassica napus (oilseed rape, canola) is one of the world's most important sources of vegetable oil for human nutrition and biofuel, and also a model species for studies investigating the evolutionary consequences of polyploidisation. Strong bottlenecks during its recent origin from interspecific hybridisation, and subsequently through intensive artificial selection, have severely depleted the genetic diversity available for breeding. On the other hand, high-throughput genome profiling technologies today provide unprecedented scope to identify, characterise and utilise genetic diversity in primary and secondary crop gene pools. Such methods also enable implementation of genomic selection strategies to accelerate breeding progress. The key prerequisite is availability of high-quality sequence data and identification of high-quality, genome-wide sequence polymorphisms representing relevant gene pools. We present comprehensive genome resequencing data from a panel of 52 highly diverse natural and synthetic B. napus accessions, along with a stringently selected panel of 4.3 million high-confidence, genome-wide SNPs. The data is of great interest for genomics-assisted breeding and for evolutionary studies on the origins and consequences in allopolyploidisation in plants. PMID:26647166

  19. Characterization of Mapuera virus: structure, proteins and nucleotide sequence of the gene encoding the nucleocapsid protein.

    PubMed

    Henderson, G W; Laird, C; Dermott, E; Rima, B K

    1995-10-01

    The molecular biology of Mapuera virus was studied at both the protein and nucleic acid levels. Seven virus-encoded proteins were detected in infected Vero cells. The sizes and characteristics of each of the proteins determined from various radiolabelling experiments allowed preliminary identification of the proteins as the large (L; 190 kDa), haemagglutinin neuraminidase (HN; 74 kDa), nucleocapsid (N; 66 kDa), fusion (F0; 63 kDa), phosphoprotein (P; 49 kDa), matrix (M; 43 kDa) and non-structural (V; 35 kDa) proteins. Western blot analysis showed that the HN, N and P proteins were major antigens recognized in the mouse. A cDNA library of total virus-infected cellular mRNA was created and screening of the library resulted in the detection of cDNA sequences representing the N mRNA transcript of Mapuera virus. The N mRNA sequence determined from the clones was 1731 nt in length and contained an ORF that encoded 537 amino acids, the complete 3' untranslated region and part of the 5' non-coding region. The calculated M(r) of the N protein was 59 kDa, which is close to the 66 kDa protein observed by SDS-PAGE. PMID:7595354
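
    The figures in this abstract can be cross-checked with simple arithmetic: a 537-residue ORF needs 3 x 537 coding nucleotides plus a stop codon, and a rough protein mass follows from an average residue mass. The sketch below assumes the common rule-of-thumb value of about 110 Da per residue, which is an approximation and not a number taken from the abstract; the function names are hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average; an assumption, not from the source


def orf_length_nt(n_residues, include_stop=True):
    """Nucleotides needed to encode n_residues, optionally with a stop codon."""
    return 3 * n_residues + (3 if include_stop else 0)


def approx_mass_kda(n_residues):
    """Rough protein mass in kDa from the residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


n_aa = 537        # ORF length reported for the N protein
mrna_nt = 1731    # length of the sequenced N mRNA

coding_nt = orf_length_nt(n_aa)
print(coding_nt)                      # coding region including the stop codon
print(mrna_nt - coding_nt)            # nucleotides left for untranslated regions
print(round(approx_mass_kda(n_aa)))   # rough mass in kDa
```

    With these assumptions the ORF fits within the 1731 nt transcript, and the rough mass is in line with the calculated M(r) of 59 kDa quoted in the abstract.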

  20. Nucleotide sequence and expression in Escherichia coli of the Pseudomonas aeruginosa lasA gene.

    PubMed Central

    Schad, P A; Iglewski, B H

    1988-01-01

    Pseudomonas aeruginosa PAO-E64 is a mutant which produces parental levels of elastase antigen but has no elastolytic activity at 37 degrees C. The lesion (lasA1) in PAO-E64 is not a mutation in the structural gene for P. aeruginosa elastase (P.A. Schad, R.A. Bever, T.I. Nicas, F. Leduce, L.F. Hanne, and B.H. Iglewski, J. Bacteriol. 169: 2691-2696, 1987). A 1.7-kilobase segment of DNA that complements the lasA1 lesion was sequenced. Computer analysis of the DNA sequence showed that it contained an open reading frame which encoded a 41,111-dalton protein. The lasA gene was expressed under an inducible PT-7 promoter, and a 40,000-dalton protein was detected in Escherichia coli lysates. The lasA protein was localized in the outer membrane fraction of E. coli. This lasA protein produced in E. coli activated the extracellular elastase produced by the P. aeruginosa mutant, PAO-E64. PMID:2836371

  1. Identification and nucleotide sequence of the thymidine kinase gene of Shope fibroma virus

    SciTech Connect

    Upton, C.; McFadden, G.

    1986-12-01

    The thymidine kinase (TK) gene of Shope fibroma virus (SFV), a tumorigenic leporipoxvirus, was localized within the viral genome with degenerate oligonucleotide probes. These probes were constructed to two regions of high sequence conservation between the vaccinia virus TK gene and those of several known eucaryotic cellular TK genes, including human, mouse, hamster, and chicken TK genes. The oligonucleotide probes initially localized the SFV TK gene 50 kilobases (kb) from the right terminus of the 160-kb SFV genome within the 9.5-kb BamHI-HindIII fragment E. Fine-mapping analysis indicated that the TK gene was within a 1.2-kb AvaI-HaeIII fragment, and DNA sequencing of this region revealed an open reading frame capable of encoding a polypeptide of 187 amino acids possessing considerable homology to the TK genes of the vaccinia, variola, and monkeypox orthopoxviruses and also to a variety of cellular TK genes. Homology matrix analysis and homology scores suggest that the SFV TK gene has diverged significantly from its counterpart members in the orthopoxvirus genus. Nevertheless, the presence of conserved upstream open reading frames on the 5' side of all of the poxvirus TK genes indicates a similarity of functional organization between the orthopoxviruses and leporipoxviruses. These data suggest a common ancestral origin for at least some of the unique internal regions of the leporipoxviruses and orthopoxviruses as exemplified by SFV and vaccinia virus, respectively.

  2. The structure of the yeast ribosomal RNA genes. I. The complete nucleotide sequence of the 18S ribosomal RNA gene from Saccharomyces cerevisiae.

    PubMed

    Rubtsov, P M; Musakhanov, M M; Zakharyev, V M; Krayev, A S; Skryabin, K G; Bayev, A A

    1980-12-11

    The cloned 18 S ribosomal RNA gene from Saccharomyces cerevisiae has been sequenced, using the Maxam-Gilbert procedure. From these data the complete sequence of 1789 nucleotides of the 18 S RNA was deduced. Extensive homology with many eucaryotic as well as E. coli ribosomal small subunit rRNA (S-rRNA) has been observed in the 3'-end region of the rRNA molecule. Comparison of the yeast 18 S rRNA sequences with partial sequence data available for rRNAs of the other eucaryotes provides strong evidence that a substantial portion of the 18 S RNA sequence has been conserved in evolution.

  3. Nucleotide sequence analysis establishes the role of endogenous murine leukemia virus DNA segments in formation of recombinant mink cell focus-forming murine leukemia viruses.

    PubMed Central

    Khan, A S

    1984-01-01

    The sequence of 363 nucleotides near the 3' end of the pol gene and 564 nucleotides from the 5' terminus of the env gene in an endogenous murine leukemia viral (MuLV) DNA segment, cloned from AKR/J mouse DNA and designated as A-12, was obtained. For comparison, the nucleotide sequence in an analogous portion of AKR mink cell focus-forming (MCF) 247 MuLV provirus was also determined. Sequence features unique to MCF247 MuLV DNA in the 3' pol and 5' env regions were identified by comparison with nucleotide sequences in analogous regions of NFS-Th-1 xenotropic and AKR ecotropic MuLV proviruses. These included (i) an insertion of 12 base pairs encoding four amino acids located 60 base pairs from the 3' terminus of the pol gene and immediately preceding the env gene, (ii) the deletion of 12 base pairs (encoding four amino acids) and the insertion of 3 base pairs (encoding one amino acid) in the 5' portion of the env gene, and (iii) single base substitutions resulting in 2 MCF247-specific amino acids in the 3' pol and 23 in the 5' env regions. Nucleotide sequence comparison involving the 3' pol and 5' env regions of AKR MCF247, NFS xenotropic, and AKR ecotropic MuLV proviruses with the cloned endogenous MuLV DNA indicated that MCF247 proviral DNA sequences were conserved in the cloned endogenous MuLV proviral segment. In fact, total nucleotide sequence identity existed between the endogenous MuLV DNA and the MCF247 MuLV provirus in the 3' portion of the pol gene. In the 5' env region, only 4 of 564 nucleotides were different, resulting in three amino acid changes between AKR MCF247 MuLV DNA and the endogenous MuLV DNA present in clone A-12. In addition, nucleotide sequence comparison indicated that Moloney- and Friend-MCF MuLVs were also highly related in the 3' pol and 5' env regions to the cloned endogenous MuLV DNA. These results establish the role of endogenous MuLV DNA segments in generation of recombinant MCF viruses. PMID:6328017

  4. The nucleotide sequence of the chicken thymidine kinase gene and the relationship of its predicted polypeptide to that of the vaccinia virus thymidine kinase.

    PubMed

    Kwoh, T J; Engler, J A

    1984-05-11

    The entire DNA nucleotide sequence of a 3.0 kilobase pair Hind III fragment containing the chicken cytoplasmic thymidine kinase gene was determined. Oligonucleotide linker insertion mutations distributed throughout this gene and having known effects upon gene activity (Kwoh, T.J., Zipser, D., and Wigler, M. 1983. J. Mol. Appl. Genet. 2, 191-200) were used to access regions of the Hind III fragment for sequencing reactions. The complete nucleotide sequence, together with the positions of the linker insertion mutations within the sequence, allows us to propose a structure for the chicken thymidine kinase gene. The protein coding sequence of the gene is divided into seven small segments (each less than 160 base pairs) by six small introns (each less than 230 base pairs). The proposed 244 amino acid polypeptide encoded by this gene bears strong homology to the vaccinia virus thymidine kinase. No homology with the thymidine kinases of the herpes simplex viruses was found.

  5. Nucleotide sequence of the BamHI repetitive sequence, including the HindIII fundamental unit, as a possible mobile element from the Japanese monkey Macaca fuscata.

    PubMed

    Prassolov, V S; Kuchino, Y; Nemoto, K; Nishimura, S

    1986-01-01

    Clustered repeat units produced by BamHI digestion of genomic DNA from the Japanese monkey Macaca fuscata [JMr(BamHI)] were sequenced by dideoxy DNA sequencing. The nucleotide sequences of several individual repeats showed that the BamHI repeat contains the 170-bp HindIII element as an integral part, and that it has more than 90% homology with the HindIII repeat element [AGMr(HindIII)] found in the genomic DNA of the African green monkey. In the JMr(BamHI) repeat unit, the 170-bp HindIII element is flanked by a 6-bp inverted repeat, which is part of a 22-bp direct repeat. This latter repeat of 22-bp asymmetrically overlaps the border between the internal AGMr(HindIII)-like region and adjacent regions of the JMr(BamHI) repeat. A similar structural feature of the BamHI repeat unit has been found in the genomic DNA of the baboon, but not in that of the African green monkey. These results show clearly that the BamHI repeat of the modern Japanese monkey originated as a result of insertion of an AGMr(HindIII) element into a certain site(s) of the genomic DNA of an ancestor of the modern Japanese monkey before Macaca-Cercocebus divergence.

  6. Molecular cloning and sequence determination of cDNAs for alpha subunits of the guanine nucleotide-binding proteins Gs, Gi, and Go from rat brain.

    PubMed Central

    Itoh, H; Kozasa, T; Nagata, S; Nakamura, S; Katada, T; Ui, M; Iwai, S; Ohtsuka, E; Kawasaki, H; Suzuki, K

    1986-01-01

    We have cloned cDNAs encoding alpha subunits of the guanine nucleotide-binding proteins Gs, Gi, and Go and determined their nucleotide sequences. Purified preparations of Gi and Go alpha subunits (Gi alpha and Go alpha) from rat brain were completely digested with trypsin, and peptides were subjected to amino acid sequence analysis. By screening of a cDNA library from rat C6 glioma cells with a synthetic probe corresponding to a 17 amino acid sequence, a clone encoding the sequence of Go alpha was obtained. Then, the library was rescreened with a Go alpha cDNA probe to isolate several strongly or weakly hybridizing clones. cDNAs encoding the complete sequences of Gi alpha and Gs alpha were thus obtained. From nucleotide sequence analysis, the amino acid sequences of Gs alpha and Gi alpha were deduced; they contain 394 and 355 amino acid residues (including the initiator methionine), respectively. The calculated molecular weights for Gs alpha and Gi alpha were 45,663 and 40,499, respectively. The Go alpha clone encoded a sequence of 310 amino acid residues that lacked the NH2 terminus. The homology of the alpha subunits of Gs, Gi, Go, transducin, and ras-encoded protein is discussed. PMID:3086867

  7. Nucleotide and amino acid sequences of human intestinal alkaline phosphatase: close homology to placental alkaline phosphatase

    SciTech Connect

    Henthorn, P.S.; Raducha, M.; Edwards, Y.H.; Weiss, M.J.; Slaughter, C.; Lafferty, M.A.; Harris, H.

    1987-03-01

    A cDNA clone for human adult intestinal alkaline phosphatase (ALP) (orthophosphoric-monoester phosphohydrolase (alkaline optimum); EC 3.1.3.1) was isolated from a lambda gt11 expression library. The cDNA insert of this clone is 2513 base pairs in length and contains an open reading frame that encodes a 528-amino acid polypeptide. This deduced polypeptide contains the first 40 amino acids of human intestinal ALP, as determined by direct protein sequencing. Intestinal ALP shows 86.5% amino acid identity to placental (type 1) ALP and 56.6% amino acid identity to liver/bone/kidney ALP. In the 3'-untranslated regions, intestinal and placental ALP cDNAs are 73.5% identical (excluding gaps). The evolution of this multigene enzyme family is discussed.

  8. Assessment of the labelling accuracy of Spanish semipreserved anchovies products by FINS (forensically informative nucleotide sequencing).

    PubMed

    Velasco, Amaya; Aldrey, Anxela; Pérez-Martín, Ricardo I; Sotelo, Carmen G

    2016-06-01

    Anchovies have been captured and processed for human consumption for millennia. In Spain, ripened and salted anchovies are a delicacy which, in some cases, can reach high commercial values. Although a number of studies have presented DNA methodologies for the identification of anchovies, this is one of the first studies investigating the level of mislabelling in this kind of product in Europe. Sixty-three commercial semipreserved anchovy products were collected in different types of food markets in four Spanish cities to check labelling accuracy. Species determination in these commercial products was performed by sequencing two different cyt-b mitochondrial DNA fragments. Results revealed mislabelling levels higher than 15%, which the authors consider relatively high given the importance of the product. The most frequent substitute species was the Argentine anchovy, Engraulis anchoita, which can be interpreted as an economic fraud.

  9. Complete nucleotide sequence of plasmid plca36 isolated from Lactobacillus casei Zhang.

    PubMed

    Zhang, Wenyi; Yu, Dongliang; Sun, Zhihong; Chen, Xia; Bao, Qiuhua; Meng, He; Hu, Songnian; Zhang, Heping

    2008-09-01

    The complete 36,487 bp sequence of plasmid plca36 from Lactobacillus casei Zhang was determined. Plca36 contains 44 predicted coding regions, and functions could be assigned to 23 of them. For the first time, we identified a relBE toxin-antitoxin (TA) locus in the genus Lactobacillus, perhaps indicating a potential role for plca36 in host survival under extreme nutritional stress. A region encoding a cluster of conjugation genes (tra) was also identified. The cluster showed high similarity and co-linearity with tra regions of pWCFS103 and pMRC01 from Lactobacillus plantarum and Lactococcus lactis, respectively. Comparative gene analysis revealed that plasmids from the genus Lactobacillus may have contributed to environmental adaptation mainly by providing carbohydrate and amino acid transporters. In addition, two chromosome-encoded relBE systems in Lactobacillus johnsonii and Lactobacillus gasseri were identified. PMID:18634821

  10. The complete nucleotide sequence of Malabar grouper (Epinephelus malabaricus) mitochondrial genome.

    PubMed

    Zhu, Kecheng; Huang, Guiju; Zhang, Dongling; Guo, Yihui; Yu, Dahui

    2016-05-01

    In this study, we report the complete mitochondrial DNA sequence of Epinephelus malabaricus. The full-length mitochondrial genome is 16,423 bp, with a base composition of A (28.70%), T (26.55%), G (15.92%) and C (28.83%). It contains 2 rRNA genes, 13 protein-coding genes, 22 tRNA genes and a major non-coding control region (D-loop region). The composition and order of these genes are identical to those of most other vertebrates. All protein initiation codons were ATG, except that COX1 began with GTG and those of ATP-6 and ND6 were not determined. The complete mitogenome of Epinephelus malabaricus provides an important data set for studying the genetic mechanism of hybridization.

  11. Cloning, nucleotide sequence, and transcriptional analyses of the gene encoding a ferredoxin from Methanosarcina thermophila.

    PubMed Central

    Clements, A P; Ferry, J G

    1992-01-01

    A mixed 17-mer oligonucleotide deduced from the N terminus of a ferredoxin isolated from Methanosarcina thermophila was used to probe a lambda gt11 library prepared from M. thermophila genomic DNA; positive clones contained either a 5.7- or 2.1-kbp EcoRI insert. An open reading frame (fdxA) located within the 5.7-kbp insert had a deduced amino acid sequence that was identical to the first 26 N-terminal residues reported for the ferredoxin isolated from M. thermophila, with the exception of the initiator methionine. fdxA had the coding capacity for a 6,230-Da protein which contained eight cysteines with spacings typical of 2[4Fe-4S] ferredoxins. An open reading frame (ORF1) located within the 2.1-kbp EcoRI fragment also had the potential to encode a 2[4Fe-4S] bacterial-type ferredoxin (5,850 Da). fdxA and ORF1 were present as single copies in the genome, and each was transcribed on a monocistronic mRNA. While the fdxA- and ORF1-specific mRNAs were detected in cells grown on methanol and trimethylamine, only the fdxA-specific transcript was present in acetate-grown cells. The apparent transcriptional start sites of fdxA and ORF1, as determined by primer extension analyses, lay 21 to 28 bases downstream of sequences with high identity to the consensus methanogen promoter. PMID:1379583

  12. The Bryopsis hypnoides Plastid Genome: Multimeric Forms and Complete Nucleotide Sequence

    PubMed Central

    Tian, Chao; Wang, Guangce; Niu, Jiangfeng; Pan, Guanghua; Hu, Songnian

    2011-01-01

    Background Bryopsis hypnoides Lamouroux is a siphonous green alga, and its extruded protoplasm can aggregate spontaneously in seawater and develop into mature individuals. The chloroplast of B. hypnoides is the biggest organelle in the cell and shows strong autonomy. To better understand this organelle, we sequenced and analyzed the chloroplast genome of this green alga. Principal Findings A total of 111 functional genes, including 69 potential protein-coding genes, 5 ribosomal RNA genes, and 37 tRNA genes were identified. The genome size (153,429 bp), arrangement, and inverted-repeat (IR)-lacking structure of the B. hypnoides chloroplast DNA (cpDNA) closely resemble those of Chlorella vulgaris. Furthermore, our cytogenomic investigations using pulsed-field gel electrophoresis (PFGE) and Southern blotting methods showed that the B. hypnoides cpDNA had multimeric forms, including monomer, dimer, trimer, tetramer, and even higher multimers, which is similar to the higher order organization observed previously for higher plant cpDNA. The relative amounts of the four multimeric cpDNA forms were estimated to be about 1, 1/2, 1/4, and 1/8 based on molecular hybridization analysis. Phylogenetic analyses based on a concatenated alignment of chloroplast protein sequences suggested that B. hypnoides is sister to all Chlorophyceae and this placement received moderate support. Conclusion All of the results suggest that the autonomy of the chloroplasts of B. hypnoides has little to do with the size and gene content of the cpDNA, and the IR-lacking structure of the chloroplasts indirectly demonstrated that the multimeric molecules might result from the random cleavage and fusion of replication intermediates instead of recombinational events. PMID:21339817

  13. Molecular cloning of the Clostridium botulinum structural gene encoding the type B neurotoxin and determination of its entire nucleotide sequence.

    PubMed Central

    Whelan, S M; Elmore, M J; Bodsworth, N J; Brehm, J K; Atkinson, T; Minton, N P

    1992-01-01

    DNA fragments derived from the Clostridium botulinum type A neurotoxin (BoNT/A) gene (botA) were used in DNA-DNA hybridization reactions to derive a restriction map of the region of the C. botulinum type B strain Danish chromosome encoding botB. As the one probe encoded part of the BoNT/A heavy (H) chain and the other encoded part of the light (L) chain, the position and orientation of botB relative to this map were established. The temperature at which hybridization occurred indicated that a higher degree of DNA homology occurred between the two genes in the H-chain-encoding region. By using the derived restriction map data, a 2.1-kb BglII-XbaI fragment encoding the entire BoNT/B L chain and 108 amino acids of the H chain was cloned and characterized by nucleotide sequencing. A contiguous 1.8-kb XbaI fragment encoding a further 623 amino acids of the H chain was also cloned. The 3' end of the gene was obtained by cloning a 1.6-kb fragment amplified from genomic DNA by inverse polymerase chain reaction. Translation of the nucleotide sequence derived from all three clones demonstrated that BoNT/B was composed of 1,291 amino acids. Comparative alignment of its sequence with all currently characterized BoNTs (A, C, D, and E) and tetanus toxin (TeTx) showed that a wide variation in percent homology occurred dependent on which component of the dichain was compared. Thus, the L chain of BoNT/B exhibits the greatest degree of homology (50% identity) with the TeTx L chain, whereas its H chain is most homologous (48% identity) with the BoNT/A H chain. Overall, the six neurotoxins were shown to be composed of highly conserved amino acid domains interceded with amino acid tracts exhibiting little overall similarity. In total, 68 amino acids of an average of 442 are absolutely conserved between L chains and 110 of 845 amino acids are conserved between H chains. Conservation of Trp residues (one in the L chain and nine in the H chain) was particularly striking. The most

  14. Complete nucleotide sequence and host range of South African cassava mosaic virus: further evidence for recombination amongst begomoviruses.

    PubMed

    Berrie, L C; Rybicki, E P; Rey, M E

    2001-01-01

    Complete nucleotide sequences of the DNA-A (2800 nt) and DNA-B (2760 nt) components of a novel cassava-infecting begomovirus, South African cassava mosaic virus (SACMV), were determined and compared with various New World and Old World begomoviruses. SACMV is most closely related to East African cassava mosaic virus (EACMV) in both its DNA-A (85% with EACMV-MH and -MK) and -B (90% with EACMV-UG2-Mld and EACMV-UG3-Svr) components; however, percentage sequence similarities of less than 90% in the DNA-A component allowed SACMV to be considered a distinct virus. One significant recombination event spanning the entire AC4 open reading frame was identified; however, there was no evidence of recombination in the DNA-B component. Infectivity of the cloned SACMV genome was demonstrated by successful agroinoculation of cassava and three other plant species (Phaseolus vulgaris, Malva parviflora and Nicotiana benthamiana). This is the first description of successful infection of cassava with a geminivirus using Agrobacterium tumefaciens.

  15. Complete nucleotide sequences of mitochondrial genomes of two solitary entoprocts, Loxocorone allax and Loxosomella aloxiata: implications for lophotrochozoan phylogeny.

    PubMed

    Yokobori, Shin-ichi; Iseto, Tohru; Asakawa, Shuichi; Sasaki, Takashi; Shimizu, Nobuyoshi; Yamagishi, Akihiko; Oshima, Tairo; Hirose, Euichi

    2008-05-01

    The complete nucleotide sequences of the mitochondrial (mt) genomes of the entoprocts Loxocorone allax and Loxosomella aloxiata were determined. Both species carry the typical gene set of metazoan mt genomes and have similar organizations of their mt genes. However, they show differences in the positions of two tRNA(Leu) genes. Additionally, the tRNA(Val) gene, and half of the long non-coding region, is duplicated and inverted in the Loxos. aloxiata mt genome. The initiation codon of the Loxos. aloxiata cytochrome oxidase subunit I gene is expected to be ACG rather than AUG. The mt gene organizations in these two entoproct species most closely resemble those of mollusks such as Katharina tunicata and Octopus vulgaris, which have the most evolutionarily conserved mt gene organization reported to date in mollusks. Analyses of the mt gene organization in the lophotrochozoan phyla (Annelida, Brachiopoda, Echiura, Entoprocta, Mollusca, Nemertea, and Phoronida) suggested a close phylogenetic relationship between Brachiopoda, Annelida, and Echiura. However, Phoronida was excluded from this grouping. Molecular phylogenetic analyses based on the sequences of mt protein-coding genes suggested a possible close relationship between Entoprocta and Phoronida, and a close relationship among Brachiopoda, Annelida, and Echiura.

  16. Nucleotide and protein sequences for dog masticatory tropomyosin identify a novel Tpm4 gene product.

    PubMed

    Brundage, Elizabeth A; Biesiadecki, Brandon J; Reiser, Peter J

    2015-10-01

    Jaw-closing muscles of several vertebrate species, including members of Carnivora, express a unique, "masticatory", isoform of myosin heavy chain, along with isoforms of other myofibrillar proteins that are not expressed in most other muscles. It is generally believed that the complement of myofibrillar isoforms in these muscles serves high force generation for capturing live prey, breaking down tough plant material and defensive biting. A unique isoform of tropomyosin (Tpm) was reported to be expressed in cat jaw-closing muscle, based upon two-dimensional gel mobility, peptide mapping, and immunohistochemistry. The objective of this study was to obtain protein and gene sequence information for this unique Tpm isoform. Samples of masseter (a jaw-closing muscle), tibialis (predominantly fast-twitch fibers), and the deep lateral gastrocnemius (predominantly slow-twitch fibers) were obtained from adult dogs. Expressed Tpm isoforms were cloned and sequencing yielded cDNAs that were identical to genomic predicted striated muscle Tpm1.1St(a,b,b,a) (historically referred to as αTpm), Tpm2.2St(a,b,b,a) (βTpm) and Tpm3.12St(a,b,b,a) (γTpm) isoforms (nomenclature reflects predominant tissue expression ("St"-striated muscle) and exon splicing pattern), as well as a novel 284 amino acid isoform observed in jaw-closing muscle that is identical to a genomic predicted product of the Tpm4 gene (δTpm) family. The novel isoform is designated as Tpm4.3St(a,b,b,a). The myofibrillar Tpm isoform expressed in dog masseter exhibits a unique electrophoretic mobility on gels containing 6 M urea, compared to other skeletal Tpm isoforms. To validate that the cloned Tpm4.3 isoform is the Tpm expressed in dog masseter, E. coli-expressed Tpm4.3 was electrophoresed in the presence of urea. Results demonstrate that Tpm4.3 has identical electrophoretic mobility to the unique dog masseter Tpm isoform and is of different mobility from that of muscle Tpm1.1, Tpm2.2 and Tpm3.12 isoforms. We

  17. Nucleotide and protein sequences for dog masticatory tropomyosin identify a novel Tpm4 gene product

    PubMed Central

    Reiser, Peter J.

    2016-01-01

    Jaw-closing muscles of several vertebrate species, including members of Carnivora, express a unique, “masticatory”, isoform of myosin heavy chain, along with isoforms of other myofibrillar proteins that are not expressed in most other muscles. It is generally believed that the complement of myofibrillar isoforms in these muscles serves high force generation for capturing live prey, breaking down tough plant material and defensive biting. A unique isoform of tropomyosin (Tpm) was reported to be expressed in cat jaw-closing muscle, based upon two-dimensional gel mobility, peptide mapping, and immunohistochemistry. The objective of this study was to obtain protein and gene sequence information for this unique Tpm isoform. Samples of masseter (also a jaw-closing muscle), tibialis (with predominantly fast-twitch fibers), and the deep lateral gastrocnemius (predominantly slow-twitch fibers) were obtained from adult dogs. Expressed Tpm isoforms were cloned and sequencing yielded cDNAs that were identical to genomic predicted striated muscle Tpm1.1St(a,b,b,a) (historically referred to as αTpm), Tpm2.2St(a,b,b,a) (βTpm) and Tpm3.12St(a,b,b,a) (γTpm) isoforms (nomenclature reflects predominant tissue expression (“St”-striated muscle) and exon splicing pattern), as well as a novel 284 amino acid isoform observed in jaw-closing muscle that is identical to a genomic predicted product of the Tpm4 gene (δTpm) family. The novel isoform is designated as Tpm4.3St(a,b,b,a). The myofibrillar Tpm isoform expressed in dog masseter exhibits a unique electrophoretic mobility on gels containing 6 M urea, compared to other skeletal Tpm isoforms. To validate that the cloned Tpm4.3 isoform is the Tpm expressed in dog masseter, E. coli-expressed Tpm4.3 was electrophoresed in the presence of urea. Results demonstrate that Tpm4.3 has identical electrophoretic mobility to the unique dog masseter Tpm isoform and is of different mobility from that of muscle Tpm1.1, Tpm2.2 and Tpm3

  18. Longitudinal study of a heteroplasmic 3460 Leber hereditary optic neuropathy family by multiplexed primer-extension analysis and nucleotide sequencing

    SciTech Connect

    Ghosh, S.S.; Fahy, E.; Bodis-Wollner, I.

    1996-02-01

    Nucleotide-sequencing and multiplexed primer-extension assays have been used to quantitate the mutant-allele frequency in 14 maternal relatives, spanning three generations, from a family that is heteroplasmic for the primary Leber hereditary optic neuropathy (LHON) mutation at nucleotide 3460 of the mitochondrial genome. There was excellent agreement between the values that were obtained with the two different methods. The longitudinal study shows that the mutant-allele frequency was constant within individual family members over a sampling period of 3.5 years. Second, although there was an overall increase in the mutant-allele frequency in successive generations, segregation in the direction of the mutant allele was not invariant, and there was one instance in which there was a significant decrease in the frequency from parent to offspring. From these two sets of results, and from previous studies of heteroplasmic LHON families, we conclude that there is no evidence for a marked selective pressure that determines the replication, segregation, or transmission of primary LHON mutations to white blood cells and platelets. Instead, the mtDNA molecules are most likely to replicate and segregate under conditions of random drift at the cellular level. Finally, the pattern of transmission in this maternal lineage is compatible with a developmental bottleneck model in which the number of mitochondrial units of segregation in the female germ line is relatively small in relation to the number of mtDNA molecules within a cell. However, this is not an invariant pattern for humans, and simple models of mitochondrial gene transmission are inappropriate at the present time. 37 refs., 4 figs., 1 tab.

  19. Emergence of gynodioecy in wild beet (Beta vulgaris ssp. maritima L.): a genealogical approach using chloroplastic nucleotide sequences.

    PubMed

    Fénart, Stéphane; Touzet, Pascal; Arnaud, Jean-François; Cuguen, Joël

    2006-06-01

    Gynodioecy is a breeding system where both hermaphroditic and female individuals coexist within plant populations. This dimorphism is the result of a genomic interaction between maternally inherited cytoplasmic male sterility (CMS) genes and bi-parentally inherited nuclear male fertility restorers. As opposed to other gynodioecious species, where every cytoplasm seems to be associated with male sterility, wild beet Beta vulgaris ssp. maritima exhibits a minority of sterilizing cytoplasms among numerous non-sterilizing ones. Many studies on population genetics have explored the molecular diversity of different CMS cytoplasms, but questions remain concerning their evolutionary dynamics. In this paper we report one of the first investigations on phylogenetic relationships between CMS and non-CMS lineages. We investigated the phylogenetic relationships between 35 individuals exhibiting different mitochondrial haplotypes. Relying on the high linkage disequilibrium between chloroplastic and mitochondrial genomes, we chose to analyse the nucleotide sequence diversity of three chloroplastic fragments (trnK intron, trnD-trnT and trnL-trnF intergenic spacers). Nucleotide diversity appeared to be low, suggesting a recent bottleneck during the evolutionary history of B. vulgaris ssp. maritima. Statistical parsimony analyses revealed a star-like genealogy and showed that sterilizing haplotypes all belong to different lineages derived from an ancestral non-sterilizing cytoplasm. These results suggest a rapid evolution of male sterility in this taxon. The emergence of gynodioecy in wild beet is confronted with theoretical expectations, describing either gynodioecy dynamics as the maintenance of CMS factors through balancing selection or as a constant turnover of new CMSs.

  20. Identification, cloning, and nucleotide sequencing of the ornithine decarboxylase antizyme gene of Escherichia coli.

    PubMed Central

    Canellakis, E S; Paterakis, A A; Huang, S C; Panagiotidis, C A; Kyriakidis, D A

    1993-01-01

    The ornithine decarboxylase antizyme gene of Escherichia coli was identified by immunological screening of an E. coli genomic library. A 6.4-kilobase fragment containing the antizyme gene was subcloned and sequenced. The open reading frame encoding the antizyme was identified on the basis of its ability to direct the synthesis of immunoreactive antizyme. Antizyme shares significant homology with bacterial transcriptional activators of the two-component regulatory system family; these systems consist of a "sensor" kinase and a transcriptional regulator. The open reading frame next to antizyme is homologous to sensor kinases. Antizyme overproduction inhibits the activities of both ornithine and arginine decarboxylases without affecting their protein levels. Extracts from E. coli bearing an antizyme gene-containing plasmid exhibit increased antizyme activity. These data strongly suggest that (i) the cloned gene encodes the ornithine decarboxylase antizyme and (ii) antizyme is a bifunctional protein serving as both an inhibitor of polyamine biosynthesis as well as a transcriptional regulator of an as yet unknown set of genes. PMID:8346225

  1. Nodavirus Coat Protein Imposes Dodecahedral RNA Structure Independent of Nucleotide Sequence and Length

    PubMed Central

    Tihova, Mariana; Dryden, Kelly A.; Le, Thuc-vy L.; Harvey, Stephen C.; Johnson, John E.; Yeager, Mark; Schneemann, Anette

    2004-01-01

    The nodavirus Flock house virus (FHV) has a bipartite, positive-sense RNA genome that is packaged into an icosahedral particle displaying T=3 symmetry. The high-resolution X-ray structure of FHV has shown that 10 bp of well-ordered, double-stranded RNA are located at each of the 30 twofold axes of the virion, but it is not known which portions of the genome form these duplex regions. The regular distribution of double-stranded RNA in the interior of the virus particle indicates that large regions of the encapsidated genome are engaged in secondary structure interactions. Moreover, the RNA is restricted to a topology that is unlikely to exist during translation or replication. We used electron cryomicroscopy and image reconstruction to determine the structure of four types of FHV particles that differed in RNA and protein content. RNA-capsid interactions were primarily mediated via the N and C termini, which are essential for RNA recognition and particle assembly. A substantial fraction of the packaged nucleic acid, either viral or heterologous, was organized as a dodecahedral cage of duplex RNA. The similarity in tertiary structure suggests that RNA folding is independent of sequence and length. Computational modeling indicated that RNA duplex formation involves both short-range and long-range interactions. We propose that the capsid protein is able to exploit the plasticity of the RNA secondary structures, capturing those that are compatible with the geometry of the dodecahedral cage. PMID:14990708

  2. SureChEMBL: a large-scale, chemically annotated patent document database

    PubMed Central

    Papadatos, George; Davies, Mark; Dedman, Nathan; Chambers, Jon; Gaulton, Anna; Siddle, James; Koks, Richard; Irvine, Sean A.; Pettersson, Joe; Goncharoff, Nicko; Hersey, Anne; Overington, John P.

    2016-01-01

    SureChEMBL is a publicly available large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature according to an automated text and image-mining pipeline on a daily basis. SureChEMBL provides access to a previously unavailable, open and timely set of annotated compound-patent associations, complemented with sophisticated combined structure and keyword-based search capabilities against the compound repository and patent document corpus; given the wealth of knowledge hidden in patent documents, analysis of SureChEMBL data has immediate applications in drug discovery, medicinal chemistry and other commercial areas of chemical science. Currently, the database contains 17 million compounds extracted from 14 million patent documents. Access is available through a dedicated web-based interface and data downloads at: https://www.surechembl.org/. PMID:26582922

  4. SureChEMBL: a large-scale, chemically annotated patent document database.

    PubMed

    Papadatos, George; Davies, Mark; Dedman, Nathan; Chambers, Jon; Gaulton, Anna; Siddle, James; Koks, Richard; Irvine, Sean A; Pettersson, Joe; Goncharoff, Nicko; Hersey, Anne; Overington, John P

    2016-01-01

    SureChEMBL is a publicly available large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature according to an automated text and image-mining pipeline on a daily basis. SureChEMBL provides access to a previously unavailable, open and timely set of annotated compound-patent associations, complemented with sophisticated combined structure and keyword-based search capabilities against the compound repository and patent document corpus; given the wealth of knowledge hidden in patent documents, analysis of SureChEMBL data has immediate applications in drug discovery, medicinal chemistry and other commercial areas of chemical science. Currently, the database contains 17 million compounds extracted from 14 million patent documents. Access is available through a dedicated web-based interface and data downloads at: https://www.surechembl.org/. PMID:26582922

  5. Neuropeptidergic Signaling in the American Lobster Homarus americanus: New Insights from High-Throughput Nucleotide Sequencing.

    PubMed

    Christie, Andrew E; Chi, Megan; Lameyer, Tess J; Pascual, Micah G; Shea, Devlin N; Stanhope, Meredith E; Schulz, David J; Dickinson, Patsy S

    2015-01-01

    Peptides are the largest and most diverse class of molecules used for neurochemical communication, playing key roles in the control of essentially all aspects of physiology and behavior. The American lobster, Homarus americanus, is a crustacean of commercial and biomedical importance; lobster growth and reproduction are under neuropeptidergic control, and portions of the lobster nervous system serve as models for understanding the general principles underlying rhythmic motor behavior (including peptidergic neuromodulation). While a number of neuropeptides have been identified from H. americanus, and the effects of some have been investigated at the cellular/systems levels, little is currently known about the molecular components of neuropeptidergic signaling in the lobster. Here, a H. americanus neural transcriptome was generated and mined for sequences encoding putative peptide precursors and receptors; 35 precursor- and 41 receptor-encoding transcripts were identified. We predicted 194 distinct neuropeptides from the deduced precursor proteins, including members of the adipokinetic hormone-corazonin-like peptide, allatostatin A, allatostatin C, bursicon, CCHamide, corazonin, crustacean cardioactive peptide, crustacean hyperglycemic hormone (CHH), CHH precursor-related peptide, diuretic hormone 31, diuretic hormone 44, eclosion hormone, FLRFamide, GSEFLamide, insulin-like peptide, intocin, leucokinin, myosuppressin, neuroparsin, neuropeptide F, orcokinin, pigment dispersing hormone, proctolin, pyrokinin, SIFamide, sulfakinin and tachykinin-related peptide families. While some of the predicted peptides are known H. americanus isoforms, most are novel identifications, more than doubling the extant lobster neuropeptidome. The deduced receptor proteins are the first descriptions of H. americanus neuropeptide receptors, and include ones for most of the peptide groups mentioned earlier, as well as those for ecdysis-triggering hormone, red pigment concentrating hormone

  7. Neuropeptidergic Signaling in the American Lobster Homarus americanus: New Insights from High-Throughput Nucleotide Sequencing

    PubMed Central

    Christie, Andrew E.; Chi, Megan; Lameyer, Tess J.; Pascual, Micah G.; Shea, Devlin N.; Stanhope, Meredith E.; Schulz, David J.; Dickinson, Patsy S.

    2015-01-01

    Peptides are the largest and most diverse class of molecules used for neurochemical communication, playing key roles in the control of essentially all aspects of physiology and behavior. The American lobster, Homarus americanus, is a crustacean of commercial and biomedical importance; lobster growth and reproduction are under neuropeptidergic control, and portions of the lobster nervous system serve as models for understanding the general principles underlying rhythmic motor behavior (including peptidergic neuromodulation). While a number of neuropeptides have been identified from H. americanus, and the effects of some have been investigated at the cellular/systems levels, little is currently known about the molecular components of neuropeptidergic signaling in the lobster. Here, a H. americanus neural transcriptome was generated and mined for sequences encoding putative peptide precursors and receptors; 35 precursor- and 41 receptor-encoding transcripts were identified. We predicted 194 distinct neuropeptides from the deduced precursor proteins, including members of the adipokinetic hormone-corazonin-like peptide, allatostatin A, allatostatin C, bursicon, CCHamide, corazonin, crustacean cardioactive peptide, crustacean hyperglycemic hormone (CHH), CHH precursor-related peptide, diuretic hormone 31, diuretic hormone 44, eclosion hormone, FLRFamide, GSEFLamide, insulin-like peptide, intocin, leucokinin, myosuppressin, neuroparsin, neuropeptide F, orcokinin, pigment dispersing hormone, proctolin, pyrokinin, SIFamide, sulfakinin and tachykinin-related peptide families. While some of the predicted peptides are known H. americanus isoforms, most are novel identifications, more than doubling the extant lobster neuropeptidome. The deduced receptor proteins are the first descriptions of H. americanus neuropeptide receptors, and include ones for most of the peptide groups mentioned earlier, as well as those for ecdysis-triggering hormone, red pigment concentrating hormone

  8. Complete Nucleotide Sequence of a South African Isolate of Grapevine Fanleaf Virus and Its Associated Satellite RNA

    PubMed Central

    Lamprecht, Renate L.; Spaltman, Monique; Stephan, Dirk; Wetzel, Thierry; Burger, Johan T.

    2013-01-01

    The complete sequences of RNA1, RNA2 and satellite RNA have been determined for a South African isolate of Grapevine fanleaf virus (GFLV-SACH44). The two RNAs of GFLV-SACH44 are 7,341 nucleotides (nt) and 3,816 nt in length, respectively, and its satellite RNA (satRNA) is 1,104 nt in length, all excluding the poly(A) tail. Multiple sequence alignment of these sequences showed that GFLV-SACH44 RNA1 and RNA2 were the closest to the South African isolate, GFLV-SAPCS3 (98.2% and 98.6% nt identity, respectively), followed by the French isolate, GFLV-F13 (87.3% and 90.1% nt identity, respectively). Interestingly, the GFLV-SACH44 satRNA is more similar to three Arabis mosaic virus satRNAs (85%–87.4% nt identity) than to the satRNA of GFLV-F13 (81.8% nt identity) and was most distantly related to the satRNA of GFLV-R2 (71.0% nt identity). Full-length infectious clones of GFLV-SACH44 satRNA were constructed. The infectivity of the clones was tested with three nepovirus isolates, GFLV-NW, Arabis mosaic virus (ArMV)-NW and GFLV-SAPCS3. The clones were mechanically inoculated in Chenopodium quinoa and were infectious when co-inoculated with the two GFLV helper viruses, but not when co-inoculated with ArMV-NW. PMID:23867805
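
    The percent nucleotide identities reported here come from a multiple sequence alignment. As a minimal sketch of how such a figure can be computed for one aligned pair, the function below counts matching columns and skips any column containing a gap. This is one common convention and is an assumption here: the abstract does not say which tool or definition the authors used, and the function name and toy sequences are made up for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # gap columns are excluded from the denominator (assumption)
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment with invented sequences; this is not GFLV data.
toy_a = "AUGGC-UAAC"
toy_b = "AUGCCGUAAC"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

    Other conventions count gap columns in the denominator, which lowers the reported value, so figures from different tools are not always directly comparable.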

  9. Nucleotide sequence and cloning in Bacillus subtilis of the Bacillus stearothermophilus pleiotropic regulatory gene degT.

    PubMed Central

    Takagi, M; Takada, H; Imanaka, T

    1990-01-01

    The regulatory gene (degT) from Bacillus stearothermophilus NCA1503 which enhanced production of extracellular alkaline protease (Apr) was cloned in Bacillus subtilis with pTB53 as a vector. When B. subtilis MT-2 (Npr- [deficiency of neutral protease] Apr+) was transformed with the recombinant plasmid, pDT145, the plasmid carrier produced about three times more alkaline protease than did the wild-type strain. In contrast, when B. subtilis DB104 (Npr- Apr-) was used as a host, the transformant with pDT145 could not exhibit any protease activity. After construction of the deletion plasmids, DNA sequencing was done. A large open reading frame was found, and nucleotide sequence analysis showed that the degT gene was composed of 1,116 bases (372 amino acid residues, molecular weight of 41,244). A Shine-Dalgarno sequence was found nine bases upstream from the open reading frame. A B. subtilis strain carrying degT showed the following pleiotropic phenomena: (i) enhancement of production of extracellular enzymes such as alkaline protease and levansucrase, (ii) repression of autolysin activity, (iii) decrease of transformation efficiency for B. subtilis (competent cell procedure), (iv) altered control of sporulation, (v) loss of flagella, and (vi) abnormal cell division. When B. stearothermophilus SIC1 was transformed with the recombinant plasmid carrying degT, the transformants exhibited abnormal cell division. These phenomena are similar to those of the phenotypes of degSU(Hy) (hyperproduction), degQ(Hy), and degR mutants of B. subtilis. However, the amino acid sequence of the degT product (DegT) is different from those of the reported gene products. Furthermore, DegT includes a hydrophobic core region in the N-terminal portion (amino acid numbers 50 to 160), a consensus sequence for a DNA binding region (amino acid numbers 160 to 179), and a region homologous to transcription activator proteins (amino acid numbers 351 to 366). We discuss the possibility that the membrane
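
    The gene length, residue count and molecular weight quoted for degT can be cross-checked with simple arithmetic. The sketch below uses only the three numbers given in the abstract; the constant names are hypothetical.

```python
ORF_BASES = 1116      # length of degT quoted in the abstract
RESIDUES = 372        # amino acid residues quoted in the abstract
MOL_WEIGHT = 41244    # molecular weight quoted in the abstract

# Three bases per codon: the base count should divide evenly.
codons, leftover = divmod(ORF_BASES, 3)

# Mean mass per residue, a rough sanity check on the quoted molecular weight.
mean_residue_mass = MOL_WEIGHT / RESIDUES

print(f"codons: {codons}, leftover bases: {leftover}")
print(f"mean residue mass: {mean_residue_mass:.1f}")
```

    Since 1,116 / 3 equals the quoted 372 residues exactly, the stated length appears to exclude the stop codon if the figures are read literally, and the mean residue mass of roughly 111 is in the range usually expected for proteins.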

  10. Nucleotide sequence of the 16S - 23S spacer region in an rRNA gene cluster from tobacco chloroplast DNA.

    PubMed Central

    Takaiwa, F; Sugiura, M

    1982-01-01

    The nucleotide sequence of a spacer region between 16S and 23S rRNA genes from tobacco chloroplasts has been determined. The spacer region is 2080 bp long and encodes tRNA(Ile) and tRNA(Ala) genes which contain intervening sequences of 707 bp and 710 bp, respectively. Strong homology between the two intervening sequences is observed. These spacer tRNAs are synthesized as part of an 8.2 kb precursor molecule containing 16S and 23S rRNA sequences. PMID:6281739

  11. Development of Single Nucleotide Polymorphism markers in Theobroma cacao and comparison to Simple Sequence Repeat markers for genotyping of Cameroon clones.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Single Nucleotide Polymorphism (SNP) markers are increasingly being used in crop breeding programs, slowly replacing Simple Sequence Repeats (SSR) and other markers. SNPs provide many benefits over SSRs, including ease of analysis and unambiguous results across various platforms. We have identifie...

  12. Complete nucleotide sequence and genome organization of an endornavirus from bottle gourd (Lagenaria siceraria) in California, U.S.A.

    PubMed

    Kwon, Sun-Jung; Tan, Shih-Hua; Vidalakis, Georgios

    2014-08-01

    The full-length nucleotide sequence and genome organization of an Endornavirus isolated from ornamental hard shell bottle gourd plants (Lagenaria siceraria (Molina) Standl.) in California (CA), USA tentatively named L. siceraria endornavirus-California (LsEV-CA) was determined. The LsEV-CA genome was 15088 bp in length, with a G + C content of 36.55 %. The lengths of the 5' and 3' untranslated regions were 111 and 52 bp, respectively. The genome of LsEV-CA contained one large ORF encoding a 576 kDa polyprotein. The predicted protein contains two glycosyltransferase motifs, as well as RNA-dependent RNA polymerase and helicase domains. LsEV-CA was detected in healthy-looking field-grown gourd plants, as well as plants expressing yellows symptoms. It was also detected in non-symptomatic greenhouse-grown gourd seedlings grown from seed obtained from the same field sites. These preliminary data indicate that LsEV-CA is likely not associated with the gourd-yellows syndrome observed in the field. PMID:24818693
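
    The G + C content quoted above is a simple base-composition statistic. A minimal sketch of how such a percentage could be computed is given here; the function name gc_content and the toy sequence are illustrative assumptions and are not taken from the LsEV-CA study, which does not describe its software.

```python
def gc_content(sequence: str) -> float:
    """Return the G + C percentage of a nucleotide sequence.

    Only unambiguous bases (A, C, G, T, U) are counted; any other
    symbol (for example N) is ignored. This is an assumption of the sketch.
    """
    counted = [base for base in sequence.upper() if base in "ACGTU"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)


# Toy sequence for illustration only (not from the LsEV-CA genome).
example = "ATGCGTTAAC"
print(f"G + C content: {gc_content(example):.2f} %")
```

    Applied to a full genome string read from a sequence file, the same function would yield a figure comparable to the one reported in the abstract.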

  13. Simultaneous Detection of Both Single Nucleotide Variations and Copy Number Alterations by Next-Generation Sequencing in Gorlin Syndrome.

    PubMed

    Morita, Kei-ichi; Naruto, Takuya; Tanimoto, Kousuke; Yasukawa, Chisato; Oikawa, Yu; Masuda, Kiyoshi; Imoto, Issei; Inazawa, Johji; Omura, Ken; Harada, Hiroyuki

    2015-01-01

    Gorlin syndrome (GS) is an autosomal dominant disorder that predisposes affected individuals to developmental defects and tumorigenesis, and is caused mainly by heterozygous germline PTCH1 mutations. Despite exhaustive analysis, PTCH1 mutations remain unidentified in some patients; the failure to detect mutations is presumably because of mutations occurring in other causative genes or outside of analyzed regions of PTCH1, or copy number alterations (CNAs). In this study, we subjected a cohort of GS-affected individuals from six unrelated families to next-generation sequencing (NGS) analysis for the combined screening of causative alterations in Hedgehog signaling pathway-related genes. Specific single nucleotide variations (SNVs) of PTCH1 causing inferred amino acid changes were identified in four families (seven affected individuals), whereas CNAs within or around PTCH1 were found in two families in whom possible causative SNVs were not detected. Through a targeted resequencing of all coding exons, as well as simultaneous evaluation of copy number status using the alignment map files obtained via NGS, we found that GS phenotypes could be explained by PTCH1 mutations or deletions in all affected patients. Because it is advisable to evaluate CNAs of candidate causative genes in point mutation-negative cases, NGS methodology appears to be useful for improving molecular diagnosis through the simultaneous detection of both SNVs and CNAs in the targeted genes/regions. PMID:26544948

  14. The nucleotide sequence and a first generation gene transfer vector of species B human adenovirus serotype 3.

    PubMed

    Sirena, Dominique; Ruzsics, Zsolt; Schaffner, Walter; Greber, Urs F; Hemmi, Silvio

    2005-12-20

    Human adenovirus (Ad) serotype 3 causes respiratory infections. It is considered highly virulent, accounting for about 13% of all Ad isolates. We report here the complete Ad3 DNA sequence of 35,343 base pairs (GenBank accession DQ086466). Ad3 shares 96.43% nucleotide identity with Ad7, another virulent subspecies B1 serotype, and 82.56 and 62.75% identity with the less virulent species B2 Ad11 and species C Ad5, respectively. The genomic organization of Ad3 is similar to the other human Ads comprising five early transcription units, E1A, E1B, E2, E3, and E4, two delayed early units IX and IVa2, and the major late unit, in total 39 putative and 7 hypothetical open reading frames. A recombinant E1-deleted Ad3 was generated on a bacterial artificial chromosome. This prototypic virus efficiently transduced CD46-positive rodent and human cells. Our results will help in clarifying the biology and pathology of adenoviruses and enhance therapeutic applications of viral vectors in clinical settings.

  15. Prediction of rare single-nucleotide causative mutations for muscular diseases in pooled next-generation sequencing experiments.

    PubMed

    Ferraro, Maria Brigida; Savarese, Marco; Di Fruscio, Giuseppina; Nigro, Vincenzo; Guarracino, Mario Rosario

    2014-09-01

    Next-generation sequencing (NGS) is a new approach for biomedical research, useful for the diagnosis of genetic diseases in extremely heterogeneous conditions. In this work, we describe how data generated by high-throughput NGS experiments can be analyzed to find single nucleotide polymorphisms (SNPs) in DNA samples of patients affected by neuromuscular disorders. In particular, we consider untagged pooled NGS data, where DNA samples of different individuals are combined in a single experiment, still providing information with an uncertainty limited to only two patients. At the moment, only a few publications address the problem of SNP detection in pooled experiments, and existing tools are often inaccurate. We propose a computational procedure consisting of two parts. In the first, data are filtered by means of decision rules. The second phase is based on a supervised classification technique. In the present work, we compare different de facto standard supervised and unsupervised procedures to identify and classify variants potentially related to muscular diseases, and we discuss results in terms of statistical and biological validation.

  16. Genome sequence of Perigonia lusca single nucleopolyhedrovirus: insights into the evolution of a nucleotide metabolism enzyme in the family Baculoviridae

    PubMed Central

    Ardisson-Araújo, Daniel M. P.; Lima, Rayane Nunes; Melo, Fernando L.; Clem, Rollie J.; Huang, Ning; Báo, Sônia Nair; Sosa-Gómez, Daniel R.; Ribeiro, Bergmann M.

    2016-01-01

    The genome of a novel group II alphabaculovirus, Perigonia lusca single nucleopolyhedrovirus (PeluSNPV), was sequenced and shown to contain 132,831 bp with 145 putative ORFs (open reading frames) of at least 50 amino acids. An interesting feature of this novel genome was the presence of a putative nucleotide metabolism enzyme-encoding gene (pelu112). The pelu112 gene was predicted to encode a fusion of thymidylate kinase (tmk) and dUTP diphosphatase (dut). Phylogenetic analysis indicated that baculoviruses have independently acquired tmk and dut several times during their evolution. Two homologs of the tmk-dut fusion gene were separately introduced into the Autographa californica multiple nucleopolyhedrovirus (AcMNPV) genome, which lacks tmk and dut. The recombinant baculoviruses produced viral DNA, virus progeny, and some viral proteins earlier during in vitro infection and the yields of viral occlusion bodies were increased 2.5-fold when compared to the parental virus. Interestingly, both enzymes appear to retain their active sites, based on separate modeling using previously solved crystal structures. We suggest that the retention of these tmk-dut fusion genes by certain baculoviruses could be related to accelerating virus replication and to protecting the virus genome from deleterious mutation. PMID:27273152

  17. The nucleotide sequence and a first generation gene transfer vector of species B human adenovirus serotype 3.

    PubMed

    Sirena, Dominique; Ruzsics, Zsolt; Schaffner, Walter; Greber, Urs F; Hemmi, Silvio

    2005-12-20

    Human adenovirus (Ad) serotype 3 causes respiratory infections. It is considered highly virulent, accounting for about 13% of all Ad isolates. We report here the complete Ad3 DNA sequence of 35,343 base pairs (GenBank accession DQ086466). Ad3 shares 96.43% nucleotide identity with Ad7, another virulent subspecies B1 serotype, and 82.56 and 62.75% identity with the less virulent species B2 Ad11 and species C Ad5, respectively. The genomic organization of Ad3 is similar to the other human Ads comprising five early transcription units, E1A, E1B, E2, E3, and E4, two delayed early units IX and IVa2, and the major late unit, in total 39 putative and 7 hypothetical open reading frames. A recombinant E1-deleted Ad3 was generated on a bacterial artificial chromosome. This prototypic virus efficiently transduced CD46-positive rodent and human cells. Our results will help in clarifying the biology and pathology of adenoviruses and enhance therapeutic applications of viral vectors in clinical settings. PMID:16169033

  18. Development of Prevotella intermedia-specific PCR primers based on the nucleotide sequences of a DNA probe Pig27.

    PubMed

    Kim, Min Jung; Hwang, Kyung Hwan; Lee, Young-Seok; Park, Jae-Yoon; Kook, Joong-Ki

    2011-03-01

    The aim of this study was to develop Prevotella intermedia-specific PCR primers based on the P. intermedia-specific DNA probe. The P. intermedia-specific DNA probe was screened by inverted dot blot hybridization and confirmed by Southern blot hybridization. The nucleotide sequences of the species-specific DNA probes were determined using a chain termination method. Southern blot analysis showed that the DNA probe, Pig27, detected only the genomic DNA of P. intermedia strains. PCR showed that the PCR primers, Pin-F1/Pin-R1, had species-specificity for P. intermedia. The detection limits of the PCR primer sets were 0.4 pg of the purified genomic DNA of P. intermedia ATCC 49046. These results suggest that the PCR primers, Pin-F1/Pin-R1, could be useful in the detection of P. intermedia as well as in the development of a PCR kit in epidemiological studies related to periodontal diseases. PMID:21192988

  19. [Variability of nucleotide sequences of the mitochondrial DNA cytochrome b gene in dolly varden and taranetz char].

    PubMed

    Radchenko, O A; Derenko, M V; Maliarchuk, B A

    2000-07-01

    The nucleotide sequence of a 307-bp fragment of the mitochondrial DNA cytochrome b gene was determined in representatives of three species of the genus Salvelinus, specifically, dolly varden char (S. malma), taranetz char (S. taranetzi), and white-spotted char (S. leucomaenis). These results pointed to a high level of mitochondrial DNA (mtDNA) divergence between white-spotted char, on the one hand, and dolly varden char and taranetz char, on the other (the mean d value was 5.45%). However, the divergence between the dolly varden char and taranetz char was only 0.81%, which is comparable with the level of intraspecific divergence in the dolly varden char (d = 0.87%). It was shown that the dolly varden char mitochondrial gene pool contained DNA lineages differing from the main mtDNA pool at least in the taranetz char-specific mitochondrial lineages. One of these dolly varden char mtDNA lineages was characterized by the presence of the restriction endonuclease MspI-D variant of the cytochrome b gene. This lineage was widely distributed in the Chukotka populations but it was not detected in the Yana River (Sea of Okhotsk) populations. These findings suggest that dolly varden char has a more ancient evolutionary lineage, diverging from the common ancestor earlier than did taranetz char.

  20. Nucleotide sequence and mutational analysis of the structural genes (anfHDGK) for the second alternative nitrogenase from Azotobacter vinelandii.

    PubMed Central

    Joerger, R D; Jacobson, M R; Premakumar, R; Wolfinger, E D; Bishop, P E

    1989-01-01

    The nucleotide sequence of a region of the Azotobacter vinelandii genome exhibiting sequence similarity to nifH has been determined. The order of open reading frames within this 6.1-kilobase-pair region was found to be anfH (alternative nitrogen fixation, nifH-like gene), anfD (nifD-like gene), anfG (potentially encoding a protein similar to the product of vnfG from Azotobacter chroococcum), anfK (nifK-like gene), followed by two additional open reading frames. The 5'-flanking region of anfH contains a nif promoter similar to that found in the A. vinelandii nifHDK gene cluster. The presumed products of anfH, anfD, and anfK are similar in predicted Mr and pI to the previously described subunits of nitrogenase 3. Deletion plus insertion mutations introduced into the anfHDGK region of wild-type strain A. vinelandii CA resulted in mutant strains that were unable to grow in Mo-deficient, N-free medium but grew in the presence of 1 microM Na2MoO4 or V2O5. Introduction of the same mutations into the nifHDK deletion strain CA11 resulted in strains that grew under diazotrophic conditions only in the presence of vanadium. The lack of nitrogenase 3 subunits in these mutant strains was demonstrated through two-dimensional gel analysis of protein extracts from cells derepressed for nitrogenase under Mo and V deficiency. These results indicate that anfH, anfD, and anfK encode structural proteins for nitrogenase 3. PMID:2644222

  1. Nucleotide sequence and mutational analysis of the structural genes (anfHDGK) for the second alternative nitrogenase from Azotobacter vinelandii.

    PubMed

    Joerger, R D; Jacobson, M R; Premakumar, R; Wolfinger, E D; Bishop, P E

    1989-02-01

    The nucleotide sequence of a region of the Azotobacter vinelandii genome exhibiting sequence similarity to nifH has been determined. The order of open reading frames within this 6.1-kilobase-pair region was found to be anfH (alternative nitrogen fixation, nifH-like gene), anfD (nifD-like gene), anfG (potentially encoding a protein similar to the product of vnfG from Azotobacter chroococcum), anfK (nifK-like gene), followed by two additional open reading frames. The 5'-flanking region of anfH contains a nif promoter similar to that found in the A. vinelandii nifHDK gene cluster. The presumed products of anfH, anfD, and anfK are similar in predicted Mr and pI to the previously described subunits of nitrogenase 3. Deletion plus insertion mutations introduced into the anfHDGK region of wild-type strain A. vinelandii CA resulted in mutant strains that were unable to grow in Mo-deficient, N-free medium but grew in the presence of 1 microM Na2MoO4 or V2O5. Introduction of the same mutations into the nifHDK deletion strain CA11 resulted in strains that grew under diazotrophic conditions only in the presence of vanadium. The lack of nitrogenase 3 subunits in these mutant strains was demonstrated through two-dimensional gel analysis of protein extracts from cells derepressed for nitrogenase under Mo and V deficiency. These results indicate that anfH, anfD, and anfK encode structural proteins for nitrogenase 3. PMID:2644222

  3. A high-density simple sequence repeat and single nucleotide polymorphism genetic map of the tetraploid cotton genome.

    PubMed

    Yu, John Z; Kohel, Russell J; Fang, David D; Cho, Jaemin; Van Deynze, Allen; Ulloa, Mauricio; Hoffman, Steven M; Pepper, Alan E; Stelly, David M; Jenkins, Johnie N; Saha, Sukumar; Kumpatla, Siva P; Shah, Manali R; Hugie, William V; Percy, Richard G

    2012-01-01

    Genetic linkage maps play fundamental roles in understanding genome structure, explaining genome formation events during evolution, and discovering the genetic bases of important traits. A high-density cotton (Gossypium spp.) genetic map was developed using representative sets of simple sequence repeat (SSR) and the first public set of single nucleotide polymorphism (SNP) markers to genotype 186 recombinant inbred lines (RILs) derived from an interspecific cross between Gossypium hirsutum L. (TM-1) and G. barbadense L. (3-79). The genetic map comprised 2072 loci (1825 SSRs and 247 SNPs) and covered 3380 centiMorgan (cM) of the cotton genome (AD) with an average marker interval of 1.63 cM. The allotetraploid cotton genome produced equivalent recombination frequencies in its two subgenomes (At and Dt). Of the 2072 loci, 1138 (54.9%) were mapped to 13 At-subgenome chromosomes, covering 1726.8 cM (51.1%), and 934 (45.1%) mapped to 13 Dt-subgenome chromosomes, covering 1653.1 cM (48.9%). The genetically smallest homeologous chromosome pair was Chr. 04 (A04) and 22 (D04), and the largest was Chr. 05 (A05) and 19 (D05). Duplicate loci between and within homeologous chromosomes were identified that facilitate investigations of chromosome translocations. The map augments evidence of reciprocal rearrangement between ancestral forms of Chr. 02 and 03 versus segmental homeologs 14 and 17, as centromeric regions show homeology between Chr. 02 (A02) and 17 (D02), as well as between Chr. 03 (A03) and 14 (D03). This research represents an important foundation for studies on polyploid cottons, including germplasm characterization, gene discovery, and genome sequence assembly. PMID:22384381
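
    The summary statistics in the abstract above follow from a few divisions. The short sketch below recomputes them from the quoted figures as a consistency check; the variable names and the percent helper are illustrative assumptions, not part of the original study.

```python
# Figures quoted in the abstract above.
total_loci = 2072
ssr_loci, snp_loci = 1825, 247
map_length_cm = 3380.0
at_loci, dt_loci = 1138, 934
at_length_cm, dt_length_cm = 1726.8, 1653.1


def percent(part: float, whole: float) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


# Marker counts should add up to the total number of loci.
loci_sum_ok = (ssr_loci + snp_loci == total_loci) and (at_loci + dt_loci == total_loci)

# Average marker interval = total map length / number of loci.
avg_interval = round(map_length_cm / total_loci, 2)

at_loci_pct = percent(at_loci, total_loci)
dt_loci_pct = percent(dt_loci, total_loci)
at_length_pct = percent(at_length_cm, map_length_cm)
dt_length_pct = percent(dt_length_cm, map_length_cm)

print(loci_sum_ok, avg_interval)
print(at_loci_pct, dt_loci_pct, at_length_pct, dt_length_pct)
```

    The recomputed values agree with the abstract: an average interval of 1.63 cM, 54.9% and 45.1% of loci, and 51.1% and 48.9% of map length for the At and Dt subgenomes, respectively.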

  4. Herpes simplex virus type 1 (HSV-1) strain HSZP host shutoff gene: nucleotide sequence and comparison with HSV-1 strains differing in early shutoff of host protein synthesis.

    PubMed

    Vojvodová, A; Matis, J; Kúdelová, M; Rajcáni, J

    1997-01-01

    The UL41 gene of the HSZP strain of herpes simplex virus type 1 (HSV-1) defective with respect to the early shutoff of host protein synthesis was sequenced and compared with the corresponding HSV-1 strain KOS and 17 gene sequences. In comparison with strain 17, nine mutations (base changes) were HSZP specific, five KOS specific and four were common for both strains. Nine mutations caused codon changes. Three of these mapped to the nonconserved regions and the others to the conserved regions of the functional map of UL41 gene. One KOS specific mutation mapped to the region responsible for the binding of the virion host shutoff (vhs) protein to the alpha-transinducing factor (VP16). The possible relationship between mutations and host shutoff function is discussed. The nucleotide sequence data of the UL41 gene of HSZP and KOS have been submitted to the GenBank nucleotide database and have been assigned the accession numbers Z72337 and Z72338.

  5. Nucleotide sequence of XhoI O fragment of ectromelia virus DNA reveals significant differences from vaccinia virus.

    PubMed

    Senkevich, T G; Muravnik, G L; Pozdnyakov, S G; Chizhikov, V E; Ryazankina, O I; Shchelkunov, S N; Koonin, E V; Chernos, V I

    1993-10-01

    The nucleotide sequence of the 3913 base pair XhoI O fragment located in an evolutionarily variable region adjacent to the right end of the genome of ectromelia virus (EMV) was determined. The sequence contains two long open reading frames coding for putative proteins of 559 amino acid residues (p65) and 344 amino acid residues (p39). Amino acid database searches showed that p39 is closely related to vaccinia virus (VV), strain WR, B22R gene product (C12L gene product of strain Copenhagen), which belongs to the family of serine protease inhibitors (serpins). Despite the overall high conservation, differences were observed in the sequences of p39, B22R, and C12L in the site known to interact with proteases in other serpins, suggesting that the serpins of EMV and two strains of VV may all inhibit proteases with different specificities. The gene coding for the ortholog of p65 is lacking in the Copenhagen strain of vaccinia virus; the WR strain contains a truncated variant of this gene (B21R) potentially coding for a small protein (p16) corresponding to the C-terminal region of p65. p65 is a new member of the family of poxvirus proteins including vaccinia virus proteins A55R, C2L and F3L, and a group of related proteins of leporipoxviruses, Shope fibroma and myxoma viruses (T6, T8, T9, M9). These proteins are homologous to the Drosophila protein Kelch involved in egg development. Both Kelch protein and the related poxvirus proteins contain two distinct domains. The N-terminal domain is related to the similarly located domains of transcription factors Ttk, Br-C (Drosophila), and KUP (human), and GCL protein involved in early development in Drosophila. The C-terminal domain consists of an array of four to five imperfect repeats and is related to human placental protein MIPP. Phylogenetic analysis of the family of poxvirus proteins showed that their genes have undergone a complex succession of duplications, and complete or partial deletions.

  6. Mixture models of nucleotide sequence evolution that account for heterogeneity in the substitution process across sites and across lineages.

    PubMed

    Jayaswal, Vivek; Wong, Thomas K F; Robinson, John; Poladian, Leon; Jermiin, Lars S

    2014-09-01

    Molecular phylogenetic studies of homologous sequences of nucleotides often assume that the underlying evolutionary process was globally stationary, reversible, and homogeneous (SRH), and that a model of evolution with one or more site-specific and time-reversible rate matrices (e.g., the GTR rate matrix) is enough to accurately model the evolution of data over the whole tree. However, an increasing body of data suggests that evolution under these conditions is an exception, rather than the norm. To address this issue, several non-SRH models of molecular evolution have been proposed, but they either ignore heterogeneity in the substitution process across sites (HAS) or assume it can be modeled accurately using the gamma distribution. As an alternative to these models of evolution, we introduce a family of mixture models that approximate HAS without the assumption of an underlying predefined statistical distribution. This family of mixture models is combined with non-SRH models of evolution that account for heterogeneity in the substitution process across lineages (HAL). We also present two algorithms for searching model space and identifying an optimal model of evolution that is less likely to over- or underparameterize the data. The performance of the two new algorithms was evaluated using alignments of nucleotides with 10 000 sites simulated under complex non-SRH conditions on a 25-tipped tree. The algorithms were found to be very successful, identifying the correct HAL model with a 75% success rate (the average success rate for assigning rate matrices to the tree's 48 edges was 99.25%) and, for the correct HAL model, identifying the correct HAS model with a 98% success rate. Finally, parameter estimates obtained under the correct HAL-HAS model were found to be accurate and precise. The merits of our new algorithms were illustrated with an analysis of 42 337 second codon sites extracted from a concatenation of 106 alignments of orthologous genes encoded by the nuclear

  7. Nucleotide sequence and genomic organization of Aleutian mink disease parvovirus (ADV): sequence comparisons between a nonpathogenic and a pathogenic strain of ADV.

    PubMed Central

    Bloom, M E; Alexandersen, S; Perryman, S; Lechner, D; Wolfinbarger, J B

    1988-01-01

    A DNA sequence of 4,592 nucleotides (nt) was derived for the nonpathogenic ADV-G strain of Aleutian mink disease parvovirus (ADV). The 3' (left) end of the virion strand contained a 117-nt palindrome that could assume a Y-shaped configuration similar to, but less stable than, that of other parvoviruses. The sequence obtained for the 5' end was incomplete and did not contain the 5' (right) hairpin structure but ended just after a 25-nt A + T-rich direct repeat. Features of ADV genomic organization are (i) major left (622 amino acids) and right (702 amino acids) open reading frames (ORFs) in different translational frames of the plus-sense strand, (ii) two short mid-ORFs, (iii) eight potential promoter motifs (TATA boxes), including ones at 3 and 36 map units, and (iv) six potential polyadenylation sites, including three clustered near the termination of the right ORF. Although the overall homology to other parvoviruses is less than 50%, there are short conserved amino acid regions in both major ORFs. However, two regions in the right ORF allegedly conserved among the parvoviruses were not present in ADV. At the DNA level, ADV-G is 97.5% related to the pathogenic ADV-Utah 1. A total of 22 amino acid changes were found in the right ORF; changes were found in both hydrophilic and hydrophobic regions and generally did not affect the theoretical hydropathy. However, there is a short heterogeneous region at 64 to 65 map units in which 8 out of 11 residues have diverged; this hypervariable segment may be analogous to short amino acid regions in other parvoviruses that determine host range and pathogenicity. These findings suggested that this region may harbor some of the determinants responsible for the differences in pathogenicity of ADV-G and ADV-Utah 1. PMID:2839709

  8. The nucleotide sequence of the mitochondrial DNA molecule of the grey seal, Halichoerus grypus, and a comparison with mitochondrial sequences of other true seals.

    PubMed

    Arnason, U; Gullberg, A; Johnsson, E; Ledje, C

    1993-10-01

    The sequence of the mtDNA of the grey seal, Halichoerus grypus, was determined. The length of the molecule was 16,797 base pairs. The organization of the molecule conformed with that of other eutherian mammals but the control region was unusually long due to the presence of two types of repeated motifs. The grey seal and the previously reported harbor seal, Phoca vitulina, belong to different but closely related genera of family Phocidae, true (or earless) seals. In order to determine the degree of differences that may occur between mtDNAs of closely related mammalian genera, the 2 rRNA genes, the 13 peptide coding genes, and the 22 tRNA genes of the 2 species were compared. Total nucleotide difference in the peptide coding genes was 2.0-6.1%. The range of conservative difference was 0.0-1.5%. In the inferred peptide sequences the amino acid difference was 0.0-4.5%, and the difference with respect to chemical properties of amino acids was 0.0-3.0%. A gene that showed a limited degree of difference in one mode of comparison did not necessarily show a corresponding limited difference in another mode. The ratio for differences in codon positions 1, 2, and 3 was approximately 2.7:1:16. The corresponding ratio for conservative differences was approximately 1.8:1.1:1. The evolutionary separation of the two species was calculated to have taken place 2-2.5 million years ago. This dating gives the figure approximately 8 x 10(-9) as the mean rate of substitution per site and year in the entire mtDNA molecule. Comparison with the cytochrome b gene of the Hawaiian monk seal and the Weddell seal suggested that the lineage of these two species and that of the grey and harbor seals separated approximately 8 million years ago. PMID:8308902
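
    The codon-position ratio reported above comes from tallying nucleotide differences separately at the first, second, and third positions of each codon. A minimal sketch of such a tally follows, assuming two gap-free coding sequences of equal length that are already aligned in frame; the function name and the toy sequences are hypothetical and are not taken from the seal mtDNA data.

```python
def differences_by_codon_position(seq_a: str, seq_b: str) -> dict:
    """Count nucleotide differences at codon positions 1, 2 and 3.

    Assumes both sequences are in-frame, gap-free coding sequences of
    equal length (a simplification made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    if len(seq_a) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    counts = {1: 0, 2: 0, 3: 0}
    for index, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        if a != b:
            # index 0, 1, 2 within a codon maps to positions 1, 2, 3.
            counts[index % 3 + 1] += 1
    return counts


# Toy sequences for illustration only.
toy_a = "ATGGCCAAATTT"
toy_b = "CTGGCTAAGTAT"
print(differences_by_codon_position(toy_a, toy_b))
```

    Dividing the three counts by the smallest of them gives a ratio in the same form as the one quoted in the abstract. Third positions typically accumulate the most differences because many third-position changes are synonymous.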

  9. The nucleotide sequence of a DNA fragment, 71 base pairs in length, near the origin of DNA replication of bacteriophage phiX174.

    PubMed Central

    Mansfeld, A D; Vereijken, J M; Jansz, H S

    1976-01-01

    Part of the nucleotide sequence of a restriction fragment covering the origin of phiX174 DNA replication has been determined. The fragment A7c was obtained by digestion of phiX174 RF DNA by the restriction enzyme from Arthrobacter luteus, AluI. It was further cleaved into two fragments, one large and one small, by the action of the restriction enzyme from Haemophilus aegyptius, HaeIII. The nucleotide sequence of the small fragment has been determined by analysis of the transcription products obtained by the action of Escherichia coli DNA-dependent RNA polymerase on denatured template under conditions of low salt. Transcripts longer than the template were found. The whole sequence of 71 nucleotide pairs could be derived from complementary oligonucleotides, obtained after digestion of the transcripts with T1 or pancreatic RNAase. The sequence suggests that at least 4 of the 5 amber mutants that have been mapped on this fragment are identical. On account of this and other evidence a reading frame is proposed. PMID:995652

  10. Complete nucleotide sequence of the linear DNA plasmid pRS224 with hairpin loops from Rhizoctonia solani and its unique transcriptional form.

    PubMed

    Katsura, K; Sasaki, A; Nagasaka, A; Fuji, M; Miyake, Y; Hashiba, T

    2001-10-01

    The complete nucleotide sequence of the linear DNA plasmid (pRS224-1) from the plant-pathogenic fungus Rhizoctonia solani isolate H-16 was determined; and its unique RNA transcripts were characterized. The pRS224-1 DNA consists of 4,986 nucleotides. A computer-based study of the folding of pRS224-1 at both termini predicted hairpin-loop structures. The hairpin loops consisted of the left and right termini of 236 and 264 nucleotides, respectively, and share no sequence homology. Unique poly(A) RNAs, 4.7 kb and 7.4 kb in length and hybridizing with the pRS224 DNA, were found in mycelial cells of R. solani H-16. Transcript product-mapping allowed the prediction of the locations of different expression signals. The 7.4-kb transcript is generated from the left terminal region of the complementary strand, via the full-length sense-strand, to the right terminal region of the complementary strand. The 4.7-kb transcript is generated from the center region of the sense strand to the right terminal region of the complementary strand. One open reading frame (ORF) found in pRS224-1 is 887 amino acids long and has a potential coding capacity of 102 kDa. The ORF contains the highly conserved domains characteristic of reverse transcriptase sequences, including the highly conserved YXDD sequence.

  11. Insertion sequence element single nucleotide polymorphism typing provides insights into the population structure and evolution of Mycobacterium ulcerans across Africa.

    PubMed

    Vandelannoote, Koen; Jordaens, Kurt; Bomans, Pieter; Leirs, Herwig; Durnez, Lies; Affolabi, Dissou; Sopoh, Ghislain; Aguiar, Julia; Phanzu, Delphin Mavinga; Kibadi, Kapay; Eyangoh, Sara; Manou, Louis Bayonne; Phillips, Richard Odame; Adjei, Ohene; Ablordey, Anthony; Rigouts, Leen; Portaels, Françoise; Eddyani, Miriam; de Jong, Bouke C

    2014-02-01

    Buruli ulcer is an indolent, slowly progressing necrotizing disease of the skin caused by infection with Mycobacterium ulcerans. In the present study, we applied a redesigned technique to a vast panel of M. ulcerans disease isolates and clinical samples originating from multiple African disease foci in order to (i) gain fundamental insights into the population structure and evolutionary history of the pathogen and (ii) disentangle the phylogeographic relationships within the genetically conserved cluster of African M. ulcerans. Our analyses identified 23 different African insertion sequence element single nucleotide polymorphism (ISE-SNP) types that dominate in different areas where Buruli ulcer is endemic. These ISE-SNP types appear to be the initial stages of clonal diversification from a common, possibly ancestral ISE-SNP type. ISE-SNP types were found unevenly distributed over the greater West African hydrological drainage basins. Our findings suggest that geographical barriers bordering the basins to some extent prevented bacterial gene flow between basins and that this resulted in independent focal transmission clusters associated with the hydrological drainage areas. Different phylogenetic methods yielded two well-supported sister clades within the African ISE-SNP types. The ISE-SNP types from the "pan-African clade" were found to be widespread throughout Africa, while the ISE-SNP types of the "Gabonese/Cameroonian clade" were much rarer and found in a more restricted area, which suggested that the latter clade evolved more recently. Additionally, the Gabonese/Cameroonian clade was found to form a strongly supported monophyletic group with Papua New Guinean ISE-SNP type 8, which is unrelated to other Southeast Asian ISE-SNP types.

  12. Insertion Sequence Element Single Nucleotide Polymorphism Typing Provides Insights into the Population Structure and Evolution of Mycobacterium ulcerans across Africa

    PubMed Central

    Jordaens, Kurt; Bomans, Pieter; Leirs, Herwig; Durnez, Lies; Affolabi, Dissou; Sopoh, Ghislain; Aguiar, Julia; Phanzu, Delphin Mavinga; Kibadi, Kapay; Eyangoh, Sara; Manou, Louis Bayonne; Phillips, Richard Odame; Adjei, Ohene; Ablordey, Anthony; Rigouts, Leen; Portaels, Françoise; Eddyani, Miriam; de Jong, Bouke C.

    2014-01-01

    Buruli ulcer is an indolent, slowly progressing necrotizing disease of the skin caused by infection with Mycobacterium ulcerans. In the present study, we applied a redesigned technique to a vast panel of M. ulcerans disease isolates and clinical samples originating from multiple African disease foci in order to (i) gain fundamental insights into the population structure and evolutionary history of the pathogen and (ii) disentangle the phylogeographic relationships within the genetically conserved cluster of African M. ulcerans. Our analyses identified 23 different African insertion sequence element single nucleotide polymorphism (ISE-SNP) types that dominate in different areas where Buruli ulcer is endemic. These ISE-SNP types appear to be the initial stages of clonal diversification from a common, possibly ancestral ISE-SNP type. ISE-SNP types were found unevenly distributed over the greater West African hydrological drainage basins. Our findings suggest that geographical barriers bordering the basins to some extent prevented bacterial gene flow between basins and that this resulted in independent focal transmission clusters associated with the hydrological drainage areas. Different phylogenetic methods yielded two well-supported sister clades within the African ISE-SNP types. The ISE-SNP types from the “pan-African clade” were found to be widespread throughout Africa, while the ISE-SNP types of the “Gabonese/Cameroonian clade” were much rarer and found in a more restricted area, which suggested that the latter clade evolved more recently. Additionally, the Gabonese/Cameroonian clade was found to form a strongly supported monophyletic group with Papua New Guinean ISE-SNP type 8, which is unrelated to other Southeast Asian ISE-SNP types. PMID:24296504

  14. Analysis of the complete nucleotide sequence of the picornavirus Theiler's murine encephalomyelitis virus indicates that it is closely related to cardioviruses.

    PubMed Central

    Pevear, D C; Calenoff, M; Rozhon, E; Lipton, H L

    1987-01-01

    Theiler's murine encephalomyelitis viruses (TMEV) are naturally occurring enteric pathogens of mice which constitute a separate serological group within the picornavirus family. Persistent TMEV infection in mice provides a relevant experimental animal model for the human demyelinating disease multiple sclerosis. To provide information about the TMEV classification, genome organization, and protein processing map, we determined the complete nucleotide sequence of the TMEV genome and deduced the amino acid sequence of the polyprotein coding region. The RNA genome, which is typical of the picornavirus family, is 8,098 nucleotides long. The 5' untranslated region is 1,064 nucleotides long (making it the longest in the picornavirus family after the aphthoviruses) and lacks a poly(C) tract. Computer-generated comparison of the 5' and 3' noncoding regions and polyprotein revealed the highest level of nucleotide and predicted amino acid identity between the TMEV and the cardioviruses encephalomyocarditis virus (EMCV) and Mengo virus. The TMEV polyprotein, which appears to be processed like EMCV since the amino acids flanking the putative proteolytic cleavage sites have been conserved, begins with a short leader peptide followed by 11 other gene products in the standard L-4-3-4 picornavirus arrangement. Because of these similarities, we propose that the TMEV be grouped with the cardioviruses. However, since TMEV and EMCV have different biophysical properties and show no cross-neutralization, they most likely belong in a separate cardiovirus subgroup. PMID:3033278

  15. Avocado cellulase: nucleotide sequence of a putative full-length cDNA clone and evidence for a small gene family.

    PubMed

    Tucker, M L; Durbin, M L; Clegg, M T; Lewis, L N

    1987-05-01

    A cDNA library was prepared from ripe avocado fruit (Persea americana Mill. cv. Hass) and screened for clones hybridizing to a 600 bp cDNA clone (pAV5) coding for avocado fruit cellulase. This screening led to the isolation of a clone (pAV363) containing a 2021 nucleotide transcribed sequence and an approximately 150 nucleotide poly(A) tail. Hybridization of pAV363 to a northern blot shows that the length of the homologous message is approximately 2.2 kb. The nucleotide sequence of this putative full-length mRNA clone contains an open reading frame of 1482 nucleotides which codes for a polypeptide of 54.1 kD. The deduced amino acid composition compares favorably with the amino acid composition of native avocado cellulase determined by amino acid analysis. Southern blot analysis of Hind III and Eco RI endonuclease digested genomic DNA indicates a small family of cellulase genes.
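    The relation between the quoted open reading frame length (1482 nucleotides) and the quoted polypeptide mass (54.1 kD) can be checked with a short calculation. The sketch below is only illustrative: the function name is hypothetical, and whether the 1482 nucleotides include the stop codon is an assumption, since the abstract does not say.

```python
def orf_summary(orf_nt, mass_kda, includes_stop=True):
    """Return (residue count, average residue mass in Da) for an ORF.

    orf_nt        -- ORF length in nucleotides (1482 in the abstract)
    mass_kda      -- polypeptide mass in kDa (54.1 in the abstract)
    includes_stop -- ASSUMPTION: the abstract does not state whether the
                     stop codon is counted in the ORF length
    """
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nt // 3
    residues = codons - 1 if includes_stop else codons
    average_residue_mass = mass_kda * 1000 / residues
    return residues, average_residue_mass


if __name__ == "__main__":
    residues, avg_mass = orf_summary(1482, 54.1)
    print(f"{residues} residues, about {avg_mass:.1f} Da per residue")
```

    Under either assumption the average comes out near 110 Da per residue, which is the usual rule-of-thumb value for proteins, so the two figures in the abstract are mutually consistent.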

  16. Single nucleotide polymorphism discovery in cutthroat trout subspecies using genome reduction, barcoding, and 454 pyro-sequencing

    PubMed Central

    2012-01-01

    Background: Salmonids are popular sport fishes, and as such have been subjected to widespread stocking throughout western North America. Historically, stocking was done with little regard for genetic variation among populations and has resulted in genetic mixing among species and subspecies in many areas, thus putting the genetic integrity of native salmonid populations at risk and creating a need to assess the genetic constitution of native salmonid populations. Cutthroat trout is a salmonid species with pronounced geographic structure (there are 10 extant subspecies) and a recent history of hybridization with introduced rainbow trout in many populations. Genetic admixture has also occurred among cutthroat trout subspecies in areas where introductions have brought two or more subspecies into contact. Consequently, management agencies have increased their efforts to evaluate the genetic composition of cutthroat trout populations to identify populations that remain uncompromised and manage them accordingly, but additional genetic markers are needed to do so effectively. Here we used genome reduction, MID-barcoding, and 454-pyrosequencing to discover single nucleotide polymorphisms that differentiate cutthroat trout subspecies and can be used as a rapid, cost-effective method to characterize the genetic composition of cutthroat trout populations. Results: Thirty cutthroat and six rainbow trout individuals were subjected to genome reduction and next-generation sequencing. A total of 1,499,670 reads averaging 379 base pairs in length were generated by 454-pyrosequencing, resulting in 569,060,077 total base pairs sequenced. A total of 43,558 putative SNPs were identified, and of those, 125 SNP primers were developed that successfully amplified 96 cutthroat trout and rainbow trout individuals. These SNP loci were able to differentiate most cutthroat trout subspecies using distance methods and Structure analyses. Conclusions: Genomic and bioinformatic protocols were
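    The sequencing totals quoted in this record can be cross-checked with simple arithmetic. The following is a minimal sketch using only the figures given in the abstract; the constant and function names are hypothetical.

```python
# Figures quoted in the abstract above.
TOTAL_READS = 1_499_670
TOTAL_BASE_PAIRS = 569_060_077
PUTATIVE_SNPS = 43_558
SNP_PRIMERS = 125


def mean_read_length(total_bp, n_reads):
    """Average read length in base pairs."""
    return total_bp / n_reads


def fraction(part, whole):
    """Share of `whole` represented by `part`, as a value in [0, 1]."""
    return part / whole


if __name__ == "__main__":
    mean_len = mean_read_length(TOTAL_BASE_PAIRS, TOTAL_READS)
    share = fraction(SNP_PRIMERS, PUTATIVE_SNPS)
    print(f"mean read length: {mean_len:.1f} bp")
    print(f"putative SNPs with working primers: {share:.2%}")
```

    The mean read length rounds to the 379 base pairs stated in the abstract, and the primers that amplified successfully cover well under 1% of the putative SNPs.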

  17. Indigenous and introduced potyviruses of legumes and Passiflora spp. from Australia: biological properties and comparison of coat protein nucleotide sequences.

    PubMed

    Coutts, Brenda A; Kehoe, Monica A; Webster, Craig G; Wylie, Stephen J; Jones, Roger A C

    2011-10-01

    Five Australian potyviruses, passion fruit woodiness virus (PWV), passiflora mosaic virus (PaMV), passiflora virus Y (PaVY), clitoria chlorosis virus (ClCV) and hardenbergia mosaic virus (HarMV), and two introduced potyviruses, bean common mosaic virus (BCMV) and cowpea aphid-borne mosaic virus (CAbMV), were detected in nine wild or cultivated Passiflora and legume species growing in tropical, subtropical or Mediterranean climatic regions of Western Australia. When ClCV (1), PaMV (1), PaVY (8) and PWV (5) isolates were inoculated to 15 plant species, PWV and two PaVY P. foetida isolates infected P. edulis and P. caerulea readily but legumes only occasionally. Another PaVY P. foetida isolate resembled five PaVY legume isolates in infecting legumes readily but not infecting P. edulis. PaMV resembled PaVY legume isolates in legumes but also infected P. edulis. ClCV did not infect P. edulis or P. caerulea and behaved differently from PaVY legume isolates and PaMV when inoculated to two legume species. When complete coat protein (CP) nucleotide (nt) sequences of 33 new isolates were compared with 41 others, PWV (8), HarMV (4), PaMV (1) and ClCV (1) were within a large group of Australian isolates, while PaVY (14), CAbMV (1) and BCMV (3) isolates were in three other groups. Variation among PWV and PaVY isolates was sufficient for division into four clades each (I-IV). A variable block of 56 amino acid residues at the N-terminal region of the CPs of PaMV and ClCV distinguished them from PWV. Comparison of PWV, PaMV and ClCV CP sequences showed that nt identities were both above and below the 76-77% potyvirus species threshold level. This research gives insights into invasion of new hosts by potyviruses at the natural vegetation and cultivated area interface, and illustrates the potential of indigenous viruses to emerge to infect introduced plants. PMID:21744001

  18. Genome-wide association study for endocrine fertility traits using single nucleotide polymorphism arrays and sequence variants in dairy cattle.

    PubMed

    Tenghe, A M M; Bouwman, A C; Berglund, B; Strandberg, E; de Koning, D J; Veerkamp, R F

    2016-07-01

    Endocrine fertility traits, which are defined from progesterone concentration levels in milk, are interesting indicators of dairy cow fertility because they more directly reflect the cow's own reproductive physiology than classical fertility traits, which are more biased by farm management decisions. The aim of this study was to detect quantitative trait loci (QTL) for 7 endocrine fertility traits in dairy cows by performing a genome-wide association study with 85k single nucleotide polymorphisms (SNP), and then fine-map targeted QTL regions, using imputed sequence variants. Two classical fertility traits were also analyzed for QTL with 85k SNP. The association between a SNP and a phenotype was assessed by single-locus regression for each SNP, using a linear mixed model that included a random polygenic effect. A total of 2,447 Holstein Friesian cows with 5,339 lactations with both phenotypes and genotypes were used for association analysis. Heritability estimates ranged from 0.09 to 0.15 for endocrine fertility traits and 0.03 to 0.10 for classical fertility traits. The genome-wide association study identified 17 QTL regions for endocrine fertility traits on Bos taurus autosomes (BTA) 2, 3, 8, 12, 15, 17, 23, and 25. The highest number (5) of QTL regions from the genome-wide association study was identified for the endocrine trait "proportion of samples with luteal activity." Overlapping QTL regions were found between endocrine traits on BTA 2, 3, and 17. For the classical trait calving to first service, 3 QTL regions were identified on BTA 3, 15, and 23, and an overlapping region was identified on BTA 23 with endocrine traits. Fine-mapping target regions for the endocrine traits on BTA 2 and 3 using imputed sequence variants confirmed the QTL from the genome-wide association study, and identified several associated variants that can contribute to an index of markers for genetic improvement of fertility. Several potential candidate genes underlying endocrine

  19. Nucleotide sequence of the afimbrial-adhesin-encoding afa-3 gene cluster and its translocation via flanking IS1 insertion sequences.

    PubMed Central

    Garcia, M I; Labigne, A; Le Bouguenec, C

    1994-01-01

    The afa gene clusters encode afimbrial adhesins (AFAs) that are expressed by uropathogenic and diarrhea-associated Escherichia coli strains. The plasmid-borne afa-3 gene cluster is responsible for the biosynthesis of the AFA-III adhesin that belongs to the Dr family of hemagglutinins. Reported in this work is the nucleotide sequence of the 9.2-kb insert of the recombinant plasmid pILL61, which contains the afa-3 gene cluster cloned from a cystitis-associated E. coli strain (A30). The afa-3 gene cluster was shown to contain six open reading frames, designated afaA to afaF. It was organized in two divergent transcriptional units. Five of the six Afa products showed marked homologies with proteins encoded by previously described adhesion systems that allowed us to attribute to each of them a putative function in the biogenesis of the AFA-III adhesin. AfaE was identified as the structural adhesin product, whereas AfaB and AfaC were recognized as periplasmic chaperone and outer membrane anchor proteins, respectively. The AfaA and AfaF products were shown to be homologous to the PapI-PapB transcriptional regulatory proteins. No function could be attributed to the AfaD product, the gene of which was previously shown to be dispensable for the synthesis of a functional adhesin. Upstream of the afa-3 gene cluster, a 1.2-kb region was found to be 96% identical to the RepFIB sequence of one of the enterotoxigenic E. coli plasmids (P307), suggesting a common ancestor plasmid. This region contains an integrase-like gene (int). Sequence analysis revealed the presence of an IS1 element between the int gene and the afa-3 gene cluster. Two other IS1 elements were detected and located in the vicinity of the afa-3 gene cluster by hybridization experiments. The afa-3 gene cluster was therefore found to be flanked by two IS1 elements in direct orientation and two in opposite orientations. The afa-3 gene cluster, flanked by two directly oriented IS1 elements, was shown to translocate

  20. Nucleotide cleaving agents and method

    DOEpatents

    Que, Jr., Lawrence; Hanson, Richard S.; Schnaith, Leah M. T.

    2000-01-01

    The present invention provides a unique series of nucleotide cleaving agents and a method for cleaving a nucleotide sequence, whether single-stranded or double-stranded DNA or RNA, using a cationic metal complex having at least one polydentate ligand to cleave the nucleotide sequence phosphate backbone to yield a hydroxyl end and a phosphate end.

  1. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... approved by the Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51... PATENT CASES Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide...

  2. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... approved by the Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51... PATENT CASES Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide...

  3. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... approved by the Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51... PATENT CASES Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide...

  4. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... approved by the Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51... PATENT CASES Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide...

  5. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... approved by the Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51... PATENT CASES Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide...

  6. Buffalo (Bubalus bubalis) interleukin-2: sequence analysis reveals high nucleotide and amino acid identity with interleukin-2 of cattle and other ruminants.

    PubMed

    Sreekumar, E; Premraj, A; Saravanakumar, M; Rasool, T J

    2002-08-01

    A 4400-bp genomic sequence and a 332-bp truncated cDNA sequence of the interleukin-2 (IL-2) gene of Indian water buffalo (Bubalus bubalis) were amplified by polymerase chain reaction and cloned. The coding sequence of the buffalo IL-2 gene was assembled from the 5' end of the genomic clone and the truncated cDNA clone. This sequence had 98.5% nucleotide identity and 98% amino acid identity with cattle IL-2. Three amino acid substitutions were observed at positions 63, 124 and 135. Comparison of the predicted protein structure of buffalo IL-2 with that of human and cattle IL-2 did not reveal significant differences. The putative amino acids responsible for IL-2 receptor binding were conserved in buffalo, cattle and human IL-2. The amino acid sequence of buffalo IL-2 also showed very high identity with that of other ruminants, indicating functional cross-reactivity.

  7. The complete nucleotide sequence of the RNA2 of the crinivirus tomato infectious chlorosis virus: isolates from North America and Europe are essentially identical.

    PubMed

    Orílio, Anelise F; Navas-Castillo, Jesús

    2009-01-01

    The complete nucleotide sequences of the RNA2 of two isolates of Tomato infectious chlorosis virus (TICV, genus Crinivirus, family Closteroviridae) from the United States and Spain, respectively, were determined. The sequences of both isolates were found to be nearly identical. TICV RNA2 consisted of 7,914 nucleotides in both isolates and contains eight open reading frames that encompass the Closteroviridae hallmark gene array represented by a heat shock protein 70 family homologue, a protein of 59 kDa, the major coat protein, and a divergent copy of the coat protein. Phylogenetic analysis suggested that TICV is most similar to Lettuce infectious yellows virus (LIYV), the type species of the genus Crinivirus. PMID:19288051

  8. Cloning and partial nucleotide sequence of human immunoglobulin mu chain cDNA from B cells and mouse-human hybridomas.

    PubMed Central

    Dolby, T W; Devuono, J; Croce, C M

    1980-01-01

    Purified mRNAs coding for mu and kappa human immunoglobulin polypeptides were translated in vitro and their products were characterized. The mu-specific mRNAs, derived from both human lymphoblastoid cells (GM607) and from a mouse-human somatic cell hybrid secreting human mu chains (alpha D5-H11-BC11), were copied into cDNAs and inserted into the plasmid pBR322. Several recombinant cDNAs that were obtained were identified by a combination of colony hybridization with labeled probes, in vitro translation of plasmid-selected mu mRNAs, and DNA nucleotide sequence determination. One recombinant DNA, for which the sequence has been partially determined, contains the codons for part of the C3 constant region domain through the carboxy-terminal piece (155 amino acids total) as well as the entire 3' noncoding sequence up to the poly(A) site of the human mu mRNA. The sequence A-A-U-A-A occurs 12 nucleotides prior to the poly(A) addition site in the human mu mRNA. Considerable sequence homology is observed in the mouse and human mu mRNA 3' coding and noncoding sequences. PMID:6777778

  9. Genotyping of human parvovirus B19 in clinical samples from Brazil and Paraguay using heteroduplex mobility assay, single-stranded conformation polymorphism and nucleotide sequencing.

    PubMed

    Mendonça, Marcos César Lima de; Ferreira, Ana Maria de Amorim; Santos, Marta Gonçalves Matos dos; Oviedo, Elva Cristina; Bello, Maria Sônia Dal; Siqueira, Marilda Mendonça; Maceira, Juan Manuel Piñeiro; von Hubinger, Maria Genoveva; Couceiro, José Nelson dos Santos Silva

    2011-06-01

    Heteroduplex mobility assay, single-stranded conformation polymorphism and nucleotide sequencing were utilised to genotype human parvovirus B19 samples from Brazil and Paraguay. Ninety-seven serum samples were collected from individuals presenting with abortion or erythema infectiosum, arthropathies, severe anaemia and transient aplastic crisis; two additional skin samples were collected by biopsy. After the procedure, all clinical samples were classified as genotype 1.

  10. The nucleotide sequence of HLA-B*2704 reveals a new amino acid substitution in exon 4 which is also present in HLA-B*2706

    SciTech Connect

    Rudwaleit, M.; Bowness, P.; Wordsworth, P.

    1996-12-31

    The HLA-B27 subtype HLA-B*2704 is virtually absent in Caucasians but common in Orientals, where it is associated with ankylosing spondylitis. The amino acid sequence of HLA-B*2704 has been established by peptide mapping and was shown to differ by two amino acids from HLA-B*2705. HLA-B*2704 is characterized by a serine for aspartic acid substitution at position 77 and glutamic acid for valine at position 152. To date, however, no nucleotide sequence confirming these changes at the DNA level has been published. 13 refs., 2 figs.

  11. Non-redundant patent sequence databases with value-added annotations at two levels.

    PubMed

    Li, Weizhong; McWilliam, Hamish; de la Torre, Ana Richart; Grodowski, Adam; Benediktovich, Irina; Goujon, Mickael; Nauche, Stephane; Lopez, Rodrigo

    2010-01-01

    The European Bioinformatics Institute (EMBL-EBI) provides public access to patent data, including abstracts, chemical compounds and sequences. Sequences can appear multiple times due to the filing of the same invention with multiple patent offices, or the use of the same sequence by different inventors in different contexts. Information relating to the source invention may be incomplete, and biological information available in patent documents elsewhere may not be reflected in the annotation of the sequence. Search and analysis of these data have become increasingly challenging for both the scientific and intellectual-property communities. Here, we report a collection of non-redundant patent sequence databases, which cover the EMBL-Bank nucleotides patent class and the patent protein databases and contain value-added annotations from patent documents. The databases were created at two levels by the use of sequence MD5 checksums. Sequences within a level-1 cluster are 100% identical over their whole length. Level-2 clusters were defined by sub-grouping level-1 clusters based on patent family information. Value-added annotations, such as publication number corrections, earliest publication dates and feature collations, significantly enhance the quality of the data, allowing for better tracking and cross-referencing. The databases are available at http://www.ebi.ac.uk/patentdata/nr/.
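    The two-level clustering described in this abstract (identical sequences grouped by MD5 checksum, then sub-grouped by patent family) can be sketched in a few lines. This is a minimal illustration, not the EMBL-EBI pipeline: the record fields (`id`, `sequence`, `family`), the function names, and the upper-casing of sequences before hashing are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict


def level1_clusters(records):
    """Group records whose sequences are identical over their whole length.

    Each record is a dict with hypothetical keys 'id', 'sequence', 'family'.
    Upper-casing before hashing is an assumed normalization step.
    """
    clusters = defaultdict(list)
    for rec in records:
        normalized = rec["sequence"].upper()
        checksum = hashlib.md5(normalized.encode("ascii")).hexdigest()
        clusters[checksum].append(rec)
    return dict(clusters)


def level2_clusters(level1):
    """Split each level-1 cluster by patent family identifier."""
    clusters = defaultdict(list)
    for checksum, members in level1.items():
        for rec in members:
            clusters[(checksum, rec["family"])].append(rec)
    return dict(clusters)


if __name__ == "__main__":
    # Invented example records, for illustration only.
    records = [
        {"id": "A1", "sequence": "ACGTACGT", "family": "F1"},
        {"id": "A2", "sequence": "acgtacgt", "family": "F1"},
        {"id": "A3", "sequence": "ACGTACGT", "family": "F2"},
        {"id": "B1", "sequence": "TTTTGGGG", "family": "F3"},
    ]
    l1 = level1_clusters(records)
    l2 = level2_clusters(l1)
    print(len(l1), "level-1 clusters;", len(l2), "level-2 clusters")
```

    Hashing makes the identity test a dictionary lookup rather than a pairwise comparison, which is presumably why a checksum suits a large, highly redundant collection.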

  12. Characterization of pancreatic ductal adenocarcinoma using whole transcriptome sequencing and copy number analysis by single-nucleotide polymorphism array.

    PubMed

    Di Marco, Mariacristina; Astolfi, Annalisa; Grassi, Elisa; Vecchiarelli, Silvia; Macchini, Marina; Indio, Valentina; Casadei, Riccardo; Ricci, Claudio; D'Ambra, Marielda; Taffurelli, Giovanni; Serra, Carla; Ercolani, Giorgio; Santini, Donatella; D'Errico, Antonia; Pinna, Antonio Daniele; Minni, Francesco; Durante, Sandra; Martella, Laura Raffaella; Biasco, Guido

    2015-11-01

    The aim of the current study was to implement whole transcriptome massively parallel sequencing (RNASeq) and copy number analysis to investigate the molecular biology of pancreatic ductal adenocarcinoma (PDAC). Samples from 16 patients with PDAC were collected by ultrasound‑guided biopsy or from surgical specimens for DNA and RNA extraction. All samples were analyzed by RNASeq performed at 75x2 base pairs on a HiScanSQ Illumina platform. Single‑nucleotide variants (SNVs) were detected with SNVMix and filtered on dbSNP, 1000 Genomes and Cosmic. Non‑synonymous SNVs were analyzed with SNPs&GO and PROVEAN. A total of 13 samples were analyzed by high resolution copy number analysis on an Affymetrix SNP array 6.0. RNAseq resulted in an average of 264 coding non‑synonymous novel SNVs (ranging from 146‑374) and 16 novel insertions or deletions (In/Dels) (ranging from 6‑24) for each sample, of which a mean of 11.2% were disease‑associated and somatic events, while 34.7% were frameshift somatic In/Dels. From this analysis, alterations in the known oncogenes associated with PDAC were observed, including Kirsten rat sarcoma viral oncogene homolog (KRAS) mutations (93.7%) and inactivation of cyclin‑dependent kinase inhibitor 2A (CDKN2A) (50%), mothers against decapentaplegic homolog 4 (SMAD4) (50%), and tumor protein 53 (TP53) (56%). One case that was negative for KRAS exhibited a G13D neuroblastoma RAS viral oncogene homolog mutation. In addition, gene fusions were detected in 10 samples for a total of 23 different intra‑ or inter‑chromosomal rearrangements; however, a recurrent fusion transcript remains to be identified. SNP arrays identified macroscopic and cryptic cytogenetic alterations in 85% of patients. Gains were observed in the chromosome arms 6p, 12p, 18q and 19q which contain KRAS, GATA binding protein 6, protein kinase B and cyclin D3. Deletions were identified on chromosome arms 1p, 9p, 6p, 18q, 10q, 15q, 17p, 21q and 19q which involve TP53

  14. Importance of purine and pyrimidine content of local nucleotide sequences (six bases long) for evolution of the human immunodeficiency virus type 1.

    PubMed Central

    Doi, H

    1991-01-01

    Human immunodeficiency virus type 1 evolves rapidly, and random base change is thought to act as a major factor in this evolution. However, segments of the viral genome differ in their variability: there is the highly variable env gene, particularly hypervariable regions located within env, and, in contrast, the conservative gag and pol genes. Computer analysis of the nucleotide sequences of human immunodeficiency virus type 1 isolates reveals that base substitution in this virus is nonrandom and affected by local nucleotide sequences. Certain local sequences 6 base pairs long are excessively frequent in the hypervariable regions. These sequences exhibit base-substitution hotspots at specific positions in their 6 bases. The hotspots tend to be nonsilent letters of codons in the hypervariable regions--thus leading to marked amino acid substitutions there. Conversely, in the conservative gag and pol genes the hotspots tend to be silent letters because of a difference in codon frame from the hypervariable regions. Furthermore, base substitutions in the local sequences that frequently appear in the conservative genes occurred at a low level, even within the variable env. Thus, despite the high variability of this virus, the conservative genes and their products could be conserved. These may be some of the strategies evolved in human immunodeficiency virus type 1 to allow for positive-selection pressures, such as the host immune system, and negative-selection pressures on the conservative gene products. Images PMID:1924392
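
    The kind of tally behind a statement such as "certain local sequences 6 base pairs long are excessively frequent" can be sketched as follows. This is a minimal illustration only, not the author's method: the function names, the add-one pseudocount and the idea of comparing a region against a background sequence are all assumptions made for the example.

```python
from collections import Counter


def count_kmers(seq: str, k: int = 6) -> Counter:
    """Count overlapping k-mers (hexamers by default) in a nucleotide sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def enrichment(region: str, background: str, k: int = 6) -> dict:
    """Ratio of each k-mer's relative frequency in `region` to that in `background`.

    Hypothetical helper: a crude add-one pseudocount keeps k-mers that are
    absent from the background from causing a division by zero.
    """
    reg, bg = count_kmers(region, k), count_kmers(background, k)
    reg_total, bg_total = sum(reg.values()), sum(bg.values())
    ratios = {}
    for kmer, n in reg.items():
        bg_freq = (bg.get(kmer, 0) + 1) / (bg_total + 1)
        ratios[kmer] = (n / reg_total) / bg_freq
    return ratios
```

    Ratios well above 1 would flag hexamers that are over-represented in the chosen region relative to the background; a real analysis would also need a statistical test and a carefully chosen background.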

  15. Nucleotide sequence analysis of Adh genes estimates the time of geographic isolation of the Bogota population of Drosophila pseudoobscura.

    PubMed Central

    Schaeffer, S W; Miller, E L

    1991-01-01

    The population of Drosophila pseudoobscura at Bogota, Colombia, is geographically and partially reproductively isolated from populations in the main body of the species in North America. The degree of genetic differentiation and time of divergence between populations at Bogota and Apple Hill, CA, were estimated by comparison of 3388 nucleotides in the alcohol dehydrogenase region (Adh and Adh-Dup genes) of 18 strains. Of the 146 polymorphic nucleotide sites detected, 68 and 31 were unique to the Apple Hill and Bogota samples, respectively, and 53 were shared. On the basis of an observed net divergence per nucleotide site of 0.264% between the two samples, the Bogota and North American populations were estimated to have been separated for at least 155,000 years. This divergence time suggests that D. pseudoobscura extended its range from North America to South America in a period of Pleistocene glaciation, when habitat suitable for the species presumably existed in lowland Central America. PMID:2068088

  16. The promoter region of the arg3 gene in Saccharomyces cerevisiae: nucleotide sequence and regulation in an arg3-lacZ gene fusion.

    PubMed

    Crabeel, M; Huygen, R; Cunin, R; Glansdorff, N

    1983-01-01

    We have determined the DNA sequence for the 5' end of the arg3 gene of Saccharomyces cerevisiae, including part of the coding region and the 200 nucleotides immediately upstream. A promoter-deletion mutant was found to have lost all of the sequence normally lying in front of the gene except for the 33 nucleotides preceding the AUG codon. The role of the 5' domain in initiation and regulation of arg3 transcription was assessed by a gene fusion experiment. The Escherichia coli lacZ gene, truncated of its eight amino-terminal codons, was substituted in vitro, on a 2mu plasmid, for the carboxy-terminal and 3'-flanking regions of arg3, leaving only the first 19 proximal codons and approximately 1600 nucleotides of the region preceding arg3 on the yeast chromosome. The fused gene was expressed in phase and was still subject to the two mechanisms regulating the wild-type arg3 gene: the general, probably transcriptional control of amino acid biosynthesis and the specific, apparently post-transcriptional control mediated by the products of the argR genes. These results suggest a determining role for the 5' end portion of the arg3 messenger in the specific arginine-mediated control mechanism. PMID:11894927

  17. In vivo commitment to splicing in yeast involves the nucleotide upstream from the branch site conserved sequence and the Mud2 protein.

    PubMed

    Rain, J C; Legrain, P

    1997-04-01

    Pre-mRNA splicing is a stepwise nuclear process involving intron recognition and the assembly of the spliceosome followed by intron excision. We previously developed a pre-mRNA export assay that allows the discrimination between early steps of spliceosome formation and splicing per se. Here we present evidence that these two assays detect different biochemical defects for point mutations. Mutations at the 5' splice site lead to pre-mRNA export, whereas 3' splice site mutations do not. A genetic screen applied to mutants in the branch site region shows that all positions in the conserved TACTAAC sequence are important for intron recognition. An exhaustive analysis of pre-mRNA export and splicing defects of these mutants shows that the in vivo recognition of the branch site region does not involve the base pairing of U2 snRNA with the pre-mRNA. In addition, the nucleotide preceding the conserved TACTAAC sequence contributes to the recognition process. We show that a T residue at this position allows for optimal intron recognition and that in natural introns, this nucleotide is also used preferentially. Moreover, the Mud2 protein is involved in the recognition of this nucleotide, thus establishing a role for this factor in the in vivo splicing pathway.
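
    To see how often a given base precedes the conserved TACTAAC sequence in a set of introns, one could tally it as in the sketch below. The function names and the input format (plain intron strings) are hypothetical, and this is not the authors' genetic screen, only a sequence-level illustration of the position they describe.

```python
import re
from collections import Counter

BRANCH_SITE = "TACTAAC"  # conserved branch site sequence named in the abstract


def preceding_nucleotides(intron: str) -> list:
    """Return the base immediately upstream of each TACTAAC occurrence.

    RNA input is accepted by mapping U to T. An occurrence at the very
    start of the string has no upstream base and is skipped.
    """
    intron = intron.upper().replace("U", "T")
    # A lookahead lets every start position be tested; group 1 is the upstream base.
    pattern = re.compile(rf"(?=([ACGT]){BRANCH_SITE})")
    return [m.group(1) for m in pattern.finditer(intron)]


def tally_preceding(introns) -> Counter:
    """Count upstream bases across a collection of intron sequences."""
    counts = Counter()
    for intron in introns:
        counts.update(preceding_nucleotides(intron))
    return counts
```

    Applied to a collection of natural intron sequences, such a tally would show whether T is indeed the most common base at this position, as the abstract reports.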

  18. Phylogenetic relationships among Hepatozoon species from snakes, frogs and mosquitoes of Ontario, Canada, determined by ITS-1 nucleotide sequences and life-cycle, morphological and developmental characteristics.

    PubMed

    Smith, T G; Kim, B; Desser, S S

    1999-02-01

    The molecular biological characteristics of Hepatozoon species infecting various species of snakes, frogs and mosquitoes were investigated by determining the nucleotide sequences of the first internal transcribed spacer region. A phylogenetic analysis was performed on seven isolates of Hepatozoon infecting snakes, including Hepatozoon sipedon and four morphologically similar but not identical forms, and two isolates of Hepatozoon catesbianae infecting Green frogs (Rana clamitans melanota). This analysis, which utilised data from first internal transcribed spacer nucleotide sequences, morphological and morphometric features of gamonts, oocysts and sporocysts, and previously determined life-cycle and host-specificity characteristics, revealed that H. sipedon is a polymorphic species with a wide host and geographic range. Four synapomorphies, including two nucleotide substitutions and two morphological character state changes, supported a monophyletic group of six isolates of H. sipedon from the central region of Ontario, which was the sister group for an isolate (HW1) from the southern part of the province. Based on the results of this study, an evaluation of which criteria are useful for describing species of Hepatozoon is presented, with the intent of curtailing the practice of naming species based on morphological features of gamonts or on incomplete life-cycle data.

  19. Complete nucleotide sequence and organization of the mitogenome of the red-spotted apollo butterfly, Parnassius bremeri (Lepidoptera: Papilionidae) and comparison with other lepidopteran insects.

    PubMed

    Kim, Man Il; Baek, Jee Yeon; Kim, Min Jee; Jeong, Heon Cheon; Kim, Ki-Gyoung; Bae, Chang Hwan; Han, Yeon Soo; Jin, Byung Rae; Kim, Iksoo

    2009-10-31

    The 15,389-bp long complete mitogenome of the endangered red-spotted apollo butterfly, Parnassius bremeri (Lepidoptera: Papilionidae) was determined in this study. The start codon for the COI gene in insects has been extensively discussed, and has long remained a matter of some controversy. Herein, we propose that the CGA (arginine) sequence functions as the start codon for the COI gene in lepidopteran insects, on the basis of complete mitogenome sequences of lepidopteran insects, including P. bremeri, as well as additional sequences of the COI start region from a diverse taxonomic range of lepidopteran species (a total of 53 species from 15 families). In our extensive search for a tRNA-like structure in the A+T-rich region, one tRNA(Trp)-like sequence and one tRNA(Leu) (UUR)-like sequence were detected in the P. bremeri A+T-rich region, and one or more tRNA-like structures were detected in the A+T-rich region of the majority of other sequenced lepidopteran insects, thereby indicating that such features occur frequently in the lepidopteran mitogenomes. Phylogenetic analysis using the concatenated 13 amino acid sequences and nucleotide sequences of PCGs of the four macrolepidopteran superfamilies together with the Tortricoidea and Pyraloidea resulted in the successful recovery of a monophyly of Papilionoidea and a monophyly of Bombycoidea. However, the Geometroidea were unexpectedly identified as a sister group of the Bombycoidea, rather than the Papilionoidea. PMID:19823774

  1. The complete nucleotide sequence of PEBV RNA2 reveals the presence of a novel open reading frame and provides insights into the structure of tobraviral subgenomic promoters.

    PubMed Central

    Goulden, M G; Lomonossoff, G P; Davies, J W; Wood, K R

    1990-01-01

    The 3374 nucleotide sequence of RNA2 from the British PEBV strain SP5 has been determined. The RNA includes three open reading frames flanked by 5' and 3' noncoding regions of 509 and 480 nucleotides. The open reading frames specify coat protein, a 29.6K product homologous to the 29.1K product of TRV(TCM) RNA2 and a 23K product not homologous to any previously described protein. The homology demonstrated between the coat proteins of PRV, TRV and PEBV indicates a common evolutionary origin for these proteins. Upstream of each ORF are located sequences homologous to those with which subgenomic RNAs of other tobraviruses start. Subgenomic RNAs for the expression of the three ORFs may start at these points. On all five tobraviral RNA2 molecules sequenced to date, these sequences were found upstream of the coat protein ORF in association with a strongly-conserved potential secondary structural element. Similar potential structures were identified upstream of other tobraviral ORFs. These structures may contribute to the activity of the tobraviral subgenomic promoter. Images PMID:2388830

  2. Molecular cloning and nucleotide sequences of the complementary DNAs to chicken skeletal muscle myosin two alkali light chain mRNAs.

    PubMed Central

    Nabeshima, Y; Fujii-Kuriyama, Y; Muramatsu, M; Ogata, K

    1982-01-01

    We report here the molecular cloning and sequence analysis of DNAs complementary to mRNAs for myosin alkali light chain of chicken embryo and adult leg skeletal muscle. pSMA2-1 contained an 818 base-pair insert that includes the entire coding region and 5' and 3' untranslated regions of A2 mRNA. pSMA1-1 contained an 848 base-pair insert that included the 3' untranslated region and almost all of the coding region except for the N-terminal 13 amino acid residues of the A1 light chain. The 741 nucleotide sequences of A1 and A2 mRNAs corresponding to C-terminal 141 amino acid residues and 3' untranslated regions were identical. The 5' terminal nucleotide sequences corresponding to N-terminal 35 amino acid residues of A1 chain were quite different from the sequences corresponding to N-terminal 8 amino acid residues and the 5' untranslated region of A2 mRNA. These findings are discussed in relation to the structures of the genes for A1 and A2 mRNA. PMID:6128725

  3. Molecular cloning of complementary DNA to Newcastle disease virus, and nucleotide sequence analysis of the junction between the genes encoding the haemagglutinin-neuraminidase and the large protein.

    PubMed

    Chambers, P; Millar, N S; Bingham, R W; Emmerson, P T

    1986-03-01

    Complementary DNA clones to 90% of the Newcastle disease virus (NDV) genome have been produced and mapped. These clones cover the entire HN, F and M genes, most if not all of the L gene and parts of the NP and P genes. The map of overlapping clones gives the gene order 3'-NP-P-M-F-HN-L-5' for NDV, identical to the gene order of Sendai virus, on the assumption that the NP gene of NDV is at the 3' end of the genome as previously suggested by inactivation of NDV transcription by u.v. light. The nucleotide sequence of 453 bases covering the junction between the HN and L genes has been determined. There is nucleotide sequence homology to the consensus polyadenylation and mRNA start sites of Sendai virus and vesicular stomatitis virus. The deduced amino acid sequence of the C terminus of the HN protein of NDV shows homology to the C-terminal amino acid sequences of the HN proteins of simian virus 5 and Sendai virus. An explanation for the presence of HN0, the precursor to HN in some strains of NDV, is suggested by the presence of a long non-coding region at the 3' terminus of the mRNA encoding the HN protein of NDV that could, by mutation, allow synthesis of a larger polypeptide.

  4. Comparison of single nucleotide polymorphisms and simple sequence repeats in genotype identification and diversity assessment of cacao germplasm

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Accurate identification of individual genotypes in an efficient manner is especially important for cacao (Theobroma cacao L.) germplasm conservation and breeding. The development of single nucleotide polymorphism (SNP) markers in cacao offers the opportunity to use a high throughput genotyping syste...

  5. WWW-query: an on-line retrieval system for biological sequence banks.

    PubMed

    Perrière, G; Gouy, M

    1996-01-01

    We have developed a World Wide Web (WWW) version of the sequence retrieval system Query: WWW-Query. This server allows users to query nucleotide sequence banks in the EMBL/GenBank/DDBJ formats and protein sequence banks in the NBRF/PIR format. WWW-Query includes all the features of the on-line sequence browsers already available: the ability to build complex queries, integration of cross-references with different data banks, and access to the functional zones of biological interest. It also provides original services not available elsewhere: introduction of the notion of re-usable sequence lists, integration of dedicated helper applications for visualizing alignments and phylogenetic trees, and links with multivariate methods for studying codon usage or for complementing phylogenies.

  6. Hydroxylamine-amplified gold nanoparticles for the naked eye and chemiluminescent detection of sequence-specific DNA with notable potential for single-nucleotide polymorphism discrimination.

    PubMed

    Fan, Aiping; Lau, Choiwan; Lu, Jianzhong

    2009-03-01

    Herein, we report a hydroxylamine-amplified gold nanoparticle-based assay with naked eye and chemiluminescent (CL) detection of sequence-specific DNA. For the naked eye detection assay, the signal can be observed by naked eye directly, which provides a general way for other biological assays. In contrast, the CL detection method can improve the detection limit by two orders of magnitude as compared to the naked eye detection, and a limit as low as 10 amol of target DNA can be sensitively detected. Most importantly, stringent control of either temperature or salt concentration is not needed during washing steps, and this new methodology exhibits an excellent capability for differentiating a perfectly matched target oligonucleotide from eight kinds of one-nucleotide mismatched oligonucleotides, and this detection specificity indicates that the present protocol could be applied to single-nucleotide polymorphism (SNP) analysis in many fields.

  7. Nucleotide and derived amino acid sequences of a cDNA coding for pre-uteroglobin from the lung of the hare (Lepus capensis).

    PubMed Central

    López de Haro, M S; Nieto, A

    1986-01-01

    An almost full-length cDNA coding for pre-uteroglobin from hare lung was cloned and sequenced. The derived amino acid sequence indicated that hare pre-uteroglobin contained 91 amino acids, including a signal peptide of 21 residues. Comparison of the nucleotide sequence of hare pre-uteroglobin cDNA with that previously reported for the rabbit gene indicated five silent point substitutions and six others leading to amino acid changes in the coding region. The untranslated regions of both pre-uteroglobin mRNAs were very similar. The amino acid changes observed are discussed in relation to the different progesterone-binding abilities of both homologous proteins. PMID:3019311

  8. Data in support of the discovery of alternative splicing variants of quail LEPR and the evolutionary conservation of qLEPRl by nucleotide and amino acid sequences alignment

    PubMed Central

    Wang, Dandan; Xu, Chunlin; Wang, Taian; Li, Hong; Li, Yanmin; Ren, Junxiao; Tian, Yadong; Li, Zhuanjian; Jiao, Yuping; Kang, Xiangtao; Liu, Xiaojun

    2015-01-01

    Leptin receptor (LEPR) belongs to the class I cytokine receptor superfamily, whose members share common structural features and signal transduction pathways. Although multiple LEPR isoforms, which are derived from one gene, have been identified in mammals, isoforms other than the long LEPR have rarely been found in avian species. Four alternative splicing variants of quail LEPR (qLEPR) were cloned and sequenced for the first time (Wang et al., 2015 [1]). To define patterns of the four splicing variants (qLEPRl, qLEPR-a, qLEPR-b and qLEPR-c) and locate the conserved regions of qLEPRl, this data article provides nucleotide sequence alignment of qLEPR and amino acid sequence alignment of representative vertebrate LEPR. The detailed analysis is shown in [1]. PMID:26759819

  9. Complete nucleotide sequence and mRNA-mapping of the large subunit gene of ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) from Chlamydomonas moewusii.

    PubMed

    Yang, R C; Dove, M; Seligy, V L; Lemieux, C; Turmel, M; Narang, S A

    1986-01-01

    Nucleotide (nt) sequencing of the large subunit (LS) gene of ribulose-1,5-bisphosphate carboxylase/oxygenase from the green alga, Chlamydomonas moewusii, and mapping of transcription ends were achieved by two new strategies. The deduced LS sequence of 475 amino acid residues was compared with similar genes from six other species: cyanobacteria, land plants and a related alga (C. reinhardtii). The most conserved regions are the three ribulose bisphosphate binding sites and the CO2 activator site. The nt sequence conservation outside the coding region is limited to only three segments within the 5'-flanking region: a region of tandem repeats, TATAA box and ribosome-binding site. Termination point of transcription is an 'A' residue 3' to the first of two 18-nt inverted repeats, which has the potential to form a stem-loop hairpin structure. The possible role of these potential regulatory features for transcription and translation, and similar structures in other LS genes, is presented.

  10. Molecular evolution of streptococcal M protein: cloning and nucleotide sequence of the type 24 M protein gene and relation to other genes of Streptococcus pyogenes.

    PubMed Central

    Mouw, A R; Beachey, E H; Burdett, V

    1988-01-01

    The structural gene for the type 24 M protein of group A streptococci has been cloned and expressed in Escherichia coli. The complete nucleotide sequence of the gene and the 3' and 5' flanking regions was determined. The sequence includes an open reading frame of 1,617 base pairs encoding a pre-M24 protein of 539 amino acids and a predicted Mr of 58,738. The structural gene contains two distinct tandemly reiterated elements. The first repeated element consists of 5.3 units, and the second contains 2.7 units. Each element shows little variation of the basic 35-amino-acid unit. Comparison of the sequence of the M24 protein with the sequence of the M6 protein (S. K. Hollingshead, V. A. Fischetti, and J. R. Scott, J. Biol. Chem. 261:1677-1686, 1986) indicates that these molecules are conserved except in the regions coding for the antigenic (type-specific) determinant and that they have three regions of homology within the structural genes: 38 of 42 amino acids within the amino-terminal signal sequence; the second repeated element of the M24 protein, which is found in the M6 molecule at the same position in the protein; and the carboxy-terminal 164 amino acids, including a membrane anchor sequence, which are conserved in both proteins. In addition, the sequences flanking the two genes are strongly conserved. Images PMID:3276665

  11. A hybrid next generation transcript sequencing-based approach to identify allelic and homeolog-specific single nucleotide polymorphisms in allotetraploid white clover

    PubMed Central

    2013-01-01

    Background: White clover (Trifolium repens L.) is an allotetraploid species possessing two highly collinear ancestral sub-genomes. The apparent existence of highly similar homeolog copies for the majority of genes in white clover is problematic for the development of genome-based resources in the species. This is especially true for the development of genetic markers based on single nucleotide polymorphisms (SNPs), since it is difficult to distinguish between homeolog-specific and allelic variants. Robust methods for categorising single nucleotide variants as allelic or homeolog-specific in large transcript datasets are required. We illustrate one potential approach in this study. Results: We used 454 pyrosequencing to generate ~760,000 transcript sequences from an eighth-generation white clover inbred line. These were assembled and partially annotated to yield a reference transcript set comprising 71,545 sequences. We subsequently performed Illumina sequencing on three further white clover samples, generating 14 million transcript reads from a mixed sample comprising 24 divergent white clover genotypes, and 50 million reads on two further eighth-generation white clover inbred lines. Mapping these reads to the reference transcript set allowed us to develop a significant SNP resource for white clover, and to partition the SNPs from the inbred lines into categories reflecting allelic or homeolog-specific variation. The potential for using haplotype reconstruction and progenitor genome comparison to assign haplotypes to specific ancestral sub-genomes of white clover is demonstrated for sequences corresponding to genes encoding dehydration responsive element binding protein and acyl-coA oxidase. Conclusions: In total, 208,854 independent SNPs in 31,715 reference sequences were discovered, approximately three quarters of which were categorised as representing allelic or homeolog-specific variation using two inbred lines. This represents a significant resource for

  13. Complete nucleotide sequence of the Escherichia coli recC gene and of the thyA-recC intergenic region.

    PubMed Central

    Finch, P W; Wilson, R E; Brown, K; Hickson, I D; Tomkinson, A E; Emmerson, P T

    1986-01-01

    The nucleotide sequence of a 6,000 bp region of the E. coli chromosome that includes the 3' end of the coding region for the thyA gene and the entire recC gene has been determined. The proposed coding region for the RecC protein is 3369 nucleotides long, which would encode a polypeptide consisting of 1122 amino acids with a calculated molecular mass of 129 kDa. Mung bean nuclease mapping of a recC specific transcript produced in vivo indicates that transcription of recC is initiated 80 bp upstream of the translational start point. A weak promoter sequence situated 5' to the transcription initiation point has been identified. In the 1953 bp thyA-recC intergenic region there are three open reading frames that would code for polypeptides of molecular mass 30 kDa, 13.5 kDa and 12 kDa, respectively. Although the first and third of these open reading frames are preceded by possible ribosome binding sites, no obvious promoter sequences could be identified. Moreover, transcripts for these reading frames could not be detected. Images PMID:3520484
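
    The relationship between the reported coding-region length and protein length can be checked with simple arithmetic, as in the sketch below. It assumes that the 3369-nucleotide figure includes the stop codon, which the abstract does not state, and the 110 Da average residue mass is only a common rule of thumb, so the mass is a rough estimate rather than the calculated value of 129 kDa.

```python
AVERAGE_RESIDUE_MASS_DA = 110  # rule-of-thumb average, an assumption for this sketch


def protein_length(orf_nt: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an open reading frame of `orf_nt` nucleotides."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nt // 3
    # The stop codon occupies one codon but does not encode a residue.
    return codons - 1 if includes_stop else codons


def rough_mass_kda(residues: int) -> float:
    """Very rough molecular mass estimate in kDa from the residue count."""
    return residues * AVERAGE_RESIDUE_MASS_DA / 1000


residues = protein_length(3369)    # 1123 codons, one of which is the stop codon
estimate = rough_mass_kda(residues)
print(residues, estimate)
```

    Under these assumptions the residue count agrees with the 1122 amino acids given in the abstract, and the rule-of-thumb mass comes out a few kDa below the reported 129 kDa, which is expected because the true value depends on the actual amino acid composition.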

  14. Analysis of the genome sequence of the pathogenic Muscovy duck parvovirus strain YY reveals a 14-nucleotide-pair deletion in the inverted terminal repeats.

    PubMed

    Wang, Jianye; Huang, Yu; Zhou, Mingxu; Zhu, Guoqiang

    2016-09-01

    Genomic information about Muscovy duck parvovirus is still limited. In this study, the genome of the pathogenic MDPV strain YY was sequenced. The full-length genome of YY is 5075 nucleotides (nt) long, 57 nt shorter than that of strain FM. Sequence alignment indicates that the 5' and 3' inverted terminal repeats (ITR) of strain YY contain a 14-nucleotide-pair deletion in the stem of the palindromic hairpin structure in comparison to strains FM and FZ91-30. The deleted region contains one "E-box" site and one repeated motif with the sequence "TTCCGGT" or "ACCGGAA". Phylogenetic trees constructed based on the protein-coding genes concordantly showed that YY, together with nine other MDPV isolates from various places, clustered in a separate branch, distinct from the branch formed by goose parvovirus (GPV) strains. These results demonstrate that, despite the distinctive deletion, the YY strain still belongs to the classical MDPV group. Moreover, the deletion of ITR may contribute to the genome evolution of MDPV under immunization pressure. PMID:27344160
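
    The two motif spellings quoted in the abstract, "TTCCGGT" and "ACCGGAA", are reverse complements of each other, which is what one would expect for a motif sitting in the stem of a palindromic hairpin. A minimal sketch for checking this, and for locating either form along one strand, is given below; the helper names are hypothetical and the example sequence in the usage line is made up.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_on_both_strands(seq: str, motif: str) -> list:
    """Start positions (0-based) where the motif or its reverse complement occurs."""
    seq, motif = seq.upper(), motif.upper()
    forms = {motif, reverse_complement(motif)}
    k = len(motif)
    return [i for i in range(len(seq) - k + 1) if seq[i:i + k] in forms]


print(reverse_complement("TTCCGGT"))                   # the other spelling of the motif
print(find_on_both_strands("GGTTCCGGTAA", "ACCGGAA"))  # made-up test sequence
```

    Searching for both forms at once avoids missing copies of the motif that lie on the opposite strand of the hairpin stem.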

  16. Genetic diversity of Histoplasma capsulatum strains isolated from Argentina based on nucleotide sequence variations in the internal transcribed spacer regions of rDNA.

    PubMed

    Landaburu, Fernanda; Cuestas, María Luján; Rubio, Andrea; Elías, Nahuel Alejandro; Daneri, Gabriela Lopez; Veciño, Cecilia; Iovannitti, Cristina A; Mujica, María Teresa

    2014-05-01

    The internal transcribed spacer (ITS) regions of rDNA genes of 49 Histoplasma capsulatum isolates (48 from clinical samples and one from soil) were examined. Nucleotide sequence heterogeneity within this region was useful for phylogenetic classification of H. capsulatum and species identification. Thus, in 45 of 49 isolates we observed higher percentages of identity in the nucleotide sequences of ITS regions when the isolates studied herein were compared with those reported in our country in the South America B clade. Phylogenetic analyses of rDNA sequences corresponding to the 537 bp of the ITS region obtained from H. capsulatum isolates assigned the isolates to the South America type B clade (45 isolates) and to North America type 1 and the Asia clade (2 isolates each). H. capsulatum strains isolated from soil and from patients living in Argentina (45 of 49) clustered together with the H. capsulatum isolates of the South America B clade. The high level of genetic similarity among our isolates suggests that almost one genetic population is present in the microenvironment. Isolates described as H. capsulatum var. capsulatum or var. farciminosum (2 isolates) did not form a monophyletic group and were found in the Asia clade. Subsequent studies are needed to properly identify these isolates. PMID:24299459

  17. Nucleotide sequences and organization of the genes for carotovoricin (Ctv) from Erwinia carotovora indicate that Ctv evolved from the same ancestor as Salmonella typhi prophage.

    PubMed

    Yamada, Kazuteru; Hirota, Morihiko; Niimi, Yoshiko; Nguyen, Hoa Anh; Takahara, Yoshiyuki; Kamio, Yoshiyuki; Kaneko, Jun

    2006-09-01

    Carotovoricin Er (CtvEr), which is produced by a plant soft rot disease causative agent, Erwinia carotovora subsp. carotovora Er, is a high-molecular-weight bacteriocin showing Myoviridae phage-tail-like morphology with contractile sheath and plural tail fibers. We determined the complete nucleotide sequences of CtvEr genes on the E. carotovora Er chromosome and report that CtvEr genes consist of lysis cassette, major and minor structural protein gene clusters. Four promoters were identified. The lysis gene cassette, which is composed of the genes for lysis enzyme and holin, was also identified and characterized. The nucleotide sequences and organization of the genes for CtvCGE, which is produced by E. carotovora strain CGE234-M403 with the morphology similar to CtvEr, were also determined and compared to that of CtvEr, and it was found that CtvCGE is almost identical to CtvEr except for tail fibers which are involved in the killing spectra of both bacteriocins. We also explain that the gene organization and the deduced amino acid sequences of both carotovoricins are very close to those of prophage, which is lysogenized in the chromosome on Salmonella enterica serovar Typhi CT18. These findings strongly suggest that Ctv evolved as a phage tail-like bacteriocin from a common ancestor with Salmonella typhi prophage. PMID:16960352

  18. Deep sequencing revealed genome-wide single-nucleotide polymorphism and plasmid content of Erwinia amylovora strains isolated in Middle Atlas, Morocco.

    PubMed

    Hannou, Najat; Mondy, Samuel; Planamente, Sara; Moumni, Mohieddine; Llop, Pablo; López, María; Manceau, Charles; Barny, Marie-Anne; Faure, Denis

    2013-10-01

    Erwinia amylovora causes economic losses that affect pear and apple production in Morocco. Here, we report comparative genomics of four Moroccan E. amylovora strains with the European strain CFBP1430 and North-American strain ATCC49946. Analysis of single nucleotide polymorphisms (SNPs) revealed genetic homogeneity of the Moroccan strains and their proximity to the European strain CFBP1430. Moreover, the collected sequences allowed the assembly of a 65 kbp plasmid, which is highly similar to the plasmid pEI70 harbored by several European E. amylovora isolates. This plasmid was found in 33% of the 40 E. amylovora strains collected from several host plants in 2009 and 2010 in Morocco.

  19. [Analysis of nucleotide sequences polymorphism of chloroplast trnL-trnF spacer of tRNA genes in giant duckweed Spirodela polyrrhiza (L.) Schleiden].

    PubMed

    Ryzhova, N N; Martirosian, L V; Kolganova, T V; Goriunova, S V; Kochieva, E Z

    2006-01-01

    Chloroplast DNA trnL-trnF spacer sequences of tRNA genes of 14 specimens of the family Lemnaceae have been characterized. Nucleotide polymorphism analysis of the trnL-trnF spacer in geographically isolated and morphologically differing accessions of S. polyrrhiza, the most widespread species of the genus Spirodela, showed a low level of intraspecific variability. Five trnL-trnF haplotypes of S. polyrrhiza were identified. Mono- and polynucleotide repeats, as well as extensive indels, specific to representatives of Spirodela polyrrhiza, Landoltia punctata and Lemna sp. were revealed. The validity of recognizing Landoltia as a separate genus was confirmed. PMID:17209426

  20. Complete Nucleotide Sequence of pGA45, a 140,698-bp IncFIIY Plasmid Encoding blaIMI-3-Mediated Carbapenem Resistance, from River Sediment

    PubMed Central

    Dang, Bingjun; Mao, Daqing; Luo, Yi

    2016-01-01

    Plasmid pGA45 was isolated from the sediments of Haihe River using Escherichia coli CV601 (gfp-tagged) as recipients and indigenous bacteria from sediment as donors. This plasmid confers reduced susceptibility to imipenem which belongs to carbapenem group. Plasmid pGA45 was fully sequenced on an Illumina HiSeq 2000 sequencing system. The complete sequence of plasmid pGA45 was 140,698 bp in length with an average G + C content of 52.03%. Sequence analysis shows that pGA45 belongs to IncFIIY group and harbors a backbone region which shares high homology and gene synteny to several other IncF plasmids including pNDM1_EC14653, pYDC644, pNDM-Ec1GN574, pRJF866, pKOX_NDM1, and pP10164-NDM. In addition to the backbone region, plasmid pGA45 harbors two notable features including one blaIMI-3-containing region and one type VI secretion system region. The blaIMI-3-containing region is responsible for bacteria carbapenem resistance and the type VI secretion system region is probably involved in bacteria virulence, respectively. Plasmid pGA45 represents the first complete nucleotide sequence of the blaIMI-harboring plasmid from environment sample and the sequencing of this plasmid provided insight into the architecture used for the dissemination of blaIMI carbapenemase genes. PMID:26941718
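
    The "average G + C content" reported above is the percentage of bases in the sequence that are G or C. A minimal sketch of the calculation, assuming the sequence is available as a string and that ambiguous symbols such as N are left out of the denominator (that handling is an assumption, not something the abstract specifies):

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases of a DNA string."""
    bases = [b for b in seq.upper() if b in "ACGT"]  # skip N and other codes
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(b in "GC" for b in bases)
    return 100.0 * gc / len(bases)


# Toy input; the real plasmid is 140,698 bp long.
print(gc_content("GGCCAATT"))  # 50.0
```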

  1. Cloning and nucleotide sequencing of a novel 7 beta-(4-carboxybutanamido)cephalosporanic acid acylase gene of Bacillus laterosporus and its expression in Escherichia coli and Bacillus subtilis.

    PubMed

    Aramori, I; Fukagawa, M; Tsumura, M; Iwami, M; Ono, H; Kojo, H; Kohsaka, M; Ueda, Y; Imanaka, H

    1991-12-01

    A strain of Bacillus species which produced an enzyme named glutaryl 7-ACA acylase which converts 7 beta-(4-carboxybutanamido)cephalosporanic acid (glutaryl 7-ACA) to 7-amino cephalosporanic acid (7-ACA) was isolated from soil. The gene for the glutaryl 7-ACA acylase was cloned with pHSG298 in Escherichia coli JM109, and the nucleotide sequence was determined by the M13 dideoxy chain termination method. The DNA sequence revealed only one large open reading frame composed of 1,902 bp corresponding to 634 amino acid residues. The deduced amino acid sequence contained a potential signal sequence in its amino-terminal region. Expression of the gene for glutaryl 7-ACA acylase was performed in both E. coli and Bacillus subtilis. The enzyme preparations purified from either recombinant strain of E. coli or B. subtilis were shown to be identical with each other as regards the profile of sodium dodecyl sulfate-polyacrylamide gel electrophoresis and were composed of a single peptide with the molecular size of 70 kDa. Determination of the amino-terminal sequence of the two enzyme preparations revealed that both amino-terminal sequences (the first nine amino acids) were identical and completely coincided with residues 28 to 36 of the open reading frame. Extracellular excretion of the enzyme was observed in a recombinant strain of B. subtilis. PMID:1744041
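
    The figures in this abstract follow from the triplet code: 1,902 bp divided by 3 gives 634 codons, matching the 634 residues quoted. A small sketch of that arithmetic; the `includes_stop` flag is a hypothetical convenience, since reported ORF lengths sometimes count the stop codon and sometimes do not:

```python
def orf_protein_length(orf_bp: int, includes_stop: bool = False) -> int:
    """Number of amino acids encoded by an ORF of orf_bp base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # A stop codon occupies three bases but encodes no residue.
    return codons - 1 if includes_stop else codons


print(orf_protein_length(1902))  # 634
```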

  3. Complete Nucleotide Sequence of pGA45, a 140,698-bp IncFIIY Plasmid Encoding bla IMI-3-Mediated Carbapenem Resistance, from River Sediment.

    PubMed

    Dang, Bingjun; Mao, Daqing; Luo, Yi

    2016-01-01

    Plasmid pGA45 was isolated from the sediments of Haihe River using Escherichia coli CV601 (gfp-tagged) as recipients and indigenous bacteria from sediment as donors. This plasmid confers reduced susceptibility to imipenem which belongs to carbapenem group. Plasmid pGA45 was fully sequenced on an Illumina HiSeq 2000 sequencing system. The complete sequence of plasmid pGA45 was 140,698 bp in length with an average G + C content of 52.03%. Sequence analysis shows that pGA45 belongs to IncFIIY group and harbors a backbone region which shares high homology and gene synteny to several other IncF plasmids including pNDM1_EC14653, pYDC644, pNDM-Ec1GN574, pRJF866, pKOX_NDM1, and pP10164-NDM. In addition to the backbone region, plasmid pGA45 harbors two notable features including one bla IMI-3-containing region and one type VI secretion system region. The bla IMI-3-containing region is responsible for bacteria carbapenem resistance and the type VI secretion system region is probably involved in bacteria virulence, respectively. Plasmid pGA45 represents the first complete nucleotide sequence of the bla IMI-harboring plasmid from environment sample and the sequencing of this plasmid provided insight into the architecture used for the dissemination of bla IMI carbapenemase genes. PMID:26941718

  4. The nucleotide sequence of metallothioneins (MT) in liver of the Kafue lechwe (Kobus leche kafuensis) and their potential as biomarkers of heavy metal pollution of the Kafue River.

    PubMed

    M'kandawire, Ethel; Syakalima, Michelo; Muzandu, Kaampwe; Pandey, Girja; Simuunza, Martin; Nakayama, Shouta M M; Kawai, Yusuke K; Ikenaka, Yoshinori; Ishizuka, Mayumi

    2012-09-15

    The study determined heavy metal concentrations and MT1 nucleotide sequence [phylogeny] in liver of the Kafue lechwe. Applicability of MT1 as a biomarker of pollution was assessed. cDNA-encoding sequences for lechwe MT1 were amplified by RT-PCR to characterize the sequence of MT1, which was subjected to BLAST searching at NCBI. Phylogenetic relationships were based on a pairwise matrix of sequence divergences calculated by Clustal W. A phylogenetic tree was constructed by the NJ method using the PHYLIP program. Metals were extracted by acid digestion and concentrations of Cr, Co, Cu, Zn, Cd, Pb, and Ni were determined using an AAS. MT1 mRNA expression levels were measured by quantitative comparative real-time RT-PCR. Lechwe MT1 has a length of 183 bp, which encodes an MT1 protein of 61 amino acids (AA), including 20 cysteines. The nucleotide sequence of lechwe MT1 showed identity with sheep MT (97%) and cattle MT1E (97%). The phylogenetic tree revealed that lechwe MT1 clustered with sheep MT and cattle MT1E. Cu and Ni concentrations and MT1 mRNA expression levels of lechwe from Blue Lagoon were significantly higher than those from Lochinvar (p<0.05). Concentrations of Cd and Cu, Co and Cu, Co and Pb, Ni and Cu, and Ni and Cr were positively correlated. Spearman's rank correlations also showed positive correlations between Cu and Co concentrations and MT mRNA expression. PCA further suggested that MT mRNA expression was related to Zn and Cd concentrations. Hepatic MT1 mRNA expression in lechwe can be used as a biomarker of heavy metal pollution.

  5. Using EMBL-EBI services via Web interface and programmatically via Web Services

    PubMed Central

    Lopez, Rodrigo; Cowley, Andrew; Li, Weizhong; McWilliam, Hamish

    2015-01-01

    The European Bioinformatics Institute (EMBL-EBI) provides access to a wide range of databases and analysis tools that are of key importance in bioinformatics. As well as providing Web interfaces to these resources, Web Services are available using SOAP and REST protocols that enable programmatic access to our resources and allow their integration into other applications and analytical workflows. This unit describes the various options available to a typical researcher or bioinformatician who wishes to use our resources via Web interface or programmatically via a range of programming languages. PMID:25501941

  6. Complete nucleotide sequence of an Amerindian human T-cell lymphotropic virus type II (HTLV-II) isolate: identification of a variant HTLV-II subtype b from a Guaymi Indian.

    PubMed Central

    Pardi, D; Switzer, W M; Hadlock, K G; Kaplan, J E; Lal, R B; Folks, T M

    1993-01-01

    The complete nucleotide sequence of a human T-cell lymphotropic virus type II (HTLV-II) isolate from a Panamanian Guaymi Indian was determined and analyzed. When this new viral isolate (HTLV-IIG12) was compared with prototypic HTLV-IIMoT, the overall nucleotide sequence similarity was 95.4%, while the predicted amino acid sequence similarity was 97.5%. Although the overall percentage of nucleotide and amino acid identity with prototypic HTLV-IIMoT (subtype a) was high, HTLV-IIG12 displayed several distinctive features that defined it as an HTLV-II subtype b. However, there were several characteristics unique to this isolate, which included a cluster of nucleotide substitutions in the pre-gag region and changes in restriction enzyme sites within the pre-gag region and the gag, pol, env, and pX genes. In addition, two nucleotide changes in the C terminus of the Tax protein coding sequence inserted an Arg residue for a stop codon and appeared to result in a larger tax gene product in HTLV-IIG12. Although the HTLV-IIG12 isolate appears to be a variant of the prototypic HTLV-IIb, this information represents the first complete nucleotide sequence of any HTLV-II subtype b. These data will allow further studies on the evolutionary relationships between the HTLV-II subtypes and between HTLV-I and HTLV-II. PMID:8331724
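
    Similarity percentages such as the 95.4% quoted above are computed from a pairwise alignment. The sketch below shows one common convention, assuming two pre-aligned strings of equal length with "-" marking gaps and with gap columns excluded from the count; the abstract does not state which convention or software the authors used, so this is illustrative only:

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip columns containing a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment: 4 matches in 5 ungapped columns.
print(percent_identity("ACGT-A", "ACCTGA"))  # 80.0
```

    Counting gap columns as mismatches instead would lower the value, which is one reason identity figures from different tools are not always directly comparable.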

  7. Nucleotide sequence of the McrB region of Escherichia coli K-12 and evidence for two independent translational initiation sites at the mcrB locus.

    PubMed Central

    Ross, T K; Achberger, E C; Braymer, H D

    1989-01-01

    The McrB restriction system of Escherichia coli K-12 is responsible for the biological inactivation of foreign DNA that contains 5-methylcytosine residues (E. A. Raleigh and G. Wilson, Proc. Natl. Acad. Sci. USA 83:9070-9074, 1986). Within the McrB region of the chromosome is the mcrB gene, which encodes a protein of 51 kilodaltons (kDa) (T. K. Ross, E. C. Achberger, and H. D. Braymer, Gene 61:277-289, 1987), and the mcrC gene, the product of which is 39 kDa (T. K. Ross, E. C. Achberger, and H. D. Braymer, Mol. Gen. Genet., in press). The nucleotide sequence of a 2,695-base-pair segment encompassing the McrB region was determined. The deduced amino acid sequence was used to identify two open reading frames specifying peptides of 455 and 348 amino acids, corresponding to the products of the mcrB and mcrC genes, respectively. A single-nucleotide overlap was found to exist between the termination codon of the mcrB gene and the proposed initiation codon of the mcrC gene. The presence of an additional peptide of 33 kDa in strains containing various recombinant plasmids with portions of the McrB region has been reported by Ross et al. (Gene 61:277-289, 1987). The analysis of frameshift and deletion mutants of one such hybrid plasmid, pRAB-13, provided evidence for a second translational initiation site within the McrB open reading frame. The proposed start codon for translation of the 33-kDa peptide lies 481 nucleotides downstream from the initiation codon for the 51-kDa mcrB gene product. The 33-kDa peptide may play a regulatory role in the McrB restriction of DNA containing 5-methylcytosine. PMID:2649480

  8. Complete nucleotide sequence of little cherry virus 1 (LChV-1) infecting sweet cherry in China

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Little cherry virus 1 (LChV-1), associated with little cherry disease (LCD), has a significant impact on fruit quality of infected sweet cherry trees. We report the full genome sequence of an isolate of LChV-1 from China, detected by small RNA deep sequencing and amplified by overlapping RT-PCR. The...

  9. Nucleotide sequence analysis of the bovine respiratory syncytial virus fusion protein mRNA and expression from a recombinant vaccinia virus.

    PubMed

    Lerch, R A; Anderson, K; Amann, V L; Wertz, G W

    1991-03-01

    Bovine respiratory syncytial (BRS) virus is an important cause of serious respiratory illness in calves. The disease caused in calves is similar to that caused by human respiratory syncytial (HRS) virus in children. The two viruses, however, have distinct host ranges and the attachment glycoproteins, G, have no antigenic cross-reactivity. The fusion glycoproteins, F, of the HRS and BRS viruses, however, have some antigenic cross-reactivity. To further compare the BRS virus and HRS virus fusion proteins, we determined the nucleotide sequence of cDNA clones to the BRS virus F protein mRNA, deduced the amino acid sequence, and compared these sequences with the HRS virus F protein sequences. The BRS virus F mRNA was 1899 nucleotides in length and had a single major open reading frame which could code for a polypeptide of 574 amino acids with an estimated molecular weight of 63.8 kDa. Structural features predicted from the amino acid sequence included an NH2-terminal signal sequence (residues 1-26), a site for proteolytic cleavage (residues 131-136) to generate the disulfide-linked F1 and F2 subunits, and a hydrophobic transmembrane anchor sequence (residues 522-549). The nucleic acid identity between the BRS virus and the HRS virus F mRNA sequences was 71.5%. The predicted BRS virus F protein shared 80.5% overall amino acid identity with the HRS virus F protein with 89% identity in the F1 polypeptide but only 68% identity in the F2 polypeptide. The position and number of the cysteine residues in the F1 and F2 polypeptides were conserved among all F proteins. However, BRS virus F protein had only three potential N-linked carbohydrate acceptor sites in comparison to four or five for the HRS viruses. A difference in the extent of glycosylation between the BRS and HRS virus F2 polypeptides was shown to be responsible for differences observed in the electrophoretic mobility of these proteins. A cDNA containing the complete open reading frame of the BRS virus F mRNA was

  10. Complete Nucleotide Sequence of Artichoke latent virus Shows it to be a Member of the Genus Macluravirus in the Family Potyviridae.

    PubMed

    Minutillo, S A; Marais, A; Mascia, T; Faure, C; Svanella-Dumas, L; Theil, S; Payet, A; Perennec, S; Schoen, L; Gallitelli, D; Candresse, T

    2015-08-01

    Complete genomic sequences of Artichoke latent virus (ArLV) have been obtained by classical or high-throughput sequencing for an ArLV isolate from Italy (ITBr05) and for two isolates from France (FR37 and FR50). The genome is 8,278 to 8,291 nucleotides long and has a genomic organization comparable with that of Chinese yam necrotic mosaic virus (CYNMV), the only macluravirus fully sequenced to date. The cleavage sites of the viral polyprotein have been tentatively identified by comparison with CYNMV, confirming that macluraviruses are characterized by the absence of a P1 protein and by a shorter, N-terminally truncated coat protein (CP). Sequence comparisons firmly place ArLV within the genus Macluravirus, and confirm previous results suggesting that Ranunculus latent virus (RaLV), a previously described Macluravirus sp., is very closely related to ArLV. Serological relationships and comparisons of the CP gene and of the partial RaLV sequence available all indicate that RaLV should not be considered as a distinct species but as a strain of ArLV. The results obtained also suggest that the spectrum of currently used ArLV-specific molecular hybridization or polymerase chain reaction detection assays should be improved to cover all isolates and strains in the ArLV species. PMID:25760520

  11. Complete nucleotide sequences of the genomes of two isolates of apple chlorotic leaf spot virus from peach (Prunus persica) in China.

    PubMed

    Niu, Feiqing; Pan, Song; Wu, Zujian; Jiang, Dongmei; Li, Shifang

    2012-04-01

    The complete nucleotide sequences of two isolates of apple chlorotic leaf spot virus (Z1 and Z3) collected from peach in Henan Province, China, were determined. The genomes of both Z1 and Z3 were found to contain three open reading frames (ORFs). Sequence analysis showed that genomic sequences of Z1 and Z3 isolates shared 67.4%-82.9% and 67.2%-82.6% identity, respectively, with the other eight isolates of ACLSV that have been reported previously. Based on the putative amino acid sequences of the products of the three ORFs, Z1 and Z3 isolates showed the greatest identity to isolate PBM1 (GenBank accession number AJ243438) from plum and the least identity with isolate Ta Tao5 (GenBank Accession Number: EU223295) from peach. Considering the low level of sequence identity between Z1/Z3 isolate and Ta Tao5 isolate, two types of ACLSV may exist in peach.

  12. Use of nucleotide sequencing of the genomic cDNA fragments of the capsid/premembrane junction region for molecular epidemiology of dengue type 2 viruses.

    PubMed

    Singh, U B; Seth, P

    2001-06-01

    The recent emergence of dengue hemorrhagic fever/dengue shock syndrome (DHF/DSS) in India has been a source of concern. In the present study a quantitative comparison of a 406-nucleotide-long sequence from the capsid-premembrane junction region (C-PrM) of 9 dengue virus type 2 (DEN-2) isolates from Delhi with 10 DEN-2 isolates from diverse geographic areas provided sufficient information for estimating genetic relationships. The data indicated that the 1996 epidemic of DHF in Delhi was caused by genotype IV strains of DEN-2. This genotype, perhaps, displaced genotype V strains of DEN-2, which was the circulating genotype in 1967. The period during which this displacement occurred is not clear from the present study. Nonetheless, similar experiences in four countries in Latin America and in Sri Lanka suggest that the introduction of new genotypes of DEN-2 displacing the circulating genotype may be associated with the appearance of DHF/DSS. More work is required to test this hypothesis. Transitions at nucleotide positions 406 and 431 resulted in amino acid substitutions near (aa position 104, methionine --> valine) and at the hinge region (aa position 112, valine --> alanine) of C-PrM, respectively, in all/most genotypes of group III and IV DEN-2 viruses analysed. Most of these virus strains have been isolated from DHF/DSS outbreaks. The significance of this observation is discussed. The data presented in this study suggest the utility of C-PrM sequence analysis for molecular epidemiology of dengue viruses.
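
    A transition is a substitution within the same base class (purine to purine, A <-> G, or pyrimidine to pyrimidine, C <-> T); anything else is a transversion. Under the standard genetic code, methionine (ATG) to valine (GTG) requires an A-to-G change and valine (GTN) to alanine (GCN) requires a T-to-C change, so both substitutions described above are consistent with transitions. The abstract does not give the actual codons of the isolates, so the sketch below only classifies the base changes themselves; the function name is illustrative:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-base DNA substitution as transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError("expected two different bases from A, C, G, T")
    pair = {ref, alt}
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


print(classify_substitution("A", "G"))  # transition (e.g. ATG -> GTG)
print(classify_substitution("T", "C"))  # transition (e.g. GTN -> GCN)
```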

  13. Comparative nucleotide sequences encoding the immunity proteins and the carboxyl-terminal peptides of colicins E2 and E3.

    PubMed Central

    Lau, P C; Rowsome, R W; Zuker, M; Visentin, L P

    1984-01-01

    Using the M13 dideoxy sequencing technique, we have established the DNA sequences of colicins E2 and E3 which encompass the receptor-binding and the catalytic domains of each of the nucleases, and their immunity (imm) genes. The imm gene of plasmid ColE2-P9 is 255 bp long and is separated from the end of the col gene by a dinucleotide. This gene pair is arranged similarly in plasmid ColE3-CA38 except that the intergenic space is 9 bp and the E3 imm gene is one codon shorter than its E2 counterpart. Comparisons of the E2 and E3 imm sequences indicate considerable divergence whereas the receptor-binding domains of both colicins are highly conserved. The two nuclease domains appear to share some sequence homology. A possible evolutionary relationship between colicin E3 and other microbial extracellular ribonucleases is also suggested from the sequence alignment analysis. PMID:6095211

  14. Nucleotide sequences of two fimbrial major subunit genes, pmpA and ucaA, from canine-uropathogenic Proteus mirabilis strains.

    PubMed

    Bijlsma, I G; van Dijk, L; Kusters, J G; Gaastra, W

    1995-06-01

    Proteus mirabilis strains were isolated from dogs with urinary tract infection (UTI) and fimbriae were prepared from two strains. The N-terminal amino acid sequences of the major fimbrial subunits were determined and both sequences appeared identical to the N-terminal amino acid sequence of a urinary cell adhesin (UCA) (Wray, S. K., Hull, S. I., Cook, R. G., Barrish, J. & Hull, R. A., 1986, Infect Immun 54, 43-49). The genes of two different major fimbrial subunits were cloned using oligonucleotide probes that were designed on the basis of the N-terminal UCA sequence. Nucleotide sequencing revealed the complete ucaA gene of 540 bp (from strain IVB247) encoding a polypeptide of 180 amino acids, including a 22 amino acid signal sequence peptide, and the pmpA (P. mirabilis P-like pili) gene of 549 bp (from strain IVB219) encoding a polypeptide of 183 amino acids, including a 23 amino acid signal sequence. Hybridization experiments gave clear indications of the presence of both kinds of fimbriae in many UTI-related canine P. mirabilis isolates. However, the presence of these fimbriae could not be demonstrated in P. vulgaris or other Proteus-related species. Database analysis of amino acid sequences of major subunit proteins revealed that the UcaA protein shares about 56% amino acid identity with the F17A and F111A major fimbrial subunits from bovine enterotoxigenic Escherichia coli. In turn, the PmpA protein more closely resembled the pyelonephritis-associated pili (Pap)-like major subunit protein from UTI-related E. coli. The evolutionary relationship of UcaA, PmpA and various other fimbrial subunit proteins is presented in a phylogenetic tree.

  15. Characterization of the DNA binding protein encoded by the N-specific filamentous Escherichia coli phage IKe. Binding properties of the protein and nucleotide sequence of the gene.

    PubMed

    Peeters, B P; Konings, R N; Schoenmakers, J G

    1983-09-01

    A DNA binding protein encoded by the filamentous single-stranded DNA phage IKe has been isolated from IKe-infected Escherichia coli cells. Fluorescence and in vitro binding studies have shown that the protein binds co-operatively and with a high specificity to single-stranded but not to double-stranded DNA. From titration of the protein to poly(dA) it has been calculated that approximately four bases of the DNA are covered by one monomer of protein. These binding characteristics closely resemble those of gene V protein encoded by the F-specific filamentous phages M13 and fd. The nucleotide sequence of the gene specifying the IKe DNA binding protein has been established. When compared to the nucleotide sequence of gene V of phage M13 it shows a homology of 58%, indicating that these two phages are evolutionarily related. The IKe DNA binding protein is 88 amino acids long, which is one amino acid residue longer than the gene V protein sequence. When the IKe DNA binding protein sequence was compared with that of gene V protein, it was found that 39 amino acid residues have identical positions in both proteins. The positions of all five tyrosine residues, a number of which are known to be involved in DNA binding, are conserved. Secondary structure predictions indicate that the two proteins contain similar structural domains. It is proposed that the tyrosine residues which are involved in DNA binding are the ones in or next to a beta-turn, at positions 26, 41 and 56 in gene V protein and at positions 27, 42 and 57 in the IKe DNA binding protein.

  16. The novel heat-stable enterotoxin subtype gene (ystB) of Yersinia enterocolitica: nucleotide sequence and distribution of the yst genes.

    PubMed

    Ramamurthy, T; Yoshino, K i; Huang, X; Balakrish Nair, G; Carniel, E; Maruyama, T; Fukushima, H; Takeda, T

    1997-10-01

    The gene (ystB) encoding the novel subtype of the heat-stable enterotoxin (Y-STb) was cloned from the chromosome of a clinical isolate of Yersinia enterocolitica 84-50 (serotype O:5, biotype 1A) and the nucleotide sequence was determined. The ystB gene contained 216 base pairs that encoded a protein of 71 amino acid residues. The C-terminal 30 residues of the precursor protein exactly corresponded to the amino acid sequence of the Y-STb toxin purified from the culture supernatant of the wild strain. A homology search revealed 76.9% nucleotide sequence similarity between ystB and the Yersinia kristensenii ST gene, and 73.5% with the Y. enterocolitica prototype sequence of yst (ystA). Of 304 Y. enterocolitica strains from 18 countries, 36 hybridized with the PCR-generated ystB-specific probe. All the ystB probe-positive strains belonged to biotype 1A and mostly to the so-called non-pathogenic serotypes O:5, O:6, O:7,8, O:7,13 and O:10, while ystA was predominantly found among the pathogenic serotypes (78.5%). Of the 36 ystB-positive strains, 18 were of clinical origin from six countries and were also positive in the suckling mouse assay, suggesting that ystB may play an important role in pathogenesis and that the so-called non-pathogenic serotypes could be virulent for humans.

  17. Large-scale similarity search profiling of ChEMBL compound data sets.

    PubMed

    Heikamp, Kathrin; Bajorath, Jürgen

    2011-08-22

    A large-scale similarity search investigation has been carried out on 266 well-defined compound activity classes extracted from the ChEMBL database. The analysis was performed using two widely applied two-dimensional (2D) fingerprints that mark opposite ends of the current performance spectrum of these types of fingerprints, i.e., MACCS structural keys and the extended connectivity fingerprint with bond diameter four (ECFP4). For each fingerprint, three nearest neighbor search strategies were applied. On the basis of these search calculations, a similarity search profile of the ChEMBL database was generated. Overall, the fingerprint search campaign was surprisingly successful. In 203 of 266 test cases (∼76%), a compound recovery rate of at least 50% was observed with at least the better performing fingerprint and one search strategy. The similarity search profile also revealed several general trends. For example, fingerprint searching was often characterized by an early enrichment of active compounds in database selection sets. In addition, compound activity classes have been categorized according to different similarity search performance levels, which helps to put the results of benchmark calculations into perspective. Therefore, a compendium of activity classes falling into different search performance categories is provided. On the basis of our large-scale investigation, the performance range of state-of-the-art 2D fingerprinting has been delineated for compound data sets directed against a wide spectrum of pharmaceutical targets.
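
    The abstract does not name the similarity metric or the three nearest-neighbor strategies. The sketch below therefore rests on two stated assumptions: the Tanimoto coefficient, which is widely used for comparing binary fingerprints, and a 1-nearest-neighbor rule in which each database compound is scored by its highest similarity to any reference compound. Fingerprints are represented as sets of "on" bit positions, and all names and data are hypothetical:

```python
from typing import Dict, FrozenSet, List, Tuple

Fingerprint = FrozenSet[int]  # indices of the bits set to 1


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared on-bits divided by total distinct on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank_by_nearest_neighbor(
    references: List[Fingerprint], database: Dict[str, Fingerprint]
) -> List[Tuple[float, str]]:
    """Rank database compounds by their best similarity to any reference."""
    scored = [
        (max(tanimoto(fp, ref) for ref in references), name)
        for name, fp in database.items()
    ]
    return sorted(scored, reverse=True)  # most similar first


# Toy example with made-up fingerprints.
refs = [frozenset({1, 2, 3})]
db = {"x": frozenset({1, 2, 3, 4}), "y": frozenset({7, 8})}
print(rank_by_nearest_neighbor(refs, db))  # [(0.75, 'x'), (0.0, 'y')]
```

    A recovery rate like the one reported above would then be the fraction of known actives appearing in the top portion of such a ranking.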

  18. Molecular cloning, coding nucleotides and the deduced amino acid sequence of P-450BM-1 from Bacillus megaterium.

    PubMed

    He, J S; Ruettinger, R T; Liu, H M; Fulco, A J

    1989-12-22

    The gene encoding barbiturate-inducible cytochrome P-450BM-1 from Bacillus megaterium ATCC 14581 has been cloned and sequenced. An open reading frame in the 1.9 kb of cloned DNA correctly predicted the NH2-terminal sequence of P-450BM-1 previously determined by protein sequencing, and, in toto, predicted a polypeptide of 410 amino acid residues with an Mr of 47,439. The sequence is most similar to that of P-450CAM from Pseudomonas putida, although the similarity is less than 27%, so that P-450BM-1 clearly belongs to a new P-450 gene family, distinct especially from that of the P-450 domain of P-450BM-3, a barbiturate-inducible single polypeptide cytochrome P-450:NADPH-P-450 reductase from the same strain of B. megaterium (Ruettinger, R.T., Wen, L.-P. and Fulco, A.J. (1989) J. Biol. Chem. 264, 10987-10995). PMID:2597681

  19. Cloning and nucleotide sequence of the genes coding for the Sau96I restriction and modification enzymes.

    PubMed Central

    Szilák, L; Venetianer, P; Kiss, A

    1990-01-01

    The genes coding for the GGNCC specific Sau96I restriction and modification enzymes were cloned and expressed in E. coli. The DNA sequence predicts a 430 amino acid protein (Mr: 49,252) for the methyltransferase and a 261 amino acid protein (Mr: 30,486) for the endonuclease. No protein sequence similarity was detected between the Sau96I methyltransferase and endonuclease. The methyltransferase contains the sequence elements characteristic for m5C-methyltransferases. In addition to this, M.Sau96I shows similarity, also in the variable region, with one m5C-methyltransferase (M.SinI) which has closely related recognition specificity (GGA/TCC). M.Sau96I methylates the internal cytosine within the GGNCC recognition sequence. The Sau96I endonuclease appears to act as a monomer. PMID:2204026
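
    In the recognition sequence GGNCC, N stands for any of the four bases. Because the reverse complement of GGNCC is again of the form GGNCC, scanning one strand finds the sites on both. A minimal sketch of locating such sites in a sequence string with a regular expression (0-based positions; the function name and test sequence are made up for illustration):

```python
import re

# Zero-width lookahead so that overlapping occurrences are also reported.
SAU96I_SITE = re.compile(r"(?=GG[ACGT]CC)")


def find_ggncc_sites(seq: str) -> list:
    """Return 0-based start positions of GGNCC sites in a DNA string."""
    return [m.start() for m in SAU96I_SITE.finditer(seq.upper())]


print(find_ggncc_sites("AAGGACCTTGGTCCGG"))  # [2, 9]
```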

  20. Diagnostic assay for Helicobacter hepaticus based on nucleotide sequence of its 16S rRNA gene.

    PubMed Central

    Battles, J K; Williamson, J C; Pike, K M; Gorelick, P L; Ward, J M; Gonda, M A

    1995-01-01

    Conserved primers were used to PCR amplify 95% of the Helicobacter hepaticus 16S rRNA gene. Its sequence was determined and aligned to those of related bacteria, enabling the selection of primers to highly diverged regions of the 16S rRNA gene and an oligonucleotide probe for the development of a PCR-liquid hybridization assay. This assay was shown to be both sensitive and specific for H. hepaticus 16S rRNA gene sequences. PMID:7542270