Science.gov

Sample records for embl nucleotide sequence

  1. The EMBL Nucleotide Sequence Database.

    PubMed

    Stoesser, G; Tuli, M A; Lopez, R; Sterk, P

    1999-01-01

    The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl.html) constitutes Europe's primary nucleotide sequence resource. Main sources for DNA and RNA sequences are direct submissions from individual researchers, genome sequencing projects and patent applications. While automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO), the preferred submission tool for individual submitters is Webin (WWW). Through all stages, dataflow is monitored by EBI biologists communicating with the sequencing groups. In collaboration with DDBJ and GenBank the database is produced, maintained and distributed at the European Bioinformatics Institute (EBI). Database releases are produced quarterly and are distributed on CD-ROM. Network services allow access to the most up-to-date data collection via Internet and World Wide Web interface. EBI's Sequence Retrieval System (SRS) is a Network Browser for Databanks in Molecular Biology, integrating and linking the main nucleotide and protein databases, plus many specialised databases. For sequence similarity searching a variety of tools (e.g. Blitz, Fasta, Blast etc) are available for external users to compare their own sequences against the most currently available data in the EMBL Nucleotide Sequence Database and SWISS-PROT. PMID:9847133

  2. The SWISS-PROT protein sequence data bank and its supplement TrEMBL in 1998.

    PubMed Central

    Bairoch, A; Apweiler, R

    1998-01-01

    SWISS-PROT (http://www.expasy.ch/) is a curated protein sequence database which strives to provide a high level of annotations (such as the description of the function of a protein, its domains structure, post-translational modifications, variants, etc.), a minimal level of redundancy and high level of integration with other databases. Recent developments of the database include: an increase in the number and scope of model organisms; cross-references to two additional databases; a variety of new documentation files and improvements to TrEMBL, a computer annotated supplement to SWISS-PROT. TrEMBL consists of entries in SWISS-PROT-like format derived from the translation of all coding sequences (CDS) in the EMBL nucleotide sequence database, except the CDS already included in SWISS-PROT. PMID:9399796
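
    The abstract above describes TrEMBL entries as derived from the translation of coding sequences (CDS). As a rough illustration only, the sketch below translates a CDS with the standard genetic code. The function name `translate_cds` and the choices to stop at the first stop codon and to write unknown codons as "X" are assumptions made for this example; they are not a description of how TrEMBL is actually produced.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_cds(cds):
    """Translate a coding sequence (hypothetical helper for illustration).

    Reads complete codons from position 0, stops at the first stop codon,
    and writes 'X' for any codon that is not in the standard table.
    """
    cds = cds.upper().replace("U", "T")  # accept RNA as well as DNA input
    protein = []
    for start in range(0, len(cds) - len(cds) % 3, 3):
        amino_acid = CODON_TABLE.get(cds[start:start + 3], "X")
        if amino_acid == "*":  # stop codon ends the translation
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate_cds("ATGGCCTAA"))
```

    Real database entries also carry features such as alternative genetic codes and partial CDS, which this sketch ignores.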

  3. The SWISS-PROT protein sequence data bank and its supplement TrEMBL in 1999.

    PubMed Central

    Bairoch, A; Apweiler, R

    1999-01-01

    SWISS-PROT is a curated protein sequence database which strives to provide a high level of annotation (such as the description of the function of a protein, its domain structure, post-translational modifications, variants, etc.), a minimal level of redundancy and high level of integration with other databases. Recent developments of the database include: cross-references to additional databases; a variety of new documentation files and improvements to TrEMBL, a computer annotated supplement to SWISS-PROT. TrEMBL consists of entries in SWISS-PROT-like format derived from the translation of all coding sequences (CDS) in the EMBL nucleotide sequence database, except the CDS already included in SWISS-PROT. The URLs for SWISS-PROT on the WWW are: http://www.expasy.ch/sprot and http://www.ebi.ac.uk/sprot PMID:9847139

  4. The SWISS-PROT protein sequence data bank and its supplement TrEMBL.

    PubMed Central

    Bairoch, A; Apweiler, R

    1997-01-01

    SWISS-PROT is a curated protein sequence database which strives to provide a high level of annotations (such as the description of the function of a protein, structure of its domains, post-translational modifications, variants, etc.), a minimal level of redundancy and high level of integration with other databases. Recent developments of the database include: an increase in the number and scope of model organisms; cross-references to two additional databases; a variety of new documentation files and the creation of TrEMBL, a computer annotated supplement to SWISS-PROT. This supplement consists of entries in SWISS-PROT-like format derived from the translation of all coding sequences (CDS) in the EMBL nucleotide sequence database, except the CDS already included in SWISS-PROT. PMID:9016499

  5. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000

    PubMed Central

    Bairoch, Amos; Apweiler, Rolf

    2000-01-01

    SWISS-PROT is a curated protein sequence database which strives to provide a high level of annotation (such as the description of the function of a protein, its domains structure, post-translational modifications, variants, etc.), a minimal level of redundancy and high level of integration with other databases. Recent developments of the database include format and content enhancements, cross-references to additional databases, new documentation files and improvements to TrEMBL, a computer-annotated supplement to SWISS-PROT. TrEMBL consists of entries in SWISS-PROT-like format derived from the translation of all coding sequences (CDSs) in the EMBL Nucleotide Sequence Database, except the CDSs already included in SWISS-PROT. We also describe the Human Proteomics Initiative (HPI), a major project to annotate all known human sequences according to the quality standards of SWISS-PROT. SWISS-PROT is available at: http://www.expasy.ch/sprot/ and http://www.ebi.ac.uk/swissprot/ PMID:10592178

  6. The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000.

    PubMed

    Bairoch, A; Apweiler, R

    2000-01-01

    SWISS-PROT is a curated protein sequence database which strives to provide a high level of annotation (such as the description of the function of a protein, its domains structure, post-translational modifications, variants, etc.), a minimal level of redundancy and high level of integration with other databases. Recent developments of the database include format and content enhancements, cross-references to additional databases, new documentation files and improvements to TrEMBL, a computer-annotated supplement to SWISS-PROT. TrEMBL consists of entries in SWISS-PROT-like format derived from the translation of all coding sequences (CDSs) in the EMBL Nucleotide Sequence Database, except the CDSs already included in SWISS-PROT. We also describe the Human Proteomics Initiative (HPI), a major project to annotate all known human sequences according to the quality standards of SWISS-PROT. SWISS-PROT is available at: http://www.expasy.ch/sprot/ and http://www.ebi.ac.uk/swissprot/ PMID:10592178

  7. Nucleotide sequences 1986/1987

    SciTech Connect

    Not Available

    1987-01-01

    These eight volumes are the third annual published compendium of nucleic acid sequences included in the European Molecular Biology Laboratory Nucleotide Sequence Data Library and the GenBank Genetic Sequences Data Bank. Each volume surveys one or more subdivisions of the database. The volume subtitles are: Primates; Rodents; Other Vertebrates and Invertebrates; Plants and Organelles; Bacteria and Bacteriophage; Viruses; Structural RNA, Synthetic and Unannotated Sequences; and Database Directory and Master Indices.

  8. Automated Identification of Nucleotide Sequences

    NASA Technical Reports Server (NTRS)

    Osman, Shariff; Venkateswaran, Kasthuri; Fox, George; Zhu, Dian-Hui

    2007-01-01

    STITCH is a computer program that processes raw nucleotide-sequence data to automatically remove unwanted vector information, perform reverse-complement comparison, stitch shorter sequences together to make longer ones to which the shorter ones presumably belong, and search against the user's choice of private and Internet-accessible public 16S rRNA databases. ["16S rRNA" denotes a ribosomal ribonucleic acid (rRNA) sequence that is common to all organisms.] In STITCH, a template 16S rRNA sequence is used to position forward and reverse reads. STITCH then automatically searches known 16S rRNA sequences in the user's chosen database(s) to find the sequence most similar to (the sequence that lies at the smallest edit distance from) each spliced sequence. The result of processing by STITCH is the identification of the most similar well-described bacterium. Whereas previously commercially available software for analyzing genetic sequences operates on one sequence at a time, STITCH can manipulate multiple sequences simultaneously to perform the aforementioned operations. A typical analysis of several dozen sequences (length of the order of 10^3 base pairs) by use of STITCH is completed in a few minutes, whereas such an analysis performed by use of prior software takes hours or days.
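
    Two of the operations named in the abstract above, reverse-complement comparison and finding the reference sequence at the smallest edit distance, can be sketched as follows. This is only an illustration: the abstract does not say how STITCH implements either step, so the plain Levenshtein distance used here and the helper names `reverse_complement`, `edit_distance` and `closest_reference` are assumptions made for this example.

```python
# Translation table mapping each base to its complement (both cases).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def edit_distance(a, b):
    """Levenshtein distance between two strings, one row at a time."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion
                current[j - 1] + 1,                    # insertion
                previous[j - 1] + (char_a != char_b),  # substitution or match
            ))
        previous = current
    return previous[-1]


def closest_reference(query, references):
    """Return the name of the reference at the smallest edit distance.

    `references` is a hypothetical dict mapping a name to a sequence.
    """
    return min(references, key=lambda name: edit_distance(query, references[name]))


if __name__ == "__main__":
    refs = {"ref_a": "ACGA", "ref_b": "TTTT"}
    print(reverse_complement("AACG"), closest_reference("ACGT", refs))
```

    The quadratic-time distance shown here is adequate for a handful of short reads; searching a full 16S rRNA database would normally rely on an indexed or heuristic search instead.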

  9. Nucleotide sequences encoding a thermostable alkaline protease

    DOEpatents

    Wilson, David B.; Lao, Guifang

    1998-01-01

    Nucleotide sequences, derived from a thermophilic actinomycete microorganism, which encode a thermostable alkaline protease are disclosed. Also disclosed are variants of the nucleotide sequences which encode a polypeptide having thermostable alkaline proteolytic activity. Recombinant thermostable alkaline protease or recombinant polypeptide may be obtained by culturing in a medium a host cell genetically engineered to contain and express a nucleotide sequence according to the present invention, and recovering the recombinant thermostable alkaline protease or recombinant polypeptide from the culture medium.

  10. Nucleotide sequences encoding a thermostable alkaline protease

    DOEpatents

    Wilson, D.B.; Lao, G.

    1998-01-06

    Nucleotide sequences, derived from a thermophilic actinomycete microorganism, which encode a thermostable alkaline protease are disclosed. Also disclosed are variants of the nucleotide sequences which encode a polypeptide having thermostable alkaline proteolytic activity. Recombinant thermostable alkaline protease or recombinant polypeptide may be obtained by culturing in a medium a host cell genetically engineered to contain and express a nucleotide sequence according to the present invention, and recovering the recombinant thermostable alkaline protease or recombinant polypeptide from the culture medium. 3 figs.

  11. Long-range correlations in nucleotide sequences

    NASA Astrophysics Data System (ADS)

    Peng, C.-K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Sciortino, F.; Simons, M.; Stanley, H. E.

    1992-03-01

    DNA sequences have been analysed using models, such as an n-step Markov chain, that incorporate the possibility of short-range nucleotide correlations. We propose here a method for studying the stochastic properties of nucleotide sequences by constructing a 1:1 map of the nucleotide sequence onto a walk, which we term a 'DNA walk'. We then use the mapping to provide a quantitative measure of the correlation between nucleotides over long distances along the DNA chain. Thus we uncover in the nucleotide sequence a remarkably long-range power law correlation that implies a new scale-invariant property of DNA. We find such long-range correlations in intron-containing genes and in nontranscribed regulatory DNA sequences, but not in complementary DNA sequences or intron-less genes.
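
    The mapping of a sequence onto a walk can be illustrated with a short sketch. The abstract does not state the step rule, so the rule used here (step up for a pyrimidine, step down for a purine) is an assumption, as are the function names `dna_walk` and `fluctuation`. The fluctuation function is one simple way to measure how displacements spread over a given distance and is not claimed to be the authors' exact procedure.

```python
from itertools import accumulate
from math import sqrt

# Assumed step rule: pyrimidines (C, T) step up, purines (A, G) step down.
STEP = {"C": 1, "T": 1, "A": -1, "G": -1}


def dna_walk(seq):
    """Cumulative displacement of the walk after each nucleotide."""
    return list(accumulate(STEP[base] for base in seq.upper()))


def fluctuation(walk, window):
    """Root-mean-square spread of displacements over `window` steps."""
    if not 0 < window < len(walk):
        raise ValueError("window must be between 1 and len(walk) - 1")
    diffs = [walk[i + window] - walk[i] for i in range(len(walk) - window)]
    mean = sum(diffs) / len(diffs)
    return sqrt(sum((d - mean) ** 2 for d in diffs) / len(diffs))


if __name__ == "__main__":
    walk = dna_walk("CTAGCTTTAGGA")
    print(walk, fluctuation(walk, 2))
```

    Plotting the fluctuation against the window length on log-log axes is one way to look for the power-law behaviour the abstract refers to; an uncorrelated sequence would be expected to give a slope near 0.5.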

  12. Submitting MIGS, MIMS, MIENS Information to EMBL and Standards and the Sequencing Pipelines of the Gordon and Betty Moore Foundation (GSC8 Meeting)

    ScienceCinema

    Vaughan, Bob [EMBL]; Kaye, Jon [Gordon and Betty Moore Foundation]

    2011-04-29

    The Genomic Standards Consortium was formed in September 2005. It is an international, open-membership working body which promotes standardization in the description of genomes and the exchange and integration of genomic data. The 2009 meeting was an activity of a five-year "Research Coordination Network" grant from the National Science Foundation and was held at the DOE Joint Genome Institute, with organizational support provided by the JGI and by the University of California - San Diego. Bob Vaughan of EMBL speaks on submitting MIGS/MIMS/MIENS information to EMBL-EBI's system, followed by a brief talk from Jon Kaye of the Gordon and Betty Moore Foundation on standards and the foundation's sequencing pipelines, at the Genomic Standards Consortium's 8th meeting at the DOE JGI in Walnut Creek, Calif. on Sept. 9, 2009.

  13. Submitting MIGS, MIMS, MIENS Information to EMBL and Standards and the Sequencing Pipelines of the Gordon and Betty Moore Foundation (GSC8 Meeting)

    SciTech Connect

    Vaughan, Bob; Kaye, Jon

    2009-09-09

    The Genomic Standards Consortium was formed in September 2005. It is an international, open-membership working body which promotes standardization in the description of genomes and the exchange and integration of genomic data. The 2009 meeting was an activity of a five-year "Research Coordination Network" grant from the National Science Foundation and was held at the DOE Joint Genome Institute, with organizational support provided by the JGI and by the University of California - San Diego. Bob Vaughan of EMBL speaks on submitting MIGS/MIMS/MIENS information to EMBL-EBI's system, followed by a brief talk from Jon Kaye of the Gordon and Betty Moore Foundation on standards and the foundation's sequencing pipelines, at the Genomic Standards Consortium's 8th meeting at the DOE JGI in Walnut Creek, Calif. on Sept. 9, 2009.

  14. Statistical analysis of nucleotide sequences.

    PubMed Central

    Stückle, E E; Emmrich, C; Grob, U; Nielsen, P J

    1990-01-01

    In order to scan nucleic acid databases for potentially relevant but as yet unknown signals, we have developed an improved statistical model for pattern analysis of nucleic acid sequences by modifying previous methods based on Markov chains. We demonstrate the importance of selecting the appropriate parameters in order for the method to function at all. The model allows the simultaneous analysis of several short sequences with unequal base frequencies and Markov order k not equal to 0 as is usually the case in databases. As a test of these modifications, we show that in E. coli sequences there is a bias against palindromic hexamers which correspond to known restriction enzyme recognition sites. PMID:2251125
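
    To make the idea of a bias against palindromic hexamers concrete, the sketch below tests whether a word equals its own reverse complement and compares its observed count with an expected count. The expected count uses a common first-order Markov estimate (the product of observed dinucleotide counts divided by the product of the counts of the inner bases); this is an assumption chosen for illustration and is not the authors' improved model. All function names are hypothetical.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def is_palindromic(word):
    """True if the word equals its own reverse complement (e.g. GAATTC)."""
    return word == word.translate(COMPLEMENT)[::-1]


def count_words(seq, length):
    """Count every overlapping word of the given length in seq."""
    return Counter(seq[i:i + length] for i in range(len(seq) - length + 1))


def expected_count_order1(seq, word):
    """Expected occurrences of `word` under a first-order Markov estimate."""
    singles = count_words(seq, 1)
    pairs = count_words(seq, 2)
    numerator = 1.0
    for i in range(len(word) - 1):
        numerator *= pairs[word[i:i + 2]]  # each overlapping dinucleotide
    denominator = 1.0
    for base in word[1:-1]:
        denominator *= singles[base]       # each inner base
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    sequence = "GAATTCGGATCCAAGCTTGAATTC"
    observed = count_words(sequence, 6)["GAATTC"]
    print(is_palindromic("GAATTC"), observed, expected_count_order1(sequence, "GAATTC"))
```

    A word whose observed count falls well below its expected count would be a candidate for the kind of avoidance the abstract reports, although a significance test would be needed before drawing any conclusion.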

  15. BLAST2SRS, a web server for flexible retrieval of related protein sequences in the SWISS-PROT and SPTrEMBL databases

    PubMed Central

    Bimpikis, Konstantinos; Budd, Aidan; Linding, Rune; Gibson, Toby J.

    2003-01-01

    SRS (Sequence Retrieval System) is a widely used keyword search engine for querying biological databases. BLAST2 is the most widely used tool to query databases by sequence similarity search. These tools allow users to retrieve sequences by shared keyword or by shared similarity, with many public web servers available. However, with the increasingly large datasets available it is now quite common that a user is interested in some subset of homologous sequences but has no efficient way to restrict retrieval to that set. By allowing the user to control SRS from the BLAST output, BLAST2SRS (http://blast2srs.embl.de/) aims to meet this need. This server therefore combines the two ways to search sequence databases: similarity and keyword. PMID:12824420

  16. The International Nucleotide Sequence Database Collaboration.

    PubMed

    Cochrane, Guy; Karsch-Mizrachi, Ilene; Takagi, Toshihisa

    2016-01-01

    The International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org) comprises three global partners committed to capturing, preserving and providing comprehensive public-domain nucleotide sequence information. The INSDC establishes standards, formats and protocols for data and metadata to make it easier for individuals and organisations to submit their nucleotide data reliably to public archives. This work enables the continuous, global exchange of information about living things. Here we present an update of the INSDC in 2015, including data growth and diversification, new standards and requirements by publishers for authors to submit their data to the public archives. The INSDC serves as a model for data sharing in the life sciences. PMID:26657633

  17. Nucleotide sequence of bacteriophage fd DNA.

    PubMed Central

    Beck, E; Sommer, R; Auerswald, E A; Kurz, C; Zink, B; Osterburg, G; Schaller, H; Sugimoto, K; Sugisaki, H; Okamoto, T; Takanami, M

    1978-01-01

    The sequence of the 6,408 nucleotides of bacteriophage fd DNA has been determined. This allows the exact organisation of the filamentous phage genome to be deduced and provides easy access to DNA segments of known structure and function. PMID:745987

  18. Complete Nucleotide Sequence of Tn10

    PubMed Central

    Chalmers, Ronald; Sewitz, Sven; Lipkow, Karen; Crellin, Paul

    2000-01-01

    The complete nucleotide sequence of Tn10 has been determined. The dinucleotide signature and percent G+C of the sequence had no discontinuities, indicating that Tn10 constitutes a homogeneous unit. The new sequence contained three new open reading frames corresponding to a glutamate permease, repressors of heavy metal resistance operons, and a hypothetical protein in Bacillus subtilis. The glutamate permease was fully functional when expressed, but Tn10 did not protect Escherichia coli from the toxic effects of various metals. PMID:10781570
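
    The abstract above refers to the percent G+C of the sequence showing no discontinuities. One simple way to look for such discontinuities is to compute percent G+C in windows along the sequence, as sketched below. The window and step sizes and the function names `gc_percent` and `gc_profile` are assumptions made for this example; the abstract does not describe the authors' procedure.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a sequence (0.0 for an empty one)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def gc_profile(seq, window, step):
    """Percent G+C for each full window, moving `step` bases at a time."""
    return [
        gc_percent(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    print(gc_profile("ATATATGCGCGCATATAT", window=6, step=6))
```

    A sharp jump between neighbouring windows would suggest a region of different composition; a flat profile is consistent with a homogeneous unit.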

  19. The multiple codes of nucleotide sequences.

    PubMed

    Trifonov, E N

    1989-01-01

    Nucleotide sequences carry genetic information of many different kinds, not just instructions for protein synthesis (triplet code). Several codes of nucleotide sequences are discussed including: (1) the translation framing code, responsible for correct triplet counting by the ribosome during protein synthesis; (2) the chromatin code, which provides instructions on appropriate placement of nucleosomes along the DNA molecules and their spatial arrangement; (3) a putative loop code for single-stranded RNA-protein interactions. The codes are degenerate and corresponding messages are not only interspersed but actually overlap, so that some nucleotides belong to several messages simultaneously. Tandemly repeated sequences frequently considered as functionless "junk" are found to be grouped into certain classes of repeat unit lengths. This indicates some functional involvement of these sequences. A hypothesis is formulated according to which the tandem repeats are given the role of weak enhancer-silencers that modulate, in a copy number-dependent way, the expression of proximal genes. Fast amplification and elimination of the repeats provides an attractive mechanism of species adaptation to a rapidly changing environment. PMID:2673451

  20. Remote access to ACNUC nucleotide and protein sequence databases at PBIL.

    PubMed

    Gouy, Manolo; Delmotte, Stéphane

    2008-04-01

    The ACNUC biological sequence database system provides powerful and fast query and extraction capabilities to a variety of nucleotide and protein sequence databases. The collection of ACNUC databases served by the Pôle Bio-Informatique Lyonnais includes the EMBL, GenBank, RefSeq and UniProt nucleotide and protein sequence databases and a series of other sequence databases that support comparative genomics analyses: HOVERGEN and HOGENOM containing families of homologous protein-coding genes from vertebrate and prokaryotic genomes, respectively; Ensembl and Genome Reviews for analyses of prokaryotic and of selected eukaryotic genomes. This report describes the main features of the ACNUC system and the access to ACNUC databases from any internet-connected computer. Such access was made possible by the definition of a remote ACNUC access protocol and the implementation of Application Programming Interfaces between the C, Python and R languages and this communication protocol. Two retrieval programs for ACNUC databases, Query_win, with a graphical user interface and raa_query, with a command line interface, are also described. Altogether, these bioinformatics tools provide users with either ready-to-use means of querying remote sequence databases through a variety of selection criteria, or a simple way to endow application programs with an extensive access to these databases. Remote access to ACNUC databases is open to all and fully documented (http://pbil.univ-lyon1.fr/databases/acnuc/acnuc.html). PMID:17825976

  1. Simplified computer programs for search of homology within nucleotide sequences.

    PubMed Central

    Kröger, M; Kröger-Block, A

    1984-01-01

    Four new computer programs for search of homology within nucleotide sequences are presented. The main scope of the program design is flexibility, independence of sequence length and the capability to be used by any molecular biologist without any prior computer experience. The programs offer a linear search, a search for maximal identity, an alignment along a given sequence and a search based on homology within the amino acid coding capacity of nucleotide sequences. The language is Fortran V. Copies are available on request. PMID:6546417

  2. Nucleotide sequence of SHV-2 beta-lactamase gene

    SciTech Connect

    Garbarg-Chenon, A.; Godard, V.; Labia, R.; Nicolas, J.C.

    1990-07-01

    The nucleotide sequence of plasmid-mediated beta-lactamase SHV-2 from Salmonella typhimurium (SHV-2pHT1) was determined. The gene was very similar to chromosomally encoded beta-lactamase LEN-1 of Klebsiella pneumoniae. Compared with the sequence of the Escherichia coli SHV-2 enzyme (SHV-2E.coli) obtained by protein sequencing, the deduced amino acid sequence of SHV-2pHT1 differed by three amino acid substitutions.

  3. Reading biological processes from nucleotide sequences

    NASA Astrophysics Data System (ADS)

    Murugan, Anand

    Cellular processes have traditionally been investigated by techniques of imaging and biochemical analysis of the molecules involved. The recent rapid progress in our ability to manipulate and read nucleic acid sequences gives us direct access to the genetic information that directs and constrains biological processes. While sequence data is being used widely to investigate genotype-phenotype relationships and population structure, here we use sequencing to understand biophysical mechanisms. We present work on two different systems. First, in chapter 2, we characterize the stochastic genetic editing mechanism that produces diverse T-cell receptors in the human immune system. We do this by inferring statistical distributions of the underlying biochemical events that generate T-cell receptor coding sequences from the statistics of the observed sequences. This inferred model quantitatively describes the potential repertoire of T-cell receptors that can be produced by an individual, providing insight into its potential diversity and the probability of generation of any specific T-cell receptor. Then in chapter 3, we present work on understanding the functioning of regulatory DNA sequences in both prokaryotes and eukaryotes. Here we use experiments that measure the transcriptional activity of large libraries of mutagenized promoters and enhancers and infer models of the sequence-function relationship from this data. For the bacterial promoter, we infer a physically motivated 'thermodynamic' model of the interaction of DNA-binding proteins and RNA polymerase determining the transcription rate of the downstream gene. For the eukaryotic enhancers, we infer heuristic models of the sequence-function relationship and use these models to find synthetic enhancer sequences that optimize inducibility of expression. 
    Both projects demonstrate the utility of sequence information in conjunction with sophisticated statistical inference techniques for dissecting underlying biophysical...

  4. The nucleotide sequence of cowpea mosaic virus B RNA

    PubMed Central

    Lomonossoff, G.P.; Shanks, M.

    1983-01-01

    The complete sequence of the bottom component RNA (B RNA) of cowpea mosaic virus (CPMV) has been determined. Restriction enzyme fragments of double-stranded cDNA were cloned in M13 and the sequence of the inserts was determined by a combination of enzymatic and chemical sequencing techniques. Additional sequence information was obtained by primed synthesis on first strand cDNA. The complete sequence deduced is 5889 nucleotides long excluding the 3' poly(A), and contains an open reading frame sufficient to code for a polypeptide of mol. wt. 207 760. The coding region is flanked by a 5' leader sequence of 206 nucleotides and a 3' non-coding region of 82 residues which does not contain a polyadenylation signal. PMID:16453487

  5. Nucleotide sequence of the tobacco (Nicotiana tabacum) anionic peroxidase gene

    SciTech Connect

    Diaz-De-Leon, F.; Klotz, K.L.; Lagrimini, L.M.

    1993-03-01

    Peroxidases have been implicated in numerous physiological processes including lignification (Grisebach, 1981), wound-healing (Espelie et al., 1986), phenol oxidation (Lagrimini, 1991), pathogen defense (Ye et al., 1990), and the regulation of cell elongation through the formation of interchain covalent bonds between various cell wall polymers (Fry, 1986; Goldberg et al., 1986; Bradley et al., 1992). However, a complete description of peroxidase action in vivo is not available because of the vast number of potential substrates and the existence of multiple isoenzymes. The tobacco anionic peroxidase is one of the better-characterized isoenzymes. This enzyme has been shown to oxidize a number of significant plant secondary compounds in vitro including cinnamyl alcohols, phenolic acids, and indole-3-acetic acid (Maeder, 1980; Lagrimini, 1991). A cDNA encoding the enzyme has been obtained, and this enzyme was shown to be expressed at the highest levels in lignifying tissues (xylem and tracheary elements) and also in epidermal tissue (Lagrimini et al., 1987). It was shown at this time that there were four distinct copies of the anionic peroxidase gene in tobacco (Nicotiana tabacum). A tobacco genomic DNA library was constructed in the λ phage EMBL3, from which two unique peroxidase genes were sequenced. One of these clones, λPOD1, was designated as a pseudogene when the exonic sequences were found to differ from the cDNA sequences by 1%, and several frame shifts in the coding sequences indicated a dysfunctional gene (the authors' unpublished results). The other clone, λPOD3, described in this manuscript, was designated as the functional tobacco anionic peroxidase gene because of 100% homology with the cDNA. Significant structural elements include an AS-2 box implicated in shoot-specific expression (Lam and Chua, 1989), a TATA box, and two intervening sequences. 10 refs., 1 tab.

  6. DNA sequence representation by trianders and determinative degree of nucleotides

    PubMed Central

    Duplij, Diana; Duplij, Steven

    2005-01-01

    A new version of DNA walks, in which nucleotides are regarded as unequal in their contribution to a walk, is introduced, which allows us to study thoroughly the “fine structure” of nucleotide sequences. The approach is based on the assumption that nucleotides have an inner abstract characteristic, the determinative degree, which reflects phenomenological properties of the genetic code and is adjusted to the physical properties of nucleotides. We consider each codon position independently, which gives three separate walks characterized by different angles and lengths; such an object is called a triander, and it reflects the “strength” of each branch. A general method for identifying a DNA sequence “by triander”, which can be treated as a unique “genogram” (or “gene passport”), is proposed. Two- and three-dimensional trianders are considered. The difference in the fine structure of sequences between genes and the intergenic space is shown. A clear triplet signal was found in coding sequences; it is absent in the intergenic space and is independent of sequence length. This paper presents a topological classification of trianders, which may allow detailed signatures of functionally different genomic regions to be worked out. PMID:16052707

  7. Moss Phylogeny Reconstruction Using Nucleotide Pangenome of Complete Mitogenome Sequences.

    PubMed

    Goryunov, D V; Nagaev, B E; Nikolaev, M Yu; Alexeevski, A V; Troitsky, A V

    2015-11-01

    Stability of composition and sequence of genes was shown earlier in 13 mitochondrial genomes of mosses (Rensing, S. A., et al. (2008) Science, 319, 64-69). It is of interest to study the evolution of mitochondrial genomes not only at the gene level, but also on the level of nucleotide sequences. To do this, we have constructed a "nucleotide pangenome" for mitochondrial genomes of 24 moss species. The nucleotide pangenome is a set of aligned nucleotide sequences of orthologous genome fragments covering the totality of all genomes. The nucleotide pangenome was constructed using specially developed new software, NPG-explorer (NPGe). The stable part of the mitochondrial genome (232 stable blocks) is shown to be, on average, 45% of its length. In the joint alignment of stable blocks, 82% of positions are conserved. The phylogenetic tree constructed with the NPGe program is in good correlation with other phylogenetic reconstructions. With the NPGe program, 30 blocks have been identified with repeats no shorter than 50 bp. The maximal length of a block with repeats is 140 bp. Duplications in the mitochondrial genomes of mosses are rare. On average, the genome contains about 500 bp in large duplications. The total length of insertions and deletions was determined in each genome. The losses and gains of DNA regions are rather active in mitochondrial genomes of mosses, and such rearrangements presumably can be used as additional markers in the reconstruction of phylogeny. PMID:26615445

  8. Complete nucleotide sequence of Nootka lupine vein-clearing virus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complete genome sequence of Nootka lupine vein-clearing virus (NLVCV) was determined to be 4,172 nucleotides in length, containing four open reading frames (ORFs) with a genetic organization and conceptual translations similar to those of virus species in the genus Carmovirus, family Tombusviridae. The orde...

  9. Method for the detection of specific nucleic acid sequences by polymerase nucleotide incorporation

    DOEpatents

    Castro, Alonso

    2004-06-01

    A method for rapid and efficient detection of a target DNA or RNA sequence is provided. A primer having a 3'-hydroxyl group at one end and having a sequence of nucleotides sufficiently homologous with an identifying sequence of nucleotides in the target DNA is selected. The primer is hybridized to the identifying sequence of nucleotides on the DNA or RNA sequence and a reporter molecule is synthesized on the target sequence by progressively binding complementary nucleotides to the primer, where the complementary nucleotides include nucleotides labeled with a fluorophore. Fluorescence emitted by fluorophores on single reporter molecules is detected to identify the target DNA or RNA sequence.

  10. Nucleotide Sequencing and Identification of Some Wild Mushrooms

    PubMed Central

    Das, Sudip Kumar; Mandal, Aninda; Datta, Animesh K.; Gupta, Sudha; Paul, Rita; Saha, Aditi; Sengupta, Sonali; Dubey, Priyanka Kumari

    2013-01-01

    The rDNA-ITS (Ribosomal DNA Internal Transcribed Spacers) fragment of the genomic DNA of 8 wild edible mushrooms (collected from Eastern Chota Nagpur Plateau of West Bengal, India) was amplified using ITS1 (Internal Transcribed Spacers 1) and ITS2 primers and subjected to nucleotide sequence determination for identification of mushrooms as mentioned. The sequences were aligned using ClustalW software program. The aligned sequences revealed identity (homology percentage from GenBank data base) of Amanita hemibapha [CN (Chota Nagpur) 1, % identity 99 (JX844716.1)], Amanita sp. [CN 2, % identity 98 (JX844763.1)], Astraeus hygrometricus [CN 3, % identity 87 (FJ536664.1)], Termitomyces sp. [CN 4, % identity 90 (JF746992.1)], Termitomyces sp. [CN 5, % identity 99 (GU001667.1)], T. microcarpus [CN 6, % identity 82 (EF421077.1)], Termitomyces sp. [CN 7, % identity 76 (JF746993.1)], and Volvariella volvacea [CN 8, % identity 100 (JN086680.1)]. Although out of 8 mushrooms 4 could be identified up to species level, the nucleotide sequences of the rest may be relevant to further characterization. A phylogenetic tree is constructed using Neighbor-Joining method showing interrelationship between/among the mushrooms. The determined nucleotide sequences of the mushrooms may provide additional information enriching GenBank database aiding to molecular taxonomy and facilitating its domestication and characterization for human benefits. PMID:24489501

  11. Nucleotide sequencing and identification of some wild mushrooms.

    PubMed

    Das, Sudip Kumar; Mandal, Aninda; Datta, Animesh K; Gupta, Sudha; Paul, Rita; Saha, Aditi; Sengupta, Sonali; Dubey, Priyanka Kumari

    2013-01-01

    The rDNA-ITS (Ribosomal DNA Internal Transcribed Spacers) fragment of the genomic DNA of 8 wild edible mushrooms (collected from Eastern Chota Nagpur Plateau of West Bengal, India) was amplified using ITS1 (Internal Transcribed Spacers 1) and ITS2 primers and subjected to nucleotide sequence determination for identification of the mushrooms. The sequences were aligned using the ClustalW software program. The aligned sequences revealed identity (homology percentage from the GenBank database) of Amanita hemibapha [CN (Chota Nagpur) 1, % identity 99 (JX844716.1)], Amanita sp. [CN 2, % identity 98 (JX844763.1)], Astraeus hygrometricus [CN 3, % identity 87 (FJ536664.1)], Termitomyces sp. [CN 4, % identity 90 (JF746992.1)], Termitomyces sp. [CN 5, % identity 99 (GU001667.1)], T. microcarpus [CN 6, % identity 82 (EF421077.1)], Termitomyces sp. [CN 7, % identity 76 (JF746993.1)], and Volvariella volvacea [CN 8, % identity 100 (JN086680.1)]. Although 4 of the 8 mushrooms could be identified to species level, the nucleotide sequences of the rest may be relevant to further characterization. A phylogenetic tree was constructed using the Neighbor-Joining method, showing the interrelationships among the mushrooms. The determined nucleotide sequences of the mushrooms may provide additional information enriching the GenBank database, aiding molecular taxonomy and facilitating their domestication and characterization for human benefits. PMID:24489501

  12. Nucleotide sequence and genome organization of tomato leaf curl geminivirus.

    PubMed

    Dry, I B; Rigden, J E; Krake, L R; Mullineaux, P M; Rezaian, M A

    1993-01-01

    The genome of tomato leaf curl virus (TLCV) from Australia was cloned and its complete nucleotide sequence determined. It is a single circular ssDNA of 2766 nucleotides containing the consensus nonanucleotide sequence present in all geminiviruses. It has six open reading frames with an organization resembling that of certain other dicotyledonous plant-infecting monopartite geminiviruses, i.e. tomato yellow leaf curl and beet curly top viruses. The regulatory sequences present indicate a bidirectional mode of transcription. A dimeric TLCV DNA clone was constructed in a binary vector and used to agroinoculate three different host species. Typical virus infections were produced, confirming that the single DNA component is sufficient for infectivity. PMID:8423446

  13. Single Nucleotide Polymorphism Mapping Using Genome-Wide Unique Sequences

    PubMed Central

    Chen, Leslie Y.Y.; Lu, Szu-Hsien; Shih, Edward S.C.; Hwang, Ming-Jing

    2002-01-01

    As more and more genomic DNAs are sequenced to characterize human genetic variations, the demand for a very fast and accurate method to genomically position these DNA sequences is high. We have developed a new mapping method that does not require sequence alignment. In this method, we first identified DNA fragments of 15 bp in length that are unique in the human genome and then used them to position single nucleotide polymorphism (SNP) sequences. By use of four desktop personal computers with AMD K7 (1 GHz) processors, our new method mapped more than 1.6 million SNP sequences in 20 hr and achieved a very good agreement with mapping results from alignment-based methods. PMID:12097348
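
    The alignment-free idea in this abstract (index the fragments that occur exactly once in the genome, then place a query by looking up its fragments) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the index is an in-memory dictionary, and only the forward strand is considered. The abstract's fragment length of 15 bp is the default `k`.

```python
from collections import defaultdict


def unique_kmer_index(genome: str, k: int = 15) -> dict[str, int]:
    """Map every k-mer that occurs exactly once in `genome` to its offset."""
    positions = defaultdict(list)
    for i in range(len(genome) - k + 1):
        positions[genome[i:i + k]].append(i)
    return {kmer: pos[0] for kmer, pos in positions.items() if len(pos) == 1}


def map_sequence(query: str, index: dict[str, int], k: int = 15):
    """Return the genome offset implied for the start of `query`, or None.

    The first query k-mer found in the unique index decides the position;
    k-mers that are repeated in the genome (or altered by a variant) are
    simply absent from the index and are skipped.
    """
    for j in range(len(query) - k + 1):
        hit = index.get(query[j:j + k])
        if hit is not None:
            return hit - j
    return None


if __name__ == "__main__":
    toy_genome = "ACGTACGTTTGACCA"  # toy data; k=4 keeps the example small
    idx = unique_kmer_index(toy_genome, k=4)
    print(map_sequence("ACGTT", idx, k=4))
```

    In the toy genome the 4-mer ACGT occurs twice and is therefore not indexed, so the query ACGTT is placed by its second 4-mer, CGTT. A production version would need a far more compact index than a Python dictionary and would also handle the reverse strand.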

  14. The primary nucleotide sequence of U4 RNA.

    PubMed

    Reddy, R; Henning, D; Busch, H

    1981-04-10

    U4 RNA is one of the "capped" nuclear snRNAs recently found to be precipitable by anti-Sm antibodies as ribonucleoprotein particles. U4 RNA, along with other snRNAs, has been implicated in hnRNA processing, mRNA transport, or both (Lerner, M. R., Boyle, J., Mount, S., Wolin, S., and Steitz, J. A. (1980) Nature 283, 220-224). Since the proteins bound to different snRNAs appear to be the same, the functions of different snRNPs might be dependent on the RNA components. To help understand the function of U4 RNP, the nucleotide sequence of U4 RNA was determined. The sequence is (formula see text) In addition to the modified nucleotides in the "cap," U4 RNA contains Am at position 63 and m6A at position 98. It also exhibited A-C microheterogeneity at position 97. PMID:6162848

  15. Nucleotide-Specific Contrast for DNA Sequencing by Electron Spectroscopy

    PubMed Central

    Schmid, Andreas K.; Davis, Ronald W.

    2016-01-01

    DNA sequencing by imaging in an electron microscope is an approach that holds promise to deliver long reads with low error rates and without the need for amplification. Earlier work using transmission electron microscopes, which use high electron energies on the order of 100 keV, has shown that low contrast and radiation damage necessitate the use of heavy atom labeling of individual nucleotides, which increases the read error rates. Other prior work using scattering electrons with much lower energy has been shown to suppress beam damage on DNA. Here we explore possibilities to increase contrast by employing two methods, X-ray photoelectron and Auger electron spectroscopy. Using bulk DNA samples with monomers of each base, both methods are shown to provide contrast mechanisms that can distinguish individual nucleotides without labels. Both spectroscopic techniques can be readily implemented in a low energy electron microscope, which may enable label-free DNA sequencing by direct imaging. PMID:27149617

  16. 77 FR 65537 - Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-10-29

    ... Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request. SUMMARY: The United States....'' SUPPLEMENTARY INFORMATION: I. Abstract Patent applications that contain nucleotide and/or amino acid sequence disclosures must include a copy of the sequence listing in accordance with the requirements in 37 CFR...

  17. Complete nucleotide sequence of Saccharomyces cerevisiae chromosome X.

    PubMed Central

    Galibert, F; Alexandraki, D; Baur, A; Boles, E; Chalwatzis, N; Chuat, J C; Coster, F; Cziepluch, C; De Haan, M; Domdey, H; Durand, P; Entian, K D; Gatius, M; Goffeau, A; Grivell, L A; Hennemann, A; Herbert, C J; Heumann, K; Hilger, F; Hollenberg, C P; Huang, M E; Jacq, C; Jauniaux, J C; Katsoulou, C; Karpfinger-Hartl, L

    1996-01-01

    The complete nucleotide sequence of Saccharomyces cerevisiae chromosome X (745 442 bp) reveals a total of 379 open reading frames (ORFs), the coding region covering approximately 75% of the entire sequence. One hundred and eighteen ORFs (31%) correspond to genes previously identified in S. cerevisiae. All other ORFs represent novel putative yeast genes, whose function will have to be determined experimentally. However, 57 of the latter subset (another 15% of the total) encode proteins that show significant analogy to proteins of known function from yeast or other organisms. The remaining ORFs, exhibiting no significant similarity to any known sequence, amount to 54% of the total. General features of chromosome X are also reported, with emphasis on the nucleotide frequency distribution in the environment of the ATG and stop codons, the possible coding capacity of at least some of the small ORFs (<100 codons) and the significance of 46 non-canonical or unpaired nucleotides in the stems of some of the 24 tRNA genes recognized on this chromosome. PMID:8641269
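
    The percentages in this abstract follow directly from the ORF counts it reports. A short Python check, using only the numbers stated above (the variable names are illustrative), shows how the three categories partition the 379 ORFs:

```python
TOTAL_ORFS = 379        # total ORFs reported for chromosome X
KNOWN_GENES = 118       # ORFs matching previously identified genes
SIMILAR_TO_KNOWN = 57   # novel ORFs similar to proteins of known function

# Everything else is an ORF with no significant similarity to a known sequence.
NO_SIMILARITY = TOTAL_ORFS - KNOWN_GENES - SIMILAR_TO_KNOWN


def percent_of_total(count: int) -> int:
    """Share of all ORFs, rounded to the nearest whole percent."""
    return round(100 * count / TOTAL_ORFS)


if __name__ == "__main__":
    for label, count in [("known genes", KNOWN_GENES),
                         ("similar to known proteins", SIMILAR_TO_KNOWN),
                         ("no significant similarity", NO_SIMILARITY)]:
        print(f"{label}: {count} ORFs, about {percent_of_total(count)}%")
```

    The remaining category works out to 204 ORFs, which rounds to the 54% quoted in the abstract.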

  18. The complete nucleotide sequence of pelargonium leaf curl virus.

    PubMed

    McGavin, Wendy J; MacFarlane, Stuart A

    2016-05-01

    Investigation of a tombusvirus isolated from tulip plants in Scotland revealed that it was pelargonium leaf curl virus (PLCV) rather than the originally suggested tomato bushy stunt virus. The complete sequence of the PLCV genome was determined for the first time, revealing it to be 4789 nucleotides in size and to have an organization similar to that of the other, previously described tombusviruses. Primers derived from the sequence were used to construct a full-length infectious clone of PLCV that recapitulates the disease symptoms of leaf curling in systemically infected pelargonium plants. PMID:26906694

  19. Nucleotide sequences of five anti-lysozyme monoclonal antibodies.

    PubMed Central

    Darsley, M J; Rees, A R

    1985-01-01

    The nucleotide sequences of the heavy and light chain immunoglobulin mRNAs derived from five hybridomas (Gloop 1-5) secreting IgGs specific for the loop region of hen egg lysozyme were determined. These monoclonal antibodies recognise three distinct but overlapping epitopes within the loop region. The sequences of two pairs of antibodies with indistinguishable fine specificities were similar in both chains whereas the sequences of antibodies of non-identical specificities were very different. It is proposed that the D-segments expressed in two of the antibodies (Gloop3 and Gloop4) are the products of one, or perhaps two, previously unidentified germ line D-genes. Gloop1 and Gloop2 use a D-segment previously identified in antibodies specific for the hapten 2-phenyloxazolone; however it is recombined in a different reading frame in the anti-lysozyme antibodies, producing a different amino acid sequence. PMID:2410256

  20. Comparing compressed sequences for faster nucleotide BLAST searches.

    PubMed

    Cameron, Michael; Williams, Hugh E

    2007-01-01

    Molecular biologists, geneticists, and other life scientists use the BLAST homology search package as their first step for discovery of information about unknown or poorly annotated genomic sequences. There are two main variants of BLAST: BLASTP for searching protein collections and BLASTN for nucleotide collections. Surprisingly, BLASTN has had very little attention; for example, the algorithms it uses do not follow those described in the 1997 BLAST paper and no exact description has been published. It is important that BLASTN is state-of-the-art: Nucleotide collections such as GenBank dwarf the protein collections in size, they double in size almost yearly, and they take many minutes to search on modern general purpose workstations. This paper proposes significant improvements to the BLASTN algorithms. Each of our schemes is based on compressed bytepacked formats that allow queries and collection sequences to be compared four bases at a time, permitting very fast query evaluation using lookup tables and numeric comparisons. Our most significant innovations are two new, fast gapped alignment schemes that allow accurate sequence alignment without decompression of the collection sequences. Overall, our innovations more than double the speed of BLASTN with no effect on accuracy and have been integrated into our new version of BLAST that is freely available for download from http://www.fsa-blast.org/. PMID:17666756
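
    The abstract describes comparing bytepacked sequences four bases at a time with lookup tables. The Python sketch below illustrates that general idea only; it is not the scheme used in the authors' software, and the 2-bit code assignment, the function names, and the restriction to lengths that are multiples of four are all assumptions made for brevity.

```python
# Assumed 2-bit encoding; any fixed assignment of the four bases would do.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def pack(seq: str) -> bytes:
    """Pack a nucleotide string into bytes, four bases per byte."""
    if len(seq) % 4:
        raise ValueError("this sketch only handles lengths that are multiples of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | CODE[base]
        out.append(byte)
    return bytes(out)


# For each possible XOR of two packed bytes, the number of 2-bit fields that
# are zero, i.e. the number of matching bases among the four positions.
MATCHES = [sum(((x >> shift) & 3) == 0 for shift in (0, 2, 4, 6))
           for x in range(256)]


def count_matches(packed_a: bytes, packed_b: bytes) -> int:
    """Count matching bases between two packed sequences, one byte at a time."""
    return sum(MATCHES[a ^ b] for a, b in zip(packed_a, packed_b))


if __name__ == "__main__":
    print(count_matches(pack("ACGTACGT"), pack("ACGAACGT")))
```

    Each table lookup scores four aligned bases at once without unpacking them, which is the kind of saving the abstract attributes to compressed comparison. Gapped alignment on packed data, which the paper highlights as its main contribution, is not attempted here.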

  1. Bioinformatics comparison of sulfate-reducing metabolism nucleotide sequences

    NASA Astrophysics Data System (ADS)

    Tremberger, G.; Dehipawala, Sunil; Nguyen, A.; Cheung, E.; Sullivan, R.; Holden, T.; Lieberman, D.; Cheung, T.

    2015-09-01

    The sulfate-reducing bacteria can be traced back to 3.5 billion years ago. The thermodynamic details of the sulfur cycle have been well documented. A recent sulfate-reducing bacteria report (Robator, Jungbluth, et al., 2015 Jan, Front. Microbiol) with GenBank nucleotide data has been analyzed in terms of the sulfite reductase (dsrAB) via fractal dimension and entropy values. Comparison to oil field sulfate-reducing sequences was included. The AUCG translational mass fractal dimension versus ATCG transcriptional mass fractal dimension for the low temperature dsrB and dsrA sequences reported in Reference Thirteen shows correlation R-sq ~ 0.79, with a probability of about 3% in simulation. A recent report of using a Cystathionine gamma-lyase sequence to produce CdS quantum dots in a biological method, where the sulfur is reduced just as in the H2S production process, was included for comparison. The AUCG mass fractal dimension versus ATCG mass fractal dimension for the Cystathionine gamma-lyase sequences was found to have R-sq of 0.72, similar to the low temperature dissimilatory sulfite reductase dsr group with 3% probability, in contrast to the oil field group having R-sq ~ 0.94, a highly probable outcome in the simulation. The other two simulation histograms, namely, fractal dimension versus entropy R-sq outcome values, and di-nucleotide entropy versus mono-nucleotide entropy R-sq outcome values, are also discussed in the data analysis focusing on low probability outcomes.
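
    The abstract compares mono-nucleotide and di-nucleotide entropy values but does not spell out how they were computed. A common choice is Shannon entropy over symbol frequencies, sketched below in Python as an assumption rather than as the authors' method; in particular, counting overlapping dinucleotides is a guess, and the fractal-dimension part of the analysis is not covered.

```python
from collections import Counter
from math import log2


def shannon_entropy(symbols) -> float:
    """Shannon entropy in bits of the empirical distribution of `symbols`."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def mono_entropy(seq: str) -> float:
    """Entropy of single-nucleotide frequencies (at most 2 bits for 4 bases)."""
    return shannon_entropy(seq)


def di_entropy(seq: str) -> float:
    """Entropy of overlapping dinucleotide frequencies (at most 4 bits)."""
    return shannon_entropy(seq[i:i + 2] for i in range(len(seq) - 1))


if __name__ == "__main__":
    toy = "ACGTACGTTTGACCA"  # toy sequence, not from the cited data
    print(mono_entropy(toy), di_entropy(toy))
```

    Plotting one quantity against the other across a set of sequences and fitting a line would give an R-sq value of the kind the abstract mentions.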

  2. Cytochrome b nucleotide sequence variation among the Atlantic Alcidae.

    PubMed

    Friesen, V L; Montevecchi, W A; Davidson, W S

    1993-01-01

    Analysis of cytochrome b nucleotide sequences of the six extant species of Atlantic alcids and a gull revealed an excess of adenines and cytosines and a deficit of guanines at silent sites on the coding strand. Phylogenetic analyses grouped the sequences of the common (Uria aalge) and Brünnich's (U. lomvia) guillemots, followed by the razorbill (Alca torda) and little auk (Alle alle). The black guillemot (Cepphus grylle) sequence formed a sister taxon, and the puffin (Fratercula arctica) fell outside the other alcids. Phylogenetic comparisons of substitutions indicated that mutabilities of bases did not differ, but that C was much more likely to be incorporated than was G. Imbalances in base composition appear to result from a strand bias in replication errors, which may result from selection on secondary RNA structure and/or the energetics of codon-anticodon interactions. PMID:7916741

  3. The nucleotide sequence of the bacteriophage T5 ltf gene.

    PubMed

    Kaliman, A V; Kulshin, V E; Shlyapnikov, M G; Ksenzenko, V N; Kryukov, V M

    1995-06-01

    The nucleotide sequence of the bacteriophage T5 BglII-BamHI fragment (4,835 bp in length) known to carry a gene encoding the LTF protein which forms the phage L-shaped tail fibers was determined. It was shown to contain an open reading frame for 1,396 amino acid residues that corresponds to a protein of 147.8 kDa. The coding region of the ltf gene is preceded by a typical Shine-Dalgarno sequence. Downstream from the ltf gene there is a strong transcription terminator. Data bank analysis of the LTF protein sequence reveals 55.1% identity to the hypothetical protein ORF 401 of bacteriophage lambda over a 118-amino-acid overlap. PMID:7789514

  4. Nucleotide sequence and expression of a Drosophila metallothionein.

    PubMed

    Lastowski-Perry, D; Otto, E; Maroni, G

    1985-02-10

    A Drosophila melanogaster cDNA clone was isolated based on its more intense hybridization to RNA sequences from copper-fed larvae than from control larval RNA. This clone showed strong hybridization to mouse metallothionein I cDNA at reduced stringency. Its nucleotide sequence includes an open reading segment which codes for a 40-amino acid protein; this protein is identified as metallothionein based on its similarity to the amino-terminal portion of mammalian and crab metalloproteins. The 10 cysteine residues present occur in five pairs of near vicinal cysteines (Cys-X-Cys). This cDNA sequence hybridized to a 400-nucleotide polyadenylated RNA whose presence in the cells of the alimentary canal of larvae was stimulated by ingestion of cadmium or copper; in other tissues this RNA was present at much lower levels. Mercury, silver, and zinc induced metallothionein to a lesser extent. The level of metallothionein RNA increased very soon after the initiation of metal treatment and reached a maximum after approximately 36 h. PMID:2578462

  5. Nucleotide sequence corresponding to five chemotaxis genes in Escherichia coli.

    PubMed Central

    Mutoh, N; Simon, M I

    1986-01-01

    The nucleotide sequence of DNA which contains five chemotaxis-related genes of Escherichia coli, cheW, cheR, cheB, cheY, and cheZ, and part of the cheA gene was determined. Molecular weights of the polypeptides encoded by these genes were calculated from translated amino acid sequences, and they were 18,100 for cheW, 32,700 for cheR, 37,500 for cheB, 14,100 for cheY, and 24,000 for cheZ. Nucleotide sequences which could act as ribosome-binding sites were found in the upstream region of each gene. After the termination codon of the cheW gene, a typical rho-independent transcription termination signal was observed. There are no other open reading frames long enough to encode polypeptides in this region except those which code for the two previously reported genes tar and tap. PMID:3510184

  6. Nucleotide sequence of Bacillus phage Nf terminal protein gene.

    PubMed Central

    Leavitt, M C; Ito, J

    1987-01-01

    The nucleotide sequence of Bacillus phage Nf gene E has been determined. Gene E codes for phage terminal protein which is the primer necessary for the initiation of DNA replication. The deduced amino acid sequence of Nf terminal protein is approximately 66% homologous with the terminal proteins of Bacillus phages PZA and φ29, and shows similar hydropathy and secondary structure predictions. A serine which has been identified as the residue which covalently links the protein to the 5' end of the genome in φ29, is conserved in all three phages. The hydropathic and secondary structural environment of this serine is similar in these phage terminal proteins and also similar to the linking serine of adenovirus terminal protein. PMID:3601672

  7. Nucleotide sequences specific to Brucella and methods for the detection of Brucella

    DOEpatents

    McCready, Paula M.; Radnedge, Lyndsay; Andersen, Gary L.; Ott, Linda L.; Slezak, Thomas R.; Kuczmarski, Thomas A.

    2009-02-24

    Nucleotide sequences specific to Brucella that serve as markers or signatures for identification of this bacterium were identified. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.

  8. Nucleotide sequences specific to Francisella tularensis and methods for the detection of Francisella tularensis

    DOEpatents

    McCready, Paula M.; Radnedge, Lyndsay; Andersen, Gary L.; Ott, Linda L.; Slezak, Thomas R.; Kuczmarski, Thomas A.; Vitalis, Elizabeth A

    2007-02-06

    Described herein is the identification of nucleotide sequences specific to Francisella tularensis that serve as markers or signatures for identification of this bacterium. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.

  9. Nucleotide sequences specific to Francisella tularensis and methods for the detection of Francisella tularensis

    DOEpatents

    McCready, Paula M.; Radnedge, Lyndsay; Andersen, Gary L.; Ott, Linda L.; Slezak, Thomas R.; Kuczmarski, Thomas A.; Vitalis, Elizabeth A

    2009-02-24

    Described herein is the identification of nucleotide sequences specific to Francisella tularensis that serve as markers or signatures for identification of this bacterium. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.

  10. Nucleotide sequences specific to Yersinia pestis and methods for the detection of Yersinia pestis

    DOEpatents

    McCready, Paula M.; Radnedge, Lyndsay; Andersen, Gary L.; Ott, Linda L.; Slezak, Thomas R.; Kuczmarski, Thomas A.; Motin, Vladinir L.

    2009-02-24

    Nucleotide sequences specific to Yersinia pestis that serve as markers or signatures for identification of this bacterium were identified. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.

  11. Generalized Levy-walk model for DNA nucleotide sequences

    NASA Technical Reports Server (NTRS)

    Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Simons, M.; Stanley, H. E.

    1993-01-01

    We propose a generalized Levy walk to model fractal landscapes observed in noncoding DNA sequences. We find that this model provides a very close approximation to the empirical data and explains a number of statistical properties of genomic DNA sequences such as the distribution of strand-biased regions (those with an excess of one type of nucleotide) as well as local changes in the slope of the correlation exponent $\alpha$. The generalized Levy-walk model simultaneously accounts for the long-range correlations in noncoding DNA sequences and for the apparently paradoxical finding of long subregions of biased random walks (length $l_j$) within these correlated sequences. In the generalized Levy-walk model, the $l_j$ are chosen from a power-law distribution $P(l_j) \propto l_j^{-\mu}$. The correlation exponent $\alpha$ is related to $\mu$ through $\alpha = 2 - \mu/2$ if $2 < \mu < 3$. The model is consistent with the finding of "repetitive elements" of variable length interspersed within noncoding DNA.
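
    To make the model in this abstract concrete, the Python sketch below generates a walk made of biased subregions whose lengths are drawn from a power law with exponent mu, and evaluates the stated relation alpha = 2 - mu/2. It is an illustrative simulation under several assumptions that are not in the abstract: the bias strength, the minimum segment length, the random choice of bias direction per segment, and the inverse-transform sampling of a continuous power law truncated to integers.

```python
import random


def correlation_exponent(mu: float) -> float:
    """alpha = 2 - mu/2, the relation quoted in the abstract for 2 < mu < 3."""
    if not 2 < mu < 3:
        raise ValueError("relation is stated only for 2 < mu < 3")
    return 2 - mu / 2


def levy_walk_steps(n: int, mu: float = 2.5, bias: float = 0.2,
                    l_min: float = 1.0, seed: int = 0) -> list[int]:
    """Return n steps of +1/-1 built from biased segments of power-law length.

    `bias`, `l_min` and `seed` are assumed illustration parameters.
    """
    rng = random.Random(seed)
    steps: list[int] = []
    while len(steps) < n:
        # Inverse-transform sample from a density proportional to l**(-mu), l >= l_min.
        length = int(l_min * (1.0 - rng.random()) ** (-1.0 / (mu - 1.0)))
        length = max(1, min(length, n - len(steps)))
        direction = rng.choice((-1, 1))           # which way this segment leans
        p_up = 0.5 * (1 + direction * bias)       # probability of a +1 step
        steps.extend(1 if rng.random() < p_up else -1 for _ in range(length))
    return steps


if __name__ == "__main__":
    walk = levy_walk_steps(1000)
    print(sum(walk), correlation_exponent(2.5))
```

    In a DNA-walk reading, a +1 step could stand for a purine and a -1 step for a pyrimidine, so each segment corresponds to a strand-biased region of the kind the abstract describes. Estimating the correlation exponent from a simulated walk would require a separate fluctuation analysis that is not shown here.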

  12. Empirical Bayes Estimation of Coalescence Times from Nucleotide Sequence Data.

    PubMed

    King, Leandra; Wakeley, John

    2016-09-01

    We demonstrate the advantages of using information at many unlinked loci to better calibrate estimates of the time to the most recent common ancestor (TMRCA) at a given locus. To this end, we apply a simple empirical Bayes method to estimate the TMRCA. This method is both asymptotically optimal, in the sense that the estimator converges to the true value when the number of unlinked loci for which we have information is large, and has the advantage of not making any assumptions about demographic history. The algorithm works as follows: we first split the sample at each locus into inferred left and right clades to obtain many estimates of the TMRCA, which we can average to obtain an initial estimate of the TMRCA. We then use nucleotide sequence data from other unlinked loci to form an empirical distribution that we can use to improve this initial estimate. PMID:27440864

  13. Complete nucleotide sequence of Nootka lupine vein-clearing virus.

    PubMed

    Robertson, Nancy L; Côté, Fabien; Paré, Christine; Leblanc, Eric; Bergeron, Michel G; Leclerc, Denis

    2007-12-01

    The complete genome sequence of Nootka lupine vein-clearing virus (NLVCV) was determined to be 4,172 nucleotides in length containing four open reading frames (ORFs) with a similar genetic organization of virus species in the genus Carmovirus, family Tombusviridae. The order and gene product size, starting from the 5'-proximal ORF consisted of: (1) polymerase/replicase gene, ORF1 (p27) and ORF1RT (readthrough) (p87), (2) movement proteins ORF2 (p7) and ORF3 (p9), and, (3) the 3'-proximal coat protein ORF4, (p37). The genomic 5'- and 3'-proximal termini contained a short (59 nt) and a relatively longer 405 nt untranslated region, respectively. The longer replicase gene product contained the GDD motif common to RNA-dependent RNA polymerases. Phylogenetically, NLVCV formed a subgroup with the following four carmoviruses when separately comparing the amino acids of the coat protein or replicase protein: Angelonia flower break virus (AnFBV), Carnation mottle virus (CarMV), Pelargonium flower break virus (PFBV), and Saguaro cactus virus (SgCV). Whole genome nucleotide analysis (percent identities) among the carmoviruses with NLVCV suggested a similar pattern. The species demarcation criteria in the genus Carmovirus for the amino acid sequence identity of the polymerase (<52%) and coat (<41%) protein genes restricted NLVCV as a distinct species, and instead, placed it as a tentative strain of CarMV, PFBV, or SgCV when both the polymerase and CP were used as the determining factors. In contrast, the species criteria that included different host ranges with no overlap and lack of serology relatedness between NLVCV and the carmoviruses, suggested that NLVCV was a distinct species. The relatively low cutoff percentages allowed for the polymerase and CP genes to dictate the inclusion/exclusion of a distinct carmovirus species should be reevaluated. Therefore, at this time we have concluded that NLVCV should be classified as a tentative new species in the genus Carmovirus

  14. The nucleotide sequence of the uvrD gene of E. coli.

    PubMed Central

    Finch, P W; Emmerson, P T

    1984-01-01

    The nucleotide sequence of a cloned section of the E. coli chromosome containing the uvrD gene has been determined. The coding region for the UvrD protein consists of 2,160 nucleotides which would direct the synthesis of a polypeptide 720 amino acids long with a calculated molecular weight of 82 kd. The predicted amino acid sequence of the UvrD protein has been compared with the amino acid sequences of other known adenine nucleotide binding proteins and a common sequence has been identified, thought to contribute towards adenine nucleotide binding. PMID:6379604

  15. Spatially localized generation of nucleotide sequence-specific DNA damage

    PubMed Central

    Oh, Dennis H.; King, Brett A.; Boxer, Steven G.; Hanawalt, Philip C.

    2001-01-01

    Psoralens linked to triplex-forming oligonucleotides (psoTFOs) have been used in conjunction with laser-induced two-photon excitation (TPE) to damage a specific DNA target sequence. To demonstrate that TPE can initiate photochemistry resulting in psoralen–DNA photoadducts, target DNA sequences were incubated with psoTFOs to form triple-helical complexes and then irradiated in liquid solution with pulsed 765-nm laser light, which is half the quantum energy required for conventional one-photon excitation, as used in psoralen + UV A radiation (320–400 nm) therapy. Target DNA acquired strand-specific psoralen monoadducts in a light dose-dependent fashion. To localize DNA damage in a model tissue-like medium, a DNA–psoTFO mixture was prepared in a polyacrylamide gel and then irradiated with a converging laser beam targeting the rear of the gel. The highest number of photoadducts formed at the rear while relatively sparing DNA at the front of the gel, demonstrating spatial localization of sequence-specific DNA damage by TPE. To assess whether TPE treatment could be extended to cells without significant toxicity, cultured monolayers of normal human dermal fibroblasts were incubated with tritium-labeled psoralen without TFO to maximize detectable damage and irradiated by TPE. DNA from irradiated cells treated with psoralen exhibited a 4- to 7-fold increase in tritium activity relative to untreated controls. Functional survival assays indicated that the psoralen–TPE treatment was not toxic to cells. These results demonstrate that DNA damage can be simultaneously manipulated at the nucleotide level and in three dimensions. This approach for targeting photochemical DNA damage may have photochemotherapeutic applications in skin and other optically accessible tissues. PMID:11572980

  16. Spatially localized generation of nucleotide sequence-specific DNA damage.

    PubMed

    Oh, D H; King, B A; Boxer, S G; Hanawalt, P C

    2001-09-25

    Psoralens linked to triplex-forming oligonucleotides (psoTFOs) have been used in conjunction with laser-induced two-photon excitation (TPE) to damage a specific DNA target sequence. To demonstrate that TPE can initiate photochemistry resulting in psoralen-DNA photoadducts, target DNA sequences were incubated with psoTFOs to form triple-helical complexes and then irradiated in liquid solution with pulsed 765-nm laser light, which is half the quantum energy required for conventional one-photon excitation, as used in psoralen + UV A radiation (320-400 nm) therapy. Target DNA acquired strand-specific psoralen monoadducts in a light dose-dependent fashion. To localize DNA damage in a model tissue-like medium, a DNA-psoTFO mixture was prepared in a polyacrylamide gel and then irradiated with a converging laser beam targeting the rear of the gel. The highest number of photoadducts formed at the rear while relatively sparing DNA at the front of the gel, demonstrating spatial localization of sequence-specific DNA damage by TPE. To assess whether TPE treatment could be extended to cells without significant toxicity, cultured monolayers of normal human dermal fibroblasts were incubated with tritium-labeled psoralen without TFO to maximize detectable damage and irradiated by TPE. DNA from irradiated cells treated with psoralen exhibited a 4- to 7-fold increase in tritium activity relative to untreated controls. Functional survival assays indicated that the psoralen-TPE treatment was not toxic to cells. These results demonstrate that DNA damage can be simultaneously manipulated at the nucleotide level and in three dimensions. This approach for targeting photochemical DNA damage may have photochemotherapeutic applications in skin and other optically accessible tissues. PMID:11572980

  17. Nucleotide sequence and proposed secondary structure of Columnea latent viroid: a natural mosaic of viroid sequences.

    PubMed Central

    Hammond, R; Smith, D R; Diener, T O

    1989-01-01

    The Columnea latent viroid (CLV) occurs latently in certain Columnea erythrophae plants grown commercially. In potato and tomato, CLV causes potato spindle tuber viroid (PSTV)-like symptoms. Its nucleotide sequence and proposed secondary structure reveal that CLV consists of a single-stranded circular RNA of 370 nucleotides which can assume a rod-like structure with extensive base-pairing characteristic of all known viroids. The electrophoretic mobility of circular CLV under nondenaturing conditions suggests a potential tertiary structure. CLV contains extensive sequence homologies to the PSTV group of viroids but contains a central conserved region identical to that of hop stunt viroid (HSV). CLV also shares some biological properties with each of the two types of viroids. Most probably, CLV is the result of intracellular RNA recombination between an HSV-type and one or more PSTV-type viroids replicating in the same plant. PMID:2602114

  18. Nucleotide sequence of a cloned woodchuck hepatitis virus genome: comparison with the hepatitis B virus sequence.

    PubMed Central

    Galibert, F; Chen, T N; Mandart, E

    1982-01-01

    The complete nucleotide sequence of a woodchuck hepatitis virus genome cloned in Escherichia coli was determined by the method of Maxam and Gilbert. This sequence was found to be 3,308 nucleotides long. Potential ATG initiator triplets and nonsense codons were identified and used to locate regions with a substantial coding capacity. A striking similarity was observed between the organization of human hepatitis B virus and woodchuck hepatitis virus. Nucleotide sequences of these open regions in the woodchuck virus were compared with corresponding regions present in hepatitis B virus. This allowed the location of four viral genes on the L strand and indicated the absence of protein coded by the S strand. Evolution rates of the various parts of the genome as well as of the four different proteins coded by hepatitis B virus and woodchuck hepatitis virus were compared. These results indicated that: (i) the core protein has evolved slightly less rapidly than the other proteins; and (ii) when a region of DNA codes for two different proteins, there is less freedom for the DNA to evolve and, moreover, one of the proteins can evolve more rapidly than the other. A hairpin structure, very well conserved in the two genomes, was located in the only region devoid of coding function, suggesting the location of the origin of replication of the viral DNA. PMID:7086958

  19. Complete nucleotide sequence of a maize chlorotic mottle virus isolate from Nebraska

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complete genome of a maize chlorotic mottle virus isolate from Nebraska (MCMV-NE) was cloned and sequenced. The MCMV-NE genome consists of 4,436 nucleotides and shares 99.5% nucleotide sequence identity with an MCMV isolate from Kansas (MCMV-KS). Of 22 polymorphic sites, most resulted from t...

  20. The SWISS-PROT protein sequence data bank: current status.

    PubMed Central

    Bairoch, A; Boeckmann, B

    1994-01-01

    SWISS-PROT is an annotated protein sequence database established in 1986 and maintained collaboratively, since 1988, by the Department of Medical Biochemistry of the University of Geneva and the EMBL Data Library. The SWISS-PROT protein sequence data bank consists of sequence entries. Sequence entries are composed of different line types, each with its own format. For standardization purposes the format of SWISS-PROT follows as closely as possible that of the EMBL Nucleotide Sequence Database. A sample SWISS-PROT entry is shown in Figure 1. PMID:7937062

  1. Complete nucleotide sequence of the temperate bacteriophage LBR48, a new member of the family Myoviridae.

    PubMed

    Jang, Se Hwan; Yoon, Bo Hyun; Chang, Hyo Ihl

    2011-02-01

    The complete genomic sequence of LBR48, a temperate bacteriophage induced from a lysogenic strain of Lactobacillus brevis, was found to be 48,211 nucleotides long and to contain 90 putative open reading frames. Based on structural characteristics obtained from microscopic analysis and nucleic acid sequence determination, phage LBR48 can be classified as a member of the family Myoviridae. Analysis of the genome showed the conserved gene order of previously reported phages of the family Siphoviridae from lactic acid bacteria, despite low nucleotide sequence similarity. Analysis of the attachment sites revealed 15-nucleotide-long core sequences. PMID:20976608

  2. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... acids are not intended to be embraced by this definition. Any amino acid sequence that contains post-translationally modified amino acids may be described as the amino acid sequence that is initially translated... sequence of four or more amino acids or an unbranched sequence of ten or more nucleotides....

  3. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... acids are not intended to be embraced by this definition. Any amino acid sequence that contains post-translationally modified amino acids may be described as the amino acid sequence that is initially translated... sequence of four or more amino acids or an unbranched sequence of ten or more nucleotides....

  4. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... acids are not intended to be embraced by this definition. Any amino acid sequence that contains post-translationally modified amino acids may be described as the amino acid sequence that is initially translated... sequence of four or more amino acids or an unbranched sequence of ten or more nucleotides....

  5. Nucleotide sequence analysis with polynucleotide kinase and nucleotide `mapping' methods. 5′-Terminal sequence of deoxyribonucleic acid from bacteriophages λ and 424

    PubMed Central

    Murray, Kenneth

    1973-01-01

    The polynucleotide kinase reaction was used in analyses of complex mixtures of oligodeoxynucleotides which were fractionated by various two-dimensional nucleotide `mapping' procedures. Parallel ionophoretic analyses on DEAE-cellulose paper, pH2, and AE-cellulose paper, pH3.5, of venom phosphodiesterase partial digests of 5′-terminally labelled oligonucleotides enabled the sequence of the nucleotides to be deduced uniquely. A `diagonal ionophoresis' method has been used with mixtures of nucleotides. Application of these methods to 5′-terminally labelled DNA from bacteriophage λ gave the terminal sequences pA-G-G-T-C-G and pG-G-G-C-G. Identical 5′-terminal sequences were found with DNA from bacteriophage 424. PMID:4352720

  6. The nucleotide sequence of the mouse immunoglobulin epsilon gene: comparison with the human epsilon gene sequence.

    PubMed Central

    Ishida, N; Ueda, S; Hayashida, H; Miyata, T; Honjo, T

    1982-01-01

    We have determined the nucleotide sequence of the immunoglobulin epsilon gene cloned from newborn mouse DNA. The epsilon gene sequence allows prediction of the amino acid sequence of the constant region of the epsilon chain and comparison of it with sequences of the human epsilon and other mouse immunoglobulin genes. The epsilon gene was shown to be under the weakest selection pressure at the protein level among the immunoglobulin genes although the divergence at the synonymous position is similar. Our results suggest that the epsilon gene may be dispensable, which is in accord with the fact that IgE has only obscure roles in the immune defense system but has an undesirable role as a mediator of hypersensitivity. The sequence data suggest that the human and murine epsilon genes were derived from different ancestors duplicated a long time ago. The amino acid sequence of the epsilon chain is more homologous to those of the gamma chains than the other mouse heavy chains. Two membrane exons, separated by an 80-base intron, were identified 1.7 kb 3' to the CH4 domain of the epsilon gene and shown to conserve a hydrophobic portion similar to those of other heavy chain genes. RNA blot hybridization showed that the epsilon membrane exons are transcribed into two species of mRNA in an IgE hybridoma. PMID:6329728

  7. Completion of the nucleotide sequence of sunn-hemp mosaic virus: a tobamovirus pathogenic to legumes.

    PubMed

    Silver, S; Quan, S; Deom, C M

    1996-01-01

    Sunn-hemp mosaic virus (SHMV) is a member of the tobamovirus group of plant viruses. The nucleotide sequence of the 5'-untranslated region, the 129 kD protein gene, and a portion of the 186 kD protein gene of SHMV was determined. The 4,683 nucleotides (nts) reported here complete the sequence of the SHMV genome and complement previous work (Meshi, Ohno, and Okada, Nucleic Acids Res. 10, 6111-6117 [1982]; Mol. Gen. Genet. 184, 20-25 [1981]) to provide the first complete nucleotide sequence for a tobamovirus that is pathogenic to leguminous plants. PMID:8938983

  8. Complete nucleotide sequence of the genomic RNA of tobacco mosaic virus strain Cg.

    PubMed

    Yamanaka, T; Komatani, H; Meshi, T; Naito, S; Ishikawa, M; Ohno, T

    1998-01-01

    Tobacco mosaic virus (TMV)-Cg is a crucifer-infecting tobamovirus that was isolated from field-grown garlic. We determined the complete nucleotide sequence of the genomic RNA of TMV-Cg. The genomic RNA of TMV-Cg consists of 6303 nucleotides and encodes four large open reading frames, organized basically in the same way as that of other tobamoviruses. The nucleotide and deduced amino acid sequences are very similar to those of the other crucifer-infecting tobamoviruses that have been sequenced so far. PMID:9608662

  9. Matrix genes of measles virus and canine distemper virus: cloning, nucleotide sequences, and deduced amino acid sequences.

    PubMed Central

    Bellini, W J; Englund, G; Richardson, C D; Rozenblatt, S; Lazzarini, R A

    1986-01-01

    The nucleotide sequences encoding the matrix (M) proteins of measles virus (MV) and canine distemper virus (CDV) were determined from cDNA clones containing these genes in their entirety. In both cases, single open reading frames specifying basic proteins of 335 amino acid residues were predicted from the nucleotide sequences. Both viral messages were composed of approximately 1,450 nucleotides and contained 400 nucleotides of presumptive noncoding sequences at their respective 3' ends. MV and CDV M-protein-coding regions were 67% homologous at the nucleotide level and 76% homologous at the amino acid level. Only chance homology was observed in the 400-nucleotide trailer sequences. Comparisons of the M protein sequences of MV and CDV with the sequence reported for Sendai virus (B. M. Blumberg, K. Rose, M. G. Simona, L. Roux, C. Giorgi, and D. Kolakofsky, J. Virol. 52:656-663; Y. Hidaka, T. Kanda, K. Iwasaki, A. Nomoto, T. Shioda, and H. Shibuta, Nucleic Acids Res. 12:7965-7973) indicated the greatest homology among these M proteins in the carboxyterminal third of the molecule. Secondary-structure analyses of this shared region indicated a structurally conserved, hydrophobic sequence which possibly interacted with the lipid bilayer. PMID:3754588

  10. The complete nucleotide sequence and genome organization of Red clover vein mosaic virus (genus Carlavirus)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Red clover vein mosaic virus (RCVMV) is a serious pathogen of legume crops including pea, chickpea and lentil. The complete nucleotide sequence was generated from an isolate obtained from chickpea in Washington State. The complete genome of RCVMV consists of 8605 nucleotides excluding the poly(A) ...

  11. Molecular cloning and sequencing of a novel human P2 nucleotide receptor.

    PubMed

    Southey, M C; Hammet, F; Hutchins, A M; Paidhungat, M; Somers, G R; Venter, D J

    1996-11-11

    A novel human P2 nucleotide receptor has been cloned from a T-cell cDNA library. The predicted amino acid sequence shows characteristics of a G-protein-coupled receptor, and shares 88% homology with a recently characterised rat P2 nucleotide receptor sequence. Distinctive features include an extremely short cytoplasmic tail with only one putative protein kinase C phosphorylation site. Northern blot analysis revealed a 1.9 kb transcript expressed in the placenta. PMID:8950181

  12. Complete nucleotide sequence of a new isolate of passion fruit woodiness virus from Western Australia.

    PubMed

    Fukumoto, Tomohiro; Nakamura, Masayuki; Wylie, Stephen J; Chiaki, Yuya; Iwai, Hisashi

    2013-08-01

    We determined the complete genome sequence of the passion fruit woodiness virus Gld-1 isolate (PWV-Gld-1) from Australia and compared it with that of PWV-MU-2, another Australian isolate of PWV. The genomes shared high sequence identity in both the complete nucleotide sequence and the ORF amino acid sequence. All of the cleavage sites of each protein were identical to those of MU-2, and the sequence identity for the individual proteins ranged from 97.2 % to 100.0 %. However, the 5' untranslated region (5'UTR) of the Gld-1 isolate shared only 46.8 % sequence identity with that of PWV-MU-2 and was 177 nucleotides shorter. Re-sequencing of the 5'UTR of MU-2 revealed that the 5' end of the original sequence includes an artifact generated by deep sequencing. PMID:23508550

  13. Evaluation of intra- and interspecific divergence of satellite DNA sequences by nucleotide frequency calculation and pairwise sequence comparison

    PubMed Central

    2003-01-01

    Satellite DNA sequences are known to be highly variable and to have been subjected to concerted evolution that homogenizes member sequences within species. We have analyzed the mode of evolution of satellite DNA sequences in four fishes from the genus Diplodus by calculating the nucleotide frequency of the sequence array and the phylogenetic distances between member sequences. Calculation of nucleotide frequency and pairwise sequence comparison enabled us to characterize the divergence among member sequences in this satellite DNA family. The results suggest that the evolutionary rate of satellite DNA in D. bellottii is about two-fold greater than the average of the other three fishes, and that the sequence homogenization event occurred in D. puntazzo more recently than in the others. The procedures described here are effective to characterize mode of evolution of satellite DNA. PMID:12734555

  14. Evaluation of intra- and interspecific divergence of satellite DNA sequences by nucleotide frequency calculation and pairwise sequence comparison.

    PubMed

    Kato, Mikio

    2003-01-01

    Satellite DNA sequences are known to be highly variable and to have been subjected to concerted evolution that homogenizes member sequences within species. We have analyzed the mode of evolution of satellite DNA sequences in four fishes from the genus Diplodus by calculating the nucleotide frequency of the sequence array and the phylogenetic distances between member sequences. Calculation of nucleotide frequency and pairwise sequence comparison enabled us to characterize the divergence among member sequences in this satellite DNA family. The results suggest that the evolutionary rate of satellite DNA in D. bellottii is about two-fold greater than the average of the other three fishes, and that the sequence homogenization event occurred in D. puntazzo more recently than in the others. The procedures described here are effective to characterize mode of evolution of satellite DNA. PMID:12734555
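
    The two measurements named in this abstract, per-position nucleotide frequency and pairwise comparison of member sequences, can be sketched as follows. This is a minimal illustration and not the author's procedure: it assumes the repeat units are already aligned to equal length, uses the simple proportion of differing sites (p-distance) in place of whatever phylogenetic distance the paper applied, and the function names and toy sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def nucleotide_frequencies(seqs):
    """Per-column base frequencies for equal-length, pre-aligned sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    freqs = []
    for column in zip(*seqs):
        counts = Counter(column)
        freqs.append({base: counts.get(base, 0) / len(column) for base in "ACGT"})
    return freqs


def p_distance(a, b):
    """Proportion of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_pairwise_distance(seqs):
    """Average p-distance over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Toy repeat units (assumed data, not taken from the paper).
monomers = ["ACGTAC", "ACGTAT", "ACCTAC"]
column_freqs = nucleotide_frequencies(monomers)
divergence = mean_pairwise_distance(monomers)
```

    A lower mean pairwise distance within a species than between species would be one way to read the homogenization the abstract describes, under the assumptions above.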

  15. Nucleotide sequence of Neurospora crassa cytoplasmic initiator tRNA.

    PubMed Central

    Gillum, A M; Hecker, L I; Silberklang, M; Schwartzbach, S D; RajBhandary, U L; Barnett, W E

    1977-01-01

    Initiator methionine tRNA from the cytoplasm of Neurospora crassa has been purified and sequenced. The sequence is: pAGCUGCAUm1GGCGCAGCGGAAGCGCM22GCY*GGGCUCAUt6AACCCGGAGm7GU (or D) - CACUCGAUCGm1AAACGAG*UUGCAGCUACCAOH. Similar to initiator tRNAs from the cytoplasm of other eukaryotes, this tRNA also contains the sequence -AUCG- instead of the usual -TphiCG (or A)- found in loop IV of other tRNAs. The sequence of the N. crassa cytoplasmic initiator tRNA is quite different from that of the corresponding mitochondrial initiator tRNA. Comparison of the sequence of N. crassa cytoplasmic initiator tRNA to those of yeast, wheat germ and vertebrate cytoplasmic initiator tRNA indicates that the sequences of the two fungal tRNAs are no more similar to each other than they are to those of other initiator tRNAs. PMID:146192

  16. Nucleotide sequence of the gene for human prothrombin

    SciTech Connect

    Degen, S.J.F.; Davie, E.W.

    1987-09-22

    A human genomic DNA library was screened for the gene coding for human prothrombin with a cDNA coding for the human protein. Eighty-one positive lambda phage were identified, and three were chosen for further characterization. These three phage hybridized with 5' and/or 3' probes prepared from the prothrombin cDNA. The complete DNA sequence of 21 kilobases of the human prothrombin gene was determined and included a 4.9-kilobase region that was previously sequenced. The gene for human prothrombin contains 14 exons separated by 13 intervening sequences. The exons range in size from 25 to 315 base pairs, while the introns range from 84 to 9447 base pairs. Ninety percent of the gene is composed of intervening sequence. All the intron splice junctions are consistent with sequences found in other eukaryotic genes, except for the presence of GC rather than GT on the 5' end of intervening sequence L. Thirty copies of Alu repetitive DNA and two copies of partial KpnI repeats were identified in clusters within several of the intervening sequences, and these repeats represent 40% of the DNA sequence of the gene. The size, distribution, and sequence homology of the introns within the gene were then compared to those of the genes for the other vitamin K dependent proteins and several other serine proteases.

  17. Nucleotide sequence of 3' untranslated portion of human alpha globin mRNA.

    PubMed Central

    Wilson, J T; deRiel, J K; Forget, B G; Marotta, C A; Weissman, S M

    1977-01-01

    We have determined the nucleotide sequence of 75 nucleotides of the 3'-untranslated portion of normal human alpha globin mRNA which corresponds to the elongated amino acid sequence of the chain termination mutant Hb Constant Spring. This was accomplished by sequence analysis of cDNA fragments obtained by restriction endonuclease or T4 endonuclease IV cleavage of human globin cDNA synthesized from globin mRNA by use of viral reverse transcriptase. Analysis of cRNA synthesized from cDNA by use of RNA polymerase provided additional confirmatory sequence information. Possible polymorphism has been identified at one site of the sequence. Our sequence overlaps with, and extends, the sequence of 43 nucleotides determined by Proudfoot and coworkers for the very 3'-terminal portion of human alpha globin mRNA. The complete 3'-untranslated sequence of human alpha globin mRNA (112 nucleotides including termination codon) shows little homology to that of the human or rabbit beta globin mRNAs except for the presence of the hexanucleotide sequence AAUAAA which is found in most eukaryotic mRNAs near the 3'-terminal poly(A). PMID:909779

  18. Nucleotide sequences of 5S ribosomal RNA from four oomycete and chytrid water molds.

    PubMed

    Walker, W F; Doolittle, W F

    1982-09-25

    The nucleotide sequences of the 5S rRNAs of the oomycete water molds Saprolegnia ferax and Pythium hydnosporum and of the chytrid water molds Blastocladiella simplex and Phlyctochytrium irregulare were determined by chemical and enzymatic partial degradation of 3' and 5' end-labelled molecules, followed by gel sequence analysis. The two oomycete sequences differed in 24 positions and the two chytrid sequences differed in 27 positions. These pairs differed in a mean of 44 positions. The chytrid sequences clearly most resemble the sequence from the zygomycete Phycomyces, while the oomycete sequences appear to be allied with those from protozoa and slime molds. PMID:6890670

  19. CLEANUP: a fast computer program for removing redundancies from nucleotide sequence databases.

    PubMed

    Grillo, G; Attimonelli, M; Liuni, S; Pesole, G

    1996-02-01

    A key concept in comparing sequence collections is the issue of redundancy. The production of sequence collections free from redundancy is undoubtedly very useful, both in performing statistical analyses and accelerating extensive database searching on nucleotide sequences. Indeed, publicly available databases contain multiple entries of identical or almost identical sequences. Performing statistical analysis on such biased data makes the risk of assigning high significance to non-significant patterns very high. In order to carry out unbiased statistical analysis as well as more efficient database searching it is thus necessary to analyse sequence data that have been purged of redundancy. Given that an unambiguous definition of redundancy is impracticable for biological sequence data, in the present program a quantitative description of redundancy will be used, based on the measure of sequence similarity. A sequence is considered redundant if it shows a degree of similarity and overlapping with a longer sequence in the database greater than a threshold fixed by the user. In this paper we present a new algorithm based on an "approximate string matching" procedure, which is able to determine the overall degree of similarity between each pair of sequences contained in a nucleotide sequence database and to generate automatically nucleotide sequence collections free from redundancies. PMID:8670613
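
    The redundancy rule stated in this abstract (drop a sequence when its similarity to a longer sequence exceeds a user-set threshold) can be illustrated with a short sketch. It is not the CLEANUP algorithm: the standard-library difflib.SequenceMatcher stands in for the paper's approximate string matching procedure, similarity is assumed to be the fraction of the shorter sequence covered by matching blocks, and the function names, the 0.9 default threshold and the toy sequences are all hypothetical.

```python
from difflib import SequenceMatcher


def similarity(shorter, longer):
    """Fraction of `shorter` covered by blocks that also occur, in order, in `longer`."""
    if not shorter:
        return 1.0  # an empty sequence is trivially contained in any other
    matcher = SequenceMatcher(None, shorter, longer, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(shorter)


def remove_redundant(seqs, threshold=0.9):
    """Keep a sequence only if no longer (or equally long) kept sequence is too similar."""
    kept = []
    # Visit longest sequences first so each candidate is compared only
    # against sequences at least as long as itself.
    for seq in sorted(seqs, key=len, reverse=True):
        if not any(similarity(seq, longer) >= threshold for longer in kept):
            kept.append(seq)
    return kept


# Toy collection (assumed data): the second entry is a prefix of the first.
collection = ["ACGTACGTACGT", "ACGTACGTAC", "TTTTGGGGCCCC"]
non_redundant = remove_redundant(collection)
```

    The pairwise loop is quadratic in the number of sequences, so this form only suits small collections; the abstract describes CLEANUP as fast, which the sketch does not attempt to reproduce.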

  20. Nucleotide sequences of 5S rRNAs from four jellyfishes.

    PubMed

    Hori, H; Ohama, T; Kumazaki, T; Osawa, S

    1982-11-25

    The nucleotide sequences of 5S rRNAs from four jellyfishes, Spirocodon saltatrix, Nemopsis dofleini, Aurelia aurita and Chrysaora quinquecirrha have been determined. The sequences are highly similar to each other. A fairly high similarity was also found between these jellyfishes and a sea anemone, Anthopleura japonica. PMID:6130512

  1. The nucleotide sequence of the tnpA gene completes the sequence of the Pseudomonas transposon Tn501.

    PubMed Central

    Brown, N L; Winnie, J N; Fritzinger, D; Pridmore, R D

    1985-01-01

    The nucleotide sequence of the gene (tnpA) which codes for the transposase of transposon Tn501 has been determined. It contains an open reading frame for a polypeptide of Mr = 111,500, which terminates within the inverted repeat sequence of the transposon. The reading frame would be transcribed in the same direction as the mercury-resistance genes and the tnpR gene. The amino acid sequence predicted from this reading frame shows 32% identity with that of the transposase of the related transposon Tn3. The C-terminal regions of these two polypeptides show slightly greater homology than the N-terminal regions when conservative amino acid substitutions are considered. With this sequence determination, the nucleotide sequence of Tn501 is fully defined. The main features of the sequence are briefly presented. PMID:2994007

  2. Diverse nucleotide compositions and sequence fluctuation in Rubisco protein genes

    NASA Astrophysics Data System (ADS)

    Holden, Todd; Dehipawala, S.; Cheung, E.; Bienaime, R.; Ye, J.; Tremberger, G., Jr.; Schneider, P.; Lieberman, D.; Cheung, T.

    2011-10-01

    The Rubisco protein-enzyme is arguably the most abundant protein on Earth. The biology dogma of transcription and translation necessitates the study of the Rubisco genes and Rubisco-like genes in various species. Stronger correlation of fractal dimension of the atomic number fluctuation along a DNA sequence with Shannon entropy has been observed in the studied Rubisco-like gene sequences, suggesting a more diverse evolutionary pressure and constraints in the Rubisco sequences. The strategy of using metal for structural stabilization appears to be an ancient mechanism, with data from the porphobilinogen deaminase gene in Capsaspora owczarzaki and Monosiga brevicollis. Using the chi-square distance probability, our analysis supports the conjecture that the more ancient Rubisco-like sequence in Microcystis aeruginosa would have experienced very different evolutionary pressure and bio-chemical constraint as compared to Bordetella bronchiseptica, the two microbes occupying either end of the correlation graph. Our exploratory study would indicate that high fractal dimension Rubisco sequence would support high carbon dioxide rate via the Michaelis-Menten coefficient; with implication for the control of the whooping cough pathogen Bordetella bronchiseptica, a microbe containing a high fractal dimension Rubisco-like sequence (2.07). Using the internal comparison of chi-square distance probability for 16S rRNA (~ E-22) versus radiation repair Rec-A gene (~ E-05) in high GC content Deinococcus radiodurans, our analysis supports the conjecture that high GC content microbes containing Rubisco-like sequence are likely to include an extra-terrestrial origin, relative to Deinococcus radiodurans. Similar photosynthesis process that could utilize host star radiation would not compete with radiation resistant process from the biology dogma perspective in environments such as Mars and exoplanets.

  3. Whole-genome sequence of Clostridium lituseburense L74, isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus

    PubMed Central

    Lee, Yookyung; Lim, Sooyeon; Rhee, Moon-Soo; Chang, Dong-Ho; Kim, Byoung-Chan

    2016-01-01

    Clostridium lituseburense L74 was isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus collected in Yeong-dong, Chuncheongbuk-do, South Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession NZ_LITJ00000000. PMID:26981432

  4. Whole-genome sequence of Clostridium lituseburense L74, isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus.

    PubMed

    Lee, Yookyung; Lim, Sooyeon; Rhee, Moon-Soo; Chang, Dong-Ho; Kim, Byoung-Chan

    2016-03-01

    Clostridium lituseburense L74 was isolated from the larval gut of the rhinoceros beetle, Trypoxylus dichotomus collected in Yeong-dong, Chuncheongbuk-do, South Korea and subjected to whole genome sequencing on HiSeq platform and annotated on RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession NZ_LITJ00000000. PMID:26981432

  5. Complete nucleotide sequence of Alfalfa mosaic virus isolated from alfalfa (Medicago sativa L.) in Argentina.

    PubMed

    Trucco, Verónica; de Breuil, Soledad; Bejerman, Nicolás; Lenardon, Sergio; Giolitti, Fabián

    2014-06-01

    The complete nucleotide sequence of an Alfalfa mosaic virus (AMV) isolate infecting alfalfa (Medicago sativa L.) in Argentina, AMV-Arg, was determined. The virus genome has the typical organization described for AMV, and comprises 3,643, 2,593, and 2,038 nucleotides for RNA1, 2 and 3, respectively. The whole genome sequence and each encoding region were compared with those of four other isolates that have been completely sequenced from China, Italy, Spain and USA. The nucleotide identity percentages ranged from 95.9 to 99.1 % for the three RNAs and from 93.7 to 99 % for the protein 1 (P1), protein 2 (P2), movement protein and coat protein (CP) encoding regions, whereas the amino acid identity percentages of these proteins ranged from 93.4 to 99.5 %, the lowest value corresponding to P2. CP sequences of AMV-Arg were compared with those of 25 other available isolates, and the phylogenetic analysis based on the CP gene was carried out. The highest percentage of nucleotide sequence identity of the CP gene was 98.3 % with a Chinese isolate and 98.6 % at the amino acid level with four isolates, two from Italy, one from Brazil and the remaining one from China. The phylogenetic analysis showed that AMV-Arg is closely related to subgroup I of AMV isolates. To our knowledge, this is the first report of a complete nucleotide sequence of AMV from South America and the first worldwide report of a complete nucleotide sequence of AMV isolated from alfalfa as natural host. PMID:24510307

  6. Nucleotide sequence of a human tRNA gene heterocluster

    SciTech Connect

    Chang, Y.N.; Pirtle, I.L.; Pirtle, R.M.

    1986-05-01

    Leucine tRNA from bovine liver was used as a hybridization probe to screen a human gene library harbored in Charon-4A of bacteriophage lambda. The human DNA inserts from plaque-pure clones were characterized by restriction endonuclease mapping and Southern hybridization techniques, using both (3'-³²P)-labeled bovine liver leucine tRNA and total tRNA as hybridization probes. An 8-kb Hind III fragment of one of these γ-clones was subcloned into the Hind III site of pBR322. Subsequent fine restriction mapping and DNA sequence analysis of this plasmid DNA indicated the presence of four tRNA genes within the 8-kb DNA fragment. A leucine tRNA gene with an anticodon of AAG and a proline tRNA gene with an anticodon of AGG are in a 1.6-kb subfragment. A threonine tRNA gene with an anticodon of UGU and an as yet unidentified tRNA gene are located in a 1.1-kb subfragment. These two different subfragments are separated by 2.8 kb. The coding regions of the three sequenced genes contain characteristic internal split promoter sequences and do not have intervening sequences. The 3'-flanking regions of these three genes have typical RNA polymerase III termination sites of at least four consecutive T residues.

  7. Methods for making nucleotide probes for sequencing and synthesis

    DOEpatents

    Church, George M; Zhang, Kun; Chou, Joseph

    2014-07-08

    Compositions and methods for making a plurality of probes for analyzing a plurality of nucleic acid samples are provided. Compositions and methods for analyzing a plurality of nucleic acid samples to obtain sequence information in each nucleic acid sample are also provided.

  8. Cloning and nucleotide sequence of the Lactobacillus casei lactate dehydrogenase gene.

    PubMed Central

    Kim, S F; Baek, S J; Pack, M Y

    1991-01-01

    An allosteric L-(+)-lactate dehydrogenase gene of Lactobacillus casei ATCC 393 was cloned in Escherichia coli, and the nucleotide sequence of the gene was determined. The gene was composed of an open reading frame of 981 bp, starting with a GTG codon and ending with a TAA codon. The sequences for the promoter and ribosome binding site were identified, and a sequence for a structure resembling a rho-independent transcription terminator was also found. PMID:1768113

  9. Nucleotide Sequence Diversity and Linkage Disequilibrium of Four Nuclear Loci in Foxtail Millet (Setaria italica)

    PubMed Central

    He, Shui-lian; Yang, Yang; Morrell, Peter L.; Yi, Ting-shuang

    2015-01-01

    Foxtail millet (Setaria italica (L.) Beauv) is one of the earliest domesticated grains, which has been cultivated in northern China by 8,700 years before present (YBP) and across Eurasia by 4,000 YBP. Owing to a small genome and diploid nature, foxtail millet is a tractable model crop for studying functional genomics of millets and bioenergy grasses. In this study, we examined nucleotide sequence diversity, geographic structure, and levels of linkage disequilibrium at four nuclear loci (ADH1, G3PDH, IGS1 and TPI1) in representative samples of 311 landrace accessions across its cultivated range. Higher levels of nucleotide sequence and haplotype diversity were observed in samples from China relative to other sampled regions. Genetic assignment analysis classified the accessions into seven clusters based on nucleotide sequence polymorphisms. Intralocus LD decayed rapidly to half the initial value within ~1.2 kb or less. PMID:26325578

  10. Nucleotide sequence heterogeneity of alpha satellite repetitive DNA: a survey of alphoid sequences from different human chromosomes.

    PubMed Central

    Waye, J S; Willard, H F

    1987-01-01

    The human alpha satellite DNA family is composed of diverse, tandemly reiterated monomer units of approximately 171 basepairs localized to the centromeric region of each chromosome. These sequences are organized in a highly chromosome-specific manner with many, if not all human chromosomes being characterized by individually distinct alphoid subsets. Here, we compare the nucleotide sequences of 153 monomer units, representing alphoid components of at least 12 different human chromosomes. Based on the analysis of sequence variation at each position within the 171 basepair monomer, we have derived a consensus sequence for the monomer unit of human alpha satellite DNA which we suggest may reflect the monomer sequence from which different chromosomal subsets have evolved. Sequence heterogeneity is evident at each position within the consensus monomer unit and there are no positions of strict nucleotide sequence conservation, although some regions are more variable than others. A substantial proportion of the overall sequence variation may be accounted for by nucleotide changes which are characteristic of monomer components of individual chromosomal subsets or groups of subsets which have a common evolutionary history. PMID:3658703
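
    Deriving a consensus from per-position variation, as this abstract describes for the 171-basepair monomer, can be sketched with a simple majority vote over aligned columns. This is an illustrative assumption rather than the authors' method: the monomers are taken to be pre-aligned and gap-free, ties are broken by first occurrence, and the function name and toy monomers are hypothetical.

```python
from collections import Counter


def consensus(monomers):
    """Return the majority-rule consensus and the per-position agreement fraction."""
    length = len(monomers[0])
    if any(len(m) != length for m in monomers):
        raise ValueError("monomers must be aligned to the same length")
    bases, agreement = [], []
    for column in zip(*monomers):
        # most_common(1) gives the most frequent base in this column;
        # on a tie, the base seen first in the column wins.
        base, count = Counter(column).most_common(1)[0]
        bases.append(base)
        agreement.append(count / len(column))
    return "".join(bases), agreement


# Toy aligned monomers (assumed data, far shorter than the real 171 bp unit).
toy_monomers = ["ACGT", "ACGA", "TCGT"]
consensus_seq, agreement = consensus(toy_monomers)
```

    Positions where the agreement fraction is well below 1.0 correspond to the more variable regions the abstract mentions; an agreement of exactly 1.0 at a position would indicate strict conservation in the sample.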

  11. An Integrated System for DNA Sequencing by Synthesis Using Novel Nucleotide Analogues

    PubMed Central

    Guo, Jia; Yu, Lin; Turro, Nicholas J.; Ju, Jingyue

    2010-01-01

    The Human Genome Project has concluded, but its successful completion has increased, rather than decreased, the need for high-throughput DNA sequencing technologies. The possibility of clinically screening a full genome for an individual's mutations offers tremendous benefits, both for pursuing personalized medicine as well as uncovering the genomic contributions to diseases. The Sanger sequencing method, although enormously productive for more than 30 years, requires an electrophoretic separation step that, unfortunately, remains a key technical obstacle for achieving economically acceptable full-genome results. Alternative sequencing approaches thus focus on innovations that can reduce costs. The DNA sequencing by synthesis (SBS) approach has shown great promise as a new sequencing platform, with particular progress reported recently. The general fluorescent SBS approach involves (i) incorporation of nucleotide analogs bearing fluorescent reporters, (ii) identification of the incorporated nucleotide by its fluorescent emissions, and (iii) cleavage of the fluorophore, along with the reinitiation of the polymerase reaction for continuing sequence determination. In this Account, we review the construction of a DNA-immobilized chip and the development of novel nucleotide reporters for the SBS sequencing platform. Click chemistry, with its high selectivity and coupling efficiency, was explored for surface immobilization of DNA. The first generation (G-1) modified nucleotides for SBS feature a small chemical moiety capping the 3′-OH and a fluorophore tethered to the base through a chemically cleavable linker; the design ensures that the nucleotide reporters are good substrates for the polymerase. The 3′-capping moiety and the fluorophore on the DNA extension products, generated by the incorporation of the G-1 modified nucleotides, are cleaved simultaneously to reinitiate the polymerase reaction. The sequence of a DNA template immobilized on a surface

  12. Complete nucleotide sequence of the human corticotropin-beta-lipotropin precursor gene.

    PubMed Central

    Takahashi, H; Hakamata, Y; Watanabe, Y; Kikuno, R; Miyata, T; Numa, S

    1983-01-01

    The nucleotide sequence of an 8658-base-pair human genomic DNA segment containing the entire corticotropin-beta-lipotropin precursor gene has been determined, and some sequence features of the gene and its flanking regions have been analysed. The gene is composed of 7665 base pairs including two introns of 3708 and 2886 base pairs. Comparison of the 5'-flanking sequences of the human, bovine and mouse corticotropin-beta-lipotropin precursor genes reveals the presence of a highly conserved region, which contains sequences of 14-15 base pairs homologous with sequences located upstream of the mRNA start site of other glucocorticoid-regulated genes. PMID:6314261

  13. Nucleotide sequence of a small cryptic plasmid from Acidithiobacillus ferrooxidans strain A-6

    SciTech Connect

    F. Roberto

    2003-10-01

    A 2.1 kb cryptic plasmid from Acidithiobacillus ferrooxidans strain A-6 was isolated and cloned into the E. coli vector plasmid, pUC128. The cloned plasmid was mapped by restriction enzyme fragment analysis and subsequently sequenced. At this time over half the plasmid sequence has been determined and compared to sequences in the GenBank nucleotide and protein sequence databases. Much of the plasmid remains cryptic, but substantial nucleotide and protein sequence similarities have been observed to the putative replication protein, RepA, of the small cryptic plasmids pAYS and pAYL found in the ammonia-oxidizing Nitrosomonas sp. strain ENI-11. These results suggest an entirely new class of plasmid is maintained in at least one strain of Acidithiobacillus ferrooxidans and other acidophilic bacteria, and raise interesting questions about the origin of this plasmid in acidic environments.

  14. The complete nucleotide sequence and genomic characterization of tropical soda apple mosaic virus.

    PubMed

    Fillmer, Kornelia; Adkins, Scott; Pongam, Patchara; D'Elia, Tom

    2016-08-01

    We report the first complete genome sequence of tropical soda apple mosaic virus (TSAMV), a tobamovirus originally isolated from tropical soda apple (Solanum viarum) collected in Okeechobee, Florida. The complete genome of TSAMV is 6,350 nucleotides long and contains four open reading frames encoding the following proteins: i) 126-kDa methyltransferase/helicase (3354 nt), ii) 183-kDa polymerase (4839 nt), iii) movement protein (771 nt) and iv) coat protein (483 nt). The complete genome sequence of TSAMV shares 80.4 % nucleotide sequence identity with pepper mild mottle virus (PMMoV) and 71.2-74.2 % identity with other tobamoviruses naturally infecting members of the Solanaceae plant family. Phylogenetic analysis of the deduced amino acid sequences of the 126-kDa and 183-kDa proteins and the complete genome sequence place TSAMV in a subcluster with PMMoV within the Solanaceae-infecting subgroup of tobamoviruses. PMID:27169599

  15. Complete nucleotide sequence and transcriptional analysis of snakehead fish retrovirus.

    PubMed Central

    Hart, D; Frerichs, G N; Rambaut, A; Onions, D E

    1996-01-01

    The complete genome of the snakehead fish retrovirus has been cloned and sequenced, and its transcriptional profile in cell culture has been determined. The 11.2-kb provirus displays a complex expression pattern capable of encoding accessory proteins and is unique in the predicted location of the env initiation codon and signal peptide upstream of gag and the common splice donor site. The virus is distinguishable from all known retrovirus groups by the presence of an arginine tRNA primer binding site. The coding regions are highly divergent and show a number of unusual characteristics, including a large Gag coiled-coil region, a Pol domain of unknown function, and a long, lentiviral-like, Env cytoplasmic domain. Phylogenetic analysis of the Pol sequence emphasizes the divergent nature of the virus from the avian and mammalian retroviruses. The snakehead virus is also distinct from a previously characterized complex fish retrovirus, suggesting that discrete groups of these viruses have yet to be identified in the lower vertebrates. PMID:8648695

  16. Nucleotide composition of CO1 sequences in Chelicerata (Arthropoda): detecting new mitogenomic rearrangements.

    PubMed

    Arabi, Juliette; Judson, Mark L I; Deharveng, Louis; Lourenço, Wilson R; Cruaud, Corinne; Hassanin, Alexandre

    2012-02-01

    Here we study the evolution of nucleotide composition in third codon-positions of CO1 sequences of Chelicerata, using a phylogenetic framework, based on 180 taxa and three markers (CO1, 18S, and 28S rRNA; 5,218 nt). The analyses of nucleotide composition were also extended to all CO1 sequences of Chelicerata found in GenBank (1,701 taxa). The results show that most species of Chelicerata have a positive strand bias in CO1, i.e., in favor of C nucleotides, including all Amblypygi, Palpigradi, Ricinulei, Solifugae, Uropygi, and Xiphosura. However, several taxa show a negative strand bias, i.e., in favor of G nucleotides: all Scorpiones, Opisthothelae spiders and several taxa within Acari, Opiliones, Pseudoscorpiones, and Pycnogonida. Several reversals of strand-specific bias can be attributed to either a rearrangement of the control region or an inversion of a fragment containing the CO1 gene. Key taxa for which sequencing of complete mitochondrial genomes will be necessary to determine the origin and nature of mtDNA rearrangements involved in the reversals are identified. Acari, Opiliones, Pseudoscorpiones, and Pycnogonida were found to show a strong variability in nucleotide composition. In addition, both mitochondrial and nuclear genomes have been affected by higher substitution rates in Acari and Pseudoscorpiones. The results therefore indicate that these two orders are more liable to fix mutations of all types, including base substitutions, indels, and genomic rearrangements. PMID:22362465
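
The strand bias described in this abstract (C-rich versus G-rich third codon positions) can be illustrated with a small calculation. The sketch below is not the authors' method: the function name and the (C - G)/(C + G) skew statistic are assumptions chosen for illustration, and the input is assumed to be an in-frame coding sequence.

```python
from collections import Counter


def third_position_skew(cds: str) -> float:
    """Return (C - G) / (C + G) over third codon positions.

    Assumes `cds` starts in frame. Positive values mean C is favored,
    negative values mean G is favored (a hypothetical convention).
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3   # ignore a trailing partial codon
    thirds = seq[2:usable:3]           # every third base, starting at index 2
    counts = Counter(thirds)
    c, g = counts["C"], counts["G"]
    if c + g == 0:
        return 0.0                     # no C or G at third positions
    return (c - g) / (c + g)
```

A sequence whose third positions are all C gives 1.0 and one whose third positions are all G gives -1.0, so the sign of the value separates the two kinds of bias.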

  17. [Nucleotide sequence determination of yeast mitochondrial phenylalanine-tRNA].

    PubMed

    Martin, R; Sibler, A P; Schneller, J M; Keith, G; Stahl, A J; Dirheimer, G

    1978-10-01

    The primary structure of mitochondrial tRNAPhe from Saccharomyces cerevisiae, purified by two-dimensional polyacrylamide gel electrophoresis, was determined using standard procedures on in vivo 32P-labeled tRNA, as well as the new 5'-end postlabeling techniques. We propose a cloverleaf model which allows for tertiary interaction between cytosine in position 46 and guanine in position 15 and maximizes base pairing in the psi C stem, thus excluding the uracil in position 50 from base pairing in the psi C stem. Comparison of the primary structure of this tRNA with all other known procaryotic, chloroplastic or cytoplasmic tRNAsPhe sequences does not lead to any conclusion about the endosymbiotic theory of mitochondrial evolution. PMID:103657

  18. Complete nucleotide sequence of the new potexvirus "Alstroemeria virus X". Brief report.

    PubMed

    Fuji, S; Shinoda, K; Ikeda, M; Furuya, H; Naito, H; Fukumoto, F

    2005-11-01

    A flexuous virus was isolated in Japan from an alstroemeria plant showing mosaic symptoms. The virus had a broad host range but had systemically latent infectivity in alstroemeria. The virus was assigned to the genus Potexvirus based on morphology and physical properties and on an analysis of the complete nucleotide sequence. The genomic RNA of the virus was 7,009 nucleotides in length, excluding the 3'-terminal poly (A) tail. It contained five open reading frames (ORFs), which was consistent with other members of the genus Potexvirus. Although nucleotide sequences of the ORFs differ from previously reported potexviruses, a phylogenetic analysis placed it phylogenetically close to Narcissus mosaic virus and Scallion virus X. Therefore, we propose that this virus should be designated as Alstroemeria virus X (AlsVX). PMID:15986173

  19. Quantitative analysis of the relationship between nucleotide sequence and functional activity.

    PubMed Central

    Stormo, G D; Schneider, T D; Gold, L

    1986-01-01

    Matrices can be used to evaluate sequences for functional activity. Multiple regression can solve for the matrix that gives the best fit between sequence evaluations and quantitative activities. This analysis shows that the best model for context effects on suppression by su2 involves primarily the two nucleotides 3' to the amber codon, and that their contributions are independent and additive. Context effects on 2AP mutagenesis also involve the two nucleotides 3' to the 2AP insertion, but their effects are not independent. In a construct for producing beta-galactosidase, the effects on translational yields of the tri-nucleotide 5' to the initiation codon are dependent on the entire triplet. Models based on these quantitative results are presented for each of the examples. PMID:3092188
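
The idea of evaluating a sequence with a matrix can be sketched as follows, under the independent and additive model the abstract mentions: each position contributes the matrix entry for the base found there. The function names, the list-of-dicts layout, and all weights are hypothetical. The one-hot encoding is one common way to set the problem up for a least-squares fit, which is left out here.

```python
BASES = "ACGT"


def one_hot(seq):
    """Encode a sequence as a flat 0/1 vector, four entries per position."""
    vec = []
    for base in seq.upper():
        vec.extend(1.0 if base == b else 0.0 for b in BASES)
    return vec


def score(seq, matrix):
    """Sum the matrix entries selected by the base at each position.

    `matrix` is a list with one dict per position, mapping base -> weight.
    """
    if len(seq) != len(matrix):
        raise ValueError("sequence and matrix must have the same length")
    return sum(matrix[i][base] for i, base in enumerate(seq.upper()))


# Hypothetical weights for a two-nucleotide context.
EXAMPLE_MATRIX = [
    {"A": 1.0, "C": 0.0, "G": 0.5, "T": -1.0},
    {"A": 0.0, "C": 2.0, "G": 0.0, "T": 0.0},
]
```

With these made-up weights, `score("AC", EXAMPLE_MATRIX)` adds 1.0 and 2.0. Regressing measured activities on the `one_hot` vectors of many sequences would estimate such weights from data.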

  20. Single nucleotide polymorphism mining and nucleotide sequence analysis of Mx1 gene in exonic regions of Japanese quail

    PubMed Central

    Niraj, Diwesh Kumar; Kumar, Pushpendra; Mishra, Chinmoy; Narayan, Raj; Bhattacharya, Tarun Kumar; Shrivastava, Kush; Bhushan, Bharat; Tiwari, Ashok Kumar; Saxena, Vishesh; Sahoo, Nihar Ranjan; Sharma, Deepak

    2015-01-01

    Aim: An attempt has been made to study the Myxovirus resistant (Mx1) gene polymorphism in Japanese quail. Materials and Methods: In the present investigation, four fragments, viz. Fragment I of 185 bp (Exon 3 region), Fragment II of 148 bp (Exon 5 region), Fragment III of 161 bp (Exon 7 region), and Fragment IV of 176 bp (Exon 13 region) of the Mx1 gene, were amplified and screened for polymorphism by the polymerase chain reaction-single-strand conformation polymorphism technique in 170 Japanese quail birds. Results: Out of the four fragments, one fragment (Fragment II) was found to be polymorphic. The remaining three fragments (Fragments I, III, and IV) were found to be monomorphic, which was confirmed by custom sequencing. Overall nucleotide sequence analysis of the Mx1 gene of Japanese quail showed 100% homology with common quail and more than 80% homology with the reported sequences of chicken breeds. Conclusion: The Mx1 gene is mostly conserved in Japanese quail. There is an urgent need for a comprehensive analysis of other regions of the Mx1 gene, along with its possible association with traits of economic importance in Japanese quail. PMID:27047057

  1. The organization of repeated nucleotide sequences in the replicons of mammalian DNA.

    PubMed Central

    Mattern, M R; Painter, R B

    1977-01-01

    Chinese hamster ovary cells were irradiated with 100-5,000 rads of X-rays and inhibition of the initiation of replicons after irradiation was demonstrated by analyzing nascent DNA sedimented in alkaline sucrose gradients. The renaturation kinetics of DNA synthesized during 60 min of incubation after irradiation was compared with that of DNA synthesized during the 60 min after sham irradiation and with that of parental DNA. Nascent DNA from cells whose replicon initiation was inhibited renatured faster than nascent DNA from control cells in the COt range of repeated nucleotide sequences, suggesting that regions of the replicon not close to origins are enriched in repeated sequences and that regions close to origins are enriched in unique sequences. A class of repeated nucleotide sequences may be involved in the regulation of replicon initiation. PMID:880330

  2. On the feasibility of using the intrinsic fluorescence of nucleotides for DNA sequencing.

    SciTech Connect

    Chowdhury, M. H.; Ray, K.; Johnson, R. L.; Gray, S. K.; Pond, J.; Lakowicz, J. R.; Univ. of Maryland; Univ. of Virginia; Lumerical Solutions, Inc.

    2010-04-29

    There is presently a worldwide effort to increase the speed and decrease the cost of DNA sequencing as exemplified by the goal of the National Human Genome Research Institute (NHGRI) to sequence a human genome for under $1000. Several high throughput technologies are under development. Among these, single strand sequencing using exonuclease appears very promising. However, this approach requires complete labeling of at least two bases at a time, with extrinsic high quantum yield probes. This is necessary because nucleotides absorb in the deep ultraviolet (UV) and emit with extremely low quantum yields. Hence intrinsic emission from DNA and nucleotides is not being exploited for DNA sequencing. In the present paper we consider the possibility of identifying single nucleotides using their intrinsic emission. We used the finite-difference time-domain (FDTD) method to calculate the effects of aluminum nanoparticles on nearby fluorophores that emit in the UV. We find that the radiated power of UV fluorophores is significantly increased when they are in close proximity to aluminum nanostructures. We show that there will be increased localized excitation near aluminum particles at wavelengths used to excite intrinsic nucleotide emission. Using FDTD simulation we show that a typical DNA base when coupled to appropriate aluminum nanostructures leads to highly directional emission. Additionally we present experimental results showing that a thin film of nucleotides shows enhanced emission when in close proximity to aluminum nanostructures. Finally we provide Monte Carlo simulations that predict high levels of base calling accuracy for an assumed number of photons that is derived from the emission spectra of the intrinsic fluorescence of the bases. Our results suggest that single nucleotides can be detected and identified using aluminum nanostructures that enhance their intrinsic emission. This capability would be valuable for the ongoing efforts toward the $1000 genome.

  3. Single nucleotide polymorphism discovery in rainbow trout by deep sequencing of a reduced representation library

    Technology Transfer Automated Retrieval System (TEKTRAN)

    BACKGROUND: To enhance capabilities for genomic analyses in rainbow trout, such as genomic selection, a large suite of polymorphic markers that are amenable to high-throughput genotyping protocols must be identified. Expressed Sequence Tags (ESTs) have been used for single nucleotide polymorphism (...

  4. Complete Nucleotide Sequence of a Citrobacter freundii Plasmid Carrying KPC-2 in a Unique Genetic Environment

    PubMed Central

    Yao, Yancheng; Imirzalioglu, Can; Hain, Torsten; Kaase, Martin; Gatermann, Soeren; Exner, Martin; Mielke, Martin; Hauri, Anja; Dragneva, Yolanta; Bill, Rita; Wendt, Constanze; Wirtz, Angela; Chakraborty, Trinad

    2014-01-01

    The complete and annotated nucleotide sequence of a 54,036-bp plasmid harboring a blaKPC-2 gene that is clonally present in Citrobacter isolates from different species is presented. The plasmid belongs to incompatibility group N (IncN) and harbors the class A carbapenemase KPC-2 in a unique genetic environment. PMID:25395635

  5. The complete nucleotide sequence of Pepper mottle virus-Florida RNA.

    PubMed

    Warren, C E; Murphy, J F

    2003-01-01

    The Pepper mottle virus-Florida (PepMoV-FL) RNA genome was cloned and sequenced, and shown to consist of 9,717 nucleotides (nt) excluding the poly (A) tail. A single open reading frame was identified beginning at nucleotide position 169 encoding a polyprotein of 3068 amino acids. Phylogenetic sequence analysis revealed that of 44 full-length viral RNA genomes analyzed within the family Potyviridae, PepMoV-FL was most closely related to PepMoV-California (PepMoV-CA), Potato virus Y-H (PVY-H), PVY-N, PVY(o) and Potato virus V-DV42 (PVV-DV42). Using the PepMoV-FL sequence as a basis for comparison, the overall nucleotide sequence identity was highest between PepMoV-FL and PepMoV-CA at 93%, while the relationship was more distant with PVV-DV42 at 64% and for the PVY strains at 61%. A unique direct repeat sequence of 76 nucleotides was identified in the PepMoV-FL 3'-untranslated region (UTR), and this repeat sequence was confirmed not to occur in the PepMoV-CA sequence. Since the Florida isolate was among the first of the PepMoV isolates described, extensive biological and serological data on this isolate are available, and it has now been cloned and sequenced, we recommend that PepMoV-FL be recognized as the PepMoV type strain. PMID:12536304

  6. The nucleotide sequence and genome organization of strawberry mild yellow edge-associated potexvirus.

    PubMed

    Jelkmann, W; Maiss, E; Martin, R R

    1992-02-01

    The nucleotide sequence (5966 nucleotides) of cDNA clones of strawberry mild yellow edge-associated potexvirus was determined. The genome contains six open reading frames (ORFs) encoding putative proteins with Mrs of 149,423, 25,344, 11,576, 8079, 25,714 and 11,216. In the first three putative proteins and the coat protein considerable similarity was found to comparable polypeptides of the potexviruses potato virus X, clover yellow mosaic virus, narcissus mosaic virus, papaya mosaic virus, white clover mosaic virus and lily virus X. PMID:1339469

  7. Nucleotide sequence and genome organization of Dweet mottle virus and its relationship to members of the family Betaflexiviridae

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The nucleotide sequence of Dweet mottle virus (DMV) was determined and compared to sequences of members of the families Alphaflexiviridae and Betaflexiviridae. The DMV genome has 8747 nucleotides (nt) excluding the poly-(A) tail at the 3’ end of the genome. The overall G+C content of DMV genomic RNA is 40%. D...

  8. Nucleotide sequence of a cloned woodchuck hepatitis virus genome: evolutional relationship between hepadnaviruses.

    PubMed Central

    Kodama, K; Ogasawara, N; Yoshikawa, H; Murakami, S

    1985-01-01

    We have determined the complete nucleotide sequence of a cloned DNA of woodchuck hepatitis virus (WHV), the most oncogenic virus among hepadnaviruses. The genome, designated WHV2, is 3,320 base pairs long and contains four major open reading frames (ORFs) coded on the same strand of nucleotide sequence as in the human hepatitis B virus (HBV) genome. Comparison of the nucleotide sequence and amino acid sequences deduced from it among the genomes of various hepadnaviruses demonstrates that each protein shows an intrinsic property in conserving its amino acid sequence. A parameter, the ratio of the number of triplets with one-letter change but no amino acid substitution to the total number of triplets in which one-letter change occurred, was introduced to measure the intrinsic properties quantitatively. For each ORF, the parameter gave characteristic values in all combinations. Therefore, the relative evolutional distance between these hepadnaviruses can be measured by the amino acid substitution rate of any ORF. These comparisons suggest that (i) the difference between two WHV clones, WHV1 and WHV2, corresponds to that among clones of a HBV subtype, HBVadr, and (ii) WHV and ground squirrel hepatitis virus can be categorized in a way similar to the subgroups of HBV. PMID:3855246
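
The parameter defined in this abstract (triplets with a one-letter change but no amino acid substitution, divided by all triplets with a one-letter change) can be computed for two aligned, in-frame coding sequences. The sketch below assumes the standard genetic code and gap-free input; the function name is hypothetical and this is not the authors' program.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(_BASES, repeat=3), _AMINO)}


def silent_change_ratio(seq_a, seq_b):
    """Fraction of one-letter-change codon pairs that keep the amino acid.

    Returns None when no codon pair differs by exactly one letter.
    """
    single = silent = 0
    for i in range(0, min(len(seq_a), len(seq_b)) - 2, 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        aa_a, aa_b = CODON_TABLE.get(codon_a), CODON_TABLE.get(codon_b)
        if aa_a is None or aa_b is None:
            continue                   # skip codons with ambiguous bases
        if sum(x != y for x, y in zip(codon_a, codon_b)) != 1:
            continue                   # only one-letter changes count
        single += 1
        if aa_a == aa_b:
            silent += 1
    return silent / single if single else None
```

For the toy pair `CTTAAAGGG` and `CTCAGAGGG`, the first codon changes silently (Leu to Leu), the second changes the amino acid (Lys to Arg), and the third is identical, so the ratio is 0.5.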

  9. Nucleotide sequence and genome organization of atractylodes mottle virus, a new member of the genus Carlavirus.

    PubMed

    Zhao, Fumei; Igori, Davaajargal; Lim, Seungmo; Yoo, Ran Hee; Lee, Su-Heon; Moon, Jae Sun

    2015-11-01

    The complete genome sequence of a member of a distinct species of the genus Carlavirus in the family Betaflexiviridae, tentatively named atractylodes mottle virus (AtrMoV), has been determined. Analysis of its genomic organization indicates that it has a single-stranded, positive-sense genomic RNA of 8866 nucleotides, excluding the poly(A) tail, and consists of six open reading frames typical of members of the genus Carlavirus. The individual open reading frames of AtrMoV show moderately low sequence similarity to those of other carlaviruses at the nucleotide and amino acid sequence levels. Pairwise comparison and phylogenetic analysis suggest that AtrMoV is most closely related to chrysanthemum virus B. PMID:26264403

  10. A likelihood method for the detection of selection and recombination using nucleotide sequences.

    PubMed

    Grassly, N C; Holmes, E C

    1997-03-01

    Different regions along nucleotide sequences are often subject to different evolutionary forces. Recombination will result in regions having different evolutionary histories, while selection can cause regions to evolve at different rates. This paper presents a statistical method based on likelihood for detecting such processes by identifying the regions which do not fit with a single phylogenetic topology and nucleotide substitution process along the entire sequence. Subsequent reanalysis of these anomalous regions may then be possible. The method is tested using simulations, and its application is demonstrated using the primate psi eta-globin pseudogene, the V3 region of the envelope gene of HIV-1, and argF sequences from Neisseria bacteria. Reanalysis of anomalous regions is shown to reveal possible immune selection in HIV-1 and recombination in Neisseria. A computer program which implements the method is available. PMID:9066792

  11. Complete nucleotide sequence of the structural gene for alkaline proteinase from Pseudomonas aeruginosa IFO 3455.

    PubMed Central

    Okuda, K; Morihara, K; Atsumi, Y; Takeuchi, H; Kawamoto, S; Kawasaki, H; Suzuki, K; Fukushima, J

    1990-01-01

    The DNA-encoding alkaline proteinase (AP) of Pseudomonas aeruginosa IFO 3455 was cloned, and its complete nucleotide sequence was determined. When the cloned gene was ligated to pUC18, the Escherichia coli expression vector, the gene-incorporated bacteria expressed high levels of both AP activity and AP antigens. The amino acid sequence deduced from the nucleotide sequence revealed that the mature AP consists of 467 amino acids with a relative molecular weight of 49,507. The amino acid composition predicted from the DNA sequence was similar to the chemically determined composition of purified AP reported previously. The amino acid sequence analysis revealed that both the N-terminal side sequence of the purified AP and several internal lysyl peptide fragments were identical to the deduced amino acid sequences. The percent homology of amino acid sequences between AP and Serratia protease was about 55%. The zinc ligands and an active site of the AP were predicted by comparing the structure of the enzyme with that of Serratia protease, thermolysin, Bacillus subtilis neutral protease, and Pseudomonas elastase. PMID:2123832

  12. QGRS Mapper: a web-based server for predicting G-quadruplexes in nucleotide sequences

    PubMed Central

    Kikin, Oleg; D'Antonio, Lawrence; Bagga, Paramjeet S

    2006-01-01

    The quadruplex structures formed by guanine-rich nucleic acid sequences have received significant attention recently because of growing evidence for their role in important biological processes and as therapeutic targets. G-quadruplex DNA has been suggested to regulate DNA replication and may control cellular proliferation. Sequences capable of forming G-quadruplexes in the RNA have been shown to play significant roles in regulation of polyadenylation and splicing events in mammalian transcripts. Whether quadruplex structure directly plays a role in regulating RNA processing requires investigation. Computational approaches to study G-quadruplexes allow detailed analysis of mammalian genomes. There are no known easily accessible user-friendly tools that can compute G-quadruplexes in the nucleotide sequences. We have developed a web-based server, QGRS Mapper, that predicts quadruplex forming G-rich sequences (QGRS) in nucleotide sequences. It is a user-friendly application that provides many options for defining and studying G-quadruplexes. It performs analysis of the user provided genomic sequences, e.g. promoter and telomeric regions, as well as RNA sequences. It is also useful for predicting G-quadruplex structures in oligonucleotides. The program provides options to search and retrieve desired gene/nucleotide sequence entries from NCBI databases for mapping G-quadruplexes in the context of RNA processing sites. This feature is very useful for investigating the functional relevance of G-quadruplex structure, in particular its role in regulating the gene expression by alternative processing. In addition to providing data on composition and locations of QGRS relative to the processing sites in the pre-mRNA sequence, QGRS Mapper features interactive graphic representation of the data. The user can also use the graphics module to visualize QGRS distribution patterns among all the alternative RNA products of a gene simultaneously on a single screen. 
QGRS Mapper can be
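
To give a feel for what predicting quadruplex-forming G-rich sequences involves, the sketch below scans a sequence for a simple, widely used motif: four runs of three or more G separated by loops of 1 to 7 nucleotides. The abstract does not state the server's own motif definition or scoring, so the pattern, the loop limits, and the function name are assumptions, not a description of QGRS Mapper.

```python
import re

# Four G-runs (>= 3 G each) separated by three loops of 1-7 nucleotides.
# These limits are an illustrative assumption.
G4_PATTERN = re.compile(
    r"G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}[ACGTU]{1,7}G{3,}"
)


def find_g4_candidates(seq):
    """Return (start, matched_text) for non-overlapping motif matches."""
    return [(m.start(), m.group()) for m in G4_PATTERN.finditer(seq.upper())]
```

Because `finditer` reports only non-overlapping matches, a G-rich region that could fold in several ways is returned once. Enumerating overlapping candidates would need a different search strategy.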

  13. Complete nucleotide sequence of the Streptomyces lividans plasmid pIJ101 and correlation of the sequence with genetic properties.

    PubMed Central

    Kendall, K J; Cohen, S N

    1988-01-01

    The complete nucleotide sequence of the multicopy Streptomyces plasmid pIJ101 has been determined and correlated with previously published genetic data. The circular DNA molecule is 8,830 nucleotides in length and has a G+C composition of 72.98%. The use of a computer program, FRAME, enabled identification in the sequence of seven open reading frames, four of which, tra (621 amino acids [aa]), spdA (146 aa), spdB (274 aa), and kilB (177 aa), appear to be genes involved in plasmid transfer. At least two of the above genes are predicted to be transcribed by known promoters that are regulated in trans by the products of the korA (241 aa) and korB (80 aa) loci on the plasmid. The segment of the plasmid capable of autonomous replication contains one large open reading frame (rep; 450 aa) and a noncoding region presumed to be the origin of replication. Four other small (less than 90 aa) open reading frames are also present on the plasmid, although no function can be attributed to them. The sequence of the pIJ101 replication segment present in several widely used cloning vectors (e.g., pIJ350 and pIJ702) has also been determined, so that the complete nucleotide sequences of these vectors are now known. PMID:3170481
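
The G+C composition quoted in this abstract is a plain base count. A minimal sketch (the function name is hypothetical):

```python
def gc_percent(seq):
    """Return the percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)
```

Read the other way round, 72.98% of 8,830 nucleotides corresponds to roughly 6,444 G or C bases.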

  14. Sequence selective naked-eye detection of DNA harnessing extension of oligonucleotide-modified nucleotides.

    PubMed

    Verga, Daniela; Welter, Moritz; Marx, Andreas

    2016-02-01

    DNA polymerases can efficiently and sequence selectively incorporate oligonucleotide (ODN)-modified nucleotides and the incorporated oligonucleotide strand can be employed as primer in rolling circle amplification (RCA). The effective amplification of the DNA primer by Φ29 DNA polymerase allows the sequence-selective hybridisation of the amplified strand with a G-quadruplex DNA sequence that has horse radish peroxidase-like activity. Based on these findings we develop a system that allows DNA detection with single-base resolution by naked eye. PMID:26774580

  15. The nucleotide sequence at the termini of adenovirus type 5 DNA.

    PubMed Central

    Steenbergh, P H; Maat, J; van Ormondt, H; Sussenbach, J S

    1977-01-01

    The sequences of the first 194 base pairs at both termini of adenovirus type 5 (Ad5) DNA have been determined, using the chemical degradation technique developed by Maxam and Gilbert (Proc. Nat. Acad. Sci. USA 74 (1977), pp. 560-564). The nucleotide sequences 1-75 were confirmed by analysis of labeled RNA transcribed from the terminal HhaI fragments in vitro. The sequence data show that Ad5 DNA has a perfect inverted terminal repetition 103 base pairs long. PMID:600799
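
An inverted terminal repetition means that one end of a linear molecule, read 5' to 3', matches the other end read 5' to 3' on the complementary strand. The sketch below measures the length of a perfect inverted terminal repeat in a single-strand sequence. The function names and the toy sequence are made up for illustration and are not adenovirus data.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def inverted_terminal_repeat_length(seq):
    """Length of the perfect inverted repeat shared by both termini."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    n = 0
    limit = len(seq) // 2              # the two arms cannot overlap
    while n < limit and seq[n] == rc[n]:
        n += 1
    return n


# Toy molecule: a 10-nt end, an unrelated middle, and the inverted end.
_END = "GATTACAGGC"
TOY_GENOME = _END + "AAAAACCCCC" + reverse_complement(_END)
```

For the toy molecule the function reports a 10-nucleotide repeat, the length of the end segment that was deliberately mirrored.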

  16. Complete nucleotide sequence of a subviral DNA molecule of porcine circovirus type 2.

    PubMed

    Wen, Han

    2016-07-01

    Porcine circovirus type 2 (PCV2) is a member of the genus Circovirus in the family Circoviridae. Most subgenomic molecules of PCV2 have been mapped. Here, the first full-length sequence of a subviral molecule of PCV2 (CH-IVT12) containing a reverse complement sequence of the PCV2 genome was determined by sequencing DNA extracted from PK15 cells infected with PCV2. The circular CH-IVT12 DNA consists of 1136 nucleotides and contains one major open reading frame. PMID:27084550

  17. The EMBL-EBI bioinformatics web and programmatic tools framework.

    PubMed

    Li, Weizhong; Cowley, Andrew; Uludag, Mahmut; Gur, Tamer; McWilliam, Hamish; Squizzato, Silvano; Park, Young Mi; Buso, Nicola; Lopez, Rodrigo

    2015-07-01

    Since 2009 the EMBL-EBI Job Dispatcher framework has provided free access to a range of mainstream sequence analysis applications. These include sequence similarity search services (https://www.ebi.ac.uk/Tools/sss/) such as BLAST, FASTA and PSI-Search, multiple sequence alignment tools (https://www.ebi.ac.uk/Tools/msa/) such as Clustal Omega, MAFFT and T-Coffee, and other sequence analysis tools (https://www.ebi.ac.uk/Tools/pfa/) such as InterProScan. Through these services users can search mainstream sequence databases such as ENA, UniProt and Ensembl Genomes, utilising a uniform web interface or systematically through Web Services interfaces (https://www.ebi.ac.uk/Tools/webservices/) using common programming languages, and obtain enriched results with novel visualisations. Integration with EBI Search (https://www.ebi.ac.uk/ebisearch/) and the dbfetch retrieval service (https://www.ebi.ac.uk/Tools/dbfetch/) further expands the usefulness of the framework. New tools and updates such as NCBI BLAST+, InterProScan 5 and PfamScan, new categories such as RNA analysis tools (https://www.ebi.ac.uk/Tools/rna/), new databases such as ENA non-coding, WormBase ParaSite, Pfam and Rfam, and new workflow methods, together with the retirement of deprecated services, ensure that the framework remains relevant to today's biological community. PMID:25845596
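
Programmatic access to a retrieval service such as dbfetch usually comes down to building a query URL. The sketch below only constructs such a URL and makes no network request. The endpoint path under the dbfetch address and the `db`, `id`, `format` and `style` parameter names are assumptions that should be checked against the current EMBL-EBI documentation, and the accession in the usage note is a placeholder.

```python
from urllib.parse import urlencode

# Assumed endpoint beneath the dbfetch address given in the abstract.
DBFETCH_BASE = "https://www.ebi.ac.uk/Tools/dbfetch/dbfetch"


def dbfetch_url(db, entry_id, fmt="fasta", style="raw"):
    """Build a retrieval URL; parameter names are assumptions."""
    query = urlencode({"db": db, "id": entry_id, "format": fmt, "style": style})
    return f"{DBFETCH_BASE}?{query}"
```

For example, `dbfetch_url("ena_sequence", "X56734")` yields a URL that could be passed to any HTTP client. Whether that database name and accession resolve depends on the live service.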

  18. Nucleotide sequence of 5S ribosomal RNA from Aspergillus nidulans and Neurospora crassa.

    PubMed Central

    Piechulla, B; Hahn, U; McLaughlin, L W; Küntzel, H

    1981-01-01

    The nucleotide sequences of 5S rRNA molecules isolated from the cytosol and the mitochondria of the ascomycetes A. nidulans and N. crassa were determined by partial chemical cleavage of 3'-terminally labelled RNA. The sequence identity of the cytosolic and mitochondrial RNA preparations confirms the absence of mitochondrion-specific 5S rRNA in these fungi. The sequences of the two organisms differ in 35 positions, and each sequence differs from yeast 5S rRNA in 44 positions. Both molecules contain the sequence GCUC in place of GAAC or GAUY found in all other 5S rRNAs, indicating that this region is not universally involved in base-pairing to the invariant GTpsiC sequence of tRNAs. PMID:6453331
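
Counts such as "differ in 35 positions" are position-by-position comparisons of aligned sequences. A minimal sketch, assuming the two sequences are already aligned to equal length (the function name is hypothetical):

```python
def count_differences(seq_a, seq_b):
    """Count aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a.upper(), seq_b.upper()))
```

Applied to the two four-base motifs named in the abstract, `count_differences("GCUC", "GAAC")` finds two differing positions, the second and the third.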

  19. Nucleotide sequence specifying the glycoprotein gene, gB, of herpes simplex virus type 1.

    PubMed

    Bzik, D J; Fox, B A; DeLuca, N A; Person, S

    1984-03-01

    The nucleotide sequence thought to specify the glycoprotein gene, gB, of the KOS strain of herpes simplex virus type 1 (HSV-1) has been determined. A 3.1-kilobase (kb), viral-specified RNA was mapped to the left half of the BamHI-G fragment (0.345 to 0.399 map units). TATA, CAT-box, and possible mRNA start sequences characteristic of HSV-1 genes are found near 0.368 map units. The first available ATG codon is at 0.366 and the first in-phase chain terminator at 0.348 map units. A polyA-addition signal (AATAAA) occurs 17 nucleotides past the chain terminator. Translation of these sequences would yield a 100.3-kilodalton (kDa) polypeptide characterized by a 5' signal sequence, nine N-linked saccharide addition sites, a strongly hydrophobic membrane-spanning sequence, and a highly charged 3' cytoplasmic anchor sequence. Two mutants of KOS, tsJ12 and tsJ20, that are temperature-sensitive for viral growth and for the production of gB, have been physically mapped to 0.357 to 0.360 and 0.360 to 0.364 map units, respectively (DeLuca et al., in preparation). The nucleotide sequence of the mutants was determined in these regions. In both cases a single amino acid replacement within the 100.3-kDa polypeptide is predicted from the sequence analysis. PMID:6324454

  20. Linking the human cytogenetic map with nucleotide sequence: the CCAP clone set.

    PubMed

    Jang, Wonhee; Yonescu, Raluca; Knutsen, Turid; Brown, Theresa; Reppert, Tricia; Sirotkin, Karl; Schuler, Gregory D; Ried, Thomas; Kirsch, Ilan R

    2006-07-15

    We present the completed dataset and clone repository of the Cancer Chromosome Aberration Project (CCAP), an initiative developed and funded through the intramural program of the U.S. National Cancer Institute, to provide seamless linkage of human cytogenetic markers with the primary nucleotide sequence of the human genome. Spaced at 1-2 Mb intervals across the human genome, 1,339 bacterial artificial chromosome (BAC) clones have been localized to chromosomal bands through high-resolution fluorescence in situ hybridization (FISH) mapping. Of these clones, 99.8% can be positioned on the primary human genome sequence and 95% are placed at or close to their precise nucleotide starts and stops. This dataset can be studied and manipulated within generally available public Web sites. The clones are available from a commercial repository. The CCAP BAC clone set provides anchors for the interrogation of gene and sequence involvement in oncogenic and developmental disorders when the starting point is the recognition of a structural, numerical, or interstitial chromosomal aberration. This dataset also provides a current view of the quality and coherence of the available genome sequence and insight into the nucleotide and three-dimensional structures that manifest as Giemsa light and dark chromosomal banding patterns. PMID:16843097

  1. Complete nucleotide sequence of cherry virus A (CVA) infecting sweet cherry in India.

    PubMed

    Noorani, M S; Awasthi, P; Singh, Rahul Mohan; Ram, Raja; Sharma, M P; Singh, S R; Ahmed, N; Hallan, V; Zaidi, A A

    2010-12-01

    Cherry virus A (CVA) is a graft-transmissible member of the genus Capillovirus that infects different stone fruits. Sweet cherry (Prunus avium L; family Rosaceae) is an important deciduous temperate fruit crop in the Western Himalayan region of India. In order to determine the health status of cherry plantations and the incidence of the virus in India, cherry orchards in the states of Jammu and Kashmir (J&K) and Himachal Pradesh (H.P.) were surveyed during the months of May and September 2009. The incidence of CVA was found to be 28 and 13% from J&K and H.P., respectively, by RT-PCR. In order to characterize the virus at the molecular level, the complete genome was amplified by RT-PCR using specific primers. The amplicon of about 7.4 kb was sequenced and was found to be 7,379 bp long, with sequence specificity to CVA. The genome organization was similar to that of isolates characterized earlier, coding for two ORFs, in which ORF 2 is nested in ORF1. The complete sequence was 81 and 84% similar to that of the type isolate at the nucleotide and amino acid level, respectively, with 5' and 3' UTRs of 54 and 299 nucleotides, respectively. This is the first report of the complete nucleotide sequence of cherry virus A infecting sweet cherry in India. PMID:20938696

  2. Complete nucleotide sequence of the hypervirulent CFH strain of beet curly top virus.

    PubMed

    Stenger, D C

    1994-01-01

    The complete nucleotide sequence of the hypervirulent CFH strain of beet curly top geminivirus (BCTV) has been determined. The circular DNA genome of BCTV-CFH consists of 2,927 nucleotides and shares extensive sequence homology with the biologically distinct California strain of BCTV. Analysis of the CFH nucleotide sequence indicated that the rightward open reading frames (ORFs) R1, R2, and R3 are highly conserved (> 95% amino acid chemical similarity) in the CFH and California strains, although CFH ORF R2 was extended by 24 carboxy-terminal amino acid residues not present in the California strain. The CFH leftward ORFs L1, L2, and L3 shared varying levels of amino acid chemical similarity with the corresponding ORFs of the California strain (78.8, 66.5, and 86.7%, respectively). CFH ORF L4 was the least conserved ORF present in both strains, encoding a 9.9-kDa protein of 87 amino acid residues, which shares 57.6% chemical similarity with 85 carboxy-terminal amino acid residues of the 19.4-kDa ORF L4 of the California strain. The CFH DNA sequence also contained a unique 12.5-kDa ORF (R4); however, there is no evidence to suggest that R4 is expressed. Comparison of the CFH and California strain nucleotide sequences indicates that certain regions of the BCTV genome have diverged, and this divergence may account for differences in the pathogenic properties of the two strains. PMID:8167369

  3. The nucleotide sequence at the 5' end of foot and mouth disease virus RNA.

    PubMed Central

    Harris, T J

    1979-01-01

    Foot and mouth disease virus RNA has been treated with RNase H in the presence of oligo (dG) specifically to digest the poly(C) tract which lies near the 5' end of the molecule (10). The short (S) fragment containing the 5' end of the RNA was separated from the remainder of the RNA (L fragment) by gel electrophoresis. RNA ligase mediated labelling of the 3' end of S fragment showed that the RNase H digestion gave rise to molecules that differed only in the number of cytidylic acid residues remaining at their 3' ends and did not leave the unique 3' end necessary for fast sequence analysis. As the 5' end of S fragment prepared from virus RNA is blocked by VPg, S fragment was prepared from virus specific messenger RNA which does not contain this protein. This RNA was labelled at the 5' end using polynucleotide kinase and the sequence of 70 nucleotides at the 5' end determined by partial enzyme digestion sequencing on polyacrylamide gels. Some of this sequence was confirmed from an analysis of the oligonucleotides derived by RNase T1 digestion of S fragment. The sequence obtained indicates that there is a stable hairpin loop at the 5' terminus of the RNA before an initiation codon 33 nucleotides from the 5' end. In addition, the RNase T1 analysis suggests that there are short repeated sequences in S fragment and that an eleven nucleotide inverted complementary repeat of a sequence near the 3' end of the RNA is present at the junction of S fragment and the poly(C) tract. PMID:231762

  4. Complete nucleotide sequence of the nucleoprotein gene of influenza B virus.

    PubMed Central

    Londo, D R; Davis, A R; Nayak, D P

    1983-01-01

    A DNA copy of influenza B/Singapore/222/79 viral RNA segment 5, containing the gene coding for the nucleoprotein (NP), has been cloned in Escherichia coli plasmid pBR322, and its nucleotide sequence has been determined. The influenza B NP gene contains 1,839 nucleotides and codes for a protein of 560 amino acids with a molecular weight of 61,593. Comparison of the influenza B NP amino acid sequence with that of influenza A NP (A/PR/8/34) reveals 37% direct homology in the aligned regions, indicating a common ancestor. However, influenza B NP has an additional 50 amino acids at its N-terminal end. As is the case with influenza A NP, influenza B NP is a basic protein, with its charged residues relatively evenly distributed rather than clustered. The structural homology suggests functional similarity between the NP of influenza A and B viruses. PMID:6688639
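
    The figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the coding region is the 560 codons plus one stop codon, which the abstract does not state explicitly.

```python
def gene_arithmetic(gene_nt, protein_aa, protein_mass_da):
    """Relate gene length, protein length and protein mass.

    Assumption (not from the abstract): the coding region consists of
    one codon per amino acid plus a single stop codon.
    """
    coding_nt = (protein_aa + 1) * 3           # codons incl. stop, in nucleotides
    noncoding_nt = gene_nt - coding_nt         # remainder outside the coding region
    mean_residue_mass = protein_mass_da / protein_aa  # average mass per residue (Da)
    return coding_nt, noncoding_nt, mean_residue_mass


# Values reported for the influenza B NP gene.
coding, noncoding, mean_mass = gene_arithmetic(1839, 560, 61593)
print(coding, noncoding, round(mean_mass, 1))
```

    Under that assumption, the reported molecular weight corresponds to roughly 110 Da per residue, which is in the range usually quoted for an average amino acid residue.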

  5. Quadfinder: server for identification and analysis of quadruplex-forming motifs in nucleotide sequences

    PubMed Central

    Scaria, Vinod; Hariharan, Manoj; Arora, Amit; Maiti, Souvik

    2006-01-01

    G-quadruplex secondary structures, which play a structural role in repetitive DNA such as telomeres, may also play a functional role at other genomic locations as targetable regulatory elements which control gene expression. The recent interest in application of quadruplexes in biological systems prompted us to develop a tool for the identification and analysis of quadruplex-forming nucleotide sequences especially in the RNA. Here we present Quadfinder, an online server for prediction and bioinformatics of uni-molecular quadruplex-forming nucleotide sequences. The server is designed to be user-friendly and needs minimal intervention by the user, while providing flexibility of defining the variants of the motif. The server is freely available at URL . PMID:16845097
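
    A quadruplex-forming motif of the kind described here is often written as four runs of guanines separated by short loops. The following is a minimal sketch of such a search, not the Quadfinder implementation: the function name, the default run lengths (3 to 5 G) and the default loop lengths (1 to 7 nt) are assumptions chosen for illustration, with parameters standing in for the user-defined motif variants mentioned in the abstract.

```python
import re


def find_quadruplex_motifs(seq, min_g=3, max_g=5, min_loop=1, max_loop=7):
    """Return (start, end, motif) tuples for putative uni-molecular
    G-quadruplex motifs: four G-runs separated by three loops.

    All default values are illustrative assumptions.
    Coordinates are 0-based, end-exclusive; matches do not overlap.
    """
    g_run = f"G{{{min_g},{max_g}}}"
    loop = f"[ACGTU]{{{min_loop},{max_loop}}}"
    # Three (G-run + loop) units followed by a closing G-run.
    pattern = re.compile(f"(?:{g_run}{loop}){{3}}{g_run}")
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(seq.upper())]


# Human telomeric repeat flanked by two adenines on each side.
print(find_quadruplex_motifs("AAGGGTTAGGGTTAGGGTTAGGGAA"))
```

    Because the loop class accepts both T and U, the same pattern can be applied to DNA or RNA sequences.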

  6. Nucleotide sequence of cDNA clones of the murine myb proto-oncogene.

    PubMed Central

    Gonda, T J; Gough, N M; Dunn, A R; de Blaquiere, J

    1985-01-01

    We have isolated cDNA clones of murine c-myb mRNA which contain approximately 2.8 kb of the 3.9-kb mRNA sequence. Nucleotide sequencing has shown that these clones extend both 5' and 3' to sequences homologous to the v-myb oncogenes of avian myeloblastosis virus and avian leukemia virus E26. The sequence contains an open reading frame of 1944 nucleotides, and could encode a protein which is both highly homologous, and of similar size (71 kd), to the chicken c-myb protein. Examination of the deduced amino acid sequence of the murine c-myb protein revealed the presence of a 3-fold tandem repeat of 52 residues near the N terminus of the protein, and has enabled prediction of some of the likely structural features of the protein. These include a high alpha-helix content, a basic region toward the N terminus of the protein and an overall globular configuration. The arrangement of genomic c-myb sequences, detected using the cDNA clones as probes, was compared with the reported structure of rearranged c-myb in certain tumour cells. This comparison suggested that the rearranged c-myb gene may encode a protein which, like the v-myb protein, lacks the N-terminal region of c-myb. Images Fig. 5. PMID:2998780

  7. Multimodal phylogeny for taxonomy: integrating information from nucleotide and amino acid sequences.

    PubMed

    Bicego, Manuele; Dellaglio, Franco; Felis, Giovanna E

    2007-10-01

    The crucial role played by the analysis of microbial diversity in biotechnology-based innovations has increased the interest in the microbial taxonomy research area. Phylogenetic sequence analyses have contributed significantly to the advances in this field, also in view of the large amount of sequence data collected in recent years. Phylogenetic analyses can be based on protein-encoding nucleotide sequences or on the encoded amino acid sequences: the two approaches have different characteristics, although they start from two alternative representations of the same information. This complementarity could be exploited to achieve a multimodal phylogenetic scheme that is able to integrate gene and protein information in order to produce a single final tree. This aspect has been poorly addressed in the literature. In this paper, we propose to integrate the two phylogenetic analyses using basic schemes derived from the multimodality fusion theory (or multiclassifier systems theory), a well-founded and rigorous field whose power has already been demonstrated in other pattern recognition contexts. The proposed approach could be applied to distance matrix-based phylogenetic techniques (like neighbor joining), resulting in a smart and fast method. The proposed methodology has been tested in a real case involving sequences of some species of lactic acid bacteria. With this dataset, both nucleotide sequence- and amino acid sequence-based phylogenetic analyses present some drawbacks, which are overcome with the multimodal analysis. PMID:17933011
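
    One simple way to combine two distance matrices before tree building is a weighted average, a basic rule from multiclassifier fusion. The sketch below is an assumption about what such a scheme could look like, not the authors' published procedure: the function names, the max-based normalization and the equal default weights are all illustrative.

```python
def normalize(matrix):
    """Scale a distance matrix so that its largest entry equals 1."""
    peak = max(max(row) for row in matrix)
    if peak == 0:
        return [row[:] for row in matrix]  # all-zero matrix: nothing to scale
    return [[value / peak for value in row] for row in matrix]


def fuse_distances(nucleotide, amino_acid, weight=0.5):
    """Weighted average of two normalized distance matrices.

    `weight` is the share given to the nucleotide matrix (assumed 0.5).
    Both matrices must list the same taxa in the same order.
    """
    nuc, aa = normalize(nucleotide), normalize(amino_acid)
    return [
        [weight * n + (1 - weight) * a for n, a in zip(row_n, row_a)]
        for row_n, row_a in zip(nuc, aa)
    ]


# Toy distances for three taxa (made-up values for illustration).
nuc = [[0, 2, 4], [2, 0, 6], [4, 6, 0]]
aa = [[0, 1, 1], [1, 0, 2], [1, 2, 0]]
fused = fuse_distances(nuc, aa)
print(fused)
```

    The fused matrix can then be passed to any distance-based method such as neighbor joining. Normalizing first keeps one data type from dominating merely because its distances are on a larger scale.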

  8. Common nucleotide sequence of structural gene encoding fibroblast growth factor 4 in eight cattle derived from three breeds.

    PubMed

    Sato, Sho; Takahashi, Toshikiyo; Nishinomiya, Hiroshi; Katoh, Makiko; Itoh, Ryu; Yokoo, Masaki; Yokoo, Mari; Iha, Momoe; Mori, Yuki; Kasuga, Kano; Kojima, Ikuo; Kobayashi, Masayuki

    2012-03-01

    Fibroblast growth factor 4 (FGF4) is considered a crucial gene for the proper development of bovine embryos. However, the complete nucleotide sequences of the structural genes encoding FGF4 in identified breeds are still unknown. In the present study, direct sequencing of PCR products derived from genomic DNA samples obtained from three Japanese Black, two Japanese Shorthorn and three Holstein cattle revealed that the nucleotide sequences of the structural gene encoding FGF4 matched completely among these eight cattle. On the other hand, differences in the nucleotide sequences, leading to substitutions, insertions or deletions of amino acid residues, were detected when compared with the already reported sequence from unidentified breeds. We cannot rule out the possibility that the structural gene elucidated in the present study is widely distributed in cattle. To the best of our knowledge, this is the first determination of the complete nucleotide sequence of the structural gene encoding bovine FGF4 in identified breeds. PMID:22435631

  9. Nucleotide sequence, heterologous expression and novel purification of DNA ligase from Bacillus stearothermophilus.

    PubMed

    Brannigan, J A; Ashford, S R; Doherty, A J; Timson, D J; Wigley, D B

    1999-07-13

    The gene for DNA ligase (EC 6.5.1.2) from thermophilic bacterium Bacillus stearothermophilus NCA1503 has been cloned and the complete nucleotide sequence determined. The ligase gene encodes a protein 670 amino acids in length. The gene was overexpressed in Escherichia coli and the enzyme has been purified to homogeneity. Preliminary characterisation confirms that it is a thermostable, NAD(+)-dependent DNA ligase. PMID:10407164

  10. Complete Nucleotide Sequence of a French Isolate of Maize rough dwarf virus, a Fijivirus Member in the Family Reoviridae

    PubMed Central

    Svanella-Dumas, L.; Marais, A.; Faure, C.; Theil, S.; Thibord, J. B.

    2016-01-01

    The complete nucleotide sequence of a French isolate of Maize rough dwarf virus (MRDV) was determined by next-generation sequencing and compared with the single available complete sequence and with the partial sequences of two additional isolates available in online databases. PMID:27445367

  11. Nucleotide sequence of the hypervariable region of the human C2 gene

    SciTech Connect

    Zhu, Z.B.; Volanakis, J.V.

    1991-03-15

    It has been previously suggested that the multiallelic Bam H1/Sst I RFLPs of the human C2 gene arose through deletion/insertion of a tandemly-repeated minisatellite region. In this study the authors subcloned and sequenced the Sst I polymorphic fragment of the b haplotype of the C2 gene. This restriction fragment is 2,450 bp long and maps 1,550 bp 3' of exon 3. Its nucleotide sequence is characterized by the presence of at least 4 different repeated regions varying in size from 18 to 58 bp. One of these regions starting at position 1,413 is 48 bp long and is repeated five times. The first 3 repeats are in tandem and are separated by 72 bp from two additional tandem repeats. Sequence homology among the 5 repeats ranges between 93 and 98%. Eighty three percent of the nucleotides of the repeated-region are G or C. It seems likely that this nucleotide repeat resulted in the multiallelic RFLPs through a mechanism of unequal recombination or replication slippage.

  12. Nucleotide sequence variation of the envelope protein gene identifies two distinct genotypes of yellow fever virus.

    PubMed Central

    Chang, G J; Cropp, B C; Kinney, R M; Trent, D W; Gubler, D J

    1995-01-01

    The evolution of yellow fever virus over 67 years was investigated by comparing the nucleotide sequences of the envelope (E) protein genes of 20 viruses isolated in Africa, the Caribbean, and South America. Uniformly weighted parsimony algorithm analysis defined two major evolutionary yellow fever virus lineages designated E genotypes I and II. E genotype I contained viruses isolated from East and Central Africa. E genotype II viruses were divided into two sublineages: IIA viruses from West Africa and IIB viruses from America, except for a 1979 virus isolated from Trinidad (TRINID79A). Unique signature patterns were identified at 111 nucleotide and 12 amino acid positions within the yellow fever virus E gene by signature pattern analysis. Yellow fever viruses from East and Central Africa contained unique signatures at 60 nucleotide and five amino acid positions, those from West Africa contained unique signatures at 25 nucleotide and two amino acid positions, and viruses from America contained such signatures at 30 nucleotide and five amino acid positions in the E gene. The dissemination of yellow fever viruses from Africa to the Americas is supported by the close genetic relatedness of genotype IIA and IIB viruses and genetic evidence of a possible second introduction of yellow fever virus from West Africa, as illustrated by the TRINID79A virus isolate. The E protein genes of American IIB yellow fever viruses had higher frequencies of amino acid substitutions than did genes of yellow fever viruses of genotypes I and IIA on the basis of comparisons with a consensus amino acid sequence for the yellow fever E gene. The great variation in the E proteins of American yellow fever virus probably results from positive selection imposed by virus interaction with different species of mosquitoes or nonhuman primates in the Americas. PMID:7637022

  13. Cloning and nucleotide sequence of the gene encoding the Ecal DNA methyltransferase.

    PubMed Central

    Brenner, V; Venetianer, P; Kiss, A

    1990-01-01

    The gene coding for the GGTNACC specific Ecal DNA methyltransferase (M.Ecal) has been cloned in E. coli from Enterobacter cloacae and its nucleotide sequence has been determined. The ecalM gene codes for a protein of 452 amino acids (Mr: 51,111). It was determined that M.Ecal is an adenine methyltransferase. M.Ecal shows limited amino acid sequence similarity to other adenine methyltransferases. A clone that expresses Ecal methyltransferase at high level was constructed. PMID:2183182

  14. Nucleotide sequencing and characterization of the genes encoding benzene oxidation enzymes of Pseudomonas putida

    SciTech Connect

    Irie, S.; Doi, S.; Yorifuji, T.; Takagi, M.; Yano, K.

    1987-11-01

    The nucleotide sequence of the genes from Pseudomonas putida encoding oxidation of benzene to catechol was determined. Five open reading frames were found in the sequence. Four corresponding protein molecules were detected by a DNA-directed in vitro translation system. Escherichia coli cells containing the fragment with the four open reading frames transformed benzene to cis-benzene glycol, which is an intermediate of the oxidation of benzene to catechol. The relation between the product of each cistron and the components of the benzene oxidation enzyme system is discussed.

  15. Nucleotide sequence of an exceptionally long 5.8S ribosomal RNA from Crithidia fasciculata.

    PubMed

    Schnare, M N; Gray, M W

    1982-03-25

    In Crithidia fasciculata, a trypanosomatid protozoan, the large ribosomal subunit contains five small RNA species (e, f, g, i, j) in addition to 5S rRNA [Gray, M.W. (1981) Mol. Cell. Biol. 1, 347-357]. The complete primary sequence of species i is shown here to be pAACGUGUmCGCGAUGGAUGACUUGGCUUCCUAUCUCGUUGA ... AGAmACGCAGUAAAGUGCGAUAAGUGGUApsiCAAUUGmCAGAAUCAUUCAAUUACCGAAUCUUUGAACGAAACGG ... CGCAUGGGAGAAGCUCUUUUGAGUCAUCCCCGUGCAUGCCAUAUUCUCCAmGUGUCGAA(C)OH. This sequence establishes that species i is a 5.8S rRNA, despite its exceptional length (171-172 nucleotides). The extra nucleotides in C. fasciculata 5.8S rRNA are located in a region whose primary sequence and length are highly variable among 5.8S rRNAs, but which is capable of forming a stable hairpin loop structure (the "G+C-rich hairpin"). The sequence of C. fasciculata 5.8S rRNA is no more closely related to that of another protozoan, Acanthamoeba castellanii, than it is to representative 5.8S rRNA sequences from the other eukaryotic kingdoms, emphasizing the deep phylogenetic divisions that seem to exist within the Kingdom Protista. PMID:7079176

  16. Overlapping open reading frames revealed by complete nucleotide sequencing of turnip yellow mosaic virus genomic RNA.

    PubMed Central

    Morch, M D; Boyer, J C; Haenni, A L

    1988-01-01

    The complete nucleotide sequence of turnip yellow mosaic virus (TYMV) genomic RNA has been determined on a set of overlapping cDNA clones using a sequential sequencing strategy. The RNA is 6318 nucleotides long, excluding the cap structure. The genome organization deduced from the sequence confirms previous results of in vitro translation. A novel open reading frame (ORF) putatively encoding a Pro-rich and very basic 69K (K = kilodalton) protein is detected at the 5' end of the genome. It is initiated at the first AUG codon on the RNA and overlaps the major ORF that encodes the non structural 206K (previously referred to as 195K) protein of TYMV; its function is unknown. Several amino acid consensus sequences already described among plant and animal viruses are also found in the TYMV-encoded polypeptides. A comparison with other viruses whose RNA sequence is known leads to the conclusion that TYMV belongs to the "Sindbis-like" supergroup of viruses and could be related to Semliki forest virus. PMID:3399388

  17. Nucleotide sequences of immunoglobulin eta genes of chimpanzee and orangutan: DNA molecular clock and hominoid evolution

    SciTech Connect

    Sakoyama, Y.; Hong, K.J.; Byun, S.M.; Hisajima, H.; Ueda, S.; Yaoita, Y.; Hayashida, H.; Miyata, T.; Honjo, T.

    1987-02-01

    To determine the phylogenetic relationships among hominoids and the dates of their divergence, the complete nucleotide sequences of the constant region of the immunoglobulin eta-chain (C_eta1) genes from chimpanzee and orangutan have been determined. These sequences were compared with the human eta-chain constant-region sequence. A molecular clock (silent molecular clock), measured by the degree of sequence divergence at the synonymous (silent) positions of protein-encoding regions, was introduced for the present study. From the comparison of nucleotide sequences of α1-antitrypsin and β- and delta-globulin genes between humans and Old World monkeys, the silent molecular clock was calibrated: the mean evolutionary rate of silent substitution was determined to be 1.56 x 10^-9 substitutions per site per year. Using the silent molecular clock, the mean divergence dates of chimpanzee and orangutan from the human lineage were estimated as 6.4 +/- 2.6 million years and 17.3 +/- 4.5 million years, respectively. It was also shown that the evolutionary rate of primate genes is considerably slower than those of other mammalian genes.
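
    A molecular clock converts sequence divergence into time. The abstract does not give its formula, so the sketch below assumes the commonly used relation T = K / (2r), where K is the number of silent substitutions per site between two species, r is the rate per site per year, and the factor 2 reflects substitutions accumulating along both lineages. The function names and the example K value are illustrative assumptions.

```python
SILENT_RATE = 1.56e-9  # substitutions per site per year, as reported in the abstract


def divergence_time(k_silent, rate=SILENT_RATE):
    """Years since two lineages split, assuming T = K / (2r).

    Assumes K is already corrected for multiple substitutions and
    that both lineages evolve at the same rate.
    """
    return k_silent / (2 * rate)


def implied_divergence(years, rate=SILENT_RATE):
    """Silent divergence K implied by a split `years` ago (K = 2rT)."""
    return 2 * rate * years


# A hypothetical silent divergence of 2% gives roughly 6.4 million years.
print(divergence_time(0.02))
# The reported human-chimpanzee date of 6.4 million years implies K of about 0.02.
print(implied_divergence(6.4e6))
```

    Under these assumptions, the reported rate and the 6.4 million year estimate correspond to about two silent substitutions per hundred sites.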

  18. Cloning and nucleotide sequence of the Salmonella typhimurium dcp gene encoding dipeptidyl carboxypeptidase.

    PubMed Central

    Hamilton, S; Miller, C G

    1992-01-01

    Plasmids carrying the Salmonella typhimurium dcp gene were isolated from a pBR328 library of Salmonella chromosomal DNA by screening for complementation of a peptide utilization defect conferred by a dcp mutation. Strains carrying these plasmids overproduced dipeptidyl carboxypeptidase approximately 50-fold. The nucleotide sequence of a 2.8-kb region of one of these plasmids contained an open reading frame coding for a protein of 77,269 Da, in agreement with the 80-kDa size for dipeptidyl carboxypeptidase (determined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis and gel filtration). The N-terminal amino acid sequence of dipeptidyl carboxypeptidase purified from an overproducer strain agreed with that predicted by the nucleotide sequence. Northern (RNA) blot data indicated that dcp is not cotranscribed with other genes, and primer extension analysis showed the start of transcription to be 22 bases upstream of the translational start. The amino acid sequence of dcp was not similar to that of a mammalian dipeptidyl carboxypeptidase, angiotensin I-converting enzyme, but showed striking similarities to the amino acid sequence of another S. typhimurium peptidase encoded by the opdA (formerly optA) gene. Images PMID:1537804

  19. Nucleotide sequence analysis of a cloned DNA fragment from human cells reveals homology to retrotransposons.

    PubMed Central

    Flügel, R M; Maurer, B; Bannert, H; Rethwilm, A; Schnitzler, P; Darai, G

    1987-01-01

    During molecular cloning of proviral DNA of human spumaretrovirus, various recombinant clones were established and analyzed. Blot hybridization revealed that one of the recombinant plasmids had the characteristic features of a member of the long interspersed repetitive sequences family. The DNA element was analyzed by restriction mapping and nucleotide sequencing. It showed a high degree of amino acid sequence homology of 54.3% when compared with the 5'-terminal part of the pol gene product of the murine retrotransposon L1Md. The 3' region of the cloned DNA element encodes proteins with an even higher degree of homology of 67.4% in comparison to the corresponding parts of a member of the primate KpnI sequence family. PMID:3031462

  20. Large-scale detection and application of expressed sequence tag single nucleotide polymorphisms in Nicotiana.

    PubMed

    Wang, Y; Zhou, D; Wang, S; Yang, L

    2015-01-01

    Single nucleotide polymorphisms (SNPs) are widespread in the Nicotiana genome. Using an alignment and variation detection method, we developed 20,607,973 SNPs, based on the expressed sequence tag sequences of 10 Nicotiana species. The replacement rate was much higher than the transversion rate in the SNPs, and SNPs are widespread in Nicotiana. In vitro verification indicated that all of the SNPs were high quality and accurate. Evolutionary relationships between 15 varieties were investigated by polymerase chain reaction with a special primer; the specific 302 locus of these sequence results clearly indicated the origin of Zhongyan 100. A database of Nicotiana SNPs (NSNP) was developed to store and search for SNPs in Nicotiana. NSNP is a tool for researchers to develop SNP markers from sequence data. PMID:26214460

  1. Nucleotide sequence of the L1 ribosomal protein gene of Xenopus laevis: remarkable sequence homology among introns.

    PubMed Central

    Loreni, F; Ruberti, I; Bozzoni, I; Pierandrei-Amaldi, P; Amaldi, F

    1985-01-01

    Ribosomal protein L1 is encoded by two genes in Xenopus laevis. The comparison of two cDNA sequences shows that the two L1 gene copies (L1a and L1b) have diverged in many silent sites and very few substitution sites; moreover a small duplication occurred at the very end of the coding region of the L1b gene which thus codes for a product five amino acids longer than that coded by L1a. Quantitatively the divergence between the two L1 genes confirms that a whole genome duplication took place in Xenopus laevis approximately 30 million years ago. A genomic fragment containing one of the two L1 gene copies (L1a), with its nine introns and flanking regions, has been completely sequenced. The 5' end of this gene has been mapped within a 20-pyrimidine stretch as already found for other vertebrate ribosomal protein genes. Four of the nine introns have a 60-nucleotide sequence with 80% homology; within this region some boxes, one of which is 16 nucleotides long, are 100% homologous among the four introns. This feature of L1a gene introns is interesting since we have previously shown that the activity of this gene is regulated at a post-transcriptional level and it involves the block of the normal splicing of some intron sequences. PMID:3841512

  2. Escherichia coli thymidylate kinase: molecular cloning, nucleotide sequence, and genetic organization of the corresponding tmk locus.

    PubMed Central

    Reynes, J P; Tiraby, M; Baron, M; Drocourt, D; Tiraby, G

    1996-01-01

    Thymidylate kinase (dTMP kinase; EC 2.7.4.9) catalyzes the phosphorylation of dTMP to form dTDP in both de novo and salvage pathways of dTTP synthesis. The nucleotide sequence of the tmk gene encoding this essential Escherichia coli enzyme is the last one among all the E. coli nucleoside and nucleotide kinase genes which has not yet been reported. By subcloning the 24.0-min region where the tmk gene has been previously mapped from the lambda phage 236 (E9G1) of the Kohara E. coli genomic library (Y. Kohara, K. Akiyama, and K. Isono, Cell 50:495-508, 1987), we precisely located tmk between acpP and holB genes. Here we report the nucleotide sequence of tmk, including the end portion of an upstream open reading frame (ORF 340) of unknown function that may be cotranscribed with the pabC gene. The tmk gene was located clockwise of and just upstream of the holB gene. Our sequencing data allowed the filling in of the unsequenced gap between the acpP and holB genes within the 24-min region of the E. coli chromosome. Identification of this region as the E. coli tmk gene was confirmed by functional complementation of a yeast dTMP kinase temperature-sensitive mutant and by in vitro enzyme assay of the thymidylate kinase activity in cell extracts of E. coli by use of tmk-overproducing plasmids. The deduced amino acid sequence of the E. coli tmk gene showed significant similarity to the sequences of the thymidylate kinases of vertebrates, yeasts, and viruses as well as two uncharacterized proteins of bacteria belonging to Bacillus and Haemophilus species. PMID:8631667

  3. Nucleotide sequence and characterization of the pyrF operon of Escherichia coli K12.

    PubMed

    Turnbough, C L; Kerr, K H; Funderburg, W R; Donahue, J P; Powell, F E

    1987-07-25

    The pyrF gene of Escherichia coli K12, which encodes the pyrimidine biosynthetic enzyme orotidine-5'-monophosphate (OMP) decarboxylase, is part of an operon that includes a downstream gene designated orfF. The orfF gene product is a small polypeptide of unknown function. The nucleotide sequence of a 1549-base pair chromosomal fragment containing this operon was determined. An open reading frame capable of encoding the 27-kDa OMP decarboxylase subunit was identified and shown to be the pyrF structural gene by purifying and characterizing OMP decarboxylase. The subunit molecular weight (Mr = 26,350), amino-terminal amino acid sequence, and amino acid composition of the polypeptide predicted from the nucleotide sequence are in excellent agreement with those properties determined for the purified enzyme. The orfF structural gene was tentatively identified and apparently encodes an 11,396-dalton polypeptide. The orfF translational initiation codon overlaps the pyrF termination codon, which may indicate translational coupling in the expression of these genes. The pyrF promoter was mapped by primer extension of in vivo transcripts. The primary transcriptional initiation site is 51 base pairs upstream of the pyrF structural gene. The level of pyrF transcription and OMP decarboxylase synthesis was found to be coordinately derepressed by pyrimidine limitation, indicating that regulation of pyrF gene expression occurs at the transcriptional level. Inspection of the nucleotide sequence indicates that pyrF gene expression is not regulated by an attenuation control mechanism similar to that described for the pyrBI operon or pyrE gene. Finally, we compared the amino acid sequences of the OMP decarboxylases from E. coli, Saccharomyces cerevisiae, Neurospora crassa, and Ehrlich ascites cells to identify conserved regions. PMID:2956254

  4. Nucleotide sequence analysis of the hypervariable region III of mitochondrial DNA in Thais.

    PubMed

    Thongngam, Punlop; Leewattanapasuk, Worraanong; Bhoopat, Tanin; Sangthong, Padchanee

    2016-07-01

    This study analyzed the nucleotide sequences of the hypervariable region III (HVRIII) of mitochondrial DNA in Thai individuals. Buccal swab samples were randomly obtained from 100 healthy, unrelated, adult (18-60 years old), volunteer donors living in Thailand. Eighteen different haplotypes were found, of which 11 haplotypes were unique. The most frequent haplotypes observed were 522D-523D. Nucleotide transition from Thymine (T) to Cytosine (C) at position 489 (43%) was the most frequent substitution. Nucleotide transversions were also observed at position 433 (Adenine (A) to C, 1%) and position 499 (Guanine (G) to C, 1%). Fifty-three samples presented nucleotide insertion and deletion of C and A (CA) at position 514-523. Insertion of 1AC (3%) and 2AC (2%) were observed. Deletion of 1CA (53%) and 2CA (2%) at position 514-523 were revealed. The deletion of T at position 459 was observed. The haplotype diversity, random match probability, and discrimination power were calculated to be 0.7770, 0.2308, and 0.7692, respectively. PMID:27107562
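
    The three summary statistics at the end of this abstract can be computed from a list of observed haplotypes. The abstract does not state its formulas, so the sketch below assumes the definitions commonly used in forensic genetics: random match probability as the sum of squared haplotype frequencies, discrimination power as one minus that sum, and haplotype diversity with the n/(n-1) sample-size correction. The function name and the toy data are hypothetical.

```python
from collections import Counter


def forensic_stats(haplotypes):
    """Return (haplotype diversity, random match probability,
    discrimination power) for a list of observed haplotype labels.

    Formulas are the commonly used definitions, assumed here:
      RMP = sum(p_i^2), DP = 1 - RMP, h = n/(n-1) * (1 - sum(p_i^2)).
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two samples are required")
    counts = Counter(haplotypes)
    sum_sq = sum((count / n) ** 2 for count in counts.values())
    rmp = sum_sq
    dp = 1 - sum_sq
    diversity = n / (n - 1) * (1 - sum_sq)
    return diversity, rmp, dp


# Toy sample of four individuals carrying three haplotypes (made-up labels).
print(forensic_stats(["A", "A", "B", "C"]))
```

    These definitions are consistent with the reported values: with n = 100, (100/99) x (1 - 0.2308) is approximately 0.7770, and 1 - 0.2308 = 0.7692.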

  5. Nucleotide sequence variation of chitin synthase genes among ectomycorrhizal fungi and its potential use in taxonomy.

    PubMed Central

    Mehmann, B; Brunner, I; Braus, G H

    1994-01-01

    DNA sequences of single-copy genes coding for chitin synthases (UDP-N-acetyl-D-glucosamine:chitin 4-beta-N-acetylglucosaminyltransferase; EC 2.4.1.16) were used to characterize ectomycorrhizal fungi. Degenerate primers deduced from short, completely conserved amino acid stretches flanking a region of about 200 amino acids of zymogenic chitin synthases allowed the amplification of DNA fragments of several members of this gene family. Different DNA band patterns were obtained from basidiomycetes because of variation in the number and length of amplified fragments. Cloning and sequencing of the most prominent DNA fragments revealed that these differences were due to various introns at conserved positions. The presence of introns in basidiomycetous fungi therefore has a potential use in identification of genera by analyzing PCR-generated DNA fragment patterns. Analyses of the nucleotide sequences of cloned fragments revealed variations in nucleotide sequences from 4 to 45%. By comparison of the deduced amino acid sequences, the majority of the DNA fragments were identified as members of genes for chitin synthase class II. The deduced amino acid sequences from species of the same genus differed only in one amino acid residue, whereas identity between the amino acid sequences of ascomycetous and basidiomycetous fungi within the same taxonomic class was found to be approximately 43 to 66%. Phylogenetic analysis of the amino acid sequence of class II chitin synthase-encoding gene fragments by using parsimony confirmed the current taxonomic groupings. In addition, our data revealed a fourth class of putative zymogenic chitin synthases. PMID:7944356

  6. CodingMotif: exact determination of overrepresented nucleotide motifs in coding sequences

    PubMed Central

    2012-01-01

    Background: It has been increasingly appreciated that coding sequences harbor regulatory sequence motifs in addition to encoding for protein. These sequence motifs are expected to be overrepresented in nucleotide sequences bound by a common protein or small RNA. However, detecting overrepresented motifs has been difficult because of interference by constraints at the protein level. Sampling-based approaches to solve this problem based on codon-shuffling have been limited to exploring only an infinitesimal fraction of the sequence space and by their use of parametric approximations. Results: We present a novel O(N (log N)^2)-time algorithm, CodingMotif, to identify nucleotide-level motifs of unusual copy number in protein-coding regions. Using a new dynamic programming algorithm we are able to exhaustively calculate the distribution of the number of occurrences of a motif over all possible coding sequences that encode the same amino acid sequence, given a background model for codon usage and dinucleotide biases. Our method takes advantage of the sparseness of loci where a given motif can occur, greatly speeding up the required convolution calculations. Knowledge of the distribution allows one to assess the exact non-parametric p-value of whether a given motif is over- or under-represented. We demonstrate that our method identifies known functional motifs more accurately than sampling and parametric-based approaches in a variety of coding datasets of various size, including ChIP-seq data for the transcription factors NRSF and GABP. Conclusions: CodingMotif provides a theoretically and empirically-demonstrated advance for the detection of motifs overrepresented in coding sequences. We expect CodingMotif to be useful for identifying motifs in functional genomic datasets such as DNA-protein binding, RNA-protein binding, or microRNA-RNA binding within coding regions. A software implementation is available at http://bioinformatics.bc.edu/chuanglab/codingmotif.tar PMID
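
    The quantity this abstract describes, the distribution of motif occurrence counts over all coding sequences that encode the same amino acid sequence, can be illustrated on a toy scale. The sketch below is not the CodingMotif algorithm (which uses dynamic programming and a codon usage and dinucleotide background model). It enumerates every synonymous coding sequence of a very short peptide by brute force and assumes uniform codon usage, which only works for tiny inputs. The codon table is a small subset of the standard genetic code, and all function names are hypothetical.

```python
from collections import Counter
from itertools import product

# Subset of the standard genetic code, enough for the toy peptide below.
SYNONYMOUS_CODONS = {
    "M": ["ATG"],
    "K": ["AAA", "AAG"],
    "F": ["TTT", "TTC"],
    "D": ["GAT", "GAC"],
}


def count_overlapping(seq, motif):
    """Count occurrences of `motif` in `seq`, allowing overlaps."""
    return sum(1 for i in range(len(seq) - len(motif) + 1) if seq.startswith(motif, i))


def motif_count_distribution(peptide, motif):
    """Map occurrence count -> probability over all synonymous codings.

    Assumes every synonymous codon is equally likely (an illustrative
    simplification, not the background model described in the abstract).
    """
    choices = [SYNONYMOUS_CODONS[aa] for aa in peptide]
    tally = Counter()
    total = 0
    for codons in product(*choices):
        tally[count_overlapping("".join(codons), motif)] += 1
        total += 1
    return {count: hits / total for count, hits in sorted(tally.items())}


def upper_tail_p_value(dist, observed):
    """Exact probability of seeing at least `observed` occurrences."""
    return sum(p for count, p in dist.items() if count >= observed)


dist = motif_count_distribution("MKFD", "AA")
p_value = upper_tail_p_value(dist, 2)
print(dist, p_value)
```

    Brute-force enumeration grows exponentially with peptide length, which is the limitation the exhaustive dynamic programming approach in the abstract is meant to avoid.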

  7. The nucleotide sequence of an infectious clone of the geminivirus beet curly top virus.

    PubMed

    Stanley, J; Markham, P G; Callis, R J; Pinner, M S

    1986-08-01

    A number of infectious clones of a Californian isolate of the leafhopper-transmitted geminivirus beet curly top virus (BCTV) have been constructed from virus-specific double-stranded DNA isolated from infected Beta vulgaris and used to demonstrate a single component genome. The nucleotide sequence of one infectious clone has been determined (2993 nucleotides). Comparison with other geminiviruses has shown that the organisation of the genome closely resembles DNA 1 of the whitefly-transmitted members. The four conserved coding regions of DNA 1 have highly homologous counterparts in BCTV with the exception of the putative coat protein which is more closely related to those of the leafhopper-transmitted geminiviruses suggesting a strong interrelationship between coat protein and insect vector. A BCTV component equivalent to DNA 2 is not required for virus infection or transmission and has not been isolated from infected plants. PMID:16453696

  8. Nucleotide sequence analysis of the L gene of Newcastle disease virus: homologies with Sendai and vesicular stomatitis viruses.

    PubMed Central

    Yusoff, K; Millar, N S; Chambers, P; Emmerson, P T

    1987-01-01

    The nucleotide sequence of the L gene of the Beaudette C strain of Newcastle disease virus (NDV) has been determined. The L gene is 6704 nucleotides long and encodes a protein of 2204 amino acids with a calculated molecular weight of 248822. Mung bean nuclease mapping of the 5' terminus of the L gene mRNA indicates that the transcription of the L gene is initiated 11 nucleotides upstream of the translational start site. Comparison with the amino acid sequences of the L genes of Sendai virus and vesicular stomatitis virus (VSV) suggests that there are several regions of homology between the sequences. These data provide further evidence for an evolutionary relationship between the Paramyxoviridae and the Rhabdoviridae. A non-coding sequence of 46 nucleotides downstream of the presumed polyadenylation site of the L gene may be part of a negative strand leader RNA. PMID:3035486

  9. Developing Single Nucleotide Polymorphism (SNP) markers from transcriptome sequences for the identification of longan (Dimocarpus longan) germplasm

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Longan (Dimocarpus longan Lour.) is an important tropical fruit tree crop. Accurate varietal identification is essential for germplasm management and breeding. Using longan transcriptome sequences from public databases, we developed single nucleotide polymorphism (SNP) markers; validated 60 SNPs in...

  10. Conservation of nucleotide sequences for molecular diagnosis of Middle East respiratory syndrome coronavirus, 2015.

    PubMed

    Furuse, Yuki; Okamoto, Michiko; Oshitani, Hitoshi

    2015-11-01

    Infection due to the Middle East respiratory syndrome coronavirus (MERS-CoV) is widespread. The present study was performed to assess the protocols used for the molecular diagnosis of MERS-CoV by analyzing the nucleotide sequences of viruses detected between 2012 and 2015, including sequences from the large outbreak in eastern Asia in 2015. Although the diagnostic protocols were established only 2 years ago, mismatches between the sequences of primers/probes and viruses were found for several of the assays. Such mismatches could lead to a lower sensitivity of the assay, thereby leading to false-negative diagnosis. A slight modification in the primer design is suggested. Protocols for the molecular diagnosis of viral infections should be reviewed regularly after they are established, particularly for viruses that pose a great threat to public health such as MERS-CoV. PMID:26432410
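
    The kind of primer-to-virus comparison described here can be sketched as a simple mismatch count. This is only an illustration under stated assumptions: it compares a primer against every same-length window of a target sequence and reports the window with the fewest mismatches, ignoring strand orientation, degenerate bases and the position of each mismatch within the primer. The sequences and function names are hypothetical and are not MERS-CoV assay data.

```python
def count_mismatches(primer, site):
    """Number of positions at which two equal-length sequences differ."""
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    return sum(1 for a, b in zip(primer.upper(), site.upper()) if a != b)


def best_binding_site(primer, target):
    """Return (position, mismatches) for the best-matching window in `target`."""
    if len(primer) > len(target):
        raise ValueError("primer is longer than the target sequence")
    windows = range(len(target) - len(primer) + 1)
    scored = [(count_mismatches(primer, target[i:i + len(primer)]), i) for i in windows]
    mismatches, position = min(scored)
    return position, mismatches


# Made-up sequences for illustration only.
position, mismatches = best_binding_site("ACGTAC", "TTACGTTCGG")
print(position, mismatches)
```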

  11. Nucleotide sequence of the tcml gene (ribosomal protein L3) of Saccharomyces cerevisiae.

    PubMed Central

    Schultz, L D; Friesen, J D

    1983-01-01

    The yeast tcml gene, which codes for ribosomal protein L3, has been isolated by using recombinant DNA and genetic complementation. The DNA fragment carrying this gene has been subcloned and we have determined its DNA sequence. The 20 amino acid residues at the amino terminus as inferred from the nucleotide sequence agreed exactly with the amino acid sequence data. The amino acid composition of the encoded protein agreed with that determined for purified ribosomal protein L3. Codon usage in the tcml gene was strongly biased in the direction found for several other abundant Saccharomyces cerevisiae proteins. The tcml gene has no introns, which appears to be atypical of ribosomal protein structural genes. PMID:6305925

  12. Identification of shark species in seafood products by forensically informative nucleotide sequencing (FINS).

    PubMed

    Blanco, M; Pérez-Martín, R I; Sotelo, C G

    2008-11-12

    The identification of commercial shark species is a relevant issue to ensure the correct labeling of seafood products, to maintain consumer confidence in seafood, and to enhance the knowledge of the species and volumes that are at present being captured, thus improving the management of shark fisheries. The polymerase chain reaction was employed to obtain a 423 bp amplicon from the mitochondrial cytochrome b gene. The sequences from this fragment, belonging to 63 authentic individuals of 23 species, were analyzed using a genetic distance method. Nine different samples of commercial fresh, frozen, and convenience food were obtained in local and international markets to validate the methodology. These samples were analyzed, and sequences were employed for species identification, showing that forensically informative nucleotide sequencing (FINS) is a suitable technique for identification of processed seafood containing shark as an ingredient. The results also showed that incorrect labeling practices may occur regarding shark products, probably because of incorrect labeling at the production point. PMID:18831561

  13. Nucleotide and predicted amino acid sequences of cloned human and mouse preprocathepsin B cDNAs.

    PubMed Central

    Chan, S J; San Segundo, B; McCormick, M B; Steiner, D F

    1986-01-01

    Cathepsin B is a lysosomal thiol proteinase that may have additional extralysosomal functions. To further our investigations on the structure, mode of biosynthesis, and intracellular sorting of this enzyme, we have determined the complete coding sequences for human and mouse preprocathepsin B by using cDNA clones isolated from human hepatoma and kidney phage libraries. The nucleotide sequences predict that the primary structure of preprocathepsin B contains 339 amino acids organized as follows: a 17-residue NH2-terminal prepeptide sequence followed by a 62-residue propeptide region, 254 residues in mature (single chain) cathepsin B, and a 6-residue extension at the COOH terminus. A comparison of procathepsin B sequences from three species (human, mouse, and rat) reveals that the homology between the propeptides is relatively conserved with a minimum of 68% sequence identity. In particular, two conserved sequences in the propeptide that may be functionally significant include a potential glycosylation site and the presence of a single cysteine at position 59. Comparative analysis of the three sequences also suggests that processing of procathepsin B is a multistep process, during which enzymatically active intermediate forms may be generated. The availability of the cDNA clones will facilitate the identification of possible active or inactive intermediate processive forms as well as studies on the transcriptional regulation of the cathepsin B gene. PMID:3463996

  14. Cloning and nucleotide sequence of the simian rotavirus gene 6 that codes for the major inner capsid protein.

    PubMed Central

    Estes, M K; Mason, B B; Crawford, S; Cohen, J

    1984-01-01

    The nucleotide sequence of the gene that codes for the major inner capsid protein of the simian rotavirus SA11 has been determined. A DNA copy of mRNA from gene 6 was cloned in the E. coli plasmid pBR322. The full-length gene is 1357 nucleotides long with a 5'-noncoding region of 23 nucleotides and a 3'-noncoding region of 140 nucleotides. The gene contains a single, long, open reading-frame of 1194 nucleotides capable of coding for a protein of 397 amino acids with a molecular weight of 44,816. The predicted protein product is relatively proline-rich with a net charge at neutral pH of -3.5. One stretch of 53 amino acids (encoded by nucleotides 327-485) is basic. PMID:6322125
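
    The lengths quoted in this abstract are mutually consistent, and the check can be written out as a short calculation. The sketch assumes that the open reading frame length includes the stop codon, which is what makes 1194 nucleotides correspond to 397 amino acids. The function name is hypothetical.

```python
def orf_summary(total_len, utr5_len, utr3_len):
    """Return (ORF length in nucleotides, protein length in amino acids).

    Assumes the ORF spans everything between the two noncoding regions
    and ends with one stop codon that is not translated.
    """
    orf_len = total_len - utr5_len - utr3_len
    if orf_len <= 0 or orf_len % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_len // 3
    return orf_len, codons - 1  # drop the stop codon


# Figures taken from the abstract: 1357 nt gene, 23 nt 5' and 140 nt 3' noncoding.
orf_len, protein_len = orf_summary(1357, 23, 140)
print(orf_len, protein_len)

# The basic stretch spans nucleotides 327-485 inclusive.
stretch_codons = (485 - 327 + 1) // 3
print(stretch_codons)
```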

  15. Nucleotide sequence of the gene encoding the repressor for the histidine utilization genes of Pseudomonas putida.

    PubMed Central

    Allison, S L; Phillips, A T

    1990-01-01

    The hutC gene of Pseudomonas putida encodes a repressor which, in combination with the inducer urocanate, regulates expression of the five structural genes necessary for conversion of histidine to glutamate, ammonia, and formate. The nucleotide sequence of the hutC region was determined and found to contain two open reading frames which overlapped by one nucleotide. The first open reading frame (ORF1) appeared to encode a 27,648-dalton protein of 248 amino acids whose sequence strongly resembled that of the hut repressor of Klebsiella aerogenes (A. Schwacha and R. A. Bender, J. Bacteriol. 172:5477-5481, 1990) and contained a helix-turn-helix motif that could be involved in operator binding. The gene was preceded by a sequence which was nearly identical to that of the operator site located upstream of hutU which controls transcription of the hutUHIG genes. The operator near hutC would presumably allow the hut repressor to regulate its own synthesis as well as the expression of the divergent hutF gene. A second open reading frame (ORF2) would encode a 21,155-dalton protein, but because this region could be deleted with only a slight effect on repressor activity, it is not likely to be involved in repressor function or structure. PMID:2203753

  16. Structure and nucleotide sequence of the rat intestinal vitamin D-dependent calcium binding protein gene.

    PubMed Central

    Krisinger, J; Darwish, H; Maeda, N; DeLuca, H F

    1988-01-01

    The vitamin D-dependent intestinal calcium binding protein (ICaBP, 9 kDa) is under transcriptional regulation by 1,25-dihydroxyvitamin D3 [1,25-(OH)2D3], the hormonally active form of the vitamin. To study the mechanism of gene regulation by 1,25-(OH)2D3, we isolated the rat ICaBP gene by using a cDNA probe. Its nucleotide sequence revealed 3 exons separated by 2 introns within approximately 3 kilobases. The first exon represents only noncoding sequences, while the second and third encode the two calcium binding domains of the protein. The gene contains a 15-base-pair imperfect palindrome in the first intron that shows high homology to the estrogen-responsive element. This sequence may represent the vitamin D-responsive element involved in the regulation of the ICaBP gene. The second intron shows an 84-base-pair-long simple nucleotide repeat that implicates Z-DNA formation. Genomic Southern analysis shows that the rat gene is represented as a single copy. PMID:3194402

  17. Nucleotide sequence of the mRNA encoding the pre-alpha-subunit of mouse thyrotropin.

    PubMed Central

    Chin, W W; Kronenberg, H M; Dee, P C; Maloof, F; Habener, J F

    1981-01-01

    We have constructed and cloned in bacteria recombinant DNA molecules containing DNA sequences coding for the precursor of the alpha subunit of thyrotropin (pre-TSH-alpha). Double-stranded DNA complementary to total poly(A)+RNA derived from a mouse pituitary thyrotropic tumor was prepared enzymatically, inserted into the Pst I site of the plasmid pBR322 by using poly(dC).poly(dG) homopolymeric extensions, and cloned in Escherichia coli chi 1776. Cloned cDNAs encoding pre-TSH-alpha were identified by their hybridization to pre-TSH-alpha mRNA as determined by cell-free translations of hybrid-selected and hybrid-arrested RNA. The nucleotide sequences of two cDNAs (510 and 480 base pairs) were determined with chemical methods and corresponded to much of the region coding for the alpha subunit and the 3' untranslated region of pre-TSH-alpha mRNA. The sequence of the 5' end of the mRNA was determined from cDNA synthesized by using total mRNA as template and a restriction enzyme DNA fragment as primer. Together these sequences represented greater than 90% of the coding and noncoding regions of full-length pre-TSH-alpha mRNA, which was determined to be 800 bases long. The amino acid sequence of the pre-TSH-alpha deduced from the nucleotide sequence showed a NH2-terminal leader sequence of 24 amino acids followed by the 96-amino-acid sequence of the apoprotein of TSH-alpha. There is greater than 90% homology in the amino acid sequences among the murine, ruminant, and porcine alpha subunits and 75-80% homology among the murine, equine, and human alpha subunits. Several regions of the sequence remain absolutely conserved among all species, suggesting that these particular regions are essential for the biological function of the subunit. The successful cloning of the alpha subunit of TSH will permit further studies of the organization of the genes coding for the glycoprotein hormone subunits and the regulation of their expression. PMID:6272299

  18. Complete Nucleotide Sequence of a Conjugative Plasmid Carrying blaPER-1

    PubMed Central

    Li, Ruichao; Zhou, Yuanjie; Chan, Edward Wai-chi

    2015-01-01

    The nucleotide sequence of a self-transmissible plasmid pVPH1 harboring blaPER-1 from Vibrio parahaemolyticus was determined. pVPH1 was 183,730 bp in size and shared a backbone similar to pAQU1 and pAQU2, differing mainly in an ∼40-kb multidrug resistance (MDR) region. A complex class 1 integron was identified together with ISCR1 and blaPER-1 (ISCR1-blaPER-1-gst-abct-qacEΔ1-sul1), which was shown to form a circular intermediate playing an important role in the dissemination of blaPER-1. PMID:25779581

  19. Nucleotide sequence and organization of copper resistance genes from Pseudomonas syringae pv. tomato

    SciTech Connect

    Mellano, M.A.; Cooksey, D.A.

    1988-06-01

    The nucleotide sequence of a 4.5-kilobase copper resistance determinant from Pseudomonas syringae pv. tomato revealed four open reading frames (ORFs) in the same orientation. Deletion and site-specific mutational analyses indicated that the first two ORFs were essential for copper resistance; the last two ORFs were required for full resistance, but low-level resistance could be conferred in their absence. Five highly conserved, direct 24-base repeats were found near the beginning of the second ORF, and a similar, but less conserved, repeated region was found in the middle of the first ORF.

  20. Within-Host Nucleotide Diversity of Virus Populations: Insights from Next-Generation Sequencing

    PubMed Central

    Nelson, Chase W.; Hughes, Austin L.

    2014-01-01

    Next-generation sequencing (NGS) technology offers new opportunities for understanding the evolution and dynamics of viral populations within individual hosts over the course of infection. We review simple methods for estimating synonymous and nonsynonymous nucleotide diversity in viral genes from NGS data without the need for inferring linkage. We discuss the potential usefulness of these data for addressing questions of both practical and theoretical interest, including fundamental questions regarding the effective population sizes of within-host viral populations and the modes of natural selection acting on them. PMID:25481279
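
    A per-site estimate of nucleotide diversity that needs no linkage information can be sketched from read counts alone. The estimator below, the fraction of read pairs at a site that carry different nucleotides, averaged over sites, is an assumption about what such a "simple method" might look like rather than the specific procedure reviewed in the article. It also does not separate synonymous from nonsynonymous differences. The read counts and function names are hypothetical.

```python
def site_diversity(counts):
    """Fraction of read pairs at one site that differ in nucleotide.

    `counts` maps each nucleotide to its read count at the site.
    """
    n = sum(counts.values())
    if n < 2:
        return 0.0
    # Ordered pairs of reads that carry the same nucleotide.
    same_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_pairs / (n * (n - 1))


def mean_diversity(sites):
    """Average per-site diversity across a list of site count dictionaries."""
    if not sites:
        raise ValueError("no sites given")
    return sum(site_diversity(s) for s in sites) / len(sites)


# Hypothetical read counts at three sites of a viral gene.
sites = [
    {"A": 100},            # monomorphic site
    {"A": 90, "G": 10},    # minor variant at 10%
    {"C": 50, "T": 50},    # balanced polymorphism
]
pi = mean_diversity(sites)
print(round(pi, 4))
```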

  1. The Complete Nucleotide Sequence of the Mitochondrial Genome of Bactrocera minax (Diptera: Tephritidae)

    PubMed Central

    Zhang, Bin; Nardi, Francesco; Hull-Sanders, Helen; Wan, Xuanwu; Liu, Yinghong

    2014-01-01

    The complete 16,043 bp mitochondrial genome (mitogenome) of Bactrocera minax (Diptera: Tephritidae) has been sequenced. The genome encodes 37 genes usually found in insect mitogenomes. The mitogenome information for B. minax was compared to the homologous sequences of Bactrocera oleae, Bactrocera tryoni, Bactrocera philippinensis, Bactrocera carambolae, Bactrocera papayae, Bactrocera dorsalis, Bactrocera correcta, Bactrocera cucurbitae and Ceratitis capitata. The analysis indicated the structure and organization are typical of, and similar to, the nine closely related species mentioned above, although it contains the lowest genome-wide A+T content (67.3%). Four short intergenic spacers with a high degree of conservation among the nine tephritid species mentioned above and B. minax were observed, which also have clear counterparts in the control regions (CRs). Correlation analysis among these ten tephritid species revealed close positive correlation between the A+T content of zero-fold degenerate sites (P0FD), the ratio of nucleotide substitution frequency at P0FD sites to all degenerate sites (zero-fold degenerate sites, two-fold degenerate sites and four-fold degenerate sites) and amino acid sequence distance (ASD). Further, significant positive correlation was observed between the A+T content of four-fold degenerate sites (P4FD) and the ratio of nucleotide substitution frequency at P4FD sites to all degenerate sites; however, we found significant negative correlation between ASD and the A+T content of P4FD, and the ratio of nucleotide substitution frequency at P4FD sites to all degenerate sites. A higher nucleotide substitution frequency at non-synonymous sites compared to synonymous sites was observed in nad4, the first time that has been observed in an insect mitogenome. A poly(T) stretch at the 5′ end of the CR followed by a [TA(A)]n-like stretch was also found. In addition, a highly conserved G+A-rich sequence block was observed in front of the

  2. Nucleotide sequence of the gene encoding the two-subunit pilin of Bacteroides nodosus 265.

    PubMed Central

    Elleman, T C; Hoyne, P A; McKern, N M; Stewart, D J

    1986-01-01

    The nucleotide sequence of the gene encoding pilin from Bacteroides nodosus 265 has been determined. The pilin is encoded by a single-copy gene, from which can be predicted a prepilin comprising a single protein chain of Mr 16,637. The prepilin sequence differs in several respects from the mature protein sequence. Seven additional N-terminal amino acid residues are present in prepilin, whereas residue 8, phenylalanine, undergoes posttranslational modification to become the N-methylated amino-terminal residue of mature pilin. In addition, further processing occurs through internal cleavage to produce two noncovalently linked subunits characteristic of pilins from serogroup H of B. nodosus, of which strain 265 is a member. The position of cleavage has been identified between alanine residues at positions 72 and 73 of the mature 149-residue pilin protein. The predicted pilin sequence of B. nodosus 265 shows extensive N-terminal amino acid sequence homology with other pilins of the N-methylphenylalanine type. In addition this sequence also shows homology with these N-methylphenylalanine-type pilins in the C-terminal region of the molecule, especially with pilin from Pseudomonas aeruginosa PAK. PMID:2873127

  3. The complete nucleotide sequence and structure of the gene encoding bovine phenylethanolamine N-methyltransferase.

    PubMed

    Batter, D K; D'Mello, S R; Turzai, L M; Hughes, H B; Gioio, A E; Kaplan, B B

    1988-03-01

    A cDNA clone for bovine adrenal phenylethanolamine N-methyltransferase (PNMT) was used to screen a Charon 28 genomic library. One phage was identified, designated lambda P1, which included the entire PNMT gene. Construction of a restriction map, with subsequent Southern blot analysis, allowed the identification of exon-containing fragments. Dideoxy sequence analysis of these fragments, and several more further upstream, indicates that the bovine PNMT gene is 1,594 base pairs in length, consisting of three exons and two introns. The transcription initiation site was identified by two independent methods and is located approximately 12 base pairs upstream from the ATG translation start site. The 3' untranslated region is 88 base pairs in length and contains the expected polyadenylation signal (AATAAA). A putative promoter sequence (TATA box) is located about 25 base pairs upstream from the transcription initiation site. Computer comparison of the nucleotide sequence data with the consensus sequences of known regulatory elements revealed potential binding sites for glucocorticoid receptors and the Sp1 regulatory protein in the 5' flanking region of the gene. Additionally, comparison of the sequence of the exons of the PNMT gene with cDNA sequences for other enzymes involved in biogenic amine synthesis revealed no significant homology, indicating that PNMT is not a member of a multigene family of catecholamine biosynthetic enzymes. PMID:3379652

  4. Nucleotide sequence of the gene for the b subunit of human factor XIII

    SciTech Connect

    Bottenus, R.E.; Ichinose, A.; Davie, E.W.

    1990-12-01

    Factor XIII (Mr 320 000) is a blood coagulation factor that stabilizes and strengthens the fibrin clot. It circulates in blood as a tetramer composed of two a subunits (Mr 75 000 each) and two b subunits (Mr 80 000 each). The b subunit consists of 641 amino acids and includes 10 tandem repeats of 60 amino acids known as GP-I structures, short consensus repeats (SCR), or sushi domains. In the present study, the human gene for the b subunit has been isolated from three different genomic libraries prepared in λ phage. Fifteen independent phage with inserts coding for the entire gene were isolated and characterized by restriction mapping, Southern blotting, and DNA sequencing. The gene was found to be 28 kilobases in length and consisted of 12 exons (I-XII) separated by 11 intervening sequences. The leader sequence was encoded by exon I, while the carboxyl-terminal region of the protein was encoded by exon XII. Exons II-XI each coded for a single sushi domain, suggesting that the gene evolved through exon shuffling and duplication. The 12 exons in the gene ranged in size from 64 to 222 base pairs, while the introns ranged in size from 87 to 9970 nucleotides and made up 92% of the gene. One nucleotide change was found in the coding region of the gene when its sequence was compared to that of the cDNA. This difference, however, did not result in a change in the amino acid sequence of the protein.

  5. The EBI Search engine: providing search and retrieval functionality for biological data from EMBL-EBI.

    PubMed

    Squizzato, Silvano; Park, Young Mi; Buso, Nicola; Gur, Tamer; Cowley, Andrew; Li, Weizhong; Uludag, Mahmut; Pundir, Sangya; Cham, Jennifer A; McWilliam, Hamish; Lopez, Rodrigo

    2015-07-01

    The European Bioinformatics Institute (EMBL-EBI; https://www.ebi.ac.uk) provides free and unrestricted access to data across all major areas of biology and biomedicine. Searching and extracting knowledge across these domains requires a fast and scalable solution that addresses the requirements of domain experts as well as casual users. We present the EBI Search engine, referred to here as 'EBI Search', an easy-to-use fast text search and indexing system with powerful data navigation and retrieval capabilities. API integration provides access to analytical tools, allowing users to further investigate the results of their search. The interconnectivity that exists between data resources at EMBL-EBI provides easy, quick and precise navigation and a better understanding of the relationship between different data types including sequences, genes, gene products, proteins, protein domains, protein families, enzymes and macromolecular structures, together with relevant life science literature. PMID:25855807

  6. The EBI Search engine: providing search and retrieval functionality for biological data from EMBL-EBI

    PubMed Central

    Squizzato, Silvano; Park, Young Mi; Buso, Nicola; Gur, Tamer; Cowley, Andrew; Li, Weizhong; Uludag, Mahmut; Pundir, Sangya; Cham, Jennifer A.; McWilliam, Hamish; Lopez, Rodrigo

    2015-01-01

    The European Bioinformatics Institute (EMBL-EBI; https://www.ebi.ac.uk) provides free and unrestricted access to data across all major areas of biology and biomedicine. Searching and extracting knowledge across these domains requires a fast and scalable solution that addresses the requirements of domain experts as well as casual users. We present the EBI Search engine, referred to here as ‘EBI Search’, an easy-to-use fast text search and indexing system with powerful data navigation and retrieval capabilities. API integration provides access to analytical tools, allowing users to further investigate the results of their search. The interconnectivity that exists between data resources at EMBL-EBI provides easy, quick and precise navigation and a better understanding of the relationship between different data types including sequences, genes, gene products, proteins, protein domains, protein families, enzymes and macromolecular structures, together with relevant life science literature. PMID:25855807

  7. High-quality protein knowledge resource: SWISS-PROT and TrEMBL.

    PubMed

    O'Donovan, Claire; Martin, Maria Jesus; Gattiker, Alexandre; Gasteiger, Elisabeth; Bairoch, Amos; Apweiler, Rolf

    2002-09-01

    SWISS-PROT is a curated protein sequence database which strives to provide a high level of annotation (such as the description of the function of a protein, its domain structure, post-translational modifications, variants, etc.), a minimal level of redundancy and a high level of integration with other databases. Together with its automatically annotated supplement TrEMBL, it provides a comprehensive and high-quality view of the current state of knowledge about proteins. Ongoing developments include the further improvement of functional and automatic annotation in the databases including evidence attribution with particular emphasis on the human, archaeal and bacterial proteomes and the provision of additional resources such as the International Protein Index (IPI) and XML format of SWISS-PROT and TrEMBL to the user community. PMID:12230036

  8. Nucleotide sequence and structural features of a novel US-a junction present in a defective herpes simplex virus genome.

    PubMed Central

    Mocarski, E S; Deiss, L P; Frenkel, N

    1985-01-01

    Defective genomes generated during serial propagation of herpes simplex virus type 1 (Justin) consist of tandem reiterations of sequences that are colinear with a portion of the S component of the standard viral genome. We determined the structure of the novel US-a junction, at which the US sequences of one repeat unit join the a sequences of the adjacent repeat unit. Comparison of the nucleotide sequence at this junction with the nucleotide sequence of the corresponding US region of the standard virus genome indicated that the defective genome repeat unit arose by a single recombinational event between an L-S junction a sequence and the US region. The recombinational process might have been mediated by limited sequence homology. The sequences retained within the US-a junction further define the signal for cleavage and packaging of viral DNA. PMID:2989551

  9. Nucleotide sequence alignment of hdcA from Gram-positive bacteria.

    PubMed

    Diaz, Maria; Ladero, Victor; Redruello, Begoña; Sanchez-Llana, Esther; Del Rio, Beatriz; Fernandez, Maria; Martin, Maria Cruz; Alvarez, Miguel A

    2016-03-01

    The decarboxylation of histidine, carried out mainly by some Gram-positive bacteria, yields the toxic dietary biogenic amine histamine (Ladero et al. 2010 〈10.2174/157340110791233256〉 [1], Linares et al. 2016 〈http://dx.doi.org/10.1016/j.foodchem.2015.11.013〉 [2]). The reaction is catalyzed by a pyruvoyl-dependent histidine decarboxylase (Linares et al. 2011 〈10.1080/10408398.2011.582813〉 [3]), which is encoded by the gene hdcA. In order to locate conserved regions in the hdcA gene of Gram-positive bacteria, this article provides a nucleotide sequence alignment of all the hdcA sequences from Gram-positive bacteria present in databases. For further utility and discussion, see 〈http://dx.doi.org/10.1016/j.foodcont.2015.11.035〉 [4]. PMID:26958625

  10. The complete nucleotide sequence and genome organization of pea streak virus (genus Carlavirus).

    PubMed

    Su, Li; Li, Zhengnan; Bernardy, Mike; Wiersma, Paul A; Cheng, Zhihui; Xiang, Yu

    2015-10-01

    Pea streak virus (PeSV) is a member of the genus Carlavirus in the family Betaflexiviridae. Here, the first complete genome sequence of PeSV was determined by deep sequencing of a cDNA library constructed from dsRNA extracted from a PeSV-infected sample and Rapid Amplification of cDNA Ends (RACE) PCR. The PeSV genome consists of 8041 nucleotides excluding the poly(A) tail and contains six open reading frames (ORFs). The putative peptide encoded by the PeSV ORF6 has an estimated molecular mass of 6.6 kDa and shows no similarity to any known proteins. This differs from typical carlaviruses, whose ORF6 encodes a 12- to 18-kDa cysteine-rich nucleic-acid-binding protein. PMID:26092422

  11. Nucleotide sequence alignment of hdcA from Gram-positive bacteria

    PubMed Central

    Diaz, Maria; Ladero, Victor; Redruello, Begoña; Sanchez-Llana, Esther; del Rio, Beatriz; Fernandez, Maria; Martin, Maria Cruz; Alvarez, Miguel A.

    2016-01-01

    The decarboxylation of histidine, carried out mainly by some Gram-positive bacteria, yields the toxic dietary biogenic amine histamine (Ladero et al. 2010 〈10.2174/157340110791233256〉 [1], Linares et al. 2016 〈http://dx.doi.org/10.1016/j.foodchem.2015.11.013〉 [2]). The reaction is catalyzed by a pyruvoyl-dependent histidine decarboxylase (Linares et al. 2011 〈10.1080/10408398.2011.582813〉 [3]), which is encoded by the gene hdcA. In order to locate conserved regions in the hdcA gene of Gram-positive bacteria, this article provides a nucleotide sequence alignment of all the hdcA sequences from Gram-positive bacteria present in databases. For further utility and discussion, see 〈http://dx.doi.org/10.1016/j.foodcont.2015.11.035〉 [4]. PMID:26958625

  12. Infectious hepatitis B virus from cloned DNA of known nucleotide sequence.

    PubMed Central

    Will, H; Cattaneo, R; Darai, G; Deinhardt, F; Schellekens, H; Schaller, H

    1985-01-01

    The infectivity of cloned hepatitis B viral DNA (HBV) has been tested in chimpanzees to identify a fully functional HBV genome and to assess the risk associated with its handling. Only one of two HBV DNA sequence variants tested was shown to be infectious. "Clone purified" virus of predicted nucleotide sequence was produced from the infectious HBV DNA, and the cloned viral genome was identical in structure with naturally occurring HBV. Infection could be initiated independent of whether circular monomeric or plasmid integrated dimeric forms of the viral genome were inoculated, but the infectivity of the DNA depended on liver cell transfection or intrahepatic injection. Intravenous injection of high doses of infectious HBV DNA did not induce hepatitis, suggesting that there is virtually no risk associated with routine laboratory handling of cloned HBV DNA. Images PMID:2983320

  13. Nucleotide sequence of the BsuRI restriction-modification system.

    PubMed Central

    Kiss, A; Posfai, G; Keller, C C; Venetianer, P; Roberts, R J

    1985-01-01

    The genes of the 5'-GGCC specific BsuRI restriction-modification system of Bacillus subtilis have been cloned and expressed in E. coli and their nucleotide sequence has been determined. The restriction and modification genes code for polypeptides with calculated molecular weights of 66,314 and 49,642, respectively. Both enzymes are coded by the same DNA strand. The restriction gene is upstream of the methylase gene and the coding regions are separated by 780 bp. Analysis of the RNA transcripts by S1-nuclease mapping indicates that the restriction and modification genes are transcribed from different promoters. Comparison of the amino acid sequences revealed no homology between the BsuRI restriction and modification enzymes. There are, however, regions of homology between the BsuRI methylase and two other GGCC specific modification enzymes, the BspRI and SPR methylases. Images PMID:2997708

  14. Nucleotide sequence and expression of the gene encoding the EcoRII modification enzyme.

    PubMed Central

    Som, S; Bhagwat, A S; Friedman, S

    1987-01-01

    The gene coding for the EcoRII modification enzyme has been cloned and the nucleotide sequence of 1933 base pairs containing the gene has been determined. The gene codes for a protein of 477 amino acids. Two transcriptional start sites have been mapped by S1 mapping. One deletion that removes 34 N-terminal amino acids was found to have partial enzyme activity. Comparison of the EcoRII methylase sequence with other cytosine methylases revealed several domains of partial homology among all cytosine methylases. Cloning the gene in multicopy pUC vectors increased the expression by 6-18 fold. A 40 fold overproduction of the EcoRII methylase was obtained by cloning the gene in the expression vector carrying the lambda PL promoter. Images PMID:3029675

  15. Nucleotide sequence of nifD from Frankia alni strain ArI3: phylogenetic inferences.

    PubMed

    Normand, P; Gouy, M; Cournoyer, B; Simonet, P

    1992-05-01

    The complete nucleotide sequence of the nifD gene encoding the alpha subunit of component I of nitrogenase from Frankia alni strain ArI3 was determined. The coding region is 1,458 bp in length and encodes a polypeptide of 486 residues with a predicted molecular weight of 53,500. Phylogenetic inferences with 12 complete published nifD sequences were drawn using a variety of approaches. Frankia nifD clusters with proteobacteria rather than with Clostridium pasteurianum, the other Gram-positive bacterium studied. Extant eubacterial nif genes seem to have at least three distinct evolutionary origins as a result of ancient gene duplications. Within the Gram-positive bacterial phylum, functional nif genes descend from different duplicates. PMID:1584016

  16. Nucleotide sequence analysis of beta tubulin gene in a wide range of dermatophytes.

    PubMed

    Rezaei-Matehkolaei, Ali; Mirhendi, Hossein; Makimura, Koichi; de Hoog, G Sybren; Satoh, Kazuo; Najafzadeh, Mohammad Javad; Shidfar, Mohammad Reza

    2014-10-01

    We investigated the resolving power of the beta tubulin protein-coding gene (BT2) for systematic study of dermatophyte fungi. Initially, 144 standard and clinical strains belonging to 26 species in the genera Trichophyton, Microsporum, and Epidermophyton were identified by internal transcribed spacer (ITS) sequencing. Subsequently, BT2 was partially amplified in all strains, and sequence analysis, performed after construction of a BT2 database, showed that length ranged from approximately 723 (T. ajelloi) to 808 nucleotides (M. persicolor) in different species. Intraspecific sequence variation was found in some species, but T. tonsurans, T. equinum, T. concentricum, T. verrucosum, T. rubrum, T. violaceum, T. eriotrephon, E. floccosum, M. canis, M. ferrugineum, and M. audouinii were invariant. The sequences were found to be relatively conserved among different strains of the same species. The species with the closest resemblance were Arthroderma benhamiae and T. concentricum and T. tonsurans and T. equinum with 100% and 99.8% identity, respectively; the most distant species were M. persicolor and M. amazonicum. The dendrogram obtained from BT2 topology was almost compatible with the species concept based on ITS sequencing, and similar clades and species were distinguished in the BT2 tree. Here, beta tubulin was characterized in a wide range of dermatophytes in order to assess intra- and interspecies variation and resolution and was found to be a taxonomically valuable gene. PMID:25079222

  17. Unique nucleotide sequence (UNS)-guided assembly of repetitive DNA parts for synthetic biology applications

    PubMed Central

    Torella, Joseph P.; Lienert, Florian; Boehm, Christian R.; Chen, Jan-Hung; Way, Jeffrey C.; Silver, Pamela A.

    2016-01-01

    Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts and hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies — for example repeated terminator and insulator sequences — that complicate recombination-based assembly. We and others have recently developed DNA assembly methods that we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly-assembled constructs, or into high-quality combinatorial libraries in only 2–3 days. If the DNA parts must be generated from scratch, an additional 2–5 days are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques. PMID:25101822

  18. IMGT/LIGM-DB, the IMGT comprehensive database of immunoglobulin and T cell receptor nucleotide sequences.

    PubMed

    Giudicelli, Véronique; Duroux, Patrice; Ginestoux, Chantal; Folch, Géraldine; Jabado-Michaloud, Joumana; Chaume, Denys; Lefranc, Marie-Paule

    2006-01-01

    IMGT/LIGM-DB is the IMGT comprehensive database of immunoglobulin (IG) and T cell receptor (TR) nucleotide sequences from human and other vertebrate species. It was created in 1989 by LIGM, Montpellier, France and is the oldest and the largest database of IMGT. IMGT/LIGM-DB includes all germline (non-rearranged) and rearranged IG and TR genomic DNA (gDNA) and complementary DNA (cDNA) sequences published in generalist databases. IMGT/LIGM-DB allows searches from the Web interface according to biological and immunogenetic criteria through five distinct modules depending on the user interest. For a given entry, nine types of display are available including the IMGT flat file, the translation of the coding regions and the analysis by the IMGT/V-QUEST tool. IMGT/LIGM-DB distributes expertly annotated sequences. The annotations hugely enhance the quality and the accuracy of the distributed detailed information. They include the sequence identification, the gene and allele classification, the constitutive and specific motif description, the codon and amino acid numbering, and the sequence obtaining information, according to the main concepts of IMGT-ONTOLOGY. They represent the main source of IG and TR gene and allele knowledge stored in IMGT/GENE-DB and in the IMGT reference directory. IMGT/LIGM-DB is freely available at http://imgt.cines.fr. PMID:16381979

  19. Unique nucleotide sequence-guided assembly of repetitive DNA parts for synthetic biology applications

    SciTech Connect

    Torella, JP; Lienert, F; Boehm, CR; Chen, JH; Way, JC; Silver, PA

    2014-08-07

    Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts, and they hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies-for example, repeated terminator and insulator sequences-that complicate recombination-based assembly. We and others have recently developed DNA assembly methods, which we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly assembled constructs, or into high-quality combinatorial libraries in only 2-3 d. If the DNA parts must be generated from scratch, an additional 2-5 d are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques.

  20. Mapping Nucleotide Sequences that Encode Complex Binary Disease Traits with HapMap

    PubMed Central

    Cui, Yuehua; Fu, Wenjiang; Sun, Kelian; Romero, Roberto; Wu, Rongling

    2007-01-01

    Detecting the patterns of DNA sequence variants across the human genome is a crucial step for unraveling the genetic basis of complex human diseases. The human HapMap constructed by single nucleotide polymorphisms (SNPs) provides efficient sequence variation information that can speed up the discovery of genes related to common diseases. In this article, we present a generalized linear model for identifying specific nucleotide variants that encode complex human diseases. A novel approach is derived to group haplotypes to form composite diplotypes, which largely reduces the model degrees of freedom for an association test and hence increases the power when multiple SNP markers are involved. An efficient two-stage estimation procedure based on the expectation-maximization (EM) algorithm is derived to estimate parameters. Non-genetic environmental or clinical risk factors can also be fitted into the model. Computer simulations show that our model has reasonable power and type I error rate with appropriate sample size. It is also suggested through simulations that a balanced design with approximately equal number of cases and controls should be preferred to maintain small estimation bias and reasonable testing power. To illustrate the utility, we apply the method to a genetic association study of large for gestational age (LGA) neonates. The model provides a powerful tool for elucidating the genetic basis of complex binary diseases. PMID:19384427

  1. Mapping DNA methylation by transverse current sequencing: Reduction of noise from neighboring nucleotides

    NASA Astrophysics Data System (ADS)

    Alvarez, Jose; Massey, Steven; Kalitsov, Alan; Velev, Julian

    Nanopore sequencing via transverse current has emerged as a competitive candidate for mapping DNA methylation without the need for bisulfite treatment, fluorescent tags, or PCR amplification. By eliminating the error-producing amplification step, long read lengths become feasible, which greatly simplifies the assembly process and reduces the time and the cost inherent in current technologies. However, due to the large error rates of nanopore sequencing, single base resolution has not been reached. A very important source of noise is the intrinsic structural noise in the electric signature of the nucleotide arising from the influence of neighboring nucleotides. In this work we perform calculations of the tunneling current through DNA molecules in nanopores using the non-equilibrium electron transport method within an effective multi-orbital tight-binding model derived from first-principles calculations. We develop a base-calling algorithm accounting for the correlations of the current through neighboring bases, which in principle can reduce the error rate below any desired precision. Using this method we show that we can clearly distinguish DNA methylation and other base modifications based on the reading of the tunneling current.

  2. Evidence for Balancing Selection from Nucleotide Sequence Analyses of Human G6PD

    PubMed Central

    Verrelli, Brian C.; McDonald, John H.; Argyropoulos, George; Destro-Bisol, Giovanni; Froment, Alain; Drousiotou, Anthi; Lefranc, Gerard; Helal, Ahmed N.; Loiselet, Jacques; Tishkoff, Sarah A.

    2002-01-01

    Glucose-6-phosphate dehydrogenase (G6PD) mutations that result in reduced enzyme activity have been implicated in malarial resistance and constitute one of the best examples of selection in the human genome. In the present study, we characterize the nucleotide diversity across a 5.2-kb region of G6PD in a sample of 160 Africans and 56 non-Africans, to determine how selection has shaped patterns of DNA variation at this gene. Our global sample of enzymatically normal B alleles and A, A−, and Med alleles with reduced enzyme activities reveals many previously uncharacterized silent-site polymorphisms. In comparison with the absence of amino acid divergence between human and chimpanzee G6PD sequences, we find that the number of G6PD amino acid polymorphisms in human populations is significantly high. Unlike many other G6PD-activity alleles with reduced activity, we find that the age of the A variant, which is common in Africa, may not be consistent with the recent emergence of severe malaria and therefore may have originally had a historically different adaptive function. Overall, our observations strongly support previous genotype-phenotype association studies that proposed that balancing selection maintains G6PD deficiencies within human populations. The present study demonstrates that nucleotide sequence analyses can reveal signatures of both historical and recent selection in the genome and may elucidate the impact that infectious disease has had during human evolution. PMID:12378426

  3. The HLA-DRA*0102 allele: correct nucleotide sequence and associated HLA haplotypes.

    PubMed

    Kralovicova, J; Marsh, S G E; Waller, M J; Hammarstrom, L; Vorechovsky, I

    2002-09-01

    Here we correct the nucleotide sequence of a single known variant of the HLA-DRA gene. We show that the coding regions of the HLA-DRA*0101 and HLA-DRA*0102 alleles do not differ at two codons as reported previously, but only in codon 217. Using nucleotide sequencing and DNA samples from individuals homozygous in the major histocompatibility complex, we found that the variant, leucine 217-encoding HLA-DRA*0102 allele was present on the haplotypes HLA-B*0801, DRB1*03011, DQB1*0201 (ancestral haplotype AH8.1), HLA-B*07021, DRB1*15011, DQB1*0602 (AH7.1), HLA-B*1501, DRB1*15011, DQB1*0602, HLA-B*1501, DRB1*1402, DQB1*03011 and HLA-A3, B*07021, DRB1*1301, DQB1*0603. The HLA-DRA*0101 allele coding for valine 217 was observed on the haplotypes HLA-B*5701, DRB1*0701, DQB1*03032 (AH57.1), HLA-DRB1*04011, DQB1*0302, HLA-DRB1*0701, DQB1*0202, and HLA-DRB1*0101, DQB1*05011. PMID:12445311

  4. Complete nucleotide sequence of a Spanish isolate of alfalfa mosaic virus: evidence for additional genetic variability.

    PubMed

    Parrella, Giuseppe; Acanfora, Nadia; Orílio, Anelise F; Navas-Castillo, Jesús

    2011-06-01

    Alfalfa mosaic virus (AMV) is a plant virus that is distributed worldwide and can induce necrosis and/or yellow mosaic on a large variety of plant species, including commercially important crops. It is the only virus of the genus Alfamovirus in the family Bromoviridae. AMV isolates can be clustered into two genetic groups that correlate with their geographic origin. Here, we report for the first time the complete nucleotide sequence of a Spanish isolate of AMV found infecting Cape honeysuckle (Tecoma capensis) and named Tec-1. The tripartite genome of Tec-1 is composed of 3643 nucleotides (nt) for RNA1, 2594 nt for RNA2 and 2037 nt for RNA3. Comparative sequence analysis of the coat protein gene revealed that the isolate Tec-1 is distantly related to subgroup I of AMV and more closely related to subgroup II, although forming a distinct phylogenetic clade. Therefore, we propose to split subgroup II of AMV into two subgroups, namely IIA, comprising isolates previously included in subgroup II, and IIB, including the novel Spanish isolate Tec-1. PMID:21327783

  5. Complete nucleotide sequence and genome organization of Pelargonium flower break virus.

    PubMed

    Rico, P; Hernández, C

    2004-03-01

    The complete nucleotide sequence of Pelargonium flower break virus (PFBV) has been determined. The genomic RNA is 3923 nucleotides (nt) long and contains five open reading frames (ORFs). The 5'-proximal ORF encodes a 27 kDa protein (p27) and terminates with an amber codon which may be read-through into an in-frame p56 ORF to generate an 86 kDa protein (p86) containing the viral RNA-dependent RNA polymerase motifs. Two small ORFs, located in the central part of the viral genome, encode polypeptides of 7 (p7) and 12 kDa (p12), respectively, which are very likely involved in virus movement. Interestingly, p12 presents a leucine zipper motif that has not been previously reported in related proteins. The 3'-proximal ORF encodes a 37 kDa capsid protein (CP). The p12 ORF is in-frame with the p86 ORF and a double read-through protein of 99 kDa (p99) may be produced. Amino acid sequence comparisons revealed that the proteins encoded by ORFs 2, 3 and 4 are more similar to the corresponding gene products of Carnation mottle virus than to those of other carmoviruses, whereas the p27 and the CP show higher identity with the equivalent proteins of Saguaro cactus virus. Phylogenetic analysis conducted with the different viral products confirmed the assignment of PFBV to the genus Carmovirus. PMID:14991450

  6. The bioinformatics of nucleotide sequence coding for proteins requiring metal coenzymes and proteins embedded with metals

    NASA Astrophysics Data System (ADS)

    Tremberger, G.; Dehipawala, Sunil; Cheung, E.; Holden, T.; Sullivan, R.; Nguyen, A.; Lieberman, D.; Cheung, T.

    2015-09-01

    All metallo-proteins need post-translation metal incorporation. In fact, the isotope ratios of Fe, Cu, and Zn in physiology and oncology have emerged as an important tool. The nickel containing F430 is the prosthetic group of the enzyme methyl coenzyme M reductase which catalyzes the release of methane in the final step of methanogenesis, a prime energy metabolism candidate for life exploration space mission in the solar system. The 3.5 Gyr early life sulfite reductase as a life switch energy metabolism had Fe-Mo clusters. The nitrogenase for nitrogen fixation 3 billion years ago had Mo. The early life arsenite oxidase needed for anoxygenic photosynthesis energy metabolism 2.8 billion years ago had Mo and Fe. The selection pressure in metal incorporation inside a protein would be quantifiable in terms of the related nucleotide sequence complexity with fractal dimension and entropy values. A simulation model showed that the studied metal-required energy metabolism sequences had at least ten times more selection pressure relatively in comparison to the horizontally transferred sequences in Mealybug, guided by the outcome histogram of the correlation R-sq values. The metal energy metabolism sequence group was compared to the circadian clock KaiC sequence group using magnesium atomic level bond shifting mechanism in the protein, and the simulation model would suggest a much higher selection pressure for the energy life switch sequence group. The possibility of using Kepler 444 as an example of ancient life in Galaxy with the associated exoplanets has been proposed and is further discussed in this report. Examples of arsenic metal bonding shift probed by Synchrotron-based X-ray spectroscopy data and Zn controlled FOXP2 regulated pathways in human and chimp brain studied tissue samples are studied in relationship to the sequence bioinformatics. The analysis results suggest that relatively large metal bonding shift amount is associated with low probability correlation R

  7. Purification and characterization of Clostridium perfringens 120-kilodalton collagenase and nucleotide sequence of the corresponding gene.

    PubMed Central

    Matsushita, O; Yoshihara, K; Katayama, S; Minami, J; Okabe, A

    1994-01-01

    Clostridium perfringens type C NCIB 10662 produced various gelatinolytic enzymes with molecular masses ranging from approximately 120 to approximately 80 kDa. A 120-kDa gelatinolytic enzyme was present in the largest quantity in the culture supernatant, and this enzyme was purified to homogeneity on the basis of sodium dodecyl sulfate-polyacrylamide gel electrophoresis. The purified enzyme was identified as the major collagenase of the organism, and it cleaved typical collagenase substrates such as azocoll, a synthetic substrate (4-phenylazobenzyloxy-carbonyl-Pro-Leu-Gly-Pro-D-Arg [Pz peptide]), and a type I collagen fibril. In addition, a gene (colA) encoding a 120-kDa collagenase was cloned in Escherichia coli. Nested deletions were used to define the coding region of colA, and this region was sequenced; from the nucleotide sequence, this gene encodes a protein of 1,104 amino acids (M(r), 125,966). Furthermore, from the N-terminal amino acid sequence of the purified enzyme which was found in this reading frame, the molecular mass of the mature enzyme was calculated to be 116,339 Da. Analysis of the primary structure of the gene product showed that the enzyme was produced with a stretch of 86 amino acids containing a putative signal sequence. Within this stretch was found PLGP, the amino acid sequence constituting the Pz peptide. This sequence may be implicated in self-processing of the collagenase. A consensus zinc-binding sequence (HEXXH) suggested for vertebrate Zn collagenases is present in this bacterial collagenase. Vibrio alginolyticus collagenase and Achromobacter lyticus protease I showed significant homology with the 120-kDa collagenase of C. perfringens, suggesting that these three enzymes are evolutionarily related. Images PMID:8282691

  8. Species diagnostic single-nucleotide polymorphism and sequence-tagged site markers for the parasitic wasp genus Nasonia (Hymenoptera: Pteromalidae)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We developed, identified and evaluated eight single nucleotide polymorphism (SNP) and three sequence-tagged site (STS) markers in nuclear gene sequences of the wasp genus Nasonia (Hymenoptera). We studied variation of these markers in natural populations of the closely related and regionally sympatr...

  9. Cloning and nucleotide sequence of anaerobically induced porin protein E1 (OprE) of Pseudomonas aeruginosa PAO1.

    PubMed

    Yamano, Y; Nishikawa, T; Komatsu, Y

    1993-05-01

    The porin oprE gene of Pseudomonas aeruginosa PAO1 was isolated. Its nucleotide sequence indicated that the structural gene of 1383 nucleotide residues encodes a precursor consisting of 460 amino acid residues with a signal peptide of 29 amino acid residues, which was confirmed by the N-terminal 23-amino-acid sequence and the reaction with anti-OprE polyclonal antiserum. Anaerobiosis induced OprE production at the transcription level. The transcription start site was determined to be 40 nucleotides upstream from the ATG initiation codon. The control region contained an appropriately situated E sigma 54 recognition site and the putative second half of an ANR box. The amino acid sequence of OprE had some clusters of sequence homologous with that of OprD of P. aeruginosa, which might be responsible for the outer membrane permeability of imipenem and basic amino acids. PMID:8394980
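
    The lengths quoted in this abstract are mutually consistent if the 1383-nucleotide structural gene is assumed to include the stop codon, which the abstract does not state explicitly. A minimal Python sketch of that arithmetic follows; the function name is illustrative and does not come from the paper.

```python
def residues_from_orf(orf_nt: int, includes_stop: bool = True) -> int:
    """Number of amino acid residues encoded by an ORF of orf_nt nucleotides.

    Assumption: when includes_stop is True, the last codon is a stop codon
    and contributes no residue.
    """
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


# Figures taken from the abstract: 1383 nt gene, 29-residue signal peptide.
precursor_residues = residues_from_orf(1383)   # 461 codons, one of them a stop
mature_residues = precursor_residues - 29      # after signal peptide removal

print(precursor_residues, mature_residues)
```

    Under this assumption the precursor length matches the 460 residues reported, and the mature protein would be 431 residues long; the second figure is derived here, not stated in the abstract.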

  10. Cloning, nucleotide sequence, and transcriptional analysis of the Pediococcus acidilactici L-(+)-lactate dehydrogenase gene.

    PubMed Central

    Garmyn, D; Ferain, T; Bernard, N; Hols, P; Delcour, J

    1995-01-01

    Recombinant plasmids containing the Pediococcus acidilactici L-(+)-lactate dehydrogenase gene (ldhL) were isolated by complementing for growth under anaerobiosis of an Escherichia coli lactate dehydrogenase-pyruvate formate lyase double mutant. The nucleotide sequence of the ldhL gene predicted a protein of 323 amino acids showing significant similarity with other bacterial L-(+)-lactate dehydrogenases and especially with that of Lactobacillus plantarum. The ldhL transcription start points in P. acidilactici were defined by primer extension, and the promoter sequence was identified as TCAAT-(17 bp)-TATAAT. This sequence is closely related to the consensus sequence of vegetative promoters from gram-positive bacteria as well as from E. coli. Northern analysis of P. acidilactici RNA showed a 1.1-kb ldhL transcript whose abundance is growth rate regulated. These data, together with the presence of a putative rho-independent transcriptional terminator, suggest that ldhL is expressed as a monocistronic transcript in P. acidilactici. PMID:7887607
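
    As an illustration only, the promoter layout reported in this abstract, TCAAT-(17 bp)-TATAAT, can be written as a regular expression and scanned against a nucleotide sequence. The sequence below is synthetic and the helper name is hypothetical; the abstract identifies the promoter by primer extension and does not describe any such search.

```python
import re

# TCAAT, a spacer of exactly 17 unambiguous bases, then TATAAT.
# The lookahead makes the scan report overlapping candidates as well.
PROMOTER_LIKE = re.compile(r"(?=TCAAT[ACGT]{17}TATAAT)")


def find_promoter_like_sites(seq: str) -> list[int]:
    """Return 0-based start positions of TCAAT-(17 bp)-TATAAT matches."""
    return [m.start() for m in PROMOTER_LIKE.finditer(seq.upper())]


# Synthetic example sequence (not from the P. acidilactici ldhL region).
toy = "GG" + "TCAAT" + "ACGTACGTACGTACGTA" + "TATAAT" + "CC"
print(find_promoter_like_sites(toy))
```

    A match only shows that a stretch has the same layout; it says nothing about whether the site is a functional promoter.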

  11. Nucleotide sequence of ompV, the gene for a major Vibrio cholerae outer membrane protein.

    PubMed

    Pohlner, J; Meyer, T F; Jalajakumari, M B; Manning, P A

    1986-12-01

    The nucleotide sequence of the ompV gene of Vibrio cholerae was determined. The product of the gene is a 28,000 dalton protein which, after the removal of a 19 amino acid signal sequence, produces a mature outer membrane protein of 26,000 daltons. The cleavage site was determined by amino-terminal amino acid sequencing of the purified mature protein. The DNA upstream of the gene shows the presence of a typical promoter region as judged from the Escherichia coli consensus information; however, the Shine-Dalgarno sequence is associated with a region capable of forming a secondary structure in the mRNA. The formation of this structure would inhibit binding of the mRNA to the ribosome and reduce translation. It is proposed that this structure is recognized by a positive activator in V. cholerae and because of its absence in E. coli ompV is poorly expressed. The distribution of rare codons within ompV suggests that they may serve to slow down the translation of particular domains such that the nascent polypeptide has an opportunity to take up its conformation without interference from the later formed regions. Such a mechanism could aid localization of the protein if export were by a cotranslational secretion system. PMID:3031428

  12. Nucleotide sequence of both genomic RNAs of a North American tobacco rattle virus isolate.

    PubMed

    Sudarshana, M R; Berger, P H

    1998-01-01

    The complete sequence of a North American tobacco rattle virus (TRV) isolate, 'Oregon yellow' (ORY), was determined from cDNA and RT-PCR clones derived from the two genomic RNAs of this isolate. RNA-1 is 6790 bases and RNA-2 is 3261 bases. The sequence of TRV-ORY RNA-1 was similar to RNA-1 of TRV isolate SYM, differing in 48 nucleotides. TRV-ORY RNA-1 was one base shorter than that of SYM and had 47 base substitutions resulting in 12 amino acid substitutions, of which 4 were conservative. The RNA-2 of TRV-ORY was distinct from RNA-2 of other characterized TRV isolates and contained three open reading frames (ORFs) that could potentially code for proteins of MW 22.4 kDa, 37.6 kDa and 17.9 kDa. Based on the homology of the predicted amino acid sequence with those of other tobraviruses, ORF1 of RNA-2 encodes the coat protein (CP). The protein sequence of ORF2 had regions of limited similarity with those of ORF2 of two other TRV isolates and pea early browning tobravirus. The ORF3 was unique to TRV-ORY. Phylogenetic analysis of tobravirus CPs indicated that TRV-ORY was most closely related to pepper ringspot tobravirus and TRV-TCM. The relationship of tobravirus CPs to other rod-shaped tubular plant viruses is also discussed. PMID:9739332

  13. Power Spectrum and Mutual Information Analyses of DNA Base (Nucleotide) Sequences

    NASA Astrophysics Data System (ADS)

    Isohata, Yasuhiko; Hayashi, Masaki

    2003-03-01

    On the basis of the power spectrum analyses for the base (nucleotide) sequences of various genes, we have studied long-range correlations in total base sequences which are expressed as $1/f^{\alpha}$, behaviour of the exponent $\alpha$ for the accumulated base sequences as well as periodicities at short range. In particular from the analysis of content rate distributions of $\alpha$ we have obtained the average value $\bar{\alpha}=0.40\pm 0.01$ and $\bar{\alpha}=0.20\pm 0.01$ for the human genes and S. cerevisiae genes, respectively. We have also performed the analyses using the mutual information function. We show that there exists a clear difference between the content rate distributions of correlation lengths for the sample human genes and the S. cerevisiae genes. We are led to a conjecture that the elongation of the correlation length in the base sequences of genes from the early eukaryote (S. cerevisiae) to the late eukaryote (human) should be the definite reflection of the evolutionary process.
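
    The abstract does not give the authors' exact definitions, so the following is only a minimal sketch under common textbook assumptions: the power spectrum is taken as the sum, over the four bases, of the squared magnitude of the discrete Fourier transform of each base-indicator sequence, and the mutual information function compares the joint frequency of base pairs separated by k positions with the product of their marginal frequencies. The function names and the toy sequence are illustrative, and no attempt is made here to fit the exponent of the low-frequency part of the spectrum.

```python
import cmath
import math
from collections import Counter

BASES = "ACGT"


def power_spectrum(seq: str) -> list[float]:
    """S(k) for k = 0..N//2: sum over bases of |DFT(indicator)|^2 / N."""
    n = len(seq)
    spectrum = []
    for k in range(n // 2 + 1):
        total = 0.0
        for base in BASES:
            coeff = sum(cmath.exp(-2j * math.pi * k * pos / n)
                        for pos, ch in enumerate(seq) if ch == base)
            total += abs(coeff) ** 2
        spectrum.append(total / n)
    return spectrum


def mutual_information(seq: str, k: int) -> float:
    """Mutual information (bits) between bases separated by k positions."""
    pairs = Counter(zip(seq, seq[k:]))
    total = sum(pairs.values())
    first, second = Counter(), Counter()
    for (a, b), count in pairs.items():
        first[a] += count
        second[b] += count
    # p_ab / (p_a * p_b) simplifies to count * total / (first[a] * second[b])
    return sum((count / total)
               * math.log2(count * total / (first[a] * second[b]))
               for (a, b), count in pairs.items())


# Toy sequence with an exact period of 3 (not real gene data).
toy = "ATG" * 8
spec = power_spectrum(toy)
peak_k = max(range(1, len(spec)), key=spec.__getitem__)
print(peak_k, len(toy) / peak_k)        # frequency index and its period
print(mutual_information(toy, 3))
```

    On the toy sequence the only non-zero frequency component below N/2 sits at the index corresponding to period 3, which is the kind of short-range periodicity the abstract refers to; real genes give much noisier spectra.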

  14. Proteus mirabilis ambient-temperature fimbriae: cloning and nucleotide sequence of the aft gene cluster.

    PubMed Central

    Massad, G; Fulkerson, J F; Watson, D C; Mobley, H L

    1996-01-01

    Uropathogenic Proteus mirabilis produces at least four types of fimbriae. Amino acid sequences from two peptides, derived by tryptic digestion of the structural subunit of one type of these fimbriae, the ambient-temperature fimbriae, were determined: NVVPGQPSSTQ and LIEGENQLNYNA. PCR primers, based on these sequences and that of the N terminus, were used to amplify a 359-bp fragment. A cosmid clone, isolated from a P. mirabilis genomic library by hybridization with the 359-bp PCR product, was used to determine the nucleotide sequence of the atf gene cluster. A 3,903-bp region encodes three polypeptides: AtfA, the structural subunit; AtfB, the chaperone; and AtfC, the outer membrane molecular usher. No fimbria-related genes are evident either 5' or 3' to the three contiguous genes. AtfA demonstrates significant amino acid sequence identity with type 1 major fimbrial subunits of several enteric species. The 359-bp PCR product hybridized strongly with all Proteus isolates (n = 9) and 25% of 355 Escherichia coli isolates but failed to hybridize with any of 26 isolates among nine other uropathogenic species. Ambient-temperature fimbriae of P. mirabilis may represent a novel type of fimbriae of enteric species. PMID:8926119

  15. Increased functional protein expression using nucleotide sequence features enriched in highly expressed genes in zebrafish

    PubMed Central

    Horstick, Eric J.; Jordan, Diana C.; Bergeron, Sadie A.; Tabor, Kathryn M.; Serpe, Mihaela; Feldman, Benjamin; Burgess, Harold A.

    2015-01-01

    Many genetic manipulations are limited by difficulty in obtaining adequate levels of protein expression. Bioinformatic and experimental studies have identified nucleotide sequence features that may increase expression; however, it is difficult to assess the relative influence of these features. Zebrafish embryos are rapidly injected with calibrated doses of mRNA, enabling the effects of multiple sequence changes to be compared in vivo. Using RNAseq and microarray data, we identified a set of genes that are highly expressed in zebrafish embryos and systematically analyzed for enrichment of sequence features correlated with levels of protein expression. We then tested enriched features by embryo microinjection and functional tests of multiple protein reporters. Codon selection, releasing factor recognition sequence and specific introns and 3′ untranslated regions each increased protein expression between 1.5- and 3-fold. These results suggested principles for increasing protein yield in zebrafish through biomolecular engineering. We implemented these principles for rational gene design in software for codon selection (CodonZ) and plasmid vectors incorporating the most active non-coding elements. Rational gene design thus significantly boosts expression in zebrafish, and a similar approach will likely elevate expression in other animal models. PMID:25628360

  16. Monoclonal antibodies specific for elongation factor Tu and complete nucleotide sequence of the tuf gene in Mycobacterium tuberculosis.

    PubMed Central

    Carlin, N I; Löfdahl, S; Magnusson, M

    1992-01-01

    Monoclonal antibodies against mycobacterial antigens were produced by immunizing LOU/C rats with live Mycobacterium bovis BCG. The antibodies were characterized by an enzyme-linked immunosorbent assay and by sodium dodecyl sulfate-polyacrylamide gel electrophoresis followed by Western blotting (immunoblotting). One antibody, MAMB 2, reactive with a 47-kDa protein was used to screen a lambda gt11 M. tuberculosis gene library (R. A. Young, B. R. Bloom, C. M. Grosskinsky, J. Ivanji, D. Thomas, and R. W. Davis, Proc. Natl. Acad. Sci. USA 82:2583-2587, 1985). Three recombinant phages reactive with MAMB 2 in plaque lysates were isolated, and part of the insert was sequenced. The mycobacterial inserts were all expressed as proteins fused with beta-galactosidase when the phages were induced as lysogens in Escherichia coli. The entire M. tuberculosis tuf gene was obtained by screening the lambda gt11 library with a DNA probe specific for the primary clones. A phage isolated from this screening was able to express the native protein in E. coli when introduced as a lysogen. A comparison of the entire gene sequence and the deduced protein sequence with the EMBL DNA and Swiss-Prot protein data libraries revealed strong homologies with elongation factors of bacteria, yeast mitochondria, and a plant chloroplast. PMID:1639483

  17. Guanine nucleotide-binding proteins that enhance choleragen ADP-ribosyltransferase activity: nucleotide and deduced amino acid sequence of an ADP-ribosylation factor cDNA.

    PubMed Central

    Price, S R; Nightingale, M; Tsai, S C; Williamson, K C; Adamik, R; Chen, H C; Moss, J; Vaughan, M

    1988-01-01

    Three (two soluble and one membrane) guanine nucleotide-binding proteins (G proteins) that enhance ADP-ribosylation of the Gs alpha stimulatory subunit of the adenylyl cyclase (EC 4.6.1.1) complex by choleragen have recently been purified from bovine brain. To further define the structure and function of these ADP-ribosylation factors (ARFs), we isolated a cDNA clone (lambda ARF2B) from a bovine retinal library by screening with a mixed heptadecanucleotide probe whose sequence was based on the partial amino acid sequence of one of the soluble ARFs from bovine brain. Comparison of the deduced amino acid sequence of lambda ARF2B with sequences of peptides from the ARF protein (total of 60 amino acids) revealed only two differences. Whether these are cloning artifacts or reflect the existence of more than one ARF protein remains to be determined. Deduced amino acid sequences of ARF, Go alpha (the alpha subunit of a G protein that may be involved in regulation of ion fluxes), and c-Ha-ras gene product p21 show similarities in regions believed to be involved in guanine nucleotide binding and GTP hydrolysis. ARF apparently lacks a site analogous to that ADP-ribosylated by choleragen in G-protein alpha subunits. Although both the ARF proteins and the alpha subunits bind guanine nucleotides and serve as choleragen substrates, they must interact with the toxin A1 peptide in different ways. In addition to serving as an ADP-ribose acceptor, ARF interacts with the toxin in a manner that modifies its catalytic properties. PMID:3135549

  18. Filamentous hemagglutinin of Bordetella pertussis: nucleotide sequence and crucial role in adherence.

    PubMed Central

    Relman, D A; Domenighini, M; Tuomanen, E; Rappuoli, R; Falkow, S

    1989-01-01

    Filamentous hemagglutinin is a surface-associated adherence protein of Bordetella pertussis, which is a component of some new acellular pertussis vaccines. The nucleotide sequence of an open reading frame that encompasses the filamentous hemagglutinin structural gene, fhaB, suggests that proteolytic processing is necessary to generate the mature 220-kDa filamentous hemagglutinin product. An Arg-Gly-Asp (RGD) tripeptide is found within filamentous hemagglutinin that may be involved in its adherence properties. An internal in-frame deletion in fhaB, encompassing the RGD region, causes loss of B. pertussis-binding to ciliated eukaryotic cells, confirming a potential role for this protein in host-cell binding and infection. PMID:2539596

  19. Nucleotide sequence of a glucosyltransferase gene from Streptococcus sobrinus MFe28.

    PubMed Central

    Ferretti, J J; Gilpin, M L; Russell, R R

    1987-01-01

    The complete nucleotide sequence was determined for the Streptococcus sobrinus MFe28 gtfI gene, which encodes a glucosyltransferase that produces an insoluble glucan product. A single open reading frame encodes a mature glucosyltransferase protein of 1,559 amino acids (Mr, 172,983) and a signal peptide of 38 amino acids. In the C-terminal one-third of the protein there are six repeating units containing 35 amino acids of partial homology and two repeating units containing 48 amino acids of complete homology. The functional role of these repeating units remains to be determined, although truncated forms of glucosyltransferase containing only the first two repeating units of partial homology maintained glucosyltransferase activity and the ability to bind glucan. Regions of homology with alpha-amylase and glycogen phosphorylase were identified in the glucosyltransferase protein and may represent regions involved in functionally similar domains. PMID:3040686

  20. High-Throughput Sequencing Reveals Single Nucleotide Variants in Longer-Kernel Bread Wheat

    PubMed Central

    Chen, Feng; Zhu, Zibo; Zhou, Xiaobian; Yan, Yan; Dong, Zhongdong; Cui, Dangqun

    2016-01-01

    The transcriptomes of bread wheat Yunong 201 and its ethyl methanesulfonate derivative Yunong 3114 were obtained by next-generation sequencing technology. Single nucleotide variants (SNVs) in the wheat strains were explored and compared. A total of 5907 and 6287 non-synonymous SNVs were acquired for Yunong 201 and 3114, respectively. A total of 4021 genes with SNVs were obtained. The genes that underwent non-synonymous SNVs were significantly involved in ATP binding, protein phosphorylation, and cellular protein metabolic process. The heat map analysis also indicated that most of these mutant genes were significantly differentially expressed at different developmental stages. The SNVs in these genes possibly contribute to the longer kernel length of Yunong 3114. Our data provide useful information on wheat transcriptome for future studies on wheat functional genomics. This study could also help in illustrating the gene functions of the non-synonymous SNVs of Yunong 201 and 3114. PMID:27551288

  1. Complete nucleotide sequence of a virus associated with rusty mottle disease of sweet cherry (Prunus avium).

    PubMed

    Villamor, D V; Druffel, K L; Eastwell, K C

    2013-08-01

    Cherry rusty mottle is a disease of sweet cherries first described in 1940 in western North America. Because of the graft-transmissible nature of the disease, a viral nature of the disease was assumed. Here, the complete genomic nucleotide sequences of virus isolates from two trees expressing cherry rusty mottle disease symptoms are characterized; the virus is designated cherry rusty mottle associated virus (CRMaV). The biological and molecular characteristics of this virus in comparison to those of cherry necrotic rusty mottle virus (CNRMV) and cherry green ring mottle virus (CGRMV) are described. CRMaV was subsequently detected in additional sweet cherry trees expressing symptoms of cherry rusty mottle disease. PMID:23525699

  2. Nucleotide sequence of a gene encoding an organophosphorus nerve agent degrading enzyme from Alteromonas haloplanktis.

    PubMed

    Cheng, T; Liu, L; Wang, B; Wu, J; DeFrank, J J; Anderson, D M; Rastogi, V K; Hamilton, A B

    1997-01-01

    Organophosphorus acid anhydrolases (OPAA) catalyzing the hydrolysis of a variety of toxic organophosphorus cholinesterase inhibitors offer potential for decontamination of G-type nerve agents and pesticides. The gene (opa) encoding an OPAA was cloned from the chromosomal DNA of Alteromonas haloplanktis ATCC 23821. The nucleotide sequence of the 1.7-kb DNA fragment contained the opa gene (1.3 kb) and its flanking region. We report structural and functional similarity of OPAAs from A. haloplanktis and Alteromonas sp. JD6.5 with the enzyme prolidase that hydrolyzes dipeptides with a prolyl residue in the carboxyl-terminal position. These results corroborate the earlier conclusion that the OPAA is a type of X-Pro dipeptidase, and that X-Pro could be the native substrate for such an enzyme in Alteromonas cells. PMID:9079288

  3. Nucleotide sequence analysis of the DNA binding region of the chicken fibronectin gene.

    PubMed

    Karasaki, Y; Gotoh, S; Kubomura, S; Higashi, K; Hirano, H

    1988-12-01

    We have determined the nucleotide sequence of a 2.0-kb EcoRI segment from the clone lambda FC32 of the genomic chicken fibronectin gene, a region called the DNA binding domain. This segment overlapped another clone, lambda FC36, and contained three exons: 16, 17 and 18. They were classified as Type III repeats, as originally shown in bovine plasma fibronectin. The average homologies of these three exons among the chicken, rat and human fibronectins at the amino acid level are very high (87-98%) compared with those (79-88%) of the exons in the cell binding domain, indicating that this region has been highly conserved during evolution. PMID:3212295

  4. Developing single nucleotide polymorphism (SNP) markers from transcriptome sequences for identification of longan (Dimocarpus longan) germplasm

    PubMed Central

    Wang, Boyi; Tan, Hua-Wei; Fang, Wanping; Meinhardt, Lyndel W; Mischke, Sue; Matsumoto, Tracie; Zhang, Dapeng

    2015-01-01

    Longan (Dimocarpus longan Lour.) is an important tropical fruit tree crop. Accurate varietal identification is essential for germplasm management and breeding. Using longan transcriptome sequences from public databases, we developed single nucleotide polymorphism (SNP) markers; validated 60 SNPs in 50 longan germplasm accessions, including cultivated varieties and wild germplasm; and designated 25 SNP markers that unambiguously identified all tested longan varieties with high statistical rigor (P<0.0001). Multiple trees from the same clone were verified and off-type trees were identified. Diversity analysis revealed genetic relationships among analyzed accessions. Cultivated varieties differed significantly from wild populations (Fst=0.300; P<0.001), demonstrating untapped genetic diversity for germplasm conservation and utilization. Within cultivated varieties, apparent differences between varieties from China and those from Thailand and Hawaii indicated geographic patterns of genetic differentiation. These SNP markers provide a powerful tool to manage longan genetic resources and breeding, with accurate and efficient genotype identification. PMID:26504559

  5. High-Throughput Sequencing Reveals Single Nucleotide Variants in Longer-Kernel Bread Wheat.

    PubMed

    Chen, Feng; Zhu, Zibo; Zhou, Xiaobian; Yan, Yan; Dong, Zhongdong; Cui, Dangqun

    2016-01-01

    The transcriptomes of bread wheat Yunong 201 and its ethyl methanesulfonate derivative Yunong 3114 were obtained by next-generation sequencing technology. Single nucleotide variants (SNVs) in the wheat strains were explored and compared. A total of 5907 and 6287 non-synonymous SNVs were acquired for Yunong 201 and 3114, respectively. A total of 4021 genes with SNVs were obtained. The genes that underwent non-synonymous SNVs were significantly involved in ATP binding, protein phosphorylation, and cellular protein metabolic process. The heat map analysis also indicated that most of these mutant genes were significantly differentially expressed at different developmental stages. The SNVs in these genes possibly contribute to the longer kernel length of Yunong 3114. Our data provide useful information on wheat transcriptome for future studies on wheat functional genomics. This study could also help in illustrating the gene functions of the non-synonymous SNVs of Yunong 201 and 3114. PMID:27551288

  6. High-throughput nucleotide sequence analysis of diverse bacterial communities in leachates of decomposing pig carcasses

    PubMed Central

    Yang, Seung Hak; Lim, Joung Soo; Khan, Modabber Ahmed; Kim, Bong Soo; Choi, Dong Yoon; Lee, Eun Young; Ahn, Hee Kwon

    2015-01-01

    The leachate generated by the decomposition of animal carcass has been implicated as an environmental contaminant surrounding the burial site. High-throughput nucleotide sequencing was conducted to investigate the bacterial communities in leachates from the decomposition of pig carcasses. We acquired 51,230 reads from six different samples (1, 2, 3, 4, 6 and 14 week-old carcasses) and found that sequences representing the phylum Firmicutes predominated. The diversity of bacterial 16S rRNA gene sequences in the leachate was the highest at 6 weeks, in contrast to those at 2 and 14 weeks. The relative abundance of Firmicutes was reduced, while the proportion of Bacteroidetes and Proteobacteria increased from 3–6 weeks. The representation of phyla was restored after 14 weeks. However, the community structures between the samples taken at 1–2 and 14 weeks differed at the bacterial classification level. The trend in pH was similar to the changes seen in bacterial communities, indicating that the pH of the leachate could be related to the shift in the microbial community. The results indicate that the composition of bacterial communities in leachates of decomposing pig carcasses shifted continuously during the study period and might be influenced by the burial site. PMID:26500442

  7. Nucleotide sequences provide evidence of genetic exchange among distantly related lineages of Trypanosoma cruzi

    PubMed Central

    Machado, Carlos A.; Ayala, Francisco J.

    2001-01-01

    Simple phylogenetic tests were applied to a large data set of nucleotide sequences from two nuclear genes and a region of the mitochondrial genome of Trypanosoma cruzi, the agent of Chagas' disease. Incongruent gene genealogies manifest genetic exchange among distantly related lineages of T. cruzi. Two widely distributed isoenzyme types of T. cruzi are hybrids, their genetic composition being the likely result of genetic exchange between two distantly related lineages. The data show that the reference strain for the T. cruzi genome project (CL Brener) is a hybrid. Well-supported gene genealogies show that mitochondrial and nuclear gene sequences from T. cruzi cluster, respectively, in three or four distinct clades that do not fully correspond to the two previously defined major lineages of T. cruzi. There is clear genetic differentiation among the major groups of sequences, but genetic diversity within each major group is low. We estimate that the major extant lineages of T. cruzi have diverged during the Miocene or early Pliocene (3–16 million years ago). PMID:11416213

  8. Mining for single nucleotide polymorphisms and insertions / deletions in expressed sequence tag libraries of oil palm.

    PubMed

    Riju, Aykkal; Chandrasekar, Arumugam; Arunachalam, Vadivel

    2007-01-01

    The oil palm is a tropical oil-bearing tree. EST-derived SNPs and SSRs are a free by-product of the currently expanding EST (expressed sequence tag) databases. The development of high-throughput methods for the detection of SNPs (single nucleotide polymorphisms) and small indels (insertions/deletions) has led to a revolution in their use as molecular markers. The available oil palm EST sequences (5452) were mined from dbEST of NCBI. The CAP3 program was used to assemble the EST sequences into contigs. Candidate SNPs and indel polymorphisms were detected using the Perl script auto_snip version 1.0, which used 576 ESTs for detecting SNP and indel sites. We found 1180 SNP sites and 137 indel polymorphisms, with a frequency of 1.36 SNPs / 100 bp. Among the six tissues from which the EST libraries had been generated, mesocarp had a high frequency of 2.91 SNPs and indels per 100 bp, whereas zygotic embryos had the lowest frequency, 0.15 per 100 bp. We also used the Shannon index to analyze the proportions of the ten possible types of SNPs/indels. ESTs from normal apex tissue showed the highest Shannon index (0.60), whereas abnormal apex had the lowest (0.02). The present report deals with the use of the Shannon index for comparing SNP/indel frequencies mined from EST libraries and also confirms that SNPs occur in oil palm at a frequency sufficient for their use as markers in genetic studies. PMID:21670789
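The Shannon index mentioned in this abstract can be sketched as follows. This is a minimal illustration assuming the standard definition H = -sum(p_i * log p_i) over the proportions of each SNP/indel type; the abstract does not state the logarithm base or any normalisation, so the default natural logarithm, the function name and the toy counts are assumptions rather than the authors' procedure.

```python
import math


def shannon_index(counts, base=math.e):
    """Shannon index of a list of category counts (e.g. SNP/indel types).

    Categories with zero count contribute nothing. The log base is an
    assumption; the abstract does not say which one was used.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for count in counts:
        if count > 0:
            p = count / total
            h -= p * math.log(p, base)
    return h


# Hypothetical counts for illustration only (not data from the study).
even_types = [12, 12, 12, 12]    # variation spread evenly over four types
single_type = [48, 0, 0, 0]      # all variation in one type

h_even = shannon_index(even_types, base=2)
h_single = shannon_index(single_type)
```

A library whose polymorphisms are spread across many types scores high, while one dominated by a single type scores near zero, which is the kind of contrast the abstract reports between normal and abnormal apex tissue.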

  9. Complete nucleotide sequence of the mitochondrial genome of a salamander, Mertensiella luschani.

    PubMed

    Zardoya, Rafael; Malaga-Trillo, Edward; Veith, Michael; Meyer, Axel

    2003-10-23

    The complete nucleotide sequence (16,650 bp) of the mitochondrial genome of the salamander Mertensiella luschani (Caudata, Amphibia) was determined. This molecule conforms to the consensus vertebrate mitochondrial gene order. However, it is characterized by a long non-coding intervening sequence with two 124-bp repeats between the tRNA(Thr) and tRNA(Pro) genes. The new sequence data were used to reconstruct a phylogeny of jawed vertebrates. Phylogenetic analyses of all mitochondrial protein-coding genes at the amino acid level recovered a robust vertebrate tree in which lungfishes are the closest living relatives of tetrapods, salamanders and frogs are grouped together to the exclusion of caecilians (the Batrachia hypothesis) in a monophyletic amphibian clade, turtles show diapsid affinities and are placed as sister group of crocodiles+birds, and the marsupials are grouped together with monotremes and basal to placental mammals. The deduced phylogeny was used to characterize the molecular evolution of vertebrate mitochondrial proteins. Amino acid frequencies were analyzed across the main lineages of jawed vertebrates, and leucine and cysteine were found to be the most and least abundant amino acids in mitochondrial proteins, respectively. Patterns of amino acid replacements were conserved among vertebrates. Overall, cartilaginous fishes showed the least variation in amino acid frequencies and replacements. Constancy of rates of evolution among the main lineages of jawed vertebrates was rejected. PMID:14604788

  10. Whole genome sequencing of a single Bos taurus animal for single nucleotide polymorphism discovery

    PubMed Central

    Eck, Sebastian H; Benet-Pagès, Anna; Flisikowski, Krzysztof; Meitinger, Thomas; Fries, Ruedi; Strom, Tim M

    2009-01-01

    Background: The majority of the 2 million bovine single nucleotide polymorphisms (SNPs) currently available in dbSNP have been identified in a single breed, Hereford cattle, during the bovine genome project. In an attempt to evaluate the variance of a second breed, we have produced a whole genome sequence at low coverage of a single Fleckvieh bull. Results: We generated 24 gigabases of sequence, mainly using 36-bp paired-end reads, resulting in an average 7.4-fold sequence depth. This coverage was sufficient to identify 2.44 million SNPs, 82% of which were previously unknown, and 115,000 small indels. A comparison with the genotypes of the same animal, generated on a 50 k oligonucleotide chip, revealed a detection rate of 74% and 30% for homozygous and heterozygous SNPs, respectively. The false positive rate, as determined by comparison with genotypes determined for 196 randomly selected SNPs, was approximately 1.1%. We further determined the allele frequencies of the 196 SNPs in 48 Fleckvieh and 48 Braunvieh bulls. 95% of the SNPs were polymorphic with an average minor allele frequency of 24.5% and with 83% of the SNPs having a minor allele frequency larger than 5%. Conclusions: This work provides the first single cattle genome by next-generation sequencing. The chosen approach - low to medium coverage re-sequencing - added more than 2 million novel SNPs to the currently publicly available SNP resource, providing a valuable resource for the construction of high density oligonucleotide arrays in the context of genome-wide association studies. PMID:19660108

  11. Detection and quantitation of single nucleotide polymorphisms, DNA sequence variations, DNA mutations, DNA damage and DNA mismatches

    DOEpatents

    McCutchen-Maloney, Sandra L.

    2002-01-01

    DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dipsticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such as TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding, which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera, which gives a decrease in signal.

  12. Real-time single-molecule electronic DNA sequencing by synthesis using polymer-tagged nucleotides on a nanopore array

    PubMed Central

    Fuller, Carl W.; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Bibillo, Arek; Stranges, P. Benjamin; Dorwart, Michael; Tao, Chuanjuan; Li, Zengmin; Guo, Wenjing; Shi, Shundi; Korenblum, Daniel; Trans, Andrew; Aguirre, Anne; Liu, Edward; Harada, Eric T.; Pollard, James; Bhat, Ashwini; Cech, Cynthia; Yang, Alexander; Arnold, Cleoma; Palla, Mirkó; Hovis, Jennifer; Chen, Roger; Morozova, Irina; Kalachikov, Sergey; Russo, James J.; Kasianowicz, John J.; Davis, Randy; Roever, Stefan; Church, George M.; Ju, Jingyue

    2016-01-01

    DNA sequencing by synthesis (SBS) offers a robust platform to decipher nucleic acid sequences. Recently, we reported a single-molecule nanopore-based SBS strategy that accurately distinguishes four bases by electronically detecting and differentiating four different polymer tags attached to the 5′-phosphate of the nucleotides during their incorporation into a growing DNA strand catalyzed by DNA polymerase. Further developing this approach, we report here the use of nucleotides tagged at the terminal phosphate with oligonucleotide-based polymers to perform nanopore SBS on an α-hemolysin nanopore array platform. We designed and synthesized several polymer-tagged nucleotides using tags that produce different electrical current blockade levels and verified they are active substrates for DNA polymerase. A highly processive DNA polymerase was conjugated to the nanopore, and the conjugates were complexed with primer/template DNA and inserted into lipid bilayers over individually addressable electrodes of the nanopore chip. When an incoming complementary-tagged nucleotide forms a tight ternary complex with the primer/template and polymerase, the tag enters the pore, and the current blockade level is measured. The levels displayed by the four nucleotides tagged with four different polymers captured in the nanopore in such ternary complexes were clearly distinguishable and sequence-specific, enabling continuous sequence determination during the polymerase reaction. Thus, real-time single-molecule electronic DNA sequencing data with single-base resolution were obtained. The use of these polymer-tagged nucleotides, combined with polymerase tethering to nanopores and multiplexed nanopore sensors, should lead to new high-throughput sequencing methods. PMID:27091962

  13. Real-time single-molecule electronic DNA sequencing by synthesis using polymer-tagged nucleotides on a nanopore array.

    PubMed

    Fuller, Carl W; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Bibillo, Arek; Stranges, P Benjamin; Dorwart, Michael; Tao, Chuanjuan; Li, Zengmin; Guo, Wenjing; Shi, Shundi; Korenblum, Daniel; Trans, Andrew; Aguirre, Anne; Liu, Edward; Harada, Eric T; Pollard, James; Bhat, Ashwini; Cech, Cynthia; Yang, Alexander; Arnold, Cleoma; Palla, Mirkó; Hovis, Jennifer; Chen, Roger; Morozova, Irina; Kalachikov, Sergey; Russo, James J; Kasianowicz, John J; Davis, Randy; Roever, Stefan; Church, George M; Ju, Jingyue

    2016-05-10

    DNA sequencing by synthesis (SBS) offers a robust platform to decipher nucleic acid sequences. Recently, we reported a single-molecule nanopore-based SBS strategy that accurately distinguishes four bases by electronically detecting and differentiating four different polymer tags attached to the 5'-phosphate of the nucleotides during their incorporation into a growing DNA strand catalyzed by DNA polymerase. Further developing this approach, we report here the use of nucleotides tagged at the terminal phosphate with oligonucleotide-based polymers to perform nanopore SBS on an α-hemolysin nanopore array platform. We designed and synthesized several polymer-tagged nucleotides using tags that produce different electrical current blockade levels and verified they are active substrates for DNA polymerase. A highly processive DNA polymerase was conjugated to the nanopore, and the conjugates were complexed with primer/template DNA and inserted into lipid bilayers over individually addressable electrodes of the nanopore chip. When an incoming complementary-tagged nucleotide forms a tight ternary complex with the primer/template and polymerase, the tag enters the pore, and the current blockade level is measured. The levels displayed by the four nucleotides tagged with four different polymers captured in the nanopore in such ternary complexes were clearly distinguishable and sequence-specific, enabling continuous sequence determination during the polymerase reaction. Thus, real-time single-molecule electronic DNA sequencing data with single-base resolution were obtained. The use of these polymer-tagged nucleotides, combined with polymerase tethering to nanopores and multiplexed nanopore sensors, should lead to new high-throughput sequencing methods. PMID:27091962

  14. Cloning and nucleotide sequence of the gene coding for citrate synthase from a thermotolerant Bacillus sp

    SciTech Connect

    Schendel, F.J.; August, P.R.; Anderson, C.R.; Flickinger, M.C.; Hanson, R.S.

    1992-01-01

    Acetate salts are emerging as potentially attractive bulk chemicals for a variety of environmental applications, for example, as catalysts to facilitate combustion of high-sulfur coal by electrical utilities and as the biodegradable noncorrosive highway deicing salt calcium magnesium acetate. The structural gene coding for citrate synthase from the gram-positive soil isolate Bacillus sp. strain C4 (ATCC 55182) capable of secreting acetic acid at pH 5.0 to 7.0 in the presence of dolime has been cloned from a genomic library by complementation of an Escherichia coli auxotrophic mutant lacking citrate synthase. The nucleotide sequence of the entire 3.1-kb HindIII fragment has been determined, and one major open reading frame was found coding for citrate synthase (ctsA). Citrate synthase from Bacillus sp. strain C4 was found to be a dimer (Mr, 84,500) with a subunit with an Mr of 42,000. The N-terminal sequence was found to be identical with that predicted from the gene sequence. The kinetics were best fit to a bisubstrate enzyme with an ordered mechanism. Bacillus sp. strain C4 citrate synthase was not activated by potassium chloride and was not inhibited by NADH, ATP, ADP, or AMP at levels up to 1 mM. The predicted amino acid sequence was compared with that of the E. coli, Acinetobacter anitratum, Pseudomonas aeruginosa, Rickettsia prowazekii, porcine heart, and Saccharomyces cerevisiae cytoplasmic and mitochondrial enzymes.

  15. Human secreted carbonic anhydrase: cDNA cloning, nucleotide sequence, and hybridization histochemistry

    SciTech Connect

    Aldred, P.; Fu, Ping; Barrett, G.; Penschow, J.D.; Wright, R.D.; Coghlan, J.P.; Fernley, R.T.

    1991-01-01

    Complementary DNA clones coding for the human secreted carbonic anhydrase isozyme (CA VI) have been isolated and their nucleotide sequences determined. These clones identify a 1.45-kb mRNA that is present in high levels in parotid and submandibular salivary glands but absent in other tissues such as the sublingual gland, kidney, liver, and prostate gland. Hybridization histochemistry of human salivary glands shows mRNA for CA VI located in the acinar cells of these glands. The cDNA clones encode a protein of 308 amino acids that includes a 17 amino acid leader sequence typical of secreted proteins. The mature protein has 291 amino acids compared to 259 or 260 for the cytoplasmic isozymes, with most of the extra amino acids present as a carboxyl terminal extension. In comparison, sheep CA VI has a 45 amino acid extension. Overall the human CA VI protein has a sequence identity of 35% with human CA II, while residues involved in the active site of the enzymes have been conserved. The human and sheep secreted carbonic anhydrases have a sequence identity of 72%. This includes the two cysteine residues that are known to be involved in an intramolecular disulfide bond in the sheep CA VI. The enzyme is known to be glycosylated and three potential N-glycosylation sites (Asn-X-Thr/Ser) have been identified. Two of these are known to be glycosylated in sheep CA VI. Southern analysis of human DNA indicates that there is only one gene coding for CA VI.

  16. Complete nucleotide sequence and experimental host range of Okra mosaic virus.

    PubMed

    Stephan, Dirk; Siddiqua, Mahbuba; Ta Hoang, Anh; Engelmann, Jill; Winter, Stephan; Maiss, Edgar

    2008-02-01

    Okra mosaic virus (OkMV) is a tymovirus infecting members of the family Malvaceae. Early infections in okra (Abelmoschus esculentus) lead to yield losses of 12-19.5%. Besides intensive biological characterizations of OkMV only minor molecular data were available. Therefore, we determined the complete nucleotide sequence of a Nigerian isolate of OkMV. The complete genomic RNA (gRNA) comprises 6,223 nt and its genome organization showed three major ORFs coding for a putative movement protein (MP) of Mr 73.1 kDa, a large replication-associated protein (RP) of Mr 202.4 kDa and a coat protein (CP) of Mr 19.6 kDa. Prediction of secondary RNA structures showed three hairpin structures with internal loops in the 5'-untranslated region (UTR) and a 3'-terminal tRNA-like structure (TLS) which comprises the anticodon for valine, typical for a member of the genus Tymovirus. Phylogenetic comparisons based on the RP, MP and CP amino acid sequences showed the close relationship of OkMV not only to other completely sequenced tymoviruses like Kennedya yellow mosaic virus (KYMV), Turnip yellow mosaic virus (TYMV) and Erysimum latent virus (ErLV), but also to Calopogonium yellow vein virus (CalYVV), Clitoria yellow vein virus (CYVV) and Desmodium yellow mottle virus (DYMoV). This is the first report of a complete OkMV genome sequence from one of the various OkMV isolates originating from West Africa described so far. Additionally, the experimental host range of OkMV including several Nicotiana species was determined. PMID:18049886

  17. The qa repressor gene of Neurospora crassa: wild-type and mutant nucleotide sequences.

    PubMed Central

    Huiet, L; Giles, N H

    1986-01-01

    The qa-1S gene, one of two regulatory genes in the qa gene cluster of Neurospora crassa, encodes the qa repressor. The qa-1S gene together with the qa-1F gene, which encodes the qa activator protein, control the expression of all seven qa genes, including those encoding the inducible enzymes responsible for the utilization of quinic acid as a carbon source. The nucleotide sequence of the qa-1S gene and its flanking regions has been determined. The deduced coding sequence for the qa-1S protein encodes 918 amino acids with a calculated molecular weight of 100,650 and is interrupted by a single 66-base-pair intervening sequence. Both constitutive and noninducible mutants occur in the qa-1S gene and two different mutations of each type have been cloned and sequenced. All four mutations occur within the predicted coding region of the qa-1S gene. This result strongly supports the hypothesis that the qa-1S gene encodes a repressor. All four mutations are located within codons for the last 300 amino acids of the qa-1S protein. The mutations in three of the mutants involve amino acid substitutions, while the fourth mutant, which has a constitutive phenotype, contains a frameshift mutation. The two constitutive mutations occur in the most distal region of the gene, possibly implicating the COOH-terminal region of the qa repressor in binding to its target. The two noninducible mutations occur in a region proximal to the constitutive mutations, possibly implicating this region of the qa repressor in binding the inducer. PMID:3010294

  18. Nucleotide sequence and expression of the capsid protein gene of feline calicivirus.

    PubMed Central

    Neill, J D; Reardon, I M; Heinrikson, R L

    1991-01-01

    The sequence of the 3'-terminal 2,486 bases of the feline calicivirus (FCV) genome was determined. This region of the FCV genome, from which the 2.4-kb subgenomic RNA is derived, contained two open reading frames. The larger open reading frame, found in the 5' end of the subgenomic mRNA, contained 2,004 bases encoding a polypeptide of 73,467 Da. The smaller open reading frame, encoded in the 3' end of the mRNA, was composed of 318 bases, encoding a polypeptide of 12,185 Da. The AUG initiation codon of the second open reading frame overlapped the UGA termination codon of the first, with the sequence AUGA. The nucleotide sequence of the region containing this overlap resembles the -1 frameshift sequences of the retroviruses. The 5' end of the 2.4-kb subgenomic RNA was mapped by primer extension analysis. There were two apparent transcription initiation points, both of which were 5' to the AUG initiation codon of the large open reading frame. Transcription from these sites yielded RNA transcripts with 5' nontranslated leader regions of 17 and 18 bases. The total length of the 2.4-kb subgenomic RNA was 2,375 bases (from the 5'-most start site) excluding the poly(A) tail. Edman degradation of the purified capsid protein of FCV showed that the capsid protein was encoded by the large open reading frame. Western immunoblot analysis of FCV-infected cells using a feline anti-FCV antiserum demonstrated that translation of the capsid protein was detectable at 3 h postinfection and continued to accumulate until 8 h postinfection, the last time examined. PMID:1716692
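    The AUGA overlap described in this abstract can be illustrated with a minimal sketch. In the four-base motif, the AUG start codon occupies the first three bases and the UGA stop codon the last three, so the second reading frame begins one base upstream of the frame that is terminating. The function name and the toy sequence are hypothetical and are not taken from the FCV genome.

```python
def find_start_stop_overlaps(rna):
    """Locate AUGA motifs, where an AUG start overlaps a UGA stop.

    Positions are 0-based. The start codon begins at index i and the
    stop codon at index i + 1, so the two codons lie in reading frames
    that differ by one base.
    """
    hits = []
    i = rna.find("AUGA")
    while i != -1:
        hits.append({
            "start_at": i,              # first base of AUG
            "stop_at": i + 1,           # first base of UGA
            "start_frame": i % 3,       # frame of the downstream ORF
            "stop_frame": (i + 1) % 3,  # frame of the upstream ORF
        })
        i = rna.find("AUGA", i + 1)
    return hits


if __name__ == "__main__":
    # Toy sequence for illustration only.
    for hit in find_start_stop_overlaps("GGAUGACCAUGA"):
        print(hit)
```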

  19. Whole-genome sequence of Sunxiuqinia dokdonensis DH1(T), isolated from deep sub-seafloor sediment in Dokdo Island.

    PubMed

    Lim, Sooyeon; Chang, Dong-Ho; Kim, Byoung-Chan

    2016-09-01

    Sunxiuqinia dokdonensis DH1(T) was isolated from deep sub-seafloor sediment at a depth of 900 m below the seafloor off Seo-do (the west part of Dokdo Island) in the East Sea of the Republic of Korea, subjected to whole-genome sequencing on the HiSeq platform, and annotated with RAST. The nucleotide sequence of this genome was deposited into DDBJ/EMBL/GenBank under the accession LGIA00000000. PMID:27437183

  20. Nucleotide sequence of the 3'-noncoding region of alfalfa mosaic virus RNA 4 and its homology with the genomic RNAs.

    PubMed Central

    Koper-Zwarthoff, E C; Brederode, F T; Walstra, P; Bol, J F

    1979-01-01

    A 226-nucleotide fragment was derived from alfalfa mosaic virus RNA 4 (AlMV RNA 4), the subgenomic messenger for viral coat protein, and its sequence was deduced by in vitro labeling with polynucleotide kinase and application of RNA sequencing techniques. The fragment contains the 3'-terminal 45 nucleotides of the coat protein cistron and the complete 3'-noncoding region of 182 nucleotides. The total length of RNA 4 was calculated to be 881 nucleotides. AlMV RNAs 1, 2 and 3 were elongated with a 3'-terminal poly(A) stretch and subjected to sequence analysis by using a specific primer, reverse transcriptase and chain terminators. This revealed an extensive homology between the 3'-terminal 140 to 150 nucleotides of all four AlMV RNAs. Despite a number of base substitutions, the secondary structure of the homologous region is highly conserved. The observed homology indicates that, as with RNA 4, the sites with a high affinity for the viral coat protein are located at the 3'-termini of the genomic RNAs. PMID:537914

  1. Nucleotide sequence of the Kaposi sarcoma-associated herpesvirus (HHV8)

    PubMed Central

    Russo, James J.; Bohenzky, Roy A.; Chien, Ming-Cheng; Chen, Jing; Yan, Ming; Maddalena, Dawn; Parry, J. Preston; Peruzzi, Daniela; Edelman, Isidore S.; Chang, Yuan; Moore, Patrick S.

    1996-01-01

    The genome of the Kaposi sarcoma-associated herpesvirus (KSHV or HHV8) was mapped with cosmid and phage genomic libraries from the BC-1 cell line. Its nucleotide sequence was determined except for a 3-kb region at the right end of the genome that was refractory to cloning. The BC-1 KSHV genome consists of a 140.5-kb-long unique coding region flanked by multiple G+C-rich 801-bp terminal repeat sequences. A genomic duplication that apparently arose in the parental tumor is present in this cell culture-derived strain. At least 81 ORFs, including 66 with homology to herpesvirus saimiri ORFs, and 5 internal repeat regions are present in the long unique region. The virus encodes homologs to complement-binding proteins, three cytokines (two macrophage inflammatory proteins and interleukin 6), dihydrofolate reductase, bcl-2, interferon regulatory factors, interleukin 8 receptor, neural cell adhesion molecule-like adhesin, and a D-type cyclin, as well as viral structural and metabolic proteins. Terminal repeat analysis of virus DNA from a KS lesion suggests a monoclonal expansion of KSHV in the KS tumor. PMID:8962146

  2. Complete nucleotide sequence of rose yellow leaf virus, a new member of the family Tombusviridae.

    PubMed

    Mollov, Dimitre; Lockhart, Ben; Zlesak, David C

    2014-10-01

    The genome of the rose yellow leaf virus (RYLV) has been determined to be 3918 nucleotides long and to contain seven open reading frames (ORFs). ORF1 encodes a 27-kDa peptide (p27). ORF2 shares a common start codon with ORF1 and continues through the amber stop codon of p27 to encode an 87-kDa (p87) protein that has amino acid similarity to the RNA-dependent RNA polymerase (RdRp) of members of the family Tombusviridae. ORFs 3 and 4 have no significant amino acid similarity to known functional viral ORFs. ORF5 encodes a 6-kDa (p6) protein that has similarity to movement proteins of members of the Tombusviridae. ORF5A has no conventional start codon and overlaps with p6. A putative +1 frameshift mechanism allows p6 translation to continue through the stop codon and results in a 12-kDa protein that has high homology to the carmovirus p13 movement protein. The 37-kDa protein encoded by ORF6 has amino acid sequence similarity to coat proteins (CP) of members of the Tombusviridae. ORF7 has no significant amino acid similarity to known viral ORFs. Phylogenetic analysis of the RdRp amino acid sequences grouped RYLV together with the unclassified Rosa rugosa leaf distortion virus (RrLDV), pelargonium line pattern virus (PLPV), and pelargonium chlorotic ring pattern virus (PCRPV) in a distinct subgroup of the family Tombusviridae. PMID:24838852

  3. Use of nucleotide sequence data to identify a microsporidian pathogen of Pieris rapae (Lepidoptera, Pieridae).

    PubMed

    Malone, L A; McIvor, C A

    1996-11-01

    Nucleotide sequence was determined for a portion of genomic DNA which spans the V4 variable region of the small subunit ribosomal RNA gene of an unidentified microsporidium from the cabbage white butterfly, Pieris rapae (174 base pairs). Comparison with equivalent sequence data obtained here for two other microsporidian species, Nosema bombycis (240 base pairs) and Nosema bombi (200 base pairs), and from the GenBank database for 11 other microsporidian species suggests that the unidentified species from P. rapae is most closely related to some Vairimorpha species. Light and electron microscopic observations of the developmental stages of this parasite were in accord with this. Infection experiments conducted at 20 and 26 degrees C demonstrated temperature-dependent dimorphism, with the production of both binucleate free spores (mean dimensions: 3.8 x 1.8 microns; 10-13 polar filament coils) and membrane-bound uninucleate octospores (mean dimensions: 3.1 x 1.9 microns). Macrospores (mean dimensions 8.0 x 2.1 microns) were also observed. Sites of infection were the gut epithelium, the Malpighian tubules, the salivary glands, and the fat body. Infections were found in all insect life stages, including the egg. This microsporidium was found to be indistinguishable from both Nosema mesnili (Paillot) and Microsporidium (Thelohania) mesnili (Paillot) and we propose that these species be combined and transferred to the genus Vairimorpha Pilley. PMID:8931362

  4. Predicting Mendelian Disease-Causing Non-Synonymous Single Nucleotide Variants in Exome Sequencing Studies

    PubMed Central

    Bao, Su-Ying; Yang, Wanling; Ho, Shu-Leong; Song, Yong-Qiang; Sham, Pak C.

    2013-01-01

    Exome sequencing is becoming a standard tool for mapping Mendelian disease-causing (or pathogenic) non-synonymous single nucleotide variants (nsSNVs). Minor allele frequency (MAF) filtering approach and functional prediction methods are commonly used to identify candidate pathogenic mutations in these studies. Combining multiple functional prediction methods may increase accuracy in prediction. Here, we propose to use a logit model to combine multiple prediction methods and compute an unbiased probability of a rare variant being pathogenic. Also, for the first time we assess the predictive power of seven prediction methods (including SIFT, PolyPhen2, CONDEL, and logit) in predicting pathogenic nsSNVs from other rare variants, which reflects the situation after MAF filtering is done in exome-sequencing studies. We found that a logit model combining all or some original prediction methods outperforms other methods examined, but is unable to discriminate between autosomal dominant and autosomal recessive disease mutations. Finally, based on the predictions of the logit model, we estimate that an individual has around 5% of rare nsSNVs that are pathogenic and carries ∼22 pathogenic derived alleles at least, which if made homozygous by consanguineous marriages may lead to recessive diseases. PMID:23341771

  5. Predicting mendelian disease-causing non-synonymous single nucleotide variants in exome sequencing studies.

    PubMed

    Li, Miao-Xin; Kwan, Johnny S H; Bao, Su-Ying; Yang, Wanling; Ho, Shu-Leong; Song, Yong-Qiang; Sham, Pak C

    2013-01-01

    Exome sequencing is becoming a standard tool for mapping Mendelian disease-causing (or pathogenic) non-synonymous single nucleotide variants (nsSNVs). Minor allele frequency (MAF) filtering approach and functional prediction methods are commonly used to identify candidate pathogenic mutations in these studies. Combining multiple functional prediction methods may increase accuracy in prediction. Here, we propose to use a logit model to combine multiple prediction methods and compute an unbiased probability of a rare variant being pathogenic. Also, for the first time we assess the predictive power of seven prediction methods (including SIFT, PolyPhen2, CONDEL, and logit) in predicting pathogenic nsSNVs from other rare variants, which reflects the situation after MAF filtering is done in exome-sequencing studies. We found that a logit model combining all or some original prediction methods outperforms other methods examined, but is unable to discriminate between autosomal dominant and autosomal recessive disease mutations. Finally, based on the predictions of the logit model, we estimate that an individual has around 5% of rare nsSNVs that are pathogenic and carries ~22 pathogenic derived alleles at least, which if made homozygous by consanguineous marriages may lead to recessive diseases. PMID:23341771
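    The logit combination mentioned in this abstract can be sketched as follows. This is a minimal illustration of how a logistic model turns several per-variant prediction scores into one probability. The coefficients, the intercept, and the score values are hypothetical placeholders, not the fitted values from the study, which would have to be estimated from labelled pathogenic and neutral variants.

```python
import math


def combined_probability(scores, weights, intercept):
    """Combine prediction scores with a logistic (logit) model.

    scores    -- one numeric score per prediction method for a variant
    weights   -- one coefficient per method (hypothetical values here)
    intercept -- model intercept (hypothetical value here)
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    z = intercept + sum(w * s for w, s in zip(weights, scores))
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # Two made-up scores for a single variant, combined with made-up weights.
    print(combined_probability([0.9, 0.7], weights=[2.0, 1.5], intercept=-2.0))
```

    A higher weighted sum of scores maps to a probability closer to 1, which is what lets a single threshold be applied across methods that report scores on different scales.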

  6. Mutations in core nucleotide sequence of hepatitis B virus correlate with fulminant and severe hepatitis.

    PubMed Central

    Ehata, T; Omata, M; Chuang, W L; Yokosuka, O; Ito, Y; Hosoda, K; Ohto, M

    1993-01-01

    Infection with hepatitis B virus leads to a wide spectrum of liver injury, including self-limited acute hepatitis, fulminant hepatitis, and chronic hepatitis with progression to cirrhosis or acute exacerbation to liver failure, as well as an asymptomatic chronic carrier state. Several studies have suggested that the hepatitis B core antigen could be an immunological target of cytotoxic T lymphocytes. To investigate the reason why the extreme immunological attack occurred in fulminant hepatitis and severe exacerbation patients, the entire precore and core region of hepatitis B virus DNA was sequenced in 24 subjects (5 fulminant, 10 severe fatal exacerbation, and 9 self-limited acute hepatitis patients). No significant change in the nucleotide sequence and deduced amino acid residue was noted in the nine self-limited acute hepatitis patients. In contrast, clustering changes in a small segment of 16 amino acids (codon 84-99 from the start of the core gene) were found in all seven adr subtype infected fulminant and severe exacerbation patients. A different segment with clustering substitutions (codon 48-60) was also found in seven of eight adw subtype infected fulminant and severe exacerbation patients. Of the 15 patients, 2 lacked the precore stop mutation which was previously reported to be associated with fulminant hepatitis. These data suggest that these core regions with mutations may play an important role in the pathogenesis of hepatitis B viral disease, and such mutations are related to severe liver damage. PMID:8450049

  7. BIND - an algorithm for loss-less compression of nucleotide sequence data.

    PubMed

    Bose, Tungadri; Mohammed, Monzoorul Haque; Dutta, Anirban; Mande, Sharmila S

    2012-09-01

    Recent advances in DNA sequencing technologies have enabled the current generation of life science researchers to probe deeper into the genomic blueprint. The amount of data generated by these technologies has been increasing exponentially over the last decade. Storage, archival and dissemination of such huge data sets require efficient solutions, both from the hardware as well as software perspective. The present paper describes BIND, an algorithm specialized for compressing nucleotide sequence data. By adopting a unique 'block-length' encoding for representing binary data (as a key step), BIND achieves significant compression gains as compared to the widely used general purpose compression algorithms (gzip, bzip2 and lzma). Moreover, in contrast to implementations of existing specialized genomic compression approaches, the implementation of BIND is enabled to handle non-ATGC and lowercase characters. This makes BIND a loss-less compression approach that is suitable for practical use. More importantly, validation results of BIND (with real-world data sets) indicate reasonable speeds of compression and decompression that can be achieved with minimal processor/memory usage. BIND is available for download at http://metagenomics.atc.tcs.com/compression/BIND. No license is required for academic or non-profit use. PMID:22922203
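    The general idea of representing nucleotides as binary data and then storing run (block) lengths can be sketched as below. This is only an illustration under stated assumptions and is not BIND's actual file format or implementation: the two-bit mapping, the function names, and the side list used for non-ACGT and lowercase characters are all hypothetical choices made to keep the round trip lossless.

```python
from itertools import groupby

# Assumed two-bit code per base (hypothetical mapping).
CODES = {"A": (0, 0), "C": (0, 1), "G": (1, 0), "T": (1, 1)}
DECODE = {bits: base for base, bits in CODES.items()}


def run_lengths(bits):
    """Describe a bit list as (first bit, lengths of alternating runs)."""
    runs = [len(list(group)) for _, group in groupby(bits)]
    return (bits[0] if bits else 0), runs


def expand(first_bit, runs):
    """Invert run_lengths: rebuild the bit list from its run lengths."""
    bits, bit = [], first_bit
    for length in runs:
        bits.extend([bit] * length)
        bit ^= 1  # runs of a binary stream always alternate
    return bits


def encode(seq):
    """Split a sequence into two run-length-coded bit streams.

    Characters outside uppercase ACGT are stored verbatim in a side
    list with their positions, and replaced by a placeholder base.
    """
    high, low, exceptions = [], [], []
    for pos, char in enumerate(seq):
        if char not in CODES:
            exceptions.append((pos, char))
            char = "A"  # placeholder, restored on decode
        h, l = CODES[char]
        high.append(h)
        low.append(l)
    return run_lengths(high), run_lengths(low), exceptions


def decode(high_rl, low_rl, exceptions):
    """Rebuild the original sequence exactly."""
    high, low = expand(*high_rl), expand(*low_rl)
    chars = [DECODE[(h, l)] for h, l in zip(high, low)]
    for pos, char in exceptions:
        chars[pos] = char
    return "".join(chars)


if __name__ == "__main__":
    original = "AAAACCCCGGGGTTTTNNacgt"
    assert decode(*encode(original)) == original
```

    Long runs of identical bits in either stream collapse to a single integer, which is where a scheme of this kind would gain over storing each base separately; the actual compression ratio depends on the data.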

  8. Nucleotide sequence and phylogenetic analysis of a new potexvirus: Malva mosaic virus.

    PubMed

    Côté, Fabien; Paré, Christine; Majeau, Nathalie; Bolduc, Marilène; Leblanc, Eric; Bergeron, Michel G; Bernardy, Michael G; Leclerc, Denis

    2008-01-01

    A filamentous virus isolated from Malva neglecta Wallr. (common mallow) and propagated in Chenopodium quinoa was grown and cloned, and the complete nucleotide sequence was determined (GenBank accession # DQ660333). The genomic RNA is 6858 nt in length and contains five major open reading frames (ORFs). The genomic organization is similar to that of members of the genus Potexvirus in the family Flexiviridae, and the virus-encoded proteins share homology with those of this group. Phylogenetic analysis revealed a close relationship with narcissus mosaic virus (NMV), scallion virus X (ScaVX) and, to a lesser extent, to Alstroemeria virus X (AlsVX) and pepino mosaic virus (PepMV). A novel putative pseudoknot structure is predicted in the 3'-UTR of a subgroup of potexviruses, including this newly described virus. The consensus GAAAA sequence is detected at the 5'-end of the genomic RNA and experimental data strongly suggest that this motif could be a distinctive hallmark of this genus. The name Malva mosaic virus is proposed. PMID:18054524

  9. Nucleotide sequence and expression of alpha-glucosidase-encoding gene (agdA) from Aspergillus oryzae.

    PubMed

    Minetoki, T; Gomi, K; Kitamoto, K; Kumagai, C; Tamura, G

    1995-08-01

    We have isolated an alpha-glucosidase (AGL)-encoding gene (agdA) from Aspergillus oryzae by heterologous hybridization using the corresponding Aspergillus niger gene as a probe. Southern hybridization analysis showed that the agdA gene is on a 5.0-kb ScaI fragment and there is a single copy in the A. oryzae chromosome. Comparison with the A. niger agdA gene indicated that the agdA gene contains three putative introns from 52 to 59 nucleotides long, and that it encodes 985 amino acid residues. The deduced amino acid sequence of A. oryzae AGL is 78% homologous with the A. niger AGL. The high degree of homology between this enzyme and a number of AGL enzymes in the amino acid sequence bordering the putative catalytic residue suggests that Asp492 is a catalytic residue of A. oryzae AGL. The cloned gene was functional. Transformants of A. oryzae containing multiple copies of the cloned agdA gene showed a 6- to 16-fold increase in AGL activity. Like the Taka-amylase A and glucoamylase genes of A. oryzae, expression of the agdA gene was induced when maltose was provided as a carbon source, but expression was not induced by glucose. This result suggested that cis-element(s) involved in maltose induction may also be present in the agdA promoter region. PMID:7549103

  10. The ChEMBL database as linked open data

    PubMed Central

    2013-01-01

    Background: Making data available as Linked Data using Resource Description Framework (RDF) promotes integration with other web resources. RDF documents can natively link to related data, and others can link back using Uniform Resource Identifiers (URIs). RDF makes the data machine-readable and uses extensible vocabularies for additional information, making it easier to scale up inference and data analysis. Results: This paper describes recent developments in an ongoing project converting data from the ChEMBL database into RDF triples. Relative to earlier versions, this updated version of ChEMBL-RDF uses recently introduced ontologies, including CHEMINF and CiTO; exposes more information from the database; and is now available as dereferenceable, linked data. To demonstrate these new features, we present novel use cases showing further integration with other web resources, including Bio2RDF, Chem2Bio2RDF, and ChemSpider, and showing the use of standard ontologies for querying. Conclusions: We have illustrated the advantages of using open standards and ontologies to link the ChEMBL database to other databases. Using those links and the knowledge encoded in standards and ontologies, the ChEMBL-RDF resource creates a foundation for integrated semantic web cheminformatics applications, such as the presented decision support. PMID:23657106

  11. ANTICALIgN: visualizing, editing and analyzing combined nucleotide and amino acid sequence alignments for combinatorial protein engineering.

    PubMed

    Jarasch, Alexander; Kopp, Melanie; Eggenstein, Evelyn; Richter, Antonia; Gebauer, Michaela; Skerra, Arne

    2016-07-01

    ANTICALIgN is an interactive software developed to simultaneously visualize, analyze and modify alignments of DNA and/or protein sequences that arise during combinatorial protein engineering, design and selection. ANTICALIgN combines powerful functions known from currently available sequence analysis tools with unique features for protein engineering, in particular the possibility to display and manipulate nucleotide sequences and their translated amino acid sequences at the same time. ANTICALIgN offers both template-based multiple sequence alignment (MSA), using the unmutated protein as reference, and conventional global alignment, to compare sequences that share an evolutionary relationship. The application of similarity-based clustering algorithms facilitates the identification of duplicates or of conserved sequence features among a set of selected clones. Imported nucleotide sequences from DNA sequence analysis are automatically translated into the corresponding amino acid sequences and displayed, offering numerous options for selecting reading frames, highlighting of sequence features and graphical layout of the MSA. The MSA complexity can be reduced by hiding the conserved nucleotide and/or amino acid residues, thus putting emphasis on the relevant mutated positions. ANTICALIgN is also able to handle suppressed stop codons or even to incorporate non-natural amino acids into a coding sequence. We demonstrate crucial functions of ANTICALIgN in an example of Anticalins selected from a lipocalin random library against the fibronectin extradomain B (ED-B), an established marker of tumor vasculature. Apart from engineered protein scaffolds, ANTICALIgN provides a powerful tool in the area of antibody engineering and for directed enzyme evolution. PMID:27261456
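    The translation step described in this abstract, with a selectable reading frame and a suppressed stop codon, can be sketched as below. This is a generic illustration using the standard genetic code and is not ANTICALIgN's own code; the function name, the `amber` parameter, and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0, amber=None):
    """Translate a DNA string in the given reading frame (0, 1 or 2).

    If `amber` is given, the TAG (amber) stop codon is read as that
    residue instead of '*', mimicking a suppressed stop codon.
    Codons containing characters outside ACGT are rendered as 'X',
    and trailing bases that do not fill a codon are ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(frame, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon == "TAG" and amber is not None:
            protein.append(amber)
        else:
            protein.append(CODON_TABLE.get(codon, "X"))
    return "".join(protein)


if __name__ == "__main__":
    seq = "ATGGCCTAGGGT"  # toy sequence for illustration only
    print(translate(seq))
    print(translate(seq, amber="Q"))
    print(translate(seq, frame=1))
```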

  12. Nucleotide sequence encoding the flavoprotein and hydrophobic subunits of the succinate dehydrogenase of Escherichia coli.

    PubMed Central

    Wood, D; Darlison, M G; Wilde, R J; Guest, J R

    1984-01-01

    The nucleotide sequence of a 3614 base-pair segment of DNA containing the sdhA gene, encoding the flavoprotein subunit of succinate dehydrogenase of Escherichia coli, and two genes sdhC and sdhD, encoding small hydrophobic subunits, has been determined. Together with the iron-sulphur protein gene (sdhB) these genes form an operon (sdhCDAB) situated between the citrate synthase gene (gltA) and the 2-oxoglutarate dehydrogenase complex genes (sucAB): gltA-sdhCDAB-sucAB. Transcription of the gltA and sdhCDAB genes appears to diverge from a single intergenic region that contains two pairs of potential promoter sequences and two putative CRP (cyclic AMP receptor protein)-binding sites. The sdhA structural gene comprises 1761 base-pairs (587 codons, excluding the initiation codon, AUG) and it encodes a polypeptide of Mr 64268 that is strikingly homologous with the flavoprotein subunit of fumarate reductase (frdA gene product). The FAD-binding region, including the histidine residue at the FAD-attachment site, has been identified by its homology with other flavoproteins and with the flavopeptide of the bovine heart mitochondrial succinate dehydrogenase. Potential active-site cysteine and histidine residues have also been indicated by the comparisons. The sdhC (384 base-pairs) and sdhD (342 base-pairs) structural genes encode two strongly hydrophobic proteins of Mr 14167 and 12792 respectively. These proteins resemble in size and composition, but not sequence, the membrane anchor proteins of fumarate reductase (the frdC and frdD gene products). PMID:6383359

  13. Complete Nucleotide Sequences and Genome Organization of Two Pepper Mild Mottle Virus Isolates from Capsicum annuum in South Korea

    PubMed Central

    Choi, Seung-Kook; Choi, Gug-Seoun; Kwon, Sun-Jung

    2016-01-01

    The complete genome sequences of pepper mild mottle virus (PMMoV)-P2 and -P3 were determined by the Sanger sequencing method. Although PMMoV-P2 and PMMoV-P3 have different pathogenicity in some pepper cultivars, the complete genome sequences of PMMoV-P2 and -P3 are composed of 6,356 nucleotides (nt). In this study, we report the complete genome sequences and genome organization of PMMoV-P2 and -P3 isolates from pepper species in South Korea. PMID:27198033

  14. Complete Nucleotide Sequences and Genome Organization of Two Pepper Mild Mottle Virus Isolates from Capsicum annuum in South Korea.

    PubMed

    Choi, Seung-Kook; Choi, Gug-Seoun; Kwon, Sun-Jung; Yoon, Ju-Yeon

    2016-01-01

    The complete genome sequences of pepper mild mottle virus (PMMoV)-P2 and -P3 were determined by the Sanger sequencing method. Although PMMoV-P2 and PMMoV-P3 have different pathogenicity in some pepper cultivars, the complete genome sequences of PMMoV-P2 and -P3 are composed of 6,356 nucleotides (nt). In this study, we report the complete genome sequences and genome organization of PMMoV-P2 and -P3 isolates from pepper species in South Korea. PMID:27198033

  15. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... is DNA, RNA, or PRT (protein). If a nucleotide sequence contains both DNA and RNA fragments, the type shall be “DNA.” In addition, the combined DNA/RNA molecule shall be further described in the to feature... combined DNA/RNA” Name/Key Provide appropriate identifier for feature, preferably from WIPO Standard...

  16. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... is DNA, RNA, or PRT (protein). If a nucleotide sequence contains both DNA and RNA fragments, the type shall be “DNA.” In addition, the combined DNA/RNA molecule shall be further described in the to feature... combined DNA/RNA” Name/Key Provide appropriate identifier for feature, preferably from WIPO Standard...

  17. Empirical Comparison of Simple Sequence Repeats and Single Nucleotide Polymorphisms in Assessment of Maize Diversity and Relatedness

    Technology Transfer Automated Retrieval System (TEKTRAN)

    While Simple Sequence Repeats (SSRs) are extremely useful genetic markers, recent advances in technology have produced a shift toward use of single nucleotide polymorphisms (SNPs). The different mutational properties of these two classes of markers result in differences in heterozygosities and allel...

  18. A high-density simple sequence repeat and single nucleotide polymorphism genetic map of the tetraploid cotton genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Cotton genome complexity was investigated with a saturated molecular genetic map that combined several sets of microsatellites or simple sequence repeats (SSR) and the first major public set of single nucleotide polymorphism (SNP) markers in cotton genomes (Gossypium spp.), and that was constructed ...

  19. Nucleotide sequence of a predicted diguanylate cyclase unique to egg contaminating Salmonella enteritidis that does not form biofilm.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This is a nucleotide sequence submitted as bankit1052494 and given the GenBank accession number EU375808A post-review. It will be released to the public in June 2008 in coordination with an ASM abstract presentation. A paper will also be submitted that refers to this accession number. A diguanyla...

  20. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... is DNA, RNA, or PRT (protein). If a nucleotide sequence contains both DNA and RNA fragments, the type shall be “DNA.” In addition, the combined DNA/RNA molecule shall be further described in the to feature... combined DNA/RNA” Name/Key Provide appropriate identifier for feature, preferably from WIPO Standard...

  1. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... is DNA, RNA, or PRT (protein). If a nucleotide sequence contains both DNA and RNA fragments, the type shall be “DNA.” In addition, the combined DNA/RNA molecule shall be further described in the to feature... combined DNA/RNA” Name/Key Provide appropriate identifier for feature, preferably from WIPO Standard...

  2. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... is DNA, RNA, or PRT (protein). If a nucleotide sequence contains both DNA and RNA fragments, the type shall be “DNA.” In addition, the combined DNA/RNA molecule shall be further described in the to feature... combined DNA/RNA” Name/Key Provide appropriate identifier for feature, preferably from WIPO Standard...

  3. Molecular cloning and nucleotide sequence of a transforming gene detected by transfection of chicken B-cell lymphoma DNA

    NASA Astrophysics Data System (ADS)

    Goubin, Gerard; Goldman, Debra S.; Luce, Judith; Neiman, Paul E.; Cooper, Geoffrey M.

    1983-03-01

    A transforming gene detected by transfection of chicken B-cell lymphoma DNA has been isolated by molecular cloning. It is homologous to a conserved family of sequences present in normal chicken and human DNAs but is not related to transforming genes of acutely transforming retroviruses. The nucleotide sequence of the cloned transforming gene suggests that it encodes a protein that is partially homologous to the amino terminus of transferrin and related proteins although only about one tenth the size of transferrin.

  4. Nucleotide sequence of the gene encoding the major subunit of CS3 fimbriae of enterotoxigenic Escherichia coli.

    PubMed Central

    Boylan, M; Smyth, C J; Scott, J R

    1988-01-01

    The complete nucleotide sequence of a 612-base-pair DNA fragment containing the gene for the major fimbrial subunit of CS3 of enterotoxigenic Escherichia coli is presented. A possible promoter region, a ribosome-binding site, and two potential signal peptidase cleavage sites are indicated. Unlike the best-studied fimbrial proteins, the predicted CS3 sequence has no Cys residues. PMID:2903130

  5. Nucleotide sequences of 5S rRNAs from sponge Halichondria japonica and tunicate Halocynthia roretzi and their phylogenetic positions

    PubMed Central

    Komiya, Hiroyuki; Hasegawa, Masami; Takemura, Shosuke

    1983-01-01

    The nucleotide sequences of 5S rRNAs from sponge Halichondria japonica and tunicate Halocynthia roretzi were determined by chemical and enzymatic gel methods. Their phylogenetic positions among metazoans were derived from the 5S rRNA sequences by a computer analysis based on the maximum parsimony principle. It was suggested that the sponge is closely related to several invertebrates and the tunicate has affinity to vertebrates rather than invertebrates. PMID:6835845
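The maximum parsimony principle named in the abstract above prefers the tree that needs the fewest character changes. As a minimal sketch, the following scores one alignment column on a fixed binary tree with Fitch's small-parsimony algorithm. This is a textbook method swapped in for illustration; the abstract does not say which program or algorithm the authors used, and the taxon names and states below are invented toy data.

```python
def fitch(tree, states):
    """Score one site on a rooted binary tree with Fitch's algorithm.

    tree   : a leaf name (str) or a pair (left_subtree, right_subtree)
    states : dict mapping leaf name -> observed character at this site
    Returns (candidate_state_set, minimum_number_of_changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed here.
        return shared, left_cost + right_cost
    # Children disagree: take the union and count one substitution.
    return left_set | right_set, left_cost + right_cost + 1


# Toy data (assumed, not taken from the paper).
site = {"a": "A", "b": "A", "c": "G", "d": "G"}
tree_1 = (("a", "b"), ("c", "d"))
tree_2 = (("a", "c"), ("b", "d"))
cost_1 = fitch(tree_1, site)[1]
cost_2 = fitch(tree_2, site)[1]
print(cost_1, cost_2)
```

Summing such per-site costs over all alignment columns and comparing candidate trees gives the parsimony ranking; searching over tree topologies is a separate step not shown here.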

  6. Single nucleotide polymorphism discovery in rainbow trout by deep sequencing of a reduced representation library

    PubMed Central

    2009-01-01

    Background: To enhance capabilities for genomic analyses in rainbow trout, such as genomic selection, a large suite of polymorphic markers that are amenable to high-throughput genotyping protocols must be identified. Expressed Sequence Tags (ESTs) have been used for single nucleotide polymorphism (SNP) discovery in salmonids. In those strategies, the salmonid semi-tetraploid genomes often led to assemblies of paralogous sequences and therefore resulted in a high rate of false positive SNP identification. Sequencing genomic DNA using primers identified from ESTs proved to be an effective but time-consuming methodology of SNP identification in rainbow trout, and therefore not suitable for high-throughput SNP discovery. In this study, we employed a high-throughput strategy that used pyrosequencing technology to generate data from a reduced representation library constructed with genomic DNA pooled from 96 unrelated rainbow trout that represent the National Center for Cool and Cold Water Aquaculture (NCCCWA) broodstock population. Results: The reduced representation library consisted of 440 bp fragments resulting from complete digestion with the restriction enzyme HaeIII; sequencing produced 2,000,000 reads providing an average 6-fold coverage of the estimated 150,000 unique genomic restriction fragments (300,000 fragment ends). Three independent data analyses identified 22,022 to 47,128 putative SNPs on 13,140 to 24,627 independent contigs. A set of 384 putative SNPs, randomly selected from the sets produced by the three analyses, were genotyped on individual fish to determine the validation rate of putative SNPs among analyses, distinguish apparent SNPs that actually represent paralogous loci in the tetraploid genome, examine Mendelian segregation, and place the validated SNPs on the rainbow trout linkage map. Approximately 48% (183) of the putative SNPs were validated; 167 markers were successfully incorporated into the rainbow trout linkage map. In addition, 2% of the

  7. Gene-based single nucleotide polymorphism discovery in bovine muscle using next-generation transcriptomic sequencing

    PubMed Central

    2013-01-01

    Background: Genetic information based on molecular markers has increasingly been used in cattle breeding improvement programmes, as a means to improve conventional phenotypic selection. Advances in molecular genetics have led to the identification of several genetic markers associated with genes affecting economic traits. Until recently, the identification of the causative genetic variants involved in the phenotypes of interest has remained a difficult task. The advent of novel sequencing technologies now offers a new opportunity for the identification of such variants. Despite sequencing costs plummeting, sequencing whole genomes or large targeted regions is still too expensive for most laboratories. A transcriptomic-based sequencing approach offers a cheaper alternative to identify a large number of polymorphisms and possibly to discover causative variants. In the present study, we performed a gene-based single nucleotide polymorphism (SNP) discovery analysis in bovine Longissimus thoraci, using RNA-Seq. To our knowledge, this represents the first study done in bovine muscle. Results: Messenger RNAs from Longissimus thoraci from three Limousin bull calves were subjected to high-throughput sequencing. Approximately 36–46 million paired-end reads were obtained per library. A total of 19,752 transcripts were identified and 34,376 different SNPs were detected. Fifty-five percent of the SNPs were found in coding regions and ~22% resulted in an amino acid change. Applying a very stringent SNP quality threshold, we detected 8,407 different high-confidence SNPs, 18% of which are non-synonymous coding SNPs. To analyse the accuracy of RNA-Seq technology for SNP detection, 48 SNPs were selected for validation by genotyping. No discrepancies were observed when using the highest SNP probability threshold. To test the usefulness of the identified SNPs, the 48 selected SNPs were assessed by genotyping 93 bovine samples, representing mostly the nine major breeds used in France

  8. Compilation of 5S rRNA and 5S rRNA gene sequences

    PubMed Central

    Specht, Thomas; Wolters, Jörn; Erdmann, Volker A.

    1990-01-01

    The BERLIN RNA DATABANK as of December 31, 1989, contains a total of 667 sequences of 5S rRNAs or their genes, which is an increase of 114 new sequence entries over the last compilation (1). It covers sequences from 44 archaebacteria, 267 eubacteria, 20 plastids, 6 mitochondria, 319 eukaryotes and 11 eukaryotic pseudogenes. The hardcopy shows only the list (Table 1) of those organisms whose sequences have been determined. The BERLIN RNA DATABANK uses the format of the EMBL Nucleotide Sequence Data Library complemented by a Sequence Alignment (SA) field including secondary structure information. PMID:1692116

  9. Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark.

    PubMed

    Isono, K; McIninch, J D; Borodovsky, M

    1994-01-01

    The nucleotide sequence data for yeast mitochondrial ribosomal protein (MRP) genes were analyzed by the computer program GeneMark, which predicts the presence of likely genes in sequence data by calculating statistical biases in the appearance of consecutive nucleotides. The program uses a set of standard sequence data for this calculation. We used this program for the analysis of yeast nucleotide sequence data containing MRP genes, hoping to obtain information as to whether they share features in common that are different from other yeast genes. Sequence data sets for ordinary yeast genes and for 27 known MRP genes were used. The MRP genes were nicely predicted as likely genes regardless of the data sets used, whereas other yeast genes were predicted to be likely genes only when the data set for ordinary yeast genes was used. The assembled sequence data for chromosomes II, III, VIII and XI as well as the segmented data for chromosome V were analyzed in a similar manner. In addition to the known MRP genes, eleven ORFs were predicted to be likely MRP genes. Thus, the method seems very powerful in analyzing genes of heterologous origins. PMID:7719921
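The idea of scoring a sequence by "statistical biases in the appearance of consecutive nucleotides" against a standard training set can be sketched with a first-order Markov model and a log-odds score. This is a simplified illustration only, not GeneMark's actual model or code; the function names, the pseudocount, and the toy training sequences are all assumptions.

```python
import math


def train_markov(seqs, alphabet="ACGT", pseudocount=1.0):
    """Estimate P(next | previous) from training sequences (first-order model)."""
    counts = {a: {b: pseudocount for b in alphabet} for a in alphabet}
    for s in seqs:
        for prev, nxt in zip(s, s[1:]):
            if prev in counts and nxt in counts[prev]:
                counts[prev][nxt] += 1
    probs = {}
    for a in alphabet:
        total = sum(counts[a].values())
        probs[a] = {b: counts[a][b] / total for b in alphabet}
    return probs


def log_odds(seq, model, background):
    """Sum of log ratios over consecutive pairs; > 0 favours `model`."""
    score = 0.0
    for prev, nxt in zip(seq, seq[1:]):
        if prev in model and nxt in model[prev]:
            score += math.log(model[prev][nxt] / background[prev][nxt])
    return score


# Toy training sets (assumed): one "gene-like", one "background".
gene_model = train_markov(["GCGCGCGC", "CGCGCGGC"])
bg_model = train_markov(["ATATATAT", "TATATAAT"])
print(log_odds("GCGCGC", gene_model, bg_model) > 0)
print(log_odds("ATATAT", gene_model, bg_model) < 0)
```

Swapping the training set changes which sequences score as gene-like, which is the kind of effect the abstract reports when comparing the ordinary yeast gene set with the MRP gene set.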

  10. Complete nucleotide sequence of the gene for the specific glycoprotein (gp55) of Friend spleen focus-forming virus.

    PubMed Central

    Amanuma, H; Katori, A; Obata, M; Sagata, N; Ikawa, Y

    1983-01-01

    The complete nucleotide sequence of the gene for the specific glycoprotein (gp55) of the polycythemic strain of Friend spleen focus-forming virus (SFFV) was derived from the cloned SFFV DNA intermediate. The gp55 gene is present within 1.4 kilobases of the 5' side of the 3' long terminal repeat sequence. The open reading frame predicts that the primary translation product has a total of 409 amino acids with a Mr of 44,752. Comparisons of the deduced amino acid sequence of gp55 with those of the envelope (env) gene products of murine leukemia viruses (MuLVs) revealed that gp55 is composed of three distinct regions. The amino-terminal 80% of the molecule has a high degree of sequence homology with the amino-terminal portion of the gp70 of the Moloney mink cell focus-forming virus (BALB/Mo-MCFV). This portion of the BALB/Mo-MCFV gp70 is known to be coded for by the acquired xenotropic env-like sequence. The sequence of the following 66 amino acids of gp55 is highly homologous to that of the middle portion of the p15E of Moloney MuLV (Mo-MuLV). The sequence of the carboxyl-terminal 12 amino acids is specific to gp55, and a comparison of the nucleotide sequence showed that this specific amino acid sequence is due to the presence of seven extra nucleotides compared with the sequence of the Mo-MuLV. PMID:6306650

  11. Nucleotide sequence and genetic organization of the Bacillus subtilis comG operon.

    PubMed Central

    Albano, M; Breitling, R; Dubnau, D A

    1989-01-01

    A series of Tn917lac insertions define the comG region of the Bacillus subtilis chromosome. comG mutants are deficient in competence and specifically in the binding of exogenous DNA. The genes included in the comG region are first expressed during the transition from the exponential to the stationary growth phase. From nucleotide sequence information, it was concluded that the comG locus contains seven open reading frames (ORFs), several of which overlap at their termini. High-resolution S1 nuclease mapping and primer extension were used to identify the 5' terminus of the comG mRNA. The sequence upstream from the comG start site closely resembled the consensus recognition sequence for the major B. subtilis vegetative RNA polymerase holoenzyme. Complementation analysis confirmed that the comG ORF1 protein is required for the ability of competent cultures to resolve into two populations with different cell densities on Renografin (E. R. Squibb & Sons, Princeton, N.J.) gradients, as well as for full expression of comE, another late competence locus. The predicted comG ORF1 protein showed significant similarity to the virB ORF11 protein from Agrobacterium tumefaciens, which is probably involved in T-DNA transfer. The N-terminal sequences of comG ORF3 and, to a lesser extent, the comG ORF4 and ORF5 proteins were similar to a class of pilin proteins from members of the genera Bacteroides, Pseudomonas, Neisseria, and Moraxella. All of the comG proteins except comG ORF1 possessed hydrophobic domains that were potentially capable of spanning the bacterial membrane. It is likely that these proteins are membrane associated, and they may comprise part of the DNA transport machinery. When present in multiple copies, a DNA fragment carrying the comG promoter was capable of inhibiting the development of competence as well as the expression of several late com genes, suggesting a role for a transcriptional activator in the expression of those genes. PMID:2507524

  12. Proteus mirabilis MR/P fimbrial operon: genetic organization, nucleotide sequence, and conditions for expression.

    PubMed Central

    Bahrani, F K; Mobley, H L

    1994-01-01

    Proteus mirabilis, an agent of urinary tract infection, expresses at least four fimbrial types. Among these are the MR/P (mannose-resistant/Proteus-like) fimbriae. MrpA, the structural subunit, is optimally expressed at 37 degrees C in Luria broth cultured statically for 48 h by each of seven strains examined. Genes encoding this fimbria were isolated, and the complete nucleotide sequence was determined. The mrp gene cluster encoded by 7,293 bp predicts eight polypeptides: MrpI (22,133 Da), MrpA (17,909 Da), MrpB (19,632 Da), MrpC (96,823 Da), MrpD (27,886 Da), MrpE (19,470 Da), MrpF (17,363 Da), and MrpG (13,169 Da). mrpI is upstream of mrpA, the gene encoding the major structural subunit, and is transcribed in the direction opposite to that of the rest of the operon. All predicted polypeptides share ≥25% amino acid identity with at least one other enteric fimbrial gene product encoded by the pap, fim, smf, fan, or mrk gene clusters. PMID:7910820

  13. Pediococcus acidilactici ldhD gene: cloning, nucleotide sequence, and transcriptional analysis.

    PubMed Central

    Garmyn, D; Ferain, T; Bernard, N; Hols, P; Delplace, B; Delcour, J

    1995-01-01

    The gene encoding D-lactate dehydrogenase was isolated on a 2.9-kb insert from a library of Pediococcus acidilactici DNA by complementation for growth under anaerobiosis of an Escherichia coli lactate dehydrogenase and pyruvate-formate lyase double mutant. The nucleotide sequence of ldhD encodes a protein of 331 amino acids (predicted molecular mass of 37,210 Da) which shows similarity to the family of D-2-hydroxyacid dehydrogenases. The enzyme encoded by the cloned fragment is equally active on pyruvate and hydroxypyruvate, indicating that the enzyme has both D-lactate and D-glycerate dehydrogenase activities. Three other open reading frames were found in the 2.9-kb insert, one of which (rpsB) is highly similar to bacterial genes coding for ribosomal protein S2. Northern (RNA) blotting analyses indicated the presence of a 2-kb dicistronic transcript of ldhD (a metabolic gene) and rpsB (a putative ribosomal protein gene) together with a 1-kb monocistronic rpsB mRNA. These transcripts are abundant in the early phase of exponential growth but steadily fade away to disappear in the stationary phase. Primer extension analysis identified two distinct promoters driving either cotranscription of ldhD and rpsB or transcription of rpsB alone. PMID:7539419

  14. Whole-genome sequencing identifies genomic heterogeneity at a nucleotide and chromosomal level in bladder cancer

    PubMed Central

    Morrison, Carl D.; Liu, Pengyuan; Woloszynska-Read, Anna; Zhang, Jianmin; Luo, Wei; Qin, Maochun; Bshara, Wiam; Conroy, Jeffrey M.; Sabatini, Linda; Vedell, Peter; Xiong, Donghai; Liu, Song; Wang, Jianmin; Shen, He; Li, Yinwei; Omilian, Angela R.; Hill, Annette; Head, Karen; Guru, Khurshid; Kunnev, Dimiter; Leach, Robert; Eng, Kevin H.; Darlak, Christopher; Hoeflich, Christopher; Veeranki, Srividya; Glenn, Sean; You, Ming; Pruitt, Steven C.; Johnson, Candace S.; Trump, Donald L.

    2014-01-01

    Using complete genome analysis, we sequenced five bladder tumors accrued from patients with muscle-invasive transitional cell carcinoma of the urinary bladder (TCC-UB) and identified a spectrum of genomic aberrations. In three tumors, complex genotype changes were noted. All three had tumor protein p53 mutations and a relatively large number of single-nucleotide variants (SNVs; average of 11.2 per megabase), structural variants (SVs; average of 46), or both. This group was best characterized by chromothripsis and the presence of subclonal populations of neoplastic cells or intratumoral mutational heterogeneity. Here, we provide evidence that the process of chromothripsis in TCC-UB is mediated by nonhomologous end-joining using kilobase, rather than megabase, fragments of DNA, which we refer to as “stitchers,” to repair this process. We postulate that a potential unifying theme among tumors with the more complex genotype group is a defective replication–licensing complex. A second group (two bladder tumors) had no chromothripsis, and a simpler genotype, WT tumor protein p53, had relatively few SNVs (average of 5.9 per megabase) and only a single SV. There was no evidence of a subclonal population of neoplastic cells. In this group, we used a preclinical model of bladder carcinoma cell lines to study a unique SV (translocation and amplification) of the gene glutamate receptor ionotropic N-methyl D-aspartate as a potential new therapeutic target in bladder cancer. PMID:24469795

  15. New features in the genus Ilarvirus revealed by the nucleotide sequence of Fragaria chiloensis latent virus.

    PubMed

    Tzanetakis, Ioannis E; Martin, Robert R

    2005-09-01

    Fragaria chiloensis latent virus (FClLV), a member of the genus Ilarvirus, was first identified in the early 1990s. Double-stranded RNA was extracted from FClLV-infected plants and cloned. The complete nucleotide sequence of the virus has been elucidated. RNA 1 encodes a protein with methyltransferase and helicase enzymatic motifs, while RNA 2 encodes the viral RNA-dependent RNA polymerase and an ORF that shares no homology with other Ilarvirus genes. RNA 3 codes for movement and coat proteins and an additional ORF, making FClLV possibly the first Ilarvirus encoding a third protein in RNA 3. Phylogenetic analysis reveals that FClLV is most closely related to Prune dwarf virus, the type member of subgroup 4 of the Ilarvirus genus. FClLV is also closely related to Alfalfa mosaic virus (AlMV), a virus that shares many properties with Ilarviruses. We propose the reclassification of AlMV as a member of the Ilarvirus genus instead of being a member of a distinct genus. PMID:15878214

  16. Postzygotic single-nucleotide mosaicisms in whole-genome sequences of clinically unremarkable individuals

    PubMed Central

    Huang, August Y; Xu, Xiaojing; Ye, Adam Y; Wu, Qixi; Yan, Linlin; Zhao, Boxun; Yang, Xiaoxu; He, Yao; Wang, Sheng; Zhang, Zheng; Gu, Bowen; Zhao, Han-Qing; Wang, Meng; Gao, Hua; Gao, Ge; Zhang, Zhichao; Yang, Xiaoling; Wu, Xiru; Zhang, Yuehua; Wei, Liping

    2014-01-01

    Postzygotic single-nucleotide mutations (pSNMs) have been studied in cancer and a few other overgrowth human disorders at whole-genome scale and found to play critical roles. However, in clinically unremarkable individuals, pSNMs have never been identified at whole-genome scale largely due to technical difficulties and lack of matched control tissue samples, and thus the genome-wide characteristics of pSNMs remain unknown. We developed a new Bayesian-based mosaic genotyper and a series of effective error filters, using which we were able to identify 17 SNM sites from ∼80× whole-genome sequencing of peripheral blood DNAs from three clinically unremarkable adults. The pSNMs were thoroughly validated using pyrosequencing, Sanger sequencing of individual cloned fragments, and multiplex ligation-dependent probe amplification. The mutant allele fraction ranged from 5%-31%. We found that C→T and C→A were the predominant types of postzygotic mutations, similar to the somatic mutation profile in tumor tissues. Simulation data showed that the overall mutation rate was an order of magnitude lower than that in cancer. We detected varied allele fractions of the pSNMs among multiple samples obtained from the same individuals, including blood, saliva, hair follicle, buccal mucosa, urine, and semen samples, indicating that pSNMs could affect multiple sources of somatic cells as well as germ cells. Two of the adults have children who were diagnosed with Dravet syndrome. We identified two non-synonymous pSNMs in SCN1A, a causal gene for Dravet syndrome, from these two unrelated adults and found that the mutant alleles were transmitted to their children, highlighting the clinical importance of detecting pSNMs in genetic counseling. PMID:25312340

  17. Nucleotide sequences and genetic analysis of hydrogen oxidation (hox) genes in Azotobacter vinelandii.

    PubMed Central

    Menon, A L; Mortenson, L E; Robson, R L

    1992-01-01

    Azotobacter vinelandii contains a heterodimeric, membrane-bound [NiFe]hydrogenase capable of catalyzing the reversible oxidation of H2. The beta and alpha subunits of the enzyme are encoded by the structural genes hoxK and hoxG, respectively, which appear to form part of an operon that contains at least one further potential gene (open reading frame 3 [ORF3]). In this study, determination of the nucleotide sequence of a region of 2,344 bp downstream of ORF3 revealed four additional closely spaced or overlapping ORFs. These ORFs, ORF4 through ORF7, potentially encode polypeptides with predicted masses of 22.8, 11.4, 16.3, and 31 kDa, respectively. Mutagenesis of the chromosome of A. vinelandii in the area sequenced was carried out by introduction of antibiotic resistance gene cassettes. Disruption of hoxK and hoxG by a kanamycin resistance gene abolished whole-cell hydrogenase activity coupled to O2 and led to loss of the hydrogenase alpha subunit. Insertional mutagenesis of ORF3 through ORF7 with a promoterless lacZ-Kmr cassette established that the region is transcriptionally active and involved in H2 oxidation. We propose to call ORF3 through ORF7 hoxZ, hoxM, hoxL, hoxO, and hoxQ, respectively. The predicted hox gene products resemble those encoded by genes from hydrogenase-related operons in other bacteria, including Escherichia coli and Alcaligenes eutrophus. PMID:1624446

  18. Nucleotide sequence of the fadR gene, a multifunctional regulator of fatty acid metabolism in Escherichia coli.

    PubMed Central

    DiRusso, C C

    1988-01-01

    The Escherichia coli fadR gene is a multifunctional regulator of fatty acid and acetate metabolism. In the present work the nucleotide sequence of the 1.3 kb DNA fragment which encodes FadR has been determined. The coding sequence of the fadR gene is 714 nucleotides long and is preceded by a typical E. coli ribosome binding site and is followed by a sequence predicted to be sufficient for factor-independent chain termination. Primer extension experiments demonstrated that the transcription of the fadR gene initiates with an adenine nucleotide 33 nucleotides upstream from the predicted start of translation. The derived fadR peptide has a calculated molecular weight of 26,972. This is in reasonable agreement with the apparent molecular weight of 29,000 previously estimated on the basis of maxi-cell analysis of plasmid encoded proteins. There is a segment of twenty amino acids within the predicted peptide which resembles the DNA recognition and binding site of many transcriptional regulatory proteins. PMID:2843809

  19. T box transcription antitermination riboswitch: Influence of nucleotide sequence and orientation on tRNA binding by the antiterminator element

    PubMed Central

    Fauzi, Hamid; Agyeman, Akwasi; Hines, Jennifer V.

    2008-01-01

    Many bacteria utilize riboswitch transcription regulation to monitor and appropriately respond to cellular levels of important metabolites or effector molecules. The T box transcription antitermination riboswitch responds to cognate uncharged tRNA by specifically stabilizing an antiterminator element in the 5′-untranslated mRNA leader region and precluding formation of a thermodynamically more stable terminator element. Stabilization occurs when the tRNA acceptor end base pairs with the first four nucleotides in the seven nucleotide bulge of the highly conserved antiterminator element. The significance of the conservation of the antiterminator bulge nucleotides that do not base pair with the tRNA is unknown, but they are required for optimal function. In vitro selection was used to determine if the isolated antiterminator bulge context alone dictates the mode in which the tRNA acceptor end binds the bulge nucleotides. No sequence conservation beyond complementarity was observed and the location was not constrained to the first four bases of the bulge. The results indicate that formation of a structure that recognizes the tRNA acceptor end in isolation is not the determinant driving force for the high phylogenetic sequence conservation observed within the antiterminator bulge. Additional factors or T box leader features more likely influenced the phylogenetic sequence conservation. PMID:19152843

  20. Nucleotide Sequence Evolution at the κ-Casein Locus: Evidence for Positive Selection within the Family Bovidae

    PubMed Central

    Ward, T. J.; Honeycutt, R. L.; Derr, J. N.

    1997-01-01

    κ-Casein is a mammalian milk protein involved in a number of important physiological processes. In the gut, the ingested protein is split into an insoluble peptide (para κ-casein) and a soluble hydrophilic glycopeptide (caseinomacropeptide). Caseinomacropeptide is responsible for increased efficiency of digestion, prevention of neonate hypersensitivity to ingested proteins, and inhibition of gastric pathogens. Variation within this peptide has significant effects associated with important traits such as milk production. The nucleotide sequences for regions of κ-casein exon and intron four were determined for representatives of the artiodactyl family Bovidae. The pattern of nucleotide substitution in κ-casein sequences for distantly related bovid taxa demonstrates that positive selection has accelerated their divergence at the amino acid sequence level. This selection has differentially influenced the molecular evolution of the two κ-casein split peptides and is focused within a 34-codon region of caseinomacropeptide. PMID:9409842

  1. Nucleotide sequence analysis of Aleutian mink disease parvovirus shows that multiple virus types are present in infected mink.

    PubMed Central

    Gottschalck, E; Alexandersen, S; Cohn, A; Poulsen, L A; Bloom, M E; Aasted, B

    1991-01-01

    Different isolates of Aleutian mink disease parvovirus (ADV) were cloned and their nucleotide sequences determined. Analysis of individual clones from two in vivo-derived isolates of high virulence indicated that more than one type of ADV DNA was present in each of these isolates. Analysis of several clones from two preparations of a cell culture-adapted isolate of low virulence showed the presence of only one type of ADV DNA. We also describe the nucleotide sequence from map units 44 to 88 of a new type of ADV DNA. The new type of ADV DNA is compared with the previously published ADV sequences, to which it shows 95% homology. These findings indicate that ADV, a single-stranded DNA virus, has a considerable degree of variability and that several virus types can be present simultaneously in an infected animal. PMID:1649336

  2. Nucleotide sequence of a chickpea chlorotic stunt virus relative that infects pea and faba bean in China.

    PubMed

    Zhou, Cui-Ji; Xiang, Hai-Ying; Zhuo, Tao; Li, Da-Wei; Yu, Jia-Lin; Han, Cheng-Gui

    2012-07-01

    We determined the genome sequence of a new polerovirus that infects field pea and faba bean in China. Its entire nucleotide sequence (6021 nt) was most closely related (83.3% identity) to that of an Ethiopian isolate of chickpea chlorotic stunt virus (CpCSV-Eth). With the exception of the coat protein (encoded by ORF3), amino acid sequence identities of all gene products of this virus to those of CpCSV-Eth and other poleroviruses were <90%. This suggests that it is a new member of the genus Polerovirus, and the name pea mild chlorosis virus is proposed. PMID:22476900

  3. Nucleotide sequence of a cluster of early and late genes in a conserved segment of the vaccinia virus genome.

    PubMed Central

    Plucienniczak, A; Schroeder, E; Zettlmeissl, G; Streeck, R E

    1985-01-01

    The nucleotide sequence of a 7.6 kb vaccinia DNA segment from a genomic region conserved among different orthopoxviruses has been determined. This segment contains a tight cluster of 12 partly overlapping open reading frames, most of which can be correlated with previously identified early and late proteins and mRNAs. Regulatory signals used by vaccinia virus have been studied. Presumptive promoter regions are rich in A and T and carry the consensus sequences TATA and AATAA spaced at 20-24 base pairs. Tandem repeats of a CTATTC consensus sequence are proposed to be involved in the termination of early transcription. PMID:2987815

  4. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51. Copies of WIPO... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid...

  5. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51. Copies of WIPO... 37 Patents, Trademarks, and Copyrights 1 2011-07-01 2011-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid...

  6. Nucleotide sequence of a complementary DNA encoding pea cytosolic copper/zinc superoxide dismutase. [Pisum sativum L

    SciTech Connect

    White, D.A.; Zilinskas, B.A.

    1991-08-01

    The authors now report the nucleotide sequence of the cytosolic Cu/Zn SOD cloned from a λgt11 cDNA library constructed from mRNA extracted from leaves of 7- to 10-d pea seedlings (Pisum sativum L.). The clone was isolated using a 22-base synthetic oligonucleotide complementary to the amino acid sequence CGIIGLQG. This sequence, found at the protein's carboxy terminus, is highly conserved among plant cytosolic Cu/Zn SODs but not chloroplastic Cu/Zn SODs. The 738-base pair sequence contains an open reading frame specifying 152 codons and a predicted Mr of 18,024 D. The deduced amino acid sequence is highly homologous (79-82% identity) with the sequences of other known plant cytosolic Cu/Zn SODs but less highly conserved (63-65%) when compared with several chloroplastic Cu/Zn SODs including pea (10).

  7. Sequence-Specific Incorporation of Enzyme-Nucleotide Chimera by DNA Polymerases.

    PubMed

    Welter, Moritz; Verga, Daniela; Marx, Andreas

    2016-08-16

    DNA polymerases select the right nucleotide for the growing polynucleotide chain based on the shape and geometry of the nascent nucleotide pairs and thereby ensure high DNA replication selectivity. High-fidelity DNA polymerases are believed to possess tight active sites that allow little deviation from the canonical structures. However, DNA polymerases are known to use nucleotides with small modifications as substrates, which is key for numerous core biotechnology applications. We show that even high-fidelity DNA polymerases are capable of efficiently using nucleotide chimera modified with a large protein like horseradish peroxidase as substrates for template-dependent DNA synthesis, despite this "cargo" being more than 100-fold larger than the natural substrates. We exploited this capability for the development of systems that enable naked-eye detection of DNA and RNA at single nucleotide resolution. PMID:27392211

  8. The ChEMBL bioactivity database: an update.

    PubMed

    Bento, A Patrícia; Gaulton, Anna; Hersey, Anne; Bellis, Louisa J; Chambers, Jon; Davies, Mark; Krüger, Felix A; Light, Yvonne; Mak, Lora; McGlinchey, Shaun; Nowotka, Michal; Papadatos, George; Santos, Rita; Overington, John P

    2014-01-01

    ChEMBL is an open large-scale bioactivity database (https://www.ebi.ac.uk/chembl), previously described in the 2012 Nucleic Acids Research Database Issue. Since then, a variety of new data sources and improvements in functionality have contributed to the growth and utility of the resource. In particular, more comprehensive tracking of compounds from research stages through clinical development to market is provided through the inclusion of data from United States Adopted Name applications; a new richer data model for representing drug targets has been developed; and a number of methods have been put in place to allow users to more easily identify reliable data. Finally, access to ChEMBL is now available via a new Resource Description Framework format, in addition to the web-based interface, data downloads and web services. PMID:24214965

  9. The ChEMBL bioactivity database: an update

    PubMed Central

    Bento, A. Patrícia; Gaulton, Anna; Hersey, Anne; Bellis, Louisa J.; Chambers, Jon; Davies, Mark; Krüger, Felix A.; Light, Yvonne; Mak, Lora; McGlinchey, Shaun; Nowotka, Michal; Papadatos, George; Santos, Rita; Overington, John P.

    2014-01-01

    ChEMBL is an open large-scale bioactivity database (https://www.ebi.ac.uk/chembl), previously described in the 2012 Nucleic Acids Research Database Issue. Since then, a variety of new data sources and improvements in functionality have contributed to the growth and utility of the resource. In particular, more comprehensive tracking of compounds from research stages through clinical development to market is provided through the inclusion of data from United States Adopted Name applications; a new richer data model for representing drug targets has been developed; and a number of methods have been put in place to allow users to more easily identify reliable data. Finally, access to ChEMBL is now available via a new Resource Description Framework format, in addition to the web-based interface, data downloads and web services. PMID:24214965

  10. Nucleotide Sequence and Genetic Structure of a Novel Carbaryl Hydrolase Gene (cehA) from Rhizobium sp. Strain AC100

    PubMed Central

    Hashimoto, Masayuki; Fukui, Mitsuru; Hayano, Kouichi; Hayatsu, Masahito

    2002-01-01

    Rhizobium sp. strain AC100, which is capable of degrading carbaryl (1-naphthyl-N-methylcarbamate), was isolated from soil treated with carbaryl. This bacterium hydrolyzed carbaryl to 1-naphthol and methylamine. Carbaryl hydrolase from the strain was purified to homogeneity, and its N-terminal sequence, molecular mass (82 kDa), and enzymatic properties were determined. The purified enzyme hydrolyzed 1-naphthyl acetate and 4-nitrophenyl acetate, indicating that the enzyme is an esterase. We then cloned the carbaryl hydrolase gene (cehA) from the plasmid DNA of the strain and determined the nucleotide sequence of the 10-kb region containing cehA. No homologous sequences were found by a database homology search using the nucleotide and deduced amino acid sequences of the cehA gene. Six open reading frames including the cehA gene were found in the 10-kb region, and sequencing analysis showed that the cehA gene is flanked by two copies of an insertion sequence-like sequence, suggesting that it forms part of a composite transposon. PMID:11872471

  11. DNA sequencing by synthesis using 3′-O-azidomethyl nucleotide reversible terminators and surface-enhanced Raman spectroscopic detection

    PubMed Central

    Palla, Mirkó; Guo, Wenjing; Shi, Shundi; Li, Zengmin; Wu, Jian; Jockusch, Steffen; Guo, Cheng; Russo, James J.; Turro, Nicholas J.; Ju, Jingyue

    2014-01-01

    As an alternative to fluorescence-based DNA sequencing by synthesis (SBS), we report here an approach using an azido moiety (N3) that has an intense, narrow and unique Raman shift at 2125 cm−1, where virtually all biological molecules are transparent, as a label for SBS. We first demonstrated that the four 3′-O-azidomethyl nucleotide reversible terminators (3′-O-azidomethyl-dNTPs) displayed surface enhanced Raman scattering (SERS) at 2125 cm−1. Using these 4 nucleotide analogues as substrates, we then performed a complete 4-step SBS reaction. We used SERS to monitor the appearance of the azide-specific Raman peak at 2125 cm−1 as a result of polymerase extension by a single 3′-O-azidomethyl-dNTP into the growing DNA strand and disappearance of this Raman peak with cleavage of the azido label to permit the next nucleotide incorporation, thereby continuously determining the DNA sequence. Due to the small size of the azido label, the 3′-O-azidomethyl-dNTPs are efficient substrates for the DNA polymerase. In the SBS cycles, the natural nucleotides are restored after each incorporation and cleavage, producing a growing DNA strand that bears no modifications and will not impede further polymerase reactions. Thus, with further improvements in SERS for the azido moiety, this approach has the potential to provide an attractive alternative to fluorescence-based SBS. PMID:25396047

  12. Estimates of Gene Flow in Drosophila Pseudoobscura Determined from Nucleotide Sequence Analysis of the Alcohol Dehydrogenase Region

    PubMed Central

    Schaeffer, S. W.; Miller, E. L.

    1992-01-01

    The genetic structure of Drosophila pseudoobscura populations was inferred from a nucleotide sequence analysis of a 3.4-kb segment of the alcohol dehydrogenase (Adh) region. A total of 99 isochromosomal strains collected from 13 populations in North and South America were used to determine if any population departed from a neutral model and to estimate levels of gene flow between populations. This study also included the nucleotide sequences from two sibling species, D. persimilis and D. miranda. We estimated the neutral mutation parameter, 4Nμ, in synonymous and noncoding sites for 17 subregions of Adh in each of nine populations with sample sizes greater than three. The nucleotide diversity data in the nine populations was tested for departures from an equilibrium neutral model with two statistical tests. The Tajima and the Hudson, Kreitman, Aguade tests showed that each population fails to reject a neutral model. Tests for genetic differentiation between populations fail to show any population substructure among the North American populations of D. pseudoobscura. The nucleotide diversity data is consistent with direct and indirect measures of gene flow that show extensive dispersal between populations of D. pseudoobscura. PMID:1427038

  13. Characterization and Nucleotide Sequence of the Cryptic Cel Operon of Escherichia Coli K12

    PubMed Central

    Parker, L. L.; Hall, B. G.

    1990-01-01

    Wild-type Escherichia coli are not able to utilize β-glucoside sugars because the genes for utilization of these sugars are cryptic. Spontaneous mutations in the cel operon allow its expression and enable the organism to ferment cellobiose, arbutin and salicin. In this report we describe the structure and nucleotide sequence of the cel operon. The cel operon consists of five genes: celA, whose function is unknown; celB and celC which encode phosphoenolpyruvate-dependent phosphotransferase system enzyme II(cel) and enzyme III(cel), respectively, for the transport and phosphorylation of β-glucoside sugars; celD, which encodes a negative regulatory protein; and celF, which encodes a phospho-β-glucosidase that acts on phosphorylated cellobiose, arbutin and salicin. The mutationally activated cel operon is induced in the presence of its substrates, and is repressed in their absence. A comparison of proteins encoded by the cel operon with functionally equivalent proteins of the bgl operon, another cryptic E. coli gene system responsible for the catabolism of β-glucoside sugars, revealed no significant homology between these two systems despite common functional characteristics. The celD and celF encoded repressor and phospho-β-glucosidase proteins are homologous to the melibiose regulatory protein and to the melA encoded α-galactosidase of E. coli, respectively. Furthermore, the celC encoded PEP-dependent phosphotransferase system enzyme III(cel) is strikingly homologous to an enzyme III(lac) of the Gram-positive organism Staphylococcus aureus. We conclude that the genes for these two enzyme IIIs diverged much more recently than did their hosts, indicating that E. coli and S. aureus have undergone relatively recent exchange of chromosomal genes. PMID:2179047

  14. Nucleotide sequence and mutational analysis of the vnfENX region of Azotobacter vinelandii.

    PubMed Central

    Wolfinger, E D; Bishop, P E

    1991-01-01

    The nucleotide sequence (3,600 bp) of a second copy of nifENX-like genes in Azotobacter vinelandii has been determined. These genes are located immediately downstream from vnfA and have been designated vnfENX. The vnfENX genes appear to be organized as a single transcriptional unit that is preceded by a potential RpoN-dependent promoter. While the nifEN genes are thought to be evolutionarily related to nifDK, the vnfEN genes appear to be more closely related to nifEN than to either nifDK, vnfDK, or anfDK. Mutant strains (CA47 and CA48) carrying insertions in vnfE and vnfN, respectively, are able to grow diazotrophically in molybdenum (Mo)-deficient medium containing vanadium (V) (Vnf+) and in medium lacking both Mo and V (Anf+). However, a double mutant (strain DJ42.48) which contains a nifEN deletion and an insertion in vnfE is unable to grow diazotrophically in Mo-sufficient medium or in Mo-deficient medium with or without V. This suggests that NifE and NifN substitute for VnfE and VnfN when the vnfEN genes are mutationally inactivated. AnfA is not required for the expression of a vnfN-lacZ transcriptional fusion, even though this fusion is expressed under Mo- and V-deficient diazotrophic growth conditions. PMID:1938952

  15. A comparison of nucleotide sequences of measles virus L genes derived from wild-type viruses and SSPE brain tissues.

    PubMed

    Komase, K; Rima, B K; Pardowitz, I; Kunz, C; Billeter, M A; ter Meulen, V; Baczko, K

    1995-04-20

    The nucleotide sequences of the large protein (L) gene derived from two wild-type measles viruses (MV) and two SSPE brain-derived viruses have been determined. All sequences have single large open reading frames encoding 2183 amino acid residues. The deduced L proteins are well conserved and the proposed functional domains which have been identified for rhabdo- and paramyxoviruses are completely conserved in all strains. The degree of variability of L proteins is the lowest of all structural proteins of MV, reflecting its role in virus reproduction and persistence. Biased hypermutation was not observed in the L genes derived from SSPE brain tissue. None of the nucleotide changes can be associated with the attenuated phenotype of the Edmonston vaccine viruses. PMID:7747453

  16. Phylogeny of Bipolaris inferred from nucleotide sequences of Brn1, a reductase gene involved in melanin biosynthesis.

    PubMed

    Shimizu, Kiminori; Tanaka, Chihiro; Peng, You-Liang; Tsuda, Mitsuya

    1998-08-01

    The Brn1 reductase melanin biosynthesis gene in the fungal genus Bipolaris was sequenced in 74 strains of 22 species. The Brn1 region was highly conserved among the species examined at the nucleotide and the amino acid levels. To elucidate the phylogenetic relationships among Bipolaris species, trees were inferred from nucleotide sequences of this region. Species in these trees formed exclusive clusters clearly separated from one another, except for B. panici-miliacei and B. setariae, and B. victoriae and B. zeicola. When unidentified strains were added to this tree, they fell within known species or formed independent clusters. These data indicated that the Brn1 gene region was suitable for species-level systematics within the genus. The results also suggest that Bipolaris consists of two or more clades that may reflect teleomorphic connections. PMID:12501419

  17. Nucleotide sequence of the genes encoding the canine herpesvirus gB, gC and gD homologues.

    PubMed

    Limbach, K J; Limbach, M P; Conte, D; Paoletti, E

    1994-08-01

    The nucleotide sequence of the genes encoding the canine herpesvirus (CHV) gB, gC and gD homologues was determined. These genes are predicted to encode polypeptides of 879, 459 and 345 amino acids, respectively. Comparison of the predicted amino acid sequences of CHV gB, gC and gD with the homologous sequences from other herpesviruses indicates that CHV is an alphaherpesvirus, a conclusion that is consistent with the previous classification of this virus according to biological properties. Alignment of the homologous gB, gC and gD amino acid sequences indicates that most of the cysteine residues are conserved, suggesting that these glycoproteins possess similar tertiary structures. The nucleotide sequence of the open reading frame downstream from the CHV gC gene was also determined. The predicted amino acid sequence of this putative polypeptide appears to be homologous to a family of proteins encoded downstream from the gC gene in most, although not all, alphaherpesviruses. PMID:7545942

  18. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms

    PubMed Central

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies have investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared the obtained sequence information with the available donkey draft genome (and the Illumina reads from which it originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than the number aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionarily important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources. PMID:26151450
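
    The abstract above attributes the difference in read alignment to the larger N50 of the horse assembly. As a reminder of what that statistic measures, here is a minimal sketch of an N50 calculation. The function name `n50` and the example contig lengths are illustrative assumptions, not values from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Hypothetical contig lengths (bp), for illustration only.
example_lengths = [100, 80, 60, 40, 20]
print(n50(example_lengths))
```

    A larger N50 indicates a more contiguous assembly; by itself it says nothing about base-level accuracy.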

  19. Characterization of Newcastle disease virus isolates by reverse transcription PCR coupled to direct nucleotide sequencing and development of sequence database for pathotype prediction and molecular epidemiological analysis.

    PubMed Central

    Seal, B S; King, D J; Bennett, J D

    1995-01-01

    Degenerate oligonucleotide primers were synthesized to amplify nucleotide sequences from portions of the fusion protein and matrix protein genes of Newcastle disease virus (NDV) genomic RNA that could be used diagnostically. These primers were used in a single-tube reverse transcription PCR of NDV genomic RNA coupled to direct nucleotide sequencing of the amplified product to characterize more than 30 NDV isolates. In agreement with previous reports, differences in the fusion protein cleavage sequence that correlated genotypically with virulence among various NDV pathotypes were detected. By using sequences generated from the matrix protein gene coding for the nuclear localization signal, lentogenic viruses were again grouped phylogenetically separate from other pathotypes. These techniques were applied to compare neurotropic velogenic viruses isolated from an outbreak of Newcastle disease in cormorants and turkeys. Cormorant NDV isolates and an NDV isolate from an infected turkey flock in North Dakota had the fusion protein cleavage sequence 109SRGRRQKRFVG119. The R-for-G substitution at position 110 may be unique for the cormorant-type isolates. Although the amino acid sequences from the fusion protein cleavage site were identical, nucleotide sequence data correlate the outbreak in turkeys to a cormorant virus isolate from Minnesota and not to a cormorant virus isolate from Michigan. On the basis of sequence information, the cormorant isolates are virulent viruses related to isolates of psittacine origin, possibly genotypically distinct from other velogenic NDV isolates. These techniques can be used reliably for Newcastle disease epidemiology and for prediction of pathotypes of NDV isolates without traditional live-bird inoculations. PMID:8567895

  20. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms.

    PubMed

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies have investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared the obtained sequence information with the available donkey draft genome (and the Illumina reads from which it originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than the number aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~ 0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionarily important in equids. Comparing Y-chromosome regions we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources. PMID:26151450

  1. Complete nucleotide sequence and gene rearrangement of the mitochondrial genome of the bell-ring frog, Buergeria buergeri (family Rhacophoridae).

    PubMed

    Sano, Naomi; Kurabayashi, Atsushi; Fujii, Tamotsu; Yonekawa, Hiromichi; Sumida, Masayuki

    2004-06-01

    In this study we determined the complete nucleotide sequence (19,959 bp) of the mitochondrial DNA of the rhacophorid frog Buergeria buergeri. The gene content, nucleotide composition, and codon usage of B. buergeri conformed to those of typical vertebrate patterns. However, due to an accumulation of lengthy repetitive sequences in the D-loop region, this species possesses the largest mitochondrial genome among all the vertebrates examined so far. Comparison of the gene organizations among amphibian species (Rana, Xenopus, salamanders and caecilians) revealed that the positioning of four tRNA genes and the ND5 gene in the mtDNA of B. buergeri diverged from the common vertebrate gene arrangement shared by Xenopus, salamanders and caecilians. The unique positions of the tRNA genes in B. buergeri are shared by ranid frogs, indicating that the rearrangements of the tRNA genes occurred in a common ancestral lineage of ranids and rhacophorids. On the other hand, the novel position of the ND5 gene seems to have arisen in a lineage leading to rhacophorids (and other closely related taxa) after ranid divergence. Phylogenetic analysis based on nucleotide sequence data of all mitochondrial genes also supported the gene rearrangement pathway. PMID:15329496

  2. Nucleotide sequence of the cDNA encoding the precursor of the beta subunit of rat lutropin.

    PubMed Central

    Chin, W W; Godine, J E; Klein, D R; Chang, A S; Tan, L K; Habener, J F

    1983-01-01

    We have determined the nucleotide sequences of cDNAs encoding the precursor of the beta subunit of rat lutropin, a polypeptide hormone that regulates gonadal function, including the development of gametes and the production of steroid sex hormones. The cDNAs were prepared from poly(A)+ RNA derived from the pituitary glands of rats 4 weeks after ovariectomy and were cloned in bacterial plasmids. Bacterial colonies containing transfected plasmids were screened by hybridization with a 32P-labeled cDNA encoding the beta subunit of human chorionic gonadotropin, a protein that is related in structure to lutropin. Several recombinant plasmids were detected that by nucleotide sequence analyses contained coding sequences for the precursor of the beta subunit of lutropin. Complete determination of the nucleotide sequences of these cDNAs, as well as of cDNA reverse-transcribed from pituitary poly(A)+ RNA by using a synthetic pentadecanucleotide as a primer of RNA, provided the entire 141-codon sequence of the precursor of the beta subunit of rat lutropin. The precursor consists of a 20 amino acid leader (signal) peptide and an apoprotein of 121 amino acids. The amino acid sequence of the rat lutropin beta subunit shows similarity to the beta subunits of the ovine/bovine, porcine, and human lutropins (81, 86, and 74% of amino acids identical, respectively). Blot hybridization of pituitary RNAs separated by electrophoresis on agarose gels showed that the mRNA encoding the lutropin beta subunit consists of approximately 700 bases. The availability of cDNAs for both the alpha and beta subunits of lutropin will facilitate studies of the regulation of lutropin expression. PMID:6192440

  3. Nucleotides critical for the interaction of the Streptococcus pyogenes Mga virulence regulator with Mga-regulated promoter sequences.

    PubMed

    Hause, Lara L; McIver, Kevin S

    2012-09-01

    The Mga regulator of Streptococcus pyogenes directly activates the transcription of a core regulon that encodes virulence factors such as M protein (emm), C5a peptidase (scpA), and streptococcal inhibitor of complement (sic) by directly binding to a 45-bp binding site as determined by an electrophoretic mobility shift assay (EMSA) and DNase I protection. However, by comparing the nucleotide sequences of all established Mga binding sites, we found that they exhibit only 13.4% identity with no discernible symmetry. To determine the core nucleotides involved in functional Mga-DNA interactions, the M1T1 Pemm1 binding site was altered and screened for nucleotides important for DNA binding in vitro and for transcriptional activation using a plasmid-based luciferase reporter in vivo. Following this analysis, 34 nucleotides within the Pemm1 binding site that had an effect on Mga binding, Mga-dependent transcriptional activation, or both were identified. Of these critical nucleotides, guanines and cytosines within the major groove were disproportionately identified, clustered at the 5' and 3' ends of the binding site and with runs of nonessential adenines between the critical nucleotides. On the basis of these results, a Pemm1 minimal binding site of 35 bp bound Mga at a level comparable to the level of binding of the larger 45-bp site. Comparison of Pemm with directed mutagenesis performed in the M1T1 Mga-regulated PscpA and Psic promoters, as well as methylation interference analysis of PscpA, establishes that Mga binds to DNA in a promoter-specific manner. PMID:22773785

  4. SMRT Sequencing of Long Tandem Nucleotide Repeats in SCA10 Reveals Unique Insight of Repeat Expansion Structure

    PubMed Central

    Landrian, Ivette; Godiska, Ronald; Shanker, Savita; Yu, Fahong; Farmerie, William G.; Ashizawa, Tetsuo

    2015-01-01

    A large, non-coding ATTCT repeat expansion causes the neurodegenerative disorder, spinocerebellar ataxia type 10 (SCA10). In a subset of SCA10 patients, interruption motifs are present at the 5’ end of the expansion and strongly correlate with epileptic seizures. Thus, interruption motifs are a predictor of the epileptic phenotype and are hypothesized to act as a phenotypic modifier in SCA10. Yet, the exact internal sequence structure of SCA10 expansions remains unknown due to limitations in current technologies for sequencing across long extended tracts of tandem nucleotide repeats. We used the third generation sequencing technology, Single Molecule Real Time (SMRT) sequencing, to obtain full-length contiguous expansion sequences, ranging from 2.5 to 4.4 kb in length, from three SCA10 patients with different clinical presentations. We obtained sequence spanning the entire length of the expansion and identified the structure of known and novel interruption motifs within the SCA10 expansion. The exact interruption patterns in expanded SCA10 alleles will allow us to further investigate the potential contributions of these interrupting sequences to the pathogenic modification leading to the epilepsy phenotype in SCA10. Our results also demonstrate that SMRT sequencing is useful for deciphering long tandem repeats that pose as “gaps” in the human genome sequence. PMID:26295943

  5. SNUFER: A software for localization and presentation of single nucleotide polymorphisms using a Clustal multiple sequence alignment output file

    PubMed Central

    Mansur, Marco A B; Cardozo, Giovana P; Santos, Elaine V; Marins, Mozart

    2008-01-01

    SNUFER is a software tool for the automatic localization of single nucleotide polymorphisms (SNPs) and the generation of tables used to present them. After input of a fasta file containing the sequences to be analyzed, a multiple sequence alignment is generated using ClustalW run inside SNUFER. The ClustalW output file is then used to generate a table which displays the SNPs detected in the aligned sequences and their degree of similarity. This table can be exported to Microsoft Word, Microsoft Excel or as a single text file, permitting further editing for publication. The software was written using Delphi 7 for programming and FireBird 2.0 for sequence database management. It is freely available for noncommercial use and can be downloaded from http://www.heranza.com.br/bioinformatica2.htm. PMID:19238196
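
    The core step described above, locating SNPs in an existing multiple sequence alignment, can be sketched in a few lines. This is not SNUFER's implementation (which was written in Delphi); it is a minimal Python illustration under stated assumptions: the aligned sequences are already in memory as equal-length strings, columns containing a gap ('-') are skipped, and the function name `find_snp_columns` and the example sequences are hypothetical.

```python
def find_snp_columns(aligned):
    """Return (1-based column, {name: base}) for every variable column.

    `aligned` maps sequence names to equal-length aligned strings.
    Columns containing a gap ('-') are skipped so that only
    substitutions, not indels, are reported (an assumption of this sketch).
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    snps = []
    for i in range(lengths.pop()):
        column = {name: seq[i].upper() for name, seq in aligned.items()}
        bases = set(column.values())
        if "-" in bases:
            continue  # indel column, not a substitution
        if len(bases) > 1:
            snps.append((i + 1, column))
    return snps


# Hypothetical aligned sequences, for illustration only.
example = {
    "seq1": "ATGCCGTA-A",
    "seq2": "ATGTCGTACA",
    "seq3": "ATGCCGTACG",
}
for position, column in find_snp_columns(example):
    print(position, column)
```

    Each reported row corresponds to one line of a SNP table: the alignment position and the base carried by each sequence at that position.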

  6. Human adult T-cell leukemia virus: complete nucleotide sequence of the provirus genome integrated in leukemia cell DNA.

    PubMed Central

    Seiki, M; Hattori, S; Hirayama, Y; Yoshida, M

    1983-01-01

    Human retrovirus adult T-cell leukemia virus (ATLV) has been shown to be closely associated with human adult T-cell leukemia (ATL) [Yoshida, M., Miyoshi, I. & Hinuma, Y. (1982) Proc. Natl. Acad. Sci. USA 79, 2031-2035]. The provirus of ATLV integrated in DNA of leukemia T cells from a patient with ATL was molecularly cloned and the complete nucleotide sequence of 9,032 bases of the proviral genome was determined. The provirus DNA contains two long terminal repeats (LTRs) consisting of 755 bases, one at each end, which are flanked by a 6-base direct repeat of the cellular DNA sequence. The nucleotides in the LTR could be arranged into a unique secondary structure, which could explain transcriptional termination within the 3' LTR but not in the 5' LTR. The nucleotide sequence of the provirus contains three large open reading frames, which are capable of coding for proteins of 48,000, 99,000, and 54,000 daltons. The three open frames are in this order from the 5' end of the viral genome and the predicted 48,000-dalton polypeptide is a precursor of gag proteins, because it has an identical amino acid sequence to that of the NH2 terminus of human T-cell leukemia virus (HTLV) p24. The open frames coding for 99,000- and 54,000-dalton polypeptides are thought to be the pol and env genes, respectively. On the 3' side of these three open frames, the ATLV sequence has four smaller open frames in various phases; these frames may code for 10,000-, 11,000-, 12,000-, and 27,000-dalton polypeptides. Although one or some of these open frames could be the transforming gene of this virus, in preliminary analysis, DNA of this region has no homology with the normal human genome. PMID:6304725

  7. Nucleotide sequences and mutations of the 5'-nontranslated region (5'NTR) of natural isolates of an epidemic echovirus 11' (prime).

    PubMed

    Szendrõi, A; El-Sageyer, M; Takács, M; Mezey, I; Berencsi, G

    2000-01-01

    An echovirus 11' (prime) virus caused an epidemic in Hungary in 1989. The leading clinical form of the disease was myocarditis. Hemorrhagic hepatitis syndromes also occurred, however, with lethal outcome in 13 newborn babies. Altogether 386 children suffered from registered clinical disease. No accumulation of serous meningitis cases or intrauterine deaths was observed during the epidemic, and the monovalent oral poliovirus vaccination campaign prevented the further circulation of the virus. The 5'-nontranslated regions (5'-NTR) of 12 natural isolates were sequenced (nucleotides: 260-577). The 5'-NTR was found to be different from that of the prototype Gregory strain (X80059) of EV11 (less than 90% identity), but related to the swine vesicular disease virus (D16364) SVDV and EV9 (X92886) as indicated by the best fitting dendrogram. The examination of the variable nucleotides in the internal ribosomal entry site (IRES) revealed that the nucleotide sequence of a region of the epidemic 5'-NTR was identical to that of coxsackievirus B2. Five of the epidemic isolates were found to carry mutations. Seven EV11' IRES elements possessed identical sequences, indicating that the virus had evolved before its arrival in Hungary. The comparative examination of the suboptimal secondary structures revealed that none of the mutations affected the secondary structure of stem-loop structures IV and V in the IRES elements. Although it has been shown previously that the echovirus group is genetically coherent and related to coxsackie B viruses, the sequence differences in the epidemic isolates resulted in profound modification of the central stem (residues 477-529) of stem-loop structure No. V, known to affect the neurovirulence of polioviruses. Two alternate cloverleaf (stem-loop) structures were also recognised (nucleotides 376 to 460 and 540 to 565) which seem to mask both regions of the IRES element complementary to the 3'-end of the 18 S rRNA (460 to 466 and 561 to 570

  8. Nucleotide Sequencing and SNP Detection of Toll-Like Receptor-4 Gene in Murrah Buffalo (Bubalus bubalis)

    PubMed Central

    Mitra, M.; Taraphder, S.; Sonawane, G. S.; Verma, A.

    2012-01-01

    Toll-like receptor-4 (TLR-4) is an important pattern recognition receptor that recognizes endotoxins associated with gram negative bacterial infections. The present investigation was carried out to study nucleotide sequencing and SNP detection by PCR-RFLP analysis of the TLR-4 gene in Murrah buffalo. Genomic DNA was isolated from 102 lactating Murrah buffalo from the NDRI herd. The amplified PCR fragments of TLR-4, comprising exon 1, exon 2, exon 3.1, and exon 3.2, were examined by RFLP. PCR products were obtained with sizes of 165, 300, 478, and 409 bp. The TLR-4 gene of the investigated Murrah buffaloes was highly polymorphic with AA, AB, and BB genotypes as revealed by PCR-RFLP analysis using Dra I, Hae III, and Hinf I REs. Nucleotide sequencing of the amplified fragment of the TLR-4 gene of Murrah buffalo was done. Twelve SNPs were identified. Six SNPs were nonsynonymous, resulting in a change in amino acids. Murrah is an indigenous buffalo breed and the presence of the nonsynonymous SNPs is indicative of its unique genomic architecture. Sequence alignment and homology across species using BLAST analysis revealed 97%, 97%, 99%, 98%, and 80% sequence homology with Bos taurus, Bos indicus, Ovis aries, Capra hircus, and Homo sapiens, respectively.
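
    PCR-RFLP genotyping, as used above, depends on whether an amplicon contains the recognition site of a given restriction enzyme, since a SNP can create or destroy that site. The sketch below scans a sequence for the recognition sequences of the three enzymes named in the abstract (DraI: TTTAAA, HaeIII: GGCC, HinfI: GANTC, where N is any base). The function name `find_sites` and the example amplicon are hypothetical, and the reported positions are where each recognition sequence starts, not the exact cut positions.

```python
import re

# Recognition sequences, 5'->3'. All three are palindromic,
# so scanning one strand is sufficient.
RECOGNITION_SITES = {
    "DraI": "TTTAAA",
    "HaeIII": "GGCC",
    "HinfI": "GANTC",  # N = any base
}


def find_sites(sequence, sites=RECOGNITION_SITES):
    """Return {enzyme: [1-based start positions of its recognition site]}."""
    sequence = sequence.upper()
    found = {}
    for enzyme, site in sites.items():
        pattern = site.replace("N", "[ACGT]")
        # A lookahead lets overlapping occurrences be reported as well.
        found[enzyme] = [m.start() + 1 for m in re.finditer(f"(?={pattern})", sequence)]
    return found


# Hypothetical amplicon, for illustration only.
amplicon = "ACGGCCTTTAAAGGATTCA"
print(find_sites(amplicon))
```

    Comparing the site lists for two alleles of the same amplicon shows which enzyme would distinguish them by fragment pattern on a gel.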

  9. Glucitol-specific enzymes of the phosphotransferase system in Escherichia coli. Nucleotide sequence of the gut operon.

    PubMed

    Yamada, M; Saier, M H

    1987-04-25

    The complete nucleotide sequence of the glucitol (gut) operon in Escherichia coli has been determined. The glucitol-specific Enzyme II and Enzyme III of the phosphoenolpyruvate:sugar phosphotransferase system as well as glucitol-6-phosphate dehydrogenase which are encoded by the gutA, gutB, and gutD genes of the gut operon, respectively, are predicted to consist of 506 (Mr = 54,018), 123 (Mr = 13,306), and 259 (Mr = 27,866) amino acyl residues, respectively. The hydropathic profile of the Enzyme IIgut revealed 7 or 8 long hydrophobic segments which may traverse the cell membrane as alpha-helices as well as 2 or 4 short strongly hydrophobic stretches which may traverse the membrane as beta-structure. The number of amino acyl residues and the sum of the molecular weights of the glucitol Enzyme II-III pair are nearly the same as those of the mannitol Enzyme II. The ratio of hydrophobic to hydrophilic amino acyl residues and the numbers of the hydrophobic segments are also nearly the same for both transport systems. However, no significant homology was found in the nucleotide or amino acyl sequences of the two systems. Glucitol-6-phosphate dehydrogenase was found to exhibit sequence homology with ribitol dehydrogenase. A repetitive extragenic palindromic sequence was found in the 3'-flanking region of the gutD gene, suggesting the presence of a gene downstream from the gutD gene. PMID:3553176

  10. An Interpretation of the Ancestral Codon from Miller’s Amino Acids and Nucleotide Correlations in Modern Coding Sequences

    PubMed Central

    Carels, Nicolas; de Leon, Miguel Ponce

    2015-01-01

    Purine bias, which is usually referred to as an “ancestral codon”, is known to result in short-range correlations between nucleotides in coding sequences, and it is common in all species. We demonstrate that RWY is a more appropriate pattern than the classical RNY, and purine bias (Rrr) is the product of a network of nucleotide compensations induced by functional constraints on the physicochemical properties of proteins. Through deductions from universal correlation properties, we also demonstrate that amino acids from Miller’s spark discharge experiment are compatible with functional primeval proteins at the dawn of living cell radiation on earth. These amino acids match the hydropathy and secondary structures of modern proteins. PMID:25922573
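
    The RNY and RWY patterns compared above are codon templates written in IUPAC ambiguity codes (R = A/G, Y = C/T, W = A/T, N = any base). A minimal sketch of how one might count the in-frame codons matching each template is given below; the function name and the toy sequence are hypothetical, and this is not the correlation analysis used by the authors.

```python
# IUPAC ambiguity codes needed for the two templates.
IUPAC = {"R": "AG", "Y": "CT", "W": "AT", "N": "ACGT"}


def pattern_fraction(cds, pattern):
    """Fraction of in-frame codons of `cds` that match a 3-letter IUPAC pattern."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        return 0.0
    hits = sum(
        all(base in IUPAC[code] for base, code in zip(codon, pattern))
        for codon in codons
    )
    return hits / len(codons)


toy = "GCTAATGGCTTT"                   # made-up codons: GCT AAT GGC TTT
print(pattern_fraction(toy, "RNY"))    # 0.75
print(pattern_fraction(toy, "RWY"))    # 0.25
```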

  11. Development of Single Nucleotide Polymorphism Markers via Sequence-based Genotyping in Cotton (Gossypium spp)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    High-throughput single nucleotide polymorphism (SNP) genotyping has become the dominant approach to genomic analysis and genetic manipulation in many crop plants. In cotton (Gossypium spp), however, only a very limited number of loci and a dearth of information have been generated from SNP genotypi...

  12. Complete nucleotide sequence of Rose yellow leaf virus, a new member of the family Tombusviridae

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genome of the Rose yellow leaf virus (RYLV) has been determined to be 3918 nucleotides containing seven open reading frames (ORFs). ORF1 encodes a 27 kDa peptide (p27). ORF2 shares a common start codon with ORF1 and continues through the amber stop codon of p27 to encode a 87 kDa (p87) protein t...

  13. Nucleotide sequences of fic and fic-1 genes involved in cell filamentation induced by cyclic AMP in Escherichia coli.

    PubMed Central

    Kawamukai, M; Matsuda, H; Fujii, W; Utsumi, R; Komano, T

    1989-01-01

    The nucleotide sequences of fic-1 involved in the cell filamentation induced by cyclic AMP in Escherichia coli and its normal counterpart fic were analyzed. The open reading frame of both fic-1 and fic coded for 200 amino acids. The Gly at position 55 in the Fic protein was changed to Arg in the Fic-1 protein. The promoter activity of fic was confirmed by fusing fic and lacZ. The gene downstream from fic was found to be pabA (p-aminobenzoate). There is an open reading frame (ORF190) coding for 190 amino acids upstream from the fic gene. Computer-assisted analysis showed that Fic has sequence similarity with part of CDC28 of Saccharomyces cerevisiae, CDC2 of Schizosaccharomyces pombe, and FtsA of E. coli. In addition, ORF190 has sequence similarity with the cyclosporin A-binding protein cyclophilin. PMID:2546924

  14. The complete nucleotide sequence of a 16S ribosomal RNA gene from a blue-green alga, Anacystis nidulans.

    PubMed

    Tomioka, N; Sugiura, M

    1983-01-01

    The complete nucleotide sequence of a 16S ribosomal RNA gene from a blue-green alga, Anacystis nidulans, has been determined. Its coding region is estimated to be 1,487 base pairs long, which is nearly identical to those reported for chloroplast 16S rRNA genes and is about 4% shorter than that of the Escherichia coli gene. The 16S rRNA sequence of A. nidulans has 83% homology with that of tobacco chloroplast and 74% homology with that of E. coli. Possible stem and loop structures of A. nidulans 16S rRNA sequences resemble more closely those of chloroplast 16S rRNAs than those of E. coli 16S rRNA. These observations support the endosymbiotic theory of chloroplast origin. PMID:6412038

  15. The nucleotide sequences of several tRNA genes from rat mitochondria: common features and relatedness to homologous species.

    PubMed Central

    Cantatore, P; De Benedetto, C; Gadaleta, G; Gallerani, R; Kroon, A M; Holtrop, M; Lanave, C; Pepe, G; Quagliariello, C; Saccone, C; Sbisa, E

    1982-01-01

    We have determined the nucleotide sequences of thirteen rat mt tRNA genes. The features of the primary and secondary structures of these tRNAs show that those for Gln, Ser, and f-Met resemble, while those for Lys, Cys, and Trp depart strikingly from the universal type. The remainder are slightly abnormal. Among many mammalian mt DNA sequences, those of mt tRNA genes are highly conserved, thus suggesting for those genes an additional, perhaps regulatory, function. A simple evolutionary relationship between the tRNAs of animal mitochondria and those of eukaryotic cytoplasm, of lower eukaryotic mitochondria or of prokaryotes, is not evident owing to the extreme divergence of the tRNA sequences in the two groups. However, a slightly higher homology does exist between a few animal mt tRNAs and those from prokaryotes or from lower eukaryotic mitochondria. PMID:7099963

  16. Nucleotide sequence of Zygosaccharomyces bailii virus Z: Evidence for +1 programmed ribosomal frameshifting and for assignment to family Amalgaviridae.

    PubMed

    Depierreux, Delphine; Vong, Minh; Nibert, Max L

    2016-06-01

    Zygosaccharomyces bailii virus Z (ZbV-Z) is a monosegmented dsRNA virus that infects the yeast Zygosaccharomyces bailii and remains unclassified to date despite its discovery >20 years ago. The previously reported nucleotide sequence of ZbV-Z (GenBank AF224490) encompasses two nonoverlapping long ORFs: upstream ORF1 encoding the putative coat protein and downstream ORF2 encoding the RNA-dependent RNA polymerase (RdRp). The lack of overlap between these ORFs raises the question of how the downstream ORF is translated. After examining the previous sequence of ZbV-Z, we predicted that it contains at least one sequencing error to explain the nonoverlapping ORFs, and hence we redetermined the nucleotide sequence of ZbV-Z, derived from the same isolate of Z. bailii as previously studied, to address this prediction. The key finding from our new sequence, which includes several insertions, deletions, and substitutions relative to the previous one, is that ORF2 in fact overlaps ORF1 in the +1 frame. Moreover, a proposed sequence motif for +1 programmed ribosomal frameshifting, previously noted in influenza A viruses, plant amalgaviruses, and others, is also present in the newly identified ORF1-ORF2 overlap region of ZbV-Z. Phylogenetic analyses provided evidence that ZbV-Z represents a distinct taxon most closely related to plant amalgaviruses (genus Amalgavirus, family Amalgaviridae). We conclude that ZbV-Z is the prototype of a new species, which we propose to assign as type species of a new genus of monosegmented dsRNA mycoviruses in family Amalgaviridae. Comparisons involving other unclassified mycoviruses with RdRps apparently related to those of plant amalgaviruses, and having either mono- or bisegmented dsRNA genomes, are also discussed. PMID:26951859

  17. A comparative genomics strategy for targeted discovery of single-nucleotide polymorphisms and conserved-noncoding sequences in orphan crops.

    PubMed

    Feltus, F A; Singh, H P; Lohithaswa, H C; Schulze, S R; Silva, T D; Paterson, A H

    2006-04-01

    Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species. PMID:16607031

  18. Partition enrichment of nucleotide sequences (PINS)--a generally applicable, sequence based method for enrichment of complex DNA samples.

    PubMed

    Kvist, Thomas; Sondt-Marcussen, Line; Mikkelsen, Marie Just

    2014-01-01

    The dwindling cost of DNA sequencing is driving transformative changes in various biological disciplines including medicine, thus resulting in an increased need for routine sequencing. Preparation of samples suitable for sequencing is the starting point of any practical application, but enrichment of the target sequence over background DNA is often laborious and of limited sensitivity, thereby limiting the usefulness of sequencing. The present paper describes a new method, Probability directed Isolation of Nucleic acid Sequences (PINS), for enrichment of DNA, enabling the sequencing of a large DNA region surrounding a small known sequence. A 275,000 fold enrichment of a target DNA sample containing integrated human papilloma virus is demonstrated. Specifically, a sample containing 0.0028 copies of target sequence per ng of total DNA was enriched to 786 copies per ng. The starting concentration of 0.0028 target copies per ng corresponds to one copy of target in a background of 100,000 complete human genomes. The enriched sample was subsequently amplified using rapid genome walking and the resulting DNA sequence revealed not only the sequence of the truncated virus, but also 1026 base pairs 5' and 50 base pairs 3' to the integration site in chromosome 8. The demonstrated enrichment method is extremely sensitive and selective and requires only minimal knowledge of the sequence to be enriched and will therefore enable sequencing where the target concentration relative to background is too low to allow the use of other sample preparation methods or where significant parts of the target sequence are unknown. PMID:25203653
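
    The figures quoted in this abstract can be cross-checked with a little arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the haploid human genome mass of about 3.3 pg is an assumed round value that does not appear in the abstract.

```python
def fold_enrichment(copies_before, copies_after):
    """Ratio of target copies per ng after and before enrichment."""
    return copies_after / copies_before


def genomes_per_target_copy(copies_per_ng, genome_mass_pg=3.3):
    """Number of background genomes that accompany one target copy.

    genome_mass_pg is an ASSUMED approximate mass of one haploid human genome.
    """
    ng_per_copy = 1.0 / copies_per_ng             # ng of total DNA per target copy
    return ng_per_copy * 1000.0 / genome_mass_pg  # 1 ng = 1000 pg


fold = fold_enrichment(0.0028, 786)
genomes = genomes_per_target_copy(0.0028)
print(f"fold enrichment: about {fold:,.0f}")
print(f"background genomes per target copy: about {genomes:,.0f}")
```

    With the rounded inputs given in the abstract, the ratio comes out at roughly 2.8 x 10^5, the same order as the quoted 275,000-fold enrichment, and the background works out to roughly 10^5 haploid genome equivalents per target copy under the assumed genome mass.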

  19. Partition Enrichment of Nucleotide Sequences (PINS) - A Generally Applicable, Sequence Based Method for Enrichment of Complex DNA Samples

    PubMed Central

    Kvist, Thomas; Sondt-Marcussen, Line; Mikkelsen, Marie Just

    2014-01-01

    The dwindling cost of DNA sequencing is driving transformative changes in various biological disciplines including medicine, thus resulting in an increased need for routine sequencing. Preparation of samples suitable for sequencing is the starting point of any practical application, but enrichment of the target sequence over background DNA is often laborious and of limited sensitivity, thereby limiting the usefulness of sequencing. The present paper describes a new method, Probability directed Isolation of Nucleic acid Sequences (PINS), for enrichment of DNA, enabling the sequencing of a large DNA region surrounding a small known sequence. A 275,000 fold enrichment of a target DNA sample containing integrated human papilloma virus is demonstrated. Specifically, a sample containing 0.0028 copies of target sequence per ng of total DNA was enriched to 786 copies per ng. The starting concentration of 0.0028 target copies per ng corresponds to one copy of target in a background of 100,000 complete human genomes. The enriched sample was subsequently amplified using rapid genome walking and the resulting DNA sequence revealed not only the sequence of the truncated virus, but also 1026 base pairs 5′ and 50 base pairs 3′ to the integration site in chromosome 8. The demonstrated enrichment method is extremely sensitive and selective and requires only minimal knowledge of the sequence to be enriched and will therefore enable sequencing where the target concentration relative to background is too low to allow the use of other sample preparation methods or where significant parts of the target sequence are unknown. PMID:25203653

  20. Nucleotide sequence and genome organization of Dweet mottle virus and its relationship to members of the family Betaflexiviridae.

    PubMed

    Hajeri, Subhas; Ramadugu, Chandrika; Keremane, Manjunath; Vidalakis, Georgios; Lee, Richard

    2010-09-01

    The nucleotide sequence of Dweet mottle virus (DMV) was determined and compared to sequences of members of the families Alphaflexiviridae and Betaflexiviridae. The DMV genome has 8,747 nucleotides (nt) excluding the 3' poly-(A) tail. DMV genomic RNA contains three putative open reading frames (ORFs) and untranslated regions of 73 nt at the 5' and 541 nt at the 3' termini. ORF1 potentially encodes a 227.48-kDa polyprotein, which has methyltransferase, oxygenase, endopeptidase, helicase, and RNA-dependent RNA polymerase (RdRP) domains. ORF2 encodes a movement protein of 40.25 kDa, while ORF3 encodes a coat protein of 40.69 kDa. Protein database searches showed 98-99% matches of DMV ORFs with citrus leaf blotch virus (CLBV) sequences. Phylogenetic analysis based on the RdRP core domain revealed that DMV is closely related to CLBV as a member of the genus Citrivirus. DMV did not satisfy the molecular criteria for demarcation of an independent species within the genus Citrivirus, family Betaflexiviridae, and hence, DMV can be considered a CLBV isolate. PMID:20644968

  1. Nucleotide sequence, transcription and phylogeny of the gene encoding the superoxide dismutase of Sulfolobus acidocaldarius.

    PubMed

    Klenk, H P; Schleper, C; Schwass, V; Brudler, R

    1993-07-18

    The gene encoding the superoxide dismutase (SOD) of the thermophilic archaeon Sulfolobus acidocaldarius has been isolated and sequenced. Both the start site and the termination sites of the corresponding transcript were mapped. The deduced amino acid sequence of the protein is very similar to the sequence of manganese- or iron-containing SODs. Phylogenetic sequence analysis corroborated the monophyletic nature of the archaeal domain. PMID:8334170

  2. Complete Nucleotide Sequence of cfr-Carrying IncX4 Plasmid pSD11 from Escherichia coli

    PubMed Central

    Sun, Jian; Deng, Hui; Li, Liang; Chen, Mu-Ya; Fang, Liang-Xing; Yang, Qiu-E

    2014-01-01

    We report the complete nucleotide sequence of a plasmid carrying the multiresistance gene cfr. This plasmid was isolated from an Escherichia coli strain of swine origin in 2011. This 37,672-bp plasmid, pSD11, had an IncX4 backbone similar to those of the IncX4 plasmids obtained from the United States and Australia, in which the cfr gene was flanked by two copies of IS26 and a truncated Tn1331 was inserted. PMID:25403661

  3. DNA sequencing by a single molecule detection of labeled nucleotides sequentially cleaved from a single strand of DNA

    SciTech Connect

    Goodwin, P.M.; Schecker, J.A.; Wilkerson, C.W.; Hammond, M.L.; Ambrose, W.P.; Jett, J.H.; Martin, J.C.; Marrone, B.L.; Keller, R.A.; Haces, A.; Shih, P.J.; Harding, J.D.

    1993-01-01

    We are developing a laser-based technique for the rapid sequencing of large DNA fragments (several kb in size) at a rate of 100 to 1000 bases per second. Our approach relies on fluorescent labeling of the bases in a single fragment of DNA, attachment of this labeled DNA fragment to a support, movement of the supported DNA into a flowing sample stream, sequential cleavage of the end nucleotide from the DNA fragment with an exonuclease, and detection of the individual fluorescently labeled bases by laser-induced fluorescence.

  4. DNA sequencing by a single molecule detection of labeled nucleotides sequentially cleaved from a single strand of DNA

    SciTech Connect

    Goodwin, P.M.; Schecker, J.A.; Wilkerson, C.W.; Hammond, M.L.; Ambrose, W.P.; Jett, J.H.; Martin, J.C.; Marrone, B.L.; Keller, R.A.; Haces, A.; Shih, P.J.; Harding, J.D.

    1993-02-01

    We are developing a laser-based technique for the rapid sequencing of large DNA fragments (several kb in size) at a rate of 100 to 1000 bases per second. Our approach relies on fluorescent labeling of the bases in a single fragment of DNA, attachment of this labeled DNA fragment to a support, movement of the supported DNA into a flowing sample stream, sequential cleavage of the end nucleotide from the DNA fragment with an exonuclease, and detection of the individual fluorescently labeled bases by laser-induced fluorescence.

  5. Analysis of a nucleotide-binding site of 5-lipoxygenase by affinity labelling: binding characteristics and amino acid sequences.

    PubMed Central

    Zhang, Y Y; Hammarberg, T; Radmark, O; Samuelsson, B; Ng, C F; Funk, C D; Loscalzo, J

    2000-01-01

    5-Lipoxygenase (5LO) catalyses the first two steps in the biosynthesis of leukotrienes, which are inflammatory mediators derived from arachidonic acid. 5LO activity is stimulated by ATP; however, a consensus ATP-binding site or nucleotide-binding site has not been found in its protein sequence. In the present study, affinity and photoaffinity labelling of 5LO with 5'-p-fluorosulphonylbenzoyladenosine (FSBA) and 2-azido-ATP showed that 5LO bound to the ATP analogues quantitatively and specifically and that the incorporation of either analogue inhibited ATP stimulation of 5LO activity. The stoichiometry of the labelling was 1.4 mol of FSBA/mol of 5LO (of which ATP competed with 1 mol/mol) or 0.94 mol of 2-azido-ATP/mol of 5LO (of which ATP competed with 0.77 mol/mol). Labelling with FSBA prevented further labelling with 2-azido-ATP, indicating that the same binding site was occupied by both analogues. Other nucleotides (ADP, AMP, GTP, CTP and UTP) also competed with 2-azido-ATP labelling, suggesting that the site was a general nucleotide-binding site rather than a strict ATP-binding site. Ca(2+), which also stimulates 5LO activity, had no effect on the labelling of the nucleotide-binding site. Digestion with trypsin and peptide sequencing showed that two fragments of 5LO were labelled by 2-azido-ATP. These fragments correspond to residues 73-83 (KYWLNDDWYLK, in single-letter amino acid code) and 193-209 (FMHMFQSSWNDFADFEK) in the 5LO sequence. Trp-75 and Trp-201 in these peptides were modified by the labelling, suggesting that they were immediately adjacent to the C-2 position of the adenine ring of ATP. Given the stoichiometry of the labelling, the two peptide sequences of 5LO were probably near each other in the enzyme's tertiary structure, composing or surrounding the ATP-binding site of 5LO. PMID:11042125

  6. Nucleotide sequence neighbouring a late modified guanylic residue within the 28S ribosomal RNA of several eukaryotic cells.

    PubMed Central

    Eladari, M E; Hampe, A; Galibert, F

    1977-01-01

    The nucleotide sequence of a particular T1 oligonucleotide found in 41S and 28S RNAs of several cell lines (human, mouse, rat and chicken fibroblast) but absent in 45S ribosomal RNA has been deduced. Its primary structure, A-U-U*-G*-psi-U-C-A-C-C-C-A-C-U-A-A-U-A-Gp, shows the presence of a modified G residue which explains the existence of this oligonucleotide in the T1 fingerprints of 41S and 28S RNA. Its absence from the 45S RNA T1 fingerprint is accounted for by a late modification. PMID:561392

  7. Nucleotide sequence of the 3'-terminal region of the genome confirms that pea mosaic virus is a strain of bean yellow mosaic potyvirus.

    PubMed

    Xiao, X W; Frenkel, M J; Ward, C W; Shukla, D D

    1994-01-01

    The 1,035 nucleotides at the 3' end of the I strain of pea mosaic potyvirus (PMV-I) genomic RNA, encoding the coat protein, have been cloned and sequenced. A comparison of the derived coat protein sequence with those of the bean yellow mosaic virus (BYMV) strains, CS, S, D and GDD, indicates that PMV-I is a strain of BYMV. Sequence comparisons and hybridisation studies using the 3'-noncoding region support this classification. The nucleotide and protein sequence data also suggest that PMV-I and BYMV-CS form one subset of BYMV strains while the other three strains form another. PMID:8031241

  8. PerPlot & PerScan: tools for analysis of DNA curvature-related periodicity in genomic nucleotide sequences

    PubMed Central

    2011-01-01

    Background: Periodic spacing of short adenine or thymine runs phased with DNA helical period of ~10.5 bp is associated with intrinsic DNA curvature and deformability, which play important roles in DNA-protein interactions and in the organization of chromosomes in both eukaryotes and prokaryotes. Local differences in DNA sequence periodicity have been linked to differences in gene expression in some organisms. Despite the significance of these periodic patterns, there are virtually no publicly accessible tools for their analysis. Results: We present novel tools suitable for assessments of DNA curvature-related sequence periodicity in nucleotide sequences at the genome scale. Utility of the present software is demonstrated on a comparison of sequence periodicities in the genomes of Haemophilus influenzae, Methanocaldococcus jannaschii, Saccharomyces cerevisiae, and Arabidopsis thaliana. The software can be accessed through a web interface and the programs are also available for download. Conclusions: The present software is suitable for comparing DNA curvature-related sequence periodicity among different genomes as well as for analysis of intrachromosomal heterogeneity of the sequence periodicity. It provides a quick and convenient way to detect anomalous regions of chromosomes that could have unusual structural and functional properties and/or distinct evolutionary history. PMID:22587738
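
    One simple way to look for the kind of periodic spacing of A/T runs described above is to count how often AA or TT dinucleotides recur at a given distance. The sketch below is a generic autocorrelation-style count written for illustration; it is not the algorithm implemented in PerPlot or PerScan, and the function name and the synthetic test sequence are hypothetical. Because it uses integer lags, it can only approximate a non-integer period such as ~10.5 bp.

```python
def at_autocorrelation(seq, max_lag=40):
    """Count pairs of AA/TT dinucleotide starts separated by each lag (in bp)."""
    seq = seq.upper()
    # 1 where an AA or TT dinucleotide starts, 0 elsewhere.
    marks = [1 if seq[i:i + 2] in ("AA", "TT") else 0 for i in range(len(seq) - 1)]
    counts = {}
    for lag in range(2, max_lag + 1):
        counts[lag] = sum(
            marks[i] * marks[i + lag] for i in range(len(marks) - lag)
        )
    return counts


# Synthetic sequence with an A-run every 10 bp (not a real genome fragment).
synthetic = "AAACGCGCGC" * 20
counts = at_autocorrelation(synthetic)
strongest = max(counts, key=counts.get)
print(strongest, counts[strongest])  # 10 38
```

    On real genomic sequence the signal would be much noisier, so the counts would normally be compared against an expectation from base composition before any lag is called periodic.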

  9. The nucleotide composition of the spacer sequence influences the expression yield of heterologously expressed genes in Bacillus subtilis.

    PubMed

    Liebeton, Klaus; Lengefeld, Jette; Eck, Jürgen

    2014-12-10

    Bacillus subtilis is a commonly used host for the heterologous expression of genes in academia and industry. Many factors are known to influence the expression yield in this organism e.g. the complementarity between the Shine-Dalgarno sequence (SD) and the 16S-rRNA or secondary structures in the translation initiation region of the transcript. In this study, we analysed the impact of the nucleotide composition between the SD sequence and the start codon (the spacer sequence) on the expression yield. We demonstrated that a polyadenylate-moiety spacer sequence moderately increases the expression level of laccase CotA from B. subtilis. By screening a library of artificially generated spacer variants, we identified clones with greatly increased expression levels of two model enzymes, the laccase CotA from B. subtilis (11 fold) and the metagenome derived protease H149 (30 fold). Furthermore, we demonstrated that the effect of the spacer sequence is specific to the gene of interest. These results prove the high impact of the spacer sequence on the expression yield in B. subtilis. PMID:24997355

  10. The nucleotide sequence and genomic organization of Citrus leaf blotch virus: candidate type species for a new virus genus.

    PubMed

    Vives, M C; Galipienso, L; Navarro, L; Moreno, P; Guerri, J

    2001-08-15

    The complete nucleotide sequence of Citrus leaf blotch virus (CLBV) was determined. CLBV genomic RNA (gRNA) has 8747 nt, excluding the 3'-terminal poly(A) tail, and contains three open reading frames (ORFs) and untranslated regions (UTR) of 73 and 541 nucleotides at the 5' and 3' termini, respectively. ORF1 potentially encodes a 227.4-kDa polypeptide, which has methyltransferase, papain-like protease, helicase, and RNA-dependent RNA polymerase motifs. ORF2 encodes a 40.2-kDa polypeptide containing a motif characteristic of cell-to-cell movement proteins. The 40.7-kDa polypeptide encoded by ORF3 was identified as the coat protein. The genome organization of CLBV resembles that of viruses in the genus Trichovirus, but they differ in various aspects: (i) in trichoviruses ORF2 overlaps ORFs 1 and 3, whereas in CLBV, ORFs 2 and 3 are separated and ORFs 1 and 2 overlap in one nucleotide; (ii) CLBV gRNA and CP are larger than those of trichoviruses; and (iii) the CLBV 3' UTR is larger than that of trichoviruses. Phylogenetic comparisons based on CP amino acid signatures clearly separate CLBV from trichoviruses. Also contrasting with trichoviruses, CLBV could not be transmitted to Chenopodium quinoa Willd. Considering these singularities, we propose that CLBV should be included in a new virus genus. PMID:11504557

  11. Nucleotide sequence of the phosphoglycerate kinase gene from the extreme thermophile Thermus thermophilus. Comparison of the deduced amino acid sequence with that of the mesophilic yeast phosphoglycerate kinase.

    PubMed Central

    Bowen, D; Littlechild, J A; Fothergill, J E; Watson, H C; Hall, L

    1988-01-01

    Using oligonucleotide probes derived from amino acid sequencing information, the structural gene for phosphoglycerate kinase from the extreme thermophile, Thermus thermophilus, was cloned in Escherichia coli and its complete nucleotide sequence determined. The gene consists of an open reading frame corresponding to a protein of 390 amino acid residues (calculated Mr 41,791) with an extreme bias for G or C (93.1%) in the codon third base position. Comparison of the deduced amino acid sequence with that of the corresponding mesophilic yeast enzyme indicated a number of significant differences. These are discussed in terms of the unusual codon bias and their possible role in enhanced protein thermal stability. PMID:3052437
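
    The G or C bias at the third codon position mentioned above is commonly summarised as a GC3 fraction. A minimal sketch of that calculation follows; the function name and the short test sequence are hypothetical, and the sequence is not the Thermus thermophilus gene.

```python
def gc3_fraction(cds):
    """Fraction of codons in an in-frame coding sequence whose third base is G or C."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3     # ignore a trailing partial codon
    third_bases = cds[2:usable:3]        # every third base, starting at index 2
    if not third_bases:
        return 0.0
    return sum(base in "GC" for base in third_bases) / len(third_bases)


toy_cds = "ATGGCCAAGTAA"                 # made-up codons: ATG GCC AAG TAA
print(gc3_fraction(toy_cds))             # 0.75
```

    Applied to the full open reading frame, a value near 0.93 would correspond to the 93.1% bias reported in the abstract.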

  12. Anabolic ornithine carbamoyltransferase of Pseudomonas aeruginosa: nucleotide sequence and transcriptional control of the argF structural gene.

    PubMed Central

    Itoh, Y; Soldati, L; Stalon, V; Falmagne, P; Terawaki, Y; Leisinger, T; Haas, D

    1988-01-01

    In Pseudomonas aeruginosa PAO the anabolic ornithine carbamoyltransferase (OTCase, EC 2.1.3.3) is the product of the argF gene and the only arginine biosynthetic enzyme whose synthesis is repressible by arginine. We have determined the complete nucleotide sequence of the argF gene including its promoter-control region. The deduced amino acid sequence of the anabolic OTCase consists of 305 residues (Mr 33,924), and this was confirmed by the N-terminal amino acid sequence, the total amino acid composition, and the subunit Mr of the purified enzyme. The native anabolic OTCase (Mr 110,000 to 125,000) was found to be a trimer by cross-linking experiments. P. aeruginosa also has a catabolic OTCase (the arcB gene product), which catalyzes the reverse reaction of the anabolic conversion. At the nucleotide sequence level, the P. aeruginosa argF gene had 52.4% identity with the arcB gene. The Escherichia coli argF and argI genes, which code for anabolic OTCase isoenzymes, had 47.3 and 44.9% identity, respectively, with the P. aeruginosa argF sequence. This suggests that these four genes have evolved from a common ancestral gene. The arcB gene appears to be more closely related to the E. coli argF gene than to the P. aeruginosa argF gene. Two transcripts (mRNA-1, mRNA-2) of the P. aeruginosa argF gene were identified by S1 mapping. The transcription initiation site for mRNA-1 was preceded by sequences having partial homology with the E. coli -35 and -10 consensus promoter sequences. No sequence similar to consensus promoters of enteric bacteria was found upstream of the 5' end of mRNA-2. E. coli carrying a P. aeruginosa argF+ recombinant plasmid produced mRNA-1 with low efficiency but no (or very little) mRNA-2. Arginine repressed argF transcription in P. aeruginosa. In the argF promoter region no sequence homologous to the "arg box" (arginine operator module) of E. coli was found. The mechanism of arginine repression in P. 
aeruginosa thus appears to be different from that in

  13. Large and small subunits of the Aujeszky's disease virus ribonucleotide reductase: nucleotide sequence and putative structure.

    PubMed

    Kaliman, A V; Boldogköi, Z; Fodor, I

    1994-09-13

    We determined the entire DNA sequence of two adjacent open reading frames of Aujeszky's disease virus encoding ribonucleotide reductase genes with an intergenic sequence of 9 bp. From the sequence analysis we deduce that the ORFs encode large and small subunits, with sizes of 835 and 303 amino acids, respectively. Amino acid sequence comparison of ADV RR2 with that of equine herpesvirus type 1, bovine herpesvirus type 1, HSV-1 and varicella zoster virus revealed that 48% of amino acids represent clusters of residues conserved in all compared sequences. In the N-terminal part, ADV RR1 shows low homology to the RR1 of other herpesviruses. The rest of the RR1 protein contains highly conserved amino acid sequences divided by blocks of low homology. PMID:8086454

  14. Rapid DNA Sequencing by Direct Nanoscale Reading of Nucleotide Bases on Individual DNA Chains

    SciTech Connect

    Lee, James Weifu; Meller, Amit

    2007-01-01

    Since the independent invention of DNA sequencing by Sanger and by Gilbert 30 years ago, it has grown from a small scale technique capable of reading several kilobase-pair of sequence per day into today's multibillion dollar industry. This growth has spurred the development of new sequencing technologies that do not involve either electrophoresis or Sanger sequencing chemistries. Sequencing by Synthesis (SBS) involves multiple parallel micro-sequencing addition events occurring on a surface, where data from each round is detected by imaging. New High Throughput Technologies for DNA Sequencing and Genomics is the second volume in the Perspectives in Bioanalysis series, which looks at the electroanalytical chemistry of nucleic acids and proteins, development of electrochemical sensors and their application in biomedicine and in the new fields of genomics and proteomics. The authors have expertly formatted the information for a wide variety of readers, including new developments that will inspire students and young scientists to create new tools for science and medicine in the 21st century. Reviews of complementary developments in Sanger and SBS sequencing chemistries, capillary electrophoresis and microdevice integration, MS sequencing and applications set the framework for the book.

  15. Defining natural species of bacteria: clear-cut genomic boundaries revealed by a turning point in nucleotide sequence divergence

    PubMed Central

    2013-01-01

    Background: Bacteria are currently classified into arbitrary species, but whether they actually exist as discrete natural species was unclear. To reveal genomic features that may unambiguously group bacteria into discrete genetic clusters, we carried out systematic genomic comparisons among representative bacteria. Results: We found that bacteria of Salmonella formed tight phylogenetic clusters separated by various genetic distances: whereas over 90% of the approximately four thousand shared genes had completely identical sequences among strains of the same lineage, the percentages dropped sharply to below 50% across the lineages, demonstrating the existence of clear-cut genetic boundaries by a steep turning point in nucleotide sequence divergence. Recombination assays supported the genetic boundary hypothesis, suggesting that genetic barriers had been formed between bacteria of even very closely related lineages. We found similar situations in bacteria of Yersinia and Staphylococcus. Conclusions: Bacteria are genetically isolated into discrete clusters equivalent to natural species. PMID:23865772

  16. Fusion protein of the paramyxovirus simian virus 5: nucleotide sequence of mRNA predicts a highly hydrophobic glycoprotein.

    PubMed Central

    Paterson, R G; Harris, T J; Lamb, R A

    1984-01-01

    The nucleotide sequence of the mRNA coding for the fusion glycoprotein (F) of the paramyxovirus, simian virus 5, has been obtained. There is a single large open reading frame on the mRNA that encodes a protein of 529 amino acids with a molecular weight of 56,531. The proteolytic cleavage/activation site of F, to yield F2 and F1, contains five arginine residues. Six potential glycosylation sites were identified in the protein, two on F2 and four on F1. The deduced amino acid sequence indicates that F is extensively hydrophobic over the length of the polypeptide chain. Three regions are very hydrophobic and could interact directly with membranes: these are the NH2-terminal putative signal peptide, the COOH-terminal putative membrane anchorage domain, and the NH2-terminal region of F1. PMID:6093114

  17. A Simple Sequence Repeat- and Single-Nucleotide Polymorphism-Based Genetic Linkage Map of the Brown Planthopper, Nilaparvata lugens

    PubMed Central

    Jairin, Jirapong; Kobayashi, Tetsuya; Yamagata, Yoshiyuki; Sanada-Morimura, Sachiyo; Mori, Kazuki; Tashiro, Kosuke; Kuhara, Satoru; Kuwazaki, Seigo; Urio, Masahiro; Suetsugu, Yoshitaka; Yamamoto, Kimiko; Matsumura, Masaya; Yasui, Hideshi

    2013-01-01

    In this study, we developed the first genetic linkage map for the major rice insect pest, the brown planthopper (BPH, Nilaparvata lugens). The linkage map was constructed by integrating linkage data from two backcross populations derived from three inbred BPH strains. The consensus map consists of 474 simple sequence repeats, 43 single-nucleotide polymorphisms, and 1 sequence-tagged site, for a total of 518 markers at 472 unique positions in 17 linkage groups. The linkage groups cover 1093.9 cM, with an average distance of 2.3 cM between loci. The average number of marker loci per linkage group was 27.8. The sex-linkage group was identified by exploiting X-linked and Y-specific markers. Our linkage map and the newly developed markers used to create it constitute an essential resource and a useful framework for future genetic analyses in BPH. PMID:23204257
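
    The summary statistics in this abstract can be re-derived from the counts it quotes. The sketch below is only a consistency check, not the authors' procedure: it assumes that the average spacing is the total map length divided by the number of unique positions and that the per-group average uses unique positions as well. The abstract states neither divisor; these choices are assumptions that happen to reproduce the reported 2.3 cM and 27.8.

```python
# Figures quoted in the abstract above.
SSR_MARKERS = 474
SNP_MARKERS = 43
STS_MARKERS = 1
UNIQUE_POSITIONS = 472
LINKAGE_GROUPS = 17
MAP_LENGTH_CM = 1093.9


def mean_spacing_cm(map_length_cm: float, n_positions: int) -> float:
    """Average distance between loci, assuming length / unique positions."""
    return map_length_cm / n_positions


def mean_loci_per_group(n_positions: int, n_groups: int) -> float:
    """Average number of marker loci per linkage group (assumed divisor)."""
    return n_positions / n_groups


total_markers = SSR_MARKERS + SNP_MARKERS + STS_MARKERS
spacing = mean_spacing_cm(MAP_LENGTH_CM, UNIQUE_POSITIONS)
per_group = mean_loci_per_group(UNIQUE_POSITIONS, LINKAGE_GROUPS)

print(total_markers, round(spacing, 1), round(per_group, 1))
```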

  18. Comparison of the nucleotide and amino acid sequences of the RsrI and EcoRI restriction endonucleases.

    PubMed

    Stephenson, F H; Ballard, B T; Boyer, H W; Rosenberg, J M; Greene, P J

    1989-12-21

    The RsrI endonuclease, a type-II restriction endonuclease (ENase) found in Rhodobacter sphaeroides, is an isoschizomer of the EcoRI ENase. A clone containing an 11-kb BamHI fragment was isolated from an R. sphaeroides genomic DNA library by hybridization with synthetic oligodeoxyribonucleotide probes based on the N-terminal amino acid (aa) sequence of RsrI. Extracts of E. coli containing a subclone of the 11-kb fragment display RsrI activity. Nucleotide sequence analysis reveals an 831-bp open reading frame encoding a polypeptide of 277 aa. A 50% identity exists within a 266-aa overlap between the deduced aa sequences of RsrI and EcoRI. Regions of 75-100% aa sequence identity correspond to key structural and functional regions of EcoRI. The type-II ENases have many common properties, and a common origin might have been expected. Nevertheless, this is the first demonstration of aa sequence similarity between ENases produced by different organisms. PMID:2695392

  19. Prioritization Of Nonsynonymous Single Nucleotide Variants For Exome Sequencing Studies Via Integrative Learning On Multiple Genomic Data

    PubMed Central

    Wu, Mengmeng; Wu, Jiaxin; Chen, Ting; Jiang, Rui

    2015-01-01

    The rapid advancement of next generation sequencing technology has greatly accelerated the progress for understanding human inherited diseases via such innovations as exome sequencing. Nevertheless, the identification of causative variants from sequencing data remains a great challenge. Traditional statistical genetics approaches such as linkage analysis and association studies have limited power in analyzing exome sequencing data, while relying on simple filtration strategies and predicted functional implications of mutations to pinpoint pathogenic variants is prone to produce false positives. To overcome these limitations, we herein propose a supervised learning approach, termed snvForest, to prioritize candidate nonsynonymous single nucleotide variants for a specific type of disease by integrating 11 functional scores at the variant level and 8 association scores at the gene level. We conduct a series of large-scale in silico validation experiments, demonstrating the effectiveness of snvForest across 2,511 diseases of different inheritance styles and the superiority of our approach over two state-of-the-art methods. We further apply snvForest to three real exome sequencing data sets of epileptic encephalopathies and intellectual disability to show the ability of our approach to identify causative de novo mutations for these complex diseases. The online service and standalone software of snvForest are found at http://bioinfo.au.tsinghua.edu.cn/jianglab/snvforest. PMID:26459872

  20. Complete nucleotide sequence of little cherry virus 1 (LChV-1) infecting sweet cherry in China.

    PubMed

    Wang, Jiawei; Zhu, Dongzi; Tan, Yue; Zong, Xiaojuan; Wei, Hairong; Hammond, Rosemarie W; Liu, Qingzhong

    2016-03-01

    Little cherry virus 1 (LChV-1), associated with little cherry disease (LCD), has a significant impact on fruit quality of infected sweet cherry trees. We report the full genome sequence of an isolate of LChV-1 from Taian, China (LChV-1-TA), detected by small-RNA deep sequencing and amplified by overlapping RT-PCR. The LChV-1-TA genome was 16,932 nt in length and contained nine open reading frames (ORFs), with sequence identity at the overall genome level of 76%, 76%, and 78% to LChV-1 isolates Y10237 (UW2 isolate), EU715989 (ITMAR isolate) and JX669615 (V2356 isolate), respectively. Based on the phylogenetic analysis of HSP70h amino acid sequences of Closteroviridae family members, LChV-1-TA was grouped into a well-supported cluster with the members of the genus Velarivirus and was also closely related to other LChV-1 isolates. This is the first report of the complete nucleotide sequence of LChV-1 infecting sweet cherry in China. PMID:26733294

  1. Prioritization Of Nonsynonymous Single Nucleotide Variants For Exome Sequencing Studies Via Integrative Learning On Multiple Genomic Data.

    PubMed

    Wu, Mengmeng; Wu, Jiaxin; Chen, Ting; Jiang, Rui

    2015-01-01

    The rapid advancement of next generation sequencing technology has greatly accelerated the progress for understanding human inherited diseases via such innovations as exome sequencing. Nevertheless, the identification of causative variants from sequencing data remains a great challenge. Traditional statistical genetics approaches such as linkage analysis and association studies have limited power in analyzing exome sequencing data, while relying on simple filtration strategies and predicted functional implications of mutations to pinpoint pathogenic variants is prone to produce false positives. To overcome these limitations, we herein propose a supervised learning approach, termed snvForest, to prioritize candidate nonsynonymous single nucleotide variants for a specific type of disease by integrating 11 functional scores at the variant level and 8 association scores at the gene level. We conduct a series of large-scale in silico validation experiments, demonstrating the effectiveness of snvForest across 2,511 diseases of different inheritance styles and the superiority of our approach over two state-of-the-art methods. We further apply snvForest to three real exome sequencing data sets of epileptic encephalopathies and intellectual disability to show the ability of our approach to identify causative de novo mutations for these complex diseases. The online service and standalone software of snvForest are found at http://bioinfo.au.tsinghua.edu.cn/jianglab/snvforest. PMID:26459872

  2. Complete nucleotide sequence of pSCV50, the virulence plasmid of Salmonella enterica serovar Choleraesuis SC-B67.

    PubMed

    Yu, Hong; Wang, Jianbin; Ye, Jiehua; Tang, Petrus; Chu, Chishih; Hu, Songnian; Chiu, Cheng-Hsun

    2006-03-01

    We carried out comparative analysis on the sequences of two 50-kb virulence plasmids of Salmonella enterica serovar Choleraesuis strains SC-B67 (pSCV50) and RF-1 (pKDSC50). The two plasmids share over 99% sequence similarity. Ninety-two nucleotide variations at 42 sites were detected between the two plasmids; pSCV50 contains 24 nucleotide substitutions, 6 deletions, and 62 insertions, compared to pKDSC50. Two regions in pSCV50 appeared to be more susceptible to changes: one is the non-virulence-associated transfer region (27.5-33.0 K) and the other a function-unknown region (9.0-10.5 K). We re-annotated pSCV50 using more advanced tools and the up-to-date databases and corrected the inaccurate annotation in pKDSC50. The results indicate that virulence-related genes on the 50-kb plasmid are under negative selection, suggesting that they play important roles in the expression of virulence during the process of infection, while other genes in this plasmid tend to evolve neutrally. PMID:16257053

  3. Molecular cloning and nucleotide sequence of cDNA for human glucose-6-phosphate dehydrogenase variant A(-).

    PubMed Central

    Hirono, A; Beutler, E

    1988-01-01

    Glucose-6-phosphate dehydrogenase (G6PD; D-glucose-6-phosphate:NADP+ oxidoreductase, EC 1.1.1.49) A(-) is a common variant in Blacks that causes sensitivity to drug- and infection-induced hemolytic anemia. A cDNA library was constructed from Epstein-Barr virus-transformed lymphoblastoid cells from a male who was G6PD A(-). One of four cDNA clones isolated contained a sequence not found in the other clones nor in the published cDNA sequence. Consisting of 138 bases and coding 46 amino acids, this segment of cDNA apparently is derived from the alternative splicing involving the 3' end of intron 7. Comparison of the remaining sequences of these clones with the published sequence revealed three nucleotide substitutions: C33→G, G202→A, and A376→G. Each change produces a new restriction site. Genomic DNA from five G6PD A(-) individuals was amplified by the polymerase chain reaction. The base substitution at position 376, identical to the substitution that has been reported in G6PD A(+), was present in all G6PD A(-) samples and none of the control G6PD B(+) samples examined. The substitution at position 202 was found in four of the five G6PD A(-) samples and no normal control sample. At position 33 guanine was found in all G6PD A(-) samples and seven G6PD B(+) control samples and is, presumably, the usual nucleotide found at this position. The finding of the same mutation in G6PD A(-) as is found in G6PD A(+) strongly suggests that the G6PD A(-) mutation arose in an individual with G6PD A(+), adding another mutation that causes the in vivo instability of this enzyme protein. PMID:2836867

  4. Nucleotide sequence analysis of genes encoding a toluene/benzene-2-monooxygenase from Pseudomonas sp. strain JS150

    SciTech Connect

    Johnson, G.R.; Olsen, R.H.

    1995-09-01

    Pseudomonas sp. strain JS150 metabolizes benzene and alkyl- and chloro-substituted benzenes by using dioxygenase-initiated pathways coupled with multiple downstream metabolic pathways to accommodate catechol metabolism. By cloning genes encoding benzene-degradative enzymes, strain JS150 was also found to carry genes for a toluene/benzene-2-monooxygenase. The gene cluster encoding a 2-monooxygenase and its cognate regulator was cloned from a plasmid carried by strain JS150. Oxygen (¹⁸O₂) incorporation experiments using Pseudomonas aeruginosa strains carrying the cloned genes confirmed toluene hydroxylation was catalyzed through an authentic monooxygenase reaction to yield ortho-cresol. Regions encoding the toluene-2-monooxygenase and regulatory gene product were localized in two regions of the cloned fragment. The nucleotide sequence of the toluene/benzene-2-monooxygenase locus was determined, revealing six open reading frames that were then designated tbmA, tbmB, tbmC, tbmD, tbmE, and tbmF. The deduced amino acid sequences for these genes showed the presence of motifs similar to well-conserved functional domains of multicomponent oxygenases. This analysis allowed the tentative identification of two terminal oxygenase subunits (TbmB and TbmD) and an electron transport protein (TbmF) for the monooxygenase enzyme. All the tbm polypeptides shared significant homology with protein components from other bacterial multicomponent monooxygenases. Overall, the tbm gene products shared greater similarity with polypeptides from the phenol hydroxylases of Pseudomo-KR1 and Burkholderia (Pseudomonas) pickettii PKO1. The relationship found between the phenol hydroxylases and a toluene-2-monooxygenase, characterized in this study for the first time at the nucleotide sequence level, suggested DNA probes used for surveys of environmental populations should be carefully selected to reflect DNA sequences corresponding to the metabolic pathway of interest. 58 refs., 8 figs., 1 tab.

  5. The complete nucleotide sequence and gene organization of the mitochondrial genome of the bumblebee, Bombus ignitus (Hymenoptera: Apidae).

    PubMed

    Cha, So Young; Yoon, Hyung Joo; Lee, Eun Mee; Yoon, Myung Hee; Hwang, Jae Sam; Jin, Byung Rae; Han, Yeon Soo; Kim, Iksoo

    2007-05-01

    The complete 16,434-bp nucleotide sequence of the mitogenome of the bumble bee, Bombus ignitus (Hymenoptera: Apidae), was determined. The genome contains the base composition and codon usage typical of metazoan mitogenomes. An unusual feature of the B. ignitus mitogenome is the presence of five tRNA-like structures: two each of the tRNALeu(UUR)-like and tRNASer(AGN)-like sequences and one tRNAPhe-like sequence. These tRNA-like sequences have proper folding structures and anticodon sequences, but their functionality in their respective amino acid transfers remained uncertain. Among these sequences, the tRNALeu(UUR)-like sequence and the tRNASer(AGN)-like sequence are seemingly located within the A+T-rich region. This tRNASer(AGN)-like sequence is highly unusual in that its sequence homology is very high compared to the tRNAMet of other insects, including Apis mellifera, but it contains the anticodon ACT, which designates it as tRNASer(AGN). All PCG and rRNAs are conserved in positions observed most frequently in insect mitogenome structures, but the positions of the tRNAs are highly variable, presenting a new arrangement for an insect mitogenome. As a whole, the B. ignitus mitogenome contains the highest A+T content (86.9%) found in any of the complete insects mt sequences determined to date. All protein-coding sequences started with a typical ATN codon. Nine of the 13 PCGs have a complete termination codon (all TAA), but the remaining four genes terminate with the incomplete TA or T. All tRNAs have the typical clover-leaf structures of mt tRNAs, except for tRNASer(AGN), in which the DHU arm forms a simple loop. All anticodons of B. ignitus tRNAs are identical to those of A. mellifera. In the A+T-rich region, a highly conserved sequence block that was previously described in Orthoptera and Diptera was also present. The stem-and-loop structures that may play a role in the initiation of mtDNA replication were also found in this region. Phylogenetic analysis among

  6. The complete nucleotide sequence and genomic characterization of tropical soda apple mosaic virus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Tropical soda apple mosaic virus (TSAMV) was first identified in tropical soda apple (Solanum viarum), a noxious weed, in Florida in 2002. This report provides the first full genome sequence of TSAMV. The full genome sequence of this virus will enable research scientists to develop additional spec...

  7. Complete nucleotide sequences of two NDM-1-encoding plasmids from the same sequence type 11 Klebsiella pneumoniae strain.

    PubMed

    Studentova, V; Dobiasova, H; Hedlova, D; Dolejska, M; Papagiannitsis, C C; Hrabak, J

    2015-02-01

    The sequence type 11 Klebsiella pneumoniae strain Kpn-3002cz was confirmed to harbor two NDM-1-encoding plasmids, pB-3002cz and pS-3002cz. pB-3002cz (97,649 bp) displayed extensive sequence similarity with the blaNDM-1-carrying plasmid pKPX-1. pS-3002cz (73,581 bp) was found to consist of an IncR-related sequence (13,535 bp) and a mosaic region (60,046 bp). A 40,233-bp sequence of pS-3002cz was identical to the mosaic region of pB-3002cz, indicating the en bloc acquisition of the NDM-1-encoding region from one plasmid by the other. PMID:25421477

  8. Complete Nucleotide Sequences of Two NDM-1-Encoding Plasmids from the Same Sequence Type 11 Klebsiella pneumoniae Strain

    PubMed Central

    Studentova, V.; Dobiasova, H.; Hedlova, D.; Dolejska, M.; Hrabak, J.

    2014-01-01

    The sequence type 11 Klebsiella pneumoniae strain Kpn-3002cz was confirmed to harbor two NDM-1-encoding plasmids, pB-3002cz and pS-3002cz. pB-3002cz (97,649 bp) displayed extensive sequence similarity with the blaNDM-1-carrying plasmid pKPX-1. pS-3002cz (73,581 bp) was found to consist of an IncR-related sequence (13,535 bp) and a mosaic region (60,046 bp). A 40,233-bp sequence of pS-3002cz was identical to the mosaic region of pB-3002cz, indicating the en bloc acquisition of the NDM-1-encoding region from one plasmid by the other. PMID:25421477

  9. FeatureScan: revealing property-dependent similarity of nucleotide sequences

    PubMed Central

    Deyneko, Igor V.; Bredohl, Björn; Wesely, Daniel; Kalybaeva, Yulia M.; Kel, Alexander E.; Blöcker, Helmut; Kauer, Gerhard

    2006-01-01

    FeatureScan is a software package aiming to reveal novel types of DNA sequence similarity by comparing physico-chemical properties. Thirty-eight different parameters of DNA double strands such as charge, melting enthalpy, conformational parameters and the like are provided. As input FeatureScan requires two sequences, a pattern sequence and a target sequence; search conditions are set by selecting a specific DNA parameter and a threshold value. Search results are displayed in FASTA format and directly linked to external genome databases/browsers (ENSEMBL, NCBI, UCSC). An Internet version of FeatureScan is accessible at . As part of the HOBIT initiative () FeatureScan is also accessible as a web service at its above home page. Currently, several preloaded genomes are provided at this Internet website (Homo sapiens, Mus musculus, Rattus norvegicus and four strains of Escherichia coli) as target sequences. Standalone executables of FeatureScan are available on request. PMID:16845077

  10. Bayesian Markov models consistently outperform PWMs at predicting motifs in nucleotide sequences.

    PubMed

    Siebert, Matthias; Söding, Johannes

    2016-07-27

    Position weight matrices (PWMs) are the standard model for DNA and RNA regulatory motifs. In PWMs nucleotide probabilities are independent of nucleotides at other positions. Models that account for dependencies need many parameters and are prone to overfitting. We have developed a Bayesian approach for motif discovery using Markov models in which conditional probabilities of order k - 1 act as priors for those of order k. This Bayesian Markov model (BaMM) training automatically adapts model complexity to the amount of available data. We also derive an EM algorithm for de-novo discovery of enriched motifs. For transcription factor binding, BaMMs achieve significantly (P = 1/16) higher cross-validated partial AUC than PWMs in 97% of 446 ChIP-seq ENCODE datasets and improve performance by 36% on average. BaMMs also learn complex multipartite motifs, improving predictions of transcription start sites, polyadenylation sites, bacterial pause sites, and RNA binding sites by 26-101%. BaMMs never performed worse than PWMs. These robust improvements argue in favour of generally replacing PWMs by BaMMs. PMID:27288444
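
    The idea that lower-order conditional probabilities act as priors for higher-order ones can be illustrated with a small sketch. This is not the published BaMM implementation: it is a homogeneous (position-independent) Markov model, it omits the EM motif search, and the pseudo-count strength `alpha` and the function names are assumptions chosen for illustration. With few observations of a context the estimate stays close to the shorter-context estimate; with many it follows the observed counts.

```python
from collections import Counter

ALPHABET = "ACGT"


def count_kmers(sequences, max_order):
    """Count every k-mer of length 1 .. max_order + 1 in the sequences."""
    counts = Counter()
    for seq in sequences:
        for k in range(1, max_order + 2):
            for i in range(len(seq) - k + 1):
                counts[seq[i:i + k]] += 1
    return counts


def cond_prob(base, context, counts, alpha=2.0):
    """Estimate P(base | context), shrunk toward the shorter-context estimate.

    alpha is a hypothetical pseudo-count weight: the order-(k-1) probability
    enters as alpha pseudo-observations for the order-k estimate.
    """
    if context == "":
        total = sum(counts[b] for b in ALPHABET)
        # Order 0: shrink toward a uniform distribution over the alphabet.
        return (counts[base] + alpha / len(ALPHABET)) / (total + alpha)
    prior = cond_prob(base, context[1:], counts, alpha)
    n_context = sum(counts[context + b] for b in ALPHABET)
    return (counts[context + base] + alpha * prior) / (n_context + alpha)


# Toy training data, invented for the example.
counts = count_kmers(["ACGTACGTAC", "ACGAACGTTC"], max_order=2)
p_seen = cond_prob("G", "AC", counts)    # context observed in the data
p_unseen = cond_prob("G", "GG", counts)  # context never observed
```

    For a context that never occurs in the training data the estimate reduces exactly to the lower-order probability, which is the sense in which model complexity adapts to the amount of available data.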

  11. Nucleotide sequence of the gene encoding the nitrogenase iron protein of Thiobacillus ferrooxidans

    SciTech Connect

    Pretorius, I.M.; Rawlings, D.E.; O'Neill, E.G.; Jones, W.A.; Kirby, R.; Woods, D.R.

    1987-01-01

    The DNA sequence was determined for the cloned Thiobacillus ferrooxidans nifH and part of the nifD genes. The DNA chains were radiolabeled with [α-³²P]dCTP (3000 Ci/mmol) or [α-³⁵S]dCTP (400 Ci/mmol). A putative T. ferrooxidans nifH promoter was identified whose sequences showed perfect consensus with those of the Klebsiella pneumoniae nif promoter. Two putative consensus upstream activator sequences were also identified. The amino acid sequence was deduced from the DNA sequence. In a comparison of nifH DNA sequences from T. ferrooxidans and eight other nitrogen-fixing microbes, a Rhizobium sp. isolated from Parasponia andersonii showed the greatest homology (74%) and Clostridium pasteurianum (nifH1) showed the least homology (54%). In the comparison of the amino acid sequences of the Fe proteins, the Rhizobium sp. and Rhizobium japonicum showed the greatest homology (both 86%) and C. pasteurianum (nifH1 gene product) demonstrated the least homology (56%) to the T. ferrooxidans Fe protein.

  12. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... submissions in computer readable form. (a) The computer readable form required by § 1.821(e) shall meet the following requirements: (1) The computer readable form shall contain a single “Sequence Listing” as either...

  13. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... submissions in computer readable form. (a) The computer readable form required by § 1.821(e) shall meet the following requirements: (1) The computer readable form shall contain a single “Sequence Listing” as either...

  14. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... submissions in computer readable form. (a) The computer readable form required by § 1.821(e) shall meet the following requirements: (1) The computer readable form shall contain a single “Sequence Listing” as either...

  15. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... submissions in computer readable form. (a) The computer readable form required by § 1.821(e) shall meet the following requirements: (1) The computer readable form shall contain a single “Sequence Listing” as either...

  16. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... submissions in computer readable form. (a) The computer readable form required by § 1.821(e) shall meet the following requirements: (1) The computer readable form shall contain a single “Sequence Listing” as either...

  17. Cloning and nucleotide sequence determination of the Clostridium pasteurianum ferredoxin gene.

    PubMed Central

    Graves, M C; Mullenbach, G T; Rabinowitz, J C

    1985-01-01

    We have constructed a library of Clostridium pasteurianum DNA cloned in the plasmid pBR322. Based on the known amino acid sequence for C. pasteurianum ferredoxin, a 64-fold degenerate heptadecanucleotide pool was synthesized. This mixed probe hybridized to two clones which were shown to contain greater than 6 kilobase pairs of the same genomic DNA. Sequence analysis of a common Sau3A1 0.6-kilobase-pair fragment revealed that it contains the information for the apoferredoxin structural gene. According to the DNA sequence, the only post-translational processing of this small apoprotein is the hydrolysis of the initiator methionine. Putative transcription and translation start and stop signals are present within the sequence. PMID:3856844
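
    The "64-fold degenerate" figure refers to the number of distinct oligonucleotides in a mixed probe pool, which is the product of the number of bases allowed at each position. A minimal sketch of that count using IUPAC ambiguity codes follows. The 17-mer shown is a hypothetical example constructed only so that the arithmetic comes out to 64; it is not the probe used in the study, whose sequence the abstract does not give.

```python
# Number of bases represented by each IUPAC nucleotide code.
IUPAC_SIZE = {
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def degeneracy(oligo: str) -> int:
    """Return how many distinct sequences a degenerate oligo represents."""
    n = 1
    for symbol in oligo.upper():
        n *= IUPAC_SIZE[symbol]
    return n


# Hypothetical heptadecanucleotide (17-mer): four 2-fold positions (Y, R)
# and one 4-fold position (N) give 2 * 4 * 2 * 2 * 2 = 64 sequences.
example_probe = "TGYGGNGAYTGYCARAA"
print(len(example_probe), degeneracy(example_probe))
```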

  18. The nucleotide sequence of Beneckea harveyi 5S rRNA. [bioluminescent marine bacterium

    NASA Technical Reports Server (NTRS)

    Luehrsen, K. R.; Fox, G. E.

    1981-01-01

    The primary sequence of the 5S ribosomal RNA isolated from the free-living bioluminescent marine bacterium Beneckea harveyi is reported and discussed in regard to indications of phylogenetic relationships with the bacteria Escherichia coli and Photobacterium phosphoreum. Sequences were determined for oligonucleotide products generated by digestion with ribonuclease T1, pancreatic ribonuclease and ribonuclease T2. The presence of heterogeneity is indicated for two sites. The B. harveyi sequence can be arranged into the same four helix secondary structures as E. coli and other prokaryotic 5S rRNAs. Examination of the 5S rRNA sequences of the three bacteria indicates that B. harveyi and P. phosphoreum are specifically related and share a common ancestor which diverged from an ancestor of E. coli at a somewhat earlier time, consistent with previous studies.

  19. Targeted capture enrichment and sequencing identifies extensive nucleotide variation in the turkey MHC-B.

    PubMed

    Reed, Kent M; Mendoza, Kristelle M; Settlage, Robert E

    2016-03-01

    Variation in the major histocompatibility complex (MHC) is increasingly associated with disease susceptibility and resistance in avian species of agricultural importance. This variation includes sequence polymorphisms but also structural differences (gene rearrangement) and copy number variation (CNV). The MHC has now been described for multiple galliform species including the best defined assemblies of the chicken (Gallus gallus) and domestic turkey (Meleagris gallopavo). Using this sequence resource, this study applied high-throughput sequencing to investigate MHC variation in turkeys of North America (NA turkeys). An MHC-specific SureSelect (Agilent) capture array was developed, and libraries were created for 14 turkeys representing domestic (commercial bred), heritage breed, and wild turkeys. In addition, a representative of the Ocellated turkey (M. ocellata) and chicken (G. gallus) was included to test cross-species applicability of the capture array allowing for identification of new species-specific polymorphisms. Libraries were hybridized to ∼12 K cRNA baits and the resulting pools were sequenced. On average, 98% of processed reads mapped to the turkey whole genome sequence and 53% to the MHC target. In addition to the MHC, capture hybridization recovered sequences corresponding to other MHC regions. Sequence alignment and de novo assembly indicated the presence of several additional BG genes in the turkey with evidence for CNV. Variant detection identified an average of 2245 polymorphisms per individual for the NA turkeys, 3012 for the Ocellated turkey, and 462 variants in the chicken (RJF-256). This study provides an extensive sequence resource for examining MHC variation and its relation to health of this agriculturally important group of birds. PMID:26729471

  20. Sequences, Annotation and Single Nucleotide Polymorphism of the Major Histocompatibility Complex in the Domestic Cat

    PubMed Central

    Yuhki, Naoya; Mullikin, James C.; Beck, Thomas; Stephens, Robert; O'Brien, Stephen J.

    2008-01-01

    Two sequences of major histocompatibility complex (MHC) regions in the domestic cat, 2.976 and 0.362 Mbp, which were separated by an ancient chromosome break (55–80 MYA) followed by a chromosomal inversion, were annotated in detail. Gene annotation of this MHC was completed and identified 183 possible coding regions (147 human homologues, possible functional genes, and 36 pseudo/unidentified genes) using the GENSCAN, BLASTN, BLASTP and RepeatMasker programs. The first region spans 2.976 Mbp of sequence, which encodes six classical class II antigens (three DRA and three DRB antigens) lacking the functional DP, DQ regions, nine antigen processing molecules (DOA/DOB, DMA/DMB, TAPASIN, and LMP2/LMP7, TAP1/TAP2), 52 class III genes, and nineteen class I genes/gene fragments (FLAI-A to FLAI-S). Three class I genes (FLAI-H, I-K, I-E) may encode functional classical class I antigens based on deduced amino acid sequence and promoter structure. The second region spans 0.362 Mbp of sequence encoding no class I genes and 18 cross-species conserved genes, excluding class I, II and their functionally related/associated genes, namely framework genes, including three olfactory receptor genes. One previously identified feline endogenous retrovirus, a baboon retrovirus derived sequence (ECE1), and two new endogenous retrovirus sequences similar to brown bat endogenous retrovirus (FERVmlu1, FERVmlu2) were found within a 140 Kbp interval in the middle of the class I region. MHC SNPs were examined based on comparisons of this BAC sequence and MHC homozygous 1.9× WGS sequences, which revealed 11,654 SNPs in 2.84 Mbp (0.00411 SNP per bp), a rate 2.4 times higher than that of the average heterozygous region in the WGS (0.0017 SNP per bp genome) and slightly higher than the SNP rate observed in the human MHC (0.00337 SNP per bp). PMID:18629345
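
    The SNP densities quoted in this abstract follow from simple division, and a short check makes the comparison explicit. The sketch below uses only numbers stated in the abstract. Because the 2.84 Mbp span is itself rounded, the computed density agrees with the reported 0.00411 SNP per bp only to about two significant figures.

```python
# Figures quoted in the abstract above.
MHC_SNPS = 11_654
MHC_SPAN_BP = 2.84e6          # rounded span given in the abstract
WGS_RATE = 0.0017             # average heterozygous region, SNP per bp
HUMAN_MHC_RATE = 0.00337      # SNP per bp


def snp_density(n_snps: int, span_bp: float) -> float:
    """SNPs per base pair over a sequenced span."""
    return n_snps / span_bp


cat_mhc_rate = snp_density(MHC_SNPS, MHC_SPAN_BP)
fold_over_wgs = cat_mhc_rate / WGS_RATE
fold_over_human = cat_mhc_rate / HUMAN_MHC_RATE

print(f"{cat_mhc_rate:.5f} SNP/bp, {fold_over_wgs:.1f}x WGS, "
      f"{fold_over_human:.2f}x human MHC")
```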

  1. Four novel cystic fibrosis mutations in splice junction sequences affecting the CFTR nucleotide binding folds

    SciTech Connect

    Doerk, T.; Wulbrand, U.; Tuemmler, B.

    1993-03-01

    Single cases of the four novel splice site mutations 1525−1 G→A (intron 9), 3601−2 A→G (intron 18), 3850−3 T→G (intron 19), and 4374+1 G→T (intron 23) were detected in the CFTR gene of cystic fibrosis patients of Indo-Iranian, Turkish, Polish, and German descent. The nucleotide substitutions at the +1, −1, and −2 positions all destroy splice sites and lead to severe disease alleles associated with features typical of gastrointestinal and pulmonary cystic fibrosis disease. The 3850−3 T-to-G change was discovered in a very mildly affected 33-year-old ΔF508 compound heterozygote, suggesting that the T-to-G transversion at the less conserved −3 position of the acceptor splice site may retain some wildtype function. 13 refs., 1 fig., 2 tabs.

  2. ChEMBL web services: streamlining access to drug discovery data and utilities

    PubMed Central

    Davies, Mark; Nowotka, Michał; Papadatos, George; Dedman, Nathan; Gaulton, Anna; Atkinson, Francis; Bellis, Louisa; Overington, John P.

    2015-01-01

    ChEMBL is now a well-established resource in the fields of drug discovery and medicinal chemistry research. The ChEMBL database curates and stores standardized bioactivity, molecule, target and drug data extracted from multiple sources, including the primary medicinal chemistry literature. Programmatic access to ChEMBL data has been improved by a recent update to the ChEMBL web services (version 2.0.x, https://www.ebi.ac.uk/chembl/api/data/docs), which exposes significantly more data from the underlying database and introduces new functionality. To complement the data-focused services, a utility service (version 1.0.x, https://www.ebi.ac.uk/chembl/api/utils/docs), which provides RESTful access to commonly used cheminformatics methods, has also been concurrently developed. The ChEMBL web services can be used together or independently to build applications and data processing workflows relevant to drug discovery and chemical biology. PMID:25883136
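
As a rough illustration of the RESTful access described above, the sketch below only builds request URLs and performs no network call. The base path is taken from the documentation URL in the abstract. The resource names (`molecule`, `activity`), the `.json` suffix, the `limit` parameter and the identifier `CHEMBL25` are assumptions that should be checked against the documentation page before use. The helper `chembl_url` is hypothetical and not part of any ChEMBL client library.

```python
from typing import Optional
from urllib.parse import urlencode

# Base path derived from the documentation URL quoted in the abstract.
BASE_URL = "https://www.ebi.ac.uk/chembl/api/data"


def chembl_url(resource: str, chembl_id: Optional[str] = None, **filters: object) -> str:
    """Build a JSON request URL for a resource, optionally for a single record."""
    path = f"{BASE_URL}/{resource}"
    if chembl_id is not None:
        path += f"/{chembl_id}"
    url = path + ".json"  # assumed format suffix
    if filters:
        url += "?" + urlencode(filters)  # assumed query-string filters
    return url


print(chembl_url("molecule", "CHEMBL25"))
print(chembl_url("activity", limit=5))
```

The resulting string can be passed to any HTTP client. Keeping URL construction separate from fetching makes the request logic easy to test offline.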

  3. Analysis of the nucleotide sequence of the guinea pig cytomegalovirus (GPCMV) genome

    PubMed Central

    Schleiss, Mark R; McGregor, Alistair; Choi, K Yeon; Date, Shailesh V; Cui, Xiaohong; McVoy, Michael A

    2008-01-01

    In this report we describe the genomic sequence of guinea pig cytomegalovirus (GPCMV) assembled from a tissue culture-derived bacterial artificial chromosome clone, plasmid clones of viral restriction fragments, and direct PCR sequencing of viral DNA. The GPCMV genome is 232,678 bp, excluding the terminal repeats, and has a GC content of 55%. A total of 105 open reading frames (ORFs) of > 100 amino acids with sequence and/or positional homology to other CMV ORFs were annotated. Positional and sequence homologs of human cytomegalovirus open reading frames UL23 through UL122 were identified. Homology with other cytomegaloviruses was most prominent in the central ~60% of the genome, with divergence of sequence and lack of conserved homologs at the respective genomic termini. Of interest, the GPCMV genome was found in many cases to bear stronger phylogenetic similarity to primate CMVs than to rodent CMVs. The sequence of GPCMV should facilitate vaccine and pathogenesis studies in this model of congenital CMV infection. PMID:19014498
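
The GC content reported above is the fraction of G and C bases in the sequence. A minimal sketch of that calculation follows. The function name `gc_content` and the toy sequences are hypothetical, and skipping ambiguous bases such as N is an assumption about how they should be handled, not a description of the authors' method.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C among the unambiguous bases of seq."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(base in "GC" for base in counted)
    return gc / len(counted)


# Toy sequence, not GPCMV data.
print(f"GC content: {gc_content('GGCCAATT'):.0%}")
```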

  4. Nucleotide sequence and characterization of the gene for secreted alkaline phosphatase from Lysobacter enzymogenes.

    PubMed Central

    Au, S; Roy, K L; von Tigerstrom, R G

    1991-01-01

    Lysobacter enzymogenes produces an alkaline phosphatase which is secreted into the medium. The gene for the enzyme (phoA) was isolated from a recombinant lambda library. It was identified within a 4.4-kb EcoRI-BamHI fragment, and its sequence was determined by the chain termination method. The structural gene consists of an open reading frame which encodes a 539-amino-acid protein with a 29-residue signal sequence, followed by a 119-residue propeptide, the 281-residue mature phosphatase, and a 110-residue carboxy-terminal domain. The roles of the propeptide and the carboxy-terminal peptide remain to be determined. A molecular weight of 30,000 was determined for the mature enzyme from sodium dodecyl sulfate-polyacrylamide gel electrophoresis. The amino acid sequence was compared with sequences available in the current protein data base, and a region of the sequence was found to show considerable homology with sequences in mammalian type 5 iron-containing purple acid phosphatases. Images PMID:1856159

  5. Nucleotide sequence of the 3' region of an infectious human T-cell leukemia virus type II genome.

    PubMed Central

    Shimotohno, K; Wachsman, W; Takahashi, Y; Golde, D W; Miwa, M; Sugimura, T; Chen, I S

    1984-01-01

    The nucleic acid sequence of the 3' region of human T-cell leukemia virus type II (HTLV-II) proviral DNA was determined using an HTLV-II proviral clone that could be recovered as infectious, transforming virus. The sequence data indicate a region of unknown function of approximately 1.6 kilobase pairs in the 3' region, analogous to the X region previously identified in human T-cell leukemia virus type I (HTLV-I). Three overlapping open reading frames are present in the X region of HTLV-II. One of these open reading frames, Xc, is most likely to encode a protein product, because it has greater predicted amino acid sequence homology (78%) with the X-IV region of HTLV-I and a greater percentage of its base differences with X-IV at the third nucleotide position of codons than do the other open reading frames. Sequences of the X region that include the open reading frames are conserved in two deletion mutants of HTLV-II, which are associated with a subline of Mo cells with a decreased dependence on fetal bovine serum. Images PMID:6093110

  6. Nucleotide sequence of the bovine parainfluenza 3 virus genome: the genes of the F and HN glycoproteins.

    PubMed Central

    Suzu, S; Sakai, Y; Shioda, T; Shibuta, H

    1987-01-01

    By analysing complementary DNA clones constructed from genomic RNA of bovine parainfluenza 3 virus (BPIV3), we determined the nucleotide sequence of the region containing the entire F and HN genes. Their deduced amino acid sequences showed about 80% homologies with those of human parainfluenza 3 virus (HPIV3), about 45% with those of Sendai virus, and about 20% with those of SV5 and Newcastle disease virus (NDV), indicating, together with the results described in the preceding paper on the NP, P, C and M proteins of BPIV3, that BPIV3, HPIV3 and Sendai virus constitute a paramyxovirus subgroup, and that BPIV3 and HPIV3 are very closely related. The F and HN proteins of all these viruses, including SV5 and NDV, however, were shown to have protein-specific structures as well as short but well-conserved amino acid sequences, suggesting that these structures and sequences are related to the activities of these glycoproteins. Images PMID:3031615

  7. Nucleotide sequences of the HLA-DRw12 and DRw8 B1 chains from an Australian aborigine.

    PubMed

    O'Brien, R M; Cram, D S; Russ, G R; Starr, R; Tait, B D

    1992-06-01

    To gain a more detailed understanding of the molecular structure of the HLA genes in Australian aborigines, the polymorphic first-domain sequences of the DR B alleles were determined in an aborigine who was tissue typed as HLA-DRw8 and a probable DRw12; DRw52; DQw1,7. Both peripheral blood leukocytes and a lymphoblastoid cell line were reactive with the majority of DRw12-specific sera, but also with half of the DRw11-specific sera. With the use of primers specific for the conserved regions flanking the first domain, the polymerase chain reaction technique was used to amplify first-strand synthesis products prepared from the cell line. Two distinct DRB1 sequences were obtained. One was virtually identical to the reported DRw8,Dw8.3 sequence present in an Asian haplotype, differing only by a single silent nucleotide substitution at the third position of codon 36 (A to G). A second DRB allele was closely related to two recently published and nearly identical sequences for DRw12, with amino acid differences at positions 67 and 85 of the first domain. DRB RFLP studies on this cell line using the Taq I restriction enzyme indicated bands previously described for the DRw8 and DRw12 haplotypes. PMID:1358866

  8. Molecular cloning and nucleotide sequence of cDNA for human glucose-6-phosphate dehydrogenase variant A(-)

    SciTech Connect

    Hirono, A.; Beutler, E.

    1988-06-01

    Glucose-6-phosphate dehydrogenase A(-) is a common variant in Blacks that causes sensitivity to drug- and infection-induced hemolytic anemia. A cDNA library was constructed from Epstein-Barr virus-transformed lymphoblastoid cells from a male who was G6PD A(-). One of four cDNA clones isolated contained a sequence not found in the other clones nor in the published cDNA sequence. Consisting of 138 bases and coding for 46 amino acids, this segment of cDNA apparently is derived from alternative splicing involving the 3' end of intron 7. Comparison of the remaining sequences of these clones with the published sequence revealed three nucleotide substitutions: C³³→G, G²⁰²→A, and A³⁷⁶→G. Each change produces a new restriction site. Genomic DNA from five G6PD A(-) individuals was amplified by the polymerase chain reaction. The finding of the same mutation in G6PD A(-) as is found in G6PD A(+) strongly suggests that the G6PD A(-) mutation arose in an individual with G6PD A(+), adding another mutation that causes the in vivo instability of this enzyme protein.

  9. Novel technologies applied to the nucleotide sequencing and comparative sequence analysis of the genomes of infectious agents in veterinary medicine.

    PubMed

    Granberg, F; Bálint, Á; Belák, S

    2016-04-01

    Next-generation sequencing (NGS), also referred to as deep, high-throughput or massively parallel sequencing, is a powerful new tool that can be used for the complex diagnosis and intensive monitoring of infectious disease in veterinary medicine. NGS technologies are also being increasingly used to study the aetiology, genomics, evolution and epidemiology of infectious disease, as well as host-pathogen interactions and other aspects of infection biology. This review briefly summarises recent progress and achievements in this field by first introducing a range of novel techniques and then presenting examples of NGS applications in veterinary infection biology. Various work steps and processes for sampling and sample preparation, sequence analysis and comparative genomics, and improving the accuracy of genomic prediction are discussed, as are bioinformatics requirements. Examples of sequencing-based applications and comparative genomics in veterinary medicine are then provided. This review is based on novel references selected from the literature and on experiences of the World Organisation for Animal Health (OIE) Collaborating Centre for the Biotechnology-based Diagnosis of Infectious Diseases in Veterinary Medicine, Uppsala, Sweden. PMID:27217166

  10. Functional analysis and nucleotide sequence of the promoter region of the murine hck gene.

    PubMed Central

    Lock, P; Stanley, E; Holtzman, D A; Dunn, A R

    1990-01-01

    The structure and function of the promoter region and exon 1 of the murine hck gene have been characterized in detail. RNase protection analysis has established that hck transcripts initiate from heterogeneous start sites located within the hck gene. Fusion gene constructs containing hck 5'-flanking sequences and the bacterial Neor gene have been introduced into the hematopoietic cell lines FDC-P1 and WEHI-265 by using a self-inactivating retroviral vector. The transcriptional start sites of the fusion gene are essentially identical to those of the endogenous hck gene. Analysis of infected WEHI-265 cell lines treated with bacterial lipopolysaccharide (LPS) reveals a 3- to 5-fold elevation in the levels of endogenous hck mRNA and a 1.4- to 2.6-fold increase in the level of Neor fusion gene transcripts, indicating that hck 5'-flanking sequences are capable of conferring LPS responsiveness on the Neor gene. The 5'-flanking region of the hck gene contains sequences similar to an element which is thought to be involved in the LPS responsiveness of the class II major histocompatibility gene A alpha k. A subset of these sequences are also found in the 5'-flanking regions of other LPS-responsive genes. Moreover, this motif is related to the consensus binding sequence of NF-kappa B, a transcription factor which is known to be regulated by LPS. Images PMID:2388619

  11. Mining and comparison of haplotype-based expressed sequence tag single nucleotide polymorphisms among citrus cultivars

    Technology Transfer Automated Retrieval System (TEKTRAN)

    In this paper, haplotype-based SNPs were mined out of publicly available citrus expressed sequence tags (ESTs) from different citrus cultivars (genotypes) individually and collectively for comparison. There were a total of 567,297 ESTs belonging to 27 cultivars in varying numbers and consequentially...

  12. Complete Nucleotide Sequence of an Isolate of Coleus vein necrosis virus from Verbena

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A plant of 'Taylor Town Red' verbena exhibiting mottling, necrosis and low vigor was tested for the presence of viruses by extracting double-stranded RNA which is indicative of infection with an RNA virus. The dsRNA was cloned and sequenced and a novel carlavirus identified. The new virus was dete...

  13. Phylogenetic analysis of Rutaceous plants based on single nucleotide polymorphism in chloroplast and nuclear gene sequences

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The family Rutaceae encompasses several genera including the economically important genus Citrus. In this study, we selected 22 citrus relatives belonging to the various sub groups of Rutaceae and compared the sequences of three gene fragments. The accessions selected belong to the subfamily Rutoide...

  14. Nucleotide sequence discrepancies within the GA strain of Marek's disease virus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Comparative genomics between 9 gallid herpesvirus type 2 strains have singled out the virulent (v) prototype strain GA as phylogenetically distant from other v pathotypes. Multiple amino acid alignments of otherwise highly conserved unique long (UL) genes have indicated sequence discrepancies within...

  15. Synthetic promoter elements obtained by nucleotide sequence variation and selection for activity

    PubMed Central

    Edelman, Gerald M.; Meech, Robyn; Owens, Geoffrey C.; Jones, Frederick S.

    2000-01-01

    Eukaryotic transcriptional regulation in different cells involves large numbers and arrangements of cis and trans elements. To survey the number of cis regulatory elements that are active in different contexts, we have devised a high-throughput selection procedure permitting synthesis of active cis motifs that enhance the activity of a minimal promoter. This synthetic promoter construction method (SPCM) was used to identify >100 DNA sequences that showed increased promoter activity in the neuroblastoma cell line Neuro2A. After determining DNA sequences of selected synthetic promoters, database searches for known elements revealed a predominance of eight motifs: AP2, CEBP, GRE, Ebox, ETS, CREB, AP1, and SP1/MAZ. The most active of the selected synthetic promoters contain composites of a number of these motifs. Assays of DNA binding and promoter activity of three exemplary motifs (ETS, CREB, and SP1/MAZ) were used to prove the effectiveness of SPCM in uncovering active sequences. Up to 10% of 133 selected active sequences had no match in currently available databases, raising the possibility that new motifs and transcriptional regulatory proteins to which they bind may be revealed by SPCM. The method may find uses in constructing databases of active cis motifs, in diagnostics, and in gene therapy. PMID:10725347

  16. Molecular cloning and nucleotide sequence of cDNA for human liver arginase

    SciTech Connect

    Haraguchi, Y.; Takiguchi, M.; Amaya, Y.; Kawamoto, S.; Matsuda, I.; Mori, M.

    1987-01-01

    Arginase (EC 3.5.3.1) catalyzes the last step of the urea cycle in the liver of ureotelic animals. Inherited deficiency of the enzyme results in argininemia, an autosomal recessive disorder characterized by hyperammonemia. To facilitate investigation of the enzyme and gene structures and to elucidate the nature of the mutation in argininemia, the authors isolated cDNA clones for human liver arginase. Oligo(dT)-primed and random primer human liver cDNA libraries in lambda gt11 were screened using isolated rat arginase cDNA as a probe. Two of the positive clones, designated lambda hARG6 and lambda hARG109, contained an overlapping cDNA sequence with an open reading frame encoding a polypeptide of 322 amino acid residues (predicted Mr, 34,732), a 5'-untranslated sequence of 56 base pairs, a 3'-untranslated sequence of 423 base pairs, and a poly(A) segment. Arginase activity was detected in Escherichia coli cells transformed with the plasmid carrying the lambda hARG6 cDNA insert. RNA gel blot analysis of human liver RNA showed a single mRNA of 1.6 kilobases. The predicted amino acid sequence of human liver arginase is 87% and 41% identical with those of the rat liver and yeast enzymes, respectively. There are several highly conserved segments among the human, rat, and yeast enzymes.

  17. Genomic structure and complete nucleotide sequence of the Batten disease gene, CLN3

    SciTech Connect

    Mitchison, H.M.; Munroe, P.B.; O'Rawe, A.M.

    1997-03-01

    We recently cloned a cDNA for CLN3, the gene for juvenile-onset neuronal ceroid lipofuscinosis or Batten disease. To resolve the genomic organization we used a cosmid clone containing CLN3 to sequence the entire gene in addition to 1.1 kb 5' of the start of the published CLN3 cDNA and 0.3 kb 3' to the polyadenylation site. CLN3 is organized into at least 15 exons spanning 15 kb and ranging from 47 to 356 bp. The 14 introns vary from 80 to 4227 bp, and all exon/intron junction sequences conform to the GT-AG rule. Numerous repetitive Alu elements are present within the introns and 5'- and 3'-untranslated regions. The 5' region of the CLN3 gene contains several potential transcription regulatory elements but no consensus TATA-1 box was identified. CLN3 is homologous to 27 deposited human ESTs, and sequence comparisons suggest alternative splicing of the gene and the existence of transcribed sequences upstream to the start of the published CLN3 cDNA. 19 refs., 2 figs., 1 tab.
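
The GT-AG rule mentioned above states that an intron begins with the dinucleotide GT (donor site) and ends with AG (acceptor site) when read on the coding strand. A minimal sketch of such a check is given below. The function name `follows_gt_ag` and the toy intron sequences are hypothetical and are not taken from the CLN3 sequence.

```python
def follows_gt_ag(intron: str) -> bool:
    """Return True if the intron starts with GT and ends with AG."""
    intron = intron.upper()
    # An intron needs at least the two dinucleotides to be checked.
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


# Toy introns on the coding strand, for illustration only.
toy_introns = {"intron_a": "GTAAGTTTTCAG", "intron_b": "GCAAGTTTTCAG"}
for name, seq in toy_introns.items():
    print(name, "conforms" if follows_gt_ag(seq) else "does not conform")
```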

  18. Cloning and nucleotide sequence of the leucyl-tRNA synthetase gene of Bacillus subtilis.

    PubMed Central

    Vander Horn, P B; Zahler, S A

    1992-01-01

    The leucyl-tRNA synthetase gene (leuS) of Bacillus subtilis was cloned and sequenced. A mutation in the gene, leuS1, increases the transcription and expression of the ilv-leu operon, permitting monitoring of leuS alleles. The leuS1 mutation was mapped to 270 degrees on the chromosome. Sequence analysis showed that the mutation is a single-base substitution, possibly in a monocistronic operon. The leader mRNA predicted by the sequence would contain a number of possible secondary structures and a T box, a sequence observed upstream of leader mRNA terminators of Bacillus tRNA synthetases and the B. subtilis ilv-leu operon. The DNA of the B. subtilis leuS open reading frame is 48% identical to the leuS gene of Escherichia coli and is predicted to encode a polypeptide with 46% identity to the leucyl-tRNA synthetase of E. coli. PMID:1317842
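
Percent identity figures such as those quoted above are computed from a pairwise alignment. The sketch below assumes the two sequences are already aligned to equal length, with `-` marking gaps, and counts matches over columns where neither sequence has a gap. Other denominators are in common use, so this definition is an assumption and not necessarily the one used by the authors. The function name and toy alignment are hypothetical.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent of identical residues over gap-free columns of an alignment."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences carry a residue.
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != gap and y != gap]
    if not pairs:
        raise ValueError("alignment has no gap-free columns")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


# Toy alignment, not leuS data.
print(f"{percent_identity('ATGC-A', 'ATGACA'):.1f}% identical")
```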

  19. Host-specific segregation of ribosomal nucleotide sequence diversity in the microsporidian Enterocytozoon bieneusi.

    PubMed

    Widmer, Giovanni; Akiyoshi, Donna E

    2010-01-01

    Enterocytozoon bieneusi is a unicellular enteric fungal pathogen and the most common cause of human microsporidiosis. The frequent detection of this organism in animals, including companion animals, livestock and wildlife, has raised the question of the importance of animal reservoirs in the epidemiology of this pathogen. A partial sequence of the ribosomal internal transcribed spacer (ITS) has been widely used as a genetic marker for studying the molecular epidemiology of E. bieneusi. With the aim of comparing E. bieneusi ITS genotypes originating from different host species, and assessing the potential for zoonotic transmission, E. bieneusi ITS sequences retrieved from GenBank were analyzed using two metrics of diversity, rarefaction and phylogenetic distance. In spite of the human ITS sample being geographically more diverse, ITS sequence diversity in animals exceeded that of humans. In both host groups much of the ITS diversity remains to be sampled. Using quantitative phylogenetic tests we found evidence for a partial but significant segregation of E. bieneusi ITS sequences according to host species. Host-specific segregation was confirmed by hierarchical analysis of molecular variation. To improve our understanding of the epidemiology of human microsporidiosis and strengthen the study of E. bieneusi populations, efforts to genotype additional E. bieneusi isolates from wildlife and companion animals should be prioritized and the geographic and species diversity of animal samples should be increased. Due to the possibility of genetic recombination in this species, additional unlinked genetic markers need to be developed and included in future studies. PMID:19931647

  20. Nucleotide sequence of the thermostable direct hemolysin gene of Vibrio parahaemolyticus.

    PubMed Central

    Nishibuchi, M; Kaper, J B

    1985-01-01

    The gene encoding the thermostable direct hemolysin of Vibrio parahaemolyticus was characterized. This gene (designated tdh) was subcloned into pBR322 in Escherichia coli, and the functional tdh gene was localized to a 1.3-kilobase HindIII fragment. This fragment was sequenced, and the structural gene was found to encode a mature protein of 165 amino acid residues. The mature protein sequence was preceded by a putative signal peptide sequence of 24 amino acids. A putative tdh promoter, determined by its similarity to consensus sequences, was not functional in E. coli. However, a promoter that was functional in E. coli was shown to exist further upstream by use of a promoter probe plasmid. A 5.7-kilobase SalI fragment containing the structural gene and both potential promoters was cloned into a broad-host-range plasmid and mobilized into a Kanagawa phenomenon-negative V. parahaemolyticus strain. In contrast to E. coli, where the hemolysin was detected only in cell lysates, introduction of the cloned gene into V. parahaemolyticus resulted in the production of extracellular hemolysin. Images PMID:3988703

  1. Phylogenetic discovery bias in Bacillus anthracis using single-nucleotide polymorphisms from whole-genome sequencing

    PubMed Central

    Pearson, Talima; Busch, Joseph D.; Ravel, Jacques; Read, Timothy D.; Rhoton, Shane D.; U'Ren, Jana M.; Simonson, Tatum S.; Kachur, Sergey M.; Leadem, Rebecca R.; Cardon, Michelle L.; Van Ert, Matthew N.; Huynh, Lynn Y.; Fraser, Claire M.; Keim, Paul

    2004-01-01

    Phylogenetic reconstruction using molecular data is often subject to homoplasy, leading to inaccurate conclusions about phylogenetic relationships among operational taxonomic units. Compared with other molecular markers, single-nucleotide polymorphisms (SNPs) exhibit extremely low mutation rates, making them rare in recently emerged pathogens, but they are less prone to homoplasy and thus extremely valuable for phylogenetic analyses. Despite their phylogenetic potential, ascertainment bias occurs when SNP characters are discovered through biased taxonomic sampling; by using whole-genome comparisons of five diverse strains of Bacillus anthracis to facilitate SNP discovery, we show that only polymorphisms lying along the evolutionary pathway between reference strains will be observed. We illustrate this in theoretical and simulated data sets in which complex phylogenetic topologies are reduced to linear evolutionary models. Using a set of 990 SNP markers, we also show how divergent branches in our topologies collapse to single points but provide accurate information on internodal distances and points of origin for ancestral clades. These data allowed us to determine the ancestral root of B. anthracis, showing that it lies closer to a newly described “C” branch than to either of two previously described “A” or “B” branches. In addition, subclade rooting of the C branch revealed unequal evolutionary rates that seem to be correlated with ecological parameters and strain attributes. Our use of nonhomoplastic whole-genome SNP characters allows branch points and clade membership to be estimated with great precision, providing greater insight into epidemiological, ecological, and forensic questions. PMID:15347815
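
The discovery bias described above can be illustrated with a toy alignment: positions are reported as SNPs only if they vary among the strains used for discovery, so variation private to other strains is missed. This is a simplified sketch under that assumption, not the authors' pipeline. The strain names, sequences and the function `discover_snps` are hypothetical.

```python
from typing import Dict, List


def discover_snps(alignment: Dict[str, str], panel: List[str]) -> List[int]:
    """Return 0-based positions that vary among the strains in the panel."""
    lengths = {len(alignment[name]) for name in panel}
    if len(lengths) != 1:
        raise ValueError("panel sequences must be aligned to equal length")
    length = lengths.pop()
    return [
        pos for pos in range(length)
        if len({alignment[name][pos] for name in panel}) > 1
    ]


# Toy aligned genomes: "field" carries a variant the reference strains lack.
toy = {"ref1": "AAAAAA", "ref2": "AATAAA", "field": "ACAAAA"}
panel_snps = discover_snps(toy, ["ref1", "ref2"])  # discovery from references only
all_snps = discover_snps(toy, list(toy))           # discovery from every strain
print("found with reference panel:", panel_snps)
print("found with all strains:", all_snps)
```

In this toy case the position that separates the field strain from both references is absent from the panel-derived marker set, so the field strain would appear identical to `ref1` when typed with those markers.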

  2. Genetic identification of horse mackerel and related species in seafood products by means of forensically informative nucleotide sequencing methodology.

    PubMed

    Lago, Fátima C; Herrero, Beatriz; Vieites, Juan M; Espiñeira, Montserrat

    2011-03-23

    In the present study, a methodology based on the amplification of a fragment of mitochondrial cytochrome b and subsequent phylogenetic analysis (FINS: forensically informative nucleotide sequencing) to genetically identify horse mackerels has been developed. This methodology makes possible the identification of more than 20 species belonging to the families Carangidae, Mullidae, and Scombridae. The main novelty of this work lies in the largest number of different horse mackerel species included and in the applicability of the developed methods to all kinds of processed products that can be found by consumers in markets around the world, including those that have undergone intensive processes of transformation, as for instance canned foods. Finally, the methods were applied to 15 commercial samples, all of them canned products. Therefore, these methods are useful for checking the fulfillment of labeling regulations for horse mackerels and horse mackerel products, verifying the correct traceability in commercial trade, and fisheries control. PMID:21332203

  3. Cloning and nucleotide sequencing of genes for three small, acid-soluble proteins from Bacillus subtilis spores.

    PubMed Central

    Connors, M J; Mason, J M; Setlow, P

    1986-01-01

    Three Bacillus subtilis genes (termed sspA, sspB, and sspD) which code for small, acid-soluble spore proteins (SASPs) have been cloned, and their complete nucleotide sequence has been determined. The amino acid sequences of the SASPs coded for by these genes are similar to each other and to those of the SASP-1 of B. subtilis (coded for by the sspC gene) and the SASP-A/C family of B. megaterium. The sspA and sspB genes are expressed only in sporulation, in parallel with each other and with the sspC gene. Two regions upstream of the postulated transcription start sites for the sspA and B genes have significant homology with the analogous regions of the sspC gene and the SASP-A/C gene family. Purification of two of the three major B. subtilis SASPs (alpha and beta) and determination of their amino-terminal sequences indicated that the sspA gene codes for SASP-alpha and that the sspB gene codes for SASP-beta. This was confirmed by the introduction of deletion mutations into the cloned sspA and sspB genes and transfer of these deletions into the B. subtilis chromosome with concomitant loss of the wild-type gene. Images PMID:3009398

  4. Conserved nucleotide sequences in the open reading frame and 3' untranslated region of selenoprotein P mRNA.

    PubMed Central

    Hill, K E; Lloyd, R S; Burk, R F

    1993-01-01

    Rat liver selenoprotein P contains 10 selenocysteine residues in its primary structure (deduced). It is the only selenoprotein characterized to date that has more than one selenocysteine residue. Selenoprotein P cDNA has been cloned from human liver and heart cDNA libraries and sequenced. The open reading frames are identical and contain a signal peptide, indicating that the protein is secreted by both organs and is therefore not exclusively produced in the liver. Ten selenocysteine residues (deduced) are present. Comparison of the open reading frame of the human cDNA with the rat cDNA reveals a 69% identity of the nucleotide sequence and 72% identity of the deduced amino acid sequence. Two regions in the 3' untranslated portion have high conservation between human and rat. Each of these regions contains a predicted stable stem-loop structure similar to the single stem-loop structures reported in 3' untranslated regions of type I iodothyronine 5'-deiodinase and glutathione peroxidase. The stem-loop structure of type I iodothyronine 5'-deiodinase has been shown to be necessary for incorporation of the selenocysteine residue at the UGA codon. Because only two stem-loop structures are present in the 3' untranslated region of selenoprotein P mRNA, it can be concluded that a separate stem-loop structure is not required for each selenocysteine residue. Images PMID:8421687

  5. Gene Cloning and Nucleotide Sequencing and Properties of a Cocaine Esterase from Rhodococcus sp. Strain MB1

    PubMed Central

    Bresler, Matthew M.; Rosser, Susan J.; Basran, Amrik; Bruce, Neil C.

    2000-01-01

    A strain of Rhodococcus designated MB1, which was capable of utilizing cocaine as a sole source of carbon and nitrogen for growth, was isolated from rhizosphere soil of the tropane alkaloid-producing plant Erythroxylum coca. A cocaine esterase was found to initiate degradation of cocaine, which was hydrolyzed to ecgonine methyl ester and benzoate; both of these esterolytic products were further metabolized by Rhodococcus sp. strain MB1. The structural gene encoding a cocaine esterase, designated cocE, was cloned from Rhodococcus sp. strain MB1 genomic libraries by screening recombinant strains of Rhodococcus erythropolis CW25 for growth on cocaine. The nucleotide sequence of cocE corresponded to an open reading frame of 1,724 bp that codes for a protein of 574 amino acids. The amino acid sequence of cocaine esterase has a region of similarity with the active serine consensus of X-prolyl dipeptidyl aminopeptidases, suggesting that the cocaine esterase is a serine esterase. The cocE coding sequence was subcloned into the pCFX1 expression plasmid and expressed in Escherichia coli. The recombinant cocaine esterase was purified to apparent homogeneity and was found to be monomeric, with an Mr of approximately 65,000. The apparent Km of the enzyme (mean ± standard deviation) for cocaine was measured as 1.33 ± 0.085 mM. These findings are of potential use in the development of a linked assay for the detection of illicit cocaine. PMID:10698749

  6. The complete nucleotide sequence and genomic organization of Citrus Leprosis associated Virus, Cytoplasmatic type (CiLV-C).

    PubMed

    Pascon, Renata C; Kitajima, João Paulo; Breton, Michèle C; Assumpção, Laura; Greggio, Christian; Zanca, Almir S; Okura, Vagner Katsumi; Alegria, Marcos C; Camargo, Maria E; Silva, Giovana G C; Cardozo, Jussara C; Vallim, Marcelo A; Franco, Sulamita F; Silva, Vitor H; Jordão, Hamilton; Oliveira, Fernanda; Giachetto, Poliana F; Ferrari, Fernanda; Aguilar-Vildoso, Carlos I; Franchiscini, Fabrício J B; Silva, José M F; Arruda, Paulo; Ferro, Jesus A; Reinach, Fernando; da Silva, Ana Cláudia Rasera

    2006-06-01

    The Citrus leprosis disease (CiL) is associated with a virus (CiLV) transmitted by Brevipalpus spp. mites (Acari: Tenuipalpidae). CiL is endemic in Brazil and its recent spread to Central America represents a threat to the citrus industry in the USA. Electron microscopy images show two forms of CiLV: a rare nuclear form, characterized by rod-shaped naked particles (CiLV-N), and a common cytoplasmic form (CiLV-C) associated with bacilliform-enveloped particles and cytoplasmic viroplasm. Due to this morphological feature, CiLV-C has been treated as Rhabdovirus-like. In this paper we present the complete nucleotide sequence and genomic organization of CiLV-C. It is a bipartite virus with sequence similarity to positive-sense ssRNA plant viruses. RNA1 encodes a putative replicase polyprotein and an ORF with no known function. RNA2 encodes 4 ORFs. p15, p24 and p61 have no significant similarity to any known proteins and p32 encodes a protein with similarity to a viral movement protein. The CiLV-C sequences are associated with typical symptoms of CiL by RT-PCR. Phylogenetic analysis suggests that CiLV-C is probably a member of a new family of plant viruses evolutionarily related to Tobamovirus. PMID:16732481

  7. Complete nucleotide sequences of a new bipartite begomovirus from Malvastrum sp. plants with bright yellow mosaic symptoms in South Texas.

    PubMed

    Alabi, Olufemi J; Villegas, Cecilia; Gregg, Lori; Murray, K Daniel

    2016-06-01

    Two isolates of a novel bipartite begomovirus, tentatively named malvastrum bright yellow mosaic virus (MaBYMV), were molecularly characterized from naturally infected plants of the genus Malvastrum showing bright yellow mosaic disease symptoms in South Texas. Six complete DNA-A and five DNA-B genome sequences of MaBYMV obtained from the isolates ranged in length from 2,608 to 2,609 nucleotides (nt) and 2,578 to 2,605 nt, respectively. Both genome segments shared a 178- to 180-nt common region. In pairwise comparisons, the complete DNA-A and DNA-B sequences of MaBYMV were most similar (87-88 % and 79-81 % identity, respectively) and phylogenetically related to the corresponding sequences of sida mosaic Sinaloa virus-[MX-Gua-06]. Further analysis revealed that MaBYMV is a putative recombinant virus, thus supporting the notion that malvaceous hosts may be influencing the evolution of several begomoviruses. The design of new diagnostic primers enabled the detection of MaBYMV in cohorts of Bemisia tabaci collected from symptomatic Malvastrum sp. plants, thus implicating whiteflies as potential vectors of the virus. PMID:27016928

  8. Molecular Properties of Poliovirus Isolates: Nucleotide Sequence Analysis, Typing by PCR and Real-Time RT-PCR.

    PubMed

    Burns, Cara C; Kilpatrick, David R; Iber, Jane C; Chen, Qi; Kew, Olen M

    2016-01-01

    Virologic surveillance is essential to the success of the World Health Organization initiative to eradicate poliomyelitis. Molecular methods have been used to detect polioviruses in tissue culture isolates derived from stool samples obtained through surveillance for acute flaccid paralysis. This chapter describes the use of real-time PCR assays to identify and serotype polioviruses. In particular, a degenerate, inosine-containing, panpoliovirus (panPV) PCR primer set is used to distinguish polioviruses from nonpolio enteroviruses (NPEVs). The high degree of nucleotide sequence diversity among polioviruses presents a challenge to the systematic design of nucleic acid-based reagents. To accommodate the wide variability and rapid evolution of poliovirus genomes, degenerate codon positions on the template were matched to mixed-base or deoxyinosine residues on both the primers and the TaqMan™ probes. Additional assays distinguish between Sabin vaccine strains and non-Sabin strains. This chapter also describes the use of generic poliovirus-specific primers, along with degenerate and inosine-containing primers, for routine VP1 sequencing of poliovirus isolates. These primers, along with nondegenerate serotype-specific Sabin primers, can also be used to sequence individual polioviruses in mixtures. PMID:26983735

  9. InPhaDel: integrative shotgun and proximity-ligation sequencing to phase deletions with single nucleotide polymorphisms.

    PubMed

    Patel, Anand; Edge, Peter; Selvaraj, Siddarth; Bansal, Vikas; Bafna, Vineet

    2016-07-01

    Phasing of single nucleotide (SNV), and structural variations into chromosome-wide haplotypes in humans has been challenging, and required either trio sequencing or restricting phasing to population-based haplotypes. Selvaraj et al. demonstrated single individual SNV phasing is possible with proximity ligated (HiC) sequencing. Here, we demonstrate HiC can phase structural variants into phased scaffolds of SNVs. Since HiC data is noisy, and SV calling is challenging, we applied a range of supervised classification techniques, including Support Vector Machines and Random Forest, to phase deletions. Our approach was demonstrated on deletion calls and phasings on the NA12878 human genome. We used three NA12878 chromosomes and simulated chromosomes to train model parameters. The remaining NA12878 chromosomes withheld from training were used to evaluate phasing accuracy. Random Forest had the highest accuracy and correctly phased 86% of the deletions with allele-specific read evidence. Allele-specific read evidence was found for 76% of the deletions. HiC provides significant read evidence for accurately phasing 33% of the deletions. Also, eight of eight top ranked deletions phased by only HiC were validated using long range polymerase chain reaction and Sanger. Thus, deletions from a single individual can be accurately phased using a combination of shotgun and proximity ligation sequencing. InPhaDel software is available at: http://l337x911.github.io/inphadel/. PMID:27105843

  10. InPhaDel: integrative shotgun and proximity-ligation sequencing to phase deletions with single nucleotide polymorphisms

    PubMed Central

    Patel, Anand; Edge, Peter; Selvaraj, Siddarth; Bansal, Vikas; Bafna, Vineet

    2016-01-01

    Phasing of single nucleotide (SNV), and structural variations into chromosome-wide haplotypes in humans has been challenging, and required either trio sequencing or restricting phasing to population-based haplotypes. Selvaraj et al. demonstrated single individual SNV phasing is possible with proximity ligated (HiC) sequencing. Here, we demonstrate HiC can phase structural variants into phased scaffolds of SNVs. Since HiC data is noisy, and SV calling is challenging, we applied a range of supervised classification techniques, including Support Vector Machines and Random Forest, to phase deletions. Our approach was demonstrated on deletion calls and phasings on the NA12878 human genome. We used three NA12878 chromosomes and simulated chromosomes to train model parameters. The remaining NA12878 chromosomes withheld from training were used to evaluate phasing accuracy. Random Forest had the highest accuracy and correctly phased 86% of the deletions with allele-specific read evidence. Allele-specific read evidence was found for 76% of the deletions. HiC provides significant read evidence for accurately phasing 33% of the deletions. Also, eight of eight top ranked deletions phased by only HiC were validated using long range polymerase chain reaction and Sanger. Thus, deletions from a single individual can be accurately phased using a combination of shotgun and proximity ligation sequencing. InPhaDel software is available at: http://l337x911.github.io/inphadel/. PMID:27105843

  11. Finding the right coverage: the impact of coverage and sequence quality on single nucleotide polymorphism genotyping error rates.

    PubMed

    Fountain, Emily D; Pauli, Jonathan N; Reid, Brendan N; Palsbøll, Per J; Peery, M Zachariah

    2016-07-01

    Restriction-enzyme-based sequencing methods enable the genotyping of thousands of single nucleotide polymorphism (SNP) loci in nonmodel organisms. However, in contrast to traditional genetic markers, genotyping error rates in SNPs derived from restriction-enzyme-based methods remain largely unknown. Here, we estimated genotyping error rates in SNPs genotyped with double digest RAD sequencing from Mendelian incompatibilities in known mother-offspring dyads of Hoffman's two-toed sloth (Choloepus hoffmanni) across a range of coverage and sequence quality criteria, for both reference-aligned and de novo-assembled data sets. Genotyping error rates were more sensitive to coverage than sequence quality and low coverage yielded high error rates, particularly in de novo-assembled data sets. For example, coverage ≥5 yielded median genotyping error rates of ≥0.03 and ≥0.11 in reference-aligned and de novo-assembled data sets, respectively. Genotyping error rates declined to ≤0.01 in reference-aligned data sets with a coverage ≥30, but remained ≥0.04 in the de novo-assembled data sets. We observed approximately 10- and 13-fold declines in the number of loci sampled in the reference-aligned and de novo-assembled data sets when coverage was increased from ≥5 to ≥30 at quality score ≥30, respectively. Finally, we assessed the effects of genotyping coverage on a common population genetic application, parentage assignments, and showed that the proportion of incorrectly assigned maternities was relatively high at low coverage. Overall, our results suggest that the trade-off between sample size and genotyping error rates be considered prior to building sequencing libraries, reporting genotyping error rates become standard practice, and that effects of genotyping errors on inference be evaluated in restriction-enzyme-based SNP studies. PMID:26946083
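
    The error-rate estimate described in this abstract rests on Mendelian incompatibilities: at a biallelic SNP, a mother and her offspring must share at least one allele, so an opposite-homozygote pair signals a genotyping error in at least one of the two. The Python sketch below illustrates that check only. The genotype encoding (0, 1 or 2 copies of the alternate allele, None for a missing call), the function name and the toy data are illustrative assumptions, not taken from the paper, and the ratio it returns is a rough proxy, since many genotyping errors do not produce an incompatibility.

```python
from typing import Optional, Sequence

# Assumed encoding: 0, 1 or 2 copies of the alternate allele; None = missing call.
Genotype = Optional[int]


def mendelian_incompatibility_rate(mother: Sequence[Genotype],
                                   offspring: Sequence[Genotype]) -> float:
    """Fraction of jointly called loci where mother and offspring share no allele."""
    if len(mother) != len(offspring):
        raise ValueError("genotype vectors must have the same length")
    compared = 0
    incompatible = 0
    for m, o in zip(mother, offspring):
        if m is None or o is None:
            continue  # skip loci that were not called in both individuals
        compared += 1
        # Opposite homozygotes (0 vs 2) cannot occur in a true mother-offspring dyad.
        if {m, o} == {0, 2}:
            incompatible += 1
    if compared == 0:
        raise ValueError("no loci called in both individuals")
    return incompatible / compared


# Toy dyad (hypothetical data): five loci, one missing call, one conflict.
mother = [0, 1, 2, None, 2]
offspring = [0, 2, 0, 1, 1]
rate = mendelian_incompatibility_rate(mother, offspring)
print(rate)
```

    In practice the same function would be applied after filtering loci at each coverage and quality threshold, so that the incompatibility rate can be compared across filtering criteria.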

  12. Update on Pneumocystis carinii f. sp. hominis Typing Based on Nucleotide Sequence Variations in Internal Transcribed Spacer Regions of rRNA Genes

    PubMed Central

    Lee, Chao-Hung; Helweg-Larsen, Jannik; Tang, Xing; Jin, Shaoling; Li, Baozheng; Bartlett, Marilyn S.; Lu, Jang-Jih; Lundgren, Bettina; Lundgren, Jens D.; Olsson, Mats; Lucas, Sebastian B.; Roux, Patricia; Cargnel, Antonietta; Atzori, Chiara; Matos, Olga; Smith, James W.

    1998-01-01

    Pneumocystis carinii f. sp. hominis isolates from 207 clinical specimens from nine countries were typed based on nucleotide sequence variations in the internal transcribed spacer regions I and II (ITS1 and ITS2, respectively) of rRNA genes. The number of ITS1 nucleotides has been revised from the previously reported 157 bp to 161 bp. Likewise, the number of ITS2 nucleotides has been changed from 177 to 192 bp. The number of ITS1 sequence types has increased from 2 to 15, and that of ITS2 has increased from 3 to 14. The 15 ITS1 sequence types are designated types A through O, and the 14 ITS2 types are named types a through n. A total of 59 types of P. carinii f. sp. hominis were found in this study. PMID:9508304

  13. Complete nucleotide sequence and organization of the mitochondrial genome of Sirthenea flavipes (Hemiptera: Reduviidae: Peiratinae) and comparison with other assassin bugs.

    PubMed

    Gao, Jianyu; Li, Hu; Truong, Xuan Lam; Dai, Xun; Chang, Jian; Cai, Wanzhi

    2013-01-01

    The complete sequence of the mitochondrial (mt) genome of the assassin bug, Sirthenea flavipes (Stål), was determined. The circular genome is 15,961 bp long and contains a standard gene complement, i.e., the large and small ribosomal RNA (rRNA) subunits, 22 transfer RNA (tRNA) genes, 13 protein-coding genes (PCGs), and the 1,295 bp control region. The nucleotide composition of the S. flavipes mt genome is 71.8% AT-rich, reflected in the predominance of AT-rich codons in PCGs. Compared with the other three reduviid species for which complete mt genomes are available, the genome architecture as well as the nucleotide composition, codon usage, and amino acid composition showed high similarity. All PCGs use standard initiation codons (ATN) except ND4L and ND1, which start with GTG. Canonical TAA and TAG termination codons are found in nine PCGs; the remaining four (COIII, ND3, ND5, and ND1) have incomplete termination codons. All tRNAs have the typical clover-leaf structure, except that the dihydrouridine (DHU) arm of tRNASer(AGN) forms a simple loop, as seen in many other metazoans. Secondary structure models of the ribosomal RNA genes of S. flavipes are presented and are similar to those proposed for other insects. The structure of rrnL is more conserved than that of rrnS among sequenced assassin bugs. The monophyly of Reduviidae is highly supported by Bayesian inferences, and the Peiratinae occupies a sister position to the Triatominae + (Salyavatinae + Harpactorinae). PMID:26312315

  14. Inferring Multiple Refugia and Phylogeographical Patterns in Pinus massoniana Based on Nucleotide Sequence Variation and DNA Fingerprinting

    PubMed Central

    Lin, Chung-Jian; Huang, Chi-Chung; Huang, Chao-Ching; Chiang, Yu-Chung; Chiang, Tzen-Yuh

    2012-01-01

    Background: Pinus massoniana, an ecologically and economically important conifer, is widespread across central and southern mainland China and Taiwan. In this study, we tested the central–marginal paradigm that predicts that the marginal populations tend to be less polymorphic than the central ones in their genetic composition, and examined a founders' effect in the island population. Methodology/Principal Findings: We examined the phylogeography and population structuring of the P. massoniana based on nucleotide sequences of cpDNA atpB-rbcL intergenic spacer, intron regions of the AdhC2 locus, and microsatellite fingerprints. SAMOVA analysis of nucleotide sequences indicated that most genetic variants resided among geographical regions. High levels of genetic diversity in the marginal populations in the south region, a pattern seemingly contradicting the central–marginal paradigm, and the fixation of private haplotypes in most populations indicate that multiple refugia may have existed over the glacial maxima. STRUCTURE analyses on microsatellites revealed that genetic structure of mainland populations was mediated with recent genetic exchanges mostly via pollen flow, and that the genetic composition in east region was intermixed between south and west regions, a pattern likely shaped by gene introgression and maintenance of ancestral polymorphisms. As expected, the small island population in Taiwan was genetically differentiated from mainland populations. Conclusions/Significance: The marginal populations in south region possessed divergent gene pools, suggesting that the past glaciations might have low impacts on these populations at low latitudes. Estimates of ancestral population sizes interestingly reflect a recent expansion in mainland from a rather smaller population, a pattern that seemingly agrees with the pollen record. PMID:22952747

  15. Isolation and nucleotide sequence of an autonomously replicating sequence (ARS) element functional in Candida albicans and Saccharomyces cerevisiae.

    PubMed

    Cannon, R D; Jenkinson, H F; Shepherd, M G

    1990-04-01

    An 8.6-kb fragment was isolated from an EcoRI digest of Candida albicans ATCC 10261 genomic DNA which conferred the property of autonomous replication in Saccharomyces cerevisiae on the otherwise non-replicative plasmid pMK155 (5.6 kb). The DNA responsible for the replicative function was subcloned as a 1.2-kb fragment onto a non-replicative plasmid (pRC3915) containing the C. albicans URA3 and LEU2 genes to form plasmid pRC3920. This plasmid was capable of autonomous replication in both S. cerevisiae and C. albicans and transformed S. cerevisiae AH22 (leu2-) to Leu+ at a frequency of 2.15 × 10^3 transformants per microgram DNA, and transformed C. albicans SGY-243 (Δura3) to Ura+ at a frequency of 1.91 × 10^3 transformants per microgram DNA. Sequence analysis of the cloned DNA revealed the presence of two identical regions of eleven base pairs (5'TTTTATGTTTT3') which agreed with the consensus of autonomously replicating sequence (ARS) cores functional in S. cerevisiae. In addition there were two 10/11 and numerous 9/11 matches to the core consensus. The two 11/11 matches to the consensus, CaARS1 and CaARS2, were located on opposite strands in a non-coding AT-rich region and were separated by 107 bp. Also present on the C. albicans DNA, 538 bp from the ARS cores, was a gene for 5S rRNA which showed sequence homology with several other yeast 5S rRNA genes.(ABSTRACT TRUNCATED AT 250 WORDS) PMID:2196431
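
    The 11/11, 10/11 and 9/11 match counts reported in this abstract come from comparing every 11-bp window on both strands against the ARS core. A minimal Python sketch of such a scan follows. It uses the 11-bp sequence quoted in the abstract as a stand-in for the consensus, which is a simplifying assumption (the consensus itself tolerates alternative bases at some positions), and the function names and toy sequence are hypothetical.

```python
# 11-bp core quoted in the abstract; used here as a stand-in for the ARS consensus.
CORE = "TTTTATGTTTT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def scan_core_matches(seq: str, core: str = CORE, min_matches: int = 9):
    """List (strand, start, matches) for windows matching >= min_matches positions.

    Start positions on the '-' strand are 0-based offsets into the
    reverse-complemented sequence, not into the input sequence.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for i in range(len(s) - len(core) + 1):
            window = s[i:i + len(core)]
            matches = sum(a == b for a, b in zip(window, core))
            if matches >= min_matches:
                hits.append((strand, i, matches))
    return hits


# Toy sequence (hypothetical): one core on each strand, separated by a short spacer.
toy = "GGC" + CORE + "CCGG" + reverse_complement(CORE) + "GC"
exact = scan_core_matches(toy, min_matches=11)
print(exact)
```

    Lowering `min_matches` to 10 or 9 lists the weaker matches as well; a fuller treatment would score against the degenerate consensus rather than a single sequence.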

  16. The complete nucleotide sequence and genome organization of lychnis mottle virus.

    PubMed

    Yoo, Ran Hee; Zhao, Fumei; Lim, Seungmo; Igori, Davaajargal; Lee, Su-Heon; Moon, Jae Sun

    2015-11-01

    The complete genomic sequence of lychnis mottle virus (LycMoV) from a Lychnis cognata plant was determined. LycMoV has a bipartite genome consisting of RNA1 (7,428 nt) and RNA2 (3,734 nt). Species in the family Secoviridae are demarcated based on their amino acid similarities in the protease-polymerase and coat protein. In LycMoV, these proteins share 90% and 63% sequence similarity, respectively, with the most closely related virus, strawberry latent ringspot virus, which is a member of the family Secoviridae but has not been assigned to a genus. Therefore, LycMoV is a tentative new virus of the family Secoviridae. PMID:26259831

  17. The mitochondrial genome of Anopheles quadrimaculatus species A: complete nucleotide sequence and gene organization.

    PubMed

    Mitchell, S E; Cockburn, A F; Seawright, J A

    1993-12-01

    The complete sequence (15,455 bp) of the mitochondrial DNA of the mosquito Anopheles quadrimaculatus species A is reported. This genome is compact and very A+T rich (77.4% A+T). It contains genes for 2 ribosomal RNAs (rRNAs), 22 transfer RNAs (tRNAs), and 13 subunits of the mitochondrial inner membrane respiratory complexes. The gene arrangement is the same as in Drosophila yakuba, except that the positions of two contiguous tRNAs are reversed and a third tRNA is transcribed from the complementary strand. Protein-coding genes, rRNAs, and most tRNAs were similar to D. yakuba. Two tRNAs had nonstandard secondary structures comparable with those of nematode mitochondrial tRNAs. The very small putative control region (625 bp) contains no sequence motifs similar to those used in vertebrates and other insects for initiation of transcription and replication. PMID:8112570

  18. Nucleotide sequence of the regulatory locus controlling expression of bacterial genes for bioluminescence.

    PubMed Central

    Engebrecht, J; Silverman, M

    1987-01-01

    Production of light by the marine bacterium Vibrio fischeri and by recombinant hosts containing cloned lux genes is controlled by the density of the culture. Density-dependent regulation of lux gene expression has been shown to require a locus consisting of the luxR and luxI genes and two closely linked divergent promoters. As part of a genetic analysis to understand the regulation of bioluminescence, we have sequenced the region of DNA containing this control circuit. Open reading frames corresponding to luxR and luxI were identified; transcription start sites were defined by S1 nuclease mapping and sequences resembling promoter elements were located. PMID:3697093

  19. Proteus mirabilis urease: nucleotide sequence determination and comparison with jack bean urease.

    PubMed

    Jones, B D; Mobley, H L

    1989-12-01

    Proteus mirabilis, a common cause of urinary tract infection, produces a potent urease that hydrolyzes urea to NH3 and CO2, initiating kidney stone formation. Urease genes, which were localized to a 7.6-kilobase-pair region of DNA, were sequenced by using the dideoxy method. Six open reading frames were found within a region of 4,952 base pairs which were predicted to encode polypeptides of 31.0 (ureD), 11.0 (ureA), 12.2 (ureB), 61.0 (ureC), 17.9 (ureE), and 23.0 (ureF) kilodaltons (kDa). Each open reading frame was preceded by a ribosome-binding site, with the exception of ureE. Putative promoterlike sequences were identified upstream of ureD, ureA, and ureF. Possible termination sites were found downstream of ureD, ureC, and ureF. Structural subunits of the enzyme were encoded by ureA, ureB, and ureC and were translated from a single transcript in the order of 11.0, 12.2, and 61.0 kDa. When the deduced amino acid sequences of the P. mirabilis urease subunits were compared with the amino acid sequence of the jack bean urease, significant amino acid similarity was observed (58% exact matches; 73% exact plus conservative replacements). The 11.0-kDa polypeptide aligned with the N-terminal residues of the plant enzyme, the 12.2-kDa polypeptide lined up with internal residues, and the 61.0-kDa polypeptide matched with the C-terminal residues, suggesting an evolutionary relationship of the urease genes of jack bean and P. mirabilis. PMID:2687233

  20. The influence of nucleotide sequence and temperature on the activity of thermostable DNA polymerases.

    PubMed

    Montgomery, Jesse L; Rejali, Nick; Wittwer, Carl T

    2014-05-01

    Extension rates of a thermostable, deletion-mutant polymerase were measured from 50°C to 90°C using a fluorescence activity assay adapted for real-time PCR instruments. Substrates with a common hairpin (6-base loop and a 14-bp stem) were synthesized with different 10-base homopolymer tails. Rates for A, C, G, T, and 7-deaza-G incorporation at 75°C were 81, 150, 214, 46, and 120 s^-1. Rates for U were half as fast as T and did not increase with increasing concentration. Hairpin substrates with 25-base tails from 0% to 100% GC content had maximal extension rates near 60% GC and were predicted from the template sequence and mononucleotide incorporation rates to within 30% for most sequences. Addition of dimethyl sulfoxide at 7.5% increased rates to within 1% to 17% of prediction for templates with 40% to 90% GC. When secondary structure was designed into the template region, extension rates decreased. Oligonucleotide probes reduced extension rates by 65% (5'-3' exo-) and 70% (5'-3' exo+). When using a separate primer and a linear template to form a polymerase substrate, rates were dependent on both the primer melting temperature (Tm) and the annealing/extension temperature. Maximum rates were observed from Tm to Tm - 5°C with little extension by Tm + 5°C. Defining the influence of sequence and temperature on polymerase extension will enable more rapid and efficient PCR. PMID:24607271
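
    The abstract states that extension rates were predicted from the template sequence and the mononucleotide incorporation rates, but it does not give the formula. One plausible reading, sketched below in Python, treats each incorporation step as taking 1/rate seconds for the incoming nucleotide and reports the template length divided by the summed step times. That model, the function name and the example templates are assumptions for illustration only, not the authors' method; only the four rate values are taken from the abstract.

```python
# Per-nucleotide incorporation rates at 75 °C quoted in the abstract (per second).
INCORPORATION_RATE = {"A": 81.0, "C": 150.0, "G": 214.0, "T": 46.0}

# The incoming nucleotide is the complement of the template base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def predicted_extension_rate(template: str) -> float:
    """Mean nucleotides per second, ASSUMING each step takes 1/rate of the incoming base."""
    template = template.upper()
    if not template:
        raise ValueError("template must not be empty")
    total_time = 0.0
    for base in template:
        incoming = COMPLEMENT[base]
        total_time += 1.0 / INCORPORATION_RATE[incoming]
    return len(template) / total_time


# A poly-T template requires only A incorporation, so the prediction equals the A rate.
poly_t = predicted_extension_rate("T" * 10)
# An alternating GC template mixes the C and G incorporation rates.
mixed_gc = predicted_extension_rate("GC" * 5)
print(poly_t, mixed_gc)
```

    Under this simple model the predicted rate rises steadily with GC content, so it cannot by itself reproduce the reported maximum near 60% GC; template secondary structure, which the abstract reports as slowing extension, is one factor it leaves out.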

  1. Nucleotide sequence and characterization of peb4A encoding an antigenic protein in Campylobacter jejuni.

    PubMed

    Burucoa, C; Frémaux, C; Pei, Z; Tummuru, M; Blaser, M J; Cenatiempo, Y; Fauchère, J L

    1995-01-01

    The 29-kDa protein PEB4, a major antigen of Campylobacter jejuni, is present in all C. jejuni strains tested and elicits an antibody response in infected patients. By screening a lambda gt11 library of chromosomal DNA fragments of C. jejuni strain 81-176 in Escherichia coli Y1090 cells with antibody raised against purified PEB4, a recombinant phage with a 2-kb insert expressing an immunoreactive protein of 29 kDa was isolated. DNA sequence analysis revealed that the insert contains two complete open reading frames, ORF-A and ORF-B. ORF-A (peb4A) encodes a 273-residue protein with a calculated molecular mass of 30,460 daltons. The deduced amino acid sequence, composition and pI of the recombinant mature protein are similar to those determined for purified PEB4. The first 21 residues resemble a signal peptide. Gene bank searches indicated 33.7% identity with protein export protein PrsA of Bacillus subtilis and 23.8% identity with protease maturation protein precursor PrtM of Lactococcus lactis. PCR experiments indicate that peb4A is highly conserved among C. jejuni strains. ORF-B begins 2 bp after the last codon of peb4A and encodes a putative protein of 353 residues with 63.4% identity with E. coli fructose 1,6-bisphosphate aldolase. The sequence arrangement suggests that these two genes form an operon. PMID:8525063

  2. Model annotation for synthetic biology: automating model to nucleotide sequence conversion

    PubMed Central

    Misirli, Goksel; Hallinan, Jennifer S.; Yu, Tommy; Lawson, James R.; Wimalaratne, Sarala M.; Cooling, Michael T.; Wipat, Anil

    2011-01-01

    Motivation: The need for the automated computational design of genetic circuits is becoming increasingly apparent with the advent of ever more complex and ambitious synthetic biology projects. Currently, most circuits are designed through the assembly of models of individual parts such as promoters, ribosome binding sites and coding sequences. These low level models are combined to produce a dynamic model of a larger device that exhibits a desired behaviour. The larger model then acts as a blueprint for physical implementation at the DNA level. However, the conversion of models of complex genetic circuits into DNA sequences is a non-trivial undertaking due to the complexity of mapping the model parts to their physical manifestation. Automating this process is further hampered by the lack of computationally tractable information in most models. Results: We describe a method for automatically generating DNA sequences from dynamic models implemented in CellML and Systems Biology Markup Language (SBML). We also identify the metadata needed to annotate models to facilitate automated conversion, and propose and demonstrate a method for the markup of these models using RDF. Our algorithm has been implemented in a software tool called MoSeC. Availability: The software is available from the authors' web site http://research.ncl.ac.uk/synthetic_biology/downloads.html. Contact: anil.wipat@ncl.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21296753

  3. Statistical tests to identify appropriate types of nucleotide sequence recoding in molecular phylogenetics

    PubMed Central

    2014-01-01

    Background: Under a Markov model of evolution, recoding, or lumping, of the four nucleotides into fewer groups may permit analysis under simpler conditions but may unfortunately yield misleading results unless the evolutionary process of the recoded groups remains Markovian. If a Markov process is lumpable, then the evolutionary process of the recoded groups is Markovian. Results: We consider stationary, reversible, and homogeneous Markov processes on two taxa and compare three tests for lumpability: one using an ad hoc test statistic, which is based on an index that is evaluated using a bootstrap approximation of its distribution; one that is based on a test proposed specifically for Markov chains; and one using a likelihood-ratio test. We show that the likelihood-ratio test is more powerful than the index test, which is more powerful than that based on the Markov chain test statistic. We also show that for stationary processes on binary trees with more than two taxa, the tests can be applied to all pairs. Finally, we show that if the process is lumpable, then estimates obtained under the recoded model agree with estimates obtained under the original model, whereas, if the process is not lumpable, then these estimates can differ substantially. We apply the new likelihood-ratio test for lumpability to two primate data sets, one with a mitochondrial origin and one with a nuclear origin. Conclusions: Recoding may result in biased phylogenetic estimates because the original evolutionary process is not lumpable. Accordingly, testing for lumpability should be done prior to phylogenetic analysis of recoded data. PMID:24564837
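
    Lumpability has a simple algebraic form when the rate matrix is known: a process is (strongly) lumpable with respect to a grouping of states if, for every pair of distinct groups, the total rate from a state into the other group is the same for all states in the source group. The Python sketch below checks that condition directly on an HKY85-style rate matrix. It illustrates the definition only and is not one of the statistical tests compared in the paper; the function names, parameter values and tolerance are assumptions chosen for the example.

```python
STATES = "ACGT"
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def hky_rate_matrix(kappa, freqs):
    """Unscaled HKY85 rates: kappa * pi_j for transitions, pi_j for transversions."""
    Q = [[0.0] * 4 for _ in range(4)]
    for i, x in enumerate(STATES):
        for j, y in enumerate(STATES):
            if x != y:
                Q[i][j] = (kappa if (x, y) in TRANSITIONS else 1.0) * freqs[y]
        Q[i][i] = -sum(Q[i])  # diagonal is still 0 here, so rows sum to zero
    return Q


def is_lumpable(Q, partition, tol=1e-12):
    """Check the strong-lumpability condition for a partition such as ["AG", "CT"]."""
    index = {s: i for i, s in enumerate(STATES)}
    for block_a in partition:
        for block_b in partition:
            if block_a == block_b:
                continue
            # Total rate from each state of block_a into block_b must be identical.
            sums = [sum(Q[index[i]][index[j]] for j in block_b) for i in block_a]
            if max(sums) - min(sums) > tol:
                return False
    return True


# Hypothetical parameters: transition bias 2 and unequal base frequencies.
Q = hky_rate_matrix(2.0, {"A": 0.1, "C": 0.2, "G": 0.3, "T": 0.4})
purine_pyrimidine = is_lumpable(Q, ["AG", "CT"])  # RY recoding
amino_keto = is_lumpable(Q, ["AC", "GT"])         # MK recoding
print(purine_pyrimidine, amino_keto)
```

    With these parameters the purine/pyrimidine grouping satisfies the condition while the amino/keto grouping does not, which is the kind of difference between recoding schemes that the abstract argues should be tested before analysis.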

  4. The nucleotide sequence of the chicken thymidine kinase gene and the relationship of its predicted polypeptide to that of the vaccinia virus thymidine kinase.

    PubMed Central

    Kwoh, T J; Engler, J A

    1984-01-01

    The entire DNA nucleotide sequence of a 3.0 kilobase pair Hind III fragment containing the chicken cytoplasmic thymidine kinase gene was determined. Oligonucleotide linker insertion mutations distributed throughout this gene and having known effects upon gene activity (Kwoh, T.J., Zipser, D., and Wigler, M. 1983. J. Mol. Appl. Genet. 2, 191-200) were used to access regions of the Hind III fragment for sequencing reactions. The complete nucleotide sequence, together with the positions of the linker insertion mutations within the sequence, allows us to propose a structure for the chicken thymidine kinase gene. The protein coding sequence of the gene is divided into seven small segments (each less than 160 base pairs) by six small introns (each less than 230 base pairs). The proposed 244 amino acid polypeptide encoded by this gene bears strong homology to the vaccinia virus thymidine kinase. No homology with the thymidine kinases of the herpes simplex viruses was found. PMID:6328447

  5. 5'-terminal nucleotide sequences of mammalian type C helper viruses are conserved in the genomes of replication-defective mammalian transforming viruses.

    PubMed Central

    Tronick, S R; Cabradilla, C D; Aaronson, S A; Haseltine, W A

    1978-01-01

    The RNAs of replication-defective murine and primate type C transforming viruses were analyzed for the presence of nucleotide sequences homologous to the genomes of their respective helper type C viruses by using DNAs complementary (cDNA) to either the 5'-terminal (cDNA5') or total (cDNAtotal) nucleotide sequences of the helper virus RNA. The defective viruses examined have previously been shown to vary in their ability to express helper viral gag gene proteins. With cDNAtotal as a probe, these transforming viruses were shown to vary in their representation of helper sequences (15 to 60% hybridization of cDNAtotal). In striking contrast, 5'-terminal-specific sequences of the helper virus were conserved in the RNAs of every transforming virus tested (>80% hybridization of cDNA5'). These findings suggest a critical role for these sequences in the life cycle of the defective transforming virus. PMID:209210

  6. Nucleotide Sequence of the Envelope Gene of Gardner-Arnstein Feline Leukemia Virus B Reveals Unique Sequence Homologies with a Murine Mink Cell Focus-Forming Virus

    PubMed Central

    Elder, John H.; Mullins, James I.

    1983-01-01

    The nucleotide sequence of the envelope gene and the adjacent 3′ long terminal repeat (LTR) of Gardner-Arnstein feline leukemia virus of subgroup B (GA-FeLV-B) has been determined. Comparison of the derived amino acid sequence of the gp70-p15E polyprotein to those of several previously reported murine retroviruses revealed striking homologies between GA-FeLV-B gp70 and the gp70 of a Moloney virus-derived mink cell focus-forming virus. These homologies were located within the substituted (presumably xenotropic) portion of the mink cell focus-forming virus envelope gene and comprised amino acid sequences not present in three ecotropic virus gp70s. In addition, areas of insertions and deletions, in general, were the same between GA-FeLV-B and Moloney mink cell focus-forming virus, although the sizes of the insertions and deletions differed. Homologies between GA-FeLV-B and mink cell focus-forming virus gp70s are functionally significant in that they both possess expanded host ranges, a property dictated by gp70. The amino acid sequence of FeLV-B contains 12 Asn-X-Ser/Thr sequences, indicating 12 possible sites of N-linked glycosylation as compared with 7 or 8 for its murine counterparts. Comparison of the 3′ LTR of GA-FeLV-B to AKR and Moloney virus LTRs revealed extensive conservation in several regions including the “CCAAT” and Goldberg-Hogness (TATA) boxes thought to be involved in promotion of transcription and in the repeat region of the LTR. The inverted repeats that flanked the LTR of GA-FeLV-B were identical to the murine inverted repeats, but were one base longer than the latter. The region of U3 corresponding to the approximately 75-nucleotide “enhancer sequence” is present in GA-FeLV-B, but contains deletions relative to AKR and Moloney virus and is not repeated. An interesting palindrome in the repeat region immediately 3′ to the U3 region was noted in all the LTRs, but was particularly pronounced in GA-FeLV-B. Possible roles for this

  7. Nucleotide Sequence and Gene Organization of the Starfish Asterina Pectinifera Mitochondrial Genome

    PubMed Central

    Asakawa, S.; Himeno, H.; Miura, K. I.; Watanabe, K.

    1995-01-01

    The 16,260-bp mitochondrial DNA (mtDNA) from the starfish Asterina pectinifera has been sequenced. The genes for 13 proteins, two rRNAs and 22 tRNAs are organized in an extremely economical fashion, similar to those of other animal mtDNAs, with some of the genes overlapping each other. The gene organization is the same as that for another echinoderm, sea urchin, except for the inversion of a 4.6-kb segment that contains genes for two proteins, 13 tRNAs and the 16S rRNA. Judging from the organization of the protein coding genes, mammalian mtDNAs resemble the sea urchin mtDNA more than that of the starfish. The region around the 3' end of the 12S rRNA gene of the starfish shows a high similarity with those for vertebrates. This region encodes a possible stem and loop structure; similar potential structures occur in this region of vertebrate mtDNAs and also in nonmitochondrial small subunit rRNA. A similar stem and loop structure is also found at the 3' end of the 16S rRNA genes in A. pectinifera, in another starfish Pisaster ochraceus, in vertebrates and in Drosophila, but not in sea urchins. The full sequence data confirm the presumption that AGA/AGG, AUA and AAA codons, respectively, code for serine, isoleucine, and asparagine in the starfish mitochondria, and that AGA/AGG codons are read by tRNA(GCU)(Ser), which possesses a truncated dihydrouridine arm, that was previously suggested from a partial mtDNA sequence. The structural characteristics of tRNAs and possible mechanisms for the change in the mitochondrial genetic code are also discussed. PMID:7672576

  8. Cloning and nucleotide sequence of luxR, a regulatory gene controlling bioluminescence in Vibrio harveyi.

    PubMed Central

    Showalter, R E; Martin, M O; Silverman, M R

    1990-01-01

    Mutagenesis with transposon mini-Mulac was used previously to identify a regulatory locus necessary for expression of bioluminescence genes, lux, in Vibrio harveyi (M. Martin, R. Showalter, and M. Silverman, J. Bacteriol. 171:2406-2414, 1989). Mutants with transposon insertions in this regulatory locus were used to construct a hybridization probe which was used in this study to detect recombinants in a cosmid library containing the homologous DNA. Recombinant cosmids with this DNA stimulated expression of the genes encoding enzymes for luminescence, i.e., the luxCDABE operon, which were positioned in trans on a compatible replicon in Escherichia coli. Transposon mutagenesis and analysis of the DNA sequence of the cloned DNA indicated that regulatory function resided in a single gene of about 0.6 kilobases named luxR. Expression of bioluminescence in V. harveyi and in the fish light-organ symbiont Vibrio fischeri is controlled by density-sensing mechanisms involving the accumulation of small signal molecules called autoinducers, but similarity of the two luminescence systems at the molecular level was not apparent in this study. The amino acid sequence of the LuxR product of V. harveyi, which indicates a structural relationship to some DNA-binding proteins, is not similar to the sequence of the protein that regulates expression of luminescence in V. fischeri. In addition, reconstitution of autoinducer-controlled luminescence in recombinant E. coli, already achieved with lux genes cloned from V. fischeri, was not accomplished with the isolation of luxR from V. harveyi, suggesting a requirement for an additional regulatory component. PMID:2160932

  9. Shuttle cloning and nucleotide sequences of Helicobacter pylori genes responsible for urease activity.

    PubMed

    Labigne, A; Cussac, V; Courcoux, P

    1991-03-01

    Production of a potent urease has been described as a trait common to all Helicobacter pylori so far isolated from humans with gastritis as well as peptic ulceration. The detection of urease activity from genes cloned from H. pylori was made possible by use of a shuttle cosmid vector, allowing replication and movement of cloned DNA sequences in either Escherichia coli or Campylobacter jejuni. With this approach, we cloned a 44-kb portion of H. pylori chromosomal DNA which did not lead to urease activity when introduced into E. coli but permitted, although temporarily, biosynthesis of the urease when transferred by conjugation to C. jejuni. The recombinant cosmid (pILL585) expressing the urease phenotype was mapped and used to subclone an 8.1-kb fragment (pILL590) able to confer the same property to C. jejuni recipient strains. By a series of deletions and subclonings, the urease genes were localized to a 4.2-kb region of DNA and were sequenced by the dideoxy method. Four open reading frames were found, encoding polypeptides with predicted molecular weights of 26,500 (ureA), 61,600 (ureB), 49,200 (ureC), and 15,000 (ureD). The predicted UreA and UreB polypeptides correspond to the two structural subunits of the urease enzyme; they exhibit a high degree of homology with the three structural subunits of Proteus mirabilis (56% exact matches) as well as with the unique structural subunit of jack bean urease (55.5% exact matches). Although the UreD-predicted polypeptide has domains relevant to transmembrane proteins, no precise role could be attributed to this polypeptide or to the UreC polypeptide, which both mapped to a DNA sequence shown to be required to confer urease activity to a C. jejuni recipient strain. PMID:2001995

  10. Unique graphical representation of protein sequences based on nucleotide triplet codons

    NASA Astrophysics Data System (ADS)

    Randić, Milan; Zupan, Jure; Balaban, Alexandru T.

    2004-10-01

    We consider a graphical representation of proteins as an alternative to the usual representation of proteins as a sequence listing the natural amino acids. The approach is based on a graphical representation of triplets of DNA in which the interior of a square or the interior of a tetrahedron is used to accommodate 64 sites for the 64 codons. By associating a zigzag curve and various matrices with a protein, just as was the case with graphical representation of DNA, one can construct selected invariants to serve as protein descriptors. The approach is illustrated on the A-chain of human insulin.

  11. A genome walking strategy for the identification of nucleotide sequences adjacent to known regions.

    PubMed

    Wang, Hailong; Yao, Ting; Cai, Mei; Xiao, Xiuqing; Ding, Xuezhi; Xia, Liqiu

    2013-02-01

    To identify the transposon insertion sites in a soil actinomycete, Saccharopolyspora spinosa, a genome walking approach, termed SPTA-PCR, was developed. In SPTA-PCR, a simple procedure consisting of TA cloning and a high stringency PCR, following the single primer-mediated, randomly-primed PCR, can eliminate non-target DNA fragments and obtain target fragments specifically. Using SPTA-PCR, the DNA sequence adjacent to the highly conserved region of lectin coding gene in onion plant, Allium chinense, was also cloned. PMID:23108875

  12. Nucleotide sequence of the rpoN gene and characterization of two downstream open reading frames in Pseudomonas aeruginosa.

    PubMed Central

    Jin, S; Ishimoto, K; Lory, S

    1994-01-01

    The rpoN gene of Pseudomonas aeruginosa is required for the expression of a number of diverse genes, ranging from several classes of bacterial adhesins to enzymes for amino acid biosynthesis. The nucleotide sequence of the rpoN gene and its flanking region has been determined. The deduced amino acid sequence of the rpoN product is highly homologous to sequences of RpoN proteins of other microorganisms. Moreover, two open reading frames (ORF1 and ORF2) encoding peptides 103 and 154 amino acids long, respectively, were found downstream of the rpoN gene. These two ORF products have a high degree of amino acid sequence homology with products of similar ORFs located adjacent to the rpoN genes in other microorganisms. Mutations in either ORF lead to a significant increase in P. aeruginosa generation time when propagated on minimal medium. These mutations had no effect on the expression of pilin or flagellin genes, whose expression depends on RpoN. Complementation analysis showed that the two ORFs are in the same transcriptional unit and the growth defects of the two ORF mutants on minimal medium are due to mutational effects on ORF2. The adverse effect of the ORF mutations on the growth of P. aeruginosa in minimal media can be suppressed by the addition of glutamine but not arginine, glutamate, histidine, or proline. Since rpoN mutants of P. aeruginosa display this same amino acid requirement for growth, the ORF2 product very likely functions as a coinducer of some but not all of the RpoN-controlled genes. PMID:8113171

  13. Species-wide genome sequence and nucleotide polymorphisms from the model allopolyploid plant Brassica napus

    PubMed Central

    Schmutzer, Thomas; Samans, Birgit; Dyrszka, Emmanuelle; Ulpinnis, Chris; Weise, Stephan; Stengel, Doreen; Colmsee, Christian; Lespinasse, Denis; Micic, Zeljko; Abel, Stefan; Duchscherer, Peter; Breuer, Frank; Abbadi, Amine; Leckband, Gunhild; Snowdon, Rod; Scholz, Uwe

    2015-01-01

    Brassica napus (oilseed rape, canola) is one of the world’s most important sources of vegetable oil for human nutrition and biofuel, and also a model species for studies investigating the evolutionary consequences of polyploidisation. Strong bottlenecks during its recent origin from interspecific hybridisation, and subsequently through intensive artificial selection, have severely depleted the genetic diversity available for breeding. On the other hand, high-throughput genome profiling technologies today provide unprecedented scope to identify, characterise and utilise genetic diversity in primary and secondary crop gene pools. Such methods also enable implementation of genomic selection strategies to accelerate breeding progress. The key prerequisite is availability of high-quality sequence data and identification of high-quality, genome-wide sequence polymorphisms representing relevant gene pools. We present comprehensive genome resequencing data from a panel of 52 highly diverse natural and synthetic B. napus accessions, along with a stringently selected panel of 4.3 million high-confidence, genome-wide SNPs. The data is of great interest for genomics-assisted breeding and for evolutionary studies on the origins and consequences of allopolyploidisation in plants. PMID:26647166

  14. Species-wide genome sequence and nucleotide polymorphisms from the model allopolyploid plant Brassica napus.

    PubMed

    Schmutzer, Thomas; Samans, Birgit; Dyrszka, Emmanuelle; Ulpinnis, Chris; Weise, Stephan; Stengel, Doreen; Colmsee, Christian; Lespinasse, Denis; Micic, Zeljko; Abel, Stefan; Duchscherer, Peter; Breuer, Frank; Abbadi, Amine; Leckband, Gunhild; Snowdon, Rod; Scholz, Uwe

    2015-01-01

    Brassica napus (oilseed rape, canola) is one of the world's most important sources of vegetable oil for human nutrition and biofuel, and also a model species for studies investigating the evolutionary consequences of polyploidisation. Strong bottlenecks during its recent origin from interspecific hybridisation, and subsequently through intensive artificial selection, have severely depleted the genetic diversity available for breeding. On the other hand, high-throughput genome profiling technologies today provide unprecedented scope to identify, characterise and utilise genetic diversity in primary and secondary crop gene pools. Such methods also enable implementation of genomic selection strategies to accelerate breeding progress. The key prerequisite is availability of high-quality sequence data and identification of high-quality, genome-wide sequence polymorphisms representing relevant gene pools. We present comprehensive genome resequencing data from a panel of 52 highly diverse natural and synthetic B. napus accessions, along with a stringently selected panel of 4.3 million high-confidence, genome-wide SNPs. The data is of great interest for genomics-assisted breeding and for evolutionary studies on the origins and consequences of allopolyploidisation in plants. PMID:26647166

  15. Nucleotide sequence and expression of a maize H1 histone cDNA.

    PubMed Central

    Razafimahatratra, P; Chaubet, N; Philipps, G; Gigot, C

    1991-01-01

    The first complete amino acid sequence of an H1 histone of a monocotyledonous plant was deduced from a cDNA isolated from a maize library. The encoded H1 protein is 245 amino acids long and shows the classical tripartite organization of this class of histones. The central globular region of 76 residues shows 60% sequence homology with H1 proteins from dicots but only 20% with the animal H1 proteins. However, several of the amino acids considered as being important in the structure of the nucleosome are conserved between this protein and its animal counterparts. The N-terminal region contains an equal number of acidic and basic residues, which appears to be a general feature of plant H1 proteins. The 124-residue-long, highly basic C-terminal region contains a 7-fold repeated element KA/PKXA/PAKA/PK. Southern-blot hybridization showed that the H1 protein is encoded by a small multigene family. Highly homologous H1 gene families were also detected in the genomes of several more or less closely related plant species. The general expression pattern of these genes was not significantly different from that of the genes encoding the core histones, either during germination or in the different tissues of adult maize. PMID:1709276

  16. Human bradykinin B2 receptor: Nucleotide sequence analysis and assignment to chromosome 14

    SciTech Connect

    Powell, S.J.; Slynn, G.; Thomas, C.; Hopkins, B.; Briggs, I.; Graham, A.

    1993-02-01

    Functional cDNA clones for human bradykinin B2 receptor were isolated from uterus RNA by a polymerase chain reaction (PCR)-based method and by screening a human cosmid library with rat bradykinin B2 receptor probe. We isolated several overlapping clones from the cosmid library, each of which encodes the entire protein-coding sequence. The human bradykinin B2 receptor gene codes for a 364-amino-acid protein with a molecular mass of 41,442 Da that is highly homologous to rat bradykinin B2 receptor cDNA (81%). The entire human cDNA sequence was cloned into an expression vector and mRNA was synthesised by in vitro transcription. Applications of bradykinin caused membrane current responses in Xenopus oocytes injected with the in vitro-synthesized mRNA. Preincubation with the potent B2 antagonist, HOE140, prevented this response. The genomic clone is intronless, and we have identified an upstream promoter region and a downstream polyadenylation signal. The human bradykinin B2 receptor gene has been mapped to chromosome 14 using PCR to specifically amplify DNA from somatic cell hybrids. 10 refs., 1 fig., 1 tab.

  17. Cloning, nucleotide sequence, and expression of the Escherichia coli gene encoding carnitine dehydratase.

    PubMed Central

    Eichler, K; Schunck, W H; Kleber, H P; Mandrand-Berthelot, M A

    1994-01-01

    Carnitine dehydratase from Escherichia coli O44 K74 is an inducible enzyme detectable in cells grown anaerobically in the presence of L-(-)-carnitine or crotonobetaine. The purified enzyme catalyzes the dehydration of L-(-)-carnitine to crotonobetaine (H. Jung, K. Jung, and H.-P. Kleber, Biochim. Biophys. Acta 1003:270-276, 1989). The caiB gene, encoding carnitine dehydratase, was isolated by oligonucleotide screening from a genomic library of E. coli O44 K74. The caiB gene is 1,215 bp long, and it encodes a protein of 405 amino acids with a predicted M(r) of 45,074. The identity of the gene product was first assessed by its comigration in sodium dodecyl sulfate-polyacrylamide gels with the purified enzyme after overexpression in the pT7 system and by its enzymatic activity. Moreover, the N-terminal amino acid sequence of the purified protein was found to be identical to that predicted from the gene sequence. Northern (RNA) analysis showed that caiB is likely to be cotranscribed with at least one other gene. This other gene could be the gene encoding a 47-kDa protein, which was overexpressed upstream of caiB. PMID:8188598

  18. The structure and complete nucleotide sequence of the human cyclophilin 40 (PPID) gene

    SciTech Connect

    Yokoi, Haruhiko; Shimizu, Yukiko; Anazawa, Hideharu

    1996-08-01

    Cyclophilin 40 is a recently identified member of the cyclophilin family that is found in an unactivated steroid hormone receptor complex. Cyclophilin 40 possesses a region homologous to FKBP59, a member of the FK506-binding protein family that is also a component of the receptor complex. We report the isolation and sequencing of the entire human cyclophilin 40 (hCyP40) gene (human gene symbol PPID). The gene contains 10 exons (43 to 698 bp) and 9 introns encompassing 14.2 kb. The exon organization of the cyclophilin-like region is not similar to that of the human cyclophilin A gene (PPIA), suggesting their early divergence in evolution. Determination of the sequence of the 5′ end of the hCyP40 mRNA by an “anchor-ligation PCR” procedure showed that transcription is initiated from a cluster of sites about 80 bp upstream from the first in-frame ATG. The immediate 5′-flanking region of the gene lacks typical TATA and CAAT boxes, but is GC-rich and contains Sp1 sites, features characteristic of promoters associated with housekeeping genes. The hCyP40 gene was mapped to chromosome 4 by PCR with genomic DNA from somatic cell hybrids. As shown by “Zoo blot” analysis, the cyclophilin 40 gene appears to be highly conserved throughout evolution. 47 refs., 4 figs., 1 tab.

  19. Purification of histidase from Streptomyces griseus and nucleotide sequence of the hutH structural gene.

    PubMed Central

    Wu, P C; Kroening, T A; White, P J; Kendrick, K E

    1992-01-01

    Histidine ammonia-lyase (histidase) was purified to homogeneity from vegetative mycelia of Streptomyces griseus. The enzyme was specific for L-histidine and showed no activity against the substrate analog, D-histidine. Histidinol phosphate was a potent competitive inhibitor. Histidase displayed saturation kinetics with no detectable sigmoidal response. Neither thiol reagents nor a variety of divalent cations had any effect on the activity of the purified enzyme. High concentrations of potassium cyanide inactivated histidase in the absence of its substrate or histidinol phosphate, suggesting that, as in other histidases, dehydroalanine plays an important role in catalysis. The N-terminal amino acid sequence of histidase was used to construct a mixed oligonucleotide probe to identify and clone the histidase structural gene, hutH, from genomic DNA of the wild-type strain of S. griseus. The cloned DNA restored the ability of a histidase structural gene mutant to grow on L-histidine as the sole nitrogen source. The deduced amino acid sequence of hutH shows significant relatedness with histidase from bacteria and a mammal as well as phenylalanine ammonia-lyase from plants and fungi. PMID:1537807

  20. Human parainfluenza type 3 virus hemagglutinin-neuraminidase glycoprotein: nucleotide sequence of mRNA and limited amino acid sequence of the purified protein.

    PubMed Central

    Elango, N; Coligan, J E; Jambou, R C; Venkatesan, S

    1986-01-01

    The nucleotide sequence of mRNA for the hemagglutinin-neuraminidase (HN) protein of human parainfluenza type 3 virus obtained from the corresponding cDNA clone had a single long open reading frame encoding a putative protein of 64,254 daltons consisting of 572 amino acids. The deduced protein sequence was confirmed by limited N-terminal amino acid microsequencing of CNBr cleavage fragments of native HN that was purified by immunoprecipitation. The HN protein is moderately hydrophobic and has four potential sites (Asn-X-Ser/Thr) of N-glycosylation in the C-terminal half of the molecule. It is devoid of both the N-terminal signal sequence and the C-terminal membrane anchorage domain characteristic of the hemagglutinin of influenza virus and the fusion (F0) protein of the paramyxoviruses. Instead, it has a single prominent hydrophobic region capable of membrane insertion beginning at 32 residues from the N terminus. This N-terminal membrane insertion is similar to that of influenza virus neuraminidase and the recently reported structures of HN proteins of Sendai virus and simian virus 5. PMID:3003381
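
    Potential N-glycosylation sites of the Asn-X-Ser/Thr form, as counted in this abstract, can be located in a one-letter protein sequence with a short scan. The sketch below is illustrative only: `find_sequons` and its `exclude_proline` flag are hypothetical names, the example peptide is made up, and the option to skip proline at the X position is a commonly used refinement that the abstract itself does not mention.

```python
import re


def find_sequons(protein, exclude_proline=False):
    """Return 0-based positions of Asn residues that start an Asn-X-Ser/Thr motif.

    A lookahead is used so that overlapping motifs (e.g. N-N-T-S) are all found.
    """
    x = "[^P]" if exclude_proline else "."
    pattern = re.compile("N(?=" + x + "[ST])")
    return [m.start() for m in pattern.finditer(protein.upper())]


if __name__ == "__main__":
    peptide = "NGSANPTNNTS"  # invented test peptide, not a real protein
    print(find_sequons(peptide))
    print(find_sequons(peptide, exclude_proline=True))
```

    Counting the returned positions gives the number of candidate sites; whether a given site is actually glycosylated cannot be decided from sequence alone.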

  1. The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003.

    PubMed

    Boeckmann, Brigitte; Bairoch, Amos; Apweiler, Rolf; Blatter, Marie-Claude; Estreicher, Anne; Gasteiger, Elisabeth; Martin, Maria J; Michoud, Karine; O'Donovan, Claire; Phan, Isabelle; Pilbout, Sandrine; Schneider, Michel

    2003-01-01

    The SWISS-PROT protein knowledgebase (http://www.expasy.org/sprot/ and http://www.ebi.ac.uk/swissprot/) connects amino acid sequences with the current knowledge in the Life Sciences. Each protein entry provides an interdisciplinary overview of relevant information by bringing together experimental results, computed features and sometimes even contradictory conclusions. Detailed expertise that goes beyond the scope of SWISS-PROT is made available via direct links to specialised databases. SWISS-PROT provides annotated entries for all species, but concentrates on the annotation of entries from human (the HPI project) and other model organisms to ensure the presence of high quality annotation for representative members of all protein families. Part of the annotation can be transferred to other family members, as is already done for microbes by the High-quality Automated and Manual Annotation of microbial Proteomes (HAMAP) project. Protein families and groups of proteins are regularly reviewed to keep up with current scientific findings. Complementarily, TrEMBL strives to comprise all protein sequences that are not yet represented in SWISS-PROT, by incorporating a perpetually increasing level of mostly automated annotation. Researchers are welcome to contribute their knowledge to the scientific community by submitting relevant findings to SWISS-PROT at swiss-prot@expasy.org. PMID:12520024

  2. The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003

    PubMed Central

    Boeckmann, Brigitte; Bairoch, Amos; Apweiler, Rolf; Blatter, Marie-Claude; Estreicher, Anne; Gasteiger, Elisabeth; Martin, Maria J.; Michoud, Karine; O'Donovan, Claire; Phan, Isabelle; Pilbout, Sandrine; Schneider, Michel

    2003-01-01

    The SWISS-PROT protein knowledgebase (http://www.expasy.org/sprot/ and http://www.ebi.ac.uk/swissprot/) connects amino acid sequences with the current knowledge in the Life Sciences. Each protein entry provides an interdisciplinary overview of relevant information by bringing together experimental results, computed features and sometimes even contradictory conclusions. Detailed expertise that goes beyond the scope of SWISS-PROT is made available via direct links to specialised databases. SWISS-PROT provides annotated entries for all species, but concentrates on the annotation of entries from human (the HPI project) and other model organisms to ensure the presence of high quality annotation for representative members of all protein families. Part of the annotation can be transferred to other family members, as is already done for microbes by the High-quality Automated and Manual Annotation of microbial Proteomes (HAMAP) project. Protein families and groups of proteins are regularly reviewed to keep up with current scientific findings. Complementarily, TrEMBL strives to comprise all protein sequences that are not yet represented in SWISS-PROT, by incorporating a perpetually increasing level of mostly automated annotation. Researchers are welcome to contribute their knowledge to the scientific community by submitting relevant findings to SWISS-PROT at swiss-prot@expasy.org. PMID:12520024

  3. Cloning of human transketolase cDNAs and comparison of the nucleotide sequence of the coding region in Wernicke-Korsakoff and non-Wernicke-Korsakoff individuals.

    PubMed

    McCool, B A; Plonk, S G; Martin, P R; Singleton, C K

    1993-01-15

    Variants of the enzyme transketolase which possess reduced affinity for its cofactor thiamine pyrophosphate (high apparent Km) have been described in chronic alcoholic patients with Wernicke-Korsakoff syndrome. Since the syndrome has been shown to be directly related to thiamine deficiency, it has been hypothesized that such transketolase variants may represent a genetic predisposition to the development of this syndrome. To test this hypothesis, human transketolase cDNA clones were isolated, and their nucleotide and predicted amino acid sequence were determined. Transketolase was found to be a single copy gene which produces a single mRNA of approximately 2100 nucleotides. Additionally, the nucleotide sequence of the transketolase coding region in fibroblasts derived from two Wernicke-Korsakoff (WK) patients was compared to that of two nonalcoholic controls. Although nucleotide and predicted amino acid differences were detected between fibroblast cultures and the original cDNAs and among the cultures themselves, no specific nucleotide variations, which would encode a variant amino acid sequence, were associated exclusively with the coding region from WK patients. Thus, allelic variants of the transketolase gene cannot account for the biochemically distinct forms of the enzyme found in these patients nor be considered as a mechanism for genetic predisposition to the development of Wernicke-Korsakoff syndrome. Instead, the underlying mechanism must be extragenic and may be a result of differences in post-translational processing/modification of the transketolase polypeptide. PMID:8419340

  4. Development and characterization of new single nucleotide polymorphism markers from expressed sequence tags in common carp (Cyprinus carpio).

    PubMed

    Zhu, Chuankun; Cheng, Lei; Tong, Jingou; Yu, Xiaomu

    2012-01-01

    The common carp (Cyprinus carpio) is an important aquaculture fish worldwide but only limited single nucleotide polymorphism (SNP) markers are characterized from expressed sequence tags (ESTs) in this species. In this study, 1487 putative SNPs were bioinformatically mined from 14,066 online ESTs mainly from the European common carp, with the occurrence rate of about one SNP every 173 bp. One hundred and twenty-one of these SNPs were selected for validation using PCR fragment sequencing, and 48 out of 81 primers could amplify the expected fragments in the Chinese common carp genome. Only 26 (21.5%) putative SNPs were validated; however, 508 new SNPs and 68 indels were identified. The ratios of transitions to transversions were 1.77 for exon SNPs and 1.05 for intron SNPs. All the 23 SNPs selected for population tests were polymorphic, with the observed heterozygosity (Ho) ranging from 0.053 to 0.526 (mean 0.262), polymorphism information content (PIC) from 0.095 to 0.357 (mean 0.246), and 21 SNPs were in Hardy-Weinberg equilibrium. These results suggest that different common carp populations with geographic isolation have significant genetic variation at the SNP level, and these new EST-SNP markers are readily available for genetics and breeding studies in common carp. PMID:22837697
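
    The summary statistics quoted in this abstract follow standard definitions, sketched below. The function names are hypothetical and the substitution list in the usage lines is invented; the study's own pipeline is not described here. The PIC formula is the usual one for a biallelic marker, which peaks at 0.375 when both alleles have frequency 0.5, consistent with the reported maximum of 0.357.

```python
# Transitions are purine<->purine (A/G) or pyrimidine<->pyrimidine (C/T) changes.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}


def ts_tv_ratio(substitutions):
    """Ratio of transitions to transversions for (ref, alt) base pairs.

    Assumes every pair is a real substitution (ref != alt) and that at
    least one transversion is present.
    """
    ts = sum(1 for ref, alt in substitutions if frozenset((ref, alt)) in TRANSITIONS)
    tv = len(substitutions) - ts
    return ts / tv


def pic_biallelic(p):
    """Polymorphism information content for a SNP with allele frequencies p and 1 - p."""
    q = 1.0 - p
    return 1.0 - (p * p + q * q) - 2.0 * p * p * q * q


if __name__ == "__main__":
    # Validation rate from the abstract: 26 validated out of 121 tested.
    print(round(100.0 * 26 / 121, 1))
    # Invented example: two transitions and one transversion.
    print(ts_tv_ratio([("A", "G"), ("C", "T"), ("A", "C")]))
    print(pic_biallelic(0.5))
```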

  5. Complete nucleotide sequence of a gene conferring polymyxin B resistance on yeast: similarity of the predicted polypeptide to protein kinases.

    PubMed

    Boguslawski, G; Polazzi, J O

    1987-08-01

    Polymyxin B is an antibiotic that kills sensitive cells by disrupting their membranes. We have cloned a wild-type yeast gene that, when present on a high-copy-number plasmid, renders the cells resistant to the drug. The nucleotide sequence of this gene is presented. A single open reading frame within the sequence has the potential to encode a polypeptide (molecular mass of 77.5 kDa) that shows strong homologies to polypeptides of the protein kinase family. The gene, PBS2, located on chromosome X, is not allelic to the previously described PBS1 gene (where PBS signifies polymyxin B sensitivity). Although pbs1 mutations confer resistance to high levels of polymyxin B, double mutants, pbs1 pbs2, are not resistant to the drug, indicating that PBS2 is essential for pbs1 activity. Models based on the proposed protein kinase activity of the PBS2 gene product are presented to explain the interaction between PBS1 and PBS2 gene products involved in conferring polymyxin B resistance on yeast cells. PMID:3039511

  6. Complete nucleotide sequences of mitochondrial genomes of two solitary entoprocts, Loxocorone allax and Loxosomella aloxiata: implications for lophotrochozoan phylogeny.

    PubMed

    Yokobori, Shin-ichi; Iseto, Tohru; Asakawa, Shuichi; Sasaki, Takashi; Shimizu, Nobuyoshi; Yamagishi, Akihiko; Oshima, Tairo; Hirose, Euichi

    2008-05-01

    The complete nucleotide sequences of the mitochondrial (mt) genomes of the entoprocts Loxocorone allax and Loxosomella aloxiata were determined. Both species carry the typical gene set of metazoan mt genomes and have similar organizations of their mt genes. However, they show differences in the positions of two tRNA(Leu) genes. Additionally, the tRNA(Val) gene, and half of the long non-coding region, is duplicated and inverted in the Loxos. aloxiata mt genome. The initiation codon of the Loxos. aloxiata cytochrome oxidase subunit I gene is expected to be ACG rather than AUG. The mt gene organizations in these two entoproct species most closely resemble those of mollusks such as Katharina tunicata and Octopus vulgaris, which have the most evolutionarily conserved mt gene organization reported to date in mollusks. Analyses of the mt gene organization in the lophotrochozoan phyla (Annelida, Brachiopoda, Echiura, Entoprocta, Mollusca, Nemertea, and Phoronida) suggested a close phylogenetic relationship between Brachiopoda, Annelida, and Echiura. However, Phoronida was excluded from this grouping. Molecular phylogenetic analyses based on the sequences of mt protein-coding genes suggested a possible close relationship between Entoprocta and Phoronida, and a close relationship among Brachiopoda, Annelida, and Echiura. PMID:18374604

  7. Nucleotide sequence and characterization of four additional genes of the hydrogenase structural operon from Rhizobium leguminosarum bv. viciae.

    PubMed Central

    Hidalgo, E; Palacios, J M; Murillo, J; Ruiz-Argüeso, T

    1992-01-01

    The nucleotide sequence of a 2.5-kbp region following the hydrogenase structural genes (hupSL) in the H2 uptake gene cluster from Rhizobium leguminosarum bv. viciae UPM791 was determined. Four closely linked genes encoding peptides of 27.9 (hupC), 22.1 (hupD), 19.0 (hupE), and 10.4 (hupF) kDa were identified immediately downstream of hupL. Proteins with comparable apparent molecular weights were detected by heterologous expression of these genes in Escherichia coli. The six genes, hupS to hupF, are arranged as an operon, and by mutant complementation analysis, it was shown that genes hupSLCD are cotranscribed. A transcription start site preceded by the -12 to -24 consensus sequence characteristic of NtrA-dependent promoters was identified upstream of hupS. On the basis of the lack of oxygen-dependent H2 uptake activity of a hupC::Tn5 mutant and on structural characteristics of the protein, we postulate that HupC is a b-type cytochrome involved in electron transfer from hydrogenase to oxygen. The product from hupE, which is needed for full hydrogenase activity, exhibited characteristics typical of a membrane protein. The features of HupC and HupE suggest that they form, together with the hydrogenase itself, a membrane-bound protein complex involved in hydrogen oxidation. PMID:1597428

  8. The R package otu2ot for implementing the entropy decomposition of nucleotide variation in sequence data

    PubMed Central

    Ramette, Alban; Buttigieg, Pier Luigi

    2014-01-01

    Oligotyping is a novel, supervised computational method that classifies closely related sequences into “oligotypes” (OTs) based on subtle nucleotide variation (Eren et al., 2013). Its application to microbial datasets has helped reveal ecological patterns which are often hidden by the way sequence data are currently clustered to define operational taxonomic units (OTUs). Here, we implemented the OT entropy decomposition procedure and its unsupervised version, Minimal Entropy Decomposition (MED; Eren et al., 2014c), in the statistical programming language and environment, R. The aim of this implementation is to facilitate the integration of computational routines, interactive statistical analyses, and visualization into a single framework. In addition, two complementary approaches are implemented: (1) An analytical method (the broken stick model) is proposed to help identify OTs of low abundance that could be generated by chance alone and (2) a one-pass profiling (OP) method, to efficiently identify those OTUs whose subsequent oligotyping would be most promising to be undertaken. These enhancements are especially useful for large datasets, where a manual screening of entropy analysis results and the creation of a full set of OTs may not be feasible. The package and procedures are illustrated by several tutorials and examples. PMID:25452747
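
    The core idea behind entropy-based oligotyping, scoring each alignment column by its Shannon entropy and labelling reads by their bases at the most variable columns, can be sketched briefly. This is an illustration in Python rather than the otu2ot R package's interface: the function names are hypothetical, entropy in bits is an assumption, and picking the top-scoring columns automatically is a simplification of the supervised and iterative procedures the abstract refers to.

```python
from collections import Counter
from math import log2


def column_entropies(aligned_reads):
    """Shannon entropy (in bits) of every column of equal-length aligned reads."""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")
    n = len(aligned_reads)
    entropies = []
    for column in zip(*aligned_reads):
        counts = Counter(column)
        # sum of p * log2(1/p), written with counts to avoid a negative zero
        entropies.append(sum((c / n) * log2(n / c) for c in counts.values()))
    return entropies


def oligotypes(aligned_reads, n_positions):
    """Count reads per oligotype, defined by the n highest-entropy columns."""
    ent = column_entropies(aligned_reads)
    ranked = sorted(range(len(ent)), key=lambda i: ent[i], reverse=True)
    chosen = sorted(ranked[:n_positions])
    return Counter("".join(read[i] for i in chosen) for read in aligned_reads)


if __name__ == "__main__":
    reads = ["ACGT", "ACGA", "ACTA", "ACTT"]  # invented toy alignment
    print(column_entropies(reads))
    print(oligotypes(reads, 2))
```

    In the toy alignment the first two columns are invariant and score zero, so only the last two columns contribute to the oligotype labels.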

  9. Comparison of the nucleotide sequences of wheat dwarf virus (WDV) isolates from Hungary and Ukraine.

    PubMed

    Tóbiás, Istvan; Shevchenko, Oleksiy; Kiss, Balázs; Bysov, Andriy; Snihur, Halina; Polischuk, Valery; Salánki, Katalin; Palkovics, László

    2011-01-01

    Wheat dwarf virus (WDV) is the most ubiquitous virus in cereals, causing huge losses in both Hungary and Ukraine. The presence of barley- and wheat-adapted strains has been confirmed, suggesting that the barley strain is restricted to barley, while the wheat strain is present in both wheat and barley plants. Five WDV isolates from wheat plants sampled in Hungary and Ukraine were sequenced and compared with known WDV isolates from GenBank. Four WDV isolates belonged to the wheat strain. Our results indicate that WDV-Odessa is an isolate of special interest since it originated from wheat but belongs to the barley-adapted strain, providing novel data on WDV biology and raising issues of pathogen epidemiology. PMID:21905629

  10. Complete nucleotide sequence of plasmid plca36 isolated from Lactobacillus casei Zhang.

    PubMed

    Zhang, Wenyi; Yu, Dongliang; Sun, Zhihong; Chen, Xia; Bao, Qiuhua; Meng, He; Hu, Songnian; Zhang, Heping

    2008-09-01

    The complete 36,487 bp sequence of plasmid plca36 from Lactobacillus casei Zhang was determined. Plca36 contains 44 predicted coding regions, and functions could be assigned to 23 of them. For the first time, we identified a relBE toxin-antitoxin (TA) locus in the genus Lactobacillus, perhaps indicating a potential role for plca36 in host survival under extreme nutritional stress. A region encoding a cluster of conjugation genes (tra) was also identified. The cluster showed high similarity and co-linearity with tra regions of pWCFS103 and pMRC01 from Lactobacillus plantarum and Lactococcus lactis, respectively. Comparative gene analysis revealed that plasmids from the genus Lactobacillus may have contributed to environmental adaptation mainly by providing carbohydrate and amino acid transporters. In addition, two chromosome-encoded relBE systems in Lactobacillus johnsonii and Lactobacillus gasseri were identified. PMID:18634821

  11. Longitudinal study of a heteroplasmic 3460 Leber hereditary optic neuropathy family by multiplexed primer-extension analysis and nucleotide sequencing

    SciTech Connect

    Ghosh, S.S.; Fahy, E.; Bodis-Wollner, I.

    1996-02-01

    Nucleotide-sequencing and multiplexed primer-extension assays have been used to quantitate the mutant-allele frequency in 14 maternal relatives, spanning three generations, from a family that is heteroplasmic for the primary Leber hereditary optic neuropathy (LHON) mutation at nucleotide 3460 of the mitochondrial genome. There was excellent agreement between the values that were obtained with the two different methods. The longitudinal study shows that the mutant-allele frequency was constant within individual family members over a sampling period of 3.5 years. Second, although there was an overall increase in the mutant-allele frequency in successive generations, segregation in the direction of the mutant allele was not invariant, and there was one instance in which there was a significant decrease in the frequency from parent to offspring. From these two sets of results, and from previous studies of heteroplasmic LHON families, we conclude that there is no evidence for a marked selective pressure that determines the replication, segregation, or transmission of primary LHON mutations to white blood cells and platelets. Instead, the mtDNA molecules are most likely to replicate and segregate under conditions of random drift at the cellular level. Finally, the pattern of transmission in this maternal lineage is compatible with a developmental bottleneck model in which the number of mitochondrial units of segregation in the female germ line is relatively small in relation to the number of mtDNA molecules within a cell. However, this is not an invariant pattern for humans, and simple models of mitochondrial gene transmission are inappropriate at the present time. 37 refs., 4 figs., 1 tab.

  12. Demonstration of biological activity and nucleotide sequence of an in vitro synthesized clone of the Moloney murine sarcoma virus mos gene.

    PubMed Central

    Donoghue, D J

    1982-01-01

    A clone of the Moloney murine sarcoma virus mos gene derived by in vitro reverse transcription was characterized. When assayed for focus formation by DNA transfection on NIH/3T3 cells, this clone was biologically inactive, presumably due to the absence of a long terminal repeat sequence. Therefore, a long terminal repeat was inserted into the clone by in vitro recombination, after which the mos gene was able to transform NIH/3T3 cells efficiently. The nucleotide sequence encompassing the transforming region of this clone was determined. A single long open reading frame was observed, which potentially encodes a polypeptide of 41,000 daltons. This open reading frame initiates with the first five amino acids of the murine leukemia virus env gene, after which it enters the mos sequence, where it terminates. The nucleotide sequence described in this paper was compared with other sequences of mos in an effort to resolve discrepancies in the position of the long open reading frame. Although Moloney murine sarcoma virus retains the 3' splicing site of the murine leukemia virus env gene, a mos-specific mRNA which corresponds structurally to the murine leukemia virus env mRNA was not identified. The sequence described here revealed a single nucleotide change in the proposed env gene 3' splicing site which was retained in Moloney murine sarcoma virus. This deviation from the consensus 3' splicing sequence may underlie the observed absence of mos expression via the env gene splicing pathway. PMID:7045395

  13. Nucleotide and protein sequences for dog masticatory tropomyosin identify a novel Tpm4 gene product.

    PubMed

    Brundage, Elizabeth A; Biesiadecki, Brandon J; Reiser, Peter J

    2015-10-01

    Jaw-closing muscles of several vertebrate species, including members of Carnivora, express a unique, "masticatory", isoform of myosin heavy chain, along with isoforms of other myofibrillar proteins that are not expressed in most other muscles. It is generally believed that the complement of myofibrillar isoforms in these muscles serves high force generation for capturing live prey, breaking down tough plant material and defensive biting. A unique isoform of tropomyosin (Tpm) was reported to be expressed in cat jaw-closing muscle, based upon two-dimensional gel mobility, peptide mapping, and immunohistochemistry. The objective of this study was to obtain protein and gene sequence information for this unique Tpm isoform. Samples of masseter (a jaw-closing muscle), tibialis (predominantly fast-twitch fibers), and the deep lateral gastrocnemius (predominantly slow-twitch fibers) were obtained from adult dogs. Expressed Tpm isoforms were cloned and sequencing yielded cDNAs that were identical to genomic predicted striated muscle Tpm1.1St(a,b,b,a) (historically referred to as αTpm), Tpm2.2St(a,b,b,a) (βTpm) and Tpm3.12St(a,b,b,a) (γTpm) isoforms (nomenclature reflects predominant tissue expression ("St"-striated muscle) and exon splicing pattern), as well as a novel 284 amino acid isoform observed in jaw-closing muscle that is identical to a genomic predicted product of the Tpm4 gene (δTpm) family. The novel isoform is designated as Tpm4.3St(a,b,b,a). The myofibrillar Tpm isoform expressed in dog masseter exhibits a unique electrophoretic mobility on gels containing 6 M urea, compared to other skeletal Tpm isoforms. To validate that the cloned Tpm4.3 isoform is the Tpm expressed in dog masseter, E. coli-expressed Tpm4.3 was electrophoresed in the presence of urea. Results demonstrate that Tpm4.3 has identical electrophoretic mobility to the unique dog masseter Tpm isoform and is of different mobility from that of muscle Tpm1.1, Tpm2.2 and Tpm3.12 isoforms. We

  14. Nucleotide and protein sequences for dog masticatory tropomyosin identify a novel Tpm4 gene product

    PubMed Central

    Reiser, Peter J.

    2016-01-01

    Jaw-closing muscles of several vertebrate species, including members of Carnivora, express a unique, “masticatory”, isoform of myosin heavy chain, along with isoforms of other myofibrillar proteins that are not expressed in most other muscles. It is generally believed that the complement of myofibrillar isoforms in these muscles serves high force generation for capturing live prey, breaking down tough plant material and defensive biting. A unique isoform of tropomyosin (Tpm) was reported to be expressed in cat jaw-closing muscle, based upon two-dimensional gel mobility, peptide mapping, and immunohistochemistry. The objective of this study was to obtain protein and gene sequence information for this unique Tpm isoform. Samples of masseter (also a jaw-closing muscle), tibialis (with predominantly fast-twitch fibers), and the deep lateral gastrocnemius (predominantly slow-twitch fibers) were obtained from adult dogs. Expressed Tpm isoforms were cloned and sequencing yielded cDNAs that were identical to genomic predicted striated muscle Tpm1.1St(a,b,b,a) (historically referred to as αTpm), Tpm2.2St(a,b,b,a) (βTpm) and Tpm3.12St(a,b,b,a) (γTpm) isoforms (nomenclature reflects predominant tissue expression (“St”-striated muscle) and exon splicing pattern), as well as a novel 284 amino acid isoform observed in jaw-closing muscle that is identical to a genomic predicted product of the Tpm4 gene (δTpm) family. The novel isoform is designated as Tpm4.3St(a,b,b,a). The myofibrillar Tpm isoform expressed in dog masseter exhibits a unique electrophoretic mobility on gels containing 6 M urea, compared to other skeletal Tpm isoforms. To validate that the cloned Tpm4.3 isoform is the Tpm expressed in dog masseter, E. coli-expressed Tpm4.3 was electrophoresed in the presence of urea. Results demonstrate that Tpm4.3 has identical electrophoretic mobility to the unique dog masseter Tpm isoform and is of different mobility from that of muscle Tpm1.1, Tpm2.2 and Tpm3

  15. Nucleotide sequence analysis and seroreactivities of the 65K heat shock protein from Mycobacterium paratuberculosis.

    PubMed Central

    el-Zaatari, F A; Naser, S A; Engstrand, L; Burch, P E; Hachem, C Y; Whipple, D L; Graham, D Y

    1995-01-01

    Mycobacterium paratuberculosis is the causative agent of Johne's disease, a chronic enteritis in ruminants. It has also been implicated as a possible cause of Crohn's disease, an inflammatory bowel disease of unknown etiology. The mycobacterial 65K heat shock proteins (hsp-65K) are among the most extensively studied mycobacterial proteins, and their immunogenic characteristics have been suggested to be the basis for autoimmunization in chronic inflammatory diseases. In this context, we isolated and sequenced the hsp-65K-encoding gene from our M. paratuberculosis PTB65K genomic library. A high degree of identity was found between the open reading frame (ORF) of the PTB65K gene and those of Mycobacterium tuberculosis (89.6%), Mycobacterium leprae (86.6%), and Mycobacterium avium 18 (98.8%). The amino acid sequence alignment of the PTB65K protein with the hsp-65K homologs revealed that the M. tuberculosis and M. leprae proteins each differed by 36 amino acid residues and that the M. avium 18 protein differed by 8 residues. We also investigated the humoral immune responses of animals with Johne's disease and patients with Crohn's disease against the recombinant PTB65K antigen. Immunoblot analysis showed that sera from only 3 of 10 clinically ill and 5 of 25 subclinically ill cows reacted with PTB65K. In addition, sera from two of two sheep and one of two goats with clinical symptoms of Johne's disease also reacted with PTB65K; 0 samples from 10 normal cows reacted. In humans, sera from 7 of 13 patients with Crohn's disease, 3 of 4 with tuberculosis, 5 of 6 with leprosy, 5 of 12 with non-inflammatory bowel disease, and 0 of 4 with ulcerative colitis reacted with the recombinant PTB65K antigen. These results indicate that this PTB65K heat shock protein is uninformative when used for serodiagnosis of Johne's disease in animals. However, in humans, the high intensity of antibody reactions of some sera from Crohn's disease patients compared with that from noninflammatory

  16. Nucleotide sequence analysis of NIPBL gene in Indian Cornelia de Lange syndrome cases

    PubMed Central

    Bajaj, Shailesh; Ranade, Suvidya; Gambhir, Prakash

    2013-01-01

    BACKGROUND: Cornelia de Lange syndrome (CdLS) is a multisystem developmental disorder in children. The disorder is caused mainly by mutations in the gene encoding Nipped-B-like protein (NIPBL). Molecular data for CdLS are available from developed countries but not from developing countries such as India. In the present study, the hotspot region of the NIPBL gene, which includes exons 2, 22, and 42 and the largest exon, exon 10, was screened by polymerase chain reaction in six CdLS patients and ten controls. MATERIALS AND METHODS: Target exons were amplified by polymerase chain reaction, the amplicons were confirmed qualitatively by agarose gel electrophoresis, and the amplicons were then analysed by conformation-sensitive gel electrophoresis to detect heteroduplex formation, followed by sequencing. RESULTS: We report two polymorphisms in the studied region of NIPBL: a C/A polymorphism in intron 1 and a T/G polymorphism in exon 10. CONCLUSION: The intronic polymorphism may have a role in intron splicing, whereas the polymorphism in exon 10 results in an amino acid change (Val to Gly). These polymorphisms are disease associated, as they were found only in CdLS patients and not in controls. PMID:23901187

  17. The eSNV-detect: a computational system to identify expressed single nucleotide variants from transcriptome sequencing data.

    PubMed

    Tang, Xiaojia; Baheti, Saurabh; Shameer, Khader; Thompson, Kevin J; Wills, Quin; Niu, Nifang; Holcomb, Ilona N; Boutet, Stephane C; Ramakrishnan, Ramesh; Kachergus, Jennifer M; Kocher, Jean-Pierre A; Weinshilboum, Richard M; Wang, Liewei; Thompson, E Aubrey; Kalari, Krishna R

    2014-12-16

    Rapid development of next generation sequencing technology has enabled the identification of genomic alterations from short sequencing reads. There are a number of software pipelines available for calling single nucleotide variants from genomic DNA, but no comprehensive pipelines to identify, annotate and prioritize expressed SNVs (eSNVs) from non-directional paired-end RNA-Seq data. We have developed the eSNV-Detect, a novel computational system, which utilizes data from multiple aligners to call, even at low read depths, and rank variants from RNA-Seq. Multi-platform comparisons with the eSNV-Detect variant candidates were performed. The method was first applied to RNA-Seq from a lymphoblastoid cell-line, achieving 99.7% precision and 91.0% sensitivity in the expressed SNPs for the matching HumanOmni2.5 BeadChip data. Comparison of RNA-Seq eSNV candidates from 25 ER+ breast tumors from The Cancer Genome Atlas (TCGA) project with whole exome coding data showed 90.6-96.8% precision and 91.6-95.7% sensitivity. Contrasting single-cell mRNA-Seq variants with matching traditional multicellular RNA-Seq data for the MD-MB231 breast cancer cell-line delineated variant heterogeneity among the single-cells. Further, Sanger sequencing validation was performed for an ER+ breast tumor with paired normal adjacent tissue validating 29 out of 31 candidate eSNVs. The source code and user manuals of the eSNV-Detect pipeline for Sun Grid Engine and virtual machine are available at http://bioinformaticstools.mayo.edu/research/esnv-detect/. PMID:25352556
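
    Precision and sensitivity figures of the kind quoted in this abstract come from comparing a set of called variants against a reference ("truth") set. The Python sketch below shows that arithmetic under stated assumptions; it is not code from the eSNV-Detect pipeline. The function name `precision_and_sensitivity` and the toy variant tuples are hypothetical, and only the 29-of-31 Sanger validation count is taken from the abstract.

```python
def precision_and_sensitivity(called, truth):
    """Compare called variants with a truth set.

    precision   = true positives / all called variants
    sensitivity = true positives / all truth variants
    """
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else float("nan")
    sensitivity = true_positives / len(truth) if truth else float("nan")
    return precision, sensitivity


# Hypothetical toy variants as (chromosome, position, ref, alt) tuples.
truth = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
         ("chr2", 40, "G", "A"), ("chr2", 90, "T", "C")}
called = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
          ("chr2", 40, "G", "A"), ("chr3", 7, "A", "T")}
toy_precision, toy_sensitivity = precision_and_sensitivity(called, truth)

# Validation rate implied by the abstract: 29 of 31 candidates confirmed.
sanger_validation_rate = 29 / 31
print(toy_precision, toy_sensitivity, round(100 * sanger_validation_rate, 1))
```

    Representing each variant as a tuple makes the comparison a plain set intersection; a real comparison would also need to normalise variant representation between platforms, which this sketch leaves out.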

  18. A survey of endogenous retrovirus (ERV) sequences in the vicinity of multiple sclerosis (MS)-associated single nucleotide polymorphisms (SNPs).

    PubMed

    Brütting, Christine; Emmer, Alexander; Kornhuber, Malte; Staege, Martin S

    2016-08-01

    Although multiple sclerosis (MS) is one of the most common central nervous system diseases in young adults, little is known about its etiology. Several human endogenous retroviruses (ERVs) are considered to play a role in MS. We are interested in which ERVs can be identified in the vicinity of MS-associated genetic markers, in order to find potential initiators of MS. We analysed the chromosomal regions surrounding 58 single nucleotide polymorphisms (SNPs) that are associated with MS and were identified in one of the most recent major genome-wide association studies. We scanned these regions for putative endogenous retrovirus sequences with large open reading frames (ORFs). We observed that more retrovirus-related putative ORFs exist in the relatively close vicinity of SNP marker indices in multiple sclerosis compared to control SNPs. We found very high homologies to HERV-K, HCML-ARV, XMRV, Galidia ERV, HERV-H/env62 and XMRV-like mouse endogenous retrovirus mERV-XL. The associated genes (CYP27B1, CD6, CD58, MPV17L2, IL12RB1, CXCR5, PTGER4, TAGAP, TYK2, ICAM3, CD86, GALC, GPR65 as well as the HLA DRB1*1501) are mainly involved in the immune system, but also in vitamin D regulation. The most frequently detected ERV sequences are related to the multiple sclerosis-associated retrovirus, the human immunodeficiency virus 1, HERV-K, and the Simian foamy virus. Our data show that there is a relation between MS-associated SNPs and the number of retroviral elements compared to control. Our data identify new ERV sequences that have not so far been associated with MS. PMID:27169423

  19. Nucleotide sequence of the coat protein genes of alstroemeria mosaic virus and amazon lily mosaic virus, a tentative species of genus potyvirus.

    PubMed

    Fuji, S; Terami, F; Furuya, H; Naito, H; Fukumoto, F

    2004-09-01

    The nucleotide sequences of the 3' terminal region of the genomes of Alstroemeria mosaic virus (AlsMV) and the Amazon lily mosaic virus (ALiMV) have been determined. These sequences contain the complete coding region of the viral coat protein (CP) gene followed by a 3'-untranslated region (3'-UTR). AlsMV and ALiMV share 74.9% identity in the amino acid sequence of the CP, and 55.6% identity in the nucleotide sequence of the 3'-UTR. Phylogenetic analysis of these CP genes and 3'-UTRs in relation to those of 79 potyvirus species revealed that AlsMV and ALiMV should be assigned to the Potato virus Y (PVY) subgroup. AlsMV and ALiMV were concluded to have arisen independently within the PVY subgroup. PMID:15593424

  20. Neuropeptidergic Signaling in the American Lobster Homarus americanus: New Insights from High-Throughput Nucleotide Sequencing.

    PubMed

    Christie, Andrew E; Chi, Megan; Lameyer, Tess J; Pascual, Micah G; Shea, Devlin N; Stanhope, Meredith E; Schulz, David J; Dickinson, Patsy S

    2015-01-01

    Peptides are the largest and most diverse class of molecules used for neurochemical communication, playing key roles in the control of essentially all aspects of physiology and behavior. The American lobster, Homarus americanus, is a crustacean of commercial and biomedical importance; lobster growth and reproduction are under neuropeptidergic control, and portions of the lobster nervous system serve as models for understanding the general principles underlying rhythmic motor behavior (including peptidergic neuromodulation). While a number of neuropeptides have been identified from H. americanus, and the effects of some have been investigated at the cellular/systems levels, little is currently known about the molecular components of neuropeptidergic signaling in the lobster. Here, a H. americanus neural transcriptome was generated and mined for sequences encoding putative peptide precursors and receptors; 35 precursor- and 41 receptor-encoding transcripts were identified. We predicted 194 distinct neuropeptides from the deduced precursor proteins, including members of the adipokinetic hormone-corazonin-like peptide, allatostatin A, allatostatin C, bursicon, CCHamide, corazonin, crustacean cardioactive peptide, crustacean hyperglycemic hormone (CHH), CHH precursor-related peptide, diuretic hormone 31, diuretic hormone 44, eclosion hormone, FLRFamide, GSEFLamide, insulin-like peptide, intocin, leucokinin, myosuppressin, neuroparsin, neuropeptide F, orcokinin, pigment dispersing hormone, proctolin, pyrokinin, SIFamide, sulfakinin and tachykinin-related peptide families. While some of the predicted peptides are known H. americanus isoforms, most are novel identifications, more than doubling the extant lobster neuropeptidome. The deduced receptor proteins are the first descriptions of H. americanus neuropeptide receptors, and include ones for most of the peptide groups mentioned earlier, as well as those for ecdysis-triggering hormone, red pigment concentrating hormone

  1. Neuropeptidergic Signaling in the American Lobster Homarus americanus: New Insights from High-Throughput Nucleotide Sequencing

    PubMed Central

    Christie, Andrew E.; Chi, Megan; Lameyer, Tess J.; Pascual, Micah G.; Shea, Devlin N.; Stanhope, Meredith E.; Schulz, David J.; Dickinson, Patsy S.

    2015-01-01

    Peptides are the largest and most diverse class of molecules used for neurochemical communication, playing key roles in the control of essentially all aspects of physiology and behavior. The American lobster, Homarus americanus, is a crustacean of commercial and biomedical importance; lobster growth and reproduction are under neuropeptidergic control, and portions of the lobster nervous system serve as models for understanding the general principles underlying rhythmic motor behavior (including peptidergic neuromodulation). While a number of neuropeptides have been identified from H. americanus, and the effects of some have been investigated at the cellular/systems levels, little is currently known about the molecular components of neuropeptidergic signaling in the lobster. Here, a H. americanus neural transcriptome was generated and mined for sequences encoding putative peptide precursors and receptors; 35 precursor- and 41 receptor-encoding transcripts were identified. We predicted 194 distinct neuropeptides from the deduced precursor proteins, including members of the adipokinetic hormone-corazonin-like peptide, allatostatin A, allatostatin C, bursicon, CCHamide, corazonin, crustacean cardioactive peptide, crustacean hyperglycemic hormone (CHH), CHH precursor-related peptide, diuretic hormone 31, diuretic hormone 44, eclosion hormone, FLRFamide, GSEFLamide, insulin-like peptide, intocin, leucokinin, myosuppressin, neuroparsin, neuropeptide F, orcokinin, pigment dispersing hormone, proctolin, pyrokinin, SIFamide, sulfakinin and tachykinin-related peptide families. While some of the predicted peptides are known H. americanus isoforms, most are novel identifications, more than doubling the extant lobster neuropeptidome. The deduced receptor proteins are the first descriptions of H. americanus neuropeptide receptors, and include ones for most of the peptide groups mentioned earlier, as well as those for ecdysis-triggering hormone, red pigment concentrating hormone

  2. A resource of single-nucleotide polymorphisms for rainbow trout generated by restriction-site associated DNA sequencing of doubled haploids

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Salmonid genomes are considered to be in a pseudo-tetraploid state as a result of an evolutionarily recent genome duplication event. This situation complicates single nucleotide polymorphism (SNP) discovery in rainbow trout as many putative SNPs are actually paralogous sequence variants (PSVs) and ...

  3. Development of Single Nucleotide Polymorphism markers in Theobroma cacao and comparison to Simple Sequence Repeat markers for genotyping of Cameroon clones.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Single Nucleotide Polymorphism (SNP) markers are increasingly being used in crop breeding programs, slowly replacing Simple Sequence Repeats (SSR) and other markers. SNPs provide many benefits over SSRs, including ease of analysis and unambiguous results across various platforms. We have identifie...

  4. Variation in the Nucleotide Sequence of Cottontail Rabbit Papillomavirus a and b Subtypes Affects Wart Regression and Malignant Transformation and Level of Viral Replication in Domestic Rabbits

    PubMed Central

    Salmon, Jérôme; Nonnenmacher, Mathieu; Cazé, Sandrine; Flamant, Patricia; Croissant, Odile; Orth, Gérard; Breitburd, Françoise

    2000-01-01

    We previously reported the partial characterization of two cottontail rabbit papillomavirus (CRPV) subtypes with strikingly divergent E6 and E7 oncoproteins. We report now the complete nucleotide sequences of these subtypes, referred to as CRPVa4 (7,868 nucleotides) and CRPVb (7,867 nucleotides). The CRPVa4 and CRPVb genomes differed at 238 (3%) nucleotide positions, whereas CRPVa4 and the prototype CRPV differed by only 5 nucleotides. The most variable region (7% nucleotide divergence) included the long regulatory region (LRR) and the E6 and E7 genes. A mutation in the stop codon resulted in an 8-amino-acid-longer CRPVb E4 protein, and a nucleotide deletion reduced the coding capacity of the E5 gene from 101 to 25 amino acids. In domestic rabbits homozygous for a specific haplotype of the DRA and DQA genes of the major histocompatibility complex, warts induced by CRPVb DNA or a chimeric genome containing the CRPVb LRR/E6/E7 region showed an early regression, whereas warts induced by CRPVa4 or a chimeric genome containing the CRPVa4 LRR/E6/E7 region persisted and evolved into carcinomas. In contrast, most CRPVa, CRPVb, and chimeric CRPV DNA-induced warts showed no early regression in rabbits homozygous for another DRA-DQA haplotype. Little, if any, viral replication is usually observed in domestic rabbit warts. When warts induced by CRPVa and CRPVb virions and DNA were compared, the number of cells positive for viral DNA or capsid antigens was found to be greater by 1 order of magnitude for specimens induced by CRPVb. Thus, both sequence variation in the LRR/E6/E7 region and the genetic constitution of the host influence the expression of the oncogenic potential of CRPV. Furthermore, intratype variation may overcome to some extent the host restriction of CRPV replication in domestic rabbits. PMID:11044121

  5. LISTA, LISTA-HOP and LISTA-HON: a comprehensive compilation of protein encoding sequences and its associated homology databases from the yeast Saccharomyces.

    PubMed Central

    Dölz, R; Mossé, M O; Slonimski, P P; Bairoch, A; Linder, P

    1994-01-01

    We continued our effort to build a comprehensive database (LISTA) for the yeast Saccharomyces cerevisiae. In this database each sequence has been assigned a single genetic name. In the case of duplicated sequences, a simple method has been applied to distinguish sequences of one and the same gene from non-allelic sequences of duplicated genes. If necessary, synonyms are given in the case of allelic duplicated sequences. Thus sequences can be found either by the name or by synonyms given in LISTA. Each entry contains the genetic name, the mnemonic from the EMBL data bank, the codon bias, the reference of the publication of the sequence, the chromosomal location as far as known, and the SWISS-PROT and EMBL accession numbers. To obtain more information on the included sequences, each entry has been screened against non-redundant nucleotide and protein data bank collections, resulting in LISTA-HON and LISTA-HOP. The LISTA database can be linked to the associated data sets or to nucleotide and protein banks by the Sequence Retrieval System (SRS). PMID:7937046

  6. Complete nucleotide sequence and genome organization of an endornavirus from bottle gourd (Lagenaria siceraria) in California, U.S.A.

    PubMed

    Kwon, Sun-Jung; Tan, Shih-Hua; Vidalakis, Georgios

    2014-08-01

    The full-length nucleotide sequence and genome organization of an Endornavirus isolated from ornamental hard shell bottle gourd plants (Lagenaria siceraria (Molina) Standl.) in California (CA), USA tentatively named L. siceraria endornavirus-California (LsEV-CA) was determined. The LsEV-CA genome was 15088 bp in length, with a G + C content of 36.55 %. The lengths of the 5' and 3' untranslated regions were 111 and 52 bp, respectively. The genome of LsEV-CA contained one large ORF encoding a 576 kDa polyprotein. The predicted protein contains two glycosyltransferase motifs, as well as RNA-dependent RNA polymerase and helicase domains. LsEV-CA was detected in healthy-looking field-grown gourd plants, as well as plants expressing yellows symptoms. It was also detected in non-symptomatic greenhouse-grown gourd seedlings grown from seed obtained from the same field sites. These preliminary data indicate that LsEV-CA is likely not associated with the gourd-yellows syndrome observed in the field. PMID:24818693
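
    The genome statistics reported in this abstract (G + C content, untranslated region lengths, a single large ORF) are simple to compute once a sequence is in hand. The Python sketch below is illustrative only: the function name `gc_content` is an assumption, no real LsEV-CA sequence is used, and the coding-length arithmetic assumes that the single ORF occupies everything between the two untranslated regions, stop codon included.

```python
def gc_content(sequence):
    """Percentage of G and C among the unambiguous bases of a sequence."""
    sequence = sequence.upper()
    gc = sum(sequence.count(base) for base in "GC")
    counted = sum(sequence.count(base) for base in "ACGTU")
    return 100.0 * gc / counted if counted else 0.0


# Lengths reported in the abstract (nucleotides).
genome_length = 15088
utr5_length = 111
utr3_length = 52

# Assumption: the ORF spans the whole region between the two UTRs.
coding_length = genome_length - utr5_length - utr3_length
codons = coding_length // 3  # includes the stop codon under this assumption

print(gc_content("ATGCATTTAA"), coding_length, coding_length % 3, codons)
```

    Under that assumption the coding region is a whole number of codons, which is consistent with the single-ORF organisation the abstract describes.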

  7. Genome sequence of Perigonia lusca single nucleopolyhedrovirus: insights into the evolution of a nucleotide metabolism enzyme in the family Baculoviridae.

    PubMed

    Ardisson-Araújo, Daniel M P; Lima, Rayane Nunes; Melo, Fernando L; Clem, Rollie J; Huang, Ning; Báo, Sônia Nair; Sosa-Gómez, Daniel R; Ribeiro, Bergmann M

    2016-01-01

    The genome of a novel group II alphabaculovirus, Perigonia lusca single nucleopolyhedrovirus (PeluSNPV), was sequenced and shown to contain 132,831 bp with 145 putative ORFs (open reading frames) of at least 50 amino acids. An interesting feature of this novel genome was the presence of a putative nucleotide metabolism enzyme-encoding gene (pelu112). The pelu112 gene was predicted to encode a fusion of thymidylate kinase (tmk) and dUTP diphosphatase (dut). Phylogenetic analysis indicated that baculoviruses have independently acquired tmk and dut several times during their evolution. Two homologs of the tmk-dut fusion gene were separately introduced into the Autographa californica multiple nucleopolyhedrovirus (AcMNPV) genome, which lacks tmk and dut. The recombinant baculoviruses produced viral DNA, virus progeny, and some viral proteins earlier during in vitro infection and the yields of viral occlusion bodies were increased 2.5-fold when compared to the parental virus. Interestingly, both enzymes appear to retain their active sites, based on separate modeling using previously solved crystal structures. We suggest that the retention of these tmk-dut fusion genes by certain baculoviruses could be related to accelerating virus replication and to protecting the virus genome from deleterious mutation. PMID:27273152

  8. Empirical Comparison of Simple Sequence Repeats and Single Nucleotide Polymorphisms in Assessment of Maize Diversity and Relatedness

    PubMed Central

    Hamblin, Martha T.; Warburton, Marilyn L.; Buckler, Edward S.

    2007-01-01

    While Simple Sequence Repeats (SSRs) are extremely useful genetic markers, recent advances in technology have produced a shift toward use of single nucleotide polymorphisms (SNPs). The different mutational properties of these two classes of markers result in differences in heterozygosities and allele frequencies that may have implications for their use in assessing relatedness and evaluation of genetic diversity. We compared analyses based on 89 SSRs (primarily dinucleotide repeats) to analyses based on 847 SNPs in individuals from the same 259 inbred maize lines, which had been chosen to represent the diversity available among current and historic lines used in breeding. The SSRs performed better at clustering germplasm into populations than did a set of 847 SNPs or 554 SNP haplotypes, and SSRs provided more resolution in measuring genetic distance based on allele-sharing. Except for closely related pairs of individuals, measures of distance based on SSRs were only weakly correlated with measures of distance based on SNPs. Our results suggest that 1) large numbers of SNP loci will be required to replace highly polymorphic SSRs in studies of diversity and relatedness and 2) relatedness among highly-diverged maize lines is difficult to measure accurately regardless of the marker system. PMID:18159250

  9. Molecular gene cloning and nucleotide sequencing and construction of an aroA mutant of Pasteurella haemolytica serotype A1.

    PubMed Central

    Tatum, F M; Briggs, R E; Halling, S M

    1994-01-01

    The aroA gene of Pasteurella haemolytica serotype A1 was cloned by complementation of the aroA mutation in Escherichia coli K-12 strain AB2829. The nucleotide sequence of a 2.2-kb fragment encoding aroA predicted an open reading frame product 434 amino acids long that shows homology to other bacterial AroA proteins. Several strategies to inactivate aroA were unsuccessful. Gene replacement was finally achieved by constructing a replacement plasmid with aroA inactivated by insertion of a P. haemolytica ampicillin resistance fragment into a unique NdeI site in aroA. A hybrid plasmid was constructed by joining the aroA replacement plasmid with a 4.2-kb P. haemolytica plasmid which encodes streptomycin resistance. Following PhaI methylation, the replacement plasmid was introduced by electroporation into P. haemolytica NADC-D60, a plasmidless strain of serotype A1. Allelic exchange between the replacement plasmid and the chromosome of P. haemolytica gave rise to an ampicillin-resistant mutant which grew on chemically defined P. haemolytica medium supplemented with aromatic amino acids but failed to grow on the same medium lacking tryptophan. Southern blot analysis confirmed that aroA of the mutant was inactivated and that the mutant was without a plasmid. PMID:8031095

  10. The complete nucleotide sequence of the hornwort (Anthoceros formosae) chloroplast genome: insight into the earliest land plants.

    PubMed

    Kugita, Masanori; Kaneko, Akira; Yamamoto, Yuhei; Takeya, Yuko; Matsumoto, Tohoru; Yoshinaga, Koichi

    2003-01-15

    It is generally believed that bryophytes are the earliest land plants. However, the phylogenetic relationships among bryophytes, including mosses, liverworts and hornworts, are not clearly resolved. To obtain more information on the earliest land plants, we determined the complete nucleotide sequence of the chloroplast genome from the hornwort Anthoceros formosae. The circular double-stranded DNA of 161,162 bp is the largest genome ever reported among land plant chloroplasts. It contains 76 protein, 32 tRNA and 4 rRNA genes and 10 open reading frames (ORFs), which are identical with the chloroplast genome of the other green plants analyzed. The major difference is a larger inverted repeat than that of the liverwort Marchantia; Anthoceros contains an excess of ndhB and rps7 genes and the 3' exon of rps12. The genes matK and rps15, commonly found in the chloroplast genomes of land plants, are pseudogenes. The intron of rrn23 is the first finding in the known chloroplast genomes of land plants. A striking feature of the hornwort chloroplast is that more than half of the protein-coding genes have nonsense codons, which are converted into sense codons by RNA editing. Maximum-likelihood (ML) analysis, based on 11,518 amino acid sites of 52 proteins encoded in the chloroplast genomes of the green plants, placed liverworts as the sister to all other land plants. PMID:12527781

  11. BRDT gene sequence in human testicular pathologies and the implication of its single nucleotide polymorphism (rs3088232) on fertility.

    PubMed

    Barda, S; Yogev, L; Paz, G; Yavetz, H; Lehavi, O; Hauser, R; Doniger, T; Breitbart, H; Kleiman, S E

    2014-07-01

    Bromodomain testis-specific (BRDT) protein is essential for the normal process of spermatogenesis. Mutant mice that expressed truncated BRDT had impaired testicular histology with severely reduced sperm concentration and abnormal sperm morphology, while a model of knockout Brdt mice with no BRDT protein had complete meiotic arrest. A BRDT single nucleotide polymorphism (SNP) (rs3088232) was reported as being associated with infertility in men. We assessed testicular specimens of 276 azoospermic men who underwent testicular sperm extraction to search for specimens that showed spermatogenic impairments similar to those of mutant BRDT mice. Ten similar specimens were selected for BRDT gene sequencing and they revealed three NCBI-reported SNPs (rs10783071, rs3088232 and rs10747493) variously distributed among them. Bioinformatics analysis predicted that they would not affect protein activity. Further assessment of rs3088232 frequency in a large group of non-obstructive azoospermia men and fertile controls demonstrated no significant difference between them (27.2 and 21.7% respectively; p = 0.122, Fisher's exact test). We conclude that the testicular impairments observed in the 10 specimens were not a consequence of BRDT gene mutation. The association between BRDT rs3088232 and infertility that had been reported in other studies was not supported. PMID:24865796

  12. Nucleotide sequence variation within the human tyrosine kinase B neurotrophin receptor gene: association with antisocial alcohol dependence.

    PubMed

    Xu, K; Anderson, T R; Neyer, K M; Lamparella, N; Jenkins, G; Zhou, Z; Yuan, Q; Virkkunen, M; Lipsky, R H

    2007-12-01

    To identify sequence variants in genes that may have roles in neuronal responses to alcohol, we resequenced the 5' region of tyrosine kinase B neurotrophin receptor gene (NTRK2) and determined linkage disequilibrium (LD) values, haplotype structure, and performed association analyses using 43 single nucleotide polymorphisms (SNPs) covering the entire NTRK2 region in a Finnish Caucasian sample of 229 alcohol-dependent subjects with antisocial personality disorder (ASPD) and 287 healthy controls. Individually, three SNPs were associated with alcohol dependence and alcohol abuse (AD) (P-value from 0.0019 to 0.0059, significance level was set at P

  13. Nucleotide sequence variation within the human tyrosine kinase B neurotrophin receptor gene (NTRK2): association with antisocial alcohol dependence

    PubMed Central

    Xu, K.; Anderson, T. R.; Neyer, K. M.; Lamparella, N.; Jenkins, G.; Zhou, Z.; Yuan, Q.; Virkkunen, M.; Lipsky, R. H.

    2006-01-01

    To identify sequence variants in genes that may have roles in neuronal responses to alcohol, we resequenced the 5′ region of NTRK2 and determined linkage disequilibrium (LD) values, haplotype structure, and performed association analyses using 43 single nucleotide polymorphisms (SNPs) covering the entire NTRK2 region in a Finnish Caucasian sample of 229 alcohol dependent subjects with antisocial personality disorder and 287 healthy controls. Individually, three SNPs were associated with alcohol dependence and alcohol abuse (AD)(P-value from 0.0019 to 0.0059, significance level was set at P ≤ 0.01 corrected for multiple testing), while a common eighteen-locus haplotype within the largest LD block of NTRK2, a 119 kb region containing the 5′ flanking region and exons 1 through 15, was marginally overrepresented in control subjects compared to AD individuals (global P = 0.057). Taken together, these results support a role for the NTRK2 gene in addiction in a Caucasian population with AD and a subtype of antisocial personality disorder. PMID:17200667

  14. Nucleotide sequence and functional analysis of cbbR, a positive regulator of the Calvin cycle operons of Rhodobacter sphaeroides.

    PubMed Central

    Gibson, J L; Tabita, F R

    1993-01-01

    Structural genes encoding Calvin cycle enzymes in Rhodobacter sphaeroides are duplicated and organized within two physically distinct transcriptional units, the form I and form II cbb operons. Nucleotide sequence determination of the region upstream of the form I operon revealed a divergently transcribed open reading frame, cbbR, that showed significant similarity to the LysR family of transcriptional regulatory proteins. Mutants containing an insertionally inactivated cbbR gene were impaired in photoheterotrophic growth and completely unable to grow photolithoautotrophically with CO2 as the sole carbon source. In the cbbR strain, expression of genes within the form I operon was completely abolished and that of the form II operon was reduced to about 30% of the wild-type level. The cloned cbbR gene complemented the mutant for wild-type growth characteristics, and normal levels of ribulose 1,5-bisphosphate carboxylase/oxygenase (RubisCO) were observed. However, rocket immunoelectrophoresis revealed that the wild-type level of RubisCO was due to overexpression of the form II enzyme, whereas expression of the form I RubisCO was 10% of that of the wild-type strain. The cbbR insertional inactivation did not appear to affect aerobic expression of either CO2 fixation operon, but preliminary evidence suggests that the constitutive expression of the form II operon observed in the cbbR strain may be subject to repression during aerobic growth. PMID:8376325

  15. Genome sequence of Perigonia lusca single nucleopolyhedrovirus: insights into the evolution of a nucleotide metabolism enzyme in the family Baculoviridae

    PubMed Central

    Ardisson-Araújo, Daniel M. P.; Lima, Rayane Nunes; Melo, Fernando L.; Clem, Rollie J.; Huang, Ning; Báo, Sônia Nair; Sosa-Gómez, Daniel R.; Ribeiro, Bergmann M.

    2016-01-01

    The genome of a novel group II alphabaculovirus, Perigonia lusca single nucleopolyhedrovirus (PeluSNPV), was sequenced and shown to contain 132,831 bp with 145 putative ORFs (open reading frames) of at least 50 amino acids. An interesting feature of this novel genome was the presence of a putative nucleotide metabolism enzyme-encoding gene (pelu112). The pelu112 gene was predicted to encode a fusion of thymidylate kinase (tmk) and dUTP diphosphatase (dut). Phylogenetic analysis indicated that baculoviruses have independently acquired tmk and dut several times during their evolution. Two homologs of the tmk-dut fusion gene were separately introduced into the Autographa californica multiple nucleopolyhedrovirus (AcMNPV) genome, which lacks tmk and dut. The recombinant baculoviruses produced viral DNA, virus progeny, and some viral proteins earlier during in vitro infection and the yields of viral occlusion bodies were increased 2.5-fold when compared to the parental virus. Interestingly, both enzymes appear to retain their active sites, based on separate modeling using previously solved crystal structures. We suggest that the retention of these tmk-dut fusion genes by certain baculoviruses could be related to accelerating virus replication and to protecting the virus genome from deleterious mutation. PMID:27273152

  16. Nucleotide sequence and mutational analysis of the structural genes (anfHDGK) for the second alternative nitrogenase from Azotobacter vinelandii.

    PubMed Central

    Joerger, R D; Jacobson, M R; Premakumar, R; Wolfinger, E D; Bishop, P E

    1989-01-01

    The nucleotide sequence of a region of the Azotobacter vinelandii genome exhibiting sequence similarity to nifH has been determined. The order of open reading frames within this 6.1-kilobase-pair region was found to be anfH (alternative nitrogen fixation, nifH-like gene), anfD (nifD-like gene), anfG (potentially encoding a protein similar to the product of vnfG from Azotobacter chroococcum), anfK (nifK-like gene), followed by two additional open reading frames. The 5'-flanking region of anfH contains a nif promoter similar to that found in the A. vinelandii nifHDK gene cluster. The presumed products of anfH, anfD, and anfK are similar in predicted Mr and pI to the previously described subunits of nitrogenase 3. Deletion plus insertion mutations introduced into the anfHDGK region of wild-type strain A. vinelandii CA resulted in mutant strains that were unable to grow in Mo-deficient, N-free medium but grew in the presence of 1 microM Na2MoO4 or V2O5. Introduction of the same mutations into the nifHDK deletion strain CA11 resulted in strains that grew under diazotrophic conditions only in the presence of vanadium. The lack of nitrogenase 3 subunits in these mutant strains was demonstrated through two-dimensional gel analysis of protein extracts from cells derepressed for nitrogenase under Mo and V deficiency. These results indicate that anfH, anfD, and anfK encode structural proteins for nitrogenase 3. PMID:2644222

  17. Nucleotide sequence and mutational analysis of the structural genes (anfHDGK) for the second alternative nitrogenase from Azotobacter vinelandii.

    PubMed

    Joerger, R D; Jacobson, M R; Premakumar, R; Wolfinger, E D; Bishop, P E

    1989-02-01

    The nucleotide sequence of a region of the Azotobacter vinelandii genome exhibiting sequence similarity to nifH has been determined. The order of open reading frames within this 6.1-kilobase-pair region was found to be anfH (alternative nitrogen fixation, nifH-like gene), anfD (nifD-like gene), anfG (potentially encoding a protein similar to the product of vnfG from Azotobacter chroococcum), anfK (nifK-like gene), followed by two additional open reading frames. The 5'-flanking region of anfH contains a nif promoter similar to that found in the A. vinelandii nifHDK gene cluster. The presumed products of anfH, anfD, and anfK are similar in predicted Mr and pI to the previously described subunits of nitrogenase 3. Deletion plus insertion mutations introduced into the anfHDGK region of wild-type strain A. vinelandii CA resulted in mutant strains that were unable to grow in Mo-deficient, N-free medium but grew in the presence of 1 microM Na2MoO4 or V2O5. Introduction of the same mutations into the nifHDK deletion strain CA11 resulted in strains that grew under diazotrophic conditions only in the presence of vanadium. The lack of nitrogenase 3 subunits in these mutant strains was demonstrated through two-dimensional gel analysis of protein extracts from cells derepressed for nitrogenase under Mo and V deficiency. These results indicate that anfH, anfD, and anfK encode structural proteins for nitrogenase 3. PMID:2644222

  18. A comparative study of 2',3'-cyclic-nucleotide 3'-phosphodiesterase in vertebrates: cDNA cloning and amino acid sequences for chicken and bullfrog enzymes.

    PubMed

    Kasama-Yoshida, H; Tohyama, Y; Kurihara, T; Sakuma, M; Kojima, H; Tamai, Y

    1997-10-01

    In mammalian brain, two 2',3'-cyclic-nucleotide 3'-phosphodiesterase (EC 3.1.4.37) isoforms, CNP1 and CNP2, are translated, respectively, from the two mRNAs, which have been transcribed and processed by alternative use of the two transcription start points and by differential splicing. In the present study, the cDNAs encoding chicken CNP2 and bullfrog CNP1, respectively, were isolated, and the amino acid sequences of chicken CNP2 and bullfrog CNP1 were deduced. Western blot analysis showed that chicken brain contains a major CNP2-type protein together with a minor unidentified isoform, and bullfrog brain contains only a CNP1-type protein. All available amino acid sequences of vertebrate 2',3'-cyclic-nucleotide 3'-phosphodiesterases were aligned and compared. Three conserved motif sequences were noted: (a) an ATP-binding site near the amino terminus, (b) an isoprenylation site at the carboxyl terminus, and (c) a probable catalytic site resembling the active site of beta-ketoacyl synthase (EC 2.3.1.41). The second and the third motifs are conserved also in goldfish RICH (regeneration-induced 2',3'-cyclic-nucleotide 3'-phosphodiesterase homologue), which has been shown recently to have 2',3'-cyclic-nucleotide 3'-phosphodiesterase activity. The third motif (probably catalytic site) was assigned for the first time in the present report. PMID:9326261

  19. Mixture models of nucleotide sequence evolution that account for heterogeneity in the substitution process across sites and across lineages.

    PubMed

    Jayaswal, Vivek; Wong, Thomas K F; Robinson, John; Poladian, Leon; Jermiin, Lars S

    2014-09-01

    Molecular phylogenetic studies of homologous sequences of nucleotides often assume that the underlying evolutionary process was globally stationary, reversible, and homogeneous (SRH), and that a model of evolution with one or more site-specific and time-reversible rate matrices (e.g., the GTR rate matrix) is enough to accurately model the evolution of data over the whole tree. However, an increasing body of data suggests that evolution under these conditions is an exception, rather than the norm. To address this issue, several non-SRH models of molecular evolution have been proposed, but they either ignore heterogeneity in the substitution process across sites (HAS) or assume it can be modeled accurately using the Γ distribution. As an alternative to these models of evolution, we introduce a family of mixture models that approximate HAS without the assumption of an underlying predefined statistical distribution. This family of mixture models is combined with non-SRH models of evolution that account for heterogeneity in the substitution process across lineages (HAL). We also present two algorithms for searching model space and identifying an optimal model of evolution that is less likely to over- or underparameterize the data. The performance of the two new algorithms was evaluated using alignments of nucleotides with 10,000 sites simulated under complex non-SRH conditions on a 25-tipped tree. The algorithms were found to be very successful, identifying the correct HAL model with a 75% success rate (the average success rate for assigning rate matrices to the tree's 48 edges was 99.25%) and, for the correct HAL model, identifying the correct HAS model with a 98% success rate. Finally, parameter estimates obtained under the correct HAL-HAS model were found to be accurate and precise. The merits of our new algorithms were illustrated with an analysis of 42,337 second codon sites extracted from a concatenation of 106 alignments of orthologous genes encoded by the nuclear
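
    The figure of 48 edges on a 25-tipped tree is consistent with a rooted, fully bifurcating tree, which has 2n - 2 edges for n tips (an unrooted one has 2n - 3). The abstract does not state that the tree is rooted, so this reading is an inference. The short check below uses hypothetical function names.

```python
def rooted_binary_tree_edges(tips: int) -> int:
    """Edges in a rooted, fully bifurcating tree with the given number of tips."""
    if tips < 2:
        raise ValueError("a rooted binary tree needs at least 2 tips")
    return 2 * tips - 2


def unrooted_binary_tree_edges(tips: int) -> int:
    """Edges in an unrooted, fully bifurcating tree with the given number of tips."""
    if tips < 3:
        raise ValueError("an unrooted binary tree needs at least 3 tips")
    return 2 * tips - 3


rooted_edges = rooted_binary_tree_edges(25)      # 48, as in the abstract
unrooted_edges = unrooted_binary_tree_edges(25)  # 47

# A 99.25% average per-edge success rate over 48 edges corresponds to
# 0.9925 * 48 = 47.64 correctly assigned edges per tree on average.
mean_correct_edges = 0.9925 * rooted_edges
print(rooted_edges, unrooted_edges, mean_correct_edges)
```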

  20. Genomic DNA sequence of a rice gene coding for a pullulanase-type of starch debranching enzyme.

    PubMed

    Francisco, P B; Zhang, Y; Park, S Y; Ogata, N; Yamanouchi, H; Nakamura, Y

    1998-09-01

    A genomic DNA containing a rice (Oryza sativa L., cv. Norin-8) gene coding for a pullulanase-type starch debranching enzyme (EC 3.2.1.41) was sequenced (EMBL/GenBank/DDBJ accession number AB012915). Along the 15,248 bp DNA, the pullulanase gene is split into 26 exons. The four pullulanase consensus regions are positioned in the middle portion of the sequence and are separated by long introns and 1-3 exons. Comparison of the rice cv. Norin-8 pullulanase genomic structure with that of barley pullulanase (limit dextrinase) (F. Lok et al., EMBL/GenBank/DDBJ accession number AF022725) indicates that most of the pullulanase exons are highly conserved. Alignment of the nucleotide bases of rice exon 8 with those of barley exon 8-intron 8-exon 9 fragment suggests that the 85 bp internal sequence of rice exon 8 was originally an intron, a possibility further indicated by the absence in barley and spinach (A. Renz et al., EMBL/GenBank/DDBJ accession number X83969) pullulanases of amino acid residues encoded by the 85 bp fragment. PMID:9748665

  1. Nucleotide sequence and genomic organization of Aleutian mink disease parvovirus (ADV): sequence comparisons between a nonpathogenic and a pathogenic strain of ADV.

    PubMed Central

    Bloom, M E; Alexandersen, S; Perryman, S; Lechner, D; Wolfinbarger, J B

    1988-01-01

    A DNA sequence of 4,592 nucleotides (nt) was derived for the nonpathogenic ADV-G strain of Aleutian mink disease parvovirus (ADV). The 3' (left) end of the virion strand contained a 117-nt palindrome that could assume a Y-shaped configuration similar to, but less stable than, that of other parvoviruses. The sequence obtained for the 5' end was incomplete and did not contain the 5' (right) hairpin structure but ended just after a 25-nt A + T-rich direct repeat. Features of ADV genomic organization are (i) major left (622 amino acids) and right (702 amino acids) open reading frames (ORFs) in different translational frames of the plus-sense strand, (ii) two short mid-ORFs, (iii) eight potential promoter motifs (TATA boxes), including ones at 3 and 36 map units, and (iv) six potential polyadenylation sites, including three clustered near the termination of the right ORF. Although the overall homology to other parvoviruses is less than 50%, there are short conserved amino acid regions in both major ORFs. However, two regions in the right ORF allegedly conserved among the parvoviruses were not present in ADV. At the DNA level, ADV-G is 97.5% related to the pathogenic ADV-Utah 1. A total of 22 amino acid changes were found in the right ORF; changes were found in both hydrophilic and hydrophobic regions and generally did not affect the theoretical hydropathy. However, there is a short heterogeneous region at 64 to 65 map units in which 8 out of 11 residues have diverged; this hypervariable segment may be analogous to short amino acid regions in other parvoviruses that determine host range and pathogenicity. These findings suggested that this region may harbor some of the determinants responsible for the differences in pathogenicity of ADV-G and ADV-Utah 1. PMID:2839709

  2. The annotation-enriched non-redundant patent sequence databases.

    PubMed

    Li, Weizhong; Kondratowicz, Bartosz; McWilliam, Hamish; Nauche, Stephane; Lopez, Rodrigo

    2013-01-01

    The EMBL-European Bioinformatics Institute (EMBL-EBI) offers public access to patent sequence data, providing a valuable service to the intellectual property and scientific communities. The non-redundant (NR) patent sequence databases comprise two-level nucleotide and protein sequence clusters (NRNL1, NRNL2, NRPL1 and NRPL2) based on sequence identity (level-1) and patent family (level-2). Annotation from the source entries in these databases is merged and enhanced with additional information from the patent literature and biological context. Corrections in patent publication numbers, kind-codes and patent equivalents significantly improve the data quality. Data are available through various user interfaces including web browser, downloads via FTP, SRS, Dbfetch and EBI-Search. Sequence similarity/homology searches against the databases are available using BLAST, FASTA and PSI-Search. In this article, we describe the data collection and annotation and also outline major changes and improvements introduced since 2009. Apart from data growth, these changes include additional annotation for singleton clusters, the identifier versioning for tracking entry change and the entry mappings between the two-level databases. Database URL: http://www.ebi.ac.uk/patentdata/nr/ PMID:23396323
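
    The two-level clustering described above can be sketched as follows. This is an illustration under stated assumptions, not the EMBL-EBI pipeline: "sequence identity" is taken to mean an exact, case-insensitive match of the whole sequence, and a level-2 cluster is taken to be a level-1 cluster subdivided by patent family. The accessions, family identifiers and sequences are all invented.

```python
from collections import defaultdict

# Hypothetical patent sequence entries (not real accessions or families).
entries = [
    {"accession": "P0001", "family": "F1", "sequence": "ATGGCC"},
    {"accession": "P0002", "family": "F1", "sequence": "atggcc"},
    {"accession": "P0003", "family": "F2", "sequence": "ATGGCC"},
    {"accession": "P0004", "family": "F2", "sequence": "ATGTTT"},
]


def cluster_two_levels(records):
    """Group entries by sequence (level 1) and by sequence plus family (level 2)."""
    level1 = defaultdict(list)  # key: normalized sequence
    level2 = defaultdict(list)  # key: (normalized sequence, patent family)
    for record in records:
        sequence = record["sequence"].upper()
        level1[sequence].append(record["accession"])
        level2[(sequence, record["family"])].append(record["accession"])
    return dict(level1), dict(level2)


level1, level2 = cluster_two_levels(entries)
# Clusters with a single member are the "singleton clusters" the abstract mentions.
singletons = [members for members in level2.values() if len(members) == 1]
print(len(level1), len(level2), len(singletons))
```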

  3. Nucleotide cleaving agents and method

    DOEpatents

    Que, Jr., Lawrence; Hanson, Richard S.; Schnaith, Leah M. T.

    2000-01-01

    The present invention provides a unique series of nucleotide cleaving agents and a method for cleaving a nucleotide sequence, whether single-stranded or double-stranded DNA or RNA, using a cationic metal complex having at least one polydentate ligand to cleave the nucleotide sequence phosphate backbone to yield a hydroxyl end and a phosphate end.

  4. Insertion Sequence Element Single Nucleotide Polymorphism Typing Provides Insights into the Population Structure and Evolution of Mycobacterium ulcerans across Africa

    PubMed Central

    Jordaens, Kurt; Bomans, Pieter; Leirs, Herwig; Durnez, Lies; Affolabi, Dissou; Sopoh, Ghislain; Aguiar, Julia; Phanzu, Delphin Mavinga; Kibadi, Kapay; Eyangoh, Sara; Manou, Louis Bayonne; Phillips, Richard Odame; Adjei, Ohene; Ablordey, Anthony; Rigouts, Leen; Portaels, Françoise; Eddyani, Miriam; de Jong, Bouke C.

    2014-01-01

    Buruli ulcer is an indolent, slowly progressing necrotizing disease of the skin caused by infection with Mycobacterium ulcerans. In the present study, we applied a redesigned technique to a vast panel of M. ulcerans disease isolates and clinical samples originating from multiple African disease foci in order to (i) gain fundamental insights into the population structure and evolutionary history of the pathogen and (ii) disentangle the phylogeographic relationships within the genetically conserved cluster of African M. ulcerans. Our analyses identified 23 different African insertion sequence element single nucleotide polymorphism (ISE-SNP) types that dominate in different areas where Buruli ulcer is endemic. These ISE-SNP types appear to be the initial stages of clonal diversification from a common, possibly ancestral ISE-SNP type. ISE-SNP types were found unevenly distributed over the greater West African hydrological drainage basins. Our findings suggest that geographical barriers bordering the basins to some extent prevented bacterial gene flow between basins and that this resulted in independent focal transmission clusters associated with the hydrological drainage areas. Different phylogenetic methods yielded two well-supported sister clades within the African ISE-SNP types. The ISE-SNP types from the “pan-African clade” were found to be widespread throughout Africa, while the ISE-SNP types of the “Gabonese/Cameroonian clade” were much rarer and found in a more restricted area, which suggested that the latter clade evolved more recently. Additionally, the Gabonese/Cameroonian clade was found to form a strongly supported monophyletic group with Papua New Guinean ISE-SNP type 8, which is unrelated to other Southeast Asian ISE-SNP types. PMID:24296504

  5. Nucleotide sequence of the celG gene of Clostridium thermocellum and characterization of its product, endoglucanase CelG.

    PubMed Central

    Lemaire, M; Béguin, P

    1993-01-01

    The nucleotide sequence of the celG gene of Clostridium thermocellum, encoding endoglucanase CelG, was determined. The open reading frame extended over 1,698 bp and encoded a 566-amino-acid polypeptide (molecular weight of 63,128) similar to the C. thermocellum endoglucanase CelB (51.5% identical residues). The N terminus displayed a typical signal peptide, followed by a catalytic domain. The C terminus, which was separated from the catalytic domain by a 25-amino-acid segment rich in Pro, Thr, and Ser, contained two conserved stretches of 22 amino acids closely similar to those previously described in other cellulases from the same organism. Expression of the gene in Escherichia coli was increased by fusing the fragment coding for the catalytic domain in frame with the start of the lacZ' gene present in the vector. A low- and a high-M(r) form of the protein were purified. The two forms displayed identical enzymatic properties. Sodium dodecyl sulfate-polyacrylamide gel electrophoresis analysis showed that both forms consist of a major polypeptide of M(r) 50,000 and two minor polypeptides of M(r)s 49,000 and 48,000, resulting from heterogeneous proteolytic cleavage at the C terminus. An antiserum raised against the forms purified from E. coli reacted with an immunoreactive polypeptide of M(r) 66,000, which was associated with the extracellular cellulolytic complex of C. thermocellum known as the cellulosome. PMID:8501039
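
    The reported ORF length and polypeptide length agree arithmetically: 1,698 bp is exactly 566 codons, which implies the stated length does not count a stop codon. The check below is a small worked example; the function name is hypothetical, and the average residue mass is simply the reported molecular weight divided by the residue count.

```python
def codons_in_orf(orf_length_bp: int) -> int:
    """Number of complete codons in an ORF whose length is a multiple of 3."""
    if orf_length_bp <= 0 or orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_length_bp // 3


celg_codons = codons_in_orf(1698)        # ORF length from the abstract
celg_residues = 566                      # polypeptide length from the abstract
celg_molecular_weight = 63128            # molecular weight from the abstract

# Average mass per residue, a quick plausibility check on the reported values.
mean_residue_mass = celg_molecular_weight / celg_residues
print(celg_codons, round(mean_residue_mass, 1))
```

    An average of roughly 110 Da per residue is typical for proteins, so the three reported figures are mutually consistent.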

  6. Indigenous and introduced potyviruses of legumes and Passiflora spp. from Australia: biological properties and comparison of coat protein nucleotide sequences.

    PubMed

    Coutts, Brenda A; Kehoe, Monica A; Webster, Craig G; Wylie, Stephen J; Jones, Roger A C

    2011-10-01

    Five Australian potyviruses, passion fruit woodiness virus (PWV), passiflora mosaic virus (PaMV), passiflora virus Y (PaVY), clitoria chlorosis virus (ClCV) and hardenbergia mosaic virus (HarMV), and two introduced potyviruses, bean common mosaic virus (BCMV) and cowpea aphid-borne mosaic virus (CAbMV), were detected in nine wild or cultivated Passiflora and legume species growing in tropical, subtropical or Mediterranean climatic regions of Western Australia. When ClCV (1), PaMV (1), PaVY (8) and PWV (5) isolates were inoculated to 15 plant species, PWV and two PaVY P. foetida isolates infected P. edulis and P. caerulea readily but legumes only occasionally. Another PaVY P. foetida isolate resembled five PaVY legume isolates in infecting legumes readily but not infecting P. edulis. PaMV resembled PaVY legume isolates in legumes but also infected P. edulis. ClCV did not infect P. edulis or P. caerulea and behaved differently from PaVY legume isolates and PaMV when inoculated to two legume species. When complete coat protein (CP) nucleotide (nt) sequences of 33 new isolates were compared with 41 others, PWV (8), HarMV (4), PaMV (1) and ClCV (1) were within a large group of Australian isolates, while PaVY (14), CAbMV (1) and BCMV (3) isolates were in three other groups. Variation among PWV and PaVY isolates was sufficient for division into four clades each (I-IV). A variable block of 56 amino acid residues at the N-terminal region of the CPs of PaMV and ClCV distinguished them from PWV. Comparison of PWV, PaMV and ClCV CP sequences showed that nt identities were both above and below the 76-77% potyvirus species threshold level. This research gives insights into invasion of new hosts by potyviruses at the natural vegetation and cultivated area interface, and illustrates the potential of indigenous viruses to emerge to infect introduced plants. PMID:21744001

  7. Genome-wide association study for endocrine fertility traits using single nucleotide polymorphism arrays and sequence variants in dairy cattle.

    PubMed

    Tenghe, A M M; Bouwman, A C; Berglund, B; Strandberg, E; de Koning, D J; Veerkamp, R F

    2016-07-01

    Endocrine fertility traits, which are defined from progesterone concentration levels in milk, are interesting indicators of dairy cow fertility because they more directly reflect the cow's own reproductive physiology than classical fertility traits, which are more biased by farm management decisions. The aim of this study was to detect quantitative trait loci (QTL) for 7 endocrine fertility traits in dairy cows by performing a genome-wide association study with 85k single nucleotide polymorphisms (SNP), and then fine-map targeted QTL regions, using imputed sequence variants. Two classical fertility traits were also analyzed for QTL with 85k SNP. The association between a SNP and a phenotype was assessed by single-locus regression for each SNP, using a linear mixed model that included a random polygenic effect. A total of 2,447 Holstein Friesian cows with 5,339 lactations with both phenotypes and genotypes were used for association analysis. Heritability estimates ranged from 0.09 to 0.15 for endocrine fertility traits and 0.03 to 0.10 for classical fertility traits. The genome-wide association study identified 17 QTL regions for endocrine fertility traits on Bos taurus autosomes (BTA) 2, 3, 8, 12, 15, 17, 23, and 25. The highest number (5) of QTL regions from the genome-wide association study was identified for the endocrine trait "proportion of samples with luteal activity." Overlapping QTL regions were found between endocrine traits on BTA 2, 3, and 17. For the classical trait calving to first service, 3 QTL regions were identified on BTA 3, 15, and 23, and an overlapping region was identified on BTA 23 with endocrine traits. Fine-mapping target regions for the endocrine traits on BTA 2 and 3 using imputed sequence variants confirmed the QTL from the genome-wide association study, and identified several associated variants that can contribute to an index of markers for genetic improvement of fertility. Several potential candidate genes underlying endocrine

  8. Single nucleotide polymorphism discovery in cutthroat trout subspecies using genome reduction, barcoding, and 454 pyro-sequencing

    PubMed Central

    2012-01-01

    Background: Salmonids are popular sport fishes, and as such have been subjected to widespread stocking throughout western North America. Historically, stocking was done with little regard for genetic variation among populations and has resulted in genetic mixing among species and subspecies in many areas, thus putting the genetic integrity of native salmonid populations at risk and creating a need to assess the genetic constitution of native salmonid populations. Cutthroat trout is a salmonid species with pronounced geographic structure (there are 10 extant subspecies) and a recent history of hybridization with introduced rainbow trout in many populations. Genetic admixture has also occurred among cutthroat trout subspecies in areas where introductions have brought two or more subspecies into contact. Consequently, management agencies have increased their efforts to evaluate the genetic composition of cutthroat trout populations to identify populations that remain uncompromised and manage them accordingly, but additional genetic markers are needed to do so effectively. Here we used genome reduction, MID-barcoding, and 454-pyrosequencing to discover single nucleotide polymorphisms that differentiate cutthroat trout subspecies and can be used as a rapid, cost-effective method to characterize the genetic composition of cutthroat trout populations. Results: Thirty cutthroat and six rainbow trout individuals were subjected to genome reduction and next-generation sequencing. A total of 1,499,670 reads averaging 379 base pairs in length were generated by 454-pyrosequencing, resulting in 569,060,077 total base pairs sequenced. A total of 43,558 putative SNPs were identified, and of those, 125 SNP primers were developed that successfully amplified 96 cutthroat trout and rainbow trout individuals. These SNP loci were able to differentiate most cutthroat trout subspecies using distance methods and Structure analyses. Conclusions: Genomic and bioinformatic protocols were
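
    The sequencing totals reported in this record can be cross-checked with simple arithmetic: total bases divided by total reads should reproduce the stated average read length. The sketch below uses only the three figures given in the abstract; the function name is hypothetical.

```python
def mean_read_length(total_bases: int, total_reads: int) -> float:
    """Average read length in base pairs."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return total_bases / total_reads


total_reads = 1_499_670      # reads reported in the abstract
total_bases = 569_060_077    # base pairs reported in the abstract

average_length = mean_read_length(total_bases, total_reads)
# Rounds to the 379 bp average stated in the abstract.
print(round(average_length))
```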

  9. Cloning and partial nucleotide sequence of human immunoglobulin mu chain cDNA from B cells and mouse-human hybridomas.

    PubMed Central

    Dolby, T W; Devuono, J; Croce, C M

    1980-01-01

    Purified mRNAs coding for mu and kappa human immunoglobulin polypeptides were translated in vitro and their products were characterized. The mu-specific mRNAs, derived from both human lymphoblastoid cells (GM607) and from a mouse-human somatic cell hybrid secreting human mu chains (alpha D5-H11-BC11), were copied into cDNAs and inserted into the plasmid pBR322. Several recombinant cDNAs that were obtained were identified by a combination of colony hybridization with labeled probes, in vitro translation of plasmid-selected mu mRNAs, and DNA nucleotide sequence determination. One recombinant DNA, for which the sequence has been partially determined, contains the codons for part of the C3 constant region domain through the carboxy-terminal piece (155 amino acids total) as well as the entire 3' noncoding sequence up to the poly(A) site of the human mu mRNA. The sequence A-A-U-A-A occurs 12 nucleotides prior to the poly(A) addition site in the human mu mRNA. Considerable sequence homology is observed in the mouse and human mu mRNA 3' coding and noncoding sequences. PMID:6777778

  10. SureChEMBL: a large-scale, chemically annotated patent document database.

    PubMed

    Papadatos, George; Davies, Mark; Dedman, Nathan; Chambers, Jon; Gaulton, Anna; Siddle, James; Koks, Richard; Irvine, Sean A; Pettersson, Joe; Goncharoff, Nicko; Hersey, Anne; Overington, John P

    2016-01-01

    SureChEMBL is a publicly available large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature according to an automated text and image-mining pipeline on a daily basis. SureChEMBL provides access to a previously unavailable, open and timely set of annotated compound-patent associations, complemented with sophisticated combined structure and keyword-based search capabilities against the compound repository and patent document corpus; given the wealth of knowledge hidden in patent documents, analysis of SureChEMBL data has immediate applications in drug discovery, medicinal chemistry and other commercial areas of chemical science. Currently, the database contains 17 million compounds extracted from 14 million patent documents. Access is available through a dedicated web-based interface and data downloads at: https://www.surechembl.org/. PMID:26582922

  11. SureChEMBL: a large-scale, chemically annotated patent document database

    PubMed Central

    Papadatos, George; Davies, Mark; Dedman, Nathan; Chambers, Jon; Gaulton, Anna; Siddle, James; Koks, Richard; Irvine, Sean A.; Pettersson, Joe; Goncharoff, Nicko; Hersey, Anne; Overington, John P.

    2016-01-01

    SureChEMBL is a publicly available large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature according to an automated text and image-mining pipeline on a daily basis. SureChEMBL provides access to a previously unavailable, open and timely set of annotated compound-patent associations, complemented with sophisticated combined structure and keyword-based search capabilities against the compound repository and patent document corpus; given the wealth of knowledge hidden in patent documents, analysis of SureChEMBL data has immediate applications in drug discovery, medicinal chemistry and other commercial areas of chemical science. Currently, the database contains 17 million compounds extracted from 14 million patent documents. Access is available through a dedicated web-based interface and data downloads at: https://www.surechembl.org/. PMID:26582922

  12. Activity, assay and target data curation and quality in the ChEMBL database.

    PubMed

    Papadatos, George; Gaulton, Anna; Hersey, Anne; Overington, John P

    2015-09-01

    The emergence of a number of publicly available bioactivity databases, such as ChEMBL, PubChem BioAssay and BindingDB, has raised awareness about the topics of data curation, quality and integrity. Here we provide an overview and discussion of the current and future approaches to activity, assay and target data curation of the ChEMBL database. This curation process involves several manual and automated steps and aims to: (1) maximise data accessibility and comparability; (2) improve data integrity and flag outliers, ambiguities and potential errors; and (3) add further curated annotations and mappings thus increasing the usefulness and accuracy of the ChEMBL data for all users and modellers in particular. Issues related to activity, assay and target data curation and integrity along with their potential impact for users of the data are discussed, alongside robust selection and filter strategies in order to avoid or minimise these, depending on the desired application. PMID:26201396

  13. Nucleotide and deduced amino acid sequences of the nucleocapsid protein of the virulent A75/17-CDV strain of canine distemper virus.

    PubMed

    Stettler, M; Zurbriggen, A

    1995-05-01

    Virus persistence is essential in the chronic inflammatory canine distemper virus (CDV)-induced demyelinating disease. In the case of CDV there is a close association between persistence and virulence. Virulent CDV isolated from dogs with distemper shows immediate persistence in primary dog brain cell cultures (DBCC) and in different cell lines. We have evidence that the nucleocapsid (NP) protein plays an important role in the development of persistence. The NP-protein, the most abundant structural virus protein, also influences virus assembly and has some regulatory functions in virus transcription and replication. In this study we compared the nucleotide and deduced amino acid sequence of a virulent CDV strain (A75/17-CDV) to a culture-attenuated non-virulent strain (OP-CDV). Viral RNA was extracted from DBCC infected with virulent CDV. Virulent CDV retains its in vivo properties, such as virulence and ability to cause demyelination, when propagated in these DBCC. The viral RNA was reverse transcribed and the resulting cDNA amplified by polymerase chain reaction for subsequent cloning. The nucleotide sequences of these clones were determined by the dideoxy chain termination method. The number of nucleotides and the putative NP-protein of the virulent strain matched the attenuated CDV strain. We observed a total of 105 nucleotide differences. Three were localised within the 3' and five within the 5' non-coding region of the NP-gene. The 97 nucleotide changes within the coding region resulted in 22 amino acid differences. 10 of these amino acid (AA) modifications were within the N-terminal region (AA 1 to 159) and 12 within the C-terminal area (AA 351 to 523).(ABSTRACT TRUNCATED AT 250 WORDS) PMID:8588315

  14. The nucleotide sequence of HLA-B*2704 reveals a new amino acid substitution in exon 4 which is also present in HLA-B*2706

    SciTech Connect

    Rudwaleit, M.; Bowness, P.; Wordsworth, P.

    1996-12-31

    The HLA-B27 subtype HLA-B*2704 is virtually absent in Caucasians but common in Orientals, where it is associated with ankylosing spondylitis. The amino acid sequence of HLA-B*2704 has been established by peptide mapping and was shown to differ by two amino acids from HLA-B*2705. HLA-B*2704 is characterized by a serine for aspartic acid substitution at position 77 and glutamic acid for valine at position 152. To date, however, no nucleotide sequence confirming these changes at the DNA level has been published. 13 refs., 2 figs.

  15. Complete nucleotide sequence of the M2 gene segment of reovirus type 3 dearing and analysis of its protein product mu 1.

    PubMed

    Jayasuriya, A K; Nibert, M L; Fields, B N

    1988-04-01

    The nucleotide sequence of the M2 gene segment of the mammalian reovirus prototype strain, type 3 Dearing, was determined from a cloned full-length cDNA copy of the viral double-stranded RNA segment. The gene comprises 2203 nucleotides and has a single long open reading frame that spans bases 30 through 2154 and encodes the 708 amino acid outer capsid protein mu 1. Aminoterminal sequence analysis of mu 1C, the proteolytically cleaved form of mu 1 that is found in purified reovirions, has identified the site of mu 1 to mu 1C cleavage between residues 42 and 43 in the mu 1 sequence. Aminoterminal sequence analysis of delta, the proteolytically cleaved product of mu 1C that is found in chymotrypsin-generated intermediate subviral particles, has indicated that the mu 1C to delta cleavage occurs near the carboxyterminus of mu 1C. Lastly, stoichiometric determinations using new sequence information have suggested that approximately equimolar amounts of mu 1C and the other major outer capsid component sigma 3 are present in virions. The data presented in this study should be useful for understanding the molecular basis of the functions of the mu 1 protein in reovirus entry into cells and in pathogenesis in the host animal. PMID:3354207

  16. Nucleotide sequence of a region of the herpes simplex virus type 1 gB glycoprotein gene: mutations affecting rate of virus entry and cell fusion.

    PubMed

    Bzik, D J; Fox, B A; DeLuca, N A; Person, S

    1984-08-01

    The tsB5 isolate of herpes simplex virus type I (HSV-1) enters host cells more rapidly than does KOS, an independent isolate of HSV-1, and this rate-of-entry determinant is located between prototypic map coordinates 0.350 and 0.360 (1). The nucleotide sequence of strain tsB5 has now been determined between prototypic map coordinates 0.347 and 0.360. Comparison of the tsB5 sequence to the homologous KOS sequence revealed that the rate-of-entry difference between these two HSV-1 strains may be due to the single amino acid difference observed within these sequences (0.350 to 0.360). A cell fusion determinant in tsB5 is located between coordinates 0.345 and 0.355 and to the left of the rate-of-entry determinant (1). Nucleotide sequence analysis revealed a second amino acid difference between tsB5 and KOS at coordinate 0.349. The cell fusion determinant was tentatively assigned to this location. PMID:6089415

  17. Non-redundant patent sequence databases with value-added annotations at two levels.

    PubMed

    Li, Weizhong; McWilliam, Hamish; de la Torre, Ana Richart; Grodowski, Adam; Benediktovich, Irina; Goujon, Mickael; Nauche, Stephane; Lopez, Rodrigo

    2010-01-01

    The European Bioinformatics Institute (EMBL-EBI) provides public access to patent data, including abstracts, chemical compounds and sequences. Sequences can appear multiple times due to the filing of the same invention with multiple patent offices, or the use of the same sequence by different inventors in different contexts. Information relating to the source invention may be incomplete, and biological information available in patent documents elsewhere may not be reflected in the annotation of the sequence. Search and analysis of these data have become increasingly challenging for both the scientific and intellectual-property communities. Here, we report a collection of non-redundant patent sequence databases, which cover the EMBL-Bank nucleotides patent class and the patent protein databases and contain value-added annotations from patent documents. The databases were created at two levels by the use of sequence MD5 checksums. Sequences within a level-1 cluster are 100% identical over their whole length. Level-2 clusters were defined by sub-grouping level-1 clusters based on patent family information. Value-added annotations, such as publication number corrections, earliest publication dates and feature collations, significantly enhance the quality of the data, allowing for better tracking and cross-referencing. The databases are available at: http://www.ebi.ac.uk/patentdata/nr/. PMID:19884134
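    The two-level grouping described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the EMBL-EBI pipeline: the record layout, the function names and the choice to normalise case and whitespace before hashing are all hypothetical.

```python
import hashlib
from collections import defaultdict


def md5_of_sequence(seq: str) -> str:
    """Return the MD5 hex digest of a sequence.

    Assumption: case and whitespace are normalised first, so that
    identical sequences always produce the same checksum.
    """
    normalised = "".join(seq.split()).upper()
    return hashlib.md5(normalised.encode("ascii")).hexdigest()


def cluster_patent_sequences(records):
    """Group (entry_id, patent_family, sequence) records at two levels.

    Level 1: identical sequences share one MD5 checksum.
    Level 2: each level-1 cluster is sub-grouped by patent family.
    Returns {checksum: {patent_family: [entry_id, ...]}}.
    """
    clusters = defaultdict(lambda: defaultdict(list))
    for entry_id, family, seq in records:
        clusters[md5_of_sequence(seq)][family].append(entry_id)
    return {checksum: dict(families) for checksum, families in clusters.items()}


# Hypothetical records, invented for illustration only.
records = [
    ("E1", "FAM_A", "ATGGCC"),
    ("E2", "FAM_A", "atggcc"),   # same sequence, same patent family
    ("E3", "FAM_B", "ATGGCC"),   # same sequence, different patent family
    ("E4", "FAM_B", "ATGGCA"),   # different sequence
]
clusters = cluster_patent_sequences(records)
```

    Hashing the whole sequence means two entries fall into the same level-1 cluster only when they are identical over their full length, which matches the 100% identity criterion stated above.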

  18. Characterization of pancreatic ductal adenocarcinoma using whole transcriptome sequencing and copy number analysis by single-nucleotide polymorphism array.

    PubMed

    Di Marco, Mariacristina; Astolfi, Annalisa; Grassi, Elisa; Vecchiarelli, Silvia; Macchini, Marina; Indio, Valentina; Casadei, Riccardo; Ricci, Claudio; D'Ambra, Marielda; Taffurelli, Giovanni; Serra, Carla; Ercolani, Giorgio; Santini, Donatella; D'Errico, Antonia; Pinna, Antonio Daniele; Minni, Francesco; Durante, Sandra; Martella, Laura Raffaella; Biasco, Guido

    2015-11-01

    The aim of the current study was to implement whole transcriptome massively parallel sequencing (RNASeq) and copy number analysis to investigate the molecular biology of pancreatic ductal adenocarcinoma (PDAC). Samples from 16 patients with PDAC were collected by ultrasound‑guided biopsy or from surgical specimens for DNA and RNA extraction. All samples were analyzed by RNASeq performed at 75x2 base pairs on a HiScanSQ Illumina platform. Single‑nucleotide variants (SNVs) were detected with SNVMix and filtered on dbSNP, 1000 Genomes and Cosmic. Non‑synonymous SNVs were analyzed with SNPs&GO and PROVEAN. A total of 13 samples were analyzed by high resolution copy number analysis on an Affymetrix SNP array 6.0. RNAseq resulted in an average of 264 coding non‑synonymous novel SNVs (ranging from 146‑374) and 16 novel insertions or deletions (In/Dels) (ranging from 6‑24) for each sample, of which a mean of 11.2% were disease‑associated and somatic events, while 34.7% were frameshift somatic In/Dels. From this analysis, alterations in the known oncogenes associated with PDAC were observed, including Kirsten rat sarcoma viral oncogene homolog (KRAS) mutations (93.7%) and inactivation of cyclin‑dependent kinase inhibitor 2A (CDKN2A) (50%), mothers against decapentaplegic homolog 4 (SMAD4) (50%), and tumor protein 53 (TP53) (56%). One case that was negative for KRAS exhibited a G13D neuroblastoma RAS viral oncogene homolog mutation. In addition, gene fusions were detected in 10 samples for a total of 23 different intra‑ or inter‑chromosomal rearrangements, however, a recurrent fusion transcript remains to be identified. SNP arrays identified macroscopic and cryptic cytogenetic alterations in 85% of patients. Gains were observed in the chromosome arms 6p, 12p, 18q and 19q which contain KRAS, GATA binding protein 6, protein kinase B and cyclin D3. Deletions were identified on chromosome arms 1p, 9p, 6p, 18q, 10q, 15q, 17p, 21q and 19q which involve TP53

  19. Organization and nucleotide sequences of ten ribosomal protein genes from the region equivalent to the S10 operon in the archaebacterium, Halobacterium halobium.

    PubMed

    Miyokawa, T; Urayama, T; Shimooka, K; Itoh, T

    1996-08-01

    The nucleotide sequence of the 7340-bp region of a ribosomal protein gene cluster of Halobacterium halobium, which is equivalent to the S10 operon of Escherichia coli, was determined. The sequence was analyzed with the codonpreference program deduced from the halobacterial codon usage table, which showed a very high GC content at the third codon position. The sequence comprised a string of 13 tightly linked ORFs. Most of the ORFs were homologous with ribosomal protein genes (ORF1-ORF2-rpl3-rpl4-rpl23-rpl2-rps19-rpl22-rps3-rpl29-ORF11-rps17-rpl14). The 13-gene string was preceded by three putative AT-rich promoter sequences. The order of the genes in H. halobium essentially agreed with that of the corresponding genes of E. coli (S10 operon), except for certain deletions or insertions of additional protein genes. PMID:8876975
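    The GC content at the third codon position mentioned in this abstract is a simple statistic to compute. The sketch below is a generic illustration, not the program used by the authors; the function name and the example sequence are invented.

```python
def gc3_fraction(cds: str) -> float:
    """Fraction of codons whose third base is G or C.

    Assumes `cds` is an in-frame coding sequence whose length is a
    non-zero multiple of three.
    """
    cds = cds.upper()
    if not cds or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a non-zero multiple of 3")
    third_bases = cds[2::3]  # every third base, starting at codon position 3
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Invented five-codon example: ATG GCC GAC CTG TAA
example_cds = "ATGGCCGACCTGTAA"
example_gc3 = gc3_fraction(example_cds)  # third bases are G, C, C, G, A
```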

  20. Complete nucleotide sequence and organization of the mitogenome of the red-spotted apollo butterfly, Parnassius bremeri (Lepidoptera: Papilionidae) and comparison with other lepidopteran insects.

    PubMed

    Kim, Man Il; Baek, Jee Yeon; Kim, Min Jee; Jeong, Heon Cheon; Kim, Ki-Gyoung; Bae, Chang Hwan; Han, Yeon Soo; Jin, Byung Rae; Kim, Iksoo

    2009-10-31

    The 15,389-bp long complete mitogenome of the endangered red-spotted apollo butterfly, Parnassius bremeri (Lepidoptera: Papilionidae) was determined in this study. The start codon for the COI gene in insects has been extensively discussed, and has long remained a matter of some controversy. Herein, we propose that the CGA (arginine) sequence functions as the start codon for the COI gene in lepidopteran insects, on the basis of complete mitogenome sequences of lepidopteran insects, including P. bremeri, as well as additional sequences of the COI start region from a diverse taxonomic range of lepidopteran species (a total of 53 species from 15 families). In our extensive search for a tRNA-like structure in the A+T-rich region, one tRNA(Trp)-like sequence and one tRNA(Leu) (UUR)-like sequence were detected in the P. bremeri A+T-rich region, and one or more tRNA-like structures were detected in the A+T-rich region of the majority of other sequenced lepidopteran insects, thereby indicating that such features occur frequently in the lepidopteran mitogenomes. Phylogenetic analysis using the concatenated 13 amino acid sequences and nucleotide sequences of PCGs of the four macrolepidopteran superfamilies together with the Tortricoidea and Pyraloidea resulted in the successful recovery of a monophyly of Papilionoidea and a monophyly of Bombycoidea. However, the Geometroidea were unexpectedly identified as a sister group of the Bombycoidea, rather than the Papilionoidea. PMID:19823774

  1. Nucleotide sequence and in vivo expression of the ilvY and ilvC genes in Escherichia coli K12. Transcription from divergent overlapping promoters.

    PubMed

    Wek, R C; Hatfield, G W

    1986-02-15

    The ilvC gene of Escherichia coli K12 encodes acetohydroxy acid isomeroreductase, the second enzyme in the parallel isoleucine-valine biosynthetic pathway. Previous data have shown that transcription of the ilvC gene is induced by the acetohydroxy acid isomeroreductase substrates, acetohydroxybutyrate or acetolactate, and that this substrate induction of ilvC expression is mediated by a positive activator encoded by the ilvY gene. We report here the isolation and complete nucleotide sequence of the ilvY and ilvC genes. The ilvY and ilvC genes encode polypeptides of Mr 33,200 and 54,000, respectively. In vitro transcription-translation of these gene templates results in the synthesis of gene products of these identical molecular weights. The ilvC gene is transcribed in the same direction as the genes of the adjacent ilvGMEDA operon. The ilvY gene is transcribed in a direction opposite to the ilvC and ilvGMEDA genes. The in vivo transcriptional initiation sites of the ilvY and ilvC genes have been determined by S1 nuclease protection experiments. These transcriptional initiation sites are 45 nucleotides apart, and transcription of the ilvY and ilvC genes is initiated via divergent overlapping promoters. The nucleotide sequence of the ilvY and ilvC promoters and 5'-coding regions of Salmonella typhimurium LT2 have been determined. A comparison of these sequences with E. coli K12 suggests regions important in the promotion, regulation, and translation of the ilvY and ilvC genes. A model is presented in which the ilvY-encoded activator binds to an operator site in the overlapping promoter region and reciprocally regulates the transcription of the ilvY and ilvC genes. The carboxyl-terminal amino acid sequence of threonine deaminase encoded by the ilvA gene of the ilvGMEDA operon of E. coli K12 has been identified by homology with the previously deduced threonine deaminase amino acid sequence encoded by the ilv1 gene of Saccharomyces cerevisiae. Based on the deduced

  2. Molecular cloning and nucleotide sequences of the complementary DNAs to chicken skeletal muscle myosin two alkali light chain mRNAs.

    PubMed Central

    Nabeshima, Y; Fujii-Kuriyama, Y; Muramatsu, M; Ogata, K

    1982-01-01

    We report here the molecular cloning and sequence analysis of DNAs complementary to mRNAs for myosin alkali light chain of chicken embryo and adult leg skeletal muscle. pSMA2-1 contained an 818 base-pair insert that includes the entire coding region and 5' and 3' untranslated regions of A2 mRNA. pSMA1-1 contained an 848 base-pair insert that included the 3' untranslated region and almost all of the coding region except for the N-terminal 13 amino acid residues of the A1 light chain. The 741 nucleotide sequences of A1 and A2 mRNAs corresponding to the C-terminal 141 amino acid residues and 3' untranslated regions were identical. The 5' terminal nucleotide sequences corresponding to the N-terminal 35 amino acid residues of the A1 chain were quite different from the sequences corresponding to the N-terminal 8 amino acid residues and the 5' untranslated region of A2 mRNA. These findings are discussed in relation to the structures of the genes for A1 and A2 mRNA. PMID:6128725

  3. Comparison of single nucleotide polymorphisms and simple sequence repeats in genotype identification and diversity assessment of cacao germplasm

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Accurate identification of individual genotypes in an efficient manner is especially important for cacao (Theobroma cacao L.) germplasm conservation and breeding. The development of single nucleotide polymorphism (SNP) markers in cacao offers the opportunity to use a high throughput genotyping syste...

  4. Data in support of the discovery of alternative splicing variants of quail LEPR and the evolutionary conservation of qLEPRl by nucleotide and amino acid sequences alignment.

    PubMed

    Wang, Dandan; Xu, Chunlin; Wang, Taian; Li, Hong; Li, Yanmin; Ren, Junxiao; Tian, Yadong; Li, Zhuanjian; Jiao, Yuping; Kang, Xiangtao; Liu, Xiaojun

    2016-03-01

    Leptin receptor (LEPR) belongs to the class I cytokine receptor superfamily, whose members share common structural features and signal transduction pathways. Although multiple LEPR isoforms derived from one gene have been identified in mammals, isoforms other than the long LEPR have rarely been found in avian species. Four alternative splicing variants of quail LEPR (qLEPR) were cloned and sequenced for the first time (Wang et al., 2015 [1]). To define patterns of the four splicing variants (qLEPRl, qLEPR-a, qLEPR-b and qLEPR-c) and locate the conserved regions of qLEPRl, this data article provides a nucleotide sequence alignment of qLEPR and an amino acid sequence alignment of representative vertebrate LEPR. The detailed analysis is shown in [1]. PMID:26759819

  5. Data in support of the discovery of alternative splicing variants of quail LEPR and the evolutionary conservation of qLEPRl by nucleotide and amino acid sequences alignment

    PubMed Central

    Wang, Dandan; Xu, Chunlin; Wang, Taian; Li, Hong; Li, Yanmin; Ren, Junxiao; Tian, Yadong; Li, Zhuanjian; Jiao, Yuping; Kang, Xiangtao; Liu, Xiaojun

    2015-01-01

    Leptin receptor (LEPR) belongs to the class I cytokine receptor superfamily, whose members share common structural features and signal transduction pathways. Although multiple LEPR isoforms derived from one gene have been identified in mammals, isoforms other than the long LEPR have rarely been found in avian species. Four alternative splicing variants of quail LEPR (qLEPR) were cloned and sequenced for the first time (Wang et al., 2015 [1]). To define patterns of the four splicing variants (qLEPRl, qLEPR-a, qLEPR-b and qLEPR-c) and locate the conserved regions of qLEPRl, this data article provides a nucleotide sequence alignment of qLEPR and an amino acid sequence alignment of representative vertebrate LEPR. The detailed analysis is shown in [1]. PMID:26759819

  6. The complete nucleotide sequence and genomic organization of a novel victorivirus with two non-overlapping ORFs, identified in the plant-pathogenic fungus Phomopsis vexans.

    PubMed

    Zhang, Ru Jia; Zhong, Jie; Shang, Hong Hong; Pan, Xian Ting; Zhu, Hong Jian; Gao, Bi Da

    2015-07-01

    In this study, a novel virus designated Phomopsis vexans RNA virus 1 (PvRV1) was identified in a strain of Phomopsis vexans. The complete genomic nucleotide sequence was determined and analyzed. Sequence analysis indicated that PvRV1 is closely related to viruses in the genus Victorivirus of the family Totiviridae. Two open reading frames (ORF1 and 2) were found in the PvRV1 sequence, and these showed significant similarity to the capsid protein (CP) and RNA-dependent RNA polymerase (RdRp), respectively, of members of the family Totiviridae. The two ORFs were spaced 98 nt apart, which is unique to PvRV1 and different from the overlapping arrangement in most victoriviruses. The expression strategies of the CP and RdRp are discussed based on in silico RNA secondary structure analysis. PMID:25902724

  7. Recombination-Independent Recognition of DNA Homology for Repeat-Induced Point Mutation (RIP) Is Modulated by the Underlying Nucleotide Sequence.

    PubMed

    Gladyshev, Eugene; Kleckner, Nancy

    2016-05-01

    Haploid germline nuclei of many filamentous fungi have the capacity to detect homologous nucleotide sequences present on the same or different chromosomes. Once recognized, such sequences can undergo cytosine methylation or cytosine-to-thymine mutation specifically over the extent of shared homology. In Neurospora crassa this process is known as Repeat-Induced Point mutation (RIP). Previously, we showed that RIP did not require MEI-3, the only RecA homolog in Neurospora, and that it could detect homologous trinucleotides interspersed with a matching periodicity of 11 or 12 base-pairs along participating chromosomal segments. This pattern was consistent with a mechanism of homology recognition that involved direct interactions between co-aligned double-stranded (ds) DNA molecules, where sequence-specific dsDNA/dsDNA contacts could be established using no more than one triplet per turn. In the present study we have further explored the DNA sequence requirements for RIP. In our previous work, interspersed homologies were always examined in the context of a relatively long adjoining region of perfect homology. Using a new repeat system lacking this strong interaction, we now show that interspersed homologies with overall sequence identity of only 36% can be efficiently detected by RIP in the absence of any perfect homology. Furthermore, in this new system, where the total amount of homology is near the critical threshold required for RIP, the nucleotide composition of participating DNA molecules is identified as an important factor. Our results specifically pinpoint the triplet 5'-GAC-3' as a particularly efficient unit of homology recognition. Finally, we present experimental evidence that the process of homology sensing can be uncoupled from the downstream mutation. Taken together, our results advance the notion that sequence information can be compared directly between double-stranded DNA molecules during RIP and, potentially, in other processes where homologous

  8. Recombination-Independent Recognition of DNA Homology for Repeat-Induced Point Mutation (RIP) Is Modulated by the Underlying Nucleotide Sequence

    PubMed Central

    Gladyshev, Eugene; Kleckner, Nancy

    2016-01-01

    Haploid germline nuclei of many filamentous fungi have the capacity to detect homologous nucleotide sequences present on the same or different chromosomes. Once recognized, such sequences can undergo cytosine methylation or cytosine-to-thymine mutation specifically over the extent of shared homology. In Neurospora crassa this process is known as Repeat-Induced Point mutation (RIP). Previously, we showed that RIP did not require MEI-3, the only RecA homolog in Neurospora, and that it could detect homologous trinucleotides interspersed with a matching periodicity of 11 or 12 base-pairs along participating chromosomal segments. This pattern was consistent with a mechanism of homology recognition that involved direct interactions between co-aligned double-stranded (ds) DNA molecules, where sequence-specific dsDNA/dsDNA contacts could be established using no more than one triplet per turn. In the present study we have further explored the DNA sequence requirements for RIP. In our previous work, interspersed homologies were always examined in the context of a relatively long adjoining region of perfect homology. Using a new repeat system lacking this strong interaction, we now show that interspersed homologies with overall sequence identity of only 36% can be efficiently detected by RIP in the absence of any perfect homology. Furthermore, in this new system, where the total amount of homology is near the critical threshold required for RIP, the nucleotide composition of participating DNA molecules is identified as an important factor. Our results specifically pinpoint the triplet 5'-GAC-3' as a particularly efficient unit of homology recognition. Finally, we present experimental evidence that the process of homology sensing can be uncoupled from the downstream mutation. Taken together, our results advance the notion that sequence information can be compared directly between double-stranded DNA molecules during RIP and, potentially, in other processes where homologous

  9. Complete nucleotide sequence of the Escherichia coli recC gene and of the thyA-recC intergenic region.

    PubMed Central

    Finch, P W; Wilson, R E; Brown, K; Hickson, I D; Tomkinson, A E; Emmerson, P T

    1986-01-01

    The nucleotide sequence of a 6,000 bp region of the E. coli chromosome that includes the 3' end of the coding region for the thyA gene and the entire recC gene has been determined. The proposed coding region for the RecC protein is 3369 nucleotides long, which would encode a polypeptide consisting of 1122 amino acids with a calculated molecular mass of 129 kDa. Mung bean nuclease mapping of a recC specific transcript produced in vivo indicates that transcription of recC is initiated 80 bp upstream of the translational start point. A weak promoter sequence situated 5' to the transcription initiation point has been identified. In the 1953 bp thyA-recC intergenic region there are three open reading frames that would code for polypeptides of molecular mass 30 kDa, 13.5 kDa and 12 kDa, respectively. Although the first and third of these open reading frames are preceded by possible ribosome binding sites, no obvious promoter sequences could be identified. Moreover, transcripts for these reading frames could not be detected. PMID:3520484
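    The figures quoted for RecC (a 3369-nucleotide coding region, 1122 amino acids, 129 kDa) can be checked with a few lines of arithmetic. The sketch below assumes that the 3369 nucleotides include the stop codon, which the abstract does not state; the function name is hypothetical.

```python
def residues_from_orf(orf_nt: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of `orf_nt` nucleotides.

    Assumption: when `includes_stop` is True, the final codon is a stop
    codon and contributes no residue.
    """
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


recc_residues = residues_from_orf(3369)            # 1123 codons, one of them a stop
mean_residue_mass_da = 129_000 / recc_residues     # from the reported 129 kDa
```

    Under that assumption the count comes out at 1122 residues, and the implied mean residue mass is roughly 115 Da, which is in the usual range for proteins.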

  10. Analysis of the genome sequence of the pathogenic Muscovy duck parvovirus strain YY reveals a 14-nucleotide-pair deletion in the inverted terminal repeats.

    PubMed

    Wang, Jianye; Huang, Yu; Zhou, Mingxu; Zhu, Guoqiang

    2016-09-01

    Genomic information about Muscovy duck parvovirus (MDPV) is still limited. In this study, the genome of the pathogenic MDPV strain YY was sequenced. The full-length genome of YY is 5075 nucleotides (nt) long, 57 nt shorter than that of strain FM. Sequence alignment indicates that the 5' and 3' inverted terminal repeats (ITR) of strain YY contain a 14-nucleotide-pair deletion in the stem of the palindromic hairpin structure in comparison to strains FM and FZ91-30. The deleted region contains one "E-box" site and one repeated motif with the sequence "TTCCGGT" or "ACCGGAA". Phylogenetic trees constructed based on the protein-coding genes concordantly showed that YY, together with nine other MDPV isolates from various places, clustered in a separate branch, distinct from the branch formed by goose parvovirus (GPV) strains. These results demonstrate that, despite the distinctive deletion, the YY strain still belongs to the classical MDPV group. Moreover, the deletion in the ITR may contribute to the genome evolution of MDPV under immunization pressure. PMID:27344160
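    The two forms of the repeated motif quoted in this abstract, "TTCCGGT" and "ACCGGAA", are reverse complements of each other, as expected for a motif read from the two arms of a palindromic hairpin stem. A minimal check, with a helper name chosen here for illustration:

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


motif = "TTCCGGT"
partner = reverse_complement(motif)
```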

  11. Nucleotide sequences and organization of the genes for carotovoricin (Ctv) from Erwinia carotovora indicate that Ctv evolved from the same ancestor as Salmonella typhi prophage.

    PubMed

    Yamada, Kazuteru; Hirota, Morihiko; Niimi, Yoshiko; Nguyen, Hoa Anh; Takahara, Yoshiyuki; Kamio, Yoshiyuki; Kaneko, Jun

    2006-09-01

    Carotovoricin Er (CtvEr), which is produced by a plant soft rot disease causative agent, Erwinia carotovora subsp. carotovora Er, is a high-molecular-weight bacteriocin showing Myoviridae phage-tail-like morphology with a contractile sheath and multiple tail fibers. We determined the complete nucleotide sequences of CtvEr genes on the E. carotovora Er chromosome and report that CtvEr genes consist of a lysis cassette and major and minor structural protein gene clusters. Four promoters were identified. The lysis gene cassette, which is composed of the genes for lysis enzyme and holin, was also identified and characterized. The nucleotide sequences and organization of the genes for CtvCGE, which is produced by E. carotovora strain CGE234-M403 and has a morphology similar to that of CtvEr, were also determined and compared to those of CtvEr, and it was found that CtvCGE is almost identical to CtvEr except for the tail fibers, which are involved in the killing spectra of both bacteriocins. We also show that the gene organization and the deduced amino acid sequences of both carotovoricins are very close to those of a prophage that is lysogenized in the chromosome of Salmonella enterica serovar Typhi CT18. These findings strongly suggest that Ctv evolved as a phage tail-like bacteriocin from a common ancestor with Salmonella typhi prophage. PMID:16960352

  12. Nucleotide sequence of the Escherichia coli motB gene and site-limited incorporation of its product into the cytoplasmic membrane.

    PubMed Central

    Stader, J; Matsumura, P; Vacante, D; Dean, G E; Macnab, R M

    1986-01-01

    The motB gene product of Escherichia coli is an integral membrane protein required for rotation of the flagellar motor. We have determined the nucleotide sequence of the motB region and find that it contains an open reading frame of 924 nucleotides which we ascribe to the motB gene. The predicted amino acid sequence of the gene product is 308 residues long and indicates an amphipathic protein with one major hydrophobic region, about 22 residues long, near the N terminus. There is no consensus signal sequence. We postulate that the protein has a short N-terminal region in the cytoplasm, an anchoring region in the membrane consisting of two spanning segments, and a large cytoplasmic C-terminal domain. By placing motB under control of the tryptophan operon promoter of Serratia marcescens, we have succeeded in overproducing the MotB protein. Under these conditions, the majority of MotB was found in the cytoplasm, indicating that the membrane has a limited capacity to incorporate the protein. We conclude that insertion of MotB into the membrane requires the presence of other more hydrophobic components, possibly including the MotA protein or other components of the flagellar motor. The results further reinforce the concept that the total flagellar motor consists of more than just the basal body. Images PMID:3007435

  13. Nucleotide sequence of the Dpn II DNA methylase gene of Streptococcus pneumoniae and its relationship to the dam gene of Escherichia coli

    SciTech Connect

    Mannarelli, B.M.; Balganesh, T.S.; Greenberg, B.; Springhorn, S.S.; Lacks, S.A.

    1985-07-01

    The structural gene (dpnM) for the Dpn II DNA methylase of Streptococcus pneumoniae, which is part of the Dpn II restriction system and methylates adenine in the sequence 5'-G-A-T-C-3', was identified by subcloning fragments of a chromosomal segment from a Dpn II-producing strain in an S. pneumoniae host/vector cloning system and demonstrating function of the gene also in Bacillus subtilis. Determination of the nucleotide sequence of the gene and adjacent DNA indicates that it encodes a polypeptide of 32,903 daltons. A putative promoter for transcription of the gene lies within a hundred nucleotides of the polypeptide start codon. Comparison of the coding sequence to that of the dam gene of Escherichia coli, which encodes a similar methylase, revealed 30% of the amino acid residues in the two enzymes to be identical. This homology presumably reflects a common origin of the two genes prior to the divergence of Gram-positive and Gram-negative bacteria. It is suggested that the restriction function of the gene is primitive, and that the homologous restriction system in E. coli has evolved to play an accessory role in heteroduplex DNA base mismatch repair.

  14. The complete nucleotide sequence of the mitochondrial genome of the Asian longhorn beetle, Anoplophora glabripennis (Coleoptera: Cerambycidae).

    PubMed

    Fang, Jie; Qian, Lu; Xu, Mei; Yang, Xiaojun; Wang, Baode; An, Yulin

    2016-09-01

    The complete mitochondrial genome of Anoplophora glabripennis has been investigated and analyzed. The genome is a circular molecule of 15,774 bp, containing 13 protein-coding genes (PCGs), 2 rRNA genes, 22 tRNA genes, and an A + T-rich region. The nucleotide composition of the A. glabripennis mitogenome is strongly biased toward A + T nucleotides (78.30%). Nine protein-coding genes and 14 tRNA genes are encoded on the H strand, and the other 4 protein-coding genes and 8 tRNA genes are encoded on the L strand. The arrangement of genes is identical to that of all known longhorn beetle mitochondrial genomes. PMID:25693709
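
    The A + T bias reported above is a simple base-composition percentage. The sketch below shows one way such a figure could be computed; the function name and the short test sequence are hypothetical, and the mitogenome itself is not reproduced here. It assumes that only unambiguous A/C/G/T bases are counted.

```python
def at_content(seq: str) -> float:
    """Return the A + T percentage of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted; anything else,
    such as N or gap characters, is ignored (an assumption of this sketch).
    """
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A/C/G/T bases")
    at_bases = sum(base in "AT" for base in counted)
    return 100.0 * at_bases / len(counted)


# Hypothetical toy sequence, not taken from the A. glabripennis genome.
print(round(at_content("ATATGC"), 2))
```

    The strand tallies in the abstract are also internally consistent: 9 + 4 = 13 protein-coding genes and 14 + 8 = 22 tRNA genes.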

  15. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... in the sequence. (4) The enumeration of amino acids may start at the first amino acid of the first..., counting backwards starting with the amino acid next to number 1. Otherwise, the enumeration of amino acids... sequence every 5 amino acids. The enumeration method for amino acid sequences that is set forth......

  16. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... in the sequence. (4) The enumeration of amino acids may start at the first amino acid of the first..., counting backwards starting with the amino acid next to number 1. Otherwise, the enumeration of amino acids... sequence every 5 amino acids. The enumeration method for amino acid sequences that is set forth......

  17. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... in the sequence. (4) The enumeration of amino acids may start at the first amino acid of the first..., counting backwards starting with the amino acid next to number 1. Otherwise, the enumeration of amino acids... sequence every 5 amino acids. The enumeration method for amino acid sequences that is set forth......

  18. Complete Nucleotide Sequence of pKOI-34, an IncL/M Plasmid Carrying blaIMP-34 in Klebsiella oxytoca Isolated in Japan.

    PubMed

    Shimada, Norimitsu; Kayama, Shizuo; Shigemoto, Norifumi; Hisatsune, Junzo; Kuwahara, Ryuichi; Nishio, Hisaaki; Yamasaki, Katsutoshi; Wada, Yasunao; Sueda, Taijiro; Ohge, Hiroki; Sugai, Motoyuki

    2016-05-01

    We determined the complete nucleotide sequence of a self-transmissible IncL/M plasmid, pKOI-34, from a Klebsiella oxytoca isolate. pKOI-34 possessed the core structure of an IncL/M plasmid found in Erwinia amylovora, pEL60, with two mobile elements inserted, a transposon carrying the arsenic resistance operon and a Tn21-like core module (tnp and mer modules) piggybacking blaIMP-34 as a class 1 integron, In808, where blaIMP-34 confers a resistance to carbapenems in K. oxytoca and Klebsiella pneumoniae. PMID:26902770

  19. Nucleotide sequence of the nifH gene coding for nitrogen reductase in the acetic acid bacterium Acetobacter diazotrophicus.

    PubMed

    Franke, I H; Fegan, M; Hayward, A C; Sly, L I

    1998-01-01

    The nifH gene sequence of the nitrogen-fixing bacterium Acetobacter diazotrophicus was determined with the use of the polymerase chain reaction and universal degenerate oligonucleotide primers. The gene shows highest pair-wise similarity to the nifH gene of Azospirillum brasilense. The phylogenetic relationships of the nifH gene sequences were compared with those inferred from 16S rRNA gene sequences. Knowledge of the sequence of the nifH gene contributes to the growing database of nifH gene sequences, and will allow the detection of Acet. diazotrophicus from environmental samples with nifH gene-based primers. PMID:9489028

  20. Complete Nucleotide Sequence of pGA45, a 140,698-bp IncFIIY Plasmid Encoding blaIMI-3-Mediated Carbapenem Resistance, from River Sediment.

    PubMed

    Dang, Bingjun; Mao, Daqing; Luo, Yi

    2016-01-01

    Plasmid pGA45 was isolated from the sediments of Haihe River using Escherichia coli CV601 (gfp-tagged) as recipients and indigenous bacteria from sediment as donors. This plasmid confers reduced susceptibility to imipenem, which belongs to the carbapenem group. Plasmid pGA45 was fully sequenced on an Illumina HiSeq 2000 sequencing system. The complete sequence of plasmid pGA45 was 140,698 bp in length with an average G + C content of 52.03%. Sequence analysis shows that pGA45 belongs to the IncFIIY group and harbors a backbone region which shares high homology and gene synteny with several other IncF plasmids including pNDM1_EC14653, pYDC644, pNDM-Ec1GN574, pRJF866, pKOX_NDM1, and pP10164-NDM. In addition to the backbone region, plasmid pGA45 harbors two notable features: one blaIMI-3-containing region and one type VI secretion system region. The blaIMI-3-containing region is responsible for bacterial carbapenem resistance, and the type VI secretion system region is probably involved in bacterial virulence. Plasmid pGA45 represents the first complete nucleotide sequence of a blaIMI-harboring plasmid from an environmental sample, and the sequencing of this plasmid provided insight into the architecture used for the dissemination of blaIMI carbapenemase genes. PMID:26941718

  1. Complete Nucleotide Sequence of pGA45, a 140,698-bp IncFIIY Plasmid Encoding blaIMI-3-Mediated Carbapenem Resistance, from River Sediment

    PubMed Central

    Dang, Bingjun; Mao, Daqing; Luo, Yi

    2016-01-01

    Plasmid pGA45 was isolated from the sediments of Haihe River using Escherichia coli CV601 (gfp-tagged) as recipients and indigenous bacteria from sediment as donors. This plasmid confers reduced susceptibility to imipenem, which belongs to the carbapenem group. Plasmid pGA45 was fully sequenced on an Illumina HiSeq 2000 sequencing system. The complete sequence of plasmid pGA45 was 140,698 bp in length with an average G + C content of 52.03%. Sequence analysis shows that pGA45 belongs to the IncFIIY group and harbors a backbone region which shares high homology and gene synteny with several other IncF plasmids including pNDM1_EC14653, pYDC644, pNDM-Ec1GN574, pRJF866, pKOX_NDM1, and pP10164-NDM. In addition to the backbone region, plasmid pGA45 harbors two notable features: one blaIMI-3-containing region and one type VI secretion system region. The blaIMI-3-containing region is responsible for bacterial carbapenem resistance, and the type VI secretion system region is probably involved in bacterial virulence. Plasmid pGA45 represents the first complete nucleotide sequence of a blaIMI-harboring plasmid from an environmental sample, and the sequencing of this plasmid provided insight into the architecture used for the dissemination of blaIMI carbapenemase genes. PMID:26941718

  2. Cloning and nucleotide sequencing of a novel 7 beta-(4-carboxybutanamido)cephalosporanic acid acylase gene of Bacillus laterosporus and its expression in Escherichia coli and Bacillus subtilis.

    PubMed

    Aramori, I; Fukagawa, M; Tsumura, M; Iwami, M; Ono, H; Kojo, H; Kohsaka, M; Ueda, Y; Imanaka, H

    1991-12-01

    A strain of Bacillus species which produced an enzyme named glutaryl 7-ACA acylase which converts 7 beta-(4-carboxybutanamido)cephalosporanic acid (glutaryl 7-ACA) to 7-amino cephalosporanic acid (7-ACA) was isolated from soil. The gene for the glutaryl 7-ACA acylase was cloned with pHSG298 in Escherichia coli JM109, and the nucleotide sequence was determined by the M13 dideoxy chain termination method. The DNA sequence revealed only one large open reading frame composed of 1,902 bp corresponding to 634 amino acid residues. The deduced amino acid sequence contained a potential signal sequence in its amino-terminal region. Expression of the gene for glutaryl 7-ACA acylase was performed in both E. coli and Bacillus subtilis. The enzyme preparations purified from either recombinant strain of E. coli or B. subtilis were shown to be identical with each other as regards the profile of sodium dodecyl sulfate-polyacrylamide gel electrophoresis and were composed of a single peptide with the molecular size of 70 kDa. Determination of the amino-terminal sequence of the two enzyme preparations revealed that both amino-terminal sequences (the first nine amino acids) were identical and completely coincided with residues 28 to 36 of the open reading frame. Extracellular excretion of the enzyme was observed in a recombinant strain of B. subtilis. PMID:1744041

  3. Cloning and nucleotide sequence of the gyrA gene from Campylobacter fetus subsp. fetus ATCC 27374 and characterization of ciprofloxacin-resistant laboratory and clinical isolates.

    PubMed Central

    Taylor, D E; Chau, A S

    1997-01-01

    The gyrA gene of Campylobacter fetus subsp. fetus, which encodes the A subunit of DNA gyrase, was cloned, and its nucleotide sequence was determined. An open reading frame of 2,586 nucleotides which encodes a polypeptide of 862 amino acids with an Mr of 96,782 was identified. C. fetus subsp. fetus GyrA is most closely related to Campylobacter jejuni GyrA, with 73% homology at the nucleotide level and 78% identity between polypeptides. The next most closely related GyrA was that from Helicobacter pylori, with both DNA homology and amino acid identity of 63%. The gyrA and gyrB (DNA gyrase B subunit) genes were located on the genomic map of C. fetus subsp. fetus ATCC 27374 and shown to be separate. A clinical isolate of C. fetus subsp. fetus and a laboratory-derived mutant of ATCC 27374, both resistant to ciprofloxacin, had identical mutations within the quinolone resistance determining region. In both mutants a G-->T transversion, corresponding to a substitution of Asp-91 to Tyr in GyrA, was linked to ciprofloxacin resistance, giving MICs of 8 to 16 micrograms/ml. PMID:9056011

  4. Analysis of the entire nucleotide sequence of hepatitis B virus genotype B in the Philippines reveals a new subgenotype of genotype B.

    PubMed

    Nagasaki, Futoshi; Niitsuma, Hirofumi; Cervantes, Julieta G; Chiba, Masanori; Hong, Shan; Ojima, Toshiaki; Ueno, Yoshiyuki; Bondoc, Edgardo; Kobayashi, Koju; Ishii, Motoyasu; Shimosegawa, Tooru

    2006-05-01

    The entire nucleotide sequences were determined for hepatitis B virus (HBV) genotype B (HBV/B) genomes extracted from five patients in the Philippines and designated GenBank AB219426, AB219427, AB219428, AB219429 and AB219430. The serotype of the first four isolates was ayw and that of GenBank AB219430 was adw. Divergences of entire sequences were 1.0-2.0 % between the first four isolates and 3.8-4.2 % between these four and GenBank AB219430. Phylogenetic-tree analysis revealed that, worldwide, HBV/B comprises five subgenotypes: B1, B2, B3, B4 and the new Philippines group, designated B5. Divergences of the entire genome sequences between four isolates in subgenotype B5 and isolates from other countries (subgenotypes) were 4.4-4.8 % with Vietnam (B4), 2.9-3.5 % with Indonesia (B3), 4.7-5.1 % with China (B2) and 5.4-6.0 % with Japan (B1). Similarly, GenBank AB219430 showed the lowest divergences: 3.4 % with the isolate from Indonesia (B3), 5.0 % with Vietnam (B4), 5.4 % with China (B2) and 6.1 % with Japan (B1). This is the first report of entire nucleotide sequences of HBV/B from the Philippines and the results show that these sequences belong to a new subgenotype, B5. The present study identified that HBV/B isolates throughout the world are divided genetically into five subgenotypes, the relationships between geographical distances and the genetic distances of HBV/B being well-correlated. PMID:16603518
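
    The divergence percentages quoted above compare whole-genome sequences pairwise. The abstract does not state how divergence was calculated, so the sketch below uses the simplest possible measure, the uncorrected proportion of differing sites (p-distance) between two pre-aligned sequences, purely as an illustrative assumption. The function name and the toy sequences are hypothetical.

```python
def percent_divergence(aligned_a: str, aligned_b: str) -> float:
    """Return the uncorrected percentage of differing sites (p-distance).

    Both inputs must already be aligned to the same length; columns
    containing a gap character '-' in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # ignore gap columns
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable (ungapped) columns")
    return 100.0 * differing / compared


# Hypothetical 10-nt toy alignment with a single mismatch.
print(percent_divergence("ACGTACGTAC", "ACGTACGTAT"))  # 10.0
```

    Published subgenotype analyses may use corrected distance models instead, so values from this sketch would not necessarily reproduce the figures in the abstract.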

  5. Nucleotide sequence of an insertion sequence (IS) element identified in the T-DNA region of a spontaneous variant of the Ti-plasmid pTiT37.

    PubMed Central

    Vanderleyden, J; Desair, J; De Meirsman, C; Michiels, K; Van Gool, A P; Chilton, M D; Jen, G C

    1986-01-01

    We have identified and determined the nucleotide sequence of an IS element (IS136) of Agrobacterium tumefaciens. This is the first IS element isolated and sequenced from a nopaline type Ti-plasmid. Our IS element has 32/30 bp inverted repeats with 6 mismatches, is 1,313 bp long and generates 9 bp direct repeats upon integration. IS136 has 3 main open reading frames (ORF's). Only ORF1 (159 codons) is preceded by sequences that are proposed to serve functional roles in transcriptional and translational initiation. No DNA sequence homology was found between IS136 and IS66, an IS element isolated from an octopine type Ti-plasmid. PMID:3018677

  6. Component A2 of methylcoenzyme M reductase system from Methanobacterium thermoautotrophicum delta H: nucleotide sequence and functional expression by Escherichia coli.

    PubMed Central

    Kuhner, C H; Lindenbach, B D; Wolfe, R S

    1993-01-01

    The gene for component A2 of the methylcoenzyme M reductase system from Methanobacterium thermoautotrophicum delta H was cloned, and its nucleotide sequence was determined. The gene for A2, designated atwA, encodes an acidic protein of 59,335 Da. Amino acid sequence analysis revealed partial homology of A2 to a number of eucaryotic and bacterial proteins in the ATP-binding cassette (ABC) family of transport systems. Component A2 possesses two ATP-binding domains. A 2.2-kb XmaI-BamHI fragment containing atwA and the surrounding open reading frames was cloned into pGEM-7Zf(+). A cell extract from this strain replaced purified A2 from M. thermoautotrophicum delta H in an in vitro methylreductase assay. Images PMID:8491734

  7. Nucleotide sequence of the McrB region of Escherichia coli K-12 and evidence for two independent translational initiation sites at the mcrB locus.

    PubMed Central

    Ross, T K; Achberger, E C; Braymer, H D

    1989-01-01

    The McrB restriction system of Escherichia coli K-12 is responsible for the biological inactivation of foreign DNA that contains 5-methylcytosine residues (E. A. Raleigh and G. Wilson, Proc. Natl. Acad. Sci. USA 83:9070-9074, 1986). Within the McrB region of the chromosome is the mcrB gene, which encodes a protein of 51 kilodaltons (kDa) (T. K. Ross, E. C. Achberger, and H. D. Braymer, Gene 61:277-289, 1987), and the mcrC gene, the product of which is 39 kDa (T. K. Ross, E. C. Achberger, and H. D. Braymer, Mol. Gen. Genet., in press). The nucleotide sequence of a 2,695-base-pair segment encompassing the McrB region was determined. The deduced amino acid sequence was used to identify two open reading frames specifying peptides of 455 and 348 amino acids, corresponding to the products of the mcrB and mcrC genes, respectively. A single-nucleotide overlap was found to exist between the termination codon of the mcrB gene and the proposed initiation codon of the mcrC gene. The presence of an additional peptide of 33 kDa in strains containing various recombinant plasmids with portions of the McrB region has been reported by Ross et al. (Gene 61:277-289, 1987). The analysis of frameshift and deletion mutants of one such hybrid plasmid, pRAB-13, provided evidence for a second translational initiation site within the McrB open reading frame. The proposed start codon for translation of the 33-kDa peptide lies 481 nucleotides downstream from the initiation codon for the 51-kDa mcrB gene product. The 33-kDa peptide may play a regulatory role in the McrB restriction of DNA containing 5-methylcytosine. Images PMID:2649480

  8. Nucleotide sequence of the plasminogen activator gene of Yersinia pestis: relationship to ompT of Escherichia coli and gene E of Salmonella typhimurium.

    PubMed

    Sodeinde, O A; Goguen, J D

    1989-05-01

    We have determined the nucleotide sequence of the 1.4-kilobase DNA fragment containing the plasminogen activator gene (pla) of Yersinia pestis, which determines both plasminogen activator and coagulase activities of the species. The sequence revealed the presence of a 936-base-pair open reading frame that constitutes the pla gene. This reading frame encodes a 312-amino-acid protein of 34.6 kilodaltons containing a putative 20-amino-acid signal sequence. The presence of a single large open reading frame is consistent with our previous conclusion that the two Pla proteins which appear in the outer membrane of pla+ Y. pestis are derived from a common precursor. The deduced amino acid sequence of Pla revealed that it possesses a high degree of homology to the products of gene E of Salmonella typhimurium and ompT of Escherichia coli but does not possess significant homology to other plasminogen activators of known sequence. We also identified a transcription unit that resides on the complementary strand and overlaps the pla gene. PMID:2651310

  9. Complete Nucleotide Sequence of Artichoke latent virus Shows it to be a Member of the Genus Macluravirus in the Family Potyviridae.

    PubMed

    Minutillo, S A; Marais, A; Mascia, T; Faure, C; Svanella-Dumas, L; Theil, S; Payet, A; Perennec, S; Schoen, L; Gallitelli, D; Candresse, T

    2015-08-01

    Complete genomic sequences of Artichoke latent virus (ArLV) have been obtained by classical or high-throughput sequencing for an ArLV isolate from Italy (ITBr05) and for two isolates from France (FR37 and FR50). The genome is 8,278 to 8,291 nucleotides long and has a genomic organization comparable with that of Chinese yam necrotic mosaic virus (CYNMV), the only macluravirus fully sequenced to date. The cleavage sites of the viral polyprotein have been tentatively identified by comparison with CYNMV, confirming that macluraviruses are characterized by the absence of a P1 protein, a shorter and N-terminally truncated coat protein (CP). Sequence comparisons firmly place ArLV within the genus Macluravirus, and confirm previous results suggesting that Ranunculus latent virus (RaLV), a previously described Macluravirus sp., is very closely related to ArLV. Serological relationships and comparisons of the CP gene and of the partial RaLV sequence available all indicate that RaLV should not be considered as a distinct species but as a strain of ArLV. The results obtained also suggest that the spectrum of currently used ArLV-specific molecular hybridization or polymerase chain reaction detection assays should be improved to cover all isolates and strains in the ArLV species. PMID:25760520

  10. Nucleotide sequence and expression in vitro of cDNA derived from mRNA of int-1, a provirally activated mouse mammary oncogene.

    PubMed Central

    Fung, Y K; Shackleford, G M; Brown, A M; Sanders, G S; Varmus, H E

    1985-01-01

    The mouse int-1 gene is a putative mammary oncogene discovered as a target for transcriptionally activating proviral insertion mutations in mammary carcinomas induced by the mouse mammary tumor virus in C3H mice. We have isolated molecular clones of full- or nearly full-length cDNA transcribed from int-1 RNA (2.6 kilobases) in a virus-induced mammary tumor. Comparison of the nucleotide sequence of the cDNA clones with that of the int-1 gene (A. van Ooyen and R. Nusse, Cell 39:233-240, 1984) shows the following. The coding region of the int-1 gene is composed of four exons. The splice donor and acceptor sites conform to consensus; however, at least two closely spaced polyadenylation sites are used, and the transcriptional initiation site remains ambiguous. The major open reading frame is preceded by an open frame 10 codons in length. The mRNA encodes a 41-kilodalton protein with several striking features--a strongly hydrophobic amino terminus, a cysteine-rich carboxy terminus, and four potential glycosylation sites. There are no differences in nucleotide sequence between the known exons of the normal and a provirally activated allele. The length of the deduced open reading frame was further confirmed by in vitro translation of RNA transcribed from the cDNA clones with SP6 RNA polymerase. Images PMID:3018519

  11. Complete nucleotide sequence of little cherry virus 1 (LChV-1) infecting sweet cherry in China

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Little cherry virus 1 (LChV-1), associated with little cherry disease (LCD), has a significant impact on fruit quality of infected sweet cherry trees. We report the full genome sequence of an isolate of LChV-1 from China, detected by small RNA deep sequencing and amplified by overlapping RT-PCR. The...

  12. Complete nucleotide sequence, genome organization, and biological properties of human immunodeficiency virus type 1 in vivo: evidence for limited defectiveness and complementation.

    PubMed Central

    Li, Y; Hui, H; Burgess, C J; Price, R W; Sharp, P M; Hahn, B H; Shaw, G M

    1992-01-01

    Previous studies of the genetic and biologic characteristics of human immunodeficiency virus type 1 (HIV-1) have by necessity used tissue culture-derived virus. We recently reported the molecular cloning of four full-length HIV-1 genomes directly from uncultured human brain tissue (Y. Li, J. C. Kappes, J. A. Conway, R. W. Price, G. M. Shaw, and B. H. Hahn, J. Virol. 65:3973-3985, 1991). In this report, we describe the biologic properties of these four clones and the complete nucleotide sequences and genome organization of two of them. Clones HIV-1YU-2 and HIV-1YU-10 were 9,174 and 9,176 nucleotides in length, differed by 0.26% in nucleotide sequence, and except for a frameshift mutation in the pol gene in HIV-1YU-10, contained open reading frames corresponding to 5'-gag-pol-vif-vpr-tat-rev-vpu-env-nef-3' flanked by long terminal repeats. HIV-1YU-2 was fully replication competent, while HIV-1YU-10 and two other clones, HIV-1YU-21 and HIV-1YU-32, were defective. All three defective clones, however, when transfected into Cos-1 cells in any pairwise combination, yielded virions that were replication competent and transmissible by cell-free passage. The cellular host range of HIV-1YU-2 was strictly limited to primary T lymphocytes and monocyte-macrophages, a property conferred by its external envelope glycoprotein. Phylogenetic analyses of HIV-1YU-2 gene sequences revealed this virus to be a member of the North American/European HIV-1 subgroup, with specific similarity to other monocyte-tropic viruses in its V3 envelope amino acid sequence. These results indicate that HIV-1 infection of brain is characterized by the persistence of mixtures of fully competent, minimally defective, and more substantially altered viral forms and that complementation among them is readily attainable. In addition, the limited degree of genotypic heterogeneity observed among HIV-1YU and other brain-derived viruses and their preferential tropism for monocyte-macrophages suggest that viral

  13. Nucleotide and predicted amino acid sequence of a cDNA clone encoding part of human transketolase.

    PubMed

    Abedinia, M; Layfield, R; Jones, S M; Nixon, P F; Mattick, J S

    1992-03-31

    Transketolase is a key enzyme in the pentose-phosphate pathway which has been implicated in the latent human genetic disease, Wernicke-Korsakoff syndrome. Here we report the cloning and partial characterisation of the coding sequences encoding human transketolase from a human brain cDNA library. The library was screened with oligonucleotide probes based on the amino acid sequence of proteolytic fragments of the purified protein. Northern blots showed that the transketolase mRNA is approximately 2.2 kb, close to the minimum expected, of which approximately 60% was represented in the largest cDNA clone. Sequence analysis of the transketolase coding sequences reveals a number of homologies with related enzymes from other species. PMID:1567394

  14. Detection of a G>C single nucleotide polymorphism within a repetitive DNA sequence by high-resolution DNA melting.

    PubMed

    Schmidt, Ulrike; Hulkkonen, Johannes; Naue, Jana

    2016-09-01

    In standard forensic DNA analysis, single base mutations within short tandem repeats (STR) mostly escape detection. In this study, high-resolution DNA melting (HRM) is compared to minisequencing and Sanger sequencing to determine the most suitable method for detection of a G to C mutation within a repetitive DNA sequence, the STR system DXS10161. This system shows an ATG/ATC polymorphism surrounded by a variable number of (TATC) and (ATCT) motifs. Neutral base changes like G:C to C:G result in very low differences in the melting temperature (Tm) of the PCR amplicons. With its enhanced resolution of fluorescence vs. temperature, HRM proved suitable for detecting a G to C transversion in this repetitive DNA sequence context. Compared to minisequencing, HRM is more time- and cost-effective. Results were confirmed by Sanger sequencing. PMID:26972692

  15. Nucleotide sequence and expression in Escherichia coli of cDNAs encoding papaya proteinase omega from Carica papaya.

    PubMed

    Revell, D F; Cummings, N J; Baker, K C; Collins, M E; Taylor, M A; Sumner, I G; Pickersgill, R W; Connerton, I F; Goodenough, P W

    1993-05-30

    We have cloned and sequenced two similar, but distinct, cDNAs from both fruit and leaf tissues of Carica papaya. The C-terminal portion of the predicted amino acid (aa) sequence of one of the clones has complete identity with the mature enzyme sequence of the cysteine proteinase papaya proteinase omega (Pp omega). The second clone contains ten individual bp changes compared with the first and encodes a protein with three single-aa substitutions, only one of which is located in the mature sequence, but most noticeably carries an additional 19-aa C-terminal extension. The clones encode pre-pro precursor isoforms of Pp omega. The former of these clones has been expressed in Escherichia coli using a T7 polymerase expression system to produce insoluble pro-enzyme which has been solubilized and refolded to yield auto-activable pro-Pp omega. PMID:7684720

  16. Cloning and nucleotide sequence of the genes coding for the Sau96I restriction and modification enzymes.

    PubMed Central

    Szilák, L; Venetianer, P; Kiss, A

    1990-01-01

    The genes coding for the GGNCC specific Sau96I restriction and modification enzymes were cloned and expressed in E. coli. The DNA sequence predicts a 430 amino acid protein (Mr: 49,252) for the methyltransferase and a 261 amino acid protein (Mr: 30,486) for the endonuclease. No protein sequence similarity was detected between the Sau96I methyltransferase and endonuclease. The methyltransferase contains the sequence elements characteristic for m5C-methyltransferases. In addition to this, M.Sau96I shows similarity, also in the variable region, with one m5C-methyltransferase (M.SinI) which has closely related recognition specificity (GGA/TCC). M.Sau96I methylates the internal cytosine within the GGNCC recognition sequence. The Sau96I endonuclease appears to act as a monomer. Images PMID:2204026
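
    The recognition sequences mentioned above, GGNCC for Sau96I and GG(A/T)CC for SinI, use degenerate positions: N stands for any base and A/T (IUPAC code W) for either A or T. One way to locate such sites in a sequence is to expand the degenerate codes into a regular expression, as in the sketch below. The function name, the reduced code table and the test sequence are hypothetical, and only one strand is scanned.

```python
import re

# Reduced IUPAC table covering only the codes needed for this example.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "W": "[AT]"}


def find_sites(seq: str, motif: str) -> list[int]:
    """Return 0-based start positions of every (possibly overlapping) match."""
    body = "".join(IUPAC[code] for code in motif.upper())
    # A lookahead lets overlapping sites be reported as well.
    pattern = re.compile("(?=(" + body + "))")
    return [match.start() for match in pattern.finditer(seq.upper())]


# Hypothetical test sequence containing GGACC and GGGCC.
dna = "TTGGACCAAGGGCCTT"
print(find_sites(dna, "GGNCC"))  # [2, 9]
print(find_sites(dna, "GGWCC"))  # [2]
```

    In this toy sequence the GGNCC pattern matches both sites, while the narrower GGWCC pattern matches only GGACC, which mirrors the "closely related recognition specificity" noted in the abstract.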

  17. The nucleotide sequence of the ubiquitous repetitive DNA sequence B1 complementary to the most abundant class of mouse fold-back RNA.

    PubMed Central

    Krayev, A S; Kramerov, D A; Skryabin, K G; Ryskov, A P; Bayev, A A; Georgiev, G P

    1980-01-01

    Three copies of a highly repetitive DNA sequence B1 which is complementary to the most abundant class of mouse fold-back RNA have been cloned in pBR322 plasmid and sequenced by the method of Maxam and Gilbert. All the three have a length of about 130 base pairs and are very similar in their base sequence. The deviation from the average sequence is equal to 4% and the overall mismatch between each two is not higher than 8%. One of the recombinant clones used contained two copies of B1 oriented in the same direction. All of the B1 copies are flanked with sequences which possess nonidentical but very similar structure. They consist of a number of AmCn blocks (where m varies from 2 to 8 and n equals 1-2). These peculiar sequences in all cases are separated from B1 by non-homologous DNA stretches of 2-8 residues. In one case, a long polypurine stretch is located next to such a block. It consists of 74 residues most of which represent a reiteration of the basic sequence AAAAG. We have found two regions within the B1 sequence which are homologous to the intron-exon junctions, especially to those present in the large intron of the mouse beta-globin gene. It may indicate the involvement of the B1 sequence in pre-mRNA splicing. Images PMID:7433120

  18. Extended region of nodulation genes in Rhizobium meliloti 1021. II. Nucleotide sequence, transcription start sites and protein products

    SciTech Connect

    Fisher, R.F.; Swanson, J.A.; Mulligan, J.T.; Long, S.R.

    1987-10-01

    The authors have established the DNA sequence and analyzed the transcription and translation products of a series of putative nodulation (nod) genes in Rhizobium meliloti strain 1021. Four loci have been designated nodF, nodE, nodG and nodH. The correlation of transposon insertion positions with phenotypes and open reading frames was confirmed by sequencing the insertion junctions of the transposons. The protein products of these nod genes were visualized by in vitro expression of cloned DNA segments in an R. meliloti transcription-translation system. In addition, the sequence for nodG was substantiated by creating translational fusions in all three reading frames at several points in the sequence; the resulting fusions were expressed in vitro in both E. coli and R. meliloti transcription-translation systems. A DNA segment bearing several open reading frames downstream of nodG corresponds to the putative nod gene mutated in strain nod-216. The transcription start sites of nodF and nodH were mapped by primer extension of RNA from cells induced with the plant flavone, luteolin. Initiation of transcription occurs approximately 25 bp downstream from the conserved sequence designated the nod box, suggesting that this conserved sequence acts as an upstream regulator of inducible nod gene expression. Its distance from the transcription start site is more suggestive of an activator binding site than of an RNA polymerase binding site.

  19. Biocompatible mannosylated endosomal-escape nanoparticles enhance selective delivery of short nucleotide sequences to tumor associated macrophages

    NASA Astrophysics Data System (ADS)

    Ortega, Ryan A.; Barham, Whitney J.; Kumar, Bharat; Tikhomirov, Oleg; McFadden, Ian D.; Yull, Fiona E.; Giorgio, Todd D.

    2014-12-01

    Tumor associated macrophages (TAMs) can modify the tumor microenvironment to create a pro-tumor niche. Manipulation of the TAM phenotype is a novel, potential therapeutic approach to engage anti-cancer immunity. siRNA is a molecular tool for knockdown of specific mRNAs that is tunable in both strength and duration. The use of siRNA to reprogram TAMs to adopt an immunogenic, anti-tumor phenotype is an attractive alternative to ablation of this cell population. One current difficulty with this approach is that TAMs are difficult to specifically target and transfect. We report here successful utilization of novel mannosylated polymer nanoparticles (MnNP) that are capable of escaping the endosomal compartment to deliver siRNA to TAMs in vitro and in vivo. Transfection with MnNP-siRNA complexes did not significantly decrease TAM cell membrane integrity in culture, nor did it create adverse kidney or liver function in mice, even at repeated doses of 5 mg kg⁻¹. Furthermore, MnNP effectively delivers labeled nucleotides to TAMs in mice with primary mammary tumors. We also confirmed TAM targeting in the solid tumors disseminated throughout the peritoneum of ovarian tumor bearing mice following injection of fluorescently labeled MnNP-nucleotide complexes into the peritoneum. Finally, we show enhanced uptake of MnNP in lung metastasis associated macrophages compared to untargeted particles when using an intubation delivery method. In summary, we have shown that MnNP specifically and effectively deliver siRNA to TAMs in vivo.

  20. The Complete Nucleotide Sequence of the Mitochondrial Genome of the Lungfish (Protopterus dolloi) Supports Its Phylogenetic Position as a Close Relative of Land Vertebrates

    PubMed Central

    Zardoya, R.; Meyer, A.

    1996-01-01

    The complete DNA sequence (16,646 bp) of the mitochondrial genome of the African lungfish, Protopterus dolloi, was determined. The evolutionary position of lungfish as possibly the closest living relative among fish of land vertebrates made its mitochondrial DNA sequence particularly interesting. Its mitochondrial gene order conforms to the consensus vertebrate gene order. Several sequence motifs and secondary structures likely involved in the regulation of the initiation of replication and transcription of the mitochondrial genome are conserved in the lungfish and are more similar to those of land vertebrates than those of ray-finned fish. A novel feature discovered is that the putative origin of L-strand replication partially overlaps the adjacent tRNA(Cys). The phylogenetic analyses of genes coding for tRNAs and proteins confirm the intermediate phylogenetic position of lungfish between ray-finned fishes and tetrapods. The complete nucleotide sequence of the African lungfish mitochondrial genome was used to estimate which mitochondrial genes are most appropriate to elucidate deep branch phylogenies. Only a combined set of either protein or tRNA mitochondrial genes (but not each gene alone) is able to confidently recover the expected phylogeny among vertebrates that have diverged up to but not over ~400 mya. PMID:8846902