Science.gov

Sample records for cdna nucleotide sequence

  1. Nucleotide sequence of cDNA clones of the murine myb proto-oncogene.

    PubMed Central

    Gonda, T J; Gough, N M; Dunn, A R; de Blaquiere, J

    1985-01-01

    We have isolated cDNA clones of murine c-myb mRNA which contain approximately 2.8 kb of the 3.9-kb mRNA sequence. Nucleotide sequencing has shown that these clones extend both 5' and 3' to sequences homologous to the v-myb oncogenes of avian myeloblastosis virus and avian leukemia virus E26. The sequence contains an open reading frame of 1944 nucleotides, and could encode a protein which is both highly homologous, and of similar size (71 kd), to the chicken c-myb protein. Examination of the deduced amino acid sequence of the murine c-myb protein revealed the presence of a 3-fold tandem repeat of 52 residues near the N terminus of the protein, and has enabled prediction of some of the likely structural features of the protein. These include a high alpha-helix content, a basic region toward the N terminus of the protein and an overall globular configuration. The arrangement of genomic c-myb sequences, detected using the cDNA clones as probes, was compared with the reported structure of rearranged c-myb in certain tumour cells. This comparison suggested that the rearranged c-myb gene may encode a protein which, like the v-myb protein, lacks the N-terminal region of c-myb. PMID:2998780
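    The relation between the 1944-nucleotide open reading frame and the quoted 71 kd protein size can be checked with a rough calculation. The sketch below is only illustrative: the average residue mass of 110 Da is a common rule of thumb and an assumption here, not a value from the abstract, and the abstract does not say whether the 1944 nucleotides include the stop codon.

```python
# Rule-of-thumb average mass of one amino acid residue, in daltons.
# This constant is an assumption for illustration, not a value from the abstract.
AVERAGE_RESIDUE_MASS_DA = 110.0


def orf_summary(orf_length_nt: int) -> tuple[int, float]:
    """Return (codon count, approximate protein mass in kDa) for an ORF length."""
    if orf_length_nt % 3 != 0:
        raise ValueError("an open reading frame must be a multiple of 3 nucleotides")
    codons = orf_length_nt // 3
    mass_kda = codons * AVERAGE_RESIDUE_MASS_DA / 1000.0
    return codons, mass_kda


codons, mass_kda = orf_summary(1944)
print(codons, round(mass_kda, 1))
```

    Under these assumptions the estimate lands close to the 71 kd figure given in the abstract.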

  2. Human secreted carbonic anhydrase: cDNA cloning, nucleotide sequence, and hybridization histochemistry

    SciTech Connect

    Aldred, P.; Fu, Ping; Barrett, G.; Penschow, J.D.; Wright, R.D.; Coghlan, J.P.; Fernley, R.T.

    1991-01-01

    Complementary DNA clones coding for the human secreted carbonic anhydrase isozyme (CA VI) have been isolated and their nucleotide sequences determined. These clones identify a 1.45-kb mRNA that is present in high levels in parotid and submandibular salivary glands but absent in other tissues such as the sublingual gland, kidney, liver, and prostate gland. Hybridization histochemistry of human salivary glands shows mRNA for CA VI located in the acinar cells of these glands. The cDNA clones encode a protein of 308 amino acids that includes a 17 amino acid leader sequence typical of secreted proteins. The mature protein has 291 amino acids compared to 259 or 260 for the cytoplasmic isozymes, with most of the extra amino acids present as a carboxyl terminal extension. In comparison, sheep CA VI has a 45 amino acid extension. Overall the human CA VI protein has a sequence identity of 35% with human CA II, while residues involved in the active site of the enzymes have been conserved. The human and sheep secreted carbonic anhydrases have a sequence identity of 72%. This includes the two cysteine residues that are known to be involved in an intramolecular disulfide bond in the sheep CA VI. The enzyme is known to be glycosylated and three potential N-glycosylation sites (Asn-X-Thr/Ser) have been identified. Two of these are known to be glycosylated in sheep CA VI. Southern analysis of human DNA indicates that there is only one gene coding for CA VI.

  3. Molecular cloning and nucleotide sequence of cDNA for human liver arginase

    SciTech Connect

    Haraguchi, Y.; Takiguchi, M.; Amaya, Y.; Kawamoto, S.; Matsuda, I.; Mori, M.

    1987-01-01

    Arginase (EC 3.5.3.1) catalyzes the last step of the urea cycle in the liver of ureotelic animals. Inherited deficiency of the enzyme results in argininemia, an autosomal recessive disorder characterized by hyperammonemia. To facilitate investigation of the enzyme and gene structures and to elucidate the nature of the mutation in argininemia, the authors isolated cDNA clones for human liver arginase. Oligo(dT)-primed and random primer human liver cDNA libraries in lambda gt11 were screened using isolated rat arginase cDNA as a probe. Two of the positive clones, designated lambda hARG6 and lambda hARG109, contained an overlapping cDNA sequence with an open reading frame encoding a polypeptide of 322 amino acid residues (predicted M_r, 34,732), a 5'-untranslated sequence of 56 base pairs, a 3'-untranslated sequence of 423 base pairs, and a poly(A) segment. Arginase activity was detected in Escherichia coli cells transformed with the plasmid carrying lambda hARG6 cDNA insert. RNA gel blot analysis of human liver RNA showed a single mRNA of 1.6 kilobases. The predicted amino acid sequence of human liver arginase is 87% and 41% identical with those of the rat liver and yeast enzymes, respectively. There are several highly conserved segments among the human, rat, and yeast enzymes.

  4. Molecular cloning and nucleotide sequence of cDNA for human glucose-6-phosphate dehydrogenase variant A(-)

    SciTech Connect

    Hirono, A.; Beutler, E.

    1988-06-01

    Glucose-6-phosphate dehydrogenase A(-) is a common variant in Blacks that causes sensitivity to drug- and infection-induced hemolytic anemia. A cDNA library was constructed from Epstein-Barr virus-transformed lymphoblastoid cells from a male who was G6PD A(-). One of four cDNA clones isolated contained a sequence not found in the other clones nor in the published cDNA sequence. Consisting of 138 bases and coding 46 amino acids, this segment of cDNA apparently is derived from the alternative splicing involving the 3' end of intron 7. Comparison of the remaining sequences of these clones with the published sequence revealed three nucleotide substitutions: C33→G, G202→A, and A376→G. Each change produces a new restriction site. Genomic DNA from five G6PD A(-) individuals was amplified by the polymerase chain reaction. The finding of the same mutation in G6PD A(-) as is found in G6PD A(+) strongly suggests that the G6PD A(-) mutation arose in an individual with G6PD A(+), adding another mutation that causes the in vivo instability of this enzyme protein.

  5. Molecular cloning and nucleotide sequence of cDNA for human glucose-6-phosphate dehydrogenase variant A(-).

    PubMed Central

    Hirono, A; Beutler, E

    1988-01-01

    Glucose-6-phosphate dehydrogenase (G6PD; D-glucose-6-phosphate:NADP+ oxidoreductase, EC 1.1.1.49) A(-) is a common variant in Blacks that causes sensitivity to drug- and infection-induced hemolytic anemia. A cDNA library was constructed from Epstein-Barr virus-transformed lymphoblastoid cells from a male who was G6PD A(-). One of four cDNA clones isolated contained a sequence not found in the other clones nor in the published cDNA sequence. Consisting of 138 bases and coding 46 amino acids, this segment of cDNA apparently is derived from the alternative splicing involving the 3' end of intron 7. Comparison of the remaining sequences of these clones with the published sequence revealed three nucleotide substitutions: C33→G, G202→A, and A376→G. Each change produces a new restriction site. Genomic DNA from five G6PD A(-) individuals was amplified by the polymerase chain reaction. The base substitution at position 376, identical to the substitution that has been reported in G6PD A(+), was present in all G6PD A(-) samples and none of the control G6PD B(+) samples examined. The substitution at position 202 was found in four of the five G6PD A(-) samples and no normal control sample. At position 33 guanine was found in all G6PD A(-) samples and seven G6PD B(+) control samples and is, presumably, the usual nucleotide found at this position. The finding of the same mutation in G6PD A(-) as is found in G6PD A(+) strongly suggests that the G6PD A(-) mutation arose in an individual with G6PD A(+), adding another mutation that causes the in vivo instability of this enzyme protein. PMID:2836867

  6. Nucleotide sequence of the cDNA encoding the precursor of the beta subunit of rat lutropin.

    PubMed Central

    Chin, W W; Godine, J E; Klein, D R; Chang, A S; Tan, L K; Habener, J F

    1983-01-01

    We have determined the nucleotide sequences of cDNAs encoding the precursor of the beta subunit of rat lutropin, a polypeptide hormone that regulates gonadal function, including the development of gametes and the production of steroid sex hormones. The cDNAs were prepared from poly(A)+ RNA derived from the pituitary glands of rats 4 weeks after ovariectomy and were cloned in bacterial plasmids. Bacterial colonies containing transfected plasmids were screened by hybridization with a 32P-labeled cDNA encoding the beta subunit of human chorionic gonadotropin, a protein that is related in structure to lutropin. Several recombinant plasmids were detected that by nucleotide sequence analyses contained coding sequences for the precursor of the beta subunit of lutropin. Complete determination of the nucleotide sequences of these cDNAs, as well as of cDNA reverse-transcribed from pituitary poly(A)+ RNA by using a synthetic pentadecanucleotide as a primer of RNA, provided the entire 141-codon sequence of the precursor of the beta subunit of rat lutropin. The precursor consists of a 20 amino acid leader (signal) peptide and an apoprotein of 121 amino acids. The amino acid sequence of the rat lutropin beta subunit shows similarity to the beta subunits of the ovine/bovine, porcine, and human lutropins (81, 86, and 74% of amino acids identical, respectively). Blot hybridization of pituitary RNAs separated by electrophoresis on agarose gels showed that the mRNA encoding the lutropin beta subunit consists of approximately 700 bases. The availability of cDNAs for both the alpha and beta subunits of lutropin will facilitate studies of the regulation of lutropin expression. PMID:6192440

  7. Nucleotide sequence and expression of a maize H1 histone cDNA.

    PubMed Central

    Razafimahatratra, P; Chaubet, N; Philipps, G; Gigot, C

    1991-01-01

    The first complete amino acid sequence of an H1 histone of a monocotyledonous plant was deduced from a cDNA isolated from a maize library. The encoded H1 protein is 245 amino acids long and shows the classical tripartite organization of this class of histones. The central globular region of 76 residues shows 60% sequence homology with H1 proteins from dicots but only 20% with the animal H1 proteins. However, several of the amino acids considered as being important in the structure of the nucleosome are conserved between this protein and its animal counterparts. The N-terminal region contains an equal number of acidic and basic residues, which appears to be a general feature of plant H1 proteins. The 124-residue-long and highly basic C-terminal region contains a 7-fold repeated element KA/PKXA/PAKA/PK. Southern-blot hybridization showed that the H1 protein is encoded by a small multigene family. Highly homologous H1 gene families were also detected in the genomes of several more or less closely related plant species. The general expression pattern of these genes was not significantly different from that of the genes encoding the core histones, either during germination or in the different tissues of adult maize. PMID:1709276

  8. Guanine nucleotide-binding proteins that enhance choleragen ADP-ribosyltransferase activity: nucleotide and deduced amino acid sequence of an ADP-ribosylation factor cDNA.

    PubMed Central

    Price, S R; Nightingale, M; Tsai, S C; Williamson, K C; Adamik, R; Chen, H C; Moss, J; Vaughan, M

    1988-01-01

    Three (two soluble and one membrane) guanine nucleotide-binding proteins (G proteins) that enhance ADP-ribosylation of the Gs alpha stimulatory subunit of the adenylyl cyclase (EC 4.6.1.1) complex by choleragen have recently been purified from bovine brain. To further define the structure and function of these ADP-ribosylation factors (ARFs), we isolated a cDNA clone (lambda ARF2B) from a bovine retinal library by screening with a mixed heptadecanucleotide probe whose sequence was based on the partial amino acid sequence of one of the soluble ARFs from bovine brain. Comparison of the deduced amino acid sequence of lambda ARF2B with sequences of peptides from the ARF protein (total of 60 amino acids) revealed only two differences. Whether these are cloning artifacts or reflect the existence of more than one ARF protein remains to be determined. Deduced amino acid sequences of ARF, Go alpha (the alpha subunit of a G protein that may be involved in regulation of ion fluxes), and c-Ha-ras gene product p21 show similarities in regions believed to be involved in guanine nucleotide binding and GTP hydrolysis. ARF apparently lacks a site analogous to that ADP-ribosylated by choleragen in G-protein alpha subunits. Although both the ARF proteins and the alpha subunits bind guanine nucleotides and serve as choleragen substrates, they must interact with the toxin A1 peptide in different ways. In addition to serving as an ADP-ribose acceptor, ARF interacts with the toxin in a manner that modifies its catalytic properties. PMID:3135549

  9. Nucleotide and predicted amino acid sequence of a cDNA clone encoding part of human transketolase.

    PubMed

    Abedinia, M; Layfield, R; Jones, S M; Nixon, P F; Mattick, J S

    1992-03-31

    Transketolase is a key enzyme in the pentose-phosphate pathway which has been implicated in the latent human genetic disease, Wernicke-Korsakoff syndrome. Here we report the cloning and partial characterisation of the coding sequences encoding human transketolase from a human brain cDNA library. The library was screened with oligonucleotide probes based on the amino acid sequence of proteolytic fragments of the purified protein. Northern blots showed that the transketolase mRNA is approximately 2.2 kb, close to the minimum expected, of which approximately 60% was represented in the largest cDNA clone. Sequence analysis of the transketolase coding sequences reveals a number of homologies with related enzymes from other species. PMID:1567394

  10. Nucleotide sequence and expression in vitro of cDNA derived from mRNA of int-1, a provirally activated mouse mammary oncogene.

    PubMed Central

    Fung, Y K; Shackleford, G M; Brown, A M; Sanders, G S; Varmus, H E

    1985-01-01

    The mouse int-1 gene is a putative mammary oncogene discovered as a target for transcriptionally activating proviral insertion mutations in mammary carcinomas induced by the mouse mammary tumor virus in C3H mice. We have isolated molecular clones of full- or nearly full-length cDNA transcribed from int-1 RNA (2.6 kilobases) in a virus-induced mammary tumor. Comparison of the nucleotide sequence of the cDNA clones with that of the int-1 gene (A. van Ooyen and R. Nusse, Cell 39:233-240, 1984) shows the following. The coding region of the int-1 gene is composed of four exons. The splice donor and acceptor sites conform to consensus; however, at least two closely spaced polyadenylation sites are used, and the transcriptional initiation site remains ambiguous. The major open reading frame is preceded by an open frame 10 codons in length. The mRNA encodes a 41-kilodalton protein with several striking features: a strongly hydrophobic amino terminus, a cysteine-rich carboxy terminus, and four potential glycosylation sites. There are no differences in nucleotide sequence between the known exons of the normal and a provirally activated allele. The length of the deduced open reading frame was further confirmed by in vitro translation of RNA transcribed from the cDNA clones with SP6 RNA polymerase. PMID:3018519

  11. Channel catfish, Ictalurus punctatus, cyclophilin B cDNA sequence

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Cyclophilin B is a member of highly conserved immunophilins and ubiquitously found intracellularly. The complete sequence of the channel catfish cyclophilin B cDNA gene consisted of 996 nucleotides. Analysis of the nucleotide sequence reveals one open reading frame and 5’- and 3’-end untranslated...

  12. A comparative study of 2',3'-cyclic-nucleotide 3'-phosphodiesterase in vertebrates: cDNA cloning and amino acid sequences for chicken and bullfrog enzymes.

    PubMed

    Kasama-Yoshida, H; Tohyama, Y; Kurihara, T; Sakuma, M; Kojima, H; Tamai, Y

    1997-10-01

    In mammalian brain, two 2',3'-cyclic-nucleotide 3'-phosphodiesterase (EC 3.1.4.37) isoforms, CNP1 and CNP2, are translated, respectively, from the two mRNAs, which have been transcribed and processed by alternative use of the two transcription start points and by differential splicing. In the present study, the cDNAs encoding chicken CNP2 and bullfrog CNP1, respectively, were isolated, and the amino acid sequences of chicken CNP2 and bullfrog CNP1 were deduced. Western blot analysis showed that chicken brain contains a major CNP2-type protein together with a minor unidentified isoform, and bullfrog brain contains only a CNP1-type protein. All available amino acid sequences of vertebrate 2',3'-cyclic-nucleotide 3'-phosphodiesterases were aligned and compared. Three conserved motif sequences were noted: (a) an ATP-binding site near the amino terminus, (b) an isoprenylation site at the carboxyl terminus, and (c) a probable catalytic site resembling the active site of beta-ketoacyl synthase (EC 2.3.1.41). The second and the third motifs are conserved also in goldfish RICH (regeneration-induced 2',3'-cyclic-nucleotide 3'-phosphodiesterase homologue), which has been shown recently to have 2',3'-cyclic-nucleotide 3'-phosphodiesterase activity. The third motif (probably catalytic site) was assigned for the first time in the present report. PMID:9326261

  13. Cloning and partial nucleotide sequence of human immunoglobulin mu chain cDNA from B cells and mouse-human hybridomas.

    PubMed Central

    Dolby, T W; Devuono, J; Croce, C M

    1980-01-01

    Purified mRNAs coding for mu and kappa human immunoglobulin polypeptides were translated in vitro and their products were characterized. The mu-specific mRNAs, derived from both human lymphoblastoid cells (GM607) and from a mouse-human somatic cell hybrid secreting human mu chains (alpha D5-H11-BC11), were copied into cDNAs and inserted into the plasmid pBR322. Several recombinant cDNAs that were obtained were identified by a combination of colony hybridization with labeled probes, in vitro translation of plasmid-selected mu mRNAs, and DNA nucleotide sequence determination. One recombinant DNA, for which the sequence has been partially determined, contains the codons for part of the C3 constant region domain through the carboxy-terminal piece (155 amino acids total) as well as the entire 3' noncoding sequence up to the poly(A) site of the human mu mRNA. The sequence A-A-U-A-A occurs 12 nucleotides prior to the poly(A) addition site in the human mu mRNA. Considerable sequence homology is observed in the mouse and human mu mRNA 3' coding and noncoding sequences. PMID:6777778

  14. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

    1993-02-16

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a pu... Government rights: This application was funded under Department of Energy Contract DE-AC02-76ER01338. The U.S. Government has certain rights under this application and any patent issuing thereon.

  15. Method enabling fast partial sequencing of cDNA clones.

    PubMed

    Nordström, T; Gharizadeh, B; Pourmand, N; Nyren, P; Ronaghi, M

    2001-05-15

    Pyrosequencing is a nonelectrophoretic single-tube DNA sequencing method that takes advantage of cooperativity between four enzymes to monitor DNA synthesis. To investigate the feasibility of the recently developed technique for tag sequencing, 64 colonies of a selected human cDNA library were sequenced by both pyrosequencing and Sanger DNA sequencing. To determine the length needed to find a unique DNA sequence, 100 human sequence tags were retrieved from the database and different lengths from each sequence were randomly analyzed. A homology search based on 20 and 30 nucleotides produced 97 and 98% unique hits, respectively. A homology search based on 100 nucleotides could identify all searched genes. Pyrosequencing was employed to produce sequence data for 30 nucleotides. A similar search using BLAST revealed 16 different genes. Forty-six percent of the sequences shared homology with one gene at different positions. Two of the 64 clones had unique sequences. The search results from pyrosequencing were in 100% agreement with conventional DNA sequencing methods. The possibility of using a fully automated pyrosequencer machine for future high-throughput tag sequencing is discussed. PMID:11355860
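    The idea of asking how long a tag must be before it identifies a sequence uniquely can be sketched as follows. This is a simplified illustration, not the authors' procedure: it compares fixed-length prefixes within a small local collection rather than running homology searches against a database, and the function name and toy sequences are hypothetical.

```python
from collections import Counter


def unique_tag_fraction(sequences: list[str], tag_length: int) -> float:
    """Fraction of sequences whose first `tag_length` bases occur only once.

    Sequences shorter than `tag_length` are skipped. Prefix tags are a
    simplifying assumption; the abstract describes randomly chosen lengths
    and database homology searches.
    """
    tags = [seq[:tag_length] for seq in sequences if len(seq) >= tag_length]
    if not tags:
        return 0.0
    counts = Counter(tags)
    unique = sum(1 for tag in tags if counts[tag] == 1)
    return unique / len(tags)


# Toy data for illustration only (not real cDNA tags).
toy_sequences = ["ACGTACGTAA", "ACGTACGTCC", "TTGACCGTAA", "GGCATCGTAA"]
for length in (4, 9):
    print(length, unique_tag_fraction(toy_sequences, length))
```

    With the toy data, short tags collide for the two sequences that share a long prefix, while a longer tag separates them, which mirrors the trend of more unique hits at greater tag lengths reported in the abstract.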

  16. The EMBL Nucleotide Sequence Database.

    PubMed

    Stoesser, G; Tuli, M A; Lopez, R; Sterk, P

    1999-01-01

    The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl.html) constitutes Europe's primary nucleotide sequence resource. Main sources for DNA and RNA sequences are direct submissions from individual researchers, genome sequencing projects and patent applications. While automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO), the preferred submission tool for individual submitters is Webin (WWW). Through all stages, dataflow is monitored by EBI biologists communicating with the sequencing groups. In collaboration with DDBJ and GenBank the database is produced, maintained and distributed at the European Bioinformatics Institute (EBI). Database releases are produced quarterly and are distributed on CD-ROM. Network services allow access to the most up-to-date data collection via Internet and World Wide Web interface. EBI's Sequence Retrieval System (SRS) is a Network Browser for Databanks in Molecular Biology, integrating and linking the main nucleotide and protein databases, plus many specialised databases. For sequence similarity searching a variety of tools (e.g. Blitz, Fasta, Blast etc) are available for external users to compare their own sequences against the most currently available data in the EMBL Nucleotide Sequence Database and SWISS-PROT. PMID:9847133

  17. Sequencing of channel catfish, Ictalurus punctatus, cyclophilin A cDNA

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Cyclophilin A is a member of highly conserved immunophilins and ubiquitously found intracellularly. The complete sequence of the channel catfish cyclophilin A cDNA gene consisted of 1,170 nucleotides. Analysis of the nucleotide sequence reveals one open reading frame and 5’- and 3’-end untranslate...

  18. Nucleotide sequences 1986/1987

    SciTech Connect

    Not Available

    1987-01-01

    These eight volumes are the third annual published compendium of nucleic acid sequences included in the European Molecular Biology Laboratory Nucleotide Sequence Data Library and the GenBank Genetic Sequences Data Bank. Each volume surveys one or more subdivisions of the database. The volume subtitles are: Primates; Rodents; Other Vertebrates and Invertebrates, Plants and Organelles, Bacteria and Bacteriophage, Viruses, Structural RNA, Synthetic and Unannotated Sequences, and Database Directory and Master Indices.

  19. The nucleotide sequence of cowpea mosaic virus B RNA

    PubMed Central

    Lomonossoff, G.P.; Shanks, M.

    1983-01-01

    The complete sequence of the bottom component RNA (B RNA) of cowpea mosaic virus (CPMV) has been determined. Restriction enzyme fragments of double-stranded cDNA were cloned in M13 and the sequence of the inserts was determined by a combination of enzymatic and chemical sequencing techniques. Additional sequence information was obtained by primed synthesis on first strand cDNA. The complete sequence deduced is 5889 nucleotides long excluding the 3' poly(A), and contains an open reading frame sufficient to code for a polypeptide of mol. wt. 207 760. The coding region is flanked by a 5' leader sequence of 206 nucleotides and a 3' non-coding region of 82 residues which does not contain a polyadenylation signal. PMID:16453487

  20. Automated Identification of Nucleotide Sequences

    NASA Technical Reports Server (NTRS)

    Osman, Shariff; Venkateswaran, Kasthuri; Fox, George; Zhu, Dian-Hui

    2007-01-01

    STITCH is a computer program that processes raw nucleotide-sequence data to automatically remove unwanted vector information, perform reverse-complement comparison, stitch shorter sequences together to make longer ones to which the shorter ones presumably belong, and search against the user's choice of private and Internet-accessible public 16S rRNA databases. ["16S rRNA" denotes a ribosomal ribonucleic acid (rRNA) sequence that is common to all organisms.] In STITCH, a template 16S rRNA sequence is used to position forward and reverse reads. STITCH then automatically searches known 16S rRNA sequences in the user's chosen database(s) to find the sequence most similar to (the sequence that lies at the smallest edit distance from) each spliced sequence. The result of processing by STITCH is the identification of the most similar well-described bacterium. Whereas previously commercially available software for analyzing genetic sequences operates on one sequence at a time, STITCH can manipulate multiple sequences simultaneously to perform the aforementioned operations. A typical analysis of several dozen sequences (length of the order of 10^3 base pairs) by use of STITCH is completed in a few minutes, whereas such an analysis performed by use of prior software takes hours or days.
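    Two of the operations named in this abstract, reverse-complement comparison and a search for the reference at the smallest edit distance, can be sketched in a few lines. This is not STITCH's code: the function names, the choice of Levenshtein distance, the way the reverse complement is folded into the search, and the toy reference set are all assumptions made for illustration.

```python
# Translation table for DNA bases (assumes uppercase A, C, G, T only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def closest_reference(query: str, references: dict[str, str]):
    """Return (name, distance) of the nearest reference, trying both strands.

    Returns None when `references` is empty.
    """
    best = None
    for name, ref in references.items():
        distance = min(edit_distance(query, ref),
                       edit_distance(reverse_complement(query), ref))
        if best is None or distance < best[1]:
            best = (name, distance)
    return best


# Hypothetical toy references, far shorter than real 16S rRNA sequences.
toy_references = {"ref_a": "ACGTACGT", "ref_b": "TTTTGGGG"}
print(closest_reference("CCCCAAAA", toy_references))
```

    The quadratic-time distance used here is adequate for a sketch; sequences of the order of 10^3 base pairs compared against large databases would normally call for faster alignment heuristics.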

  1. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    1995-03-21

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1,018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli. 11 figures.

  2. cDNA encoding a polypeptide including a hevein sequence

    SciTech Connect

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    2000-07-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  3. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

    1995-03-21

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  4. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

    1999-05-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  5. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    1999-05-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli. 12 figs.

  6. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, N.V.; Broekaert, W.F.; Namhai Chua; Kush, A.

    1993-02-16

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1,018 nucleotides long and includes an open reading frame of 204 amino acids.

  7. Nucleotide sequences encoding a thermostable alkaline protease

    DOEpatents

    Wilson, David B.; Lao, Guifang

    1998-01-01

    Nucleotide sequences, derived from a thermophilic actinomycete microorganism, which encode a thermostable alkaline protease are disclosed. Also disclosed are variants of the nucleotide sequences which encode a polypeptide having thermostable alkaline proteolytic activity. Recombinant thermostable alkaline protease or recombinant polypeptide may be obtained by culturing in a medium a host cell genetically engineered to contain and express a nucleotide sequence according to the present invention, and recovering the recombinant thermostable alkaline protease or recombinant polypeptide from the culture medium.

  8. Nucleotide sequences encoding a thermostable alkaline protease

    DOEpatents

    Wilson, D.B.; Lao, G.

    1998-01-06

    Nucleotide sequences, derived from a thermophilic actinomycete microorganism, which encode a thermostable alkaline protease are disclosed. Also disclosed are variants of the nucleotide sequences which encode a polypeptide having thermostable alkaline proteolytic activity. Recombinant thermostable alkaline protease or recombinant polypeptide may be obtained by culturing in a medium a host cell genetically engineered to contain and express a nucleotide sequence according to the present invention, and recovering the recombinant thermostable alkaline protease or recombinant polypeptide from the culture medium. 3 figs.

  9. Long-range correlations in nucleotide sequences

    NASA Astrophysics Data System (ADS)

    Peng, C.-K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Sciortino, F.; Simons, M.; Stanley, H. E.

    1992-03-01

    DNA sequences have been analysed using models, such as an n-step Markov chain, that incorporate the possibility of short-range nucleotide correlations. We propose here a method for studying the stochastic properties of nucleotide sequences by constructing a 1:1 map of the nucleotide sequence onto a walk, which we term a 'DNA walk'. We then use the mapping to provide a quantitative measure of the correlation between nucleotides over long distances along the DNA chain. Thus we uncover in the nucleotide sequence a remarkably long-range power law correlation that implies a new scale-invariant property of DNA. We find such long-range correlations in intron-containing genes and in nontranscribed regulatory DNA sequences, but not in complementary DNA sequences or intron-less genes.
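
    The mapping of a sequence onto a walk can be sketched in a few lines. This is only an illustration, not the authors' code: the abstract does not state the step rule, so the rule used here (pyrimidines C/T step up, purines A/G step down, anything else stays level) is an assumption, as are the function and parameter names.

```python
from itertools import accumulate


def dna_walk(seq, step_up="CT", step_down="AG"):
    """Return the cumulative displacement after each nucleotide.

    Assumed rule (not taken from the abstract): bases in `step_up`
    contribute +1, bases in `step_down` contribute -1, and any other
    symbol (for example N) contributes 0.
    """
    steps = []
    for base in seq.upper():
        if base in step_up:
            steps.append(1)
        elif base in step_down:
            steps.append(-1)
        else:
            steps.append(0)
    # Running sum of the steps gives the position of the walker.
    return list(accumulate(steps))


walk = dna_walk("ACGTTC")
print(walk)
```

    Long-range correlation would then be assessed from how the fluctuation of this displacement grows with window length; that analysis is beyond this sketch.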

  10. Nucleotide sequence and expression of a Drosophila metallothionein.

    PubMed

    Lastowski-Perry, D; Otto, E; Maroni, G

    1985-02-10

    A Drosophila melanogaster cDNA clone was isolated based on its more intense hybridization to RNA sequences from copper-fed larvae than from control larval RNA. This clone showed strong hybridization to mouse metallothionein I cDNA at reduced stringency. Its nucleotide sequence includes an open reading segment which codes for a 40-amino acid protein; this protein is identified as metallothionein based on its similarity to the amino-terminal portion of mammalian and crab metalloproteins. The 10 cysteine residues present occur in five pairs of near vicinal cysteines (Cys-X-Cys). This cDNA sequence hybridized to a 400-nucleotide polyadenylated RNA whose presence in the cells of the alimentary canal of larvae was stimulated by ingestion of cadmium or copper; in other tissues this RNA was present at much lower levels. Mercury, silver, and zinc induced metallothionein to a lesser extent. The level of metallothionein RNA increased very soon after the initiation of metal treatment and reached a maximum after approximately 36 h. PMID:2578462

  11. Molecular cloning and sequencing of a novel human P2 nucleotide receptor.

    PubMed

    Southey, M C; Hammet, F; Hutchins, A M; Paidhungat, M; Somers, G R; Venter, D J

    1996-11-11

    A novel human P2 nucleotide receptor has been cloned from a T-cell cDNA library. The predicted amino acid sequence shows characteristics of a G-protein-coupled receptor, and shares 88% homology with a recently characterised rat P2 nucleotide receptor sequence. Distinctive features include an extremely short cytoplasmic tail with only one putative protein kinase C phosphorylation site. Northern blot analysis revealed a 1.9 kb transcript expressed in the placenta. PMID:8950181

  12. Statistical analysis of nucleotide sequences.

    PubMed Central

    Stückle, E E; Emmrich, C; Grob, U; Nielsen, P J

    1990-01-01

    In order to scan nucleic acid databases for potentially relevant but as yet unknown signals, we have developed an improved statistical model for pattern analysis of nucleic acid sequences by modifying previous methods based on Markov chains. We demonstrate the importance of selecting the appropriate parameters in order for the method to function at all. The model allows the simultaneous analysis of several short sequences with unequal base frequencies and Markov order k not equal to 0 as is usually the case in databases. As a test of these modifications, we show that in E. coli sequences there is a bias against palindromic hexamers which correspond to known restriction enzyme recognition sites. PMID:2251125
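
    As a rough illustration of this kind of test, the sketch below compares the observed count of a word with the count expected under a first-order Markov chain estimated from the same sequence. It is not the authors' improved model: the first-order estimator, the helper names and the toy sequence are all assumptions made for the example.

```python
def count_overlapping(seq, word):
    """Count occurrences of `word` in `seq`, allowing overlaps."""
    n = len(word)
    return sum(1 for i in range(len(seq) - n + 1) if seq[i:i + n] == word)


def expected_count_markov1(seq, word):
    """Expected count of `word` under a first-order Markov chain.

    Uses the standard estimator: the product of the dinucleotide counts
    along the word divided by the product of the counts of its interior
    bases. Returns 0.0 if an interior base never occurs.
    """
    numerator = 1.0
    for i in range(len(word) - 1):
        numerator *= count_overlapping(seq, word[i:i + 2])
    denominator = 1.0
    for base in word[1:-1]:
        denominator *= count_overlapping(seq, base)
    return numerator / denominator if denominator else 0.0


def is_palindromic(word):
    """True if the word equals its own reverse complement."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return word == "".join(comp[b] for b in reversed(word))


toy = "ACGACG"  # hypothetical toy sequence
print(count_overlapping(toy, "ACG"), expected_count_markov1(toy, "ACG"))
print(is_palindromic("GAATTC"))
```

    A bias against a hexamer would show up as an observed count well below the expected one; judging significance needs a proper statistical treatment that is not shown here.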

  13. The International Nucleotide Sequence Database Collaboration.

    PubMed

    Cochrane, Guy; Karsch-Mizrachi, Ilene; Takagi, Toshihisa

    2016-01-01

    The International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org) comprises three global partners committed to capturing, preserving and providing comprehensive public-domain nucleotide sequence information. The INSDC establishes standards, formats and protocols for data and metadata to make it easier for individuals and organisations to submit their nucleotide data reliably to public archives. This work enables the continuous, global exchange of information about living things. Here we present an update of the INSDC in 2015, including data growth and diversification, new standards and requirements by publishers for authors to submit their data to the public archives. The INSDC serves as a model for data sharing in the life sciences. PMID:26657633

  14. Nucleotide sequence of 3' untranslated portion of human alpha globin mRNA.

    PubMed Central

    Wilson, J T; deRiel, J K; Forget, B G; Marotta, C A; Weissman, S M

    1977-01-01

    We have determined the nucleotide sequence of 75 nucleotides of the 3'-untranslated portion of normal human alpha globin mRNA which corresponds to the elongated amino acid sequence of the chain termination mutant Hb Constant Spring. This was accomplished by sequence analysis of cDNA fragments obtained by restriction endonuclease or T4 endonuclease IV cleavage of human globin cDNA synthesized from globin mRNA by use of viral reverse transcriptase. Analysis of cRNA synthesized from cDNA by use of RNA polymerase provided additional confirmatory sequence information. Possible polymorphism has been identified at one site of the sequence. Our sequence overlaps with, and extends, the sequence of 43 nucleotides determined by Proudfoot and coworkers for the very 3'-terminal portion of human alpha globin mRNA. The complete 3'-untranslated sequence of human alpha globin mRNA (112 nucleotides including termination codon) shows little homology to that of the human or rabbit beta globin mRNAs except for the presence of the hexanucleotide sequence AAUAAA which is found in most eukaryotic mRNAs near the 3'-terminal poly(A). Images PMID:909779

  15. Cloning and sequence analysis of cDNA for human cathepsin D.

    PubMed Central

    Faust, P L; Kornfeld, S; Chirgwin, J M

    1985-01-01

    An 1110-base-pair cDNA clone for human cathepsin D was obtained by screening a lambda gt10 human hepatoma G2 cDNA library with a human renin exon 3 genomic fragment. Poly(A)+ RNA blot analysis with this cathepsin D clone demonstrated a message length of about 2.2 kilobases. The partial clone was used to screen a size-selected human kidney cDNA library, from which two cathepsin D recombinant plasmids with inserts of about 2200 and 2150 base pairs were obtained. The nucleotide sequences of these clones and of the lambda gt10 clone were determined. The amino acid sequence predicted from the cDNA sequence shows that human cathepsin D consists of 412 amino acids with 20 and 44 amino acids in a pre- and a prosegment, respectively. The mature protein region shows 87% amino acid identity with porcine cathepsin D but differs in having nine additional amino acids. Two of these are at the COOH terminus; the other seven are positioned between the previously determined junction for the light and heavy chains of porcine cathepsin D. A high degree of sequence homology was observed between human cathepsin D and other aspartyl proteases, suggesting a conservation of three-dimensional structure in this family of proteins. Images PMID:3927292

  16. Expressed sequence tags of Chinese cabbage flower bud cDNA.

    PubMed Central

    Lim, C O; Kim, H Y; Kim, M G; Lee, S I; Chung, W S; Park, S H; Hwang, I; Cho, M J

    1996-01-01

    We randomly selected and partially sequenced cDNA clones from a library of Chinese cabbage (Brassica campestris L. ssp. pekinensis) flower bud cDNAs. Out of 1216 expressed sequence tags (ESTs), 904 cDNA clones were unique or nonredundant. Five hundred eighty-eight clones (48.4%) had sequence homology to functionally defined genes at the peptide level. Only 5 clones encoded known flower-specific proteins. Among the cDNAs with no similarity to known protein sequences (628), 184 clones had significant similarity to nucleotide sequences registered in the databases. Among these 184 clones, 142 exhibited similarities at the nucleotide level only with plant ESTs. Also, sequence similarities were evident between these 142 ESTs and their matching ESTs when compared using the deduced amino acid sequences. Therefore, it is possible that the anonymous ESTs encode plant-specific ubiquitous proteins. Our extensive EST analysis of genes expressed in floral organs not only contributes to the understanding of the dynamics of genome expression patterns in floral organs but also adds data to the repertoire of all genomic genes. PMID:8787028

  17. Nucleotide sequence of bacteriophage fd DNA.

    PubMed Central

    Beck, E; Sommer, R; Auerswald, E A; Kurz, C; Zink, B; Osterburg, G; Schaller, H; Sugimoto, K; Sugisaki, H; Okamoto, T; Takanami, M

    1978-01-01

    The sequence of the 6,408 nucleotides of bacteriophage fd DNA has been determined. This allows the exact organisation of the filamentous phage genome to be deduced and provides easy access to DNA segments of known structure and function. PMID:745987

  18. [Construction and sequencing of full-length cDNA of peste des petits ruminants virus].

    PubMed

    Zhai, Jun-Jun; Dou, Yong-Xi; Zhang, Hai-Rui; Mao, Li; Meng, Xue-Lian; Luo, Xuo-Nong; Cai, Xue-Peng

    2010-07-01

    To develop a reverse genetics system for Peste des petits ruminants virus (PPRV), five pairs of oligonucleotide primers were designed on the basis of the full-length genomic sequence of the PPRV Nigeria 75/1 strain. Using RT-PCR, five overlapping cDNA fragments, designated JF1, JF2, JF3, JF4 and JF5, were amplified and then cloned into the pcDNA3.1(+) vector. An AscI restriction enzyme site and a T7 promoter sequence were introduced immediately upstream of the 5'-end, while a PacI restriction enzyme site was engineered downstream of the 3'-end. Using pok12 as the plasmid vector, the full-length cDNA clone pok12-PPRV of Nigeria 75/1 was assembled by connecting the five cDNA fragments via the unique restriction endonuclease site of the PPRV genome. The resultant nucleotide sequence of the PPRV Nigeria 75/1 strain was compared with those of other members of the genus Morbillivirus, and phylogenetic analysis was used to examine the evolutionary relationships. The results showed that PPRV Nigeria 75/1 was antigenically closely related to Rinderpest virus and Measles virus. Successful construction of the full-length cDNA clone of the PPRV Nigeria 75/1 strain lays the basis for rescuing PPRV effectively and enables further research on PPRV at the molecular level. PMID:20836386

  19. Complete Nucleotide Sequence of Tn10

    PubMed Central

    Chalmers, Ronald; Sewitz, Sven; Lipkow, Karen; Crellin, Paul

    2000-01-01

    The complete nucleotide sequence of Tn10 has been determined. The dinucleotide signature and percent G+C of the sequence had no discontinuities, indicating that Tn10 constitutes a homogeneous unit. The new sequence contained three new open reading frames corresponding to a glutamate permease, repressors of heavy metal resistance operons, and a hypothetical protein in Bacillus subtilis. The glutamate permease was fully functional when expressed, but Tn10 did not protect Escherichia coli from the toxic effects of various metals. PMID:10781570

  20. The multiple codes of nucleotide sequences.

    PubMed

    Trifonov, E N

    1989-01-01

    Nucleotide sequences carry genetic information of many different kinds, not just instructions for protein synthesis (triplet code). Several codes of nucleotide sequences are discussed including: (1) the translation framing code, responsible for correct triplet counting by the ribosome during protein synthesis; (2) the chromatin code, which provides instructions on appropriate placement of nucleosomes along the DNA molecules and their spatial arrangement; (3) a putative loop code for single-stranded RNA-protein interactions. The codes are degenerate and corresponding messages are not only interspersed but actually overlap, so that some nucleotides belong to several messages simultaneously. Tandemly repeated sequences frequently considered as functionless "junk" are found to be grouped into certain classes of repeat unit lengths. This indicates some functional involvement of these sequences. A hypothesis is formulated according to which the tandem repeats are given the role of weak enhancer-silencers that modulate, in a copy number-dependent way, the expression of proximal genes. Fast amplification and elimination of the repeats provides an attractive mechanism of species adaptation to a rapidly changing environment. PMID:2673451

  1. Characterization of Expressed Sequence Tags From a Gallus gallus Pineal Gland cDNA Library

    PubMed Central

    Hartman, Stefanie; Touchton, Greg; Wynn, Jessica; Geng, Tuoyu; Chong, Nelson W.

    2005-01-01

    The pineal gland is the circadian oscillator in the chicken, regulating diverse functions ranging from egg laying to feeding. Here, we describe the isolation and characterization of expressed sequence tags (ESTs) isolated from a chicken pineal gland cDNA library. A total of 192 unique sequences were analysed and submitted to GenBank; 6% of the ESTs matched neither GenBank cDNA sequences nor the newly assembled chicken genomic DNA sequence, three ESTs aligned with sequences designated to be on the Z_random, while one matched a W chromosome sequence and could be useful in cataloguing functionally important genes on this sex chromosome. Additionally, single nucleotide polymorphisms (SNPs) were identified and validated in 10 ESTs that showed 98% or higher sequence similarity to known chicken genes. Here, we have described resources that may be useful in comparative and functional genomic analysis of genes expressed in an important organ, the pineal gland, in a model and agriculturally important organism. PMID:18629218

  2. cDNA sequences of two apolipoproteins from lamprey

    SciTech Connect

    Pontes, M.; Xu, X.; Graham, D.; Riley, M.; Doolittle, R.F.

    1987-03-24

    The messages for two small but abundant apolipoproteins found in lamprey blood plasma were cloned with the aid of oligonucleotide probes based on amino-terminal sequences. In both cases, numerous clones were identified in a lamprey liver cDNA library, consistent with the great abundance of these proteins in lamprey blood. One of the cDNAs (LAL1) has a coding region of 105 amino acids that corresponds to a 21-residue signal peptide, a putative 8-residue propeptide, and the 76-residue mature protein found in blood. The other cDNA (LAL2) codes for a total of 191 residues, the first 23 of which constitute a signal peptide. The two proteins, which occur in the high-density lipoprotein fraction of ultracentrifuged plasma, have amino acid compositions similar to those of apolipoproteins found in mammalian blood; computer analysis indicates that the sequences are largely helix-permissive. When the sequences were searched against an amino acid sequence data base, rat apolipoprotein IV was the best matching candidate in both cases. Although a reasonable alignment can be made with that sequence and LAL1, definitive assignment of the two lamprey proteins to typical mammalian classes cannot be made at this point.

  3. Large-scale sequencing based on full-length-enriched cDNA libraries in pigs: contribution to annotation of the pig genome draft sequence

    PubMed Central

    2012-01-01

    Background: Along with the draft sequencing of the pig genome, which has been completed by an international consortium, collection of the nucleotide sequences of genes expressed in various tissues and determination of entire cDNA sequences are necessary for investigations of gene function. The sequences of expressed genes are also useful for genome annotation, which is important for isolating the genes responsible for particular traits. Results: We performed a large-scale expressed sequence tag (EST) analysis in pigs by using 32 full-length-enriched cDNA libraries derived from 28 kinds of tissues and cells, including seven tissues (brain, cerebellum, colon, hypothalamus, inguinal lymph node, ovary, and spleen) derived from pigs that were cloned from a sow subjected to genome sequencing. We obtained more than 330,000 EST reads from the 5′-ends of the cDNA clones. Comparison with human and bovine gene catalogs revealed that the ESTs corresponded to at least 15,000 genes. cDNA clones representing contigs and singlets generated by assembly of the EST reads were subjected to full-length determination of inserts. We have finished sequencing 31,079 cDNA clones corresponding to more than 12,000 genes. Mapping of the sequences of these cDNA clones on the draft sequence of the pig genome has indicated that the clones are derived from about 15,000 independent loci on the pig genome. Conclusions: ESTs and cDNA sequences derived from full-length-enriched libraries are valuable for annotation of the draft sequence of the pig genome. This information will also contribute to the exploration of promoter sequences on the genome and to molecular biology-based analyses in pigs. PMID:23150988

  4. 70-Kilodalton heat shock polypeptides from rainbow trout: characterization of cDNA sequences.

    PubMed Central

    Kothary, R K; Jones, D; Candido, E P

    1984-01-01

    RTG-2 cells, a line of fibroblasts from rainbow trout (Salmo gairdnerii), are induced to synthesize a distinct set of heat-shock polypeptides after exposure to elevated temperature or to low concentrations of sodium arsenite. We isolated and characterized two cDNA sequences, THS70.7 and THS70.14, encoding partial information for two distinct species of 70-kilodalton heat shock polypeptide (hsp70) from these cells. These sequences are identical at 73.3% of the nucleotide positions in their regions of overlap, and their degree of sequence conservation at the polypeptide level is 88.1%. The two derived trout hsp70 polypeptide sequences show extensive homology with derived amino acid sequences for hsp70 polypeptides from Drosophila melanogaster and Saccharomyces cerevisiae. Northern blot analysis of RNA from arsenite-induced RTG-2 cells, with the trout hsp70 cDNAs as probes, revealed the presence of three hsp70 mRNA species. Southern blot analysis of trout testis DNA cleaved with various restriction endonucleases revealed a small number of bands hybridizing to the hsp70 cDNAs, suggesting the existence of a small family of hsp70 genes in this species. Finally, trout hsp70 cDNA sequences cross-hybridized with restriction fragments in genomic DNA from HeLa cells, bovine liver, Caenorhabditis elegans, and D. melanogaster. Images PMID:6092938

  5. Cloning and sequencing of chloroperoxidase cDNA.

    PubMed Central

    Fang, G H; Kenigsberg, P; Axley, M J; Nuell, M; Hager, L P

    1986-01-01

    An oligo-d(T)12-18-primed cDNA library has been prepared from Caldariomyces fumago mRNA. A clone containing a full-length insert was sequenced on the supercoiled plasmid, pBR322. The complete primary sequence of chloroperoxidase has been derived. We have also determined about 73% of the peptide sequence by amino acid sequencing. The DNA sequence data match all of the available known peptide sequences. The mature polypeptide contains 300 amino acids having a combined molecular weight of 32,974 daltons. A putative signal peptide of 21 amino acids is proposed from DNA sequence data. The chloroperoxidase gene encodes three potential glycosylation sites recognized as Asn-X-Thr/Ser sequences. Three cysteine residues are found in the protein sequence. A small region around Cys87 bears a minimal homology to the active site of cytochrome P450cam. No other heme protein homologues can be detected. We propose that Cys87 serves as a thiolate ligand to the iron of the heme prosthetic group. A rare arginine codon, AGG, is used three times out of twelve, in contrast to the very infrequent use of this codon in E. coli or yeast. PMID:3774552

  6. Oreochromis mossambicus (tilapia) corticotropin-releasing hormone: cDNA sequence and bioactivity.

    PubMed

    van Enckevort, F H; Pepels, P P; Leunissen, J A; Martens, G J; Wendelaar Bonga, S E; Balm, P H

    2000-02-01

    Although hypothalamic corticotropin-releasing hormone (CRH) is involved in the stress response in all vertebrate groups, only a limited number of studies on this neuroendocrine peptide deals with non-mammalian neuroendocrine systems. We determined the cDNA sequence of the CRH precursor of the teleost Oreochromis mossambicus (tilapia) and studied the biological potency of the CRH peptide in a homologous teleost bioassay. Polymerase chain reaction (PCR) with degenerate and specific primers yielded fragments of tilapia CRH cDNA. Full-length CRH cDNA (988 nucleotides) was obtained by screening a tilapia hypothalamus cDNA library with the tilapia CRH PCR products. The precursor sequence (167 amino acids) contains a signal peptide, the CRH peptide and a motif conserved among all vertebrate CRH precursors. Tilapia CRH (41 aa) displays between 63% and 80% amino acid sequence identity to CRH from other vertebrates, whereas the degree of identity to members of the urotensin I/urocortin lineage is considerably lower. In a phylogenetic tree, based on alignment of all full CRH peptide precursors presently known, the three teleost CRH precursors (tilapia; sockeye salmon, Oncorhynchus nerka; white sucker, Catostomus commersoni) form a monophyletic group distinct from amphibian and mammalian precursors. Despite the differences between the primary structures of tilapia and rat CRH, maximally effective concentrations of tilapia and rat CRH were equally potent in stimulating adrenocorticotropic hormone (ACTH) and alpha-MSH release by tilapia pituitaries in vitro. The tilapia and salmon CRH sequences show that more variation exists between orthologous vertebrate CRH structures, and teleost CRHs in particular than previously recognized. Whether the structural differences reflect different mechanisms of action of this peptide in the stress response remains to be investigated. PMID:10718913

  7. Matrix genes of measles virus and canine distemper virus: cloning, nucleotide sequences, and deduced amino acid sequences.

    PubMed Central

    Bellini, W J; Englund, G; Richardson, C D; Rozenblatt, S; Lazzarini, R A

    1986-01-01

    The nucleotide sequences encoding the matrix (M) proteins of measles virus (MV) and canine distemper virus (CDV) were determined from cDNA clones containing these genes in their entirety. In both cases, single open reading frames specifying basic proteins of 335 amino acid residues were predicted from the nucleotide sequences. Both viral messages were composed of approximately 1,450 nucleotides and contained 400 nucleotides of presumptive noncoding sequences at their respective 3' ends. MV and CDV M-protein-coding regions were 67% homologous at the nucleotide level and 76% homologous at the amino acid level. Only chance homology was observed in the 400-nucleotide trailer sequences. Comparisons of the M protein sequences of MV and CDV with the sequence reported for Sendai virus (B. M. Blumberg, K. Rose, M. G. Simona, L. Roux, C. Giorgi, and D. Kolakofsky, J. Virol. 52:656-663; Y. Hidaka, T. Kanda, K. Iwasaki, A. Nomoto, T. Shioda, and H. Shibuta, Nucleic Acids Res. 12:7965-7973) indicated the greatest homology among these M proteins in the carboxyterminal third of the molecule. Secondary-structure analyses of this shared region indicated a structurally conserved, hydrophobic sequence which possibly interacted with the lipid bilayer. Images PMID:3754588

  8. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

    2000-07-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  9. Characterization of cDNA clones encoding rabbit and human serum paraoxonase: The mature protein retains its signal sequence

    SciTech Connect

    Hassett, C.; Richter, R.J.; Humbert, R.; Omiecinski, C.J.; Furlong, C.E.; Chapline, C.; Crabb, J.W.

    1991-10-22

    Serum paraoxonase hydrolyzes the toxic metabolites of a variety of organophosphorus insecticides. High serum paraoxonase levels appear to protect against the neurotoxic effects of organophosphorus substrates of this enzyme. The amino acid sequence accounting for 42% of rabbit paraoxonase was determined. From these data, two oligonucleotide probes were synthesized and used to screen a rabbit liver cDNA library. Human paraoxonase clones were isolated from a liver cDNA library by using the rabbit cDNA as a hybridization probe. Inserts from three of the longest clones were sequenced, and one full-length clone contained an open reading frame encoding 355 amino acids, four less than the rabbit paraoxonase protein. Amino-terminal sequences derived from purified rabbit and human paraoxonase proteins suggested that the signal sequence is retained, with the exception of the initiator methionine residue. Characterization of the rabbit and human paraoxonase cDNA clones confirms that the signal sequences are not processed, except for the N-terminal methionine residue. The rabbit and human cDNA clones demonstrate striking nucleotide and deduced amino acid similarities (greater than 85%), suggesting an important metabolic role and constraints on the evolution of this protein.

  10. cDNA cloning and sequencing of rat α1-macroglobulin

    SciTech Connect

    Waermegaard, B.; Martin, N.; Johansson, S.

    1992-03-03

    cDNA clones coding for the plasma protease inhibitor α1-macroglobulin were isolated from a rat liver library. The obtained cDNA sequence contained 4701 nucleotides and had an open reading frame coding for a 1500 amino acid long protein, including a 24 amino acid signal peptide. The identity of the deduced protein sequence as α1-macroglobulin was established by comparison with published peptide sequences of the protein. The mature protein shares 53% and 57% overall amino acid identity with the two other identified members of the rat α-macroglobulin family, α1-inhibitor 3 and α2-macroglobulin. A sequence typical for an internal thiol ester was identified. Of the 24 cysteines, 23 are conserved with α2-macroglobulin. However, instead of the two most C-terminal cysteines in α2-macroglobulin, which form a disulfide bridge in the receptor binding domain, α1-macroglobulin contains phenylalanine. One mRNA species hybridizing with the α1-macroglobulin probe was observed in rat and mouse liver RNA (approximately 6.2 kb), whereas no corresponding transcript was detected in RNA from human liver.

  11. Rat cellular retinol-binding protein: cDNA sequence and rapid retinol-dependent accumulation of mRNA.

    PubMed Central

    Sherman, D R; Lloyd, R S; Chytil, F

    1987-01-01

    Cellular retinol-binding protein (CRBP) may be an important mediator of vitamin A action. We report here the identification of a cDNA clone corresponding to the rat CRBP gene. The cDNA is 695 nucleotides long, with an open reading frame corresponding to a protein of 134 amino acids. The deduced amino acid sequence is identical with that of rat CRBP. The nucleotide sequence shows 90.5% similarity with the human CRBP cDNA sequence. Genomic DNA analysis indicates that CRBP is present in one, or at most two, copies within the rat genome. Analysis of mRNA reveals a single species in every tissue tested and suggests that the isolated cDNA is full-length. Finally, when retinol-deficient rats are fed retinyl acetate for 4 hr, about 4-fold accumulation of CRBP-specific mRNA is observed in the lungs. This rapid effect suggests that the micronutrient retinol may directly influence the expression of its specific intracellular binding protein. Images PMID:3472205

  12. Insights into corn genes derived from large-scale cDNA sequencing.

    PubMed

    Alexandrov, Nickolai N; Brover, Vyacheslav V; Freidin, Stanislav; Troukhan, Maxim E; Tatarinova, Tatiana V; Zhang, Hongyu; Swaller, Timothy J; Lu, Yu-Ping; Bouck, John; Flavell, Richard B; Feldmann, Kenneth A

    2009-01-01

    We present a large portion of the transcriptome of Zea mays, including ESTs representing 484,032 cDNA clones from 53 libraries and 36,565 fully sequenced cDNA clones, out of which 31,552 clones are non-redundant. These and other previously sequenced transcripts have been aligned with available genome sequences and have provided new insights into the characteristics of gene structures and promoters within this major crop species. We found that although the average number of introns per gene is about the same in corn and Arabidopsis, corn genes have more alternatively spliced isoforms. Examination of the nucleotide composition of coding regions reveals that corn genes, as well as genes of other Poaceae (Grass family), can be divided into two classes according to the GC content at the third position in the amino acid encoding codons. Many of the transcripts that have lower GC content at the third position have dicot homologs but the high GC content transcripts tend to be more specific to the grasses. The high GC content class is also enriched with intronless genes. Together this suggests that an identifiable class of genes in plants is associated with the Poaceae divergence. Furthermore, because many of these genes appear to be derived from ancestral genes that do not contain introns, this evolutionary divergence may be the result of horizontal gene transfer from species not only with different codon usage but possibly that did not have introns, perhaps outside of the plant kingdom. By comparing the cDNAs described herein with the non-redundant set of corn mRNAs in GenBank, we estimate that there are about 50,000 different protein coding genes in Zea. All of the sequence data from this study have been submitted to DDBJ/GenBank/EMBL under accession numbers EU940701-EU977132 (FLI cDNA) and FK944382-FL482108 (EST). PMID:18937034
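
    The GC content at the third codon position (often called GC3) is straightforward to compute for a coding sequence. The sketch below is illustrative only: the function name is made up, it assumes the input is an in-frame coding sequence, and it does not reproduce the study's classification procedure or any threshold between the two gene classes.

```python
def gc3(cds):
    """Fraction of codons whose third base is G or C.

    Assumes `cds` starts in frame; trailing bases that do not form a
    complete codon are ignored. Returns 0.0 for sequences shorter than
    one codon.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3   # drop an incomplete final codon
    third_bases = cds[2:usable:3]      # every third base: indices 2, 5, 8, ...
    if not third_bases:
        return 0.0
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Hypothetical toy sequence: codons ATG, GCC, AAA have third bases G, C, A.
print(gc3("ATGGCCAAA"))
```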

  13. The nucleotide sequence and genome organization of strawberry mild yellow edge-associated potexvirus.

    PubMed

    Jelkmann, W; Maiss, E; Martin, R R

    1992-02-01

    The nucleotide sequence (5966 nucleotides) of cDNA clones of strawberry mild yellow edge-associated potexvirus was determined. The genome contains six open reading frames (ORFs) encoding putative proteins with Mrs of 149,423, 25,344, 11,576, 8079, 25,714 and 11,216. In the first three putative proteins and the coat protein considerable similarity was found to comparable polypeptides of the potexviruses potato virus X, clover yellow mosaic virus, narcissus mosaic virus, papaya mosaic virus, white clover mosaic virus and lily virus X. PMID:1339469

  14. WebGMAP: a web service for mapping and aligning cDNA sequences to genomes

    PubMed Central

    Liang, Chun; Liu, Lin; Ji, Guoli

    2009-01-01

    The genomes of thousands of organisms are being sequenced, often with accompanying sequences of cDNAs or ESTs. One of the great challenges in bioinformatics is to make these genomic sequences and genome annotations accessible in a user-friendly manner to general biologists to address interesting biological questions. We have created an open-access web service called WebGMAP (http://www.bioinfolab.org/software/webgmap) that seamlessly integrates cDNA-genome alignment tools, such as GMAP, with easy-to-use data visualization and mining tools. This web service is intended to facilitate community efforts in improving genome annotation, determining accurate gene structures and their variations, and exploring important biological processes such as alternative splicing and alternative polyadenylation. For routine sequence analysis, WebGMAP provides a web-based sequence viewer with many useful functions, including nucleotide positioning, six-frame translations, sequence reverse complementation, and imperfect motif detection and alignment. WebGMAP also provides users with the ability to sort, filter and search for individual cDNA sequences and cDNA-genome alignments. Our EST-Genome-Browser can display annotated gene structures and cDNA-genome alignments at scales from 100 to 50 000 nt. With its ability to highlight base differences between query cDNAs and the genome, our EST-Genome-Browser allows biologists to discover potential point or insertion-deletion variations from cDNA-genome alignments. PMID:19465381
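
    Two of the viewer functions listed above, reverse complementation and six-frame translation, are simple enough to sketch. The code below is an independent illustration using the standard genetic code; it is not WebGMAP's implementation, and all names in it are assumptions.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate in frame 1; codons with unknown bases become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq: str) -> dict:
    """Return translations for frames +1..+3 and -1..-3."""
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames
```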

  15. cDNA sequence, mRNA expression and genomic DNA of trypsinogen from the indianmeal moth, Plodia interpunctella.

    PubMed

    Zhu, Y C; Oppert, B; Kramer, K J; McGaughey, W H; Dowdy, A K

    2000-02-01

    Trypsin-like enzymes are major insect gut enzymes that digest dietary proteins and proteolytically activate insecticidal proteins produced by the bacterium Bacillus thuringiensis (Bt). Resistance to Bt in a strain of the Indianmeal moth, Plodia interpunctella, was linked to the absence of a major trypsin-like proteinase (Oppert et al., 1997). In this study, trypsin-like proteinases, cDNA sequences, mRNA expression levels and genomic DNAs from Bt-susceptible and -resistant strains of the Indianmeal moth were compared. Proteinase activity blots of gut extracts indicated that the susceptible strain had two major trypsin-like proteinases, whereas the resistant strain had only one. Several trypsinogen-like cDNA clones were isolated and sequenced from cDNA libraries of both strains using a probe deduced from a conserved sequence for a serine proteinase active site. cDNAs of 852 nucleotides from the susceptible strain and 848 nucleotides from the resistant strain contained an open reading frame of 783 nucleotides which encoded a 261-amino acid trypsinogen-like protein. There was a single silent nucleotide difference between the two cDNAs in the open reading frame and the predicted amino acid sequence from the cDNA clones was most similar to sequences of trypsin-like proteinases from the spruce budworm, Choristoneura fumiferana, and the tobacco hornworm, Manduca sexta. The encoded protein included amino acid sequence motifs of serine proteinase active sites, conserved cysteine residues, and both zymogen activation and signal peptides. Northern blotting analysis showed no major difference between the two strains in mRNA expression in fourth-instar larvae, indicating that transcription was similar in the strains. Southern blotting analysis revealed that the restriction sites for the trypsinogen genes from the susceptible and resistant strains were different. Based on an enzyme size comparison, the cDNA isolated in this study corresponded to the gene for the smaller of two

  16. Cataloging of the genes expressed in human keratinocytes: analysis of 607 randomly isolated cDNA sequences.

    PubMed

    Konishi, K; Morishima, Y; Ueda, E; Kibe, Y; Nonomura, K; Yamanishi, K; Yasuno, H

    1994-07-29

    The partial nucleotide sequences of 607 cDNAs randomly isolated from a cDNA library of cultured human epidermal keratinocytes were determined by single pass sequencing. Homology search of the sequences to the non-redundant nucleotide databases revealed that 27% of the cDNAs matched registered human or non-human genes encoding not only keratinocyte-specific genes, but also a variety of functional proteins, the expression of which had not been identified in keratinocytes. Non-matching cDNAs covering 49% of the cDNAs were not homologous even to ESTs from other organs, suggesting that these cDNAs include novel genes expressed in the cells. The large scale sequencing of keratinocyte cDNAs provides a useful molecular source for research into biology and diseases of the skin. PMID:8048971

  17. The complete nucleotide sequence and genome organization of pea streak virus (genus Carlavirus).

    PubMed

    Su, Li; Li, Zhengnan; Bernardy, Mike; Wiersma, Paul A; Cheng, Zhihui; Xiang, Yu

    2015-10-01

    Pea streak virus (PeSV) is a member of the genus Carlavirus in the family Betaflexiviridae. Here, the first complete genome sequence of PeSV was determined by deep sequencing of a cDNA library constructed from dsRNA extracted from a PeSV-infected sample and Rapid Amplification of cDNA Ends (RACE) PCR. The PeSV genome consists of 8041 nucleotides excluding the poly(A) tail and contains six open reading frames (ORFs). The putative peptide encoded by the PeSV ORF6 has an estimated molecular mass of 6.6 kDa and shows no similarity to any known proteins. This differs from typical carlaviruses, whose ORF6 encodes a 12- to 18-kDa cysteine-rich nucleic-acid-binding protein. PMID:26092422

  18. cDNA, genomic sequence cloning and overexpression of giant panda (Ailuropoda melanoleuca) mitochondrial ATP synthase ATP5G1.

    PubMed

    Hou, W-R; Hou, Y-L; Ding, X; Wang, T

    2012-01-01

    The ATP5G1 gene is one of the three genes that encode mitochondrial ATP synthase subunit c of the proton channel. We cloned the cDNA and determined the genomic sequence of the ATP5G1 gene from the giant panda (Ailuropoda melanoleuca) using RT-PCR technology and touchdown-PCR, respectively. The cloned cDNA fragment contains an open reading frame of 411 bp encoding 136 amino acids; the genomic sequence is 1838 bp long and contains three exons and two introns. Alignment analysis revealed that the nucleotide sequence and the deduced protein sequence are highly conserved compared to Homo sapiens, Mus musculus, Rattus norvegicus, Bos taurus, and Sus scrofa. The homologies for nucleotide sequences of the giant panda ATP5G1 to those of these species are 93.92, 92.21, 92.46, 93.67, and 92.46%, respectively, and the homologies for amino acid sequences are 90.44, 95.59, 93.38, 94.12, and 91.91%, respectively. Topology prediction showed that there is one protein kinase C phosphorylation site, one casein kinase II phosphorylation site, five N-myristoylation sites, and one ATP synthase c subunit signature in the ATP5G1 protein of the giant panda. The cDNA of ATP5G1 was transfected into Escherichia coli, and the ATP5G1 fused with the N-terminally GST-tagged protein gave rise to accumulation of an expected 40-kDa polypeptide, which had the characteristics of the predicted protein. PMID:23007995
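
    The relationship between the 411 bp open reading frame and the 136-residue protein is simple codon arithmetic: 411 / 3 = 137 codons, one of which is the stop codon. A minimal sketch follows; the function name and the `includes_stop` flag are illustrative assumptions.

```python
def protein_length(orf_nt: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by an ORF of `orf_nt` nucleotides.

    `includes_stop` says whether the quoted ORF length counts the stop codon.
    """
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons
```

    With the figures above, `protein_length(411)` gives 136. Abstracts differ on whether a quoted ORF length counts the stop codon, which is why the flag is exposed.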

  19. Cost-Effective Sequencing of Full-Length cDNA Clones Powered by a De Novo-Reference Hybrid Assembly

    PubMed Central

    Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka

    2010-01-01

    Background: Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. Methodology: We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence ∼800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. Conclusions: The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only ∼US$3 per clone, demonstrating a significant advantage over previous approaches. PMID:20479877

  20. cDNA sequences of variant forms of human placenta diamine oxidase

    SciTech Connect

    Zhang, X.; Kim, J.; McIntire, S.

    1995-08-01

    Genes for two forms of human placenta diamine oxidase (dao) were cloned from a cDNA library and sequenced. One gene, pdao1, is identical in length to human kidney dao but differs from it by two bases in the coding region and differs slightly in the 3'- and 5'-noncoding regions. The second gene, pdao2, is nearly identical to these genes in the coding region, except that it has an extra 57-nucleotide coding segment near the 3' end of this region. This segment corresponds to the contiguous sequence of the 3' end of intron 3 of human kidney dao. pdao2 also differs significantly from pdao1 and human kidney dao in a 13-base sequence in the t'-noncoding region. It is proposed that pdao1 and human kidney dao are polymorphic forms of the same allele. Whether pdao2 is a polymorph of these two is not certain, because of the significant differences in the coding and noncoding regions. pdao2 may represent a different allele. 21 refs., 2 figs.

  1. Simplified computer programs for search of homology within nucleotide sequences.

    PubMed Central

    Kröger, M; Kröger-Block, A

    1984-01-01

    Four new computer programs for search of homology within nucleotide sequences are presented. The main scope of the program design is flexibility, independence of sequence length and the capability to be used by any molecular biologist without any prior computer experience. The programs offer a linear search, a search for maximal identity, an alignment along a given sequence and a search based on homology within the amino acid coding capacity of nucleotide sequences. The language is Fortran V. Copies are available on request. PMID:6546417
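
    To illustrate what a "search for maximal identity" can look like, the sketch below slides a query along a target sequence without gaps and reports the placement with the most identical bases. It is a modern Python illustration of the general idea only; it is not a port of the Fortran programs described above, whose algorithms the abstract does not specify, and the function name is an assumption.

```python
def best_ungapped_match(query: str, target: str) -> tuple:
    """Return (offset, identical_bases) for the best ungapped placement.

    The query is compared against every window of the same length in the
    target; the first window with the highest identity count wins.
    """
    query, target = query.upper(), target.upper()
    if not query or len(query) > len(target):
        raise ValueError("query must be non-empty and no longer than target")
    best_offset, best_score = 0, -1
    for offset in range(len(target) - len(query) + 1):
        window = target[offset:offset + len(query)]
        score = sum(q == t for q, t in zip(query, window))
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score
```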

  2. Primary structure of bovine pituitary secretory protein I (chromogranin A) deduced from the cDNA sequence

    SciTech Connect

    Ahn, T.G.; Cohn, D.V.; Gorr, S.U.; Ornstein, D.L.; Kashdan, M.A.; Levine, M.A.

    1987-07-01

    Secretory protein I (SP-I), also referred to as chromogranin A, is an acidic glycoprotein that has been found in every tissue of endocrine and neuroendocrine origin examined but never in exocrine or epithelial cells. Its co-storage and co-secretion with peptide hormones and neurotransmitters suggest that it has an important endocrine or secretory function. The authors have isolated cDNA clones from a bovine pituitary λgt11 expression library using an antiserum to parathyroid SP-I. The largest clone (SP4B) hybridized to a transcript of 2.1 kilobases in RNA from parathyroid, pituitary, and adrenal medulla. Immunoblots of bacterial lysates derived from SP4B lysogens demonstrated specific antibody binding to an SP4B/β-galactosidase fusion protein (160 kDa) with a cDNA-derived component of 46 kDa. Radioimmunoassay of the bacterial lysates with SP-I antiserum yielded parallel displacement curves of ¹²⁵I-labeled SP-I by the SP4B lysate and authentic SP-I. SP4B contains a cDNA of 1614 nucleotides that encodes a 449-amino acid protein (calculated mass, 50 kDa). The nucleotide sequences of the pituitary SP-I cDNA and adrenal medullary SP-I cDNAs are nearly identical. Analysis of genomic DNA suggests that pituitary, adrenal, and parathyroid SP-I are products of the same gene.

  3. Nucleotide sequence of SHV-2 beta-lactamase gene

    SciTech Connect

    Garbarg-Chenon, A.; Godard, V.; Labia, R.; Nicolas, J.C.

    1990-07-01

    The nucleotide sequence of plasmid-mediated beta-lactamase SHV-2 from Salmonella typhimurium (SHV-2pHT1) was determined. The gene was very similar to chromosomally encoded beta-lactamase LEN-1 of Klebsiella pneumoniae. Compared with the sequence of the Escherichia coli SHV-2 enzyme (SHV-2E.coli) obtained by protein sequencing, the deduced amino acid sequence of SHV-2pHT1 differed by three amino acid substitutions.

  4. Acetylcholinesterase of Stomoxys calcitrans (L.) (Diptera: Muscidae): cDNA sequence, baculovirus expression, and biochemical properties

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A 2193-nucleotide cDNA encoding acetylcholinesterase (AChE) of the stable fly, Stomoxys calcitrans (L.) was expressed in the baculovirus system. The open reading frame encoded a 91 amino acid secretion signal peptide and a 613 amino acid mature protein with 96% and 94% identity to the AChEs of Haema...

  5. Reading biological processes from nucleotide sequences

    NASA Astrophysics Data System (ADS)

    Murugan, Anand

    Cellular processes have traditionally been investigated by techniques of imaging and biochemical analysis of the molecules involved. The recent rapid progress in our ability to manipulate and read nucleic acid sequences gives us direct access to the genetic information that directs and constrains biological processes. While sequence data is being used widely to investigate genotype-phenotype relationships and population structure, here we use sequencing to understand biophysical mechanisms. We present work on two different systems. First, in chapter 2, we characterize the stochastic genetic editing mechanism that produces diverse T-cell receptors in the human immune system. We do this by inferring statistical distributions of the underlying biochemical events that generate T-cell receptor coding sequences from the statistics of the observed sequences. This inferred model quantitatively describes the potential repertoire of T-cell receptors that can be produced by an individual, providing insight into its potential diversity and the probability of generation of any specific T-cell receptor. Then in chapter 3, we present work on understanding the functioning of regulatory DNA sequences in both prokaryotes and eukaryotes. Here we use experiments that measure the transcriptional activity of large libraries of mutagenized promoters and enhancers and infer models of the sequence-function relationship from this data. For the bacterial promoter, we infer a physically motivated 'thermodynamic' model of the interaction of DNA-binding proteins and RNA polymerase determining the transcription rate of the downstream gene. For the eukaryotic enhancers, we infer heuristic models of the sequence-function relationship and use these models to find synthetic enhancer sequences that optimize inducibility of expression. 
    Both projects demonstrate the utility of sequence information in conjunction with sophisticated statistical inference techniques for dissecting underlying biophysical

  6. Amino acid sequence of the serine-repeat antigen (SERA) of Plasmodium falciparum determined from cloned cDNA.

    PubMed

    Bzik, D J; Li, W B; Horii, T; Inselburg, J

    1988-09-01

    We report the isolation of cDNA clones for a Plasmodium falciparum gene that encodes the complete amino acid sequence of a previously identified exported blood stage antigen. The Mr of this antigen protein had been determined by sodium dodecylsulphate-polyacrylamide gel electrophoresis analysis, by different workers, to be 113,000, 126,000, and 140,000. We show, by cDNA nucleotide sequence analysis, that this antigen gene encodes a 989 amino acid protein (111 kDa) that contains a potential signal peptide, but not a membrane anchor domain. In the FCR3 strain the serine content of the protein was 11%, of which 57% of the serine residues were localized within a 201 amino acid sequence that included 35 consecutive serine residues. The protein also contained three possible N-linked glycosylation sites and numerous possible O-linked glycosylation sites. The mRNA was abundant during late trophozoite-schizont parasite stages. We propose to identify this antigen, which had been called p126, by the acronym SERA, serine-repeat antigen, based on its complete structure. The usefulness of the cloned cDNA as a source of a possible malaria vaccine is considered in view of the previously demonstrated ability of the antigen to induce parasite-inhibitory antibodies and a protective immune response in Saimiri monkeys. PMID:2847041

  7. Nucleotide and predicted amino acid sequences of cloned human and mouse preprocathepsin B cDNAs.

    PubMed Central

    Chan, S J; San Segundo, B; McCormick, M B; Steiner, D F

    1986-01-01

    Cathepsin B is a lysosomal thiol proteinase that may have additional extralysosomal functions. To further our investigations on the structure, mode of biosynthesis, and intracellular sorting of this enzyme, we have determined the complete coding sequences for human and mouse preprocathepsin B by using cDNA clones isolated from human hepatoma and kidney phage libraries. The nucleotide sequences predict that the primary structure of preprocathepsin B contains 339 amino acids organized as follows: a 17-residue NH2-terminal prepeptide sequence followed by a 62-residue propeptide region, 254 residues in mature (single chain) cathepsin B, and a 6-residue extension at the COOH terminus. A comparison of procathepsin B sequences from three species (human, mouse, and rat) reveals that the homology between the propeptides is relatively conserved with a minimum of 68% sequence identity. In particular, two conserved sequences in the propeptide that may be functionally significant include a potential glycosylation site and the presence of a single cysteine at position 59. Comparative analysis of the three sequences also suggests that processing of procathepsin B is a multistep process, during which enzymatically active intermediate forms may be generated. The availability of the cDNA clones will facilitate the identification of possible active or inactive intermediate processive forms as well as studies on the transcriptional regulation of the cathepsin B gene. PMID:3463996

  8. 5'-terminal nucleotide sequences of mammalian type C helper viruses are conserved in the genomes of replication-defective mammalian transforming viruses.

    PubMed Central

    Tronick, S R; Cabradilla, C D; Aaronson, S A; Haseltine, W A

    1978-01-01

    The RNAs of replication-defective murine and primate type C transforming viruses were analyzed for the presence of nucleotide sequences homologous to the genomes of their respective helper type C viruses by using DNAs complementary (cDNA) to either the 5'-terminal (cDNA5') or total (cDNAtotal) nucleotide sequences of the helper virus RNA. The defective viruses examined have previously been shown to vary in their ability to express helper viral gag gene proteins. With cDNAtotal as a probe, these transforming viruses were shown to vary in their representation of helper sequences (15 to 60% hybridization of cDNAtotal). In striking contrast, 5'-terminal-specific sequences of the helper virus were conserved in the RNAs of every transforming virus tested (>80% hybridization of cDNA5'). These findings suggest a critical role for these sequences in the life cycle of the defective transforming virus. PMID:209210

  9. Nucleotide sequence of the gene for human prothrombin

    SciTech Connect

    Degen, S.J.F.; Davie, E.W.

    1987-09-22

    A human genomic DNA library was screened for the gene coding for human prothrombin with a cDNA coding for the human protein. Eighty-one positive lambda phage were identified, and three were chosen for further characterization. These three phage hybridized with 5' and/or 3' probes prepared from the prothrombin cDNA. The complete DNA sequence of 21 kilobases of the human prothrombin gene was determined and included a 4.9-kilobase region that was previously sequenced. The gene for human prothrombin contains 14 exons separated by 13 intervening sequences. The exons range in size from 25 to 315 base pairs, while the introns range from 84 to 9447 base pairs. Ninety percent of the gene is composed of intervening sequence. All the intron splice junctions are consistent with sequences found in other eukaryotic genes, except for the presence of GC rather than GT on the 5' end of intervening sequence L. Thirty copies of Alu repetitive DNA and two copies of partial KpnI repeats were identified in clusters within several of the intervening sequences, and these repeats represent 40% of the DNA sequence of the gene. The size, distribution, and sequence homology of the introns within the gene were then compared to those of the genes for the other vitamin K dependent proteins and several other serine proteases.

  10. The complete nucleotide sequence and structure of the gene encoding bovine phenylethanolamine N-methyltransferase.

    PubMed

    Batter, D K; D'Mello, S R; Turzai, L M; Hughes, H B; Gioio, A E; Kaplan, B B

    1988-03-01

    A cDNA clone for bovine adrenal phenylethanolamine N-methyltransferase (PNMT) was used to screen a Charon 28 genomic library. One phage was identified, designated lambda P1, which included the entire PNMT gene. Construction of a restriction map, with subsequent Southern blot analysis, allowed the identification of exon-containing fragments. Dideoxy sequence analysis of these fragments, and several more further upstream, indicates that the bovine PNMT gene is 1,594 base pairs in length, consisting of three exons and two introns. The transcription initiation site was identified by two independent methods and is located approximately 12 base pairs upstream from the ATG translation start site. The 3' untranslated region is 88 base pairs in length and contains the expected polyadenylation signal (AATAAA). A putative promoter sequence (TATA box) is located about 25 base pairs upstream from the transcription initiation site. Computer comparison of the nucleotide sequence data with the consensus sequences of known regulatory elements revealed potential binding sites for glucocorticoid receptors and the Sp1 regulatory protein in the 5' flanking region of the gene. Additionally, comparison of the sequence of the exons of the PNMT gene with cDNA sequences for other enzymes involved in biogenic amine synthesis revealed no significant homology, indicating that PNMT is not a member of a multigene family of catecholamine biosynthetic enzymes. PMID:3379652
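
    Locating short signals such as the polyadenylation signal (AATAAA) in a sequence amounts to an exact motif search. The sketch below shows one way to do this; the function name is an assumption and the sequence in the usage note is made up for illustration, not taken from the PNMT gene.

```python
import re


def find_motif(seq: str, motif: str) -> list:
    """Return 0-based start positions of every occurrence of `motif`.

    A zero-width lookahead is used so that overlapping occurrences are
    reported as well. Matching is case-insensitive via upper-casing.
    """
    pattern = re.compile(f"(?={re.escape(motif.upper())})")
    return [match.start() for match in pattern.finditer(seq.upper())]
```

    For example, `find_motif("GGAATAAACCAATAAA", "AATAAA")` reports the two start positions 2 and 10 in this made-up sequence. Real promoter elements such as the TATA box are degenerate, so a consensus pattern or position weight matrix would be needed in practice.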

  11. Illumina sequencing of green stink bug nymph and adult cDNA to identify potential RNAi gene targets

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Whole-body transcriptomes for nymphs and adults of the green stink bug, Acrosternum hilare (Say), were sequenced on an Illumina® Genome Analyzer IIx sequencer. The insects were collected from sites in North Carolina and Virginia, USA. The cDNA library for each sample was sequenced on one lane of an...

  12. Sequence rearrangement and duplication of double stranded fibronectin cDNA probably occurring during cDNA synthesis by AMV reverse transcriptase and Escherichia coli DNA polymerase I.

    PubMed Central

    Fagan, J B; Pastan, I; de Crombrugghe, B

    1980-01-01

    Two cloned cDNAs derived from the mRNA for cell fibronectin have been sequenced, providing evidence that transcription with AMV reverse transcriptase or Escherichia coli DNA polymerase I may not always result in double stranded cDNA that is exactly homologous with its mRNA template. Instead, the sequences of these cloned cDNAs are consistent with the duplication and rearrangement of sequences during synthesis of double stranded cDNA. PMID:6159581

  13. DNA sequence representation by trianders and determinative degree of nucleotides

    PubMed Central

    Duplij, Diana; Duplij, Steven

    2005-01-01

    A new version of DNA walks, in which nucleotides are regarded as unequal in their contribution to a walk, is introduced; it allows us to study thoroughly the “fine structure” of nucleotide sequences. The approach is based on the assumption that nucleotides have an inner abstract characteristic, the determinative degree, which reflects genetic code phenomenological properties and is adjusted to nucleotides' physical properties. We consider each codon position independently, which gives three separate walks characterized by different angles and lengths; such an object is called a triander, which reflects the “strength” of each branch. A general method for identifying a DNA sequence “by triander”, which can be treated as a unique “genogram” (or “gene passport”), is proposed. The two- and three-dimensional trianders are considered. The difference in the fine structure of sequences between genes and the intergenic space is shown. A clear triplet signal was found in coding sequences; it is absent in the intergenic space and is independent of the sequence length. This paper presents the topological classification of trianders, which can allow us to work out detailed signatures of functionally different genomic regions. PMID:16052707

  14. Characterization of long cDNA clones from human adult spleen. II. The complete sequences of 81 cDNA clones.

    PubMed

    Jikuya, Hiroyuki; Takano, Jun; Kikuno, Reiko; Hirosawa, Makoto; Nagase, Takahiro; Nomura, Nobuo; Ohara, Osamu

    2003-02-28

    To accumulate information on the coding sequences (CDSs) of unidentified genes, we have conducted a sequencing project of human long cDNA clones. Both the end sequences of approximately 10,000 cDNA clones from two size-fractionated human spleen cDNA libraries (average sizes of 4.5 kb and 5.6 kb) were determined by single-pass sequencing to select cDNAs with unidentified sequences. We herein present the entire sequences of 81 cDNA clones, most of which were selected by two approaches based on their protein-coding potentialities in silico: Fifty-eight cDNA clones were selected as those having protein-coding potentialities at the 5'-end of single-pass sequences by applying the GeneMark analysis; and 20 cDNA clones were selected as those expected to encode proteins larger than 100 amino acid residues by analysis of the human genome sequences flanked by both the end sequences of cDNAs using the GENSCAN gene prediction program. In addition to these newly identified cDNAs, three cDNA clones were isolated by colony hybridization experiments using probes corresponding to known gene sequences since these cDNAs are likely to contain considerable amounts of new information regarding the genes already annotated. The sequence data indicated that the average sizes of the inserts and corresponding CDSs of cDNA clones analyzed here were 5.0 kb and 2.0 kb (670 amino acid residues), respectively. From the results of homology and motif searches against the public databases, functional categories of the 29 predicted gene products could be assigned; 86% of these predicted gene products (25 gene products) were classified into proteins relating to cell signaling/communication, nucleic acid management, and cell structure/motility. PMID:12693554

  15. Moss Phylogeny Reconstruction Using Nucleotide Pangenome of Complete Mitogenome Sequences.

    PubMed

    Goryunov, D V; Nagaev, B E; Nikolaev, M Yu; Alexeevski, A V; Troitsky, A V

    2015-11-01

    Stability of composition and sequence of genes was shown earlier in 13 mitochondrial genomes of mosses (Rensing, S. A., et al. (2008) Science, 319, 64-69). It is of interest to study the evolution of mitochondrial genomes not only at the gene level, but also on the level of nucleotide sequences. To do this, we have constructed a "nucleotide pangenome" for mitochondrial genomes of 24 moss species. The nucleotide pangenome is a set of aligned nucleotide sequences of orthologous genome fragments covering the totality of all genomes. The nucleotide pangenome was constructed using specially developed new software, NPG-explorer (NPGe). The stable part of the mitochondrial genome (232 stable blocks) is shown to be, on average, 45% of its length. In the joint alignment of stable blocks, 82% of positions are conserved. The phylogenetic tree constructed with the NPGe program is in good correlation with other phylogenetic reconstructions. With the NPGe program, 30 blocks have been identified with repeats no shorter than 50 bp. The maximal length of a block with repeats is 140 bp. Duplications in the mitochondrial genomes of mosses are rare. On average, the genome contains about 500 bp in large duplications. The total length of insertions and deletions was determined in each genome. The losses and gains of DNA regions are rather active in mitochondrial genomes of mosses, and such rearrangements presumably can be used as additional markers in the reconstruction of phylogeny. PMID:26615445

  16. Generation and analysis of expressed sequence tags from Trypanosoma cruzi trypomastigote and amastigote cDNA libraries.

    PubMed

    Agüero, Fernán; Abdellah, Karim Ben; Tekiel, Valeria; Sánchez, Daniel O; González, Antonio

    2004-08-01

    We have generated 2771 expressed sequence tags (ESTs) from two cDNA libraries of Trypanosoma cruzi CL-Brener. The libraries were constructed from trypomastigotes and amastigotes, using a spliced leader primer to synthesize the cDNA second strand, thus selecting for full-length cDNAs. Since the libraries were neither normalized nor pre-screened, we compared the representation of transcripts between the two using a statistical test and identified a subset of transcripts that show apparent differential representation. A non-redundant set of 1619 reconstructed transcripts was generated by sequence clustering. This dataset was used to perform similarity searches against protein and nucleotide databases. Based on these searches, 339 sequences could be assigned a putative identity. One thousand one-hundred and sixteen sequences in the non-redundant clustered dataset (68.8%) are new expression tags, not represented in the T. cruzi epimastigote ESTs that are in the public databases. Additional information is provided online at http://genoma.unsam.edu.ar/projects/tram. To the best of our knowledge these are the first ESTs reported for the life cycle stages of T. cruzi that occur in the vertebrate host. PMID:15478800

  17. H3 and H4 histone cDNA sequences from Xenopus: a sequence comparison of H4 genes.

    PubMed Central

    Turner, P C; Woodland, H R

    1982-01-01

    Ovarian poly(A)+ RNA from Xenopus laevis and Xenopus borealis was used to construct two cDNA libraries which were screened for histone sequences. cDNA clones to H4 mRNA were obtained from both species and an H3 cDNA clone from Xenopus laevis. The complete DNA sequences of these clones have been determined and are presented. These new sequences are compared with other H3 and H4 DNA sequences both in the coding and 3' noncoding regions. We find that there is considerable non-random codon usage in ten H4 genes. In addition there are some sequence similarities in the 3' noncoding regions of H3 and H4 genes. PMID:6896750
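    The non-random codon usage reported above can be illustrated with a minimal sketch that tallies how often each synonymous codon is used for one amino acid. The function names and the toy sequence are assumptions made for illustration; the sequence is not a real H4 gene, and the abstract does not say how the authors quantified codon bias.

```python
from collections import Counter

# Glycine's four synonymous codons in the standard genetic code.
GLY_CODONS = ("GGT", "GGC", "GGA", "GGG")


def codon_counts(cds):
    """Count codons in an in-frame coding sequence (DNA or RNA letters)."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return Counter(cds[i:i + 3] for i in range(0, len(cds), 3))


def synonymous_fractions(counts, codons):
    """Fraction of each synonymous codon among all uses of that amino acid."""
    total = sum(counts[c] for c in codons)
    if total == 0:
        return {c: 0.0 for c in codons}
    return {c: counts[c] / total for c in codons}


# Hypothetical toy sequence (NOT a real H4 gene): ATG GGT GGC GGC GGC AAG TAA
toy_cds = "ATGGGTGGCGGCGGCAAGTAA"
gly_usage = synonymous_fractions(codon_counts(toy_cds), GLY_CODONS)
```

    With random usage each glycine codon would appear about equally often; a strongly skewed `gly_usage` across several genes is the kind of pattern the abstract describes.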

  18. Complete nucleotide sequence of Nootka lupine vein-clearing virus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complete genome sequence of Nootka lupine vein-clearing virus (NLVCV) was determined to be 4,172 nucleotides in length, containing four open reading frames (ORFs) with a genetic organization and conceptual translations similar to those of virus species in the genus Carmovirus, family Tombusviridae. The orde...

  19. Sequence Characterization of cDNA Sequence of Encoding of an Antimicrobial Peptide With No Disulfide Bridge from the Iranian Mesobuthus Eupeus Venomous Glands

    PubMed Central

    Farajzadeh-Sheikh, Ahmad; Jolodar, Abbas; Ghaemmaghami, Shamsedin

    2013-01-01

    Background: Scorpion venom glands produce some antimicrobial peptides (AMPs) that can rapidly kill a broad range of microbes and have additional activities that affect the quality and effectiveness of innate responses and inflammation. Objectives: In this study, we report the identification of a cDNA sequence encoding a cysteine-free antimicrobial peptide isolated from the venom glands of the Iranian scorpion Mesobuthus eupeus. Materials and Methods: Total RNA was extracted from Iranian Mesobuthus eupeus venom glands, and cDNA was synthesized using a modified oligo(dT) primer. The cDNA was used as the template for semi-nested RT-PCR. PCR products were used for direct nucleotide sequencing and the results were compared with the GenBank database. Results: A 213 bp cDNA fragment encoding the entire coding region of an antimicrobial toxin from the venom glands of the Iranian scorpion M. eupeus was isolated. The full-length coding region was 210 bp and contained an open reading frame of 70 amino acids with a predicted molecular mass of 7970.48 Da and a theoretical pI of 9.10. The open reading frame consists of 210 bp encoding a precursor of 70 amino acid residues, including a signal peptide of 23 residues, a propeptide of 7 residues, and a mature peptide of 34 residues with no disulfide bridge. The peptide has detectable sequence identity to the Lesser Asian Mesobuthus eupeus MeVAMP-2 (98%), MeVAMP-9 (60%) and several previously described AMPs from other scorpion venoms, including Mesobuthus martensii (94%) and Buthus occitanus israelis (82%). Conclusions: The secondary structure of the peptide consisted mainly of α-helical structure, which was generally conserved among previously reported scorpion counterparts. The phylogenetic analysis showed that the Iranian MeAMP-like toxin was similar but not identical to the venom antimicrobial peptides from the Lesser Asian scorpion Mesobuthus eupeus. PMID:23486842

  20. cDNA sequence analysis of a 29-kDa cysteine-rich surface antigen of pathogenic Entamoeba histolytica

    SciTech Connect

    Torian, B.E.; Stroeher, V.L.; Stamm, W.E.; Flores, B.M.; Hagen, F.S.

    1990-08-01

    A λgt11 cDNA library was constructed from poly(U)-Sepharose-selected Entamoeba histolytica trophozoite RNA in order to clone and identify surface antigens. The library was screened with rabbit polyclonal anti-E. histolytica serum. A 700-base-pair cDNA insert was isolated and the nucleotide sequence was determined. The deduced amino acid sequence of the cDNA revealed a cysteine-rich protein. DNA hybridizations showed that the gene was specific to E. histolytica since the cDNA probe reacted with DNA from four axenic strains of E. histolytica but did not react with DNA from Entamoeba invadens, Acanthamoeba castellanii, or Trichomonas vaginalis. The insert was subcloned into the expression vector pGEX-1 and the protein was expressed as a fusion with the C terminus of glutathione S-transferase. Purified fusion protein was used to generate 22 monoclonal antibodies (mAbs) and a mouse polyclonal antiserum specific for the E. histolytica portion of the fusion protein. A 29-kDa protein was identified as a surface antigen when mAbs were used to immunoprecipitate the antigen from metabolically 35S-labeled live trophozoites. The surface location of the antigen was corroborated by mAb immunoprecipitation of a 29-kDa protein from surface-125I-labeled whole trophozoites as well as by the reaction of mAbs with live trophozoites in an indirect immunofluorescence assay performed at 4°C. Immunoblotting with mAbs demonstrated that the antigen was present on four axenic isolates tested. mAbs recognized epitopes on the 29-kDa native antigen on some but not all clinical isolates tested.

  1. Overlapping open reading frames revealed by complete nucleotide sequencing of turnip yellow mosaic virus genomic RNA.

    PubMed Central

    Morch, M D; Boyer, J C; Haenni, A L

    1988-01-01

    The complete nucleotide sequence of turnip yellow mosaic virus (TYMV) genomic RNA has been determined on a set of overlapping cDNA clones using a sequential sequencing strategy. The RNA is 6318 nucleotides long, excluding the cap structure. The genome organization deduced from the sequence confirms previous results of in vitro translation. A novel open reading frame (ORF) putatively encoding a Pro-rich and very basic 69K (K = kilodalton) protein is detected at the 5' end of the genome. It is initiated at the first AUG codon on the RNA and overlaps the major ORF that encodes the non structural 206K (previously referred to as 195K) protein of TYMV; its function is unknown. Several amino acid consensus sequences already described among plant and animal viruses are also found in the TYMV-encoded polypeptides. A comparison with other viruses whose RNA sequence is known leads to the conclusion that TYMV belongs to the "Sindbis-like" supergroup of viruses and could be related to Semliki forest virus. PMID:3399388

  2. Nucleotide sequence of the tobacco (Nicotiana tabacum) anionic peroxidase gene

    SciTech Connect

    Diaz-De-Leon, F.; Klotz, K.L.; Lagrimini, L.M.

    1993-03-01

    Peroxidases have been implicated in numerous physiological processes including lignification (Grisebach, 1981), wound-healing (Espelie et al., 1986), phenol oxidation (Lagrimini, 1991), pathogen defense (Ye et al., 1990), and the regulation of cell elongation through the formation of interchain covalent bonds between various cell wall polymers (Fry, 1986; Goldberg et al., 1986; Bradley et al., 1992). However, a complete description of peroxidase action in vivo is not available because of the vast number of potential substrates and the existence of multiple isoenzymes. The tobacco anionic peroxidase is one of the better-characterized isoenzymes. This enzyme has been shown to oxidize a number of significant plant secondary compounds in vitro including cinnamyl alcohols, phenolic acids, and indole-3-acetic acid (Maeder, 1980; Lagrimini, 1991). A cDNA encoding the enzyme has been obtained, and this enzyme was shown to be expressed at the highest levels in lignifying tissues (xylem and tracheary elements) and also in epidermal tissue (Lagrimini et al., 1987). It was shown at this time that there were four distinct copies of the anionic peroxidase gene in tobacco (Nicotiana tabacum). A tobacco genomic DNA library was constructed in the λ phage EMBL3, from which two unique peroxidase genes were sequenced. One of these clones, λPOD1, was designated as a pseudogene when the exonic sequences were found to differ from the cDNA sequences by 1%, and several frame shifts in the coding sequences indicated a dysfunctional gene (the authors' unpublished results). The other clone, λPOD3, described in this manuscript, was designated as the functional tobacco anionic peroxidase gene because of 100% homology with the cDNA. Significant structural elements include an AS-2 box implicated in shoot-specific expression (Lam and Chua, 1989), a TATA box, and two intervening sequences. 10 refs., 1 tab.

  3. Method for the detection of specific nucleic acid sequences by polymerase nucleotide incorporation

    DOEpatents

    Castro, Alonso

    2004-06-01

    A method for rapid and efficient detection of a target DNA or RNA sequence is provided. A primer having a 3'-hydroxyl group at one end and having a sequence of nucleotides sufficiently homologous with an identifying sequence of nucleotides in the target DNA is selected. The primer is hybridized to the identifying sequence of nucleotides on the DNA or RNA sequence and a reporter molecule is synthesized on the target sequence by progressively binding complementary nucleotides to the primer, where the complementary nucleotides include nucleotides labeled with a fluorophore. Fluorescence emitted by fluorophores on single reporter molecules is detected to identify the target DNA or RNA sequence.

  4. Cloning, sequencing, and expression of cDNA for human β-glucuronidase

    SciTech Connect

    Oshima, A.; Kyle, J.W.; Miller, R.D.; Hoffmann, J.W.; Powell, P.P.; Grubb, J.H.; Sly, W.S.; Tropak, M.; Guise, K.S.; Gravel, R.A.

    1987-02-01

    The authors report here the cDNA sequence for human placental β-glucuronidase (β-D-glucuronoside glucuronosohydrolase, EC 3.2.1.31) and demonstrate expression of the human enzyme in transfected COS cells. They also sequenced a partial cDNA clone from human fibroblasts that contained a 153-base-pair deletion within the coding sequence and found a second type of cDNA clone from placenta that contained the same deletion. Nuclease S1 mapping studies demonstrated two types of mRNAs in human placenta that corresponded to the two types of cDNA clones isolated. The NH2-terminal amino acid sequence determined for human spleen β-glucuronidase agreed with that inferred from the DNA sequence of the two placental clones, beginning at amino acid 23, suggesting a cleaved signal sequence of 22 amino acids. When transfected into COS cells, plasmids containing either placental clone expressed an immunoprecipitable protein that contained N-linked oligosaccharides as evidenced by sensitivity to endoglycosidase F. However, only transfection with the clone containing the 153-base-pair segment led to expression of human β-glucuronidase activity. These studies provide the sequence for the full-length cDNA for human β-glucuronidase, demonstrate the existence of two populations of mRNA for β-glucuronidase in human placenta, only one of which specifies a catalytically active enzyme, and illustrate the importance of expression studies in verifying that a cDNA is functionally full-length.

  5. Nucleotide Sequencing and Identification of Some Wild Mushrooms

    PubMed Central

    Das, Sudip Kumar; Mandal, Aninda; Datta, Animesh K.; Gupta, Sudha; Paul, Rita; Saha, Aditi; Sengupta, Sonali; Dubey, Priyanka Kumari

    2013-01-01

    The rDNA-ITS (Ribosomal DNA Internal Transcribed Spacers) fragment of the genomic DNA of 8 wild edible mushrooms (collected from Eastern Chota Nagpur Plateau of West Bengal, India) was amplified using ITS1 (Internal Transcribed Spacers 1) and ITS2 primers and subjected to nucleotide sequence determination for identification of mushrooms as mentioned. The sequences were aligned using ClustalW software program. The aligned sequences revealed identity (homology percentage from GenBank data base) of Amanita hemibapha [CN (Chota Nagpur) 1, % identity 99 (JX844716.1)], Amanita sp. [CN 2, % identity 98 (JX844763.1)], Astraeus hygrometricus [CN 3, % identity 87 (FJ536664.1)], Termitomyces sp. [CN 4, % identity 90 (JF746992.1)], Termitomyces sp. [CN 5, % identity 99 (GU001667.1)], T. microcarpus [CN 6, % identity 82 (EF421077.1)], Termitomyces sp. [CN 7, % identity 76 (JF746993.1)], and Volvariella volvacea [CN 8, % identity 100 (JN086680.1)]. Although out of 8 mushrooms 4 could be identified up to species level, the nucleotide sequences of the rest may be relevant to further characterization. A phylogenetic tree is constructed using Neighbor-Joining method showing interrelationship between/among the mushrooms. The determined nucleotide sequences of the mushrooms may provide additional information enriching GenBank database aiding to molecular taxonomy and facilitating its domestication and characterization for human benefits. PMID:24489501

  6. Nucleotide sequencing and identification of some wild mushrooms.

    PubMed

    Das, Sudip Kumar; Mandal, Aninda; Datta, Animesh K; Gupta, Sudha; Paul, Rita; Saha, Aditi; Sengupta, Sonali; Dubey, Priyanka Kumari

    2013-01-01

    The rDNA-ITS (Ribosomal DNA Internal Transcribed Spacers) fragment of the genomic DNA of 8 wild edible mushrooms (collected from Eastern Chota Nagpur Plateau of West Bengal, India) was amplified using ITS1 (Internal Transcribed Spacers 1) and ITS2 primers and subjected to nucleotide sequence determination for identification of mushrooms as mentioned. The sequences were aligned using ClustalW software program. The aligned sequences revealed identity (homology percentage from GenBank data base) of Amanita hemibapha [CN (Chota Nagpur) 1, % identity 99 (JX844716.1)], Amanita sp. [CN 2, % identity 98 (JX844763.1)], Astraeus hygrometricus [CN 3, % identity 87 (FJ536664.1)], Termitomyces sp. [CN 4, % identity 90 (JF746992.1)], Termitomyces sp. [CN 5, % identity 99 (GU001667.1)], T. microcarpus [CN 6, % identity 82 (EF421077.1)], Termitomyces sp. [CN 7, % identity 76 (JF746993.1)], and Volvariella volvacea [CN 8, % identity 100 (JN086680.1)]. Although out of 8 mushrooms 4 could be identified up to species level, the nucleotide sequences of the rest may be relevant to further characterization. A phylogenetic tree is constructed using Neighbor-Joining method showing interrelationship between/among the mushrooms. The determined nucleotide sequences of the mushrooms may provide additional information enriching GenBank database aiding to molecular taxonomy and facilitating its domestication and characterization for human benefits. PMID:24489501

  7. Nucleotide sequence and genome organization of tomato leaf curl geminivirus.

    PubMed

    Dry, I B; Rigden, J E; Krake, L R; Mullineaux, P M; Rezaian, M A

    1993-01-01

    The genome of tomato leaf curl virus (TLCV) from Australia was cloned and its complete nucleotide sequence determined. It is a single circular ssDNA of 2766 nucleotides containing the consensus nonanucleotide sequence present in all geminiviruses. It has six open reading frames with an organization resembling that of certain other dicotyledonous plant-infecting monopartite geminiviruses, i.e. tomato yellow leaf curl and beet curly top viruses. The regulatory sequences present indicate a bidirectional mode of transcription. A dimeric TLCV DNA clone was constructed in a binary vector and used to agroinoculate three different host species. Typical virus infections were produced, confirming that the single DNA component is sufficient for infectivity. PMID:8423446

  8. A compilation of partial sequences of randomly selected cDNA clones from the rat incisor.

    PubMed

    Matsuki, Y; Nakashima, M; Amizuka, N; Warshawsky, H; Goltzman, D; Yamada, K M; Yamada, Y

    1995-01-01

    The formation of tooth organs is regulated by a series of developmental programs. We have initiated a genome project with the ultimate goal of identifying novel genes important for tooth development. As an initial approach, we constructed a unidirectional cDNA library from the non-calcified portion of incisors of 3- to 4-week-old rats, sequenced cDNA clones, and classified their sequences by homology search through the GenBank data base and the PIR protein data base. Here, we report partial DNA sequences obtained by automated DNA sequencing on 400 cDNA clones randomly selected from the library. Of the sequences determined, 51% represented sequences of new genes that were not related to any previously reported gene. Twenty-six percent of the clones strongly matched genes and proteins in the data bases, including amelogenin, alpha 1(I) and alpha 2(I) collagen chains, osteonectin, and decorin. Nine percent of clones revealed partial sequence homology to known genes such as transcription factors and cell surface receptors. A significant number of the previously identified genes were expressed redundantly and were found to encode extracellular matrix proteins. Identification and cataloging of cDNA clones in these tissues are the first step toward identification of markers expressed in a tissue- or stage-specific manner, as well as the genetic linkage study of tooth anomalies. Further characterization of the clones described in this paper should lead to the discovery of novel genes important for tooth development. PMID:7876422

  9. Nucleotide sequence of the L1 ribosomal protein gene of Xenopus laevis: remarkable sequence homology among introns.

    PubMed Central

    Loreni, F; Ruberti, I; Bozzoni, I; Pierandrei-Amaldi, P; Amaldi, F

    1985-01-01

    Ribosomal protein L1 is encoded by two genes in Xenopus laevis. The comparison of two cDNA sequences shows that the two L1 gene copies (L1a and L1b) have diverged in many silent sites and very few substitution sites; moreover a small duplication occurred at the very end of the coding region of the L1b gene which thus codes for a product five amino acids longer than that coded by L1a. Quantitatively the divergence between the two L1 genes confirms that a whole genome duplication took place in Xenopus laevis approximately 30 million years ago. A genomic fragment containing one of the two L1 gene copies (L1a), with its nine introns and flanking regions, has been completely sequenced. The 5' end of this gene has been mapped within a 20-pyrimidine stretch as already found for other vertebrate ribosomal protein genes. Four of the nine introns have a 60-nucleotide sequence with 80% homology; within this region some boxes, one of which is 16 nucleotides long, are 100% homologous among the four introns. This feature of L1a gene introns is interesting since we have previously shown that the activity of this gene is regulated at a post-transcriptional level and it involves the block of the normal splicing of some intron sequences. Images Fig. 3. Fig. 5. PMID:3841512

  10. Acetylcholinesterase of the sand fly, Phlebotomus papatasi (Scopoli): cDNA sequence, baculovirus expression, and biochemical properties

    PubMed Central

    2013-01-01

    Background Millions of people and domestic animals around the world are affected by leishmaniasis, a disease caused by various species of flagellated protozoans in the genus Leishmania that are transmitted by several sand fly species. Insecticides are widely used for sand fly population control to try to reduce or interrupt Leishmania transmission. Zoonotic cutaneous leishmaniasis caused by L. major is vectored mainly by Phlebotomus papatasi (Scopoli) in Asia and Africa. Organophosphates comprise a class of insecticides used for sand fly control, which act through the inhibition of acetylcholinesterase (AChE) in the central nervous system. Point mutations producing an altered, insensitive AChE are a major mechanism of organophosphate resistance in insects and preliminary evidence for organophosphate-insensitive AChE has been reported in sand flies. This report describes the identification of complementary DNA for an AChE in P. papatasi and the biochemical characterization of recombinant P. papatasi AChE. Methods A P. papatasi Israeli strain laboratory colony was utilized to prepare total RNA utilized as template for RT-PCR amplification and sequencing of cDNA encoding acetylcholinesterase 1 using gene specific primers and 3’-5’-RACE. The cDNA was cloned into pBlueBac4.5/V5-His TOPO, and expressed by baculovirus in Sf21 insect cells in serum-free medium. Recombinant P. papatasi acetylcholinesterase was biochemically characterized using a modified Ellman’s assay in microplates. Results A 2309 nucleotide sequence of PpAChE1 cDNA [GenBank: JQ922267] of P. papatasi from a laboratory colony susceptible to insecticides is reported with 73-83% nucleotide identity to acetylcholinesterase mRNA sequences of Culex tritaeniorhynchus and Lutzomyia longipalpis, respectively. The P. papatasi cDNA ORF encoded a 710-amino acid protein [GenBank: AFP20868] exhibiting 85% amino acid identity with acetylcholinesterases of Cx. pipiens, Aedes aegypti, and 92% amino acid identity for

  11. Evolution of vertebrate IgM: complete amino acid sequence of the constant region of Ambystoma mexicanum mu chain deduced from cDNA sequence.

    PubMed

    Fellah, J S; Wiles, M V; Charlemagne, J; Schwager, J

    1992-10-01

    cDNA clones coding for the constant region of the Mexican axolotl (Ambystoma mexicanum) mu heavy immunoglobulin chain were selected from total spleen RNA, using a cDNA polymerase chain reaction technique. The specific 5'-end primer was an oligonucleotide homologous to the JH segment of Xenopus laevis mu chain. One of the clones, JHA/3, corresponded to the complete constant region of the axolotl mu chain, consisting of a 1362-nucleotide sequence coding for a polypeptide of 454 amino acids followed in 3' direction by a 179-nucleotide untranslated region and a polyA+ tail. The axolotl C mu is divided into four typical domains (C mu 1-C mu 4) and can be aligned with the Xenopus C mu with an overall identity of 56% at the nucleotide level. Percent identities were particularly high between C mu 1 (59%) and C mu 4 (71%). The C-terminal 20-amino acid segment which constitutes the secretory part of the mu chain is strongly homologous to the equivalent sequences of chondrichthyans and of other tetrapods, including a conserved N-linked oligosaccharide, the penultimate cysteine and the C-terminal lysine. The four C mu domains of 13 vertebrate species ranging from chondrichthyans to mammals were aligned and compared at the amino acid level. The significant number of mu-specific residues which are conserved into each of the four C mu domains argues for a continuous line of evolution of the vertebrate mu chain. This notion was confirmed by the ability to reconstitute a consistent vertebrate evolution tree based on the phylogenic parsimony analysis of the C mu 4 sequences. PMID:1382992

  12. Analysis of a cDNA clone expressing a human autoimmune antigen: full-length sequence of the U2 small nuclear RNA-associated B antigen

    SciTech Connect

    Habets, W.J.; Sillekens, P.T.G.; Hoet, M.H.; Schalken, J.A.; Roebroek, A.J.M.; Leunissen, J.A.M.; Van de Ven, W.J.M.; Van Venrooij, W.J.

    1987-04-01

    A U2 small nuclear RNA-associated protein, designated B'', was recently identified as the target antigen for autoimmune sera from certain patients with systemic lupus erythematosus and other rheumatic diseases. Such antibodies enabled the authors to isolate cDNA clone λHB''-1 from a phage λgt11 expression library. This clone appeared to code for the B'' protein as established by in vitro translation of hybrid-selected mRNA. The identity of clone λHB''-1 was further confirmed by partial peptide mapping and analysis of the reactivity of the recombinant antigen with monospecific and monoclonal antibodies. Analysis of the nucleotide sequence of the 1015-base-pair cDNA insert of clone λHB''-1 revealed a large open reading frame of 800 nucleotides containing the coding sequence for a polypeptide of 25,457 daltons. In vitro transcription of the λHB''-1 cDNA insert and subsequent translation resulted in a protein product with the molecular size of the B'' protein. These data demonstrate that clone λHB''-1 contains the complete coding sequence of this antigen. The deduced polypeptide sequence contains three very hydrophilic regions that might constitute RNA binding sites and/or antigenic determinants. These findings might have implications both for the understanding of the pathogenesis of rheumatic diseases and for the elucidation of the biological function of autoimmune antigens.

  13. Completion of Kunjin virus RNA sequence and recovery of an infectious RNA transcribed from stably cloned full-length cDNA.

    PubMed Central

    Khromykh, A A; Westaway, E G

    1994-01-01

    Completion of the Kunjin virus (KUN) RNA sequence showed that it is the longest flavivirus sequence reported (11,022 bases), commencing with a 5' noncoding region of 96 bases. The 3' noncoding sequence of 624 nucleotides included a unique insertion sequence of 46 bases adjacent to the stop codon, but otherwise it had properties similar to those of RNAs of closely related flaviviruses. A full-length KUN cDNA clone which could be stably propagated in Escherichia coli DH5α was constructed; SP6 polymerase RNA transcripts from amplified cDNA were infectious when transfected into BHK-21 cells. A mutational change abolishing the BamHI restriction site at position 4049, leading to a conservative amino acid change of Arg-175 to Lys in the NS2A protein, was introduced into the cDNA during construction and was retained in the recovered virus. Extra terminal nucleotides introduced during cloning of the cDNA were shown to be present in the in vitro RNA transcripts but absent in the RNA of recovered virus. Although recovered virus differed from the parental KUN by a smaller plaque phenotype and delayed growth rate in BHK-21 cells and mice, it was very similar as assessed by several other criteria, such as peak titer during growth in cells, infectivity titer in cells and in mice, rate of adsorption and penetration in cells, replication at 39°C, and neurovirulence after intraperitoneal injection in mice. The KUN stably cloned cDNA will provide a useful basis for future studies in defining and characterizing functional roles of all the gene products. Images PMID:8207832

  14. Single Nucleotide Polymorphism Mapping Using Genome-Wide Unique Sequences

    PubMed Central

    Chen, Leslie Y.Y.; Lu, Szu-Hsien; Shih, Edward S.C.; Hwang, Ming-Jing

    2002-01-01

    As more and more genomic DNAs are sequenced to characterize human genetic variations, the demand for a very fast and accurate method to genomically position these DNA sequences is high. We have developed a new mapping method that does not require sequence alignment. In this method, we first identified DNA fragments of 15 bp in length that are unique in the human genome and then used them to position single nucleotide polymorphism (SNP) sequences. By use of four desktop personal computers with AMD K7 (1 GHz) processors, our new method mapped more than 1.6 million SNP sequences in 20 hr and achieved a very good agreement with mapping results from alignment-based methods. PMID:12097348
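    The alignment-free idea described above can be sketched as follows: index every k-mer that occurs exactly once in the reference, then place a query by looking up its k-mers in that index. This is a minimal illustration, not the authors' implementation. The function names and the toy genome are hypothetical, only the forward strand is indexed, and a small k is used in the example so the toy data stay readable (the abstract uses 15 bp).

```python
from collections import defaultdict

K = 15  # fragment length reported in the abstract


def build_unique_kmer_index(genome, k=K):
    """Map each k-mer that occurs exactly once in the genome to (chromosome, offset).

    `genome` is a dict of chromosome name -> sequence. Only the forward
    strand is indexed in this sketch.
    """
    seen = defaultdict(list)
    for chrom, seq in genome.items():
        for i in range(len(seq) - k + 1):
            seen[seq[i:i + k]].append((chrom, i))
    return {kmer: hits[0] for kmer, hits in seen.items() if len(hits) == 1}


def map_sequence(query, index, k=K):
    """Return (chromosome, start) inferred from the first unique k-mer in the query, or None."""
    for j in range(len(query) - k + 1):
        hit = index.get(query[j:j + k])
        if hit is not None:
            chrom, pos = hit
            return chrom, pos - j  # shift back to where the query itself begins
    return None


# Hypothetical toy reference, for illustration only.
toy_genome = {"chr1": "ACGTACGTTTGACCA", "chr2": "GGGCATCGATTACA"}
toy_index = build_unique_kmer_index(toy_genome, k=5)
placement = map_sequence("CATCGATT", toy_index, k=5)  # taken from chr2, offset 3
```

    Because each lookup is a dictionary access rather than an alignment, the cost per query grows with the query length, not with the genome size, which fits the speed the abstract reports.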

  15. The primary nucleotide sequence of U4 RNA.

    PubMed

    Reddy, R; Henning, D; Busch, H

    1981-04-10

    U4 RNA is one of the "capped" nuclear snRNAs recently found to be precipitable by anti-Sm antibodies as ribonucleoprotein particles. U4 RNA, along with other snRNAs, has been implicated in hnRNA processing, mRNA transport, or both (Lerner, M. R., Boyle, J., Mount, S., Wolin, S., and Steitz, J. A. (1980) Nature 283, 220-224). Since the proteins bound to different snRNAs appear to be the same, the functions of different snRNPs might be dependent on the RNA components. To help understand the function of U4 RNP, the nucleotide sequence of U4 RNA was determined. The sequence is (formula see text) In addition to the modified nucleotides in the "cap," U4 RNA contains Am at position 63 and m6A at position 98. It also exhibited A-C microheterogeneity at position 97. PMID:6162848

  16. Nucleotide-Specific Contrast for DNA Sequencing by Electron Spectroscopy

    PubMed Central

    Schmid, Andreas K.; Davis, Ronald W.

    2016-01-01

    DNA sequencing by imaging in an electron microscope is an approach that holds promise to deliver long reads with low error rates and without the need for amplification. Earlier work using transmission electron microscopes, which use high electron energies on the order of 100 keV, has shown that low contrast and radiation damage necessitates the use of heavy atom labeling of individual nucleotides, which increases the read error rates. Other prior work using scattering electrons with much lower energy has shown to suppress beam damage on DNA. Here we explore possibilities to increase contrast by employing two methods, X-ray photoelectron and Auger electron spectroscopy. Using bulk DNA samples with monomers of each base, both methods are shown to provide contrast mechanisms that can distinguish individual nucleotides without labels. Both spectroscopic techniques can be readily implemented in a low energy electron microscope, which may enable label-free DNA sequencing by direct imaging. PMID:27149617

  17. Molecular cloning and sequence analysis of factor C cDNA from the Singapore horseshoe crab, Carcinoscorpius rotundicauda.

    PubMed

    Ding, J L; Navas, M A; Ho, B

    1995-03-01

    Two forms of Factor C cDNAs: CrFC21 (3448 bp) and CrFC26 (4182 bp) have been cloned into λgt22. CrFC26 includes 568 nucleotides of 5' untranslated region (5' UTR) containing seven ATGs before the real initiation site, an open reading frame (ORF) of 3249 nucleotides, a stop codon, and 365 nucleotides of 3' untranslated sequence. There are four polyadenylation signals and six potential glycosylation sites. The ORF codes for a signal peptide of 24 amino acids and a Factor C zymogen of 1059 residues. The CrFC21 lacks most of the 5' UTR, and has some base changes in its ORF. The predicted secondary mRNA structures of the 5' end of CrFC26 showed numerous stem-and-loop structures, thus obscuring its real start codon. In contrast, CrFC21 has a well-exposed AUG start site, and expresses Factor C in transcription-translation reactions in vitro. There is a typical serine protease catalytic triad of Asp-His-Ser, which is structurally like prothrombin, but catalytically more similar to trypsin. Although an overall homology of 97.7% was observed in comparison with the Tachypleus tridentatus Factor C (TtFC) cDNA, there were notable differences in the restriction sites and subtle base substitutions in the CrFC cDNA. The high degree of homology between Factor C from T. tridentatus and C. rotundicauda substantiates, at the molecular level, the proximity of these two species in the course of evolution. This finding contravenes the apparent disparities with respect to their morphology, ecological habitat, and taxonomical classification. PMID:7538401
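    The segment lengths quoted for CrFC26 can be cross-checked with simple arithmetic. The sketch below uses a hypothetical helper name and only the figures given in the abstract. One reading consistent with those figures is that the stop codon is counted within the 365-nucleotide 3' sequence rather than as a separate three bases, but that is an inference, not something the abstract states.

```python
def check_cdna_layout(utr5, orf, utr3, total, protein_residues):
    """Check that cDNA segment lengths are mutually consistent.

    All lengths are in nucleotides except `protein_residues` (amino acids).
    """
    return {
        # 5' UTR + ORF + 3' sequence should account for the whole clone
        "segments_sum_to_total": utr5 + orf + utr3 == total,
        # each amino acid is encoded by one 3-nucleotide codon
        "orf_matches_protein": orf == 3 * protein_residues,
    }


# Figures from the abstract: signal peptide (24) + zymogen (1059) residues.
crfc26 = check_cdna_layout(
    utr5=568, orf=3249, utr3=365, total=4182, protein_residues=24 + 1059
)
```

    Both checks hold: 568 + 3249 + 365 = 4182, and 3 × 1083 = 3249.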

  18. 77 FR 65537 - Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-10-29

    ... Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request. SUMMARY: The United States....'' SUPPLEMENTARY INFORMATION: I. Abstract Patent applications that contain nucleotide and/or amino acid sequence disclosures must include a copy of the sequence listing in accordance with the requirements in 37 CFR...

  19. Metallothionein cDNA, promoter, and genomic sequences of the tropical green mussel, Perna viridis.

    PubMed

    Khoo, H W; Patel, K H

    1999-09-01

    The primary structure of the cDNA and metallothionein (MT) genomic sequences of the tropical green mussel (Perna viridis) was determined. The complete cDNA sequences were obtained using degenerate primers designed from known metallothionein consensus amino acid sequences from the temperate species Mytilus edulis. The amino acid sequences of P. viridis metallothionein deduced from the coding region consisted of 72 amino acids with 21 cysteine residues and 9 Cys-X-Cys motifs corresponding to Type I MT class of other species. Two different genomic sequences coding for the same mRNA were obtained. Each putative gene contained a unique 5'UTR and two unique introns located at the same splice sites. The promoters for both genes were different in length and both contained metal responsive elements and active protein-binding sites. The structures of the genomic clones were compared with those of other species. J. Exp. Zool. 284:445-453, 1999. PMID:10451422

  20. Complete nucleotide sequence of Saccharomyces cerevisiae chromosome X.

    PubMed Central

    Galibert, F; Alexandraki, D; Baur, A; Boles, E; Chalwatzis, N; Chuat, J C; Coster, F; Cziepluch, C; De Haan, M; Domdey, H; Durand, P; Entian, K D; Gatius, M; Goffeau, A; Grivell, L A; Hennemann, A; Herbert, C J; Heumann, K; Hilger, F; Hollenberg, C P; Huang, M E; Jacq, C; Jauniaux, J C; Katsoulou, C; Karpfinger-Hartl, L

    1996-01-01

    The complete nucleotide sequence of Saccharomyces cerevisiae chromosome X (745 442 bp) reveals a total of 379 open reading frames (ORFs), the coding region covering approximately 75% of the entire sequence. One hundred and eighteen ORFs (31%) correspond to genes previously identified in S. cerevisiae. All other ORFs represent novel putative yeast genes, whose function will have to be determined experimentally. However, 57 of the latter subset (another 15% of the total) encode proteins that show significant analogy to proteins of known function from yeast or other organisms. The remaining ORFs, exhibiting no significant similarity to any known sequence, amount to 54% of the total. General features of chromosome X are also reported, with emphasis on the nucleotide frequency distribution in the environment of the ATG and stop codons, the possible coding capacity of at least some of the small ORFs (<100 codons) and the significance of 46 non-canonical or unpaired nucleotides in the stems of some of the 24 tRNA genes recognized on this chromosome. Images PMID:8641269
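
    The ORF breakdown in this abstract can be reproduced from the three counts it gives. This is a minimal sketch with hypothetical variable names; the count of ORFs without similarity is not stated in the abstract and is derived here by subtraction.

```python
# Counts quoted in the abstract for chromosome X.
TOTAL_ORFS = 379
known_genes = 118       # previously identified in S. cerevisiae
similar_to_known = 57   # novel, but similar to proteins of known function

# Derived, not quoted: ORFs with no significant similarity to any known sequence.
no_similarity = TOTAL_ORFS - known_genes - similar_to_known


def percent(part: int, whole: int) -> int:
    """Share of `part` in `whole`, rounded to the nearest whole percent."""
    return round(100 * part / whole)


for label, n in [("known", known_genes),
                 ("similar", similar_to_known),
                 ("no similarity", no_similarity)]:
    print(f"{label}: {n} ORFs ({percent(n, TOTAL_ORFS)}%)")
```

    The rounded shares come out at 31%, 15% and 54%, in line with the percentages reported.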

  1. Partial sequence analysis of 130 randomly selected maize cDNA clones.

    PubMed Central

    Keith, C S; Hoang, D O; Barrett, B M; Feigelman, B; Nelson, M C; Thai, H; Baysdorfer, C

    1993-01-01

    As part of a project to identify novel maize (Zea mays L. cv B73) genes functionally, we have partially sequenced 130 randomly selected clones from a maize leaf cDNA library. Data base comparisons revealed seven previously sequenced maize cDNAs and 18 cDNAs with sequence similarity to related maize genes or to genes from other organisms. One hundred five cDNAs show little or no similarity to previously sequenced genes. Our results also establish the suitability of this library for large-scale sequencing in terms of its large insert size, proper insert orientation, and low duplication rate. PMID:8278499

  2. Structure and nucleotide sequence of the rat intestinal vitamin D-dependent calcium binding protein gene.

    PubMed Central

    Krisinger, J; Darwish, H; Maeda, N; DeLuca, H F

    1988-01-01

    The vitamin D-dependent intestinal calcium binding protein (ICaBP, 9 kDa) is under transcriptional regulation by 1,25-dihydroxyvitamin D3 [1,25-(OH)2D3], the hormonally active form of the vitamin. To study the mechanism of gene regulation by 1,25-(OH)2D3, we isolated the rat ICaBP gene by using a cDNA probe. Its nucleotide sequence revealed 3 exons separated by 2 introns within approximately 3 kilobases. The first exon represents only noncoding sequences, while the second and third encode the two calcium binding domains of the protein. The gene contains a 15-base-pair imperfect palindrome in the first intron that shows high homology to the estrogen-responsive element. This sequence may represent the vitamin D-responsive element involved in the regulation of the ICaBP gene. The second intron shows an 84-base-pair-long simple nucleotide repeat that implicates Z-DNA formation. Genomic Southern analysis shows that the rat gene is represented as a single copy. Images PMID:3194402

  3. The complete nucleotide sequence of pelargonium leaf curl virus.

    PubMed

    McGavin, Wendy J; MacFarlane, Stuart A

    2016-05-01

    Investigation of a tombusvirus isolated from tulip plants in Scotland revealed that it was pelargonium leaf curl virus (PLCV) rather than the originally suggested tomato bushy stunt virus. The complete sequence of the PLCV genome was determined for the first time, revealing it to be 4789 nucleotides in size and to have an organization similar to that of the other, previously described tombusviruses. Primers derived from the sequence were used to construct a full-length infectious clone of PLCV that recapitulates the disease symptoms of leaf curling in systemically infected pelargonium plants. PMID:26906694

  4. Genes galore: a summary of methods for accessing results from large-scale partial sequencing of anonymous Arabidopsis cDNA clones.

    PubMed Central

    Newman, T; de Bruijn, F J; Green, P; Keegstra, K; Kende, H; McIntosh, L; Ohlrogge, J; Raikhel, N; Somerville, S; Thomashow, M

    1994-01-01

    High-throughput automated partial sequencing of anonymous cDNA clones provides a method to survey the repertoire of expressed genes from an organism. Comparison of the coding capacity of these expressed sequence tags (ESTs) with the sequences in the public data bases results in assignment of putative function to a significant proportion of the ESTs. Thus, the more than 13,400 plant ESTs that are currently available provide a new resource that will facilitate progress in many areas of plant biology. These opportunities are illustrated by a description of the results obtained from analysis of 1500 Arabidopsis ESTs from a cDNA library prepared from equal portions of poly(A+) mRNA from etiolated seedlings, roots, leaves, and flowering inflorescences. More than 900 different sequences were represented, 32% of which showed significant nucleotide or deduced amino acid sequence similarity to previously characterized genes or proteins from a wide range of organisms. At least 165 of the clones had significant deduced amino acid sequence homology to proteins or gene products that have not been previously characterized from higher plants. A summary of methods for accessing the information and materials generated by the Arabidopsis cDNA sequencing project is provided. PMID:7846151

  5. HUGE: a database for human large proteins identified in the Kazusa cDNA sequencing project.

    PubMed

    Kikuno, R; Nagase, T; Suyama, M; Waki, M; Hirosawa, M; Ohara, O

    2000-01-01

    HUGE is a database for human large proteins newly identified in the Kazusa cDNA project, the aim of which is to predict the primary structure of proteins from the sequences of human large cDNAs (>4 kb). In particular, cDNA clones capable of coding for large proteins (>50 kDa) are the current targets of the project. HUGE contains >1100 cDNA sequences and detailed information obtained through analysis of the sequences of cDNAs and the predicted proteins. Besides an increase in the number of cDNA entries, the amount of experimental data for expression profiling has been largely increased and data on chromosomal locations have been newly added. All of the protein-coding regions were examined by GeneMark analysis, and the results of a motif/domain search of each predicted protein sequence against the Pfam database have been newly added. HUGE is available through the WWW at http://www.kazusa.or.jp/huge PMID:10592264

  6. Molecular Cloning and Sequencing of Channel Catfish, Ictalurus punctatus, Cathepsin H and L cDNA

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Cathepsins H and L, lysosomal cysteine endopeptidases of the papain family, are ubiquitously expressed and involved in antigen processing. In this communication, the channel catfish cathepsin H and L transcripts were sequenced and analyzed. Total RNA from tissues was extracted and cDNA libraries we...

  7. Genomic and cDNA sequence tags of the hyperthermophilic archaeon Pyrobaculum aerophilum.

    PubMed Central

    Völkl, P; Markiewicz, P; Baikalov, C; Fitz-Gibbon, S; Stetter, K O; Miller, J H

    1996-01-01

    The hyperthermophilic archaeum, Pyrobaculum aerophilum, grows optimally at 100 degrees C with a doubling time of 180 min. It is a member of the phylogenetically ancient Thermoproteales order, but differs significantly from all other members by its facultatively aerobic metabolism. Due to its simple cultivation requirements and its nearly 100% plating efficiency, it was chosen as a model organism for studying the genome organization of hyperthermophilic ancient archaea. Because the G+C content of its DNA is 52 mol%, sequence analysis was straightforward. At least some of the mRNA of P. aerophilum carried poly-A tails, facilitating the construction of a cDNA library. 245 sequence tags of a poly-A primed cDNA library and 55 sequence tags from a 1-2 kb Sau3AI-fragment containing genomic library were analyzed and the corresponding amino acid sequences compared with protein sequences from databases. Fourteen percent of the cDNA and >9% of genomic DNA sequence tags revealed significant similarities to proteins in the databases. Matches were obtained to proteins from archaeal, bacterial and eukaryal sources. Some sequences showed greatest similarity to eukaryal rather than to bacterial versions of proteins; other matches were found to proteins which had previously only been found in eukaryotes. PMID:8948626

  8. Nucleotide sequences of five anti-lysozyme monoclonal antibodies.

    PubMed Central

    Darsley, M J; Rees, A R

    1985-01-01

    The nucleotide sequences of the heavy and light chain immunoglobulin mRNAs derived from five hybridomas (Gloop 1-5) secreting IgGs specific for the loop region of hen egg lysozyme were determined. These monoclonal antibodies recognise three distinct but overlapping epitopes within the loop region. The sequences of two pairs of antibodies with indistinguishable fine specificities were similar in both chains whereas the sequences of antibodies of non-identical specificities were very different. It is proposed that the D-segments expressed in two of the antibodies (Gloop3 and Gloop4) are the products of one, or perhaps two, previously unidentified germ line D-genes. Gloop1 and Gloop2 use a D-segment previously identified in antibodies specific for the hapten 2-phenyloxazolone; however it is recombined in a different reading frame in the anti-lysozyme antibodies, producing a different amino acid sequence. PMID:2410256

  9. Conserved nucleotide sequences in the open reading frame and 3' untranslated region of selenoprotein P mRNA.

    PubMed Central

    Hill, K E; Lloyd, R S; Burk, R F

    1993-01-01

    Rat liver selenoprotein P contains 10 selenocysteine residues in its primary structure (deduced). It is the only selenoprotein characterized to date that has more than one selenocysteine residue. Selenoprotein P cDNA has been cloned from human liver and heart cDNA libraries and sequenced. The open reading frames are identical and contain a signal peptide, indicating that the protein is secreted by both organs and is therefore not exclusively produced in the liver. Ten selenocysteine residues (deduced) are present. Comparison of the open reading frame of the human cDNA with the rat cDNA reveals a 69% identity of the nucleotide sequence and 72% identity of the deduced amino acid sequence. Two regions in the 3' untranslated portion have high conservation between human and rat. Each of these regions contains a predicted stable stem-loop structure similar to the single stem-loop structures reported in 3' untranslated regions of type I iodothyronine 5'-deiodinase and glutathione peroxidase. The stem-loop structure of type I iodothyronine 5'-deiodinase has been shown to be necessary for incorporation of the selenocysteine residue at the UGA codon. Because only two stem-loop structures are present in the 3' untranslated region of selenoprotein P mRNA, it can be concluded that a separate stem-loop structure is not required for each selenocysteine residue. Images PMID:8421687

  10. Computer-Based Methods for the Mouse Full-Length cDNA Encyclopedia: Real-Time Sequence Clustering for Construction of a Nonredundant cDNA Library

    PubMed Central

    Konno, Hideaki; Fukunishi, Yoshifumi; Shibata, Kazuhiro; Itoh, Masayoshi; Carninci, Piero; Sugahara, Yuichi; Hayashizaki, Yoshihide

    2001-01-01

    We developed computer-based methods for constructing a nonredundant mouse full-length cDNA library. Our cDNA library construction process comprises assessment of library quality, sequencing the 3′ ends of inserts and clustering, and completing a re-array to generate a nonredundant library from a redundant one. After the cDNA libraries are generated, we sequence the 5′ ends of the inserts to check the quality of the library; then we determine the sequencing priority of each library. Selected libraries undergo large-scale sequencing of the 3′ ends of the inserts and clustering of the tag sequences. After clustering, the nonredundant library is constructed from the original libraries, which have redundant clones. All libraries, plates, clones, sequences, and clusters are uniquely identified, and all information is saved in the database according to this identifier. At press time, our system has been in place for the past two years; we have clustered 939,725 3′ end sequences into 127,385 groups from 227 cDNA libraries/sublibraries (see http://genome.gse.riken.go.jp/). [The sequence data described in this paper have been submitted to the DDBJ data library under accession nos. AV00011–AV175734, AV204013–AV382295, and BB561685–BB609425.] PMID:11157791

  11. cDNA cloning, sequence analysis, and chromosomal localization of the gene for human carnitine palmitoyltransferase.

    PubMed Central

    Finocchiaro, G; Taroni, F; Rocchi, M; Martin, A L; Colombo, I; Tarelli, G T; DiDonato, S

    1991-01-01

    We have cloned and sequenced a cDNA encoding human liver carnitine palmitoyltransferase (CPTase; palmitoyl-CoA:L-carnitine O-palmitoyltransferase, EC 2.3.1.21), an inner mitochondrial membrane enzyme that plays a major role in the fatty acid oxidation pathway. Mixed oligonucleotide primers whose sequences were deduced from one tryptic peptide obtained from purified CPTase were used in a polymerase chain reaction, allowing the amplification of a 0.12-kilobase fragment of human genomic DNA encoding such a peptide. A 60-base-pair (bp) oligonucleotide synthesized on the basis of the sequence from this fragment was used for the screening of a cDNA library from human liver and hybridized to a cDNA insert of 2255 bp. This cDNA contains an open reading frame of 1974 bp that encodes a protein of 658 amino acid residues including 25 residues of an NH2-terminal leader peptide. The assignment of this open reading frame to human liver CPTase is confirmed by matches to seven different amino acid sequences of tryptic peptides derived from pure human CPTase and by the 82.2% homology with the amino acid sequence of rat CPTase. The NH2-terminal region of CPTase contains a leucine-proline motif that is shared by carnitine acetyl- and octanoyltransferases and by choline acetyltransferase. The gene encoding CPTase was assigned to human chromosome 1, region 1q12-1pter, by hybridization of CPTase cDNA with a DNA panel of 19 human-hamster somatic cell hybrids. Images PMID:1988962

  12. Cloning and sequencing of a cDNA encoding a taste-modifying protein, miraculin.

    PubMed

    Masuda, Y; Nirasawa, S; Nakaya, K; Kurihara, Y

    1995-08-19

    A cDNA clone encoding a taste-modifying protein, miraculin (MIR), was isolated and sequenced. The encoded precursor to MIR was composed of 220 amino acid (aa) residues, including a possible signal sequence of 29 aa. Northern blot analysis showed that the mRNA encoding MIR was already expressed in fruits of Richadella dulcifica at 3 weeks after pollination and was present specifically in the pulp. PMID:7665074

  13. Human thyroid peroxidase: complete cDNA and protein sequence, chromosome mapping, and identification of two alternately spliced mRNAs

    SciTech Connect

    Kimura, S.; Kotani, T.; McBride, O.W.; Umeki, K.; Hirai, K.; Nakayama, T.; Ohtaki, S.

    1987-08-01

    Two forms of human thyroid peroxidase cDNAs were isolated from a lambdagt11 cDNA library, prepared from Graves disease thyroid tissue mRNA, by use of oligonucleotides. The longest complete cDNA, designated phTPO-1, has 3048 nucleotides and an open reading frame consisting of 933 amino acids, which would encode a protein with a molecular weight of 103,026. Five potential asparagine-linked glycosylation sites are found in the deduced amino acid sequence. The second peroxidase cDNA, designated phTPO-2, is almost identical to phTPO-1 beginning 605 base pairs downstream except that it contains a 1-base-pair difference and lacks 171 base pairs in the middle of the sequence. This results in a loss of 57 amino acids corresponding to a molecular weight of 6282. Interestingly, this 171-nucleotide sequence has GT and AG at its 5' and 3' boundaries, respectively, that are in good agreement with donor and acceptor splice site consensus sequences. Using specific oligonucleotide probes for the mRNAs derived from the cDNA sequences phTPO-1 and phTPO-2, the authors show that both are expressed in all thyroid tissues examined and the relative level of two mRNAs is different in each sample. The results suggest that two thyroid peroxidase proteins might be generated through alternate splicing of the same gene. By using somatic cell hybrid lines, the thyroid peroxidase gene was mapped to the short arm of human chromosome 2.
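
    The reasoning about the 171-base-pair segment (GT and AG at its boundaries, and a loss of exactly 57 residues) can be expressed as two small checks. The following is a sketch only: the function names are hypothetical, the toy segment is invented for illustration and is not the real sequence, and matching GT...AG is a crude stand-in for the full splice-site consensus.

```python
from typing import Optional


def has_gt_ag_boundaries(segment: str) -> bool:
    """True if the segment starts with GT and ends with AG (minimal intron-like signature)."""
    s = segment.upper()
    return len(s) >= 4 and s.startswith("GT") and s.endswith("AG")


def residues_lost(deleted_bp: int) -> Optional[int]:
    """Codons removed by a deletion, or None if the deletion would shift the reading frame."""
    if deleted_bp % 3 != 0:
        return None
    return deleted_bp // 3


# Invented 171-nt placeholder with the boundary dinucleotides described in the abstract.
toy_segment = "GT" + "A" * 167 + "AG"

print(len(toy_segment), has_gt_ag_boundaries(toy_segment), residues_lost(len(toy_segment)))
```

    A 171-nucleotide deletion is a whole number of codons, so the downstream reading frame is preserved and 57 residues are lost, as the abstract states.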

  14. Identification of genomic sequences corresponding to cDNA clones

    SciTech Connect

    Spoerel, N.A.; Kafatos, F.C.

    1987-01-01

    The general methods applicable to the isolation of genomic sequences from phage lambda or cosmid libraries have been described. This chapter presents strategies for the investigation of genes that occur in several identical or nonidentical copies per genome, or that share a common conserved domain with other genes. The methods discussed are applicable both to the identification of the genes in Southern blots and to their isolation from libraries. Furthermore, the methods are well suited for the analysis of homologous genes in different species. A high proportion of genes in eukaryotes are known to be members of multigene families. Carefully controlled hybridization conditions and well-tailored probes are powerful tools in the isolation and analysis of genes which share a common domain or are members of multigene families. This chapter consists of a short review of recommended strategies and relevant parameters, which have been discussed in more detail earlier. Using three examples from the authors' analysis of the silk moth chorion locus, they demonstrate how powerful carefully tailored short single-stranded probes can be in the analysis of closely related gene copies.

  15. Comparing compressed sequences for faster nucleotide BLAST searches.

    PubMed

    Cameron, Michael; Williams, Hugh E

    2007-01-01

    Molecular biologists, geneticists, and other life scientists use the BLAST homology search package as their first step for discovery of information about unknown or poorly annotated genomic sequences. There are two main variants of BLAST: BLASTP for searching protein collections and BLASTN for nucleotide collections. Surprisingly, BLASTN has had very little attention; for example, the algorithms it uses do not follow those described in the 1997 BLAST paper and no exact description has been published. It is important that BLASTN is state-of-the-art: Nucleotide collections such as GenBank dwarf the protein collections in size, they double in size almost yearly, and they take many minutes to search on modern general purpose workstations. This paper proposes significant improvements to the BLASTN algorithms. Each of our schemes is based on compressed bytepacked formats that allow queries and collection sequences to be compared four bases at a time, permitting very fast query evaluation using lookup tables and numeric comparisons. Our most significant innovations are two new, fast gapped alignment schemes that allow accurate sequence alignment without decompression of the collection sequences. Overall, our innovations more than double the speed of BLASTN with no effect on accuracy and have been integrated into our new version of BLAST that is freely available for download from http://www.fsa-blast.org/. PMID:17666756
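
    The idea of comparing sequences four bases at a time can be illustrated with a simple 2-bit packing. This is a sketch of the general technique, not the format used by the authors' software: the base-to-code mapping and function names are assumptions, ambiguity codes are not handled, and sequence lengths are assumed to be multiples of four.

```python
# Assumed 2-bit codes; any fixed assignment of the four bases would work.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def pack(seq: str) -> bytes:
    """Pack a nucleotide string into bytes, four bases per byte (first base in the high bits)."""
    if len(seq) % 4 != 0:
        raise ValueError("this sketch only handles lengths that are a multiple of 4")
    packed = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4].upper():
            byte = (byte << 2) | CODE[base]
        packed.append(byte)
    return bytes(packed)


def count_matching_blocks(a: bytes, b: bytes) -> int:
    """Count aligned 4-base blocks that are identical, using one byte comparison per block."""
    return sum(x == y for x, y in zip(a, b))


query = pack("ACGTACGT")
subject = pack("ACGTACGA")
print(query.hex(), subject.hex(), count_matching_blocks(query, subject))
```

    Each byte comparison here stands in for four base comparisons, which is the source of the speed-up the abstract describes; a real implementation would also need lookup tables for scoring partial matches within a byte.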

  16. Nucleotide sequence of a complementary DNA encoding pea cytosolic copper/zinc superoxide dismutase. [Pisum sativum L.]

    SciTech Connect

    White, D.A.; Zilinskas, B.A. )

    1991-08-01

    The authors now report the nucleotide sequence of the cytosolic Cu/Zn SOD cloned from a λgt11 cDNA library constructed from mRNA extracted from leaves of 7- to 10-d pea seedlings (Pisum sativum L.). The clone was isolated using a 22-base synthetic oligonucleotide complementary to the amino acid sequence CGIIGLQG. This sequence, found at the protein's carboxy terminus, is highly conserved among plant cytosolic Cu/Zn SODs but not chloroplastic Cu/Zn SODs. The 738-base pair sequence contains an open reading frame specifying 152 codons and a predicted M(r) of 18,024 D. The deduced amino acid sequence is highly homologous (79-82% identity) with the sequences of other known plant cytosolic Cu/Zn SODs but less highly conserved (63-65%) when compared with several chloroplastic Cu/Zn SODs including pea (10).

  17. Bioinformatics comparison of sulfate-reducing metabolism nucleotide sequences

    NASA Astrophysics Data System (ADS)

    Tremberger, G.; Dehipawala, Sunil; Nguyen, A.; Cheung, E.; Sullivan, R.; Holden, T.; Lieberman, D.; Cheung, T.

    2015-09-01

    The sulfate-reducing bacteria can be traced back to 3.5 billion years ago. The thermodynamics details of the sulfur cycle have been well documented. A recent sulfate-reducing bacteria report (Robator, Jungbluth, et al., 2015 Jan, Front. Microbiol.) with GenBank nucleotide data has been analyzed in terms of the sulfite reductase (dsrAB) via fractal dimension and entropy values. Comparison to oil field sulfate-reducing sequences was included. The AUCG translational mass fractal dimension versus ATCG transcriptional mass fractal dimension for the low temperature dsrB and dsrA sequences reported in Reference Thirteen shows correlation R-sq ~ 0.79, with a probability of about 3% in simulation. A recent report of using Cystathionine gamma-lyase sequence to produce CdS quantum dot in a biological method, where the sulfur is reduced just like in the H2S production process, was included for comparison. The AUCG mass fractal dimension versus ATCG mass fractal dimension for the Cystathionine gamma-lyase sequences was found to have R-sq of 0.72, similar to the low temperature dissimilatory sulfite reductase dsr group with 3% probability, in contrast to the oil field group having R-sq ~ 0.94, a highly probable outcome in the simulation. The other two simulation histograms, namely, fractal dimension versus entropy R-sq outcome values, and di-nucleotide entropy versus mono-nucleotide entropy R-sq outcome values are also discussed in the data analysis focusing on low probability outcomes.
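
    The mono-nucleotide and di-nucleotide entropies mentioned above can be illustrated with the standard Shannon entropy. The abstract does not specify the exact estimator, so this is a sketch under assumptions: entropy is taken in bits, di-nucleotides are counted as overlapping pairs, and the function names are hypothetical.

```python
from collections import Counter
from math import log2


def shannon_entropy(symbols) -> float:
    """Shannon entropy in bits of the empirical distribution of `symbols`."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values())


def mono_entropy(seq: str) -> float:
    """Entropy of single-nucleotide frequencies."""
    return shannon_entropy(seq.upper())


def di_entropy(seq: str) -> float:
    """Entropy of overlapping di-nucleotide frequencies (assumed windowing)."""
    s = seq.upper()
    return shannon_entropy(s[i:i + 2] for i in range(len(s) - 1))


example = "ACGT"
print(mono_entropy(example), di_entropy(example))
```

    For a sequence with equal base usage the mono-nucleotide entropy reaches its maximum of 2 bits; the di-nucleotide value can be at most 4 bits, and departures from that bound reflect neighbour correlations.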

  18. cDNA sequence and chromosomal localization of a novel human protein, RBQ-1 (RBBP6), that binds to the retinoblastoma gene product

    SciTech Connect

    Sakai, Yoshihisa; Saijo, Masafumi; Taya, Yoichi

    1995-11-01

    We have previously isolated cDNA of a novel protein (RBQ-1, HGMW-approved symbol RBBP6) that binds to the retinoblastoma gene product (pRB). Total nucleotide sequence of the cDNA has now been determined. It encoded a protein of 140 kDa that consists of 948 amino acids and contains multiple repeated sequences like SRS, YRE, and VPPP. The region used for pRB binding was identified on a small region near the C-terminus. We have mapped this gene to 16p11.2-p12 using polymerase chain reaction analysis on a human-hamster hybrid cell panel and chromosomal fluorescence in situ hybridization. 24 refs., 3 figs.

  19. Cytochrome b nucleotide sequence variation among the Atlantic Alcidae.

    PubMed

    Friesen, V L; Montevecchi, W A; Davidson, W S

    1993-01-01

    Analysis of cytochrome b nucleotide sequences of the six extant species of Atlantic alcids and a gull revealed an excess of adenines and cytosines and a deficit of guanines at silent sites on the coding strand. Phylogenetic analyses grouped the sequences of the common (Uria aalge) and Brünnich's (U. lomvia) guillemots, followed by the razorbill (Alca torda) and little auk (Alle alle). The black guillemot (Cepphus grylle) sequence formed a sister taxon, and the puffin (Fratercula arctica) fell outside the other alcids. Phylogenetic comparisons of substitutions indicated that mutabilities of bases did not differ, but that C was much more likely to be incorporated than was G. Imbalances in base composition appear to result from a strand bias in replication errors, which may result from selection on secondary RNA structure and/or the energetics of codon-anticodon interactions. PMID:7916741

  20. The nucleotide sequence of the bacteriophage T5 ltf gene.

    PubMed

    Kaliman, A V; Kulshin, V E; Shlyapnikov, M G; Ksenzenko, V N; Kryukov, V M

    1995-06-01

    The nucleotide sequence of the bacteriophage T5 BglII-BamHI fragment (4,835 bp in length), known to carry a gene encoding the LTF protein that forms the phage L-shaped tail fibers, was determined. It was shown to contain an open reading frame for 1,396 amino acid residues that corresponds to a protein of 147.8 kDa. The coding region of the ltf gene is preceded by a typical Shine-Dalgarno sequence. Downstream from the ltf gene there is a strong transcription terminator. Data bank analysis of the LTF protein sequence reveals 55.1% identity to the hypothetical protein ORF 401 of bacteriophage lambda over a 118-amino-acid overlap. PMID:7789514

  1. IMGT/LIGM-DB, the IMGT comprehensive database of immunoglobulin and T cell receptor nucleotide sequences.

    PubMed

    Giudicelli, Véronique; Duroux, Patrice; Ginestoux, Chantal; Folch, Géraldine; Jabado-Michaloud, Joumana; Chaume, Denys; Lefranc, Marie-Paule

    2006-01-01

    IMGT/LIGM-DB is the IMGT comprehensive database of immunoglobulin (IG) and T cell receptor (TR) nucleotide sequences from human and other vertebrate species. It was created in 1989 by LIGM, Montpellier, France and is the oldest and the largest database of IMGT. IMGT/LIGM-DB includes all germline (non-rearranged) and rearranged IG and TR genomic DNA (gDNA) and complementary DNA (cDNA) sequences published in generalist databases. IMGT/LIGM-DB allows searches from the Web interface according to biological and immunogenetic criteria through five distinct modules depending on the user interest. For a given entry, nine types of display are available including the IMGT flat file, the translation of the coding regions and the analysis by the IMGT/V-QUEST tool. IMGT/LIGM-DB distributes expertly annotated sequences. The annotations hugely enhance the quality and the accuracy of the distributed detailed information. They include the sequence identification, the gene and allele classification, the constitutive and specific motif description, the codon and amino acid numbering, and the sequence obtaining information, according to the main concepts of IMGT-ONTOLOGY. They represent the main source of IG and TR gene and allele knowledge stored in IMGT/GENE-DB and in the IMGT reference directory. IMGT/LIGM-DB is freely available at http://imgt.cines.fr. PMID:16381979

  2. Molecular cloning and characterization of potato spindle tuber viroid cDNA sequences

    PubMed Central

    Owens, Robert A.; Cress, Dean E.

    1980-01-01

    Double-stranded cDNA has been synthesized from a polyadenylylated potato spindle tuber viroid (PSTV) template and inserted in the Pst I endonuclease site of plasmid pBR322 by using the oligo(dC)·oligo(dG)-tailing procedure. Tetracycline-resistant ampicillin-sensitive transformants contained sequences complementary to PSTV [32P]cDNA, and one recombinant clone (pDC-29) contains a 460-base-pair insert. This cloned double-stranded PSTV cDNA contains the cleavage sites for six restriction endonucleases predicted by the published primary sequence of PSTV as well as one additional site each for Ava I, Hae III, Hpa II, and Sma I. The additional Ava I, Hpa II, and Sma I sites are explained by the presence of a second C-C-C-G-G-G sequence in the cloned double-stranded cDNA. The largest fragment released by Hae III digestion contains approximately 360 base pairs. These results suggest that we have cloned almost the entire sequence of PSTV, but the sequence cloned differs slightly from that published. Hybridization probes derived from pDC-29 insert have allowed detection and preliminary characterization of RNA molecules having the same size as PSTV but the opposite polarity. This RNA is present during PSTV replication in infected tomato cells. Images PMID:16592877
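
    The statement that one extra C-C-C-G-G-G sequence accounts for additional Ava I, Hpa II, and Sma I sites follows from the enzymes' recognition sequences, which can be checked with regular expressions. This sketch uses the standard recognition sequences (Ava I CYCGRG, Hae III GGCC, Hpa II CCGG, Sma I CCCGGG); the function names and the test fragment are hypothetical, and cut positions within each site are ignored.

```python
import re

# Recognition sequences written as regular expressions (Y = C or T, R = A or G).
SITES = {
    "Ava I": "C[CT]CG[AG]G",
    "Hae III": "GGCC",
    "Hpa II": "CCGG",
    "Sma I": "CCCGGG",
}


def enzymes_recognizing(seq: str) -> list:
    """Names of the enzymes above whose recognition sequence occurs in `seq`."""
    s = seq.upper()
    return sorted(name for name, pattern in SITES.items() if re.search(pattern, s))


def site_positions(seq: str, pattern: str) -> list:
    """0-based start positions of every (possibly overlapping) match of `pattern`."""
    return [m.start() for m in re.finditer(f"(?=({pattern}))", seq.upper())]


fragment = "AACCCGGGTT"  # invented fragment containing a single CCCGGG
print(enzymes_recognizing(fragment))
print(site_positions(fragment, SITES["Hpa II"]))
```

    A single CCCGGG is matched by the Ava I, Hpa II, and Sma I patterns but not by Hae III, which is why the extra Hae III site reported in the abstract needs a separate explanation.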

  3. Selection and sequence analysis of a cDNA clone encoding a known chorion protein of the A family.

    PubMed Central

    Tsitilou, S G; Regier, J C; Kafatos, F C

    1980-01-01

    Using as criteria the size, abundance and developmental specificity of hybridizing mRNA sequences, we have selected from our chorion cDNA library a clone corresponding to a specific chorion protein, A4--cl. Comparison between the clone sequence and the largely known sequence of A4--cl validates the use of the cDNA library for sequence analysis of the chorion multigene families. The two major chorion protein families, A and B, share certain structural similarities. Images PMID:7433133

  4. Nucleotide sequence corresponding to five chemotaxis genes in Escherichia coli.

    PubMed Central

    Mutoh, N; Simon, M I

    1986-01-01

    The nucleotide sequence of DNA which contains five chemotaxis-related genes of Escherichia coli, cheW, cheR, cheB, cheY, and cheZ, and part of the cheA gene was determined. Molecular weights of the polypeptides encoded by these genes were calculated from translated amino acid sequences, and they were 18,100 for cheW, 32,700 for cheR, 37,500 for cheB, 14,100 for cheY, and 24,000 for cheZ. Nucleotide sequences which could act as ribosome-binding sites were found in the upstream region of each gene. After the termination codon of the cheW gene, a typical rho-independent transcription termination signal was observed. There are no other open reading frames long enough to encode polypeptides in this region except those which code for the two previously reported genes tar and tap. PMID:3510184

  5. Nucleotide sequence of the gene for the b subunit of human factor XIII

    SciTech Connect

    Bottenus, R.E.; Ichinose, A.; Davie, E.W. )

    1990-12-01

    Factor XIII (M(r) 320 000) is a blood coagulation factor that stabilizes and strengthens the fibrin clot. It circulates in blood as a tetramer composed of two a subunits (M(r) 75 000 each) and two b subunits (M(r) 80 000 each). The b subunit consists of 641 amino acids and includes 10 tandem repeats of 60 amino acids known as GP-I structures, short consensus repeats (SCR), or sushi domains. In the present study, the human gene for the b subunit has been isolated from three different genomic libraries prepared in λ phage. Fifteen independent phage with inserts coding for the entire gene were isolated and characterized by restriction mapping, Southern blotting, and DNA sequencing. The gene was found to be 28 kilobases in length and consisted of 12 exons (I-XII) separated by 11 intervening sequences. The leader sequence was encoded by exon I, while the carboxyl-terminal region of the protein was encoded by exon XII. Exons II-XI each coded for a single sushi domain, suggesting that the gene evolved through exon shuffling and duplication. The 12 exons in the gene ranged in size from 64 to 222 base pairs, while the introns ranged in size from 87 to 9970 nucleotides and made up 92% of the gene. One nucleotide change was found in the coding region of the gene when its sequence was compared to that of the cDNA. This difference, however, did not result in a change in the amino acid sequence of the protein.

  6. cDNA Library Enrichment of Full Length Transcripts for SMRT Long Read Sequencing

    PubMed Central

    Hartwig, Benjamin; Reinhardt, Richard; Schneeberger, Korbinian

    2016-01-01

    The utility of genome assemblies does not only rely on the quality of the assembled genome sequence, but also on the quality of the gene annotations. The Pacific Biosciences Iso-Seq technology is a powerful support for accurate eukaryotic gene model annotation as it allows for direct readout of full-length cDNA sequences without the need for noisy short read-based transcript assembly. We propose the implementation of the TeloPrime Full Length cDNA Amplification kit to the Pacific Biosciences Iso-Seq technology in order to enrich for genuine full-length transcripts in the cDNA libraries. We provide evidence that TeloPrime outperforms the commonly used SMARTer PCR cDNA Synthesis Kit in identifying transcription start and end sites in Arabidopsis thaliana. Furthermore, we show that TeloPrime-based Pacific Biosciences Iso-Seq can be successfully applied to the polyploid genome of bread wheat (Triticum aestivum) not only to efficiently annotate gene models, but also to identify novel transcription sites, gene homeologs, splicing isoforms and previously unidentified gene loci. PMID:27327613

  7. Sequencing and comparative genomic analysis of 1227 Felis catus cDNA sequences enriched for developmental, clinical and nutritional phenotypes

    PubMed Central

    2012-01-01

    Background The feline genome is valuable to the veterinary and model organism genomics communities because the cat is an obligate carnivore and a model for endangered felids. The initial public release of the Felis catus genome assembly provided a framework for investigating the genomic basis of feline biology. However, the entire set of protein coding genes has not been elucidated. Results We identified and characterized 1227 protein coding feline sequences, of which 913 map to public sequences and 314 are novel. These sequences have been deposited into NCBI's genbank database and complement public genomic resources by providing additional protein coding sequences that fill in some of the gaps in the feline genome assembly. Through functional and comparative genomic analyses, we gained an understanding of the role of these sequences in feline development, nutrition and health. Specifically, we identified 104 orthologs of human genes associated with Mendelian disorders. We detected negative selection within sequences with gene ontology annotations associated with intracellular trafficking, cytoskeleton and muscle functions. We detected relatively less negative selection on protein sequences encoding extracellular networks, apoptotic pathways and mitochondrial gene ontology annotations. Additionally, we characterized feline cDNA sequences that have mouse orthologs associated with clinical, nutritional and developmental phenotypes. Together, this analysis provides an overview of the value of our cDNA sequences and enhances our understanding of how the feline genome is similar to, and different from other mammalian genomes. Conclusions The cDNA sequences reported here expand existing feline genomic resources by providing high-quality sequences annotated with comparative genomic information providing functional, clinical, nutritional and orthologous gene information. PMID:22257742

  8. Complete cDNA and derived amino acid sequence of human factor V

    SciTech Connect

    Jenny, R.J.; Pittman, D.D.; Toole, J.J.; Kriz, R.W.; Aldape, R.A.; Hewick, R.M.; Kaufman, R.J.; Mann, K.G.

    1987-07-01

    cDNA clones encoding human factor V have been isolated from an oligo(dT)-primed human fetal liver cDNA library prepared with vector Charon 21A. The cDNA sequence of factor V from three overlapping clones includes a 6672-base-pair (bp) coding region, a 90-bp 5' untranslated region, and a 163-bp 3' untranslated region within which is a poly(A) tail. The deduced amino acid sequence consists of 2224 amino acids inclusive of a 28-amino acid leader peptide. Direct comparison with human factor VIII reveals considerable homology between proteins in amino acid sequence and domain structure: a triplicated A domain and duplicated C domain show approx. 40% identity with the corresponding domains in factor VIII. As in factor VIII, the A domains of factor V share approx. 40% amino acid-sequence homology with the three highly conserved domains in ceruloplasmin. The B domain of factor V contains 35 tandem and approx. 9 additional semiconserved repeats of nine amino acids of the form Asp-Leu-Ser-Gln-Thr-Thr/Asn-Leu-Ser-Pro and 2 additional semiconserved repeats of 17 amino acids. Factor V contains 37 potential N-linked glycosylation sites, 25 of which are in the B domain, and a total of 19 cysteine residues.

  9. Was cDNA sequences modulate transgene expression of was promoter-driven lentiviral vectors.

    PubMed

    Toscano, Miguel G; Benabdellah, Karim; Muñoz, Pilar; Frecha, Cecilia; Cobo, Marién; Martín, Francisco

    2009-11-01

    The development of vectors that express a therapeutic transgene efficiently and specifically in hematopoietic cells (HCs) is an important goal for gene therapy of hematological disorders. We have previously shown that a 500-bp fragment from the proximal Was gene promoter in a lentiviral vector (LV) was sufficient to achieve more than 100-fold higher levels of Wiskott-Aldrich syndrome protein in HCs than in nonhematopoietic cells (non-HCs). We show now that this differential was reduced up to 10 times when the enhanced green fluorescent protein gene (eGFP) was expressed instead of Was in the same LV backbone. Insertion of Was cDNA sequences downstream of eGFP in these LVs had a negative effect on transgene expression. This effect varied in different cell types but, overall, Was cDNA sequences increased the hematopoietic specificity of Was promoter-driven LV. We have characterized the minimal fragment required to increase hematopoietic specificity and have demonstrated that the mechanism involves Was promoter regulation and RNA processing. In addition, we have shown that Was cDNA sequences interfere with the enhancer activity of the woodchuck posttranscriptional regulatory element. These results represent the first data showing the role of Was intragenic sequences in gene regulation. PMID:19630517

  10. Molecular cloning and sequencing of a cDNA encoding partial putative molt-inhibiting hormone from Penaeus chinensis

    NASA Astrophysics Data System (ADS)

    Wang, Zai-Zhao; Xiang, Jian-Hai

    2002-09-01

    Total RNA was extracted from eyestalks of the shrimp Penaeus chinensis. Eyestalk cDNA was obtained from total RNA by reverse transcription. Reverse transcriptase-polymerase chain reaction (RT-PCR) was performed using eyestalk cDNA and degenerate primers designed from the amino acid sequence of molt-inhibiting hormone from the shrimp Penaeus japonicus. A specific cDNA was obtained and cloned into a T vector for sequencing. The cDNA consisted of 201 base pairs and encoded a peptide of 67 amino acid residues. The P. chinensis peptide had the highest identity with molt-inhibiting hormones of P. japonicus. The cDNA could represent part of a molt-inhibiting hormone gene from P. chinensis. This paper reports for the first time a cDNA encoding a neuropeptide of P. chinensis.
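
    The length figures in this abstract follow the usual three-nucleotides-per-codon relationship. A minimal sketch of that check in Python; the function name and the includes_stop flag are hypothetical and are not taken from the paper:

```python
def residues_encoded(n_bp, includes_stop=False):
    """Residues encoded by an in-frame coding stretch of n_bp base pairs.

    Assumes the stretch starts in frame; set includes_stop=True if the
    length counts a terminal stop codon (an assumption of this sketch).
    """
    if n_bp % 3 != 0:
        raise ValueError("length is not a whole number of codons")
    codons = n_bp // 3
    return codons - 1 if includes_stop else codons


# 201 bp of partial coding sequence corresponds to 67 codons.
print(residues_encoded(201))
```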

  11. Nucleotide sequence of both genomic RNAs of a North American tobacco rattle virus isolate.

    PubMed

    Sudarshana, M R; Berger, P H

    1998-01-01

    The complete sequence of a North American tobacco rattle virus (TRV) isolate, 'Oregon yellow' (ORY), was determined from cDNA and RT-PCR clones derived from the two genomic RNAs of this isolate. RNA-1 is 6790 bases and RNA-2 is 3261 bases. The sequence of TRV-ORY RNA-1 was similar to RNA-1 of TRV isolate SYM, differing in 48 nucleotides: TRV-ORY RNA-1 was one base shorter than that of TRV-SYM and had 47 base substitutions, resulting in 12 amino acid substitutions, of which 4 were conservative. The RNA-2 of TRV-ORY was distinct from RNA-2 of other characterized TRV isolates and contained three open reading frames (ORFs) that could potentially code for proteins of MW 22.4 kDa, 37.6 kDa and 17.9 kDa. Based on the homology of the predicted amino acid sequence with those of other tobraviruses, ORF1 of RNA-2 encodes the coat protein (CP). The protein sequence of ORF2 had regions of limited similarity with those of ORF2 of two other TRV isolates and pea early browning tobravirus. ORF3 was unique to TRV-ORY. Phylogenetic analysis of tobravirus CPs indicated that TRV-ORY was most closely related to pepper ringspot tobravirus and TRV-TCM. The relationship of tobravirus CPs to other rod-shaped tubular plant viruses is also discussed. PMID:9739332

  12. Nucleotide sequence of Bacillus phage Nf terminal protein gene.

    PubMed Central

    Leavitt, M C; Ito, J

    1987-01-01

    The nucleotide sequence of Bacillus phage Nf gene E has been determined. Gene E codes for phage terminal protein which is the primer necessary for the initiation of DNA replication. The deduced amino acid sequence of Nf terminal protein is approximately 66% homologous with the terminal proteins of Bacillus phages PZA and φ29, and shows similar hydropathy and secondary structure predictions. A serine which has been identified as the residue which covalently links the protein to the 5' end of the genome in φ29, is conserved in all three phages. The hydropathic and secondary structural environment of this serine is similar in these phage terminal proteins and also similar to the linking serine of adenovirus terminal protein. PMID:3601672

  13. cDNA sequence, genomic organization, and evolutionary conservation of a novel gene from the WAGR region

    SciTech Connect

    Schwartz, F.; Eisenman, R.; Knoll, J.; Bruns, G.

    1995-09-20

    A new gene (239FB) with predominant and differential expression in fetal brain has recently been isolated from a chromosome 11p13-p14 boundary area near FSHB. The corresponding mRNA has an open reading frame of 294 amino acids, a 3' untranslated region of 1247 nucleotides, and a highly GC-rich 5' untranslated region. The coding and 3' UT sequence is specified by 6 exons within nearly 87 kb of isolated genomic locus. The 5' end region of the transcript maps adjacent to the only genomically defined CpG island in a chromosomal subregion that may be associated with part of the mental retardation of some WAGR (Wilms tumor, aniridia, genitourinary anomalies, and mental retardation) syndrome patients. In addition to nucleotide and amino acid similarity to an EST from a normalized infant brain cDNA library, the predicted protein has extensive similarity to Caenorhabditis elegans polypeptides of, as yet, unknown function. The 239FB locus is, therefore, likely part of a family of genes with two members expressed in human brain. The extensive conservation of the predicted protein suggests a fundamental function of the gene product and will enable evaluation of the role of the 239FB gene in neurogenesis in model organisms. 48 refs., 4 figs., 1 tab.

  14. Expressed sequence tags from a NaCl-treated Suaeda salsa cDNA library.

    PubMed

    Zhang, L; Ma, X L; Zhang, Q; Ma, C L; Wang, P P; Sun, Y F; Zhao, Y X; Zhang, H

    2001-04-18

    Past efforts to improve plant tolerance to osmotic stress have had limited success owing to the genetic complexity of stress responses. The first step towards cataloging and categorizing genetically complex abiotic stress responses is the rapid discovery of genes by the large-scale partial sequencing of randomly selected cDNA clones or expressed sequence tags (ESTs). Suaeda salsa, which can survive seawater-level salinity, is a favorite halophytic model for salt-tolerance research. We constructed a NaCl-treated cDNA library of Suaeda salsa and sequenced 1048 randomly selected clones, out of which 1016 clones produced readable sequences (773 showed homology to previously identified genes, 227 matched unknown protein coding regions, 16 anomalous sequences or sequences of bacterial origin were excluded from further analysis). By sequence analysis we identified 492 unique clones: 315 showed homology to previously identified genes, 177 matched unknown protein coding regions (101 of which have been found before in other organisms and 76 are completely novel). All our EST data are available on the Internet. We believe that our dbEST and the associated DNA materials will be a useful resource for scientists engaged in stress-tolerance studies. PMID:11313146

  15. Nucleotide sequences specific to Brucella and methods for the detection of Brucella

    DOEpatents

    McCready, Paula M.; Radnedge, Lyndsay; Andersen, Gary L.; Ott, Linda L.; Slezak, Thomas R.; Kuczmarski, Thomas A.

    2009-02-24

    Nucleotide sequences specific to Brucella that serve as markers or signatures for identification of this bacterium were identified. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.

  16. Nucleotide sequences specific to Francisella tularensis and methods for the detection of Francisella tularensis

    DOEpatents

    McCready, Paula M.; Radnedge, Lyndsay; Andersen, Gary L.; Ott, Linda L.; Slezak, Thomas R.; Kuczmarski, Thomas A.; Vitalis, Elizabeth A

    2007-02-06

    Described herein is the identification of nucleotide sequences specific to Francisella tularensis that serve as markers or signatures for identification of this bacterium. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.

  17. Nucleotide sequences specific to Francisella tularensis and methods for the detection of Francisella tularensis

    DOEpatents

    McCready, Paula M.; Radnedge, Lyndsay; Andersen, Gary L.; Ott, Linda L.; Slezak, Thomas R.; Kuczmarski, Thomas A.; Vitalis, Elizabeth A

    2009-02-24

    Described herein is the identification of nucleotide sequences specific to Francisella tularensis that serve as markers or signatures for identification of this bacterium. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.

  18. Nucleotide sequences specific to Yersinia pestis and methods for the detection of Yersinia pestis

    DOEpatents

    McCready, Paula M.; Radnedge, Lyndsay; Andersen, Gary L.; Ott, Linda L.; Slezak, Thomas R.; Kuczmarski, Thomas A.; Motin, Vladinir L.

    2009-02-24

    Nucleotide sequences specific to Yersinia pestis that serve as markers or signatures for identification of this bacterium were identified. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.

  19. Complete cDNA and deduced amino acid sequence of the chaperonin containing T-complex polypeptide 1 (CCT) delta subunit from Aedes triseriatus mosquitoes.

    PubMed

    Blitvich, B J; Rayms-Keller, A; Blair, C D; Beaty, B J

    2001-01-01

    The chaperonin containing t-complex polypeptide 1 (CCT) assists in the ATP-dependent folding and assembly of newly translated actin and tubulin in the eukaryotic cytosol. CCT is composed of eight different subunits, each encoded by an independent gene. In this report, we used RT-PCR amplification and 5'- and 3'-rapid amplification of cDNA ends (RACE) to determine the complete cDNA sequence of the CCT delta subunit from Aedes triseriatus mosquitoes. The CCT delta cDNA is 1936 nucleotides in length and encodes a putative 533 amino acid protein with a calculated molecular mass of 57,179 daltons and pI of 7.15. Hydrophobic residues comprise 39.8% of the amino acid sequence and putative motifs for ATP-binding and ATPase-activity are present. The amino acid sequence displays strong sequence similarity to Drosophila melanogaster (92%), human (85%), puffer fish (84%) and mouse (84%) counterparts. CCT delta mRNA was detected in both biosynthetically active (embryonating) and dormant (diapausing) Ae. triseriatus embryos by RT-PCR analysis. PMID:11762197

  20. Brief report: genome sequence and construction of an infectious cDNA clone of Ribgrass mosaic virus from Chinese cabbage in Korea.

    PubMed

    Ryu, So-Young; Hong, Jin-Sung; Rhee, Sun-Ju; Lee, Gung Pyo

    2012-04-01

    Ribgrass mosaic virus (RMV) has severely decreased the production and lowered the quality of Chinese cabbage co-infected with Turnip mosaic virus (63.4%) in Korea. The complete genome sequence of RMV isolated from Brassica rapa ssp. pekinensis was determined. The full genome consisted of 6,304 nucleotides and showed sequence identities of 91.5-94.2% with the corresponding genome of other RMV strains. Full-length cDNA of RMV-Br was amplified by RT-PCR with a 5'-end primer harboring a T7 promoter sequence and a 3'-end RMV-specific primer. Subsequently, the full-length cDNA was cloned into plasmid vectors. Capped transcripts synthesized from the cDNA clone were highly infectious and caused characteristic symptoms in B. rapa ssp. pekinensis and several indicator plants, similar to wild-type RMV. Because no RMV-resistant Chinese cabbage has yet been found and the virus is already prevalent throughout fields in Korea, the full genome sequence and the infectious clone should help in developing breeding programs for RMV-resistant crops. PMID:22143325

  1. Nucleotide sequence of the mRNA encoding the pre-alpha-subunit of mouse thyrotropin.

    PubMed Central

    Chin, W W; Kronenberg, H M; Dee, P C; Maloof, F; Habener, J F

    1981-01-01

    We have constructed and cloned in bacteria recombinant DNA molecules containing DNA sequences coding for the precursor of the alpha subunit of thyrotropin (pre-TSH-alpha). Double-stranded DNA complementary to total poly(A)+RNA derived from a mouse pituitary thyrotropic tumor was prepared enzymatically, inserted into the Pst I site of the plasmid pBR322 by using poly(dC).poly(dG) homopolymeric extensions, and cloned in Escherichia coli chi 1776. Cloned cDNAs encoding pre-TSH-alpha were identified by their hybridization to pre-TSH-alpha mRNA as determined by cell-free translations of hybrid-selected and hybrid-arrested RNA. The nucleotide sequences of two cDNAs (510 and 480 base pairs) were determined with chemical methods and corresponded to much of the region coding for the alpha subunit and the 3' untranslated region of pre-TSH-alpha mRNA. The sequence of the 5' end of the mRNA was determined from cDNA synthesized by using total mRNA as template and a restriction enzyme DNA fragment as primer. Together these sequences represented greater than 90% of the coding and noncoding regions of full-length pre-TSH-alpha mRNA, which was determined to be 800 bases long. The amino acid sequence of the pre-TSH-alpha deduced from the nucleotide sequence showed a NH2-terminal leader sequence of 24 amino acids followed by the 96-amino-acid sequence of the apoprotein of TSH-alpha. There is greater than 90% homology in the amino acid sequences among the murine, ruminant, and porcine alpha subunits and 75-80% homology among the murine, equine, and human alpha subunits. Several regions of the sequence remain absolutely conserved among all species, suggesting that these particular regions are essential for the biological function of the subunit. The successful cloning of the alpha subunit of TSH will permit further studies of the organization of the genes coding for the glycoprotein hormone subunits and the regulation of their expression. Images PMID:6272299

  2. Generalized Levy-walk model for DNA nucleotide sequences

    NASA Technical Reports Server (NTRS)

    Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Simons, M.; Stanley, H. E.

    1993-01-01

    We propose a generalized Levy walk to model fractal landscapes observed in noncoding DNA sequences. We find that this model provides a very close approximation to the empirical data and explains a number of statistical properties of genomic DNA sequences such as the distribution of strand-biased regions (those with an excess of one type of nucleotide) as well as local changes in the slope of the correlation exponent $\alpha$. The generalized Levy-walk model simultaneously accounts for the long-range correlations in noncoding DNA sequences and for the apparently paradoxical finding of long subregions of biased random walks (length $l_j$) within these correlated sequences. In the generalized Levy-walk model, the $l_j$ are chosen from a power-law distribution $P(l_j) \propto l_j^{-\mu}$. The correlation exponent $\alpha$ is related to $\mu$ through $\alpha = 2 - \mu/2$ if $2 < \mu < 3$. The model is consistent with the finding of "repetitive elements" of variable length interspersed within noncoding DNA.
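
    The model described above can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not the authors' code: segment lengths are drawn from a power law by inverse-transform sampling, each segment gets a randomly signed bias, and the bias magnitude, minimum length and function names are all hypothetical choices.

```python
import random


def sample_segment_length(mu, rng, l_min=1.0):
    """Draw l >= l_min from a density proportional to l**(-mu), mu > 1."""
    u = rng.random()  # uniform in [0, 1)
    return l_min * (1.0 - u) ** (-1.0 / (mu - 1.0))


def generalized_levy_walk(n_steps, mu, bias=0.1, seed=0):
    """Walk of +1/-1 steps built from biased segments of power-law length.

    bias (assumed here) shifts the up-step probability to 0.5 +/- bias
    within each segment; the sign is chosen at random per segment.
    """
    rng = random.Random(seed)
    walk = [0]
    while len(walk) <= n_steps:
        length = int(sample_segment_length(mu, rng))  # at least 1
        p_up = 0.5 + rng.choice((-1, 1)) * bias
        for _ in range(length):
            if len(walk) > n_steps:
                break
            step = 1 if rng.random() < p_up else -1
            walk.append(walk[-1] + step)
    return walk


def correlation_exponent(mu):
    """alpha = 2 - mu/2, as stated in the abstract for 2 < mu < 3."""
    if not 2 < mu < 3:
        raise ValueError("relation is stated only for 2 < mu < 3")
    return 2 - mu / 2


walk = generalized_levy_walk(1000, mu=2.5)
print(len(walk), correlation_exponent(2.5))
```

    In a DNA-walk setting the +1/-1 steps would stand for two nucleotide classes; which classes are used is left open here.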

  3. Mouse muscle nicotinic acetylcholine receptor gamma subunit: cDNA sequence and gene expression.

    PubMed Central

    Yu, L; LaPolla, R J; Davidson, N

    1986-01-01

    Clones coding for the mouse nicotinic acetylcholine receptor (AChR) gamma subunit precursor have been selected from a cDNA library derived from a mouse myogenic cell line and sequenced. The deduced protein sequence consists of a signal peptide of 22 amino acid residues and a mature gamma subunit of 497 amino acid residues. There is a high degree of sequence conservation between this mouse sequence and published human and calf AChR gamma subunits and, after allowing for functional amino acid substitutions, also to the more distantly related chicken and Torpedo AChR gamma subunits. The degree of sequence conservation is especially high in the four putative hydrophobic membrane spanning regions, supporting the assignment of these domains. RNA blot hybridization showed that the mRNA level of the gamma subunit increases by 30 fold or more upon differentiation of the two mouse myogenic cell lines, BC3H-1 and C2C12, suggesting that the primary controls for changes in gene expression during differentiation are at the level of transcription. One cDNA clone was found to correspond to a partially processed nuclear transcript containing two as yet unspliced intervening sequences. Images PMID:3010242

  4. Identification and Sequencing of Candida krusei Aconitate Hydratase Gene Using Rapid Amplification of cDNA Ends Method and Phylogenetic Analysis

    PubMed Central

    Fateh, Roohollah; Zaini, Farideh; Kordbacheh, Parivash; Falahati, Mehraban; Rezaie, Sasan; Daie Ghazvini, Roshanak; Borhani, Nahid; Safara, Mahin; Fattahi, Azam; Kanani, Ali; Farahyar, Shirin; Bolhassani, Manzar; Heidari, Mansour

    2015-01-01

    Background: The production and development of an effective fungicidal drug requires the identification of an essential fungal protein as a drug target. Aconitase (ACO) is a mitochondrial protein that plays a vital role in the tricarboxylic acid (TCA) cycle and thus in the production of energy within the cell. Objectives: The current study aimed to sequence the Candida krusei ACO gene and determine any amino acid residue differences between human and fungal aconitases that could allow selective inhibition. Materials and Methods: The Candida krusei (ATCC: 6258) aconitase gene was determined by the 5' Rapid Amplification of cDNA Ends (RACE) method and degenerate Polymerase Chain Reaction (PCR) and analyzed using bioinformatics software. Results: A total of 1419 nucleotides of the C. krusei aconitase gene were determined and submitted to GenBank as a partial sequence, and the taxonomic position of C. krusei was then determined from the nucleotide and amino acid sequences of this gene. The comparison of nucleotide and amino acid sequences of Candida species ACO genes showed that C. krusei possessed characteristic sequences. No significant differences were observed between C. krusei and human aconitases within the active site amino acid residues. Conclusions: Results of the current study indicated that aconitase was not a suitable target to design new anti-fungal drugs that selectively block this enzyme. PMID:26855741

  5. Cloning and sequencing of the cDNA species for mammalian dimeric dihydrodiol dehydrogenases.

    PubMed Central

    Arimitsu, E; Aoki, S; Ishikura, S; Nakanishi, K; Matsuura, K; Hara, A

    1999-01-01

    Cynomolgus and Japanese monkey kidneys, dog and pig livers and rabbit lens contain dimeric dihydrodiol dehydrogenase (EC 1.3.1.20) associated with high carbonyl reductase activity. Here we have isolated cDNA species for the dimeric enzymes by reverse transcriptase-PCR from human intestine in addition to the above five animal tissues. The amino acid sequences deduced from the monkey, pig and dog cDNA species perfectly matched the partial sequences of peptides digested from the respective enzymes of these animal tissues, and active recombinant proteins were expressed in a bacterial system from the monkey and human cDNA species. Northern blot analysis revealed the existence of a single 1.3 kb mRNA species for the enzyme in these animal tissues. The human enzyme shared 94%, 85%, 84% and 82% amino acid identity with the enzymes of the two monkey strains (their sequences were identical), the dog, the pig and the rabbit respectively. The sequences of the primate enzymes consisted of 335 amino acid residues and lacked one amino acid compared with the other animal enzymes. In contrast with previous reports that other types of dihydrodiol dehydrogenase, carbonyl reductases and enzymes with either activity belong to the aldo-keto reductase family or the short-chain dehydrogenase/reductase family, dimeric dihydrodiol dehydrogenase showed no sequence similarity with the members of the two protein families. The dimeric enzyme aligned with low degrees of identity (14-25%) with several prokaryotic proteins, in which 47 residues are strictly or highly conserved. Thus dimeric dihydrodiol dehydrogenase has a primary structure distinct from the previously known mammalian enzymes and is suggested to constitute a novel protein family with the prokaryotic proteins. PMID:10477285

  6. Empirical Bayes Estimation of Coalescence Times from Nucleotide Sequence Data.

    PubMed

    King, Leandra; Wakeley, John

    2016-09-01

    We demonstrate the advantages of using information at many unlinked loci to better calibrate estimates of the time to the most recent common ancestor (TMRCA) at a given locus. To this end, we apply a simple empirical Bayes method to estimate the TMRCA. This method is both asymptotically optimal, in the sense that the estimator converges to the true value when the number of unlinked loci for which we have information is large, and has the advantage of not making any assumptions about demographic history. The algorithm works as follows: we first split the sample at each locus into inferred left and right clades to obtain many estimates of the TMRCA, which we can average to obtain an initial estimate of the TMRCA. We then use nucleotide sequence data from other unlinked loci to form an empirical distribution that we can use to improve this initial estimate. PMID:27440864
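
    The two-stage idea in this abstract (initial per-locus estimates, then an empirical distribution from other loci used to refine one of them) can be illustrated with a generic empirical Bayes posterior mean. The sketch below is not the authors' algorithm: it assumes, hypothetically, that the number of differences d between the two clades at the focal locus is Poisson with mean 2 * mu * T, and it uses the initial estimates from other loci as a discrete empirical prior on T. All names and parameters are illustrative.

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of k events with mean lam."""
    if lam <= 0:
        return 1.0 if k == 0 else 0.0
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def empirical_bayes_tmrca(d_focal, other_estimates, mu):
    """Posterior mean of T at the focal locus.

    d_focal: observed differences between the two clades (assumed input).
    other_estimates: initial TMRCA estimates from unlinked loci, used as
        an empirical prior with equal weight on each value.
    mu: mutation rate per locus per unit time (assumed known).
    """
    weights = [poisson_pmf(d_focal, 2 * mu * t) for t in other_estimates]
    total = sum(weights)
    if total == 0:
        raise ValueError("no prior value is compatible with the data")
    return sum(w * t for w, t in zip(weights, other_estimates)) / total


prior = [0.5, 1.0, 1.5, 2.0, 3.0]
print(empirical_bayes_tmrca(4, prior, mu=1.0))
```

    The estimate is pulled toward values that are common among the other loci, which is the shrinkage behaviour that motivates empirical Bayes methods in general.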

  7. Detection of spurious interruptions of protein-coding regions in cloned cDNA sequences by GeneMark analysis.

    PubMed

    Hirosawa, M; Ishikawa, K; Nagase, T; Ohara, O

    2000-09-01

    cDNA is an artificial copy of mRNA and, therefore, no cDNA can be completely free from suspicion of cloning errors. Because overlooking these cloning errors results in serious misinterpretation of cDNA sequences, development of an alerting system targeting spurious sequences in cloned cDNAs is an urgent requirement for massive cDNA sequence analysis. We describe here the application of a modified GeneMark program, originally designed for prokaryotic gene finding, for detection of artifacts in cDNA clones. This program serves to provide a warning when any spurious split of protein-coding regions is detected through statistical analysis of cDNA sequences based on Markov models. In this study, 817 cDNA sequences deposited in public databases by us were subjected to analysis using this alerting system to assess its sensitivity and specificity. The results indicated that any spurious split of protein-coding regions in cloned cDNAs could be sensitively detected and systematically revised by means of this system after the experimental validation of the alerts. Furthermore, this study offered us, for the first time, statistical data regarding the rates and types of errors causing protein-coding splits in cloned cDNAs obtained by conventional cloning methods. PMID:10984451

  8. Complete nucleotide sequence of Nootka lupine vein-clearing virus.

    PubMed

    Robertson, Nancy L; Côté, Fabien; Paré, Christine; Leblanc, Eric; Bergeron, Michel G; Leclerc, Denis

    2007-12-01

    The complete genome sequence of Nootka lupine vein-clearing virus (NLVCV) was determined to be 4,172 nucleotides in length containing four open reading frames (ORFs) with a similar genetic organization of virus species in the genus Carmovirus, family Tombusviridae. The order and gene product size, starting from the 5'-proximal ORF consisted of: (1) polymerase/replicase gene, ORF1 (p27) and ORF1RT (readthrough) (p87), (2) movement proteins ORF2 (p7) and ORF3 (p9), and, (3) the 3'-proximal coat protein ORF4, (p37). The genomic 5'- and 3'-proximal termini contained a short (59 nt) and a relatively longer 405 nt untranslated region, respectively. The longer replicase gene product contained the GDD motif common to RNA-dependent RNA polymerases. Phylogenetically, NLVCV formed a subgroup with the following four carmoviruses when separately comparing the amino acids of the coat protein or replicase protein: Angelonia flower break virus (AnFBV), Carnation mottle virus (CarMV), Pelargonium flower break virus (PFBV), and Saguaro cactus virus (SgCV). Whole genome nucleotide analysis (percent identities) among the carmoviruses with NLVCV suggested a similar pattern. The species demarcation criteria in the genus Carmovirus for the amino acid sequence identity of the polymerase (<52%) and coat (<41%) protein genes restricted NLVCV as a distinct species, and instead, placed it as a tentative strain of CarMV, PFBV, or SgCV when both the polymerase and CP were used as the determining factors. In contrast, the species criteria that included different host ranges with no overlap and lack of serology relatedness between NLVCV and the carmoviruses, suggested that NLVCV was a distinct species. The relatively low cutoff percentages allowed for the polymerase and CP genes to dictate the inclusion/exclusion of a distinct carmovirus species should be reevaluated. Therefore, at this time we have concluded that NLVCV should be classified as a tentative new species in the genus Carmovirus

  9. Structure of the class II enzyme of human liver alcohol dehydrogenase: combined cDNA and protein sequence determination of the π subunit

    SciTech Connect

    Hoeoeg, J.O.; von Bahr-Lindstroem, H.; Heden, L.O.; Holmquist, B.; Larsson, K.; Hempel, J.; Vallee, B.L.; Joernvall, H.

    1987-04-07

    The class II enzyme of human liver alcohol dehydrogenase was isolated, carboxymethylated, and cleaved with CNBr and proteolytic enzymes. Sequence analysis of peptides established structures corresponding to the π subunit. Two segments from the C-terminal region unique to π were selected for synthesis of oligodeoxyribonucleotide probes to screen a human liver cDNA library constructed in plasmid pT4. Sequence analysis of two identical hybridization-positive clones with cDNA inserts of about 2000 nucleotides gave the entire coding region of the π subunit, a 61-nucleotide 5' noncoding region and a 741-nucleotide 3' noncoding region containing four possible polyadenylation sites. Translation of the coding region yields a 391-residue polypeptide, which in all regions except the C-terminal segment corresponds to the protein structure as determined directly by peptide analysis. With the class I numbering system, the exception concerns a residue exchange at position 368, the actual C-terminus which is Phe-374 by peptide data but a 12 residue extension by cDNA data, and possibly two further residue exchanges at positions 303 and 312. The size difference might indicate the existence of posttranslational modifications of the mature protein or, in combination with the residue exchanges, the existence of polymorphism at the locus for class II subunits. The π subunit analyzed directly results in a 379-residue polypeptide and is the only class II size thus far known to occur in the mature protein. Comparison of the π structure with those of the class I subunits (α, β, and γ) reveals a homology with extensive differences. Large variations in segments affecting relationships at the active site and the area of subunit interactions account for the significant alterations of enzymatic specificities and other properties that differentiate class II from class I enzymes.

  10. Time scale for cyclostome evolution inferred with a phylogenetic diagnosis of hagfish and lamprey cDNA sequences.

    PubMed

    Kuraku, Shigehiro; Kuratani, Shigeru

    2006-12-01

    The Cyclostomata consists of the two orders Myxiniformes (hagfishes) and Petromyzoniformes (lampreys), and its monophyly has been unequivocally supported by recent molecular phylogenetic studies. Under this updated vertebrate phylogeny, we performed in silico evolutionary analyses using currently available cDNA sequences of cyclostomes. We first calculated the GC-content at four-fold degenerate sites (GC(4)), which revealed that an extremely high GC-content is shared by all the lamprey species we surveyed, whereas no striking pattern in GC-content was observed in any of the hagfish species surveyed. We then estimated the timing of diversification in cyclostome evolution using nucleotide and amino acid sequences. We obtained divergence times of 470-390 million years ago (Mya) in the Ordovician-Silurian-Devonian Periods for the interordinal split between Myxiniformes and Petromyzoniformes; 90-60 Mya in the Cretaceous-Tertiary Periods for the split between the two hagfish subfamilies, Myxininae and Eptatretinae; 280-220 Mya in the Permian-Triassic Periods for the split between the two lamprey subfamilies, Geotriinae and Petromyzoninae; and 30-10 Mya in the Tertiary Period for the split between the two lamprey genera, Petromyzon and Lethenteron. This evolutionary configuration indicates that Myxiniformes and Petromyzoniformes diverged shortly after the common ancestor of cyclostomes split from the future gnathostome lineage. Our results also suggest that intra-subfamilial diversification in hagfish and lamprey lineages (especially those distributed in the northern hemisphere) occurred in the Cretaceous or Tertiary Periods. PMID:17261918
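    The GC(4) statistic mentioned in the abstract above can be illustrated with a minimal Python sketch. It assumes the standard genetic code, in which eight codon families (prefixes CT, GT, TC, CC, AC, GC, CG, GG) are four-fold degenerate at the third position. The function name gc4 and the toy sequence are hypothetical; the abstract does not describe the authors' actual pipeline, so this is only one plausible way to compute such a value.

```python
# First two bases of the eight four-fold degenerate codon families
# in the standard genetic code (Leu, Val, Ser, Pro, Thr, Ala, Arg, Gly).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def gc4(cds: str) -> float:
    """Fraction of G or C at the third position of four-fold degenerate codons.

    `cds` is assumed to be an in-frame coding sequence (hypothetical input).
    """
    cds = cds.upper().replace("U", "T")
    gc = total = 0
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable, 3):
        codon = cds[i:i + 3]
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in "ACGT":
            total += 1
            if codon[2] in "GC":
                gc += 1
    if total == 0:
        raise ValueError("no four-fold degenerate codons found")
    return gc / total


# Toy example: ATG GCC CTG GTA TAA has three four-fold sites (C, G, A).
print(gc4("ATGGCCCTGGTATAA"))
```

    In the toy sequence, two of the three four-fold degenerate sites carry G or C, so the function returns two thirds.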

  11. cDNA cloning and sequencing of human fibrillarin, a conserved nucleolar protein recognized by autoimmune antisera

    SciTech Connect

    Aris, J.P.; Blobel, G.

    1991-02-01

    The authors have isolated a 1.1-kilobase cDNA clone that encodes human fibrillarin by screening a hepatoma library in parallel with DNA probes derived from the fibrillarin genes of Saccharomyces cerevisiae (NOP1) and Xenopus laevis. RNA blot analysis indicates that the corresponding mRNA is approximately 1,300 nucleotides in length. Human fibrillarin expressed in vitro migrates on SDS gels as a 36-kDa protein that is specifically immunoprecipitated by antisera from humans with scleroderma autoimmune disease. Human fibrillarin contains an amino-terminal repetitive domain approximately 75-80 amino acids in length that is rich in glycine and arginine residues and is similar to amino-terminal domains in the yeast and Xenopus fibrillarins. The occurrence of a putative RNA-binding domain and an RNP consensus sequence within the protein is consistent with the association of fibrillarin with small nucleolar RNAs. Protein sequence alignments show that 67% of amino acids from human fibrillarin are identical to those in yeast fibrillarin and that 81% are identical to those in Xenopus fibrillarin. This identity suggests the evolutionary conservation of an important function early in the pathway for ribosome biosynthesis.
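    Identity figures such as the 67% and 81% quoted above are derived from pairwise alignments. The Python sketch below shows one common convention, counting identical residues over aligned columns that contain no gap. The function name percent_identity, the gap character, and the toy sequences are assumptions for illustration; the abstract does not say which denominator the authors used, and other conventions (for example, dividing by the length of the shorter sequence) give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues over gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 4 identical residues in 5 gap-free columns.
print(percent_identity("MKT-AG", "MRTLAG"))
```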

  12. Cloning and sequencing of a cDNA for Akazara scallop troponin T.

    PubMed

    Inoue, A; Ojima, T; Nishita, K

    1996-10-01

    A cDNA clone encoding troponin T of Akazara scallop (Chlamys nipponensis akazara) striated adductor muscle has been isolated and sequenced. The complete sequence deduced consists of 314 amino acid residues with a molecular weight of 37,206. Akazara scallop troponin T contains 55 amino acid residues more and 82 residues fewer than rabbit skeletal muscle troponin T and Drosophila melanogaster troponin T, respectively, showing almost the lowest sequence homology with rabbit troponin T (26%) but the highest homology with Drosophila troponin T (33%). Further, high sequence homology was seen in the functional regions: residues 33-120 and 174-227, corresponding respectively to residues 71-158 and 197-250 of rabbit troponin T (tropomyosin-binding regions); and residues 200-204, corresponding to residues 223-227 of rabbit troponin T (troponin I-binding region). In residues 1-70 (tropomyosin-binding region), however, only six residues are identical with rabbit troponin T. PMID:8947849

  13. The nucleotide sequence of the uvrD gene of E. coli.

    PubMed Central

    Finch, P W; Emmerson, P T

    1984-01-01

    The nucleotide sequence of a cloned section of the E. coli chromosome containing the uvrD gene has been determined. The coding region for the UvrD protein consists of 2,160 nucleotides which would direct the synthesis of a polypeptide 720 amino acids long with a calculated molecular weight of 82 kd. The predicted amino acid sequence of the UvrD protein has been compared with the amino acid sequences of other known adenine nucleotide binding proteins and a common sequence has been identified, thought to contribute towards adenine nucleotide binding. PMID:6379604

  14. Construction of cDNA library and preliminary analysis of expressed sequence tags from Siberian tiger

    PubMed Central

    Liu, Chang-Qing; Lu, Tao-Feng; Feng, Bao-Gang; Liu, Dan; Guan, Wei-Jun; Ma, Yue-Hui

    2010-01-01

    In this study we successfully constructed a full-length cDNA library from the Siberian tiger, Panthera tigris altaica, the most well-known wild animal. Total RNA was extracted from Siberian tiger fibroblasts cultured in vitro. The titers of the primary and amplified libraries were 1.30×10^6 pfu/ml and 1.62×10^9 pfu/ml, respectively. The proportion of recombinants in the unamplified library was 90.5% and the average length of exogenous inserts was 1.13 kb. A total of 282 individual ESTs with sizes ranging from 328 to 1,142 bp were then analyzed; the BLASTX scores revealed that 53.9% of the sequences were classified as strong matches, 38.6% as nominal matches and 7.4% as weak matches. Of the ESTs, 28.0% were found to be related to enzyme/catalytic protein, 20.9% to metabolism, 13.1% to transport, 12.1% to signal transducer/cell communication, 9.9% to structure protein, 3.9% to immunity protein/defense metabolism, and 3.2% to cell cycle, while 8.9% were classified as novel genes. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provides a useful platform for functional genomic research on Siberian tigers. PMID:20941376

  15. Benchmarking of the Oxford Nanopore MinION sequencing for quantitative and qualitative assessment of cDNA populations

    PubMed Central

    Oikonomopoulos, Spyros; Wang, Yu Chang; Djambazian, Haig; Badescu, Dunarel; Ragoussis, Jiannis

    2016-01-01

    To assess the performance of the Oxford Nanopore Technologies MinION sequencing platform, cDNAs from the External RNA Controls Consortium (ERCC) RNA Spike-In mix were sequenced. This mix mimics mammalian mRNA species and consists of 92 polyadenylated transcripts with known concentration. cDNA libraries were generated using a template switching protocol to facilitate the direct comparison between different sequencing platforms. The MinION performance was assessed for its ability to sequence the cDNAs directly with good accuracy in terms of abundance and full length. The abundance of the ERCC cDNA molecules sequenced by MinION agreed with their expected concentration. No length or GC content bias was observed. The majority of cDNAs were sequenced as full length. Additionally, a complex cDNA population derived from a human HEK-293 cell line was sequenced on Illumina HiSeq 2500, PacBio RS II and ONT MinION platforms. We observed that there was a good agreement in the measured cDNA abundance between PacBio RS II and ONT MinION (Pearson r = 0.82, isoforms longer than 700 bp) and between Illumina HiSeq 2500 and ONT MinION (Pearson r = 0.75). This indicates that the ONT MinION can sequence quantitatively both long and short full-length cDNA molecules. PMID:27554526
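    The platform agreement reported above is expressed as a Pearson correlation coefficient. As a rough illustration, the Python sketch below computes Pearson's r for two lists of per-transcript abundance values. The function name pearson_r and the toy numbers are hypothetical, and the abstract does not state whether abundances were transformed (for example, log-scaled) before correlation, so no transformation is applied here.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length numeric lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant list")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical abundance estimates for the same four transcripts on two platforms.
platform_a = [10.0, 20.0, 30.0, 40.0]
platform_b = [12.0, 18.0, 33.0, 41.0]
print(round(pearson_r(platform_a, platform_b), 3))
```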

  16. Benchmarking of the Oxford Nanopore MinION sequencing for quantitative and qualitative assessment of cDNA populations.

    PubMed

    Oikonomopoulos, Spyros; Wang, Yu Chang; Djambazian, Haig; Badescu, Dunarel; Ragoussis, Jiannis

    2016-01-01

    To assess the performance of the Oxford Nanopore Technologies MinION sequencing platform, cDNAs from the External RNA Controls Consortium (ERCC) RNA Spike-In mix were sequenced. This mix mimics mammalian mRNA species and consists of 92 polyadenylated transcripts with known concentration. cDNA libraries were generated using a template switching protocol to facilitate the direct comparison between different sequencing platforms. The MinION performance was assessed for its ability to sequence the cDNAs directly with good accuracy in terms of abundance and full length. The abundance of the ERCC cDNA molecules sequenced by MinION agreed with their expected concentration. No length or GC content bias was observed. The majority of cDNAs were sequenced as full length. Additionally, a complex cDNA population derived from a human HEK-293 cell line was sequenced on Illumina HiSeq 2500, PacBio RS II and ONT MinION platforms. We observed that there was a good agreement in the measured cDNA abundance between PacBio RS II and ONT MinION (Pearson r = 0.82, isoforms longer than 700 bp) and between Illumina HiSeq 2500 and ONT MinION (Pearson r = 0.75). This indicates that the ONT MinION can sequence quantitatively both long and short full-length cDNA molecules. PMID:27554526

  17. Spatially localized generation of nucleotide sequence-specific DNA damage

    PubMed Central

    Oh, Dennis H.; King, Brett A.; Boxer, Steven G.; Hanawalt, Philip C.

    2001-01-01

    Psoralens linked to triplex-forming oligonucleotides (psoTFOs) have been used in conjunction with laser-induced two-photon excitation (TPE) to damage a specific DNA target sequence. To demonstrate that TPE can initiate photochemistry resulting in psoralen–DNA photoadducts, target DNA sequences were incubated with psoTFOs to form triple-helical complexes and then irradiated in liquid solution with pulsed 765-nm laser light, which is half the quantum energy required for conventional one-photon excitation, as used in psoralen + UV A radiation (320–400 nm) therapy. Target DNA acquired strand-specific psoralen monoadducts in a light dose-dependent fashion. To localize DNA damage in a model tissue-like medium, a DNA–psoTFO mixture was prepared in a polyacrylamide gel and then irradiated with a converging laser beam targeting the rear of the gel. The highest number of photoadducts formed at the rear while relatively sparing DNA at the front of the gel, demonstrating spatial localization of sequence-specific DNA damage by TPE. To assess whether TPE treatment could be extended to cells without significant toxicity, cultured monolayers of normal human dermal fibroblasts were incubated with tritium-labeled psoralen without TFO to maximize detectable damage and irradiated by TPE. DNA from irradiated cells treated with psoralen exhibited a 4- to 7-fold increase in tritium activity relative to untreated controls. Functional survival assays indicated that the psoralen–TPE treatment was not toxic to cells. These results demonstrate that DNA damage can be simultaneously manipulated at the nucleotide level and in three dimensions. This approach for targeting photochemical DNA damage may have photochemotherapeutic applications in skin and other optically accessible tissues. PMID:11572980

  18. Spatially localized generation of nucleotide sequence-specific DNA damage.

    PubMed

    Oh, D H; King, B A; Boxer, S G; Hanawalt, P C

    2001-09-25

    Psoralens linked to triplex-forming oligonucleotides (psoTFOs) have been used in conjunction with laser-induced two-photon excitation (TPE) to damage a specific DNA target sequence. To demonstrate that TPE can initiate photochemistry resulting in psoralen-DNA photoadducts, target DNA sequences were incubated with psoTFOs to form triple-helical complexes and then irradiated in liquid solution with pulsed 765-nm laser light, which is half the quantum energy required for conventional one-photon excitation, as used in psoralen + UV A radiation (320-400 nm) therapy. Target DNA acquired strand-specific psoralen monoadducts in a light dose-dependent fashion. To localize DNA damage in a model tissue-like medium, a DNA-psoTFO mixture was prepared in a polyacrylamide gel and then irradiated with a converging laser beam targeting the rear of the gel. The highest number of photoadducts formed at the rear while relatively sparing DNA at the front of the gel, demonstrating spatial localization of sequence-specific DNA damage by TPE. To assess whether TPE treatment could be extended to cells without significant toxicity, cultured monolayers of normal human dermal fibroblasts were incubated with tritium-labeled psoralen without TFO to maximize detectable damage and irradiated by TPE. DNA from irradiated cells treated with psoralen exhibited a 4- to 7-fold increase in tritium activity relative to untreated controls. Functional survival assays indicated that the psoralen-TPE treatment was not toxic to cells. These results demonstrate that DNA damage can be simultaneously manipulated at the nucleotide level and in three dimensions. This approach for targeting photochemical DNA damage may have photochemotherapeutic applications in skin and other optically accessible tissues. PMID:11572980

  19. Cloning and sequence analysis of cDNA coding for a lectin from Helianthus tuberosus callus and its jasmonate-induced expression.

    PubMed

    Nakagawa, R; Yasokawa, D; Okumura, Y; Nagashima, K

    2000-06-01

    Two lectins (designated as HTA I and HTA II) that seemed to be isolectins were found in Helianthus tuberosus callus. cDNA encoding HTA I was isolated from a ZAP Express expression library by immunoselection using the anti-HTA antiserum. The sequence of this cDNA consisted of 432 nucleotides coding for a polypeptide of 143 amino acid residues (Mr, 15,314). When introduced into E. coli, the cDNA directed the synthesis of active HTA I as indicated by the hemagglutination activity. The deduced amino acid sequence showed homology with some lectins and jasmonate-induced proteins. When callus was cultured in the presence of methyl jasmonate (MeJA), the hemagglutination activity increased in a dose-dependent manner. The levels of expression of the HTA protein and of the corresponding mRNA also increased in the treated callus. In view of these results, HTA I is considered to be a jasmonate-induced protein. PMID:10923797

  20. Isolation and sequence of a cDNA clone for human tyrosinase that maps at the mouse c-albino locus

    SciTech Connect

    Kwon, B.S.; Haq, A.K.; Pomerantz, S.H.; Halaban, R.

    1987-11-01

    Screening of a λgt11 human melanocyte cDNA library with antibodies against hamster tyrosinase resulted in the isolation of 16 clones. The cDNA inserts from 13 of the 16 clones cross-hybridized with each other, indicating that they were from related mRNA species. One of the cDNA clones, Pmel34, detected one mRNA species with an approximate length of 2.4 kilobases that was expressed preferentially in normal and malignant melanocytes but not in other cell types. The amino acid sequence deduced from the nucleotide sequence showed that the putative human tyrosinase is composed of 548 amino acids with a molecular weight of 62,610. The deduced protein contains glycosylation sites and histidine-rich sites that could be used for copper binding. Southern blot analysis of DNA derived from newborn mice carrying lethal albino deletion mutations revealed that Pmel34 maps near or at the c-albino locus, the position of the structural gene for tyrosinase.

  1. Serine protease variants encoded by Echis ocellatus venom gland cDNA: cloning and sequencing analysis.

    PubMed

    Hasson, S S; Mothana, R A; Sallam, T A; Al-balushi, M S; Rahman, M T; Al-Jabri, A A

    2010-01-01

    Envenoming by Echis saw-scaled vipers is the leading cause of death and morbidity due to snake bite in Africa. Despite its medical importance, there have been few investigations into the toxin composition of the venom of this viper. Here, we report the cloning of cDNA sequences encoding four groups or isoforms of the haemostasis-disruptive serine proteases (SPs) from the venom glands of Echis ocellatus. All these SP sequences encoded the scaffold of cysteine residues that forms the six disulphide bonds responsible for the characteristic tertiary structure of venom serine proteases. All the Echis ocellatus EoSP groups showed varying degrees of sequence similarity to published viper venom SPs. However, these groups also showed marked intercluster sequence conservation, which was significantly different from that of previously published viper SPs. Because viper venom SPs exhibit a high degree of sequence similarity and yet exert profoundly different effects on the mammalian haemostatic system, no attempt was made to assign functionality to the new Echis ocellatus EoSPs on the basis of sequence alone. The extraordinary level of interspecific and intergeneric sequence conservation exhibited by the Echis ocellatus EoSPs and analogous serine proteases from other viper species leads us to speculate that antibodies to representative molecules (which we will exploit by epidermal DNA immunization) should neutralise the biological function of this important group of venom toxins in vipers that are distributed throughout Africa, the Middle East, and the Indian subcontinent. PMID:20936075

  2. Nucleotide sequence and proposed secondary structure of Columnea latent viroid: a natural mosaic of viroid sequences.

    PubMed Central

    Hammond, R; Smith, D R; Diener, T O

    1989-01-01

    The Columnea latent viroid (CLV) occurs latently in certain Columnea erythrophae plants grown commercially. In potato and tomato, CLV causes potato spindle tuber viroid (PSTV)-like symptoms. Its nucleotide sequence and proposed secondary structure reveal that CLV consists of a single-stranded circular RNA of 370 nucleotides which can assume a rod-like structure with extensive base-pairing characteristic of all known viroids. The electrophoretic mobility of circular CLV under nondenaturing conditions suggests a potential tertiary structure. CLV contains extensive sequence homologies to the PSTV group of viroids but contains a central conserved region identical to that of hop stunt viroid (HSV). CLV also shares some biological properties with each of the two types of viroids. Most probably, CLV is the result of intracellular RNA recombination between an HSV-type and one or more PSTV-type viroids replicating in the same plant. Images PMID:2602114

  3. Nucleotide sequence of a cloned woodchuck hepatitis virus genome: comparison with the hepatitis B virus sequence.

    PubMed Central

    Galibert, F; Chen, T N; Mandart, E

    1982-01-01

    The complete nucleotide sequence of a woodchuck hepatitis virus genome cloned in Escherichia coli was determined by the method of Maxam and Gilbert. This sequence was found to be 3,308 nucleotides long. Potential ATG initiator triplets and nonsense codons were identified and used to locate regions with a substantial coding capacity. A striking similarity was observed between the organization of human hepatitis B virus and woodchuck hepatitis virus. Nucleotide sequences of these open regions in the woodchuck virus were compared with corresponding regions present in hepatitis B virus. This allowed the location of four viral genes on the L strand and indicated the absence of protein coded by the S strand. Evolution rates of the various parts of the genome as well as of the four different proteins coded by hepatitis B virus and woodchuck hepatitis virus were compared. These results indicated that: (i) the core protein has evolved slightly less rapidly than the other proteins; and (ii) when a region of DNA codes for two different proteins, there is less freedom for the DNA to evolve and, moreover, one of the proteins can evolve more rapidly than the other. A hairpin structure, very well conserved in the two genomes, was located in the only region devoid of coding function, suggesting the location of the origin of replication of the viral DNA. Images PMID:7086958

  4. Complete nucleotide sequence of a maize chlorotic mottle virus isolate from Nebraska

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The complete genome of a maize chlorotic mottle virus isolate from Nebraska (MCMV-NE) was cloned and sequenced. The MCMV-NE genome consists of 4,436 nucleotides and shares 99.5% nucleotide sequence identity with an MCMV isolate from Kansas (MCMV-KS). Of 22 polymorphic sites, most resulted from t...

  5. Cloning and sequence analysis of a full-length cDNA of SmPP1cb encoding turbot protein phosphatase 1 beta catalytic subunit

    NASA Astrophysics Data System (ADS)

    Qi, Fei; Guo, Huarong; Wang, Jian

    2008-02-01

    Reversible protein phosphorylation, catalyzed by protein kinases and phosphatases, is an important and versatile mechanism by which eukaryotic cells regulate almost all signaling processes. Protein phosphatase 1 (PP1) is the first and a well-characterized member of the protein serine/threonine phosphatase family. In the present study, a full-length cDNA encoding the beta isoform of the catalytic subunit of protein phosphatase 1 (PP1cb), designated SmPP1cb, was isolated and sequenced for the first time from the skin tissue of the flatfish turbot Scophthalmus maximus by the rapid amplification of cDNA ends (RACE) technique. The SmPP1cb cDNA sequence we obtained contains a 984 bp open reading frame (ORF), flanked by a complete 39 bp 5' untranslated region and a 462 bp 3' untranslated region. The ORF encodes a putative 327 amino acid protein, and the N-terminal section of this protein is highly acidic, Met-Ala-Glu-Gly-Glu-Leu-Asp-Val-Asp, a common feature of the PP1 catalytic subunit that is absent in protein phosphatase 2B (PP2B). Its calculated molecular mass is 37,193 Da and its pI is 5.8. Sequence analysis indicated that SmPP1cb is extremely conserved at both the amino acid and nucleotide levels compared with the PP1cb of other vertebrates and invertebrates. Its Kozak motif, contained in the 5'UTR around the ATG start codon, is GXXAXXGXX ATGG, which differs from the mammalian motif at two positions, A-6 and G-3, indicating the possibility of different initiation of translation in turbot. The 3'UTR of SmPP1cb is also highly diverse in sequence similarity and length compared with other animals, especially zebrafish. The cloning and sequencing of the SmPP1cb gene lays a good foundation for future work on the biological functions of PP1 in the flatfish turbot.
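    The relationship between the ORF length and the protein length quoted above can be checked with simple arithmetic: 984 bp is 328 codons, and if the last codon is a stop codon, 327 codons encode amino acids. The Python sketch below encodes that check. The function name residues_from_orf and the includes_stop parameter are hypothetical, and the assumption that the reported ORF length counts the stop codon is inferred from the numbers, not stated in the abstract.

```python
def residues_from_orf(orf_nt: int, includes_stop: bool = True) -> int:
    """Number of encoded amino acids for an ORF of `orf_nt` nucleotides.

    Assumes one stop codon is counted in the ORF length when
    `includes_stop` is True (an assumption, not stated in the source).
    """
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


# 984 bp ORF: 328 codons, one of them a stop codon.
print(residues_from_orf(984))
```

    Under that assumption the function returns 327 for a 984 bp ORF, which agrees with the protein length reported in the abstract.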

  6. Nuclear-encoded chloroplast ribosomal protein L12 of Nicotiana tabacum: characterization of mature protein and isolation and sequence analysis of cDNA clones encoding its cytoplasmic precursor.

    PubMed Central

    Elhag, G A; Thomas, F J; McCreery, T P; Bourque, D P

    1992-01-01

    Poly(A)+ mRNA isolated from Nicotiana tabacum (cv. Petite Havana) leaves was used to prepare a cDNA library in the expression vector lambda gt11. Recombinant phage containing cDNAs coding for chloroplast ribosomal protein L12 were identified and sequenced. Mature tobacco L12 protein has 44% amino acid identity with ribosomal protein L7/L12 of Escherichia coli. The longest L12 cDNA (733 nucleotides) codes for a 13,823 molecular weight polypeptide with a transit peptide of 53 amino acids and a mature protein of 133 amino acids. The transit peptide and mature protein share 43% and 79% amino acid identity, respectively, with corresponding regions of spinach chloroplast ribosomal protein L12. The predicted amino terminus of the mature protein was confirmed by partial sequence analysis of HPLC-purified tobacco chloroplast ribosomal protein L12. A single L12 mRNA of about 0.8 kb was detected by hybridization of L12 cDNA to poly(A)+ and total leaf RNA. Hybridization patterns of restriction fragments of tobacco genomic DNA probed with the L12 cDNA suggested the existence of more than one gene for ribosomal protein L12. Characterization of a second cDNA with an identical L12 coding sequence but a different 3'-noncoding sequence provided evidence that at least two L12 genes are expressed in tobacco. Images PMID:1542565

  7. cDNA sequence and deduced primary structure of an alpha-amylase inhibitor from a bruchid-resistant wild common bean.

    PubMed

    Suzuki, K; Ishimoto, M; Kitamura, K

    1994-06-12

    alpha-Amylase inhibitor-2 (alpha AI-2), a seed storage protein present in a bruchid-resistant wild common bean (Phaseolus vulgaris), inhibits the growth of bruchid pests. The authors isolated and determined the sequence of an 852 nucleotide cDNA, designated as alpha ai2, and found it to contain a 720 base open reading frame (ORF). This ORF encodes a 240 amino-acid alpha AI-2 polypeptide 75.8% identical with alpha-amylase inhibitor-1 (alpha AI-1) and 50.6-55.6% with arcelin-1, phytohemagglutinin (PHA)-L and PHA-E of common bean. The high degree of sequence homology suggests that there is an evolutionary relationship among these genes. PMID:8003534

  8. Complete nucleotide sequence of the temperate bacteriophage LBR48, a new member of the family Myoviridae.

    PubMed

    Jang, Se Hwan; Yoon, Bo Hyun; Chang, Hyo Ihl

    2011-02-01

    The complete genomic sequence of LBR48, a temperate bacteriophage induced from a lysogenic strain of Lactobacillus brevis, was found to be 48,211 nucleotides long and to contain 90 putative open reading frames. Based on structural characteristics obtained from microscopic analysis and nucleic acid sequence determination, phage LBR48 can be classified as a member of the family Myoviridae. Analysis of the genome showed the conserved gene order of previously reported phages of the family Siphoviridae from lactic acid bacteria, despite low nucleotide sequence similarity. Analysis of the attachment sites revealed 15-nucleotide-long core sequences. PMID:20976608

  9. Differential representation of sunflower ESTs in enriched organ-specific cDNA libraries in a small scale sequencing project

    PubMed Central

    Fernández, Paula; Paniego, Norma; Lew, Sergio; Hopp, H Esteban; Heinz, Ruth A

    2003-01-01

    Background: Subtractive hybridization methods are valuable tools for identifying differentially regulated genes in a given tissue, avoiding redundant sequencing of clones representing the same expressed genes and maximizing detection of low-abundance transcripts, thereby affecting the efficiency and cost-effectiveness of small-scale cDNA sequencing projects aimed at the specific identification of useful genes for breeding purposes. The objective of this work is to evaluate alternative strategies to high-throughput sequencing projects for the identification of novel genes differentially expressed in sunflower as a source of organ-specific genetic markers that can be functionally associated with important traits. Results: Differential organ-specific ESTs were generated from leaf, stem, root and flower bud at two developmental stages (R1 and R4). The use of different sources of RNA as tester and driver cDNA for the construction of differential libraries was evaluated as a tool for detection of rare or low-abundance transcripts. Organ-specificity ranged from 75 to 100% of non-redundant sequences in the different cDNA libraries. Sequence redundancy varied according to the target and driver cDNA used in each case. The R4 flower cDNA library was the least redundant library, with 62% unique sequences. Out of a total of 919 sequences that were edited and annotated, 318 were non-redundant sequences. Comparison against sequences in public databases showed that 60% of non-redundant sequences showed significant similarity to known sequences. The number of predicted novel genes varied among the different cDNA libraries, ranging from 56% in the R4 flower to 16% in the R1 flower bud library. Comparison with sunflower ESTs in public databases showed that 197 of the non-redundant sequences (60%) did not exhibit significant similarity to previously reported sunflower ESTs. This approach helped to successfully isolate a significant number of new reported sequences putatively related to responses

  10. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... acids are not intended to be embraced by this definition. Any amino acid sequence that contains post-translationally modified amino acids may be described as the amino acid sequence that is initially translated... sequence of four or more amino acids or an unbranched sequence of ten or more nucleotides....
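    The length thresholds quoted in the excerpt above (four or more amino acids, or ten or more nucleotides in an unbranched sequence) can be written as a simple check. The Python sketch below models only those two numbers. The function name meets_length_threshold and the kind labels are hypothetical, and the conditions in the elided parts of the regulation are deliberately not modeled, so this is not a statement of the full legal test.

```python
# Thresholds as quoted in the excerpt; everything else is an illustrative assumption.
MIN_AMINO_ACIDS = 4
MIN_NUCLEOTIDES = 10


def meets_length_threshold(kind: str, length: int) -> bool:
    """Check only the length thresholds quoted in the excerpt.

    `kind` is "amino_acid" or "nucleotide" (hypothetical labels).
    Branching and other conditions in the regulation are not modeled.
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    if kind == "amino_acid":
        return length >= MIN_AMINO_ACIDS
    if kind == "nucleotide":
        return length >= MIN_NUCLEOTIDES
    raise ValueError("kind must be 'amino_acid' or 'nucleotide'")


print(meets_length_threshold("amino_acid", 4))   # True
print(meets_length_threshold("nucleotide", 9))   # False
```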

  11. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... acids are not intended to be embraced by this definition. Any amino acid sequence that contains post-translationally modified amino acids may be described as the amino acid sequence that is initially translated... sequence of four or more amino acids or an unbranched sequence of ten or more nucleotides....

  12. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... acids are not intended to be embraced by this definition. Any amino acid sequence that contains post-translationally modified amino acids may be described as the amino acid sequence that is initially translated... sequence of four or more amino acids or an unbranched sequence of ten or more nucleotides....

  13. Cloning and sequence analysis of the coding sequence of β-actin cDNA from the Chinese alligator and suitable internal reference primers from the β-actin gene.

    PubMed

    Zhu, H N; Zhang, S Z; Zhou, Y K; Wang, C L; Wu, X B

    2015-01-01

    β-Actin is an essential component of the cytoskeleton and is stably expressed in various tissues of animals, thus, it is commonly used as an internal reference for gene expression studies. In this study, a 1731-bp fragment of β-actin cDNA from Alligator sinensis was obtained using the homology cloning technique. Sequence analysis showed that this fragment contained the complete coding sequence of the β-actin gene (1128 bp), encoding 375 amino acids. The amino acid sequence of β-actin is highly conserved and its nucleotide sequence is slightly variable. Multiple alignment analyses showed that the nucleotide sequence of the β-actin gene from A. sinensis is very similar to sequences from birds, with 94-95% identity. Ten pairs of primers with different product sizes and different annealing temperatures were screened by PCR amplification, agarose gel electrophoresis, and DNA sequencing, and could be used as internal reference primers in gene expression studies. This study expands our knowledge of β-actin gene phylogenetic evolution and provides a basis for quantitative gene expression studies in A. sinensis. PMID:26505364

  14. Nucleotide sequence analysis with polynucleotide kinase and nucleotide 'mapping' methods. 5′-Terminal sequence of deoxyribonucleic acid from bacteriophages λ and 424

    PubMed Central

    Murray, Kenneth

    1973-01-01

    The polynucleotide kinase reaction was used in analyses of complex mixtures of oligodeoxynucleotides which were fractionated by various two-dimensional nucleotide 'mapping' procedures. Parallel ionophoretic analyses on DEAE-cellulose paper, pH 2, and AE-cellulose paper, pH 3.5, of venom phosphodiesterase partial digests of 5′-terminally labelled oligonucleotides enabled the sequence of the nucleotides to be deduced uniquely. A 'diagonal ionophoresis' method has been used with mixtures of nucleotides. Application of these methods to 5′-terminally labelled DNA from bacteriophage λ gave the terminal sequences pA-G-G-T-C-G and pG-G-G-C-G. Identical 5′-terminal sequences were found with DNA from bacteriophage 424. Images Plates 1-5. PMID:4352720

  15. The nucleotide sequence of the mouse immunoglobulin epsilon gene: comparison with the human epsilon gene sequence.

    PubMed Central

    Ishida, N; Ueda, S; Hayashida, H; Miyata, T; Honjo, T

    1982-01-01

    We have determined the nucleotide sequence of the immunoglobulin epsilon gene cloned from newborn mouse DNA. The epsilon gene sequence allows prediction of the amino acid sequence of the constant region of the epsilon chain and comparison of it with sequences of the human epsilon and other mouse immunoglobulin genes. The epsilon gene was shown to be under the weakest selection pressure at the protein level among the immunoglobulin genes although the divergence at the synonymous position is similar. Our results suggest that the epsilon gene may be dispensable, which is in accord with the fact that IgE has only obscure roles in the immune defense system but has an undesirable role as a mediator of hypersensitivity. The sequence data suggest that the human and murine epsilon genes were derived from different ancestors duplicated a long time ago. The amino acid sequence of the epsilon chain is more homologous to those of the gamma chains than the other mouse heavy chains. Two membrane exons, separated by an 80-base intron, were identified 1.7 kb 3' to the CH4 domain of the epsilon gene and shown to conserve a hydrophobic portion similar to those of other heavy chain genes. RNA blot hybridization showed that the epsilon membrane exons are transcribed into two species of mRNA in an IgE hybridoma. PMID:6329728

  16. Completion of the nucleotide sequence of sunn-hemp mosaic virus: a tobamovirus pathogenic to legumes.

    PubMed

    Silver, S; Quan, S; Deom, C M

    1996-01-01

    Sunn-hemp mosaic virus (SHMV) is a member of the tobamovirus group of plant viruses. The nucleotide sequence of the 5'-untranslated region, the 129 kD protein gene, and a portion of the 186 kD protein gene of SHMV was determined. The 4,683 nucleotides (nts) reported here complete the sequence of the SHMV genome and complement previous work (Meshi, Ohno, and Okada, Nucleic Acids Res. 10, 6111-6117 [1982]; Mol. Gen. Genet. 184, 20-25 [1981]) to provide the first complete nucleotide sequence for a tobamovirus that is pathogenic to leguminous plants. PMID:8938983

  17. Complete nucleotide sequence of the genomic RNA of tobacco mosaic virus strain Cg.

    PubMed

    Yamanaka, T; Komatani, H; Meshi, T; Naito, S; Ishikawa, M; Ohno, T

    1998-01-01

    Tobacco mosaic virus (TMV)-Cg is a crucifer-infecting tobamovirus that was isolated from field-grown garlic. We determined the complete nucleotide sequence of the genomic RNA of TMV-Cg. The genomic RNA of TMV-Cg consists of 6303 nucleotides and encodes four large open reading frames, organized basically in the same way as that of other tobamoviruses. The nucleotide and deduced amino acid sequences are very similar to those of the other crucifer-infecting tobamoviruses that have been sequenced so far. PMID:9608662

  18. Epitopes of human testis-specific lactate dehydrogenase deduced from a cDNA sequence

    SciTech Connect

    Millan, J.L.; Driscoll, C.E.; LeVan, K.M.; Goldberg, E.

    1987-08-01

    The sequence and structure of human testis-specific L-lactate dehydrogenase (LDHC₄, LDHX; (L)-lactate:NAD⁺ oxidoreductase, EC 1.1.1.27) has been derived from analysis of a complementary DNA (cDNA) clone comprising the complete protein coding region of the enzyme. From the deduced amino acid sequence, human LDHC₄ is as different from rodent LDHC₄ (73% homology) as it is from human LDHA₄ (76% homology) and porcine LDHB₄ (68% homology). Subunit homologies are consistent with the conclusion that the LDHC gene arose by at least two independent duplication events. Furthermore, the lower degree of homology between mouse and human LDHC₄ and the appearance of this isozyme late in evolution suggests a higher rate of mutation in the mammalian LDHC genes than in the LDHA and -B genes. Comparison of exposed amino acid residues of discrete antigenic determinants of mouse and human LDHC₄ reveals significant differences. Knowledge of the human LDHC₄ sequence will help design human-specific peptides useful in the development of a contraceptive vaccine.

  19. The complete nucleotide sequence and genome organization of Red clover vein mosaic virus (genus Carlavirus)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Red clover vein mosaic virus (RCVMV) is a serious pathogen of legume crops including pea, chickpea and lentil. The complete nucleotide sequence was generated from an isolate obtained from chickpea in Washington State. The complete genome of RCVMV consists of 8605 nucleotides excluding the poly(A) ...

  20. Preparation of cDNA libraries for high-throughput RNA sequencing analysis of RNA 5′ ends

    PubMed Central

    Vvedenskaya, Irina O.; Goldman, Seth R.; Nickels, Bryce E.

    2015-01-01

    Summary We provide a detailed protocol for preparing cDNA libraries suitable for high throughput sequencing that are derived specifically from the 5′ ends of RNA (5′ specific RNA-seq). The protocol describes how cDNA libraries for 5′ specific RNA-seq can be tailored to analyze specific classes of RNAs based upon the phosphorylation status of the 5′ end. Thus, the analysis of cDNA libraries generated by these methods provides information regarding both the sequence and phosphorylation status of the 5′ ends of RNAs. 5′ specific RNA-seq can be used to analyze transcription initiation and post-transcriptional processing of RNAs with single base pair resolution on a genome-wide level. PMID:25665566

  1. Cloning of human transketolase cDNAs and comparison of the nucleotide sequence of the coding region in Wernicke-Korsakoff and non-Wernicke-Korsakoff individuals.

    PubMed

    McCool, B A; Plonk, S G; Martin, P R; Singleton, C K

    1993-01-15

    Variants of the enzyme transketolase which possess reduced affinity for its cofactor thiamine pyrophosphate (high apparent Km) have been described in chronic alcoholic patients with Wernicke-Korsakoff syndrome. Since the syndrome has been shown to be directly related to thiamine deficiency, it has been hypothesized that such transketolase variants may represent a genetic predisposition to the development of this syndrome. To test this hypothesis, human transketolase cDNA clones were isolated, and their nucleotide and predicted amino acid sequence were determined. Transketolase was found to be a single copy gene which produces a single mRNA of approximately 2100 nucleotides. Additionally, the nucleotide sequence of the transketolase coding region in fibroblasts derived from two Wernicke-Korsakoff (WK) patients was compared to that of two nonalcoholic controls. Although nucleotide and predicted amino acid differences were detected between fibroblast cultures and the original cDNAs and among the cultures themselves, no specific nucleotide variations, which would encode a variant amino acid sequence, were associated exclusively with the coding region from WK patients. Thus, allelic variants of the transketolase gene cannot account for the biochemically distinct forms of the enzyme found in these patients nor be considered as a mechanism for genetic predisposition to the development of Wernicke-Korsakoff syndrome. Instead, the underlying mechanism must be extragenic and may be a result of differences in post-translational processing/modification of the transketolase polypeptide. PMID:8419340

  2. cDNA sequence and protein bioinformatics analyses of MSTN in African catfish (Clarias gariepinus).

    PubMed

    Kanjanaworakul, Poonmanee; Sawatdichaikul, Orathai; Poompuang, Supawadee

    2016-04-01

    Myostatin, also known as growth differentiation factor 8, has been identified as a potent negative regulator of skeletal muscle growth. The purpose of this study was to characterize and predict the function of the myostatin gene of the African catfish (Cg-MSTN). Expression of Cg-MSTN was determined at three growth stages to establish the relationship between the levels of MSTN transcript and skeletal muscle growth. The partial cDNA sequence of Cg-MSTN was cloned by using published information from its congener walking catfish (Cm-MSTN). The Cg-MSTN was 1194 bp in length encoding a protein of 397 amino acids. The deduced MSTN sequence exhibited key functional sites similar to those of other members of the TGF-β superfamily, especially the proteolytic processing site (RXXR motif) and nine conserved cysteines at the C-terminus. Expression of MSTN appeared to be correlated with muscle development and growth of African catfish. Protein bioinformatics revealed that the primary sequence of Cg-MSTN shared 98 % sequence identity with that of walking catfish Cm-MSTN with only two different residues, [Formula: see text]. and [Formula: see text]. The proposed model of Cg-MSTN revealed the key point mutation [Formula: see text] causing a 7.35 Å shorter distance between the N- and C-lobes and an approximately 11° narrower angle than those of Cm-MSTN. The substitution of a proline residue near the proteolytic processing site, which altered the structure of myostatin, may play a critical role in reducing proteolytic activity of this protein in African catfish. PMID:26912268

  3. Development of polymorphic genic-SSR markers by cDNA library sequencing in boxwood, Buxus spp. (Buxaceae)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genic microsatellites or simple sequence repeat (genic-SSR) markers were developed in boxwood (Buxus taxa) for genetic diversity analysis, identification of taxa, and to facilitate breeding. cDNA libraries were developed from mRNA extracted from leaves of Buxus sempervirens ‘Vardar Valley’ and seque...

  4. Sequence analysis and mapping of a novel human mitochondrial ATP synthase subunit 9 cDNA (ATP5G3)

    SciTech Connect

    Yan, W.L.; Gusella, J.F. |; Haines, J.L. |

    1994-11-15

    The authors describe the cloning, sequence analysis, and chromosomal mapping of a novel mitochondrial ATP synthase subunit 9 cDNA, P3. Subunit 9 transports protons across the inner mitochondrial membrane to the F₁-ATPase protruding on the matrix side, resulting in the generation of ATP. Sequence analysis of the P3 cDNA reveals only 80% identity with the human subunit 9 genes P1 and P2 in the DNA sequence encoding the mature protein identical to P1 and P2. The predicted sequence of the P3 leader peptide differs from the P1 and P2 leaders, but retains the "RFS" motif critical for mitochondrial import and maturation. The P3 gene (ATP5G3) maps to chromosome 2.

  5. Complete nucleotide sequence of a new isolate of passion fruit woodiness virus from Western Australia.

    PubMed

    Fukumoto, Tomohiro; Nakamura, Masayuki; Wylie, Stephen J; Chiaki, Yuya; Iwai, Hisashi

    2013-08-01

    We determined the complete genome sequence of the passion fruit woodiness virus Gld-1 isolate (PWV-Gld-1) from Australia and compared it with that of PWV-MU-2, another Australian isolate of PWV. The genomes shared high sequence identity in both the complete nucleotide sequence and the ORF amino acid sequence. All of the cleavage sites of each protein were identical to those of MU-2, and the sequence identity for the individual proteins ranged from 97.2 % to 100.0 %. However, the 5' untranslated region (5'UTR) of the Gld-1 isolate shared only 46.8 % sequence identity with that of PWV-MU-2 and was 177 nucleotides shorter. Re-sequencing of the 5'UTR of MU-2 revealed that the 5' end of the original sequence includes an artifact generated by deep sequencing. PMID:23508550

  6. Evaluation of intra- and interspecific divergence of satellite DNA sequences by nucleotide frequency calculation and pairwise sequence comparison

    PubMed Central

    2003-01-01

    Satellite DNA sequences are known to be highly variable and to have been subjected to concerted evolution that homogenizes member sequences within species. We have analyzed the mode of evolution of satellite DNA sequences in four fishes from the genus Diplodus by calculating the nucleotide frequency of the sequence array and the phylogenetic distances between member sequences. Calculation of nucleotide frequency and pairwise sequence comparison enabled us to characterize the divergence among member sequences in this satellite DNA family. The results suggest that the evolutionary rate of satellite DNA in D. bellottii is about two-fold greater than the average of the other three fishes, and that the sequence homogenization event occurred in D. puntazzo more recently than in the others. The procedures described here are effective to characterize mode of evolution of satellite DNA. PMID:12734555

  7. Evaluation of intra- and interspecific divergence of satellite DNA sequences by nucleotide frequency calculation and pairwise sequence comparison.

    PubMed

    Kato, Mikio

    2003-01-01

    Satellite DNA sequences are known to be highly variable and to have been subjected to concerted evolution that homogenizes member sequences within species. We have analyzed the mode of evolution of satellite DNA sequences in four fishes from the genus Diplodus by calculating the nucleotide frequency of the sequence array and the phylogenetic distances between member sequences. Calculation of nucleotide frequency and pairwise sequence comparison enabled us to characterize the divergence among member sequences in this satellite DNA family. The results suggest that the evolutionary rate of satellite DNA in D. bellottii is about two-fold greater than the average of the other three fishes, and that the sequence homogenization event occurred in D. puntazzo more recently than in the others. The procedures described here are effective to characterize mode of evolution of satellite DNA. PMID:12734555
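
    The two calculations named in this record, per-position nucleotide frequency and pairwise sequence comparison, can be sketched as follows. This is a minimal illustration, not the author's procedure: it assumes the member sequences are already aligned to equal length, and it uses the simple proportion of differing sites (p-distance) in place of the phylogenetic distances used in the study. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

BASES = "ACGT"


def column_frequencies(aligned):
    """Per-position base frequencies for equal-length aligned sequences.

    Gaps and ambiguous symbols are ignored when computing frequencies.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    freqs = []
    for i in range(length):
        counts = Counter(seq[i] for seq in aligned)
        total = sum(counts[b] for b in BASES)
        freqs.append({b: (counts[b] / total if total else 0.0) for b in BASES})
    return freqs


def p_distance(a, b):
    """Proportion of differing sites, over sites where both have A/C/G/T."""
    compared = [(x, y) for x, y in zip(a, b) if x in BASES and y in BASES]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in compared) / len(compared)


def mean_pairwise_distance(aligned):
    """Average p-distance over all pairs of member sequences."""
    pairs = list(combinations(aligned, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Toy alignment (made-up data, not from the study).
members = ["ACGT", "ACGA", "ACGT"]
print(column_frequencies(members)[3])
print(mean_pairwise_distance(members))
```

    A low mean pairwise distance within a species alongside a higher distance between species is the pattern expected under concerted evolution, which is why the two summaries are read together.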

  8. Cloning and sequence analysis of an Ophiophagus hannah cDNA encoding a precursor of two natriuretic peptide domains.

    PubMed

    Lei, Weiwei; Zhang, Yong; Yu, Guoyu; Jiang, Ping; He, Yingying; Lee, Wenhui; Zhang, Yun

    2011-04-01

    The king cobra (Ophiophagus hannah) is the largest venomous snake. Although its components are mainly neurotoxins, the venom contains several proteins affecting the blood system. Natriuretic peptide (NP), one of the important components of snake venoms, could cause local vasodilatation and increased capillary permeability, facilitating a rapid diffusion of other toxins into the prey tissues. Because of their low abundance, snake venom NPs are hard to purify, so cDNA cloning of the NPs has become a useful approach. In this study, a 957 bp natriuretic peptide-encoding cDNA clone was isolated from an O. hannah venom gland cDNA library. The open reading frame of the cDNA encodes a 210-amino acid precursor protein named Oh-NP. Oh-NP has a typical signal peptide sequence of 26 amino acid residues. Surprisingly, Oh-NP has two typical NP domains which consist of the typical sequence of 17-residue loop of CFGXXDRIGC, so it is an unusual NP precursor. These two NP domains share high amino acid sequence identity. In addition, there are two homologous peptides of unknown function within the Oh-NP precursor. To our knowledge, Oh-NP is the first protein precursor containing two NP domains. It might belong to another subclass of snake venom NPs. PMID:21334357

  9. Isolation and sequence analysis of a cDNA encoding the c subunit of a vacuolar-type H(+)-ATPase from the CAM plant Kalanchoë daigremontiana.

    PubMed

    Bartholomew, D M; Rees, D J; Rambaut, A; Smith, J A

    1996-05-01

    We report the sequence of a cDNA clone encoding the c ("16 kDa") subunit of a vacuolar-type H(+)-ATPase (V-ATPase) from Kalanchoë daigremontiana, a plant in which the cell vacuole plays a pivotal role in crassulacean acid metabolism. The clone, pKVA211, was isolated from a K. daigremontiana leaf cDNA library constructed in lambda ZAP II using a homologous PCR-generated cDNA probe for the V-ATPase c subunit. The KVA211 cDNA was 839 nucleotides long and included a 20 bp poly(A)+ tail together with a complete 495 bp coding region for a polypeptide with a predicted molecular mass of 16659 Da. The deduced amino acid sequence was highly conserved across the wide range of eukaryotes (vertebrates, invertebrates, fungi, plants and protozoa) in which this gene has now been identified. Sequence comparison of several PCR products and genomic Southern analysis indicated that the V-ATPase c subunit in K. daigremontiana is encoded by a small multi-gene family. Steady-state levels of the KVA211 mRNA were much higher in leaves than in roots or flowers, and expression of this transcript in leaves was shown to be strongly light-dependent. PMID:8756609

  10. Nucleotide sequence of Neurospora crassa cytoplasmic initiator tRNA.

    PubMed Central

    Gillum, A M; Hecker, L I; Silberklang, M; Schwartzbach, S D; RajBhandary, U L; Barnett, W E

    1977-01-01

    Initiator methionine tRNA from the cytoplasm of Neurospora crassa has been purified and sequenced. The sequence is: pAGCUGCAUm1GGCGCAGCGGAAGCGCM22GCY*GGGCUCAUt6AACCCGGAGm7GU (or D) - CACUCGAUCGm1AAACGAG*UUGCAGCUACCAOH. Similar to initiator tRNAs from the cytoplasm of other eukaryotes, this tRNA also contains the sequence -AUCG- instead of the usual -TphiCG (or A)- found in loop IV of other tRNAs. The sequence of the N. crassa cytoplasmic initiator tRNA is quite different from that of the corresponding mitochondrial initiator tRNA. Comparison of the sequence of N. crassa cytoplasmic initiator tRNA to those of yeast, wheat germ and vertebrate cytoplasmic initiator tRNA indicates that the sequences of the two fungal tRNAs are no more similar to each other than they are to those of other initiator tRNAs. PMID:146192

  11. Identification and isolation of full-length cDNA sequences by sequencing and analysis of expressed sequence tags from guarana (Paullinia cupana).

    PubMed

    Figueirêdo, L C; Faria-Campos, A C; Astolfi-Filho, S; Azevedo, J L

    2011-01-01

    The current intense production of biological data, generated by sequencing techniques, has created an ever-growing volume of unanalyzed data. We reevaluated data produced by the guarana (Paullinia cupana) transcriptome sequencing project to identify cDNA clones with complete coding sequences (full-length clones) and complete sequences of genes of biotechnological interest, contributing to the knowledge of biological characteristics of this organism. We analyzed 15,490 ESTs of guarana in search of clones with complete coding regions. A total of 12,402 sequences were analyzed using BLAST, and 4697 full-length clones were identified, responsible for the production of 2297 different proteins. Eighty-four clones were identified as full-length for N-methyltransferase and 18 were sequenced in both directions to obtain the complete genome sequence, and confirm the search made in silico for full-length clones. Phylogenetic analyses were made with the complete genome sequences of three clones, which showed only 0.017% dissimilarity; these are phylogenetically close to the caffeine synthase of Theobroma cacao. The search for full-length clones allowed the identification of numerous clones that had the complete coding region, demonstrating this to be an efficient and useful tool in the process of biological data mining. The sequencing of the complete coding region of identified full-length clones corroborated the data from the in silico search, strengthening its efficiency and utility. PMID:21732283

  12. cDNA sequences and mRNA levels of two hexamerin storage proteins PinSP1 and PinSP2 from the Indianmeal moth, Plodia interpunctella.

    PubMed

    Zhu, Yu Cheng; Muthukrishnan, Subbaratnam; Kramer, Karl J

    2002-05-01

    In insects, storage proteins or hexamerins accumulate apparently to serve as sources of amino acids during metamorphosis and reproduction. Two storage protein-like cDNAs obtained from a cDNA library prepared from fourth instar larvae of the Indianmeal moth (Plodia interpunctella) were cloned and sequenced. The first clone, PinSP1, contained 2431 nucleotides with a 2295 nucleotide open reading frame (ORF) encoding a protein with 765 amino acid residues. The second cDNA, PinSP2, consisted of 2336 nucleotides with a 2250-nucleotide ORF encoding a protein with 750 amino acid residues. PinSP1 and PinSP2 shared 59% nucleotide sequence identity and 44% deduced amino acid sequence identity. A 17-amino acid signal peptide and a molecular mass of 90.4 kDa were predicted for the PinSP1 protein, whereas a 15-amino acid signal peptide and a mass of 88 kDa were predicted for PinSP2. Both proteins contained conserved insect larval storage protein signature sequence patterns and were 60-70% identical to other lepidopteran larval storage proteins. Expression of mRNA for both larval storage proteins was determined using the quantitative reverse transcription polymerase chain reaction method. Only very low levels were present in the second instar, but both mRNAs dramatically increased during the third instar, peaked in the fourth instar, decreased dramatically late in the same instar and pupal stages, and were undetectable during the adult stage. Males and females exhibited similar mRNA expression levels for both storage proteins during the pupal and adult stages. The results support the hypothesis that P. interpunctella, a species that does not feed after the larval stage, accumulates these two storage proteins as reserves during larval development for subsequent use in the pupal and adult stages. PMID:11891129

  13. Nucleotide sequences of 5S ribosomal RNA from four oomycete and chytrid water molds.

    PubMed

    Walker, W F; Doolittle, W F

    1982-09-25

    The nucleotide sequences of the 5S rRNAs of the oomycete water molds Saprolegnia ferax and Pythium hydnosporum and of the chytrid water molds Blastocladiella simplex and Phlyctochytrium irregulare were determined by chemical and enzymatic partial degradation of 3' and 5' end-labelled molecules, followed by gel sequence analysis. The two oomycete sequences differed in 24 positions and the two chytrid sequences differed in 27 positions. These pairs differed in a mean of 44 positions. The chytrid sequences clearly most resemble the sequence from the zygomycete Phycomyces, while the oomycete sequences appear to be allied with those from protozoa and slime molds. PMID:6890670

  14. CLEANUP: a fast computer program for removing redundancies from nucleotide sequence databases.

    PubMed

    Grillo, G; Attimonelli, M; Liuni, S; Pesole, G

    1996-02-01

    A key concept in comparing sequence collections is the issue of redundancy. The production of sequence collections free from redundancy is undoubtedly very useful, both in performing statistical analyses and accelerating extensive database searching on nucleotide sequences. Indeed, publicly available databases contain multiple entries of identical or almost identical sequences. Performing statistical analysis on such biased data carries a very high risk of assigning high significance to non-significant patterns. In order to carry out unbiased statistical analysis as well as more efficient database searching it is thus necessary to analyse sequence data that have been purged of redundancy. Given that an unambiguous definition of redundancy is impracticable for biological sequence data, in the present program a quantitative description of redundancy will be used, based on the measure of sequence similarity. A sequence is considered redundant if it shows a degree of similarity and overlapping with a longer sequence in the database greater than a threshold fixed by the user. In this paper we present a new algorithm based on an "approximate string matching" procedure, which is able to determine the overall degree of similarity between each pair of sequences contained in a nucleotide sequence database and to generate automatically nucleotide sequence collections free from redundancies. PMID:8670613
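
    The redundancy criterion described in this record (a sequence is dropped when its similarity to a longer sequence exceeds a user-set threshold) can be sketched in a few lines. This is not the CLEANUP algorithm itself: Python's standard `difflib.SequenceMatcher` stands in for the approximate string matching procedure, the similarity measure (matched characters divided by the length of the shorter sequence) is an assumption made for illustration, and the function names are hypothetical. The all-pairs comparison is quadratic and is only meant for small toy inputs.

```python
from difflib import SequenceMatcher


def overlap_similarity(longer: str, shorter: str) -> float:
    """Fraction of the shorter sequence covered by matching blocks."""
    matcher = SequenceMatcher(None, longer, shorter, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(shorter)


def remove_redundant(sequences: dict, threshold: float) -> list:
    """Return the ids of sequences kept after purging redundant entries.

    Sequences are visited from longest to shortest; one is discarded when
    its similarity to any already-kept (longer or equal-length) sequence
    is greater than or equal to the threshold.
    """
    ordered = sorted(sequences, key=lambda sid: len(sequences[sid]), reverse=True)
    kept = []
    for sid in ordered:
        seq = sequences[sid]
        if all(overlap_similarity(sequences[k], seq) < threshold for k in kept):
            kept.append(sid)
    return kept


# Toy database (made-up sequences): "b" is fully contained in "a".
database = {
    "a": "ACGTACGTACGTACGT",
    "b": "ACGTACGTACGT",
    "c": "TTTTGGGGCCCCAAAA",
}
print(remove_redundant(database, threshold=0.9))
```

    Visiting sequences from longest to shortest matches the abstract's definition, in which redundancy is always judged against a longer entry, so the longest representative of each similar group is the one retained.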

  15. Generation of expressed sequence tags of random root cDNA clones of Brassica napus by single-run partial sequencing.

    PubMed Central

    Park, Y S; Kwak, J M; Kwon, O Y; Kim, Y S; Lee, D S; Cho, M J; Lee, H H; Nam, H G

    1993-01-01

    Two hundred thirty-seven expressed sequence tags (ESTs) of Brassica napus were generated by single-run partial sequencing of 197 random root cDNA clones. A computer search of these root ESTs revealed that 21 ESTs show significant similarity to the protein-coding sequences in the existing data bases, including five stress- or defense-related genes and four clones related to the genes from other kingdoms. Northern blot analysis of the 10 data base-matched cDNA clones revealed that many of the clones are expressed most abundantly in root but less abundantly in other organs. However, two clones were highly root specific. The results show that generation of the root ESTs by partial sequencing of random cDNA clones along with the expression analysis is an efficient approach to isolate genes that are functional in plant root on a large scale. We also discuss the results of the examination of cDNA libraries and sequencing methods suitable for this approach. PMID:8029332

  16. Complete sequence of HLA-B27 cDNA identified through the characterization of structural markers unique to the HLA-A, -B, and -C allelic series.

    PubMed Central

    Szöts, H; Riethmüller, G; Weiss, E; Meo, T

    1986-01-01

    Antigen HLA-B27 is a high-risk genetic factor with respect to a group of rheumatoid disorders, especially ankylosing spondylitis. A cDNA library was constructed from an autozygous B-cell line expressing HLA-B27, HLA-Cw1, and the previously cloned HLA-A2 antigen. Clones detected with an HLA probe were isolated and sorted into homology groups by differential hybridization and restriction maps. Nucleotide sequencing allowed the unambiguous assignment of cDNAs to HLA-A, -B, and -C loci. The HLA-B27 mRNA has the structural features and the codon variability typical of an HLA class I transcript but it specifies two uncommon amino acid replacements: a cysteine in position 67 and a serine in position 131. The latter substitution may have functional consequences, because it occurs in a conserved region and at a position invariably occupied by a species-specific arginine in humans and lysine in mice. The availability of the complete sequence of HLA-B27 and of the partial sequence of HLA-Cw1 allows the recognition of locus-specific sequence markers, particularly, but not exclusively, in the transmembrane and cytoplasmic domains. PMID:3485286

  17. Nucleotide sequences of 5S rRNAs from four jellyfishes.

    PubMed

    Hori, H; Ohama, T; Kumazaki, T; Osawa, S

    1982-11-25

    The nucleotide sequences of 5S rRNAs from four jellyfishes, Spirocodon saltatrix, Nemopsis dofleini, Aurelia aurita and Chrysaora quinquecirrha have been determined. The sequences are highly similar to each other. A fairly high similarity was also found between these jellyfishes and a sea anemone, Anthopleura japonica. PMID:6130512

  18. The nucleotide sequence of the tnpA gene completes the sequence of the Pseudomonas transposon Tn501.

    PubMed Central

    Brown, N L; Winnie, J N; Fritzinger, D; Pridmore, R D

    1985-01-01

    The nucleotide sequence of the gene (tnpA) which codes for the transposase of transposon Tn501 has been determined. It contains an open reading frame for a polypeptide of Mr = 111,500, which terminates within the inverted repeat sequence of the transposon. The reading frame would be transcribed in the same direction as the mercury-resistance genes and the tnpR gene. The amino acid sequence predicted from this reading frame shows 32% identity with that of the transposase of the related transposon Tn3. The C-terminal regions of these two polypeptides show slightly greater homology than the N-terminal regions when conservative amino acid substitutions are considered. With this sequence determination, the nucleotide sequence of Tn501 is fully defined. The main features of the sequence are briefly presented. PMID:2994007

  19. Diverse nucleotide compositions and sequence fluctuation in Rubisco protein genes

    NASA Astrophysics Data System (ADS)

    Holden, Todd; Dehipawala, S.; Cheung, E.; Bienaime, R.; Ye, J.; Tremberger, G., Jr.; Schneider, P.; Lieberman, D.; Cheung, T.

    2011-10-01

    The Rubisco protein-enzyme is arguably the most abundant protein on Earth. The biology dogma of transcription and translation necessitates the study of the Rubisco genes and Rubisco-like genes in various species. Stronger correlation of fractal dimension of the atomic number fluctuation along a DNA sequence with Shannon entropy has been observed in the studied Rubisco-like gene sequences, suggesting a more diverse evolutionary pressure and constraints in the Rubisco sequences. The strategy of using metal for structural stabilization appears to be an ancient mechanism, with data from the porphobilinogen deaminase gene in Capsaspora owczarzaki and Monosiga brevicollis. Using the chi-square distance probability, our analysis supports the conjecture that the more ancient Rubisco-like sequence in Microcystis aeruginosa would have experienced very different evolutionary pressure and bio-chemical constraint as compared to Bordetella bronchiseptica, the two microbes occupying either end of the correlation graph. Our exploratory study would indicate that high fractal dimension Rubisco sequence would support high carbon dioxide rate via the Michaelis-Menten coefficient; with implication for the control of the whooping cough pathogen Bordetella bronchiseptica, a microbe containing a high fractal dimension Rubisco-like sequence (2.07). Using the internal comparison of chi-square distance probability for 16S rRNA (~ E-22) versus radiation repair Rec-A gene (~ E-05) in high GC content Deinococcus radiodurans, our analysis supports the conjecture that high GC content microbes containing Rubisco-like sequence are likely to include an extra-terrestrial origin, relative to Deinococcus radiodurans. Similar photosynthesis process that could utilize host star radiation would not compete with radiation resistant process from the biology dogma perspective in environments such as Mars and exoplanets.

  20. Complete nucleotide sequence of Alfalfa mosaic virus isolated from alfalfa (Medicago sativa L.) in Argentina.

    PubMed

    Trucco, Verónica; de Breuil, Soledad; Bejerman, Nicolás; Lenardon, Sergio; Giolitti, Fabián

    2014-06-01

    The complete nucleotide sequence of an Alfalfa mosaic virus (AMV) isolate infecting alfalfa (Medicago sativa L.) in Argentina, AMV-Arg, was determined. The virus genome has the typical organization described for AMV, and comprises 3,643, 2,593, and 2,038 nucleotides for RNA1, 2 and 3, respectively. The whole genome sequence and each encoding region were compared with those of four other isolates that have been completely sequenced from China, Italy, Spain and USA. The nucleotide identity percentages ranged from 95.9 to 99.1 % for the three RNAs and from 93.7 to 99 % for the protein 1 (P1), protein 2 (P2), movement protein and coat protein (CP) encoding regions, whereas the amino acid identity percentages of these proteins ranged from 93.4 to 99.5 %, the lowest value corresponding to P2. CP sequences of AMV-Arg were compared with those of 25 other available isolates, and the phylogenetic analysis based on the CP gene was carried out. The highest percentage of nucleotide sequence identity of the CP gene was 98.3 % with a Chinese isolate and 98.6 % at the amino acid level with four isolates, two from Italy, one from Brazil and the remaining one from China. The phylogenetic analysis showed that AMV-Arg is closely related to subgroup I of AMV isolates. To our knowledge, this is the first report of a complete nucleotide sequence of AMV from South America and the first worldwide report of the complete nucleotide sequence of AMV isolated from alfalfa as natural host. PMID:24510307

  1. Characterization and cDNA sequence of Bothriechis schlegelii l-amino acid oxidase with antibacterial activity.

    PubMed

    Vargas Muñoz, Leidy Johana; Estrada-Gomez, Sebastian; Núñez, Vitelbina; Sanz, Libia; Calvete, Juan J

    2014-08-01

    Snake venoms are complex mixtures of proteins including l-amino acid oxidase (lAAO). An lAAO (named BslAAO), with a mass of 56 kDa and a theoretical pI of 5.79, was purified from Bothriechis schlegelii venom through size-exclusion, ion exchange and affinity chromatography. The entire protein sequence of 498 amino acids was determined from cDNA using reverse-transcribed mRNA isolated from venom gland. The enzyme showed dose-dependent inhibition of bacterial growth. BslAAO showed an inhibitory effect against S. aureus with a MIC of 4μg/mL and an MBC of 8μg/mL. Against Acinetobacter baumannii, it showed a MIC of 2μg/mL and an MBC of 4μg/mL. No effect was observed against Escherichia coli. This antibacterial activity was inhibited by catalase, indicating that antimicrobial activity was due to H2O2 production. BslAAO did not show any cytotoxic activity toward mouse myoblast cell line C2C12 or peripheral blood mononuclear cells. The enzyme oxidized l-Leu, with a Km of 16.37μM and a Vmax of 0.39μM/min. Snake venom lAAOs are potential frameworks for different therapeutic molecules, since these enzymes exhibit low MICs and MBCs and appear to be harmless to human cells, microorganisms being generally several-fold more sensitive to reactive oxygen species than human tissues. PMID:24875315
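
    The Km and Vmax reported for l-Leu can be plugged into the standard Michaelis-Menten rate law, v = Vmax·[S] / (Km + [S]), to see what rate they imply at a given substrate concentration. The sketch below does only that arithmetic; the function and constant names are illustrative, and it assumes the enzyme follows simple Michaelis-Menten kinetics over the range of interest, which the abstract does not state explicitly.

```python
KM_UM = 16.37            # Km for l-Leu, in micromolar (from the abstract)
VMAX_UM_PER_MIN = 0.39   # Vmax, in micromolar per minute (from the abstract)


def mm_rate(substrate_um: float,
            km_um: float = KM_UM,
            vmax: float = VMAX_UM_PER_MIN) -> float:
    """Michaelis-Menten rate (uM/min) at a given substrate concentration (uM)."""
    if substrate_um < 0:
        raise ValueError("substrate concentration cannot be negative")
    return vmax * substrate_um / (km_um + substrate_um)


if __name__ == "__main__":
    # At [S] = Km the rate is half of Vmax by definition.
    for s in (1.0, KM_UM, 100.0):
        print(f"[S] = {s:7.2f} uM -> v = {mm_rate(s):.3f} uM/min")
```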

  2. Nucleotide sequence of a human tRNA gene heterocluster

    SciTech Connect

    Chang, Y.N.; Pirtle, I.L.; Pirtle, R.M.

    1986-05-01

    Leucine tRNA from bovine liver was used as a hybridization probe to screen a human gene library harbored in Charon-4A of bacteriophage lambda. The human DNA inserts from plaque-pure clones were characterized by restriction endonuclease mapping and Southern hybridization techniques, using both (3'-³²P)-labeled bovine liver leucine tRNA and total tRNA as hybridization probes. An 8-kb Hind III fragment of one of these γ-clones was subcloned into the Hind III site of pBR322. Subsequent fine restriction mapping and DNA sequence analysis of this plasmid DNA indicated the presence of four tRNA genes within the 8-kb DNA fragment. A leucine tRNA gene with an anticodon of AAG and a proline tRNA gene with an anticodon of AGG are in a 1.6-kb subfragment. A threonine tRNA gene with an anticodon of UGU and an as yet unidentified tRNA gene are located in a 1.1-kb subfragment. These two different subfragments are separated by 2.8 kb. The coding regions of the three sequenced genes contain characteristic internal split promoter sequences and do not have intervening sequences. The 3'-flanking regions of these three genes have typical RNA polymerase III termination sites of at least four consecutive T residues.

  3. Methods for making nucleotide probes for sequencing and synthesis

    DOEpatents

    Church, George M; Zhang, Kun; Chou, Joseph

    2014-07-08

    Compositions and methods for making a plurality of probes for analyzing a plurality of nucleic acid samples are provided. Compositions and methods for analyzing a plurality of nucleic acid samples to obtain sequence information in each nucleic acid sample are also provided.

  4. Cloning and nucleotide sequence of the Lactobacillus casei lactate dehydrogenase gene.

    PubMed Central

    Kim, S F; Baek, S J; Pack, M Y

    1991-01-01

    An allosteric L-(+)-lactate dehydrogenase gene of Lactobacillus casei ATCC 393 was cloned in Escherichia coli, and the nucleotide sequence of the gene was determined. The gene was composed of an open reading frame of 981 bp, starting with a GTG codon and ending with a TAA codon. The sequences for the promoter and ribosome binding site were identified, and a sequence for a structure resembling a rho-independent transcription terminator was also found. Images PMID:1768113

  5. Nucleotide Sequence Diversity and Linkage Disequilibrium of Four Nuclear Loci in Foxtail Millet (Setaria italica)

    PubMed Central

    He, Shui-lian; Yang, Yang; Morrell, Peter L.; Yi, Ting-shuang

    2015-01-01

    Foxtail millet (Setaria italica (L.) Beauv) is one of the earliest domesticated grains, which has been cultivated in northern China by 8,700 years before present (YBP) and across Eurasia by 4,000 YBP. Owing to a small genome and diploid nature, foxtail millet is a tractable model crop for studying functional genomics of millets and bioenergy grasses. In this study, we examined nucleotide sequence diversity, geographic structure, and levels of linkage disequilibrium at four nuclear loci (ADH1, G3PDH, IGS1 and TPI1) in representative samples of 311 landrace accessions across its cultivated range. Higher levels of nucleotide sequence and haplotype diversity were observed in samples from China relative to other sampled regions. Genetic assignment analysis classified the accessions into seven clusters based on nucleotide sequence polymorphisms. Intralocus LD decayed rapidly to half the initial value within ~1.2 kb or less. PMID:26325578
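
    Linkage disequilibrium between two biallelic sites is commonly summarized as r², the squared correlation between alleles at the two sites. The abstract does not say which LD statistic or software was used, so the following is only a sketch of the textbook r² computed from phased haplotypes; the function name and the input layout (one pair of alleles per haplotype) are assumptions.

```python
def r_squared(haplotypes):
    """r^2 between two biallelic sites from phased haplotypes.

    `haplotypes` is a sequence of (allele_at_site_1, allele_at_site_2) pairs.
    The first haplotype's alleles are used as the reference alleles.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    ref_a, ref_b = haplotypes[0]
    p_a = sum(1 for a, _ in haplotypes if a == ref_a) / n
    p_b = sum(1 for _, b in haplotypes if b == ref_b) / n
    p_ab = sum(1 for a, b in haplotypes if a == ref_a and b == ref_b) / n
    d = p_ab - p_a * p_b                      # coefficient of disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return d * d / denom
```

    Plotting r² for many site pairs against the distance between them is one way to estimate the distance at which LD falls to half its initial value.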

  6. Complete cDNA sequence of human complement C1s and close physical linkage of the homologous genes C1s and C1r

    SciTech Connect

    Tosi, M.; Duponchel, C.; Meo, T.; Julier, C.

    1987-12-29

    Overlapping molecular clones encoding the complement subcomponent C1s were isolated from a human liver cDNA library. The nucleotide sequence reconstructed from these clones spans about 85% of the length of the liver C1s messenger RNAs, which occur in three distinct size classes around 3 kilobases in length. Comparisons with the sequence of C1r, the other enzymatic subcomponent of C1, reveal 40% amino acid identity and conservation of all the cysteine residues. Besides the serine protease domain, the following sequence motifs, previously described in C1r, were also found in C1s: (a) two repeats of the type found in the Ba fragment of complement factor B and in several other complement but also noncomplement proteins, (b) a cysteine-rich segment homologous to the repeats of epidermal growth factor precursor, and (c) a duplicated segment found only in C1r and C1s. Differences in each of these structural motifs provide significant clues for the interpretation of the functional divergence of these interacting serine protease zymogens. Hybridizations of C1r and C1s probes to restriction endonuclease fragments of genomic DNA demonstrate close physical linkage of the corresponding genes. The implications of this finding are discussed with respect to the evolution of C1r and C1s after their origin by tandem gene duplication and to the previously observed combined hereditary deficiencies of C1r and C1s.

  7. Nucleotide sequence heterogeneity of alpha satellite repetitive DNA: a survey of alphoid sequences from different human chromosomes.

    PubMed Central

    Waye, J S; Willard, H F

    1987-01-01

    The human alpha satellite DNA family is composed of diverse, tandemly reiterated monomer units of approximately 171 basepairs localized to the centromeric region of each chromosome. These sequences are organized in a highly chromosome-specific manner with many, if not all human chromosomes being characterized by individually distinct alphoid subsets. Here, we compare the nucleotide sequences of 153 monomer units, representing alphoid components of at least 12 different human chromosomes. Based on the analysis of sequence variation at each position within the 171 basepair monomer, we have derived a consensus sequence for the monomer unit of human alpha satellite DNA which we suggest may reflect the monomer sequence from which different chromosomal subsets have evolved. Sequence heterogeneity is evident at each position within the consensus monomer unit and there are no positions of strict nucleotide sequence conservation, although some regions are more variable than others. A substantial proportion of the overall sequence variation may be accounted for by nucleotide changes which are characteristic of monomer components of individual chromosomal subsets or groups of subsets which have a common evolutionary history. PMID:3658703
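
    Deriving a consensus from per-position variation can be illustrated with a simple majority rule over aligned monomers. The abstract does not describe the exact procedure, so this is a sketch under stated assumptions: the monomers are already aligned to equal length, the most frequent base wins at each position, and ties are broken alphabetically. The function name `consensus` is hypothetical.

```python
from collections import Counter


def consensus(monomers):
    """Majority-rule consensus of equal-length, pre-aligned sequences."""
    if len({len(m) for m in monomers}) != 1:
        raise ValueError("need one or more sequences of identical length")
    result = []
    for column in zip(*monomers):
        counts = Counter(base.upper() for base in column)
        # Most frequent base first; alphabetical order breaks ties (assumed rule).
        best = min(counts, key=lambda base: (-counts[base], base))
        result.append(best)
    return "".join(result)
```

    For example, `consensus(["ACGT", "ACGA", "TCGA"])` returns `"ACGA"`. The same column counts also show how variable each position is, which is the kind of per-position heterogeneity the abstract describes.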

  8. Genomic structure and complete nucleotide sequence of the Batten disease gene, CLN3

    SciTech Connect

    Mitchison, H.M.; Munroe, P.B.; O`Rawe, A.M.

    1997-03-01

    We recently cloned a cDNA for CLN3, the gene for juvenile-onset neuronal ceroid lipofuscinosis or Batten disease. To resolve the genomic organization we used a cosmid clone containing CLN3 to sequence the entire gene in addition to 1.1 kb 5' of the start of the published CLN3 cDNA and 0.3 kb 3' to the polyadenylation site. CLN3 is organized into at least 15 exons spanning 15 kb and ranging from 47 to 356 bp. The 14 introns vary from 80 to 4227 bp, and all exon/intron junction sequences conform to the GT-AG rule. Numerous repetitive Alu elements are present within the introns and 5'- and 3'-untranslated regions. The 5' region of the CLN3 gene contains several potential transcription regulatory elements but no consensus TATA-1 box was identified. CLN3 is homologous to 27 deposited human ESTs, and sequence comparisons suggest alternative splicing of the gene and the existence of transcribed sequences upstream to the start of the published CLN3 cDNA. 19 refs., 2 figs., 1 tab.

  9. Human parainfluenza type 3 virus hemagglutinin-neuraminidase glycoprotein: nucleotide sequence of mRNA and limited amino acid sequence of the purified protein.

    PubMed Central

    Elango, N; Coligan, J E; Jambou, R C; Venkatesan, S

    1986-01-01

    The nucleotide sequence of mRNA for the hemagglutinin-neuraminidase (HN) protein of human parainfluenza type 3 virus obtained from the corresponding cDNA clone had a single long open reading frame encoding a putative protein of 64,254 daltons consisting of 572 amino acids. The deduced protein sequence was confirmed by limited N-terminal amino acid microsequencing of CNBr cleavage fragments of native HN that was purified by immunoprecipitation. The HN protein is moderately hydrophobic and has four potential sites (Asn-X-Ser/Thr) of N-glycosylation in the C-terminal half of the molecule. It is devoid of both the N-terminal signal sequence and the C-terminal membrane anchorage domain characteristic of the hemagglutinin of influenza virus and the fusion (F0) protein of the paramyxoviruses. Instead, it has a single prominent hydrophobic region capable of membrane insertion beginning at 32 residues from the N terminus. This N-terminal membrane insertion is similar to that of influenza virus neuraminidase and the recently reported structures of HN proteins of Sendai virus and simian virus 5. Images PMID:3003381
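
    Potential N-glycosylation sites of the form Asn-X-Ser/Thr can be located in a deduced protein sequence with a short pattern search. The sketch below is illustrative only: the function name is hypothetical, and it additionally excludes proline at the X position, a commonly applied refinement that the abstract itself does not mention.

```python
import re

# Asn, then any residue except Pro (assumed refinement), then Ser or Thr.
# The lookahead lets overlapping sites be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str):
    """Return 1-based positions of the Asn in each Asn-X-Ser/Thr motif."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]
```

    A match only marks a potential site; whether a given asparagine is actually glycosylated has to be established experimentally.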

  10. An Integrated System for DNA Sequencing by Synthesis Using Novel Nucleotide Analogues

    PubMed Central

    Guo, Jia; Yu, Lin; Turro, Nicholas J.; Ju, Jingyue

    2010-01-01

    Conspectus The Human Genome Project has concluded, but its successful completion has increased, rather than decreased, the need for high-throughput DNA sequencing technologies. The possibility of clinically screening a full genome for an individual's mutations offers tremendous benefits, both for pursuing personalized medicine as well as uncovering the genomic contributions to diseases. The Sanger sequencing method—although enormously productive for more than 30 years—requires an electrophoretic separation step that, unfortunately, remains a key technical obstacle for achieving economically acceptable full-genome results. Alternative sequencing approaches thus focus on innovations that can reduce costs. The DNA sequencing by synthesis (SBS) approach has shown great promise as a new sequencing platform, with particular progress reported recently. The general fluorescent SBS approach involves (i) incorporation of nucleotide analogs bearing fluorescent reporters, (ii) identification of the incorporated nucleotide by its fluorescent emissions, and (iii) cleavage of the fluorophore, along with the reinitiation of the polymerase reaction for continuing sequence determination. In this Account, we review the construction of a DNA-immobilized chip and the development of novel nucleotide reporters for the SBS sequencing platform. Click chemistry, with its high selectivity and coupling efficiency, was explored for surface immobilization of DNA. The first generation (G-1) modified nucleotides for SBS feature a small chemical moiety capping the 3′-OH and a fluorophore tethered to the base through a chemically cleavable linker; the design ensures that the nucleotide reporters are good substrates for the polymerase. The 3′-capping moiety and the fluorophore on the DNA extension products, generated by the incorporation of the G-1 modified nucleotides, are cleaved simultaneously to reinitiate the polymerase reaction. The sequence of a DNA template immobilized on a surface

  11. Complete nucleotide sequence of the human corticotropin-beta-lipotropin precursor gene.

    PubMed Central

    Takahashi, H; Hakamata, Y; Watanabe, Y; Kikuno, R; Miyata, T; Numa, S

    1983-01-01

    The nucleotide sequence of an 8658-base-pair human genomic DNA segment containing the entire corticotropin-beta-lipotropin precursor gene has been determined, and some sequence features of the gene and its flanking regions have been analysed. The gene is composed of 7665 base pairs including two introns of 3708 and 2886 base pairs. Comparison of the 5'-flanking sequences of the human, bovine and mouse corticotropin-beta-lipotropin precursor genes reveals the presence of a highly conserved region, which contains sequences of 14-15 base pairs homologous with sequences located upstream of the mRNA start site of other glucocorticoid-regulated genes. PMID:6314261

  12. Human glutamate pyruvate transaminase (GPT): Localization to 8q24.3, cDNA and genomic sequences, and polymorphic sites

    SciTech Connect

    Sohocki, M.M.; Sullivan, L.S.; Daiger, S.P.

    1997-03-01

    Two frequent protein variants of glutamate pyruvate transaminase (GPT) (E.C.2.6.1.2) have been used as genetic markers in humans for more than two decades, although chromosomal mapping of the GPT locus in the 1980s produced conflicting results. To resolve this conflict and develop useful DNA markers for this gene, we isolated and characterized cDNA and genomic clones of GPT. We have definitively mapped human GPT to the terminus of 8q using several methods. First, two cosmids shown to contain the GPT sequence were derived from a chromosome 8-specific library. Second, by fluorescence in situ hybridization, we mapped the cosmid containing the human GPT gene to chromosome band 8q24.3. Third, we mapped the rat gpt cDNA to the syntenic region of rat chromosome 7. Finally, PCR primers specific to human GPT amplify sequences contained within a "half-YAC" from the long arm of chromosome 8, that is, a YAC containing the 8q telomere. The human GPT genomic sequence spans 2.7 kb and consists of 11 exons, ranging in size from 79 to 243 bp. The exonic sequence encodes a protein of 495 amino acids that is nearly identical to the previously reported protein sequence of human GPT-1. The two polymorphic GPT isozymes are the result of a nucleotide substitution in codon 14. In addition, a cosmid containing the GPT sequence also contains a previously unmapped, polymorphic microsatellite sequence, D8S421. The cloned GPT gene and associated polymorphisms will be useful for linkage and physical mapping of disease loci that map to the terminus of 8q, including atypical vitelliform macular dystrophy (VMD1) and epidermolysis bullosa simplex, type Ogna (EBS1). In addition, this will be a useful system for characterizing the telomeric region of 8q. Finally, determination of the molecular basis of the GPT isozyme variants will permit PCR-based detection of this world-wide polymorphism. 22 refs., 3 figs.

  13. Assessing the utility of the Oxford Nanopore MinION for snake venom gland cDNA sequencing

    PubMed Central

    Hargreaves, Adam D.

    2015-01-01

    Portable DNA sequencers such as the Oxford Nanopore MinION device have the potential to be truly disruptive technologies, facilitating new approaches and analyses and, in some cases, taking sequencing out of the lab and into the field. However, the capabilities of these technologies are still being revealed. Here we show that single-molecule cDNA sequencing using the MinION accurately characterises venom toxin-encoding genes in the painted saw-scaled viper, Echis coloratus. We find the raw sequencing error rate to be around 12%, improved to 0–2% with hybrid error correction and 3% with de novo error correction. Our corrected data provides full coding sequences and 5′ and 3′ UTRs for 29 of 33 candidate venom toxins detected, far superior to Illumina data (13/40 complete) and Sanger-based ESTs (15/29). We suggest that, should the current pace of improvement continue, the MinION will become the default approach for cDNA sequencing in a variety of species. PMID:26623194
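
    A per-read error rate such as the 12% quoted above is often expressed as the number of mismatches, insertions and deletions relative to a trusted reference. The abstract does not say how the authors computed their figure, so the following is only a sketch of one simple definition: Levenshtein edit distance divided by reference length. Both function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (substitutions, insertions, deletions cost 1)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,              # deletion from a
                           cur[j - 1] + 1,           # insertion into a
                           prev[j - 1] + (x != y)))  # match or substitution
        prev = cur
    return prev[-1]


def error_rate(read: str, reference: str) -> float:
    """Edit distance between read and reference, per reference base."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(read, reference) / len(reference)
```

    Real long-read pipelines compute this from alignments rather than a full dynamic-programming table over whole transcripts, but the quantity being reported is of the same kind.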

  14. Assessing the utility of the Oxford Nanopore MinION for snake venom gland cDNA sequencing.

    PubMed

    Hargreaves, Adam D; Mulley, John F

    2015-01-01

    Portable DNA sequencers such as the Oxford Nanopore MinION device have the potential to be truly disruptive technologies, facilitating new approaches and analyses and, in some cases, taking sequencing out of the lab and into the field. However, the capabilities of these technologies are still being revealed. Here we show that single-molecule cDNA sequencing using the MinION accurately characterises venom toxin-encoding genes in the painted saw-scaled viper, Echis coloratus. We find the raw sequencing error rate to be around 12%, improved to 0-2% with hybrid error correction and 3% with de novo error correction. Our corrected data provides full coding sequences and 5' and 3' UTRs for 29 of 33 candidate venom toxins detected, far superior to Illumina data (13/40 complete) and Sanger-based ESTs (15/29). We suggest that, should the current pace of improvement continue, the MinION will become the default approach for cDNA sequencing in a variety of species. PMID:26623194

  15. Nucleotide sequence of a small cryptic plasmid from Acidithiobacillus ferrooxidans strain A-6

    SciTech Connect

    F. Roberto

    2003-10-01

    A 2.1 kb cryptic plasmid from Acidithiobacillus ferrooxidans strain A-6 was isolated and cloned into the E. coli vector plasmid, pUC128. The cloned plasmid was mapped by restriction enzyme fragment analysis and subsequently sequenced. At this time over half the plasmid sequence has been determined and compared to sequences in the GenBank nucleotide and protein sequence databases. Much of the plasmid remains cryptic, but substantial nucleotide and protein sequence similarities have been observed to the putative replication protein, RepA, of the small cryptic plasmids pAYS and pAYL found in the ammonia-oxidizing Nitrosomonas sp. Strain ENI-11. These results suggest an entirely new class of plasmid is maintained in at least one strain of Acidithiobacillus ferrooxidans and other acidophilic bacteria, and raise interesting questions about the origin of this plasmid in acidic environments.

  16. The complete nucleotide sequence and genomic characterization of tropical soda apple mosaic virus.

    PubMed

    Fillmer, Kornelia; Adkins, Scott; Pongam, Patchara; D'Elia, Tom

    2016-08-01

    We report the first complete genome sequence of tropical soda apple mosaic virus (TSAMV), a tobamovirus originally isolated from tropical soda apple (Solanum viarum) collected in Okeechobee, Florida. The complete genome of TSAMV is 6,350 nucleotides long and contains four open reading frames encoding the following proteins: i) 126-kDa methyltransferase/helicase (3354 nt), ii) 183-kDa polymerase (4839 nt), iii) movement protein (771 nt) and iv) coat protein (483 nt). The complete genome sequence of TSAMV shares 80.4 % nucleotide sequence identity with pepper mild mottle virus (PMMoV) and 71.2-74.2 % identity with other tobamoviruses naturally infecting members of the Solanaceae plant family. Phylogenetic analysis of the deduced amino acid sequences of the 126-kDa and 183-kDa proteins and the complete genome sequence place TSAMV in a subcluster with PMMoV within the Solanaceae-infecting subgroup of tobamoviruses. PMID:27169599

  17. Complete nucleotide sequence and transcriptional analysis of snakehead fish retrovirus.

    PubMed Central

    Hart, D; Frerichs, G N; Rambaut, A; Onions, D E

    1996-01-01

    The complete genome of the snakehead fish retrovirus has been cloned and sequenced, and its transcriptional profile in cell culture has been determined. The 11.2-kb provirus displays a complex expression pattern capable of encoding accessory proteins and is unique in the predicted location of the env initiation codon and signal peptide upstream of gag and the common splice donor site. The virus is distinguishable from all known retrovirus groups by the presence of an arginine tRNA primer binding site. The coding regions are highly divergent and show a number of unusual characteristics, including a large Gag coiled-coil region, a Pol domain of unknown function, and a long, lentiviral-like, Env cytoplasmic domain. Phylogenetic analysis of the Pol sequence emphasizes the divergent nature of the virus from the avian and mammalian retroviruses. The snakehead virus is also distinct from a previously characterized complex fish retrovirus, suggesting that discrete groups of these viruses have yet to be identified in the lower vertebrates. PMID:8648695

  18. Nucleotide composition of CO1 sequences in Chelicerata (Arthropoda): detecting new mitogenomic rearrangements.

    PubMed

    Arabi, Juliette; Judson, Mark L I; Deharveng, Louis; Lourenço, Wilson R; Cruaud, Corinne; Hassanin, Alexandre

    2012-02-01

    Here we study the evolution of nucleotide composition in third codon-positions of CO1 sequences of Chelicerata, using a phylogenetic framework, based on 180 taxa and three markers (CO1, 18S, and 28S rRNA; 5,218 nt). The analyses of nucleotide composition were also extended to all CO1 sequences of Chelicerata found in GenBank (1,701 taxa). The results show that most species of Chelicerata have a positive strand bias in CO1, i.e., in favor of C nucleotides, including all Amblypygi, Palpigradi, Ricinulei, Solifugae, Uropygi, and Xiphosura. However, several taxa show a negative strand bias, i.e., in favor of G nucleotides: all Scorpiones, Opisthothelae spiders and several taxa within Acari, Opiliones, Pseudoscorpiones, and Pycnogonida. Several reversals of strand-specific bias can be attributed to either a rearrangement of the control region or an inversion of a fragment containing the CO1 gene. Key taxa for which sequencing of complete mitochondrial genomes will be necessary to determine the origin and nature of mtDNA rearrangements involved in the reversals are identified. Acari, Opiliones, Pseudoscorpiones, and Pycnogonida were found to show a strong variability in nucleotide composition. In addition, both mitochondrial and nuclear genomes have been affected by higher substitution rates in Acari and Pseudoscorpiones. The results therefore indicate that these two orders are more liable to fix mutations of all types, including base substitutions, indels, and genomic rearrangements. PMID:22362465
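
    Strand bias at third codon positions can be summarized with a skew statistic on C versus G counts. The abstract does not give the exact formula used, so the sketch below assumes (C − G) / (C + G), chosen so that a positive value corresponds to the abstract's "positive strand bias, i.e., in favor of C nucleotides". The function name is hypothetical, and the input is assumed to be an in-frame coding sequence starting at the first codon position.

```python
def third_position_cg_skew(cds: str) -> float:
    """(C - G) / (C + G) over third codon positions of an in-frame CDS.

    Positive values mean more C than G at third positions; negative values
    mean more G than C.
    """
    thirds = cds.upper()[2::3]  # every third base, starting at codon position 3
    c = thirds.count("C")
    g = thirds.count("G")
    if c + g == 0:
        raise ValueError("no C or G at third codon positions")
    return (c - g) / (c + g)
```

    Comparing the sign of this value across taxa on a phylogeny is one way to spot the reversals of strand-specific bias that the abstract attributes to rearrangements.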

  19. [Nucleotide sequence determination of yeast mitochondrial phenylalanine-tRNA].

    PubMed

    Martin, R; Sibler, A P; Schneller, J M; Keith, G; Stahl, A J; Dirheimer, G

    1978-10-01

    The primary structure of mitochondrial tRNAPhe from Saccharomyces cerevisiae, purified by two-dimensional polyacrylamide gel electrophoresis, was determined using standard procedures on in vivo 32P-labeled tRNA, as well as the new 5'-end postlabeling techniques. We propose a cloverleaf model which allows for tertiary interaction between cytosine in position 46 and guanine in position 15 and maximizes base pairing in the psi C stem, thus excluding the uracil in position 50 from base pairing in the psi C stem. Comparison of the primary structure of this tRNA with all other known procaryotic, chloroplastic or cytoplasmic tRNAsPhe sequences does not lead to any conclusion about the endosymbiotic theory of mitochondria evolution. PMID:103657

  20. Molecular cloning, sequence analysis and expression in Escherichia coli of Camelus dromedarius glucose-6-phosphate dehydrogenase cDNA.

    PubMed

    Saeed, Hesham Mahmoud; Alanazi, Mohammad Saud; Abduljaleel, Zainularifeen; Al-Amri, Abdullah; Khan, Zahid

    2012-06-01

    This study determined the full-length sequence of glucose-6-phosphate dehydrogenase cDNA (G6PD) from the Arabian camel Camelus dromedarius using reverse transcription polymerase chain reaction. The C. dromedarius G6PD has an open reading frame of 1545 bp, and the cDNA encodes a protein of 515 amino acid residues with a molecular weight of 59.0 kDa. The amino acid sequence showed the highest identity with Equus caballus (92%) and Homo sapiens (92%). The G6PD cDNA was cloned and expressed in Escherichia coli as a fusion protein and was purified in a single chromatographic step using a nickel affinity gel column. The purity and the molecular weight of the enzyme were checked on SDS-PAGE and the purified enzyme showed a single band on the gel with a molecular weight of 63.0 kDa. The specific activity of G6PD was determined to be 289.6 EU/mg protein with a fold purification of 95.45 and yield of 56.8%. PMID:22538316

  1. Complete nucleotide sequence of the new potexvirus "Alstroemeria virus X". Brief report.

    PubMed

    Fuji, S; Shinoda, K; Ikeda, M; Furuya, H; Naito, H; Fukumoto, F

    2005-11-01

    A flexuous virus was isolated in Japan from an alstroemeria plant showing mosaic symptoms. The virus had a broad host range but had systemically latent infectivity in alstroemeria. The virus was assigned to the genus Potexvirus based on morphology and physical properties and on an analysis of the complete nucleotide sequence. The genomic RNA of the virus was 7,009 nucleotides in length, excluding the 3'-terminal poly (A) tail. It contained five open reading frames (ORFs), which was consistent with other members of the genus Potexvirus. Although nucleotide sequences of the ORFs differ from previously reported potexviruses, a phylogenetic analysis placed it phylogenetically close to Narcissus mosaic virus and Scallion virus X. Therefore, we propose that this virus should be designated as Alstroemeria virus X (AlsVX). PMID:15986173

  2. Quantitative analysis of the relationship between nucleotide sequence and functional activity.

    PubMed Central

    Stormo, G D; Schneider, T D; Gold, L

    1986-01-01

    Matrices can be used to evaluate sequences for functional activity. Multiple regression can solve for the matrix that gives the best fit between sequence evaluations and quantitative activities. This analysis shows that the best model for context effects on suppression by su2 involves primarily the two nucleotides 3' to the amber codon, and that their contributions are independent and additive. Context effects on 2AP mutagenesis also involve the two nucleotides 3' to the 2AP insertion, but their effects are not independent. In a construct for producing beta-galactosidase, the effects on translational yields of the tri-nucleotide 5' to the initiation codon are dependent on the entire triplet. Models based on these quantitative results are presented for each of the examples. PMID:3092188
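
    Evaluating a sequence with a matrix, in the additive sense described above, means summing one weight per position, selected by the base found there. The sketch below shows only that scoring step. The weights are made-up placeholders, not values from the paper, and the names `score` and `EXAMPLE_MATRIX` are hypothetical. In the approach the abstract describes, the weights would instead be fitted by multiple regression against measured activities.

```python
# One dict per position, mapping base -> weight. Placeholder values only.
EXAMPLE_MATRIX = [
    {"A": 0.5, "C": -0.5, "G": 0.25, "T": 0.0},    # first position 3' to the codon
    {"A": 0.0, "C": 0.125, "G": -0.25, "T": 0.5},  # second position 3' to the codon
]


def score(seq: str, matrix) -> float:
    """Additive matrix score: sum of the weight of each base at its position."""
    if len(seq) != len(matrix):
        raise ValueError("sequence length must equal the number of matrix columns")
    return sum(column[base] for base, column in zip(seq.upper(), matrix))
```

    An additive matrix of this kind assumes that positions contribute independently. The abstract notes that this holds for su2 suppression but not for 2AP mutagenesis, where a model with interaction terms would be needed.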

  3. Single nucleotide polymorphism mining and nucleotide sequence analysis of Mx1 gene in exonic regions of Japanese quail

    PubMed Central

    Niraj, Diwesh Kumar; Kumar, Pushpendra; Mishra, Chinmoy; Narayan, Raj; Bhattacharya, Tarun Kumar; Shrivastava, Kush; Bhushan, Bharat; Tiwari, Ashok Kumar; Saxena, Vishesh; Sahoo, Nihar Ranjan; Sharma, Deepak

    2015-01-01

    Aim: An attempt has been made to study the Myxovirus resistant (Mx1) gene polymorphism in Japanese quail. Materials and Methods: In the present investigation, four fragments viz. Fragment I of 185 bp (Exon 3 region), Fragment II of 148 bp (Exon 5 region), Fragment III of 161 bp (Exon 7 region), and Fragment IV of 176 bp (Exon 13 region) of Mx1 gene were amplified and screened for polymorphism by polymerase chain reaction-single-strand conformation polymorphism technique in 170 Japanese quail birds. Results: Out of the four fragments, one fragment (Fragment II) was found to be polymorphic. The remaining three fragments (Fragment I, III, and IV) were found to be monomorphic, which was confirmed by custom sequencing. Overall nucleotide sequence analysis of Mx1 gene of Japanese quail showed 100% homology with common quail and more than 80% homology with reported sequence of chicken breeds. Conclusion: The Mx1 gene is mostly conserved in Japanese quail. There is an urgent need for comprehensive analysis of other regions of Mx1 gene along with its possible association with the traits of economic importance in Japanese quail. PMID:27047057

  4. The organization of repeated nucleotide sequences in the replicons of mammalian DNA.

    PubMed Central

    Mattern, M R; Painter, R B

    1977-01-01

    Chinese hamster ovary cells were irradiated with 100-5,000 rads of X-rays and inhibition of the initiation of replicons after irradiation was demonstrated by analyzing nascent DNA sedimented in alkaline sucrose gradients. The renaturation kinetics of DNA synthesized during 60 min of incubation after irradiation was compared with that of DNA synthesized during the 60 min after sham irradiation and with that of parental DNA. Nascent DNA from cells whose replicon initiation was inhibited renatured faster than nascent DNA from control cells in the COt range of repeated nucleotide sequences, suggesting that regions of the replicon not close to origins are enriched in repeated sequences and that regions close to origins are enriched in unique sequences. A class of repeated nucleotide sequences may be involved in the regulation of replicon initiation. PMID:880330

  5. On the feasibility of using the intrinsic fluorescence of nucleotides for DNA sequencing.

    SciTech Connect

    Chowdhury, M. H.; Ray, K.; Johnson, R. L.; Gray, S. K.; Pond, J.; Lakowicz, J. R.; Univ. of Maryland; Univ. of Virginia; Lumerical Solutions, Inc.

    2010-04-29

    There is presently a worldwide effort to increase the speed and decrease the cost of DNA sequencing as exemplified by the goal of the National Human Genome Research Institute (NHGRI) to sequence a human genome for under $1000. Several high throughput technologies are under development. Among these, single strand sequencing using exonuclease appears very promising. However, this approach requires complete labeling of at least two bases at a time, with extrinsic high quantum yield probes. This is necessary because nucleotides absorb in the deep ultraviolet (UV) and emit with extremely low quantum yields. Hence intrinsic emission from DNA and nucleotides is not being exploited for DNA sequencing. In the present paper we consider the possibility of identifying single nucleotides using their intrinsic emission. We used the finite-difference time-domain (FDTD) method to calculate the effects of aluminum nanoparticles on nearby fluorophores that emit in the UV. We find that the radiated power of UV fluorophores is significantly increased when they are in close proximity to aluminum nanostructures. We show that there will be increased localized excitation near aluminum particles at wavelengths used to excite intrinsic nucleotide emission. Using FDTD simulation we show that a typical DNA base when coupled to appropriate aluminum nanostructures leads to highly directional emission. Additionally we present experimental results showing that a thin film of nucleotides shows enhanced emission when in close proximity to aluminum nanostructures. Finally we provide Monte Carlo simulations that predict high levels of base calling accuracy for an assumed number of photons that is derived from the emission spectra of the intrinsic fluorescence of the bases. Our results suggest that single nucleotides can be detected and identified using aluminum nanostructures that enhance their intrinsic emission. This capability would be valuable for the ongoing efforts toward the $1000 genome.

  6. Single nucleotide polymorphism discovery in rainbow trout by deep sequencing of a reduced representation library

    Technology Transfer Automated Retrieval System (TEKTRAN)

    BACKGROUND: To enhance capabilities for genomic analyses in rainbow trout, such as genomic selection, a large suite of polymorphic markers that are amenable to high-throughput genotyping protocols must be identified. Expressed Sequence Tags (ESTs) have been used for single nucleotide polymorphism (...

  7. Complete Nucleotide Sequence of a Citrobacter freundii Plasmid Carrying KPC-2 in a Unique Genetic Environment

    PubMed Central

    Yao, Yancheng; Imirzalioglu, Can; Hain, Torsten; Kaase, Martin; Gatermann, Soeren; Exner, Martin; Mielke, Martin; Hauri, Anja; Dragneva, Yolanta; Bill, Rita; Wendt, Constanze; Wirtz, Angela; Chakraborty, Trinad

    2014-01-01

    The complete and annotated nucleotide sequence of a 54,036-bp plasmid harboring a blaKPC-2 gene that is clonally present in Citrobacter isolates from different species is presented. The plasmid belongs to incompatibility group N (IncN) and harbors the class A carbapenemase KPC-2 in a unique genetic environment. PMID:25395635

  8. Characterization of expressed sequence tags from a full-length enriched cDNA library of Cryptomeria japonica male strobili

    PubMed Central

    Futamura, Norihiro; Totoki, Yasushi; Toyoda, Atsushi; Igasaki, Tomohiro; Nanjo, Tokihiko; Seki, Motoaki; Sakaki, Yoshiyuki; Mari, Adriano; Shinozaki, Kazuo; Shinohara, Kenji

    2008-01-01

    Background: Cryptomeria japonica D. Don is one of the most commercially important conifers in Japan. However, the allergic disease caused by its pollen is a severe public health problem in Japan. Since large-scale analysis of expressed sequence tags (ESTs) in the male strobili of C. japonica should help us to clarify the overall expression of genes during the process of pollen development, we constructed a full-length enriched cDNA library that was derived from male strobili at various developmental stages. Results: We obtained 36,011 expressed sequence tags (ESTs) from either one or both ends of 19,437 clones derived from the cDNA library of C. japonica male strobili at various developmental stages. The 19,437 cDNA clones corresponded to 10,463 transcripts. Approximately 80% of the transcripts resembled ESTs from Pinus and Picea, while approximately 75% had homologs in Arabidopsis. An analysis of homologies between ESTs from C. japonica male strobili and known pollen allergens in the Allergome Database revealed that products of 180 transcripts exhibited significant homology. Approximately 2% of the transcripts appeared to encode transcription factors. We identified twelve genes for MADS-box proteins among these transcription factors. The twelve MADS-box genes were classified as DEF/GLO/GGM13-, AG-, AGL6-, TM3- and TM8-like MIKCC genes and type I MADS-box genes. Conclusion: Our full-length enriched cDNA library derived from C. japonica male strobili provides information on expression of genes during the development of male reproductive organs. We provided potential allergens in C. japonica. We also provided new information about transcription factors including MADS-box genes expressed in male strobili of C. japonica. Large-scale gene discovery using full-length cDNAs is a valuable tool for studies of gymnosperm species. PMID:18691438

  9. The complete nucleotide sequence of Pepper mottle virus-Florida RNA.

    PubMed

    Warren, C E; Murphy, J F

    2003-01-01

    The Pepper mottle virus-Florida (PepMoV-FL) RNA genome was cloned and sequenced, and shown to consist of 9,717 nucleotides (nt) excluding the poly (A) tail. A single open reading frame was identified beginning at nucleotide position 169 encoding a polyprotein of 3068 amino acids. Phylogenetic sequence analysis revealed that of 44 full-length viral RNA genomes analyzed within the family Potyviridae, PepMoV-FL was most closely related to PepMoV-California (PepMoV-CA), Potato virus Y-H (PVY-H), PVY-N, PVY(o) and Potato virus V-DV42 (PVV-DV42). Using the PepMoV-FL sequence as a basis for comparison, the overall nucleotide sequence identity was highest between PepMoV-FL and PepMoV-CA at 93%, while the relationship was more distant with PVV-DV42 at 64% and for the PVY strains at 61%. A unique direct repeat sequence of 76 nucleotides was identified in the PepMoV-FL 3'-untranslated region (UTR), and this repeat sequence was confirmed not to occur in the PepMoV-CA sequence. Since the Florida isolate was among the first of the PepMoV isolates described, extensive biological and serological data on this isolate are available, and it has now been cloned and sequenced, we recommend that PepMoV-FL be recognized as the PepMoV type strain. PMID:12536304

  10. Proteome-wide Identification of Novel Ceramide-binding Proteins by Yeast Surface cDNA Display and Deep Sequencing.

    PubMed

    Bidlingmaier, Scott; Ha, Kevin; Lee, Nam-Kyung; Su, Yang; Liu, Bin

    2016-04-01

    Although the bioactive sphingolipid ceramide is an important cell signaling molecule, relatively few direct ceramide-interacting proteins are known. We used an approach combining yeast surface cDNA display and deep sequencing technology to identify novel proteins binding directly to ceramide. We identified 234 candidate ceramide-binding protein fragments and validated binding for 20. Most (17) bound selectively to ceramide, although a few (3) bound to other lipids as well. Several novel ceramide-binding domains were discovered, including the EF-hand calcium-binding motif, the heat shock chaperonin-binding motif STI1, the SCP2 sterol-binding domain, and the tetratricopeptide repeat region motif. Interestingly, four of the verified ceramide-binding proteins (HPCA, HPCAL1, NCS1, and VSNL1) and an additional three candidate ceramide-binding proteins (NCALD, HPCAL4, and KCNIP3) belong to the neuronal calcium sensor family of EF hand-containing proteins. We used mutagenesis to map the ceramide-binding site in HPCA and to create a mutant HPCA that does not bind to ceramide. We demonstrated selective binding to ceramide by mammalian cell-produced wild type but not mutant HPCA. Intriguingly, we also identified a fragment from prostaglandin D2 synthase that binds preferentially to ceramide 1-phosphate. The wide variety of proteins and domains capable of binding to ceramide suggests that many of the signaling functions of ceramide may be regulated by direct binding to these proteins. Based on the deep sequencing data, we estimate that our yeast surface cDNA display library covers ∼60% of the human proteome and our selection/deep sequencing protocol can identify target-interacting protein fragments that are present at extremely low frequency in the starting library. Thus, the yeast surface cDNA display/deep sequencing approach is a rapid, comprehensive, and flexible method for the analysis of protein-ligand interactions, particularly for the study of non-protein ligands.

  11. Complete nucleotide sequence of the M2 gene segment of reovirus type 3 dearing and analysis of its protein product mu 1.

    PubMed

    Jayasuriya, A K; Nibert, M L; Fields, B N

    1988-04-01

    The nucleotide sequence of the M2 gene segment of the mammalian reovirus prototype strain, type 3 Dearing, was determined from a cloned full-length cDNA copy of the viral double-stranded RNA segment. The gene comprises 2203 nucleotides and has a single long open reading frame that spans bases 30 through 2154 and encodes the 708 amino acid outer capsid protein mu 1. Aminoterminal sequence analysis of mu 1C, the proteolytically cleaved form of mu 1 that is found in purified reovirions, has identified the site of mu 1 to mu 1C cleavage between residues 42 and 43 in the mu 1 sequence. Aminoterminal sequence analysis of delta, the proteolytically cleaved product of mu 1C that is found in chymotrypsin-generated intermediate subviral particles, has indicated that the mu 1C to delta cleavage occurs near the carboxyterminus of mu 1C. Lastly, stoichiometric determinations using new sequence information have suggested that approximately equimolar amounts of mu 1C and the other major outer capsid component sigma 3 are present in virions. The data presented in this study should be useful for understanding the molecular basis of the functions of the mu 1 protein in reovirus entry into cells and in pathogenesis in the host animal. PMID:3354207

  12. Human bradykinin B2 receptor: Nucleotide sequence analysis and assignment to chromosome 14

    SciTech Connect

    Powell, S.J.; Slynn, G.; Thomas, C.; Hopkins, B.; Briggs, I.; Graham, A.

    1993-02-01

    Functional cDNA clones for human bradykinin B2 receptor were isolated from uterus RNA by a polymerase chain reaction (PCR)-based method and by screening a human cosmid library with rat bradykinin B2 receptor probe. We isolated several overlapping clones from the cosmid library, each of which encodes the entire protein-coding sequence. The human bradykinin B2 receptor gene codes for a 364-amino-acid protein with a molecular mass of 41,442 Da that is highly homologous to rat bradykinin B2 receptor cDNA (81%). The entire human cDNA sequence was cloned into an expression vector and mRNA was synthesised by in vitro transcription. Applications of bradykinin caused membrane current responses in Xenopus oocytes injected with the in vitro-synthesized mRNA. Preincubation with the potent B2 antagonist, HOE140, prevented this response. The genomic clone is intronless, and we have identified an upstream promoter region and a downstream polyadenylation signal. The human bradykinin B2 receptor gene has been mapped to chromosome 14 using PCR to specifically amplify DNA from somatic cell hybrids. 10 refs., 1 fig., 1 tab.

  13. Sequence of cDNA for rat cystathionine gamma-lyase and comparison of deduced amino acid sequence with related Escherichia coli enzymes.

    PubMed Central

    Erickson, P F; Maxwell, I H; Su, L J; Baumann, M; Glode, L M

    1990-01-01

    A cDNA clone for cystathionine gamma-lyase was isolated from a rat cDNA library in lambda gt11 by screening with a monospecific antiserum. The identity of this clone, containing 600 bp proximal to the 3'-end of the gene, was confirmed by positive hybridization selection. Northern-blot hybridization showed the expected higher abundance of the corresponding mRNA in liver than in brain. Two further cDNA clones from a plasmid pcD library were isolated by colony hybridization with the first clone and were found to contain inserts of 1600 and 1850 bp. One of these was confirmed as encoding cystathionine gamma-lyase by hybridization with two independent pools of oligodeoxynucleotides corresponding to partial amino acid sequence information for cystathionine gamma-lyase. The other clone (estimated to represent all but 8% of the 5'-end of the mRNA) was sequenced and its deduced amino acid sequence showed similarity to those of the Escherichia coli enzymes cystathionine beta-lyase and cystathionine gamma-synthase throughout its length, especially to that of the latter. Images Fig. 1. Fig. 2. Fig. 3. Fig. 5. PMID:2201285

  14. Nucleotide sequence and genome organization of Dweet mottle virus and its relationship to members of the family Betaflexiviridae

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The nucleotide sequence of Dweet mottle virus (DMV) was determined and compared to sequences of members of the family Alpha- and Beta-flexiviridae. The DMV genome has 8747 nucleotides (nt) excluding the poly-(A) tail at the 3’ end of the genome. The overall G+C content of DMV genomic RNA is 40%. D...

  15. Nucleotide sequence of a cloned woodchuck hepatitis virus genome: evolutional relationship between hepadnaviruses.

    PubMed Central

    Kodama, K; Ogasawara, N; Yoshikawa, H; Murakami, S

    1985-01-01

    We have determined the complete nucleotide sequence of a cloned DNA of woodchuck hepatitis virus (WHV), the most oncogenic virus among hepadnaviruses. The genome, designated WHV2, is 3,320 base pairs long and contains four major open reading frames (ORFs) coded on the same strand of nucleotide sequence as in the human hepatitis B virus (HBV) genome. Comparison of the nucleotide sequence and amino acid sequences deduced from it among the genomes of various hepadnaviruses demonstrates that each protein shows an intrinsic property in conserving its amino acid sequence. A parameter, the ratio of the number of triplets with one-letter change but no amino acid substitution to the total number of triplets in which one-letter change occurred, was introduced to measure the intrinsic properties quantitatively. For each ORF, the parameter gave characteristic values in all combinations. Therefore, the relative evolutional distance between these hepadnaviruses can be measured by the amino acid substitution rate of any ORF. These comparisons suggest that (i) the difference between two WHV clones, WHV1 and WHV2, corresponds to that among clones of a HBV subtype, HBVadr, and (ii) WHV and ground squirrel hepatitis virus can be categorized in a way similar to the subgroups of HBV. PMID:3855246
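
    The parameter defined in the abstract above (the number of triplets with a one-letter change but no amino acid substitution, divided by the total number of triplets with a one-letter change) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' program: it assumes two in-frame, equal-length, gap-free DNA coding sequences and the standard genetic code, and the function name silent_change_ratio is hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def silent_change_ratio(seq_a, seq_b):
    """Ratio of silent one-letter codon changes to all one-letter codon changes.

    Assumes both inputs are aligned, in-frame DNA sequences (T, not U) of equal
    length with no gaps. Returns None when no codon differs by exactly one letter.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, equal in length, and in frame")
    one_letter = 0
    silent = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        # Count only triplets that differ at exactly one of the three positions.
        if sum(x != y for x, y in zip(codon_a, codon_b)) == 1:
            one_letter += 1
            if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
                silent += 1
    return silent / one_letter if one_letter else None
```

    For example, comparing GCTAAAGGT with GCCAGAGGT gives two triplets with a one-letter change (GCT to GCC, which is silent, and AAA to AGA, which is not), so the ratio is 1/2.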

  16. Nucleotide sequence and genome organization of atractylodes mottle virus, a new member of the genus Carlavirus.

    PubMed

    Zhao, Fumei; Igori, Davaajargal; Lim, Seungmo; Yoo, Ran Hee; Lee, Su-Heon; Moon, Jae Sun

    2015-11-01

    The complete genome sequence of a member of a distinct species of the genus Carlavirus in the family Betaflexiviridae, tentatively named atractylodes mottle virus (AtrMoV), has been determined. Analysis of its genomic organization indicates that it has a single-stranded, positive-sense genomic RNA of 8866 nucleotides, excluding the poly(A) tail, and consists of six open reading frames typical of members of the genus Carlavirus. The individual open reading frames of AtrMoV show moderately low sequence similarity to those of other carlaviruses at the nucleotide and amino acid sequence levels. Pairwise comparison and phylogenetic analysis suggest that AtrMoV is most closely related to chrysanthemum virus B. PMID:26264403

  17. A likelihood method for the detection of selection and recombination using nucleotide sequences.

    PubMed

    Grassly, N C; Holmes, E C

    1997-03-01

    Different regions along nucleotide sequences are often subject to different evolutionary forces. Recombination will result in regions having different evolutionary histories, while selection can cause regions to evolve at different rates. This paper presents a statistical method based on likelihood for detecting such processes by identifying the regions which do not fit with a single phylogenetic topology and nucleotide substitution process along the entire sequence. Subsequent reanalysis of these anomalous regions may then be possible. The method is tested using simulations, and its application is demonstrated using the primate psi eta-globin pseudogene, the V3 region of the envelope gene of HIV-1, and argF sequences from Neisseria bacteria. Reanalysis of anomalous regions is shown to reveal possible immune selection in HIV-1 and recombination in Neisseria. A computer program which implements the method is available. PMID:9066792

  18. Complete nucleotide sequence of the structural gene for alkaline proteinase from Pseudomonas aeruginosa IFO 3455.

    PubMed Central

    Okuda, K; Morihara, K; Atsumi, Y; Takeuchi, H; Kawamoto, S; Kawasaki, H; Suzuki, K; Fukushima, J

    1990-01-01

    The DNA encoding alkaline proteinase (AP) of Pseudomonas aeruginosa IFO 3455 was cloned, and its complete nucleotide sequence was determined. When the cloned gene was ligated to pUC18, the Escherichia coli expression vector, the gene-incorporated bacteria expressed high levels of both AP activity and AP antigens. The amino acid sequence deduced from the nucleotide sequence revealed that the mature AP consists of 467 amino acids with a relative molecular weight of 49,507. The amino acid composition predicted from the DNA sequence was similar to the chemically determined composition of purified AP reported previously. The amino acid sequence analysis revealed that both the N-terminal side sequence of the purified AP and several internal lysyl peptide fragments were identical to the deduced amino acid sequences. The percent homology of amino acid sequences between AP and Serratia protease was about 55%. The zinc ligands and an active site of the AP were predicted by comparing the structure of the enzyme with those of Serratia protease, thermolysin, Bacillus subtilis neutral protease, and Pseudomonas elastase. PMID:2123832

  19. QGRS Mapper: a web-based server for predicting G-quadruplexes in nucleotide sequences

    PubMed Central

    Kikin, Oleg; D'Antonio, Lawrence; Bagga, Paramjeet S

    2006-01-01

    The quadruplex structures formed by guanine-rich nucleic acid sequences have received significant attention recently because of growing evidence for their role in important biological processes and as therapeutic targets. G-quadruplex DNA has been suggested to regulate DNA replication and may control cellular proliferation. Sequences capable of forming G-quadruplexes in the RNA have been shown to play significant roles in regulation of polyadenylation and splicing events in mammalian transcripts. Whether quadruplex structure directly plays a role in regulating RNA processing requires investigation. Computational approaches to study G-quadruplexes allow detailed analysis of mammalian genomes. There are no known easily accessible user-friendly tools that can compute G-quadruplexes in the nucleotide sequences. We have developed a web-based server, QGRS Mapper, that predicts quadruplex forming G-rich sequences (QGRS) in nucleotide sequences. It is a user-friendly application that provides many options for defining and studying G-quadruplexes. It performs analysis of the user provided genomic sequences, e.g. promoter and telomeric regions, as well as RNA sequences. It is also useful for predicting G-quadruplex structures in oligonucleotides. The program provides options to search and retrieve desired gene/nucleotide sequence entries from NCBI databases for mapping G-quadruplexes in the context of RNA processing sites. This feature is very useful for investigating the functional relevance of G-quadruplex structure, in particular its role in regulating the gene expression by alternative processing. In addition to providing data on composition and locations of QGRS relative to the processing sites in the pre-mRNA sequence, QGRS Mapper features interactive graphic representation of the data. The user can also use the graphics module to visualize QGRS distribution patterns among all the alternative RNA products of a gene simultaneously on a single screen. 
QGRS Mapper can be
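
    The kind of pattern search this record describes can be illustrated with a short Python sketch. It is not the QGRS Mapper algorithm or its scoring scheme, which the abstract does not specify. It assumes a commonly used simplified motif: four runs of at least three guanines separated by loops of one to seven nucleotides. The function name and the default parameters are hypothetical.

```python
import re


def find_putative_quadruplexes(seq, min_g=3, max_loop=7):
    """Return (start, end, match) tuples for a simplified G-quadruplex motif.

    Assumed motif: G-run, loop, G-run, loop, G-run, loop, G-run, where each
    G-run has at least `min_g` guanines and each loop has 1..max_loop bases.
    Positions are 0-based and half-open; matches are non-overlapping.
    """
    g_run = "G{%d,}" % min_g
    loop = "[ACGTU]{1,%d}" % max_loop
    pattern = re.compile(g_run + (loop + g_run) * 3)
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(seq.upper())]


# Usage: four copies of the human telomeric repeat TTAGGG.
hits = find_putative_quadruplexes("TTAGGGTTAGGGTTAGGGTTAGGG")
```

    Because the regular expression reports only non-overlapping matches, it can miss alternative overlapping candidates in G-rich regions; a dedicated tool would enumerate and rank those.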

  20. Complete nucleotide sequence of the Streptomyces lividans plasmid pIJ101 and correlation of the sequence with genetic properties.

    PubMed Central

    Kendall, K J; Cohen, S N

    1988-01-01

    The complete nucleotide sequence of the multicopy Streptomyces plasmid pIJ101 has been determined and correlated with previously published genetic data. The circular DNA molecule is 8,830 nucleotides in length and has a G+C composition of 72.98%. The use of a computer program, FRAME, enabled identification in the sequence of seven open reading frames, four of which, tra (621 amino acids [aa]), spdA (146 aa), spdB (274 aa), and kilB (177 aa), appear to be genes involved in plasmid transfer. At least two of the above genes are predicted to be transcribed by known promoters that are regulated in trans by the products of the korA (241 aa) and korB (80 aa) loci on the plasmid. The segment of the plasmid capable of autonomous replication contains one large open reading frame (rep; 450 aa) and a noncoding region presumed to be the origin of replication. Four other small (less than 90 aa) open reading frames are also present on the plasmid, although no function can be attributed to them. The sequence of the pIJ101 replication segment present in several widely used cloning vectors (e.g., pIJ350 and pIJ702) has also been determined, so that the complete nucleotide sequences of these vectors are now known. PMID:3170481
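
    The G+C composition quoted in the abstract above is a simple ratio: G plus C bases divided by all counted bases. At 72.98% of 8,830 nucleotides, that corresponds to roughly 6,444 G or C bases. A minimal Python sketch of the calculation follows; the function name gc_percent is hypothetical, and ignoring ambiguous symbols is an assumption made for this sketch.

```python
def gc_percent(seq):
    """Return the G+C content of a DNA sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted; other symbols such as N
    are ignored, which is an assumption of this sketch.
    """
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(base in "GC" for base in counted)
    return 100.0 * gc / len(counted)
```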

  1. Sequence selective naked-eye detection of DNA harnessing extension of oligonucleotide-modified nucleotides.

    PubMed

    Verga, Daniela; Welter, Moritz; Marx, Andreas

    2016-02-01

    DNA polymerases can efficiently and sequence selectively incorporate oligonucleotide (ODN)-modified nucleotides and the incorporated oligonucleotide strand can be employed as primer in rolling circle amplification (RCA). The effective amplification of the DNA primer by Φ29 DNA polymerase allows the sequence-selective hybridisation of the amplified strand with a G-quadruplex DNA sequence that has horseradish peroxidase-like activity. Based on these findings we develop a system that allows DNA detection with single-base resolution by naked eye. PMID:26774580

  2. The nucleotide sequence at the termini of adenovirus type 5 DNA.

    PubMed Central

    Steenbergh, P H; Maat, J; van Ormondt, H; Sussenbach, J S

    1977-01-01

    The sequences of the first 194 base pairs at both termini of adenovirus type 5 (Ad5) DNA have been determined, using the chemical degradation technique developed by Maxam and Gilbert (Proc. Nat. Acad. Sci. USA 74 (1977), pp. 560-564). The nucleotide sequences 1-75 were confirmed by analysis of labeled RNA transcribed from the terminal HhaI fragments in vitro. The sequence data show that Ad5 DNA has a perfect inverted terminal repetition 103 base pairs long. Images PMID:600799

  3. Complete nucleotide sequence of a subviral DNA molecule of porcine circovirus type 2.

    PubMed

    Wen, Han

    2016-07-01

    Porcine circovirus type 2 (PCV2) is a member of the genus Circovirus in the family Circoviridae. Most subgenomic molecules of PCV2 have been mapped. Here, the first full-length sequence of a subviral molecule of PCV2 (CH-IVT12) containing a reverse complement sequence of the PCV2 genome was determined by sequencing DNA extracted from PK15 cells infected with PCV2. The circular CH-IVT12 DNA consists of 1136 nucleotides and contains one major open reading frame. PMID:27084550

  4. Nucleotide sequence of 5S ribosomal RNA from Aspergillus nidulans and Neurospora crassa.

    PubMed Central

    Piechulla, B; Hahn, U; McLaughlin, L W; Küntzel, H

    1981-01-01

    The nucleotide sequences of 5S rRNA molecules isolated from the cytosol and the mitochondria of the ascomycetes A. nidulans and N. crassa were determined by partial chemical cleavage of 3'-terminally labelled RNA. The sequence identity of the cytosolic and mitochondrial RNA preparations confirms the absence of mitochondrion-specific 5S rRNA in these fungi. The sequences of the two organisms differ in 35 positions, and each sequence differs from yeast 5S rRNA in 44 positions. Both molecules contain the sequence GCUC in place of GAAC or GAUY found in all other 5S rRNAs, indicating that this region is not universally involved in base-pairing to the invariant GTpsiC sequence of tRNAs. Images PMID:6453331

  5. Nucleotide sequences of an important functional gene hnRNPA2/B1 from Ailuropoda melanoleuca and Ursus thibetanus mupinensis and its potential value in phylogenetic study.

    PubMed

    Du, Yu-jie; Hou, Yi-ling; Hou, Wan-ru

    2014-01-01

    cDNA fragments of hnRNPA2/B1 were cloned from the giant panda and the black bear using RT-PCR; they were 1029 bp and 1026 bp in length, encoding 343 and 341 amino acids, respectively. Analysis indicated that the cDNA cloned from the giant panda encoded variant B1, while the cDNA cloned from the black bear encoded variant A2. The hnRNPA2B1 peptides of the giant panda and black bear contained 76 and 86 glycine residues, respectively, most of them concentrated in the latter halves of the peptides. Functional site prediction also showed many N-myristoylation sites in the glycine-rich domain, which is probably related to the protein's role in telomere maintenance. Base bias and substitution analysis indicated that the hnRNPA2/B1 ORF is biased toward G and against C, and that transitions at the third codon position have not reached saturation. Orthology analysis indicated that both the nucleotide sequence and the deduced amino acid sequence showed high identity to 26 other reported hnRNPA2/B1 sequences from mammals and nonmammals. These sequences were used to construct phylogenetic trees by the neighbor-joining (NJ) method with 1000 bootstrap replicates, and the resulting tree had a topology similar to that of classical systematics, suggesting the potential value of hnRNPA2/B1 in phylogenetic analysis. This report is a first step toward studying the function of hnRNPA2/B1 in the giant panda and black bear, and provides a scientific basis for disease surveillance, captive breeding, and conservation of these endangered species. PMID:24588753

  6. Nucleotide and deduced amino acid sequences of the nucleocapsid protein of the virulent A75/17-CDV strain of canine distemper virus.

    PubMed

    Stettler, M; Zurbriggen, A

    1995-05-01

    Virus persistence is essential in the chronic inflammatory canine distemper virus (CDV)-induced demyelinating disease. In the case of CDV there is a close association between persistence and virulence. Virulent CDV isolated from dogs with distemper shows immediate persistence in primary dog brain cell cultures (DBCC) and in different cell lines. We have evidence that the nucleocapsid (NP) protein plays an important role in the development of persistence. The NP-protein, the most abundant structural virus protein, also influences virus assembly and has some regulatory functions in virus transcription and replication. In this study we compared the nucleotide and deduced amino acid sequence of a virulent CDV strain (A75/17-CDV) to a culture-attenuated non-virulent strain (OP-CDV). Viral RNA was extracted from DBCC infected with virulent CDV. Virulent CDV retains its in vivo properties, such as virulence and ability to cause demyelination, when propagated in these DBCC. The viral RNA was reverse transcribed and the resulting cDNA amplified by polymerase chain reaction for subsequent cloning. The nucleotide sequences of these clones were determined by the dideoxy chain termination method. The number of nucleotides and the putative NP-protein of the virulent strain matched the attenuated CDV strain. We observed a total of 105 nucleotide differences. Three were localised within the 3' and five within the 5' non-coding region of the NP-gene. The 97 nucleotide changes within the coding region resulted in 22 amino acid differences. 10 of these amino acid (AA) modifications were within the N-terminal region (AA 1 to 159) and 12 within the C-terminal area (AA 351 to 523).(ABSTRACT TRUNCATED AT 250 WORDS) PMID:8588315

  7. Nucleotide sequence specifying the glycoprotein gene, gB, of herpes simplex virus type 1.

    PubMed

    Bzik, D J; Fox, B A; DeLuca, N A; Person, S

    1984-03-01

    The nucleotide sequence thought to specify the glycoprotein gene, gB, of the KOS strain of herpes simplex virus type 1 (HSV-1) has been determined. A 3.1-kilobase (kb), viral-specified RNA was mapped to the left half of the BamHI-G fragment (0.345 to 0.399 map units). TATA, CAT-box, and possible mRNA start sequences characteristic of HSV-1 genes are found near 0.368 map units. The first available ATG codon is at 0.366 and the first in-phase chain terminator at 0.348 map units. A polyA-addition signal (AATAAA) occurs 17 nucleotides past the chain terminator. Translation of these sequences would yield a 100.3-kilodalton (kDa) polypeptide characterized by a 5' signal sequence, nine N-linked saccharide addition sites, a strongly hydrophobic membrane-spanning sequence, and a highly charged 3' cytoplasmic anchor sequence. Two mutants of KOS, tsJ12 and tsJ20, that are temperature-sensitive for viral growth and for the production of gB, have been physically mapped to 0.357 to 0.360 and 0.360 to 0.364 map units, respectively (DeLuca et al., in preparation). The nucleotide sequence of the mutants was determined in these regions. In both cases a single amino acid replacement within the 100.3-kDa polypeptide is predicted from the sequence analysis. PMID:6324454

  8. Linking the human cytogenetic map with nucleotide sequence: the CCAP clone set.

    PubMed

    Jang, Wonhee; Yonescu, Raluca; Knutsen, Turid; Brown, Theresa; Reppert, Tricia; Sirotkin, Karl; Schuler, Gregory D; Ried, Thomas; Kirsch, Ilan R

    2006-07-15

    We present the completed dataset and clone repository of the Cancer Chromosome Aberration Project (CCAP), an initiative developed and funded through the intramural program of the U.S. National Cancer Institute, to provide seamless linkage of human cytogenetic markers with the primary nucleotide sequence of the human genome. Spaced at 1-2 Mb intervals across the human genome, 1,339 bacterial artificial chromosome (BAC) clones have been localized to chromosomal bands through high-resolution fluorescence in situ hybridization (FISH) mapping. Of these clones, 99.8% can be positioned on the primary human genome sequence and 95% are placed at or close to their precise nucleotide starts and stops. This dataset can be studied and manipulated within generally available public Web sites. The clones are available from a commercial repository. The CCAP BAC clone set provides anchors for the interrogation of gene and sequence involvement in oncogenic and developmental disorders when the starting point is the recognition of a structural, numerical, or interstitial chromosomal aberration. This dataset also provides a current view of the quality and coherence of the available genome sequence and insight into the nucleotide and three-dimensional structures that manifest as Giemsa light and dark chromosomal banding patterns. PMID:16843097

  9. Complete nucleotide sequence of cherry virus A (CVA) infecting sweet cherry in India.

    PubMed

    Noorani, M S; Awasthi, P; Singh, Rahul Mohan; Ram, Raja; Sharma, M P; Singh, S R; Ahmed, N; Hallan, V; Zaidi, A A

    2010-12-01

    Cherry virus A (CVA) is a graft-transmissible member of the genus Capillovirus that infects different stone fruits. Sweet cherry (Prunus avium L; family Rosaceae) is an important deciduous temperate fruit crop in the Western Himalayan region of India. In order to determine the health status of cherry plantations and the incidence of the virus in India, cherry orchards in the states of Jammu and Kashmir (J&K) and Himachal Pradesh (H.P.) were surveyed during the months of May and September 2009. The incidence of CVA was found to be 28 and 13% from J&K and H.P., respectively, by RT-PCR. In order to characterize the virus at the molecular level, the complete genome was amplified by RT-PCR using specific primers. The amplicon of about 7.4 kb was sequenced and was found to be 7,379 bp long, with sequence specificity to CVA. The genome organization was similar to that of isolates characterized earlier, coding for two ORFs, in which ORF 2 is nested in ORF1. The complete sequence was 81 and 84% similar to that of the type isolate at the nucleotide and amino acid level, respectively, with 5' and 3' UTRs of 54 and 299 nucleotides, respectively. This is the first report of the complete nucleotide sequence of cherry virus A infecting sweet cherry in India. PMID:20938696

  10. Complete nucleotide sequence of the hypervirulent CFH strain of beet curly top virus.

    PubMed

    Stenger, D C

    1994-01-01

    The complete nucleotide sequence of the hypervirulent CFH strain of beet curly top geminivirus (BCTV) has been determined. The circular DNA genome of BCTV-CFH consists of 2,927 nucleotides and shares extensive sequence homology with the biologically distinct California strain of BCTV. Analysis of the CFH nucleotide sequence indicated that the rightward open reading frames (ORFs) R1, R2, and R3 are highly conserved (> 95% amino acid chemical similarity) in the CFH and California strains, although CFH ORF R2 was extended by 24 carboxy-terminal amino acid residues not present in the California strain. The CFH leftward ORFs L1, L2, and L3 shared varying levels of amino acid chemical similarity with the corresponding ORFs of the California strain (78.8, 66.5, and 86.7%, respectively). CFH ORF L4 was the least conserved ORF present in both strains, encoding a 9.9-kDa protein of 87 amino acid residues, which shares 57.6% chemical similarity with 85 carboxy-terminal amino acid residues of the 19.4-kDa ORF L4 of the California strain. The CFH DNA sequence also contained a unique 12.5-kDa ORF (R4); however, there is no evidence to suggest that R4 is expressed. Comparison of the CFH and California strain nucleotide sequences indicates that certain regions of the BCTV genome have diverged, and this divergence may account for differences in the pathogenic properties of the two strains. PMID:8167369

  11. Phylogeny of immunoglobulin heavy chain isotypes: structure of the constant region of Ambystoma mexicanum upsilon chain deduced from cDNA sequence.

    PubMed

    Fellah, J S; Kerfourn, F; Wiles, M V; Schwager, J; Charlemagne, J

    1993-01-01

    An RNA polymerase chain reaction strategy was used to amplify and clone a cDNA segment encoding the complete constant part of the axolotl IgY heavy (C upsilon) chain. C upsilon is 433 amino acids long and organized into four domains (C upsilon 1-C upsilon 4); each has the typical internal disulfide bond and invariant tryptophan residues. Axolotl C upsilon is most closely related to Xenopus C upsilon (40% identical amino acid residues) and C upsilon 1 shares 46.4% amino acid residues among these species. The presence of additional cysteines in C upsilon 1 and C upsilon 2 domains is consistent with an additional intradomain S-S bond similar to that suggested for Xenopus C upsilon and C chi, and for the avian C upsilon and the human C epsilon. C upsilon 4 ends with the Gly-Lys dipeptide characteristic of secreted mammalian C gamma 3, human C epsilon 4, and avian and anuran C upsilon 4, and contains the consensus [G/GT(AA)] nucleotide splice signal sequence for joining C upsilon 4 to the transmembrane region. These results are consistent with the hypothesis of an ancestral structural relationship between amphibian, avian upsilon chains, and mammalian epsilon chains. However, these molecules have different biological properties: axolotl IgY is secretory Ig, anuran and avian IgY behave like mammalian IgG, and mammalian IgE is implicated in anaphylactic reactions. PMID:8344718

  12. The nucleotide sequence at the 5' end of foot and mouth disease virus RNA.

    PubMed Central

    Harris, T J

    1979-01-01

    Foot and mouth disease virus RNA has been treated with RNase H in the presence of oligo (dG) specifically to digest the poly(C) tract which lies near the 5' end of the molecule (10). The short (S) fragment containing the 5' end of the RNA was separated from the remainder of the RNA (L fragment) by gel electrophoresis. RNA ligase mediated labelling of the 3' end of S fragment showed that the RNase H digestion gave rise to molecules that differed only in the number of cytidylic acid residues remaining at their 3' ends and did not leave the unique 3' end necessary for fast sequence analysis. As the 5' end of S fragment prepared from virus RNA is blocked by VPg, S fragment was prepared from virus specific messenger RNA which does not contain this protein. This RNA was labelled at the 5' end using polynucleotide kinase and the sequence of 70 nucleotides at the 5' end determined by partial enzyme digestion sequencing on polyacrylamide gels. Some of this sequence was confirmed from an analysis of the oligonucleotides derived by RNase T1 digestion of S fragment. The sequence obtained indicates that there is a stable hairpin loop at the 5' terminus of the RNA before an initiation codon 33 nucleotides from the 5' end. In addition, the RNase T1 analysis suggests that there are short repeated sequences in S fragment and that an eleven nucleotide inverted complementary repeat of a sequence near the 3' end of the RNA is present at the junction of S fragment and the poly(C) tract. Images PMID:231762

  13. The complete cDNA sequence of laminin alpha 4 and its relationship to the other human laminin alpha chains.

    PubMed

    Richards, A; Al-Imara, L; Pope, F M

    1996-06-15

    We previously localised the gene (LAMA4) encoding a novel laminin alpha 4 chain to chromosome 6q21. In this study, we describe the complete coding sequence and compare the protein with the other three known human laminin alpha chains. Although closely linked to LAMA2, the LAMA4 product most closely resembles laminin alpha 3, a constituent of laminin 5. Like laminin alpha 3A, the alpha 4 chain is a truncated version of the alpha 1 and alpha 2 chains, with a much reduced short arm. While the alpha 4 molecule is most similar to alpha 3, it shares some features of the C-terminal domains G4 and G5 in common with alpha 2. Unlike the LAMA3 gene, LAMA4 appears to encode only a single transcript, as determined by 5' rapid amplification of cDNA ends. The cDNA sequence encodes 1816 amino acids, which include a 24-residue signal peptide. The gene is expressed in skin, placenta, heart, lung, skeletal muscle, and pancreas. We have also shown that the mRNA can be readily reverse transcribed and amplified from cultured dermal fibroblasts. PMID:8706685

  14. Complete nucleotide sequence of the nucleoprotein gene of influenza B virus.

    PubMed Central

    Londo, D R; Davis, A R; Nayak, D P

    1983-01-01

    A DNA copy of influenza B/Singapore/222/79 viral RNA segment 5, containing the gene coding for the nucleoprotein (NP), has been cloned in Escherichia coli plasmid pBR322, and its nucleotide sequence has been determined. The influenza B NP gene contains 1,839 nucleotides and codes for a protein of 560 amino acids with a molecular weight of 61,593. Comparison of the influenza B NP amino acid sequence with that of influenza A NP (A/PR/8/34) reveals 37% direct homology in the aligned regions, indicating a common ancestor. However, influenza B NP has an additional 50 amino acids at its N-terminal end. As is the case with influenza A NP, influenza B NP is a basic protein, with its charged residues relatively evenly distributed rather than clustered. The structural homology suggests functional similarity between the NP of influenza A and B viruses. PMID:6688639
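
    The figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below is only an illustration: it assumes a single stop codon and uses the common rule of thumb of roughly 110 Da per amino acid residue. Neither assumption comes from the abstract, and the function names are hypothetical.

```python
# Rule-of-thumb average residue mass in daltons (an assumption, not from the abstract).
AVG_RESIDUE_DA = 110.0


def coding_length_nt(n_residues, stop_codons=1):
    """Nucleotides needed to encode n_residues plus the stop codon(s)."""
    return 3 * (n_residues + stop_codons)


def approx_mass_da(n_residues, avg_residue_da=AVG_RESIDUE_DA):
    """Rough protein mass estimate from residue count alone."""
    return n_residues * avg_residue_da


# Values reported in the abstract.
gene_nt = 1839
np_residues = 560
reported_mw = 61593

coding_nt = coding_length_nt(np_residues)    # 3 * 561 = 1683
noncoding_nt = gene_nt - coding_nt           # nucleotides left for untranslated regions
estimate = approx_mass_da(np_residues)       # 560 * 110 = 61600.0

print(coding_nt, noncoding_nt, estimate)
```

    Under these assumptions the rough mass estimate lands within about 10 Da of the reported molecular weight, and 156 of the 1,839 nucleotides lie outside the coding region.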

  15. Quadfinder: server for identification and analysis of quadruplex-forming motifs in nucleotide sequences

    PubMed Central

    Scaria, Vinod; Hariharan, Manoj; Arora, Amit; Maiti, Souvik

    2006-01-01

    G-quadruplex secondary structures, which play a structural role in repetitive DNA such as telomeres, may also play a functional role at other genomic locations as targetable regulatory elements which control gene expression. The recent interest in application of quadruplexes in biological systems prompted us to develop a tool for the identification and analysis of quadruplex-forming nucleotide sequences especially in the RNA. Here we present Quadfinder, an online server for prediction and bioinformatics of uni-molecular quadruplex-forming nucleotide sequences. The server is designed to be user-friendly and needs minimal intervention by the user, while providing flexibility of defining the variants of the motif. The server is freely available at URL . PMID:16845097
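
    As an illustration of what a uni-molecular quadruplex motif search can look like, here is a minimal sketch. It assumes the widely used pattern of four runs of at least three guanines separated by loops of 1 to 7 nucleotides; the abstract does not state Quadfinder's actual defaults or algorithm, so the function name and parameters below are hypothetical.

```python
import re


def find_quadruplex_motifs(seq, min_g=3, loop_min=1, loop_max=7):
    """Return (start, end, motif) tuples for non-overlapping putative
    quadruplex-forming motifs in a DNA or RNA sequence.

    Assumed motif: G{min_g,} loop G{min_g,} loop G{min_g,} loop G{min_g,},
    where each loop is loop_min..loop_max nucleotides long.
    """
    g_run = f"G{{{min_g},}}"
    loop = f"[ACGTU]{{{loop_min},{loop_max}}}"
    pattern = re.compile(f"(?:{g_run}{loop}){{3}}{g_run}")
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(seq.upper())]


# A telomere-like RNA repeat with four GGG runs and UUA loops.
hits = find_quadruplex_motifs("aaGGGUUAGGGUUAGGGUUAGGGaa")
print(hits)
```

    Exposing the run length and loop bounds as parameters mirrors the abstract's point about letting the user define variants of the motif.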

  16. Multimodal phylogeny for taxonomy: integrating information from nucleotide and amino acid sequences.

    PubMed

    Bicego, Manuele; Dellaglio, Franco; Felis, Giovanna E

    2007-10-01

    The crucial role played by the analysis of microbial diversity in biotechnology-based innovations has increased interest in microbial taxonomy research. Phylogenetic sequence analyses have contributed significantly to advances in this field, particularly in view of the large amount of sequence data collected in recent years. Phylogenetic analyses can be based on protein-encoding nucleotide sequences or on the encoded amino acid sequences: the two approaches have different characteristics, although they start from two alternative representations of the same information. This complementarity could be exploited to achieve a multimodal phylogenetic scheme that integrates gene and protein information to produce a single final tree. This aspect has been poorly addressed in the literature. In this paper, we propose to integrate the two phylogenetic analyses using basic schemes derived from multimodality fusion theory (or multiclassifier systems theory), a well-founded and rigorous branch whose power has already been demonstrated in other pattern recognition contexts. The proposed approach can be applied to distance matrix-based phylogenetic techniques (such as neighbor joining), resulting in a simple and fast method. The proposed methodology has been tested in a real case involving sequences of some species of lactic acid bacteria. With this dataset, both nucleotide sequence- and amino acid sequence-based phylogenetic analyses present some drawbacks, which are overcome with the multimodal analysis. PMID:17933011
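
    One basic fusion scheme from multiclassifier theory is the mean rule. The sketch below applies it to two pairwise distance matrices before any tree building; it is an assumption about how such a fusion could be done, since the abstract does not say which rules or normalisation the authors used. The matrices and function names are hypothetical.

```python
def normalise(matrix):
    """Scale a distance matrix so its largest entry is 1 (left unchanged if all zeros)."""
    peak = max(max(row) for row in matrix)
    if peak == 0:
        return [list(row) for row in matrix]
    return [[value / peak for value in row] for row in matrix]


def fuse_mean(d_nucleotide, d_amino_acid):
    """Mean-rule fusion: average the two normalised matrices entry by entry."""
    a = normalise(d_nucleotide)
    b = normalise(d_amino_acid)
    return [[(x + y) / 2 for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# Hypothetical distances among three taxa, on different scales.
d_nuc = [[0, 2, 4],
         [2, 0, 4],
         [4, 4, 0]]
d_aa = [[0, 10, 10],
        [10, 0, 5],
        [10, 5, 0]]

fused = fuse_mean(d_nuc, d_aa)
print(fused)
```

    The fused matrix can then be passed to any distance-based method such as neighbor joining. Normalising first keeps one data type from dominating simply because its distances are on a larger scale.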

  17. Common nucleotide sequence of structural gene encoding fibroblast growth factor 4 in eight cattle derived from three breeds.

    PubMed

    Sato, Sho; Takahashi, Toshikiyo; Nishinomiya, Hiroshi; Katoh, Makiko; Itoh, Ryu; Yokoo, Masaki; Yokoo, Mari; Iha, Momoe; Mori, Yuki; Kasuga, Kano; Kojima, Ikuo; Kobayashi, Masayuki

    2012-03-01

    Fibroblast growth factor 4 (FGF4) is considered a crucial gene for the proper development of bovine embryos. However, the complete nucleotide sequences of the structural genes encoding FGF4 in identified breeds are still unknown. In the present study, direct sequencing of PCR products derived from genomic DNA samples obtained from three Japanese Black, two Japanese Shorthorn and three Holstein cattle revealed that the nucleotide sequences of the structural gene encoding FGF4 matched completely among these eight cattle. On the other hand, differences in the nucleotide sequences, leading to substitutions, insertions or deletions of amino acid residues, were detected when compared with the already reported sequence from unidentified breeds. We cannot rule out the possibility that the structural gene elucidated in the present study is widely distributed in cattle. To the best of our knowledge, this is the first determination of the complete nucleotide sequence of the structural gene encoding bovine FGF4 in identified breeds. PMID:22435631

  18. Nucleotide sequence, heterologous expression and novel purification of DNA ligase from Bacillus stearothermophilus.

    PubMed

    Brannigan, J A; Ashford, S R; Doherty, A J; Timson, D J; Wigley, D B

    1999-07-13

    The gene for DNA ligase (EC 6.5.1.2) from the thermophilic bacterium Bacillus stearothermophilus NCA1503 has been cloned and the complete nucleotide sequence determined. The ligase gene encodes a protein 670 amino acids in length. The gene was overexpressed in Escherichia coli and the enzyme has been purified to homogeneity. Preliminary characterisation confirms that it is a thermostable, NAD(+)-dependent DNA ligase. PMID:10407164

  19. Complete Nucleotide Sequence of a French Isolate of Maize rough dwarf virus, a Fijivirus Member in the Family Reoviridae

    PubMed Central

    Svanella-Dumas, L.; Marais, A.; Faure, C.; Theil, S.; Thibord, J. B.

    2016-01-01

    The complete nucleotide sequence of a French isolate of Maize rough dwarf virus (MRDV) was determined by next-generation sequencing and compared with the single available complete sequence and with the partial sequences of two additional isolates available in online databases. PMID:27445367

  20. Sequences of cDNA Clones from Lygus lineolaris (Palisot de Beauvois) (Heteroptera: Miridae)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Eighteen sequences have been deposited to augment the expressed sequences of Lygus lineolaris in the National Center for Biotechnology Information database, GenBank. These sequences were obtained from laboratory reared specimens. Total RNA was extracted from specimens, and then pooled and used to ob...

  1. Nucleotide sequence of the hypervariable region of the human C2 gene

    SciTech Connect

    Zhu, Z.B.; Volanakis, J.V.

    1991-03-15

    It has been previously suggested that the multiallelic Bam HI/Sst I RFLPs of the human C2 gene arose through deletion/insertion of a tandemly-repeated minisatellite region. In this study the authors subcloned and sequenced the Sst I polymorphic fragment of the b haplotype of the C2 gene. This restriction fragment is 2,450 bp long and maps 1,550 bp 3' of exon 3. Its nucleotide sequence is characterized by the presence of at least 4 different repeated regions varying in size from 18 to 58 bp. One of these regions starting at position 1,413 is 48 bp long and is repeated five times. The first 3 repeats are in tandem and are separated by 72 bp from two additional tandem repeats. Sequence homology among the 5 repeats ranges between 93 and 98%. Eighty-three percent of the nucleotides of the repeated region are G or C. It seems likely that this nucleotide repeat resulted in the multiallelic RFLPs through a mechanism of unequal recombination or replication slippage.

  2. Nucleotide sequence variation of the envelope protein gene identifies two distinct genotypes of yellow fever virus.

    PubMed Central

    Chang, G J; Cropp, B C; Kinney, R M; Trent, D W; Gubler, D J

    1995-01-01

    The evolution of yellow fever virus over 67 years was investigated by comparing the nucleotide sequences of the envelope (E) protein genes of 20 viruses isolated in Africa, the Caribbean, and South America. Uniformly weighted parsimony algorithm analysis defined two major evolutionary yellow fever virus lineages designated E genotypes I and II. E genotype I contained viruses isolated from East and Central Africa. E genotype II viruses were divided into two sublineages: IIA viruses from West Africa and IIB viruses from America, except for a 1979 virus isolated from Trinidad (TRINID79A). Unique signature patterns were identified at 111 nucleotide and 12 amino acid positions within the yellow fever virus E gene by signature pattern analysis. Yellow fever viruses from East and Central Africa contained unique signatures at 60 nucleotide and five amino acid positions, those from West Africa contained unique signatures at 25 nucleotide and two amino acid positions, and viruses from America contained such signatures at 30 nucleotide and five amino acid positions in the E gene. The dissemination of yellow fever viruses from Africa to the Americas is supported by the close genetic relatedness of genotype IIA and IIB viruses and genetic evidence of a possible second introduction of yellow fever virus from West Africa, as illustrated by the TRINID79A virus isolate. The E protein genes of American IIB yellow fever viruses had higher frequencies of amino acid substitutions than did genes of yellow fever viruses of genotypes I and IIA on the basis of comparisons with a consensus amino acid sequence for the yellow fever E gene. The great variation in the E proteins of American yellow fever virus probably results from positive selection imposed by virus interaction with different species of mosquitoes or nonhuman primates in the Americas. PMID:7637022

  3. Cloning and nucleotide sequence of the gene encoding the EcaI DNA methyltransferase.

    PubMed Central

    Brenner, V; Venetianer, P; Kiss, A

    1990-01-01

    The gene coding for the GGTNACC specific EcaI DNA methyltransferase (M.EcaI) has been cloned in E. coli from Enterobacter cloacae and its nucleotide sequence has been determined. The ecaIM gene codes for a protein of 452 amino acids (Mr: 51,111). It was determined that M.EcaI is an adenine methyltransferase. M.EcaI shows limited amino acid sequence similarity to other adenine methyltransferases. A clone that expresses EcaI methyltransferase at high level was constructed. Images PMID:2183182

  4. Nucleotide sequencing and characterization of the genes encoding benzene oxidation enzymes of Pseudomonas putida

    SciTech Connect

    Irie, S.; Doi, S.; Yorifuji, T.; Takagi, M.; Yano, K.

    1987-11-01

    The nucleotide sequence of the genes from Pseudomonas putida encoding oxidation of benzene to catechol was determined. Five open reading frames were found in the sequence. Four corresponding protein molecules were detected by a DNA-directed in vitro translation system. Escherichia coli cells containing the fragment with the four open reading frames transformed benzene to cis-benzene glycol, which is an intermediate of the oxidation of benzene to catechol. The relation between the product of each cistron and the components of the benzene oxidation enzyme system is discussed.

  5. Nucleotide sequence of an exceptionally long 5.8S ribosomal RNA from Crithidia fasciculata.

    PubMed

    Schnare, M N; Gray, M W

    1982-03-25

    In Crithidia fasciculata, a trypanosomatid protozoan, the large ribosomal subunit contains five small RNA species (e, f, g, i, j) in addition to 5S rRNA [Gray, M.W. (1981) Mol. Cell. Biol. 1, 347-357]. The complete primary sequence of species i is shown here to be pAACGUGUmCGCGAUGGAUGACUUGGCUUCCUAUCUCGUUGA ... AGAmACGCAGUAAAGUGCGAUAAGUGGUApsiCAAUUGmCAGAAUCAUUCAAUUACCGAAUCUUUGAACGAAACGG ... CGCAUGGGAGAAGCUCUUUUGAGUCAUCCCCGUGCAUGCCAUAUUCUCCAmGUGUCGAA(C)OH. This sequence establishes that species i is a 5.8S rRNA, despite its exceptional length (171-172 nucleotides). The extra nucleotides in C. fasciculata 5.8S rRNA are located in a region whose primary sequence and length are highly variable among 5.8S rRNAs, but which is capable of forming a stable hairpin loop structure (the "G+C-rich hairpin"). The sequence of C. fasciculata 5.8S rRNA is no more closely related to that of another protozoan, Acanthamoeba castellanii, than it is to representative 5.8S rRNA sequences from the other eukaryotic kingdoms, emphasizing the deep phylogenetic divisions that seem to exist within the Kingdom Protista. PMID:7079176

  6. Nucleotide sequences of immunoglobulin eta genes of chimpanzee and orangutan: DNA molecular clock and hominoid evolution

    SciTech Connect

    Sakoyama, Y.; Hong, K.J.; Byun, S.M.; Hisajima, H.; Ueda, S.; Yaoita, Y.; Hayashida, H.; Miyata, T.; Honjo, T.

    1987-02-01

    To determine the phylogenetic relationships among hominoids and the dates of their divergence, the complete nucleotide sequences of the constant region of the immunoglobulin eta-chain (C eta 1) genes from chimpanzee and orangutan have been determined. These sequences were compared with the human eta-chain constant-region sequence. A molecular clock (silent molecular clock), measured by the degree of sequence divergence at the synonymous (silent) positions of protein-encoding regions, was introduced for the present study. From the comparison of nucleotide sequences of alpha 1-antitrypsin and beta- and delta-globulin genes between humans and Old World monkeys, the silent molecular clock was calibrated: the mean evolutionary rate of silent substitution was determined to be 1.56 x 10^-9 substitutions per site per year. Using the silent molecular clock, the mean divergence dates of chimpanzee and orangutan from the human lineage were estimated as 6.4 +/- 2.6 million years and 17.3 +/- 4.5 million years, respectively. It was also shown that the evolutionary rate of primate genes is considerably slower than those of other mammalian genes.
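
    The way a calibrated clock turns sequence divergence into a date can be sketched as follows. The sketch assumes the standard relation t = K / (2r), where K is the number of silent substitutions per site separating two lineages and the factor of 2 reflects substitutions accumulating along both branches. The abstract reports the rate and the resulting dates but not the K values, so the divergence of 0.02 used below is a hypothetical input chosen for illustration.

```python
# Silent substitution rate reported in the abstract (substitutions per site per year).
SILENT_RATE = 1.56e-9


def divergence_time_years(k_silent, rate=SILENT_RATE):
    """Divergence time from silent-site divergence, assuming t = K / (2 * rate)."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return k_silent / (2.0 * rate)


# Hypothetical silent-site divergence between two lineages (not from the abstract).
k_example = 0.02
t_example = divergence_time_years(k_example)
print(f"{t_example / 1e6:.2f} million years")  # prints "6.41 million years"
```

    Under these assumptions, a silent divergence of about 2% corresponds to roughly 6.4 million years, the same order as the human-chimpanzee date quoted above.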

  7. Cloning and nucleotide sequence of the Salmonella typhimurium dcp gene encoding dipeptidyl carboxypeptidase.

    PubMed Central

    Hamilton, S; Miller, C G

    1992-01-01

    Plasmids carrying the Salmonella typhimurium dcp gene were isolated from a pBR328 library of Salmonella chromosomal DNA by screening for complementation of a peptide utilization defect conferred by a dcp mutation. Strains carrying these plasmids overproduced dipeptidyl carboxypeptidase approximately 50-fold. The nucleotide sequence of a 2.8-kb region of one of these plasmids contained an open reading frame coding for a protein of 77,269 Da, in agreement with the 80-kDa size for dipeptidyl carboxypeptidase (determined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis and gel filtration). The N-terminal amino acid sequence of dipeptidyl carboxypeptidase purified from an overproducer strain agreed with that predicted by the nucleotide sequence. Northern (RNA) blot data indicated that dcp is not cotranscribed with other genes, and primer extension analysis showed the start of transcription to be 22 bases upstream of the translational start. The amino acid sequence of dcp was not similar to that of a mammalian dipeptidyl carboxypeptidase, angiotensin I-converting enzyme, but showed striking similarities to the amino acid sequence of another S. typhimurium peptidase encoded by the opdA (formerly optA) gene. Images PMID:1537804

  8. Remote access to ACNUC nucleotide and protein sequence databases at PBIL.

    PubMed

    Gouy, Manolo; Delmotte, Stéphane

    2008-04-01

    The ACNUC biological sequence database system provides powerful and fast query and extraction capabilities to a variety of nucleotide and protein sequence databases. The collection of ACNUC databases served by the Pôle Bio-Informatique Lyonnais includes the EMBL, GenBank, RefSeq and UniProt nucleotide and protein sequence databases and a series of other sequence databases that support comparative genomics analyses: HOVERGEN and HOGENOM containing families of homologous protein-coding genes from vertebrate and prokaryotic genomes, respectively; Ensembl and Genome Reviews for analyses of prokaryotic and of selected eukaryotic genomes. This report describes the main features of the ACNUC system and the access to ACNUC databases from any internet-connected computer. Such access was made possible by the definition of a remote ACNUC access protocol and the implementation of Application Programming Interfaces between the C, Python and R languages and this communication protocol. Two retrieval programs for ACNUC databases, Query_win, with a graphical user interface and raa_query, with a command line interface, are also described. Altogether, these bioinformatics tools provide users with either ready-to-use means of querying remote sequence databases through a variety of selection criteria, or a simple way to endow application programs with an extensive access to these databases. Remote access to ACNUC databases is open to all and fully documented (http://pbil.univ-lyon1.fr/databases/acnuc/acnuc.html). PMID:17825976

  9. Sphingomyelinase D from venoms of Loxosceles spiders: evolutionary insights from cDNA sequences and gene structure.

    PubMed

    Binford, Greta J; Cordes, Matthew H J; Wells, Michael A

    2005-04-01

    Loxosceles spider venoms cause dermonecrosis in mammalian tissues. The toxin sphingomyelinase D (SMaseD) is a sufficient causative agent in lesion formation and is only known in these spiders and a few pathogenic bacteria. Similarities between spider and bacterial SMaseD in molecular weights, pIs and N-terminal amino acid sequence suggest an evolutionary relationship between these molecules. We report three cDNA sequences from venom-expressed mRNAs, analyses of amino acid sequences, and partial characterization of gene structure of SMaseD homologs from Loxosceles arizonica with the goal of better understanding the evolution of this toxin. Sequence analyses indicate SMaseD is a single domain protein and a divergent member of the ubiquitous, broadly conserved glycerophosphoryl diester phosphodiesterase family (GDPD). Bacterial SMaseDs are not identifiable as homologs of spider SMaseD or GDPD family members. Amino acid sequence similarities do not afford clear distinction between independent origin of toxic SMaseD activity in spiders and bacteria and origin in one lineage by ancient horizontal transfer from the other. The SMaseD genes span at least 6500 bp and contain at least 5 introns. Together, these data indicate L. arizonica SMaseD has been evolving within a eukaryotic genome for a long time, ruling out origin by recent transfer from bacteria. PMID:15777950

  10. Analysis of xylem formation in pine by cDNA sequencing

    NASA Technical Reports Server (NTRS)

    Allona, I.; Quinn, M.; Shoop, E.; Swope, K.; St Cyr, S.; Carlis, J.; Riedl, J.; Retzel, E.; Campbell, M. M.; Sederoff, R.; Whetten, R. W.; Davies, E. (Principal Investigator)

    1998-01-01

    Secondary xylem (wood) formation is likely to involve some genes expressed rarely or not at all in herbaceous plants. Moreover, environmental and developmental stimuli influence secondary xylem differentiation, producing morphological and chemical changes in wood. To increase our understanding of xylem formation, and to provide material for comparative analysis of gymnosperm and angiosperm sequences, ESTs were obtained from immature xylem of loblolly pine (Pinus taeda L.). A total of 1,097 single-pass sequences were obtained from 5' ends of cDNAs made from gravistimulated tissue from bent trees. Cluster analysis detected 107 groups of similar sequences, ranging in size from 2 to 20 sequences. A total of 361 sequences fell into these groups, whereas 736 sequences were unique. About 55% of the pine EST sequences show similarity to previously described sequences in public databases. About 10% of the recognized genes encode factors involved in cell wall formation. Sequences similar to cell wall proteins, most known lignin biosynthetic enzymes, and several enzymes of carbohydrate metabolism were found. A number of putative regulatory proteins also are represented. Expression patterns of several of these genes were studied in various tissues and organs of pine. Sequencing novel genes expressed during xylem formation will provide a powerful means of identifying mechanisms controlling this important differentiation pathway.

  11. Nucleotide sequence analysis of a cloned DNA fragment from human cells reveals homology to retrotransposons.

    PubMed Central

    Flügel, R M; Maurer, B; Bannert, H; Rethwilm, A; Schnitzler, P; Darai, G

    1987-01-01

    During molecular cloning of proviral DNA of human spumaretrovirus, various recombinant clones were established and analyzed. Blot hybridization revealed that one of the recombinant plasmids had the characteristic features of a member of the long interspersed repetitive sequences family. The DNA element was analyzed by restriction mapping and nucleotide sequencing. It showed a high degree of amino acid sequence homology of 54.3% when compared with the 5'-terminal part of the pol gene product of the murine retrotransposon L1Md. The 3' region of the cloned DNA element encodes proteins with an even higher degree of homology of 67.4% in comparison to the corresponding parts of a member of the primate KpnI sequence family. Images PMID:3031462

  12. Large-scale detection and application of expressed sequence tag single nucleotide polymorphisms in Nicotiana.

    PubMed

    Wang, Y; Zhou, D; Wang, S; Yang, L

    2015-01-01

    Single nucleotide polymorphisms (SNPs) are widespread in the Nicotiana genome. Using an alignment and variation detection method, we developed 20,607,973 SNPs, based on the expressed sequence tag sequences of 10 Nicotiana species. The replacement rate was much higher than the transversion rate in the SNPs, and SNPs are widespread across Nicotiana. In vitro verification indicated that all of the SNPs were high quality and accurate. Evolutionary relationships between 15 varieties were investigated by polymerase chain reaction with a special primer; the specific 302 locus of these sequence results clearly indicated the origin of Zhongyan 100. A database of Nicotiana SNPs (NSNP) was developed to store and search for SNPs in Nicotiana. NSNP is a tool for researchers to develop SNP markers from sequence data. PMID:26214460
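
    The comparison of substitution classes mentioned in this abstract can be illustrated with a small classifier. The sketch assumes that the "replacement rate" refers to transitions (purine to purine or pyrimidine to pyrimidine changes), which the abstract does not state explicitly. The SNP list is hypothetical and the function names are illustrative only.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_snp(ref, alt):
    """Label a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def ti_tv_ratio(snps):
    """Ratio of transitions to transversions over (ref, alt) pairs."""
    counts = Counter(classify_snp(ref, alt) for ref, alt in snps)
    if counts["transversion"] == 0:
        raise ValueError("no transversions observed; ratio undefined")
    return counts["transition"] / counts["transversion"]


# Hypothetical SNP calls: three transitions and one transversion.
example_snps = [("A", "G"), ("C", "T"), ("T", "C"), ("A", "C")]
ratio = ti_tv_ratio(example_snps)
print(ratio)
```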

  13. Escherichia coli thymidylate kinase: molecular cloning, nucleotide sequence, and genetic organization of the corresponding tmk locus.

    PubMed Central

    Reynes, J P; Tiraby, M; Baron, M; Drocourt, D; Tiraby, G

    1996-01-01

    Thymidylate kinase (dTMP kinase; EC 2.7.4.9) catalyzes the phosphorylation of dTMP to form dTDP in both de novo and salvage pathways of dTTP synthesis. The nucleotide sequence of the tmk gene encoding this essential Escherichia coli enzyme is the last one among all the E. coli nucleoside and nucleotide kinase genes which has not yet been reported. By subcloning the 24.0-min region where the tmk gene has been previously mapped from the lambda phage 236 (E9G1) of the Kohara E. coli genomic library (Y. Kohara, K. Akiyama, and K. Isono, Cell 50:495-508, 1987), we precisely located tmk between acpP and holB genes. Here we report the nucleotide sequence of tmk, including the end portion of an upstream open reading frame (ORF 340) of unknown function that may be cotranscribed with the pabC gene. The tmk gene was located clockwise of and just upstream of the holB gene. Our sequencing data allowed the filling in of the unsequenced gap between the acpP and holB genes within the 24-min region of the E. coli chromosome. Identification of this region as the E. coli tmk gene was confirmed by functional complementation of a yeast dTMP kinase temperature-sensitive mutant and by in vitro enzyme assay of the thymidylate kinase activity in cell extracts of E. coli by use of tmk-overproducing plasmids. The deduced amino acid sequence of the E. coli tmk gene showed significant similarity to the sequences of the thymidylate kinases of vertebrates, yeasts, and viruses as well as two uncharacterized proteins of bacteria belonging to Bacillus and Haemophilus species. PMID:8631667

  14. Nucleotide sequence and characterization of the pyrF operon of Escherichia coli K12.

    PubMed

    Turnbough, C L; Kerr, K H; Funderburg, W R; Donahue, J P; Powell, F E

    1987-07-25

    The pyrF gene of Escherichia coli K12, which encodes the pyrimidine biosynthetic enzyme orotidine-5'-monophosphate (OMP) decarboxylase, is part of an operon that includes a downstream gene designated orfF. The orfF gene product is a small polypeptide of unknown function. The nucleotide sequence of a 1549-base pair chromosomal fragment containing this operon was determined. An open reading frame capable of encoding the 27-kDa OMP decarboxylase subunit was identified and shown to be the pyrF structural gene by purifying and characterizing OMP decarboxylase. The subunit molecular weight (Mr = 26,350), amino-terminal amino acid sequence, and amino acid composition of the polypeptide predicted from the nucleotide sequence are in excellent agreement with those properties determined for the purified enzyme. The orfF structural gene was tentatively identified and apparently encodes an 11,396-dalton polypeptide. The orfF translational initiation codon overlaps the pyrF termination codon, which may indicate translational coupling in the expression of these genes. The pyrF promoter was mapped by primer extension of in vivo transcripts. The primary transcriptional initiation site is 51 base pairs upstream of the pyrF structural gene. The level of pyrF transcription and OMP decarboxylase synthesis was found to be coordinately derepressed by pyrimidine limitation, indicating that regulation of pyrF gene expression occurs at the transcriptional level. Inspection of the nucleotide sequence indicates that pyrF gene expression is not regulated by an attenuation control mechanism similar to that described for the pyrBI operon or pyrE gene. Finally, we compared the amino acid sequences of the OMP decarboxylases from E. coli, Saccharomyces cerevisiae, Neurospora crassa, and Ehrlich ascites cells to identify conserved regions. PMID:2956254

  15. Nucleotide sequence analysis of the hypervariable region III of mitochondrial DNA in Thais.

    PubMed

    Thongngam, Punlop; Leewattanapasuk, Worraanong; Bhoopat, Tanin; Sangthong, Padchanee

    2016-07-01

    This study analyzed the nucleotide sequences of the hypervariable region III (HVRIII) of mitochondrial DNA in Thai individuals. Buccal swab samples were randomly obtained from 100 healthy, unrelated, adult (18-60 years old), volunteer donors living in Thailand. Eighteen different haplotypes were found, of which 11 haplotypes were unique. The most frequent haplotypes observed were 522D-523D. Nucleotide transition from Thymine (T) to Cytosine (C) at position 489 (43%) was the most frequent substitution. Nucleotide transversions were also observed at position 433 (Adenine (A) to C, 1%) and position 499 (Guanine (G) to C, 1%). Fifty-three samples presented nucleotide insertion and deletion of C and A (CA) at position 514-523. Insertion of 1AC (3%) and 2AC (2%) were observed. Deletion of 1CA (53%) and 2CA (2%) at position 514-523 were revealed. The deletion of T at position 459 was observed. The haplotype diversity, random match probability, and discrimination power were calculated to be 0.7770, 0.2308, and 0.7692, respectively. PMID:27107562
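
    The three summary statistics at the end of this abstract are related by simple formulas. The sketch below assumes the usual forensic definitions: random match probability as the sum of squared haplotype frequencies, discrimination power as one minus that sum, and haplotype diversity as n(1 - sum of squared frequencies)/(n - 1). The abstract does not spell these out or give per-haplotype counts, so the counts in the example are hypothetical.

```python
def random_match_probability(counts):
    """Sum of squared haplotype frequencies."""
    n = sum(counts)
    return sum((c / n) ** 2 for c in counts)


def discrimination_power(counts):
    """Probability that two random individuals differ: 1 - RMP."""
    return 1.0 - random_match_probability(counts)


def haplotype_diversity(counts):
    """Sample-size-corrected diversity: n / (n - 1) * (1 - sum of p_i squared)."""
    n = sum(counts)
    return n / (n - 1) * discrimination_power(counts)


def diversity_from_rmp(rmp, n):
    """Recover haplotype diversity from a reported RMP and sample size."""
    return n / (n - 1) * (1.0 - rmp)


# Hypothetical toy sample: one haplotype seen twice, two seen once.
toy_counts = [2, 1, 1]
print(random_match_probability(toy_counts))   # 0.375
print(discrimination_power(toy_counts))       # 0.625

# Consistency check against the reported values (n = 100, RMP = 0.2308).
print(round(diversity_from_rmp(0.2308, 100), 4))
```

    With n = 100 and the reported random match probability of 0.2308, these definitions give a discrimination power of 0.7692 and a haplotype diversity that rounds to 0.7770, matching the abstract.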

  16. Nucleotide sequence variation of chitin synthase genes among ectomycorrhizal fungi and its potential use in taxonomy.

    PubMed Central

    Mehmann, B; Brunner, I; Braus, G H

    1994-01-01

    DNA sequences of single-copy genes coding for chitin synthases (UDP-N-acetyl-D-glucosamine:chitin 4-beta-N-acetylglucosaminyltransferase; EC 2.4.1.16) were used to characterize ectomycorrhizal fungi. Degenerate primers deduced from short, completely conserved amino acid stretches flanking a region of about 200 amino acids of zymogenic chitin synthases allowed the amplification of DNA fragments of several members of this gene family. Different DNA band patterns were obtained from basidiomycetes because of variation in the number and length of amplified fragments. Cloning and sequencing of the most prominent DNA fragments revealed that these differences were due to various introns at conserved positions. The presence of introns in basidiomycetous fungi therefore has a potential use in identification of genera by analyzing PCR-generated DNA fragment patterns. Analyses of the nucleotide sequences of cloned fragments revealed variations in nucleotide sequences from 4 to 45%. By comparison of the deduced amino acid sequences, the majority of the DNA fragments were identified as members of genes for chitin synthase class II. The deduced amino acid sequences from species of the same genus differed only in one amino acid residue, whereas identity between the amino acid sequences of ascomycetous and basidiomycetous fungi within the same taxonomic class was found to be approximately 43 to 66%. Phylogenetic analysis of the amino acid sequence of class II chitin synthase-encoding gene fragments by using parsimony confirmed the current taxonomic groupings. In addition, our data revealed a fourth class of putative zymogenic chitin synthesis. PMID:7944356

  17. CodingMotif: exact determination of overrepresented nucleotide motifs in coding sequences

    PubMed Central

    2012-01-01

    Background: It has been increasingly appreciated that coding sequences harbor regulatory sequence motifs in addition to encoding protein. These sequence motifs are expected to be overrepresented in nucleotide sequences bound by a common protein or small RNA. However, detecting overrepresented motifs has been difficult because of interference by constraints at the protein level. Sampling-based approaches to solve this problem based on codon-shuffling have been limited to exploring only an infinitesimal fraction of the sequence space and by their use of parametric approximations. Results: We present a novel O(N (log N)^2)-time algorithm, CodingMotif, to identify nucleotide-level motifs of unusual copy number in protein-coding regions. Using a new dynamic programming algorithm we are able to exhaustively calculate the distribution of the number of occurrences of a motif over all possible coding sequences that encode the same amino acid sequence, given a background model for codon usage and dinucleotide biases. Our method takes advantage of the sparseness of loci where a given motif can occur, greatly speeding up the required convolution calculations. Knowledge of the distribution allows one to assess the exact non-parametric p-value of whether a given motif is over- or under-represented. We demonstrate that our method identifies known functional motifs more accurately than sampling and parametric-based approaches in a variety of coding datasets of various sizes, including ChIP-seq data for the transcription factors NRSF and GABP. Conclusions: CodingMotif provides a theoretically and empirically-demonstrated advance for the detection of motifs overrepresented in coding sequences. We expect CodingMotif to be useful for identifying motifs in functional genomic datasets such as DNA-protein binding, RNA-protein binding, or microRNA-RNA binding within coding regions. A software implementation is available at http://bioinformatics.bc.edu/chuanglab/codingmotif.tar PMID
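
    The quantity at the heart of this abstract, the distribution of motif counts over all coding sequences that encode the same peptide, can be illustrated by brute force for very short peptides. The sketch below is not the CodingMotif algorithm: it enumerates every synonymous sequence (exponential in peptide length) and assumes uniform codon usage with no dinucleotide bias, whereas the abstract describes a dynamic programming method with a background model. All function names are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def count_overlapping(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    return sum(seq.startswith(motif, i) for i in range(len(seq) - len(motif) + 1))


def motif_count_distribution(peptide, motif):
    """Exact distribution of motif counts over all synonymous coding sequences.

    Assumes every synonymous codon is equally likely (a simplification).
    """
    choices = [SYNONYMS[aa] for aa in peptide]
    counts = Counter(
        count_overlapping("".join(codons), motif) for codons in product(*choices)
    )
    total = sum(counts.values())
    return {k: v / total for k, v in sorted(counts.items())}


def upper_tail_p(dist, observed):
    """Exact p-value for seeing at least `observed` motif copies."""
    return sum(p for k, p in dist.items() if k >= observed)


# Peptide Met-Lys has two synonymous sequences: ATGAAA and ATGAAG.
dist = motif_count_distribution("MK", "AA")
p_value = upper_tail_p(dist, 2)
print(dist, p_value)
```

    Enumeration is only feasible for a handful of residues, which is the limitation the abstract's dynamic programming approach is said to overcome.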

  18. cDNA sequence and heterologous expression of monomeric spinach pullulanase: multiple isomeric forms arise from the same polypeptide.

    PubMed

    Renz, A; Schikora, S; Schmid, R; Kossmann, J; Beck, E

    1998-05-01

    The spinach pullulanase gene was cloned and sequenced using peptide sequences of the purified enzyme as a starting point and employing PCR techniques and cDNA library screening. Its open reading frame codes for a protein of 964 amino acids which represents a precursor of the pullulanase. The N-terminal transit peptide consists of 65 amino acids, and the mature protein, comprising 899 amino acids, has a calculated molecular mass of 99 kDa. Pullulanase is a member of the alpha-amylase family. In addition to a characteristic catalytic (beta/alpha)8-barrel domain, it contains a domain, F, that is specific for branching and debranching enzymes. Pullulanase cDNA was expressed in Escherichia coli, and the purified protein was compared with the enzyme from spinach leaves. Identity of the two proteins was confirmed in terms of catalytic properties, N-terminal amino acid sequences and molecular masses. The pullulanase produced by E. coli showed the same microheterogeneity as the spinach leaf enzyme: it could be resolved into two substrate-induced forms by electrophoresis in amylopectin-containing polyacrylamide gels, and, in the absence of substrate, into several free forms (charge isomers) by isoelectric focusing or chromatofocusing. Rechromatofocusing of single free forms resulted in the originally observed pattern of molecular forms. However, heterogeneity of the protein disappeared on isoelectric focusing under completely denaturing conditions when only one protein band was observed. Post-translational modifications such as glycosylation and phosphorylation could be excluded as potential explanations for the protein heterogeneity. Therefore the microheterogeneity of spinach leaf pullulanase results from neither genetic variation nor post-translational modifications, but is a property of the single unmodified gene product. The different interconvertible forms of the pullulanase represent protein populations of different tertiary structure of the same polypeptide. PMID

  19. cDNA sequence coding for the alpha'-chain of the third complement component in the African lungfish.

    PubMed

    Sato, A; Sültmann, H; Mayer, W E; Figueroa, F; Tichy, H; Klein, J

    1999-04-01

    cDNA clones coding for almost the entire C3 alpha-chain of the African lungfish (Protopterus aethiopicus), a representative of the Sarcopterygii (lobe-finned fishes), were sequenced and characterized. From the sequence it is deduced that the lungfish C3 molecule is probably a disulphide-bonded alpha:beta dimer similar to that of the C3 components of other jawed vertebrates. The deduced sequence contains conserved sites presumably recognized by proteolytic enzymes (e.g. factor I) involved in the activation and inactivation of the component. It also contains the conserved thioester region and the putative site for binding properdin. However, the site for the interaction with complement receptor 2 and factor H are poorly conserved. Either complement receptor 2 and factor H are not present in the lungfish or they bind to different residues at the same or a different site than mammalian complement receptor 2 and factor H. The C3 alpha-chain sequences faithfully reflect the phylogenetic relationships among vertebrate classes and can therefore be used to help to resolve the long-standing controversy concerning the origin of the tetrapods. PMID:10219761

  20. Evolution of tissue-specific keratins as deduced from novel cDNA sequences of the lungfish Protopterus aethiopicus.

    PubMed

    Schaffeld, Michael; Bremer, Miriam; Hunzinger, Christian; Markl, Jürgen

    2005-03-01

    Lungfishes are possibly the closest extant relatives of the land vertebrates (tetrapods). We report here the cDNA and predicted amino acid sequences of 13 different keratins (ten type I and three type II) of the lungfish Protopterus aethiopicus. These keratins include the orthologs of human K8 and K18. The lungfish keratins were also identified in tissue extracts using two-dimensional polyacrylamide gel electrophoresis, keratin blot binding assays and immunoblotting. The identified keratin spots were analyzed by peptide mass fingerprinting which assigned seven sequences (inclusively Protopterus K8 and K18) to their respective protein spot. The peptide mass fingerprints also revealed the fact that the major epidermal type I and type II keratins of this lungfish have not yet been sequenced. Nevertheless, phylogenetic trees constructed from multiple sequence alignments of keratins from lungfish and distantly related vertebrates such as lamprey, shark, trout, frog, and human reveal new insights into the evolution of K8 and K18, and unravel a variety of independent keratin radiation events. PMID:15819414

  1. Molecular cloning and sequence analysis of the cDNA encoding rat liver cysteine sulfinate decarboxylase (CSD).

    PubMed

    Reymond, I; Sergeant, A; Tappaz, M

    1996-06-01

    The taurine biosynthesis enzyme, cysteine sulfinate decarboxylase (CSD), was purified to homogeneity from rat liver. Three CSD peptides generated by tryptic cleavage were isolated and partially sequenced. Two of them showed a marked homology with glutamate decarboxylase and their respective positions on the CSD amino acid sequence were postulated accordingly. Using appropriate degenerate primers derived from these two peptides, a PCR amplified DNA fragment was generated from liver poly(A)+ mRNA, cloned and used as a probe to screen a rat liver cDNA library. Three cDNAs, each around 1800 bp in length, were isolated which all contained an open reading frame (ORF) encoding a 493 amino acid protein with a calculated molecular mass of 55.2 kDa close to the experimental values for CSD. The encoded protein contained the sequence of the three peptides isolated from homogeneous liver CSD. Our data confirm and significantly extend those recently published (Kaisaki et al. (1995) Biochim. Biophys. Acta 1262, 79-82). Indeed, an additional base pair found 1371 bp downstream from the initiation codon led to a shift in the open reading frame which extended the carboxy-terminal end by 15 amino acid residues and altogether modified 36 amino acids. The validity of this correction is supported by the finding that the corrected reading frame encoded a peptide issued from CSD tryptic cleavage that was not encoded anywhere in the CSD sequence previously reported. PMID:8679699

  2. The nucleotide sequence of an infectious clone of the geminivirus beet curly top virus.

    PubMed

    Stanley, J; Markham, P G; Callis, R J; Pinner, M S

    1986-08-01

    A number of infectious clones of a Californian isolate of the leafhopper-transmitted geminivirus beet curly top virus (BCTV) have been constructed from virus-specific double-stranded DNA isolated from infected Beta vulgaris and used to demonstrate a single component genome. The nucleotide sequence of one infectious clone has been determined (2993 nucleotides). Comparison with other geminiviruses has shown that the organisation of the genome closely resembles DNA 1 of the whitefly-transmitted members. The four conserved coding regions of DNA 1 have highly homologous counterparts in BCTV with the exception of the putative coat protein which is more closely related to those of the leafhopper-transmitted geminiviruses suggesting a strong interrelationship between coat protein and insect vector. A BCTV component equivalent to DNA 2 is not required for virus infection or transmission and has not been isolated from infected plants. PMID:16453696

  3. Porcine dentin matrix protein 1: gene structure, cDNA sequence, and expression in teeth

    PubMed Central

    Kim, Jung-Wook; Yamakoshi, Yasuo; Iwata, Takanori; Hu, Yuan Yuan; Zhang, Hengmin; Hu, Jan C.-C.; Simmer, James P.

    2015-01-01

    Dentin matrix protein 1 (DMP1) is an acidic non-collagenous protein that is necessary for the proper biomineralization of bone, cartilage, cementum, dentin, and enamel. Dentin matrix protein 1 is highly phosphorylated and potentially glycosylated, but there is no experimental data identifying which specific amino acids are modified. For the purpose of facilitating the characterization of DMP1 from pig, which has the advantage of large developing teeth for obtaining protein in quantity and extensive structural information concerning other tooth matrix proteins, we characterized the porcine DMP1 cDNA and gene structure, raised anti-peptide immunoglobulins that are specific for porcine DMP1, and detected DMP1 protein in porcine tooth extracts and histological sections. Porcine DMP1 has 510 amino acids, including a 16-amino acid signal peptide. The deduced molecular weight of the secreted, unmodified protein is 53.5 kDa. The protein has 93 serines and 12 threonines in the appropriate context for phosphorylation, and four asparagines in a context suitable for glycosylation. Dentin matrix protein 1 protein bands with apparent molecular weights between 30 and 45 kDa were observed in partially purified dentin extracts. In developing teeth, immunohistochemistry localized DMP1 in odontoblasts and the dentinal tubules of mineralized dentin and in ameloblasts, but not in the enamel matrix. PMID:16460339

  4. Nucleotide sequence analysis of the L gene of Newcastle disease virus: homologies with Sendai and vesicular stomatitis viruses.

    PubMed Central

    Yusoff, K; Millar, N S; Chambers, P; Emmerson, P T

    1987-01-01

    The nucleotide sequence of the L gene of the Beaudette C strain of Newcastle disease virus (NDV) has been determined. The L gene is 6704 nucleotides long and encodes a protein of 2204 amino acids with a calculated molecular weight of 248822. Mung bean nuclease mapping of the 5' terminus of the L gene mRNA indicates that the transcription of the L gene is initiated 11 nucleotides upstream of the translational start site. Comparison with the amino acid sequences of the L genes of Sendai virus and vesicular stomatitis virus (VSV) suggests that there are several regions of homology between the sequences. These data provide further evidence for an evolutionary relationship between the Paramyxoviridae and the Rhabdoviridae. A non-coding sequence of 46 nucleotides downstream of the presumed polyadenylation site of the L gene may be part of a negative strand leader RNA. PMID:3035486

  5. Developing Single Nucleotide Polymorphism (SNP) markers from transcriptome sequences for the identification of longan (Dimocarpus longan) germplasm

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Longan (Dimocarpus longan Lour.) is an important tropical fruit tree crop. Accurate varietal identification is essential for germplasm management and breeding. Using longan transcriptome sequences from public databases, we developed single nucleotide polymorphism (SNP) markers; validated 60 SNPs in...

  6. cDNA sequence of the horse (Equus caballus) LAMA3 gene and characterization of two intronic SNP markers.

    PubMed

    Milenkovic, Dragan; Mata, Xavier; Chadi, Sead; Guérin, Gérard

    2005-12-01

    Laminins are large heterotrimeric basement membrane glycoproteins composed of alpha, beta and gamma chains. The Laminin 5 isoform has an alpha3beta3gamma2 composition and is essential for the adhesion of basal keratinocytes to the underlying epithelial basement membrane where it is mainly located. Mutations in the genes coding for the 3 chains have been associated with a severe skin blistering disease, Herlitz's junctional epidermolysis bullosa (JEB), observed in different species such as man, dog, cat and horse. In this study, we report the sequence of the 5.2 kb horse laminin alpha 3 cDNA (LAMA3) as well as the detection of two intronic SNPs. These data will be useful to further identify causal mutations for the disease in this gene. PMID:16287627

  7. Molecular cloning and nucleotide sequence of the 1,2-alpha-D-mannosidase gene, msdS, from Aspergillus saitoi and expression of the gene in yeast cells.

    PubMed

    Inoue, T; Yoshida, T; Ichishima, E

    1995-12-01

    A full-length cDNA encoding 1,2-alpha-D-mannosidase (EC 3.2.1.113) from Aspergillus saitoi was cloned. Analysis of the 1718 bp nucleotide sequence of the cDNA revealed a single open reading frame of 1539 nucleotides for the 1,2-alpha-D-mannosidase gene, msdS. The predicted amino acid sequence of 1,2-alpha-D-mannosidase consists of 513 residues with a molecular mass of 55,767 and shares 70%, 26% and 35% identity with those of Penicillium citrinum 1,2-alpha-D-mannosidase, yeast alpha-mannosidase, and mouse alpha-mannosidase, respectively. The cDNA of the msdS gene has been cloned and expressed in yeast cells. To identify the activity of the expression product, methyl-2-O-alpha-mannopyranosyl-alpha-mannopyranoside (Man alpha 1-->2Man-OMe) was used as a substrate at pH 5.0. PMID:8519794

  8. Conservation of nucleotide sequences for molecular diagnosis of Middle East respiratory syndrome coronavirus, 2015.

    PubMed

    Furuse, Yuki; Okamoto, Michiko; Oshitani, Hitoshi

    2015-11-01

    Infection due to the Middle East respiratory syndrome coronavirus (MERS-CoV) is widespread. The present study was performed to assess the protocols used for the molecular diagnosis of MERS-CoV by analyzing the nucleotide sequences of viruses detected between 2012 and 2015, including sequences from the large outbreak in eastern Asia in 2015. Although the diagnostic protocols were established only 2 years ago, mismatches between the sequences of primers/probes and viruses were found for several of the assays. Such mismatches could lead to a lower sensitivity of the assay, thereby leading to false-negative diagnosis. A slight modification in the primer design is suggested. Protocols for the molecular diagnosis of viral infections should be reviewed regularly after they are established, particularly for viruses that pose a great threat to public health such as MERS-CoV. PMID:26432410
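
    The kind of primer-versus-virus comparison described in this abstract amounts to counting mismatched positions between an oligonucleotide and its target site. A minimal sketch follows, assuming plain A/C/G/T sequences on the same strand; it ignores degenerate IUPAC bases, reverse complements, and positional weighting near the 3' end, all of which a real assay review would have to consider. The sequences are hypothetical toy strings, not MERS-CoV primers.

```python
def count_mismatches(primer, site):
    """Number of positions at which primer and target site differ."""
    if len(primer) != len(site):
        raise ValueError("primer and site must have equal length")
    return sum(p != s for p, s in zip(primer.upper(), site.upper()))


def best_binding_site(primer, genome):
    """Slide the primer along the genome; return (position, mismatches) of the best window."""
    k = len(primer)
    scored = [
        (count_mismatches(primer, genome[i:i + k]), i)
        for i in range(len(genome) - k + 1)
    ]
    mismatches, position = min(scored)
    return position, mismatches


# Hypothetical example: the primer matches the window at offset 2 except for one base.
primer = "ACGTAC"
genome = "TTACGTTCGG"
position, mismatches = best_binding_site(primer, genome)
print(position, mismatches)
```

    Running such a check against each newly deposited genome is one simple way to act on the abstract's recommendation that diagnostic protocols be reviewed regularly.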

  9. Nucleotide sequence of the tcml gene (ribosomal protein L3) of Saccharomyces cerevisiae.

    PubMed Central

    Schultz, L D; Friesen, J D

    1983-01-01

    The yeast tcml gene, which codes for ribosomal protein L3, has been isolated by using recombinant DNA and genetic complementation. The DNA fragment carrying this gene has been subcloned and we have determined its DNA sequence. The 20 amino acid residues at the amino terminus as inferred from the nucleotide sequence agreed exactly with the amino acid sequence data. The amino acid composition of the encoded protein agreed with that determined for purified ribosomal protein L3. Codon usage in the tcml gene was strongly biased in the direction found for several other abundant Saccharomyces cerevisiae proteins. The tcml gene has no introns, which appears to be atypical of ribosomal protein structural genes. PMID:6305925

  10. Identification of shark species in seafood products by forensically informative nucleotide sequencing (FINS).

    PubMed

    Blanco, M; Pérez-Martín, R I; Sotelo, C G

    2008-11-12

    The identification of commercial shark species is a relevant issue to ensure the correct labeling of seafood products, to maintain consumer confidence in seafood, and to enhance the knowledge of the species and volumes that are at present being captured, thus improving the management of shark fisheries. The polymerase chain reaction was employed to obtain a 423 bp amplicon from the mitochondrial cytochrome b gene. The sequences from this fragment, belonging to 63 authentic individuals of 23 species, were analyzed using a genetic distance method. Nine different samples of commercial fresh, frozen, and convenience food were obtained in local and international markets to validate the methodology. These samples were analyzed, and sequences were employed for species identification, showing that forensically informative nucleotide sequencing (FINS) is a suitable technique for identification of processed seafood containing shark as an ingredient. The results also showed that incorrect labeling practices may occur regarding shark products, probably because of incorrect labeling at the production point. PMID:18831561
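
    Species assignment by a genetic distance method can be sketched as a nearest-reference lookup. The abstract does not say which distance measure was used; the example below assumes the simplest one, the uncorrected p-distance (proportion of differing sites) between pre-aligned sequences of equal length. Species labels and the 12-bp sequences are hypothetical placeholders, not cytochrome b data from the study.

```python
def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def identify(query, references):
    """Return (closest reference label, its p-distance to the query)."""
    distances = {label: p_distance(query, seq) for label, seq in references.items()}
    best = min(distances, key=distances.get)
    return best, distances[best]


# Hypothetical reference panel and query sequence.
references = {
    "species_A": "ATGGCCCTAAAC",
    "species_B": "ATGACTCTTAAT",
}
query = "ATGGCCCTTAAC"
label, distance = identify(query, references)
print(label, round(distance, 4))
```

    In practice a maximum-distance threshold would be needed so that a sample from a species missing from the reference panel is reported as unidentified rather than assigned to its nearest neighbour.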

  11. Murine cystathionine γ-lyase: complete cDNA and genomic sequences, promoter activity, tissue distribution and developmental expression

    PubMed Central

    2004-01-01

    Cystathionine γ-lyase (CSE) is the last key enzyme in the trans-sulphuration pathway for biosynthesis of cysteine from methionine. Cysteine could be provided through diet; however, CSE has been shown to be important for the adequate supply of cysteine to synthesize glutathione, a major intracellular antioxidant. With a view to determining physiological roles of CSE in mice, we report the sequence of a complete mouse CSE cDNA along with its associated genomic structure, generation of specific polyclonal antibodies, and the tissue distribution and developmental expression patterns of CSE in mice. A 1.8 kb full-length cDNA containing an open reading frame of 1197 bp, which encodes a 43.6 kDa protein, was isolated from adult mouse kidney. A 35 kb mouse genomic fragment was obtained by λ genomic library screening. It contained promoter regions, 12 exons, ranging in size from 53 to 579 bp, spanning over 30 kb, and exon/intron boundaries that were conserved with rat and human CSE. The GC-rich core promoter contained canonical TATA and CAAT motifs, and several transcription factor-binding consensus sequences. The CSE transcript, protein and enzymic activity were detected in liver, kidney, and, at much lower levels, in small intestine and stomach of both rats and mice. In developing mouse liver and kidney, the expression levels of CSE protein and activity gradually increased with age until reaching their peak value at 3 weeks of age, following which the expression levels in liver remained constant, whereas those in kidney decreased significantly. Immunohistochemical analyses revealed predominant CSE expression in hepatocytes and kidney cortical tubuli. These results suggest important physiological roles for CSE in mice. PMID:15038791

  12. Cloning and nucleotide sequence of the simian rotavirus gene 6 that codes for the major inner capsid protein.

    PubMed Central

    Estes, M K; Mason, B B; Crawford, S; Cohen, J

    1984-01-01

    The nucleotide sequence of the gene that codes for the major inner capsid protein of the simian rotavirus SA11 has been determined. A DNA copy of mRNA from gene 6 was cloned in the E. coli plasmid pBR322. The full-length gene is 1357 nucleotides long with a 5'-noncoding region of 23 nucleotides and a 3'-noncoding region of 140 nucleotides. The gene contains a single, long, open reading-frame of 1194 nucleotides capable of coding for a protein of 397 amino acids with a molecular weight of 44,816. The predicted protein product is relatively proline-rich with a net charge at neutral pH of -3.5. One stretch of 53 amino acids (encoded by nucleotides 327-485) is basic. PMID:6322125
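
    The lengths quoted in this abstract are mutually consistent, which can be checked with a few lines of arithmetic. The sketch uses only the numbers given above; the one assumption is that the 1194-nucleotide open reading frame includes the stop codon, which is what makes 397 residues come out exactly.

```python
# Figures taken from the abstract.
gene_length = 1357        # full-length gene 6, nucleotides
utr5 = 23                 # 5'-noncoding region
utr3 = 140                # 3'-noncoding region

# The coding region is whatever remains between the two noncoding regions.
orf_length = gene_length - utr5 - utr3

# Assumption: the ORF length counts the stop codon, so residues = codons - 1.
codons = orf_length // 3
residues = codons - 1

# Basic stretch encoded by nucleotides 327-485 (inclusive coordinates).
basic_nt = 485 - 327 + 1
basic_residues = basic_nt // 3

print(orf_length, residues, basic_residues)
```

    The same subtraction is a quick sanity check for any abstract that reports gene, UTR, and ORF lengths together.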

  13. Nucleotide sequence of the gene encoding the repressor for the histidine utilization genes of Pseudomonas putida.

    PubMed Central

    Allison, S L; Phillips, A T

    1990-01-01

    The hutC gene of Pseudomonas putida encodes a repressor which, in combination with the inducer urocanate, regulates expression of the five structural genes necessary for conversion of histidine to glutamate, ammonia, and formate. The nucleotide sequence of the hutC region was determined and found to contain two open reading frames which overlapped by one nucleotide. The first open reading frame (ORF1) appeared to encode a 27,648-dalton protein of 248 amino acids whose sequence strongly resembled that of the hut repressor of Klebsiella aerogenes (A. Schwacha and R. A. Bender, J. Bacteriol. 172:5477-5481, 1990) and contained a helix-turn-helix motif that could be involved in operator binding. The gene was preceded by a sequence which was nearly identical to that of the operator site located upstream of hutU which controls transcription of the hutUHIG genes. The operator near hutC would presumably allow the hut repressor to regulate its own synthesis as well as the expression of the divergent hutF gene. A second open reading frame (ORF2) would encode a 21,155-dalton protein, but because this region could be deleted with only a slight effect on repressor activity, it is not likely to be involved in repressor function or structure. PMID:2203753

  14. Trichinella spiralis thymidylate synthase: cDNA cloning and sequencing, and developmental pattern of mRNA expression.

    PubMed

    Dabrowska, M; Jagielska, E; Cieśla, J; Płucienniczak, A; Kwiatowski, J; Wranicz, M; Boireau, P; Rode, W

    2004-02-01

    The persistent expression of thymidylate synthase activity has previously been demonstrated not only in adult forms, but also in non-developing muscle larvae of Trichinella spiralis and T. pseudospiralis, pointing to an unusual pattern of cell cycle regulation, and prompting further studies on the developmental pattern of T. spiralis thymidylate synthase gene expression. The enzyme cDNA was cloned and sequenced, allowing the characterization of a single open reading frame of 307 amino acids coding for a putative protein of 35,582 Da molecular weight. The amino acid sequence of the parasite enzyme was analysed, the consensus phylogenetic tree built and its stability assessed. The aa sequence identity with thymidylate synthase was confirmed by the enzymatic activity of the recombinant protein expressed in E. coli. As compared with the enzyme purified from muscle larvae, it showed apparently similar Vmax value, but higher Km(app) values describing interactions with dUMP (28.8 microM vs. 3.9 microM) and (6RS,alphaS)-N(5,10)-methylenetetrahydrofolate (383 microM vs. 54.7 microM). With the coding region used as a probe, thymidylate synthase mRNA levels, relative to 18S rRNA, were found to be similar in muscle larvae, adult forms and newborn larvae, in agreement with muscle larvae cells being arrested in the cell cycle. PMID:15030008

  15. Expressed Sequence Tags Analysis and Design of Simple Sequence Repeats Markers from a Full-Length cDNA Library in Perilla frutescens (L.)

    PubMed Central

    Seong, Eun Soo; Yoo, Ji Hye; Choi, Jae Hoo; Kim, Chang Heum; Jeon, Mi Ran; Kang, Byeong Ju; Lee, Jae Geun; Choi, Seon Kang; Ghimire, Bimal Kumar; Yu, Chang Yeon

    2015-01-01

    Perilla frutescens is valuable as a medicinal plant as well as a natural medicine and functional food. However, comparative genomics analyses of P. frutescens are limited due to a lack of gene annotations and characterization. A full-length cDNA library from P. frutescens leaves was constructed to identify functional gene clusters and probable EST-SSR markers via analysis of 1,056 expressed sequence tags. Unigene assembly was performed using basic local alignment search tool (BLAST) homology searches and annotated Gene Ontology (GO). A total of 18 simple sequence repeats (SSRs) were designed as primer pairs. This study is the first to report comparative genomics and EST-SSR markers from P. frutescens, which will help gene discovery and provide an important source for functional genomics and molecular genetic research in this interesting medicinal plant. PMID:26664999
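
    Locating simple sequence repeats in EST sequences is commonly done by scanning for a short unit repeated in tandem. The sketch below uses a regular expression with a backreference; the minimum copy numbers (6 for dinucleotides, 5 for trinucleotides, 4 for tetranucleotides) are hypothetical thresholds chosen for illustration, since the abstract does not give the criteria used, and the test sequence is made up.

```python
import re

# Hypothetical minimum copy numbers per repeat-unit length.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def is_primitive(unit):
    """True if the unit is not itself a repetition of a shorter unit (e.g. 'AGAG' or 'AA')."""
    return unit not in (unit + unit)[1:-1]


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, unit, copies) tuples for tandem repeats in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # Group 2 captures the unit; \2{n,} requires at least n further copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            unit = m.group(2)
            if is_primitive(unit):
                hits.append((m.start(), unit, len(m.group(1)) // unit_len))
    return sorted(hits)


# Made-up sequence containing (AG)7 and (CTT)5.
est = "GGC" + "AG" * 7 + "TTC" + "CTT" * 5 + "GA"
ssrs = find_ssrs(est)
print(ssrs)
```

    Primer pairs would then be designed in the flanking sequence on either side of each reported start position, a step this sketch leaves out.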

  16. Complete Nucleotide Sequence of a Conjugative Plasmid Carrying blaPER-1

    PubMed Central

    Li, Ruichao; Zhou, Yuanjie; Chan, Edward Wai-chi

    2015-01-01

    The nucleotide sequence of a self-transmissible plasmid pVPH1 harboring blaPER-1 from Vibrio parahaemolyticus was determined. pVPH1 was 183,730 bp in size and shared a backbone similar to pAQU1 and pAQU2, differing mainly in an ∼40-kb multidrug resistance (MDR) region. A complex class 1 integron was identified together with ISCR1 and blaPER-1 (ISCR1-blaPER-1-gst-abct-qacEΔ1-sul1), which was shown to form a circular intermediate playing an important role in the dissemination of blaPER-1. PMID:25779581

  17. Nucleotide sequence and organization of copper resistance genes from Pseudomonas syringae pv. tomato

    SciTech Connect

    Mellano, M.A.; Cooksey, D.A.

    1988-06-01

    The nucleotide sequence of a 4.5-kilobase copper resistance determinant from Pseudomonas syringae pv. tomato revealed four open reading frames (ORFs) in the same orientation. Deletion and site-specific mutational analyses indicated that the first two ORFs were essential for copper resistance; the last two ORFs were required for full resistance, but low-level resistance could be conferred in their absence. Five highly conserved, direct 24-base repeats were found near the beginning of the second ORF, and a similar, but less conserved, repeated region was found in the middle of the first ORF.

  18. Within-Host Nucleotide Diversity of Virus Populations: Insights from Next-Generation Sequencing

    PubMed Central

    Nelson, Chase W.; Hughes, Austin L.

    2014-01-01

    Next-generation sequencing (NGS) technology offers new opportunities for understanding the evolution and dynamics of viral populations within individual hosts over the course of infection. We review simple methods for estimating synonymous and nonsynonymous nucleotide diversity in viral genes from NGS data without the need for inferring linkage. We discuss the potential usefulness of these data for addressing questions of both practical and theoretical interest, including fundamental questions regarding the effective population sizes of within-host viral populations and the modes of natural selection acting on them. PMID:25481279
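
    Nucleotide diversity can be estimated site by site from read counts, which is why linkage between sites need not be inferred. A minimal sketch, assuming the per-site estimator "fraction of read pairs that differ at the site", averaged over sites; the function names and the toy pileup are hypothetical. It does not partition sites into synonymous and nonsynonymous classes as the reviewed methods do, because that requires codon context which is omitted here.

```python
def site_diversity(counts):
    """Fraction of read pairs that differ at one site.

    counts maps each observed base to its read count at the site.
    """
    n = sum(counts.values())
    if n < 2:
        return 0.0
    total_pairs = n * (n - 1) / 2
    identical_pairs = sum(c * (c - 1) / 2 for c in counts.values())
    return (total_pairs - identical_pairs) / total_pairs


def mean_diversity(pileup):
    """Average per-site diversity over all sites in the pileup."""
    return sum(site_diversity(c) for c in pileup) / len(pileup)


# Hypothetical pileup of three sites, each covered by 10 reads.
pileup = [
    {"A": 10},            # monomorphic site
    {"A": 8, "G": 2},     # minor variant at 20%
    {"C": 5, "T": 5},     # evenly split site
]
pi = mean_diversity(pileup)
print(round(pi, 6))
```

    Sequencing errors inflate low-frequency variants, so real pipelines typically filter variant calls before applying an estimator of this kind.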

  19. The Complete Nucleotide Sequence of the Mitochondrial Genome of Bactrocera minax (Diptera: Tephritidae)

    PubMed Central

    Zhang, Bin; Nardi, Francesco; Hull-Sanders, Helen; Wan, Xuanwu; Liu, Yinghong

    2014-01-01

    The complete 16,043 bp mitochondrial genome (mitogenome) of Bactrocera minax (Diptera: Tephritidae) has been sequenced. The genome encodes 37 genes usually found in insect mitogenomes. The mitogenome information for B. minax was compared to the homologous sequences of Bactrocera oleae, Bactrocera tryoni, Bactrocera philippinensis, Bactrocera carambolae, Bactrocera papayae, Bactrocera dorsalis, Bactrocera correcta, Bactrocera cucurbitae and Ceratitis capitata. The analysis indicated the structure and organization are typical of, and similar to, the nine closely related species mentioned above, although it contains the lowest genome-wide A+T content (67.3%). Four short intergenic spacers with a high degree of conservation among the nine tephritid species mentioned above and B. minax were observed, which also have clear counterparts in the control regions (CRs). Correlation analysis among these ten tephritid species revealed close positive correlations among the A+T content of zero-fold degenerate sites (P0FD), the ratio of nucleotide substitution frequency at P0FD sites to all degenerate sites (zero-fold degenerate sites, two-fold degenerate sites and four-fold degenerate sites) and amino acid sequence distance (ASD). Further, significant positive correlation was observed between the A+T content of four-fold degenerate sites (P4FD) and the ratio of nucleotide substitution frequency at P4FD sites to all degenerate sites; however, we found significant negative correlation between ASD and the A+T content of P4FD, and the ratio of nucleotide substitution frequency at P4FD sites to all degenerate sites. A higher nucleotide substitution frequency at non-synonymous sites compared to synonymous sites was observed in nad4, the first time this has been observed in an insect mitogenome. A poly(T) stretch at the 5′ end of the CR followed by a [TA(A)]n-like stretch was also found. In addition, a highly conserved G+A-rich sequence block was observed in front of the

  20. Constructing and random sequencing analysis of normalized cDNA library of testis tissue from oriental river prawn (Macrobrachium nipponense).

    PubMed

    Qiao, Hui; Fu, Hongtuo; Jin, Shubo; Wu, Yan; Jiang, Sufei; Gong, Yongsheng; Xiong, Yiwei

    2012-09-01

    The oriental river prawn, Macrobrachium nipponense, is an important aquaculture species in China. Sexual precocity is a serious problem because of genetic retrogression, which has negative effects on product quality and dramatically affects price. Culture of all-male populations of this species would be economically advantageous, as the males grow faster and reach a much larger size than females. Developing such a culture scheme will require discovery of sex- or reproduction-related genes that affect sexual maturity and sex determination. In this study, a high-quality normalized testis cDNA library was constructed to identify novel transcripts. The 5280 successful sequencing reactions yielded 5202 expressed sequence tags (ESTs) with an average length of 954 bp. Ultimately, 3677 unique sequences, including 891 contigs and 2786 singletons, were identified based on cluster and assembly analyses. Sixteen hundred (43.5%) genes were novel based on the NCBI protein database; these unidentified genes may improve basic molecular knowledge about M. nipponense. Of the novel unigenes, 34.4% (715/2077) were homologous to insects, such as Tribolium castaneum, Drosophila spp. and Apis mellifera. Fifty-two genes were identified as sex- or reproduction-related based on Gene Ontology classification and sequence comparison with data from other publications. These genes can be classified into groups based on different functions, including 10 sex-determination related genes, 8 male-reproductive genes, 5 cathepsin-related genes, 20 ubiquitin-related genes, 5 ferritin-related genes, and 4 LRR genes. The results of this study provide new sequence information about M. nipponense, which will be the basis for further genetic studies of this species and other decapod crustaceans. PMID:22632994
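
    The counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures quoted above; the variable names are illustrative and are not taken from the paper.

```python
# Figures quoted in the abstract (variable names are illustrative only).
contigs = 891
singletons = 2786
unique_sequences = contigs + singletons  # abstract reports 3677

novel_genes = 1600
novel_percent = round(100 * novel_genes / unique_sequences, 1)

insect_percent = round(100 * 715 / 2077, 1)

# Functional groups listed for the sex- or reproduction-related genes.
groups = {
    "sex-determination": 10,
    "male-reproductive": 8,
    "cathepsin": 5,
    "ubiquitin": 20,
    "ferritin": 5,
    "LRR": 4,
}
sex_related_total = sum(groups.values())

print(unique_sequences, novel_percent, insect_percent, sex_related_total)
```

    The totals agree with the abstract: 3677 unique sequences, 43.5% novel, 34.4% insect homologs, and 52 sex- or reproduction-related genes.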

  1. Nucleotide sequence of the gene encoding the two-subunit pilin of Bacteroides nodosus 265.

    PubMed Central

    Elleman, T C; Hoyne, P A; McKern, N M; Stewart, D J

    1986-01-01

    The nucleotide sequence of the gene encoding pilin from Bacteroides nodosus 265 has been determined. The pilin is encoded by a single-copy gene, from which can be predicted a prepilin comprising a single protein chain of Mr 16,637. The prepilin sequence differs in several respects from the mature protein sequence. Seven additional N-terminal amino acid residues are present in prepilin, whereas residue 8, phenylalanine, undergoes posttranslational modification to become the N-methylated amino-terminal residue of mature pilin. In addition, further processing occurs through internal cleavage to produce two noncovalently linked subunits characteristic of pilins from serogroup H of B. nodosus, of which strain 265 is a member. The position of cleavage has been identified between alanine residues at positions 72 and 73 of the mature 149-residue pilin protein. The predicted pilin sequence of B. nodosus 265 shows extensive N-terminal amino acid sequence homology with other pilins of the N-methylphenylalanine type. In addition this sequence also shows homology with these N-methylphenylalanine-type pilins in the C-terminal region of the molecule, especially with pilin from Pseudomonas aeruginosa PAK. PMID:2873127

  2. Sequencing and comparative genomics analysis in Senecio scandens Buch.-Ham. Ex D. Don, based on full-length cDNA library

    PubMed Central

    Qian, Gang; Ping, Junjiao; Zhang, Zhen; Xu, Delin

    2014-01-01

    Senecio scandens Buch.-Ham. ex D. Don, an important antibacterial source of Chinese traditional medicine, has a widespread distribution in a few ecological habitats of China. We generated a full-length complementary DNA (cDNA) library from a sample of elite individuals with superior antibacterial properties, with satisfactory parameters such as library storage (4.30 × 10^6 CFU), efficiency of titre (1.30 × 10^6 CFU/mL), transformation efficiency (96.35%), full-length ratio (64.00%) and redundancy ratio (3.28%). The BLASTN search revealed the facile formation of counterparts between the experimental sample and Arabidopsis thaliana in view of high-homology cDNA sequence (90.79%) with e-values < 1e-50. Sequence similarities to known proteins indicate that the entire sequences of the full-length cDNA clones consist of the majority of functional genes identified by a large set of microarray data from the present experimental material. For other Compositae species, a large set of full-length cDNA clones reported in the present article will serve as a useful resource to facilitate further research on the transferability of expressed sequence tag-derived simple sequence repeats (EST-SSR) development, comparative genomics and novel transcript profiles. PMID:26740776

  3. Complete sequence of an HLA-dR beta chain deduced from a cDNA clone and identification of multiple non-allelic DR beta chain genes.

    PubMed Central

    Long, E O; Wake, C T; Gorski, J; Mach, B

    1983-01-01

    At least three polymorphic class II antigens are encoded in the human major histocompatibility complex (HLA): DR, DC and SB. cDNA clones encoding beta chains of HLA-DR antigen, derived from mRNA of a heterozygous B-cell line, were isolated and could be divided into four subsets, clearly distinct from cDNA clones encoding DC beta chains. Therefore, at least two non-allelic DR beta chain genes exist. The complete sequence of one of the DR beta chain cDNA clones is presented. It defines a putative signal sequence, two extracellular domains, a trans-membrane region and a cytoplasmic tail. Comparison with a DC beta chain cDNA clone revealed a homology of 70% between the two beta chains and that the two genes diverged under relatively little selective pressure. A set of amino acids conserved in immunoglobulin molecules was found to be identical in both DR and DC beta chains. Comparison of the DR beta chain sequence with the amino acid sequence of another DR beta chain revealed a homology of 87% and that most differences are single amino acid substitutions. Allelic polymorphism in DR beta chains has probably not arisen by changes in long blocks of sequence. PMID:11894954

  4. Mining an Ostrinia nubilalis Midgut Expressed Sequence Tag (EST) Library for Candidate Genes and Single Nucleotide Polymorphisms (SNPs)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    European corn borer, Ostrinia nubilalis, larvae feed upon many plant hosts and are a major target for genetically-engineered corn expressing Bacillus thuringiensis (Bt) toxins. DNA sequencing of a non-normalized O. nubilalis larval midgut cDNA library (ARS-CICGRU ONmgEST) identified 535 unique sequ...

  5. Nucleotide sequence and structural features of a novel US-a junction present in a defective herpes simplex virus genome.

    PubMed Central

    Mocarski, E S; Deiss, L P; Frenkel, N

    1985-01-01

    Defective genomes generated during serial propagation of herpes simplex virus type 1 (Justin) consist of tandem reiterations of sequences that are colinear with a portion of the S component of the standard viral genome. We determined the structure of the novel US-a junction, at which the US sequences of one repeat unit join the a sequences of the adjacent repeat unit. Comparison of the nucleotide sequence at this junction with the nucleotide sequence of the corresponding US region of the standard virus genome indicated that the defective genome repeat unit arose by a single recombinational event between an L-S junction a sequence and the US region. The recombinational process might have been mediated by limited sequence homology. The sequences retained within the US-a junction further define the signal for cleavage and packaging of viral DNA. PMID:2989551

  6. cDNA and deduced amino acid sequence of human pulmonary surfactant-associated proteolipid SPL(Phe)

    SciTech Connect

    Glasser, S.W.; Korfhagen, T.R.; Weaver, T.; Pilot-Matias, T.; Fox, J.L.; Whitsett, J.A.

    1987-06-01

    Hydrophobic surfactant-associated protein of Mr 6000-14,000 was isolated from ether/ethanol or chloroform/methanol extracts of mammalian pulmonary surfactant. Automated Edman degradation in a gas-phase sequencer showed the major N-terminus of the human low molecular weight protein to be Phe-Pro-Ile-Pro-Leu-Pro-Tyr-Cys-Trp-Leu-Cys-Arg-Ala-Leu-. Because of the N-terminal phenylalanine, the surfactant protein was designated SPL(Phe). Antiserum generated against hydrophobic surfactant protein(s) from bovine pulmonary surfactant recognized protein of Mr 6000-14,000 in immunoblot analysis and was used to screen a λgt11 expression library constructed from adult human lung poly(A)+ RNA. This resulted in identification of a 1.4-kilobase cDNA clone that was shown to encode the N-terminus of the surfactant polypeptide SPL(Phe) (Phe-Pro-Ile-Pro-Leu-Pro-) within an open reading frame for a larger protein. Expression of a fused β-galactosidase-SPL(Phe) gene in Escherichia coli yielded an immunoreactive Mr 34,000 fusion peptide. Hybrid-arrested translation with the cDNA and immunoprecipitation of [35S]methionine-labeled in vitro translation products of human poly(A)+ RNA with a surfactant polyclonal antibody resulted in identification of a Mr 40,000 precursor protein. Blot hybridization analysis of electrophoretically fractionated RNA from human lung detected a 2.0-kilobase RNA that was more abundant in adult lung than in fetal lung. These proteins, and specifically SPL(Phe), may therefore be useful for synthesis of replacement surfactants for treatment of hyaline membrane disease in newborn infants or of other surfactant-deficient states.

  7. Nucleotide sequence alignment of hdcA from Gram-positive bacteria.

    PubMed

    Diaz, Maria; Ladero, Victor; Redruello, Begoña; Sanchez-Llana, Esther; Del Rio, Beatriz; Fernandez, Maria; Martin, Maria Cruz; Alvarez, Miguel A

    2016-03-01

    The decarboxylation of histidine, carried out mainly by some Gram-positive bacteria, yields the toxic dietary biogenic amine histamine (Ladero et al. 2010 〈10.2174/157340110791233256〉 [1], Linares et al. 2016 〈http://dx.doi.org/10.1016/j.foodchem.2015.11.013〉 [2]). The reaction is catalyzed by a pyruvoyl-dependent histidine decarboxylase (Linares et al. 2011 〈10.1080/10408398.2011.582813〉 [3]), which is encoded by the gene hdcA. In order to locate conserved regions in the hdcA gene of Gram-positive bacteria, this article provides a nucleotide sequence alignment of all the hdcA sequences from Gram-positive bacteria present in databases. For further utility and discussion, see 〈http://dx.doi.org/10.1016/j.foodcont.2015.11.035〉 [4]. PMID:26958625
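
    Locating conserved regions in a nucleotide alignment can be illustrated with a minimal sketch. The function below is a hypothetical helper, not the authors' method: it assumes the alignment is given as equal-length strings with "-" for gaps, and it reports the columns where every sequence carries the same base. The toy sequences are invented for illustration and are not hdcA data.

```python
def conserved_columns(alignment):
    """Return 0-based indices of gap-free columns identical in all rows."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    conserved = []
    for i in range(length):
        column = {row[i].upper() for row in alignment}
        # A column is conserved when it holds exactly one symbol, not a gap.
        if len(column) == 1 and "-" not in column:
            conserved.append(i)
    return conserved


# Toy alignment (invented sequences, for illustration only).
toy_alignment = ["ATGCA", "ATGTA", "ATG-A"]
print(conserved_columns(toy_alignment))
```

    Runs of adjacent conserved columns are the natural candidates for primer or probe design, which is the kind of use such an alignment typically supports.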

  8. Nucleotide sequence alignment of hdcA from Gram-positive bacteria

    PubMed Central

    Diaz, Maria; Ladero, Victor; Redruello, Begoña; Sanchez-Llana, Esther; del Rio, Beatriz; Fernandez, Maria; Martin, Maria Cruz; Alvarez, Miguel A.

    2016-01-01

    The decarboxylation of histidine, carried out mainly by some Gram-positive bacteria, yields the toxic dietary biogenic amine histamine (Ladero et al. 2010 〈10.2174/157340110791233256〉 [1], Linares et al. 2016 〈http://dx.doi.org/10.1016/j.foodchem.2015.11.013〉 [2]). The reaction is catalyzed by a pyruvoyl-dependent histidine decarboxylase (Linares et al. 2011 〈10.1080/10408398.2011.582813〉 [3]), which is encoded by the gene hdcA. In order to locate conserved regions in the hdcA gene of Gram-positive bacteria, this article provides a nucleotide sequence alignment of all the hdcA sequences from Gram-positive bacteria present in databases. For further utility and discussion, see 〈http://dx.doi.org/10.1016/j.foodcont.2015.11.035〉 [4]. PMID:26958625

  9. Infectious hepatitis B virus from cloned DNA of known nucleotide sequence.

    PubMed Central

    Will, H; Cattaneo, R; Darai, G; Deinhardt, F; Schellekens, H; Schaller, H

    1985-01-01

    The infectivity of cloned hepatitis B viral DNA (HBV) has been tested in chimpanzees to identify a fully functional HBV genome and to assess the risk associated with its handling. Only one of two HBV DNA sequence variants tested was shown to be infectious. "Clone purified" virus of predicted nucleotide sequence was produced from the infectious HBV DNA, and the cloned viral genome was identical in structure with naturally occurring HBV. Infection could be initiated independent of whether circular monomeric or plasmid integrated dimeric forms of the viral genome were inoculated, but the infectivity of the DNA depended on liver cell transfection or intrahepatic injection. Intravenous injection of high doses of infectious HBV DNA did not induce hepatitis, suggesting that there is virtually no risk associated with routine laboratory handling of cloned HBV DNA. PMID:2983320

  10. Nucleotide sequence of the BsuRI restriction-modification system.

    PubMed Central

    Kiss, A; Posfai, G; Keller, C C; Venetianer, P; Roberts, R J

    1985-01-01

    The genes of the 5'-GGCC specific BsuRI restriction-modification system of Bacillus subtilis have been cloned and expressed in E. coli and their nucleotide sequence has been determined. The restriction and modification genes code for polypeptides with calculated molecular weights of 66,314 and 49,642, respectively. Both enzymes are coded by the same DNA strand. The restriction gene is upstream of the methylase gene and the coding regions are separated by 780 bp. Analysis of the RNA transcripts by S1-nuclease mapping indicates that the restriction and modification genes are transcribed from different promoters. Comparison of the amino acid sequences revealed no homology between the BsuRI restriction and modification enzymes. There are, however, regions of homology between the BsuRI methylase and two other GGCC specific modification enzymes, the BspRI and SPR methylases. PMID:2997708
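
    To make the 5'-GGCC specificity concrete, the sketch below scans a DNA string for that recognition sequence and confirms that the site is palindromic, meaning it reads the same on the reverse-complement strand. The function names and the test sequence are illustrative assumptions; nothing here models the enzymes' cleavage or methylation positions.

```python
def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_sites(seq, site="GGCC"):
    """Return 0-based start positions of every occurrence, overlaps included."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


# Invented test sequence, for illustration only.
example = "AAGGCCTTGGCCGGCC"
print(find_sites(example))
print(reverse_complement("GGCC") == "GGCC")
```

    Because GGCC equals its own reverse complement, a single-strand scan finds every site on both strands.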

  11. Nucleotide sequence and expression of the gene encoding the EcoRII modification enzyme.

    PubMed Central

    Som, S; Bhagwat, A S; Friedman, S

    1987-01-01

    The gene coding for the EcoRII modification enzyme has been cloned and the nucleotide sequence of 1933 base pairs containing the gene has been determined. The gene codes for a protein of 477 amino acids. Two transcriptional start sites have been mapped by S1 mapping. One deletion that removes 34 N-terminal amino acids was found to have partial enzyme activity. Comparison of the EcoRII methylase sequence with other cytosine methylases revealed several domains of partial homology among all cytosine methylases. Cloning the gene in multicopy pUC vectors increased the expression by 6- to 18-fold. A 40-fold overproduction of the EcoRII methylase was obtained by cloning the gene in the expression vector carrying the lambda PL promoter. PMID:3029675

  12. Nucleotide sequence of nifD from Frankia alni strain ArI3: phylogenetic inferences.

    PubMed

    Normand, P; Gouy, M; Cournoyer, B; Simonet, P

    1992-05-01

    The complete nucleotide sequence of the nifD gene encoding the alpha subunit of component I of nitrogenase from Frankia alni strain ArI3 was determined. The coding region is 1,458 bp in length and encodes a polypeptide of 486 residues with a predicted molecular weight of 53,500. Phylogenetic inferences with 12 complete published nifD sequences were drawn using a variety of approaches. Frankia nifD clusters with proteobacteria rather than with Clostridium pasteurianum, the other Gram-positive bacterium studied. Extant eubacterial nif genes seem to have at least three distinct evolutionary origins as a result of ancient gene duplications. Within the Gram-positive bacterial phylum, functional nif genes descend from different duplicates. PMID:1584016
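
    The relationship between the coding-region length and the polypeptide length reported above is simple codon arithmetic, sketched below. The helper name is hypothetical, and the abstract does not say whether the 1,458 bp figure includes the stop codon, so the sketch only checks that the reported numbers are mutually consistent.

```python
def codons_in(cds_length_bp):
    """Number of complete codons in a coding region of the given length."""
    if cds_length_bp % 3 != 0:
        raise ValueError("coding region length is not a multiple of 3")
    return cds_length_bp // 3


nifd_codons = codons_in(1458)           # abstract reports 486 residues
mean_residue_mass = 53500 / 486         # predicted weight / residue count

print(nifd_codons, round(mean_residue_mass, 1))
```

    A mean of about 110 Da per residue is the usual rule-of-thumb value for proteins, so the predicted molecular weight of 53,500 is in line with a 486-residue polypeptide.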

  13. The Venom Gland Transcriptome of Latrodectus tredecimguttatus Revealed by Deep Sequencing and cDNA Library Analysis

    PubMed Central

    He, Quanze; Duan, Zhigui; Yu, Ying; Liu, Zhen; Liu, Zhonghua; Liang, Songping

    2013-01-01

    Latrodectus tredecimguttatus, commonly known as the black widow spider, is well known for its dangerous bite. Although its venom has been characterized extensively, some fundamental questions about its molecular composition remain unanswered. The limited transcriptome and genome data available prevent further understanding of spider venom at the molecular level. In the present study, we combined next-generation sequencing and conventional DNA sequencing to construct a venom gland transcriptome of the spider L. tredecimguttatus, which resulted in the identification of 9,666 and 480 high-confidence proteins among 34,334 de novo sequences and 1,024 cDNA sequences, respectively, by assembly, translation, filtering, quantification and annotation. Extensive functional analyses of these proteins indicated that mRNAs involved in RNA transport and spliceosome, protein translation, processing and transport were highly enriched in the venom gland, which is consistent with the specific function of venom glands, namely the production of toxins. Furthermore, we identified 146 toxin-like proteins forming 12 families, including 6 new families in this spider, in which α-LTX-Lt1a family2 is identified for the first time as a subfamily of the α-LTX-Lt1a family. The toxins were classified according to their bioactivities into five categories that functioned in a coordinated way. Few ion channels were expressed in venom gland cells, suggesting a possible mechanism of protection from the attack of their own toxins. The present study provides a gland transcriptome profile and extends our understanding of the toxinome of spiders and coordination mechanism for toxin production in protein expression quantity. PMID:24312294

  14. Nucleotide sequence analysis of beta tubulin gene in a wide range of dermatophytes.

    PubMed

    Rezaei-Matehkolaei, Ali; Mirhendi, Hossein; Makimura, Koichi; de Hoog, G Sybren; Satoh, Kazuo; Najafzadeh, Mohammad Javad; Shidfar, Mohammad Reza

    2014-10-01

    We investigated the resolving power of the beta tubulin protein-coding gene (BT2) for systematic study of dermatophyte fungi. Initially, 144 standard and clinical strains belonging to 26 species in the genera Trichophyton, Microsporum, and Epidermophyton were identified by internal transcribed spacer (ITS) sequencing. Subsequently, BT2 was partially amplified in all strains, and sequence analysis was performed after construction of a BT2 database, which showed that length ranged from approximately 723 (T. ajelloi) to 808 nucleotides (M. persicolor) in different species. Intraspecific sequence variation was found in some species, but T. tonsurans, T. equinum, T. concentricum, T. verrucosum, T. rubrum, T. violaceum, T. eriotrephon, E. floccosum, M. canis, M. ferrugineum, and M. audouinii were invariant. The sequences were found to be relatively conserved among different strains of the same species. The species pairs with the closest resemblance were Arthroderma benhamiae and T. concentricum (100% identity) and T. tonsurans and T. equinum (99.8% identity); the most distant species were M. persicolor and M. amazonicum. The dendrogram obtained from BT2 topology was almost compatible with the species concept based on ITS sequencing, and similar clades and species were distinguished in the BT2 tree. Here, beta tubulin was characterized in a wide range of dermatophytes in order to assess intra- and interspecies variation and resolution and was found to be a taxonomically valuable gene. PMID:25079222
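
    Identity figures like those quoted above come from comparing aligned sequences position by position. The sketch below shows one common convention, assumed here rather than taken from the paper: columns containing a gap in either sequence are skipped, and identity is the share of remaining columns that match. The function name and toy sequences are illustrative.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity over gap-free columns of two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where neither sequence has a gap.
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Toy aligned sequences (invented, for illustration only).
print(percent_identity("ACGT-A", "ACGTTA"))
print(percent_identity("ACGT", "ACGA"))
```

    Other conventions count gap columns as mismatches or divide by the shorter sequence length, so reported identities are only comparable when the same definition is used.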

  15. Unique nucleotide sequence (UNS)-guided assembly of repetitive DNA parts for synthetic biology applications

    PubMed Central

    Torella, Joseph P.; Lienert, Florian; Boehm, Christian R.; Chen, Jan-Hung; Way, Jeffrey C.; Silver, Pamela A.

    2016-01-01

    Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts and hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies — for example repeated terminator and insulator sequences — that complicate recombination-based assembly. We and others have recently developed DNA assembly methods that we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly-assembled constructs, or into high-quality combinatorial libraries in only 2–3 days. If the DNA parts must be generated from scratch, an additional 2–5 days are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques. PMID:25101822

  16. Unique nucleotide sequence-guided assembly of repetitive DNA parts for synthetic biology applications

    SciTech Connect

    Torella, JP; Lienert, F; Boehm, CR; Chen, JH; Way, JC; Silver, PA

    2014-08-07

    Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts, and they hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies-for example, repeated terminator and insulator sequences-that complicate recombination-based assembly. We and others have recently developed DNA assembly methods, which we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly assembled constructs, or into high-quality combinatorial libraries in only 2-3 d. If the DNA parts must be generated from scratch, an additional 2-5 d are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques.

  17. Mapping Nucleotide Sequences that Encode Complex Binary Disease Traits with HapMap

    PubMed Central

    Cui, Yuehua; Fu, Wenjiang; Sun, Kelian; Romero, Roberto; Wu, Rongling

    2007-01-01

    Detecting the patterns of DNA sequence variants across the human genome is a crucial step for unraveling the genetic basis of complex human diseases. The human HapMap constructed by single nucleotide polymorphisms (SNPs) provides efficient sequence variation information that can speed up the discovery of genes related to common diseases. In this article, we present a generalized linear model for identifying specific nucleotide variants that encode complex human diseases. A novel approach is derived to group haplotypes to form composite diplotypes, which largely reduces the model degrees of freedom for an association test and hence increases the power when multiple SNP markers are involved. An efficient two-stage estimation procedure based on the expectation-maximization (EM) algorithm is derived to estimate parameters. Non-genetic environmental or clinical risk factors can also be fitted into the model. Computer simulations show that our model has reasonable power and type I error rate with appropriate sample size. It is also suggested through simulations that a balanced design with approximately equal number of cases and controls should be preferred to maintain small estimation bias and reasonable testing power. To illustrate the utility, we apply the method to a genetic association study of large for gestational age (LGA) neonates. The model provides a powerful tool for elucidating the genetic basis of complex binary diseases. PMID:19384427

  18. Mapping DNA methylation by transverse current sequencing: Reduction of noise from neighboring nucleotides

    NASA Astrophysics Data System (ADS)

    Alvarez, Jose; Massey, Steven; Kalitsov, Alan; Velev, Julian

    Nanopore sequencing via transverse current has emerged as a competitive candidate for mapping DNA methylation without the need for bisulfite treatment, fluorescent tags, or PCR amplification. By eliminating the error-producing amplification step, long read lengths become feasible, which greatly simplifies the assembly process and reduces the time and the cost inherent in current technologies. However, due to the large error rates of nanopore sequencing, single-base resolution has not been reached. A very important source of noise is the intrinsic structural noise in the electric signature of the nucleotide arising from the influence of neighboring nucleotides. In this work we perform calculations of the tunneling current through DNA molecules in nanopores using the non-equilibrium electron transport method within an effective multi-orbital tight-binding model derived from first-principles calculations. We develop a base-calling algorithm accounting for the correlations of the current through neighboring bases, which in principle can reduce the error rate below any desired precision. Using this method we show that we can clearly distinguish DNA methylation and other base modifications based on the reading of the tunneling current.

  19. Evidence for Balancing Selection from Nucleotide Sequence Analyses of Human G6PD

    PubMed Central

    Verrelli, Brian C.; McDonald, John H.; Argyropoulos, George; Destro-Bisol, Giovanni; Froment, Alain; Drousiotou, Anthi; Lefranc, Gerard; Helal, Ahmed N.; Loiselet, Jacques; Tishkoff, Sarah A.

    2002-01-01

    Glucose-6-phosphate dehydrogenase (G6PD) mutations that result in reduced enzyme activity have been implicated in malarial resistance and constitute one of the best examples of selection in the human genome. In the present study, we characterize the nucleotide diversity across a 5.2-kb region of G6PD in a sample of 160 Africans and 56 non-Africans, to determine how selection has shaped patterns of DNA variation at this gene. Our global sample of enzymatically normal B alleles and A, A−, and Med alleles with reduced enzyme activities reveals many previously uncharacterized silent-site polymorphisms. In comparison with the absence of amino acid divergence between human and chimpanzee G6PD sequences, we find that the number of G6PD amino acid polymorphisms in human populations is significantly high. Unlike many other G6PD-activity alleles with reduced activity, we find that the age of the A variant, which is common in Africa, may not be consistent with the recent emergence of severe malaria and therefore may have originally had a historically different adaptive function. Overall, our observations strongly support previous genotype-phenotype association studies that proposed that balancing selection maintains G6PD deficiencies within human populations. The present study demonstrates that nucleotide sequence analyses can reveal signatures of both historical and recent selection in the genome and may elucidate the impact that infectious disease has had during human evolution. PMID:12378426

  20. The HLA-DRA*0102 allele: correct nucleotide sequence and associated HLA haplotypes.

    PubMed

    Kralovicova, J; Marsh, S G E; Waller, M J; Hammarstrom, L; Vorechovsky, I

    2002-09-01

    Here we correct the nucleotide sequence of a single known variant of the HLA-DRA gene. We show that the coding regions of the HLA-DRA*0101 and HLA-DRA*0102 alleles do not differ at two codons as reported previously, but only in codon 217. Using nucleotide sequencing and DNA samples from individuals homozygous in the major histocompatibility complex, we found that the variant, leucine 217-encoding HLA-DRA*0102 allele was present on the haplotypes HLA-B*0801, DRB1*03011, DQB1*0201 (ancestral haplotype AH8.1), HLA-B*07021, DRB1*15011, DQB1*0602 (AH7.1), HLA-B*1501, DRB1*15011, DQB1*0602, HLA-B*1501, DRB1*1402, DQB1*03011 and HLA-A3, B*07021, DRB1*1301, DQB1*0603. The HLA-DRA*0101 allele coding for valine 217 was observed on the haplotypes HLA-B*5701, DRB1*0701, DQB1*03032 (AH57.1), HLA-DRB1*04011, DQB1*0302, HLA-DRB1*0701, DQB1*0202, and HLA-DRB1*0101, DQB1*05011. PMID:12445311

  1. Complete nucleotide sequence of a Spanish isolate of alfalfa mosaic virus: evidence for additional genetic variability.

    PubMed

    Parrella, Giuseppe; Acanfora, Nadia; Orílio, Anelise F; Navas-Castillo, Jesús

    2011-06-01

    Alfalfa mosaic virus (AMV) is a plant virus that is distributed worldwide and can induce necrosis and/or yellow mosaic on a large variety of plant species, including commercially important crops. It is the only virus of the genus Alfamovirus in the family Bromoviridae. AMV isolates can be clustered into two genetic groups that correlate with their geographic origin. Here, we report for the first time the complete nucleotide sequence of a Spanish isolate of AMV found infecting Cape honeysuckle (Tecoma capensis) and named Tec-1. The tripartite genome of Tec-1 is composed of 3643 nucleotides (nt) for RNA1, 2594 nt for RNA2 and 2037 nt for RNA3. Comparative sequence analysis of the coat protein gene revealed that the isolate Tec-1 is distantly related to subgroup I of AMV and more closely related to subgroup II, although forming a distinct phylogenetic clade. Therefore, we propose to split subgroup II of AMV into two subgroups, namely IIA, comprising isolates previously included in subgroup II, and IIB, including the novel Spanish isolate Tec-1. PMID:21327783

  2. Complete nucleotide sequence and genome organization of Pelargonium flower break virus.

    PubMed

    Rico, P; Hernández, C

    2004-03-01

    The complete nucleotide sequence of Pelargonium flower break virus (PFBV) has been determined. The genomic RNA is 3923 nucleotides (nt) long and contains five open reading frames (ORFs). The 5'-proximal ORF encodes a 27 kDa protein (p27) and terminates with an amber codon which may be read through into an in-frame p56 ORF to generate an 86 kDa protein (p86) containing the viral RNA-dependent RNA polymerase motifs. Two small ORFs, located in the central part of the viral genome, encode polypeptides of 7 (p7) and 12 kDa (p12), respectively, which are very likely involved in virus movement. Interestingly, p12 presents a leucine zipper motif that has not been previously reported in related proteins. The 3'-proximal ORF encodes a 37 kDa capsid protein (CP). The p12 ORF is in-frame with the p86 ORF and a double read-through protein of 99 kDa (p99) may be produced. Amino acid sequence comparisons revealed that the proteins encoded by ORFs 2, 3 and 4 are more similar to the corresponding gene products of Carnation mottle virus than to those of other carmoviruses, whereas the p27 and the CP show higher identity with the equivalent proteins of Saguaro cactus virus. Phylogenetic analysis conducted with the different viral products confirmed the assignment of PFBV to the genus Carmovirus. PMID:14991450

  3. cDNA sequences of channel catfish (Ictalurus punctatus Rafinesque, 1818) annexin A2, A4, A5 and A11

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Annexins, a protein superfamily, are ubiquitous, and play many important roles in immuno-physiological processes. In this report, we cloned and sequenced channel catfish orthologs to human annexin A2, A4, A5 and A11. Total RNA from tissues was extracted and cDNA libraries were constructed by the r...

  4. The bioinformatics of nucleotide sequence coding for proteins requiring metal coenzymes and proteins embedded with metals

    NASA Astrophysics Data System (ADS)

    Tremberger, G.; Dehipawala, Sunil; Cheung, E.; Holden, T.; Sullivan, R.; Nguyen, A.; Lieberman, D.; Cheung, T.

    2015-09-01

    All metallo-proteins need post-translation metal incorporation. In fact, the isotope ratios of Fe, Cu, and Zn in physiology and oncology have emerged as an important tool. The nickel containing F430 is the prosthetic group of the enzyme methyl coenzyme M reductase which catalyzes the release of methane in the final step of methanogenesis, a prime energy metabolism candidate for life exploration space mission in the solar system. The 3.5 Gyr early life sulfite reductase as a life switch energy metabolism had Fe-Mo clusters. The nitrogenase for nitrogen fixation 3 billion years ago had Mo. The early life arsenite oxidase needed for anoxygenic photosynthesis energy metabolism 2.8 billion years ago had Mo and Fe. The selection pressure in metal incorporation inside a protein would be quantifiable in terms of the related nucleotide sequence complexity with fractal dimension and entropy values. A simulation model showed that the studied metal-required energy metabolism sequences had at least ten times more selection pressure relatively in comparison to the horizontal transferred sequences in Mealybug, guided by the outcome histogram of the correlation R-sq values. The metal energy metabolism sequence group was compared to the circadian clock KaiC sequence group using magnesium atomic level bond shifting mechanism in the protein, and the simulation model would suggest a much higher selection pressure for the energy life switch sequence group. The possibility of using Kepler 444 as an example of ancient life in Galaxy with the associated exoplanets has been proposed and is further discussed in this report. Examples of arsenic metal bonding shift probed by Synchrotron-based X-ray spectroscopy data and Zn controlled FOXP2 regulated pathways in human and chimp brain studied tissue samples are studied in relationship to the sequence bioinformatics. The analysis results suggest that relatively large metal bonding shift amount is associated with low probability correlation R

  5. Purification and characterization of Clostridium perfringens 120-kilodalton collagenase and nucleotide sequence of the corresponding gene.

    PubMed Central

    Matsushita, O; Yoshihara, K; Katayama, S; Minami, J; Okabe, A

    1994-01-01

    Clostridium perfringens type C NCIB 10662 produced various gelatinolytic enzymes with molecular masses ranging from approximately 120 to approximately 80 kDa. A 120-kDa gelatinolytic enzyme was present in the largest quantity in the culture supernatant, and this enzyme was purified to homogeneity on the basis of sodium dodecyl sulfate-polyacrylamide gel electrophoresis. The purified enzyme was identified as the major collagenase of the organism, and it cleaved typical collagenase substrates such as azocoll, a synthetic substrate (4-phenylazobenzyloxy-carbonyl-Pro-Leu-Gly-Pro-D-Arg [Pz peptide]), and a type I collagen fibril. In addition, a gene (colA) encoding a 120-kDa collagenase was cloned in Escherichia coli. Nested deletions were used to define the coding region of colA, and this region was sequenced; from the nucleotide sequence, this gene encodes a protein of 1,104 amino acids (M(r), 125,966). Furthermore, from the N-terminal amino acid sequence of the purified enzyme which was found in this reading frame, the molecular mass of the mature enzyme was calculated to be 116,339 Da. Analysis of the primary structure of the gene product showed that the enzyme was produced with a stretch of 86 amino acids containing a putative signal sequence. Within this stretch was found PLGP, the amino acid sequence constituting the Pz peptide. This sequence may be implicated in self-processing of the collagenase. A consensus zinc-binding sequence (HEXXH) suggested for vertebrate Zn collagenases is present in this bacterial collagenase. Vibrio alginolyticus collagenase and Achromobacter lyticus protease I showed significant homology with the 120-kDa collagenase of C. perfringens, suggesting that these three enzymes are evolutionarily related. Images PMID:8282691
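    The two sequence motifs mentioned in this abstract (the PLGP tetrapeptide and the HEXXH zinc-binding consensus) can be located with a simple pattern search. The following is a minimal Python sketch, not the authors' method: the function name find_motif and the toy peptide are hypothetical, and the toy string is not the ColA sequence.

```python
import re

# HEXXH consensus: His, Glu, any two residues, His (one-letter amino acid codes).
HEXXH = r"HE..H"

def find_motif(protein, pattern):
    """Return (1-based start position, matched text) for each non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in re.finditer(pattern, protein)]

# Hypothetical toy peptide used only to exercise the function.
toy = "MKPLGPAAHEYTHSLL"

print(find_motif(toy, "PLGP"))  # [(3, 'PLGP')]
print(find_motif(toy, HEXXH))   # [(9, 'HEYTH')]
```

    Positions are reported 1-based, as is conventional for protein sequences; overlapping occurrences would need a lookahead pattern instead.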

  6. Species diagnostic single-nucleotide polymorphism and sequence-tagged site markers for the parasitic wasp genus Nasonia (Hymenoptera: Pteromalidae)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We developed, identified and evaluated eight single nucleotide polymorphism (SNP) and three sequence-tagged site (STS) markers in nuclear gene sequences of the wasp genus Nasonia (Hymenoptera). We studied variation of these markers in natural populations of the closely related and regionally sympatr...

  7. Cloning and nucleotide sequence of anaerobically induced porin protein E1 (OprE) of Pseudomonas aeruginosa PAO1.

    PubMed

    Yamano, Y; Nishikawa, T; Komatsu, Y

    1993-05-01

    The porin oprE gene of Pseudomonas aeruginosa PAO1 was isolated. Its nucleotide sequence indicated that the structural gene of 1383 nucleotide residues encodes a precursor consisting of 460 amino acid residues with a signal peptide of 29 amino acid residues, which was confirmed by the N-terminal 23-amino-acid sequence and the reaction with anti-OprE polyclonal antiserum. Anaerobiosis induced OprE production at the transcription level. The transcription start site was determined to be 40 nucleotides upstream from the ATG initiation codon. The control region contained an appropriately situated E sigma 54 recognition site and the putative second half of an ANR box. The amino acid sequence of OprE had some clusters of sequence homologous with that of OprD of P. aeruginosa, which might be responsible for the outer membrane permeability of imipenem and basic amino acids. PMID:8394980
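    The lengths quoted in this abstract are mutually consistent if the 1383-nucleotide figure is assumed to include the stop codon. A minimal Python sketch of that arithmetic follows; the function name orf_summary is hypothetical, and the mature-protein length is derived here rather than stated in the abstract.

```python
def orf_summary(orf_nt, signal_peptide_aa):
    """Return (precursor length, mature length) in amino acids.

    Assumes orf_nt counts the stop codon, which encodes no residue.
    """
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nt // 3
    precursor_aa = codons - 1                     # drop the stop codon
    mature_aa = precursor_aa - signal_peptide_aa  # remove the signal peptide
    return precursor_aa, mature_aa

print(orf_summary(1383, 29))  # (460, 431)
```

    The first value reproduces the 460-residue precursor reported above; the second is what remains after removing the 29-residue signal peptide.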

  8. Sequencing of cDNA from 50 unrelated patients reveals that mutations in the triple-helical domain of type III procollagen are an infrequent cause of aortic aneurysms.

    PubMed Central

    Tromp, G; Wu, Y; Prockop, D J; Madhatheri, S L; Kleinert, C; Earley, J J; Zhuang, J; Norrgård, O; Darling, R C; Abbott, W M

    1993-01-01

    Detailed DNA sequencing of the triple-helical domain of type III procollagen was carried out on cDNA prepared from 54 patients with aortic aneurysms. The 43 male and 11 female patients originated from 50 different families and five different nationalities. 43 patients had at least one additional blood relative who had aneurysms. Five overlapping asymmetric PCR products, covering all the coding sequences of the triple-helical domain of type III procollagen, were sequenced with 28 specific sequencing primers. Analysis of the sequencing gels revealed only two nucleotide changes that altered the structure of the protein. One was a substitution of threonine for proline at amino acid position 501 and its functional importance was not clearly established. The other was a substitution of arginine for an obligatory glycine at amino acid position 136. In 40 of the 54 patients, detection of a polymorphism in the mRNA established that both alleles were expressed. The results indicate that mutations in type III procollagen are the cause of only about 2% of aortic aneurysms. Images PMID:8514866

  9. Isolating Viral and Host RNA Sequences from Archival Material and Production of cDNA Libraries for High-Throughput DNA Sequencing

    PubMed Central

    Xiao, Yongli; Sheng, Zong-Mei; Taubenberger, Jeffery K.

    2015-01-01

    The vast majority of surgical biopsy and post-mortem tissue samples are formalin-fixed and paraffin-embedded (FFPE), but this process leads to RNA degradation that limits gene expression analysis. As an example, the viral RNA genome of the 1918 pandemic influenza A virus was previously determined in a 9-year effort by overlapping RT-PCR from post-mortem samples. Using the protocols described here, the full genome of the 1918 virus at high coverage was determined in one high-throughput sequencing run of a cDNA library derived from total RNA of a 1918 FFPE sample after duplex-specific nuclease treatments. This basic methodological approach should assist in the analysis of FFPE tissue samples isolated over the past century from a variety of infectious diseases. PMID:26344216

  10. Cloning, nucleotide sequence, and transcriptional analysis of the Pediococcus acidilactici L-(+)-lactate dehydrogenase gene.

    PubMed Central

    Garmyn, D; Ferain, T; Bernard, N; Hols, P; Delcour, J

    1995-01-01

    Recombinant plasmids containing the Pediococcus acidilactici L-(+)-lactate dehydrogenase gene (ldhL) were isolated by complementing for growth under anaerobiosis of an Escherichia coli lactate dehydrogenase-pyruvate formate lyase double mutant. The nucleotide sequence of the ldhL gene predicted a protein of 323 amino acids showing significant similarity with other bacterial L-(+)-lactate dehydrogenases and especially with that of Lactobacillus plantarum. The ldhL transcription start points in P. acidilactici were defined by primer extension, and the promoter sequence was identified as TCAAT-(17 bp)-TATAAT. This sequence is closely related to the consensus sequence of vegetative promoters from gram-positive bacteria as well as from E. coli. Northern analysis of P. acidilactici RNA showed a 1.1-kb ldhL transcript whose abundance is growth rate regulated. These data, together with the presence of a putative rho-independent transcriptional terminator, suggest that ldhL is expressed as a monocistronic transcript in P. acidilactici. PMID:7887607

  11. Nucleotide sequence of ompV, the gene for a major Vibrio cholerae outer membrane protein.

    PubMed

    Pohlner, J; Meyer, T F; Jalajakumari, M B; Manning, P A

    1986-12-01

    The nucleotide sequence of the ompV gene of Vibrio cholerae was determined. The product of the gene is a 28,000 dalton protein which, after the removal of a 19 amino acid signal sequence, produces a mature outer membrane protein of 26,000 daltons. The cleavage site was determined by amino-terminal amino acid sequencing of the purified mature protein. The DNA upstream of the gene shows the presence of a typical promoter region as judged from the Escherichia coli consensus information; however, the Shine-Dalgarno sequence is associated with a region capable of forming a secondary structure in the mRNA. The formation of this structure would inhibit binding of the mRNA to the ribosome and reduce translation. It is proposed that this structure is recognized by a positive activator in V. cholerae and because of its absence in E. coli ompV is poorly expressed. The distribution of rare codons within ompV suggests that they may serve to slow down the translation of particular domains such that the nascent polypeptide has an opportunity to take up its conformation without interference from the later formed regions. Such a mechanism could aid localization of the protein if export were by a cotranslational secretion system. PMID:3031428

  12. Power Spectrum and Mutual Information Analyses of DNA Base (Nucleotide) Sequences

    NASA Astrophysics Data System (ADS)

    Isohata, Yasuhiko; Hayashi, Masaki

    2003-03-01

    On the basis of the power spectrum analyses for the base (nucleotide) sequences of various genes, we have studied long-range correlations in total base sequences which are expressed as $1/f^{\alpha}$, behaviour of the exponent $\alpha$ for the accumulated base sequences as well as periodicities at short range. In particular from the analysis of content rate distributions of $\alpha$ we have obtained the average value $\bar{\alpha} = 0.40 \pm 0.01$ and $\bar{\alpha} = 0.20 \pm 0.01$ for the human genes and S. cerevisiae genes, respectively. We have also performed the analyses using the mutual information function. We show that there exists a clear difference between the content rate distributions of correlation lengths for the sample human genes and the S. cerevisiae genes. We are led to a conjecture that the elongation of the correlation length in the base sequences of genes from the early eukaryote (S. cerevisiae) to the late eukaryote (human) should be the definite reflection of the evolutionary process.
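    The two kinds of analysis named in this abstract can be illustrated on a toy sequence. The following Python sketch is one common formulation, assumed here for illustration and not taken from the paper: the power spectrum is the summed squared magnitude of the discrete Fourier transforms of the four 0/1 base-indicator sequences, and the mutual information is computed between bases separated by k positions. The function names are hypothetical, and fitting the exponent of a 1/f-type spectrum (a log-log regression) is not shown.

```python
import cmath
import math
from collections import Counter

BASES = "ACGT"

def power_spectrum(seq):
    """Power at frequencies k = 1..N//2, summed over the four base indicators."""
    n = len(seq)
    spectrum = []
    for k in range(1, n // 2 + 1):
        total = 0.0
        for base in BASES:
            # DFT coefficient of the 0/1 indicator sequence for this base.
            coeff = sum(cmath.exp(-2j * cmath.pi * k * j / n)
                        for j, s in enumerate(seq) if s == base)
            total += abs(coeff) ** 2
        spectrum.append(total / n)
    return spectrum

def mutual_information(seq, k):
    """Mutual information (bits) between bases k positions apart."""
    n = len(seq) - k
    pairs = Counter((seq[i], seq[i + k]) for i in range(n))
    left = Counter(seq[i] for i in range(n))
    right = Counter(seq[i + k] for i in range(n))
    mi = 0.0
    for (a, b), c in pairs.items():
        # p_ab / (p_a * p_b) simplifies to c * n / (left[a] * right[b]).
        mi += (c / n) * math.log2(c * n / (left[a] * right[b]))
    return mi

# Toy sequence with an exact period of 3 (hypothetical, for illustration only).
toy = "ACG" * 20
spec = power_spectrum(toy)
peak_k = spec.index(max(spec)) + 1
print(peak_k)  # 20, i.e. frequency N/3 for N = 60
print(round(mutual_information(toy, 3), 3))  # 1.585, i.e. log2(3)
```

    The direct DFT used here costs O(N^2) and is only practical for short sequences; a fast Fourier transform would be the usual choice for gene-length data.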

  13. Proteus mirabilis ambient-temperature fimbriae: cloning and nucleotide sequence of the atf gene cluster.

    PubMed Central

    Massad, G; Fulkerson, J F; Watson, D C; Mobley, H L

    1996-01-01

    Uropathogenic Proteus mirabilis produces at least four types of fimbriae. Amino acid sequences from two peptides, derived by tryptic digestion of the structural subunit of one type of these fimbriae, the ambient-temperature fimbriae, were determined: NVVPGQPSSTQ and LIEGENQLNYNA. PCR primers, based on these sequences and that of the N terminus, were used to amplify a 359-bp fragment. A cosmid clone, isolated from a P. mirabilis genomic library by hybridization with the 359-bp PCR product, was used to determine the nucleotide sequence of the atf gene cluster. A 3,903-bp region encodes three polypeptides: AtfA, the structural subunit; AtfB, the chaperone; and AtfC, the outer membrane molecular usher. No fimbria-related genes are evident either 5' or 3' to the three contiguous genes. AtfA demonstrates significant amino acid sequence identity with type 1 major fimbrial subunits of several enteric species. The 359-bp PCR product hybridized strongly with all Proteus isolates (n = 9) and 25% of 355 Escherichia coli isolates but failed to hybridize with any of 26 isolates among nine other uropathogenic species. Ambient-temperature fimbriae of P. mirabilis may represent a novel type of fimbriae of enteric species. PMID:8926119

  14. Increased functional protein expression using nucleotide sequence features enriched in highly expressed genes in zebrafish

    PubMed Central

    Horstick, Eric J.; Jordan, Diana C.; Bergeron, Sadie A.; Tabor, Kathryn M.; Serpe, Mihaela; Feldman, Benjamin; Burgess, Harold A.

    2015-01-01

    Many genetic manipulations are limited by difficulty in obtaining adequate levels of protein expression. Bioinformatic and experimental studies have identified nucleotide sequence features that may increase expression; however, it is difficult to assess the relative influence of these features. Zebrafish embryos are rapidly injected with calibrated doses of mRNA, enabling the effects of multiple sequence changes to be compared in vivo. Using RNAseq and microarray data, we identified a set of genes that are highly expressed in zebrafish embryos and systematically analyzed for enrichment of sequence features correlated with levels of protein expression. We then tested enriched features by embryo microinjection and functional tests of multiple protein reporters. Codon selection, releasing factor recognition sequence and specific introns and 3′ untranslated regions each increased protein expression between 1.5- and 3-fold. These results suggested principles for increasing protein yield in zebrafish through biomolecular engineering. We implemented these principles for rational gene design in software for codon selection (CodonZ) and plasmid vectors incorporating the most active non-coding elements. Rational gene design thus significantly boosts expression in zebrafish, and a similar approach will likely elevate expression in other animal models. PMID:25628360

  15. Full-Length Venom Protein cDNA Sequences from Venom-Derived mRNA: Exploring Compositional Variation and Adaptive Multigene Evolution.

    PubMed

    Modahl, Cassandra M; Mackessy, Stephen P

    2016-06-01

    Envenomation of humans by snakes is a complex and continuously evolving medical emergency, and treatment is made that much more difficult by the diverse biochemical composition of many venoms. Venomous snakes and their venoms also provide models for the study of molecular evolutionary processes leading to adaptation and genotype-phenotype relationships. To compare venom complexity and protein sequences, venom gland transcriptomes are assembled, which usually requires the sacrifice of snakes for tissue. However, toxin transcripts are also present in venoms, offering the possibility of obtaining cDNA sequences directly from venom. This study provides evidence that unknown full-length venom protein transcripts can be obtained from the venoms of multiple species from all major venomous snake families. These unknown venom protein cDNAs are obtained by the use of primers designed from conserved signal peptide sequences within each venom protein superfamily. This technique was used to assemble a partial venom gland transcriptome for the Middle American Rattlesnake (Crotalus simus tzabcan) by amplifying sequences for phospholipases A2, serine proteases, C-lectins, and metalloproteinases from within venom. Phospholipase A2 sequences were also recovered from the venoms of several rattlesnakes and an elapid snake (Pseudechis porphyriacus), and three-finger toxin sequences were recovered from multiple rear-fanged snake species, demonstrating that the three major clades of advanced snakes (Elapidae, Viperidae, Colubridae) have stable mRNA present in their venoms. These cDNA sequences from venom were then used to explore potential activities derived from protein sequence similarities and evolutionary histories within these large multigene superfamilies. Venom-derived sequences can also be used to aid in characterizing venoms that lack proteomic profiles and identify sequence characteristics indicating specific envenomation profiles. 
This approach, requiring only venom, provides

  16. Full-Length Venom Protein cDNA Sequences from Venom-Derived mRNA: Exploring Compositional Variation and Adaptive Multigene Evolution

    PubMed Central

    Modahl, Cassandra M.; Mackessy, Stephen P.

    2016-01-01

    Envenomation of humans by snakes is a complex and continuously evolving medical emergency, and treatment is made that much more difficult by the diverse biochemical composition of many venoms. Venomous snakes and their venoms also provide models for the study of molecular evolutionary processes leading to adaptation and genotype-phenotype relationships. To compare venom complexity and protein sequences, venom gland transcriptomes are assembled, which usually requires the sacrifice of snakes for tissue. However, toxin transcripts are also present in venoms, offering the possibility of obtaining cDNA sequences directly from venom. This study provides evidence that unknown full-length venom protein transcripts can be obtained from the venoms of multiple species from all major venomous snake families. These unknown venom protein cDNAs are obtained by the use of primers designed from conserved signal peptide sequences within each venom protein superfamily. This technique was used to assemble a partial venom gland transcriptome for the Middle American Rattlesnake (Crotalus simus tzabcan) by amplifying sequences for phospholipases A2, serine proteases, C-lectins, and metalloproteinases from within venom. Phospholipase A2 sequences were also recovered from the venoms of several rattlesnakes and an elapid snake (Pseudechis porphyriacus), and three-finger toxin sequences were recovered from multiple rear-fanged snake species, demonstrating that the three major clades of advanced snakes (Elapidae, Viperidae, Colubridae) have stable mRNA present in their venoms. These cDNA sequences from venom were then used to explore potential activities derived from protein sequence similarities and evolutionary histories within these large multigene superfamilies. Venom-derived sequences can also be used to aid in characterizing venoms that lack proteomic profiles and identify sequence characteristics indicating specific envenomation profiles. 
This approach, requiring only venom, provides

  17. Sequence and expression of an Eisenia-fetida-derived cDNA clone that encodes the 40-kDa fetidin antibacterial protein.

    PubMed

    Lassegues, M; Milochau, A; Doignon, F; Du Pasquier, L; Valembois, P

    1997-06-15

    Fetidins are 40-kDa and 45-kDa hemolytic and antibacterial glycoproteins present in the coelomic fluid of the earthworm Eisenia fetida andrei. By screening a cDNA library with a polyclonal antifetidin serum, we have cloned a cDNA that encoded the 40-kDa fetidin. The clone contains an insert of 1.44 kb encoding a protein of 34 kDa, which corresponds to the size of deglycosylated fetidins. The recombinant protein inhibits Bacillus megaterium growth. Restriction fragment polymorphisms were observed on Southern blots and correspond to a known protein polymorphism. The sequence of the cDNA contains a peroxidase signature and fetidins from earthworm coelomic fluid have peroxidase activity. The 40-kDa and 45-kDa fetidins therefore represent two related polymorphic defence factors in invertebrates. PMID:9219536

  18. Sequencing and analysis of 10967 full-length cDNA clones from Xenopus laevis and Xenopus tropicalis

    SciTech Connect

    Morin, R D; Chang, E; Petrescu, A; Liao, N; Kirkpatrick, R; Griffith, M; Butterfield, Y; Stott, J; Barber, S; Babakaiff, R; Matsuo, C; Wong, D; Yang, G; Smailus, D; Brown-John, M; Mayo, M; Beland, J; Gibson, S; Olson, T; Tsai, M; Featherstone, R; Chand, S; Siddiqui, A; Jang, W; Lee, E; Klein, S; Prange, C; Myers, R M; Green, E D; Wagner, L; Gerhard, D; Marra, M; Jones, S M; Holt, R

    2005-10-31

    Sequencing of full-insert clones from full-length cDNA libraries from both Xenopus laevis and Xenopus tropicalis has been ongoing as part of the Xenopus Gene Collection initiative. Here we present an analysis of 10967 clones (8049 from X. laevis and 2918 from X. tropicalis). The clone set contains 2013 orthologs between X. laevis and X. tropicalis as well as 1795 paralog pairs within X. laevis. 1199 are in-paralogs, believed to have resulted from an allotetraploidization event approximately 30 million years ago, and the remaining 546 are likely out-paralogs that have resulted from more ancient gene duplications, prior to the divergence between the two species. We do not detect any evidence for positive selection by the Yang and Nielsen maximum likelihood method of approximating dN/dS. However, dN/dS for X. laevis in-paralogs is elevated relative to X. tropicalis orthologs. This difference is highly significant, and indicates an overall relaxation of selective pressures on duplicated gene pairs. Within both groups of paralogs, we found evidence of subfunctionalization, manifested as differential expression of paralogous genes among tissues, as measured by EST information from public resources. We have observed, as expected, a higher instance of subfunctionalization in out-paralogs relative to in-paralogs.

  19. Construction of a cDNA library and preliminary analysis of expressed sequence tags in Piper hainanense.

    PubMed

    Fan, R; Ling, P; Hao, C Y; Li, F P; Huang, L F; Wu, B D; Wu, H S

    2015-01-01

    Black pepper is a perennial climbing vine. It is widely cultivated because its berries can be utilized not only as a spice in food but also for medicinal use. This study aimed to construct a standardized, high-quality cDNA library to facilitate identification of new Piper hainanense transcripts. For this, 262 unigenes were used to generate raw reads. The average length of these 262 unigenes was 774.8 bp. Of these, 94 genes (35.9%) were newly identified, according to the NCBI protein database. Thus, identification of new genes may broaden the molecular knowledge of P. hainanense on the basis of Clusters of Orthologous Groups and Gene Ontology categories. In addition, certain basic genes are linked to physiological processes, which can contribute to disease resistance and thereby to the breeding of black pepper. A total of 26 unigenes were found to be SSR markers. Dinucleotide SSR was the main repeat motif, accounting for 61.54%, followed by trinucleotide SSR (23.07%). Eight primer pairs successfully amplified DNA fragments and detected significant amounts of polymorphism among twenty-one Piper germplasm. These results present novel sequence information of P. hainanense, which can serve as the foundation for further genetic research on this species. PMID:26505424

  20. Cloning and sequence of a cDNA coding for the human beta-migrating endothelial-cell-type plasminogen activator inhibitor.

    PubMed Central

    Ny, T; Sawdey, M; Lawrence, D; Millan, J L; Loskutoff, D J

    1986-01-01

    A lambda gt11 expression library containing cDNA inserts prepared from human placental mRNA was screened immunologically using an antibody probe developed against the beta-migrating plasminogen activator inhibitor (beta-PAI) purified from cultured bovine aortic endothelial cells. Thirty-four positive clones were isolated after screening 7 × 10^5 phages. Three clones (lambda 1.2, lambda 3, and lambda 9.2) were randomly picked and further characterized. These contained inserts 1.9, 3.0, and 1.9 kilobases (kb) long, respectively. Escherichia coli lysogenic for lambda 9.2, but not for lambda gt11, produced a fusion protein of 180 kDa that was recognized by affinity-purified antibodies against the bovine aortic endothelial cell beta-PAI and had beta-PAI activity when analyzed by reverse fibrin autography. The largest cDNA insert was sequenced and shown to be 2944 base pairs (bp) long. It has a large 3' untranslated region [1788 bp, excluding the poly(A) tail] and contains the entire coding region of the mature protein but lacks the initiation codon and part of the signal peptide coding region at the 5' terminus. The two clones carrying the 1.9-kb cDNA inserts were partially sequenced and shown to be identical to the 3.0-kb cDNA except that they were truncated, lacking much of the 3' untranslated region. Blot hybridization analysis of electrophoretically fractionated RNA from the human fibrosarcoma cell line HT-1080 was performed using the 3.0-kb cDNA as hybridization probe. Two distinct transcripts, 2.2 and 3.0 kb, were detected, suggesting that the 1.9-kb cDNA may have been copied from the shorter RNA transcript. The amino acid sequence deduced from the cDNA was aligned with the NH2-terminal sequence of the human beta-PAI. Based on this alignment, the mature human beta-PAI is 379 amino acids long and contains an NH2-terminal valine. The deduced amino acid sequence has extensive (30%) homology with alpha 1-antitrypsin and antithrombin III, indicating that the beta

  1. Filamentous hemagglutinin of Bordetella pertussis: nucleotide sequence and crucial role in adherence.

    PubMed Central

    Relman, D A; Domenighini, M; Tuomanen, E; Rappuoli, R; Falkow, S

    1989-01-01

    Filamentous hemagglutinin is a surface-associated adherence protein of Bordetella pertussis, which is a component of some new acellular pertussis vaccines. The nucleotide sequence of an open reading frame that encompasses the filamentous hemagglutinin structural gene, fhaB, suggests that proteolytic processing is necessary to generate the mature 220-kDa filamentous hemagglutinin product. An Arg-Gly-Asp (RGD) tripeptide is found within filamentous hemagglutinin that may be involved in its adherence properties. An internal in-frame deletion in fhaB, encompassing the RGD region, causes loss of B. pertussis-binding to ciliated eukaryotic cells, confirming a potential role for this protein in host-cell binding and infection. Images PMID:2539596

  2. Nucleotide sequence of a glucosyltransferase gene from Streptococcus sobrinus MFe28.

    PubMed Central

    Ferretti, J J; Gilpin, M L; Russell, R R

    1987-01-01

    The complete nucleotide sequence was determined for the Streptococcus sobrinus MFe28 gtfI gene, which encodes a glucosyltransferase that produces an insoluble glucan product. A single open reading frame encodes a mature glucosyltransferase protein of 1,559 amino acids (Mr, 172,983) and a signal peptide of 38 amino acids. In the C-terminal one-third of the protein there are six repeating units containing 35 amino acids of partial homology and two repeating units containing 48 amino acids of complete homology. The functional role of these repeating units remains to be determined, although truncated forms of glucosyltransferase containing only the first two repeating units of partial homology maintained glucosyltransferase activity and the ability to bind glucan. Regions of homology with alpha-amylase and glycogen phosphorylase were identified in the glucosyltransferase protein and may represent regions involved in functionally similar domains. Images PMID:3040686

  3. High-Throughput Sequencing Reveals Single Nucleotide Variants in Longer-Kernel Bread Wheat

    PubMed Central

    Chen, Feng; Zhu, Zibo; Zhou, Xiaobian; Yan, Yan; Dong, Zhongdong; Cui, Dangqun

    2016-01-01

    The transcriptomes of bread wheat Yunong 201 and its ethyl methanesulfonate derivative Yunong 3114 were obtained by next-generation sequencing technology. Single nucleotide variants (SNVs) in the wheat strains were explored and compared. A total of 5907 and 6287 non-synonymous SNVs were acquired for Yunong 201 and 3114, respectively. A total of 4021 genes with SNVs were obtained. The genes that underwent non-synonymous SNVs were significantly involved in ATP binding, protein phosphorylation, and cellular protein metabolic process. The heat map analysis also indicated that most of these mutant genes were significantly differentially expressed at different developmental stages. The SNVs in these genes possibly contribute to the longer kernel length of Yunong 3114. Our data provide useful information on wheat transcriptome for future studies on wheat functional genomics. This study could also help in illustrating the gene functions of the non-synonymous SNVs of Yunong 201 and 3114. PMID:27551288

  4. Complete nucleotide sequence of a virus associated with rusty mottle disease of sweet cherry (Prunus avium).

    PubMed

    Villamor, D V; Druffel, K L; Eastwell, K C

    2013-08-01

    Cherry rusty mottle is a disease of sweet cherries first described in 1940 in western North America. Because of the graft-transmissible nature of the disease, a viral nature of the disease was assumed. Here, the complete genomic nucleotide sequences of virus isolates from two trees expressing cherry rusty mottle disease symptoms are characterized; the virus is designated cherry rusty mottle associated virus (CRMaV). The biological and molecular characteristics of this virus in comparison to those of cherry necrotic rusty mottle virus (CNRMV) and cherry green ring mottle virus (CGRMV) are described. CRMaV was subsequently detected in additional sweet cherry trees expressing symptoms of cherry rusty mottle disease. PMID:23525699

  5. Nucleotide sequence of a gene encoding an organophosphorus nerve agent degrading enzyme from Alteromonas haloplanktis.

    PubMed

    Cheng, T; Liu, L; Wang, B; Wu, J; DeFrank, J J; Anderson, D M; Rastogi, V K; Hamilton, A B

    1997-01-01

    Organophosphorus acid anhydrolases (OPAA) catalyzing the hydrolysis of a variety of toxic organophosphorus cholinesterase inhibitors offer potential for decontamination of G-type nerve agents and pesticides. The gene (opa) encoding an OPAA was cloned from the chromosomal DNA of Alteromonas haloplanktis ATCC 23821. The nucleotide sequence of the 1.7-kb DNA fragment contained the opa gene (1.3 kb) and its flanking region. We report structural and functional similarity of OPAAs from A. haloplanktis and Alteromonas sp JD6.5 with the enzyme prolidase that hydrolyzes dipeptides with a prolyl residue in the carboxyl-terminal position. These results corroborate the earlier conclusion that the OPAA is a type of X-Pro dipeptidase, and that X-Pro could be the native substrate for such an enzyme in Alteromonas cells. PMID:9079288

  6. Nucleotide sequence analysis of the DNA binding region of the chicken fibronectin gene.

    PubMed

    Karasaki, Y; Gotoh, S; Kubomura, S; Higashi, K; Hirano, H

    1988-12-01

    We have determined the nucleotide sequence of a 2.0-kb EcoRI segment from clone lambda FC32 of the genomic chicken fibronectin gene; this segment corresponds to the DNA binding domain. The segment overlapped another clone, lambda FC36, and contained three exons (16, 17 and 18). They were classified as Type III repeats, as originally shown in bovine plasma fibronectin. The average homologies of these three exons among the chicken, rat and human fibronectins at the amino acid level are very high (87-98%) compared with those (79-88%) of the exons in the cell binding domain, indicating that this region has been highly conserved during evolution. PMID:3212295

  7. Developing single nucleotide polymorphism (SNP) markers from transcriptome sequences for identification of longan (Dimocarpus longan) germplasm

    PubMed Central

    Wang, Boyi; Tan, Hua-Wei; Fang, Wanping; Meinhardt, Lyndel W; Mischke, Sue; Matsumoto, Tracie; Zhang, Dapeng

    2015-01-01

    Longan (Dimocarpus longan Lour.) is an important tropical fruit tree crop. Accurate varietal identification is essential for germplasm management and breeding. Using longan transcriptome sequences from public databases, we developed single nucleotide polymorphism (SNP) markers; validated 60 SNPs in 50 longan germplasm accessions, including cultivated varieties and wild germplasm; and designated 25 SNP markers that unambiguously identified all tested longan varieties with high statistical rigor (P<0.0001). Multiple trees from the same clone were verified and off-type trees were identified. Diversity analysis revealed genetic relationships among analyzed accessions. Cultivated varieties differed significantly from wild populations (Fst=0.300; P<0.001), demonstrating untapped genetic diversity for germplasm conservation and utilization. Within cultivated varieties, apparent differences between varieties from China and those from Thailand and Hawaii indicated geographic patterns of genetic differentiation. These SNP markers provide a powerful tool to manage longan genetic resources and breeding, with accurate and efficient genotype identification. PMID:26504559

  8. High-Throughput Sequencing Reveals Single Nucleotide Variants in Longer-Kernel Bread Wheat.

    PubMed

    Chen, Feng; Zhu, Zibo; Zhou, Xiaobian; Yan, Yan; Dong, Zhongdong; Cui, Dangqun

    2016-01-01

    The transcriptomes of bread wheat Yunong 201 and its ethyl methanesulfonate derivative Yunong 3114 were obtained by next-generation sequencing technology. Single nucleotide variants (SNVs) in the wheat strains were explored and compared. A total of 5907 and 6287 non-synonymous SNVs were acquired for Yunong 201 and 3114, respectively. A total of 4021 genes with SNVs were obtained. The genes that underwent non-synonymous SNVs were significantly involved in ATP binding, protein phosphorylation, and cellular protein metabolic process. The heat map analysis also indicated that most of these mutant genes were significantly differentially expressed at different developmental stages. The SNVs in these genes possibly contribute to the longer kernel length of Yunong 3114. Our data provide useful information on wheat transcriptome for future studies on wheat functional genomics. This study could also help in illustrating the gene functions of the non-synonymous SNVs of Yunong 201 and 3114. PMID:27551288

  9. High-throughput nucleotide sequence analysis of diverse bacterial communities in leachates of decomposing pig carcasses

    PubMed Central

    Yang, Seung Hak; Lim, Joung Soo; Khan, Modabber Ahmed; Kim, Bong Soo; Choi, Dong Yoon; Lee, Eun Young; Ahn, Hee Kwon

    2015-01-01

    The leachate generated by the decomposition of animal carcass has been implicated as an environmental contaminant surrounding the burial site. High-throughput nucleotide sequencing was conducted to investigate the bacterial communities in leachates from the decomposition of pig carcasses. We acquired 51,230 reads from six different samples (1, 2, 3, 4, 6 and 14 week-old carcasses) and found that sequences representing the phylum Firmicutes predominated. The diversity of bacterial 16S rRNA gene sequences in the leachate was the highest at 6 weeks, in contrast to those at 2 and 14 weeks. The relative abundance of Firmicutes was reduced, while the proportion of Bacteroidetes and Proteobacteria increased from 3–6 weeks. The representation of phyla was restored after 14 weeks. However, the community structures between the samples taken at 1–2 and 14 weeks differed at the bacterial classification level. The trend in pH was similar to the changes seen in bacterial communities, indicating that the pH of the leachate could be related to the shift in the microbial community. The results indicate that the composition of bacterial communities in leachates of decomposing pig carcasses shifted continuously during the study period and might be influenced by the burial site. PMID:26500442

  10. Nucleotide sequences provide evidence of genetic exchange among distantly related lineages of Trypanosoma cruzi

    PubMed Central

    Machado, Carlos A.; Ayala, Francisco J.

    2001-01-01

    Simple phylogenetic tests were applied to a large data set of nucleotide sequences from two nuclear genes and a region of the mitochondrial genome of Trypanosoma cruzi, the agent of Chagas' disease. Incongruent gene genealogies manifest genetic exchange among distantly related lineages of T. cruzi. Two widely distributed isoenzyme types of T. cruzi are hybrids, their genetic composition being the likely result of genetic exchange between two distantly related lineages. The data show that the reference strain for the T. cruzi genome project (CL Brener) is a hybrid. Well-supported gene genealogies show that mitochondrial and nuclear gene sequences from T. cruzi cluster, respectively, in three or four distinct clades that do not fully correspond to the two previously defined major lineages of T. cruzi. There is clear genetic differentiation among the major groups of sequences, but genetic diversity within each major group is low. We estimate that the major extant lineages of T. cruzi have diverged during the Miocene or early Pliocene (3–16 million years ago). PMID:11416213

  11. Mining for single nucleotide polymorphisms and insertions / deletions in expressed sequence tag libraries of oil palm.

    PubMed

    Riju, Aykkal; Chandrasekar, Arumugam; Arunachalam, Vadivel

    2007-01-01

    The oil palm is a tropical oil-bearing tree. EST-derived SNPs and SSRs are a free by-product of the currently expanding EST (expressed sequence tag) databases. The development of high-throughput methods for the detection of SNPs (single nucleotide polymorphisms) and small indels (insertions / deletions) has led to a revolution in their use as molecular markers. The 5452 available oil palm EST sequences were mined from dbEST at NCBI. The CAP3 program was used to assemble EST sequences into contigs. Candidate SNPs and indel polymorphisms were detected using the Perl script auto_snip version 1.0, which used 576 ESTs for detecting SNP and indel sites. We found 1180 SNP sites and 137 indel polymorphisms, with a frequency of 1.36 SNPs / 100 bp. Among the six tissues from which the EST libraries had been generated, mesocarp had a high frequency of 2.91 SNPs and indels per 100 bp, whereas the zygotic embryos had the lowest frequency of 0.15 per 100 bp. We also used the Shannon index to analyze the proportion of ten possible types of SNP/indels. ESTs from tissues of normal apex showed the highest value of the Shannon index (0.60), whereas abnormal apex had the lowest value (0.02). The present report deals with the use of the Shannon index for comparing SNP/indel frequencies mined from EST libraries and also confirms that SNPs occur in oil palm at a frequency that allows their use as markers for genetic studies. PMID:21670789
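
    The two summary statistics named in this abstract, polymorphisms per 100 bp and the Shannon index over polymorphism types, can be sketched as below. This is a minimal illustration only: the abstract does not state which logarithm base or normalization the authors used, so the natural logarithm is an assumption, and the counts are hypothetical values rather than data from the paper.

```python
import math


def per_100_bp(n_sites, total_bp):
    """Polymorphic sites per 100 bp of assembled sequence."""
    return 100.0 * n_sites / total_bp


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over the non-empty categories.

    The natural logarithm is an assumption; the abstract does not give the base.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


# Hypothetical counts for ten SNP/indel classes (not taken from the paper).
evenly_spread = [10] * 10        # every class equally common
dominated = [91] + [1] * 9       # one class dominates

print(shannon_index(evenly_spread) > shannon_index(dominated))  # True
print(per_100_bp(15, 1000))                                     # 1.5
```

    A library whose polymorphisms are spread evenly over the classes scores high, while a library dominated by one class scores close to zero, which is the sense in which the index compares tissues.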

  12. Complete nucleotide sequence of the mitochondrial genome of a salamander, Mertensiella luschani.

    PubMed

    Zardoya, Rafael; Malaga-Trillo, Edward; Veith, Michael; Meyer, Axel

    2003-10-23

    The complete nucleotide sequence (16,650 bp) of the mitochondrial genome of the salamander Mertensiella luschani (Caudata, Amphibia) was determined. This molecule conforms to the consensus vertebrate mitochondrial gene order. However, it is characterized by a long non-coding intervening sequence with two 124-bp repeats between the tRNA(Thr) and tRNA(Pro) genes. The new sequence data were used to reconstruct a phylogeny of jawed vertebrates. Phylogenetic analyses of all mitochondrial protein-coding genes at the amino acid level recovered a robust vertebrate tree in which lungfishes are the closest living relatives of tetrapods, salamanders and frogs are grouped together to the exclusion of caecilians (the Batrachia hypothesis) in a monophyletic amphibian clade, turtles show diapsid affinities and are placed as sister group of crocodiles+birds, and the marsupials are grouped together with monotremes and basal to placental mammals. The deduced phylogeny was used to characterize the molecular evolution of vertebrate mitochondrial proteins. Amino acid frequencies were analyzed across the main lineages of jawed vertebrates, and leucine and cysteine were found to be the most and least abundant amino acids in mitochondrial proteins, respectively. Patterns of amino acid replacements were conserved among vertebrates. Overall, cartilaginous fishes showed the least variation in amino acid frequencies and replacements. Constancy of rates of evolution among the main lineages of jawed vertebrates was rejected. PMID:14604788

  13. Whole genome sequencing of a single Bos taurus animal for single nucleotide polymorphism discovery

    PubMed Central

    Eck, Sebastian H; Benet-Pagès, Anna; Flisikowski, Krzysztof; Meitinger, Thomas; Fries, Ruedi; Strom, Tim M

    2009-01-01

    Background: The majority of the 2 million bovine single nucleotide polymorphisms (SNPs) currently available in dbSNP have been identified in a single breed, Hereford cattle, during the bovine genome project. In an attempt to evaluate the variance of a second breed, we have produced a whole genome sequence at low coverage of a single Fleckvieh bull. Results: We generated 24 gigabases of sequence, mainly using 36-bp paired-end reads, resulting in an average 7.4-fold sequence depth. This coverage was sufficient to identify 2.44 million SNPs, 82% of which were previously unknown, and 115,000 small indels. A comparison with the genotypes of the same animal, generated on a 50 k oligonucleotide chip, revealed a detection rate of 74% and 30% for homozygous and heterozygous SNPs, respectively. The false positive rate, as determined by comparison with genotypes determined for 196 randomly selected SNPs, was approximately 1.1%. We further determined the allele frequencies of the 196 SNPs in 48 Fleckvieh and 48 Braunvieh bulls. 95% of the SNPs were polymorphic with an average minor allele frequency of 24.5% and with 83% of the SNPs having a minor allele frequency larger than 5%. Conclusions: This work provides the first single cattle genome by next-generation sequencing. The chosen approach - low to medium coverage re-sequencing - added more than 2 million novel SNPs to the currently publicly available SNP resource, providing a valuable resource for the construction of high density oligonucleotide arrays in the context of genome-wide association studies. PMID:19660108
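
    The allele-frequency summary reported above (fraction of polymorphic SNPs, average minor allele frequency, fraction above a 5% threshold) can be illustrated with a short sketch. The function names and genotype counts are hypothetical, and averaging the minor allele frequency over polymorphic SNPs only is an assumption, since the abstract does not say how the average was taken.

```python
def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Minor allele frequency from diploid genotype counts (AA, AB, BB)."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    if total_alleles == 0:
        raise ValueError("no genotypes supplied")
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1.0 - freq_a)


def summarize(genotype_counts, maf_threshold=0.05):
    """Summarize a non-empty list of (AA, AB, BB) count tuples, one per SNP."""
    mafs = [minor_allele_frequency(*g) for g in genotype_counts]
    polymorphic = [m for m in mafs if m > 0]
    return {
        "fraction_polymorphic": len(polymorphic) / len(mafs),
        # Assumption: the mean is taken over polymorphic SNPs only.
        "mean_maf_polymorphic": (
            sum(polymorphic) / len(polymorphic) if polymorphic else 0.0
        ),
        "fraction_above_threshold": sum(m > maf_threshold for m in mafs) / len(mafs),
    }


# Hypothetical genotype counts for three SNPs typed in 96 animals.
example = [(24, 48, 24), (96, 0, 0), (90, 6, 0)]
print(summarize(example))
```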

  14. Detection and quantitation of single nucleotide polymorphisms, DNA sequence variations, DNA mutations, DNA damage and DNA mismatches

    DOEpatents

    McCutchen-Maloney, Sandra L.

    2002-01-01

    DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dipsticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such as TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding, which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera, which gives a decrease in signal.

  15. Real-time single-molecule electronic DNA sequencing by synthesis using polymer-tagged nucleotides on a nanopore array

    PubMed Central

    Fuller, Carl W.; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Bibillo, Arek; Stranges, P. Benjamin; Dorwart, Michael; Tao, Chuanjuan; Li, Zengmin; Guo, Wenjing; Shi, Shundi; Korenblum, Daniel; Trans, Andrew; Aguirre, Anne; Liu, Edward; Harada, Eric T.; Pollard, James; Bhat, Ashwini; Cech, Cynthia; Yang, Alexander; Arnold, Cleoma; Palla, Mirkó; Hovis, Jennifer; Chen, Roger; Morozova, Irina; Kalachikov, Sergey; Russo, James J.; Kasianowicz, John J.; Davis, Randy; Roever, Stefan; Church, George M.; Ju, Jingyue

    2016-01-01

    DNA sequencing by synthesis (SBS) offers a robust platform to decipher nucleic acid sequences. Recently, we reported a single-molecule nanopore-based SBS strategy that accurately distinguishes four bases by electronically detecting and differentiating four different polymer tags attached to the 5′-phosphate of the nucleotides during their incorporation into a growing DNA strand catalyzed by DNA polymerase. Further developing this approach, we report here the use of nucleotides tagged at the terminal phosphate with oligonucleotide-based polymers to perform nanopore SBS on an α-hemolysin nanopore array platform. We designed and synthesized several polymer-tagged nucleotides using tags that produce different electrical current blockade levels and verified they are active substrates for DNA polymerase. A highly processive DNA polymerase was conjugated to the nanopore, and the conjugates were complexed with primer/template DNA and inserted into lipid bilayers over individually addressable electrodes of the nanopore chip. When an incoming complementary-tagged nucleotide forms a tight ternary complex with the primer/template and polymerase, the tag enters the pore, and the current blockade level is measured. The levels displayed by the four nucleotides tagged with four different polymers captured in the nanopore in such ternary complexes were clearly distinguishable and sequence-specific, enabling continuous sequence determination during the polymerase reaction. Thus, real-time single-molecule electronic DNA sequencing data with single-base resolution were obtained. The use of these polymer-tagged nucleotides, combined with polymerase tethering to nanopores and multiplexed nanopore sensors, should lead to new high-throughput sequencing methods. PMID:27091962

  16. Real-time single-molecule electronic DNA sequencing by synthesis using polymer-tagged nucleotides on a nanopore array.

    PubMed

    Fuller, Carl W; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Bibillo, Arek; Stranges, P Benjamin; Dorwart, Michael; Tao, Chuanjuan; Li, Zengmin; Guo, Wenjing; Shi, Shundi; Korenblum, Daniel; Trans, Andrew; Aguirre, Anne; Liu, Edward; Harada, Eric T; Pollard, James; Bhat, Ashwini; Cech, Cynthia; Yang, Alexander; Arnold, Cleoma; Palla, Mirkó; Hovis, Jennifer; Chen, Roger; Morozova, Irina; Kalachikov, Sergey; Russo, James J; Kasianowicz, John J; Davis, Randy; Roever, Stefan; Church, George M; Ju, Jingyue

    2016-05-10

    DNA sequencing by synthesis (SBS) offers a robust platform to decipher nucleic acid sequences. Recently, we reported a single-molecule nanopore-based SBS strategy that accurately distinguishes four bases by electronically detecting and differentiating four different polymer tags attached to the 5'-phosphate of the nucleotides during their incorporation into a growing DNA strand catalyzed by DNA polymerase. Further developing this approach, we report here the use of nucleotides tagged at the terminal phosphate with oligonucleotide-based polymers to perform nanopore SBS on an α-hemolysin nanopore array platform. We designed and synthesized several polymer-tagged nucleotides using tags that produce different electrical current blockade levels and verified they are active substrates for DNA polymerase. A highly processive DNA polymerase was conjugated to the nanopore, and the conjugates were complexed with primer/template DNA and inserted into lipid bilayers over individually addressable electrodes of the nanopore chip. When an incoming complementary-tagged nucleotide forms a tight ternary complex with the primer/template and polymerase, the tag enters the pore, and the current blockade level is measured. The levels displayed by the four nucleotides tagged with four different polymers captured in the nanopore in such ternary complexes were clearly distinguishable and sequence-specific, enabling continuous sequence determination during the polymerase reaction. Thus, real-time single-molecule electronic DNA sequencing data with single-base resolution were obtained. The use of these polymer-tagged nucleotides, combined with polymerase tethering to nanopores and multiplexed nanopore sensors, should lead to new high-throughput sequencing methods. PMID:27091962

  17. Construction of a normalized directionally cloned cDNA library from adult heart and analysis of 3040 clones by partial sequencing.

    PubMed

    Tanaka, T; Ogiwara, A; Uchiyama, I; Takagi, T; Yazaki, Y; Nakamura, Y

    1996-07-01

    Large-scale sequencing of clones from cDNA libraries derived from specific tissues is a rapid and efficient way of discovering novel genes expressed in those tissues. However, because the heart is continually contracting and relaxing, it strongly expresses muscle-contractile genes and/or mitochondrial genes, a bias that reduces the efficiency of this method. To improve the efficiency of identifying novel genes expressed in the heart, we constructed a normalized directionally cloned cDNA library from adult heart and partially sequenced 3040 clones. Comparisons of these sequence data with known DNA sequences in the database revealed that 57.1% of the clones matched human genes already known, 23.4% were identical or almost identical to human expressed sequence tags (ESTs), 14.2% bore no significant homology to any sequences in the database, and 1.2% represented repetitive sequences. The remaining 4.1% showed some homology with known genes, and Northern blot analysis of several clones in this category revealed that most of them were expressed mainly in the heart and skeletal muscle. After redundancy was excluded, the 3040 clones accounted for 1395 distinctive ESTs, 446 of which exhibited no match to any known sequence. Our results suggest that our normalized library is less redundant than standard libraries and is a useful resource for cataloging genes expressed in the heart. PMID:8661126

  18. Cloning and nucleotide sequence of the gene coding for citrate synthase from a thermotolerant Bacillus sp

    SciTech Connect

    Schendel, F.J.; August, P.R.; Anderson, C.R.; Flickinger, M.C.; Hanson, R.S.

    1992-01-01

    Acetate salts are emerging as potentially attractive bulk chemicals for a variety of environmental applications, for example, as catalysts to facilitate combustion of high-sulfur coal by electrical utilities and as the biodegradable noncorrosive highway deicing salt calcium magnesium acetate. The structural gene coding for citrate synthase from the gram-positive soil isolate Bacillus sp. strain C4 (ATCC 55182) capable of secreting acetic acid at pH 5.0 to 7.0 in the presence of dolime has been cloned from a genomic library by complementation of an Escherichia coli auxotrophic mutant lacking citrate synthase. The nucleotide sequence of the entire 3.1-kb HindIII fragment has been determined, and one major open reading frame was found coding for citrate synthase (ctsA). Citrate synthase from Bacillus sp. strain C4 was found to be a dimer (Mr, 84,500) with a subunit with an Mr of 42,000. The N-terminal sequence was found to be identical with that predicted from the gene sequence. The kinetics were best fit to a bisubstrate enzyme with an ordered mechanism. Bacillus sp. strain C4 citrate synthase was not activated by potassium chloride and was not inhibited by NADH, ATP, ADP, or AMP at levels up to 1 mM. The predicted amino acid sequence was compared with that of the E. coli, Acinetobacter anitratum, Pseudomonas aeruginosa, Rickettsia prowazekii, porcine heart, and Saccharomyces cerevisiae cytoplasmic and mitochondrial enzymes.

  19. Complete nucleotide sequence and experimental host range of Okra mosaic virus.

    PubMed

    Stephan, Dirk; Siddiqua, Mahbuba; Ta Hoang, Anh; Engelmann, Jill; Winter, Stephan; Maiss, Edgar

    2008-02-01

    Okra mosaic virus (OkMV) is a tymovirus infecting members of the family Malvaceae. Early infections in okra (Abelmoschus esculentus) lead to yield losses of 12-19.5%. Besides intensive biological characterizations of OkMV, only minor molecular data were available. Therefore, we determined the complete nucleotide sequence of a Nigerian isolate of OkMV. The complete genomic RNA (gRNA) comprises 6,223 nt and its genome organization showed three major ORFs coding for a putative movement protein (MP) of Mr 73.1 kDa, a large replication-associated protein (RP) of Mr 202.4 kDa and a coat protein (CP) of Mr 19.6 kDa. Prediction of secondary RNA structures showed three hairpin structures with internal loops in the 5'-untranslated region (UTR) and a 3'-terminal tRNA-like structure (TLS) which comprises the anticodon for valine, typical for a member of the genus Tymovirus. Phylogenetic comparisons based on the RP, MP and CP amino acid sequences showed the close relationship of OkMV not only to other completely sequenced tymoviruses like Kennedya yellow mosaic virus (KYMV), Turnip yellow mosaic virus (TYMV) and Erysimum latent virus (ErLV), but also to Calopogonium yellow vein virus (CalYVV), Clitoria yellow vein virus (CYVV) and Desmodium yellow mottle virus (DYMoV). This is the first report of a complete OkMV genome sequence from one of the various OkMV isolates originating from West Africa described so far. Additionally, the experimental host range of OkMV including several Nicotiana species was determined. PMID:18049886

  20. The qa repressor gene of Neurospora crassa: wild-type and mutant nucleotide sequences.

    PubMed Central

    Huiet, L; Giles, N H

    1986-01-01

    The qa-1S gene, one of two regulatory genes in the qa gene cluster of Neurospora crassa, encodes the qa repressor. The qa-1S gene together with the qa-1F gene, which encodes the qa activator protein, control the expression of all seven qa genes, including those encoding the inducible enzymes responsible for the utilization of quinic acid as a carbon source. The nucleotide sequence of the qa-1S gene and its flanking regions has been determined. The deduced coding sequence for the qa-1S protein encodes 918 amino acids with a calculated molecular weight of 100,650 and is interrupted by a single 66-base-pair intervening sequence. Both constitutive and noninducible mutants occur in the qa-1S gene and two different mutations of each type have been cloned and sequenced. All four mutations occur within the predicted coding region of the qa-1S gene. This result strongly supports the hypothesis that the qa-1S gene encodes a repressor. All four mutations are located within codons for the last 300 amino acids of the qa-1S protein. The mutations in three of the mutants involve amino acid substitutions, while the fourth mutant, which has a constitutive phenotype, contains a frameshift mutation. The two constitutive mutations occur in the most distal region of the gene, possibly implicating the COOH-terminal region of the qa repressor in binding to its target. The two noninducible mutations occur in a region proximal to the constitutive mutations, possibly implicating this region of the qa repressor in binding the inducer. PMID:3010294

  1. Nucleotide sequence and expression of the capsid protein gene of feline calicivirus.

    PubMed Central

    Neill, J D; Reardon, I M; Heinrikson, R L

    1991-01-01

    The sequence of the 3'-terminal 2,486 bases of the feline calicivirus (FCV) genome was determined. This region of the FCV genome, from which the 2.4-kb subgenomic RNA is derived, contained two open reading frames. The larger open reading frame, found in the 5' end of the subgenomic mRNA, contained 2,004 bases encoding a polypeptide of 73,467 Da. The smaller open reading frame, encoded in the 3' end of the mRNA, was composed of 318 bases, encoding a polypeptide of 12,185 Da. The AUG initiation codon of the second open reading frame overlapped the UGA termination codon of the first, with the sequence AUGA. The nucleotide sequence of the region containing this overlap resembles the -1 frameshift sequences of the retroviruses. The 5' end of the 2.4-kb subgenomic RNA was mapped by primer extension analysis. There were two apparent transcription initiation points, both of which were 5' to the AUG initiation codon of the large open reading frame. Transcription from these sites yielded RNA transcripts with 5' nontranslated leader regions of 17 and 18 bases. The total length of the 2.4-kb subgenomic RNA was 2,375 bases (from the 5'-most start site) excluding the poly(A) tail. Edman degradation of the purified capsid protein of FCV showed that the capsid protein was encoded by the large open reading frame. Western immunoblot analysis of FCV-infected cells using a feline anti-FCV antiserum demonstrated that translation of the capsid protein was detectable at 3 h postinfection and continued to accumulate until 8 h postinfection, the last time examined. PMID:1716692
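
    The start/stop overlap described in this abstract (an AUG that shares two nucleotides with the upstream UGA, giving AUGA) can be located with a simple forward-frame ORF scan. The sketch below is illustrative only: the sequence is a hypothetical toy RNA, not FCV sequence, and the helper names are made up for this example.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def find_orfs(rna, min_codons=2):
    """Return sorted (start, end) pairs for AUG...stop ORFs in the three forward frames.

    `end` is exclusive and includes the stop codon. `min_codons` counts the
    codons before the stop, including the AUG.
    """
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(rna):
            if rna[i:i + 3] == "AUG":
                j = i
                # Walk codon by codon until a stop codon or the end of the RNA.
                while j + 3 <= len(rna) and rna[j:j + 3] not in STOP_CODONS:
                    j += 3
                found_stop = j + 3 <= len(rna)
                if found_stop and (j - i) // 3 >= min_codons:
                    orfs.append((i, j + 3))
                    i = j + 3
                    continue
            i += 3
    return sorted(orfs)


def start_stop_overlaps(orfs):
    """Pairs (upstream, downstream) whose stop and start codons overlap as AUGA.

    The downstream AUG begins 4 nt before the upstream ORF's exclusive end,
    so the downstream ORF lies in the -1 frame relative to the upstream one.
    """
    return [(a, b) for a in orfs for b in orfs if b[0] == a[1] - 4]


# Hypothetical toy RNA: ORF 1 ends in ...GCA UGA, ORF 2 starts at the AUG inside "AUGA".
TOY_RNA = "AUGGCCAAAGCAUGACCGGUUAACC"
orfs = find_orfs(TOY_RNA)
print(orfs)                       # [(0, 15), (11, 23)]
print(start_stop_overlaps(orfs))  # [((0, 15), (11, 23))]
```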

  2. cDNA cloning and structural characterization of a lectin from the mussel Crenomytilus grayanus with a unique amino acid sequence and antibacterial activity.

    PubMed

    Kovalchuk, Svetlana N; Chikalovets, Irina V; Chernikov, Oleg V; Molchanova, Valentina I; Li, Wei; Rasskazov, Valery A; Lukyanov, Pavel A

    2013-10-01

    An amino acid sequence of GalNAc/Gal-specific lectin from the mussel Crenomytilus grayanus (CGL) was determined by cDNA sequencing. CGL consists of 150 amino acid residues, contains three tandem repeats with high sequence similarities to each other (up to 73%) and does not belong to any known lectin family. According to circular dichroism results, CGL is a β/α-protein with the predominance of β-structure. CGL was predicted to adopt a β-trefoil fold. The lectin exhibits antibacterial activity and might be involved in the recognition and clearance of bacterial pathogens in the shellfish. PMID:23886951

  3. Fibulin-2 (FBLN2): Human cDNA sequence, mRNA expression, and mapping of the gene on human and mouse chromosomes

    SciTech Connect

    Zhang, R.Z.; Pan, T.C.; Zhang, Z.Y.

    1994-07-15

    Fibulin-2 is a new extracellular matrix protein recently identified by characterizing mouse cDNA clones. Fibulin-2 mRNA is prominently expressed in mouse heart tissue and is present in low amounts in other tissues. In this study, the authors isolated and sequenced a 4.1-kb human fibulin-2 cDNA, which encoded a mature protein of 1157 amino acids preceded by a 27-residue signal sequence. The predicted polypeptide contains three consecutive anaphylatoxin-related segments (domain I) in its central region followed by 10 EGF-like repeats (domain II), 9 of which have a consensus sequence for calcium binding. The 408-residue N-terminal region consists of two separate subdomains, a cysteine-rich segment of 150 residues (Na subdomain) and a cysteine-free segment with a stretch of acidic amino acids (Nb subdomain). The 115-residue C-terminal segment (domain III) is similar to the C variant of fibulin-1. The amino acid sequences of the human and mouse fibulin-2 share approximately 90% identity in domains Na, I, II, and III but only 62% identity in domain Nb. The human cDNA lacks an EGF-like repeat, which is alternatively spliced in the mouse cDNA clones, and a potential cell-binding Arg-Gly-Asp sequence found in the Nb domain of the mouse counterpart. Northern blot analysis of mRNA from various human tissues reveals an abundant 4.5-kb transcript in heart, placenta, and ovary tissue. The expression pattern differs from that of fibulin-1. The fibulin-2 gene was localized by in situ hybridization to the p24-p25 region of human chromosome 3 and to the band D-E of mouse chromosome 6. 27 refs., 5 figs.

  4. Nucleotide sequence of the 3'-noncoding region of alfalfa mosaic virus RNA 4 and its homology with the genomic RNAs.

    PubMed Central

    Koper-Zwarthoff, E C; Brederode, F T; Walstra, P; Bol, J F

    1979-01-01

    A 226-nucleotide fragment was derived from alfalfa mosaic virus RNA 4 (AlMV RNA 4), the subgenomic messenger for viral coat protein, and its sequence was deduced by in vitro labeling with polynucleotide kinase and application of RNA sequencing techniques. The fragment contains the 3'-terminal 45 nucleotides of the coat protein cistron and the complete 3'-noncoding region of 182 nucleotides. The total length of RNA 4 was calculated to be 881 nucleotides. AlMV RNAs 1, 2 and 3 were elongated with a 3'-terminal poly(A) stretch and subjected to sequence analysis by using a specific primer, reverse transcriptase and chain terminators. This revealed an extensive homology between the 3'-terminal 140 to 150 nucleotides of all four AlMV RNAs. Despite a number of base substitutions, the secondary structure of the homologous region is highly conserved. The observed homology indicates that, as with RNA 4, the sites with a high affinity for the viral coat protein are located at the 3'-termini of the genomic RNAs. PMID:537914

  5. Nucleotide sequence of the Kaposi sarcoma-associated herpesvirus (HHV8)

    PubMed Central

    Russo, James J.; Bohenzky, Roy A.; Chien, Ming-Cheng; Chen, Jing; Yan, Ming; Maddalena, Dawn; Parry, J. Preston; Peruzzi, Daniela; Edelman, Isidore S.; Chang, Yuan; Moore, Patrick S.

    1996-01-01

    The genome of the Kaposi sarcoma-associated herpesvirus (KSHV or HHV8) was mapped with cosmid and phage genomic libraries from the BC-1 cell line. Its nucleotide sequence was determined except for a 3-kb region at the right end of the genome that was refractory to cloning. The BC-1 KSHV genome consists of a 140.5-kb-long unique coding region flanked by multiple G+C-rich 801-bp terminal repeat sequences. A genomic duplication that apparently arose in the parental tumor is present in this cell culture-derived strain. At least 81 ORFs, including 66 with homology to herpesvirus saimiri ORFs, and 5 internal repeat regions are present in the long unique region. The virus encodes homologs to complement-binding proteins, three cytokines (two macrophage inflammatory proteins and interleukin 6), dihydrofolate reductase, bcl-2, interferon regulatory factors, interleukin 8 receptor, neural cell adhesion molecule-like adhesin, and a D-type cyclin, as well as viral structural and metabolic proteins. Terminal repeat analysis of virus DNA from a KS lesion suggests a monoclonal expansion of KSHV in the KS tumor. PMID:8962146

  6. Complete nucleotide sequence of rose yellow leaf virus, a new member of the family Tombusviridae.

    PubMed

    Mollov, Dimitre; Lockhart, Ben; Zlesak, David C

    2014-10-01

    The genome of the rose yellow leaf virus (RYLV) has been determined to be 3918 nucleotides long and to contain seven open reading frames (ORFs). ORF1 encodes a 27-kDa peptide (p27). ORF2 shares a common start codon with ORF1 and continues through the amber stop codon of p27 to encode an 87-kDa (p87) protein that has amino acid similarity to the RNA-dependent RNA polymerase (RdRp) of members of the family Tombusviridae. ORFs 3 and 4 have no significant amino acid similarity to known functional viral ORFs. ORF5 encodes a 6-kDa (p6) protein that has similarity to movement proteins of members of the Tombusviridae. ORF5A has no conventional start codon and overlaps with p6. A putative +1 frameshift mechanism allows p6 translation to continue through the stop codon and results in a 12-kDa protein that has high homology to the carmovirus p13 movement protein. The 37-kDa protein encoded by ORF6 has amino acid sequence similarity to coat proteins (CP) of members of the Tombusviridae. ORF7 has no significant amino acid similarity to known viral ORFs. Phylogenetic analysis of the RdRp amino acid sequences grouped RYLV together with the unclassified Rosa rugosa leaf distortion virus (RrLDV), pelargonium line pattern virus (PLPV), and pelargonium chlorotic ring pattern virus (PCRPV) in a distinct subgroup of the family Tombusviridae. PMID:24838852

  7. Use of nucleotide sequence data to identify a microsporidian pathogen of Pieris rapae (Lepidoptera, Pieridae).

    PubMed

    Malone, L A; McIvor, C A

    1996-11-01

    Nucleotide sequence was determined for a portion of genomic DNA which spans the V4 variable region of the small subunit ribosomal RNA gene of an unidentified microsporidium from the cabbage white butterfly, Pieris rapae (174 base pairs). Comparison with equivalent sequence data obtained here for two other microsporidian species, Nosema bombycis (240 base pairs) and Nosema bombi (200 base pairs), and from the GenBank database for 11 other microsporidian species suggests that the unidentified species from P. rapae is most closely related to some Vairimorpha species. Light and electron microscopic observations of the developmental stages of this parasite were in accord with this. Infection experiments conducted at 20 and 26 degrees C demonstrated temperature-dependent dimorphism, with the production of both binucleate free spores (mean dimensions: 3.8 x 1.8 microns; 10-13 polar filament coils) and membrane-bound uninucleate octospores (mean dimensions: 3.1 x 1.9 microns). Macrospores (mean dimensions 8.0 x 2.1 microns) were also observed. Sites of infection were the gut epithelium, the Malpighian tubules, the salivary glands, and the fat body. Infections were found in all insect life stages, including the egg. This microsporidium was found to be indistinguishable from both Nosema mesnili (Paillot) and Microsporidium (Thelohania) mesnili (Paillot) and we propose that these species be combined and transferred to the genus Vairimorpha Pilley. PMID:8931362

  8. Predicting Mendelian Disease-Causing Non-Synonymous Single Nucleotide Variants in Exome Sequencing Studies

    PubMed Central

    Bao, Su-Ying; Yang, Wanling; Ho, Shu-Leong; Song, Yong-Qiang; Sham, Pak C.

    2013-01-01

    Exome sequencing is becoming a standard tool for mapping Mendelian disease-causing (or pathogenic) non-synonymous single nucleotide variants (nsSNVs). Minor allele frequency (MAF) filtering approach and functional prediction methods are commonly used to identify candidate pathogenic mutations in these studies. Combining multiple functional prediction methods may increase accuracy in prediction. Here, we propose to use a logit model to combine multiple prediction methods and compute an unbiased probability of a rare variant being pathogenic. Also, for the first time we assess the predictive power of seven prediction methods (including SIFT, PolyPhen2, CONDEL, and logit) in predicting pathogenic nsSNVs from other rare variants, which reflects the situation after MAF filtering is done in exome-sequencing studies. We found that a logit model combining all or some original prediction methods outperforms other methods examined, but is unable to discriminate between autosomal dominant and autosomal recessive disease mutations. Finally, based on the predictions of the logit model, we estimate that an individual has around 5% of rare nsSNVs that are pathogenic and carries ∼22 pathogenic derived alleles at least, which if made homozygous by consanguineous marriages may lead to recessive diseases. PMID:23341771
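    The logit combination described in this abstract can be illustrated with a minimal sketch. The weights, intercept, and scores below are hypothetical placeholders, not values from the study, and the sketch assumes each method's score has been rescaled so that higher means more likely damaging.

```python
import math


def combined_pathogenic_probability(scores, weights, intercept):
    """Combine per-method scores into one probability with a logistic (logit) model."""
    linear = intercept + sum(weights[name] * scores[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients; a real model would be fitted to labelled variants.
weights = {"SIFT": 1.2, "PolyPhen2": 1.5, "CONDEL": 0.8}
intercept = -2.0

# Hypothetical rescaled scores for two variants (higher = more damaging).
benign_like = {"SIFT": 0.10, "PolyPhen2": 0.05, "CONDEL": 0.20}
damaging_like = {"SIFT": 0.90, "PolyPhen2": 0.95, "CONDEL": 0.80}

p_low = combined_pathogenic_probability(benign_like, weights, intercept)
p_high = combined_pathogenic_probability(damaging_like, weights, intercept)
print(f"benign-like: {p_low:.3f}, damaging-like: {p_high:.3f}")
```

    With positive weights, the variant scored as more damaging by every method receives the higher combined probability.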

  9. Predicting mendelian disease-causing non-synonymous single nucleotide variants in exome sequencing studies.

    PubMed

    Li, Miao-Xin; Kwan, Johnny S H; Bao, Su-Ying; Yang, Wanling; Ho, Shu-Leong; Song, Yong-Qiang; Sham, Pak C

    2013-01-01

    Exome sequencing is becoming a standard tool for mapping Mendelian disease-causing (or pathogenic) non-synonymous single nucleotide variants (nsSNVs). Minor allele frequency (MAF) filtering approach and functional prediction methods are commonly used to identify candidate pathogenic mutations in these studies. Combining multiple functional prediction methods may increase accuracy in prediction. Here, we propose to use a logit model to combine multiple prediction methods and compute an unbiased probability of a rare variant being pathogenic. Also, for the first time we assess the predictive power of seven prediction methods (including SIFT, PolyPhen2, CONDEL, and logit) in predicting pathogenic nsSNVs from other rare variants, which reflects the situation after MAF filtering is done in exome-sequencing studies. We found that a logit model combining all or some original prediction methods outperforms other methods examined, but is unable to discriminate between autosomal dominant and autosomal recessive disease mutations. Finally, based on the predictions of the logit model, we estimate that an individual has around 5% of rare nsSNVs that are pathogenic and carries ~22 pathogenic derived alleles at least, which if made homozygous by consanguineous marriages may lead to recessive diseases. PMID:23341771

  10. Mutations in core nucleotide sequence of hepatitis B virus correlate with fulminant and severe hepatitis.

    PubMed Central

    Ehata, T; Omata, M; Chuang, W L; Yokosuka, O; Ito, Y; Hosoda, K; Ohto, M

    1993-01-01

    Infection with hepatitis B virus leads to a wide spectrum of liver injury, including self-limited acute hepatitis, fulminant hepatitis, and chronic hepatitis with progression to cirrhosis or acute exacerbation to liver failure, as well as an asymptomatic chronic carrier state. Several studies have suggested that the hepatitis B core antigen could be an immunological target of cytotoxic T lymphocytes. To investigate why the extreme immunological attack occurred in fulminant hepatitis and severe exacerbation patients, the entire precore and core region of hepatitis B virus DNA was sequenced in 24 subjects (5 fulminant, 10 severe fatal exacerbation, and 9 self-limited acute hepatitis patients). No significant change in the nucleotide sequence and deduced amino acid residue was noted in the nine self-limited acute hepatitis patients. In contrast, clustering changes in a small segment of 16 amino acids (codons 84-99 from the start of the core gene) were found in all seven adr subtype infected fulminant and severe exacerbation patients. A different segment with clustering substitutions (codons 48-60) was also found in seven of eight adw subtype infected fulminant and severe exacerbation patients. Of the 15 patients, 2 lacked the precore stop mutation, which was previously reported to be associated with fulminant hepatitis. These data suggest that these core regions with mutations may play an important role in the pathogenesis of hepatitis B viral disease, and that such mutations are related to severe liver damage. Images PMID:8450049

  11. BIND - an algorithm for loss-less compression of nucleotide sequence data.

    PubMed

    Bose, Tungadri; Mohammed, Monzoorul Haque; Dutta, Anirban; Mande, Sharmila S

    2012-09-01

    Recent advances in DNA sequencing technologies have enabled the current generation of life science researchers to probe deeper into the genomic blueprint. The amount of data generated by these technologies has been increasing exponentially over the last decade. Storage, archival and dissemination of such huge data sets require efficient solutions, from both the hardware and the software perspective. The present paper describes BIND, an algorithm specialized for compressing nucleotide sequence data. By adopting a unique 'block-length' encoding for representing binary data (as a key step), BIND achieves significant compression gains as compared to the widely used general purpose compression algorithms (gzip, bzip2 and lzma). Moreover, in contrast to implementations of existing specialized genomic compression approaches, the implementation of BIND is enabled to handle non-ATGC and lowercase characters. This makes BIND a loss-less compression approach that is suitable for practical use. More importantly, validation results of BIND (with real-world data sets) indicate reasonable speeds of compression and decompression that can be achieved with minimal processor/memory usage. BIND is available for download at http://metagenomics.atc.tcs.com/compression/BIND. No license is required for academic or non-profit use. PMID:22922203
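    The abstract does not spell out how the 'block-length' encoding works, so the following is only a simplified illustration of the general idea, not BIND's implementation. It assumes a hypothetical 2-bit code per nucleotide, splits the bits into two streams, and stores each stream as the lengths of its alternating runs of 0s and 1s. Unlike BIND, this sketch handles only uppercase A, C, G and T.

```python
from itertools import groupby

# Hypothetical 2-bit code per nucleotide (an assumption for illustration).
CODES = {"A": (0, 0), "T": (0, 1), "G": (1, 0), "C": (1, 1)}
DECODE = {bits: base for base, bits in CODES.items()}


def run_lengths(bits):
    """Return (first_bit, lengths) where lengths are the sizes of alternating runs."""
    runs = [(bit, len(list(group))) for bit, group in groupby(bits)]
    first = runs[0][0] if runs else 0
    return first, [length for _, length in runs]


def expand(first, lengths):
    """Invert run_lengths: rebuild the bit list from the first bit and run sizes."""
    bits, bit = [], first
    for length in lengths:
        bits.extend([bit] * length)
        bit ^= 1  # runs alternate between 0 and 1
    return bits


def encode(seq):
    """Split a sequence into two bit streams and run-length encode each one."""
    high = [CODES[base][0] for base in seq]
    low = [CODES[base][1] for base in seq]
    return run_lengths(high), run_lengths(low)


def decode(encoded):
    """Rebuild the original sequence from the two encoded bit streams."""
    (first_high, len_high), (first_low, len_low) = encoded
    pairs = zip(expand(first_high, len_high), expand(first_low, len_low))
    return "".join(DECODE[pair] for pair in pairs)


sequence = "AAAATTTTGGGGCC"
assert decode(encode(sequence)) == sequence  # the round trip is loss-less
```

    Long runs of identical bits collapse into single integers, which is where a scheme of this kind gains over storing every bit.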

  12. Nucleotide sequence and phylogenetic analysis of a new potexvirus: Malva mosaic virus.

    PubMed

    Côté, Fabien; Paré, Christine; Majeau, Nathalie; Bolduc, Marilène; Leblanc, Eric; Bergeron, Michel G; Bernardy, Michael G; Leclerc, Denis

    2008-01-01

    A filamentous virus isolated from Malva neglecta Wallr. (common mallow) and propagated in Chenopodium quinoa was grown and cloned, and its complete nucleotide sequence was determined (GenBank accession # DQ660333). The genomic RNA is 6858 nt in length and contains five major open reading frames (ORFs). The genomic organization is similar to that of members of the Potexvirus genus in the Flexiviridae family, and the virus-encoded proteins share homology with those of this group. Phylogenetic analysis revealed a close relationship with narcissus mosaic virus (NMV), scallion virus X (ScaVX) and, to a lesser extent, Alstroemeria virus X (AlsVX) and pepino mosaic virus (PepMV). A novel putative pseudoknot structure is predicted in the 3'-UTR of a subgroup of potexviruses, including this newly described virus. The consensus GAAAA sequence is detected at the 5'-end of the genomic RNA and experimental data strongly suggest that this motif could be a distinctive hallmark of this genus. The name Malva mosaic virus is proposed. PMID:18054524

  13. Nucleotide sequence and expression of alpha-glucosidase-encoding gene (agdA) from Aspergillus oryzae.

    PubMed

    Minetoki, T; Gomi, K; Kitamoto, K; Kumagai, C; Tamura, G

    1995-08-01

    We have isolated an alpha-glucosidase (AGL)-encoding gene (agdA) from Aspergillus oryzae by heterologous hybridization using the corresponding Aspergillus niger gene as a probe. Southern hybridization analysis showed that the agdA gene is on a 5.0-kb ScaI fragment and there is a single copy in the A. oryzae chromosome. Comparison with the A. niger agdA gene indicated that the agdA gene contains three putative introns from 52 to 59 nucleotides long, and that it encodes 985 amino acid residues. The deduced amino acid sequence of A. oryzae AGL is 78% homologous with the A. niger AGL. The high degree of homology between this enzyme and a number of AGL enzymes in the amino acid sequence bordering the putative catalytic residue suggests that Asp492 is a catalytic residue of A. oryzae AGL. The cloned gene was functional. Transformants of A. oryzae containing multiple copies of the cloned agdA gene showed a 6-16 fold increase in AGL activity. Like the Taka-amylase A and glucoamylase genes of A. oryzae, expression of the agdA gene was induced when maltose was provided as a carbon source, but expression was not induced by glucose. This result suggested that cis-element(s) involved in maltose induction may also be present in the agdA promoter region. PMID:7549103

  14. Nucleotide sequence and organization of the human S-protein gene: repeating peptide motifs in the pexin family and a model for their evolution

    SciTech Connect

    Jenne, D.; Stanley, K.K.

    1987-10-20

    The S-protein/vitronectin gene was isolated from a human genomic DNA library, and its sequence of about 5.3 kilobases including the adjacent 5' and 3' flanking regions was established. Alignment of the genomic DNA nucleotide sequence and the cDNA sequence indicated that the gene consisted of eight exons and seven introns. The intron positions in the S-protein gene and their phase type were compared to those in the hemopexin gene, which shares amino acid sequence homologies with transin and the S-protein. Three introns have been found at equivalent positions; two other introns are very close to these positions and are interpreted as cases of intron sliding. Introns 3-7 occur at a conserved glycine residue within repeating peptide segments, whereas introns 1 and 2 are at the boundaries of the Somatomedin B domain of S-protein. The analysis of the exon structure in relation to repeating peptide motifs within the S-protein strongly suggests that it contains only seven repeats, one less than the hemopexin molecule. A repeat pattern very similar to that in hemopexin is shown to be present also in two other related proteins, transin and interstitial collagenase. An evolutionary model for the generation of the repeat pattern in the S-protein and the other members of this novel pexin gene family is proposed, and the sequence modifications for some of the repeats during divergent evolution are discussed in relation to known unique functional properties of hemopexin and S-protein.

  15. The human Ig-β cDNA sequence, a homologue of murine B29, is identical in B cell and plasma cell lines producing all the human Ig isotypes

    SciTech Connect

    Hashimoto, Shiori; Gregersen, P.K.; Chiorazzi, N. (Cornell Univ., New York, NY)

    1993-01-15

    The B cell Ag receptor complex consists of at least two disulfide-linked, heterodimeric structures: the clonally restricted membrane Ig (mIg) molecule and the nonpolymorphic Ig-α:Ig-β protein dimer. The latter molecule is encoded by two separate genes, mb-1 and B29. The DNA sequences of murine and human mb-1 and murine B29 have been determined previously. This study describes the sequence of the full-length human cDNA homologue of the murine Ig-β/B29 message. The human sequence codes for a protein that displays the typical subunit features of a transmembrane member of the Ig superfamily. The transmembrane and intracytoplasmic domains exhibit striking nucleotide and amino acid sequence similarity between the two species. These regions show almost complete conservation of areas presumed to be involved in noncovalent interactions with other members of the receptor complex and with intracellular kinases and cytoskeletal components. The only sequence dissimilarity seen in these presumed critical areas involves the Y-E-G-L-N motif, a potential target for tyrosine phosphorylation. In contrast, the extracellular portion is much more divergent. Inasmuch as similar patterns of species diversity have been reported for Ig-α, the Ig-α and Ig-β molecules may have coevolved to maintain species-specific extracellular interactions between one another and with mIg. Similar to the Ig-α molecule, the Ig-β sequence is identical in B lineage cells expressing all five Ig isotypes. However, in contrast to the Ig-α molecule, the Ig-β sequence is expressed at apparently similar levels in terminally differentiated, mIg[sup [minus

  16. ANTICALIgN: visualizing, editing and analyzing combined nucleotide and amino acid sequence alignments for combinatorial protein engineering.

    PubMed

    Jarasch, Alexander; Kopp, Melanie; Eggenstein, Evelyn; Richter, Antonia; Gebauer, Michaela; Skerra, Arne

    2016-07-01

    ANTICALIgN is interactive software developed to simultaneously visualize, analyze and modify alignments of DNA and/or protein sequences that arise during combinatorial protein engineering, design and selection. ANTICALIgN combines powerful functions known from currently available sequence analysis tools with unique features for protein engineering, in particular the possibility to display and manipulate nucleotide sequences and their translated amino acid sequences at the same time. ANTICALIgN offers both template-based multiple sequence alignment (MSA), using the unmutated protein as reference, and conventional global alignment, to compare sequences that share an evolutionary relationship. The application of similarity-based clustering algorithms facilitates the identification of duplicates or of conserved sequence features among a set of selected clones. Imported nucleotide sequences from DNA sequence analysis are automatically translated into the corresponding amino acid sequences and displayed, offering numerous options for selecting reading frames, highlighting of sequence features and graphical layout of the MSA. The MSA complexity can be reduced by hiding the conserved nucleotide and/or amino acid residues, thus putting emphasis on the relevant mutated positions. ANTICALIgN is also able to handle suppressed stop codons or even to incorporate non-natural amino acids into a coding sequence. We demonstrate crucial functions of ANTICALIgN in an example of Anticalins selected from a lipocalin random library against the fibronectin extradomain B (ED-B), an established marker of tumor vasculature. Apart from engineered protein scaffolds, ANTICALIgN provides a powerful tool in the area of antibody engineering and for directed enzyme evolution. PMID:27261456

  17. Nucleotide sequence encoding the flavoprotein and hydrophobic subunits of the succinate dehydrogenase of Escherichia coli.

    PubMed Central

    Wood, D; Darlison, M G; Wilde, R J; Guest, J R

    1984-01-01

    The nucleotide sequence of a 3614 base-pair segment of DNA containing the sdhA gene, encoding the flavoprotein subunit of succinate dehydrogenase of Escherichia coli, and two genes sdhC and sdhD, encoding small hydrophobic subunits, has been determined. Together with the iron-sulphur protein gene (sdhB) these genes form an operon (sdhCDAB) situated between the citrate synthase gene (gltA) and the 2-oxoglutarate dehydrogenase complex genes (sucAB): gltA-sdhCDAB-sucAB. Transcription of the gltA and sdhCDAB genes appears to diverge from a single intergenic region that contains two pairs of potential promoter sequences and two putative CRP (cyclic AMP receptor protein)-binding sites. The sdhA structural gene comprises 1761 base-pairs (587 codons, excluding the initiation codon, AUG) and it encodes a polypeptide of Mr 64268 that is strikingly homologous with the flavoprotein subunit of fumarate reductase (frdA gene product). The FAD-binding region, including the histidine residue at the FAD-attachment site, has been identified by its homology with other flavoproteins and with the flavopeptide of the bovine heart mitochondrial succinate dehydrogenase. Potential active-site cysteine and histidine residues have also been indicated by the comparisons. The sdhC (384 base-pairs) and sdhD (342 base-pairs) structural genes encode two strongly hydrophobic proteins of Mr 14167 and 12792 respectively. These proteins resemble in size and composition, but not sequence, the membrane anchor proteins of fumarate reductase (the frdC and frdD gene products). PMID:6383359
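    The gene lengths quoted in this abstract can be checked against the triplet code with a short calculation. This is a sketch that uses only the base-pair figures given above; the helper name is hypothetical.

```python
def triplet_count(base_pairs):
    """Return the number of codon-sized triplets in a coding region of this length."""
    if base_pairs % 3 != 0:
        raise ValueError("length is not a whole number of codons")
    return base_pairs // 3


# Structural gene lengths in base-pairs, as quoted in the abstract.
gene_lengths_bp = {"sdhA": 1761, "sdhC": 384, "sdhD": 342}

codon_counts = {gene: triplet_count(bp) for gene, bp in gene_lengths_bp.items()}
for gene, count in codon_counts.items():
    print(f"{gene}: {gene_lengths_bp[gene]} bp = {count} triplets")
```

    The 1761 base-pairs of sdhA divide into exactly 587 triplets, matching the codon count given in the abstract.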

  18. Complete Nucleotide Sequences and Genome Organization of Two Pepper Mild Mottle Virus Isolates from Capsicum annuum in South Korea

    PubMed Central

    Choi, Seung-Kook; Choi, Gug-Seoun; Kwon, Sun-Jung

    2016-01-01

    The complete genome sequences of pepper mild mottle virus (PMMoV)-P2 and -P3 were determined by the Sanger sequencing method. Although PMMoV-P2 and PMMoV-P3 have different pathogenicity in some pepper cultivars, the complete genome sequences of PMMoV-P2 and -P3 are composed of 6,356 nucleotides (nt). In this study, we report the complete genome sequences and genome organization of PMMoV-P2 and -P3 isolates from pepper species in South Korea. PMID:27198033

  19. Complete Nucleotide Sequences and Genome Organization of Two Pepper Mild Mottle Virus Isolates from Capsicum annuum in South Korea.

    PubMed

    Choi, Seung-Kook; Choi, Gug-Seoun; Kwon, Sun-Jung; Yoon, Ju-Yeon

    2016-01-01

    The complete genome sequences of pepper mild mottle virus (PMMoV)-P2 and -P3 were determined by the Sanger sequencing method. Although PMMoV-P2 and PMMoV-P3 have different pathogenicity in some pepper cultivars, the complete genome sequences of PMMoV-P2 and -P3 are composed of 6,356 nucleotides (nt). In this study, we report the complete genome sequences and genome organization of PMMoV-P2 and -P3 isolates from pepper species in South Korea. PMID:27198033

  20. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... is DNA, RNA, or PRT (protein). If a nucleotide sequence contains both DNA and RNA fragments, the type shall be “DNA.” In addition, the combined DNA/RNA molecule shall be further described in the to feature... combined DNA/RNA” Name/Key Provide appropriate identifier for feature, preferably from WIPO Standard...

  1. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... is DNA, RNA, or PRT (protein). If a nucleotide sequence contains both DNA and RNA fragments, the type shall be “DNA.” In addition, the combined DNA/RNA molecule shall be further described in the to feature... combined DNA/RNA” Name/Key Provide appropriate identifier for feature, preferably from WIPO Standard...

  2. Empirical Comparison of Simple Sequence Repeats and Single Nucleotide Polymorphisms in Assessment of Maize Diversity and Relatedness

    Technology Transfer Automated Retrieval System (TEKTRAN)

    While Simple Sequence Repeats (SSRs) are extremely useful genetic markers, recent advances in technology have produced a shift toward use of single nucleotide polymorphisms (SNPs). The different mutational properties of these two classes of markers result in differences in heterozygosities and allel...

  3. A high-density simple sequence repeat and single nucleotide polymorphism genetic map of the tetraploid cotton genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Cotton genome complexity was investigated with a saturated molecular genetic map that combined several sets of microsatellites or simple sequence repeats (SSR) and the first major public set of single nucleotide polymorphism (SNP) markers in cotton genomes (Gossypium spp.), and that was constructed ...

  4. Nucleotide sequence of a predicted diguanylate cyclase unique to egg contaminating Salmonella enteritidis that does not form biofilm.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This is a nucleotide sequence submitted as bankit1052494 and given the GenBank accession number EU375808A post-review. It will be released to the public in June 2008 in coordination with an ASM abstract presentation. A paper will also be submitted that refers to this accession number. A diguanyla...

  5. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... is DNA, RNA, or PRT (protein). If a nucleotide sequence contains both DNA and RNA fragments, the type shall be “DNA.” In addition, the combined DNA/RNA molecule shall be further described in the to feature... combined DNA/RNA” Name/Key Provide appropriate identifier for feature, preferably from WIPO Standard...

  6. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... is DNA, RNA, or PRT (protein). If a nucleotide sequence contains both DNA and RNA fragments, the type shall be “DNA.” In addition, the combined DNA/RNA molecule shall be further described in the to feature... combined DNA/RNA” Name/Key Provide appropriate identifier for feature, preferably from WIPO Standard...

  7. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... is DNA, RNA, or PRT (protein). If a nucleotide sequence contains both DNA and RNA fragments, the type shall be “DNA.” In addition, the combined DNA/RNA molecule shall be further described in the to feature... combined DNA/RNA” Name/Key Provide appropriate identifier for feature, preferably from WIPO Standard...

  8. Molecular cloning and nucleotide sequence of a transforming gene detected by transfection of chicken B-cell lymphoma DNA

    NASA Astrophysics Data System (ADS)

    Goubin, Gerard; Goldman, Debra S.; Luce, Judith; Neiman, Paul E.; Cooper, Geoffrey M.

    1983-03-01

    A transforming gene detected by transfection of chicken B-cell lymphoma DNA has been isolated by molecular cloning. It is homologous to a conserved family of sequences present in normal chicken and human DNAs but is not related to transforming genes of acutely transforming retroviruses. The nucleotide sequence of the cloned transforming gene suggests that it encodes a protein that is partially homologous to the amino terminus of transferrin and related proteins although only about one tenth the size of transferrin.

  9. Nucleotide sequence of the gene encoding the major subunit of CS3 fimbriae of enterotoxigenic Escherichia coli.

    PubMed Central

    Boylan, M; Smyth, C J; Scott, J R

    1988-01-01

    The complete nucleotide sequence of a 612-base-pair DNA fragment containing the gene for the major fimbrial subunit of CS3 of enterotoxigenic Escherichia coli is presented. A possible promoter region, a ribosome-binding site, and two potential signal peptidase cleavage sites are indicated. Unlike the best-studied fimbrial proteins, the predicted CS3 sequence has no Cys residues. PMID:2903130

  10. Nucleotide sequences of 5S rRNAs from sponge Halichondria japonica and tunicate Halocynthia roretzi and their phylogenetic positions

    PubMed Central

    Komiya, Hiroyuki; Hasegawa, Masami; Takemura, Shosuke

    1983-01-01

    The nucleotide sequences of 5S rRNAs from sponge Halichondria japonica and tunicate Halocynthia roretzi were determined by chemical and enzymatic gel methods. Their phylogenetic positions among metazoans were derived from the 5S rRNA sequences by a computer analysis based on the maximum parsimony principle. It was suggested that the sponge is closely related to several invertebrates and the tunicate has affinity to vertebrates rather than invertebrates. PMID:6835845

  11. Single nucleotide polymorphism discovery in rainbow trout by deep sequencing of a reduced representation library

    PubMed Central

    2009-01-01

    Background: To enhance capabilities for genomic analyses in rainbow trout, such as genomic selection, a large suite of polymorphic markers that are amenable to high-throughput genotyping protocols must be identified. Expressed Sequence Tags (ESTs) have been used for single nucleotide polymorphism (SNP) discovery in salmonids. In those strategies, the salmonid semi-tetraploid genomes often led to assemblies of paralogous sequences and therefore resulted in a high rate of false positive SNP identification. Sequencing genomic DNA using primers identified from ESTs proved to be an effective but time-consuming methodology of SNP identification in rainbow trout, therefore not suitable for high-throughput SNP discovery. In this study, we employed a high-throughput strategy that used pyrosequencing technology to generate data from a reduced representation library constructed with genomic DNA pooled from 96 unrelated rainbow trout that represent the National Center for Cool and Cold Water Aquaculture (NCCCWA) broodstock population. Results: The reduced representation library consisted of 440 bp fragments resulting from complete digestion with the restriction enzyme HaeIII; sequencing produced 2,000,000 reads providing an average 6-fold coverage of the estimated 150,000 unique genomic restriction fragments (300,000 fragment ends). Three independent data analyses identified 22,022 to 47,128 putative SNPs on 13,140 to 24,627 independent contigs. A set of 384 putative SNPs, randomly selected from the sets produced by the three analyses, were genotyped on individual fish to determine the validation rate of putative SNPs among analyses, distinguish apparent SNPs that actually represent paralogous loci in the tetraploid genome, examine Mendelian segregation, and place the validated SNPs on the rainbow trout linkage map. Approximately 48% (183) of the putative SNPs were validated; 167 markers were successfully incorporated into the rainbow trout linkage map. In addition, 2% of the

  12. Gene-based single nucleotide polymorphism discovery in bovine muscle using next-generation transcriptomic sequencing

    PubMed Central

    2013-01-01

    Background: Genetic information based on molecular markers has increasingly been used in cattle breeding improvement programmes, as a means to improve conventional phenotypic selection. Advances in molecular genetics have led to the identification of several genetic markers associated with genes affecting economic traits. Until recently, the identification of the causative genetic variants involved in the phenotypes of interest has remained a difficult task. The advent of novel sequencing technologies now offers a new opportunity for the identification of such variants. Despite sequencing costs plummeting, sequencing whole-genomes or large targeted regions is still too expensive for most laboratories. A transcriptomic-based sequencing approach offers a cheaper alternative to identify a large number of polymorphisms and possibly to discover causative variants. In the present study, we performed a gene-based single nucleotide polymorphism (SNP) discovery analysis in bovine Longissimus thoraci, using RNA-Seq. To our knowledge, this represents the first study done in bovine muscle. Results: Messenger RNAs from Longissimus thoraci from three Limousin bull calves were subjected to high-throughput sequencing. Approximately 36–46 million paired-end reads were obtained per library. A total of 19,752 transcripts were identified and 34,376 different SNPs were detected. Fifty-five percent of the SNPs were found in coding regions and ~22% resulted in an amino acid change. Applying a very stringent SNP quality threshold, we detected 8,407 different high-confidence SNPs, 18% of which are non-synonymous coding SNPs. To analyse the accuracy of RNA-Seq technology for SNP detection, 48 SNPs were selected for validation by genotyping. No discrepancies were observed when using the highest SNP probability threshold. To test the usefulness of the identified SNPs, the 48 selected SNPs were assessed by genotyping 93 bovine samples, representing mostly the nine major breeds used in France

  13. Characteristic features of the nucleotide sequences of yeast mitochondrial ribosomal protein genes as analyzed by computer program GeneMark.

    PubMed

    Isono, K; McIninch, J D; Borodovsky, M

    1994-01-01

    The nucleotide sequence data for yeast mitochondrial ribosomal protein (MRP) genes were analyzed by the computer program GeneMark, which predicts the presence of likely genes in sequence data by calculating statistical biases in the appearance of consecutive nucleotides. The program uses a set of standard sequence data for this calculation. We used this program for the analysis of yeast nucleotide sequence data containing MRP genes, hoping to obtain information as to whether they share features in common that are different from other yeast genes. Sequence data sets for ordinary yeast genes and for 27 known MRP genes were used. The MRP genes were nicely predicted as likely genes regardless of the data sets used, whereas other yeast genes were predicted to be likely genes only when the data set for ordinary yeast genes was used. The assembled sequence data for chromosomes II, III, VIII and XI as well as the segmented data for chromosome V were analyzed in a similar manner. In addition to the known MRP genes, eleven ORFs were predicted to be likely MRP genes. Thus, the method seems very powerful in analyzing genes of heterologous origins. PMID:7719921
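    The idea of scoring a sequence by statistical biases in consecutive nucleotides, learned from a training set, can be sketched with a first-order Markov chain. This is a deliberately simplified illustration and not GeneMark's actual model; the training sequences are toy data, and the sketch assumes uppercase A, C, G and T only.

```python
import math

ALPHABET = "ACGT"


def train_markov(sequences):
    """Estimate first-order transition probabilities with add-one smoothing."""
    counts = {a: {b: 1 for b in ALPHABET} for a in ALPHABET}
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return {
        a: {b: counts[a][b] / sum(counts[a].values()) for b in ALPHABET}
        for a in ALPHABET
    }


def log_likelihood(seq, model):
    """Sum the log-probabilities of each consecutive nucleotide pair."""
    return sum(math.log(model[prev][nxt]) for prev, nxt in zip(seq, seq[1:]))


def log_odds(seq, model_a, model_b):
    """Positive when seq resembles training set A more than training set B."""
    return log_likelihood(seq, model_a) - log_likelihood(seq, model_b)


# Toy training sets standing in for two classes of genes (hypothetical data).
model_at = train_markov(["ATATATATAT", "TATATATA"])
model_gc = train_markov(["GCGCGCGC", "CGCGCGCG"])

print(log_odds("ATATAT", model_at, model_gc))  # positive: closer to the AT set
print(log_odds("GCGCGC", model_at, model_gc))  # negative: closer to the GC set
```

    Swapping the training set changes which query sequences score well, which mirrors the abstract's observation that predictions depended on the standard data set used.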

  14. Cloning and sequence analysis of beta-4 cDNA: an integrin subunit that contains a unique 118 kd cytoplasmic domain.

    PubMed Central

    Hogervorst, F; Kuikman, I; von dem Borne, A E; Sonnenberg, A

    1990-01-01

    The alpha 6 beta 4 complex is a member of the integrin superfamily of adhesion receptors. A human keratinocyte lambda gt11 cDNA library was screened using a monoclonal antibody directed against the beta 4 subunit. Two cDNAs were selected and subsequently used to isolate a complete set of overlapping cDNA clones. The beta 4 subunit consists of 1778 amino acids with a 683 amino acid extracellular domain, a 23 amino acid transmembrane domain and an exceptionally long cytoplasmic domain of 1072 residues. The deduced amino-terminal sequence is in good agreement with the published amino-terminal sequence of purified beta 4. The extracellular domain contains five potential N-linked glycosylation sites and four cysteine-rich homologous repeat sequences. The extracellular part of the beta 4 subunit sequence shows 35% identity with other integrin beta subunits, but is the most different among this class of molecules. The transmembrane region is poorly conserved, whereas the cytoplasmic domain shows no substantial identity in any region to the cytoplasmic tails of the known beta sequences or to other protein sequences. The exceptionally long cytoplasmic domain suggests distinct interactions of the beta 4 subunit with cytoplasmic proteins. Images Fig. 2. Fig. 3. PMID:2311578

  15. Transcription profiling of guanine nucleotide binding proteins during developmental regulation, and pesticide response in Solenopsis invicta (Hymenoptera: Formicidae)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Guanine nucleotide binding proteins (GNBP or G-protein) are glycoproteins anchored on the cytoplasmic cell membrane, and are mediators for many cellular processes. Complete cDNA of guanine nucleotide-binding protein gene β-subunit (SiGNBP) was cloned and sequenced from S. invicta workers. To detect ...

  16. Complete nucleotide sequence of the gene for the specific glycoprotein (gp55) of Friend spleen focus-forming virus.

    PubMed Central

    Amanuma, H; Katori, A; Obata, M; Sagata, N; Ikawa, Y

    1983-01-01

    The complete nucleotide sequence of the gene for the specific glycoprotein (gp55) of the polycythemic strain of Friend spleen focus-forming virus (SFFV) was derived from the cloned SFFV DNA intermediate. The gp55 gene is present within 1.4 kilobases of the 5' side of the 3' long terminal repeat sequence. The open reading frame predicts the primary translation product has a total of 409 amino acids with a Mr of 44,752. Comparisons of the deduced amino acid sequence of gp55 with those of the envelope (env) gene products of murine leukemia viruses (MuLVs) revealed that gp55 is composed of three distinct regions. The amino-terminal 80% of the molecule has a high degree of sequence homology with the amino-terminal portion of the gp70 of the Moloney mink cell focus-forming virus (BALB/Mo-MCFV). This portion of the BALB/Mo-MCFV gp70 is known to be coded for by the acquired xenotropic env-like sequence. The sequence of the following 66 amino acids of gp55 is highly homologous to that of the middle portion of the p15E of Moloney MuLV (Mo-MuLV). The sequence of the carboxyl-terminal 12 amino acids is specific to gp55, and a comparison of the nucleotide sequence showed that this specific amino acid sequence is due to the presence of seven extra nucleotides compared with the sequence of the Mo-MuLV. PMID:6306650

  17. Nucleotide sequence and genetic organization of the Bacillus subtilis comG operon.

    PubMed Central

    Albano, M; Breitling, R; Dubnau, D A

    1989-01-01

    A series of Tn917lac insertions define the comG region of the Bacillus subtilis chromosome. comG mutants are deficient in competence and specifically in the binding of exogenous DNA. The genes included in the comG region are first expressed during the transition from the exponential to the stationary growth phase. From nucleotide sequence information, it was concluded that the comG locus contains seven open reading frames (ORFs), several of which overlap at their termini. High-resolution S1 nuclease mapping and primer extension were used to identify the 5' terminus of the comG mRNA. The sequence upstream from the comG start site closely resembled the consensus recognition sequence for the major B. subtilis vegetative RNA polymerase holoenzyme. Complementation analysis confirmed that the comG ORF1 protein is required for the ability of competent cultures to resolve into two populations with different cell densities on Renografin (E. R. Squibb & Sons, Princeton, N.J.) gradients, as well as for full expression of comE, another late competence locus. The predicted comG ORF1 protein showed significant similarity to the virB ORF11 protein from Agrobacterium tumefaciens, which is probably involved in T-DNA transfer. The N-terminal sequences of comG ORF3 and, to a lesser extent, the comG ORF4 and ORF5 proteins were similar to a class of pilin proteins from members of the genera Bacteroides, Pseudomonas, Neisseria, and Moraxella. All of the comG proteins except comG ORF1 possessed hydrophobic domains that were potentially capable of spanning the bacterial membrane. It is likely that these proteins are membrane associated, and they may comprise part of the DNA transport machinery. When present in multiple copies, a DNA fragment carrying the comG promoter was capable of inhibiting the development of competence as well as the expression of several late com genes, suggesting a role for a transcriptional activator in the expression of those genes. Images PMID:2507524

  18. Proteus mirabilis MR/P fimbrial operon: genetic organization, nucleotide sequence, and conditions for expression.

    PubMed Central

    Bahrani, F K; Mobley, H L

    1994-01-01

    Proteus mirabilis, an agent of urinary tract infection, expresses at least four fimbrial types. Among these are the MR/P (mannose-resistant/Proteus-like) fimbriae. MrpA, the structural subunit, is optimally expressed at 37 °C in Luria broth cultured statically for 48 h by each of seven strains examined. Genes encoding this fimbria were isolated, and the complete nucleotide sequence was determined. The mrp gene cluster encoded by 7,293 bp predicts eight polypeptides: MrpI (22,133 Da), MrpA (17,909 Da), MrpB (19,632 Da), MrpC (96,823 Da), MrpD (27,886 Da), MrpE (19,470 Da), MrpF (17,363 Da), and MrpG (13,169 Da). mrpI is upstream of the gene encoding the major structural subunit gene mrpA and is transcribed in the direction opposite to that of the rest of the operon. All predicted polypeptides share ≥25% amino acid identity with at least one other enteric fimbrial gene product encoded by the pap, fim, smf, fan, or mrk gene clusters. Images PMID:7910820

  19. Pediococcus acidilactici ldhD gene: cloning, nucleotide sequence, and transcriptional analysis.

    PubMed Central

    Garmyn, D; Ferain, T; Bernard, N; Hols, P; Delplace, B; Delcour, J

    1995-01-01

    The gene encoding D-lactate dehydrogenase was isolated on a 2.9-kb insert from a library of Pediococcus acidilactici DNA by complementation for growth under anaerobiosis of an Escherichia coli lactate dehydrogenase and pyruvate-formate lyase double mutant. The nucleotide sequence of ldhD encodes a protein of 331 amino acids (predicted molecular mass of 37,210 Da) which shows similarity to the family of D-2-hydroxyacid dehydrogenases. The enzyme encoded by the cloned fragment is equally active on pyruvate and hydroxypyruvate, indicating that the enzyme has both D-lactate and D-glycerate dehydrogenase activities. Three other open reading frames were found in the 2.9-kb insert, one of which (rpsB) is highly similar to bacterial genes coding for ribosomal protein S2. Northern (RNA) blotting analyses indicated the presence of a 2-kb dicistronic transcript of ldhD (a metabolic gene) and rpsB (a putative ribosomal protein gene) together with a 1-kb monocistronic rpsB mRNA. These transcripts are abundant in the early phase of exponential growth but steadily fade away to disappear in the stationary phase. Primer extension analysis identified two distinct promoters driving either cotranscription of ldhD and rpsB or transcription of rpsB alone. PMID:7539419

  20. Whole-genome sequencing identifies genomic heterogeneity at a nucleotide and chromosomal level in bladder cancer

    PubMed Central

    Morrison, Carl D.; Liu, Pengyuan; Woloszynska-Read, Anna; Zhang, Jianmin; Luo, Wei; Qin, Maochun; Bshara, Wiam; Conroy, Jeffrey M.; Sabatini, Linda; Vedell, Peter; Xiong, Donghai; Liu, Song; Wang, Jianmin; Shen, He; Li, Yinwei; Omilian, Angela R.; Hill, Annette; Head, Karen; Guru, Khurshid; Kunnev, Dimiter; Leach, Robert; Eng, Kevin H.; Darlak, Christopher; Hoeflich, Christopher; Veeranki, Srividya; Glenn, Sean; You, Ming; Pruitt, Steven C.; Johnson, Candace S.; Trump, Donald L.

    2014-01-01

    Using complete genome analysis, we sequenced five bladder tumors accrued from patients with muscle-invasive transitional cell carcinoma of the urinary bladder (TCC-UB) and identified a spectrum of genomic aberrations. In three tumors, complex genotype changes were noted. All three had tumor protein p53 mutations and a relatively large number of single-nucleotide variants (SNVs; average of 11.2 per megabase), structural variants (SVs; average of 46), or both. This group was best characterized by chromothripsis and the presence of subclonal populations of neoplastic cells or intratumoral mutational heterogeneity. Here, we provide evidence that the process of chromothripsis in TCC-UB is mediated by nonhomologous end-joining using kilobase, rather than megabase, fragments of DNA, which we refer to as “stitchers,” to repair this process. We postulate that a potential unifying theme among tumors with the more complex genotype group is a defective replication–licensing complex. A second group (two bladder tumors) had no chromothripsis and a simpler genotype, with WT tumor protein p53, relatively few SNVs (average of 5.9 per megabase), and only a single SV. There was no evidence of a subclonal population of neoplastic cells. In this group, we used a preclinical model of bladder carcinoma cell lines to study a unique SV (translocation and amplification) of the gene glutamate receptor ionotropic N-methyl D-aspartate as a potential new therapeutic target in bladder cancer. PMID:24469795

  1. New features in the genus Ilarvirus revealed by the nucleotide sequence of Fragaria chiloensis latent virus.

    PubMed

    Tzanetakis, Ioannis E; Martin, Robert R

    2005-09-01

    Fragaria chiloensis latent virus (FClLV), a member of the genus Ilarvirus, was first identified in the early 1990s. Double-stranded RNA was extracted from FClLV-infected plants and cloned. The complete nucleotide sequence of the virus has been elucidated. RNA 1 encodes a protein with methyltransferase and helicase enzymatic motifs, while RNA 2 encodes the viral RNA-dependent RNA polymerase and an ORF that shares no homology with other Ilarvirus genes. RNA 3 codes for movement and coat proteins and an additional ORF, making FClLV possibly the first Ilarvirus encoding a third protein in RNA 3. Phylogenetic analysis reveals that FClLV is most closely related to Prune dwarf virus, the type member of subgroup 4 of the Ilarvirus genus. FClLV is also closely related to Alfalfa mosaic virus (AlMV), a virus that shares many properties with Ilarviruses. We propose the reclassification of AlMV as a member of the Ilarvirus genus instead of being a member of a distinct genus. PMID:15878214

  2. Postzygotic single-nucleotide mosaicisms in whole-genome sequences of clinically unremarkable individuals

    PubMed Central

    Huang, August Y; Xu, Xiaojing; Ye, Adam Y; Wu, Qixi; Yan, Linlin; Zhao, Boxun; Yang, Xiaoxu; He, Yao; Wang, Sheng; Zhang, Zheng; Gu, Bowen; Zhao, Han-Qing; Wang, Meng; Gao, Hua; Gao, Ge; Zhang, Zhichao; Yang, Xiaoling; Wu, Xiru; Zhang, Yuehua; Wei, Liping

    2014-01-01

    Postzygotic single-nucleotide mutations (pSNMs) have been studied in cancer and a few other overgrowth human disorders at whole-genome scale and found to play critical roles. However, in clinically unremarkable individuals, pSNMs have never been identified at whole-genome scale largely due to technical difficulties and lack of matched control tissue samples, and thus the genome-wide characteristics of pSNMs remain unknown. We developed a new Bayesian-based mosaic genotyper and a series of effective error filters, using which we were able to identify 17 SNM sites from ∼80× whole-genome sequencing of peripheral blood DNAs from three clinically unremarkable adults. The pSNMs were thoroughly validated using pyrosequencing, Sanger sequencing of individual cloned fragments, and multiplex ligation-dependent probe amplification. The mutant allele fraction ranged from 5%-31%. We found that C→T and C→A were the predominant types of postzygotic mutations, similar to the somatic mutation profile in tumor tissues. Simulation data showed that the overall mutation rate was an order of magnitude lower than that in cancer. We detected varied allele fractions of the pSNMs among multiple samples obtained from the same individuals, including blood, saliva, hair follicle, buccal mucosa, urine, and semen samples, indicating that pSNMs could affect multiple sources of somatic cells as well as germ cells. Two of the adults have children who were diagnosed with Dravet syndrome. We identified two non-synonymous pSNMs in SCN1A, a causal gene for Dravet syndrome, from these two unrelated adults and found that the mutant alleles were transmitted to their children, highlighting the clinical importance of detecting pSNMs in genetic counseling. PMID:25312340

  3. Nucleotide sequences and genetic analysis of hydrogen oxidation (hox) genes in Azotobacter vinelandii.

    PubMed Central

    Menon, A L; Mortenson, L E; Robson, R L

    1992-01-01

    Azotobacter vinelandii contains a heterodimeric, membrane-bound [NiFe]hydrogenase capable of catalyzing the reversible oxidation of H2. The beta and alpha subunits of the enzyme are encoded by the structural genes hoxK and hoxG, respectively, which appear to form part of an operon that contains at least one further potential gene (open reading frame 3 [ORF3]). In this study, determination of the nucleotide sequence of a region of 2,344 bp downstream of ORF3 revealed four additional closely spaced or overlapping ORFs. These ORFs, ORF4 through ORF7, potentially encode polypeptides with predicted masses of 22.8, 11.4, 16.3, and 31 kDa, respectively. Mutagenesis of the chromosome of A. vinelandii in the area sequenced was carried out by introduction of antibiotic resistance gene cassettes. Disruption of hoxK and hoxG by a kanamycin resistance gene abolished whole-cell hydrogenase activity coupled to O2 and led to loss of the hydrogenase alpha subunit. Insertional mutagenesis of ORF3 through ORF7 with a promoterless lacZ-Kmr cassette established that the region is transcriptionally active and involved in H2 oxidation. We propose to call ORF3 through ORF7 hoxZ, hoxM, hoxL, hoxO, and hoxQ, respectively. The predicted hox gene products resemble those encoded by genes from hydrogenase-related operons in other bacteria, including Escherichia coli and Alcaligenes eutrophus. Images PMID:1624446

  4. Nucleotide sequence of the fadR gene, a multifunctional regulator of fatty acid metabolism in Escherichia coli.

    PubMed Central

    DiRusso, C C

    1988-01-01

    The Escherichia coli fadR gene is a multifunctional regulator of fatty acid and acetate metabolism. In the present work the nucleotide sequence of the 1.3 kb DNA fragment which encodes FadR has been determined. The coding sequence of the fadR gene is 714 nucleotides long and is preceded by a typical E. coli ribosome binding site and is followed by a sequence predicted to be sufficient for factor-independent chain termination. Primer extension experiments demonstrated that the transcription of the fadR gene initiates with an adenine nucleotide 33 nucleotides upstream from the predicted start of translation. The derived fadR peptide has a calculated molecular weight of 26,972. This is in reasonable agreement with the apparent molecular weight of 29,000 previously estimated on the basis of maxi-cell analysis of plasmid encoded proteins. There is a segment of twenty amino acids within the predicted peptide which resembles the DNA recognition and binding site of many transcriptional regulatory proteins. Images PMID:2843809
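
    As a quick plausibility check on the figures in this abstract, the coding length and the calculated molecular weight can be related by simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the abstract does not say whether the 714 nucleotides include the stop codon, so both cases are computed. The result can be compared with the commonly used rule of thumb of roughly 110 Da per amino acid residue.

```python
def codon_count(coding_length_nt):
    """Number of complete codons in a coding sequence of the given length."""
    if coding_length_nt % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return coding_length_nt // 3


def mean_residue_mass(molecular_weight_da, residues):
    """Average mass per residue implied by a molecular weight and a length."""
    return molecular_weight_da / residues


codons = codon_count(714)  # coding length quoted in the abstract

# Case 1 (assumption): all codons encode amino acids.
print(codons, round(mean_residue_mass(26972, codons), 1))

# Case 2 (assumption): the last codon is the stop codon.
print(codons - 1, round(mean_residue_mass(26972, codons - 1), 1))
```

    Either assumption gives an average a little above 110 Da per residue, so the quoted coding length and calculated molecular weight are mutually consistent.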

  5. T box transcription antitermination riboswitch: Influence of nucleotide sequence and orientation on tRNA binding by the antiterminator element

    PubMed Central

    Fauzi, Hamid; Agyeman, Akwasi; Hines, Jennifer V.

    2008-01-01

    Many bacteria utilize riboswitch transcription regulation to monitor and appropriately respond to cellular levels of important metabolites or effector molecules. The T box transcription antitermination riboswitch responds to cognate uncharged tRNA by specifically stabilizing an antiterminator element in the 5′-untranslated mRNA leader region and precluding formation of a thermodynamically more stable terminator element. Stabilization occurs when the tRNA acceptor end base pairs with the first four nucleotides in the seven nucleotide bulge of the highly conserved antiterminator element. The significance of the conservation of the antiterminator bulge nucleotides that do not base pair with the tRNA is unknown, but they are required for optimal function. In vitro selection was used to determine if the isolated antiterminator bulge context alone dictates the mode in which the tRNA acceptor end binds the bulge nucleotides. No sequence conservation beyond complementarity was observed and the location was not constrained to the first four bases of the bulge. The results indicate that formation of a structure that recognizes the tRNA acceptor end in isolation is not the determinant driving force for the high phylogenetic sequence conservation observed within the antiterminator bulge. Additional factors or T box leader features more likely influenced the phylogenetic sequence conservation. PMID:19152843

  6. Sequence evaluation of four pooled-tissue normalized bovine cDNA libraries and construction of a gene index for cattle.

    PubMed

    Smith, T P; Grosse, W M; Freking, B A; Roberts, A J; Stone, R T; Casas, E; Wray, J E; White, J; Cho, J; Fahrenkrug, S C; Bennett, G L; Heaton, M P; Laegreid, W W; Rohrer, G A; Chitko-McKown, C G; Pertea, G; Holt, I; Karamycheva, S; Liang, F; Quackenbush, J; Keele, J W

    2001-04-01

    An essential component of functional genomics studies is the sequence of DNA expressed in tissues of interest. To provide a resource of bovine-specific expressed sequence data and facilitate this powerful approach in cattle research, four normalized cDNA libraries were produced and arrayed for high-throughput sequencing. The libraries were made with RNA pooled from multiple tissues to increase efficiency of normalization and maximize the number of independent genes for which sequence data were obtained. Target tissues included those with highest likelihood to have impact on production parameters of animal health, growth, reproductive efficiency, and carcass merit. Success of normalization and inter- and intralibrary redundancy were assessed by collecting 6000-23,000 sequences from each of the libraries (68,520 total sequences deposited in GenBank). Sequence comparison and assembly of these sequences was performed in combination with 56,500 other bovine EST sequences present in the GenBank dbEST database to construct a cattle Gene Index (available from The Institute for Genomic Research at http://www.tigr.org/tdb/tgi.shtml). The 124,381 bovine ESTs present in GenBank at the time of the analysis form 16,740 assemblies that are listed and annotated on the Web site. Analysis of individual library sequence data indicates that the pooled-tissue approach was highly effective in preparing libraries for efficient deep sequencing. PMID:11282978

  7. Nucleotide Sequence Evolution at the κ-Casein Locus: Evidence for Positive Selection within the Family Bovidae

    PubMed Central

    Ward, T. J.; Honeycutt, R. L.; Derr, J. N.

    1997-01-01

    κ-Casein is a mammalian milk protein involved in a number of important physiological processes. In the gut, the ingested protein is split into an insoluble peptide (para κ-casein) and a soluble hydrophilic glycopeptide (caseinomacropeptide). Caseinomacropeptide is responsible for increased efficiency of digestion, prevention of neonate hypersensitivity to ingested proteins, and inhibition of gastric pathogens. Variation within this peptide has significant effects associated with important traits such as milk production. The nucleotide sequences for regions of κ-casein exon and intron four were determined for representatives of the artiodactyl family Bovidae. The pattern of nucleotide substitution in κ-casein sequences for distantly related bovid taxa demonstrates that positive selection has accelerated their divergence at the amino acid sequence level. This selection has differentially influenced the molecular evolution of the two κ-casein split peptides and is focused within a 34-codon region of caseinomacropeptide. PMID:9409842

  8. Nucleotide sequence analysis of Aleutian mink disease parvovirus shows that multiple virus types are present in infected mink.

    PubMed Central

    Gottschalck, E; Alexandersen, S; Cohn, A; Poulsen, L A; Bloom, M E; Aasted, B

    1991-01-01

    Different isolates of Aleutian mink disease parvovirus (ADV) were cloned and nucleotide sequenced. Analysis of individual clones from two in vivo-derived isolates of high virulence indicated that more than one type of ADV DNA was present in each of these isolates. Analysis of several clones from two preparations of a cell culture-adapted isolate of low virulence showed the presence of only one type of ADV DNA. We also describe the nucleotide sequence from map units 44 to 88 of a new type of ADV DNA. The new type of ADV DNA is compared with the previously published ADV sequences, to which it shows 95% homology. These findings indicate that ADV, a single-stranded DNA virus, has a considerable degree of variability and that several virus types can be present simultaneously in an infected animal. PMID:1649336

  9. Sequencing analysis of 20,000 full-length cDNA clones from cassava reveals lineage specific expansions in gene families related to stress response

    PubMed Central

    Sakurai, Tetsuya; Plata, Germán; Rodríguez-Zapata, Fausto; Seki, Motoaki; Salcedo, Andrés; Toyoda, Atsushi; Ishiwata, Atsushi; Tohme, Joe; Sakaki, Yoshiyuki; Shinozaki, Kazuo; Ishitani, Manabu

    2007-01-01

    Background: Cassava, an allotetraploid known for its remarkable tolerance to abiotic stresses, is an important source of energy for humans and animals and a raw material for many industrial processes. A full-length cDNA library of cassava plants under normal, heat, drought, aluminum and post harvest physiological deterioration conditions was built; 19968 clones were sequence-characterized using expressed sequence tags (ESTs). Results: The ESTs were assembled into 6355 contigs and 9026 singletons that were further grouped into 10577 scaffolds; we found 4621 new cassava sequences and 1521 sequences with no significant similarity to plant protein databases. Transcripts of 7796 distinct genes were captured and we were able to assign a functional classification to 78% of them while finding more than half of the enzymes annotated in metabolic pathways in Arabidopsis. The annotation of sequences that were not paired to transcripts of other species included many stress-related functional categories showing that our library is enriched with stress-induced genes. Finally, we detected 230 putative gene duplications that include key enzymes in reactive oxygen species signaling pathways and could play a role in cassava stress response features. Conclusion: The cassava full-length cDNA library here presented contains transcripts of genes involved in stress response as well as genes important for different areas of cassava research. This library will be an important resource for gene discovery, characterization and cloning; in the near future it will aid the annotation of the cassava genome. PMID:18096061

  10. Cloning and sequencing of the cDNA encoding a core protein of the paired helical filament of Alzheimer's disease: Identification as the microtubule-associated protein tau

    SciTech Connect

    Goedert, M.; Wischik, C.M.; Crowther, R.A.; Walker, J.E.; Klug, A.

    1988-06-01

    Screening of cDNA libraries prepared from the frontal cortex of an Alzheimer's disease patient and from fetal human brain has led to isolation of the cDNA for a core protein of the paired helical filament of Alzheimer's disease. The partial amino acid sequence of this core protein was used to design synthetic oligonucleotide probes. The cDNA encodes a protein of 352 amino acids that contains a characteristic amino acid repeat in its carboxyl-terminal half. This protein is highly homologous to the sequence of the mouse microtubule-associated protein tau and thus constitutes the human equivalent of mouse tau. RNA blot analysis indicates the presence of two major transcripts, 6 and 2 kilobases long, with a wide distribution in normal human brain. Tau protein mRNAs were found in normal amounts in the frontal cortex from patients with Alzheimer's disease. The proof that at least part of tau protein forms a component of the paired helical filament core opens the way to understanding the mode of formation of paired helical filaments and thus, ultimately, the pathogenesis of Alzheimer's disease.

  11. Interferon-induced 56,000 Mr protein and its mRNA in human cells: molecular cloning and partial sequence of the cDNA.

    PubMed Central

    Chebath, J; Merlin, G; Metz, R; Benech, P; Revel, M

    1983-01-01

    Treatment of responsive cells by interferons (IFNs) induces within a few hours a rise in the concentration of several proteins and mRNAs. In order to characterize these IFN-induced mRNA species, we have cloned in E. coli the cDNA made from a 17-18S poly(A)+ RNA of human fibroblastoid cells (SV80) treated with IFN-beta. We describe here a pBR322 recombinant plasmid (C56) which contains a 400 bp cDNA insert corresponding to a 18S mRNA species newly induced by IFN. The C56 mRNA codes for a 56,000 dalton protein easily detectable by hybridization-translation experiments. The sequence of 66 of the carboxy-terminal amino-acids of the protein can be deduced from the cDNA sequence. IFNs-alpha, beta or gamma are able to activate the expression of this gene in human fibroblasts as well as lymphoblastoid cells. The mRNA is not detectable without IFN; it reaches maximum levels (0.1% of the total poly(A)+ RNA) within 4-8 hrs and decreases after 16 hrs. Images PMID:6186990

  12. Cloning, Nucleotide Sequencing and Bioinformatics Study of NcSRS2 Gene, an Immunogen from Iranian Isolate of Neospora caninum

    PubMed Central

    Soltani, M; Sadrebazzaz, A; Nassiri, M; Tahmoorespoor, M

    2013-01-01

    Background: Neosporosis is caused by the obligate intracellular parasitic protozoan Neospora caninum, which infects a variety of hosts. NcSRS2 is an immunodominant antigen of N. caninum that is considered one of the most promising targets for a recombinant or DNA vaccine against neosporosis. As no study has been carried out to identify the molecular structure of N. caninum in Iran, as a first step we set out to identify this gene in the parasite in Iran. Methods: Tachyzoite total RNA was extracted, cDNA was synthesized, and the NcSRS2 gene was amplified using the cDNA as template. The PCR product was then cloned into the pTZ57R/T vector and transformed into E. coli (DH5α strain). Finally, the recombinant plasmid was extracted from transformed E. coli and sequenced. Bioinformatics analysis was also carried out. Results: The PCR product of the NcSRS2 gene was sequenced and recorded in GenBank. The deduced amino acid sequence of NcSRS2 in the current study was compared with other N. caninum NcSRS2 sequences and showed some identities and differences. Conclusion: The NcSRS2 gene of N. caninum was successfully cloned in pTZ57R/T. The recombinant plasmid was confirmed by sequencing, colony PCR and enzymatic digestion. It is ready to express recombinant protein for further studies. PMID:23682269

  13. Nucleotide sequences of cDNAs for human papillomavirus type 18 transcripts in HeLa cells

    SciTech Connect

    Inagaki, Yutaka; Tsunokawa, Youko; Takebe, Naoko; Terada, Masaaki; Sugimura, Takashi; Nawa, Hiroyuki; Nakanishi, Shigetada

    1988-05-01

    HeLa cells expressed 3.4- and 1.6-kilobase (kb) transcripts of the integrated human papillomavirus (HPV) type 18 genome. Two types of cDNA clones representing each size of HPV type 18 transcript were isolated. Sequence analysis of these two types of cDNA clones revealed that the 3.4-kb transcript contained E6, E7, the 5′ portion of E1, and human sequence and that the 1.6-kb transcript contained spliced and frameshifted E6 (E6*), E7, and human sequence. There was a common human sequence containing a poly(A) addition signal in the 3′ end portions of both transcripts, indicating that they were transcribed from the HPV genome at the same integration site with different splicing. Furthermore, the 1.6-kb transcript contained both of the two viral TATA boxes upstream of E6, strongly indicating that a cellular promoter was used for its transcription.

  14. Complete Nucleotide Sequence and Genome Organization of Hibiscus Chlorotic Ringspot Virus, a New Member of the Genus Carmovirus: Evidence for the Presence and Expression of Two Novel Open Reading Frames

    PubMed Central

    Huang, Mei; Koh, Dora Chin-Yen; Weng, Li-Juan; Chang, Min-Li; Yap, Yun-Kiam; Zhang, Lee; Wong, Sek-Man

    2000-01-01

    The complete nucleotide sequence of hibiscus chlorotic ringspot virus (HCRSV) was determined. The genomic RNA (gRNA) is 3,911 nucleotides long and has the potential to encode seven viral proteins in the order of 28 (p28), 23 (p23), 81 (p81), 8 (p8), 9 (p9), 38 (p38), and 25 (p25) kDa. Excluding two unique open reading frames (ORFs) encoding p23 and p25, the ORFs encode proteins with high amino acid similarity to those of carmoviruses. In addition to gRNA, two 3′-coterminated subgenomic RNA (sgRNA) species were identified. Full-length cDNA clones derived from gRNA and sgRNA were constructed under the control of a T7 promoter. Both capped and uncapped transcripts derived from the full-length genomic cDNA clone were infectious. In vitro translation and mutagenesis assays confirmed that all the predicted ORFs except the ORF encoding p8 are translatable, and the two novel ORFs (those encoding p23 and p25) may be functionally indispensable for the viral infection cycle. Based on virion morphology and genome organization, we propose that HCRSV be classified as a new member of the genus Carmovirus in family Tombusviridae. PMID:10708431

  15. Nucleotide sequence of a chickpea chlorotic stunt virus relative that infects pea and faba bean in China.

    PubMed

    Zhou, Cui-Ji; Xiang, Hai-Ying; Zhuo, Tao; Li, Da-Wei; Yu, Jia-Lin; Han, Cheng-Gui

    2012-07-01

    We determined the genome sequence of a new polerovirus that infects field pea and faba bean in China. Its entire nucleotide sequence (6021 nt) was most closely related (83.3% identity) to that of an Ethiopian isolate of chickpea chlorotic stunt virus (CpCSV-Eth). With the exception of the coat protein (encoded by ORF3), amino acid sequence identities of all gene products of this virus to those of CpCSV-Eth and other poleroviruses were <90%. This suggests that it is a new member of the genus Polerovirus, and the name pea mild chlorosis virus is proposed. PMID:22476900

  16. Nucleotide sequence of a cluster of early and late genes in a conserved segment of the vaccinia virus genome.

    PubMed Central

    Plucienniczak, A; Schroeder, E; Zettlmeissl, G; Streeck, R E

    1985-01-01

    The nucleotide sequence of a 7.6 kb vaccinia DNA segment from a genomic region conserved among different orthopox viruses has been determined. This segment contains a tight cluster of 12 partly overlapping open reading frames, most of which can be correlated with previously identified early and late proteins and mRNAs. Regulatory signals used by vaccinia virus have been studied. Presumptive promoter regions are rich in A and T and carry the consensus sequences TATA and AATAA spaced at 20-24 base pairs. Tandem repeats of a CTATTC consensus sequence are proposed to be involved in the termination of early transcription. PMID:2987815

  17. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51. Copies of WIPO... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid...

  18. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51. Copies of WIPO... 37 Patents, Trademarks, and Copyrights 1 2011-07-01 2011-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid...

  19. Sequence-Specific Incorporation of Enzyme-Nucleotide Chimera by DNA Polymerases.

    PubMed

    Welter, Moritz; Verga, Daniela; Marx, Andreas

    2016-08-16

    DNA polymerases select the right nucleotide for the growing polynucleotide chain based on the shape and geometry of the nascent nucleotide pairs and thereby ensure high DNA replication selectivity. High-fidelity DNA polymerases are believed to possess tight active sites that allow little deviation from the canonical structures. However, DNA polymerases are known to use nucleotides with small modifications as substrates, which is key for numerous core biotechnology applications. We show that even high-fidelity DNA polymerases are capable of efficiently using nucleotide chimera modified with a large protein like horseradish peroxidase as substrates for template-dependent DNA synthesis, despite this "cargo" being more than 100-fold larger than the natural substrates. We exploited this capability for the development of systems that enable naked-eye detection of DNA and RNA at single nucleotide resolution. PMID:27392211

  20. Nucleotide Sequence and Genetic Structure of a Novel Carbaryl Hydrolase Gene (cehA) from Rhizobium sp. Strain AC100

    PubMed Central

    Hashimoto, Masayuki; Fukui, Mitsuru; Hayano, Kouichi; Hayatsu, Masahito

    2002-01-01

    Rhizobium sp. strain AC100, which is capable of degrading carbaryl (1-naphthyl-N-methylcarbamate), was isolated from soil treated with carbaryl. This bacterium hydrolyzed carbaryl to 1-naphthol and methylamine. Carbaryl hydrolase from the strain was purified to homogeneity, and its N-terminal sequence, molecular mass (82 kDa), and enzymatic properties were determined. The purified enzyme hydrolyzed 1-naphthyl acetate and 4-nitrophenyl acetate, indicating that the enzyme is an esterase. We then cloned the carbaryl hydrolase gene (cehA) from the plasmid DNA of the strain and determined the nucleotide sequence of the 10-kb region containing cehA. No homologous sequences were found by a database homology search using the nucleotide and deduced amino acid sequences of the cehA gene. Six open reading frames including the cehA gene were found in the 10-kb region, and sequencing analysis showed that the cehA gene is flanked by two copies of an insertion sequence-like sequence, suggesting that it forms part of a composite transposon. PMID:11872471

  1. DNA sequencing by synthesis using 3′-O-azidomethyl nucleotide reversible terminators and surface-enhanced Raman spectroscopic detection

    PubMed Central

    Palla, Mirkó; Guo, Wenjing; Shi, Shundi; Li, Zengmin; Wu, Jian; Jockusch, Steffen; Guo, Cheng; Russo, James J.; Turro, Nicholas J.; Ju, Jingyue

    2014-01-01

    As an alternative to fluorescence-based DNA sequencing by synthesis (SBS), we report here an approach using an azido moiety (N3) that has an intense, narrow and unique Raman shift at 2125 cm−1, where virtually all biological molecules are transparent, as a label for SBS. We first demonstrated that the four 3′-O-azidomethyl nucleotide reversible terminators (3′-O-azidomethyl-dNTPs) displayed surface enhanced Raman scattering (SERS) at 2125 cm−1. Using these 4 nucleotide analogues as substrates, we then performed a complete 4-step SBS reaction. We used SERS to monitor the appearance of the azide-specific Raman peak at 2125 cm−1 as a result of polymerase extension by a single 3′-O-azidomethyl-dNTP into the growing DNA strand and disappearance of this Raman peak with cleavage of the azido label to permit the next nucleotide incorporation, thereby continuously determining the DNA sequence. Due to the small size of the azido label, the 3′-O-azidomethyl-dNTPs are efficient substrates for the DNA polymerase. In the SBS cycles, the natural nucleotides are restored after each incorporation and cleavage, producing a growing DNA strand that bears no modifications and will not impede further polymerase reactions. Thus, with further improvements in SERS for the azido moiety, this approach has the potential to provide an attractive alternative to fluorescence-based SBS. PMID:25396047

  2. Estimates of Gene Flow in Drosophila Pseudoobscura Determined from Nucleotide Sequence Analysis of the Alcohol Dehydrogenase Region

    PubMed Central

    Schaeffer, S. W.; Miller, E. L.

    1992-01-01

    The genetic structure of Drosophila pseudoobscura populations was inferred from a nucleotide sequence analysis of a 3.4-kb segment of the alcohol dehydrogenase (Adh) region. A total of 99 isochromosomal strains collected from 13 populations in North and South America were used to determine if any population departed from a neutral model and to estimate levels of gene flow between populations. This study also included the nucleotide sequences from two sibling species, D. persimilis and D. miranda. We estimated the neutral mutation parameter, 4Nμ, in synonymous and noncoding sites for 17 subregions of Adh in each of nine populations with sample sizes greater than three. The nucleotide diversity data in the nine populations was tested for departures from an equilibrium neutral model with two statistical tests. The Tajima and the Hudson, Kreitman, Aguade tests showed that each population fails to reject a neutral model. Tests for genetic differentiation between populations fail to show any population substructure among the North American populations of D. pseudoobscura. The nucleotide diversity data is consistent with direct and indirect measures of gene flow that show extensive dispersal between populations of D. pseudoobscura. PMID:1427038
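
    The abstract above refers to nucleotide diversity, the neutral mutation parameter 4Nμ, and the Tajima test. The sketch below computes the number of segregating sites, the mean pairwise difference, Watterson's estimator of 4Nμ, and Tajima's D from a small alignment using the standard published formulas. Which estimator the authors applied to each subregion is not stated in the abstract, so treating Watterson's estimator as the 4Nμ estimate is an assumption; the function name and toy sequences are hypothetical.

```python
from itertools import combinations
from math import sqrt


def diversity_stats(seqs):
    """Return (S, pi, theta_W, D) for equal-length aligned sequences.
    pi and theta_W are per locus (not per site); D is None when undefined."""
    n = len(seqs)
    if n < 2 or len({len(s) for s in seqs}) != 1:
        raise ValueError("need at least two aligned sequences of equal length")
    # S: alignment columns showing more than one distinct base
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    # pi: mean number of differences over all sequence pairs
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    theta_w = seg_sites / a1
    # The variance term is zero for n < 4, so D is not defined there
    if seg_sites == 0 or n < 4:
        return seg_sites, pi, theta_w, None
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    d = (pi - theta_w) / sqrt(e1 * seg_sites + e2 * seg_sites * (seg_sites - 1))
    return seg_sites, pi, theta_w, d


# Toy alignment (hypothetical, not Adh data)
toy_alignment = ["ATGCATGC", "ATGCATGA", "ATGAATGC", "ATGCATGC"]
S, pi, theta_w, D = diversity_stats(toy_alignment)
print(S, pi, round(theta_w, 4), round(D, 4))
```

    A D value near zero is what a neutral equilibrium model predicts; judging significance requires comparison against the expected distribution of D, which this sketch does not attempt.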

  3. Characterization and Nucleotide Sequence of the Cryptic Cel Operon of Escherichia Coli K12

    PubMed Central

    Parker, L. L.; Hall, B. G.

    1990-01-01

    Wild-type Escherichia coli are not able to utilize β-glucoside sugars because the genes for utilization of these sugars are cryptic. Spontaneous mutations in the cel operon allow its expression and enable the organism to ferment cellobiose, arbutin and salicin. In this report we describe the structure and nucleotide sequence of the cel operon. The cel operon consists of five genes: celA, whose function is unknown; celB and celC which encode phosphoenolpyruvate-dependent phosphotransferase system enzyme II(cel) and enzyme III(cel), respectively, for the transport and phosphorylation of β-glucoside sugars; celD, which encodes a negative regulatory protein; and celF, which encodes a phospho-β-glucosidase that acts on phosphorylated cellobiose, arbutin and salicin. The mutationally activated cel operon is induced in the presence of its substrates, and is repressed in their absence. A comparison of proteins encoded by the cel operon with functionally equivalent proteins of the bgl operon, another cryptic E. coli gene system responsible for the catabolism of β-glucoside sugars, revealed no significant homology between these two systems despite common functional characteristics. The celD and celF encoded repressor and phospho-β-glucosidase proteins are homologous to the melibiose regulatory protein and to the melA encoded α-galactosidase of E. coli, respectively. Furthermore, the celC encoded PEP-dependent phosphotransferase system enzyme III(cel) is strikingly homologous to an enzyme III(lac) of the Gram-positive organism Staphylococcus aureus. We conclude that the genes for these two enzyme IIIs diverged much more recently than did their hosts, indicating that E. coli and S. aureus have undergone relatively recent exchange of chromosomal genes. PMID:2179047

  4. Nucleotide sequence and mutational analysis of the vnfENX region of Azotobacter vinelandii.

    PubMed Central

    Wolfinger, E D; Bishop, P E

    1991-01-01

    The nucleotide sequence (3,600 bp) of a second copy of nifENX-like genes in Azotobacter vinelandii has been determined. These genes are located immediately downstream from vnfA and have been designated vnfENX. The vnfENX genes appear to be organized as a single transcriptional unit that is preceded by a potential RpoN-dependent promoter. While the nifEN genes are thought to be evolutionarily related to nifDK, the vnfEN genes appear to be more closely related to nifEN than to either nifDK, vnfDK, or anfDK. Mutant strains (CA47 and CA48) carrying insertions in vnfE and vnfN, respectively, are able to grow diazotrophically in molybdenum (Mo)-deficient medium containing vanadium (V) (Vnf+) and in medium lacking both Mo and V (Anf+). However, a double mutant (strain DJ42.48) which contains a nifEN deletion and an insertion in vnfE is unable to grow diazotrophically in Mo-sufficient medium or in Mo-deficient medium with or without V. This suggests that NifE and NifN substitute for VnfE and VnfN when the vnfEN genes are mutationally inactivated. AnfA is not required for the expression of a vnfN-lacZ transcriptional fusion, even though this fusion is expressed under Mo- and V-deficient diazotrophic growth conditions. PMID:1938952

  5. Molecular characterization of the body site-specific human epidermal cytokeratin 9: cDNA cloning, amino acid sequence, and tissue specificity of gene expression.

    PubMed

    Langbein, L; Heid, H W; Moll, I; Franke, W W

    1993-12-01

    Differentiation of human plantar and palmar epidermis is characterized by the suprabasal synthesis of a major special intermediate-sized filament (IF) protein, the type I (acidic) cytokeratin 9 (CK 9). Using partial amino acid (aa) sequence information obtained by direct Edman sequencing of peptides resulting from proteolytic digestion of purified CK 9, we synthesized several redundant primers by 'back-translation'. Amplification by polymerase chain reaction (PCR) of cDNAs obtained by reverse transcription of mRNAs from human foot sole epidermis, including 5'-primer extension, resulted in multiple overlapping cDNA clones, from which the complete cDNA (2353 bp) could be constructed. This cDNA encoded the CK 9 polypeptide with a calculated molecular weight of 61,987 and an isoelectric point at about pH 5.0. The aa sequence deduced from cDNA was verified in several parts by comparison with the peptide sequences and showed the typical structure of type I CKs, with a head (153 aa), an alpha-helical coiled-coil-forming rod (306 aa), and a tail (163 aa) domain. The protein displayed the highest homology to human CK 10, not only in the highly conserved rod domain but also in large parts of the head and the tail domains. On the other hand, the aa sequence revealed some remarkable differences from CK 10 and other CKs, even in the most conserved segments of the rod domain. The nuclease digestion pattern seen on Southern blot analysis of human genomic DNA indicated the existence of a unique CK 9 gene. Using CK 9-specific riboprobes for hybridization on Northern blots of RNAs from various epithelia, an mRNA of about 2.4 kb in length could be identified only in foot sole epidermis, and a weaker cross-hybridization signal was seen in RNA from bovine heel pad epidermis at about 2.0 kb. A large number of tissues and cell cultures were examined by PCR of mRNA-derived cDNAs, using CK 9-specific primers. But even with this very sensitive signal amplification, only palmar

  6. A comparison of nucleotide sequences of measles virus L genes derived from wild-type viruses and SSPE brain tissues.

    PubMed

    Komase, K; Rima, B K; Pardowitz, I; Kunz, C; Billeter, M A; ter Meulen, V; Baczko, K

    1995-04-20

    The nucleotide sequences of the large protein (L) gene derived from two wild-type measles viruses (MV) and two SSPE brain-derived viruses have been determined. All sequences have single large open reading frames encoding 2183 amino acid residues. The deduced L proteins are well conserved and the proposed functional domains which have been identified for rhabdo- and paramyxoviruses are completely conserved in all strains. The degree of variability of L proteins is the lowest of all structural proteins of MV, reflecting its role in virus reproduction and persistence. Biased hypermutation was not observed in the L genes derived from SSPE brain tissue. None of the nucleotide changes can be associated with the attenuated phenotype of the Edmonston vaccine viruses. PMID:7747453

  7. Phylogeny of Bipolaris inferred from nucleotide sequences of Brn1, a reductase gene involved in melanin biosynthesis.

    PubMed

    Shimizu, Kiminori; Tanaka, Chihiro; Peng, You-Liang; Tsuda, Mitsuya

    1998-08-01

    The Brn1 reductase melanin biosynthesis gene in the fungal genus Bipolaris was sequenced in 74 strains of 22 species. The Brn1 region was highly conserved among the species examined at the nucleotide and the amino acid levels. To elucidate the phylogenetic relationships among Bipolaris species, trees were inferred from nucleotide sequences of this region. Species in these trees formed exclusive clusters clearly separated from one another, except for B. panici-miliacei and B. setariae, and B. victoriae and B. zeicola. When unidentified strains were added to this tree, they fell within known species or formed independent clusters. These data indicated that the Brn1 gene region was suitable for species-level systematics within the genus. The results also suggest that Bipolaris consists of two or more clades that may reflect teleomorphic connections. PMID:12501419

  8. Nucleotide sequence of the genes encoding the canine herpesvirus gB, gC and gD homologues.

    PubMed

    Limbach, K J; Limbach, M P; Conte, D; Paoletti, E

    1994-08-01

    The nucleotide sequence of the genes encoding the canine herpesvirus (CHV) gB, gC and gD homologues was determined. These genes are predicted to encode polypeptides of 879, 459 and 345 amino acids, respectively. Comparison of the predicted amino acid sequences of CHV gB, gC and gD with the homologous sequences from other herpesviruses indicates that CHV is an alphaherpesvirus, a conclusion that is consistent with the previous classification of this virus according to biological properties. Alignment of the homologous gB, gC and gD amino acid sequences indicates that most of the cysteine residues are conserved, suggesting that these glycoproteins possess similar tertiary structures. The nucleotide sequence of the open reading frame downstream from the CHV gC gene was also determined. The predicted amino acid sequence of this putative polypeptide appears to be homologous to a family of proteins encoded downstream from the gC gene in most, although not all, alphaherpesviruses. PMID:7545942

  9. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms

    PubMed Central

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies have investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared the sequence information obtained with the available donkey draft genome (and the Illumina reads from which it originated) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than the number aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionarily important in equids. Comparing Y-chromosome regions, we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated by combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources. PMID:26151450

  10. Characterization of Newcastle disease virus isolates by reverse transcription PCR coupled to direct nucleotide sequencing and development of sequence database for pathotype prediction and molecular epidemiological analysis.

    PubMed Central

    Seal, B S; King, D J; Bennett, J D

    1995-01-01

    Degenerate oligonucleotide primers were synthesized to amplify nucleotide sequences from portions of the fusion protein and matrix protein genes of Newcastle disease virus (NDV) genomic RNA that could be used diagnostically. These primers were used in a single-tube reverse transcription PCR of NDV genomic RNA coupled to direct nucleotide sequencing of the amplified product to characterize more than 30 NDV isolates. In agreement with previous reports, differences in the fusion protein cleavage sequence that correlated genotypically with virulence among various NDV pathotypes were detected. By using sequences generated from the matrix protein gene coding for the nuclear localization signal, lentogenic viruses were again grouped phylogenetically separate from other pathotypes. These techniques were applied to compare neurotropic velogenic viruses isolated from an outbreak of Newcastle disease in cormorants and turkeys. Cormorant NDV isolates and an NDV isolate from an infected turkey flock in North Dakota had the fusion protein cleavage sequence 109SRGRRQKRFVG119. The R-for-G substitution at position 110 may be unique for the cormorant-type isolates. Although the amino acid sequences from the fusion protein cleavage site were identical, nucleotide sequence data correlate the outbreak in turkeys to a cormorant virus isolate from Minnesota and not to a cormorant virus isolate from Michigan. On the basis of sequence information, the cormorant isolates are virulent viruses related to isolates of psittacine origin, possibly genotypically distinct from other velogenic NDV isolates. These techniques can be used reliably for Newcastle disease epidemiology and for prediction of pathotypes of NDV isolates without traditional live-bird inoculations. PMID:8567895
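
    The cleavage-site notation 109SRGRRQKRFVG119 in the abstract above brackets the residues with the positions of the first and last amino acid. As a small sketch, the code below expands that notation into a position-to-residue map and confirms that position 110 holds an arginine (R), which matches the R-for-G substitution the abstract describes. Reading the flanking numbers this way is an assumption based on that statement, and the function name is hypothetical.

```python
import re


def expand_site(notation):
    """Expand e.g. '109SRGRRQKRFVG119' into {109: 'S', 110: 'R', ...}."""
    m = re.fullmatch(r"(\d+)([A-Z]+)(\d+)", notation)
    if m is None:
        raise ValueError("expected <first position><residues><last position>")
    first, residues, last = int(m.group(1)), m.group(2), int(m.group(3))
    if last - first + 1 != len(residues):
        raise ValueError("residue count does not match the position range")
    return {first + offset: aa for offset, aa in enumerate(residues)}


site = expand_site("109SRGRRQKRFVG119")
# Positions carrying a basic residue (arginine or lysine)
basic_positions = [pos for pos, aa in site.items() if aa in "RK"]
print(site[110], basic_positions)
```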

  12. Complete nucleotide sequence and gene rearrangement of the mitochondrial genome of the bell-ring frog, Buergeria buergeri (family Rhacophoridae).

    PubMed

    Sano, Naomi; Kurabayashi, Atsushi; Fujii, Tamotsu; Yonekawa, Hiromichi; Sumida, Masayuki

    2004-06-01

    In this study we determined the complete nucleotide sequence (19,959 bp) of the mitochondrial DNA of the rhacophorid frog Buergeria buergeri. The gene content, nucleotide composition, and codon usage of B. buergeri conformed to those of typical vertebrate patterns. However, due to an accumulation of lengthy repetitive sequences in the D-loop region, this species possesses the largest mitochondrial genome among all the vertebrates examined so far. Comparison of the gene organizations among amphibian species (Rana, Xenopus, salamanders and caecilians) revealed that the positioning of four tRNA genes and the ND5 gene in the mtDNA of B. buergeri diverged from the common vertebrate gene arrangement shared by Xenopus, salamanders and caecilians. The unique positions of the tRNA genes in B. buergeri are shared by ranid frogs, indicating that the rearrangements of the tRNA genes occurred in a common ancestral lineage of ranids and rhacophorids. On the other hand, the novel position of the ND5 gene seems to have arisen in a lineage leading to rhacophorids (and other closely related taxa) after ranid divergence. Phylogenetic analysis based on nucleotide sequence data of all mitochondrial genes also supported the gene rearrangement pathway. PMID:15329496

  13. Characterization of cDNA clones for human myeloperoxidase: predicted amino acid sequence and evidence for multiple mRNA species.

    PubMed Central

    Johnson, K R; Nauseef, W M; Care, A; Wheelock, M J; Shane, S; Hudson, S; Koeffler, H P; Selsted, M; Miller, C; Rovera, G

    1987-01-01

    Myeloperoxidase is a component of the microbicidal network of polymorphonuclear leukocytes. The enzyme is a tetramer consisting of two heavy and two light subunits. A large proportion of humans demonstrate genetic deficiencies in the production of myeloperoxidase. As a first step in analyzing these deficiencies in more detail, we have isolated cDNA clones for myeloperoxidase from an expression library of the HL-60 human promyelocytic leukemia cell line. Two overlapping plasmids (pMP02 and pMP062) were identified as myeloperoxidase cDNA clones based on the detection with myeloperoxidase antiserum of a 70 kDa protein expressed in pMP02-containing bacteria and a 75 kDa polypeptide produced by hybridization selection and translation using pMP062 and HL-60 RNA. Formal identification of the clones was made by matching the predicted amino acid sequences with the amino terminal sequences of the heavy and light subunits. Both subunits are encoded by one mRNA in the following order: pre-pro-sequences--light subunit--heavy subunit. The molecular weight of the predicted primary translation product is 83.7 kDa. Northern blots reveal two size classes of hybridizing RNAs (approximately 3.0-3.3 and 3.5-4.0 kilobases) whose expression is restricted to cells of the granulocytic lineage and parallels the changes in enzymatic activity observed during differentiation. PMID:3031585

  14. Identification of potential vaccine and drug target candidates by expressed sequence tag analysis and immunoscreening of Onchocerca volvulus larval cDNA libraries.

    PubMed

    Lizotte-Waniewski, M; Tawe, W; Guiliano, D B; Lu, W; Liu, J; Williams, S A; Lustigman, S

    2000-06-01

    The search for appropriate vaccine candidates and drug targets against onchocerciasis has so far been confronted with several limitations due to the unavailability of biological material, appropriate molecular resources, and knowledge of the parasite biology. To identify targets for vaccine or chemotherapy development we have undertaken two approaches. First, cDNA expression libraries were constructed from life cycle stages that are critical for establishment of Onchocerca volvulus infection, the third-stage larvae (L3) and the molting L3. A gene discovery effort was then initiated by random expressed sequence tag analysis of 5,506 cDNA clones. Cluster analyses showed that many of the transcripts were up-regulated and/or stage specific in either one or both of the cDNA libraries when compared to the microfilariae, L2, and both adult stages of the parasite. Homology searches against the GenBank database facilitated the identification of several genes of interest, such as proteinases, proteinase inhibitors, antioxidant or detoxification enzymes, and neurotransmitter receptors, as well as structural and housekeeping genes. Other O. volvulus genes showed homology only to predicted genes from the free-living nematode Caenorhabditis elegans or were entirely novel. Some of the novel proteins contain potential secretory leaders. Secondly, by immunoscreening the molting L3 cDNA library with a pool of human sera from putatively immune individuals, we identified six novel immunogenic proteins that otherwise would not have been identified as potential vaccinogens using the gene discovery effort. This study lays a solid foundation for a better understanding of the biology of O. volvulus as well as for the identification of novel targets for filaricidal agents and/or vaccines against onchocerciasis based on immunological and rational hypothesis-driven research. PMID:10816503

  15. Nucleotides critical for the interaction of the Streptococcus pyogenes Mga virulence regulator with Mga-regulated promoter sequences.

    PubMed

    Hause, Lara L; McIver, Kevin S

    2012-09-01

    The Mga regulator of Streptococcus pyogenes directly activates the transcription of a core regulon that encodes virulence factors such as M protein (emm), C5a peptidase (scpA), and streptococcal inhibitor of complement (sic) by directly binding to a 45-bp binding site as determined by an electrophoretic mobility shift assay (EMSA) and DNase I protection. However, by comparing the nucleotide sequences of all established Mga binding sites, we found that they exhibit only 13.4% identity with no discernible symmetry. To determine the core nucleotides involved in functional Mga-DNA interactions, the M1T1 Pemm1 binding site was altered and screened for nucleotides important for DNA binding in vitro and for transcriptional activation using a plasmid-based luciferase reporter in vivo. Following this analysis, 34 nucleotides within the Pemm1 binding site that had an effect on Mga binding, Mga-dependent transcriptional activation, or both were identified. Of these critical nucleotides, guanines and cytosines within the major groove were disproportionately identified clustered at the 5' and 3' ends of the binding site and with runs of nonessential adenines between the critical nucleotides. On the basis of these results, a Pemm1 minimal binding site of 35 bp bound Mga at a level comparable to the level of binding of the larger 45-bp site. Comparison of Pemm with directed mutagenesis performed in the M1T1 Mga-regulated PscpA and Psic promoters, as well as methylation interference analysis of PscpA, establish that Mga binds to DNA in a promoter-specific manner. PMID:22773785

  16. Characterization of antigen-expressing Plasmodium falciparum cDNA clones that are reactive with parasite inhibitory antibodies.

    PubMed

    Horii, T; Bzik, D J; Inselburg, J

    1988-07-01

    A Plasmodium falciparum (FCR3 strain) lambda gt11 cDNA expression library was constructed from trophozoite and schizont poly(A) RNA and was screened immunologically with a pooled human immune serum from Nigeria to form a gene bank of 288 positive clones. The gene bank was subsequently screened with parasite inhibitory mouse monoclonal antibodies (mMAb) and with individual human Liberian sera. Two mMAb, 43E5 and 5H10, strongly reacted with 8 and 3 cDNA clones, respectively. Several of those clones also weakly cross-reacted with the other mMAb. Two of those weakly cross-reactive clones, cDNA#366 and cDNA#22, were shown to be located in different chromosomal regions of the parasite by Southern hybridization and so appeared to represent two different parasite genes. The genomic organization of both the cDNA#366 and cDNA#22 sequences was identical in the FCR3 and the Honduras-1 strains. The nucleotide sequence of cDNA#366 and the amino acid sequence it coded for were homologous to a partial DNA and amino acid sequence previously reported for a P. falciparum (Camp strain) exoantigen designated p126. The mRNA for cDNA#366 appeared to represent an abundant message in blood stage trophozoites and schizonts. PMID:2456465

  17. SMRT Sequencing of Long Tandem Nucleotide Repeats in SCA10 Reveals Unique Insight of Repeat Expansion Structure

    PubMed Central

    Landrian, Ivette; Godiska, Ronald; Shanker, Savita; Yu, Fahong; Farmerie, William G.; Ashizawa, Tetsuo

    2015-01-01

    A large, non-coding ATTCT repeat expansion causes the neurodegenerative disorder, spinocerebellar ataxia type 10 (SCA10). In a subset of SCA10 patients, interruption motifs are present at the 5’ end of the expansion and strongly correlate with epileptic seizures. Thus, interruption motifs are a predictor of the epileptic phenotype and are hypothesized to act as a phenotypic modifier in SCA10. Yet, the exact internal sequence structure of SCA10 expansions remains unknown due to limitations in current technologies for sequencing across long extended tracts of tandem nucleotide repeats. We used the third generation sequencing technology, Single Molecule Real Time (SMRT) sequencing, to obtain full-length contiguous expansion sequences, ranging from 2.5 to 4.4 kb in length, from three SCA10 patients with different clinical presentations. We obtained sequence spanning the entire length of the expansion and identified the structure of known and novel interruption motifs within the SCA10 expansion. The exact interruption patterns in expanded SCA10 alleles will allow us to further investigate the potential contributions of these interrupting sequences to the pathogenic modification leading to the epilepsy phenotype in SCA10. Our results also demonstrate that SMRT sequencing is useful for deciphering long tandem repeats that pose as “gaps” in the human genome sequence. PMID:26295943
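
    Once a full-length expansion sequence is available, as described above, its internal structure can be summarised as runs of the ATTCT unit separated by interruptions. The following is a minimal sketch of that idea, not the authors' analysis pipeline; the function name is hypothetical, and the toy tract with its single ATTGT interruption is invented for illustration and is not a motif reported in the study.

```python
import re


def repeat_structure(tract, unit="ATTCT"):
    """Split a tract into ('repeat', copy_count) and ('interruption', sequence)
    segments, in 5'-to-3' order."""
    tract = tract.upper()
    segments = []
    pos = 0
    # Each match is a maximal run of one or more perfect copies of the unit
    for m in re.finditer(f"(?:{re.escape(unit)})+", tract):
        if m.start() > pos:
            segments.append(("interruption", tract[pos:m.start()]))
        segments.append(("repeat", (m.end() - m.start()) // len(unit)))
        pos = m.end()
    if pos < len(tract):
        segments.append(("interruption", tract[pos:]))
    return segments


# Toy tract (hypothetical): three pure repeats, one interruption, two pure repeats
toy_tract = "ATTCT" * 3 + "ATTGT" + "ATTCT" * 2
print(repeat_structure(toy_tract))
```

    Real long-read data would also need error correction before a scan like this, since sequencing errors would otherwise be reported as interruptions.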

  18. SNUFER: A software for localization and presentation of single nucleotide polymorphisms using a Clustal multiple sequence alignment output file

    PubMed Central

    Mansur, Marco A B; Cardozo, Giovana P; Santos, Elaine V; Marins, Mozart

    2008-01-01

    SNUFER is a software tool for the automatic localization of single nucleotide polymorphisms (SNPs) and the generation of tables used to present them. After input of a FASTA file containing the sequences to be analyzed, a multiple sequence alignment is generated using ClustalW run inside SNUFER. The ClustalW output file is then used to generate a table which displays the SNPs detected in the aligned sequences and their degree of similarity. This table can be exported to Microsoft Word, Microsoft Excel or as a single text file, permitting further editing for publication. The software was written using Delphi 7 for programming and FireBird 2.0 for sequence database management. It is freely available for noncommercial use and can be downloaded from http://www.heranza.com.br/bioinformatica2.htm. PMID:19238196
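
    The SNP-localization step described in this abstract can be illustrated with a minimal Python sketch. It is not SNUFER's implementation (SNUFER was written in Delphi 7), and it assumes the alignment has already been parsed into a dictionary of equal-length gapped strings. The function names `find_snps` and `pairwise_identity` and the example sequences are hypothetical.

```python
from typing import Dict, List, Tuple


def find_snps(alignment: Dict[str, str], gap: str = "-") -> List[Tuple[int, Dict[str, str]]]:
    """Return (1-based column, {sequence name: base}) for every variable column.

    A column counts as a SNP when it holds more than one distinct
    non-gap character; columns that differ only by gaps are ignored.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    snps = []
    for col in range(lengths.pop()):
        column = {name: seq[col].upper() for name, seq in alignment.items()}
        bases = {base for base in column.values() if base != gap}
        if len(bases) > 1:
            snps.append((col + 1, column))
    return snps


def pairwise_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != gap and y != gap]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


# Hypothetical aligned sequences, for illustration only.
example = {
    "seq1": "ATGCCGTA-A",
    "seq2": "ATGACGTACA",
    "seq3": "ATGCCGTTCA",
}

for position, bases in find_snps(example):
    print(position, bases)
print(round(pairwise_identity(example["seq1"], example["seq2"]), 1))
```

    Ignoring gap-only differences is a design choice of this sketch: it reports substitutions only, so insertions and deletions would need separate handling.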

  19. Human adult T-cell leukemia virus: complete nucleotide sequence of the provirus genome integrated in leukemia cell DNA.

    PubMed Central

    Seiki, M; Hattori, S; Hirayama, Y; Yoshida, M

    1983-01-01

    Human retrovirus adult T-cell leukemia virus (ATLV) has been shown to be closely associated with human adult T-cell leukemia (ATL) [Yoshida, M., Miyoshi, I. & Hinuma, Y. (1982) Proc. Natl. Acad. Sci. USA 79, 2031-2035]. The provirus of ATLV integrated in DNA of leukemia T cells from a patient with ATL was molecularly cloned and the complete nucleotide sequence of 9,032 bases of the proviral genome was determined. The provirus DNA contains two long terminal repeats (LTRs) consisting of 755 bases, one at each end, which are flanked by a 6-base direct repeat of the cellular DNA sequence. The nucleotides in the LTR could be arranged into a unique secondary structure, which could explain transcriptional termination within the 3' LTR but not in the 5' LTR. The nucleotide sequence of the provirus contains three large open reading frames, which are capable of coding for proteins of 48,000, 99,000, and 54,000 daltons. The three open frames are in this order from the 5' end of the viral genome and the predicted 48,000-dalton polypeptide is a precursor of gag proteins, because it has an identical amino acid sequence to that of the NH2 terminus of human T-cell leukemia virus (HTLV) p24. The open frames coding for 99,000- and 54,000-dalton polypeptides are thought to be the pol and env genes, respectively. On the 3' side of these three open frames, the ATLV sequence has four smaller open frames in various phases; these frames may code for 10,000-, 11,000-, 12,000-, and 27,000-dalton polypeptides. Although one or some of these open frames could be the transforming gene of this virus, in preliminary analysis, DNA of this region has no homology with the normal human genome. PMID:6304725

  20. Construction of a full-length enriched cDNA library and preliminary analysis of expressed sequence tags from Bengal Tiger Panthera tigris tigris.

    PubMed

    Liu, Changqing; Liu, Dan; Guo, Yu; Lu, Taofeng; Li, Xiangchen; Zhang, Minghai; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2013-01-01

    In this study, a full-length enriched cDNA library was successfully constructed from Bengal tiger, Panthera tigris tigris, the most well-known wild animal. Total RNA was extracted from cultured Bengal tiger fibroblasts in vitro. The titers of the primary and amplified libraries were 1.28 × 10^6 pfu/mL and 1.56 × 10^9 pfu/mL, respectively. The percentage of recombinants from the unamplified library was 90.2% and the average length of exogenous inserts was 0.98 kb. A total of 212 individual ESTs with sizes ranging from 356 to 1108 bp were then analyzed. The BLASTX score revealed that 48.1% of the sequences were classified as a strong match, 45.3% as nominal and 6.6% as a weak match. Among the ESTs with known putative function, 26.4% were related to metabolism of various kinds, 19.3% to information storage and processing, 11.3% to posttranslational modification, protein turnover, and chaperones, 11.3% to transport, 9.9% to signal transduction/cell communication, 9.0% to structural proteins, and 3.8% to the cell cycle, while only 6.6% were classified as novel genes. By EST sequencing, a full-length gene coding for ferritin was identified and characterized. The recombinant plasmid pET32a-TAT-Ferritin was constructed; it coded for the TAT-Ferritin fusion protein with two 6× His-tags, at the N and C termini. By BCA assay, the concentration of soluble Trx-TAT-Ferritin recombinant protein was 2.32 ± 0.12 mg/mL. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provided a useful platform for functional genome and transcriptome research on Bengal tigers. PMID:23708105

  1. Construction of a Full-Length Enriched cDNA Library and Preliminary Analysis of Expressed Sequence Tags from Bengal Tiger Panthera tigris tigris

    PubMed Central

    Liu, Changqing; Liu, Dan; Guo, Yu; Lu, Taofeng; Li, Xiangchen; Zhang, Minghai; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2013-01-01

    In this study, a full-length enriched cDNA library was successfully constructed from Bengal tiger, Panthera tigris tigris, the most well-known wild animal. Total RNA was extracted from cultured Bengal tiger fibroblasts in vitro. The titers of the primary and amplified libraries were 1.28 × 10^6 pfu/mL and 1.56 × 10^9 pfu/mL, respectively. The percentage of recombinants from the unamplified library was 90.2% and the average length of exogenous inserts was 0.98 kb. A total of 212 individual ESTs with sizes ranging from 356 to 1108 bp were then analyzed. The BLASTX score revealed that 48.1% of the sequences were classified as a strong match, 45.3% as nominal and 6.6% as a weak match. Among the ESTs with known putative function, 26.4% were related to metabolism of various kinds, 19.3% to information storage and processing, 11.3% to posttranslational modification, protein turnover, and chaperones, 11.3% to transport, 9.9% to signal transduction/cell communication, 9.0% to structural proteins, and 3.8% to the cell cycle, while only 6.6% were classified as novel genes. By EST sequencing, a full-length gene coding for ferritin was identified and characterized. The recombinant plasmid pET32a-TAT-Ferritin was constructed; it coded for the TAT-Ferritin fusion protein with two 6× His-tags, at the N and C termini. By BCA assay, the concentration of soluble Trx-TAT-Ferritin recombinant protein was 2.32 ± 0.12 mg/mL. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provided a useful platform for functional genome and transcriptome research on Bengal tigers. PMID:23708105

  2. Nucleotide sequences and mutations of the 5'-nontranslated region (5'NTR) of natural isolates of an epidemic echovirus 11' (prime).

    PubMed

    Szendrői, A; El-Sageyer, M; Takács, M; Mezey, I; Berencsi, G

    2000-01-01

    An echovirus 11' (prime) virus caused an epidemic in Hungary in 1989. The leading clinical form of the disease was myocarditis. Hemorrhagic hepatitis syndromes also occurred, however, with lethal outcome in 13 newborn babies. Altogether 386 children suffered from registered clinical disease. No accumulation of serous meningitis cases or intrauterine deaths was observed during the epidemic, and the monovalent oral poliovirus vaccination campaign prevented further circulation of the virus. The 5'-nontranslated region (5'-NTR) of 12 natural isolates was sequenced (nucleotides 260-577). The 5'-NTR was found to be different from that of the prototype Gregory strain (X80059) of EV11 (less than 90% identity), but related to those of swine vesicular disease virus (SVDV; D16364) and EV9 (X92886), as indicated by the best-fitting dendrogram. Examination of the variable nucleotides in the internal ribosomal entry site (IRES) revealed that the nucleotide sequence of a region of the epidemic 5'-NTR was identical to that of coxsackievirus B2. Five of the epidemic isolates were found to carry mutations. Seven EV11' IRES elements possessed identical sequences, indicating that the virus had evolved before its arrival in Hungary. Comparative examination of the suboptimal secondary structures revealed that none of the mutations affected the secondary structure of stem-loop structures IV and V in the IRES elements. Although it has been shown previously that the echovirus group is genetically coherent and related to coxsackie B viruses, the sequence differences in the epidemic isolates resulted in profound modification of the central stem (residues 477-529) of stem-loop structure No. V, which is known to affect the neurovirulence of polioviruses. Two alternate cloverleaf (stem-loop) structures were also recognised (nucleotides 376 to 460 and 540 to 565) which seem to mask both regions of the IRES element complementary to the 3'-end of the 18 S rRNA (460 to 466 and 561 to 570

  3. Nucleotide Sequencing and SNP Detection of Toll-Like Receptor-4 Gene in Murrah Buffalo (Bubalus bubalis)

    PubMed Central

    Mitra, M.; Taraphder, S.; Sonawane, G. S.; Verma, A.

    2012-01-01

    Toll-like receptor-4 (TLR-4) is an important pattern recognition receptor that recognizes endotoxins associated with gram-negative bacterial infections. The present investigation was carried out to study nucleotide sequencing and SNP detection by PCR-RFLP analysis of the TLR-4 gene in Murrah buffalo. Genomic DNA was isolated from 102 lactating Murrah buffalo from the NDRI herd. The amplified PCR fragments of TLR-4, comprising exon 1, exon 2, exon 3.1, and exon 3.2, were subjected to RFLP. PCR products were obtained with sizes of 165, 300, 478, and 409 bp. The TLR-4 gene of the investigated Murrah buffaloes was highly polymorphic, with AA, AB, and BB genotypes as revealed by PCR-RFLP analysis using the Dra I, Hae III, and Hinf I restriction enzymes. Nucleotide sequencing of the amplified fragment of the TLR-4 gene of Murrah buffalo was done. Twelve SNPs were identified. Six SNPs were nonsynonymous, resulting in amino acid changes. Murrah is an indigenous buffalo breed, and the presence of the nonsynonymous SNPs is indicative of its unique genomic architecture. Sequence alignment and homology across species using BLAST analysis revealed 97%, 97%, 99%, 98%, and 80% sequence homology with Bos taurus, Bos indicus, Ovis aries, Capra hircus, and Homo sapiens, respectively.

  4. Glucitol-specific enzymes of the phosphotransferase system in Escherichia coli. Nucleotide sequence of the gut operon.

    PubMed

    Yamada, M; Saier, M H

    1987-04-25

    The complete nucleotide sequence of the glucitol (gut) operon in Escherichia coli has been determined. The glucitol-specific Enzyme II and Enzyme III of the phosphoenolpyruvate:sugar phosphotransferase system as well as glucitol-6-phosphate dehydrogenase which are encoded by the gutA, gutB, and gutD genes of the gut operon, respectively, are predicted to consist of 506 (Mr = 54,018), 123 (Mr = 13,306), and 259 (Mr = 27,866) amino acyl residues, respectively. The hydropathic profile of the Enzyme IIgut revealed 7 or 8 long hydrophobic segments which may traverse the cell membrane as alpha-helices as well as 2 or 4 short strongly hydrophobic stretches which may traverse the membrane as beta-structure. The number of amino acyl residues in the sum of the molecular weights of the glucitol Enzyme II-III pair are nearly the same as those of the mannitol Enzyme II. The ratio of hydrophobic to hydrophilic amino acyl residues and the numbers of the hydrophobic segments are also nearly the same for both transport systems. However, no significant homology was found in the nucleotide or amino acyl sequences of the two systems. Glucitol-6-phosphate dehydrogenase was found to exhibit sequence homology with ribitol dehydrogenase. A repetitive extragenic palindromic sequence was found in the 3'-flanking region of the gutD gene, suggesting the presence of a gene downstream from the gutD gene. PMID:3553176

  5. An Interpretation of the Ancestral Codon from Miller’s Amino Acids and Nucleotide Correlations in Modern Coding Sequences

    PubMed Central

    Carels, Nicolas; de Leon, Miguel Ponce

    2015-01-01

    Purine bias, which is usually referred to as an “ancestral codon”, is known to result in short-range correlations between nucleotides in coding sequences, and it is common in all species. We demonstrate that RWY is a more appropriate pattern than the classical RNY, and purine bias (Rrr) is the product of a network of nucleotide compensations induced by functional constraints on the physicochemical properties of proteins. Through deductions from universal correlation properties, we also demonstrate that amino acids from Miller’s spark discharge experiment are compatible with functional primeval proteins at the dawn of living cell radiation on earth. These amino acids match the hydropathy and secondary structures of modern proteins. PMID:25922573

  6. Development of Single Nucleotide Polymorphism Markers via Sequence-based Genotyping in Cotton (Gossypium spp)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    High-throughput single nucleotide polymorphism (SNP) genotyping has become the dominant approach to genomic analysis and genetic manipulation in many crop plants. In cotton (Gossypium spp), however, only a very limited number of loci and a dearth of information have been generated from SNP genotypi...

  7. Complete nucleotide sequence of Rose yellow leaf virus, a new member of the family Tombusviridae

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genome of the Rose yellow leaf virus (RYLV) has been determined to be 3918 nucleotides containing seven open reading frames (ORFs). ORF1 encodes a 27 kDa peptide (p27). ORF2 shares a common start codon with ORF1 and continues through the amber stop codon of p27 to encode a 87 kDa (p87) protein t...

  8. Genetic instability of Japanese encephalitis virus cDNA clones propagated in Escherichia coli.

    PubMed

    Zheng, Xuchen; Tong, Wu; Liu, Fei; Liang, Chao; Gao, Fei; Li, Guoxin; Tong, Guangzhi; Zheng, Hao

    2016-04-01

    The genetic instability of Flavivirus cDNA clones in transformed bacteria is a common phenomenon. Herein, a cDNA fragment of the nucleotide (nt) 1-2913 of the genome of a flavivirus, Japanese encephalitis virus (JEV), was used to investigate factors that caused the instability of cDNA clones. Several cDNA fragments with different 5'- or 3'-termini of the 2913-nt cDNA were obtained by PCR amplification or restriction enzyme digestion and cloned into a pCR-Blunt II-TOPO vector. All the cDNA fragments were stably propagated at 25 °C. However, the 5'-untranslated region and half of the 3'-E gene could cause the instability of the 2913-nt cDNA at 37 °C. The 5'-terminus sequences of the 2913-nt fragment were subjected to testing of the prokaryotic promoter activity by luciferase assay and Western blot. The sequences of 54-120 nt of the JEV genome exhibited high prokaryotic promoter activity at 37 °C, and the activity declined markedly at 25 °C. These findings revealed that the high prokaryotic promoter activity of the 54-120 nt sequences of the JEV genome together with expression of JEV structural genes determined the instability of a JEV cDNA clone. Growth at room temperature may reduce the prokaryotic promoter activity of 5'-sequences of the JEV genome and could represent an effective way to improve the stability of flavivirus cDNA clones in host bacteria. PMID:26888374

  9. Nucleotide sequences of fic and fic-1 genes involved in cell filamentation induced by cyclic AMP in Escherichia coli.

    PubMed Central

    Kawamukai, M; Matsuda, H; Fujii, W; Utsumi, R; Komano, T

    1989-01-01

    The nucleotide sequences of fic-1 involved in the cell filamentation induced by cyclic AMP in Escherichia coli and its normal counterpart fic were analyzed. The open reading frame of both fic-1 and fic coded for 200 amino acids. The Gly at position 55 in the Fic protein was changed to Arg in the Fic-1 protein. The promoter activity of fic was confirmed by fusing fic and lacZ. The gene downstream from fic was found to be pabA (p-aminobenzoate). There is an open reading frame (ORF190) coding for 190 amino acids upstream from the fic gene. Computer-assisted analysis showed that Fic has sequence similarity with part of CDC28 of Saccharomyces cerevisiae, CDC2 of Schizosaccharomyces pombe, and FtsA of E. coli. In addition, ORF190 has sequence similarity with the cyclosporin A-binding protein cyclophilin. PMID:2546924

  10. The complete nucleotide sequence of a 16S ribosomal RNA gene from a blue-green alga, Anacystis nidulans.

    PubMed

    Tomioka, N; Sugiura, M

    1983-01-01

    The complete nucleotide sequence of a 16S ribosomal RNA gene from a blue-green alga, Anacystis nidulans, has been determined. Its coding region is estimated to be 1,487 base pairs long, which is nearly identical to those reported for chloroplast 16S rRNA genes and is about 4% shorter than that of the Escherichia coli gene. The 16S rRNA sequence of A. nidulans has 83% homology with that of tobacco chloroplast and 74% homology with that of E. coli. Possible stem and loop structures of A. nidulans 16S rRNA sequences resemble more closely those of chloroplast 16S rRNAs than those of E. coli 16S rRNA. These observations support the endosymbiotic theory of chloroplast origin. PMID:6412038

  11. The nucleotide sequences of several tRNA genes from rat mitochondria: common features and relatedness to homologous species.

    PubMed Central

    Cantatore, P; De Benedetto, C; Gadaleta, G; Gallerani, R; Kroon, A M; Holtrop, M; Lanave, C; Pepe, G; Quagliariello, C; Saccone, C; Sbisa, E

    1982-01-01

    We have determined the nucleotide sequences of thirteen rat mt tRNA genes. The features of the primary and secondary structures of these tRNAs show that those for Gln, Ser, and f-Met resemble, while those for Lys, Cys, and Trp depart strikingly from the universal type. The remainder are slightly abnormal. Among many mammalian mt DNA sequences, those of mt tRNA genes are highly conserved, thus suggesting for those genes an additional, perhaps regulatory, function. A simple evolutionary relationship between the tRNAs of animal mitochondria and those of eukaryotic cytoplasm, of lower eukaryotic mitochondria or of prokaryotes, is not evident owing to the extreme divergence of the tRNA sequences in the two groups. However, a slightly higher homology does exist between a few animal mt tRNAs and those from prokaryotes or from lower eukaryotic mitochondria. PMID:7099963

  12. Construction of full-length cDNA library and development of EST-derived simple sequence repeat (EST-SSR) markers in Senecio scandens.

    PubMed

    Qian, Gang; Ping, Junjiao; Lu, Jian; Zhang, Zhen; Wang, Lei; Xu, Delin

    2014-12-01

    Senecio scandens Buch.-Ham. ex D. Don (Compositae) is a crucial source of Chinese traditional medicine with antibacterial properties. We constructed a cDNA library and obtained expressed sequence tags (ESTs) to show the distribution of gene ontology annotations for mRNAs, using an individual plant with superior antibacterial characteristics. Analysis of comparative genomics indicates that the putative uncharacterized proteins (21.07%) might be derived from "molecular function unknown" clones or rare transcripts. Furthermore, the Compositae had high cross-species transferability of EST-derived simple sequence repeats (EST-SSR), based on valid amplifications of 206 primer pairs developed from the newly assembled expressed sequence tag sequences in Artemisia annua L. Among those EST-SSR markers, 52 primers showed polymorphic amplifications between individuals with contrasting diverse antibacterial traits. Our sequence data and molecular markers will be cost-effective tools for further studies such as genome annotation, molecular breeding, and novel transcript profiles within Compositae species. PMID:25007751

  13. The human and mouse homologs of the yeast RAD52 gene: cDNA cloning, sequence analysis, assignment to human chromosome 12p12.2-p13, and mRNA expression in mouse tissues

    SciTech Connect

    Shen, Z.; Chen, D.J.; Denison, K.

    1995-01-01

    The yeast Saccharomyces cerevisiae RAD52 gene is involved in DNA double-strand break repair and mitotic/meiotic recombination. The N-terminal amino acid sequence of yeast S. cerevisiae, Schizosaccharomyces pombe, and Kluyveromyces lactis and chicken is highly conserved. Using the technology of mixed oligonucleotide primed amplification of cDNA (MOPAC), two mouse RAD52 homologous cDNA fragments were amplified and sequenced. Subsequently, we have cloned the cDNA of the human and mouse homologs of the yeast RAD52 gene by screening cDNA libraries using the identified mouse cDNA fragments. Sequence analysis of the cDNA-derived amino acid sequences revealed a highly conserved N-terminus among human, mouse, chicken, and yeast RAD52 genes. The human RAD52 gene was assigned to chromosome 12p12.2-p13 by fluorescence in situ hybridization, R-banding, and DNA analysis of somatic cell hybrids. Unlike chicken RAD52 and mouse RAD51, no significant difference in mouse RAD52 mRNA level was found among mouse heart, brain, spleen, lung, liver, skeletal muscle, kidney, and testis. In addition to an approximately 1.9-kb RAD52 mRNA band that is present in all of the tested tissues, an extra mRNA species of approximately 0.85 kb was detectable in mouse testis. 40 refs., 7 figs., 1 tab.

  14. Nucleotide sequence of Zygosaccharomyces bailii virus Z: Evidence for +1 programmed ribosomal frameshifting and for assignment to family Amalgaviridae.

    PubMed

    Depierreux, Delphine; Vong, Minh; Nibert, Max L

    2016-06-01

    Zygosaccharomyces bailii virus Z (ZbV-Z) is a monosegmented dsRNA virus that infects the yeast Zygosaccharomyces bailii and remains unclassified to date despite its discovery >20 years ago. The previously reported nucleotide sequence of ZbV-Z (GenBank AF224490) encompasses two nonoverlapping long ORFs: upstream ORF1 encoding the putative coat protein and downstream ORF2 encoding the RNA-dependent RNA polymerase (RdRp). The lack of overlap between these ORFs raises the question of how the downstream ORF is translated. After examining the previous sequence of ZbV-Z, we predicted that it contains at least one sequencing error to explain the nonoverlapping ORFs, and hence we redetermined the nucleotide sequence of ZbV-Z, derived from the same isolate of Z. bailii as previously studied, to address this prediction. The key finding from our new sequence, which includes several insertions, deletions, and substitutions relative to the previous one, is that ORF2 in fact overlaps ORF1 in the +1 frame. Moreover, a proposed sequence motif for +1 programmed ribosomal frameshifting, previously noted in influenza A viruses, plant amalgaviruses, and others, is also present in the newly identified ORF1-ORF2 overlap region of ZbV-Z. Phylogenetic analyses provided evidence that ZbV-Z represents a distinct taxon most closely related to plant amalgaviruses (genus Amalgavirus, family Amalgaviridae). We conclude that ZbV-Z is the prototype of a new species, which we propose to assign as type species of a new genus of monosegmented dsRNA mycoviruses in family Amalgaviridae. Comparisons involving other unclassified mycoviruses with RdRps apparently related to those of plant amalgaviruses, and having either mono- or bisegmented dsRNA genomes, are also discussed. PMID:26951859

  15. A comparative genomics strategy for targeted discovery of single-nucleotide polymorphisms and conserved-noncoding sequences in orphan crops.

    PubMed

    Feltus, F A; Singh, H P; Lohithaswa, H C; Schulze, S R; Silva, T D; Paterson, A H

    2006-04-01

    Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species. PMID:16607031

  16. Partition enrichment of nucleotide sequences (PINS)--a generally applicable, sequence based method for enrichment of complex DNA samples.

    PubMed

    Kvist, Thomas; Sondt-Marcussen, Line; Mikkelsen, Marie Just

    2014-01-01

    The dwindling cost of DNA sequencing is driving transformative changes in various biological disciplines including medicine, thus resulting in an increased need for routine sequencing. Preparation of samples suitable for sequencing is the starting point of any practical application, but enrichment of the target sequence over background DNA is often laborious and of limited sensitivity, thereby limiting the usefulness of sequencing. The present paper describes a new method, Probability directed Isolation of Nucleic acid Sequences (PINS), for enrichment of DNA, enabling the sequencing of a large DNA region surrounding a small known sequence. A 275,000-fold enrichment of a target DNA sample containing integrated human papilloma virus is demonstrated. Specifically, a sample containing 0.0028 copies of target sequence per ng of total DNA was enriched to 786 copies per ng. The starting concentration of 0.0028 target copies per ng corresponds to one copy of target in a background of 100,000 complete human genomes. The enriched sample was subsequently amplified using rapid genome walking and the resulting DNA sequence revealed not only the sequence of the truncated virus, but also 1026 base pairs 5' and 50 base pairs 3' to the integration site in chromosome 8. The demonstrated enrichment method is extremely sensitive and selective and requires only minimal knowledge of the sequence to be enriched. It will therefore enable sequencing where the target concentration relative to background is too low to allow the use of other sample preparation methods or where significant parts of the target sequence are unknown. PMID:25203653
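
    The enrichment figures quoted in this abstract can be checked with a short calculation. The sketch below is only an illustration in Python: the haploid human genome mass of about 3.3 pg is an assumption not stated in the abstract, and the function names are hypothetical. With the rounded concentrations given, the fold enrichment comes out near 2.8 × 10^5, close to the reported 275,000, and the background comes out near 10^5 genomes per target copy.

```python
# Assumed mass of one haploid human genome in picograms (not from the abstract).
HAPLOID_GENOME_PG = 3.3


def fold_enrichment(start_copies_per_ng: float, end_copies_per_ng: float) -> float:
    """Ratio of target concentration after enrichment to that before."""
    return end_copies_per_ng / start_copies_per_ng


def genomes_per_target_copy(copies_per_ng: float,
                            genome_pg: float = HAPLOID_GENOME_PG) -> float:
    """Number of background genomes present per single target copy."""
    genomes_per_ng = 1000.0 / genome_pg  # 1 ng = 1000 pg
    return genomes_per_ng / copies_per_ng


fold = fold_enrichment(0.0028, 786.0)
background = genomes_per_target_copy(0.0028)

print(f"fold enrichment: {fold:,.0f}")
print(f"genomes per target copy: {background:,.0f}")
```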

  17. Partition Enrichment of Nucleotide Sequences (PINS) - A Generally Applicable, Sequence Based Method for Enrichment of Complex DNA Samples

    PubMed Central

    Kvist, Thomas; Sondt-Marcussen, Line; Mikkelsen, Marie Just

    2014-01-01

    The dwindling cost of DNA sequencing is driving transformative changes in various biological disciplines including medicine, thus resulting in an increased need for routine sequencing. Preparation of samples suitable for sequencing is the starting point of any practical application, but enrichment of the target sequence over background DNA is often laborious and of limited sensitivity, thereby limiting the usefulness of sequencing. The present paper describes a new method, Probability directed Isolation of Nucleic acid Sequences (PINS), for enrichment of DNA, enabling the sequencing of a large DNA region surrounding a small known sequence. A 275,000-fold enrichment of a target DNA sample containing integrated human papilloma virus is demonstrated. Specifically, a sample containing 0.0028 copies of target sequence per ng of total DNA was enriched to 786 copies per ng. The starting concentration of 0.0028 target copies per ng corresponds to one copy of target in a background of 100,000 complete human genomes. The enriched sample was subsequently amplified using rapid genome walking and the resulting DNA sequence revealed not only the sequence of the truncated virus, but also 1026 base pairs 5′ and 50 base pairs 3′ to the integration site in chromosome 8. The demonstrated enrichment method is extremely sensitive and selective and requires only minimal knowledge of the sequence to be enriched. It will therefore enable sequencing where the target concentration relative to background is too low to allow the use of other sample preparation methods or where significant parts of the target sequence are unknown. PMID:25203653

  18. Nucleotide sequence and genome organization of Dweet mottle virus and its relationship to members of the family Betaflexiviridae.

    PubMed

    Hajeri, Subhas; Ramadugu, Chandrika; Keremane, Manjunath; Vidalakis, Georgios; Lee, Richard

    2010-09-01

    The nucleotide sequence of Dweet mottle virus (DMV) was determined and compared to sequences of members of the families Alphaflexiviridae and Betaflexiviridae. The DMV genome has 8,747 nucleotides (nt) excluding the 3' poly-(A) tail. DMV genomic RNA contains three putative open reading frames (ORFs) and untranslated regions of 73 nt at the 5' terminus and 541 nt at the 3' terminus. ORF1 potentially encodes a 227.48-kDa polyprotein, which has methyltransferase, oxygenase, endopeptidase, helicase, and RNA-dependent RNA polymerase (RdRP) domains. ORF2 encodes a movement protein of 40.25 kDa, while ORF3 encodes a coat protein of 40.69 kDa. Protein database searches showed 98-99% matches of DMV ORFs with citrus leaf blotch virus (CLBV) sequences. Phylogenetic analysis based on the RdRP core domain revealed that DMV is closely related to CLBV as a member of the genus Citrivirus. DMV did not satisfy the molecular criteria for demarcation of an independent species within the genus Citrivirus, family Betaflexiviridae, and hence, DMV can be considered a CLBV isolate. PMID:20644968

  19. Nucleotide sequence, transcription and phylogeny of the gene encoding the superoxide dismutase of Sulfolobus acidocaldarius.

    PubMed

    Klenk, H P; Schleper, C; Schwass, V; Brudler, R

    1993-07-18

    The gene encoding the superoxide dismutase (SOD) of the thermophilic archaeon Sulfolobus acidocaldarius has been isolated and sequenced. Both the start site and the termination sites of the corresponding transcript were mapped. The deduced amino acid sequence of the protein is very similar to the sequence of manganese- or iron-containing SODs. Phylogenetic sequence analysis corroborated the monophyletic nature of the archaeal domain. PMID:8334170

  20. Characterization of the venom from the Australian scorpion Urodacus yaschenkoi: Molecular mass analysis of components, cDNA sequences and peptides with antimicrobial activity.

    PubMed

    Luna-Ramírez, Karen; Quintero-Hernández, Veronica; Vargas-Jaimes, Leonel; Batista, Cesar V F; Winkel, Kenneth D; Possani, Lourival D

    2013-03-01

    The Urodacidae scorpions are the most widely distributed of the four families in Australia and represent half of the species in the continent, yet their venoms remain largely unstudied. This communication reports the first results of a proteome analysis of the venom of the scorpion Urodacus yaschenkoi performed by mass fingerprinting, after high performance liquid chromatography (HPLC) separation. A total of 74 fractions were obtained by HPLC separation, allowing the identification of approximately 274 different components with molecular masses varying from 287 to 43,437 Da. The most abundant peptides were those of about 1 kDa and 4-5 kDa, representing antimicrobial peptides and putative potassium channel toxins, respectively. Three such peptides were chemically synthesized and tested against Gram-positive and Gram-negative bacteria, showing minimum inhibitory concentrations in the low micromolar range, but with moderate hemolytic activity. This communication also reports a transcriptome analysis of the venom glands of the same scorpion species, undertaken by constructing a cDNA library and conducting random sequencing screening of the transcripts. From the resultant cDNA library 172 expressed sequence tags (ESTs) were analyzed. These transcripts were further clustered into 120 unique sequences (23 contigs and 97 singlets). The identified putative proteins can be sorted into several groups, such as those implicated in common cellular processes, putative neurotoxins and antimicrobial peptides. The scorpion U. yaschenkoi is not known to be dangerous to humans and its venom contains peptides similar to those of Opisthacanthus cayaporum (antibacterial), Scorpio maurus palmatus (maurocalcin), Opistophthalmus carinatus (opistoporines) and Hadrurus gerstchi (scorpine-like molecules), amongst others. PMID:23182832

  1. Large scale full-length cDNA sequencing reveals a unique genomic landscape in a lepidopteran model insect, Bombyx mori.

    PubMed

    Suetsugu, Yoshitaka; Futahashi, Ryo; Kanamori, Hiroyuki; Kadono-Okuda, Keiko; Sasanuma, Shun-ichi; Narukawa, Junko; Ajimura, Masahiro; Jouraku, Akiya; Namiki, Nobukazu; Shimomura, Michihiko; Sezutsu, Hideki; Osanai-Futahashi, Mizuko; Suzuki, Masataka G; Daimon, Takaaki; Shinoda, Tetsuro; Taniai, Kiyoko; Asaoka, Kiyoshi; Niwa, Ryusuke; Kawaoka, Shinpei; Katsuma, Susumu; Tamura, Toshiki; Noda, Hiroaki; Kasahara, Masahiro; Sugano, Sumio; Suzuki, Yutaka; Fujiwara, Haruhiko; Kataoka, Hiroshi; Arunkumar, Kallare P; Tomar, Archana; Nagaraju, Javaregowda; Goldsmith, Marian R; Feng, Qili; Xia, Qingyou; Yamamoto, Kimiko; Shimada, Toru; Mita, Kazuei

    2013-09-01

    The establishment of a complete genomic sequence of silkworm, the model species of Lepidoptera, laid a foundation for its functional genomics. A more complete annotation of the genome will benefit functional and comparative studies and accelerate extensive industrial applications for this insect. To realize these goals, we embarked upon a large-scale full-length cDNA collection from 21 full-length cDNA libraries derived from 14 tissues of the domesticated silkworm and performed full sequencing by primer walking for 11,104 full-length cDNAs. The large average intron size was 1904 bp, resulting from a high accumulation of transposons. Using gene models predicted by GLEAN and published mRNAs, we identified 16,823 gene loci on the silkworm genome assembly. Orthology analysis of 153 species, including 11 insects, revealed that among three Lepidoptera including Monarch and Heliconius butterflies, the 403 largest silkworm-specific genes were composed mainly of protective immunity, hormone-related, and characteristic structural proteins. Analysis of testis-/ovary-specific genes revealed distinctive features of sexual dimorphism, including depletion of ovary-specific genes on the Z chromosome in contrast to an enrichment of testis-specific genes. More than 40% of genes expressed in specific tissues mapped in tissue-specific chromosomal clusters. The newly obtained FL-cDNA sequences enabled us to annotate the genome of this lepidopteran model insect more accurately, enhancing genomic and functional studies of Lepidoptera and comparative analyses with other insect orders, and yielding new insights into the evolution and organization of lepidopteran-specific genes. PMID:23821615

  2. Large Scale Full-Length cDNA Sequencing Reveals a Unique Genomic Landscape in a Lepidopteran Model Insect, Bombyx mori

    PubMed Central

    Suetsugu, Yoshitaka; Futahashi, Ryo; Kanamori, Hiroyuki; Kadono-Okuda, Keiko; Sasanuma, Shun-ichi; Narukawa, Junko; Ajimura, Masahiro; Jouraku, Akiya; Namiki, Nobukazu; Shimomura, Michihiko; Sezutsu, Hideki; Osanai-Futahashi, Mizuko; Suzuki, Masataka G; Daimon, Takaaki; Shinoda, Tetsuro; Taniai, Kiyoko; Asaoka, Kiyoshi; Niwa, Ryusuke; Kawaoka, Shinpei; Katsuma, Susumu; Tamura, Toshiki; Noda, Hiroaki; Kasahara, Masahiro; Sugano, Sumio; Suzuki, Yutaka; Fujiwara, Haruhiko; Kataoka, Hiroshi; Arunkumar, Kallare P.; Tomar, Archana; Nagaraju, Javaregowda; Goldsmith, Marian R.; Feng, Qili; Xia, Qingyou; Yamamoto, Kimiko; Shimada, Toru; Mita, Kazuei

    2013-01-01

    The establishment of a complete genomic sequence of silkworm, the model species of Lepidoptera, laid a foundation for its functional genomics. A more complete annotation of the genome will benefit functional and comparative studies and accelerate extensive industrial applications for this insect. To realize these goals, we embarked upon a large-scale full-length cDNA collection from 21 full-length cDNA libraries derived from 14 tissues of the domesticated silkworm and performed full sequencing by primer walking for 11,104 full-length cDNAs. The large average intron size was 1904 bp, resulting from a high accumulation of transposons. Using gene models predicted by GLEAN and published mRNAs, we identified 16,823 gene loci on the silkworm genome assembly. Orthology analysis of 153 species, including 11 insects, revealed that among three Lepidoptera including Monarch and Heliconius butterflies, the 403 largest silkworm-specific genes were composed mainly of protective immunity, hormone-related, and characteristic structural proteins. Analysis of testis-/ovary-specific genes revealed distinctive features of sexual dimorphism, including depletion of ovary-specific genes on the Z chromosome in contrast to an enrichment of testis-specific genes. More than 40% of genes expressed in specific tissues mapped in tissue-specific chromosomal clusters. The newly obtained FL-cDNA sequences enabled us to annotate the genome of this lepidopteran model insect more accurately, enhancing genomic and functional studies of Lepidoptera and comparative analyses with other insect orders, and yielding new insights into the evolution and organization of lepidopteran-specific genes. PMID:23821615

  3. Complete Nucleotide Sequence of cfr-Carrying IncX4 Plasmid pSD11 from Escherichia coli

    PubMed Central

    Sun, Jian; Deng, Hui; Li, Liang; Chen, Mu-Ya; Fang, Liang-Xing; Yang, Qiu-E

    2014-01-01

    We report the complete nucleotide sequence of a plasmid carrying the multiresistance gene cfr. This plasmid was isolated from an Escherichia coli strain of swine origin in 2011. This 37,672-bp plasmid, pSD11, had an IncX4 backbone similar to those of the IncX4 plasmids obtained from the United States and Australia, in which the cfr gene was flanked by two copies of IS26 and a truncated Tn1331 was inserted. PMID:25403661

  4. DNA sequencing by a single molecule detection of labeled nucleotides sequentially cleaved from a single strand of DNA

    SciTech Connect

    Goodwin, P.M.; Schecker, J.A.; Wilkerson, C.W.; Hammond, M.L.; Ambrose, W.P.; Jett, J.H.; Martin, J.C.; Marrone, B.L.; Keller, R.A.; Haces, A.; Shih, P.J.; Harding, J.D.

    1993-01-01

    We are developing a laser-based technique for the rapid sequencing of large DNA fragments (several kb in size) at a rate of 100 to 1000 bases per second. Our approach relies on fluorescent labeling of the bases in a single fragment of DNA, attachment of this labeled DNA fragment to a support, movement of the supported DNA into a flowing sample stream, sequential cleavage of the end nucleotide from the DNA fragment with an exonuclease, and detection of the individual fluorescently labeled bases by laser-induced fluorescence.

  5. DNA sequencing by a single molecule detection of labeled nucleotides sequentially cleaved from a single strand of DNA

    SciTech Connect

    Goodwin, P.M.; Schecker, J.A.; Wilkerson, C.W.; Hammond, M.L.; Ambrose, W.P.; Jett, J.H.; Martin, J.C.; Marrone, B.L.; Keller, R.A.; Haces, A.; Shih, P.J.; Harding, J.D.

    1993-02-01

    We are developing a laser-based technique for the rapid sequencing of large DNA fragments (several kb in size) at a rate of 100 to 1000 bases per second. Our approach relies on fluorescent labeling of the bases in a single fragment of DNA, attachment of this labeled DNA fragment to a support, movement of the supported DNA into a flowing sample stream, sequential cleavage of the end nucleotide from the DNA fragment with an exonuclease, and detection of the individual fluorescently labeled bases by laser-induced fluorescence.

  6. Analysis of a nucleotide-binding site of 5-lipoxygenase by affinity labelling: binding characteristics and amino acid sequences.

    PubMed Central

    Zhang, Y Y; Hammarberg, T; Radmark, O; Samuelsson, B; Ng, C F; Funk, C D; Loscalzo, J

    2000-01-01

    5-Lipoxygenase (5LO) catalyses the first two steps in the biosynthesis of leukotrienes, which are inflammatory mediators derived from arachidonic acid. 5LO activity is stimulated by ATP; however, a consensus ATP-binding site or nucleotide-binding site has not been found in its protein sequence. In the present study, affinity and photoaffinity labelling of 5LO with 5'-p-fluorosulphonylbenzoyladenosine (FSBA) and 2-azido-ATP showed that 5LO bound to the ATP analogues quantitatively and specifically and that the incorporation of either analogue inhibited ATP stimulation of 5LO activity. The stoichiometry of the labelling was 1.4 mol of FSBA/mol of 5LO (of which ATP competed with 1 mol/mol) or 0.94 mol of 2-azido-ATP/mol of 5LO (of which ATP competed with 0.77 mol/mol). Labelling with FSBA prevented further labelling with 2-azido-ATP, indicating that the same binding site was occupied by both analogues. Other nucleotides (ADP, AMP, GTP, CTP and UTP) also competed with 2-azido-ATP labelling, suggesting that the site was a general nucleotide-binding site rather than a strict ATP-binding site. Ca(2+), which also stimulates 5LO activity, had no effect on the labelling of the nucleotide-binding site. Digestion with trypsin and peptide sequencing showed that two fragments of 5LO were labelled by 2-azido-ATP. These fragments correspond to residues 73-83 (KYWLNDDWYLK, in single-letter amino acid code) and 193-209 (FMHMFQSSWNDFADFEK) in the 5LO sequence. Trp-75 and Trp-201 in these peptides were modified by the labelling, suggesting that they were immediately adjacent to the C-2 position of the adenine ring of ATP. Given the stoichiometry of the labelling, the two peptide sequences of 5LO were probably near each other in the enzyme's tertiary structure, composing or surrounding the ATP-binding site of 5LO. PMID:11042125

  7. Nucleotide sequence neighbouring a late modified guanylic residue within the 28S ribosomal RNA of several eukaryotic cells.

    PubMed Central

    Eladari, M E; Hampe, A; Galibert, F

    1977-01-01

    The nucleotide sequence of a particular T1 oligonucleotide found in 41S and 28S RNAs of several cell lines (human, mouse, rat and chicken fibroblast) but absent in 45S ribosomal RNA has been deduced. Its primary structure, A-U-U*-G*-psi-U-C-A-C-C-C-A-C-U-A-A-U-A-Gp, shows the presence of a modified G residue which explains the existence of this oligonucleotide in the T1 fingerprints of 41S and 28S RNA. Its absence from the 45S RNA T1 fingerprint is accounted for by a late modification. Images PMID:561392

  8. A Drosophila full-length cDNA resource

    SciTech Connect

    Stapleton, Mark; Carlson, Joseph; Brokstein, Peter; Yu, Charles; Champe, Mark; George, Reed; Guarin, Hannibal; Kronmiller, Brent; Pacleb, Joanne; Park, Soo; Rubin, Gerald M.; Celniker, Susan E.

    2003-05-09

    Background: A collection of sequenced full-length cDNAs is an important resource both for functional genomics studies and for the determination of the intron-exon structure of genes. Providing this resource to the Drosophila melanogaster research community has been a long-term goal of the Berkeley Drosophila Genome Project. We have previously described the Drosophila Gene Collection (DGC), a set of putative full-length cDNAs that was produced by generating and analyzing over 250,000 expressed sequence tags (ESTs) derived from a variety of tissues and developmental stages. Results: We have generated high-quality full-insert sequence for 8,921 clones in the DGC. We compared the sequence of these clones to the annotated Release 3 genomic sequence, and identified more than 5,300 cDNAs that contain a complete and accurate protein-coding sequence. This corresponds to at least one splice form for 40 percent of the predicted D. melanogaster genes. We also identified potential new cases of RNA editing. Conclusions: We show that comparison of cDNA sequences to a high-quality annotated genomic sequence is an effective approach to identifying and eliminating defective clones from a cDNA collection and ensuring its utility for experimentation. Clones were eliminated either because they carry single nucleotide discrepancies, which most probably result from reverse transcriptase errors, or because they are truncated and contain only part of the protein-coding sequence.

  9. Nucleotide sequence of the 3'-terminal region of the genome confirms that pea mosaic virus is a strain of bean yellow mosaic potyvirus.

    PubMed

    Xiao, X W; Frenkel, M J; Ward, C W; Shukla, D D

    1994-01-01

    The 1,035 nucleotides at the 3' end of the I strain of pea mosaic potyvirus (PMV-I) genomic RNA, encoding the coat protein, have been cloned and sequenced. A comparison of the derived coat protein sequence with those of the bean yellow mosaic virus (BYMV) strains, CS, S, D and GDD, indicates that PMV-I is a strain of BYMV. Sequence comparisons and hybridisation studies using the 3'-noncoding region support this classification. The nucleotide and protein sequence data also suggest that PMV-I and BYMV-CS form one subset of BYMV strains while the other three strains form another. PMID:8031241

  10. Transcriptome sequencing to produce SNP-based genetic maps of onion

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We used the 454 platform to sequence from normalized cDNA libraries from each of two inbred lines of onion (OH1 and 5225). From approximately 1.6 million reads from each inbred, 27,065 and 33,254 cDNA contigs were assembled from OH1 and 5225, respectively. In total, 3,364 single nucleotide polymorph...

  11. PerPlot & PerScan: tools for analysis of DNA curvature-related periodicity in genomic nucleotide sequences

    PubMed Central

    2011-01-01

    Background: Periodic spacing of short adenine or thymine runs phased with DNA helical period of ~10.5 bp is associated with intrinsic DNA curvature and deformability, which play important roles in DNA-protein interactions and in the organization of chromosomes in both eukaryotes and prokaryotes. Local differences in DNA sequence periodicity have been linked to differences in gene expression in some organisms. Despite the significance of these periodic patterns, there are virtually no publicly accessible tools for their analysis. Results: We present novel tools suitable for assessments of DNA curvature-related sequence periodicity in nucleotide sequences at the genome scale. Utility of the present software is demonstrated on a comparison of sequence periodicities in the genomes of Haemophilus influenzae, Methanocaldococcus jannaschii, Saccharomyces cerevisiae, and Arabidopsis thaliana. The software can be accessed through a web interface and the programs are also available for download. Conclusions: The present software is suitable for comparing DNA curvature-related sequence periodicity among different genomes as well as for analysis of intrachromosomal heterogeneity of the sequence periodicity. It provides a quick and convenient way to detect anomalous regions of chromosomes that could have unusual structural and functional properties and/or distinct evolutionary history. PMID:22587738
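
    The kind of signal described in this abstract (A or T runs recurring roughly every 10.5 bp) can be illustrated with a small spacing histogram. The following is a minimal sketch only, not the PerPlot or PerScan algorithm; the function names, the minimum run length of 3, and the toy sequence are all assumptions made for illustration.

```python
import re
from collections import Counter

def at_run_positions(seq, min_len=3):
    """Return start positions of runs of A or runs of T of at least min_len bases."""
    pattern = re.compile(r"A{%d,}|T{%d,}" % (min_len, min_len))
    return [m.start() for m in pattern.finditer(seq.upper())]

def spacing_histogram(positions, max_dist=40):
    """Count pairs of run starts separated by each distance up to max_dist."""
    counts = Counter()
    for i, p in enumerate(positions):
        for q in positions[i + 1:]:
            d = q - p
            if d > max_dist:
                break  # positions are sorted, so later ones are farther still
            counts[d] += 1
    return counts

# Hypothetical toy sequence: AAA runs spaced alternately 10 and 11 bp apart,
# giving a mean spacing of 10.5 bp.
unit = "AAA" + "GCGCGCG" + "AAA" + "GCGCGCGC"   # 10 bp + 11 bp
toy_seq = unit * 20
hist = spacing_histogram(at_run_positions(toy_seq))
```

    In such a histogram, a phased sequence shows peaks at distances near multiples of the helical period (here 10, 11 and 21), whereas an unphased sequence would spread its counts across all distances.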

  12. The nucleotide composition of the spacer sequence influences the expression yield of heterologously expressed genes in Bacillus subtilis.

    PubMed

    Liebeton, Klaus; Lengefeld, Jette; Eck, Jürgen

    2014-12-10

    Bacillus subtilis is a commonly used host for the heterologous expression of genes in academia and industry. Many factors are known to influence the expression yield in this organism e.g. the complementarity between the Shine-Dalgarno sequence (SD) and the 16S-rRNA or secondary structures in the translation initiation region of the transcript. In this study, we analysed the impact of the nucleotide composition between the SD sequence and the start codon (the spacer sequence) on the expression yield. We demonstrated that a polyadenylate-moiety spacer sequence moderately increases the expression level of laccase CotA from B. subtilis. By screening a library of artificially generated spacer variants, we identified clones with greatly increased expression levels of two model enzymes, the laccase CotA from B. subtilis (11 fold) and the metagenome derived protease H149 (30 fold). Furthermore, we demonstrated that the effect of the spacer sequence is specific to the gene of interest. These results prove the high impact of the spacer sequence on the expression yield in B. subtilis. PMID:24997355

  13. The nucleotide sequence and genomic organization of Citrus leaf blotch virus: candidate type species for a new virus genus.

    PubMed

    Vives, M C; Galipienso, L; Navarro, L; Moreno, P; Guerri, J

    2001-08-15

    The complete nucleotide sequence of Citrus leaf blotch virus (CLBV) was determined. CLBV genomic RNA (gRNA) has 8747 nt, excluding the 3'-terminal poly(A) tail, and contains three open reading frames (ORFs) and untranslated regions (UTR) of 73 and 541 nucleotides at the 5' and 3' termini, respectively. ORF1 potentially encodes a 227.4-kDa polypeptide, which has methyltransferase, papain-like protease, helicase, and RNA-dependent RNA polymerase motifs. ORF2 encodes a 40.2-kDa polypeptide containing a motif characteristic of cell-to-cell movement proteins. The 40.7-kDa polypeptide encoded by ORF3 was identified as the coat protein. The genome organization of CLBV resembles that of viruses in the genus Trichovirus, but they differ in various aspects: (i) in trichoviruses ORF2 overlaps ORFs 1 and 3, whereas in CLBV, ORFs 2 and 3 are separated and ORFs 1 and 2 overlap in one nucleotide; (ii) CLBV gRNA and CP are larger than those of trichoviruses; and (iii) the CLBV 3' UTR is larger than that of trichoviruses. Phylogenetic comparisons based on CP amino acid signatures clearly separate CLBV from trichoviruses. Also contrasting with trichoviruses, CLBV could not be transmitted to Chenopodium quinoa Willd. Considering these singularities, we propose that CLBV should be included in a new virus genus. PMID:11504557

  14. Nucleotide sequence of the phosphoglycerate kinase gene from the extreme thermophile Thermus thermophilus. Comparison of the deduced amino acid sequence with that of the mesophilic yeast phosphoglycerate kinase.

    PubMed Central

    Bowen, D; Littlechild, J A; Fothergill, J E; Watson, H C; Hall, L

    1988-01-01

    Using oligonucleotide probes derived from amino acid sequencing information, the structural gene for phosphoglycerate kinase from the extreme thermophile, Thermus thermophilus, was cloned in Escherichia coli and its complete nucleotide sequence determined. The gene consists of an open reading frame corresponding to a protein of 390 amino acid residues (calculated Mr 41,791) with an extreme bias for G or C (93.1%) in the codon third base position. Comparison of the deduced amino acid sequence with that of the corresponding mesophilic yeast enzyme indicated a number of significant differences. These are discussed in terms of the unusual codon bias and their possible role in enhanced protein thermal stability. Images Fig. 1. PMID:3052437
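
    The third-position G or C bias quoted in this abstract (93.1%) is a simple per-codon statistic. As a minimal sketch, assuming an in-frame coding sequence given as a plain string (the function name and the toy sequence are hypothetical, not taken from the paper), it could be computed as:

```python
def gc3_fraction(cds):
    """Fraction of codons whose third base is G or C.

    Assumes cds is an in-frame coding sequence (length a multiple of 3).
    """
    cds = cds.upper()
    if not cds or len(cds) % 3 != 0:
        raise ValueError("coding sequence must be non-empty and a multiple of 3 long")
    third_bases = cds[2::3]  # every third base, starting at codon position 3
    return sum(base in "GC" for base in third_bases) / len(third_bases)

# Hypothetical toy example: codons ATG GCC AAG TAA have third bases G, C, G, A.
toy_gc3 = gc3_fraction("ATGGCCAAGTAA")
```

    Whether the published figure counts the stop codon is not stated in the abstract, so the sketch simply counts every codon it is given.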

  15. Anabolic ornithine carbamoyltransferase of Pseudomonas aeruginosa: nucleotide sequence and transcriptional control of the argF structural gene.

    PubMed Central

    Itoh, Y; Soldati, L; Stalon, V; Falmagne, P; Terawaki, Y; Leisinger, T; Haas, D

    1988-01-01

    In Pseudomonas aeruginosa PAO the anabolic ornithine carbamoyltransferase (OTCase, EC 2.1.3.3) is the product of the argF gene and the only arginine biosynthetic enzyme whose synthesis is repressible by arginine. We have determined the complete nucleotide sequence of the argF gene including its promoter-control region. The deduced amino acid sequence of the anabolic OTCase consists of 305 residues (Mr 33,924), and this was confirmed by the N-terminal amino acid sequence, the total amino acid composition, and the subunit Mr of the purified enzyme. The native anabolic OTCase (Mr 110,000 to 125,000) was found to be a trimer by cross-linking experiments. P. aeruginosa also has a catabolic OTCase (the arcB gene product), which catalyzes the reverse reaction of the anabolic conversion. At the nucleotide sequence level, the P. aeruginosa argF gene had 52.4% identity with the arcB gene. The Escherichia coli argF and argI genes, which code for anabolic OTCase isoenzymes, had 47.3 and 44.9% identity, respectively, with the P. aeruginosa argF sequence. This suggests that these four genes have evolved from a common ancestral gene. The arcB gene appears to be more closely related to the E. coli argF gene than to the P. aeruginosa argF gene. Two transcripts (mRNA-1, mRNA-2) of the P. aeruginosa argF gene were identified by S1 mapping. The transcription initiation site for mRNA-1 was preceded by sequences having partial homology with the E. coli -35 and -10 consensus promoter sequences. No sequence similar to consensus promoters of enteric bacteria was found upstream of the 5' end of mRNA-2. E. coli carrying a P. aeruginosa argF+ recombinant plasmid produced mRNA-1 with low efficiency but no (or very little) mRNA-2. Arginine repressed argF transcription in P. aeruginosa. In the argF promoter region no sequence homologous to the "arg box" (arginine operator module) of E. coli was found. The mechanism of arginine repression in P. aeruginosa thus appears to be different from that in

  16. Large and small subunits of the Aujeszky's disease virus ribonucleotide reductase: nucleotide sequence and putative structure.

    PubMed

    Kaliman, A V; Boldogköi, Z; Fodor, I

    1994-09-13

    We determined the entire DNA sequence of two adjacent open reading frames of Aujeszky's disease virus encoding ribonucleotide reductase genes with the intergenic sequence of 9 bp. From the sequence analysis we deduce that the ORFs encode large and small subunits, with sizes of 835 and 303 amino acids, respectively. Amino acid sequence comparison of ADV RR2 with that of equine herpesvirus type 1, bovine herpesvirus type 1, HSV-1 and varicella zoster virus revealed that 48% of amino acids represent clusters of residues conserved in all compared sequences. In the N-terminal part ADV RR1 shows low homology to the RR1 of other herpesviruses. The rest of the RR1 protein contains highly conserved amino acid sequences divided by blocks of low homology. PMID:8086454

  17. Localization of the human fibromodulin gene (FMOD) to chromosome 1q32 and completion of the cDNA sequence

    SciTech Connect

    Sztrolovics, R.; Grover, J.; Roughley, P.J.

    1994-10-01

    This report describes the cloning of the 3'-untranslated region of the human fibromodulin cDNA and its use to map the gene. For somatic cell hybrids, the generation of the PCR product was concordant with the presence of chromosome 1 and discordant with the presence of all other chromosomes, confirming that the fibromodulin gene is located within region q32 of chromosome 1. The physical mapping of genes is a critical step in the process of identifying which genes may be responsible for various inherited disorders. Specifically, the mapping of the fibromodulin gene now provides the information necessary to evaluate its potential role in genetic disorders of connective tissues. The analysis of previously reported diseases mapped to chromosome 1 reveals two genes located in the proximity of the fibromodulin locus. These are Usher syndrome type II, a recessive disorder characterized by hearing loss and retinitis pigmentosa, and Van der Woude syndrome, a dominant condition associated with abnormalities such as cleft lip and palate and hyperdontia. The genes for both of these disorders have been projected to be localized to 1q32 of a physical map that integrates available genetic linkage and physical data. However, it seems improbable that either of these disorders, exhibiting restricted tissue involvement, could be linked to the fibromodulin gene, given the wide tissue distribution of the encoded proteoglycan, although it remains possible that the relative importance of the quantity and function of the proteoglycan may vary between tissues. 11 refs., 1 fig.

  18. Rapid DNA Sequencing by Direct Nanoscale Reading of Nucleotide Bases on Individual DNA Chains

    SciTech Connect

    Lee, James Weifu; Meller, Amit

    2007-01-01

    Since the independent invention of DNA sequencing by Sanger and by Gilbert 30 years ago, it has grown from a small-scale technique capable of reading several kilobase pairs of sequence per day into today's multibillion dollar industry. This growth has spurred the development of new sequencing technologies that do not involve either electrophoresis or Sanger sequencing chemistries. Sequencing by Synthesis (SBS) involves multiple parallel micro-sequencing addition events occurring on a surface, where data from each round is detected by imaging. New High Throughput Technologies for DNA Sequencing and Genomics is the second volume in the Perspectives in Bioanalysis series, which looks at the electroanalytical chemistry of nucleic acids and proteins, development of electrochemical sensors and their application in biomedicine and in the new fields of genomics and proteomics. The authors have expertly formatted the information for a wide variety of readers, including new developments that will inspire students and young scientists to create new tools for science and medicine in the 21st century. Reviews of complementary developments in Sanger and SBS sequencing chemistries, capillary electrophoresis and microdevice integration, MS sequencing and applications set the framework for the book.

  19. Partial purification of the chloroplast ATP synthase from Chlamydomonas reinhardtii and the cloning and sequencing of a cDNA encoding the gamma subunit

    SciTech Connect

    Yu, L.M.

    1988-01-01

    The chloroplast ATP synthase was partially purified from the green alga Chlamydomonas reinhardtii by extracting membranes with deoxycholate and KCl, followed by centrifugation and ammonium sulfate fractionation of the supernatant. The enzyme assay involved the reconstitution of such fractions with bacteriorhodopsin and soybean phospholipids to form vesicles capable of light-dependent (32P)-phosphate esterification. A cDNA for the gamma subunit from Chlamydomonas was isolated, expressed in vitro and sequenced. It contains the entire coding region for the gamma subunit precursor. A 35 amino acid long transit peptide resides at the NH2-terminus of a 323 amino acid long mature peptide that is 77% similar to the spinach gamma subunit. Six cysteines were found; three were conserved in Chlamydomonas and spinach.

  20. Defining natural species of bacteria: clear-cut genomic boundaries revealed by a turning point in nucleotide sequence divergence

    PubMed Central

    2013-01-01

    Background: Bacteria are currently classified into arbitrary species, but whether they actually exist as discrete natural species was unclear. To reveal genomic features that may unambiguously group bacteria into discrete genetic clusters, we carried out systematic genomic comparisons among representative bacteria. Results: We found that bacteria of Salmonella formed tight phylogenetic clusters separated by various genetic distances: whereas over 90% of the approximately four thousand shared genes had completely identical sequences among strains of the same lineage, the percentages dropped sharply to below 50% across the lineages, demonstrating the existence of clear-cut genetic boundaries by a steep turning point in nucleotide sequence divergence. Recombination assays supported the genetic boundary hypothesis, suggesting that genetic barriers had been formed between bacteria of even very closely related lineages. We found similar situations in bacteria of Yersinia and Staphylococcus. Conclusions: Bacteria are genetically isolated into discrete clusters equivalent to natural species. PMID:23865772

  1. Fusion protein of the paramyxovirus simian virus 5: nucleotide sequence of mRNA predicts a highly hydrophobic glycoprotein.

    PubMed Central

    Paterson, R G; Harris, T J; Lamb, R A

    1984-01-01

    The nucleotide sequence of the mRNA coding for the fusion glycoprotein (F) of the paramyxovirus, simian virus 5, has been obtained. There is a single large open reading frame on the mRNA that encodes a protein of 529 amino acids with a molecular weight of 56,531. The proteolytic cleavage/activation site of F, to yield F2 and F1, contains five arginine residues. Six potential glycosylation sites were identified in the protein, two on F2 and four on F1. The deduced amino acid sequence indicates that F is extensively hydrophobic over the length of the polypeptide chain. Three regions are very hydrophobic and could interact directly with membranes: these are the NH2-terminal putative signal peptide, the COOH-terminal putative membrane anchorage domain, and the NH2-terminal region of F1. Images PMID:6093114

  2. A Simple Sequence Repeat- and Single-Nucleotide Polymorphism-Based Genetic Linkage Map of the Brown Planthopper, Nilaparvata lugens

    PubMed Central

    Jairin, Jirapong; Kobayashi, Tetsuya; Yamagata, Yoshiyuki; Sanada-Morimura, Sachiyo; Mori, Kazuki; Tashiro, Kosuke; Kuhara, Satoru; Kuwazaki, Seigo; Urio, Masahiro; Suetsugu, Yoshitaka; Yamamoto, Kimiko; Matsumura, Masaya; Yasui, Hideshi

    2013-01-01

    In this study, we developed the first genetic linkage map for the major rice insect pest, the brown planthopper (BPH, Nilaparvata lugens). The linkage map was constructed by integrating linkage data from two backcross populations derived from three inbred BPH strains. The consensus map consists of 474 simple sequence repeats, 43 single-nucleotide polymorphisms, and 1 sequence-tagged site, for a total of 518 markers at 472 unique positions in 17 linkage groups. The linkage groups cover 1093.9 cM, with an average distance of 2.3 cM between loci. The average number of marker loci per linkage group was 27.8. The sex-linkage group was identified by exploiting X-linked and Y-specific markers. Our linkage map and the newly developed markers used to create it constitute an essential resource and a useful framework for future genetic analyses in BPH. PMID:23204257
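
    The summary figures in this abstract can be cross-checked with simple arithmetic. The sketch below assumes that the reported averages were obtained by dividing by the 472 unique positions; the abstract does not state this, and the variable names are illustrative only.

```python
# Marker counts as reported in the abstract.
markers = {"SSR": 474, "SNP": 43, "STS": 1}
total_markers = sum(markers.values())

unique_positions = 472   # distinct map positions
linkage_groups = 17
map_length_cm = 1093.9   # total length covered, in centimorgans

# Assumption: averages are taken over unique positions, not raw markers.
avg_spacing_cm = map_length_cm / unique_positions
loci_per_group = unique_positions / linkage_groups

print(total_markers, round(avg_spacing_cm, 1), round(loci_per_group, 1))
```

    Under that assumption the rounded values agree with the 518 markers, 2.3 cM average distance and 27.8 loci per linkage group given in the abstract.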

  3. Comparison of the nucleotide and amino acid sequences of the RsrI and EcoRI restriction endonucleases.

    PubMed

    Stephenson, F H; Ballard, B T; Boyer, H W; Rosenberg, J M; Greene, P J

    1989-12-21

    The RsrI endonuclease, a type-II restriction endonuclease (ENase) found in Rhodobacter sphaeroides, is an isoschizomer of the EcoRI ENase. A clone containing an 11-kb BamHI fragment was isolated from an R. sphaeroides genomic DNA library by hybridization with synthetic oligodeoxyribonucleotide probes based on the N-terminal amino acid (aa) sequence of RsrI. Extracts of E. coli containing a subclone of the 11-kb fragment display RsrI activity. Nucleotide sequence analysis reveals an 831-bp open reading frame encoding a polypeptide of 277 aa. A 50% identity exists within a 266-aa overlap between the deduced aa sequences of RsrI and EcoRI. Regions of 75-100% aa sequence identity correspond to key structural and functional regions of EcoRI. The type-II ENases have many common properties, and a common origin might have been expected. Nevertheless, this is the first demonstration of aa sequence similarity between ENases produced by different organisms. PMID:2695392
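
    Two of the numbers in this abstract follow from elementary sequence arithmetic: an 831-bp reading frame corresponds to 277 codons, and percent identity is the share of matching residues over the aligned overlap. A minimal sketch, with hypothetical function names and a made-up toy alignment (not RsrI or EcoRI data), assuming gapped columns are excluded from the identity denominator:

```python
def codons_in_orf(length_bp):
    """Number of codons in a reading frame of the given length in base pairs."""
    if length_bp <= 0 or length_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return length_bp // 3

def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over alignment columns with no gap in either sequence."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if gap not in (x, y)]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)

rsri_codons = codons_in_orf(831)
# Toy alignment: four ungapped columns, three of them identical.
toy_identity = percent_identity("MKT-A", "MRTLA")
```

    The abstract does not say whether the 831 bp include a stop codon or how gaps were treated in the 266-aa overlap, so both conventions here are assumptions.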

  4. Prioritization Of Nonsynonymous Single Nucleotide Variants For Exome Sequencing Studies Via Integrative Learning On Multiple Genomic Data

    PubMed Central

    Wu, Mengmeng; Wu, Jiaxin; Chen, Ting; Jiang, Rui

    2015-01-01

    The rapid advancement of next generation sequencing technology has greatly accelerated the progress for understanding human inherited diseases via such innovations as exome sequencing. Nevertheless, the identification of causative variants from sequencing data remains a great challenge. Traditional statistical genetics approaches such as linkage analysis and association studies have limited power in analyzing exome sequencing data, while relying on simple filtration strategies and predicted functional implications of mutations to pinpoint pathogenic variants is prone to produce false positives. To overcome these limitations, we herein propose a supervised learning approach, termed snvForest, to prioritize candidate nonsynonymous single nucleotide variants for a specific type of disease by integrating 11 functional scores at the variant level and 8 association scores at the gene level. We conduct a series of large-scale in silico validation experiments, demonstrating the effectiveness of snvForest across 2,511 diseases of different inheritance styles and the superiority of our approach over two state-of-the-art methods. We further apply snvForest to three real exome sequencing data sets of epileptic encephalopathies and intellectual disability to show the ability of our approach to identify causative de novo mutations for these complex diseases. The online service and standalone software of snvForest are found at http://bioinfo.au.tsinghua.edu.cn/jianglab/snvforest. PMID:26459872

  5. Complete nucleotide sequence of little cherry virus 1 (LChV-1) infecting sweet cherry in China.

    PubMed

    Wang, Jiawei; Zhu, Dongzi; Tan, Yue; Zong, Xiaojuan; Wei, Hairong; Hammond, Rosemarie W; Liu, Qingzhong

    2016-03-01

    Little cherry virus 1 (LChV-1), associated with little cherry disease (LCD), has a significant impact on fruit quality of infected sweet cherry trees. We report the full genome sequence of an isolate of LChV-1 from Taian, China (LChV-1-TA), detected by small-RNA deep sequencing and amplified by overlapping RT-PCR. The LChV-1-TA genome was 16,932 nt in length and contained nine open reading frames (ORFs), with sequence identity at the overall genome level of 76%, 76%, and 78% to LChV-1 isolates Y10237 (UW2 isolate), EU715989 (ITMAR isolate) and JX669615 (V2356 isolate), respectively. Based on the phylogenetic analysis of HSP70h amino acid sequences of Closteroviridae family members, LChV-1-TA was grouped into a well-supported cluster with the members of the genus Velarivirus and was also closely related to other LChV-1 isolates. This is the first report of the complete nucleotide sequence of LChV-1 infecting sweet cherry in China. PMID:26733294

  6. Prioritization Of Nonsynonymous Single Nucleotide Variants For Exome Sequencing Studies Via Integrative Learning On Multiple Genomic Data.

    PubMed

    Wu, Mengmeng; Wu, Jiaxin; Chen, Ting; Jiang, Rui

    2015-01-01

    The rapid advancement of next generation sequencing technology has greatly accelerated progress in understanding human inherited diseases via such innovations as exome sequencing. Nevertheless, the identification of causative variants from sequencing data remains a great challenge. Traditional statistical genetics approaches such as linkage analysis and association studies have limited power in analyzing exome sequencing data, while relying on simple filtration strategies and predicted functional implications of mutations to pinpoint pathogenic variants is prone to producing false positives. To overcome these limitations, we herein propose a supervised learning approach, termed snvForest, to prioritize candidate nonsynonymous single nucleotide variants for a specific type of disease by integrating 11 functional scores at the variant level and 8 association scores at the gene level. We conduct a series of large-scale in silico validation experiments, demonstrating the effectiveness of snvForest across 2,511 diseases of different inheritance styles and the superiority of our approach over two state-of-the-art methods. We further apply snvForest to three real exome sequencing data sets of epileptic encephalopathies and intellectual disability to show the ability of our approach to identify causative de novo mutations for these complex diseases. The online service and standalone software of snvForest are available at http://bioinfo.au.tsinghua.edu.cn/jianglab/snvforest. PMID:26459872

  7. Complete nucleotide sequence of pSCV50, the virulence plasmid of Salmonella enterica serovar Choleraesuis SC-B67.

    PubMed

    Yu, Hong; Wang, Jianbin; Ye, Jiehua; Tang, Petrus; Chu, Chishih; Hu, Songnian; Chiu, Cheng-Hsun

    2006-03-01

    We carried out comparative analysis on the sequences of two 50-kb virulence plasmids of Salmonella enterica serovar Choleraesuis strains SC-B67 (pSCV50) and RF-1 (pKDSC50). The two plasmids share over 99% sequence similarity. Ninety-two nucleotide variations at 42 sites were detected between the two plasmids; pSCV50 contains 24 nucleotide substitutions, 6 deletions, and 62 insertions, compared to pKDSC50. Two regions in pSCV50 appeared to be more susceptible to changes: one is the non-virulence-associated transfer region (27.5-33.0 K) and the other a function-unknown region (9.0-10.5 K). We re-annotated pSCV50 using more advanced tools and the up-to-date databases and corrected the inaccurate annotation in pKDSC50. The results indicate that virulence-related genes on the 50-kb plasmid are under negative selection, suggesting that they play important roles in the expression of virulence during the process of infection, while other genes in this plasmid tend to evolve neutrally. PMID:16257053

  8. Porcine transcriptome analysis based on 97 non-normalized cDNA libraries and assembly of 1,021,891 expressed sequence tags

    PubMed Central

    Gorodkin, Jan; Cirera, Susanna; Hedegaard, Jakob; Gilchrist, Michael J; Panitz, Frank; Jørgensen, Claus; Scheibye-Knudsen, Karsten; Arvin, Troels; Lumholdt, Steen; Sawera, Milena; Green, Trine; Nielsen, Bente J; Havgaard, Jakob H; Rosenkilde, Carina; Wang, Jun; Li, Heng; Li, Ruiqiang; Liu, Bin; Hu, Songnian; Dong, Wei; Li, Wei; Yu, Jun; Wang, Jian; Stærfeldt, Hans-Henrik; Wernersson, Rasmus; Madsen, Lone B; Thomsen, Bo; Hornshøj, Henrik; Bujie, Zhan; Wang, Xuegang; Wang, Xuefei; Bolund, Lars; Brunak, Søren; Yang, Huanming; Bendixen, Christian; Fredholm, Merete

    2007-01-01

    Background: Knowledge of the structure of gene expression is essential for mammalian transcriptomics research. We analyzed a collection of more than one million porcine expressed sequence tags (ESTs), of which two-thirds were generated in the Sino-Danish Pig Genome Project and one-third are from public databases. The Sino-Danish ESTs were generated from one normalized and 97 non-normalized cDNA libraries representing 35 different tissues and three developmental stages. Results: Using the Distiller package, the ESTs were assembled to roughly 48,000 contigs and 73,000 singletons, of which approximately 25% have a high confidence match to UniProt. Approximately 6,000 new porcine gene clusters were identified. Expression analysis based on the non-normalized libraries resulted in the following findings. The distribution of cluster sizes is scaling invariant. Brain and testes are among the tissues with the greatest number of different expressed genes, whereas tissues with more specialized function, such as developing liver, have fewer expressed genes. There are at least 65 high confidence housekeeping gene candidates and 876 cDNA library-specific gene candidates. We identified differential expression of genes between different tissues, in particular brain/spinal cord, and found patterns of correlation between genes that share expression in pairs of libraries. Finally, there was remarkable agreement in expression between specialized tissues according to Gene Ontology categories. Conclusion: This EST collection, the largest to date in pig, represents an essential resource for annotation, comparative genomics, assembly of the pig genome sequence, and further porcine transcription studies. PMID:17407547

  9. Thermostable group II intron reverse transcriptase fusion proteins and their use in cDNA synthesis and next-generation RNA sequencing

    PubMed Central

    Mohr, Sabine; Ghanem, Eman; Smith, Whitney; Sheeter, Dennis; Qin, Yidan; King, Olga; Polioudakis, Damon; Iyer, Vishwanath R.; Hunicke-Smith, Scott; Swamy, Sajani; Kuersten, Scott; Lambowitz, Alan M.

    2013-01-01

    Mobile group II introns encode reverse transcriptases (RTs) that function in intron mobility (“retrohoming”) by a process that requires reverse transcription of a highly structured, 2–2.5-kb intron RNA with high processivity and fidelity. Although the latter properties are potentially useful for applications in cDNA synthesis and next-generation RNA sequencing (RNA-seq), group II intron RTs have been difficult to purify free of the intron RNA, and their utility as research tools has not been investigated systematically. Here, we developed general methods for the high-level expression and purification of group II intron-encoded RTs as fusion proteins with a rigidly linked, noncleavable solubility tag, and we applied them to group II intron RTs from bacterial thermophiles. We thus obtained thermostable group II intron RT fusion proteins that have higher processivity, fidelity, and thermostability than retroviral RTs, synthesize cDNAs at temperatures up to 81°C, and have significant advantages for qRT-PCR, capillary electrophoresis for RNA-structure mapping, and next-generation RNA sequencing. Further, we find that group II intron RTs differ from the retroviral enzymes in template switching with minimal base-pairing to the 3′ ends of new RNA templates, making it possible to efficiently and seamlessly link adaptors containing PCR-primer binding sites to cDNA ends without an RNA ligase step. This novel template-switching activity enables facile and less biased cloning of nonpolyadenylated RNAs, such as miRNAs or protein-bound RNA fragments. Our findings demonstrate novel biochemical activities and inherent advantages of group II intron RTs for research, biotechnological, and diagnostic methods, with potentially wide applications. PMID:23697550

  10. Thermostable group II intron reverse transcriptase fusion proteins and their use in cDNA synthesis and next-generation RNA sequencing.

    PubMed

    Mohr, Sabine; Ghanem, Eman; Smith, Whitney; Sheeter, Dennis; Qin, Yidan; King, Olga; Polioudakis, Damon; Iyer, Vishwanath R; Hunicke-Smith, Scott; Swamy, Sajani; Kuersten, Scott; Lambowitz, Alan M

    2013-07-01

    Mobile group II introns encode reverse transcriptases (RTs) that function in intron mobility ("retrohoming") by a process that requires reverse transcription of a highly structured, 2-2.5-kb intron RNA with high processivity and fidelity. Although the latter properties are potentially useful for applications in cDNA synthesis and next-generation RNA sequencing (RNA-seq), group II intron RTs have been difficult to purify free of the intron RNA, and their utility as research tools has not been investigated systematically. Here, we developed general methods for the high-level expression and purification of group II intron-encoded RTs as fusion proteins with a rigidly linked, noncleavable solubility tag, and we applied them to group II intron RTs from bacterial thermophiles. We thus obtained thermostable group II intron RT fusion proteins that have higher processivity, fidelity, and thermostability than retroviral RTs, synthesize cDNAs at temperatures up to 81°C, and have significant advantages for qRT-PCR, capillary electrophoresis for RNA-structure mapping, and next-generation RNA sequencing. Further, we find that group II intron RTs differ from the retroviral enzymes in template switching with minimal base-pairing to the 3' ends of new RNA templates, making it possible to efficiently and seamlessly link adaptors containing PCR-primer binding sites to cDNA ends without an RNA ligase step. This novel template-switching activity enables facile and less biased cloning of nonpolyadenylated RNAs, such as miRNAs or protein-bound RNA fragments. Our findings demonstrate novel biochemical activities and inherent advantages of group II intron RTs for research, biotechnological, and diagnostic methods, with potentially wide applications. PMID:23697550

  11. Nucleotide sequence analysis of genes encoding a toluene/benzene-2-monooxygenase from Pseudomonas sp. strain JS150

    SciTech Connect

    Johnson, G.R.; Olsen, R.H.

    1995-09-01

    Pseudomonas sp. strain JS150 metabolizes benzene and alkyl- and chloro-substituted benzenes by using dioxygenase-initiated pathways coupled with multiple downstream metabolic pathways to accommodate catechol metabolism. By cloning genes encoding benzene-degradative enzymes, strain JS150 was also found to carry genes for a toluene/benzene-2-monooxygenase. The gene cluster encoding a 2-monooxygenase and its cognate regulator was cloned from a plasmid carried by strain JS150. Oxygen (¹⁸O₂) incorporation experiments using Pseudomonas aeruginosa strains carrying the cloned genes confirmed that toluene hydroxylation was catalyzed through an authentic monooxygenase reaction to yield ortho-cresol. The sequences encoding the toluene-2-monooxygenase and the regulatory gene product were localized to two regions of the cloned fragment. The nucleotide sequence of the toluene/benzene-2-monooxygenase locus was determined, revealing six open reading frames that were then designated tbmA, tbmB, tbmC, tbmD, tbmE, and tbmF. The deduced amino acid sequences for these genes showed the presence of motifs similar to well-conserved functional domains of multicomponent oxygenases. This analysis allowed the tentative identification of two terminal oxygenase subunits (TbmB and TbmD) and an electron transport protein (TbmF) for the monooxygenase enzyme. All the tbm polypeptides shared significant homology with protein components from other bacterial multicomponent monooxygenases. Overall, the tbm gene products shared greater similarity with polypeptides from the phenol hydroxylases of Pseudomo-KR1 and Burkholderia (Pseudomonas) pickettii PKO1. The relationship found between the phenol hydroxylases and a toluene-2-monooxygenase, characterized in this study for the first time at the nucleotide sequence level, suggested that DNA probes used for surveys of environmental populations should be carefully selected to reflect DNA sequences corresponding to the metabolic pathway of interest. 58 refs., 8 figs., 1 tab.

  12. Identification of genes expressed in human CD34+ hematopoietic stem/progenitor cells by expressed sequence tags and efficient full-length cDNA cloning

    PubMed Central

    Mao, Mao; Fu, Gang; Wu, Ji-Sheng; Zhang, Qing-Hua; Zhou, Jun; Kan, Li-Xin; Huang, Qiu-Hua; He, Kai-Li; Gu, Bai-Wei; Han, Ze-Guang; Shen, Yu; Gu, Jian; Yu, Ya-Ping; Xu, Shu-Hua; Wang, Ya-Xin; Chen, Sai-Juan; Chen, Zhu

    1998-01-01

    Hematopoietic stem/progenitor cells (HSPCs) possess the potential for self-renewal, proliferation, and differentiation toward different lineages of blood cells. These cells not only play a primordial role in hematopoietic development but also have important clinical applications. Characterization of the gene expression profile in CD34+ HSPCs may lead to a better understanding of the regulation of normal and pathological hematopoiesis. In the present work, genes expressed in human umbilical cord blood CD34+ cells were catalogued by partially sequencing a large number of cDNA clones [or expressed sequence tags (ESTs)] and analyzing these sequences with the tools of bioinformatics. Among 9,866 ESTs thus obtained, 4,697 (47.6%) showed identity to known genes in the GenBank database, 2,603 (26.4%) matched to the ESTs previously deposited in a public domain database, 1,415 (14.3%) were previously undescribed ESTs, and the remaining 1,151 (11.7%) were mitochondrial DNA, ribosomal RNA, or repetitive (Alu or L1) sequences. Integration of ESTs of known genes generated a profile including 855 genes that could be divided into different categories according to their functions. Some (8.2%) of the genes in this profile were considered related to early hematopoiesis. The possible functions of ESTs corresponding to so far unknown genes were approached by means of homology and functional motif searches. Moreover, attempts were made to generate libraries enriched for full-length cDNAs, to better explore the genes in HSPCs. Nearly 60% of the cDNA clones of mRNA under 2 kb in our libraries had 5′ ends upstream of the first ATG codon of the ORF. With this satisfactory result, we have developed an efficient working system that allowed fast sequencing of 32 full-length cDNAs, 16 of them being mapped to the chromosomes with radiation hybrid panels. This work may lay a basis for further research on the molecular network of hematopoietic regulation. PMID:9653160
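    The EST breakdown reported above can be checked with simple arithmetic. The following Python sketch uses only the counts stated in the abstract. The category labels and the helper name are shorthand introduced here for illustration.

```python
# Counts as reported in the abstract; labels are illustrative shorthand.
TOTAL_ESTS = 9866
EST_COUNTS = {
    "known genes (GenBank)": 4697,
    "previously deposited ESTs": 2603,
    "previously undescribed ESTs": 1415,
    "mitochondrial/rRNA/repetitive": 1151,
}


def category_percentages(counts: dict, total: int) -> dict:
    """Return each category's share of the total, rounded to one decimal place."""
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}


# The four categories should partition the full EST set.
assert sum(EST_COUNTS.values()) == TOTAL_ESTS

for name, pct in category_percentages(EST_COUNTS, TOTAL_ESTS).items():
    print(f"{name}: {pct}%")
```

    The four counts sum to 9,866, and the rounded shares reproduce the reported 47.6%, 26.4%, 14.3%, and 11.7%.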

  13. The complete nucleotide sequence and gene organization of the mitochondrial genome of the bumblebee, Bombus ignitus (Hymenoptera: Apidae).

    PubMed

    Cha, So Young; Yoon, Hyung Joo; Lee, Eun Mee; Yoon, Myung Hee; Hwang, Jae Sam; Jin, Byung Rae; Han, Yeon Soo; Kim, Iksoo

    2007-05-01

    The complete 16,434-bp nucleotide sequence of the mitogenome of the bumble bee, Bombus ignitus (Hymenoptera: Apidae), was determined. The genome contains the base composition and codon usage typical of metazoan mitogenomes. An unusual feature of the B. ignitus mitogenome is the presence of five tRNA-like structures: two each of the tRNALeu(UUR)-like and tRNASer(AGN)-like sequences and one tRNAPhe-like sequence. These tRNA-like sequences have proper folding structures and anticodon sequences, but their functionality in their respective amino acid transfers remains uncertain. Among these sequences, the tRNALeu(UUR)-like sequence and the tRNASer(AGN)-like sequence are seemingly located within the A+T-rich region. This tRNASer(AGN)-like sequence is highly unusual in that its sequence homology is very high compared to the tRNAMet of other insects, including Apis mellifera, but it contains the anticodon ACT, which designates it as tRNASer(AGN). All PCGs and rRNAs are conserved in the positions observed most frequently in insect mitogenome structures, but the positions of the tRNAs are highly variable, presenting a new arrangement for an insect mitogenome. As a whole, the B. ignitus mitogenome contains the highest A+T content (86.9%) found in any of the complete insect mt sequences determined to date. All protein-coding sequences started with a typical ATN codon. Nine of the 13 PCGs have a complete termination codon (all TAA), but the remaining four genes terminate with the incomplete TA or T. All tRNAs have the typical clover-leaf structures of mt tRNAs, except for tRNASer(AGN), in which the DHU arm forms a simple loop. All anticodons of B. ignitus tRNAs are identical to those of A. mellifera. In the A+T-rich region, a highly conserved sequence block that was previously described in Orthoptera and Diptera was also present. The stem-and-loop structures that may play a role in the initiation of mtDNA replication were also found in this region. Phylogenetic analysis among

  14. The complete nucleotide sequence and genomic characterization of tropical soda apple mosaic virus

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Tropical soda apple mosaic virus (TSAMV) was first identified in tropical soda apple (Solanum viarum), a noxious weed, in Florida in 2002. This report provides the first full genome sequence of TSAMV. The full genome sequence of this virus will enable research scientists to develop additional spec...

  15. Sequence Comparison and Phylogeny of Nucleotide Sequence of Coat Protein and Nucleic Acid Binding Protein of a Distinct Isolate of Shallot virus X from India.

    PubMed

    Majumder, S; Baranwal, V K

    2011-06-01

    Shallot virus X (ShVX), the type species of the genus Allexivirus in the family Alfaflexiviridae, has been associated with shallot plants in India and other shallot-growing countries such as Russia, Germany, the Netherlands, and New Zealand. The coat protein (CP) and nucleic acid binding protein (NB) regions of the virus were obtained by reverse transcriptase polymerase chain reaction from scale leaves of shallot bulbs. The partial cDNA contained two open reading frames encoding proteins with molecular weights of 28.66 and 14.18 kDa belonging to the Flexi_CP super-family and the viral NB super-family, respectively. The percent identity and phylogenetic analysis of amino acid sequences of the CP and NB regions of the virus associated with shallot indicated that it was a distinct isolate of ShVX. PMID:23637504

  16. Complete nucleotide sequences of two NDM-1-encoding plasmids from the same sequence type 11 Klebsiella pneumoniae strain.

    PubMed

    Studentova, V; Dobiasova, H; Hedlova, D; Dolejska, M; Papagiannitsis, C C; Hrabak, J

    2015-02-01

    The sequence type 11 Klebsiella pneumoniae strain Kpn-3002cz was confirmed to harbor two NDM-1-encoding plasmids, pB-3002cz and pS-3002cz. pB-3002cz (97,649 bp) displayed extensive sequence similarity with the blaNDM-1-carrying plasmid pKPX-1. pS-3002cz (73,581 bp) was found to consist of an IncR-related sequence (13,535 bp) and a mosaic region (60,046 bp). A 40,233-bp sequence of pS-3002cz was identical to the mosaic region of pB-3002cz, indicating the en bloc acquisition of the NDM-1-encoding region from one plasmid by the other. PMID:25421477

  17. Complete Nucleotide Sequences of Two NDM-1-Encoding Plasmids from the Same Sequence Type 11 Klebsiella pneumoniae Strain

    PubMed Central

    Studentova, V.; Dobiasova, H.; Hedlova, D.; Dolejska, M.; Hrabak, J.

    2014-01-01

    The sequence type 11 Klebsiella pneumoniae strain Kpn-3002cz was confirmed to harbor two NDM-1-encoding plasmids, pB-3002cz and pS-3002cz. pB-3002cz (97,649 bp) displayed extensive sequence similarity with the blaNDM-1-carrying plasmid pKPX-1. pS-3002cz (73,581 bp) was found to consist of an IncR-related sequence (13,535 bp) and a mosaic region (60,046 bp). A 40,233-bp sequence of pS-3002cz was identical to the mosaic region of pB-3002cz, indicating the en bloc acquisition of the NDM-1-encoding region from one plasmid by the other. PMID:25421477

  18. FeatureScan: revealing property-dependent similarity of nucleotide sequences

    PubMed Central

    Deyneko, Igor V.; Bredohl, Björn; Wesely, Daniel; Kalybaeva, Yulia M.; Kel, Alexander E.; Blöcker, Helmut; Kauer, Gerhard

    2006-01-01

    FeatureScan is a software package aiming to reveal novel types of DNA sequence similarity by comparing physico-chemical properties. Thirty-eight different parameters of DNA double strands such as charge, melting enthalpy, conformational parameters and the like are provided. As input, FeatureScan requires two sequences, a pattern sequence and a target sequence; search conditions are set by selecting a specific DNA parameter and a threshold value. Search results are displayed in FASTA format and directly linked to external genome databases/browsers (ENSEMBL, NCBI, UCSC). An Internet version of FeatureScan is accessible at . As part of the HOBIT initiative () FeatureScan is also accessible as a web service at its above home page. Currently, several preloaded genomes are provided at this Internet website (Homo sapiens, Mus musculus, Rattus norvegicus and four strains of Escherichia coli) as target sequences. Standalone executables of FeatureScan are available on request. PMID:16845077

  19. Bayesian Markov models consistently outperform PWMs at predicting motifs in nucleotide sequences.

    PubMed

    Siebert, Matthias; Söding, Johannes

    2016-07-27

    Position weight matrices (PWMs) are the standard model for DNA and RNA regulatory motifs. In PWMs nucleotide probabilities are independent of nucleotides at other positions. Models that account for dependencies need many parameters and are prone to overfitting. We have developed a Bayesian approach for motif discovery using Markov models in which conditional probabilities of order k - 1 act as priors for those of order k. This Bayesian Markov model (BaMM) training automatically adapts model complexity to the amount of available data. We also derive an EM algorithm for de-novo discovery of enriched motifs. For transcription factor binding, BaMMs achieve significantly (P = 1/16) higher cross-validated partial AUC than PWMs in 97% of 446 ChIP-seq ENCODE datasets and improve performance by 36% on average. BaMMs also learn complex multipartite motifs, improving predictions of transcription start sites, polyadenylation sites, bacterial pause sites, and RNA binding sites by 26-101%. BaMMs never performed worse than PWMs. These robust improvements argue in favour of generally replacing PWMs by BaMMs. PMID:27288444
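    The idea that order k - 1 conditional probabilities act as priors for those of order k can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: it uses a single homogeneous model rather than position-specific motif probabilities, and the pseudo-count strength `alpha`, the uniform order-0 prior, and all function names are assumptions introduced here.

```python
from collections import Counter

ALPHABET = "ACGT"


def count_kmers(seqs, max_order):
    """Count all k-mers of length 1 .. max_order + 1 in the training sequences."""
    counts = Counter()
    for s in seqs:
        for k in range(1, max_order + 2):
            for i in range(len(s) - k + 1):
                counts[s[i:i + k]] += 1
    return counts


def cond_prob(x, context, counts, alpha=1.0):
    """Estimate P(x | context), using the next-lower-order estimate as prior.

    alpha (assumed value) sets how many pseudo-counts the prior contributes.
    With few observations of the context, the estimate stays close to the
    lower-order probability; with many, the observed counts dominate.
    """
    if context == "":
        # Order 0: observed base counts with a uniform prior (assumption).
        total = sum(counts[b] for b in ALPHABET)
        return (counts[x] + alpha * 0.25) / (total + alpha)
    # Prior comes from the context shortened by its most distant base.
    prior = cond_prob(x, context[1:], counts, alpha)
    ctx_total = sum(counts[context + b] for b in ALPHABET)
    return (counts[context + x] + alpha * prior) / (ctx_total + alpha)


# Toy training sequences (made up for illustration only).
seqs = ["ACGTACGT", "ACGAACGT"]
counts = count_kmers(seqs, max_order=2)

for base in ALPHABET:
    print(base, cond_prob(base, "AC", counts))
```

    In this sketch a context that never occurs in the training data falls back exactly to the lower-order estimate, which is how the model complexity adapts to the amount of available data.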

  20. Nucleotide sequence of the gene encoding the nitrogenase iron protein of Thiobacillus ferrooxidans

    SciTech Connect

    Pretorius, I.M.; Rawlings, D.E.; O'Neill, E.G.; Jones, W.A.; Kirby, R.; Woods, D.R.

    1987-01-01

    The DNA sequence was determined for the cloned Thiobacillus ferrooxidans nifH and part of the nifD genes. The DNA chains were radiolabeled with (α-³²P)dCTP (3000 Ci/mmol) or (α-³⁵S)dCTP (400 Ci/mmol). A putative T. ferrooxidans nifH promoter was identified whose sequences showed perfect consensus with those of the Klebsiella pneumoniae nif promoter. Two putative consensus upstream activator sequences were also identified. The amino acid sequence was deduced from the DNA sequence. In a comparison of nifH DNA sequences from T. ferrooxidans and eight other nitrogen-fixing microbes, a Rhizobium sp. isolated from Parasponia andersonii showed the greatest homology (74%) and Clostridium pasteurianum (nifH1) showed the least homology (54%). In the comparison of the amino acid sequences of the Fe proteins, the Rhizobium sp. and Rhizobium japonicum showed the greatest homology (both 86%) and C. pasteurianum (nifH1 gene product) demonstrated the least homology (56%) to the T. ferrooxidans Fe protein.