Science.gov

Sample records for cdna nucleotide sequence

  1. Nucleotide sequence of alkyl-dihydroxyacetonephosphate synthase cDNA from Dictyostelium discoideum.

    PubMed

    de Vet, E C; van den Bosch, H

    1998-11-27

    The nucleotide sequence is reported of alkyl-dihydroxyacetonephosphate synthase cDNA from the cellular slime mold Dictyostelium discoideum. The open reading frame encodes a protein of 611 amino acids which shows a 33% amino acid identity to the human enzyme. This D. discoideum homolog carries a variant of the peroxisomal targeting signal type 1 at its C-terminus (PKL). Expression of the cDNA in Escherichia coli yielded an enzymatically active protein.

  2. Isolation and nucleotide sequence of a cDNA clone encoding rat mitochondrial malate dehydrogenase.

    PubMed Central

    Grant, P M; Tellam, J; May, V L; Strauss, A W

    1986-01-01

    We have determined the complete sequence of the rat mitochondrial malate dehydrogenase (mMDH) precursor derived from the nucleotide sequence of the cDNA. A single synthetic oligodeoxynucleotide probe was used to screen a rat atrial cDNA library constructed in lambda gt10. A 1.2 kb full-length cDNA clone provided the first complete amino acid sequence of pre-mMDH. The 1014 nucleotide-long open reading frame encodes the 314 residue long mature mMDH protein and a 24 amino acid NH2-terminal extension which directs mitochondrial import and is cleaved from the precursor after import to generate mature mMDH. The amino acid composition of the transit peptide is polar and basic. The pre-mMDH transit peptide shows marked homology with those of two other enzymes targeted to the rat mitochondrial matrix. PMID:3755817
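The figures in this abstract can be cross-checked with simple codon arithmetic. The sketch below is illustrative only: the helper name `codons_in_orf` is hypothetical, and the numbers are taken directly from the abstract. The two counts agree, which suggests the stated open reading frame length does not include the stop codon, although the abstract does not say so.

```python
def codons_in_orf(orf_nt: int) -> int:
    """Return the number of complete codons in an ORF of orf_nt nucleotides."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return orf_nt // 3


# Values quoted in the abstract.
orf_nt = 1014                    # open reading frame length in nucleotides
mature_residues = 314            # mature mMDH protein
transit_peptide_residues = 24    # NH2-terminal extension

precursor_residues = mature_residues + transit_peptide_residues

# Both values are 338: 1014 / 3 codons, and 314 + 24 residues.
print(codons_in_orf(orf_nt), precursor_residues)
```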

  3. Molecular cloning and nucleotide sequencing of human immunoglobulin epsilon chain cDNA.

    PubMed Central

    Seno, M; Kurokawa, T; Ono, Y; Onda, H; Sasada, R; Igarashi, K; Kikuchi, M; Sugino, Y; Nishida, Y; Honjo, T

    1983-01-01

    DNA complementary to mRNA of human immunoglobulin E heavy chain (epsilon chain) isolated and purified from U266 cells has been synthesized and inserted into the PstI site of pBR322 by G-C tailing. This recombinant plasmid was used to transform E. coli chi 1776 to screen 1445 tetracycline resistant colonies. Nine clones (pGET1-9) containing cDNA coding for the human epsilon chain were recognized by colony hybridization and Southern blotting analysis with a nick-translated human IgE genome fragment. The nucleotide sequence of the longest cDNA contained in pGET2 was determined. The results indicate that the sequence of 1657 nucleotides codes for 494 amino acids covering a part of the variable region and all of the constant region of the human epsilon chain. Most of the amino acid sequence deduced from the nucleotide sequence is in substantial agreement with that reported. Furthermore a termination codon after the -COOH terminal amino acid marks the beginning of a 3' untranslated region of 125 nucleotides with a poly A tail. Taking this into account, the structure of the human epsilon chain mRNA, except a part of the 5' end, is conserved fairly well in the cDNA insert in pGET2. PMID:6300763

  4. Infectivity and complete nucleotide sequence of cucumber fruit mottle mosaic virus isolate Cm cDNA.

    PubMed

    Rhee, Sun-Ju; Hong, Jin-Sung; Lee, Gung Pyo

    2014-07-01

    Three isolates of cucumber fruit mottle mosaic virus (CFMMV) were collected from melon, cucumber, and pumpkin plants in Korea. A full-length cDNA clone of CFMMV-Cm (melon isolate) was produced and evaluated for infectivity after T7 transcription in vitro (pT7CF-Cmflc). The complete CFMMV genome sequence of the infectious clone pT7CF-Cmflc was determined. The genome of CFMMV-Cm consisted of 6,571 nucleotides and shared high nucleotide sequence identity (98.8 %) with the Israel isolate of CFMMV. Based on the infectious clone pT7CF-Cmflc, a CaMV 35S-promoter driven cDNA clone (p35SCF-Cmflc) was subsequently constructed and sequenced. Mechanical inoculation with RNA transcripts of pT7CF-Cmflc and agro-inoculation with p35SCF-Cmflc resulted in systemic infection of cucumber and melon, producing symptoms similar to those produced by CFMMV-Cm. Progeny virus in infected plants was detected by RT-PCR, western blot assay, and transmission electron microscopy.

  5. Molecular cloning and nucleotide sequence of cDNA for human liver arginase

    SciTech Connect

    Haraguchi, Y.; Takiguchi, M.; Amaya, Y.; Kawamoto, S.; Matsuda, I.; Mori, M.

    1987-01-01

    Arginase (EC 3.5.3.1) catalyzes the last step of the urea cycle in the liver of ureotelic animals. Inherited deficiency of the enzyme results in argininemia, an autosomal recessive disorder characterized by hyperammonemia. To facilitate investigation of the enzyme and gene structures and to elucidate the nature of the mutation in argininemia, the authors isolated cDNA clones for human liver arginase. Oligo(dT)-primed and random primer human liver cDNA libraries in lambda gt11 were screened using isolated rat arginase cDNA as a probe. Two of the positive clones, designated lambda hARG6 and lambda hARG109, contained an overlapping cDNA sequence with an open reading frame encoding a polypeptide of 322 amino acid residues (predicted Mr, 34,732), a 5'-untranslated sequence of 56 base pairs, a 3'-untranslated sequence of 423 base pairs, and a poly(A) segment. Arginase activity was detected in Escherichia coli cells transformed with the plasmid carrying lambda hARG6 cDNA insert. RNA gel blot analysis of human liver RNA showed a single mRNA of 1.6 kilobases. The predicted amino acid sequence of human liver arginase is 87% and 41% identical with those of the rat liver and yeast enzymes, respectively. There are several highly conserved segments among the human, rat, and yeast enzymes.

  6. Nucleotide sequence and infectious cDNA clone of the L1 isolate of Pea seed-borne mosaic potyvirus.

    PubMed

    Olsen, B S; Johansen, I E

    2001-01-01

    The complete nucleotide sequence of Pea seed-borne mosaic potyvirus isolate L1 has been determined from cloned virus cDNA. The PSbMV L1 genome is 9895 nucleotides in length excluding the poly(A) tail. Computer analysis of the sequence revealed a single long open reading frame (ORF) of 9594 nucleotides. The ORF potentially encodes a polyprotein of 3198 amino acids with a deduced Mr of 363537. Nine putative proteolytic cleavage sites were identified by analogy to consensus sequences and genome arrangement in other potyviruses. Two full-length cDNA clones, p35S-L1-4 and p35S-L1-5, were assembled under control of an enhanced 35S promoter and nopaline synthase terminator. Clone p35S-L1-4 was constructed with four introns and p35S-L1-5 with five introns inserted in the cDNA. Clone p35S-L1-4 was unstable in Escherichia coli often resulting in amplification of plasmids with deletions. Clone p35S-L1-5 was stable and apparently less toxic to Escherichia coli resulting in larger bacterial colonies and higher plasmid yield. Both clones were infectious upon mechanical inoculation of plasmid DNA on susceptible pea cultivars Fjord, Scout, and Brutus. Eight pea genotypes resistant to L1 virus were also resistant to the cDNA derived L1 virus. Both native PSbMV L1 and the cDNA derived virus infected Chenopodium quinoa systemically giving rise to characteristic necrotic lesions on uninoculated leaves.

  7. Nucleotide sequence of cloned cDNA for human pancreatic kallikrein.

    PubMed

    Fukushima, D; Kitamura, N; Nakanishi, S

    1985-12-31

    Cloned cDNA sequences for human pancreatic kallikrein have been isolated and determined by molecular cloning and sequence analysis. The identity between human pancreatic and urinary kallikreins is indicated by the complete coincidence between the amino acid sequence deduced from the cloned cDNA sequence and that reported partially for urinary kallikrein. The active enzyme form of the human pancreatic kallikrein consists of 238 amino acids and is preceded by a signal peptide and a profragment of 24 amino acids. A sequence comparison of this with other mammalian kallikreins indicates that key amino acid residues required for both serine protease activity and kallikrein-like cleavage specificity are retained in the human sequence, and residues corresponding to some external loops of the kallikrein diverge from other kallikreins. Analyses by RNA blot hybridization, primer extension, and S1 nuclease mapping indicate that the pancreatic kallikrein mRNA is also expressed in the kidney and sublingual gland, suggesting the active synthesis of urinary kallikrein in these tissues. Furthermore, the tissue-specific regulation of the expression of the members of the human kallikrein gene family has been discussed.

  8. Molecular cloning and nucleotide sequence of cDNA for human glucose-6-phosphate dehydrogenase variant A(-)

    SciTech Connect

    Hirono, A.; Beutler, E.

    1988-06-01

    Glucose-6-phosphate dehydrogenase A(-) is a common variant in Blacks that causes sensitivity to drug- and infection-induced hemolytic anemia. A cDNA library was constructed from Epstein-Barr virus-transformed lymphoblastoid cells from a male who was G6PD A(-). One of four cDNA clones isolated contained a sequence not found in the other clones nor in the published cDNA sequence. Consisting of 138 bases and coding 46 amino acids, this segment of cDNA apparently is derived from the alternative splicing involving the 3' end of intron 7. Comparison of the remaining sequences of these clones with the published sequence revealed three nucleotide substitutions: C³³ → G, G²⁰² → A, and A³⁷⁶ → G. Each change produces a new restriction site. Genomic DNA from five G6PD A(-) individuals was amplified by the polymerase chain reaction. The finding of the same mutation in G6PD A(-) as is found in G6PD A(+) strongly suggests that the G6PD A(-) mutation arose in an individual with G6PD A(+), adding another mutation that causes the in vivo instability of this enzyme protein.

  9. Human uroporphyrinogen III synthase: Molecular cloning, nucleotide sequence, and expression of a full-length cDNA

    SciTech Connect

    Tsai, Shihfeng; Bishop, D.F.; Desnick, R.J.

    1988-10-01

    Uroporphyrinogen III synthase, the fourth enzyme in the heme biosynthetic pathway, is responsible for conversion of the linear tetrapyrrole, hydroxymethylbilane, to the cyclic tetrapyrrole, uroporphyrinogen III. The deficient activity of URO-synthase is the enzymatic defect in the autosomal recessive disorder congenital erythropoietic porphyria. To facilitate the isolation of a full-length cDNA for human URO-synthase, the human erythrocyte enzyme was purified to homogeneity and 81 nonoverlapping amino acids were determined by microsequencing the N terminus and four tryptic peptides. Two synthetic oligonucleotide mixtures were used to screen 1.2 × 10⁶ recombinants from a human adult liver cDNA library. Eight clones were positive with both oligonucleotide mixtures. Of these, dideoxy sequencing of the 1.3 kilobase insert from clone pUROS-2 revealed 5' and 3' untranslated sequences of 196 and 284 base pairs, respectively, and an open reading frame of 798 base pairs encoding a protein of 265 amino acids with a predicted molecular mass of 28,607 Da. The isolation and expression of this full-length cDNA for human URO-synthase should facilitate studies of the structure, organization, and chromosomal localization of this heme biosynthetic gene as well as the characterization of the molecular lesions causing congenital erythropoietic porphyria.

  10. Nucleotide sequence of murine PCNA: interspecies comparison of the cDNA and the 5' flanking region of the gene.

    PubMed

    Shipman-Appasamy, P M; Cohen, K S; Prystowsky, M B

    1991-01-01

    Proliferating cell nuclear antigen (PCNA) RNA levels are regulated by transcription as well as changes in stability in growing cells. We have cloned the murine PCNA cDNA and a fragment of the murine PCNA gene flanking the transcription initiation site. Comparison of the murine deduced amino acid sequence with the PCNA sequence from rat, human, Drosophila, Saccharomyces cerevisiae, and higher plants reveals extensive homology between species. The homology is likely to be related to the fundamental role of PCNA as an auxiliary protein for DNA replication. Consensus sequences for transcriptional regulatory factors identified within 520 bp 5' of the cap site of the murine PCNA gene include: an inverted CCAAT site, an enhancer core element (EBP-1), three cAMP-response elements (CRE-BP), one AP-2 site, three Sp1 sites, and two octamer sequences. The first 20 bp of the transcriptional unit are homologous to an initiator element, which may direct transcription from RNA polymerase II in the absence of a TATAA box. The consensus elements in the murine PCNA gene are similar in sequence and/or location to elements identified in the genes for human, Drosophila, and yeast PCNA.

  11. Uroporphyrinogen-III synthase: Molecular cloning, nucleotide sequence, expression of a mouse full-length cDNA, and its localization on mouse chromosome 7

    SciTech Connect

    Xu, W.; Desnick, R.J.; Kozak, C.A.

    1995-04-10

    Uroporphyrinogen-III synthase, the fourth enzyme in the heme biosynthetic pathway, is responsible for the conversion of hydroxymethylbilane to the cyclic tetrapyrrole, uroporphyrinogen III. The deficient activity of URO-S is the enzymatic defect in congenital erythropoietic porphyria (CEP), an autosomal recessive disorder. For the generation of a mouse model of CEP, the human URO-S cDNA was used to screen 2 × 10⁶ recombinants from a mouse adult liver cDNA library. Ten positive clones were isolated, and dideoxy sequencing of the entire 1.6-kb insert of clone pmUROS-1 revealed 5' and 3' untranslated sequences of 144 and 623 bp, respectively, and an open reading frame of 798 bp encoding a 265-amino-acid polypeptide with a predicted molecular mass of 28,501 Da. The mouse and human coding sequences had 80.5 and 77.8% nucleotide and amino acid identity, respectively. The authenticity of the mouse cDNA was established by expression of the active monomeric enzyme in Escherichia coli. In addition, the analysis of two multilocus genetic crosses localized the mouse gene on chromosome 7, consistent with the mapping of the human gene to a position of conserved synteny on chromosome 10. The isolation, expression, and chromosomal mapping of this full-length cDNA should facilitate studies of the structure and organization of the mouse genomic sequence and the development of a mouse model of CEP for characterization of the disease pathogenesis and evaluation of gene therapy. 38 refs., 1 tab.

  12. Full-length cDNA nucleotide sequence of a serologically undetectable HLA-DQA1 allele: HLA-DQA1*"LA".

    PubMed

    Lardy, N M; Otting, N; van der Horst, A R; Bontrop, R E; de Waal, L P

    1997-10-01

    This study describes the characterization of a serological HLA-DQ"blank" specificity that segregates with the HLA-A2, -B7, -DR14, -DR52 haplotype. Although conventional serological typing techniques could not detect an HLA-DQ product on the haplotype positive for the HLA-DQ"blank" specificity, sequence-specific oligonucleotide (SSO) dot-blot analysis demonstrated the presence of the HLA-DQA1*01 and HLA-DQB1*05 alleles. Full-length cDNA nucleotide sequence analysis revealed that the HLA-DQB1 allele that segregated with the HLA-DQ"blank" specificity was identical to HLA-DQB1*05031. As for the HLA-DQA1 allele, one nucleotide substitution distinguished the HLA-DQA1 "blank" allele from HLA-DQA1*0104. In exon 2 at nucleotide position 304 a C was substituted for a T (Arg→Cys). Pending official recognition by the WHO Nomenclature Committee, this HLA-DQA1 "blank" allele is termed HLA-DQA1*"LA". Furthermore, it is postulated that the introduction of cysteine at amino acid position 102 abrogates the classical HLA-DQ1 specificity.

  13. Complete nucleotide and derived amino acid sequence of cDNA encoding the mitochondrial uncoupling protein of rat brown adipose tissue: lack of a mitochondrial targeting presequence.

    PubMed Central

    Ridley, R G; Patel, H V; Gerber, G E; Morton, R C; Freeman, K B

    1986-01-01

    A cDNA clone spanning the entire amino acid sequence of the nuclear-encoded uncoupling protein of rat brown adipose tissue mitochondria has been isolated and sequenced. With the exception of the N-terminal methionine the deduced N-terminus of the newly synthesized uncoupling protein is identical to the N-terminal 30 amino acids of the native uncoupling protein as determined by protein sequencing. This proves that the protein contains no N-terminal mitochondrial targeting prepiece and that a targeting region must reside within the amino acid sequence of the mature protein. PMID:3012461

  14. Structure of LEP100, a glycoprotein that shuttles between lysosomes and the plasma membrane, deduced from the nucleotide sequence of the encoding cDNA

    PubMed Central

    1988-01-01

    LEP100, a membrane glycoprotein that has the unique property of shuttling from lysosomes to endosomes to plasma membrane and back, was purified from chicken brain. Its NH2-terminal amino acid sequence was determined, and an oligonucleotide encoding part of this sequence was used to clone the encoding cDNA. The deduced amino acid sequence consists of 414 residues of which the NH2-terminal 18 constitute a signal peptide. The sequence includes 17 sites for N-glycosylation in the NH2-terminal 75% of the polypeptide chain followed by a region lacking N-linked oligosaccharides, a single possible membrane-spanning segment, and a cytoplasmic domain of 11 residues, including three potential phosphorylation sites. Eight cysteine residues are spaced in a regular pattern through the lumenal (extracellular) domain, while a 32-residue sequence rich in proline, serine, and threonine occurs at its midpoint. Expression of the cDNA in mouse L cells resulted in targeting of LEP100 primarily to the mouse lysosomes. PMID:3339090

  15. Cloning and partial nucleotide sequence of human immunoglobulin mu chain cDNA from B cells and mouse-human hybridomas.

    PubMed Central

    Dolby, T W; Devuono, J; Croce, C M

    1980-01-01

    Purified mRNAs coding for mu and kappa human immunoglobulin polypeptides were translated in vitro and their products were characterized. The mu-specific mRNAs, derived from both human lymphoblastoid cells (GM607) and from a mouse-human somatic cell hybrid secreting human mu chains (alpha D5-H11-BC11), were copied into cDNAs and inserted into the plasmid pBR322. Several recombinant cDNAs that were obtained were identified by a combination of colony hybridization with labeled probes, in vitro translation of plasmid-selected mu mRNAs, and DNA nucleotide sequence determination. One recombinant DNA, for which the sequence has been partially determined, contains the codons for part of the C3 constant region domain through the carboxy-terminal piece (155 amino acids total) as well as the entire 3' noncoding sequence up to the poly(A) site of the human mu mRNA. The sequence A-A-U-A-A occurs 12 nucleotides prior to the poly(A) addition site in the human mu mRNA. Considerable sequence homology is observed in the mouse and human mu mRNA 3' coding and noncoding sequences. PMID:6777778

  16. Nucleotide sequence of a tobacco cDNA encoding plastidic glutamine synthetase and light inducibility, organ specificity and diurnal rhythmicity in the expression of the corresponding genes of tobacco and tomato.

    PubMed

    Becker, T W; Caboche, M; Carrayol, E; Hirel, B

    1992-06-01

    A full-length cDNA encoding glutamine synthetase (GS) was cloned from a lambda gt10 library of tobacco leaf RNA, and the nucleotide sequence was determined. An open reading frame accounting for a primary translation product consisting of 432 amino acids has been localized on the cDNA. The calculated molecular mass of the encoded protein is 47.2 kDa. The predicted amino acid sequence of this precursor shows higher homology to GS-2 protein sequences from other species than to a leaf GS-1 polypeptide sequence, indicating that the cDNA isolated encodes the chloroplastic isoform (GS-2) of tobacco GS. The presence of C- and N-terminal extensions which are characteristic of GS-2 proteins supports this conclusion. Genomic Southern blot analysis indicated that GS-2 is encoded by a single gene in the diploid genomes of both tomato and Nicotiana sylvestris, while two GS-2 genes are very likely present in the amphidiploid tobacco genome. Western blot analysis indicated that in etiolated and in green tomato cotyledons GS-2 subunits are represented by polypeptides of similar size, while in green tomato leaves an additional GS-2 polypeptide of higher apparent molecular weight is detectable. In contrast, tobacco GS-2 is composed of subunits of identical size in all organs examined. GS-2 transcripts and GS-2 proteins could be detected at high levels in the leaves of both tobacco and tomato. Lower amounts of GS-2 mRNA were detected in stems, corolla, and roots of tomato, but not in non-green organs of tobacco. The GS-2 transcript abundance exhibited a diurnal fluctuation in tomato leaves but not in tobacco leaves. White or red light stimulated the accumulation of GS-2 transcripts and GS-2 protein in etiolated tomato cotyledons. Far-red light cancelled this stimulation. The red light response of the GS-2 gene was reduced in etiolated seedlings of the phytochrome-deficient aurea mutant of tomato. These results indicate a phytochrome-mediated light stimulation of GS-2 gene expression.

  17. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

    1993-02-16

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a pu... GOVERNMENT RIGHTS: This application was funded under Department of Energy Contract DE-AC02-76ER01338. The U.S. Government has certain rights under this application and any patent issuing thereon.

  18. Human somatostatin I: sequence of the cDNA.

    PubMed Central

    Shen, L P; Pictet, R L; Rutter, W J

    1982-01-01

    RNA has been isolated from a human pancreatic somatostatinoma and used to prepare a cDNA library. After prescreening, clones containing somatostatin I sequences were identified by hybridization with an anglerfish somatostatin I-cloned cDNA probe. From the nucleotide sequence of two of these clones, we have deduced an essentially full-length mRNA sequence, including the preprosomatostatin coding region, 105 nucleotides from the 5' untranslated region and the complete 150-nucleotide 3' untranslated region. The coding region predicts a 116-amino acid precursor protein (Mr, 12,727) that contains somatostatin-14 and -28 at its COOH terminus. The predicted amino acid sequence of human somatostatin-28 is identical to that of somatostatin-28 isolated from the porcine and ovine species. A comparison of the amino acid sequences of human and anglerfish preprosomatostatin I indicated that the COOH-terminal region encoding somatostatin-14 and the adjacent 6 amino acids are highly conserved, whereas the remainder of the molecule, including the signal peptide region, is more divergent. However, many of the amino acid differences found in the pro region of the human and anglerfish proteins are conservative changes. This suggests that the propeptides have a similar secondary structure, which in turn may imply a biological function for this region of the molecule. PMID:6126875

  19. Complete nucleotide sequences and construction of full-length infectious cDNA clones of cucumber green mottle mosaic virus (CGMMV) in a versatile newly developed binary vector including both 35S and T7 promoters.

    PubMed

    Park, Chan-Hwan; Ju, Hye-Kyoung; Han, Jae-Yeong; Park, Jong-Seo; Kim, Ik-Hyun; Seo, Eun-Young; Kim, Jung-Kyu; Hammond, John; Lim, Hyoun-Sub

    2017-04-01

    Seed-transmitted viruses have caused significant damage to watermelon crops in Korea in recent years, with cucumber green mottle mosaic virus (CGMMV) infection widespread as a result of infected seed lots. To determine the likely origin of CGMMV infection, we collected CGMMV isolates from watermelon and melon fields and generated full-length infectious cDNA clones. The full-length cDNAs were cloned into newly constructed binary vector pJY, which includes both the 35S and T7 promoters for versatile usage (agroinfiltration and in vitro RNA transcription) and a modified hepatitis delta virus ribozyme sequence to precisely cleave RNA transcripts at the 3' end of the tobamovirus genome. Three CGMMV isolates (OMpj, Wpj, and Mpj) were separately evaluated for infectivity in Nicotiana benthamiana, demonstrated by either agroinfiltration or inoculation with in vitro RNA transcripts. CGMMV nucleotide identities to other tobamoviruses were calculated from pairwise alignments using DNAMAN. CGMMV identities were 49.89% to tobacco mosaic virus; 49.85% to pepper mild mottle virus; 50.47% to tomato mosaic virus; 60.9% to zucchini green mottle mosaic virus; and 60.96% to kyuri green mottle mosaic virus, confirming that CGMMV is a distinct species most similar to other cucurbit-infecting tobamoviruses. We further performed phylogenetic analysis to determine relationships of our new Korean CGMMV isolates to previously characterized isolates from Canada, China, India, Israel, Japan, Korea, Russia, Spain, and Taiwan available from NCBI. Analysis of CGMMV amino acid sequences showed three major clades, broadly typified as 'Russian,' 'Israeli,' and 'Asian' groups. All of our new Korean isolates fell within the 'Asian' clade. Neither the 128 nor 186 kDa RdRps of the three new isolates showed any detectable gene silencing suppressor function.
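The abstract reports nucleotide identities calculated from pairwise alignments but does not give the formula. As a rough illustration of one common convention, the sketch below computes percent identity as matching columns divided by all aligned columns, with gaps counted as mismatches. The function name `percent_identity`, the choice of denominator, and the toy sequences are assumptions for illustration and are not taken from DNAMAN or from the paper.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumed convention: identical, non-gap columns divided by the number
    of alignment columns, ignoring columns that are gaps in both rows.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


# Toy alignment (not real viral sequence): 4 matches over 6 columns.
print(round(percent_identity("ACGT-A", "ACCTGA"), 2))
```

Other tools divide by the shorter sequence length or exclude gapped columns, so values computed this way are not necessarily comparable with the percentages quoted above.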

  20. Automated Identification of Nucleotide Sequences

    NASA Technical Reports Server (NTRS)

    Osman, Shariff; Venkateswaran, Kasthuri; Fox, George; Zhu, Dian-Hui

    2007-01-01

    STITCH is a computer program that processes raw nucleotide-sequence data to automatically remove unwanted vector information, perform reverse-complement comparison, stitch shorter sequences together to make longer ones to which the shorter ones presumably belong, and search against the user's choice of private and Internet-accessible public 16S rRNA databases. ["16S rRNA" denotes a ribosomal ribonucleic acid (rRNA) sequence that is common to all organisms.] In STITCH, a template 16S rRNA sequence is used to position forward and reverse reads. STITCH then automatically searches known 16S rRNA sequences in the user's chosen database(s) to find the sequence most similar to (the sequence that lies at the smallest edit distance from) each spliced sequence. The result of processing by STITCH is the identification of the most similar well-described bacterium. Whereas previously commercially available software for analyzing genetic sequences operates on one sequence at a time, STITCH can manipulate multiple sequences simultaneously to perform the aforementioned operations. A typical analysis of several dozen sequences (length of the order of 10³ base pairs) by use of STITCH is completed in a few minutes, whereas such an analysis performed by use of prior software takes hours or days.
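Two of the operations named in this abstract, reverse-complement comparison and a smallest-edit-distance search, can be sketched in a few lines. This is not STITCH's implementation, which the abstract does not describe in detail. The function names, the plain Levenshtein distance, and the toy reference dictionary are all assumptions made for illustration.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def closest_reference(query: str, references: dict) -> tuple:
    """Return (name, distance) of the reference nearest to the query,
    trying the query in both orientations."""
    best_name, best_dist = None, None
    for name, ref in references.items():
        dist = min(edit_distance(query, ref),
                   edit_distance(reverse_complement(query), ref))
        if best_dist is None or dist < best_dist:
            best_name, best_dist = name, dist
    return best_name, best_dist


# Hypothetical toy database; real 16S rRNA sequences are far longer.
toy_refs = {"ref_a": "AAGC", "ref_b": "TTTTTTTT"}
print(closest_reference("GCTT", toy_refs))
```

The quadratic-time distance used here is adequate for a handful of short reads. Searching a full 16S rRNA database would normally call for an indexed or heuristic search instead.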

  1. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    1995-03-21

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1,018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli. 11 figures.

  2. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    1999-05-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli. 12 figs.

  3. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

    1995-03-21

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  4. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

    1999-05-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  5. cDNA encoding a polypeptide including a hevein sequence

    SciTech Connect

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    2000-07-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  6. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    1993-02-16

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1,018 nucleotides long and includes an open reading frame of 204 amino acids.

  7. Nucleotide sequences important for translation initiation of enterovirus RNA.

    PubMed Central

    Iizuka, N; Yonekawa, H; Nomoto, A

    1991-01-01

    An infectious cDNA clone was constructed from the genome of coxsackievirus B1 strain. A number of RNA transcripts that have mutations in the 5' noncoding region were synthesized in vitro from the modified cDNA clones and examined for their abilities to act as mRNAs in a cell-free translation system prepared from HeLa S3 cells. RNAs that lack nucleotide sequences at positions 568 to 726 and 565 to 726 were found to be less efficient and inactive mRNAs, respectively. To understand the biological significance of this region of RNA, small deletions and point mutations were introduced in the nucleotide sequence between positions 538 and 601. Except for a nucleotide substitution at 592 (U→C) within the 7-base conserved sequence, mutations introduced in the sequence downstream of position 568 did not affect much, if any, of the ability of RNA to act as mRNA. Except for a point mutation at 558 (C→U), mutations upstream of position 567 appeared to inactivate the mRNA. In the upstream region, a sequence consisting of 21 nucleotides at positions 546 to 566 is perfectly conserved in the 5' noncoding regions of enterovirus and rhinovirus genomes. These results suggest that the 7-base conserved sequence functions to maintain the efficiency of translation initiation and that the nucleotide sequence upstream of position 567, including the 21-base conserved sequence, plays essential roles in translation initiation. A deletion mutant whose genome lacks the nucleotide sequence at positions 568 to 726 showed a small-plaque phenotype and less virulence against suckling mice than the wild-type virus. Thus, reduction of the efficiency of translation initiation may result in the construction of enteroviruses with the lower-virulence phenotype. Images PMID:1651409

  8. Nucleotide sequences encoding a thermostable alkaline protease

    DOEpatents

    Wilson, David B.; Lao, Guifang

    1998-01-01

    Nucleotide sequences, derived from a thermophilic actinomycete microorganism, which encode a thermostable alkaline protease are disclosed. Also disclosed are variants of the nucleotide sequences which encode a polypeptide having thermostable alkaline proteolytic activity. Recombinant thermostable alkaline protease or recombinant polypeptide may be obtained by culturing in a medium a host cell genetically engineered to contain and express a nucleotide sequence according to the present invention, and recovering the recombinant thermostable alkaline protease or recombinant polypeptide from the culture medium.

  9. Nucleotide sequences encoding a thermostable alkaline protease

    DOEpatents

    Wilson, D.B.; Lao, G.

    1998-01-06

    Nucleotide sequences, derived from a thermophilic actinomycete microorganism, which encode a thermostable alkaline protease are disclosed. Also disclosed are variants of the nucleotide sequences which encode a polypeptide having thermostable alkaline proteolytic activity. Recombinant thermostable alkaline protease or recombinant polypeptide may be obtained by culturing in a medium a host cell genetically engineered to contain and express a nucleotide sequence according to the present invention, and recovering the recombinant thermostable alkaline protease or recombinant polypeptide from the culture medium. 3 figs.

  10. cDNA cloning and sequencing of tarantula hemocyanin subunits.

    PubMed

    Voit, R; Feldmaier-Fuchs, G

    1990-01-01

    Tarantula heart cDNA libraries were screened with synthetic oligonucleotide probes deduced from the highly conserved amino acid sequences of the two copper-binding sites, copper A and copper B, found in chelicerate hemocyanins. Positive cDNA clones could be obtained and four different cDNA types were characterized.

  11. Long-range correlations in nucleotide sequences

    NASA Technical Reports Server (NTRS)

    Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Sciortino, F.; Simons, M.; Stanley, H. E.

    1992-01-01

    DNA sequences have been analysed using models, such as an n-step Markov chain, that incorporate the possibility of short-range nucleotide correlations. We propose here a method for studying the stochastic properties of nucleotide sequences by constructing a 1:1 map of the nucleotide sequence onto a walk, which we term a 'DNA walk'. We then use the mapping to provide a quantitative measure of the correlation between nucleotides over long distances along the DNA chain. Thus we uncover in the nucleotide sequence a remarkably long-range power law correlation that implies a new scale-invariant property of DNA. We find such long-range correlations in intron-containing genes and in nontranscribed regulatory DNA sequences, but not in complementary DNA sequences or intron-less genes.

  12. Long-range correlations in nucleotide sequences

    NASA Astrophysics Data System (ADS)

    Peng, C.-K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Sciortino, F.; Simons, M.; Stanley, H. E.

    1992-03-01

    DNA sequences have been analysed using models, such as an n-step Markov chain, that incorporate the possibility of short-range nucleotide correlations. We propose here a method for studying the stochastic properties of nucleotide sequences by constructing a 1:1 map of the nucleotide sequence onto a walk, which we term a 'DNA walk'. We then use the mapping to provide a quantitative measure of the correlation between nucleotides over long distances along the DNA chain. Thus we uncover in the nucleotide sequence a remarkably long-range power law correlation that implies a new scale-invariant property of DNA. We find such long-range correlations in intron-containing genes and in nontranscribed regulatory DNA sequences, but not in complementary DNA sequences or intron-less genes.

  13. Sequence of the cDNA encoding an actin homolog in the crayfish Procambarus clarkii.

    PubMed

    Kang, W K; Naya, Y

    1993-11-15

    A cDNA library was constructed by using mRNAs purified from crayfish (Procambarus clarkii) muscle. Using a homology search of the nucleotide (nt) sequences, a clone of the library was found to encode a protein homologous to actin (Act). The insert fragment of this cDNA clone was 1072 nt in length. The amino acid sequence deduced from the nt sequence showed significant similarity to Act of various organisms as follows: 88.1% to Drosophila melanogaster, 88.2% to silk worm, 87.3% to brine shrimp, 86.3% to rat, and 86.3% to human (% identity).

  14. Flavin reductase: sequence of cDNA from bovine liver and tissue distribution.

    PubMed Central

    Quandt, K S; Hultquist, D E

    1994-01-01

    Flavin reductase catalyzes electron transfer from reduced pyridine nucleotides to methylene blue or riboflavin, and this catalysis is the basis of the therapeutic use of methylene blue or riboflavin in the treatment of methemoglobinemia. A cDNA for a mammalian flavin reductase has been isolated and sequenced. Degenerate oligonucleotides, with sequences based on amino acid sequences of peptides derived from bovine erythrocyte flavin reductase, were used as primers in PCR to selectively amplify a partial cDNA that encodes the bovine reductase. The template used in the PCR was first strand cDNA synthesized from bovine liver total RNA using oligo(dT) primers. A PCR product was used as a specific probe to screen a bovine liver cDNA library. The sequence determined from two overlapping clones contains an open reading frame of 621 nucleotides and encodes 206 amino acids. The amino acid sequence deduced from the bovine liver flavin reductase cDNA matches the amino acid sequences determined for erythrocyte reductase-derived peptides, and the predicted molecular mass of 22,001 Da for the liver reductase agrees well with the molecular mass of 21,994 Da determined for the erythrocyte reductase by electrospray mass spectrometry. The amino acid sequence at the N terminus of the reductase has homology to sequences of pyridine nucleotide-dependent enzymes, and the predicted secondary structure, beta alpha beta, resembles the common nucleotide-binding structural motif. RNA blot analysis indicates a single 1-kilobase reductase transcript in human heart, kidney, liver, lung, pancreas, placenta, and skeletal muscle. Images PMID:7937764

  15. A model organism for new gene discovery by cDNA sequencing

    SciTech Connect

    El-Sayed, N.M.; Donelson, J.E.; Alarcon, C.M.

    1994-09-01

    One method of new gene discovery is single-pass sequencing of cDNAs to identify expressed sequence tags (ESTs). Model organisms can have biological properties that make their use advantageous over studies with humans. One such model organism with advantages for cDNA sequencing is the African trypanosome T. brucei rhodesiense. This organism has the same 40-nucleotide sequence (splice leader sequence) on the 5' end of all mRNAs. We have constructed a 5' cDNA library by priming off the splice leader sequence and have begun sequencing this cDNA library. To date, nearly 500 such cDNA ESTs have been examined. Forty-three percent of the sequences sampled from the trypanosome cDNA library have significant similarities to sequences already in the protein and translated nucleic acid databases. Among these are cDNA sequences which encode previously reported T. brucei proteins such as the VSG, tubulin, calflagin, etc., and proteins previously identified in other trypanosomatids. Other cDNAs display significant similarities to genes in unrelated organisms encoding several ribosomal proteins, metabolic enzymes, GTP binding proteins, transcription factors, cyclophilin, nucleosomal histones, histone H1, and a macrophage stress protein, among others. The 57% of the cDNAs that are not similar to sequences currently in the databases likely encode both trypanosome-specific proteins and housekeeping proteins shared with other eukaryotes. These cDNA ESTs provide new avenues of research for exploring both the biochemistry and the genome organization of this parasite, as well as a resource for identifying the 5' sequence of novel genes likely to have homology to genes expressed in other organisms.

  16. Nucleotide capacitance calculation for DNA sequencing

    SciTech Connect

    Lu, Jun-Qiang; Zhang, Xiaoguang

    2008-01-01

    Using a first-principles linear response theory, the capacitances of the DNA nucleotides adenine, cytosine, guanine and thymine are calculated. The difference in capacitance between the nucleotides is studied with respect to conformational distortion. The result suggests that although an alternating-current capacitance measurement of a single-stranded DNA chain threaded through nano-gap electrodes may not be sufficient as a stand-alone method for rapid DNA sequencing, the capacitance of the nucleotides should be taken into consideration in any GHz-frequency electric measurements and may also serve as an additional criterion for identifying the DNA sequence.

  17. Human and Tree Shrew Alpha-synuclein: Comparative cDNA Sequence and Protein Structure Analysis.

    PubMed

    Wu, Zheng-Cun; Huang, Zhang-Qiong; Jiang, Qin-Fang; Dai, Jie-Jie; Zhang, Ying; Gao, Jia-Hong; Sun, Xiao-Mei; Chen, Nai-Hong; Yuan, Yu-He; Li, Cong; Han, Yuan-Yuan; Li, Yun; Ma, Kai-Li

    2015-10-01

    The synaptic protein alpha-synuclein (α-syn) is associated with a number of neurodegenerative diseases, and homology analyses among many species have been reported. Nevertheless, little is known about the cDNA sequence and protein structure of α-syn in tree shrews, and this information might contribute to our understanding of its role in both health and disease. We designed primers to the human α-syn cDNA sequence; then, tree shrew α-syn cDNA was obtained by RT-PCR and sequenced. Based on the acquired tree shrew α-syn cDNA sequence, both the amino acid sequence and the spatial structure of α-syn were predicted and analyzed. The homology analysis results showed that the tree shrew cDNA sequence matches the human cDNA sequence exactly except at nucleotide positions 45, 60, 65, 69, 93, 114, 147, 150, 157, 204, 252, 270, 284, 298, 308, and 324. Further protein sequence analysis revealed that the tree shrew α-syn protein sequence is 97.1 % identical to that of human α-syn. The secondary protein structure of tree shrew α-syn based on random coils and α-helices is the same as that of the human structure. The phosphorylation sites are highly conserved, except the site at position 103 of tree shrew α-syn. The predicted spatial structure of tree shrew α-syn is identical to that of human α-syn. Thus, α-syn might have a similar function in tree shrew and in human, and tree shrew might be a potential animal model for studying the pathogenesis of α-synucleinopathies.

  18. The International Nucleotide Sequence Database Collaboration.

    PubMed

    Nakamura, Yasukazu; Cochrane, Guy; Karsch-Mizrachi, Ilene

    2013-01-01

    The International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org), one of the longest-standing global alliances of biological data archives, captures, preserves and provides comprehensive public domain nucleotide sequence information. Three partners of the INSDC work in cooperation to establish formats for data and metadata and protocols that facilitate reliable data submission to their databases and support continual data exchange around the world. This article presents the current status of the INSDC and its updates for 2012. Among the items discussed at the 2012 international collaboration meeting, the BioSample database and changes in submission are described.

  19. The International Nucleotide Sequence Database Collaboration

    PubMed Central

    Cochrane, Guy; Karsch-Mizrachi, Ilene; Takagi, Toshihisa; International Nucleotide Sequence Database Collaboration

    2016-01-01

    The International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org) comprises three global partners committed to capturing, preserving and providing comprehensive public-domain nucleotide sequence information. The INSDC establishes standards, formats and protocols for data and metadata to make it easier for individuals and organisations to submit their nucleotide data reliably to public archives. This work enables the continuous, global exchange of information about living things. Here we present an update of the INSDC in 2015, including data growth and diversification, new standards and requirements by publishers for authors to submit their data to the public archives. The INSDC serves as a model for data sharing in the life sciences. PMID:26657633

  20. Molecular cloning and sequencing of the banded dogfish (Triakis scyllia) interleukin-8 cDNA.

    PubMed

    Inoue, Yuuki; Haruta, Chiaki; Usui, Kazushige; Moritomo, Tadaaki; Nakanishi, Teruyuki

    2003-03-01

    The dogfish (Triakis scyllia) interleukin-8 (IL-8) cDNA was isolated from mitogen-stimulated peripheral white blood cells (WBCs) utilising the polymerase chain reaction (PCR). The cDNA sequence showed that the dogfish IL-8 clones contained an open reading frame encoding 101 amino acids. A short 5' untranslated region (UTR) of 70 nucleotides and a long 3' UTR of 893 nucleotides were also present in this 1.2-kb cDNA. Furthermore, the 3' UTR of the mRNA contained the AUUUA sequence that has been implicated in shortening of the half-life of several cytokines and growth factors. The predicted IL-8 peptide had one potential N-linked glycosylation site (Asn-72-Thr-74) that is not conserved in other vertebrates. It also contained four cysteine residues (Cys-34, 36, 61 and 77), which are characteristic of CXC subfamily cytokines and found in all vertebrates, to date. The dogfish IL-8 lacked an ELR motif as found in the lamprey and trout. Comparison of the deduced amino acids showed that the dogfish IL-8 sequence shared 50.5, 41.2, 37.1 and 40.4-45.5% identity with the chicken, lamprey, trout and mammalian IL-8 sequences, respectively.

  1. cDNA sequencing and expression analysis of Dicentrarchus labrax heme oxygenase-1.

    PubMed

    Prevot-D'Alvise, N; Pierre, S; Gaillard, S; Gouze, E; Gouze, J-N; Aubert, J; Richard, S; Grillasca, J-P

    2008-11-17

    The liver cDNA encoding heme oxygenase-1 (HO-1) was sequenced from European sea bass (Dicentrarchus labrax) (accession no. EF139130). The HO-1 cDNA was 1250 bp in length and the open reading frame encoded 277 amino acid residues. The deduced amino acid sequence of the European sea bass had 75% and 50% identity with the amino acid sequences of tetraodontiformes (Tetraodon nigroviridis and Takifugu rubripes) and human HO-1 proteins, respectively. A short hydrophobic transmembrane domain at the C-terminal region was found, and four histidine residues were highly conserved, including human His25, which is essential for HO catalytic activity. RT-PCR of mRNA from eight different European sea bass tissues revealed that, in a homeostatic state, heme oxygenase-1 was abundant in the spleen and liver but not in the brain.

  2. Giant panda ribosomal protein S14: cDNA, genomic sequence cloning, sequence analysis, and overexpression.

    PubMed

    Wu, G-F; Hou, Y-L; Hou, W-R; Song, Y; Zhang, T

    2010-10-13

    RPS14 is a component of the 40S ribosomal subunit encoded by the RPS14 gene and is required for its maturation. The cDNA and the genomic sequence of RPS14 were cloned successfully from the giant panda (Ailuropoda melanoleuca) using RT-PCR technology and touchdown-PCR, respectively; they were both sequenced and analyzed. The length of the cloned cDNA fragment was 492 bp; it contained an open-reading frame of 456 bp, encoding 151 amino acids. The length of the genomic sequence is 3421 bp; it contains four exons and three introns. Alignment analysis indicates that the nucleotide sequence shares a high degree of homology with those of Homo sapiens, Bos taurus, Mus musculus, Rattus norvegicus, Gallus gallus, Xenopus laevis, and Danio rerio (93.64, 83.37, 92.54, 91.89, 87.28, 84.21, and 84.87%, respectively). Comparison of the deduced amino acid sequences of the giant panda with those of these other species revealed that the RPS14 of giant panda is highly homologous with those of B. taurus, R. norvegicus and D. rerio (85.99, 99.34 and 99.34%, respectively), and is 100% identical with the others. This degree of conservation of RPS14 suggests evolutionary selection. Topology prediction shows that there are two N-glycosylation sites, three protein kinase C phosphorylation sites, two casein kinase II phosphorylation sites, four N-myristoylation sites, two amidation sites, and one ribosomal protein S11 signature in the RPS14 protein of the giant panda. The RPS14 gene can be readily expressed in Escherichia coli. When it was fused with the N-terminally His-tagged protein, it gave rise to accumulation of an expected 22-kDa polypeptide, in good agreement with the predicted molecular weight. The expression product obtained can be purified for studies of its function.
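
The relationship between the 456-bp open reading frame and the 151 encoded amino acids in this record can be checked with a short sketch. It assumes that the reported ORF length counts the stop codon, which the abstract does not state explicitly; the function name `residues_from_orf` is hypothetical.

```python
def residues_from_orf(orf_nt, includes_stop=True):
    """Number of amino acids encoded by an ORF of `orf_nt` nucleotides.

    Assumption: when `includes_stop` is True, the final codon is a stop
    codon and does not contribute a residue.
    """
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


# 456 nt -> 152 codons -> 151 residues plus one stop codon
print(residues_from_orf(456))
```

Under that assumption the figures in the abstract are consistent: 152 codons, of which 151 are translated.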

  3. ERCC2: cDNA cloning and molecular characterization of a human nucleotide excision repair gene with high homology to yeast RAD3.

    PubMed Central

    Weber, C A; Salazar, E P; Stewart, S A; Thompson, L H

    1990-01-01

    Human ERCC2 genomic clones give efficient, stable correction of the nucleotide excision repair defect in UV5 Chinese hamster ovary cells. One clone having a breakpoint just 5' of classical promoter elements corrects only transiently, implicating further flanking sequences in stable gene expression. The nucleotide sequences of a cDNA clone and genomic flanking regions were determined. The ERCC2 translated amino acid sequence has 52% identity (73% homology) with the yeast nucleotide excision repair protein RAD3. RAD3 is essential for cell viability and encodes a protein that is a single-stranded DNA dependent ATPase and an ATP dependent helicase. The similarity of ERCC2 and RAD3 suggests a role for ERCC2 in both cell viability and DNA repair and provides the first insight into the biochemical function of a mammalian nucleotide excision repair gene. Images Fig. 5. PMID:2184031

  4. Complete nucleotide sequence of a potyvirus causing maize dwarf mosaic disease in central China.

    PubMed

    Liu, X; Wang, X; Zhao, Y; Zheng, C; Zhou, G

    2003-01-01

    The full-length nucleotide sequence of a potyvirus causing the maize dwarf mosaic (MDM) disease in Henan province, central China, was obtained by reverse transcription-polymerase chain reaction (RT-PCR) and rapid amplification of the cDNA 5'-end (5'-RACE). The viral genome comprised 9596 nucleotides, excluding the poly(A) tail, and encoded a putative polyprotein of 3603 amino acids. The entire genomic sequence of this isolate shared identities of 94.2% and 98.3% with Sugarcane mosaic virus (SCMV) HZ isolate at the nucleotide and deduced amino acid levels, respectively, but only a 69.1% identity with MDM virus (MDMV) Bulgarian isolate (MDMV-Bg) at the nucleotide level. Phylogenetic tree analysis of the complete nucleotide sequences indicated that the Henan isolate of the potyvirus causing MDM disease is in fact a Henan strain of SCMV (SCMV-HN).

  5. cDNA cloning and sequence analysis of human pancreatic procarboxypeptidase A1.

    PubMed Central

    Catasús, L; Villegas, V; Pascual, R; Avilés, F X; Wicker-Planquart, C; Puigserver, A

    1992-01-01

    Using polyclonal antibodies raised against human pancreatic procarboxypeptidases, a full-length cDNA coding for an A-type proenzyme was isolated from a lambda gt11 human pancreatic library. This cDNA contains standard 3' and 5' flanking regions, a poly(A)+ tail and a central region of 1260 nucleotides coding for a protein of 419 amino acids. On the basis of sequence comparisons, the human protein was classified as a procarboxypeptidase A1 which is very similar to the previously described A1 forms from rat and bovine pancreatic glands. The presence of the amino acid sequences assumed to be of importance for the zymogen inhibition by its activation segment, primarily on the basis of the recently reported crystal structure of the B form, further supports the proposed classification. PMID:1417781

  6. Estimation of evolutionary distances between nucleotide sequences.

    PubMed

    Zharkikh, A

    1994-09-01

    A formal mathematical analysis of the substitution process in nucleotide sequence evolution was done in terms of the Markov process. By using matrix algebra theory, the theoretical foundation of Barry and Hartigan's (Stat. Sci. 2:191-210, 1987) and Lanave et al.'s (J. Mol. Evol. 20:86-93, 1984) methods was provided. Extensive computer simulation was used to compare the accuracy and effectiveness of various methods for estimating the evolutionary distance between two nucleotide sequences. It was shown that the multiparameter methods of Lanave et al.'s (J. Mol. Evol. 20:86-93, 1984), Gojobori et al.'s (J. Mol. Evol. 18:414-422, 1982), and Barry and Hartigan's (Stat. Sci. 2:191-210, 1987) are preferable to others for the purpose of phylogenetic analysis when the sequences are long. However, when sequences are short and the evolutionary distance is large, Tajima and Nei's (Mol. Biol. Evol. 1:269-285, 1984) method is superior to others.
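
As an illustration of one of the estimators named in this abstract, the sketch below computes a Tajima and Nei (1984) style distance between two aligned sequences. The formula is written from the commonly cited form, d = -b ln(1 - p/b) with b = 0.5 * (1 - sum of q_i^2 + p^2/c), and should be treated as an assumption to verify against the original paper. The function name `tajima_nei_distance` and the choice to skip sites with non-ACGT symbols are hypothetical.

```python
import math
from itertools import combinations

BASES = "ACGT"


def tajima_nei_distance(seq1, seq2):
    """Estimate substitutions per site between two aligned sequences.

    p    : proportion of differing sites
    q[i] : average frequency of base i over both sequences
    x_ij : proportion of sites showing the unordered base pair {i, j}
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    # Keep only sites where both sequences have an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in BASES and b in BASES]
    n = len(pairs)
    if n == 0:
        raise ValueError("no comparable sites")
    p = sum(a != b for a, b in pairs) / n
    if p == 0:
        return 0.0
    q = {x: sum((a == x) + (b == x) for a, b in pairs) / (2 * n)
         for x in BASES}
    c = 0.0
    for i, j in combinations(BASES, 2):
        x_ij = sum({a, b} == {i, j} for a, b in pairs) / n
        if x_ij > 0:  # implies q[i] > 0 and q[j] > 0
            c += x_ij ** 2 / (2 * q[i] * q[j])
    b = 0.5 * (1 - sum(v * v for v in q.values()) + p * p / c)
    if p >= b:
        raise ValueError("distance undefined: sequences too divergent")
    return -b * math.log(1 - p / b)
```

When all four bases are equally frequent and all six mismatch types are equally common, b reduces to 3/4 and the estimate coincides with the one-parameter (Jukes-Cantor) distance, which gives a convenient sanity check.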

  7. Genomic and cDNA actin sequences from a virulent strain of Entamoeba histolytica.

    PubMed Central

    Edman, U; Meza, I; Agabian, N

    1987-01-01

    Invasiveness of Entamoeba histolytica strains that cause acute amoebiasis is characterized by aggressive behavior associated with cell motility and actin function. Analysis of actin genes from E. histolytica was initiated by devising methods for the isolation of biologically active nucleic acids, which allowed the preparation of cDNA and genomic DNA libraries. E. histolytica actin-encoding cDNAs and genomic clones have been isolated from libraries prepared from the virulent HM1:IMSS strain using a heterologous actin probe. Nucleotide sequence analysis of three independent cDNA clones and one genomic clone reveals a highly unusual codon bias and the absence of intervening sequences in E. histolytica actin. The coding sequence of the genomic clone is identical to that of two of the three cDNA clones. These represent at least two distinct mRNAs differing only by five silent changes in the protein coding sequence. Multiple genomic copies of the actin gene can be detected by Southern hybridization. E. histolytica actin exhibits a higher degree of homology to cytoplasmic than to muscle actin. Although the protein has been shown not to bind DNase I, the inferred amino acid sequence indicates conservation of all residues implied to participate in this binding. Images PMID:2883657

  8. cDNA sequences of two apolipoproteins from lamprey

    SciTech Connect

    Pontes, M.; Xu, X.; Graham, D.; Riley, M.; Doolittle, R.F.

    1987-03-24

    The messages for two small but abundant apolipoproteins found in lamprey blood plasma were cloned with the aid of oligonucleotide probes based on amino-terminal sequences. In both cases, numerous clones were identified in a lamprey liver cDNA library, consistent with the great abundance of these proteins in lamprey blood. One of the cDNAs (LAL1) has a coding region of 105 amino acids that corresponds to a 21-residue signal peptide, a putative 8-residue propeptide, and the 76-residue mature protein found in blood. The other cDNA (LAL2) codes for a total of 191 residues, the first 23 of which constitute a signal peptide. The two proteins, which occur in the high-density lipoprotein fraction of ultracentrifuged plasma, have amino acid compositions similar to those of apolipoproteins found in mammalian blood; computer analysis indicates that the sequences are largely helix-permissive. When the sequences were searched against an amino acid sequence data base, rat apolipoprotein IV was the best matching candidate in both cases. Although a reasonable alignment can be made with that sequence and LAL1, definitive assignment of the two lamprey proteins to typical mammalian classes cannot be made at this point.

  9. Nucleotide sequence and genetic organization of Hungarian grapevine chrome mosaic nepovirus RNA2.

    PubMed Central

    Brault, V; Hibrand, L; Candresse, T; Le Gall, O; Dunez, J

    1989-01-01

    The complete nucleotide sequence of hungarian grapevine chrome mosaic nepovirus (GCMV) RNA2 has been determined. The RNA sequence is 4441 nucleotides in length, excluding the poly(A) tail. A polyprotein of 1324 amino acids with a calculated molecular weight of 146 kDa is encoded in a single long open reading frame extending from nucleotides 218 to 4190. This polyprotein is homologous with the protein encoded by the S strain of tomato black ring virus (TBRV) RNA2, the only other nepovirus sequenced so far. Direct sequencing of the viral coat protein and in vitro translation of transcripts derived from cDNA sequences demonstrate that, as for comoviruses, the coat protein is located at the carboxy terminus of the polyprotein. A model for the expression of GCMV RNA2 is presented. Images PMID:2798129

  10. Nucleotide sequence alignment using sparse coding and belief propagation.

    PubMed

    Roozgard, Aminmohammad; Barzigar, Nafise; Wang, Shuang; Jiang, Xiaoqian; Ohno-Machado, Lucila; Cheng, Samuel

    2013-01-01

    Advances in DNA information extraction techniques have led to huge sequenced genomes from organisms spanning the tree of life. This increasing amount of genomic information requires tools for comparison of the nucleotide sequences. In this paper, we propose a novel nucleotide sequence alignment method based on sparse coding and belief propagation to compare the similarity of the nucleotide sequences. We used the neighbors of each nucleotide as features, and then we employed sparse coding to find a set of candidate nucleotides. To select optimum matches, belief propagation was subsequently applied to these candidate nucleotides. Experimental results show that the proposed approach is able to robustly align nucleotide sequences and is competitive to SOAPaligner [1] and BWA [2].

  11. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

    2000-07-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  12. Characterization of cDNA clones encoding rabbit and human serum paraoxonase: The mature protein retains its signal sequence

    SciTech Connect

    Hassett, C.; Richter, R.J.; Humbert, R.; Omiecinski, C.J.; Furlong, C.E.; Chapline, C.; Crabb, J.W.

    1991-10-22

    Serum paraoxonase hydrolyzes the toxic metabolites of a variety of organophosphorus insecticides. High serum paraoxonase levels appear to protect against the neurotoxic effects of organophosphorus substrates of this enzyme. The amino acid sequence accounting for 42% of rabbit paraoxonase was determined. From these data, two oligonucleotide probes were synthesized and used to screen a rabbit liver cDNA library. Human paraoxonase clones were isolated from a liver cDNA library by using the rabbit cDNA as a hybridization probe. Inserts from three of the longest clones were sequenced, and one full-length clone contained an open reading frame encoding 355 amino acids, four less than the rabbit paraoxonase protein. Amino-terminal sequences derived from purified rabbit and human paraoxonase proteins suggested that the signal sequence is retained, with the exception of the initiator methionine residue. Characterization of the rabbit and human paraoxonase cDNA clones confirms that the signal sequences are not processed, except for the N-terminal methionine residue. The rabbit and human cDNA clones demonstrate striking nucleotide and deduced amino acid similarities (greater than 85%), suggesting an important metabolic role and constraints on the evolution of this protein.

  13. Nucleotide Sequence of the Akv env Gene

    PubMed Central

    Lenz, Jack; Crowther, Robert; Straceski, Anthony; Haseltine, William

    1982-01-01

    The sequence of 2,191 nucleotides encoding the env gene of murine retrovirus Akv was determined by using a molecular clone of the Akv provirus. Deduction of the encoded amino acid sequence showed that a single open reading frame encodes a 638-amino acid precursor to gp70 and p15E. In addition, there is a typical leader sequence preceding the amino terminus of gp70. The locations of potential glycosylation sites and other structural features indicate that the entire gp70 molecule and most of p15E are located on the outer side of the membrane. Internal cleavage of the env precursor to generate gp70 and p15E occurs immediately adjacent to several basic amino acids at the carboxyl terminus of gp70. This cleavage generates a region of 42 uncharged, relatively hydrophobic amino acids at the amino terminus of p15E, which is located in a position analogous to the hydrophobic membrane fusion sequence of influenza virus hemagglutinin. The mature polypeptides are predicted to associate with the membrane via a region of 30 uncharged, mostly hydrophobic amino acids located near the carboxyl terminus of p15E. Distal to this membrane association region is a sequence of 35 amino acids at the carboxyl terminus of the env precursor, which is predicted to be located on the inner side of the membrane. By analogy to Moloney murine leukemia virus, a proteolytic cleavage in this region removes the terminal 19 amino acids, thus generating the carboxyl terminus of p15E. This leaves 15 amino acids at the carboxyl terminus of p15E on the inner side of the membrane in a position to interact with virion cores during budding. The precise location and order of the large RNase T1-resistant oligonucleotides in the env region were determined and compared with those from several leukemogenic viruses of AKR origin. This permitted a determination of how the differences in the leukemogenic viruses affect the primary structure of the env gene products. PMID:6283170

  14. Amino acid sequence of band-3 protein from rainbow trout erythrocytes derived from cDNA.

    PubMed Central

    Hübner, S; Michel, F; Rudloff, V; Appelhans, H

    1992-01-01

    In this report we present the first complete band-3 cDNA sequence of a poikilothermic lower vertebrate. The primary structure of the anion-exchange protein band 3 (AE1) from rainbow trout erythrocytes was determined by nucleotide sequencing of cDNA clones. The overlapping clones have a total length of 3827 bp with a 5'-terminal untranslated region of 150 bp, a 2754 bp open reading frame and a 3'-untranslated region of 924 bp. Band-3 protein from trout erythrocytes consists of 918 amino acid residues with a calculated molecular mass of 101 827 Da. Comparison of its amino acid sequence revealed a 60-65% identity within the transmembrane spanning sequence of band-3 proteins published so far. An additional insertion of 24 amino acid residues within the membrane-associated domain of trout band-3 protein was identified, which until now was thought to be a general feature only of mammalian band-3-related proteins. PMID:1637296

  15. Complete nucleotide sequence of the polymerase 3 gene of human influenza virus A/WSN/33.

    PubMed Central

    Kaptein, J S; Nayak, D P

    1982-01-01

    The complete nucleotide sequence of polymerase 3 (P3) gene of a human influenza virus (A/WSN/33) has been determined using cDNA clones except for the last 11 nucleotides which were obtained by direct RNA sequencing. The WSN P3 gene contains 2,341 nucleotides and codes for a protein of 759 amino acids (molecular weight 85,800). The WSN P3 protein, as deduced from the plus-strand DNA sequence, is basic and enriched in positively charged amino acids. In addition, it contains clusters of basic amino acids which may provide sites for the interaction of P3 protein with the capped primer, template, and/or other polymerase proteins during the transcriptive and replicative processes of influenza viral RNA. PMID:7045393

  16. Insights into corn genes derived from large-scale cDNA sequencing.

    PubMed

    Alexandrov, Nickolai N; Brover, Vyacheslav V; Freidin, Stanislav; Troukhan, Maxim E; Tatarinova, Tatiana V; Zhang, Hongyu; Swaller, Timothy J; Lu, Yu-Ping; Bouck, John; Flavell, Richard B; Feldmann, Kenneth A

    2009-01-01

    We present a large portion of the transcriptome of Zea mays, including ESTs representing 484,032 cDNA clones from 53 libraries and 36,565 fully sequenced cDNA clones, out of which 31,552 clones are non-redundant. These and other previously sequenced transcripts have been aligned with available genome sequences and have provided new insights into the characteristics of gene structures and promoters within this major crop species. We found that although the average number of introns per gene is about the same in corn and Arabidopsis, corn genes have more alternatively spliced isoforms. Examination of the nucleotide composition of coding regions reveals that corn genes, as well as genes of other Poaceae (Grass family), can be divided into two classes according to the GC content at the third position in the amino acid encoding codons. Many of the transcripts that have lower GC content at the third position have dicot homologs but the high GC content transcripts tend to be more specific to the grasses. The high GC content class is also enriched with intronless genes. Together this suggests that an identifiable class of genes in plants is associated with the Poaceae divergence. Furthermore, because many of these genes appear to be derived from ancestral genes that do not contain introns, this evolutionary divergence may be the result of horizontal gene transfer from species not only with different codon usage but possibly that did not have introns, perhaps outside of the plant kingdom. By comparing the cDNAs described herein with the non-redundant set of corn mRNAs in GenBank, we estimate that there are about 50,000 different protein coding genes in Zea. All of the sequence data from this study have been submitted to DDBJ/GenBank/EMBL under accession numbers EU940701-EU977132 (FLI cDNA) and FK944382-FL482108 (EST).
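
    The two-class split described above rests on GC content at the third codon position (often abbreviated GC3). A minimal sketch of that measurement follows; the function names, the toy sequences and the 0.5 cutoff are illustrative assumptions, not values or code from the study.

```python
def gc3_content(cds: str) -> float:
    """Return the fraction of codons whose third base is G or C."""
    cds = cds.upper().replace("U", "T")
    if not cds or len(cds) % 3 != 0:
        raise ValueError("coding sequence must be non-empty and a multiple of 3 long")
    third_bases = cds[2::3]  # every third base, starting at codon position 3
    return sum(base in "GC" for base in third_bases) / len(third_bases)


def classify_by_gc3(cds: str, threshold: float = 0.5) -> str:
    """Label a coding sequence by GC3; the threshold is a hypothetical cutoff."""
    return "high-GC3" if gc3_content(cds) >= threshold else "low-GC3"


# Toy coding sequences (invented for illustration only).
HIGH_EXAMPLE = "ATGGCCGCGCTGTAG"  # third bases: G, C, G, G, G
LOW_EXAMPLE = "ATGGCTGCATTATAA"   # third bases: G, T, A, A, A

if __name__ == "__main__":
    for name, seq in [("high", HIGH_EXAMPLE), ("low", LOW_EXAMPLE)]:
        print(name, round(gc3_content(seq), 2), classify_by_gc3(seq))
```

    In practice the cutoff would be chosen from the observed GC3 distribution of the transcript set rather than fixed in advance.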

  17. [cDNA cloning and sequence analysis of the seventh segment of maize rough dwarf virus genome].

    PubMed

    Deng, W; Yang, X; Zhang, Y; Liu, Y; Kang, L

    2000-10-01

    Double-stranded RNA of maize rough dwarf virus (MRDV) was prepared from symptomatic maize samples collected in Luanchen county, Hebei province, China. Primers were designed from the known MRDV sequence, the cDNA of the seventh genome segment (S7) was obtained by RT-PCR, and the S7 sequence was analyzed by computer after sequencing. The full-length S7 cDNA is 1936 bp, equal in length to the S7 cDNA reported from abroad, and the two open reading frames (ORF1 and ORF2) contained in the S7 segment are also unchanged. Compared with the S7 segment from Italy, the nucleotide sequence homology is 87.7% and the ORF1 amino acid sequence homology is 91.6%. However, the MRDV S7 segment and the rice black-streaked dwarf virus S8 segment showed 95.5% nucleotide identity and 93.5% ORF1 amino acid identity.

  18. Nucleotide sequence of the pyruvate decarboxylase gene from Zymomonas mobilis.

    PubMed

    Neale, A D; Scopes, R K; Wettenhall, R E; Hoogenraad, N J

    1987-02-25

    Pyruvate decarboxylase (EC 4.1.1.1), the penultimate enzyme in the alcoholic fermentation pathway of Zymomonas mobilis, converts pyruvate to acetaldehyde and carbon dioxide. The complete nucleotide sequence of the structural gene encoding pyruvate decarboxylase from Zymomonas mobilis has been determined. The coding region is 1704 nucleotides long and encodes a polypeptide of 567 amino acids with a calculated subunit mass of 60,790 daltons. The amino acid sequence was confirmed by comparison with the amino acid sequence of a selection of tryptic fragments of the enzyme. The amino acid composition obtained from the nucleotide sequence is in good agreement with that obtained experimentally.

  19. Nucleotide sequence and expression of the 14-3-3 from the halotolerant alga Dunaliella salina.

    PubMed

    Wang, Tian-yun; Jing, Chang-Qin; Dong, Wei-Hua; Zhang, Jun-He; Zhang, Yu

    2010-02-01

    We previously reported the nucleotide sequence of a 14-3-3 cDNA cloned from the unicellular green alga Dunaliella salina; however, the nucleotide sequence of the gene itself has not been reported so far. In the present study, we cloned and characterized the gene's nucleotide sequence and examined its copy number and expression. The coding sequence of the gene was found to be interrupted by five introns of 132, 266, 153, 152 and 625 bp, respectively. Introns 3-5 were found in conserved positions as compared to the Chlamydomonas reinhardtii 14-3-3 gene. D. salina 14-3-3 cDNA was inserted into the prokaryotic expression plasmid pET-28 and transformed into E. coli BL21, and the recombinant 14-3-3 protein was purified from E. coli and used to immunize a rabbit. Indirect ELISA with 14-3-3 as the coating antigen showed that the rabbit antiserum titer was 1:1.00E + 06. Western blotting confirmed that the rabbit antibodies recognized the recombinant 14-3-3 protein. Southern blotting showed that only one copy of the 14-3-3 gene is present in the genome of D. salina, and 14-3-3 expression did not change throughout the Dunaliella cell cycle.

  20. Nucleotide sequence and the encoded amino acids of human apolipoprotein A-I mRNA.

    PubMed Central

    Law, S W; Brewer, H B

    1984-01-01

    The cDNA clones encoding the precursor form of human liver apolipoprotein A-I (apoA-I), preproapoA-I, have been isolated from a cDNA library. A 17-base synthetic oligonucleotide based on residues 108-113 of apoA-I and a 26-base primer-extended, dideoxynucleotide-terminated cDNA were used as hybridization probes to select for recombinant plasmids bearing the apoA-I sequence. The complete nucleic acid sequence of human liver preproapoA-I has been determined by analysis of the cloned cDNA. The sequence is composed of 801 nucleotides encoding 267 amino acid residues. PreproapoA-I contains an 18-amino-acid prepeptide and a 6-amino-acid propeptide connected to the amino terminus of the 243-amino acid mature apoA-I. Southern blotting analysis of chromosomal DNA obtained from peripheral blood indicated the apoA-I gene is contained in a 2.1-kilobase-pair Pst I fragment and there is no gross difference in structural organization between the normal apoA-I gene and the Tangier disease apoA-I gene. Images PMID:6198645

  1. Nucleotide sequence of papaya mosaic virus RNA.

    PubMed

    Sit, T L; Abouhaidar, M G; Holy, S

    1989-09-01

    The RNA genome of papaya mosaic virus is 6656 nucleotides long [excluding the poly(A) tail] with six open reading frames (ORFs) more than 200 nucleotides long. The four nearest the 5' end each overlap with adjacent ORFs and could code for proteins with Mr 176307, 26248, 11949 and 7224 (ORFs 1 to 4). The fifth ORF produces the capsid protein of Mr 23043 and the sixth ORF, located completely within ORF1, could code for a protein with Mr 14113. The translation products of ORFs 1 to 3 show strong similarity with those of other potexviruses but the ORF 4 protein has only limited similarity with the other potexvirus ORF 4 proteins of 7K to 11K.

  2. Sequence of a cDNA encoding nitrite reductase from the tree Betula pendula and identification of conserved protein regions.

    PubMed

    Friemann, A; Brinkmann, K; Hachtel, W

    1992-02-01

    The sequence of an mRNA encoding nitrite reductase (NiR, EC 1.7.7.1.) from the tree Betula pendula was determined. A cDNA library constructed from leaf poly(A)+ mRNA was screened with an oligonucleotide probe deduced from NiR sequences from spinach and maize. A 2.5 kb cDNA was isolated that hybridized to an mRNA, the steady-state level of which increased markedly upon induction with nitrate. The nucleotide sequence of the cDNA contains a reading frame encoding a protein of 583 amino acids that reveals 79% identity with NiR from spinach. The transit peptide of the NiR precursor from birch was determined to be 22 amino acids in size by sequence comparison with NiR from spinach and maize and is the shortest transit peptide reported so far. A graphical evaluation of identities found in the NiR sequence alignment revealed nine well conserved sections each exceeding ten amino acids in size. Sequence comparisons with related redox proteins identified essential residues involved in cofactor binding. A putative binding site for ferredoxin was found in the N-terminal half of the protein.

  3. The complete nucleotide sequence and genome organization of pea streak virus (genus Carlavirus).

    PubMed

    Su, Li; Li, Zhengnan; Bernardy, Mike; Wiersma, Paul A; Cheng, Zhihui; Xiang, Yu

    2015-10-01

    Pea streak virus (PeSV) is a member of the genus Carlavirus in the family Betaflexiviridae. Here, the first complete genome sequence of PeSV was determined by deep sequencing of a cDNA library constructed from dsRNA extracted from a PeSV-infected sample and Rapid Amplification of cDNA Ends (RACE) PCR. The PeSV genome consists of 8041 nucleotides excluding the poly(A) tail and contains six open reading frames (ORFs). The putative peptide encoded by the PeSV ORF6 has an estimated molecular mass of 6.6 kDa and shows no similarity to any known proteins. This differs from typical carlaviruses, whose ORF6 encodes a 12- to 18-kDa cysteine-rich nucleic-acid-binding protein.

  4. Complete nucleotide sequence and construction of an infectious clone of Chinese yam necrotic mosaic virus suggest that macluraviruses have the smallest genome among members of the family Potyviridae.

    PubMed

    Kondo, Toru; Fujita, Takashi

    2012-12-01

    The complete nucleotide sequence of Chinese yam necrotic mosaic virus (CYNMV) was determined from cloned virus cDNA. The CYNMV genomic RNA is 8224 nucleotides in length, excluding the poly(A) tail, and contains one long open reading frame encoding a large polyprotein of 2620 amino acids. CYNMV has no counterpart to the P1 cistron and a short HC-Pro cistron located at the 5' side of the potyvirus genome. A full-length cDNA clone, pCYNMV, was assembled under the control of the cauliflower mosaic virus 35S promoter and the nopaline synthase terminator. Biolistic inoculation of Nagaimo plants with cDNA resulted in systemic necrotic mosaic symptoms typical of CYNMV infection. To our knowledge, this is the first report of the complete nucleotide sequence and construction of an infectious cDNA clone of a member of the genus Macluravirus.

  5. Reading biological processes from nucleotide sequences

    NASA Astrophysics Data System (ADS)

    Murugan, Anand

    Cellular processes have traditionally been investigated by techniques of imaging and biochemical analysis of the molecules involved. The recent rapid progress in our ability to manipulate and read nucleic acid sequences gives us direct access to the genetic information that directs and constrains biological processes. While sequence data is being used widely to investigate genotype-phenotype relationships and population structure, here we use sequencing to understand biophysical mechanisms. We present work on two different systems. First, in chapter 2, we characterize the stochastic genetic editing mechanism that produces diverse T-cell receptors in the human immune system. We do this by inferring statistical distributions of the underlying biochemical events that generate T-cell receptor coding sequences from the statistics of the observed sequences. This inferred model quantitatively describes the potential repertoire of T-cell receptors that can be produced by an individual, providing insight into its potential diversity and the probability of generation of any specific T-cell receptor. Then in chapter 3, we present work on understanding the functioning of regulatory DNA sequences in both prokaryotes and eukaryotes. Here we use experiments that measure the transcriptional activity of large libraries of mutagenized promoters and enhancers and infer models of the sequence-function relationship from this data. For the bacterial promoter, we infer a physically motivated 'thermodynamic' model of the interaction of DNA-binding proteins and RNA polymerase determining the transcription rate of the downstream gene. For the eukaryotic enhancers, we infer heuristic models of the sequence-function relationship and use these models to find synthetic enhancer sequences that optimize inducibility of expression. Both projects demonstrate the utility of sequence information in conjunction with sophisticated statistical inference techniques for dissecting underlying biophysical

  6. Nucleotide sequence of SHV-2 beta-lactamase gene

    SciTech Connect

    Garbarg-Chenon, A.; Godard, V.; Labia, R.; Nicolas, J.C.

    1990-07-01

    The nucleotide sequence of plasmid-mediated beta-lactamase SHV-2 from Salmonella typhimurium (SHV-2pHT1) was determined. The gene was very similar to chromosomally encoded beta-lactamase LEN-1 of Klebsiella pneumoniae. Compared with the sequence of the Escherichia coli SHV-2 enzyme (SHV-2E.coli) obtained by protein sequencing, the deduced amino acid sequence of SHV-2pHT1 differed by three amino acid substitutions.

  7. cDNA, genomic sequence cloning and overexpression of giant panda (Ailuropoda melanoleuca) mitochondrial ATP synthase ATP5G1.

    PubMed

    Hou, W-R; Hou, Y-L; Ding, X; Wang, T

    2012-09-03

    The ATP5G1 gene is one of the three genes that encode mitochondrial ATP synthase subunit c of the proton channel. We cloned the cDNA and determined the genomic sequence of the ATP5G1 gene from the giant panda (Ailuropoda melanoleuca) using RT-PCR technology and touchdown-PCR, respectively. The cloned cDNA fragment contains an open reading frame of 411 bp encoding 136 amino acids; the genomic sequence is 1838 bp long and contains three exons and two introns. Alignment analysis revealed that the nucleotide sequence and the deduced protein sequence are highly conserved compared to Homo sapiens, Mus musculus, Rattus norvegicus, Bos taurus, and Sus scrofa. The homologies for nucleotide sequences of the giant panda ATP5G1 to those of these species are 93.92, 92.21, 92.46, 93.67, and 92.46%, respectively, and the homologies for amino acid sequences are 90.44, 95.59, 93.38, 94.12, and 91.91%, respectively. Topology prediction showed that there is one protein kinase C phosphorylation site, one casein kinase II phosphorylation site, five N-myristoylation sites, and one ATP synthase c subunit signature in the ATP5G1 protein of the giant panda. The cDNA of ATP5G1 was transfected into Escherichia coli, and the ATP5G1 fused with the N-terminally GST-tagged protein gave rise to accumulation of an expected 40-kDa polypeptide, which had the characteristics of the predicted protein.

  8. cDNA sequences of variant forms of human placenta diamine oxidase

    SciTech Connect

    Zhang, X.; Kim, J.; McIntire, S.

    1995-08-01

    Genes for two forms of human placenta diamine oxidase (dao) were cloned from a cDNA library and sequenced. One gene, pdao1, is identical in length to human kidney dao but differs from it by two bases in the coding region and differs slightly in the 3'- and 5'-noncoding regions. The second gene, pdao2, is nearly identical to these genes in the coding region, except that it has an extra 57-nucleotide coding segment near the 3' end of this region. This segment corresponds to the contiguous sequence of the 3' end of intron 3 of human kidney dao. pdao2 also differs significantly from pdao1 and human kidney dao in a 13-base sequence in the t'-noncoding region. It is proposed that pdao1 and human kidney dao are polymorphic forms of the same allele. Whether pdao2 is a polymorph of these two is not certain, because of the significant differences in the coding and noncoding regions. pdao2 may represent a different allele. 21 refs., 2 figs.

  9. Nucleotide sequence of the coat protein gene of canine parvovirus.

    PubMed Central

    Rhode, S L

    1985-01-01

    The nucleotide sequence of the canine parvovirus (CPV2) from map units 33 to 95 has been determined. This includes the entire coat protein gene and noncoding sequences at the 3' end of the gene, exclusive of the terminal inverted repeat. The predicted capsid protein structures are discussed and compared with those of the rodent parvoviruses H-1 and MVM. PMID:3989914

  10. [Tabular excel editor for analysis of aligned nucleotide sequences].

    PubMed

    Demkin, V V

    2010-01-01

    The Excel platform was used to convert multiple alignments of nucleotide sequences obtained from the BLAST network service into a form suitable for visual analysis and editing. Two macros for MS Excel 2007 were written. The array of aligned sequences, once transformed into an Excel table and processed with these macros, is more convenient for analysis than the initial HTML data.

  11. The Nucleotide Sequence of the lac Operator

    PubMed Central

    Gilbert, Walter; Maxam, Allan

    1973-01-01

    The lac repressor protects the lac operator against digestion with deoxyribonuclease. The protected fragment is double-stranded and about 27 base-pairs long. We determined the sequence of RNA transcription copies of this fragment and present a sequence for 24 base pairs. It is:

        5′--T G G A A T T G T G A G C G G A T A A C A A T T 3′
        3′--A C C T T A A C A C T C G C C T A T T G T T A A 5′

    The sequence has 2-fold symmetry regions; the two longest are separated by one turn of the DNA double helix. PMID:4587255
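
    The two strands quoted in this abstract can be checked for base-pair complementarity, and the top strand can be compared with its own reverse complement to locate positions that obey 2-fold symmetry. The sketch below is illustrative only: it places the symmetry axis at the midpoint of the printed 24-base fragment, which is a simplifying assumption and not necessarily the axis identified by the authors.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Strands as printed in the abstract, spaces removed.
TOP = "TGGAATTGTGAGCGGATAACAATT"     # written 5' -> 3'
BOTTOM = "ACCTTAACACTCGCCTATTGTTAA"  # written 3' -> 5'


def complement(seq: str) -> str:
    """Base-by-base Watson-Crick complement, same left-to-right order."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Complementary strand read 5' -> 3'."""
    return complement(seq)[::-1]


def symmetric_positions(seq: str) -> list[int]:
    """0-based positions where seq agrees with its reverse complement,
    i.e. where the base pairs with its mirror image about the midpoint."""
    rc = reverse_complement(seq)
    return [i for i, (a, b) in enumerate(zip(seq, rc)) if a == b]


if __name__ == "__main__":
    print("strands complementary:", complement(TOP) == BOTTOM)
    print("symmetric positions (midpoint axis):", symmetric_positions(TOP))
```

    Scanning other candidate axes would only require sliding the comparison window; that extension is left out to keep the sketch short.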

  12. 77 FR 65537 - Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-10-29

    ... Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request. SUMMARY: The United States....'' SUPPLEMENTARY INFORMATION: I. Abstract Patent applications that contain nucleotide and/or amino acid...

  13. Analysis of cloned cDNA and genomic sequences for phytochrome: complete amino acid sequences for two gene products expressed in etiolated Avena.

    PubMed Central

    Hershey, H P; Barker, R F; Idler, K B; Lissemore, J L; Quail, P H

    1985-01-01

    Cloned cDNA and genomic sequences have been analyzed to deduce the amino acid sequence of phytochrome from etiolated Avena. Restriction endonuclease site polymorphism between clones indicates that at least four phytochrome genes are expressed in this tissue. Sequence analysis of two complete and one partial coding region shows approximately 98% homology at both the nucleotide and amino acid levels, with the majority of amino acid changes being conservative. High sequence homology is also found in the 5'-untranslated region but significant divergence occurs in the 3'-untranslated region. The phytochrome polypeptides are 1128 amino acid residues long corresponding to a molecular mass of 125 kdaltons. The known protein sequence at the chromophore attachment site occurs only once in the polypeptide, establishing that phytochrome has a single chromophore per monomer covalently linked to Cys-321. Computer analyses of the amino acid sequences have provided predictions regarding a number of structural features of the phytochrome molecule. PMID:3001642

  14. Identification of cDNA clones encoding secretory isoenzyme forms: sequence determination of canine pancreatic prechymotrypsinogen 2 mRNA.

    PubMed Central

    Pinsky, S D; LaForge, K S; Luc, V; Scheele, G

    1983-01-01

    A cDNA library has been constructed from canine poly(A)+ mRNA. Clones containing cDNA inserts coding for prechymotrypsinogen 2 (isoelectric point = 7.1; Mr = 27,500), one of three canine pancreatic isoenzyme forms, were selected by colony hybridization using a cDNA probe synthesized from immunoselected prechymotrypsinogen 2 mRNA. To verify that cDNA clones code for prechymotrypsinogen 2 forms that translocate across rough endoplasmic reticulum membranes and fold into stable and identifiable secretory proteins, we conducted in vitro translation of hybrid-selected mRNA in the presence of microsomal membranes and optimal concentrations of glutathione and analyzed nascent translation products in their nonreduced state by two-dimensional isoelectric focusing/NaDodSO4 gel electrophoresis and fluorography. A near full-length chymotrypsinogen 2 cDNA and its primed extension were used to determine the nucleotide sequence for the entire coding region of prechymotrypsinogen 2 mRNA and 87 residues, including a poly(A) addition signal, in the 3' nontranslated region. The deduced amino acid sequence shows a 263-residue presecretory protein containing an 18-residue amino-terminal transport peptide (Met-Ala-Phe-Leu-Trp-Leu-Leu-Ser-Cys-Phe-Ala-Leu-Leu-Gly-Thr-Ala-Phe-Gly ), which we have previously shown to mediate the translocation of chymotrypsinogen 2 across the rough endoplasmic reticulum membrane. Following the transport peptide is a 245-residue proenzyme, which shows 82% and 80% sequence identity with bovine chymotrypsinogens A and B, respectively. Conserved among the three zymogens are 10 Cys residues that form five disulfide bonds in bovine chymotrypsinogens A and B and the residues that are required for zymogen activation, substrate binding, and catalytic activity. Images PMID:6584866
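
    As a quick consistency check on the figures in this abstract, the quoted transport peptide can be converted to one-letter code and counted, and the peptide and proenzyme lengths can be summed. This is a minimal sketch; the variable and function names are illustrative assumptions.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Transport peptide exactly as quoted in the abstract.
TRANSPORT_PEPTIDE = (
    "Met-Ala-Phe-Leu-Trp-Leu-Leu-Ser-Cys-Phe-Ala-Leu-Leu-Gly-Thr-Ala-Phe-Gly"
)
PROENZYME_LENGTH = 245     # residues, from the abstract
PRESECRETORY_LENGTH = 263  # residues, from the abstract


def to_one_letter(peptide: str) -> str:
    """Convert a hyphen-separated three-letter peptide to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in peptide.split("-"))


one_letter = to_one_letter(TRANSPORT_PEPTIDE)

if __name__ == "__main__":
    print(one_letter, len(one_letter))
    print(len(one_letter) + PROENZYME_LENGTH == PRESECRETORY_LENGTH)
```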

  15. Nucleotide sequence composition and method for detection of Neisseria gonorrhoeae

    SciTech Connect

    Lo, A.; Yang, H.L.

    1990-02-13

    This patent describes a composition of matter that is specific for Neisseria gonorrhoeae. It comprises at least one nucleotide sequence for which the ratio of the amount of the sequence that hybridizes to chromosomal DNA of Neisseria gonorrhoeae to the amount that hybridizes to chromosomal DNA of Neisseria meningitidis is greater than about five, the ratio being obtained by a method described.

  16. Cloning and characterization of a highly repetitive fish nucleotide sequence.

    PubMed

    Datta, U; Dutta, P; Mandal, R K

    1988-01-01

    We have cloned and sequenced a highly repetitive HindIII fragment of DNA from the common carp Cyprinus carpio. It represents a tandemly repeated sequence with a monomeric unit of 245 bp and comprises 8% of the fish genome. Higher units of this monomer appear as a ladder in Southern blots. The monomeric unit has been sequenced; it is A + T-rich with some direct and some inverse-repeat nucleotide clusters.

  17. Primary structure of bovine pituitary secretory protein I (chromogranin A) deduced from the cDNA sequence

    SciTech Connect

    Ahn, T.G.; Cohn, D.V.; Gorr, S.U.; Ornstein, D.L.; Kashdan, M.A.; Levine, M.A.

    1987-07-01

    Secretory protein I (SP-I), also referred to as chromogranin A, is an acidic glycoprotein that has been found in every tissue of endocrine and neuroendocrine origin examined but never in exocrine or epithelial cells. Its co-storage and co-secretion with peptide hormones and neurotransmitters suggest that it has an important endocrine or secretory function. The authors have isolated cDNA clones from a bovine pituitary lambda gt11 expression library using an antiserum to parathyroid SP-I. The largest clone (SP4B) hybridized to a transcript of 2.1 kilobases in RNA from parathyroid, pituitary, and adrenal medulla. Immunoblots of bacterial lysates derived from SP4B lysogens demonstrated specific antibody binding to an SP4B/beta-galactosidase fusion protein (160 kDa) with a cDNA-derived component of 46 kDa. Radioimmunoassay of the bacterial lysates with SP-I antiserum yielded parallel displacement curves of 125I-labeled SP-I by the SP4B lysate and authentic SP-I. SP4B contains a cDNA of 1614 nucleotides that encodes a 449-amino acid protein (calculated mass, 50 kDa). The nucleotide sequences of the pituitary SP-I cDNA and adrenal medullary SP-I cDNAs are nearly identical. Analysis of genomic DNA suggests that pituitary, adrenal, and parathyroid SP-I are products of the same gene.

  18. The complete sequence of a full length cDNA for human liver glyceraldehyde-3-phosphate dehydrogenase: evidence for multiple mRNA species.

    PubMed Central

    Arcari, P; Martinelli, R; Salvatore, F

    1984-01-01

    A recombinant M13 clone (O42) containing a 65 b.p. cDNA fragment from human fetal liver mRNA coding for glyceraldehyde-3-phosphate dehydrogenase has been identified and used to isolate, from a full-length human adult liver cDNA library, a recombinant clone, pG1, which was subcloned in M13 phage and completely sequenced with the chain terminator method. Besides the coding region of 1008 b.p., the cDNA sequence includes 60 nucleotides at the 5'-end and 204 nucleotides at the 3'-end up to the polyA tail. Hybridization of pG1 to human liver total RNA shows only one band about the size of pG1 cDNA. A much stronger hybridization signal was observed using RNA derived from human hepatocarcinoma and kidney carcinoma cell lines. Sequence homology between clone 042 and the homologous region of clone pG1 is 86%. On the other hand, homology among the translated sequences and the known human muscle protein sequence ranges between 77 and 90%; these data demonstrate the existence of more than one gene coding for G3PD. Southern blots of human DNA digested with several restriction enzymes also indicate that several homologous sequences are present in the human genome. Images PMID:6096821

  19. Cloning and genomic nucleotide sequence of the matrix attachment region binding protein from the halotolerant alga Dunaliella salina.

    PubMed

    Wang, Peng-Ju; Wang, Tian-Yun; Wang, Ya-Feng; Yang, Rui; Li, Zhao-Xi

    2013-07-01

    In our previous study, the sequence of a matrix attachment region binding protein (MBP) cDNA was cloned from the unicellular green alga Dunaliella salina. However, the nucleotide sequence of this gene has not been reported so far. In this paper, the nucleotide sequence of MBP was cloned and characterized, and its gene copy number was determined. The MBP nucleotide sequence is 5641 bp long, and interrupted by 12 introns ranging from 132 to 562 bp. All the introns in the D. salina MBP gene have orthodox splice sites, exhibiting GT at the 5' end and AG at the 3' end. Southern blot analysis showed that MBP only has one copy in the D. salina genome.

  20. Nucleotide correlations and electronic transport of DNA sequences

    NASA Astrophysics Data System (ADS)

    Albuquerque, E. L.; Vasconcelos, M. S.; Lyra, M. L.; de Moura, F. A. B. F.

    2005-02-01

    We use a tight-binding formulation to investigate the transmissivity and wave-packet dynamics of sequences of single-strand DNA molecules made up from the nucleotides guanine (G), adenine (A), cytosine (C), and thymine (T). In order to reveal the relevance of the underlying correlations in the nucleotide distribution, we compare the results for the genomic DNA sequence with those of two artificial sequences: (i) the Rudin-Shapiro one, which has long-range correlations; (ii) a random sequence, which is a kind of prototype of a short-range correlated system, presented here with the same first-neighbor pair correlations of the human DNA sequence. We found that the long-range character of the correlations is important to the persistence of resonances of finite segments. On the other hand, the wave-packet dynamics seems to be mostly influenced by the short-range correlations.
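
    For orientation, the classic two-valued Rudin-Shapiro sequence can be generated in a few lines. The sketch below uses the standard definition (the sign is set by the parity of the number of adjacent "11" pairs in the binary expansion of n). How the authors map the sequence onto the four nucleotides is not stated in the abstract, so no such mapping is attempted here.

```python
def rudin_shapiro(n_terms: int) -> list[int]:
    """First n_terms of the +1/-1 Rudin-Shapiro sequence.

    Term n is -1 when the binary expansion of n contains an odd number of
    (possibly overlapping) adjacent '11' pairs, and +1 otherwise.
    """
    terms = []
    for n in range(n_terms):
        # n & (n >> 1) has a 1 bit wherever two adjacent bits of n are both 1.
        pair_count = bin(n & (n >> 1)).count("1")
        terms.append(-1 if pair_count % 2 else 1)
    return terms


if __name__ == "__main__":
    print(rudin_shapiro(16))
```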

  1. The complete nucleotide sequence of bean yellow mosaic potyvirus RNA.

    PubMed

    Guyatt, K J; Proll, D F; Menssen, A; Davidson, A D

    1996-01-01

    The complete nucleotide sequence of an Australian strain of bean yellow mosaic virus (BYMV-S) has been determined from cloned viral cDNAs. The BYMV-S genome is 9547 nucleotides in length excluding a poly(A) tail. Computer analysis of the sequence revealed a single long open reading frame (ORF) of 9168 nucleotides, commencing at position 206 and terminating with UAG at position 9374-6. The ORF potentially encodes a polyprotein of 3056 amino acids with a deduced Mr of 347409. The 5' and 3' untranslated regions are 205 and 174 nucleotides in length respectively. Alignment of the amino acid sequence of the BYMV-S polyprotein with those of other potyviruses identified nine putative proteolytic cleavage sites. The predicted consensus cleavage site of the BYMV NIa protease was found to differ from that described for other potyviruses. Processing of the BYMV polyprotein at the designated proteolytic cleavage sites would result in a typical potyviral genome arrangement. The amino acid sequences of the putative BYMV encoded proteins were compared to the homologous gene products of twelve individual potyviruses to identify overall and specific regions of amino acid sequence homology.

  2. Cloning and sequencing of dolphinfish (Coryphaena hippurus, Coryphaenidae) growth hormone-encoding cDNA.

    PubMed

    Peduel, A D; Elizur, A; Knibb, W

    1994-01-01

    The cDNA encoding the preprotein growth hormone from the dolphinfish (Coryphaena hippurus) has been cloned and sequenced. The cDNA was derived by reverse transcription of RNA from the pituitary of a young fish using the method known as Rapid Amplification of cDNA Ends (RACE). An oligonucleotide primer corresponding to the 5' region of Pagrus major and the universal RACE primer enabled amplification using the Polymerase Chain Reaction (PCR). The dolphinfish and yellow-tail, Seriola quinqueradiata, are both members of the sub-order Percoidei (Perciformes) and their GH sequences show a high level of homology.

  3. cDNA, genomic sequence cloning and overexpression of glyceraldehyde-3-phosphate dehydrogenase gene (GAPDH) from the Giant Panda.

    PubMed

    Hou, Wan-Ru; Hou, Yi-Ling; Du, Yu-Jie; Zhang, Tian; Hao, Yan-Zhe

    2010-01-01

    GAPDH (glyceraldehyde-3-phosphate dehydrogenase) is a key enzyme of the glycolytic pathway and it is related to the occurrence of some diseases. The cDNA and the genomic sequence of GAPDH were cloned successfully from the Giant Panda (Ailuropoda melanoleuca) using RT-PCR and Touchdown-PCR, respectively. Both sequences were analyzed preliminarily. The cDNA of GAPDH cloned from the Giant Panda is 1191 bp in size and contains an open reading frame of 1002 bp encoding 333 amino acids. The genomic sequence is 3941 bp in length and was found to possess 10 exons and 9 introns. Alignment analysis indicates that the nucleotide sequence and the deduced amino acid sequence are highly conserved in some mammalian species, including Homo sapiens, Mus musculus, Rattus norvegicus, Canis lupus familiaris and Bos taurus. The homologies for the nucleotide sequences of the Giant Panda GAPDH to that of these species are 90.67, 90.92, 90.62, 95.01 and 92.32% respectively, while the homologies for the amino acid sequences are 94.93, 95.5, 95.8, 98.8 and 97.0%. Primary structure analysis revealed that the molecular weight of the putative GAPDH protein is 35.7899 kDa with a theoretical pI of 8.21. Topology prediction showed that there is one Glyceraldehyde 3-phosphate dehydrogenase active site, two N-glycosylation sites, four Casein kinase II phosphorylation sites, seven Protein kinase C phosphorylation sites and eight N-myristoylation sites in the GAPDH protein of the Giant Panda. The GAPDH gene was overexpressed in E. coli BL21. The results indicated that the fusion of GAPDH with the N-terminally His-tagged form gave rise to the accumulation of an expected 43 kDa polypeptide. The SDS-PAGE analysis also showed that the recombinant GAPDH was soluble and thus could be used for further functional studies.

  4. Nucleotide sequence of the capsid protein gene and 3' non-coding region of papaya mosaic virus RNA.

    PubMed

    Abouhaidar, M G

    1988-01-01

    The nucleotide sequences of cDNA clones corresponding to the 3' OH end of papaya mosaic virus RNA have been determined. The 3'-terminal sequence obtained was 900 nucleotides in length, excluding the poly(A) tail, and contained an open reading frame capable of giving rise to a protein of 214 amino acid residues with an Mr of 22930. This protein was identified as the viral capsid protein. The 3' non-coding region of PMV genome RNA was about 121 nucleotides long [excluding the poly(A) tail] and homologous to the complementary sequence of the non-coding region at the 5' end of PMV RNA. A long open reading frame was also found in the predicted 5' end region of the negative strand.

  5. Nucleotide sequencing and identification of some wild mushrooms.

    PubMed

    Das, Sudip Kumar; Mandal, Aninda; Datta, Animesh K; Gupta, Sudha; Paul, Rita; Saha, Aditi; Sengupta, Sonali; Dubey, Priyanka Kumari

    2013-01-01

    The rDNA-ITS (Ribosomal DNA Internal Transcribed Spacers) fragment of the genomic DNA of 8 wild edible mushrooms (collected from Eastern Chota Nagpur Plateau of West Bengal, India) was amplified using ITS1 (Internal Transcribed Spacers 1) and ITS2 primers and subjected to nucleotide sequence determination for identification of mushrooms as mentioned. The sequences were aligned using ClustalW software program. The aligned sequences revealed identity (homology percentage from GenBank data base) of Amanita hemibapha [CN (Chota Nagpur) 1, % identity 99 (JX844716.1)], Amanita sp. [CN 2, % identity 98 (JX844763.1)], Astraeus hygrometricus [CN 3, % identity 87 (FJ536664.1)], Termitomyces sp. [CN 4, % identity 90 (JF746992.1)], Termitomyces sp. [CN 5, % identity 99 (GU001667.1)], T. microcarpus [CN 6, % identity 82 (EF421077.1)], Termitomyces sp. [CN 7, % identity 76 (JF746993.1)], and Volvariella volvacea [CN 8, % identity 100 (JN086680.1)]. Although out of 8 mushrooms 4 could be identified up to species level, the nucleotide sequences of the rest may be relevant to further characterization. A phylogenetic tree is constructed using Neighbor-Joining method showing interrelationship between/among the mushrooms. The determined nucleotide sequences of the mushrooms may provide additional information enriching GenBank database aiding to molecular taxonomy and facilitating its domestication and characterization for human benefits.
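
    The percent-identity figures quoted above compare each query sequence with its best GenBank match. As a minimal sketch of how such a figure can be computed from a pairwise alignment, the snippet below counts matching columns. The function name, the toy sequences, and the choice of denominator (all alignment columns) are assumptions for illustration; the record does not state which convention was used.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Assumption: both strings are already aligned (equal length, '-' marks
    a gap) and identity = matching columns / total alignment columns.
    Other conventions (e.g. excluding gapped columns) give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"  # a gap never counts as a match
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment, not data from the study.
toy_identity = percent_identity("ACGT-ACGTA", "ACGTTACGAA")
print(f"{toy_identity:.1f}% identity")
```

    With this convention a gap column lowers the identity, which is one reason reported values depend on the alignment tool and its settings.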

  6. Nucleotide Sequencing and Identification of Some Wild Mushrooms

    PubMed Central

    Das, Sudip Kumar; Mandal, Aninda; Datta, Animesh K.; Gupta, Sudha; Paul, Rita; Saha, Aditi; Sengupta, Sonali; Dubey, Priyanka Kumari

    2013-01-01

    The rDNA-ITS (Ribosomal DNA Internal Transcribed Spacers) fragment of the genomic DNA of 8 wild edible mushrooms (collected from Eastern Chota Nagpur Plateau of West Bengal, India) was amplified using ITS1 (Internal Transcribed Spacers 1) and ITS2 primers and subjected to nucleotide sequence determination for identification of mushrooms as mentioned. The sequences were aligned using ClustalW software program. The aligned sequences revealed identity (homology percentage from GenBank data base) of Amanita hemibapha [CN (Chota Nagpur) 1, % identity 99 (JX844716.1)], Amanita sp. [CN 2, % identity 98 (JX844763.1)], Astraeus hygrometricus [CN 3, % identity 87 (FJ536664.1)], Termitomyces sp. [CN 4, % identity 90 (JF746992.1)], Termitomyces sp. [CN 5, % identity 99 (GU001667.1)], T. microcarpus [CN 6, % identity 82 (EF421077.1)], Termitomyces sp. [CN 7, % identity 76 (JF746993.1)], and Volvariella volvacea [CN 8, % identity 100 (JN086680.1)]. Although out of 8 mushrooms 4 could be identified up to species level, the nucleotide sequences of the rest may be relevant to further characterization. A phylogenetic tree is constructed using Neighbor-Joining method showing interrelationship between/among the mushrooms. The determined nucleotide sequences of the mushrooms may provide additional information enriching GenBank database aiding to molecular taxonomy and facilitating its domestication and characterization for human benefits. PMID:24489501

  7. Method for the detection of specific nucleic acid sequences by polymerase nucleotide incorporation

    DOEpatents

    Castro, Alonso

    2004-06-01

    A method for rapid and efficient detection of a target DNA or RNA sequence is provided. A primer having a 3'-hydroxyl group at one end and having a sequence of nucleotides sufficiently homologous with an identifying sequence of nucleotides in the target DNA is selected. The primer is hybridized to the identifying sequence of nucleotides on the DNA or RNA sequence and a reporter molecule is synthesized on the target sequence by progressively binding complementary nucleotides to the primer, where the complementary nucleotides include nucleotides labeled with a fluorophore. Fluorescence emitted by fluorophores on single reporter molecules is detected to identify the target DNA or RNA sequence.

  8. Complete sequence analysis of cDNA clones encoding rat whey phosphoprotein: homology to a protease inhibitor.

    PubMed

    Dandekar, A M; Robinson, E A; Appella, E; Qasba, P K

    1982-07-01

    Lactoprotein clones have been isolated from a rat mammary gland recombinant library of cDNA plasmids. Clones p-Wp 52 and p-Wp 47 were shown by hybrid selection, in vitro translation, and immunoprecipitation to represent a cloned DNA sequence encoding rat whey phosphoprotein. We report here the nucleotide sequence of the cDNA insert of p-Wp 52 and show that it encodes the complete whey phosphoprotein sequence. The encoded sequence shows a high content of half-cystine, glutamic acid, aspartic acid, and serine but an absence of tyrosine. The half-cystines appear in unique arrangements and are repeated in two domains of the protein. The second domain has striking similarities with the second domain of the red sea turtle protease inhibitor. Clone p-Wp 52 has allowed the study of expression of whey phosphoprotein mRNA during functional differentiation of rat mammary gland and in mammary tumors. The whey phosphoprotein mRNA is detected during midpregnancy and lactation in the rat mammary gland but is barely detected in mammary tumors in which other milk protein mRNAs are expressed. The whey phosphoprotein gene in these tumors is hypermethylated, correlating with the reduced expression of this gene.

  9. Construction and EST sequencing of full-length, drought stress cDNA libraries for common beans (Phaseolus vulgaris L.)

    PubMed Central

    2011-01-01

    Background: Common bean is an important legume crop with only a moderate number of short expressed sequence tags (ESTs) made with traditional methods. The goal of this research was to use full-length cDNA technology to develop ESTs that would overlap with the beginning of open reading frames and therefore be useful for gene annotation of genomic sequences. The library was also constructed to represent genes expressed under drought, low soil phosphorus and high soil aluminum toxicity. We also undertook comparisons of the full-length cDNA library to two previous non-full clone EST sets for common bean. Results: Two full-length cDNA libraries were constructed: one for the drought tolerant Mesoamerican genotype BAT477 and the other one for the acid-soil tolerant Andean genotype G19833 which has been selected for genome sequencing. Plants were grown in three soil types using deep rooting cylinders subjected to drought and non-drought stress and tissues were collected from both roots and above ground parts. A total of 20,000 clones were selected robotically, half from each library. Then, nearly 10,000 clones from the G19833 library were sequenced with an average read length of 850 nucleotides. A total of 4,219 unigenes were identified consisting of 2,981 contigs and 1,238 singletons. These were functionally annotated with gene ontology terms and placed into KEGG pathways. Compared to other EST sequencing efforts in common bean, about half of the sequences were novel or represented the 5' ends of known genes. Conclusions: The present full-length cDNA libraries add to the technological toolbox available for common bean and our sequencing of these clones substantially increases the number of unique EST sequences available for the common bean genome. All of this should be useful for both functional gene annotation, analysis of splice site variants and intron/exon boundary determination by comparison to soybean genes or with common bean whole-genome sequences. In addition the

  10. cDNA cloning, sequence analysis, and chromosomal localization of the gene for human carnitine palmitoyltransferase

    SciTech Connect

    Finocchiaro, G.; Taroni, F.; Martin, A.L.; Colombo, I.; Tarelli, G.T.; DiDonato, S.; Rocchi, M.

    1991-01-15

    The authors have cloned and sequenced a cDNA encoding human liver carnitine palmitoyltransferase (CPTase), an inner mitochondrial membrane enzyme that plays a major role in the fatty acid oxidation pathway. Mixed oligonucleotide primers whose sequences were deduced from one tryptic peptide obtained from purified CPTase were used in a polymerase chain reaction, allowing the amplification of a 0.12-kilobase fragment of human genomic DNA encoding such a peptide. A 60-base-pair (bp) oligonucleotide synthesized on the basis of the sequence from this fragment was used for the screening of a cDNA library from human liver and hybridized to a cDNA insert of 2255 bp. This cDNA contains an open reading frame of 1974 bp that encodes a protein of 658 amino acid residues including 25 residues of an NH2-terminal leader peptide. The assignment of this open reading frame to human liver CPTase is confirmed by matches to seven different amino acid sequences of tryptic peptides derived from pure human CPTase and by the 82.2% homology with the amino acid sequence of rat CPTase. The NH2-terminal region of CPTase contains a leucine-proline motif that is shared by carnitine acetyl- and octanoyltransferases and by choline acetyltransferase. The gene encoding CPTase was assigned to human chromosome 1, region 1q12-1pter, by hybridization of CPTase cDNA with a DNA panel of 19 human-hamster somatic cell hybrids.

  11. Complete nucleotide sequence and genome organization of bovine parvovirus.

    PubMed Central

    Chen, K C; Shull, B C; Moses, E A; Lederman, M; Stout, E R; Bates, R C

    1986-01-01

    We determined the complete nucleotide sequence of bovine parvovirus (BPV), an autonomous parvovirus. The sequence is 5,491 nucleotides long. The terminal regions contain nonidentical imperfect palindromic sequences of 150 and 121 nucleotides. In the plus strand, there are three large open reading frames (left ORF, mid ORF, and right ORF) with coding capacities of 729, 255, and 685 amino acids, respectively. As with all parvoviruses studied to date, the left ORF of BPV codes for the nonstructural protein NS-1 and the right ORF codes for the major parts of the three capsid proteins. The mid ORF probably encodes the major part of the nonstructural protein NP-1. There are promoterlike sequences at map units 4.5, 12.8, and 38.7 and polyadenylation signals at map units 61.6, 64.6, and 98.5. BPV has little DNA homology with the defective parvovirus AAV, with the human autonomous parvovirus B19, or with the other autonomous parvoviruses sequenced (canine parvovirus, feline panleukopenia virus, H-1, and minute virus of mice). Even though the overall DNA homology of BPV with other parvoviruses is low, several small regions of high homology are observed when the amino acid sequences encoded by the left and right ORFs are compared. From these comparisons, it can be shown that the evolutionary relationship among the parvoviruses is B19 ↔ AAV ↔ BPV ↔ MVM. The highly conserved amino acid sequences observed among all parvoviruses may be useful in the identification and detection of parvoviruses and in the design of a general parvovirus vaccine. PMID:3783814

  12. cDNA sequence analysis of a 29-kDa cysteine-rich surface antigen of pathogenic Entamoeba histolytica

    SciTech Connect

    Torian, B.E.; Stroeher, V.L.; Stamm, W.E.; Flores, B.M.; Hagen, F.S.

    1990-08-01

    A λgt11 cDNA library was constructed from poly(U)-Sepharose-selected Entamoeba histolytica trophozoite RNA in order to clone and identify surface antigens. The library was screened with rabbit polyclonal anti-E. histolytica serum. A 700-base-pair cDNA insert was isolated and the nucleotide sequence was determined. The deduced amino acid sequence of the cDNA revealed a cysteine-rich protein. DNA hybridizations showed that the gene was specific to E. histolytica since the cDNA probe reacted with DNA from four axenic strains of E. histolytica but did not react with DNA from Entamoeba invadens, Acanthamoeba castellanii, or Trichomonas vaginalis. The insert was subcloned into the expression vector pGEX-1 and the protein was expressed as a fusion with the C terminus of glutathione S-transferase. Purified fusion protein was used to generate 22 monoclonal antibodies (mAbs) and a mouse polyclonal antiserum specific for the E. histolytica portion of the fusion protein. A 29-kDa protein was identified as a surface antigen when mAbs were used to immunoprecipitate the antigen from metabolically ³⁵S-labeled live trophozoites. The surface location of the antigen was corroborated by mAb immunoprecipitation of a 29-kDa protein from surface-¹²⁵I-labeled whole trophozoites as well as by the reaction of mAbs with live trophozoites in an indirect immunofluorescence assay performed at 4°C. Immunoblotting with mAbs demonstrated that the antigen was present on four axenic isolates tested. mAbs recognized epitopes on the 29-kDa native antigen on some but not all clinical isolates tested.

  13. Nucleotide-Specific Contrast for DNA Sequencing by Electron Spectroscopy

    PubMed Central

    Schmid, Andreas K.; Davis, Ronald W.

    2016-01-01

    DNA sequencing by imaging in an electron microscope is an approach that holds promise to deliver long reads with low error rates and without the need for amplification. Earlier work using transmission electron microscopes, which use high electron energies on the order of 100 keV, has shown that low contrast and radiation damage necessitate the use of heavy atom labeling of individual nucleotides, which increases the read error rates. Other prior work has shown that scattering electrons with much lower energy suppresses beam damage on DNA. Here we explore possibilities to increase contrast by employing two methods, X-ray photoelectron and Auger electron spectroscopy. Using bulk DNA samples with monomers of each base, both methods are shown to provide contrast mechanisms that can distinguish individual nucleotides without labels. Both spectroscopic techniques can be readily implemented in a low energy electron microscope, which may enable label-free DNA sequencing by direct imaging. PMID:27149617

  14. The nucleotide sequence of the human beta-globin gene.

    PubMed

    Lawn, R M; Efstratiadis, A; O'Connell, C; Maniatis, T

    1980-10-01

    We report the complete nucleotide sequence of the human beta-globin gene. The purpose of this study is to obtain information necessary to study the evolutionary relationships between members of the human beta-like globin gene family and to provide the basis for comparing normal beta-globin genes with those obtained from the DNA of individuals with genetic defects in hemoglobin expression.

  15. Cloning and sequencing of a dextranase-encoding cDNA from Penicillium minioluteum.

    PubMed

    Garcia, B; Margolles, E; Roca, H; Mateu, D; Raices, M; Gonzales, M E; Herrera, L; Delgado, J

    1996-10-01

    A cDNA from Penicillium minioluteum HI-4 encoding a dextranase (1,6-alpha-glucan hydrolase, EC 3.2.1.11) was isolated and characterized. cDNA clones corresponding to genes expressed in dextran-induced cultures were identified by differential hybridization. Southern hybridization and restriction mapping analysis of selected clones revealed four different groups of cDNAs. The dextranase cDNA was identified after expressing a cDNA fragment from each of the isolated groups of cDNA clones in the Escherichia coli T7 system. The expression of a 2 kb cDNA fragment in E. coli led to the production of a 67 kDa protein which was recognized by an anti-dextranase polyclonal antibody. The cDNA contains 2109 bp plus a poly(A) tail, coding for a protein of 608 amino acids, including 20 N-terminal amino acid residues which might correspond to a signal peptide. There was 29% sequence identity between the P. minioluteum dextranase and the dextranase from Arthrobacter sp. CB-8.

  16. The complete nucleotide sequence of pelargonium leaf curl virus.

    PubMed

    McGavin, Wendy J; MacFarlane, Stuart A

    2016-05-01

    Investigation of a tombusvirus isolated from tulip plants in Scotland revealed that it was pelargonium leaf curl virus (PLCV) rather than the originally suggested tomato bushy stunt virus. The complete sequence of the PLCV genome was determined for the first time, revealing it to be 4789 nucleotides in size and to have an organization similar to that of the other, previously described tombusviruses. Primers derived from the sequence were used to construct a full-length infectious clone of PLCV that recapitulates the disease symptoms of leaf curling in systemically infected pelargonium plants.

  17. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  18. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2012-07-01 2012-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  19. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2014-07-01 2014-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  20. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2011-07-01 2011-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  1. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2013-07-01 2013-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  2. Molecular cloning and characterization of a new cDNA sequence encoding a venom peptide from the centipede Scolopendra subspinipes mutilans.

    PubMed

    Liu, Wanhong; Luo, Feng; He, Jing; Cao, Zhijian; Miao, Lixia

    2012-01-01

    Many studies have been performed on venomous peptides derived from animals. However, little of this research has focused on peptides from centipede venoms. Here, a venom gland cDNA library was successfully constructed for the centipede Scolopendra subspinipes mutilans. A new cDNA encoding the precursor of a venom peptide, named SsmTx, was cloned from the venomous gland cDNA library of the centipede S. subspinipes mutilans. The full-length SsmTx cDNA sequence is 465 nt, including a 249 nt ORF, a 45 nt 5' UTR and a 171 nt 3' UTR. There is a signal tail AATAAA 31 nt upstream of the poly (A) tail. The precursor nucleotide sequence of SsmTx encodes a signal peptide of 25 residues and a mature peptide of 57 residues, which is bridged by two pairs of disulfide bonds. SsmTx displays a unique cysteine motif that is completely different from that of other venomous animal toxins. This is the first reported cDNA sequence encoding a venom peptide from the centipede S. subspinipes mutilans.
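
    The lengths reported above can be cross-checked against each other with simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the 249 nt ORF length includes the stop codon, which the abstract does not state explicitly.

```python
def check_cdna_layout(total_nt, utr5_nt, orf_nt, utr3_nt, signal_aa, mature_aa):
    """Cross-check reported cDNA segment lengths against each other.

    Assumption: the ORF length includes the stop codon, so the encoded
    precursor has orf_nt // 3 - 1 residues.
    """
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    precursor_aa = orf_nt // 3 - 1  # drop the stop codon
    return {
        "segments_sum_to_total": utr5_nt + orf_nt + utr3_nt == total_nt,
        "precursor_aa": precursor_aa,
        "peptides_match_orf": signal_aa + mature_aa == precursor_aa,
    }


# Values taken from the SsmTx abstract above.
report = check_cdna_layout(
    total_nt=465, utr5_nt=45, orf_nt=249, utr3_nt=171,
    signal_aa=25, mature_aa=57,
)
print(report)
```

    Under that assumption the figures agree: 45 + 249 + 171 = 465 nt, and 249 nt corresponds to 82 residues plus a stop codon, matching the 25-residue signal peptide and 57-residue mature peptide.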

  3. Structure and nucleotide sequence of the rat intestinal vitamin D-dependent calcium binding protein gene.

    PubMed Central

    Krisinger, J; Darwish, H; Maeda, N; DeLuca, H F

    1988-01-01

    The vitamin D-dependent intestinal calcium binding protein (ICaBP, 9 kDa) is under transcriptional regulation by 1,25-dihydroxyvitamin D3 [1,25-(OH)2D3], the hormonally active form of the vitamin. To study the mechanism of gene regulation by 1,25-(OH)2D3, we isolated the rat ICaBP gene by using a cDNA probe. Its nucleotide sequence revealed 3 exons separated by 2 introns within approximately 3 kilobases. The first exon represents only noncoding sequences, while the second and third encode the two calcium binding domains of the protein. The gene contains a 15-base-pair imperfect palindrome in the first intron that shows high homology to the estrogen-responsive element. This sequence may represent the vitamin D-responsive element involved in the regulation of the ICaBP gene. The second intron shows an 84-base-pair-long simple nucleotide repeat that implicates Z-DNA formation. Genomic Southern analysis shows that the rat gene is represented as a single copy. Images PMID:3194402

  4. [cDNA cloning and sequence analysis of pluripotency genes in tree shrews (Tupaia belangeri)].

    PubMed

    Wang, Cai-Yun; Ma, Yun-Han; He, Da-Jian; Yang, Shi-Hua

    2013-04-01

    In this paper, partial sequences of the tree shrew (Tupaia belangeri) Klf4, Sox2, and c-Myc genes were cloned and sequenced; they were 382, 612, and 485 bp in length and encoded 127, 204, and 161 amino acids, respectively. Their cDNA sequence identities with the human sequences were 89%, 98%, and 89%, respectively. The phylogenetic trees showed different topologies, suggesting that each gene followed its own evolutionary pathway. These results can facilitate further functional studies.
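
    Because these are partial sequences, their lengths need not be multiples of three. A minimal sketch of the relation between fragment length and the number of encoded residues is given below; it assumes, hypothetically, that the reading frame starts at the first base of each fragment, which the abstract does not specify.

```python
def complete_codons(length_bp: int) -> tuple[int, int]:
    """Return (number of complete codons, leftover bases) for a fragment.

    Assumption: the reading frame begins at the first base of the fragment.
    """
    if length_bp < 0:
        raise ValueError("length must be non-negative")
    return divmod(length_bp, 3)


# Fragment lengths as reported in the abstract above.
for gene, length_bp in {"Klf4": 382, "Sox2": 612, "c-Myc": 485}.items():
    codons, leftover = complete_codons(length_bp)
    print(f"{gene}: {codons} complete codons, {leftover} leftover base(s)")
```

    The complete-codon counts (127, 204, and 161) coincide with the amino acid numbers reported for the three fragments.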

  5. Nucleotide sequence and structural organization of the human vasopressin pituitary receptor (V3) gene.

    PubMed

    René, P; Lenne, F; Ventura, M A; Bertagna, X; de Keyzer, Y

    2000-01-04

    In the pituitary, vasopressin triggers ACTH release through a specific receptor subtype, termed V3 or V1b. We cloned the V3 cDNA and showed that its expression was almost exclusive to pituitary corticotrophs and some corticotroph tumors. To study the determinants of this tissue specificity, we have now cloned the gene for the human (h) V3 receptor and characterized its structure. It is composed of two exons, spanning 10 kb, with the coding region interrupted between transmembrane domains 6 and 7. We established that the transcription initiation site is located 498 nucleotides upstream of the initiator codon and showed that two polyadenylation sites may be used, the most downstream being the most frequent. Sequence analysis of the promoter region showed no TATA box but identified consensus binding motifs for Sp1, CREB, and half sites of the estrogen receptor binding site. However, comparison with another corticotroph-specific gene, proopiomelanocortin, did not identify common regulatory elements in the two promoters except for a short GC-rich region. Unexpectedly, hV3 gene analysis revealed that a formerly cloned 'artifactual' hV3 cDNA indeed corresponded to a spliced antisense transcript, overlapping the 5' part of the coding sequence in exon 1 and the promoter region. This transcript, hV3rev, was detected in normal pituitary and in many corticotroph tumors expressing hV3 sense mRNA and may therefore play a role in hV3 gene expression.

  6. Bioinformatics comparison of sulfate-reducing metabolism nucleotide sequences

    NASA Astrophysics Data System (ADS)

    Tremberger, G.; Dehipawala, Sunil; Nguyen, A.; Cheung, E.; Sullivan, R.; Holden, T.; Lieberman, D.; Cheung, T.

    2015-09-01

    The sulfate-reducing bacteria can be traced back to 3.5 billion years ago. The thermodynamic details of the sulfur cycle have been well documented. A recent sulfate-reducing bacteria report (Robator, Jungbluth, et al., 2015 Jan, Front. Microbiol) with GenBank nucleotide data has been analyzed in terms of the sulfite reductase (dsrAB) via fractal dimension and entropy values. Comparison to oil field sulfate-reducing sequences was included. The AUCG translational mass fractal dimension versus ATCG transcriptional mass fractal dimension for the low temperature dsrB and dsrA sequences reported in Reference Thirteen shows correlation R-sq ~ 0.79, with a probability of about 3% in simulation. A recent report of using a Cystathionine gamma-lyase sequence to produce CdS quantum dots in a biological method, where the sulfur is reduced just like in the H2S production process, was included for comparison. The AUCG mass fractal dimension versus ATCG mass fractal dimension for the Cystathionine gamma-lyase sequences was found to have R-sq of 0.72, similar to the low temperature dissimilatory sulfite reductase dsr group with 3% probability, in contrast to the oil field group having R-sq ~ 0.94, a highly probable outcome in the simulation. The other two simulation histograms, namely, fractal dimension versus entropy R-sq outcome values, and di-nucleotide entropy versus mono-nucleotide entropy R-sq outcome values, are also discussed in the data analysis focusing on low probability outcomes.
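
    The mono-nucleotide and di-nucleotide entropies mentioned above can be illustrated with a short sketch. It assumes the standard Shannon entropy in bits and overlapping di-nucleotide windows; the abstract does not give the authors' exact definitions, and the function names here are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(symbols) -> float:
    """Shannon entropy (in bits) of the empirical symbol distribution."""
    counts = Counter(symbols)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no symbols to analyse")
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy


def mono_nucleotide_entropy(seq: str) -> float:
    """Entropy of single-base frequencies."""
    return shannon_entropy(seq.upper())


def di_nucleotide_entropy(seq: str) -> float:
    """Entropy of overlapping base-pair windows (assumed convention)."""
    s = seq.upper()
    return shannon_entropy(s[i:i + 2] for i in range(len(s) - 1))


# Hypothetical toy sequence, not data from the report.
toy = "ATGCGCGATATCGCGC"
print(mono_nucleotide_entropy(toy), di_nucleotide_entropy(toy))
```

    For a four-letter alphabet the mono-nucleotide entropy is at most 2 bits and the di-nucleotide entropy at most 4 bits, so plotting one against the other shows how far neighboring bases depart from independence.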

  7. Analysis of a cDNA clone expressing a human autoimmune antigen: full-length sequence of the U2 small nuclear RNA-associated B antigen

    SciTech Connect

    Habets, W.J.; Sillekens, P.T.G.; Hoet, M.H.; Schalken, J.A.; Roebroek, A.J.M.; Leunissen, J.A.M.; Van de Ven, W.J.M.; Van Venrooij, W.J.

    1987-04-01

    A U2 small nuclear RNA-associated protein, designated B'', was recently identified as the target antigen for autoimmune sera from certain patients with systemic lupus erythematosus and other rheumatic diseases. Such antibodies enabled the authors to isolate cDNA clone λHB''-1 from a phage λgt11 expression library. This clone appeared to code for the B'' protein as established by in vitro translation of hybrid-selected mRNA. The identity of clone λHB''-1 was further confirmed by partial peptide mapping and analysis of the reactivity of the recombinant antigen with monospecific and monoclonal antibodies. Analysis of the nucleotide sequence of the 1015-base-pair cDNA insert of clone λHB''-1 revealed a large open reading frame of 800 nucleotides containing the coding sequence for a polypeptide of 25,457 daltons. In vitro transcription of the λHB''-1 cDNA insert and subsequent translation resulted in a protein product with the molecular size of the B'' protein. These data demonstrate that clone λHB''-1 contains the complete coding sequence of this antigen. The deduced polypeptide sequence contains three very hydrophilic regions that might constitute RNA binding sites and/or antigenic determinants. These findings might have implications both for the understanding of the pathogenesis of rheumatic diseases and for the elucidation of the biological function of autoimmune antigens.

  8. Nucleotide sequence and genome organization of canine parvovirus.

    PubMed Central

    Reed, A P; Jones, E V; Miller, T J

    1988-01-01

    The genome of a canine parvovirus isolate strain (CPV-N) was cloned, and the DNA sequence was determined. The entire genome, including ends, was 5,323 nucleotides in length. The terminal repeat at the 3' end of the genome shared similar structural characteristics but limited homology with the rodent parvoviruses. The 5' terminal repeat was not detected in any of the clones. Instead, a region of DNA starting near the capsid gene stop codon and extending 248 base pairs into the coding region had been duplicated and inserted 75 base pairs downstream from the poly(A) addition site. Consensus sequences for the 5' donor and 3' acceptor sites as well as promoters and poly(A) addition sites were identified and compared with the available information on related parvoviruses. The genomic organization of CPV-N is similar to that of feline parvovirus (FPV) in that there are two major open reading frames (668 and 722 amino acids) in the plus strand (mRNA polarity). Both coding domains are in the same frame, and no significant open reading frames were apparent in any of the other frames of both minus and plus DNA strands. The nucleotide and amino acid homologies of the capsid genes between CPV-N and FPV were 98 and 99%, respectively. In contrast, the nucleotide and amino acid homologies of the capsid genes for CPV-N and CPV-b (S. Rhode III, J. Virol. 54:630-633, 1985) were 95 and 98%, respectively. These results indicate that very few nucleotide or amino acid changes differentiate the antigenic and host range specificity of FPV and CPV. PMID:2824850

  9. Cloning and sequencing of human intestinal alkaline phosphatase cDNA

    SciTech Connect

    Berger, J.; Garattini, E.; Hua, J.C.; Udenfriend, S.

    1987-02-01

    Partial protein sequence data obtained on intestinal alkaline phosphatase indicated a high degree of homology with the reported sequence of the placental isoenzyme. Accordingly, placental alkaline phosphatase cDNA was cloned and used as a probe to clone intestinal alkaline phosphatase cDNA. The latter is somewhat larger (3.1 kilobases) than the cDNA for the placental isozyme (2.8 kilobases). Although the 3' untranslated regions are quite different, there is almost 90% homology in the translated regions of the two isozymes. There are, however, significant differences at their amino and carboxyl termini and a substitution of an alanine in intestinal alkaline phosphatase for a glycine in the active site of the placental isozyme.

  10. Contamination of cDNA libraries and expressed sequence-tags databases

    SciTech Connect

    Dean, M.; Allikmets, R.

    1995-11-01

    Partially sequenced cDNAs, or expressed sequence tags (ESTs), are claimed to represent an efficient strategy for characterizing an organism's genes. By necessity, these sequences are incompletely characterized, and examples of contamination of cDNA libraries with sequences from other species have been described. It has been suggested that a Human T-cell cDNA library (Clontech HL1963g) is contaminated by sequences from yeast (Saccharomyces cerevisiae) and an unknown bacterium. We are characterizing human ESTs that represent new members of the ATP-binding cassette transporter super-family. In examining human ESTs generated from the T-cell library, we have encountered one gene that was in fact a yeast sequence (GenBank Z15214 = SSH2 locus) and several genes that do not hybridize to human DNA or RNA. PCR primers from these sequences failed to amplify a product from human, yeast, or Escherichia coli DNA but did produce a product from a Clontech kidney cDNA library (HL1123a). To determine the source of the contamination, we amplified a conserved segment of the 16S rDNA (following a suggestion from Dr. C. Savakis) from the kidney library. The sequence of this product was nearly identical to that of the bacterium Leuconostoc lactis (300 of 304 bp). Leuconostoc species are commonly found in dairy products, fruits, vegetables, and wine and are nonpathogenic to humans. 6 refs., 1 fig.
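
    The "300 of 304 bp" figure corresponds to roughly 98.7% identity. As a minimal sketch of how such a value is computed, the hypothetical helper below assumes two sequences that are already aligned, of equal length, and without gaps; the abstract does not say how the authors performed their comparison.

```python
def percent_identity(a, b):
    """Percent of matching positions between two pre-aligned, ungapped sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


# The reported match count, expressed as a percentage (about 98.7).
reported_identity = 100.0 * 300 / 304
```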

  11. Analysis of expressed sequence tags generated from full-length enriched cDNA libraries of melon

    PubMed Central

    2011-01-01

    Background: Melon (Cucumis melo), an economically important vegetable crop, belongs to the Cucurbitaceae family which includes several other important crops such as watermelon, cucumber, and pumpkin. It has served as a model system for sex determination and vascular biology studies. However, genomic resources currently available for melon are limited. Results: We constructed eleven full-length enriched and four standard cDNA libraries from fruits, flowers, leaves, roots, cotyledons, and calluses of four different melon genotypes, and generated 71,577 and 22,179 ESTs from full-length enriched and standard cDNA libraries, respectively. These ESTs, together with ~35,000 ESTs available in public domains, were assembled into 24,444 unigenes, which were extensively annotated by comparing their sequences to different protein and functional domain databases, assigning them Gene Ontology (GO) terms, and mapping them onto metabolic pathways. Comparative analysis of melon unigenes and other plant genomes revealed that 75% to 85% of melon unigenes had homologs in other dicot plants, while approximately 70% had homologs in monocot plants. The analysis also identified 6,972 gene families that were conserved across dicot and monocot plants, and 181, 1,192, and 220 gene families specific to fleshy fruit-bearing plants, the Cucurbitaceae family, and melon, respectively. Digital expression analysis identified a total of 175 tissue-specific genes, which provides a valuable gene sequence resource for future genomics and functional studies. Furthermore, we identified 4,068 simple sequence repeats (SSRs) and 3,073 single nucleotide polymorphisms (SNPs) in the melon EST collection. Finally, we obtained a total of 1,382 melon full-length transcripts through the analysis of full-length enriched cDNA clones that were sequenced from both ends. Analysis of these full-length transcripts indicated that sizes of melon 5' and 3' UTRs were similar to those of tomato, but longer than many other dicot

  12. Nucleotide sequence of a complementary DNA encoding pea cytosolic copper/zinc superoxide dismutase. [Pisum sativum L

    SciTech Connect

    White, D.A.; Zilinskas, B.A.

    1991-08-01

    The authors now report the nucleotide sequence of the cytosolic Cu/Zn SOD cloned from a λgt11 cDNA library constructed from mRNA extracted from leaves of 7- to 10-d pea seedlings (Pisum sativum L.). The clone was isolated using a 22-base synthetic oligonucleotide complementary to the amino acid sequence CGIIGLQG. This sequence, found at the protein's carboxy terminus, is highly conserved among plant cytosolic Cu/Zn SODs but not chloroplastic Cu/Zn SODs. The 738-base pair sequence contains an open reading frame specifying 152 codons and a predicted Mr of 18,024 D. The deduced amino acid sequence is highly homologous (79-82% identity) with the sequences of other known plant cytosolic Cu/Zn SODs but less highly conserved (63-65%) when compared with several chloroplastic Cu/Zn SODs including pea (10).

  13. Rapid Amplification of cDNA Ends for RNA Transcript Sequencing in Staphylococcus.

    PubMed

    Miller, Eric

    2016-01-01

    Rapid amplification of cDNA ends (RACE) is a technique that was developed to swiftly and efficiently amplify full-length RNA molecules in which the terminal ends have not been characterized. Current usage of this procedure has been more focused on sequencing and characterizing RNA 5' and 3' untranslated regions. Herein is described an adapted RACE protocol to amplify bacterial RNA transcripts.

  14. Molecular Cloning and Sequencing of Channel Catfish, Ictalurus punctatus, Cathepsin H and L cDNA

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Cathepsins H and L, lysosomal cysteine endopeptidases of the papain family, are ubiquitously expressed and involved in antigen processing. In this communication, the channel catfish cathepsin H and L transcripts were sequenced and analyzed. Total RNA from tissues was extracted and cDNA libraries we...

  15. Genes galore: a summary of methods for accessing results from large-scale partial sequencing of anonymous Arabidopsis cDNA clones.

    PubMed Central

    Newman, T; de Bruijn, F J; Green, P; Keegstra, K; Kende, H; McIntosh, L; Ohlrogge, J; Raikhel, N; Somerville, S; Thomashow, M

    1994-01-01

    High-throughput automated partial sequencing of anonymous cDNA clones provides a method to survey the repertoire of expressed genes from an organism. Comparison of the coding capacity of these expressed sequence tags (ESTs) with the sequences in the public databases results in assignment of putative function to a significant proportion of the ESTs. Thus, the more than 13,400 plant ESTs that are currently available provide a new resource that will facilitate progress in many areas of plant biology. These opportunities are illustrated by a description of the results obtained from analysis of 1500 Arabidopsis ESTs from a cDNA library prepared from equal portions of poly(A+) mRNA from etiolated seedlings, roots, leaves, and flowering inflorescences. More than 900 different sequences were represented, 32% of which showed significant nucleotide or deduced amino acid sequence similarity to previously characterized genes or proteins from a wide range of organisms. At least 165 of the clones had significant deduced amino acid sequence homology to proteins or gene products that have not been previously characterized from higher plants. A summary of methods for accessing the information and materials generated by the Arabidopsis cDNA sequencing project is provided. PMID:7846151

  16. Identification of genomic sequences corresponding to cDNA clones

    SciTech Connect

    Spoerel, N.A.; Kafatos, F.C.

    1987-01-01

    The general methods applicable to the isolation of genomic sequences from phage lambda or cosmid libraries have been described. This chapter presents strategies for the investigation of genes that occur in several identical or nonidentical copies per genome, or that share a common conserved domain with other genes. The methods discussed are applicable both to the identification of the genes in Southern blots and to their isolation from libraries. Furthermore, the methods are well suited for the analysis of homologous genes in different species. A high proportion of genes in eukaryotes are known to be members of multigene families. Carefully controlled hybridization conditions and well-tailored probes are powerful tools in the isolation and analysis of genes which share a common domain or are members of multigene families. This chapter consists of a short review of recommended strategies and relevant parameters, which have been discussed in more detail earlier. Using three examples from the authors' analysis of the silk moth chorion locus, they demonstrate how powerful carefully tailored short single-stranded probes can be in the analysis of closely related gene copies.

  17. Comparison of latent and nominal rabbit Ig VHa1 allotype cDNA sequences.

    PubMed

    McCormack, W T; Dhanarajan, P; Roux, K H

    1988-09-15

    The genetic basis for the expression of a latent VH allotype in the rabbit was investigated. VH region cDNA libraries were produced from spleen mRNA derived from a homozygous a2a2 rabbit expressing an induced latent VHa1 allotype and, for comparison, from a normal homozygous a1a1 rabbit expressing nominal VHa1 allotype. The deduced amino acid sequences of the nominal VHa1 cDNA were concordant with previously published VHa1 protein sequences. A comparison of two complete VH-DH-JH and six partial VHa1 sequences reveals highly conserved sequence within VH framework regions (FR) and considerable diversity in complementarity-determining regions and D region sequences. Two functional JH genes or alleles are evident. Amino acid sequencing of the N-terminal 15 residues of pooled affinity-purified latent VHa1 H chain showed complete sequence identity with the nominal VHa1 sequences. Possible latent VHa1-encoding cDNA clones, derived from the a2a2 rabbit, were selected by hybridization with oligonucleotide probes corresponding to the VHa1 allotype-associated segments of the first and third framework regions (FR1 and FR3). cDNA sequence analysis reveals that the 5' untranslated regions of nominal and latent VHa1 cDNA were virtually identical to each other and to previously reported sequences associated with VHa2 and VHa-negative genes. Moreover, some latent VHa1 genes encode FR1 segments that are essentially homologous to the corresponding segment of a nominal VHa1 allotype. In contrast, other putative latent genes display blocks of VHa1 sequence in either FR1 or FR3 that are flanked by blocks of sequence identical to other rabbit VH genes (i.e., VHa2 or VHa-negative). These composite sequences may be directly encoded by composite germ-line VH genes or may be the products of somatically generated recombination or gene conversion between genes encoding latent and nominal allotypes. The data do not support the hypothesis that latent genes are the result of extensive modification

  18. Deblur Rapidly Resolves Single-Nucleotide Community Sequence Patterns

    PubMed Central

    Amir, Amnon; McDonald, Daniel; Navas-Molina, Jose A.; Kopylova, Evguenia; Morton, James T.; Zech Xu, Zhenjiang; Kightley, Eric P.; Thompson, Luke R.; Hyde, Embriette R.; Gonzalez, Antonio

    2017-01-01

    ABSTRACT High-throughput sequencing of 16S ribosomal RNA gene amplicons has facilitated understanding of complex microbial communities, but the inherent noise in PCR and DNA sequencing limits differentiation of closely related bacteria. Although many scientific questions can be addressed with broad taxonomic profiles, clinical, food safety, and some ecological applications require higher specificity. Here we introduce a novel sub-operational-taxonomic-unit (sOTU) approach, Deblur, that uses error profiles to obtain putative error-free sequences from Illumina MiSeq and HiSeq sequencing platforms. Deblur substantially reduces computational demands relative to similar sOTU methods and does so with similar or better sensitivity and specificity. Using simulations, mock mixtures, and real data sets, we detected closely related bacterial sequences with single nucleotide differences while removing false positives and maintaining stability in detection, suggesting that Deblur is limited only by read length and diversity within the amplicon sequences. Because Deblur operates on a per-sample level, it scales to modern data sets and meta-analyses. To highlight Deblur’s ability to integrate data sets, we include an interactive exploration of its application to multiple distinct sequencing rounds of the American Gut Project. Deblur is open source under the Berkeley Software Distribution (BSD) license, easily installable, and downloadable from https://github.com/biocore/deblur. IMPORTANCE Deblur provides a rapid and sensitive means to assess ecological patterns driven by differentiation of closely related taxa. This algorithm provides a solution to the problem of identifying real ecological differences between taxa whose amplicons differ by a single base pair, is applicable in an automated fashion to large-scale sequencing data sets, and can integrate sequencing runs collected over time. PMID:28289731

  19. Sequence and neuronal expression of mouse endothelin-1 cDNA.

    PubMed

    Kurama, M; Ishida, N; Matsui, M; Saida, K; Mitsui, Y

    1996-07-17

    We have isolated and sequenced a cDNA that encodes mouse endothelin-1 (ET-1). The putative protein contains 202 amino acids and corresponds to the prepro-form of ET-1. The 21-amino-acid sequence of the putative mature ET-1 was identical to that of rat, porcine, bovine, and human ET-1. In situ hybridization histochemistry indicates that ET-1 mRNA was expressed in several hypothalamic nuclei, including the suprachiasmatic nucleus (SCN), in rodent brain.

  20. Cloning and sequencing of a cDNA encoding a taste-modifying protein, miraculin.

    PubMed

    Masuda, Y; Nirasawa, S; Nakaya, K; Kurihara, Y

    1995-08-19

    A cDNA clone encoding a taste-modifying protein, miraculin (MIR), was isolated and sequenced. The encoded precursor to MIR was composed of 220 amino acid (aa) residues, including a possible signal sequence of 29 aa. Northern blot analysis showed that the mRNA encoding MIR was already expressed in fruits of Richadella dulcifica at 3 weeks after pollination and was present specifically in the pulp.

  1. Cloning and sequencing of cDNA and genomic DNA encoding PDM phosphatase of Fusarium moniliforme.

    PubMed

    Yoshida, Hiroshi; Iizuka, Mari; Narita, Takao; Norioka, Naoko; Norioka, Shigemi

    2006-12-01

    PDM phosphatase was purified approximately 500-fold through six steps from the extract of dried powder of the culture filtrate of Fusarium moniliforme. The purified preparation appeared homogeneous on SDS-PAGE although the protein band was broad. Amino acid sequence information was collected on tryptic peptides from this preparation. cDNA cloning was carried out based on the information. A full-length cDNA was obtained and sequenced. The sequence had an open reading frame of 651 amino acid residues with a molecular mass of 69,988 Da. Cloning and sequencing of the genomic DNA corresponding to the cDNA was also conducted. The deduced amino acid sequence could account for many but not all of the tryptic peptides, suggesting presence of contaminant protein(s). SDS-PAGE analysis after chemical deglycosylation showed two proteins with molecular masses of 58 and 68 kDa. This implied that the 58 kDa protein had been copurified with PDM phosphatase. Homology search showed that PDM phosphatase belongs to the purple acid phosphatase family, which is widely distributed in the biosphere. Sequence data of fungal purple acid phosphatases were collected from the database. Processing of the data revealed presence of two types, whose evolutionary relationships were discussed.

  2. Generation of expressed sequence tags from a normalized porcine skeletal muscle cDNA library.

    PubMed

    Yao, Jianbo; Coussens, Paul M; Saama, Peter; Suchyta, Steven; Ernst, Catherine W

    2002-11-01

    Recent developments in microarray technologies permit scientists to analyze expression of thousands of genes simultaneously in diverse biological systems. In an effort to provide integrated resources for application of microarray technologies to studies of skeletal muscle growth and development in swine, we have constructed a normalized cDNA library from porcine skeletal muscle. The effectiveness of normalization was evaluated by DNA sequencing of clones randomly picked from the library before and after normalization, and also by Southern blot hybridization using probes representing abundant transcripts. Our data suggest that the normalization procedure successfully reduced the highly abundant cDNA species in the normalized library. To date, a total of 782 EST (expressed sequence tag) sequences have been generated from this normalized library (687 ESTs) and the original library (95 ESTs). The sequence information of these ESTs plus their BLAST results has been made available through a web accessible database (http://nbfgc.msu.edu). Cluster analysis of the data indicates that a total of 742 unique sequences are present in this collection. BLASTN search of the 742 EST sequences against the public database (dbEST) revealed that 139 had no significant matches (E-value > 10^-15) to porcine ESTs already entered in the database, suggesting the possibility of their specific expression in porcine skeletal muscle. Generation of non-redundant ESTs from this library will allow us to construct cDNA microarrays for identification of gene expression changes that regulate muscle growth and affect meat quality in swine.

  3. cDNA sequence and expression pattern of the putative pheromone carrier aphrodisin.

    PubMed Central

    Mägert, H J; Hadrys, T; Cieslak, A; Gröger, A; Feller, S; Forssmann, W G

    1995-01-01

    The cDNA sequence for aphrodisin, a lipocalin from hamster vaginal discharge which is involved in pheromonal activity, has been determined. Corresponding genomic clones were isolated and the promoter region was identified. Primer extension analysis revealed an adenosine residue as the main transcription initiation site, located 50 bp upstream of the translation start codon ATG, which is surrounded by a typical Kozak sequence. However, data from polymerase chain reaction analysis suggest the existence of at least one alternative transcription initiation site. The aphrodisin cDNA is 732 bp long and codes for the mature 151-aa aphrodisin and an additional N-terminal 16-aa secretory signal peptide. The 3' nontranslated region is 228 bp long. Among the known sequences, the aphrodisin cDNA shares the highest homology with the rat odorant-binding protein cDNA (45%), which verifies the protein data. Vaginal tissue and Bartholin's glands are the main aphrodisin gene-expressing tissues of the female hamster genital tract, as demonstrated by Northern blot analysis. Under less stringent hybridization conditions, RNA isolated from rat Bartholin's glands also showed a signal, indicating the occurrence of aphrodisin-related mRNA in this species. PMID:7892229

  4. The nucleotide sequence of a nematode vitellogenin gene.

    PubMed Central

    Spieth, J; Denison, K; Zucker, E; Blumenthal, T

    1985-01-01

    The nematode, Caenorhabditis elegans, contains a family of six genes that code for vitellogenins. Here we report the complete nucleotide sequence of one of these genes, vit-5. The gene specifies an mRNA of 4869 nucleotides, including untranslated regions of 9 bases at the 5' end and 51 bases at the 3' end. Vit-5 contains four short introns totalling 218 bp. The predicted vitellogenin, yp170A, has a molecular weight of 186,430. At its N terminus it is clearly related to the vitellogenins of vertebrates. However, the vit-5-encoded protein does not contain a serine-rich sequence related to the vertebrate vitellin, phosvitin. In fact, the amino acid composition of the nematode protein is very similar to that of the vertebrate protein without phosvitin. Vit-5 has a highly asymmetric codon choice dictionary. The favored codons are different from those favored in other organisms, but are characteristic of highly expressed C. elegans genes. The strong selection against rare codons is not as great near the 5' end of the gene; rare codons are 15 times more frequent within the first 54 bp than in the next 4.8 kb. PMID:3855245
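
    The comparison of rare-codon frequency in the first 54 bp against the rest of the coding region can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, and the set of "rare" codons must be supplied by the caller because the abstract does not list which codons were treated as rare.

```python
def codons(seq):
    """Split an in-frame coding sequence into complete codons."""
    s = seq.upper()
    usable = len(s) - len(s) % 3  # drop any trailing partial codon
    return [s[i:i + 3] for i in range(0, usable, 3)]


def rare_codon_fraction(coding_seq, rare, start=0, end=None):
    """Fraction of codons in coding_seq[start:end] that belong to `rare`.

    `start` should be a multiple of 3 so the window stays in frame.
    """
    window = codons(coding_seq[start:end])
    if not window:
        return 0.0
    return sum(c in rare for c in window) / len(window)


# Toy example with a placeholder rare-codon set (an assumption, not C. elegans data).
RARE = {"AGG"}
toy_cds = "ATG" + "AGG" * 3 + "GCT" * 14 + "GCT" * 99 + "AGG"
head_fraction = rare_codon_fraction(toy_cds, RARE, 0, 54)  # first 18 codons
tail_fraction = rare_codon_fraction(toy_cds, RARE, 54)     # remaining codons
```

    Dividing the head fraction by the tail fraction gives the kind of fold difference the abstract reports for vit-5.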

  5. Computer-based methods for the mouse full-length cDNA encyclopedia: real-time sequence clustering for construction of a nonredundant cDNA library.

    PubMed

    Konno, H; Fukunishi, Y; Shibata, K; Itoh, M; Carninci, P; Sugahara, Y; Hayashizaki, Y

    2001-02-01

    We developed computer-based methods for constructing a nonredundant mouse full-length cDNA library. Our cDNA library construction process comprises assessment of library quality, sequencing the 3' ends of inserts and clustering, and completing a re-array to generate a nonredundant library from a redundant one. After the cDNA libraries are generated, we sequence the 5' ends of the inserts to check the quality of the library; then we determine the sequencing priority of each library. Selected libraries undergo large-scale sequencing of the 3' ends of the inserts and clustering of the tag sequences. After clustering, the nonredundant library is constructed from the original libraries, which have redundant clones. All libraries, plates, clones, sequences, and clusters are uniquely identified, and all information is saved in the database according to this identifier. At press time, our system has been in place for the past two years; we have clustered 939,725 3' end sequences into 127,385 groups from 227 cDNA libraries/sublibraries (see http://genome.gse.riken.go.jp/).

  6. The human clotting factor VIII cDNA contains an autonomously replicating sequence consensus- and matrix attachment region-like sequence that binds a nuclear factor, represses heterologous gene expression, and mediates the transcriptional effects of sodium butyrate.

    PubMed Central

    Fallaux, F J; Hoeben, R C; Cramer, S J; van den Wollenberg, D J; Briët, E; van Ormondt, H; van Der Eb, A J

    1996-01-01

    Expression of the human blood-clotting factor VIII (FVIII) cDNA is hampered by the presence of sequences located in the coding region that repress transcription. We have previously identified a 305-bp fragment within the FVIII cDNA that is involved in the repression (R.C. Hoeben, F.J. Fallaux, S.J. Cramer, D.J.M. van den Wollenberg, H. van Ormondt, E. Briet, and A.J. van der Eb, Blood 85:2447-2454, 1995). Here, we show that this 305-bp region of FVIII cDNA contains sequences that resemble the yeast (Saccharomyces cerevisiae) autonomously replicating sequence consensus. Two of these DNA elements coincide with AT-rich sequences that are often found in matrix attachment regions or scaffold-attached regions. One of these elements, consisting of nucleotides 1569 to 1600 of the FVIII cDNA (nucleotide numbering is according to the system of Wood et al. (W.I. Wood, D.J. Capon, C.C. Simonsen, D.L. Eaton, J. Gitschier, D. Keyt, P.H. Seeburg, D.H. Smith, P. Hollingshead, K.L. Wion, et al., Nature [London] 312:330-337, 1984)), binds a nuclear factor in vitro but loses this capacity after four of its base pairs have been changed. A synthetic heptamer of this segment can repress the expression of a chloramphenicol acetyltransferase (CAT) reporter gene and also loses this capacity upon mutation. Furthermore, we demonstrate that repression by FVIII sequences can be relieved by sodium butyrate. We demonstrate that the synthetic heptamer (FVIII nucleotides 1569 to 1600), when placed upstream of the Moloney murine leukemia virus long terminal repeat promoter that drives the CAT reporter, can render the CAT reporter inducible by butyrate. This effect was absent when the same element was mutated. The stimulatory effect of butyrate could not be attributed to butyrate-responsive elements in the studied long terminal repeat promoters. Our data provide a functional characterization of the sequences that repress expression of the FVIII cDNA. These data also suggest a link between

  7. Nucleotide sequences specific to Yersinia pestis and methods for the detection of Yersinia pestis

    DOEpatents

    McCready, Paula M.; Radnedge, Lyndsay; Andersen, Gary L.; Ott, Linda L.; Slezak, Thomas R.; Kuczmarski, Thomas A.; Motin, Vladinir L.

    2009-02-24

    Nucleotide sequences specific to Yersinia pestis that serve as markers or signatures for identification of this bacterium were identified. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.

  8. Nucleotide sequences specific to Francisella tularensis and methods for the detection of Francisella tularensis

    DOEpatents

    McCready, Paula M.; Radnedge, Lyndsay; Andersen, Gary L.; Ott, Linda L.; Slezak, Thomas R.; Kuczmarski, Thomas A.; Vitalis, Elizabeth A

    2009-02-24

    Described herein is the identification of nucleotide sequences specific to Francisella tularensis that serve as markers or signatures for identification of this bacterium. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.

  9. Nucleotide sequences specific to Francisella tularensis and methods for the detection of Francisella tularensis

    DOEpatents

    McCready, Paula M.; Radnedge, Lyndsay; Andersen, Gary L.; Ott, Linda L.; Slezak, Thomas R.; Kuczmarski, Thomas A.; Vitalis, Elizabeth A

    2007-02-06

    Described herein is the identification of nucleotide sequences specific to Francisella tularensis that serve as markers or signatures for identification of this bacterium. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.

  10. Nucleotide sequences specific to Brucella and methods for the detection of Brucella

    DOEpatents

    McCready, Paula M.; Radnedge, Lyndsay; Andersen, Gary L.; Ott, Linda L.; Slezak, Thomas R.; Kuczmarski, Thomas A.

    2009-02-24

    Nucleotide sequences specific to Brucella that serve as markers or signatures for identification of this bacterium were identified. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.

  11. The complete nucleotide sequence of chrysanthemum stem necrosis virus.

    PubMed

    Dullemans, A M; Verhoeven, J Th J; Kormelink, R; van der Vlugt, R A A

    2015-02-01

    The complete genome sequence of chrysanthemum stem necrosis virus (CSNV) was determined using Roche 454 next-generation sequencing. CSNV is a tentative member of the genus Tospovirus within the family Bunyaviridae, whose members are arthropod-borne. This is the first report of the entire RNA genome sequence of a CSNV isolate. The large RNA of CSNV is 8955 nucleotides (nt) in size and contains a single open reading frame of 8625 nt in the antisense arrangement, coding for the putative RNA-dependent RNA polymerase (L protein) of 2874 aa with a predicted Mr of 331 kDa. Two untranslated regions of 397 and 33 nt are present at the 5' and 3' termini, respectively. The medium (M) and small (S) RNAs are 4830 and 2947 nt in size, respectively, and show 99% identity to the corresponding genomic segments of previously partially characterized CSNV genomes. Protein sequences for the precursor of the Gn/Gc proteins, N and NSs, are identical in length in all of the analysed CSNV isolates.
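
    The relation between the reported open reading frame length (8625 nt) and protein length (2874 aa) can be checked with a short calculation. The sketch below assumes the quoted ORF length includes the stop codon, which is consistent with the two figures but is not stated in the abstract; the helper name is hypothetical.

```python
def orf_residues(orf_nt, includes_stop=True):
    """Number of encoded amino acids for an ORF of `orf_nt` nucleotides."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    n_codons = orf_nt // 3
    return n_codons - 1 if includes_stop else n_codons


l_protein_aa = orf_residues(8625)               # 2875 codons minus the stop codon
mean_residue_mass = 331_000 / l_protein_aa      # Da per residue, from the predicted 331 kDa
```

    The resulting mean residue mass of about 115 Da is within the range typical for proteins, so the reported length and predicted mass are mutually consistent.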

  12. Generalized Levy-walk model for DNA nucleotide sequences

    NASA Technical Reports Server (NTRS)

    Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Simons, M.; Stanley, H. E.

    1993-01-01

    We propose a generalized Levy walk to model fractal landscapes observed in noncoding DNA sequences. We find that this model provides a very close approximation to the empirical data and explains a number of statistical properties of genomic DNA sequences such as the distribution of strand-biased regions (those with an excess of one type of nucleotide) as well as local changes in the slope of the correlation exponent alpha. The generalized Levy-walk model simultaneously accounts for the long-range correlations in noncoding DNA sequences and for the apparently paradoxical finding of long subregions of biased random walks (length l_j) within these correlated sequences. In the generalized Levy-walk model, the l_j are chosen from a power-law distribution P(l_j) ∝ l_j^(-mu). The correlation exponent alpha is related to mu through alpha = 2 - mu/2 if 2 < mu < 3. The model is consistent with the finding of "repetitive elements" of variable length interspersed within noncoding DNA.
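
    The construction described in this abstract can be illustrated with a minimal simulation sketch: segment lengths l_j are drawn from a power law with exponent mu, and each segment is a biased random walk in a randomly chosen direction. The sampling scheme (inverse-transform sampling of a continuous power law with minimum length 1), the `bias` parameter, and all function names are assumptions made for illustration; the abstract does not specify them.

```python
import random


def correlation_exponent(mu):
    """alpha = 2 - mu/2, valid for 2 < mu < 3 as stated in the abstract."""
    if not 2 < mu < 3:
        raise ValueError("relation holds only for 2 < mu < 3")
    return 2 - mu / 2


def sample_segment_length(mu, rng):
    """Draw an integer length >= 1 from a density proportional to l**(-mu)."""
    u = 1.0 - rng.random()  # uniform in (0, 1], avoids division by zero
    return max(1, int(u ** (-1.0 / (mu - 1.0))))


def levy_walk(n_steps, mu, bias=0.1, seed=0):
    """Return walk positions after each of n_steps unit steps (+1 or -1)."""
    rng = random.Random(seed)
    steps = []
    while len(steps) < n_steps:
        seg_len = min(sample_segment_length(mu, rng), n_steps - len(steps))
        direction = rng.choice((-1, 1))   # which way this segment is biased
        p_up = 0.5 + direction * bias     # probability of a +1 step
        steps.extend(1 if rng.random() < p_up else -1 for _ in range(seg_len))
    positions, pos = [], 0
    for s in steps:
        pos += s
        positions.append(pos)
    return positions
```

    For example, mu = 2.5 corresponds to alpha = 0.75, between the uncorrelated value of 0.5 and the fully persistent value of 1.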

  13. cDNA sequence of a human skeletal muscle ADP/ATP translocator: lack of a leader peptide, divergence from a fibroblast translocator cDNA, and coevolution with mitochondrial DNA genes

    SciTech Connect

    Neckelmann, N.; Li, K.; Wade, R.P.; Shuster, R.; Wallace, D.C.

    1987-11-01

    The authors have characterized a 1400-nucleotide cDNA for the human skeletal muscle ADP/ATP translocator. The deduced amino acid sequence is 94% homologous to the beef heart ADP/ATP translocator protein and contains only a single additional amino-terminal methionine. This implies that the human translocator lacks an amino-terminal targeting peptide, a conclusion substantiated by measuring the molecular weight of the protein synthesized in vitro. A 1400-nucleotide transcript encoding the skeletal muscle translocator was detected on blots of total RNA from human heart, kidney, skeletal muscle, and HeLa cells by hybridization with oligonucleotide probes homologous to the coding region and 3' noncoding region of the cDNA. However, the level of this mRNA varied substantially among tissues. Comparison of our skeletal muscle translocator sequence with that of a recently published human fibroblast translocator cognate revealed that the two proteins are 88% identical and diverged about 275 million years ago. Hence, tissues vary both in the level of expression of individual translocator genes and in differential expression of cognate translocator genes. Comparison of the base substitution rates of the ADP/ATP translocator and the oxidative phosphorylation genes encoded by mitochondrial DNA revealed that the mitochondrial DNA genes fix 10 times more synonymous substitutions and 12 times more replacement substitutions; yet, these nuclear and cytoplasmic respiration genes experience comparable evolutionary constraints. This suggests that the mitochondrial DNA genes are highly prone to deleterious mutations.

  14. Sequencing and comparative genomic analysis of 1227 Felis catus cDNA sequences enriched for developmental, clinical and nutritional phenotypes

    PubMed Central

    2012-01-01

    Background: The feline genome is valuable to the veterinary and model organism genomics communities because the cat is an obligate carnivore and a model for endangered felids. The initial public release of the Felis catus genome assembly provided a framework for investigating the genomic basis of feline biology. However, the entire set of protein coding genes has not been elucidated. Results: We identified and characterized 1227 protein coding feline sequences, of which 913 map to public sequences and 314 are novel. These sequences have been deposited into NCBI's GenBank database and complement public genomic resources by providing additional protein coding sequences that fill in some of the gaps in the feline genome assembly. Through functional and comparative genomic analyses, we gained an understanding of the role of these sequences in feline development, nutrition and health. Specifically, we identified 104 orthologs of human genes associated with Mendelian disorders. We detected negative selection within sequences with gene ontology annotations associated with intracellular trafficking, cytoskeleton and muscle functions. We detected relatively less negative selection on protein sequences encoding extracellular networks, apoptotic pathways and mitochondrial gene ontology annotations. Additionally, we characterized feline cDNA sequences that have mouse orthologs associated with clinical, nutritional and developmental phenotypes. Together, this analysis provides an overview of the value of our cDNA sequences and enhances our understanding of how the feline genome is similar to, and different from other mammalian genomes. Conclusions: The cDNA sequences reported here expand existing feline genomic resources by providing high-quality sequences annotated with comparative genomic information providing functional, clinical, nutritional and orthologous gene information. PMID:22257742

  15. cDNA Library Enrichment of Full Length Transcripts for SMRT Long Read Sequencing

    PubMed Central

    Hartwig, Benjamin; Reinhardt, Richard; Schneeberger, Korbinian

    2016-01-01

    The utility of genome assemblies does not only rely on the quality of the assembled genome sequence, but also on the quality of the gene annotations. The Pacific Biosciences Iso-Seq technology is a powerful support for accurate eukaryotic gene model annotation as it allows for direct readout of full-length cDNA sequences without the need for noisy short read-based transcript assembly. We propose the implementation of the TeloPrime Full Length cDNA Amplification kit to the Pacific Biosciences Iso-Seq technology in order to enrich for genuine full-length transcripts in the cDNA libraries. We provide evidence that TeloPrime outperforms the commonly used SMARTer PCR cDNA Synthesis Kit in identifying transcription start and end sites in Arabidopsis thaliana. Furthermore, we show that TeloPrime-based Pacific Biosciences Iso-Seq can be successfully applied to the polyploid genome of bread wheat (Triticum aestivum) not only to efficiently annotate gene models, but also to identify novel transcription sites, gene homeologs, splicing isoforms and previously unidentified gene loci. PMID:27327613

  16. cDNA, genomic sequence cloning and overexpression of ribosomal protein S25 gene (RPS25) from the Giant Panda.

    PubMed

    Hao, Yan-Zhe; Hou, Wan-Ru; Hou, Yi-Ling; Du, Yu-Jie; Zhang, Tian; Peng, Zheng-Song

    2009-11-01

    RPS25 is a component of the 40S small ribosomal subunit encoded by the RPS25 gene, which is specific to eukaryotes. Few studies have addressed the RPS25 gene in animals. The Giant Panda (Ailuropoda melanoleuca), known as a "living fossil", is of increasing concern to the world community. Studies on RPS25 of the Giant Panda could provide scientific data for inquiring into the hereditary traits of the gene and formulating a protective strategy for the Giant Panda. The cDNA of RPS25 cloned from the Giant Panda is 436 bp in size, containing an open reading frame of 378 bp encoding 125 amino acids. The genomic sequence is 1,992 bp long and was found to possess four exons and three introns. Alignment analysis indicated that the nucleotide sequence of the coding sequence shows high homology to those of Homo sapiens, Bos taurus, Mus musculus and Rattus norvegicus as determined by BLAST analysis: 92.6, 94.4, 89.2 and 91.5%, respectively. Primary structure analysis revealed that the molecular weight of the putative RPS25 protein is 13.7421 kDa with a theoretical pI of 10.12. Topology prediction showed there is one N-glycosylation site, one cAMP- and cGMP-dependent protein kinase phosphorylation site, two protein kinase C phosphorylation sites and one tyrosine kinase phosphorylation site in the RPS25 protein of the Giant Panda. The RPS25 gene was overexpressed in E. coli BL21, and Western blotting of the RPS25 protein was also performed. The results indicated that the RPS25 gene can be expressed in E. coli and that the RPS25 protein fused to an N-terminal His tag gave rise to the accumulation of an expected 17.4 kDa polypeptide. The cDNA and the genomic sequence of RPS25 were cloned successfully for the first time from the Giant Panda using RT-PCR technology and Touchdown-PCR, respectively; both were sequenced and analyzed preliminarily. The cDNA of the RPS25 gene was then overexpressed in E. coli BL21 and immunoblotted, which is the first

  17. Complete cDNA and derived amino acid sequence of human factor V

    SciTech Connect

    Jenny, R.J.; Pittman, D.D.; Toole, J.J.; Kriz, R.W.; Aldape, R.A.; Hewick, R.M.; Kaufman, R.J.; Mann, K.G.

    1987-07-01

    cDNA clones encoding human factor V have been isolated from an oligo(dT)-primed human fetal liver cDNA library prepared with vector Charon 21A. The cDNA sequence of factor V from three overlapping clones includes a 6672-base-pair (bp) coding region, a 90-bp 5' untranslated region, and a 163-bp 3' untranslated region within which is a poly(A) tail. The deduced amino acid sequence consists of 2224 amino acids inclusive of a 28-amino acid leader peptide. Direct comparison with human factor VIII reveals considerable homology between proteins in amino acid sequence and domain structure: a triplicated A domain and duplicated C domain show approx. 40% identity with the corresponding domains in factor VIII. As in factor VIII, the A domains of factor V share approx. 40% amino acid-sequence homology with the three highly conserved domains in ceruloplasmin. The B domain of factor V contains 35 tandem and approx. 9 additional semiconserved repeats of nine amino acids of the form Asp-Leu-Ser-Gln-Thr-Thr/Asn-Leu-Ser-Pro and 2 additional semiconserved repeats of 17 amino acids. Factor V contains 37 potential N-linked glycosylation sites, 25 of which are in the B domain, and a total of 19 cysteine residues.
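
    The length figures in this abstract can be cross-checked with a little codon arithmetic. The Python sketch below is only an illustration: the helper name and the constant names are hypothetical, and it assumes the reported 6672-bp coding region excludes the stop codon, which is the reading under which 6672 / 3 gives exactly the 2224 residues stated.

```python
def residues_from_orf(orf_nt: int, includes_stop_codon: bool) -> int:
    """Number of amino acids encoded by an open reading frame of orf_nt nucleotides."""
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_nt // 3
    # A stop codon occupies three nucleotides but adds no residue.
    return codons - 1 if includes_stop_codon else codons


# Figures quoted in the factor V abstract.
CODING_REGION_BP = 6672
UTR_5_BP = 90
UTR_3_BP = 163
LEADER_PEPTIDE_AA = 28

# Assumption: the 6672-bp coding region does not count the stop codon.
precursor_aa = residues_from_orf(CODING_REGION_BP, includes_stop_codon=False)
mature_aa = precursor_aa - LEADER_PEPTIDE_AA
reported_regions_bp = UTR_5_BP + CODING_REGION_BP + UTR_3_BP

print(precursor_aa, mature_aa, reported_regions_bp)
```

    The includes_stop_codon flag matters when comparing records, because some abstracts quote the reading frame with the stop codon and others without it.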

  18. Nucleotide sequence of the large double-stranded RNA segment of bacteriophage phi 6: genes specifying the viral replicase and transcriptase.

    PubMed Central

    Mindich, L; Nemhauser, I; Gottlieb, P; Romantschuk, M; Carton, J; Frucht, S; Strassman, J; Bamford, D H; Kalkkinen, N

    1988-01-01

    The genome of the lipid-containing bacteriophage phi 6 contains three segments of double-stranded RNA. We determined the nucleotide sequence of cDNA derived from the largest RNA segment (L). This segment specifies the procapsid proteins necessary for transcription and replication of the phi 6 genome. The coding sequences of the four proteins on this segment were identified on the basis of size and the correlation of predicted N-terminal amino acid sequences with those found through analysis of isolated proteins. This report completes the sequence analysis of phi 6. This constitutes the first complete sequence of a double-stranded RNA genome virus. PMID:3346944

  19. Complete nucleotide sequence of a native plasmid from Brevibacterium linens.

    PubMed

    Moore, Mathew; Svenson, Charles; Bowling, David; Glenn, Dianne

    2003-03-01

    Brevibacterium linens has commercial significance in the dairy industry and potential application in the production of bacteriocins and carotenoids. Strain development of these industrially significant organisms would be facilitated by the use of vectors, yet few are available. In this study we report the isolation of four novel plasmids from the Gram-positive coryneform B. linens, and determine the first complete nucleotide sequence of a native plasmid of B. linens. The cryptic plasmid pLIM is 7610 bp in length, and belongs to a subfamily of theta replicating ColE2-related plasmids. Initial investigation suggests that replication in pLIM requires two replicases, a primase (RepA) and a DNA binding protein (RepB), encoded by a single operon repAB. The origin of replication is located upstream of repAB transcription.

  20. Molecular cloning and sequencing of a cDNA encoding partial putative molt-inhibiting hormone from Penaeus chinensis

    NASA Astrophysics Data System (ADS)

    Wang, Zai-Zhao; Xiang, Jian-Hai

    2002-09-01

    Total RNA was extracted from eyestalks of shrimp Penaeus chinensis. Eyestalk cDNA was obtained from total RNA by reverse transcription. Reverse transcriptase-polymerase chain reaction (RT-PCR) was initiated using eyestalk cDNA and degenerate primers designed from the amino acid sequence of molt-inhibiting hormone from shrimp Penaeus japonicus. A specific cDNA was obtained and cloned into a T vector for sequencing. The cDNA consisted of 201 base pairs and encoding for a peptide of 67 amino acid residues. The peptide of P. chinensis had the highest identity with molt-inhibiting hormones of P. japonicus. The cDNA could be a partial gene of molt-inhibiting hormones from P. chinensis. This paper reports for the first time cDNA encoding for neuropeptide of P. chinensis.

  1. Base sequence context effects on nucleotide excision repair.

    PubMed

    Cai, Yuqin; Patel, Dinshaw J; Broyde, Suse; Geacintov, Nicholas E

    2010-08-23

    Nucleotide excision repair (NER) plays a critical role in maintaining the integrity of the genome when damaged by bulky DNA lesions, since inefficient repair can cause mutations and human diseases, notably cancer. The structural properties of DNA lesions that determine their relative susceptibilities to NER are therefore of great interest. As a model system, we have investigated the major mutagenic lesion derived from the environmental carcinogen benzo[a]pyrene (B[a]P), 10S (+)-trans-anti-B[a]P-N(2)-dG in six different sequence contexts that differ in how the lesion is positioned in relation to nearby guanine amino groups. We have obtained molecular structural data by NMR and MD simulations, bending properties from gel electrophoresis studies, and NER data obtained from human HeLa cell extracts for our six investigated sequence contexts. This model system suggests that disturbed Watson-Crick base pairing is a better recognition signal than a flexible bend, and that these can act in concert to provide an enhanced signal. Steric hindrance between the minor groove-aligned lesion and nearby guanine amino groups determines the exact nature of the disturbances. Both nearest neighbor and more distant neighbor sequence contexts have an impact. Regardless of the exact distortions, we hypothesize that they provide a local thermodynamic destabilization signal for repair.

  2. cDNA sequence, genomic organization, and evolutionary conservation of a novel gene from the WAGR region

    SciTech Connect

    Schwartz, F.; Eisenman, R.; Knoll, J.; Bruns, G.

    1995-09-20

    A new gene (239FB) with predominant and differential expression in fetal brain has recently been isolated from a chromosome 11p13-p14 boundary area near FSHB. The corresponding mRNA has an open reading frame of 294 amino acids, a 3' untranslated region of 1247 nucleotides, and a highly GC-rich 5' untranslated region. The coding and 3' UT sequence is specified by 6 exons within nearly 87 kb of isolated genomic locus. The 5' end region of the transcript maps adjacent to the only genomically defined CpG island in a chromosomal subregion that may be associated with part of the mental retardation of some WAGR (Wilms tumor, aniridia, genitourinary anomalies, and mental retardation) syndrome patients. In addition to nucleotide and amino acid similarity to an EST from a normalized infant brain cDNA library, the predicted protein has extensive similarity to Caenorhabditis elegans polypeptides of, as yet, unknown function. The 239FB locus is, therefore, likely part of a family of genes with two members expressed in human brain. The extensive conservation of the predicted protein suggests a fundamental function of the gene product and will enable evaluation of the role of the 239FB gene in neurogenesis in model organisms. 48 refs., 4 figs., 1 tab.

  3. Nucleotide sequence of the hemolysin I gene from Actinobacillus pleuropneumoniae.

    PubMed Central

    Frey, J; Meier, R; Gygi, D; Nicolet, J

    1991-01-01

    The DNA sequence of the gene encoding the structural protein of hemolysin I (HlyI) of Actinobacillus pleuropneumoniae serotype 1 strain 4074 was analyzed. The nucleotide sequence shows a 3,072-bp reading frame encoding a protein of 1,023 amino acids with a calculated molecular size of 110.1 kDa. This corresponds to the HlyI protein, which has an apparent molecular size on sodium dodecyl sulfate gels of 105 kDa. The structure of the protein derived from the DNA sequence shows three hydrophobic regions in the N-terminal part of the protein, 13 glycine-rich domains in the second half of the protein, and a hydrophilic C-terminal area, all of which are typical of the cytotoxins of the RTX (repeats in the structural toxin) toxin family. The derived amino acid sequence of HlyI shows 42% homology with the hemolysin of A. pleuropneumoniae serotype 5, 41% homology with the leukotoxin of Pasteurella haemolytica, and 56% homology with the Escherichia coli alpha-hemolysin. The 13 glycine-rich repeats and three hydrophobic areas of the HlyI sequence show more similarity to the E. coli alpha-hemolysin than to either the A. pleuropneumoniae serotype 5 hemolysin or the leukotoxin (while the last two are more similar to each other). Two types of RTX hemolysins therefore seem to be present in A. pleuropneumoniae, one (HlyI) resembling the alpha-hemolysin and a second more closely related to the leukotoxin. Ca(2+)-binding experiments using HlyI and recombinant A. pleuropneumoniae prohemolysin (HlyIA) that was produced in E. coli show that HlyI binds 45Ca2+, probably because of the 13 glycine-rich repeated domains. Activation of the prohemolysin is not required for Ca2+ binding. Images PMID:1879928

  4. Expressed sequence tags from a NaCl-treated Suaeda salsa cDNA library.

    PubMed

    Zhang, L; Ma, X L; Zhang, Q; Ma, C L; Wang, P P; Sun, Y F; Zhao, Y X; Zhang, H

    2001-04-18

    Past efforts to improve plant tolerance to osmotic stress have had limited success owing to the genetic complexity of stress responses. The first step towards cataloging and categorizing genetically complex abiotic stress responses is the rapid discovery of genes by the large-scale partial sequencing of randomly selected cDNA clones or expressed sequence tags (ESTs). Suaeda salsa, which can survive seawater-level salinity, is a favorite halophytic model for salt-tolerance research. We constructed a NaCl-treated cDNA library of Suaeda salsa and sequenced 1048 randomly selected clones, out of which 1016 clones produced readable sequences (773 showed homology to previously identified genes, 227 matched unknown protein coding regions, 16 anomalous sequences or sequences of bacterial origin were excluded from further analysis). By sequence analysis we identified 492 unique clones: 315 showed homology to previously identified genes, 177 matched unknown protein coding regions (101 of which have been found before in other organisms and 76 are completely novel). All our EST data are available on the Internet. We believe that our dbEST and the associated DNA materials will be a useful source to scientists engaging in stress-tolerance study.
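
    The clone counts in this abstract form a small classification tree, and a short Python tally makes it easy to confirm that the categories add up. This is a bookkeeping sketch only: the dictionary keys and helper names are hypothetical labels introduced here, while the numbers are copied from the abstract.

```python
# Counts copied from the abstract; the category labels are made up for this sketch.
SEQUENCED_CLONES = 1048

readable_clones = {
    "known_gene_homolog": 773,
    "unknown_coding_region": 227,
    "anomalous_or_bacterial": 16,  # excluded from further analysis
}

unique_clones = {
    "known_gene_homolog": 315,
    "unknown_coding_region": 177,
}

unknown_unique_split = {
    "seen_in_other_organisms": 101,
    "completely_novel": 76,
}


def total(counts: dict) -> int:
    """Sum the clone counts of one classification level."""
    return sum(counts.values())


def percentages(counts: dict) -> dict:
    """Share of each category within its level, in percent, to one decimal."""
    level_total = total(counts)
    return {name: round(100.0 * n / level_total, 1) for name, n in counts.items()}


readable_fraction = total(readable_clones) / SEQUENCED_CLONES
print(total(readable_clones), total(unique_clones), percentages(unique_clones))
```

    Keeping each level in its own dictionary mirrors the nesting in the abstract, so a mismatch at one level is easy to localize.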

  5. Nucleic acid (cDNA) and amino acid sequences of alpha-type gliadins from wheat (Triticum aestivum).

    PubMed Central

    Kasarda, D D; Okita, T W; Bernardin, J E; Baecker, P A; Nimmo, C C; Lew, E J; Dietler, M D; Greene, F C

    1984-01-01

    The complete amino acid sequence for an alpha-type gliadin protein of wheat (Triticum aestivum Linnaeus) endosperm has been derived from a cloned cDNA sequence. An additional cDNA clone that corresponds to about 75% of a similar alpha-type gliadin has been sequenced and shows some important differences. About 97% of the composite sequence of A-gliadin (an alpha-type gliadin fraction) has also been obtained by direct amino acid sequencing. This sequence shows a high degree of similarity with amino acid sequences derived from both cDNA clones and is virtually identical to one of them. On the basis of sequence information, after loss of the signal sequence, the mature alpha-type gliadins may be divided into five different domains, two of which may have evolved from an ancestral gliadin gene, whereas the remaining three contain repeating sequences that may have developed independently. Images PMID:6589619

  6. Crustacean hyperglycemic hormones of two cold water crab species, Chionoecetes opilio and C. japonicus: isolation of cDNA sequences and localization of CHH neuropeptide in eyestalk ganglia.

    PubMed

    Chung, J Sook; Ahn, I S; Yu, O H; Kim, D S

    2015-04-01

    Crustacean hyperglycemic hormone (CHH) is primarily known for its prototypical function in hyperglycemia which is induced by the release of CHH. The CHH release takes place as an adaptive response to the energy demands of the animals experiencing stressful environmental, physiological or behavioral conditions. Although >63 decapod CHH nucleotide sequences are known (GenBank), the majority of them come from species inhabiting shallow and warm water. In order to understand the adaptive role of CHH in Chionoecetes opilio and Chionoecetes japonicus inhabiting deep water environments, we first aimed for the isolation of the full-length cDNA sequence of CHH from the eyestalk ganglia of C. opilio (ChoCHH) and C. japonicus (ChjCHH) using degenerate PCR and 5' and 3' RACE. Cho- and ChjCHH cDNA sequences are identical in 5' UTR and ORF with 100% sequence identity of the putative 138 aa of preproCHHs. The 3' UTR of the ChjCHH cDNA sequence is 39 nucleotides shorter than that of ChoCHH. This is the first report in decapod crustaceans that two different species have the identical sequence of CHH. ChoCHH expression increases during embryogenesis of C. opilio and is significantly higher in adult males and females. C. japonicus males have slightly higher ChjCHH expression than C. opilio males, but the difference is not statistically significant. In both species, the immunostaining intensity of CHH is stronger in the sinus gland than in X-organ cells. Future studies will enable us to gain better understanding of the comparative metabolic physiology and endocrinology of cold, deep water species of Chionoecetes spp.

  7. Sequence of a cDNA encoding the bi-specific NAD(P)H-nitrate reductase from the tree Betula pendula and identification of conserved protein regions.

    PubMed

    Friemann, A; Brinkmann, K; Hachtel, W

    1991-05-01

    Nitrate reductase (NR) assays revealed a bispecific NAD(P)H-NR (EC 1.6.6.2.) to be the only nitrate-reducing enzyme in leaves of hydroponically grown birches. To obtain the primary structure of the NAD(P)H-NR, leaf poly(A)+ mRNA was used to construct a cDNA library in the lambda gt11 phage. Recombinant clones were screened with heterologous gene probes encoding NADH-NR from tobacco and squash. A 3.0 kb cDNA was isolated which hybridized to a 3.2 kb mRNA whose level was significantly higher in plants grown on nitrate than in those grown on ammonia. The nucleotide sequence of the cDNA comprises a reading frame encoding a protein of 898 amino acids which reveals 67%-77% identity with NADH-nitrate reductase sequences from higher plants. To identify conserved and variable regions of the multicentre electron-transfer protein a graphical evaluation of identities found in NR sequence alignments was carried out. Thirteen well-conserved sections exceeding a size of 10 amino acids were found in higher plant nitrate reductases. Sequence comparisons with related redox proteins indicate that about half of the conserved NR regions are involved in cofactor binding. The most striking difference in the birch NAD(P)H-NR sequence in comparison to NADH-NR sequences was found at the putative pyridine nucleotide binding site. Southern analysis indicates that the bi-specific NR is encoded by a single copy gene in birch.

  8. Cloning and sequence analysis of a cDNA encoding a Brazil nut protein exceptionally rich in methionine.

    PubMed

    Altenbach, S B; Pearson, K W; Leung, F W; Sun, S S

    1987-05-01

    The primary amino acid sequence of an abundant methionine-rich seed protein found in Brazil nut (Bertholletia excelsa H.B.K.) has been elucidated by protein sequencing and from the nucleotide sequence of cDNA clones. The 9 kDa subunit of this protein was found to contain 77 amino acids of which 14 were methionine (18%) and 6 were cysteine (8%). Over half of the methionine residues in this subunit are clustered in two regions of the polypeptide where they are interspersed with arginine residues. In one of these regions, methionine residues account for 5 out of 6 amino acids and four of these methionine residues are contiguous. The sequence data verify that the Brazil nut sulfur-rich protein is synthesized as a precursor polypeptide that is considerably larger than either of the two subunits of the mature protein. Three proteolytic processing steps by which the encoded polypeptide is sequentially trimmed to the 9 kDa and 3 kDa subunit polypeptides have been correlated with the sequence information. In addition, we have found that the sulfur-rich protein from Brazil nut is homologous in its amino acid sequence to small water-soluble proteins found in two other oilseeds, castor bean (Ricinus communis) and rapeseed (Brassica napus). When the amino acid sequences of these three proteins are aligned to maximize homology, the arrangement of cysteine residues is conserved. However, the two subunits of the Brazil nut protein contain over 19% methionine whereas the homologous proteins from castor bean and rapeseed contain only 2.1% and 2.6% methionine, respectively.
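
    The composition percentages quoted for the 9 kDa subunit follow directly from residue counts. The Python sketch below shows that arithmetic and a general per-residue composition helper. The function names are hypothetical, and the short peptide string is a made-up toy input, not the Brazil nut sequence, which this record does not give.

```python
from collections import Counter


def percent_from_counts(count: int, total: int) -> float:
    """Percentage of a residue given its count and the chain length."""
    if total <= 0 or not 0 <= count <= total:
        raise ValueError("need 0 <= count <= total and total > 0")
    return 100.0 * count / total


def residue_percentages(sequence: str) -> dict:
    """Percent composition of a one-letter amino acid sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    counts = Counter(sequence.upper())
    return {aa: 100.0 * n / len(sequence) for aa, n in counts.items()}


# Counts quoted in the abstract for the 77-residue 9 kDa subunit.
methionine_pct = percent_from_counts(14, 77)
cysteine_pct = percent_from_counts(6, 77)
print(round(methionine_pct), round(cysteine_pct))

# Toy peptide (hypothetical, for illustration only).
toy_composition = residue_percentages("MMRC")
```

    Applied to a real sequence, residue_percentages would reproduce such figures without counting residues by hand.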

  9. [Nucleotide sequence of genes for alpha- and beta-subunits of luciferase from Photobacterium leiognathi].

    PubMed

    Illarionov, B A; Protopopova, M V; Karginov, V A; Mertvetsov, N P; Gitel'zon, I I

    1988-03-01

    Nucleotide sequence of the Photobacterium leiognathi DNA containing genes of alpha and beta subunits of luciferase has been determined. We also deduced amino acid sequence and molecular mass of luciferase and localized luciferase genes in the sequenced DNA fragment.

  10. Interspecies diversity of the occludin sequence: cDNA cloning of human, mouse, dog, and rat-kangaroo homologues.

    PubMed

    Ando-Akatsuka, Y; Saitou, M; Hirase, T; Kishi, M; Sakakibara, A; Itoh, M; Yonemura, S; Furuse, M; Tsukita, S

    1996-04-01

    Occludin has been identified from chick liver as a novel integral membrane protein localizing at tight junctions (Furuse, M., T. Hirase, M. Itoh, A. Nagafuchi, S. Yonemura, Sa. Tsukita, and Sh. Tsukita. 1993. J. Cell Biol. 123:1777-1788). To analyze and modulate the functions of tight junctions, it would be advantageous to know the mammalian homologues of occludin and their genes. Here we describe the nucleotide sequences of full length cDNAs encoding occludin of rat-kangaroo (potoroo), human, mouse, and dog. Rat-kangaroo occludin cDNA was prepared from RNA isolated from PtK2 cell culture, using a mAb against chicken occludin, whereas the others were amplified by polymerase chain reaction based on the sequence found around the human neuronal apoptosis inhibitory protein gene. The amino acid sequences of the three mammalian (human, murine, and canine) occludins were very closely related to each other (approximately 90% identity), whereas they diverged considerably from those of chicken and rat-kangaroo (approximately 50% identity). Implications of these data and novel experimental options in cell biological research are discussed.

  11. cDNA sequence and chromosomal localization of human enterokinase, the proteolytic activator of trypsinogen.

    PubMed

    Kitamoto, Y; Veile, R A; Donis-Keller, H; Sadler, J E

    1995-04-11

    Enterokinase is a serine protease of the duodenal brush border membrane that cleaves trypsinogen and produces active trypsin, thereby leading to the activation of many pancreatic digestive enzymes. Overlapping cDNA clones that encode the complete human enterokinase amino acid sequence were isolated from a human intestine cDNA library. Starting from the first ATG codon, the composite 3696 nt cDNA sequence contains an open reading frame of 3057 nt that encodes a 784 amino acid heavy chain followed by a 235 amino acid light chain; the two chains are linked by at least one disulfide bond. The heavy chain contains a potential N-terminal myristoylation site, a potential signal anchor sequence near the amino terminus, and six structural motifs that are found in otherwise unrelated proteins. These domains resemble motifs of the LDL receptor (two copies), complement component C1r (two copies), the metalloprotease meprin (one copy), and the macrophage scavenger receptor (one copy). The enterokinase light chain is homologous to the trypsin-like serine proteinases. These structural features are conserved among human, bovine, and porcine enterokinase. By Northern blotting, a 4.4 kb enterokinase mRNA was detected only in small intestine. The enterokinase gene was localized to human chromosome 21q21 by fluorescence in situ hybridization.

  12. Nucleic acid (cDNA) and amino acid sequences of the maize endosperm protein glutelin-2.

    PubMed Central

    Prat, S; Cortadas, J; Puigdomènech, P; Palau, J

    1985-01-01

    The cDNA coding for a glutelin-2 protein from maize endosperm has been cloned and the complete amino acid sequence of the protein derived for the first time. An immature maize endosperm cDNA bank was screened for the expression of a beta-lactamase:glutelin-2 (G2) fusion polypeptide by using antibodies against the purified 28 kd G2 protein. A clone corresponding to the 28 kd G2 protein was sequenced and the primary structure of this protein was derived. Five regions can be defined in the protein sequence: an 11 residue N-terminal part, a repeated region formed by eight units of the sequence Pro-Pro-Pro-Val-His-Leu, an alternating Pro-X stretch 21 residues long, a Cys rich domain and a C-terminal part rich in Gln. The protein sequence is preceded by 19 residues which have the characteristics of the signal peptide found in secreted proteins. Unlike zeins, the main maize storage proteins, 28 kd glutelin-2 has several homologous sequences in common with other cereal storage proteins. Images PMID:3839076

  13. Nucleotide sequence of the human N-myc gene

    SciTech Connect

    Stanton, L.W.; Schwab, M.; Bishop, J.M.

    1986-03-01

    Human neuroblastomas frequently display amplification and augmented expression of a gene known as N-myc because of its similarity to the protooncogene c-myc. It has therefore been proposed that N-myc is itself a protooncogene, and subsequent tests have shown that N-myc and c-myc have similar biological activities in cell culture. The authors have now detailed the kinship between N-myc and c-myc by determining the nucleotide sequence of human N-myc and deducing the amino acid sequence of the protein encoded by the gene. The topography of N-myc is strikingly similar to that of c-myc: both genes contain three exons of similar lengths; the coding elements of both genes are located in the second and third exons; and both genes have unusually long 5' untranslated regions in their mRNAs, with features that raise the possibility that expression of the genes may be subject to similar controls of translation. The resemblance between the proteins encoded by N-myc and c-myc sustains previous suspicions that the genes encode related functions.

  14. Nucleotide sequence from the coding region of rabbit β-globin messenger RNA

    PubMed Central

    Proudfoot, N.J.

    1976-01-01

    A sequence of 89 nucleotides from rabbit β-globin mRNA has been determined and is shown to code for residues 107 to 137 of the β-globin protein. In addition, a sequence heterogeneity has been identified within this 89 nucleotide long sequence which corresponds to a known polymorphic variant of rabbit β-globin. Images PMID:61580

  15. Molecular cloning, sequencing and expression of cDNA encoding human trehalase.

    PubMed

    Ishihara, R; Taketani, S; Sasai-Takedatsu, M; Kino, M; Tokunaga, R; Kobayashi, Y

    1997-11-20

    A complete cDNA clone encoding human trehalase, a glycoprotein of brush-border membranes, has been isolated from a human kidney library. The cDNA encodes a protein of 583 amino acids with a calculated molecular weight of 66,595. Human enzyme contains a typical cleavable signal peptide at amino terminus, five potential glycosylation sites, and a hydrophobic region at carboxyl terminus where the protein is anchored to plasma membranes via glycosylphosphatidylinositol. The deduced amino acid sequence of the human enzyme showed similarity to sequences of the enzyme from rabbit, silk worm, Tenebrio molitor, Escherichia coli and yeast. Northern blots revealed that human trehalase mRNA of approx. 2.0 kb was found mainly in the kidney, liver and small intestine. Expression of the recombinant trehalase in E. coli provided a high level of the enzyme activity. The isolation and expression of cDNA for human trehalase should facilitate studies of the structure of the gene, as well as a basis for a better understanding of the catalytic mechanism.

  16. 2058 Expressed sequence tags (ESTs) from a human fetal lung cDNA library

    SciTech Connect

    Kazunori Sudo; Katsuya Chinen; Yusuke Nakamura

    1994-11-15

    ESTs (expressed sequence tags) provide complementary resources for structural and functional analyses of the human genome. The authors have performed single-pass sequencing of 2058 randomly selected, directionally cloned cDNAs isolated from a fetal-lung cDNA library constructed with oligo (dT) primers. Computer analyses of the 5'-end sequences revealed that 60.4% of the clones were considered to be identical to previously reported human genes or ESTs; 9.0% of them showed significant homology to known genes in human, other mammals, or lower organisms; 30.6% showed no homology to any genes or DNA sequences in the public database. These data and reagents will be useful for future investigations of gene expression during prenatal development of human lung. 11 refs., 1 fig., 2 tabs.

  17. cDNA cloning and sequencing of human fibrillarin, a conserved nucleolar protein recognized by autoimmune antisera

    SciTech Connect

    Aris, J.P.; Blobel, G.

    1991-02-01

    The authors have isolated a 1.1-kilobase cDNA clone that encodes human fibrillarin by screening a hepatoma library in parallel with DNA probes derived from the fibrillarin genes of Saccharomyces cerevisiae (NOP1) and Xenopus laevis. RNA blot analysis indicates that the corresponding mRNA is approximately 1,300 nucleotides in length. Human fibrillarin expressed in vitro migrates on SDS gels as a 36-kDa protein that is specifically immunoprecipitated by antisera from humans with scleroderma autoimmune disease. Human fibrillarin contains an amino-terminal repetitive domain approximately 75-80 amino acids in length that is rich in glycine and arginine residues and is similar to amino-terminal domains in the yeast and Xenopus fibrillarins. The occurrence of a putative RNA-binding domain and an RNP consensus sequence within the protein is consistent with the association of fibrillarin with small nucleolar RNAs. Protein sequence alignments show that 67% of amino acids from human fibrillarin are identical to those in yeast fibrillarin and that 81% are identical to those in Xenopus fibrillarin. This identity suggests the evolutionary conservation of an important function early in the pathway for ribosome biosynthesis.

  18. Molecular Cloning and Sequence Analysis of Novel Cytochrome P450 cDNA Fragments from Dastarcus helophoroides

    PubMed Central

    Wang, Hai-Dong; Li, Fei-Fei; He, Cai; Cui, Jun; Song, Wang; Li, Meng-Lou

    2014-01-01

    The predatory beetle Dastarcus helophoroides (Fairmaire) (Coleoptera: Bothrideridae) is a natural enemy of many longhorned beetles and is mainly distributed in both China and Japan. To date, no research on D. helophoroides P450 enzymes has been reported. In our study, for the better understanding of P450 enzymes in D. helophoroides, 100 novel cDNA fragments encoding cytochrome P450 were amplified from the total RNA of adult D. helophoroides abdomens using five pairs of degenerate primers designed according to the conserved amino acid sequences of the CYP6 family genes in insects through RT-PCR. The obtained nucleotide sequences were 250 bp, 270 bp, and 420 bp in length depending on different primers. Ninety-six fragments were determined to represent CYP6 genes, mainly from CYP6BK, CYP6BQ, and CYP6BR subfamilies, and four fragments were determined to represent CYP9 genes. Twenty-two fragments, submitted to GenBank, were selected for further homologous analysis, which revealed that some fragments of different sizes might be parts of the same P450 gene. PMID:25373175

  19. Time scale for cyclostome evolution inferred with a phylogenetic diagnosis of hagfish and lamprey cDNA sequences.

    PubMed

    Kuraku, Shigehiro; Kuratani, Shigeru

    2006-12-01

    The Cyclostomata consists of the two orders Myxiniformes (hagfishes) and Petromyzoniformes (lampreys), and its monophyly has been unequivocally supported by recent molecular phylogenetic studies. Under this updated vertebrate phylogeny, we performed in silico evolutionary analyses using currently available cDNA sequences of cyclostomes. We first calculated the GC-content at four-fold degenerate sites (GC(4)), which revealed that an extremely high GC-content is shared by all the lamprey species we surveyed, whereas no striking pattern in GC-content was observed in any of the hagfish species surveyed. We then estimated the timing of diversification in cyclostome evolution using nucleotide and amino acid sequences. We obtained divergence times of 470-390 million years ago (Mya) in the Ordovician-Silurian-Devonian Periods for the interordinal split between Myxiniformes and Petromyzoniformes; 90-60 Mya in the Cretaceous-Tertiary Periods for the split between the two hagfish subfamilies, Myxininae and Eptatretinae; 280-220 Mya in the Permian-Triassic Periods for the split between the two lamprey subfamilies, Geotriinae and Petromyzoninae; and 30-10 Mya in the Tertiary Period for the split between the two lamprey genera, Petromyzon and Lethenteron. This evolutionary configuration indicates that Myxiniformes and Petromyzoniformes diverged shortly after the common ancestor of cyclostomes split from the future gnathostome lineage. Our results also suggest that intra-subfamilial diversification in hagfish and lamprey lineages (especially those distributed in the northern hemisphere) occurred in the Cretaceous or Tertiary Periods.
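
    The GC-content at four-fold degenerate sites (GC4) mentioned in this abstract can be illustrated with a minimal sketch. It assumes an in-frame coding sequence and the standard genetic code, in which the eight codon families listed below accept any nucleotide at the third position. The function name `gc4` and the input handling are hypothetical and are not taken from the study.

```python
# Codon families (first two bases) whose third position is four-fold
# degenerate under the standard genetic code:
# Leu (CTN), Val (GTN), Ser (TCN), Pro (CCN), Thr (ACN), Ala (GCN),
# Arg (CGN), Gly (GGN).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def gc4(cds: str) -> float:
    """Fraction of G or C at four-fold degenerate third codon positions.

    `cds` is assumed to be an in-frame coding sequence; trailing bases
    that do not fill a codon are ignored.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    gc = total = 0
    for i in range(0, usable, 3):
        codon = seq[i:i + 3]
        # Skip codons outside the four-fold families and ambiguous bases.
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in "ACGT":
            total += 1
            if codon[2] in "GC":
                gc += 1
    if total == 0:
        raise ValueError("no four-fold degenerate codons found")
    return gc / total
```

    For the toy sequence `GCGGCCCTAATG`, three codons (GCG, GCC, CTA) fall in four-fold families and two of them end in G or C, so `gc4` returns 2/3.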

  20. Molecular cloning and sequence analysis of novel cytochrome P450 cDNA fragments from Dastarcus helophoroides.

    PubMed

    Wang, Hai-Dong; Li, Fei-Fei; He, Cai; Cui, Jun; Song, Wang; Li, Meng-Lou

    2014-02-26

    The predatory beetle Dastarcus helophoroides (Fairmaire) (Coleoptera: Bothrideridae) is a natural enemy of many longhorned beetles and is mainly distributed in both China and Japan. To date, no research on D. helophoroides P450 enzymes has been reported. In our study, for the better understanding of P450 enzymes in D. helophoroides, 100 novel cDNA fragments encoding cytochrome P450 were amplified from the total RNA of adult D. helophoroides abdomens using five pairs of degenerate primers designed according to the conserved amino acid sequences of the CYP6 family genes in insects through RT-PCR. The obtained nucleotide sequences were 250 bp, 270 bp, and 420 bp in length depending on different primers. Ninety-six fragments were determined to represent CYP6 genes, mainly from CYP6BK, CYP6BQ, and CYP6BR subfamilies, and four fragments were determined to represent CYP9 genes. Twenty-two fragments, submitted to GenBank, were selected for further homologous analysis, which revealed that some fragments of different sizes might be parts of the same P450 gene.

  1. Nucleotide sequence of a cloned woodchuck hepatitis virus genome: comparison with the hepatitis B virus sequence.

    PubMed Central

    Galibert, F; Chen, T N; Mandart, E

    1982-01-01

    The complete nucleotide sequence of a woodchuck hepatitis virus genome cloned in Escherichia coli was determined by the method of Maxam and Gilbert. This sequence was found to be 3,308 nucleotides long. Potential ATG initiator triplets and nonsense codons were identified and used to locate regions with a substantial coding capacity. A striking similarity was observed between the organization of human hepatitis B virus and woodchuck hepatitis virus. Nucleotide sequences of these open regions in the woodchuck virus were compared with corresponding regions present in hepatitis B virus. This allowed the location of four viral genes on the L strand and indicated the absence of protein coded by the S strand. Evolution rates of the various parts of the genome as well as of the four different proteins coded by hepatitis B virus and woodchuck hepatitis virus were compared. These results indicated that: (i) the core protein has evolved slightly less rapidly than the other proteins; and (ii) when a region of DNA codes for two different proteins, there is less freedom for the DNA to evolve and, moreover, one of the proteins can evolve more rapidly than the other. A hairpin structure, very well conserved in the two genomes, was located in the only region devoid of coding function, suggesting the location of the origin of replication of the viral DNA. Images PMID:7086958
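
    Locating regions with coding capacity from ATG initiator triplets and nonsense codons, as described in this abstract, can be sketched as a simple open-reading-frame scan. This is an illustrative simplification, not the authors' procedure: the function name `find_orfs` and the `min_codons` threshold are hypothetical, only the strand passed in is scanned, and the sequence is treated as linear.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2):
    """Return (start, end, frame) tuples for ATG-to-stop open reading frames.

    `start` is the 0-based index of the ATG, `end` is exclusive and includes
    the stop codon, and `frame` is 0, 1 or 2. `min_codons` counts the codons
    before the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG after the previous stop opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # resume the search after this stop codon
    return orfs
```

    To cover both strands, the same function could be applied to the reverse complement of the sequence.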

  2. Complete nucleotide sequence of a monopartite Begomovirus and associated satellites infecting Carica papaya in Nepal.

    PubMed

    Shahid, M S; Yoshida, S; Khatri-Chhetri, G B; Briddon, R W; Natsuaki, K T

    2013-06-01

    Carica papaya (papaya) is a fruit crop that is cultivated mostly in kitchen gardens throughout Nepal. Leaf samples of C. papaya plants with leaf curling, vein darkening, vein thickening, and a reduction in leaf size were collected from a garden in Darai village, Rampur, Nepal in 2010. Full-length clones of a monopartite Begomovirus, a betasatellite and an alphasatellite were isolated. The complete nucleotide sequence of the Begomovirus showed the arrangement of genes typical of Old World begomoviruses with the highest nucleotide sequence identity (>99 %) to an isolate of Ageratum yellow vein virus (AYVV), confirming it as an isolate of AYVV. The complete nucleotide sequence of the betasatellite showed greater than 89 % nucleotide sequence identity to an isolate of Tomato leaf curl Java betasatellite originating from Indonesia. The sequence of the alphasatellite displayed 92 % nucleotide sequence identity to Sida yellow vein China alphasatellite. This is the first identification of these components in Nepal and the first time they have been identified in papaya.

  3. Murine muscle-specific enolase: cDNA cloning, sequence, and developmental expression.

    PubMed Central

    Lamandé, N; Mazo, A M; Lucas, M; Montarras, D; Pinset, C; Gros, F; Legault-Demare, L; Lazar, M

    1989-01-01

    In vertebrates, the glycolytic enzyme enolase (EC 4.2.1.11) is present as homodimers and heterodimers formed from three distinct subunits of identical molecular weight, alpha, beta, and gamma. We report the cloning and sequencing of a cDNA encoding the beta subunit of murine muscle-specific enolase. The corresponding amino acid sequence shows greater than 80% homology with the beta subunit from chicken obtained by protein sequencing and with alpha and gamma subunits from rat and mouse deduced from cloned cDNAs. In contrast, there is no homology between the 3' untranslated regions of mouse alpha, beta, and gamma enolase mRNAs, which also differ greatly in length. The short 3' untranslated region of beta enolase mRNA accounts for its distinct length, 1600 bases. It is known that a progressive transition from alpha alpha to beta beta enolase occurs in developing skeletal muscle. We show that this transition mainly results from a differential regulation of alpha and beta mRNA levels. Analysis of myogenic cell lines shows that beta enolase gene is expressed at the myoblast stage. Moreover, transfection of premyogenic C3H10T1/2 cells with MyoD1 cDNA shows that the initial expression of beta transcripts occurs during the very first steps of the myogenic pathway, suggesting that it could be a marker event of myogenic lineage determination. Images PMID:2734297

  4. Maple syrup urine disease. Complete primary structure of the E1 beta subunit of human branched chain alpha-ketoacid dehydrogenase complex deduced from the nucleotide sequence and a gene analysis of patients with this disease.

    PubMed Central

    Nobukuni, Y; Mitsubuchi, H; Endo, F; Akaboshi, I; Asaka, J; Matsuda, I

    1990-01-01

    A defect in the E1 beta subunit of the branched chain alpha-ketoacid dehydrogenase (BCKDH) complex is one cause of maple syrup urine disease (MSUD). In an attempt to elucidate the molecular basis of MSUD, we isolated and characterized a 1.35 kbp cDNA clone encoding the entire precursor of the E1 beta subunit of BCKDH complex from a human placental cDNA library. Nucleotide sequence analysis revealed that the isolated cDNA clone (lambda hBE1 beta-1) contained a 5'-untranslated sequence of four nucleotides, the translated sequence of 1,176 nucleotides and the 3'-untranslated sequence of 169 nucleotides. Comparison of the amino acid sequence predicted from the nucleotide sequence of the cDNA insert of the clone with the NH2-terminal amino acid sequence of the purified mature bovine BCKDH-E1 beta subunit showed that the cDNA insert encodes a 342-amino acid subunit with a Mr = 37,585. The subunit is synthesized as the precursor with a leader sequence of 50 amino acids and is processed at the NH2 terminus. A search for protein homology revealed that the primary structure of human BCKDH-E1 beta was similar to the bovine BCKDH-E1 beta and to the E1 beta subunit of human pyruvate dehydrogenase complex, in all regions. The structures and functions of mammalian alpha-ketoacid dehydrogenase complexes are apparently highly conserved. Genomic DNA from lymphoblastoid cell lines derived from normal individuals and five MSUD patients, in whom E1 beta was not detected by immunoblot analysis, gave the same restriction maps on Southern blot analysis. The gene spans at least 80 kbp. Images PMID:2365818
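
    The lengths reported in this abstract can be cross-checked with simple codon arithmetic. The sketch below uses only the figures given in the abstract; the helper name `coding_nt` is hypothetical. The check suggests that the 1,176 translated nucleotides correspond to the 392 codons of the precursor without a stop codon, which is an inference from the numbers rather than something the abstract states.

```python
def coding_nt(n_residues: int, include_stop: bool = False) -> int:
    """Nucleotides needed to encode `n_residues` amino acids (3 per codon)."""
    return 3 * n_residues + (3 if include_stop else 0)


# Figures taken from the abstract.
mature_residues = 342      # mature E1 beta subunit
leader_residues = 50       # leader sequence of the precursor
utr5, utr3 = 4, 169        # untranslated nucleotides at each end

precursor_residues = mature_residues + leader_residues   # 392 residues
translated = coding_nt(precursor_residues)               # 1176 nucleotides
insert_length = utr5 + translated + utr3                 # 1349 nucleotides

print(precursor_residues, translated, insert_length)
```

    The total of 1,349 nucleotides agrees with the reported clone size of about 1.35 kbp.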

  5. Achieving high throughput sequencing of a cDNA library utilizing an alternative protocol for the bench top next-generation sequencing system.

    PubMed

    Wan, Minxi; Faruq, Junaid; Rosenberg, Julian N; Xia, Jinlan; Oyler, George A; Betenbaugh, Michael J

    2013-02-15

    The development of next-generation sequencing (NGS) technologies has provided novel tools for genome analysis and expression profiling. A high throughput cDNA sequencing method using a bench top next-generation sequencing system, GS Junior, is now available. Here, we used an alternative protocol to the standard method for generating the cDNA library. This protocol can decrease the number of processing steps to manipulate RNA when constructing a cDNA library from an RNA sample, and does not require mRNA isolation from total RNA. Thus it can decrease the risk of RNA degradation and the cost for preparing a cDNA library. Also, the efficiency of sequencing data obtained with this approach is comparable to the standard method as verified by sequencing characteristics and expression levels of the reference gene glyceraldehyde-3-phosphate dehydrogenase (GAPDH).

  6. Nucleotide sequences of the cylindrical inclusion protein genes of two Japanese zucchini yellow mosaic virus isolates.

    PubMed

    Kundu, A K; Ohshima, K; Sako, N; Yaegashi, H

    1999-02-01

    The nucleotide sequences of the cylindrical inclusion protein (CIP) genes of two Japanese zucchini yellow mosaic virus (ZYMV) isolates (ZYMV-169 and ZYMV-M) were determined. The CIP genes of both isolates comprised 1902 nucleotides and encoded 634 amino acids containing a consensus nucleotide-binding motif. The sequence similarities between the two isolates at the nucleotide and amino acid levels were 91% and 98%, respectively. When the CIP gene sequences of the Japanese ZYMV isolates were compared with those of previously reported ZYMV isolates, the nucleotide and amino acid sequence similarities ranged between 81% and 97%, and between 95% and 97%, respectively. Phylogenetic analysis of the deduced amino acid sequences of the CIP genes indicated that the Japanese ZYMV isolates were closely related to those of other ZYMV isolates.
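
    Pairwise similarity figures such as those quoted in this abstract are commonly computed as percent identity over an alignment. The sketch below is a minimal illustration that assumes the two sequences are already aligned to equal length, with `-` marking gaps; the function name `percent_identity` and the choice to skip gapped columns are assumptions, and the abstract does not say how its values were calculated.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity over the ungapped columns of two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns where neither sequence has a gap character.
    columns = [(x, y) for x, y in zip(a.upper(), b.upper())
               if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)
```

    The same function works for nucleotide and amino acid alignments. Because synonymous substitutions change the nucleotide sequence but not the protein, nucleotide identity can be lower than amino acid identity for the same gene, as in the 91% and 98% values reported here.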

  7. Isolation and sequence of a cDNA clone for human tyrosinase that maps at the mouse c-albino locus

    SciTech Connect

    Kwon, B.S.; Haq, A.K.; Pomerantz, S.H.; Halaban, R.

    1987-11-01

    Screening of a lambda gt11 human melanocyte cDNA library with antibodies against hamster tyrosinase resulted in the isolation of 16 clones. The cDNA inserts from 13 of the 16 clones cross-hybridized with each other, indicating that they were from related mRNA species. One of the cDNA clones, Pmel34, detected one mRNA species with an approximate length of 2.4 kilobases that was expressed preferentially in normal and malignant melanocytes but not in other cell types. The amino acid sequence deduced from the nucleotide sequence showed that the putative human tyrosinase is composed of 548 amino acids with a molecular weight of 62,610. The deduced protein contains glycosylation sites and histidine-rich sites that could be used for copper binding. Southern blot analysis of DNA derived from newborn mice carrying lethal albino deletion mutations revealed that Pmel34 maps near or at the c-albino locus, the position of the structural gene for tyrosinase.

  8. Development of single-nucleotide polymorphism markers for Bromus tectorum (Poaceae) from a partially sequenced transcriptome

    PubMed Central

    Merrill, Keith R.; Coleman, Craig E.; Meyer, Susan E.; Leger, Elizabeth A.; Collins, Katherine A.

    2016-01-01

    Premise of the study: Bromus tectorum (Poaceae) is an annual grass species that is invasive in many areas of the world but most especially in the U.S. Intermountain West. Single-nucleotide polymorphism (SNP) markers were developed for use in investigating the geospatial and ecological diversity of B. tectorum in the Intermountain West to better understand the mechanisms behind its successful invasion. Methods and Results: Normalized cDNA libraries from six diverse B. tectorum individuals were pooled and sequenced using 454 sequencing. Ninety-five SNP assays were developed for use on 96.96 arrays with the Fluidigm EP1 genotyping platform. Verification of the 95 SNPs by genotyping 251 individuals from 12 populations is reported, along with amplification data from four related Bromus species. Conclusions: These SNP markers are polymorphic across populations of B. tectorum, are optimized for high-throughput applications, and may be applicable to other, related Bromus species. PMID:27843723

  9. An efficient strategy for large-scale high-throughput transposon-mediated sequencing of cDNA clones

    PubMed Central

    Butterfield, Yaron S. N.; Marra, Marco A.; Asano, Jennifer K.; Chan, Susanna Y.; Guin, Ranabir; Krzywinski, Martin I.; Lee, Soo Sen; MacDonald, Kim W. K.; Mathewson, Carrie A.; Olson, Teika E.; Pandoh, Pawan K.; Prabhu, Anna-Liisa; Schnerch, Angelique; Skalska, Ursula; Smailus, Duane E.; Stott, Jeff M.; Tsai, Miranda I.; Yang, George S.; Zuyderduyn, Scott D.; Schein, Jacqueline E.; Jones, Steven J. M.

    2002-01-01

    We describe an efficient high-throughput method for accurate DNA sequencing of entire cDNA clones. Developed as part of our involvement in the Mammalian Gene Collection full-length cDNA sequencing initiative, the method has been used and refined in our laboratory since September 2000. The method is amenable to large-scale projects: we have used it to generate >7 Mb of accurate sequence from 3695 candidate full-length cDNAs. Sequencing is accomplished through the insertion of a Mu transposon into cDNAs, followed by sequencing reactions primed with Mu-specific sequencing primers. Transposon insertion reactions are not performed with individual cDNAs but rather on pools of up to 96 clones. This pooling strategy reduces the number of transposon insertion sequencing libraries that would otherwise be required, reducing the costs and enhancing the efficiency of the transposon library construction procedure. Sequences generated using transposon-specific sequencing primers are assembled to yield the full-length cDNA sequence, with sequence editing and other sequence finishing activities performed as required to resolve sequence ambiguities. Although analysis of the many thousands (22,785) of sequenced Mu transposon insertion events revealed a weak sequence preference for Mu insertion, we observed insertion of the Mu transposon into 1015 of the possible 1024 5mer candidate insertion sites. PMID:12034834
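
    The figure of 1024 possible 5mer insertion sites follows from four nucleotides at each of five positions (4^5 = 1024). A minimal sketch of how coverage of that space could be tallied is shown below; the function name `kmer_coverage` and the input format (a list of observed site strings) are hypothetical and do not describe the authors' analysis pipeline.

```python
from itertools import product


def kmer_coverage(observed_sites, k: int = 5):
    """Return (number seen, number possible, sorted missing k-mers).

    Sites that are not exactly `k` unambiguous bases (A, C, G, T) are ignored.
    """
    all_kmers = {"".join(p) for p in product("ACGT", repeat=k)}
    seen = {site.upper() for site in observed_sites} & all_kmers
    return len(seen), len(all_kmers), sorted(all_kmers - seen)


seen, possible, missing = kmer_coverage(["ACGTA", "acgta", "TTTTT", "NNNNN"])
print(seen, possible, len(missing))
```

    With the counts reported in the abstract, 1015 of 1024 sites corresponds to roughly 99.1% of all possible 5mers.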

  10. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  11. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  12. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  13. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  14. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  15. Cloning and sequence analysis of a full-length cDNA of SmPP1cb encoding turbot protein phosphatase 1 beta catalytic subunit

    NASA Astrophysics Data System (ADS)

    Qi, Fei; Guo, Huarong; Wang, Jian

    2008-02-01

    Reversible protein phosphorylation, catalyzed by protein kinases and phosphatases, is an important and versatile mechanism by which eukaryotic cells regulate almost all signaling processes. Protein phosphatase 1 (PP1) is the first and well-characterized member of the protein serine/threonine phosphatase family. In the present study, a full-length cDNA encoding the beta isoform of the catalytic subunit of protein phosphatase 1 (PP1cb), designated SmPP1cb, was isolated and sequenced for the first time from the skin tissue of the flatfish turbot Scophthalmus maximus by the rapid amplification of cDNA ends (RACE) technique. The SmPP1cb cDNA sequence we obtained contains a 984 bp open reading frame (ORF), flanked by a complete 39 bp 5' untranslated region and a 462 bp 3' untranslated region. The ORF encodes a putative 327 amino acid protein with a calculated molecular mass of 37,193 Da and a pI of 5.8. The N-terminal section of this protein, Met-Ala-Glu-Gly-Glu-Leu-Asp-Val-Asp, is highly acidic, a feature common to PP1 catalytic subunits but absent in protein phosphatase 2B (PP2B). Sequence analysis indicated that SmPP1cb is extremely conserved at both the amino acid and nucleotide levels compared with the PP1cb of other vertebrates and invertebrates. Its Kozak motif, contained in the 5'UTR around the ATG start codon, is GXXAXXGXX ATGG, which differs from the mammalian motif at two positions, A-6 and G-3, indicating the possibility of different initiation of translation in turbot. The 3'UTR of SmPP1cb is highly diverse in sequence similarity and length compared with those of other animals, especially zebrafish. The cloning and sequencing of the SmPP1cb gene lays a good foundation for future work on the biological functions of PP1 in the flatfish turbot.

  16. Nuclear-encoded chloroplast ribosomal protein L12 of Nicotiana tabacum: characterization of mature protein and isolation and sequence analysis of cDNA clones encoding its cytoplasmic precursor.

    PubMed Central

    Elhag, G A; Thomas, F J; McCreery, T P; Bourque, D P

    1992-01-01

    Poly(A)+ mRNA isolated from Nicotiana tabacum (cv. Petite Havana) leaves was used to prepare a cDNA library in the expression vector lambda gt11. Recombinant phage containing cDNAs coding for chloroplast ribosomal protein L12 were identified and sequenced. Mature tobacco L12 protein has 44% amino acid identity with ribosomal protein L7/L12 of Escherichia coli. The longest L12 cDNA (733 nucleotides) codes for a 13,823 molecular weight polypeptide with a transit peptide of 53 amino acids and a mature protein of 133 amino acids. The transit peptide and mature protein share 43% and 79% amino acid identity, respectively, with corresponding regions of spinach chloroplast ribosomal protein L12. The predicted amino terminus of the mature protein was confirmed by partial sequence analysis of HPLC-purified tobacco chloroplast ribosomal protein L12. A single L12 mRNA of about 0.8 kb was detected by hybridization of L12 cDNA to poly(A)+ and total leaf RNA. Hybridization patterns of restriction fragments of tobacco genomic DNA probed with the L12 cDNA suggested the existence of more than one gene for ribosomal protein L12. Characterization of a second cDNA with an identical L12 coding sequence but a different 3'-noncoding sequence provided evidence that at least two L12 genes are expressed in tobacco. Images PMID:1542565

  17. Antigenic characteristics and cDNA sequences of HLA-B73.

    PubMed

    Hoffmann, H J; Kristensen, T J; Jensen, T G; Graugaard, B; Lamm, L U

    1995-06-01

    The cDNA sequence and serological data for HLA-B73 are reported. Anti-B73 sera are found relatively frequently, considering the rarity of the antigen. It was noted early that in some cases the antibodies in sera of multiparous women did not react with the eliciting cells (fathers) and thus all behaved as a naturally occurring antibody. We report on 18 B73 antisera found during the screening of 55,000 Danish sera. Only one of the 17 stimulators typed also had the B73 tissue type. Ten of the stimulators had antigens from the B7 CREG (B7, B22, B27, B42, B67, B73), whereas none of the responders had such tissue types. In seven cases the serum was not able to react with the stimulator's lymphocytes in a cytotoxicity assay and in four cases the stimulator lymphocytes could not deplete the anti-B73 activity from the serum in absorption experiments. The cDNA of B73 was expressed correctly in COS cells and was recognized on the cell surface by a monospecific serum. The alpha 1 alpha 2 domains of B73 are most similar to those of the HLA-B22 family. Interestingly, the alpha 3 and transmembrane domains of HLA-B73 are not standard human domains, but are most similar to the corresponding domains of some gorilla and chimpanzee HLA-B genes.

  18. Sequence of a cDNA clone encoding the polysialic acid-rich and cytoplasmic domains of the neural cell adhesion molecule N-CAM.

    PubMed Central

    Hemperly, J J; Murray, B A; Edelman, G M; Cunningham, B A

    1986-01-01

    Purified fractions of the neural cell-adhesion molecule N-CAM from embryonic chicken brain contain two similar polypeptides (Mr, 160,000 and 130,000), each containing an amino-terminal external binding region, a carbohydrate-rich central region, and a carboxyl-terminal region that is associated with the cell. Previous studies indicate that the two polypeptides arise by alternative splicing of mRNAs transcribed from a single gene. We report here the 3556-nucleotide sequence of a cDNA clone (pEC208) that encodes 964 amino acids from the carbohydrate and cell-associated domains of the larger N-CAM polypeptide followed by 664 nucleotides of 3' untranslated sequence. The predicted protein sequence contains attachment sites for polysialic acid-containing oligosaccharides, four tandem homologous regions of polypeptide resembling those seen in the immunoglobulin superfamily, and a single hydrophobic sequence that appears to be the membrane-spanning segment. The cytoplasmic domain carboxyl terminal to this segment includes a block of approximately 250 amino acids present in the larger but not in the smaller N-CAM polypeptide. We designate these the ld (large domain) polypeptide and the sd (small domain) polypeptide. The intracellular domains of the ld and sd polypeptides are likely to be critical for cell-surface modulation of N-CAM by interacting in a differential fashion with other intrinsic proteins or with the cytoskeleton. PMID:3458261

  19. cDNA, genomic sequence cloning and overexpression of ribosomal protein gene L9 (rpL9) of the giant panda (Ailuropoda melanoleuca).

    PubMed

    Hou, W R; Hou, Y L; Wu, G F; Song, Y; Su, X L; Sun, B; Li, J

    2011-01-01

    The ribosomal protein L9 (RPL9), a component of the large subunit of the ribosome, has an unusual structure, comprising two compact globular domains connected by an α-helix; it interacts with 23S rRNA. To obtain information about rpL9 of Ailuropoda melanoleuca (the giant panda) we designed primers based on the known mammalian nucleotide sequence. RT-PCR and PCR strategies were employed to isolate cDNA and the rpL9 gene from A. melanoleuca; these were sequenced and analyzed. We overexpressed cDNA of the rpL9 gene in Escherichia coli BL21. The cloned cDNA fragment was 627 bp in length, containing an open reading frame of 579 bp. The deduced protein is composed of 192 amino acids, with an estimated molecular mass of 21.86 kDa and an isoelectric point of 10.36. The length of the genomic sequence is 3807 bp, including six exons and five introns. Based on alignment analysis, rpL9 has high similarity among species; we found 85% agreement of DNA and amino acid sequences with the other species that have been analyzed. Based on topology predictions, there are two N-glycosylation sites, five protein kinase C phosphorylation sites, one casein kinase II phosphorylation site, two tyrosine kinase phosphorylation sites, three N-myristoylation sites, one amidation site, and one ribosomal protein L6 signature 2 in the L9 protein of A. melanoleuca. The rpL9 gene can be readily expressed in E. coli; it fuses with the N-terminal GST-tagged protein, giving rise to the accumulation of an expected 26.51-kDa polypeptide, which is in good agreement with the predicted molecular weight. This expression product could be used for purification and further study of its function.

  20. Nucleotide sequence of the Lactococcus lactis NCDO 763 (ML3) rpoD gene.

    PubMed

    Gansel, X; Hartke, A; Boutibonnes, P; Auffray, Y

    1993-10-19

    The complete nucleotide sequence of rpoD gene from Lactococcus lactis has been determined. The nucleotide data have indicated the presence of an open reading frame of 1020 base pairs encoding a polypeptide which shares the framework structure for principal sigma factors of eubacteria strains.

  1. Nucleotide sequence of a lysine transfer ribonucleic acid from bakers' yeast.

    PubMed

    Madison, J T; Boguslawski, S J; Teetor, G H

    1972-05-12

    The nucleotide sequence of one of the two major lysine transfer RNA's from bakers' yeast has been determined. Its structure is compared to that of a lysine tRNA from a haploid yeast. A total of 21 nucleotides differ in the two molecules. Only the T-psi-C-G (thymidine-pseudouridine-cytidine-guanosine) loop and its supporting stem are identical.

  2. Isolation and sequencing of the cDNA of a novel cytochrome P450 from rat oesophagus.

    PubMed

    Brookman-Amissah, N; Mackay, A G; Swann, P F

    2001-01-01

    RT-PCR was used to determine whether cytochromes P450 of the 2A, 2B and 2E sub-families are expressed in the rat oesophagus. This showed that this tissue expresses a previously unknown member of the CYP2B sub-family, now designated CYP2B21. Using a combination of 5'- and 3'-RACE (rapid amplification of cDNA ends) and library screening, the cDNA was amplified and sequenced. The cDNA sequence (GenBank accession no. AF159245) covers the whole of the coding region and the whole of the 3'-untranslated region (UTR), but only 17 nt of the 5'-UTR. The DNA sequence has strong similarity to those of CYP2B1 and CYP2B2, with the derived amino acid sequence being 84 and 83% identical, respectively. The ease with which this cDNA was found in the cDNA library suggests that CYP2B21 is a major P450 of the oesophagus. The catalytic activity of this new CYP2B is not yet known, but as previous authors have reported that other members of this sub-family (CYP2B1 or 2B2) metabolize the selective oesophageal carcinogen N-nitrosomethylbutylamine with the chemical selectivity necessary for carcinogenesis, i.e. they preferentially hydroxylate the alpha-carbon of the butyl chain, this new CYP2B may be the nitrosamine-activating enzyme of the oesophagus.

  3. Sequence of a novel cytochrome CYP2B cDNA coding for a protein which is expressed in a sebaceous gland, but not in the liver.

    PubMed Central

    Friedberg, T; Grassow, M A; Bartlomowicz-Oesch, B; Siegert, P; Arand, M; Adesnik, M; Oesch, F

    1992-01-01

    The major phenobarbital-inducible rat hepatic cytochromes P-450, CYP2B1 and CYP2B2, are the paradigmatic members of a cytochrome P-450 gene subfamily that contains at least seven additional members. Specific oligonucleotide probes for these genomic members of the CYP2B subfamily were used to assess their tissue-specific expression. In Northern-blot analysis a probe specific to gene 4 (which is designated now as CYP2B12) hybridized to a single mRNA present in the preputial gland, an organ which is used as a model for sebaceous glands, but did not hybridize to mRNA isolated from the liver or from five other tissues of untreated or Aroclor 1254-treated rats. The cDNA sequence for the CYP2B12 RNA was determined from overlapping cDNA clones and contained a long open reading frame of 1476 bp. The nucleotide sequence of the CYP2B12 cDNA was 85% similar to the sequence of the CYP2B1 cDNA in its coding region and was different from any CYP2B cDNA characterized until now. The cDNA-derived primary structure of the CYP2B12 protein contains a signal sequence for its insertion into the endoplasmic reticulum and the putative haem-binding site characteristic of cytochromes P-450. A part of the potential haem pocket of CYP2B12 was identical with a similar structure in a bacterial protocatechuate dioxygenase. In immunoblot analysis of preputial-gland microsomes, antibodies against CYP2B1 recognized a single abundant protein with a lower apparent molecular mass than that of CYP2B1. Our results demonstrate that the CYP2B12 protein has the potential to be enzymically active and are the first demonstration that a member of the CYP2B subfamily is expressed exclusively and at high levels in an extrahepatic organ. PMID:1445240

  4. Variation in the nucleotide sequence of a prolamin gene family in wild rice.

    PubMed

    Barbier, P; Ishihama, A

    1990-07-01

    Variation in the DNA sequence of the 10 kDa prolamin gene family within the wild rice species Oryza rufipogon was probed using the direct sequencing of PCR-amplified genes. A comparison of the nucleotide and deduced amino-acid sequences of eight Asian strains of O. rufipogon and one strain of the related African species O. longistaminata is presented.

  5. Synthetic oligonucleotides with particular base sequences from the cDNA encoding proteins of Mycobacterium bovis BCG induce interferons and activate natural killer cells.

    PubMed

    Tokunaga, T; Yano, O; Kuramoto, E; Kimura, Y; Yamamoto, T; Kataoka, T; Yamamoto, S

    1992-01-01

    Thirteen 45-mer single-stranded oligonucleotides, with sequences randomly selected from the known cDNA encoding BCG proteins, were tested for their ability to augment natural killer (NK) cell activity of mouse spleen cells in vitro. Six of the 13 oligonucleotides were active, while the others were not. To identify the minimal essential sequence(s) responsible for the biological activity, two 30-mer and five 15-mer oligonucleotide fragments of an active 45-mer were tested. One of the 30-mer oligonucleotides, designated BCG-A4a, was active, but the other 30-mer was inactive. All of the 15-mer fragments were inactive. BCG-A4a also stimulated the spleen cells to produce interferon (IFN)-alpha and -gamma. An experiment using anti-IFN antisera showed that the NK cell activation by the oligonucleotide was attributable to the IFN-alpha produced. All of the biologically active oligonucleotides possessed one or more palindromic sequence(s), and the inactive ones did not, with the exception of one inactive 45-mer containing overlapping palindromic sequences (GGGCCCGGG). These findings strongly suggest that certain palindromic sequences, such as GACGTC, GGCGCC and TGCGCA, are essential for 30-mer oligonucleotides, such as BCG-A4a, to induce IFNs.
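    In nucleic acid work, a palindrome usually means a sequence that equals its own reverse complement. The sketch below, assuming that definition applies here, checks the three hexamers named in the abstract and scans GGGCCCGGG for overlapping 6-base palindromes. The function names and the default window length of 6 are illustrative choices, not part of the original study.

```python
# Complement table for unmodified DNA bases only (an assumption of this sketch).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindrome(seq: str) -> bool:
    """True if the sequence reads the same as its reverse complement."""
    return seq.upper() == reverse_complement(seq)


def palindromes(seq: str, length: int = 6) -> list[tuple[int, str]]:
    """List (0-based start, window) for every palindromic window of the given length."""
    s = seq.upper()
    return [
        (i, s[i:i + length])
        for i in range(len(s) - length + 1)
        if is_palindrome(s[i:i + length])
    ]


for motif in ("GACGTC", "GGCGCC", "TGCGCA"):
    print(motif, is_palindrome(motif))

# The 9-mer is not itself a palindrome but contains two overlapping hexamers that are.
print(palindromes("GGGCCCGGG"))
```

    A fixed-length window scan is used because the abstract names hexamers; scanning a range of even lengths would be a straightforward extension.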

  6. Complete nucleotide sequence of the 23S rRNA gene of the Cyanobacterium, Anacystis nidulans.

    PubMed Central

    Douglas, S E; Doolittle, W F

    1984-01-01

    The nucleotide sequence of the Anacystis nidulans 23S rRNA gene, including the 5'- and 3'-flanking regions has been determined. The gene is 2876 nucleotides long and shows higher primary sequence homology to the 23S rRNAs of plastids (84.5%) than to that of E. coli (79%). The predicted rRNA transcript also shares many secondary structural features with those of plastids, reinforcing the endosymbiont hypothesis for the origin of these organelles. PMID:6326060

  7. Identification of a cDNA clone that contains the complete coding sequence for a 140-kD rat NCAM polypeptide

    PubMed Central

    1987-01-01

    Neural cell adhesion molecules (NCAMs) are cell surface glycoproteins that appear to mediate cell-cell adhesion. In vertebrates NCAMs exist in at least three different polypeptide forms of apparent molecular masses 180, 140, and 120 kD. The 180- and 140-kD forms span the plasma membrane whereas the 120-kD form lacks a transmembrane region. In this study, we report the isolation of NCAM clones from an adult rat brain cDNA library. Sequence analysis indicated that the longest isolate, pR18, contains a 2,574 nucleotide open reading frame flanked by 208 bases of 5' and 409 bases of 3' untranslated sequence. The predicted polypeptide encoded by clone pR18 contains a single membrane-spanning region and a small cytoplasmic domain (120 amino acids), suggesting that it codes for a full-length 140-kD NCAM form. In Northern analysis, probes derived from 5' sequences of pR18, which presumably code for extracellular portions of the molecule, hybridized to five discrete mRNA size classes (7.4, 6.7, 5.2, 4.3, and 2.9 kb) in adult rat brain but not to liver or muscle RNA. However, the 5.2- and 2.9-kb mRNA size classes did not hybridize to either a large restriction fragment or three oligonucleotides derived from the putative transmembrane coding region and regions that lie 3' to it. The 3' probes did hybridize to the 7.4-, 6.7-, and 4.3-kb message size classes. These combined results indicate that clone pR18 is derived from either the 7.4-, 6.7-, or 4.3-kb adult rat brain RNA size class. Comparison with chicken and mouse NCAM cDNA sequences suggests that pR18 represents the amino acid coding region of the 6.7- or 4.3-kb mRNA. The isolation of pR18, the first cDNA that contains the complete coding sequence of an NCAM polypeptide, unambiguously demonstrates the predicted linear amino acid sequence of this probable rat 140-kD polypeptide. This cDNA also contains a 30-base pair segment not found in NCAM cDNAs isolated from other species. The significance of this segment and other

  8. Statistical analysis of nucleotide sequences of the hemagglutinin gene of human influenza A viruses.

    PubMed Central

    Ina, Y; Gojobori, T

    1994-01-01

    To examine whether positive selection operates on the hemagglutinin 1 (HA1) gene of human influenza A viruses (H1 subtype), 21 nucleotide sequences of the HA1 gene were statistically analyzed. The nucleotide sequences were divided into antigenic and nonantigenic sites. The nucleotide diversities for antigenic and nonantigenic sites of the HA1 gene were computed at synonymous and nonsynonymous sites separately. For nonantigenic sites, the nucleotide diversities were larger at synonymous sites than at nonsynonymous sites. This is consistent with the neutral theory of molecular evolution. For antigenic sites, however, the nucleotide diversities at nonsynonymous sites were larger than those at synonymous sites. These results suggest that positive selection operates on antigenic sites of the HA1 gene of human influenza A viruses (H1 subtype). PMID:8078892
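    Nucleotide diversity is commonly computed as the average proportion of differing sites over all pairs of aligned sequences. The sketch below illustrates only that basic quantity. It does not classify sites as synonymous or nonsynonymous, which the study required, and the example sequences are invented toy data rather than HA1 sequences.

```python
from itertools import combinations


def nucleotide_diversity(seqs: list[str]) -> float:
    """Average per-site proportion of differences over all pairs of aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty, and of equal length")
    pairs = list(combinations(seqs, 2))
    total = sum(
        sum(a != b for a, b in zip(x, y)) / length
        for x, y in pairs
    )
    return total / len(pairs)


# Hypothetical toy alignment: 1, 1 and 2 differences across the three pairs.
toy = ["ATGCATGC", "ATGCATGA", "ATGAATGC"]
print(nucleotide_diversity(toy))
```

    To mirror the comparison in the abstract, the same function would be applied separately to the synonymous and nonsynonymous positions, once a method for assigning sites to those classes has been chosen.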

  9. Annotated expressed sequence tags and cDNA microarrays for studies of brain and behavior in the honey bee.

    PubMed

    Whitfield, Charles W; Band, Mark R; Bonaldo, Maria F; Kumar, Charu G; Liu, Lei; Pardinas, Jose R; Robertson, Hugh M; Soares, M Bento; Robinson, Gene E

    2002-04-01

    To accelerate the molecular analysis of behavior in the honey bee (Apis mellifera), we created expressed sequence tag (EST) and cDNA microarray resources for the bee brain. Over 20,000 cDNA clones were partially sequenced from a normalized (and subsequently subtracted) library generated from adult A. mellifera brains. These sequences were processed to identify 15,311 high-quality ESTs representing 8912 putative transcripts. Putative transcripts were functionally annotated (using the Gene Ontology classification system) based on matching gene sequences in Drosophila melanogaster. The brain ESTs represent a broad range of molecular functions and biological processes, with neurobiological classifications particularly well represented. Roughly half of Drosophila genes currently implicated in synaptic transmission and/or behavior are represented in the Apis EST set. Of Apis sequences with open reading frames of at least 450 bp, 24% are highly diverged with no matches to known protein sequences. Additionally, over 100 Apis transcript sequences conserved with other organisms appear to have been lost from the Drosophila genome. DNA microarrays were fabricated with over 7000 EST cDNA clones putatively representing different transcripts. Using probe derived from single bee brain mRNA, microarrays detected gene expression for 90% of Apis cDNAs two standard deviations greater than exogenous control cDNAs. [The sequence data described in this paper have been submitted to the GenBank data library under accession nos. BI502708-BI517278. The sequences are also available at http://titan.biotec.uiuc.edu/bee/honeybee_project.htm.]

  10. RNA Secondary Structures Having a Compatible Sequence of Certain Nucleotide Ratios.

    PubMed

    Barrett, Christopher L; Li, Thomas J X; Reidys, Christian M

    2016-11-01

    Given a random RNA secondary structure, S, we study RNA sequences having fixed ratios of nucleotides that are compatible with S. We perform this analysis for RNA secondary structures subject to various base-pairing rules and minimum arc- and stack-length restrictions. Our main result reads as follows: in the simplex of nucleotide ratios, there exists a convex region, in which, in the limit of long sequences, a random structure asymptotically almost surely (a.a.s.) has compatible sequence with these ratios and outside of which a.a.s. a random structure has no such compatible sequence. We localize this region for RNA secondary structures subject to various base-pairing rules and minimum arc- and stack-length restrictions. In particular, for GC-sequences (GC denoting the nucleotides guanine and cytosine, respectively) having a ratio of G nucleotides smaller than 1/3, a random RNA secondary structure without any minimum arc- and stack-length restrictions has a.a.s. no such compatible sequence. For sequences having a ratio of G nucleotides larger than 1/3, a random RNA secondary structure has a.a.s. such compatible sequences. We discuss our results in the context of various families of RNA structures.
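    Compatibility between a sequence and a secondary structure can be made concrete with a small check. The sketch below assumes the structure is written in dot-bracket notation and that only Watson-Crick pairs are allowed; both are illustrative choices, since the abstract covers several base-pairing rules. All function names are hypothetical.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def base_pairs(structure: str) -> list[tuple[int, int]]:
    """Return (i, j) index pairs for a dot-bracket structure without pseudoknots."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def is_compatible(seq: str, structure: str, allowed=WATSON_CRICK) -> bool:
    """True if every paired position holds an allowed base pair."""
    if len(seq) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    return all((seq[i], seq[j]) in allowed for i, j in base_pairs(structure))


def g_ratio(seq: str) -> float:
    """Fraction of G nucleotides in the sequence."""
    return seq.count("G") / len(seq)


structure = "((..))"
for seq in ("GGCGCC", "GGCCGC"):
    print(seq, is_compatible(seq, structure), g_ratio(seq))
```

    In a GC-sequence every base pair uses exactly one G and one C, so the number of pairs in a structure constrains which nucleotide ratios can be realised. That is the kind of restriction the abstract's convex region describes.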

  11. FASH: A web application for nucleotides sequence search

    PubMed Central

    Veksler-Lublinksy, Isana; Barash, Danny; Avisar, Chai; Troim, Einav; Chew, Paul; Kedem, Klara

    2008-01-01

    FASH (Fourier Alignment Sequence Heuristics) is a web application, based on the Fast Fourier Transform, for finding remote homologs within a long nucleic acid sequence. Given a query sequence and a long text-sequence (e.g., the human genome), FASH detects subsequences within the text that are remotely similar to the query. FASH offers an alternative approach to Blast/Fasta for querying long RNA/DNA sequences. FASH differs from these other approaches in that it does not depend on the existence of contiguous seed-sequences in its initial detection phase. The FASH web server is user friendly and very easy to operate. FASH can be accessed at (secured website) PMID:18505581

  12. Nucleotide sequence of Neurospora crassa cytoplasmic initiator tRNA.

    PubMed Central

    Gillum, A M; Hecker, L I; Silberklang, M; Schwartzbach, S D; RajBhandary, U L; Barnett, W E

    1977-01-01

    Initiator methionine tRNA from the cytoplasm of Neurospora crassa has been purified and sequenced. The sequence is: pAGCUGCAUm1GGCGCAGCGGAAGCGCM22GCY*GGGCUCAUt6AACCCGGAGm7GU (or D) - CACUCGAUCGm1AAACGAG*UUGCAGCUACCAOH. Similar to initiator tRNAs from the cytoplasm of other eukaryotes, this tRNA also contains the sequence -AUCG- instead of the usual -TphiCG (or A)- found in loop IV of other tRNAs. The sequence of the N. crassa cytoplasmic initiator tRNA is quite different from that of the corresponding mitochondrial initiator tRNA. Comparison of the sequence of N. crassa cytoplasmic initiator tRNA to those of yeast, wheat germ and vertebrate cytoplasmic initiator tRNA indicates that the sequences of the two fungal tRNAs are no more similar to each other than they are to those of other initiator tRNAs. PMID:146192

  13. Cloning and nucleotide sequence of the aroA gene of Bordetella pertussis.

    PubMed Central

    Maskell, D J; Morrissey, P; Dougan, G

    1988-01-01

    The aroA locus of Bordetella pertussis, encoding 5-enolpyruvylshikimate 3-phosphate synthase, has been cloned into Escherichia coli by using a cosmid vector. The gene is expressed in E. coli and complemented an E. coli aroA mutant. The nucleotide sequence of the B. pertussis aroA gene was determined and contains an open reading frame encoding 442 amino acids, with a calculated molecular weight for 5-enolpyruvylshikimate 3-phosphate synthase of 46,688. The amino acid sequence derived from the nucleotide sequence shows homology with the published amino acid sequences of aroA gene products of other microorganisms. PMID:2897356

  14. Isolation and complete nucleotide sequence of the measles virus IMB-1 strain in China.

    PubMed

    Ma, Shao-hui; Wang, Li-chun; Liu, Jian-sheng; Shi, Hai-jing; Liu, Long-ding; Li, Qi-han

    2010-12-01

    The complete nucleotide sequence of the measles virus strain IMB-1, which was isolated in China, was determined. As in other measles viruses, its genome is 15,894 nucleotides in length and encodes six proteins. The full-length nucleotide sequence of the IMB-1 isolate differed from vaccine strains (including wild-type Edmonston strain) by 4%-5% at the nucleotide sequence level. This isolate has amino acid variations over the full genome, including in the hemagglutinin and fusion genes. This report is the first to describe the full-length genome of a genotype H1 strain and provide an overview of the diversity of genetic characteristics of a circulating measles virus.

  15. Cloning and sequence analysis of a cDNA clone coding for the mouse GM2 activator protein.

    PubMed Central

    Bellachioma, G; Stirling, J L; Orlacchio, A; Beccari, T

    1993-01-01

    A cDNA (1.1 kb) containing the complete coding sequence for the mouse GM2 activator protein was isolated from a mouse macrophage library using a cDNA for the human protein as a probe. There was a single ATG located 12 bp from the 5' end of the cDNA clone followed by an open reading frame of 579 bp. Northern blot analysis of mouse macrophage RNA showed that there was a single band with a mobility corresponding to a size of 2.3 kb. We deduce from this that the mouse mRNA, in common with the mRNA for the human GM2 activator protein, has a long 3' untranslated sequence of approx. 1.7 kb. Alignment of the mouse and human deduced amino acid sequences showed 68% identity overall and 75% identity for the sequence on the C-terminal side of the first 31 residues, which in the human GM2 activator protein contains the signal peptide. Hydropathicity plots showed great similarity between the mouse and human sequences even in regions of low sequence similarity. There is a single N-glycosylation site in the mouse GM2 activator protein sequence (Asn151-Phe-Thr) which differs in its location from the single site reported in the human GM2 activator protein sequence (Asn63-Val-Thr). PMID:7689829

  16. cDNA sequence and protein bioinformatics analyses of MSTN in African catfish (Clarias gariepinus).

    PubMed

    Kanjanaworakul, Poonmanee; Sawatdichaikul, Orathai; Poompuang, Supawadee

    2016-04-01

    Myostatin, also known as growth differentiation factor 8, has been identified as a potent negative regulator of skeletal muscle growth. The purpose of this study was to characterize and predict the function of the myostatin gene of the African catfish (Cg-MSTN). Expression of Cg-MSTN was determined at three growth stages to establish the relationship between the levels of MSTN transcript and skeletal muscle growth. The partial cDNA sequence of Cg-MSTN was cloned by using published information from its congener walking catfish (Cm-MSTN). The Cg-MSTN was 1194 bp in length encoding a protein of 397 amino acids. The deduced MSTN sequence exhibited key functional sites similar to those of other members of the TGF-β superfamily, especially the proteolytic processing site (RXXR motif) and nine conserved cysteines at the C-terminus. Expression of MSTN appeared to be correlated with muscle development and growth of African catfish. Protein bioinformatics revealed that the primary sequence of Cg-MSTN shared 98% sequence identity with that of walking catfish Cm-MSTN with only two different residues, [Formula: see text]. and [Formula: see text]. The proposed model of Cg-MSTN revealed the key point mutation [Formula: see text] causing a 7.35 Å shorter distance between the N- and C-lobes and an approximately 11° narrower angle than those of Cm-MSTN. The substitution of a proline residue near the proteolytic processing site which altered the structure of myostatin may play a critical role in reducing proteolytic activity of this protein in African catfish.

  17. Human liver apolipoprotein B-100 cDNA: complete nucleic acid and derived amino acid sequence.

    PubMed Central

    Law, S W; Grant, S M; Higuchi, K; Hospattankar, A; Lackner, K; Lee, N; Brewer, H B

    1986-01-01

    Human apolipoprotein B-100 (apoB-100), the ligand on low density lipoproteins that interacts with the low density lipoprotein receptor and initiates receptor-mediated endocytosis and low density lipoprotein catabolism, has been cloned, and the complete nucleic acid and derived amino acid sequences have been determined. ApoB-100 cDNAs were isolated from normal human liver cDNA libraries utilizing immunoscreening as well as filter hybridization with radiolabeled apoB-100 oligodeoxynucleotides. The apoB-100 mRNA is 14.1 kilobases long encoding a mature apoB-100 protein of 4536 amino acids with a calculated amino acid molecular weight of 512,723. ApoB-100 contains 20 potential glycosylation sites, and 12 of a total of 25 cysteine residues are located in the amino-terminal region of the apolipoprotein providing a potential globular structure of the amino terminus of the protein. ApoB-100 contains relatively few regions of amphipathic helices, but compared to other human apolipoproteins it is enriched in beta-structure. The delineation of the entire human apoB-100 sequence will now permit a detailed analysis of the conformation of the protein, the low density lipoprotein receptor binding domain(s), and the structural relationship between apoB-100 and apoB-48 and will provide the basis for the study of genetic defects in apoB-100 in patients with dyslipoproteinemias. PMID:3464946

  18. Apis mellifera ultraspiracle: cDNA sequence and rapid up-regulation by juvenile hormone.

    PubMed

    Barchuk, A R; Maleszka, R; Simões, Z L P

    2004-10-01

    Two hormones, 20-hydroxyecdysone (20E) and juvenile hormone (JH) are key regulators of insect development including the differentiation of the alternative caste phenotypes of social insects. In addition, JH plays a different role in adult honey bees, acting as a 'behavioural pacemaker'. The functional receptor for 20E is a heterodimer consisting of the ecdysone receptor and ultraspiracle (USP) whereas the identity of the JH receptor remains unknown. We have cloned and sequenced a cDNA encoding Apis mellifera ultraspiracle (AMUSP) and examined its responses to JH. A rapid, but transient up-regulation of the AMUSP messenger is observed in the fat bodies of both queens and workers. AMusp appears to be a single copy gene that produces two transcripts (approximately 4 and approximately 5 kb) that are differentially expressed in the animal's body. The predicted AMUSP protein shows greater sequence similarity to its orthologues from the vertebrate-crab-tick-locust group than to the dipteran-lepidopteran group. These characteristics and the rapid up-regulation by JH suggest that some of the USP functions in the honey bee may depend on ligand binding.

  19. Insertion sites and the terminal nucleotide sequences of the Tn4 transposon.

    PubMed

    Hyde, D R; Tu, C P

    1982-07-10

    The nucleotide sequences at the ends of the Tn4 transposon (mercury, spectinomycin, and sulfonamide resistance) have been determined. They are inverted repeated sequences of 38 nucleotides with three mismatched base pairs. These sequences are strongly homologous with the terminal sequences of Tn501 (mercury resistance) but less so with those of Tn3 (ampicillin resistance). The Tn4 transposon generates pentanucleotide members (Tn3, Tn1000, Tn501, Tn551, IS2) with the exception of Tn1721 and bacteriophage Mu. Among the three Tn4 insertion sites examined here, two of them occurred near a nonanucleotide sequence in perfect homology with part of the terminal inverted-repeat sequence of Tn4 and the third insertion occurred near a sequence of partial homology to one end of Tn4. All three insertions were in the same orientation such that IRb is proximal to its homologous sequence on the recipient DNA.

  20. Development of polymorphic genic-SSR markers by cDNA library sequencing in boxwood, Buxus spp. (Buxaceae)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genic microsatellites or simple sequence repeat (genic-SSR) markers were developed in boxwood (Buxus taxa) for genetic diversity analysis, identification of taxa, and to facilitate breeding. cDNA libraries were developed from mRNA extracted from leaves of Buxus sempervirens ‘Vardar Valley’ and seque...

  1. Complete nucleotide sequences of a distinct bipartite begomovirus, bitter gourd yellow vein virus, infecting Momordica charantia.

    PubMed

    Tahir, Muhammad; Haider, Muhammad Saleem; Briddon, Rob W

    2010-11-01

    Momordica charantia (Cucurbitaceae), a vegetable crop commonly cultivated throughout Pakistan, and begomoviruses, a serious threat to crop plants, are natives of tropical and subtropical regions of the world. Leaf samples of M. charantia with yellow vein symptoms typical of begomovirus infections and samples from apparently healthy plants were collected from areas around Lahore in 2004. Full-length clones of a bipartite begomovirus were isolated from symptomatic samples. The complete nucleotide sequences of the components of one isolate were determined, and these showed the arrangement of genes typical of Old World begomoviruses. The complete nucleotide sequence of DNA A showed the highest nucleotide sequence identity (86.9%) to an isolate of Tomato leaf curl New Delhi virus (ToLCNDV), confirming it to belong to a distinct species of begomovirus, for which the name Bitter gourd yellow vein virus (BGYVV) is proposed. Sequence comparisons showed that BGYVV likely emerged as a result of inter-specific recombination between ToLCNDV and tomato leaf curl Bangladesh virus (ToLCBDV). The complete nucleotide sequence of DNA B showed 97.2% nucleotide sequence identity to that of an Indian strain of Squash leaf curl China virus.

  2. cDNA and derived amino acid sequence of ethanol-inducible rabbit liver cytochrome P-450 isozyme 3a (P-450ALC).

    PubMed Central

    Khani, S C; Zaphiropoulos, P G; Fujita, V S; Porter, T D; Koop, D R; Coon, M J

    1987-01-01

    Administration of ethanol to rabbits is known to induce a unique liver microsomal cytochrome P-450, termed isozyme 3a or P-450ALC, which is responsible for the increased oxidation of ethanol and other alcohols and the activation of toxic or carcinogenic compounds such as acetaminophen and N-nitrosodimethylamine. To further characterize this cytochrome P-450 we have identified cDNA clones to isozyme 3a by immunoscreening, DNA hybridization, and hybridization-selection. The cDNA sequence determined from two overlapping clones contains an open reading frame of 1416 nucleotides, and the first 25 amino acids of this reading frame correspond to residues 21-45 of cytochrome P-450 3a. The complete polypeptide, including residues 1 to 20, contains 492 amino acids and has a molecular weight of 56,820. Cytochrome P-450 3a is approximately 55% identical in sequence to P-450 isozymes 1 and 3b and 48% identical to isozyme 2. Hybridization of clone p3a-2 to electrophoretically fractionated rabbit liver poly(A)+ RNA revealed multiple bands, but, with a probe derived from the 3' nontranslated portion of this cDNA, only a 1.9-kilobase band was observed. Treatment of rabbits with imidazole, which increases the content of isozyme 3a, resulted in a transient increase in form 3a mRNA, but this was judged to be insufficient to account for the known 4.5-fold increase in form 3a protein. Genomic DNA analysis indicated that the cytochrome P-450 3a gene does not belong to a large subfamily. PMID:3027695

  3. Diverse nucleotide compositions and sequence fluctuation in Rubisco protein genes

    NASA Astrophysics Data System (ADS)

    Holden, Todd; Dehipawala, S.; Cheung, E.; Bienaime, R.; Ye, J.; Tremberger, G., Jr.; Schneider, P.; Lieberman, D.; Cheung, T.

    2011-10-01

    The Rubisco protein-enzyme is arguably the most abundant protein on Earth. The biology dogma of transcription and translation necessitates the study of the Rubisco genes and Rubisco-like genes in various species. Stronger correlation of fractal dimension of the atomic number fluctuation along a DNA sequence with Shannon entropy has been observed in the studied Rubisco-like gene sequences, suggesting a more diverse evolutionary pressure and constraints in the Rubisco sequences. The strategy of using metal for structural stabilization appears to be an ancient mechanism, with data from the porphobilinogen deaminase gene in Capsaspora owczarzaki and Monosiga brevicollis. Using the chi-square distance probability, our analysis supports the conjecture that the more ancient Rubisco-like sequence in Microcystis aeruginosa would have experienced very different evolutionary pressure and bio-chemical constraint as compared to Bordetella bronchiseptica, the two microbes occupying either end of the correlation graph. Our exploratory study would indicate that high fractal dimension Rubisco sequence would support high carbon dioxide rate via the Michaelis-Menten coefficient; with implication for the control of the whooping cough pathogen Bordetella bronchiseptica, a microbe containing a high fractal dimension Rubisco-like sequence (2.07). Using the internal comparison of chi-square distance probability for 16S rRNA (~ E-22) versus radiation repair Rec-A gene (~ E-05) in high GC content Deinococcus radiodurans, our analysis supports the conjecture that high GC content microbes containing Rubisco-like sequence are likely to include an extra-terrestrial origin, relative to Deinococcus radiodurans. Similar photosynthesis process that could utilize host star radiation would not compete with radiation resistant process from the biology dogma perspective in environments such as Mars and exoplanets.

  4. Nucleotide sequence of the Agrobacterium tumefaciens octopine Ti plasmid-encoded tmr gene.

    PubMed Central

    Heidekamp, F; Dirkse, W G; Hille, J; van Ormondt, H

    1983-01-01

    The nucleotide sequence of the tmr gene, encoded by the octopine Ti plasmid from Agrobacterium tumefaciens (pTiAch5), was determined. The T-DNA, which encompasses this gene, is involved in tumor formation and maintenance, and probably mediates the cytokinin-independent growth of transformed plant cells. The nucleotide sequence of the tmr gene displays a continuous open reading frame specifying a polypeptide chain of 240 amino acids. The 5'-terminus of the polyadenylated tmr mRNA isolated from octopine tobacco tumor cell lines was determined by nuclease S1 mapping. The nucleotide sequence 5'-TATAAAA-3', which is identical to the canonical "TATA" box, was found 29 nucleotides upstream from the major initiation site for RNA synthesis. Two potential polyadenylation signals 5'-AATAAA-3' were found at 207 and 275 nucleotides downstream from the TAG stop codon of the tmr gene. A comparison was made of nucleotide stretches involved in transcription control of T-DNA genes. PMID:6312414

  5. The nucleotide sequence of tomato mottle virus, a new geminivirus isolated from tomatoes in Florida.

    PubMed

    Abouzid, A M; Polston, J E; Hiebert, E

    1992-12-01

    A new geminivirus, tomato mottle virus (TMoV), affecting tomato production in Florida has been cloned and sequenced. Sequence analysis of the cloned replicative forms of TMoV revealed four potential coding regions for the A component [2601 nucleotides (nt)] and two for the B component (2541 nt). Comparisons of the nucleotide sequence of the TMoV genome with those of other whitefly-transmitted geminiviruses indicate that TMoV is a typical bipartite geminivirus of the New World and is closely related to but distinct from abutilon mosaic virus.

  6. Nucleotide sequences of 5S rRNAs from four jellyfishes.

    PubMed

    Hori, H; Ohama, T; Kumazaki, T; Osawa, S

    1982-11-25

    The nucleotide sequences of 5S rRNAs from four jellyfishes, Spirocodon saltatrix, Nemopsis dofleini, Aurelia aurita and Chrysaora quinquecirrha have been determined. The sequences are highly similar to each other. A fairly high similarity was also found between these jellyfishes and a sea anemone, Anthopleura japonica.

  7. Should nucleotide sequence analyzing computer algorithms always extend homologies by extending homologies?

    PubMed

    Burnett, L; Basten, A; Hensley, W J

    1986-01-10

    Most computer algorithms used for comparing or aligning nucleotide sequences rely on the premise that the best way to extend a homology between the two sequences is to select a match rather than a mismatch. We have tested this assumption and found that it is not always valid.

  8. Identification and isolation of full-length cDNA sequences by sequencing and analysis of expressed sequence tags from guarana (Paullinia cupana).

    PubMed

    Figueirêdo, L C; Faria-Campos, A C; Astolfi-Filho, S; Azevedo, J L

    2011-06-21

    The current intense production of biological data, generated by sequencing techniques, has created an ever-growing volume of unanalyzed data. We reevaluated data produced by the guarana (Paullinia cupana) transcriptome sequencing project to identify cDNA clones with complete coding sequences (full-length clones) and complete sequences of genes of biotechnological interest, contributing to the knowledge of biological characteristics of this organism. We analyzed 15,490 ESTs of guarana in search of clones with complete coding regions. A total of 12,402 sequences were analyzed using BLAST, and 4697 full-length clones were identified, responsible for the production of 2297 different proteins. Eighty-four clones were identified as full-length for N-methyltransferase and 18 were sequenced in both directions to obtain the complete genome sequence, and confirm the search made in silico for full-length clones. Phylogenetic analyses were made with the complete genome sequences of three clones, which showed only 0.017% dissimilarity; these are phylogenetically close to the caffeine synthase of Theobroma cacao. The search for full-length clones allowed the identification of numerous clones that had the complete coding region, demonstrating this to be an efficient and useful tool in the process of biological data mining. The sequencing of the complete coding region of identified full-length clones corroborated the data from the in silico search, strengthening its efficiency and utility.

  9. Mayaro virus: complete nucleotide sequence and phylogenetic relationships with other alphaviruses.

    PubMed

    Lavergne, Anne; de Thoisy, Benoît; Lacoste, Vincent; Pascalis, Hervé; Pouliquen, Jean-François; Mercier, Véronique; Tolou, Hugues; Dussart, Philippe; Morvan, Jacques; Talarmin, Antoine; Kazanji, Mirdad

    2006-05-01

    Mayaro (MAY) virus is a member of the genus Alphavirus in the family Togaviridae. Alphaviruses are distributed throughout the world and cause a wide range of diseases in humans and animals. Here, we determined the complete nucleotide sequence of MAY from a viral strain isolated from a French Guianese patient. The deduced MAY genome was 11,429 nucleotides in length, excluding the 5' cap nucleotide and 3' poly(A) tail. Nucleotide and amino acid homologies, as well as phylogenetic analyses of the obtained sequence confirmed that MAY is not a recombinant virus and belongs to the Semliki Forest complex according to the antigenic complex classification. Furthermore, analyses based on the E1 region revealed that MAY is closely related to Una virus, the only other South American virus clustering with the Old World viruses. On the basis of our results and of the alphaviruses diversity and pathogenicity, we suggest that alphaviruses may have an Old World origin.

  10. Nucleotide sequence conservation in paramyxoviruses; the concept of codon constellation.

    PubMed

    Rima, Bert K

    2015-05-01

    The stability and conservation of the sequences of RNA viruses in the field and the high error rates measured in vitro are paradoxical. The field stability indicates that there are very strong selective constraints on sequence diversity. The nature of these constraints is discussed. Apart from constraints on variation in cis-acting RNA and the amino acid sequences of viral proteins, there are others relating to the presence of specific dinucleotides such as CpG and UpA, as well as the importance of RNA secondary structures and RNA degradation rates. Other constraints recently identified in other RNA viruses, such as effects of secondary RNA structure on protein folding or modification of cellular tRNA complements, are also discussed. Using the family Paramyxoviridae, I show that the codon usage pattern (CUP) is (i) specific for each virus species and (ii) markedly different from that of the host; it does not vary even in vaccine viruses that have been derived by passage in a number of inappropriate host cells. The CUP might thus be an additional constraint on variation, and I propose the concept of codon constellation to indicate the informational content of the sequences of RNA molecules relating not only to stability and structure but also to the efficiency of translation of a viral mRNA resulting from the CUP and the numbers and position of rare codons.
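
    The codon usage pattern and the dinucleotide constraints mentioned in this abstract can be illustrated with a short calculation. The sketch below is not the author's method; the function names and the toy sequences are hypothetical, and it only counts in-frame codons and overlapping dinucleotides in a sequence supplied by the caller.

```python
from collections import Counter


def codon_usage(cds: str) -> dict[str, float]:
    """Relative frequency of each codon in an in-frame coding sequence."""
    seq = cds.upper().replace("T", "U")  # RNA alphabet, as for a viral mRNA
    usable = len(seq) - len(seq) % 3     # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence is shorter than one codon")
    counts = Counter(codons)
    return {codon: n / len(codons) for codon, n in counts.items()}


def dinucleotide_frequency(seq: str, dinucleotide: str) -> float:
    """Fraction of overlapping dinucleotide positions matching `dinucleotide`."""
    rna = seq.upper().replace("T", "U")
    target = dinucleotide.upper().replace("T", "U")
    positions = len(rna) - 1
    if positions < 1:
        raise ValueError("sequence is shorter than one dinucleotide")
    hits = sum(rna[i:i + 2] == target for i in range(positions))
    return hits / positions


# Toy inputs, invented for illustration only.
usage = codon_usage("ATGGCTGCTTAA")
cpg = dinucleotide_frequency("ACGUACG", "CG")
upa = dinucleotide_frequency("ACGUACG", "UA")
```

    Comparing such frequency tables between a virus and its host would be one simple way to look for the species-specific pattern the abstract describes; the comparison metric itself is left open here.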

  11. Nucleotide sequence of a human tRNA gene heterocluster

    SciTech Connect

    Chang, Y.N.; Pirtle, I.L.; Pirtle, R.M.

    1986-05-01

    Leucine tRNA from bovine liver was used as a hybridization probe to screen a human gene library harbored in Charon-4A of bacteriophage lambda. The human DNA inserts from plaque-pure clones were characterized by restriction endonuclease mapping and Southern hybridization techniques, using both (3'-³²P)-labeled bovine liver leucine tRNA and total tRNA as hybridization probes. An 8-kb Hind III fragment of one of these γ-clones was subcloned into the Hind III site of pBR322. Subsequent fine restriction mapping and DNA sequence analysis of this plasmid DNA indicated the presence of four tRNA genes within the 8-kb DNA fragment. A leucine tRNA gene with an anticodon of AAG and a proline tRNA gene with an anticodon of AGG are in a 1.6-kb subfragment. A threonine tRNA gene with an anticodon of UGU and an as yet unidentified tRNA gene are located in a 1.1-kb subfragment. These two different subfragments are separated by 2.8 kb. The coding regions of the three sequenced genes contain characteristic internal split promoter sequences and do not have intervening sequences. The 3'-flanking regions of these three genes have typical RNA polymerase III termination sites of at least four consecutive T residues.

  12. Methods for making nucleotide probes for sequencing and synthesis

    DOEpatents

    Church, George M; Zhang, Kun; Chou, Joseph

    2014-07-08

    Compositions and methods for making a plurality of probes for analyzing a plurality of nucleic acid samples are provided. Compositions and methods for analyzing a plurality of nucleic acid samples to obtain sequence information in each nucleic acid sample are also provided.

  13. Cloning and sequence analysis of an Ophiophagus hannah cDNA encoding a precursor of two natriuretic peptide domains.

    PubMed

    Lei, Weiwei; Zhang, Yong; Yu, Guoyu; Jiang, Ping; He, Yingying; Lee, Wenhui; Zhang, Yun

    2011-04-01

    The king cobra (Ophiophagus hannah) is the largest venomous snake. Although its components are mainly neurotoxins, the venom contains several proteins affecting the blood system. Natriuretic peptide (NP), one of the important components of snake venoms, can cause local vasodilatation and increased capillary permeability, facilitating rapid diffusion of other toxins into the prey tissues. Because of their low abundance, snake venom NPs are hard to purify, so cDNA cloning of the NPs has become a useful approach. In this study, a 957 bp natriuretic peptide-encoding cDNA clone was isolated from an O. hannah venom gland cDNA library. The open reading frame of the cDNA encodes a 210-amino-acid precursor protein named Oh-NP. Oh-NP has a typical signal peptide sequence of 26 amino acid residues. Surprisingly, Oh-NP has two typical NP domains, which consist of the typical 17-residue loop sequence of CFGXXDRIGC, so it is an unusual NP precursor. These two NP domains share high amino acid sequence identity. In addition, there are two homologous peptides of unknown function within the Oh-NP precursor. To our knowledge, Oh-NP is the first protein precursor containing two NP domains. It might belong to another subclass of snake venom NPs.

  14. TranslatorX: multiple alignment of nucleotide sequences guided by amino acid translations.

    PubMed

    Abascal, Federico; Zardoya, Rafael; Telford, Maximilian J

    2010-07-01

    We present TranslatorX, a web server designed to align protein-coding nucleotide sequences based on their corresponding amino acid translations. Many comparisons between biological sequences (nucleic acids and proteins) involve the construction of multiple alignments. Alignments represent a statement regarding the homology between individual nucleotides or amino acids within homologous genes. As protein-coding DNA sequences evolve as triplets of nucleotides (codons) and it is known that sequence similarity degrades more rapidly at the DNA than at the amino acid level, alignments are generally more accurate when based on amino acids than on their corresponding nucleotides. TranslatorX novelties include: (i) use of all documented genetic codes and the possibility of assigning different genetic codes for each sequence; (ii) a battery of different multiple alignment programs; (iii) translation of ambiguous codons when possible; (iv) an innovative criterion to clean nucleotide alignments with GBlocks based on protein information; and (v) a rich output, including Jalview-powered graphical visualization of the alignments, codon-based alignments coloured according to the corresponding amino acids, measures of compositional bias and first, second and third codon position specific alignments. The TranslatorX server is freely available at http://translatorx.co.uk.
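
    The core idea in this abstract, aligning at the amino acid level and then carrying the gaps back to the nucleotides, can be sketched in a few lines. This is not TranslatorX code: the function name, the toy sequences and the pre-computed protein alignment are all hypothetical, and the protein alignment is assumed to come from any external multiple alignment program.

```python
def thread_codons(aligned_protein: str, cds: str, gap: str = "-") -> str:
    """Map an aligned protein back onto its coding sequence, codon by codon.

    Each residue consumes the next three nucleotides of `cds`; each gap
    column in the protein alignment becomes a three-character gap.
    """
    residues = sum(ch != gap for ch in aligned_protein)
    if len(cds) != 3 * residues:
        raise ValueError("coding sequence length must be 3 x number of residues")
    out = []
    pos = 0
    for ch in aligned_protein:
        if ch == gap:
            out.append(gap * 3)
        else:
            out.append(cds[pos:pos + 3])
            pos += 3
    return "".join(out)


# Toy data, invented for illustration (stop codons already removed).
protein_alignment = {"seq1": "MK-L", "seq2": "M-RL"}
coding_sequences = {"seq1": "ATGAAACTG", "seq2": "ATGCGTTTA"}

codon_alignment = {
    name: thread_codons(protein_alignment[name], coding_sequences[name])
    for name in protein_alignment
}
```

    Because gaps are only ever inserted in multiples of three, the reading frame of every sequence is preserved in the resulting nucleotide alignment.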

  15. An analysis of expressed sequence tags of developing castor endosperm using a full-length cDNA library

    PubMed Central

    Lu, Chaofu; Wallis, James G; Browse, John

    2007-01-01

    Background: Castor seeds are a major source for ricinoleate, an important industrial raw material. Genomics studies of castor plant will provide critical information for understanding seed metabolism, for effectively engineering ricinoleate production in transgenic oilseeds, or for genetically improving castor plants by eliminating toxic and allergic proteins in seeds. Results: Full-length cDNAs are useful resources in annotating genes and in providing functional analysis of genes and their products. We constructed a full-length cDNA library from developing castor endosperm, and obtained 4,720 ESTs from 5'-ends of the cDNA clones representing 1,908 unique sequences. The most abundant transcripts are genes encoding storage proteins, ricin, agglutinin and oleosins. Several other sequences are also very numerous, including two acidic triacylglycerol lipases, and the oleate hydroxylase (FAH12) gene that is responsible for ricinoleate biosynthesis. The role(s) of the lipases in developing castor seeds are not clear, and co-expression of a lipase and FAH12 did not result in significant changes in hydroxy fatty acid accumulation in transgenic Arabidopsis seeds. Only one oleate desaturase (FAD2) gene was identified in our cDNA sequences. Sequence and functional analyses of the castor FAD2 were carried out since it had not been characterized previously. Overexpression of castor FAD2 in a FAH12-expressing Arabidopsis line resulted in decreased accumulation of hydroxy fatty acids in transgenic seeds. Conclusion: Our results suggest that transcriptional regulation of FAD2 and FAH12 genes may be one of the mechanisms that contribute to a high level of ricinoleate accumulation in castor endosperm. The full-length cDNA library will be used to search for additional genes that affect ricinoleate accumulation in seed oils. Our EST sequences will also be useful to annotate the castor genome, whose whole sequence is being generated by shotgun sequencing at the Institute for Genome

  16. Nucleotide sequence and taxonomical distribution of the bacteriocin gene lin cloned from Brevibacterium linens M18.

    PubMed

    Valdes-Stauber, N; Scherer, S

    1996-04-01

    Linocin M18 is an antilisterial bacteriocin produced by the red smear cheese bacterium Brevibacterium linens M18. Oligonucleotide probes based on the N-terminal amino acid sequence were used to locate its single copy gene, lin, on the chromosomal DNA. The amino acid composition, N-terminal sequence, and molecular mass derived from the nucleotide sequence of an open reading frame of 798 nucleotides coding for 266 amino acids found on a 3-kb BamHI restriction fragment correspond closely to those obtained from the purified protein (N. Valdés-Stauber and S. Scherer, Appl. Environ. Microbiol. 60:3809-3814, 1994). No sequence homology to any protein or nucleotide sequences deposited in databases was found. Comparison of the nucleotide sequence and the N-terminal amino acid sequence derived from the protein suggests that B. linens M18 produces an N-formyl-methionyl-CAC tRNA. A wide taxonomical distribution of the gene within coryneform bacteria has been demonstrated by PCR amplification. The structural gene from linocin M18 is present at least in three Brevibacterium species, five Arthrobacter species, and five Corynebacterium species.

  17. Clusters of nucleotide substitutions and insertion/deletion mutations are associated with repeat sequences.

    PubMed

    McDonald, Michael J; Wang, Wei-Chi; Huang, Hsien-Da; Leu, Jun-Yi

    2011-06-01

    The genome-sequencing gold rush has facilitated the use of comparative genomics to uncover patterns of genome evolution, although their causal mechanisms remain elusive. One such trend, ubiquitous to prokarya and eukarya, is the association of insertion/deletion mutations (indels) with increases in the nucleotide substitution rate extending over hundreds of base pairs. The prevailing hypothesis is that indels are themselves mutagenic agents. Here, we employ population genomics data from Escherichia coli, Saccharomyces paradoxus, and Drosophila to provide evidence suggesting that it is not the indels per se but the sequence in which indels occur that causes the accumulation of nucleotide substitutions. We found that about two-thirds of indels are closely associated with repeat sequences and that repeat sequence abundance could be used to identify regions of elevated sequence diversity, independently of indels. Moreover, the mutational signature of indel-proximal nucleotide substitutions matches that of error-prone DNA polymerases. We propose that repeat sequences promote an increased probability of replication fork arrest, causing the persistent recruitment of error-prone DNA polymerases to specific sequence regions over evolutionary time scales. Experimental measures of the mutation rates of engineered DNA sequences and analyses of experimentally obtained collections of spontaneous mutations provide molecular evidence supporting our hypothesis. This study uncovers a new role for repeat sequences in genome evolution and provides an explanation of how fine-scale sequence contextual effects influence mutation rates and thereby evolution.

  18. Complete nucleotide sequence of Alfalfa mosaic virus isolated from alfalfa (Medicago sativa L.) in Argentina.

    PubMed

    Trucco, Verónica; de Breuil, Soledad; Bejerman, Nicolás; Lenardon, Sergio; Giolitti, Fabián

    2014-06-01

    The complete nucleotide sequence of an Alfalfa mosaic virus (AMV) isolate infecting alfalfa (Medicago sativa L.) in Argentina, AMV-Arg, was determined. The virus genome has the typical organization described for AMV, and comprises 3,643, 2,593, and 2,038 nucleotides for RNA1, 2 and 3, respectively. The whole genome sequence and each encoding region were compared with those of four other isolates that have been completely sequenced, from China, Italy, Spain and USA. The nucleotide identity percentages ranged from 95.9 to 99.1 % for the three RNAs and from 93.7 to 99 % for the protein 1 (P1), protein 2 (P2), movement protein and coat protein (CP) encoding regions, whereas the amino acid identity percentages of these proteins ranged from 93.4 to 99.5 %, the lowest value corresponding to P2. CP sequences of AMV-Arg were compared with those of 25 other available isolates, and a phylogenetic analysis based on the CP gene was carried out. The highest percentage of nucleotide sequence identity of the CP gene was 98.3 % with a Chinese isolate and 98.6 % at the amino acid level with four isolates, two from Italy, one from Brazil and the remaining one from China. The phylogenetic analysis showed that AMV-Arg is closely related to subgroup I of AMV isolates. To our knowledge, this is the first report of a complete nucleotide sequence of AMV from South America and the first worldwide report of a complete nucleotide sequence of AMV isolated from alfalfa as a natural host.

  19. Generation of expressed sequence tags of random root cDNA clones of Brassica napus by single-run partial sequencing.

    PubMed Central

    Park, Y S; Kwak, J M; Kwon, O Y; Kim, Y S; Lee, D S; Cho, M J; Lee, H H; Nam, H G

    1993-01-01

    Two hundred thirty-seven expressed sequence tags (ESTs) of Brassica napus were generated by single-run partial sequencing of 197 random root cDNA clones. A computer search of these root ESTs revealed that 21 ESTs show significant similarity to the protein-coding sequences in the existing data bases, including five stress- or defense-related genes and four clones related to the genes from other kingdoms. Northern blot analysis of the 10 data base-matched cDNA clones revealed that many of the clones are expressed most abundantly in root but less abundantly in other organs. However, two clones were highly root specific. The results show that generation of the root ESTs by partial sequencing of random cDNA clones along with the expression analysis is an efficient approach to isolate, on a large scale, genes that are functional in plant root. We also discuss the results of the examination of cDNA libraries and sequencing methods suitable for this approach. PMID:8029332

  20. Next-generation sequencing-based 5' rapid amplification of cDNA ends for alternative promoters.

    PubMed

    Perera, Bambarendage P U; Kim, Joomyeong

    2016-02-01

    Mammalian genomes contain many unknown alternative first exons and promoters. Thus, we have modified the existing 5'RACE (5' rapid amplification of cDNA ends) approach into a next-generation sequencing (NGS)-based new protocol that can identify these alternative promoters. This protocol has incorporated two main ideas: (i) 5'RACE starting from the known second exons of genes and (ii) NGS-based sequencing of the subsequent cDNA products. This protocol also provides a bioinformatics strategy that processes the sequence reads from NGS runs. This protocol has successfully identified several alternative promoters for an imprinted gene, PEG3. Overall, this NGS-based 5'RACE protocol is a sensitive and reliable method for detecting low-abundant transcripts and promoters.

  1. In-depth cDNA library sequencing provides quantitative gene expression profiling in cancer biomarker discovery.

    PubMed

    Yang, Wanling; Ying, Dingge; Lau, Yu-Lung

    2009-06-01

    Quantitative gene expression analysis plays an important role in identifying differentially expressed genes in various pathological states, gene expression regulation and co-regulation, shedding light on gene functions. Although microarray is widely used as a powerful tool in this regard, it is suboptimal quantitatively and unable to detect unknown gene variants. Here we demonstrated effective detection of differential expression and co-regulation of certain genes by expressed sequence tag analysis using a selected subset of cDNA libraries. We discuss the issues of sequencing depth and library preparation, and propose that increased sequencing depth and improved preparation procedures may allow detection of many expression features for less abundant gene variants. With the reduction of sequencing cost and the emergence of new-generation sequencing technology, in-depth sequencing of cDNA pools or libraries may represent a better and more powerful tool in gene expression profiling and cancer biomarker detection. We also propose using sequence-specific subtraction to remove hundreds of the most abundant housekeeping genes to increase sequencing depth without affecting the relative expression ratio of other genes, as transcripts from as few as 300 of the most abundantly expressed genes constitute about 20% of the total transcriptome. In-depth sequencing also offers the unique advantage of detecting unknown forms of transcripts, such as alternative splicing variants, fusion genes, and regulatory RNAs, as well as detecting mutations and polymorphisms that may play important roles in disease pathogenesis.
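
    The subtraction argument in this abstract is simple arithmetic and can be checked directly. The sketch below assumes an idealized subtraction that removes the targeted transcripts completely and leaves everything else untouched; the function name and the read counts are hypothetical.

```python
def depth_gain(removed_fraction: float) -> float:
    """Fold increase in reads per remaining transcript at a fixed read budget."""
    if not 0 <= removed_fraction < 1:
        raise ValueError("removed_fraction must be in [0, 1)")
    return 1.0 / (1.0 - removed_fraction)


# About 20% of the transcriptome comes from the most abundant genes (from the abstract).
gain = depth_gain(0.20)

# Invented read counts for two low-abundance genes before subtraction.
before = {"geneA": 40.0, "geneB": 10.0}
after = {gene: reads * gain for gene, reads in before.items()}

ratio_before = before["geneA"] / before["geneB"]
ratio_after = after["geneA"] / after["geneB"]
```

    Under these assumptions every remaining gene gains the same factor, which is why the relative expression ratio is unchanged while the effective depth rises.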

  2. Nucleotide sequence of an Escherichia coli chromosomal hemolysin.

    PubMed Central

    Felmlee, T; Pellett, S; Welch, R A

    1985-01-01

    We determined the DNA sequence of an 8,211-base-pair region encompassing the chromosomal hemolysin, molecularly cloned from an O4 serotype strain of Escherichia coli. All four hemolysin cistrons (transcriptional order, C, A, B, and D) were encoded on the same DNA strand, and their predicted molecular masses were, respectively, 19.7, 109.8, 79.9, and 54.6 kilodaltons. The identification of pSF4000-encoded polypeptides in E. coli minicells corroborated the assignment of the predicted polypeptides for hlyC, hlyA, and hlyD. However, based on the minicell results, two polypeptides appeared to be encoded on the hlyB region, one similar in size to the predicted molecular mass of 79.9 kilodaltons, and the other a smaller 46-kilodalton polypeptide. The four hemolysin genes displayed similar codon usage, which is atypical for E. coli. This reflects the low guanine-plus-cytosine content (40.2%) of the hemolysin DNA sequence and suggests the non-E. coli origin of the hemolysin determinant. In vitro-derived deletions of the hemolysin recombinant plasmid pSF4000 indicated that a region between 433 and 301 base pairs upstream of the putative start of hlyC is necessary for hemolysin synthesis. Based on the DNA sequence, a stem-loop transcription terminator-like structure (a 16-base-pair stem followed by seven uridylates) in the mRNA was predicted distal to the C-terminal end of hlyA. A model for the general transcriptional organization of the E. coli hemolysin determinant is presented. Images PMID:3891743
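
    The guanine-plus-cytosine content quoted in this abstract is a plain base-composition ratio. A minimal sketch of that calculation follows; the function name and the toy sequence are hypothetical, and ambiguous bases such as N are assumed to be excluded from the denominator.

```python
def gc_content(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T/U) that are G or C."""
    bases = seq.upper().replace("U", "T")
    gc = sum(bases.count(b) for b in "GC")
    at = sum(bases.count(b) for b in "AT")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / (gc + at)


# Toy sequence, invented for illustration: 4 of 10 unambiguous bases are G or C.
example = gc_content("ATGCATTAGCN")
```

    Applied to a cloned region and to the host chromosome, a clear difference between the two values is the kind of evidence the abstract uses to suggest a foreign origin.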

  3. Nucleotide sequence of the capsid protein gene of papaya leaf-distortion mosaic potyvirus.

    PubMed

    Maoka, T; Kashiwazaki, S; Tsuda, S; Usugi, T; Hibino, H

    1996-01-01

    The DNA complementary to the 3'-terminal 1 404 nucleotides [excluding the poly(A) tail] of papaya leaf-distortion mosaic potyvirus (PLDMV) RNA was cloned and sequenced. The sequence starts within a long open reading frame (ORF) of 1 195 nucleotides and is followed by a 3' non-coding region of 209 nucleotides. Capsid protein (CP) is encoded at the 3' terminus of the ORF. The CP contains 293 residues and has a Mr of 33 277. The CP of PLDMV exhibits 49 to 59% sequence similarity at the amino acid level to the CPs of papaya ringspot potyvirus (PRSV) and other potyviruses. This result is consistent with the absence of a serological relationship between PLDMV and PRSV or other potyviruses. The results support the assignment of PLDMV as a distinct member of the genus Potyvirus.

  4. Nucleotide Sequence of the Protective Antigen Gene of Bacillus Anthracis

    DTIC Science & Technology

    1988-02-02

    ...transcription and translation of the Bacillus megaterium protein C gene. J. Bacteriol. 158:809-813. 9. Friedlander, A. M. 1986. Macrophages are sensitive to... of the Protective Antigen Gene of Bacillus anthracis... S. L. Welkos, J. R. Lowe, F. Eden-McCutchan, M. Vodkin, S. M... Bacillus anthracis and the 5' and 3' flanking sequences were determined. Protective antigen is one of three proteins comprising anthrax toxin. The open

  5. Intraspecific nucleotide sequence differences in the major noncoding region of human mitochondrial DNA.

    PubMed Central

    Horai, S; Hayasaka, K

    1990-01-01

    Nucleotide sequences of the major noncoding region of human mitochondrial DNA (mtDNA) from 95 human placentas have been determined. These sequences include at least a 482-bp-long region encompassing most of the D-loop-forming region. Comparisons of these sequences with those previously determined have revealed remarkable features of nucleotide substitutions and insertion/deletion events. The nucleotide diversity among the sequences is estimated as 1.45%, which is three- to fourfold higher than the corresponding value estimated from restriction-enzyme analysis of whole mtDNA genome. A hypervariable region has also been defined. In this 14-bp region, 17 different sequences were detected. More than 97% of the base changes are transitions. A significantly nonrandom distribution of nucleotide substitutions and sequence length variations were also noted. The phylogenetic analysis indicates that diversity among the negroids is much larger than that among the caucasoids or the mongoloids. In fact, part of the negroids first diverged from other humans in the phylogenetic tree. A striking finding in the phylogenetic analysis is that the mongoloids can be separated into two distinct groups. Divergence of part of the mongoloids follows the earliest divergence of part of the negroids. The remainder of the mongoloids subsequently diverged together with the caucasoids. This observation confirmed our earlier study, which clearly demonstrated, by the restriction-enzyme analysis, existence of two distinct groups in the Japanese. Images Figure 3 PMID:2316527
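
    The nucleotide diversity figure in this abstract is, in its usual definition, the average number of differences per site over all pairs of aligned sequences. The abstract does not say which estimator the authors used, so the sketch below is only that common definition; the function name and the toy alignment are hypothetical, and gap characters are naively treated as ordinary symbols.

```python
from itertools import combinations


def nucleotide_diversity(aligned_seqs: list[str]) -> float:
    """Average pairwise differences per site across an alignment."""
    if len(aligned_seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned_seqs[0])
    if length == 0 or any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same non-zero length")
    pairs = list(combinations(aligned_seqs, 2))
    differences = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return differences / (len(pairs) * length)


# Toy alignment, invented for illustration: pairwise differences are 1, 2 and 1.
pi = nucleotide_diversity(["AAAA", "AAAT", "AATT"])
```

    With 95 sequences, as in the study, the same loop would run over 4,465 pairs.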

  6. An Integrated System for DNA Sequencing by Synthesis Using Novel Nucleotide Analogues

    PubMed Central

    Guo, Jia; Yu, Lin; Turro, Nicholas J.; Ju, Jingyue

    2010-01-01

    Conspectus The Human Genome Project has concluded, but its successful completion has increased, rather than decreased, the need for high-throughput DNA sequencing technologies. The possibility of clinically screening a full genome for an individual's mutations offers tremendous benefits, both for pursuing personalized medicine as well as uncovering the genomic contributions to diseases. The Sanger sequencing method—although enormously productive for more than 30 years—requires an electrophoretic separation step that, unfortunately, remains a key technical obstacle for achieving economically acceptable full-genome results. Alternative sequencing approaches thus focus on innovations that can reduce costs. The DNA sequencing by synthesis (SBS) approach has shown great promise as a new sequencing platform, with particular progress reported recently. The general fluorescent SBS approach involves (i) incorporation of nucleotide analogs bearing fluorescent reporters, (ii) identification of the incorporated nucleotide by its fluorescent emissions, and (iii) cleavage of the fluorophore, along with the reinitiation of the polymerase reaction for continuing sequence determination. In this Account, we review the construction of a DNA-immobilized chip and the development of novel nucleotide reporters for the SBS sequencing platform. Click chemistry, with its high selectivity and coupling efficiency, was explored for surface immobilization of DNA. The first generation (G-1) modified nucleotides for SBS feature a small chemical moiety capping the 3′-OH and a fluorophore tethered to the base through a chemically cleavable linker; the design ensures that the nucleotide reporters are good substrates for the polymerase. The 3′-capping moiety and the fluorophore on the DNA extension products, generated by the incorporation of the G-1 modified nucleotides, are cleaved simultaneously to reinitiate the polymerase reaction. The sequence of a DNA template immobilized on a surface

  7. Annotated Expressed Sequence Tags and cDNA Microarrays for Studies of Brain and Behavior in the Honey Bee

    PubMed Central

    Whitfield, Charles W.; Band, Mark R.; Bonaldo, Maria F.; Kumar, Charu G.; Liu, Lei; Pardinas, Jose R.; Robertson, Hugh M.; Soares, M. Bento; Robinson, Gene E.

    2002-01-01

    To accelerate the molecular analysis of behavior in the honey bee (Apis mellifera), we created expressed sequence tag (EST) and cDNA microarray resources for the bee brain. Over 20,000 cDNA clones were partially sequenced from a normalized (and subsequently subtracted) library generated from adult A. mellifera brains. These sequences were processed to identify 15,311 high-quality ESTs representing 8912 putative transcripts. Putative transcripts were functionally annotated (using the Gene Ontology classification system) based on matching gene sequences in Drosophila melanogaster. The brain ESTs represent a broad range of molecular functions and biological processes, with neurobiological classifications particularly well represented. Roughly half of Drosophila genes currently implicated in synaptic transmission and/or behavior are represented in the Apis EST set. Of Apis sequences with open reading frames of at least 450 bp, 24% are highly diverged with no matches to known protein sequences. Additionally, over 100 Apis transcript sequences conserved with other organisms appear to have been lost from the Drosophila genome. DNA microarrays were fabricated with over 7000 EST cDNA clones putatively representing different transcripts. Using probe derived from single bee brain mRNA, microarrays detected gene expression for 90% of Apis cDNAs two standard deviations greater than exogenous control cDNAs. [The sequence data described in this paper have been submitted to Genbank data library under accession nos. BI502708–BI517278. The sequences are also available at http://titan.biotec.uiuc.edu/bee/honeybee_project.htm.] PMID:11932240

  8. Characterization of rainbow trout gonad, brain and gill deep cDNA repertoires using a Roche 454-Titanium sequencing approach.

    PubMed

    Le Cam, Aurélie; Bobe, Julien; Bouchez, Olivier; Cabau, Cédric; Kah, Olivier; Klopp, Christophe; Lareyre, Jean-Jacques; Le Guen, Isabelle; Lluch, Jérôme; Montfort, Jérôme; Moreews, Francois; Nicol, Barbara; Prunet, Patrick; Rescan, Pierre-Yves; Servili, Arianna; Guiguen, Yann

    2012-05-25

    Rainbow trout, Oncorhynchus mykiss, is an important aquaculture species worldwide and, in addition to being of commercial interest, it is also a research model organism of considerable scientific importance. Because of the lack of a whole genome sequence in that species, transcriptomic analyses of this species have often been hindered. Using next-generation sequencing (NGS) technologies, we sought to fill these informational gaps. Here, using Roche 454-Titanium technology, we provide new tissue-specific cDNA repertoires from several rainbow trout tissues. Non-normalized cDNA libraries were constructed from testis, ovary, brain and gill rainbow trout tissue samples, and these different libraries were sequenced in 10 separate half-runs of 454-Titanium. Overall, we produced a total of 3 million quality sequences with an average size of 328 bp, representing more than 1 Gb of expressed sequence information. These sequences have been combined with all publicly available rainbow trout sequences, resulting in a total of 242,187 clusters of putative transcript groups and 22,373 singletons. To identify the predominantly expressed genes in different tissues of interest, we developed a Digital Differential Display (DDD) approach. This approach allowed us to characterize the genes that are predominantly expressed within each tissue of interest. Of these genes, some were already known to be tissue-specific, thereby validating our approach. Many others, however, were novel candidates, demonstrating the usefulness of our strategy and of such tissue-specific resources. This new sequence information, acquired using NGS 454-Titanium technology, deeply enriched our current knowledge of the expressed genes in rainbow trout through the identification of an increased number of tissue-specific sequences. This identification allowed a precise cDNA tissue repertoire to be characterized in several important rainbow trout tissues. The rainbow trout contig browser can be accessed at the following

  9. Sequence analysis of expressed sequence tags from an ABA-treated cDNA library identifies stress response genes in the moss Physcomitrella patens.

    PubMed

    Machuka, J; Bashiardes, S; Ruben, E; Spooner, K; Cuming, A; Knight, C; Cove, D

    1999-04-01

    Partial cDNA sequencing was used to obtain 169 expressed sequence tags (ESTs) in the moss, Physcomitrella patens. The source of ESTs was a random cDNA library constructed from 7-day-old protonemata following treatment with 10^-4 M abscisic acid (ABA). Analysis of the ESTs identified 69% with homology to known sequences, 61% of which had significant homology to sequences of plant origin. More importantly, at least 11 ESTs had significant similarities to genes which are implicated in plant stress-responses, including responses which may involve ABA. These included a cDNA associated with desiccation tolerance, two heat shock protein genes, one cold acclimation protein cDNA and five others that may be involved in either oxidative or chemical stress or both, i.e., Zn/Cu-superoxide dismutase, NADPH protochlorophyllide oxidoreductase (PorB), selenium binding protein, glutathione peroxidase and glutathione S transferase. Analysis of codon usage between P. patens and seed plants indicated that although mosses and higher plants are to a large extent similar, minor variations also exist that may represent the distinctiveness of each group.

  10. The complete nucleotide sequence and genomic characterization of tropical soda apple mosaic virus.

    PubMed

    Fillmer, Kornelia; Adkins, Scott; Pongam, Patchara; D'Elia, Tom

    2016-08-01

    We report the first complete genome sequence of tropical soda apple mosaic virus (TSAMV), a tobamovirus originally isolated from tropical soda apple (Solanum viarum) collected in Okeechobee, Florida. The complete genome of TSAMV is 6,350 nucleotides long and contains four open reading frames encoding the following proteins: i) 126-kDa methyltransferase/helicase (3354 nt), ii) 183-kDa polymerase (4839 nt), iii) movement protein (771 nt) and iv) coat protein (483 nt). The complete genome sequence of TSAMV shares 80.4 % nucleotide sequence identity with pepper mild mottle virus (PMMoV) and 71.2-74.2 % identity with other tobamoviruses naturally infecting members of the Solanaceae plant family. Phylogenetic analysis of the deduced amino acid sequences of the 126-kDa and 183-kDa proteins and the complete genome sequence place TSAMV in a subcluster with PMMoV within the Solanaceae-infecting subgroup of tobamoviruses.
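
    The ORF lengths quoted above can be turned into predicted protein lengths with simple codon arithmetic. The sketch below is illustrative only: it assumes each reported length includes the terminating stop codon, which the abstract does not state, and the function name orf_to_residues is hypothetical.

```python
# ORF lengths in nucleotides, as reported in the abstract above.
orf_lengths_nt = {
    "126-kDa methyltransferase/helicase": 3354,
    "183-kDa polymerase": 4839,
    "movement protein": 771,
    "coat protein": 483,
}


def orf_to_residues(length_nt, includes_stop=True):
    """Return the number of amino acids encoded by an ORF.

    Assumption (not stated in the abstract): the reported length
    includes the stop codon, so one codon is subtracted.
    """
    if length_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    codons = length_nt // 3
    return codons - 1 if includes_stop else codons


residues = {name: orf_to_residues(nt) for name, nt in orf_lengths_nt.items()}
for name, aa in residues.items():
    print(f"{name}: {aa} aa")
```

    Under that assumption the coat protein ORF (483 nt) corresponds to 160 residues; if the lengths exclude the stop codon, pass includes_stop=False instead.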

  11. Cloning and nucleotide sequence of wild type and a mutant histidine decarboxylase from Lactobacillus 30a.

    PubMed

    Vanderslice, P; Copeland, W C; Robertus, J D

    1986-11-15

    Prohistidine decarboxylase from Lactobacillus 30a is a protein that autoactivates to histidine decarboxylase by cleaving its peptide chain between serines 81 and 82 and converting Ser-82 to a pyruvoyl moiety. The pyruvoyl group serves as the prosthetic group for the decarboxylation reaction. We have cloned and determined the nucleotide sequence of the gene for this enzyme from a wild type strain and from a mutant with altered autoactivation properties. The nucleotide sequence modifies the previously determined amino acid sequence of the protein. A tripeptide missed in the chemical sequence is inserted, and three other amino acids show conservative changes. The activation mutant shows a single change of Gly-58 to an Asp. Sequence analysis up- and downstream from the gene suggests that histidine decarboxylase is part of a polycistronic message, and that the transcriptional promoter region is strongly homologous to those of other Gram-positive organisms.

  12. Human parainfluenza type 3 virus hemagglutinin-neuraminidase glycoprotein: nucleotide sequence of mRNA and limited amino acid sequence of the purified protein.

    PubMed Central

    Elango, N; Coligan, J E; Jambou, R C; Venkatesan, S

    1986-01-01

    The nucleotide sequence of mRNA for the hemagglutinin-neuraminidase (HN) protein of human parainfluenza type 3 virus obtained from the corresponding cDNA clone had a single long open reading frame encoding a putative protein of 64,254 daltons consisting of 572 amino acids. The deduced protein sequence was confirmed by limited N-terminal amino acid microsequencing of CNBr cleavage fragments of native HN that was purified by immunoprecipitation. The HN protein is moderately hydrophobic and has four potential sites (Asn-X-Ser/Thr) of N-glycosylation in the C-terminal half of the molecule. It is devoid of both the N-terminal signal sequence and the C-terminal membrane anchorage domain characteristic of the hemagglutinin of influenza virus and the fusion (F0) protein of the paramyxoviruses. Instead, it has a single prominent hydrophobic region capable of membrane insertion beginning at 32 residues from the N terminus. This N-terminal membrane insertion is similar to that of influenza virus neuraminidase and the recently reported structures of HN proteins of Sendai virus and simian virus 5. PMID:3003381

  13. Evolutionarily conserved sequences of striated muscle myosin heavy chain isoforms. Epitope mapping by cDNA expression.

    PubMed

    Miller, J B; Teal, S B; Stockdale, F E

    1989-08-05

    A cDNA expression strategy was used to localize amino acid sequences which were specific for fast, as opposed to slow, isoforms of the chicken skeletal muscle myosin heavy chain (MHC) and which were conserved in vertebrate evolution. Five monoclonal antibodies (mAbs), termed F18, F27, F30, F47, and F59, were prepared that reacted with all of the known chicken fast MHC isoforms but did not react with any of the known chicken slow or smooth muscle MHC isoforms. The epitopes recognized by mAbs F18, F30, F47, and F59 were on the globular head fragment of the MHC, whereas the epitope recognized by mAb F27 was on the helical tail or rod fragment. Reactivity of all five mAbs also was confined to fast MHCs in the rat, with the exception of mAb F59, which also reacted with the beta-cardiac MHC, the single slow MHC isoform common to both the rat heart and skeletal muscle. None of the five epitopes was expressed on amphioxus, nematode, or Dictyostelium MHC. The F27 and F59 epitopes were found on shark, electric ray, goldfish, newt, frog, turtle, chicken, quail, rabbit, and rat MHCs. The epitopes recognized by these mAbs were conserved, therefore, to varying degrees through vertebrate evolution and differed in sequence from homologous regions of a number of invertebrate MHCs and myosin-like proteins. The sequences of the epitopes on the head were mapped using a two-part cDNA expression strategy. First, Bal31 exonuclease digestion was used to rapidly generate fragments of a chicken embryonic fast MHC cDNA that were progressively deleted from the 3' end. These cDNA fragments were expressed as beta-galactosidase/MHC fusion proteins using the pUR290 vector; the fusion proteins were tested by immunoblotting for reactivity with the mAbs; and the approximate locations of the epitopes were determined from the sizes of the cDNA fragments that encoded a particular epitope.
The epitopes were then precisely mapped by expression of overlapping cDNA fragments of known sequence that

  14. Population genetics and phylogenetic analysis of the vrs1 nucleotide sequence in wild and cultivated barley.

    PubMed

    Ren, Xifeng; Wang, Yonggang; Yan, Songxian; Sun, Dongfa; Sun, Genlou

    2014-04-01

    Spike morphology is a key characteristic in the study of barley genetics, breeding, and domestication. Variation at the six-rowed spike 1 (vrs1) locus is sufficient to control the development and fertility of the lateral spikelet of barley. To study the genetic variation of vrs1 in wild barley (Hordeum vulgare subsp. spontaneum) and cultivated barley (Hordeum vulgare subsp. vulgare), nucleotide sequences of vrs1 were examined in 84 wild barleys (including 10 six-rowed) and 20 cultivated barleys (including 10 six-rowed) from four populations. The length of the vrs1 sequence amplified was 1536 bp. A total of 40 haplotypes were identified in the four populations. The highest nucleotide diversity, haplotype diversity, and per-site nucleotide diversity were observed in the Southwest Asian wild barley population. The nucleotide diversity, number of haplotypes, haplotype diversity, and per-site nucleotide diversity in two-rowed barley were higher than those in six-rowed barley. The phylogenetic analysis of the vrs1 sequences partially separated the six-rowed and the two-rowed barley. The six-rowed barleys were divided into four groups.
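
    The diversity statistics named above can be illustrated on a toy alignment. The abstract does not say which estimators or software were used, so the sketch below assumes two standard definitions: Nei's haplotype diversity and nucleotide diversity taken as the mean number of pairwise differences per site. The sequences in toy are invented for illustration and are not vrs1 data.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared frequencies)."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site over an ungapped alignment."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Invented toy alignment: four sequences, three distinct haplotypes.
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACCTACGA"]
print(len(set(toy)), "haplotypes")
print(round(haplotype_diversity(toy), 4))
print(round(nucleotide_diversity(toy), 4))
```

    Gapped or ambiguous sites would need explicit handling before either function is applied to real alignments.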

  15. Nucleotide composition of CO1 sequences in Chelicerata (Arthropoda): detecting new mitogenomic rearrangements.

    PubMed

    Arabi, Juliette; Judson, Mark L I; Deharveng, Louis; Lourenço, Wilson R; Cruaud, Corinne; Hassanin, Alexandre

    2012-02-01

    Here we study the evolution of nucleotide composition in third codon-positions of CO1 sequences of Chelicerata, using a phylogenetic framework, based on 180 taxa and three markers (CO1, 18S, and 28S rRNA; 5,218 nt). The analyses of nucleotide composition were also extended to all CO1 sequences of Chelicerata found in GenBank (1,701 taxa). The results show that most species of Chelicerata have a positive strand bias in CO1, i.e., in favor of C nucleotides, including all Amblypygi, Palpigradi, Ricinulei, Solifugae, Uropygi, and Xiphosura. However, several taxa show a negative strand bias, i.e., in favor of G nucleotides: all Scorpiones, Opisthothelae spiders and several taxa within Acari, Opiliones, Pseudoscorpiones, and Pycnogonida. Several reversals of strand-specific bias can be attributed to either a rearrangement of the control region or an inversion of a fragment containing the CO1 gene. Key taxa for which sequencing of complete mitochondrial genomes will be necessary to determine the origin and nature of mtDNA rearrangements involved in the reversals are identified. Acari, Opiliones, Pseudoscorpiones, and Pycnogonida were found to show a strong variability in nucleotide composition. In addition, both mitochondrial and nuclear genomes have been affected by higher substitution rates in Acari and Pseudoscorpiones. The results therefore indicate that these two orders are more liable to fix mutations of all types, including base substitutions, indels, and genomic rearrangements.
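
    To make the idea of third-codon-position composition and strand bias concrete, here is a minimal sketch. The abstract does not give the formula used for the bias, so the C-versus-G skew (C - G)/(C + G) below is an assumed illustrative measure, positive when C is favored and negative when G is favored; the input sequence is invented, not a real CO1 fragment.

```python
from collections import Counter


def third_position_composition(cds):
    """Base frequencies at third codon positions of an in-frame sequence."""
    thirds = cds.upper()[2::3]  # every third base, starting at position 3
    counts = Counter(b for b in thirds if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no unambiguous third-position bases found")
    return {b: counts[b] / total for b in "ACGT"}


def cg_skew_third_positions(cds):
    """Assumed skew measure: (C - G) / (C + G) at third codon positions."""
    thirds = cds.upper()[2::3]
    c, g = thirds.count("C"), thirds.count("G")
    if c + g == 0:
        return 0.0
    return (c - g) / (c + g)


# Invented in-frame fragment: codons ATC GGG TTC GGA GCC.
toy_cds = "ATCGGGTTCGGAGCC"
print(third_position_composition(toy_cds))
print(cg_skew_third_positions(toy_cds))
```

    The functions assume the sequence starts at the first position of a codon; a fragment in another frame would need trimming first.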

  16. Identification and sequence of a cDNA clone corresponding to a gene involved in development of Undaria pinnatifida

    NASA Astrophysics Data System (ADS)

    Hou, He-Shen; Li, Ning; Wu, Chao-Yuan

    1998-03-01

    During the induction of gamete-producing gametangia, induced gametophytes were collected at 4-day intervals (0, 4, 8, 12 d) and total RNAs were isolated by CsCl gradient ultracentrifugation. Some stage-specifically expressed mRNAs were identified by differential display of mRNAs from different developmental stages of the gametophytes. The cDNA of one specific mRNA was verified, cloned and sequenced. This gene was specifically expressed during 4 days of induction, and showed partial sequence homology to the tobacco IAA-binding protein gene. This suggests that the cDNA may represent a gene related to IAA-regulated function during gametophyte development.

  17. Nucleotide sequence and genome organization of a new proposed crinivirus, tetterwort vein chlorosis virus.

    PubMed

    Zhao, Fumei; Yoo, Ran Hee; Lim, Seungmo; Igori, Davaajargal; Lee, Su-Heon; Moon, Jae Sun

    2015-11-01

    The genome of tetterwort vein chlorosis virus (TVCV) from South Korea has been completely sequenced. Its genomic organization resembles those of other criniviruses, with several new features, indicating that TVCV is a member of a new species in the genus Crinivirus, family Closteroviridae. RNA1 contains 8467 nucleotides, with at least four open reading frames (ORFs). ORF1a encodes a protein with predicted papain-like protease, methyltransferase, and helicase activities. ORF1b encodes a putative RNA-dependent RNA polymerase that is apparently expressed through a +1 ribosomal frameshift. RNA2 contains 8113 nucleotides encoding at least nine proteins, similar to most crinivirus RNA2s. The 3' untranslated regions of the bipartite RNA genome share 82.1% nucleotide sequence identity.

  18. Complete nucleotide sequence of the new potexvirus "Alstroemeria virus X". Brief report.

    PubMed

    Fuji, S; Shinoda, K; Ikeda, M; Furuya, H; Naito, H; Fukumoto, F

    2005-11-01

    A flexuous virus was isolated in Japan from an alstroemeria plant showing mosaic symptoms. The virus had a broad host range but had systemically latent infectivity in alstroemeria. The virus was assigned to the genus Potexvirus based on morphology and physical properties and on an analysis of the complete nucleotide sequence. The genomic RNA of the virus was 7,009 nucleotides in length, excluding the 3'-terminal poly (A) tail. It contained five open reading frames (ORFs), which was consistent with other members of the genus Potexvirus. Although nucleotide sequences of the ORFs differ from previously reported potexviruses, a phylogenetic analysis placed it phylogenetically close to Narcissus mosaic virus and Scallion virus X. Therefore, we propose that this virus should be designated as Alstroemeria virus X (AlsVX).

  19. Complete nucleotide sequence of a begomovirus and associated betasatellite infecting croton (Croton bonplandianus) in Pakistan.

    PubMed

    Hussain, Khadim; Hussain, Mazhar; Mansoor, Shahid; Briddon, Rob W

    2011-06-01

    The complete sequences of a begomovirus and an associated betasatellite isolated from Croton bonplandianus originating from Pakistan were determined. The sequence of the begomovirus showed the highest level of nucleotide sequence identity (88.9%) to an isolate of papaya leaf curl virus and thus represents a new species, for which we propose the name Croton yellow vein virus (CYVV). The sequence of the betasatellite showed the highest levels of sequence identity (82 to 98.4%) to six sequences in the databases that have yet to be reported, followed by isolates of tomato leaf curl Joydebpur betasatellite (48.7 to 52.5%). This indicates that the betasatellite identified here (and the six sequences in the databases) is an isolate of a newly identified species for which the name Croton yellow vein mosaic betasatellite (CroYVMB) is proposed. For the begomovirus, an analysis of the sequence indicates that it has a recombinant origin.

  20. Complete nucleotide sequence of a novel strain of fig fleck-associated virus from China.

    PubMed

    He, Zhen; Mijit, Mahmut; Li, Shifang; Zhang, Zhixiang

    2017-04-01

    The complete nucleotide sequence of fig fleck-associated virus from Xinjiang Uygur Autonomous Region of China (FFkaV-CN) was determined. The 6,723-nucleotide-long viral genome, excluding a terminal poly(A) tail, contains three open reading frames (ORFs). Pairwise comparisons showed that FFkaV-CN shares 83% and 92% sequence identity with FFkaV-Italy based on the complete genomic sequence and CP aa sequence, respectively, slightly higher than the species demarcation criterion for the genus Maculavirus. Phylogenetic analysis showed that FFkaV-CN and FFkaV-Italy clustered into one group. These results indicate that FFkaV-CN is a novel strain of FFkaV with a genome organization somewhat different from what was reported for FFkaV-Italy.

  1. Dependence of the E. coli promoter strength and physical parameters upon the nucleotide sequence

    PubMed Central

    Berezhnoy, Andrey Y.; Shckorbatov, Yuriy G.

    2005-01-01

    The energy of interaction between complementary nucleotides in promoter sequences of E. coli was calculated and visualized. A graphic method for presenting the energy properties of promoter sequences was developed. The data indicate that the energy distribution along the length of a promoter sequence yields a profile with minima at the −35, −8 and +7 regions, corresponding to areas with elevated AT (adenine-thymine) content. The most important difference from random sequences is at the −8 region. Four promoter groups and their energy properties were identified. Promoters with minimal and maximal energy of interaction between complementary nucleotides have low strengths; the strongest promoters belong to promoter clusters characterized by intermediate energy values. PMID:16252339

  2. On the feasibility of using the intrinsic fluorescence of nucleotides for DNA sequencing.

    SciTech Connect

    Chowdhury, M. H.; Ray, K.; Johnson, R. L.; Gray, S. K.; Pond, J.; Lakowicz, J. R.; Univ. of Maryland; Univ. of Virginia; Lumerical Solutions, Inc.

    2010-04-29

    There is presently a worldwide effort to increase the speed and decrease the cost of DNA sequencing as exemplified by the goal of the National Human Genome Research Institute (NHGRI) to sequence a human genome for under $1000. Several high throughput technologies are under development. Among these, single strand sequencing using exonuclease appears very promising. However, this approach requires complete labeling of at least two bases at a time, with extrinsic high quantum yield probes. This is necessary because nucleotides absorb in the deep ultraviolet (UV) and emit with extremely low quantum yields. Hence intrinsic emission from DNA and nucleotides is not being exploited for DNA sequencing. In the present paper we consider the possibility of identifying single nucleotides using their intrinsic emission. We used the finite-difference time-domain (FDTD) method to calculate the effects of aluminum nanoparticles on nearby fluorophores that emit in the UV. We find that the radiated power of UV fluorophores is significantly increased when they are in close proximity to aluminum nanostructures. We show that there will be increased localized excitation near aluminum particles at wavelengths used to excite intrinsic nucleotide emission. Using FDTD simulation we show that a typical DNA base when coupled to appropriate aluminum nanostructures leads to highly directional emission. Additionally we present experimental results showing that a thin film of nucleotides shows enhanced emission when in close proximity to aluminum nanostructures. Finally we provide Monte Carlo simulations that predict high levels of base calling accuracy for an assumed number of photons that is derived from the emission spectra of the intrinsic fluorescence of the bases. Our results suggest that single nucleotides can be detected and identified using aluminum nanostructures that enhance their intrinsic emission. This capability would be valuable for the ongoing efforts toward the $1000 genome.

  3. Complete Nucleotide Sequence of a Citrobacter freundii Plasmid Carrying KPC-2 in a Unique Genetic Environment

    PubMed Central

    Yao, Yancheng; Imirzalioglu, Can; Hain, Torsten; Kaase, Martin; Gatermann, Soeren; Exner, Martin; Mielke, Martin; Hauri, Anja; Dragneva, Yolanta; Bill, Rita; Wendt, Constanze; Wirtz, Angela; Chakraborty, Trinad

    2014-01-01

    The complete and annotated nucleotide sequence of a 54,036-bp plasmid harboring a blaKPC-2 gene that is clonally present in Citrobacter isolates from different species is presented. The plasmid belongs to incompatibility group N (IncN) and harbors the class A carbapenemase KPC-2 in a unique genetic environment. PMID:25395635

  4. Assessing the utility of the Oxford Nanopore MinION for snake venom gland cDNA sequencing

    PubMed Central

    Hargreaves, Adam D.

    2015-01-01

    Portable DNA sequencers such as the Oxford Nanopore MinION device have the potential to be truly disruptive technologies, facilitating new approaches and analyses and, in some cases, taking sequencing out of the lab and into the field. However, the capabilities of these technologies are still being revealed. Here we show that single-molecule cDNA sequencing using the MinION accurately characterises venom toxin-encoding genes in the painted saw-scaled viper, Echis coloratus. We find the raw sequencing error rate to be around 12%, improved to 0–2% with hybrid error correction and 3% with de novo error correction. Our corrected data provides full coding sequences and 5′ and 3′ UTRs for 29 of 33 candidate venom toxins detected, far superior to Illumina data (13/40 complete) and Sanger-based ESTs (15/29). We suggest that, should the current pace of improvement continue, the MinION will become the default approach for cDNA sequencing in a variety of species. PMID:26623194

  5. The nucleotide sequence and genome structure of mung bean yellow mosaic geminivirus.

    PubMed

    Morinaga, T; Ikegami, M; Miura, K

    1993-01-01

    Complete nucleotide sequences of the infectious cloned DNA components (DNA 1 and DNA 2) of mung bean yellow mosaic virus (MYMV) were determined. MYMV DNA 1 and DNA 2 consist of 2,723 and 2,675 nucleotides, respectively. DNA 1 and DNA 2 have little sequence similarity except for a region of approximately 200 bases which is almost identical in the two molecules. Analysis of open reading frames revealed nine potential coding regions for proteins of mol. wt. > 10,000, six in DNA 1 and three in DNA 2. The nucleotide sequence of MYMV DNA was compared with that of bean golden mosaic virus (BGMV), tomato golden mosaic virus (TGMV) and African cassava mosaic virus (ACMV). The 200-base region common to the two DNAs of each virus had little sequence similarity, except for a highly conserved 33-36 base sequence potentially capable of forming a stable hairpin structure. The potential coding regions in the MYMV DNAs had counterparts in BGMV, TGMV and ACMV, suggesting an overall similarity in genome organization, except for the absence of 1L3 in MYMV DNA 1. The most highly conserved ORFs, MYMV 1R1, BGMV 1R1, TGMV 1R1 and ACMV 1R1, are the putative genes for the coat proteins of MYMV, BGMV, TGMV and ACMV, respectively. MYMV 1L1 also has a high degree of sequence similarity with BGMV 1L1, TGMV 1L1 and ACMV 1L1.

  6. Nucleotide sequence of the 3'-terminal region of potato virus YN RNA.

    PubMed

    van der Vlugt, R; Allefs, S; de Haan, P; Goldbach, R

    1989-01-01

    The sequence of the 3'-terminal 1611 nucleotides of the genome of the tobacco veinal necrosis strain of potato virus Y (PVYN) was determined. The sequence revealed an open reading frame of 1285 nucleotides, of which the start was not identified, and an untranslated region of 316 nucleotides upstream of a poly(A) tract. Comparison of the open reading frame with the amino-terminal sequence of the viral coat protein enabled mapping of the start of the coat protein at amino acid -267, and indicated that maturation of this protein requires proteolytic processing from a larger polyprotein precursor at a glutamine/glycine dipeptide sequence. The coat protein of PVYN displayed significant (51 to 63%) sequence homology to the coat proteins of four other potyviruses, tobacco etch virus, tobacco vein mottling virus, plum pox virus and sugarcane mosaic virus. Even higher sequence homology (91%) was detected with the coat protein of a fifth potyvirus, pepper mottle virus (PeMV). This homology was of the same level as found between the coat proteins of PVYN and a second strain of this virus, PVYD. Since, moreover, PVYN and PeMV were the only potyviruses displaying homology in the 3'-terminal, non-translated regions of their genomes, we conclude that PeMV should be regarded as a strain of PVY.

  7. Cloning and sequencing of a cDNA encoding a heat-stable sweet protein, mabinlin II.

    PubMed

    Nirasawa, S; Masuda, Y; Nakaya, K; Kurihara, Y

    1996-11-28

    A cDNA clone encoding a heat-stable sweet protein, mabinlin II (MAB), was isolated and sequenced. The encoded precursor to MAB was composed of 155 amino acid (aa) residues, including a signal sequence of 20 aa, an N-terminal extension peptide of 15 aa, a linker peptide of 14 aa and one residue of C-terminal extension. Comparison of the proteolytic cleavage sites during post-translational processing of MAB precursor with those of like 2S seed-storage proteins of Arabidopsis thaliana, Brassica napus and Bertholletia excelsa shows that the three individual cleavage sites between respective species are conserved.

  8. Prediction of human rotavirus serotype by nucleotide sequence analysis of the VP7 protein gene.

    PubMed Central

    Green, K Y; Sears, J F; Taniguchi, K; Midthun, K; Hoshino, Y; Gorziglia, M; Nishikawa, K; Urasawa, S; Kapikian, A Z; Chanock, R M

    1988-01-01

    Human rotavirus field isolates were characterized by direct sequence analysis of the gene encoding the serotype-specific major neutralization protein (VP7). Single-stranded RNA transcripts were prepared from virus particles obtained directly from stool specimens or after two or three passages in MA-104 cells. Two regions of the gene (nucleotides 307 through 351 and 670 through 711) which had previously been shown to contain regions of sequence divergence among rotavirus serotypes were sequenced by the dideoxynucleotide method with two different synthetic oligonucleotide primers. The resulting nucleotide sequences were compared with the corresponding sequences from rotaviruses of known serotype (serotype 1, 2, 3, or 4). A total of 25 field isolates and 10 laboratory strains examined by this method exhibited marked sequence identity in both areas of the gene with the corresponding regions of 1 of the 4 reference strains. In addition, the predicted serotype from the sequence analysis correlated in each case with the serotype determined when the rotaviruses were examined by plaque reduction neutralization or reactivity with serotype-specific monoclonal antibodies. These data suggest that as a result of the high degree of sequence conservation observed among rotaviruses of the same serotype, it is possible to predict the serotype of a rotavirus isolate by direct sequence analysis of its VP7 gene. PMID:2833626
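
    The comparison described above, matching two divergent regions of an isolate's VP7 gene against reference strains of known serotype, can be sketched as a nearest-reference lookup. This is an illustration under stated assumptions, not the authors' procedure: the helper names are hypothetical, the sequences are assumed to be co-linear and ungapped so that the same coordinates apply to every strain, and the toy sequences and toy coordinates in the usage lines are invented.

```python
# Regions of the VP7 gene named in the abstract (1-based, inclusive).
VP7_REGIONS = [(307, 351), (670, 711)]


def extract_regions(gene, regions):
    """Concatenate the given 1-based inclusive regions of a sequence."""
    return "".join(gene[start - 1:end] for start, end in regions)


def percent_identity(a, b):
    """Percentage of identical positions between equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


def predict_serotype(isolate, references, regions=VP7_REGIONS):
    """Return (best serotype, per-serotype identity) for an isolate."""
    query = extract_regions(isolate, regions)
    scores = {
        serotype: percent_identity(query, extract_regions(ref, regions))
        for serotype, ref in references.items()
    }
    return max(scores, key=scores.get), scores


# Invented 12-nt toy sequences with toy regions, for demonstration only.
toy_regions = [(3, 6), (9, 12)]
toy_refs = {1: "AAACCCGGGTTT", 2: "AAAGGGCCCTTT"}
best, scores = predict_serotype("AAACCCGGGTTA", toy_refs, toy_regions)
print(best, scores)
```

    With real data the default VP7_REGIONS would apply, and ties or low best scores would deserve a manual check rather than an automatic call.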

  9. Cell cycle regulated synthesis of stable mouse thymidine kinase mRNA is mediated by a sequence within the cDNA.

    PubMed Central

    Hofbauer, R; Müllner, E; Seiser, C; Wintersberger, E

    1987-01-01

    The cDNA for mouse thymidine kinase (TK) was isolated from a cDNA library in lambda-gt11 and sequenced. It was used as a probe to follow the time course of TK mRNA expression in growth stimulated mouse fibroblasts. Linked to the HSV-TK promoter the cDNA was able to transform LTK-cells to the TK+ phenotype. The transformed cells expressed the TK mRNA and enzyme activity in a growth dependent fashion suggesting that the regulatory element is localized on the cDNA. PMID:3822814

  10. Guinea pig alpha 1-microglobulin/bikunin: cDNA sequencing, tissue expression and expression during acute phase.

    PubMed

    Yoshida, K; Suzuki, Y; Yamamoto, K; Sinohara, H

    1999-02-01

    cDNA encoding alpha 1-microglobulin/bikunin (AMBP) was amplified from guinea pig (Cavia porcellus) liver mRNA by reverse transcription-polymerase chain reaction (RT-PCR) and rapid amplification of cDNA ends methods, cloned and sequenced. The deduced amino acid sequence was found to be homologous to the sequence of AMBP of other mammals (69-76% amino acid identity). It has two Kunitz-type trypsin inhibitor domains in the bikunin part as reactive sites, one in the N-terminal region and another in the C-terminal region. The N-terminal inhibitor domain sequence is well-conserved, but the P1 residue of the C-terminal inhibitor domain sequence was found to be Gln rather than Arg, a residue highly conserved in the AMBP of seven other mammals examined to date. By RT-PCR and nested PCR, AMBP mRNA was detected not only in liver tissue, previously known to be a site of its synthesis, but also in pancreas, stomach, small intestine, colon, lung, spleen, kidney, testis, skeletal muscle, and leukocytes, but not in brain or heart. We examined the AMBP mRNA levels in guinea pig liver by RT-PCR, comparing normal levels and those in a state of inflammation. The mRNA levels, however, did not significantly change.

  11. Nucleotide sequence of miRNA precursor contributes to cleavage site selection by Dicer.

    PubMed

    Starega-Roslan, Julia; Galka-Marciniak, Paulina; Krzyzosiak, Wlodzimierz J

    2015-12-15

    The ribonuclease Dicer excises mature miRNAs from a diverse group of precursors (pre-miRNAs), most of which contain various secondary structure motifs in their hairpin stem. In this study, we analyzed Dicer cleavage in hairpin substrates deprived of such motifs. We searched for the factors other than the secondary structure, which may influence the length diversity and heterogeneity of miRNAs. We found that the nucleotide sequence at the Dicer cleavage site influences both of these miRNA characteristics. With regard to cleavage mechanism, we demonstrate that the Dicer RNase IIIA domain that cleaves within the 3' arm of the pre-miRNA is more sensitive to the nucleotide sequence of its substrate than is the RNase IIIB domain. The RNase IIIA domain avoids releasing miRNAs with G nucleotide and prefers to generate miRNAs with a U nucleotide at the 5' end. We also propose that the sequence restrictions at the Dicer cleavage site might be the factor that contributes to the generation of miRNA duplexes with 3' overhangs of atypical lengths. This finding implies that the two RNase III domains forming the single processing center of Dicer may exhibit some degree of flexibility, which allows for the formation of these non-standard 3' overhangs.

  12. Nucleotide sequence of miRNA precursor contributes to cleavage site selection by Dicer

    PubMed Central

    Starega-Roslan, Julia; Galka-Marciniak, Paulina; Krzyzosiak, Wlodzimierz J.

    2015-01-01

    The ribonuclease Dicer excises mature miRNAs from a diverse group of precursors (pre-miRNAs), most of which contain various secondary structure motifs in their hairpin stem. In this study, we analyzed Dicer cleavage in hairpin substrates deprived of such motifs. We searched for the factors other than the secondary structure, which may influence the length diversity and heterogeneity of miRNAs. We found that the nucleotide sequence at the Dicer cleavage site influences both of these miRNA characteristics. With regard to cleavage mechanism, we demonstrate that the Dicer RNase IIIA domain that cleaves within the 3′ arm of the pre-miRNA is more sensitive to the nucleotide sequence of its substrate than is the RNase IIIB domain. The RNase IIIA domain avoids releasing miRNAs with G nucleotide and prefers to generate miRNAs with a U nucleotide at the 5′ end. We also propose that the sequence restrictions at the Dicer cleavage site might be the factor that contributes to the generation of miRNA duplexes with 3′ overhangs of atypical lengths. This finding implies that the two RNase III domains forming the single processing center of Dicer may exhibit some degree of flexibility, which allows for the formation of these non-standard 3′ overhangs. PMID:26424848

  13. Nucleotide sequence and genome organization of atractylodes mottle virus, a new member of the genus Carlavirus.

    PubMed

    Zhao, Fumei; Igori, Davaajargal; Lim, Seungmo; Yoo, Ran Hee; Lee, Su-Heon; Moon, Jae Sun

    2015-11-01

    The complete genome sequence of a member of a distinct species of the genus Carlavirus in the family Betaflexiviridae, tentatively named atractylodes mottle virus (AtrMoV), has been determined. Analysis of its genomic organization indicates that it has a single-stranded, positive-sense genomic RNA of 8866 nucleotides, excluding the poly(A) tail, and consists of six open reading frames typical of members of the genus Carlavirus. The individual open reading frames of AtrMoV show moderately low sequence similarity to those of other carlaviruses at the nucleotide and amino acid sequence levels. Pairwise comparison and phylogenetic analysis suggest that AtrMoV is most closely related to chrysanthemum virus B.

  14. A novel HLA-B*51 allele (B*5116) identified by nucleotide sequencing.

    PubMed

    Tamouza, R; Carbonnelle, E; Schaeffer, V; Sadki, K; Abed, Y; Marzais, F; Poirier, J C; Fortier, C; Toubert, A; Raffoux, C; Charron, D

    2000-02-01

    We report here an additional HLA-B*51 variant designated HLA-B*5116. Detected by an abnormal serological reactivity pattern, this variant was identified as a B*51 allele by polymerase chain reaction using sequence-specific primers (PCR-SSP) and characterized by nucleotide sequencing. The new variant sequence closely matches that of the classical HLA-B*5101 except for two adjacent nucleotide substitutions at positions 216 and 217 of the third exon and the resulting leucine to glutamic acid change at codon 163 of the alpha2 domain (CTG-->GAG). This new variant was not detected in three different ethnic groups (French, Algerian and Lebanese), suggesting that it is very rare.

  15. cDNA cloning and sequence of MAL, a hydrophobic protein associated with human T-cell differentiation.

    PubMed Central

    Alonso, M A; Weissman, S M

    1987-01-01

    We have isolated a human cDNA that is expressed in the intermediate and late stages of T-cell differentiation. The cDNA encodes a highly hydrophobic protein, termed MAL, that lacks a hydrophobic leader peptide sequence and contains four potential transmembrane domains separated by short hydrophilic segments. The predicted configuration of the MAL protein resembles the structure of integral proteins that form pores or channels in the plasma membrane and that are believed to act as transporters of water-soluble molecules and ions across the lipid bilayer. The presence of MAL mRNA in a panel of T-cell lines that express both the T-cell receptor and the T11 antigen suggests that MAL may be involved in membrane signaling in T cells activated via either T11 or T-cell receptor pathways. Images PMID:3494249

  16. Nucleotide sequence and genome organization of Dweet mottle virus and its relationship to members of the family Betaflexiviridae

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The nucleotide sequence of Dweet mottle virus (DMV) was determined and compared to sequences of members of the family Alpha- and Beta-flexiviridae. The DMV genome has 8747 nucleotides (nt) excluding the poly-(A) tail at the 3’ end of the genome. The overall G+C content of DMV genomic RNA is 40%. D...

  17. Complete nucleotide sequence analysis of a Dengue-1 virus isolated on Easter Island, Chile.

    PubMed

    Cáceres, C; Yung, V; Araya, P; Tognarelli, J; Villagra, E; Vera, L; Fernández, J

    2008-01-01

    Dengue-1 viruses responsible for the dengue fever outbreak in Easter Island in 2002 were isolated from acute-phase sera of dengue fever patients. In order to analyze the complete genome sequence, we designed primers to amplify contiguous segments across the entire sequence of the viral genome. RT-PCR products obtained were cloned, and complete nucleotide and deduced amino acid sequences were determined. This report constitutes the first complete genetic characterization of a DENV-1 isolate from Chile. Phylogenetic analysis shows that an Easter Island isolate is most closely related to Pacific DENV-1 genotype IV viruses.

  18. Complete nucleotide sequence of a subviral DNA molecule of porcine circovirus type 2.

    PubMed

    Wen, Han

    2016-07-01

    Porcine circovirus type 2 (PCV2) is a member of the genus Circovirus in the family Circoviridae. Most subgenomic molecules of PCV2 have been mapped. Here, the first full-length sequence of a subviral molecule of PCV2 (CH-IVT12) containing a reverse complement sequence of the PCV2 genome was determined by sequencing DNA extracted from PK15 cells infected with PCV2. The circular CH-IVT12 DNA consists of 1136 nucleotides and contains one major open reading frame.

  19. Nucleotide sequence of a new isolate of ribgrass mosaic tobamovirus infecting Impatiens New Guinea.

    PubMed

    Wetzel, T; Njapo Ngangom, H O; Chotewutmontri, S; Krczal, G

    2006-04-01

    The complete nucleotide sequence of a tobamovirus isolated from Impatiens New Guinea was determined. The genome was 6302 nt long, and its genomic organisation was similar to those of other crucifer tobamoviruses. Sequence comparisons with the corresponding sequences of other crucifer tobamoviruses revealed highest levels of identity with the ribgrass mosaic virus (Shanghai isolate). A small open reading frame putatively encoding a 4.5-kDa protein with a low degree of similarity to the ORF6 of tobacco mosaic virus was found nested in the movement protein gene.

  20. Nucleotide sequences of the coat protein genes of two Japanese zucchini yellow mosaic virus isolates.

    PubMed

    Kundu, A K; Ohshima, K; Sako, N

    1997-10-01

    The nucleotide (nt) sequences of the coat protein (CP) genes of two Japanese zucchini yellow mosaic virus (ZYMV) isolates (ZYMV-169 and ZYMV-M) were determined. The CP genes of both isolates were 837 nt long and encoded 279 amino acids (aa). The nt and deduced aa sequence similarities between the two isolates were 92% and 94.6%, respectively. The deduced aa sequences of CPs of the Japanese isolates were compared with those of previously reported ZYMV isolates by phylogenetic analysis. This comparison led us to divide all ZYMV isolates into 3 groups in which ZYMV-169 formed its own distinct group.

  1. Sequence selective naked-eye detection of DNA harnessing extension of oligonucleotide-modified nucleotides.

    PubMed

    Verga, Daniela; Welter, Moritz; Marx, Andreas

    2016-02-01

    DNA polymerases can efficiently and sequence selectively incorporate oligonucleotide (ODN)-modified nucleotides and the incorporated oligonucleotide strand can be employed as primer in rolling circle amplification (RCA). The effective amplification of the DNA primer by Φ29 DNA polymerase allows the sequence-selective hybridisation of the amplified strand with a G-quadruplex DNA sequence that has horseradish peroxidase-like activity. Based on these findings we develop a system that allows DNA detection with single-base resolution by naked eye.

  2. The nucleotide sequence at the termini of adenovirus type 5 DNA.

    PubMed Central

    Steenbergh, P H; Maat, J; van Ormondt, H; Sussenbach, J S

    1977-01-01

    The sequences of the first 194 base pairs at both termini of adenovirus type 5 (Ad5) DNA have been determined, using the chemical degradation technique developed by Maxam and Gilbert (Proc. Nat. Acad. Sci. USA 74 (1977), pp. 560-564). The nucleotide sequences 1-75 were confirmed by analysis of labeled RNA transcribed from the terminal HhaI fragments in vitro. The sequence data show that Ad5 DNA has a perfect inverted terminal repetition 103 base pairs long. Images PMID:600799
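
    A perfect inverted terminal repetition means that the first n bases of one strand equal the reverse complement of its last n bases. The sketch below measures n for a toy sequence; the function names and the example sequence are illustrative assumptions and are not taken from the paper or from the Ad5 genome.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def inverted_terminal_repeat_length(genome):
    """Length of the longest 5' prefix that equals the reverse
    complement of the 3' end of the same strand."""
    seq = genome.upper()
    rc = reverse_complement(seq)
    length = 0
    # The start of rc is the reverse complement of the end of seq,
    # so a position-by-position comparison measures the repeat.
    for a, b in zip(seq, rc):
        if a != b:
            break
        length += 1
    return length


# Toy example (hypothetical sequence): 5-base repeat around a spacer.
toy = "CATCA" + "AAAAAA" + "TGATG"
itr_len = inverted_terminal_repeat_length(toy)
```

    Applied to a complete linear genome sequence, the same comparison would report the length of the terminal repetition, provided the sequence contains only unambiguous bases.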

  3. Characterization of expressed sequence tags from a full-length enriched cDNA library of Cryptomeria japonica male strobili

    PubMed Central

    Futamura, Norihiro; Totoki, Yasushi; Toyoda, Atsushi; Igasaki, Tomohiro; Nanjo, Tokihiko; Seki, Motoaki; Sakaki, Yoshiyuki; Mari, Adriano; Shinozaki, Kazuo; Shinohara, Kenji

    2008-01-01

    Background Cryptomeria japonica D. Don is one of the most commercially important conifers in Japan. However, the allergic disease caused by its pollen is a severe public health problem in Japan. Since large-scale analysis of expressed sequence tags (ESTs) in the male strobili of C. japonica should help us to clarify the overall expression of genes during the process of pollen development, we constructed a full-length enriched cDNA library that was derived from male strobili at various developmental stages. Results We obtained 36,011 expressed sequence tags (ESTs) from either one or both ends of 19,437 clones derived from the cDNA library of C. japonica male strobili at various developmental stages. The 19,437 cDNA clones corresponded to 10,463 transcripts. Approximately 80% of the transcripts resembled ESTs from Pinus and Picea, while approximately 75% had homologs in Arabidopsis. An analysis of homologies between ESTs from C. japonica male strobili and known pollen allergens in the Allergome Database revealed that products of 180 transcripts exhibited significant homology. Approximately 2% of the transcripts appeared to encode transcription factors. We identified twelve genes for MADS-box proteins among these transcription factors. The twelve MADS-box genes were classified as DEF/GLO/GGM13-, AG-, AGL6-, TM3- and TM8-like MIKCC genes and type I MADS-box genes. Conclusion Our full-length enriched cDNA library derived from C. japonica male strobili provides information on expression of genes during the development of male reproductive organs. We provided potential allergens in C. japonica. We also provided new information about transcription factors including MADS-box genes expressed in male strobili of C. japonica. Large-scale gene discovery using full-length cDNAs is a valuable tool for studies of gymnosperm species. PMID:18691438

  4. The vicilin gene family of pea (Pisum sativum L.): a complete cDNA coding sequence for preprovicilin.

    PubMed Central

    Lycett, G W; Delauney, A J; Gatehouse, J A; Gilroy, J; Croy, R R; Boulter, D

    1983-01-01

    A cDNA plasmid bank has been constructed using mRNA from developing pea seeds and three cDNAs coding for vicilin polypeptides have been selected. These cDNAs have been sequenced and between them cover the whole of the coding sequence plus part of the 5' and 3' untranslated regions. Comparison with amino acid sequence data from the protein indicates that vicilin is synthesised as preprovicilin with subsequent removal of a signal peptide and a C-terminal peptide as well as post translational endo-proteolytic cleavage. The cDNAs represent two different classes of vicilin genes whilst amino acid data show that there are at least three major classes of vicilin polypeptide. The vicilin sequences show extensive homology with conglycinin and phaseolin except in the regions of the internal proteolytic cleavages. The evolutionary significance of this relationship is discussed. Images PMID:6687941

  5. PCR amplification and sequences of cDNA clones for the small and large subunits of ADP-glucose pyrophosphorylase from barley tissues.

    PubMed

    Villand, P; Aalen, R; Olsen, O A; Lüthi, E; Lönneborg, A; Kleczkowski, L A

    1992-06-01

    Several cDNAs encoding the small and large subunit of ADP-glucose pyrophosphorylase (AGP) were isolated from total RNA of the starchy endosperm, roots and leaves of barley by polymerase chain reaction (PCR). Sets of degenerate oligonucleotide primers, based on previously published conserved amino acid sequences of plant AGP, were used for synthesis and amplification of the cDNAs. For each of the endosperm, roots and leaves, the restriction analysis of PCR products (ca. 550 nucleotides each) has revealed heterogeneity, suggesting the presence of three transcripts for AGP in the endosperm and roots, and up to two AGP transcripts in the leaf tissue. Based on the derived amino acid sequences, two clones from the endosperm, beps and bepl, were identified as coding for the small and large subunit of AGP, respectively, while a leaf transcript (blpl) encoded the putative large subunit of AGP. There was about 50% identity between the endosperm clones, and both of them were about 60% identical to the leaf cDNA. Northern blot analysis has indicated that beps and bepl are expressed in both the endosperm and roots, while blpl is detectable only in leaves. Application of the PCR technique in studies on gene structure and gene expression of plant AGP is discussed.

  6. PatMatch: a program for finding patterns in peptide and nucleotide sequences

    PubMed Central

    Yan, Thomas; Yoo, Danny; Berardini, Tanya Z.; Mueller, Lukas A.; Weems, Dan C.; Weng, Shuai; Cherry, J. Michael; Rhee, Seung Y.

    2005-01-01

    Here, we present PatMatch, an efficient, web-based pattern-matching program that enables searches for short nucleotide or peptide sequences such as cis-elements in nucleotide sequences or small domains and motifs in protein sequences. The program can be used to find matches to a user-specified sequence pattern that can be described using ambiguous sequence codes and a powerful and flexible pattern syntax based on regular expressions. A recent upgrade has improved performance and now supports both mismatches and wildcards in a single pattern. This enhancement has been achieved by replacing the previous searching algorithm, scan_for_matches [D'Souza et al. (1997), Trends in Genetics, 13, 497–498], with nondeterministic-reverse grep (NR-grep), a general pattern matching tool that allows for approximate string matching [Navarro (2001), Software Practice and Experience, 31, 1265–1312]. We have tailored NR-grep to be used for DNA and protein searches with PatMatch. The stand-alone version of the software can be adapted for use with any sequence dataset and is available for download from The Arabidopsis Information Resource (TAIR). The PatMatch server is available on the web for searching Arabidopsis thaliana sequences. PMID:15980466
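
    The idea of describing a pattern with ambiguous sequence codes and matching it through regular expressions can be sketched in a few lines. This is not PatMatch or NR-grep and it does not handle mismatches; it only expands the standard IUPAC nucleotide codes into regex character classes. The function names and the example sequence are assumptions made for illustration.

```python
import re

# Standard IUPAC nucleotide ambiguity codes as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(pattern):
    """Translate an IUPAC nucleotide pattern into a compiled regex."""
    return re.compile("".join(IUPAC[ch] for ch in pattern.upper()))


def find_matches(pattern, sequence):
    """Return (start, matched_text) for every non-overlapping exact match."""
    regex = iupac_to_regex(pattern)
    return [(m.start(), m.group()) for m in regex.finditer(sequence.upper())]


# E-box-like pattern CANNTG searched in a made-up sequence.
hits = find_matches("CANNTG", "ttCACGTGaaCATATGcc")
# hits == [(2, "CACGTG"), (10, "CATATG")]
```

    Approximate matching of the kind the abstract attributes to NR-grep would need a dedicated algorithm and is outside the scope of this sketch.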

  7. Pseudo nucleotide composition or PseKNC: an effective formulation for analyzing genomic sequences.

    PubMed

    Chen, Wei; Lin, Hao; Chou, Kuo-Chen

    2015-10-01

    With the avalanche of DNA/RNA sequences generated in the post-genomic age, it is urgent to develop automated methods for analyzing the relationship between the sequences and their functions. Towards this goal, a series of sequence-based methods have been proposed and applied to analyze various character-unknown DNA/RNA sequences in order to gain an in-depth understanding of their action mechanisms and processes. Compared with the classical sequence-based methods, the pseudo nucleotide composition or PseKNC approach developed very recently has the following advantages: (1) it can convert length-different DNA/RNA sequences into dimension-fixed digital vectors that can be directly handled by all the existing machine-learning algorithms or operation engines; (2) it can contain the desired features and properties according to the selection or definition of users; (3) it can cover considerable sequence pattern information, both local and global. This minireview is focused on the concept of pseudo nucleotide composition, its development and applications.
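
    Advantage (1), turning sequences of different lengths into vectors of fixed dimension, can be illustrated with plain k-mer composition. The sketch below is a simplification and is not the PseKNC formulation itself, which adds further components beyond raw k-mer frequencies; the function name and example sequences are assumptions.

```python
from itertools import product


def kmer_composition(sequence, k=2):
    """Return normalized k-mer frequencies in a fixed alphabetical order.

    The vector always has 4**k entries, whatever the sequence length.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = sequence.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing ambiguous bases
            counts[word] += 1
    total = sum(counts.values())
    return [counts[w] / total if total else 0.0 for w in kmers]


# Two made-up sequences of different lengths give vectors of equal size.
short_vec = kmer_composition("ACGTAC")
long_vec = kmer_composition("ACGTACGTTTGACCA")
```

    Both vectors have 16 entries for k = 2, so they can be passed to the same downstream classifier without padding or truncation.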

  8. Linking the human cytogenetic map with nucleotide sequence: the CCAP clone set.

    PubMed

    Jang, Wonhee; Yonescu, Raluca; Knutsen, Turid; Brown, Theresa; Reppert, Tricia; Sirotkin, Karl; Schuler, Gregory D; Ried, Thomas; Kirsch, Ilan R

    2006-07-15

    We present the completed dataset and clone repository of the Cancer Chromosome Aberration Project (CCAP), an initiative developed and funded through the intramural program of the U.S. National Cancer Institute, to provide seamless linkage of human cytogenetic markers with the primary nucleotide sequence of the human genome. Spaced at 1-2 Mb intervals across the human genome, 1,339 bacterial artificial chromosome (BAC) clones have been localized to chromosomal bands through high-resolution fluorescence in situ hybridization (FISH) mapping. Of these clones, 99.8% can be positioned on the primary human genome sequence and 95% are placed at or close to their precise nucleotide starts and stops. This dataset can be studied and manipulated within generally available public Web sites. The clones are available from a commercial repository. The CCAP BAC clone set provides anchors for the interrogation of gene and sequence involvement in oncogenic and developmental disorders when the starting point is the recognition of a structural, numerical, or interstitial chromosomal aberration. This dataset also provides a current view of the quality and coherence of the available genome sequence and insight into the nucleotide and three-dimensional structures that manifest as Giemsa light and dark chromosomal banding patterns.

  9. Nucleotide binding database NBDB – a collection of sequence motifs with specific protein-ligand interactions

    PubMed Central

    Zheng, Zejun; Goncearenco, Alexander; Berezovsky, Igor N.

    2016-01-01

    NBDB database describes protein motifs, elementary functional loops (EFLs) that are involved in binding of nucleotide-containing ligands and other biologically relevant cofactors/coenzymes, including ATP, AMP, ATP, GMP, GDP, GTP, CTP, PAP, PPS, FMN, FAD(H), NAD(H), NADP, cAMP, cGMP, c-di-AMP and c-di-GMP, ThPP, THD, F-420, ACO, CoA, PLP and SAM. The database is freely available online at http://nbdb.bii.a-star.edu.sg. In total, NBDB contains data on 249 motifs that work in interactions with 24 ligands. Sequence profiles of EFL motifs were derived de novo from nonredundant Uniprot proteome sequences. Conserved amino acid residues in the profiles interact specifically with distinct chemical parts of nucleotide-containing ligands, such as nitrogenous bases, phosphate groups, ribose, nicotinamide, and flavin moieties. Each EFL profile in the database is characterized by a pattern of corresponding ligand–protein interactions found in crystallized ligand–protein complexes. NBDB database helps to explore the determinants of nucleotide and cofactor binding in different protein folds and families. NBDB can also detect fragments that match to profiles of particular EFLs in the protein sequence provided by user. Comprehensive information on sequence, structures, and interactions of EFLs with ligands provides a foundation for experimental and computational efforts on design of required protein functions. PMID:26507856

  10. Quadfinder: server for identification and analysis of quadruplex-forming motifs in nucleotide sequences

    PubMed Central

    Scaria, Vinod; Hariharan, Manoj; Arora, Amit; Maiti, Souvik

    2006-01-01

    G-quadruplex secondary structures, which play a structural role in repetitive DNA such as telomeres, may also play a functional role at other genomic locations as targetable regulatory elements which control gene expression. The recent interest in application of quadruplexes in biological systems prompted us to develop a tool for the identification and analysis of quadruplex-forming nucleotide sequences, especially in RNA. Here we present Quadfinder, an online server for prediction and bioinformatics of uni-molecular quadruplex-forming nucleotide sequences. The server is designed to be user-friendly and needs minimal intervention by the user, while providing flexibility of defining the variants of the motif. The server is freely available online. PMID:16845097
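
    A uni-molecular quadruplex-forming motif is often written as four runs of guanines separated by three short loops. The sketch below searches for that general form with a regular expression. The default run and loop lengths are commonly used values chosen here as assumptions, not Quadfinder's documented settings, and the function name is hypothetical.

```python
import re


def find_quadruplex_motifs(sequence, min_g=3, loop_min=1, loop_max=7):
    """Return (start, matched_text) for putative quadruplex-forming motifs.

    The motif is four G-runs of at least `min_g` bases separated by
    three loops of `loop_min` to `loop_max` bases (DNA or RNA letters).
    """
    g_run = "G{%d,}" % min_g
    loop = "[ACGTU]{%d,%d}" % (loop_min, loop_max)
    pattern = re.compile(g_run + (loop + g_run) * 3)
    return [(m.start(), m.group()) for m in pattern.finditer(sequence.upper())]


# Four telomere-like TTAGGG repeats contain one such motif.
motifs = find_quadruplex_motifs("TTAGGGTTAGGGTTAGGGTTAGGGTT")
# motifs == [(3, "GGGTTAGGGTTAGGGTTAGGG")]
```

    Exposing the run and loop lengths as parameters mirrors the abstract's point that users can define variants of the motif; only non-overlapping matches on the given strand are reported.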

  11. Nucleotide sequence and replication properties of the Bacillus borstelensis cryptic plasmid pHT926.

    PubMed Central

    Ebisu, S; Murahashi, Y; Takagi, H; Kadowaki, K; Yamaguchi, K; Yamagata, H; Udaka, S

    1995-01-01

    The nucleotide sequence of pHT926, a cryptic plasmid found in Bacillus borstelensis HP926, was determined. pHT926 replicates by a rolling-circle mechanism and belongs to the pC194 plasmid family. The copy number of pHT926 was fourfold higher than that of pUB110 and very stably maintained in Bacillus choshinensis. PMID:7487045

  12. Hybridization-based antibody cDNA recovery for the production of recombinant antibodies identified by repertoire sequencing.

    PubMed

    Valdés-Alemán, Javier; Téllez-Sosa, Juan; Ovilla-Muñoz, Marbella; Godoy-Lozano, Elizabeth; Velázquez-Ramírez, Daniel; Valdovinos-Torres, Humberto; Gómez-Barreto, Rosa E; Martinez-Barnetche, Jesús

    2014-01-01

    High-throughput sequencing of the antibody repertoire is enabling a thorough analysis of B cell diversity and clonal selection, which may improve the novel antibody discovery process. Theoretically, an adequate bioinformatic analysis could allow identification of candidate antigen-specific antibodies, requiring their recombinant production for experimental validation of their specificity. Gene synthesis is commonly used for the generation of recombinant antibodies identified in silico. Novel strategies that bypass gene synthesis could offer more accessible antibody identification and validation alternatives. We developed a hybridization-based recovery strategy that targets the complementarity-determining region 3 (CDRH3) for the enrichment of cDNA of candidate antigen-specific antibody sequences. Ten clonal groups of interest were identified through bioinformatic analysis of the heavy chain antibody repertoire of mice immunized with hen egg white lysozyme (HEL). cDNA from eight of the targeted clonal groups was recovered efficiently, leading to the generation of recombinant antibodies. One representative heavy chain sequence from each clonal group recovered was paired with previously reported anti-HEL light chains to generate full antibodies, later tested for HEL-binding capacity. The recovery process proposed represents a simple and scalable molecular strategy that could enhance antibody identification and specificity assessment, enabling a more cost-efficient generation of recombinant antibodies.

  13. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... readable form may be created by any means, such as word processors, nucleotide/amino acid sequence...

  14. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... readable form may be created by any means, such as word processors, nucleotide/amino acid sequence...

  15. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... readable form may be created by any means, such as word processors, nucleotide/amino acid sequence...

  16. The complete nucleotide sequence of the mitochondrial genome of Phthonandria atrilineata (Lepidoptera: Geometridae).

    PubMed

    Yang, Ling; Wei, Zhao-Jun; Hong, Gui-Yun; Jiang, Shao-Tong; Wen, Long-Ping

    2009-07-01

    Using long-polymerase chain reaction (Long-PCR) method, we determined the complete nucleotide sequence of the mitochondrial genome (mitogenome) of Phthonandria atrilineata. The complete mtDNA from P. atrilineata was 15,499 base pairs in length and contained 13 protein-coding genes (PCGs), 2 rRNA genes, 22 tRNA genes, and a control region. The P. atrilineata genes were in the same order and orientation as the completely sequenced mitogenomes of other lepidopteran species. The nucleotide composition of P. atrilineata mitogenome was biased toward A + T nucleotides (81.02%), and the 13 PCGs show different A + T contents that range from 73.25% (cox1) to 92.12% (atp8). Phthonandria had the canonical set of 22 tRNA genes, that fold in the typical cloverleaf structure described for metazoan mt tRNAs, with the unique exception of trnS(AGN). The phylogenetic relationships were reconstructed with the concatenated sequences of the 13 PCGs of the mitochondrial genome, which confirmed that P. atrilineata is most closely related to the superfamily Bombycoidea.
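
    Nucleotide composition figures such as the A + T percentages quoted above are simple base counts. The following minimal sketch computes the A + T content of a sequence; the function name and the toy input are assumptions, and no sequence data from the paper is reproduced.

```python
def at_content(sequence):
    """Return the A + T content of a DNA sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted, so gaps and
    ambiguity codes do not affect the result.
    """
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        return 0.0
    at = sum(1 for base in counted if base in "AT")
    return 100.0 * at / len(counted)


# Toy example: 4 of 6 bases are A or T.
toy_percent = at_content("AATTGC")
```

    Running the same function on a whole mitogenome and on each protein-coding gene separately would give the kind of genome-wide and per-gene values the abstract reports.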

  17. Nucleotide sequence variation of the envelope protein gene identifies two distinct genotypes of yellow fever virus.

    PubMed Central

    Chang, G J; Cropp, B C; Kinney, R M; Trent, D W; Gubler, D J

    1995-01-01

    The evolution of yellow fever virus over 67 years was investigated by comparing the nucleotide sequences of the envelope (E) protein genes of 20 viruses isolated in Africa, the Caribbean, and South America. Uniformly weighted parsimony algorithm analysis defined two major evolutionary yellow fever virus lineages designated E genotypes I and II. E genotype I contained viruses isolated from East and Central Africa. E genotype II viruses were divided into two sublineages: IIA viruses from West Africa and IIB viruses from America, except for a 1979 virus isolated from Trinidad (TRINID79A). Unique signature patterns were identified at 111 nucleotide and 12 amino acid positions within the yellow fever virus E gene by signature pattern analysis. Yellow fever viruses from East and Central Africa contained unique signatures at 60 nucleotide and five amino acid positions, those from West Africa contained unique signatures at 25 nucleotide and two amino acid positions, and viruses from America contained such signatures at 30 nucleotide and five amino acid positions in the E gene. The dissemination of yellow fever viruses from Africa to the Americas is supported by the close genetic relatedness of genotype IIA and IIB viruses and genetic evidence of a possible second introduction of yellow fever virus from West Africa, as illustrated by the TRINID79A virus isolate. The E protein genes of American IIB yellow fever viruses had higher frequencies of amino acid substitutions than did genes of yellow fever viruses of genotypes I and IIA on the basis of comparisons with a consensus amino acid sequence for the yellow fever E gene. The great variation in the E proteins of American yellow fever virus probably results from positive selection imposed by virus interaction with different species of mosquitoes or nonhuman primates in the Americas. PMID:7637022

  18. Nucleotide sequencing and characterization of the genes encoding benzene oxidation enzymes of Pseudomonas putida.

    PubMed Central

    Irie, S; Doi, S; Yorifuji, T; Takagi, M; Yano, K

    1987-01-01

    The nucleotide sequence of the genes from Pseudomonas putida encoding oxidation of benzene to catechol was determined. Five open reading frames were found in the sequence. Four corresponding protein molecules were detected by a DNA-directed in vitro translation system. Escherichia coli cells containing the fragment with the four open reading frames transformed benzene to cis-benzene glycol, which is an intermediate of the oxidation of benzene to catechol. The relation between the product of each cistron and the components of the benzene oxidation enzyme system is discussed. Images PMID:3667527

  19. Remote access to ACNUC nucleotide and protein sequence databases at PBIL.

    PubMed

    Gouy, Manolo; Delmotte, Stéphane

    2008-04-01

    The ACNUC biological sequence database system provides powerful and fast query and extraction capabilities to a variety of nucleotide and protein sequence databases. The collection of ACNUC databases served by the Pôle Bio-Informatique Lyonnais includes the EMBL, GenBank, RefSeq and UniProt nucleotide and protein sequence databases and a series of other sequence databases that support comparative genomics analyses: HOVERGEN and HOGENOM containing families of homologous protein-coding genes from vertebrate and prokaryotic genomes, respectively; Ensembl and Genome Reviews for analyses of prokaryotic and of selected eukaryotic genomes. This report describes the main features of the ACNUC system and the access to ACNUC databases from any internet-connected computer. Such access was made possible by the definition of a remote ACNUC access protocol and the implementation of Application Programming Interfaces between the C, Python and R languages and this communication protocol. Two retrieval programs for ACNUC databases, Query_win, with a graphical user interface and raa_query, with a command line interface, are also described. Altogether, these bioinformatics tools provide users with either ready-to-use means of querying remote sequence databases through a variety of selection criteria, or a simple way to endow application programs with an extensive access to these databases. Remote access to ACNUC databases is open to all and fully documented (http://pbil.univ-lyon1.fr/databases/acnuc/acnuc.html).

  20. Analysis of a cloned colicin Ib gene: complete nucleotide sequence and implications for regulation of expression.

    PubMed Central

    Varley, J M; Boulnois, G J

    1984-01-01

    The complete nucleotide sequence of a 2,971 base pair EcoRI fragment carrying the structural gene for colicin Ib has been determined. The length of the gene is 1,881 nucleotides which is predicted to produce a protein of 626 amino acids and of molecular weight 71,364. The structural gene is flanked by likely promoter and terminator signals and in between the promoter and the ribosome binding site is an inverted repeat sequence which resembles other sequences known to bind the LexA protein. Further analysis of the 5' flanking sequences revealed a second region which may act either as a second LexA binding site and/or in the binding of cyclic AMP receptor protein. Comparison of the predicted amino acid sequence of colicin Ib with that of colicins A and E1 reveals localised homology. The implications of these similarities in the proteins and of regulation of the colicin Ib structural gene are discussed. Images PMID:6091036
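
    The gene length and protein length quoted above agree with simple codon arithmetic if the 1,881-nucleotide figure is assumed to include the stop codon: 1,881 / 3 = 627 codons, one of which terminates translation, leaving 626 amino acids. The helper below is a hypothetical illustration of that check.

```python
def protein_length(orf_nt, includes_stop=True):
    """Number of amino acids encoded by an open reading frame.

    `orf_nt` is the ORF length in nucleotides. When `includes_stop`
    is True, the final codon is treated as a stop codon and is not
    counted as an amino acid.
    """
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


# Assumption: the 1,881-nt gene length includes the stop codon.
colicin_ib_aa = protein_length(1881)
# colicin_ib_aa == 626
```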

  1. Nucleotide sequences of immunoglobulin epsilon genes of chimpanzee and orangutan: DNA molecular clock and hominoid evolution

    SciTech Connect

    Sakoyama, Y.; Hong, K.J.; Byun, S.M.; Hisajima, H.; Ueda, S.; Yaoita, Y.; Hayashida, H.; Miyata, T.; Honjo, T.

    1987-02-01

    To determine the phylogenetic relationships among hominoids and the dates of their divergence, the complete nucleotide sequences of the constant region of the immunoglobulin epsilon-chain (Cε1) genes from chimpanzee and orangutan have been determined. These sequences were compared with the human epsilon-chain constant-region sequence. A molecular clock (silent molecular clock), measured by the degree of sequence divergence at the synonymous (silent) positions of protein-encoding regions, was introduced for the present study. From the comparison of nucleotide sequences of α1-antitrypsin and β- and δ-globin genes between humans and Old World monkeys, the silent molecular clock was calibrated: the mean evolutionary rate of silent substitution was determined to be 1.56 × 10^-9 substitutions per site per year. Using the silent molecular clock, the mean divergence dates of chimpanzee and orangutan from the human lineage were estimated as 6.4 ± 2.6 million years and 17.3 ± 4.5 million years, respectively. It was also shown that the evolutionary rate of primate genes is considerably slower than those of other mammalian genes.
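
    A calibrated molecular clock converts sequence divergence into time. The sketch below assumes the standard two-lineage relation t = K / (2r), where K is the number of silent substitutions per site between two species and r is the rate per site per year per lineage. The abstract does not state the exact formula or the K values used, so the K in the example is a hypothetical input chosen only for illustration.

```python
def divergence_time(k_silent, rate_per_site_per_year):
    """Estimate divergence time in years from silent-site divergence.

    Assumes substitutions accumulate independently along both lineages
    since the split, hence the factor of 2 in the denominator.
    """
    return k_silent / (2.0 * rate_per_site_per_year)


RATE = 1.56e-9          # silent substitution rate quoted in the abstract
HYPOTHETICAL_K = 0.02   # made-up divergence value, not from the paper

years = divergence_time(HYPOTHETICAL_K, RATE)
million_years = years / 1e6  # roughly 6.4 for these inputs
```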

  2. Large-scale detection and application of expressed sequence tag single nucleotide polymorphisms in Nicotiana.

    PubMed

    Wang, Y; Zhou, D; Wang, S; Yang, L

    2015-07-14

    Single nucleotide polymorphisms (SNPs) are widespread in the Nicotiana genome. Using an alignment and variation detection method, we developed 20,607,973 SNPs, based on the expressed sequence tag sequences of 10 Nicotiana species. The replacement rate was much higher than the transversion rate in the SNPs, and SNPs widely exist in the Nicotiana. In vitro verification indicated that all of the SNPs were high quality and accurate. Evolutionary relationships between 15 varieties were investigated by polymerase chain reaction with a special primer; the specific 302 locus of these sequence results clearly indicated the origin of Zhongyan 100. A database of Nicotiana SNPs (NSNP) was developed to store and search for SNPs in Nicotiana. NSNP is a tool for researchers to develop SNP markers of sequence data.

  3. Complete nucleotide sequence of a circular plasmid from the Lyme disease spirochete, Borrelia burgdorferi.

    PubMed Central

    Dunn, J J; Buchstein, S R; Butler, L L; Fisenne, S; Polin, D S; Lade, B N; Luft, B J

    1994-01-01

    We have determined the complete nucleotide sequence of a small circular plasmid from the spirochete Borrelia burgdorferi Ip21, the agent of Lyme disease. The plasmid (cp8.3/Ip21) is 8,303 bp long, has a 76.6% A+T content, and is unstable upon passage of cells in vitro. An analysis of the sequence revealed the presence of two nearly perfect copies of a 184-bp inverted repeat sequence separated by 2,675 bp containing three closely spaced, but nonoverlapping, open reading frames (ORFs). Each inverted repeat ends in sequences that may function as signals for the initiation of transcription and translation of flanking plasmid sequences. A unique oligonucleotide probe based on the repeated sequence showed that the DNA between the repeats is present predominantly in a single orientation. Additional copies of the repeat were not detected elsewhere in the Ip21 genome. An analysis for potential ORFs indicates that the plasmid has nine highly probable protein-coding ORFs and one that is less probable; together, they occupy almost 71% of the nucleotide sequence. Analysis of the deduced amino acid sequences of the ORFs revealed one (ORF-9) with features in common with Borrelia lipoproteins and another (ORF-2) having limited homology with a replication protein, RepC, from a gram-positive plasmid that replicates by a rolling circle (RC) mechanism. Known collectively as RC plasmids, such plasmids require a double-stranded origin at which the Rep protein nicks the DNA to generate a single-stranded replication intermediate. cp8.3/Ip21 has three copies of the heptameric motif characteristically found at a nick site of most RC plasmids. These observations suggest that cp8.3/Ip21 may replicate by an RC mechanism. Images PMID:8169221
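
    Two of the sequence properties quoted above, A+T content and inverted repeats, can be computed with a few lines of code. The sketch below is a generic illustration, not the authors' analysis: the helper names are hypothetical, the toy sequences are invented, and the repeat check requires an exact reverse-complement match, whereas the abstract describes "nearly perfect" copies.

```python
# Minimal sketch: A+T content and an exact inverted-repeat check.
# ASSUMPTION: sequences contain only A, C, G, T; names are illustrative.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def at_content(seq):
    """Percentage of A and T bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(base in "AT" for base in seq) / len(seq)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_inverted_repeat(left, right):
    """True if `right` is the exact reverse complement of `left`."""
    return reverse_complement(left) == right.upper()


if __name__ == "__main__":
    toy = "AATTGC"  # invented example, not plasmid sequence
    print(f"A+T content: {at_content(toy):.1f}%")
    print(is_inverted_repeat("AAGC", "GCTT"))
```

    Detecting near-perfect repeats such as those reported for cp8.3/Ip21 would need a mismatch-tolerant comparison, which is omitted here.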

  4. The mouse collagen X gene: complete nucleotide sequence, exon structure and expression pattern.

    PubMed Central

    Elima, K; Eerola, I; Rosati, R; Metsäranta, M; Garofalo, S; Perälä, M; De Crombrugghe, B; Vuorio, E

    1993-01-01

    Overlapping genomic clones covering the 7.2 kb mouse alpha 1(X) collagen gene, 0.86 kb of promoter and 1.25 kb of 3'-flanking sequences were isolated from two genomic libraries and characterized by nucleotide sequencing. Typical features of the gene include a unique three-exon structure, similar to that in the chick gene, with the entire triple-helical domain of 463 amino acids coded by a single large exon. The highest degree of amino acid and nucleotide sequence conservation was seen in the coding region for the collagenous and C-terminal non-collagenous domains between the mouse and known chick, bovine and human collagen type X sequences. More divergence between the sequences occurred in the N-terminal non-collagenous domain. Similarity between the mammalian collagen X sequences extended into the 3'-untranslated sequence, particularly near the polyadenylation site. The promoter of the mouse collagen X gene was found to contain two TATAA boxes 159 bp apart; primer extension analyses of the transcription start site revealed that both were functional. The promoter has an unusual structure with a very low G + C content of 28% between positions -220 and -1 of the upstream transcription start site. Northern and in situ hybridization analyses confirmed that the expression of the alpha 1(X) collagen gene is restricted to hypertrophic chondrocytes in tissues undergoing endochondral calcification. The detailed sequence information of the gene is useful for studies on the promoter activity of the gene and for generation of transgenic mice. PMID:8424763

  5. DNA sequence-based "bar codes" for tracking the origins of expressed sequence tags from a maize cDNA library constructed using multiple mRNA sources.

    PubMed

    Qiu, Fang; Guo, Ling; Wen, Tsui-Jung; Liu, Feng; Ashlock, Daniel A; Schnable, Patrick S

    2003-10-01

    To enhance gene discovery, expressed sequence tag (EST) projects often make use of cDNA libraries produced using diverse mixtures of mRNAs. As such, expression data are lost because the origins of the resulting ESTs cannot be determined. Alternatively, multiple libraries can be prepared, each from a more restricted source of mRNAs. Although this approach allows the origins of ESTs to be determined, it requires the production of multiple libraries. A hybrid approach is reported here. A cDNA library was prepared using 21 different pools of maize (Zea mays) mRNAs. DNA sequence "bar codes" were added during first-strand cDNA synthesis to uniquely identify the mRNA source pool from which individual cDNAs were derived. Using a decoding algorithm that included error correction, it was possible to identify the source mRNA pool of more than 97% of the ESTs. The frequency at which a bar code is represented in an EST contig should be proportional to the abundance of the corresponding mRNA in the source pool. Consistent with this, all ESTs derived from several genes (zein and adh1) that are known to be exclusively expressed in kernels or preferentially expressed under anaerobic conditions, respectively, were exclusively tagged with bar codes associated with mRNA pools prepared from kernel and anaerobically treated seedlings, respectively. Hence, by allowing for the retention of expression data, the bar coding of cDNA libraries can enhance the value of EST projects.
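
    The decoding step described above can be illustrated with a small sketch. The abstract does not specify the bar code sequences or the error-correction scheme, so everything here is assumed for illustration: the three bar codes and pool names are invented, and error correction is modeled as nearest-match decoding by Hamming distance with at most one mismatch.

```python
# Minimal sketch of bar code decoding with simple error correction.
# ASSUMPTION: bar codes, pool names, and the one-mismatch rule are hypothetical.
from collections import Counter

BARCODES = {
    "kernel": "ACGTAC",
    "anaerobic_seedling": "TGCATG",
    "root": "GATCGA",
}


def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def decode_pool(tag, barcodes=BARCODES, max_mismatches=1):
    """Return the source pool for a tag, or None if it cannot be assigned."""
    ranked = sorted((hamming(tag, code), pool) for pool, code in barcodes.items())
    best_distance, best_pool = ranked[0]
    if best_distance > max_mismatches:
        return None  # too many errors to correct
    if len(ranked) > 1 and ranked[1][0] == best_distance:
        return None  # ambiguous: two pools are equally close
    return best_pool


def pool_counts(tags, barcodes=BARCODES):
    """Tally decoded pools; unassigned tags are counted under None."""
    return Counter(decode_pool(tag, barcodes) for tag in tags)


if __name__ == "__main__":
    print(pool_counts(["ACGTAC", "ACGTAA", "TGCATG", "NNNNNN"]))
```

    Tallying decoded pools per contig, as `pool_counts` does, mirrors the abstract's point that bar code frequency should track mRNA abundance in each source pool.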

  6. Nucleotide sequence of the internal transcribed spacers and 5.8S region of ribosomal DNA in Pinus pinea L.

    PubMed

    Marrocco, R; Gelati, M T; Maggini, F

    1996-01-01

    The nucleotide sequences of the first internal transcribed spacer (ITS1) from two different ribosomal RNA genes of Pinus pinea are reported. The two ITS1 regions can be distinguished by their length: one is 2631 bp long and the other 271 bp. Nucleotide comparison of these regions did not show appreciable sequence homology. The larger ITS1 contains five tandemly arranged subrepeats ranging in size from 219 bp to 237 bp. The nucleotide sequences of the 5.8S and ITS2 regions belonging to the larger ribosomal RNA gene are also reported.

  7. The human sorbitol dehydrogenase gene: cDNA cloning, sequence determination, and mapping by fluorescence in situ hybridization

    SciTech Connect

    Lee, F.K.; Chung, S.; Cheung, M.C.

    1994-05-15

    The cDNA for human sorbitol dehydrogenase (SORD) has been cloned and sequenced. It translates into a peptide of 356 amino acid residues, one more than the sequence previously reported from peptide analysis. An extra alanine was found at the acetyl-blocked N-terminal, between positions 1 and 4. This matches the rat cDNA, which also has 356 amino acids, with an extra proline at position 3. Four other mismatches were also observed, but these are all amino acid substitutions that occur outside proposed functionally important regions. Further work must be performed to determine whether these discrepancies represent polymorphic forms of the enzyme. The SORD gene was mapped by fluorescence in situ hybridization and found to occupy a single site on chromosome 15q15, indicating that it is a single-copy gene. This was confirmed by Southern blot hybridization. SORD is thought to be involved in the etiology of diabetic complications, and its deficiency has been linked to congenital cataracts. The cloned gene could be used as a probe to study the role of this enzyme in the pathogenesis of these diseases. 24 refs., 4 figs.

  8. Nucleotide sequence and analysis of the mgl operon of Escherichia coli K12.

    PubMed

    Hogg, R W; Voelker, C; Von Carlowitz, I

    1991-10-01

    The nucleotide sequence of the Escherichia coli K12 beta-methylgalactoside transport operon, mgl, was determined. Primer extension analysis indicated that the synthesis of mRNA initiates at guanine residue 145 of the determined sequence. The operon contains three open reading frames (ORFs). The operator proximal ORF, mglB, encodes the galactose binding protein, a periplasmic protein of 332 amino acids including the 23 residue amino-terminal signal peptide. Following a 62 nucleotide spacer, the second ORF, mglA, is capable of encoding a protein of 506 amino acids. The amino-terminal and carboxyl-terminal halves of this protein are homologous to each other and each half contains a putative nucleotide binding site. The third ORF, mglC, is capable of encoding a hydrophobic protein of 336 amino acids which is thought to generate the transmembrane pore. The overall organization of the mglBAC operon and its potential to encode three proteins is similar to that of the araFGH high affinity transport operon, located approximately 1 min away on the E. coli K12 chromosome.

  9. Targeted rapid amplification of cDNA ends (T-RACE)--an improved RACE reaction through degradation of non-target sequences.

    PubMed

    Bower, Neil I; Johnston, Ian A

    2010-11-01

    Amplification of the 5' ends of cDNA, although simple in theory, can often be difficult to achieve. We describe a novel method for the specific amplification of cDNA ends. An oligo-dT adapter incorporating a dUTP-containing PCR primer primes first-strand cDNA synthesis incorporating dUTP. Using the Cap finder approach, another distinct dUTP-containing adapter is added to the 3' end of the newly synthesized cDNA. Second-strand synthesis incorporating dUTP is achieved by PCR, using dUTP-containing primers complementary to the adapter sequences incorporated in the cDNA ends. The double-stranded cDNA-containing dUTP serves as a universal template for the specific amplification of the 3' or 5' end of any gene. To amplify the ends of cDNA, asymmetric PCR is performed using a single gene-specific primer and standard dNTPs. The asymmetric PCR product is purified and non-target transcripts containing dUTP degraded by Uracil DNA glycosylase, leaving only those transcripts produced during the asymmetric PCR. Subsequent PCR using a nested gene-specific primer and the 3' or 5' T-RACE primer results in specific amplification of cDNA ends. This method can be used to specifically amplify the 3' and 5' ends of numerous cDNAs from a single cDNA synthesis reaction.

  10. Mouse mammary tumor virus-like nucleotide sequences in canine and feline mammary tumors.

    PubMed

    Hsu, Wei-Li; Lin, Hsing-Yi; Chiou, Shyan-Song; Chang, Chao-Chin; Wang, Szu-Pong; Lin, Kuan-Hsun; Chulakasian, Songkhla; Wong, Min-Liang; Chang, Shih-Chieh

    2010-12-01

    Mouse mammary tumor virus (MMTV) has been speculated to be involved in human breast cancer. Companion animals, dogs, and cats with intimate human contacts may contribute to the transmission of MMTV between mouse and human. The aim of this study was to detect MMTV-like nucleotide sequences in canine and feline mammary tumors by nested PCR. Results showed that the presence of MMTV-like env and LTR sequences in canine malignant mammary tumors was 3.49% (3/86) and 18.60% (16/86), respectively. For feline malignant mammary tumors, the presence of both env and LTR sequences was found to be 22.22% (2/9). Nevertheless, the MMTV-like LTR and env sequences also were detected in normal mammary glands of dogs and cats. In comparisons of the MMTV-like DNA sequences of our findings to those of NIH 3T3 (MMTV-positive murine cell line) and human breast cancer cells, the sequence similarities ranged from 94 to 98%. Phylogenetic analysis revealed that intermixing among sequences identified from tissues of different hosts, i.e., mouse, dog, cat, and human, indicated the MMTV-like DNA existing in these hosts. Moreover, the env transcript was detected in 1 of the 19 MMTV-positive samples by reverse transcription-PCR. Taken together, our study provides evidence for the existence and expression of MMTV-like sequences in neoplastic and normal mammary glands of dogs and cats.

  11. Mouse Mammary Tumor Virus-Like Nucleotide Sequences in Canine and Feline Mammary Tumors

    PubMed Central

    Hsu, Wei-Li; Lin, Hsing-Yi; Chiou, Shyan-Song; Chang, Chao-Chin; Wang, Szu-Pong; Lin, Kuan-Hsun; Chulakasian, Songkhla; Wong, Min-Liang; Chang, Shih-Chieh

    2010-01-01

    Mouse mammary tumor virus (MMTV) has been speculated to be involved in human breast cancer. Companion animals, dogs, and cats with intimate human contacts may contribute to the transmission of MMTV between mouse and human. The aim of this study was to detect MMTV-like nucleotide sequences in canine and feline mammary tumors by nested PCR. Results showed that the presence of MMTV-like env and LTR sequences in canine malignant mammary tumors was 3.49% (3/86) and 18.60% (16/86), respectively. For feline malignant mammary tumors, the presence of both env and LTR sequences was found to be 22.22% (2/9). Nevertheless, the MMTV-like LTR and env sequences also were detected in normal mammary glands of dogs and cats. In comparisons of the MMTV-like DNA sequences of our findings to those of NIH 3T3 (MMTV-positive murine cell line) and human breast cancer cells, the sequence similarities ranged from 94 to 98%. Phylogenetic analysis revealed that intermixing among sequences identified from tissues of different hosts, i.e., mouse, dog, cat, and human, indicated the MMTV-like DNA existing in these hosts. Moreover, the env transcript was detected in 1 of the 19 MMTV-positive samples by reverse transcription-PCR. Taken together, our study provides evidence for the existence and expression of MMTV-like sequences in neoplastic and normal mammary glands of dogs and cats. PMID:20881168

  12. Nucleotide sequence of the cell wall proteinase gene of Streptococcus cremoris Wg2.

    PubMed Central

    Kok, J; Leenhouts, K J; Haandrikman, A J; Ledeboer, A M; Venema, G

    1988-01-01

    A 6.5-kilobase HindIII fragment that specifies the proteolytic activity of Streptococcus cremoris Wg2 was sequenced entirely. The nucleotide sequence revealed two open reading frames (ORFs), a small ORF1 with 295 codons and a large ORF2 containing 1,772 codons. For both ORFs, there was no stop codon on the HindIII fragment. A partially overlapping PstI fragment was used to locate the translation stop of the large ORF2. The entire ORF2 contained 1,902 coding triplets, followed by an apparently rho-independent terminator sequence. The inferred amino acid sequence would result in a protein of 200 kilodaltons. Both ORFs have their putative transcription and translation signals in a 345-base-pair ClaI fragment. ORF2 is preceded by a promoter region containing a 15-base-pair complementary direct repeat. Both the truncated 33- and the 200-kilodalton proteins have a signal peptide-like N-terminal amino acid sequence. The protein specified by ORF2 contained regions of extensive homology with serine proteases of the subtilisin family. Specifically, amino acid sequences involved in the formation of the active site (viz., Asp-32, His-64, and Ser-221 of the subtilisins) are well conserved in the S. cremoris Wg2 proteinase. The homologous sequences are separated by nonhomologous regions which contain several inserts, most notably a sequence of approximately 200 amino acids between the His and Ser residues of the active site. PMID:3278687

  13. Total chemical synthesis of a 77-nucleotide-long RNA sequence having methionine-acceptance activity.

    PubMed Central

    Ogilvie, K K; Usman, N; Nicoghosian, K; Cedergren, R J

    1988-01-01

    Chemical synthesis is described of a 77-nucleotide-long RNA molecule that has the sequence of an Escherichia coli Ado-47-containing tRNA(fMet) species in which the modified nucleosides have been substituted by their unmodified parent nucleosides. The sequence was assembled on a solid-phase, controlled-pore glass support in a stepwise manner with an automated DNA synthesizer. The ribonucleotide building blocks used were fully protected 5'-monomethoxytrityl-2'-silyl-3'-N,N-diisopropylaminophosphoramidites. p-Nitro-phenylethyl groups were used to protect the O6 of guanine residues. The fully deprotected tRNA analogue was characterized by polyacrylamide gel electrophoresis (sizing), terminal nucleotide analysis, sequencing, and total enzyme degradation, all of which indicated that the sequence was correct and contained only 3'-5' linkages. The 77-mer was then assayed for amino acid acceptor activity by using E. coli methionyl-tRNA synthetase. The results indicated that the synthetic product, lacking modified bases, is a substrate for the enzyme and has an amino acid acceptance 11% of that of the major native species, tRNA(fMet) containing 7-methylguanosine at position 47. Images PMID:3413059

  14. Mitochondrial DNA in the sea urchin Arbacia lixula: evolutionary inferences from nucleotide sequence analysis.

    PubMed

    De Giorgi, C; Lanave, C; Musci, M D; Saccone, C

    1991-07-01

    From the stirodont Arbacia lixula we determined the sequence of 5,127 nucleotides of mitochondrial DNA (mtDNA) encompassing 18 tRNAs, two complete coding genes, parts of three other coding genes, and part of the 12S ribosomal RNA (rRNA). The sequence confirms that the organization of mtDNA is conserved within echinoids. Furthermore, it underlines the following peculiar features of sea urchin mtDNA: the clustering of tRNAs, the short noncoding regulatory sequence, and the separation by the ND1 and ND2 genes of the two rRNA genes. Comparison with the orthologous sequences from the camarodont species Paracentrotus lividus and Strongylocentrotus purpuratus revealed that (1) echinoids have an extra piece on the amino terminus of the ND5 gene that is probably the remnant of an old leucine tRNA gene; (2) third-position codon nucleotide usage has diverged between A. lixula and the camarodont species to a significant extent, implying different directional mutational pressures; and (3) the stirodont-camarodont divergence occurred twice as long ago as did the P. lividus-S. purpuratus divergence.

  15. Cloning, nucleotide sequence, and expression of the Pasteurella haemolytica A1 glycoprotease gene.

    PubMed Central

    Abdullah, K M; Lo, R Y; Mellors, A

    1991-01-01

    Pasteurella haemolytica serotype A1 secretes a glycoprotease which is specific for O-sialoglycoproteins such as glycophorin A. The gene encoding the glycoprotease enzyme has been cloned in the recombinant plasmid pPH1, and its nucleotide sequence has been determined. The gene (designated gcp) codes for a protein of 35.2 kDa, and an active enzyme protein of this molecular mass can be observed in Escherichia coli clones carrying pPH1. In vivo labeling of plasmid-encoded proteins in E. coli maxicells demonstrated the expression of a 35-kDa protein from pPH1. The amino-terminal sequence of the heterologously expressed protein corresponds to that predicted from the nucleotide sequence. The glycoprotease is a neutral metalloprotease, and the predicted amino acid sequence of the glycoprotease contains a putative zinc-binding site. The gene shows no significant homology with the genes for other proteases of procaryotic or eucaryotic origin. However, there is substantial homology between gcp and an E. coli gene, orfX, whose product is believed to function in the regulation of macromolecule biosynthesis. Images PMID:1885539

  16. Sequence analysis of a rainbow trout cDNA library and creation of a gene index.

    PubMed

    Rexroad, C E; Lee, Y; Keele, J W; Karamycheva, S; Brown, G; Koop, B; Gahr, S A; Palti, Y; Quackenbush, J

    2003-01-01

    Expressed sequence tag (EST) projects have produced extremely valuable resources for identifying genes affecting phenotypes of interest. A large-scale EST sequencing project for rainbow trout was initiated to identify and functionally annotate as many unique transcripts as possible. Over 45,000 5' ESTs were obtained by sequencing clones from a single normalized library constructed using mRNA from six tissues. The production of this sequence data and creation of a rainbow trout Gene Index eliminating redundancy and providing annotation for these sequences will facilitate research in this species.

  17. The complete nucleotide sequence and genome organization of a novel carmovirus - Honeysuckle ringspot virus isolated from honeysuckle.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A virus associated with yellow to purple ringspot on honeysuckle plants has been detected and tentatively named as Honeysuckle ringspot virus (HnRSV). The complete nucleotide sequence of HnRSV has been determined from infected honeysuckle. The genomic RNA of HnRSV is 3,956 nucleotides in length and ...

  18. High-affinity L-arabinose transport operon. Nucleotide sequence and analysis of gene products.

    PubMed

    Scripture, J B; Voelker, C; Miller, S; O'Donnell, R T; Polgar, L; Rade, J; Horazdovsky, B F; Hogg, R W

    1987-09-05

    The nucleotide sequence of the "high-affinity" L-arabinose transport operon has been determined 3' from the regulatory region and found to contain three open reading frames designated araF, araG and araH. The first gene 3' to the regulatory region, araF, encodes the 23-residue signal peptide and the 306-residue mature form of the L-arabinose binding protein (33,200 Mr). The binding protein, which has been described elsewhere, is hydrophilic, soluble and found in the periplasm of Escherichia coli. This gene is followed by an intergenic space of 72 nucleotides, which contains a region of dyad symmetry 23 nucleotides long capable of forming an 11-member stem-loop. The second gene, designated araG, contains an open reading frame capable of encoding an equally hydrophilic protein containing 504 residues (55,000 Mr). Following a 14-nucleotide spacer, which does not appear to have any secondary structure, the third open reading frame, herein designated araH, is capable of encoding a hydrophobic protein containing 329 residues (34,000 Mr) that can only be envisioned as having an integral membrane location. 3' to araH there is a T-rich region containing a 24-nucleotide area of dyad symmetry centered 55 nucleotides from the termination codon. Analysis of the derived primary sequences of the araG and araH products indicates the nature and potential features of these components. The araG protein was found to possess internal homology between its amino and carboxyl-terminal halves, suggesting a common origin. The araG gene product has been shown to be homologous to the rbsA gene product, the hisP product, the ptsB product and the malK product, all of which presumably play similar roles in their respective transport systems. Putative ATP binding sites are observed within the regions of homology. The araH gene product has been shown to be homologous to the rbsC gene product, which is the first observed homology between two purported membrane proteins.

  19. Analysis of xylem formation in pine by cDNA sequencing

    NASA Technical Reports Server (NTRS)

    Allona, I.; Quinn, M.; Shoop, E.; Swope, K.; St Cyr, S.; Carlis, J.; Riedl, J.; Retzel, E.; Campbell, M. M.; Sederoff, R.; Whetten, R. W.; Davies, E. (Principal Investigator)

    1998-01-01

    Secondary xylem (wood) formation is likely to involve some genes expressed rarely or not at all in herbaceous plants. Moreover, environmental and developmental stimuli influence secondary xylem differentiation, producing morphological and chemical changes in wood. To increase our understanding of xylem formation, and to provide material for comparative analysis of gymnosperm and angiosperm sequences, ESTs were obtained from immature xylem of loblolly pine (Pinus taeda L.). A total of 1,097 single-pass sequences were obtained from 5' ends of cDNAs made from gravistimulated tissue from bent trees. Cluster analysis detected 107 groups of similar sequences, ranging in size from 2 to 20 sequences. A total of 361 sequences fell into these groups, whereas 736 sequences were unique. About 55% of the pine EST sequences show similarity to previously described sequences in public databases. About 10% of the recognized genes encode factors involved in cell wall formation. Sequences similar to cell wall proteins, most known lignin biosynthetic enzymes, and several enzymes of carbohydrate metabolism were found. A number of putative regulatory proteins also are represented. Expression patterns of several of these genes were studied in various tissues and organs of pine. Sequencing novel genes expressed during xylem formation will provide a powerful means of identifying mechanisms controlling this important differentiation pathway.

  20. Nucleotide sequence of the bean strain of southern bean mosaic virus.

    PubMed

    Othman, Y; Hull, R

    1995-01-10

    The genome of the bean strain of southern bean mosaic virus (SBMV-B) comprises 4109 nucleotides and thus is slightly shorter than those of the two other sequenced sobemoviruses (southern bean mosaic virus, cowpea strain (SBMV-C) and rice yellow mottle virus (RYMV)). SBMV-B has an overall sequence similarity with SBMV-C of 55% and with RYMV of 45%. Three potential open reading frames (ORFs) were recognized in SBMV-B which were in similar positions in the genomes of SBMV-C and RYMV. However, there was no analog of SBMV-C and RYMV ORF 3. From a comparison of the predicted sequences of the ORFs of these three sobemoviruses and of the noncoding regions, it is suggested that the two SBMV strains differ from one another as much as they do from RYMV and that they should be considered as different viruses.

  1. Nucleotide sequence of a satellite RNA associated with carrot motley dwarf in parsley and carrot.

    PubMed

    Menzel, Wulf; Maiss, Edgar; Vetten, H Josef

    2009-02-01

    Carrot motley dwarf (CMD) is known to result from a mixed infection by two viruses, the polerovirus Carrot red leaf virus and one of the umbraviruses Carrot mottle mimic virus or Carrot mottle virus. Some umbraviruses have been shown to be associated with small satellite (sat) RNAs, but none have been reported for the latter two. A CMD-affected parsley plant was used for sap transmission to test plants, which were then used for dsRNA isolation. The presence of a 0.8-kbp dsRNA indicated the occurrence of a hitherto unrecognized satRNA associated with CMD. The satRNAs of the CMD isolate from parsley and an isolate from carrot have been sequenced and showed 94% sequence identity. Nucleotide sequences and putative translation products had no significant similarities to GenBank entries. To our knowledge, this is the first report of satRNAs associated with CMD.

  2. Conservation of nucleotide sequences for molecular diagnosis of Middle East respiratory syndrome coronavirus, 2015.

    PubMed

    Furuse, Yuki; Okamoto, Michiko; Oshitani, Hitoshi

    2015-11-01

    Infection due to the Middle East respiratory syndrome coronavirus (MERS-CoV) is widespread. The present study was performed to assess the protocols used for the molecular diagnosis of MERS-CoV by analyzing the nucleotide sequences of viruses detected between 2012 and 2015, including sequences from the large outbreak in eastern Asia in 2015. Although the diagnostic protocols were established only 2 years ago, mismatches between the sequences of primers/probes and viruses were found for several of the assays. Such mismatches could lead to a lower sensitivity of the assay, thereby leading to false-negative diagnosis. A slight modification in the primer design is suggested. Protocols for the molecular diagnosis of viral infections should be reviewed regularly after they are established, particularly for viruses that pose a great threat to public health such as MERS-CoV.

  3. Single nucleotide polymorphisms from Theobroma cacao expressed sequence tags associated with witches' broom disease in cacao.

    PubMed

    Lima, L S; Gramacho, K P; Carels, N; Novais, R; Gaiotto, F A; Lopes, U V; Gesteira, A S; Zaidan, H A; Cascardo, J C M; Pires, J L; Micheli, F

    2009-07-14

    In order to increase the efficiency of cacao tree resistance to witches' broom disease, which is caused by Moniliophthora perniciosa (Tricholomataceae), we looked for molecular markers that could help in the selection of resistant cacao genotypes. Among the different markers useful for developing marker-assisted selection, single nucleotide polymorphisms (SNPs) constitute the most common type of sequence difference between alleles and can be easily detected by in silico analysis from expressed sequence tag libraries. We report the first detection and analysis of SNPs from cacao-M. perniciosa interaction expressed sequence tags, using bioinformatics. Selection based on analysis of these SNPs should be useful for developing cacao varieties resistant to this devastating disease.

  4. Nucleotide sequence of yeast GDH1 encoding nicotinamide adenine dinucleotide phosphate-dependent glutamate dehydrogenase.

    PubMed

    Moye, W S; Amuro, N; Rao, J K; Zalkin, H

    1985-07-15

    The yeast GDH1 gene encodes NADP-dependent glutamate dehydrogenase. This gene was isolated by complementation of an Escherichia coli glutamate auxotroph. NADP-dependent glutamate dehydrogenase was overproduced 6-10-fold in Saccharomyces cerevisiae bearing GDH1 on a multicopy plasmid. The nucleotide sequence of the 1362-base pair coding region and 5' and 3' flanking sequences were determined. Transcription start sites were located by S1 nuclease mapping. Regulation of GDH1 was not maintained when the gene was present on a multicopy plasmid. Protein secondary structure predictions identified a region with potential to form the dinucleotide-binding domain. The amino acid sequences of the yeast and Neurospora crassa enzymes are 63% conserved. Unlike the N. crassa gene, yeast GDH1 has no introns.

  5. Nucleotide-sequence-specific de novo methylation in a somatic murine cell line.

    PubMed Central

    Szyf, M; Schimmer, B P; Seidman, J G

    1989-01-01

    DNA fragments encoding the mouse steroid 21-hydroxylase (C21 or Cyp21A1) gene are de novo methylated when introduced into the mouse adrenocortical tumor cell line Y1 by DNA-mediated gene transfer. Although CCGG sequences within the C21 gene are de novo methylated, CCGG sites within flanking vector sequences, other mammalian gene sequences driven by the C21 promoter, and the neomycin-resistance gene, which was cotransfected with the C21 gene, do not become methylated. At least two separate signals for de novo methylation are encoded within the gene since three fragments derived from the C21 gene were methylated de novo. Specific de novo methylation of C21-derived sequences does not occur in L cells or Y1 kin8 cells; this suggests that the cellular factors needed for de novo methylation of the C21 gene are not ubiquitous. Most DNA sequences are not de novo methylated when introduced into somatic cells and DNA sequences other than the C21 gene are not de novo methylated when introduced into Y1 cells. Several groups have suggested that de novo methylation occurs in early embryonic cells and that somatic cells strictly maintain their methylation pattern by a semiconservative methyltransferase. Our results suggest that de novo methylation of specific nucleotide sequences can occur in some mammalian somatic cells. Images PMID:2789380

  6. Developing Single Nucleotide Polymorphism (SNP) markers from transcriptome sequences for the identification of longan (Dimocarpus longan) germplasm

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Longan (Dimocarpus longan Lour.) is an important tropical fruit tree crop. Accurate varietal identification is essential for germplasm management and breeding. Using longan transcriptome sequences from public databases, we developed single nucleotide polymorphism (SNP) markers; validated 60 SNPs in...

  7. Complete Nucleotide Sequence of an Australian Isolate of Turnip mosaic virus before and after Seven Years of Serial Passaging

    PubMed Central

    Pretorius, Lara; Moyle, Richard L.; Dalton-Morgan, Jessica; Hussein, Nasser

    2016-01-01

    The complete genome sequence of an Australian isolate of Turnip mosaic virus was determined by Sanger sequencing. After seven years of serial passaging by mechanical inoculation, the isolate was resequenced by RNA sequencing (RNA-Seq). Eighteen single nucleotide polymorphisms were identified between the isolates. Both isolates had 96% identity to isolate AUST10. PMID:27856582

  8. Nucleotide sequence and expression of the Enterobacter aerogenes alpha-acetolactate decarboxylase gene in brewer's yeast.

    PubMed Central

    Sone, H; Fujii, T; Kondo, K; Shimizu, F; Tanaka, J; Inoue, T

    1988-01-01

    The nucleotide sequence of a 1.4-kilobase DNA fragment containing the alpha-acetolactate decarboxylase gene of Enterobacter aerogenes was determined. The sequence contains an entire protein-coding region of 780 nucleotides which encodes an alpha-acetolactate decarboxylase of 260 amino acids. The DNA sequence coding for alpha-acetolactate decarboxylase was placed under the control of the alcohol dehydrogenase I promoter of the yeast Saccharomyces cerevisiae in a plasmid capable of autonomous replication in both S. cerevisiae and Escherichia coli. Brewer's yeast cells transformed by this plasmid showed alpha-acetolactate decarboxylase activity and were used in laboratory-scale fermentation experiments. These experiments revealed that the diacetyl concentration in wort fermented by the plasmid-containing yeast strain was significantly lower than that in wort fermented by the parental strain. These results indicated that the alpha-acetolactate decarboxylase activity produced by brewer's yeast cells degraded alpha-acetolactate and that this degradation caused a decrease in diacetyl production. PMID:3278689

  9. The nucleotide sequence of sacbrood virus of the honey bee: an insect picorna-like virus.

    PubMed

    Ghosh, R C; Ball, B V; Willcocks, M M; Carter, M J

    1999-06-01

    We have determined the nucleotide sequence of sacbrood virus (SBV), which causes a fatal infection of honey bee larvae. The genomic RNA of SBV is longer than that of typical mammalian picornaviruses (8832 nucleotides) and contains a single, large open reading frame (179-8752) encoding a polyprotein of 2858 amino acids. Sequence comparison with other virus polyproteins revealed regions of similarity to characterized helicase, protease and RNA-dependent RNA polymerase domains; structural genes were located at the 5' terminus with non-structural genes at the 3' end. Picornavirus-like agents of insects have two distinct genomic organizations; some resemble mammalian picornaviruses with structural genes at the 5' end and non-structural genes at the 3' end, and others resemble caliciviruses in which this order is reversed; SBV thus belongs to the former type. Sequence comparison suggested that SBV is distantly related to infectious flacherie virus (IFV) of the silk worm, which possesses an RNA of similar size and gene order.

  10. Complete nucleotide sequence and genome organization of a Cactus virus X strain from Hylocereus undatus (Cactaceae).

    PubMed

    Liou, M R; Chen, Y R; Liou, R F

    2004-05-01

    The complete nucleotide sequence of a strain of Cactus virus X (CVX-Hu) isolated from Hylocereus undatus (Cactaceae) has been determined. Excluding the poly(A) tail, the sequence is 6614 nucleotides in length and contains seven open reading frames (ORFs). The genome organization of CVX is similar to that of other potexviruses. ORF1 encodes the putative viral replicase with conserved methyltransferase, helicase, and polymerase motifs. Within ORF1, two other ORFs, which we call ORF6 and ORF7, were located separately in the +2 reading frame. ORF2, 3, and 4, which form the "triple gene block" characteristic of the potexviruses, encode proteins with molecular masses of 25, 12, and 7 kDa, respectively. ORF5 encodes the coat protein with an estimated molecular mass of 24 kDa. Sequence analysis indicated that proteins encoded by ORF1-5 display a certain degree of homology to the corresponding proteins of other potexviruses. The putative product of ORF6, however, shows no significant similarity to those of other potexviruses. Phylogenetic analyses based on the replicase (the methyltransferase, helicase, and polymerase domains) and coat protein demonstrated a closer relationship of CVX with Bamboo mosaic virus, Cassava common mosaic virus, Foxtail mosaic virus, Papaya mosaic virus, and Plantago asiatica mosaic virus.

  11. PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences.

    PubMed

    Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong; Warnow, Tandy

    2015-05-01

    We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate--slightly better than SATé trees, but with substantial improvements relative to other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory.

  12. Nucleotide sequence of the transforming gene of m1 murine sarcoma virus.

    PubMed Central

    Brow, M A; Sen, A; Sutcliffe, J G

    1984-01-01

    The v-mosm1 nucleotide sequence codes for a protein that is 376 amino acids long. Although the N-terminus is homologous with that of the v-mos124 protein, the C-terminus is substantially different from the C-termini of all other examined mos proteins, suggesting that this region is nonessential and perhaps cleaved. Overall, v-mosm1 has greater homology with c-mos than does v-mos124, but mutually exclusive differences between c-mos and each of the v-mos genes preclude linear descent and suggest a common ancestral murine sarcoma virus. PMID:6319757

  13. The Complete Nucleotide Sequence of the Mitochondrial Genome of Bactrocera minax (Diptera: Tephritidae)

    PubMed Central

    Zhang, Bin; Nardi, Francesco; Hull-Sanders, Helen; Wan, Xuanwu; Liu, Yinghong

    2014-01-01

    The complete 16,043 bp mitochondrial genome (mitogenome) of Bactrocera minax (Diptera: Tephritidae) has been sequenced. The genome encodes 37 genes usually found in insect mitogenomes. The mitogenome information for B. minax was compared to the homologous sequences of Bactrocera oleae, Bactrocera tryoni, Bactrocera philippinensis, Bactrocera carambolae, Bactrocera papayae, Bactrocera dorsalis, Bactrocera correcta, Bactrocera cucurbitae and Ceratitis capitata. The analysis indicated the structure and organization are typical of, and similar to, the nine closely related species mentioned above, although it contains the lowest genome-wide A+T content (67.3%). Four short intergenic spacers with a high degree of conservation among the nine tephritid species mentioned above and B. minax were observed, which also have clear counterparts in the control regions (CRs). Correlation analysis among these ten tephritid species revealed a close positive correlation between the A+T content of zero-fold degenerate sites (P0FD), the ratio of nucleotide substitution frequency at P0FD sites to all degenerate sites (zero-fold degenerate sites, two-fold degenerate sites and four-fold degenerate sites) and amino acid sequence distance (ASD). Further, significant positive correlation was observed between the A+T content of four-fold degenerate sites (P4FD) and the ratio of nucleotide substitution frequency at P4FD sites to all degenerate sites; however, we found significant negative correlation between ASD and the A+T content of P4FD, and the ratio of nucleotide substitution frequency at P4FD sites to all degenerate sites. A higher nucleotide substitution frequency at non-synonymous sites compared to synonymous sites was observed in nad4, the first time that has been observed in an insect mitogenome. A poly(T) stretch at the 5′ end of the CR followed by a [TA(A)]n-like stretch was also found. In addition, a highly conserved G+A-rich sequence block was observed in front of the

  14. The complete nucleotide sequence of the mitochondrial genome of Bactrocera minax (Diptera: Tephritidae).

    PubMed

    Zhang, Bin; Nardi, Francesco; Hull-Sanders, Helen; Wan, Xuanwu; Liu, Yinghong

    2014-01-01

    The complete 16,043 bp mitochondrial genome (mitogenome) of Bactrocera minax (Diptera: Tephritidae) has been sequenced. The genome encodes 37 genes usually found in insect mitogenomes. The mitogenome information for B. minax was compared to the homologous sequences of Bactrocera oleae, Bactrocera tryoni, Bactrocera philippinensis, Bactrocera carambolae, Bactrocera papayae, Bactrocera dorsalis, Bactrocera correcta, Bactrocera cucurbitae and Ceratitis capitata. The analysis indicated the structure and organization are typical of, and similar to, the nine closely related species mentioned above, although it contains the lowest genome-wide A+T content (67.3%). Four short intergenic spacers with a high degree of conservation among the nine tephritid species mentioned above and B. minax were observed, which also have clear counterparts in the control regions (CRs). Correlation analysis among these ten tephritid species revealed a close positive correlation between the A+T content of zero-fold degenerate sites (P0FD), the ratio of nucleotide substitution frequency at P0FD sites to all degenerate sites (zero-fold degenerate sites, two-fold degenerate sites and four-fold degenerate sites) and amino acid sequence distance (ASD). Further, significant positive correlation was observed between the A+T content of four-fold degenerate sites (P4FD) and the ratio of nucleotide substitution frequency at P4FD sites to all degenerate sites; however, we found significant negative correlation between ASD and the A+T content of P4FD, and the ratio of nucleotide substitution frequency at P4FD sites to all degenerate sites. A higher nucleotide substitution frequency at non-synonymous sites compared to synonymous sites was observed in nad4, the first time that has been observed in an insect mitogenome. A poly(T) stretch at the 5' end of the CR followed by a [TA(A)]n-like stretch was also found. In addition, a highly conserved G+A-rich sequence block was observed in front of the

  15. Within-Host Nucleotide Diversity of Virus Populations: Insights from Next-Generation Sequencing

    PubMed Central

    Nelson, Chase W.; Hughes, Austin L.

    2014-01-01

    Next-generation sequencing (NGS) technology offers new opportunities for understanding the evolution and dynamics of viral populations within individual hosts over the course of infection. We review simple methods for estimating synonymous and nonsynonymous nucleotide diversity in viral genes from NGS data without the need for inferring linkage. We discuss the potential usefulness of these data for addressing questions of both practical and theoretical interest, including fundamental questions regarding the effective population sizes of within-host viral populations and the modes of natural selection acting on them. PMID:25481279
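
    As a rough illustration of estimating nucleotide diversity from read counts without inferring linkage, the sketch below computes the mean pairwise difference at each site from per-site allele counts and averages over sites. It is a generic estimator and not necessarily the one the review describes; the function names and the toy pileup are hypothetical, and splitting the result into synonymous and nonsynonymous components (which needs codon context) is not shown.

```python
from collections import Counter


def site_diversity(counts):
    """Probability that two reads drawn without replacement differ at this site."""
    n = sum(counts.values())
    if n < 2:
        return 0.0  # diversity is undefined with fewer than two reads; report 0
    # Ordered pairs of reads that carry the same allele.
    same_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_pairs / (n * (n - 1))


def nucleotide_diversity(pileup):
    """Average the per-site diversity over all sites in the pileup."""
    if not pileup:
        return 0.0
    return sum(site_diversity(c) for c in pileup) / len(pileup)


# Toy pileup (hypothetical data): one Counter of base calls per site.
pileup = [Counter("AAAA"), Counter("AAGG"), Counter("CCCT")]
pi = nucleotide_diversity(pileup)
```

    Because each site is treated independently, no haplotype reconstruction is required, which is the property the abstract emphasizes.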

  16. Nanoparticle-Based Discrimination of Single-Nucleotide Polymorphism in Long DNA Sequences.

    PubMed

    Sanromán-Iglesias, María; Lawrie, Charles H; Liz-Marzán, Luis M; Grzelczak, Marek

    2017-03-01

    Circulating DNA (ctDNA), and specifically the detection of cancer-associated mutations in liquid biopsies, promises to revolutionize cancer detection. The main difficulty, however, is that ctDNA fragments of typical length (∼150 bases) can form secondary structures, potentially obscuring the mutated fragment from detection. We show that an assay based on gold nanoparticles (65 nm) stabilized with DNA (Au@DNA) can discriminate single nucleotide polymorphisms in clinically relevant ssDNA sequences (70-140 bases). The preincubation step was crucial to this process, allowing sequential bridging of Au@DNA, so that a single base mutation can be discriminated, down to 100 pM concentration.

  17. Complete nucleotide sequence of a virus associated with rusty mottle disease of sweet cherry (Prunus avium).

    PubMed

    Villamor, D V; Druffel, K L; Eastwell, K C

    2013-08-01

    Cherry rusty mottle is a disease of sweet cherries first described in 1940 in western North America. Because of the graft-transmissible nature of the disease, a viral nature of the disease was assumed. Here, the complete genomic nucleotide sequences of virus isolates from two trees expressing cherry rusty mottle disease symptoms are characterized; the virus is designated cherry rusty mottle associated virus (CRMaV). The biological and molecular characteristics of this virus in comparison to those of cherry necrotic rusty mottle virus (CNRMV) and cherry green ring mottle virus (CGRMV) are described. CRMaV was subsequently detected in additional sweet cherry trees expressing symptoms of cherry rusty mottle disease.

  18. Completion sequence and cloning of the infectious cDNA of a chb isolate of cucumber green mottle mosaic virus.

    PubMed

    Zhong, M; Zhao, X; Liu, Y; Wang, Y; Cao, K

    2015-03-01

    Cucumber green mottle mosaic virus (CGMMV) is an important and widespread seed-borne virus that infects Cucurbitaceous plants. It is a member of the genus Tobamovirus in the family Virgaviridae with a monopartite (+) ssRNA genome. Here we report the complete genome sequence, construction and testing of the infectious clones of a chb isolate of CGMMV. Full-length CGMMV cDNA was cloned into the vector pUC19. The linearized vector containing full-length cDNA was used as template for in vitro transcription, and the synthesized capped transcript was highly infectious in Chenopodium amaranticolor and cucumber (Cucumis sativus). Inoculated plants showed symptoms typical of CGMMV infection. The infectivity was confirmed by mechanical transmission to new plants, RT-PCR and western blot. Progeny virus derived from infectious transcripts had the same biological and biochemical properties as wild-type virus. To our knowledge, this is the first detailed report of a biologically active transcript from CGMMV.

  19. Optimizing nucleotide sequence ensembles for combinatorial protein libraries using a genetic algorithm.

    PubMed

    Craig, Roger A; Lu, Jin; Luo, Jinquan; Shi, Lei; Liao, Li

    2010-01-01

    Protein libraries are essential to the field of protein engineering. Increasingly, probabilistic protein design is being used to synthesize combinatorial protein libraries, which allow the protein engineer to explore a vast space of amino acid sequences, while at the same time placing restrictions on the amino acid distributions. To this end, if site-specific amino acid probabilities are input as the target, then the codon nucleotide distributions that match this target distribution can be used to generate a partially randomized gene library. However, it turns out to be a highly nontrivial computational task to find the codon nucleotide distributions that exactly match a given target distribution of amino acids. We first showed that for any given target distribution an exact solution may not exist at all. Formulating the task as a constrained optimization problem, we then developed a genetic algorithm-based approach to find codon nucleotide distributions that match the target amino acid distribution as closely as possible. As compared with the previous gradient descent method on various objective functions, the new method consistently gave more optimized distributions as measured by the relative entropy between the calculated and the target distributions. To simulate the actual lab solutions, new objective functions were designed to allow for two separate sets of codons in seeking a better match to the target amino acid distribution.
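
    To make the objective concrete, the sketch below maps per-position codon nucleotide distributions to the amino acid distribution they imply under the standard genetic code, then scores the result against a target with relative entropy. It covers only the evaluation step, not the genetic algorithm. The function names, the example distributions, and the direction of the divergence (target relative to calculated) are assumptions; the abstract does not specify them.

```python
import math
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def amino_acid_distribution(position_probs):
    """position_probs: three dicts (base -> probability), one per codon position."""
    dist = {}
    for codon, aa in CODON_TABLE.items():
        p = 1.0
        for base, probs in zip(codon, position_probs):
            p *= probs.get(base, 0.0)  # positions are mixed independently
        if p > 0:
            dist[aa] = dist.get(aa, 0.0) + p
    return dist


def relative_entropy(target, calculated):
    """D(target || calculated) in bits; infinite if a targeted residue is unreachable."""
    total = 0.0
    for aa, t in target.items():
        if t == 0:
            continue
        q = calculated.get(aa, 0.0)
        if q == 0:
            return math.inf
        total += t * math.log2(t / q)
    return total


# Hypothetical degenerate codon: G, then C or A in equal parts, then T.
calc = amino_acid_distribution([{"G": 1.0}, {"C": 0.5, "A": 0.5}, {"T": 1.0}])
score = relative_entropy({"A": 0.75, "D": 0.25}, calc)
```

    Here the codon mixes GCT (Ala) and GAT (Asp) equally, so a 3:1 target is missed and the score is positive. Since the three positions are mixed independently, many amino acid targets cannot be hit exactly, which is consistent with the abstract's remark that an exact solution may not exist.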

  20. A simple ABO genotyping by PCR using sequence-specific primers with mismatched nucleotides.

    PubMed

    Taki, Takashi; Kibayashi, Kazuhiko

    2014-05-01

    In forensics, the specific ABO blood group is often determined by analyzing the ABO gene. Among various methods used, PCR employing sequence-specific primers (PCR-SSP) is simpler than other methods for ABO typing. When performing the PCR-SSP, the pseudo-positive signals often lead to errors in ABO typing. We introduced mismatched nucleotides at the second and the third positions from the 3'-end of the primers for the PCR-SSP method and examined whether reliable typing could be achieved by suppressing pseudo-positive signals. Genomic DNA was extracted from nail clippings of 27 volunteers, and the ABO gene was examined with PCR-SSP employing primers with and without mismatched nucleotides. The ABO blood group of the nail clippings was also analyzed serologically, and these results were compared with those obtained using PCR-SSP. When mismatched primers were employed for amplification, the results of the ABO typing matched with those obtained by the serological method. When primers without mismatched nucleotides were used for PCR-SSP, pseudo-positive signals were observed. Thus our method may be used for achieving more reliable ABO typing.

  1. Mining for single nucleotide polymorphisms and insertions/deletions in maize expressed sequence tag data.

    PubMed

    Batley, Jacqueline; Barker, Gary; O'Sullivan, Helen; Edwards, Keith J; Edwards, David

    2003-05-01

    We have developed a computer based method to identify candidate single nucleotide polymorphisms (SNPs) and small insertions/deletions from expressed sequence tag data. Using a redundancy-based approach, valid SNPs are distinguished from erroneous sequence by their representation multiple times in an alignment of sequence reads. A second measure of validity was also calculated based on the cosegregation of the SNP pattern between multiple SNP loci in an alignment. The utility of this method was demonstrated by applying it to 102,551 maize (Zea mays) expressed sequence tag sequences. A total of 14,832 candidate polymorphisms were identified with an SNP redundancy score of two or greater. Segregation of these SNPs with haplotype indicates that candidate SNPs with high redundancy and cosegregation confidence scores are likely to represent true SNPs. This was confirmed by validation of 264 candidate SNPs from 27 loci, with a range of redundancy and cosegregation scores, in four inbred maize lines. The SNP transition/transversion ratio and insertion/deletion size frequencies correspond to those observed by direct sequencing methods of SNP discovery and suggest that the majority of predicted SNPs and insertion/deletions identified using this approach represent true genetic variation in maize.
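
    The redundancy idea can be sketched as follows: in a column of aligned reads, a difference is kept as a candidate SNP only if at least two alleles are each seen in a minimum number of reads, so a one-off sequencing error is discarded. This is a simplified reading of the abstract; the function name, the threshold parameter, and the toy reads are hypothetical, and the cosegregation score and indel handling are not shown.

```python
from collections import Counter


def candidate_snps(aligned_reads, min_redundancy=2):
    """Return (column, allele_counts) for columns with two or more well-supported alleles.

    aligned_reads: equal-length strings from one alignment; '-' and 'N' are ignored.
    """
    hits = []
    for col, column in enumerate(zip(*aligned_reads)):
        counts = Counter(b for b in column if b in "ACGT")
        # Keep only alleles seen at least min_redundancy times.
        supported = {b: n for b, n in counts.items() if n >= min_redundancy}
        if len(supported) >= 2:
            hits.append((col, supported))
    return hits


# Toy alignment (hypothetical): column 1 has C x3 and T x2; column 4 has a lone C.
reads = ["ACGTA", "ACGTA", "ATGTA", "ATGTC", "ACGTA"]
snps = candidate_snps(reads)
```

    In the toy alignment, column 1 is reported because both alleles recur, while the single C in column 4 is treated as a probable read error.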

  2. Complete nucleotide sequences of two begomoviruses infecting Madagascar periwinkle (Catharanthus roseus) from Pakistan.

    PubMed

    Ilyas, Muhammad; Nawaz, Kiran; Shafiq, Muhammad; Haider, Muhammad Saleem; Shahid, Ahmad Ali

    2013-02-01

    Though Catharanthus roseus (Madagascar periwinkle) is an ornamental plant, it is famous for its medicinal value. Its alkaloids are known for anti-cancerous properties, and this plant is studied mainly for its alkaloids. Here, this plant has been studied for its viral diseases. Complete DNA sequences of two begomoviruses infecting C. roseus originating from Pakistan were determined. The sequence of one begomovirus (clone KN4) shows the highest level of nucleotide sequence identity (86.5 %) to an unpublished virus, chili leaf curl India virus (ChiLCIV), and then (84.4 % identity) to papaya leaf curl virus (PaLCV), and thus represents a new species, for which the name "Catharanthus yellow mosaic virus" (CYMV) is proposed. The sequence of another begomovirus (clone KN6) shows the highest level of sequence identity (95.9 % to 99 %) to a newly reported virus from India, papaya leaf crumple virus (PaLCrV). Sequence analysis shows that KN4 and KN6 are recombinants of Pedilanthus leaf curl virus (PedLCV) and croton yellow vein mosaic virus (CrYVMV).

  3. Molecular cloning and sequencing of a cDNA encoding the thioesterase domain of the rat fatty acid synthetase.

    PubMed

    Naggert, J; Witkowski, A; Mikkelsen, J; Smith, S

    1988-01-25

    A cloned cDNA containing the entire coding sequence for the long-chain S-acyl fatty acid synthetase thioester hydrolase (thioesterase I) component as well as the 3'-noncoding region of the fatty acid synthetase has been isolated using an expression vector and domain-specific antibodies. The coding region was assigned to the thioesterase I domain by identification of sequences coding for characterized peptide fragments, amino-terminal analysis of the isolated thioesterase I domain and the presence of the serine esterase active-site sequence motif. The thioesterase I domain is 306 amino acids long with a calculated molecular mass of 33,476 daltons; its DNA is flanked at the 5'-end by a region coding for the acyl carrier protein domain and at the 3'-end by a 1,537-base pairs-long noncoding sequence with a poly(A) tail. The thioesterase I domain exhibits a low, albeit discernible, homology with the discrete medium-chain S-acyl fatty acid synthetase thioester hydrolases (thioesterase II) from rat mammary gland and duck uropygial gland, suggesting a distant but common evolutionary ancestry for these proteins.

  4. Isolation, characterization, and cDNA sequencing of alpha-1-antiproteinase-like protein from rainbow trout seminal plasma.

    PubMed

    Mak, Monika; Mak, Paweł; Olczak, Mariusz; Szalewicz, Agata; Glogowski, Jan; Dubin, Adam; Watorek, Wiesław; Ciereszko, Andrzej

    2004-03-17

    Seminal plasma of teleost fish contains serine proteinase inhibitors related to those present in blood. These inhibitors can be bound to Q-Sepharose and sequentially eluted with a NaCl gradient. In the present study, using a two-step procedure, we purified (73-fold to homogeneity) and characterized the inhibitor eluted as the second fraction of antitrypsin activity (inhibitor II) from Q-Sepharose. The molecular weight of this inhibitor was estimated to be 56 kDa with an isoelectric point of 5.4. It effectively inhibited trypsin and chymotrypsin but was less effective against elastase. It formed SDS-stable complexes with cod and bovine trypsin. Inhibitor II appeared to be a glycoprotein. Carbohydrate content was determined to be 16%. N-terminal Edman sequencing allowed identification of the first 30 N-terminal amino acids HDGDHAGHTEDHHHHLHHIAGEAHPQHSHG and 25 amino acids within the reactive loop IMPMSLPDTIMLNRPFLLFILEDST. The N-terminal sequence did not match any known sequence; however, the sequence within the reactive loop was significantly similar to carp and mammalian alpha1-antiproteinases. Both sequences were used to construct primers and obtain a cDNA sequence from liver. The mRNA encoding the protein is 1675 nt in length, including a single open reading frame of 1281 nt that encodes 426 amino acid residues. Analysis of this sequence indicated the presence of putative conserved serpin domains and confirmed the similarity to carp alpha1-antiproteinase and mammalian alpha1-antiproteinase. Our results indicate that inhibitor II belongs to the serpin superfamily and is similar to alpha1-antiproteinase.
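
    The reported lengths can be checked with simple codon arithmetic. The sketch below assumes the 1281 nt open reading frame is counted with its stop codon included, which the abstract does not state; the function name is hypothetical.

```python
def residues_from_orf(orf_nt, includes_stop=True):
    """Number of amino acid residues encoded by an ORF of the given length in nt."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nt // 3
    # The stop codon occupies one codon but adds no residue.
    return codons - 1 if includes_stop else codons


residues = residues_from_orf(1281)   # 427 codons, one of which is the stop
untranslated_nt = 1675 - 1281        # mRNA length outside the ORF
```

    Under that assumption, 1281 nt gives 427 codons and 426 residues, matching the abstract, and 394 nt of the 1675 nt mRNA lie outside the open reading frame.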

  5. Molecular cloning and sequence analysis of a novel chalcone synthase cDNA from Ginkgo biloba.

    PubMed

    Pang, Yongzhen; Shen, Guo-An; Liu, Chenghong; Liu, Xiaojun; Tan, Feng; Sun, Xiaofen; Tang, Kexuan

    2004-08-01

    A chalcone synthase (CHS) gene was cloned from Ginkgo biloba for the first time; it was also the first cloned gene involved in the flavonoid metabolic pathway in G. biloba. The full-length cDNA of G. biloba CHS (designated as Gbchs) was 1608 bp with poly(A) tailing and contained a 1173 bp open reading frame (ORF) encoding a 391 amino acid protein. Gbchs was found to have extensive homology with other plant chs genes via multiple alignments. The active sites of the CoA binding, coumaroyl pocket and cyclization pocket in the CHS protein of Medicago sativa were also found in GbCHS. Molecular modeling of GbCHS indicated that the three-dimensional structure of GbCHS strongly resembled that of M. sativa (MsCHS2), implying GbCHS may have functions similar to those of MsCHS2. Phylogenetic tree analysis revealed that GbCHS had a closer relationship with CHSs from gymnosperm plants than with those from other plants. Gbchs is a useful tool to study the regulation of flavonoid metabolism in G. biloba.

  6. Identification and analysis of safener-inducible expressed sequence tags in Populus using a cDNA microarray.

    PubMed

    Rishi, A S; Munir, Shirin; Kapur, Vivek; Nelson, Neil D; Goyal, Arun

    2004-12-01

    Safeners are the chemicals used to protect plants from detrimental effects of herbicides, but their mode of action at the molecular level is not well understood. As an initial step towards understanding the molecular mechanism of safener action in trees, homologous genes in hybrid poplar (Populus nigra x Populus maximowiczii) that were induced by a safener were identified. We here describe the identification of differentially expressed genes in Populus that are induced by Concep-III, a herbicide safener. Expressed sequence tags (ESTs) enriched for transcriptionally induced genes were isolated by suppressive subtractive hybridization (SSH). The SSH library cDNA inserts were used to construct a cDNA microarray for high-throughput validation of the up-regulated expression of safener-induced genes. Single-pass and partial sequences of 1,344 safener-induced ESTs were assembled into 418 singletons and 328 clusters, but the putative functions of almost 53% of the ESTs are not known. Genes encoding proteins involved in all three different phases of safener action, viz., oxidation, conjugation, and sequestration, were found in the SSH library. Almost 75% of genes that showed greater than 2-fold expression upon safener treatment were redundant in the SSH library. The expression pattern for selected genes was validated by reverse transcription-polymerase chain reaction. A few safener-induced genes that were not previously reported to be induced by safeners, but which may have a role in herbicide metabolism, were identified. The newly identified genes could have potential for application in genetic engineering of plants for herbicide detoxification and tolerance.

  7. Overproduction and nucleotide sequence of the respiratory D-lactate dehydrogenase of Escherichia coli.

    PubMed Central

    Rule, G S; Pratt, E A; Chin, C C; Wold, F; Ho, C

    1985-01-01

    Recombinant DNA plasmids containing the gene for the membrane-bound D-lactate dehydrogenase (D-LDH) of Escherichia coli linked to the promoter PL from lambda were constructed. After induction, the levels of D-LDH were elevated 300-fold over that of the wild type and amounted to 35% of the total cellular protein. The nucleotide sequence of the D-LDH gene was determined and shown to agree with the amino acid composition and the amino-terminal sequence of the purified enzyme. Removal of the amino-terminal formyl-Met from D-LDH was not inhibited in cells which contained these high levels of D-LDH. Images PMID:3882663

  8. Using mitochondrial nucleotide sequences to investigate diversity and genealogical relationships within common carp (Cyprinus carpio L.).

    PubMed

    Thai, B T; Burridge, C P; Pham, T A; Austin, C M

    2005-02-01

    Direct sequencing of mitochondrial DNA (mtDNA) D-loop (745 bp) and MTATPase6/MTATPase8 (857 bp) regions was used to investigate genetic variation within common carp and develop a global genealogy of common carp strains. The D-loop region was more variable than the MTATPase6/MTATPase8 region, but given the wide distribution of carp the overall levels of sequence divergence were low. Levels of haplotype diversity varied widely among countries with Chinese, Indonesian and Vietnamese carp showing the greatest diversity whereas Japanese Koi and European carp had undetectable nucleotide variation. A genealogical analysis supports a close relationship between Vietnamese, Koi and Chinese Color carp strains and to a lesser extent, European carp. Chinese and Indonesian carp strains were the most divergent, and their relationships do not support the evolution of independent Asian and European lineages and current taxonomic treatments.

  9. Nucleotide sequence alignment of hdcA from Gram-positive bacteria

    PubMed Central

    Diaz, Maria; Ladero, Victor; Redruello, Begoña; Sanchez-Llana, Esther; del Rio, Beatriz; Fernandez, Maria; Martin, Maria Cruz; Alvarez, Miguel A.

    2016-01-01

    The decarboxylation of histidine, carried out mainly by some Gram-positive bacteria, yields the toxic dietary biogenic amine histamine (Ladero et al. 2010 〈10.2174/157340110791233256〉 [1], Linares et al. 2016 〈http://dx.doi.org/10.1016/j.foodchem.2015.11.013〉 [2]). The reaction is catalyzed by a pyruvoyl-dependent histidine decarboxylase (Linares et al. 2011 〈10.1080/10408398.2011.582813〉 [3]), which is encoded by the gene hdcA. In order to locate conserved regions in the hdcA gene of Gram-positive bacteria, this article provides a nucleotide sequence alignment of all the hdcA sequences from Gram-positive bacteria present in databases. For further utility and discussion, see 〈http://dx.doi.org/10.1016/j.foodcont.2015.11.035〉 [4]. PMID:26958625

  10. Complete nucleotide sequence of a new variant of grapevine fanleaf virus from northeastern China.

    PubMed

    Zhou, Jun; Fan, Xudong; Dong, Yafeng; Zhang, Zunping; Ren, Fang; Hu, Guojun; Li, Zhengnan

    2017-02-01

    The complete RNA1 and RNA2 sequences of a new grapevine fanleaf virus isolate (GFLV-SDHN) from northeastern China were determined. The two RNAs are 7,367 and 3,788 nucleotides (nt) in length, respectively, excluding the poly(A) tails. Compared to other GFLV isolates, GFLV-SDHN has a 22- to 24-nt insertion in the RNA1 5' untranslated region, and there was 19.1-20.1 % and 11.7 %-13.0 % sequence divergence in RNA1, and 15.5 %-20.5 % and 8.5-13.5 % in RNA2, at the nt and amino acid level, respectively. Phylogenetic analysis revealed that the origins of GFLV-SDHN are distinct from those of other GFLV isolates. One recombination event was identified in the 2A(HP) region of RNA2 in GFLV-SDHN.

  11. [Nucleotide sequence of HLA-DQA1 promoter region (QAP) in a lung cancer patient].

    PubMed

    Qiu, C; Zhou, W; Song, C

    1996-06-01

    The HLA-DQA1 allele and the nucleotide sequence of the HLA-DQA1 promoter region (QAP) in a patient with IDDM complicated by lung cancer have been identified by PCR/SSCP, PCR/SSCP and PCR/sequencing. The results showed that: (1) the lung cancer patient and all of his family members carried HLA-DQA1*0301/0501 alleles; (2) a single base substitution G-->A at position -155 and a CAA deletion at positions -161 to -163 occurred in the patient. These results suggest that mutation of the HLA-DQA1 promoter region may modulate HLA-DQA1 gene expression by trans-acting factors binding to variant cis-acting elements and may be responsible for the pathogenesis of lung cancer.

  12. Molecular detection and nucleotide sequence analysis of a new Aichi virus closely related to canine kobuvirus in sewage samples.

    PubMed

    Yamashita, Teruo; Adachi, Hirokazu; Hirose, Emi; Nakamura, Noriko; Ito, Miyabi; Yasui, Yoshihiro; Kobayashi, Shinichi; Minagawa, Hiroko

    2014-05-01

    Between 2001 and 2005, 207 raw sewage samples were collected at the inflow of a sewage treatment plant in Aichi Prefecture, Japan. Of the 207 sewage samples, 137 (66.2 %) were found to be positive for amplification of Aichi virus (AiV) nucleotide sequences using reverse transcription (RT)-PCR with 10 forward and 10 reverse primers in the 3D region corresponding to the nucleotide sequence of all kobuviruses. AiV genotype A sequences were detected in all 137 samples. New sequences of AiV were detected in nine samples, exhibiting 83 % similarity with AiV A846/88, but 95 % similarity with canine kobuvirus (CKV) US-PC0082 in this region. The nucleotide sequences from the VP3 region to the 3' untranslated region (UTR) of sewage sample Y12/2004 were determined. The number of nucleotides in each region was the same as that of CKV. The nucleotide (amino acid) identity of the complete VP1 region between Y12/2004 and CKV US-PC0082 was 90.5 % (94.8 %). The phylogenetic analyses based on the nucleotide and the deduced amino acid sequences of VP1 and 3D showed that Y12/2004 was independent from AiV, but closely related to CKV. These results suggested that CKV is present in Aichi Prefecture, Japan.

  13. The nucleotide sequence surrounding the replication origin of the cop3 mutant of the bacteriocinogenic plasmid Clo DF13.

    PubMed Central

    Stuitje, A R; Veltkamp, E; Maat, J; Heyneker, H L

    1980-01-01

    The nucleotide sequence from about 100 base pairs downstream to about 600 base pairs upstream of the CloDF13 replication origin has been determined. A comparison of this sequence with the corresponding ColE1 origin sequence reveals that the sequence at the origin of replication is conserved. There are large differences in the nucleotide sequence downstream of the replication origin, whereas there is extensive homology in the region of about 410 base pairs upstream of the replication origin. This conserved region might code for a largely homologous, basic, arginine-rich polypeptide of about 45 amino acids in both ColE1 and CloDF13. Although there are large differences in the primary structure of the region coding for the 100-nucleotide RNA, the secondary structure of this region seems to be conserved. Images PMID:6253936

  14. The nucleotide sequence surrounding the replication origin of the cop3 mutant of the bacteriocinogenic plasmid Clo DF13.

    PubMed

    Stuitje, A R; Veltkamp, E; Maat, J; Heyneker, H L

    1980-04-11

    The nucleotide sequence from about 100 base pairs downstream to about 600 base pairs upstream of the CloDF13 replication origin has been determined. A comparison of this sequence with the corresponding ColE1 origin sequence reveals that the sequence at the origin of replication is conserved. There are large differences in the nucleotide sequence downstream of the replication origin, whereas there is extensive homology in the region of about 410 base pairs upstream of the replication origin. This conserved region might code for a largely homologous, basic, arginine-rich polypeptide of about 45 amino acids in both ColE1 and CloDF13. Although there are large differences in the primary structure of the region coding for the 100-nucleotide RNA, the secondary structure of this region seems to be conserved.

  15. Nucleotide sequence of ermA, a macrolide-lincosamide-streptogramin B determinant in Staphylococcus aureus.

    PubMed Central

    Murphy, E

    1985-01-01

    The complete nucleotide sequence of ermA, the prototype macrolide-lincosamide-streptogramin B resistance gene from Staphylococcus aureus, has been determined. The sequence predicts a 243-amino-acid protein that is homologous to those specified by ermC, ermAM, and ermD, resistance determinants from Staphylococcus aureus, Streptococcus sanguis, and Bacillus licheniformis, respectively. The ermA transcript, identified by Northern analysis and S1 mapping, contains a 5' leader sequence of 211 bases which has the potential to encode two short peptides of 15 and 19 amino acids; the second, longer peptide has 13 amino acids in common with the putative regulatory leader peptide of ermC. The coding sequence for this peptide is deleted in several mutants in which macrolide-lincosamide-streptogramin B resistance is constitutively expressed. Potential secondary structures available to the leader sequence of the wild-type (inducible) transcript and to constitutive deletion, insertion, and point mutations provide additional support for the translational attenuation model for induction of macrolide-lincosamide-streptogramin B resistance. Images PMID:2985541

  16. Nucleotide sequence analysis of beta tubulin gene in a wide range of dermatophytes.

    PubMed

    Rezaei-Matehkolaei, Ali; Mirhendi, Hossein; Makimura, Koichi; de Hoog, G Sybren; Satoh, Kazuo; Najafzadeh, Mohammad Javad; Shidfar, Mohammad Reza

    2014-10-01

    We investigated the resolving power of the beta tubulin protein-coding gene (BT2) for systematic study of dermatophyte fungi. Initially, 144 standard and clinical strains belonging to 26 species in the genera Trichophyton, Microsporum, and Epidermophyton were identified by internal transcribed spacer (ITS) sequencing. Subsequently, BT2 was partially amplified in all strains, and sequence analysis was performed after construction of a BT2 database, which showed that sequence length ranged from approximately 723 (T. ajelloi) to 808 nucleotides (M. persicolor) in different species. Intraspecific sequence variation was found in some species, but T. tonsurans, T. equinum, T. concentricum, T. verrucosum, T. rubrum, T. violaceum, T. eriotrephon, E. floccosum, M. canis, M. ferrugineum, and M. audouinii were invariant. The sequences were found to be relatively conserved among different strains of the same species. The species pairs with the closest resemblance were Arthroderma benhamiae and T. concentricum, and T. tonsurans and T. equinum, with 100% and 99.8% identity, respectively; the most distant species were M. persicolor and M. amazonicum. The dendrogram obtained from BT2 topology was almost compatible with the species concept based on ITS sequencing, and similar clades and species were distinguished in the BT2 tree. Here, beta tubulin was characterized in a wide range of dermatophytes in order to assess intra- and interspecies variation and resolution and was found to be a taxonomically valuable gene.
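
    Identity figures such as the 100% and 99.8% quoted above are typically computed column by column over aligned sequences. The following is a minimal sketch, assuming the two sequences are already aligned to equal length and that columns containing a gap are skipped; the function name is hypothetical and the gap convention is an assumption, since the abstract does not state how identity was calculated.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: gap-containing columns are ignored. Other conventions
    (counting gaps as mismatches) give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared
```

    For instance, `percent_identity("ACGT", "ACGA")` compares four columns, three of which match.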

  17. Computational generation and screening of RNA motifs in large nucleotide sequence pools

    PubMed Central

    Kim, Namhee; Izzo, Joseph A.; Elmetwaly, Shereef; Gan, Hin Hark; Schlick, Tamar

    2010-01-01

    Although identification of active motifs in large random sequence pools is central to RNA in vitro selection, no systematic computational equivalent of this process has yet been developed. We develop a computational approach that combines target pool generation, motif scanning and motif screening using secondary structure analysis for applications to 10^12–10^14-sequence pools; large pool sizes are made possible using program redesign and supercomputing resources. We use the new protocol to search for aptamer and ribozyme motifs in pools up to experimental pool size (10^14 sequences). We show that motif scanning, structure matching and flanking sequence analysis, respectively, reduce the initial sequence pool by 6–8, 1–2 and 1 orders of magnitude, consistent with the rare occurrence of active motifs in random pools. The final yields match the theoretical yields from probability theory for simple motifs and overestimate experimental yields, which constitute lower bounds, for aptamers because screening analyses beyond secondary structure information are not considered systematically. We also show that designed pools using our nucleotide transition probability matrices can produce higher yields for RNA ligase motifs than random pools. Our methods for generating, analyzing and designing large pools can help improve RNA design via simulation of aspects of in vitro selection. PMID:20448026
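
    The "theoretical yields from probability theory" mentioned above can be illustrated for the simplest possible case: one fixed motif in a pool of uniformly random sequences. This is a minimal sketch under stated assumptions (independent, equiprobable bases and a single exact-match motif); the function names are hypothetical, and it is not the authors' protocol, which additionally screens by secondary structure.

```python
def expected_motif_hits(pool_size: float, seq_len: int, motif_len: int) -> float:
    """Expected number of exact occurrences of one fixed motif across a random pool.

    Assumes each base is drawn independently and uniformly from A, C, G, U.
    """
    if motif_len > seq_len:
        return 0.0
    windows = seq_len - motif_len + 1      # possible start positions per sequence
    p_window = 0.25 ** motif_len           # chance that one window matches exactly
    return pool_size * windows * p_window  # linearity of expectation


def count_motif(pool, motif):
    """Count exact motif occurrences (overlaps included) in a small explicit pool."""
    total = 0
    for seq in pool:
        start = seq.find(motif)
        while start != -1:
            total += 1
            start = seq.find(motif, start + 1)
    return total
```

    For example, `expected_motif_hits(1e14, 100, 20)` estimates the number of hits for a hypothetical 20-nt motif in a pool of 10^14 random 100-mers, while `count_motif` can check the estimate empirically on small simulated pools.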

  18. Organization and nucleotide sequence analysis of a ribosomal RNA gene cluster from Streptomyces ambofaciens.

    PubMed

    Pernodet, J L; Boccard, F; Alegre, M T; Gagnat, J; Guérineau, M

    1989-06-30

    The Streptomyces ambofaciens genome contains four rRNA gene clusters. These copies are called rrnA, B, C and D. The complete nucleotide (nt) sequence of rrnD has been determined. These genes possess striking similarity with other eubacterial rRNA genes. Comparison with other rRNA sequences allowed the putative localization of the sequences encoding mature rRNAs. The structural genes are arranged in the order 16S-23S-5S and are tightly linked. The mature rRNAs are predicted to contain 1528, 3120 and 120 nt, for the 16S, 23S and 5S rRNAs, respectively. The 23S rRNA is, to our knowledge, the longest of all sequenced prokaryotic 23S rRNAs. When compared to other large rRNAs it shows insertions at positions where they are also present in archaebacterial and in eukaryotic large rRNAs. Secondary structure models of S. ambofaciens rRNAs are proposed, based upon those existing for other bacterial rRNAs. Positions of putative transcription start points and of a termination signal are suggested. The corresponding putative primary transcript, containing the 16S, 23S and 5S rRNAs plus flanking regions, was folded into a secondary structure, and sequences possibly involved in rRNA maturation are described. The G + C content of the rRNA gene cluster is low (57%) compared with the overall G + C content of Streptomyces DNA (73%).
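
    The G + C comparison above (57% for the rRNA gene cluster against 73% overall) rests on a simple base count. A minimal sketch follows, assuming ambiguous symbols such as N are excluded from the denominator; the function name is hypothetical.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T or U)."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    at = sum(seq.count(base) for base in "ATU")
    total = gc + at
    if total == 0:
        raise ValueError("sequence has no unambiguous bases")
    return gc / total
```

    Multiplying the result by 100 gives the percentage form used in the abstract.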

  19. Unique nucleotide sequence-guided assembly of repetitive DNA parts for synthetic biology applications

    SciTech Connect

    Torella, JP; Lienert, F; Boehm, CR; Chen, JH; Way, JC; Silver, PA

    2014-08-07

    Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts, and they hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies (for example, repeated terminator and insulator sequences) that complicate recombination-based assembly. We and others have recently developed DNA assembly methods, which we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly assembled constructs, or into high-quality combinatorial libraries in only 2-3 d. If the DNA parts must be generated from scratch, an additional 2-5 d are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques.

  20. Unique nucleotide sequence (UNS)-guided assembly of repetitive DNA parts for synthetic biology applications

    PubMed Central

    Torella, Joseph P.; Lienert, Florian; Boehm, Christian R.; Chen, Jan-Hung; Way, Jeffrey C.; Silver, Pamela A.

    2016-01-01

    Recombination-based DNA construction methods, such as Gibson assembly, have made it possible to easily and simultaneously assemble multiple DNA parts and hold promise for the development and optimization of metabolic pathways and functional genetic circuits. Over time, however, these pathways and circuits have become more complex, and the increasing need for standardization and insulation of genetic parts has resulted in sequence redundancies — for example repeated terminator and insulator sequences — that complicate recombination-based assembly. We and others have recently developed DNA assembly methods that we refer to collectively as unique nucleotide sequence (UNS)-guided assembly, in which individual DNA parts are flanked with UNSs to facilitate the ordered, recombination-based assembly of repetitive sequences. Here we present a detailed protocol for UNS-guided assembly that enables researchers to convert multiple DNA parts into sequenced, correctly-assembled constructs, or into high-quality combinatorial libraries in only 2–3 days. If the DNA parts must be generated from scratch, an additional 2–5 days are necessary. This protocol requires no specialized equipment and can easily be implemented by a student with experience in basic cloning techniques. PMID:25101822

  1. Evidence for Balancing Selection from Nucleotide Sequence Analyses of Human G6PD

    PubMed Central

    Verrelli, Brian C.; McDonald, John H.; Argyropoulos, George; Destro-Bisol, Giovanni; Froment, Alain; Drousiotou, Anthi; Lefranc, Gerard; Helal, Ahmed N.; Loiselet, Jacques; Tishkoff, Sarah A.

    2002-01-01

    Glucose-6-phosphate dehydrogenase (G6PD) mutations that result in reduced enzyme activity have been implicated in malarial resistance and constitute one of the best examples of selection in the human genome. In the present study, we characterize the nucleotide diversity across a 5.2-kb region of G6PD in a sample of 160 Africans and 56 non-Africans, to determine how selection has shaped patterns of DNA variation at this gene. Our global sample of enzymatically normal B alleles and A, A−, and Med alleles with reduced enzyme activities reveals many previously uncharacterized silent-site polymorphisms. In comparison with the absence of amino acid divergence between human and chimpanzee G6PD sequences, we find that the number of G6PD amino acid polymorphisms in human populations is significantly high. Unlike many other G6PD-activity alleles with reduced activity, we find that the age of the A variant, which is common in Africa, may not be consistent with the recent emergence of severe malaria and therefore may have originally had a historically different adaptive function. Overall, our observations strongly support previous genotype-phenotype association studies that proposed that balancing selection maintains G6PD deficiencies within human populations. The present study demonstrates that nucleotide sequence analyses can reveal signatures of both historical and recent selection in the genome and may elucidate the impact that infectious disease has had during human evolution. PMID:12378426

  2. Nucleotide sequence at the termini of the DNA of Bacillus subtilis phage phi 29.

    PubMed Central

    Escarmís, C; Salas, M

    1981-01-01

    Phage phi 29 DNA cannot be phosphorylated with polynucleotide kinase and [gamma-32P]ATP because of the presence of a viral protein covalently linked to the 5' termini. The 5' ends can, however, be made susceptible to phosphorylation by treatment with alkali and alkaline phosphatase. Restriction fragments Hpa II C and Hpa II F, corresponding to the right and left ends of phi 29 DNA, respectively, were labeled at the 5' ends with polynucleotide kinase and [gamma-32P]ATP or at the 3' ends with terminal transferase and [alpha-32P]ATP or [alpha-32P]cordycepin 5'-triphosphate. After a secondary cleavage of the labeled fragments, the sequence of the first 150-180 nucleotides at the termini of phi 29 DNA was determined by the method of Maxam and Gilbert. The ends of phi 29 DNA are flush, and a six-nucleotides-long inverted terminal repetition was found. The functional implications of the sequences determined are discussed. Images PMID:6262800
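
    An inverted terminal repetition means that the first bases of a strand equal the reverse complement of its last bases, so both 5' ends read the same. A minimal sketch of that check follows; the function names are hypothetical and the test sequence in the usage note is a made-up toy, not the phi 29 sequence reported in the paper.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def inverted_terminal_repeat_length(seq: str, max_len: int = 50) -> int:
    """Length of the longest inverted terminal repeat, up to max_len.

    Position i from the 5' end must pair with position i from the 3' end.
    """
    seq = seq.upper()
    limit = min(max_len, len(seq) // 2)
    k = 0
    while k < limit and seq[k] == reverse_complement(seq[len(seq) - 1 - k]):
        k += 1
    return k
```

    On the toy input `"AAAGTACCCCCTACTTT"` the two six-base ends are reverse complements of each other, and the seventh position breaks the match.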

  3. Mapping DNA methylation by transverse current sequencing: Reduction of noise from neighboring nucleotides

    NASA Astrophysics Data System (ADS)

    Alvarez, Jose; Massey, Steven; Kalitsov, Alan; Velev, Julian

    Nanopore sequencing via transverse current has emerged as a competitive candidate for mapping DNA methylation without the need for bisulfite treatment, fluorescent tags, or PCR amplification. By eliminating the error-producing amplification step, long read lengths become feasible, which greatly simplifies the assembly process and reduces the time and cost inherent in current technologies. However, due to the large error rates of nanopore sequencing, single-base resolution has not been reached. A very important source of noise is the intrinsic structural noise in the electric signature of the nucleotide arising from the influence of neighboring nucleotides. In this work we perform calculations of the tunneling current through DNA molecules in nanopores using the non-equilibrium electron transport method within an effective multi-orbital tight-binding model derived from first-principles calculations. We develop a base-calling algorithm accounting for the correlations of the current through neighboring bases, which in principle can reduce the error rate below any desired precision. Using this method we show that we can clearly distinguish DNA methylation and other base modifications based on the reading of the tunneling current.

  4. Haplotype structure and population genetic inferences from nucleotide-sequence variation in human lipoprotein lipase.

    PubMed Central

    Clark, A G; Weiss, K M; Nickerson, D A; Taylor, S L; Buchanan, A; Stengård, J; Salomaa, V; Vartiainen, E; Perola, M; Boerwinkle, E; Sing, C F

    1998-01-01

    Allelic variation in 9.7 kb of genomic DNA sequence from the human lipoprotein lipase gene (LPL) was scored in 71 healthy individuals (142 chromosomes) from three populations: African Americans (24) from Jackson, MS; Finns (24) from North Karelia, Finland; and non-Hispanic Whites (23) from Rochester, MN. The sequences had a total of 88 variable sites, with a nucleotide diversity (site-specific heterozygosity) of 0.002 ± 0.001 across this 9.7-kb region. The frequency spectrum of nucleotide variation exhibited a slight excess of heterozygosity, but, in general, the data fit expectations of the infinite-sites model of mutation and genetic drift. Allele-specific PCR helped resolve linkage phases, and a total of 88 distinct haplotypes were identified. For 1,410 (64%) of the 2,211 site pairs, all four possible gametes were present in these haplotypes, reflecting a rich history of past recombination. Despite the strong evidence for recombination, extensive linkage disequilibrium was observed. The number of haplotypes generally is much greater than the number expected under the infinite-sites model, but there was sufficient multisite linkage disequilibrium to reveal two major clades, which appear to be very old. Variation in this region of LPL may depart from the variation expected under a simple, neutral model, owing to complex historical patterns of population founding, drift, selection, and recombination. These data suggest that the design and interpretation of disease-association studies may not be as straightforward as often is assumed. PMID:9683608
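
    The statement that all four gametes were present for 1,410 of 2,211 site pairs describes the four-gamete test applied to every pair of biallelic sites. A minimal sketch of that count follows, assuming phased haplotypes encoded as equal-length strings with two alleles per site (here '0' and '1'); the function name and encoding are hypothetical, and this is not the authors' own software.

```python
from itertools import combinations


def four_gamete_pairs(haplotypes):
    """Return (pairs_with_all_four_gametes, total_pairs) over all site pairs.

    Under the infinite-sites model without recombination, at most three of
    the four two-site combinations can occur, so a pair showing all four
    is taken as evidence of past recombination (or recurrent mutation).
    """
    n_sites = len(haplotypes[0])
    if any(len(h) != n_sites for h in haplotypes):
        raise ValueError("all haplotypes must have the same number of sites")
    total = 0
    with_four = 0
    for i, j in combinations(range(n_sites), 2):
        total += 1
        gametes = {(h[i], h[j]) for h in haplotypes}  # distinct two-site gametes
        if len(gametes) == 4:
            with_four += 1
    return with_four, total
```

    As a check on the arithmetic in the abstract, 2,211 is the number of unordered pairs among 67 sites (67 × 66 / 2), and 1,410 / 2,211 is about 64%.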

  5. Expressed Sequence Tags Analysis and Design of Simple Sequence Repeats Markers from a Full-Length cDNA Library in Perilla frutescens (L.)

    PubMed Central

    Seong, Eun Soo; Yoo, Ji Hye; Choi, Jae Hoo; Kim, Chang Heum; Jeon, Mi Ran; Kang, Byeong Ju; Lee, Jae Geun; Choi, Seon Kang; Ghimire, Bimal Kumar; Yu, Chang Yeon

    2015-01-01

    Perilla frutescens is valuable as a medicinal plant as well as a natural medicine and functional food. However, comparative genomics analyses of P. frutescens are limited due to a lack of gene annotations and characterization. A full-length cDNA library from P. frutescens leaves was constructed to identify functional gene clusters and probable EST-SSR markers via analysis of 1,056 expressed sequence tags. Unigene assembly was performed using basic local alignment search tool (BLAST) homology searches and annotated Gene Ontology (GO). A total of 18 simple sequence repeats (SSRs) were designed as primer pairs. This study is the first to report comparative genomics and EST-SSR markers from P. frutescens; these will help gene discovery and provide an important source for functional genomics and molecular genetic research in this interesting medicinal plant. PMID:26664999

  6. QGRS-H Predictor: a web server for predicting homologous quadruplex forming G-rich sequence motifs in nucleotide sequences

    PubMed Central

    Menendez, Camille; Frees, Scott; Bagga, Paramjeet S.

    2012-01-01

    Naturally occurring G-quadruplex structural motifs, formed by guanine-rich nucleic acids, have been reported in telomeric, promoter and transcribed regions of mammalian genomes. G-quadruplex structures have received significant attention because of growing evidence for their role in important biological processes, human disease and as therapeutic targets. Lately, there has been much interest in the potential roles of RNA G-quadruplexes as cis-regulatory elements of post-transcriptional gene expression. Large-scale computational genomics studies on G-quadruplexes have difficulty validating their predictions without laborious testing in ‘wet’ labs. We have developed a bioinformatics tool, QGRS-H Predictor, that can map and analyze conserved putative Quadruplex forming 'G'-Rich Sequences (QGRS) in mRNAs, ncRNAs and other nucleotide sequences, e.g. promoter, telomeric and gene flanking regions. Identifying conserved regulatory motifs helps validate computations and enhances accuracy of predictions. The QGRS-H Predictor is particularly useful for mapping homologous G-quadruplex forming sequences as cis-regulatory elements in the context of 5′- and 3′-untranslated regions, and CDS sections of aligned mRNA sequences. QGRS-H Predictor features highly interactive graphic representation of the data. It is a unique and user-friendly application that provides many options for defining and studying G-quadruplexes. The QGRS-H Predictor can be freely accessed at: http://quadruplex.ramapo.edu/qgrs/app/start. PMID:22576365
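
    A putative quadruplex-forming G-rich sequence is commonly described as four guanine tracts separated by three short loops. The following is a minimal sketch of such a scan using a regular expression; the tract length, loop length and function name are illustrative assumptions and are not the QGRS-H Predictor's actual parameters or algorithm, which also scores motifs and compares them across aligned sequences.

```python
import re


def find_putative_g4(seq: str, min_g: int = 3, max_loop: int = 7):
    """Return (start, motif) for non-overlapping G-tract/loop patterns.

    Pattern: G-tract, then three repeats of (loop, G-tract). RNA input is
    accepted by converting U to T first.
    """
    seq = seq.upper().replace("U", "T")
    tract = "G{%d,}" % min_g
    loop = "[ACGT]{1,%d}" % max_loop
    pattern = re.compile("%s(?:%s%s){3}" % (tract, loop, tract))
    return [(m.start(), m.group(0)) for m in pattern.finditer(seq)]
```

    For example, scanning `"TTGGGAGGGAGGGAGGGTT"` reports one motif starting at index 2. Running the same scan on each sequence of an alignment and comparing hit positions would be one simple way to look for conserved candidates.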

  7. Identification of protein-interacting nucleotides in a RNA sequence using composition profile of tri-nucleotides.

    PubMed

    Panwar, Bharat; Raghava, Gajendra P S

    2015-04-01

    RNA-protein interactions play diverse roles in cells; thus, identifying the RNA-protein interface is essential for understanding their function. In the past, several methods have been developed for predicting RNA-interacting residues in proteins, but limited efforts have been made to identify protein-interacting nucleotides in RNAs. In order to discriminate protein-interacting and non-interacting nucleotides, we used various classifiers (NaiveBayes, NaiveBayesMultinomial, BayesNet, ComplementNaiveBayes, MultilayerPerceptron, J48, SMO, RandomForest, SMO and SVM(light)) for prediction model development using various features and achieved highest 83.92% sensitivity, 84.82% specificity, 84.62% accuracy and a Matthews correlation coefficient of 0.62 with SVM(light)-based models. We observed that certain tri-nucleotides such as ACA, ACC, AGA, CAC, CCA, GAG, UGA, and UUU are preferred in protein interaction. All the models have been developed using a non-redundant dataset and are evaluated using a five-fold cross-validation technique. A web server called RNApin has been developed for the scientific community (http://crdd.osdd.net/raghava/rnapin/).
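
    A tri-nucleotide composition profile turns an RNA window into a fixed-length vector of 64 frequencies that a classifier can consume. The following is a minimal sketch of that feature extraction, assuming overlapping tri-nucleotides and a fixed alphabetical ordering; the names are hypothetical and the exact windowing and features used by RNApin are not given in the abstract.

```python
from itertools import product

# All 64 tri-nucleotides in a fixed order, so feature positions are stable.
TRINUCLEOTIDES = ["".join(p) for p in product("ACGU", repeat=3)]


def trinucleotide_composition(window: str):
    """Return 64 relative frequencies of overlapping tri-nucleotides.

    DNA-style input is accepted by converting T to U. Triplets containing
    other symbols (for example N) are ignored.
    """
    window = window.upper().replace("T", "U")
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    n = 0
    for i in range(len(window) - 2):
        tri = window[i:i + 3]
        if tri in counts:
            counts[tri] += 1
            n += 1
    if n == 0:
        return [0.0] * len(TRINUCLEOTIDES)
    return [counts[t] / n for t in TRINUCLEOTIDES]
```

    Each nucleotide would then be represented by the profile of a window centred on it and labelled as interacting or non-interacting for training.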

  8. Identification and nucleotide sequence of the glycoprotein gB gene of equine herpesvirus 4.

    PubMed Central

    Riggio, M P; Cullinane, A A; Onions, D E

    1989-01-01

    The nucleotide sequence of the glycoprotein gB gene of equine herpesvirus 4 (EHV-4) was determined. The gene was located within a BamHI genomic library by a combination of Southern and dot-blot hybridization with probes derived from the herpes simplex virus type 1 (HSV-1) gB DNA sequence. The predominant portion of the coding sequences was mapped to a 2.95-kilobase BamHI-EcoRI subfragment at the left-hand end of BamHI-C. Potential TATA box, CAT box, and mRNA start site sequences and the translational initiation codon were located in the BamHI M fragment of the virus, which is located immediately to the left of BamHI-C. A polyadenylation signal, AATAAA, occurs nine nucleotides past the chain termination codon. Translation of these sequences would give a 110-kilodalton protein possessing a 5' hydrophobic signal sequence, a hydrophilic surface domain containing 11 potential N-linked glycosylation sites, a hydrophobic transmembrane domain, and a 3' highly charged cytoplasmic domain. A potential internal proteolytic cleavage site, Arg-Arg/Ser, was identified at residues 459 to 461. Analysis of this protein revealed amino acid sequence homologies of 47% with HSV-1 gB, 54% with pseudorabies virus gpII, 51% with varicella-zoster virus gpII, 29% with human cytomegalovirus gB, and 30% with Epstein-Barr virus gB. Alignment of EHV-4 gB with HSV-1 (KOS) gB further revealed that four potential N-linked glycosylation sites and all 10 cysteine residues on the external surface of the molecules are perfectly conserved, suggesting that the proteins possess similar secondary and tertiary structures. Thus, we showed that EHV-4 gB is highly conserved with the gB and gpII glycoproteins of other herpesviruses, suggesting that this glycoprotein has a similar overall function in each virus. Images PMID:2915378

  9. The bioinformatics of nucleotide sequence coding for proteins requiring metal coenzymes and proteins embedded with metals

    NASA Astrophysics Data System (ADS)

    Tremberger, G.; Dehipawala, Sunil; Cheung, E.; Holden, T.; Sullivan, R.; Nguyen, A.; Lieberman, D.; Cheung, T.

    2015-09-01

    All metallo-proteins need post-translation metal incorporation. In fact, the isotope ratio of Fe, Cu, and Zn in physiology and oncology have emerged as an important tool. The nickel containing F430 is the prosthetic group of the enzyme methyl coenzyme M reductase which catalyzes the release of methane in the final step of methano-genesis, a prime energy metabolism candidate for life exploration space mission in the solar system. The 3.5 Gyr early life sulfite reductase as a life switch energy metabolism had Fe-Mo clusters. The nitrogenase for nitrogen fixation 3 billion years ago had Mo. The early life arsenite oxidase needed for anoxygenic photosynthesis energy metabolism 2.8 billion years ago had Mo and Fe. The selection pressure in metal incorporation inside a protein would be quantifiable in terms of the related nucleotide sequence complexity with fractal dimension and entropy values. Simulation model showed that the studied metal-required energy metabolism sequences had at least ten times more selection pressure relatively in comparison to the horizontal transferred sequences in Mealybug, guided by the outcome histogram of the correlation R-sq values. The metal energy metabolism sequence group was compared to the circadian clock KaiC sequence group using magnesium atomic level bond shifting mechanism in the protein, and the simulation model would suggest a much higher selection pressure for the energy life switch sequence group. The possibility of using Kepler 444 as an example of ancient life in Galaxy with the associated exoplanets has been proposed and is further discussed in this report. Examples of arsenic metal bonding shift probed by Synchrotron-based X-ray spectroscopy data and Zn controlled FOXP2 regulated pathways in human and chimp brain studied tissue samples are studied in relationship to the sequence bioinformatics. The analysis results suggest that relatively large metal bonding shift amount is associated with low probability correlation R

  10. Molecular cloning, nucleotide sequence, and abscisic acid induction of a suberization-associated highly anionic peroxidase.

    PubMed

    Roberts, E; Kolattukudy, P E

    1989-06-01

    A highly anionic peroxidase induced in suberizing cells was suggested to be the key enzyme involved in polymerization of phenolic monomers to generate the aromatic matrix of suberin. The enzyme encoded by a potato cDNA was found to be highly homologous to the anionic peroxidase induced in suberizing tomato fruit. A tomato genomic library was screened using the potato anionic peroxidase cDNA and one genomic clone was isolated that contained two tandemly oriented anionic peroxidase genes. These genes were sequenced and were 96% and 87% identical to the mRNA for potato anionic peroxidase. Both genes consist of three exons with the relative positions of their two introns being conserved between the two genes. Primer extension analysis showed that only one of the genes is expressed in the periderm of 3 day wound-healed tomato fruits. Southern blot analyses suggested that there are two copies each of the two highly homologous genes per haploid genome in both potato and tomato. Abscisic acid (ABA) induced the accumulation of the anionic peroxidase transcripts in potato and tomato callus tissues. Northern blots showed that peroxidase mRNA was detectable at 2 days and was maximal at 8 days after transfer of potato callus to solid agar media containing 10^-4 M ABA. The transcripts induced by ABA in both potato and tomato callus were identical in size to those induced in wound-healing potato tuber and tomato fruit. The anionic peroxidase peptide was detected in extracts of potato callus grown on the ABA-containing media by western blot analysis. The results support the suggestion that stimulation of suberization by ABA involves the induction of the highly anionic peroxidase.

  11. Power Spectrum and Mutual Information Analyses of DNA Base (Nucleotide) Sequences

    NASA Astrophysics Data System (ADS)

    Isohata, Yasuhiko; Hayashi, Masaki

    2003-03-01

    On the basis of power spectrum analyses of the base (nucleotide) sequences of various genes, we have studied long-range correlations in total base sequences, which are expressed as $1/f^{\alpha}$, the behaviour of the exponent $\alpha$ for the accumulated base sequences, as well as periodicities at short range. In particular, from the analysis of content rate distributions of $\alpha$ we have obtained the average values $\bar{\alpha} = 0.40 \pm 0.01$ and $\bar{\alpha} = 0.20 \pm 0.01$ for the human genes and S. cerevisiae genes, respectively. We have also performed the analyses using the mutual information function. We show that there exists a clear difference between the content rate distributions of correlation lengths for the sample human genes and the S. cerevisiae genes. We are led to a conjecture that the elongation of the correlation length in the base sequences of genes from the early eukaryote (S. cerevisiae) to the late eukaryote (human) should be the definite reflection of the evolutionary process.
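
    The mutual information function measures how much knowing the base at one position tells about the base k positions away; its decay with k gives a correlation length. The following is a minimal sketch of the standard estimator from observed pair frequencies; the function name is hypothetical, and the authors' exact estimator and any finite-size corrections are not specified in the abstract.

```python
import math
from collections import Counter


def mutual_information(seq: str, k: int) -> float:
    """Mutual information I(k), in bits, between bases k positions apart.

    I(k) = sum over (a, b) of p_ab(k) * log2(p_ab(k) / (p_a * p_b)),
    with all probabilities estimated from the pairs (seq[i], seq[i + k]).
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    seq = seq.upper()
    pairs = [(seq[i], seq[i + k]) for i in range(len(seq) - k)]
    if not pairs:
        raise ValueError("sequence is too short for this separation")
    n = len(pairs)
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)    # marginal of the first base
    right = Counter(b for _, b in pairs)   # marginal of the second base
    info = 0.0
    for (a, b), c in joint.items():
        p_ab = c / n
        info += p_ab * math.log2(p_ab * n * n / (left[a] * right[b]))
    return info
```

    Evaluating `mutual_information(seq, k)` over a range of k and locating where it falls to the level expected for a shuffled sequence is one simple way to read off a correlation length.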

  12. Feasibility of mini-sequencing schemes based on nucleotide polymorphisms for microbial identification and population analyses.

    PubMed

    Araujo, Ricardo; Eusebio, Nadia; Caramalho, Rita

    2015-03-01

    Practical schemes based on single nucleotide polymorphisms (SNP) have been proposed as alternatives to simplify and replace the molecular methodologies based on the extensive sequencing analysis of genes. SNaPshot mini-sequencing has been increasingly used during the last decade and represents a fast and robust strategy to analyze critical polymorphisms. Such assays have been proposed to characterize some bacteria and microbial eukaryotes, and their feasibility is reviewed in the present manuscript. The mini-sequencing schemes showed high discriminatory power and competence for identification of microorganisms, but some specificity errors were still found, particularly for species of the Burkholderia cepacia complex and mycobacteria. SNP assays designed for other goals, e.g., comparison of strains, detection of serotypes, virulence, epidemic, and phylogenetic-related subgroups of isolates, can be very useful by facilitating the investigation of large collections of isolates. The next generation of SNP assays might consider the inclusion of a large number of markers to fully characterize microbial taxonomy and strains; nevertheless, these new technologies are still prone to errors and can largely benefit from integration with well-established mini-sequencing assays. Newly proposed molecular tools should be systematically tested in collections of isolates with high indexes of diversity and should undergo interlaboratory validation.

  13. The nucleotide sequence of a Polish isolate of Tomato torrado virus.

    PubMed

    Budziszewska, Marta; Obrepalska-Steplowska, Aleksandra; Wieczorek, Przemysław; Pospieszny, Henryk

    2008-12-01

    A new virus was isolated from greenhouse tomato plants showing symptoms of leaf and apex necrosis in Wielkopolska province in Poland in 2003. The observed symptoms and the virus morphology resembled those of viruses previously reported in Spain, called Tomato torrado virus (ToTV), and in Mexico, called Tomato marchitez virus (ToMarV). The complete genome of the Polish isolate Wal'03 was determined by RT-PCR amplification using oligonucleotide primers developed against the ToTV sequences deposited in GenBank, followed by cloning, sequencing, and comparison with the sequence of the type isolate. Phylogenetic analyses, performed on the basis of fragments of polyprotein sequences, established the relationship of Polish isolate Wal'03 with Spanish ToTV and Mexican ToMarV, as well as with other viruses from the Sequivirus, Sadwavirus, and Cheravirus genera, reported to be the most similar to the new tomato viruses. The Wal'03 genome strands have the same organization as the ToTV type isolate and very high homology with it, showing only some nucleotide and deduced amino acid changes, in contrast to ToMarV, which was significantly different. The phylogenetic tree clustered the aforementioned viruses into the same group, indicating that they have a common origin.

  14. Increased functional protein expression using nucleotide sequence features enriched in highly expressed genes in zebrafish.

    PubMed

    Horstick, Eric J; Jordan, Diana C; Bergeron, Sadie A; Tabor, Kathryn M; Serpe, Mihaela; Feldman, Benjamin; Burgess, Harold A

    2015-04-20

    Many genetic manipulations are limited by difficulty in obtaining adequate levels of protein expression. Bioinformatic and experimental studies have identified nucleotide sequence features that may increase expression; however, it is difficult to assess the relative influence of these features. Zebrafish embryos are rapidly injected with calibrated doses of mRNA, enabling the effects of multiple sequence changes to be compared in vivo. Using RNAseq and microarray data, we identified a set of genes that are highly expressed in zebrafish embryos and systematically analyzed for enrichment of sequence features correlated with levels of protein expression. We then tested enriched features by embryo microinjection and functional tests of multiple protein reporters. Codon selection, releasing factor recognition sequence and specific introns and 3' untranslated regions each increased protein expression between 1.5- and 3-fold. These results suggested principles for increasing protein yield in zebrafish through biomolecular engineering. We implemented these principles for rational gene design in software for codon selection (CodonZ) and plasmid vectors incorporating the most active non-coding elements. Rational gene design thus significantly boosts expression in zebrafish, and a similar approach will likely elevate expression in other animal models.

  15. Nucleotide sequence of the glucoamylase gene GLU1 in the yeast Saccharomycopsis fibuligera.

    PubMed Central

    Itoh, T; Ohtsuki, I; Yamashita, I; Fukui, S

    1987-01-01

    The complete nucleotide sequence of the glucoamylase gene GLU1 from the yeast Saccharomycopsis fibuligera has been determined. The GLU1 DNA hybridized to a polyadenylated RNA of 2.1 kilobases. A single open reading frame codes for a 519-amino-acid protein which contains four potential N-glycosylation sites. The putative precursor begins with a hydrophobic segment that presumably acts as a signal sequence for secretion. Glucoamylase was purified from a culture fluid of the yeast Saccharomyces cerevisiae which had been transformed with a plasmid carrying GLU1. The molecular weight of the protein was 57,000 by both gel filtration and acrylamide gel electrophoresis. The protein was glycosylated with asparagine-linked glycosides whose molecular weight was 2,000. The amino-terminal sequence of the protein began from the 28th amino acid residue from the first methionine of the putative precursor. The amino acid composition of the purified protein matched the predicted amino acid composition. These results confirmed that GLU1 encodes glucoamylase. A comparison of the amino acid sequence of glucoamylases from several fungi and yeast shows five highly conserved regions. One homology region is absent from the yeast enzyme and so may not be essential to glucoamylase function. Images PMID:3114236

  16. The Complete Nucleotide Sequence and Biotype Variability of Papaya leaf distortion mosaic virus.

    PubMed

    Maoka, Tetsuo; Hataya, Tatsuji

    2005-02-01

    The complete nucleotide sequence of the genome of Papaya leaf distortion mosaic virus (PLDMV) was determined. The viral RNA genome of strain LDM (leaf distortion mosaic) comprised 10,153 nucleotides, excluding the poly(A) tail, and contained one long open reading frame encoding a polyprotein of 3,269 amino acids (molecular weight 373,347). The polyprotein contained nine putative proteolytic cleavage sites and some motifs conserved in other potyviral polyproteins with 44 to 50% identities, indicating that PLDMV is a distinct species in the genus Potyvirus. Like the W biotype of Papaya ringspot virus (PRSV), the non-papaya-infecting biotype of PLDMV (PLDMV-C) was found in plants of the family Cucurbitaceae. The coat protein (CP) sequence of PLDMV-C in naturally infected Trichosanthes bracteata was compared with those of three strains of the P biotype (PLDMV-P), LDM and two additional strains M (mosaic) and YM (yellow mosaic), which are biologically different from each other. The CP sequences of three strains of PLDMV-P share high identities of 95 to 97%, while they share lower identities of 88 to 89% with that of PLDMV-C. Significant changes in hydrophobicity and a deletion of two amino acids at the N-terminal region of the CP of PLDMV-C were observed. The finding of two biotypes of PLDMV implies the possibility that the papaya-infecting biotype evolved from the cucurbitaceae-infecting potyvirus, as has been previously suggested for PRSV. In addition, a similar evolutionary event acquiring infectivity to papaya may arise frequently in viruses in the family Cucurbitaceae.

  17. Complete nucleotide sequences of two isolates of cherry green ring mottle virus from peach (Prunus persica) in China.

    PubMed

    Wang, Lihui; Jiang, Dongmei; Niu, Feiqing; Lu, Meiguang; Wang, Hongqing; Li, Shifang

    2013-03-01

    Two complete nucleotide sequences of cherry green ring mottle virus (CGRMV) isolated from peach in Hebei (Hs10) and Fujian (F9) Provinces, China, were determined. Five open reading frames (ORFs) were found in the genomes of both isolates. The F9 and Hs10 isolates shared 82.2% and 83.4-94.4% nucleotide sequence identity, respectively, with two CGRMV isolates from cherry. Analysis of the nucleotide and amino acid sequences from the five ORFs of both isolates showed that Hs10 shares the greatest sequence identity with P1A (GenBank AJ291761) from cherry. Phylogenetic analysis indicated that CGRMV isolates from peach and cherry are closely related to members of the genus Foveavirus.

  18. [Molecular cloning and analysis of cDNA sequences encoding serine proteinase and Kunitz type inhibitor in venom gland of Vipera nikolskii viper].

    PubMed

    Ramazanova, A S; Fil'kin, S Iu; Starkov, V G; Utkin, Iu N

    2011-01-01

    Serine proteinases and Kunitz type inhibitors are widely represented in venoms of snakes from different genera. During the study of the venoms from snakes inhabiting Russia we have cloned cDNAs encoding new proteins belonging to these protein families. Thus, a new serine proteinase called nikobin was identified in the venom gland of the Vipera nikolskii viper. By amino acid sequence deduced from the cDNA sequence, nikobin differs from serine proteinases identified in other snake species. The nikobin amino acid sequence contains 15 unique substitutions. This is the first serine proteinase of a viper from the Vipera genus for which a complete amino acid sequence has been established. The cDNA encoding a Kunitz type inhibitor was also cloned. The deduced amino acid sequence of the inhibitor is homologous to those of other proteins from snakes of the Vipera genus. However, there are several unusual amino acid substitutions that might result in a change of the biological activity of the inhibitor.

  19. cDNA and deduced amino acid sequence of human pulmonary surfactant-associated proteolipid SPL(Phe)

    SciTech Connect

    Glasser, S.W.; Korfhagen, T.R.; Weaver, T.; Pilot-Matias, T.; Fox, J.L.; Whitsett, J.A.

    1987-06-01

    Hydrophobic surfactant-associated protein of Mr 6000-14,000 was isolated from ether/ethanol or chloroform/methanol extracts of mammalian pulmonary surfactant. Automated Edman degradation in a gas-phase sequencer showed the major N-terminus of the human low molecular weight protein to be Phe-Pro-Ile-Pro-Leu-Pro-Tyr-Cys-Trp-Leu-Cys-Arg-Ala-Leu-. Because of the N-terminal phenylalanine, the surfactant protein was designated SPL(Phe). Antiserum generated against hydrophobic surfactant protein(s) from bovine pulmonary surfactant recognized protein of Mr 6000-14,000 in immunoblot analysis and was used to screen a lambda gt11 expression library constructed from adult human lung poly(A)+ RNA. This resulted in identification of a 1.4-kilobase cDNA clone that was shown to encode the N-terminus of the surfactant polypeptide SPL(Phe) (Phe-Pro-Ile-Pro-Leu-Pro-) within an open reading frame for a larger protein. Expression of a fused β-galactosidase-SPL(Phe) gene in Escherichia coli yielded an immunoreactive Mr 34,000 fusion peptide. Hybrid-arrested translation with the cDNA and immunoprecipitation of [35S]methionine-labeled in vitro translation products of human poly(A)+ RNA with a surfactant polyclonal antibody resulted in identification of an Mr 40,000 precursor protein. Blot hybridization analysis of electrophoretically fractionated RNA from human lung detected a 2.0-kilobase RNA that was more abundant in adult lung than in fetal lung. These proteins, and specifically SPL(Phe), may therefore be useful for synthesis of replacement surfactants for treatment of hyaline membrane disease in newborn infants or of other surfactant-deficient states.

  20. The nucleotide sequence of the human int-1 mammary oncogene; evolutionary conservation of coding and non-coding sequences.

    PubMed Central

    van Ooyen, A; Kwee, V; Nusse, R

    1985-01-01

    The mouse mammary tumor virus can induce mammary tumors in mice by proviral activation of an evolutionarily conserved cellular oncogene called int-1. Here we present the nucleotide sequence of the human homologue of int-1, and compare it with the mouse gene. Like the mouse gene, the human homologue contains a reading frame of 370 amino acids, with only four substitutions. The amino acid changes are all in the hydrophobic leader domain of the int-1 encoded protein, and do not significantly alter its hydropathic index. The conservation between the mouse and the human int-1 genes is not restricted to exons; extensive parts of the introns are also homologous. Thus, int-1 ranks among the most conserved genes known, a property shared with other oncogenes. PMID:2998762

  1. The venom gland transcriptome of Latrodectus tredecimguttatus revealed by deep sequencing and cDNA library analysis.

    PubMed

    He, Quanze; Duan, Zhigui; Yu, Ying; Liu, Zhen; Liu, Zhonghua; Liang, Songping

    2013-01-01

    Latrodectus tredecimguttatus, commonly known as black widow spider, is well known for its dangerous bite. Although its venom has been characterized extensively, some fundamental questions about its molecular composition remain unanswered. The limited transcriptome and genome data available prevent further understanding of spider venom at the molecular level. In the present study, we combined next-generation sequencing and conventional DNA sequencing to construct a venom gland transcriptome of the spider L. tredecimguttatus, which resulted in the identification of 9,666 and 480 high-confidence proteins among 34,334 de novo sequences and 1,024 cDNA sequences, respectively, by assembly, translation, filtering, quantification and annotation. Extensive functional analyses of these proteins indicated that mRNAs involved in RNA transport and spliceosome, protein translation, processing and transport were highly enriched in the venom gland, which is consistent with the specific function of venom glands, namely the production of toxins. Furthermore, we identified 146 toxin-like proteins forming 12 families, including 6 new families in this spider in which α-LTX-Lt1a family2 is identified for the first time as a subfamily of α-LTX-Lt1a family. The toxins were classified according to their bioactivities into five categories that functioned in a coordinated way. Few ion channels were expressed in venom gland cells, suggesting a possible mechanism of protection from the attack of their own toxins. The present study provides a gland transcriptome profile and extends our understanding of the toxinome of spiders and coordination mechanism for toxin production in protein expression quantity.

  2. The Venom Gland Transcriptome of Latrodectus tredecimguttatus Revealed by Deep Sequencing and cDNA Library Analysis

    PubMed Central

    He, Quanze; Duan, Zhigui; Yu, Ying; Liu, Zhen; Liu, Zhonghua; Liang, Songping

    2013-01-01

    Latrodectus tredecimguttatus, commonly known as black widow spider, is well known for its dangerous bite. Although its venom has been characterized extensively, some fundamental questions about its molecular composition remain unanswered. The limited transcriptome and genome data available prevent further understanding of spider venom at the molecular level. In the present study, we combined next-generation sequencing and conventional DNA sequencing to construct a venom gland transcriptome of the spider L. tredecimguttatus, which resulted in the identification of 9,666 and 480 high-confidence proteins among 34,334 de novo sequences and 1,024 cDNA sequences, respectively, by assembly, translation, filtering, quantification and annotation. Extensive functional analyses of these proteins indicated that mRNAs involved in RNA transport and spliceosome, protein translation, processing and transport were highly enriched in the venom gland, which is consistent with the specific function of venom glands, namely the production of toxins. Furthermore, we identified 146 toxin-like proteins forming 12 families, including 6 new families in this spider in which α-LTX-Lt1a family2 is identified for the first time as a subfamily of α-LTX-Lt1a family. The toxins were classified according to their bioactivities into five categories that functioned in a coordinated way. Few ion channels were expressed in venom gland cells, suggesting a possible mechanism of protection from the attack of their own toxins. The present study provides a gland transcriptome profile and extends our understanding of the toxinome of spiders and coordination mechanism for toxin production in protein expression quantity. PMID:24312294

  3. UMD‐Predictor: A High‐Throughput Sequencing Compliant System for Pathogenicity Prediction of any Human cDNA Substitution

    PubMed Central

    Salgado, David; Desvignes, Jean‐Pierre; Rai, Ghadi; Blanchard, Arnaud; Miltgen, Morgane; Pinard, Amélie; Lévy, Nicolas; Collod‐Béroud, Gwenaëlle

    2016-01-01

    Whole‐exome sequencing (WES) is increasingly applied to research and clinical diagnosis of human diseases. It typically results in large amounts of genetic variations. Depending on the mode of inheritance, only one or two correspond to pathogenic mutations responsible for the disease and present in affected individuals. Therefore, it is crucial to filter out nonpathogenic variants and limit downstream analysis to a handful of candidate mutations. We have developed a new computational combinatorial system UMD‐Predictor (http://umd‐predictor.eu) to efficiently annotate cDNA substitutions of all human transcripts for their potential pathogenicity. It combines biochemical properties, impact on splicing signals, localization in protein domains, variation frequency in the global population, and conservation through the BLOSUM62 global substitution matrix and a protein‐specific conservation among 100 species. We compared its accuracy with the seven most used and reliable prediction tools, using the largest reference variation datasets including more than 140,000 annotated variations. This system consistently demonstrated a better accuracy, specificity, Matthews correlation coefficient, diagnostic odds ratio, speed, and provided the shortest list of candidate mutations for WES. Webservices allow its implementation in any bioinformatics pipeline for next‐generation sequencing analysis. It could benefit a wide range of users and applications varying from gene discovery to clinical diagnosis. PMID:26842889

  4. Nucleotide sequences related to the transforming gene of avian sarcoma virus are present in DNA of uninfected vertebrates.

    PubMed

    Spector, D H; Varmus, H E; Bishop, J M

    1978-09-01

    We have detected nucleotide sequences related to the transforming gene of avian sarcoma virus (ASV) in the DNA of uninfected vertebrates. Purified radioactive DNA (cDNAsarc) complementary to most or all of the gene (src) required for transformation of fibroblasts by ASV was annealed with DNA from a variety of normal species. Under conditions that facilitate pairing of partially matched nucleotide sequences (1.5 M NaCl, 59 degrees), cDNAsarc formed duplexes with chicken, human, calf, mouse, and salmon DNA but not with DNA from sea urchin, Drosophila, or Escherichia coli. The kinetics of duplex formation indicated that cDNAsarc was reacting with nucleotide sequences present in a single copy or at most a few copies per cell. In contrast to the preceding findings, nucleotide sequences complementary to the remainder of the ASV genome were observed only in chicken DNA. Thermal denaturation studies of the duplexes formed with cDNAsarc indicated a high degree of conservation of the nucleotide sequences related to src in vertebrate DNAs; the reductions in melting temperature suggested about 3-4% mismatching of cDNAsarc with chicken DNA and 8-10% mismatching of cDNAsarc with the other vertebrate DNAs.

  5. Plastid sequence evolution: a new pattern of nucleotide substitutions in the Cucurbitaceae.

    PubMed

    Decker-Walters, Deena S; Chung, Sang-Min; Staub, Jack E

    2004-05-01

    Nucleotide substitutions (i.e., point mutations) are the primary driving force in generating DNA variation upon which selection can act. Substitutions called transitions, which entail exchanges between purines (A = adenine, G = guanine) or pyrimidines (C = cytosine, T = thymine), typically outnumber transversions (e.g., exchanges between a purine and a pyrimidine) in a DNA strand. With an increasing number of plant studies revealing a transversion rather than transition bias, we chose to perform a detailed substitution analysis for the plant family Cucurbitaceae using data from several short plastid DNA sequences. We generated a phylogenetic tree for 19 taxa of the tribe Benincaseae and related genera and then scored conservative substitution changes (e.g., those not exhibiting homoplasy or reversals) from the unambiguous branches of the tree. Neither the transition nor (A+T)/(G+C) biases found in previous studies were supported by our overall data. More importantly, we found a novel and symmetrical substitution bias in which Gs had been preferentially replaced by A, As by C, Cs by T, and Ts by G, resulting in the G-->A-->C-->T-->G substitution series. Understanding this pattern will lead to new hypotheses concerning plastid evolution, which in turn will affect the choices of substitution models and other tree-building algorithms for phylogenetic analyses based on nucleotide data.
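
    The transition/transversion distinction defined in this abstract can be sketched in a few lines of Python. This is only an illustration of the definitions, not the authors' scoring procedure; the function names are hypothetical, and the four example substitutions are simply the steps of the reported G-->A-->C-->T-->G series.

```python
from collections import Counter

# Base classes as defined in the abstract.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base change.

    A transition stays within one class (purine<->purine or
    pyrimidine<->pyrimidine); a transversion crosses classes.
    """
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a valid substitution: {ref!r} -> {alt!r}")
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def tally(substitutions):
    """Count transitions and transversions in (ref, alt) pairs."""
    return Counter(classify_substitution(r, a) for r, a in substitutions)


# The four steps of the series reported in the abstract.
series = [("G", "A"), ("A", "C"), ("C", "T"), ("T", "G")]
counts = tally(series)
```

    Under these definitions the series alternates between the two kinds of change: G-->A and C-->T are transitions, while A-->C and T-->G are transversions.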

  6. Nucleotide sequence of the gene ereA encoding the erythromycin esterase in Escherichia coli.

    PubMed

    Ounissi, H; Courvalin, P

    1985-01-01

    We have cloned and determined the nucleotide sequence of the gene ereA of plasmid pIP1100 which confers high-level resistance to erythromycin (Em) in Escherichia coli. The gene was defined by initiation and termination codons and by in vitro insertion-inactivation into an open reading frame (ORF) of 1032 bp corresponding to a product with an Mr of 37 765. However, the enzyme, an Em esterase, displayed an apparent Mr of 43 000 upon electrophoresis of a minicell extract on the SDS-polyacrylamide gels. The G + C content (50.5%) of the gene ereA and the preferential codon usage in its ORF suggest that this resistance determinant should be indigenous to E. coli.
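
    Two quantities in this abstract, the G + C content and the size of the product encoded by a 1032-bp ORF, follow from simple arithmetic, sketched below in Python. The helper names are hypothetical, and whether the reported 1032 bp includes the termination codon is an assumption exposed as a parameter, since the abstract does not say.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


def orf_residues(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of orf_bp base pairs.

    includes_stop is an assumption about how the ORF length was counted:
    if True, one codon is the termination codon and encodes no residue.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# Values taken from the abstract: ORF of 1032 bp, calculated Mr of 37 765.
residues = orf_residues(1032)            # 344 codons, one of them the stop
mean_residue_mass = 37765 / residues     # average mass per residue, in Da
```

    The resulting average of roughly 110 Da per residue is in line with the usual rule of thumb for proteins, which is consistent with the calculated Mr quoted in the abstract.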

  7. Developing single nucleotide polymorphism (SNP) markers from transcriptome sequences for identification of longan (Dimocarpus longan) germplasm

    PubMed Central

    Wang, Boyi; Tan, Hua-Wei; Fang, Wanping; Meinhardt, Lyndel W; Mischke, Sue; Matsumoto, Tracie; Zhang, Dapeng

    2015-01-01

    Longan (Dimocarpus longan Lour.) is an important tropical fruit tree crop. Accurate varietal identification is essential for germplasm management and breeding. Using longan transcriptome sequences from public databases, we developed single nucleotide polymorphism (SNP) markers; validated 60 SNPs in 50 longan germplasm accessions, including cultivated varieties and wild germplasm; and designated 25 SNP markers that unambiguously identified all tested longan varieties with high statistical rigor (P<0.0001). Multiple trees from the same clone were verified and off-type trees were identified. Diversity analysis revealed genetic relationships among analyzed accessions. Cultivated varieties differed significantly from wild populations (Fst=0.300; P<0.001), demonstrating untapped genetic diversity for germplasm conservation and utilization. Within cultivated varieties, apparent differences between varieties from China and those from Thailand and Hawaii indicated geographic patterns of genetic differentiation. These SNP markers provide a powerful tool to manage longan genetic resources and breeding, with accurate and efficient genotype identification. PMID:26504559

  8. Cloning, overexpression and nucleotide sequence of a thermostable DNA ligase-encoding gene.

    PubMed

    Barany, F; Gelfand, D H

    1991-12-20

    Thermostable DNA ligase has been harnessed for the detection of single-base genetic diseases using the ligase chain reaction [Barany, Proc. Natl. Acad. Sci. USA 88 (1991) 189-193]. The Thermus thermophilus (Tth) DNA ligase-encoding gene (ligT) was cloned in Escherichia coli by genetic complementation of a ligts 7 defect in an E. coli host. Nucleotide sequence analysis of the gene revealed a single chain of 676 amino acid residues with 47% identity to the E. coli ligase. Under phoA promoter control, Tth ligase was overproduced to greater than 10% of E. coli cellular proteins. Adenylated and deadenylated forms of the purified enzyme were distinguished by apparent molecular weights of 81 kDa and 78 kDa, respectively, after separation via sodium dodecyl sulfate-polyacrylamide-gel electrophoresis.

  9. Nucleotide sequence of the genetic loci encoding subunits of Bradyrhizobium japonicum uptake hydrogenase.

    PubMed Central

    Sayavedra-Soto, L A; Powell, G K; Evans, H J; Morris, R O

    1988-01-01

    An indispensable part of the hydrogen-recycling system in Bradyrhizobium japonicum is the uptake hydrogenase, which is composed of 34.5- and 65.9-kDa subunits. The gene encoding the large subunit is located on a 5.9-kilobase fragment of the H2-uptake-complementing cosmid pHU52 [Zuber, M., Harker, A.R., Sultana, M.A. & Evans, H.J. (1986) Proc. Natl. Acad. Sci. USA 83, 7668-7672]. We have now determined that the structural genes for both subunits are present on this fragment. Two open reading frames are present that correspond in size and deduced amino acid sequence to the hydrogenase subunits, except that the small-subunit coding region contains a leader peptide of 46 amino acids. The two genes are separated by a 32-nucleotide intergenic region and likely constitute an operon. Comparison of the deduced amino acid sequences of the B. japonicum genes with those from Desulfovibrio gigas, Desulfovibrio baculatus, and Rhodobacter capsulatus indicates significant sequence identity. Images PMID:3054886

  10. Mining for single nucleotide polymorphisms and insertions / deletions in expressed sequence tag libraries of oil palm.

    PubMed

    Riju, Aykkal; Chandrasekar, Arumugam; Arunachalam, Vadivel

    2007-01-01

    The oil palm is a tropical oil-bearing tree. SNPs and SSRs derived from ESTs (Expressed Sequence Tags) are a free by-product of the currently expanding EST databases. The development of high-throughput methods for the detection of SNPs (Single Nucleotide Polymorphisms) and small indels (insertions / deletions) has led to a revolution in their use as molecular markers. Available oil palm EST sequences (5452) were mined from dbEST of NCBI. The CAP3 program was used to assemble EST sequences into contigs. Candidate SNP and indel polymorphisms were detected using the Perl script auto_snip version 1.0, which used 576 ESTs for detecting SNP and indel sites. We found 1180 SNP sites and 137 indel polymorphisms, with a frequency of 1.36 SNPs / 100 bp. Among the six tissues from which the EST libraries had been generated, mesocarp had a high frequency of 2.91 SNPs and indels per 100 bp, whereas the zygotic embryos had the lowest frequency of 0.15 per 100 bp. We also used the Shannon index to analyze the proportion of ten possible types of SNP/indels. ESTs from tissues of normal apex showed the highest value of the Shannon index (0.60), whereas abnormal apex had the lowest value (0.02). The present report deals with the use of the Shannon index for comparing SNP/indel frequencies mined from EST libraries and also confirms the frequency of SNP occurrence in oil palm, supporting their use as markers for genetic studies.
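
    The two summary statistics used in this abstract, polymorphism frequency per 100 bp and the Shannon index over polymorphism types, can be sketched as follows. This is a generic illustration with hypothetical function names. The abstract does not state which logarithm base or normalization the authors used, so the natural logarithm here is an assumption and the values it gives need not match the reported 0.60 and 0.02.

```python
import math


def per_100_bp(n_sites: int, total_bp: int) -> float:
    """Polymorphic sites per 100 bp of aligned sequence."""
    if total_bp <= 0:
        raise ValueError("total_bp must be positive")
    return 100.0 * n_sites / total_bp


def shannon_index(counts) -> float:
    """Shannon index H = -sum(p_i * ln p_i) over category counts.

    counts holds the number of observations in each category, e.g. the
    ten possible SNP/indel types. Empty categories contribute nothing.
    Natural log is an assumption; the source does not give the base.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("need at least one observation")
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


# Toy counts (not data from the paper): an even spread over ten types
# gives the maximum possible value, ln(10); a single type gives zero.
even = shannon_index([7] * 10)
single = shannon_index([70] + [0] * 9)
```

    A library whose polymorphisms fall almost entirely into one type scores near zero, while one spread evenly over all ten types scores highest, which is the contrast the abstract draws between abnormal and normal apex tissue.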

  11. Cloning and nucleotide sequence of a specific DNA fragment from Paracoccidioides brasiliensis.

    PubMed

    Goldani, L Z; Maia, A L; Sugar, A M

    1995-06-01

    We cloned and sequenced a species-specific 110-bp DNA fragment from Paracoccidioides brasiliensis. The DNA fragment was generated by PCR with primers complementary to the rat beta-actin gene under a low annealing temperature. Comparison of the nucleotide sequence, after excluding the primers, with those in the GenBank database identified approximately 60% homology with an exon of a major surface glycoprotein gene from Pneumocystis carinii and a fragment of unknown function in Saccharomyces cerevisiae chromosome VIII. By Southern hybridization analysis, the 32P-labelled fragment detected 1.0- and 1.9-kb restriction fragments within whole-cell genomic DNA of P. brasiliensis digested with HindIII and PstI, respectively, but failed to hybridize to genomic DNAs from Candida albicans, Blastomyces dermatitidis, Cryptococcus neoformans, Aspergillus fumigatus, Saccharomyces cerevisiae, Pneumocystis carinii, rat tissue, or humans under low-stringency hybridization conditions. Additionally, the specific DNA fragment from three different P. brasiliensis isolates (Pb18, RP18, RP17) was amplified by PCR with primers mostly complementary to nonactin sequences of the 110-bp DNA fragment. In contrast, there were no amplified products from other fungus genomic DNAs previously tested, including Histoplasma capsulatum. To date, this is the first species-specific DNA fragment cloned from P. brasiliensis which might be useful as a diagnostic marker for the identification and classification of different P. brasiliensis isolates.

  12. Modulation of base excision repair of 8-oxoguanine by the nucleotide sequence.

    PubMed

    Allgayer, Julia; Kitsera, Nataliya; von der Lippen, Carina; Epe, Bernd; Khobta, Andriy

    2013-10-01

    8-Oxoguanine (8-oxoG) is a major product of oxidative DNA damage, which induces replication errors and interferes with transcription. By varying the position of a single 8-oxoG in a functional gene and manipulating the nucleotide sequence surrounding the lesion, we found that the degree of transcriptional inhibition is independent of the distance from the transcription start or the localization within the transcribed or the non-transcribed DNA strand. However, it is strongly dependent on the sequence context and also proportional to cellular expression of 8-oxoguanine DNA glycosylase (OGG1), demonstrating that transcriptional arrest does not take place at unrepaired 8-oxoG and proving a causal connection between 8-oxoG excision and the inhibition of transcription. We identified the 5'-CAGGGC[8-oxoG]GACTG-3' motif as having only minimal transcription-inhibitory potential in cells, based on which we predicted that 8-oxoG excision is particularly inefficient in this sequence context. This anticipation was fully confirmed by direct biochemical assays. Furthermore, in DNA containing a bistranded Cp[8-oxoG]/Cp[8-oxoG] clustered lesion, the excision rates differed between the two strands at least by a factor of 9, clearly demonstrating that the excision preference is defined by the DNA strand asymmetry rather than the overall geometry of the double helix or local duplex stability.

  13. The complete nucleotide sequence and genomic characterization of grapevine asteroid mosaic associated virus.

    PubMed

    Vargas-Asencio, José; Wojciechowska, Klaudia; Baskerville, Maia; Gomez, Annika L; Perry, Keith L; Thompson, Jeremy R

    2017-01-02

    In analyzing grapevine clones infected with grapevine red blotch associated virus, we identified a small number of isometric particles of approximately 30 nm in diameter from an enriched fraction of leaf extract. A dominant protein of 25 kDa was isolated from this fraction using SDS-PAGE and was identified by mass spectrometry as belonging to grapevine asteroid mosaic associated virus (GAMaV). Using a combination of three methods (RNA-Seq, sRNA-Seq, and Sanger sequencing of RT- and RACE-PCR products), we obtained a full-length genome sequence consisting of 6719 nucleotides without the poly(A) tail. The virus possesses all of the typical conserved functional domains concordant with the genus Marafivirus and lies evolutionarily between citrus sudden death associated virus and oat blue dwarf virus. A large shift in RNA-Seq coverage coincided with the predicted location of the subgenomic RNA involved in coat protein (CP) expression. Genus-wide sequence alignments confirmed the cleavage motif LxG(G/A) to be dominant between the helicase and RNA dependent RNA polymerase (RdRp), and the RdRp and CP domains. A putative overlapping protein (OP) ORF lacking a canonical translational start codon was identified with a reading frame context more consistent with the putative OPs of tymoviruses and fig fleck associated virus than with those of marafiviruses. BLAST analysis of the predicted GAMaV OP showed a unique relatedness to the OPs of members of the genus Tymovirus.
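
    The cleavage motif LxG(G/A) named in this abstract reads as a short pattern over a protein sequence: leucine, any residue, glycine, then glycine or alanine. A minimal Python sketch of scanning for it with a regular expression is given below. The function name and the toy sequence are hypothetical, and a pattern match alone says nothing about whether a site is actually cleaved.

```python
import re

# L, any residue, G, then G or A. The lookahead makes matches overlap-safe,
# so two motif instances that share residues are both reported.
CLEAVAGE_MOTIF = re.compile(r"(?=(L.G[GA]))")


def find_cleavage_motifs(protein: str):
    """Return (0-based start, matched residues) for each LxG(G/A) hit."""
    return [(m.start(), m.group(1))
            for m in CLEAVAGE_MOTIF.finditer(protein.upper())]


# Toy sequence for illustration only; it is not from the GAMaV polyprotein.
hits = find_cleavage_motifs("MKTLAGGSPQLVGATR")
```

    In practice such a scan would be run across aligned polyproteins of the genus, with candidate sites kept only where they fall at the expected domain boundaries.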

  14. Cloning and nucleotide sequence of a specific DNA fragment from Paracoccidioides brasiliensis.

    PubMed Central

    Goldani, L Z; Maia, A L; Sugar, A M

    1995-01-01

    We cloned and sequenced a species-specific 110-bp DNA fragment from Paracoccidioides brasiliensis. The DNA fragment was generated by PCR with primers complementary to the rat beta-actin gene under a low annealing temperature. Comparison of the nucleotide sequence, after excluding the primers, with those in the GenBank database identified approximately 60% homology with an exon of a major surface glycoprotein gene from Pneumocystis carinii and a fragment of unknown function in Saccharomyces cerevisiae chromosome VIII. By Southern hybridization analysis, the 32P-labelled fragment detected 1.0- and 1.9-kb restriction fragments within whole-cell genomic DNA of P. brasiliensis digested with HindIII and PstI, respectively, but failed to hybridize to genomic DNAs from Candida albicans, Blastomyces dermatitidis, Cryptococcus neoformans, Aspergillus fumigatus, Saccharomyces cerevisiae, Pneumocystis carinii, rat tissue, or humans under low-stringency hybridization conditions. Additionally, the specific DNA fragment from three different P. brasiliensis isolates (Pb18, RP18, RP17) was amplified by PCR with primers mostly complementary to nonactin sequences of the 110-bp DNA fragment. In contrast, there were no amplified products from other fungus genomic DNAs previously tested, including Histoplasma capsulatum. To date, this is the first species-specific DNA fragment cloned from P. brasiliensis which might be useful as a diagnostic marker for the identification and classification of different P. brasiliensis isolates. PMID:7650207

  15. Complete nucleotide sequence of the mitochondrial genome of a salamander, Mertensiella luschani.

    PubMed

    Zardoya, Rafael; Malaga-Trillo, Edward; Veith, Michael; Meyer, Axel

    2003-10-23

    The complete nucleotide sequence (16,650 bp) of the mitochondrial genome of the salamander Mertensiella luschani (Caudata, Amphibia) was determined. This molecule conforms to the consensus vertebrate mitochondrial gene order. However, it is characterized by a long non-coding intervening sequence with two 124-bp repeats between the tRNA(Thr) and tRNA(Pro) genes. The new sequence data were used to reconstruct a phylogeny of jawed vertebrates. Phylogenetic analyses of all mitochondrial protein-coding genes at the amino acid level recovered a robust vertebrate tree in which lungfishes are the closest living relatives of tetrapods, salamanders and frogs are grouped together to the exclusion of caecilians (the Batrachia hypothesis) in a monophyletic amphibian clade, turtles show diapsid affinities and are placed as sister group of crocodiles+birds, and the marsupials are grouped together with monotremes and basal to placental mammals. The deduced phylogeny was used to characterize the molecular evolution of vertebrate mitochondrial proteins. Amino acid frequencies were analyzed across the main lineages of jawed vertebrates, and leucine and cysteine were found to be the most and least abundant amino acids in mitochondrial proteins, respectively. Patterns of amino acid replacements were conserved among vertebrates. Overall, cartilaginous fishes showed the least variation in amino acid frequencies and replacements. Constancy of rates of evolution among the main lineages of jawed vertebrates was rejected.

  16. Complete Nucleotide Sequence of Watermelon Chlorotic Stunt Virus Originating from Oman

    PubMed Central

    Khan, Akhtar J.; Akhtar, Sohail; Briddon, Rob W.; Ammara, Um; Al-Matrooshi, Abdulrahman M.; Mansoor, Shahid

    2012-01-01

    Watermelon chlorotic stunt virus (WmCSV) is a bipartite begomovirus (genus Begomovirus, family Geminiviridae) that causes economic losses to cucurbits, particularly watermelon, across the Middle East and North Africa. Recently squash (Cucurbita moschata) grown in an experimental field in Oman was found to display symptoms such as leaf curling, yellowing and stunting, typical of a begomovirus infection. Sequence analysis of the virus isolated from squash showed 97.6–99.9% nucleotide sequence identity to previously described WmCSV isolates for the DNA A component and 93–98% identity for the DNA B component. Agrobacterium-mediated inoculation to Nicotiana benthamiana resulted in the development of symptoms fifteen days post inoculation. This is the first bipartite begomovirus identified in Oman. Overall the Oman isolate showed the highest levels of sequence identity to a WmCSV isolate originating from Iran, which was confirmed by phylogenetic analysis. This suggests that WmCSV present in Oman has been introduced from Iran. The significance of this finding is discussed. PMID:22852046

  17. High-throughput nucleotide sequence analysis of diverse bacterial communities in leachates of decomposing pig carcasses.

    PubMed

    Yang, Seung Hak; Lim, Joung Soo; Khan, Modabber Ahmed; Kim, Bong Soo; Choi, Dong Yoon; Lee, Eun Young; Ahn, Hee Kwon

    2015-01-01

    The leachate generated by the decomposition of animal carcasses has been implicated as an environmental contaminant surrounding the burial site. High-throughput nucleotide sequencing was conducted to investigate the bacterial communities in leachates from the decomposition of pig carcasses. We acquired 51,230 reads from six different samples (1, 2, 3, 4, 6 and 14 week-old carcasses) and found that sequences representing the phylum Firmicutes predominated. The diversity of bacterial 16S rRNA gene sequences in the leachate was the highest at 6 weeks, in contrast to those at 2 and 14 weeks. The relative abundance of Firmicutes was reduced, while the proportion of Bacteroidetes and Proteobacteria increased from 3-6 weeks. The representation of phyla was restored after 14 weeks. However, the community structures between the samples taken at 1-2 and 14 weeks differed at the bacterial classification level. The trend in pH was similar to the changes seen in bacterial communities, indicating that the pH of the leachate could be related to the shift in the microbial community. The results indicate that the composition of bacterial communities in leachates of decomposing pig carcasses shifted continuously during the study period and might be influenced by the burial site.

  18. High-throughput nucleotide sequence analysis of diverse bacterial communities in leachates of decomposing pig carcasses

    PubMed Central

    Yang, Seung Hak; Lim, Joung Soo; Khan, Modabber Ahmed; Kim, Bong Soo; Choi, Dong Yoon; Lee, Eun Young; Ahn, Hee Kwon

    2015-01-01

    The leachate generated by the decomposition of animal carcasses has been implicated as an environmental contaminant surrounding the burial site. High-throughput nucleotide sequencing was conducted to investigate the bacterial communities in leachates from the decomposition of pig carcasses. We acquired 51,230 reads from six different samples (1, 2, 3, 4, 6 and 14 week-old carcasses) and found that sequences representing the phylum Firmicutes predominated. The diversity of bacterial 16S rRNA gene sequences in the leachate was the highest at 6 weeks, in contrast to those at 2 and 14 weeks. The relative abundance of Firmicutes was reduced, while the proportion of Bacteroidetes and Proteobacteria increased from 3–6 weeks. The representation of phyla was restored after 14 weeks. However, the community structures between the samples taken at 1–2 and 14 weeks differed at the bacterial classification level. The trend in pH was similar to the changes seen in bacterial communities, indicating that the pH of the leachate could be related to the shift in the microbial community. The results indicate that the composition of bacterial communities in leachates of decomposing pig carcasses shifted continuously during the study period and might be influenced by the burial site. PMID:26500442

  19. The complete nucleotide sequence and genome organization of tomato chlorosis virus.

    PubMed

    Wintermantel, W M; Wisler, G C; Anchieta, A G; Liu, H-Y; Karasev, A V; Tzanetakis, I E

    2005-11-01

    The crinivirus tomato chlorosis virus (ToCV) was discovered initially in diseased tomato and has since been identified as a serious problem for tomato production in many parts of the world, particularly in the United States, Europe and Southeast Asia. The complete nucleotide sequence of ToCV was determined and compared with related crinivirus species. RNA 1 is organized into four open reading frames (ORFs), and encodes proteins involved in replication, based on homology to other viral replication factors. RNA 2 is composed of nine ORFs including genes that encode a HSP70 homolog and two proteins involved in encapsidation of viral RNA, referred to as the coat protein and minor coat protein. Sequence homology between ToCV and other criniviruses varies throughout the viral genome. The minor coat protein (CPm) of ToCV, which forms part of the "rattlesnake tail" of virions and may be involved in determining the unique, broad vector transmissibility of ToCV, is larger than the CPm of lettuce infectious yellows virus (LIYV) by 217 amino acids. Among sequenced criniviruses, considerable variability exists in the size of some viral proteins. Analysis of these differences with respect to biological function may provide insight into the role crinivirus proteins play in virus infection and transmission.

  20. Human ribosomal RNA gene: nucleotide sequence of the transcription initiation region and comparison of three mammalian genes.

    PubMed Central

    Financsek, I; Mizumoto, K; Mishima, Y; Muramatsu, M

    1982-01-01

    The transcription initiation site of the human ribosomal RNA gene (rDNA) was located by using the single-strand specific nuclease protection method and by determining the first nucleotide of the in vitro capped 45S preribosomal RNA. The sequence of 1,211 nucleotides surrounding the initiation site was determined. The sequenced region was found to consist of 75% G and C and to contain a number of short direct and inverted repeats and palindromes. By comparison of the corresponding initiation regions of three mammalian species, several conserved sequences were found upstream and downstream from the transcription starting point. Two short A + T-rich sequences are present on human, mouse, and rat ribosomal RNA genes between the initiation site and 40 nucleotides upstream, and a C + T cluster is located at a position around -60. At and downstream from the initiation site, a common sequence, T-AG-C-T-G-A-C-A-C-G-C-T-G-T-C-C-T-CT-T, was found in the three genes from position -1 through +18. The strong conservation of these sequences suggests their functional significance in rDNA. The S1 nuclease protection experiments with cloned rDNA fragments indicated the presence in human 45S RNA of molecules several hundred nucleotides shorter than the supposed primary transcript. The first 19 nucleotides of these molecules appear identical--except for one mismatch--to the nucleotide sequence of the 5' end of a supposed early processing product of the mouse 45S RNA. Images PMID:6954460
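
    The base-composition figure reported above (about 75% G and C) is a simple count over the sequenced region. A minimal Python sketch of that calculation follows; the function name gc_fraction and the toy sequence are illustrative assumptions, not data from the human rDNA record.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of unambiguous bases (A, C, G, T) that are G or C."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(base in "GC" for base in counted)
    return gc / len(counted)


# Hypothetical toy sequence, NOT taken from the human rDNA initiation region.
example = "GGCGCCATGCGGCCTAGCGC"
print(f"G+C content: {gc_fraction(example):.0%}")
```

    Ambiguity codes such as N are skipped rather than counted, which is one common convention; other tools may treat them differently.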

  1. Evaluation of Ancestral Sequence Reconstruction Methods to Infer Nonstationary Patterns of Nucleotide Substitution.

    PubMed

    Matsumoto, Tomotaka; Akashi, Hiroshi; Yang, Ziheng

    2015-07-01

    Inference of gene sequences in ancestral species has been widely used to test hypotheses concerning the process of molecular sequence evolution. However, the approach may produce spurious results, mainly because using the single best reconstruction while ignoring the suboptimal ones creates systematic biases. Here we implement methods to correct for such biases and use computer simulation to evaluate their performance when the substitution process is nonstationary. The methods we evaluated include parsimony and likelihood using the single best reconstruction (SBR), averaging over reconstructions weighted by the posterior probabilities (AWP), and a new method called expected Markov counting (EMC) that produces maximum-likelihood estimates of substitution counts for any branch under a nonstationary Markov model. We simulated base composition evolution on a phylogeny for six species, with different selective pressures on G+C content among lineages, and compared the counts of nucleotide substitutions recorded during simulation with the inference by different methods. We found that large systematic biases resulted from (i) the use of parsimony or likelihood with SBR, (ii) the use of a stationary model when the substitution process is nonstationary, and (iii) the use of the Hasegawa-Kishino-Yano (HKY) model, which is too simple to adequately describe the substitution process. The nonstationary general time reversible (GTR) model, used with AWP or EMC, accurately recovered the substitution counts, even in cases of complex parameter fluctuations. We discuss model complexity and the compromise between bias and variance and suggest that the new methods may be useful for studying complex patterns of nucleotide substitution in large genomic data sets.
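
    The contrast between the single best reconstruction (SBR) and averaging weighted by posterior probabilities (AWP) can be illustrated on one branch. The Python sketch below is a deliberate simplification and not the authors' implementation: it assumes that a joint posterior over (parent, child) states is already available for each site, and all names and probabilities are hypothetical.

```python
from collections import defaultdict


def sbr_counts(site_posteriors):
    """Count substitutions using only the most probable (parent, child) pair per site."""
    counts = defaultdict(float)
    for joint in site_posteriors:
        parent, child = max(joint, key=joint.get)
        if parent != child:
            counts[(parent, child)] += 1.0
    return dict(counts)


def awp_counts(site_posteriors):
    """Count substitutions by weighting every (parent, child) pair by its posterior."""
    counts = defaultdict(float)
    for joint in site_posteriors:
        for (parent, child), prob in joint.items():
            if parent != child:
                counts[(parent, child)] += prob
    return dict(counts)


# Hypothetical joint posteriors for three sites on a single branch.
sites = [
    {("A", "A"): 0.60, ("A", "G"): 0.40},
    {("C", "C"): 0.55, ("C", "T"): 0.45},
    {("G", "A"): 0.90, ("G", "G"): 0.10},
]

print("SBR:", sbr_counts(sites))
print("AWP:", awp_counts(sites))
```

    In this toy input SBR discards the two sites where a change is slightly less probable than no change, while AWP retains their fractional contribution; that is the kind of systematic difference the abstract describes.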

  2. The nucleotide sequence of cysteine transfer ribonucleic acid from baker's yeast. Identification of the products from partial degradation of the molecule and derivation of the complete sequence.

    PubMed Central

    Holness, N J; Atfield, G

    1976-01-01

    1. A series of large oligonucleotide fragments derived from tRNA Cys were separated chromatographically, and the sequence of each was deduced by examination of the products of digestion with pancreatic and T1 ribonucleases. 2. The location of the specific cleavage points in the nucleotide chain was similar to that produced by brief treatment with pancreatic ribonuclease. 3. The fragments could be arranged into two alternative sequences. The correct sequence was deduced by the sequential removal and identification of the first nine nucleotides from the 3'-end of the terminal half of the molecules. PMID:819006

  3. Binding studies of novel, non-mammalian enkephalins, structures predicted from frog and lungfish brain cDNA sequences.

    PubMed

    Bojnik, E; Magyar, A; Tóth, G; Bajusz, S; Borsodi, A; Benyhe, S

    2009-01-23

    Leu- and Met-enkephalin were the first endogenous opioid peptides identified in different mammalian species including the human. Comparative biochemical and bioinformatic evidence indicates that enkephalins are not limited to mammals. Various prodynorphin (PDYN) sequences in lower vertebrates revealed the presence of other enkephalin fingerprints in these precursor polypeptides. Among the novel enkephalins Ile-enkephalin (Tyr-Gly-Gly-Phe-Ile) was primarily observed in the African clawed frog (Xenopus laevis) PDYNs, while the structure of Phe-enkephalin (Tyr-Gly-Gly-Phe-Phe) was predicted by analyzing brain cDNA sequences encoding a PDYN of the African lungfish (Protopterus annectens). Ile-enkephalin can also be found in the PDYNs of four other fish species including the eel, bichir, zebrafish and tilapia, but no further occurrence for the Phe-enkephalin motif is available as yet. Based on sequencing data, the biological relevance of Phe- and Ile-enkephalin is suggested, because both of them can arise by regular posttranslational enzymatic processing of the respective neuropeptide precursors. In various receptor binding assays performed on rat brain membrane preparations both of the new peptides turned out to be moderate affinity opioids with a weak preference for the delta-opioid receptor (DOP) sites. Phe-enkephalin of the lungfish displayed rather unexpectedly low affinities toward the mu-opioid receptor (MOP) and DOP, while exhibiting moderate affinity toward the kappa-opioid receptor (KOP). In receptor-mediated G-protein activation assays measured by the stimulation of [(35)S]GTPgammaS binding, Met-enkephalin produced the highest stimulation followed by Leu-enkephalin, Ile-enkephalin and Phe-enkephalin, whereas the least efficacious among these endogenous peptides was still more effective than the prototype opiate agonist morphine in these functional tests.

  4. Detection and quantitation of single nucleotide polymorphisms, DNA sequence variations, DNA mutations, DNA damage and DNA mismatches

    DOEpatents

    McCutchen-Maloney, Sandra L.

    2002-01-01

    DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dipsticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such as TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding, which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera, which gives a decrease in signal.

  5. Real-time single-molecule electronic DNA sequencing by synthesis using polymer-tagged nucleotides on a nanopore array

    PubMed Central

    Fuller, Carl W.; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Bibillo, Arek; Stranges, P. Benjamin; Dorwart, Michael; Tao, Chuanjuan; Li, Zengmin; Guo, Wenjing; Shi, Shundi; Korenblum, Daniel; Trans, Andrew; Aguirre, Anne; Liu, Edward; Harada, Eric T.; Pollard, James; Bhat, Ashwini; Cech, Cynthia; Yang, Alexander; Arnold, Cleoma; Palla, Mirkó; Hovis, Jennifer; Chen, Roger; Morozova, Irina; Kalachikov, Sergey; Russo, James J.; Kasianowicz, John J.; Davis, Randy; Roever, Stefan; Church, George M.; Ju, Jingyue

    2016-01-01

    DNA sequencing by synthesis (SBS) offers a robust platform to decipher nucleic acid sequences. Recently, we reported a single-molecule nanopore-based SBS strategy that accurately distinguishes four bases by electronically detecting and differentiating four different polymer tags attached to the 5′-phosphate of the nucleotides during their incorporation into a growing DNA strand catalyzed by DNA polymerase. Further developing this approach, we report here the use of nucleotides tagged at the terminal phosphate with oligonucleotide-based polymers to perform nanopore SBS on an α-hemolysin nanopore array platform. We designed and synthesized several polymer-tagged nucleotides using tags that produce different electrical current blockade levels and verified they are active substrates for DNA polymerase. A highly processive DNA polymerase was conjugated to the nanopore, and the conjugates were complexed with primer/template DNA and inserted into lipid bilayers over individually addressable electrodes of the nanopore chip. When an incoming complementary-tagged nucleotide forms a tight ternary complex with the primer/template and polymerase, the tag enters the pore, and the current blockade level is measured. The levels displayed by the four nucleotides tagged with four different polymers captured in the nanopore in such ternary complexes were clearly distinguishable and sequence-specific, enabling continuous sequence determination during the polymerase reaction. Thus, real-time single-molecule electronic DNA sequencing data with single-base resolution were obtained. The use of these polymer-tagged nucleotides, combined with polymerase tethering to nanopores and multiplexed nanopore sensors, should lead to new high-throughput sequencing methods. PMID:27091962

  6. Real-time single-molecule electronic DNA sequencing by synthesis using polymer-tagged nucleotides on a nanopore array.

    PubMed

    Fuller, Carl W; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Bibillo, Arek; Stranges, P Benjamin; Dorwart, Michael; Tao, Chuanjuan; Li, Zengmin; Guo, Wenjing; Shi, Shundi; Korenblum, Daniel; Trans, Andrew; Aguirre, Anne; Liu, Edward; Harada, Eric T; Pollard, James; Bhat, Ashwini; Cech, Cynthia; Yang, Alexander; Arnold, Cleoma; Palla, Mirkó; Hovis, Jennifer; Chen, Roger; Morozova, Irina; Kalachikov, Sergey; Russo, James J; Kasianowicz, John J; Davis, Randy; Roever, Stefan; Church, George M; Ju, Jingyue

    2016-05-10

    DNA sequencing by synthesis (SBS) offers a robust platform to decipher nucleic acid sequences. Recently, we reported a single-molecule nanopore-based SBS strategy that accurately distinguishes four bases by electronically detecting and differentiating four different polymer tags attached to the 5'-phosphate of the nucleotides during their incorporation into a growing DNA strand catalyzed by DNA polymerase. Further developing this approach, we report here the use of nucleotides tagged at the terminal phosphate with oligonucleotide-based polymers to perform nanopore SBS on an α-hemolysin nanopore array platform. We designed and synthesized several polymer-tagged nucleotides using tags that produce different electrical current blockade levels and verified they are active substrates for DNA polymerase. A highly processive DNA polymerase was conjugated to the nanopore, and the conjugates were complexed with primer/template DNA and inserted into lipid bilayers over individually addressable electrodes of the nanopore chip. When an incoming complementary-tagged nucleotide forms a tight ternary complex with the primer/template and polymerase, the tag enters the pore, and the current blockade level is measured. The levels displayed by the four nucleotides tagged with four different polymers captured in the nanopore in such ternary complexes were clearly distinguishable and sequence-specific, enabling continuous sequence determination during the polymerase reaction. Thus, real-time single-molecule electronic DNA sequencing data with single-base resolution were obtained. The use of these polymer-tagged nucleotides, combined with polymerase tethering to nanopores and multiplexed nanopore sensors, should lead to new high-throughput sequencing methods.

  7. Sequencing of cDNA from 50 unrelated patients reveals that mutations in the triple-helical domain of type III procollagen are an infrequent cause of aortic aneurysms.

    PubMed Central

    Tromp, G; Wu, Y; Prockop, D J; Madhatheri, S L; Kleinert, C; Earley, J J; Zhuang, J; Norrgård, O; Darling, R C; Abbott, W M

    1993-01-01

    Detailed DNA sequencing of the triple-helical domain of type III procollagen was carried out on cDNA prepared from 54 patients with aortic aneurysms. The 43 male and 11 female patients originated from 50 different families and five different nationalities. 43 patients had at least one additional blood relative who had aneurysms. Five overlapping asymmetric PCR products, covering all the coding sequences of the triple-helical domain of type III procollagen, were sequenced with 28 specific sequencing primers. Analysis of the sequencing gels revealed only two nucleotide changes that altered the structure of the protein. One was a substitution of threonine for proline at amino acid position 501 and its functional importance was not clearly established. The other was a substitution of arginine for an obligatory glycine at amino acid position 136. In 40 of the 54 patients, detection of a polymorphism in the mRNA established that both alleles were expressed. The results indicate that mutations in type III procollagen are the cause of only about 2% of aortic aneurysms. Images PMID:8514866

  8. Coagulant thrombin-like enzyme (barnettobin) from Bothrops barnetti venom: molecular sequence analysis of its cDNA and biochemical properties.

    PubMed

    Vivas-Ruiz, Dan E; Sandoval, Gustavo A; Mendoza, Julio; Inga, Rosalina R; Gontijo, Silea; Richardson, Michael; Eble, Johannes A; Yarleque, Armando; Sanchez, Eladio F

    2013-07-01

    The thrombin-like enzyme from Bothrops barnetti named barnettobin was purified. We report some biochemical features of barnettobin including the complete amino acid sequence that was deduced from the cDNA. Snake venom serine proteases affect several steps of human hemostasis ranging from the blood coagulation cascade to platelet function. Barnettobin is a monomeric glycoprotein of 52 kDa as shown by reducing SDS-PAGE, and contains approx. 52% carbohydrate by mass which could be removed by N-glycosidase. The complete amino acid sequence was deduced from the cDNA sequence. Its sequence contains a single chain of 233 amino acids including three N-glycosylation sites. The sequence exhibits significant homology with those of mammalian serine proteases, e.g. thrombin, and with homologous TLEs. Its specific coagulant activity was 251.7 NIH thrombin units/mg; it released fibrinopeptide A from human fibrinogen and showed a defibrinogenating effect in mouse. Both coagulant and amidolytic activities were inhibited by PMSF. N-deglycosylation impaired its temperature and pH stability. Its cDNA sequence with 750 bp encodes a protein of 233 residues. Indications that carbohydrate moieties may play a role in the interaction with substrates are presented. Barnettobin is a new defibrinogenating agent which may provide an opportunity for the development of new types of anti-thrombotic drugs.

  9. Nucleotide sequence analysis and DNA hybridization studies of the ant(4')-IIa gene from Pseudomonas aeruginosa.

    PubMed Central

    Shaw, K J; Munayyer, H; Rather, P N; Hare, R S; Miller, G H

    1993-01-01

    The ant(4')-IIa gene was previously cloned from Pseudomonas aeruginosa on a 1.6-kb DNA fragment (G. A. Jacoby, M. J. Blaser, P. Santanam, H. Hächler, F. H. Kayser, R. S. Hare, and G. H. Miller, Antimicrob. Agents Chemother. 34:2381-2386, 1990). In the current study, the ant(4')-IIa gene was localized by gamma-delta mutagenesis. A region of approximately 600 nucleotides which contained the ant(4')-IIa gene was identified, and DNA sequence analysis revealed two overlapping open reading frames (ORFs) within this region. Northern (RNA) blot analysis demonstrated expression of both ORFs in P. aeruginosa; therefore, site-directed mutagenesis was used to identify the ORF which encodes the ant(4')-IIa gene. No homology was found between ant(4')-IIa and ant(4')-Ia DNA sequences. Hybridization experiments confirmed that the ant(4')-Ia probe hybridized only to gram-positive presumptive ANT(4')-I strains and that the ant(4')-IIa probe hybridized only to gram-negative strains presumed to carry ANT(4')-II. Seven gram-negative strains which had been classified as having ANT(4')-II resistance profiles did not hybridize with probes for either ant(4')-Ia or ant(4')-IIa, suggesting that at least one additional ant(4') gene may exist. The predicted amino-terminal sequences of the ANT(4')-Ia and ANT(4')-IIa proteins showed significant sequence similarity between residues 38 and 63 of the ANT(4')-Ia protein and residues 26 and 51 of the ANT(4')-IIa protein. PMID:8494365

  10. Sequencing and analysis of 10967 full-length cDNA clones from Xenopus laevis and Xenopus tropicalis

    SciTech Connect

    Morin, R D; Chang, E; Petrescu, A; Liao, N; Kirkpatrick, R; Griffith, M; Butterfield, Y; Stott, J; Barber, S; Babakaiff, R; Matsuo, C; Wong, D; Yang, G; Smailus, D; Brown-John, M; Mayo, M; Beland, J; Gibson, S; Olson, T; Tsai, M; Featherstone, R; Chand, S; Siddiqui, A; Jang, W; Lee, E; Klein, S; Prange, C; Myers, R M; Green, E D; Wagner, L; Gerhard, D; Marra, M; Jones, S M; Holt, R

    2005-10-31

    Sequencing of full-insert clones from full-length cDNA libraries from both Xenopus laevis and Xenopus tropicalis has been ongoing as part of the Xenopus Gene Collection initiative. Here we present an analysis of 10967 clones (8049 from X. laevis and 2918 from X. tropicalis). The clone set contains 2013 orthologs between X. laevis and X. tropicalis as well as 1795 paralog pairs within X. laevis. 1199 are in-paralogs, believed to have resulted from an allotetraploidization event approximately 30 million years ago, and the remaining 546 are likely out-paralogs that have resulted from more ancient gene duplications, prior to the divergence between the two species. We do not detect any evidence for positive selection by the Yang and Nielsen maximum likelihood method of approximating dN/dS. However, dN/dS for X. laevis in-paralogs is elevated relative to X. tropicalis orthologs. This difference is highly significant, and indicates an overall relaxation of selective pressures on duplicated gene pairs. Within both groups of paralogs, we found evidence of subfunctionalization, manifested as differential expression of paralogous genes among tissues, as measured by EST information from public resources. We have observed, as expected, a higher instance of subfunctionalization in out-paralogs relative to in-paralogs.

  11. Cathepsin B from the white shrimp Litopenaeus vannamei: cDNA sequence analysis, tissues-specific expression and biological activity.

    PubMed

    Stephens, A; Rojo, L; Araujo-Bernal, S; Garcia-Carreño, F; Muhlia-Almazan, A

    2012-01-01

    Cathepsin B is a cysteine proteinase scarcely studied in crustaceans. Its function has not been clearly described in shrimp species belonging to the sub-order Dendrobranchiata, which includes the white shrimp Litopenaeus vannamei and other species from the Penaeidae family. Studies on vertebrates suggest that these lysosomal enzymes hydrolyze protein intracellularly, as do other cysteine proteinases. However, the expression of the gene encoding the shrimp cathepsin B in the midgut gland was affected by starvation in a similar way as other digestive proteinases which extracellularly hydrolyze food protein. In this study the white shrimp L. vannamei cathepsin B (LvCathB) cDNA was sequenced and characterized. Its gene expression was evaluated in various shrimp tissues, and changes in the mRNA amounts were compared with those observed for other digestive proteinases from the midgut gland during starvation. By using qRT-PCR it was found that LvCathB is expressed in most shrimp tissues except in pleopods and eye stalk. Changes in LvCathB mRNA during starvation suggest that the enzyme participates in intracellular protein hydrolysis but also, after food ingestion, in hydrolyzing food proteins extracellularly, as confirmed by the high activity levels we found in the gastric juice and midgut gland of the white shrimp.

  12. Construction of a cDNA library and preliminary analysis of expressed sequence tags in Piper hainanense.

    PubMed

    Fan, R; Ling, P; Hao, C Y; Li, F P; Huang, L F; Wu, B D; Wu, H S

    2015-10-19

    Black pepper is a perennial climbing vine. It is widely cultivated because its berries can be utilized not only as a spice in food but also for medicinal use. This study aimed to construct a standardized, high-quality cDNA library to facilitate identification of new Piper hainanense transcripts. For this, 262 unigenes were used to generate raw reads. The average length of these 262 unigenes was 774.8 bp. Of these, 94 genes (35.9%) were newly identified, according to the NCBI protein database. Thus, identification of new genes may broaden the molecular knowledge of P. hainanense on the basis of Clusters of Orthologous Groups and Gene Ontology categories. In addition, certain basic genes linked to physiological processes were identified, which can contribute to disease resistance and thereby to the breeding of black pepper. A total of 26 unigenes were found to contain SSR markers. Dinucleotide SSR was the main repeat motif, accounting for 61.54%, followed by trinucleotide SSR (23.07%). Eight primer pairs successfully amplified DNA fragments and detected significant amounts of polymorphism among twenty-one Piper germplasm. These results present novel sequence information for P. hainanense, which can serve as the foundation for further genetic research on this species.
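
    Simple sequence repeats (SSRs) such as the di- and trinucleotide motifs counted above are usually located by scanning for tandem copies of a short motif. The abstract does not state which tool or thresholds were used, so the Python sketch below is only an illustration: the function find_ssrs, the minimum repeat numbers (6 for dinucleotides, 5 for trinucleotides) and the toy sequence are all assumptions.

```python
import re


def find_ssrs(seq, motif_len, min_repeats):
    """Return (start, motif, repeat_count) for tandem repeats of a motif_len-base motif."""
    seq = seq.upper()
    # A captured motif followed by at least (min_repeats - 1) further copies of itself.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        if len(set(motif)) == 1:
            # Skip homopolymer runs such as AAAAAA, which are not di-/trinucleotide SSRs.
            continue
        hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


# Hypothetical toy sequence, not a P. hainanense unigene.
toy = "TTGC" + "AG" * 8 + "CCTA" + "CAT" * 5 + "GG"
print("dinucleotide SSRs: ", find_ssrs(toy, motif_len=2, min_repeats=6))
print("trinucleotide SSRs:", find_ssrs(toy, motif_len=3, min_repeats=5))
```

    Dedicated SSR-mining programs apply additional rules (compound repeats, flanking-sequence checks for primer design) that this sketch omits.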

  13. Complete nucleotide sequence of rose yellow leaf virus, a new member of the family Tombusviridae.

    PubMed

    Mollov, Dimitre; Lockhart, Ben; Zlesak, David C

    2014-10-01

    The genome of the rose yellow leaf virus (RYLV) has been determined to be 3918 nucleotides long and to contain seven open reading frames (ORFs). ORF1 encodes a 27-kDa peptide (p27). ORF2 shares a common start codon with ORF1 and continues through the amber stop codon of p27 to encode an 87-kDa (p87) protein that has amino acid similarity to the RNA-dependent RNA polymerase (RdRp) of members of the family Tombusviridae. ORFs 3 and 4 have no significant amino acid similarity to known functional viral ORFs. ORF5 encodes a 6-kDa (p6) protein that has similarity to movement proteins of members of the Tombusviridae. ORF5A has no conventional start codon and overlaps with p6. A putative +1 frameshift mechanism allows p6 translation to continue through the stop codon and results in a 12-kDa protein that has high homology to the carmovirus p13 movement protein. The 37-kDa protein encoded by ORF6 has amino acid sequence similarity to coat proteins (CP) of members of the Tombusviridae. ORF7 has no significant amino acid similarity to known viral ORFs. Phylogenetic analysis of the RdRp amino acid sequences grouped RYLV together with the unclassified Rosa rugosa leaf distortion virus (RrLDV), pelargonium line pattern virus (PLPV), and pelargonium chlorotic ring pattern virus (PCRPV) in a distinct subgroup of the family Tombusviridae.

  14. Nucleotide sequence and phylogenetic analysis of a new potexvirus: Malva mosaic virus.

    PubMed

    Côté, Fabien; Paré, Christine; Majeau, Nathalie; Bolduc, Marilène; Leblanc, Eric; Bergeron, Michel G; Bernardy, Michael G; Leclerc, Denis

    2008-01-01

    A filamentous virus isolated from Malva neglecta Wallr. (common mallow) and propagated in Chenopodium quinoa was grown and cloned, and its complete nucleotide sequence was determined (GenBank accession # DQ660333). The genomic RNA is 6858 nt in length and contains five major open reading frames (ORFs). The genomic organization is similar to that of members of the Potexvirus genus in the Flexiviridae family, and the virus-encoded proteins share homology with those of this group. Phylogenetic analysis revealed a close relationship with narcissus mosaic virus (NMV), scallion virus X (ScaVX) and, to a lesser extent, Alstroemeria virus X (AlsVX) and pepino mosaic virus (PepMV). A novel putative pseudoknot structure is predicted in the 3'-UTR of a subgroup of potexviruses, including this newly described virus. The consensus GAAAA sequence is detected at the 5'-end of the genomic RNA and experimental data strongly suggest that this motif could be a distinctive hallmark of this genus. The name Malva mosaic virus is proposed.

  15. Complete nucleotide sequence analysis of the norovirus GII.4 Sydney variant in South Korea.

    PubMed

    Park, Ji-Sun; Lee, Sung-Geun; Jin, Ji-Young; Cho, Han-Gil; Jheong, Weon-Hwa; Paik, Soon-Young

    2015-01-01

    Norovirus is the primary cause of acute gastroenteritis in individuals of all ages. In Australia, a new strain of norovirus (GII.4) was identified in March 2012, and this strain has spread rapidly around the world. In August 2012, this new GII.4 strain was identified in patients in South Korea. Therefore, to examine the characteristics of the epidemic norovirus GII.4 2012 variant in South Korea, we conducted KM272334 full-length genomic analysis. The genome of the gg-12-08-04 strain consisted of 7,558 bp and contained three open reading frame (ORF) composites throughout the whole genome: ORF1 (5,100 bp), ORF2 (1,623 bp), and ORF3 (807 bp). Phylogenetic analyses showed that gg-12-08-04 belonged to the GII.4 Sydney 2012 variant, sharing 98.92% nucleotide similarity with this variant strain. According to SimPlot analysis, the gg-12-08-04 strain was a recombinant strain with breakpoint at the ORF1/2 junction between Osaka 2007 and Apeldoorn 2008 strains. This study is the first report of the complete sequence of the GII.4 Sydney 2012 strain in South Korea. Therefore, this may represent the standard sequence of the norovirus GII.4 2012 variant in South Korea and could therefore be useful for the development of norovirus vaccines.

  16. Nucleotide sequence and transcriptional analysis of the type A2 neurotoxin gene cluster in Clostridium botulinum.

    PubMed

    Dineen, Sean S; Bradshaw, Marite; Karasek, Charles E; Johnson, Eric A

    2004-06-01

    The nucleotide sequences of the upstream regions of the botulinum neurotoxin type A1 (BoNT/A1) cluster of Clostridium botulinum strain NCTC 2916 and the BoNT/A2 cluster of strain Kyoto-F were determined. A novel gene, designated orfx3, was identified following the orfx2 gene in both clusters. ORF-X2 and ORF-X3 exhibit similarity to the BoNT cluster associated P-47 protein. The BoNT/A1 and BoNT/A2 clusters share a similar gene arrangement, but exhibit differences in the spacing between certain genes. Sequences with similarity to transposases were identified in these intergenic regions, suggesting that these differences arose from an ancestral insertion event. Transcriptional analysis of the BoNT/A2 cluster revealed that the genes of the cluster are primarily synthesized as three polycistronic transcripts. Two divergent polycistronic transcripts, one encoding the orfx1, orfx2, and orfx3 genes, the second encoding the p47, ntnh, and bont/a2 genes, are transcribed from conserved BoNT cluster promoters. The third polycistronic transcript, expressed at low levels, encodes the positive regulatory botR gene and the orfx genes. This is the first complete analysis of a botulinum toxin A2 cluster.

  17. Complete nucleotide sequence of a Spanish isolate of Parietaria mottle virus infecting tomato.

    PubMed

    Galipienso, Luis; Rubio, Luis; López, Luis; Soler, Salvador; Aramburu, José

    2009-10-01

    The genome of a Spanish isolate of Parietaria mottle virus (PMoV) obtained from tomato (strain PMoV-T) was completely sequenced. Protein motifs conserved for RNA viruses were identified: the p1 protein contained a methyltransferase domain in its N-terminal half and a triphosphatase/helicase domain in its C-terminal half; the p2 protein contained an RNA polymerase domain; the 3a protein contained an RNA-binding domain with α-helix and β-sheet secondary structures. In addition, stem-loop structures with potential capacity for protein interactions were predicted in the untranslated terminal regions. Comparison with the other sequenced PMoV isolate showed nucleotide identities of 93, 90, and 93% for genomic RNAs 1, 2 and 3, respectively, and amino acid identities ranging from 88 to 97% for the different proteins. A cytosine deletion was detected at position 1,366 of RNA 3, involving a start codon for the coat protein (CP) gene different from the other PMoV isolate, resulting in a CP 16 amino acids shorter. Comparison of synonymous and nonsynonymous mutations revealed different selective constraints along the genome.

  18. A survey of chromosomal and nucleotide sequence variation in Drosophila miranda.

    PubMed Central

    Yi, Soojin; Bachtrog, Doris; Charlesworth, Brian

    2003-01-01

    There have recently been several studies of the evolution of Y chromosome degeneration and dosage compensation using the neo-sex chromosomes of Drosophila miranda as a model system. To understand these evolutionary processes more fully, it is necessary to document the general pattern of genetic variation in this species. Here we report a survey of chromosomal variation, as well as polymorphism and divergence data, for 12 nuclear genes of D. miranda. These genes exhibit varying levels of DNA sequence polymorphism. Compared to its well-studied sibling species D. pseudoobscura, D. miranda has much less nucleotide sequence variation, and the effective population size of this species is inferred to be several-fold lower. Nevertheless, it harbors a few inversion polymorphisms, one of which involves the neo-X chromosome. There is no convincing evidence for a recent population expansion in D. miranda, in contrast to D. pseudoobscura. The pattern of population subdivision previously observed for the X-linked gene period is not seen for the other loci, suggesting that there is no general population subdivision in D. miranda. However, data on an additional region of period confirm population subdivision for this gene, suggesting that local selection is operating at or near period to promote differentiation between populations. PMID:12930746

  19. Full-Length Venom Protein cDNA Sequences from Venom-Derived mRNA: Exploring Compositional Variation and Adaptive Multigene Evolution

    PubMed Central

    Modahl, Cassandra M.; Mackessy, Stephen P.

    2016-01-01

    Envenomation of humans by snakes is a complex and continuously evolving medical emergency, and treatment is made that much more difficult by the diverse biochemical composition of many venoms. Venomous snakes and their venoms also provide models for the study of molecular evolutionary processes leading to adaptation and genotype-phenotype relationships. To compare venom complexity and protein sequences, venom gland transcriptomes are assembled, which usually requires the sacrifice of snakes for tissue. However, toxin transcripts are also present in venoms, offering the possibility of obtaining cDNA sequences directly from venom. This study provides evidence that unknown full-length venom protein transcripts can be obtained from the venoms of multiple species from all major venomous snake families. These unknown venom protein cDNAs are obtained by the use of primers designed from conserved signal peptide sequences within each venom protein superfamily. This technique was used to assemble a partial venom gland transcriptome for the Middle American Rattlesnake (Crotalus simus tzabcan) by amplifying sequences for phospholipases A2, serine proteases, C-lectins, and metalloproteinases from within venom. Phospholipase A2 sequences were also recovered from the venoms of several rattlesnakes and an elapid snake (Pseudechis porphyriacus), and three-finger toxin sequences were recovered from multiple rear-fanged snake species, demonstrating that the three major clades of advanced snakes (Elapidae, Viperidae, Colubridae) have stable mRNA present in their venoms. These cDNA sequences from venom were then used to explore potential activities derived from protein sequence similarities and evolutionary histories within these large multigene superfamilies. Venom-derived sequences can also be used to aid in characterizing venoms that lack proteomic profiles and identify sequence characteristics indicating specific envenomation profiles. 
This approach, requiring only venom, provides

  20. Full-Length Venom Protein cDNA Sequences from Venom-Derived mRNA: Exploring Compositional Variation and Adaptive Multigene Evolution.

    PubMed

    Modahl, Cassandra M; Mackessy, Stephen P

    2016-06-01

    Envenomation of humans by snakes is a complex and continuously evolving medical emergency, and treatment is made that much more difficult by the diverse biochemical composition of many venoms. Venomous snakes and their venoms also provide models for the study of molecular evolutionary processes leading to adaptation and genotype-phenotype relationships. To compare venom complexity and protein sequences, venom gland transcriptomes are assembled, which usually requires the sacrifice of snakes for tissue. However, toxin transcripts are also present in venoms, offering the possibility of obtaining cDNA sequences directly from venom. This study provides evidence that unknown full-length venom protein transcripts can be obtained from the venoms of multiple species from all major venomous snake families. These unknown venom protein cDNAs are obtained by the use of primers designed from conserved signal peptide sequences within each venom protein superfamily. This technique was used to assemble a partial venom gland transcriptome for the Middle American Rattlesnake (Crotalus simus tzabcan) by amplifying sequences for phospholipases A2, serine proteases, C-lectins, and metalloproteinases from within venom. Phospholipase A2 sequences were also recovered from the venoms of several rattlesnakes and an elapid snake (Pseudechis porphyriacus), and three-finger toxin sequences were recovered from multiple rear-fanged snake species, demonstrating that the three major clades of advanced snakes (Elapidae, Viperidae, Colubridae) have stable mRNA present in their venoms. These cDNA sequences from venom were then used to explore potential activities derived from protein sequence similarities and evolutionary histories within these large multigene superfamilies. Venom-derived sequences can also be used to aid in characterizing venoms that lack proteomic profiles and identify sequence characteristics indicating specific envenomation profiles. 
This approach, requiring only venom, provides

  1. CONSTRUCTION OF SILKWORM MIDGUT cDNA LIBRARY FOR SCREEN AND SEQUENCE ANALYSIS OF PERITROPHIC MEMBRANE PROTEIN GENES.

    PubMed

    Zhou, Yi-Jun; Xue, Bin; Li, Yang-Yang; Li, Fan-Chi; Ni, Min; Shen, Wei-De; Gu, Zhi-Ya; Li, Bing

    2016-01-01

    Silkworm is an important economic insect and the model species for Lepidoptera. The midgut of silkworm is an important physiological barrier, as its peritrophic membrane (PM) can resist pathogen invasion. In this study, a silkworm midgut cDNA library was constructed in order to identify silkworm PM genes. The capacity of the initial library was 6.92 × 10^6 pfu/ml, along with a recombination rate of 92.14% and a postamplification titer of 4.10 × 10^9 pfu/ml. Three silkworm PM protein genes were obtained by immunoscreening, two of which were chitin-binding protein (CBP) genes and one of which was a chitin deacetylase (CDA) gene as revealed by sequence analysis. The three genes were named BmCBP02, BmCBP13, and BmCDA17, and their ORF sizes are 678, 1,029, and 645 bp, respectively; all of them contain sequences of chitin-binding domains. Phylogenetic analysis indicated that BmCBP02 has the highest consensus with Mamestra configurata CBP at 61.0%; BmCBP13 has the highest consensus with Loxostege sticticalis PM CBP at 53.35%; BmCDA17 has the highest consensus with Helicoverpa armigera CDA5a at 70.83%. Tissue transcriptional analysis revealed that all three genes were specifically expressed in the midgut, and during the developmental process of fifth-instar silkworms, the transcription of all the genes showed an upward trend. This study laid a foundation for further studies on the functions of silkworm PM genes.

  2. Nucleotide sequence of the 3'-noncoding region of alfalfa mosaic virus RNA 4 and its homology with the genomic RNAs.

    PubMed Central

    Koper-Zwarthoff, E C; Brederode, F T; Walstra, P; Bol, J F

    1979-01-01

    A 226-nucleotide fragment was derived from alfalfa mosaic virus RNA 4 (AlMV RNA 4), the subgenomic messenger for viral coat protein, and its sequence was deduced by in vitro labeling with polynucleotide kinase and application of RNA sequencing techniques. The fragment contains the 3'-terminal 45 nucleotides of the coat protein cistron and the complete 3'-noncoding region of 182 nucleotides. The total length of RNA 4 was calculated to be 881 nucleotides. AlMV RNAs 1, 2 and 3 were elongated with a 3'-terminal poly(A) stretch and subjected to sequence analysis by using a specific primer, reverse transcriptase and chain terminators. This revealed an extensive homology between the 3'-terminal 140 to 150 nucleotides of all four AlMV RNAs. Despite a number of base substitutions, the secondary structure of the homologous region is highly conserved. The observed homology indicates that, as with RNA 4, the sites with a high affinity for the viral coat protein are located at the 3'-termini of the genomic RNAs. Images PMID:537914

  3. Isolation of cDNA encoding a binding protein specific to 5'-phosphorylated single-stranded DNA with G-rich sequences.

    PubMed Central

    Mizuta, T R; Fukita, Y; Miyoshi, T; Shimizu, A; Honjo, T

    1993-01-01

    We have isolated the cDNA encoding a binding protein to the sequence motif of the immunoglobulin S mu region by the southwestern method. The binding protein designated S mu bp-2 specifically binds to 5'-phosphorylated single-stranded DNA containing 5'-G and GGGG stretches. The amino acid sequence deduced from the cDNA sequence showed that the S mu bp-2 belongs to the putative helicase superfamily which is involved in replication, recombination and repair. Expression of S mu bp-2 mRNA is ubiquitous and augmented in spleen cells stimulated with lipopolysaccharide and interleukin 4 which also induce class switching. The S mu bp-2 gene is conserved among vertebrates. Possible involvement of S mu bp-2 in class switching is discussed. Images PMID:8493094

  4. New Approaches to Attenuated Hepatitis a Vaccine Development: Cloning and Sequencing of Cell-Culture Adapted Viral cDNA.

    DTIC Science & Technology

    1987-10-13

    Hepatitis A Vaccine, Molecular Cloning and Hybridization, Strain Differences...cells. Molecular cloning of p16 HM175 virus cDNA. cDNA clones were derived from p16 HM175 virus RNA by cloning cDNA-RNA hybrid molecules into the PstI... molecular cloning. Clones derived from cDNA synthesized with oligo-dT(12-18) as primer were nearly always restricted to the 3’ terminus of the genome, while

  5. Quantitative T cell repertoire analysis by deep cDNA sequencing of T cell receptor α and β chains using next-generation sequencing (NGS)

    PubMed Central

    Fang, Hua; Yamaguchi, Rui; Liu, Xiao; Daigo, Yataro; Yew, Poh Yin; Tanikawa, Chizu; Matsuda, Koichi; Imoto, Seiya; Miyano, Satoru; Nakamura, Yusuke

    2015-01-01

    Immune responses play a critical role in various disease conditions including cancer and autoimmune diseases. However, to date, there has not been a rapid, sensitive, comprehensive, and quantitative analysis method to examine T-cell or B-cell immune responses. Here, we report a new approach to characterize T cell receptor (TCR) repertoire by sequencing millions of cDNA of TCR α and β chains in combination with a newly-developed algorithm. Using samples from lung cancer patients treated with cancer peptide vaccines as a model, we demonstrate that detailed information of the V-(D)-J combination along with complementary determining region 3 (CDR3) sequences can be determined. We identified extensive abnormal splicing of TCR transcripts in lung cancer samples, indicating the dysfunctional splicing machinery in T lymphocytes by prior chemotherapy. In addition, we found three potentially novel TCR exons that have not been described previously in the reference genome. This newly developed TCR NGS platform can be applied to better understand immune responses in many disease areas including immune disorders, allergies, and organ transplantations. PMID:25964866

  6. Complete Nucleotide Sequences and Genome Organization of Two Pepper Mild Mottle Virus Isolates from Capsicum annuum in South Korea

    PubMed Central

    Choi, Seung-Kook; Choi, Gug-Seoun; Kwon, Sun-Jung

    2016-01-01

    The complete genome sequences of pepper mild mottle virus (PMMoV)-P2 and -P3 were determined by the Sanger sequencing method. Although PMMoV-P2 and PMMoV-P3 have different pathogenicity in some pepper cultivars, the complete genome sequences of PMMoV-P2 and -P3 are composed of 6,356 nucleotides (nt). In this study, we report the complete genome sequences and genome organization of PMMoV-P2 and -P3 isolates from pepper species in South Korea. PMID:27198033

  7. [Polymorphism of DNA nucleotide sequence as a source of enhancement of the discrimination potential of the STR-markers].

    PubMed

    Zemskova, E Yu; Timoshenko, T V; Leonov, S N; Ivanov, P L

    2016-01-01

    The objective of the present pilot investigation was to reveal and to study polymorphism of nucleotide sequence in the alleles of STR loci of human autosomal DNA with special reference to the role of this phenomenon as a source of the differences between homonymous allelic variants. The secondary objective was to evaluate the possibility of using the data thus obtained to enhance the informative value of forensic medical genotyping of STR loci by identifying single nucleotide polymorphisms (SNPs) and thereby extending their allelic spectrum. The methodological basis of the study was the comprehensive amplified fragment length polymorphism (AFLP) analysis and amplified fragment sequence polymorphism (AFSP) analysis of DNA with the use of the PLEX-ID™ analytical mass-spectrometry platform (Abbott Molecular, USA). The study has demonstrated that polymorphism of DNA nucleotide sequence can be regarded as a possible source of enhancement of the discriminating potential of STR markers. It means that the analysis of polymorphism of DNA nucleotide sequence for genotyping AFLP-type markers of chromosomal DNA can considerably increase the effectiveness of their application as individualizing markers in molecular genetic expert examinations.

  8. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... is DNA, RNA, or PRT (protein). If a nucleotide sequence contains both DNA and RNA fragments, the type shall be “DNA.” In addition, the combined DNA/RNA molecule shall be further described in the to feature... combined DNA/RNA” Name/Key Provide appropriate identifier for feature, preferably from WIPO Standard...

  9. Complete nucleotide sequence of a begomovirus associated with satellites molecules infecting a new host Tagetes patula in India.

    PubMed

    Marwal, Avinash; Sahu, Anurag Kumar; Choudhary, Devendra Kumar; Gaur, R K

    2013-08-01

    In the year 2012 leaf curl disease was observed on Marigold (Tagetes patula) in Lakshmangrh, Sikar province of India. Affected plants were severely stunted with apical leaf curl and crinkled leaves, symptoms typical of begomovirus infection. This is the first report of complete nucleotide sequence of a begomovirus associated with satellites molecules infecting a new host Tagetes patula in India.

  10. A high-density simple sequence repeat and single nucleotide polymorphism genetic map of the tetraploid cotton genome

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Cotton genome complexity was investigated with a saturated molecular genetic map that combined several sets of microsatellites or simple sequence repeats (SSR) and the first major public set of single nucleotide polymorphism (SNP) markers in cotton genomes (Gossypium spp.), and that was constructed ...

  11. Comparing genotyping-by-sequencing and Single Nucleotide Polymorphism chip genotyping in Quantitative Trait Loci mapping in wheat

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Array- or chip-based single nucleotide polymorphism (SNP) markers are widely used in genomic studies because of their abundance in a genome and their lower cost per data point compared to older marker technologies. Genotyping by sequencing (GBS), a relatively newer approach of genotyping, suggests equal or...

  12. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... is DNA, RNA, or PRT (protein). If a nucleotide sequence contains both DNA and RNA fragments, the type shall be “DNA.” In addition, the combined DNA/RNA molecule shall be further described in the to feature... combined DNA/RNA” Name/Key Provide appropriate identifier for feature, preferably from WIPO Standard...

  13. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... is DNA, RNA, or PRT (protein). If a nucleotide sequence contains both DNA and RNA fragments, the type shall be “DNA.” In addition, the combined DNA/RNA molecule shall be further described in the to feature... combined DNA/RNA” Name/Key Provide appropriate identifier for feature, preferably from WIPO Standard...

  14. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... is DNA, RNA, or PRT (protein). If a nucleotide sequence contains both DNA and RNA fragments, the type shall be “DNA.” In addition, the combined DNA/RNA molecule shall be further described in the to feature... combined DNA/RNA” Name/Key Provide appropriate identifier for feature, preferably from WIPO Standard...

  15. 37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... is DNA, RNA, or PRT (protein). If a nucleotide sequence contains both DNA and RNA fragments, the type shall be “DNA.” In addition, the combined DNA/RNA molecule shall be further described in the to feature... combined DNA/RNA” Name/Key Provide appropriate identifier for feature, preferably from WIPO Standard...

  16. Molecular cloning and nucleotide sequence of a transforming gene detected by transfection of chicken B-cell lymphoma DNA

    NASA Astrophysics Data System (ADS)

    Goubin, Gerard; Goldman, Debra S.; Luce, Judith; Neiman, Paul E.; Cooper, Geoffrey M.

    1983-03-01

    A transforming gene detected by transfection of chicken B-cell lymphoma DNA has been isolated by molecular cloning. It is homologous to a conserved family of sequences present in normal chicken and human DNAs but is not related to transforming genes of acutely transforming retroviruses. The nucleotide sequence of the cloned transforming gene suggests that it encodes a protein that is partially homologous to the amino terminus of transferrin and related proteins although only about one tenth the size of transferrin.

  17. Construction of cDNA library and preliminary analysis of expressed sequence tags from tea plant [Camellia sinensis (L) O. Kuntze].

    PubMed

    Phukon, Munmi; Namdev, Richa; Deka, Diganta; Modi, Mahendra K; Sen, Priyabrata

    2012-09-10

    Tea is the most popular non-alcoholic and healthy beverage across the world. The understanding of the genetic organization and molecular biology of tea plant, which is very poorly understood at present, is required for quantum increase in productivity and efficient use of germplasm for either cultivation or breeding program. Single-pass sequencing of randomly selected cDNA clones is the most widely accepted technique for gene identification and cloning. In the present study, a good quality cDNA library was constructed and preliminary analysis of ESTs was carried out. The titers of unamplified and amplified libraries were 1.4 × 10^6 pfu/ml and 5.27 × 10^8 pfu/ml, respectively. A total of 210 cDNA clones from the constructed cDNA library were sequenced and analyzed. A total of 84 high quality Expressed Sequence Tags (ESTs) were generated, among which 71 ESTs had significant homology with sequences in the NCBI non-redundant protein database by BLASTX analysis. About 80% of ESTs had a poly(A) tail at the 3' end, indicating that the cDNAs were full length. The database-matched ESTs were classified into putative cellular roles, viz. energy-related category (corresponding to 20% of total BLASTX-matched ESTs), transcription (14.2%), protein synthesis (14.2%), cell growth and division (8.6%), cell structure (5.7%), signal transduction (5.7%), transporters (2.9%), disease and defenses (2.9%), secondary metabolism (2.9%) and gene regulation (2.9%). This study provides an overview of the mRNA expression profile and first hand information of gene sequence expressed in tender leaves and apical buds of tea plant.

  18. [Molecular phylogenetic analysis of the genus Abies (Pinaceae) based on the nucleotide sequence of chloroplast DNA].

    PubMed

    Semerikova, S A; Semerikov, V L

    2014-01-01

    A phylogenetic study of firs (Abies Mill.) was conducted using nucleotide sequences of several chloroplast DNA regions with a total length of 5580 bp. The analysis included 37 taxa, which represented the main evolutionary lineages of the genus, and Keteleeria davidiana. According to phylogenetic reconstruction the Abies species were subdivided into six main groups, generally corresponding to their geographic distribution. The phylogenetic tree had three basal clades. All of these clades contained American species, and only one of them contained Eurasian species. The divergence time calibrations, based on paleobotanical data and the chloroplast DNA mutation rate estimates in Pinaceae, produced similar results. The age of diversification among the clades of the present-day Abies was estimated as the end of the Oligocene-beginning of Miocene. The age of the separation of Mediterranean firs from the Asian-North American branch corresponds to the Miocene. The age of diversification within the young groups of Mediterranean, Asian, and boreal American firs (A. lasiocarpa, A. balsamea, A. fraseri) was estimated as the Pliocene-Pleistocene. Based on the phylogenetic reconstruction obtained, the most plausible biogeographic scenarios were suggested. It is noted that the existing systematic classification of the genus Abies strongly contradicts the phylogenetic reconstruction and requires revision.

  19. Nucleotide sequence of a lysine tRNA from Bacillus subtilis.

    PubMed Central

    Yamada, Y; Ishikura, H

    1977-01-01

    A lysine tRNA (tRNA1Lys) was purified from Bacillus subtilis W168 by a consecutive use of several column chromatographic systems. The nucleotide sequence was determined to be pG-A-G-C-C-A-U-U-A-G-C-U-C-A-G-U-D-G-G-D-A-G-A-G-C-A-U-C-U-G-A-C-U-U(U*)-U-U-K-A-psi-C-A-G-A-G-G-m7G(G)-U-C-G-A-A-G-G-T-psi-C-G-A-G-U-C-C-U-U-C-A-U-G-G-C-U-C-A-C-C-AOH, where K and U* are unidentified nucleosides. The nucleosides of U34 and m7G46 were partially substituted with U* and G, respectively. The binding ability of lysyl-tRNA1Lys to Escherichia coli ribosomes was stimulated with ApApA as well as ApApG. PMID:414208

  20. Whole-genome sequencing identifies genomic heterogeneity at a nucleotide and chromosomal level in bladder cancer.

    PubMed

    Morrison, Carl D; Liu, Pengyuan; Woloszynska-Read, Anna; Zhang, Jianmin; Luo, Wei; Qin, Maochun; Bshara, Wiam; Conroy, Jeffrey M; Sabatini, Linda; Vedell, Peter; Xiong, Donghai; Liu, Song; Wang, Jianmin; Shen, He; Li, Yinwei; Omilian, Angela R; Hill, Annette; Head, Karen; Guru, Khurshid; Kunnev, Dimiter; Leach, Robert; Eng, Kevin H; Darlak, Christopher; Hoeflich, Christopher; Veeranki, Srividya; Glenn, Sean; You, Ming; Pruitt, Steven C; Johnson, Candace S; Trump, Donald L

    2014-02-11

    Using complete genome analysis, we sequenced five bladder tumors accrued from patients with muscle-invasive transitional cell carcinoma of the urinary bladder (TCC-UB) and identified a spectrum of genomic aberrations. In three tumors, complex genotype changes were noted. All three had tumor protein p53 mutations and a relatively large number of single-nucleotide variants (SNVs; average of 11.2 per megabase), structural variants (SVs; average of 46), or both. This group was best characterized by chromothripsis and the presence of subclonal populations of neoplastic cells or intratumoral mutational heterogeneity. Here, we provide evidence that the process of chromothripsis in TCC-UB is mediated by nonhomologous end-joining using kilobase, rather than megabase, fragments of DNA, which we refer to as "stitchers," to repair this process. We postulate that a potential unifying theme among tumors with the more complex genotype group is a defective replication-licensing complex. A second group (two bladder tumors) had no chromothripsis, and a simpler genotype, WT tumor protein p53, had relatively few SNVs (average of 5.9 per megabase) and only a single SV. There was no evidence of a subclonal population of neoplastic cells. In this group, we used a preclinical model of bladder carcinoma cell lines to study a unique SV (translocation and amplification) of the gene glutamate receptor ionotropic N-methyl D-aspartate as a potential new therapeutic target in bladder cancer.

  1. Whole-genome sequencing identifies genomic heterogeneity at a nucleotide and chromosomal level in bladder cancer

    PubMed Central

    Morrison, Carl D.; Liu, Pengyuan; Woloszynska-Read, Anna; Zhang, Jianmin; Luo, Wei; Qin, Maochun; Bshara, Wiam; Conroy, Jeffrey M.; Sabatini, Linda; Vedell, Peter; Xiong, Donghai; Liu, Song; Wang, Jianmin; Shen, He; Li, Yinwei; Omilian, Angela R.; Hill, Annette; Head, Karen; Guru, Khurshid; Kunnev, Dimiter; Leach, Robert; Eng, Kevin H.; Darlak, Christopher; Hoeflich, Christopher; Veeranki, Srividya; Glenn, Sean; You, Ming; Pruitt, Steven C.; Johnson, Candace S.; Trump, Donald L.

    2014-01-01

    Using complete genome analysis, we sequenced five bladder tumors accrued from patients with muscle-invasive transitional cell carcinoma of the urinary bladder (TCC-UB) and identified a spectrum of genomic aberrations. In three tumors, complex genotype changes were noted. All three had tumor protein p53 mutations and a relatively large number of single-nucleotide variants (SNVs; average of 11.2 per megabase), structural variants (SVs; average of 46), or both. This group was best characterized by chromothripsis and the presence of subclonal populations of neoplastic cells or intratumoral mutational heterogeneity. Here, we provide evidence that the process of chromothripsis in TCC-UB is mediated by nonhomologous end-joining using kilobase, rather than megabase, fragments of DNA, which we refer to as “stitchers,” to repair this process. We postulate that a potential unifying theme among tumors with the more complex genotype group is a defective replication–licensing complex. A second group (two bladder tumors) had no chromothripsis, and a simpler genotype, WT tumor protein p53, had relatively few SNVs (average of 5.9 per megabase) and only a single SV. There was no evidence of a subclonal population of neoplastic cells. In this group, we used a preclinical model of bladder carcinoma cell lines to study a unique SV (translocation and amplification) of the gene glutamate receptor ionotropic N-methyl D-aspartate as a potential new therapeutic target in bladder cancer. PMID:24469795

  2. Quantitative theory of entropic forces acting on constrained nucleotide sequences applied to viruses.

    PubMed

    Greenbaum, Benjamin D; Cocco, Simona; Levine, Arnold J; Monasson, Rémi

    2014-04-01

    We outline a theory to quantify the interplay of entropic and selective forces on nucleotide organization and apply it to the genomes of single-stranded RNA viruses. We quantify these forces as intensive variables that can easily be compared between sequences, outline a computationally efficient transfer-matrix method for their calculation, and apply this method to influenza and HIV viruses. We find viruses altering their dinucleotide motif use under selective forces, with these forces on CpG dinucleotides growing stronger in influenza the longer it replicates in humans. For a subset of genes in the human genome, many involved in antiviral innate immunity, the forces acting on CpG dinucleotides are even greater than the forces observed in viruses, suggesting that both effects are in response to similar selective forces involving the innate immune system. We further find that the dynamics of entropic forces balancing selective forces can be used to predict how long it will take a virus to adapt to a new host, and that it would take H1N1 several centuries to adapt to humans from birds, typically contributing many of its synonymous substitutions to the forcible removal of CpG dinucleotides. By examining the probability landscape of dinucleotide motifs, we predict where motifs are likely to appear using only a single-force parameter and uncover the localization of UpU motifs in HIV. Essentially, we extend the natural language and concepts of statistical physics, such as entropy and conjugated forces, to understanding viral sequences and, more generally, constrained genome evolution.
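
    The transfer-matrix idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a deliberately simplified model in which a sequence s of fixed length is weighted by the product of per-site nucleotide frequencies times exp(force × number of motif occurrences), for a single dinucleotide motif. The function names, the uniform frequencies and the chosen length are all hypothetical.

```python
import math

NUCS = "ACGU"  # RNA alphabet; the abstract concerns RNA viruses


def log_partition(length, freqs, force, motif="CG"):
    """Log of the sum, over all sequences of `length`, of
    prod_i freqs[s_i] * exp(force * number of `motif` occurrences).

    Assumed toy model: one dinucleotide motif, independent site frequencies.
    """
    first, second = motif
    # vec[n]: (rescaled) total weight of all prefixes that end in nucleotide n
    vec = {n: freqs[n] for n in NUCS}
    log_scale = 0.0
    for _ in range(length - 1):
        new = {}
        for nxt in NUCS:
            total = 0.0
            for prev in NUCS:
                # Transfer-matrix entry: extra factor only for the motif pair
                w = math.exp(force) if (prev == first and nxt == second) else 1.0
                total += vec[prev] * w
            new[nxt] = total * freqs[nxt]
        # Rescale each step so long sequences do not under- or overflow
        norm = sum(new.values())
        vec = {n: v / norm for n, v in new.items()}
        log_scale += math.log(norm)
    return log_scale + math.log(sum(vec.values()))


def expected_motif_count(length, freqs, force, motif="CG", eps=1e-6):
    """Mean motif count, obtained as d(log Z)/d(force) by central difference."""
    hi = log_partition(length, freqs, force + eps, motif)
    lo = log_partition(length, freqs, force - eps, motif)
    return (hi - lo) / (2 * eps)


if __name__ == "__main__":
    uniform = {n: 0.25 for n in NUCS}  # hypothetical nucleotide frequencies
    for x in (-2.0, -1.0, 0.0, 1.0):
        print(x, expected_motif_count(1000, uniform, x))
```

    In this toy setting the cost grows linearly with sequence length rather than exponentially, and a negative force lowers the expected CpG count relative to the frequency-only baseline.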

  3. Postzygotic single-nucleotide mosaicisms in whole-genome sequences of clinically unremarkable individuals

    PubMed Central

    Huang, August Y; Xu, Xiaojing; Ye, Adam Y; Wu, Qixi; Yan, Linlin; Zhao, Boxun; Yang, Xiaoxu; He, Yao; Wang, Sheng; Zhang, Zheng; Gu, Bowen; Zhao, Han-Qing; Wang, Meng; Gao, Hua; Gao, Ge; Zhang, Zhichao; Yang, Xiaoling; Wu, Xiru; Zhang, Yuehua; Wei, Liping

    2014-01-01

    Postzygotic single-nucleotide mutations (pSNMs) have been studied in cancer and a few other overgrowth human disorders at whole-genome scale and found to play critical roles. However, in clinically unremarkable individuals, pSNMs have never been identified at whole-genome scale largely due to technical difficulties and lack of matched control tissue samples, and thus the genome-wide characteristics of pSNMs remain unknown. We developed a new Bayesian-based mosaic genotyper and a series of effective error filters, using which we were able to identify 17 SNM sites from ∼80× whole-genome sequencing of peripheral blood DNAs from three clinically unremarkable adults. The pSNMs were thoroughly validated using pyrosequencing, Sanger sequencing of individual cloned fragments, and multiplex ligation-dependent probe amplification. The mutant allele fraction ranged from 5%-31%. We found that C→T and C→A were the predominant types of postzygotic mutations, similar to the somatic mutation profile in tumor tissues. Simulation data showed that the overall mutation rate was an order of magnitude lower than that in cancer. We detected varied allele fractions of the pSNMs among multiple samples obtained from the same individuals, including blood, saliva, hair follicle, buccal mucosa, urine, and semen samples, indicating that pSNMs could affect multiple sources of somatic cells as well as germ cells. Two of the adults have children who were diagnosed with Dravet syndrome. We identified two non-synonymous pSNMs in SCN1A, a causal gene for Dravet syndrome, from these two unrelated adults and found that the mutant alleles were transmitted to their children, highlighting the clinical importance of detecting pSNMs in genetic counseling. PMID:25312340

  4. Phylogenetic analysis of beta-papillomaviruses as inferred from nucleotide and amino acid sequence data.

    PubMed

    Gottschling, Marc; Köhler, Anja; Stockfleth, Eggert; Nindl, Ingo

    2007-01-01

    Human papillomaviruses (HPV) of the beta-group seem to be involved in the pathogenesis of non-melanoma skin cancer. Papillomaviruses are host specific and are considered to be closely co-evolving with their hosts. Evolutionary incongruence between early genes and late genes has been reported among oncogenic genital alpha-papillomaviruses and considerably challenges phylogenetic reconstructions. We investigated the relationships of 29 beta-HPV (25 types plus four putative new types, subtypes, or variants) as inferred from codon-aligned and amino acid sequence data of the genes E1, E2, E6, E7, L1, and L2 using likelihood, distance, and parsimony approaches. An analysis of an L1 fragment included additional nucleotide and amino acid sequences from seven non-human beta-papillomaviruses. The evolution of early genes and late genes did not conflict significantly in beta-papillomaviruses based on partition homogeneity tests (p ≥ 0.001). As inferred from the complete genome analyses, beta-papillomaviruses were monophyletic and segregated into four highly supported monophyletic assemblages corresponding to the species 1, 2, 3, and fused 4/5. They basically split into the species 1 and the remainder of beta-papillomaviruses, whose species 3, 4, and 5 constituted the sister group of species 2. Beta-papillomaviruses have been isolated from humans, apes, and monkeys, and phylogenetic analyses of the L1 fragment showed non-human papillomaviruses to be highly polyphyletic, nesting within the HPV species. Thus, host and virus phylogenies were not congruent in beta-papillomaviruses, and multiple invasions across species borders may contribute (in addition to host-linked evolution) to their diversification.

  5. Evaluation of the flanking nucleotide sequences of sarcomeric hypertrophic cardiomyopathy substitution mutations.

    PubMed

    Meurs, Kathryn M; Mealey, Katrina L

    2008-07-03

    Hypertrophic cardiomyopathy (HCM) is a familial myocardial disease with a prevalence of 1 in 500. More than 400 causative mutations have been identified in 13 sarcomeric and myofilament-related genes; 350 of these are substitution mutations within eight sarcomeric genes. Within a population, examples of recurring identical disease-causing mutations that appear to have arisen independently have been noted, as well as those that appear to have been inherited from a common ancestor. The large number of novel HCM mutations could suggest a mechanism of increased mutability within the sarcomeric genes. The objective of this study was to evaluate the most commonly reported HCM genes, beta myosin heavy chain (MYH7), myosin binding protein C, troponin I, troponin T, cardiac regulatory myosin light chain, cardiac essential myosin light chain, alpha tropomyosin and cardiac alpha-actin, for sequence patterns surrounding the substitution mutations that may suggest a mechanism of increased mutability. The mutations as well as the 10 flanking nucleotides were evaluated for frequency of di-, tri- and tetranucleotides containing the mutation as well as for the presence of certain tri- and tetranucleotide motifs. The most common substitutions were guanine (G) to adenine (A) and cytosine (C) to thymidine (T). The CG dinucleotide had a significantly higher relative mutability than any other dinucleotide (p<0.05). The relative mutability of each possible trinucleotide and tetranucleotide sequence containing the mutation was calculated; none were at a statistically higher frequency than the others. The large number of G to A and C to T mutations as well as the relative mutability of CG may suggest that deamination of methylated CpG is an important mechanism for mutation development in at least some of these cardiac genes.
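
    The dinucleotide tally described in the abstract above can be sketched roughly as follows. This is only an illustration, not the authors' procedure: the abstract does not define "relative mutability", so the sketch assumes it is the number of mutated positions falling inside a given dinucleotide divided by the number of times that dinucleotide occurs in the sequence. The function names and the example sequence are hypothetical.

```python
from collections import Counter


def dinucleotides_at(seq, pos):
    """Return the dinucleotides of `seq` that contain the 0-based position `pos`."""
    found = []
    if pos > 0:
        found.append(seq[pos - 1:pos + 1])   # mutated base is the second base
    if pos < len(seq) - 1:
        found.append(seq[pos:pos + 2])       # mutated base is the first base
    return found


def relative_mutability(seq, mutated_positions):
    """Assumed definition: mutated occurrences / total occurrences per dinucleotide."""
    seq = seq.upper()
    # Background: how often each dinucleotide occurs anywhere in the sequence.
    background = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    # Foreground: dinucleotides that overlap a substitution site.
    mutated = Counter()
    for pos in mutated_positions:
        mutated.update(dinucleotides_at(seq, pos))
    return {dinuc: mutated[dinuc] / count for dinuc, count in background.items()}


if __name__ == "__main__":
    # Hypothetical 8-nt sequence with one substitution at index 1 (the first C).
    for dinuc, value in sorted(relative_mutability("ACGTACGA", [1]).items()):
        print(dinuc, value)
```

    A real analysis would restrict the background to the windows actually examined (the mutation plus its 10 flanking nucleotides) and would need a significance test, neither of which is shown here.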

  6. Full length nucleotide sequences of 30 common SLC44A2 alleles encoding human neutrophil antigen-3 (HNA-3)

    PubMed Central

    Chen, Qing; Srivastava, Kshitij; Ardinski, Stefanie C.; Lam, Kevin; Huvard, Michael J.; Schmid, Pirmin; Flegel, Willy A.

    2015-01-01

    Background: HNA-3a alloantibodies can cause severe transfusion-related acute lung injury (TRALI). The frequencies of the single nucleotide polymorphisms (SNPs) indicative of the two clinically relevant HNA-3a/b antigens are known in many populations. In the present study, we determined the full length nucleotide sequence of common SLC44A2 alleles encoding the choline transporter-like protein-2 (CTL2) that harbors HNA-3a/b antigens. Study design and methods: A method was devised to determine the full length coding sequence and adjacent intron sequences from genomic DNA by 8 polymerase chain reaction (PCR) amplifications covering all 22 SLC44A2 exons. Samples from 200 African American, 96 Caucasian, 2 Hispanic and 4 Asian blood donors were analyzed. We developed a decision tree to determine alleles (confirmed haplotypes) from the genotype data. Results: A total of 10 SNPs were detected in the SLC44A2 coding sequence. The non-coding sequences harbored an additional 28 SNPs (1 in the 5'-untranslated region (UTR); 23 in the introns; and 4 in the 3'-UTR). No SNP indicative of a non-functional allele was detected. The nucleotide sequences for 30 SLC44A2 alleles (haplotypes) were confirmed. There may be 66 haplotypes among the 604 chromosomes screened. Conclusions: We found 38 SNPs, including 1 novel SNP, in 8192 nucleotides covering the coding sequence of the SLC44A2 gene among 302 blood donors. Population frequencies of these SNPs were established for African Americans and Caucasians. Because alleles encoding HNA-3b are more common than non-functional SLC44A2 alleles, we confirmed our previous postulate that African American donors are less likely to form HNA-3a antibodies compared to Caucasians. PMID:26437811

  7. Sequence analysis and molecular characterization of larval midgut cDNA transcripts encoding peptidases from the yellow mealworm, Tenebrio molitor L.

    PubMed

    Prabhakar, S; Chen, M-S; Elpidina, E N; Vinokurov, K S; Smith, C M; Marshall, J; Oppert, B

    2007-08-01

    Peptidase sequences were analysed in randomly picked clones from cDNA libraries of the anterior or posterior midgut or whole larvae of the yellow mealworm, Tenebrio molitor Linnaeus. Of a total of 1528 sequences, 92 encoded potential peptidases, from which 50 full-length cDNA sequences were obtained, including serine and cysteine proteinases and metallopeptidases. Serine proteinase transcripts were predominant in the posterior midgut, whereas transcripts encoding cysteine and metallopeptidases were mainly found in the anterior midgut. Alignments with other proteinases indicated that 40% of the serine proteinase sequences were serine proteinase homologues, and the remaining ones were identified as either trypsin, chymotrypsin or other serine proteinases. Cysteine proteinase sequences included cathepsin B- and L-like proteinases, and metallopeptidase transcripts were similar to carboxypeptidase A. Northern blot analysis of representative sequences demonstrated the differential expression profile of selected transcripts across five developmental stages of Te. molitor. These sequences provide insights into peptidases in coleopteran insects as a basis to study the response of coleopteran larvae to external stimuli and to evaluate regulatory features of the response.

  8. Primary structure of a genomic zein sequence of maize.

    PubMed Central

    Hu, N T; Peifer, M A; Heidecker, G; Messing, J; Rubenstein, I

    1982-01-01

    The nucleotide sequence of a genomic clone (termed Z4) of the zein multigene family was compared to the nucleotide sequence of related cDNA clones of zein mRNAs. A tandem duplication of a 96-bp sequence is found in the genomic clone that is not present in the related cDNA clones. When the duplication is disregarded, the nucleotide sequence homology between Z4 and its related cDNAs was approximately 97%. The nucleotide sequence is also compared to other isolated cDNAs. No introns in the coding region of the zein gene are detected. The first nucleotide of a putative TATA box, TATAAATA, was located 88 nucleotides upstream of the first nucleotide of the first ATG codon which initiated the open reading frame. The first nucleotide of a putative CCAAT box, CAAAAT, appeared 45 nucleotides upstream of the first nucleotide of the zein cDNA clones in the 3' non-coding region also appeared in the genomic sequence at the same locations. The amino acid composition of the polypeptide specified by the Z4 nucleotide sequence is similar to the known composition of zein proteins. PMID:6233138

  9. Nucleotide sequences of cDNAs for human papillomavirus type 18 transcripts in HeLa cells

    SciTech Connect

    Inagaki, Yutaka; Tsunokawa, Youko; Takebe, Naoko; Terada, Masaaki; Sugimura, Takashi; Nawa, Hiroyuki; Nakanishi, Shigetada

    1988-05-01

    HeLa cells expressed 3.4- and 1.6-kilobase (kb) transcripts of the integrated human papillomavirus (HPV) type 18 genome. Two types of cDNA clones representing each size of HPV type 18 transcript were isolated. Sequence analysis of these two types of cDNA clones revealed that the 3.4-kb transcript contained E6, E7, the 5' portion of E1, and human sequence and that the 1.6-kb transcript contained spliced and frameshifted E6 (E6*), E7, and human sequence. There was a common human sequence containing a poly(A) addition signal in the 3' end portions of both transcripts, indicating that they were transcribed from the HPV genome at the same integration site with different splicing. Furthermore, the 1.6-kb transcript contained both of the two viral TATA boxes upstream of E6, strongly indicating that a cellular promoter was used for its transcription.

  10. Nucleotide sequence of a cluster of early and late genes in a conserved segment of the vaccinia virus genome.

    PubMed Central

    Plucienniczak, A; Schroeder, E; Zettlmeissl, G; Streeck, R E

    1985-01-01

    The nucleotide sequence of a 7.6 kb vaccinia DNA segment from a genomic region conserved among different orthopox viruses has been determined. This segment contains a tight cluster of 12 partly overlapping open reading frames, most of which can be correlated with previously identified early and late proteins and mRNAs. Regulatory signals used by vaccinia virus have been studied. Presumptive promoter regions are rich in A and T and carry the consensus sequences TATA and AATAA spaced at 20-24 base pairs. Tandem repeats of a CTATTC consensus sequence are proposed to be involved in the termination of early transcription. PMID:2987815

  11. The nucleotide sequence of blue-green algae phenylalanine-tRNA and the evolutionary origin of chloroplasts.

    PubMed Central

    Hecker, L I; Barnett, W E; Lin, F K; Furr, T D; Heckman, J E; RajBhandary, U L; Chang, S H

    1982-01-01

    Phenylalanine tRNA from the blue-green alga, Agmenellum quadruplicatum, has been purified to homogeneity. The nucleotide sequence of this tRNA was determined to be: (see text) Comparisons of the sequence and the modified nucleosides of this tRNA with those of other tRNAPhes thus far sequenced indicate that this blue-green algal tRNAPhe is typically prokaryotic and closely resembles the chloroplast tRNAPhes of higher plants and Euglena. The significance of this observation to the evolutionary origin of chloroplasts is discussed. Images PMID:6817301

  12. cDNA sequence of rat liver fructose-1,6-bisphosphatase and evidence for down-regulation of its mRNA by insulin.

    PubMed Central

    el-Maghrabi, M R; Pilkis, J; Marker, A J; Colosia, A D; D'Angelo, G; Fraser, B A; Pilkis, S J

    1988-01-01

    A coding-length clone of rat liver fructose-1,6-bisphosphatase (EC 3.1.3.11) was isolated by immunological screening of a cDNA library in lambda gt11. Its identity was verified by comparing the deduced amino acid sequence with that obtained by direct sequencing of a complete set of CNBr and proteolytic peptides from the purified protein. The enzyme subunit is composed of 362 amino acids and has N-acetylvaline as the amino-terminal residue. The cDNA, 1255 base pairs (bp) long, consisted of 1086 bp of coding region, 15 bp of 5' untranslated sequence, and 154 bp at the 3' untranslated end. The 3' untranslated sequence contained a polyadenylylation signal (AATAAA) followed after 30 bp by a stretch of 7 adenines at the end of the clone. The deduced amino acid sequence was identical to the primary sequence of the protein and confirmed the alignment of five nonoverlapping peptides. It also confirmed the 27-residue extension, unique to the rat liver subunit, ending with a carboxyl-terminal phenylalanine. RNA blot analyses using the radiolabeled liver cDNA as a probe revealed a single band of fructose-1,6-bisphosphatase mRNA, 1.4 kilobases long, in liver and kidney but not in nongluconeogenic tissues. Fructose-1,6-bisphosphatase mRNA was increased 10-fold in livers from diabetic rats and was reduced to control levels after 24 hr of insulin treatment, suggesting that the changes in enzyme activity observed in diabetes and after insulin treatment are due to alterations in mRNA abundance. Images PMID:2847161
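
    The segment lengths quoted in the abstract above can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on the reported numbers; the variable and function names are hypothetical, and it makes no assumption about whether the reported 1086-bp coding region counts an initiator or stop codon, which the abstract does not state.

```python
# Segment lengths (bp) as reported in the abstract above.
SEGMENTS_BP = {
    "5' untranslated": 15,
    "coding region": 1086,
    "3' untranslated": 154,
}
REPORTED_CDNA_BP = 1255      # total clone length given in the abstract
REPORTED_RESIDUES = 362      # subunit length given in the abstract


def total_length(segments):
    """Sum the segment lengths of a cDNA clone."""
    return sum(segments.values())


def codon_count(coding_bp):
    """Number of whole codons in a coding region; reject partial codons."""
    codons, remainder = divmod(coding_bp, 3)
    if remainder:
        raise ValueError("coding region is not a multiple of 3 bp")
    return codons


if __name__ == "__main__":
    print("segments sum to reported length:",
          total_length(SEGMENTS_BP) == REPORTED_CDNA_BP)
    print("codons match reported residues:",
          codon_count(SEGMENTS_BP["coding region"]) == REPORTED_RESIDUES)
```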

  13. Single nucleotide polymorphism discovery from expressed sequence tags in the waterflea Daphnia magna

    PubMed Central

    2011-01-01

    Background: Daphnia (Crustacea: Cladocera) plays a central role in standing aquatic ecosystems, has a well known ecology and is widely used in population studies and environmental risk assessments. Daphnia magna is, especially in Europe, intensively used to study stress responses of natural populations to pollutants, climate change, and antagonistic interactions with predators and parasites, which have all been demonstrated to induce micro-evolutionary and adaptive responses. Although its ecology and evolutionary biology are intensively studied, little is known about the functional genomics underpinning phenotypic responses to environmental stressors. The aim of the present study was to find genes expressed in the presence of environmental stressors, and to target such genes for single nucleotide polymorphic (SNP) marker development. Results: We developed three expressed sequence tag (EST) libraries using clonal lineages of D. magna exposed to ecological stressors, namely fish predation, parasite infection and pesticide exposure. We used these newly developed ESTs and other Daphnia ESTs retrieved from NCBI GenBank to mine for SNP markers targeting synonymous as well as nonsynonymous genetic variation. We validated the developed SNPs in six natural populations of D. magna distributed at regional scale. Conclusions: A large proportion (47%) of the produced ESTs are Daphnia lineage specific genes, which are potentially involved in responses to environmental stress rather than to general cellular functions and metabolic activities, or reflect the arthropod's aquatic lifestyle. The characterization of genes expressed under stress and the validation of their SNPs for population genetic study is important for identifying ecologically responsive genes in D. magna. PMID:21668940

  14. Nucleotide sequence and characterization of a Bacillus subtilis gene encoding a flagellar switch protein.

    PubMed Central

    Zuberi, A R; Bischoff, D S; Ordal, G W

    1991-01-01

    The nucleotide sequence of the Bacillus subtilis fliM gene has been determined. This gene encodes a 38-kDa protein that is homologous to the FliM flagellar switch proteins of Escherichia coli and Salmonella typhimurium. Expression of this gene in Che+ cells of E. coli and B. subtilis interferes with normal chemotaxis. The nature of the chemotaxis defect is dependent upon the host used. In B. subtilis, overproduction of FliM generates mostly nonmotile cells. Those cells that are motile switch less frequently. Expression of B. subtilis FliM in E. coli also generates nonmotile cells. However, those cells that are motile have a tumble bias. The B. subtilis fliM gene cannot complement an E. coli fliM mutant. A frameshift mutation was constructed in the fliM gene, and the mutation was transferred onto the B. subtilis chromosome. The mutant has a Fla- phenotype. This phenotype is consistent with the hypothesis that the FliM protein is a component of the flagellar switch in B. subtilis. Additional characterization of the fliM mutant suggests that the hag and mot loci are not expressed. These loci are regulated by the SigD form of RNA polymerase. We also did not observe any methyl-accepting chemotaxis proteins in an in vivo methylation experiment. The expression of these proteins is also dependent upon SigD. It is possible that a functional basal body-hook complex may be required for the expression of SigD-regulated chemotaxis and motility genes. Images PMID:1898932

  15. Screening target specificity of siRNAs by rapid amplification of cDNA ends (RACE) for non-sequenced species.

    PubMed

    Sabirzhanov, Boris; Sabirzhanova, Inna B; Keifer, Joyce

    2011-05-01

    RNA interference (RNAi) is the process of sequence-specific posttranscriptional gene silencing triggered by double-stranded RNAs (dsRNAs). RNAi is a widely used approach for studying gene function. However, studies have shown that using siRNA can lead to off-target effects when the siRNA contains sufficient sequence identity to non-target mRNA sequences. One of the important steps in designing dsRNA is verification that it has sequence identity to only the target mRNA. In this report, we propose an approach for primary screening of dsRNAs for potential off-target effects by using rapid amplification of cDNA ends. This method can be especially useful for model systems using species that have limited availability of sequence data.

  16. Nucleotide Sequence and Genetic Structure of a Novel Carbaryl Hydrolase Gene (cehA) from Rhizobium sp. Strain AC100

    PubMed Central

    Hashimoto, Masayuki; Fukui, Mitsuru; Hayano, Kouichi; Hayatsu, Masahito

    2002-01-01

    Rhizobium sp. strain AC100, which is capable of degrading carbaryl (1-naphthyl-N-methylcarbamate), was isolated from soil treated with carbaryl. This bacterium hydrolyzed carbaryl to 1-naphthol and methylamine. Carbaryl hydrolase from the strain was purified to homogeneity, and its N-terminal sequence, molecular mass (82 kDa), and enzymatic properties were determined. The purified enzyme hydrolyzed 1-naphthyl acetate and 4-nitrophenyl acetate, indicating that the enzyme is an esterase. We then cloned the carbaryl hydrolase gene (cehA) from the plasmid DNA of the strain and determined the nucleotide sequence of the 10-kb region containing cehA. No homologous sequences were found by a database homology search using the nucleotide and deduced amino acid sequences of the cehA gene. Six open reading frames including the cehA gene were found in the 10-kb region, and sequencing analysis shows that the cehA gene is flanked by two copies of an insertion sequence-like sequence, suggesting that it forms part of a composite transposon. PMID:11872471

  17. HLA-C locus allelic dropout in Sanger sequence-based typing due to intronic single nucleotide polymorphism.

    PubMed

    Cheng, Christopher; Kashi, Zahra Mehdizadeh; Martin, Russell; Woodruff, Gillian; Dinauer, David; Agostini, Tina

    2014-12-01

    We report a novel HLA-C allele that was identified during routine HLA typing using sequence-based methods. The patient was initially typed as a C*06:02, 06:04 with two nucleotide mismatches in exon 3, (C to T and T to G changes) which would have resulted in a non-synonymous mutation of a leucine residue being replaced with tryptophan. Further resolution of the patient's type by using sequence-specific primers (SSP) revealed that the companion allele to C*06:02 was a novel C*17:01. Confirmation of the existence of the new allele was performed across multiple platforms: Sanger sequencing, SSP, and Next Generation Sequencing (NGS) on the original sample and allele-specific clones for the entire HLA-C locus. The investigation revealed a single nucleotide mismatch within the Sanger sequencing primer binding site in intron 3. The mutation caused the initial C*17 dropout in exons 2 and 3. Further analysis of the Sanger and NGS data revealed that the C*17 had two additional unique positions in introns 2 and 7. The companion C*06:02 allele also possessed a novel position at intron 3. On August 31, 2013, the WHO nomenclature committee officially named the novel C*17:01 allele sequence as C*17:01:01:03 and the novel C*06:02 allele sequence as C*06:02:01:03.

  18. Transcription profiling of guanine nucleotide binding proteins during developmental regulation, and pesticide response in Solenopsis invicta (Hymenoptera: Formicidae)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Guanine nucleotide binding proteins (GNBP or G-protein) are glycoproteins anchored on the cytoplasmic cell membrane, and are mediators for many cellular processes. Complete cDNA of guanine nucleotide-binding protein gene ß-subunit (SiGNBP) was cloned and sequenced from S. invicta workers. To detect ...

  19. Sequencing analysis of 20,000 full-length cDNA clones from cassava reveals lineage specific expansions in gene families related to stress response

    PubMed Central

    Sakurai, Tetsuya; Plata, Germán; Rodríguez-Zapata, Fausto; Seki, Motoaki; Salcedo, Andrés; Toyoda, Atsushi; Ishiwata, Atsushi; Tohme, Joe; Sakaki, Yoshiyuki; Shinozaki, Kazuo; Ishitani, Manabu

    2007-01-01

    Background: Cassava, an allotetraploid known for its remarkable tolerance to abiotic stresses, is an important source of energy for humans and animals and a raw material for many industrial processes. A full-length cDNA library of cassava plants under normal, heat, drought, aluminum and postharvest physiological deterioration conditions was built; 19968 clones were sequence-characterized using expressed sequence tags (ESTs). Results: The ESTs were assembled into 6355 contigs and 9026 singletons that were further grouped into 10577 scaffolds; we found 4621 new cassava sequences and 1521 sequences with no significant similarity to plant protein databases. Transcripts of 7796 distinct genes were captured and we were able to assign a functional classification to 78% of them while finding more than half of the enzymes annotated in metabolic pathways in Arabidopsis. The annotation of sequences that were not paired to transcripts of other species included many stress-related functional categories, showing that our library is enriched with stress-induced genes. Finally, we detected 230 putative gene duplications that include key enzymes in reactive oxygen species signaling pathways and could play a role in cassava stress response features. Conclusion: The cassava full-length cDNA library presented here contains transcripts of genes involved in stress response as well as genes important for different areas of cassava research. This library will be an important resource for gene discovery, characterization and cloning; in the near future it will aid the annotation of the cassava genome. PMID:18096061

  20. Construction of a full-length cDNA library and preliminary analysis of expressed sequence tags from lymphocytes of half-pipe snowboarding athletes.

    PubMed

    Zhao, Y H; Zhang, Z B; Zhao, C Q; Zhang, Y; Wang, Y F; Guan, W J; Zhu, Z Q

    2015-10-21

    The genes of top athletes are a valuable genetic resource for the human race, and could be exploited to identify novel genes related to sports ability, as well as other functions. We analyzed the expressed sequence tags from top half-pipe snowboarding athletes using the SMART complementary DNA (cDNA) library construction method to elucidate the characteristics of the athlete genome and the differential expression of the genes it contains. Overall, we established a full-length cDNA library from the lymphocytes of half-pipe snowboarding athletes and analyzed the inserted gene fragments. We also classified those genes according to molecular function, biological characteristics, cellular composition, protein types, and signal paths. A total of 201 functional genes were noted, which were distributed in 27 pathways. TXN, MDH1, ARL1, ARPC3, ACTG1, and other genes measured in sequence may be associated with physical ability. This suggests that the SMART cDNA library constructed from the genetic material from top athletes is an effective tool for preserving genetic sports resources and providing genetic markers of physical ability for athlete selection.

  1. Interferon-induced 56,000 Mr protein and its mRNA in human cells: molecular cloning and partial sequence of the cDNA.

    PubMed Central

    Chebath, J; Merlin, G; Metz, R; Benech, P; Revel, M

    1983-01-01

    Treatment of responsive cells by interferons (IFNs) induces within a few hours a rise in the concentration of several proteins and mRNAs. In order to characterize these IFN-induced mRNA species, we have cloned in E. coli the cDNA made from a 17-18S poly(A)+ RNA of human fibroblastoid cells (SV80) treated with IFN-beta. We describe here a pBR322 recombinant plasmid (C56) which contains a 400 bp cDNA insert corresponding to an 18S mRNA species newly induced by IFN. The C56 mRNA codes for a 56,000 dalton protein easily detectable by hybridization-translation experiments. The sequence of 66 of the carboxy-terminal amino acids of the protein can be deduced from the cDNA sequence. IFNs-alpha, beta or gamma are able to activate the expression of this gene in human fibroblasts as well as lymphoblastoid cells. The mRNA is not detectable without IFN; it reaches maximum levels (0.1% of the total poly(A)+ RNA) within 4-8 hrs and decreases after 16 hrs. Images PMID:6186990

  2. Nucleotide sequence and analysis of the 58.3 to 65.5-kb early region of bacteriophage T4.

    PubMed Central

    Valerie, K; Stevens, J; Lynch, M; Henderson, E E; de Riel, J K

    1986-01-01

    The complete 7.2-kb nucleotide sequence from the 58.3 to 65.5-kb early region of bacteriophage T4 has been determined by Maxam and Gilbert sequencing. Computer analysis revealed at least 20 open reading frames (ORFs) within this sequence. All major ORFs are transcribed from the left strand, suggesting that they are expressed early during infection. Among the ORFs, we have identified the ipIII, ipII, denV and tk genes. The ORFs are very tightly spaced, even overlapping in some instances, and when ORF interspacing occurs, promoter-like sequences can be implicated. Several of the sequences preceding the ORFs, in particular those at ipIII, ipII, denV, and orf61.9, can potentially form stable stem-loop structures. PMID:3024113

  3. Complete nucleotide sequence and analysis of the putative polyprotein of maize dwarf mosaic virus genomic RNA (Bulgarian isolate).

    PubMed

    Kong, P; Steinbiss, H H

    1998-01-01

    The complete nucleotide sequence of maize dwarf mosaic virus Bulgarian isolate (MDMV-Bg) was determined. The viral genome was 9515 nt and contained an open reading frame encoding 3042 amino acids, flanked by 3'- and 5'-UTRs of 139 and 250 nucleotides, respectively. MDMV-Bg was more conserved in the coding region (52.9%) than in the UTRs (45.8%) when compared to the 15 other potyviruses. Of ten putative gene products of MDMV-Bg, the P1 was the most variable protein (24.9%) while the NIb was the most conserved protein (67.3%). Several sequence variations were observed between MDMV-Bg and Johnson grass mosaic virus (JGMV), and more between MDMV-Bg and the dicot potyviruses. Phylogenetic analysis suggested that MDMV-Bg was the most closely related to JGMV.

  4. The nucleotide sequence of the coat protein genes of satsuma dwarf virus and navel orange infectious mottling virus.

    PubMed

    Iwanami, T; Kondo, Y; Makita, Y; Azeyanagi, C; Ieki, H

    1998-01-01

    The sequences of the 3'-terminal 4320 and 2409 nucleotides were determined for RNA2 of satsuma dwarf virus (SDV) and navel orange infectious mottling virus (NIMV), respectively. Both sequences contained a part of a long open reading frame which encodes larger and smaller coat proteins (CPs) at the 3'-terminus, followed by a 3' non-coding region upstream of a poly(A) tail. Amino acid sequence identity for larger and smaller CPs ranged from 81-84% and 68-78%, respectively, among SDV, NIMV and the previously sequenced citrus mosaic virus (CiMV). No significant sequence similarity was found between the CPs of SDV or NIMV and those of the como-, nepo- or other viruses. The nucleotide sequence identity of the 3' non-coding region of RNA2 was 68%-78% among SDV, CiMV and NIMV. These results suggest that SDV, CiMV and NIMV are distinct, though related, viruses. They may be assigned as members of a new genus, which is close to the genera Comovirus and Nepovirus.

  5. Cloning and nucleotide sequence of the gene coding for aspartokinase II from a thermophilic methylotrophic Bacillus sp.

    PubMed Central

    Schendel, F J; Flickinger, M C

    1992-01-01

    The structural gene coding for the lysine-sensitive aspartokinase II of the methylotrophic thermotolerant Bacillus sp. strain MGA3 was cloned from a genomic library by complementation of an Escherichia coli auxotrophic mutant lacking all three aspartokinase isozymes. The nucleotide sequence of the entire 2.2-kb PstI fragment was determined, and a single open reading frame coding for the aspartokinase II enzyme was found. Aspartokinase II was shown to be an alpha 2 beta 2 tetramer (M(r) 122,000) with the beta subunit (M(r) 18,000) encoded within the alpha subunit (M(r) 45,000) in the same reading frame. The enzyme was purified, and the N-terminal sequences of the alpha and beta subunits were identical with those predicted from the gene sequences. The predicted amino acid sequence was 76% identical with the sequence of the Bacillus subtilis aspartokinase II. The transcription initiation site was located approximately 350 bp upstream of the translation start site, and putative promoter regions at -10 (TATGCT) and -35 (ATGACA) were identified. A 300-nucleotide intervening sequence between the transcription initiation and translational start sites suggests a possible attenuation mechanism for the regulation of transcription of this enzyme in the presence of lysine. Images PMID:1444390

  6. Complete nucleotide sequence of the haemagglutinin gene from a human influenza virus of the Hong Kong subtype.

    PubMed Central

    Both, G W; Sleigh, M J

    1980-01-01

    The complete nucleotide sequence has been determined for a cloned double-stranded DNA copy of the haemagglutinin gene from the human influenza strain A/NT/60/68/29C, a laboratory-isolated variant of A/NT/60/68, an early strain of the Hong Kong subtype. The gene is 1765 nucleotides long and contains information sufficient to code for a protein of 566 amino acids, which includes a hydrophobic leader peptide (16 residues), HA1 (328), HA2 (221) and an arginine residue which joins the HA subunits. Comparison of the predicted amino acid sequence for 29C haemagglutinin with protein sequence data available for HA from other influenza strains shows that no potential coding information is lost by processing of the mRNA. A comparison of the amino acid sequences predicted from the gene sequences for 29C and fowl plague virus haemagglutinins, (1) indicates the extent to which changes can occur in the primary sequence of different regions of the protein, while maintaining essential structure and function. Images PMID:6253883
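
    The residue counts quoted in this abstract can be cross-checked with simple arithmetic. The sketch below is only a consistency check: the variable names are illustrative, and the single stop codon is an assumption rather than a detail stated in the abstract.

```python
# Components of the predicted haemagglutinin precursor, as quoted in the abstract.
segments = {"leader peptide": 16, "HA1": 328, "connecting Arg": 1, "HA2": 221}
gene_length_nt = 1765

total_residues = sum(segments.values())
coding_nt = 3 * total_residues                # sense codons only
coding_with_stop_nt = coding_nt + 3           # assumption: one stop codon
# Nucleotides left over for the 5' and 3' non-coding flanks combined.
noncoding_nt = gene_length_nt - coding_with_stop_nt

print(total_residues, coding_nt, noncoding_nt)
```

    The component lengths sum to the 566 residues reported, and the coding region fits within the 1765-nucleotide gene with a short non-coding remainder.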

  7. Group-specific amplification of cDNA from DRB1 genes. Complete coding sequences of partially defined alleles and identification of the new alleles DRB1*040602, DRB1*111102, DRB1*080103, and DRB1*0113.

    PubMed

    Balas, Antonio; Vilches, Carlos; Rodríguez, Miguel A; Fernández, Begoña; Martinez, Maria Paz; de Pablo, Rosario; García-Sánchez, Félix; Vicario, Jose L

    2006-12-01

    We present here the complete coding sequences, previously unavailable, of the DRB1 alleles DRB1*030102, *0306, *040701, *0408, *1327, *1356, *1411, *1446, *1503, *1504, *0806, *0813, and *0818. For cDNA isolation, new group-specific primers located in the 5'UT and 3'UT regions were used for allele-specific amplification, which provides a convenient method for determining full-length sequences of DRB1 alleles. Complete coding sequencing of samples previously typed as DRB1*0406, DRB1*080101, and DRB1*1111 revealed new alleles with noncoding nucleotide changes at exons 1 and 3. In addition, we found a novel allele, DRB1*0113, whose second exon carries a sequence motif characteristic of DRB1*07 alleles. The predicted class II haplotypic associations of all alleles are reported and discussed.

  8. Cloning of the canine beta-glucuronidase cDNA, mutation identification in canine MPS VII, and retroviral vector-mediated correction of MPS VII cells.

    PubMed

    Ray, J; Bouvet, A; DeSanto, C; Fyfe, J C; Xu, D; Wolfe, J H; Aguirre, G D; Patterson, D F; Haskins, M E; Henthorn, P S

    1998-03-01

    Mucopolysaccharidosis type VII (MPS VII) is an inherited disease resulting from deficient activity of the lysosomal acid hydrolase beta-glucuronidase (GUSB) and has been reported in humans, mice, cats, and dogs. To characterize canine MPS VII, we have isolated and sequenced the canine GUSB cDNA from normal and affected animals. A single nucleotide substitution was detected in the GUSB cDNA derived from MPS VII dogs. This guanine to adenine base change at nucleotide position 559 in the canine cDNA sequence causes an arginine to histidine substitution at amino acid position 166. Introduction of the G to A substitution at position 559 in a mammalian expression vector containing the normal canine GUSB cDNA nearly eliminated the GUSB enzymatic activity, demonstrating that this mutation is the cause of canine MPS VII. A retroviral vector expressing the full-length canine beta-glucuronidase cDNA corrected the deficiency in MPS VII cells.
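
    A single G to A change can turn an arginine codon into a histidine codon only in certain codons. The sketch below enumerates those cases with the standard genetic code. The abstract does not give the actual canine codon, so the function name and the offset calculation at the end are illustrative assumptions, not the authors' analysis.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def g_to_a_arg_to_his():
    """Return (arg_codon, mutant_codon, codon_position) for every single
    G->A change that converts an Arg codon into a His codon."""
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa != "R":
            continue
        for i, base in enumerate(codon):
            if base == "G":
                mutant = codon[:i] + "A" + codon[i + 1:]
                if CODON_TABLE[mutant] == "H":
                    hits.append((codon, mutant, i + 1))
    return hits

hits = g_to_a_arg_to_his()
print(hits)

# Assumption: residue 1 is the initiator Met and position 559 counts from the
# first cDNA nucleotide. The second base of codon 166 is then ORF nucleotide
# 3 * 165 + 2, and the difference is the implied length of 5' sequence
# preceding the start codon.
implied_5prime_offset = 559 - (3 * (166 - 1) + 2)
print(implied_5prime_offset)
```

    Only CGT and CGC qualify, and in both the change falls at the second codon position, which is consistent with a point mutation of the kind described.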

  9. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms

    PubMed Central

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies have investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared the sequence information obtained with the available donkey draft genome (and the Illumina reads from which it was derived) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than the number aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionarily important in equids. Comparing Y-chromosome regions, we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated by combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources. PMID:26151450
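
    The abstract attributes the difference in aligned reads to the larger N50 of the horse assembly. For readers unfamiliar with the statistic, here is a minimal sketch of the usual N50 definition. The contig lengths are made-up toy values, not data from either assembly.

```python
def n50(lengths):
    """Length L such that contigs of length >= L together cover
    at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contig lengths given")

# Toy assemblies (hypothetical lengths): the second is more fragmented.
contiguous = [100, 50, 20, 10]
fragmented = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(contiguous), n50(fragmented))
```

    A higher N50 means that more of the assembly sits in long contigs or scaffolds, so a read is less likely to fall across a break between them.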

  10. Next Generation Semiconductor Based Sequencing of the Donkey (Equus asinus) Genome Provided Comparative Sequence Data against the Horse Genome and a Few Millions of Single Nucleotide Polymorphisms.

    PubMed

    Bertolini, Francesca; Scimone, Concetta; Geraci, Claudia; Schiavo, Giuseppina; Utzeri, Valerio Joe; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    Few studies have investigated the donkey (Equus asinus) at the whole genome level so far. Here, we sequenced the genome of two male donkeys using a next generation semiconductor based sequencing platform (the Ion Proton sequencer) and compared the sequence information obtained with the available donkey draft genome (and the Illumina reads from which it was derived) and with the EquCab2.0 assembly of the horse genome. Moreover, the Ion Torrent Personal Genome Analyzer was used to sequence reduced representation libraries (RRL) obtained from a DNA pool including donkeys of different breeds (Grigio Siciliano, Ragusano and Martina Franca). The number of next generation sequencing reads aligned with the EquCab2.0 horse genome was larger than the number aligned with the draft donkey genome. This was due to the larger N50 for contigs and scaffolds of the horse genome. Nucleotide divergence between E. caballus and E. asinus was estimated to be ~0.52-0.57%. Regions with low nucleotide divergence were identified in several autosomal chromosomes and in the whole chromosome X. These regions might be evolutionarily important in equids. Comparing Y-chromosome regions, we identified variants that could be useful to track donkey paternal lineages. Moreover, about 4.8 million single nucleotide polymorphisms (SNPs) in the donkey genome were identified and annotated by combining sequencing data from Ion Proton (whole genome sequencing) and Ion Torrent (RRL) runs with Illumina reads. A higher density of SNPs was present in regions homologous to horse chromosome 12, in which several studies reported a high frequency of copy number variants. The SNPs we identified constitute a first resource useful to describe variability at the population genomic level in E. asinus and to establish monitoring systems for the conservation of donkey genetic resources.

  11. Immunological responses of turbot (Psetta maxima) to nodavirus infection or polyriboinosinic polyribocytidylic acid (pIC) stimulation, using expressed sequence tags (ESTs) analysis and cDNA microarrays.

    PubMed

    Park, Kyoung C; Osborne, Jane A; Montes, Ariana; Dios, Sonia; Nerland, Audun H; Novoa, Beatriz; Figueras, Antonio; Brown, Laura L; Johnson, Stewart C

    2009-01-01

    To investigate the immunological responses of turbot to nodavirus infection or pIC stimulation, we constructed cDNA libraries from liver, kidney and gill tissues of nodavirus-infected fish and examined the differential gene expression within turbot kidney in response to nodavirus infection or pIC stimulation using a turbot cDNA microarray. Turbot were experimentally infected with nodavirus and samples of each tissue were collected at selected time points post-infection. Using equal amounts of total RNA at each sampling time, we made three tissue-specific cDNA libraries. After sequencing 3230 clones we obtained 3173 (98.2%) high quality sequences from our liver, kidney and gill libraries. Of these 2568 (80.9%) were identified as known genes and 605 (19.1%) as unknown genes. A total of 768 unique genes were identified. The two largest groups resulting from the classification of ESTs according to function were the cell/organism defense genes (71 uni-genes) and apoptosis-related process (23 uni-genes). Using these clones, a 1920 element cDNA microarray was constructed and used to investigate the differential gene expression within turbot in response to experimental nodavirus infection or pIC stimulation. Kidney tissue was collected at selected times post-infection (HPI) or stimulation (HPS), and total RNA was isolated for microarray analysis. Of the 1920 genes studied on the microarray, we identified a total of 121 differentially expressed genes in the kidney: 94 genes from nodavirus-infected animals and 79 genes from those stimulated with pIC. Within the nodavirus-infected fish we observed the highest number of differentially expressed genes at 24 HPI. Our results indicate that certain genes in turbot have important roles in immune responses to nodavirus infection and dsRNA stimulation.

  12. A close relationship between primary nucleotides sequence structure and the composition of functional genes in the genome of prokaryotes.

    PubMed

    Garcia, Juan A L; Fernández-Guerra, Antoni; Casamayor, Emilio O

    2011-12-01

    Comparative genomics is an essential tool to unravel how genomes change over evolutionary time and to gain clues on the links between functional genomics and evolution. In prokaryotes, the large, good quality, genome sequences available in public databases and the recently developed large-scale computational methods offer an unprecedented view on the ecology and evolution of microorganisms through comparative genomics. In this work, we examined the links among genome structure (i.e., the sequential distribution of nucleotides itself by detrended fluctuation analysis, DFA) and genomic diversity (i.e., gene functionality by Clusters of Orthologous Genes, COGs) in 828 fully sequenced prokaryotic genomes from 548 different bacteria and archaea species. DFA scaling exponent α indicated persistent long-range correlations (fractality) in each genome analyzed. Higher resolution power was found when considering the sequential succession of purine (AG) vs. pyrimidine (CT) bases than either keto (GT) to amino (AC) forms or strongly (GC) vs. weakly (AT) bonded nucleotides. Interestingly, the phyla Aquificae, Fusobacteria, Dictyoglomi, Nitrospirae, and Thermotogae were closer to archaea than to their bacterial counterparts. A strong significant correlation was found between scaling exponent α and COGs distribution, and we consistently observed that the larger α the more heterogeneous was the gene distribution within each functional category, suggesting a close relationship between primary nucleotides sequence structure and functional genes composition.
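
    To make the DFA step concrete, here is a minimal sketch of first-order detrended fluctuation analysis applied to a purine/pyrimidine walk. The abstract does not state the DFA variant, window sizes or implementation the authors used, so the function name, the window sizes and the random toy sequence are all assumptions made for illustration.

```python
import math
import random

def dfa_exponent(seq, window_sizes=(8, 16, 32, 64, 128)):
    """Estimate the DFA-1 scaling exponent of a purine/pyrimidine walk."""
    # Map purines (A, G) to +1 and every other symbol to -1.
    steps = [1.0 if base in "AG" else -1.0 for base in seq.upper()]
    mean_step = sum(steps) / len(steps)
    # Integrated, mean-centred profile of the walk.
    profile, running = [], 0.0
    for step in steps:
        running += step - mean_step
        profile.append(running)

    log_n, log_f = [], []
    for n in window_sizes:
        n_windows = len(profile) // n
        x_mean = (n - 1) / 2
        sxx = sum((x - x_mean) ** 2 for x in range(n))
        squared_residuals = 0.0
        for w in range(n_windows):
            ys = profile[w * n:(w + 1) * n]
            y_mean = sum(ys) / n
            # Least-squares linear trend within the window.
            slope = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys)) / sxx
            squared_residuals += sum(
                (y - y_mean - slope * (x - x_mean)) ** 2 for x, y in enumerate(ys)
            )
        fluctuation = math.sqrt(squared_residuals / (n_windows * n))
        log_n.append(math.log(n))
        log_f.append(math.log(fluctuation))

    # Slope of log F(n) against log n is the scaling exponent.
    mx = sum(log_n) / len(log_n)
    my = sum(log_f) / len(log_f)
    return sum((a - mx) * (b - my) for a, b in zip(log_n, log_f)) / sum(
        (a - mx) ** 2 for a in log_n
    )

random.seed(0)
toy = "".join(random.choice("ACGT") for _ in range(4096))  # uncorrelated toy sequence
alpha = dfa_exponent(toy)
print(round(alpha, 2))
```

    For an uncorrelated sequence such as this toy input, the exponent is expected to lie near 0.5. Values above 0.5 indicate persistent long-range correlations of the kind the abstract reports for real genomes.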

  13. [Analysis on the preference of synonymous codon in VP1 nucleotide sequence of the EV71 based on RSCU method].

    PubMed

    Qi, Bin; Zhao, Jing-Jing; Gao, Lei; Zhu, Ping

    2009-11-01

    Codon usage preference in VP1 nucleotide sequences of EV71 isolated in the Chinese mainland and the Taiwan region from 1998 to 2008 was analyzed using the relative synonymous codon usage (RSCU) method. RSCU values calculated from EV71 VP1 sequences differed clearly over time in both regions, and more markedly in the Taiwan region. According to the diversity of RSCU, the years can be divided into 2 intervals in the Chinese mainland and 4 intervals in the Taiwan region; the number of intervals in a region correlates positively with the variation activity of EV71 in that region. Changes in codon usage preference in VP1 nucleotide sequences clearly reflect the variation of EV71, so analysis of codon usage preference in VP1 nucleotide sequences can be used to predict the possible variation trend of EV71.
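
    RSCU is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below shows the calculation with the standard genetic code. The function name and the short toy sequence are illustrative assumptions; no EV71 data are used.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def rscu(cds):
    """Relative synonymous codon usage for an in-frame coding sequence."""
    cds = cds.upper().replace("U", "T")
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group sense codons by the amino acid they encode.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from the sequence
        expected = total / len(family)  # equal use of all synonymous codons
        for codon in family:
            values[codon] = counts[codon] / expected
    return values

toy = "GCTGCTGCCATG"  # hypothetical sequence: Ala, Ala, Ala, Met
result = rscu(toy)
print(result["GCT"], result["GCC"], result["GCA"], result["ATG"])
```

    A value above 1 marks a codon used more often than expected among its synonyms, and a value below 1 marks one used less often. Comparing such profiles across isolation years is the kind of analysis the abstract describes.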

  14. The Nucleotide Capture Region of Alpha Hemolysin: Insights into Nanopore Design for DNA Sequencing from Molecular Dynamics Simulations.

    PubMed

    Manara, Richard M A; Tomasio, Susana; Khalid, Syma

    2015-01-27

    Nanopore technology for DNA sequencing is constantly being refined and improved. In strand sequencing a single strand of DNA is fed through a nanopore and subsequent fluctuations in the current are measured. A major hurdle is that the DNA is translocated through the pore at a rate that is too fast for the current measurement systems. An alternative approach is "exonuclease sequencing", in which an exonuclease is attached to the nanopore that is able to process the strand, cleaving off one base at a time. The bases then flow through the nanopore and the current is measured. This method has the advantage of potentially solving the translocation rate problem, as the speed is controlled by the exonuclease. Here we consider the practical details of exonuclease attachment to the protein alpha hemolysin. We employ molecular dynamics simulations to determine the ideal (a) distance from alpha-hemolysin, and (b) the orientation of the monophosphate nucleotides upon release from the exonuclease such that they will enter the protein. Our results indicate an almost linear decrease in the probability of entry into the protein with increasing distance of nucleotide release. The nucleotide orientation is less significant for entry into the protein.

  15. Complete nucleotide sequence and gene rearrangement of the mitochondrial genome of the bell-ring frog, Buergeria buergeri (family Rhacophoridae).

    PubMed

    Sano, Naomi; Kurabayashi, Atsushi; Fujii, Tamotsu; Yonekawa, Hiromichi; Sumida, Masayuki

    2004-06-01

    In this study we determined the complete nucleotide sequence (19,959 bp) of the mitochondrial DNA of the rhacophorid frog Buergeria buergeri. The gene content, nucleotide composition, and codon usage of B. buergeri conformed to those of typical vertebrate patterns. However, due to an accumulation of lengthy repetitive sequences in the D-loop region, this species possesses the largest mitochondrial genome among all the vertebrates examined so far. Comparison of the gene organizations among amphibian species (Rana, Xenopus, salamanders and caecilians) revealed that the positioning of four tRNA genes and the ND5 gene in the mtDNA of B. buergeri diverged from the common vertebrate gene arrangement shared by Xenopus, salamanders and caecilians. The unique positions of the tRNA genes in B. buergeri are shared by ranid frogs, indicating that the rearrangements of the tRNA genes occurred in a common ancestral lineage of ranids and rhacophorids. On the other hand, the novel position of the ND5 gene seems to have arisen in a lineage leading to rhacophorids (and other closely related taxa) after ranid divergence. Phylogenetic analysis based on nucleotide sequence data of all mitochondrial genes also supported the gene rearrangement pathway.

  16. IRE1α nucleotide sequence cleavage specificity in the unfolded protein response.

    PubMed

    Poothong, Juthakorn; Sopha, Pattarawut; Kaufman, Randal J; Tirasophon, Witoon

    2017-01-01

    Inositol-requiring enzyme 1 (IRE1) is a conserved sensor of the unfolded protein response that has protein kinase and endoribonuclease (RNase) enzymatic activities and thereby initiates HAC1/XBP1 splicing. Previous studies demonstrated that human IRE1α (hIRE1α) does not cleave Saccharomyces cerevisiae HAC1 mRNA. Using an in vitro cleavage assay, we show that adenine to cytosine nucleotide substitution at the +1 position in the 3' splice site of HAC1 RNA is required for specific cleavage by hIRE1α. A similar restricted nucleotide specificity in the RNA substrate was observed for XBP1 splicing in vivo. Together these findings underscore the essential role of cytosine nucleotide at +1 in the 3' splice site for determining cleavage specificity of hIRE1α.

  17. Cloning and nucleotide sequence of the gene coding for enzymatically active fragments of the Bacillus polymyxa beta-amylase.

    PubMed

    Kawazu, T; Nakanishi, Y; Uozumi, N; Sasaki, T; Yamagata, H; Tsukagoshi, N; Udaka, S

    1987-04-01

    The gene encoding beta-amylase was cloned from Bacillus polymyxa 72 into Escherichia coli HB101 by inserting HindIII-generated DNA fragments into the HindIII site of pBR322. The 4.8-kilobase insert was shown to direct the synthesis of beta-amylase. A 1.8-kilobase AccI-AccI fragment of the donor strain DNA was sufficient for the beta-amylase synthesis. Homologous DNA was found by Southern blot analysis to be present only in B. polymyxa 72 and not in other bacteria such as E. coli or B. subtilis. B. polymyxa, as well as E. coli harboring the cloned DNA, was found to produce enzymatically active fragments of beta-amylases (70,000, 56,000, or 58,000, and 42,000 daltons), which were detected in situ by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. Nucleotide sequence analysis of the cloned 3.1-kilobase DNA revealed that it contains one open reading frame of 2,808 nucleotides without a translational stop codon. The deduced amino acid sequence for these 2,808 nucleotides encoding a secretory precursor of the beta-amylase protein is 936 amino acids including a signal peptide of 33 or 35 residues at its amino-terminal end. The existence of a beta-amylase of larger than 100,000 daltons, which was predicted on the basis of the results of nucleotide sequence analysis of the gene, was confirmed by examining culture supernatants after various cultivation periods. It existed only transiently during cultivation, but the multiform beta-amylases described above existed for a long time. The large beta-amylase (approximately 160,000 daltons) existed for longer in the presence of a protease inhibitor such as chymostatin, suggesting that proteolytic cleavage is the cause of the formation of multiform beta-amylases.

  18. Organization and nucleotide sequence of a densovirus genome imply a host-dependent evolution of the parvoviruses.

    PubMed Central

    Bando, H; Kusuda, J; Gojobori, T; Maruyama, T; Kawase, S

    1987-01-01

    The genome structure of a densovirus from a silkworm was determined by sequencing more than 85% of the complete genome DNA. This is the first report of the genome organization of an insect parvovirus deduced from the DNA sequence. In the viral genome, two large open reading frames designated 1 and 2 and one smaller open reading frame designated 3 were identified. The first two open reading frames shared the same strand, while the third was found in the complementary sequence. Computer analysis suggested that open reading frame 2 may encode all four structural proteins. The genome organization and a part of the nucleotide sequence were conserved among the insect densovirus, rodent parvoviruses, and a human dependovirus. These viruses may have diverged from a common ancestor. PMID:3027382

  19. Nucleotide sequence and phylogeny of a chloramphenicol acetyltransferase encoded by the plasmid pSCS7 from Staphylococcus aureus.

    PubMed

    Schwarz, S; Cardoso, M

    1991-08-01

    The nucleotide sequence of the chloramphenicol acetyltransferase gene (cat) and its regulatory region, encoded by the plasmid pSCS7 from Staphylococcus aureus, was determined. The structural cat gene encoded a protein of 209 amino acids, which represented one monomer of the enzyme chloramphenicol acetyltransferase (CAT). Comparisons between the amino acid sequences of the pSCS7-encoded CAT from S. aureus and the previously sequenced CAT variants from S. aureus, Staphylococcus intermedius, Staphylococcus haemolyticus, Bacillus pumilus, Clostridium difficile, Clostridium perfringens, Escherichia coli, Shigella flexneri, and Proteus mirabilis were performed. An alignment of CAT amino acid sequences demonstrated the presence of 34 conserved amino acids among all CAT variants. These conserved residues were considered for their possible roles in the structure and function of CAT. On the basis of the alignment, a phylogenetic tree was constructed. It demonstrated relatively large evolutionary distances between the CAT variants of enteric bacteria, Clostridium, Bacillus, and Staphylococcus species.

  20. Complete Nucleotide Sequence and Genetic Organization of the 210-Kilobase Linear Plasmid of Rhodococcus erythropolis BD2

    PubMed Central

    Stecker, Christiane; Johann, Andre; Herzberg, Christina; Averhoff, Beate; Gottschalk, Gerhard

    2003-01-01

    The complete nucleotide sequence of the linear plasmid pBD2 from Rhodococcus erythropolis BD2 comprises 210,205 bp. Sequence analyses of pBD2 revealed 212 putative open reading frames (ORFs), 97 of which had an annotatable function. These ORFs could be assigned to six functional groups: plasmid replication and maintenance, transport and metalloresistance, catabolism, transposition, regulation, and protein modification. Many of the transposon-related sequences were found to flank the isopropylbenzene pathway genes. This finding together with the significant sequence similarities of the ipb genes to genes of the linear plasmid-encoded biphenyl pathway in other rhodococci suggests that the ipb genes were acquired via transposition events and subsequently distributed among the rhodococci via horizontal transfer. PMID:12923100

  1. Nucleotide sequence of the melA gene, coding for alpha-galactosidase in Escherichia coli K-12.

    PubMed Central

    Liljeström, P L; Liljeström, P

    1987-01-01

    Melibiose uptake and hydrolysis in E. coli are performed by the MelB and MelA proteins, respectively. We report the cloning and sequencing of the melA gene. The nucleotide sequence data showed that melA codes for a 450 amino acid long protein with a molecular weight of 50.6 kd. The sequence data also supported the assumption that the mel locus forms an operon with melA in proximal position. A comparison of MelA with alpha-galactosidase proteins from yeast and human origin showed that these proteins have only limited homology, the yeast and human proteins being more closely related to each other. However, regions common to all three proteins were found, indicating sequences that might comprise the active site of alpha-galactosidase. PMID:3031590

  2. Role of base stacking and sequence context in the inhibition of yeast DNA polymerase eta by pyrene nucleotide.

    PubMed

    Hwang, Hanshin; Taylor, John-Stephen

    2004-11-23

    The Y family DNA polymerase yeast pol eta inserts pyrene deoxyribose monophosphate (dPMP) in preference to A opposite an abasic site, the 3'-T of a thymine dimer, and a normal T with almost equal efficiency. In contrast, pol A family polymerases such as Klenow fragment and T7 DNA polymerase only insert dPMP efficiently opposite an abasic site and the 3'-T of a thymine dimer but not opposite undamaged DNA. Pyrene nucleotide is also an efficient chain-terminating inhibitor of DNA synthesis by pol eta but not by Klenow fragment or T7 DNA polymerase. To better understand the origin of the efficiency and sequence specificity of dPMP insertion by pol eta, the kinetics of dPMP insertion opposite various templates have been determined. In one sequence context, the efficiency of dPMP insertion increases 4.6-fold opposite G < A < T < C, suggesting that the templating nucleotide modulates dPMP insertion efficiency by having to destack prior to dPTP binding. The efficiency of insertion of dPMP opposite T in the same sequence context increases 7-fold for primers terminating in G < A < C < T and is similar to that observed for nontemplated blunt-end extension, suggesting that stacking interactions between the pyrene and the primer terminus are also important. On heterogeneous templates, the average selectivity for dPMP insertion relative to the complementary dNMP decreases in the order of dAMP > dGMP > dTMP > dCMP, from a high of 5.8 when dAMP is to be inserted following a T to a low of 0.5 when dCMP is to be inserted following a C. The relative preference for dPMP insertion at a given site can be largely explained by the energetic cost of destacking the templating base and stacking of pyrene nucleotide relative to that of stacking and base pairing the complementary nucleotide. Thus, pyrene nucleotide represents a novel class of nucleotide-based chain-terminating DNA synthesis inhibitors whose base portion consists of a hydrophobic, non-hydrogen bonding, base-pair mimic.

  3. Respiratory syncytial virus fusion glycoprotein: nucleotide sequence of mRNA, identification of cleavage activation site and amino acid sequence of N-terminus of F1 subunit.

    PubMed Central

    Elango, N; Satake, M; Coligan, J E; Norrby, E; Camargo, E; Venkatesan, S

    1985-01-01

    The amino acid sequence of respiratory syncytial virus fusion protein (Fo) was deduced from the sequence of a partial cDNA clone of mRNA and from the 5' mRNA sequence obtained by primer extension and dideoxysequencing. The encoded protein of 574 amino acids is extremely hydrophobic and has a molecular weight of 63371 daltons. The site of proteolytic cleavage within this protein was accurately mapped by determining a partial amino acid sequence of the N-terminus of the larger subunit (F1) purified by radioimmunoprecipitation using monoclonal antibodies. Alignment of the N-terminus of the F1 subunit within the deduced amino acid sequence of Fo permitted us to identify a sequence of lys-lys-arg-lys-arg-arg at the C-terminus of the smaller N-terminal F2 subunit that appears to represent the cleavage/activation domain. Five potential sites of glycosylation, four within the F2 subunit, were also identified. Three extremely hydrophobic domains are present in the protein; a) the N-terminal signal sequence, b) the N-terminus of the F1 subunit that is analogous to the N-terminus of the paramyxovirus F1 subunit and the HA2 subunit of influenza virus hemagglutinin, and c) the putative membrane anchorage domain near the C-terminus of F1. Images PMID:2987829

  4. The Nucleotide Capture Region of Alpha Hemolysin: Insights into Nanopore Design for DNA Sequencing from Molecular Dynamics Simulations

    PubMed Central

    Manara, Richard M. A.; Tomasio, Susana; Khalid, Syma

    2015-01-01

    Nanopore technology for DNA sequencing is constantly being refined and improved. In strand sequencing a single strand of DNA is fed through a nanopore and subsequent fluctuations in the current are measured. A major hurdle is that the DNA is translocated through the pore at a rate that is too fast for the current measurement systems. An alternative approach is “exonuclease sequencing”, in which an exonuclease is attached to the nanopore that is able to process the strand, cleaving off one base at a time. The bases then flow through the nanopore and the current is measured. This method has the advantage of potentially solving the translocation rate problem, as the speed is controlled by the exonuclease. Here we consider the practical details of exonuclease attachment to the protein alpha hemolysin. We employ molecular dynamics simulations to determine the ideal (a) distance from alpha-hemolysin, and (b) the orientation of the monophosphate nucleotides upon release from the exonuclease such that they will enter the protein. Our results indicate an almost linear decrease in the probability of entry into the protein with increasing distance of nucleotide release. The nucleotide orientation is less significant for entry into the protein.

  5. Complete nucleotide sequence of Rose yellow leaf virus, a new member of the family Tombusviridae

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The genome of the Rose yellow leaf virus (RYLV) has been determined to be 3918 nucleotides containing seven open reading frames (ORFs). ORF1 encodes a 27 kDa peptide (p27). ORF2 shares a common start codon with ORF1 and continues through the amber stop codon of p27 to encode a 87 kDa (p87) protein t...

  6. Cloning, sequence, and properties of the soluble pyridine nucleotide transhydrogenase of Pseudomonas fluorescens.

    PubMed Central

    French, C E; Boonstra, B; Bufton, K A; Bruce, N C

    1997-01-01

    The gene encoding the soluble pyridine nucleotide transhydrogenase (STH) of Pseudomonas fluorescens was cloned and expressed in Escherichia coli. STH is related to the flavoprotein disulfide oxidoreductases but lacks one of the conserved redox-active cysteine residues. The gene is highly similar to an E. coli gene of unknown function. PMID:9098078

  7. The group 10 allergen of Dermatophagoides farinae (Acari: Pyroglyphidae): cDNA cloning, sequence analysis, and expression in Escherichia coli BL21.

    PubMed

    Cui, Yubao; Zhou, Ying; Wang, Yungang; Ma, Guifang; Yang, Li

    2013-01-01

    Dermatophagoides farinae Hughes, the American house dust mite, is highly allergenic, producing symptoms in people worldwide. Identifying and cloning the allergens in this species may enable better diagnostic and therapeutic approaches. Here, we cloned, sequenced, and expressed the full-length cDNA encoding D. farinae group 10 allergen (Der f 10) isolated from dust mites in China. Bioinformatic analysis indicated that the 888 bp sequence encoded a cytoskeleton protein 295 amino acids long, with a molecular weight of approximately 34 kDa. Sequence alignment with the group 10 allergens of Pyroglyphidae, Acaridae, and Glycyphagidae families revealed that the group 10 allergen from D. farinae is 95% similar to those of D. pteronyssinus Trouessart and Psoroptes ovis (Hering). These findings lay the groundwork for future studies, including large-scale production of recombinant Der f 10 allergen for diagnostic and therapeutic agents.

  8. Identification of essential nucleotides in an upstream repressing sequence of Saccharomyces cerevisiae by selection for increased expression of TRK2.

    PubMed Central

    Vidal, M; Buckley, A M; Yohn, C; Hoeppner, D J; Gaber, R F

    1995-01-01

    The TRK2 gene in Saccharomyces cerevisiae encodes a membrane protein involved in potassium transport and is expressed at extremely low levels. Dominant cis-acting mutations (TRK2D), selected by their ability to confer TRK2-dependent growth on low-potassium medium, identified an upstream repressor element (URS1-TRK2) in the TRK2 promoter. The URS1-TRK2 sequence (5'-AGCCGCACG-3') shares six nucleotides with the ubiquitous URS1 element (5'-AGCCGCCGA-3'), and the protein species binding URS1-CAR1 (URSF) is capable of binding URS1-TRK2 in vitro. Sequence analysis of 17 independent repression-defective TRK2D mutations identified three adjacent nucleotides essential for URS1-mediated repression in vivo. Our results suggest a role for context effects with regard to URS1-related sequences: several mutant alleles of the URS1 element previously reported to have little or no effect when analyzed within the context of a heterologous promoter (CYC1) [Luche, R.M., Sumrada, R. & Cooper, T.G. (1990) Mol. Cell. Biol. 10, 3884-3895] have major effects on repression in the context of their native promoters (TRK2 and CAR1). TRK2D mutations that abolish repression also reveal upstream activating sequence activity either within or adjacent to URS1. Additivity between TRK2D and sin3 delta mutations suggests that SIN3-mediated repression is independent of that mediated by URS1. Images Fig. 1 Fig. 4 PMID:7892273
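
    The two 9-nucleotide elements quoted in this abstract can be compared position by position. The sketch below assumes an ungapped, same-register comparison, which the abstract does not specify. The variable names are illustrative, and the mismatch positions it reports are not claimed to be the three essential nucleotides identified by the authors.

```python
urs1_trk2 = "AGCCGCACG"  # URS1-TRK2, as quoted in the abstract
urs1 = "AGCCGCCGA"       # ubiquitous URS1 element, as quoted in the abstract

# Position-wise identity, with no gaps or shifts allowed.
matches = [a == b for a, b in zip(urs1_trk2, urs1)]
shared = sum(matches)
mismatch_positions = [i + 1 for i, same in enumerate(matches) if not same]

print(shared, mismatch_positions)
```

    Under this comparison the count of identical positions agrees with the six shared nucleotides stated in the abstract.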

  9. Nucleotide sequence of the 5' end of araBAD operon messenger RNA in Escherichia coli B/r.

    PubMed

    Lee, N; Carbon, J

    1977-01-01

    The transcription reaction in vitro provides a means of analyzing the nucleotide sequence of the mRNA of the araBAD operon. By controlling the time of synthesis, we obtained araBAD mRNA of varying lengths beginning from the 5' end. These 5' fragments were freed of lambda RNA transcripts by successive hybridizations to the sense strands of a pair of lambda ara transducing phages that carry ara genes in opposite orientations. The purified 5' fragments were ordered by their times of appearance during synchronized RNA elongation and by nearest neighbor analyses. The results, when combined with the knowledge of the NH2-terminal sequence of the product of the first cistron (L-ribulokinase gene araB), establish the nucleotide sequence of the first 69 bases at the 5' end of the araBAD operon mRNA. The AUG starter codon for L-ribulokinase is located at positions 29-31. The sequence is: 5' A-C-C-C-G-U-U-U-U-U-U-U-U-G-G-A-U-G-G-A-G-U-G-A-A-A-C-G-A-U-G-G-C-G-A-U-U-G-C-A-A-U-U-G-G-C-C-U-C-G-A-U-U-U-U-G-C-A-G-U-G-A-U-U-C-U-G-(U)-. . .3'.
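
    The position of the start codon can be checked against the printed sequence with a simple string search. This is a minimal sketch, assuming the hyphenated sequence above is transcribed exactly as printed; the trailing uncertain "(U)" is left out, and the helper name codon_positions is hypothetical.

```python
# Sequence as printed in the abstract, without the trailing uncertain "(U)".
RAW = (
    "A-C-C-C-G-U-U-U-U-U-U-U-U-G-G-A-U-G-G-A-G-U-G-A-A-A-C-G-A-U-G-G-C-G-"
    "A-U-U-G-C-A-A-U-U-G-G-C-C-U-C-G-A-U-U-U-U-G-C-A-G-U-G-A-U-U-C-U-G"
)
mrna = RAW.replace("-", "")


def codon_positions(seq: str, codon: str = "AUG") -> list[int]:
    """Return the 1-based start position of every occurrence of `codon` in `seq`."""
    k = len(codon)
    return [i + 1 for i in range(len(seq) - k + 1) if seq[i:i + k] == codon]


aug_starts = codon_positions(mrna)
print("AUG triplets start at positions:", aug_starts)
```

    A plain search reports every AUG triplet and cannot say which one initiates translation. The abstract assigns the initiator to positions 29-31 by combining the RNA sequence with the known NH2-terminal sequence of L-ribulokinase.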

  10. Identification of an androgen-repressed mRNA in rat ventral prostate as coding for sulphated glycoprotein 2 by cDNA cloning and sequence analysis.

    PubMed Central

    Bettuzzi, S; Hiipakka, R A; Gilna, P; Liao, S T

    1989-01-01

    The concentrations of a small number of mRNAs in the rat ventral prostate increase after castration and then decrease upon androgen treatment. Since the repression of specific gene expression may be important in the regulation of organ growth, we have cloned a cDNA for an androgen-repressed mRNA, the concentration of which increased 17-fold 4 days after castration, and this increase was reversed rapidly by androgen treatment. By sequence analysis the androgen-repressed mRNA was identified as that coding for sulphated glycoprotein 2. PMID:2920020

  11. Complete nucleotide sequence of the Actinomyces viscosus T14V sialidase gene: presence of a conserved repeating sequence among strains of Actinomyces spp.

    PubMed Central

    Yeung, M K

    1993-01-01

    The nucleotide sequence of the Actinomyces viscosus T14V sialidase gene (nanH) and flanking regions was determined. An open reading frame of 2,703 nucleotides that encodes a predominantly hydrophobic protein of 901 amino acids (M(r), 92,871) was identified. The amino acid sequence at the amino terminus of the predicted protein exhibited properties characteristic of a typical leader peptide. Five 12-amino-acid units that shared between 33 and 67% sequence identity were noted within the central domain of the protein. Each unit contained the sequence Ser-X-Asp-X-Gly-X-Thr-Trp, which is conserved among other bacterial and Trypanosoma sp. sialidases. Thus, the A. viscosus T14V nanH gene and the other prokaryotic and eukaryotic sialidase genes evolved from a common ancestor. Southern hybridization analyses under conditions of high stringency revealed the existence of DNA sequences homologous to A. viscosus T14V nanH in the genomes of 18 strains of five Actinomyces species that expressed various levels of sialidase activity. The data demonstrate that the sialidase genes from divergent groups of Actinomyces spp. are highly conserved. PMID:8418033

  12. An Interpretation of the Ancestral Codon from Miller’s Amino Acids and Nucleotide Correlations in Modern Coding Sequences

    PubMed Central

    Carels, Nicolas; de Leon, Miguel Ponce

    2015-01-01

    Purine bias, which is usually referred to as an “ancestral codon”, is known to result in short-range correlations between nucleotides in coding sequences, and it is common in all species. We demonstrate that RWY is a more appropriate pattern than the classical RNY, and purine bias (Rrr) is the product of a network of nucleotide compensations induced by functional constraints on the physicochemical properties of proteins. Through deductions from universal correlation properties, we also demonstrate that amino acids from Miller’s spark discharge experiment are compatible with functional primeval proteins at the dawn of living cell radiation on earth. These amino acids match the hydropathy and secondary structures of modern proteins. PMID:25922573

  13. Nucleotide sequence of the Klebsiella pneumoniae nifD gene and predicted amino acid sequence of the alpha-subunit of nitrogenase MoFe protein.

    PubMed Central

    Ioannidis, I; Buck, M

    1987-01-01

    The nucleotide sequence of the Klebsiella pneumoniae nifD gene is presented and together with the accompanying paper [Holland, Zilberstein, Zamir & Sussman (1987) Biochem. J. 247, 277-285] completes the sequence of the nifHDK genes encoding the nitrogenase polypeptides. The K. pneumoniae nifD gene encodes the 483-amino acid-residue nitrogenase alpha-subunit polypeptide of Mr 54156. The alpha-subunit has five strongly conserved cysteine residues at positions 63, 89, 155, 184 and 275, some occurring in a region showing both primary sequence and potential structural homology to the K. pneumoniae nitrogenase beta-subunit. A comparison with six other alpha-subunit amino acid sequences has been made, which indicates a number of potentially important domains within alpha-subunits. PMID:3322262

  14. A combined de novo protein sequencing and cDNA library approach to the venomic analysis of Chinese spider Araneus ventricosus.

    PubMed

    Duan, Zhigui; Cao, Rui; Jiang, Liping; Liang, Songping

    2013-01-14

    In past years, spider venoms have attracted increasing attention due to their extraordinary chemical and pharmacological diversity. The recently popularized proteomic method has greatly improved our ability to analyze the proteins in the venom. However, the lack of sequence information for isolated venom proteins dramatically limits the ability to confidently identify venom proteins. In the present paper, the venom from Araneus ventricosus was analyzed using two complementary approaches: 2-DE/Shotgun-LC-MS/MS coupled to MASCOT search and 2-DE/Shotgun-LC-MS/MS coupled to manual de novo sequencing followed by local venom protein database (LVPD) search. The LVPD was constructed with toxin-like protein sequences obtained from the analysis of a cDNA library from A. ventricosus venom glands. Our results indicate that a total of 130 toxin-like protein sequences were unambiguously identified by manual de novo sequencing coupled to LVPD search, accounting for 86.67% of all toxin-like proteins in the LVPD. Thus manual de novo sequencing coupled to LVPD search proved to be an extremely effective approach for the analysis of venom proteins. In addition, the approach shows a clear advantage in validating mutant positions of isoforms from the same toxin-like family. Intriguingly, methyl esterification of glutamic acid was discovered for the first time in animal venom proteins by manual de novo sequencing.

  15. Partition enrichment of nucleotide sequences (PINS)--a generally applicable, sequence based method for enrichment of complex DNA samples.

    PubMed

    Kvist, Thomas; Sondt-Marcussen, Line; Mikkelsen, Marie Just

    2014-01-01

    The dwindling cost of DNA sequencing is driving transformative changes in various biological disciplines including medicine, thus resulting in an increased need for routine sequencing. Preparation of samples suitable for sequencing is the starting point of any practical application, but enrichment of the target sequence over background DNA is often laborious and of limited sensitivity, thereby limiting the usefulness of sequencing. The present paper describes a new method, Probability directed Isolation of Nucleic acid Sequences (PINS), for enrichment of DNA, enabling the sequencing of a large DNA region surrounding a small known sequence. A 275,000 fold enrichment of a target DNA sample containing integrated human papilloma virus is demonstrated. Specifically, a sample containing 0.0028 copies of target sequence per ng of total DNA was enriched to 786 copies per ng. The starting concentration of 0.0028 target copies per ng corresponds to one copy of target in a background of 100,000 complete human genomes. The enriched sample was subsequently amplified using rapid genome walking and the resulting DNA sequence revealed not only the sequence of the truncated virus, but also 1026 base pairs 5' and 50 base pairs 3' to the integration site in chromosome 8. The demonstrated enrichment method is extremely sensitive and selective, requires only minimal knowledge of the sequence to be enriched, and will therefore enable sequencing where the target concentration relative to background is too low to allow the use of other sample preparation methods or where significant parts of the target sequence are unknown.
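
    The enrichment figures can be reproduced with a few lines of arithmetic. The sketch below uses only the rounded concentrations quoted in the abstract, so it gives a value near, not exactly at, the reported 275,000-fold. The mass of one haploid human genome (about 3.3 pg) is an assumption that is not stated in the abstract.

```python
# Concentrations quoted in the abstract (target copies per ng of total DNA).
copies_per_ng_before = 0.0028
copies_per_ng_after = 786.0

# Fold enrichment is the ratio of the two concentrations.
fold_enrichment = copies_per_ng_after / copies_per_ng_before

# ASSUMPTION (not from the abstract): one haploid human genome weighs ~3.3 pg.
HAPLOID_GENOME_NG = 3.3e-3

# Mass of DNA that contains one target copy, and the genomes it corresponds to.
ng_per_target_copy = 1.0 / copies_per_ng_before
genomes_per_target_copy = ng_per_target_copy / HAPLOID_GENOME_NG

print(f"fold enrichment: {fold_enrichment:,.0f}")
print(f"genome equivalents per target copy: {genomes_per_target_copy:,.0f}")
```

    Both results fall in the same range as the abstract's figures (a few hundred thousand fold, and roughly one target copy per 100,000 genomes); the small differences are presumably due to rounding of the published concentrations.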

  16. Differentiation of Erysipelothrix rhusiopathiae strains by nucleotide sequence analysis of a hypervariable region in the spaA gene: discrimination of a live vaccine strain from field isolates.

    PubMed

    Nagai, Shinya; To, Ho; Kanda, Akira

    2008-05-01

    Erysipelothrix rhusiopathiae causes erysipelas in swine and is considered a reemerging disease contributing substantially to economic losses in the swine industry. Since an attenuated live vaccine was commercialized in 1974 in Japan, outbreaks of acute septicemia or subacute urticaria of erysipelas have decreased dramatically. In contrast, a chronic form of erysipelas found during meat inspections in slaughterhouses has been increasing. In this study, a new strain-typing method was developed based on nucleotide sequencing of a hypervariable region in the surface protective antigen (spaA) gene for discrimination of the live vaccine strain from field isolates. Sixteen strains isolated from arthritic lesions found in slaughtered pigs were segregated into 4 major patterns: 1) identical nucleotide sequence with the vaccine strain: 3 isolates; 2) 1 nucleotide substitution (C to A) at position 555: 5 isolates; 3) 1 nucleotide substitution at various positions: 5 isolates; and 4) 2 nucleotide substitutions: 3 isolates. Isolates with the same nucleotide sequence as the vaccine strain were further characterized by other properties, including the mouse pathogenicity test. One strain isolated from pigs on a farm where the live vaccine had been used was found to be closely related to the vaccine strain. The phylogenetic tree constructed based on the spaA sequence suggests that the evolutionary distance of the isolates is related to the pathogenicity in mice. The new strain-typing system based on nucleotide sequencing of the spaA region is useful to discriminate the vaccine strain from field isolates.

  17. Nucleotide sequence polymorphism at the apical membrane antigen-1 locus reveals population history of Plasmodium vivax in Thailand

    PubMed Central

    Putaporntip, Chaturong; Jongwutiwes, Somchai; Grynberg, Priscila; Cui, Liwang; Hughes, Austin L.

    2009-01-01

    Apical membrane antigen-1 is a candidate for inclusion in a vaccine for the human malaria parasite Plasmodium vivax. We collected 231 complete sequences of the gene encoding this antigen (pvama-1) from three regions of Thailand, the most extensive collection to date of sequences at this locus. The domain II loop (previously mentioned as a potential vaccine component) was almost completely conserved, with a single amino acid variant (I313R) observed in a single sequence. The 3′ portion of the gene (domain II through the stop codon) showed significantly lower nucleotide diversity than the 5′ portion (start codon through domain I); and a given domain I sequence might be found in a haplotype with more than one domain II sequence. These results imply a hotspot of recombination between domains I and II. We found significant geographic subdivision among the three regions of Thailand (NW, East, and South) in which collections were made in 2007. Numbers of P. vivax infections have experienced overall declines since 1990 in all three regions; but the decline has been most recent in the NW, and there has been a rebound in numbers of infections in the South since 2000. Consistent with population history, amino acid sequence diversity was greatest in the NW. The South, which had by far the lowest sequence diversity of the three regions, showed signs of a population that has expanded from a small number of founders after a bottleneck. PMID:19643205

  18. Determination of the minimal essential nucleotide sequence for diphtheria tox repressor binding by in vitro affinity selection.

    PubMed

    Tao, X; Murphy, J R

    1994-09-27

    The expression of diphtheria toxin in lysogenic toxigenic strains of Corynebacterium diphtheriae is controlled by the heavy metal ion-activated regulatory protein DtxR. In the presence of divalent heavy metal ions, DtxR specifically binds to the diphtheria tox operator and protects a 27-bp interrupted palindromic sequence from DNase I digestion. To determine the consensus DNA sequence for DtxR binding, we have used gel electrophoresis mobility-shift assay and polymerase chain reaction (PCR) amplification for in vitro affinity selection of DNA binding sequences from a universe of 6.9 × 10^10 variants. After 10 rounds of in vitro affinity selection, each round coupled with 30 cycles of PCR amplification, we isolated and characterized a family of DNA sequences that function as DtxR-responsive genetic elements both in vitro and in vivo. Moreover, these DNA sequences were found to bind activated DtxR with an affinity similar to that of the wild-type tox operator. The DNA sequence analysis of 21 unique in vitro affinity-selected binding sites has revealed the minimal essential nucleotide sequence for DtxR binding to be a 9-bp palindrome separated by a single base pair.
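
    The size of the starting pool can be related to the number of randomized positions, since each fully degenerate position contributes a factor of four. The abstract does not say how many positions were randomized; the sketch below simply tabulates 4^n and shows that n = 18 gives a pool of the quoted size, which is an inference from the arithmetic rather than a statement from the source.

```python
def library_size(randomized_positions: int) -> int:
    """Number of distinct sequences when each position may be A, C, G or T."""
    return 4 ** randomized_positions


# Tabulate pool sizes around the value quoted in the abstract.
for n in range(16, 21):
    print(f"{n} randomized positions -> {library_size(n):.2e} variants")
```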

  19. cDNA cloning and immunological characterization of the rye grass allergen Lol p I.

    PubMed

    Perez, M; Ishioka, G Y; Walker, L E; Chesnut, R W

    1990-09-25

    The complete amino acid sequence of two "isoallergenic" forms of Lol p I, the major rye grass (Lolium perenne) pollen allergen, was deduced from cDNA sequence analysis. cDNA clones isolated from a Lolium perenne pollen library contained an open reading frame coding for a 240-amino acid protein. Comparison of the nucleotide and deduced amino acid sequence of two of these clones revealed four changes at the amino acid level and numerous nucleotide differences. Both clones contained one possible asparagine-linked glycosylation site. Northern blot analysis shows one RNA species of 1.2 kilobases. Based on the complete amino acid sequence of Lol p I, overlapping peptides covering the entire molecule were synthesized. Utilizing these peptides we have identified a determinant within the Lol p I molecule that is recognized by human leukocyte antigen class II-restricted T cells obtained from persons allergic to rye grass pollen.

  20. Nucleotide sequence of Zygosaccharomyces bailii virus Z: Evidence for +1 programmed ribosomal frameshifting and for assignment to family Amalgaviridae.

    PubMed

    Depierreux, Delphine; Vong, Minh; Nibert, Max L

    2016-06-02

    Zygosaccharomyces bailii virus Z (ZbV-Z) is a monosegmented dsRNA virus that infects the yeast Zygosaccharomyces bailii and remains unclassified to date despite its discovery >20 years ago. The previously reported nucleotide sequence of ZbV-Z (GenBank AF224490) encompasses two nonoverlapping long ORFs: upstream ORF1 encoding the putative coat protein and downstream ORF2 encoding the RNA-dependent RNA polymerase (RdRp). The lack of overlap between these ORFs raises the question of how the downstream ORF is translated. After examining the previous sequence of ZbV-Z, we predicted that it contains at least one sequencing error to explain the nonoverlapping ORFs, and hence we redetermined the nucleotide sequence of ZbV-Z, derived from the same isolate of Z. bailii as previously studied, to address this prediction. The key finding from our new sequence, which includes several insertions, deletions, and substitutions relative to the previous one, is that ORF2 in fact overlaps ORF1 in the +1 frame. Moreover, a proposed sequence motif for +1 programmed ribosomal frameshifting, previously noted in influenza A viruses, plant amalgaviruses, and others, is also present in the newly identified ORF1-ORF2 overlap region of ZbV-Z. Phylogenetic analyses provided evidence that ZbV-Z represents a distinct taxon most closely related to plant amalgaviruses (genus Amalgavirus, family Amalgaviridae). We conclude that ZbV-Z is the prototype of a new species, which we propose to assign as type species of a new genus of monosegmented dsRNA mycoviruses in family Amalgaviridae. Comparisons involving other unclassified mycoviruses with RdRps apparently related to those of plant amalgaviruses, and having either mono- or bisegmented dsRNA genomes, are also discussed.

  1. Construction of a Full-Length Enriched cDNA Library and Preliminary Analysis of Expressed Sequence Tags from Bengal Tiger Panthera tigris tigris

    PubMed Central

    Liu, Changqing; Liu, Dan; Guo, Yu; Lu, Taofeng; Li, Xiangchen; Zhang, Minghai; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2013-01-01

    In this study, a full-length enriched cDNA library was successfully constructed from the Bengal tiger, Panthera tigris tigris, the most well-known wild animal. Total RNA was extracted from Bengal tiger fibroblasts cultured in vitro. The titers of the primary and amplified libraries were 1.28 × 10^6 pfu/mL and 1.56 × 10^9 pfu/mL, respectively. The percentage of recombinants in the unamplified library was 90.2% and the average length of exogenous inserts was 0.98 kb. A total of 212 individual ESTs with sizes ranging from 356 to 1108 bp were then analyzed. The BLASTX score revealed that 48.1% of the sequences were classified as a strong match, 45.3% as nominal and 6.6% as a weak match. Among the ESTs with known putative function, 26.4% were related to metabolism, 19.3% to information storage and processing, 11.3% to posttranslational modification, protein turnover, and chaperones, 11.3% to transport, 9.9% to signal transduction/cell communication, 9.0% to structural proteins, and 3.8% to the cell cycle; only 6.6% of ESTs were classified as novel genes. By EST sequencing, a full-length gene encoding ferritin was identified and characterized. The recombinant plasmid pET32a-TAT-Ferritin was constructed; it encodes the TAT-Ferritin fusion protein with two 6× His-tags, at the N- and C-termini. By BCA assay, the concentration of soluble Trx-TAT-Ferritin recombinant protein was 2.32 ± 0.12 mg/mL. These results demonstrate that the reliability and representativeness of the cDNA library meet the requirements of a standard cDNA library. This library provides a useful platform for functional genome and transcriptome research on Bengal tigers. PMID:23708105

  2. Nucleotide sequences and operon structure of plasmid-borne genes mediating uptake and utilization of raffinose in Escherichia coli.

    PubMed Central

    Aslanidis, C; Schmid, K; Schmitt, R

    1989-01-01

    The plasmid-borne raf operon encodes functions required for inducible uptake and utilization of raffinose by Escherichia coli. Raf functions include active transport (Raf permease), alpha-galactosidase, and sucrose hydrolase, which are negatively controlled by the Raf repressor. We have defined the order and extent of the three structural genes, rafA, rafB, and rafD; these are contained in a 5,284-base-pair nucleotide sequence. By comparisons of derived primary structures with known subunit molecular weights and an N-terminal peptide sequence, rafA was assigned to alpha-galactosidase (708 amino acids), rafB was assigned to Raf permease (425 amino acids), and rafD was assigned to sucrose hydrolase (476 amino acids). Transcription was shown to initiate 13 nucleotides upstream of rafA; a putative promoter, a ribosome-binding site, and a transcription termination signal were identified. Striking similarities between Raf permease and lacY-encoded lactose permease, revealed by high sequence conservation (76%), overlapping substrate specificities, and similar transport kinetics, suggest a common origin of these transport systems. alpha-Galactosidase and sucrose hydrolase are not related to host enzymes but have their counterparts in other species. We propose a modular origin of the raf operon and discuss selective forces that favored the given gene organization also found in the E. coli lac operon. PMID:2556373

  3. Sequencing and analysis of 10,967 full-length cDNA clones from Xenopus laevis and Xenopus tropicalis reveals post-tetraploidization transcriptome remodeling.

    PubMed

    Morin, Ryan D; Chang, Elbert; Petrescu, Anca; Liao, Nancy; Griffith, Malachi; Chow, William; Kirkpatrick, Robert; Butterfield, Yaron S; Young, Alice C; Stott, Jeffrey; Barber, Sarah; Babakaiff, Ryan; Dickson, Mark C; Matsuo, Corey; Wong, David; Yang, George S; Smailus, Duane E; Wetherby, Keith D; Kwong, Peggy N; Grimwood, Jane; Brinkley, Charles P; Brown-John, Mabel; Reddix-Dugue, Natalie D; Mayo, Michael; Schmutz, Jeremy; Beland, Jaclyn; Park, Morgan; Gibson, Susan; Olson, Teika; Bouffard, Gerard G; Tsai, Miranda; Featherstone, Ruth; Chand, Steve; Siddiqui, Asim S; Jang, Wonhee; Lee, Ed; Klein, Steven L; Blakesley, Robert W; Zeeberg, Barry R; Narasimhan, Sudarshan; Weinstein, John N; Pennacchio, Christa Prange; Myers, Richard M; Green, Eric D; Wagner, Lukas; Gerhard, Daniela S; Marra, Marco A; Jones, Steven J M; Holt, Robert A

    2006-06-01

    Sequencing of full-insert clones from full-length cDNA libraries from both Xenopus laevis and Xenopus tropicalis has been ongoing as part of the Xenopus Gene Collection Initiative. Here we present 10,967 full ORF verified cDNA clones (8049 from X. laevis and 2918 from X. tropicalis) as a community resource. Because the genome of X. laevis, but not X. tropicalis, has undergone allotetraploidization, comparison of coding sequences from these two clawed (pipid) frogs provides a unique angle for exploring the molecular evolution of duplicate genes. Within our clone set, we have identified 445 gene trios, each comprised of an allotetraploidization-derived X. laevis gene pair and their shared X. tropicalis ortholog. Pairwise dN/dS comparisons within trios show strong evidence for purifying selection acting on all three members. However, dN/dS ratios between X. laevis gene pairs are elevated relative to their X. tropicalis ortholog. This difference is highly significant and indicates an overall relaxation of selective pressures on duplicated gene pairs. We have found that the paralogs that have been lost since the tetraploidization event are enriched for several molecular functions, but have found no such enrichment in the extant paralogs. Approximately 14% of the paralogous pairs analyzed here also show differential expression indicative of subfunctionalization.

  4. Novel cDNA sequences of aryl hydrocarbon receptors and gene expression in turtles (Chrysemys picta and Pseudemys scripta) exposed to different environments

    PubMed Central

    Marquez, Emily C.; Traylor-Knowles, Nikki; Novillo-Villajos, Apolonia; Callard, Ian P.

    2011-01-01

    Reproductive changes have been observed in painted turtles from a site with known contamination located on Cape Cod, MA, USA. We hypothesize that these changes are caused by exposure to endocrine-disrupting compounds and that genes involved in reproduction are affected. The aryl hydrocarbon receptor (AHR) is an orphan receptor that is activated by environmental contaminants. AHR mRNA was measured in turtles exposed to soil collected from a contaminated site. Adult turtles were trapped from the study site (Moody Pond, MP) or a reference site and exposed to laboratory environments containing soil from either site. The red-eared slider was used to assess neonatal exposure to soil and water from the sites. The environmental exposures occurred over a 13-month period. Juveniles showed an age-dependent increase in brain AHR1. Juvenile turtles exposed to the MP environment had elevated gonadal AHR1. Adult turtles exposed to the MP environment showed significantly decreased brain AHR2. The painted turtle AHR is the first complete reptile AHR cDNA sequence. Phylogenetic analysis of the painted turtle AHR showed that it clusters with other AHR2s. Partial AHR1 and partial AHR2 cDNA sequences were cloned from the red-eared slider. MEME analysis identified 18 motifs in the turtle AHRs, showing high conservation between motifs that overlapped functional regions in both AHR isoforms. PMID:21763458

  5. cDNA sequence and expression analysis of an antimicrobial peptide, theromacin, in the triangle-shell pearl mussel Hyriopsis cumingii.

    PubMed

    Xu, Qiaoqing; Wang, Gailing; Yuan, Hanwen; Chai, Yi; Xiao, Zhili

    2010-09-01

    Bivalve molluscs rely on the interaction between cellular and humoral factors for protection against potential pathogens. Antimicrobial peptides (AMPs) have been proven to be one of the most important humoral components that afford resistance to pathogen infection. The AMP gene to be identified was that encoding theromacin in the triangle-shell pearl mussel Hyriopsis cumingii (Hc theromacin); this gene was identified from a suppression subtractive hybridization library, and subsequently cloned by 3' and 5' rapid amplification of cDNA ends polymerase chain reaction (RACE-PCR). The full-length theromacin cDNA contains 547 bp, with a 294-bp open reading frame that encodes a 97-amino acid peptide, and the deduced peptide sequence contains a 61-amino acid putative mature peptide. The sequence also contains 10 cysteine residues. Reverse transcriptase (RT)-PCR analysis showed that Hc theromacin transcripts were constitutively expressed in the liver, foot, gill, adductor muscle, heart, mantle, intestine, and hemocytes, with the highest level in hemocytes. Theromacin mRNA levels were found to increase after challenge with gram-positive and gram-negative bacteria. After injection of the gram-positive bacteria Staphylococcus aureus and Bifidobacterium bifidum, Hc theromacin expression showed the highest fold-change at 48 and 36 h after infection, respectively, and its levels decreased gradually thereafter.
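
    The relation between the 294-bp open reading frame and the 97-residue peptide is a simple codon count. The sketch below assumes that the quoted ORF length includes the stop codon, which the abstract does not state but which is consistent with the two numbers; the function name residues_from_orf is hypothetical.

```python
def residues_from_orf(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of encoded amino acids for an ORF of `orf_bp` base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("an open reading frame must be a whole number of codons")
    codons = orf_bp // 3
    # The stop codon occupies one codon but adds no residue.
    return codons - 1 if includes_stop else codons


theromacin_residues = residues_from_orf(294)
print("theromacin precursor length:", theromacin_residues, "amino acids")
```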

  6. e2g: an interactive web-based server for efficiently mapping large EST and cDNA sets to genomic sequences.

    PubMed

    Krüger, Jan; Sczyrba, Alexander; Kurtz, Stefan; Giegerich, Robert

    2004-07-01

    e2g is a web-based server which efficiently maps large expressed sequence tag (EST) and cDNA datasets to genomic DNA. It significantly extends the volume of data that can be mapped in reasonable time, and makes this improved efficiency available as a web service. Our server hosts large collections of EST sequences (e.g. 4.1 million mouse ESTs of 1.87 Gb) in precomputed indexed data structures for efficient sequence comparison. The user can upload a genomic DNA sequence of interest and rapidly compare this to the complete collection of ESTs on the server. This delivers a mapping of the ESTs on the genomic DNA. The e2g web interface provides a graphical overview of the mapping. Alignments of the mapped EST regions with parts of the genomic sequence are visualized. Zooming functions allow the user to interactively explore the results. Mapped sequences can be downloaded for further analysis. e2g is available on the Bielefeld University Bioinformatics Server at http://bibiserv.techfak.uni-bielefeld.de/e2g/.

  7. DNA sequencing by a single molecule detection of labeled nucleotides sequentially cleaved from a single strand of DNA

    SciTech Connect

    Goodwin, P.M.; Schecker, J.A.; Wilkerson, C.W.; Hammond, M.L.; Ambrose, W.P.; Jett, J.H.; Martin, J.C.; Marrone, B.L.; Keller, R.A.; Haces, A.; Shih, P.J.; Harding, J.D.

    1993-01-01

    We are developing a laser-based technique for the rapid sequencing of large DNA fragments (several kb in size) at a rate of 100 to 1000 bases per second. Our approach relies on fluorescent labeling of the bases in a single fragment of DNA, attachment of this labeled DNA fragment to a support, movement of the supported DNA into a flowing sample stream, sequential cleavage of the end nucleotide from the DNA fragment with an exonuclease, and detection of the individual fluorescently labeled bases by laser-induced fluorescence.

  8. DNA sequencing by a single molecule detection of labeled nucleotides sequentially cleaved from a single strand of DNA

    SciTech Connect

    Goodwin, P.M.; Schecker, J.A.; Wilkerson, C.W.; Hammond, M.L.; Ambrose, W.P.; Jett, J.H.; Martin, J.C.; Marrone, B.L.; Keller, R.A.; Haces, A.; Shih, P.J.; Harding, J.D.

    1993-02-01

    We are developing a laser-based technique for the rapid sequencing of large DNA fragments (several kb in size) at a rate of 100 to 1000 bases per second. Our approach relies on fluorescent labeling of the bases in a single fragment of DNA, attachment of this labeled DNA fragment to a support, movement of the supported DNA into a flowing sample stream, sequential cleavage of the end nucleotide from the DNA fragment with an exonuclease, and detection of the individual fluorescently labeled bases by laser-induced fluorescence.

  9. Nucleotide sequence analysis of pRS2 and pRS3, two small cryptic plasmids from Oenococcus oeni.

    PubMed

    Mesas, J M; Rodríguez, M C; Alegre, M T

    2001-09-01

    Nucleotide sequence analysis of two cryptic plasmids, pRS2 (2544 bp) and pRS3 (3948 bp), from Oenococcus oeni revealed the presence in both of three major open reading frames with significant similarity to other small cryptic plasmids from O. oeni. The results suggest that those plasmids could be separated into two subfamilies, one represented by pLo13 and pRS3, the other represented by pOg32, pRS1, and pRS2.

  10. Nucleotide sequences of the Pseudomonas savastanoi indoleacetic acid genes show homology with Agrobacterium tumefaciens T-DNA

    PubMed Central

    Yamada, Tetsuji; Palm, Curtis J.; Brooks, Bob; Kosuge, Tsune

    1985-01-01

    We report the nucleotide sequences of iaaM and iaaH, the genetic determinants for, respectively, tryptophan 2-monooxygenase and indoleacetamide hydrolase, the enzymes that catalyze the conversion of L-tryptophan to indoleacetic acid in the tumor-forming bacterium Pseudomonas syringae pv. savastanoi. The sequence analysis indicates that the iaaM locus contains an open reading frame encoding 557 amino acids that would comprise a protein with a molecular weight of 61,783; the iaaH locus contains an open reading frame of 455 amino acids that would comprise a protein with a molecular weight of 48,515. Significant amino acid sequence homology was found between the predicted sequence of the tryptophan monooxygenase of P. savastanoi and the deduced product of the T-DNA tms-1 gene of the octopine-type plasmid pTiA6NC from Agrobacterium tumefaciens. Strong homology was found in the 25 amino acid sequence in the putative FAD-binding region of tryptophan monooxygenase. Homology was also found in the amino acid sequences representing the central regions of the putative products of iaaH and tms-2 T-DNA. The results suggest a strong similarity in the pathways for indoleacetic acid synthesis encoded by genes in P. savastanoi and in A. tumefaciens T-DNA. PMID:16593610

  11. PerPlot & PerScan: tools for analysis of DNA curvature-related periodicity in genomic nucleotide sequences

    PubMed Central

    2011-01-01

    Background: Periodic spacing of short adenine or thymine runs phased with DNA helical period of ~10.5 bp is associated with intrinsic DNA curvature and deformability, which play important roles in DNA-protein interactions and in the organization of chromosomes in both eukaryotes and prokaryotes. Local differences in DNA sequence periodicity have been linked to differences in gene expression in some organisms. Despite the significance of these periodic patterns, there are virtually no publicly accessible tools for their analysis. Results: We present novel tools suitable for assessments of DNA curvature-related sequence periodicity in nucleotide sequences at the genome scale. Utility of the present software is demonstrated on a comparison of sequence periodicities in the genomes of Haemophilus influenzae, Methanocaldococcus jannaschii, Saccharomyces cerevisiae, and Arabidopsis thaliana. The software can be accessed through a web interface and the programs are also available for download. Conclusions: The present software is suitable for comparing DNA curvature-related sequence periodicity among different genomes as well as for analysis of intrachromosomal heterogeneity of the sequence periodicity. It provides a quick and convenient way to detect anomalous regions of chromosomes that could have unusual structural and functional properties and/or distinct evolutionary history. PMID:22587738

  12. Nucleotide sequence of polypyrimidines from cloned mouse DNA as determined by base-specific blockage of exonuclease action

    SciTech Connect

    Deugau, K.V.; Mitchel, R.E.J.; Birnboim, H.C.

    1983-01-01

    Cloned fragments of mouse DNA have been screened for the presence of long polypyrimidine/polypurine segments. The polypyrimidine portion of one such segment (about 2000 nucleotides in length) has been isolated by acidic depurination of the entire cloned fragment and plasmid vector followed by selective precipitation and 5'-³²P labeling. This polypyrimidine has been used to demonstrate a new procedure for sequencing. Covalent modification of thymine with a water-soluble carbodiimide, or cytosine with glutaric anhydride, at low levels blocked the action of snake venom exonuclease. After deblocking, separation of the products of digestion by polyacrylamide gel electrophoresis yields a sequence ladder which can be used to determine the position of C and T residues as in other sequencing methods. A sequence of 72 residues adjacent to the 5' end has been established, consisting principally of the repeating tetranucleotide (CCTT)n. A low ratio of endonuclease to exonuclease is essential for application of this method to sequences of this size. Accordingly, a very sensitive modification of a fluorometric endonuclease assay was developed and used to optimize pH and Mg²⁺ conditions to favor exonuclease activity over the accompanying endonuclease activity. The results clearly indicate that long polypyrimidine tracts can be efficiently prepared and their sequences determined with this method using commercially available exonuclease preparations without additional purification. 26 references, 5 figures.

  13. Nucleotide sequence of the FNR-regulated fumarase gene (fumB) of Escherichia coli K-12.

    PubMed Central

    Bell, P J; Andrews, S C; Sivak, M N; Guest, J R

    1989-01-01

    The nucleotide sequence of a 3,162-base-pair (bp) segment of DNA containing the FNR-regulated fumB gene, which encodes the anaerobic class I fumarase (FUMB) of Escherichia coli, was determined. The structural gene was found to comprise 1,641 bp, 547 codons (excluding the initiation and termination codons), and the gene product had a predicted Mr of 59,956. The amino acid sequence of FUMB contained the same number of residues as did that of the aerobic class I fumarase (FUMA), and there were identical amino acids at all but 56 positions (89.8% identity). There was no significant similarity between the class I fumarases and the class II enzyme (FUMC) except in one region containing the following consensus: Gly-Ser-Xxx-Ile-Met-Xxx-Xxx-Lys-Xxx-Asn. Some of the 56 amino acid substitutions must be responsible for the functional preferences of the enzymes for malate dehydration (FUMB) and fumarate hydration (FUMA). Significant similarities between the cysteine-containing sequence of the class I fumarases (FUMA and FUMB) and the mammalian aconitases were detected, and this finding further supports the view that these enzymes are all members of a family of iron-containing hydrolyases. The nucleotide sequence of a 1,142-bp distal sequence of an unidentified gene (genF) located upstream of fumB was also defined and found to encode a product that is homologous to the product of another unidentified gene (genA), located downstream of the neighboring aspartase gene (aspA). PMID:2656658
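
    The identity figure quoted above can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and because the abstract does not state the exact alignment length, both 547 and 548 residues are tried as assumptions. Either length gives 89.8% when rounded to one decimal place with 56 differing positions.

```python
def identity_from_mismatches(length, mismatches):
    """Percent identity for an ungapped alignment of the given length."""
    if length <= 0 or not 0 <= mismatches <= length:
        raise ValueError("length must be positive and mismatches within range")
    return 100.0 * (length - mismatches) / length


def percent_identity(a, b):
    """Percent identity of two equal-length, already aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Assumed alignment lengths; 56 substitutions is the figure from the abstract.
for assumed_length in (547, 548):
    print(assumed_length, round(identity_from_mismatches(assumed_length, 56), 1))
```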

  14. Molecular cloning, nucleotide sequence, and expression in Escherichia coli of a hemolytic toxin (aerolysin) gene from Aeromonas trota

    SciTech Connect

    Khan, A.A.; Kim, E.; Cerniglia, C.E.

    1998-07-01

    Aeromonas trota AK2, which was derived from ATCC 49659 and produces the extracellular pore-forming hemolytic toxin aerolysin, was mutagenized with the transposon mini-Tn5Km1 to generate a hemolysin-deficient mutant, designated strain AK253. Southern blotting data indicated that an 8.7-kb NotI fragment of the genomic DNA of strain AK253 contained the kanamycin resistance gene of mini-Tn5Km1. The 8.7-kb NotI DNA fragment was cloned into the vector pGEM5Zf(-) by selecting for kanamycin resistance, and the resultant clone, pAK71, showed aerolysin activity in Escherichia coli JM109. The nucleotide sequence of the aerA gene, located on the 1.8-kb ApaI-EcoRI fragment, was determined to consist of 1,479 bp and to have an ATG initiation codon and a TAA termination codon. An in vitro coupled transcription-translation analysis of the 1.8-kb region suggested that the aerA gene codes for a 54-kDa protein, in agreement with nucleotide sequence data. The deduced amino acid sequence of the aerA gene product of A. trota exhibited 99% homology with the amino acid sequence of the aerA product of Aeromonas sobria AB3 and 57% homology with the amino acid sequences of the products of the aerA genes of Aeromonas salmonicida 17-2 and A. sobria 33.
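
    The agreement between the 1,479-bp gene and the 54-kDa product can be sketched as a rough estimate. Two assumptions are made here that the abstract does not state: that the 1,479 bp include the TAA termination codon, and that an average residue mass of about 110 Da (a common rule of thumb, not a measured value) is adequate. The function name is hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average; an assumption, not from the abstract


def orf_to_protein_estimate(orf_bp, includes_stop=True):
    """Return (residue count, approximate mass in kDa) for an ORF length in bp."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    # The termination codon does not encode an amino acid.
    residues = codons - 1 if includes_stop else codons
    return residues, residues * AVG_RESIDUE_MASS_DA / 1000.0


residues, mass_kda = orf_to_protein_estimate(1479)
print(residues, mass_kda)
```

    Under these assumptions the estimate is 492 residues and roughly 54 kDa; an exact mass would require the actual amino acid composition.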

  15. Genetic manipulation of a transcription-regulating sequence of porcine reproductive and respiratory syndrome virus reveals key nucleotides determining its activity.

    PubMed

    Zheng, Haihong; Zhang, Keyu; Zhu, Xing-Quan; Liu, Changlong; Lu, Jiaqi; Gao, Fei; Zhou, Yan; Zheng, Hao; Lin, Tao; Li, Liwei; Tong, Guangzhi; Wei, Zuzhang; Yuan, Shishan

    2014-08-01

    The factors that determine the transcription-regulating sequence (TRS) activity of porcine reproductive and respiratory syndrome virus (PRRSV) remain largely unclear. In this study, the effect of mutagenesis of conserved C nucleotides at positions 5 and 6 in the leader TRS (TRS-L) and/or canonical body TRS7 (TRS-B7) on the synthesis of subgenomic (sg) mRNA and virus infectivity was investigated in the context of a type 2 PRRSV infectious cDNA clone. The results showed that a double C mutation in the leader TRS completely abolished sg mRNA synthesis and virus infectivity, but a single C mutation did not. A single C or double C mutation in TRS-B7.1 and/or TRS-B7.2 impaired or abolished the corresponding sg mRNA synthesis. Introduction of identical mutations in the leader and body TRSs partially restored sg mRNA7.1 and/or sg mRNA7.2 transcription, indicating that the base-pairing interaction between sense TRS-L and cTRS-B is a crucial factor influencing sg mRNA synthesis. Analysis of the mRNA leader-body junctions of mutants provided evidence for a mechanism of discontinuous minus-strand transcription. This study also showed that mutational inactivation of TRS-B7.1 or TRS-B7.2 did not affect the production of infectious progeny virus, and the sg mRNA formed from each of them could express N protein. However, TRS-B7.1 plays a more important role than TRS-B7.2 in maintaining the growth characteristics of type 2 PRRSV. These results provide more insight into the molecular mechanism of genome expression and subgenomic mRNA transcription of PRRSV.

  16. Genomic organization and complete cDNA sequence of the human phosphoinositide-specific phospholipase C β3 gene (PLCB3)

    SciTech Connect

    Lagercrantz, J.; Carson, E.; Phelan, C.

    1995-04-10

    We have characterized the complete cDNA sequence, genomic structure, and expression of the human phosphoinositide-specific phospholipase C β3 (PLC β3) gene (gene symbol PLCB3). PLC β3 plays an important role in initiating receptor-mediated signal transduction. Activation of PLC takes place in many cells as a response to stimulation by hormones, growth factors, neurotransmitters, and other ligands. The partial cDNA sequence of PLC β3, previously published, was extended with 876 bp in the 5' direction, giving a transcript of 4400 bp and a total open reading frame of 1234 amino acids. This was in accordance with expression analysis by Northern blotting that revealed a single 4.4-kb transcript in all tissues tested. Genomic data were obtained by sequencing plasmid subclones of a cosmid that contained the whole gene. The size of the complete transcription unit was estimated to be on the order of 15 kb. The gene contains 31 exons, with all splice donor and acceptor sites conforming to the GT/AG rule. No exon exceeds 571 bp in length, and the shortest exon spans only 36 bp. More than half of the introns are smaller than 200 bp, with the smallest being only 79 bp long. The transcription initiation site was determined to be within an 8-bp cluster 328-321 bp upstream of the translation initiation site. The 5' flanking region is highly GC rich, with multiple CpG doublets, and contains multiple binding sites for Sp1. Lacking typical transcriptional regulatory sequences such as TATA and CAAT boxes, the putative promoter region conforms to the group of housekeeping promoters. 28 refs., 4 figs., 1 tab.
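
    The GT/AG rule mentioned above states that an intron begins with the dinucleotide GT (splice donor) and ends with AG (splice acceptor). A minimal sketch of such a check follows; the function names are hypothetical and the intron sequences are invented for illustration, not taken from PLCB3.

```python
def conforms_to_gt_ag(intron):
    """True if the intron starts with GT and ends with AG (case-insensitive)."""
    s = intron.upper()
    return len(s) >= 4 and s.startswith("GT") and s.endswith("AG")


def check_introns(introns):
    """Map each intron name to whether it follows the GT/AG rule."""
    return {name: conforms_to_gt_ag(seq) for name, seq in introns.items()}


# Invented example sequences; real introns would come from the genomic data.
toy_introns = {
    "intron_1": "GTAAGTCCTTTCAG",
    "intron_2": "GCAAGTCCTTTCAG",  # starts with GC, so it fails the check
}
print(check_introns(toy_introns))
```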

  17. Transcription Profiling of the Model Cyanobacterium Synechococcus sp. Strain PCC 7002 by Next-Gen (SOLiD™) Sequencing of cDNA

    PubMed Central

    Ludwig, Marcus; Bryant, Donald A.

    2011-01-01

    The genome of the unicellular, euryhaline cyanobacterium Synechococcus sp. PCC 7002 encodes about 3200 proteins. Transcripts were detected for nearly all annotated open reading frames by a global transcriptomic analysis by Next-Generation (SOLiD™) sequencing of cDNA. In the cDNA samples sequenced, ∼90% of the mapped sequences were derived from the 16S and 23S ribosomal RNAs and ∼10% of the sequences were derived from mRNAs. In cells grown photoautotrophically under standard conditions [38°C, 1% (v/v) CO2 in air, 250 μmol photons m−2 s−1], the highest transcript levels (up to 2% of the total mRNA for the most abundantly transcribed genes; e.g., cpcAB, psbA, psaA) were generally derived from genes encoding structural components of the photosynthetic apparatus. High-light exposure for 1 h caused changes in transcript levels for genes encoding proteins of the photosynthetic apparatus, Type-1 NADH dehydrogenase complex and ATP synthase, whereas dark incubation for 1 h resulted in a global decrease in transcript levels for photosynthesis-related genes and an increase in transcript levels for genes involved in carbohydrate degradation. Transcript levels for pyruvate kinase and the pyruvate dehydrogenase complex decreased sharply in cells incubated in the dark. Under dark anoxic (fermentative) conditions, transcript changes indicated a global decrease in transcripts for respiratory proteins and suggested that cells employ an alternative phosphoenolpyruvate degradation pathway via phosphoenolpyruvate synthase (ppsA) and the pyruvate:ferredoxin oxidoreductase (nifJ). Finally, the data suggested that an apparent operon involved in tetrapyrrole biosynthesis and fatty acid desaturation, acsF2–ho2–hemN2–desF, may be regulated by oxygen concentration. PMID:21779275

  18. Complete nucleotide sequence of a gene encoding a functional human class I histocompatibility antigen (HLA-CW3).

    PubMed Central

    Sodoyer, R; Damotte, M; Delovitch, T L; Trucy, J; Jordan, B R; Strachan, T

    1984-01-01

    The HLA-CW3 gene contained in a cosmid clone identified by transfection expression experiments has been completely sequenced. This provides, for the first time, data on the structure of HLA-C locus products and constitutes, together with that of the gene coding for HLA-A3, the first complete nucleotide sequences of genes coding for serologically defined class I HLA molecules. In contrast to the organisation of the two class I HLA pseudogenes whose sequences have previously been determined, the sequence of the HLA-CW3 gene reveals an additional cytoplasmic encoding domain, making the organisation of this gene very similar to that of known H-2 class I genes and also the HLA-A3 gene. The deduced amino acid sequences of HLA-CW3 and HLA-A3 now allow a systematic comparison of such sequences of HLA class I molecules from the three classical transplantation antigen loci A, B, C. The compared sequences include the previously determined partial amino acid sequences of HLA-B7, HLA-B40, HLA-A2 and HLA-A28. The comparisons confirm the extreme polymorphism of HLA classical class I molecules, and permit a study of the level of diversity and the location of sequence differences. The distribution of differences is not uniform, most of them being located in the first and second extracellular domains; the third extracellular domain is extremely conserved, and the cytoplasmic domain is also a variable region. Although it is difficult to determine locus-specific regions, we have identified several candidate positions which may be C locus-specific. PMID:6609813

  19. Rapid DNA Sequencing by Direct Nanoscale Reading of Nucleotide Bases on Individual DNA Chains

    SciTech Connect

    Lee, James Weifu; Meller, Amit

    2007-01-01

    Since the independent invention of DNA sequencing by Sanger and by Gilbert 30 years ago, it has grown from a small scale technique capable of reading several kilobase-pair of sequence per day into today's multibillion dollar industry. This growth has spurred the development of new sequencing technologies that do not involve either electrophoresis or Sanger sequencing chemistries. Sequencing by Synthesis (SBS) involves multiple parallel micro-sequencing addition events occurring on a surface, where data from each round is detected by imaging. New High Throughput Technologies for DNA Sequencing and Genomics is the second volume in the Perspectives in Bioanalysis series, which looks at the electroanalytical chemistry of nucleic acids and proteins, development of electrochemical sensors and their application in biomedicine and in the new fields of genomics and proteomics. The authors have expertly formatted the information for a wide variety of readers, including new developments that will inspire students and young scientists to create new tools for science and medicine in the 21st century. Reviews of complementary developments in Sanger and SBS sequencing chemistries, capillary electrophoresis and microdevice integration, MS sequencing and applications set the framework for the book.

  20. [Classification of nucleotide sequences over their frequency dictionaries reveals a relation between the structure of sequences and taxonomy of their bearers].

    PubMed

    Gorban', A N; Popova, T G; Sadovskiĭ, M G

    2003-01-01

    Classification of 16S RNA sequences by their frequency dictionaries, both real and transformed, was studied. Two entities were considered structurally close to each other if their frequency dictionaries were close in the Euclidean metric. A transformation procedure for a frequency dictionary has been implemented that reveals the peculiarities of the information structure of a nucleotide sequence. A comparative study was carried out of two classifications, one developed over the real frequency dictionaries and the other over the transformed frequency dictionaries. A strong correlation is revealed between the classification and the taxonomy of the 16S RNA bearers. For the classes isolated, the information-valuable words were identified. These words are the main factors that distinguish the classes. The frequency dictionaries containing words of length 3 exhibit the best correlation between a class and a genus. A genus, as a rule, is included in a single class, and exceptions are sporadic. Development of a hierarchical classification over the transformed frequency dictionaries separated one or two taxonomic groups at each stage of classification. The unexpectedly frequent or, on the contrary, unexpectedly rare occurrence of words (of length 3) in the entities under consideration accounts for the structural difference between the classes of nucleotide sequences.
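
    A frequency dictionary of the kind described above can be sketched as the relative frequency of every overlapping word of length k in a sequence, with two sequences compared by the Euclidean distance between their dictionaries. The sketch covers only the real (untransformed) dictionaries; the transformation procedure is not specified in the abstract and is not reproduced. Function names and example sequences are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def frequency_dictionary(seq, k=3):
    """Relative frequencies of all overlapping words of length k in seq."""
    seq = seq.upper()
    total = len(seq) - k + 1
    if total <= 0:
        raise ValueError("sequence is shorter than the word length")
    counts = Counter(seq[i:i + k] for i in range(total))
    return {word: n / total for word, n in counts.items()}


def euclidean_distance(d1, d2):
    """Euclidean distance between two frequency dictionaries.

    Words missing from one dictionary are treated as having frequency 0.
    """
    words = set(d1) | set(d2)
    return sqrt(sum((d1.get(w, 0.0) - d2.get(w, 0.0)) ** 2 for w in words))


# Invented toy sequences; real input would be 16S RNA sequences.
a = frequency_dictionary("ACGACGACGTTT")
b = frequency_dictionary("ACGACGACGACG")
print(euclidean_distance(a, b))
```

    Pairwise distances computed this way could then be passed to any standard clustering method to obtain classes for comparison with taxonomy.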