Science.gov

Sample records for cdna nucleotide sequence

  1. Isolation and nucleotide sequence of a cDNA clone encoding rat mitochondrial malate dehydrogenase.

    PubMed Central

    Grant, P M; Tellam, J; May, V L; Strauss, A W

    1986-01-01

    We have determined the complete sequence of the rat mitochondrial malate dehydrogenase (mMDH) precursor derived from nucleotide sequence of the cDNA. A single synthetic oligodeoxynucleotide probe was used to screen a rat atrial cDNA library constructed in lambda gt10. A 1.2 kb full-length cDNA clone provided the first complete amino acid sequence of pre-mMDH. The 1014 nucleotide-long open reading frame encodes the 314 residue long mature mMDH protein and a 24 amino acid NH2-terminal extension which directs mitochondrial import and is cleaved from the precursor after import to generate mature mMDH. The amino acid composition of the transit peptide is polar and basic. The pre-mMDH transit peptide shows marked homology with those of two other enzymes targeted to the rat mitochondrial matrix. PMID:3755817

  2. Molecular cloning and nucleotide sequence of rat lingual lipase cDNA.

    PubMed Central

    Docherty, A J; Bodmer, M W; Angal, S; Verger, R; Riviere, C; Lowe, P A; Lyons, A; Emtage, J S; Harris, T J

    1985-01-01

    Purified rat lingual lipase (EC 3.1.1.3), a glycoprotein of approximate molecular weight 52,000, was used to generate polyclonal antibodies which were able to recognise the denatured and deglycosylated enzyme. These immunoglobulins were used to screen a cDNA library prepared from mRNA isolated from the serous glands of rat tongue cloned in E. coli expression vectors. An almost full length cDNA clone was isolated and the nucleotide and predicted amino acid sequence obtained. Comparison with the N-terminal amino acid sequence of the purified enzyme confirmed the identity of the cDNA and indicated that there was a hydrophobic signal sequence of 18 residues. The amino acid sequence of mature rat lingual lipase consists of 377 residues and shares little homology with porcine pancreatic lipase apart from a short region containing a serine residue at an analogous position to the ser 152 of the porcine enzyme. PMID:3839077

  3. Molecular cloning and nucleotide sequencing of human immunoglobulin epsilon chain cDNA.

    PubMed Central

    Seno, M; Kurokawa, T; Ono, Y; Onda, H; Sasada, R; Igarashi, K; Kikuchi, M; Sugino, Y; Nishida, Y; Honjo, T

    1983-01-01

    DNA complementary to mRNA of human immunoglobulin E heavy chain (epsilon chain) isolated and purified from U266 cells has been synthesized and inserted into the PstI site of pBR322 by G-C tailing. This recombinant plasmid was used to transform E. coli chi 1776 to screen 1445 tetracycline resistant colonies. Nine clones (pGET1-9) containing cDNA coding for the human epsilon chain were recognized by colony hybridization and Southern blotting analysis with a nick-translated human IgE genome fragment. The nucleotide sequence of the longest cDNA contained in pGET2 was determined. The results indicate that the sequence of 1657 nucleotides codes for 494 amino acids covering a part of the variable region and all of the constant region of the human epsilon chain. Most of the amino acid sequence deduced from the nucleotide sequence is in substantial agreement with that reported. Furthermore a termination codon after the -COOH terminal amino acid marks the beginning of a 3' untranslated region of 125 nucleotides with a poly A tail. Taking this into account, the structure of the human epsilon chain mRNA, except a part of the 5' end, is conserved fairly well in the cDNA insert in pGET2. PMID:6300763

  4. Infectivity and complete nucleotide sequence of cucumber fruit mottle mosaic virus isolate Cm cDNA.

    PubMed

    Rhee, Sun-Ju; Hong, Jin-Sung; Lee, Gung Pyo

    2014-07-01

    Three isolates of cucumber fruit mottle mosaic virus (CFMMV) were collected from melon, cucumber, and pumpkin plants in Korea. A full-length cDNA clone of CFMMV-Cm (melon isolate) was produced and evaluated for infectivity after T7 transcription in vitro (pT7CF-Cmflc). The complete CFMMV genome sequence of the infectious clone pT7CF-Cmflc was determined. The genome of CFMMV-Cm consisted of 6,571 nucleotides and shared high nucleotide sequence identity (98.8 %) with the Israel isolate of CFMMV. Based on the infectious clone pT7CF-Cmflc, a CaMV 35S-promoter driven cDNA clone (p35SCF-Cmflc) was subsequently constructed and sequenced. Mechanical inoculation with RNA transcripts of pT7CF-Cmflc and agro-inoculation with p35SCF-Cmflc resulted in systemic infection of cucumber and melon, producing symptoms similar to those produced by CFMMV-Cm. Progeny virus in infected plants was detected by RT-PCR, western blot assay, and transmission electron microscopy.

  5. Nucleotide sequence of a cloned cDNA for proopiomelanocortin precursor of chum salmon, Onchorynchus keta.

    PubMed Central

    Soma, G I; Kitahara, N; Nishizawa, T; Nanami, H; Kotake, C; Okazaki, H; Andoh, T

    1984-01-01

    We have isolated a cDNA clone encoding salmon proopiomelanocortin precursor. Polyadenylated RNA was isolated from pituitary neurointermediate lobes and used to construct a cDNA library. The library was screened with 17-mer oligodeoxyribonucleotides specific for the hexapeptide sequence in salmon beta-endorphin I, Phe-Met-Lys-Pro-Tyr-Thr at positions 4-9, excluding the third nucleotide. One positive clone, pSSM17, containing an insert of 1303 base pairs (bp), was characterized. Sequence determination revealed that it possessed sequences covering the entire regions encoding ACTH and beta-lipotropin and that the mRNA had the same overall organization as those of mammalian species, i.e., the following peptide hormones were arranged in order from 5' upstream: ACTH including alpha-melanotropin and corticotropin-like intermediate lobe peptide, and beta-lipotropin including gamma-lipotropin, beta-melanotropin and beta-endorphin. Amino acid sequences for putative salmon ACTH, beta-, and gamma-lipotropin were predicted. Comparison of the salmon mRNA sequence with those of mammals showed that the regions of alpha- and beta-MSH are relatively homologous, but other regions are much less so, especially the 3' nontranslated region, which is much longer and completely heterologous. PMID:6095185

  6. Human secreted carbonic anhydrase: cDNA cloning, nucleotide sequence, and hybridization histochemistry

    SciTech Connect

    Aldred, P.; Fu, Ping; Barrett, G.; Penschow, J.D.; Wright, R.D.; Coghlan, J.P.; Fernley, R.T.

    1991-01-01

    Complementary DNA clones coding for the human secreted carbonic anhydrase isozyme (CA VI) have been isolated and their nucleotide sequences determined. These clones identify a 1.45-kb mRNA that is present in high levels in parotid and submandibular salivary glands but absent in other tissues such as the sublingual gland, kidney, liver, and prostate gland. Hybridization histochemistry of human salivary glands shows mRNA for CA VI located in the acinar cells of these glands. The cDNA clones encode a protein of 308 amino acids that includes a 17 amino acid leader sequence typical of secreted proteins. The mature protein has 291 amino acids compared to 259 or 260 for the cytoplasmic isozymes, with most of the extra amino acids present as a carboxyl terminal extension. In comparison, sheep CA VI has a 45 amino acid extension. Overall the human CA VI protein has a sequence identity of 35% with human CA II, while residues involved in the active site of the enzymes have been conserved. The human and sheep secreted carbonic anhydrases have a sequence identity of 72%. This includes the two cysteine residues that are known to be involved in an intramolecular disulfide bond in the sheep CA VI. The enzyme is known to be glycosylated and three potential N-glycosylation sites (Asn-X-Thr/Ser) have been identified. Two of these are known to be glycosylated in sheep CA VI. Southern analysis of human DNA indicates that there is only one gene coding for CA VI.

  7. Molecular cloning and nucleotide sequence of cDNA for human liver arginase

    SciTech Connect

    Haraguchi, Y.; Takiguchi, M.; Amaya, Y.; Kawamoto, S.; Matsuda, I.; Mori, M.

    1987-01-01

    Arginase (EC 3.5.3.1) catalyzes the last step of the urea cycle in the liver of ureotelic animals. Inherited deficiency of the enzyme results in argininemia, an autosomal recessive disorder characterized by hyperammonemia. To facilitate investigation of the enzyme and gene structures and to elucidate the nature of the mutation in argininemia, the authors isolated cDNA clones for human liver arginase. Oligo(dT)-primed and random primer human liver cDNA libraries in lambda gt11 were screened using isolated rat arginase cDNA as a probe. Two of the positive clones, designated lambda hARG6 and lambda hARG109, contained an overlapping cDNA sequence with an open reading frame encoding a polypeptide of 322 amino acid residues (predicted Mr, 34,732), a 5'-untranslated sequence of 56 base pairs, a 3'-untranslated sequence of 423 base pairs, and a poly(A) segment. Arginase activity was detected in Escherichia coli cells transformed with the plasmid carrying lambda hARG6 cDNA insert. RNA gel blot analysis of human liver RNA showed a single mRNA of 1.6 kilobases. The predicted amino acid sequence of human liver arginase is 87% and 41% identical with those of the rat liver and yeast enzymes, respectively. There are several highly conserved segments among the human, rat, and yeast enzymes.

  8. Nucleotide sequence and infectious cDNA clone of the L1 isolate of Pea seed-borne mosaic potyvirus.

    PubMed

    Olsen, B S; Johansen, I E

    2001-01-01

    The complete nucleotide sequence of Pea seed-borne mosaic potyvirus isolate L1 has been determined from cloned virus cDNA. The PSbMV L1 genome is 9895 nucleotides in length excluding the poly(A) tail. Computer analysis of the sequence revealed a single long open reading frame (ORF) of 9594 nucleotides. The ORF potentially encodes a polyprotein of 3198 amino acids with a deduced Mr of 363537. Nine putative proteolytic cleavage sites were identified by analogy to consensus sequences and genome arrangement in other potyviruses. Two full-length cDNA clones, p35S-L1-4 and p35S-L1-5, were assembled under control of an enhanced 35S promoter and nopaline synthase terminator. Clone p35S-L1-4 was constructed with four introns and p35S-L1-5 with five introns inserted in the cDNA. Clone p35S-L1-4 was unstable in Escherichia coli often resulting in amplification of plasmids with deletions. Clone p35S-L1-5 was stable and apparently less toxic to Escherichia coli resulting in larger bacterial colonies and higher plasmid yield. Both clones were infectious upon mechanical inoculation of plasmid DNA on susceptible pea cultivars Fjord, Scout, and Brutus. Eight pea genotypes resistant to L1 virus were also resistant to the cDNA derived L1 virus. Both native PSbMV L1 and the cDNA derived virus infected Chenopodium quinoa systemically giving rise to characteristic necrotic lesions on uninoculated leaves.

  9. Nucleotide sequence of cloned cDNA for human pancreatic kallikrein.

    PubMed

    Fukushima, D; Kitamura, N; Nakanishi, S

    1985-12-31

    Cloned cDNA sequences for human pancreatic kallikrein have been isolated and determined by molecular cloning and sequence analysis. The identity between human pancreatic and urinary kallikreins is indicated by the complete coincidence between the amino acid sequence deduced from the cloned cDNA sequence and that reported partially for urinary kallikrein. The active enzyme form of the human pancreatic kallikrein consists of 238 amino acids and is preceded by a signal peptide and a profragment of 24 amino acids. A sequence comparison of this with other mammalian kallikreins indicates that key amino acid residues required for both serine protease activity and kallikrein-like cleavage specificity are retained in the human sequence, and residues corresponding to some external loops of the kallikrein diverge from other kallikreins. Analyses by RNA blot hybridization, primer extension, and S1 nuclease mapping indicate that the pancreatic kallikrein mRNA is also expressed in the kidney and sublingual gland, suggesting the active synthesis of urinary kallikrein in these tissues. Furthermore, the tissue-specific regulation of the expression of the members of the human kallikrein gene family has been discussed.

  10. Molecular cloning and nucleotide sequence of cDNA for human glucose-6-phosphate dehydrogenase variant A(-)

    SciTech Connect

    Hirono, A.; Beutler, E.

    1988-06-01

    Glucose-6-phosphate dehydrogenase A(-) is a common variant in Blacks that causes sensitivity to drug- and infection-induced hemolytic anemia. A cDNA library was constructed from Epstein-Barr virus-transformed lymphoblastoid cells from a male who was G6PD A(-). One of four cDNA clones isolated contained a sequence not found in the other clones nor in the published cDNA sequence. Consisting of 138 bases and coding 46 amino acids, this segment of cDNA apparently is derived from the alternative splicing involving the 3' end of intron 7. Comparison of the remaining sequences of these clones with the published sequence revealed three nucleotide substitutions: C^33 → G, G^202 → A, and A^376 → G. Each change produces a new restriction site. Genomic DNA from five G6PD A(-) individuals was amplified by the polymerase chain reaction. The finding of the same mutation in G6PD A(-) as is found in G6PD A(+) strongly suggests that the G6PD A(-) mutation arose in an individual with G6PD A(+), adding another mutation that causes the in vivo instability of this enzyme protein.

  11. Human uroporphyrinogen III synthase: Molecular cloning, nucleotide sequence, and expression of a full-length cDNA

    SciTech Connect

    Tsai, Shihfeng; Bishop, D.F.; Desnick, R.J.

    1988-10-01

    Uroporphyrinogen III synthase, the fourth enzyme in the heme biosynthetic pathway, is responsible for conversion of the linear tetrapyrrole, hydroxymethylbilane, to the cyclic tetrapyrrole, uroporphyrinogen III. The deficient activity of URO-synthase is the enzymatic defect in the autosomal recessive disorder congenital erythropoietic porphyria. To facilitate the isolation of a full-length cDNA for human URO-synthase, the human erythrocyte enzyme was purified to homogeneity and 81 nonoverlapping amino acids were determined by microsequencing the N terminus and four tryptic peptides. Two synthetic oligonucleotide mixtures were used to screen 1.2 × 10^6 recombinants from a human adult liver cDNA library. Eight clones were positive with both oligonucleotide mixtures. Of these, dideoxy sequencing of the 1.3 kilobase insert from clone pUROS-2 revealed 5' and 3' untranslated sequences of 196 and 284 base pairs, respectively, and an open reading frame of 798 base pairs encoding a protein of 265 amino acids with a predicted molecular mass of 28,607 Da. The isolation and expression of this full-length cDNA for human URO-synthase should facilitate studies of the structure, organization, and chromosomal localization of this heme biosynthetic gene as well as the characterization of the molecular lesions causing congenital erythropoietic porphyria.
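The lengths quoted for clone pUROS-2 can be cross-checked with simple arithmetic. The sketch below assumes that the reported open reading frame length includes the stop codon, which the abstract does not state explicitly; the function names are illustrative only.

```python
# Figures quoted in the abstract for clone pUROS-2.
UTR5_BP = 196      # 5' untranslated sequence
ORF_BP = 798       # open reading frame
UTR3_BP = 284      # 3' untranslated sequence
PROTEIN_AA = 265   # residues in the encoded protein


def orf_length_bp(n_residues: int, include_stop: bool = True) -> int:
    """Length in bp of an ORF encoding n_residues amino acids.

    Assumption: one codon (3 bp) per residue, plus one stop codon
    when include_stop is True.
    """
    return 3 * (n_residues + (1 if include_stop else 0))


def insert_length_bp(utr5: int, orf: int, utr3: int) -> int:
    """Total cDNA length (excluding any poly(A) tail) from its three parts."""
    return utr5 + orf + utr3


print(orf_length_bp(PROTEIN_AA) == ORF_BP)          # does 265 aa + stop give 798 bp?
print(insert_length_bp(UTR5_BP, ORF_BP, UTR3_BP))   # compare with the ~1.3 kb insert
```

Under that assumption the 265 codons plus a stop codon account for the 798 bp reading frame, and the three segments sum to 1278 bp, in line with the insert size of about 1.3 kilobases.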

  12. Nucleotide sequence of Phaseolus vulgaris L. alcohol dehydrogenase encoding cDNA and three-dimensional structure prediction of the deduced protein

    PubMed Central

    Amelia, Kassim; Khor, Chin Yin; Shah, Farida Habib; Bhore, Subhash J.

    2015-01-01

    Background: Common beans (Phaseolus vulgaris L.) are widely consumed as a source of proteins and natural products. However, their yield needs to be increased. In line with the agenda of Phaseomics (an international consortium), work on the generation of expressed sequence tags (ESTs) from bean pods was initiated. Altogether, 5972 ESTs have been isolated. Alcohol dehydrogenase (AD) encoding gene cDNA was a noticeable transcript among the generated ESTs. This AD is an important enzyme; therefore, this study was undertaken to understand more about it. Objective: The objective of this study was to elucidate the P. vulgaris L. AD (PvAD) gene cDNA sequence and to predict the three-dimensional (3D) structure of the deduced protein. Materials and Methods: Positive and negative strands of the PvAD cDNA clone were sequenced using M13 forward and M13 reverse primers to elucidate the nucleotide sequence. The deduced PvAD cDNA and protein sequences were analyzed for their basic features using online bioinformatics tools. Sequence comparison was carried out using the bl2seq program, and the tree-view program was used to construct a phylogenetic tree. The secondary structures and 3D structure of PvAD protein were predicted by using the PHYRE automatic fold recognition server. Results: The sequencing results analysis showed that PvAD cDNA is 1294 bp in length. Its open reading frame encodes a protein that contains 371 amino acids. Deduced protein sequence analysis showed the presence of putative substrate binding, catalytic Zn binding, and NAD binding sites. Results indicate that the predicted 3D structure of PvAD protein is analogous to the experimentally determined crystal structure of s-nitrosoglutathione reductase from an Arabidopsis species. Conclusions: The 1294 bp long PvAD cDNA encodes a 371 amino acid long protein that contains conserved domains required for biological functions of AD. The predicted deduced PvAD protein's 3D structure reflects the analogy with the crystal structure of

  13. Nucleotide sequence of murine PCNA: interspecies comparison of the cDNA and the 5' flanking region of the gene.

    PubMed

    Shipman-Appasamy, P M; Cohen, K S; Prystowsky, M B

    1991-01-01

    Proliferating cell nuclear antigen (PCNA) RNA levels are regulated by transcription as well as changes in stability, in growing cells. We have cloned the murine PCNA cDNA and a fragment of the murine PCNA gene flanking the transcription initiation site. Comparison of the murine deduced amino acid sequence with the PCNA sequence from rat, human, Drosophila, Saccharomyces cerevisiae, and higher plants, reveals extensive homology between species. The homology is likely to be related to the fundamental role of PCNA as an auxiliary protein for DNA replication. Consensus sequences for transcriptional regulatory factors identified within 520 bp 5' of the cap site of the murine PCNA gene include: an inverted CCAAT site, an enhancer core element (EBP-1), three cAMP-response elements (CRE-BP), one AP-2 site, three Sp1 sites, and two octamer sequences. The first 20 bp of the transcriptional unit are homologous to an initiator element, which may direct transcription from RNA polymerase II in the absence of a TATAA box. The consensus elements in the murine PCNA gene are similar in sequence and/or location to elements identified in the genes for human, Drosophila, and yeast PCNA.

  14. Uroporphyrinogen-III synthase: Molecular cloning, nucleotide sequence, expression of a mouse full-length cDNA, and its localization on mouse chromosome 7

    SciTech Connect

    Xu, W.; Desnick, R.J.; Kozak, C.A.

    1995-04-10

    Uroporphyrinogen-III synthase, the fourth enzyme in the heme biosynthetic pathway, is responsible for the conversion of hydroxymethylbilane to the cyclic tetrapyrrole, uroporphyrinogen III. The deficient activity of URO-S is the enzymatic defect in congenital erythropoietic porphyria (CEP), an autosomal recessive disorder. For the generation of a mouse model of CEP, the human URO-S cDNA was used to screen 2 × 10^6 recombinants from a mouse adult liver cDNA library. Ten positive clones were isolated, and dideoxy sequencing of the entire 1.6-kb insert of clone pmUROS-1 revealed 5' and 3' untranslated sequences of 144 and 623 bp, respectively, and an open reading frame of 798 bp encoding a 265-amino-acid polypeptide with a predicted molecular mass of 28,501 Da. The mouse and human coding sequences had 80.5 and 77.8% nucleotide and amino acid identity, respectively. The authenticity of the mouse cDNA was established by expression of the active monomeric enzyme in Escherichia coli. In addition, the analysis of two multilocus genetic crosses localized the mouse gene on chromosome 7, consistent with the mapping of the human gene to a position of conserved synteny on chromosome 10. The isolation, expression, and chromosomal mapping of this full-length cDNA should facilitate studies of the structure and organization of the mouse genomic sequence and the development of a mouse model of CEP for characterization of the disease pathogenesis and evaluation of gene therapy. 38 refs., 1 tab.

  15. Full-length cDNA nucleotide sequence of a serologically undetectable HLA-DQA1 allele: HLA-DQA1*"LA".

    PubMed

    Lardy, N M; Otting, N; van der Horst, A R; Bontrop, R E; de Waal, L P

    1997-10-01

    This study describes the characterization of a serological HLA-DQ"blank" specificity that segregates with the HLA-A2, -B7, -DR14, -DR52 haplotype. Although conventional serological typing techniques could not detect an HLA-DQ product on the haplotype positive for the HLA-DQ"blank" specificity, sequence-specific oligonucleotide (SSO) dot-blot analysis demonstrated the presence of the HLA-DQA1*01 and HLA-DQB1*05 alleles. Full-length cDNA nucleotide sequence analysis revealed that the HLA-DQB1 allele that segregated with the HLA-DQ"blank" specificity was identical to HLA-DQB1*05031. As for the HLA-DQA1 allele, one nucleotide substitution distinguished the HLA-DQA1 "blank" allele from HLA-DQA1*0104. In exon 2 at nucleotide position 304 a C was replaced by a T (Arg-->Cys). Pending official recognition by the WHO Nomenclature Committee, this HLA-DQA1 "blank" allele is termed HLA-DQA1*"LA". Furthermore, it is postulated that the introduction of cysteine at amino acid position 102 abrogates the classical HLA-DQ1 specificity.
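The position arithmetic in this record can be checked with a short sketch. It assumes that nucleotide 304 is counted from the first base of the start codon and that amino acid 102 is counted from the initiator methionine; the abstract does not state the numbering convention. The snippet maps the nucleotide to its codon and lists which arginine codons become cysteine codons through a single C-to-T change at that position, using the standard genetic code. The helper names are illustrative only.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_position(nt_pos: int) -> tuple[int, int]:
    """Return (1-based codon number, 0-based offset within that codon)."""
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3


def arg_to_cys_codons(offset: int) -> list[tuple[str, str]]:
    """Arg codons that turn into Cys codons by a C->T change at `offset`."""
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa == "R" and codon[offset] == "C":
            mutated = codon[:offset] + "T" + codon[offset + 1:]
            if CODON_TABLE[mutated] == "C":
                hits.append((codon, mutated))
    return sorted(hits)


codon_no, offset = codon_position(304)
print(codon_no, offset)            # codon number and position within the codon
print(arg_to_cys_codons(offset))   # candidate (Arg codon, Cys codon) pairs
```

Under these assumptions nucleotide 304 is the first base of codon 102, and only CGC or CGT can yield a cysteine codon (TGC or TGT) through a C-to-T change there.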

  16. Complete nucleotide and derived amino acid sequence of cDNA encoding the mitochondrial uncoupling protein of rat brown adipose tissue: lack of a mitochondrial targeting presequence.

    PubMed Central

    Ridley, R G; Patel, H V; Gerber, G E; Morton, R C; Freeman, K B

    1986-01-01

    A cDNA clone spanning the entire amino acid sequence of the nuclear-encoded uncoupling protein of rat brown adipose tissue mitochondria has been isolated and sequenced. With the exception of the N-terminal methionine the deduced N-terminus of the newly synthesized uncoupling protein is identical to the N-terminal 30 amino acids of the native uncoupling protein as determined by protein sequencing. This proves that the protein contains no N-terminal mitochondrial targeting prepiece and that a targeting region must reside within the amino acid sequence of the mature protein. PMID:3012461

  17. Structure of LEP100, a glycoprotein that shuttles between lysosomes and the plasma membrane, deduced from the nucleotide sequence of the encoding cDNA

    PubMed Central

    1988-01-01

    LEP100, a membrane glycoprotein that has the unique property of shuttling from lysosomes to endosomes to plasma membrane and back, was purified from chicken brain. Its NH2-terminal amino acid sequence was determined, and an oligonucleotide encoding part of this sequence was used to clone the encoding cDNA. The deduced amino acid sequence consists of 414 residues of which the NH2-terminal 18 constitute a signal peptide. The sequence includes 17 sites for N-glycosylation in the NH2-terminal 75% of the polypeptide chain followed by a region lacking N-linked oligosaccharides, a single possible membrane-spanning segment, and a cytoplasmic domain of 11 residues, including three potential phosphorylation sites. Eight cysteine residues are spaced in a regular pattern through the lumenal (extracellular) domain, while a 32-residue sequence rich in proline, serine, and threonine occurs at its midpoint. Expression of the cDNA in mouse L cells resulted in targeting of LEP100 primarily to the mouse lysosomes. PMID:3339090

  18. Cloning and partial nucleotide sequence of human immunoglobulin mu chain cDNA from B cells and mouse-human hybridomas.

    PubMed Central

    Dolby, T W; Devuono, J; Croce, C M

    1980-01-01

    Purified mRNAs coding for mu and kappa human immunoglobulin polypeptides were translated in vitro and their products were characterized. The mu-specific mRNAs, derived from both human lymphoblastoid cells (GM607) and from a mouse-human somatic cell hybrid secreting human mu chains (alpha D5-H11-BC11), were copied into cDNAs and inserted into the plasmid pBR322. Several recombinant cDNAs that were obtained were identified by a combination of colony hybridization with labeled probes, in vitro translation of plasmid-selected mu mRNAs, and DNA nucleotide sequence determination. One recombinant DNA, for which the sequence has been partially determined, contains the codons for part of the C3 constant region domain through the carboxy-terminal piece (155 amino acids total) as well as the entire 3' noncoding sequence up to the poly(A) site of the human mu mRNA. The sequence A-A-U-A-A occurs 12 nucleotides prior to the poly(A) addition site in the human mu mRNA. Considerable sequence homology is observed in the mouse and human mu mRNA 3' coding and noncoding sequences. PMID:6777778

  19. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

    1993-02-16

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a pu GOVERNMENT RIGHTS This application was funded under Department of Energy Contract DE-AC02-76ER01338. The U.S. Government has certain rights under this application and any patent issuing thereon.

  20. Nucleotide sequence of a tobacco cDNA encoding plastidic glutamine synthetase and light inducibility, organ specificity and diurnal rhythmicity in the expression of the corresponding genes of tobacco and tomato.

    PubMed

    Becker, T W; Caboche, M; Carrayol, E; Hirel, B

    1992-06-01

    A full-length cDNA encoding glutamine synthetase (GS) was cloned from a lambda gt10 library of tobacco leaf RNA, and the nucleotide sequence was determined. An open reading frame accounting for a primary translation product consisting of 432 amino acids has been localized on the cDNA. The calculated molecular mass of the encoded protein is 47.2 kDa. The predicted amino acid sequence of this precursor shows higher homology to GS-2 protein sequences from other species than to a leaf GS-1 polypeptide sequence, indicating that the cDNA isolated encodes the chloroplastic isoform (GS-2) of tobacco GS. The presence of C- and N-terminal extensions which are characteristic of GS-2 proteins supports this conclusion. Genomic Southern blot analysis indicated that GS-2 is encoded by a single gene in the diploid genomes of both tomato and Nicotiana sylvestris, while two GS-2 genes are very likely present in the amphidiploid tobacco genome. Western blot analysis indicated that in etiolated and in green tomato cotyledons GS-2 subunits are represented by polypeptides of similar size, while in green tomato leaves an additional GS-2 polypeptide of higher apparent molecular weight is detectable. In contrast, tobacco GS-2 is composed of subunits of identical size in all organs examined. GS-2 transcripts and GS-2 proteins could be detected at high levels in the leaves of both tobacco and tomato. Lower amounts of GS-2 mRNA were detected in stems, corolla, and roots of tomato, but not in non-green organs of tobacco. The GS-2 transcript abundance exhibited a diurnal fluctuation in tomato leaves but not in tobacco leaves. White or red light stimulated the accumulation of GS-2 transcripts and GS-2 protein in etiolated tomato cotyledons. Far-red light cancelled this stimulation. The red light response of the GS-2 gene was reduced in etiolated seedlings of the phytochrome-deficient aurea mutant of tomato. These results indicate a phytochrome-mediated light stimulation of GS-2 gene expression

  1. Human somatostatin I: sequence of the cDNA.

    PubMed Central

    Shen, L P; Pictet, R L; Rutter, W J

    1982-01-01

    RNA has been isolated from a human pancreatic somatostatinoma and used to prepare a cDNA library. After prescreening, clones containing somatostatin I sequences were identified by hybridization with an anglerfish somatostatin I-cloned cDNA probe. From the nucleotide sequence of two of these clones, we have deduced an essentially full-length mRNA sequence, including the preprosomatostatin coding region, 105 nucleotides from the 5' untranslated region and the complete 150-nucleotide 3' untranslated region. The coding region predicts a 116-amino acid precursor protein (Mr, 12,727) that contains somatostatin-14 and -28 at its COOH terminus. The predicted amino acid sequence of human somatostatin-28 is identical to that of somatostatin-28 isolated from the porcine and ovine species. A comparison of the amino acid sequences of human and anglerfish preprosomatostatin I indicated that the COOH-terminal region encoding somatostatin-14 and the adjacent 6 amino acids are highly conserved, whereas the remainder of the molecule, including the signal peptide region, is more divergent. However, many of the amino acid differences found in the pro region of the human and anglerfish proteins are conservative changes. This suggests that the propeptides have a similar secondary structure, which in turn may imply a biological function for this region of the molecule. PMID:6126875

  2. The EMBL Nucleotide Sequence Database.

    PubMed

    Stoesser, Guenter; Baker, Wendy; van den Broek, Alexandra; Camon, Evelyn; Garcia-Pastor, Maria; Kanz, Carola; Kulikova, Tamara; Leinonen, Rasko; Lin, Quan; Lombard, Vincent; Lopez, Rodrigo; Redaschi, Nicole; Stoehr, Peter; Tuli, Mary Ann; Tzouvara, Katerina; Vaughan, Robert

    2002-01-01

    The EMBL Nucleotide Sequence Database (aka EMBL-Bank; http://www.ebi.ac.uk/embl/) incorporates, organises and distributes nucleotide sequences from all available public sources. EMBL-Bank is located and maintained at the European Bioinformatics Institute (EBI) near Cambridge, UK. In an international collaboration with DDBJ (Japan) and GenBank (USA), data are exchanged amongst the collaborating databases on a daily basis. Major contributors to the EMBL database are individual scientists and genome project groups. Webin is the preferred web-based submission system for individual submitters, whilst automatic procedures allow incorporation of sequence data from large-scale genome sequencing centres and from the European Patent Office (EPO). Database releases are produced quarterly. Network services allow free access to the most up-to-date data collection via FTP, email and World Wide Web interfaces. EBI's Sequence Retrieval System (SRS), a network browser for databanks in molecular biology, integrates and links the main nucleotide and protein databases plus many other specialized databases. For sequence similarity searching, a variety of tools (e.g. Blitz, Fasta, BLAST) are available which allow external users to compare their own sequences against the latest data in the EMBL Nucleotide Sequence Database and SWISS-PROT. All resources can be accessed via the EBI home page at http://www.ebi.ac.uk.

  3. Complete nucleotide sequences and construction of full-length infectious cDNA clones of cucumber green mottle mosaic virus (CGMMV) in a versatile newly developed binary vector including both 35S and T7 promoters.

    PubMed

    Park, Chan-Hwan; Ju, Hye-Kyoung; Han, Jae-Yeong; Park, Jong-Seo; Kim, Ik-Hyun; Seo, Eun-Young; Kim, Jung-Kyu; Hammond, John; Lim, Hyoun-Sub

    2017-04-01

    Seed-transmitted viruses have caused significant damage to watermelon crops in Korea in recent years, with cucumber green mottle mosaic virus (CGMMV) infection widespread as a result of infected seed lots. To determine the likely origin of CGMMV infection, we collected CGMMV isolates from watermelon and melon fields and generated full-length infectious cDNA clones. The full-length cDNAs were cloned into newly constructed binary vector pJY, which includes both the 35S and T7 promoters for versatile usage (agroinfiltration and in vitro RNA transcription) and a modified hepatitis delta virus ribozyme sequence to precisely cleave RNA transcripts at the 3' end of the tobamovirus genome. Three CGMMV isolates (OMpj, Wpj, and Mpj) were separately evaluated for infectivity in Nicotiana benthamiana, demonstrated by either agroinfiltration or inoculation with in vitro RNA transcripts. CGMMV nucleotide identities to other tobamoviruses were calculated from pairwise alignments using DNAMAN. CGMMV identities were 49.89% to tobacco mosaic virus; 49.85% to pepper mild mottle virus; 50.47% to tomato mosaic virus; 60.9% to zucchini green mottle mosaic virus; and 60.96% to kyuri green mottle mosaic virus, confirming that CGMMV is a distinct species most similar to other cucurbit-infecting tobamoviruses. We further performed phylogenetic analysis to determine relationships of our new Korean CGMMV isolates to previously characterized isolates from Canada, China, India, Israel, Japan, Korea, Russia, Spain, and Taiwan available from NCBI. Analysis of CGMMV amino acid sequences showed three major clades, broadly typified as 'Russian,' 'Israeli,' and 'Asian' groups. All of our new Korean isolates fell within the 'Asian' clade. Neither the 128 nor 186 kDa RdRps of the three new isolates showed any detectable gene silencing suppressor function.

  4. Automated Identification of Nucleotide Sequences

    NASA Technical Reports Server (NTRS)

    Osman, Shariff; Venkateswaran, Kasthuri; Fox, George; Zhu, Dian-Hui

    2007-01-01

    STITCH is a computer program that processes raw nucleotide-sequence data to automatically remove unwanted vector information, perform reverse-complement comparison, stitch shorter sequences together to make longer ones to which the shorter ones presumably belong, and search against the user's choice of private and Internet-accessible public 16S rRNA databases. ["16S rRNA" denotes a ribosomal ribonucleic acid (rRNA) sequence that is common to all organisms.] In STITCH, a template 16S rRNA sequence is used to position forward and reverse reads. STITCH then automatically searches known 16S rRNA sequences in the user's chosen database(s) to find the sequence most similar to (the sequence that lies at the smallest edit distance from) each spliced sequence. The result of processing by STITCH is the identification of the most similar well-described bacterium. Whereas previously commercially available software for analyzing genetic sequences operates on one sequence at a time, STITCH can manipulate multiple sequences simultaneously to perform the aforementioned operations. A typical analysis of several dozen sequences (length of the order of 10^3 base pairs) by use of STITCH is completed in a few minutes, whereas such an analysis performed by use of prior software takes hours or days.
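
    The reverse-complement comparison and smallest-edit-distance search described in this record can be illustrated with a minimal sketch. This is not STITCH's code: the function names, the use of plain Levenshtein distance, and the in-memory dictionary standing in for a 16S rRNA database are all assumptions made for illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (assumed metric), computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def closest_reference(query: str, references: dict):
    """Return (name, distance) of the nearest reference, trying both strands.

    `references` is a hypothetical stand-in for a sequence database.
    Returns None if `references` is empty.
    """
    best = None
    for name, ref in references.items():
        d = min(edit_distance(query, ref),
                edit_distance(reverse_complement(query), ref))
        if best is None or d < best[1]:
            best = (name, d)
    return best
```

    Checking both orientations of the query mirrors the reverse-complement step named in the abstract. For reads of the order of 10^3 base pairs, a quadratic-time distance like this one is workable for a few dozen queries but would need indexing or heuristics against a large database.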

  5. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    1999-05-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli. 12 figs.

  6. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

    1995-03-21

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  7. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

    1999-05-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  8. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    1995-03-21

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1,018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli. 11 figures.

  9. cDNA encoding a polypeptide including a hevein sequence

    SciTech Connect

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    2000-07-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  10. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, N.V.; Broekaert, W.F.; Chua, N.H.; Kush, A.

    1993-02-16

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1,018 nucleotides long and includes an open reading frame of 204 amino acids.

  11. cDNA cloning and sequencing of tarantula hemocyanin subunits.

    PubMed

    Voit, R; Feldmaier-Fuchs, G

    1990-01-01

    Tarantula heart cDNA libraries were screened with synthetic oligonucleotide probes deduced from the highly conserved amino acid sequences of the two copper-binding sites, copper A and copper B, found in chelicerate hemocyanins. Positive cDNA clones could be obtained and four different cDNA types were characterized.

  12. Nucleotide sequences important for translation initiation of enterovirus RNA.

    PubMed Central

    Iizuka, N; Yonekawa, H; Nomoto, A

    1991-01-01

    An infectious cDNA clone was constructed from the genome of coxsackievirus B1 strain. A number of RNA transcripts that have mutations in the 5' noncoding region were synthesized in vitro from the modified cDNA clones and examined for their abilities to act as mRNAs in a cell-free translation system prepared from HeLa S3 cells. RNAs that lack nucleotide sequences at positions 568 to 726 and 565 to 726 were found to be less efficient and inactive mRNAs, respectively. To understand the biological significance of this region of RNA, small deletions and point mutations were introduced in the nucleotide sequence between positions 538 and 601. Except for a nucleotide substitution at 592 (U→C) within the 7-base conserved sequence, mutations introduced in the sequence downstream of position 568 did not affect much, if any, of the ability of RNA to act as mRNA. Except for a point mutation at 558 (C→U), mutations upstream of position 567 appeared to inactivate the mRNA. In the upstream region, a sequence consisting of 21 nucleotides at positions 546 to 566 is perfectly conserved in the 5' noncoding regions of enterovirus and rhinovirus genomes. These results suggest that the 7-base conserved sequence functions to maintain the efficiency of translation initiation and that the nucleotide sequence upstream of position 567, including the 21-base conserved sequence, plays essential roles in translation initiation. A deletion mutant whose genome lacks the nucleotide sequence at positions 568 to 726 showed a small-plaque phenotype and less virulence against suckling mice than the wild-type virus. Thus, reduction of the efficiency of translation initiation may result in the construction of enteroviruses with the lower-virulence phenotype. PMID:1651409

  13. Nucleotide sequences encoding a thermostable alkaline protease

    DOEpatents

    Wilson, D.B.; Lao, G.

    1998-01-06

    Nucleotide sequences, derived from a thermophilic actinomycete microorganism, which encode a thermostable alkaline protease are disclosed. Also disclosed are variants of the nucleotide sequences which encode a polypeptide having thermostable alkaline proteolytic activity. Recombinant thermostable alkaline protease or recombinant polypeptide may be obtained by culturing in a medium a host cell genetically engineered to contain and express a nucleotide sequence according to the present invention, and recovering the recombinant thermostable alkaline protease or recombinant polypeptide from the culture medium. 3 figs.

  14. Nucleotide sequences encoding a thermostable alkaline protease

    DOEpatents

    Wilson, David B.; Lao, Guifang

    1998-01-01

    Nucleotide sequences, derived from a thermophilic actinomycete microorganism, which encode a thermostable alkaline protease are disclosed. Also disclosed are variants of the nucleotide sequences which encode a polypeptide having thermostable alkaline proteolytic activity. Recombinant thermostable alkaline protease or recombinant polypeptide may be obtained by culturing in a medium a host cell genetically engineered to contain and express a nucleotide sequence according to the present invention, and recovering the recombinant thermostable alkaline protease or recombinant polypeptide from the culture medium.

  15. Complete nucleotide sequence of tobacco streak virus RNA 3.

    PubMed Central

    Cornelissen, B J; Janssen, H; Zuidema, D; Bol, J F

    1984-01-01

    Double-stranded cDNA of in vitro polyadenylated tobacco streak virus (TSV) RNA 3 has been cloned and sequenced. The complete primary structure of 2,205 nucleotides reveals two open reading frames flanked by a leader sequence of 210 bases, an intercistronic region of 123 nucleotides and a 3'-extracistronic sequence of 288 nucleotides. The 5'-terminal open reading frame codes for a Mr 31,742 protein, which probably corresponds to the only in vitro translation product of TSV RNA 3. The 3'-terminal coding region predicts a Mr 26,346 protein, probably the viral coat protein, which is the translation product of the subgenomic messenger, RNA 4. Although the coat proteins of alfalfa mosaic virus (AlMV) and TSV are functionally equivalent in activating their own and each other's genomes, no homology between the primary structures of those two proteins is detectable. PMID:6546793

  16. Sequence of the cDNA encoding an actin homolog in the crayfish Procambarus clarkii.

    PubMed

    Kang, W K; Naya, Y

    1993-11-15

    A cDNA library was constructed by using mRNAs purified from crayfish (Procambarus clarkii) muscle. Using a homology search of the nucleotide (nt) sequences, a clone of the library was found to encode a protein homologous to actin (Act). The insert fragment of this cDNA clone was 1072 nt in length. The amino acid sequence deduced from the nt sequence showed significant similarity to Act of various organisms as follows: 88.1% to Drosophila melanogaster, 88.2% to silkworm, 87.3% to brine shrimp, 86.3% to rat, and 86.3% to human (% identity).

  17. Long-range correlations in nucleotide sequences

    NASA Astrophysics Data System (ADS)

    Peng, C.-K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Sciortino, F.; Simons, M.; Stanley, H. E.

    1992-03-01

    DNA sequences have been analysed using models, such as an n-step Markov chain, that incorporate the possibility of short-range nucleotide correlations. We propose here a method for studying the stochastic properties of nucleotide sequences by constructing a 1:1 map of the nucleotide sequence onto a walk, which we term a 'DNA walk'. We then use the mapping to provide a quantitative measure of the correlation between nucleotides over long distances along the DNA chain. Thus we uncover in the nucleotide sequence a remarkably long-range power law correlation that implies a new scale-invariant property of DNA. We find such long-range correlations in intron-containing genes and in nontranscribed regulatory DNA sequences, but not in complementary DNA sequences or intron-less genes.
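
    The mapping of a sequence onto a walk can be sketched in a few lines. The abstract does not state the step rule, so the rule below (step up for a pyrimidine, down for a purine) and the root-mean-square fluctuation used as the correlation measure are assumptions for illustration, as are the function names.

```python
import math

PYRIMIDINES = {"C", "T"}  # assumed step rule: pyrimidine = +1
PURINES = {"A", "G"}      # assumed step rule: purine = -1


def dna_walk(seq: str) -> list:
    """Return the walk y(0), y(1), ..., y(n) for a sequence of n bases, y(0) = 0."""
    y = [0]
    for base in seq.upper():
        if base in PYRIMIDINES:
            step = 1
        elif base in PURINES:
            step = -1
        else:
            raise ValueError(f"unexpected base: {base!r}")
        y.append(y[-1] + step)
    return y


def rms_fluctuation(y: list, window: int) -> float:
    """RMS fluctuation of the displacement y(i + window) - y(i) over all i."""
    if not 0 < window < len(y):
        raise ValueError("window must be between 1 and len(y) - 1")
    deltas = [y[i + window] - y[i] for i in range(len(y) - window)]
    mean = sum(deltas) / len(deltas)
    return math.sqrt(sum((d - mean) ** 2 for d in deltas) / len(deltas))
```

    Under these assumptions, plotting the fluctuation against the window length on log-log axes gives a slope near 0.5 for an uncorrelated sequence; a slope that stays above 0.5 over long windows is the signature of a long-range power law correlation of the kind the abstract reports.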

  18. Long-range correlations in nucleotide sequences

    NASA Technical Reports Server (NTRS)

    Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Sciortino, F.; Simons, M.; Stanley, H. E.

    1992-01-01

    DNA sequences have been analysed using models, such as an n-step Markov chain, that incorporate the possibility of short-range nucleotide correlations. We propose here a method for studying the stochastic properties of nucleotide sequences by constructing a 1:1 map of the nucleotide sequence onto a walk, which we term a 'DNA walk'. We then use the mapping to provide a quantitative measure of the correlation between nucleotides over long distances along the DNA chain. Thus we uncover in the nucleotide sequence a remarkably long-range power law correlation that implies a new scale-invariant property of DNA. We find such long-range correlations in intron-containing genes and in nontranscribed regulatory DNA sequences, but not in complementary DNA sequences or intron-less genes.

  19. Long-range correlations in nucleotide sequences

    NASA Technical Reports Server (NTRS)

    Peng, C. K.; Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Sciortino, F.; Simons, M.; Stanley, H. E.

    1992-01-01

    DNA sequences have been analysed using models, such as an n-step Markov chain, that incorporate the possibility of short-range nucleotide correlations. We propose here a method for studying the stochastic properties of nucleotide sequences by constructing a 1:1 map of the nucleotide sequence onto a walk, which we term a 'DNA walk'. We then use the mapping to provide a quantitative measure of the correlation between nucleotides over long distances along the DNA chain. Thus we uncover in the nucleotide sequence a remarkably long-range power law correlation that implies a new scale-invariant property of DNA. We find such long-range correlations in intron-containing genes and in nontranscribed regulatory DNA sequences, but not in complementary DNA sequences or intron-less genes.

  20. Flavin reductase: sequence of cDNA from bovine liver and tissue distribution.

    PubMed Central

    Quandt, K S; Hultquist, D E

    1994-01-01

    Flavin reductase catalyzes electron transfer from reduced pyridine nucleotides to methylene blue or riboflavin, and this catalysis is the basis of the therapeutic use of methylene blue or riboflavin in the treatment of methemoglobinemia. A cDNA for a mammalian flavin reductase has been isolated and sequenced. Degenerate oligonucleotides, with sequences based on amino acid sequences of peptides derived from bovine erythrocyte flavin reductase, were used as primers in PCR to selectively amplify a partial cDNA that encodes the bovine reductase. The template used in the PCR was first-strand cDNA synthesized from bovine liver total RNA using oligo(dT) primers. A PCR product was used as a specific probe to screen a bovine liver cDNA library. The sequence determined from two overlapping clones contains an open reading frame of 621 nucleotides and encodes 206 amino acids. The amino acid sequence deduced from the bovine liver flavin reductase cDNA matches the amino acid sequences determined for erythrocyte reductase-derived peptides, and the predicted molecular mass of 22,001 Da for the liver reductase agrees well with the molecular mass of 21,994 Da determined for the erythrocyte reductase by electrospray mass spectrometry. The amino acid sequence at the N terminus of the reductase has homology to sequences of pyridine nucleotide-dependent enzymes, and the predicted secondary structure, beta alpha beta, resembles the common nucleotide-binding structural motif. RNA blot analysis indicates a single 1-kilobase reductase transcript in human heart, kidney, liver, lung, pancreas, placenta, and skeletal muscle. PMID:7937764

  1. A model organism for new gene discovery by cDNA sequencing

    SciTech Connect

    El-Sayed, N.M.; Donelson, J.E.; Alarcon, C.M.

    1994-09-01

    One method of new gene discovery is single-pass sequencing of cDNAs to identify expressed sequence tags (ESTs). Model organisms can have biological properties that make their use advantageous over studies with humans. One such model organism with advantages for cDNA sequencing is the African trypanosome T. brucei rhodesiense. This organism has the same 40-nucleotide sequence (splice leader sequence) on the 5' end of all mRNAs. We have constructed a 5' cDNA library by priming off the splice leader sequence and have begun sequencing this cDNA library. To date, nearly 500 such cDNA ESTs have been examined. Forty-three percent of the sequences sampled from the trypanosome cDNA library have significant similarities to sequences already in the protein and translated nucleic acid databases. Among these are cDNA sequences which encode previously reported T. brucei proteins such as the VSG, tubulin, calflagin, etc., and proteins previously identified in other trypanosomatids. Other cDNAs display significant similarities to genes in unrelated organisms encoding several ribosomal proteins, metabolic enzymes, GTP binding proteins, transcription factors, cyclophilin, nucleosomal histones, histone H1, and a macrophage stress protein, among others. The 57% of the cDNAs that are not similar to sequences currently in the databases likely encode both trypanosome-specific proteins and housekeeping proteins shared with other eukaryotes. These cDNA ESTs provide new avenues of research for exploring both the biochemistry and the genome organization of this parasite, as well as a resource for identifying the 5' sequence of novel genes likely to have homology to genes expressed in other organisms.

  2. Nucleotide capacitance calculation for DNA sequencing

    SciTech Connect

    Lu, Jun-Qiang; Zhang, Xiaoguang

    2008-01-01

    Using a first-principles linear response theory, the capacitances of the DNA nucleotides adenine, cytosine, guanine and thymine are calculated. The difference in capacitance between the nucleotides is studied with respect to conformational distortion. The result suggests that although an alternating-current capacitance measurement of a single-stranded DNA chain threaded through nano-gap electrodes may not be sufficient as a stand-alone method for rapid DNA sequencing, the capacitance of the nucleotides should be taken into consideration in any GHz-frequency electric measurements and may also serve as an additional criterion for identifying the DNA sequence.

  3. Human and Tree Shrew Alpha-synuclein: Comparative cDNA Sequence and Protein Structure Analysis.

    PubMed

    Wu, Zheng-Cun; Huang, Zhang-Qiong; Jiang, Qin-Fang; Dai, Jie-Jie; Zhang, Ying; Gao, Jia-Hong; Sun, Xiao-Mei; Chen, Nai-Hong; Yuan, Yu-He; Li, Cong; Han, Yuan-Yuan; Li, Yun; Ma, Kai-Li

    2015-10-01

    The synaptic protein alpha-synuclein (α-syn) is associated with a number of neurodegenerative diseases, and homology analyses among many species have been reported. Nevertheless, little is known about the cDNA sequence and protein structure of α-syn in tree shrews, and this information might contribute to our understanding of its role in both health and disease. We designed primers to the human α-syn cDNA sequence; then, tree shrew α-syn cDNA was obtained by RT-PCR and sequenced. Based on the acquired tree shrew α-syn cDNA sequence, both the amino acid sequence and the spatial structure of α-syn were predicted and analyzed. The homology analysis results showed that the tree shrew cDNA sequence matches the human cDNA sequence exactly except at nucleotide positions 45, 60, 65, 69, 93, 114, 147, 150, 157, 204, 252, 270, 284, 298, 308, and 324. Further protein sequence analysis revealed that the tree shrew α-syn protein sequence is 97.1% identical to that of human α-syn. The secondary protein structure of tree shrew α-syn based on random coils and α-helices is the same as that of the human structure. The phosphorylation sites are highly conserved, except the site at position 103 of tree shrew α-syn. The predicted spatial structure of tree shrew α-syn is identical to that of human α-syn. Thus, α-syn might have a similar function in tree shrew and in human, and tree shrew might be a potential animal model for studying the pathogenesis of α-synucleinopathies.
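
    A position-by-position comparison of the kind reported in this record (a list of differing positions plus a percent identity) can be sketched as follows. The function names are hypothetical, the sequences are assumed to be already aligned and of equal length, and the example strings in the usage note are invented, not the actual α-syn sequences.

```python
def mismatch_positions(a: str, b: str) -> list:
    """Return the 1-based positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions that are identical."""
    if not a:
        raise ValueError("sequences must not be empty")
    differences = len(mismatch_positions(a, b))
    return 100.0 * (len(a) - differences) / len(a)
```

    For instance, mismatch_positions("ATGGAT", "ATGCAA") returns [4, 6]. As a consistency check on the reported figure, four differences across 140 aligned residues give 136/140, which rounds to 97.1%; the length of 140 is an assumption here, since the abstract does not state the protein length.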

  4. Complete nucleotide sequence of primitive vertebrate immunoglobulin light chain genes.

    PubMed Central

    Shamblott, M J; Litman, G W

    1989-01-01

    Antibody to Heterodontus francisci (horned shark) immunoglobulin light chain was used to screen a spleen cDNA expression library, and recombinant clones encoding light chain genes were isolated. The complete sequences of the mature coding regions of two light chain genes in this phylogenetically distant vertebrate have been determined and are reported here. Comparisons of the sequences are consistent with the presence of mammalian-like framework and complementarity-determining regions. The predicted amino acid sequences of the genes are more related to mammalian lambda than to kappa light chains. The nucleotide sequences of the genes are most related to mammalian T-cell antigen receptor beta chain. Heterodontus light chain genes may reflect characteristics of the common ancestor of immunoglobulin and T-cell antigen receptors before its evolutionary diversification. PMID:2499889

  5. Complete nucleotide sequence of primitive vertebrate immunoglobulin light chain genes.

    PubMed

    Shamblott, M J; Litman, G W

    1989-06-01

    Antibody to Heterodontus francisci (horned shark) immunoglobulin light chain was used to screen a spleen cDNA expression library, and recombinant clones encoding light chain genes were isolated. The complete sequences of the mature coding regions of two light chain genes in this phylogenetically distant vertebrate have been determined and are reported here. Comparisons of the sequences are consistent with the presence of mammalian-like framework and complementarity-determining regions. The predicted amino acid sequences of the genes are more related to mammalian lambda than to kappa light chains. The nucleotide sequences of the genes are most related to mammalian T-cell antigen receptor beta chain. Heterodontus light chain genes may reflect characteristics of the common ancestor of immunoglobulin and T-cell antigen receptors before its evolutionary diversification.

  6. The International Nucleotide Sequence Database Collaboration.

    PubMed

    Nakamura, Yasukazu; Cochrane, Guy; Karsch-Mizrachi, Ilene

    2013-01-01

    The International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org), one of the longest-standing global alliances of biological data archives, captures, preserves and provides comprehensive public domain nucleotide sequence information. Three partners of the INSDC work in cooperation to establish formats for data and metadata and protocols that facilitate reliable data submission to their databases and support continual data exchange around the world. This article presents the current status of the INSDC and its updates for 2012. Among the items discussed at the 2012 international collaboration meeting, the BioSample database and changes in submission are described.

  7. The International Nucleotide Sequence Database Collaboration

    PubMed Central

    Cochrane, Guy; Karsch-Mizrachi, Ilene; Takagi, Toshihisa; International Nucleotide Sequence Database Collaboration

    2016-01-01

    The International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org) comprises three global partners committed to capturing, preserving and providing comprehensive public-domain nucleotide sequence information. The INSDC establishes standards, formats and protocols for data and metadata to make it easier for individuals and organisations to submit their nucleotide data reliably to public archives. This work enables the continuous, global exchange of information about living things. Here we present an update of the INSDC in 2015, including data growth and diversification, new standards and requirements by publishers for authors to submit their data to the public archives. The INSDC serves as a model for data sharing in the life sciences. PMID:26657633

  8. Molecular cloning and sequencing of the banded dogfish (Triakis scyllia) interleukin-8 cDNA.

    PubMed

    Inoue, Yuuki; Haruta, Chiaki; Usui, Kazushige; Moritomo, Tadaaki; Nakanishi, Teruyuki

    2003-03-01

    The dogfish (Triakis scyllia) interleukin-8 (IL-8) cDNA was isolated from mitogen-stimulated peripheral white blood cells (WBCs) utilising the polymerase chain reaction (PCR). The cDNA sequence showed that the dogfish IL-8 clones contained an open reading frame encoding 101 amino acids. A short 5' untranslated region (UTR) of 70 nucleotides and a long 3' UTR of 893 nucleotides were also present in this 1.2-kb cDNA. Furthermore, the 3' UTR of the mRNA contained the AUUUA sequence that has been implicated in shortening of the half-life of several cytokines and growth factors. The predicted IL-8 peptide had one potential N-linked glycosylation site (Asn-72-Thr-74) that is not conserved in other vertebrates. It also contained four cysteine residues (Cys-34, 36, 61 and 77), which are characteristic of CXC subfamily cytokines and found in all vertebrates, to date. The dogfish IL-8 lacked an ELR motif as found in the lamprey and trout. Comparison of the deduced amino acids showed that the dogfish IL-8 sequence shared 50.5, 41.2, 37.1 and 40.4-45.5% identity with the chicken, lamprey, trout and mammalian IL-8 sequences, respectively.
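
    Locating the AUUUA instability motif in a 3' UTR, as mentioned in this record, amounts to an overlapping substring search. The sketch below is illustrative only; the function name and the choice to accept a DNA-alphabet input by converting T to U are assumptions.

```python
def find_motif(seq: str, motif: str = "AUUUA") -> list:
    """Return 1-based start positions of every (possibly overlapping) motif hit."""
    rna = seq.upper().replace("T", "U")  # accept cDNA input written with T
    positions = []
    start = rna.find(motif)
    while start != -1:
        positions.append(start + 1)
        # Advance by one base so overlapping hits such as AUUUAUUUA are found.
        start = rna.find(motif, start + 1)
    return positions
```

    Advancing by a single base after each hit matters because tandem copies of the motif share their terminal A, so a non-overlapping search would report only every other copy.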

  9. cDNA sequencing and expression analysis of Dicentrarchus labrax heme oxygenase-1.

    PubMed

    Prevot-D'Alvise, N; Pierre, S; Gaillard, S; Gouze, E; Gouze, J-N; Aubert, J; Richard, S; Grillasca, J-P

    2008-11-17

    The liver cDNA encoding heme oxygenase-1 (HO-1) was sequenced from European sea bass (Dicentrarchus labrax) (accession no. EF139130). The HO-1 cDNA was 1250 bp in length and the open reading frame encoded 277 amino acid residues. The deduced amino acid sequence of the European sea bass had 75% and 50% identity with the amino acid sequences of Tetraodontiformes (Tetraodon nigroviridis and Takifugu rubripes) and human HO-1 proteins, respectively. A short hydrophobic transmembrane domain at the C-terminal region was found, and four histidine residues were highly conserved, including human His25, which is essential for HO catalytic activity. RT-PCR of mRNA from eight different European sea bass tissues revealed that, in a homeostatic state, heme oxygenase-1 was abundant in the spleen and liver but not in the brain.

  10. Giant panda ribosomal protein S14: cDNA, genomic sequence cloning, sequence analysis, and overexpression.

    PubMed

    Wu, G-F; Hou, Y-L; Hou, W-R; Song, Y; Zhang, T

    2010-10-13

    RPS14 is a component of the 40S ribosomal subunit encoded by the RPS14 gene and is required for its maturation. The cDNA and the genomic sequence of RPS14 were cloned successfully from the giant panda (Ailuropoda melanoleuca) using RT-PCR technology and touchdown-PCR, respectively; they were both sequenced and analyzed. The length of the cloned cDNA fragment was 492 bp; it contained an open-reading frame of 456 bp, encoding 151 amino acids. The length of the genomic sequence is 3421 bp; it contains four exons and three introns. Alignment analysis indicates that the nucleotide sequence shares a high degree of homology with those of Homo sapiens, Bos taurus, Mus musculus, Rattus norvegicus, Gallus gallus, Xenopus laevis, and Danio rerio (93.64, 83.37, 92.54, 91.89, 87.28, 84.21, and 84.87%, respectively). Comparison of the deduced amino acid sequences of the giant panda with those of these other species revealed that the RPS14 of giant panda is highly homologous with those of B. taurus, R. norvegicus and D. rerio (85.99, 99.34 and 99.34%, respectively), and is 100% identical with the others. This degree of conservation of RPS14 suggests evolutionary selection. Topology prediction shows that there are two N-glycosylation sites, three protein kinase C phosphorylation sites, two casein kinase II phosphorylation sites, four N-myristoylation sites, two amidation sites, and one ribosomal protein S11 signature in the RPS14 protein of the giant panda. The RPS14 gene can be readily expressed in Escherichia coli. When it was fused with the N-terminally His-tagged protein, it gave rise to accumulation of an expected 22-kDa polypeptide, in good agreement with the predicted molecular weight. The expression product obtained can be purified for studies of its function.

  11. Nucleotide sequence and structure of the human apolipoprotein E gene.

    PubMed Central

    Paik, Y K; Chang, D J; Reardon, C A; Davies, G E; Mahley, R W; Taylor, J M

    1985-01-01

    The gene for human apolipoprotein E (apo-E) was selected from a library of cloned genomic DNA by screening with a specific cDNA hybridization probe, and its structure was characterized. The complete nucleotide sequence of the gene as well as 856 nucleotides of the 5' flanking region and 629 nucleotides of the 3' flanking region were determined. Analysis of the sequence showed that the mRNA-encoding region of the apo-E gene consists of four exons separated by three introns. In comparison to the structure of the mRNA, the introns are located in the 5' noncoding region, in the codon for glycine at position -4 of the signal peptide region, and in the codon for arginine at position +61 of the mature protein. The overall lengths of the apo-E gene and its corresponding mRNA are 3597 and 1163 nucleotides, respectively; a mature plasma protein of 299 amino acids is produced by this gene. Examination of the 5' terminus of the gene by S1 nuclease mapping shows apparent multiple transcription initiation sites. The proximal 5' flanking region contains a "TATA box" element as well as two nearby inverted repeat elements. In addition, there are four Alu family sequences associated with the apo-E gene: an Alu sequence located near each end of the gene and two Alu sequences located in the second intron. This knowledge of the structure permits a molecular approach to characterizing the regulation of the apo-E gene. Images PMID:2987927

  12. ERCC2: cDNA cloning and molecular characterization of a human nucleotide excision repair gene with high homology to yeast RAD3.

    PubMed Central

    Weber, C A; Salazar, E P; Stewart, S A; Thompson, L H

    1990-01-01

    Human ERCC2 genomic clones give efficient, stable correction of the nucleotide excision repair defect in UV5 Chinese hamster ovary cells. One clone having a breakpoint just 5' of classical promoter elements corrects only transiently, implicating further flanking sequences in stable gene expression. The nucleotide sequences of a cDNA clone and genomic flanking regions were determined. The ERCC2 translated amino acid sequence has 52% identity (73% homology) with the yeast nucleotide excision repair protein RAD3. RAD3 is essential for cell viability and encodes a protein that is a single-stranded DNA dependent ATPase and an ATP dependent helicase. The similarity of ERCC2 and RAD3 suggests a role for ERCC2 in both cell viability and DNA repair and provides the first insight into the biochemical function of a mammalian nucleotide excision repair gene. Images Fig. 5. PMID:2184031

  13. cDNA cloning and sequence analysis of human pancreatic procarboxypeptidase A1.

    PubMed Central

    Catasús, L; Villegas, V; Pascual, R; Avilés, F X; Wicker-Planquart, C; Puigserver, A

    1992-01-01

    Using polyclonal antibodies raised against human pancreatic procarboxypeptidases, a full-length cDNA coding for an A-type proenzyme was isolated from a lambda gt11 human pancreatic library. This cDNA contains standard 3' and 5' flanking regions, a poly(A)+ tail and a central region of 1260 nucleotides coding for a protein of 419 amino acids. On the basis of sequence comparisons, the human protein was classified as a procarboxypeptidase A1 which is very similar to the previously described A1 forms from rat and bovine pancreatic glands. The presence of the amino acid sequences assumed to be of importance for the zymogen inhibition by its activation segment, primarily on the basis of the recently reported crystal structure of the B form, further supports the proposed classification. PMID:1417781

  14. Genomic and cDNA actin sequences from a virulent strain of Entamoeba histolytica.

    PubMed Central

    Edman, U; Meza, I; Agabian, N

    1987-01-01

    Invasiveness of Entamoeba histolytica strains that cause acute amoebiasis is characterized by aggressive behavior associated with cell motility and actin function. Analysis of actin genes from E. histolytica was initiated by devising methods for the isolation of biologically active nucleic acids, which allowed the preparation of cDNA and genomic DNA libraries. E. histolytica actin-encoding cDNAs and genomic clones have been isolated from libraries prepared from the virulent HM1:IMSS strain using a heterologous actin probe. Nucleotide sequence analysis of three independent cDNA clones and one genomic clone reveals a highly unusual codon bias and the absence of intervening sequences in E. histolytica actin. The coding sequence of the genomic clone is identical to that of two of the three cDNA clones. These represent at least two distinct mRNAs differing only by five silent changes in the protein coding sequence. Multiple genomic copies of the actin gene can be detected by Southern hybridization. E. histolytica actin exhibits a higher degree of homology to cytoplasmic than to muscle actin. Although the protein has been shown not to bind DNase I, the inferred amino acid sequence indicates conservation of all residues implied to participate in this binding. Images PMID:2883657

  15. Cloning and sequence analysis of cDNA for the canine neurotensin/neuromedin N precursor

    SciTech Connect

    Dobner, P.R.; Barber, D.L.; Villa-Komaroff, L.; McKiernan, C.

    1987-05-01

    Cloned cDNAs encoding neurotensin were isolated from a cDNA library derived from primary cultures of canine enteric mucosa cells. Nucleotide sequence analysis, using 32P-labeled nucleotides, has revealed the primary structure of a 170-amino acid precursor protein that encodes both neurotensin and the neurotensin-like peptide neuromedin N. The peptide-coding domains are located in tandem near the carboxyl terminus of the precursor and are bounded and separated by the paired, basic amino acid residues Lys-Arg. An additional coding domain, resembling neuromedin N, occurs immediately after an Arg-Arg basic amino acid pair located in the central region of the precursor. Additional amino acid homologies suggest that tandem duplications have contributed to the structure of the gene. RNA blot analysis, using the cloned cDNA probe, has revealed several mRNA species ranging in size from 500 to 980 nucleotides in the canine enteric mucosa. In contrast, a single RNA species of 1500 nucleotides was detected in bovine hypothalamus poly(A)+ RNA. The ability of the canine probe to cross-hybridize with bovine mRNA suggests that this probe can be used to isolate neurotensin/neuromedin N genes from other mammalian species.

  16. Complete nucleotide sequence of a potyvirus causing maize dwarf mosaic disease in central China.

    PubMed

    Liu, X; Wang, X; Zhao, Y; Zheng, C; Zhou, G

    2003-01-01

    The full-length nucleotide sequence of a potyvirus causing the maize dwarf mosaic (MDM) disease in Henan province, central China, was obtained by reverse transcription-polymerase chain reaction (RT-PCR) and rapid amplification of the cDNA 5'-end (5'-RACE). The viral genome comprised 9596 nucleotides, excluding the poly(A) tail, and encoded a putative polyprotein of 3063 amino acids. The entire genomic sequence of this isolate shared identities of 94.2% and 98.3% with Sugarcane mosaic virus (SCMV) HZ isolate at the nucleotide and deduced amino acid levels, respectively, but only a 69.1% identity with MDM virus (MDMV) Bulgarian isolate (MDMV-Bg) at the nucleotide level. Phylogenetic tree analysis of the complete nucleotide sequences indicated that the Henan isolate of a potyvirus causing MDM disease is in fact a Henan strain of SCMV (SCMV-HN).

  17. Nucleotide sequence of mouse satellite DNA.

    PubMed Central

    Hörz, W; Altenburger, W

    1981-01-01

    The nucleotide sequence of uncloned mouse satellite DNA has been determined by analyzing Sau96I restriction fragments that correspond to the repeat unit of the satellite DNA. An unambiguous sequence of 234 bp has been obtained. The sequence of the first 250 bases from dimeric satellite fragments present in Sau96I limit digests corresponds almost exactly to two tandemly arranged monomer sequences including a complete Sau96I site in the center. This is in agreement with the hypothesis that a low level of divergence which cannot be detected in sequence analyses of uncloned DNA is responsible for the appearance of dimeric fragments. Most of the sequence of the 5% fraction of Sau96 monomers that are susceptible to TaqI has also been determined and has been found to agree completely with the prototype sequence. The monomer sequence is internally repetitious being composed of eight diverged subrepeats. The divergence pattern has interesting implications for theories on the evolution of mouse satellite DNA. PMID:6261227

  18. Estimation of evolutionary distances between nucleotide sequences.

    PubMed

    Zharkikh, A

    1994-09-01

    A formal mathematical analysis of the substitution process in nucleotide sequence evolution was done in terms of the Markov process. By using matrix algebra theory, the theoretical foundation of Barry and Hartigan's (Stat. Sci. 2:191-210, 1987) and Lanave et al.'s (J. Mol. Evol. 20:86-93, 1984) methods was provided. Extensive computer simulation was used to compare the accuracy and effectiveness of various methods for estimating the evolutionary distance between two nucleotide sequences. It was shown that the multiparameter methods of Lanave et al.'s (J. Mol. Evol. 20:86-93, 1984), Gojobori et al.'s (J. Mol. Evol. 18:414-422, 1982), and Barry and Hartigan's (Stat. Sci. 2:191-210, 1987) are preferable to others for the purpose of phylogenetic analysis when the sequences are long. However, when sequences are short and the evolutionary distance is large, Tajima and Nei's (Mol. Biol. Evol. 1:269-285, 1984) method is superior to others.
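
    As an illustration of the kind of estimator compared in this record, the following is a minimal sketch of the Tajima and Nei (1984) distance for two aligned, gap-free sequences, using the formula as it is commonly stated: d = -b ln(1 - p/b), with b = (1/2)[1 - sum(g_i^2) + p^2/c] and c = sum over i<j of x_ij^2 / (2 g_i g_j). The function names and the example sequences are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations

BASES = "ACGT"

def p_distance(a: str, b: str) -> float:
    """Proportion of aligned sites that differ (no gaps assumed)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned, equal-length and non-empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def tajima_nei_distance(a: str, b: str) -> float:
    """Tajima-Nei corrected distance, as commonly stated (illustrative sketch)."""
    n = len(a)
    p = p_distance(a, b)
    if p == 0:
        return 0.0
    # g[k]: frequency of base k averaged over both sequences
    g = {k: (a.count(k) + b.count(k)) / (2 * n) for k in BASES}
    # c: sum over unordered base pairs of x_ij^2 / (2 g_i g_j), where
    # x_ij is the relative frequency of sites showing the pair {i, j}
    c = 0.0
    for i, j in combinations(BASES, 2):
        x_ij = sum((x, y) in ((i, j), (j, i)) for x, y in zip(a, b)) / n
        if x_ij:
            c += x_ij ** 2 / (2 * g[i] * g[j])
    b_param = 0.5 * (1 - sum(v * v for v in g.values()) + p * p / c)
    # math.log raises ValueError if p >= b_param (sequences too divergent)
    return -b_param * math.log(1 - p / b_param)

# Hypothetical 20-site alignment with two mismatches
seq1 = "ACGTACGTACGTACGTACGT"
seq2 = "ACGTACGAACGTACGTACGC"
print(p_distance(seq1, seq2), tajima_nei_distance(seq1, seq2))
```

    The corrected distance exceeds the raw proportion of differences because it allows for multiple substitutions at the same site; for highly divergent pairs the logarithm is undefined, which the sketch surfaces as an error rather than hiding.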

  19. Cloning and sequencing of a cDNA for firefly luciferase from Photuris pennsylvanica.

    PubMed

    Ye, L; Buck, L M; Schaeffer, H J; Leach, F R

    1997-04-25

    The first cDNA from the Photurinae subfamily of the Lampyridae encoding a firefly luciferase from lantern mRNA of Photuris pennsylvanica has been cloned, sequenced, the amino-acid sequence predicted and the sequence reported to GenBank. The cDNA was about 1.8 kb in length with the largest open reading frame coding for a 545-residue protein. The 5' noncoding region is 61 bp long and the 3' noncoding region is 135 bp in length. There is a 24-nucleotide poly(A) tail. When the amino-acid residues are aligned, P. pennsylvanica contains 154 (about 28% of the total residues) that are conserved in all 16 of the deduced luciferase sequences that are presently available. In this P. pennsylvanica luciferase, the amino acids at 276 of the positions are the same at corresponding positions of at least one of the other enzymes. There are two amino-acid differences between this luciferase and the unpublished sequence obtained by Dr. Keith Wood for a putative larval Photuris firefly luciferase cloned from a Maryland firefly. Signature amino-acid sequences and domains found in the deduced sequence are for adenylate kinase, the putative AMP-binding domain, luciferin 4-monooxygenase, 4-coumarate CoA ligase, long-chain fatty acid CoA ligase, 2-acylglycerophosphoethanolamine acyltransferase, the microbody-directing sequence, peptide-synthesizing complexes, and acyladenylate-synthesizing enzymes.

  20. cDNA sequences of two apolipoproteins from lamprey

    SciTech Connect

    Pontes, M.; Xu, X.; Graham, D.; Riley, M.; Doolittle, R.F.

    1987-03-24

    The messages for two small but abundant apolipoproteins found in lamprey blood plasma were cloned with the aid of oligonucleotide probes based on amino-terminal sequences. In both cases, numerous clones were identified in a lamprey liver cDNA library, consistent with the great abundance of these proteins in lamprey blood. One of the cDNAs (LAL1) has a coding region of 105 amino acids that corresponds to a 21-residue signal peptide, a putative 8-residue propeptide, and the 76-residue mature protein found in blood. The other cDNA (LAL2) codes for a total of 191 residues, the first 23 of which constitute a signal peptide. The two proteins, which occur in the high-density lipoprotein fraction of ultracentrifuged plasma, have amino acid compositions similar to those of apolipoproteins found in mammalian blood; computer analysis indicates that the sequences are largely helix-permissive. When the sequences were searched against an amino acid sequence data base, rat apolipoprotein IV was the best matching candidate in both cases. Although a reasonable alignment can be made with that sequence and LAL1, definitive assignment of the two lamprey proteins to typical mammalian classes cannot be made at this point.

  1. Molecular cloning and sequencing of the human erythrocyte 2,3-bisphosphoglycerate mutase cDNA: revised amino acid sequence.

    PubMed Central

    Joulin, V; Peduzzi, J; Roméo, P H; Rosa, R; Valentin, C; Dubart, A; Lapeyre, B; Blouquit, Y; Garel, M C; Goossens, M

    1986-01-01

    The human erythrocyte 2,3-bisphosphoglycerate mutase (BPGM) is a multifunctional enzyme which controls the metabolism of 2,3-diphosphoglycerate, the main allosteric effector of haemoglobin. Several cDNA banks were constructed from reticulocyte mRNA, either by conventional cloning methods in pBR322 and screening with specific mixed oligonucleotide probes, or in the expression vector lambda gt 11. The largest cDNA isolated contained 1673 bases [plus the poly(A) tail], which is slightly smaller than the size of the intact mRNA as estimated by Northern blot analysis (approximately 1800 bases). This cDNA encodes for a protein of 258 residues; the protein yielded 34 tryptic peptides which were subsequently isolated by h.p.l.c. Our nucleotide sequence data were entirely confirmed by the amino acid composition of these tryptic peptides and reveal several major differences from the published sequence; the revised amino acid sequence of human BPGM is presented. These findings represent the first step in the study of the expression and regulation of this enzyme as a specific marker of the erythroid cell line. Images Fig. 5. PMID:3023066

  2. Nucleotide sequence and genetic organization of Hungarian grapevine chrome mosaic nepovirus RNA2.

    PubMed Central

    Brault, V; Hibrand, L; Candresse, T; Le Gall, O; Dunez, J

    1989-01-01

    The complete nucleotide sequence of Hungarian grapevine chrome mosaic nepovirus (GCMV) RNA2 has been determined. The RNA sequence is 4441 nucleotides in length, excluding the poly(A) tail. A polyprotein of 1324 amino acids with a calculated molecular weight of 146 kDa is encoded in a single long open reading frame extending from nucleotides 218 to 4190. This polyprotein is homologous with the protein encoded by the S strain of tomato black ring virus (TBRV) RNA2, the only other nepovirus sequenced so far. Direct sequencing of the viral coat protein and in vitro translation of transcripts derived from cDNA sequences demonstrate that, as for comoviruses, the coat protein is located at the carboxy terminus of the polyprotein. A model for the expression of GCMV RNA2 is presented. Images PMID:2798129

  3. Nucleotide sequence of complementary DNA encoding for quaking protein of cow, horse and pig.

    PubMed

    Murata, Tomoaki; Yamashiro, Yasuhiro; Kondo, Tatsuya; Nakaichi, Munekazu; Une, Satoshi; Taura, Yasuho

    2005-08-01

    Complementary DNA (cDNA) for bovine quaking gene (Bqk), equine quaking gene (Eqk) and porcine quaking gene (Pqk), which are homologous to mouse quaking gene (qkI), were isolated, and their nucleotide sequences were determined. cDNA sequences of Bqk, Eqk and Pqk showed very high homology to that of qkI at nucleotide level; 94.2, 95.7 and 95.6%, respectively. Deduced amino acid sequences for Bqk, Eqk and Pqk perfectly matched to that of qkI. These findings suggest that the quaking gene family is highly conserved during mammalian evolution, and that Bqk, Eqk and Pqk are likely to have important biological functions also in cow, horse and pig.

  4. cDNA encoding a polypeptide including a hevein sequence

    DOEpatents

    Raikhel, Natasha V.; Broekaert, Willem F.; Chua, Nam-Hai; Kush, Anil

    2000-07-04

    A cDNA clone (HEV1) encoding hevein was isolated via polymerase chain reaction (PCR) using mixed oligonucleotides corresponding to two regions of hevein as primers and a Hevea brasiliensis latex cDNA library as a template. HEV1 is 1018 nucleotides long and includes an open reading frame of 204 amino acids. The deduced amino acid sequence contains a putative signal sequence of 17 amino acid residues followed by a 187 amino acid polypeptide. The amino-terminal region (43 amino acids) is identical to hevein and shows homology to several chitin-binding proteins and to the amino-termini of wound-induced genes in potato and poplar. The carboxyl-terminal portion of the polypeptide (144 amino acids) is 74-79% homologous to the carboxyl-terminal region of wound-inducible genes of potato. Wounding, as well as application of the plant hormones abscisic acid and ethylene, resulted in accumulation of hevein transcripts in leaves, stems and latex, but not in roots, as shown by using the cDNA as a probe. A fusion protein was produced in E. coli from the protein of the present invention and maltose binding protein produced by the E. coli.

  5. Characterization of cDNA clones encoding rabbit and human serum paraoxonase: The mature protein retains its signal sequence

    SciTech Connect

    Hassett, C.; Richter, R.J.; Humbert, R.; Omiecinski, C.J.; Furlong, C.E.; Chapline, C.; Crabb, J.W.

    1991-10-22

    Serum paraoxonase hydrolyzes the toxic metabolites of a variety of organophosphorus insecticides. High serum paraoxonase levels appear to protect against the neurotoxic effects of organophosphorus substrates of this enzyme. The amino acid sequence accounting for 42% of rabbit paraoxonase was determined. From these data, two oligonucleotide probes were synthesized and used to screen a rabbit liver cDNA library. Human paraoxonase clones were isolated from a liver cDNA library by using the rabbit cDNA as a hybridization probe. Inserts from three of the longest clones were sequenced, and one full-length clone contained an open reading frame encoding 355 amino acids, four fewer than the rabbit paraoxonase protein. Amino-terminal sequences derived from purified rabbit and human paraoxonase proteins suggested that the signal sequence is retained, with the exception of the initiator methionine residue. Characterization of the rabbit and human paraoxonase cDNA clones confirms that the signal sequences are not processed, except for the N-terminal methionine residue. The rabbit and human cDNA clones demonstrate striking nucleotide and deduced amino acid similarities (greater than 85%), suggesting an important metabolic role and constraints on the evolution of this protein.

  6. Nucleotide sequence alignment using sparse coding and belief propagation.

    PubMed

    Roozgard, Aminmohammad; Barzigar, Nafise; Wang, Shuang; Jiang, Xiaoqian; Ohno-Machado, Lucila; Cheng, Samuel

    2013-01-01

    Advances in DNA information extraction techniques have led to huge sequenced genomes from organisms spanning the tree of life. This increasing amount of genomic information requires tools for comparison of the nucleotide sequences. In this paper, we propose a novel nucleotide sequence alignment method based on sparse coding and belief propagation to compare the similarity of the nucleotide sequences. We used the neighbors of each nucleotide as features, and then we employed sparse coding to find a set of candidate nucleotides. To select optimum matches, belief propagation was subsequently applied to these candidate nucleotides. Experimental results show that the proposed approach is able to robustly align nucleotide sequences and is competitive with SOAPaligner [1] and BWA [2].

  7. Amino acid sequence of band-3 protein from rainbow trout erythrocytes derived from cDNA.

    PubMed Central

    Hübner, S; Michel, F; Rudloff, V; Appelhans, H

    1992-01-01

    In this report we present the first complete band-3 cDNA sequence of a poikilothermic lower vertebrate. The primary structure of the anion-exchange protein band 3 (AE1) from rainbow trout erythrocytes was determined by nucleotide sequencing of cDNA clones. The overlapping clones have a total length of 3827 bp with a 5'-terminal untranslated region of 150 bp, a 2754 bp open reading frame and a 3'-untranslated region of 924 bp. Band-3 protein from trout erythrocytes consists of 918 amino acid residues with a calculated molecular mass of 101 827 Da. Comparison of its amino acid sequence revealed a 60-65% identity within the transmembrane spanning sequence of band-3 proteins published so far. An additional insertion of 24 amino acid residues within the membrane-associated domain of trout band-3 protein was identified, which until now was thought to be a general feature only of mammalian band-3-related proteins. PMID:1637296

  8. cDNA, genomic sequence and overexpression of crystallin alpha-B Gene (CRYAB) of the Giant Panda

    PubMed Central

    Hou, Yi-ling; Hou, Wan-ru; Ren, Zheng-long; Hao, Yan-zhe; Zhang, Tian

    2008-01-01

    αB-crystallin, a small heat-shock protein, has been shown to prevent the aggregation of other proteins under various stress conditions. Here we have cloned the cDNA and the genomic sequence of the CRYAB gene from the Giant Panda (Ailuropoda melanoleuca) using RT-PCR and touchdown PCR, respectively. The cloned cDNA fragment contains an open reading frame of 528 bp encoding 175 amino acids, and the genomic sequence is 3189 bp long, containing three exons and two introns. Alignment analysis indicated that the nucleotide sequence and the deduced amino acid sequence are highly conserved relative to those of the four other species studied: Homo sapiens, Mus musculus, Rattus norvegicus and Bos taurus. The homologies of the Giant Panda CRYAB nucleotide sequence to those of these species are 93.9%, 91.5%, 91.5% and 95.3%, respectively, and the homologies of the amino acid sequences are 98.3%, 97.1%, 97.7% and 99.4%, respectively. Topology prediction shows that there are only four casein kinase II phosphorylation sites in the CRYAB protein of the Giant Panda. The CRYAB cDNA was introduced into E. coli, and CRYAB fused with the N-terminal His tag gave rise to the accumulation of an expected 24-kDa polypeptide, in accordance with the predicted protein. The expression product obtained could be purified for further study of its function. PMID:19043608

  9. Coding and 3' non-coding nucleotide sequence of chalcone synthase mRNA and assignment of amino acid sequence of the enzyme

    PubMed Central

    Reimold, Ursula; Kröger, Manfred; Kreuzaler, Fritz; Hahlbrock, Klaus

    1983-01-01

    The nucleotide sequence of an almost complete cDNA copy of chalcone synthase mRNA from cultured parsley cells (Petroselinum hortense) has been determined. The cDNA copy comprised the complete coding sequence for chalcone synthase, a short A-rich stretch of the 5' non-coding region and the complete 3' non-coding region including a poly(A) tail. The amino acid sequence deduced from the nucleotide sequence of the cDNA is consistent with a partial N-terminal sequence analysis, the total amino acid composition, the cyanogen bromide cleavage pattern, and the apparent mol. wt. of the subunit of the purified enzyme. PMID:16453477

  10. Insights into corn genes derived from large-scale cDNA sequencing.

    PubMed

    Alexandrov, Nickolai N; Brover, Vyacheslav V; Freidin, Stanislav; Troukhan, Maxim E; Tatarinova, Tatiana V; Zhang, Hongyu; Swaller, Timothy J; Lu, Yu-Ping; Bouck, John; Flavell, Richard B; Feldmann, Kenneth A

    2009-01-01

    We present a large portion of the transcriptome of Zea mays, including ESTs representing 484,032 cDNA clones from 53 libraries and 36,565 fully sequenced cDNA clones, out of which 31,552 clones are non-redundant. These and other previously sequenced transcripts have been aligned with available genome sequences and have provided new insights into the characteristics of gene structures and promoters within this major crop species. We found that although the average number of introns per gene is about the same in corn and Arabidopsis, corn genes have more alternatively spliced isoforms. Examination of the nucleotide composition of coding regions reveals that corn genes, as well as genes of other Poaceae (Grass family), can be divided into two classes according to the GC content at the third position in the amino acid encoding codons. Many of the transcripts that have lower GC content at the third position have dicot homologs but the high GC content transcripts tend to be more specific to the grasses. The high GC content class is also enriched with intronless genes. Together this suggests that an identifiable class of genes in plants is associated with the Poaceae divergence. Furthermore, because many of these genes appear to be derived from ancestral genes that do not contain introns, this evolutionary divergence may be the result of horizontal gene transfer from species not only with different codon usage but possibly that did not have introns, perhaps outside of the plant kingdom. By comparing the cDNAs described herein with the non-redundant set of corn mRNAs in GenBank, we estimate that there are about 50,000 different protein coding genes in Zea. All of the sequence data from this study have been submitted to DDBJ/GenBank/EMBL under accession numbers EU940701-EU977132 (FLI cDNA) and FK944382-FL482108 (EST).
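
    The third-position GC content (GC3) used in this record to divide genes into two classes can be computed directly from a coding sequence. The following is a minimal sketch, assuming the input is an in-frame coding sequence that starts at the first base of a codon; the function name and the toy sequence are hypothetical, and no class threshold is applied because the abstract does not state one.

```python
def gc3_fraction(cds: str) -> float:
    """Fraction of G or C among third codon positions of an in-frame CDS."""
    cds = cds.upper()
    # Slicing from index 2 in steps of 3 picks the third base of each
    # complete codon; a trailing partial codon contributes nothing.
    third_positions = cds[2::3]
    if not third_positions:
        raise ValueError("sequence must contain at least one complete codon")
    return sum(base in "GC" for base in third_positions) / len(third_positions)

# Toy example: codons ATG, GCC, AAA have third bases G, C, A
example_gc3 = gc3_fraction("ATGGCCAAA")
print(example_gc3)
```

    Applying such a function to every coding sequence in a transcript set and inspecting the distribution of values is one simple way to look for the kind of two-class split described above.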

  11. Nucleotide Sequence of the Akv env Gene

    PubMed Central

    Lenz, Jack; Crowther, Robert; Straceski, Anthony; Haseltine, William

    1982-01-01

    The sequence of 2,191 nucleotides encoding the env gene of murine retrovirus Akv was determined by using a molecular clone of the Akv provirus. Deduction of the encoded amino acid sequence showed that a single open reading frame encodes a 638-amino acid precursor to gp70 and p15E. In addition, there is a typical leader sequence preceding the amino terminus of gp70. The locations of potential glycosylation sites and other structural features indicate that the entire gp70 molecule and most of p15E are located on the outer side of the membrane. Internal cleavage of the env precursor to generate gp70 and p15E occurs immediately adjacent to several basic amino acids at the carboxyl terminus of gp70. This cleavage generates a region of 42 uncharged, relatively hydrophobic amino acids at the amino terminus of p15E, which is located in a position analogous to the hydrophobic membrane fusion sequence of influenza virus hemagglutinin. The mature polypeptides are predicted to associate with the membrane via a region of 30 uncharged, mostly hydrophobic amino acids located near the carboxyl terminus of p15E. Distal to this membrane association region is a sequence of 35 amino acids at the carboxyl terminus of the env precursor, which is predicted to be located on the inner side of the membrane. By analogy to Moloney murine leukemia virus, a proteolytic cleavage in this region removes the terminal 19 amino acids, thus generating the carboxyl terminus of p15E. This leaves 15 amino acids at the carboxyl terminus of p15E on the inner side of the membrane in a position to interact with virion cores during budding. The precise location and order of the large RNase T1-resistant oligonucleotides in the env region were determined and compared with those from several leukemogenic viruses of AKR origin. This permitted a determination of how the differences in the leukemogenic viruses affect the primary structure of the env gene products. PMID:6283170

  12. Complete nucleotide sequence of the polymerase 3 gene of human influenza virus A/WSN/33.

    PubMed Central

    Kaptein, J S; Nayak, D P

    1982-01-01

    The complete nucleotide sequence of polymerase 3 (P3) gene of a human influenza virus (A/WSN/33) has been determined using cDNA clones except for the last 11 nucleotides which were obtained by direct RNA sequencing. The WSN P3 gene contains 2,341 nucleotides and codes for a protein of 759 amino acids (molecular weight 85,800). The WSN P3 protein, as deduced from the plus-strand DNA sequence, is basic and enriched in positively charged amino acids. In addition, it contains clusters of basic amino acids which may provide sites for the interaction of P3 protein with the capped primer, template, and/or other polymerase proteins during the transcriptive and replicative processes of influenza viral RNA. PMID:7045393

  13. [cDNA cloning and sequence analysis of the seventh segment of maize rough dwarf virus genome].

    PubMed

    Deng, W; Yang, X; Zhang, Y; Liu, Y; Kang, L

    2000-10-01

    Double-stranded RNA of maize rough dwarf virus (MRDV) was prepared from symptomatic maize samples collected in Luancheng County, Hebei Province, China. Primers were designed according to the known sequence of MRDV, the cDNA sequence of the seventh segment (S7) of MRDV was obtained by RT-PCR, and the S7 sequence was analyzed by computer after sequencing. The results showed that the full length of the S7 cDNA is 1936 bp, equal to that of the S7 cDNA reported from abroad, and that the two open reading frames (ORF1 and ORF2) contained in the S7 segment are also unchanged. In comparison with the S7 segment from Italy, the nucleotide homology of S7 is 87.7% and the homology of the ORF1 amino acid sequence is 91.6%. However, the MRDV S7 segment and the rice black-streaked dwarf virus S8 segment showed 95.5% nucleotide identity and 93.5% ORF1 amino acid identity.

  14. Nucleotide sequence of the pyruvate decarboxylase gene from Zymomonas mobilis.

    PubMed

    Neale, A D; Scopes, R K; Wettenhall, R E; Hoogenraad, N J

    1987-02-25

    Pyruvate decarboxylase (EC 4.1.1.1), the penultimate enzyme in the alcoholic fermentation pathway of Zymomonas mobilis, converts pyruvate to acetaldehyde and carbon dioxide. The complete nucleotide sequence of the structural gene encoding pyruvate decarboxylase from Zymomonas mobilis has been determined. The coding region is 1704 nucleotides long and encodes a polypeptide of 567 amino acids with a calculated subunit mass of 60,790 daltons. The amino acid sequence was confirmed by comparison with the amino acid sequence of a selection of tryptic fragments of the enzyme. The amino acid composition obtained from the nucleotide sequence is in good agreement with that obtained experimentally.
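
    The coding-region length and the polypeptide length quoted in this record can be cross-checked with simple codon arithmetic. The following is a minimal sketch, assuming the quoted coding-region length includes the terminal stop codon (the abstract does not say so explicitly); the function name is hypothetical.

```python
def residues_from_coding_length(n_nucleotides: int, includes_stop: bool = True) -> int:
    """Number of encoded amino acids for a coding region of the given length."""
    if n_nucleotides <= 0 or n_nucleotides % 3 != 0:
        raise ValueError("coding length must be a positive multiple of 3")
    codons = n_nucleotides // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons

# 1704 nucleotides = 568 codons = 567 residues plus one stop codon
print(residues_from_coding_length(1704))
```

    Under that assumption the 1704-nucleotide coding region corresponds to the 567 amino acids reported, so the two figures are mutually consistent.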

  15. 77 FR 65537 - Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-10-29

    ... Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request. SUMMARY: The United States....'' SUPPLEMENTARY INFORMATION: I. Abstract Patent applications that contain nucleotide and/or amino acid sequence...

  16. Nucleotide sequence and expression of the 14-3-3 from the halotolerant alga Dunaliella salina.

    PubMed

    Wang, Tian-yun; Jing, Chang-Qin; Dong, Wei-Hua; Zhang, Jun-He; Zhang, Yu

    2010-02-01

    Previously we reported the nucleotide sequence of a 14-3-3 cDNA cloned from the unicellular green alga Dunaliella salina; however, the nucleotide sequence of the gene itself has not been reported so far. In the present study, the nucleotide sequence of the gene was cloned and characterized, and its copy number and expression were examined. The coding sequence of the gene was found to be interrupted by five introns of 132, 266, 153, 152 and 625 bp, respectively. Introns 3-5 were found in conserved positions as compared to the Chlamydomonas reinhardtii 14-3-3 gene. D. salina 14-3-3 cDNA was inserted into the prokaryotic expression plasmid pET-28 and transformed into E. coli BL21, and the recombinant 14-3-3 protein was purified from E. coli and used to immunize a rabbit. Indirect ELISA with plates coated with 14-3-3 showed that the rabbit antiserum titer was 1:1.00E+06. Western blotting assays confirmed that the prepared rabbit antibodies could recognize the recombinant 14-3-3 protein. Southern blotting results showed that there was only one copy of the 14-3-3 gene in the genome of D. salina, and 14-3-3 expression did not change throughout the Dunaliella cell cycle.

  17. Sequence of a cDNA encoding nitrite reductase from the tree Betula pendula and identification of conserved protein regions.

    PubMed

    Friemann, A; Brinkmann, K; Hachtel, W

    1992-02-01

    The sequence of an mRNA encoding nitrite reductase (NiR, EC 1.7.7.1.) from the tree Betula pendula was determined. A cDNA library constructed from leaf poly(A)+ mRNA was screened with an oligonucleotide probe deduced from NiR sequences from spinach and maize. A 2.5 kb cDNA was isolated that hybridized to an mRNA, the steady-state level of which increased markedly upon induction with nitrate. The nucleotide sequence of the cDNA contains a reading frame encoding a protein of 583 amino acids that reveals 79% identity with NiR from spinach. The transit peptide of the NiR precursor from birch was determined to be 22 amino acids in size by sequence comparison with NiR from spinach and maize and is the shortest transit peptide reported so far. A graphical evaluation of identities found in the NiR sequence alignment revealed nine well conserved sections each exceeding ten amino acids in size. Sequence comparisons with related redox proteins identified essential residues involved in cofactor binding. A putative binding site for ferredoxin was found in the N-terminal half of the protein.

  18. Cloning and sequencing of the medium-chain S-acyl fatty acid synthetase thioester hydrolase cDNA from rat mammary gland.

    PubMed Central

    Naggert, J; Williams, B; Cashman, D P; Smith, S

    1987-01-01

    cDNA clones coding for the medium-chain S-acyl fatty acid synthetase thioester hydrolase (thioesterase II) from rat mammary gland were identified in a bacteriophage lambda gt11 library and their nucleotide sequences were determined. The predicted coding region spans 263 amino acid residues and includes a sequence identical with that of a peptide derived from the enzyme active site. The rat thioesterase II cDNA sequence exhibits homology with that of a thioesterase found in duck uropygial glands. PMID:3632637

  19. Nucleotide Sequence Analysis of RNA Synthesized from Rabbit Globin Complementary DNA

    PubMed Central

    Poon, Raymond; Paddock, Gary V.; Heindell, Howard; Whitcome, Philip; Salser, Winston; Kacian, Dan; Bank, Arthur; Gambino, Roberto; Ramirez, Francesco

    1974-01-01

    Rabbit globin complementary DNA made with RNA-dependent DNA polymerase (reverse transcriptase) was used as template for in vitro synthesis of 32P-labeled RNA. The sequences of the nucleotides in most of the fragments resulting from combined ribonuclease T1 and alkaline phosphatase digestion have been determined. Several fragments were long enough to fit uniquely with the α or β globin amino-acid sequences. These data demonstrate that the cDNA was copied from globin mRNA and contained no detectable contaminants. Images PMID:4139714

  20. Identification, characterization, and sequence analysis of a cDNA encoding a phosphoprotein of human herpesvirus 6.

    PubMed Central

    Chang, C K; Balachandran, N

    1991-01-01

    Human herpesvirus 6 (HHV-6)-specific monoclonal antibody (MAb) 9A5D12 reacted with the nucleus of HHV-6 strain GS-infected cells and immunoprecipitated a phosphorylated polypeptide with an approximate size of 41 kDa, designated HHV-6 P41. A 110-kDa polypeptide was also immunoprecipitated by the MAb. These polypeptides were synthesized early in infection, and the synthesis was greatly reduced by phosphonoacetic acid. Polypeptides with identical sizes were recognized by the MAb from cells infected with an additional eight HHV-6 strains. A 2.1-kb cDNA insert was identified from an HHV-6(GS) cDNA library constructed in the lambda gt11 expression system by using MAb 9A5D12. This cDNA insert hybridized specifically with viral DNA from HHV-6 strains GS and Z-29 and with two predominant transcripts with approximate sizes of 2.5 and 1.2 kb from infected cells. The reactivity of the MAb with a fusion protein expressed in the prokaryotic vector suggested that the cDNA encodes a 62- to 66-kDa protein. Analysis of the nucleotide sequence of the cDNA insert revealed a 623-amino-acid-residue single open reading frame of 1,871 nucleotides, with an open 5' end. The predicted polypeptide is highly basic and contains a long stretch of highly hydrophobic residues localized to the carboxy terminus. The amino-terminal half of the predicted HHV-6 protein from the cDNA shows significant homology with the UL44 gene product of human cytomegalovirus, coding for the ICP36 family of early-late-class phosphoproteins. Two TATA boxes are located at nucleotide positions 668 and 722 of the cDNA. In vitro translation of RNA transcribed in vitro from the cDNA resulted in the synthesis of a 41-kDa polypeptide only. This polypeptide was readily immunoprecipitated by MAb 9A5D12, and its partial peptide map was identical to that of the 41-kDa polypeptide detected in infected cells. Together, these results indicate that the HHV-6 P41 is encoded within a gene coding for a larger protein. Images PMID

  1. Nucleotide sequence and the encoded amino acids of human apolipoprotein A-I mRNA.

    PubMed Central

    Law, S W; Brewer, H B

    1984-01-01

    The cDNA clones encoding the precursor form of human liver apolipoprotein A-I (apoA-I), preproapoA-I, have been isolated from a cDNA library. A 17-base synthetic oligonucleotide based on residues 108-113 of apoA-I and a 26-base primer-extended, dideoxynucleotide-terminated cDNA were used as hybridization probes to select for recombinant plasmids bearing the apoA-I sequence. The complete nucleic acid sequence of human liver preproapoA-I has been determined by analysis of the cloned cDNA. The sequence is composed of 801 nucleotides encoding 267 amino acid residues. PreproapoA-I contains an 18-amino-acid prepeptide and a 6-amino-acid propeptide connected to the amino terminus of the 243-amino acid mature apoA-I. Southern blotting analysis of chromosomal DNA obtained from peripheral blood indicated the apoA-I gene is contained in a 2.1-kilobase-pair Pst I fragment and there is no gross difference in structural organization between the normal apoA-I gene and the Tangier disease apoA-I gene. Images PMID:6198645

  2. Comparison of sequence of cDNA clone with other genomic and cDNA sequences for human C-reactive protein

    SciTech Connect

    Tenchini, M.L.; Bossi, E.; Marchetti, L.; Malcovati, M.; Lorenzetti, R.

    1992-04-01

    A clone for C-reactive protein (CRP) has been isolated from a human liver cDNA library; this clone harbors a plasmid, pC81, which has an insert of 1631 bp. When compared to genomic and cDNA sequences published to date, pC81 has revealed homologies and differences that might help to clarify the structure of this gene and the presence of allelic variants in man.

  3. The complete nucleotide sequence and genome organization of pea streak virus (genus Carlavirus).

    PubMed

    Su, Li; Li, Zhengnan; Bernardy, Mike; Wiersma, Paul A; Cheng, Zhihui; Xiang, Yu

    2015-10-01

    Pea streak virus (PeSV) is a member of the genus Carlavirus in the family Betaflexiviridae. Here, the first complete genome sequence of PeSV was determined by deep sequencing of a cDNA library constructed from dsRNA extracted from a PeSV-infected sample and Rapid Amplification of cDNA Ends (RACE) PCR. The PeSV genome consists of 8041 nucleotides excluding the poly(A) tail and contains six open reading frames (ORFs). The putative peptide encoded by the PeSV ORF6 has an estimated molecular mass of 6.6 kDa and shows no similarity to any known proteins. This differs from typical carlaviruses, whose ORF6 encodes a 12- to 18-kDa cysteine-rich nucleic-acid-binding protein.

  4. Nucleotide sequence of papaya mosaic virus RNA.

    PubMed

    Sit, T L; Abouhaidar, M G; Holy, S

    1989-09-01

    The RNA genome of papaya mosaic virus is 6656 nucleotides long [excluding the poly(A) tail] with six open reading frames (ORFs) more than 200 nucleotides long. The four nearest the 5' end each overlap with adjacent ORFs and could code for proteins with Mr 176307, 26248, 11949 and 7224 (ORFs 1 to 4). The fifth ORF produces the capsid protein of Mr 23043 and the sixth ORF, located completely within ORF1, could code for a protein with Mr 14113. The translation products of ORFs 1 to 3 show strong similarity with those of other potexviruses but the ORF 4 protein has only limited similarity with the other potexvirus ORF 4 proteins of 7K to 11K.

  5. Nucleotide and deduced amino acid sequences of Torpedo californica acetylcholine receptor gamma subunit.

    PubMed Central

    Claudio, T; Ballivet, M; Patrick, J; Heinemann, S

    1983-01-01

    The nucleotide sequence has been determined of a cDNA clone that codes for the 60,000-dalton gamma subunit of Torpedo californica acetylcholine receptor. The length of the cDNA clone is 2,010 base pairs. The 5' and 3' untranslated regions have respective lengths of 31 and 461 base pairs. Data suggest that the putative polyadenylylation consensus sequence A-A-T-A-A-A may not be required for polyadenylylation of the mRNA corresponding to the cDNA clone described in this study. From the DNA sequence data, the amino acid sequence of the gamma subunit was deduced. The subunit is composed of 489 amino acids giving a molecular mass of 56,600 daltons. The deduced amino acid sequence data also indicate the presence of a 17-amino acid extension or signal peptide on this subunit. From these data, structural predictions for the gamma subunit are made such as potential membrane-spanning regions, possible asparagine-linked glycosylation sites, and the assignment of regions of the protein to the extracellular, internal, and cytoplasmic domains of the lipid bilayer. Images PMID:6573658
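
    The lengths reported in this abstract fit together arithmetically. The sketch below is only a consistency check under two assumptions not stated in the abstract: that the 2,010-bp clone consists solely of the 5' untranslated region, the coding region and the 3' untranslated region, and that the stop codon is counted within the 3' untranslated region. All variable names are illustrative.

```python
CDNA_BP = 2010   # total clone length from the abstract
UTR5_BP = 31     # 5' untranslated region
UTR3_BP = 461    # 3' untranslated region (assumed to include the stop codon)
MATURE_AA = 489  # residues in the subunit
SIGNAL_AA = 17   # residues in the amino-terminal extension (signal peptide)
MASS_DA = 56600  # molecular mass quoted for the 489-residue subunit

# Nucleotides left for coding once both untranslated regions are removed.
coding_bp = CDNA_BP - UTR5_BP - UTR3_BP
# Residues in the precursor: signal peptide plus subunit.
precursor_aa = SIGNAL_AA + MATURE_AA
# Three nucleotides per residue.
expected_coding_bp = 3 * precursor_aa
# Average mass per residue of the subunit, in daltons.
mean_residue_mass = MASS_DA / MATURE_AA

print(coding_bp, expected_coding_bp, round(mean_residue_mass, 1))
```

    Under these assumptions the remaining 1,518 bp correspond exactly to the 506 codons of the signal peptide plus the subunit.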

  6. Complete nucleotide sequence and construction of an infectious clone of Chinese yam necrotic mosaic virus suggest that macluraviruses have the smallest genome among members of the family Potyviridae.

    PubMed

    Kondo, Toru; Fujita, Takashi

    2012-12-01

    The complete nucleotide sequence of Chinese yam necrotic mosaic virus (CYNMV) was determined from cloned virus cDNA. The CYNMV genomic RNA is 8224 nucleotides in length, excluding the poly(A) tail, and contains one long open reading frame encoding a large polyprotein of 2620 amino acids. CYNMV has no counterpart to the P1 cistron and a short HC-Pro cistron located at the 5' side of the potyvirus genome. A full-length cDNA clone, pCYNMV, was assembled under the control of the cauliflower mosaic virus 35S promoter and the nopaline synthase terminator. Biolistic inoculation of Nagaimo plants with cDNA resulted in systemic necrotic mosaic symptoms typical of CYNMV infection. To our knowledge, this is the first report of the complete nucleotide sequence and construction of an infectious cDNA clone of a member of the genus Macluravirus.

  7. Nucleotide and deduced amino acid sequences of rat myosin binding protein H (MyBP-H).

    PubMed

    Jung, J; Oh, J; Lee, K

    1998-12-01

    The complete nucleotide sequence of the cDNA clone encoding rat skeletal muscle myosin-binding protein H (MyBP-H) was determined, and the amino acid sequence was deduced from the nucleotide sequence (GenBank accession number AF077338). The full-length cDNA of 1782 base pairs (bp) contains a single open reading frame of 1454 bp encoding a rat MyBP-H protein with a predicted molecular mass of 52.7 kDa and includes the common consensus 'CA__TG' protein binding motif. The cDNA sequence of rat MyBP-H shows 92%, 84% and 41% homology with those of mouse, human and chicken, respectively. The protein contains a tandem array of internal motifs (-FN III-Ig C2-FN III-Ig C2-) in the C-terminal region, which resemble the immunoglobulin superfamily C2 and fibronectin type III motifs. The amino acid sequence of the C-terminal Ig C2 motif is highly conserved among the MyBP family and other thick filament binding proteins, suggesting that the C-terminal Ig C2 may play an important role in its function. All members of the MyBP-H family contain an 'RKPS' sequence, which is assumed to be a cAMP- and cGMP-dependent protein kinase A phosphorylation site. Computer analysis of the primary sequence of rat MyBP-H predicted 11 protein kinase C (PKC) phosphorylation sites, 7 casein kinase II (CK2) phosphorylation sites and 4 N-myristoylation sites.

  8. cDNA, genomic sequence cloning and overexpression of giant panda (Ailuropoda melanoleuca) mitochondrial ATP synthase ATP5G1.

    PubMed

    Hou, W-R; Hou, Y-L; Ding, X; Wang, T

    2012-09-03

    The ATP5G1 gene is one of the three genes that encode mitochondrial ATP synthase subunit c of the proton channel. We cloned the cDNA and determined the genomic sequence of the ATP5G1 gene from the giant panda (Ailuropoda melanoleuca) using RT-PCR technology and touchdown-PCR, respectively. The cloned cDNA fragment contains an open reading frame of 411 bp encoding 136 amino acids; the genomic sequence is 1838 bp long and contains three exons and two introns. Alignment analysis revealed that the nucleotide sequence and the deduced protein sequence are highly conserved compared to Homo sapiens, Mus musculus, Rattus norvegicus, Bos taurus, and Sus scrofa. The homologies for nucleotide sequences of the giant panda ATP5G1 to those of these species are 93.92, 92.21, 92.46, 93.67, and 92.46%, respectively, and the homologies for amino acid sequences are 90.44, 95.59, 93.38, 94.12, and 91.91%, respectively. Topology prediction showed that there is one protein kinase C phosphorylation site, one casein kinase II phosphorylation site, five N-myristoylation sites, and one ATP synthase c subunit signature in the ATP5G1 protein of the giant panda. The cDNA of ATP5G1 was transfected into Escherichia coli, and the ATP5G1 fused with the N-terminally GST-tagged protein gave rise to accumulation of an expected 40-kDa polypeptide, which had the characteristics of the predicted protein.
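
    The percentages in this abstract can be turned back into counts of identical positions. The sketch below assumes, which the abstract does not state, that each percentage was computed over the full 411-bp open reading frame or the full 136-residue protein; the helper names are illustrative.

```python
ORF_BP = 411      # open reading frame length from the abstract
PROTEIN_AA = 136  # deduced protein length from the abstract

SPECIES = ["Homo sapiens", "Mus musculus", "Rattus norvegicus",
           "Bos taurus", "Sus scrofa"]
NT_PERCENT = [93.92, 92.21, 92.46, 93.67, 92.46]
AA_PERCENT = [90.44, 95.59, 93.38, 94.12, 91.91]


def identical_positions(percent: float, length: int) -> int:
    """Nearest whole number of identical positions implied by a percentage."""
    return round(percent * length / 100)


def percent_identity(identical: int, length: int) -> float:
    """Percent identity rounded to two decimals, as quoted in the abstract."""
    return round(100 * identical / length, 2)


nt_counts = [identical_positions(p, ORF_BP) for p in NT_PERCENT]
aa_counts = [identical_positions(p, PROTEIN_AA) for p in AA_PERCENT]

for name, nt, aa in zip(SPECIES, nt_counts, aa_counts):
    print(f"{name}: {nt}/{ORF_BP} nucleotides, {aa}/{PROTEIN_AA} residues")
```

    Under this assumption every reported percentage corresponds to a whole number of identical positions, and converting those counts back reproduces the quoted values. Note also that 411 bp equals 136 codons plus one further codon, consistent with a stop codon being included in the open reading frame.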

  9. cDNA sequences of variant forms of human placenta diamine oxidase

    SciTech Connect

    Zhang, X.; Kim, J.; McIntire, S.

    1995-08-01

    Genes for two forms of human placenta diamine oxidase (dao) were cloned from a cDNA library and sequenced. One gene, pdao1, is identical in length to human kidney dao but differs from it by two bases in the coding region and differs slightly in the 3'- and 5'-noncoding regions. The second gene, pdao2, is nearly identical to these genes in the coding region, except that it has an extra 57-nucleotide coding segment near the 3' end of this region. This segment corresponds to the contiguous sequence of the 3' end of intron 3 of human kidney dao. pdao2 also differs significantly from pdao1 and human kidney dao in a 13-base sequence in the t'-noncoding region. It is proposed that pdao1 and human kidney dao are polymorphic forms of the same allele. Whether pdao2 is a polymorph of these two is not certain, because of the significant differences in the coding and noncoding regions. pdao2 may represent a different allele. 21 refs., 2 figs.

  10. Cost-effective sequencing of full-length cDNA clones powered by a de novo-reference hybrid assembly.

    PubMed

    Kuroshu, Reginaldo M; Watanabe, Junichi; Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka; Kasahara, Masahiro

    2010-05-07

    Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence approximately 800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only approximately US$3 per clone, demonstrating a significant advantage over previous approaches.

  11. Cost-Effective Sequencing of Full-Length cDNA Clones Powered by a De Novo-Reference Hybrid Assembly

    PubMed Central

    Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka

    2010-01-01

    Background: Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. Methodology: We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence ∼800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. Conclusions: The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only ∼US$3 per clone, demonstrating a significant advantage over previous approaches. PMID:20479877

  12. Reading biological processes from nucleotide sequences

    NASA Astrophysics Data System (ADS)

    Murugan, Anand

    Cellular processes have traditionally been investigated by techniques of imaging and biochemical analysis of the molecules involved. The recent rapid progress in our ability to manipulate and read nucleic acid sequences gives us direct access to the genetic information that directs and constrains biological processes. While sequence data is being used widely to investigate genotype-phenotype relationships and population structure, here we use sequencing to understand biophysical mechanisms. We present work on two different systems. First, in chapter 2, we characterize the stochastic genetic editing mechanism that produces diverse T-cell receptors in the human immune system. We do this by inferring statistical distributions of the underlying biochemical events that generate T-cell receptor coding sequences from the statistics of the observed sequences. This inferred model quantitatively describes the potential repertoire of T-cell receptors that can be produced by an individual, providing insight into its potential diversity and the probability of generation of any specific T-cell receptor. Then in chapter 3, we present work on understanding the functioning of regulatory DNA sequences in both prokaryotes and eukaryotes. Here we use experiments that measure the transcriptional activity of large libraries of mutagenized promoters and enhancers and infer models of the sequence-function relationship from this data. For the bacterial promoter, we infer a physically motivated 'thermodynamic' model of the interaction of DNA-binding proteins and RNA polymerase determining the transcription rate of the downstream gene. For the eukaryotic enhancers, we infer heuristic models of the sequence-function relationship and use these models to find synthetic enhancer sequences that optimize inducibility of expression. 
    Both projects demonstrate the utility of sequence information in conjunction with sophisticated statistical inference techniques for dissecting underlying biophysical

  13. Identification of cDNA clones encoding secretory isoenzyme forms: sequence determination of canine pancreatic prechymotrypsinogen 2 mRNA.

    PubMed Central

    Pinsky, S D; LaForge, K S; Luc, V; Scheele, G

    1983-01-01

    A cDNA library has been constructed from canine poly(A)+ mRNA. Clones containing cDNA inserts coding for prechymotrypsinogen 2 (isoelectric point = 7.1; Mr = 27,500), one of three canine pancreatic isoenzyme forms, were selected by colony hybridization using a cDNA probe synthesized from immunoselected prechymotrypsinogen 2 mRNA. To verify that cDNA clones code for prechymotrypsinogen 2 forms that translocate across rough endoplasmic reticulum membranes and fold into stable and identifiable secretory proteins, we conducted in vitro translation of hybrid-selected mRNA in the presence of microsomal membranes and optimal concentrations of glutathione and analyzed nascent translation products in their nonreduced state by two-dimensional isoelectric focusing/NaDodSO4 gel electrophoresis and fluorography. A near full-length chymotrypsinogen 2 cDNA and its primed extension were used to determine the nucleotide sequence for the entire coding region of prechymotrypsinogen 2 mRNA and 87 residues, including a poly(A) addition signal, in the 3' nontranslated region. The deduced amino acid sequence shows a 263-residue presecretory protein containing an 18-residue amino-terminal transport peptide (Met-Ala-Phe-Leu-Trp-Leu-Leu-Ser-Cys-Phe-Ala-Leu-Leu-Gly-Thr-Ala-Phe-Gly ), which we have previously shown to mediate the translocation of chymotrypsinogen 2 across the rough endoplasmic reticulum membrane. Following the transport peptide is a 245-residue proenzyme, which shows 82% and 80% sequence identity with bovine chymotrypsinogens A and B, respectively. Conserved among the three zymogens are 10 Cys residues that form five disulfide bonds in bovine chymotrypsinogens A and B and the residues that are required for zymogen activation, substrate binding, and catalytic activity. Images PMID:6584866

  14. Nucleotide sequence of SHV-2 beta-lactamase gene

    SciTech Connect

    Garbarg-Chenon, A.; Godard, V.; Labia, R.; Nicolas, J.C.

    1990-07-01

    The nucleotide sequence of plasmid-mediated beta-lactamase SHV-2 from Salmonella typhimurium (SHV-2pHT1) was determined. The gene was very similar to chromosomally encoded beta-lactamase LEN-1 of Klebsiella pneumoniae. Compared with the sequence of the Escherichia coli SHV-2 enzyme (SHV-2E.coli) obtained by protein sequencing, the deduced amino acid sequence of SHV-2pHT1 differed by three amino acid substitutions.

  15. Analysis of cloned cDNA and genomic sequences for phytochrome: complete amino acid sequences for two gene products expressed in etiolated Avena.

    PubMed Central

    Hershey, H P; Barker, R F; Idler, K B; Lissemore, J L; Quail, P H

    1985-01-01

    Cloned cDNA and genomic sequences have been analyzed to deduce the amino acid sequence of phytochrome from etiolated Avena. Restriction endonuclease site polymorphism between clones indicates that at least four phytochrome genes are expressed in this tissue. Sequence analysis of two complete and one partial coding region shows approximately 98% homology at both the nucleotide and amino acid levels, with the majority of amino acid changes being conservative. High sequence homology is also found in the 5'-untranslated region but significant divergence occurs in the 3'-untranslated region. The phytochrome polypeptides are 1128 amino acid residues long corresponding to a molecular mass of 125 kdaltons. The known protein sequence at the chromophore attachment site occurs only once in the polypeptide, establishing that phytochrome has a single chromophore per monomer covalently linked to Cys-321. Computer analyses of the amino acid sequences have provided predictions regarding a number of structural features of the phytochrome molecule. PMID:3001642

  16. Primary structure of bovine pituitary secretory protein I (chromogranin A) deduced from the cDNA sequence

    SciTech Connect

    Ahn, T.G.; Cohn, D.V.; Gorr, S.U.; Ornstein, D.L.; Kashdan, M.A.; Levine, M.A.

    1987-07-01

    Secretory protein I (SP-I), also referred to as chromogranin A, is an acidic glycoprotein that has been found in every tissue of endocrine and neuroendocrine origin examined but never in exocrine or epithelial cells. Its co-storage and co-secretion with peptide hormones and neurotransmitters suggest that it has an important endocrine or secretory function. The authors have isolated cDNA clones from a bovine pituitary lambda gt11 expression library using an antiserum to parathyroid SP-I. The largest clone (SP4B) hybridized to a transcript of 2.1 kilobases in RNA from parathyroid, pituitary, and adrenal medulla. Immunoblots of bacterial lysates derived from SP4B lysogens demonstrated specific antibody binding to an SP4B/β-galactosidase fusion protein (160 kDa) with a cDNA-derived component of 46 kDa. Radioimmunoassay of the bacterial lysates with SP-I antiserum yielded parallel displacement curves of 125I-labeled SP-I by the SP4B lysate and authentic SP-I. SP4B contains a cDNA of 1614 nucleotides that encodes a 449-amino acid protein (calculated mass, 50 kDa). The nucleotide sequences of the pituitary SP-I cDNA and adrenal medullary SP-I cDNAs are nearly identical. Analysis of genomic DNA suggests that pituitary, adrenal, and parathyroid SP-I are products of the same gene.

  17. The complete sequence of a full length cDNA for human liver glyceraldehyde-3-phosphate dehydrogenase: evidence for multiple mRNA species.

    PubMed Central

    Arcari, P; Martinelli, R; Salvatore, F

    1984-01-01

    A recombinant M13 clone (O42) containing a 65 b.p. cDNA fragment from human fetal liver mRNA coding for glyceraldehyde-3-phosphate dehydrogenase has been identified and it has been used to isolate from a full-length human adult liver cDNA library a recombinant clone, pG1, which has been subcloned in M13 phage and completely sequenced with the chain terminator method. Besides the coding region of 1008 b.p., the cDNA sequence includes 60 nucleotides at the 5'-end and 204 nucleotides at the 3'-end up to the polyA tail. Hybridization of pG1 to human liver total RNA shows only one band about the size of pG1 cDNA. A much stronger hybridization signal was observed using RNA derived from human hepatocarcinoma and kidney carcinoma cell lines. Sequence homology between clone 042 and the homologous region of clone pG1 is 86%. On the other hand, homology among the translated sequences and the known human muscle protein sequence ranges between 77 and 90%; these data demonstrate the existence of more than one gene coding for G3PD. Southern blot of human DNA, digested with several restriction enzymes, also indicate that several homologous sequences are present in the human genome. Images PMID:6096821

  18. The Nucleotide Sequence of the lac Operator

    PubMed Central

    Gilbert, Walter; Maxam, Allan

    1973-01-01

    The lac repressor protects the lac operator against digestion with deoxyribonuclease. The protected fragment is double-stranded and about 27 base-pairs long. We determined the sequence of RNA transcription copies of this fragment and present a sequence for 24 base pairs. It is:

        5′--T G G A A T T G T G A G C G G A T A A C A A T T 3′
        3′--A C C T T A A C A C T C G C C T A T T G T T A A 5′

    The sequence has 2-fold symmetry regions; the two longest are separated by one turn of the DNA double helix. PMID:4587255
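
    The two strands quoted in this abstract can be checked for base pairing, and the top strand can be searched for 2-fold symmetry, with a short script. This is a minimal sketch and not the authors' method: the brute-force search for inverted repeats (a stretch whose reverse complement appears further along the same strand) and all function names are illustrative assumptions.

```python
# Strands as printed in the abstract, with the spaces removed.
TOP = "TGGAATTGTGAGCGGATAACAATT"     # written 5' -> 3'
BOTTOM = "ACCTTAACACTCGCCTATTGTTAA"  # written 3' -> 5'

_PAIR = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Base-by-base Watson-Crick complement, same left-to-right order."""
    return seq.translate(_PAIR)


def reverse_complement(seq: str) -> str:
    """Complementary strand read in its own 5' -> 3' direction."""
    return complement(seq)[::-1]


def inverted_repeats(seq: str, arm_len: int) -> list[tuple[int, int]]:
    """Find (left_start, right_start) pairs of non-overlapping arms of
    length arm_len where the right arm is the reverse complement of the
    left arm. Indices are 0-based."""
    hits = []
    for i in range(len(seq) - arm_len + 1):
        target = reverse_complement(seq[i:i + arm_len])
        j = seq.find(target, i + arm_len)
        while j != -1:
            hits.append((i, j))
            j = seq.find(target, j + 1)
    return hits


strands_pair = complement(TOP) == BOTTOM
arms6 = inverted_repeats(TOP, 6)
arms7 = inverted_repeats(TOP, 7)
print(strands_pair, arms6, arms7)
```

    With these inputs the printed strands are fully complementary, and the search reports a single pair of 6-base arms, AATTGT at index 3 and ACAATT at index 18, separated by 9 bases, with no pair of 7-base arms. Whether these arms are the "two longest" symmetry regions the abstract refers to is not something the abstract itself confirms.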

  19. Nucleotide sequence of the coat protein gene of canine parvovirus.

    PubMed Central

    Rhode, S L

    1985-01-01

    The nucleotide sequence of the canine parvovirus (CPV2) from map units 33 to 95 has been determined. This includes the entire coat protein gene and noncoding sequences at the 3' end of the gene, exclusive of the terminal inverted repeat. The predicted capsid protein structures are discussed and compared with those of the rodent parvoviruses H-1 and MVM. PMID:3989914

  20. [Tabular excel editor for analysis of aligned nucleotide sequences].

    PubMed

    Demkin, V V

    2010-01-01

    The Excel platform was used to convert multiple aligned nucleotide sequences obtained from the BLAST network service into a form suitable for visual analysis and editing. Two macros for MS Excel 2007 were constructed. The array of aligned sequences, once transformed into an Excel table and processed with the macros, is more convenient to analyze than the initial HTML data.

  1. Cloning and sequencing of dolphinfish (Coryphaena hippurus, Coryphaenidae) growth hormone-encoding cDNA.

    PubMed

    Peduel, A D; Elizur, A; Knibb, W

    1994-01-01

    The cDNA encoding the preprotein growth hormone from the dolphinfish (Coryphaena hippurus) has been cloned and sequenced. The cDNA was derived by reverse transcription of RNA from the pituitary of a young fish using the method known as Rapid Amplification of cDNA Ends (RACE). An oligonucleotide primer corresponding to the 5' region of Pagrus major and the universal RACE primer enabled amplification using the Polymerase Chain Reaction (PCR). The dolphinfish and the yellowtail, Seriola quinqueradiata, are both members of the sub-order Percoidei (Perciformes), and their GH sequences show a high level of homology.

  2. Human debrisoquine 4-hydroxylase (P450IID1): cDNA and deduced amino acid sequence and assignment of the CYP2D locus to chromosome 22.

    PubMed

    Gonzalez, F J; Vilbois, F; Hardwick, J P; McBride, O W; Nebert, D W; Gelboin, H V; Meyer, U A

    1988-02-01

    The enzyme P450db1 (db1) is responsible for the common human defect in drug oxidation known as the "debrisoquine/sparteine polymorphism." Polyclonal antibody against the rat db1 protein was used to screen a human liver lambda gt11 library for the db1 cDNA clone. A cDNA containing the full protein coding sequence was isolated; the deduced NH2-terminal sequence of this cDNA was identical to that derived from direct sequencing of the purified human db1 protein. Comparison of the human db1 with rat db1 revealed 71 and 73% similarities of nucleotides and amino acids, respectively. By use of human-rodent somatic cell hybrids the db1 gene was localized to human chromosome 22 (CYP2D locus).

  3. Cloning and sequencing of the cDNA for S-acyl fatty acid synthase thioesterase from the uropygial gland of mallard duck.

    PubMed

    Poulose, A J; Rogers, L; Cheesbrough, T M; Kolattukudy, P E

    1985-12-15

    In vitro translation of poly(A)+ RNA from the uropygial glands of mallard ducks (Anas platyrhynchos) generated a 29-kDa protein which cross-reacted with rabbit antibodies prepared against S-acyl fatty acid synthase thioesterase (Kolattukudy, P. E., Rogers, L., and Flurkey, W. (1985) J. Biol. Chem., 260, 10789-10793). A poly(A)+ RNA fraction enriched in this thioesterase mRNA, isolated by sucrose density gradient centrifugation, was used to prepare cDNA which was cloned in Escherichia coli using the plasmid pUC9. Using hybrid-selected translation and colony hybridization, 17 clones were selected which contained the cDNA for S-acyl fatty acid synthase thioesterase. Northern blot analysis showed that the mature mRNA for this thioesterase contained 1350 nucleotides whereas the cloned cDNA inserts contained 1150-1200 base pairs. Five of the 6 clones tested for 5'-sequence had identical sequences, and the three tested for 3'-end showed the same sequence with poly(A) tails. Two clones, pTE1 and pTE3, representing nearly the full length of mRNA, were selected for sequencing. Maxam-Gilbert and Sanger dideoxy chain termination methods were used on the cloned cDNA and on restriction fragments subcloned in M13 in order to determine the complete nucleotide sequence of the cloned cDNA. The nucleotide sequence showed an open reading frame coding for a peptide of 28.8 kDa. Two peptides isolated from the tryptic digest of the thioesterase purified from the gland showed amino acid sequences which matched with two segments of the sequence deduced from the nucleotide sequence. Another segment containing a serine residue showed an amino acid sequence homologous to the active serine-containing segment of the thioesterase domain of fatty acid synthase. Thus, the clones represent cDNA for S-acyl fatty acid synthase thioesterase. The present results constitute the first case of a complete sequence of a thioesterase.

  4. Cloning and characterization of a highly repetitive fish nucleotide sequence.

    PubMed

    Datta, U; Dutta, P; Mandal, R K

    1988-01-01

    We have cloned and sequenced a highly repetitive HindIII fragment of DNA from the common carp Cyprinus carpio. It represents a tandemly repeated sequence with a monomeric unit of 245 bp and comprises 8% of the fish genome. Higher units of this monomer appear as a ladder in Southern blots. The monomeric unit has been sequenced; it is A + T-rich with some direct and some inverse-repeat nucleotide clusters.

  5. Nucleotide sequence composition and method for detection of Neisseria gonorrhoeae

    SciTech Connect

    Lo, A.; Yang, H.L.

    1990-02-13

    This patent describes a composition of matter that is specific for Neisseria gonorrhoeae. It comprises at least one nucleotide sequence for which the ratio of the amount of the sequence which hybridizes to chromosomal DNA of Neisseria gonorrhoeae to the amount of the sequence which hybridizes to chromosomal DNA of Neisseria meningitidis is greater than about five, the ratio being obtained by a method described in the patent.

  6. cDNA, genomic sequence cloning and overexpression of glyceraldehyde-3-phosphate dehydrogenase gene (GAPDH) from the Giant Panda.

    PubMed

    Hou, Wan-Ru; Hou, Yi-Ling; Du, Yu-Jie; Zhang, Tian; Hao, Yan-Zhe

    2010-01-01

    GAPDH (glyceraldehyde-3-phosphate dehydrogenase) is a key enzyme of the glycolytic pathway and it is related to the occurrence of some diseases. The cDNA and the genomic sequence of GAPDH were cloned successfully from the Giant Panda (Ailuropoda melanoleuca) using the RT-PCR technology and Touchdown-PCR, respectively. Both sequences were analyzed preliminarily. The cDNA of GAPDH cloned from the Giant Panda is 1191 bp in size and contains an open reading frame of 1002 bp encoding 333 amino acids. The genomic sequence is 3941 bp in length and was found to possess 10 exons and 9 introns. Alignment analysis indicates that the nucleotide sequence and the deduced amino acid sequence are highly conserved in some mammalian species, including Homo sapiens, Mus musculus, Rattus norvegicus, Canis lupus familiaris and Bos taurus. The homologies for the nucleotide sequences of the Giant Panda GAPDH to that of these species are 90.67, 90.92, 90.62, 95.01 and 92.32% respectively, while the homologies for the amino acid sequences are 94.93, 95.5, 95.8, 98.8 and 97.0%. Primary structure analysis revealed that the molecular weight of the putative GAPDH protein is 35.7899 kDa with a theoretical pI of 8.21. Topology prediction showed that there is one Glyceraldehyde 3-phosphate dehydrogenase active site, two N-glycosylation sites, four Casein kinase II phosphorylation sites, seven Protein kinase C phosphorylation sites and eight N-myristoylation sites in the GAPDH protein of the Giant Panda. The GAPDH gene was overexpressed in E. coli BL21. The results indicated that the fusion of GAPDH with the N-terminally His-tagged form gave rise to the accumulation of an expected 43 kDa polypeptide. The SDS-PAGE analysis also showed that the recombinant GAPDH was soluble and thus could be used for further functional studies.
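
    The relation between the reported open reading frame length and the protein length can be made explicit. This is a small sketch under one stated assumption: that the 1002 bp figure includes the stop codon, which is what the reported numbers suggest (1002 / 3 = 334 codons, one of which encodes no residue). The function name is illustrative.

```python
def residues_from_orf(orf_bp, includes_stop=True):
    """Number of encoded amino acids for an ORF of orf_bp nucleotides.

    includes_stop=True assumes the quoted ORF length counts the
    terminating stop codon, which contributes no residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    cdna_bp, orf_bp = 1191, 1002          # values quoted in the abstract
    print(residues_from_orf(orf_bp))      # encoded residues
    print(cdna_bp - orf_bp)               # nucleotides outside the ORF
```

    The second printed value is the combined length of the cDNA regions outside the open reading frame; the abstract does not say how it splits between the 5' and 3' ends.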

  7. Cloning and genomic nucleotide sequence of the matrix attachment region binding protein from the halotolerant alga Dunaliella salina.

    PubMed

    Wang, Peng-Ju; Wang, Tian-Yun; Wang, Ya-Feng; Yang, Rui; Li, Zhao-Xi

    2013-07-01

    In our previous study, the sequence of a matrix attachment region binding protein (MBP) cDNA was cloned from the unicellular green alga Dunaliella salina. However, the nucleotide sequence of this gene has not been reported so far. In this paper, the nucleotide sequence of MBP was cloned and characterized, and its gene copy number was determined. The MBP nucleotide sequence is 5641 bp long, and interrupted by 12 introns ranging from 132 to 562 bp. All the introns in the D. salina MBP gene have orthodox splice sites, exhibiting GT at the 5' end and AG at the 3' end. Southern blot analysis showed that MBP only has one copy in the D. salina genome.
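
    The "orthodox splice site" criterion mentioned above (GT at the 5' end of each intron, AG at the 3' end) is easy to test once intron coordinates are known. The sketch below uses a made-up toy sequence and made-up coordinates, since the record gives neither; function names are hypothetical.

```python
def has_canonical_splice_sites(intron):
    """True if the intron starts with GT and ends with AG (GT-AG rule)."""
    intron = intron.upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


def check_introns(gene, intron_spans):
    """Check each intron given as a 0-based, half-open (start, end) span."""
    return [has_canonical_splice_sites(gene[start:end])
            for start, end in intron_spans]


if __name__ == "__main__":
    # Toy example only: exon + intron + exon, not D. salina sequence.
    gene = "ATGGCC" + "GTAAGTTTCTAG" + "GGATAA"
    print(check_introns(gene, [(6, 18)]))
```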

  8. Nucleotide correlations and electronic transport of DNA sequences

    NASA Astrophysics Data System (ADS)

    Albuquerque, E. L.; Vasconcelos, M. S.; Lyra, M. L.; de Moura, F. A. B. F.

    2005-02-01

    We use a tight-binding formulation to investigate the transmissivity and wave-packet dynamics of sequences of single-strand DNA molecules made up from the nucleotides guanine (G), adenine (A), cytosine (C), and thymine (T). In order to reveal the relevance of the underlying correlations in the nucleotides distribution, we compare the results for the genomic DNA sequence with those of two artificial sequences: (i) the Rudin-Shapiro one, which has long-range correlations; (ii) a random sequence, which is a kind of prototype of a short-range correlated system, presented here with the same first-neighbor pair correlations of the human DNA sequence. We found that the long-range character of the correlations is important to the persistence of resonances of finite segments. On the other hand, the wave-packet dynamics seems to be mostly influenced by the short-range correlations.
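
    For readers unfamiliar with the Rudin-Shapiro sequence named above, the following sketch generates the classic two-valued (+1/-1) form of it. This is only the standard mathematical sequence; the abstract does not state how the four nucleotides are assigned to it, so no mapping to G, A, C and T is attempted here. Function names are illustrative.

```python
def rudin_shapiro(n):
    """n-th Rudin-Shapiro term: +1 if the binary expansion of n contains an
    even number of (possibly overlapping) '11' pairs, otherwise -1."""
    # n & (n >> 1) has a 1 bit wherever two adjacent bits of n are both 1.
    overlapping_pairs = bin(n & (n >> 1)).count("1")
    return -1 if overlapping_pairs % 2 else 1


def rudin_shapiro_sequence(length):
    """First `length` terms of the sequence, starting at n = 0."""
    return [rudin_shapiro(n) for n in range(length)]


if __name__ == "__main__":
    print(rudin_shapiro_sequence(16))
```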

  9. The complete nucleotide sequence of bean yellow mosaic potyvirus RNA.

    PubMed

    Guyatt, K J; Proll, D F; Menssen, A; Davidson, A D

    1996-01-01

    The complete nucleotide sequence of an Australian strain of bean yellow mosaic virus (BYMV-S) has been determined from cloned viral cDNAs. The BYMV-S genome is 9 547 nucleotides in length excluding a poly(A) tail. Computer analysis of the sequence revealed a single long open reading frame (ORF) of 9168 nucleotides, commencing at position 206 and terminating with UAG at position 9374-6. The ORF potentially encodes a polyprotein of 3056 amino acids with a deduced Mr of 347 409. The 5' and 3' untranslated regions are 205 and 174 nucleotides in length respectively. Alignment of the amino acid sequence of the BYMV-S polyprotein with those of other potyviruses identified nine putative proteolytic cleavage sites. The predicted consensus cleavage site of the BYMV NIa protease was found to differ from that described for other potyviruses. Processing of the BYMV polyprotein at the designated proteolytic cleavage sites would result in a typical potyviral genome arrangement. The amino acid sequences of the putative BYMV encoded proteins were compared to the homologous gene products of twelve individual potyviruses to identify overall and specific regions of amino acid sequence homology.
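
    The coordinates quoted in this abstract fit together arithmetically, as the sketch below shows. One point is an inference from the numbers rather than a statement in the abstract: the reported 174-nucleotide 3' region is reproduced only if the UAG stop codon (positions 9374-6) is counted with it. Names are illustrative.

```python
def genome_layout(genome_len, orf_start, orf_len):
    """Derive region lengths from 1-based coordinates.

    orf_len is taken to exclude the stop codon, as the quoted
    figures imply (9168 / 3 = 3056 residues).
    """
    if orf_len % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    orf_end = orf_start + orf_len - 1            # last coding nucleotide
    return {
        "five_prime_utr": orf_start - 1,
        "orf_end": orf_end,
        "stop_codon": (orf_end + 1, orf_end + 3),
        "residues": orf_len // 3,
        "three_prime_incl_stop": genome_len - orf_end,
        "three_prime_excl_stop": genome_len - (orf_end + 3),
    }


if __name__ == "__main__":
    for key, value in genome_layout(9547, 206, 9168).items():
        print(key, value)
```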

  10. Investigation of single nucleotide polymorphisms based on the intronic sequences of the propylene alcohol dehydrogenase gene in Chinese tobacco genotypes

    PubMed Central

    Wei, Ji-Cheng; Qiu, En-Jian; Guo, Hui-Yan; Hao, Ai-Ping; Chen, Rong-Ping

    2014-01-01

    A pair of primers was designed to amplify the propylene alcohol dehydrogenase gene sequence based on the cDNA sequence of the tobacco allyl-alcohol dehydrogenase gene. All introns were sequenced using traditional polymerase chain reaction (PCR) methods and T-A cloning. The sequences from common tobacco (Nicotiana tabacum L.) and rustica tobacco (Nicotiana rustica L.) were analysed between the third intron and the fourth intron of the propylene alcohol dehydrogenase gene. The results showed that the alcohol dehydrogenase gene is a low-copy nuclear gene. The intron sequences have a combination of single nucleotide polymorphisms and length polymorphisms between common tobacco and rustica tobacco, which are suitable to identify the different germplasms. Furthermore, there are some single nucleotide polymorphism sites in the target sequence within common tobacco that can be used to distinguish intraspecific varieties. PMID:26740754

  11. Generation and analysis of expressed sequence tags from Trypanosoma cruzi trypomastigote and amastigote cDNA libraries.

    PubMed

    Agüero, Fernán; Abdellah, Karim Ben; Tekiel, Valeria; Sánchez, Daniel O; González, Antonio

    2004-08-01

    We have generated 2771 expressed sequence tags (ESTs) from two cDNA libraries of Trypanosoma cruzi CL-Brener. The libraries were constructed from trypomastigotes and amastigotes, using a spliced leader primer to synthesize the cDNA second strand, thus selecting for full-length cDNAs. Since the libraries were neither normalized nor pre-screened, we compared the representation of transcripts between the two using a statistical test and identified a subset of transcripts that show apparent differential representation. A non-redundant set of 1619 reconstructed transcripts was generated by sequence clustering. This dataset was used to perform similarity searches against protein and nucleotide databases. Based on these searches, 339 sequences could be assigned a putative identity. One thousand one-hundred and sixteen sequences in the non-redundant clustered dataset (68.8%) are new expression tags, not represented in the T. cruzi epimastigote ESTs that are in the public databases. Additional information is provided online at http://genoma.unsam.edu.ar/projects/tram. To the best of our knowledge these are the first ESTs reported for the life cycle stages of T. cruzi that occur in the vertebrate host.

  12. Complete sequence analysis of cDNA clones encoding rat whey phosphoprotein: homology to a protease inhibitor.

    PubMed

    Dandekar, A M; Robinson, E A; Appella, E; Qasba, P K

    1982-07-01

    Lactoprotein clones have been isolated from a rat mammary gland recombinant library of cDNA plasmids. Clones p-Wp 52 and p-Wp 47 were shown by hybrid selection, in vitro translation, and immunoprecipitation to represent a cloned DNA sequence encoding rat whey phosphoprotein. We report here the nucleotide sequence of the cDNA insert of p-Wp 52 and show that it encodes the complete whey phosphoprotein sequence. The encoded sequence shows a high content of half-cystine, glutamic acid, aspartic acid, and serine but an absence of tyrosine. The half-cystines appear in unique arrangements and are repeated in two domains of the protein. The second domain has striking similarities with the second domain of the red sea turtle protease inhibitor. Clone p-Wp 52 has allowed the study of expression of whey phosphoprotein mRNA during functional differentiation of rat mammary gland and in mammary tumors. The whey phosphoprotein mRNA is detected during midpregnancy and lactation in the rat mammary gland but is barely detected in mammary tumors in which other milk protein mRNAs are expressed. The whey phosphoprotein gene in these tumors is hypermethylated, correlating with the reduced expression of this gene.

  13. Construction and EST sequencing of full-length, drought stress cDNA libraries for common beans (Phaseolus vulgaris L.)

    PubMed Central

    2011-01-01

    Background: Common bean is an important legume crop with only a moderate number of short expressed sequence tags (ESTs) made with traditional methods. The goal of this research was to use full-length cDNA technology to develop ESTs that would overlap with the beginning of open reading frames and therefore be useful for gene annotation of genomic sequences. The library was also constructed to represent genes expressed under drought, low soil phosphorus and high soil aluminum toxicity. We also undertook comparisons of the full-length cDNA library to two previous non-full clone EST sets for common bean. Results: Two full-length cDNA libraries were constructed: one for the drought tolerant Mesoamerican genotype BAT477 and the other one for the acid-soil tolerant Andean genotype G19833 which has been selected for genome sequencing. Plants were grown in three soil types using deep rooting cylinders subjected to drought and non-drought stress and tissues were collected from both roots and above ground parts. A total of 20,000 clones were selected robotically, half from each library. Then, nearly 10,000 clones from the G19833 library were sequenced with an average read length of 850 nucleotides. A total of 4,219 unigenes were identified consisting of 2,981 contigs and 1,238 singletons. These were functionally annotated with gene ontology terms and placed into KEGG pathways. Compared to other EST sequencing efforts in common bean, about half of the sequences were novel or represented the 5' ends of known genes. Conclusions: The present full-length cDNA libraries add to the technological toolbox available for common bean and our sequencing of these clones substantially increases the number of unique EST sequences available for the common bean genome. All of this should be useful for both functional gene annotation, analysis of splice site variants and intron/exon boundary determination by comparison to soybean genes or with common bean whole-genome sequences. In addition the

  14. Nucleotide sequence of the capsid protein gene and 3' non-coding region of papaya mosaic virus RNA.

    PubMed

    Abouhaidar, M G

    1988-01-01

    The nucleotide sequences of cDNA clones corresponding to the 3' OH end of papaya mosaic virus RNA have been determined. The 3'-terminal sequence obtained was 900 nucleotides in length, excluding the poly(A) tail, and contained an open reading frame capable of giving rise to a protein of 214 amino acid residues with an Mr of 22930. This protein was identified as the viral capsid protein. The 3' non-coding region of PMV genome RNA was about 121 nucleotides long [excluding the poly(A) tail] and homologous to the complementary sequence of the non-coding region at the 5' end of PMV RNA. A long open reading frame was also found in the predicted 5' end region of the negative strand.

  15. Nucleotide sequence relationship between intracisternal type A particles of Mus musculus and an endogenous retrovirus (M432) of Mus cervicolor.

    PubMed

    Kuff, E L; Lueders, K K; Scolnick, E M

    1978-10-01

    Intracisternal type A particles are retrovirus-like structures found in embryonic cells and many tumors of Mus musculus but having no clear relationship with other retroviruses of this mouse species. We have observed a partial nucleotide sequence homology between the high-molecular-weight (32S and 35S) RNA components of intracisternal A-particles from a neuroblastoma cell line and the 70S RNA fraction from M432, a type of retrovirus endogenous to the Asian mouse Mus cervicolor. M432 complementary DNA (cDNA) was hybridized to the extent of 30% by the A-particle RNAs. The hybrids showed a lower thermal stability (ΔTm, 7 °C) than those formed with homologous RNA. The reaction was commensurate with that found between M432 cDNA and divergent sequences in the M. musculus genome. The capacity to hybridize M432 cDNA was closely correlated with the concentration of A-particle sequences in the cytoplasmic RNA of several M. musculus cell types. The major RNA fraction of M432 virus showed a reciprocal partial reaction with the A-particle cDNAs; the virus, which was grown in NIH/3T3 (M. musculus) cells, also contained a small proportion of apparently authentic A-particle nucleotide sequences. A subset of A-particle sequences seemed to be almost totally lacking in the main M432 RNA. The A-particle cDNAs hybridized extensively with divergent sequences in M. cervicolor cellular DNA, indicating that this mouse species may contain not only the partially homologous M432 virogene, but also a more complete genetic equivalent of the intracisternal A-particle.

  16. Selective and flexible depletion of problematic sequences from RNA-seq libraries at the cDNA stage.

    PubMed

    Archer, Stuart K; Shirokikh, Nikolay E; Preiss, Thomas

    2014-05-26

    A major hurdle to transcriptome profiling by deep-sequencing technologies is that abundant transcripts, such as rRNAs, can overwhelm the libraries, severely reducing transcriptome-wide coverage. Methods for depletion of such unwanted sequences typically require treatment of RNA samples prior to library preparation, are costly and not suited to unusual species and applications. Here we describe Probe-Directed Degradation (PDD), an approach that employs hybridisation to DNA oligonucleotides at the single-stranded cDNA library stage and digestion with Duplex-Specific Nuclease (DSN). Targeting Saccharomyces cerevisiae rRNA sequences in Illumina HiSeq libraries generated by the split adapter method we show that PDD results in efficient removal of rRNA. The probes generate extended zones of depletion as a function of library insert size and the requirements for DSN cleavage. Using intact total RNA as starting material, probes can be spaced at the minimum anticipated library size minus 20 nucleotides to achieve continuous depletion. No off-target bias is detectable when comparing PDD-treated with untreated libraries. We further provide a bioinformatics tool to design suitable PDD probe sets. We find that PDD is a rapid procedure that results in effective and specific depletion of unwanted sequences from deep-sequencing libraries. Because PDD acts at the cDNA stage, handling of fragile RNA samples can be minimised and it should further be feasible to remediate existing libraries. Importantly, PDD preserves the original RNA fragment boundaries as is required for nucleotide-resolution footprinting or base-cleavage studies. Finally, as PDD utilises unmodified DNA oligonucleotides it can provide a low-cost option for large-scale projects, or be flexibly customised to suit different depletion targets, sample types and organisms.
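
    The probe-spacing rule stated above (minimum anticipated library insert size minus 20 nucleotides) can be turned into a simple placement calculation. This sketch is not the bioinformatics tool the authors mention; it assumes the spacing is measured from probe start to probe start, and the function name, parameters and example values are hypothetical.

```python
def probe_starts(target_len, probe_len, min_insert, margin=20):
    """0-based start positions for probes tiled along a target sequence.

    Spacing follows the rule in the abstract: minimum anticipated library
    insert size minus `margin` nucleotides (20 in the abstract). Measuring
    the spacing start-to-start is an assumption of this sketch.
    """
    step = min_insert - margin
    if step <= 0:
        raise ValueError("min_insert must exceed margin")
    if probe_len > target_len:
        raise ValueError("probe is longer than the target")
    return list(range(0, target_len - probe_len + 1, step))


if __name__ == "__main__":
    # Hypothetical numbers: 1000-nt target, 25-nt probes, 120-nt minimum insert.
    print(probe_starts(1000, 25, 120))
```

    A real design would also have to screen candidate probes for off-target matches and hybridisation properties, which this placement step ignores.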

  17. Nucleotide sequence of the tobacco (Nicotiana tabacum) anionic peroxidase gene

    SciTech Connect

    Diaz-De-Leon, F.; Klotz, K.L.; Lagrimini, L.M.

    1993-03-01

    Peroxidases have been implicated in numerous physiological processes including lignification (Grisebach, 1981), wound-healing (Espelie et al., 1986), phenol oxidation (Lagrimini, 1991), pathogen defense (Ye et al., 1990), and the regulation of cell elongation through the formation of interchain covalent bonds between various cell wall polymers (Fry, 1986; Goldberg et al., 1986; Bradley et al., 1992). However, a complete description of peroxidase action in vivo is not available because of the vast number of potential substrates and the existence of multiple isoenzymes. The tobacco anionic peroxidase is one of the better-characterized isoenzymes. This enzyme has been shown to oxidize a number of significant plant secondary compounds in vitro including cinnamyl alcohols, phenolic acids, and indole-3-acetic acid (Maeder, 1980; Lagrimini, 1991). A cDNA encoding the enzyme has been obtained, and this enzyme was shown to be expressed at the highest levels in lignifying tissues (xylem and tracheary elements) and also in epidermal tissue (Lagrimini et al., 1987). It was shown at this time that there were four distinct copies of the anionic peroxidase gene in tobacco (Nicotiana tabacum). A tobacco genomic DNA library was constructed in the λ phage EMBL3, from which two unique peroxidase genes were sequenced. One of these clones, λPOD1, was designated as a pseudogene when the exonic sequences were found to differ from the cDNA sequences by 1%, and several frame shifts in the coding sequences indicated a dysfunctional gene (the authors' unpublished results). The other clone, λPOD3, described in this manuscript, was designated as the functional tobacco anionic peroxidase gene because of 100% homology with the cDNA. Significant structural elements include an AS-2 box implicated in shoot-specific expression (Lam and Chua, 1989), a TATA box, and two intervening sequences. 10 refs., 1 tab.

  18. cDNA cloning, sequence analysis, and chromosomal localization of the gene for human carnitine palmitoyltransferase

    SciTech Connect

    Finocchiaro, G.; Taroni, F.; Martin, A.L.; Colombo, I.; Tarelli, G.T.; DiDonato, S.; Rocchi, M.

    1991-01-15

    The authors have cloned and sequenced a cDNA encoding human liver carnitine palmitoyltransferase (CPTase), an inner mitochondrial membrane enzyme that plays a major role in the fatty acid oxidation pathway. Mixed oligonucleotide primers whose sequences were deduced from one tryptic peptide obtained from purified CPTase were used in a polymerase chain reaction, allowing the amplification of a 0.12-kilobase fragment of human genomic DNA encoding such a peptide. A 60-base-pair (bp) oligonucleotide synthesized on the basis of the sequence from this fragment was used for the screening of a cDNA library from human liver and hybridized to a cDNA insert of 2255 bp. This cDNA contains an open reading frame of 1974 bp that encodes a protein of 658 amino acid residues including 25 residues of an NH2-terminal leader peptide. The assignment of this open reading frame to human liver CPTase is confirmed by matches to seven different amino acid sequences of tryptic peptides derived from pure human CPTase and by the 82.2% homology with the amino acid sequence of rat CPTase. The NH2-terminal region of CPTase contains a leucine-proline motif that is shared by carnitine acetyl- and octanoyltransferases and by choline acetyltransferase. The gene encoding CPTase was assigned to human chromosome 1, region 1q12-1pter, by hybridization of CPTase cDNA with a DNA panel of 19 human-hamster somatic cell hybrids.

  19. cDNA sequence analysis of a 29-kDa cysteine-rich surface antigen of pathogenic Entamoeba histolytica

    SciTech Connect

    Torian, B.E.; Stroeher, V.L.; Stamm, W.E.; Flores, B.M.; Hagen, F.S.

    1990-08-01

    A λgt11 cDNA library was constructed from poly(U)-Sepharose-selected Entamoeba histolytica trophozoite RNA in order to clone and identify surface antigens. The library was screened with rabbit polyclonal anti-E. histolytica serum. A 700-base-pair cDNA insert was isolated and the nucleotide sequence was determined. The deduced amino acid sequence of the cDNA revealed a cysteine-rich protein. DNA hybridizations showed that the gene was specific to E. histolytica since the cDNA probe reacted with DNA from four axenic strains of E. histolytica but did not react with DNA from Entamoeba invadens, Acanthamoeba castellanii, or Trichomonas vaginalis. The insert was subcloned into the expression vector pGEX-1 and the protein was expressed as a fusion with the C terminus of glutathione S-transferase. Purified fusion protein was used to generate 22 monoclonal antibodies (mAbs) and a mouse polyclonal antiserum specific for the E. histolytica portion of the fusion protein. A 29-kDa protein was identified as a surface antigen when mAbs were used to immunoprecipitate the antigen from metabolically 35S-labeled live trophozoites. The surface location of the antigen was corroborated by mAb immunoprecipitation of a 29-kDa protein from surface-125I-labeled whole trophozoites as well as by the reaction of mAbs with live trophozoites in an indirect immunofluorescence assay performed at 4 °C. Immunoblotting with mAbs demonstrated that the antigen was present on four axenic isolates tested. mAbs recognized epitopes on the 29-kDa native antigen on some but not all clinical isolates tested.

  20. Rat mammary-gland transferrin: nucleotide sequence, phylogenetic analysis and glycan structure.

    PubMed Central

    Escrivá, H; Pierce, A; Coddeville, B; González, F; Benaissa, M; Léger, D; Wieruszeski, J M; Spik, G; Pamblanco, M

    1995-01-01

    The complete cDNA for rat mammary-gland transferrin (Tf) has been sequenced and also the native protein isolated from milk in order to analyse the structure of the main glycan variants present. A lactating-rat mammary-gland cDNA library in lambda gt10 was screened with a partial cDNA copy of rat liver Tf and subsequently rescreened with 5' fragments of the longest clones. This produced a 2275 bp insert coding for an open reading frame of 695 amino acid residues. This includes a 19-amino acid signal sequence and the mature protein containing 676 amino acids and one N-glycosylation site in the C-terminal domain at residue 490. Phylogenetic analysis was carried out using 14 translated Tf nucleotide sequences, and the derived evolutionary tree shows that at least three gene duplication events have occurred during Tf evolution, one of which generated the N- and C-terminal domains and occurred before separation of arthropods and chordates. The two halves of human melanotransferrin are more similar to each other than to any other sequence, which contrasts with the pattern shown by the remaining sequences. Native rat milk Tf is separated into four bands on native PAGE that differ only in their sialic acid content: one biantennary glycan is present containing either no sialic acid residues or up to three. The complete structures of the two major variants were determined by methylation, m.s. and 400 MHz 1H-n.m.r. spectroscopy. They contain either one or two neuraminic acid residues (alpha 2-->6)-linked to galactose in conventional biantennary N-acetyl-lactosamine-type glycans. Most contain fucose (alpha 1-->6)-linked to the terminal non-reducing N-acetylglucosamine. Images Figure 4 PMID:7717992

  1. The nucleotide sequence and genome organization of Plasmopara halstedii virus

    PubMed Central

    2011-01-01

    Background: Only very few viruses of Oomycetes have been studied in detail. Isometric virions were found in different isolates of the oomycete Plasmopara halstedii, the downy mildew pathogen of sunflower. However, complete nucleotide sequences and data on the genome organization were lacking. Methods: Viral RNA of different P. halstedii isolates was subjected to nucleotide sequencing and analysis of the viral genome. The N-terminal sequence of the viral coat protein was determined using Top-Down MALDI-TOF analysis. Results: The complete nucleotide sequences of both single-stranded RNA segments (RNA1 and RNA2) were established. RNA1 consisted of 2793 nucleotides (nt) exclusive of its 3' poly(A) tract and a single open-reading frame (ORF1) of 2745 nt. ORF1 was framed by a 5' untranslated region (5' UTR) of 18 nt and a 3' untranslated region (3' UTR) of 30 nt. ORF1 contained motifs of RNA-dependent RNA polymerases (RdRp) and showed similarities to RdRp of Sclerophthora macrospora virus A (SmV A) and viruses within the Nodaviridae family. RNA2 consisted of 1526 nt exclusive of its 3' poly(A) tract and a second ORF (ORF2) of 1128 nt. ORF2 coded for the single viral coat protein (CP) and was framed by a 5' UTR of 164 nt and a 3' UTR of 234 nt. The deduced amino acid sequence of ORF2 was verified by nano-LC-ESI-MS/MS experiments. Top-Down MALDI-TOF analysis revealed the N-terminal sequence of the CP. The N-terminal sequence represented a region within ORF2 suggesting a proteolytic processing of the CP in vivo. The CP showed similarities to CP of SmV A and viruses within the Tombusviridae family. Fragments of RNA1 (ca. 1.9 kb) and RNA2 (ca. 1.4 kb) were used to analyze the nucleotide sequence variation of virions in different P. halstedii isolates. Viral sequence variation was 0.3% or less regardless of their host's pathotypes, the geographical origin and the sensitivity towards the fungicide metalaxyl. Conclusions: The results showed the presence of a single and new virus type in

  2. Cloning, sequencing, and expression of cDNA for human β-glucuronidase

    SciTech Connect

    Oshima, A.; Kyle, J.W.; Miller, R.D.; Hoffmann, J.W.; Powell, P.P.; Grubb, J.H.; Sly, W.S.; Tropak, M.; Guise, K.S.; Gravel, R.A.

    1987-02-01

    The authors report here the cDNA sequence for human placental β-glucuronidase (β-D-glucuronoside glucuronosohydrolase, EC 3.2.1.31) and demonstrate expression of the human enzyme in transfected COS cells. They also sequenced a partial cDNA clone from human fibroblasts that contained a 153-base-pair deletion within the coding sequence and found a second type of cDNA clone from placenta that contained the same deletion. Nuclease S1 mapping studies demonstrated two types of mRNAs in human placenta that corresponded to the two types of cDNA clones isolated. The NH2-terminal amino acid sequence determined for human spleen β-glucuronidase agreed with that inferred from the DNA sequence of the two placental clones, beginning at amino acid 23, suggesting a cleaved signal sequence of 22 amino acids. When transfected into COS cells, plasmids containing either placental clone expressed an immunoprecipitable protein that contained N-linked oligosaccharides as evidenced by sensitivity to endoglycosidase F. However, only transfection with the clone containing the 153-base-pair segment led to expression of human β-glucuronidase activity. These studies provide the sequence for the full-length cDNA for human β-glucuronidase, demonstrate the existence of two populations of mRNA for β-glucuronidase in human placenta, only one of which specifies a catalytically active enzyme, and illustrate the importance of expression studies in verifying that a cDNA is functionally full-length.

  3. Cloning and sequencing of a dextranase-encoding cDNA from Penicillium minioluteum.

    PubMed

    Garcia, B; Margolles, E; Roca, H; Mateu, D; Raices, M; Gonzales, M E; Herrera, L; Delgado, J

    1996-10-01

    A cDNA from Penicillium minioluteum HI-4 encoding a dextranase (1,6-alpha-glucan hydrolase, EC 3.2.1.11) was isolated and characterized. cDNA clones corresponding to genes expressed in dextran-induced cultures were identified by differential hybridization. Southern hybridization and restriction mapping analysis of selected clones revealed four different groups of cDNAs. The dextranase cDNA was identified after expressing a cDNA fragment from each of the isolated groups of cDNA clones in the Escherichia coli T7 system. The expression of a 2 kb cDNA fragment in E. coli led to the production of a 67 kDa protein which was recognized by an anti-dextranase polyclonal antibody. The cDNA contains 2109 bp plus a poly(A) tail, coding for a protein of 608 amino acids, including 20 N-terminal amino acid residues which might correspond to a signal peptide. There was 29% sequence identity between the P. minioluteum dextranase and the dextranase from Arthrobacter sp. CB-8.

  4. Nucleotide sequencing and identification of some wild mushrooms.

    PubMed

    Das, Sudip Kumar; Mandal, Aninda; Datta, Animesh K; Gupta, Sudha; Paul, Rita; Saha, Aditi; Sengupta, Sonali; Dubey, Priyanka Kumari

    2013-01-01

    The rDNA-ITS (Ribosomal DNA Internal Transcribed Spacers) fragment of the genomic DNA of 8 wild edible mushrooms (collected from Eastern Chota Nagpur Plateau of West Bengal, India) was amplified using ITS1 (Internal Transcribed Spacers 1) and ITS2 primers and subjected to nucleotide sequence determination for identification of mushrooms as mentioned. The sequences were aligned using ClustalW software program. The aligned sequences revealed identity (homology percentage from GenBank data base) of Amanita hemibapha [CN (Chota Nagpur) 1, % identity 99 (JX844716.1)], Amanita sp. [CN 2, % identity 98 (JX844763.1)], Astraeus hygrometricus [CN 3, % identity 87 (FJ536664.1)], Termitomyces sp. [CN 4, % identity 90 (JF746992.1)], Termitomyces sp. [CN 5, % identity 99 (GU001667.1)], T. microcarpus [CN 6, % identity 82 (EF421077.1)], Termitomyces sp. [CN 7, % identity 76 (JF746993.1)], and Volvariella volvacea [CN 8, % identity 100 (JN086680.1)]. Although out of 8 mushrooms 4 could be identified up to species level, the nucleotide sequences of the rest may be relevant to further characterization. A phylogenetic tree is constructed using Neighbor-Joining method showing interrelationship between/among the mushrooms. The determined nucleotide sequences of the mushrooms may provide additional information enriching GenBank database aiding to molecular taxonomy and facilitating its domestication and characterization for human benefits.

  5. Nucleotide Sequencing and Identification of Some Wild Mushrooms

    PubMed Central

    Das, Sudip Kumar; Mandal, Aninda; Datta, Animesh K.; Gupta, Sudha; Paul, Rita; Saha, Aditi; Sengupta, Sonali; Dubey, Priyanka Kumari

    2013-01-01

    The rDNA-ITS (Ribosomal DNA Internal Transcribed Spacers) fragment of the genomic DNA of 8 wild edible mushrooms (collected from Eastern Chota Nagpur Plateau of West Bengal, India) was amplified using ITS1 (Internal Transcribed Spacers 1) and ITS2 primers and subjected to nucleotide sequence determination for identification of mushrooms as mentioned. The sequences were aligned using ClustalW software program. The aligned sequences revealed identity (homology percentage from GenBank data base) of Amanita hemibapha [CN (Chota Nagpur) 1, % identity 99 (JX844716.1)], Amanita sp. [CN 2, % identity 98 (JX844763.1)], Astraeus hygrometricus [CN 3, % identity 87 (FJ536664.1)], Termitomyces sp. [CN 4, % identity 90 (JF746992.1)], Termitomyces sp. [CN 5, % identity 99 (GU001667.1)], T. microcarpus [CN 6, % identity 82 (EF421077.1)], Termitomyces sp. [CN 7, % identity 76 (JF746993.1)], and Volvariella volvacea [CN 8, % identity 100 (JN086680.1)]. Although out of 8 mushrooms 4 could be identified up to species level, the nucleotide sequences of the rest may be relevant to further characterization. A phylogenetic tree is constructed using Neighbor-Joining method showing interrelationship between/among the mushrooms. The determined nucleotide sequences of the mushrooms may provide additional information enriching GenBank database aiding to molecular taxonomy and facilitating its domestication and characterization for human benefits. PMID:24489501

  6. Avian Retroviruses That Cause Carcinoma and Leukemia: Identification of Nucleotide Sequences Associated with Pathogenicity

    PubMed Central

    Sheiness, Diana; Bister, Klaus; Moscovici, Carlo; Fanshier, Lois; Gonda, Thomas; Bishop, J. Michael

    1980-01-01

    Avian myelocytomatosis virus (MC29V) is a retrovirus that transforms both fibroblasts and macrophages in culture and induces myelocytomatosis, carcinomas, and sarcomas in birds. Previous work identified a sequence of about 1,500 nucleotides (here denoted oncMCV) that apparently derived from a normal cellular sequence and that may encode the oncogenic capacity of MC29V. In an effort to further implicate oncMCV in tumorigenesis, we used molecular hybridization to examine the distribution of nucleotide sequences related to oncMCV among the genomes of various avian retroviruses. In addition, we characterized further the genetic composition of the remainder of the MC29V genome. Our work exploited the availability of radioactive DNAs (cDNA's) complementary to oncMCV (cDNAMCV) or to specific portions of the genome of avian sarcoma virus (ASV). We showed that genomic RNAs of avian erythroblastosis virus (AEV) and avian myeloblastosis virus (AMV) could not hybridize appreciably with cDNAMCV. By contrast, cDNAMCV hybridized extensively (about 75%) and with essentially complete fidelity to the genome of Mill Hill 2 virus (MH2V), whose pathogenicity is very similar to that of MC29V, but different from that of AEV or AMV. Hybridization with the ASV cDNA's demonstrated that the MC29V genome includes about half of the ASV envelope protein gene and that the remainder of the MC29V genome is closely related to nucleotide sequences that are shared among the genomes of many avian leukosis and sarcoma viruses. We conclude that oncMCV probably specifies the unique set of pathogenicities displayed by MC29V and MH2V, whereas the oncogenic potentials of AEV and AMV are presumably encoded by a distinct nucleotide sequence unrelated to oncMCV. The genomes of ASV, MC29V, and other avian oncoviruses thus share a set of common sequences, but apparently owe their various oncogenic potentials to unrelated transforming genes. Images PMID:6245277

  7. [Computer programs for the analysis of nucleotide sequences (MALK)].

    PubMed

    Mironov, A A; Aleksandrov, N N; Liunovskaia-Gurova, L V; Kister, A E

    1987-01-01

    A system for the computer analysis of nucleic acid and protein sequences ("Helix") is described. The DNA sequence format is EMBL-compatible, and sequences can easily be annotated through convenient menus. "Helix" also offers the following capabilities: effective alignment of gel reading data and assembly of the final sequence; simple construction of recombinant molecules "in calcular"; calculation of nucleotide and dinucleotide distributions along the sequence; searching for coding frames; calculation of codon and amino acid percentages in coding frames; searching for direct and inverted repeats; sequence alignment; protein secondary structure prediction; restriction mapping; DNA-to-protein translation. "Helix" also contains programs for RNA structure prediction, homology searches throughout the EMBL bank, selection of optimal probe sequences, and promoter searching. All the programs are written in FORTRAN-77 and automatically translated into FORTRAN-4. "Helix" requires only 64 kbytes.
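
    Three of the analyses listed above (nucleotide counts, dinucleotide counts, and a search for coding frames) can be sketched in a few lines. This is an illustrative Python sketch, not the FORTRAN code of "Helix"; the function names, the forward-strand-only ATG-to-stop definition of a coding frame, and the toy sequence are all assumptions made for the example.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}

def base_counts(seq):
    """Count single nucleotides in a DNA string (case-insensitive)."""
    return Counter(seq.upper())

def dinucleotide_counts(seq):
    """Count overlapping dinucleotides at positions (i, i+1)."""
    s = seq.upper()
    return Counter(s[i:i + 2] for i in range(len(s) - 1))

def find_orfs(seq, min_codons=2):
    """Return (start, end) pairs for ATG...stop frames on the forward strand.

    Coordinates are 0-based and end is exclusive (it includes the stop codon).
    min_codons counts the stop codon as well.
    """
    s = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(s) - 2, 3):
            codon = s[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # open a candidate frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None                   # close it at the first stop
    return orfs

# Toy sequence for illustration only; it is not taken from any record here.
demo = "CCATGGCTTAAGG"
print(base_counts(demo))
print(dinucleotide_counts(demo))
print(find_orfs(demo))
```

    A fuller tool would also scan the reverse complement and report codon usage per frame; those steps are left out to keep the sketch short.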

  8. Method for the detection of specific nucleic acid sequences by polymerase nucleotide incorporation

    DOEpatents

    Castro, Alonso

    2004-06-01

    A method for rapid and efficient detection of a target DNA or RNA sequence is provided. A primer having a 3'-hydroxyl group at one end and having a sequence of nucleotides sufficiently homologous with an identifying sequence of nucleotides in the target DNA is selected. The primer is hybridized to the identifying sequence of nucleotides on the DNA or RNA sequence and a reporter molecule is synthesized on the target sequence by progressively binding complementary nucleotides to the primer, where the complementary nucleotides include nucleotides labeled with a fluorophore. Fluorescence emitted by fluorophores on single reporter molecules is detected to identify the target DNA or RNA sequence.

  9. Molecular cloning and characterization of a new cDNA sequence encoding a venom peptide from the centipede Scolopendra subspinipes mutilans.

    PubMed

    Liu, Wanhong; Luo, Feng; He, Jing; Cao, Zhijian; Miao, Lixia

    2012-01-01

    Many studies have been performed on venomous peptides derived from animals. However, little of this research has focused on peptides from centipede venoms. Here, a venom gland cDNA library was successfully constructed for the centipede Scolopendra subspinipes mutilans. A new cDNA encoding the precursor of a venom peptide, named SsmTx, was cloned from this library. The full-length SsmTx cDNA sequence is 465 nt, including a 249 nt ORF, a 45 nt 5' UTR and a 171 nt 3' UTR. A polyadenylation signal (AATAAA) lies 31 nt upstream of the poly(A) tail. The precursor nucleotide sequence of SsmTx encodes a signal peptide of 25 residues and a mature peptide of 57 residues, which is bridged by two pairs of disulfide bonds. SsmTx displays a unique cysteine motif that is completely different from that of other venomous animal toxins. This is the first reported cDNA sequence encoding a venom peptide from the centipede S. subspinipes mutilans.
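
    The lengths quoted in this abstract can be cross-checked with simple arithmetic. The sketch below is only a consistency check on the reported figures; the variable names are illustrative, and the assumption that the 249 nt ORF includes its stop codon is ours, not a statement from the abstract.

```python
# Figures quoted in the SsmTx abstract above.
full_length_nt = 465
utr5_nt, orf_nt, utr3_nt = 45, 249, 171
signal_aa, mature_aa = 25, 57

# The three regions should tile the full-length cDNA.
regions_sum = utr5_nt + orf_nt + utr3_nt

# Assumption: the ORF length includes the terminal stop codon,
# so the encoded precursor has one residue fewer than there are codons.
codons = orf_nt // 3
precursor_aa = codons - 1

print(regions_sum == full_length_nt)            # regions add up to 465 nt
print(precursor_aa == signal_aa + mature_aa)    # 82 = 25 + 57
```

    Under that assumption the numbers agree: 45 + 249 + 171 = 465 nt, and 83 codons minus the stop give an 82-residue precursor.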

  10. Complete nucleotide sequences of Nipah virus isolates from Malaysia.

    PubMed

    Chan, Y P; Chua, K B; Koh, C L; Lim, M E; Lam, S K

    2001-09-01

    We have completely sequenced the genomes of two Nipah virus (NiV) isolates, one from the throat secretion and the other from the cerebrospinal fluid (CSF) of the sole surviving encephalitic patient with positive CSF virus isolation in Malaysia. The two genomes have 18246 nucleotides each and differ by only 4 nucleotides. The NiV genome is 12 nucleotides longer than the Hendra virus (HeV) genome and both genomes have identical leader and trailer sequence lengths and hexamer-phasing positions for all their genes. Both NiV and HeV are also very closely related with respect to their genomic end sequences, gene start and stop signals, P gene-editing signals and deduced amino acid sequences of nucleocapsid protein, phosphoprotein, matrix protein, fusion protein, glycoprotein and RNA polymerase. The existing evidence demonstrates a clear need for the creation of a new genus within the subfamily Paramyxovirinae to accommodate the close similarities between NiV and HeV and their significant differences from other members of the subfamily.
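
    The genome lengths in this abstract lend themselves to a quick check. The sketch below derives the Hendra virus length from the stated 12-nucleotide difference and tests both lengths against the "rule of six" (paramyxovirus genome lengths being multiples of six), which is general background and is not stated in the abstract itself. Names are illustrative.

```python
# Figures quoted in the abstract above.
niv_genome_nt = 18246          # each Nipah virus isolate
niv_minus_hev_nt = 12          # NiV is 12 nt longer than HeV
isolate_differences_nt = 4     # differences between the two NiV isolates

hev_genome_nt = niv_genome_nt - niv_minus_hev_nt

def obeys_rule_of_six(length_nt):
    """True when the genome length is an exact multiple of six."""
    return length_nt % 6 == 0

# Divergence between the two isolates, as a percentage of genome length.
percent_divergence = 100 * isolate_differences_nt / niv_genome_nt

print(hev_genome_nt)
print(obeys_rule_of_six(niv_genome_nt), obeys_rule_of_six(hev_genome_nt))
print(round(percent_divergence, 3))
```

    A 12 nt difference is itself a multiple of six, which is consistent with the two genomes sharing hexamer-phasing positions for their genes.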

  11. Acetylcholinesterase of the sand fly, Phlebotomus papatasi (Scopoli): cDNA sequence, baculovirus expression, and biochemical properties.

    PubMed

    Temeyer, Kevin B; Brake, Danett K; Tuckow, Alexander P; Li, Andrew Y; Pérez de León, Adalberto A

    2013-02-04

    Millions of people and domestic animals around the world are affected by leishmaniasis, a disease caused by various species of flagellated protozoans in the genus Leishmania that are transmitted by several sand fly species. Insecticides are widely used for sand fly population control to try to reduce or interrupt Leishmania transmission. Zoonotic cutaneous leishmaniasis caused by L. major is vectored mainly by Phlebotomus papatasi (Scopoli) in Asia and Africa. Organophosphates comprise a class of insecticides used for sand fly control, which act through the inhibition of acetylcholinesterase (AChE) in the central nervous system. Point mutations producing an altered, insensitive AChE are a major mechanism of organophosphate resistance in insects and preliminary evidence for organophosphate-insensitive AChE has been reported in sand flies. This report describes the identification of complementary DNA for an AChE in P. papatasi and the biochemical characterization of recombinant P. papatasi AChE. A P. papatasi Israeli strain laboratory colony was utilized to prepare total RNA utilized as template for RT-PCR amplification and sequencing of cDNA encoding acetylcholinesterase 1 using gene specific primers and 3'-5'-RACE. The cDNA was cloned into pBlueBac4.5/V5-His TOPO, and expressed by baculovirus in Sf21 insect cells in serum-free medium. Recombinant P. papatasi acetylcholinesterase was biochemically characterized using a modified Ellman's assay in microplates. A 2309 nucleotide sequence of PpAChE1 cDNA [GenBank: JQ922267] of P. papatasi from a laboratory colony susceptible to insecticides is reported with 73-83% nucleotide identity to acetylcholinesterase mRNA sequences of Culex tritaeniorhynchus and Lutzomyia longipalpis, respectively. The P. papatasi cDNA ORF encoded a 710-amino acid protein [GenBank: AFP20868] exhibiting 85% amino acid identity with acetylcholinesterases of Cx. pipiens, Aedes aegypti, and 92% amino acid identity for L. longipalpis. Recombinant P

  12. Acetylcholinesterase of the sand fly, Phlebotomus papatasi (Scopoli): cDNA sequence, baculovirus expression, and biochemical properties

    PubMed Central

    2013-01-01

    Background Millions of people and domestic animals around the world are affected by leishmaniasis, a disease caused by various species of flagellated protozoans in the genus Leishmania that are transmitted by several sand fly species. Insecticides are widely used for sand fly population control to try to reduce or interrupt Leishmania transmission. Zoonotic cutaneous leishmaniasis caused by L. major is vectored mainly by Phlebotomus papatasi (Scopoli) in Asia and Africa. Organophosphates comprise a class of insecticides used for sand fly control, which act through the inhibition of acetylcholinesterase (AChE) in the central nervous system. Point mutations producing an altered, insensitive AChE are a major mechanism of organophosphate resistance in insects and preliminary evidence for organophosphate-insensitive AChE has been reported in sand flies. This report describes the identification of complementary DNA for an AChE in P. papatasi and the biochemical characterization of recombinant P. papatasi AChE. Methods A P. papatasi Israeli strain laboratory colony was utilized to prepare total RNA utilized as template for RT-PCR amplification and sequencing of cDNA encoding acetylcholinesterase 1 using gene specific primers and 3’-5’-RACE. The cDNA was cloned into pBlueBac4.5/V5-His TOPO, and expressed by baculovirus in Sf21 insect cells in serum-free medium. Recombinant P. papatasi acetylcholinesterase was biochemically characterized using a modified Ellman’s assay in microplates. Results A 2309 nucleotide sequence of PpAChE1 cDNA [GenBank: JQ922267] of P. papatasi from a laboratory colony susceptible to insecticides is reported with 73-83% nucleotide identity to acetylcholinesterase mRNA sequences of Culex tritaeniorhynchus and Lutzomyia longipalpis, respectively. The P. papatasi cDNA ORF encoded a 710-amino acid protein [GenBank: AFP20868] exhibiting 85% amino acid identity with acetylcholinesterases of Cx. pipiens, Aedes aegypti, and 92% amino acid identity for

  13. [cDNA cloning and sequence analysis of pluripotency genes in tree shrews (Tupaia belangeri)].

    PubMed

    Wang, Cai-Yun; Ma, Yun-Han; He, Da-Jian; Yang, Shi-Hua

    2013-04-01

    In this paper, partial sequences of the tree shrew (Tupaia belangeri) Klf4, Sox2, and c-Myc genes were cloned and sequenced; they were 382, 612, and 485 bp in length and encoded 127, 204, and 161 amino acids, respectively. Their cDNA sequence identities with the corresponding human sequences were 89%, 98%, and 89%, respectively. Phylogenetic trees built for the three genes showed different topologies, suggesting individual evolutionary pathways. These results can facilitate further functional studies.
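
    The fragment lengths and residue counts above are related by whole-codon division. The sketch below assumes each partial sequence is read in frame from its first base (an assumption; the abstract does not give the reading frame) and reports any leftover bases.

```python
# Partial cDNA lengths (bp) and encoded residues quoted in the abstract above.
partial_clones = {"Klf4": (382, 127), "Sox2": (612, 204), "c-Myc": (485, 161)}

def complete_codons(length_bp):
    """Number of whole codons in a fragment read from its first base."""
    return length_bp // 3

for gene, (length_bp, reported_aa) in partial_clones.items():
    leftover_bp = length_bp % 3   # bases that do not complete a codon
    print(gene, complete_codons(length_bp), reported_aa, leftover_bp)
```

    Only the Sox2 fragment is an exact multiple of three; the other two leave one or two bases over, as expected for partial sequences.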

  14. Complete nucleotide sequence and genome organization of bovine parvovirus.

    PubMed Central

    Chen, K C; Shull, B C; Moses, E A; Lederman, M; Stout, E R; Bates, R C

    1986-01-01

    We determined the complete nucleotide sequence of bovine parvovirus (BPV), an autonomous parvovirus. The sequence is 5,491 nucleotides long. The terminal regions contain nonidentical imperfect palindromic sequences of 150 and 121 nucleotides. In the plus strand, there are three large open reading frames (left ORF, mid ORF, and right ORF) with coding capacities of 729, 255, and 685 amino acids, respectively. As with all parvoviruses studied to date, the left ORF of BPV codes for the nonstructural protein NS-1 and the right ORF codes for the major parts of the three capsid proteins. The mid ORF probably encodes the major part of the nonstructural protein NP-1. There are promoterlike sequences at map units 4.5, 12.8, and 38.7 and polyadenylation signals at map units 61.6, 64.6, and 98.5. BPV has little DNA homology with the defective parvovirus AAV, with the human autonomous parvovirus B19, or with the other autonomous parvoviruses sequenced (canine parvovirus, feline panleukopenia virus, H-1, and minute virus of mice). Even though the overall DNA homology of BPV with other parvoviruses is low, several small regions of high homology are observed when the amino acid sequences encoded by the left and right ORFs are compared. From these comparisons, it can be shown that the evolutionary relationship among the parvoviruses is B19 ↔ AAV ↔ BPV ↔ MVM. The highly conserved amino acid sequences observed among all parvoviruses may be useful in the identification and detection of parvoviruses and in the design of a general parvovirus vaccine. PMID:3783814

  15. cDNA sequence of human transforming gene hst and identification of the coding sequence required for transforming activity

    SciTech Connect

    Taira, M.; Yoshida, T.; Miyagawa, K.; Sakamoto, H.; Terada, M.; Sugimura, T.

    1987-05-01

    The hst gene was originally identified as a transforming gene in DNAs from human stomach cancers and from a noncancerous portion of stomach mucosa by DNA-mediated transfection assay using NIH3T3 cells. cDNA clones of hst were isolated from the cDNA library constructed from poly(A)+ RNA of a secondary transformant induced by the DNA from a stomach cancer. The sequence analysis of the hst cDNA revealed the presence of two open reading frames. When this cDNA was inserted into an expression vector containing the simian virus 40 promoter, it efficiently induced the transformation of NIH3T3 cells upon transfection. It was found that one of the reading frames, which coded for 206 amino acids, was responsible for the transforming activity.

  16. Nucleotide-Specific Contrast for DNA Sequencing by Electron Spectroscopy

    PubMed Central

    Schmid, Andreas K.; Davis, Ronald W.

    2016-01-01

    DNA sequencing by imaging in an electron microscope is an approach that holds promise to deliver long reads with low error rates and without the need for amplification. Earlier work using transmission electron microscopes, which use high electron energies on the order of 100 keV, has shown that low contrast and radiation damage necessitate the use of heavy atom labeling of individual nucleotides, which increases the read error rates. Other prior work has shown that using scattering electrons with much lower energy suppresses beam damage to DNA. Here we explore possibilities to increase contrast by employing two methods, X-ray photoelectron and Auger electron spectroscopy. Using bulk DNA samples with monomers of each base, both methods are shown to provide contrast mechanisms that can distinguish individual nucleotides without labels. Both spectroscopic techniques can be readily implemented in a low energy electron microscope, which may enable label-free DNA sequencing by direct imaging. PMID:27149617

  17. Nucleotide-Specific Contrast for DNA Sequencing by Electron Spectroscopy.

    PubMed

    Mankos, Marian; Persson, Henrik H J; N'Diaye, Alpha T; Shadman, Khashayar; Schmid, Andreas K; Davis, Ronald W

    2016-01-01

    DNA sequencing by imaging in an electron microscope is an approach that holds promise to deliver long reads with low error rates and without the need for amplification. Earlier work using transmission electron microscopes, which use high electron energies on the order of 100 keV, has shown that low contrast and radiation damage necessitate the use of heavy atom labeling of individual nucleotides, which increases the read error rates. Other prior work has shown that using scattering electrons with much lower energy suppresses beam damage to DNA. Here we explore possibilities to increase contrast by employing two methods, X-ray photoelectron and Auger electron spectroscopy. Using bulk DNA samples with monomers of each base, both methods are shown to provide contrast mechanisms that can distinguish individual nucleotides without labels. Both spectroscopic techniques can be readily implemented in a low energy electron microscope, which may enable label-free DNA sequencing by direct imaging.

  18. Cloning and sequence analysis of Hemonchus contortus HC58cDNA.

    PubMed

    Muleke, Charles I; Ruofeng, Yan; Lixin, Xu; Xinwen, Bo; Xiangrui, Li

    2007-06-01

    The complete coding sequence of Hemonchus contortus HC58cDNA was generated by rapid amplification of cDNA ends and polymerase chain reaction using primers based on the 5' and 3' ends of the parasite mRNA, accession no. AF305964. The HC58cDNA gene was 851 bp long, with an open reading frame of 717 bp encoding a precursor of 239 amino acids, corresponding to a protein of approximately 27 kDa. Analysis of the amino acid sequence revealed conserved cysteine, histidine, and asparagine residues, the occluding loop pattern, the hemoglobinase motif, and the glutamine of the oxyanion hole characteristic of cathepsin B-like proteases (CBL). Comparison of the predicted amino acid sequences showed that the protein shared 33.5-58.7% identity with cathepsin B homologues in the papain clan CA family (family C1). Phylogenetic analysis revealed close evolutionary proximity of the protein sequence to counterpart sequences in the CBL, suggesting that HC58cDNA was a member of the papain family.

  19. The nucleotide sequence of the human beta-globin gene.

    PubMed

    Lawn, R M; Efstratiadis, A; O'Connell, C; Maniatis, T

    1980-10-01

    We report the complete nucleotide sequence of the human beta-globin gene. The purpose of this study is to obtain information necessary to study the evolutionary relationships between members of the human beta-like globin gene family and to provide the basis for comparing normal beta-globin genes with those obtained from the DNA of individuals with genetic defects in hemoglobin expression.

  20. Cloning and sequencing of human intestinal alkaline phosphatase cDNA

    SciTech Connect

    Berger, J.; Garattini, E.; Hua, J.C.; Udenfriend, S.

    1987-02-01

    Partial protein sequence data obtained on intestinal alkaline phosphatase indicated a high degree of homology with the reported sequence of the placental isoenzyme. Accordingly, placental alkaline phosphatase cDNA was cloned and used as a probe to clone intestinal alkaline phosphatase cDNA. The latter is somewhat larger (3.1 kilobases) than the cDNA for the placental isozyme (2.8 kilobases). Although the 3' untranslated regions are quite different, there is almost 90% homology in the translated regions of the two isozymes. There are, however, significant differences at their amino and carboxyl termini and a substitution of an alanine in intestinal alkaline phosphatase for a glycine in the active site of the placental isozyme.

  1. Sequence and characterization of cDNA encoding the motilin precursor from chicken, dog, cow and horse. Evidence of mosaic evolution in prepromotilin.

    PubMed

    Huang, Z; Depoortere, I; De Clercq, P; Peeters, T

    1999-11-15

    Motilin is involved in the regulation of the fasting motility pattern in man and in dog, but may have a different role in other species. Immunoreactive motilin has been demonstrated in several species, but the sequence is mostly unknown. The aim of this study was to isolate and sequence the cDNA encoding the motilin precursor from several mammalian species and from chicken. Total RNA was isolated from the duodenal mucosa of the chicken, dog, cow and horse. In each case single stranded cDNA was synthesized. Motilin cDNA fragments were amplified by PCR, ligated into a plasmid and cloned. Clones which were positive after screening with an appropriate 32P-labeled probe were sequenced. The 5'- and 3'-ends were determined by the rapid amplification of cDNA ends (RACE) method. Analysis of the cDNAs revealed an open reading frame coding for 115 (chicken and cow), or 117 (dog and horse) amino acids. It consists of a 25 amino acid signal peptide, motilin itself, and a 68 (chicken and cow) or 70 (dog and horse) amino acid motilin associated peptide (MAP). As in all motilin precursors already sequenced (man, monkey, pig and rabbit), an endoproteinase cleavage site is present at Lys23-Lys24. Comparison of all known sequences shows considerable identity in amino acid and nucleotide sequence of the signal peptide and motilin. However, the MAPs differ not only in length but also, more strongly, in amino acid and nucleotide sequence. Our study demonstrates that the N- and C-terminal regions of the motilin precursor have evolved at different rates, which is evidence for 'mosaic evolution'.
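
    The precursor lengths reported above can be reassembled from their parts. The abstract does not state the length of motilin itself; the sketch below assumes the commonly cited 22 residues, which also fits a cleavage site at positions 23-24. All names are illustrative.

```python
SIGNAL_AA = 25    # signal peptide length, from the abstract
MOTILIN_AA = 22   # assumption: not stated in the abstract

# Motilin associated peptide (MAP) lengths and reported precursor lengths.
map_aa = {"chicken": 68, "cow": 68, "dog": 70, "horse": 70}
reported_precursor_aa = {"chicken": 115, "cow": 115, "dog": 117, "horse": 117}

def precursor_length(species):
    """Signal peptide + motilin + MAP for the given species."""
    return SIGNAL_AA + MOTILIN_AA + map_aa[species]

for species in map_aa:
    print(species, precursor_length(species), reported_precursor_aa[species])
```

    With that assumption the sums reproduce the reported 115 and 117 residues, so the two-residue difference between species groups sits entirely in the MAP.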

  2. Contamination of cDNA libraries and expressed sequence-tags databases

    SciTech Connect

    Dean, M.; Allikmets, R.

    1995-11-01

    Partially sequenced cDNAs, or expressed sequence tags (ESTs), are claimed to represent an efficient strategy for characterizing an organism's genes. By necessity, these sequences are incompletely characterized, and examples of contamination of cDNA libraries with sequences from other species have been described. It has been suggested that a Human T-cell cDNA library (Clontech HL1963g) is contaminated by sequences from yeast (Saccharomyces cerevisiae) and an unknown bacterium. We are characterizing human ESTs that represent new members of the ATP-binding cassette transporter super-family. In examining human ESTs generated from the T-cell library, we have encountered one gene that was in fact a yeast sequence (Genbank Z15214 = SSH2 locus) and several genes that do not hybridize to human DNA or RNA. PCR primers from these sequences failed to amplify a product from human, yeast, or Escherichia coli DNA but did produce a product from a Clontech kidney cDNA library (HL1123a). To determine the source of the contamination, we amplified a conserved segment of the 16S rDNA (following a suggestion from Dr. C. Savakis) from the kidney library. The sequence of this product was nearly identical to that of the bacterium Leuconostoc lactis (300 of 304 bp). Leuconostoc species are commonly found in dairy products, fruits, vegetables, and wine and are nonpathogenic to humans. 6 refs., 1 fig.
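
    The "300 of 304 bp" match above corresponds to roughly 98.7% identity. The sketch below shows that arithmetic together with a minimal percent-identity function for two pre-aligned, equal-length sequences. The function and the toy strings are assumptions for illustration; the abstract does not say how the comparison was computed.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of positions with identical characters in two aligned strings.

    Assumes the inputs are already aligned and of equal length; gaps ('-')
    simply count as mismatches unless both strings have one at that position.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y)
    return 100 * matches / len(aligned_a)

# Figure quoted in the abstract: 300 matching positions out of 304.
reported_identity = 100 * 300 / 304
print(round(reported_identity, 1))

# Toy example, not real 16S rDNA sequence.
print(percent_identity("ACGTACGT", "ACGTACGA"))
```

    Real comparisons would first align the sequences with a dedicated tool; this function only scores an alignment that already exists.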

  3. Analysis of a cDNA clone expressing a human autoimmune antigen: full-length sequence of the U2 small nuclear RNA-associated B antigen

    SciTech Connect

    Habets, W.J.; Sillekens, P.T.G.; Hoet, M.H.; Schalken, J.A.; Roebroek, A.J.M.; Leunissen, J.A.M.; Van de Ven, W.J.M.; Van Venrooij, W.J.

    1987-04-01

    A U2 small nuclear RNA-associated protein, designated B'', was recently identified as the target antigen for autoimmune sera from certain patients with systemic lupus erythematosus and other rheumatic diseases. Such antibodies enabled them to isolate cDNA clone lambdaHB''-1 from a phage lambdagt11 expression library. This clone appeared to code for the B'' protein as established by in vitro translation of hybrid-selected mRNA. The identity of clone lambdaHB''-1 was further confirmed by partial peptide mapping and analysis of the reactivity of the recombinant antigen with monospecific and monoclonal antibodies. Analysis of the nucleotide sequence of the 1015-base-pair cDNA insert of clone lambdaHB''-1 revealed a large open reading frame of 800 nucleotides containing the coding sequence for a polypeptide of 25,457 daltons. In vitro transcription of the lambdaHB''-1 cDNA insert and subsequent translation resulted in a protein product with the molecular size of the B'' protein. These data demonstrate that clone lambdaHB''-1 contains the complete coding sequence of this antigen. The deduced polypeptide sequence contains three very hydrophilic regions that might constitute RNA binding sites and/or antigenic determinants. These findings might have implications both for the understanding of the pathogenesis of rheumatic diseases as well as for the elucidation of the biological function of autoimmune antigens.

  4. The complete nucleotide sequence of pelargonium leaf curl virus.

    PubMed

    McGavin, Wendy J; MacFarlane, Stuart A

    2016-05-01

    Investigation of a tombusvirus isolated from tulip plants in Scotland revealed that it was pelargonium leaf curl virus (PLCV) rather than the originally suggested tomato bushy stunt virus. The complete sequence of the PLCV genome was determined for the first time, revealing it to be 4789 nucleotides in size and to have an organization similar to that of the other, previously described tombusviruses. Primers derived from the sequence were used to construct a full-length infectious clone of PLCV that recapitulates the disease symptoms of leaf curling in systemically infected pelargonium plants.

  5. Analysis of expressed sequence tags generated from full-length enriched cDNA libraries of melon.

    PubMed

    Clepet, Christian; Joobeur, Tarek; Zheng, Yi; Jublot, Delphine; Huang, Mingyun; Truniger, Veronica; Boualem, Adnane; Hernandez-Gonzalez, Maria Elena; Dolcet-Sanjuan, Ramon; Portnoy, Vitaly; Mascarell-Creus, Albert; Caño-Delgado, Ana I; Katzir, Nurit; Bendahmane, Abdelhafid; Giovannoni, James J; Aranda, Miguel A; Garcia-Mas, Jordi; Fei, Zhangjun

    2011-05-20

    Melon (Cucumis melo), an economically important vegetable crop, belongs to the Cucurbitaceae family which includes several other important crops such as watermelon, cucumber, and pumpkin. It has served as a model system for sex determination and vascular biology studies. However, genomic resources currently available for melon are limited. We constructed eleven full-length enriched and four standard cDNA libraries from fruits, flowers, leaves, roots, cotyledons, and calluses of four different melon genotypes, and generated 71,577 and 22,179 ESTs from full-length enriched and standard cDNA libraries, respectively. These ESTs, together with ~35,000 ESTs available in public domains, were assembled into 24,444 unigenes, which were extensively annotated by comparing their sequences to different protein and functional domain databases, assigning them Gene Ontology (GO) terms, and mapping them onto metabolic pathways. Comparative analysis of melon unigenes and other plant genomes revealed that 75% to 85% of melon unigenes had homologs in other dicot plants, while approximately 70% had homologs in monocot plants. The analysis also identified 6,972 gene families that were conserved across dicot and monocot plants, and 181, 1,192, and 220 gene families specific to fleshy fruit-bearing plants, the Cucurbitaceae family, and melon, respectively. Digital expression analysis identified a total of 175 tissue-specific genes, which provides a valuable gene sequence resource for future genomics and functional studies. Furthermore, we identified 4,068 simple sequence repeats (SSRs) and 3,073 single nucleotide polymorphisms (SNPs) in the melon EST collection. Finally, we obtained a total of 1,382 melon full-length transcripts through the analysis of full-length enriched cDNA clones that were sequenced from both ends. Analysis of these full-length transcripts indicated that sizes of melon 5' and 3' UTRs were similar to those of tomato, but longer than many other dicot plants. Codon

  6. Analysis of expressed sequence tags generated from full-length enriched cDNA libraries of melon

    PubMed Central

    2011-01-01

    Background: Melon (Cucumis melo), an economically important vegetable crop, belongs to the Cucurbitaceae family which includes several other important crops such as watermelon, cucumber, and pumpkin. It has served as a model system for sex determination and vascular biology studies. However, genomic resources currently available for melon are limited. Results: We constructed eleven full-length enriched and four standard cDNA libraries from fruits, flowers, leaves, roots, cotyledons, and calluses of four different melon genotypes, and generated 71,577 and 22,179 ESTs from full-length enriched and standard cDNA libraries, respectively. These ESTs, together with ~35,000 ESTs available in public domains, were assembled into 24,444 unigenes, which were extensively annotated by comparing their sequences to different protein and functional domain databases, assigning them Gene Ontology (GO) terms, and mapping them onto metabolic pathways. Comparative analysis of melon unigenes and other plant genomes revealed that 75% to 85% of melon unigenes had homologs in other dicot plants, while approximately 70% had homologs in monocot plants. The analysis also identified 6,972 gene families that were conserved across dicot and monocot plants, and 181, 1,192, and 220 gene families specific to fleshy fruit-bearing plants, the Cucurbitaceae family, and melon, respectively. Digital expression analysis identified a total of 175 tissue-specific genes, which provides a valuable gene sequence resource for future genomics and functional studies. Furthermore, we identified 4,068 simple sequence repeats (SSRs) and 3,073 single nucleotide polymorphisms (SNPs) in the melon EST collection. Finally, we obtained a total of 1,382 melon full-length transcripts through the analysis of full-length enriched cDNA clones that were sequenced from both ends. Analysis of these full-length transcripts indicated that sizes of melon 5' and 3' UTRs were similar to those of tomato, but longer than many other dicot

  7. Rapid Amplification of cDNA Ends for RNA Transcript Sequencing in Staphylococcus.

    PubMed

    Miller, Eric

    2016-01-01

    Rapid amplification of cDNA ends (RACE) is a technique that was developed to swiftly and efficiently amplify full-length RNA molecules in which the terminal ends have not been characterized. Current usage of this procedure has been more focused on sequencing and characterizing RNA 5' and 3' untranslated regions. Herein is described an adapted RACE protocol to amplify bacterial RNA transcripts.

  8. Molecular Cloning and Sequencing of Channel Catfish, Ictalurus punctatus, Cathepsin H and L cDNA

    USDA-ARS's Scientific Manuscript database

    Cathepsins H and L, lysosomal cysteine endopeptidases of the papain family, are ubiquitously expressed and involved in antigen processing. In this communication, the channel catfish cathepsin H and L transcripts were sequenced and analyzed. Total RNA from tissues was extracted and cDNA libraries we...

  9. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2011-07-01 2011-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide and...

  10. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide and...

  11. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2013-07-01 2013-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide and...

  12. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2012-07-01 2012-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide and...

  13. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2014-07-01 2014-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide and...

  14. Identification of repeats in DNA sequences using nucleotide distribution uniformity.

    PubMed

    Yin, Changchuan

    2017-01-07

    Repetitive elements are important in genomic structures, functions and regulations, yet effective methods for precisely identifying repetitive elements in DNA sequences are not fully accessible, and the relationship between repetitive elements and periodicities of genomes is not clearly understood. We present an ab initio method to quantitatively detect repetitive elements and infer the consensus repeat pattern in repetitive elements. The method uses the measure of the distribution uniformity of nucleotides at periodic positions in DNA sequences or genomes. It can identify periodicities, consensus repeat patterns, copy numbers and perfect levels of repetitive elements. The results of using the method on different DNA sequences and genomes demonstrate efficacy and accuracy in identifying repeat patterns and periodicities. The complexity of the method is linear with respect to the lengths of the analyzed sequences. The Python programs in this study are freely available to the public upon request or at https://github.com/cyinbox/DNADU.
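
    The idea of scoring how unevenly nucleotides fall on periodic positions can be illustrated with a minimal sketch. It is not the DNADU implementation: the function names, the variance-based score and the length normalisation are assumptions chosen for illustration only. For a candidate period p, the sketch counts each base at positions sharing the same phase (index modulo p) and sums the variance of those counts; a sequence with a true period p tends to score highest at p.

```python
from collections import Counter


def periodic_nonuniformity(seq: str, period: int) -> float:
    """Score how unevenly A/C/G/T are spread over the `period` phases.

    Illustrative measure only (an assumption, not the published statistic):
    the per-base variance of phase counts, summed over bases and divided
    by the sequence length.
    """
    seq = seq.upper()
    n = len(seq)
    if period < 1 or period > n:
        raise ValueError("period must be between 1 and len(seq)")
    # counts[phase][base] = number of times `base` occurs at index i
    # with i % period == phase
    counts = [Counter() for _ in range(period)]
    for i, base in enumerate(seq):
        counts[i % period][base] += 1
    score = 0.0
    for base in "ACGT":
        per_phase = [c[base] for c in counts]
        mean = sum(per_phase) / period
        score += sum((x - mean) ** 2 for x in per_phase) / period
    return score / n


def best_period(seq: str, max_period: int) -> int:
    """Return the candidate period (1..max_period) with the highest score."""
    return max(range(1, max_period + 1),
               key=lambda p: periodic_nonuniformity(seq, p))


if __name__ == "__main__":
    repeat = "ACG" * 20
    print(best_period(repeat, 10))
```

    Each candidate period needs one pass over the sequence, so scanning a fixed range of periods stays linear in the sequence length, which is in line with the complexity stated in the abstract.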

  15. Structure and nucleotide sequence of the rat intestinal vitamin D-dependent calcium binding protein gene.

    PubMed Central

    Krisinger, J; Darwish, H; Maeda, N; DeLuca, H F

    1988-01-01

    The vitamin D-dependent intestinal calcium binding protein (ICaBP, 9 kDa) is under transcriptional regulation by 1,25-dihydroxyvitamin D3 [1,25-(OH)2D3], the hormonally active form of the vitamin. To study the mechanism of gene regulation by 1,25-(OH)2D3, we isolated the rat ICaBP gene by using a cDNA probe. Its nucleotide sequence revealed 3 exons separated by 2 introns within approximately 3 kilobases. The first exon represents only noncoding sequences, while the second and third encode the two calcium binding domains of the protein. The gene contains a 15-base-pair imperfect palindrome in the first intron that shows high homology to the estrogen-responsive element. This sequence may represent the vitamin D-responsive element involved in the regulation of the ICaBP gene. The second intron shows an 84-base-pair-long simple nucleotide repeat that implicates Z-DNA formation. Genomic Southern analysis shows that the rat gene is represented as a single copy. Images PMID:3194402

  16. Differential direct coding: a compression algorithm for nucleotide sequence data

    PubMed Central

    Vey, Gregory

    2009-01-01

    While modern hardware can provide vast amounts of inexpensive storage for biological databases, the compression of nucleotide sequence data is still of paramount importance in order to facilitate fast search and retrieval operations through a reduction in disk traffic. This issue becomes even more important in light of the recent increase of very large data sets, such as metagenomes. In this article, I propose the Differential Direct Coding algorithm, a general-purpose nucleotide compression protocol that can differentiate between sequence data and auxiliary data by supporting the inclusion of supplementary symbols that are not members of the set of expected nucleotide bases, thereby offering reconciliation between sequence-specific and general-purpose compression strategies. This algorithm permits a sequence to contain a rich lexicon of auxiliary symbols that can represent wildcards, annotation data and special subsequences, such as functional domains or special repeats. In particular, the representation of special subsequences can be incorporated to provide structure-based coding that increases the overall degree of compression. Moreover, supporting a robust set of symbols removes the requirement of wildcard elimination and restoration phases, resulting in a complexity of O(n) for execution time, making this algorithm suitable for very large data sets. Because this algorithm compresses data on the basis of triplets, it is highly amenable to interpretation as a polypeptide at decompression time. Also, an encoded sequence may be further compressed using other existing algorithms, like gzip, thereby maximizing the final degree of compression. Overall, the Differential Direct Coding algorithm can offer a beneficial impact on disk traffic for database queries and other disk-intensive operations. PMID:20157486
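
    The triplet-based coding with auxiliary symbols described above can be sketched in a few lines. This is a hypothetical layout, not the published Differential Direct Coding byte format: the names `TRIPLET_TO_BYTE`, `AUX_OFFSET`, `encode` and `decode`, and the choice of byte ranges, are assumptions made for illustration. Each A/C/G/T triplet is stored in one byte (values 0 to 63), and any other character (a wildcard such as N, or a base that does not complete a triplet) is stored as a single byte in a separate range, so no wildcard elimination pass is needed.

```python
BASES = "ACGT"

# Hypothetical layout: the 64 possible triplets use byte values 0..63.
TRIPLET_TO_BYTE = {
    a + b + c: 16 * i + 4 * j + k
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
BYTE_TO_TRIPLET = {code: triplet for triplet, code in TRIPLET_TO_BYTE.items()}

# Hypothetical layout: any other ASCII symbol is stored as 128 + its code.
AUX_OFFSET = 128


def encode(seq: str) -> bytes:
    """Pack A/C/G/T triplets into one byte each; keep other symbols verbatim."""
    out = bytearray()
    i = 0
    while i < len(seq):
        code = TRIPLET_TO_BYTE.get(seq[i:i + 3])
        if code is not None:
            out.append(code)
            i += 3
        else:
            ch = seq[i]
            if ord(ch) >= 128:
                raise ValueError("only ASCII symbols are supported")
            out.append(AUX_OFFSET + ord(ch))
            i += 1
    return bytes(out)


def decode(data: bytes) -> str:
    """Invert encode(): expand triplet bytes and restore auxiliary symbols."""
    parts = []
    for b in data:
        if b < 64:
            parts.append(BYTE_TO_TRIPLET[b])
        elif b >= AUX_OFFSET:
            parts.append(chr(b - AUX_OFFSET))
        else:
            raise ValueError("byte value outside the ranges used by this sketch")
    return "".join(parts)


if __name__ == "__main__":
    sample = "ACGTNNACG"
    packed = encode(sample)
    print(len(sample), len(packed), decode(packed) == sample)
```

    Both functions make a single pass over their input, which matches the O(n) behaviour claimed in the abstract, and because one byte corresponds to one triplet, a decoder could translate codons to amino acids directly. The packed bytes can still be passed to a general-purpose compressor such as gzip.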

  17. Identification of genomic sequences corresponding to cDNA clones

    SciTech Connect

    Spoerel, N.A.; Kafatos, F.C.

    1987-01-01

    The general methods applicable to the isolation of genomic sequences from phage lambda or cosmid libraries have been described. This chapter presents strategies for the investigation of genes that occur in several identical or nonidentical copies per genome, or that share a common conserved domain with other genes. The methods discussed are applicable both to the identification of the genes in Southern blots and to their isolation from libraries. Furthermore, the methods are well suited for the analysis of homologous genes in different species. A high proportion of genes in eukaryotes are known to be members of multigene families. Carefully controlled hybridization conditions and well-tailored probes are powerful tools in the isolation and analysis of genes which share a common domain or are members of multigene families. This chapter consists of a short review of recommended strategies and relevant parameters, which have been discussed in more detail earlier. Using three examples from the authors' analysis of the silk moth chorion locus, they demonstrate how powerful carefully tailored short single-stranded probes can be in the analysis of closely related gene copies.

  18. Sequencing of cDNA Clones from the Genetic Map of Tomato (Lycopersicon esculentum)

    PubMed Central

    Ganal, Martin W.; Czihal, Rosemarie; Hannappel, Ulrich; Kloos, Dorothee-U.; Polley, Andreas; Ling, Hong-Qing

    1998-01-01

    The dense RFLP linkage map of tomato (Lycopersicon esculentum) contains >300 anonymous cDNA clones. Of those clones, 272 were partially or completely sequenced. The sequences were compared at the DNA and protein level to known genes in databases. For 57% of the clones, a significant match to previously described genes was found. The information will permit the conversion of those markers to STS markers and allow their use in PCR-based mapping experiments. Furthermore, it will facilitate the comparative mapping of genes across distantly related plant species by direct comparison of DNA sequences and map positions. [cDNA sequence data reported in this paper have been submitted to the EMBL database under accession nos. AA824695–AA825005 and the dbEST_Id database under accession nos. 1546519–1546862.] PMID:9724330

  19. Genes galore: a summary of methods for accessing results from large-scale partial sequencing of anonymous Arabidopsis cDNA clones.

    PubMed Central

    Newman, T; de Bruijn, F J; Green, P; Keegstra, K; Kende, H; McIntosh, L; Ohlrogge, J; Raikhel, N; Somerville, S; Thomashow, M

    1994-01-01

    High-throughput automated partial sequencing of anonymous cDNA clones provides a method to survey the repertoire of expressed genes from an organism. Comparison of the coding capacity of these expressed sequence tags (ESTs) with the sequences in the public data bases results in assignment of putative function to a significant proportion of the ESTs. Thus, the more than 13,400 plant ESTs that are currently available provide a new resource that will facilitate progress in many areas of plant biology. These opportunities are illustrated by a description of the results obtained from analysis of 1500 Arabidopsis ESTs from a cDNA library prepared from equal portions of poly(A+) mRNA from etiolated seedlings, roots, leaves, and flowering inflorescences. More than 900 different sequences were represented, 32% of which showed significant nucleotide or deduced amino acid sequence similarity to previously characterized genes or proteins from a wide range of organisms. At least 165 of the clones had significant deduced amino acid sequence homology to proteins or gene products that have not been previously characterized from higher plants. A summary of methods for accessing the information and materials generated by the Arabidopsis cDNA sequencing project is provided. PMID:7846151

  20. Nucleotide sequence and structural organization of the human vasopressin pituitary receptor (V3) gene.

    PubMed

    René, P; Lenne, F; Ventura, M A; Bertagna, X; de Keyzer, Y

    2000-01-04

    In the pituitary, vasopressin triggers ACTH release through a specific receptor subtype, termed V3 or V1b. We cloned the V3 cDNA and showed that its expression was almost exclusive to pituitary corticotrophs and some corticotroph tumors. To study the determinants of this tissue specificity, we have now cloned the gene for the human (h) V3 receptor and characterized its structure. It is composed of two exons, spanning 10 kb, with the coding region interrupted between transmembrane domains 6 and 7. We established that the transcription initiation site is located 498 nucleotides upstream of the initiator codon and showed that two polyadenylation sites may be used, while the most frequent is the most downstream. Sequence analysis of the promoter region showed no TATA box but identified consensus binding motifs for Sp1, CREB, and half sites of the estrogen receptor binding site. However, comparison with another corticotroph-specific gene, proopiomelanocortin, did not identify common regulatory elements in the two promoters except for a short GC-rich region. Unexpectedly, hV3 gene analysis revealed that a formerly cloned 'artifactual' hV3 cDNA indeed corresponded to a spliced antisense transcript, overlapping the 5' part of the coding sequence in exon 1 and the promoter region. This transcript, hV3rev, was detected in normal pituitary and in many corticotroph tumors expressing hV3 sense mRNA and may therefore play a role in hV3 gene expression.

  1. Comparison of latent and nominal rabbit Ig VHa1 allotype cDNA sequences.

    PubMed

    McCormack, W T; Dhanarajan, P; Roux, K H

    1988-09-15

    The genetic basis for the expression of a latent VH allotype in the rabbit was investigated. VH region cDNA libraries were produced from spleen mRNA derived from a homozygous a2a2 rabbit expressing an induced latent VHa1 allotype and, for comparison, from a normal homozygous a1a1 rabbit expressing nominal VHa1 allotype. The deduced amino acid sequences of the nominal VHa1 cDNA were concordant with previously published VHa1 protein sequences. A comparison of two complete VH-DH-JH and six partial VHa1 sequences reveals highly conserved sequence within VH framework regions (FR) and considerable diversity in complementarity-determining regions and D region sequences. Two functional JH genes or alleles are evident. Amino acid sequencing of the N-terminal 15 residues of pooled affinity-purified latent VHa1 H chain showed complete sequence identity with the nominal VHa1 sequences. Possible latent VHa1-encoding cDNA clones, derived from the a2a2 rabbit, were selected by hybridization with oligonucleotide probes corresponding to the VHa1 allotype-associated segments of the first and third framework regions (FR1 and FR3). cDNA sequence analysis reveals that the 5' untranslated regions of nominal and latent VHa1 cDNA were virtually identical to each other and to previously reported sequences associated with VHa2 and VHa-negative genes. Moreover, some latent VHa1 genes encode FR1 segments that are essentially homologous to the corresponding segment of a nominal VHa1 allotype. In contrast, other putative latent genes display blocks of VHa1 sequence in either FR1 or FR3 that are flanked by blocks of sequence identical to other rabbit VH genes (i.e., VHa2 or VHa-negative). These composite sequences may be directly encoded by composite germ-line VH genes or may be the products of somatically generated recombination or gene conversion between genes encoding latent and nominal allotypes. The data do not support the hypothesis that latent genes are the result of extensive modification

  2. Cloning and sequencing of a cDNA encoding a taste-modifying protein, miraculin.

    PubMed

    Masuda, Y; Nirasawa, S; Nakaya, K; Kurihara, Y

    1995-08-19

    A cDNA clone encoding a taste-modifying protein, miraculin (MIR), was isolated and sequenced. The encoded precursor to MIR was composed of 220 amino acid (aa) residues, including a possible signal sequence of 29 aa. Northern blot analysis showed that the mRNA encoding MIR was already expressed in fruits of Richadella dulcifica at 3 weeks after pollination and was present specifically in the pulp.

  3. Sequence and neuronal expression of mouse endothelin-1 cDNA.

    PubMed

    Kurama, M; Ishida, N; Matsui, M; Saida, K; Mitsui, Y

    1996-07-17

    We have isolated and sequenced a cDNA that encodes mouse endothelin-1 (ET-1). The putative protein, which contains 202 amino acids, corresponds to the prepro-form of ET-1. The 21-amino-acid sequence of the putative mature ET-1 was identical to that of rat, porcine, bovine, and human ET-1. In situ hybridization histochemistry indicated that ET-1 mRNA was expressed in several hypothalamic nuclei including the suprachiasmatic nucleus (SCN) in rodent brain.

  4. Cloning and sequencing of cDNA and genomic DNA encoding PDM phosphatase of Fusarium moniliforme.

    PubMed

    Yoshida, Hiroshi; Iizuka, Mari; Narita, Takao; Norioka, Naoko; Norioka, Shigemi

    2006-12-01

    PDM phosphatase was purified approximately 500-fold through six steps from the extract of dried powder of the culture filtrate of Fusarium moniliforme. The purified preparation appeared homogeneous on SDS-PAGE although the protein band was broad. Amino acid sequence information was collected on tryptic peptides from this preparation. cDNA cloning was carried out based on the information. A full-length cDNA was obtained and sequenced. The sequence had an open reading frame of 651 amino acid residues with a molecular mass of 69,988 Da. Cloning and sequencing of the genomic DNA corresponding to the cDNA was also conducted. The deduced amino acid sequence could account for many but not all of the tryptic peptides, suggesting presence of contaminant protein(s). SDS-PAGE analysis after chemical deglycosylation showed two proteins with molecular masses of 58 and 68 kDa. This implied that the 58 kDa protein had been copurified with PDM phosphatase. Homology search showed that PDM phosphatase belongs to the purple acid phosphatase family, which is widely distributed in the biosphere. Sequence data of fungal purple acid phosphatases were collected from the database. Processing of the data revealed presence of two types, whose evolutionary relationships were discussed.

  5. Generation of expressed sequence tags from a normalized porcine skeletal muscle cDNA library.

    PubMed

    Yao, Jianbo; Coussens, Paul M; Saama, Peter; Suchyta, Steven; Ernst, Catherine W

    2002-11-01

    Recent developments in microarray technologies permit scientists to analyze expression of thousands of genes simultaneously in diverse biological systems. In an effort to provide integrated resources for application of microarray technologies to studies of skeletal muscle growth and development in swine, we have constructed a normalized cDNA library from porcine skeletal muscle. The effectiveness of normalization was evaluated by DNA sequencing of clones randomly picked from the library before and after normalization, and also by Southern blot hybridization using probes representing abundant transcripts. Our data suggest that the normalization procedure successfully reduced the highly abundant cDNA species in the normalized library. To date, a total of 782 EST (expressed sequence tag) sequences have been generated from this normalized library (687 ESTs) and the original library (95 ESTs). The sequence information of these ESTs plus their BLAST results has been made available through a web accessible database (http://nbfgc.msu.edu). Cluster analysis of the data indicates that a total of 742 unique sequences are present in this collection. BLASTN search of the 742 EST sequences against the public database (dbEST) revealed that 139 had no significant matches (E-value > 10^-15) to porcine ESTs already entered in the database, suggesting the possibility of their specific expression in porcine skeletal muscle. Generation of non-redundant ESTs from this library will allow us to construct cDNA microarrays for identification of gene expression changes that regulate muscle growth and affect meat quality in swine.

  6. Bioinformatics comparison of sulfate-reducing metabolism nucleotide sequences

    NASA Astrophysics Data System (ADS)

    Tremberger, G.; Dehipawala, Sunil; Nguyen, A.; Cheung, E.; Sullivan, R.; Holden, T.; Lieberman, D.; Cheung, T.

    2015-09-01

    The sulfate-reducing bacteria can be traced back to 3.5 billion years ago. The thermodynamic details of the sulfur cycle have been well documented. A recent sulfate-reducing bacteria report (Robator, Jungbluth, et al., 2015 Jan, Front. Microbiol) with Genbank nucleotide data has been analyzed in terms of the sulfite reductase (dsrAB) via fractal dimension and entropy values. Comparison to oil field sulfate-reducing sequences was included. The AUCG translational mass fractal dimension versus ATCG transcriptional mass fractal dimension for the low temperature dsrB and dsrA sequences reported in Reference Thirteen shows correlation R-sq ~ 0.79, with a probability of about 3% in simulation. A recent report of using Cystathionine gamma-lyase sequence to produce CdS quantum dot in a biological method, where the sulfur is reduced just like in the H2S production process, was included for comparison. The AUCG mass fractal dimension versus ATCG mass fractal dimension for the Cystathionine gamma-lyase sequences was found to have R-sq of 0.72, similar to the low temperature dissimilatory sulfite reductase dsr group with 3% probability, in contrast to the oil field group having R-sq ~ 0.94, a highly probable outcome in the simulation. The other two simulation histograms, namely, fractal dimension versus entropy R-sq outcome values, and di-nucleotide entropy versus mono-nucleotide entropy R-sq outcome values are also discussed in the data analysis focusing on low probability outcomes.

  7. cDNA sequence and expression pattern of the putative pheromone carrier aphrodisin.

    PubMed Central

    Mägert, H J; Hadrys, T; Cieslak, A; Gröger, A; Feller, S; Forssmann, W G

    1995-01-01

    The cDNA sequence for aphrodisin, a lipocalin from hamster vaginal discharge which is involved in pheromonal activity, has been determined. Corresponding genomic clones were isolated and the promoter region was identified. Primer extension analysis revealed an adenosine residue as the main transcription initiation site, located 50 bp upstream of the translation start codon ATG, which is surrounded by a typical Kozak sequence. However, data from polymerase chain reaction analysis suggest the existence of at least one alternative transcription initiation site. The aphrodisin cDNA is 732 bp long and codes for the mature 151-aa aphrodisin and an additional N-terminal 16-aa secretory signal peptide. The 3' nontranslated region is 228 bp long. Among the known sequences, the aphrodisin cDNA shares the highest homology with the rat odorant-binding protein cDNA (45%), which verifies the protein data. Vaginal tissue and Bartholin's glands are the main aphrodisin gene-expressing tissues of the female hamster genital tract, as demonstrated by Northern blot analysis. Under less stringent hybridization conditions, RNA isolated from rat Bartholin's glands also showed a signal, indicating the occurrence of aphrodisin-related mRNA in this species. Images Fig. 4 Fig. 5 Fig. 6 Fig. 7 PMID:7892229

  8. Nucleotide sequence and genome organization of canine parvovirus.

    PubMed Central

    Reed, A P; Jones, E V; Miller, T J

    1988-01-01

    The genome of a canine parvovirus isolate strain (CPV-N) was cloned, and the DNA sequence was determined. The entire genome, including ends, was 5,323 nucleotides in length. The terminal repeat at the 3' end of the genome shared similar structural characteristics but limited homology with the rodent parvoviruses. The 5' terminal repeat was not detected in any of the clones. Instead, a region of DNA starting near the capsid gene stop codon and extending 248 base pairs into the coding region had been duplicated and inserted 75 base pairs downstream from the poly(A) addition site. Consensus sequences for the 5' donor and 3' acceptor sites as well as promoters and poly(A) addition sites were identified and compared with the available information on related parvoviruses. The genomic organization of CPV-N is similar to that of feline parvovirus (FPV) in that there are two major open reading frames (668 and 722 amino acids) in the plus strand (mRNA polarity). Both coding domains are in the same frame, and no significant open reading frames were apparent in any of the other frames of both minus and plus DNA strands. The nucleotide and amino acid homologies of the capsid genes between CPV-N and FPV were 98 and 99%, respectively. In contrast, the nucleotide and amino acid homologies of the capsid genes for CPV-N and CPV-b (S. Rhode III, J. Virol. 54:630-633, 1985) were 95 and 98%, respectively. These results indicate that very few nucleotide or amino acid changes differentiate the antigenic and host range specificity of FPV and CPV. PMID:2824850

  9. Computer-based methods for the mouse full-length cDNA encyclopedia: real-time sequence clustering for construction of a nonredundant cDNA library.

    PubMed

    Konno, H; Fukunishi, Y; Shibata, K; Itoh, M; Carninci, P; Sugahara, Y; Hayashizaki, Y

    2001-02-01

    We developed computer-based methods for constructing a nonredundant mouse full-length cDNA library. Our cDNA library construction process comprises assessment of library quality, sequencing the 3' ends of inserts and clustering, and completing a re-array to generate a nonredundant library from a redundant one. After the cDNA libraries are generated, we sequence the 5' ends of the inserts to check the quality of the library; then we determine the sequencing priority of each library. Selected libraries undergo large-scale sequencing of the 3' ends of the inserts and clustering of the tag sequences. After clustering, the nonredundant library is constructed from the original libraries, which have redundant clones. All libraries, plates, clones, sequences, and clusters are uniquely identified, and all information is saved in the database according to this identifier. At press time, our system has been in place for the past two years; we have clustered 939,725 3' end sequences into 127,385 groups from 227 cDNA libraries/sublibraries (see http://genome.gse.riken.go.jp/).

  10. Human thyroid peroxidase: complete cDNA and protein sequence, chromosome mapping, and identification of two alternately spliced mRNAs

    SciTech Connect

    Kimura, S.; Kotani, T.; McBride, O.W.; Umeki, K.; Hirai, K.; Nakayama, T.; Ohtaki, S.

    1987-08-01

    Two forms of human thyroid peroxidase cDNAs were isolated from a lambda gt11 cDNA library, prepared from Graves disease thyroid tissue mRNA, by use of oligonucleotides. The longest complete cDNA, designated phTPO-1, has 3048 nucleotides and an open reading frame consisting of 933 amino acids, which would encode a protein with a molecular weight of 103,026. Five potential asparagine-linked glycosylation sites are found in the deduced amino acid sequence. The second peroxidase cDNA, designated phTPO-2, is almost identical to phTPO-1 beginning 605 base pairs downstream except that it contains 1-base-pair difference and lacks 171 base pairs in the middle of the sequence. This results in a loss of 57 amino acids corresponding to a molecular weight of 6282. Interestingly, this 171-nucleotide sequence has GT and AG at its 5' and 3' boundaries, respectively, that are in good agreement with donor and acceptor splice site consensus sequences. Using specific oligonucleotide probes for the mRNAs derived from the cDNA sequences hTPO-1 and hTPO-2, the authors show that both are expressed in all thyroid tissues examined and the relative level of two mRNAs is different in each sample. The results suggest that two thyroid peroxidase proteins might be generated through alternate splicing of the same gene. By using somatic cell hybrid lines, the thyroid peroxidase gene was mapped to the short arm of human chromosome 2.
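
    The figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below is only a consistency check on the published numbers; the constant and function names are hypothetical. It confirms that a 171-base-pair deletion removes a whole number of codons (so the reading frame is preserved) and compares the mean residue mass implied by the full protein with that implied by the deleted segment.

```python
# Values taken from the abstract above.
ORF_RESIDUES = 933          # amino acids encoded by phTPO-1
PROTEIN_MASS_DA = 103_026   # predicted molecular weight of the full protein
DELETION_BP = 171           # base pairs missing from phTPO-2
LOST_MASS_DA = 6_282        # molecular weight attributed to the lost segment


def codons_removed(deletion_bp: int):
    """Return the number of whole codons removed, or None if the
    deletion length is not a multiple of 3 (i.e. it would shift the frame)."""
    whole, remainder = divmod(deletion_bp, 3)
    return whole if remainder == 0 else None


lost_residues = codons_removed(DELETION_BP)
mean_mass_full = PROTEIN_MASS_DA / ORF_RESIDUES
mean_mass_lost = LOST_MASS_DA / lost_residues

if __name__ == "__main__":
    print(lost_residues)
    print(round(mean_mass_full, 1), round(mean_mass_lost, 1))
```

    The deletion corresponds to 57 codons, matching the 57 amino acids stated, and both mean residue masses come out near 110 Da, so the quoted sizes are mutually consistent.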

  11. Nucleotide sequence of a complementary DNA encoding pea cytosolic copper/zinc superoxide dismutase. [Pisum sativum L

    SciTech Connect

    White, D.A.; Zilinskas, B.A.

    1991-08-01

    The authors now report the nucleotide sequence of the cytosolic Cu/Zn SOD cloned from a lambda gt11 cDNA library constructed from mRNA extracted from leaves of 7- to 10-d pea seedlings (Pisum sativum L.). The clone was isolated using a 22-base synthetic oligonucleotide complementary to the amino acid sequence CGIIGLQG. This sequence, found at the protein's carboxy terminus, is highly conserved among plant cytosolic Cu/Zn SODs but not chloroplastic Cu/Zn SODs. The 738-base pair sequence contains an open reading frame specifying 152 codons and a predicted Mr of 18,024 D. The deduced amino acid sequence is highly homologous (79-82% identity) with the sequences of other known plant cytosolic Cu/Zn SODs but less highly conserved (63-65%) when compared with several chloroplastic Cu/Zn SODs including pea (10).

  12. HUGE: a database for human large proteins identified by Kazusa cDNA sequencing project.

    PubMed Central

    Suyama, M; Nagase, T; Ohara, O

    1999-01-01

    HUGE is a database for human large proteins newly identified by the Kazusa cDNA project, which aims to predict protein primary structures from sequences of human large cDNAs (>4 kb). In particular, cDNA clones capable of coding for large proteins (>50 kDa) are current targets of the project. More than 700 sequences of human cDNAs (average size, 5.1 kb) have been determined to date and deposited in the public databases. Notable information implied from the cDNAs and the predicted protein sequences can be obtained through HUGE via the World Wide Web at URL http://www.kazusa.or.jp/huge PMID:9847221

  13. Nucleotide sequence of the Rhodospirillum rubrum atp operon.

    PubMed Central

    Falk, G; Hampe, A; Walker, J E

    1985-01-01

    The nucleotide sequence was determined of an 8775-base-pair region of DNA cloned from the photosynthetic non-sulphur bacterium Rhodospirillum rubrum. It contains a cluster of five genes encoding F1-ATPase subunits. The genes are arranged in the same order as F1 genes in the Escherichia coli unc operon. However, as in the related organism Rhodopseudomonas blastica, neither genes for components of F0, the membrane sector of ATP synthase, nor a homologue of the E. coli uncI gene are associated with this locus, as they are in E. coli. PMID:2861810

  14. cDNA Library Enrichment of Full Length Transcripts for SMRT Long Read Sequencing

    PubMed Central

    Hartwig, Benjamin; Reinhardt, Richard; Schneeberger, Korbinian

    2016-01-01

    The utility of genome assemblies relies not only on the quality of the assembled genome sequence, but also on the quality of the gene annotations. The Pacific Biosciences Iso-Seq technology is a powerful support for accurate eukaryotic gene model annotation as it allows for direct readout of full-length cDNA sequences without the need for noisy short read-based transcript assembly. We propose applying the TeloPrime Full Length cDNA Amplification kit to the Pacific Biosciences Iso-Seq technology in order to enrich for genuine full-length transcripts in the cDNA libraries. We provide evidence that TeloPrime outperforms the commonly used SMARTer PCR cDNA Synthesis Kit in identifying transcription start and end sites in Arabidopsis thaliana. Furthermore, we show that TeloPrime-based Pacific Biosciences Iso-Seq can be successfully applied to the polyploid genome of bread wheat (Triticum aestivum) not only to efficiently annotate gene models, but also to identify novel transcription sites, gene homeologs, splicing isoforms and previously unidentified gene loci. PMID:27327613

  15. The human clotting factor VIII cDNA contains an autonomously replicating sequence consensus- and matrix attachment region-like sequence that binds a nuclear factor, represses heterologous gene expression, and mediates the transcriptional effects of sodium butyrate.

    PubMed Central

    Fallaux, F J; Hoeben, R C; Cramer, S J; van den Wollenberg, D J; Briët, E; van Ormondt, H; van Der Eb, A J

    1996-01-01

    Expression of the human blood-clotting factor VIII (FVIII) cDNA is hampered by the presence of sequences located in the coding region that repress transcription. We have previously identified a 305-bp fragment within the FVIII cDNA that is involved in the repression (R.C. Hoeben, F.J. Fallaux, S.J. Cramer, D.J.M. van den Wollenberg, H. van Ormondt, E. Briet, and A.J. van der Eb, Blood 85:2447-2454, 1995). Here, we show that this 305-bp region of FVIII cDNA contains sequences that resemble the yeast (Saccharomyces cerevisiae) autonomously replicating sequence consensus. Two of these DNA elements coincide with AT-rich sequences that are often found in matrix attachment regions or scaffold-attached regions. One of these elements, consisting of nucleotides 1569 to 1600 of the FVIII cDNA (nucleotide numbering is according to the system of Wood et al. (W.I. Wood, D.J. Capon, C.C. Simonsen, D.L. Eaton, J. Gitschier, D. Keyt, P.H. Seeburg, D.H. Smith, P. Hollingshead, K.L. Wion, et al., Nature [London] 312:330-337, 1984)), binds a nuclear factor in vitro but loses this capacity after four of its base pairs have been changed. A synthetic heptamer of this segment can repress the expression of a chloramphenicol acetyltransferase (CAT) reporter gene and also loses this capacity upon mutation. Furthermore, we demonstrate that repression by FVIII sequences can be relieved by sodium butyrate. We demonstrate that the synthetic heptamer (FVIII nucleotides 1569 to 1600), when placed upstream of the Moloney murine leukemia virus long terminal repeat promoter that drives the CAT reporter, can render the CAT reporter inducible by butyrate. This effect was absent when the same element was mutated. The stimulatory effect of butyrate could not be attributed to butyrate-responsive elements in the studied long terminal repeat promoters. Our data provide a functional characterization of the sequences that repress expression of the FVIII cDNA. These data also suggest a link between

  16. Sequencing and comparative genomic analysis of 1227 Felis catus cDNA sequences enriched for developmental, clinical and nutritional phenotypes

    PubMed Central

    2012-01-01

    Background The feline genome is valuable to the veterinary and model organism genomics communities because the cat is an obligate carnivore and a model for endangered felids. The initial public release of the Felis catus genome assembly provided a framework for investigating the genomic basis of feline biology. However, the entire set of protein coding genes has not been elucidated. Results We identified and characterized 1227 protein coding feline sequences, of which 913 map to public sequences and 314 are novel. These sequences have been deposited into NCBI's GenBank database and complement public genomic resources by providing additional protein coding sequences that fill in some of the gaps in the feline genome assembly. Through functional and comparative genomic analyses, we gained an understanding of the role of these sequences in feline development, nutrition and health. Specifically, we identified 104 orthologs of human genes associated with Mendelian disorders. We detected negative selection within sequences with gene ontology annotations associated with intracellular trafficking, cytoskeleton and muscle functions. We detected relatively less negative selection on protein sequences encoding extracellular networks, apoptotic pathways and mitochondrial gene ontology annotations. Additionally, we characterized feline cDNA sequences that have mouse orthologs associated with clinical, nutritional and developmental phenotypes. Together, this analysis provides an overview of the value of our cDNA sequences and enhances our understanding of how the feline genome is similar to, and different from, other mammalian genomes. Conclusions The cDNA sequences reported here expand existing feline genomic resources by providing high-quality sequences annotated with comparative genomic information providing functional, clinical, nutritional and orthologous gene information. PMID:22257742

  17. cDNA sequence of a human skeletal muscle ADP/ATP translocator: lack of a leader peptide, divergence from a fibroblast translocator cDNA, and coevolution with mitochondrial DNA genes

    SciTech Connect

    Neckelmann, N.; Li, K.; Wade, R.P.; Shuster, R.; Wallace, D.C.

    1987-11-01

    The authors have characterized a 1400-nucleotide cDNA for the human skeletal muscle ADP/ATP translocator. The deduced amino acid sequence is 94% homologous to the beef heart ADP/ATP translocator protein and contains only a single additional amino-terminal methionine. This implies that the human translocator lacks an amino-terminal targeting peptide, a conclusion substantiated by measuring the molecular weight of the protein synthesized in vitro. A 1400-nucleotide transcript encoding the skeletal muscle translocator was detected on blots of total RNA from human heart, kidney, skeletal muscle, and HeLa cells by hybridization with oligonucleotide probes homologous to the coding region and 3' noncoding region of the cDNA. However, the level of this mRNA varied substantially among tissues. Comparison of our skeletal muscle translocator sequence with that of a recently published human fibroblast translocator cognate revealed that the two proteins are 88% identical and diverged about 275 million years ago. Hence, tissues vary both in the level of expression of individual translocator genes and in differential expression of cognate translocator genes. Comparison of the base substitution rates of the ADP/ATP translocator and the oxidative phosphorylation genes encoded by mitochondrial DNA revealed that the mitochondrial DNA genes fix 10 times more synonymous substitutions and 12 times more replacement substitutions; yet, these nuclear and cytoplasmic respiration genes experience comparable evolutionary constraints. This suggests that the mitochondrial DNA genes are highly prone to deleterious mutations.

  18. Deblur Rapidly Resolves Single-Nucleotide Community Sequence Patterns

    PubMed Central

    Amir, Amnon; McDonald, Daniel; Navas-Molina, Jose A.; Kopylova, Evguenia; Morton, James T.; Zech Xu, Zhenjiang; Kightley, Eric P.; Thompson, Luke R.; Hyde, Embriette R.; Gonzalez, Antonio

    2017-01-01

    ABSTRACT High-throughput sequencing of 16S ribosomal RNA gene amplicons has facilitated understanding of complex microbial communities, but the inherent noise in PCR and DNA sequencing limits differentiation of closely related bacteria. Although many scientific questions can be addressed with broad taxonomic profiles, clinical, food safety, and some ecological applications require higher specificity. Here we introduce a novel sub-operational-taxonomic-unit (sOTU) approach, Deblur, that uses error profiles to obtain putative error-free sequences from Illumina MiSeq and HiSeq sequencing platforms. Deblur substantially reduces computational demands relative to similar sOTU methods and does so with similar or better sensitivity and specificity. Using simulations, mock mixtures, and real data sets, we detected closely related bacterial sequences with single nucleotide differences while removing false positives and maintaining stability in detection, suggesting that Deblur is limited only by read length and diversity within the amplicon sequences. Because Deblur operates on a per-sample level, it scales to modern data sets and meta-analyses. To highlight Deblur’s ability to integrate data sets, we include an interactive exploration of its application to multiple distinct sequencing rounds of the American Gut Project. Deblur is open source under the Berkeley Software Distribution (BSD) license, easily installable, and downloadable from https://github.com/biocore/deblur. IMPORTANCE Deblur provides a rapid and sensitive means to assess ecological patterns driven by differentiation of closely related taxa. This algorithm provides a solution to the problem of identifying real ecological differences between taxa whose amplicons differ by a single base pair, is applicable in an automated fashion to large-scale sequencing data sets, and can integrate sequencing runs collected over time. PMID:28289731
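    To make concrete what "amplicons that differ by a single base pair" means, the following Python sketch lists pairs of equal-length reads separated by exactly one substitution. It is not the Deblur algorithm and uses no Deblur code; the function names and the toy reads are assumptions made purely for illustration.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def single_nucleotide_pairs(seqs):
    """Return all pairs of sequences that differ at exactly one position."""
    return [(a, b) for a, b in combinations(seqs, 2) if hamming(a, b) == 1]


# Toy reads (hypothetical); duplicates are collapsed before comparison.
toy_reads = ["ACGTACGT", "ACGTACGA", "TTGTACGT", "ACGTACGT"]
unique_reads = sorted(set(toy_reads))
pairs = single_nucleotide_pairs(unique_reads)
print(pairs)
```

    A real denoising method must additionally decide which member of such a pair is a sequencing error and which is a genuine variant; that decision is what the error profiles described in the abstract are used for, and it is not attempted here.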

  19. Cloning and sequence analysis of cDNA for the proteasome activator PA28-beta subunit of flounder (Paralichthys olivaceus).

    PubMed

    Kim, Dae-Hyun; Lee, Sun-Me; Hong, Bo-Young; Kim, Young-Tae; Choi, Tae-Jin

    2003-12-01

    The proteasome is a large multisubunit complex involved in intracellular proteolysis in antigen processing for loading MHC class I molecules. Two activators, PA28-alpha and PA28-beta, which are induced by interferon-gamma (IFN-gamma), activate this latent enzyme complex. Genes encoding these activators, PSME1 and PSME2, respectively, have been characterized in various mammals but, among fish, only in zebrafish. We have cloned a PSME2 gene homologue from a leukocyte cDNA library of flounder, a marine fish. The flounder PSME2 gene (fPSME2) encompasses 1063 nucleotides and encodes a polypeptide of 242 amino acids (aa), with a deduced molecular weight of 27.2 kDa. The deduced protein has 82% sequence similarity to that of zebrafish and 73-74% sequence similarity to those of various mammals, and shows higher sequence homology in the C-terminal region. A PA28-beta subunit-specific insert was located at the position corresponding to the KEKE motif of the PA28-alpha protein. A phylogenetic tree derived from deduced amino acid sequences showed that piscine PSME2 diverged from its mammalian counterpart after PSME1 and PSME2 diverged from a common ancestral gene. Northern blot analysis revealed higher expression of the fPSME2 gene in kidney, spleen and muscle tissues of flounder stimulated with bacterial lipopolysaccharide (LPS) than in those of non-induced flounder.

  20. Purification, amino acid sequence, and cDNA cloning of trypsin inhibitors from onion (Allium cepa L.) bulbs.

    PubMed

    Deshimaru, Masanobu; Watanabe, Akira; Suematsu, Keiko; Hatano, Maki; Terada, Shigeyuki

    2003-08-01

    Three protease inhibitors (OTI-1-3) have been purified from onion (Allium cepa L.) bulbs. Molecular masses of these inhibitors were found to be 7,370.2, 7,472.2, and 7,642.6 Da, respectively, by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS). Based on amino acid composition and N-terminal sequence, OTI-1 and -2 are N-terminally truncated forms of OTI-3. All the inhibitors are stable to heat and extreme pH. OTI-3 inhibited trypsin, chymotrypsin, and plasmin with dissociation constants of 1.3 x 10^-9 M, 2.3 x 10^-7 M, and 3.1 x 10^-7 M, respectively. The complete amino acid sequence of OTI-3 showed significant homology to Bowman-Birk family inhibitors, and the first reactive site (P1) was found to be Arg17 by limited proteolysis by trypsin. The second reactive site (P1) was estimated to be Leu46, which may inhibit chymotrypsin. OTI-3 lacks an S-S bond near the second reactive site, resulting in a low affinity for the enzyme. The sequence of OTI-3 was also ascertained by the nucleotide sequence of a cDNA clone encoding a 101-residue precursor of the onion inhibitor.

  1. Complete cDNA and derived amino acid sequence of human factor V

    SciTech Connect

    Jenny, R.J.; Pittman, D.D.; Toole, J.J.; Kriz, R.W.; Aldape, R.A.; Hewick, R.M.; Kaufman, R.J.; Mann, K.G.

    1987-07-01

    cDNA clones encoding human factor V have been isolated from an oligo(dT)-primed human fetal liver cDNA library prepared with vector Charon 21A. The cDNA sequence of factor V from three overlapping clones includes a 6672-base-pair (bp) coding region, a 90-bp 5' untranslated region, and a 163-bp 3' untranslated region within which is a poly(A) tail. The deduced amino acid sequence consists of 2224 amino acids inclusive of a 28-amino acid leader peptide. Direct comparison with human factor VIII reveals considerable homology between proteins in amino acid sequence and domain structure: a triplicated A domain and duplicated C domain show approx. 40% identity with the corresponding domains in factor VIII. As in factor VIII, the A domains of factor V share approx. 40% amino acid-sequence homology with the three highly conserved domains in ceruloplasmin. The B domain of factor V contains 35 tandem and approx. 9 additional semiconserved repeats of nine amino acids of the form Asp-Leu-Ser-Gln-Thr-Thr/Asn-Leu-Ser-Pro and 2 additional semiconserved repeats of 17 amino acids. Factor V contains 37 potential N-linked glycosylation sites, 25 of which are in the B domain, and a total of 19 cysteine residues.
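    The lengths quoted in this record can be cross-checked with simple codon arithmetic. The Python sketch below is an illustration under stated assumptions: the helper name is hypothetical, the coding region is taken to exclude the stop codon (because 6672 / 3 equals the reported 2224 residues), and the composite length assumes the three listed regions account for the whole sequence.

```python
def residues_from_orf(orf_nt: int, includes_stop: bool) -> int:
    """Number of encoded residues for an ORF of orf_nt nucleotides
    (hypothetical helper; one codon is three nucleotides)."""
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


# Figures as reported in the abstract above.
coding_bp, utr5_bp, utr3_bp = 6672, 90, 163
leader_residues = 28

total_residues = residues_from_orf(coding_bp, includes_stop=False)
after_leader = total_residues - leader_residues   # residues left once the leader is removed
composite_bp = utr5_bp + coding_bp + utr3_bp      # assumes no other regions

print(total_residues, after_leader, composite_bp)
```

    With these inputs the coding region corresponds to 2224 codons, in agreement with the deduced sequence length given in the abstract.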

  2. [Cloning and analyzing of the cDNA sequence of CHS-A gene of Narcissus].

    PubMed

    Huang, Yin Yi; Shen, Ming Shan; Chen, Liang; Li, Peng; Chen, Mu Zhuan

    2002-09-01

    Chalcone synthase (CHS) is a key enzyme in the biosynthesis of all classes of flavonoids. The production of flower pigment is specifically regulated by the activity of CHS. We cloned the cDNA sequence of the CHS-A gene from Narcissus by PCR and analyzed the coding sequence of the gene. The results demonstrated that the coding region was 1167 bp long, encoding a protein of 389 amino acids that shared more than 80% homology with the CHS of eight other plants, such as Nicotiana tabacum and Solanum tuberosum.

  3. Characterization of full-length sequenced cDNA inserts (FLIcs) from Atlantic salmon (Salmo salar)

    PubMed Central

    Andreassen, Rune; Lunner, Sigbjørn; Høyheim, Bjørn

    2009-01-01

    Background Sequencing of the Atlantic salmon genome is now being planned by an international research consortium. Full-length sequenced inserts from cDNAs (FLIcs) are an important tool for correct annotation and clustering of the genomic sequence in any species. The large amount of highly similar duplicate sequences caused by the relatively recent genome duplication in the salmonid ancestor represents a particular challenge for the genome project. FLIcs will therefore be an extremely useful resource for the Atlantic salmon sequencing project. In addition to helping to distinguish between duplicate genome regions and to determine correct gene structures, FLIcs are an important resource for functional genomic studies and for investigation of regulatory elements controlling gene expression. In contrast to the large number of ESTs available, including the ESTs from 23 developmental and tissue specific cDNA libraries contributed by the Salmon Genome Project (SGP), the number of sequences where the full length of the cDNA insert has been determined has been small. Results High quality full-length insert sequences from 560 pre-smolt white muscle tissue specific cDNAs were generated, accession numbers [GenBank: BT043497 - BT044056]. Five hundred and ten (91%) of the transcripts were annotated using Gene Ontology (GO) terms and 440 of the FLIcs are likely to contain a complete coding sequence (cCDS). The sequence information was used to identify putative paralogs, characterize salmon Kozak motifs, polyadenylation signal variation and to identify motifs likely to be involved in the regulation of particular genes. Finally, conserved 7-mers in the 3'UTRs were identified, of which some were identical to miRNA target sequences. Conclusion This paper describes the first Atlantic salmon FLIcs from a tissue and developmental stage specific cDNA library. We have demonstrated that many FLIcs contained a complete coding sequence (cCDS). This suggests that the remaining cDNA

  4. cDNA, genomic sequence cloning and overexpression of ribosomal protein S25 gene (RPS25) from the Giant Panda.

    PubMed

    Hao, Yan-Zhe; Hou, Wan-Ru; Hou, Yi-Ling; Du, Yu-Jie; Zhang, Tian; Peng, Zheng-Song

    2009-11-01

    RPS25, a component of the 40S small ribosomal subunit, is encoded by the RPS25 gene, which is specific to eukaryotes. Few studies of the RPS25 gene in animals have been reported. The Giant Panda (Ailuropoda melanoleuca), known as a "living fossil", is of increasing concern to the world community. Studies on RPS25 of the Giant Panda could provide scientific data for investigating the hereditary traits of the gene and formulating a protection strategy for the Giant Panda. The cDNA of the RPS25 cloned from the Giant Panda is 436 bp in size, containing an open reading frame of 378 bp encoding 125 amino acids. The genomic sequence is 1,992 bp long and contains four exons and three introns. BLAST alignment indicated that the coding sequence shows high homology to those of Homo sapiens, Bos taurus, Mus musculus and Rattus norvegicus (92.6, 94.4, 89.2 and 91.5%, respectively). Primary structure analysis revealed that the molecular weight of the putative RPS25 protein is 13.7421 kDa with a theoretical pI of 10.12. Topology prediction showed that there is one N-glycosylation site, one cAMP- and cGMP-dependent protein kinase phosphorylation site, two protein kinase C phosphorylation sites and one tyrosine kinase phosphorylation site in the RPS25 protein of the Giant Panda. The RPS25 gene was overexpressed in E. coli BL21 and Western blotting of the RPS25 protein was also done. The results indicated that the RPS25 gene can be expressed in E. coli and that RPS25 fused to an N-terminal His tag accumulated as a polypeptide of the expected 17.4 kDa. The cDNA and the genomic sequence of RPS25 were cloned successfully for the first time from the Giant Panda using RT-PCR technology and Touchdown-PCR, respectively, and both were sequenced and analyzed preliminarily; then the cDNA of the RPS25 gene was overexpressed in E. coli BL21 and immunoblotted, which is the first

  5. Molecular cloning and sequencing of a cDNA encoding partial putative molt-inhibiting hormone from Penaeus chinensis

    NASA Astrophysics Data System (ADS)

    Wang, Zai-Zhao; Xiang, Jian-Hai

    2002-09-01

    Total RNA was extracted from eyestalks of shrimp Penaeus chinensis. Eyestalk cDNA was obtained from total RNA by reverse transcription. Reverse transcriptase-polymerase chain reaction (RT-PCR) was initiated using eyestalk cDNA and degenerate primers designed from the amino acid sequence of molt-inhibiting hormone from shrimp Penaeus japonicus. A specific cDNA was obtained and cloned into a T vector for sequencing. The cDNA consisted of 201 base pairs and encoded a peptide of 67 amino acid residues. The peptide of P. chinensis had the highest identity with molt-inhibiting hormones of P. japonicus. The cDNA could be a partial sequence of the molt-inhibiting hormone gene of P. chinensis. This paper reports for the first time a cDNA encoding a neuropeptide of P. chinensis.

  6. The nucleotide sequence of a nematode vitellogenin gene.

    PubMed Central

    Spieth, J; Denison, K; Zucker, E; Blumenthal, T

    1985-01-01

    The nematode, Caenorhabditis elegans, contains a family of six genes that code for vitellogenins. Here we report the complete nucleotide sequence of one of these genes, vit-5. The gene specifies an mRNA of 4869 nucleotides, including untranslated regions of 9 bases at the 5' end and 51 bases at the 3' end. Vit-5 contains four short introns totalling 218 bp. The predicted vitellogenin, yp170A, has a molecular weight of 186,430. At its N terminus it is clearly related to the vitellogenins of vertebrates. However, the vit-5-encoded protein does not contain a serine-rich sequence related to the vertebrate vitellin, phosvitin. In fact, the amino acid composition of the nematode protein is very similar to that of the vertebrate protein without phosvitin. Vit-5 has a highly asymmetric codon choice dictionary. The favored codons are different from those favored in other organisms, but are characteristic of highly expressed C. elegans genes. The selection against rare codons is weaker near the 5' end of the gene; rare codons are 15 times more frequent within the first 54 bp than in the next 4.8 kb. PMID:3855245

  7. Nucleotide-Specific Contrast for DNA Sequencing by Electron Spectroscopy

    DOE PAGES

    Mankos, Marian; Persson, Henrik H. J.; N’Diaye, Alpha T.; ...

    2016-05-05

    DNA sequencing by imaging in an electron microscope is an approach that holds promise to deliver long reads with low error rates and without the need for amplification. Earlier work using transmission electron microscopes, which use high electron energies on the order of 100 keV, has shown that low contrast and radiation damage necessitate the use of heavy atom labeling of individual nucleotides, which increases the read error rates. Other prior work using scattering electrons with much lower energy has been shown to suppress beam damage on DNA. Here we explore possibilities to increase contrast by employing two methods, X-ray photoelectron and Auger electron spectroscopy. Using bulk DNA samples with monomers of each base, both methods are shown to provide contrast mechanisms that can distinguish individual nucleotides without labels. In conclusion, both spectroscopic techniques can be readily implemented in a low energy electron microscope, which may enable label-free DNA sequencing by direct imaging.

  8. Presence of the delta-MSH sequence in a proopiomelanocortin cDNA cloned from the pituitary of the galeoid shark, Heterodontus portusjacksoni.

    PubMed

    Dores, Robert M; Cameron, Erin; Lecaude, Stephanie; Danielson, Phillip B

    2003-08-01

    Since a fourth MSH sequence, delta-MSH, has been detected in the proopiomelanocortin (POMC) gene of a dogfish and a stingray, members of superorder Squalea (class Chondrichthyes), it is possible that this novel MSH sequence might be a feature common to the POMC genes of all modern sharks and rays. As an initial step towards addressing this question, a full-length POMC cDNA was cloned and sequenced from the pituitary of the Port Jackson shark, Heterodontus portusjacksoni. The Port Jackson shark represents one of the oldest lineages in superorder Galea, and this superorder together with superorder Squalea form infraclass Neoselachii (the extant sharks and rays). The Port Jackson shark POMC cDNA has an open reading frame that is 1032 nucleotides in length and encodes the deduced amino acid sequences for beta-endorphin, ACTH/alpha-MSH, beta-MSH, gamma-MSH, and delta-MSH. Port Jackson shark delta-MSH has 83% primary sequence identity with dogfish and stingray delta-MSH, and it appears that the delta-MSH sequence may have been the result of an internal domain duplication and reinsertion of the beta-MSH sequence. The presence of the delta-MSH sequence in the POMC genes of representatives of both superorders of infraclass Neoselachii would indicate that the delta-MSH sequence must have been present in the ancestral euselachian shark that gave rise to the neoselachian radiation.

  9. Expressed sequence tags from a NaCl-treated Suaeda salsa cDNA library.

    PubMed

    Zhang, L; Ma, X L; Zhang, Q; Ma, C L; Wang, P P; Sun, Y F; Zhao, Y X; Zhang, H

    2001-04-18

    Past efforts to improve plant tolerance to osmotic stress have had limited success owing to the genetic complexity of stress responses. The first step towards cataloging and categorizing genetically complex abiotic stress responses is the rapid discovery of genes by the large-scale partial sequencing of randomly selected cDNA clones or expressed sequence tags (ESTs). Suaeda salsa, which can survive seawater-level salinity, is a favorite halophytic model for salt-tolerance research. We constructed a NaCl-treated cDNA library of Suaeda salsa and sequenced 1048 randomly selected clones, out of which 1016 clones produced readable sequences (773 showed homology to previously identified genes, 227 matched unknown protein coding regions, 16 anomalous sequences or sequences of bacterial origin were excluded from further analysis). By sequence analysis we identified 492 unique clones: 315 showed homology to previously identified genes, 177 matched unknown protein coding regions (101 of which have been found before in other organisms and 76 are completely novel). All our EST data are available on the Internet. We believe that our dbEST and the associated DNA materials will be a useful resource for scientists engaged in stress-tolerance studies.

  10. Nucleic acid (cDNA) and amino acid sequences of alpha-type gliadins from wheat (Triticum aestivum).

    PubMed Central

    Kasarda, D D; Okita, T W; Bernardin, J E; Baecker, P A; Nimmo, C C; Lew, E J; Dietler, M D; Greene, F C

    1984-01-01

    The complete amino acid sequence for an alpha-type gliadin protein of wheat (Triticum aestivum Linnaeus) endosperm has been derived from a cloned cDNA sequence. An additional cDNA clone that corresponds to about 75% of a similar alpha-type gliadin has been sequenced and shows some important differences. About 97% of the composite sequence of A-gliadin (an alpha-type gliadin fraction) has also been obtained by direct amino acid sequencing. This sequence shows a high degree of similarity with amino acid sequences derived from both cDNA clones and is virtually identical to one of them. On the basis of sequence information, after loss of the signal sequence, the mature alpha-type gliadins may be divided into five different domains, two of which may have evolved from an ancestral gliadin gene, whereas the remaining three contain repeating sequences that may have developed independently. PMID:6589619

  11. Optimal cDNA microarray design using expressed sequence tags for organisms with limited genomic information

    PubMed Central

    Chen, Yian A; Mckillen, David J; Wu, Shuyuan; Jenny, Matthew J; Chapman, Robert; Gross, Paul S; Warr, Gregory W; Almeida, Jonas S

    2004-01-01

    Background Expression microarrays are increasingly used to characterize environmental responses and host-parasite interactions for many different organisms. Probe selection for cDNA microarrays using expressed sequence tags (ESTs) is challenging due to high sequence redundancy and potential cross-hybridization between paralogous genes. In organisms with limited genomic information, like marine organisms, this challenge is even greater due to annotation uncertainty. No general tool is available for cDNA microarray probe selection for these organisms. Therefore, the goal of the design procedure described here is to select a subset of ESTs that will minimize sequence redundancy and characterize potential cross-hybridization while providing functionally representative probes. Results Sequence similarity between ESTs, quantified by the E-value of pair-wise alignment, was used as a surrogate for expected hybridization between corresponding sequences. Using this value as a measure of dissimilarity, sequence redundancy reduction was performed by hierarchical cluster analyses. The choice of how many microarray probes to retain was made based on an index developed for this research: a sequence diversity index (SDI) within a sequence diversity plot (SDP). This index tracked the decreasing within-cluster sequence diversity as the number of clusters increased. For a given stage in the agglomeration procedure, the EST having the highest similarity to all the other sequences within each cluster, the centroid EST, was selected as a microarray probe. A small dataset of ESTs from Atlantic white shrimp (Litopenaeus setiferus) was used to test this algorithm so that the detailed results could be examined. The functional representative level of the selected probes was quantified using Gene Ontology (GO) annotations. Conclusions For organisms with limited genomic information, combining hierarchical clustering methods to analyze ESTs can yield an optimal cDNA microarray design. If
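    The probe-selection step described in this record, choosing within each cluster the EST most similar to all the others, can be sketched as follows. This Python example is a minimal illustration, not the authors' implementation: the cluster assignments are assumed to be given already, the dissimilarity values are invented toy numbers rather than alignment E-values, and the function name is hypothetical.

```python
def centroid_index(members, dissimilarity):
    """Return the cluster member with the smallest total dissimilarity
    to the other members (ties go to the first member listed)."""
    return min(
        members,
        key=lambda i: sum(dissimilarity[i][j] for j in members if j != i),
    )


# Toy symmetric dissimilarity matrix for five ESTs (hypothetical values).
d = [
    [0.0, 0.1, 0.2, 0.9, 0.8],
    [0.1, 0.0, 0.1, 0.9, 0.9],
    [0.2, 0.1, 0.0, 0.8, 0.9],
    [0.9, 0.9, 0.8, 0.0, 0.3],
    [0.8, 0.9, 0.9, 0.3, 0.0],
]

# Cluster membership as it might come out of a hierarchical clustering step.
clusters = {"A": [0, 1, 2], "B": [3, 4]}

# One representative probe per cluster.
probes = {name: centroid_index(members, d) for name, members in clusters.items()}
print(probes)
```

    In this toy case EST 1 represents cluster A because it is closest to both of its neighbours; in a two-member cluster such as B the choice is arbitrary, and the sketch simply keeps the first member.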

  12. cDNA sequence, genomic organization, and evolutionary conservation of a novel gene from the WAGR region

    SciTech Connect

    Schwartz, F.; Eisenman, R.; Knoll, J.; Bruns, G.

    1995-09-20

    A new gene (239FB) with predominant and differential expression in fetal brain has recently been isolated from a chromosome 11p13-p14 boundary area near FSHB. The corresponding mRNA has an open reading frame of 294 amino acids, a 3' untranslated region of 1247 nucleotides, and a highly GC-rich 5' untranslated region. The coding and 3' UT sequence is specified by 6 exons within nearly 87 kb of isolated genomic locus. The 5' end region of the transcript maps adjacent to the only genomically defined CpG island in a chromosomal subregion that may be associated with part of the mental retardation of some WAGR (Wilms tumor, aniridia, genitourinary anomalies, and mental retardation) syndrome patients. In addition to nucleotide and amino acid similarity to an EST from a normalized infant brain cDNA library, the predicted protein has extensive similarity to Caenorhabditis elegans polypeptides of, as yet, unknown function. The 239FB locus is, therefore, likely part of a family of genes with two members expressed in human brain. The extensive conservation of the predicted protein suggests a fundamental function of the gene product and will enable evaluation of the role of the 239FB gene in neurogenesis in model organisms. 48 refs., 4 figs., 1 tab.

  13. Sequencing of first-strand cDNA library reveals full-length transcriptomes

    PubMed Central

    Agarwal, Saurabh; Macfarlan, Todd S.; Sartor, Maureen A.; Iwase, Shigeki

    2016-01-01

    Massively parallel strand-specific sequencing of RNA (ssRNA-seq) has emerged as a powerful tool for profiling complex transcriptomes. However, many current methods for ssRNA-seq suffer from the underrepresentation of both the 5′ and 3′ ends of RNAs, which can be attributed to second-strand cDNA synthesis. The 5′ and 3′ ends of RNA harbour crucial information for gene regulation; namely, transcription start sites (TSSs) and polyadenylation sites. Here we report a novel ssRNA-seq method that does not involve second-strand cDNA synthesis, as we Directly Ligate sequencing Adaptors to the First-strand cDNA (DLAF). This novel method with fewer enzymatic reactions results in a higher quality of the libraries than the conventional method. Sequencing of DLAF libraries followed by a novel analysis pipeline enables the profiling of both 5′ ends and polyadenylation sites at near-base resolution. Therefore, DLAF offers the first genomics tool to obtain the ‘full-length’ transcriptome with a single library. PMID:25607527

  14. cDNA sequence and chromosomal localization of human enterokinase, the proteolytic activator of trypsinogen.

    PubMed

    Kitamoto, Y; Veile, R A; Donis-Keller, H; Sadler, J E

    1995-04-11

    Enterokinase is a serine protease of the duodenal brush border membrane that cleaves trypsinogen and produces active trypsin, thereby leading to the activation of many pancreatic digestive enzymes. Overlapping cDNA clones that encode the complete human enterokinase amino acid sequence were isolated from a human intestine cDNA library. Starting from the first ATG codon, the composite 3696 nt cDNA sequence contains an open reading frame of 3057 nt that encodes a 784 amino acid heavy chain followed by a 235 amino acid light chain; the two chains are linked by at least one disulfide bond. The heavy chain contains a potential N-terminal myristoylation site, a potential signal anchor sequence near the amino terminus, and six structural motifs that are found in otherwise unrelated proteins. These domains resemble motifs of the LDL receptor (two copies), complement component C1r (two copies), the metalloprotease meprin (one copy), and the macrophage scavenger receptor (one copy). The enterokinase light chain is homologous to the trypsin-like serine proteinases. These structural features are conserved among human, bovine, and porcine enterokinase. By Northern blotting, a 4.4 kb enterokinase mRNA was detected only in small intestine. The enterokinase gene was localized to human chromosome 21q21 by fluorescence in situ hybridization.

  15. The complete nucleotide sequence of chrysanthemum stem necrosis virus.

    PubMed

    Dullemans, A M; Verhoeven, J Th J; Kormelink, R; van der Vlugt, R A A

    2015-02-01

    The complete genome sequence of chrysanthemum stem necrosis virus (CSNV) was determined using Roche 454 next-generation sequencing. CSNV is a tentative member of the genus Tospovirus within the family Bunyaviridae, whose members are arthropod-borne. This is the first report of the entire RNA genome sequence of a CSNV isolate. The large RNA of CSNV is 8955 nucleotides (nt) in size and contains a single open reading frame of 8625 nt in the antisense arrangement, coding for the putative RNA-dependent RNA polymerase (L protein) of 2874 aa with a predicted Mr of 331 kDa. Two untranslated regions of 397 and 33 nt are present at the 5' and 3' termini, respectively. The medium (M) and small (S) RNAs are 4830 and 2947 nt in size, respectively, and show 99 % identity to the corresponding genomic segments of previously partially characterized CSNV genomes. Protein sequences for the precursor of the Gn/Gc proteins, N and NSs, are identical in length in all of the analysed CSNV isolates.
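
    The open reading frame and protein lengths quoted in this abstract are tied together by the triplet code. The following is a minimal Python sketch of that arithmetic, not anything taken from the paper; the function name and the `includes_stop` flag are hypothetical, and the sketch assumes the stated ORF length counts the terminal stop codon.

```python
def protein_length_from_orf(orf_nt, includes_stop=True):
    """Return the number of amino acids encoded by an ORF of orf_nt nucleotides.

    Assumption: the ORF is in frame, so its length is a multiple of three.
    If includes_stop is True, the final codon is a stop codon and is not
    counted as an amino acid.
    """
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


# The 8625 nt ORF reported for the CSNV large RNA: 2875 codons, one of them a stop.
print(protein_length_from_orf(8625))
```

    Under that assumption the 8625 nt ORF corresponds to the 2874 aa L protein reported above. Some abstracts quote ORF lengths without the stop codon, which is why the flag is exposed.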

  16. Generalized Levy-walk model for DNA nucleotide sequences

    NASA Technical Reports Server (NTRS)

    Buldyrev, S. V.; Goldberger, A. L.; Havlin, S.; Simons, M.; Stanley, H. E.

    1993-01-01

    We propose a generalized Levy walk to model fractal landscapes observed in noncoding DNA sequences. We find that this model provides a very close approximation to the empirical data and explains a number of statistical properties of genomic DNA sequences such as the distribution of strand-biased regions (those with an excess of one type of nucleotide) as well as local changes in the slope of the correlation exponent alpha. The generalized Levy-walk model simultaneously accounts for the long-range correlations in noncoding DNA sequences and for the apparently paradoxical finding of long subregions of biased random walks (length l_j) within these correlated sequences. In the generalized Levy-walk model, the l_j are chosen from a power-law distribution, P(l_j) ∝ l_j^(-mu). The correlation exponent alpha is related to mu through alpha = 2 - mu/2 if 2 < mu < 3. The model is consistent with the finding of "repetitive elements" of variable length interspersed within noncoding DNA.
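
    The model in this abstract can be illustrated with a short Python sketch. This is an illustration under stated assumptions, not the authors' code: segment lengths are drawn from a power law by inverse-transform sampling, each segment is a random walk biased in a randomly chosen direction, and the exponent relation alpha = 2 - mu/2 is taken directly from the abstract. The function names, the minimum length `l_min`, and the `bias` strength are all hypothetical choices.

```python
import random


def correlation_exponent(mu):
    """alpha = 2 - mu/2, the relation stated in the abstract for 2 < mu < 3."""
    if not 2 < mu < 3:
        raise ValueError("the relation is only stated for 2 < mu < 3")
    return 2 - mu / 2


def sample_segment_length(mu, l_min=1.0, rng=random):
    """Draw l >= l_min from P(l) ~ l**(-mu) by inverse-transform sampling."""
    u = 1.0 - rng.random()  # u lies in (0, 1], so the power below is finite
    return l_min * u ** (-1.0 / (mu - 1.0))


def levy_walk(n_steps, mu, bias=0.2, seed=0):
    """Return n_steps + 1 walk positions built from biased power-law segments."""
    rng = random.Random(seed)
    positions = [0]
    while len(positions) <= n_steps:
        length = max(1, int(sample_segment_length(mu, rng=rng)))
        direction = rng.choice((-1, 1))      # sign of the bias in this segment
        p_up = 0.5 * (1 + direction * bias)  # probability of a +1 step
        for _ in range(length):
            step = 1 if rng.random() < p_up else -1
            positions.append(positions[-1] + step)
            if len(positions) > n_steps:
                break
    return positions


walk = levy_walk(1000, mu=2.5)
print(len(walk), correlation_exponent(2.5))
```

    For a real sequence the +1 and -1 steps would come from the nucleotides themselves (for example purine versus pyrimidine); the sketch only generates a synthetic landscape with the same segment-length statistics.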

  17. Crustacean hyperglycemic hormones of two cold water crab species, Chionoecetes opilio and C. japonicus: isolation of cDNA sequences and localization of CHH neuropeptide in eyestalk ganglia.

    PubMed

    Chung, J Sook; Ahn, I S; Yu, O H; Kim, D S

    2015-04-01

    Crustacean hyperglycemic hormone (CHH) is primarily known for its prototypical function in hyperglycemia which is induced by the release of CHH. The CHH release takes place as an adaptive response to the energy demands of the animals experiencing stressful environmental, physiological or behavioral conditions. Although >63 decapod CHH nucleotide sequences are known (GenBank), the majority of them is garnered from the species inhabiting shallow and warm water. In order to understand the adaptive role of CHH in Chionoecetes opilio and Chionoecetes japonicus inhabiting deep water environments, we first aimed for the isolation of the full-length cDNA sequence of CHH from the eyestalk ganglia of C. opilio (ChoCHH) and C. japonicus (ChjCHH) using degenerate PCR and 5' and 3' RACE. Cho- and ChjCHH cDNA sequences are identical in 5' UTR and ORF with 100% sequence identity of the putative 138aa of preproCHHs. The length of 3' UTR ChjCHH cDNA sequence is 39 nucleotides shorter than that of ChoCHH. This is the first report in decapod crustaceans that two different species have the identical sequence of CHH. ChoCHH expression increases during embryogenesis of C. opilio and is significantly higher in adult males and females. C. japonicus males have slightly higher ChjCHH expression than C. opilio males, but no statistical difference. In both species, the immunostaining intensity of CHH is stronger in the sinus gland than that of X-organ cells. Future studies will enable us to gain better understanding of the comparative metabolic physiology and endocrinology of cold, deep water species of Chionoecetes spp.

  18. Sequence of a cDNA encoding the bi-specific NAD(P)H-nitrate reductase from the tree Betula pendula and identification of conserved protein regions.

    PubMed

    Friemann, A; Brinkmann, K; Hachtel, W

    1991-05-01

    Nitrate reductase (NR) assays revealed a bispecific NAD(P)H-NR (EC 1.6.6.2.) to be the only nitrate-reducing enzyme in leaves of hydroponically grown birches. To obtain the primary structure of the NAD(P)H-NR, leaf poly(A)+ mRNA was used to construct a cDNA library in the lambda gt11 phage. Recombinant clones were screened with heterologous gene probes encoding NADH-NR from tobacco and squash. A 3.0 kb cDNA was isolated which hybridized to a 3.2 kb mRNA whose level was significantly higher in plants grown on nitrate than in those grown on ammonia. The nucleotide sequence of the cDNA comprises a reading frame encoding a protein of 898 amino acids which reveals 67%-77% identity with NADH-nitrate reductase sequences from higher plants. To identify conserved and variable regions of the multicentre electron-transfer protein a graphical evaluation of identities found in NR sequence alignments was carried out. Thirteen well-conserved sections exceeding a size of 10 amino acids were found in higher plant nitrate reductases. Sequence comparisons with related redox proteins indicate that about half of the conserved NR regions are involved in cofactor binding. The most striking difference in the birch NAD(P)H-NR sequence in comparison to NADH-NR sequences was found at the putative pyridine nucleotide binding site. Southern analysis indicates that the bi-specific NR is encoded by a single copy gene in birch.

  19. Cloning and sequence analysis of a cDNA encoding a Brazil nut protein exceptionally rich in methionine.

    PubMed

    Altenbach, S B; Pearson, K W; Leung, F W; Sun, S S

    1987-05-01

    The primary amino acid sequence of an abundant methionine-rich seed protein found in Brazil nut (Bertholletia excelsa H.B.K.) has been elucidated by protein sequencing and from the nucleotide sequence of cDNA clones. The 9 kDa subunit of this protein was found to contain 77 amino acids of which 14 were methionine (18%) and 6 were cysteine (8%). Over half of the methionine residues in this subunit are clustered in two regions of the polypeptide where they are interspersed with arginine residues. In one of these regions, methionine residues account for 5 out of 6 amino acids and four of these methionine residues are contiguous. The sequence data verifies that the Brazil nut sulfur-rich protein is synthesized as a precursor polypeptide that is considerably larger than either of the two subunits of the mature protein. Three proteolytic processing steps by which the encoded polypeptide is sequentially trimmed to the 9 kDa and 3 kDa subunit polypeptides have been correlated with the sequence information. In addition, we have found that the sulfur-rich protein from Brazil nut is homologous in its amino acid sequence to small water-soluble proteins found in two other oilseeds, castor bean (Ricinus communis) and rapeseed (Brassica napus). When the amino acid sequences of these three proteins are aligned to maximize homology, the arrangement of cysteine residues is conserved. However, the two subunits of the Brazil nut protein contain over 19% methionine whereas the homologous proteins from castor bean and rapeseed contain only 2.1% and 2.6% methionine, respectively.
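
    The composition percentages quoted in this abstract follow from simple residue counts. A minimal Python sketch of the calculation is given below; the function names are hypothetical, and the sequence-based helper assumes a protein written in one-letter amino acid codes.

```python
from collections import Counter


def residue_percent(count, total):
    """Percentage of a protein made up by `count` residues out of `total`."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


def composition_percent(sequence):
    """Map each one-letter residue code in `sequence` to its percentage."""
    sequence = sequence.upper()
    counts = Counter(sequence)
    return {aa: residue_percent(n, len(sequence)) for aa, n in counts.items()}


# Counts reported for the 9 kDa subunit: 14 Met and 6 Cys out of 77 residues.
print(round(residue_percent(14, 77)), round(residue_percent(6, 77)))
```

    With the counts reported for the 9 kDa subunit this reproduces the rounded 18% methionine and 8% cysteine figures.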

  20. Interspecies diversity of the occludin sequence: cDNA cloning of human, mouse, dog, and rat-kangaroo homologues.

    PubMed

    Ando-Akatsuka, Y; Saitou, M; Hirase, T; Kishi, M; Sakakibara, A; Itoh, M; Yonemura, S; Furuse, M; Tsukita, S

    1996-04-01

    Occludin has been identified from chick liver as a novel integral membrane protein localizing at tight junctions (Furuse, M., T. Hirase, M. Itoh, A. Nagafuchi, S. Yonemura, Sa. Tsukita, and Sh. Tsukita. 1993. J. Cell Biol. 123:1777-1788). To analyze and modulate the functions of tight junctions, it would be advantageous to know the mammalian homologues of occludin and their genes. Here we describe the nucleotide sequences of full length cDNAs encoding occludin of rat-kangaroo (potoroo), human, mouse, and dog. Rat-kangaroo occludin cDNA was prepared from RNA isolated from PtK2 cell culture, using a mAb against chicken occludin, whereas the others were amplified by polymerase chain reaction based on the sequence found around the human neuronal apoptosis inhibitory protein gene. The amino acid sequences of the three mammalian (human, murine, and canine) occludins were very closely related to each other (approximately 90% identity), whereas they diverged considerably from those of chicken and rat-kangaroo (approximately 50% identity). Implications of these data and novel experimental options in cell biological research are discussed.

  1. Nucleotide sequences specific to Yersinia pestis and methods for the detection of Yersinia pestis

    DOEpatents

    McCready, Paula M. [Tracy, CA]; Radnedge, Lyndsay [San Mateo, CA]; Andersen, Gary L. [Berkeley, CA]; Ott, Linda L. [Livermore, CA]; Slezak, Thomas R. [Livermore, CA]; Kuczmarski, Thomas A. [Livermore, CA]; Motin, Vladimir L. [League City, TX]

    2009-02-24

    Nucleotide sequences specific to Yersinia pestis that serve as markers or signatures for identification of this bacterium were identified. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.

  2. Nucleotide sequences specific to Francisella tularensis and methods for the detection of Francisella tularensis

    DOEpatents

    McCready, Paula M.; Radnedge, Lyndsay; Andersen, Gary L.; Ott, Linda L.; Slezak, Thomas R.; Kuczmarski, Thomas A.; Vitalis, Elizabeth A

    2009-02-24

    Described herein is the identification of nucleotide sequences specific to Francisella tularensis that serve as markers or signatures for identification of this bacterium. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.

  3. Nucleotide sequences specific to Francisella tularensis and methods for the detection of Francisella tularensis

    DOEpatents

    McCready, Paula M.; Radnedge, Lyndsay; Andersen, Gary L.; Ott, Linda L.; Slezak, Thomas R.; Kuczmarski, Thomas A.; Vitalis, Elizabeth A

    2007-02-06

    Described herein is the identification of nucleotide sequences specific to Francisella tularensis that serve as markers or signatures for identification of this bacterium. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.

  4. Nucleotide sequences specific to Brucella and methods for the detection of Brucella

    DOEpatents

    McCready, Paula M.; Radnedge, Lyndsay; Andersen, Gary L.; Ott, Linda L.; Slezak, Thomas R.; Kuczmarski, Thomas A.

    2009-02-24

    Nucleotide sequences specific to Brucella that serve as markers or signatures for identification of this bacterium were identified. In addition, forward and reverse primers and hybridization probes derived from these nucleotide sequences that are used in nucleotide detection methods to detect the presence of the bacterium are disclosed.

  5. Nucleic acid (cDNA) and amino acid sequences of the maize endosperm protein glutelin-2.

    PubMed Central

    Prat, S; Cortadas, J; Puigdomènech, P; Palau, J

    1985-01-01

    The cDNA coding for a glutelin-2 protein from maize endosperm has been cloned and the complete amino acid sequence of the protein derived for the first time. An immature maize endosperm cDNA bank was screened for the expression of a beta-lactamase:glutelin-2 (G2) fusion polypeptide by using antibodies against the purified 28 kd G2 protein. A clone corresponding to the 28 kd G2 protein was sequenced and the primary structure of this protein was derived. Five regions can be defined in the protein sequence: an 11 residue N-terminal part, a repeated region formed by eight units of the sequence Pro-Pro-Pro-Val-His-Leu, an alternating Pro-X stretch 21 residues long, a Cys rich domain and a C-terminal part rich in Gln. The protein sequence is preceded by 19 residues which have the characteristics of the signal peptide found in secreted proteins. Unlike zeins, the main maize storage proteins, 28 kd glutelin-2 has several homologous sequences in common with other cereal storage proteins. Images PMID:3839076

  6. cDNA sequences of two inducible T-cell genes

    SciTech Connect

    Kwon, B.S. (Guthrie Research Institute, Sayre, PA); Weissman, S.M.

    1989-03-01

    The authors have previously described a set of human T-lymphocyte-specific cDNA clones isolated by a modified differential screening procedure. Apparent full-length cDNAs containing the sequences of 14 of the 16 initial isolates were sequenced and were found to represent five different species of mRNA; three of the five species were identical to previously reported cDNA sequences of preproenkephalin, T-cell-replacing factor, and a serine esterase, respectively. The other two species, 4-1BB and L2G25B, were inducible sequences found in mRNA from both a cytolytic T-lymphocyte and a helper T-lymphocyte clone and were not previously described in T-cell mRNA; these mRNA sequences encode peptides of 256 and 92 amino acids, respectively. Both peptides contain putative leader sequences. The protein encoded by 4-1BB also has a potential membrane anchor segment and other features also seen in known receptor proteins.

  7. Nucleotide sequence of the large double-stranded RNA segment of bacteriophage phi 6: genes specifying the viral replicase and transcriptase.

    PubMed Central

    Mindich, L; Nemhauser, I; Gottlieb, P; Romantschuk, M; Carton, J; Frucht, S; Strassman, J; Bamford, D H; Kalkkinen, N

    1988-01-01

    The genome of the lipid-containing bacteriophage phi 6 contains three segments of double-stranded RNA. We determined the nucleotide sequence of cDNA derived from the largest RNA segment (L). This segment specifies the procapsid proteins necessary for transcription and replication of the phi 6 genome. The coding sequences of the four proteins on this segment were identified on the basis of size and the correlation of predicted N-terminal amino acid sequences with those found through analysis of isolated proteins. This report completes the sequence analysis of phi 6. This constitutes the first complete sequence of a double-stranded RNA genome virus. PMID:3346944

  8. Nucleotide sequence of the chicken 5-aminolevulinate synthase gene.

    PubMed Central

    Maguire, D J; Day, A R; Borthwick, I A; Srivastava, G; Wigley, P L; May, B K; Elliott, W H

    1986-01-01

    5-Aminolevulinate synthase, the first and rate-controlling enzyme of heme biosynthesis, is regulated in the liver by the end-product heme. To study this negative control mechanism, we have isolated the chicken gene for ALA-synthase and determined the nucleotide sequence. The structural gene is 6.9 kb long and contains 10 exons. The transcriptional start site for ALA-synthase was determined by primer extension analysis. A fragment of 291 bp from the 5' flanking region including 34 bp of the first exon shows promoter activity when introduced upstream of a chicken histone H2B gene and injected into the nuclei of Xenopus laevis oocytes. Images PMID:3005973

  9. Complete nucleotide sequence of a native plasmid from Brevibacterium linens.

    PubMed

    Moore, Mathew; Svenson, Charles; Bowling, David; Glenn, Dianne

    2003-03-01

    Brevibacterium linens has commercial significance in the dairy industry and potential application in the production of bacteriocins and carotenoids. Strain development of these industrially significant organisms would be facilitated by the use of vectors, yet few are available. In this study we report the isolation of four novel plasmids from the Gram-positive coryneform B. linens, and determine the first complete nucleotide sequence of a native plasmid of B. linens. The cryptic plasmid pLIM is 7610 bp in length, and belongs to a subfamily of theta replicating ColE2-related plasmids. Initial investigation suggests that replication in pLIM requires two replicases, a primase (RepA) and a DNA binding protein (RepB), encoded by a single operon repAB. The origin of replication is located upstream of repAB transcription.

  10. Base sequence context effects on nucleotide excision repair.

    PubMed

    Cai, Yuqin; Patel, Dinshaw J; Broyde, Suse; Geacintov, Nicholas E

    2010-08-23

    Nucleotide excision repair (NER) plays a critical role in maintaining the integrity of the genome when damaged by bulky DNA lesions, since inefficient repair can cause mutations and human diseases, notably cancer. The structural properties of DNA lesions that determine their relative susceptibilities to NER are therefore of great interest. As a model system, we have investigated the major mutagenic lesion derived from the environmental carcinogen benzo[a]pyrene (B[a]P), 10S (+)-trans-anti-B[a]P-N(2)-dG in six different sequence contexts that differ in how the lesion is positioned in relation to nearby guanine amino groups. We have obtained molecular structural data by NMR and MD simulations, bending properties from gel electrophoresis studies, and NER data obtained from human HeLa cell extracts for our six investigated sequence contexts. This model system suggests that disturbed Watson-Crick base pairing is a better recognition signal than a flexible bend, and that these can act in concert to provide an enhanced signal. Steric hindrance between the minor groove-aligned lesion and nearby guanine amino groups determines the exact nature of the disturbances. Both nearest neighbor and more distant neighbor sequence contexts have an impact. Regardless of the exact distortions, we hypothesize that they provide a local thermodynamic destabilization signal for repair.

  11. Molecular cloning, sequencing and expression of cDNA encoding human trehalase.

    PubMed

    Ishihara, R; Taketani, S; Sasai-Takedatsu, M; Kino, M; Tokunaga, R; Kobayashi, Y

    1997-11-20

    A complete cDNA clone encoding human trehalase, a glycoprotein of brush-border membranes, has been isolated from a human kidney library. The cDNA encodes a protein of 583 amino acids with a calculated molecular weight of 66,595. Human enzyme contains a typical cleavable signal peptide at amino terminus, five potential glycosylation sites, and a hydrophobic region at carboxyl terminus where the protein is anchored to plasma membranes via glycosylphosphatidylinositol. The deduced amino acid sequence of the human enzyme showed similarity to sequences of the enzyme from rabbit, silk worm, Tenebrio molitor, Escherichia coli and yeast. Northern blots revealed that human trehalase mRNA of approx. 2.0 kb was found mainly in the kidney, liver and small intestine. Expression of the recombinant trehalase in E. coli provided a high level of the enzyme activity. The isolation and expression of cDNA for human trehalase should facilitate studies of the structure of the gene, as well as a basis for a better understanding of the catalytic mechanism.

  12. Nucleotide sequence of the hemolysin I gene from Actinobacillus pleuropneumoniae.

    PubMed Central

    Frey, J; Meier, R; Gygi, D; Nicolet, J

    1991-01-01

    The DNA sequence of the gene encoding the structural protein of hemolysin I (HlyI) of Actinobacillus pleuropneumoniae serotype 1 strain 4074 was analyzed. The nucleotide sequence shows a 3,072-bp reading frame encoding a protein of 1,023 amino acids with a calculated molecular size of 110.1 kDa. This corresponds to the HlyI protein, which has an apparent molecular size on sodium dodecyl sulfate gels of 105 kDa. The structure of the protein derived from the DNA sequence shows three hydrophobic regions in the N-terminal part of the protein, 13 glycine-rich domains in the second half of the protein, and a hydrophilic C-terminal area, all of which are typical of the cytotoxins of the RTX (repeats in the structural toxin) toxin family. The derived amino acid sequence of HlyI shows 42% homology with the hemolysin of A. pleuropneumoniae serotype 5, 41% homology with the leukotoxin of Pasteurella haemolytica, and 56% homology with the Escherichia coli alpha-hemolysin. The 13 glycine-rich repeats and three hydrophobic areas of the HlyI sequence show more similarity to the E. coli alpha-hemolysin than to either the A. pleuropneumoniae serotype 5 hemolysin or the leukotoxin (while the last two are more similar to each other). Two types of RTX hemolysins therefore seem to be present in A. pleuropneumoniae, one (HlyI) resembling the alpha-hemolysin and a second more closely related to the leukotoxin. Ca(2+)-binding experiments using HlyI and recombinant A. pleuropneumoniae prohemolysin (HlyIA) that was produced in E. coli show that HlyI binds 45Ca2+, probably because of the 13 glycine-rich repeated domains. Activation of the prohemolysin is not required for Ca2+ binding. Images PMID:1879928

  13. A novel regucalcin gene promoter region-related protein: comparison of nucleotide and amino acid sequences in vertebrate species.

    PubMed

    Sawada, Natsumi; Yamaguchi, Masayoshi

    2005-01-01

    The molecular cloning and sequencing of the cDNA coding for a novel regucalcin gene promoter region-related protein (RGPR-p117) from bovine, rabbit and chicken livers was investigated using rapid amplification of cDNA ends (RACE) method. Their nucleotide and amino acid sequences were compared with human, rat and mouse sequences published previously. RGPR-p117 of bovine, rabbit and chicken livers consisted of 1052, 1045, and 929 amino acid residues with calculated molecular mass of 117, 114, and 103 kDa, and estimated pI of 5.64, 5.84, and 5.59, respectively. Comparison analysis revealed that the nucleotide sequences of RGPR-p117 from mammalian species were highly conserved in their coding region, and the homologies were at least 72.9%. The RGPR-p117 proteins in mammalian species consisted of 1045-1060 amino acids, and had 63.1-90.2% identity. Meanwhile, the nucleotide and amino acid sequences of chicken RGPR-p117 had at least 36.4 and 43.7% identities, respectively. Phylogenetic analysis showed that RGPR-p117 in six vertebrates appears to form a single cluster. Mammalian RGPR-p117 conserved a leucine zipper motif. Moreover, the analysis for subcellular localization of RGPR-p117 from six vertebrates showed the probability of nuclear localization >52.2%; the nuclear localization in rat and mouse was 78.3%. This study demonstrates a great conservation of RGPR-p117 genes throughout evolution.

  14. Nucleotide sequence of medium-chain acyl-CoA dehydrogenase mRNA and its expression in enzyme-deficient human tissue

    SciTech Connect

    Kelly, D.P.; Kim, J.J.; Billadello, J.J.; Hainline, B.E.; Chu, T.W.; Strauss, A.W.

    1987-06-01

    Medium-chain acyl-CoA dehydrogenase is one of three similar enzymes that catalyze the initial step of fatty acid β-oxidation. Definition of the primary structure of MCAD and the tissue distribution of its mRNA is of biochemical and clinical importance because of the recent recognition of inherited MCAD deficiency in humans. The MCAD mRNA nucleotide sequence was determined from two overlapping cDNA clones isolated from human liver and placental cDNA libraries, respectively. The MCAD mRNA includes a 1263-base-pair coding region and a 738-base-pair 3'-nontranslated region. A partial amino acid sequence (137 residues) determined on peptides derived from MCAD purified from porcine liver confirmed the identity of the cDNA clone. Comparison of the amino acid sequence predicted from the human MCAD cDNA with the partial protein sequence of the porcine MCAD revealed a high degree (88%) of interspecies sequence identity. RNA blot analysis shows that MCAD mRNA is expressed in a variety of rat (2.2 kilobases) and human (2.4 kilobases) tissues. Blot hybridization of RNA prepared from cultured skin fibroblasts from a patient with MCAD deficiency disclosed that mRNA was present and of similar size to MCAD mRNA derived from control fibroblasts. The isolation and characterization of MCAD cDNA is an important step in the definition of the defect underlying its metabolic consequences.

  15. 2058 Expressed sequence tags (ESTs) from a human fetal lung cDNA library

    SciTech Connect

    Kazunori, Sudo; Katsuya Chinen; Yusuke Nakamura

    1994-11-15

    ESTs (expressed sequence tags) provide complementary resources for structural and functional analyses of the human genome. The authors have performed single-pass sequencing of 2058 randomly selected, directionally cloned cDNAs isolated from a fetal-lung cDNA library constructed with oligo (dT) primers. Computer analyses of the 5′-end sequences revealed that 60.4% of the clones were considered to be identical to previously reported human genes or ESTs; 9.0% of them showed significant homology to known genes in human, other mammals, or lower organisms; 30.6% showed no homology to any genes or DNA sequences in the public database. These data and reagents will be useful for future investigations of gene expression during prenatal development of human lung. 11 refs., 1 fig., 2 tabs.

  16. [Nucleotide sequence of genes for alpha- and beta-subunits of luciferase from Photobacterium leiognathi].

    PubMed

    Illarionov, B A; Protopopova, M V; Karginov, V A; Mertvetsov, N P; Gitel'zon, I I

    1988-03-01

    Nucleotide sequence of the Photobacterium leiognathi DNA containing genes of alpha and beta subunits of luciferase has been determined. We also deduced amino acid sequence and molecular mass of luciferase and localized luciferase genes in the sequenced DNA fragment.

  17. Time scale for cyclostome evolution inferred with a phylogenetic diagnosis of hagfish and lamprey cDNA sequences.

    PubMed

    Kuraku, Shigehiro; Kuratani, Shigeru

    2006-12-01

    The Cyclostomata consists of the two orders Myxiniformes (hagfishes) and Petromyzoniformes (lampreys), and its monophyly has been unequivocally supported by recent molecular phylogenetic studies. Under this updated vertebrate phylogeny, we performed in silico evolutionary analyses using currently available cDNA sequences of cyclostomes. We first calculated the GC-content at four-fold degenerate sites (GC(4)), which revealed that an extremely high GC-content is shared by all the lamprey species we surveyed, whereas no striking pattern in GC-content was observed in any of the hagfish species surveyed. We then estimated the timing of diversification in cyclostome evolution using nucleotide and amino acid sequences. We obtained divergence times of 470-390 million years ago (Mya) in the Ordovician-Silurian-Devonian Periods for the interordinal split between Myxiniformes and Petromyzoniformes; 90-60 Mya in the Cretaceous-Tertiary Periods for the split between the two hagfish subfamilies, Myxininae and Eptatretinae; 280-220 Mya in the Permian-Triassic Periods for the split between the two lamprey subfamilies, Geotriinae and Petromyzoninae; and 30-10 Mya in the Tertiary Period for the split between the two lamprey genera, Petromyzon and Lethenteron. This evolutionary configuration indicates that Myxiniformes and Petromyzoniformes diverged shortly after the common ancestor of cyclostomes split from the future gnathostome lineage. Our results also suggest that intra-subfamilial diversification in hagfish and lamprey lineages (especially those distributed in the northern hemisphere) occurred in the Cretaceous or Tertiary Periods.
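
    The GC(4) statistic mentioned in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: it assumes the standard genetic code, an in-frame coding sequence, and the eight codon families whose third position is four-fold degenerate; the names `FOURFOLD_PREFIXES` and `gc4` are hypothetical.

```python
# First two bases of the eight codon families that are four-fold degenerate
# at the third position in the standard genetic code:
# Leu (CTN), Val (GTN), Ser (TCN), Pro (CCN), Thr (ACN), Ala (GCN),
# Arg (CGN), Gly (GGN).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}


def gc4(cds):
    """Fraction of G or C at four-fold degenerate third codon positions.

    `cds` is assumed to be an in-frame coding sequence; trailing bases that
    do not complete a codon, and codons with an ambiguous third base, are
    ignored.
    """
    cds = cds.upper().replace("U", "T")
    gc = total = 0
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        third = codon[2]
        if codon[:2] in FOURFOLD_PREFIXES and third in "ACGT":
            total += 1
            gc += third in "GC"
    if total == 0:
        raise ValueError("no four-fold degenerate sites found")
    return gc / total


# Toy input for illustration only: Ala (GCC), Ala (GCA), Met (ATG), Gly (GGG).
print(gc4("GCCGCAATGGGG"))
```

    Restricting the count to four-fold degenerate sites is the usual reason for using GC(4): any base at those positions leaves the encoded amino acid unchanged, so the statistic reflects base composition rather than protein-level constraint.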

  18. cDNA cloning and sequencing of human fibrillarin, a conserved nucleolar protein recognized by autoimmune antisera

    SciTech Connect

    Aris, J.P.; Blobel, G.

    1991-02-01

    The authors have isolated a 1.1-kilobase cDNA clone that encodes human fibrillarin by screening a hepatoma library in parallel with DNA probes derived from the fibrillarin genes of Saccharomyces cerevisiae (NOP1) and Xenopus laevis. RNA blot analysis indicates that the corresponding mRNA is approximately 1,300 nucleotides in length. Human fibrillarin expressed in vitro migrates on SDS gels as a 36-kDa protein that is specifically immunoprecipitated by antisera from humans with scleroderma autoimmune disease. Human fibrillarin contains an amino-terminal repetitive domain approximately 75-80 amino acids in length that is rich in glycine and arginine residues and is similar to amino-terminal domains in the yeast and Xenopus fibrillarins. The occurrence of a putative RNA-binding domain and an RNP consensus sequence within the protein is consistent with the association of fibrillarin with small nucleolar RNAs. Protein sequence alignments show that 67% of amino acids from human fibrillarin are identical to those in yeast fibrillarin and that 81% are identical to those in Xenopus fibrillarin. This identity suggests the evolutionary conservation of an important function early in the pathway for ribosome biosynthesis.

  19. Molecular Cloning and Sequence Analysis of Novel Cytochrome P450 cDNA Fragments from Dastarcus helophoroides

    PubMed Central

    Wang, Hai-Dong; Li, Fei-Fei; He, Cai; Cui, Jun; Song, Wang; Li, Meng-Lou

    2014-01-01

    The predatory beetle Dastarcus helophoroides (Fairmaire) (Coleoptera: Bothrideridae) is a natural enemy of many longhorned beetles and is mainly distributed in both China and Japan. To date, no research on D. helophoroides P450 enzymes has been reported. In our study, for the better understanding of P450 enzymes in D. helophoroides, 100 novel cDNA fragments encoding cytochrome P450 were amplified from the total RNA of adult D. helophoroides abdomens using five pairs of degenerate primers designed according to the conserved amino acid sequences of the CYP6 family genes in insects through RT-PCR. The obtained nucleotide sequences were 250 bp, 270 bp, and 420 bp in length depending on different primers. Ninety-six fragments were determined to represent CYP6 genes, mainly from CYP6BK, CYP6BQ, and CYP6BR subfamilies, and four fragments were determined to represent CYP9 genes. Twenty-two fragments, submitted to GenBank, were selected for further homologous analysis, which revealed that some fragments of different sizes might be parts of the same P450 gene. PMID:25373175

  20. Molecular cloning and sequence analysis of novel cytochrome P450 cDNA fragments from Dastarcus helophoroides.

    PubMed

    Wang, Hai-Dong; Li, Fei-Fei; He, Cai; Cui, Jun; Song, Wang; Li, Meng-Lou

    2014-02-26

    The predatory beetle Dastarcus helophoroides (Fairmaire) (Coleoptera: Bothrideridae) is a natural enemy of many longhorned beetles and is mainly distributed in both China and Japan. To date, no research on D. helophoroides P450 enzymes has been reported. In our study, for the better understanding of P450 enzymes in D. helophoroides, 100 novel cDNA fragments encoding cytochrome P450 were amplified from the total RNA of adult D. helophoroides abdomens using five pairs of degenerate primers designed according to the conserved amino acid sequences of the CYP6 family genes in insects through RT-PCR. The obtained nucleotide sequences were 250 bp, 270 bp, and 420 bp in length depending on different primers. Ninety-six fragments were determined to represent CYP6 genes, mainly from CYP6BK, CYP6BQ, and CYP6BR subfamilies, and four fragments were determined to represent CYP9 genes. Twenty-two fragments, submitted to GenBank, were selected for further homologous analysis, which revealed that some fragments of different sizes might be parts of the same P450 gene.

  1. Recognizing nucleotides by cross-tunneling currents for DNA sequencing

    NASA Astrophysics Data System (ADS)

    Bagci, V. M. K.; Kaun, Chao-Cheng

    2011-07-01

    Using first-principles calculations, we study electron transport through nucleotides inside a rectangular nanogap formed by two pairs of gold electrodes which are perpendicular and parallel to the nucleobase plane. We propose that this setup will enhance the nucleotide selectivity of tunneling signals to a great extent. Information from three electrical probing processes offers full nucleotide recognition, which survives the noise from neighboring nucleotides and configuration fluctuations.

  2. Nucleotide sequence of the human N-myc gene

    SciTech Connect

    Stanton, L.W.; Schwab, M.; Bishop, J.M.

    1986-03-01

    Human neuroblastomas frequently display amplification and augmented expression of a gene known as N-myc because of its similarity to the protooncogene c-myc. It has therefore been proposed that N-myc is itself a protooncogene, and subsequent tests have shown that N-myc and c-myc have similar biological activities in cell culture. The authors have now detailed the kinship between N-myc and c-myc by determining the nucleotide sequence of human N-myc and deducing the amino acid sequence of the protein encoded by the gene. The topography of N-myc is strikingly similar to that of c-myc: both genes contain three exons of similar lengths; the coding elements of both genes are located in the second and third exons; and both genes have unusually long 5' untranslated regions in their mRNAs, with features that raise the possibility that expression of the genes may be subject to similar controls of translation. The resemblance between the proteins encoded by N-myc and c-myc sustains previous suspicions that the genes encode related functions.

  3. Nucleotide sequence from the coding region of rabbit β-globin messenger RNA

    PubMed Central

    Proudfoot, N.J.

    1976-01-01

    A sequence of 89 nucleotides from rabbit β-globin mRNA has been determined and is shown to code for residues 107 to 137 of the β-globin protein. In addition, a sequence heterogeneity has been identified within this 89 nucleotide long sequence which corresponds to a known polymorphic variant of rabbit β-globin. Images PMID:61580

  4. Characterisation of full-length cDNA sequences provides insights into the Eimeria tenella transcriptome

    PubMed Central

    2012-01-01

    Background: Eimeria tenella is an apicomplexan parasite that causes coccidiosis in the domestic fowl. Infection with this parasite is diagnosed frequently in intensively reared poultry and its control is usually accorded a high priority, especially in chickens raised for meat. Prophylactic chemotherapy has been the primary method used for the control of coccidiosis. However, drug efficacy can be compromised by drug-resistant parasites and the lack of new drugs highlights demands for alternative control strategies including vaccination. In the long term, sustainable control of coccidiosis will most likely be achieved through integrated drug and vaccination programmes. Characterisation of the E. tenella transcriptome may provide a better understanding of the biology of the parasite and aid in the development of a more effective control for coccidiosis. Results: More than 15,000 partial sequences were generated from the 5' and 3' ends of clones randomly selected from an E. tenella second generation merozoite full-length cDNA library. Clustering of these sequences produced 1,529 unique transcripts (UTs). Based on the transcript assembly and subsequent primer walking, 433 full-length cDNA sequences were successfully generated. These sequences varied in length, ranging from 441 bp to 3,083 bp, with an average size of 1,647 bp. Simple sequence repeat (SSR) analysis identified CAG as the most abundant trinucleotide motif, while codon usage analysis revealed that the ten most infrequently used codons in E. tenella are UAU, UGU, GUA, CAU, AUA, CGA, UUA, CUA, CGU and AGU. Subsequent analysis of the E. tenella complete coding sequences identified 25 putative secretory and 60 putative surface proteins, all of which are now rational candidates for development as recombinant vaccines or drug targets in the effort to control avian coccidiosis. Conclusions: This paper describes the generation and characterisation of full-length cDNA sequences from E. tenella second generation

  5. Construction of cDNA library and preliminary analysis of expressed sequence tags from Siberian tiger

    PubMed Central

    Liu, Chang-Qing; Lu, Tao-Feng; Feng, Bao-Gang; Liu, Dan; Guan, Wei-Jun; Ma, Yue-Hui

    2010-01-01

    In this study we successfully constructed a full-length cDNA library from the Siberian tiger, Panthera tigris altaica, the most well-known wild animal. Total RNA was extracted from Siberian tiger fibroblasts cultured in vitro. The titers of the primary and amplified libraries were 1.30×10^6 pfu/ml and 1.62×10^9 pfu/ml, respectively. The proportion of recombinants in the unamplified library was 90.5% and the average length of exogenous inserts was 1.13 kb. A total of 282 individual ESTs with sizes ranging from 328 to 1,142 bp were then analyzed. The BLASTX scores revealed that 53.9% of the sequences were classified as strong matches, 38.6% as nominal and 7.4% as weak matches. Of the ESTs, 28.0% were related to enzyme/catalytic protein, 20.9% to metabolism, 13.1% to transport, 12.1% to signal transducer/cell communication, 9.9% to structure protein, 3.9% to immunity protein/defense metabolism, and 3.2% to cell cycle, and 8.9% were classified as novel genes. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provided a useful platform for functional genomic research on Siberian tigers. PMID:20941376

  6. Construction of cDNA library and preliminary analysis of expressed sequence tags from Siberian tiger.

    PubMed

    Liu, Chang-Qing; Lu, Tao-Feng; Feng, Bao-Gang; Liu, Dan; Guan, Wei-Jun; Ma, Yue-Hui

    2010-10-01

    In this study we successfully constructed a full-length cDNA library from the Siberian tiger, Panthera tigris altaica, the most well-known wild animal. Total RNA was extracted from Siberian tiger fibroblasts cultured in vitro. The titers of the primary and amplified libraries were 1.30×10^6 pfu/ml and 1.62×10^9 pfu/ml, respectively. The proportion of recombinants in the unamplified library was 90.5% and the average length of exogenous inserts was 1.13 kb. A total of 282 individual ESTs with sizes ranging from 328 to 1,142 bp were then analyzed. The BLASTX scores revealed that 53.9% of the sequences were classified as strong matches, 38.6% as nominal and 7.4% as weak matches. Of the ESTs, 28.0% were related to enzyme/catalytic protein, 20.9% to metabolism, 13.1% to transport, 12.1% to signal transducer/cell communication, 9.9% to structure protein, 3.9% to immunity protein/defense metabolism, and 3.2% to cell cycle, and 8.9% were classified as novel genes. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provided a useful platform for functional genomic research on Siberian tigers.

  7. Spatially localized generation of nucleotide sequence-specific DNA damage

    PubMed Central

    Oh, Dennis H.; King, Brett A.; Boxer, Steven G.; Hanawalt, Philip C.

    2001-01-01

    Psoralens linked to triplex-forming oligonucleotides (psoTFOs) have been used in conjunction with laser-induced two-photon excitation (TPE) to damage a specific DNA target sequence. To demonstrate that TPE can initiate photochemistry resulting in psoralen–DNA photoadducts, target DNA sequences were incubated with psoTFOs to form triple-helical complexes and then irradiated in liquid solution with pulsed 765-nm laser light, which is half the quantum energy required for conventional one-photon excitation, as used in psoralen + UV A radiation (320–400 nm) therapy. Target DNA acquired strand-specific psoralen monoadducts in a light dose-dependent fashion. To localize DNA damage in a model tissue-like medium, a DNA–psoTFO mixture was prepared in a polyacrylamide gel and then irradiated with a converging laser beam targeting the rear of the gel. The highest number of photoadducts formed at the rear while relatively sparing DNA at the front of the gel, demonstrating spatial localization of sequence-specific DNA damage by TPE. To assess whether TPE treatment could be extended to cells without significant toxicity, cultured monolayers of normal human dermal fibroblasts were incubated with tritium-labeled psoralen without TFO to maximize detectable damage and irradiated by TPE. DNA from irradiated cells treated with psoralen exhibited a 4- to 7-fold increase in tritium activity relative to untreated controls. Functional survival assays indicated that the psoralen–TPE treatment was not toxic to cells. These results demonstrate that DNA damage can be simultaneously manipulated at the nucleotide level and in three dimensions. This approach for targeting photochemical DNA damage may have photochemotherapeutic applications in skin and other optically accessible tissues. PMID:11572980

  8. Murine muscle-specific enolase: cDNA cloning, sequence, and developmental expression.

    PubMed Central

    Lamandé, N; Mazo, A M; Lucas, M; Montarras, D; Pinset, C; Gros, F; Legault-Demare, L; Lazar, M

    1989-01-01

    In vertebrates, the glycolytic enzyme enolase (EC 4.2.1.11) is present as homodimers and heterodimers formed from three distinct subunits of identical molecular weight, alpha, beta, and gamma. We report the cloning and sequencing of a cDNA encoding the beta subunit of murine muscle-specific enolase. The corresponding amino acid sequence shows greater than 80% homology with the beta subunit from chicken obtained by protein sequencing and with alpha and gamma subunits from rat and mouse deduced from cloned cDNAs. In contrast, there is no homology between the 3' untranslated regions of mouse alpha, beta, and gamma enolase mRNAs, which also differ greatly in length. The short 3' untranslated region of beta enolase mRNA accounts for its distinct length, 1600 bases. It is known that a progressive transition from alpha alpha to beta beta enolase occurs in developing skeletal muscle. We show that this transition mainly results from a differential regulation of alpha and beta mRNA levels. Analysis of myogenic cell lines shows that beta enolase gene is expressed at the myoblast stage. Moreover, transfection of premyogenic C3H10T1/2 cells with MyoD1 cDNA shows that the initial expression of beta transcripts occurs during the very first steps of the myogenic pathway, suggesting that it could be a marker event of myogenic lineage determination. Images PMID:2734297

  9. Benchmarking of the Oxford Nanopore MinION sequencing for quantitative and qualitative assessment of cDNA populations

    PubMed Central

    Oikonomopoulos, Spyros; Wang, Yu Chang; Djambazian, Haig; Badescu, Dunarel; Ragoussis, Jiannis

    2016-01-01

    To assess the performance of the Oxford Nanopore Technologies MinION sequencing platform, cDNAs from the External RNA Controls Consortium (ERCC) RNA Spike-In mix were sequenced. This mix mimics mammalian mRNA species and consists of 92 polyadenylated transcripts with known concentration. cDNA libraries were generated using a template switching protocol to facilitate the direct comparison between different sequencing platforms. The MinION performance was assessed for its ability to sequence the cDNAs directly with good accuracy in terms of abundance and full length. The abundance of the ERCC cDNA molecules sequenced by MinION agreed with their expected concentration. No length or GC content bias was observed. The majority of cDNAs were sequenced as full length. Additionally, a complex cDNA population derived from a human HEK-293 cell line was sequenced on Illumina HiSeq 2500, PacBio RS II and ONT MinION platforms. We observed that there was a good agreement in the measured cDNA abundance between PacBio RS II and ONT MinION (Pearson r = 0.82, isoforms with length more than 700 bp) and between Illumina HiSeq 2500 and ONT MinION (Pearson r = 0.75). This indicates that the ONT MinION can sequence quantitatively both long and short full length cDNA molecules. PMID:27554526
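
    The platform agreement above is reported as a Pearson correlation coefficient. As a rough illustration of how such a coefficient is computed, here is a minimal Python sketch. The function name and the abundance values are hypothetical toy inputs, not data from the study, and the abstract does not state whether abundances were transformed before correlation.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Toy per-transcript abundances for two platforms (hypothetical, NOT study data).
platform_a = [1.0, 2.0, 3.0, 4.0, 5.0]
platform_b = [1.2, 1.9, 3.2, 3.8, 5.1]
r = pearson_r(platform_a, platform_b)
print(f"Pearson r = {r:.2f}")
```

    Values of r close to 1 indicate that the two sets of measurements rise and fall together, which is the sense in which the abstract describes agreement between platforms.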

  10. Achieving high throughput sequencing of a cDNA library utilizing an alternative protocol for the bench top next-generation sequencing system.

    PubMed

    Wan, Minxi; Faruq, Junaid; Rosenberg, Julian N; Xia, Jinlan; Oyler, George A; Betenbaugh, Michael J

    2013-02-15

    The development of next-generation sequencing (NGS) technologies has provided novel tools for genome analysis and expression profiling. A high throughput cDNA sequencing method using a bench top next-generation sequencing system, GS Junior, is now available. Here, we used an alternative protocol to the standard method for generating the cDNA library. This protocol can decrease the number of processing steps to manipulate RNA when constructing a cDNA library from an RNA sample, and does not require mRNA isolation from total RNA. Thus it can decrease the risk of RNA degradation and the cost for preparing a cDNA library. Also, the efficiency of sequencing data obtained with this approach is comparable to the standard method as verified by sequencing characteristics and expression levels of the reference gene glyceraldehyde-3-phosphate dehydrogenase (GAPDH).

  11. Isolation and sequence of a cDNA clone for human tyrosinase that maps at the mouse c-albino locus

    SciTech Connect

    Kwon, B.S.; Haq, A.K.; Pomerantz, S.H.; Halaban, R.

    1987-11-01

    Screening of a lambda gt11 human melanocyte cDNA library with antibodies against hamster tyrosinase resulted in the isolation of 16 clones. The cDNA inserts from 13 of the 16 clones cross-hybridized with each other, indicating that they were from related mRNA species. One of the cDNA clones, Pmel34, detected one mRNA species with an approximate length of 2.4 kilobases that was expressed preferentially in normal and malignant melanocytes but not in other cell types. The amino acid sequence deduced from the nucleotide sequence showed that the putative human tyrosinase is composed of 548 amino acids with a molecular weight of 62,610. The deduced protein contains glycosylation sites and histidine-rich sites that could be used for copper binding. Southern blot analysis of DNA derived from newborn mice carrying lethal albino deletion mutations revealed that Pmel34 maps near or at the c-albino locus, the position of the structural gene for tyrosinase.

  12. Nucleotide sequence of a cloned woodchuck hepatitis virus genome: comparison with the hepatitis B virus sequence.

    PubMed Central

    Galibert, F; Chen, T N; Mandart, E

    1982-01-01

    The complete nucleotide sequence of a woodchuck hepatitis virus genome cloned in Escherichia coli was determined by the method of Maxam and Gilbert. This sequence was found to be 3,308 nucleotides long. Potential ATG initiator triplets and nonsense codons were identified and used to locate regions with a substantial coding capacity. A striking similarity was observed between the organization of human hepatitis B virus and woodchuck hepatitis virus. Nucleotide sequences of these open regions in the woodchuck virus were compared with corresponding regions present in hepatitis B virus. This allowed the location of four viral genes on the L strand and indicated the absence of protein coded by the S strand. Evolution rates of the various parts of the genome as well as of the four different proteins coded by hepatitis B virus and woodchuck hepatitis virus were compared. These results indicated that: (i) the core protein has evolved slightly less rapidly than the other proteins; and (ii) when a region of DNA codes for two different proteins, there is less freedom for the DNA to evolve and, moreover, one of the proteins can evolve more rapidly than the other. A hairpin structure, very well conserved in the two genomes, was located in the only region devoid of coding function, suggesting the location of the origin of replication of the viral DNA. Images PMID:7086958

  13. Primary Analysis of the Expressed Sequence Tags in a Pentastomid Nymph cDNA Library

    PubMed Central

    Yuan, Zhongying; Yin, Jianhai; Zang, Wei; Xu, Yuxin; Lu, Weiyuan; Wang, Yanjuan; Wang, Ying; Cao, Jianping

    2013-01-01

    Background: Pentastomiasis is a rare zoonotic disease caused by pentastomids. Despite their worm-like appearance, they are commonly placed into a separate sub-class of the subphylum Crustacea, phylum Arthropoda. However, until now, the systematic classification of the pentastomids and the diagnosis of pentastomiasis are immature, and genetic information about pentastomid nylum is almost nonexistent. The objective of this study was to obtain information on pentastomid nymph genes and identify the gene homologues related to host-parasite interactions or stage-specific antigens. Methodology/Principal Findings: Total pentastomid nymph RNA was used to construct a cDNA library and 500 colonies were sequenced. Analysis identified 197 unigenes, of which 147 were annotated, and 75 unigenes (53.19%) were mapped to 82 KEGG pathways, including 29 metabolism pathways, 29 genetic information processing pathways, 4 environmental information processing pathways, 7 cell motility pathways and 5 organismal systems pathways. Additionally, two host-parasite interaction-related gene homologues, a putative Kunitz inhibitor and a putative cysteine protease, were identified. Conclusion/Significance: We first successfully constructed a cDNA library and gained a number of expressed sequence tags (EST) from pentastomid nymphs, which will lay the foundation for the further study on pentastomids and pentastomiasis. PMID:23437150

  14. An efficient strategy for large-scale high-throughput transposon-mediated sequencing of cDNA clones

    PubMed Central

    Butterfield, Yaron S. N.; Marra, Marco A.; Asano, Jennifer K.; Chan, Susanna Y.; Guin, Ranabir; Krzywinski, Martin I.; Lee, Soo Sen; MacDonald, Kim W. K.; Mathewson, Carrie A.; Olson, Teika E.; Pandoh, Pawan K.; Prabhu, Anna-Liisa; Schnerch, Angelique; Skalska, Ursula; Smailus, Duane E.; Stott, Jeff M.; Tsai, Miranda I.; Yang, George S.; Zuyderduyn, Scott D.; Schein, Jacqueline E.; Jones, Steven J. M.

    2002-01-01

    We describe an efficient high-throughput method for accurate DNA sequencing of entire cDNA clones. Developed as part of our involvement in the Mammalian Gene Collection full-length cDNA sequencing initiative, the method has been used and refined in our laboratory since September 2000. Amenable to large scale projects, we have used the method to generate >7 Mb of accurate sequence from 3695 candidate full-length cDNAs. Sequencing is accomplished through the insertion of Mu transposon into cDNAs, followed by sequencing reactions primed with Mu-specific sequencing primers. Transposon insertion reactions are not performed with individual cDNAs but rather on pools of up to 96 clones. This pooling strategy reduces the number of transposon insertion sequencing libraries that would otherwise be required, reducing the costs and enhancing the efficiency of the transposon library construction procedure. Sequences generated using transposon-specific sequencing primers are assembled to yield the full-length cDNA sequence, with sequence editing and other sequence finishing activities performed as required to resolve sequence ambiguities. Although analysis of the many thousands (22 785) of sequenced Mu transposon insertion events revealed a weak sequence preference for Mu insertion, we observed insertion of the Mu transposon into 1015 of the possible 1024 5mer candidate insertion sites. PMID:12034834
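
    The figure of 1024 possible 5mer sites follows from four bases at each of five positions (4^5). A minimal Python sketch, assuming that reading, enumerates the 5mers and computes the fraction in which insertions were observed. The function name is illustrative and is not taken from the paper.

```python
from itertools import product

def all_kmers(k, alphabet="ACGT"):
    """Return every possible k-mer over the given alphabet."""
    return ["".join(p) for p in product(alphabet, repeat=k)]

fivemers = all_kmers(5)
total_sites = len(fivemers)   # 4**5 possible 5-mers
observed_sites = 1015         # count reported in the abstract
coverage = observed_sites / total_sites

print(total_sites, f"{coverage:.1%}")
```

    On these numbers only nine of the possible 5mers had no recorded insertion, which fits the abstract's description of the sequence preference as weak.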

  15. Complete nucleotide sequence of a monopartite Begomovirus and associated satellites infecting Carica papaya in Nepal.

    PubMed

    Shahid, M S; Yoshida, S; Khatri-Chhetri, G B; Briddon, R W; Natsuaki, K T

    2013-06-01

    Carica papaya (papaya) is a fruit crop that is cultivated mostly in kitchen gardens throughout Nepal. Leaf samples of C. papaya plants with leaf curling, vein darkening, vein thickening, and a reduction in leaf size were collected from a garden in Darai village, Rampur, Nepal in 2010. Full-length clones of a monopartite Begomovirus, a betasatellite and an alphasatellite were isolated. The complete nucleotide sequence of the Begomovirus showed the arrangement of genes typical of Old World begomoviruses with the highest nucleotide sequence identity (>99 %) to an isolate of Ageratum yellow vein virus (AYVV), confirming it as an isolate of AYVV. The complete nucleotide sequence of the betasatellite showed greater than 89 % nucleotide sequence identity to an isolate of Tomato leaf curl Java betasatellite originating from Indonesia. The sequence of the alphasatellite displayed 92 % nucleotide sequence identity to Sida yellow vein China alphasatellite. This is the first identification of these components in Nepal and the first time they have been identified in papaya.

  16. A genomic analysis of Histomonas meleagridis through sequencing of a cDNA library.

    PubMed

    Klodnicki, M E; McDougald, L R; Beckstead, R B

    2013-04-01

    Histomonas meleagridis, a flagellated protozoan of the Order Trichomonadida, is the causative agent of blackhead disease in gallinaceous birds. Few genes have been identified in this organism; thus, little is known regarding the molecular basis for its metabolism, virulence, and antigenicity. To identify new genes, a cDNA library derived from a lab strain of H. meleagridis was sequenced and annotated. Data obtained from these experiments identified 3,425 H. meleagridis genes. Analysis of the data allowed the identification of 81 genes coding for putative hydrogenosomal proteins and was used to determine the codon usage frequency. Sequence information also identified bacteria that are cultured with H. meleagridis. Future analysis of these data should provide valuable molecular insights into H. meleagridis and provide the platform for molecular studies aimed at understanding the pathogenesis of blackhead disease.

  17. Maple syrup urine disease. Complete primary structure of the E1 beta subunit of human branched chain alpha-ketoacid dehydrogenase complex deduced from the nucleotide sequence and a gene analysis of patients with this disease.

    PubMed Central

    Nobukuni, Y; Mitsubuchi, H; Endo, F; Akaboshi, I; Asaka, J; Matsuda, I

    1990-01-01

    A defect in the E1 beta subunit of the branched chain alpha-ketoacid dehydrogenase (BCKDH) complex is one cause of maple syrup urine disease (MSUD). In an attempt to elucidate the molecular basis of MSUD, we isolated and characterized a 1.35 kbp cDNA clone encoding the entire precursor of the E1 beta subunit of BCKDH complex from a human placental cDNA library. Nucleotide sequence analysis revealed that the isolated cDNA clone (lambda hBE1 beta-1) contained a 5'-untranslated sequence of four nucleotides, the translated sequence of 1,176 nucleotides and the 3'-untranslated sequence of 169 nucleotides. Comparison of the amino acid sequence predicted from the nucleotide sequence of the cDNA insert of the clone with the NH2-terminal amino acid sequence of the purified mature bovine BCKDH-E1 beta subunit showed that the cDNA insert encodes for a 342-amino acid subunit with a Mr = 37,585. The subunit is synthesized as the precursor with a leader sequence of 50 amino acids and is processed at the NH2 terminus. A search for protein homology revealed that the primary structure of human BCKDH-E1 beta was similar to the bovine BCKDH-E1 beta and to the E1 beta subunit of human pyruvate dehydrogenase complex, in all regions. The structures and functions of mammalian alpha-ketoacid dehydrogenase complexes are apparently highly conserved. Genomic DNA from lymphoblastoid cell lines derived from normal and five MSUD patients, in whom E1 beta was not detected by immunoblot analysis, gave the same restriction maps on Southern blot analysis. The gene has at least 80 kbp. Images PMID:2365818
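
    The segment lengths reported above can be cross-checked with simple arithmetic. The Python sketch below uses only figures from the abstract; the variable names are illustrative. It assumes that the stated 1,176 translated nucleotides exclude the stop codon, which the abstract does not say explicitly.

```python
# Segment lengths (nucleotides) as reported in the abstract.
five_prime_utr = 4
coding = 1176
three_prime_utr = 169

# Total length of the cDNA insert and number of translated codons.
insert_length = five_prime_utr + coding + three_prime_utr
codons = coding // 3

# Precursor = leader sequence + mature subunit (amino acids).
leader_aa = 50
mature_aa = 342
precursor_aa = leader_aa + mature_aa

print(insert_length, codons, precursor_aa)
```

    The three segments sum to 1,349 nucleotides, in line with the stated 1.35 kbp clone, and the 392 codons match the 50-residue leader plus the 342-residue mature subunit.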

  18. Cloning and sequencing of Octopus dofleini hemocyanin cDNA: derived sequences of functional units Ode and Odf.

    PubMed Central

    Lang, W H; van Holde, K E

    1991-01-01

    A number of additional cDNA clones coding for portions of the very large polypeptide chain of Octopus dofleini hemocyanin were isolated and sequenced. These data reveal two very similar coding sequences, which we have denoted "A-type" and "G-type." We have obtained complete A-type sequences coding for functional units Ode and Odf; consequently a total of three such unit sequences are now known from a single subunit of one molluscan hemocyanin. This presents the opportunity to make sequence comparisons within one hemocyanin subunit. Domains within one subunit show on the average 42% identity in amino acid residues; corresponding functional units from hemocyanins of different species show degrees of identity of 53-75%. Therefore, molluscan hemocyanins already existed before the individual molluscan classes diverged in the early Cambrian. Sequence comparisons of molluscan hemocyanins with arthropodan hemocyanins and tyrosinases allow us to identify the ligands of the "Copper B" site with high probability. Possible ligands for the "Copper A" site are proposed, based on sequence comparisons between molluscan hemocyanins and tyrosinases. Besides two histidine side chains, a methionine side chain might be involved in binding of Copper A, a result not in conflict with spectroscopic studies. Images PMID:1898774

  19. The nucleotide sequences of 5S ribosomal RNAs from four Bryophyta-species.

    PubMed Central

    Katoh, K; Hori, H; Osawa, S

    1983-01-01

    The nucleotide sequences of cytoplasmic 5S rRNA from four bryophytes, Marchantia polymorpha, Lophocolea heterophylla, Plagiomnium trichomanes and Anthoceros punctatus have been determined. These RNAs are 119 nucleotides long except for the Anthoceros RNA that has 118 nucleotides. Their sequences are highly similar to each other (91-99% identity) and are more related to those from seed plants (78-83% identity) than to those from green algae (61-73% identity). PMID:6571698

  20. Development of single-nucleotide polymorphism markers for Bromus tectorum (Poaceae) from a partially sequenced transcriptome

    PubMed Central

    Merrill, Keith R.; Coleman, Craig E.; Meyer, Susan E.; Leger, Elizabeth A.; Collins, Katherine A.

    2016-01-01

    Premise of the study: Bromus tectorum (Poaceae) is an annual grass species that is invasive in many areas of the world but most especially in the U.S. Intermountain West. Single-nucleotide polymorphism (SNP) markers were developed for use in investigating the geospatial and ecological diversity of B. tectorum in the Intermountain West to better understand the mechanisms behind its successful invasion. Methods and Results: Normalized cDNA libraries from six diverse B. tectorum individuals were pooled and sequenced using 454 sequencing. Ninety-five SNP assays were developed for use on 96.96 arrays with the Fluidigm EP1 genotyping platform. Verification of the 95 SNPs by genotyping 251 individuals from 12 populations is reported, along with amplification data from four related Bromus species. Conclusions: These SNP markers are polymorphic across populations of B. tectorum, are optimized for high-throughput applications, and may be applicable to other, related Bromus species. PMID:27843723

  1. Cloning and sequence analysis of a full-length cDNA of SmPP1cb encoding turbot protein phosphatase 1 beta catalytic subunit

    NASA Astrophysics Data System (ADS)

    Qi, Fei; Guo, Huarong; Wang, Jian

    2008-02-01

    Reversible protein phosphorylation, catalyzed by protein kinases and phosphatases, is an important and versatile mechanism by which eukaryotic cells regulate almost all signaling processes. Protein phosphatase 1 (PP1) is the first and well-characterized member of the protein serine/threonine phosphatase family. In the present study, a full-length cDNA encoding the beta isoform of the catalytic subunit of protein phosphatase 1 (PP1cb), designated SmPP1cb, was isolated and sequenced for the first time from the skin tissue of the flatfish turbot Scophthalmus maximus by the rapid amplification of cDNA ends (RACE) technique. The SmPP1cb cDNA sequence we obtained contains a 984 bp open reading frame (ORF), flanked by a complete 39 bp 5' untranslated region and a 462 bp 3' untranslated region. The ORF encodes a putative 327 amino acid protein with a calculated molecular mass of 37,193 Da and a pI of 5.8. The N-terminal section of this protein, Met-Ala-Glu-Gly-Glu-Leu-Asp-Val-Asp, is highly acidic, a common feature of PP1 catalytic subunits that is absent in protein phosphatase 2B (PP2B). Sequence analysis indicated that SmPP1cb is extremely conserved at both the amino acid and nucleotide levels compared with the PP1cb of other vertebrates and invertebrates. Its Kozak motif, contained in the 5'UTR around the ATG start codon, is GXXAXXGXX ATGG, which differs from the mammalian motif at two positions, A-6 and G-3, indicating the possibility of different initiation of translation in turbot. The 3'UTR of SmPP1cb is highly diverse in sequence similarity and length compared with those of other animals, especially zebrafish. The cloning and sequencing of the SmPP1cb gene lays a good foundation for future work on the biological functions of PP1 in the flatfish turbot.

  2. Nucleotide sequences of the cylindrical inclusion protein genes of two Japanese zucchini yellow mosaic virus isolates.

    PubMed

    Kundu, A K; Ohshima, K; Sako, N; Yaegashi, H

    1999-02-01

    The nucleotide sequences of the cylindrical inclusion protein (CIP) genes of two Japanese zucchini yellow mosaic virus (ZYMV) isolates (ZYMV-169 and ZYMV-M) were determined. The CIP genes of both isolates comprised 1902 nucleotides and encoded 634 amino acids containing consensus nucleotide binding motif. The sequence similarities between the two isolates at the nucleotide and amino acid levels were 91% and 98%, respectively. When the CIP gene sequences of the Japanese ZYMV isolates were compared with those of previously reported ZYMV isolates, the nucleotide and amino acid sequence similarities ranged between 81% and 97%, and between 95% and 97%, respectively. Phylogenetic analysis of the deduced amino acid sequences of the CIP genes indicated that the Japanese ZYMV isolates were closely related to those of other ZYMV isolates.

  3. Nuclear-encoded chloroplast ribosomal protein L12 of Nicotiana tabacum: characterization of mature protein and isolation and sequence analysis of cDNA clones encoding its cytoplasmic precursor.

    PubMed Central

    Elhag, G A; Thomas, F J; McCreery, T P; Bourque, D P

    1992-01-01

    Poly(A)+ mRNA isolated from Nicotiana tabacum (cv. Petite Havana) leaves was used to prepare a cDNA library in the expression vector lambda gt11. Recombinant phage containing cDNAs coding for chloroplast ribosomal protein L12 were identified and sequenced. Mature tobacco L12 protein has 44% amino acid identity with ribosomal protein L7/L12 of Escherichia coli. The longest L12 cDNA (733 nucleotides) codes for a 13,823 molecular weight polypeptide with a transit peptide of 53 amino acids and a mature protein of 133 amino acids. The transit peptide and mature protein share 43% and 79% amino acid identity, respectively, with corresponding regions of spinach chloroplast ribosomal protein L12. The predicted amino terminus of the mature protein was confirmed by partial sequence analysis of HPLC-purified tobacco chloroplast ribosomal protein L12. A single L12 mRNA of about 0.8 kb was detected by hybridization of L12 cDNA to poly(A)+ and total leaf RNA. Hybridization patterns of restriction fragments of tobacco genomic DNA probed with the L12 cDNA suggested the existence of more than one gene for ribosomal protein L12. Characterization of a second cDNA with an identical L12 coding sequence but a different 3'-noncoding sequence provided evidence that at least two L12 genes are expressed in tobacco. PMID:1542565

  4. Antigenic characteristics and cDNA sequences of HLA-B73.

    PubMed

    Hoffmann, H J; Kristensen, T J; Jensen, T G; Graugaard, B; Lamm, L U

    1995-06-01

    The cDNA sequence and serological data for HLA-B73 are reported. Anti-B73 sera are found relatively frequently, considering the rarity of the antigen. It was noted early that in some cases the antibodies in sera of multiparous women did not react with the eliciting cells (fathers) and thus all behaved as a naturally occurring antibody. We report on 18 B73 antisera found during the screening of 55,000 Danish sera. Only one of the 17 stimulators typed also had the B73 tissue type. Ten of the stimulators had antigens from the B7 CREG (B7, B22, B27, B42, B67, B73), whereas none of the responders had such tissue types. In seven cases the serum was not able to react with the stimulator's lymphocytes in a cytotoxicity assay and in four cases the stimulator lymphocytes could not deplete the anti-B73 activity from the serum in absorption experiments. The cDNA of B73 was expressed correctly in COS cells and was recognized on the cell surface by a monospecific serum. The alpha 1 alpha 2 domains of B73 are most similar to those of the HLA-B22 family. Interestingly, the alpha 3 and transmembrane domains of HLA-B73 are not standard human domains, but are most similar to the corresponding domains of some gorilla and chimpanzee HLA-B genes.

  5. Shark (Scyliorhinus torazame) metallothionein: cDNA cloning, genomic sequence, and expression analysis.

    PubMed

    Cho, Young Sun; Choi, Buyl Nim; Ha, En-Mi; Kim, Ki Hong; Kim, Sung Koo; Kim, Dong Soo; Nam, Yoon Kwon

    2005-01-01

    Novel metallothionein (MT) complementary DNA and genomic sequences were isolated from a cartilaginous shark species, Scyliorhinus torazame. The full-length open reading frame (ORF) of shark MT cDNA encoded 68 amino acids with a high cysteine content (29%). The genomic ORF sequence (932 bp) of shark MT isolated by polymerase chain reaction (PCR) comprised 3 exons with 2 intervening introns. Shark MT sequence shared many conserved features with other vertebrate MTs: overall amino acid identities of shark MT ranged from 47% to 57% with fish MTs, and 41% to 62% with mammalian MTs. However, in addition to these conserved characteristics, shark MT sequence exhibited some unique characteristics. It contained 4 extra amino acids (Lys-Ala-Gly-Arg) at the end of the beta-domain, which have not been reported in any other vertebrate MTs. The last amino acid residue at the C-terminus was Ser, which also has not been reported in fish and mammalian MTs. The MT messenger RNA levels in shark liver and kidney, assessed by semiquantitative reverse transcriptase PCR and RNA blot hybridization, were significantly affected by experimental exposures to heavy metals (cadmium, copper, and zinc). Generally, the transcriptional activation of shark MT gene was dependent on the dose (0-10 mg/kg body weight for injection and 0-20 microM for immersion) and duration (1-10 days); zinc was a more potent inducer than copper and cadmium.

  6. Differential representation of sunflower ESTs in enriched organ-specific cDNA libraries in a small scale sequencing project.

    PubMed

    Fernández, Paula; Paniego, Norma; Lew, Sergio; Hopp, H Esteban; Heinz, Ruth A

    2003-09-30

    Subtractive hybridization methods are valuable tools for identifying differentially regulated genes in a given tissue: they avoid redundant sequencing of clones representing the same expressed genes and maximize detection of low-abundance transcripts, thus affecting the efficiency and cost-effectiveness of small-scale cDNA sequencing projects aimed at the specific identification of genes useful for breeding purposes. The objective of this work is to evaluate alternative strategies to high-throughput sequencing projects for the identification of novel genes differentially expressed in sunflower as a source of organ-specific genetic markers that can be functionally associated with important traits. Differential organ-specific ESTs were generated from leaf, stem, root and flower bud at two developmental stages (R1 and R4). The use of different sources of RNA as tester and driver cDNA for the construction of differential libraries was evaluated as a tool for detection of rare or low-abundance transcripts. Organ-specificity ranged from 75 to 100% of non-redundant sequences in the different cDNA libraries. Sequence redundancy varied according to the target and driver cDNA used in each case. The R4 flower cDNA library was the least redundant library, with 62% unique sequences. Out of a total of 919 sequences that were edited and annotated, 318 were non-redundant. Comparison against sequences in public databases showed that 60% of non-redundant sequences had significant similarity to known sequences. The number of predicted novel genes varied among the different cDNA libraries, ranging from 56% in the R4 flower to 16% in the R1 flower bud library. Comparison with sunflower ESTs in public databases showed that 197 of the non-redundant sequences (60%) did not exhibit significant similarity to previously reported sunflower ESTs. This approach helped to successfully isolate a significant number of new reported sequences putatively related to responses to important

  7. Differential representation of sunflower ESTs in enriched organ-specific cDNA libraries in a small scale sequencing project

    PubMed Central

    Fernández, Paula; Paniego, Norma; Lew, Sergio; Hopp, H Esteban; Heinz, Ruth A

    2003-01-01

    Background: Subtractive hybridization methods are valuable tools for identifying differentially regulated genes in a given tissue: they avoid redundant sequencing of clones representing the same expressed genes and maximize detection of low-abundance transcripts, thus affecting the efficiency and cost-effectiveness of small-scale cDNA sequencing projects aimed at the specific identification of genes useful for breeding purposes. The objective of this work is to evaluate alternative strategies to high-throughput sequencing projects for the identification of novel genes differentially expressed in sunflower as a source of organ-specific genetic markers that can be functionally associated with important traits. Results: Differential organ-specific ESTs were generated from leaf, stem, root and flower bud at two developmental stages (R1 and R4). The use of different sources of RNA as tester and driver cDNA for the construction of differential libraries was evaluated as a tool for detection of rare or low-abundance transcripts. Organ-specificity ranged from 75 to 100% of non-redundant sequences in the different cDNA libraries. Sequence redundancy varied according to the target and driver cDNA used in each case. The R4 flower cDNA library was the least redundant library, with 62% unique sequences. Out of a total of 919 sequences that were edited and annotated, 318 were non-redundant. Comparison against sequences in public databases showed that 60% of non-redundant sequences had significant similarity to known sequences. The number of predicted novel genes varied among the different cDNA libraries, ranging from 56% in the R4 flower to 16% in the R1 flower bud library. Comparison with sunflower ESTs in public databases showed that 197 of the non-redundant sequences (60%) did not exhibit significant similarity to previously reported sunflower ESTs. This approach helped to successfully isolate a significant number of new reported sequences putatively related to responses

  8. cDNA, genomic sequence cloning and overexpression of ribosomal protein gene L9 (rpL9) of the giant panda (Ailuropoda melanoleuca).

    PubMed

    Hou, W R; Hou, Y L; Wu, G F; Song, Y; Su, X L; Sun, B; Li, J

    2011-01-01

    The ribosomal protein L9 (RPL9), a component of the large subunit of the ribosome, has an unusual structure, comprising two compact globular domains connected by an α-helix; it interacts with 23 S rRNA. To obtain information about rpL9 of Ailuropoda melanoleuca (the giant panda) we designed primers based on the known mammalian nucleotide sequence. RT-PCR and PCR strategies were employed to isolate cDNA and the rpL9 gene from A. melanoleuca; these were sequenced and analyzed. We overexpressed cDNA of the rpL9 gene in Escherichia coli BL21. The cloned cDNA fragment was 627 bp in length, containing an open reading frame of 579 bp. The deduced protein is composed of 192 amino acids, with an estimated molecular mass of 21.86 kDa and an isoelectric point of 10.36. The length of the genomic sequence is 3807 bp, including six exons and five introns. Based on alignment analysis, rpL9 has high similarity among species; we found 85% agreement of DNA and amino acid sequences with the other species that have been analyzed. Based on topology predictions, there are two N-glycosylation sites, five protein kinase C phosphorylation sites, one casein kinase II phosphorylation site, two tyrosine kinase phosphorylation sites, three N-myristoylation sites, one amidation site, and one ribosomal protein L6 signature 2 in the L9 protein of A. melanoleuca. The rpL9 gene can be readily expressed in E. coli; it fuses with the N-terminal GST-tagged protein, giving rise to the accumulation of an expected 26.51-kDa polypeptide, which is in good agreement with the predicted molecular weight. This expression product could be used for purification and further study of its function.

  9. Cloning and cDNA sequence of the dihydrolipoamide dehydrogenase component of human α-ketoacid dehydrogenase complexes

    SciTech Connect

    Pons, G.; Raefsky-Estrin, C.; Carothers, D.J.; Pepin, R.A.; Javed, A.A.; Jesse, B.W.; Ganapathi, M.K.; Samols, D.; Patel, M.S.

    1988-03-01

    cDNA clones comprising the entire coding region for human dihydrolipoamide dehydrogenase have been isolated from a human liver cDNA library. The cDNA sequence of the largest clone consisted of 2082 base pairs and contained a 1527-base open reading frame that encodes a precursor dihydrolipoamide dehydrogenase of 509 amino acid residues. The first 35 amino acid residues of the open reading frame probably correspond to a typical mitochondrial import leader sequence. The predicted amino acid sequence of the mature protein, starting at residue number 36 of the open reading frame, is almost identical (>98% homology) with the known partial amino acid sequence of the pig heart dihydrolipoamide dehydrogenase. The cDNA clone also contains a 3' untranslated region of 505 bases with an unusual polyadenylylation signal (TATAAA) and a short poly(A) tract. By blot-hybridization analysis with the cDNA as probe, two mRNAs, 2.2 and 2.4 kilobases in size, have been detected in human tissues and fibroblasts, whereas only one mRNA (2.4 kilobases) was detected in rat tissues.

  10. Isolation and sequencing of the cDNA of a novel cytochrome P450 from rat oesophagus.

    PubMed

    Brookman-Amissah, N; Mackay, A G; Swann, P F

    2001-01-01

    RT-PCR was used to determine whether cytochromes P450 of the 2A, 2B and 2E sub-families are expressed in the rat oesophagus. This showed that this tissue expresses a previously unknown member of the CYP2B sub-family, now designated CYP2B21. Using a combination of 5'- and 3'-RACE (rapid amplification of cDNA ends) and library screening, the cDNA was amplified and sequenced. The cDNA sequence (GenBank accession no. AF159245) covers the whole of the coding region and the whole of the 3'-untranslated region (UTR), but only 17 nt of the 5'-UTR. The DNA sequence has strong similarity to those of CYP2B1 and CYP2B2, with the derived amino acid sequence being 84 and 83% identical, respectively. The ease with which this cDNA was found in the cDNA library suggests that CYP2B21 is a major P450 of the oesophagus. The catalytic activity of this new CYP2B is not yet known. However, previous authors have reported that other members of this sub-family (CYP2B1 or 2B2) metabolize the selective oesophageal carcinogen N-nitrosomethylbutylamine with the chemical selectivity necessary for carcinogenesis, i.e. they preferentially hydroxylate the alpha-carbon of the butyl chain, so this new CYP2B may be the nitrosamine-activating enzyme of the oesophagus.

  11. Sequence of a cDNA clone encoding the polysialic acid-rich and cytoplasmic domains of the neural cell adhesion molecule N-CAM.

    PubMed Central

    Hemperly, J J; Murray, B A; Edelman, G M; Cunningham, B A

    1986-01-01

    Purified fractions of the neural cell-adhesion molecule N-CAM from embryonic chicken brain contain two similar polypeptides (Mr, 160,000 and 130,000), each containing an amino-terminal external binding region, a carbohydrate-rich central region, and a carboxyl-terminal region that is associated with the cell. Previous studies indicate that the two polypeptides arise by alternative splicing of mRNAs transcribed from a single gene. We report here the 3556-nucleotide sequence of a cDNA clone (pEC208) that encodes 964 amino acids from the carbohydrate and cell-associated domains of the larger N-CAM polypeptide followed by 664 nucleotides of 3' untranslated sequence. The predicted protein sequence contains attachment sites for polysialic acid-containing oligosaccharides, four tandem homologous regions of polypeptide resembling those seen in the immunoglobulin superfamily, and a single hydrophobic sequence that appears to be the membrane-spanning segment. The cytoplasmic domain carboxyl terminal to this segment includes a block of approximately 250 amino acids present in the larger but not in the smaller N-CAM polypeptide. We designate these the ld (large domain) polypeptide and the sd (small domain) polypeptide. The intracellular domains of the ld and sd polypeptides are likely to be critical for cell-surface modulation of N-CAM by interacting in a differential fashion with other intrinsic proteins or with the cytoskeleton. PMID:3458261

  12. Cloning and sequence of a cDNA encoding a novel hybrid proline-rich protein associated with cytokinin-induced haustoria formation in Cuscuta reflexa.

    PubMed

    Subramaniam, K; Ranie, J; Srinivasa, B R; Sinha, A M; Mahadevan, S

    1994-04-20

    A complete cDNA encoding a novel hybrid Pro-rich protein (HyPRP) was identified by differentially screening 3 x 10^4 recombinant plaques of a Cuscuta reflexa cytokinin-induced haustorial cDNA library constructed in lambda gt10. The nucleotide (nt) sequence consists of: (i) a 424-bp 5'-noncoding region having five start codons (ATGs) and three upstream open reading frames (uORFs); (ii) an ORF of 987 bp with coding potential for a 329-amino-acid (aa) protein of Mr 35,203 with a hydrophobic N-terminal region including a stretch of nine consecutive Phe followed by a Pro-rich sequence and a Cys-rich hydrophobic C terminus; and (iii) a 178-bp 3'-UTR (untranslated region). Comparison of the predicted aa sequence with the NBRF and SWISSPROT databases and with a recent report of an embryo-specific protein of maize [Jose-Estanyol et al., Plant Cell 4 (1992) 413-423] showed it to be similar to the class of HyPRPs encoded by genes preferentially expressed in young tomato fruits, maize embryos and in vitro-cultured carrot embryos. Northern analysis revealed an approx. 1.8-kb mRNA of this gene expressed in the subapical region of the C. reflexa vine which exhibited maximum sensitivity to cytokinin in haustorial induction.

  13. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data shall...

  14. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data shall...

  15. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data shall...

  16. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data shall...

  17. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data shall...

  18. Sequence of a novel cytochrome CYP2B cDNA coding for a protein which is expressed in a sebaceous gland, but not in the liver.

    PubMed Central

    Friedberg, T; Grassow, M A; Bartlomowicz-Oesch, B; Siegert, P; Arand, M; Adesnik, M; Oesch, F

    1992-01-01

    The major phenobarbital-inducible rat hepatic cytochromes P-450, CYP2B1 and CYP2B2, are the paradigmatic members of a cytochrome P-450 gene subfamily that contains at least seven additional members. Specific oligonucleotide probes for these genomic members of the CYP2B subfamily were used to assess their tissue-specific expression. In Northern-blot analysis a probe specific to gene 4 (which is designated now as CYP2B12) hybridized to a single mRNA present in the preputial gland, an organ which is used as a model for sebaceous glands, but did not hybridize to mRNA isolated from the liver or from five other tissues of untreated or Aroclor 1254-treated rats. The cDNA sequence for the CYP2B12 RNA was determined from overlapping cDNA clones and contained a long open reading frame of 1476 bp. The nucleotide sequence of the CYP2B12 cDNA was 85% similar to the sequence of the CYP2B1 cDNA in its coding region and was different from any CYP2B cDNA characterized until now. The cDNA-derived primary structure of the CYP2B12 protein contains a signal sequence for its insertion into the endoplasmic reticulum and the putative haem-binding site characteristic of cytochromes P-450. A part of the potential haem pocket of CYP2B12 was identical with a similar structure in a bacterial protocatechuate dioxygenase. In immunoblot analysis of preputial-gland microsomes, antibodies against CYP2B1 recognized a single abundant protein with a lower apparent molecular mass than that of CYP2B1. Our results demonstrate that the CYP2B12 protein has the potential to be enzymically active and are the first demonstration that a member of the CYP2B subfamily is expressed exclusively and at high levels in an extrahepatic organ. PMID:1445240

  19. 16S rRNA sequences of uncultivated hot spring cyanobacterial mat inhabitants retrieved as randomly primed cDNA

    SciTech Connect

    Weller, R.; Ward, D.M.; Weller, J.W.

    1991-04-01

    Cloning and analysis of cDNAs synthesized from rRNAs is one approach to assess the species composition of natural microbial communities. In some earlier attempts to synthesize cDNA from 16S rRNA (16S rcDNA) from the Octopus Spring cyanobacterial mat, a dominance of short 16S rcDNAs was observed, which appear to have originated only from certain organisms. Priming of cDNA synthesis from small ribosomal subunit RNA with random deoxyhexanucleotides can retrieve longer sequences, more suitable for phylogenetic analysis. Here we report the retrieval of 16S rRNA sequences from three formerly uncultured community members. One sequence type, which was retrieved three times from a total of five sequences analyzed, can be placed in the cyanobacterial phylum. A second sequence type is related to 16S rRNAs from green nonsulfur bacteria. The third sequence type may represent a novel phylogenetic type.

  20. cDNA sequence and chromosome localization of pig α1,3 galactosyltransferase

    SciTech Connect

    Strahan, K.M.; Preece, A.F.; Gustafsson, K.; Gu, F.; Gustavsson, I.; Anderson, L.

    1995-01-11

    Human serum contains natural antibodies (NAb), which can bind to endothelial cell surface antigens of other mammals. This is believed to be the major initiating event in the process of hyperacute rejection of pig to primate xenografts. Recent work has implicated galactosyl α1,3 galactosyl β1,4 N-acetyl-glucosaminyl carbohydrate epitopes, on the surface of pig endothelial cells, as a major target of human natural antibodies. This epitope is made by a specific galactosyltransferase (α1,3 GT) present in pigs but not in higher primates. We have now cloned and sequenced a full-length pig α1,3 GT cDNA. The predicted 371 amino acid protein sequence shares 85% and 76% identity with previously characterized cattle and mouse α1,3 GT protein sequences, respectively. By using fluorescence and isotopic in situ hybridization, the GGTA1 gene was mapped to the region q2.10-q2.11 of pig chromosome 1, providing further evidence of homology between the subterminal region of pig chromosome 1q and human chromosome 9q, which harbors the locus encoding the ABO blood group system, as well as a human pseudogene homologous to the pig GGTA1 gene. 29 refs., 5 figs.

  1. Epitopes of human testis-specific lactate dehydrogenase deduced from a cDNA sequence

    SciTech Connect

    Millan, J.L.; Driscoll, C.E.; LeVan, K.M.; Goldberg, E.

    1987-08-01

    The sequence and structure of human testis-specific L-lactate dehydrogenase (LDHC4, LDHX; (L)-lactate:NAD+ oxidoreductase, EC 1.1.1.27) has been derived from analysis of a complementary DNA (cDNA) clone comprising the complete protein coding region of the enzyme. From the deduced amino acid sequence, human LDHC4 is as different from rodent LDHC4 (73% homology) as it is from human LDHA4 (76% homology) and porcine LDHB4 (68% homology). Subunit homologies are consistent with the conclusion that the LDHC gene arose by at least two independent duplication events. Furthermore, the lower degree of homology between mouse and human LDHC4 and the appearance of this isozyme late in evolution suggests a higher rate of mutation in the mammalian LDHC genes than in the LDHA and -B genes. Comparison of exposed amino acid residues of discrete antigenic determinants of mouse and human LDHC4 reveals significant differences. Knowledge of the human LDHC4 sequence will help design human-specific peptides useful in the development of a contraceptive vaccine.

  2. High nucleotide and amino acid sequence similarities in tumour necrosis factor-alpha amongst Indian buffalo (Bubalus bubalis), Indian cattle (Bos indicus) and other ruminants.

    PubMed

    Gupta, P K; Bind, R B; Walunj, S S; Saini, M

    2004-08-01

    Tumour necrosis factor-alpha (TNF-alpha) mRNA from Indian water buffalo (Bubalus bubalis) and Indian cattle (Bos indicus) was reverse transcribed and amplified using reverse transcriptase-polymerase chain reaction (RT-PCR). The nucleotide sequences of cDNAs were determined after cloning into pGEM-T-Easy vector (Promega, Madison, WI) and compared with reported nucleotide sequences of TNF-alpha cDNA from other species. The nucleotide sequences of TNF-alpha from Indian cattle revealed significantly high similarities at nucleotide (99.2%) and amino acid (100%) levels with those of cattle (Bos taurus; Zebu). The sequences from buffalo had 98.4% nucleotide and 99.1% amino acid similarities with Indian cattle, indicating functional cross-reactivity. One amino acid deletion at position 63 and one substitution (A→P) at position 64 were observed in buffalo compared with Indian cattle. The amino acid deletion at position 63 was predicted to result from differences in pre-mRNA splicing.

  3. Synthetic oligonucleotides with particular base sequences from the cDNA encoding proteins of Mycobacterium bovis BCG induce interferons and activate natural killer cells.

    PubMed

    Tokunaga, T; Yano, O; Kuramoto, E; Kimura, Y; Yamamoto, T; Kataoka, T; Yamamoto, S

    1992-01-01

    Thirteen kinds of 45-mer single-stranded oligonucleotide, having sequence randomly selected from the known cDNA encoding BCG proteins, were tested for their capability to augment natural killer (NK) cell activity of mouse spleen cells in vitro. Six out of the 13 oligonucleotides showed the activity, while the others did not. In order to know the minimal and essential sequence(s) responsible for the biological activity, 2 kinds of 30-mer and 5 kinds of 15-mer oligonucleotide fragments of an active 45-mer nucleotide were tested for their activity. One of the 30-mer oligonucleotides, designated BCG-A4a, was active, but the other 30-mer was inactive. All of the 15-mer oligonucleotide fragments were inactive. The BCG-A4a also stimulated the spleen cells to produce interferon (IFN)-alpha and -gamma. An experiment using anti-IFN antisera showed that the NK cell activation by the oligonucleotide was ascribed to the IFN-alpha produced. It was noticed that all of the biologically active oligonucleotides possessed one or more palindrome sequence(s), and the inactive ones did not, with an exception of a 45-mer inactive oligonucleotide containing overlapping palindrome sequences (GGGCCCGGG). These findings strongly suggest that certain palindrome sequences, like GACGTC, GGCGCC and TGCGCA, are essential for 30-mer oligonucleotides, like BCG-A4a, to induce IFNs.
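
    The palindromes named in this abstract can be checked mechanically. The sketch below assumes that "palindrome" means a sequence equal to its own reverse complement and that the hexamer (6-nt) window is the relevant size; the helper names `reverse_complement`, `is_palindrome` and `find_palindromes` are hypothetical and not from the paper.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindrome(seq):
    """True if the sequence reads the same as its reverse complement."""
    return seq.upper() == reverse_complement(seq)


def find_palindromes(seq, length=6):
    """List (offset, subsequence) for every palindromic window of `length`."""
    seq = seq.upper()
    return [(i, seq[i:i + length])
            for i in range(len(seq) - length + 1)
            if is_palindrome(seq[i:i + length])]


# Hexamers cited in the abstract as examples of active palindromes.
for hexamer in ("GACGTC", "GGCGCC", "TGCGCA"):
    print(hexamer, is_palindrome(hexamer))

# The 9-mer from the inactive oligonucleotide is not itself palindromic,
# but it contains two overlapping palindromic hexamers.
print(is_palindrome("GGGCCCGGG"))     # → False
print(find_palindromes("GGGCCCGGG"))  # → [(0, 'GGGCCC'), (3, 'CCCGGG')]
```

    Under this definition all three cited hexamers are palindromic, and GGGCCCGGG resolves into the overlapping palindromes GGGCCC and CCCGGG.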

  4. Identification of a cDNA clone that contains the complete coding sequence for a 140-kD rat NCAM polypeptide

    PubMed Central

    1987-01-01

    Neural cell adhesion molecules (NCAMs) are cell surface glycoproteins that appear to mediate cell-cell adhesion. In vertebrates NCAMs exist in at least three different polypeptide forms of apparent molecular masses 180, 140, and 120 kD. The 180- and 140-kD forms span the plasma membrane whereas the 120-kD form lacks a transmembrane region. In this study, we report the isolation of NCAM clones from an adult rat brain cDNA library. Sequence analysis indicated that the longest isolate, pR18, contains a 2,574 nucleotide open reading frame flanked by 208 bases of 5' and 409 bases of 3' untranslated sequence. The predicted polypeptide encoded by clone pR18 contains a single membrane-spanning region and a small cytoplasmic domain (120 amino acids), suggesting that it codes for a full-length 140-kD NCAM form. In Northern analysis, probes derived from 5' sequences of pR18, which presumably code for extracellular portions of the molecule, hybridized to five discrete mRNA size classes (7.4, 6.7, 5.2, 4.3, and 2.9 kb) in adult rat brain but not to liver or muscle RNA. However, the 5.2- and 2.9-kb mRNA size classes did not hybridize to either a large restriction fragment or three oligonucleotides derived from the putative transmembrane coding region and regions that lie 3' to it. The 3' probes did hybridize to the 7.4-, 6.7-, and 4.3-kb message size classes. These combined results indicate that clone pR18 is derived from either the 7.4-, 6.7-, or 4.3-kb adult rat brain RNA size class. Comparison with chicken and mouse NCAM cDNA sequences suggests that pR18 represents the amino acid coding region of the 6.7- or 4.3-kb mRNA. The isolation of pR18, the first cDNA that contains the complete coding sequence of an NCAM polypeptide, unambiguously demonstrates the predicted linear amino acid sequence of this probable rat 140-kD polypeptide. This cDNA also contains a 30-base pair segment not found in NCAM cDNAs isolated from other species. The significance of this segment and other

  5. Annotated expressed sequence tags and cDNA microarrays for studies of brain and behavior in the honey bee.

    PubMed

    Whitfield, Charles W; Band, Mark R; Bonaldo, Maria F; Kumar, Charu G; Liu, Lei; Pardinas, Jose R; Robertson, Hugh M; Soares, M Bento; Robinson, Gene E

    2002-04-01

    To accelerate the molecular analysis of behavior in the honey bee (Apis mellifera), we created expressed sequence tag (EST) and cDNA microarray resources for the bee brain. Over 20,000 cDNA clones were partially sequenced from a normalized (and subsequently subtracted) library generated from adult A. mellifera brains. These sequences were processed to identify 15,311 high-quality ESTs representing 8912 putative transcripts. Putative transcripts were functionally annotated (using the Gene Ontology classification system) based on matching gene sequences in Drosophila melanogaster. The brain ESTs represent a broad range of molecular functions and biological processes, with neurobiological classifications particularly well represented. Roughly half of Drosophila genes currently implicated in synaptic transmission and/or behavior are represented in the Apis EST set. Of Apis sequences with open reading frames of at least 450 bp, 24% are highly diverged with no matches to known protein sequences. Additionally, over 100 Apis transcript sequences conserved with other organisms appear to have been lost from the Drosophila genome. DNA microarrays were fabricated with over 7000 EST cDNA clones putatively representing different transcripts. Using probe derived from single bee brain mRNA, microarrays detected gene expression for 90% of Apis cDNAs two standard deviations greater than exogenous control cDNAs. [The sequence data described in this paper have been submitted to the GenBank data library under accession nos. BI502708-BI517278. The sequences are also available at http://titan.biotec.uiuc.edu/bee/honeybee_project.htm.]

  6. Expressed Sequence Tags With cDNA Termini: Previously Overlooked Resources for Gene Annotation and Transcriptome Exploration in Chlamydomonas reinhardtii

    PubMed Central

    Liang, Chun; Liu, Yuansheng; Liu, Lin; Davis, Adam C.; Shen, Yingjia; Li, Qingshun Quinn

    2008-01-01

    Many of Chlamydomonas reinhardtii expressed sequence tags (ESTs) in GenBank dbEST and community EST assemblies were either over- or undertrimmed in terms of their cDNA termini, which are defined as the diagnostic sequence elements that delineate 3′/5′ ends of mRNA transcripts. Overtrimming represents a loss of directional, positional, and structural information of transcript ends whereas undertrimming causes unclean spurious sequences retained in ESTs that exert deleterious impacts on downstream EST-based applications. We examined 309,278 raw EST sequencing trace files of C. reinhardtii and found that only 57% had cDNA termini that matched the expected structures specified in their cDNA library constructions while satisfying our minimum length requirement for their final clean sequences. Using GMAP, 156,963 individual ESTs were mapped to the genome successfully, with their in silico-verified cDNA termini anchored to the genome. Our data analysis suggested strong macro- and microheterogeneity of 3′/5′ end positions of individual transcripts derived from the same genes in C. reinhardtii. This work annotating differential ends of individual transcripts in the draft genome presents the research community with a new stream of data that will facilitate accurate determination of gene structures, genome annotation, and exploration of the transcriptome and mRNA metabolism in C. reinhardtii. PMID:18493042

  7. Nucleotide sequence of the Lactococcus lactis NCDO 763 (ML3) rpoD gene.

    PubMed

    Gansel, X; Hartke, A; Boutibonnes, P; Auffray, Y

    1993-10-19

    The complete nucleotide sequence of rpoD gene from Lactococcus lactis has been determined. The nucleotide data have indicated the presence of an open reading frame of 1020 base pairs encoding a polypeptide which shares the framework structure for principal sigma factors of eubacteria strains.

  8. Nucleotide sequence of a lysine transfer ribonucleic Acid from bakers' yeast.

    PubMed

    Madison, J T; Boguslawski, S J; Teetor, G H

    1972-05-12

    The nucleotide sequence of one of the two major lysine transfer RNA's from bakers' yeast has been determined. Its structure is compared to that of a lysine tRNA from a haploid yeast. A total of 21 nucleotides differ in the two molecules. Only the T-psi-C-G (thymidine-pseudouridine-cytidine-guanosine) loop and its supporting stem are identical.

  9. Cloning and sequence analysis of a cDNA clone coding for the mouse GM2 activator protein.

    PubMed Central

    Bellachioma, G; Stirling, J L; Orlacchio, A; Beccari, T

    1993-01-01

    A cDNA (1.1 kb) containing the complete coding sequence for the mouse GM2 activator protein was isolated from a mouse macrophage library using a cDNA for the human protein as a probe. There was a single ATG located 12 bp from the 5' end of the cDNA clone followed by an open reading frame of 579 bp. Northern blot analysis of mouse macrophage RNA showed that there was a single band with a mobility corresponding to a size of 2.3 kb. We deduce from this that the mouse mRNA, in common with the mRNA for the human GM2 activator protein, has a long 3' untranslated sequence of approx. 1.7 kb. Alignment of the mouse and human deduced amino acid sequences showed 68% identity overall and 75% identity for the sequence on the C-terminal side of the first 31 residues, which in the human GM2 activator protein contains the signal peptide. Hydropathicity plots showed great similarity between the mouse and human sequences even in regions of low sequence similarity. There is a single N-glycosylation site in the mouse GM2 activator protein sequence (Asn151-Phe-Thr) which differs in its location from the single site reported in the human GM2 activator protein sequence (Asn63-Val-Thr). Images Figure 1 PMID:7689829

  10. The cDNA sequence of three hemocyanin subunits from the garden snail Helix lucorum.

    PubMed

    De Smet, Lina; Dimitrov, Ivan; Debyser, Griet; Dolashka-Angelova, Pavlina; Dolashki, Aleksandar; Van Beeumen, Jozef; Devreese, Bart

    2011-11-10

    Hemocyanins are blue copper-containing respiratory proteins residing in the hemolymph of many molluscs and arthropods. They can have different molecular masses and quaternary structures. Moreover, several molluscan hemocyanins are isolated with one, two or three isoforms occurring as decameric, didecameric, multidecameric or tubule aggregates. We could recently isolate three different hemocyanin isopolypeptides from the hemolymph of the garden snail Helix lucorum (HlH). These three structural subunits were named α(D)-HlH, α(N)-HlH and β-HlH. We have cloned and sequenced their cDNA which is the first result ever reported for three isoforms of a molluscan hemocyanin. Whereas the complete gene sequence of α(D)-HlH and β-HlH was obtained, including the 5' and 3' UTR, 180 bp of the 5' end and around 900 bp at the 3' end are missing for the third subunit. The subunits α(D)-HlH and β-HlH comprise a signal sequence of 19 amino acids plus a polypeptide of 3409 and 3414 amino acids, respectively. We could determine 3031 residues of the α(N)-HlH subunit. Sequence comparison with other molluscan hemocyanins shows that α(D)-HlH is more related to Aplysia californica hemocyanin than to each of its own isopolypeptides. The structural subunits comprise 8 different functional units (FUs: a, b, c, d, e, f, g, h) and each functional unit possesses a highly conserved copper-A and copper-B site for reversible oxygen binding. Potential N-glycosylation sites are present in all three structural subunits. We confirmed that all three different isoforms are effectively produced and secreted in the hemolymph of H. lucorum by analyzing a tryptic digest of the purified native hemocyanin by MALDI-TOF and LC-FTICR mass spectrometry. Copyright © 2011 Elsevier B.V. All rights reserved.

  11. cDNA sequence and protein bioinformatics analyses of MSTN in African catfish (Clarias gariepinus).

    PubMed

    Kanjanaworakul, Poonmanee; Sawatdichaikul, Orathai; Poompuang, Supawadee

    2016-04-01

    Myostatin, also known as growth differentiation factor 8, has been identified as a potent negative regulator of skeletal muscle growth. The purpose of this study was to characterize and predict function of the myostatin gene of the African catfish (Cg-MSTN). Expression of Cg-MSTN was determined at three growth stages to establish the relationship between the levels of MSTN transcript and skeletal muscle growth. The partial cDNA sequence of Cg-MSTN was cloned by using published information from its congener walking catfish (Cm-MSTN). The Cg-MSTN was 1194 bp in length encoding a protein of 397 amino acids. The deduced MSTN sequence exhibited key functional sites similar to those of other members of the TGF-β superfamily, especially, the proteolytic processing site (RXXR motif) and nine conserved cysteines at the C-terminal. Expression of MSTN appeared to be correlated with muscle development and growth of African catfish. Protein bioinformatics revealed that the primary sequence of Cg-MSTN shared 98 % sequence identity with that of walking catfish Cm-MSTN with only two different residues, [Formula: see text]. and [Formula: see text]. The proposed model of Cg-MSTN revealed the key point mutation [Formula: see text] causing a 7.35 Å shorter distance between the N- and C-lobes and an approximately 11° narrow angle than those of Cm-MSTN. The substitution of a proline residue near the proteolytic processing site which altered the structure of myostatin may play a critical role in reducing proteolytic activity of this protein in African catfish.

  12. Apis mellifera ultraspiracle: cDNA sequence and rapid up-regulation by juvenile hormone.

    PubMed

    Barchuk, A R; Maleszka, R; Simões, Z L P

    2004-10-01

    Two hormones, 20-hydroxyecdysone (20E) and juvenile hormone (JH) are key regulators of insect development including the differentiation of the alternative caste phenotypes of social insects. In addition, JH plays a different role in adult honey bees, acting as a 'behavioural pacemaker'. The functional receptor for 20E is a heterodimer consisting of the ecdysone receptor and ultraspiracle (USP) whereas the identity of the JH receptor remains unknown. We have cloned and sequenced a cDNA encoding Apis mellifera ultraspiracle (AMUSP) and examined its responses to JH. A rapid, but transient up-regulation of the AMUSP messenger is observed in the fat bodies of both queens and workers. AMusp appears to be a single copy gene that produces two transcripts (approximately 4 and approximately 5 kb) that are differentially expressed in the animal's body. The predicted AMUSP protein shows greater sequence similarity to its orthologues from the vertebrate-crab-tick-locust group than to the dipteran-lepidopteran group. These characteristics and the rapid up-regulation by JH suggest that some of the USP functions in the honey bee may depend on ligand binding.

  13. Human liver apolipoprotein B-100 cDNA: complete nucleic acid and derived amino acid sequence.

    PubMed Central

    Law, S W; Grant, S M; Higuchi, K; Hospattankar, A; Lackner, K; Lee, N; Brewer, H B

    1986-01-01

    Human apolipoprotein B-100 (apoB-100), the ligand on low density lipoproteins that interacts with the low density lipoprotein receptor and initiates receptor-mediated endocytosis and low density lipoprotein catabolism, has been cloned, and the complete nucleic acid and derived amino acid sequences have been determined. ApoB-100 cDNAs were isolated from normal human liver cDNA libraries utilizing immunoscreening as well as filter hybridization with radiolabeled apoB-100 oligodeoxynucleotides. The apoB-100 mRNA is 14.1 kilobases long encoding a mature apoB-100 protein of 4536 amino acids with a calculated amino acid molecular weight of 512,723. ApoB-100 contains 20 potential glycosylation sites, and 12 of a total of 25 cysteine residues are located in the amino-terminal region of the apolipoprotein providing a potential globular structure of the amino terminus of the protein. ApoB-100 contains relatively few regions of amphipathic helices, but compared to other human apolipoproteins it is enriched in beta-structure. The delineation of the entire human apoB-100 sequence will now permit a detailed analysis of the conformation of the protein, the low density lipoprotein receptor binding domain(s), and the structural relationship between apoB-100 and apoB-48 and will provide the basis for the study of genetic defects in apoB-100 in patients with dyslipoproteinemias. PMID:3464946

  14. Variation in the nucleotide sequence of a prolamin gene family in wild rice.

    PubMed

    Barbier, P; Ishihama, A

    1990-07-01

    Variation in the DNA sequence of the 10 kDa prolamin gene family within the wild rice species Oryza rufipogon was probed using the direct sequencing of PCR-amplified genes. A comparison of the nucleotide and deduced amino-acid sequences of eight Asian strains of O. rufipogon and one strain of the related African species O. longistaminata is presented.

  15. Development of polymorphic genic-SSR markers by cDNA library sequencing in boxwood, Buxus spp. (Buxaceae)

    USDA-ARS's Scientific Manuscript database

    Genic microsatellites or simple sequence repeat (genic-SSR) markers were developed in boxwood (Buxus taxa) for genetic diversity analysis, identification of taxa, and to facilitate breeding. cDNA libraries were developed from mRNA extracted from leaves of Buxus sempervirens ‘Vardar Valley’ and seque...

  16. Automated Workflow for Preparation of cDNA for Cap Analysis of Gene Expression on a Single Molecule Sequencer

    PubMed Central

    Nagao-Sato, Sayaka; Saijo, Eri; Lassmann, Timo; Kanamori-Katayama, Mutsumi; Kaiho, Ai; Lizio, Marina; Kawaji, Hideya; Carninci, Piero; Forrest, Alistair R. R.; Hayashizaki, Yoshihide

    2012-01-01

    Background: Cap analysis of gene expression (CAGE) is a 5′ sequence tag technology to globally determine transcriptional starting sites in the genome and their expression levels and has most recently been adapted to the HeliScope single molecule sequencer. Despite significant simplifications in the CAGE protocol, it has until now been a labour intensive protocol. Methodology: In this study we set out to adapt the protocol to a robotic workflow, which would increase throughput and reduce handling. The automated CAGE cDNA preparation system we present here can prepare 96 ‘HeliScope ready’ CAGE cDNA libraries in 8 days, as opposed to 6 weeks by a manual operator. We compare the results obtained using the same RNA in manual libraries and across multiple automation batches to assess reproducibility. Conclusions: We show that the sequencing was highly reproducible and comparable to manual libraries with an 8-fold increase in productivity. The automated CAGE cDNA preparation system can prepare 96 CAGE sequencing samples simultaneously. Finally we discuss how the system could be used for CAGE on Illumina/SOLiD platforms, RNA-seq and full-length cDNA generation. PMID:22303458

  17. Developmentally regulated plant genes: the nucleotide sequence of a wheat gliadin genomic clone.

    PubMed Central

    Rafalski, J A; Scheets, K; Metzler, M; Peterson, D M; Hedgcoth, C; Söll, D G

    1984-01-01

    Gliadins, the major wheat seed storage proteins, are encoded by a multigene family. Northern blot analysis shows that gliadin genes are transcribed in endosperm tissue into two classes of poly(A)+ mRNA, 1400 bases (class I) and 1600 bases (class II) in length. Using poly(A)+ RNA from developing wheat endosperm we constructed a cDNA library from which a number of clones coding for alpha/beta and gamma gliadins were identified by hybrid-selected mRNA translation and DNA sequencing. These cDNA clones were used as probes for the isolation of genomic gliadin clones from a wheat genomic library. One such genomic clone was characterized in detail and its DNA sequence determined. It contains a gene for a 33-kd alpha/beta gliadin protein (a 20 amino acid signal peptide and a 266 amino acid mature protein) which is very rich in glutamine (33.8%) and proline (15.4%). The gene sequence does not contain introns. A typical eukaryotic promoter sequence is present at -104 (relative to the translation initiation codon) and there are two normal polyadenylation signals 77 and 134 bases downstream from the translation termination codon. The coding sequence contains some internal sequence repetition, and is highly homologous to several alpha/beta gliadin cDNA clones. Homology to a gamma-gliadin cDNA clone is low, and there is no homology with known glutenin or zein cDNA sequences. Images Fig. 1. Fig. 2. PMID:6204862

  18. Complete nucleotide sequence of the 23S rRNA gene of the Cyanobacterium, Anacystis nidulans.

    PubMed Central

    Douglas, S E; Doolittle, W F

    1984-01-01

    The nucleotide sequence of the Anacystis nidulans 23S rRNA gene, including the 5'- and 3'-flanking regions has been determined. The gene is 2876 nucleotides long and shows higher primary sequence homology to the 23S rRNAs of plastids (84.5%) than to that of E. coli (79%). The predicted rRNA transcript also shares many secondary structural features with those of plastids, reinforcing the endosymbiont hypothesis for the origin of these organelles. PMID:6326060

  19. Statistical analysis of nucleotide sequences of the hemagglutinin gene of human influenza A viruses.

    PubMed Central

    Ina, Y; Gojobori, T

    1994-01-01

    To examine whether positive selection operates on the hemagglutinin 1 (HA1) gene of human influenza A viruses (H1 subtype), 21 nucleotide sequences of the HA1 gene were statistically analyzed. The nucleotide sequences were divided into antigenic and nonantigenic sites. The nucleotide diversities for antigenic and nonantigenic sites of the HA1 gene were computed at synonymous and nonsynonymous sites separately. For nonantigenic sites, the nucleotide diversities were larger at synonymous sites than at nonsynonymous sites. This is consistent with the neutral theory of molecular evolution. For antigenic sites, however, the nucleotide diversities at nonsynonymous sites were larger than those at synonymous sites. These results suggest that positive selection operates on antigenic sites of the HA1 gene of human influenza A viruses (H1 subtype). PMID:8078892
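
    The quantity compared in this abstract, nucleotide diversity, is the average proportion of differing sites over all pairs of aligned sequences. A minimal sketch of that calculation follows. It is an illustration only: the function name and the toy sequences are invented here, and the sketch does not partition sites into synonymous and nonsynonymous classes as the study did, which requires codon-aware counting.

```python
from itertools import combinations


def nucleotide_diversity(aligned: list[str]) -> float:
    """Average pairwise proportion of differing sites (unpartitioned pi).

    Assumes the sequences are already aligned and of equal length.
    """
    if len(aligned) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if length == 0 or any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be non-empty and of equal length")

    pairs = list(combinations(aligned, 2))
    # For each pair, count mismatching columns and divide by alignment length.
    total = sum(
        sum(a != b for a, b in zip(x, y)) / length
        for x, y in pairs
    )
    return total / len(pairs)


# Toy alignment (hypothetical data, not taken from the HA1 study).
toy = ["ATGC", "ATGA", "ATGC"]
pi = nucleotide_diversity(toy)
```

    Applying the same function separately to the columns classified as synonymous and nonsynonymous would give the kind of comparison the abstract describes, provided the site classification is done first.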

  20. FASH: A web application for nucleotides sequence search

    PubMed Central

    Veksler-Lublinksy, Isana; Barash, Danny; Avisar, Chai; Troim, Einav; Chew, Paul; Kedem, Klara

    2008-01-01

    FASH (Fourier Alignment Sequence Heuristics) is a web application, based on the Fast Fourier Transform, for finding remote homologs within a long nucleic acid sequence. Given a query sequence and a long text-sequence (e.g., the human genome), FASH detects subsequences within the text that are remotely similar to the query. FASH offers an alternative approach to Blast/Fasta for querying long RNA/DNA sequences. FASH differs from these other approaches in that it does not depend on the existence of contiguous seed-sequences in its initial detection phase. The FASH web server is user friendly and very easy to operate. FASH can be accessed at (secured website) PMID:18505581

  1. Nucleotide sequence of Neurospora crassa cytoplasmic initiator tRNA.

    PubMed Central

    Gillum, A M; Hecker, L I; Silberklang, M; Schwartzbach, S D; RajBhandary, U L; Barnett, W E

    1977-01-01

    Initiator methionine tRNA from the cytoplasm of Neurospora crassa has been purified and sequenced. The sequence is: pAGCUGCAUm1GGCGCAGCGGAAGCGCM22GCY*GGGCUCAUt6AACCCGGAGm7GU (or D) - CACUCGAUCGm1AAACGAG*UUGCAGCUACCAOH. Similar to initiator tRNAs from the cytoplasm of other eukaryotes, this tRNA also contains the sequence -AUCG- instead of the usual -TphiCG (or A)- found in loop IV of other tRNAs. The sequence of the N. crassa cytoplasmic initiator tRNA is quite different from that of the corresponding mitochondrial initiator tRNA. Comparison of the sequence of N. crassa cytoplasmic initiator tRNA to those of yeast, wheat germ and vertebrate cytoplasmic initiator tRNA indicates that the sequences of the two fungal tRNAs are no more similar to each other than they are to those of other initiator tRNAs. Images PMID:146192

  2. cDNA and derived amino acid sequence of ethanol-inducible rabbit liver cytochrome P-450 isozyme 3a (P-450ALC).

    PubMed Central

    Khani, S C; Zaphiropoulos, P G; Fujita, V S; Porter, T D; Koop, D R; Coon, M J

    1987-01-01

    Administration of ethanol to rabbits is known to induce a unique liver microsomal cytochrome P-450, termed isozyme 3a or P-450ALC, which is responsible for the increased oxidation of ethanol and other alcohols and the activation of toxic or carcinogenic compounds such as acetaminophen and N-nitrosodimethylamine. To further characterize this cytochrome P-450 we have identified cDNA clones to isozyme 3a by immunoscreening, DNA hybridization, and hybridization-selection. The cDNA sequence determined from two overlapping clones contains an open reading frame of 1416 nucleotides, and the first 25 amino acids of this reading frame correspond to residues 21-45 of cytochrome P-450 3a. The complete polypeptide, including residues 1 to 20, contains 492 amino acids and has a molecular weight of 56,820. Cytochrome P-450 3a is approximately 55% identical in sequence to P-450 isozymes 1 and 3b and 48% identical to isozyme 2. Hybridization of clone p3a-2 to electrophoretically fractionated rabbit liver poly(A)+ RNA revealed multiple bands, but, with a probe derived from the 3' nontranslated portion of this cDNA, only a 1.9-kilobase band was observed. Treatment of rabbits with imidazole, which increases the content of isozyme 3a, resulted in a transient increase in form 3a mRNA, but this was judged to be insufficient to account for the known 4.5-fold increase in form 3a protein. Genomic DNA analysis indicated that the cytochrome P-450 3a gene does not belong to a large subfamily. Images PMID:3027695
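
    The residue counts reported in this abstract can be cross-checked with simple codon arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the reported 1416-nucleotide open reading frame does not include a stop codon, which the abstract does not state.

```python
def codons_in_orf(orf_nt: int) -> int:
    """Return the number of complete codons in an ORF of orf_nt nucleotides."""
    if orf_nt <= 0 or orf_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of three")
    return orf_nt // 3


# Residues encoded by the ORF found in the two overlapping clones.
orf_residues = codons_in_orf(1416)

# Residues 1-20 are not covered by the clones; the ORF starts at residue 21.
missing_n_terminal = 20

# Length of the complete polypeptide under the assumption stated above.
total_residues = orf_residues + missing_n_terminal
print(orf_residues, total_residues)
```

    Under that assumption the ORF accounts for 472 residues, and adding the 20 N-terminal residues gives the 492 amino acids quoted for the complete polypeptide.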

  3. Complete nucleotide sequence of a new isolate of passion fruit woodiness virus from Western Australia.

    PubMed

    Fukumoto, Tomohiro; Nakamura, Masayuki; Wylie, Stephen J; Chiaki, Yuya; Iwai, Hisashi

    2013-08-01

    We determined the complete genome sequence of the passion fruit woodiness virus Gld-1 isolate (PWV-Gld-1) from Australia and compared it with that of PWV-MU-2, another Australian isolate of PWV. The genomes shared high sequence identity in both the complete nucleotide sequence and the ORF amino acid sequence. All of the cleavage sites of each protein were identical to those of MU-2, and the sequence identity for the individual proteins ranged from 97.2 % to 100.0 %. However, the 5' untranslated region (5'UTR) of the Gld-1 isolate shared only 46.8 % sequence identity with that of PWV-MU-2 and was 177 nucleotides shorter. Re-sequencing of the 5'UTR of MU-2 revealed that the 5' end of the original sequence includes an artifact generated by deep sequencing.

  4. RNA Secondary Structures Having a Compatible Sequence of Certain Nucleotide Ratios.

    PubMed

    Barrett, Christopher L; Li, Thomas J X; Reidys, Christian M

    2016-11-01

    Given a random RNA secondary structure, S, we study RNA sequences having fixed ratios of nucleotides that are compatible with S. We perform this analysis for RNA secondary structures subject to various base-pairing rules and minimum arc- and stack-length restrictions. Our main result reads as follows: in the simplex of nucleotide ratios, there exists a convex region, in which, in the limit of long sequences, a random structure asymptotically almost surely (a.a.s.) has compatible sequence with these ratios and outside of which a.a.s. a random structure has no such compatible sequence. We localize this region for RNA secondary structures subject to various base-pairing rules and minimum arc- and stack-length restrictions. In particular, for GC-sequences (GC denoting the nucleotides guanine and cytosine, respectively) having a ratio of G nucleotides smaller than 1/3, a random RNA secondary structure without any minimum arc- and stack-length restrictions has a.a.s. no such compatible sequence. For sequences having a ratio of G nucleotides larger than 1/3, a random RNA secondary structure has a.a.s. such compatible sequences. We discuss our results in the context of various families of RNA structures.
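
    The notion of a sequence being compatible with a secondary structure can be made concrete with a short check: every paired position must hold two nucleotides permitted by the base-pairing rule. The sketch below is a hedged illustration, not the authors' method. It assumes the structure is written in dot-bracket notation (which the abstract does not mention), and all function names are invented for this example.

```python
def pair_table(structure: str) -> list[tuple[int, int]]:
    """Return the (i, j) index pairs of a dot-bracket structure."""
    stack: list[int] = []
    pairs: list[tuple[int, int]] = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unbalanced structure: extra '('")
    return pairs


# Watson-Crick pairs only; a different rule set could be passed instead.
WATSON_CRICK = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}


def is_compatible(sequence: str, structure: str, allowed=WATSON_CRICK) -> bool:
    """True if every paired position holds an allowed base pair."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    return all(
        (sequence[i], sequence[j]) in allowed
        for i, j in pair_table(structure)
    )


def g_ratio(sequence: str) -> float:
    """Fraction of G nucleotides in a non-empty sequence."""
    return sequence.count("G") / len(sequence)
```

    For example, the GC-sequence GGCCCC (G ratio 1/3) is compatible with the structure ((..)), whereas GGCCGG is not, because its inner pair would be G with G.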

  5. Cloning and nucleotide sequence of the aroA gene of Bordetella pertussis.

    PubMed Central

    Maskell, D J; Morrissey, P; Dougan, G

    1988-01-01

    The aroA locus of Bordetella pertussis, encoding 5-enolpyruvylshikimate 3-phosphate synthase, has been cloned into Escherichia coli by using a cosmid vector. The gene is expressed in E. coli and complemented an E. coli aroA mutant. The nucleotide sequence of the B. pertussis aroA gene was determined and contains an open reading frame encoding 442 amino acids, with a calculated molecular weight for 5-enolpyruvylshikimate 3-phosphate synthase of 46,688. The amino acid sequence derived from the nucleotide sequence shows homology with the published amino acid sequences of aroA gene products of other microorganisms. PMID:2897356

  6. Cloning and sequence analysis of an Ophiophagus hannah cDNA encoding a precursor of two natriuretic peptide domains.

    PubMed

    Lei, Weiwei; Zhang, Yong; Yu, Guoyu; Jiang, Ping; He, Yingying; Lee, Wenhui; Zhang, Yun

    2011-04-01

    The king cobra (Ophiophagus hannah) is the largest venomous snake. Although its components are mainly neurotoxins, the venom contains several proteins affecting the blood system. Natriuretic peptide (NP), one of the important components of snake venoms, could cause local vasodilatation and promote capillary permeability, facilitating a rapid diffusion of other toxins into the prey tissues. Due to their low abundance, snake venom NPs are hard to purify, so cDNA cloning of the NPs has become a useful approach. In this study, a 957 bp natriuretic peptide-encoding cDNA clone was isolated from an O. hannah venom gland cDNA library. The open reading frame of the cDNA encodes a 210-amino-acid precursor protein named Oh-NP. Oh-NP has a typical signal peptide sequence of 26 amino acid residues. Surprisingly, Oh-NP has two typical NP domains which consist of the typical sequence of 17-residue loop of CFGXXDRIGC, so it is an unusual NP precursor. These two NP domains share high amino acid sequence identity. In addition, there are two homologous peptides of unknown function within the Oh-NP precursor. To our knowledge, Oh-NP is the first protein precursor containing two NP domains. It might belong to another subclass of snake venom NPs. Copyright © 2011 Elsevier Ltd. All rights reserved.

  7. Identification and isolation of full-length cDNA sequences by sequencing and analysis of expressed sequence tags from guarana (Paullinia cupana).

    PubMed

    Figueirêdo, L C; Faria-Campos, A C; Astolfi-Filho, S; Azevedo, J L

    2011-06-21

    The current intense production of biological data, generated by sequencing techniques, has created an ever-growing volume of unanalyzed data. We reevaluated data produced by the guarana (Paullinia cupana) transcriptome sequencing project to identify cDNA clones with complete coding sequences (full-length clones) and complete sequences of genes of biotechnological interest, contributing to the knowledge of biological characteristics of this organism. We analyzed 15,490 ESTs of guarana in search of clones with complete coding regions. A total of 12,402 sequences were analyzed using BLAST, and 4697 full-length clones were identified, responsible for the production of 2297 different proteins. Eighty-four clones were identified as full-length for N-methyltransferase and 18 were sequenced in both directions to obtain the complete genome sequence, and confirm the search made in silico for full-length clones. Phylogenetic analyses were made with the complete genome sequences of three clones, which showed only 0.017% dissimilarity; these are phylogenetically close to the caffeine synthase of Theobroma cacao. The search for full-length clones allowed the identification of numerous clones that had the complete coding region, demonstrating this to be an efficient and useful tool in the process of biological data mining. The sequencing of the complete coding region of identified full-length clones corroborated the data from the in silico search, strengthening its efficiency and utility.

  8. Isolation and complete nucleotide sequence of the measles virus IMB-1 strain in China.

    PubMed

    Ma, Shao-hui; Wang, Li-chun; Liu, Jian-sheng; Shi, Hai-jing; Liu, Long-ding; Li, Qi-han

    2010-12-01

    The complete nucleotide sequence of the measles virus strain IMB-1, which was isolated in China, was determined. As in other measles viruses, its genome is 15,894 nucleotides in length and encodes six proteins. The full-length nucleotide sequence of the IMB-1 isolate differed from vaccine strains (including wild-type Edmonston strain) by 4%-5% at the nucleotide sequence level. This isolate has amino acid variations over the full genome, including in the hemagglutinin and fusion genes. This report is the first to describe the full-length genome of a genotype H1 strain and provide an overview of the diversity of genetic characteristics of a circulating measles virus.

  9. An analysis of expressed sequence tags of developing castor endosperm using a full-length cDNA library

    PubMed Central

    Lu, Chaofu; Wallis, James G; Browse, John

    2007-01-01

    Background: Castor seeds are a major source for ricinoleate, an important industrial raw material. Genomics studies of castor plant will provide critical information for understanding seed metabolism, for effectively engineering ricinoleate production in transgenic oilseeds, or for genetically improving castor plants by eliminating toxic and allergic proteins in seeds. Results: Full-length cDNAs are useful resources in annotating genes and in providing functional analysis of genes and their products. We constructed a full-length cDNA library from developing castor endosperm, and obtained 4,720 ESTs from 5'-ends of the cDNA clones representing 1,908 unique sequences. The most abundant transcripts are genes encoding storage proteins, ricin, agglutinin and oleosins. Several other sequences are also very numerous, including two acidic triacylglycerol lipases, and the oleate hydroxylase (FAH12) gene that is responsible for ricinoleate biosynthesis. The role(s) of the lipases in developing castor seeds are not clear, and co-expression of a lipase and the FAH12 did not result in significant changes in hydroxy fatty acid accumulation in transgenic Arabidopsis seeds. Only one oleate desaturase (FAD2) gene was identified in our cDNA sequences. Sequence and functional analyses of the castor FAD2 were carried out since it had not been characterized previously. Overexpression of castor FAD2 in a FAH12-expressing Arabidopsis line resulted in decreased accumulation of hydroxy fatty acids in transgenic seeds. Conclusion: Our results suggest that transcriptional regulation of FAD2 and FAH12 genes may be one of the mechanisms that contribute to a high level of ricinoleate accumulation in castor endosperm. The full-length cDNA library will be used to search for additional genes that affect ricinoleate accumulation in seed oils. Our EST sequences will also be useful to annotate the castor genome, whose whole sequence is being generated by shotgun sequencing at the Institute for Genome

  10. Insertion sites and the terminal nucleotide sequences of the Tn4 transposon.

    PubMed

    Hyde, D R; Tu, C P

    1982-07-10

    The nucleotide sequences at the ends of the Tn4 transposon (mercury spectinomycin and sulfonamide resistance) have been determined. They are inverted repeated sequences of 38 nucleotides with three mismatched base pairs. These sequences are strongly homologous with the terminal sequences of Tn501 (mercury resistance) but less so with those of Tn3 (ampicillin resistance). The Tn4 transposon generates pentanucleotide members (Tn3, Tn1000, Tn501, Tn551, IS2) with the exception of Tn1721 and bacteriophage Mu. Among the three Tn4 insertion sites examined here, two of them occurred near a nonanucleotide sequence in perfect homology with part of the terminal inverted-repeat sequence of Tn4 and the third insertion occurred near a sequence of partial homology to one end of Tn4. All three insertions were in the same orientation such that IRb is proximal to its homologous sequence on the recipient DNA.

  11. Next-generation sequencing-based 5' rapid amplification of cDNA ends for alternative promoters.

    PubMed

    Perera, Bambarendage P U; Kim, Joomyeong

    2016-02-01

    Mammalian genomes contain many unknown alternative first exons and promoters. Thus, we have modified the existing 5'RACE (5' rapid amplification of cDNA ends) approach into a next-generation sequencing (NGS)-based new protocol that can identify these alternative promoters. This protocol has incorporated two main ideas: (i) 5'RACE starting from the known second exons of genes and (ii) NGS-based sequencing of the subsequent cDNA products. This protocol also provides a bioinformatics strategy that processes the sequence reads from NGS runs. This protocol has successfully identified several alternative promoters for an imprinted gene, PEG3. Overall, this NGS-based 5'RACE protocol is a sensitive and reliable method for detecting low-abundant transcripts and promoters.

  12. Screening of substrate peptide sequences for tissue-type transglutaminase (TGase 2) using T7 phage cDNA library.

    PubMed

    Sugimura, Yoshiaki; Yamashita, Hiroyuki; Hitomi, Kiyotaka

    2011-03-01

    Transglutaminases (TGases) are a family of enzymes that catalyze cross-linking between glutamine and lysine residues of substrate proteins in several mammalian biological events. The substrate proteins of TGases and their physiological relevance are still under investigation, and the list continues to expand. In this study, we have established a novel screening system that enables identification of cDNA sequences encoding a favorable primary structure as a substrate for tissue-type transglutaminase (TGase 2), a multifunctional and ubiquitously expressed isozyme. By this screening, we identified several T7 phage clones that displayed substrate peptides for TGase 2 as translated products from a human brain cDNA library. Among the selected clones, the C-terminal region of IKAP, IkappaB kinase complex-associated protein, appeared as a highly reactive substrate sequence for TGase 2. This system opens the possibility of rapid identification of substrate sequences for transglutaminases at the genetic level.

  13. Generation of expressed sequence tags of random root cDNA clones of Brassica napus by single-run partial sequencing.

    PubMed Central

    Park, Y S; Kwak, J M; Kwon, O Y; Kim, Y S; Lee, D S; Cho, M J; Lee, H H; Nam, H G

    1993-01-01

    Two hundred thirty-seven expressed sequence tags (ESTs) of Brassica napus were generated by single-run partial sequencing of 197 random root cDNA clones. A computer search of these root ESTs revealed that 21 ESTs show significant similarity to the protein-coding sequences in the existing databases, including five stress- or defense-related genes and four clones related to the genes from other kingdoms. Northern blot analysis of the 10 database-matched cDNA clones revealed that many of the clones are expressed most abundantly in root but less abundantly in other organs. However, two clones were highly root specific. The results show that generation of root ESTs by partial sequencing of random cDNA clones, along with expression analysis, is an efficient approach for isolating, on a large scale, genes that function in plant roots. We also discuss the results of the examination of cDNA libraries and sequencing methods suitable for this approach. PMID:8029332

  14. Quantum Point Contact Single-Nucleotide Conductance for DNA and RNA Sequence Identification.

    PubMed

    Afsari, Sepideh; Korshoj, Lee E; Abel, Gary R; Khan, Sajida; Chatterjee, Anushree; Nagpal, Prashant

    2017-10-06

    Several nanoscale electronic methods have been proposed for high-throughput single-molecule nucleic acid sequence identification. While many studies display a large ensemble of measurements as "electronic fingerprints" with some promise for distinguishing the DNA and RNA nucleobases (adenine, guanine, cytosine, thymine, and uracil), important metrics such as accuracy and confidence of base calling fall well below the current genomic methods. Issues such as unreliable metal-molecule junction formation, variation of nucleotide conformations, insufficient differences between the molecular orbitals responsible for single-nucleotide conduction, and lack of rigorous base calling algorithms lead to overlapping nanoelectronic measurements and poor nucleotide discrimination, especially at low coverage on single molecules. Here, we demonstrate a technique for reproducible conductance measurements on conformation-constrained single nucleotides and an advanced algorithmic approach for distinguishing the nucleobases. Our quantum point contact single-nucleotide conductance sequencing (QPICS) method uses combed and electrostatically bound single DNA and RNA nucleotides on a self-assembled monolayer of cysteamine molecules. We demonstrate that by varying the applied bias and pH conditions, molecular conductance can be switched ON and OFF, leading to reversible nucleotide perturbation for electronic recognition (NPER). We utilize NPER as a method to achieve >99.7% accuracy for DNA and RNA base calling at low molecular coverage (∼12×) using unbiased single measurements on DNA/RNA nucleotides, which represents a significant advance compared to existing sequencing methods. These results demonstrate the potential for utilizing simple surface modifications and existing biochemical moieties in individual nucleobases for a reliable, direct, single-molecule, nanoelectronic DNA and RNA nucleotide identification method for sequencing.

  15. In-depth cDNA library sequencing provides quantitative gene expression profiling in cancer biomarker discovery.

    PubMed

    Yang, Wanling; Ying, Dingge; Lau, Yu-Lung

    2009-06-01

    Quantitative gene expression analysis plays an important role in identifying differentially expressed genes in various pathological states, gene expression regulation and co-regulation, shedding light on gene functions. Although microarray is widely used as a powerful tool in this regard, it is suboptimal quantitatively and unable to detect unknown gene variants. Here we demonstrate effective detection of differential expression and co-regulation of certain genes by expressed sequence tag analysis using a selected subset of cDNA libraries. We discuss the issues of sequencing depth and library preparation, and propose that increased sequencing depth and improved preparation procedures may allow detection of many expression features for less abundant gene variants. With the reduction of sequencing cost and the emergence of new-generation sequencing technology, in-depth sequencing of cDNA pools or libraries may represent a better and more powerful tool in gene expression profiling and cancer biomarker detection. We also propose using sequence-specific subtraction to remove hundreds of the most abundant housekeeping genes to increase sequencing depth without affecting the relative expression ratios of other genes, as transcripts from as few as the 300 most abundantly expressed genes constitute about 20% of the total transcriptome. In-depth sequencing also offers the unique advantage of detecting unknown forms of transcripts, such as alternative splicing variants, fusion genes, and regulatory RNAs, as well as detecting mutations and polymorphisms that may play important roles in disease pathogenesis.
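
    The subtraction argument in this abstract can be illustrated with a small calculation. The sketch below assumes that the total number of reads is fixed and that, after subtraction, reads are redistributed proportionally among the remaining transcripts; the function names are hypothetical and are not taken from the paper.

```python
def depth_gain(subtracted_fraction: float) -> float:
    """Fold increase in reads per remaining transcript after removing a
    fraction of the transcriptome, for a fixed total number of reads
    (assumes proportional redistribution of reads)."""
    if not 0.0 <= subtracted_fraction < 1.0:
        raise ValueError("subtracted_fraction must be in [0, 1)")
    return 1.0 / (1.0 - subtracted_fraction)


def relative_ratio(count_a: float, count_b: float, gain: float) -> float:
    """Expression ratio of two remaining genes after both are scaled by `gain`."""
    return (count_a * gain) / (count_b * gain)


# Figure quoted in the abstract: ~300 genes make up ~20% of the transcriptome.
gain = depth_gain(0.20)
print(f"fold gain in depth for remaining genes: {gain:.2f}")
print(f"ratio of two remaining genes (30 vs 10 reads): {relative_ratio(30, 10, gain):.1f}")
```

    Under these assumptions, every remaining gene is scaled by the same factor, which is why the relative expression ratios are unchanged.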

  16. Diverse nucleotide compositions and sequence fluctuation in Rubisco protein genes

    NASA Astrophysics Data System (ADS)

    Holden, Todd; Dehipawala, S.; Cheung, E.; Bienaime, R.; Ye, J.; Tremberger, G., Jr.; Schneider, P.; Lieberman, D.; Cheung, T.

    2011-10-01

    The Rubisco protein-enzyme is arguably the most abundant protein on Earth. The biology dogma of transcription and translation necessitates the study of the Rubisco genes and Rubisco-like genes in various species. Stronger correlation of fractal dimension of the atomic number fluctuation along a DNA sequence with Shannon entropy has been observed in the studied Rubisco-like gene sequences, suggesting a more diverse evolutionary pressure and constraints in the Rubisco sequences. The strategy of using metal for structural stabilization appears to be an ancient mechanism, with data from the porphobilinogen deaminase gene in Capsaspora owczarzaki and Monosiga brevicollis. Using the chi-square distance probability, our analysis supports the conjecture that the more ancient Rubisco-like sequence in Microcystis aeruginosa would have experienced very different evolutionary pressure and bio-chemical constraint as compared to Bordetella bronchiseptica, the two microbes occupying either end of the correlation graph. Our exploratory study would indicate that high fractal dimension Rubisco sequence would support high carbon dioxide rate via the Michaelis-Menten coefficient; with implication for the control of the whooping cough pathogen Bordetella bronchiseptica, a microbe containing a high fractal dimension Rubisco-like sequence (2.07). Using the internal comparison of chi-square distance probability for 16S rRNA (~ E-22) versus radiation repair Rec-A gene (~ E-05) in high GC content Deinococcus radiodurans, our analysis supports the conjecture that high GC content microbes containing Rubisco-like sequence are likely to include an extra-terrestrial origin, relative to Deinococcus radiodurans. Similar photosynthesis process that could utilize host star radiation would not compete with radiation resistant process from the biology dogma perspective in environments such as Mars and exoplanets.

  17. Complete nucleotide sequences of a distinct bipartite begomovirus, bitter gourd yellow vein virus, infecting Momordica charantia.

    PubMed

    Tahir, Muhammad; Haider, Muhammad Saleem; Briddon, Rob W

    2010-11-01

    Momordica charantia (Cucurbitaceae), a vegetable crop commonly cultivated throughout Pakistan, and begomoviruses, a serious threat to crop plants, are natives of tropical and subtropical regions of the world. Leaf samples of M. charantia with yellow vein symptoms typical of begomovirus infections and samples from apparently healthy plants were collected from areas around Lahore in 2004. Full-length clones of a bipartite begomovirus were isolated from symptomatic samples. The complete nucleotide sequences of the components of one isolate were determined, and these showed the arrangement of genes typical of Old World begomoviruses. The complete nucleotide sequence of DNA A showed the highest nucleotide sequence identity (86.9%) to an isolate of Tomato leaf curl New Delhi virus (ToLCNDV), confirming it to belong to a distinct species of begomovirus, for which the name Bitter gourd yellow vein virus (BGYVV) is proposed. Sequence comparisons showed that BGYVV likely emerged as a result of inter-specific recombination between ToLCNDV and tomato leaf curl Bangladesh virus (ToLCBDV). The complete nucleotide sequence of DNA B showed 97.2% nucleotide sequence identity to that of an Indian strain of Squash leaf curl China virus.

  18. Classification of nucleotide sequences using support vector machines.

    PubMed

    Seo, Tae-Kun

    2010-10-01

    Species identification is one of the most important issues in biological studies. Due to recent increases in the amount of genomic information available and the development of DNA sequencing technologies, the applicability of using DNA sequences to identify species (commonly referred to as "DNA barcoding") is being tested in many areas. Several methods have been suggested to identify species using DNA sequences, including similarity scores, analysis of phylogenetic and population genetic information, and detection of species-specific sequence patterns. Although these methods have demonstrated good performance under a range of circumstances, they also have limitations: they are subject to loss of information, require intensive computation, are sensitive to model mis-specification, and can be difficult to evaluate in terms of the significance of identification. Here, we suggest a new DNA barcoding method in which support vector machine (SVM) procedures are adopted. Our new method is nonparametric and thus is expected to be robust for a wide range of evolutionary scenarios as well as multilocus analyses. Furthermore, we describe bootstrap procedures that can be used to test the significance of species identifications. We implemented a novel conversion technique for transforming sequence data to real-valued vectors, and therefore, bootstrap procedures can be easily combined with our SVM approach. In this study, we present the results of simulation studies and empirical data analyses to demonstrate the performance of our method and discuss its properties.
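
    The abstract does not describe how sequences are converted to real-valued vectors. Purely as an illustration of what such a conversion can look like, the sketch below uses one-hot encoding, a common stand-in that is not claimed to be the authors' technique; the function name and alphabet constant are hypothetical.

```python
ALPHABET = "ACGT"


def one_hot(sequence: str) -> list:
    """Flatten an aligned DNA sequence into a real-valued vector with four
    entries per site; gaps and ambiguous symbols become all zeros."""
    vector = []
    for base in sequence.upper():
        vector.extend(1.0 if base == symbol else 0.0 for symbol in ALPHABET)
    return vector


# Two sites give an 8-dimensional vector that an SVM library could consume.
print(one_hot("AC"))
```

    Because every aligned sequence of the same length maps to a vector of the same dimension, resampling alignment sites for a bootstrap amounts to resampling blocks of four vector entries.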

  19. Nature and distribution of feline sarcoma virus nucleotide sequences.

    PubMed Central

    Frankel, A E; Gilbert, J H; Porzig, K J; Scolnick, E M; Aaronson, S A

    1979-01-01

    The genomes of three independent isolates of feline sarcoma virus (FeSV) were compared by molecular hybridization techniques. Using complementary DNAs prepared from two strains, SM- and ST-FeSV, common complementary DNAs were selected by sequential hybridization to FeSV and feline leukemia virus RNAs. These DNAs were shown to be highly related among the three independent sarcoma virus isolates. FeSV-specific complementary DNAs were prepared by selection for hybridization by the homologous FeSV RNA and against hybridization by feline leukemia virus RNA. Sarcoma virus-specific sequences of SM-FeSV were shown to differ from those of either ST- or GA-FeSV strains, whereas ST-FeSV-specific DNA shared extensive sequence homology with GA-FeSV. By molecular hybridization, each set of FeSV-specific sequences was demonstrated to be present in normal cat cellular DNA in approximately one copy per haploid genome and was conserved throughout Felidae. In contrast, FeSV-common sequences were present in multiple DNA copies and were found only in Mediterranean cats. The present results are consistent with the concept that each FeSV strain has arisen by a mechanism involving recombination between feline leukemia virus and cat cellular DNA sequences, the latter represented within the cat genome in a manner analogous to that of a cellular gene. PMID:225544

  20. Nucleotide sequence of the Agrobacterium tumefaciens octopine Ti plasmid-encoded tmr gene.

    PubMed Central

    Heidekamp, F; Dirkse, W G; Hille, J; van Ormondt, H

    1983-01-01

    The nucleotide sequence of the tmr gene, encoded by the octopine Ti plasmid from Agrobacterium tumefaciens (pTiAch5), was determined. The T-DNA, which encompasses this gene, is involved in tumor formation and maintenance, and probably mediates the cytokinin-independent growth of transformed plant cells. The nucleotide sequence of the tmr gene displays a continuous open reading frame specifying a polypeptide chain of 240 amino acids. The 5'-terminus of the polyadenylated tmr mRNA isolated from octopine tobacco tumor cell lines was determined by nuclease S1 mapping. The nucleotide sequence 5'-TATAAAA-3', which is identical to the canonical "TATA" box, was found 29 nucleotides upstream from the major initiation site for RNA synthesis. Two potential polyadenylation signals 5'-AATAAA-3' were found at 207 and 275 nucleotides downstream from the TAG stop codon of the tmr gene. A comparison was made of nucleotide stretches involved in transcription control of T-DNA genes. Images PMID:6312414

  1. Characterization of rainbow trout gonad, brain and gill deep cDNA repertoires using a Roche 454-Titanium sequencing approach.

    PubMed

    Le Cam, Aurélie; Bobe, Julien; Bouchez, Olivier; Cabau, Cédric; Kah, Olivier; Klopp, Christophe; Lareyre, Jean-Jacques; Le Guen, Isabelle; Lluch, Jérôme; Montfort, Jérôme; Moreews, Francois; Nicol, Barbara; Prunet, Patrick; Rescan, Pierre-Yves; Servili, Arianna; Guiguen, Yann

    2012-05-25

    Rainbow trout, Oncorhynchus mykiss, is an important aquaculture species worldwide and, in addition to being of commercial interest, it is also a research model organism of considerable scientific importance. Because of the lack of a whole genome sequence in that species, transcriptomic analyses of this species have often been hindered. Using next-generation sequencing (NGS) technologies, we sought to fill these informational gaps. Here, using Roche 454-Titanium technology, we provide new tissue-specific cDNA repertoires from several rainbow trout tissues. Non-normalized cDNA libraries were constructed from testis, ovary, brain and gill rainbow trout tissue samples, and these different libraries were sequenced in 10 separate half-runs of 454-Titanium. Overall, we produced a total of 3 million quality sequences with an average size of 328 bp, representing more than 1 Gb of expressed sequence information. These sequences have been combined with all publicly available rainbow trout sequences, resulting in a total of 242,187 clusters of putative transcript groups and 22,373 singletons. To identify the predominantly expressed genes in different tissues of interest, we developed a Digital Differential Display (DDD) approach. This approach allowed us to characterize the genes that are predominantly expressed within each tissue of interest. Of these genes, some were already known to be tissue-specific, thereby validating our approach. Many others, however, were novel candidates, demonstrating the usefulness of our strategy and of such tissue-specific resources. This new sequence information, acquired using NGS 454-Titanium technology, deeply enriched our current knowledge of the expressed genes in rainbow trout through the identification of an increased number of tissue-specific sequences. This identification allowed a precise cDNA tissue repertoire to be characterized in several important rainbow trout tissues.
The rainbow trout contig browser can be accessed at the following

  2. Annotated Expressed Sequence Tags and cDNA Microarrays for Studies of Brain and Behavior in the Honey Bee

    PubMed Central

    Whitfield, Charles W.; Band, Mark R.; Bonaldo, Maria F.; Kumar, Charu G.; Liu, Lei; Pardinas, Jose R.; Robertson, Hugh M.; Soares, M. Bento; Robinson, Gene E.

    2002-01-01

    To accelerate the molecular analysis of behavior in the honey bee (Apis mellifera), we created expressed sequence tag (EST) and cDNA microarray resources for the bee brain. Over 20,000 cDNA clones were partially sequenced from a normalized (and subsequently subtracted) library generated from adult A. mellifera brains. These sequences were processed to identify 15,311 high-quality ESTs representing 8912 putative transcripts. Putative transcripts were functionally annotated (using the Gene Ontology classification system) based on matching gene sequences in Drosophila melanogaster. The brain ESTs represent a broad range of molecular functions and biological processes, with neurobiological classifications particularly well represented. Roughly half of Drosophila genes currently implicated in synaptic transmission and/or behavior are represented in the Apis EST set. Of Apis sequences with open reading frames of at least 450 bp, 24% are highly diverged with no matches to known protein sequences. Additionally, over 100 Apis transcript sequences conserved with other organisms appear to have been lost from the Drosophila genome. DNA microarrays were fabricated with over 7000 EST cDNA clones putatively representing different transcripts. Using probe derived from single bee brain mRNA, microarrays detected gene expression for 90% of Apis cDNAs two standard deviations greater than exogenous control cDNAs. [The sequence data described in this paper have been submitted to Genbank data library under accession nos. BI502708–BI517278. The sequences are also available at http://titan.biotec.uiuc.edu/bee/honeybee_project.htm.] PMID:11932240

  3. The nucleotide sequence of tomato mottle virus, a new geminivirus isolated from tomatoes in Florida.

    PubMed

    Abouzid, A M; Polston, J E; Hiebert, E

    1992-12-01

    A new geminivirus, tomato mottle virus (TMoV), affecting tomato production in Florida has been cloned and sequenced. Sequence analysis of the cloned replicative forms of TMoV revealed four potential coding regions for the A component [2601 nucleotides (nt)] and two for the B component (2541 nt). Comparisons of the nucleotide sequence of the TMoV genome with those of other whitefly-transmitted geminiviruses indicate that TMoV is a typical bipartite geminivirus of the New World and is closely related to but distinct from abutilon mosaic virus.

  4. Nucleotide sequences of 5S rRNAs from four jellyfishes.

    PubMed

    Hori, H; Ohama, T; Kumazaki, T; Osawa, S

    1982-11-25

    The nucleotide sequences of 5S rRNAs from four jellyfishes, Spirocodon saltatrix, Nemopsis dofleini, Aurelia aurita and Chrysaora quinquecirrha have been determined. The sequences are highly similar to each other. A fairly high similarity was also found between these jellyfishes and a sea anemone, Anthopleura japonica.

  5. Should nucleotide sequence analyzing computer algorithms always extend homologies by extending homologies?

    PubMed

    Burnett, L; Basten, A; Hensley, W J

    1986-01-10

    Most computer algorithms used for comparing or aligning nucleotide sequences rely on the premise that the best way to extend a homology between the two sequences is to select a match rather than a mismatch. We have tested this assumption and found that it is not always valid.

  6. Nucleotide sequences of 5S rRNAs from four jellyfishes.

    PubMed Central

    Hori, H; Ohama, T; Kumazaki, T; Osawa, S

    1982-01-01

    The nucleotide sequences of 5S rRNAs from four jellyfishes, Spirocodon saltatrix, Nemopsis dofleini, Aurelia aurita and Chrysaora quinquecirrha have been determined. The sequences are highly similar to each other. A fairly high similarity was also found between these jellyfishes and a sea anemone, Anthopleura japonica. PMID:6130512

  7. Complete sequence of HLA-B27 cDNA identified through the characterization of structural markers unique to the HLA-A, -B, and -C allelic series

    SciTech Connect

    Szoets, H.; Reithmueller, G.; Weiss, E.; Meo, T.

    1986-03-01

    Antigen HLA-B27 is a high-risk genetic factor with respect to a group of rheumatoid disorders, especially ankylosing spondylitis. A cDNA library was constructed from an autozygous B-cell line expressing HLA-B27, HLA-Cw1, and the previously cloned HLA-A2 antigen. Clones detected with an HLA probe were isolated and sorted into homology groups by differential hybridization and restriction maps. Nucleotide sequencing allowed the unambiguous assignment of cDNAs to HLA-A, -B, and -C loci. The HLA-B27 mRNA has the structural features and the codon variability typical of an HLA class I transcript but it specifies two uncommon amino acid replacements: a cysteine in position 67 and a serine in position 131. The latter substitution may have functional consequences, because it occurs in a conserved region and at a position invariably occupied by a species-specific arginine in humans and lysine in mice. The availability of the complete sequence of HLA-B27 and of the partial sequence of HLA-Cw1 allows the recognition of locus-specific sequence markers, particularly, but not exclusively, in the transmembrane and cytoplasmic domains.

  8. Molecular cloning and sequence analysis of a cDNA encoding pituitary thyroid stimulating hormone beta-subunit of the Chinese soft-shell turtle Pelodiscus sinensis and regulation of its gene expression.

    PubMed

    Chien, Jung-Tsun; Chowdhury, Indrajit; Lin, Yao-Sung; Liao, Ching-Fong; Shen, San-Tai; Yu, John Yuh-Lin

    2006-04-01

    A cDNA encoding the thyroid stimulating hormone beta-subunit (TSHbeta) was cloned from the pituitary of the Chinese soft-shell turtle, Pelodiscus sinensis, and the regulation of its mRNA expression was investigated for the first time in a reptile. The Chinese soft-shell turtle TSHbeta cDNA was cloned from pituitary RNA by reverse transcription-polymerase chain reaction (RT-PCR) and rapid amplification of cDNA ends (RACE) methods. The cDNA consists of 580 nucleotides, including 67 nucleotides of 5'-untranslated region (UTR), 402 nucleotides of open reading frame, and 97 nucleotides of 3'-UTR followed by a poly(A) tail of 14 nucleotides. It encodes a precursor protein molecule of 133 amino acids with a putative signal peptide of 19 amino acids and a putative mature protein of 114 amino acids. The number and position of 12 cysteine residues, presumably forming six disulfide bonds, one putative asparagine-linked glycosylation site, and six proline residues that are found at positions for changing the backbone direction of the protein have been conserved in the turtle as in other vertebrate groups. The deduced amino acid sequence of the Chinese soft-shell turtle TSHbeta mature protein shares identities of 82-83% with birds, 71-72% with mammals, 49-57% with amphibians, and 44-61% with fish. The Chinese soft-shell turtle pituitaries were incubated in vitro with synthetic TRH (TSH-releasing hormone), thyroxine and triiodothyronine at doses of 10^-10 and 10^-8 M. TRH stimulated, while thyroid hormones suppressed, TSHbeta mRNA levels in a dose-related manner. The sequences of the cDNA and its deduced peptide of TSHbeta, as well as the regulation of its mRNA level, were reported for the first time in a reptile.
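
    The lengths quoted in this abstract can be cross-checked with simple bookkeeping. The sketch below assumes that the reported open reading frame length includes the stop codon, which the abstract does not state explicitly; the variable names are illustrative only.

```python
# Segment lengths reported in the abstract (nucleotides).
segments = {"5'-UTR": 67, "ORF": 402, "3'-UTR": 97, "poly(A)": 14}

total_length = sum(segments.values())

# Assumption: the ORF length includes the stop codon, so it encodes
# (length / 3) - 1 amino acid residues.
codons, remainder = divmod(segments["ORF"], 3)
precursor_residues = codons - 1

# Reported split of the precursor into signal peptide and mature protein.
signal_peptide, mature_protein = 19, 114

print("total cDNA length:", total_length)
print("ORF codons:", codons, "remainder:", remainder)
print("precursor residues:", precursor_residues)
print("signal + mature:", signal_peptide + mature_protein)
```

    Under that assumption the segment lengths add up to the reported cDNA length, and the residue count of the precursor matches the sum of the signal peptide and the mature protein.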

  9. Nucleotide sequence conservation in paramyxoviruses; the concept of codon constellation.

    PubMed

    Rima, Bert K

    2015-05-01

    The stability and conservation of the sequences of RNA viruses in the field and the high error rates measured in vitro are paradoxical. The field stability indicates that there are very strong selective constraints on sequence diversity. The nature of these constraints is discussed. Apart from constraints on variation in cis-acting RNA and the amino acid sequences of viral proteins, there are others relating to the presence of specific dinucleotides such as CpG and UpA, as well as the importance of RNA secondary structures and RNA degradation rates. Other constraints recently identified in other RNA viruses, such as effects of secondary RNA structure on protein folding or modification of cellular tRNA complements, are also discussed. Using the family Paramyxoviridae, I show that the codon usage pattern (CUP) (i) is specific for each virus species and (ii) is markedly different from that of the host; it does not vary even in vaccine viruses that have been derived by passage in a number of inappropriate host cells. The CUP might thus be an additional constraint on variation, and I propose the concept of codon constellation to indicate the informational content of the sequences of RNA molecules relating not only to stability and structure but also to the efficiency of translation of a viral mRNA resulting from the CUP and the numbers and position of rare codons.
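
    A codon usage pattern is, at its simplest, a table of how often each codon occurs in a coding sequence. The sketch below shows one minimal way to tabulate it; the function names and the toy input sequence are made up for illustration, and the abstract does not specify how the CUP was actually computed.

```python
from collections import Counter


def codon_counts(coding_sequence: str) -> Counter:
    """Count in-frame codons (RNA alphabet); a trailing partial codon is ignored."""
    seq = coding_sequence.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3
    return Counter(seq[i:i + 3] for i in range(0, usable, 3))


def codon_frequencies(coding_sequence: str) -> dict:
    """Relative frequency of each codon observed in the sequence."""
    counts = codon_counts(coding_sequence)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()} if total else {}


# Toy coding sequence (not from any real virus): AUG GCC GCC UAA.
print(codon_frequencies("ATGGCCGCCTAA"))
```

    Comparing such tables between a virus and its host, or between passaged strains, is one simple way to examine the kind of stability the abstract describes.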

  10. Nucleotide sequence of a human tRNA gene heterocluster

    SciTech Connect

    Chang, Y.N.; Pirtle, I.L.; Pirtle, R.M.

    1986-05-01

    Leucine tRNA from bovine liver was used as a hybridization probe to screen a human gene library harbored in Charon-4A of bacteriophage lambda. The human DNA inserts from plaque-pure clones were characterized by restriction endonuclease mapping and Southern hybridization techniques, using both [3'-32P]-labeled bovine liver leucine tRNA and total tRNA as hybridization probes. An 8-kb Hind III fragment of one of these lambda clones was subcloned into the Hind III site of pBR322. Subsequent fine restriction mapping and DNA sequence analysis of this plasmid DNA indicated the presence of four tRNA genes within the 8-kb DNA fragment. A leucine tRNA gene with an anticodon of AAG and a proline tRNA gene with an anticodon of AGG are in a 1.6-kb subfragment. A threonine tRNA gene with an anticodon of UGU and an as yet unidentified tRNA gene are located in a 1.1-kb subfragment. These two different subfragments are separated by 2.8 kb. The coding regions of the three sequenced genes contain characteristic internal split promoter sequences and do not have intervening sequences. The 3'-flanking regions of these three genes have typical RNA polymerase III termination sites of at least four consecutive T residues.
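
    The anticodon assignments in this abstract can be checked against the standard genetic code by taking the reverse complement of each anticodon. The sketch below assumes strict Watson-Crick pairing and ignores wobble pairing and post-transcriptional base modification; the function name is hypothetical, and the codon table contains only the three entries needed here.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Only the standard-code entries needed for this example.
CODON_TO_AMINO_ACID = {"CUU": "Leu", "CCU": "Pro", "ACA": "Thr"}


def codon_for_anticodon(anticodon: str) -> str:
    """Return the mRNA codon (5'->3') that pairs with an anticodon (5'->3')
    by strict Watson-Crick pairing, i.e. its reverse complement."""
    return "".join(COMPLEMENT[base] for base in reversed(anticodon.upper()))


for anticodon in ("AAG", "AGG", "UGU"):
    codon = codon_for_anticodon(anticodon)
    print(anticodon, "->", codon, "->", CODON_TO_AMINO_ACID[codon])
```

    Each anticodon maps to a codon for the amino acid named in the abstract (leucine, proline, and threonine respectively).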

  11. Methods for making nucleotide probes for sequencing and synthesis

    DOEpatents

    Church, George M; Zhang, Kun; Chou, Joseph

    2014-07-08

    Compositions and methods for making a plurality of probes for analyzing a plurality of nucleic acid samples are provided. Compositions and methods for analyzing a plurality of nucleic acid samples to obtain sequence information in each nucleic acid sample are also provided.

  12. A new hybrid fractal algorithm for predicting thermophilic nucleotide sequences.

    PubMed

    Lu, Jin-Long; Hu, Xue-Hai; Hu, Dong-Gang

    2012-01-21

    Knowledge of the thermophilic mechanisms of organisms whose optimum growth temperature (OGT) ranges from 50 to 80 °C plays a major role in helping design stable proteins. Predicting whether a DNA sequence is thermophilic is a long-standing problem that has not yet been satisfactorily resolved. Chaos game representation (CGR) can expose the patterns hidden in DNA sequences, and can visually reveal previously unknown structure. Fractal dimensions are good tools for measuring the sizes of complex, highly irregular geometric objects. In this paper, we convert every DNA sequence into a high-dimensional vector using the CGR algorithm and fractal dimension, and then predict the thermostability of the DNA sequence from these fractal features with a support vector machine (SVM). We have conducted experiments on three groups: 17-dimensional, 65-dimensional, and 257-dimensional vectors. Each group is evaluated by a 10-fold cross-validation test. The group of 257-dimensional vectors gives the best results: the average accuracy is 0.9456 and the average MCC is 0.8878. The results are also compared with previous work using single CGR features. The comparison shows the high effectiveness of the new hybrid fractal algorithm.
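
    For readers unfamiliar with chaos game representation, the sketch below shows the basic construction and how it yields fixed-length feature vectors. The corner assignment is one common convention, and the reading of the 17-, 65- and 257-dimensional vectors as 4^k grid frequencies (k = 2, 3, 4) plus one fractal-dimension value is an assumption, since the abstract does not spell it out; the function names are hypothetical and the fractal-dimension feature itself is not computed here.

```python
# One common corner convention for CGR; other assignments are also used.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(sequence: str) -> list:
    """Chaos game: start at the centre and move halfway to each base's corner."""
    x, y = 0.5, 0.5
    points = []
    for base in sequence.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def cgr_grid_frequencies(sequence: str, k: int) -> list:
    """Flattened 2^k x 2^k grid of visit frequencies (4^k values)."""
    size = 2 ** k
    counts = [0] * (size * size)
    # The first k-1 points still depend on the arbitrary starting position.
    usable = cgr_points(sequence)[k - 1:]
    for x, y in usable:
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        counts[row * size + col] += 1
    total = len(usable)
    return [c / total for c in counts] if total else [0.0] * len(counts)


for k in (2, 3, 4):
    features = cgr_grid_frequencies("ACGT" * 10, k)
    print(k, len(features), len(features) + 1)
```

    Each grid cell corresponds to one k-mer, so the grid frequencies are k-mer frequencies laid out spatially; a vector like this could then be passed to any SVM implementation.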

  13. Cloning and characterization of cDNA sequences encoding for new venom peptides of the Brazilian scorpion Opisthacanthus cayaporum.

    PubMed

    Silva, Edelyn C N; Camargos, Thalita S; Maranhão, Andrea Q; Silva-Pereira, Ildinete; Silva, Luciano P; Possani, Lourival D; Schwartz, Elisabeth F

    2009-09-01

    Scorpion venom glands produce a large variety of bioactive peptides. This communication reports the identification of venom components obtained by sequencing clones isolated from a cDNA library prepared from venom glands of the Brazilian scorpion Opisthacanthus cayaporum (Ischnuridae). Two main types of components were identified: peptides with toxin-like sequences and proteins involved in cellular processes. Using the expressed sequence tag (EST) strategy, 118 clones were identified, of which 61 code for unique sequences (17 contigs and 44 singlets) with an average length of 531 base pairs (bp). These results were compared with those previously obtained by proteomic analysis of the same venom, showing a considerable degree of similarity in terms of the molecular masses expected and DNA sequences found. About 36% of the ESTs correspond to toxin-like peptides and proteins with identifiable open reading frames (ORFs). The cDNA sequencing results also show the presence of sequences whose putative products correspond to a scorpine-like component; three short antimicrobial peptides; three K(+)-channel blockers; and an additional peptide containing 78 amino acid residues, whose sequence resembles peptide La1 from another Ischnuridae scorpion, Liocheles australiasiae, thus far with unknown function.

  14. Characterization of the cDNA and genomic sequence of a G protein γ subunit (γ5)

    SciTech Connect

    Fisher, K.J.; Aronson, N.N. Jr.

    1992-04-01

    A cDNA from human placenta and liver tissues that contained both sequence for the lysosomal glycosidase di-N-acetylchitobiase and sequence homologous to the γ subunit of GTP-binding proteins was previously isolated. Here, we have shown that the γ-subunit-homologous portion of this unusual cDNA is derived from a member of the γ-subunit multigene family. The partial human γ-subunit sequence was used to isolate the corresponding full-length cDNA clones from bovine and rat livers. The two cDNAs encode identical 68-amino-acid proteins (7.3 kDa) homologous to previously cloned G protein γ subunits. The bovine gene sequence encoding this new γ-subunit isoform (γ5) was determined and found to have an intron-exon structure consistent with the original human chitobiase-γ5-subunit hybrid mRNA being a product of alternative splicing. Genomic cloning also resulted in the isolation of a human γ5 pseudogene. 25 refs., 7 figs.

  15. Sequence analysis of expressed sequence tags from an ABA-treated cDNA library identifies stress response genes in the moss Physcomitrella patens.

    PubMed

    Machuka, J; Bashiardes, S; Ruben, E; Spooner, K; Cuming, A; Knight, C; Cove, D

    1999-04-01

    Partial cDNA sequencing was used to obtain 169 expressed sequence tags (ESTs) in the moss Physcomitrella patens. The source of ESTs was a random cDNA library constructed from 7-day-old protonemata following treatment with 10(-4) M abscisic acid (ABA). Analysis of the ESTs identified 69% with homology to known sequences, 61% of which had significant homology to sequences of plant origin. More importantly, at least 11 ESTs had significant similarities to genes which are implicated in plant stress responses, including responses which may involve ABA. These included a cDNA associated with desiccation tolerance, two heat shock protein genes, one cold acclimation protein cDNA and five others that may be involved in either oxidative or chemical stress or both, i.e., Zn/Cu-superoxide dismutase, NADPH protochlorophyllide oxidoreductase (PorB), selenium binding protein, glutathione peroxidase and glutathione S transferase. Analysis of codon usage between P. patens and seed plants indicated that although mosses and higher plants are to a large extent similar, minor variations also exist that may represent the distinctiveness of each group.

  16. Nucleotide sequences and phylogeny of the nucleocapsid gene of Oropouche virus.

    PubMed

    Saeed, M F; Wang, H; Nunes, M; Vasconcelos, P F; Weaver, S C; Shope, R E; Watts, D M; Tesh, R B; Barrett, A D

    2000-03-01

    The nucleotide sequence of the S RNA segment of the Oropouche (ORO) virus prototype strain TRVL 9760 was determined and found to be 754 nucleotides in length. In the virion-complementary orientation, the RNA contained two overlapping open reading frames of 693 and 273 nucleotides that were predicted to encode proteins of 231 and 91 amino acids, respectively. Subsequently, the nucleotide sequences of the nucleocapsid genes of 27 additional ORO virus strains, representing a 42 year interval and a wide geographical range in South America, were determined. Phylogenetic analyses revealed that all the ORO virus strains formed a monophyletic group that comprised three distinct lineages. Lineage I contained the prototype strain from Trinidad and most of the Brazilian strains, lineage II contained six Peruvian strains isolated between 1992 and 1998, and two strains from western Brazil isolated in 1991, while lineage III comprised four strains isolated in Panama during 1989.

  17. Mayaro virus: complete nucleotide sequence and phylogenetic relationships with other alphaviruses.

    PubMed

    Lavergne, Anne; de Thoisy, Benoît; Lacoste, Vincent; Pascalis, Hervé; Pouliquen, Jean-François; Mercier, Véronique; Tolou, Hugues; Dussart, Philippe; Morvan, Jacques; Talarmin, Antoine; Kazanji, Mirdad

    2006-05-01

    Mayaro (MAY) virus is a member of the genus Alphavirus in the family Togaviridae. Alphaviruses are distributed throughout the world and cause a wide range of diseases in humans and animals. Here, we determined the complete nucleotide sequence of MAY from a viral strain isolated from a French Guianese patient. The deduced MAY genome was 11,429 nucleotides in length, excluding the 5' cap nucleotide and 3' poly(A) tail. Nucleotide and amino acid homologies, as well as phylogenetic analyses of the obtained sequence, confirmed that MAY is not a recombinant virus and belongs to the Semliki Forest complex according to the antigenic complex classification. Furthermore, analyses based on the E1 region revealed that MAY is closely related to Una virus, the only other South American virus clustering with the Old World viruses. On the basis of our results and of alphavirus diversity and pathogenicity, we suggest that alphaviruses may have an Old World origin.

  18. Evolutionarily conserved sequences of striated muscle myosin heavy chain isoforms. Epitope mapping by cDNA expression.

    PubMed

    Miller, J B; Teal, S B; Stockdale, F E

    1989-08-05

    A cDNA expression strategy was used to localize amino acid sequences which were specific for fast, as opposed to slow, isoforms of the chicken skeletal muscle myosin heavy chain (MHC) and which were conserved in vertebrate evolution. Five monoclonal antibodies (mAbs), termed F18, F27, F30, F47, and F59, were prepared that reacted with all of the known chicken fast MHC isoforms but did not react with any of the known chicken slow or smooth muscle MHC isoforms. The epitopes recognized by mAbs F18, F30, F47, and F59 were on the globular head fragment of the MHC, whereas the epitope recognized by mAb F27 was on the helical tail or rod fragment. Reactivity of all five mAbs also was confined to fast MHCs in the rat, with the exception of mAb F59, which also reacted with the beta-cardiac MHC, the single slow MHC isoform common to both the rat heart and skeletal muscle. None of the five epitopes was expressed on amphioxus, nematode, or Dictyostelium MHC. The F27 and F59 epitopes were found on shark, electric ray, goldfish, newt, frog, turtle, chicken, quail, rabbit, and rat MHCs. The epitopes recognized by these mAbs were conserved, therefore, to varying degrees through vertebrate evolution and differed in sequence from homologous regions of a number of invertebrate MHCs and myosin-like proteins. The sequences of the epitopes on the head were mapped using a two-part cDNA expression strategy. First, Bal31 exonuclease digestion was used to rapidly generate fragments of a chicken embryonic fast MHC cDNA that were progressively deleted from the 3' end. These cDNA fragments were expressed as beta-galactosidase/MHC fusion proteins using the pUR290 vector; the fusion proteins were tested by immunoblotting for reactivity with the mAbs; and the approximate locations of the epitopes were determined from the sizes of the cDNA fragments that encoded a particular epitope. The epitopes were then precisely mapped by expression of overlapping cDNA fragments of known sequence that

  19. TranslatorX: multiple alignment of nucleotide sequences guided by amino acid translations.

    PubMed

    Abascal, Federico; Zardoya, Rafael; Telford, Maximilian J

    2010-07-01

    We present TranslatorX, a web server designed to align protein-coding nucleotide sequences based on their corresponding amino acid translations. Many comparisons between biological sequences (nucleic acids and proteins) involve the construction of multiple alignments. Alignments represent a statement regarding the homology between individual nucleotides or amino acids within homologous genes. As protein-coding DNA sequences evolve as triplets of nucleotides (codons) and it is known that sequence similarity degrades more rapidly at the DNA than at the amino acid level, alignments are generally more accurate when based on amino acids than on their corresponding nucleotides. TranslatorX novelties include: (i) use of all documented genetic codes and the possibility of assigning different genetic codes for each sequence; (ii) a battery of different multiple alignment programs; (iii) translation of ambiguous codons when possible; (iv) an innovative criterion to clean nucleotide alignments with GBlocks based on protein information; and (v) a rich output, including Jalview-powered graphical visualization of the alignments, codon-based alignments coloured according to the corresponding amino acids, measures of compositional bias and first, second and third codon position specific alignments. The TranslatorX server is freely available at http://translatorx.co.uk.
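
    The core idea of aligning nucleotides through their translations can be sketched in a few lines of Python. This is a minimal illustration, not TranslatorX's own code: the function name backtranslate_alignment is hypothetical, and the aligned protein is assumed to come from any protein aligner, with "-" as the gap character.

```python
def backtranslate_alignment(aligned_protein: str, cds: str, gap: str = "-") -> str:
    """Thread an unaligned coding sequence onto its aligned protein translation.

    Each residue in the protein alignment is replaced by the next codon of the
    coding sequence; each protein gap becomes a three-nucleotide gap.
    """
    # Split the coding sequence into complete codons (a trailing partial codon is ignored).
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    residues = sum(1 for aa in aligned_protein if aa != gap)
    if residues > len(codons):
        raise ValueError("protein has more residues than the CDS has codons")

    codon_iter = iter(codons)
    out = []
    for aa in aligned_protein:
        out.append(gap * 3 if aa == gap else next(codon_iter))
    return "".join(out)


if __name__ == "__main__":
    # Protein "MK" aligned as "M-K" pulls the codons ATG and AAA apart by one codon-sized gap.
    print(backtranslate_alignment("M-K", "ATGAAA"))
```

    Because gaps are always inserted in multiples of three, the reading frame of every sequence is preserved in the resulting nucleotide alignment.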

  20. Molecular cloning and chromosomal localization of a novel human tracheo-bronchial mucin cDNA containing tandemly repeated sequences of 48 base pairs.

    PubMed

    Porchet, N; Nguyen, V C; Dufosse, J; Audie, J P; Guyonnet-Duperat, V; Gross, M S; Denis, C; Degand, P; Bernheim, A; Aubert, J P

    1991-03-15

    A lambda gt11 cDNA library constructed from human tracheo-bronchial mucosa was screened with a polyclonal antiserum raised to chemically deglycosylated pronase glycopeptides from human bronchial mucins. Out of 20 positive clones, one partial cDNA clone was isolated and used to map a novel human tracheo-bronchial mucin gene. It contains nearly perfectly identical 48-nucleotide tandem repeats, which encode a protein containing about 50% hydroxy amino acids. This clone hybridized to polydisperse messages produced by human tracheo-bronchial and human colonic mucosae. The gene (proposed name MUC 4) from which the cDNA is derived maps to chromosome 3.

  1. Nucleotide sequence and taxonomical distribution of the bacteriocin gene lin cloned from Brevibacterium linens M18.

    PubMed

    Valdes-Stauber, N; Scherer, S

    1996-04-01

    Linocin M18 is an antilisterial bacteriocin produced by the red smear cheese bacterium Brevibacterium linens M18. Oligonucleotide probes based on the N-terminal amino acid sequence were used to locate its single copy gene, lin, on the chromosomal DNA. The amino acid composition, N-terminal sequence, and molecular mass derived from the nucleotide sequence of an open reading frame of 798 nucleotides coding for 266 amino acids found on a 3-kb BamHI restriction fragment correspond closely to those obtained from the purified protein (N. Valdés-Stauber and S. Scherer, Appl. Environ. Microbiol. 60:3809-3814, 1994). No sequence homology to any protein or nucleotide sequences deposited in databases was found. Comparison of the nucleotide sequence and the N-terminal amino acid sequence derived from the protein suggests that B. linens M18 produces an N-formyl-methionyl-CAC tRNA. A wide taxonomical distribution of the gene within coryneform bacteria has been demonstrated by PCR amplification. The structural gene from linocin M18 is present at least in three Brevibacterium species, five Arthrobacter species, and five Corynebacterium species.

  2. Clusters of nucleotide substitutions and insertion/deletion mutations are associated with repeat sequences.

    PubMed

    McDonald, Michael J; Wang, Wei-Chi; Huang, Hsien-Da; Leu, Jun-Yi

    2011-06-01

    The genome-sequencing gold rush has facilitated the use of comparative genomics to uncover patterns of genome evolution, although their causal mechanisms remain elusive. One such trend, ubiquitous to prokarya and eukarya, is the association of insertion/deletion mutations (indels) with increases in the nucleotide substitution rate extending over hundreds of base pairs. The prevailing hypothesis is that indels are themselves mutagenic agents. Here, we employ population genomics data from Escherichia coli, Saccharomyces paradoxus, and Drosophila to provide evidence suggesting that it is not the indels per se but the sequence in which indels occur that causes the accumulation of nucleotide substitutions. We found that about two-thirds of indels are closely associated with repeat sequences and that repeat sequence abundance could be used to identify regions of elevated sequence diversity, independently of indels. Moreover, the mutational signature of indel-proximal nucleotide substitutions matches that of error-prone DNA polymerases. We propose that repeat sequences promote an increased probability of replication fork arrest, causing the persistent recruitment of error-prone DNA polymerases to specific sequence regions over evolutionary time scales. Experimental measures of the mutation rates of engineered DNA sequences and analyses of experimentally obtained collections of spontaneous mutations provide molecular evidence supporting our hypothesis. This study uncovers a new role for repeat sequences in genome evolution and provides an explanation of how fine-scale sequence contextual effects influence mutation rates and thereby evolution.

  3. Identification and sequence of a cDNA clone corresponding to a gene involved in development of Undaria pinnatifida

    NASA Astrophysics Data System (ADS)

    Hou, He-Shen; Li, Ning; Wu, Chao-Yuan

    1998-03-01

    During the induction of gamete-producing gametangia, induced gametophytes were collected at 4-day intervals (0, 4, 8, 12 d) and total RNAs were isolated by CsCl gradient ultracentrifugation. Some stage-specific expressed mRNAs were identified by differential display of mRNAs from different developmental stages of the gametophytes. The cDNA of one specific mRNA was verified, cloned and sequenced. This gene was specifically expressed during 4 days of induction, and shared partial sequence homology with the tobacco IAA-binding protein gene. This suggests that the cDNA may represent a gene related to the IAA regulating function during the development of the gametophytes.

  4. Complete nucleotide sequence of Alfalfa mosaic virus isolated from alfalfa (Medicago sativa L.) in Argentina.

    PubMed

    Trucco, Verónica; de Breuil, Soledad; Bejerman, Nicolás; Lenardon, Sergio; Giolitti, Fabián

    2014-06-01

    The complete nucleotide sequence of an Alfalfa mosaic virus (AMV) isolate infecting alfalfa (Medicago sativa L.) in Argentina, AMV-Arg, was determined. The virus genome has the typical organization described for AMV, and comprises 3,643, 2,593, and 2,038 nucleotides for RNA1, 2 and 3, respectively. The whole genome sequence and each encoding region were compared with those of four other isolates that have been completely sequenced, from China, Italy, Spain and USA. The nucleotide identity percentages ranged from 95.9 to 99.1 % for the three RNAs and from 93.7 to 99 % for the protein 1 (P1), protein 2 (P2), movement protein and coat protein (CP) encoding regions, whereas the amino acid identity percentages of these proteins ranged from 93.4 to 99.5 %, the lowest value corresponding to P2. CP sequences of AMV-Arg were compared with those of 25 other available isolates, and a phylogenetic analysis based on the CP gene was carried out. The highest percentage of nucleotide sequence identity of the CP gene was 98.3 % with a Chinese isolate and 98.6 % at the amino acid level with four isolates, two from Italy, one from Brazil and the remaining one from China. The phylogenetic analysis showed that AMV-Arg is closely related to subgroup I of AMV isolates. To our knowledge, this is the first report of a complete nucleotide sequence of AMV from South America and the first worldwide report of a complete nucleotide sequence of AMV isolated from alfalfa as natural host.
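
    Identity percentages of this kind are computed from pairwise alignments. The sketch below shows one common convention, assumed here and not taken from the paper: columns in which either sequence has a gap are skipped. The function name percent_identity is hypothetical.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identity between two aligned sequences of equal length.

    Assumption: columns where either sequence has a gap are excluded from
    both the numerator and the denominator. Other conventions exist.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Four of the five gap-free columns match.
    print(percent_identity("ACGT-A", "ACCTGA"))
```

    The same function works for aligned amino acid sequences, since it only compares characters column by column.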

  5. Acetylcholinesterase of the Sand Fly, Phlebotomus papatasi (Scopoli): cDNA Sequence, Baculovirus Expression, and Biochemical Properties

    DTIC Science & Technology

    2013-01-01

    and domestic animals around the world are affected by leishmaniasis, a disease caused by various species of flagellated protozoans in the genus ...identity to acetylcholinesterase mRNA sequences of Culex tritaeniorhynchus and Lutzomyia longipalpis, respectively. The P. papatasi cDNA ORF encoded a...papatasi AChE1 will facilitate rapid in vitro screening to identify novel PpAChE inhibitors, and comparative studies on biochemical kinetics of

  6. Cloning and nucleotide sequence of the Lactobacillus casei lactate dehydrogenase gene.

    PubMed Central

    Kim, S F; Baek, S J; Pack, M Y

    1991-01-01

    An allosteric L-(+)-lactate dehydrogenase gene of Lactobacillus casei ATCC 393 was cloned in Escherichia coli, and the nucleotide sequence of the gene was determined. The gene was composed of an open reading frame of 981 bp, starting with a GTG codon and ending with a TAA codon. The sequences for the promoter and ribosome binding site were identified, and a sequence for a structure resembling a rho-independent transcription terminator was also found. Images PMID:1768113
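
    An open reading frame that begins with a GTG codon, as reported here, is missed by scanners that accept only ATG. The following is a minimal forward-strand sketch under stated assumptions: the names find_orfs and STOPS are hypothetical, both ATG and GTG are accepted as starts, and only the three standard stop codons are used.

```python
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, starts=("ATG", "GTG"), min_codons=2):
    """Return (start, end, frame) tuples for ORFs on the forward strand.

    Coordinates are 0-based; end is exclusive and includes the stop codon.
    In each frame the first start codon opens an ORF and the next in-frame
    stop codon closes it.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in starts:
                start = i
            elif start is not None and codon in STOPS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


if __name__ == "__main__":
    # GTG AAA CCC TAA sits in frame 2 of this toy sequence.
    print(find_orfs("CCGTGAAACCCTAAGG"))
```

    A real analysis would also scan the reverse complement and apply a much larger min_codons threshold to suppress short spurious frames.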

  7. Nucleotide sequence of an Escherichia coli chromosomal hemolysin.

    PubMed Central

    Felmlee, T; Pellett, S; Welch, R A

    1985-01-01

    We determined the DNA sequence of an 8,211-base-pair region encompassing the chromosomal hemolysin, molecularly cloned from an O4 serotype strain of Escherichia coli. All four hemolysin cistrons (transcriptional order, C, A, B, and D) were encoded on the same DNA strand, and their predicted molecular masses were, respectively, 19.7, 109.8, 79.9, and 54.6 kilodaltons. The identification of pSF4000-encoded polypeptides in E. coli minicells corroborated the assignment of the predicted polypeptides for hlyC, hlyA, and hlyD. However, based on the minicell results, two polypeptides appeared to be encoded on the hlyB region, one similar in size to the predicted molecular mass of 79.9 kilodaltons, and the other a smaller 46-kilodalton polypeptide. The four hemolysin genes displayed similar codon usage, which is atypical for E. coli. This reflects the low guanine-plus-cytosine content (40.2%) of the hemolysin DNA sequence and suggests the non-E. coli origin of the hemolysin determinant. In vitro-derived deletions of the hemolysin recombinant plasmid pSF4000 indicated that a region between 433 and 301 base pairs upstream of the putative start of hlyC is necessary for hemolysin synthesis. Based on the DNA sequence, a stem-loop transcription terminator-like structure (a 16-base-pair stem followed by seven uridylates) in the mRNA was predicted distal to the C-terminal end of hlyA. A model for the general transcriptional organization of the E. coli hemolysin determinant is presented. Images PMID:3891743
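
    Guanine-plus-cytosine content and codon usage, the two statistics used above to argue for a non-E. coli origin, are simple to compute. The sketch below is illustrative only; the function names gc_content and codon_usage are hypothetical, and the toy sequences are not from the paper.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (seq.count("G") + seq.count("C")) / acgt


def codon_usage(cds: str) -> Counter:
    """Count codons in a coding sequence read in frame from its first base."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop a trailing partial codon, if any
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


if __name__ == "__main__":
    print(gc_content("GGCCAATTAT"))
    print(codon_usage("ATGAAAAAATAA"))
```

    Comparing the codon counts of a gene with those of typical host genes is one way to flag sequence that may have been acquired from another organism.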

  8. Nucleotide Sequence of the Protective Antigen Gene of Bacillus Anthracis

    DTIC Science & Technology

    1988-02-02

    which appear to encode a signal peptide having characteristics in common with those of other secreted proteins. A consensus TATAAT sequence was located ... was located seven bp upstream of the ATG initiation codon. The codon usage for...TATAAT sequence was located at the putative -10 promoter site. A Shine-Dalgarno site similar to that found in genes of other Bacillus sp. was located seven

  9. Expression of a cDNA sequence encoding human purine nucleoside phosphorylase in rodent and human cells.

    PubMed Central

    McIvor, R S; Goddard, J M; Simonsen, C C; Martin, D W

    1985-01-01

    A cDNA sequence which contains the entire coding region for human purine nucleoside phosphorylase (PNP) was recombined for selection and expression in mammalian cells. Plasmids containing either the simian virus 40 early promoter or the mouse metallothionein promoter positioned just upstream of the PNP coding sequence were constructed. These plasmids also contained the gene for a methotrexate-resistant dihydrofolate reductase, allowing for selection and amplification of positive transferrents after transfection of cells by the DNA-calcium phosphate coprecipitation technique. Expression of human PNP activity was readily detected in both mouse (L) and CHO cells by isoelectric focusing of cell extracts followed by histochemical staining for PNP activity. The simian virus 40 early promoter directed considerable expression of human PNP activity in CHO cells but only scant activity in mouse cells. The mouse metallothionein promoter was not successful in effecting human PNP expression in CHO cells but provided substantial human PNP activity in mouse cells and was inducible by incubation with zinc. HeLa cell transferrents were isolated and screened for the presence of transferred PNP cDNA sequences by Southern hybridization analysis. RNA transcripts derived from the transferred PNP cDNA were identified in one of these cell lines. Images PMID:3929070

  10. Construction of cDNA library and preliminary analysis of expressed sequence tags from green microalga Ankistrodesmus convolutus Corda.

    PubMed

    Thanh, Tran; Chi, Vu Thi Quynh; Abdullah, Mohd Puad; Omar, Hishamuddin; Noroozi, Mostafa; Ky, Huynh; Napis, Suhaimi

    2011-01-01

    Green microalga Ankistrodesmus convolutus Corda is a fast-growing alga which produces appreciable amounts of carotenoids and polyunsaturated fatty acids. To our knowledge, this is the first report on the construction of a cDNA library and preliminary analysis of ESTs for this species. The titers of the primary and amplified cDNA libraries were 1.1×10(6) and 6.0×10(9) pfu/ml, respectively. The percentage of recombinants was 97% in the primary library, and a total of 337 out of 415 original cDNA clones selected randomly contained inserts ranging from 600 to 1,500 bp. A total of 201 individual ESTs with sizes ranging from 390 to 1,038 bp were then analyzed, and the BLASTX scores revealed that 35.8% of the sequences were classified as strong matches, 38.3% as nominal matches and 25.9% as weak matches. Among the ESTs with known putative function, 21.4% were found to be related to gene expression, 14.4% to photosynthesis, 10.9% to metabolism, 5.5% to miscellaneous functions, and 2.0% to stress response, and the remaining 45.8% were classified as novel genes. Analysis of ESTs as described in this paper can be an effective approach to isolate and characterize new genes from A. convolutus, and the sequences obtained represent a significant contribution to the extensive database of sequences from green microalgae.

  11. Complete nucleotide sequence and transcriptional analysis of snakehead fish retrovirus.

    PubMed

    Hart, D; Frerichs, G N; Rambaut, A; Onions, D E

    1996-06-01

    The complete genome of the snakehead fish retrovirus has been cloned and sequenced, and its transcriptional profile in cell culture has been determined. The 11.2-kb provirus displays a complex expression pattern capable of encoding accessory proteins and is unique in the predicted location of the env initiation codon and signal peptide upstream of gag and the common splice donor site. The virus is distinguishable from all known retrovirus groups by the presence of an arginine tRNA primer binding site. The coding regions are highly divergent and show a number of unusual characteristics, including a large Gag coiled-coil region, a Pol domain of unknown function, and a long, lentiviral-like, Env cytoplasmic domain. Phylogenetic analysis of the Pol sequence emphasizes the divergent nature of the virus from the avian and mammalian retroviruses. The snakehead virus is also distinct from a previously characterized complex fish retrovirus, suggesting that discrete groups of these viruses have yet to be identified in the lower vertebrates.

  12. Nucleotide sequence of the capsid protein gene of papaya leaf-distortion mosaic potyvirus.

    PubMed

    Maoka, T; Kashiwazaki, S; Tsuda, S; Usugi, T; Hibino, H

    1996-01-01

    The DNA complementary to the 3'-terminal 1 404 nucleotides [excluding the poly(A) tail] of papaya leaf-distortion mosaic potyvirus (PLDMV) RNA was cloned and sequenced. The sequence starts within a long open reading frame (ORF) of 1 195 nucleotides and is followed by a 3' non-coding region of 209 nucleotides. Capsid protein (CP) is encoded at the 3' terminus of the ORF. The CP contains 293 residues and has a Mr of 33 277. The CP of PLDMV exhibits 49 to 59% sequence similarity at the amino acid level to the CPs of papaya ringspot potyvirus (PRSV) and other potyviruses. This result is consistent with the absence of a serological relationship between PLDMV and PRSV or other potyviruses. The results support the assignment of PLDMV as a distinct member of the genus Potyvirus.

  13. Statistical analysis of nucleotide runs in coding and noncoding DNA sequences.

    PubMed

    Sprizhitsky YuA; Nechipurenko YuD; Alexandrov, A A; Volkenstein, M V

    1988-10-01

    A statistical analysis of the occurrence of particular nucleotide runs in DNA sequences of different species has been carried out. There are considerable differences of run distributions in DNA sequences of procaryotes, invertebrates and vertebrates. There is an abundance of short runs (1-2 nucleotides long) in the coding sequences and there is a deficiency of such runs in the noncoding regions. However, some interesting exceptions from this rule exist for the run distribution of adenine in procaryotes and for the arrangement of purine-pyrimidine runs in eucaryotes. The similarity in the distributions of such runs in the coding and noncoding regions may be due to some structural features of the DNA molecule as a whole. Runs of guanine (or cytosine) of three to six nucleotides occur predominantly in noncoding DNA regions in eucaryotes, especially in vertebrates.
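
    The raw tally behind such an analysis, how many runs of each length occur for each nucleotide, can be sketched as follows. This only counts runs; it is not the authors' statistical procedure, and the names run_length_counts and to_purine_pyrimidine are hypothetical.

```python
from collections import Counter
from itertools import groupby

# Recode A/G as R (purine) and C/T as Y (pyrimidine).
_RY_TABLE = str.maketrans("AGCT", "RRYY")


def run_length_counts(seq: str) -> dict:
    """Map each symbol to a Counter of {run length: number of runs}."""
    counts = {}
    for symbol, group in groupby(seq.upper()):
        counts.setdefault(symbol, Counter())[len(list(group))] += 1
    return counts


def to_purine_pyrimidine(seq: str) -> str:
    """Recode a DNA sequence into the two-letter purine/pyrimidine alphabet."""
    return seq.upper().translate(_RY_TABLE)


if __name__ == "__main__":
    print(run_length_counts("AAGGGCA"))
    print(run_length_counts(to_purine_pyrimidine("AGCT")))
```

    Running the same tally separately on coding and noncoding segments, and comparing the counts with those expected for a random sequence of the same composition, would give the kind of comparison the abstract describes.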

  14. The full-length nucleotide sequences of the virulent Trinidad donkey strain of Venezuelan equine encephalitis virus and its attenuated vaccine derivative, strain TC-83.

    PubMed

    Kinney, R M; Johnson, B J; Welch, J B; Tsuchiya, K R; Trent, D W

    1989-05-01

    Nucleotide sequence analysis of cDNA clones covering the entire genomes of Trinidad donkey (TRD) Venezuelan equine encephalitis (VEE) virus and its vaccine derivative, TC-83, has revealed 11 differences between the genomes of TC-83 virus and its parent. One nucleotide substitution and a single nucleotide deletion occurred in the 5'- and 3'-noncoding regions of the TC-83 genome, respectively. The deduced amino acid sequences of the nonstructural polypeptides of the two viruses differed only in a conservative Ser(TRD) to Thr(TC-83) substitution in nonstructural protein (nsP) three at amino acid position 260. The two silent mutations (one each in E1 and E2), one amino acid substitution in the E1 glycoprotein, and five substitutions in the E2 envelope glycoprotein of TC-83 virus were reported previously (B.J.B. Johnson, R.M. Kinney, C.L. Kost, and D.W. Trent, 1986, J. Gen. Virol. 67, 1951-1960). The genome of TRD virus was 11,444 nucleotides long with a 5'-noncoding region of 44 nucleotides. The carboxyl terminal portion of VEE nsP3 contained two peptide segments (7 and 34 amino acids long) that were repeated with high fidelity. The open reading frame of the nonstructural polyprotein was interrupted by an in-frame opal termination codon between nsP3 and nsP4, as has been reported for Sindbis, Ross River, and Middelburg viruses. The deduced amino acid sequences of the VEE TRD nsP1, nsP2, nsP3, and nsP4 polypeptides showed 60-66%, 57-58%, 35-44%, and 73-71% identity with the aligned sequences of the cognate polypeptides of Sindbis and Semliki Forest viruses, respectively. The lack of homology in the nsP3 of the viruses is due to sequence variation in the carboxyl terminal half of this polypeptide.

  15. Intraspecific nucleotide sequence differences in the major noncoding region of human mitochondrial DNA.

    PubMed Central

    Horai, S; Hayasaka, K

    1990-01-01

    Nucleotide sequences of the major noncoding region of human mitochondrial DNA (mtDNA) from 95 human placentas have been determined. These sequences include at least a 482-bp-long region encompassing most of the D-loop-forming region. Comparisons of these sequences with those previously determined have revealed remarkable features of nucleotide substitutions and insertion/deletion events. The nucleotide diversity among the sequences is estimated as 1.45%, which is three- to fourfold higher than the corresponding value estimated from restriction-enzyme analysis of whole mtDNA genome. A hypervariable region has also been defined. In this 14-bp region, 17 different sequences were detected. More than 97% of the base changes are transitions. A significantly nonrandom distribution of nucleotide substitutions and sequence length variations were also noted. The phylogenetic analysis indicates that diversity among the negroids is much larger than that among the caucasoids or the mongoloids. In fact, part of the negroids first diverged from other humans in the phylogenetic tree. A striking finding in the phylogenetic analysis is that the mongoloids can be separated into two distinct groups. Divergence of part of the mongoloids follows the earliest divergence of part of the negroids. The remainder of the mongoloids subsequently diverged together with the caucasoids. This observation confirmed our earlier study, which clearly demonstrated, by the restriction-enzyme analysis, existence of two distinct groups in the Japanese. Images Figure 3 PMID:2316527
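
    Nucleotide diversity is commonly computed as the average number of differences per site over all pairs of aligned sequences, and base changes are classed as transitions (purine to purine or pyrimidine to pyrimidine) or transversions. The sketch below uses that simple unweighted definition, which may differ from the estimator used in the paper; all function names are hypothetical.

```python
from itertools import combinations

PURINES = {"A", "G"}


def pairwise_differences(a: str, b: str) -> int:
    """Number of mismatching positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def nucleotide_diversity(seqs) -> float:
    """Average per-site difference over all pairs of sequences (unweighted)."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    total = sum(pairwise_differences(a, b) for a, b in pairs)
    return total / (len(pairs) * len(seqs[0]))


def is_transition(x: str, y: str) -> bool:
    """True for A<->G or C<->T changes, False for transversions and identity."""
    return x != y and ((x in PURINES) == (y in PURINES))


if __name__ == "__main__":
    print(nucleotide_diversity(["ACGT", "ACGA", "GCGT"]))
    print(is_transition("A", "G"), is_transition("A", "T"))
```

    This sketch treats every character, including gaps, as a state, so insertion/deletion columns would need to be removed or handled separately before use.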

  16. Detecting selection in noncoding regions of nucleotide sequences.

    PubMed Central

    Wong, Wendy S W; Nielsen, Rasmus

    2004-01-01

    We present a maximum-likelihood method for examining the selection pressure and detecting positive selection in noncoding regions using multiple aligned DNA sequences. The rate of substitution in noncoding regions relative to the rate of synonymous substitution in coding regions is modeled by a parameter zeta. When a site in a noncoding region is evolving neutrally zeta = 1, while zeta > 1 indicates the action of positive selection, and zeta < 1 suggests negative selection. Using a combined model for the evolution of noncoding and coding regions, we develop two likelihood-ratio tests for the detection of selection in noncoding regions. Data analysis of both simulated and real viral data is presented. Using the new method we show that positive selection in viruses is acting primarily in protein-coding regions and is rare or absent in noncoding regions. PMID:15238543
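
    In symbols, the parameter and a generic likelihood-ratio statistic can be written as below. The rate symbols r_nc and r_s and the likelihood symbols L_0 and L_1 are notation assumed here for illustration; the abstract does not specify which nested models the two tests compare.

```latex
\zeta = \frac{r_{\mathrm{nc}}}{r_{\mathrm{s}}}, \qquad
\begin{cases}
\zeta = 1 & \text{neutral evolution} \\
\zeta > 1 & \text{positive selection} \\
\zeta < 1 & \text{negative selection}
\end{cases}
\qquad\qquad
\Lambda = 2\,\bigl(\ln L_{1} - \ln L_{0}\bigr)
```

    Here r_nc is the substitution rate at a noncoding site, r_s is the synonymous substitution rate in the coding region, and L_0 and L_1 are the maximized likelihoods of a null model and of a more general alternative model.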

  17. Assessing the utility of the Oxford Nanopore MinION for snake venom gland cDNA sequencing

    PubMed Central

    Hargreaves, Adam D.

    2015-01-01

    Portable DNA sequencers such as the Oxford Nanopore MinION device have the potential to be truly disruptive technologies, facilitating new approaches and analyses and, in some cases, taking sequencing out of the lab and into the field. However, the capabilities of these technologies are still being revealed. Here we show that single-molecule cDNA sequencing using the MinION accurately characterises venom toxin-encoding genes in the painted saw-scaled viper, Echis coloratus. We find the raw sequencing error rate to be around 12%, improved to 0–2% with hybrid error correction and 3% with de novo error correction. Our corrected data provides full coding sequences and 5′ and 3′ UTRs for 29 of 33 candidate venom toxins detected, far superior to Illumina data (13/40 complete) and Sanger-based ESTs (15/29). We suggest that, should the current pace of improvement continue, the MinION will become the default approach for cDNA sequencing in a variety of species. PMID:26623194

  18. An Integrated System for DNA Sequencing by Synthesis Using Novel Nucleotide Analogues

    PubMed Central

    Guo, Jia; Yu, Lin; Turro, Nicholas J.; Ju, Jingyue

    2010-01-01

    Conspectus The Human Genome Project has concluded, but its successful completion has increased, rather than decreased, the need for high-throughput DNA sequencing technologies. The possibility of clinically screening a full genome for an individual's mutations offers tremendous benefits, both for pursuing personalized medicine as well as uncovering the genomic contributions to diseases. The Sanger sequencing method—although enormously productive for more than 30 years—requires an electrophoretic separation step that, unfortunately, remains a key technical obstacle for achieving economically acceptable full-genome results. Alternative sequencing approaches thus focus on innovations that can reduce costs. The DNA sequencing by synthesis (SBS) approach has shown great promise as a new sequencing platform, with particular progress reported recently. The general fluorescent SBS approach involves (i) incorporation of nucleotide analogs bearing fluorescent reporters, (ii) identification of the incorporated nucleotide by its fluorescent emissions, and (iii) cleavage of the fluorophore, along with the reinitiation of the polymerase reaction for continuing sequence determination. In this Account, we review the construction of a DNA-immobilized chip and the development of novel nucleotide reporters for the SBS sequencing platform. Click chemistry, with its high selectivity and coupling efficiency, was explored for surface immobilization of DNA. The first generation (G-1) modified nucleotides for SBS feature a small chemical moiety capping the 3′-OH and a fluorophore tethered to the base through a chemically cleavable linker; the design ensures that the nucleotide reporters are good substrates for the polymerase. The 3′-capping moiety and the fluorophore on the DNA extension products, generated by the incorporation of the G-1 modified nucleotides, are cleaved simultaneously to reinitiate the polymerase reaction. The sequence of a DNA template immobilized on a surface

  19. Human parainfluenza type 3 virus hemagglutinin-neuraminidase glycoprotein: nucleotide sequence of mRNA and limited amino acid sequence of the purified protein.

    PubMed Central

    Elango, N; Coligan, J E; Jambou, R C; Venkatesan, S

    1986-01-01

    The nucleotide sequence of mRNA for the hemagglutinin-neuraminidase (HN) protein of human parainfluenza type 3 virus obtained from the corresponding cDNA clone had a single long open reading frame encoding a putative protein of 64,254 daltons consisting of 572 amino acids. The deduced protein sequence was confirmed by limited N-terminal amino acid microsequencing of CNBr cleavage fragments of native HN that was purified by immunoprecipitation. The HN protein is moderately hydrophobic and has four potential sites (Asn-X-Ser/Thr) of N-glycosylation in the C-terminal half of the molecule. It is devoid of both the N-terminal signal sequence and the C-terminal membrane anchorage domain characteristic of the hemagglutinin of influenza virus and the fusion (F0) protein of the paramyxoviruses. Instead, it has a single prominent hydrophobic region capable of membrane insertion beginning at 32 residues from the N terminus. This N-terminal membrane insertion is similar to that of influenza virus neuraminidase and the recently reported structures of HN proteins of Sendai virus and simian virus 5. Images PMID:3003381

  20. The nucleotide sequence of 5S ribosomal RNA from slime mold Physarum polycephalum.

    PubMed

    Komiya, H; Takemura, S

    1981-12-01

    The nucleotide sequence of 5S ribosomal RNA from plasmodia of the slime mold Physarum polycephalum was determined as pppGGAUGCGGC CAUACUAAGG 20 AGAAAGCACC 30 UCAUCCCGUC 40 CGAUCUGAGA 50 AGUUAAGCUC 60 CUUCAGGCGU 70 GGUUAGUACU 80 GGGGUGGGGG 90 ACCACCUGGG 100 AAUCCCACGU 110 GCUGCAUUCU 120 Uoh by chemical and enzymatic gel sequencing techniques using 3' and 5' end-labeled RNA. This RNA is very different from the 5S rRNA of the cellular slime mold Dictyostelium discoideum (36 nucleotides are different), and shows greater similarity to 5S rRNAs from Protozoa and Metazoa than to those from fungi.
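
    The printed sequence interleaves position markers (20, 30, ...) and end-group labels (ppp, oh) with the bases. A minimal sketch, assuming the uppercase letters A, C, G and U are the only bases and everything else is annotation, recovers the bare sequence so its length can be compared with the final marker. The function name clean_rna is illustrative, not from the record.

```python
import re

# Sequence exactly as printed in the abstract, including the interleaved
# position markers and the 5' triphosphate / 3' hydroxyl labels.
printed = (
    "pppGGAUGCGGC CAUACUAAGG 20 AGAAAGCACC 30 UCAUCCCGUC 40 "
    "CGAUCUGAGA 50 AGUUAAGCUC 60 CUUCAGGCGU 70 GGUUAGUACU 80 "
    "GGGGUGGGGG 90 ACCACCUGGG 100 AAUCCCACGU 110 GCUGCAUUCU 120 Uoh"
)


def clean_rna(text):
    """Keep only uppercase A, C, G, U; drop digits, spaces and end labels.

    Assumption: lowercase 'ppp' and 'oh' are end-group annotations,
    so they are deliberately not matched.
    """
    return "".join(re.findall(r"[ACGU]", text))


rrna = clean_rna(printed)
length = len(rrna)
gc_fraction = (rrna.count("G") + rrna.count("C")) / length

print(length, round(gc_fraction, 3))
```

    The recovered length can be checked against the last position marker in the printed sequence (120).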

  1. Long-range macromolecule interaction and “speed reading” long nucleotide sequences in DNA

    NASA Astrophysics Data System (ADS)

    Namiot, V. A.; Anashkina, A. A.; Filatov, I. V.; Tumanyan, V. G.; Esipova, N. G.

    2013-01-01

    Methods based on the phenomenon of specific long-range interaction between long macromolecules are proposed for “speed reading” nucleotide sequences in single DNA molecules. One way is to measure the electric field potential along a previously stretched double DNA strand. Another way of “reading” the information is to measure the deformation of strand elements caused by an electric field generated by a “straightening” electrode with an alternating voltage applied to it. On the basis of the information obtained, the sequence of nucleotides in the strand could in principle be determined.

  2. [Cloning and sequencing of KIR2DL1 framework gene cDNA and identification of a novel allele].

    PubMed

    Sun, Ge; Wang, Chang; Zhen, Jianxin; Zhang, Guobin; Xu, Yunping; Deng, Zhihui

    2016-10-01

    To develop an assay for cDNA cloning and haplotype sequencing of the KIR2DL1 framework gene and determine the genotype of an ethnic Han individual from southern China. Total RNA was isolated from a peripheral blood sample, and complementary DNA (cDNA) was synthesized by RT-PCR. The entire coding sequence of the KIR2DL1 framework gene was amplified with a pair of KIR2DL1-specific PCR primers. The PCR products, approximately 1.2 kb in length, were then subjected to cloning and haplotype sequencing. A specific target fragment of the KIR2DL1 framework gene was obtained. Following allele separation, a wild-type KIR2DL1*00302 allele and a novel variant allele, KIR2DL1*031, were identified. Sequence alignment with KIR2DL1 alleles from the IPD-KIR Database showed that the novel allele KIR2DL1*031 differs from the closest allele, KIR2DL1*00302, by a non-synonymous mutation at CDS nt 188A>G (codon 42 GAG>GGG) in exon 4, which causes the amino acid change Glu42Gly. The sequence of the novel allele KIR2DL1*031 was submitted to GenBank under the accession number KP025960 and to the IPD-KIR Database under the submission number IWS40001982. The name KIR2DL1*031 has been officially assigned by the World Health Organization (WHO) Nomenclature Committee. An assay for cDNA cloning and haplotype sequencing of KIR2DL1 has been established, which has broad applications in KIR studies at the allelic level.
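
    The variant description (CDS nt 188A>G, codon 42 GAG>GGG, Glu42Gly) can be checked with a few lines of arithmetic. The sketch below is illustrative only. It assumes that nucleotide positions are counted from the first base of the coding sequence and that codon 42 is numbered in the mature protein after a 21-residue leader peptide. The abstract does not state the leader length, so LEADER_LENGTH is an assumption chosen because it reconciles the two numbers.

```python
# Only the two codons named in the abstract are needed here.
CODON_TO_AA = {"GAG": "Glu", "GGG": "Gly"}

# Assumption (not stated in the abstract): codon numbering starts at the
# mature protein, after a 21-residue leader peptide.
LEADER_LENGTH = 21


def locate_in_codon(cds_pos):
    """Map a 1-based CDS position to (codon number, position in codon), both 1-based."""
    return (cds_pos - 1) // 3 + 1, (cds_pos - 1) % 3 + 1


def substitute(codon, pos_in_codon, new_base):
    """Return the codon with the base at the 1-based position replaced."""
    return codon[: pos_in_codon - 1] + new_base + codon[pos_in_codon:]


codon_number, pos_in_codon = locate_in_codon(188)
mature_codon = codon_number - LEADER_LENGTH

ref_codon = "GAG"
alt_codon = substitute(ref_codon, pos_in_codon, "G")

print(codon_number, pos_in_codon, mature_codon)
print(ref_codon, CODON_TO_AA[ref_codon], "->", alt_codon, CODON_TO_AA[alt_codon])
```

    Under these assumptions, nt 188 falls at the second base of the codon, which is the position where GAG and GGG differ.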

  3. The complete nucleotide sequence and genomic characterization of tropical soda apple mosaic virus.

    PubMed

    Fillmer, Kornelia; Adkins, Scott; Pongam, Patchara; D'Elia, Tom

    2016-08-01

    We report the first complete genome sequence of tropical soda apple mosaic virus (TSAMV), a tobamovirus originally isolated from tropical soda apple (Solanum viarum) collected in Okeechobee, Florida. The complete genome of TSAMV is 6,350 nucleotides long and contains four open reading frames encoding the following proteins: i) 126-kDa methyltransferase/helicase (3354 nt), ii) 183-kDa polymerase (4839 nt), iii) movement protein (771 nt) and iv) coat protein (483 nt). The complete genome sequence of TSAMV shares 80.4 % nucleotide sequence identity with pepper mild mottle virus (PMMoV) and 71.2-74.2 % identity with other tobamoviruses naturally infecting members of the Solanaceae plant family. Phylogenetic analysis of the deduced amino acid sequences of the 126-kDa and 183-kDa proteins and the complete genome sequence place TSAMV in a subcluster with PMMoV within the Solanaceae-infecting subgroup of tobamoviruses.

  4. Cloning and nucleotide sequence of wild type and a mutant histidine decarboxylase from Lactobacillus 30a.

    PubMed

    Vanderslice, P; Copeland, W C; Robertus, J D

    1986-11-15

    Prohistidine decarboxylase from Lactobacillus 30a is a protein that autoactivates to histidine decarboxylase by cleaving its peptide chain between serines 81 and 82 and converting Ser-82 to a pyruvoyl moiety. The pyruvoyl group serves as the prosthetic group for the decarboxylation reaction. We have cloned and determined the nucleotide sequence of the gene for this enzyme from a wild type strain and from a mutant with altered autoactivation properties. The nucleotide sequence modifies the previously determined amino acid sequence of the protein. A tripeptide missed in the chemical sequence is inserted, and three other amino acids show conservative changes. The activation mutant shows a single change of Gly-58 to an Asp. Sequence analysis up- and downstream from the gene suggests that histidine decarboxylase is part of a polycistronic message, and that the transcriptional promoter region is strongly homologous to those of other Gram-positive organisms.

  5. Human glutamate pyruvate transaminase (GPT): Localization to 8q24.3, cDNA and genomic sequences, and polymorphic sites

    SciTech Connect

    Sohocki, M.M.; Sullivan, L.S.; Daiger, S.P.

    1997-03-01

    Two frequent protein variants of glutamate pyruvate transaminase (GPT) (EC 2.6.1.2) have been used as genetic markers in humans for more than two decades, although chromosomal mapping of the GPT locus in the 1980s produced conflicting results. To resolve this conflict and develop useful DNA markers for this gene, we isolated and characterized cDNA and genomic clones of GPT. We have definitively mapped human GPT to the terminus of 8q using several methods. First, two cosmids shown to contain the GPT sequence were derived from a chromosome 8-specific library. Second, by fluorescence in situ hybridization, we mapped the cosmid containing the human GPT gene to chromosome band 8q24.3. Third, we mapped the rat gpt cDNA to the syntenic region of rat chromosome 7. Finally, PCR primers specific to human GPT amplify sequences contained within a “half-YAC” from the long arm of chromosome 8, that is, a YAC containing the 8q telomere. The human GPT genomic sequence spans 2.7 kb and consists of 11 exons, ranging in size from 79 to 243 bp. The exonic sequence encodes a protein of 495 amino acids that is nearly identical to the previously reported protein sequence of human GPT-1. The two polymorphic GPT isozymes are the result of a nucleotide substitution in codon 14. In addition, a cosmid containing the GPT sequence also contains a previously unmapped, polymorphic microsatellite sequence, D8S421. The cloned GPT gene and associated polymorphisms will be useful for linkage and physical mapping of disease loci that map to the terminus of 8q, including atypical vitelliform macular dystrophy (VMD1) and epidermolysis bullosa simplex, type Ogna (EBS1). In addition, this will be a useful system for characterizing the telomeric region of 8q. Finally, determination of the molecular basis of the GPT isozyme variants will permit PCR-based detection of this world-wide polymorphism. 22 refs., 3 figs.

  6. Cloning and sequencing of a cDNA encoding a heat-stable sweet protein, mabinlin II.

    PubMed

    Nirasawa, S; Masuda, Y; Nakaya, K; Kurihara, Y

    1996-11-28

    A cDNA clone encoding a heat-stable sweet protein, mabinlin II (MAB), was isolated and sequenced. The encoded precursor to MAB was composed of 155 amino acid (aa) residues, including a signal sequence of 20 aa, an N-terminal extension peptide of 15 aa, a linker peptide of 14 aa and one residue of C-terminal extension. Comparison of the proteolytic cleavage sites used during post-translational processing of the MAB precursor with those of similar 2S seed-storage proteins of Arabidopsis thaliana, Brassica napus and Bertholletia excelsa shows that the three individual cleavage sites are conserved among these species.

  7. New Approaches to Attenuated Hepatitis A Vaccine Development: Cloning and Sequencing of Cell-Culture Adapted Viral cDNA

    DTIC Science & Technology

    1989-10-01

    mutations predicting 8 changes in the amino acid sequences of HAV proteins. Only one amino acid substitution occurred among the capsid proteins (VP2...frame, predicting 8 amino acid substitutions in the proteins of the p16 virus (Table 1). (i) 5’ nontranslated RNA. There were five mutations (six...not predict a change in the amino acid sequence of the capsid protein. This silent "mutation" has been found in all cell-culture adapted HM175 cDNA

  8. cDNA cloning and complete primary structure of skeletal muscle phosphorylase kinase (alpha subunit).

    PubMed Central

    Zander, N F; Meyer, H E; Hoffmann-Posorske, E; Crabb, J W; Heilmeyer, L M; Kilimann, M W

    1988-01-01

    We have isolated and sequenced a cDNA encoding the alpha subunit of phosphorylase kinase from rabbit fast-twitch skeletal muscle. The cDNA molecule consists of 388 nucleotides of 5'-nontranslated sequence, the complete coding sequence of 3711 nucleotides, and 342 nucleotides of 3'-nontranslated sequence followed by a poly(dA) tract. It encodes a polypeptide of 1237 amino acids and a deduced molecular mass of 138,422 Da. Nearly half of the deduced amino acid sequence is confirmed by peptide sequencing. Seven positions of endogenously phosphorylated serine residues and autophosphorylation sites, identified by peptide sequencing, could be assigned. They cluster in a segment of only 60 amino acids. RNA blot hybridization analysis demonstrates a predominant RNA species of approximately 4500 nucleotides and a less abundant RNA of 8700 nucleotides. Images PMID:3362857
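
    The figures quoted in this record can be cross-checked against one another. The following is a small consistency sketch using only numbers from the abstract. The variable names are illustrative, and the reading of the result (whether the 3711-nucleotide figure counts a stop codon) is an inference, not something the abstract states.

```python
# Figures taken from the abstract.
five_utr_nt = 388       # 5'-nontranslated sequence
coding_nt = 3711        # complete coding sequence
three_utr_nt = 342      # 3'-nontranslated sequence (before the poly(dA) tract)
residues = 1237         # amino acids in the encoded polypeptide
mass_da = 138422        # deduced molecular mass in daltons

# A coding sequence should be a whole number of codons.
codons, remainder = divmod(coding_nt, 3)

# Length of the cDNA excluding the poly(dA) tract, for comparison with the
# approximately 4500-nucleotide RNA seen on the blot.
cdna_without_tail = five_utr_nt + coding_nt + three_utr_nt

# Average mass per residue; values around 110 Da are typical for proteins.
mean_residue_mass = mass_da / residues

print(codons, remainder, cdna_without_tail, round(mean_residue_mass, 1))
```

    Because the codon count equals the residue count, the 3711-nucleotide figure appears to exclude the stop codon, and the summed cDNA length falls a little short of the predominant RNA species, as would be expected if the remainder is poly(A) tail.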

  9. Cell cycle regulated synthesis of stable mouse thymidine kinase mRNA is mediated by a sequence within the cDNA.

    PubMed Central

    Hofbauer, R; Müllner, E; Seiser, C; Wintersberger, E

    1987-01-01

    The cDNA for mouse thymidine kinase (TK) was isolated from a cDNA library in lambda-gt11 and sequenced. It was used as a probe to follow the time course of TK mRNA expression in growth-stimulated mouse fibroblasts. Linked to the HSV-TK promoter, the cDNA was able to transform LTK- cells to the TK+ phenotype. The transformed cells expressed the TK mRNA and enzyme activity in a growth-dependent fashion, suggesting that the regulatory element is localized on the cDNA. Images PMID:3822814

  10. Population genetics and phylogenetic analysis of the vrs1 nucleotide sequence in wild and cultivated barley.

    PubMed

    Ren, Xifeng; Wang, Yonggang; Yan, Songxian; Sun, Dongfa; Sun, Genlou

    2014-04-01

    Spike morphology is a key characteristic in the study of barley genetics, breeding, and domestication. Variation at the six-rowed spike 1 (vrs1) locus is sufficient to control the development and fertility of the lateral spikelet of barley. To study the genetic variation of vrs1 in wild barley (Hordeum vulgare subsp. spontaneum) and cultivated barley (Hordeum vulgare subsp. vulgare), nucleotide sequences of vrs1 were examined in 84 wild barleys (including 10 six-rowed) and 20 cultivated barleys (including 10 six-rowed) from four populations. The length of the vrs1 sequence amplified was 1536 bp. A total of 40 haplotypes were identified in the four populations. The highest nucleotide diversity, haplotype diversity, and per-site nucleotide diversity were observed in the Southwest Asian wild barley population. The nucleotide diversity, number of haplotypes, haplotype diversity, and per-site nucleotide diversity in two-rowed barley were higher than those in six-rowed barley. The phylogenetic analysis of the vrs1 sequences partially separated the six-rowed and the two-rowed barley. The six-rowed barleys were divided into four groups.
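
    The summary statistics named in this record have standard textbook definitions. The sketch below is a minimal illustration of them, not the authors' pipeline: haplotype diversity is taken as n/(n-1) times one minus the sum of squared haplotype frequencies, and per-site nucleotide diversity as the mean number of pairwise differences divided by alignment length. The toy alignment is invented for illustration and is not vrs1 data.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    freqs = [count / n for count in Counter(seqs).values()]
    return n / (n - 1) * (1 - sum(f * f for f in freqs))


def nucleotide_diversity(seqs):
    """Per-site pi: mean pairwise differences divided by alignment length.

    Assumes all sequences are aligned and of equal length, with no gaps.
    """
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total_diffs / (len(pairs) * length)


# Hypothetical toy alignment: four sequences, three distinct haplotypes.
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACCTACGA"]

n_haplotypes = len(set(toy))
hd = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)

print(n_haplotypes, round(hd, 4), round(pi, 4))
```

    Counting pairwise differences over all pairs is quadratic in the number of sequences, which is unproblematic for a sample of about a hundred accessions.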

  11. Nucleotide composition of CO1 sequences in Chelicerata (Arthropoda): detecting new mitogenomic rearrangements.

    PubMed

    Arabi, Juliette; Judson, Mark L I; Deharveng, Louis; Lourenço, Wilson R; Cruaud, Corinne; Hassanin, Alexandre

    2012-02-01

    Here we study the evolution of nucleotide composition in third codon-positions of CO1 sequences of Chelicerata, using a phylogenetic framework, based on 180 taxa and three markers (CO1, 18S, and 28S rRNA; 5,218 nt). The analyses of nucleotide composition were also extended to all CO1 sequences of Chelicerata found in GenBank (1,701 taxa). The results show that most species of Chelicerata have a positive strand bias in CO1, i.e., in favor of C nucleotides, including all Amblypygi, Palpigradi, Ricinulei, Solifugae, Uropygi, and Xiphosura. However, several taxa show a negative strand bias, i.e., in favor of G nucleotides: all Scorpiones, Opisthothelae spiders and several taxa within Acari, Opiliones, Pseudoscorpiones, and Pycnogonida. Several reversals of strand-specific bias can be attributed to either a rearrangement of the control region or an inversion of a fragment containing the CO1 gene. Key taxa for which sequencing of complete mitochondrial genomes will be necessary to determine the origin and nature of mtDNA rearrangements involved in the reversals are identified. Acari, Opiliones, Pseudoscorpiones, and Pycnogonida were found to show a strong variability in nucleotide composition. In addition, both mitochondrial and nuclear genomes have been affected by higher substitution rates in Acari and Pseudoscorpiones. The results therefore indicate that these two orders are more liable to fix mutations of all types, including base substitutions, indels, and genomic rearrangements.
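
    Strand bias at third codon positions is commonly summarized as a skew between C and G counts. The sketch below assumes the metric is (C - G) / (C + G), so that a positive value means an excess of C, matching the wording of the abstract. The abstract does not give the formula, so this sign convention, the function names and the toy sequence are all assumptions.

```python
def third_positions(cds):
    """Return the bases at codon position 3 of an in-frame coding sequence."""
    return cds[2::3]


def cg_skew(bases):
    """(C - G) / (C + G); positive means more C than G.

    Assumed convention, chosen so that 'positive bias' favors C as in the abstract.
    """
    c, g = bases.count("C"), bases.count("G")
    if c + g == 0:
        raise ValueError("no C or G bases to compute a skew from")
    return (c - g) / (c + g)


# Hypothetical in-frame toy sequence (six codons), not real CO1 data.
toy_cds = "ATGGCCTTCCGAACCTGA"

thirds = third_positions(toy_cds)
skew = cg_skew(thirds)

print(thirds, skew)
```

    Restricting the count to third positions, where most substitutions are synonymous, keeps the measure focused on mutational bias and less affected by selection on the protein.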

  12. Guinea pig alpha 1-microglobulin/bikunin: cDNA sequencing, tissue expression and expression during acute phase.

    PubMed

    Yoshida, K; Suzuki, Y; Yamamoto, K; Sinohara, H

    1999-02-01

    cDNA encoding alpha 1-microglobulin/bikunin (AMBP) was amplified from guinea pig (Cavia porcellus) liver mRNA by reverse transcription-polymerase chain reaction (RT-PCR) and rapid amplification of cDNA ends methods, cloned and sequenced. The deduced amino acid sequence was found to be homologous to the sequence of AMBP of other mammals (69-76% amino acid identity). It has two Kunitz-type trypsin inhibitor domains in the bikunin part as reactive sites, one in the N-terminal region and another in the C-terminal region. The N-terminal inhibitor domain sequence is well-conserved, but the P1 residue of the C-terminal inhibitor domain sequence was found to be Gln rather than Arg, a residue highly conserved in the AMBP of seven other mammals examined to date. By RT-PCR and nested PCR, AMBP mRNA was detected not only in liver tissue, previously known to be a site of its synthesis, but also in pancreas, stomach, small intestine, colon, lung, spleen, kidney, testis, skeletal muscle, and leukocytes, but not in brain or heart. We examined the AMBP mRNA levels in guinea pig liver by RT-PCR, comparing normal levels and those in a state of inflammation. The mRNA levels, however, did not significantly change.

  13. Isolation and characterization of cDNA clones for rat ribophorin I: complete coding sequence and in vitro synthesis and insertion of the encoded product into endoplasmic reticulum membranes

    PubMed Central

    1987-01-01

    Ribophorins I and II are two transmembrane glycoproteins that are characteristic of the rough endoplasmic reticulum and are thought to be part of the apparatus that affects the co-translational translocation of polypeptides synthesized on membrane-bound polysomes. A ribophorin I cDNA clone containing a 0.6-kb insert was isolated from a rat liver lambda gt11 cDNA library by immunoscreening with specific antibodies. This cDNA was used to isolate a clone (2.3 kb) from a rat brain lambda gt11 cDNA library that contains the entire ribophorin I coding sequence. SP6 RNA transcripts of the insert in this clone directed the in vitro synthesis of a polypeptide of the expected size that was immunoprecipitated with anti-ribophorin I antibodies. When synthesized in the presence of microsomes, this polypeptide, like the translation product of the natural ribophorin I mRNA, underwent membrane insertion, signal cleavage, and co-translational glycosylation. The complete amino acid sequence of the polypeptide encoded in the cDNA insert was derived from the nucleotide sequence and found to contain a segment that corresponds to a partial amino terminal sequence of ribophorin I that was obtained by Edman degradation. This confirmed the identity of the cDNA clone and established that ribophorin I contains 583 amino acids and is synthesized with a cleavable amino terminal insertion signal of 22 residues. Analysis of the amino acid sequence of ribophorin I suggested that the polypeptide has a simple transmembrane disposition with a rather hydrophilic carboxy terminal segment of 150 amino acids exposed on the cytoplasmic face of the membrane, and a luminal domain of 414 amino acids containing three potential N-glycosylation sites. Hybridization measurements using the cloned cDNA as a probe showed that ribophorin I mRNA levels increase fourfold 15 h after partial hepatectomy, in confirmation of measurements made by in vitro translation of liver mRNA. Southern blot analysis of rat genomic

  14. cDNA cloning and sequence of MAL, a hydrophobic protein associated with human T-cell differentiation.

    PubMed Central

    Alonso, M A; Weissman, S M

    1987-01-01

    We have isolated a human cDNA that is expressed in the intermediate and late stages of T-cell differentiation. The cDNA encodes a highly hydrophobic protein, termed MAL, that lacks a hydrophobic leader peptide sequence and contains four potential transmembrane domains separated by short hydrophilic segments. The predicted configuration of the MAL protein resembles the structure of integral proteins that form pores or channels in the plasma membrane and that are believed to act as transporters of water-soluble molecules and ions across the lipid bilayer. The presence of MAL mRNA in a panel of T-cell lines that express both the T-cell receptor and the T11 antigen suggests that MAL may be involved in membrane signaling in T cells activated via either T11 or T-cell receptor pathways. Images PMID:3494249

  15. Nucleotide sequence and genome organization of a new proposed crinivirus, tetterwort vein chlorosis virus.

    PubMed

    Zhao, Fumei; Yoo, Ran Hee; Lim, Seungmo; Igori, Davaajargal; Lee, Su-Heon; Moon, Jae Sun

    2015-11-01

    The genome of tetterwort vein chlorosis virus (TVCV) from South Korea has been completely sequenced. Its genomic organization resembles those of other criniviruses, with several new features, indicating that TVCV is a member of a new species in the genus Crinivirus, family Closteroviridae. RNA1 contains 8467 nucleotides, with at least four open reading frames (ORFs). ORF1a encodes a protein with predicted papain-like protease, methyltransferase, and helicase activities. ORF1b encodes a putative RNA-dependent RNA polymerase that is apparently expressed through a +1 ribosomal frameshift. RNA2 contains 8113 nucleotides encoding at least nine proteins, similar to most crinivirus RNA2s. The 3' untranslated regions of the bipartite RNA genome share 82.1% nucleotide sequence identity.

  16. Complete nucleotide sequence of the new potexvirus "Alstroemeria virus X". Brief report.

    PubMed

    Fuji, S; Shinoda, K; Ikeda, M; Furuya, H; Naito, H; Fukumoto, F

    2005-11-01

    A flexuous virus was isolated in Japan from an alstroemeria plant showing mosaic symptoms. The virus had a broad host range but had systemically latent infectivity in alstroemeria. The virus was assigned to the genus Potexvirus based on morphology and physical properties and on an analysis of the complete nucleotide sequence. The genomic RNA of the virus was 7,009 nucleotides in length, excluding the 3'-terminal poly(A) tail. It contained five open reading frames (ORFs), which is consistent with other members of the genus Potexvirus. Although the nucleotide sequences of the ORFs differ from those of previously reported potexviruses, a phylogenetic analysis placed the virus close to Narcissus mosaic virus and Scallion virus X. Therefore, we propose that this virus should be designated as Alstroemeria virus X (AlsVX).

  17. Transport properties of nucleotides in a graphene nanogap for DNA sequencing

    NASA Astrophysics Data System (ADS)

    Prasongkit, J.; Grigoriev, A.; Scheicher, R. H.; Ahuja, R.

    2011-03-01

    The application of graphene nanogaps for DNA sequencing has been proposed [H. W. Ch. Postma, Nano Lett. 10, 420 (2010)]. We used density functional theory and the non-equilibrium Green's function method to study the electron transport properties of nucleotides located inside a graphene nanogap. Our setup considered different positions and orientations of the bases with respect to the graphene electrodes, and we analyzed how the transmission spectra depend on such shifts and rotations. Even when taking into account current changes due to base fluctuations, we find that each nucleotide possesses a different characteristic current magnitude, owing to its distinctive electronic properties. Based on our results, it thus seems that the electrical readout from a graphene nanogap could in principle be sufficiently sensitive to distinguish between the four nucleotides, and thus achieve the goal of rapid and economical whole-genome sequencing. Swedish Research Council (VR, grant no. 621-2009-3628).

  18. Complete nucleotide sequence of a begomovirus and associated betasatellite infecting croton (Croton bonplandianus) in Pakistan.

    PubMed

    Hussain, Khadim; Hussain, Mazhar; Mansoor, Shahid; Briddon, Rob W

    2011-06-01

    The complete sequences of a begomovirus and an associated betasatellite isolated from Croton bonplandianus originating from Pakistan were determined. The sequence of the begomovirus showed the highest level of nucleotide sequence identity (88.9%) to an isolate of papaya leaf curl virus and thus represents a new species, for which we propose the name Croton yellow vein virus (CYVV). The sequence of the betasatellite showed the highest levels of sequence identity (82 to 98.4%) to six sequences in the databases that have yet to be reported, followed by isolates of tomato leaf curl Joydebpur betasatellite (48.7 to 52.5%). This indicates that the betasatellite identified here (and the six sequences in the databases) is an isolate of a newly identified species for which the name Croton yellow vein mosaic betasatellite (CroYVMB) is proposed. For the begomovirus, an analysis of the sequence indicates that it has a recombinant origin.

  19. Characterization of expressed sequence tags from a full-length enriched cDNA library of Cryptomeria japonica male strobili

    PubMed Central

    Futamura, Norihiro; Totoki, Yasushi; Toyoda, Atsushi; Igasaki, Tomohiro; Nanjo, Tokihiko; Seki, Motoaki; Sakaki, Yoshiyuki; Mari, Adriano; Shinozaki, Kazuo; Shinohara, Kenji

    2008-01-01

    Background: Cryptomeria japonica D. Don is one of the most commercially important conifers in Japan. However, the allergic disease caused by its pollen is a severe public health problem in Japan. Since large-scale analysis of expressed sequence tags (ESTs) in the male strobili of C. japonica should help us to clarify the overall expression of genes during the process of pollen development, we constructed a full-length enriched cDNA library that was derived from male strobili at various developmental stages. Results: We obtained 36,011 expressed sequence tags (ESTs) from either one or both ends of 19,437 clones derived from the cDNA library of C. japonica male strobili at various developmental stages. The 19,437 cDNA clones corresponded to 10,463 transcripts. Approximately 80% of the transcripts resembled ESTs from Pinus and Picea, while approximately 75% had homologs in Arabidopsis. An analysis of homologies between ESTs from C. japonica male strobili and known pollen allergens in the Allergome Database revealed that products of 180 transcripts exhibited significant homology. Approximately 2% of the transcripts appeared to encode transcription factors. We identified twelve genes for MADS-box proteins among these transcription factors. The twelve MADS-box genes were classified as DEF/GLO/GGM13-, AG-, AGL6-, TM3- and TM8-like MIKCC genes and type I MADS-box genes. Conclusion: Our full-length enriched cDNA library derived from C. japonica male strobili provides information on expression of genes during the development of male reproductive organs. We provided potential allergens in C. japonica. We also provided new information about transcription factors including MADS-box genes expressed in male strobili of C. japonica. Large-scale gene discovery using full-length cDNAs is a valuable tool for studies of gymnosperm species. PMID:18691438

  20. Complete nucleotide sequence of a novel strain of fig fleck-associated virus from China.

    PubMed

    He, Zhen; Mijit, Mahmut; Li, Shifang; Zhang, Zhixiang

    2017-04-01

    The complete nucleotide sequence of fig fleck-associated virus from Xinjiang Uygur Autonomous Region of China (FFkaV-CN) was determined. The 6,723-nucleotide-long viral genome, excluding a terminal poly(A) tail, contains three open reading frames (ORFs). Pairwise comparisons showed that FFkaV-CN shares 83% and 92% sequence identity with FFkaV-Italy based on the complete genomic sequence and CP aa sequence, respectively, slightly higher than the species demarcation criterion for the genus Maculavirus. Phylogenetic analysis showed that FFkaV-CN and FFkaV-Italy clustered into one group. These results indicate that FFkaV-CN is a novel strain of FFkaV with a genome organization somewhat different from what was reported for FFkaV-Italy.

  1. Nucleotide sequence determination of bacteriophage T4 glycine transfer ribonucleic acid

    PubMed Central

    Stahl, Stephen; Paddock, Gary V.; Abelson, John

    1974-01-01

    The nucleotide sequence of a T4 tRNA with an anticodon for glycine has been determined using 32P-labeled material from T4-infected cultures of Escherichia coli. The sequence is: pGCGGAUAUCGUAUAAUGmGDAUUACCUCAGACUUCCAAψCUGAUGAUGUGAGTψCGAUUCUCAUUAUCCGCUCCA-OH. The 74-nucleotide sequence can be arranged in the classic cloverleaf pattern for tRNAs. The anticodon of T4 tRNAGly is UCC with a possible modification of the U. The tRNA molecule would thus be expected to recognize the glycine codons GGG and GGA. Comparative analysis of tRNAsGly from T2 and T6 indicates that their sequences are identical with that from T4. Images PMID:10793690
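
    The step from the anticodon UCC to the codons GGG and GGA follows from antiparallel pairing plus wobble at the first anticodon position. The sketch below is a simplified illustration using the classic wobble rules for unmodified bases only. The abstract notes that the U may be modified, which could change the pairing, so the table here is an assumption and not a statement about this particular tRNA.

```python
# Watson-Crick partners in RNA.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Classic wobble rules (unmodified bases only): which codon third-position
# bases can be read by each base at the first (5') anticodon position.
WOBBLE = {"U": "AG", "G": "CU", "C": "G", "A": "U"}


def codons_read(anticodon):
    """Return the codons (5'->3') readable by an anticodon written 5'->3'.

    Pairing is antiparallel: anticodon position 3 pairs with codon position 1,
    position 2 with position 2, and position 1 (the wobble base) with
    codon position 3.
    """
    first = WATSON_CRICK[anticodon[2]]
    second = WATSON_CRICK[anticodon[1]]
    return sorted(first + second + third for third in WOBBLE[anticodon[0]])


glycine_codons = codons_read("UCC")
print(glycine_codons)
```

    With these rules, an anticodon starting with U reads codons ending in A or G, which is the reasoning behind the two glycine codons named in the abstract.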

  2. Dependence of the E. coli promoter strength and physical parameters upon the nucleotide sequence

    PubMed Central

    Berezhnoy, Andrey Y.; Shckorbatov, Yuriy G.

    2005-01-01

    The energy of interaction between complementary nucleotides in promoter sequences of E. coli was calculated and visualized, and a graphic method for presenting the energy properties of promoter sequences was developed. The data indicate that the energy distribution along the length of a promoter sequence yields a profile with minima at the −35, −8 and +7 regions, corresponding to areas with elevated AT (adenine-thymine) content. The most important difference from random sequences is in the −8 region. Four promoter groups and their energy properties were identified. The promoters with minimal and maximal energy of interaction between complementary nucleotides have low strengths; the strongest promoters correspond to promoter clusters characterized by intermediate energy values. PMID:16252339

  3. On the feasibility of using the intrinsic fluorescence of nucleotides for DNA sequencing.

    SciTech Connect

    Chowdhury, M. H.; Ray, K.; Johnson, R. L.; Gray, S. K.; Pond, J.; Lakowicz, J. R.; Univ. of Maryland; Univ. of Virginia; Lumerical Solutions, Inc.

    2010-04-29

    There is presently a worldwide effort to increase the speed and decrease the cost of DNA sequencing as exemplified by the goal of the National Human Genome Research Institute (NHGRI) to sequence a human genome for under $1000. Several high throughput technologies are under development. Among these, single strand sequencing using exonuclease appears very promising. However, this approach requires complete labeling of at least two bases at a time, with extrinsic high quantum yield probes. This is necessary because nucleotides absorb in the deep ultraviolet (UV) and emit with extremely low quantum yields. Hence intrinsic emission from DNA and nucleotides is not being exploited for DNA sequencing. In the present paper we consider the possibility of identifying single nucleotides using their intrinsic emission. We used the finite-difference time-domain (FDTD) method to calculate the effects of aluminum nanoparticles on nearby fluorophores that emit in the UV. We find that the radiated power of UV fluorophores is significantly increased when they are in close proximity to aluminum nanostructures. We show that there will be increased localized excitation near aluminum particles at wavelengths used to excite intrinsic nucleotide emission. Using FDTD simulation we show that a typical DNA base when coupled to appropriate aluminum nanostructures leads to highly directional emission. Additionally we present experimental results showing that a thin film of nucleotides shows enhanced emission when in close proximity to aluminum nanostructures. Finally we provide Monte Carlo simulations that predict high levels of base calling accuracy for an assumed number of photons that is derived from the emission spectra of the intrinsic fluorescence of the bases. Our results suggest that single nucleotides can be detected and identified using aluminum nanostructures that enhance their intrinsic emission. This capability would be valuable for the ongoing efforts toward the $1000 genome.

  4. On the Feasibility of Using the Intrinsic Fluorescence of Nucleotides for DNA Sequencing

    PubMed Central

    Chowdhury, Mustafa H.; Ray, Krishanu; Johnson, Michael L.; Gray, Stephen K.; Pond, James; Lakowicz, Joseph R.

    2010-01-01

    There is presently a worldwide effort to increase the speed and decrease the cost of DNA sequencing, as exemplified by the goal of the National Human Genome Research Institute (NHGRI) to sequence a human genome for under $1000. Several high throughput technologies are under development. Among these, single strand sequencing using exonuclease appears very promising. However, this approach requires complete labeling of at least two bases at a time, with extrinsic high quantum yield probes. This is necessary because nucleotides absorb in the deep ultra-violet (UV) and emit with extremely low quantum yields. Hence intrinsic emission from DNA and nucleotides is not being exploited for DNA sequencing. In the present paper we consider the possibility of identifying single nucleotides using their intrinsic emission. We used the finite-difference time-domain (FDTD) method to calculate the effects of aluminum nanoparticles on nearby fluorophores that emit in the UV. We find that the radiated power of UV fluorophores is significantly increased when they are in close proximity to aluminum nanostructures. We show that there will be increased localized excitation near aluminum particles at wavelengths used to excite intrinsic nucleotide emission. Using FDTD simulation we show that a typical DNA base, when coupled to appropriate aluminum nanostructures, leads to highly directional emission. Additionally we present experimental results showing that a thin film of nucleotides shows enhanced emission when in close proximity to aluminum nanostructures. Finally we provide Monte Carlo simulations that predict high levels of base calling accuracy for an assumed number of photons that is derived from the emission spectra of the intrinsic fluorescence of the bases. Our results suggest that single nucleotides can be detected and identified using aluminum nanostructures that enhance their intrinsic emission. This capability would be valuable for the ongoing efforts towards the $1000 genome.

  5. The vicilin gene family of pea (Pisum sativum L.): a complete cDNA coding sequence for preprovicilin.

    PubMed Central

    Lycett, G W; Delauney, A J; Gatehouse, J A; Gilroy, J; Croy, R R; Boulter, D

    1983-01-01

    A cDNA plasmid bank has been constructed using mRNA from developing pea seeds and three cDNAs coding for vicilin polypeptides have been selected. These cDNAs have been sequenced and between them cover the whole of the coding sequence plus part of the 5' and 3' untranslated regions. Comparison with amino acid sequence data from the protein indicates that vicilin is synthesised as preprovicilin with subsequent removal of a signal peptide and a C-terminal peptide as well as post-translational endo-proteolytic cleavage. The cDNAs represent two different classes of vicilin genes whilst amino acid data show that there are at least three major classes of vicilin polypeptide. The vicilin sequences show extensive homology with conglycinin and phaseolin except in the regions of the internal proteolytic cleavages. The evolutionary significance of this relationship is discussed. PMID:6687941

  6. Genomic organization and nucleotide sequences of two corn histone H4 genes.

    PubMed

    Philipps, G; Chaubet, N; Chaboute, M E; Ehling, M; Gigot, C

    1986-01-01

    The sea urchin histone H4 gene has been used as a probe to clone two corn histone H4 genes from a lambda gtWES X lambda B corn genomic library. The nucleotide (nt) sequences of both genes showed that the encoded amino acid sequences were identical to that of the H4 of pea and one variant of wheat. The nt sequences of the coding regions showed 92% homology. 5'- and 3'-flanking regions do not show extensive nt sequence analogies. Southern blotting of the EcoRI digested genomic DNA suggests the existence of multiple H4 genes dispersed throughout the genome.

  7. Iterative Correction of Reference Nucleotides (iCORN) using second generation sequencing technology.

    PubMed

    Otto, Thomas D; Sanders, Mandy; Berriman, Matthew; Newbold, Chris

    2010-07-15

    The accuracy of reference genomes is important for downstream analysis but a low error rate requires expensive manual interrogation of the sequence. Here, we describe a novel algorithm (Iterative Correction of Reference Nucleotides) that iteratively aligns deep coverage of short sequencing reads to correct errors in reference genome sequences and evaluate their accuracy. Using Plasmodium falciparum (81% A + T content) as an extreme example, we show that the algorithm is highly accurate and corrects over 2000 errors in the reference sequence. We give examples of its application to numerous other eukaryotic and prokaryotic genomes and suggest additional applications. The software is available at http://icorn.sourceforge.net
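
    The iterative map-and-correct idea described in this abstract can be sketched as follows. This is a toy illustration, not the iCORN implementation: it assumes each read arrives with a candidate position (a hypothetical seeding step), accepts a read only if it has at most one mismatch to the current reference, and corrects by majority vote. The names correct_once and iterative_correct are made up for the example.

```python
from collections import Counter

def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def correct_once(reference, reads, max_mismatch=1):
    """One round: keep reads close enough to the reference, then majority-vote."""
    pileup = [Counter() for _ in reference]
    for pos, seq in reads:
        window = reference[pos:pos + len(seq)]
        if hamming(seq, window) <= max_mismatch:
            for offset, base in enumerate(seq):
                pileup[pos + offset][base] += 1
    corrected = list(reference)
    for i, counts in enumerate(pileup):
        if counts:  # positions without coverage are left unchanged
            corrected[i] = counts.most_common(1)[0][0]
    return "".join(corrected)

def iterative_correct(reference, reads, max_rounds=10):
    """Repeat correction until the reference stops changing."""
    rounds = 0
    for _ in range(max_rounds):
        updated = correct_once(reference, reads)
        if updated == reference:
            break
        reference = updated
        rounds += 1
    return reference, rounds

# Toy data: three adjacent errors need two rounds, because reads spanning
# the middle error only pass the mismatch filter after its neighbors are fixed.
truth = "GATTACAGGCTTAACG"
draft = "GATTACCTACTTAACG"
reads = [(p, truth[p:p + 6]) for p in range(len(truth) - 5)]
fixed, n_rounds = iterative_correct(draft, reads)
```

    In this toy case the second round succeeds only because the first round made more reads mappable, which is the motivation for iterating rather than correcting in a single pass.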

  8. Complete Nucleotide Sequence of a Citrobacter freundii Plasmid Carrying KPC-2 in a Unique Genetic Environment

    PubMed Central

    Yao, Yancheng; Imirzalioglu, Can; Hain, Torsten; Kaase, Martin; Gatermann, Soeren; Exner, Martin; Mielke, Martin; Hauri, Anja; Dragneva, Yolanta; Bill, Rita; Wendt, Constanze; Wirtz, Angela; Chakraborty, Trinad

    2014-01-01

    The complete and annotated nucleotide sequence of a 54,036-bp plasmid harboring a blaKPC-2 gene that is clonally present in Citrobacter isolates from different species is presented. The plasmid belongs to incompatibility group N (IncN) and harbors the class A carbapenemase KPC-2 in a unique genetic environment. PMID:25395635

  9. PCR amplification and sequences of cDNA clones for the small and large subunits of ADP-glucose pyrophosphorylase from barley tissues.

    PubMed

    Villand, P; Aalen, R; Olsen, O A; Lüthi, E; Lönneborg, A; Kleczkowski, L A

    1992-06-01

    Several cDNAs encoding the small and large subunits of ADP-glucose pyrophosphorylase (AGP) were isolated from total RNA of the starchy endosperm, roots and leaves of barley by polymerase chain reaction (PCR). Sets of degenerate oligonucleotide primers, based on previously published conserved amino acid sequences of plant AGP, were used for synthesis and amplification of the cDNAs. For each of the endosperm, roots and leaves, restriction analysis of the PCR products (ca. 550 nucleotides each) revealed heterogeneity, suggesting the presence of three transcripts for AGP in the endosperm and roots, and up to two AGP transcripts in the leaf tissue. Based on the derived amino acid sequences, two clones from the endosperm, beps and bepl, were identified as coding for the small and large subunit of AGP, respectively, while a leaf transcript (blpl) encoded the putative large subunit of AGP. There was about 50% identity between the endosperm clones, and both of them were about 60% identical to the leaf cDNA. Northern blot analysis indicated that beps and bepl are expressed in both the endosperm and roots, while blpl is detectable only in leaves. Application of the PCR technique in studies on gene structure and gene expression of plant AGP is discussed.

  10. Nucleotide sequence of the 3'-terminal region of potato virus YN RNA.

    PubMed

    van der Vlugt, R; Allefs, S; de Haan, P; Goldbach, R

    1989-01-01

    The sequence of the 3'-terminal 1611 nucleotides of the genome of the tobacco veinal necrosis strain of potato virus Y (PVYN) was determined. The sequence revealed an open reading frame of 1285 nucleotides, of which the start was not identified, and an untranslated region of 316 nucleotides upstream of a poly(A) tract. Comparison of the open reading frame with the amino-terminal sequence of the viral coat protein enabled mapping of the start of the coat protein at amino acid -267, and indicated that maturation of this protein requires proteolytic processing from a larger polyprotein precursor at a glutamine/glycine dipeptide sequence. The coat protein of PVYN displayed significant (51 to 63%) sequence homology to the coat proteins of four other potyviruses, tobacco etch virus, tobacco vein mottling virus, plum pox virus and sugarcane mosaic virus. Even higher sequence homology (91%) was detected with the coat protein of a fifth potyvirus, pepper mottle virus (PeMV). This homology was of the same level as found between the coat proteins of PVYN and a second strain of this virus, PVYD. Since, moreover, PVYN and PeMV were the only potyviruses displaying homology in the 3'-terminal, non-translated regions of their genomes, we conclude that PeMV should be regarded as a strain of PVY.

  11. BSviewer: a genotype-preserving, nucleotide-level visualizer for bisulfite sequencing data.

    PubMed

    Sun, Kun; Lun, Fiona F M; Jiang, Peiyong; Sun, Hao

    2017-08-08

    Bisulfite sequencing technology has been widely used to study the DNA methylation profile in many species. However, most of the current visualization tools for bisulfite sequencing data only provide high-level views (i.e., overall methylation densities) while missing the methylation dynamics at the nucleotide level. They also focus on CpG sites while omitting other information (such as genotypes at SNP sites) which could be helpful for interpreting the methylation pattern of the sequencing data. A bioinformatics tool that visualizes the methylation statuses at the nucleotide level and preserves the most essential information of the sequencing data is thus valuable and needed. We have developed BSviewer, a lightweight nucleotide-level visualization tool for bisulfite sequencing data. Using an imprinted gene as an example, we show that BSviewer could be specifically helpful for interpreting bisulfite sequencing data with allele-specific DNA methylation patterns. BSviewer is implemented in Perl and runs on most GNU/Linux platforms. Source code and testing dataset are freely available at http://sunlab.cpy.cuhk.edu.hk/BSviewer/. Contact: haosun@cuhk.edu.hk.

  12. The nucleotide sequence and genome structure of mung bean yellow mosaic geminivirus.

    PubMed

    Morinaga, T; Ikegami, M; Miura, K

    1993-01-01

    Complete nucleotide sequences of the infectious cloned DNA components (DNA 1 and DNA 2) of mung bean yellow mosaic virus (MYMV) were determined. MYMV DNA 1 and DNA 2 consist of 2,723 and 2,675 nucleotides, respectively. DNA 1 and DNA 2 have little sequence similarity except for a region of approximately 200 bases which is almost identical in the two molecules. Analysis of open reading frames revealed nine potential coding regions for proteins of mol. wt. > 10,000, six in DNA 1 and three in DNA 2. The nucleotide sequence of MYMV DNA was compared with that of bean golden mosaic virus (BGMV), tomato golden mosaic virus (TGMV) and African cassava mosaic virus (ACMV). The 200-base region common to the two DNAs of each virus had little sequence similarity, except for a highly conserved 33-36 base sequence potentially capable of forming a stable hairpin structure. The potential coding regions in the MYMV DNAs had counterparts in BGMV, TGMV and ACMV, suggesting an overall similarity in genome organization, except for the absence of 1L3 in MYMV DNA 1. The most highly conserved ORFs, MYMV 1R1, BGMV 1R1, TGMV 1R1 and ACMV 1R1, are the putative genes for the coat proteins of MYMV, BGMV, TGMV and ACMV, respectively. MYMV 1L1 also has a high degree of sequence similarity with BGMV 1L1, TGMV 1L1 and ACMV 1L1.
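
    An open reading frame scan of the kind mentioned in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' method: it scans only the three forward frames with the standard ATG start and TAA/TAG/TGA stop codons, and the function name find_orfs and the min_codons parameter are assumptions. ORFs on the complementary strand would be found by running the same scan on the reverse complement.

```python
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=1):
    """Return (start, end) pairs for ATG...stop ORFs in the three forward frames.

    start is the index of the A in ATG; end is one past the stop codon.
    min_codons counts codons before the stop, including the ATG.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return orfs
```

    A molecular-weight cutoff such as the "> 10,000" quoted above would translate into a larger min_codons value; the exact conversion depends on amino acid composition and is not attempted here.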

  13. The discrepancy among single nucleotide variants detected by DNA and RNA high throughput sequencing data.

    PubMed

    Guo, Yan; Zhao, Shilin; Sheng, Quanhu; Samuels, David C; Shyr, Yu

    2017-10-03

    High throughput sequencing technology enables both the human genome and transcriptome to be screened at single nucleotide resolution. Tools have been developed to infer single nucleotide variants (SNVs) from both DNA and RNA sequencing data. To evaluate how much difference can be expected between DNA and RNA sequencing data, and among tissue sources, we designed a study to examine the single nucleotide difference among five sources of high throughput sequencing data generated from the same individual, including exome sequencing from blood, tumor and adjacent normal tissue, and RNAseq from tumor and adjacent normal tissue. Through careful quality control and analysis of the SNVs, we found little difference between DNA-DNA pairs (1%-2%). However, between DNA-RNA pairs, SNV differences ranged anywhere from 10% to 20%. Only a small portion of these differences can be explained by RNA editing. Instead, the majority of the DNA-RNA differences should be attributed to technical errors from sequencing and post-processing of RNAseq data. Our analysis results suggest that SNV detection using RNAseq is subject to high false positive rates.
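
    A pairwise SNV difference of the kind reported in this abstract can be computed as in the sketch below. The abstract does not define the metric, so this example assumes one plausible definition: the share of sites called in either sample that are not called identically in both. The tuple layout (chromosome, position, reference allele, alternate allele) and the function name are also assumptions.

```python
def snv_discordance(calls_a, calls_b):
    """Fraction of SNVs in the union of two call sets that are not shared."""
    union = calls_a | calls_b
    if not union:
        return 0.0
    # Symmetric difference: calls present in exactly one of the two sets.
    return len(calls_a ^ calls_b) / len(union)

# Toy call sets: (chromosome, position, reference allele, alternate allele).
dna_calls = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 50, "G", "A")}
rna_calls = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr3", 10, "T", "C")}
rate = snv_discordance(dna_calls, rna_calls)
```

    In practice the comparison would be restricted to sites with adequate coverage in both data sets, since unexpressed genes cannot yield RNA-based calls.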

  14. Template sequence near the initiation nucleotide can modulate brome mosaic virus RNA accumulation in plant protoplasts.

    PubMed

    Hema, M; Kao, C Cheng

    2004-02-01

    Bromoviral templates for plus-strand RNA synthesis are rich in A or U nucleotides in comparison to templates for minus-strand RNA synthesis. Previous studies demonstrated that plus-strand RNA synthesis by the brome mosaic virus (BMV) RNA replicase is more efficient if the template contains an A/U-rich template sequence near the initiation site (K. Sivakumaran and C. C. Kao, J. Virol. 73:6415-6423, 1999). These observations led us to examine the effects of nucleotide changes near the template's initiation site on the accumulation of BMV RNA3 genomic minus-strand, genomic plus-strand, and subgenomic RNAs in barley protoplasts transfected with wild-type and mutant BMV transcripts. Mutations in the template for minus-strand synthesis had only modest effects on BMV replication in barley protoplasts. Mutants with changes to the +3, +5, and +7 template nucleotides accumulated minus-strand RNA at levels similar to the wild-type level. However, mutations at positions adjacent to the initiation cytidylate in the templates for genomic and subgenomic plus-strand RNA synthesis significantly decreased RNA accumulation. For example, changes at the third template nucleotide for plus-strand RNA3 synthesis resulted in RNA accumulation at between 18 and 24% of the wild-type level, and mutations in the third template nucleotide for subgenomic RNA4 resulted in accumulations at between 7 and 14% of the wild-type level. The effects of the mutations generally decreased as the mutations occurred further from the initiation nucleotide. These findings demonstrate that there are different requirements of the template sequence near the initiation nucleotide for BMV RNA accumulation in plant cells.

  15. Prediction of human rotavirus serotype by nucleotide sequence analysis of the VP7 protein gene.

    PubMed Central

    Green, K Y; Sears, J F; Taniguchi, K; Midthun, K; Hoshino, Y; Gorziglia, M; Nishikawa, K; Urasawa, S; Kapikian, A Z; Chanock, R M

    1988-01-01

    Human rotavirus field isolates were characterized by direct sequence analysis of the gene encoding the serotype-specific major neutralization protein (VP7). Single-stranded RNA transcripts were prepared from virus particles obtained directly from stool specimens or after two or three passages in MA-104 cells. Two regions of the gene (nucleotides 307 through 351 and 670 through 711) which had previously been shown to contain regions of sequence divergence among rotavirus serotypes were sequenced by the dideoxynucleotide method with two different synthetic oligonucleotide primers. The resulting nucleotide sequences were compared with the corresponding sequences from rotaviruses of known serotype (serotype 1, 2, 3, or 4). A total of 25 field isolates and 10 laboratory strains examined by this method exhibited marked sequence identity in both areas of the gene with the corresponding regions of 1 of the 4 reference strains. In addition, the predicted serotype from the sequence analysis correlated in each case with the serotype determined when the rotaviruses were examined by plaque reduction neutralization or reactivity with serotype-specific monoclonal antibodies. These data suggest that as a result of the high degree of sequence conservation observed among rotaviruses of the same serotype, it is possible to predict the serotype of a rotavirus isolate by direct sequence analysis of its VP7 gene. PMID:2833626

  16. Complete nucleotide sequence and affinities of the genomic RNA of Narcissus common latent virus (genus Carlavirus).

    PubMed

    Zheng, H-Y; Chen, J; Adams, M J; Chen, J-P

    2006-08-01

    The complete sequence of an isolate of Narcissus common latent virus (NCLV) from Zhangzhou city, Fujian, China was determined from amplified fragments of purified viral RNA. Excluding the poly(A) tail, the genomic RNA of NCLV was 8539 nucleotides (nt) long and had the typical organization for a member of the genus Carlavirus. The most closely related species were Potato virus M, Hop latent virus and Aconitum latent virus, which had 58-59% nt identity to NCLV in their entire genomes. These relationships were confirmed by a phylogenetic analysis using a composite nucleotide alignment of all the open reading frames.

  17. Nucleotide sequence of miRNA precursor contributes to cleavage site selection by Dicer.

    PubMed

    Starega-Roslan, Julia; Galka-Marciniak, Paulina; Krzyzosiak, Wlodzimierz J

    2015-12-15

    The ribonuclease Dicer excises mature miRNAs from a diverse group of precursors (pre-miRNAs), most of which contain various secondary structure motifs in their hairpin stem. In this study, we analyzed Dicer cleavage in hairpin substrates deprived of such motifs. We searched for factors other than the secondary structure that may influence the length diversity and heterogeneity of miRNAs. We found that the nucleotide sequence at the Dicer cleavage site influences both of these miRNA characteristics. With regard to cleavage mechanism, we demonstrate that the Dicer RNase IIIA domain that cleaves within the 3' arm of the pre-miRNA is more sensitive to the nucleotide sequence of its substrate than is the RNase IIIB domain. The RNase IIIA domain avoids releasing miRNAs with a G nucleotide and prefers to generate miRNAs with a U nucleotide at the 5' end. We also propose that the sequence restrictions at the Dicer cleavage site might be the factor that contributes to the generation of miRNA duplexes with 3' overhangs of atypical lengths. This finding implies that the two RNase III domains forming the single processing center of Dicer may exhibit some degree of flexibility, which allows for the formation of these non-standard 3' overhangs.

  18. Nucleotide sequence of miRNA precursor contributes to cleavage site selection by Dicer

    PubMed Central

    Starega-Roslan, Julia; Galka-Marciniak, Paulina; Krzyzosiak, Wlodzimierz J.

    2015-01-01

    The ribonuclease Dicer excises mature miRNAs from a diverse group of precursors (pre-miRNAs), most of which contain various secondary structure motifs in their hairpin stem. In this study, we analyzed Dicer cleavage in hairpin substrates deprived of such motifs. We searched for factors other than the secondary structure that may influence the length diversity and heterogeneity of miRNAs. We found that the nucleotide sequence at the Dicer cleavage site influences both of these miRNA characteristics. With regard to cleavage mechanism, we demonstrate that the Dicer RNase IIIA domain that cleaves within the 3′ arm of the pre-miRNA is more sensitive to the nucleotide sequence of its substrate than is the RNase IIIB domain. The RNase IIIA domain avoids releasing miRNAs with a G nucleotide and prefers to generate miRNAs with a U nucleotide at the 5′ end. We also propose that the sequence restrictions at the Dicer cleavage site might be the factor that contributes to the generation of miRNA duplexes with 3′ overhangs of atypical lengths. This finding implies that the two RNase III domains forming the single processing center of Dicer may exhibit some degree of flexibility, which allows for the formation of these non-standard 3′ overhangs. PMID:26424848

  19. The complete nucleotide sequence of RNA beta from the type strain of barley stripe mosaic virus.

    PubMed Central

    Gustafson, G; Armour, S L

    1986-01-01

    The complete nucleotide sequence of RNA beta from the type strain of barley stripe mosaic virus (BSMV) has been determined. The sequence is 3289 nucleotides in length and contains four open reading frames (ORFs) which code for proteins of Mr 22,147 (ORF1), Mr 58,098 (ORF2), Mr 17,378 (ORF3), and Mr 14,119 (ORF4). The predicted N-terminal amino acid sequence of the polypeptide encoded by the ORF nearest the 5'-end of the RNA (ORF1) is identical (after the initiator methionine) to the published N-terminal amino acid sequence of BSMV coat protein for 29 of the first 30 amino acids. ORF2 occupies the central portion of the coding region of RNA beta and ORF3 is located at the 3'-end. The ORF4 sequence overlaps the 3'-region of ORF2 and the 5'-region of ORF3 and differs in codon usage from the other three RNA beta ORFs. The coding region of RNA beta is followed by a poly(A) tract and a 238 nucleotide tRNA-like structure which are common to all three BSMV genomic RNAs. PMID:3754962

  20. Nucleotide sequence and genome organization of atractylodes mottle virus, a new member of the genus Carlavirus.

    PubMed

    Zhao, Fumei; Igori, Davaajargal; Lim, Seungmo; Yoo, Ran Hee; Lee, Su-Heon; Moon, Jae Sun

    2015-11-01

    The complete genome sequence of a member of a distinct species of the genus Carlavirus in the family Betaflexiviridae, tentatively named atractylodes mottle virus (AtrMoV), has been determined. Analysis of its genomic organization indicates that it has a single-stranded, positive-sense genomic RNA of 8866 nucleotides, excluding the poly(A) tail, and consists of six open reading frames typical of members of the genus Carlavirus. The individual open reading frames of AtrMoV show moderately low sequence similarity to those of other carlaviruses at the nucleotide and amino acid sequence levels. Pairwise comparison and phylogenetic analysis suggest that AtrMoV is most closely related to chrysanthemum virus B.

  1. A novel HLA-B*51 allele (B*5116) identified by nucleotide sequencing.

    PubMed

    Tamouza, R; Carbonnelle, E; Schaeffer, V; Sadki, K; Abed, Y; Marzais, F; Poirier, J C; Fortier, C; Toubert, A; Raffoux, C; Charron, D

    2000-02-01

    We report here an additional HLA-B*51 variant designated HLA-B*5116. Detected by an abnormal serological reactivity pattern, this variant was identified as a B*51 allele by polymerase chain reaction using sequence-specific primers (PCR-SSP) and characterized by nucleotide sequencing. The new variant sequence matches the classical HLA-B*5101 closely, except for two adjacent nucleotide substitutions at positions 216 and 217 of the third exon and the resulting leucine to glutamic acid change at codon 163 of the alpha2 domain (CTG-->GAG). This new variant was not detected in three different ethnic groups (French, Algerian and Lebanese), suggesting a very low frequency.

  2. Human lymphocyte Fc receptor for IgE: sequence homology of its cloned cDNA with animal lectins

    SciTech Connect

    Ikuta, K.; Takami, M.; Kim, C.W.; Honjo, T.; Miyoshi, T.; Tagaya, Y.; Kawabe, T.; Yodoi, J.

    1987-02-01

    The authors have purified the human lymphocyte Fc receptor specific for IgE (Fcepsilon receptor) and its soluble form by using the anti-Fcepsilon receptor monoclonal antibody H107. Using an oligonucleotide probe corresponding to the partial amino acid sequence of the soluble Fcepsilon receptor related to IgE binding factor, they cloned, sequenced, and expressed a cDNA for the receptor. The Fcepsilon receptor has 321 amino acid residues with no NH2-terminal signal sequence. The receptor was separated into two domains by a putative 24-amino acid residue transmembrane region located near the NH2-terminal end. The Fcepsilon receptor showed a marked homology with animal lectins including human and rat asialoglycoprotein receptors, chicken hepatic lectin, and rat mannose binding proteins.

  3. Hybridization-based antibody cDNA recovery for the production of recombinant antibodies identified by repertoire sequencing.

    PubMed

    Valdés-Alemán, Javier; Téllez-Sosa, Juan; Ovilla-Muñoz, Marbella; Godoy-Lozano, Elizabeth; Velázquez-Ramírez, Daniel; Valdovinos-Torres, Humberto; Gómez-Barreto, Rosa E; Martinez-Barnetche, Jesús

    2014-01-01

    High-throughput sequencing of the antibody repertoire is enabling a thorough analysis of B cell diversity and clonal selection, which may improve the novel antibody discovery process. Theoretically, an adequate bioinformatic analysis could allow identification of candidate antigen-specific antibodies, requiring their recombinant production for experimental validation of their specificity. Gene synthesis is commonly used for the generation of recombinant antibodies identified in silico. Novel strategies that bypass gene synthesis could offer more accessible antibody identification and validation alternatives. We developed a hybridization-based recovery strategy that targets the complementarity-determining region 3 (CDRH3) for the enrichment of cDNA of candidate antigen-specific antibody sequences. Ten clonal groups of interest were identified through bioinformatic analysis of the heavy chain antibody repertoire of mice immunized with hen egg white lysozyme (HEL). cDNA from eight of the targeted clonal groups was recovered efficiently, leading to the generation of recombinant antibodies. One representative heavy chain sequence from each clonal group recovered was paired with previously reported anti-HEL light chains to generate full antibodies, later tested for HEL-binding capacity. The recovery process proposed represents a simple and scalable molecular strategy that could enhance antibody identification and specificity assessment, enabling a more cost-efficient generation of recombinant antibodies.

  4. TUIT, a BLAST-Based tool for taxonomic classification of nucleotide sequences

    PubMed Central

    Tuzhikov, Alexander; Panchin, Alexander; Shestopalov, Valery I.

    2014-01-01

    Pyrosequencing of 16S ribosomal RNA (rRNA) genes has become the gold standard in human microbiome studies. The routine task of taxonomic classification using 16S rRNA reads is commonly performed by the Ribosomal Database Project (RDP) II Classifier, a robust tool that relies on a set of well-characterized reference sequences. However, the RDP II Classifier may be unable to classify a significant part of the dataset due to the absence of proper reference sequences. The taxonomic classification for some of the unclassified sequences might still be performed using BLAST searches against large and frequently updated nucleotide databases. Here we introduce TUIT (Taxonomic Unit Identification Tool) – an efficient open source and platform-independent application that can perform taxonomic classification on its own or can be used in combination with RDP II Classifier to maximize the taxonomic identification rate. Using a set of simulated DNA sequences we demonstrate that the algorithm performs taxonomic classification with high specificity for sequences as short as 125 base pairs. TUIT is applicable for 16S rRNA gene sequence classification; however, it is not restricted to 16S rRNA sequences. In addition, TUIT may be used as a complementary tool for effective taxonomic classification of nucleotide sequences generated by many current platforms, such as Roche 454 and Illumina. Standalone TUIT is available online at http://sourceforge.net/projects/tuit/. PMID:24502797

  5. Sequence selective naked-eye detection of DNA harnessing extension of oligonucleotide-modified nucleotides.

    PubMed

    Verga, Daniela; Welter, Moritz; Marx, Andreas

    2016-02-01

    DNA polymerases can efficiently and sequence-selectively incorporate oligonucleotide (ODN)-modified nucleotides, and the incorporated oligonucleotide strand can be employed as a primer in rolling circle amplification (RCA). The effective amplification of the DNA primer by Φ29 DNA polymerase allows the sequence-selective hybridisation of the amplified strand with a G-quadruplex DNA sequence that has horseradish peroxidase-like activity. Based on these findings we develop a system that allows DNA detection with single-base resolution by the naked eye.

  6. Complete nucleotide sequence analysis of a Dengue-1 virus isolated on Easter Island, Chile.

    PubMed

    Cáceres, C; Yung, V; Araya, P; Tognarelli, J; Villagra, E; Vera, L; Fernández, J

    2008-01-01

    Dengue-1 viruses responsible for the dengue fever outbreak in Easter Island in 2002 were isolated from acute-phase sera of dengue fever patients. In order to analyze the complete genome sequence, we designed primers to amplify contiguous segments across the entire sequence of the viral genome. RT-PCR products obtained were cloned, and complete nucleotide and deduced amino acid sequences were determined. This report constitutes the first complete genetic characterization of a DENV-1 isolate from Chile. Phylogenetic analysis shows that an Easter Island isolate is most closely related to Pacific DENV-1 genotype IV viruses.

  7. Complete nucleotide sequence of a subviral DNA molecule of porcine circovirus type 2.

    PubMed

    Wen, Han

    2016-07-01

    Porcine circovirus type 2 (PCV2) is a member of the genus Circovirus in the family Circoviridae. Most subgenomic molecules of PCV2 have been mapped. Here, the first full-length sequence of a subviral molecule of PCV2 (CH-IVT12) containing a reverse complement sequence of the PCV2 genome was determined by sequencing DNA extracted from PK15 cells infected with PCV2. The circular CH-IVT12 DNA consists of 1136 nucleotides and contains one major open reading frame.

  8. Nucleotide sequences of the coat protein genes of two Japanese zucchini yellow mosaic virus isolates.

    PubMed

    Kundu, A K; Ohshima, K; Sako, N

    1997-10-01

    The nucleotide (nt) sequences of the coat protein (CP) genes of two Japanese zucchini yellow mosaic virus (ZYMV) isolates (ZYMV-169 and ZYMV-M) were determined. The CP genes of both isolates were 837 nt long and encoded 279 amino acids (aa). The nt and deduced aa sequence similarities between the two isolates were 92% and 94.6%, respectively. The deduced aa sequences of CPs of the Japanese isolates were compared with those of previously reported ZYMV isolates by phylogenetic analysis. This comparison led us to divide all ZYMV isolates into 3 groups in which ZYMV-169 formed its own distinct group.
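
    The figures quoted in this abstract follow from two simple calculations, sketched here under stated assumptions: 837 nt divided into triplets gives 279 codons, and a pairwise similarity percentage is taken as the share of identical positions in two pre-aligned, equal-length sequences (gap handling is left out). The function names are illustrative only.

```python
def codon_count(nt_length):
    """Number of codons in a coding sequence of nt_length nucleotides."""
    if nt_length % 3 != 0:
        raise ValueError("coding length is not a multiple of three")
    return nt_length // 3

def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)

codons = codon_count(837)                              # 279
identity = percent_identity("ACGTACGT", "ACGTACGA")   # 87.5
```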

  9. Nucleotide sequence of a new isolate of ribgrass mosaic tobamovirus infecting Impatiens New Guinea.

    PubMed

    Wetzel, T; Njapo Ngangom, H O; Chotewutmontri, S; Krczal, G

    2006-04-01

    The complete nucleotide sequence of a tobamovirus isolated from Impatiens New Guinea was determined. The genome was 6302 nt long, and its genomic organisation was similar to those of other crucifer tobamoviruses. Sequence comparisons with the corresponding sequences of other crucifer tobamoviruses revealed highest levels of identity with the ribgrass mosaic virus (Shanghai isolate). A small open reading frame putatively encoding a 4.5-kDa protein with a low degree of similarity to the ORF6 of tobacco mosaic virus was found nested in the movement protein gene.

  10. The nucleotide sequence at the termini of adenovirus type 5 DNA.

    PubMed Central

    Steenbergh, P H; Maat, J; van Ormondt, H; Sussenbach, J S

    1977-01-01

    The sequences of the first 194 base pairs at both termini of adenovirus type 5 (Ad5) DNA have been determined, using the chemical degradation technique developed by Maxam and Gilbert (Proc. Nat. Acad. Sci. USA 74 (1977), pp. 560-564). The nucleotide sequences 1-75 were confirmed by analysis of labeled RNA transcribed from the terminal HhaI fragments in vitro. The sequence data show that Ad5 DNA has a perfect inverted terminal repetition 103 base pairs long. PMID:600799

  11. Nucleotide sequence and genome organization of Dweet mottle virus and its relationship to members of the family Betaflexiviridae

    USDA-ARS's Scientific Manuscript database

    The nucleotide sequence of Dweet mottle virus (DMV) was determined and compared to sequences of members of the families Alphaflexiviridae and Betaflexiviridae. The DMV genome has 8747 nucleotides (nt) excluding the poly-(A) tail at the 3' end of the genome. The overall G+C content of DMV genomic RNA is 40%. D...

  12. Nucleotide sequence characterization of Ty 1-17, a class II transposon from yeast.

    PubMed Central

    Warmington, J R; Waring, R B; Newlon, C S; Indge, K J; Oliver, S G

    1985-01-01

    We have determined the nucleotide sequence of a class II yeast transposon (Ty 1-17) which is found just centromere-distal to the LEU2 structural gene on chromosome III of Saccharomyces cerevisiae. The complete element is 5961 bp long and is bounded by two identical, directly repeated, delta sequences of 332 bp each. The sequence organization indicates that Ty 1-17 is a retrotransposon, like the class I elements characterized previously. It contains two long open reading-frames, TyA (439 amino acids) and TyB (1349 amino acids). In this paper, the sequences of the two classes of yeast transposon are compared with one another and with analogous elements, such as retroviral proviruses, cauliflower mosaic virus and copia sequences. Features of the Ty 1-17 sequence which may be important to its mechanism of transposition and its genetic action are discussed. PMID:2997719

  13. PatMatch: a program for finding patterns in peptide and nucleotide sequences

    PubMed Central

    Yan, Thomas; Yoo, Danny; Berardini, Tanya Z.; Mueller, Lukas A.; Weems, Dan C.; Weng, Shuai; Cherry, J. Michael; Rhee, Seung Y.

    2005-01-01

    Here, we present PatMatch, an efficient, web-based pattern-matching program that enables searches for short nucleotide or peptide sequences such as cis-elements in nucleotide sequences or small domains and motifs in protein sequences. The program can be used to find matches to a user-specified sequence pattern that can be described using ambiguous sequence codes and a powerful and flexible pattern syntax based on regular expressions. A recent upgrade has improved performance and now supports both mismatches and wildcards in a single pattern. This enhancement has been achieved by replacing the previous searching algorithm, scan_for_matches [D'Souza et al. (1997), Trends in Genetics, 13, 497–498], with nondeterministic-reverse grep (NR-grep), a general pattern matching tool that allows for approximate string matching [Navarro (2001), Software Practice and Experience, 31, 1265–1312]. We have tailored NR-grep to be used for DNA and protein searches with PatMatch. The stand-alone version of the software can be adapted for use with any sequence dataset and is available for download from The Arabidopsis Information Resource (TAIR). The PatMatch server is available on the web for searching Arabidopsis thaliana sequences. PMID:15980466
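
    The idea of describing a pattern with ambiguous sequence codes and matching it through regular expressions can be illustrated with a minimal Python sketch. It is not PatMatch or NR-grep code: the function names `iupac_to_regex` and `find_matches` are hypothetical, it uses Python's standard `re` module, and it handles only exact matches to standard IUPAC nucleotide codes (no mismatches).

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(pattern):
    """Translate an IUPAC pattern such as 'CANNTG' into a regex string."""
    return "".join(IUPAC[ch] for ch in pattern.upper())


def find_matches(pattern, sequence):
    """Return (start, matched_text) for every match, including overlapping ones."""
    # A lookahead wrapping a capture group lets overlapping hits be reported.
    regex = re.compile("(?=(" + iupac_to_regex(pattern) + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence.upper())]


# Hypothetical example: search for the E-box-like pattern CANNTG.
hits = find_matches("CANNTG", "ttCACGTGaaCATATGcc")
```

    Approximate matching with mismatches, which the abstract attributes to NR-grep, would need a different algorithm and is deliberately left out of this sketch.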

  14. Pseudo nucleotide composition or PseKNC: an effective formulation for analyzing genomic sequences.

    PubMed

    Chen, Wei; Lin, Hao; Chou, Kuo-Chen

    2015-10-01

    With the avalanche of DNA/RNA sequences generated in the post-genomic age, it is urgent to develop automated methods for analyzing the relationship between the sequences and their functions. Towards this goal, a series of sequence-based methods have been proposed and applied to analyze various uncharacterized DNA/RNA sequences in order to understand their action mechanisms and processes in depth. Compared with the classical sequence-based methods, the pseudo nucleotide composition or PseKNC approach developed very recently has the following advantages: (1) it can convert DNA/RNA sequences of different lengths into fixed-dimension digital vectors that can be directly handled by all the existing machine-learning algorithms or operation engines; (2) it can contain the desired features and properties according to the selection or definition of users; (3) it can cover considerable sequence pattern information, both local and global. This minireview is focused on the concept of pseudo nucleotide composition, its development and applications.
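
    Advantage (1) above, turning sequences of different lengths into vectors of one fixed dimension, can be sketched with plain k-tuple nucleotide composition. This is only the simplest building block and an assumption on my part: the full PseKNC formulation adds further components that are not described in this abstract and are not implemented here. The function name `kmer_composition` is hypothetical.

```python
from itertools import product


def kmer_composition(seq, k=2):
    """Return normalized k-mer frequencies in a fixed order of length 4**k."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    total = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # windows containing N or other symbols are skipped
            counts[word] += 1
            total += 1
    # Same ordering for every input, so vectors are directly comparable.
    return [counts[w] / total if total else 0.0 for w in kmers]


# Two sequences of different length map to vectors of identical dimension.
short_vec = kmer_composition("ACGTACGT")
long_vec = kmer_composition("ACGTTTGACCAGTAGGCATCA")
```

    Both vectors have 16 entries for k = 2, so either can be passed to a standard machine-learning routine without padding or truncation.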

  15. Nucleotide binding database NBDB – a collection of sequence motifs with specific protein-ligand interactions

    PubMed Central

    Zheng, Zejun; Goncearenco, Alexander; Berezovsky, Igor N.

    2016-01-01

    The NBDB database describes protein motifs, elementary functional loops (EFLs), that are involved in binding of nucleotide-containing ligands and other biologically relevant cofactors/coenzymes, including ATP, AMP, ATP, GMP, GDP, GTP, CTP, PAP, PPS, FMN, FAD(H), NAD(H), NADP, cAMP, cGMP, c-di-AMP and c-di-GMP, ThPP, THD, F-420, ACO, CoA, PLP and SAM. The database is freely available online at http://nbdb.bii.a-star.edu.sg. In total, NBDB contains data on 249 motifs that work in interactions with 24 ligands. Sequence profiles of EFL motifs were derived de novo from nonredundant Uniprot proteome sequences. Conserved amino acid residues in the profiles interact specifically with distinct chemical parts of nucleotide-containing ligands, such as nitrogenous bases, phosphate groups, ribose, nicotinamide, and flavin moieties. Each EFL profile in the database is characterized by a pattern of corresponding ligand–protein interactions found in crystallized ligand–protein complexes. The NBDB database helps to explore the determinants of nucleotide and cofactor binding in different protein folds and families. NBDB can also detect fragments that match profiles of particular EFLs in a protein sequence provided by the user. Comprehensive information on sequences, structures, and interactions of EFLs with ligands provides a foundation for experimental and computational efforts on the design of required protein functions. PMID:26507856

  16. PCV: An Alignment Free Method for Finding Homologous Nucleotide Sequences and its Application in Phylogenetic Study.

    PubMed

    Kumar, Rajnish; Mishra, Bharat Kumar; Lahiri, Tapobrata; Kumar, Gautam; Kumar, Nilesh; Gupta, Rahul; Pal, Manoj Kumar

    2017-06-01

    Retrieving homologous nucleotide sequences online from a given sequence database with existing alignment techniques is common practice. A salient feature of these techniques is their dependence on local alignment and scoring matrices, the reliability of which is limited by computational complexity and accuracy. To address this, this work offers a novel numerical representation of genes that can further help divide the data space into smaller partitions and so support the formation of a search tree. In this context, this paper introduces a 36-dimensional Periodicity Count Value (PCV) which is representative of a particular nucleotide sequence and was created by adapting the stochastic model of Kolekar et al. (American Institute of Physics 1298:307-312, 2010. doi: 10.1063/1.3516320). The PCV construct uses information on the physicochemical properties of nucleotides and their positional distribution pattern within a gene. The PCV representation of a gene is observed to reduce the computational cost of calculating distances between a pair of genes while remaining consistent with existing methods. The validity of the PCV-based method was further tested by using it to construct molecular phylogenies and comparing these with phylogenies built using existing sequence alignment methods.

  17. Linking the human cytogenetic map with nucleotide sequence: the CCAP clone set.

    PubMed

    Jang, Wonhee; Yonescu, Raluca; Knutsen, Turid; Brown, Theresa; Reppert, Tricia; Sirotkin, Karl; Schuler, Gregory D; Ried, Thomas; Kirsch, Ilan R

    2006-07-15

    We present the completed dataset and clone repository of the Cancer Chromosome Aberration Project (CCAP), an initiative developed and funded through the intramural program of the U.S. National Cancer Institute, to provide seamless linkage of human cytogenetic markers with the primary nucleotide sequence of the human genome. Spaced at 1-2 Mb intervals across the human genome, 1,339 bacterial artificial chromosome (BAC) clones have been localized to chromosomal bands through high-resolution fluorescence in situ hybridization (FISH) mapping. Of these clones, 99.8% can be positioned on the primary human genome sequence and 95% are placed at or close to their precise nucleotide starts and stops. This dataset can be studied and manipulated within generally available public Web sites. The clones are available from a commercial repository. The CCAP BAC clone set provides anchors for the interrogation of gene and sequence involvement in oncogenic and developmental disorders when the starting point is the recognition of a structural, numerical, or interstitial chromosomal aberration. This dataset also provides a current view of the quality and coherence of the available genome sequence and insight into the nucleotide and three-dimensional structures that manifest as Giemsa light and dark chromosomal banding patterns.

  18. The human sorbitol dehydrogenase gene: cDNA cloning, sequence determination, and mapping by fluorescence in situ hybridization

    SciTech Connect

    Lee, F.K.; Chung, S.; Cheung, M.C.

    1994-05-15

    The cDNA for human sorbitol dehydrogenase (SORD) has been cloned and sequenced. It translates into a peptide of 356 amino acid residues, one more than the sequence previously reported from peptide analysis. An extra alanine was found at the acetyl-blocked N-terminal, between positions 1 and 4. This matches the rat cDNA, which also has 356 amino acids, with an extra proline at position 3. Four other mismatches were also observed, but these are all amino acid substitutions that occur outside proposed functionally important regions. Further work must be performed to determine whether these discrepancies represent polymorphic forms of the enzyme. The SORD gene was mapped by fluorescence in situ hybridization and found to occupy a single site on chromosome 15q15, indicating that it is a single-copy gene. This was confirmed by Southern blot hybridization. SORD is thought to be involved in the etiology of diabetic complications, and its deficiency has been linked to congenital cataracts. The cloned gene could be used as a probe to study the role of this enzyme in the pathogenesis of these diseases. 24 refs., 4 figs.

  19. DNA sequence-based "bar codes" for tracking the origins of expressed sequence tags from a maize cDNA library constructed using multiple mRNA sources.

    PubMed

    Qiu, Fang; Guo, Ling; Wen, Tsui-Jung; Liu, Feng; Ashlock, Daniel A; Schnable, Patrick S

    2003-10-01

    To enhance gene discovery, expressed sequence tag (EST) projects often make use of cDNA libraries produced using diverse mixtures of mRNAs. As such, expression data are lost because the origins of the resulting ESTs cannot be determined. Alternatively, multiple libraries can be prepared, each from a more restricted source of mRNAs. Although this approach allows the origins of ESTs to be determined, it requires the production of multiple libraries. A hybrid approach is reported here. A cDNA library was prepared using 21 different pools of maize (Zea mays) mRNAs. DNA sequence "bar codes" were added during first-strand cDNA synthesis to uniquely identify the mRNA source pool from which individual cDNAs were derived. Using a decoding algorithm that included error correction, it was possible to identify the source mRNA pool of more than 97% of the ESTs. The frequency at which a bar code is represented in an EST contig should be proportional to the abundance of the corresponding mRNA in the source pool. Consistent with this, all ESTs derived from several genes (zein and adh1) that are known to be exclusively expressed in kernels or preferentially expressed under anaerobic conditions, respectively, were exclusively tagged with bar codes associated with mRNA pools prepared from kernel and anaerobically treated seedlings, respectively. Hence, by allowing for the retention of expression data, the bar coding of cDNA libraries can enhance the value of EST projects.
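
    Decoding with error correction, as mentioned above, can be sketched as nearest-barcode assignment under Hamming distance. This is an illustrative assumption, not the authors' algorithm: the barcode sequences, pool names, and the functions `hamming` and `decode_barcode` are all hypothetical.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def decode_barcode(read_tag, barcodes, max_errors=1):
    """Return the pool of the closest barcode, or None if too far or ambiguous."""
    scored = sorted((hamming(read_tag, bc), pool) for pool, bc in barcodes.items())
    best_dist, best_pool = scored[0]
    if best_dist > max_errors:
        return None  # too many sequencing errors to assign a pool
    if len(scored) > 1 and scored[1][0] == best_dist:
        return None  # two pools are equally close, so the read is ambiguous
    return best_pool


# Hypothetical barcodes; every pair differs at all six positions.
BARCODES = {
    "kernel": "ACGTAC",
    "seedling_anaerobic": "TGCATG",
    "root": "GATCCA",
}

exact = decode_barcode("ACGTAC", BARCODES)       # perfect match
corrected = decode_barcode("ACGTAA", BARCODES)   # one error, still recoverable
rejected = decode_barcode("AAAAAA", BARCODES)    # too many errors
```

    The larger the minimum distance between barcodes, the more sequencing errors can be tolerated before two pools become indistinguishable.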

  20. Targeted rapid amplification of cDNA ends (T-RACE)--an improved RACE reaction through degradation of non-target sequences.

    PubMed

    Bower, Neil I; Johnston, Ian A

    2010-11-01

    Amplification of the 5' ends of cDNA, although simple in theory, can often be difficult to achieve. We describe a novel method for the specific amplification of cDNA ends. An oligo-dT adapter incorporating a dUTP-containing PCR primer primes first-strand cDNA synthesis incorporating dUTP. Using the Cap finder approach, another distinct dUTP-containing adapter is added to the 3' end of the newly synthesized cDNA. Second-strand synthesis incorporating dUTP is achieved by PCR, using dUTP-containing primers complementary to the adapter sequences incorporated in the cDNA ends. The double-stranded cDNA-containing dUTP serves as a universal template for the specific amplification of the 3' or 5' end of any gene. To amplify the ends of cDNA, asymmetric PCR is performed using a single gene-specific primer and standard dNTPs. The asymmetric PCR product is purified and non-target transcripts containing dUTP are degraded by Uracil DNA glycosylase, leaving only those transcripts produced during the asymmetric PCR. Subsequent PCR using a nested gene-specific primer and the 3' or 5' T-RACE primer results in specific amplification of cDNA ends. This method can be used to specifically amplify the 3' and 5' ends of numerous cDNAs from a single cDNA synthesis reaction.

  1. Quadfinder: server for identification and analysis of quadruplex-forming motifs in nucleotide sequences

    PubMed Central

    Scaria, Vinod; Hariharan, Manoj; Arora, Amit; Maiti, Souvik

    2006-01-01

    G-quadruplex secondary structures, which play a structural role in repetitive DNA such as telomeres, may also play a functional role at other genomic locations as targetable regulatory elements which control gene expression. The recent interest in application of quadruplexes in biological systems prompted us to develop a tool for the identification and analysis of quadruplex-forming nucleotide sequences, especially in RNA. Here we present Quadfinder, an online server for prediction and bioinformatics of uni-molecular quadruplex-forming nucleotide sequences. The server is designed to be user-friendly and needs minimal intervention by the user, while providing flexibility of defining the variants of the motif. The server is freely available online. PMID:16845097
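
    A uni-molecular quadruplex-forming motif with user-definable variants can be sketched as a parameterized regular expression. The pattern below (four G-runs of at least three Gs separated by loops of 1 to 7 nucleotides) is a commonly used consensus and an assumption here; the abstract does not state Quadfinder's actual defaults. The function names `g4_regex` and `find_g4` are hypothetical.

```python
import re


def g4_regex(min_g=3, loop_min=1, loop_max=7):
    """Build a regex for four G-runs separated by three loops."""
    g_run = "G{%d,}" % min_g
    loop = "[ACGTU]{%d,%d}" % (loop_min, loop_max)
    return re.compile(g_run + (loop + g_run) * 3)


def find_g4(seq, **params):
    """Return (start, end, motif) for non-overlapping candidate motifs."""
    regex = g4_regex(**params)
    return [(m.start(), m.end(), m.group()) for m in regex.finditer(seq.upper())]


# Four copies of the GGG run from the human telomeric repeat, flanked by spacers.
telomeric = "GGGTTAGGGTTAGGGTTAGGG"
hits = find_g4("aa" + telomeric + "cc")

# Relaxing the G-run length shows how motif variants can be defined.
relaxed = find_g4("GGAGGAGGAGG", min_g=2)
```

    A match from such a pattern only flags a candidate sequence; whether it actually folds into a quadruplex is a separate question that this sketch does not address.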

  2. Nucleotide sequence variability of the Adh gene of the coastal plant Calystegia soldanella (Convolvulaceae) in Japan.

    PubMed

    Ohsako, Takanori; Matsuoka, Gakuto

    2008-02-01

    Calystegia soldanella (Convolvulaceae) is a self-incompatible perennial herb distributed on sandy seashores throughout the temperate zone of the world. In Japan, the species occasionally grows on the sandy shores of Lake Biwa. To clarify the genetic differentiation among local populations, we investigated the nucleotide sequence variability of the Adh gene. In a 1625-bp sequence between exon 2 and the 3' noncoding region of the Adh gene, a total of 44 polymorphic sites were found among 91 individuals from 19 populations. The nucleotide diversity for the entire sample was 0.00212. Similar values were determined for geographical groups of populations. No genetic differentiation among the groups of populations was found. The complete lack of genetic differentiation between the sea coastal populations and the inland populations could not be attributed to gene flow. Although the inland populations are geographically isolated from the sea coastal populations, the time since separation might be insufficient to establish significant genetic differentiation.
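
    The two summary statistics reported above, the number of polymorphic sites and the nucleotide diversity, can be computed from an alignment as in the following sketch. It assumes the usual definition of nucleotide diversity as the average number of pairwise differences per site, takes gap-free aligned sequences of equal length, and uses made-up toy sequences rather than the Adh data; the function names are hypothetical.

```python
from itertools import combinations


def polymorphic_sites(aligned):
    """Count alignment columns that contain more than one nucleotide."""
    return sum(len(set(column)) > 1 for column in zip(*aligned))


def nucleotide_diversity(aligned):
    """Average number of pairwise differences per site."""
    n = len(aligned)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to equal length")
    total_diff = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(aligned, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diff / n_pairs / length


# Toy alignment (not real data): three sequences of ten sites.
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
sites = polymorphic_sites(toy)
pi = nucleotide_diversity(toy)
```

    For the toy alignment the pairwise differences are 1, 1 and 2, so the diversity is 4 differences over 3 pairs and 10 sites.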

  3. Sequence analysis of a rainbow trout cDNA library and creation of a gene index.

    PubMed

    Rexroad, C E; Lee, Y; Keele, J W; Karamycheva, S; Brown, G; Koop, B; Gahr, S A; Palti, Y; Quackenbush, J

    2003-01-01

    Expressed sequence tag (EST) projects have produced extremely valuable resources for identifying genes affecting phenotypes of interest. A large-scale EST sequencing project for rainbow trout was initiated to identify and functionally annotate as many unique transcripts as possible. Over 45,000 5' ESTs were obtained by sequencing clones from a single normalized library constructed using mRNA from six tissues. The production of this sequence data and creation of a rainbow trout Gene Index eliminating redundancy and providing annotation for these sequences will facilitate research in this species.

  4. Nucleotide sequence and replication properties of the Bacillus borstelensis cryptic plasmid pHT926.

    PubMed Central

    Ebisu, S; Murahashi, Y; Takagi, H; Kadowaki, K; Yamaguchi, K; Yamagata, H; Udaka, S

    1995-01-01

    The nucleotide sequence of pHT926, a cryptic plasmid found in Bacillus borstelensis HP926, was determined. pHT926 replicates by a rolling-circle mechanism and belongs to the pC194 plasmid family. The copy number of pHT926 was fourfold higher than that of pUB110, and the plasmid was very stably maintained in Bacillus choshinensis. PMID:7487045

  5. Construction and Evaluation of cDNA Libraries for Large-Scale Expressed Sequence Tag Sequencing in Wheat (Triticum aestivum L.)

    PubMed Central

    Zhang, D.; Choi, D. W.; Wanamaker, S.; Fenton, R. D.; Chin, A.; Malatrasi, M.; Turuspekov, Y.; Walia, H.; Akhunov, E. D.; Kianian, P.; Otto, C.; Simons, K.; Deal, K. R.; Echenique, V.; Stamova, B.; Ross, K.; Butler, G. E.; Strader, L.; Verhey, S. D.; Johnson, R.; Altenbach, S.; Kothari, K.; Tanaka, C.; Shah, M. M.; Laudencia-Chingcuanco, D.; Han, P.; Miller, R. E.; Crossman, C. C.; Chao, S.; Lazo, G. R.; Klueva, N.; Gustafson, J. P.; Kianian, S. F.; Dubcovsky, J.; Walker-Simmons, M. K.; Gill, K. S.; Dvořák, J.; Anderson, O. D.; Sorrells, M. E.; McGuire, P. E.; Qualset, C. O.; Nguyen, H. T.; Close, T. J.

    2004-01-01

    A total of 37 original cDNA libraries and 9 derivative libraries enriched for rare sequences were produced from Chinese Spring wheat (Triticum aestivum L.), five other hexaploid wheat genotypes (Cheyenne, Brevor, TAM W101, BH1146, Butte 86), tetraploid durum wheat (T. turgidum L.), diploid wheat (T. monococcum L.), and two other diploid members of the grass tribe Triticeae (Aegilops speltoides Tausch and Secale cereale L.). The emphasis in the choice of plant materials for library construction was reproductive development subjected to environmental factors that ultimately affect grain quality and yield, but roots and other tissues were also included. Partial cDNA expressed sequence tags (ESTs) were examined by various measures to assess the quality of these libraries. All ESTs were processed to remove cloning system sequences and contaminants and then assembled using CAP3. Following these processing steps, this assembly yielded 101,107 sequences derived from 89,043 clones, which defined 16,740 contigs and 33,213 singletons, a total of 49,953 “unigenes.” Analysis of the distribution of these unigenes among the libraries led to the conclusion that the enrichment methods were effective in reducing the most abundant unigenes and to the observation that the most diverse libraries were from tissues exposed to environmental stresses including heat, drought, salinity, or low temperature. PMID:15514038

  6. Trehalase from male accessory gland of an insect, Tenebrio molitor. cDNA sequencing and developmental profile of the gene expression.

    PubMed Central

    Takiguchi, M; Niimi, T; Su, Z H; Yaginuma, T

    1992-01-01

    A cDNA of alpha,alpha-trehalase (EC 3.2.1.28) has been isolated from a cDNA library of the male bean-shaped accessory gland of the mealworm beetle, Tenebrio molitor, by the homology screening approach. Sequence analysis of the cDNA (1830 bp) revealed that the cDNA encoded a protein of 555 amino acids with a calculated M(r) of 64457. The deduced amino acid sequence had significant similarities to rabbit small intestine and Escherichia coli trehalases. Northern blotting and semi-quantitative PCR analyses revealed that a trehalase transcript of about 2.0 kb was abundant in bean-shaped accessory glands. In the glands, the amount of trehalase transcript increased from 1 to 2 days after adult ecdysis. These tissue- and stage-specific gene expressions of trehalase corresponded to the tissue- and stage-specificity of trehalase activity. PMID:1445264

  7. The complete nucleotide sequence of the mitochondrial genome of Phthonandria atrilineata (Lepidoptera: Geometridae).

    PubMed

    Yang, Ling; Wei, Zhao-Jun; Hong, Gui-Yun; Jiang, Shao-Tong; Wen, Long-Ping

    2009-07-01

    Using the long polymerase chain reaction (Long-PCR) method, we determined the complete nucleotide sequence of the mitochondrial genome (mitogenome) of Phthonandria atrilineata. The complete mtDNA from P. atrilineata was 15,499 base pairs in length and contained 13 protein-coding genes (PCGs), 2 rRNA genes, 22 tRNA genes, and a control region. The P. atrilineata genes were in the same order and orientation as the completely sequenced mitogenomes of other lepidopteran species. The nucleotide composition of the P. atrilineata mitogenome was biased toward A + T nucleotides (81.02%), and the 13 PCGs show different A + T contents that range from 73.25% (cox1) to 92.12% (atp8). Phthonandria had the canonical set of 22 tRNA genes, which fold into the typical cloverleaf structure described for metazoan mt tRNAs, with the unique exception of trnS(AGN). The phylogenetic relationships were reconstructed with the concatenated sequences of the 13 PCGs of the mitochondrial genome, which confirmed that P. atrilineata is most closely related to the superfamily Bombycoidea.
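
    The A + T percentages quoted above are simple base-composition ratios. A minimal sketch of the calculation follows; the function name `at_content` is hypothetical, the example strings are made up, and ambiguous symbols such as N are excluded from the denominator by assumption.

```python
def at_content(seq):
    """Return the percentage of A and T among unambiguous bases."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        return 0.0
    return 100.0 * sum(base in "AT" for base in counted) / len(counted)


# Toy sequences (not real mitogenome data).
rich = at_content("ATATGC")      # four of six bases are A or T
masked = at_content("aattnn")    # N symbols are ignored
```

    Applied gene by gene, the same function would give per-gene values comparable to the cox1 and atp8 figures in the abstract.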

  8. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... readable form may be created by any means, such as word processors, nucleotide/amino acid sequence editors...

  9. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... readable form may be created by any means, such as word processors, nucleotide/amino acid sequence editors...

  10. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... readable form may be created by any means, such as word processors, nucleotide/amino acid sequence editors...

  11. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... readable form may be created by any means, such as word processors, nucleotide/amino acid sequence editors...

  12. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... readable form may be created by any means, such as word processors, nucleotide/amino acid sequence editors...

  13. Analysis of xylem formation in pine by cDNA sequencing

    NASA Technical Reports Server (NTRS)

    Allona, I.; Quinn, M.; Shoop, E.; Swope, K.; St Cyr, S.; Carlis, J.; Riedl, J.; Retzel, E.; Campbell, M. M.; Sederoff, R.; Whetten, R. W.; Davies, E. (Principal Investigator)

    1998-01-01

    Secondary xylem (wood) formation is likely to involve some genes expressed rarely or not at all in herbaceous plants. Moreover, environmental and developmental stimuli influence secondary xylem differentiation, producing morphological and chemical changes in wood. To increase our understanding of xylem formation, and to provide material for comparative analysis of gymnosperm and angiosperm sequences, ESTs were obtained from immature xylem of loblolly pine (Pinus taeda L.). A total of 1,097 single-pass sequences were obtained from 5' ends of cDNAs made from gravistimulated tissue from bent trees. Cluster analysis detected 107 groups of similar sequences, ranging in size from 2 to 20 sequences. A total of 361 sequences fell into these groups, whereas 736 sequences were unique. About 55% of the pine EST sequences show similarity to previously described sequences in public databases. About 10% of the recognized genes encode factors involved in cell wall formation. Sequences similar to cell wall proteins, most known lignin biosynthetic enzymes, and several enzymes of carbohydrate metabolism were found. A number of putative regulatory proteins also are represented. Expression patterns of several of these genes were studied in various tissues and organs of pine. Sequencing novel genes expressed during xylem formation will provide a powerful means of identifying mechanisms controlling this important differentiation pathway.

  15. Plastid: nucleotide-resolution analysis of next-generation sequencing and genomics data.

    PubMed

    Dunn, Joshua G; Weissman, Jonathan S

    2016-11-22

    Next-generation sequencing (NGS) informs many biological questions with unprecedented depth and nucleotide resolution. These assays have created a need for analytical tools that enable users to manipulate data nucleotide-by-nucleotide robustly and easily. Furthermore, because many NGS assays encode information jointly within multiple properties of read alignments - for example, in ribosome profiling, the locations of ribosomes are jointly encoded in alignment coordinates and length - analytical tools are often required to extract the biological meaning from the alignments before analysis. Many assay-specific pipelines exist for this purpose, but there remains a need for user-friendly, generalized, nucleotide-resolution tools that are not limited to specific experimental regimes or analytical workflows. Plastid is a Python library designed specifically for nucleotide-resolution analysis of genomics and NGS data. As such, Plastid is designed to extract assay-specific information from read alignments while retaining generality and extensibility to novel NGS assays. Plastid represents NGS and other biological data as arrays of values associated with genomic or transcriptomic positions, and contains configurable tools to convert data from a variety of sources to such arrays. Plastid also includes numerous tools to manipulate even discontinuous genomic features, such as spliced transcripts, with nucleotide precision. Plastid automatically handles conversion between genomic and feature-centric coordinates, accounting for splicing and strand, freeing users of burdensome accounting. Finally, Plastid's data models use consistent and familiar biological idioms, enabling even beginners to develop sophisticated analytical workflows with minimal effort. Plastid is a versatile toolkit that has been used to analyze data from multiple NGS assays, including RNA-seq, ribosome profiling, and DMS-seq. It forms the genomic engine of our ORF annotation tool, ORF-RATER, and is readily
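
    The conversion between feature-centric and genomic coordinates described above, accounting for splicing and strand, can be illustrated without any library. The sketch below deliberately does not use or imitate Plastid's API: the function `transcript_to_genomic` and its exon representation (0-based, half-open intervals sorted by genomic position) are assumptions made for illustration.

```python
def transcript_to_genomic(exons, strand, tpos):
    """Map a 0-based transcript position to a 0-based genomic position.

    exons  -- list of (start, end) half-open genomic intervals, ascending
    strand -- "+" or "-"
    tpos   -- position counted from the transcript's 5' end
    """
    if tpos < 0:
        raise IndexError("transcript position must be non-negative")
    # On the minus strand the transcript starts in the rightmost exon.
    ordered = exons if strand == "+" else list(reversed(exons))
    remaining = tpos
    for start, end in ordered:
        length = end - start
        if remaining < length:
            if strand == "+":
                return start + remaining
            return end - 1 - remaining
        remaining -= length  # skip this exon and the intron after it
    raise IndexError("position lies beyond the end of the transcript")


# Hypothetical spliced transcript with a 10-nt and a 5-nt exon.
EXONS = [(100, 110), (200, 205)]
plus_first_of_exon2 = transcript_to_genomic(EXONS, "+", 10)
minus_first_base = transcript_to_genomic(EXONS, "-", 0)
```

    Walking the exons in transcript order is what lets position 10 on the plus strand jump across the intron to genomic position 200.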

  16. Nucleotide sequence variation of the envelope protein gene identifies two distinct genotypes of yellow fever virus.

    PubMed

    Chang, G J; Cropp, B C; Kinney, R M; Trent, D W; Gubler, D J

    1995-09-01

    The evolution of yellow fever virus over 67 years was investigated by comparing the nucleotide sequences of the envelope (E) protein genes of 20 viruses isolated in Africa, the Caribbean, and South America. Uniformly weighted parsimony algorithm analysis defined two major evolutionary yellow fever virus lineages designated E genotypes I and II. E genotype I contained viruses isolated from East and Central Africa. E genotype II viruses were divided into two sublineages: IIA viruses from West Africa and IIB viruses from America, except for a 1979 virus isolated from Trinidad (TRINID79A). Unique signature patterns were identified at 111 nucleotide and 12 amino acid positions within the yellow fever virus E gene by signature pattern analysis. Yellow fever viruses from East and Central Africa contained unique signatures at 60 nucleotide and five amino acid positions, those from West Africa contained unique signatures at 25 nucleotide and two amino acid positions, and viruses from America contained such signatures at 30 nucleotide and five amino acid positions in the E gene. The dissemination of yellow fever viruses from Africa to the Americas is supported by the close genetic relatedness of genotype IIA and IIB viruses and genetic evidence of a possible second introduction of yellow fever virus from West Africa, as illustrated by the TRINID79A virus isolate. The E protein genes of American IIB yellow fever viruses had higher frequencies of amino acid substitutions than did genes of yellow fever viruses of genotypes I and IIA on the basis of comparisons with a consensus amino acid sequence for the yellow fever E gene. The great variation in the E proteins of American yellow fever virus probably results from positive selection imposed by virus interaction with different species of mosquitoes or nonhuman primates in the Americas.

  17. Nucleotide sequence variation of the envelope protein gene identifies two distinct genotypes of yellow fever virus.

    PubMed Central

    Chang, G J; Cropp, B C; Kinney, R M; Trent, D W; Gubler, D J

    1995-01-01

    The evolution of yellow fever virus over 67 years was investigated by comparing the nucleotide sequences of the envelope (E) protein genes of 20 viruses isolated in Africa, the Caribbean, and South America. Uniformly weighted parsimony algorithm analysis defined two major evolutionary yellow fever virus lineages designated E genotypes I and II. E genotype I contained viruses isolated from East and Central Africa. E genotype II viruses were divided into two sublineages: IIA viruses from West Africa and IIB viruses from America, except for a 1979 virus isolated from Trinidad (TRINID79A). Unique signature patterns were identified at 111 nucleotide and 12 amino acid positions within the yellow fever virus E gene by signature pattern analysis. Yellow fever viruses from East and Central Africa contained unique signatures at 60 nucleotide and five amino acid positions, those from West Africa contained unique signatures at 25 nucleotide and two amino acid positions, and viruses from America contained such signatures at 30 nucleotide and five amino acid positions in the E gene. The dissemination of yellow fever viruses from Africa to the Americas is supported by the close genetic relatedness of genotype IIA and IIB viruses and genetic evidence of a possible second introduction of yellow fever virus from West Africa, as illustrated by the TRINID79A virus isolate. The E protein genes of American IIB yellow fever viruses had higher frequencies of amino acid substitutions than did genes of yellow fever viruses of genotypes I and IIA on the basis of comparisons with a consensus amino acid sequence for the yellow fever E gene. The great variation in the E proteins of American yellow fever virus probably results from positive selection imposed by virus interaction with different species of mosquitoes or nonhuman primates in the Americas. PMID:7637022

  18. Molecular cloning and nucleotide sequence of chicken avidin-related genes 1-5.

    PubMed

    Keinänen, R A; Wallén, M J; Kristo, P A; Laukkanen, M O; Toimela, T A; Helenius, M A; Kulomaa, M S

    1994-03-01

    Using avidin cDNA as a hybridisation probe, we detected a gene family whose putative products are related to the chicken egg-white avidin. Two overlapping genomic clones were found to contain five genes (avidin-related genes 1-5, avr1-avr5), which have been cloned, characterized and sequenced. All of the genes have a four-exon structure with an overall identity with the avidin cDNA of 88-92%. The genes appear to have no pseudogenic features and, in fact, two of these genes have been shown to be transcribed. The putative proteins share a sequence identity of 68-78% with avidin. The amino acid residues responsible for the biotin-binding activity of avidin and the bacterial biotin-binding protein, streptavidin, are highly conserved. Since avidin is induced in both a progesterone-specific manner and in connection with inflammation, these genes offer a valuable tool to study complex gene regulation in vivo.

  19. Nucleotide sequencing and characterization of the genes encoding benzene oxidation enzymes of Pseudomonas putida.

    PubMed Central

    Irie, S; Doi, S; Yorifuji, T; Takagi, M; Yano, K

    1987-01-01

    The nucleotide sequence of the genes from Pseudomonas putida encoding oxidation of benzene to catechol was determined. Five open reading frames were found in the sequence. Four corresponding protein molecules were detected by a DNA-directed in vitro translation system. Escherichia coli cells containing the fragment with the four open reading frames transformed benzene to cis-benzene glycol, which is an intermediate of the oxidation of benzene to catechol. The relation between the product of each cistron and the components of the benzene oxidation enzyme system is discussed. Images PMID:3667527

  20. SinicView: a visualization environment for comparisons of multiple nucleotide sequence alignment tools.

    PubMed

    Shih, Arthur Chun-Chieh; Lee, D T; Lin, Laurent; Peng, Chin-Lin; Chen, Shiang-Heng; Wu, Yu-Wei; Wong, Chun-Yi; Chou, Meng-Yuan; Shiao, Tze-Chang; Hsieh, Mu-Fen

    2006-03-02

    Deluged by the rate and complexity of completed genomic sequences, the need to align longer sequences becomes more urgent, and many more tools have thus been developed. In the initial stage of genomic sequence analysis, a biologist is usually faced with the questions of how to choose the best tool to align sequences of interest and how to analyze and visualize the alignment results, and then with the question of whether poorly aligned regions produced by the tool are indeed not homologous or are just results due to inappropriate alignment tools or scoring systems used. Although several systematic evaluations of multiple sequence alignment (MSA) programs have been proposed, they may not provide a standard-bearer for most biologists because those poorly aligned regions in these evaluations are never discussed. Thus, a tool that allows cross comparison of the alignment results obtained by different tools simultaneously could help a biologist evaluate their correctness and accuracy. In this paper, we present a versatile alignment visualization system called SinicView (for Sequence-aligning INnovative and Interactive Comparison VIEWer), which allows the user to efficiently compare and evaluate assorted nucleotide alignment results obtained by different tools. SinicView calculates similarity of the alignment outputs under a fixed window using the sum-of-pairs method and provides scoring profiles of each set of aligned sequences. The user can visually compare alignment results either in graphic scoring profiles or in plain text format of the aligned nucleotides along with the annotation information. We illustrate the capabilities of our visualization system by comparing alignment results obtained by MLAGAN, MAVID, and MULTIZ, respectively. With SinicView, users can use their own data sequences to compare various alignment tools or scoring systems and select the most suitable one to perform alignment in the initial stage of sequence analysis.
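
    The windowed sum-of-pairs scoring mentioned in this abstract can be sketched roughly as follows. This is a minimal illustration, not SinicView's implementation: the scoring values (match = 1, mismatch = 0, gap = 0), the function names, and the toy alignment are all assumptions made for the example.

```python
from itertools import combinations


def column_sum_of_pairs(column, match=1, mismatch=0, gap=0):
    """Sum-of-pairs score for one alignment column (scoring values are assumed)."""
    score = 0
    for a, b in combinations(column, 2):
        if a == "-" or b == "-":
            score += gap
        elif a == b:
            score += match
        else:
            score += mismatch
    return score


def windowed_profile(alignment, window):
    """Mean per-column sum-of-pairs score for every full window along the alignment."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned rows must have the same length")
    col_scores = [
        column_sum_of_pairs([row[i] for row in alignment]) for i in range(length)
    ]
    return [
        sum(col_scores[start:start + window]) / window
        for start in range(length - window + 1)
    ]


# Toy three-sequence alignment (hypothetical data).
alignment = ["ACGTAC", "ACGTTC", "AC-TAC"]
profile = windowed_profile(alignment, window=3)
```

    Plotting one such profile per alignment tool over the same sequences gives the kind of side-by-side comparison the abstract describes; windows with low scores mark regions where an alignment is weak.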

  1. Chimeric cDNA Sequences from Citrus tristeza virus Confer RNA Silencing-Mediated Resistance in Transgenic Nicotiana benthamiana Plants.

    PubMed

    Roy, Gourgopal; Sudarshana, Mysore R; Ullman, Diane E; Ding, Shou-Wei; Dandekar, Abhaya M; Falk, Bryce W

    2006-08-01

    RNA silencing has been shown to be an important mechanism for conferring resistance in transgenic, virus-resistant plants. We used this approach to evaluate resistance in Nicotiana benthamiana plants transformed with chimeric coding and noncoding sequences from Citrus tristeza virus (CTV). Several independent transgenic plant lines were generated, using two constructs (pCTV1 and pCTV2) designed to produce self-complementary transcripts. The pCTV1 contained cDNA sequences from the CTV capsid protein (CP), p20, and 3' untranslated region (UTR); and pCTV2 contained CP, p23, and 3' UTR sequences. Heterologous recombinant Potato virus X (PVX) containing either homologous or heterologous CTV sequences was used to challenge plants and resistance was evaluated phenotypically and validated with reverse-transcriptase polymerase chain reaction and northern hybridization analysis. Transgenic plants (T1 generation) for each construct showed resistance to recombinant PVX constructs used for challenge experiments when PVX contained p20 or UTR (for CTV1 plants), or p23 or UTR (for CTV2 plants). However, no resistance was seen when plants were challenged with PVX containing the CTV CP. T2 generation plants also showed resistance even when challenged with PVX containing the cognate CTV sequences obtained from heterologous CTV isolates. The presence of transgene-specific small interfering RNAs in the resistant CTV1 and CTV2 plants indicated that resistance was mediated by post-transcriptional gene silencing.

  2. An integrated PCR colony hybridization approach to screen cDNA libraries for full-length coding sequences.

    PubMed

    Pollier, Jacob; González-Guzmán, Miguel; Ardiles-Diaz, Wilson; Geelen, Danny; Goossens, Alain

    2011-01-01

    cDNA-Amplified Fragment Length Polymorphism (cDNA-AFLP) is a commonly used technique for genome-wide expression analysis that does not require prior sequence knowledge. Typically, quantitative expression data and sequence information are obtained for a large number of differentially expressed gene tags. However, most of the gene tags do not correspond to full-length (FL) coding sequences, which is a prerequisite for subsequent functional analysis. A medium-throughput screening strategy, based on integration of polymerase chain reaction (PCR) and colony hybridization, was developed that allows in parallel screening of a cDNA library for FL clones corresponding to incomplete cDNAs. The method was applied to screen for the FL open reading frames of a selection of 163 cDNA-AFLP tags from three different medicinal plants, leading to the identification of 109 (67%) FL clones. Furthermore, the protocol allows for the use of multiple probes in a single hybridization event, thus significantly increasing the throughput when screening for rare transcripts. The presented strategy offers an efficient method for the conversion of incomplete expressed sequence tags (ESTs), such as cDNA-AFLP tags, to FL-coding sequences.

  3. Remote access to ACNUC nucleotide and protein sequence databases at PBIL.

    PubMed

    Gouy, Manolo; Delmotte, Stéphane

    2008-04-01

    The ACNUC biological sequence database system provides powerful and fast query and extraction capabilities to a variety of nucleotide and protein sequence databases. The collection of ACNUC databases served by the Pôle Bio-Informatique Lyonnais includes the EMBL, GenBank, RefSeq and UniProt nucleotide and protein sequence databases and a series of other sequence databases that support comparative genomics analyses: HOVERGEN and HOGENOM containing families of homologous protein-coding genes from vertebrate and prokaryotic genomes, respectively; Ensembl and Genome Reviews for analyses of prokaryotic and of selected eukaryotic genomes. This report describes the main features of the ACNUC system and the access to ACNUC databases from any internet-connected computer. Such access was made possible by the definition of a remote ACNUC access protocol and the implementation of Application Programming Interfaces between the C, Python and R languages and this communication protocol. Two retrieval programs for ACNUC databases, Query_win, with a graphical user interface and raa_query, with a command line interface, are also described. Altogether, these bioinformatics tools provide users with either ready-to-use means of querying remote sequence databases through a variety of selection criteria, or a simple way to endow application programs with an extensive access to these databases. Remote access to ACNUC databases is open to all and fully documented (http://pbil.univ-lyon1.fr/databases/acnuc/acnuc.html).

  4. Nucleotide sequences of immunoglobulin eta genes of chimpanzee and orangutan: DNA molecular clock and hominoid evolution

    SciTech Connect

    Sakoyama, Y.; Hong, K.J.; Byun, S.M.; Hisajima, H.; Ueda, S.; Yaoita, Y.; Hayashida, H.; Miyata, T.; Honjo, T.

    1987-02-01

    To determine the phylogenetic relationships among hominoids and the dates of their divergence, the complete nucleotide sequences of the constant region of the immunoglobulin eta-chain (Cη1) genes from chimpanzee and orangutan have been determined. These sequences were compared with the human eta-chain constant-region sequence. A molecular clock (silent molecular clock), measured by the degree of sequence divergence at the synonymous (silent) positions of protein-encoding regions, was introduced for the present study. From the comparison of nucleotide sequences of α1-antitrypsin and β- and δ-globulin genes between humans and Old World monkeys, the silent molecular clock was calibrated: the mean evolutionary rate of silent substitution was determined to be 1.56 x 10(-9) substitutions per site per year. Using the silent molecular clock, the mean divergence dates of chimpanzee and orangutan from the human lineage were estimated as 6.4 +/- 2.6 million years and 17.3 +/- 4.5 million years, respectively. It was also shown that the evolutionary rate of primate genes is considerably slower than those of other mammalian genes.
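
    The arithmetic behind a molecular clock of this kind can be illustrated with the textbook relation T = K / (2r), where K is the per-site silent divergence between two lineages and r is the substitution rate per site per year. The abstract does not state which formula or multiple-hit correction the authors used, so the relation is an assumption here, and the divergence values below are back-calculated from the reported dates rather than taken from the paper.

```python
# Silent substitution rate per site per year, as reported in the abstract.
RATE = 1.56e-9


def divergence_time_years(k_silent, rate=RATE):
    """Divergence time under the assumed relation T = K / (2r).

    The factor of 2 reflects substitutions accumulating independently
    along both lineages since their common ancestor.
    """
    return k_silent / (2.0 * rate)


def implied_divergence(t_years, rate=RATE):
    """Per-site silent divergence implied by a divergence time: K = 2rT."""
    return 2.0 * rate * t_years


# Back-calculated illustrations, NOT values reported by the authors.
k_human_chimp = implied_divergence(6.4e6)    # about 0.020 substitutions per silent site
k_human_orang = implied_divergence(17.3e6)   # about 0.054 substitutions per silent site
```

    Under this assumed relation, the reported dates correspond to roughly 2% and 5.4% silent-site divergence from the human sequence, which gives a feel for how small the measured differences are relative to the stated uncertainties.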

  5. Analysis of a cloned colicin Ib gene: complete nucleotide sequence and implications for regulation of expression.

    PubMed Central

    Varley, J M; Boulnois, G J

    1984-01-01

    The complete nucleotide sequence of a 2,971 base pair EcoRI fragment carrying the structural gene for colicin Ib has been determined. The length of the gene is 1,881 nucleotides which is predicted to produce a protein of 626 amino acids and of molecular weight 71,364. The structural gene is flanked by likely promoter and terminator signals and in between the promoter and the ribosome binding site is an inverted repeat sequence which resembles other sequences known to bind the LexA protein. Further analysis of the 5' flanking sequences revealed a second region which may act either as a second LexA binding site and/or in the binding of cyclic AMP receptor protein. Comparison of the predicted amino acid sequence of colicin Ib with that of colicins A and E1 reveals localised homology. The implications of these similarities in the proteins and of regulation of the colicin Ib structural gene are discussed. Images PMID:6091036
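
    The gene length and protein length quoted in this abstract are consistent if the 1,881 nucleotides include the stop codon; that reading is an assumption, since the abstract does not say so. A short sketch of the check (function name is illustrative):

```python
def residues_from_orf(nt_length, includes_stop=True):
    """Number of amino acids encoded by an open reading frame of nt_length nucleotides."""
    if nt_length % 3 != 0:
        raise ValueError("an open reading frame must be a multiple of 3 nucleotides")
    codons = nt_length // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons


residues = residues_from_orf(1881)        # 627 codons, one of which is the stop
mean_residue_mass = 71364 / residues      # average mass per residue, in daltons
```

    With the stop codon counted, 1,881 nucleotides give 627 codons and 626 residues, matching the abstract, and the reported molecular weight works out to 114 Da per residue on average.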

  6. Nucleotide sequence of an exceptionally long 5.8S ribosomal RNA from Crithidia fasciculata.

    PubMed Central

    Schnare, M N; Gray, M W

    1982-01-01

    In Crithidia fasciculata, a trypanosomatid protozoan, the large ribosomal subunit contains five small RNA species (e, f, g, i, j) in addition to 5S rRNA [Gray, M.W. (1981) Mol. Cell. Biol. 1, 347-357]. The complete primary sequence of species i is shown here to be pAACGUGUmCGCGAUGGAUGACUUGGCUUCCUAUCUCGUUGA ... AGAmACGCAGUAAAGUGCGAUAAGUGGUApsiCAAUUGmCAGAAUCAUUCAAUUACCGAAUCUUUGAACGAAACGG ... CGCAUGGGAGAAGCUCUUUUGAGUCAUCCCCGUGCAUGCCAUAUUCUCCAmGUGUCGAA(C)OH. This sequence establishes that species i is a 5.8S rRNA, despite its exceptional length (171-172 nucleotides). The extra nucleotides in C. fasciculata 5.8S rRNA are located in a region whose primary sequence and length are highly variable among 5.8S rRNAs, but which is capable of forming a stable hairpin loop structure (the "G+C-rich hairpin"). The sequence of C. fasciculata 5.8S rRNA is no more closely related to that of another protozoan, Acanthamoeba castellanii, than it is to representative 5.8S rRNA sequences from the other eukaryotic kingdoms, emphasizing the deep phylogenetic divisions that seem to exist within the Kingdom Protista. Images PMID:7079176

  7. Large-scale detection and application of expressed sequence tag single nucleotide polymorphisms in Nicotiana.

    PubMed

    Wang, Y; Zhou, D; Wang, S; Yang, L

    2015-07-14

    Single nucleotide polymorphisms (SNPs) are widespread in the Nicotiana genome. Using an alignment and variation detection method, we developed 20,607,973 SNPs, based on the expressed sequence tag sequences of 10 Nicotiana species. The replacement rate was much higher than the transversion rate in the SNPs, and SNPs exist widely in Nicotiana. In vitro verification indicated that all of the SNPs were high quality and accurate. Evolutionary relationships between 15 varieties were investigated by polymerase chain reaction with a special primer; the specific 302 locus of these sequence results clearly indicated the origin of Zhongyan 100. A database of Nicotiana SNPs (NSNP) was developed to store and search for SNPs in Nicotiana. NSNP is a tool for researchers to develop SNP markers from sequence data.

  8. HPC-CLUST: distributed hierarchical clustering for large sets of nucleotide sequences.

    PubMed

    Matias Rodrigues, João F; von Mering, Christian

    2014-01-15

    Nucleotide sequence data are being produced at an ever increasing rate. Clustering such sequences by similarity is often an essential first step in their analysis, intended to reduce redundancy, define gene families or suggest taxonomic units. Exact clustering algorithms, such as hierarchical clustering, scale relatively poorly in terms of run time and memory usage, yet they are desirable because heuristic shortcuts taken during clustering might have unintended consequences in later analysis steps. Here we present HPC-CLUST, a highly optimized software pipeline that can cluster large numbers of pre-aligned DNA sequences by running on distributed computing hardware. It allocates both memory and computing resources efficiently, and can process more than a million sequences in a few hours on a small cluster. Source code and binaries are freely available at http://meringlab.org/software/hpc-clust/; the pipeline is implemented in C++ and uses the Message Passing Interface (MPI) standard for distributed computing.
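
    To make the idea of exact clustering on pre-aligned sequences concrete, here is a naive single-machine sketch of single-linkage clustering at one identity threshold. It is not HPC-CLUST's algorithm or interface: the identity definition, the threshold, the function names, and the toy sequences are assumptions for illustration, and the all-pairs loop is exactly the quadratic cost that motivates a distributed implementation.

```python
def identity(a, b):
    """Fraction of identical positions between two aligned rows.

    Columns where both rows carry a gap are ignored (an assumed convention).
    """
    pairs = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def single_linkage(seqs, threshold):
    """Group sequence indices so that any pair at or above threshold shares a cluster."""
    parent = list(range(len(seqs)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if identity(seqs[i], seqs[j]) >= threshold:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(seqs)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Hypothetical pre-aligned sequences of equal length.
seqs = ["ACGTACGT", "ACGTACGA", "ACGTTCGA", "TTTTGGGG"]
clusters = single_linkage(seqs, threshold=0.85)
```

    In this toy input, sequences 0 and 2 are only 75% identical, yet they end up in the same cluster because each is 87.5% identical to sequence 1. That chaining behaviour is characteristic of single linkage; complete or average linkage would treat the same data differently.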

  9. A carboxypeptidase inhibitor from the medical leech Hirudo medicinalis. Isolation, sequence analysis, cDNA cloning, recombinant expression, and characterization.

    PubMed

    Reverter, D; Vendrell, J; Canals, F; Horstmann, J; Avilés, F X; Fritz, H; Sommerhoff, C P

    1998-12-04

    A novel metallocarboxypeptidase inhibitor was isolated from the medical leech Hirudo medicinalis. Amino acid sequence analysis provided a nearly complete primary structure, which was subsequently verified and completed by cDNA cloning using reverse transcriptase-polymerase chain reaction/rapid amplification of cDNA ends techniques. The inhibitor, called LCI (leech carboxypeptidase inhibitor), is a cysteine-rich polypeptide composed of 66 amino acid residues. It does not show sequence similarity to any other protein except at its C-terminal end. In this region, the inhibitor shares the amino acid sequence -Thr-Cys-X-Pro-Tyr-Val-X with Solanaceae carboxypeptidase inhibitors, suggesting a similar mechanism of inhibition where the C-terminal tail of the inhibitor interacts with the active center of metallocarboxypeptidases in a substrate-like manner. This hypothesis is supported by the hydrolytic release of the C-terminal glutamic acid residue of LCI after binding to the enzyme. Heterologous overexpression of LCI in Escherichia coli, either into the medium or as an intracellular thioredoxin fusion protein, yields a protein with full inhibitory activity. Both in the natural and recombinant forms, LCI is a tightly binding, competitive inhibitor of different types of pancreatic-like carboxypeptidases, with equilibrium dissociation constants Ki of 0.2-0.4 x 10(-9) M for the complexes with the pancreatic enzymes A1, A2, and B and plasma carboxypeptidase B. Circular dichroism and nuclear magnetic resonance spectroscopy analysis indicate that recombinant LCI is a compactly folded globular protein, stable to a wide range of pH and denaturing conditions.

  10. Complete nucleotide sequence of a circular plasmid from the Lyme disease spirochete, Borrelia burgdorferi.

    PubMed Central

    Dunn, J J; Buchstein, S R; Butler, L L; Fisenne, S; Polin, D S; Lade, B N; Luft, B J

    1994-01-01

    We have determined the complete nucleotide sequence of a small circular plasmid from the spirochete Borrelia burgdorferi Ip21, the agent of Lyme disease. The plasmid (cp8.3/Ip21) is 8,303 bp long, has a 76.6% A+T content, and is unstable upon passage of cells in vitro. An analysis of the sequence revealed the presence of two nearly perfect copies of a 184-bp inverted repeat sequence separated by 2,675 bp containing three closely spaced, but nonoverlapping, open reading frames (ORFs). Each inverted repeat ends in sequences that may function as signals for the initiation of transcription and translation of flanking plasmid sequences. A unique oligonucleotide probe based on the repeated sequence showed that the DNA between the repeats is present predominantly in a single orientation. Additional copies of the repeat were not detected elsewhere in the Ip21 genome. An analysis for potential ORFs indicates that the plasmid has nine highly probable protein-coding ORFs and one that is less probable; together, they occupy almost 71% of the nucleotide sequence. Analysis of the deduced amino acid sequences of the ORFs revealed one (ORF-9) with features in common with Borrelia lipoproteins and another (ORF-2) having limited homology with a replication protein, RepC, from a gram-positive plasmid that replicates by a rolling circle (RC) mechanism. Known collectively as RC plasmids, such plasmids require a double-stranded origin at which the Rep protein nicks the DNA to generate a single-stranded replication intermediate. cp8.3/Ip21 has three copies of the heptameric motif characteristically found at a nick site of most RC plasmids. These observations suggest that cp8.3/Ip21 may replicate by an RC mechanism. Images PMID:8169221
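
    Two of the sequence features described in this abstract, A+T content and inverted repeats, are simple to compute. The sketch below assumes an uppercase, unambiguous DNA alphabet and uses a made-up toy sequence; the `max_mismatches` parameter is a hypothetical way of allowing for "nearly perfect" copies and is not taken from the paper.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string containing only A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def at_content(seq):
    """Fraction of bases that are A or T."""
    return sum(base in "AT" for base in seq) / len(seq)


def is_inverted_repeat(left, right, max_mismatches=0):
    """True if right matches the reverse complement of left within max_mismatches."""
    if len(left) != len(right):
        return False
    target = reverse_complement(left)
    return sum(a != b for a, b in zip(target, right)) <= max_mismatches


# Toy example (hypothetical sequence): two repeat arms around a short spacer.
left_arm = "AATTGCA"
spacer = "TTTAAAT"
right_arm = reverse_complement(left_arm)   # "TGCAATT"
toy = left_arm + spacer + right_arm
```

    Applied to the real plasmid, the same check would be run on the two 184-bp copies with a small mismatch allowance, and `at_content` on the full 8,303-bp sequence would be expected to return about 0.766.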

  11. PAPNC, a novel method to calculate nucleotide diversity from large scale next generation sequencing data

    PubMed Central

    Shao, Wei; Kearney, Mary F.; Boltz, Valerie F.; Spindler, Jonathan E.; Mellors, John W.; Maldarelli, Frank; Coffin, John M.

    2014-01-01

    Estimating viral diversity in infected patients can provide insight into pathogen evolution and emergence of drug resistance. With the widespread adoption of deep sequencing, it is important to develop tools to accurately calculate population diversity from very large datasets. Current methods for estimating diversity that are based on multiple alignments are not practical to apply to such data. In this study, the authors report a novel method (Pairwise Alignment Positional Nucleotide Counting, PAPNC) for estimating population diversity from 454 sequence data. The diversity measurements determined using this method were comparable to those calculated by average pairwise difference (APD) of multiply aligned sequences using MEGA5. Diversities were estimated for 9 patient plasma HIV samples sequenced with Titanium 454 technology and by single-genome sequencing (SGS). Diversities calculated from deep sequencing using PAPNC ranged from 0.002 to 0.021 while APD measurements calculated from SGS data ranged approximately from 0.001 to 0.018, with the difference being attributable to PCR error (contributing background diversity of 0.0016 in a control sample). Comparison of APDs estimated from 100 sets of sequences drawn at random from 454 generated data and from corresponding SGS data showed very close correlation between the two methods with R² of 0.96, and differing on average by about 1% (after correction for PCR error). The authors have developed a novel method that is good for calculating genetic diversities for large scale datasets from next generation sequencing. It can be implemented easily as a function in available variation calling programs like SAMtools or haplotype reconstruction software for nucleotide genetic diversity calculation. A Perl script implementing this method is available upon request. PMID:24681054

  12. PAPNC, a novel method to calculate nucleotide diversity from large scale next generation sequencing data.

    PubMed

    Shao, Wei; Kearney, Mary F; Boltz, Valerie F; Spindler, Jonathan E; Mellors, John W; Maldarelli, Frank; Coffin, John M

    2014-07-01

    Estimating viral diversity in infected patients can provide insight into pathogen evolution and emergence of drug resistance. With the widespread adoption of deep sequencing, it is important to develop tools to accurately calculate population diversity from very large datasets. Current methods for estimating diversity that are based on multiple alignments are not practical to apply to such data. In this study, the authors report a novel method (Pairwise Alignment Positional Nucleotide Counting, PAPNC) for estimating population diversity from 454 sequence data. The diversity measurements determined using this method were comparable to those calculated by average pairwise difference (APD) of multiply aligned sequences using MEGA5. Diversities were estimated for 9 patient plasma HIV samples sequenced with Titanium 454 technology and by single-genome sequencing (SGS). Diversities calculated from deep sequencing using PAPNC ranged from 0.002 to 0.021 while APD measurements calculated from SGS data ranged approximately from 0.001 to 0.018, with the difference being attributable to PCR error (contributing background diversity of 0.0016 in a control sample). Comparison of APDs estimated from 100 sets of sequences drawn at random from 454 generated data and from corresponding SGS data showed very close correlation between the two methods with R(2) of 0.96, and differing on average by about 1% (after correction for PCR error). The authors have developed a novel method that is good for calculating genetic diversities for large scale datasets from next generation sequencing. It can be implemented easily as a function in available variation calling programs like SAMtools or haplotype reconstruction software for nucleotide genetic diversity calculation. A Perl script implementing this method is available upon request.
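
    The average pairwise difference (APD) used as the reference measure in the two PAPNC records above can be sketched as follows, together with a per-column counting formulation that gives the same number without enumerating sequence pairs. This is a generic illustration and not the authors' PAPNC script: it assumes equal-length, gap-free aligned sequences, and the function names and toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def average_pairwise_difference(seqs):
    """Mean fraction of differing sites over all pairs of aligned sequences."""
    total = 0.0
    n_pairs = 0
    for a, b in combinations(seqs, 2):
        total += sum(x != y for x, y in zip(a, b)) / len(a)
        n_pairs += 1
    return total / n_pairs


def positional_diversity(seqs):
    """Same quantity computed from per-column nucleotide counts.

    For each column, the number of differing pairs is the total number of
    pairs minus the pairs that share the same nucleotide.
    """
    n = len(seqs)
    all_pairs = n * (n - 1) / 2
    total = 0.0
    for column in zip(*seqs):
        same_pairs = sum(c * (c - 1) / 2 for c in Counter(column).values())
        total += (all_pairs - same_pairs) / all_pairs
    return total / len(seqs[0])


# Hypothetical aligned reads.
reads = ["ACGT", "ACGA", "ACTT"]
apd = average_pairwise_difference(reads)
by_position = positional_diversity(reads)
```

    The pairwise version costs time proportional to the square of the number of reads, while the column-count version is linear in the number of reads, which is why counting by position is attractive for deep-sequencing datasets.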

  13. The mouse collagen X gene: complete nucleotide sequence, exon structure and expression pattern.

    PubMed Central

    Elima, K; Eerola, I; Rosati, R; Metsäranta, M; Garofalo, S; Perälä, M; De Crombrugghe, B; Vuorio, E

    1993-01-01

    Overlapping genomic clones covering the 7.2 kb mouse alpha 1(X) collagen gene, 0.86 kb of promoter and 1.25 kb of 3'-flanking sequences were isolated from two genomic libraries and characterized by nucleotide sequencing. Typical features of the gene include a unique three-exon structure, similar to that in the chick gene, with the entire triple-helical domain of 463 amino acids coded by a single large exon. The highest degree of amino acid and nucleotide sequence conservation was seen in the coding region for the collagenous and C-terminal non-collagenous domains between the mouse and known chick, bovine and human collagen type X sequences. More divergence between the sequences occurred in the N-terminal non-collagenous domain. Similarity between the mammalian collagen X sequences extended into the 3'-untranslated sequence, particularly near the polyadenylation site. The promoter of the mouse collagen X gene was found to contain two TATAA boxes 159 bp apart; primer extension analyses of the transcription start site revealed that both were functional. The promoter has an unusual structure with a very low G + C content of 28% between positions -220 and -1 of the upstream transcription start site. Northern and in situ hybridization analyses confirmed that the expression of the alpha 1(X) collagen gene is restricted to hypertrophic chondrocytes in tissues undergoing endochondral calcification. The detailed sequence information of the gene is useful for studies on the promoter activity of the gene and for generation of transgenic mice. Images Figure 3 Figure 5 Figure 6 PMID:8424763

  14. Completion sequence and cloning of the infectious cDNA of a chb isolate of cucumber green mottle mosaic virus.

    PubMed

    Zhong, M; Zhao, X; Liu, Y; Wang, Y; Cao, K

    2015-03-01

    Cucumber green mottle mosaic virus (CGMMV) is an important and widespread seed-borne virus that infects Cucurbitaceous plants. It is a member of the genus Tobamovirus in the family Virgaviridae with a monopartite (+) ssRNA genome. Here we report the complete genome sequence, construction and testing of the infectious clones of a chb isolate of CGMMV. Full-length CGMMV cDNA was cloned into the vector pUC19. The linearized vector containing full-length cDNA was used as template for in vitro transcription, and the synthesized capped transcript was highly infectious in Chenopodium amaranticolor and cucumber (Cucumis sativus). Inoculated plants showed symptoms typical of CGMMV infection. The infectivity was confirmed by mechanical transmission to new plants, RT-PCR and western blot. Progeny virus derived from infectious transcripts had the same biological and biochemical properties as wild-type virus. To our knowledge, this is the first detailed report of a biologically active transcript from CGMMV.

  15. Nucleotide sequence of the internal transcribed spacers and 5.8S region of ribosomal DNA in Pinus pinea L.

    PubMed

    Marrocco, R; Gelati, M T; Maggini, F

    1996-01-01

    The nucleotide sequences of the first internal transcribed spacer (ITS1) belonging to different ribosomal RNA genes from Pinus pinea are reported. The analyzed ITS1 regions can be distinguished on the basis of their length, one being 2631 bp and the other 271 bp long. Nucleotide comparison of these regions did not show appreciable sequence homology. The larger ITS1 contains five tandemly arranged subrepeats with sizes ranging between 219 bp and 237 bp. The nucleotide sequences of the 5.8S and the ITS2 regions belonging to the larger ribosomal RNA gene are also reported.

  16. The complete nucleotide sequence of the egg drop syndrome virus: an intermediate between mastadenoviruses and aviadenoviruses.

    PubMed

    Hess, M; Blöcker, H; Brandt, P

    1997-11-10

    The complete nucleotide sequence of an avian adenovirus, the egg drop syndrome (EDS) virus, was determined. The total genome length is 33,213 nucleotides, resulting in a molecular weight of 21.9 x 10(6). The GC content is only 42.5%. Between map units 3.5 and 76.9, the distribution of open reading frames with homology to known genes is similar to that reported for other mammalian and avian adenoviruses. However, no homologies to adenovirus genes such as E1A, pIX, pV, and E3 could be found. Outside this region, several open reading frames were identified without any obvious homology to known adenovirus proteins. In the region organized similarly as other adenoviral genomes, most homologies were found to an ovine adenovirus (OAV strain 287). The highest level of amino acid identity was found for the hexon proteins of EDS and OAV. The virus-associated RNA (VA RNA) was identified thanks to the homology with the VA RNA of fowl adenovirus serotype 1 (FAV1). Similarities with FAV1 were also found in the fiber protein. Our results demonstrate that the avian EDS virus represents an intermediate between mammalian and avian adenoviruses. The nucleotide sequence and genomic organization of the EDS virus reflect the heterogeneity of the aviadenovirus genus and the Adenoviridae family.
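
    The genome length and molecular weight quoted in this abstract can be cross-checked with simple arithmetic. The abstract does not say how the molecular weight was derived; the sketch below assumes the commonly used approximation of about 660 Da per base pair of double-stranded DNA, and the constant and function names are illustrative.

```python
GENOME_LENGTH_BP = 33_213      # from the abstract
GC_FRACTION = 0.425            # from the abstract
AVG_BP_MASS_DA = 660.0         # ASSUMPTION: approximate average mass of one base pair


def dsdna_mass_da(length_bp, avg_bp_mass=AVG_BP_MASS_DA):
    """Approximate mass of a double-stranded DNA molecule in daltons."""
    return length_bp * avg_bp_mass


genome_mass = dsdna_mass_da(GENOME_LENGTH_BP)
# Approximate number of G:C base pairs implied by the reported GC content.
gc_base_pairs = round(GENOME_LENGTH_BP * GC_FRACTION)
```

    Under that assumption the mass comes to about 21.9 million daltons, in line with the figure of 21.9 x 10(6) given in the abstract.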

  17. PEG-Labeled Nucleotides and Nanopore Detection for Single Molecule DNA Sequencing by Synthesis

    PubMed Central

    Kumar, Shiv; Tao, Chuanjuan; Chien, Minchen; Hellner, Brittney; Balijepalli, Arvind; Robertson, Joseph W. F.; Li, Zengmin; Russo, James J.; Reiner, Joseph E.; Kasianowicz, John J.; Ju, Jingyue

    2012-01-01

    We describe a novel single molecule nanopore-based sequencing by synthesis (Nano-SBS) strategy that can accurately distinguish four bases by detecting 4 different sized tags released from 5′-phosphate-modified nucleotides. The basic principle is as follows. As each nucleotide is incorporated into the growing DNA strand during the polymerase reaction, its tag is released and enters a nanopore in release order. This produces a unique ionic current blockade signature due to the tag's distinct chemical structure, thereby determining DNA sequence electronically at single molecule level with single base resolution. As proof of principle, we attached four different length PEG-coumarin tags to the terminal phosphate of 2′-deoxyguanosine-5′-tetraphosphate. We demonstrate efficient, accurate incorporation of the nucleotide analogs during the polymerase reaction, and excellent discrimination among the four tags based on nanopore ionic currents. This approach coupled with polymerase attached to the nanopores in an array format should yield a single-molecule electronic Nano-SBS platform. PMID:23002425

  18. PEG-labeled nucleotides and nanopore detection for single molecule DNA sequencing by synthesis.

    PubMed

    Kumar, Shiv; Tao, Chuanjuan; Chien, Minchen; Hellner, Brittney; Balijepalli, Arvind; Robertson, Joseph W F; Li, Zengmin; Russo, James J; Reiner, Joseph E; Kasianowicz, John J; Ju, Jingyue

    2012-01-01

    We describe a novel single molecule nanopore-based sequencing by synthesis (Nano-SBS) strategy that can accurately distinguish four bases by detecting 4 different sized tags released from 5'-phosphate-modified nucleotides. The basic principle is as follows. As each nucleotide is incorporated into the growing DNA strand during the polymerase reaction, its tag is released and enters a nanopore in release order. This produces a unique ionic current blockade signature due to the tag's distinct chemical structure, thereby determining DNA sequence electronically at single molecule level with single base resolution. As proof of principle, we attached four different length PEG-coumarin tags to the terminal phosphate of 2'-deoxyguanosine-5'-tetraphosphate. We demonstrate efficient, accurate incorporation of the nucleotide analogs during the polymerase reaction, and excellent discrimination among the four tags based on nanopore ionic currents. This approach coupled with polymerase attached to the nanopores in an array format should yield a single-molecule electronic Nano-SBS platform.

  19. Molecular cloning and sequencing of a cDNA encoding the thioesterase domain of the rat fatty acid synthetase.

    PubMed

    Naggert, J; Witkowski, A; Mikkelsen, J; Smith, S

    1988-01-25

    A cloned cDNA containing the entire coding sequence for the long-chain S-acyl fatty acid synthetase thioester hydrolase (thioesterase I) component as well as the 3'-noncoding region of the fatty acid synthetase has been isolated using an expression vector and domain-specific antibodies. The coding region was assigned to the thioesterase I domain by identification of sequences coding for characterized peptide fragments, amino-terminal analysis of the isolated thioesterase I domain and the presence of the serine esterase active-site sequence motif. The thioesterase I domain is 306 amino acids long with a calculated molecular mass of 33,476 daltons; its DNA is flanked at the 5'-end by a region coding for the acyl carrier protein domain and at the 3'-end by a 1,537-base pairs-long noncoding sequence with a poly(A) tail. The thioesterase I domain exhibits a low, albeit discernible, homology with the discrete medium-chain S-acyl fatty acid synthetase thioester hydrolases (thioesterase II) from rat mammary gland and duck uropygial gland, suggesting a distant but common evolutionary ancestry for these proteins.

  20. Molecular cloning and sequence analysis of a novel chalcone synthase cDNA from Ginkgo biloba.

    PubMed

    Pang, Yongzhen; Shen, Guo-An; Liu, Chenghong; Liu, Xiaojun; Tan, Feng; Sun, Xiaofen; Tang, Kexuan

    2004-08-01

    A chalcone synthase (CHS) gene was cloned from Ginkgo biloba for the first time and it was also the first cloned gene involved in the flavonoid metabolic pathway in G. biloba. The full-length cDNA of G. biloba CHS (designated as Gbchs) was 1608 bp with poly(A) tailing and it contained a 1173 bp open reading frame (ORF) encoding a 391 amino acid protein. Gbchs was found to have extensive homology with other plant chs genes via multiple alignments. The active sites of the CoA binding, coumaroyl pocket and cyclization pocket in the CHS protein of Medicago sativa were also found in GbCHS. Molecular modeling of GbCHS indicated that the three-dimensional structure of GbCHS strongly resembled that of M. sativa (MsCHS2), implying GbCHS may have functions similar to those of MsCHS2. Phylogenetic tree analysis revealed that GbCHS was more closely related to CHSs from gymnosperm plants than to those from other plants. Gbchs is a useful tool to study the regulation of flavonoid metabolism in G. biloba.

  1. Isolation, characterization, and cDNA sequencing of alpha-1-antiproteinase-like protein from rainbow trout seminal plasma.

    PubMed

    Mak, Monika; Mak, Paweł; Olczak, Mariusz; Szalewicz, Agata; Glogowski, Jan; Dubin, Adam; Watorek, Wiesław; Ciereszko, Andrzej

    2004-03-17

    Seminal plasma of teleost fish contains serine proteinase inhibitors related to those present in blood. These inhibitors can be bound to Q-Sepharose and sequentially eluted with a NaCl gradient. In the present study, using a two-step procedure, we purified (73-fold to homogeneity) and characterized the inhibitor eluted as the second fraction of antitrypsin activity (inhibitor II) from Q-Sepharose. The molecular weight of this inhibitor was estimated to be 56 kDa with an isoelectric point of 5.4. It effectively inhibited trypsin and chymotrypsin but was less effective against elastase. It formed SDS-stable complexes with cod and bovine trypsin. Inhibitor II appeared to be a glycoprotein. Carbohydrate content was determined to be 16%. N-terminal Edman sequencing allowed identification of the first 30 N-terminal amino acids HDGDHAGHTEDHHHHLHHIAGEAHPQHSHG and 25 amino acids within the reactive loop IMPMSLPDTIMLNRPFLLFILEDST. The N-terminal sequence did not match any known sequence, however, the sequence within the reactive loop was significantly similar to carp and mammalian alpha1-antiproteinases. Both sequences were used to construct primers and obtain a cDNA sequence from liver. The mRNA coding the protein is 1675 nt in length including a single open reading frame of 1281 nt that encodes 426 amino acid residues. Analysis of this sequence indicated the presence of putative conserved serpin domains and confirmed the similarity to carp alpha1-antiproteinase and mammalian alpha1-antiproteinase. Our results indicate that inhibitor II belongs to the serpin superfamily and is similar to alpha1-antiproteinase.
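
    The lengths quoted above are mutually consistent if the 1281 nt open reading frame is taken to include the stop codon. The following sketch works through that arithmetic; the function name and the stop-codon assumption are illustrative and are not taken from the abstract.

```python
def orf_summary(mrna_nt: int, orf_nt: int) -> tuple[int, int, int]:
    """Return (codons, residues, untranslated_nt) for an mRNA and its ORF.

    ASSUMPTION: the ORF length counts the terminal stop codon, so the
    encoded protein has one residue fewer than the number of codons.
    """
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    if orf_nt > mrna_nt:
        raise ValueError("ORF cannot be longer than the mRNA")
    codons = orf_nt // 3
    residues = codons - 1            # drop the stop codon
    untranslated = mrna_nt - orf_nt  # 5' plus 3' untranslated nucleotides
    return codons, residues, untranslated


codons, residues, untranslated = orf_summary(mrna_nt=1675, orf_nt=1281)
print(codons, residues, untranslated)  # prints "427 426 394"
```

    With these inputs the 426 residues match the abstract, leaving 394 nt of untranslated sequence in the 1675 nt mRNA.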

  2. Nucleotide sequence of the cell wall proteinase gene of Streptococcus cremoris Wg2.

    PubMed Central

    Kok, J; Leenhouts, K J; Haandrikman, A J; Ledeboer, A M; Venema, G

    1988-01-01

    A 6.5-kilobase HindIII fragment that specifies the proteolytic activity of Streptococcus cremoris Wg2 was sequenced entirely. The nucleotide sequence revealed two open reading frames (ORFs), a small ORF1 with 295 codons and a large ORF2 containing 1,772 codons. For both ORFs, there was no stop codon on the HindIII fragment. A partially overlapping PstI fragment was used to locate the translation stop of the large ORF2. The entire ORF2 contained 1,902 coding triplets, followed by an apparently rho-independent terminator sequence. The inferred amino acid sequence would result in a protein of 200 kilodaltons. Both ORFs have their putative transcription and translation signals in a 345-base-pair ClaI fragment. ORF2 is preceded by a promoter region containing a 15-base-pair complementary direct repeat. Both the truncated 33- and the 200-kilodalton proteins have a signal peptide-like N-terminal amino acid sequence. The protein specified by ORF2 contained regions of extensive homology with serine proteases of the subtilisin family. Specifically, amino acid sequences involved in the formation of the active site (viz., Asp-32, His-64, and Ser-221 of the subtilisins) are well conserved in the S. cremoris Wg2 proteinase. The homologous sequences are separated by nonhomologous regions which contain several inserts, most notably a sequence of approximately 200 amino acids between the His and Ser residues of the active site. PMID:3278687

  3. Mouse Mammary Tumor Virus-Like Nucleotide Sequences in Canine and Feline Mammary Tumors

    PubMed Central

    Hsu, Wei-Li; Lin, Hsing-Yi; Chiou, Shyan-Song; Chang, Chao-Chin; Wang, Szu-Pong; Lin, Kuan-Hsun; Chulakasian, Songkhla; Wong, Min-Liang; Chang, Shih-Chieh

    2010-01-01

    Mouse mammary tumor virus (MMTV) has been speculated to be involved in human breast cancer. Companion animals, dogs, and cats with intimate human contacts may contribute to the transmission of MMTV between mouse and human. The aim of this study was to detect MMTV-like nucleotide sequences in canine and feline mammary tumors by nested PCR. Results showed that the presence of MMTV-like env and LTR sequences in canine malignant mammary tumors was 3.49% (3/86) and 18.60% (16/86), respectively. For feline malignant mammary tumors, the presence of both env and LTR sequences was found to be 22.22% (2/9). Nevertheless, the MMTV-like LTR and env sequences also were detected in normal mammary glands of dogs and cats. In comparisons of the MMTV-like DNA sequences of our findings to those of NIH 3T3 (MMTV-positive murine cell line) and human breast cancer cells, the sequence similarities ranged from 94 to 98%. Phylogenetic analysis revealed that intermixing among sequences identified from tissues of different hosts, i.e., mouse, dog, cat, and human, indicated the MMTV-like DNA existing in these hosts. Moreover, the env transcript was detected in 1 of the 19 MMTV-positive samples by reverse transcription-PCR. Taken together, our study provides evidence for the existence and expression of MMTV-like sequences in neoplastic and normal mammary glands of dogs and cats. PMID:20881168
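
    The detection rates quoted above follow directly from the sample counts. A minimal sketch of that arithmetic, using only the counts given in the abstract (the dictionary labels are illustrative):

```python
# Positive samples and totals as reported in the abstract.
detections = {
    "canine env": (3, 86),
    "canine LTR": (16, 86),
    "feline env and LTR": (2, 9),
}


def percent_positive(positive: int, total: int) -> str:
    """Format a detection rate as a percentage with two decimals."""
    return f"{100 * positive / total:.2f}%"


rates = {label: percent_positive(k, n) for label, (k, n) in detections.items()}
for label, rate in rates.items():
    print(f"{label}: {rate}")
```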

  4. Mouse mammary tumor virus-like nucleotide sequences in canine and feline mammary tumors.

    PubMed

    Hsu, Wei-Li; Lin, Hsing-Yi; Chiou, Shyan-Song; Chang, Chao-Chin; Wang, Szu-Pong; Lin, Kuan-Hsun; Chulakasian, Songkhla; Wong, Min-Liang; Chang, Shih-Chieh

    2010-12-01

    Mouse mammary tumor virus (MMTV) has been speculated to be involved in human breast cancer. Companion animals, dogs, and cats with intimate human contacts may contribute to the transmission of MMTV between mouse and human. The aim of this study was to detect MMTV-like nucleotide sequences in canine and feline mammary tumors by nested PCR. Results showed that the presence of MMTV-like env and LTR sequences in canine malignant mammary tumors was 3.49% (3/86) and 18.60% (16/86), respectively. For feline malignant mammary tumors, the presence of both env and LTR sequences was found to be 22.22% (2/9). Nevertheless, the MMTV-like LTR and env sequences also were detected in normal mammary glands of dogs and cats. In comparisons of the MMTV-like DNA sequences of our findings to those of NIH 3T3 (MMTV-positive murine cell line) and human breast cancer cells, the sequence similarities ranged from 94 to 98%. Phylogenetic analysis revealed that intermixing among sequences identified from tissues of different hosts, i.e., mouse, dog, cat, and human, indicated the MMTV-like DNA existing in these hosts. Moreover, the env transcript was detected in 1 of the 19 MMTV-positive samples by reverse transcription-PCR. Taken together, our study provides evidence for the existence and expression of MMTV-like sequences in neoplastic and normal mammary glands of dogs and cats.

  5. Cloning, nucleotide sequence, and expression of the Pasteurella haemolytica A1 glycoprotease gene.

    PubMed Central

    Abdullah, K M; Lo, R Y; Mellors, A

    1991-01-01

    Pasteurella haemolytica serotype A1 secretes a glycoprotease which is specific for O-sialoglycoproteins such as glycophorin A. The gene encoding the glycoprotease enzyme has been cloned in the recombinant plasmid pPH1, and its nucleotide sequence has been determined. The gene (designated gcp) codes for a protein of 35.2 kDa, and an active enzyme protein of this molecular mass can be observed in Escherichia coli clones carrying pPH1. In vivo labeling of plasmid-encoded proteins in E. coli maxicells demonstrated the expression of a 35-kDa protein from pPH1. The amino-terminal sequence of the heterologously expressed protein corresponds to that predicted from the nucleotide sequence. The glycoprotease is a neutral metalloprotease, and the predicted amino acid sequence of the glycoprotease contains a putative zinc-binding site. The gene shows no significant homology with the genes for other proteases of procaryotic or eucaryotic origin. However, there is substantial homology between gcp and an E. coli gene, orfX, whose product is believed to function in the regulation of macromolecule biosynthesis. PMID:1885539

  6. Total chemical synthesis of a 77-nucleotide-long RNA sequence having methionine-acceptance activity.

    PubMed Central

    Ogilvie, K K; Usman, N; Nicoghosian, K; Cedergren, R J

    1988-01-01

    Chemical synthesis is described of a 77-nucleotide-long RNA molecule that has the sequence of an Escherichia coli Ado-47-containing tRNA(fMet) species in which the modified nucleosides have been substituted by their unmodified parent nucleosides. The sequence was assembled on a solid-phase, controlled-pore glass support in a stepwise manner with an automated DNA synthesizer. The ribonucleotide building blocks used were fully protected 5'-monomethoxytrityl-2'-silyl-3'-N,N-diisopropylaminophosphoramidites. p-Nitrophenylethyl groups were used to protect the O6 of guanine residues. The fully deprotected tRNA analogue was characterized by polyacrylamide gel electrophoresis (sizing), terminal nucleotide analysis, sequencing, and total enzyme degradation, all of which indicated that the sequence was correct and contained only 3'-5' linkages. The 77-mer was then assayed for amino acid acceptor activity by using E. coli methionyl-tRNA synthetase. The results indicated that the synthetic product, lacking modified bases, is a substrate for the enzyme and has an amino acid acceptance 11% of that of the major native species, tRNA(fMet) containing 7-methylguanosine at position 47. PMID:3413059

  7. Mitochondrial DNA in the sea urchin Arbacia lixula: evolutionary inferences from nucleotide sequence analysis.

    PubMed

    De Giorgi, C; Lanave, C; Musci, M D; Saccone, C

    1991-07-01

    From the stirodont Arbacia lixula we determined the sequence of 5,127 nucleotides of mitochondrial DNA (mtDNA) encompassing 18 tRNAs, two complete coding genes, parts of three other coding genes, and part of the 12S ribosomal RNA (rRNA). The sequence confirms that the organization of mtDNA is conserved within echinoids. Furthermore, it underlines the following peculiar features of sea urchin mtDNA: the clustering of tRNAs, the short noncoding regulatory sequence, and the separation by the ND1 and ND2 genes of the two rRNA genes. Comparison with the orthologous sequences from the camarodont species Paracentrotus lividus and Strongylocentrotus purpuratus revealed that (1) echinoids have an extra piece on the amino terminus of the ND5 gene that is probably the remnant of an old leucine tRNA gene; (2) third-position codon nucleotide usage has diverged between A. lixula and the camarodont species to a significant extent, implying different directional mutational pressures; and (3) the stirodont-camarodont divergence occurred twice as long ago as did the P. lividus-S. purpuratus divergence.

  8. The complete nucleotide sequence and genome organization of a novel carmovirus - Honeysuckle ringspot virus isolated from honeysuckle.

    USDA-ARS's Scientific Manuscript database

    A virus associated with yellow to purple ringspot on honeysuckle plants has been detected and tentatively named as Honeysuckle ringspot virus (HnRSV). The complete nucleotide sequence of HnRSV has been determined from infected honeysuckle. The genomic RNA of HnRSV is 3,956 nucleotides in length and ...

  9. Identification and analysis of safener-inducible expressed sequence tags in Populus using a cDNA microarray.

    PubMed

    Rishi, A S; Munir, Shirin; Kapur, Vivek; Nelson, Neil D; Goyal, Arun

    2004-12-01

    Safeners are the chemicals used to protect plants from detrimental effects of herbicides, but their mode of action at the molecular level is not well understood. As an initial step towards understanding the molecular mechanism of safener action in trees, homologous genes in hybrid poplar (Populus nigra x Populus maximowiczii) that were induced by a safener were identified. We here describe the identification of differentially expressed genes in Populus that are induced by Concep-III, a herbicide safener. Expressed sequence tags (ESTs) enriched for transcriptionally induced genes were isolated by suppressive subtractive hybridization (SSH). The SSH library cDNA inserts were used to construct a cDNA microarray for high-throughput validation of the up-regulated expression of safener-induced genes. Single-pass and partial sequences of 1,344 safener-induced ESTs were assembled into 418 singletons and 328 clusters, but the putative functions of almost 53% of the ESTs are not known. Genes encoding proteins involved in all three different phases of safener action, viz., oxidation, conjugation, and sequestration, were found in the SSH library. Almost 75% of genes that showed greater than 2-fold expression upon safener treatment were redundant in the SSH library. The expression pattern for selected genes was validated by reverse transcription-polymerase chain reaction. A few safener-induced genes that were not previously reported to be induced by safeners, but which may have a role in herbicide metabolism, were identified. The newly identified genes could have potential for application in genetic engineering of plants for herbicide detoxification and tolerance.

  10. The complete nucleotide sequence of goat (Capra hircus) mitochondrial genome. Goat mitochondrial genome.

    PubMed

    Parma, Pietro; Feligini, Maria; Greppi, Gianfranco; Enne, Giuseppe

    2003-06-01

    The goat mtDNA sequences reported to date are fragmentary. By using both an in silico cloning procedure and conventional molecular biology techniques, we have determined the complete nucleotide sequence of the goat (Capra hircus) mitochondrial genome. The length of the sequence was 16,640 bp. Genes for the 12S and 16S rRNAs, 22 tRNAs and 13 protein-coding regions were found. The genome organization conforms to that of other mitochondrial genomes. Comparison of the 13 protein-coding genes of goat, cow and sheep reveals that the differences range from 1.2 to 12.2% (mean 7.3%) between goat and cow and from 0 to 15.6% (mean 4.7%) between goat and sheep.

  11. Nucleotide sequence of yeast GDH1 encoding nicotinamide adenine dinucleotide phosphate-dependent glutamate dehydrogenase.

    PubMed

    Moye, W S; Amuro, N; Rao, J K; Zalkin, H

    1985-07-15

    The yeast GDH1 gene encodes NADP-dependent glutamate dehydrogenase. This gene was isolated by complementation of an Escherichia coli glutamate auxotroph. NADP-dependent glutamate dehydrogenase was overproduced 6-10-fold in Saccharomyces cerevisiae bearing GDH1 on a multicopy plasmid. The nucleotide sequence of the 1362-base pair coding region and 5' and 3' flanking sequences were determined. Transcription start sites were located by S1 nuclease mapping. Regulation of GDH1 was not maintained when the gene was present on a multicopy plasmid. Protein secondary structure predictions identified a region with potential to form the dinucleotide-binding domain. The amino acid sequences of the yeast and Neurospora crassa enzymes are 63% conserved. Unlike the N. crassa gene, yeast GDH1 has no introns.

  12. Conservation of nucleotide sequences for molecular diagnosis of Middle East respiratory syndrome coronavirus, 2015.

    PubMed

    Furuse, Yuki; Okamoto, Michiko; Oshitani, Hitoshi

    2015-11-01

    Infection due to the Middle East respiratory syndrome coronavirus (MERS-CoV) is widespread. The present study was performed to assess the protocols used for the molecular diagnosis of MERS-CoV by analyzing the nucleotide sequences of viruses detected between 2012 and 2015, including sequences from the large outbreak in eastern Asia in 2015. Although the diagnostic protocols were established only 2 years ago, mismatches between the sequences of primers/probes and viruses were found for several of the assays. Such mismatches could lead to a lower sensitivity of the assay, thereby leading to false-negative diagnosis. A slight modification in the primer design is suggested. Protocols for the molecular diagnosis of viral infections should be reviewed regularly after they are established, particularly for viruses that pose a great threat to public health such as MERS-CoV.
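
    The kind of primer-to-virus mismatch check described above can be sketched as follows. This is a simplified illustration, not the authors' protocol: it compares a primer with the forward strand only, ignores degenerate bases and reverse primers, and uses made-up toy sequences rather than real MERS-CoV or assay sequences.

```python
def count_mismatches(primer: str, site: str) -> int:
    """Count positions at which a primer differs from an equal-length site."""
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    return sum(1 for p, s in zip(primer.upper(), site.upper()) if p != s)


def best_site(primer: str, genome: str) -> tuple[int, int]:
    """Slide the primer along the genome and return (position, mismatches)
    for the best-matching window (first window wins on ties)."""
    if len(primer) > len(genome):
        raise ValueError("primer is longer than the genome")
    windows = range(len(genome) - len(primer) + 1)
    scored = ((pos, count_mismatches(primer, genome[pos:pos + len(primer)]))
              for pos in windows)
    return min(scored, key=lambda item: item[1])


# Hypothetical toy data: the target site differs from the primer at one base.
toy_primer = "ACGTTGCA"
toy_genome = "GGGACGTAGCATTT"
position, mismatches = best_site(toy_primer, toy_genome)
print(position, mismatches)  # prints "3 1"
```

    A non-zero mismatch count at the best site is the kind of signal that would prompt a review of the primer design.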

  13. Identification of shark species in seafood products by forensically informative nucleotide sequencing (FINS).

    PubMed

    Blanco, M; Pérez-Martín, R I; Sotelo, C G

    2008-11-12

    The identification of commercial shark species is a relevant issue to ensure the correct labeling of seafood products, to maintain consumer confidence in seafood, and to enhance the knowledge of the species and volumes that are at present being captured, thus improving the management of shark fisheries. The polymerase chain reaction was employed to obtain a 423 bp amplicon from the mitochondrial cytochrome b gene. The sequences from this fragment, belonging to 63 authentic individuals of 23 species, were analyzed using a genetic distance method. Nine different samples of commercial fresh, frozen, and convenience food were obtained in local and international markets to validate the methodology. These samples were analyzed, and sequences were employed for species identification, showing that forensically informative nucleotide sequencing (FINS) is a suitable technique for identification of processed seafood containing shark as an ingredient. The results also showed that incorrect labeling practices may occur regarding shark products, probably because of incorrect labeling at the production point.
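
    The abstract does not say which genetic distance method was used. As a minimal sketch of the general idea, the example below assigns a query sequence to the reference with the smallest uncorrected p-distance (the proportion of differing sites). The species labels and the short aligned sequences are hypothetical placeholders, not cytochrome b data.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences,
    ignoring columns in which either sequence has a gap ('-')."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def identify(query: str, references: dict[str, str]) -> tuple[str, float]:
    """Return (reference_name, distance) for the closest reference."""
    scored = ((name, p_distance(query, seq)) for name, seq in references.items())
    return min(scored, key=lambda item: item[1])


# Hypothetical aligned reference fragments and an unknown sample.
toy_references = {
    "species_A": "ATGGCCCTAA",
    "species_B": "ATGACTCTGA",
    "species_C": "TTGGCACTAA",
}
toy_query = "ATGGCCCTGA"
name, distance = identify(toy_query, toy_references)
print(name, distance)  # prints "species_A 0.1"
```

    In practice a distance-based identification would also need a threshold for rejecting samples that are not close to any reference; that step is omitted here.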

  14. Nucleotide sequence of the bean strain of southern bean mosaic virus.

    PubMed

    Othman, Y; Hull, R

    1995-01-10

    The genome of the bean strain of southern bean mosaic virus (SBMV-B) comprises 4109 nucleotides and thus is slightly shorter than those of the two other sequenced sobemoviruses (southern bean mosaic virus, cowpea strain (SBMV-C) and rice yellow mottle virus (RYMV)). SBMV-B has an overall sequence similarity with SBMV-C of 55% and with RYMV of 45%. Three potential open reading frames (ORFs) were recognized in SBMV-B which were in similar positions in the genomes of SBMV-C and RYMV. However, there was no analog of SBMV-C and RYMV ORF 3. From a comparison of the predicted sequences of the ORFs of these three sobemoviruses and of the noncoding regions, it is suggested that the two SBMV strains differ from one another as much as they do from RYMV and that they should be considered as different viruses.

  15. Nucleotide sequence of a satellite RNA associated with carrot motley dwarf in parsley and carrot.

    PubMed

    Menzel, Wulf; Maiss, Edgar; Vetten, H Josef

    2009-02-01

    Carrot motley dwarf (CMD) is known to result from a mixed infection by two viruses, the polerovirus Carrot red leaf virus and one of the umbraviruses Carrot mottle mimic virus or Carrot mottle virus. Some umbraviruses have been shown to be associated with small satellite (sat) RNAs, but none have been reported for the latter two. A CMD-affected parsley plant was used for sap transmission to test plants, which were then used for dsRNA isolation. The presence of a 0.8-kbp dsRNA indicated the occurrence of a hitherto unrecognized satRNA associated with CMD. The satRNAs of the CMD isolate from parsley and an isolate from carrot have been sequenced and showed 94% sequence identity. Nucleotide sequences and putative translation products had no significant similarities to GenBank entries. To our knowledge, this is the first report of satRNAs associated with CMD.

  16. Single nucleotide polymorphisms from Theobroma cacao expressed sequence tags associated with witches' broom disease in cacao.

    PubMed

    Lima, L S; Gramacho, K P; Carels, N; Novais, R; Gaiotto, F A; Lopes, U V; Gesteira, A S; Zaidan, H A; Cascardo, J C M; Pires, J L; Micheli, F

    2009-07-14

    In order to increase the efficiency of cacao tree resistance to witches' broom disease, which is caused by Moniliophthora perniciosa (Tricholomataceae), we looked for molecular markers that could help in the selection of resistant cacao genotypes. Among the different markers useful for developing marker-assisted selection, single nucleotide polymorphisms (SNPs) constitute the most common type of sequence difference between alleles and can be easily detected by in silico analysis from expressed sequence tag libraries. We report the first detection and analysis of SNPs from cacao-M. perniciosa interaction expressed sequence tags, using bioinformatics. Selection based on analysis of these SNPs should be useful for developing cacao varieties resistant to this devastating disease.

  17. On the identification of group II introns in nucleotide sequence data.

    PubMed

    Knoop, V; Kloska, S; Brennicke, A

    1994-09-30

    Four different consensus sequences (GTI, group II identifiers) have been derived from domains V of known group II introns and are used as query input sequences for sensitive database screenings with the FASTA and LFASTA programs. The set of four GTI sequences can identify all domains V of the 96 known group II introns in the completely sequenced chloroplast genomes of Marchantia polymorpha, Epifagus virginiana, Oryza sativa, Nicotiana tabacum and the completely sequenced mitochondrial genomes of Saccharomyces cerevisiae, Podospora anserina, Schizosaccharomyces pombe and Marchantia polymorpha. Seven moderately high-scoring hits can easily be rejected as false-positives since they do not fulfil secondary structure requirements. Large FASTA outputs obtained after screening the entire nucleotide sequence database are evaluated in a second step by a program (D5SCAN) that allows the assignment of variable selection criteria for potential domain V secondary structures. Database searches with these routines yield evidence for several group II intron sequences previously unrecognized. These include novel intron structures in the cyanobacterium Synechocystis and in the mitochondrial genomes of Marchantia, soybean, pea, broad bean, sugar beet and a heterobasidiomycete. Potential intron remnants are found contributing to the secondary structure of rRNAs in several trypanosome species. At a given sensitivity of 95% positively identified true domains V, the search routine produces one false positive hit per 10,000 kb.

  18. Nucleotide-sequence-specific de novo methylation in a somatic murine cell line.

    PubMed Central

    Szyf, M; Schimmer, B P; Seidman, J G

    1989-01-01

    DNA fragments encoding the mouse steroid 21-hydroxylase (C21 or Cyp21A1) gene are de novo methylated when introduced into the mouse adrenocortical tumor cell line Y1 by DNA-mediated gene transfer. Although CCGG sequences within the C21 gene are de novo methylated, CCGG sites within flanking vector sequences, other mammalian gene sequences driven by the C21 promoter, and the neomycin-resistance gene, which was cotransfected with the C21 gene, do not become methylated. At least two separate signals for de novo methylation are encoded within the gene since three fragments derived from the C21 gene were methylated de novo. Specific de novo methylation of C21-derived sequences does not occur in L cells or Y1 kin8 cells; this suggests that the cellular factors needed for de novo methylation of the C21 gene are not ubiquitous. Most DNA sequences are not de novo methylated when introduced into somatic cells and DNA sequences other than the C21 gene are not de novo methylated when introduced into Y1 cells. Several groups have suggested that de novo methylation occurs in early embryonic cells and that somatic cells strictly maintain their methylation pattern by a semiconservative methyltransferase. Our results suggest that de novo methylation of specific nucleotide sequences can occur in some mammalian somatic cells. PMID:2789380

  19. Patterns of nucleotide misincorporations during enzymatic amplification and direct large-scale sequencing of ancient DNA.

    PubMed

    Stiller, M; Green, R E; Ronan, M; Simons, J F; Du, L; He, W; Egholm, M; Rothberg, J M; Keates, S G; Ovodov, N D; Antipina, E E; Baryshnikov, G F; Kuzmin, Y V; Vasilevski, A A; Wuenschell, G E; Termini, J; Hofreiter, M; Jaenicke-Després, V; Pääbo, S

    2006-09-12

    Whereas evolutionary inferences derived from present-day DNA sequences are by necessity indirect, ancient DNA sequences provide a direct view of past genetic variants. However, base lesions that accumulate in DNA over time may cause nucleotide misincorporations when ancient DNA sequences are replicated. By repeated amplifications of mitochondrial DNA sequences from a large number of ancient wolf remains, we show that C/G-to-T/A transitions are the predominant type of such misincorporations. Using a massively parallel sequencing method that allows large numbers of single DNA strands to be sequenced, we show that modifications of C, as well as to a lesser extent of G, residues cause such misincorporations. Experiments where oligonucleotides containing modified bases are used as templates in amplification reactions suggest that both of these types of misincorporations can be caused by deamination of the template bases. New DNA sequencing methods in conjunction with knowledge of misincorporation processes have now, in principle, opened the way for the determination of complete genomes from organisms that became extinct during and after the last glaciation.
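
    Tallying misincorporation types of the kind described above amounts to counting substitution classes between a reference and a read. The sketch below is a hypothetical illustration with toy sequences, not the authors' pipeline; it assumes the two sequences are already aligned without gaps.

```python
from collections import Counter


def tally_substitutions(reference: str, read: str) -> Counter:
    """Count each substitution type (e.g. 'C>T') between aligned sequences."""
    if len(reference) != len(read):
        raise ValueError("sequences must be aligned to the same length")
    counts: Counter = Counter()
    for ref_base, read_base in zip(reference.upper(), read.upper()):
        if ref_base != read_base and ref_base in "ACGT" and read_base in "ACGT":
            counts[f"{ref_base}>{read_base}"] += 1
    return counts


def cg_to_ta_fraction(counts: Counter) -> float:
    """Fraction of all substitutions that are C>T or G>A transitions."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (counts["C>T"] + counts["G>A"]) / total


# Toy data: two C>T changes, one G>A change and one A>C change.
toy_reference = "ACGTCCGGAT"
toy_read = "ATGTCTGACT"
counts = tally_substitutions(toy_reference, toy_read)
print(dict(counts), cg_to_ta_fraction(counts))
```

    In this toy case three of the four substitutions fall in the C/G-to-T/A class, giving a fraction of 0.75.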

  20. Muscle coding sequences and their regulation during myogenesis: cloning of muscle actin cDNA probes.

    PubMed

    Minty, A; Caravatti, M; Robert, B; Cohen, A; Daubas, P; Weydert, A; Gros, F; Buckingham, M

    1981-01-01

    For a number of years our group has been mainly interested in the regulation of muscle gene expression during myogenesis. Using primary cultures and cell lines we have tried to find out whether the coding sequences for muscle proteins are already present in an unexpressed form or if there is a transcriptional switch at the onset of differentiation. Metabolic studies on pulse-labelled RNA, together with translation and molecular hybridization experiments have given a certain number of indications. More recently the development of genetic engineering techniques has made it possible to answer these questions directly with probes which are complementary to specific muscle coding sequences. We have identified a plasmid which contains a coding sequence for muscle actin. Other recombinant plasmids are being characterized. Such plasmids, used as probes, will permit us to study the organization and expression of the genes coding for the contractile proteins in muscle cells.

  1. Developing Single Nucleotide Polymorphism (SNP) markers from transcriptome sequences for the identification of longan (Dimocarpus longan) germplasm

    USDA-ARS's Scientific Manuscript database

    Longan (Dimocarpus longan Lour.) is an important tropical fruit tree crop. Accurate varietal identification is essential for germplasm management and breeding. Using longan transcriptome sequences from public databases, we developed single nucleotide polymorphism (SNP) markers; validated 60 SNPs in...

  2. Complete Nucleotide Sequence of an Australian Isolate of Turnip mosaic virus before and after Seven Years of Serial Passaging

    PubMed Central

    Pretorius, Lara; Moyle, Richard L.; Dalton-Morgan, Jessica; Hussein, Nasser

    2016-01-01

    The complete genome sequence of an Australian isolate of Turnip mosaic virus was determined by Sanger sequencing. After seven years of serial passaging by mechanical inoculation, the isolate was resequenced by RNA sequencing (RNA-Seq). Eighteen single nucleotide polymorphisms were identified between the isolates. Both isolates had 96% identity to isolate AUST10. PMID:27856582

  3. cDNA library construction for next-generation sequencing to determine the transcriptional landscape of Legionella pneumophila.

    PubMed

    Sahr, Tobias; Buchrieser, Carmen

    2013-01-01

    The adaptation of Legionella pneumophila to the different conditions it encounters in the environment and in the host is governed by a complex regulatory system. Current knowledge of these regulatory networks and the transcriptome responses of L. pneumophila is mainly based on microarray analysis and limited to transcriptional products of annotated protein-coding genes. The application of Next-Generation Sequencing (NGS) technology now allows genome-wide strand-specific sequencing and accurate determination of all expressed regions of the genome to reveal the complete transcriptional network and the dynamic interplay of specific regulators on a genome-wide level. NGS-based techniques promote deeper understanding of the global transcriptional organization of L. pneumophila by identifying transcription start sites (TSS), alternative TSS and operon organization, noncoding RNAs, antisense RNAs, and 5'-/3'-untranslated regions. In this chapter we describe the construction of cDNA libraries for (1) RNA deep sequencing (RNA-seq) and (2) TSS mapping using the Illumina technology.

  4. Primary structure of fox (Vulpes vulpes) proinsulin based on sequence studies of pancreatic peptides and cDNA.

    PubMed

    Fiertek, D; Gromowska, M; Andersen, A S; Hansen, P H; Majewski, T; Izdebski, J

    2000-08-01

    Insulin and C-peptide were extracted and purified from fox (Vulpes vulpes) pancreas using gel filtration, ion-exchange chromatography and HPLC. Chromatographic data for the insulin, as well as for its oxidized and carboxymethylated chains proved it to be identical to that of polar fox (Alopex lagopus) and dog. The sequence analysis of a peptide which was assumed to be the corresponding C-peptide revealed that it comprises 23 amino acid residues and is identical to the C-peptide fragment isolated from dog pancreas: it differs from polar fox C-peptide by a single substitution (Asp-->Glu). mRNA was isolated from pancreatic tissue and cDNA was obtained by reverse transcription. A polymerase chain reaction was performed using gene-specific primers to obtain a DNA fragment corresponding to part of fox proinsulin. DNA sequencing revealed 100% identity to dog proinsulin at the protein level, although some amino acids were encoded by different codons. The total sequence of proinsulin was deduced from these results.

  5. PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences.

    PubMed

    Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong; Warnow, Tandy

    2015-05-01

    We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate--slightly better than SATé trees, but with substantial improvements relative to other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory.

  6. Expressed Sequence Tags Analysis and Design of Simple Sequence Repeats Markers from a Full-Length cDNA Library in Perilla frutescens (L.)

    PubMed Central

    Seong, Eun Soo; Yoo, Ji Hye; Choi, Jae Hoo; Kim, Chang Heum; Jeon, Mi Ran; Kang, Byeong Ju; Lee, Jae Geun; Choi, Seon Kang; Ghimire, Bimal Kumar; Yu, Chang Yeon

    2015-01-01

    Perilla frutescens is valuable as a medicinal plant as well as a natural medicine and functional food. However, comparative genomics analyses of P. frutescens are limited due to a lack of gene annotations and characterization. A full-length cDNA library from P. frutescens leaves was constructed to identify functional gene clusters and probable EST-SSR markers via analysis of 1,056 expressed sequence tags. Unigene assembly was performed using basic local alignment search tool (BLAST) homology searches and annotated Gene Ontology (GO). A total of 18 simple sequence repeats (SSRs) were designed as primer pairs. This study is the first to report comparative genomics and EST-SSR markers from P. frutescens, which will help gene discovery and provide an important source for functional genomics and molecular genetic research in this interesting medicinal plant. PMID:26664999
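
    The abstract does not say how SSRs were located in the ESTs. As a minimal sketch, assuming perfect repeats are searched with a regular expression, the idea can be illustrated as follows. The function name `find_ssrs` and the repeat thresholds are hypothetical and not taken from the study.

```python
import re

def _is_compound(motif):
    """True if the motif is itself a shorter unit repeated (e.g. 'AA', 'ATAT')."""
    n = len(motif)
    return any(n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n))

def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, repeat_count) for perfect SSRs in a sequence."""
    if min_repeats is None:
        # Hypothetical thresholds: motif length -> minimum number of repeats.
        min_repeats = {2: 6, 3: 5, 4: 4}
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # Capture one motif, then require at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if _is_compound(motif):
                continue  # e.g. skip 'AA' repeats when scanning dinucleotides
            hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return hits

if __name__ == "__main__":
    est = "GGC" + "AG" * 7 + "TTC" + "CTT" * 5 + "A"
    for start, motif, count in find_ssrs(est):
        print(start, motif, count)
```

    Primers would then be designed in the sequence flanking each hit; that step is outside this sketch.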

  7. Genomic DNA Enrichment Using Sequence Capture Microarrays: a Novel Approach to Discover Sequence Nucleotide Polymorphisms (SNP) in Brassica napus L

    PubMed Central

    Clarke, Wayne E.; Parkin, Isobel A.; Gajardo, Humberto A.; Gerhardt, Daniel J.; Higgins, Erin; Sidebottom, Christine; Sharpe, Andrew G.; Snowdon, Rod J.; Federico, Maria L.; Iniguez-Luy, Federico L.

    2013-01-01

    Targeted genomic selection methodologies, or sequence capture, allow for DNA enrichment and large-scale resequencing and characterization of natural genetic variation in species with complex genomes, such as rapeseed canola (Brassica napus L., AACC, 2n=38). The main goal of this project was to combine sequence capture with next generation sequencing (NGS) to discover single nucleotide polymorphisms (SNPs) in specific areas of the B. napus genome historically associated (via quantitative trait loci –QTL– analysis) to traits of agronomical and nutritional importance. A 2.1 million feature sequence capture platform was designed to interrogate DNA sequence variation across 47 specific genomic regions, representing 51.2 Mb of the Brassica A and C genomes, in ten diverse rapeseed genotypes. All ten genotypes were sequenced using the 454 Life Sciences chemistry and to assess the effect of increased sequence depth, two genotypes were also sequenced using Illumina HiSeq chemistry. As a result, 589,367 potentially useful SNPs were identified. Analysis of sequence coverage indicated a four-fold increased representation of target regions, with 57% of the filtered SNPs falling within these regions. Sixty percent of discovered SNPs corresponded to transitions while 40% were transversions. Interestingly, fifty eight percent of the SNPs were found in genic regions while 42% were found in intergenic regions. Further, a high percentage of genic SNPs was found in exons (65% and 64% for the A and C genomes, respectively). Two different genotyping assays were used to validate the discovered SNPs. Validation rates ranged from 61.5% to 84% of tested SNPs, underpinning the effectiveness of this SNP discovery approach. Most importantly, the discovered SNPs were associated with agronomically important regions of the B. napus genome generating a novel data resource for research and breeding this crop species. PMID:24312619
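
    The transition/transversion split reported above follows from a simple rule: a substitution within the purines (A, G) or within the pyrimidines (C, T) is a transition, and any purine-pyrimidine exchange is a transversion. The sketch below illustrates that classification; the function names are illustrative and are not from the study.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_snp(ref, alt):
    """Classify a biallelic SNP as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not {ref, alt} <= PURINES | PYRIMIDINES:
        raise ValueError(f"not a valid SNP: {ref}>{alt}")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"

def ts_tv_ratio(snps):
    """Ratio of transitions to transversions for an iterable of (ref, alt)."""
    counts = Counter(classify_snp(ref, alt) for ref, alt in snps)
    return counts["transition"] / counts["transversion"]

if __name__ == "__main__":
    example = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("G", "T")]
    print(ts_tv_ratio(example))
```

    With the 60% / 40% split quoted in the abstract, the corresponding ratio would be 0.6 / 0.4 = 1.5.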

  8. Genomic DNA enrichment using sequence capture microarrays: a novel approach to discover sequence nucleotide polymorphisms (SNP) in Brassica napus L.

    PubMed

    Clarke, Wayne E; Parkin, Isobel A; Gajardo, Humberto A; Gerhardt, Daniel J; Higgins, Erin; Sidebottom, Christine; Sharpe, Andrew G; Snowdon, Rod J; Federico, Maria L; Iniguez-Luy, Federico L

    2013-01-01

    Targeted genomic selection methodologies, or sequence capture, allow for DNA enrichment and large-scale resequencing and characterization of natural genetic variation in species with complex genomes, such as rapeseed canola (Brassica napus L., AACC, 2n=38). The main goal of this project was to combine sequence capture with next generation sequencing (NGS) to discover single nucleotide polymorphisms (SNPs) in specific areas of the B. napus genome historically associated (via quantitative trait loci -QTL- analysis) to traits of agronomical and nutritional importance. A 2.1 million feature sequence capture platform was designed to interrogate DNA sequence variation across 47 specific genomic regions, representing 51.2 Mb of the Brassica A and C genomes, in ten diverse rapeseed genotypes. All ten genotypes were sequenced using the 454 Life Sciences chemistry and to assess the effect of increased sequence depth, two genotypes were also sequenced using Illumina HiSeq chemistry. As a result, 589,367 potentially useful SNPs were identified. Analysis of sequence coverage indicated a four-fold increased representation of target regions, with 57% of the filtered SNPs falling within these regions. Sixty percent of discovered SNPs corresponded to transitions while 40% were transversions. Interestingly, fifty eight percent of the SNPs were found in genic regions while 42% were found in intergenic regions. Further, a high percentage of genic SNPs was found in exons (65% and 64% for the A and C genomes, respectively). Two different genotyping assays were used to validate the discovered SNPs. Validation rates ranged from 61.5% to 84% of tested SNPs, underpinning the effectiveness of this SNP discovery approach. Most importantly, the discovered SNPs were associated with agronomically important regions of the B. napus genome generating a novel data resource for research and breeding this crop species.

  9. Complete nucleotide sequence and genome organization of a Cactus virus X strain from Hylocereus undatus (Cactaceae).

    PubMed

    Liou, M R; Chen, Y R; Liou, R F

    2004-05-01

    The complete nucleotide sequence of a strain of Cactus virus X (CVX-Hu) isolated from Hylocereus undatus (Cactaceae) has been determined. Excluding the poly(A) tail, the sequence is 6614 nucleotides in length and contains seven open reading frames (ORFs). The genome organization of CVX is similar to that of other potexviruses. ORF1 encodes the putative viral replicase with conserved methyltransferase, helicase, and polymerase motifs. Within ORF1, two other ORFs were located separately in the +2 reading frame; we call these ORF6 and ORF7. ORF2, 3, and 4, which form the "triple gene block" characteristic of the potexviruses, encode proteins with molecular masses of 25, 12, and 7 kDa, respectively. ORF5 encodes the coat protein with an estimated molecular mass of 24 kDa. Sequence analysis indicated that proteins encoded by ORF1-5 display a certain degree of homology to the corresponding proteins of other potexviruses. The putative product of ORF6, however, shows no significant similarity to those of other potexviruses. Phylogenetic analyses based on the replicase (the methyltransferase, helicase, and polymerase domains) and coat protein demonstrated a closer relationship of CVX with Bamboo mosaic virus, Cassava common mosaic virus, Foxtail mosaic virus, Papaya mosaic virus, and Plantago asiatica mosaic virus.

  10. The nucleotide sequence of sacbrood virus of the honey bee: an insect picorna-like virus.

    PubMed

    Ghosh, R C; Ball, B V; Willcocks, M M; Carter, M J

    1999-06-01

    We have determined the nucleotide sequence of sacbrood virus (SBV), which causes a fatal infection of honey bee larvae. The genomic RNA of SBV is longer than that of typical mammalian picornaviruses (8832 nucleotides) and contains a single, large open reading frame (179-8752) encoding a polyprotein of 2858 amino acids. Sequence comparison with other virus polyproteins revealed regions of similarity to characterized helicase, protease and RNA-dependent RNA polymerase domains; structural genes were located at the 5' terminus with non-structural genes at the 3' end. Picornavirus-like agents of insects have two distinct genomic organizations; some resemble mammalian picornaviruses with structural genes at the 5' end and non-structural genes at the 3' end, and others resemble caliciviruses in which this order is reversed; SBV thus belongs to the former type. Sequence comparison suggested that SBV is distantly related to infectious flacherie virus (IFV) of the silk worm, which possesses an RNA of similar size and gene order.
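
    The ORF coordinates and polyprotein length quoted above can be cross-checked with a little arithmetic. The helper below is an illustrative sketch (the name `orf_codons` is made up here), assuming 1-based inclusive coordinates.

```python
def orf_codons(start, end):
    """Return (length_nt, codon_count) for a 1-based inclusive ORF range."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length, length // 3

if __name__ == "__main__":
    print(orf_codons(179, 8752))  # (8574, 2858)
```

    The 2858 codons equal the stated 2858 amino acids, which would suggest that the quoted range does not count a stop codon, although the abstract does not say so explicitly.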

  11. Nucleotide sequence and expression of the Enterobacter aerogenes alpha-acetolactate decarboxylase gene in brewer's yeast.

    PubMed Central

    Sone, H; Fujii, T; Kondo, K; Shimizu, F; Tanaka, J; Inoue, T

    1988-01-01

    The nucleotide sequence of a 1.4-kilobase DNA fragment containing the alpha-acetolactate decarboxylase gene of Enterobacter aerogenes was determined. The sequence contains an entire protein-coding region of 780 nucleotides which encodes an alpha-acetolactate decarboxylase of 260 amino acids. The DNA sequence coding for alpha-acetolactate decarboxylase was placed under the control of the alcohol dehydrogenase I promoter of the yeast Saccharomyces cerevisiae in a plasmid capable of autonomous replication in both S. cerevisiae and Escherichia coli. Brewer's yeast cells transformed by this plasmid showed alpha-acetolactate decarboxylase activity and were used in laboratory-scale fermentation experiments. These experiments revealed that the diacetyl concentration in wort fermented by the plasmid-containing yeast strain was significantly lower than that in wort fermented by the parental strain. These results indicated that the alpha-acetolactate decarboxylase activity produced by brewer's yeast cells degraded alpha-acetolactate and that this degradation caused a decrease in diacetyl production. PMID:3278689

  12. Determination of Single-Nucleotide Polymorphisms by Real-time Pyrophosphate DNA Sequencing

    PubMed Central

    Alderborn, Anders; Kristofferson, Anna; Hammerling, Ulf

    2000-01-01

    The characterization of naturally occurring variations in the human genome has evoked an immense interest during recent years. Variations known as biallelic Single-Nucleotide Polymorphisms (SNPs) have become increasingly popular markers in molecular genetics because of their wide application both in evolutionary relationship studies and in the identification of susceptibility to common diseases. We have addressed the issue of SNP genotype determination by investigating variations within the Renin–Angiotensin–Aldosterone System (RAAS) using pyrosequencing, a real-time pyrophosphate detection technology. The method is based on indirect luminometric quantification of the pyrophosphate that is released as a result of nucleotide incorporation onto an amplified template. The technical platform employed comprises a highly automated sequencing instrument that allows the analysis of 96 samples within 10 to 20 minutes. In addition to each studied polymorphic position, 5–10 downstream bases were sequenced for acquisition of reference signals. Evaluation of pyrogram data was accomplished by comparison of peak heights, which are proportional to the number of incorporated nucleotides. Analysis of the pyrograms that resulted from alternate allelic configurations for each addressed SNP revealed a highly discriminating pattern. Homozygous samples produced clear-cut single base peaks in the expected position, whereas heterozygous counterparts were characterized by distinct half-height peaks representing both allelic positions. Whenever any of the allelic bases of an SNP formed a homopolymer with adjacent bases, the nonallelic signal was added to those of the SNP. This feature did not, however, influence SNP readability. Furthermore, the multibase reading capacity of the described system provides extensive flexibility in regard to the positioning of sequencing primers and allows the determination of several closely located SNPs in a single run. PMID:10958643
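
    The peak-height logic described above (full-height single peaks for homozygotes, half-height peaks at both allelic positions for heterozygotes) can be sketched as a toy classifier. This is only an illustration: it assumes peak heights have already been normalized so that one incorporated nucleotide gives 1.0, the tolerance is a hypothetical value, and the homopolymer case mentioned in the abstract is ignored.

```python
def call_genotype(allele_a, allele_b, height_a, height_b, tol=0.2):
    """Call a biallelic SNP genotype from two normalized pyrogram peak heights.

    Returns e.g. 'AA', 'AG' or 'GG', or None if the pattern is ambiguous.
    The tolerance `tol` is a hypothetical value chosen for illustration.
    """
    def near(value, target):
        return abs(value - target) <= tol

    if near(height_a, 1.0) and near(height_b, 0.0):
        return allele_a * 2          # homozygous for allele A
    if near(height_a, 0.0) and near(height_b, 1.0):
        return allele_b * 2          # homozygous for allele B
    if near(height_a, 0.5) and near(height_b, 0.5):
        return allele_a + allele_b   # heterozygous: two half-height peaks
    return None                      # no clear pattern

if __name__ == "__main__":
    print(call_genotype("A", "G", 0.98, 0.03))
    print(call_genotype("A", "G", 0.52, 0.47))
```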

  13. Nucleotide sequence of the transforming gene of m1 murine sarcoma virus.

    PubMed Central

    Brow, M A; Sen, A; Sutcliffe, J G

    1984-01-01

    The v-mosm1 nucleotide sequence codes for a protein that is 376 amino acids long. Although the N-terminus is homologous with that of the v-mos124 protein, the C-terminus is substantially different from the C-termini of all other examined mos proteins, suggesting that this region is nonessential and perhaps cleaved. Overall, v-mosm1 has greater homology with c-mos than does v-mos124, but mutually exclusive differences between c-mos and each of the v-mos genes preclude linear descent and suggest a common ancestral murine sarcoma virus. PMID:6319757

  14. The Complete Nucleotide Sequence of the Mitochondrial Genome of Bactrocera minax (Diptera: Tephritidae)

    PubMed Central

    Zhang, Bin; Nardi, Francesco; Hull-Sanders, Helen; Wan, Xuanwu; Liu, Yinghong

    2014-01-01

    The complete 16,043 bp mitochondrial genome (mitogenome) of Bactrocera minax (Diptera: Tephritidae) has been sequenced. The genome encodes 37 genes usually found in insect mitogenomes. The mitogenome information for B. minax was compared to the homologous sequences of Bactrocera oleae, Bactrocera tryoni, Bactrocera philippinensis, Bactrocera carambolae, Bactrocera papayae, Bactrocera dorsalis, Bactrocera correcta, Bactrocera cucurbitae and Ceratitis capitata. The analysis indicated the structure and organization are typical of, and similar to, the nine closely related species mentioned above, although it contains the lowest genome-wide A+T content (67.3%). Four short intergenic spacers with a high degree of conservation among the nine tephritid species mentioned above and B. minax were observed, which also have clear counterparts in the control regions (CRs). Correlation analysis among these ten tephritid species revealed close positive correlations among the A+T content of zero-fold degenerate sites (P0FD), the ratio of nucleotide substitution frequency at P0FD sites to all degenerate sites (zero-fold degenerate sites, two-fold degenerate sites and four-fold degenerate sites) and amino acid sequence distance (ASD). Further, significant positive correlation was observed between the A+T content of four-fold degenerate sites (P4FD) and the ratio of nucleotide substitution frequency at P4FD sites to all degenerate sites; however, we found significant negative correlation between ASD and the A+T content of P4FD, and the ratio of nucleotide substitution frequency at P4FD sites to all degenerate sites. A higher nucleotide substitution frequency at non-synonymous sites compared to synonymous sites was observed in nad4, the first time that has been observed in an insect mitogenome. A poly(T) stretch at the 5′ end of the CR followed by a [TA(A)]n-like stretch was also found. In addition, a highly conserved G+A-rich sequence block was observed in front of the

  15. The complete nucleotide sequence of the mitochondrial DNA of the dogfish, Scyliorhinus canicula.

    PubMed Central

    Delarbre, C; Spruyt, N; Delmarre, C; Gallut, C; Barriel, V; Janvier, P; Laudet, V; Gachelin, G

    1998-01-01

    We have determined the complete nucleotide sequence of the mitochondrial DNA (mtDNA) of the dogfish, Scyliorhinus canicula. The 16,697-bp-long mtDNA possesses a gene organization identical to that of the Osteichthyes, but different from that of the sea lamprey Petromyzon marinus. The main features of the mtDNA of osteichthyans were thus established in the common ancestor to chondrichthyans and osteichthyans. The phylogenetic analysis confirms that the Chondrichthyes are the sister group of the Osteichthyes. PMID:9725850

  16. Within-Host Nucleotide Diversity of Virus Populations: Insights from Next-Generation Sequencing

    PubMed Central

    Nelson, Chase W.; Hughes, Austin L.

    2014-01-01

    Next-generation sequencing (NGS) technology offers new opportunities for understanding the evolution and dynamics of viral populations within individual hosts over the course of infection. We review simple methods for estimating synonymous and nonsynonymous nucleotide diversity in viral genes from NGS data without the need for inferring linkage. We discuss the potential usefulness of these data for addressing questions of both practical and theoretical interest, including fundamental questions regarding the effective population sizes of within-host viral populations and the modes of natural selection acting on them. PMID:25481279
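
    One simple way to estimate nucleotide diversity from NGS data without inferring linkage is to compute, at each site independently, the fraction of read pairs that differ, and then average over sites. The sketch below shows that calculation under this assumption; it is not claimed to be the authors' exact estimator, and it does not separate synonymous from nonsynonymous sites. The function names are illustrative.

```python
def site_diversity(counts):
    """Fraction of read pairs that differ at one site.

    `counts` maps each observed nucleotide to its read count at that site.
    """
    n = sum(counts.values())
    if n < 2:
        return 0.0
    total_pairs = n * (n - 1) / 2
    identical_pairs = sum(c * (c - 1) / 2 for c in counts.values())
    return (total_pairs - identical_pairs) / total_pairs

def nucleotide_diversity(sites):
    """Mean per-site pairwise diversity over a list of per-site count dicts."""
    return sum(site_diversity(c) for c in sites) / len(sites)

if __name__ == "__main__":
    pileup = [{"A": 10}, {"A": 8, "G": 2}]
    print(nucleotide_diversity(pileup))
```

    Because each site is treated on its own, no haplotype reconstruction is needed, which matches the "without the need for inferring linkage" point in the abstract.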

  17. The complete nucleotide sequence of the mitochondrial genome of Bactrocera minax (Diptera: Tephritidae).

    PubMed

    Zhang, Bin; Nardi, Francesco; Hull-Sanders, Helen; Wan, Xuanwu; Liu, Yinghong

    2014-01-01

    The complete 16,043 bp mitochondrial genome (mitogenome) of Bactrocera minax (Diptera: Tephritidae) has been sequenced. The genome encodes 37 genes usually found in insect mitogenomes. The mitogenome information for B. minax was compared to the homologous sequences of Bactrocera oleae, Bactrocera tryoni, Bactrocera philippinensis, Bactrocera carambolae, Bactrocera papayae, Bactrocera dorsalis, Bactrocera correcta, Bactrocera cucurbitae and Ceratitis capitata. The analysis indicated the structure and organization are typical of, and similar to, the nine closely related species mentioned above, although it contains the lowest genome-wide A+T content (67.3%). Four short intergenic spacers with a high degree of conservation among the nine tephritid species mentioned above and B. minax were observed, which also have clear counterparts in the control regions (CRs). Correlation analysis among these ten tephritid species revealed close positive correlations among the A+T content of zero-fold degenerate sites (P0FD), the ratio of nucleotide substitution frequency at P0FD sites to all degenerate sites (zero-fold degenerate sites, two-fold degenerate sites and four-fold degenerate sites) and amino acid sequence distance (ASD). Further, significant positive correlation was observed between the A+T content of four-fold degenerate sites (P4FD) and the ratio of nucleotide substitution frequency at P4FD sites to all degenerate sites; however, we found significant negative correlation between ASD and the A+T content of P4FD, and the ratio of nucleotide substitution frequency at P4FD sites to all degenerate sites. A higher nucleotide substitution frequency at non-synonymous sites compared to synonymous sites was observed in nad4, the first time that has been observed in an insect mitogenome. A poly(T) stretch at the 5' end of the CR followed by a [TA(A)]n-like stretch was also found. In addition, a highly conserved G+A-rich sequence block was observed in front of the

  18. Nanoparticle-Based Discrimination of Single-Nucleotide Polymorphism in Long DNA Sequences.

    PubMed

    Sanromán-Iglesias, María; Lawrie, Charles H; Liz-Marzán, Luis M; Grzelczak, Marek

    2017-04-19

    Circulating DNA (ctDNA), and specifically the detection of cancer-associated mutations in liquid biopsies, promises to revolutionize cancer detection. The main difficulty, however, is that typical ctDNA fragments (∼150 bases) are long enough to form secondary structures, potentially obscuring the mutated fragment from detection. We show that an assay based on gold nanoparticles (65 nm) stabilized with DNA (Au@DNA) can discriminate single nucleotide polymorphisms in clinically relevant ssDNA sequences (70-140 bases). The preincubation step was crucial to this process, allowing sequential bridging of Au@DNA, so that a single base mutation can be discriminated, down to 100 pM concentration.

  19. Complete nucleotide sequence of a virus associated with rusty mottle disease of sweet cherry (Prunus avium).

    PubMed

    Villamor, D V; Druffel, K L; Eastwell, K C

    2013-08-01

    Cherry rusty mottle is a disease of sweet cherries first described in 1940 in western North America. Because of the graft-transmissible nature of the disease, a viral nature of the disease was assumed. Here, the complete genomic nucleotide sequences of virus isolates from two trees expressing cherry rusty mottle disease symptoms are characterized; the virus is designated cherry rusty mottle associated virus (CRMaV). The biological and molecular characteristics of this virus in comparison to those of cherry necrotic rusty mottle virus (CNRMV) and cherry green ring mottle virus (CGRMV) are described. CRMaV was subsequently detected in additional sweet cherry trees expressing symptoms of cherry rusty mottle disease.

  20. Complete nucleotide sequences of two begomoviruses infecting Madagascar periwinkle (Catharanthus roseus) from Pakistan.

    PubMed

    Ilyas, Muhammad; Nawaz, Kiran; Shafiq, Muhammad; Haider, Muhammad Saleem; Shahid, Ahmad Ali

    2013-02-01

    Though Catharanthus roseus (Madagascar periwinkle) is an ornamental plant, it is famous for its medicinal value. Its alkaloids are known for anti-cancerous properties, and this plant is studied mainly for its alkaloids. Here, this plant has been studied for its viral diseases. Complete DNA sequences of two begomoviruses infecting C. roseus originating from Pakistan were determined. The sequence of one begomovirus (clone KN4) shows the highest level of nucleotide sequence identity (86.5 %) to an unpublished virus, chili leaf curl India virus (ChiLCIV), and then (84.4 % identity) to papaya leaf curl virus (PaLCV), and thus represents a new species, for which the name "Catharanthus yellow mosaic virus" (CYMV) is proposed. The sequence of another begomovirus (clone KN6) shows the highest level of sequence identity (95.9 % to 99 %) to a newly reported virus from India, papaya leaf crumple virus (PaLCrV). Sequence analysis shows that KN4 and KN6 are recombinants of Pedilanthus leaf curl virus (PedLCV) and croton yellow vein mosaic virus (CrYVMV).

  1. Mining for single nucleotide polymorphisms and insertions/deletions in maize expressed sequence tag data.

    PubMed

    Batley, Jacqueline; Barker, Gary; O'Sullivan, Helen; Edwards, Keith J; Edwards, David

    2003-05-01

    We have developed a computer based method to identify candidate single nucleotide polymorphisms (SNPs) and small insertions/deletions from expressed sequence tag data. Using a redundancy-based approach, valid SNPs are distinguished from erroneous sequence by their representation multiple times in an alignment of sequence reads. A second measure of validity was also calculated based on the cosegregation of the SNP pattern between multiple SNP loci in an alignment. The utility of this method was demonstrated by applying it to 102,551 maize (Zea mays) expressed sequence tag sequences. A total of 14,832 candidate polymorphisms were identified with an SNP redundancy score of two or greater. Segregation of these SNPs with haplotype indicates that candidate SNPs with high redundancy and cosegregation confidence scores are likely to represent true SNPs. This was confirmed by validation of 264 candidate SNPs from 27 loci, with a range of redundancy and cosegregation scores, in four inbred maize lines. The SNP transition/transversion ratio and insertion/deletion size frequencies correspond to those observed by direct sequencing methods of SNP discovery and suggest that the majority of predicted SNPs and insertion/deletions identified using this approach represent true genetic variation in maize.
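
    The redundancy-based idea above, that a variant seen several times in an alignment is less likely to be a sequencing error than one seen once, can be sketched as follows. This is a simplified illustration and not the authors' software. The function name is made up, the reads are assumed to be pre-aligned strings of equal length with '-' for gaps, and the redundancy score is assumed here to be the read count of the second most frequent supported allele. The cosegregation measure is not covered.

```python
from collections import Counter

def candidate_snps(aligned_reads, min_redundancy=2):
    """Return (position, allele_counts, redundancy_score) for candidate SNPs.

    A column is reported only if at least two different bases each occur
    at least `min_redundancy` times.
    """
    hits = []
    for pos in range(len(aligned_reads[0])):
        counts = Counter(r[pos] for r in aligned_reads if r[pos] in "ACGT")
        supported = {b: c for b, c in counts.items() if c >= min_redundancy}
        if len(supported) >= 2:
            # Assumed score: support for the second most frequent allele.
            score = sorted(supported.values(), reverse=True)[1]
            hits.append((pos, supported, score))
    return hits

if __name__ == "__main__":
    reads = ["ACGTAC", "ACGTAC", "ATGTAC", "ATGTAC", "ACGTGC"]
    print(candidate_snps(reads))
```

    In the example, the C/T difference at position 1 is supported by two or more reads for each allele and is reported, while the single G at position 4 is treated as a possible sequencing error and ignored.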

  2. A simple ABO genotyping by PCR using sequence-specific primers with mismatched nucleotides.

    PubMed

    Taki, Takashi; Kibayashi, Kazuhiko

    2014-05-01

    In forensics, the specific ABO blood group is often determined by analyzing the ABO gene. Among various methods used, PCR employing sequence-specific primers (PCR-SSP) is simpler than other methods for ABO typing. When performing the PCR-SSP, the pseudo-positive signals often lead to errors in ABO typing. We introduced mismatched nucleotides at the second and the third positions from the 3'-end of the primers for the PCR-SSP method and examined whether reliable typing could be achieved by suppressing pseudo-positive signals. Genomic DNA was extracted from nail clippings of 27 volunteers, and the ABO gene was examined with PCR-SSP employing primers with and without mismatched nucleotides. The ABO blood group of the nail clippings was also analyzed serologically, and these results were compared with those obtained using PCR-SSP. When mismatched primers were employed for amplification, the results of the ABO typing matched with those obtained by the serological method. When primers without mismatched nucleotides were used for PCR-SSP, pseudo-positive signals were observed. Thus our method may be used for achieving more reliable ABO typing.

  3. Optimizing nucleotide sequence ensembles for combinatorial protein libraries using a genetic algorithm.

    PubMed

    Craig, Roger A; Lu, Jin; Luo, Jinquan; Shi, Lei; Liao, Li

    2010-01-01

    Protein libraries are essential to the field of protein engineering. Increasingly, probabilistic protein design is being used to synthesize combinatorial protein libraries, which allow the protein engineer to explore a vast space of amino acid sequences, while at the same time placing restrictions on the amino acid distributions. To this end, if site-specific amino acid probabilities are input as the target, then the codon nucleotide distributions that match this target distribution can be used to generate a partially randomized gene library. However, it turns out to be a highly nontrivial computational task to find the codon nucleotide distributions that exactly match a given target distribution of amino acids. We first showed that for any given target distribution an exact solution may not exist at all. Formulating the task as a constrained optimization problem, we then developed a genetic algorithm-based approach to find codon nucleotide distributions that match the target amino acid distribution as closely as possible. As compared with the previous gradient descent method on various objective functions, the new method consistently gave more optimized distributions as measured by the relative entropy between the calculated and the target distributions. To simulate the actual lab solutions, new objective functions were designed to allow for two separate sets of codons in seeking a better match to the target amino acid distribution.
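
    The objective described above can be made concrete with a small sketch: given independent nucleotide probabilities at the three codon positions, compute the implied amino acid distribution under the standard genetic code, then measure its relative entropy against a target. This only illustrates the quantity being optimized and is not the genetic algorithm itself. The function names are illustrative, and the direction of the divergence (target relative to calculated) is an assumption.

```python
import math
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def amino_acid_distribution(pos1, pos2, pos3):
    """Amino acid probabilities implied by per-position nucleotide probabilities."""
    dist = {}
    for codon, aa in CODON_TABLE.items():
        p = pos1.get(codon[0], 0.0) * pos2.get(codon[1], 0.0) * pos3.get(codon[2], 0.0)
        if p > 0:
            dist[aa] = dist.get(aa, 0.0) + p
    return dist

def relative_entropy(target, calculated):
    """D(target || calculated) in nats; infinite if a target residue is unreachable."""
    total = 0.0
    for aa, t in target.items():
        if t == 0:
            continue
        q = calculated.get(aa, 0.0)
        if q == 0:
            return math.inf
        total += t * math.log(t / q)
    return total

if __name__ == "__main__":
    calc = amino_acid_distribution({"A": 0.5, "G": 0.5}, {"T": 1.0}, {"G": 1.0})
    print(calc)                                   # Met and Val, 0.5 each
    print(relative_entropy({"M": 1.0}, calc))     # log(2)
```

    An optimizer such as the genetic algorithm in the abstract would search over the three position-wise distributions to minimize this value.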

  4. Nucleotide sequence analysis of the respiratory syncytial virus subgroup A cold-passaged (cp) temperature sensitive (ts) cpts-248/404 live attenuated virus vaccine candidate.

    PubMed

    Firestone, C Y; Whitehead, S S; Collins, P L; Murphy, B R; Crowe, J E

    1996-11-15

    The complete nucleotide sequence of the RSV cpts-248/404 live attenuated vaccine candidate was determined from cloned cDNA and was compared to that of the RSV A2/HEK7 wild-type, cold-passaged cp-RSV, and cpts-248 virus, which constitute the series of progenitor viruses. RSV cpts-248/404 is more attenuated and more temperature sensitive (ts) (shut-off temperature 36 degrees) than its cpts-248 parent virus (shut-off temperature 38 degrees) and is currently being evaluated in phase I clinical trials in humans. Our ultimate goal is to identify the genetic basis for the host range attenuation phenotype exhibited by cp-RSV (i.e., efficient replication in tissue culture but decreased replication in chimpanzees and humans) and for the ts and attenuation phenotypes of its chemically mutagenized derivatives, cpts-248 and cpts-248/404. Compared with its cpts-248 parent, the cpts-248/404 virus possesses an amino acid change in the polymerase (L) protein and a single nucleotide substitution in the M2 gene start sequence. In total, the cpts-248/404 mutant differs from its wild-type RSV A2/HEK7 progenitor in seven amino acids [four in the polymerase (L) protein, two in the fusion (F) glycoprotein, and one in the (N) nucleoprotein] and one nucleotide difference in the M2 gene start sequence. Heterogeneity at nucleotide position 4 (G or C, negative sense, compared to G in the RSV A2/HEK7 progenitor) in the leader region of vRNA developed during passage of the cpts-248/404 in tissue culture. Biologically cloned derivatives of RSV cpts-248/404 virus that differed at position 4 possessed the same level of temperature sensitivity and exhibited the same level of replication in the upper and lower respiratory tract of mice, suggesting that heterogeneity at this position is not clinically relevant. The determination of the nucleotide sequence of the cpts-248/404 virus will allow evaluation of the stability of the eight mutations that are associated with the attenuation phenotype during

  5. [Molecular cloning and analysis of cDNA sequences encoding serine proteinase and Kunitz type inhibitor in venom gland of Vipera nikolskii viper].

    PubMed

    Ramazanova, A S; Fil'kin, S Iu; Starkov, V G; Utkin, Iu N

    2011-01-01

    Serine proteinases and Kunitz type inhibitors are widely represented in venoms of snakes from different genera. During the study of the venoms from snakes inhabiting Russia, we have cloned cDNAs encoding new proteins belonging to these protein families. Thus, a new serine proteinase called nikobin was identified in the venom gland of the viper Vipera nikolskii. In its amino acid sequence, deduced from the cDNA sequence, nikobin differs from serine proteinases identified in other snake species. The nikobin amino acid sequence contains 15 unique substitutions. This is the first serine proteinase from a viper of the genus Vipera for which a complete amino acid sequence has been established. The cDNA encoding a Kunitz type inhibitor was also cloned. The deduced amino acid sequence of the inhibitor is homologous to those of other proteins from snakes of the genus Vipera. However, there are several unusual amino acid substitutions that might change the biological activity of the inhibitor.

  6. cDNA and deduced amino acid sequence of human pulmonary surfactant-associated proteolipid SPL(Phe)

    SciTech Connect

    Glasser, S.W.; Korfhagen, T.R.; Weaver, T.; Pilot-Matias, T.; Fox, J.L.; Whitsett, J.A.

    1987-06-01

    Hydrophobic surfactant-associated protein of Mr 6000-14,000 was isolated from ether/ethanol or chloroform/methanol extracts of mammalian pulmonary surfactant. Automated Edman degradation in a gas-phase sequencer showed the major N-terminus of the human low molecular weight protein to be Phe-Pro-Ile-Pro-Leu-Pro-Tyr-Cys-Trp-Leu-Cys-Arg-Ala-Leu-. Because of the N-terminal phenylalanine, the surfactant protein was designated SPL(Phe). Antiserum generated against hydrophobic surfactant protein(s) from bovine pulmonary surfactant recognized protein of Mr 6000-14,000 in immunoblot analysis and was used to screen a lambda gt11 expression library constructed from adult human lung poly(A)+ RNA. This resulted in identification of a 1.4-kilobase cDNA clone that was shown to encode the N-terminus of the surfactant polypeptide SPL(Phe) (Phe-Pro-Ile-Pro-Leu-Pro-) within an open reading frame for a larger protein. Expression of a fused β-galactosidase-SPL(Phe) gene in Escherichia coli yielded an immunoreactive Mr 34,000 fusion peptide. Hybrid-arrested translation with the cDNA and immunoprecipitation of [35S]methionine-labeled in vitro translation products of human poly(A)+ RNA with a surfactant polyclonal antibody resulted in identification of a Mr 40,000 precursor protein. Blot hybridization analysis of electrophoretically fractionated RNA from human lung detected a 2.0-kilobase RNA that was more abundant in adult lung than in fetal lung. These proteins, and specifically SPL(Phe), may therefore be useful for synthesis of replacement surfactants for treatment of hyaline membrane disease in newborn infants or of other surfactant-deficient states.

  7. An Ambystoma mexicanum EST sequencing project: analysis of 17,352 expressed sequence tags from embryonic and regenerating blastema cDNA libraries

    PubMed Central

    Habermann, Bianca; Bebin, Anne-Gaelle; Herklotz, Stephan; Volkmer, Michael; Eckelt, Kay; Pehlke, Kerstin; Epperlein, Hans Henning; Schackert, Hans Konrad; Wiebe, Glenis; Tanaka, Elly M

    2004-01-01

    Background: The ambystomatid salamander, Ambystoma mexicanum (axolotl), is an important model organism in evolutionary and regeneration research but relatively little sequence information has so far been available. This is a major limitation for molecular studies on caudate development, regeneration and evolution. To address this lack of sequence information we have generated an expressed sequence tag (EST) database for A. mexicanum. Results: Two cDNA libraries, one made from stage 18-22 embryos and the other from day-6 regenerating tail blastemas, generated 17,352 sequences. From the sequenced ESTs, 6,377 contigs were assembled that probably represent 25% of the expressed genes in this organism. Sequence comparison revealed significant homology to entries in the NCBI non-redundant database. Further examination of this gene set revealed the presence of genes involved in important cell and developmental processes, including cell proliferation, cell differentiation and cell-cell communication. On the basis of these data, we have performed phylogenetic analysis of key cell-cycle regulators. Interestingly, while cell-cycle proteins such as the cyclin B family display expected evolutionary relationships, the cyclin-dependent kinase inhibitor 1 gene family shows an unusual evolutionary behavior among the amphibians. Conclusions: Our analysis reveals the importance of a comprehensive sequence set from a representative of the Caudata and illustrates that the EST sequence database is a rich source for molecular, developmental and regeneration studies. To aid in data mining, the ESTs have been organized into an easily searchable database that is freely available online. PMID:15345051

  8. The venom gland transcriptome of Latrodectus tredecimguttatus revealed by deep sequencing and cDNA library analysis.

    PubMed

    He, Quanze; Duan, Zhigui; Yu, Ying; Liu, Zhen; Liu, Zhonghua; Liang, Songping

    2013-01-01

    Latrodectus tredecimguttatus, commonly known as the black widow spider, is well known for its dangerous bite. Although its venom has been characterized extensively, some fundamental questions about its molecular composition remain unanswered. The limited transcriptome and genome data available prevent further understanding of spider venom at the molecular level. In the present study, we combined next-generation sequencing and conventional DNA sequencing to construct a venom gland transcriptome of the spider L. tredecimguttatus, which resulted in the identification of 9,666 and 480 high-confidence proteins among 34,334 de novo sequences and 1,024 cDNA sequences, respectively, by assembly, translation, filtering, quantification and annotation. Extensive functional analyses of these proteins indicated that mRNAs involved in RNA transport and spliceosome, protein translation, processing and transport were highly enriched in the venom gland, which is consistent with the specific function of venom glands, namely the production of toxins. Furthermore, we identified 146 toxin-like proteins forming 12 families, including 6 new families in this spider, in which α-LTX-Lt1a family2 is identified for the first time as a subfamily of the α-LTX-Lt1a family. The toxins were classified according to their bioactivities into five categories that functioned in a coordinated way. Few ion channels were expressed in venom gland cells, suggesting a possible mechanism of protection from the attack of their own toxins. The present study provides a gland transcriptome profile and extends our understanding of the toxinome of spiders and coordination mechanism for toxin production in protein expression quantity.

  9. The Venom Gland Transcriptome of Latrodectus tredecimguttatus Revealed by Deep Sequencing and cDNA Library Analysis

    PubMed Central

    He, Quanze; Duan, Zhigui; Yu, Ying; Liu, Zhen; Liu, Zhonghua; Liang, Songping

    2013-01-01

    Latrodectus tredecimguttatus, commonly known as the black widow spider, is well known for its dangerous bite. Although its venom has been characterized extensively, some fundamental questions about its molecular composition remain unanswered. The limited transcriptome and genome data available prevent further understanding of spider venom at the molecular level. In the present study, we combined next-generation sequencing and conventional DNA sequencing to construct a venom gland transcriptome of the spider L. tredecimguttatus, which resulted in the identification of 9,666 and 480 high-confidence proteins among 34,334 de novo sequences and 1,024 cDNA sequences, respectively, by assembly, translation, filtering, quantification and annotation. Extensive functional analyses of these proteins indicated that mRNAs involved in RNA transport and spliceosome, protein translation, processing and transport were highly enriched in the venom gland, which is consistent with the specific function of venom glands, namely the production of toxins. Furthermore, we identified 146 toxin-like proteins forming 12 families, including 6 new families in this spider, in which α-LTX-Lt1a family2 is identified for the first time as a subfamily of the α-LTX-Lt1a family. The toxins were classified according to their bioactivities into five categories that functioned in a coordinated way. Few ion channels were expressed in venom gland cells, suggesting a possible mechanism of protection from the attack of their own toxins. The present study provides a gland transcriptome profile and extends our understanding of the toxinome of spiders and coordination mechanism for toxin production in protein expression quantity. PMID:24312294

  10. Mining of haplotype-based expressed sequence tag single nucleotide polymorphisms in citrus.

    PubMed

    Chen, Chunxian; Gmitter, Fred G

    2013-11-01

    Single nucleotide polymorphisms (SNPs), the most abundant variations in a genome, have been widely used in various studies. Detection and characterization of citrus haplotype-based expressed sequence tag (EST) SNPs will greatly facilitate further utilization of these gene-based resources. In this paper, haplotype-based SNPs were mined out of publicly available citrus expressed sequence tags (ESTs) from different citrus cultivars (genotypes) individually and collectively for comparison. There were a total of 567,297 ESTs belonging to 27 cultivars in varying numbers and consequently yielding different numbers of haplotype-based quality SNPs. Sweet orange (SO) had the most (213,830) ESTs, generating 11,182 quality SNPs in 3,327 out of 4,228 usable contigs. Summed across all the individual mining results, a total of 25,417 quality SNPs were discovered: 15,010 (59.1%) were transitions (AG and CT), 9,114 (35.9%) were transversions (AC, GT, CG, and AT), and 1,293 (5.0%) were insertion/deletions (indels). A vast majority of SNP-containing contigs consisted of only 2 haplotypes, as expected, but the percentages of 2 haplotype contigs varied widely in these citrus cultivars. BLAST of the 25,417 25-mer SNP oligos to the Clementine reference genome scaffolds revealed 2,947 SNPs had "no hits found", 19,943 had 1 unique hit / alignment, 1,571 had one hit and 2+ alignments per hit, and 956 had 2+ hits and 1+ alignment per hit. Of the total 24,293 scaffold hits, 23,955 (98.6%) were on the main scaffolds 1 to 9, and only 338 were on 87 minor scaffolds. Most alignments had 100% (25/25) or 96% (24/25) nucleotide identities, accounting for 93% of all the alignments. Considering almost all the nucleotide discrepancies in the 24/25 alignments were at the SNP sites, it served well as in silico validation of these SNPs, in addition to and consistent with the rate (81%) validated by sequencing and SNaPshot assay. High-quality EST-SNPs from different citrus genotypes were detected, and
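
    The counts quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' mining pipeline; the function and variable names are assumptions made for this example. It classifies a substitution as a transition (purine to purine or pyrimidine to pyrimidine) or a transversion, and recomputes each category's share of the 25,417 SNPs.

```python
# Illustrative sketch only; names are hypothetical, counts are from the abstract.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify(a, b):
    """Classify a substitution between two different DNA bases."""
    if a == b:
        raise ValueError("a substitution needs two different bases")
    same_class = ({a, b} <= PURINES) or ({a, b} <= PYRIMIDINES)
    return "transition" if same_class else "transversion"

# SNP category counts as reported in the abstract.
counts = {"transitions": 15010, "transversions": 9114, "indels": 1293}
total = sum(counts.values())
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

# BLAST outcome categories as reported in the abstract.
blast_outcomes = [2947, 19943, 1571, 956]

for name, pct in shares.items():
    print(f"{name}: {counts[name]} ({pct}%)")
print("BLAST categories sum to", sum(blast_outcomes))
```

    The category counts and the four BLAST outcome counts both sum to 25,417. Recomputed to one decimal place, the indel share comes out at about 5.1%, marginally different from the 5.0% quoted above, presumably because of how the published percentages were rounded.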

  11. Mining of haplotype-based expressed sequence tag single nucleotide polymorphisms in citrus

    PubMed Central

    2013-01-01

    Background: Single nucleotide polymorphisms (SNPs), the most abundant variations in a genome, have been widely used in various studies. Detection and characterization of citrus haplotype-based expressed sequence tag (EST) SNPs will greatly facilitate further utilization of these gene-based resources. Results: In this paper, haplotype-based SNPs were mined out of publicly available citrus expressed sequence tags (ESTs) from different citrus cultivars (genotypes) individually and collectively for comparison. There were a total of 567,297 ESTs belonging to 27 cultivars in varying numbers and consequently yielding different numbers of haplotype-based quality SNPs. Sweet orange (SO) had the most (213,830) ESTs, generating 11,182 quality SNPs in 3,327 out of 4,228 usable contigs. Summed across all the individual mining results, a total of 25,417 quality SNPs were discovered: 15,010 (59.1%) were transitions (AG and CT), 9,114 (35.9%) were transversions (AC, GT, CG, and AT), and 1,293 (5.0%) were insertion/deletions (indels). A vast majority of SNP-containing contigs consisted of only 2 haplotypes, as expected, but the percentages of 2 haplotype contigs varied widely in these citrus cultivars. BLAST of the 25,417 25-mer SNP oligos to the Clementine reference genome scaffolds revealed 2,947 SNPs had “no hits found”, 19,943 had 1 unique hit / alignment, 1,571 had one hit and 2+ alignments per hit, and 956 had 2+ hits and 1+ alignment per hit. Of the total 24,293 scaffold hits, 23,955 (98.6%) were on the main scaffolds 1 to 9, and only 338 were on 87 minor scaffolds. Most alignments had 100% (25/25) or 96% (24/25) nucleotide identities, accounting for 93% of all the alignments. Considering almost all the nucleotide discrepancies in the 24/25 alignments were at the SNP sites, it served well as in silico validation of these SNPs, in addition to and consistent with the rate (81%) validated by sequencing and SNaPshot assay. Conclusions: High-quality EST-SNPs from different

  12. Nucleotide sequence alignment of hdcA from Gram-positive bacteria.

    PubMed

    Diaz, Maria; Ladero, Victor; Redruello, Begoña; Sanchez-Llana, Esther; Del Rio, Beatriz; Fernandez, Maria; Martin, Maria Cruz; Alvarez, Miguel A

    2016-03-01

    The decarboxylation of histidine, carried out mainly by some Gram-positive bacteria, yields the toxic dietary biogenic amine histamine (Ladero et al. 2010 〈10.2174/157340110791233256〉 [1], Linares et al. 2016 〈http://dx.doi.org/10.1016/j.foodchem.2015.11.013〉 [2]). The reaction is catalyzed by a pyruvoyl-dependent histidine decarboxylase (Linares et al. 2011 〈10.1080/10408398.2011.582813〉 [3]), which is encoded by the gene hdcA. In order to locate conserved regions in the hdcA gene of Gram-positive bacteria, this article provides a nucleotide sequence alignment of all the hdcA sequences from Gram-positive bacteria present in databases. For further utility and discussion, see 〈http://dx.doi.org/10.1016/j.foodcont.2015.11.035〉 [4].

  13. Nucleotide sequence alignment of hdcA from Gram-positive bacteria

    PubMed Central

    Diaz, Maria; Ladero, Victor; Redruello, Begoña; Sanchez-Llana, Esther; del Rio, Beatriz; Fernandez, Maria; Martin, Maria Cruz; Alvarez, Miguel A.

    2016-01-01

    The decarboxylation of histidine, carried out mainly by some Gram-positive bacteria, yields the toxic dietary biogenic amine histamine (Ladero et al. 2010 〈10.2174/157340110791233256〉 [1], Linares et al. 2016 〈http://dx.doi.org/10.1016/j.foodchem.2015.11.013〉 [2]). The reaction is catalyzed by a pyruvoyl-dependent histidine decarboxylase (Linares et al. 2011 〈10.1080/10408398.2011.582813〉 [3]), which is encoded by the gene hdcA. In order to locate conserved regions in the hdcA gene of Gram-positive bacteria, this article provides a nucleotide sequence alignment of all the hdcA sequences from Gram-positive bacteria present in databases. For further utility and discussion, see 〈http://dx.doi.org/10.1016/j.foodcont.2015.11.035〉 [4]. PMID:26958625

  14. Nucleotide sequences of three tRNA(Ser) from Drosophila melanogaster reading the six serine codons.

    PubMed

    Cribbs, D L; Gillam, I C; Tener, G M

    1987-10-05

    The nucleotide sequences of three serine tRNAs from Drosophila melanogaster, together capable of decoding the six serine codons, were determined. tRNA(Ser)2b has the anticodon GCU, tRNA(Ser)4 has CGA and tRNA(Ser)7 has IGA. tRNA(Ser)2b differs from the last two by about 25%. However, tRNA(Ser)4 and tRNA(Ser)7 are 96% homologous, differing only at the first position of the anticodon and two other sites. This unusual sequence relationship suggests, together with similar pairs in the yeasts Schizosaccharomyces pombe and Saccharomyces cerevisiae, that eukaryotic tRNA(Ser)UCN may be undergoing concerted evolution.
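
    How three anticodons can cover all six serine codons follows from antiparallel codon-anticodon pairing with wobble at the anticodon's 5' base. The sketch below is a simplified illustration using Crick's textbook wobble rules; it is an assumption of this example, not the analysis in the paper, and it ignores the effect of other modified nucleosides on pairing. The function name and tables are hypothetical.

```python
# Simplified illustration: Crick's wobble rules, not the paper's own analysis.
# Anticodon 5' base -> codon 3' bases it can pair with (I = inosine).
WOBBLE = {"A": "U", "C": "G", "G": "CU", "U": "AG", "I": "ACU"}
# Strict Watson-Crick pairing for the other two positions.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

def codons_read(anticodon):
    """Return the sorted codons (5'->3') read by an anticodon written 5'->3'."""
    first, middle, last = anticodon
    # Pairing is antiparallel: codon base 1 pairs with anticodon base 3.
    codon_1 = WATSON_CRICK[last]
    codon_2 = WATSON_CRICK[middle]
    return sorted(codon_1 + codon_2 + codon_3 for codon_3 in WOBBLE[first])

# Anticodons as given in the abstract.
ANTICODONS = {"tRNA(Ser)2b": "GCU", "tRNA(Ser)4": "CGA", "tRNA(Ser)7": "IGA"}
SERINE_CODONS = {"UCU", "UCC", "UCA", "UCG", "AGU", "AGC"}

decoded = {name: codons_read(ac) for name, ac in ANTICODONS.items()}
covered = {codon for codons in decoded.values() for codon in codons}

for name, codons in decoded.items():
    print(name, ANTICODONS[name], "->", ", ".join(codons))
print("all six serine codons covered:", covered == SERINE_CODONS)
```

    Under these rules GCU reads AGC and AGU, CGA reads UCG, and IGA reads UCU, UCC and UCA, which together make up the six serine codons of the standard genetic code.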

  15. Complete nucleotide sequence of a new variant of grapevine fanleaf virus from northeastern China.

    PubMed

    Zhou, Jun; Fan, Xudong; Dong, Yafeng; Zhang, Zunping; Ren, Fang; Hu, Guojun; Li, Zhengnan

    2017-02-01

    The complete RNA1 and RNA2 sequences of a new grapevine fanleaf virus isolate (GFLV-SDHN) from northeastern China were determined. The two RNAs are 7,367 and 3,788 nucleotides (nt) in length, respectively, excluding the poly(A) tails. Compared to other GFLV isolates, GFLV-SDHN has a 22- to 24-nt insertion in the RNA1 5' untranslated region, and there was 19.1%-20.1% and 11.7%-13.0% sequence divergence in RNA1, and 15.5%-20.5% and 8.5%-13.5% in RNA2, at the nt and amino acid level, respectively. Phylogenetic analysis revealed that the origins of GFLV-SDHN are distinct from those of other GFLV isolates. One recombination event was identified in the 2A(HP) region of RNA2 in GFLV-SDHN.

  16. Nucleotide-resolution DNA double-strand break mapping by next-generation sequencing.

    PubMed

    Crosetto, Nicola; Mitra, Abhishek; Silva, Maria Joao; Bienko, Magda; Dojer, Norbert; Wang, Qi; Karaca, Elif; Chiarle, Roberto; Skrzypczak, Magdalena; Ginalski, Krzysztof; Pasero, Philippe; Rowicka, Maga; Dikic, Ivan

    2013-04-01

    We present a genome-wide approach to map DNA double-strand breaks (DSBs) at nucleotide resolution by a method we termed BLESS (direct in situ breaks labeling, enrichment on streptavidin and next-generation sequencing). We validated and tested BLESS using human and mouse cells and different DSB-inducing agents and sequencing platforms. BLESS was able to detect telomere ends, Sce endonuclease-induced DSBs and complex genome-wide DSB landscapes. As a proof of principle, we characterized the genomic landscape of sensitivity to replication stress in human cells, and we identified >2,000 nonuniformly distributed aphidicolin-sensitive regions (ASRs) overrepresented in genes and enriched in satellite repeats. ASRs were also enriched in regions rearranged in human cancers, with many cancer-associated genes exhibiting high sensitivity to replication stress. Our method is suitable for genome-wide mapping of DSBs in various cells and experimental conditions, with a specificity and resolution unachievable by current techniques.